There are three types of programmers in this world.
Terran Programmer
Rugged, the terran programmer gets shit done and is smart enough to make it work at every level. The code is neither sexy nor elegant, but it gets the job done and works well enough. Their tools are whatever they can afford.
A terran programmer usually works best in a start-up or as a technical leader. A canonical example of a terran-based company is 37signals.
Zerg Programmer
The company matters most to the zerg programmer. They need their IDE (i.e. creep). Management needs to hire lots of them to ship even the most basic of products, but they can hire hordes to solve problems of scale. They depend on their queen vendor.
A zerg programmer works best as a cog in some corporate machinery, and they tend to use Microsoft or Oracle products. Most offshore outsourcing companies are examples of zerg companies.
Protoss Programmer
Shiny and advanced mathematics is the primary tool for the protoss; this greatly limits their numbers. They use languages like Lisp or ML to develop spectacular results, and they are free to use anything.
The protoss tend to cluster in academia until they have matured to the point where they have the insight that can power a company. For instance, Google's PageRank is a protoss insight that powers Google thus making Google a protoss company.
Moral
If you build a company, then you need to ultimately use people to get things done. You need to find the right people for the right job to get the company as a whole executing.
Each type of programmer has their pros and cons in a business, and the goal is to utilize and structure the company such that everyone works together effectively.
If we ignore (or worse, argue about) differences, then we miss out on the potential to work together and build truly great things.
Sunday, December 26, 2010
Testing is not a waste of time, I don't know that your code works
I've been in the land of formal mathematics where all the equations are good and correct.
My venture into computing, however, is not as pure as I would have liked. I tried to be virtuous in this land by bringing the gospel of proof and correctness, but I have failed. I have given in to the temptation of the wild outlaws. I have abandoned static types in favor of dynamic types. I have abandoned super-planning waterfall methodologies for agile methodologies. I have abandoned my safe IDE for the raw ninja powers of nano/gedit. I have abandoned formal techniques and am working on developing testing-based techniques (relevant to my domains and projects).
Here are my thoughts on testing.
Interaction between Proof and Testing
If you can prove it, then you can test it. You know where it should work and where it shouldn't work. Thinking about testing first will give you hypotheses on what to prove. Once you get it working, do you need to prove it? Usually no, but you should have unit tests that enable you or your team to reasonably do the engineering needed to take it to production.
For instance, if you are writing PHP, then the academic portion of the brain can bang out the parser and interpreter in a couple of hours. You can bang out the proof in a day. The problem is that real engineering has to come in and optimize it for production. This means going down into the bowels of the code and optimizing loops, changing data structures, and finding code equivalences. Unit tests give the engineering effort regression testing as the entire code base is optimized. As the code grows, the feasibility of a proof diminishes. Ideally, though, the proofs enable the tests to be constructed such that "if all these tests pass, then the proof reasonably asserts the code is good."
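Here is a minimal sketch of that relationship, assuming a made-up function with a slow but obviously correct version and an optimized rewrite. The names sumOfSquaresReference, sumOfSquaresFast, and findMismatches are hypothetical; the point is only that the easy-to-prove version becomes the oracle for regression tests on the fast one.

```javascript
// Reference version: slow, but easy to reason about (or prove).
function sumOfSquaresReference(n) {
  let total = 0;
  for (let i = 1; i <= n; i++) {
    total += i * i;
  }
  return total;
}

// Optimized version: the closed form n(n+1)(2n+1)/6.
function sumOfSquaresFast(n) {
  return (n * (n + 1) * (2 * n + 1)) / 6;
}

// Regression check: the optimized code must agree with the reference
// on every input we care about, including the edge cases.
function findMismatches(inputs) {
  return inputs.filter(
    (n) => sumOfSquaresReference(n) !== sumOfSquaresFast(n)
  );
}

const mismatches = findMismatches([0, 1, 2, 3, 10, 100, 1000]);
console.log(
  mismatches.length === 0 ? "all cases agree" : "mismatch on: " + mismatches
);
```

Every later optimization gets run through the same check, so the proof effort is spent once on the reference and reused as tests.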
Stupid Shit
When I dropped static typing, we hired a QA guy whose job was to use the product every day. So far, we don't have typing issues. We have buttons that don't work due to typos or overlapping divs. Sometimes, your code can be proved correct if it just works. Usually humans need to test this.
I think I can solve some of the issues with Selenium IDE, but the time it would take me to solve it that way, combined with the opportunity cost of not doing something else, means I should just hire some scrubs off the street to check buttons and check for other faults as well.
You and I are not in control
Let me introduce you to Google Maps and how they did versioning. I picked a URL for Google Maps that represents their bleeding-edge version. One day, half of our clients didn't work any more. WTF? They updated to a new version and broke our shit. Fortunately, they also provided an archive of old versions, and after a little hacking on the production server it was working again. I didn't know this at the time, but fortunately Google knew enough about it and started archiving.
We live in a time of interdependence. I trust you to develop service XYZ, and you trust me to develop ABC. Either you or I can fuck up, and those bridges need to be tested and measured. For instance, I have a server to proxy geocoding requests. I do this so I can enable workarounds when Google fucks up (and they do about 0.01% of the time, which is why I have a table of 18 addresses that Google couldn't geocode correctly).
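A rough sketch of what such a proxy could look like, assuming a lookup table of hand-corrected addresses that is consulted before the upstream service. The table contents, the normalize helper, and the shape of the upstream function are all made up for illustration; this is not the actual server.

```javascript
// Hypothetical table of addresses the upstream geocoder gets wrong,
// mapped to hand-verified coordinates (the values here are made up).
const overrides = new Map([
  ["123 example st, springfield", { lat: 1.5, lng: -2.5 }],
]);

// Normalize so trivial differences in case and spacing still hit the table.
function normalize(address) {
  return address.trim().toLowerCase().replace(/\s+/g, " ");
}

// upstream is any function (address) => Promise<{lat, lng}>. It is
// injected so the proxy can be tested without the real service.
async function geocode(address, upstream) {
  const known = overrides.get(normalize(address));
  if (known) {
    return { ...known, source: "override" };
  }
  const result = await upstream(address);
  return { ...result, source: "upstream" };
}
```

Because the upstream call is passed in, the workaround table can be tested with a fake geocoder, and the real one can be swapped when the vendor changes versions.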
When Google started promoting version 3 of their API, I found out that a quick URL change didn't work for me. My tests, however, told me where it didn't work, and I could treat my tests as a simple todo list and grind through the version change. As long as the tests measured every use case that I needed, I was fine.
You need to develop tests for things that you don't control. You will have many black boxes and you need to test how you use them and your assumptions.
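One way to write those assumptions down, sketched under the assumption of a hypothetical client wrapper with a synchronous geocode(address) that returns {lat, lng} or null. Each check records one thing the application relies on; the list of failing names is the todo list after a version change.

```javascript
// Each entry records one assumption about a service we do not control.
// `client` is a hypothetical wrapper; its geocode() is an assumption.
const assumptions = [
  {
    name: "known address returns coordinates",
    check: (client) => {
      const r = client.geocode("1600 Amphitheatre Parkway, Mountain View, CA");
      return r !== null && typeof r.lat === "number" && typeof r.lng === "number";
    },
  },
  {
    name: "garbage input returns null instead of throwing",
    check: (client) => client.geocode("zzzz not an address") === null,
  },
];

// Run every check; a thrown error counts as a failure. The returned
// list of names is what still needs work.
function failingAssumptions(client) {
  return assumptions
    .filter((a) => {
      try {
        return !a.check(client);
      } catch (err) {
        return true;
      }
    })
    .map((a) => a.name);
}
```

Pointing failingAssumptions at the new version of the black box before switching production over shows which uses break.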
In summary
No matter how smart you are, I guarantee that you will eventually need testing because testing will find you and the business will depend on it. Refusing to test is just ego, because a test will fail and that will compromise the integrity of your ego's self-image of you as a perfect and heroic being. Once you accept this, allow your ego to rest and stay hidden during real engineering.
Labels:
technology
Sunday, December 19, 2010
17 Thoughts on Programming
- Your constants are your client’s variables.
- All software is layered like cake because no one can commit. Those that can commit, fail.
- Programs don't learn. Programmers just learn new tools.
- Eventually, your program becomes someone else’s function.
- Be one with the machine, and you will be annoyed by your code.
- The code you are working on now that seems special fits within someone else's general framework. In a month, you will wish you had known about that framework.
- If you don't have any loops, then you haven't done anything except play with Legos. Why is it bad to play with Legos?
- If you could communicate complexity, then it wouldn't be complex.
- Velocity induces complexity (either technical or managerial).
- Your software will be abused by criminal minds.
- One half of a business always builds top down, the other builds bottom up; the people doing it top down will get the credit.
- If your crappy code makes it necessary to hire ten people, then at least feel good about the economy. Also, be the owner of the vending machine.
- It is fun to optimize, but it is hard to evolve; if you evolve, then you grow and find new things to optimize.
- Every language emits beauty, and every language emits horror. Choose wisely and cluster people appropriately.
- Sometimes, you will solve a real problem; most times, you will solve a problem at someone else's expense.
- Software as a Service is an infinite recursive chain of passing the buck. If you accept the buck, then you can keep it.
- The person that follows your steps probably has different designs; enable them to rebuild and learn from your work rather than forcing them into the same idioms. After all, they have to maintain it.
Labels:
technology
Saturday, December 18, 2010
Defer Deletion, Garbage Collection, and Bulk Undelete (using WIN + CouchDB)
I really, really, really hate providing the delete function. So, for WIN, I provide a delete that doesn't delete until 31 days have passed. It allows me to sleep at night and dream of unicorns.
I have an updater that changes the namespace of the document and adds metadata about the deletion to the document.
https://github.com/mathgladiator/win/blob/master/lib/win.config.js#L110
Then I provide a function to the environment that makes it easy to call the updater and that looks and tastes like a delete:
https://github.com/mathgladiator/win/blob/master/lib/win.environment.js#L139
Every day, I have a cron job that looks at this indexer
https://github.com/mathgladiator/win/blob/master/lib/win.config.js#L118
and it kills them one by one. Now, I know that actual deletes will be done in 31 (or so) days.
What if I need to undo? Well, it is very easy to undo one element. If I would like to undo a whole bunch, I have to provide a common key to un-delete from. That's what the action parameter does. If I write a loop that deletes a bunch of stuff, then I need to build a fairly unique key that enables me to undo that batch delete.
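The whole scheme can be sketched in a few plain functions. This is not the code behind the links above; the field names (deleted_from_ns, deleted_action, purge_after) and the "deleted:" prefix are assumptions made up for the example, and documents are plain objects rather than CouchDB records.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;
const RETENTION_DAYS = 31;

// Soft delete: move the document into a "deleted" namespace and record
// enough metadata to undo it. All field names here are assumptions.
function softDelete(doc, action, now) {
  return {
    ...doc,
    ns: "deleted:" + doc.ns,
    deleted_from_ns: doc.ns,
    deleted_action: action, // shared key for one batch of deletes
    purge_after: now + RETENTION_DAYS * DAY_MS,
  };
}

// Undo every soft delete that was made under the same action key.
function undeleteByAction(docs, action) {
  return docs.map((doc) => {
    if (doc.deleted_action !== action) return doc;
    const { deleted_from_ns, deleted_action, purge_after, ...rest } = doc;
    return { ...rest, ns: deleted_from_ns };
  });
}

// What the daily cron job would actually remove.
function dueForPurge(docs, now) {
  return docs.filter(
    (doc) => doc.purge_after !== undefined && doc.purge_after <= now
  );
}
```

A loop that deletes a batch would pass the same action string to every softDelete call, for example a job name plus a timestamp, so that one undeleteByAction call restores the whole batch.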
Labels:
technology
Thursday, December 16, 2010
Understanding a sea of JSON with Map Reduce
CouchDB stores a lot of data in a sea of JSON, and it isn't exactly easy to get a good grasp on what there is.
For WIN, I force each object to have a name-space field called 'ns'; this enables me to partition the data and enable developers to partition the data. Ideally, this helps in keeping things separate.
A fundamental problem is that I want to have an idea of what is in the data set and be able (and enable developers) to write appropriate documentation so everyone stays on the same page. I would also like the data to adhere to some kind of structural quality. Beyond that, it would be nice to be able to look for oddities that could become future support issues (it would also be nice if everyone used the same language and kept things consistent; I would rather nip inconsistencies in the bud earlier rather than later).
So, I flatten the structural qualities of each object and count them using this code (for CouchDB's incremental MapReduce).
http://pygments.org/demo/12753/ (alternative http://pastie.org/1384759 )
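For readers without access to the links, here is a minimal sketch of the idea in plain JavaScript. It is not the linked code: the key format "ns:path:type" and the function names are assumptions, and countStructures stands in for the map step (emit one key per field) and the reduce step (count per key).

```javascript
// Describe a value's type, separating arrays and null from plain objects.
function typeOf(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value;
}

// Flatten one document into keys like "user:address.city:string".
// The 'ns' field is the name-space field described above.
function structureKeys(doc) {
  const keys = [];
  function walk(value, path) {
    const t = typeOf(value);
    if (t === "object") {
      for (const field of Object.keys(value)) {
        walk(value[field], path ? path + "." + field : field);
      }
    } else {
      keys.push(doc.ns + ":" + path + ":" + t);
    }
  }
  walk(doc, "");
  return keys;
}

// Stand-in for map (emit each key) and reduce (count occurrences).
function countStructures(docs) {
  const counts = {};
  for (const doc of docs) {
    for (const key of structureKeys(doc)) {
      counts[key] = (counts[key] || 0) + 1;
    }
  }
  return counts;
}
```

In a CouchDB view, the map function would emit each key with a value of 1 and the reduce could be the built-in _sum. A key with a suspiciously low count, say "user:age:string" next to a large "user:age:number", is the kind of oddity worth chasing.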
This enables me to grep the code base and then use blame to work with the developer to resolve oddities. Or, I can turn a blind eye because it isn't in a table that matters that much (i.e. meta data or user controlled data).
I can monitor this for changes daily to determine what is happening on development (where oddities first get introduced).
This mode of thinking enables me to think about unicorns when it comes to the database (oh, and never allowing anyone to delete; everything goes to trash with a trash_goes_out_on field that is set 60 days in the future, when it will actually be deleted).
Labels:
technology
Tuesday, December 14, 2010
Database Development Mistakes as NoSQL propaganda
Context
http://stackoverflow.com/questions/621884/database-development-mistakes-made-by-application-developers
Summary
- Not using appropriate indexes
- Not enforcing referential integrity
- Using natural rather than surrogate (technical) primary keys
- Writing queries that require DISTINCT to work
- Favouring aggregation over joins
- Not simplifying complex queries through views
- Not sanitizing input
- Not using prepared statements
- Not normalizing enough
- Normalizing too much
- Using exclusive arcs
- Not doing performance analysis on queries at all
- Over-reliance on UNION ALL and particularly UNION constructs
- Using OR conditions in queries
- Not designing their data model to lend itself to high-performing solutions
- Selfish database design and usage.
- Abusing denormalised data
- Scared of writing SQL
- Dogmatic 'No Stored Procedures' policies.
- Not understanding database design
- Not using version control on the database schema
- Working directly against a live database
- Not reading up and understanding more advanced database concepts (indexes, clustered indexes, constraints, materialized views, etc)
- Failing to test for scalability ... test data of only 3 or 4 rows will never give you the real picture of real live performance
- They only test on toy databases.
- Not using indexes.
- Not communicating with experienced DBAs.
- Poor Performance Caused by Correlated Subqueries
- Forgetting to set up relationships between the tables.
- Not using parameterized queries.
- Favoring "Elegant" code over highly performing code.
- Not doing the correct level of normalization.
- You want to make sure that data is not duplicated
- Using Excel for storing (huge amounts of) data.
- Unnecessarily using a function on a value in a where clause with the result of that index not being used.
- Not adding check constraints to ensure the validity of the data.
- Adding unnormalized columns to tables out of pure laziness or time pressure.
- not so much about the database per se but indeed annoying.
- Not taking advantage of CLUSTERED INDEXES
- Not using a SERIAL (autonumber) datatype as a PRIMARY KEY
- Not UPDATING STATISTICS on a table when many records have been INSERTED or DELETED.
All of these are consequences of using a one-size-fits-all solution for storing your data. Fact is, application developers shouldn't worry about how they use data. They should be able to get their job done without worrying about the long-beard in the back room. I've been in this role, and I can sympathize with it.
Then, I realized something has to change. I took away SQL and built a very simple RESTful layer to the data layer, and then I watched how application developers solved their problems. I was amazed at their cleverness. Instead of saying "oh, these silly application developers are so dumb and don't know shit about databases", I said "I wonder how clever they could be if I just gave them memcached and simple get/put/by_index".
They taught me a thing or two about how awesome memcached can be (especially with cron jobs).
Ideally, if you are building the data layer, then all you need to do to enable application developers is get the right complexity class out of the data. If you have ten billion things, then you need to provide the functions that get to a thousand things relevant to what the application developer needs to do. For bigger tasks, computations are best represented with MapReduce, and I feel that MapReduce is way easier to learn for fresh application developers. CouchDB's incremental MapReduce is by far the easiest to learn.
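To make the get/put/by_index idea concrete, here is a tiny in-memory sketch. The DataLayer class, its method names, and the way indexes are declared are all assumptions for illustration; the real layer was RESTful and sat in front of an actual store.

```javascript
// Tiny in-memory stand-in for a data layer that exposes only
// get / put / byIndex. Index definitions are supplied up front.
class DataLayer {
  constructor(indexes) {
    this.docs = new Map(); // id -> document
    this.indexes = indexes; // index name -> (doc) => key
    this.entries = {}; // index name -> Map(key -> Set of ids)
    for (const name of Object.keys(indexes)) {
      this.entries[name] = new Map();
    }
  }

  get(id) {
    return this.docs.get(id) || null;
  }

  put(id, doc) {
    const old = this.docs.get(id);
    for (const name of Object.keys(this.indexes)) {
      const index = this.entries[name];
      if (old) {
        // Drop the stale index entry before adding the new one.
        const oldSet = index.get(this.indexes[name](old));
        if (oldSet) oldSet.delete(id);
      }
      const key = this.indexes[name](doc);
      if (!index.has(key)) index.set(key, new Set());
      index.get(key).add(id);
    }
    this.docs.set(id, doc);
  }

  // Returns only the documents under one key, never the whole data set.
  byIndex(name, key) {
    const ids = this.entries[name].get(key) || new Set();
    return [...ids].map((id) => this.docs.get(id));
  }
}
```

The application developer never scans everything; byIndex is the function that takes them from the full data set down to the handful of relevant documents.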
That being said, performance is always going to be an issue. If you enable developers this way, then you need to provide a realistic environment.
- Have a development server with more data than production and with a slower CPU (if you can't do this, then you need the ability to connect to production in a read-only mode).
- Force them to profile their code (ab works very well for most situations)
- Work with business people to define how consistency should work
- Train them to do cache invalidation
Related entry: Big Data enables Agile Data.
Labels:
technology
Sunday, December 12, 2010
Why I gave up on static types
I like programming language theory and how to use typing to do some pretty impressive things, but I'm getting older now and I just don't give a shit about types for day to day stuff. I also gave up on object-orientated code. I also said F-U to relational database theory. Why?
Because people using your product don't give a shit about how it gets done. That's the reality. They don't care if you use assembler or JavaScript. They just don't. The question is: can you make people happy? The more important question is: can you sell? Can your team sell? Can your sales team make compromises to make the sale?
This last question is the one that I ponder, since it affects my profits. Do I want to put up some academic/aesthetic wall in front of a sale? Or do I want to enable them to make a sale?
This is where all that rigidity breaks down and I ask a new question. Is this methodology or technology better for sales?
Static typing? No.
Object Orientation? No.
Relational Databases? No.
There is a lot of bullshit technology out there (especially built on .NET or Java) that is simply a wall to sales. Now, it does depend on what you are doing, but ultimately it comes down to sales.
My issue with static types is that I can't add new members at run-time, nor do new members propagate. Everything I do now is basically a giant JavaScript object that I pass around as JSON. I don't care what is in it. From a business point of view, I know that as long as nothing in the system tries to map the JSON into a static class, I keep all the data; it just propagates. This enables me to change elements at the data store, like adding a boolean named "my_sales_team_is_awesome_and_sold_a_feature_that_can_be_added_by_a_bool", and then sleep knowing that the entire system will just deal with it and pass it along. I don't need to deploy a binary or recompile an entire system to add a little bool.
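A small sketch of the difference, with made-up order fields and function names. The first function copies everything and only touches the fields it knows about; the second mimics mapping into a fixed class, where undeclared fields silently vanish.

```javascript
// Dynamic style: copy everything, touch only the fields this layer knows.
function enrichOrder(order) {
  return { ...order, total: order.price * order.quantity };
}

// Static-class style, for contrast: only declared fields survive.
function toFixedOrder(order) {
  return { price: order.price, quantity: order.quantity };
}

// A new flag added at the data store, unknown to both functions.
const incoming = JSON.parse(
  '{"price": 5, "quantity": 3, "sold_by_sales_team": true}'
);

console.log("sold_by_sales_team" in enrichOrder(incoming)); // true
console.log("sold_by_sales_team" in toFixedOrder(incoming)); // false
```

Every layer written in the first style passes the new flag along without being redeployed; every layer written in the second style has to change before the flag can get through.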
My issue with object orientated code is that most of my stuff is non-inherited. I have things that cannot be objects. While I do use the JavaScript object a bit, I don't use prototypes. I just treat it like a map and move on with my day. I don't give a shit about binding code to data; this is the worst possible thing you can do. I need all my data in a format where it is (a) obvious what it is and (b) easy to transform by looking from the outside. This is my data model guideline: if any idiot can look at the data and know what it means, then it is a good data model.
My issue with databases is the same as static types. I don't want to plan out how my data is going to look. I don't want to think. I want to be agile and just capture data and throw it into the database. I want to capture as much data as possible then organize it later. I don't want to think about normalizing which I can always break (show me your schema, and I will find a feature that will break it). I just want to put my data somewhere safe and have it replicate. This is why I use CouchDB. It's very relaxing.
Looking back at my life, I realize that I was wasting a bunch of time and energy trying to reach a goal with stupid means. My goal was to enable crazy fast development, and I achieved this goal by simply changing my outlook and aesthetics.
Having said that, I realize that there are reasons these things exist. If you need them, then you should use them. I love static types, but only for raw performance. There are performance patterns that can be implemented as a server that are very flexible, and those are important things to learn as they enable you to deploy safe services. The problem, though, is always with specifics.
Oh, it also helps to have mastered grep and to write code that enables grep to be useful; this is an amazing productivity boost for the times when static types would actually be very useful.
I haven't completely given up on types; I just now realize that their place is not where I would have liked it. If you look at my GitHub, then you can probably tell where I've been spending my time in terms of type systems.
That's right, I'm a node.js junkie. I just spent a weekend cutting a new version of my platform, and I have to say that I get amazing velocity with it. So much so that I can focus on leveling up my design rather than painting yet another bike shed.
Labels:
technology
Saturday, December 11, 2010
WIN is looking good; good enough to start documenting and testing more hard-core
Well, I put WIN into production. I learned that if you rely on unsupported couchdb code, then strange things happen since there is no debug code. I found a bug in node.js that I need to mock up and send to the node.js team. I learned that I don't like looking at more than 1K code.
I just spent half a day re-factoring and cleaning up win so it makes more sense, and I added crap comments. I also linted to look for stupid issues, so it looks a lot cleaner now.
So, now, I'm going to write the guide in ultra-hard-core fashion. I am confident in the patterns that I am going to present, and I'm confident that the system can be hacked to get anything anyone would want.
Labels:
technology
Thursday, December 9, 2010
Why Mustache is for WIN
Mustache is a logic-less templating language. Because it lacks logic, templates can easily be interpreted across languages. This is important for two reasons.
- It protects the work that goes into constructing good DOM. This is true for many template languages, but Mustache also makes sure the assets are protected from a change of language.
- By enabling templates to work in multiple languages, you enable them to work in multiple contexts. For instance, if you have a search feature that you would like ajaxified, then you can work towards producing a JSON object. For SEO, you use the template to send off the HTML. For Ajax, you can just get the JSON object and do the JSON to HTML in the browser. Generally, JSON is more efficient to send over the wire when compared to HTML; ergo, you get a snappier response in addition to faster development time (by only writing one template and not worrying about DOM manipulation).
The key is to think of Mustache as just a simple HTML encoder over a giant JSON representation of the module, page, layout, etc. You will put some silly things in the JSON, but in the end it will enable something very powerful if you architect around getting a giant JSON object back.
Namely, it is very easy to automate testing on giant JSON files. That is, it is easier to script against JSON than junk HTML. For me and WIN, this is a fairly important point, as I would like to be able to crawl my entire project to look for errors.
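To make the "HTML encoder over JSON" idea concrete, here is a minimal sketch in Python. It is not the real Mustache library: `render` is a toy stand-in that only handles plain `{{key}}` tags, and the `view`, `template`, and `find_errors` names are made up for illustration. The point is only that one JSON object can feed the HTML path, the Ajax path, and the automated checks.

```python
import html
import json
import re


def render(template: str, view: dict) -> str:
    """Toy stand-in for a logic-less renderer: fill {{key}} with escaped values."""
    def substitute(match: re.Match) -> str:
        return html.escape(str(view.get(match.group(1), "")))
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


def find_errors(view: dict) -> list:
    """Script against the JSON instead of scraping the rendered HTML."""
    errors = []
    if not view.get("query"):
        errors.append("missing query")
    if not isinstance(view.get("count"), int) or view["count"] < 0:
        errors.append("bad count")
    return errors


# Hypothetical search page: one JSON object, two ways to deliver it.
view = {"query": "couch & node", "count": 2}
template = "<h1>Results for {{query}}</h1><p>{{count}} found</p>"

html_for_seo = render(template, view)   # server sends HTML to crawlers
json_for_ajax = json.dumps(view)        # browser renders the same data itself

print(html_for_seo)
print(find_errors(view))
```

A crawler over the whole project would call something like `find_errors` on every JSON object rather than parsing markup.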
Labels:
technology
Wednesday, December 8, 2010
Say Yes to Internet Censorship
Why?
Because it will make things worse.
When things are bad, talk begins of revolution.
Viva La Revolución
By the way, this was troll-bait. Just an experiment. Of course, there should be no censorship, but that is obvious to me. Is this not obvious to others???
Labels:
personal
Tuesday, December 7, 2010
3 reasons why I don't key off of email anymore.
For some of my clients, I built their stuff such that a user only needed an email and a password. Registration was easy and it was awesome. Now, I have introduced a login name back in. Here is why.
Emails Change
I had clients that lost jobs and they needed to change their email; well, that required writing a change email function. That's not pleasant because emails may already exist due to a prior sign up or a different use case.
People get fired. Take two employees at a company: one used their personal email to sign up for the product, whereas the other used their business email. The one who used their personal email got fired and then needs to transfer access to the other employee, but that email is already taken. So, either I have an account merger or they manage multiple credentials. Nevertheless, they have to call in for support if we present an obstacle. Additionally, companies get bought and emails change.
By adding the level of indirection, I'm enabling them to handle these issues themselves rather than supporting it on our end.
Multiple Accounts per Email
If you enable a single email to manage multiple accounts, then you help out companies that have different billable uses of your product. Otherwise, you require them to set up multiple emails, which just sucks.
Multiple Managers/Owners
If you focus on providing a single account, then you can enable your product to be managed by multiple people (or enable collaborative features). It is easy to key off of the account's login name to enable multiple users to access the account.
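The three reasons above can be sketched as a small data model. This is only an illustration under assumptions of my own: the `Account` and `Directory` classes, their method names, and the example logins and emails are hypothetical, and password handling is left out entirely. What it shows is the level of indirection: the login name is the key, and email is just an attribute that can change, repeat across accounts, or be shared by several managers.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    login: str                                  # stable key, chosen at sign-up
    email: str                                  # contact detail, free to change
    managers: set = field(default_factory=set)  # emails allowed to manage it


class Directory:
    def __init__(self) -> None:
        self.accounts = {}  # login name -> Account

    def register(self, login: str, email: str) -> Account:
        if login in self.accounts:
            raise ValueError(f"login {login!r} is already taken")
        account = Account(login, email, {email})
        self.accounts[login] = account
        return account

    def change_email(self, login: str, new_email: str) -> None:
        # No uniqueness check needed: emails are not keys.
        account = self.accounts[login]
        account.managers.discard(account.email)
        account.managers.add(new_email)
        account.email = new_email

    def add_manager(self, login: str, email: str) -> None:
        self.accounts[login].managers.add(email)

    def logins_for(self, email: str) -> list:
        return sorted(l for l, a in self.accounts.items() if email in a.managers)


directory = Directory()
directory.register("acme-billing", "pat@example.com")
directory.register("acme-support", "pat@example.com")     # same email, second account
directory.add_manager("acme-billing", "sam@example.com")  # second manager
directory.change_email("acme-billing", "pat@personal.example")
```

Because nothing is keyed on email, the change at the end cannot collide with a prior sign-up, which is exactly the support call this design tries to avoid.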
Labels:
technology
Sunday, December 5, 2010
Entrepreneurial Enlightenment and Insight
“Ideas are a dime a dozen” is a very stupid saying, just like “work smarter, not harder”.
Many of my math professors said that you know more or less all the math you are going to need to know, but you need to be able to communicate it (that is, you need to be able to communicate math). That’s the true goal of the master’s degree program.
Understanding “Ideas are a dime a dozen” is the same thing that many entrepreneurs know but can’t express in an effective manner. Communicating is harder than knowing.
Here is how you get to knowing that “Ideas are a dime a dozen” (or at least, how I’m trying to sell it to you):
Sit in a business.
Watch the people.
Be creative on ways to make them awesome. What would make the business better? What could you sell them to make them better?
Do this every day, and you will have lots of ideas.
Once you have an idea, you need to be able to execute on it. Execution is the art of getting things done and progressing the state of a business from conception to cash flow.
I have a torrent of ideas, so I’m set for life in the idea category. Now, how can I execute? Here is a generic three-step business plan.
- Build it (Engineering)
- Find Customers (Marketing)
- Sell it (Sales)
Your goal is to maximize H-S.
If the market is small, then you can be one super awesome consultant.
If the market is big and mundane, then you can build a company.
If the market is big and complex, then you can build a firm.
Once you have ideas, how do you execute in #1, #2, #3? How do you build it? Do you know someone that just builds things? How do you find customers? Do you know someone that can network or go door to door? How do you sell it? Do you know someone that can sell ice to an eskimo?
Once you can answer these questions, you can build a business. However, you must keep in mind that the people in the business are the only true asset it has. Do you have the right people doing the right jobs that they want to do? That last bit is basically my digested form of Execution: The Discipline of Getting Things Done (which is a good book, but you have to put yourself in their shoes to understand what they are saying. It’s not an easy book to read.)
Once you start to have 10+ ideas, you can either just blindly follow them (I'm very guilty of this in my life as a coder), or you can measure them and pick the best one. What is the best one? After all, you have your own H-S function. It could be by impact on the world, revenue, profit, job creation, or just plain fun.
I hope that I have communicated how to gain entrepreneurial insight into how to manufacture ideas. This is why I write my blog, so I can level up my communication capabilities. If you find yourself like me, knowing things but feeling an inability to express them, then you need to start writing now.
Labels:
business,
technology
Thursday, December 2, 2010
The problem only the best programmers can solve: trust
I just watched DHH's keynote at the Ruby X Conf, and I must admit that DHH's concept of freedom was inspiring.
After thinking about it more, I know why. Programmers tend to be control freaks. I know I'm a control freak, and I'm slowly giving up control so I can get away from the computer and get outside more.
The central problem of team programming is trust.
Static typing tells me that I'm not trustworthy enough to keep to my own conventions and keep my shit straight. Having worked in JavaScript for so long now, I don't even think about types. I like the benefits that static typing can provide (performance), but for day-to-day stuff, I don't care nor do I really think about it. If it does what I want, then I'm done.
Monkey patching is very interesting, and the same capability is present in JavaScript. I find it useful. For instance, in WIN, I needed a trim function for strings. Why doesn't JavaScript have a .trim function? I don't know, but I can extend its prototype. I find this very convenient. I can also plant bombs in the string prototype or the object prototype, but I don't want to do this. I need to write tests to check the basic assumptions about the code that I'm using.
When I defended lock based SCM, I was basically saying "I don't trust developers to work together". Now, we use mercurial and I don't worry about it. If developers have a conflict, then it is their responsibility to fix it, and it is management's goal to manage assignments such that conflict is rare.
That's the mentality that I've had to develop when switching from academic programming to industry programming. When someone else breaks something, they have to fix it. Fucked up a type? You fix it. Broke the string prototype? You fix it. Got a conflict? You fix it. You did something stupid? Ok, we are human, now go fix it.
When people have the responsibility to fix things, the quality gets better organically, and best of all, I'm usually left out of the picture for most of it.
Oh, you would like the root password? Sure, that's fine. We have a root password ceremony (it has hooded capes and everything draconian with candles), but I trust them enough. If a developer fucks up prod, then they fix it.
The key to enabling DHH's freedom is empowering trust. The key to empowering trust is knowing how to protect liabilities. If you are able to take backups every day, then you should. We do. You should also test the restore every week. We do. If you are unable to take backups, then you need some form of revision control and never ever delete anything.
When you are programming, the biggest liability you have is how you persist the state of the business. The next liability is how much you annoy your customers (e.g. infinite loop of sending emails = very bad). Once you figure out how to protect the company's ass and enable developer freedom, then you are golden.
I think that DHH's concept of freedom is an ultimate goal for the next ten years for both programmers and service providers in many industries. Some industries, however, are always going to be control festivals simply because that is how that market works. I would not want an airplane control system written in Ruby. I would rather it be done in OCaml with the most insane type system ever. Fortunately for the majority of programmers, these examples are in the minority. If you are in that minority, then you know enough about programming languages to build your own prison to protect the business.
I think DHH's sentiment on programming languages extends to databases, and this is why I work with and promote CouchDB. It just makes me happy.
Labels:
technology
The influence of advanced mathematics on programming.
I'm selling many of my books on Amazon, and as I was going through the books I realized that most of it was useful, but only useful in an indirect way. I would like to share some thoughts on how my studies in graduate-level mathematics influence my day-to-day operations of building products, managing databases, and doing everything a free electron can do in a given day at a start-up.
Topology
I think the best introductory book to topology is "Introduction to Topology" by Crump W. Baker. Topology is basically the study of connectedness and surfaces. When studying topology, you think about how things are different. Are a donut and a coffee cup the same? Well, yes they are, once you define what "same" means. There are practical programming challenges in topology in how one can process and do feature selection in computer vision. But, there are more mundane ways of applying topology.
For instance, a relational database is a topology in a discrete graph sense. How does this help me? Well, I'm about to do some stupid DELETEs and UPDATEs on a very large data set. Is the data set before and after the same in regards to current business value? Did I botch up? Topology comes in with the idea of topological invariants. A topological invariant is a property that can be measured and is unchanged under any homeomorphism (a continuous transformation with a continuous inverse).
If I were to write a query that measures the business value of the database (say, by the sum of the transactions, sum of paying accounts, and so forth), then I can use these to get a good sense of whether or not I botched up my changes by measuring before and after.
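The before/after measurement can be sketched with Python's built-in `sqlite3` module. The `transactions` table, its columns, the sample rows, and the `business_invariants` helper are all assumptions made up for this example; a real schema would need its own choice of invariants.

```python
import sqlite3


def business_invariants(conn: sqlite3.Connection) -> dict:
    """Measure quantities that a harmless cleanup must not change."""
    total, paying = conn.execute(
        "SELECT COALESCE(SUM(amount), 0), COUNT(DISTINCT account) "
        "FROM transactions WHERE amount > 0"
    ).fetchone()
    return {"total": total, "paying_accounts": paying}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (account TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [("acme", 500), ("acme", 0), ("globex", 1200), ("test", 0)],  # amounts in cents
)

before = business_invariants(conn)
conn.execute("DELETE FROM transactions WHERE amount = 0")  # the "stupid DELETE"
after = business_invariants(conn)

unchanged = before == after
print(before, after, "ok" if unchanged else "BOTCHED")
```

If the DELETE had a wrong WHERE clause, the two measurements would differ and you would know to roll back before anyone else noticed.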
Algebra
If you take a bunch of things and make those things operate on each other, then you have an algebra. There are a lot of properties involved in what the operation implies, and the first year is basically dedicated to defining all those properties and understanding their significance. The ultimate results you typically end up looking at in a first or second year course are the impossibility results (e.g. doubling the cube).
My most immediate thought on how any of this has any practical bearing on programming is MapReduce. Algebra, in my mind, plays a huge role in how to think about designing algorithms in a MapReduce environment. Namely, the reduce phase where you think about merging. Given two or more documents, how do you reduce them to one? The algebraic properties are things that one must consider (and you may get them for free).
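As a small sketch of what "algebraic properties" buys you in the reduce phase: if the merge operation is associative and commutative, the framework is free to group and order partial results however it likes. The `merge` function and the document shape below are hypothetical, and this uses plain `functools.reduce` rather than any particular MapReduce system.

```python
from functools import reduce


def merge(a: dict, b: dict) -> dict:
    """Combine two partial results; + and max are associative and commutative."""
    return {"count": a["count"] + b["count"], "max": max(a["max"], b["max"])}


# Hypothetical mapped documents, one per input record.
docs = [
    {"count": 1, "max": 3},
    {"count": 1, "max": 9},
    {"count": 1, "max": 4},
    {"count": 1, "max": 7},
]

sequential = reduce(merge, docs)

# Two "nodes" each reduce half of the data, then the halves are merged.
regrouped = merge(reduce(merge, docs[:2]), reduce(merge, docs[2:]))

# A different arrival order gives the same answer.
reordered = reduce(merge, reversed(docs))

print(sequential, regrouped, reordered)
```

A merge that kept, say, "the first document seen" would fail the reordering check, which is the kind of property worth thinking about before writing the reduce.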
Analysis
This is my favorite branch of mathematics because it is the puzzle of inserting zeros and bounding values. I recommend that anyone check out The Cauchy-Schwarz Master Class: An Introduction to the Art of Mathematical Inequalities. It is an amazing little book that I go through every year to make sure I'm still smart enough to call myself a Mathematician. The most obvious application of this art is numerical analysis. However, most of the time, I don't need to do any numerical analysis since I work primarily on search problems these days.
Unfortunately, most people get the short end of the stick when they study calculus and get a very boilerplate version of it. I recommend Differential and Integral Calculus.
I take analysis concepts outside of code and into management. For instance, how can I measure the code and enforce a code quality metric to prevent SQL injection hacks? How can I enable developers to converge to a right answer under QA? What does QA need to do?
Proofs
I must admit that I was a stickler when it came to proofs since writing a proof is just as much fun as programming is to me. When anyone writes a program, they are writing a constructive proof that something exists. This raises the question of whether or not that something is what you want.
Does your program need a proof? Two years ago, I would have said "absolutely". Now, I don't think so because proofs are kind of useless. The problem is that I have to understand enough about the formalism for the proof to make any sense. Well, the source code is a formalism of its own; in fact, a very precise formalism. The proof is already written.
Are proofs useless? I think going through the years of writing proofs has helped me write very good tests. I can look at the code, and know where the problem spots are going to be. Those trouble spots are going to need tests to ensure they work as expected. For instance, reliance on third party services always requires some kind of tests to ensure that updates are working as expected. Things I don't control are things I don't have a chance at proving, so I need tests that are automated and tested daily.
Problem Solving
I think the study of mathematics is probably the fastest way to build problem solving skills since you constantly fail and each failure costs nothing, but the caveat is you may not be solving practical problems. However, building it up as a skill enables you to be more effective at being a programmer.
Do you need advanced math?
Not really. Most math is basically a form of mental masturbation and building the mental discipline/stamina to sit down and think very hard. I think it makes you better in some ways, but there is an opportunity cost. It all depends on what you want to do. If you want to ship products, then you are probably fine to avoid it. If you want to make awesome libraries and sell them to product people, then you probably need some advanced math.
Labels:
technology