Monday, November 15, 2010

How to use CouchDB? Like this

CouchDB is a very interesting persistence package, and it solves 90% of the problems you find when you build a back-end for a web application. That 90% covers get/put, B-tree indexing, and reliability; all of this is good, standard stuff in the database world. I want to talk about the other 10%, since CRUD is boring.

The last 10% is usually something like search: the novel algorithm that takes all your data and presents it in a meaningful way that makes your product awesome. CouchDB rarely solves this (neither do other packages). The more specialized the algorithm is, the more painful it will be to solve with CouchDB's MapReduce framework alone.

Fortunately, CouchDB has replication built in. I use replication to push data from CouchDB to a custom server, where I aggregate it into a meaningful service. The library that does this is called Otto (short for ottoman).
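
To make the replication step concrete, here is a minimal sketch of asking CouchDB to replicate a database to another HTTP endpoint through its `/_replicate` resource. This is not Otto's API (which isn't shown here); the host names, port numbers, and the `products` database are made up for illustration, and the sketch assumes the custom server speaks enough of CouchDB's replication protocol to act as a target.

```python
import json
import urllib.request


def build_replication_request(couch_url, source_db, target_url, continuous=True):
    """Build (but do not send) a POST to CouchDB's /_replicate endpoint."""
    body = {
        "source": source_db,       # database on the CouchDB side
        "target": target_url,      # hypothetical custom server acting as target
        "continuous": continuous,  # keep pushing changes as they arrive
    }
    return urllib.request.Request(
        url=couch_url.rstrip("/") + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# All names below are assumptions for the example, not real deployments.
req = build_replication_request(
    "http://localhost:5984", "products", "http://localhost:8080/products"
)
# urllib.request.urlopen(req) would start the replication against a live CouchDB.
```

Building the request separately from sending it keeps the sketch runnable without a database, and makes it easy to see exactly what CouchDB is being asked to do.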

The biggest problem you are going to have: what happens when your custom server crashes?

This can be solved by
  1. providing your own persistence and dealing with reliability yourself;
  2. not worrying about it and launching 3 servers with a custom HTTP server that replicates three ways (spend more money);
  3. not caring and re-replicating the entire data set, with potentially non-trivial downtime.
All three of these options suck at some level, but 2 is where you will want to go. In the beginning, though, 3 is the best choice.

From a complexity standpoint, you can make your life easier by enabling your custom software to merge data in bulk. This lets you run your algorithm lazily, over a lot of data collected at once. I have found that this style of bulk insertion makes the third option feasible in many domains.
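
A rough sketch of what bulk merging can look like, using a toy word index as a stand-in for the "novel algorithm". The class name `BulkIndex`, the `text` field, and the batch size are all assumptions for the example; the point is only that incoming documents are buffered and the expensive step runs once per batch, or when someone actually queries.

```python
from collections import defaultdict


class BulkIndex:
    """Toy word index that buffers documents and merges them in bulk."""

    def __init__(self, batch_size=1000):
        self.batch_size = batch_size
        self.pending = []               # documents received but not yet indexed
        self.index = defaultdict(set)   # word -> set of document ids

    def add(self, doc):
        """Buffer a document; merge only when the batch is full."""
        self.pending.append(doc)
        if len(self.pending) >= self.batch_size:
            self.merge()

    def merge(self):
        """Run the (here trivial) algorithm over everything buffered so far."""
        for doc in self.pending:
            for word in doc.get("text", "").lower().split():
                self.index[word].add(doc["_id"])
        self.pending = []

    def search(self, word):
        """Merge lazily before answering, so results include buffered docs."""
        self.merge()
        return sorted(self.index.get(word.lower(), set()))


idx = BulkIndex(batch_size=2)
idx.add({"_id": "a", "text": "red couch"})
idx.add({"_id": "b", "text": "blue couch"})
idx.add({"_id": "c", "text": "red ottoman"})
print(idx.search("red"))
```

Because a full re-replication arrives as one large stream of documents, a structure like this rebuilds in a few big merges instead of one update per document, which is what keeps the downtime of option 3 tolerable.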

The nice thing about using CouchDB is that you don't need a schema. You don't need to "plan". If you need to store data, then you just store it. Just give it a namespace and insert.
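
As a small illustration of "give it a namespace and insert", the sketch below prefixes the document id with a namespace and builds a `PUT /{db}/{docid}` request. Treating the namespace as an id prefix is my reading for the example, not the only way to do it, and the `app` database, the `user` namespace, and the field names are made up.

```python
import json
import urllib.parse
import urllib.request


def make_doc(namespace, key, **fields):
    """Build a schema-free document whose id is prefixed by a namespace."""
    doc = dict(fields)
    doc["_id"] = f"{namespace}:{key}"  # assumed convention: "namespace:key"
    return doc


def build_put_request(couch_url, db, doc):
    """Build (but do not send) a PUT that stores the document in CouchDB."""
    doc_id = urllib.parse.quote(doc["_id"], safe="")  # escape ':' and '/'
    return urllib.request.Request(
        url=f"{couch_url.rstrip('/')}/{db}/{doc_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# No schema declared anywhere: the fields are whatever you pass in.
user = make_doc("user", "alice", email="alice@example.com", karma=3)
put_req = build_put_request("http://localhost:5984", "app", user)
# urllib.request.urlopen(put_req) would store the document on a live CouchDB.
```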

Please comment if you think I should write a book on CouchDB. I've done relational databases for years, and I've lurked around CouchDB for a while. I'm currently building a web framework around node.js and CouchDB called WIN.
