Comparing MongoDB and CouchDB

We get a lot of questions along the lines of "How are MongoDB and CouchDB different?"  It's a good question: both are document-oriented databases with schemaless JSON-style object data storage.  Both products have their place; we are big believers that databases are specializing and that "one size fits all" no longer applies.

We are not CouchDB gurus, so please let us know in the forums if we have something wrong.

MVCC

One big difference is that CouchDB is MVCC-based (multi-version concurrency control), while MongoDB is more of a traditional update-in-place store.  MVCC is very good for certain classes of problems: those that need intense versioning, those with offline databases that resync later, and those where you want a large amount of master-master replication.  MVCC also brings some work with it.  First, the database must be compacted periodically if there are many updates.  Second, when conflicting writes occur, the programmer must resolve them manually (unless the db also does conventional locking, although then master-master replication is likely lost).

MongoDB updates an object in-place when possible.  Problems requiring high update rates of objects are a great fit; compaction is not necessary. Mongo's replication works great but, without the MVCC model, it is more oriented towards master/slave and auto failover configurations than to complex master-master setups.  With MongoDB you should see high write performance, especially for updates.
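The trade-off described above can be illustrated with a toy in-memory sketch. Nothing here is CouchDB or MongoDB code: the class names `MvccStore` and `InPlaceStore`, and the integer revision numbers, are assumptions made purely for illustration. The MVCC store keeps every version and rejects writes based on a stale revision; the in-place store simply overwrites.

```python
class ConflictError(Exception):
    """Raised when a write is based on a stale revision."""


class MvccStore:
    """Toy MVCC store (hypothetical): each write appends a new revision."""

    def __init__(self):
        self._history = {}  # doc_id -> list of (rev, doc), oldest first

    def current_rev(self, doc_id):
        versions = self._history.get(doc_id)
        return versions[-1][0] if versions else 0

    def put(self, doc_id, doc, base_rev):
        # The writer must name the revision it read; otherwise it conflicts.
        if base_rev != self.current_rev(doc_id):
            raise ConflictError(f"{doc_id}: stale revision {base_rev}")
        new_rev = base_rev + 1
        self._history.setdefault(doc_id, []).append((new_rev, dict(doc)))
        return new_rev

    def stored_versions(self):
        return sum(len(v) for v in self._history.values())

    def compact(self):
        # Periodic compaction: keep only the latest revision of each document.
        for doc_id in self._history:
            self._history[doc_id] = self._history[doc_id][-1:]


class InPlaceStore:
    """Toy update-in-place store (hypothetical): one copy per document."""

    def __init__(self):
        self._docs = {}

    def increment(self, doc_id, field, amount=1):
        doc = self._docs.setdefault(doc_id, {})
        doc[field] = doc.get(field, 0) + amount  # overwrite, no new version
        return doc[field]

    def stored_versions(self):
        return len(self._docs)


mvcc = MvccStore()
rev = mvcc.put("page:home", {"hits": 1}, base_rev=0)
rev = mvcc.put("page:home", {"hits": 2}, base_rev=rev)
try:
    mvcc.put("page:home", {"hits": 99}, base_rev=1)  # based on an old read
    conflict_seen = False
except ConflictError:
    conflict_seen = True  # the caller has to reconcile manually

versions_before = mvcc.stored_versions()
mvcc.compact()
versions_after = mvcc.stored_versions()

inplace = InPlaceStore()
inplace.increment("page:home", "hits")
inplace.increment("page:home", "hits")
```

In this sketch the MVCC store accumulates one stored version per update until `compact()` runs, while the in-place store never holds more than one copy per document, which is the intuition behind the compaction and update-rate points above.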

Horizontal Scalability

One fundamental difference is that a number of Couch users use replication as a way to scale.  With Mongo, we tend to think of replication as a way to gain reliability/failover rather than scalability.  Mongo uses (auto) sharding as its path to scalability (sharding is GA as of 1.6).  In this sense MongoDB is more like Google BigTable.  (We hear that Couch might one day add partitioning too.)
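To make the idea of partitioning concrete, here is a minimal sketch of routing documents to shards by key range. It is not MongoDB's sharding implementation: the `RangeRouter` class, the split points, and the shard names are all hypothetical, and real auto-sharding also splits and migrates ranges as data grows, which this sketch omits.

```python
import bisect


class RangeRouter:
    """Toy router (hypothetical): maps a shard-key value to a shard by range."""

    def __init__(self, split_points, shard_names):
        if len(shard_names) != len(split_points) + 1:
            raise ValueError("need exactly one more shard than split points")
        if list(split_points) != sorted(split_points):
            raise ValueError("split points must be sorted")
        self._splits = list(split_points)
        self._shards = list(shard_names)

    def shard_for(self, key):
        # bisect_right counts the split points <= key, which is the index
        # of the range (and therefore the shard) that contains the key.
        return self._shards[bisect.bisect_right(self._splits, key)]


# Three shards covering keys < "h", "h" up to (not including) "q", and >= "q".
router = RangeRouter(["h", "q"], ["shard0", "shard1", "shard2"])
placement = {name: router.shard_for(name) for name in ["alice", "mallory", "zed"]}
```

Each document lives on exactly one shard, so adding shards adds write capacity. Replication, by contrast, copies the same data to every node.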

Query Expression

Couch uses a clever index building scheme to generate indexes which support particular queries.  There is an elegance to the approach, although one must predeclare these structures for each query one wants to execute.  One can think of them as materialized views.

Mongo uses traditional dynamic queries.  As with, say, MySQL, we can run queries where no index exists, or where an index is only partially helpful.  Mongo includes a query optimizer which makes these determinations.  We find this very nice for inspecting the data administratively, and it is also good when we don't want an index at all, for example on insert-intensive collections.  When an index corresponds perfectly to the query, the Couch and Mongo approaches are conceptually similar.  That said, we find expressing queries as JSON-style objects in MongoDB to be quick and painless.
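The two styles can be contrasted with a small in-memory sketch. This is neither database's real API: `build_view`, `by_city`, `matches`, and `find` are hypothetical helpers, and the matcher supports only equality plus the `$gt` and `$lt` operators, with no indexes or optimizer.

```python
from collections import defaultdict

docs = [
    {"_id": 1, "name": "ada", "city": "London", "age": 36},
    {"_id": 2, "name": "bob", "city": "Paris", "age": 25},
    {"_id": 3, "name": "cy", "city": "London", "age": 19},
]


def build_view(documents, map_fn):
    """Predeclared style: run map_fn over every document and index what it emits."""
    view = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            view[key].append(value)
    return dict(view)


def by_city(doc):
    # The "view definition": must exist before the query can be answered.
    yield doc["city"], doc["_id"]


def matches(doc, query):
    """Dynamic style: evaluate a JSON-style query object against one document."""
    for field, condition in query.items():
        value = doc.get(field)
        if isinstance(condition, dict):
            for op, operand in condition.items():
                if op not in ("$gt", "$lt"):
                    raise ValueError(f"unsupported operator: {op}")
                if value is None:
                    return False
                if op == "$gt" and not value > operand:
                    return False
                if op == "$lt" and not value < operand:
                    return False
        elif value != condition:
            return False
    return True


def find(documents, query):
    # No index needed: scan and filter (an optimizer could pick an index instead).
    return [doc for doc in documents if matches(doc, query)]


city_view = build_view(docs, by_city)
londoners_over_30 = find(docs, {"city": "London", "age": {"$gt": 30}})
```

Looking up `city_view["London"]` is fast but only answers the question the view was built for. `find` accepts any query object at the cost of scanning when no suitable index exists.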

Update, Aug 2011: Couch is adding a new query language, "UNQL".

Atomicity

Both MongoDB and CouchDB support concurrent modifications of single documents.  Both forgo complex transactions involving large numbers of objects.

Durability

CouchDB is a "crash-only" design where the db can terminate at any time and remain consistent.

Previous versions of MongoDB used a storage engine that would require a repairDatabase() operation when starting up after a hard crash (similar to MySQL's MyISAM).  Version 1.7.5 and higher offer durability via journaling; specify the --journal command-line option to enable it.

Map Reduce

Both CouchDB and MongoDB support map/reduce operations.  For CouchDB, map/reduce is inherent to the building of all views.  With MongoDB, map/reduce is used only for data processing jobs, not for traditional queries.
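For readers new to the pattern, the following is a minimal sketch of map/reduce over a list of documents. Both databases actually express the map and reduce functions in JavaScript; this Python version, including the `map_reduce`, `emit_total`, and `sum_values` names and the sample data, is an illustrative assumption. It also calls reduce once per key, whereas real engines may reduce incrementally.

```python
from collections import defaultdict


def map_reduce(documents, map_fn, reduce_fn):
    """Group the (key, value) pairs emitted by map_fn, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


orders = [
    {"customer": "ada", "total": 30},
    {"customer": "bob", "total": 12},
    {"customer": "ada", "total": 8},
]


def emit_total(doc):
    # Map step: emit one (key, value) pair per order.
    yield doc["customer"], doc["total"]


def sum_values(key, values):
    # Reduce step: collapse all values for a key into one result.
    return sum(values)


totals = map_reduce(orders, emit_total, sum_values)
# totals == {"ada": 38, "bob": 12}
```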

JavaScript

Both CouchDB and MongoDB make use of JavaScript.  CouchDB uses JavaScript extensively, including in the building of views.

MongoDB supports the use of JavaScript, but more as an adjunct.  In MongoDB, query expressions are typically expressed as JSON-style query objects; however, one may also specify a JavaScript expression as part of the query.  MongoDB also supports running arbitrary JavaScript functions server-side and uses JavaScript for map/reduce operations.

REST

Couch uses REST as its interface to the database.  With its focus on performance, MongoDB relies on language-specific database drivers for access to the database over a custom binary protocol.  Of course, one could add a REST interface atop an existing MongoDB driver at any time; that would be a very nice community project.  Some early-stage REST implementations exist for MongoDB.

Performance

Philosophically, Mongo is very oriented toward performance, at the expense of features that would impede performance.  We see MongoDB being useful for many problems where databases have not been used in the past because databases are too "heavy".  Features that give MongoDB good performance are:

  • client driver per language: native socket protocol for client/server interface (not REST)
  • use of memory-mapped files for data storage
  • collection-oriented storage (objects from the same collection are stored contiguously)
  • update-in-place (not MVCC)
  • written in C++

Use Cases

It may be helpful to look at some particular problems and consider how we could solve them.

  • if we were building Lotus Notes, we would use Couch, as its MVCC model, with version reconciliation handled by the programmer, fits perfectly.  Any problem where data is offline for hours and then back online would fit this.  In general, if we need several eventually consistent master-master replica databases, geographically distributed and often offline, we would use Couch.
  • mobile
    • Couch is better as a mobile embedded database on phones, primarily because of its online/offline replication/sync capabilities.
    • we like Mongo server-side; one reason is its geospatial indexes.
  • if we had very high performance requirements we would use Mongo.  For example, web site user profile object storage and caching of data from other sources.
  • for a problem with very high update rates, we would use Mongo, as its "update-in-place" design handles that well.  For example, see updating real-time analytics counters.
  • in contrast to the above, Couch is better when lots of snapshotting is a requirement, because of its MVCC design.

Generally, we find MongoDB to be a very good fit for building web infrastructure.
