
I use it.

At first I was in love with it and now I just like it. It is a good tool for some things but it is not a universal replacement for everything database-like.

I like:

* Futon (yes, I said it: I like Couch because of a pretty GUI). I can quickly look at the database and debug what the problem is. Visibility into the database, the serialization format and the protocol is important to me.

* HTTP interface. Access it with curl or any other tools that speak HTTP.

* Documents are json. I like json.

* Can attach binary blobs to documents, with MIME content-type markers, and serve this data to a web browser.

* Durable -- when my request has finished, I know the data is on the disk.

* Map/Reduce -- I just like it. It makes sense to me.
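
To make the "HTTP interface" and "documents are JSON" points concrete, here is a minimal sketch of what storing a document amounts to: a PUT of a JSON body to a URL. The database name `notes`, the document id `first`, the helper `build_put`, and the base URL (a local server on the default port 5984) are all assumptions for illustration. The request is only built here, not sent.

```python
import json
from urllib.request import Request

# Assumption: a CouchDB server listening locally on the default port.
BASE_URL = "http://localhost:5984"


def build_put(db, doc_id, doc):
    """Build (but do not send) a PUT request that would store `doc` as JSON."""
    body = json.dumps(doc).encode("utf-8")
    return Request(
        f"{BASE_URL}/{db}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical database and document, purely for illustration.
req = build_put("notes", "first", {"title": "hello", "tags": ["couchdb"]})
```

Passing `req` to `urllib.request.urlopen` would send it; the same request is easy to reproduce with curl, which is much of the appeal.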

Don't like:

* Not very good for high-rate updates. You need to compact it periodically.

* Not that fast. I know I can tweak it here and there, but by default it is on the slow end of the NoSQL DB speed range. I can live with this.

* MVCC. For some people this is a GoodThing; for me it is a "Meh". Sometimes I would like to just update the document in place and overwrite the old one. Perhaps Redis or Mongo would be a better tool here. For now I do it with a retry loop: on a failed write, I read the document again, pick up the new _rev, and write it back.

* Map/Reduce -- Am I the only one who likes this? It is hard to get others onboard and start thinking of this instead of traditional queries.
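
The read / update _rev / write-back loop from the MVCC bullet can be sketched roughly as follows. This is a simulation, not a real client: `FakeStore`, `Conflict` and `update_with_retry` are hypothetical names, revisions are plain integers rather than CouchDB's real revision strings, and `Conflict` stands in for the HTTP 409 a real server returns when the `_rev` is stale.

```python
class Conflict(Exception):
    """Stands in for CouchDB's HTTP 409 response on a stale _rev."""


class FakeStore:
    """In-memory stand-in for a database (hypothetical, not a real client)."""

    def __init__(self):
        self._docs = {}

    def get(self, doc_id):
        # Return a copy so callers cannot mutate the stored document directly.
        return dict(self._docs[doc_id])

    def put(self, doc):
        current = self._docs.get(doc["_id"])
        # Reject the write if the caller's _rev is not the latest one.
        if current is not None and current["_rev"] != doc.get("_rev"):
            raise Conflict(doc["_id"])
        stored = dict(doc)
        # Simplification: integer revisions instead of real revision strings.
        stored["_rev"] = (current["_rev"] if current else 0) + 1
        self._docs[doc["_id"]] = stored
        return stored["_rev"]


def update_with_retry(store, doc_id, change, max_attempts=5):
    """Re-read the document and re-apply `change` until the write succeeds."""
    for _ in range(max_attempts):
        doc = store.get(doc_id)  # fresh copy, carrying the latest _rev
        change(doc)
        try:
            return store.put(doc)
        except Conflict:
            continue  # someone else wrote first; read again and retry
    raise Conflict(doc_id)
```

Note that `change` is re-applied to the freshly read document on every attempt, so a competing write is never overwritten blindly.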

Things I don't care about but others like:

* Offline replication. Others rave about it. I have used it only a couple of times. I am always afraid of ending up with conflicts silently buried in the DB history.

* Javascript -- I don't really like it that much. I use the Python view server to write views. It is actually faster! See:

http://packages.python.org/CouchDB/views.html
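
For anyone who has not seen a Python view, here is a rough sketch. As I understand the page linked above, a map function is a plain Python generator that takes a document and yields key/value pairs. The reduce signature follows CouchDB's general (keys, values, rereduce) contract. The function names, the `tags` field and the little `run_view` driver are my own assumptions; the driver only imitates locally what the server would do with these functions.

```python
def map_fun(doc):
    # Emit one (key, value) row per tag; documents without tags emit nothing.
    for tag in doc.get("tags", []):
        yield tag, 1


def reduce_fun(keys, values, rereduce):
    # Summing works for both the first reduce pass and any re-reduce pass.
    return sum(values)


def run_view(docs):
    """Local imitation of a grouped view query (hypothetical helper)."""
    grouped = {}
    for doc in docs:
        for key, value in map_fun(doc):
            grouped.setdefault(key, []).append(value)
    return {key: reduce_fun(None, values, False) for key, values in grouped.items()}


sample_docs = [
    {"_id": "a", "tags": ["couchdb", "nosql"]},
    {"_id": "b", "tags": ["couchdb"]},
    {"_id": "c"},
]
counts = run_view(sample_docs)
```

With these three sample documents, `counts` comes out as two rows for "couchdb" and one for "nosql".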



I agree on most of your points, but I use Node.js so the Javascript stuff is just great. I also like Couchapps so in any case, CouchDB is just fantastic for those.

I think people are tainted on Map/Reduce because they see how it is done with other NoSQL DBs like MongoDB where it doesn't store the indexes. The fact that CouchDB stores its views makes complex queries fast and easy to use. Maybe there is a little initial overhead in writing the views, but after that it is efficient and keeps a lot of code out of the program, so you get the data you want in the way you want it.


> I think people are tainted on Map/Reduce because they see how it is done with other NoSQL DBs like MongoDB where it doesn't store the indexes

Maybe I misunderstand what you mean by stored indexes, but MongoDB supports indexes (http://www.mongodb.org/display/DOCS/Indexes), it just doesn't require them so you can perform exploratory ad-hoc queries when you need to.


I used the wrong term, I apologize. From what I understand, MongoDB doesn't store the results of a Map/Reduce calculation; it treats each Map/Reduce query separately and runs it on demand. This means that if you have a large data set, your M/R might be too computationally demanding to run every time.

Whereas with CouchDB, every time the view (the M/R results) is accessed, it gets incrementally updated and stored. So even complex M/R queries on large datasets will be fast on subsequent accesses.
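
A toy sketch of that incremental idea, under heavy assumptions: `IncrementalView` is a made-up class, the change log is a plain list of (sequence number, document) pairs, and rows live in a Python list rather than an on-disk B-tree. It only shows the map side; incremental reduce and deletions are left out.

```python
class IncrementalView:
    """Toy stored view that only maps documents it has not yet seen."""

    def __init__(self, map_fun):
        self.map_fun = map_fun
        self.rows = []        # stored (key, value, doc_id) rows
        self.last_seq = 0     # highest change sequence already indexed
        self.docs_mapped = 0  # counter, to show how little work a query does

    def query(self, changes):
        """`changes` is a list of (seq, doc) pairs in sequence order."""
        for seq, doc in changes:
            if seq <= self.last_seq:
                continue  # already indexed on an earlier query
            # Drop rows emitted by an older revision of this document.
            self.rows = [row for row in self.rows if row[2] != doc["_id"]]
            for key, value in self.map_fun(doc):
                self.rows.append((key, value, doc["_id"]))
            self.last_seq = seq
            self.docs_mapped += 1
        return sorted((key, value) for key, value, _ in self.rows)


def by_type(doc):
    # Hypothetical map function: index documents by their "type" field.
    yield doc["type"], 1
```

Querying twice against a log that gained one document maps only that one document the second time, which is the whole point of keeping the results around.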


Just some clarification here as comparing MongoDB and CouchDB's Map/Reduce is a bit of "Apples and Oranges" as they are designed for different purposes.

While in CouchDB all queries are created with Map/Reduce, in MongoDB Map/Reduce is designed for aggregation. There is a separate system for standard querying which performs much better than Map/Reduce. (Before people jump on me here: my performance comparison is Mongo queries to MONGO Map/Reduce, not Couch Map/Reduce or Map/Reduce in general. I'm not looking to get into a benchmark discussion here, just feature clarification.)

The two differ greatly: while Couch requires precalculation of "Views", MongoDB focuses on dynamic querying. Given CouchDB's use of Map/Reduce for these static views, the way they do the iterative addition of new data without re-reducing the entire dataset makes sense.

However, MongoDB doesn't require you to use Map/Reduce to run queries, and its Map/Reduce is not designed for day-to-day querying. Rather, one should use MongoDB Map/Reduce for data aggregation tasks.

Also notable is that in 1.8, MongoDB has added some additional functionality for those who are using Map/Reduce which allows you to merge output and build data across jobs. I did a write up on the changes a few weeks ago: http://blog.evilmonkeylabs.com/2011/01/27/MongoDB-1_8-MapRed...

(In the interest of Full Disclosure: I work for 10gen, the company behind MongoDB; I'm also working on a book about MongoDB.)


I think he meant that CouchDB keeps stuff on disk for the views (in B-trees, to be more precise), so that when the corresponding database is updated, the map/reduce does not run over the whole dataset, only over the new document.

As far as I know, MongoDB does not work this way.

Personally, I simply do not see the point of map/reduce for something like Couch: the whole point is to be able to parallelize jobs across multiple machines so that data does not need to move back and forth over the network. Debugging them is particularly painful, especially if you need to support non-programmers (SQL is used by many non-programmers in my experience).


MongoDB uses btree indexes as well, it just doesn't use Map/Reduce for its regular querying like CouchDB does. See my above comment on differences as well as a writeup I did on new output options in MongoDB's M/R system.


The original part of CouchDB's design is using B-trees for views to hold temporary data for M/R jobs; B-trees themselves are indeed used by MongoDB (and most DB engines).

The new stuff for M/R in mongodb looks interesting, thanks for the update.


You're definitely not the only one who likes Map/Reduce - storing arbitrarily shaped json docs and then having a set of fixed views over these seems to be a very good map for how I like to think about application data. Having to rigidly decompose structures into a relational data model always gave me indigestion, which I used to feel guilty about, whereas I'm much happier using a document oriented database for my own projects.



