Failing with MongoDB

Failing with MongoDB (blog.schmichael.com)

154 points by lenn0x 14 years ago | 122 comments

nomongo 14 years ago |

Why is a database that fails so easily and most of the time even loses data so popular? Is it really all just a huge marketing budget?

hello_moto 14 years ago | |

There are tons of reasons for that. Let me pull some of them from my butt:

Reason #1: Devs aren't Ops.

Reason #2: Devs need something new on their resume.

Reason #3: A certain type of dev reads blogs, gets excited, skips the scientific mumbo-jumbo, and takes the blogs as _the_ source of truth.

Reason #4: It's easy to bootstrap (schemaless, etc) your weekend project. Dealing with a DB is apparently tedious for devs.

I'm sure others can add more...

Let me feel your love HN-ers ;)

viraptor 14 years ago | | |

Why do I get the feeling that you're an op and look down on development people? If that's really true, try to start developing some project and see how you like frequent schema changes, trying to synchronise schemas with peers, resolving relation issues when merging features, etc. On the other hand if you abstract your interaction with data enough, you can change the whole backend later once it's stable and not care about it up-front.

Unfortunately, what I hear you saying is: it's worse for ops, so no one should use it.

Devilboy 14 years ago | | |

Right there with you. The excuses I hear for not wanting to use a good ol' RDBMS just do not make sense to me sometimes. CREATE TABLE too hard? Time consuming? Difficult?

Those who do not study the history of databases are doomed to repeat it. Soon they'll add back row-level write locks, transaction logging, schemas, and multiple indexes, and one day they'll wake up with MongoSQL.

japherwocky 14 years ago | | |

Well here's a fuck you back from a dev: my time is finite and everyone wants a piece of it. If I can save an hour a day by never having to think about my database? If I can shave a week or two of labor off a project?

It's really easy to work with. This is why people keep using it.

rbranson 14 years ago | |

10gen has focused strongly on ease of adoption, which seems like the highest priority of MongoDB at this point. From what I can tell, the idea is to get everyone using it, and then "scale" it once you've got people willing to pay out $ for fixes, but sometimes bad decisions made early on (like the global locks and in-place updates) are harder to change than originally thought.

stoneg 14 years ago | | |

Yes, we have been using Mongo since 1.6.3. Reliability, locking, and data security (not losing data) are never the first priorities on their to-do list; they just push new features and stay busy with marketing propaganda about how web-scale MongoDB is (which is fake). I submitted a JIRA issue about losing data when syncing a slave. It's been 3+ months, and all they did was ask me to try the new releases to see if they fix the problem. I tried the latest 2.0.1 release, and it still causes data loss. Every time I sync a new slave, I pray to god, hoping not to lose data.

How can a DB lose data so frequently and still call itself web-scale? It just breaks when you need scale!

Auto-sharding is also super unreliable. We tried it once and it failed, and now we are using a lib that does application-level sharding. We are also considering moving to other databases that at least know that not losing data is the first and most important job of a DB.
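For readers unfamiliar with the application-level sharding mentioned above, here is a minimal sketch of the general idea, not the library this commenter used. The `ShardRouter` class and the shard names are hypothetical; the point is only that the application, not the database, decides which server owns a key.

```python
import hashlib


class ShardRouter:
    """Route each key to one of a fixed list of shards (hypothetical helper)."""

    def __init__(self, shard_names):
        if not shard_names:
            raise ValueError("need at least one shard")
        self.shard_names = list(shard_names)

    def shard_for(self, key: str) -> str:
        # Use a stable hash (not Python's per-process randomized hash())
        # so every application server maps a key to the same shard.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shard_names)
        return self.shard_names[index]


# Shard names are placeholders for real connection handles.
router = ShardRouter(["db0", "db1", "db2", "db3"])
print(router.shard_for("user:42"))
```

Plain modulo routing like this remaps most keys when the number of shards changes, which is one reason real deployments often prefer consistent hashing or a lookup table.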

Someone summarized the issues with MongoDB at http://pastebin.com/raw.php?i=FD3xe6Jt , and we experienced most of the problems in that article. So, a reminder for anyone who wants to build a serious product on MongoDB: read the article. It's not FUD. It's so true that I wish I had read it a year ago, so we wouldn't now have to move so much legacy data to a new database solution.

hello_moto 14 years ago | | |

Not a bad business strategy. Kinda like MySQL back then right?

rdtsc 14 years ago | |

I have been asking that too, and I concluded that it is due to dishonest marketing. Up until a couple of months ago they basically shipped a database product with single-server durability disabled. That fact should have been written in a bright, flashing, red-letter warning on their front page. It wasn't. So it made for very fast benchmarks, because everyone benchmarks for speed and not many benchmark for failure.

BarkMore 14 years ago | |

The data model contributes to its popularity. A document store with indexes on document fields is very convenient for several types of applications.

jethroalias97 14 years ago | | |

It's interesting that couchdb gets little love (as evidenced by google trends), but it has document storage by index, easy enough to install, copy on write so has no global lock, sharding with bigcouch, and all client access is entirely REST... it may be couch is a little hard to grok, I dunno.

ehthere 14 years ago | | |

You can do exactly the same document store with indexes on any RDBMS.

viraptor 14 years ago | |

If I never expect the dataset to grow past 1GB and a single server, why would I use anything else? It doesn't really fail - none of the issues described were "failures" really. [edit: just to be clear, it didn't crash and burn, I don't think performance issue == failure] The data loss was not confirmed either: "There appears to be some data loss occurring", and in small deployments you can just use the transaction log.

There's no other project I know of, which provides: schemaless json documents, indexing on any part of them, server-side mapreduce, lots of connectors for different languages, atomic updates on part of the document. If there is one and it's better than mongo, I'd switch any moment.

cscotta 14 years ago | | |

>> "It doesn't really fail - none of the issues described were "failures" really."

These absolutely were failures.

The author listed several instances in which the database became unavailable, the vendor-supplied client drivers refused to communicate with it, or both. Some of these scenarios included the primary database daemon crashing, secondaries failing to return from a "repairing" to an "online" state after a failure (and unable to serve operations in the cluster), and configuration servers failing to propagate shard config to the rest of the cluster -- which required taking down the entire database cluster to repair.

Each of the issues described above would result in extended application downtime (or at best highly degraded availability), the full attention of an operations team, and potential lost revenue. The data loss concern is also unnerving. In a rapidly-moving distributed system, it can be difficult to pin down and identify the root cause of data loss. However, many techniques such as implementing counters at the application level and periodically sanity-checking them against the database can at minimum indicate that data is missing or corrupted. The issues described do not appear to be related to a journal or lack thereof.
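The application-level counter check suggested above could look roughly like the sketch below. Everything here is an assumption for illustration: `CountingStore` is a hypothetical wrapper, and a plain dict stands in for the database. In a real system the tally would need to live somewhere other than the database being audited.

```python
from collections import Counter


class CountingStore:
    """Wrap a store and keep an independent tally of acknowledged inserts."""

    def __init__(self, backend):
        # Hypothetical backend: dict mapping collection name -> list of docs.
        self.backend = backend
        self.expected = Counter()

    def insert(self, collection, doc):
        self.backend.setdefault(collection, []).append(doc)
        # Count only after the write call has returned successfully.
        self.expected[collection] += 1

    def audit(self):
        """Return {collection: (expected, actual)} for every mismatch."""
        mismatches = {}
        for collection, expected in self.expected.items():
            actual = len(self.backend.get(collection, []))
            if actual != expected:
                mismatches[collection] = (expected, actual)
        return mismatches


backend = {}
store = CountingStore(backend)
for i in range(5):
    store.insert("events", {"n": i})
clean_audit = store.audit()      # {} while nothing is missing

backend["events"].pop()          # simulate a silently lost document
print(store.audit())             # {'events': (5, 4)}
```

A check like this cannot say which document vanished or why, only that the totals disagree, which is the "at minimum" signal the comment describes.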

Further, the fact that the database's throughput is limited to utilizing a single core of a 16-way box due to a global write lock demonstrates that even when ample IO throughput is available, writes will be stuck contending for the global lock, while all reads are blocked. Being forced to run multiple instances of the daemon behind a sharding service on the same box to achieve any reasonable level of concurrency is embarrassing.

On the "1GB / small dataset" point, keep in mind that Mongo does not permit compactions and read/write operations to occur concurrently. As documents are inserted, updated, and deleted, what may be 1GB of data will grow without bound in size, past 10GB, 16GB, 32GB, and so on until it is compacted in a write-heavy scenario. Unfortunately, compaction also requires that nodes be taken out of service. Even with small datasets, the fact that they will continue to grow without bound in write/update/delete-heavy scenarios until the node is taken out of service to be compacted further compromises the availability of the system.

What's unfortunate is that many of these issues aren't simply "bugs" that can be fixed with a JIRA ticket, a patch, and a couple rounds of code review -- instead, they reach to the core of the engine itself. Even with small datasets, there are very good reasons to pause and carefully consider whether or not your application and operations team can tolerate these tradeoffs.

rhizome 14 years ago | | |

The data loss was not confirmed either: "There appears to be some data loss occurring"

Oh, this mystery is a failure all right, and even the most charitable interpretation would call it a misfeature.

japherwocky 14 years ago | | |

what do you think of redis? I feel the same way about Mongo for the most part, but have been considering switching.

InclinedPlane 14 years ago | |

So what's the preferred alternative, NoSQL-wise?

MongoDB is flaky. CouchDB is a maintainability nightmare, so I hear.

Riak? Cassandra? Or does everything else have some other equally huge down-side?

rkalla 14 years ago | | |

They all have their warts. For every story like this, there are petabyte deployments of your favorite datastore that work fine.

For every "X sucks" article, there is a "Y is awesome" one.

In the NoSQL world the only way to choose is around the problems they solve... They are each specializing and optimizing for certain niches. Mongo is the most MySQL-esque, but doesn't do things that Redis, Couch, or Cassandra do that you may need.

There is no clear winner (fortunately or unfortunately, depending on what you were hoping for).

espeed 14 years ago | | |

Many are moving to Neo4j, including some of the major social networks.

fdr 14 years ago | |

It has a pretty good user experience, except for all the details. But the model isn't bad; it should be learned from. On the other hand, there is no trade-off made by Mongo that I'm aware of that is fundamentally unavailable to more mature projects in a tractable amount of engineering time, so the question comes down to "does Mongo shed its reputation for lulz soon enough" vs "do other projects witness and adapt".

Yet we've also seen in the past that shedding such a reputation is not strictly required to be popular. And marketing budgets do matter.

vannevar 14 years ago | |

Why is a database that fails so easily and most of the time even loses data so popular?

Perhaps because both of your premises are wrong? I've used Mongo for over a year now with ~1000 writes/sec and haven't seen any of these problems. I'm not saying they don't exist (some are confirmed bugs that have been fixed), but they're not nearly as prevalent as your 'Do you still beat your wife?'-style question implies.

FooBarWidget 14 years ago | |

Not all data is important enough that small losses are unacceptable. Analytics data that can be inferred from other sources, for example. Furthermore MongoDB supports autosharding while most (all?) SQL databases do not.

dextorious 14 years ago | |

Probably because the quality of CS graduates has been so low in recent years. MongoDB = oh, shiny, fast.

gojomo 14 years ago |

Maybe there's a niche for "PostgreNoSQL", a layer atop Postgres that you start using like a NoSQL solution. (Perhaps, it's string keys and JSON blob values.) It's not very efficient, except for simple keyed lookups, but it works enough for a quick start.

Then, as you use it, the system optimizes itself (or makes suggestions) based on actual access patterns. A subset of objects could be a formal, indexed table? Have it happen automatically or offer the SQL as a suggestion.
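A rough sketch of the idea above, under loud assumptions: SQLite (from Python's standard library) stands in for Postgres so the example is self-contained, and `DocStore`, its table layout, and the suggestion text are all hypothetical. It stores string keys with JSON blob values, does slow scans for field lookups, and records access patterns so it can suggest which fields deserve a real indexed column.

```python
import json
import sqlite3
from collections import Counter


class DocStore:
    """Key -> JSON blob store on a relational table (SQLite as a stand-in)."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE docs (key TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )
        self.field_hits = Counter()  # how often each field was filtered on

    def put(self, key, doc):
        self.db.execute(
            "INSERT OR REPLACE INTO docs (key, body) VALUES (?, ?)",
            (key, json.dumps(doc)),
        )

    def get(self, key):
        # Efficient path: simple keyed lookup via the primary key.
        row = self.db.execute(
            "SELECT body FROM docs WHERE key = ?", (key,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def find(self, field, value):
        # Inefficient path: no index on blob contents, so scan and decode all.
        self.field_hits[field] += 1
        matches = []
        for (body,) in self.db.execute("SELECT body FROM docs ORDER BY key"):
            doc = json.loads(body)
            if doc.get(field) == value:
                matches.append(doc)
        return matches

    def suggestions(self, threshold=3):
        # Fields filtered often are candidates for a real, indexed column.
        return [
            f"promote '{field}' to an indexed column ({hits} scans so far)"
            for field, hits in self.field_hits.items()
            if hits >= threshold
        ]


store = DocStore()
store.put("u1", {"name": "ann", "city": "Oslo"})
store.put("u2", {"name": "bob", "city": "Lima"})
for _ in range(3):
    store.find("city", "Oslo")
print(store.suggestions())
```

The sketch only makes suggestions. Actually migrating a hot field into a typed column and rewriting queries to use it is the hard part of the proposal and is not attempted here.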

i34159 14 years ago | |

Conversely, you could have a NoSQL layer below Postgres, where PG stores and indexes metadata which tells it which, of many, small NoSQL dbs to find the actual data in. These data dbs then can be sharded/replicated across physical systems as you like. You lose some raw speed on reads, but avoid a global write lock and the system scales quite well. I've started playing around with such a system with https://github.com/cloudflare/SortaSQL

rasur 14 years ago | |

IIRC, there are people talking directly to InnoDB (MySQL backend) using it as a NoSQL style DB. You don't however get SQL analysis, you're bypassing the SQL side of things.

einhverfr 14 years ago | |

hstore?

christkv 14 years ago |

Seems to me they used the wrong setup. They should have looked at a replica set setup with secondaries for reads, and sharding if they needed more write performance and nonblocking reads. That said, version 2 has fewer locking problems and I understand they are working on finer-grained locking.

schmichael 14 years ago | |

Sorry, this is a pretty poorly written blog post. We're definitely using sharding+replica sets.

Replication of any kind won't help you with a high write load as secondaries have to apply the same number of writes as primaries.

christkv 14 years ago | | |

They seem to be very aware of the problem and focused on solving it as soon as possible. I guess it's just a matter of time. Compared to how long it took MySQL to mature into a stable platform I've been pretty impressed at their responsiveness and quick improvements so far :).

StavrosK 14 years ago |

All I need is a schemaless version of postgres (with ACID-compliance and everything), does anyone know of one?

ericflo 14 years ago | |

http://www.postgresql.org/docs/9.0/static/hstore.html

StavrosK 14 years ago | | |

That's very useful, thank you!

lucian1900 14 years ago |

Sadly, MongoDB blows for actual usage. It locks, it's not crash-only, it has mutable data.

CouchDB is much better (you're as likely to lose data as with Postgres), but is potentially less efficient (no BSON).

bbulkow 14 years ago |

Disclosure: I wrote a product called Citrusleaf, which also plays in the NoSQL space.

My focus in starting Citrusleaf wasn't features, it was operational dependability. I had worked at companies who had to take their system offline when they had the greatest exposure - like getting massive load from the Yahoo front page (back in the day). Citrusleaf focuses on monitoring, integration with monitoring software, operations. We call ourselves a real-time database because we've focused on predictable performance (and very high performance).

We don't have as many features as mongo. You can't do a javascript/json long running batch job. We'll get to features.

The global R/W lock does limit mongo. Absolutely. Our testing shows a nearly 10x difference in performance between Mongo and Citrusleaf on writes. Frankly, if you're still doing 1,000 tps, you should probably stick with a decent MySQL implementation.

Here's a performance analysis we did: http://bit.ly/rRlq9V

This theory that "mongo is designed to run on in-memory data sets" is, frankly, terrible --- simply because mongo doesn't give you the control to keep you in memory. You don't know when you're going to spill out of memory. There's no way to "timeout" a page cache IO. There's no asynchronous interface for page IO. For all of these reasons - and our internal testing showing page IO is 5x slower than aio; the reason all professional databases use aio and raw devices - we coded Citrusleaf using normal multithreaded io strategies.

With Citrusleaf, we do it differently, and that difference is huge. We keep our indexes in memory. Our indexes are the most efficient anywhere - more objects, fea. You configure Citrusleaf with the amount of memory you want to use, and apply policies when you start flowing out of memory. Like not taking writes. Like expiring the least-recently-used data.
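The two out-of-memory policies named above (refusing writes, or expiring the least-recently-used data) can be illustrated with a toy in-memory store. This is not Citrusleaf's implementation; `BoundedStore`, `StoreFull`, and the policy names are hypothetical, and an item count stands in for a real memory budget.

```python
from collections import OrderedDict


class StoreFull(Exception):
    """Raised when the store is at capacity and the policy is 'reject'."""


class BoundedStore:
    def __init__(self, max_items, policy="reject"):
        if policy not in ("reject", "evict_lru"):
            raise ValueError(f"unknown policy: {policy}")
        self.max_items = max_items
        self.policy = policy
        self.items = OrderedDict()  # oldest (least recently used) first

    def get(self, key):
        value = self.items[key]
        self.items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        if key in self.items:
            # Overwrites never grow the store, so they are always allowed.
            self.items[key] = value
            self.items.move_to_end(key)
            return
        if len(self.items) >= self.max_items:
            if self.policy == "reject":
                raise StoreFull(f"capacity {self.max_items} reached")
            self.items.popitem(last=False)  # drop least recently used entry
        self.items[key] = value


cache = BoundedStore(max_items=2, policy="evict_lru")
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")          # "a" is now more recently used than "b"
cache.put("c", 3)       # evicts "b"
print(list(cache.items))  # ['a', 'c']
```

The design point is that the operator chooses the failure mode ahead of time, so behavior at the memory limit is predictable.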

That's an example of our focus on operations. If your application use pattern changes, you can't have your database go down, or go so slowly as to be nearly unusable.

Again, take my comments with a grain of salt, but with Citrusleaf you'll have better uptime, fewer servers, a far less complex installation. Sure, it's not free, but talk to us and we'll find a way to make it work for your project.

t3mp3st 14 years ago |

Disclosure: I hack on MongoDB.

I'm a little surprised to see all of the MongoDB hate in this thread.

There seems to be quite a bit of misinformation out there: lots of folks seem focused on the global R/W lock and how it must lead to lousy performance.

In practice, the global R/W lock isn't optimal -- but it's really not a big deal.

First, MongoDB is designed to be run on a machine with sufficient primary memory to hold the working set. In this case, writes finish extremely quickly and therefore lock contention is quite low. Optimizing for this data pattern is a fundamental design decision.

Second, long running operations (i.e., just before a pageout) cause the MongoDB kernel to yield. This prevents slow operations from screwing the pooch, so to speak. Not perfect, but smooths over many problematic cases.

Third, the MongoDB developer community is EXTREMELY passionate about the project. Fine-grained locking and concurrency are areas of active development. The allegation that features or patches are withheld from the broader community is total bunk; the team at 10gen is dedicated, community-focused, and honest. Take a look at the Google Group, JIRA, or disqus if you don't believe me: "free" tickets and questions get resolved very, very quickly.

Other criticisms of MongoDB concerning in-place updates and durability are worth looking at a bit more closely. MongoDB is designed to scale very well for applications where a single master (and/or sharding) makes sense. Thus, the "idiomatic" way of achieving durability in MongoDB is through replication -- journaling comes at a cost that can, in a properly replicated environment, be safely factored out. This is merely a design decision.

Next, in-place updates allow for extremely fast writes provided a correctly designed schema and an aversion to document-growing updates (e.g., $push). If you meet these requirements, or select an appropriate padding factor, you'll enjoy high performance without having to garbage collect old versions of data or store more data than you need. Again, this is a design decision.
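To make the padding-factor trade-off above concrete, here is a toy model. It is not MongoDB's storage engine: `PaddedRecordStore` and `simulate_growth` are hypothetical, and the only assumption modeled is that a record is allocated `size * padding_factor` bytes, updates that fit are done in place, and updates that do not fit move the record and leave its old slot behind as dead space.

```python
import math


class PaddedRecordStore:
    """Toy allocator: records get size * padding_factor bytes of room."""

    def __init__(self, padding_factor=1.0):
        self.padding_factor = padding_factor
        self.slots = {}      # key -> (allocated_bytes, used_bytes)
        self.moves = 0       # updates that could not be done in place
        self.dead_bytes = 0  # space left behind by moved records

    def insert(self, key, size):
        allocated = math.ceil(size * self.padding_factor)
        self.slots[key] = (allocated, size)

    def update(self, key, new_size):
        allocated, _ = self.slots[key]
        if new_size <= allocated:
            self.slots[key] = (allocated, new_size)  # fits: in-place update
        else:
            self.dead_bytes += allocated  # old slot becomes a hole
            self.moves += 1
            self.insert(key, new_size)    # reallocate with fresh padding


def simulate_growth(padding_factor, start=100, step=10, updates=10):
    """Grow one document by `step` bytes per update, as repeated pushes would."""
    store = PaddedRecordStore(padding_factor)
    store.insert("doc", start)
    for i in range(1, updates + 1):
        store.update("doc", start + i * step)
    return store.moves, store.dead_bytes


print(simulate_growth(1.0))  # (10, 1450): every growing update moves the record
print(simulate_growth(1.5))  # (1, 150): padding absorbs most of the growth
```

The dead space in the unpadded case is also a small-scale picture of why datasets with many growing updates can balloon on disk until they are compacted, as discussed elsewhere in this thread.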

Finally, it is worth stressing the convenience and flexibility of a schemaless document-oriented datastore. Migrations are greatly simplified and generic models (e.g., product or profile) no longer require a zillion joins. In many regards, working with a schemaless store is a lot like working with an interpreted language: you don't have to mess with "compilation" and you enjoy a bit more flexibility (though you'll need to be more careful at runtime). It's worth noting that MongoDB provides support for dynamic querying of this schemaless data -- you're free to ask whatever you like, indices be damned. Many other schemaless stores do not provide this functionality.

Regardless of the above, if you're looking to scale writes and can tolerate data conflicts (due to outages or network partitions), you might be better served by Cassandra, CouchDB, or another master-master/NoSQL/fill-in-the-blank datastore. It's really up to the developer to select the right tool for the job and to use that tool the way it's designed to be used.

I've written a bit more than I intended to but I hope that what I've said has added to the discussion. MongoDB is a neat piece of software that's really useful for a particular set of applications. Does it always work perfectly? No. Is it the best for everything? Not at all. Do the developers care? You better believe they do.

nomoremongo 14 years ago |

I'd appreciate if someone would submit this story for me.

http://pastebin.com/raw.php?i=FD3xe6Jt

patrickod 14 years ago |

Wow there's a lot of Mongo hate in this thread all from one article. Yesterday MongoDB was the darling of HN and today it has to be defended from ridiculous claims. Why the mob attitude? Have you all had these issues?

vegai 14 years ago |

All the commercial DBs have similar issues. Just deal with them and go on.

dextorious 14 years ago | |

No, they do not. Some joke DBs had some issues back in the day (MySQL comes to mind) but issues of such importance were solved looong ago.

vegai 14 years ago | | |

No, all of them had, and most still do. People don't seem to realize how freaking old and complicated those things are.

plasma 14 years ago |

Ravendb (www.ravendb.net) is a solid competitor.

icey 14 years ago | |

"Raven is an Open Source (with a commercial option) document database for the .NET/Windows platform."

I'm not sure it's a competitor at all. RavenDB is a CouchDB clone for .Net that requires a commercial license for proprietary software.

latch 14 years ago | | |

Which has a ton of magic baked into the driver making it unlikely you'll get your data back out via anything but .NET.

rkalla 14 years ago | | |

Well put.

plasma 14 years ago | |

Why the downvotes? Why would I bother mentioning alternatives next time, sheesh.

amalag 14 years ago |

If your data is easily modeled relationally, go relational. If it is going to change constantly and is not a natural fit for a relational model, MongoDB is worth a shot.

From this article, sounds like their data is pretty seriously relational.

Mongodb has been pushing the ops side of their product, but I can agree it has failings there. To me the advantage is the querying and the json style documents.

gmcquillan 14 years ago | |

I'm not sure you read the article fully, because relationships were never described in the article. Instead, it was high read/update load which caused problems.

Mongo, on paper, should be an ideal candidate for this job; but, due to complications with the locking model and with its inability to do online compactions, it's failing.

amalag 14 years ago | | |

Relation was a bad word choice, I meant easily modeled by a relational database system. Seems like your data can be modeled with fixed columns.

I had to model data with umpteen crazy relationships so we went with Mongodb. We did not have the high update issue or any locking issues. If one has a few large tables with fixed columns that can easily define the data, then relational DBs probably make more sense. But to your point, 10gen will not tell you that and the hype doesn't tell you that either.