A Decade of Dynamo

A Decade of Dynamo (allthingsdistributed.com)

107 points by werner 8 years ago | 54 comments

rm999 8 years ago |

DynamoDB is amazing for the right applications if you very carefully understand its limitations.

Last year I built out a recommendation engine for my company; it worked well, but we wanted to make it real-time (a user would get recommendations from actions they made seconds ago, instead of hours or days ago). I planned a 4-6 week project to implement this and put it into production. Long story short: I learned about DynamoDB and built it out in a day of dev time (start to finish, including learning the ins and outs of DynamoDB). The whole project was in stable production within a week. There has been zero down time, the app has seamlessly scaled up ~10x with consistently low latency, and it all costs virtually (relatively) nothing.

eropple 8 years ago | |

This is the good side of Dynamo, and it's awesome that you've had that experience.

The flip side: Dynamo gets expensive and it gets expensive quick, and being a custom API (and, indeed, a very different way to think about datastores) makes migration difficult.

It's great to use, if you understand the tradeoffs. Just make sure you understand them before you make the leap.

rm999 8 years ago | | |

>Dynamo gets expensive and it gets expensive quick,

DynamoDB's pricing scales sublinearly with volume; if it starts getting expensive it was an initial misuse of DynamoDB that got obvious with scale. There are a lot of factors that go into whether you should use DynamoDB and how you implement it. I recommend anyone who is considering using it very carefully understand this page first: http://docs.aws.amazon.com/amazondynamodb/latest/developergu...

freedomben 8 years ago | | |

On the flip side tho, Dynamo can be cheap for small apps. The cost to provision an RDS instance with backup can easily be 10x to 20x what the same load would take on Dynamo.

Of course as you scale this can be less true, so it's all in the application.

simonebrunozzi 8 years ago |

I was at AWS from 2008 to 2016. Werner Vogels, Amazon's CTO (yep, not just AWS', but Amazon's, as he had to point out numerous times) has been one of the most talented, humble and generous senior execs I've ever met in my life.

Lots of good memories of time spent with him, and one of the sad aspects for me of leaving Amazon.

His blog writings are really interesting. If you haven't already, I suggest you search the archives, there are several hidden gems there.

sneak 8 years ago | |

I have always tremendously admired that guy; what a job he has! He is literally responsible for about half of the shit on the internet not going down (and also running the biggest internet shopping mall and making sure a bunch of crappy speakers can talk, but both of those are straightforward by comparison IMO).

Can you even imagine? I want to know what his direct report structure looks like.

iliveinseattle 8 years ago | | |

He is an individual contributor

jchw 8 years ago |

I'm more interested in solutions like Spanner and Cockroach. Different tradeoffs for different applications, but they seem to be the most general purpose of the highly scalable databases. DynamoDB is cool and I've tried to adopt it for things, but it's surprisingly hard to imagine an application where the model isn't somewhat limiting. The capacity provisioning is also quite painful, which doesn't help matters any.

gelatocar 8 years ago |

I'd be interested to hear how others are handling read/write capacity configuration for dynamo. It seems like it would be very easy to hit the account limit of 10,000 units once you are querying any significant amount of data. I've also run into issues with auto scaling where you have to endure up to 15 minutes of downtime before the scaling kicks in [0]. Even on a table with ~2000 items I've found it becomes quite slow and costly to fetch data. Also the 25 item limit on batch writes makes it pretty frustrating to edit/delete lots of data.

- [0] https://hackernoon.com/the-problems-with-dynamodb-auto-scali...
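For the 25-item batch limit mentioned above, the usual workaround is to split the work into chunks client-side. Here is a minimal sketch in plain Python; `chunked`, `delete_all` and the `send_batch` callback are hypothetical names for illustration, not part of any AWS SDK. A real batch call can also hand back unprocessed items that need a retry, which this sketch does not handle.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")

BATCH_LIMIT = 25  # per-call item limit on batch writes, as cited in the comment


def chunked(items: Iterable[T], size: int = BATCH_LIMIT) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly short, batch
        yield batch


def delete_all(keys: Iterable[T], send_batch: Callable[[List[T]], None]) -> int:
    """Call `send_batch` once per chunk and return the number of calls made.

    `send_batch` is a placeholder for whatever actually issues the batch
    request (hypothetical; supply your own).
    """
    calls = 0
    for batch in chunked(keys):
        send_batch(batch)
        calls += 1
    return calls
```

Deleting 60 keys this way takes three calls (25, 25 and 10 items), so the limit costs round trips rather than correctness.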

ryanworl 8 years ago | |

You can request that limit be increased through the limit increase form.

Also, if you need to scan a ton of items to assemble your desired result, you should re-think using DynamoDB as a whole.
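A toy model may help show why scanning is the warning sign here. This is an in-memory sketch, not DynamoDB's API: `ToyTable` and its methods are made up for illustration, and it assumes items are grouped by a partition key so a keyed query only touches one group while a scan touches everything.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple

Item = Dict[str, Any]


class ToyTable:
    """Hypothetical key-partitioned store that counts items examined."""

    def __init__(self) -> None:
        self._partitions: Dict[str, List[Item]] = defaultdict(list)

    def put(self, partition_key: str, item: Item) -> None:
        self._partitions[partition_key].append(item)

    def query(self, partition_key: str) -> Tuple[List[Item], int]:
        """Read one partition; work is proportional to the result size."""
        items = list(self._partitions.get(partition_key, []))
        return items, len(items)

    def scan(self, predicate: Callable[[Item], bool]) -> Tuple[List[Item], int]:
        """Read every item and filter; work is proportional to table size."""
        examined = 0
        matches: List[Item] = []
        for items in self._partitions.values():
            for item in items:
                examined += 1
                if predicate(item):
                    matches.append(item)
        return matches, examined
```

With 100 items spread over 10 keys, `query` examines 10 items to return 10, while `scan` examines all 100 to return the same 10. If the access pattern can only be served by the second form, the data model probably does not fit.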

peterwwillis 8 years ago |

When they mention companies using DynamoDB, at least one of those actually uses their own implementation of Dynamo that they wrote to work around cost and performance limitations.

The main problems faced are not the ability to scale or reach performance benchmarks or keep data safe. They are operational, and primarily problems of infrastructure complexity and management. Oh, and having developers architect and manage the operations of a really freaking huge service is a bad idea. (No offense intended - those developers don't want to be woken up in the middle of the night either)

netvarun 8 years ago |

Direct link to the Dynamo Whitepaper PDF: http://www.allthingsdistributed.com/files/amazon-dynamo-sosp...

fiokoden 8 years ago |

I don't know why amazon is so taken with Dynamodb. I find it to be incredibly unintuitive and lacking real world application, requiring applications to perform gymnastics to work with it.

freedomben 8 years ago | |

I've found just the opposite actually. While it's far from perfect, it has been amazing for rapidly standing up new apps (especially prototypes). We've used quite a few different strategies and found it to be flexible and performant.

The only downside is we do find ourselves sometimes implementing relational DB functionality at the application level to compensate for Dynamo DB's "flexibility." Postgres is still the go-to for data that is relational in nature. But man, letting Amazon worry about hosting and scaling is also pretty awesome...
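To make "relational DB functionality at the application level" concrete, here is a rough sketch of a join done in application code over two key-value collections. The data, field names and `orders_with_user_names` are all invented for illustration; a real app would replace the dict lookups with reads against its store.

```python
from typing import Any, Dict, List

# Hypothetical data: a "users" table keyed by id, and a list of orders.
users: Dict[str, Dict[str, Any]] = {
    "u1": {"name": "Ada"},
    "u2": {"name": "Lin"},
}
orders: List[Dict[str, Any]] = [
    {"order_id": "o1", "user_id": "u1", "total": 30},
    {"order_id": "o2", "user_id": "u2", "total": 12},
    {"order_id": "o3", "user_id": "u1", "total": 5},
    {"order_id": "o4", "user_id": "u9", "total": 7},  # no matching user
]


def orders_with_user_names(
    orders: List[Dict[str, Any]], users: Dict[str, Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Application-level inner join: one key lookup per order."""
    joined = []
    for order in orders:
        user = users.get(order["user_id"])
        if user is None:
            continue  # inner-join semantics: drop orders without a user
        joined.append({**order, "user_name": user["name"]})
    return joined
```

What a SQL database does in a single `JOIN` becomes a loop, a lookup per row, and a decision about missing keys that the application now owns.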

fiokoden 8 years ago | | |

>> we do find ourselves sometimes implementing relational DB functionality at the application level to compensate for Dynamo DB's "flexibility."

Yep, this is A-grade crazy, and exactly my point. I would question if it's "sometimes", or "actually almost all the time, now that we think about it, there's not much that we CAN do with DynamoDB without writing application level database functionality."

kanwisher 8 years ago | | |

Prototypes are far better suited to Postgres or MySQL on RDS. When you don't know your schema or your use case upfront, traditional databases are far easier to work with, since you can change them. Once you know what you're doing, scaling up works far better on something like Dynamo or Cassandra, but you will be sacrificing dev time.

pritambarhate 8 years ago |

Now that autoscaling is available for DynamoDB, my main complaint with DynamoDB is the lack of an out-of-the-box backup solution that works at scale.

A production DB without backups is unthinkable. It just takes one human mistake to erase tons of data. Consistent and regular backups are a must-have for any production system.

deepsun 8 years ago |

> The Dynamo paper was well-received and served as a catalyst to create the category of distributed database technologies commonly known today as "NoSQL."

No, sorry, it was Memcached and the Bigtable paper that popularized the "NoSQL" term. Although there were many NoSQL databases tracing way back to the 60s [1], those were the ones that "served as a catalyst" for the term "NoSQL".

[1] http://blog.knuthaugen.no/2010/03/a-brief-history-of-nosql.h...

pavlov 8 years ago | |

Dynamo was certainly one of the products that spiked interest in the “NoSQL” datastore category.

The phrasing “served as a catalyst” seems right — it doesn’t imply the only catalyst.

mankash666 8 years ago |

Congrats to AWS on the impact DynamoDB has had on the ecosystem & industry. The article does make it seem like DynamoDB was the first to publish a unique NoSQL architecture. Is this true?

fiokoden 8 years ago | |

Lotus Notes was first. Distributed, replicated key value store.

Actually I'm not right: http://blog.knuthaugen.no/2010/03/a-brief-history-of-nosql.h...

neoeldex 8 years ago | |

No, CouchDB is older than DynamoDB, and according to Wikipedia there have been NoSQL databases since the 60s.

jbergens 8 years ago |

It would be interesting to read some comparisons between DynamoDB, CosmosDB and maybe Spanner.

jaxondu 8 years ago |

Need a library sdk for a Dynamo Sync feature to allow easy development of offline mobile apps. Similar function to Cognito Sync. Also hope that AWS will release a serverless SQL db. And cheaper price.

SteveNuts 8 years ago | |

> Also hope that AWS will release a serverless SQL db.

You mean like RDS?

fiokoden 8 years ago | | |

RDS is serverful not serverless.

rdiddly 8 years ago |

Oh that Dynamo. Not this Dynamo: http://dynamobim.org/

gt_ 8 years ago |

What is the machine in the photo?

monkmartinez 8 years ago | |

A dynamo... ;)

ddou 8 years ago |

amazing product

sheeshkebab 8 years ago |

The thing doesn't even support a useable cross region replication. On top of that the whole read/write capacity is a joke (a painful one at that).

Other than a dirty js config or a prototype store this db is useless.

eropple 8 years ago | |

I would be very, very careful of calling anything a company of such very sharp people does a "joke."

One of my prior gigs was pushing a billion data points a day through DynamoDB without it breaking a sweat. We were paying for it, too--but it was there and it worked.

dx034 8 years ago | | |

While I don't think Dynamo is a joke, pushing 1bn data points per day through a database system is not much. Given you have enough storage, a postgresql instance on a $50/month dedicated server can achieve that easily (from experience). You have to pay more attention to the data structure (unless you just put everything in jsonb columns) but will probably save 90% on operational costs.
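For scale, the "1bn data points per day" figure can be turned into an average rate with simple arithmetic. The sketch below assumes perfectly even traffic, which real workloads rarely have, so peak rates would be higher; `average_per_second` is just an illustrative helper.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_per_second(events_per_day: int) -> float:
    """Average event rate, assuming traffic is spread evenly over the day."""
    return events_per_day / SECONDS_PER_DAY


rate = average_per_second(1_000_000_000)
print(f"{rate:,.0f} writes/s on average")  # prints "11,574 writes/s on average"
```

So the workload in question averages roughly 11.6k writes per second before accounting for bursts.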

sheeshkebab 8 years ago | | |

Anything that can go to dynamo, can go to s3, especially at that volume. And you get proper multi region replication, read/write capacity based on actual usage and instant scaling.

I stand by my comment that DynamoDB is a joke wrapped in a thick layer of marketing crap.

freedomben 8 years ago | |

I don't know if your comment was just a troll or not, but we've built serious apps that use DynamoDB as our store. It's been exactly what we needed, and we've got millions and millions of records across quite a few tables. There are certainly pros and cons to each database, but Dynamo can really shine if you take the time to understand it.