zBase – A high-performance, elastic, distributed key-value store (code.zynga.com)
One of the main advantages of zBase / Membase / Couchbase compared to many other NoSQL data stores is their strict consistency model (compared to eventual consistency in Riak, for example).
This means that with a zBase/Membase/Couchbase cluster, if client A writes a new value to a key in the cluster and client B reads from the same key immediately afterwards, client B is guaranteed to (immediately) see what client A wrote.
In eventual consistency models, by contrast, client B might see the old value in that case, because the change might not have propagated to all servers in the cluster yet. If client B read again a few minutes later, it might then see the new value from A.
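To make the difference concrete, here is a toy sketch in Python. It is not how zBase or Riak are implemented; `StrictStore`, `EventualStore` and `propagate` are made-up names, and the "replicas" are just in-process dicts. It only illustrates why a read right after a write can return stale data when writes are propagated later.

```python
class StrictStore:
    """Toy model: each key has one authoritative copy that serves reads and writes."""

    def __init__(self):
        self._data = {}

    def write(self, key, value):
        self._data[key] = value

    def read(self, key):
        return self._data.get(key)


class EventualStore:
    """Toy model: a write lands on replica 0 and reaches the others only on propagate()."""

    def __init__(self, replicas=2):
        self._replicas = [{} for _ in range(replicas)]
        self._pending = []  # writes not yet copied to the other replicas

    def write(self, key, value):
        self._replicas[0][key] = value
        self._pending.append((key, value))

    def read(self, key, replica):
        # The client may be routed to any replica, including a stale one.
        return self._replicas[replica].get(key)

    def propagate(self):
        # Stand-in for background replication catching up.
        for key, value in self._pending:
            for replica in self._replicas[1:]:
                replica[key] = value
        self._pending.clear()
```

With `StrictStore`, client B always reads what client A just wrote. With `EventualStore`, a read from replica 1 returns the old value until `propagate()` has run.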
Strict consistency is required in a lot of applications such as game servers (which is why Zynga needs it).
For use cases that are more read-heavy, like holding the contents of a news website, eventual consistency is good enough.
- LRU based or random eviction based cache management.
- Support for multiple disks and thereby IO parallelism.
- Incremental backup and restore (you can pack 5x to 10x the size of RAM in zBase and use incremental backups for node failover).
- Incremental backup helps to offer blob-level restore at hourly, daily and weekly granularities.
- Cluster manager: zBase partitions the entire data set into virtual buckets (vbuckets), and servers act as containers that hold these vbuckets. This provides a scalable way to increase or decrease the number of servers in a cluster.
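The vbucket idea in the last bullet can be sketched roughly like this in Python. The vbucket count, the CRC32 hash and the function names (`vbucket_for`, `build_map`, `add_server`) are assumptions for illustration, not taken from the zBase source. The point is that a key always hashes to the same vbucket, and only the vbucket-to-server map changes when the cluster grows.

```python
import zlib
from collections import Counter

NUM_VBUCKETS = 1024  # assumed value, chosen only for illustration


def vbucket_for(key):
    """Hash a key to a fixed vbucket id; this never changes when servers change."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def build_map(servers):
    """Initial vbucket -> server assignment, spread round-robin."""
    return [servers[vb % len(servers)] for vb in range(NUM_VBUCKETS)]


def server_for(key, vbucket_map):
    """Look up which server currently holds the key's vbucket."""
    return vbucket_map[vbucket_for(key)]


def add_server(vbucket_map, new_server):
    """Hand the new server a fair share of vbuckets, taken from over-full servers."""
    servers = set(vbucket_map) | {new_server}
    target = len(vbucket_map) // len(servers)
    counts = Counter(vbucket_map)
    new_map = list(vbucket_map)
    moved = 0
    for vb, owner in enumerate(vbucket_map):
        if moved >= target:
            break
        if counts[owner] > target:
            new_map[vb] = new_server
            counts[owner] -= 1
            moved += 1
    return new_map
```

Going from two servers to three with this sketch reassigns about a third of the vbuckets and leaves the rest where they were, which is what makes growing or shrinking the cluster manageable.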
Think of it as memcache + disk persistence. (So rather than erasing things by purging the cache when a memory slab fills, you just evict them from memory and read from disk if they're needed again.)
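A rough sketch of that "evict from memory, read back from disk" behaviour in Python. `PersistentCache` is a made-up class, the "disk" is just a dict standing in for real storage, and real systems persist asynchronously rather than on every `set`:

```python
from collections import OrderedDict


class PersistentCache:
    """Toy model: bounded in-memory LRU cache in front of a full on-disk copy."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.memory = OrderedDict()  # least recently used entry is first
        self.disk = {}               # stand-in for the persistent store
        self.disk_reads = 0

    def set(self, key, value):
        self.disk[key] = value       # every value is kept on "disk"
        self._cache(key, value)

    def get(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)  # mark as recently used
            return self.memory[key]
        if key in self.disk:
            # Evicted earlier: fetch from disk and bring it back into memory.
            self.disk_reads += 1
            value = self.disk[key]
            self._cache(key, value)
            return value
        return None

    def _cache(self, key, value):
        self.memory[key] = value
        self.memory.move_to_end(key)
        while len(self.memory) > self.capacity:
            # Evict from memory only; the disk copy stays intact.
            self.memory.popitem(last=False)
```

A plain memcache would lose the evicted item for good; here an eviction only costs a slower read the next time the key is requested.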
zBase would have its own full copy of the data already on distributed disks, so it wouldn't need to fall through to some other database. That seems to be the entire point there, but surely you'd still need to store the data in some place where you could run ad-hoc queries on it? That means the data is duplicated in two places that need to be kept in sync. If a transaction fails on one of the data stores, don't you have inconsistent data now?
In any case most game state data wouldn't be very useful on its own, you typically want to look at event streams and histories of certain values, not snapshots of the current state.
It works out well enough if you never need ad hoc queries.
EDIT: To clarify, I know Redis, I'm interested in learning how this differs beyond its distributed nature.
Note also that Redis Sentinel http://redis.io/topics/sentinel provides high availability.
Other than the dynamic resharding part, how do zBase and RethinkDB compare to each other?
zBase is used as a highly available key-value store for writes and reads. It offers a few fancy operations like get-lock as well.