DiceDB

kiitos 1 year ago |

There are _so many_ bugs in this code.

One example among many:

https://github.com/DiceDB/dice/blob/0e241a9ca253f17b4d364cdf... defines func ExpandID, which reads from cycleMap without locking the package-global mutex; and func NextID, which writes to cycleMap under a lock of the package-global mutex. So writes are synchronized, but only between each other, and not with reads, so concurrent calls to ExpandID and NextID would race.
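For illustration, here is a minimal sketch of the usual fix: take the same mutex on the read path as on the write path. The layout of cycleMap, the counter, and the function bodies below are assumptions made for the example, not DiceDB's actual code; only the names ExpandID, NextID, and cycleMap come from the comment above.

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothetical package-level state; the real DiceDB layout may differ.
var (
	mu       sync.Mutex
	cycleMap [16]uint32 // per-slot cycle counters (assumed layout)
	next     uint32
)

// NextID hands out slots round-robin and bumps that slot's cycle counter.
// The write to cycleMap happens while holding mu.
func NextID() uint32 {
	mu.Lock()
	defer mu.Unlock()
	slot := next % uint32(len(cycleMap))
	next++
	cycleMap[slot]++
	return slot
}

// ExpandID combines a slot with its current cycle counter. It takes the
// same mu, so the read is ordered with respect to every write in NextID.
// The slot must be a value previously returned by NextID.
func ExpandID(slot uint32) uint64 {
	mu.Lock()
	defer mu.Unlock()
	return uint64(cycleMap[slot])<<32 | uint64(slot)
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			id := NextID()
			_ = ExpandID(id) // concurrent with other NextID calls, but not racing
		}()
	}
	wg.Wait()
	fmt.Println("IDs issued:", next)
}
```

Running a program like this with `go run -race` is a quick way to confirm that the detector no longer reports the read/write pair.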

This is all fine as a hobby project or whatever, but very far from any kind of production-capable system.

kiitos 1 year ago | |

https://github.com/DiceDB/dice/pull/1588

This PR attempted to fix the memory model violation I mentioned in the parent comment, but also added in an extra change that swapped the sync.Mutex to a sync.RWMutex. The PR description claimed 2 benefits: "Eliminates the data race, ensuring thread safety" -- correct! at least to some level; but also "Improves performance by allowing concurrent ExpandID calls, which is likely a common operation" -- which is totally unsubstantiated, and very likely false, as RWMutex is only faster than a regular Mutex under very narrowly-defined load patterns.
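A claim like that can be checked with a small benchmark. The sketch below is one way to do it, assuming a read-only workload against a hypothetical shared table; it is not the benchmark from the PR, and the result depends heavily on core count, read/write mix, and how long the critical section is.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

// table is a hypothetical shared structure standing in for cycleMap.
var table [16]uint32

// benchMutex measures parallel reads guarded by a plain sync.Mutex.
func benchMutex(b *testing.B) {
	var mu sync.Mutex
	b.RunParallel(func(pb *testing.PB) {
		var sink uint32
		for pb.Next() {
			mu.Lock()
			sink += table[3]
			mu.Unlock()
		}
		_ = sink
	})
}

// benchRWMutex measures the same reads guarded by RWMutex.RLock.
func benchRWMutex(b *testing.B) {
	var mu sync.RWMutex
	b.RunParallel(func(pb *testing.PB) {
		var sink uint32
		for pb.Next() {
			mu.RLock()
			sink += table[3]
			mu.RUnlock()
		}
		_ = sink
	})
}

// runBoth runs each benchmark with the testing package's default settings.
func runBoth() (mutex, rw testing.BenchmarkResult) {
	return testing.Benchmark(benchMutex), testing.Benchmark(benchRWMutex)
}

func main() {
	m, rw := runBoth()
	fmt.Println("sync.Mutex:  ", m)
	fmt.Println("sync.RWMutex:", rw)
}
```

With a critical section this short, the two numbers are often close, which is why a measurement is needed before claiming a performance win either way.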

In any case, the PR had no kind of test or benchmark to validate either of these claims, so not a great start by the author. But then a maintainer chimed in with a comment that expressed concerns about edge-condition performance details, without any kind of data or evidence, and apparently didn't care about (or know about?) the much more important fixes that the PR made re: data races.

https://github.com/DiceDB/dice/pull/1588#issuecomment-274521...

> I tried changing this, but I did not see any benefit in benchmark numbers.

No apparent understanding of the bugs in this code, nor how changes may or may not fix those bugs, nor really how performance is defined or can be meaningfully evaluated.

Again, hobby project or whatever, all good. But the authors and maintainers of this project are clearly, demonstrably, in over their heads on this one.

senderista 1 year ago | |

Haven't looked at the code, but enforcing mutual exclusion between writers but not readers can make sense for a single-writer lock-free algorithm.

ignoramous 1 year ago | | |

> single-writer lock-free algorithm

I understand the need for correct lock-free impls: Given OP's description, simply avoiding read mutexes can't be the way to go about it?

nebulous1 1 year ago | | |

I don't use Go.

https://go.dev/ref/mem

If I'm reading this correctly, they are recommending a lock in this situation. However, they are saying the implementation has two options: either raise an error reporting the race (if the implementation is told to do so), or, because the value being read is not larger than a machine word, reply to the read with a correct value from a previous write. If true, then it cannot reply with corrupted data.

kiitos 1 year ago | | |

> However, they are saying the implementation has two options: either raise an error reporting the race (if the implementation is told to do so), or, because the value being read is not larger than a machine word, reply to the read with a correct value from a previous write.

The spec says

> A read r of a memory location x holding a value that is not larger than a machine word must observe some write w such that r does not happen before w and there is no write w' such that w happens before w' and w' happens before r. That is, each read must observe a value written by a preceding or concurrent write.

These rules apply only if the value isn't larger than a machine word. Otherwise,

> Reads of memory locations larger than a single machine word ... can lead to inconsistent values not corresponding to a single write.

The size of a machine word depends on how a program is compiled, so whether or not a value is larger than a machine word isn't knowable by the program itself.

And even if you can assert that your program will only be built where a machine word is always at least of size e.g. uint64, the spec only guarantees that unsynchronized reads of a uint64 will return some previous valid write, it doesn't guarantee anything about which value is returned. So `x=1; x=3; x=2;` concurrently with `print(x); print(x); print(x)` can print `1 1 1` or `3 3 3` or `2 1 1` or `3 2 1` and so on. It won't return a corrupted uint64, but it can return any prior uint64, which is still a data race, and almost certainly useless to the application.
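As a contrast, here is a small sketch of the same `x=1; x=3; x=2` scenario with every access going through sync/atomic. The function names are made up for the example. The reader can still see any mix of values depending on timing, but because the accesses are synchronized, it can never observe them stepping backwards through the write order.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// observe runs a writer that stores 1, 3, 2 into x while a reader loads x
// three times. Every access uses sync/atomic.
func observe() [3]uint64 {
	var x uint64
	var seen [3]uint64
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		for _, v := range []uint64{1, 3, 2} {
			atomic.StoreUint64(&x, v)
		}
	}()
	go func() {
		defer wg.Done()
		for i := range seen {
			seen[i] = atomic.LoadUint64(&x)
		}
	}()
	wg.Wait()
	return seen
}

// inWriteOrder reports whether the observed values never step backwards
// through the write sequence 0 (initial value), 1, 3, 2.
func inWriteOrder(seen [3]uint64) bool {
	pos := map[uint64]int{0: 0, 1: 1, 3: 2, 2: 3}
	last := 0
	for _, v := range seen {
		p, ok := pos[v]
		if !ok || p < last {
			return false
		}
		last = p
	}
	return true
}

func main() {
	for i := 0; i < 1000; i++ {
		if seen := observe(); !inWriteOrder(seen) {
			panic(fmt.Sprint("out-of-order observation: ", seen))
		}
	}
	fmt.Println("all observations followed the write order")
}
```

Replacing the atomic calls with plain reads and writes turns this into the data race described above, and `go run -race` will flag it.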

nebulous1 1 year ago | | |

Thanks. So the structure in the OP is an array of uint32s.

> that unsynchronized reads of a uint64 will return some previous valid write, it doesn't guarantee anything about which value is returned

You're the second person saying this, so is my interpretation that this is disallowed by the part that you quoted incorrect?

> must observe some write w such that r does not happen before w and there is no write w' such that w happens before w' and w' happens before r

edit: somebody is answering this below by the way

fashion-at-cost 1 year ago | | |

The goalposts have been moved. The claim is that this pattern isn’t suitable for production code. The ground truth is that a compliant Go implementation may elect to: crash; read the first value ever set to the variable for the entire lifetime of the program; or behave completely as you’d expect from a single core interleaved execution order. The first is an opt-in, the latter two are up to the whims of the runtime and an implementation may alternate between them at any point.

Is that the kind of uncertainty you want in your production systems? Or is your only requirement that they don’t serve “corrupt” data?

Don’t be “clever”. Use locks.

nebulous1 1 year ago | | |

I don't disagree, but that's not the claim I was replying to. The question I was asking about was

> I understand the need for correct lock-free impls: Given OP's description, simply avoiding read mutexes can't be the way to go about it?

I did note that the documentation recommends a lock.

> read the first value ever set to the variable for the entire lifetime of the program

That is not my reading of the current memory model? It seems to specifically prohibit this behaviour in requirement 3:

> 2. w does not happen before any other write w' (to x) that happens before r.

kiitos 1 year ago | | |

Yep. And even if you were to lock down the implementation of the compiler, the version of Go you're using, the specific set of hardware and OS that you build on and deploy to, and so on -- that still doesn't indemnify you against arbitrary or unexpected behavior, if your code violates the memory model!

senderista 1 year ago | | |

Oh, so cycleMap is a non-threadsafe structure? I don't know golang so I didn't realize this.

kiitos 1 year ago | | |

Nothing in Go is thread-safe, unless explicitly documented otherwise. Some examples of explicitly-documented-otherwise stuff are in package sync and package sync/atomic.

cycleMap is definitely not thread-safe. The authors knew this, to some extent, because they synchronized writes via an adjacent mutex. But they didn't synchronize reads thru the same mutex, which is the issue.

senderista 1 year ago | | |

OK, this doesn't inspire confidence then.

deazy 1 year ago |

Looking at the DiceDB code base, I have a few questions regarding its design. I'm asking to understand the project's goals and design rationale. Anyone, feel free to help me understand this.

I could be wrong, but the primary in-memory storage appears to be a standard Go map with locking. Is this a temporary choice for iterative development, and is there a longer-term plan to adopt a more optimized or custom data structure?

I find DiceDB's reactivity mechanism very intriguing, particularly the "re-execution" of the entire watch command (i.e. re-running GET.WATCH mykey on key modification).

From what I understand, the Eval func executes client-side commands. This seems to be laying the foundation for more complex watch commands that can be evaluated before sending notifications to clients.
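To make the "re-execution" idea concrete, here is a toy sketch of a store that re-runs a registered command whenever its key changes and pushes the fresh result to the watcher. Store, GetWatch, and the drop-when-full delivery policy are all assumptions made for the example; this is not DiceDB's implementation or API.

```go
package main

import (
	"fmt"
	"sync"
)

// watcher pairs a command to re-run with the channel its result goes to.
type watcher struct {
	cmd func(s *Store) string
	out chan<- string
}

// Store is a toy key-value store with watch support.
type Store struct {
	mu       sync.Mutex
	data     map[string]string
	watchers map[string][]watcher
}

func NewStore() *Store {
	return &Store{data: map[string]string{}, watchers: map[string][]watcher{}}
}

// get reads a key; callers must already hold s.mu.
func (s *Store) get(key string) string { return s.data[key] }

// GetWatch registers a "GET key" command that is re-executed on every Set
// to that key. The returned channel receives each fresh result.
func (s *Store) GetWatch(key string) <-chan string {
	ch := make(chan string, 16)
	s.mu.Lock()
	defer s.mu.Unlock()
	s.watchers[key] = append(s.watchers[key], watcher{
		cmd: func(s *Store) string { return s.get(key) },
		out: ch,
	})
	return ch
}

// Set stores the value, then re-runs every command watching the key and
// pushes the result. If a watcher's buffer is full the update is dropped,
// which is a simplification chosen for this sketch.
func (s *Store) Set(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = value
	for _, w := range s.watchers[key] {
		select {
		case w.out <- w.cmd(s):
		default:
		}
	}
}

func main() {
	s := NewStore()
	updates := s.GetWatch("mykey")
	s.Set("mykey", "v1")
	s.Set("mykey", "v2")
	fmt.Println(<-updates, <-updates)
}
```

The cost concern raised below is visible here: every Set does work proportional to the number of watchers and to the cost of each watched command, whereas a plain change notification would only send the key name.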

But I have the following question.

What is the primary motivation behind re-executing the entire command, as opposed to simply notifying clients of a key change (as in Redis Pub/Sub or streams)? Is the intent to simplify client-side logic by handling complex key dependencies on the server?

Given that re-execution seems computationally expensive, especially with multiple watchers or more complex (hypothetical) watch commands, how are potential performance bottlenecks addressed?

How does this "re-execution" approach compare in terms of scalability and consistency to more established methods like server-side logic (e.g., Lua scripts in Redis) or change data capture (CDC)?

Are there plans to support more complex watch commands beyond GET.WATCH (e.g. JSON.GET.WATCH), and how would re-execution scale in those cases?

I'm curious about the trade-offs considered in choosing this design and how it aligns with the project's overall goals. Any insights into these design decisions would help me understand its use-cases.

Thanks

bdcravens 1 year ago |

Is there a single sentence anywhere that describes what it actually is?

schmookeeg 1 year ago |

Using an instrument of chance to name a data store technology is pretty amusing to me.

bufferoverflow 1 year ago | |

No chance if we live in a deterministic universe.

dkh 1 year ago | |

This is essentially what all in-memory data stores have always been

Kinda refreshing to see someone own it and run with it

cozzyd 1 year ago |

DiceDB sounds like the name of a joke database that returns random results.

BoorishBears 1 year ago | |

No it doesn't.

graynk 1 year ago | | |

Yes it does.

Seems we're in a stalemate, where do we go from here?

BoorishBears 1 year ago | | |

OP continues ignoring static from the people who jump to shoddy conclusions.

kreddor 1 year ago | | |

It was my first thought as well, before reading the landing page.

BoorishBears 1 year ago | | |

Yeah, and I'm sure someone clicked it thinking it was a DB for EA's Dice Studios.

If you expose something to enough people you'll get some unreasonable takes and interpretations of it. It's important to ignore them.

graynk 1 year ago | | |

> If you expose something to enough people you'll get some unreasonable takes and interpretations of it. It's important to ignore them.

Quite literally the main function of dice is to give you random numbers. Looking over the website and readme I could not surmise why they would call it DiceDB except for "it sounds nice", but it's absolutely not unreasonable to look at the name and have a thought "it's probably a joke project about random results".

BoorishBears 1 year ago | | |

There are literal mountains of software named for no particular reason (let alone sounding nice), or named by origins no person would ever infer without digging in deeper.

Reasonable people realize this and won't discard a project as a joke because of such a tenuous connection, and the fact they've gotten traction is a testament to that.

weekendcode 1 year ago |

From the benchmarks on 4 vCPU and num_clients=4, the numbers don't look much different.

Reactive looks promising, but it doesn't look very useful in the real world for a cache. For example, if a client subscribes to something and the machine goes down, what happens to reactivity?

alexey-salmin 1 year ago |

  | Metric               | DiceDB   | Redis    |
  | -------------------- | -------- | -------- |
  | Throughput (ops/sec) | 15655    | 12267    |
  | GET p50 (ms)         | 0.227327 | 0.270335 |
  | GET p90 (ms)         | 0.337919 | 0.329727 |
  | SET p50 (ms)         | 0.230399 | 0.272383 |
  | SET p90 (ms)         | 0.339967 | 0.331775 |

UPD Nevermind, I didn't have my eyes open. Sorry for the confusion.

Something I still fail to understand is where you can actually spend 20ms while answering a GET request in a RAM keyvalue storage (unless you implement it in Java).

I never gained much experience with existing opensource implementations, but when I was building proprietary solutions at my previous workplace, the in-memory response time was measured in tens-hundreds of microseconds. The lower bound of latency is mostly defined by syscalls so using io_uring should in theory result in even better timings, even though I never got to try it in production.

If you read from NVMe AND also do the erasure-recovery across 6 nodes (lrc-12-2-2), then yes, you get into tens of milliseconds. But seeing these numbers for a single-node RAM DB just doesn't make sense, and I'm surprised everyone treats them as normal.

Does anyone have experience with low-latency, high-throughput open-source key-value stores? Any specific implementation to recommend?

davekeck 1 year ago | |

> Something I still fail to understand is where you can actually spend 20ms

Aren’t these numbers .2 ms, ie 200 microseconds?

ajnin 1 year ago | |

I had the same reaction as you. And that's for 4 simultaneous clients, too, for a single client you get 3159 ops/s (from https://dicedb.io/benchmarks/). I'm not too familiar with in-memory databases in general but I would have expected figures in the millions on modern hardware. Makes me feel there's some hidden bottleneck somewhere and the benchmarks are not purely measuring the performance of the software.
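One way to read those figures, assuming the benchmark is closed-loop (each client waits for a reply before sending its next request, which is an assumption about the tool and not something stated on the page): throughput is then simply clients divided by round-trip time. A small sketch of that arithmetic, using the numbers quoted in this thread:

```go
package main

import (
	"fmt"
	"time"
)

// perRequest returns the average time one client spends per request in a
// closed-loop benchmark: clients / throughput.
func perRequest(clients int, opsPerSec float64) time.Duration {
	return time.Duration(float64(clients) / opsPerSec * float64(time.Second))
}

func main() {
	// Figures quoted above: 15655 ops/s with 4 clients, 3159 ops/s with 1.
	fmt.Println("4 clients, 15655 ops/s:", perRequest(4, 15655))
	fmt.Println("1 client,   3159 ops/s:", perRequest(1, 3159))
}
```

This works out to roughly 256 µs and 317 µs per request, in the same range as the reported p50 latencies. Under that assumption the ops/sec figure mostly reflects round-trip latency with very few connections in flight, and says little about what the server could sustain with many clients or pipelining.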

OutOfHere 1 year ago |

In-memory caches (lacking persistence) shouldn't be called a database. It's not totally incorrect, but it's an abuse of terminology. Why is a Python dictionary not an in-memory key-value database?

ac130kz 1 year ago |

Any reason to use this over Valkey, which is now faster than Redis and community driven? Genuinely interested.

hp77 1 year ago | |

DragonflyDB is also in that race, isn't it?

ac130kz 1 year ago | | |

From what I looked at in the past, they seem better on paper by comparing themselves to a very old version of Redis in a rigged scenario (no clustering or multithreading applied despite Dragonfly getting multithreading enabled), and they are a lot worse in terms of code updates. Maybe that's different today, but I'm more keen on using Valkey.

hp77 1 year ago | | |

Does Redis support multithreading? Doesn't it use a single-threaded event loop, while DragonflyDB's basic version has multithreading enabled and a shared-nothing architecture? Also, I found this latest comparison between Valkey and DragonflyDB: https://www.dragonflydb.io/blog/dragonfly-vs-valkey-benchmar...

romange 1 year ago | | |

Valkey/Redis support offloading of io processing to special I/O threads.

Their goal is to unload the "main" thread from performing i/o related tasks like socket reading and parsing, so it could only spend its precious time on datastore operations. This creates an asymmetrical architecture with I/O threads scaling to any number of CPUs, but the main thread is the only one that touches the hashtable and its entries. It helps a lot in cases where datastore operations are relatively lightweight, like SET/GET with short string values, but its impact will be insignificant for CPU heavy operations like lua EVALs, sorted sets, lists, MGET/MSET etc.
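A rough sketch of that asymmetric shape, written in Go purely for illustration (Valkey and Redis are C programs with their own threading code, and the line-based wire format and function names here are invented): several I/O goroutines do the parsing, and a single owner goroutine is the only one that touches the table.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// request is a parsed command plus a channel for its reply.
type request struct {
	op, key, value string
	reply          chan string
}

// parse stands in for the socket reading and protocol parsing that the
// I/O workers take off the main loop.
func parse(line string) request {
	f := strings.Fields(line)
	r := request{reply: make(chan string, 1)}
	if len(f) > 0 {
		r.op = strings.ToUpper(f[0])
	}
	if len(f) > 1 {
		r.key = f[1]
	}
	if len(f) > 2 {
		r.value = f[2]
	}
	return r
}

// mainLoop is the only goroutine that touches the table, so it needs no lock.
func mainLoop(reqs <-chan request) {
	table := map[string]string{}
	for r := range reqs {
		switch r.op {
		case "SET":
			table[r.key] = r.value
			r.reply <- "OK"
		case "GET":
			r.reply <- table[r.key]
		default:
			r.reply <- "ERR"
		}
	}
}

// serve parses lines on `workers` I/O goroutines and funnels the parsed
// requests to one mainLoop. Replies are returned by input index.
func serve(lines []string, workers int) []string {
	reqs := make(chan request)
	go mainLoop(reqs)
	replies := make([]string, len(lines))
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for i := w; i < len(lines); i += workers {
				r := parse(lines[i])
				reqs <- r
				replies[i] = <-r.reply
			}
		}(w)
	}
	wg.Wait()
	close(reqs)
	return replies
}

func main() {
	fmt.Println(serve([]string{"SET a 1", "GET a"}, 1))
}
```

Adding workers speeds up parse, but every command still queues behind the single mainLoop, which matches the point above: the gain is large when the datastore step is cheap and small when it dominates.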

ac130kz 1 year ago | | |

I/O multithreading is still not fully there. There were significant improvements within the first couple of iterations, and hopefully it will improve further. I see that Dragonfly uses io_uring, which is not recommended by Google due to security vulnerabilities.

romange 1 year ago | | |

Dragonfly supports both epoll and io_uring, and the choice of polling engine is quite orthogonal to its shared-nothing architecture. I do not think that Valkey or Redis will become fully multi-threaded any time soon, as such a change would require building something like Dragonfly (or using locks, which historically were a big NO for Redis).

(Author of Dragonfly here)

hp77 1 year ago | | |

I read that Google is limiting the use of io_uring, but I have seen io_uring being used in other databases; TigerBeetle is another DB which uses io_uring.

losvedir 1 year ago |

I didn't see it in the docs, but I'd want to know the delivery semantics of the pubsub before using this in production. I assume best effort / at most once? Any retries? In what scenarios will the messages be delivered or fail to be delivered?

remram 1 year ago |

This seems orders of magnitude slower than Nubmq which was posted yesterday: https://news.ycombinator.com/item?id=43371097

arpitbbhayani 1 year ago | |

Different tool. The metrics I am optimizing for are different, hence I wrote a separate utility. It may not be the most optimized one, but I am using it to measure all things DiceDB and will be using it to optimize DiceDB further.

ref: https://github.com/DiceDB/membench

huntaub 1 year ago |

What are some example use cases where having the ability for the database to push updates to an application would be helpful (vs. the traditional polling approach)?

zupa-hu 1 year ago | |

One example is when you want to display live data on a website. Could be a dashboard, a chat, or really the whole site. Polling is both slower and more resource hungry.

If it is built into your language/framework, you can completely ignore the problem of updating the client, as it happens automatically.

Hope that makes sense.

huntaub 1 year ago | | |

Interesting -- is that normally done with database updates + polling vs. something purpose-built?

zupa-hu 1 year ago | | |

Not sure how many such solutions there are out there so no idea about the norm. I doubt polling is a real option.

You may want to search for realtime databases.

alexpadula 1 year ago |

15655 ops a second on a Hetzner CCX23 machine with 4 vCPU and 16GB RAM is rather slow for an in-memory database, I hate to say it. You can't blame that on network latency: for example, supermassivedb.com is written in Go and achieves far more, about 20x, and it's persisted. I must investigate the bottlenecks with Dice.

rebolek 1 year ago |

- proudly open source. cool!
- join discord. YAY :(

throwaway2037 1 year ago |

FYI: Here is the creator and maintainer's profile: https://github.com/arpitbbhayani

Is there a plan to commercialise this product? (Offer commercial support, features, etc.) I could not find anything obvious from the home page.

sidcool 1 year ago |

Is Arpit the system design course guy?

arpitbbhayani 1 year ago | |

Yes. I do run a sys design course on weekends.

Aeolun 1 year ago |

I feel like this needs a ‘Why DiceDB instead of Redis or Valtio’ section prominently on the homepage.

dkh 1 year ago | |

Did you mean Valkey, or has the js community now managed to shoehorn an entire high-availability database server into a javascript object proxy?

Aeolun 1 year ago | | |

It’s only a matter of time xD but yes, I meant Valkey.

I was typing that out and felt like something was wrong but couldn’t put my finger on what.

DrammBA 1 year ago |

I love the "Follow on twitter" link with the old logo and everything, they probably used a template that hasn't been updated recently but I'm choosing to believe it's actually a subtle sign of protest or resistance.

spiderfarmer 1 year ago | |

Just use Bluesky. It’s the better middle finger.

arpitbbhayani 1 year ago | |

I prefer that over X icon.

datadeft 1 year ago |

Does this suffer from the same problems as Redis when trying to scale horizontally?

weekendcode 1 year ago | |

I guess yes.

re-lre-l 1 year ago |

> For Modern Hardware fully utilizes underlying core to get higgher throughput and better hardware utilization.

Would be great to disclose details of this one. I'm interested in how DiceDB achieves higher throughput.

robertlagrant 1 year ago |

> fully utilizes underlying core to get higgher throughput and better hardware utilization

FYI this is a misspelling of "higher"

nylonstrung 1 year ago |

Who is this for? Can you help me understand why and when I'd want to use this in place of Redis/Dragonfly?

deadbabe 1 year ago |

I think Postgres can do everything this does and better if you use LISTEN/NOTIFY.

999900000999 1 year ago |

I like it!

Any way to persist data in case of reboots?

That's the only thing missing here.

Is Go the only SDK ?

lucifercr7 1 year ago | |

Snapshot functionality is WIP; it can be used to persist and replay data between reboots. For now the Go SDK is the only one; more SDKs are to be added soon.

retropragma 1 year ago |

Why would I use this over keyspace notifications in redis?

dkh 1 year ago | |

Based on this thread, I'm not sure you would want to use this over keyspace notifications, but I will also say that there comes a point in the maturity of a system when keyspace notifications become a complicated, unreliable, resource-heavy nightmare. They work fine if your needs and scale are limited, but they're definitely not what you want if you are handling lots of frequent changes across craploads of keys, with complicated logic for who needs them and how they get routed, and where it matters whether the notification is successfully received.

But certainly you could build something to handle these and most other needs in this realm with mostly just redis, using streams for what needs to be more robust, in tandem with pub/sub, keyspace notifs, etc. in the areas they are suited to.