Husky, Datadog's Third-Generation Event Store (datadoghq.com)
> Writers read from Kafka, (briefly) buffer events in memory, upload events to blob storage in our custom file format, and then commit the presence of these new files to our metadata store.... Compactors scan the metadata store for small files generated by the Writers and previous compactions, and compact them into larger files.... The Reader (leaf) nodes run queries over individual files in blob storage and return partial aggregates, which are re-aggregated by the distributed query engine.
And then the metadata supporting the system:
> Husky's metadata store has multiple responsibilities, but its most important one is to serve as the strongly consistent source of truth for the set of files currently visible to each customer. We’ll delve into the details of our metadata store more in future blog posts, but it is a thin abstraction around FoundationDB, which we selected because it was one of the few open source OLTP database systems that met our requirements
There are some nice scalability/isolation benefits in all this. Having reader nodes read from network storage creates a lot of flexibility and the ability to shift work around on demand.
Keeping all the metadata in FoundationDB is exciting, and sounds like a great use case for its safe transactional updates!
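To make the "safe transactional updates" point concrete, here is a rough sketch of why a compactor wants an atomic swap of the visible file set. It is an in-memory stand-in guarded by a lock, not FoundationDB and not Husky's actual schema; `MetadataStore`, `commit_new_file`, `swap_compacted`, and `visible_files` are names I made up for illustration.

```python
import threading


class MetadataStore:
    """Hypothetical in-memory stand-in for a strongly consistent metadata store."""

    def __init__(self):
        self._lock = threading.Lock()
        self._visible = {}  # customer -> set of visible file names

    def commit_new_file(self, customer, name):
        # A writer makes a freshly uploaded file visible to queries.
        with self._lock:
            self._visible.setdefault(customer, set()).add(name)

    def swap_compacted(self, customer, inputs, output):
        # Replace all `inputs` with `output` in one step, so a reader never
        # sees both the small files and the merged file (double counting)
        # or neither (missing data). Refuse if any input is already gone,
        # e.g. because another compactor got there first.
        with self._lock:
            files = self._visible.setdefault(customer, set())
            if not set(inputs) <= files:
                return False
            files.difference_update(inputs)
            files.add(output)
            return True

    def visible_files(self, customer):
        # Return an immutable snapshot of the current file set.
        with self._lock:
            return frozenset(self._visible.get(customer, ()))
```

In a real deployment the lock would be replaced by a database transaction, which is presumably where FoundationDB comes in.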
* You have services emit data into streams.
* You dump the streams into your storage at high frequency so you get near-real-time results; this process creates many small files.
* Because small files are inefficient, you have compactors that run over the small files and merge them into bigger files, and/or delete records that are obsolete.
* You run a query engine that reads over both the small and large files to get the final result.
* To speed up steps 2, 3, and 4, you store the files' metadata in memory or in a database.
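The steps above can be sketched as a toy, single-process pipeline. Everything here is an assumption for illustration (the `blob_storage` dict, the `metadata` set, the `service` field, the `max_events` threshold); it only shows that per-file partial aggregates give the same answer before and after compaction.

```python
from collections import Counter
from itertools import count

blob_storage = {}    # file name -> list of events (stand-in for blob storage)
metadata = set()     # names of the files currently visible to queries
_file_ids = count()  # generates unique file names


def write_batch(events):
    """Writer: flush a small buffer of events as one new file."""
    name = f"file-{next(_file_ids)}"
    blob_storage[name] = list(events)
    metadata.add(name)
    return name


def compact(max_events=3):
    """Compactor: merge all files holding at most `max_events` events."""
    small = sorted(n for n in metadata if len(blob_storage[n]) <= max_events)
    if len(small) < 2:
        return None  # nothing worth merging
    merged = [e for n in small for e in blob_storage[n]]
    name = write_batch(merged)
    # Not atomic in this sketch; a real system would swap in one transaction.
    for n in small:
        metadata.discard(n)
        del blob_storage[n]
    return name


def query_count_by_service():
    """Reader: one partial aggregate per file, then re-aggregate."""
    total = Counter()
    for name in metadata:
        partial = Counter(e["service"] for e in blob_storage[name])
        total += partial
    return total
```

Calling `write_batch` a few times, then `compact()`, should leave `query_count_by_service()` unchanged while shrinking the number of files the reader has to open.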
https://twitter.com/fulmicoton/status/1526776987553263616 https://github.com/quickwit-oss/quickwit
It's a shame; the product is kind of nice. But this is 100% off-putting.
One mistake in my logs, and my account was due more than US$10k, until a manager contacted me after a month. It appears to be a method to force a "sales" call.
A simple daily indicator of how much you owe would solve this kind of problem. (Google/Reddit show that this kind of problem has happened all the time over the last 2 years.)
As for the tech, it seemed like a quality product.