DuckDB can be 5x faster than Spark at 500M record files (blog.dataexpert.io)
A bit of an obvious one: small-data tech is faster at small data. It serves more as a lower-bound reminder of what counts as "small data" nowadays.
The article rightly starts with:
> Processing power on laptops has increased dramatically over the last twenty years. This allows single laptops to accomplish what we needed multi-node Spark clusters to do ten years ago.