Show HN: Compressing a Histogram from 16GB to 2KB(alexkranias.com) Thought I'd dive into this cool streaming/approximation algorithms problem I encountered a few months ago. TLDR: we can create our own custom floating point representation for encoding and decoding integers and use this to index into a tiny 2D histogram, upper-bounding approximation error based on the number of bits per integer we specify to keep. --- Broadly useful for aggregating statistics from massive data streams of user data. Also turns out this is incredibly similar to data structures used in production, like HDRHistogram. |