Simple, zero overhead way to compress model, KV cache via Low-Rank Decomposition | Dark Hacker News