thw20 | Dark Hacker News
user: thw20
created: April 24, 2024
karma: 1
submissions
1. Simple, zero overhead way to compress model, KV cache via Low-Rank Decomposition (jeffreywong20.github.io)
1 point by thw20 5 days ago | 0 comments
2. Towards understanding multiple attention sinks in LLMs (github.com)
1 point by thw20 65 days ago | 2 comments
3. The Existence and Behavior of Secondary Attention Sinks (arxiv.org)
1 point by thw20 87 days ago | 0 comments