SGLang: Fast and Expressive LLM Inference with RadixAttention for 5x Throughput | Dark Hacker News
(github.com)
2 points by covi 1 year ago | 0 comments
No comments yet