Dark Hacker News
Muon Is Scalable for LLM Training (github.com)
5 points by renonce 1 year ago | 1 comment
yorwba 1 year ago
For people who want to know more about the Muon optimizer:
https://kellerjordan.github.io/posts/muon/