Train a LLM from Scratch
(github.com)
3 points by linhns 13 days ago | 1 comment
subtick 13 days ago
Curious — how did you handle training stability early on? Was convergence an issue without heavy tuning?