Writing an LLM from scratch, part 32f – Interventions: weight decay | Dark Hacker News