Writing an LLM from scratch, part 20 – starting training, and cross entropy loss | Dark Hacker News