Avatarl: Training language models from scratch with pure reinforcement learning | Dark Hacker News