Understanding reinforcement learning for model training from scratch | Dark Hacker News