Thinking through how pretraining vs. RL learn | Dark Hacker News