OpenAI: Investigating the consequences of accidentally grading CoT during RL | Dark Hacker News