Exploration Hacking: Can LLMs Learn to Resist RL Training? (alignmentforum.org)
2 points by Prof_Sigmund 38 days ago | 0 comments