RLHF: Reinforcement Learning from Human Feedback (huyenchip.com) 4 points by madisonmay 3 years ago | 1 comment