Reinforcement Learning from Human Feedback

133 points by onurkanbkrc 145 days ago | 5 comments

verdverm 145 days ago

Last time I saw Nathan say something about the book, he was actively working on the next version and looking for feedback. Check his socials.

leggerss 145 days ago

You could say he's also learning from human feedback

dang 145 days ago

Related. Others?

klelatti 145 days ago

Web version with links, etc:

dang 145 days ago

Thanks! We've switched to that above from https://arxiv.org/abs/2504.12501, and put the latter in the toptext.