Unpacking the HF in RLHF (maestroai.substack.com)
See my comment on this thread:
https://news.ycombinator.com/item?id=35069965
I'd be glad to chat more (see my profile). As much as people think there is a scalability advantage to a big company training one big model, the problem of pleasing everybody, particularly advertisers, is terrible, whereas a personal model that only has to please one person might be easy with recent tech.