undefined | Dark Hacker News

1 points by touringa 3 years ago

touringa 3 years ago |

The human feedback fine-tuning concept was applied following strict policies and rules. The rules chosen by OpenAI would be very similar to those applied by DeepMind for the Sparrow dialogue model (Sep/2022), which is a fine-tuned version of DeepMind’s Chinchilla model.

The rules used for DeepMind Sparrow were selected by researchers from DeepMind (Alphabet), California Institute of Technology, University of Toronto, and University College Dublin.