The human feedback fine-tuning concept was applied following strict policies and rules. The rules chosen by OpenAI would be very similar to those applied by DeepMind for the Sparrow dialogue model (Sep/2022), which is a fine-tuned version of DeepMind’s Chinchilla model.
The rules used for DeepMind Sparrow were selected by researchers from DeepMind (Alphabet), California Institute of Technology, University of Toronto, and University College Dublin.