Training language models to follow instructions with human feedback | Dark Hacker News