Narrow finetuning can produce broadly misaligned LLMs (emergent-misalignment.com)

10 points by foweltschmerz 1 year ago | 3 comments

sylware 1 year ago

Is anybody pointing out the fact that "alignment" is brainwashing?

achierius 1 year ago

Is teaching a child? Is talking with your friend? Is punishing a criminal?

Or, more pointedly, what about training the model in the first place? Why do you pretend that AIs are somehow "people" with a "natural tendency" we're overriding?

marcellus23 1 year ago

Discussed 2 days ago: https://news.ycombinator.com/item?id=43176553