1 point by remilouf 2 years ago

remilouf 2 years ago |

LLM evaluation results are highly sensitive to the details of prompt structure. This post shows how structured generation reduces both the variance of the results and the shifts in model rankings.
