GLM-5.2 is probably the most powerful text-only open weights LLM

17 points by Brajeshwar 3 hours ago | 1 comment

besterman23 2 hours ago

I wonder if multiple attempts at the opossum would produce better results.

If we didn’t have the previous example, I would interpret this as pretty solid evidence that labs were training on the Pelican “benchmark”.

I just can’t imagine a model dropping so significantly from one version to the next on such a silly task.