Cohere's minimal compute new LLM

Cohere's minimal compute new LLM (cohere.com)

10 points by Knajjars 1 year ago | 1 comment

wizee 1 year ago

I tried it out locally and it's pretty good. It has a good writing tone and style, and a good level of knowledge, in line with expectations for its size. It's slightly worse than Mistral Large 2411 at STEM tasks, but very close in its general level of knowledge, and IMO better than Mistral Large in writing style and creative writing.

I also really like the option to choose between the "Strict" and "Contextual" safety modes through the chat template/system prompt. It allows censoring the model in a customizable manner for business use cases, while being minimally censored where such restrictions aren't needed. It's so refreshing to see a good-quality model that does what I ask it to do out of the box without condescendingly moralizing, censoring itself, or excessively putting disclaimers everywhere. It's so much better than the approach that the likes of Google and Microsoft take with their Gemma and Phi models.

In terms of knowledge and intelligence, regardless of Google's marketing spin and benchmark gaming, this is vastly superior to yesterday's Gemma 3 27b, as you would expect for a model that's 4x bigger. I like its default writing style and tone much better than Gemma 3 too. As of today, this and Mistral Large 2411 are the two best models you can run locally within 128 GB of RAM.