Cohere's minimal compute new LLM (cohere.com)
I also really like the option to choose between the "Strict" and "Contextual" safety modes through the chat template/system prompt. It allows censoring the model in a customizable manner for business use cases, while keeping it minimally censored where such restrictions aren't needed. It's so refreshing to see a good-quality model that does what I ask it to do out of the box without condescendingly moralizing, censoring itself, or putting excessive disclaimers everywhere. It's so much better than the approach that the likes of Google and Microsoft take with their Gemma and Phi models.
In terms of knowledge and intelligence, regardless of Google's marketing spin and benchmark gaming, this is vastly superior to yesterday's Gemma 3 27b, as you would expect for a model that's 4x bigger. I like its default writing style and tone much better than Gemma 3 too. As of today, this and Mistral Large 2411 are the two best models you can run locally within 128 GB of RAM.