Ask HN: Small LM or API? Are small language models still worth it in 2026, or are most people just using APIs now?
That said, "worth it" still depends heavily on your hardware. A 4070 Ti gets you a very different answer than a 3060.
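To make the hardware point concrete, here is a rough back-of-the-envelope sketch in Python. It is not how any particular tool computes things. It assumes weights dominate memory use, that a fixed `overhead_gib` (a guess I picked) covers KV cache and runtime buffers, and that decoding is memory-bandwidth-bound, so each generated token reads every weight once. The function names and the example numbers are all hypothetical; plug in your own card's VRAM and bandwidth from its spec sheet.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: float, overhead_gib: float = 1.5) -> bool:
    """True if weights plus an assumed fixed overhead fit in VRAM.

    overhead_gib is a placeholder for KV cache and runtime buffers;
    real usage grows with context length and varies by runtime.
    """
    return weight_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


def decode_tok_s_ceiling(params_billion: float, bits_per_weight: float,
                         bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode speed for a memory-bound model.

    Assumes every weight is read once per generated token, so
    tokens/s <= memory bandwidth / weight bytes. Real throughput is lower.
    """
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / weight_bytes


if __name__ == "__main__":
    # Placeholder card: 12 GiB VRAM, 360 GB/s bandwidth (illustrative values).
    for params, bits in [(7, 4), (13, 4), (70, 4)]:
        print(params, bits,
              round(weight_gib(params, bits), 1),
              fits(params, bits, vram_gib=12),
              round(decode_tok_s_ceiling(params, bits, bandwidth_gb_s=360)))
```

Two cards with the same VRAM can fit the same models and still give very different tok/s, because the ceiling scales with memory bandwidth rather than capacity.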
Disclosure: I'm building localllm-advisor.com, free and client-side, which also helps answer these types of questions. It shows which models fit your GPU with quantization options and estimated tok/s, or which GPU you'd need to run a specific model. Relevant to the question so I'm mentioning it, but take it for what it is.