Today we’re extending that approach to Claude Code via Arch Gateway[2], bringing multi-LLM access into a single CLI agent with two main benefits:
1. Model Access: Use Claude Code alongside Grok, Mistral, Gemini, DeepSeek, GPT or local models via Ollama.
2. Preference-aware Routing: Assign different models to specific coding tasks, such as code generation, code reviews and comprehension, architecture and system design, and debugging.
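To make the second point concrete, here is a minimal sketch of a task-to-model assignment. It is an illustration only, not Arch Gateway's actual configuration format or API: the `ROUTING_PREFERENCES` table, the `pick_model` helper, and the placeholder model names are all hypothetical, and the sketch assumes the task label has already been determined upstream.

```python
# Hypothetical illustration of preference-aware routing.
# The table, the helper, and the model names below are placeholders,
# not Arch Gateway's real config schema or identifiers.

ROUTING_PREFERENCES = {
    "code generation": "model-for-generation",
    "code reviews and comprehension": "model-for-review",
    "architecture and system design": "model-for-design",
    "debugging": "model-for-debugging",
}

# Fallback used when a task does not match any stated preference.
DEFAULT_MODEL = "model-default"


def pick_model(task: str) -> str:
    """Return the model assigned to a task label, or the default."""
    return ROUTING_PREFERENCES.get(task.strip().lower(), DEFAULT_MODEL)


if __name__ == "__main__":
    for task in ("Debugging", "Code generation", "Writing docs"):
        print(f"{task!r} -> {pick_model(task)}")
```

The point of the sketch is that the mapping is written by the developer in plain language, so a routing decision can be traced back to a stated preference rather than to a benchmark score.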
Why not route based on public benchmarks? Most routers lean on performance metrics: public benchmarks like MMLU or MT-Bench, or raw latency/cost curves. The problem is that these metrics miss domain-specific quality, subjective evaluation criteria, and the nuance of what a “good” response actually means for a particular user. Routing built on them can also be opaque, hard to debug, and disconnected from real developer needs.
[1] Arch-Router: https://huggingface.co/katanemo/Arch-Router-1.5B
[2] Arch Gateway: https://github.com/katanemo/archgw