NEXUS-Bench v0 Wave-9 · Production Ready
Multi-LLM routing that actually measures quality.
The only production API with Brier-calibrated accuracy scores on every response. LinUCB bandit routing learns from your calls. Most requests cost $0 via Groq.
1.000
Peak quality score (Groq Qwen3)
$0
Cost for most classification calls
6
Providers, 16 models
900+
Tests in production infra
Why NEXUS routes better than OpenRouter
⚡
Brier-calibrated quality
Every model has an empirical quality score from NEXUS-Bench: measured accuracy on 1,000+ test tasks, not marketing claims.
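For readers new to the metric, here is a minimal sketch of the standard Brier score: the mean squared difference between a predicted probability and the 0/1 outcome, where lower is better. This page does not say how NEXUS-Bench converts it into the quality score Q, so the function name `brier_score` and the sample numbers are illustrative assumptions only.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    0.0 is a perfect score; always predicting 0.5 scores 0.25.
    """
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have the same length")
    if not probabilities:
        raise ValueError("at least one prediction is required")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Illustrative values (not NEXUS-Bench data): the confidence a model
# reported for each answer, and whether that answer was correct (1) or not (0).
confidences = [0.9, 0.8, 0.3]
was_correct = [1, 1, 0]
print(round(brier_score(confidences, was_correct), 4))
```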
🧠
LinUCB bandit routing
LinUCB is a contextual bandit algorithm. It uses upper-confidence-bound exploration to refine model selection with each API call, so routing gets smarter over time.
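A rough sketch of how LinUCB-style routing can work, using the textbook disjoint LinUCB formulation in pure Python. It is not the NEXUS implementation: the class name `LinUCBRouter`, the feature vector, the reward signal, and the `alpha` value are all assumptions made for illustration.

```python
import math


class LinUCBRouter:
    """Disjoint LinUCB: one linear reward model per candidate LLM."""

    def __init__(self, models, dim, alpha=1.0):
        self.models = list(models)
        self.alpha = alpha  # exploration strength (assumed value)
        # Per model: the inverse of A = I + sum(x x^T), and b = sum(reward * x).
        self.A_inv = {
            m: [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
            for m in self.models
        }
        self.b = {m: [0.0] * dim for m in self.models}

    @staticmethod
    def _matvec(matrix, vec):
        return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]

    def ucb(self, model, x):
        """Estimated reward plus an exploration bonus for context x."""
        a_inv_x = self._matvec(self.A_inv[model], x)
        theta = self._matvec(self.A_inv[model], self.b[model])
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = math.sqrt(sum(xi * ai for xi, ai in zip(x, a_inv_x)))
        return mean + self.alpha * width

    def select(self, x):
        """Pick the model with the highest upper confidence bound."""
        return max(self.models, key=lambda m: self.ucb(m, x))

    def update(self, model, x, reward):
        """Fold one observed reward into the chosen model's statistics."""
        a_inv = self.A_inv[model]
        a_inv_x = self._matvec(a_inv, x)
        denom = 1.0 + sum(xi * ai for xi, ai in zip(x, a_inv_x))
        # Sherman-Morrison rank-one update keeps A_inv current without
        # re-inverting the matrix on every call.
        for i in range(len(x)):
            for j in range(len(x)):
                a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        self.b[model] = [bi + reward * xi for bi, xi in zip(self.b[model], x)]


# Hypothetical usage: a 2-feature context and a reward in [0, 1].
router = LinUCBRouter(["model-a", "model-b"], dim=2, alpha=0.1)
context = [1.0, 0.0]
choice = router.select(context)
router.update(choice, context, reward=0.0)
```

The exploration bonus shrinks as a model accumulates observations for similar contexts, which is what lets the router keep trying under-tested models early on and settle on the better ones later.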
$0
Groq models are FREE
Classification (Q=1.000) and research (Q=0.920) via Groq cost $0. Most use cases never touch paid models.
🎯
Task-type routing
Tell us the task type: classification, research, reasoning, or code-edit. We pick the cheapest model that meets your quality threshold.
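The selection rule above can be sketched as a filter followed by a minimum. This is a simplified illustration, not the production router: `ModelEntry`, `pick_model`, the model names, and the paid model's numbers are hypothetical placeholders. Only the free model's classification (1.000) and research (0.920) scores come from this page.

```python
from dataclasses import dataclass


@dataclass
class ModelEntry:
    name: str
    quality: dict          # task type -> quality score
    cost_per_call: float   # USD


def pick_model(catalog, task_type, min_quality):
    """Return the cheapest model whose score for task_type meets the threshold."""
    eligible = [m for m in catalog if m.quality.get(task_type, 0.0) >= min_quality]
    if not eligible:
        raise LookupError(f"no model reaches {min_quality} on {task_type!r}")
    # Cheapest first; break cost ties in favour of the higher quality score.
    return min(
        eligible,
        key=lambda m: (m.cost_per_call, -m.quality.get(task_type, 0.0)),
    )


catalog = [
    # Scores quoted on this page for the free Groq tier.
    ModelEntry("free-model", {"classification": 1.000, "research": 0.920}, 0.0),
    # Placeholder entry: every number here is invented for the example.
    ModelEntry(
        "paid-model",
        {"classification": 0.98, "research": 0.97, "reasoning": 0.95},
        0.002,
    ),
]

print(pick_model(catalog, "classification", 0.95).name)
print(pick_model(catalog, "research", 0.95).name)
```

With these sample entries, a classification request stays on the free model, while a research request with a 0.95 threshold falls through to the paid one.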
Live Brier Leaderboard
NEXUS-Bench v0 Wave-9 — empirical quality, not vendor claims
Simple pricing. Cheaper than AWS Bedrock.
500 free calls to start. No credit card required.
Starter
$49/mo
1,000 calls/mo
- ✓ All task types (classify, research, code, reason)
- ✓ Groq FREE models included
- ✓ Usage dashboard
- ✓ Brier quality scores in every response
- ✓ JSON response format
Most Popular
Professional
$149/mo
10,000 calls/mo
- ✓ Everything in Starter
- ✓ DeepSeek R1 reasoning access
- ✓ Priority routing queue
- ✓ Team API key management
- ✓ Usage analytics export
- ✓ $0.05/100 calls overage
Scale
$499/mo
100,000 calls/mo
- ✓ Everything in Professional
- ✓ Claude Sonnet 4.6 access
- ✓ SLA: 99.9% uptime
- ✓ Dedicated routing queue
- ✓ Custom quality thresholds
- ✓ White-label option available