LLM Benchmark
5 models · sourced from OpenRouter · refreshed April 27, 2026
Sourced from OpenRouter
Speko benchmarks STT, TTS, and S2S firsthand. We do not independently benchmark LLM quality. The pricing, context window, and provider data shown below are pulled live from OpenRouter. Cost-per-minute is a Speko estimate assuming a typical voice agent: 12 turns per minute, with 150 input tokens and 80 output tokens per turn (1,800 input and 960 output tokens per minute). Use it as an order-of-magnitude guide, not a billing forecast.
| Provider · Model | $/min (voice) | Input $/M | Output $/M | Context |
|---|---|---|---|---|
| Qwen: Qwen-Turbo · Alibaba (Qwen) | <$0.001 | $0.033 | $0.130 | 131K |
| OpenAI: GPT-4.1 Nano · OpenAI · text+image+file->text | <$0.001 | $0.100 | $0.400 | 1.0M |
| Google: Gemini 2.5 Flash Lite · Google · text+image+file+audio+video->text | <$0.001 | $0.100 | $0.400 | 1.0M |
| xAI: Grok 4 Fast · xAI · text+image+file->text | <$0.001 | $0.200 | $0.500 | 2.0M |
| Anthropic: Claude Haiku 4.5 · Anthropic · text+image->text | $0.0066 | $1.00 | $5.00 | 200K |
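The $/min (voice) column can be reproduced from the stated assumptions (12 turns per minute, 150 input and 80 output tokens per turn). The sketch below is one plausible reading of that estimate, not Speko's actual code: it treats the token counts as flat per turn with no accumulated conversation history, and the function names and the `<$0.001` display threshold are assumptions inferred from the table.

```python
# Voice-agent assumptions as stated on the page.
TURNS_PER_MINUTE = 12
INPUT_TOKENS_PER_TURN = 150
OUTPUT_TOKENS_PER_TURN = 80


def voice_cost_per_minute(input_usd_per_m: float, output_usd_per_m: float) -> float:
    """Estimate USD per minute from per-million-token prices (hypothetical helper)."""
    input_tokens = TURNS_PER_MINUTE * INPUT_TOKENS_PER_TURN    # 1,800 tokens per minute
    output_tokens = TURNS_PER_MINUTE * OUTPUT_TOKENS_PER_TURN  # 960 tokens per minute
    total_usd_times_million = (
        input_tokens * input_usd_per_m + output_tokens * output_usd_per_m
    )
    return total_usd_times_million / 1_000_000


def format_cost(cost: float) -> str:
    """Assumed display rule: values under a tenth of a cent collapse to '<$0.001'."""
    return "<$0.001" if cost < 0.001 else f"${cost:.4f}"


# Claude Haiku 4.5 row: $1.00 input, $5.00 output per million tokens.
print(format_cost(voice_cost_per_minute(1.00, 5.00)))    # $0.0066
# Qwen-Turbo row: $0.033 input, $0.130 output per million tokens.
print(format_cost(voice_cost_per_minute(0.033, 0.130)))  # <$0.001
```

Under this reading, output tokens dominate the estimate for the Claude Haiku 4.5 row: 960 output tokens at $5.00/M contribute $0.0048 of the $0.0066 total.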
Methodology
Models are pulled from OpenRouter's public model catalog, filtered to first-party hosted models from each vendor, and sorted from cheapest to most expensive by Speko's voice-agent cost-per-minute estimate. Open-weight models that require a third-party host (Groq, Together, Fireworks) are excluded. We refresh once per hour. STT and TTS rankings are still measured firsthand by Speko Bench CLI.
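The filter-and-sort step described above might look roughly like the following. This is a minimal sketch over in-memory rows rather than a live catalog call. The `ModelRow` type, the `first_party` flag, and the placeholder open-weight entry (with made-up prices) are all hypothetical, since the page does not say how first-party hosting is detected.

```python
from dataclasses import dataclass


@dataclass
class ModelRow:
    name: str
    input_usd_per_m: float   # USD per million input tokens
    output_usd_per_m: float  # USD per million output tokens
    first_party: bool        # hypothetical flag: hosted by the model's own vendor


def cost_per_minute(row: ModelRow, turns: int = 12,
                    tokens_in: int = 150, tokens_out: int = 80) -> float:
    """Voice-agent estimate using the page's stated per-turn assumptions."""
    per_minute = turns * (tokens_in * row.input_usd_per_m
                          + tokens_out * row.output_usd_per_m)
    return per_minute / 1_000_000


def rank(rows: list[ModelRow]) -> list[ModelRow]:
    """Keep first-party rows, then sort cheapest first (sorted() is stable on ties)."""
    return sorted((r for r in rows if r.first_party), key=cost_per_minute)


# Prices for the five listed models come from the table; the last row is an
# illustrative placeholder with invented prices, present only to show exclusion.
ROWS = [
    ModelRow("Claude Haiku 4.5", 1.00, 5.00, True),
    ModelRow("Grok 4 Fast", 0.200, 0.500, True),
    ModelRow("GPT-4.1 Nano", 0.100, 0.400, True),
    ModelRow("Gemini 2.5 Flash Lite", 0.100, 0.400, True),
    ModelRow("Qwen-Turbo", 0.033, 0.130, True),
    ModelRow("Placeholder open-weight model (third-party host)", 0.05, 0.08, False),
]

ranked = rank(ROWS)
for row in ranked:
    print(row.name)
```

GPT-4.1 Nano and Gemini 2.5 Flash Lite have identical prices, so their relative order here depends only on input order. A real pipeline would presumably need an explicit tie-breaker.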