LLM Benchmark

5 models · sourced from OpenRouter · refreshed April 27, 2026

Sourced from OpenRouter

Speko benchmarks speech-to-text (STT), text-to-speech (TTS), and speech-to-speech (S2S) firsthand. We do not independently benchmark LLM quality. The pricing, context window, and provider data shown below are pulled from OpenRouter and refreshed hourly. Cost-per-minute is a Speko estimate assuming a typical voice agent: 12 turns per minute, with 150 input tokens and 80 output tokens per turn (1,800 input and 960 output tokens per minute). Use it as an order-of-magnitude guide, not a billing forecast.
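The estimate described above can be sketched in a few lines of Python. This is a minimal illustration under the stated assumptions (12 turns per minute, 150 input and 80 output tokens per turn), not Speko's actual implementation. The function names and the `<$0.001` display threshold are assumptions inferred from the table on this page.

```python
# Assumed voice-agent workload, taken from the text above.
TURNS_PER_MINUTE = 12
INPUT_TOKENS_PER_TURN = 150
OUTPUT_TOKENS_PER_TURN = 80


def voice_cost_per_minute(input_usd_per_m: float, output_usd_per_m: float) -> float:
    """Estimate USD per minute from per-million-token prices."""
    input_tokens = TURNS_PER_MINUTE * INPUT_TOKENS_PER_TURN    # 1,800 tokens/min
    output_tokens = TURNS_PER_MINUTE * OUTPUT_TOKENS_PER_TURN  # 960 tokens/min
    total = input_tokens * input_usd_per_m + output_tokens * output_usd_per_m
    return total / 1_000_000


def format_cost(cost: float) -> str:
    """Hypothetical display rule: collapse very small values to '<$0.001'."""
    return "<$0.001" if cost < 0.001 else f"${cost:.4f}"


# Claude Haiku 4.5 prices from the table: $1.00 input, $5.00 output per million tokens.
print(format_cost(voice_cost_per_minute(1.00, 5.00)))  # $0.0066
```

With the Claude Haiku 4.5 prices this reproduces the $0.0066 figure listed in the table, which suggests the estimate does not account for conversation history being re-sent on each turn.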

| Model | Provider · Modalities | $/min (voice) | Input $/M tokens | Output $/M tokens | Context (tokens) |
|---|---|---|---|---|---|
| Qwen: Qwen-Turbo | Alibaba (Qwen) | <$0.001 | $0.033 | $0.130 | 131K |
| OpenAI: GPT-4.1 Nano | OpenAI · text+image+file->text | <$0.001 | $0.100 | $0.400 | 1.0M |
| Google: Gemini 2.5 Flash Lite | Google · text+image+file+audio+video->text | <$0.001 | $0.100 | $0.400 | 1.0M |
| xAI: Grok 4 Fast | xAI · text+image+file->text | <$0.001 | $0.200 | $0.500 | 2.0M |
| Anthropic: Claude Haiku 4.5 | Anthropic · text+image->text | $0.0066 | $1.00 | $5.00 | 200K |

Methodology

Models are pulled from OpenRouter's public model catalog, filtered to first-party hosted models from each vendor, and sorted from cheapest to most expensive by Speko's voice-agent cost-per-minute estimate. Open-weight models that require a third-party host (Groq, Together, Fireworks) are excluded. The data is refreshed once per hour. STT and TTS rankings are still measured firsthand with the Speko Bench CLI.
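The filter-and-sort step can be sketched as follows. This is an illustrative outline only: the `ModelEntry` fields, the `host` attribute, and the rule "host equals vendor" are assumptions about how first-party hosting might be detected, not OpenRouter's schema or Speko's code. The last sample entry is a made-up placeholder included only to show the exclusion.

```python
from dataclasses import dataclass

# Third-party hosts named in the methodology text.
THIRD_PARTY_HOSTS = {"Groq", "Together", "Fireworks"}


@dataclass
class ModelEntry:
    # Hypothetical record shape for illustration.
    name: str
    vendor: str
    host: str
    input_usd_per_m: float
    output_usd_per_m: float


def cost_per_minute(m: ModelEntry) -> float:
    # 12 turns/min, 150 input and 80 output tokens per turn.
    return (12 * 150 * m.input_usd_per_m + 12 * 80 * m.output_usd_per_m) / 1_000_000


def rank_models(entries: list[ModelEntry]) -> list[ModelEntry]:
    # Keep models served by their own vendor, then sort cheapest first.
    first_party = [
        m for m in entries
        if m.host == m.vendor and m.host not in THIRD_PARTY_HOSTS
    ]
    return sorted(first_party, key=cost_per_minute)


catalog = [
    ModelEntry("Claude Haiku 4.5", "Anthropic", "Anthropic", 1.00, 5.00),
    ModelEntry("Qwen-Turbo", "Alibaba (Qwen)", "Alibaba (Qwen)", 0.033, 0.130),
    ModelEntry("Grok 4 Fast", "xAI", "xAI", 0.200, 0.500),
    # Placeholder open-weight entry with invented prices; excluded by the filter.
    ModelEntry("example-open-model", "ExampleLab", "Together", 0.05, 0.05),
]

ranked_names = [m.name for m in rank_models(catalog)]
print(ranked_names)
```

Sorting on the unrounded estimate keeps the order meaningful even when several rows display as `<$0.001`.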
