Best speech-to-text API, by language

English leaderboards do not transfer to other languages. We benchmark STT providers per language on FLEURS read speech, route every provider through the same Speko gateway, and publish the measured numbers. On the 2026-06-03 run, two providers split the wins across four languages, which is exactly why routing per language beats picking one vendor.

English

OpenAI GPT-4o Transcribe: 2.4% WER

Full English leaderboard: accuracy, latency, list price, and which providers do not support it.

Thai

ElevenLabs Scribe v2: 4.1% CER

Full Thai leaderboard: accuracy, latency, list price, and which providers do not support it.

Indonesian

OpenAI GPT-4o Transcribe: 2.4% WER

Full Indonesian leaderboard: accuracy, latency, list price, and which providers do not support it.

Vietnamese

ElevenLabs Scribe v2: 1.9% WER

Full Vietnamese leaderboard: accuracy, latency, list price, and which providers do not support it.
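
To illustrate what per-language routing looks like in practice, here is a minimal sketch that turns the four winners listed above into a lookup table. The names BEST_STT and pick_provider are hypothetical and are not part of any Speko API, and the two-letter language codes are an assumption about how a caller would label audio.

```python
from typing import Optional

# Top-ranked provider per language on the 2026-06-03 run, as listed above.
# Keys are ISO 639-1 codes (an assumption; the page names languages in full).
BEST_STT = {
    "en": "OpenAI GPT-4o Transcribe",
    "th": "ElevenLabs Scribe v2",
    "id": "OpenAI GPT-4o Transcribe",
    "vi": "ElevenLabs Scribe v2",
}


def pick_provider(language: str, default: Optional[str] = None) -> Optional[str]:
    """Return the benchmark winner for a language tag such as 'th' or 'en-US'.

    Only the primary subtag is used, so 'en-US' and 'en' map to the same entry.
    Languages outside the benchmark fall back to `default`.
    """
    code = language.strip().lower().split("-")[0]
    return BEST_STT.get(code, default)


if __name__ == "__main__":
    for tag in ("en-US", "th", "id", "vi", "fr"):
        print(tag, "->", pick_provider(tag, default="(not benchmarked)"))
```

A table like this would need refreshing after each benchmark run, since the winner for a language can change between runs.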

Methodology in one line

FLEURS read-speech clips per language, loudness-normalized to -16 LUFS, every provider measured through the same gateway, scored as WER (CER for Thai, which is written without spaces between words). Full methodology and interactive tables live at benchmarks.speko.ai.
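
For readers unfamiliar with the two metrics: WER is the edit distance between the reference and the transcript, counted in words (substitutions + deletions + insertions), divided by the number of reference words. CER is the same ratio counted in characters. The sketch below shows the standard definitions only. It is not the benchmark's scoring code; any text normalization applied before scoring (casing, punctuation, number formatting) is not described here, and dropping spaces before computing CER is an assumption made for this example.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, deletions, and insertions
    needed to turn `ref` into `hyp`."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (ref token missing from hyp)
                curr[j - 1] + 1,     # insertion (extra token in hyp)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance / reference length.

    Spaces are removed first (an assumption), so the score does not depend on
    where a provider chooses to insert spaces in unsegmented scripts.
    """
    ref_chars = reference.replace(" ", "")
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hypothesis.replace(" ", "")) / len(ref_chars)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(wer("the cat sat on the mat", "the cat sat on a mat"))
    # One substituted character out of four reference characters.
    print(cer("abcd", "abxd"))
```

Because WER divides by reference length, a transcript with many inserted words can score above 100%.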

Looking for text-to-speech?

TTS benchmarks by language

Full interactive STT benchmark