Best Thai Speech-to-Text API (2026): independent benchmark

Thai speech-to-text accuracy, measured. 30 Thai read-speech clips from FLEURS, every provider routed through the same gateway and scored identically on 2026-06-03. Lower CER is better.

ElevenLabs Scribe v2 posted the lowest Thai CER at 4.1%, ahead of Alibaba Qwen3-ASR at 4.8%. Two of the six providers on our English board do not support Thai at all, and the four that do range from 4.1% to 8.1% CER, so picking a provider by its English score alone is a mistake.

Thai is written without spaces between words, so word error rate (WER) is ill-defined without first choosing a word segmenter. We score Thai with character error rate (CER) instead; the other languages on the board use WER.
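To make the metric concrete, here is a minimal sketch of CER as Levenshtein distance over characters divided by the reference length. The function names are ours, and the choices to count Unicode code points, apply NFC normalization, and strip whitespace are assumptions for illustration; the benchmark's exact text normalization is not specified on this page.

```typescript
// Split a transcript into "characters" for scoring.
// Assumption: a character is one Unicode code point after NFC normalization,
// with whitespace removed. Other normalizations will give other numbers.
function toChars(s: string): string[] {
  return Array.from(s.normalize('NFC').replace(/\s+/g, ''));
}

// Classic Levenshtein distance using two rolling rows.
function editDistance(ref: string[], hyp: string[]): number {
  // prev[j] = distance between the reference prefix seen so far and hyp[0..j)
  let prev: number[] = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const substitution = prev[j - 1]! + (ref[i - 1] === hyp[j - 1] ? 0 : 1);
      const deletion = prev[j]! + 1;
      const insertion = curr[j - 1]! + 1;
      curr.push(Math.min(substitution, deletion, insertion));
    }
    prev = curr;
  }
  return prev[hyp.length]!;
}

// CER = character edits / reference characters (0.041 corresponds to 4.1%).
function cer(reference: string, hypothesis: string): number {
  const ref = toChars(reference);
  const hyp = toChars(hypothesis);
  if (ref.length === 0) throw new Error('reference must not be empty');
  return editDistance(ref, hyp) / ref.length;
}

console.log(cer('abcd', 'abxd')); // one substitution in four characters: 0.25
```

Because the denominator is the reference length, a hypothesis with many insertions can score above 100%.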

Thai STT leaderboard

30 Thai clips (FLEURS), measured 2026-06-03 through the Speko gateway, loudness-normalized to -16 LUFS. CER is measured on the Thai clips; latency and list price come from the same gateway setup on the English board. Lower is better.

| Provider / model | CER | p50 latency | List price |
| --- | --- | --- | --- |
| ElevenLabs Scribe v2 | 4.1% | 1,353 ms | $0.0067/min |
| Alibaba Qwen3-ASR | 4.8% | 2,195 ms | - |
| xAI Grok STT | 6.6% | 996 ms | - |
| OpenAI GPT-4o Transcribe | 8.1% | 1,084 ms | $0.006/min |
| Cartesia Ink-2 | does not support | - | - |
| Gradium | does not support | - | - |
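The leaderboard can also be read as data. The sketch below copies the CER and p50 latency figures from the table into a hypothetical `BoardRow` shape (our naming, not a Speko type) and shows one way to pick the fastest provider within a CER budget. Keep in mind that the latency figures come from the English board, so this is an illustration of the trade-off, not a Thai latency measurement.

```typescript
// Hypothetical row shape for illustration; null means the table shows no value.
interface BoardRow {
  provider: string;
  cerPercent: number | null; // null = does not support Thai
  p50LatencyMs: number | null; // taken from the English board
}

const thaiBoard: BoardRow[] = [
  { provider: 'ElevenLabs Scribe v2', cerPercent: 4.1, p50LatencyMs: 1353 },
  { provider: 'Alibaba Qwen3-ASR', cerPercent: 4.8, p50LatencyMs: 2195 },
  { provider: 'xAI Grok STT', cerPercent: 6.6, p50LatencyMs: 996 },
  { provider: 'OpenAI GPT-4o Transcribe', cerPercent: 8.1, p50LatencyMs: 1084 },
  { provider: 'Cartesia Ink-2', cerPercent: null, p50LatencyMs: null },
  { provider: 'Gradium', cerPercent: null, p50LatencyMs: null },
];

type SupportedRow = BoardRow & { cerPercent: number; p50LatencyMs: number };

// Keep only providers that have both a Thai CER and a latency figure.
function supportedRows(rows: BoardRow[]): SupportedRow[] {
  return rows.filter(
    (r): r is SupportedRow => r.cerPercent !== null && r.p50LatencyMs !== null,
  );
}

// Lowest CER among supported providers.
function mostAccurate(rows: BoardRow[]): SupportedRow | undefined {
  return supportedRows(rows).sort((a, b) => a.cerPercent - b.cerPercent)[0];
}

// Lowest p50 latency among providers at or below the CER budget.
function fastestWithinCer(rows: BoardRow[], maxCerPercent: number): SupportedRow | undefined {
  return supportedRows(rows)
    .filter((r) => r.cerPercent <= maxCerPercent)
    .sort((a, b) => a.p50LatencyMs - b.p50LatencyMs)[0];
}

console.log(mostAccurate(thaiBoard)?.provider); // ElevenLabs Scribe v2
console.log(fastestWithinCer(thaiBoard, 7)?.provider); // xAI Grok STT
```

With a 7% CER budget the fastest option is xAI Grok STT at 996 ms; tightening the budget to 5% leaves ElevenLabs Scribe v2 and Alibaba Qwen3-ASR, and ElevenLabs is the faster of the two.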

How we measured

Full interactive table, every territory, and the complete methodology: benchmarks.speko.ai

Use the winner without lock-in

The best Thai provider today is one benchmark run away from being second best. Speko is one API in front of every provider on this table: it routes each request to the measured-best provider for your language and fails over automatically when a provider degrades. No per-vendor integration, no migration when the leaderboard flips.

curl

```sh
curl -X POST https://api.speko.dev/v1/transcribe \
  -H "Authorization: Bearer $SPEKO_API_KEY" \
  -H "Content-Type: audio/wav" \
  -H "x-speko-intent: {\"language\":\"th\"}" \
  --data-binary @call.wav
```

TypeScript

```typescript
import { Speko } from '@spekoai/sdk';
import { readFile } from 'node:fs/promises';

const speko = new Speko({ apiKey: process.env.SPEKO_API_KEY! });
const audio = await readFile('./call.wav');

const { text, provider, confidence } = await speko.transcribe(audio, {
  language: 'th',
});
```
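
The routing and failover behavior described above happens inside the gateway, and its implementation is not documented on this page. As a rough mental model only, ordered failover can be sketched as trying candidates in ranked order and moving on when one throws. The `Candidate` type and `transcribeWithFailover` function are hypothetical names for illustration and are not part of the Speko SDK.

```typescript
// Hypothetical types: each candidate wraps one provider behind a common call.
type Transcriber = (audio: Uint8Array) => Promise<string>;

interface Candidate {
  name: string;
  transcribe: Transcriber;
}

interface FailoverResult {
  provider: string;
  text: string;
}

// Try candidates in the given order (for example, sorted by measured CER)
// and return the first successful transcript.
async function transcribeWithFailover(
  candidates: Candidate[],
  audio: Uint8Array,
): Promise<FailoverResult> {
  const errors: string[] = [];
  for (const candidate of candidates) {
    try {
      const text = await candidate.transcribe(audio);
      return { provider: candidate.name, text };
    } catch (err) {
      // Record the failure and fall through to the next candidate.
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${candidate.name}: ${message}`);
    }
  }
  throw new Error(`all providers failed: ${errors.join('; ')}`);
}
```

A production router would likely also track error rates and timeouts per provider instead of reacting only to a thrown error on the current request; this sketch leaves that out.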

FAQ

What is the most accurate Thai speech-to-text API?

On Speko's 2026-06-03 FLEURS benchmark (30 Thai clips), ElevenLabs Scribe v2 posted the lowest CER at 4.1%, followed by Alibaba Qwen3-ASR at 4.8%.

Does ElevenLabs support Thai speech-to-text?

Yes. ElevenLabs Scribe v2 scored 4.1% CER on our Thai run, the best result on the board.

Why is Thai scored with CER instead of WER?

Thai script does not put spaces between words, so "word error rate" depends on an arbitrary segmenter. Character error rate avoids that, which is why Thai STT benchmarks (including ours) report CER.
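A small worked example shows the segmenter dependence. The two segmentations below are hand-written for illustration (they are not the output of any particular Thai tokenizer), and the hypothesis differs from the reference by a single dropped tone mark. The same pair of transcripts yields two different WER values, while the character-level score stays fixed.

```typescript
// Levenshtein distance over arbitrary tokens (words or characters).
function tokenEditDistance(ref: string[], hyp: string[]): number {
  let prev: number[] = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j - 1]! + cost, prev[j]! + 1, curr[j - 1]! + 1));
    }
    prev = curr;
  }
  return prev[hyp.length]!;
}

function errorRate(ref: string[], hyp: string[]): number {
  return tokenEditDistance(ref, hyp) / ref.length;
}

// Reference "ไปกินข้าว" vs hypothesis "ไปกินขาว" (tone mark dropped).
// Segmentation A splits into three words, segmentation B into two.
const werA = errorRate(['ไป', 'กิน', 'ข้าว'], ['ไป', 'กิน', 'ขาว']); // 1 of 3 words wrong
const werB = errorRate(['ไป', 'กินข้าว'], ['ไป', 'กินขาว']); // 1 of 2 words wrong

// Character level: one missing code point out of nine, whatever the segmentation.
const cerValue = errorRate(Array.from('ไปกินข้าว'), Array.from('ไปกินขาว'));

console.log({ werA, werB, cerValue });
```

Here WER moves from about 33% to 50% purely because of where the word boundaries were drawn, while CER is about 11% in both cases.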

Which providers do not support Thai transcription?

Cartesia Ink-2 and Gradium are English-only on our board: on Thai input they return text in the wrong script (roughly 76-100% error), so we mark them "does not support" instead of publishing a misleading number.

How was Thai STT accuracy measured?

We took 30 Thai read-speech clips from FLEURS, loudness-normalized them to -16 LUFS, sent them through the Speko gateway with the provider pinned, and scored the output as character error rate on 2026-06-03. Support is checked first: a provider is only benchmarked on a language it actually transcribes in the native script.
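
One plausible way to implement a native-script check like the one described above is to measure how much of a transcript falls inside the Unicode Thai block (U+0E00 to U+0E7F). This is our own sketch, not the benchmark's actual support test; the function names and the 0.8 threshold are assumptions chosen for illustration.

```typescript
// Unicode Thai block boundaries.
const THAI_BLOCK_START = 0x0e00;
const THAI_BLOCK_END = 0x0e7f;

// Fraction of non-whitespace code points that fall in the Thai block.
function thaiScriptRatio(text: string): number {
  const chars = Array.from(text).filter((ch) => !/\s/.test(ch));
  if (chars.length === 0) return 0;
  const thaiCount = chars.filter((ch) => {
    const codePoint = ch.codePointAt(0) ?? 0;
    return codePoint >= THAI_BLOCK_START && codePoint <= THAI_BLOCK_END;
  }).length;
  return thaiCount / chars.length;
}

// Assumed threshold: treat output as Thai when at least 80% of it is Thai script.
function looksLikeThai(text: string, minRatio = 0.8): boolean {
  return thaiScriptRatio(text) >= minRatio;
}

console.log(looksLikeThai('สวัสดี')); // true
console.log(looksLikeThai('sawasdee')); // false: romanized output is the wrong script
```

A check of this kind would flag a provider that answers Thai audio with Latin-script text before any error rate is computed.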

More language benchmarks

Best Thai TTS API
Best English STT API
Best Indonesian STT API
Best Vietnamese STT API
TTS benchmarks by language
Full interactive STT benchmark