Best Indonesian Speech-to-Text API (2026): independent benchmark

Indonesian speech-to-text accuracy, measured. 30 Indonesian read-speech clips from FLEURS, every provider routed through the same gateway and scored identically on 2026-06-03. Lower WER is better.

OpenAI GPT-4o Transcribe posted the lowest Indonesian WER at 2.4%, ahead of xAI Grok STT at 2.9%. Two of the six providers on our English board do not support Indonesian at all, and the supported field spreads from 2.4% to 4.6%, so picking a provider by its English score alone is a mistake.

Indonesian STT leaderboard

30 Indonesian clips (FLEURS), measured 2026-06-03 through the Speko gateway, loudness-normalized to -16 LUFS. WER is measured on the Indonesian clips; latency and list price are taken from the English board, which uses the same gateway setup. Lower is better.

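| Provider / model         | WER              | p50 latency | List price  |
|--------------------------|------------------|-------------|-------------|
| OpenAI GPT-4o Transcribe | 2.4%             | 1,084 ms    | $0.006/min  |
| xAI Grok STT             | 2.9%             | 996 ms      | -           |
| ElevenLabs Scribe v2     | 3%               | 1,353 ms    | $0.0067/min |
| Alibaba Qwen3-ASR        | 4.6%             | 2,195 ms    | -           |
| Cartesia Ink-2           | does not support | -           | -           |
| Gradium                  | does not support | -           | -           |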
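
The leaderboard can also be read as data when choosing a provider in code. The sketch below is a minimal illustration, not part of the Speko SDK: the `LeaderboardRow` shape and the `rankSupported` helper are hypothetical names, and the WER values are copied from the table above. It drops providers marked "does not support" before ranking, which is the step an English-only score would skip.

```typescript
// Hypothetical row shape for illustration; null means "does not support".
interface LeaderboardRow {
  provider: string;
  werPercent: number | null;
}

type ScoredRow = LeaderboardRow & { werPercent: number };

// Indonesian WER values copied from the leaderboard (2026-06-03 run).
const indonesianBoard: LeaderboardRow[] = [
  { provider: 'OpenAI GPT-4o Transcribe', werPercent: 2.4 },
  { provider: 'xAI Grok STT', werPercent: 2.9 },
  { provider: 'ElevenLabs Scribe v2', werPercent: 3 },
  { provider: 'Alibaba Qwen3-ASR', werPercent: 4.6 },
  { provider: 'Cartesia Ink-2', werPercent: null },
  { provider: 'Gradium', werPercent: null },
];

// Keep only providers with a measured WER, then sort ascending (lower is better).
function rankSupported(rows: LeaderboardRow[]): ScoredRow[] {
  return rows
    .filter((row): row is ScoredRow => row.werPercent !== null)
    .sort((a, b) => a.werPercent - b.werPercent);
}

const ranked = rankSupported(indonesianBoard);
console.log(ranked.map((row) => `${row.provider}: ${row.werPercent}%`).join('\n'));
```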

How we measured

Full interactive table, every territory, and the complete methodology: benchmarks.speko.ai

Use the winner without lock-in

The best Indonesian provider today is one benchmark run away from being second best. Speko is one API in front of every provider on this table: it routes each request to the measured-best provider for your language and fails over automatically when a provider degrades. No per-vendor integration, no migration when the leaderboard flips.

curl
curl -X POST https://api.speko.dev/v1/transcribe \
  -H "Authorization: Bearer $SPEKO_API_KEY" \
  -H "Content-Type: audio/wav" \
  -H "x-speko-intent: {\"language\":\"id\"}" \
  --data-binary @call.wav

TypeScript
import { Speko } from '@spekoai/sdk';
import { readFile } from 'node:fs/promises';

const speko = new Speko({ apiKey: process.env.SPEKO_API_KEY! });
const audio = await readFile('./call.wav');

const { text, provider, confidence } = await speko.transcribe(audio, {
  language: 'id',
});

FAQ

What is the most accurate Indonesian speech-to-text API?

On Speko's 2026-06-03 FLEURS benchmark (30 Indonesian clips), OpenAI GPT-4o Transcribe posted the lowest WER at 2.4%, followed by xAI Grok STT at 2.9%.

Does ElevenLabs support Indonesian speech-to-text?

Yes. ElevenLabs Scribe v2 scored 3% WER on our Indonesian run.

Which providers do not support Indonesian transcription?

Cartesia Ink-2 and Gradium are English-only on our board: on Indonesian input they return text in the wrong script (roughly 76-100% error), so we mark them "does not support" instead of publishing a misleading number.
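
A wrong-script check of this kind can be approximated in a few lines. The sketch below is an assumption about how such a check might look, not the benchmark's actual implementation: it relies on the fact that Indonesian is written in the Latin script, and the function names and the 0.9 threshold are hypothetical.

```typescript
// Share of letters in a transcript that belong to the Latin script.
// Digits, punctuation, and combining marks are ignored.
function latinLetterRatio(text: string): number {
  const letters = text.match(/\p{L}/gu) ?? [];
  if (letters.length === 0) return 0;
  const latin = letters.filter((ch) => /\p{Script=Latin}/u.test(ch));
  return latin.length / letters.length;
}

// Hypothetical threshold: treat a transcript as native-script Indonesian
// when at least 90% of its letters are Latin.
function looksLikeLatinScript(text: string, minRatio = 0.9): boolean {
  return latinLetterRatio(text) >= minRatio;
}

console.log(looksLikeLatinScript('Selamat pagi, apa kabar?'));
console.log(looksLikeLatinScript('Привет, как дела?'));
```

A check like this only catches output in a different writing system; it would not flag a Latin-script transcript in the wrong language.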

How was Indonesian STT accuracy measured?

30 Indonesian read-speech clips from FLEURS, loudness-normalized to -16 LUFS, sent through the Speko gateway with the provider pinned, and scored as word error rate on 2026-06-03. Support is checked first: a provider is only benchmarked on a language it actually transcribes in the native script.
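
For readers who want to reproduce the scoring step, word error rate is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The sketch below is a minimal version under stated assumptions: the text normalization (lowercasing, stripping common punctuation, keeping hyphens so reduplicated words such as "anak-anak" stay one token) is a guess, since the normalizer used for the benchmark is not described here, and all function names are hypothetical.

```typescript
// Assumed normalization: lowercase, drop common punctuation, split on whitespace.
function normalizeWords(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[.,!?;:"()]/g, ' ')
    .split(/\s+/)
    .filter((word) => word.length > 0);
}

// Word-level Levenshtein distance using two rolling rows.
function wordEditDistance(ref: string[], hyp: string[]): number {
  let prev: number[] = [];
  for (let j = 0; j <= hyp.length; j++) prev.push(j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[hyp.length];
}

// WER = edit distance / number of reference words.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = normalizeWords(reference);
  const hyp = normalizeWords(hypothesis);
  if (ref.length === 0) throw new Error('reference must contain at least one word');
  return wordEditDistance(ref, hyp) / ref.length;
}

// One dropped word out of four reference words.
console.log(wordErrorRate('Saya pergi ke pasar.', 'saya pergi pasar'));
```

Because the denominator is the reference length, a hypothesis with many inserted words can score above 100%.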

More language benchmarks

Best English STT API
Best Thai STT API
Best Vietnamese STT API
TTS benchmarks by language
Full interactive STT benchmark