Best Filipino Text-to-Speech API (2026): independent benchmark

Filipino (Tagalog) breaks the usual TTS scoring playbook: it natively code-switches with English (Taglish), so English-sounding phones are expected content, not an accent tell. We measured 10 systems on Speko's Filipino eval set with the checks that stay objective: an intelligibility gate, round-trip CER, pacing, and signal hygiene.

8 of 10 systems produce intelligible Filipino. Polly Generative and Deepgram Aura 2 fail the gate: their "Filipino" output comes back detected as English. Among the systems that pass, xAI / Grok TTS posted the lowest round-trip CER at 1.5%.

No acoustic feature validly ranks Filipino quality: the rhythm metric that works for Thai inverts here, and English-phone intrusion correlates the wrong way because Taglish makes English phones legitimate content. Accent and naturalness are human-rated only.

Filipino TTS measurements

Sorted by round-trip CER, gate failures last. Objective checks only (intelligibility, pacing, hygiene): no acoustic metric validly ranks Filipino naturalness, so none is shown.

| System | Type | Gate (detected) | Round-trip CER | Pacing (w/s) | True peak (dBTP) |
|---|---|---|---|---|---|
| xAI / Grok TTS | tts | pass (tl) | 1.5% | 2.51 | -4.25 |
| Cartesia Sonic 3.5 | tts | pass (tl) | 2.2% | 2.91 | -0.86 (hot) |
| ElevenLabs v3 | tts | pass (tl) | 2.4% | 2.27 | -0.37 (hot) |
| Inworld TTS 2 | tts | pass (tl) | 3.0% | 2.78 | -4.5 |
| GPT Realtime | realtime | pass (tl) | 3.5% | 2.38 | -3.51 |
| GPT Realtime v2 | realtime | pass (tl) | 3.7% | 2.21 | -5.27 |
| GPT-4o mini TTS | tts | pass (tl) | 5.6% | 2.06 | -11.78 |
| MiniMax Speech 2.6 HD | tts | pass (tl) | 5.6% | 2.02 | -1.47 |
| Polly Generative | tts | fail (detected English) | 15.2% | 2.08 | -5.81 |
| Deepgram Aura 2 | tts | fail (detected English) | 57.4% | 1.09 (outside band) | -4.71 |
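
The ordering rule for the table (lowest round-trip CER first, gate failures last) can be sketched as a two-key sort. This is an illustration only: the `Measurement` shape, `passesGate`, and `sortForTable` are hypothetical names, and treating "detected as `tl`" as the whole gate is an assumption, since the full gate criteria live in the methodology.

```typescript
interface Measurement {
  system: string;
  detected: string; // language code reported for the synthesized audio, e.g. 'tl' or 'en'
  cerPercent: number; // round-trip CER in percent
}

// Assumption: a system passes the gate when its output is detected as Tagalog ('tl').
function passesGate(m: Measurement): boolean {
  return m.detected === 'tl';
}

// Gate passes come first; within each group, lower CER ranks higher.
function sortForTable(rows: Measurement[]): Measurement[] {
  return rows.slice().sort(function (a, b) {
    const gateA = passesGate(a) ? 0 : 1;
    const gateB = passesGate(b) ? 0 : 1;
    if (gateA !== gateB) return gateA - gateB;
    return a.cerPercent - b.cerPercent;
  });
}

// Four rows taken from the table, deliberately out of order.
const sample: Measurement[] = [
  { system: 'Polly Generative', detected: 'en', cerPercent: 15.2 },
  { system: 'ElevenLabs v3', detected: 'tl', cerPercent: 2.4 },
  { system: 'Deepgram Aura 2', detected: 'en', cerPercent: 57.4 },
  { system: 'xAI / Grok TTS', detected: 'tl', cerPercent: 1.5 },
];

const ordered = sortForTable(sample).map(function (m) {
  return m.system;
});
console.log(ordered.join(' > '));
```

Keeping the gate as the primary sort key means a failing system can never outrank a passing one, whatever its CER.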

How we measured

Full interactive panels, audio clips, and the complete methodology: benchmarks.speko.ai
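
For readers who want to reproduce the headline number, round-trip CER is conventionally computed by synthesizing the text, transcribing the audio with a speech recognizer, and taking the character-level edit distance between transcript and original, divided by the original's length. The sketch below is one minimal way to do that. The text normalization (lowercasing, stripping basic punctuation) and the function names are assumptions for illustration, not Speko's published pipeline.

```typescript
// Assumed normalization: lowercase, drop basic punctuation, collapse whitespace.
function normalizeText(text: string): string {
  return text
    .toLowerCase()
    .replace(/[.,!?;:'"()]/g, '')
    .replace(/\s+/g, ' ')
    .trim();
}

// Levenshtein distance over characters, keeping only two rows of the DP table.
function editDistance(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// CER = edits needed to turn the transcript into the reference / reference length.
function roundTripCer(reference: string, transcript: string): number {
  const ref = normalizeText(reference);
  const hyp = normalizeText(transcript);
  if (ref.length === 0) return hyp.length === 0 ? 0 : 1;
  return editDistance(ref, hyp) / ref.length;
}
```

With this normalization, `roundTripCer('Kumusta! Salamat sa pagtawag.', 'kumusta salamat sa pagtawag')` is 0, because only case and punctuation differ. A stricter or looser normalization would shift every system's score, which is why CER values are only comparable within one pipeline.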

Use the best Filipino voice without lock-in

Speko is one API in front of every system on this page: it routes each request to the measured-best provider for your language and fails over automatically when a provider degrades. When the next run reshuffles this table, your integration does not change.

curl
curl -X POST https://api.speko.dev/v1/synthesize \
  -H "Authorization: Bearer $SPEKO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Kumusta! Salamat sa pagtawag.", "intent": {"language": "fil"}}' \
  --output reply.audio

TypeScript
import { Speko } from '@spekoai/sdk';

const speko = new Speko({ apiKey: process.env.SPEKO_API_KEY! });

const { audio, provider, model } = await speko.synthesize('Kumusta! Salamat sa pagtawag.', {
  language: 'fil',
});

FAQ

What is the best Filipino text-to-speech API?

No acoustic metric validly ranks Filipino naturalness (Taglish code-switching inverts the usual accent signals), so we publish objective checks instead. 8 of 10 systems pass the intelligibility gate, and xAI / Grok TTS has the lowest round-trip CER at 1.5%. Accent and naturalness judgments are left to native raters.

Does AWS Polly support Filipino?

Polly Generative failed our Filipino intelligibility gate: its output was detected as English with a 15.2% round-trip CER, so it is excluded rather than ranked.

Does ElevenLabs support Filipino text-to-speech?

Yes. ElevenLabs v3 passes the gate with a 2.4% round-trip CER. One flag: its master peaks at -0.37 dBTP, above our -1 dBTP danger line for downstream clipping.
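
To check a clip against the same -1 dBTP line yourself, a rough sketch follows. Note the assumption: it measures plain sample peak on float samples in the range [-1, 1], which can under-read true peak. A proper dBTP reading also oversamples the signal to catch inter-sample peaks, as described in ITU-R BS.1770. The function names are hypothetical.

```typescript
const DANGER_LINE_DBTP = -1; // threshold quoted in the answer above

// Sample peak in dBFS for float PCM samples in [-1, 1].
// Assumption: this is sample peak, not true peak (no oversampling).
function samplePeakDbfs(samples: number[]): number {
  let peak = 0;
  for (let i = 0; i < samples.length; i++) {
    const v = Math.abs(samples[i]);
    if (v > peak) peak = v;
  }
  // 20 * log10(peak); digital silence has no finite level.
  return peak === 0 ? -Infinity : (20 * Math.log(peak)) / Math.LN10;
}

// A reading above the line is what the table marks as "hot".
function isAboveDangerLine(peakDb: number): boolean {
  return peakDb > DANGER_LINE_DBTP;
}
```

By this rule, `isAboveDangerLine(-0.37)` is true and `isAboveDangerLine(-1.47)` is false, matching the "(hot)" flags in the table.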

Why is there no Filipino naturalness ranking?

The rhythm metric that works for Thai inverts on Filipino (correlation -0.53), and English-phone intrusion correlates the wrong way (+0.44) because Taglish makes English phones legitimate content. No deterministic feature separates the "conyo" accent native speakers penalize from legitimate loanwords, so naturalness stays human-rated.
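
The validity check behind those two numbers can be sketched as follows: correlate a candidate acoustic metric with human ratings and reject it when the sign is opposite to what the metric is supposed to mean. The page does not say which correlation coefficient was used, so Pearson is an assumption here, as is the pairing against human ratings. `pearson` and `metricInverts` are hypothetical helper names.

```typescript
// Pearson correlation coefficient between two equal-length series.
function pearson(x: number[], y: number[]): number {
  if (x.length !== y.length || x.length < 2) {
    throw new Error('pearson needs two equal-length arrays with at least 2 points');
  }
  const n = x.length;
  let meanX = 0;
  let meanY = 0;
  for (let i = 0; i < n; i++) {
    meanX += x[i];
    meanY += y[i];
  }
  meanX /= n;
  meanY /= n;
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - meanX;
    const dy = y[i] - meanY;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  const denom = Math.sqrt(sxx * syy);
  return denom === 0 ? NaN : sxy / denom; // undefined when either series is constant
}

// expectedSign: +1 if a higher metric value should mean better ratings, -1 if worse.
function metricInverts(r: number, expectedSign: 1 | -1): boolean {
  return r * expectedSign < 0;
}
```

Under these assumptions, `metricInverts(-0.53, 1)` and `metricInverts(0.44, -1)` both return true: a rhythm score expected to track quality that correlates negatively, and an intrusion score expected to hurt quality that correlates positively.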

More language benchmarks

Best Thai TTS API
Best Vietnamese TTS API
STT benchmarks by language
Full interactive TTS benchmark