Speech-to-Speech Benchmark
3 models · Speko-measured · coffee-order-happy scenario · N=20 per region · April 2026
| Provider | Architecture | P50 latency (tool-call turn) | Task Success | N | Source |
|---|---|---|---|---|---|
| OpenAI gpt-realtime | Native S2S | 3485ms | 85% | 20 | OpenAI gpt-realtime |
| xAI grok-voice-think-fast-1.0 | Native S2S | 1319ms | 95% | 20 | xAI Grok Voice Agent |
| Google gemini-live-2.5-flash-native-audio | Native S2S | 2655ms | 80% | 17 | Vertex AI Gemini Live |
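
For readers who want to reproduce this kind of summary from their own runs, here is a minimal sketch of how a P50 latency and a task success rate could be derived from per-run records. It assumes P50 means the median over completed runs and that each run yields one tool-call-turn latency plus a pass/fail flag; the benchmark's actual aggregation method is not stated here. The `summarize` function and the sample values are hypothetical and are not benchmark data.

```python
from statistics import median


def summarize(runs):
    """Aggregate per-run records into N, P50 latency, and task success rate.

    `runs` is a list of (latency_ms, succeeded) tuples, one per completed run.
    This record shape is an assumption made for illustration.
    """
    latencies = [latency_ms for latency_ms, _ in runs]
    successes = sum(1 for _, succeeded in runs if succeeded)
    return {
        "n": len(runs),
        "p50_ms": median(latencies),  # P50 is the median latency
        "task_success": successes / len(runs),
    }


# Hypothetical sample values, NOT taken from the benchmark above.
sample = [(1200, True), (1350, True), (1500, False), (1280, True)]
summary = summarize(sample)
print(summary)
```

With an even number of runs, `statistics.median` averages the two middle values; a nearest-rank percentile would give a slightly different figure, so the convention is worth stating alongside any reported P50.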