
OpenAI shipped three new voice models: GPT-Realtime-2 (for reasoning and real-time conversation), GPT-Realtime-Translate (covering 70+ input languages and 13 output languages), and GPT-Realtime-Whisper (for low-latency streaming transcription). All are available now through the Realtime API.
GPT-Realtime-2 expands the context window from 32,000 to 128,000 tokens to support longer conversations. It lets developers set reasoning intensity at one of five levels (minimal, low, medium, high, xhigh) and uses verbal stalling phrases such as 'one moment' to signal that the system is still working. On benchmarks, it reaches 96.6 percent accuracy on Big Bench Audio at the 'high' setting, up from 81.4 percent for its predecessor.
Pricing is token-based for GPT-Realtime-2 ($32 per million audio input tokens and $64 per million audio output tokens) and minute-based for the other two models ($0.034 per minute for translation, $0.017 per minute for transcription). The Realtime API supports EU data residency.
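To make the two billing schemes concrete, here is a minimal Python sketch that estimates cost from the prices quoted above. The function names and the token and minute counts are hypothetical, chosen only for illustration. The sketch covers audio tokens only, because the summary gives no other rates.

```python
from decimal import Decimal

# Prices as quoted in the summary above (USD).
AUDIO_INPUT_PER_MILLION = Decimal("32")    # GPT-Realtime-2, audio input tokens
AUDIO_OUTPUT_PER_MILLION = Decimal("64")   # GPT-Realtime-2, audio output tokens
TRANSLATE_PER_MINUTE = Decimal("0.034")    # GPT-Realtime-Translate
TRANSCRIBE_PER_MINUTE = Decimal("0.017")   # GPT-Realtime-Whisper


def realtime2_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the token-based audio cost for GPT-Realtime-2 (hypothetical helper)."""
    total = (input_tokens * AUDIO_INPUT_PER_MILLION
             + output_tokens * AUDIO_OUTPUT_PER_MILLION)
    return total / Decimal(1_000_000)


def per_minute_cost(minutes: int, rate_per_minute: Decimal) -> Decimal:
    """Estimate the minute-based cost for translation or transcription (hypothetical helper)."""
    return minutes * rate_per_minute


if __name__ == "__main__":
    # Illustrative usage figures, not measurements.
    print(realtime2_cost(1_000_000, 500_000))
    print(per_minute_cost(60, TRANSLATE_PER_MINUTE))
    print(per_minute_cost(60, TRANSCRIBE_PER_MINUTE))
```

Decimal is used instead of float so that small per-minute rates add up without binary rounding drift.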