AIToday

A rebuilt browser-based audio playground now lets you paste in documents and discuss them by voice with OpenAI's GPT-Realtime-2 model.

Simon Willison's Weblog · 9h ago · 1 min read

3 Key Points

  1. What happened: A developer rebuilt their WebRTC audio playground tool to support OpenAI's new GPT-Realtime-2 model (described by OpenAI as their first voice model with GPT-5-class reasoning) and added the ability to paste in document context for audio conversations.

  2. Why it matters: GPT-Realtime-2 offers a more capable voice interaction experience than the earlier WebRTC API model, with the added ability to ground conversations in specific documents, making voice a practical way to explore information interactively rather than just chat.

  3. What to watch: The tool is available in a web browser now; however, the GPT-Realtime-2 model has not yet appeared in the ChatGPT iPhone app despite its announcement last month.
