Google launches Gemini 3.5 Flash with 1M-token context and Gemini Omni video generation at I/O 2026; processes 3.2 quadrillion tokens/month across 900M+ monthly users
Latent Space · May 20, 2026
AI Summary
•Gemini 3.5 Flash is now generally available with a 1M-token context window, 65k max output tokens, four thinking levels (minimal, low, medium, high), and thought preservation across multi-turn conversations. Google claims it runs 4x faster than comparable frontier models and up to 12x faster in Antigravity, a platform for running background agents and long-horizon tasks.
•Gemini Omni, a new model family that combines Gemini reasoning with generative media capabilities, accepts text, image, video, and audio inputs and generates or edits video in Gemini, Flow, and Shorts, with API access to follow later. Omni Flash is available today to paid users in Gemini and Flow, and rolls out to free users in Shorts and Create starting this week.
•Google reports processing 3.2 quadrillion tokens/month, up roughly 7x year-over-year from 480 trillion/month. The Gemini app has 900M+ monthly users and is available in 230+ countries and 70+ languages.
•Artificial Analysis benchmarks put Gemini 3.5 Flash at Intelligence Index 55 (+9 vs. Gemini 3 Flash) and GDPval-AA 1656 Elo, with pricing of $1.50 per 1M input tokens and $9.00 per 1M output tokens. Arena independently reports the model at #9 in Text Arena and #9 in Code Arena: Frontend, with a score of 1507 (+70 over Gemini 3 Flash).
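To make the quoted numbers concrete, here is a minimal Python sketch of the arithmetic behind the pricing and growth figures above. The function names `request_cost` and `growth_multiple` are hypothetical, and reading "65k" as 65,000 tokens is an assumption; the prices and token volumes are the ones quoted in the summary.

```python
from decimal import Decimal

# List prices quoted in the summary, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("1.50")
OUTPUT_PRICE_PER_M = Decimal("9.00")
TOKENS_PER_M = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the quoted list prices.

    Hypothetical helper: it ignores any caching discounts, tiered pricing,
    or separate billing for thinking tokens, none of which the summary covers.
    """
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


def growth_multiple(current: int, previous: int) -> float:
    """Return how many times larger `current` is than `previous`."""
    return current / previous


# Worst-case single request: full 1M-token context plus max output.
# Assumption: "65k" is treated as 65,000 tokens.
full_context_cost = request_cost(1_000_000, 65_000)

# Monthly token volume: 3.2 quadrillion now vs. 480 trillion a year earlier.
yoy = growth_multiple(3_200_000_000_000_000, 480_000_000_000_000)

print(f"Full-context request: ${full_context_cost}")
print(f"Year-over-year growth: {yoy:.2f}x")
```

Under these assumptions, a request that fills the whole context window and emits the maximum output costs about $2.09, and the reported volume growth works out to about 6.7x, which the summary rounds to 7x.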