
Google has integrated computer-control capability directly into its Gemini 3.5 Flash AI model, enabling it to see and operate screens across computers, browsers, and mobile devices. On the OSWorld benchmark the model scores 78.4, ahead of Gemini 3 Flash and GPT-5.4 mini, which makes it a practical base for developers building automated agents for software testing and office work. The feature includes optional built-in safeguards and is available now through Google's Gemini API and the Gemini Enterprise Agent Platform.
What happened
Google integrated "Computer Use" into Gemini 3.5 Flash, allowing the model to see, understand, and interact with computers, browsers, and mobile devices on its own. Previously this capability was only available as a separate Gemini 2.5 model. The feature is now available through the Gemini API and the Gemini Enterprise Agent Platform.
Why it matters
On the OSWorld benchmark, Gemini 3.5 Flash scores 78.4, beating Gemini 3 Flash (65.1) and GPT-5.4 mini (72.1), putting it among the top-performing models for computer interaction tasks. This opens the door for developers to build agents that automate software testing, office tasks, and browser workflows across multiple device types.
What to watch
Google has built in two optional enterprise safeguards: one requires user confirmation before sensitive actions, and the other automatically stops a task when an indirect prompt injection is detected. The company also recommends sandboxing, human oversight, and strict access controls to guard against abuse.
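For developers wondering how the two safeguards might fit into an agent loop, here is a minimal sketch in plain Python. It does not call the Gemini API. The `Action` type, the `SENSITIVE_KINDS` set, the `injection_suspected` flag, and the `run_actions` function are all hypothetical names invented for illustration, and Google's actual implementation may work quite differently.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Hypothetical list of action kinds treated as sensitive (an assumption,
# not taken from Google's documentation).
SENSITIVE_KINDS = {"purchase", "delete_file", "send_email"}


@dataclass
class Action:
    kind: str                          # e.g. "click", "type", "purchase"
    target: str                        # human-readable description of the UI target
    injection_suspected: bool = False  # stand-in for a detector's verdict


def run_actions(actions: Iterable[Action],
                confirm: Callable[[Action], bool]) -> List[str]:
    """Apply both safeguards to a stream of proposed actions; return a log."""
    log: List[str] = []
    for action in actions:
        # Safeguard 2: stop the whole task if an injection is suspected.
        if action.injection_suspected:
            log.append(f"stopped: possible prompt injection at {action.target}")
            break
        # Safeguard 1: sensitive actions need explicit user confirmation.
        if action.kind in SENSITIVE_KINDS and not confirm(action):
            log.append(f"skipped: {action.kind} on {action.target} (not confirmed)")
            continue
        log.append(f"executed: {action.kind} on {action.target}")
    return log


if __name__ == "__main__":
    proposed = [
        Action("click", "Login button"),
        Action("purchase", "Checkout"),
        Action("type", "Search box", injection_suspected=True),
        Action("click", "Next"),
    ]
    # A confirm callback that always declines, standing in for a user prompt.
    for line in run_actions(proposed, confirm=lambda a: False):
        print(line)
```

In a real deployment the confirmation callback would prompt a person, and the loop would run inside a sandbox with restricted access, in line with the recommendations above.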