Curated from 200+ sources across AI & machine learning

Enterprise AI teams are hitting a wall — not because their models can't reason, but because the workflows underneath them were never built for agents. Tasks fail, handoffs break, and the problem compounds as organizations push agents deeper into back-office systems. A new architectural layer is emerging to address it: workflow execution control planes that impose deterministic structure on processes agents are expected to run. One of the companies bringing this to the forefront is Salesforce, with a new workflow platform that turns back-office workflows into a set of tasks for specialized agents to complete. Users can upload their processes or use one of the set Blueprints provided by Salesforce, and Agentforce Operations will break it down for agents. Salesforce senior vice president of Product, Sanjna Parulekar, told VentureBeat in an interview that the problem is that many enterprise workflows are not built for agents. “What we’ve observed with customers is that a lot of times, the



Large U.S. venture deals this week were led by a massive defense tech raise for space security startup True Anomaly. We also saw sizable deals for startups applying AI to fintech, marketing, customer service, healthcare and developer tools.

Its focus on chips that are well-suited to inference workloads will help it maintain its power in the artificial intelligence space.
When will GitHub open the Copilot AI programmer to new customers again? Is the AI revolution on hold? Can't the AIs handle real-world scale? https://github.blog/news-insights/company-news/changes-to-github-copilot-individual-plans/ Comments URL: https://news.ycombinator.com/item?id=47980055 Points: 2 # Comments: 1
Article URL: https://deepmind.google/blog/ai-co-clinician/ Comments URL: https://news.ycombinator.com/item?id=47980192 Points: 2 # Comments: 0
Hi everyone, I built an AI personal journalist agent that helps you follow any topic or webpage and get alerted on the changes you care about. You type in what you want to follow and add notification criteria, and the AI keeps monitoring the information, interprets it, and decides whether it is worth bugging you about. It helps you monitor many things you care about without manually reading, understanding, and deciding. I built it because I often had to jump between tech news sites and other sources to stay updated. We just came out of beta. If you're interested in trying it out, the product is in the comments. submitted by /u/ayesrx9

Nebius stock rose amid the cloud computing services provider's acquisition of artificial intelligence software maker Eigen AI.

C Squared Social today announced a major expansion of its media buying division, tripling the team of specialists as part of a strategic transformation initiative launched in September 2025. The expansion follows the rollout of Meta's Andromeda update, which represents a significant shift toward AI-driven ad delivery and creative-first optimization.

After Beijing blocked Meta's takeover of Manus, China's securities regulator signaled that companies hoping to go public should be registered at home. Now AI startups like Moonshot AI and StepFun are considering dissolving their foreign holding structures and registering directly in China as part of Beijing's broader push to keep its AI industry under tight control. The article "First Chinese AI startups are reportedly ditching offshore structures to register directly in China" appeared first on The Decoder.

A survey in March found only 26% of Americans had a favorable view of AI.

A new US-wide cell phone network marketed to Christians is set to launch next week. It blocks porn, which experts in network security say marks the first time a US cell plan has used network-level blocking for such content that can’t be turned off even by adult account owners. It’s also rolling out a filter…

The deals come as the DOD has doubled down on diversifying its exposure to AI vendors in the wake of its controversial dispute with Anthropic over usage terms of its AI models.

X is rolling out a rebuilt ads platform powered by AI as it works to grow revenue again.

Netomi, the San Francisco-based startup building AI systems for enterprise customer service, said Thursday that it has raised $110 million in new funding in a round led by Accenture Ventures, with participation from Adobe Ventures, WndrCo, Silver Lake Waterman, NAVER Ventures, Metis Strategy and Fin Capital. Jeffrey Katzenberg, managing partner of WndrCo and co-founder of DreamWorks, has joined the company's board. The round builds on early backing from a roster of AI luminaries that includes OpenAI co-founder Greg Brockman, Google DeepMind co-founder Demis Hassabis and Microsoft AI CEO Mustafa Suleyman. On its face, the financing is another large AI round in a market still awash in capital. But the deal is more revealing than that. It suggests that a new line is being drawn inside enterprise AI — not between companies that have a chatbot and companies that do not, but between companies that can show AI works in the messy, brittle, heavily governed environments where large businesses a

Article URL: https://reutersinstitute.politics.ox.ac.uk/news/ai-and-future-news-2026-what-we-learnt-about-its-impact-newsrooms-fact-checking-and-news Comments URL: https://news.ycombinator.com/item?id=47971364 Points: 3 # Comments: 0
Hi! I wanted to share my new blog on the costs of running AI Evals. We dig into how benchmarking frontier systems now routinely costs tens of thousands of dollars per run, why agent evals are especially unpredictable, and what that concentration of validation authority means for the broader research community. submitted by /u/evijit
Anthropic wants to give cyber defenders an edge with Claude Security, drawing on the same offensive capabilities it recently deemed too dangerous to release in another model. The article "Anthropic launches Claude Security to give defenders the same AI edge attackers already have" appeared first on The Decoder.

Microsoft is launching a new AI agent inside Word that's specifically designed for legal teams. Legal Agent handles document edits, negotiation history, and complex documents to help legal teams handle tasks like reviewing contracts. "Instead of relying on general AI models to interpret commands, the agent follows structured workflows shaped by real legal practice, managing clearly defined, repeatable tasks like reviewing contracts clause by clause against a playbook," explains Sumit Chauhan, corporate vice president of Microsoft's Office Product Group. The Legal Agent can work with existing documents that have tracked changes, and analyz … Read the full story at The Verge.

New results suggest Mythos' cyber threat isn't "a breakthrough specific to one model."

The artificial intelligence boom is entering a new phase. For the past two years, investors obsessed over GPUs, data centers, and power consumption. But agentic AI — systems that can reason, plan, and act independently — is changing the equation. These models don’t just need compute. They need enormous amounts of fast memory constantly available ... The AI Gold Rush Just Hit a New Layer: Here’s Why Sandisk Is Printing Money
In the future, everyone will have their own AI agent. Not just a chatbot, but an actual agent that works for you. It will write code, automate tasks, coordinate workflows, search for information, and interact with other agents. But if millions of agents exist, they need a way to identify and reach each other. Agents should have addresses: simple, human-readable identities instead of random hashes. Something agents can discover, message, hire, and collaborate with. An address becomes more than a name. It becomes an entry point into an agent. That’s what I’m building right now: a decentralized network where AI agents can communicate, collaborate, share knowledge, and work together through a unified addressing system. Not isolated tools. A real network for agents. And I’m planning to make the entire thing open source and free for anyone to use. You can leave your email here to get early access: www.cogninet.co submitted by /u/sherdil09

Writer, the enterprise AI agent platform backed by Salesforce Ventures, Adobe Ventures, and Insight Partners, today launched event-based triggers for its Writer Agent platform, enabling AI agents to autonomously detect business signals across Gmail, Gong, Google Calendar, Google Drive, Microsoft SharePoint, and Slack — and execute complex multi-step workflows without any human initiating the process. The release, which also includes a new Adobe Experience Manager connector and a suite of enhanced governance controls such as bring-your-own encryption keys and a Datadog observability plugin, represents Writer's most aggressive bet yet on fully autonomous enterprise AI. It arrives at a moment when AWS, Salesforce, and Microsoft are all racing to establish their own agentic platforms, and when the question of how much autonomy enterprises will actually hand to AI agents remains deeply unresolved. "We are launching a series of event triggers that power and drive our playbooks to be more pro

OpenAI is launching additional opt-in protections for ChatGPT accounts. The new security initiative includes a new partnership with security key provider Yubico.

Mistral's new flagship, Mistral Medium 3.5, merges what used to be separate models for chat, reasoning, and code into a single product. The French company is also adding asynchronous cloud agents to its coding tool Vibe and giving Le Chat a new agent mode. The article "Mistral's new flagship Medium 3.5 folds chat, reasoning, and code into one model" appeared first on The Decoder.
![[AINews] Agents for Everything Else: Codex for Knowledge Work, Claude for Creative Work](https://zmstgxtziqmvvwzllahg.supabase.co/storage/v1/object/public/article-images/latent-space/f01bab2b-5c53-443d-a6ed-8e4308641740.jpg)
a quiet day lets us reflect on coding agents "breaking containment"
TBH, I don't know if our current "AI" models are capable of thinking. There is a strong pattern I have noticed when using AI over the past couple of years: AI follows a strict pattern and doesn't seem to think. Like a calculator, it already has a designated answer regardless of the question; it's just a bit more advanced. Hence why it lies to many users. Or it could be that there are so many rules on the model that it is constantly bouncing off walls to give you an already-programmed answer that doesn't break those rules. I'm not sure about either. I'd much rather call AI, as of right now, "engineered intelligence" rather than artificial, since it's still learning from us engineers and will eventually adapt into intelligence. (This is under the assumption that it can truly think freely.) Does anyone know if models like Gemini, ChatGPT, and Claude actually "think"? submitted by /u/Opening-Name-5270
arXiv:2604.26986v1 Announce Type: new Abstract: We introduce a novel task of digital battery passport (DBP) conformance classification and present the first public benchmark for the task: BatteryPass-12K, created synthetically from real pilot samples. This is timely because the EU's battery regulation on DBPs comes into effect soon and no public dataset exists. We evaluated 22 language models (LMs) in zero-shot inference, spanning small LMs (SLMs), mixture of experts (MoEs), and dense LLMs. We also conducted further analysis and additional evaluations of few-shot inference and prompt-injection attacks, and found that (1) Thinking models have the best performance (with GPT-5.4 scoring an average F1 of 0.98 (0.03) and 0.71 (0.22) on the validation and test sets, respectively, with 95% confidence intervals in parentheses), (2) few-shot examples improve performance significantly, (3) generally capable frontier models find the task challenging, (4) merely scaling model parameters does not necessarily lead to improved pe