AIToday

🔥 Updated in real-time

Today's Top AI News

Curated from 200+ sources across AI & machine learning

Taiwan launches national robotics center with $629 million startup funding plan
TOP STORY · Robotics

Taiwan has launched a new national robotics center alongside a $629 million funding initiative aimed at accelerating the creation of domestic robotics companies, as the island seeks to strengthen its position in the global automation race. According to a report by Cryptopolitan, Taiwan’s president Lai Ching-te formally inaugurated the National Center for AI Robotics (NCAIR), […]

Robotics & Automation News·1h ago
How old records and new AI technology brought growth back to Ancestry
#2 · General AI

Semafor Tech·4h ago
Claude now works across all three major Office apps
#3 · Models & Gen AI

THE DECODER·4h ago
🧠

Models & Gen AI

Agentic coding at enterprise scale demands spec-driven development

Presented by AWS

Autonomous agents are compressing software delivery timelines from weeks to days. The enterprises that scale agents safely will be the ones that build using spec-driven development.

There’s a moment in every technology shift where the early adopters stop being outliers and start being the baseline. We’re at that moment in software development, and most teams don’t realize it yet. A year ago, vibe coding went viral. Non-developers and junior developers discovered they could build beyond their abilities with AI. It lowered the floor. It made prototyping much quicker, but it also introduced a surplus of slop. What the industry then needed was something that raised the ceiling — something that improved code quality and worked the way the most expert developers work. Spec-driven development did that. It laid the foundation for trustworthy autonomous coding agents.

Specs are the trust model for autonomous development

Most discussions of AI-generated code focus on whether AI

Models & Gen AI
VentureBeat AI
AI Agents Are Coming for Your Dating Life

The developers of Pixel Societies are using AI agents to simulate social interactions. It's an attempt to optimize the process of choosing new colleagues, friends, and even romantic partners.

Models & Gen AI
WIRED AI
Your developers are already running AI locally: Why on-device inference is the CISO’s new blind spot

For the last 18 months, the CISO playbook for generative AI has been relatively simple: Control the browser. Security teams tightened cloud access security broker (CASB) policies, blocked or monitored traffic to well-known AI endpoints, and routed usage through sanctioned gateways. The operating model was clear: If sensitive data leaves the network for an external API call, we can observe it, log it, and stop it. But that model is starting to break. A quiet hardware shift is pushing large language model (LLM) usage off the network and onto the endpoint. Call it Shadow AI 2.0, or the “bring your own model” (BYOM) era: Employees running capable models locally on laptops, offline, with no API calls and no obvious network signature. The governance conversation is still framed as “data exfiltration to the cloud,” but the more immediate enterprise risk is increasingly “unvetted inference inside the device.” When inference happens locally, traditional data loss prevention (DLP) doesn’t see th

Models & Gen AI
VentureBeat AI
Blazing hot IPOs, an AI agent craze, and a new word for ‘token’: Here’s what’s happening in the world of Chinese AI

China hopes to build a “token economy,” backed by open-source models and real-world AI applications—even as U.S. export controls still hold things back.

Models & Gen AI
Fortune AI
State of AI: April 2026 newsletter

US Government blacklists Anthropic as Iran bombs AWS data centers. Plus: $19B revenue in weeks, industrial-scale distillation wars, and an mRNA dog cancer vaccine designed by ChatGPT.

Models & Gen AI
Air Street Press (State of AI)
V-CAGE: Vision-Closed-Loop Agentic Generation Engine for Robotic Manipulation

arXiv:2604.09036v1 Announce Type: new Abstract: Scaling Vision-Language-Action (VLA) models requires massive datasets that are both semantically coherent and physically feasible. However, existing scene generation methods often lack context-awareness, making it difficult to synthesize high-fidelity environments embedded with rich semantic information, frequently resulting in unreachable target positions that cause tasks to fail prematurely. We present V-CAGE (Vision-Closed-loop Agentic Generation Engine), an agentic framework for autonomous robotic data synthesis. Unlike traditional scripted pipelines, V-CAGE operates as an embodied agentic system, leveraging foundation models to bridge high-level semantic reasoning with low-level physical interaction. Specifically, we introduce Inpainting-Guided Scene Construction to systematically arrange context-aware layouts, ensuring that the generated scenes are both semantically structured and kinematically reachable. To ensure trajectory corre

Models & Gen AI
arXiv cs.RO (Robotics)
AssemLM: Spatial Reasoning Multimodal Large Language Models for Robotic Assembly

arXiv:2604.08983v1 Announce Type: new Abstract: Spatial reasoning is a fundamental capability for embodied intelligence, especially for fine-grained manipulation tasks such as robotic assembly. While recent vision-language models (VLMs) exhibit preliminary spatial awareness, they largely rely on coarse 2D perception and lack the ability to perform accurate reasoning over 3D geometry, which is crucial for precise assembly operations. To address this limitation, we propose AssemLM, a spatial multimodal large language model tailored for robotic assembly. AssemLM integrates assembly manuals, point clouds, and textual instructions to reason about and predict task-critical 6D assembly poses, enabling explicit geometric understanding throughout the assembly process. To effectively bridge raw 3D perception and high-level reasoning, we adopt a specialized point cloud encoder to capture fine-grained geometric and rotational features, which are then integrated into the multimodal language model

Models & Gen AI
arXiv cs.RO (Robotics)
Plasticity-Enhanced Multi-Agent Mixture of Experts for Dynamic Objective Adaptation in UAVs-Assisted Emergency Communication Networks

arXiv:2604.09028v1 Announce Type: new Abstract: Unmanned aerial vehicles serving as aerial base stations can rapidly restore connectivity after disasters, yet abrupt changes in user mobility and traffic demands shift the quality of service trade-offs and induce strong non-stationarity. Deep reinforcement learning policies suffer from plasticity loss under such shifts, as representation collapse and neuron dormancy impair adaptation. We propose plasticity enhanced multi-agent mixture of experts (PE-MAMoE), a centralized training with decentralized execution framework built on multi-agent proximal policy optimization. PE-MAMoE equips each UAV with a sparsely gated mixture of experts actor whose router selects a single specialist per step. A non-parametric Phase Controller injects brief, expert-only stochastic perturbations after phase switches, resets the action log-standard-deviation, anneals entropy and learning rate, and schedules the router temperature, all to re-plasticize the poli

Models & Gen AI
arXiv cs.MA (Multi-Agent)
Multi-User Large Language Model Agents

arXiv:2604.08567v1 Announce Type: cross Abstract: Large language models (LLMs) and LLM-based agents are increasingly deployed as assistants in planning and decision making, yet most existing systems are implicitly optimized for a single-principal interaction paradigm, in which the model is designed to satisfy the objectives of one dominant user whose instructions are treated as the sole source of authority and utility. However, as they are integrated into team workflows and organizational tools, they are increasingly required to serve multiple users simultaneously, each with distinct roles, preferences, and authority levels, leading to multi-user, multi-principal settings with unavoidable conflicts, information asymmetry, and privacy constraints. In this work, we present the first systematic study of multi-user LLM agents. We begin by formalizing multi-user interaction with LLM agents as a multi-principal decision problem, where a single agent must account for multiple users with pote

Models & Gen AI
arXiv cs.MA (Multi-Agent)
Event-Driven Temporal Graph Networks for Asynchronous Multi-Agent Cyber Defense in NetForge_RL

arXiv:2604.09523v1 Announce Type: cross Abstract: The transition of Multi-Agent Reinforcement Learning (MARL) policies from simulated cyber wargames to operational Security Operations Centers (SOCs) is fundamentally bottlenecked by the Sim2Real gap. Legacy simulators abstract away network protocol physics, rely on synchronous ticks, and provide clean state vectors rather than authentic, noisy telemetry. To resolve these limitations, we introduce NetForge_RL: a high-fidelity cyber operations simulator that reformulates network defense as an asynchronous, continuous-time Partially Observable Semi-Markov Decision Process (POSMDP). NetForge enforces Zero-Trust Network Access (ZTNA) constraints and requires defenders to process NLP-encoded SIEM telemetry. Crucially, NetForge bridges the Sim2Real gap natively via a dual-mode engine, allowing high-throughput MARL training in a mock hypervisor and zero-shot evaluation against live exploits in a Docker hypervisor. To navigate this continuous-t

Models & Gen AI
arXiv cs.MA (Multi-Agent)
MARINER: A 3E-Driven Benchmark for Fine-Grained Perception and Complex Reasoning in Open-Water Environments

arXiv:2604.08615v1 Announce Type: new Abstract: Fine-grained visual understanding and high-level reasoning in real-world open-water environments remain under-explored due to the lack of dedicated benchmarks. We introduce MARINER, a comprehensive benchmark built under the novel Entity-Environment-Event (3E) paradigm. MARINER contains 16,629 multi-source maritime images with 63 fine-grained vessel categories, diverse adverse environments, and 5 typical dynamic maritime incidents, covering fine-grained classification, object detection, and visual question answering tasks. We conduct extensive evaluations on mainstream Multimodal Large Language Models (MLLMs) and establish baselines, revealing that even advanced models struggle with fine-grained discrimination and causal reasoning in complex marine scenes. As a dedicated maritime benchmark, MARINER fills the gap of realistic and cognitive-level evaluation for maritime multimodal understanding, and promotes future research on robust vision

Models & Gen AI
arXiv cs.CV
3D-VCD: Hallucination Mitigation in 3D-LLM Embodied Agents through Visual Contrastive Decoding

arXiv:2604.08645v1 Announce Type: new Abstract: Large multimodal models are increasingly used as the reasoning core of embodied agents operating in 3D environments, yet they remain prone to hallucinations that can produce unsafe and ungrounded decisions. Existing inference-time hallucination mitigation methods largely target 2D vision-language settings and do not transfer to embodied 3D reasoning, where failures arise from object presence, spatial layout, and geometric grounding rather than pixel-level inconsistencies. We introduce 3D-VCD, the first inference-time visual contrastive decoding framework for hallucination mitigation in 3D embodied agents. 3D-VCD constructs a distorted 3D scene graph by applying semantic and geometric perturbations to object-centric representations, such as category substitutions and coordinate or extent corruption. By contrasting predictions under the original and distorted 3D contexts, our method suppresses tokens that are insensitive to grounded scene

Models & Gen AI
arXiv cs.CV
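
The contrast step described in the 3D-VCD abstract above can be illustrated with a small sketch. It uses one common formulation of contrastive decoding, in which logits from the original context are boosted and logits from the distorted context are subtracted. The truncated abstract does not give the paper's exact scoring rule, so the function names, the weight alpha, and the toy logits below are assumptions for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def contrastive_logits(original, distorted, alpha=1.0):
    """Combine per-token logits as (1 + alpha) * original - alpha * distorted.

    ASSUMPTION: this is a generic contrastive-decoding rule, not necessarily
    the rule used by 3D-VCD. alpha is a hypothetical contrast weight.
    """
    if len(original) != len(distorted):
        raise ValueError("logit vectors must have the same length")
    return [(1.0 + alpha) * o - alpha * d for o, d in zip(original, distorted)]


# Toy vocabulary of two tokens (hypothetical values).
# Token 0 scores the same with or without a faithful scene, so it is
# insensitive to grounding. Token 1 loses support once the scene is distorted.
original = [2.0, 2.0]
distorted = [2.0, 0.0]

combined = contrastive_logits(original, distorted, alpha=1.0)
probs = softmax(combined)
```

In this toy case the two tokens tie under the original context alone, while the contrasted logits favor token 1, the one whose score depends on the undistorted scene.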
Adaptive Rigor in AI System Evaluation using Temperature-Controlled Verdict Aggregation via Generalized Power Mean

arXiv:2604.08595v1 Announce Type: new Abstract: Existing evaluation methods for LLM-based AI systems, such as LLM-as-a-Judge, verdict systems, and NLI, do not always align well with human assessment because they cannot adapt their strictness to the application domain. This paper presents Temperature-Controlled Verdict Aggregation (TCVA), a method that combines a five-level verdict-scoring system with generalized power-mean aggregation and an intuitive temperature parameter T ∈ [0.1, 1.0] to control evaluation rigor. Low temperatures yield pessimistic scores suited for safety-critical domains; high temperatures produce lenient scores appropriate for conversational AI. Experimental evaluation on three benchmark datasets with human Likert-scale annotations (SummEval and USR) shows that TCVA achieves correlation with human judgments comparable to RAGAS on faithfulness (Spearman = 0.667 vs. 0.676) while consistently outperforming DeepEval. The method requires no additional LLM calls when adj

Models & Gen AI
arXiv cs.CL
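
As a rough illustration of the aggregation idea in the TCVA abstract above, the sketch below computes a generalized power mean over per-verdict scores. The abstract does not say how the temperature T maps to the power-mean exponent, so the function temperature_to_exponent, its linear form, the exponent range [-10, 1], and the example scores are all assumptions made for illustration, not the paper's method.

```python
import math


def power_mean(scores, p):
    """Generalized power mean M_p of positive scores.

    p = 1 is the arithmetic mean, p = 0 the geometric mean (limit case),
    p = -1 the harmonic mean. Lower p pulls the result toward the minimum.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if any(s <= 0 for s in scores):
        raise ValueError("scores must be strictly positive")
    n = len(scores)
    if p == 0:
        return math.exp(sum(math.log(s) for s in scores) / n)
    return (sum(s ** p for s in scores) / n) ** (1.0 / p)


def temperature_to_exponent(t, p_min=-10.0, p_max=1.0):
    """HYPOTHETICAL linear map from T in [0.1, 1.0] to an exponent p.

    Low T gives a strongly negative p (pessimistic, close to the minimum);
    T = 1.0 gives p = 1 (plain average). The paper's mapping may differ.
    """
    if not 0.1 <= t <= 1.0:
        raise ValueError("t must lie in [0.1, 1.0]")
    frac = (t - 0.1) / 0.9
    return p_min + frac * (p_max - p_min)


def aggregate(verdict_scores, t):
    """Aggregate verdict scores at temperature t."""
    return power_mean(verdict_scores, temperature_to_exponent(t))


# Hypothetical five-level verdict scores: 0.2, 0.4, 0.6, 0.8, 1.0.
verdicts = [0.2, 1.0, 1.0]
strict = aggregate(verdicts, 0.1)   # dominated by the single weak verdict
lenient = aggregate(verdicts, 1.0)  # arithmetic mean of the verdicts
```

With these assumed settings, one weak verdict drags the strict score close to 0.2, while the lenient score is simply the average. Changing T only changes the exponent, so the verdicts themselves do not have to be recomputed.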
Structured Exploration and Exploitation of Label Functions for Automated Data Annotation

arXiv:2604.08578v1 Announce Type: new Abstract: High-quality labeled data is critical for training reliable machine learning and deep learning models, yet manual annotation remains costly and error-prone. Programmatic labeling addresses this challenge by using label functions (LFs), i.e., heuristic rules that automatically generate weak labels for training datasets. However, existing automated LF generation methods either rely on large language models (LLMs) to synthesize surface-level heuristics or employ model-based synthesis over hand-crafted primitives. These approaches often result in limited coverage and unreliable label quality. In this paper, we introduce EXPONA, an automated framework for programmatic labeling that formulates LF generation as a principled process balancing diversity and reliability. EXPONA systematically explores multi-level LFs, spanning surface, structural, and semantic perspectives. EXPONA further applies reliability-aware mechanisms to suppress noisy or r

Models & Gen AI
arXiv cs.LG

General AI

Aichi Prefecture eyes Japan's first Level 4 autonomous expressway bus service

Aichi Prefecture aims to put buses with Level 4 autonomous driving, in which a vehicle can operate without a driver under specific conditions, into practical use in fiscal 2027.

General AI
Japan Times Tech
Meta Platforms Finally Releases Muse Spark. Is the AI Model Worth the Wait?

The AI arms race among Big Tech shows no signs of slowing. Companies continue to pour tens of billions of dollars into data centers, talent, and compute power, all chasing the next leap in reasoning, multimodality, and real-world usefulness. For everyday investors watching their portfolios, the stakes feel personal: Will this massive spending deliver revenue ...

General AI
Yahoo Finance AI
Amazon AI Chips Move From AWS Engine To Potential New Profit Stream

Amazon.com (NasdaqGS:AMZN) used its latest annual shareholder letter to highlight a revenue run rate of over $20b from its in-house AI chips, including Graviton and Trainium. The company reported triple-digit year-over-year growth in this AI chip segment and is assessing whether to sell these chips directly to external customers. Amazon also outlined plans for around $200b of capital expenditures in 2026, supported by large customer commitments in AWS. For investors, this puts Amazon's AI chip...

General AI
Yahoo Finance AI
Building the first AI Red Team OS – mythosai.cloud – early access open

Article URL: https://mythosai.cloud/
Comments URL: https://news.ycombinator.com/item?id=47740401
Points: 1
# Comments: 0

General AI
Hacker News
Sam Altman responds to ‘incendiary’ New Yorker article after attack on his home

The OpenAI CEO's new blog post responds to both an apparent attack on his home and an in-depth New Yorker profile raising questions about his trustworthiness.

General AI
TechCrunch AI
TSMC likely to book fourth straight quarter of record profit on insatiable AI demand

TSMC, the world's largest manufacturer of advanced artificial intelligence chips, will likely notch up a fourth consecutive quarter of record earnings with a 50% surge in net profit for January-March thanks to booming demand for AI infrastructure. Analysts say that demand for Taiwan Semiconductor Manufacturing Co's 3-nanometre technology to produce AI chips and its advanced packaging technology continues to outstrip the firm's current production capacity. Its market capitalisation, at around $1.6 trillion, is now nearly double that of South Korean rival Samsung Electronics.

General AI
Yahoo Finance AI
HTNav: A Hybrid Navigation Framework with Tiered Structure for Urban Aerial Vision-and-Language Navigation

arXiv:2604.08883v1 Announce Type: new Abstract: Inspired by the general Vision-and-Language Navigation (VLN) task, aerial VLN has attracted widespread attention, owing to its significant practical value in applications such as logistics delivery and urban inspection. However, existing methods face several challenges in complex urban environments, including insufficient generalization to unseen scenes, suboptimal performance in long-range path planning, and inadequate understanding of spatial continuity. To address these challenges, we propose HTNav, a new collaborative navigation framework that integrates Imitation Learning (IL) and Reinforcement Learning (RL) within a hybrid IL-RL framework. This framework adopts a staged training mechanism to ensure the stability of the basic navigation strategy while enhancing its environmental exploration capability. By integrating a tiered decision-making mechanism, it achieves collaborative interaction between macro-level path planning and fine-

General AI
arXiv cs.RO (Robotics)
A Semi-Automated Framework for 3D Reconstruction of Medieval Manuscript Miniatures

arXiv:2604.08610v1 Announce Type: new Abstract: This paper presents a semi-automated framework for transforming two-dimensional miniatures from medieval manuscripts into three-dimensional digital models suitable for extended reality (XR), tactile 3D printing, and web-based visualization. We evaluate seven image-to-3D methods (TripoSR, SF3D, SPAR3D, TRELLIS, Wonder3D, SAM 3D, Hi3DGen) on 69 manuscript figures from two collections using rendering-based metrics (Silhouette IoU, LPIPS, CLIP Score) and volumetric measures (Depth Range Ratio, watertight percentage), revealing a trade-off between volumetric expansion and geometric fidelity. Hi3DGen balances topological quality with rich surface detail through its normal bridging approach, making it a good starting point for expert refinement. Our pipeline combines SAM segmentation, Hi3DGen mesh generation, expert refinement in ZBrush, and AI-assisted texturing. Two case studies on Gothic illuminations from the Decretum Gratiani (Vatican Libr

General AI
arXiv cs.CV
AI Driven Soccer Analysis Using Computer Vision

arXiv:2604.08722v1 Announce Type: new Abstract: Sport analysis is crucial for team performance since it provides actionable data that can inform coaching decisions, improve player performance, and enhance team strategies. To analyze more complex features from game footage, a computer vision model can be used to identify and track key entities from the field. We propose the use of an object detection and tracking system to predict player positioning throughout the game. To translate this to positioning in relation to the field dimensions, we use a point prediction model to identify key points on the field and combine these with known field dimensions to extract actual distances. For the player-identification model, object detection models like YOLO and Faster R-CNN are evaluated on the accuracy of our custom video footage using multiple different evaluation metrics. The goal is to identify the best model for object identification to obtain the most accurate results when paired with SAM

General AI
arXiv cs.CV
Apollo Expands AI Chip And Aviation Bets While Shares Trade At Discount

Apollo Global Management (NYSE:APO) has taken part in a major funding round for SiFive, a RISC-V chip designer working with Nvidia on AI data center solutions. The firm is also involved in the completed $7.4b acquisition of Air Lease, now operating as Sumisho Air Lease. These moves expand Apollo's reach into both AI chip technology and aviation leasing, adding new angles to its alternative asset focus. At a share price of $104.28, Apollo Global Management gives investors exposure to a...

General AI
Yahoo Finance AI
Trump officials may be encouraging banks to test Anthropic’s Mythos model

The report is particularly surprising since the Department of Defense recently declared Anthropic a supply-chain risk.

General AI
TechCrunch AI
🤖

Robotics

Online Intention Prediction via Control-Informed Learning

arXiv:2604.09303v1 Announce Type: new Abstract: This paper presents an online intention prediction framework for estimating the goal state of autonomous systems in real time, even when intention is time-varying, and system dynamics or objectives include unknown parameters. The problem is formulated as an inverse optimal control / inverse reinforcement learning task, with the intention treated as a parameter in the objective. A shifting horizon strategy discounts outdated information, while online control-informed learning enables efficient gradient computation and online parameter updates. Simulations under varying noise levels and hardware experiments on a quadrotor drone demonstrate that the proposed approach achieves accurate, adaptive intention prediction in complex environments.

Robotics
arXiv cs.RO (Robotics)
🔬

Research

Musculoskeletal Motion Imitation for Learning Personalized Exoskeleton Control Policy in Impaired Gait

arXiv:2604.09431v1 Announce Type: new Abstract: Designing generalizable control policies for lower-limb exoskeletons remains fundamentally constrained by exhaustive data collection or iterative optimization procedures, which limit accessibility to clinical populations. To address this challenge, we introduce a device-agnostic framework that combines physiologically plausible musculoskeletal simulation with reinforcement learning to enable scalable personalized exoskeleton assistance for both able-bodied and clinical populations. Our control policies not only generate physiologically plausible locomotion dynamics but also capture clinically observed compensatory strategies under targeted muscular deficits, providing a unified computational model of both healthy and pathological gait. Without task-specific tuning, the resulting exoskeleton control policies produce assistive torque profiles at the hip and ankle that align with state-of-the-art profiles validated in human experiments, whi

Research
arXiv cs.RO (Robotics)