AIToday

🔥 Updated in real-time

Today's Top AI News

Curated from 200+ sources across AI & machine learning

How old records and new AI technology brought growth back to Ancestry
TOP STORY · General AI

CEO Howard Hochhauser says the site had stopped listening to its customers.

Semafor Tech · 55m ago
Claude now works across all three major Office apps
#2 · Models & Gen AI

THE DECODER · 55m ago
Aichi Prefecture eyes Japan's first Level 4 autonomous expressway bus service
#3 · General AI

Japan Times Tech · 6h ago
🧠 Models & Gen AI

Your developers are already running AI locally: Why on-device inference is the CISO’s new blind spot

For the last 18 months, the CISO playbook for generative AI has been relatively simple: Control the browser. Security teams tightened cloud access security broker (CASB) policies, blocked or monitored traffic to well-known AI endpoints, and routed usage through sanctioned gateways. The operating model was clear: If sensitive data leaves the network for an external API call, we can observe it, log it, and stop it. But that model is starting to break. A quiet hardware shift is pushing large language model (LLM) usage off the network and onto the endpoint. Call it Shadow AI 2.0, or the “bring your own model” (BYOM) era: Employees running capable models locally on laptops, offline, with no API calls and no obvious network signature. The governance conversation is still framed as “data exfiltration to the cloud,” but the more immediate enterprise risk is increasingly “unvetted inference inside the device.” When inference happens locally, traditional data loss prevention (DLP) doesn’t see th

Models & Gen AI
VentureBeat AI
Blazing hot IPOs, an AI agent craze, and a new word for ‘token’: Here’s what’s happening in the world of Chinese AI

China hopes to build a “token economy,” backed by open-source models and real-world AI applications—even as U.S. export controls still hold things back.

Models & Gen AI
Fortune AI
State of AI: April 2026 newsletter

US Government blacklists Anthropic as Iran bombs AWS data centers. Plus: $19B revenue in weeks, industrial-scale distillation wars, and an mRNA dog cancer vaccine designed by ChatGPT.

Models & Gen AI
Air Street Press (State of AI)
LRTS – Regression testing for LLM prompts (open source, local-first)

Article URL: https://github.com/rufus-SD/lrts
Comments URL: https://news.ycombinator.com/item?id=47739332
Points: 1
# Comments: 0

Models & Gen AI
Hacker News
OpenAI employee tries to explain usage limits of the new ChatGPT Pro plans

OpenAI recently added a $100 plan to its lineup, but confusing labels on the pricing page left users guessing about actual usage limits. An OpenAI employee tried to clear things up.

Models & Gen AI
THE DECODER
MARINER: A 3E-Driven Benchmark for Fine-Grained Perception and Complex Reasoning in Open-Water Environments

arXiv:2604.08615v1 Announce Type: new Abstract: Fine-grained visual understanding and high-level reasoning in real-world open-water environments remain under-explored due to the lack of dedicated benchmarks. We introduce MARINER, a comprehensive benchmark built under the novel Entity-Environment-Event (3E) paradigm. MARINER contains 16,629 multi-source maritime images with 63 fine-grained vessel categories, diverse adverse environments, and 5 typical dynamic maritime incidents, covering fine-grained classification, object detection, and visual question answering tasks. We conduct extensive evaluations on mainstream multimodal large language models (MLLMs) and establish baselines, revealing that even advanced models struggle with fine-grained discrimination and causal reasoning in complex marine scenes. As a dedicated maritime benchmark, MARINER fills the gap of realistic and cognitive-level evaluation for maritime multimodal understanding, and promotes future research on robust vision

Models & Gen AI
arXiv cs.CV
Adaptive Rigor in AI System Evaluation using Temperature-Controlled Verdict Aggregation via Generalized Power Mean

arXiv:2604.08595v1 Announce Type: new Abstract: Existing evaluation methods for LLM-based AI systems, such as LLM-as-a-Judge, verdict systems, and NLI, do not always align well with human assessment because they cannot adapt their strictness to the application domain. This paper presents Temperature-Controlled Verdict Aggregation (TCVA), a method that combines a five-level verdict-scoring system with generalized power-mean aggregation and an intuitive temperature parameter T ∈ [0.1, 1.0] to control evaluation rigor. Low temperatures yield pessimistic scores suited for safety-critical domains; high temperatures produce lenient scores appropriate for conversational AI. Experimental evaluation on three benchmark datasets with human Likert-scale annotations (SummEval and USR) shows that TCVA achieves correlation with human judgments comparable to RAGAS on faithfulness (Spearman ρ = 0.667 vs. 0.676) while consistently outperforming DeepEval. The method requires no additional LLM calls when adj

Models & Gen AI
arXiv cs.CL
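
The TCVA abstract above names generalized power-mean aggregation without giving the formula in this excerpt. As a rough illustration only, the standard power mean M_p(x) = (mean(x_i^p))^(1/p) can be sketched as below. The function name `power_mean` and the example scores are hypothetical, and the paper's mapping from temperature T to the exponent p is not stated here, so the exponent is passed in directly.

```python
import math


def power_mean(scores, p):
    """Generalized power mean of strictly positive scores.

    Lower exponents pull the result toward the smallest score (stricter),
    higher exponents pull it toward the largest (more lenient).
    Hypothetical helper: not taken from the TCVA paper.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if any(s <= 0 for s in scores):
        raise ValueError("scores must be strictly positive")
    if p == 0:
        # The limit p -> 0 is the geometric mean.
        return math.exp(sum(math.log(s) for s in scores) / len(scores))
    return (sum(s ** p for s in scores) / len(scores)) ** (1.0 / p)


# Example verdict scores (made up for illustration).
verdicts = [0.5, 1.0]
strict = power_mean(verdicts, -1)   # harmonic mean
neutral = power_mean(verdicts, 0)   # geometric mean
lenient = power_mean(verdicts, 1)   # arithmetic mean
```

For a fixed set of scores the power mean is non-decreasing in p, which is what would let a single knob trade strictness for leniency without re-scoring the individual verdicts.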
At the HumanX conference, everyone was talking about Claude

Anthropic was the star of the show at San Francisco's AI-centric conference.

Models & Gen AI
TechCrunch AI
From LLMs to hallucinations, here’s a simple guide to common AI terms

The rise of AI has brought an avalanche of new terms and slang. Here is a glossary with definitions of some of the most important words and phrases you might encounter.

Models & Gen AI
TechCrunch AI

Structured Exploration and Exploitation of Label Functions for Automated Data Annotation

arXiv:2604.08578v1 Announce Type: new Abstract: High-quality labeled data is critical for training reliable machine learning and deep learning models, yet manual annotation remains costly and error-prone. Programmatic labeling addresses this challenge by using label functions (LFs), i.e., heuristic rules that automatically generate weak labels for training datasets. However, existing automated LF generation methods either rely on large language models (LLMs) to synthesize surface-level heuristics or employ model-based synthesis over hand-crafted primitives. These approaches often result in limited coverage and unreliable label quality. In this paper, we introduce EXPONA, an automated framework for programmatic labeling that formulates LF generation as a principled process balancing diversity and reliability. EXPONA systematically explores multi-level LFs, spanning surface, structural, and semantic perspectives. EXPONA further applies reliability-aware mechanisms to suppress noisy or r

Models & Gen AI
arXiv cs.LG

PilotBench: A Benchmark for General Aviation Agents with Safety Constraints

arXiv:2604.08987v1 Announce Type: new Abstract: As Large Language Models (LLMs) advance toward embodied AI agents operating in physical environments, a fundamental question emerges: can models trained on text corpora reliably reason about complex physics while adhering to safety constraints? We address this through PilotBench, a benchmark evaluating LLMs on safety-critical flight trajectory and attitude prediction. Built from 708 real-world general aviation trajectories spanning nine operationally distinct flight phases with synchronized 34-channel telemetry, PilotBench systematically probes the intersection of semantic understanding and physics-governed prediction through comparative analysis of LLMs and traditional forecasters. We introduce Pilot-Score, a composite metric balancing 60% regression accuracy with 40% instruction adherence and safety compliance. Comparative evaluation across 41 models uncovers a Precision-Controllability Dichotomy: traditional forecasters achieve superi

Models & Gen AI
arXiv cs.AI
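
The PilotBench abstract above describes Pilot-Score as balancing 60% regression accuracy with 40% instruction adherence and safety compliance, but the excerpt does not give the exact formula. A minimal sketch, assuming both components are already normalized to [0, 1] and combined linearly (both assumptions, as is the function name `pilot_score`):

```python
def pilot_score(regression_accuracy, adherence_and_safety):
    """Weighted composite with the 60/40 split quoted in the abstract.

    Assumes both inputs lie in [0, 1]; the linear form is an
    illustrative guess, not the paper's published definition.
    """
    for value in (regression_accuracy, adherence_and_safety):
        if not 0.0 <= value <= 1.0:
            raise ValueError("components must be in [0, 1]")
    return 0.6 * regression_accuracy + 0.4 * adherence_and_safety


# A forecaster with perfect regression but poor instruction adherence
# versus a model that is moderate on both (made-up numbers).
precise_but_rigid = pilot_score(1.0, 0.2)
balanced = pilot_score(0.7, 0.7)
```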

Advantage-Guided Diffusion for Model-Based Reinforcement Learning

arXiv:2604.09035v1 Announce Type: new Abstract: Model-based reinforcement learning (MBRL) with autoregressive world models suffers from compounding errors, whereas diffusion world models mitigate this by generating trajectory segments jointly. However, existing diffusion guides are either policy-only, discarding value information, or reward-based, which becomes myopic when the diffusion horizon is short. We introduce Advantage-Guided Diffusion for MBRL (AGD-MBRL), which steers the reverse diffusion process using the agent's advantage estimates so that sampling concentrates on trajectories expected to yield higher long-term return beyond the generated window. We develop two guides: (i) Sigmoid Advantage Guidance (SAG) and (ii) Exponential Advantage Guidance (EAG). We prove that a diffusion model guided through SAG or EAG allows us to perform reweighted sampling of trajectories with weights increasing in state-action advantage, implying policy improvement under standard assumptions. Addi

Models & Gen AI
arXiv cs.AI
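
The AGD-MBRL abstract above speaks of reweighted sampling with weights that increase in the advantage. The sketch below shows only that generic idea, using a sigmoid and an exponential weighting over a handful of candidate trajectories. It is not the paper's SAG or EAG guidance, whose exact forms are not in this excerpt; the function names, the `temperature` parameter, and the sample advantages are all hypothetical.

```python
import math


def _sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def _normalize(raw):
    total = sum(raw)
    return [w / total for w in raw]


def sigmoid_weights(advantages, temperature=1.0):
    """Bounded weights: each trajectory's raw weight saturates at 1."""
    if not advantages or temperature <= 0:
        raise ValueError("need advantages and a positive temperature")
    return _normalize([_sigmoid(a / temperature) for a in advantages])


def exponential_weights(advantages, temperature=1.0):
    """Softmax-style weights: sharper preference for high advantage."""
    if not advantages or temperature <= 0:
        raise ValueError("need advantages and a positive temperature")
    top = max(advantages)  # subtract the max to avoid overflow
    return _normalize([math.exp((a - top) / temperature) for a in advantages])


advantages = [-1.0, 0.0, 2.0]  # made-up advantage estimates
sig_w = sigmoid_weights(advantages)
exp_w = exponential_weights(advantages)
```

Both schemes rank trajectories the same way; the exponential form concentrates more mass on the highest-advantage trajectory, while the sigmoid form stays closer to uniform.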
The AI code wars are heating up

This is The Stepback, a weekly newsletter breaking down one essential story from the tech world. For more on the AI coding and vibe-coding booms, follow David Pierce.

How it started

Writing code was a killer app for AI even before anyone was really talking about AI. In the spring of 2021, 18 months before the world knew the word "ChatGPT," Microsoft debuted the very first product of a partnership with a nonprofit called OpenAI: a tool called GitHub Copilot that watched developers as they wrote code and tried to autocomplete snippets and lines for them …

Models & Gen AI
The Verge AI
Hart Research March 8, 2024 opinion poll for NBC News: people hate AI

Article URL: https://web.archive.org/web/20260310175721if_/https://s3.documentcloud.org/documents/27777984/nbc-news-march-2026-poll-03-08-2024-release-final.pdf?t=1772898915520
Comments URL: https://news.ycombinator.com/item?id=47731392
Points: 3
# Comments: 1

Models & Gen AI
Hacker News

General AI

Meta Platforms Finally Releases Muse Spark. Is the AI Model Worth the Wait?

The AI arms race among Big Tech shows no signs of slowing. Companies continue to pour tens of billions of dollars into data centers, talent, and compute power, all chasing the next leap in reasoning, multimodality, and real-world usefulness. For everyday investors watching their portfolios, the stakes feel personal: Will this massive spending deliver revenue ...

General AI
Yahoo Finance AI
Amazon AI Chips Move From AWS Engine To Potential New Profit Stream

Amazon.com (NasdaqGS:AMZN) used its latest annual shareholder letter to highlight a revenue run rate of over $20b from its in-house AI chips, including Graviton and Trainium. The company reported triple-digit year-over-year growth in this AI chip segment and is assessing whether to sell these chips directly to external customers. Amazon also outlined plans for around $200b of capital expenditures in 2026, supported by large customer commitments in AWS. For investors, this puts Amazon's AI chip...

General AI
Yahoo Finance AI
Building the first AI Red Team OS – mythosai.cloud – early access open

Article URL: https://mythosai.cloud/
Comments URL: https://news.ycombinator.com/item?id=47740401
Points: 1
# Comments: 0

General AI
Hacker News
Sam Altman responds to ‘incendiary’ New Yorker article after attack on his home

The OpenAI CEO's new blog post responds to both an apparent attack on his home and an in-depth New Yorker profile raising questions about his trustworthiness.

General AI
TechCrunch AI
A Semi-Automated Framework for 3D Reconstruction of Medieval Manuscript Miniatures

arXiv:2604.08610v1 Announce Type: new Abstract: This paper presents a semi-automated framework for transforming two-dimensional miniatures from medieval manuscripts into three-dimensional digital models suitable for extended reality (XR), tactile 3D printing, and web-based visualization. We evaluate seven image-to-3D methods (TripoSR, SF3D, SPAR3D, TRELLIS, Wonder3D, SAM 3D, Hi3DGen) on 69 manuscript figures from two collections using rendering-based metrics (Silhouette IoU, LPIPS, CLIP Score) and volumetric measures (Depth Range Ratio, watertight percentage), revealing a trade-off between volumetric expansion and geometric fidelity. Hi3DGen balances topological quality with rich surface detail through its normal bridging approach, making it a good starting point for expert refinement. Our pipeline combines SAM segmentation, Hi3DGen mesh generation, expert refinement in ZBrush, and AI-assisted texturing. Two case studies on Gothic illuminations from the Decretum Gratiani (Vatican Libr

General AI
arXiv cs.CV
The Only Artificial Intelligence (AI) Stock in the "Magnificent Seven" That's Worth Buying After the Correction

One is the clear AI leader.

General AI
Yahoo Finance AI
Can AI replace a priest? Japan’s temples and shrines are testing the limits

Japan's temples experiment with artificial intelligence as questions of faith, presence and care grow more urgent.

General AI
Japan Times Tech
Eggs, rooms, puzzles, and talking about AI

I live with five friends in a big house, and two things I’ve done in it on this particular Sunday are hide 156 easter eggs all around, and reach a tentative joint decision on the allocation of four of its rooms. These tasks are delightful to me for a reason they have in common, and from which I hope to gesture at extremely far reaching conclusions. Easter eggs A room usually seems like a simple thing to me—a big box, with some smaller mostly boxish objects and holes in it. Each of those things also usually seems simple: a cupboard is a box-shaped hole, with a movable thin-box-shaped front, which has hinges (the most complicated part, but in this picture their only qualities are letting flat surfaces rotate around fixed edges). Sometimes a cupboard has shelves, which are like planes breaking up the space. In this picture, hiding easter eggs well is hard! Like, I could put one in the cupboard? On the top shelf? Or the bottom shelf! They’ll never find it there! These are not good hiding p

General AI
LessWrong AI
Apollo Expands AI Chip And Aviation Bets While Shares Trade At Discount

Apollo Global Management (NYSE:APO) has taken part in a major funding round for SiFive, a RISC-V chip designer working with Nvidia on AI data center solutions. The firm is also involved in the completed $7.4b acquisition of Air Lease, now operating as Sumisho Air Lease. These moves expand Apollo's reach into both AI chip technology and aviation leasing, adding new angles to its alternative asset focus. At a share price of $104.28, Apollo Global Management gives investors exposure to a...

General AI
Yahoo Finance AI
Trump officials may be encouraging banks to test Anthropic’s Mythos model

The report is particularly surprising since the Department of Defense recently declared Anthropic a supply-chain risk.

General AI
TechCrunch AI
Five signs data drift is already undermining your security models

Data drift happens when the statistical properties of a machine learning (ML) model's input data change over time, eventually rendering its predictions less accurate. Cybersecurity professionals who rely on ML for tasks like malware detection and network threat analysis find that undetected data drift can create vulnerabilities. A model trained on old attack patterns may fail to see today's sophisticated threats. Recognizing the early signs of data drift is the first step in maintaining reliable and efficient security systems.

Why data drift compromises security models

ML models are trained on a snapshot of historical data. When live data no longer resembles this snapshot, the model's performance dwindles, creating a critical cybersecurity risk. A threat detection model may generate more false negatives by missing real breaches or create more false positives, leading to alert fatigue for security teams. Adversaries actively exploit this weakness. In 2024, attackers used echo-spoofing t

General AI
VentureBeat AI
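
The data drift summary above describes live inputs ceasing to resemble the training snapshot. One common way to quantify that for a single numeric feature is the two-sample Kolmogorov-Smirnov statistic, the largest gap between the two empirical distribution functions. The article excerpt does not name a specific test, so this choice, the function name `ks_statistic`, and the sample values are all assumptions made for illustration.

```python
from bisect import bisect_right


def ks_statistic(reference, live):
    """Largest absolute gap between the two empirical CDFs (0.0 to 1.0).

    0.0 means the samples have identical empirical distributions;
    1.0 means they do not overlap at all.
    """
    ref = sorted(reference)
    cur = sorted(live)
    if not ref or not cur:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # The gap between two step functions can only change at a data point.
    for x in ref + cur:
        f_ref = bisect_right(ref, x) / len(ref)
        f_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(f_ref - f_cur))
    return gap


# Made-up feature values: training snapshot versus recent traffic.
training_sizes = [1, 2, 3]
recent_sizes = [10, 11, 12]
drift = ks_statistic(training_sizes, recent_sizes)
```

In practice the statistic would be compared against a threshold or a p-value chosen for the feature in question; that threshold is a tuning decision and is not specified here.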
SoftBank, others set up new firm to develop high-performance AI

Engineers from SoftBank and Tokyo-based AI developer Preferred Networks Inc. are expected to participate in the development.

General AI
Japan Times Tech
Meta transfers top engineers into new AI tooling team

Article URL: https://www.reuters.com/technology/meta-transfers-top-engineers-into-new-ai-tooling-team-2026-04-09/
Comments URL: https://news.ycombinator.com/item?id=47731801
Points: 4
# Comments: 1

General AI
Hacker News