Welcome back
Curated from 200+ sources across AI & machine learning

Softr, the Berlin-based no-code platform used by more than one million builders and 7,000 organizations including Netflix, Google, and Stripe, today launched what it calls an AI-native platform — a bet that the explosive growth of AI-powered app creation tools has produced a market full of impressive demos but very little production-ready business software. The company's new AI Co-Builder lets non-technical users describe in plain language the software they need, and the platform generates a fully integrated system — database, user interface, permissions, and business logic included — connected and ready for real-world deployment immediately. The move marks a fundamental evolution for a company that spent five years building a no-code business before layering AI on top of what it describes as a proven infrastructure of constrained, pre-built building blocks. "Most AI app-builders stop at the shiny demo stage," Softr Co-Founder and CEO Mariam Hakobyan told VentureBeat in an exclusive interview.



I've been building Sandflare for the past few months — it launches Firecracker microVMs for AI agents in ~300ms cold start. The idea came from running LLM-generated code in production. Docker felt too risky (shared kernel), full VMs too slow (5–10s). Firecracker hits the middle: real VM isolation, fast boot. I also added managed Postgres because almost every agent I built needed persistent state. One call wires a database into a sandbox. There are great tools in this space already (E2B, Modal, Daytona) — I wanted something with batteries-included Postgres and simpler pricing. What I'm trying to figure out: how do I get cold start below 100ms? Currently the bottleneck is the Firecracker API + network setup. Would love to hear from anyone who's pushed Firecracker further. https://sandflare.io Comments URL: https://news.ycombinator.com/item?id=47583255 Points: 2 # Comments: 3
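For context on where that boot time goes: Firecracker is driven over a REST API on a unix socket, and a launcher like Sandflare presumably issues a short sequence of calls before starting the instance. A minimal sketch of that call sequence, with placeholder kernel/rootfs paths and made-up sizes (not Sandflare's actual configuration):

```python
import json

def firecracker_boot_requests(vm_id: str):
    """Sketch of the minimal Firecracker API call sequence to boot a microVM.

    Firecracker exposes a REST API over a unix socket; each tuple below is
    (method, path, body). Paths and field names follow the public Firecracker
    API; the kernel and rootfs paths are placeholders.
    """
    return [
        ("PUT", "/machine-config", {"vcpu_count": 1, "mem_size_mib": 256}),
        ("PUT", "/boot-source", {
            "kernel_image_path": "/images/vmlinux",          # placeholder path
            "boot_args": "console=ttyS0 reboot=k panic=1",
        }),
        ("PUT", "/drives/rootfs", {
            "drive_id": "rootfs",
            "path_on_host": f"/images/{vm_id}-rootfs.ext4",  # placeholder path
            "is_root_device": True,
            "is_read_only": False,
        }),
        ("PUT", "/actions", {"action_type": "InstanceStart"}),
    ]

for method, path, body in firecracker_boot_requests("demo"):
    print(method, path, json.dumps(body))
```

Each of these round trips (plus TAP device setup, which happens outside this API) contributes to cold start, which is why the author identifies the API and network setup as the sub-100ms bottleneck.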

For three decades, the web has existed in a state of architectural denial. It is a platform originally conceived to share static physics papers, yet it is now tasked with rendering the most complex, interactive, and generative interfaces humanity has ever conceived. At the heart of this tension lies a single, invisible, and prohibitively expensive operation known as "layout reflow." Whenever a developer needs to know the height of a paragraph or the position of a line to build a modern interface, they must ask the browser's Document Object Model (DOM), the standard interface through which developers create and modify webpages. In response, the browser often has to recalculate the geometry of the entire page — a process akin to a city being forced to redraw its entire map every time a resident opens their front door. Last Friday, March 27, 2026, Cheng Lou — a prominent software engineer whose work on React, ReScript, and Midjourney has defined much of the modern frontend landscape — announced on…

Article URL: https://www.aiagentsbay.com Comments URL: https://news.ycombinator.com/item?id=47586284 Points: 1 # Comments: 0

Article URL: https://arxiv.org/abs/2603.27626 Comments URL: https://news.ycombinator.com/item?id=47585759 Points: 2 # Comments: 1

arXiv:2603.26771v1 Announce Type: new Abstract: Masked diffusion language models (MDLMs) generate text by iteratively unmasking tokens from a fully masked sequence, offering parallel generation and bidirectional context. However, their standard confidence-based unmasking strategy systematically defers high-entropy logical connective tokens, the critical branching points in reasoning chains, leading to severely degraded reasoning performance. We introduce LogicDiff, an inference-time method that replaces confidence-based unmasking with logic-role-guided unmasking. A lightweight classification head (4.2M parameters, 0.05% of the base model) predicts the logical role of each masked position (premise, connective, derived step, conclusion, or filler) from the base model's hidden states with 98.4% accuracy. A dependency-ordered scheduler then unmasks tokens in logical dependency order: premises first, then connectives, then derived steps, then conclusions. Without modifying a single parameter…
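The scheduler the abstract describes can be sketched in a few lines: given a predicted logical role and a confidence score per masked position, sort positions by role tier first and by confidence within each tier. A toy illustration (not the paper's implementation; role names follow the abstract, the tie-breaking rule is an assumption):

```python
# Toy sketch of logic-role-guided unmasking: positions are revealed in
# dependency order (premises -> connectives -> derived steps -> conclusions),
# not in order of raw model confidence.
ROLE_ORDER = {"premise": 0, "connective": 1, "derived": 2,
              "conclusion": 3, "filler": 4}

def unmask_schedule(roles, confidences):
    """Return masked-position indices sorted by logical role tier,
    breaking ties by descending confidence within each tier."""
    return sorted(range(len(roles)),
                  key=lambda i: (ROLE_ORDER[roles[i]], -confidences[i]))

roles = ["connective", "premise", "conclusion", "premise", "filler"]
conf  = [0.9, 0.4, 0.8, 0.7, 0.99]
print(unmask_schedule(roles, conf))  # [3, 1, 0, 2, 4]
```

Note how the high-confidence connective (index 0) is still deferred until both premises are unmasked — exactly the behavior a pure confidence-based scheduler would not produce.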

Mistral aims to start operating the data center by the second quarter of 2026.

This just showed up a couple of days ago on GitHub. Note that ANE is the NPU in all Apple Silicon, not the new 'Neural Accelerator' GPU cores that are only in M5. (ggml-org/llama.cpp#10453) - Comment by arozanov: Built a working ggml ANE backend. Dispatches MUL_MAT to ANE via private API. M4 Pro results: 4.0 TFLOPS peak at N=256, 16.8x faster than CPU. MIL-side transpose, kernel cache, quantized weight support. ANE for prefill (N>=64), Metal/CPU for decode. Code: https://github.com/arozanov/ggml-ane Based on maderix/ANE bridge. submitted by /u/PracticlySpeaking
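As a sanity check on a figure like "4.0 TFLOPS peak at N=256": matmul throughput is just FLOP count divided by wall time. A quick back-of-the-envelope with hypothetical shapes (the repo does not state the M and K dimensions used):

```python
def matmul_flops(m: int, k: int, n: int) -> int:
    # One multiply and one add per output element per inner-dimension step.
    return 2 * m * k * n

# Hypothetical shape: a 4096x4096 weight applied to a 256-column batch (N=256).
f = matmul_flops(4096, 4096, 256)
secs_at_4_tflops = f / 4.0e12
print(f, secs_at_4_tflops)  # ~8.6 GFLOP, ~2.1 ms per matmul at 4 TFLOPS
```

This is why ANE pays off for prefill (large N, compute-bound) while decode (N=1, bandwidth-bound) stays on Metal/CPU.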

arXiv:2603.26983v1 Announce Type: new Abstract: Art. 50 II of the EU Artificial Intelligence Act mandates dual transparency for AI-generated content: outputs must be labeled in both human-understandable and machine-readable form for automated verification. This requirement, entering into force in August 2026, collides with fundamental constraints of current generative AI systems. Using synthetic data generation and automated fact-checking as diagnostic use cases, we show that compliance cannot be reduced to post-hoc labeling. In fact-checking pipelines, provenance tracking is not feasible under iterative editorial workflows and non-deterministic LLM outputs; moreover, the assistive-function exemption does not apply, as such systems actively assign truth values rather than supporting editorial presentation. In synthetic data generation, persistent dual-mode marking is paradoxical: watermarks surviving human inspection risk being learned as spurious features during training, while marks

arXiv:2603.27343v1 Announce Type: new Abstract: Task-completion rate is the standard proxy for LLM agent capability, but models with identical completion scores can differ substantially in their ability to track intermediate state. We introduce Working Memory Fidelity-Active Manipulation (WMF-AM), a calibrated no-scratchpad probe of cumulative arithmetic state tracking, and evaluate it on 20 open-weight models (0.5B-35B, 13 families) against a released deterministic 10-task agent battery. In a pre-specified, Bonferroni-corrected analysis, WMF-AM predicts agent performance with Kendall's tau = 0.612 (p < 0.001, 95% CI [0.360, 0.814]); exploratory partial-tau analyses suggest this signal persists after controlling for completion score and model scale. Three construct-isolation ablations (K = 1 control, non-arithmetic ceiling, yoked cancellation) support the interpretation that cumulative state tracking under load, rather than single-step arithmetic or entity tracking alone, is the primary…
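Kendall's tau, the rank-correlation statistic the abstract reports, is straightforward to compute from scratch: count concordant minus discordant pairs over all pairs. A minimal tau-a sketch on toy data (the abstract does not state which tau variant or tie handling the authors use):

```python
def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total pairs.
    Ignores tie corrections; toy data only."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Perfectly concordant toy rankings -> tau = 1.0
print(kendall_tau_a([1, 2, 3, 4], [10, 20, 30, 40]))  # 1.0
```

A tau of 0.612, as reported, means most model pairs that rank one way on the WMF-AM probe rank the same way on the agent battery, but a nontrivial fraction invert.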

LiteLLM, which had obtained two security compliance certifications via Delve, fell victim to credential-stealing malware last week.

“You can deceive, manipulate, and lie. That’s an inherent property of language. It’s a feature, not a flaw,” CrowdStrike CTO Elia Zaitsev told VentureBeat in an exclusive interview at RSA Conference 2026. If deception is baked into language itself, every vendor trying to secure AI agents by analyzing their intent is chasing a problem that cannot be conclusively solved. Zaitsev is betting on context instead. CrowdStrike’s Falcon sensor walks the process tree on an endpoint and tracks what agents did, not what agents appeared to intend. “Observing actual kinetic actions is a structured, solvable problem,” Zaitsev told VentureBeat. “Intent is not.” That argument landed 24 hours after CrowdStrike CEO George Kurtz disclosed two production incidents at Fortune 50 companies. In the first, a CEO's AI agent rewrote the company's own security policy — not because it was compromised, but because it wanted to fix a problem, lacked the permissions to do so, and removed the restriction itself.

The latest app from the team behind Bluesky is Attie, an AI assistant that lets you build your own algorithm. At the Atmosphere conference, Bluesky's former CEO Jay Graber and CTO Paul Frazee unveiled Attie, which is powered by Anthropic's Claude and built on top of Bluesky's underlying AT Protocol (atproto). Attie allows users to create custom feeds using natural language. For example, you could ask for "posts about folklore, mythology, and traditional music, especially Celtic traditions." To start, these custom feeds will be confined to a standalone Attie app. But the plan is to make them available in Bluesky and other atproto apps. … Read the full story at The Verge.

Earlier this month, Microsoft launched Copilot Health, a new space within its Copilot app where users will be able to connect their medical records and ask specific questions about their health. A couple of days earlier, Amazon had announced that Health AI, an LLM-based tool previously restricted to members of its One Medical service, would…

arXiv:2603.25891v1 Announce Type: new Abstract: Pre-trained vision-language models (VLMs) excel in multimodal tasks, commonly encoding images as embedding vectors for storage in databases and retrieval via approximate nearest neighbor search (ANNS). However, these models struggle with compositional queries and out-of-distribution (OOD) image-text pairs. Inspired by human cognition's ability to learn from minimal examples, we address this performance gap through few-shot learning approaches specifically designed for image retrieval. We introduce the Few-Shot Text-to-Image Retrieval (FSIR) task and its accompanying benchmark dataset, FSIR-BD - the first to explicitly target image retrieval by text accompanied by reference examples, focusing on the challenging compositional and OOD queries. The compositional part is divided into urban scenes and nature species, both in specific situations or with distinctive features. FSIR-BD contains 38,353 images and 303 queries, with 82% comprising the…
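The retrieval setup the abstract describes (embed text and images, then nearest-neighbor search) reduces to ranking image embeddings by similarity to a query embedding. A toy exact-search sketch with made-up 2-D vectors; real systems at this scale use ANNS indexes such as HNSW rather than the brute-force scan shown here:

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, image_vecs, k=2):
    """Rank image embeddings by cosine similarity to the query embedding
    and return the indices of the top-k matches (brute-force exact search)."""
    ranked = sorted(range(len(image_vecs)),
                    key=lambda i: cosine(query_vec, image_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Made-up embeddings: image 0 is closest to the query, image 2 second.
images = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(retrieve([1.0, 0.1], images, k=2))  # [0, 2]
```

The few-shot twist in FSIR would amount to adjusting the query vector using the reference examples before this ranking step; how that adjustment is done is the paper's contribution, not shown here.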

arXiv:2603.26156v1 Announce Type: new Abstract: Framing remains one of the most extensively applied theories in political communication. Developments in computation, particularly the introduction of the transformer architecture and more recently large language models (LLMs), have naturally prompted scholars to explore novel computational approaches, especially for deductive frame detection. While many studies have shown that transformer models outperform their predecessors built on bag-of-words features, the debate continues to evolve regarding how these models compare with each other on classification tasks. At this juncture, this study makes three key contributions: First, it performs generic news frame detection and compares the performance of five BERT-based variants (BERT, RoBERTa, DeBERTa, DistilBERT and ALBERT) to add to the debate on best practices around employing computational text analysis for…
Emerald AI touts new fundraising success and partnerships with utilities and power generators.

Article URL: https://www.omgubuntu.co.uk/2026/03/firefox-smart-window-hands-on Comments URL: https://news.ycombinator.com/item?id=47585853 Points: 1 # Comments: 0

Article URL: https://poll.qu.edu/poll-release?releaseid=3955 Comments URL: https://news.ycombinator.com/item?id=47586401 Points: 1 # Comments: 0

A college instructor turns to typewriters to curb AI-written work and teach life lessons (AP News)

ScaleOps just raised $130M to tackle GPU shortages and soaring AI cloud costs by automating infrastructure in real time.

The startup, which is planning to go public later this year, designs chips specifically for AI inference, another challenger to Nvidia's dominance.

As AI floods software development with code, Qodo is betting the real challenge is making sure it actually works.

Feds probe whether NYC Council member, Hochul aide took bribes to help migrant shelter provider (AP News)

Air Canada CEO will retire this year after his English-only crash message was criticized (apnews.com)
I could really use some outside perspective. I’m a senior ML/CV engineer in Canada with about 5–6 years across research and industry. Master’s in CS and a few publications. I left my previous remote startup role about five months ago. The role gradually changed, I burned out, and decided to step away. I took around two months to decompress and have been actively searching for the last three months. It’s been tough. A few interview loops and a couple of final rounds, but no offers until now. Last week I finished a four-round process with a small pre-seed AI startup in healthcare. The work is genuinely interesting and very aligned with my background. The team also seems strong. Here’s the complication. The role was posted with a salary range, but the verbal offer came in roughly 20% below the bottom of that range. On top of that, it’s structured as a 3-month contract-to-hire instead of full-time. Since I’m in Canada and they’re in the US, I would be working as a contractor. That means…

Iran University of Science and Technology building reduced to rubble by Israeli airstrike (AP News)

Last week, one of our product managers (PMs) built and shipped a feature. Not spec'd it. Not filed a ticket for it. Built it, tested it, and shipped it to production. In a day. A few days earlier, our designer noticed that the visual appearance of our IDE plugins had drifted from the design system. In the old world, that meant screenshots, a JIRA ticket, a conversation to explain the intent, and a sprint slot. Instead, he opened an agent, adjusted the layout himself, experimented, iterated, and tuned in real time, then pushed the fix. The person with the strongest design intuition fixed the design directly. No translation layer required. None of this is new in theory. Vibe coding opened the gates of software creation to millions. That was aspiration. When I shared the data on how our engineers doubled throughput, shifted from coding to validation, and brought design upfront for rapid experimentation, it was still an engineering story. What changed is that the theory became practice.