CEM888.AI says its agent Vetta holds the highest published scores on memory-focused AI benchmarks, with 99.90% on AR Retrieval and 77.2% on BEAM Memory, using a local-first architecture that keeps data on-premises.

Hacker News · 14h ago · 2 min read

3 Key Points

1. What happened: CEM888.AI announced that its agent Vetta holds the highest published scores on MemoryAgentBench (ICLR 2026), a peer-reviewed benchmark for AI agent memory. Vetta scored 99.90% on AR Retrieval, against a next-best published score of 71.8% from GPT-4.1-mini, and 77.2% on BEAM Memory, against 64.1% from Hindsight (official). Both benchmarks use "honest retrieval": the agent retrieves from its own knowledge base, without answer keys or pre-computed embeddings.
2. Why it matters: Enterprise customers increasingly need AI systems that keep sensitive data on-premises rather than sending it to external cloud providers. CEM888.AI's benchmark results suggest that its local-first architecture (which the company describes as 100% local-first model runtimes) may offer both strong performance and data sovereignty, a meaningful combination for regulated industries and organizations with strict data residency requirements.
3. What to watch: The company says its caching layers are designed to cut compute costs by up to 90% while maintaining sub-millisecond response times. It also offers zero-touch deployment for macOS and Linux environments. The verification data (2,000 AR Q&A pairs and 200 BEAM Q&A pairs) is published in the benchmarks/ directory.
