mnemo – local-first AI memory layer for LLMs with persistent knowledge graph in SQLite and Rust

Hacker News · Jun 3, 2026

3 Key Points

mnemo is a sidecar service that uses an LLM to extract named entities and relationships from conversations, stores them as a persistent knowledge graph in SQLite, and automatically injects relevant context back into future prompts in under 50 ms. It works with Ollama, OpenAI, Anthropic, or any OpenAI-compatible API, and ships as a single static binary with zero cloud dependency.
The system runs a 6-stage retrieval pipeline: full-text chunk search → entity name search → graph expansion (breadth-first search over the knowledge graph) → relation filter → score and rank → assemble context. It is available as a Docker + Ollama setup (free), a static binary, a CLI tool, or a Python SDK.
Reported performance on an Apple M2 in a debug build: entity insert at ~0.12 ms (~8,300 ops/s), full-text chunk search at ~0.28 ms (~3,500 ops/s), and the full retrieval pipeline at ~4.2 ms (~238 ops/s). The release build is said to be 3–5× faster. The project includes 122 Rust tests, 21 Python tests, and 12 benchmarks.
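
To make the six retrieval stages concrete, here is a minimal sketch in Python using only the standard-library sqlite3 module. It is not mnemo's code: the table layout, the function names build_demo_db and retrieve, the max_hops and allowed_kinds parameters, and the 1/hop scoring rule are all assumptions made for illustration, and a plain LIKE match stands in for real full-text search.

```python
import sqlite3
from collections import deque


def build_demo_db():
    """Create a tiny in-memory graph. The schema is hypothetical, not mnemo's."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE chunks(id INTEGER PRIMARY KEY, text TEXT);
        CREATE TABLE entities(id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE relations(src INTEGER, kind TEXT, dst INTEGER);
    """)
    db.executemany("INSERT INTO chunks(text) VALUES (?)",
                   [("Alice said she works at Acme.",),
                    ("Acme is based in Berlin.",)])
    db.executemany("INSERT INTO entities(id, name) VALUES (?, ?)",
                   [(1, "Alice"), (2, "Acme"), (3, "Berlin"), (4, "Bob")])
    db.executemany("INSERT INTO relations VALUES (?, ?, ?)",
                   [(1, "works_at", 2), (2, "based_in", 3), (1, "dislikes", 4)])
    return db


def retrieve(db, query, max_hops=2, allowed_kinds=("works_at", "based_in")):
    terms = [t.strip("?.,!").lower() for t in query.split()]
    terms = [t for t in terms if t]
    if not terms:
        return ""

    # Stage 1: chunk search (LIKE is a stand-in for full-text search).
    chunks = []
    for term in terms:
        rows = db.execute(
            "SELECT text FROM chunks WHERE lower(text) LIKE ? ORDER BY id",
            (f"%{term}%",))
        for (text,) in rows:
            if text not in chunks:
                chunks.append(text)

    # Stage 2: entity name search; matching entities seed the graph walk.
    marks = ",".join("?" * len(terms))
    seeds = [row[0] for row in db.execute(
        f"SELECT id FROM entities WHERE lower(name) IN ({marks}) ORDER BY id",
        terms)]

    # Stage 3: graph expansion by breadth-first search over outgoing edges.
    seen = {seed: 0 for seed in seeds}
    queue = deque((seed, 0) for seed in seeds)
    edges = []  # (src, kind, dst, hop at which the edge was reached)
    while queue:
        node, hop = queue.popleft()
        if hop == max_hops:
            continue
        for kind, dst in db.execute(
                "SELECT kind, dst FROM relations WHERE src = ? ORDER BY rowid",
                (node,)).fetchall():
            edges.append((node, kind, dst, hop + 1))
            if dst not in seen:
                seen[dst] = hop + 1
                queue.append((dst, hop + 1))

    # Stage 4: relation filter keeps only the allowed relation kinds.
    edges = [e for e in edges if e[1] in allowed_kinds]

    # Stage 5: score and rank; closer facts score higher (assumed 1/hop rule).
    ranked = sorted(edges, key=lambda e: 1.0 / e[3], reverse=True)

    # Stage 6: assemble context from matched chunks and ranked graph facts.
    names = dict(db.execute("SELECT id, name FROM entities"))
    facts = [f"{names[s]} {kind} {names[d]}" for s, kind, d, _ in ranked]
    return "\n".join(chunks + facts)


if __name__ == "__main__":
    print(retrieve(build_demo_db(), "Where does Alice work?"))
```

With the demo data, the query "Where does Alice work?" matches one chunk, seeds the walk at Alice, reaches Acme at hop 1 and Berlin at hop 2, and drops the "dislikes" edge to Bob in the relation filter. A real implementation would likely score on more than hop distance.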
