Summaries like this, in your inbox every morning.
Sign up free →ext-infer is a PHP 8.3+ extension written in Rust that loads a GGUF model and runs LLM inference inside the PHP process via llama.cpp, enabling semantic search, RAG pipelines, and worker inference without calling Python or remote APIs.
The extension provides a fluent Prompt builder, a Response that separates reasoning from answer, and an Embedding class that handles normalization and cosine similarity—designed to feel native to PHP like the intl or pdo extensions.
In-process inference reduces latency (bounded only by decode time versus milliseconds or tens of milliseconds for subprocess or HTTP calls) and eliminates the need for a separate Python sidecar, daemon, or inference server to manage alongside PHP-FPM.
No comments yet. Be the first to share your thoughts!
Log in to join the discussion





Get curated AI news from 200+ sources delivered daily to your inbox. Free to use.
Get Started FreeFree · takes 30 seconds · unsubscribe anytime
5 minutes a day. The AI essentials.
200+ sources · Email / LINE / Slack