AIToday

ext-infer: PHP 8.3+ extension for native LLM inference and embeddings via llama.cpp

Hacker News · 23h ago · 1 min read


3 Key Points

  1. ext-infer is a PHP 8.3+ extension written in Rust that loads a GGUF model and runs LLM inference inside the PHP process via llama.cpp. This enables semantic search, RAG pipelines, and worker inference without calling Python or remote APIs.

  2. The extension provides a fluent Prompt builder, a Response that separates reasoning from the answer, and an Embedding class that handles normalization and cosine similarity. It is designed to feel native to PHP, like the intl or pdo extensions.

  3. In-process inference removes the overhead of a subprocess or HTTP call, which adds milliseconds to tens of milliseconds per request, so latency is bounded only by decode time. It also eliminates the need for a separate Python sidecar, daemon, or inference server to manage alongside PHP-FPM.
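
The normalization and cosine similarity mentioned in point 2 are the core of embedding-based semantic search, and a minimal sketch may help show what such a class computes. The example below is written in Python for brevity and is not ext-infer's API: the function names (normalize, cosine_similarity, rank) and the toy 3-dimensional vectors are assumptions made for illustration, standing in for real model embeddings.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity: the dot product of the two unit-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    a_n, b_n = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def rank(query, docs):
    """Return (doc_id, score) pairs ordered from most to least similar."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors; a real pipeline would obtain these from the model.
docs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.6, 0.8, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
query = [2.0, 0.0, 0.0]  # same direction as doc_a, different magnitude
ranking = rank(query, docs)
```

Because both vectors are normalized first, the query's magnitude does not affect the score: only its direction does, which is why doc_a ranks first even though the query is twice as long.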
