AIToday

llmaker: open-source platform bundles LLM stack into single command

Hacker News6h ago4 min read
llmaker: open-source platform bundles LLM stack into single command

Key takeaway

llmaker is an open-source platform that provisions an entire AI application stack—models, vector databases, embeddings, caching, tracing—on private infrastructure with a single command. It eliminates the manual work of assembling containerized services and ensures data stays on-premises, avoiding per-token API costs and vendor dependence. The tool includes a built-in agent for retrieval-augmented generation tasks and operates as a unified fleet with automatic service discovery.

Summaries like this, in your inbox every morning.

Sign up free →

3 Key Points

  • What happened

    llmaker, an open-source tool, lets developers run a complete AI application stack—language models, vector databases, embeddings, caching, and observability—on their own infrastructure from a single CLI command, with no third-party API keys or data leaving the machine.

  • Why it matters

    Companies handling proprietary documents or customer data can now avoid per-token costs and vendor lock-in by self-hosting. The tool removes the operational overhead of manually assembling and networking separate containers, which organizations currently handle through Docker Compose files and custom integration code.

  • What to watch

    llmaker includes a built-in retrieval-and-agent layer for RAG (retrieval-augmented generation) chatbots and recommendation engines, with integrated observability via Langfuse and Prometheus metrics. Installation requires Docker; prebuilt binaries are available for Linux and macOS.

FAQ

What infrastructure does llmaker run on?
llmaker runs on Docker and your own infrastructure. It provisions containers on a private network where services discover each other by name, and by default all containers bind to 127.0.0.1 so nothing leaves your hardware.
What models and services does llmaker support?
The platform ships with a curated catalog including vector databases (Qdrant, Chroma, pgvector, Weaviate), Redis, embeddings, Open WebUI, n8n, Flowise, Whisper, and Langfuse. Models expose an OpenAI-compatible API.
How do you get started?
Installation requires Docker. You can then run a single command like `llmaker stack up assistant` to scaffold and provision a complete retrieval-augmented generation stack, or `llmaker up --model llama3:8b` to run a single model with an OpenAI-compatible endpoint.

Discussion

No comments yet. Be the first to share your thoughts!

Log in to join the discussion

Related Articles

Stay ahead with AI news

Get curated AI news from 200+ sources delivered daily to your inbox. Free to use.

Get Started Free

Free · takes 30 seconds · unsubscribe anytime

1 minute a day. The AI essentials.

200+ sources · Email / LINE / Slack

Get it free →