Whissle Gateway lets businesses run voice AI (speech recognition, speaker identification, and sales coaching analysis) entirely on their own servers in a 500 MB Docker container.

Hacker News · 5d ago · 2 min read

3 Key Points

1. What happened: Whissle released a containerized voice AI system that performs automatic speech recognition (ASR), text-to-speech (TTS), speaker diarization, and AI-powered call analysis locally, with no cloud dependency. The system downloads ~2 GB of models on first run and caches them afterward. It ships in multiple language variants, including English, Hindi-English, and Mandarin, and covers 23 languages in total.
2. Why it matters: Businesses handling sensitive customer calls (sales, debt collection, interviews) can analyze conversations for coaching and compliance without sending audio to third-party APIs. The system extracts per-segment metadata (emotion, behavior, speaker age/gender, intent) in a single forward pass and can summarize transcripts using Claude or Gemini with custom prompts, letting organizations keep call data on premises.
3. What to watch: The system runs on CPU (a laptop or MacBook handles 1–3 concurrent calls) and on GPU (an A100 40GB handles 50–80 concurrent calls), with variants optimized for speed (en-lite, ~500 MB) or quality (en-full, ~2 GB). It is available now as a Docker image (whissleasr/whissle-gateway:latest) with local authentication and REST/WebSocket APIs.
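
The concurrency figures above lend themselves to a rough sizing estimate. The sketch below is illustrative only: it assumes capacity scales linearly with the number of hosts, which the source does not claim, and the names CAPACITY and hosts_needed are hypothetical helpers, not part of Whissle Gateway.

```python
import math
from typing import Tuple

# Concurrent-call ranges quoted in the summary (low, high).
# These are reported figures, not measured guarantees.
CAPACITY = {
    "cpu": (1, 3),           # laptop / MacBook
    "a100_40gb": (50, 80),   # single A100 40GB GPU
}


def hosts_needed(peak_calls: int, hardware: str) -> Tuple[int, int]:
    """Return (best_case, worst_case) host counts for a peak concurrent load.

    Assumption: throughput scales linearly with host count.
    """
    if peak_calls < 0:
        raise ValueError("peak_calls must be non-negative")
    low, high = CAPACITY[hardware]
    best = math.ceil(peak_calls / high)   # every host reaches the high end
    worst = math.ceil(peak_calls / low)   # every host stays at the low end
    return best, worst


if __name__ == "__main__":
    for hw in CAPACITY:
        best, worst = hosts_needed(200, hw)
        print(f"{hw}: between {best} and {worst} hosts for 200 concurrent calls")
```

Under these assumptions, a peak of 200 concurrent calls would need 3 to 4 A100 hosts, or 67 to 200 CPU-only machines, which suggests why the GPU option matters for anything beyond a small team.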
