AI Tool Radar
OSI-open
Open voice and text-to-speech

Parlor

fikrikarim

On-device, real-time voice and vision AI - powered by Gemma and Kokoro, no cloud.

1.8k stars (as of 2026-06-14)

What is Parlor?

Parlor is a local assistant combining a multimodal Gemma model with Kokoro TTS for real-time voice-and-camera conversations with no cloud dependency. It runs on Apple Silicon (MLX) or Linux GPU, uses Silero VAD for hands-free use, supports barge-in, and streams TTS at the sentence level.
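
To illustrate what sentence-level TTS streaming means in practice, here is a minimal Python sketch. It is not taken from Parlor's code; the function name stream_sentences and the punctuation-based splitting rule are assumptions made for illustration. The idea is that text from the language model is buffered only until a sentence boundary appears, so speech synthesis can start on the first sentence while later ones are still being generated.

```python
import re
from typing import Iterable, Iterator

# Assumed boundary rule (hypothetical): ., ! or ? followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def stream_sentences(tokens: Iterable[str]) -> Iterator[str]:
    """Yield each complete sentence as soon as it appears in a token stream."""
    buffer = ""
    for token in tokens:
        buffer += token
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        # The last part may still be growing, so keep it buffered.
        buffer = parts[-1]
    # Flush whatever remains once the stream ends.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    llm_tokens = ["Hello", " there", ".", " How", " are", " you", "?", " Fine"]
    for sentence in stream_sentences(llm_tokens):
        # In a real pipeline each sentence would be handed to the TTS engine here.
        print(sentence)
```

A real assistant would pass each yielded sentence to the TTS engine and stop the loop when the user interrupts, which is roughly what barge-in requires. Abbreviations and decimals would need a more careful boundary rule than this one.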

Pros & Cons

Pros

  • Truly on-device - voice, vision and LLM all local, strong privacy story
  • Barge-in and sentence-level streaming give a natural conversational feel
  • Apache-2.0 throughout, actively maintained

Cons

  • English-only and Apple Silicon / Linux GPU only - no Windows or CPU path
  • Thin layer over Gemma + Kokoro - voice quality bound by Kokoro
  • Alpha-stage solo project with no versioned releases

License

Apache-2.0 (OSI-open)

When it is interesting

You want a privacy-first, fully local voice assistant with camera awareness and zero API keys, especially on Apple Silicon.

When it is too early

You need multilingual support, a stable SDK, or production reliability.

Commercial alternative & related

This repo was featured in the 2026-07 edition of the Open-Source AI Radar.