MOSS-TTS
OpenMOSS
Open speech and sound generation family - nano to 8B, 31 languages, real-time streaming.
What is MOSS-TTS?
MOSS-TTS is a family of five open models from OpenMOSS/MOSI.AI: an 8B-parameter flagship with zero-shot voice cloning, a multi-speaker dialogue model, a voice-design-from-text model, a low-latency real-time model, and a sound-effect model. A ~100M-parameter nano variant targets CPU-only deployment. Code and weights are released under Apache-2.0.
Pros & Cons
Pros
- Covers TTS, dialogue, voice design, real-time synthesis and sound effects in one Apache-2.0 repo
- The nano model (~100M parameters) is claimed to generate in real time on 4 CPU cores, which would make it practical for edge use
- 31-language support with active development
Cons
- Flagship 8B model has heavy infrastructure requirements
- Quality and latency figures are self-reported
- Chinese-lab origin may raise supply-chain scrutiny in regulated contexts
License
Apache-2.0 (OSI-approved)
When it is interesting
You want an Apache-licensed, self-hostable voice toolkit spanning TTS, dialogue, voice design and real-time synthesis, including a CPU-deployable nano model.
When it is too early
You need proven production reliability with third-party benchmark comparisons.
Commercial alternative & related
- Commercial counterpart: ElevenLabs
This repo featured in the 2026-07 edition of the Open-Source AI Radar.
voicebox
jamiepine
A free, on-device alternative to ElevenLabs for TTS, voice cloning and dictation.
VoxCPM
OpenBMB
Tokenizer-free TTS from OpenBMB covering 30 languages with voice design and real-time streaming.
Chatterbox
resemble-ai
MIT-licensed open TTS with zero-shot voice cloning - 500M params, 23+ languages.