OSI-open | Open voice and text-to-speech

MOSS-TTS

OpenMOSS

Open speech and sound generation family - nano to 8B, 31 languages, real-time streaming.

3.3k stars (as of 2026-06-14)

Overview

What is MOSS-TTS?

MOSS-TTS is a family of five open models from OpenMOSS/MOSI.AI: a flagship 8B with zero-shot cloning, a multi-speaker dialogue model, a voice-design-from-text model, a low-latency real-time model, and a sound-effect model. A ~100M nano variant targets CPU-only deployment. Code and weights are Apache-2.0.

Analysis

Pros & Cons

Pros

Covers the full voice-AI stack from sound effects to real-time agents in one Apache-2.0 repo
Nano (~100M) claims real-time generation on 4 CPU cores - accessible for edge use
31-language support with active development

Cons

Flagship 8B model has heavy infrastructure requirements
Quality and latency figures are self-reported
Chinese-lab origin may raise supply-chain scrutiny in regulated contexts

License

Apache-2.0 (OSI-open)

When it is interesting

You want an Apache-licensed, self-hostable voice toolkit spanning TTS, dialogue, voice design and real-time streaming, including a CPU-deployable nano model.

When it is too early

You need proven production reliability with third-party benchmark comparisons.

Context

Commercial alternative & related

Commercial counterpart: ElevenLabs

This repo was featured in the 2026-07 edition of the Open-Source AI Radar.

Similar repositories

voicebox

jamiepine

A free, on-device alternative to ElevenLabs for TTS, voice cloning and dictation.

OSI-open | Open voice and text-to-speech

VoxCPM

OpenBMB

Tokenizer-free TTS from OpenBMB covering 30 languages with voice design and real-time streaming.

OSI-open | Open voice and text-to-speech

Chatterbox

resemble-ai

MIT-licensed open TTS with zero-shot voice cloning - 500M params, 23+ languages.

OSI-open | Open voice and text-to-speech