AI Tool Radar
OSI-open | Open voice and text-to-speech

MOSS-TTS-Nano

OpenMOSS

A 0.1B-parameter multilingual TTS model with zero-shot voice cloning that runs in real time on a CPU, with fully open weights.

3.8k stars (as of 2026-06-26)

What is MOSS-TTS-Nano?

MOSS-TTS-Nano is a 0.1B-parameter multilingual text-to-speech model (an audio tokenizer plus a small LLM) from the OpenMOSS team (Fudan/SII). It performs zero-shot voice cloning across 20 languages, including German, with native 48 kHz output. It is built for low-latency, CPU-only real-time synthesis and ships with open weights, full inference code, an ONNX CPU build, an Android example and a browser-extension reader.
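"Real time" on a CPU is usually checked with the real-time factor (RTF): synthesis time divided by the duration of the audio produced, where a value below 1.0 means audio is generated faster than it plays back. The sketch below shows only that arithmetic at the 48 kHz output rate mentioned above. It does not use the MOSS-TTS-Nano API; `synthesize` is a hypothetical stand-in for whatever inference call you wire up, assumed here to return a sequence of mono samples.

```python
import time
from typing import Callable, Sequence

SAMPLE_RATE_HZ = 48_000  # native output rate stated in this listing


def real_time_factor(num_samples: int, synthesis_seconds: float,
                     sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Return synthesis time divided by audio duration (the RTF)."""
    if num_samples <= 0 or sample_rate_hz <= 0:
        raise ValueError("num_samples and sample_rate_hz must be positive")
    audio_seconds = num_samples / sample_rate_hz
    return synthesis_seconds / audio_seconds


def measure_rtf(synthesize: Callable[[str], Sequence[float]], text: str) -> float:
    """Time one call to a hypothetical synthesize(text) -> mono samples."""
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    return real_time_factor(len(samples), elapsed)
```

For example, 96,000 samples (2 seconds of audio at 48 kHz) produced in 1 second gives an RTF of 0.5. For a meaningful number, measure on the target CPU with representative text lengths.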

Pros & Cons

Pros

  • Tiny at 0.1B parameters and runs in real time on a CPU, with no GPU needed
  • Apache-2.0 (OSI-open) for both code and weights, so commercial use is permitted
  • 20-language coverage plus ONNX, Android and browser deployment paths and released finetuning code

Cons

  • At 0.1B parameters it trades fidelity for size; the 8B MOSS-TTS flagship is the quality tier
  • The README licence section still shows contradictory stale wording despite the Apache LICENSE file
  • Very young (released April 2026), so long-term maintenance and quality at scale are unproven

License

Apache-2.0 (OSI-open) - model license: Apache-2.0

Both the code and the model weights are Apache-2.0 (verified against the published LICENSE file and the Hugging Face card); the README's licence section still carries stale conditional wording that the Apache-2.0 LICENSE supersedes.

When it is interesting

On-device, offline, low-latency multilingual TTS and voice cloning on commodity CPUs (mobile, edge, browser).

When it is too early

If you need top-tier studio fidelity or production stability; the larger MOSS-TTS or a managed API fits better.

Commercial alternative & related

  • Commercial counterpart: ElevenLabs

This repo featured in the 2026-07 edition of the Open-Source AI Radar.