OSI-open

Open voice and text-to-speech

VieNeu-TTS

pnnbao97

Open Vietnamese (plus English) TTS with instant voice cloning, trained from scratch and runnable on CPU.

2.0k stars (as of 2026-06-26)

Overview

What is VieNeu-TTS?

A Vietnamese text-to-speech system with English code-switching and instant zero-shot voice cloning from a few seconds of reference audio. The current v3 Turbo is a 0.1B-parameter model trained from scratch on around 10,000 hours of Vietnamese-English speech. It outputs 48 kHz audio, uses the MOSS-Audio-Tokenizer-Nano codec, and ships with a Python package, CPU (ONNX/GGUF) and GPU inference paths, and Docker serving.

Analysis

Pros & Cons

Pros

Fills a genuine niche: dedicated, open Vietnamese TTS with voice cloning and code-switching, which mainstream open models cover poorly
Apache-2.0 on code and weights, commercial-safe with no inherited restrictions
A real on-device CPU path (torch-free ONNX/GGUF) plus pip and Docker tooling

Cons

The flagship v3 Turbo is 'early access', not a final release, so the strongest claims sit on a pre-stable build
Language scope is narrow (Vietnamese and English only)
Largely single-maintainer; long-term support and the from-scratch training claims rest on the author's statements

License

Apache-2.0 (OSI-open) - model license: Apache-2.0

Both the code and the model weights are Apache-2.0; v3 Turbo is trained from scratch, so there is no inherited base-model restriction.

When it is interesting

You specifically need offline, commercially licensed Vietnamese voice cloning or Vietnamese-English code-switching on CPU or a modest GPU.

When it is too early

You need a frozen, production-stable release today. In that case, use the stable v1/v2 releases rather than the v3 Turbo early access.

Context

Commercial alternative & related

Commercial counterpart: ElevenLabs

This repo featured in the 2026-07 edition of the Open-Source AI Radar.

Similar repositories

voicebox

jamiepine

29.5k stars

A free, on-device alternative to ElevenLabs for TTS, voice cloning and dictation.

OSI-open

Open voice and text-to-speech

VoxCPM

OpenBMB

26.1k stars

Tokenizer-free TTS from OpenBMB covering 30 languages with voice design and real-time streaming.

OSI-open

Open voice and text-to-speech

Chatterbox

resemble-ai

25.1k stars

MIT-licensed open TTS with zero-shot voice cloning - 500M params, 23+ languages.

OSI-open

Open voice and text-to-speech