AI Tool Radar
Open weight, with conditions | Open voice and text-to-speech

Higgs Audio

boson-ai

Text-audio foundation model - conversational TTS in 100+ languages with zero-shot cloning, 4B params.

8.2k stars (as of 2026-06-14)

What is Higgs Audio?

Higgs Audio is a text-audio foundation model family from Boson AI. v3 is a 4B-parameter conversational TTS model covering 100+ languages with zero-shot voice cloning, inline emotion/style/prosody control and an OpenAI-compatible streaming API. Self-hosting is via SGLang-Omni.
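As a rough illustration of what "OpenAI-compatible" could mean in practice, the sketch below builds a request in the shape of OpenAI's `/v1/audio/speech` endpoint and reads the response in chunks. It assumes a self-hosted server exposes that same route; the base URL, the model id `higgs-audio-v3`, the voice name `default`, and the helper names are hypothetical placeholders, not values taken from the project's documentation.

```python
import json
import urllib.request


def build_speech_request(base_url, text, model="higgs-audio-v3",
                         voice="default", api_key="EMPTY"):
    """Build a POST request in the shape of OpenAI's /v1/audio/speech API.

    The model id and voice name are hypothetical placeholders; check the
    server's own documentation for the real values.
    """
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": "wav",
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def stream_to_file(request, out_path, chunk_size=4096):
    """Send the request and write audio bytes to disk as they arrive."""
    total = 0
    with urllib.request.urlopen(request) as resp, open(out_path, "wb") as out:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            total += len(chunk)
    return total  # number of bytes written


if __name__ == "__main__":
    # Assumed local endpoint; nothing is sent until stream_to_file is called.
    req = build_speech_request("http://localhost:8000", "Hello from Higgs Audio.")
    print(req.full_url)
```

If the server does follow the OpenAI route, existing OpenAI client code should only need its base URL changed, which is the drop-in integration the listing refers to.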

Pros & Cons

Pros

  • 100+ languages with zero-shot cloning and inline prosody control in one 4B model
  • Pretrained on 10M+ hours of audio (project's own claim) - a large training corpus for an open-weight model
  • OpenAI-compatible streaming API eases drop-in integration

Cons

  • Weights are non-commercial - commercial self-hosting needs a paid agreement
  • 4B params plus SGLang-Omni adds meaningful infra overhead
  • Research-only licensing of the weights limits its appeal for production open-source projects

License

Code: Apache-2.0. Model weights: Boson Higgs Audio v3 Research and Non-Commercial License (open weight, with conditions).

Code is Apache-2.0, but the v3 model weights are under a Research and Non-Commercial License - production/revenue-generating deployments require a separate commercial agreement with Boson AI.

When it is interesting

Research or non-commercial products needing the broadest multilingual coverage and richest prosody control in open weights.

When it is too early

You need a fully open commercial self-hosting license.

Commercial alternative & related

This repo featured in the 2026-07 edition of the Open-Source AI Radar.