AI Tool Radar
OSI-open | Open voice and text-to-speech

Irodori-TTS

Aratako

Japanese flow-matching TTS with zero-shot cloning and emoji-driven style control, MIT on code and weights.

975 stars on GitHub (as of 2026-06-26)

What is Irodori-TTS?

Irodori-TTS is a Japanese flow-matching text-to-speech model (a rectified-flow diffusion transformer over continuous latents) with zero-shot voice cloning and distinctive emoji-driven style control: emoji in the input text steer delivery and non-verbal expression. A VoiceDesign variant adds caption-text conditioning for emotion and tone and can synthesise without reference audio. The project ships weights, a CLI, Gradio UIs, and training and LoRA finetuning code.
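For readers unfamiliar with rectified flow, the sampling idea can be sketched in a few lines: start from noise and integrate a velocity field from t = 0 to t = 1 with fixed Euler steps. The sketch below is not Irodori-TTS code. `toy_velocity`, `euler_sample`, and the three-element `target` are hypothetical names and values chosen for illustration, and the closed-form velocity stands in for what would in practice be a learned, text-conditioned transformer.

```python
import random


def toy_velocity(x, t, target):
    """Hypothetical stand-in for a learned velocity network.

    For a data distribution concentrated on a single point `target`,
    the exact rectified-flow velocity at (x, t) is (target - x) / (1 - t).
    """
    return [(tgt - xi) / (1.0 - t) for xi, tgt in zip(x, target)]


def euler_sample(velocity, x0, num_steps, target):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so (1 - t) is never zero
        v = velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


rng = random.Random(0)
target = [0.5, -1.0, 2.0]                       # illustrative "latent" to recover
noise = [rng.gauss(0.0, 1.0) for _ in target]   # starting point at t = 0
latent = euler_sample(toy_velocity, noise, num_steps=8, target=target)
```

Because the toy trajectory is a straight line, a handful of Euler steps already lands on the target. A real model's trajectories are only approximately straight, so the step count trades speed against quality, and each step costs a full network evaluation.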

Pros & Cons

Pros

  • Permissive MIT licence on both code and weights, among the cleanest licensing of any open TTS model
  • Novel, genuinely useful emoji-driven style and caption-based VoiceDesign control, not just plain cloning
  • Broad backend support (CUDA, ROCm, Intel XPU, CPU, Apple MPS) with full training and LoRA finetuning code

Cons

  • Japanese only; no value outside Japanese-language use cases
  • Flow-matching inference is heavier than autoregressive, CPU-first models; a GPU is the practical path
  • Quality depends on assembled components whose own licences must be checked before commercial redistribution

License

MIT (OSI-open) - model license: MIT

Both code and weights are MIT (per the v3 model cards). The cards add advisory ethical-use guidelines that are not licence restrictions. The VoiceDesign variant builds on components (an llm-jp encoder, a DACVAE codec) whose own licences should be checked before commercial redistribution.

When it is interesting

Open, MIT-licensed Japanese TTS with expressive, controllable delivery (emoji or caption style steering) and finetuning flexibility.

When it is too early

If you need non-Japanese languages or lightweight CPU-only realtime synthesis on commodity hardware.

Commercial alternative & related

  • Commercial counterpart: ElevenLabs

This repo was featured in the 2026-07 edition of the Open-Source AI Radar.