A one-stop shop to track all open-access/open-source TTS models as they come out. Feel free to open a PR for any that aren't listed here.
This is intended as a resource to raise awareness of these models and to make it easier for researchers, developers, and enthusiasts to stay informed about the latest advancements in the field.
> [!NOTE]
> This repo only tracks TTS models with open-source or open-access codebases. More motivation for everyone to open-source! 🤗
Name | GitHub | Weights | License | Fine-tune | Languages | Paper | Demo | Issues | Processor | Word pronunciation adjustment | Insta-clone | Emotional control | Prompting | Streaming support | Audio control | S2S support 🦜 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
XTTS | Repo | 🤗 Hub | CPML | Yes | Multilingual | Technical notes | 🤗 Space | |||||||||
TorToiSe TTS | Repo | 🤗 Hub | Apache 2.0 | Yes | English | Technical report | 🤗 Space | |||||||||
VITS/MMS-TTS | Repo | 🤗 Hub / MMS | Apache 2.0 | Yes | English | Paper | 🤗 Space | |||||||||
Pheme | Repo | 🤗 Hub | CC-BY | Yes | English | Paper | 🤗 Space | |||||||||
OpenVoice | Repo | 🤗 Hub | CC-BY-NC 4.0 | No | ZH + EN | Paper | 🤗 Space | |||||||||
IMS-Toucan | Repo | GH release | Apache 2.0 | Yes | Multilingual | Paper | 🤗 Space | |||||||||
Matcha-TTS | Repo | GDrive | MIT | Yes | English | Paper | 🤗 Space | GPL-licensed phonemizer | ||||||||
pflowTTS | Unofficial Repo | GDrive | MIT | Yes | English | Paper | Not Available | GPL-licensed phonemizer | ||||||||
StyleTTS 2 | Repo | 🤗 Hub | MIT | Yes | English | Paper | 🤗 Space | GPL-licensed phonemizer | ||||||||
VALL-E | Unofficial Repo | Not Available | MIT | Yes | NA | Paper | Not Available | |||||||||
HierSpeech++ | Repo | GDrive | CC-BY-NC-SA 4.0 | No | KR + EN | Paper | 🤗 Space | |||||||||
Bark | Repo | 🤗 Hub | MIT | No | Multilingual | Paper | 🤗 Space | |||||||||
EmotiVoice | Repo | GDrive | Apache 2.0 | Yes | ZH + EN | Not Available | Not Available | Separate GUI agreement | ||||||||
Amphion | Repo | 🤗 Hub | MIT | No | Multilingual | Paper | 🤗 Space | |||||||||
xVASynth | Repo | GH commit | GPL-3.0 | Yes | Multilingual | Paper | Not Available | Copyrighted materials used for training | CPU / CUDA | ARPAbet | | 4-type, per-phoneme | | | speed / pitch / energy, per-phoneme | 🦜 |
OverFlow TTS | Repo | GitHub | MIT | Yes | English | Paper | GH Pages | |||||||||
Neural-HMM TTS | Repo | GitHub | MIT | Yes | English | Paper | GH Pages | |||||||||
Tacotron 2 | Unofficial Repo | GDrive | BSD-3 | Yes | English | Paper | Webpage | |||||||||
Glow-TTS | Repo | GDrive | MIT | Yes | English | Paper | GH Pages | |||||||||
Silero | Repo | GH links | CC BY-NC-SA | No | EN + DE + ES + EA | Not Available | Not Available | Non-commercial | ||||||||
MahaTTS | Repo | 🤗 Hub | Apache 2.0 | No | English, Hindi, Indian English, Bengali, Tamil, Telugu, Punjabi, Marathi, Gujarati, Assamese | Not Available | Recordings, Colab | |||||||||