MeloTTS: A high-quality multilingual text-to-speech (TTS) library developed by MyShell AI

It supports multiple languages, including English, Spanish, French, Chinese, Japanese and Korean.
It is fast, handles text that mixes Chinese and English, and generates clear, natural-sounding speech.
Real-time text-to-speech is achievable even on an ordinary CPU.
I tested it and the quality is very good. 👍

The main features include:

1. Multi-language support: MeloTTS converts text to speech in English (with several accents, including US, UK, Indian and Australian), Spanish, French, Chinese, Japanese and Korean, which makes it suitable for multilingual applications.
2. Mixed Chinese and English pronunciation: The Chinese voice can read Chinese text that contains English words, which is practical for multilingual communication.
3. Real-time CPU inference: MeloTTS is designed to run text-to-speech in real time on a CPU, without GPU acceleration, which makes it usable across a wider range of hardware.
4. High-quality speech output: MeloTTS aims to produce clear, natural-sounding speech in every supported language.
5. Easy to install and use: A simple installation guide and a Python API are provided, so users can install MeloTTS in a Linux environment and convert text to speech in a few lines of code.
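
As a rough illustration of items 3 and 5, the sketch below times one synthesis call and divides it by the length of the generated audio (the real-time factor; a value below 1.0 means the audio was produced faster than it plays back). The MeloTTS calls follow the usage shown in the project's README as far as I know, but treat the language code `EN`, the speaker key `EN-US` and the exact signatures as assumptions to verify against the repository. The helper names (`wav_duration_seconds`, `real_time_factor`, `make_melo_synthesizer`) are made up for this example, and the duration helper assumes the output is an uncompressed PCM WAV file.

```python
import time
import wave


def wav_duration_seconds(path):
    """Return the playback length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def real_time_factor(synthesize, text, out_path):
    """Time synthesize(text, out_path) and divide by the audio length.

    `synthesize` is any callable that writes a WAV file to out_path.
    """
    start = time.perf_counter()
    synthesize(text, out_path)
    elapsed = time.perf_counter() - start
    return elapsed / wav_duration_seconds(out_path)


def make_melo_synthesizer(language="EN", speaker="EN-US", speed=1.0):
    """Load a MeloTTS model once and return a synthesize(text, out_path) callable.

    Assumes MeloTTS is installed; the import is lazy so the helpers above
    can be used (and tested) without it. The language and speaker values
    are assumptions taken from the project's README.
    """
    from melo.api import TTS

    model = TTS(language=language, device="cpu")  # force CPU inference
    speaker_id = model.hps.data.spk2id[speaker]

    def synthesize(text, out_path):
        model.tts_to_file(text, speaker_id, out_path, speed=speed)

    return synthesize


# Example usage (requires MeloTTS and its model files):
#   synth = make_melo_synthesizer()
#   rtf = real_time_factor(synth, "Hello from MeloTTS.", "en-us.wav")
#   print(f"real-time factor: {rtf:.2f}")
```

Loading the model outside the timed call keeps the one-off startup cost out of the measurement, so the number reflects synthesis speed only. The first call may still be slower than later ones.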

MeloTTS builds on several open source projects, including TTS, VITS, VITS2 and Bert-VITS2. It is released under the MIT license and can be used for both commercial and non-commercial purposes.

GitHub: https://github.com/myshell-ai/MeloTTS
Demonstration: https://huggingface.co/spaces/mrfakename/MeloTTS
