Beyond text-to-speech, Amphion can also replace the singer's voice in a song with that of another singer. It supports voice conversion, singing voice synthesis, text-to-audio, text-to-music, and more!
Very powerful!
Demo video: Taylor Swift singing a Chinese song 🎵
Amphion supports a wide range of audio generation tasks, from speech to music, each with its own applications and technical requirements.
Key features:
1. Text-to-speech: Convert text into natural-sounding speech.
Applications: Used in voice assistants, automated voice response systems, reading text aloud for the visually impaired, etc.
2. Singing voice synthesis: Create a virtual singer's voice by generating singing from text or melodies.
Applications: Used for music production, virtual idol creation, etc.
3. Voice conversion: Change one person's voice to sound like another person's.
Applications: Used for entertainment, sound design, anonymous communication, etc.
4. Singing voice conversion: Convert the singer's voice in a song into the voice of another singer.
Applications: Used for music production, personalized music experiences, etc.
5. Text-to-audio: Convert text not only into speech but also into other types of audio, such as sound effects or music clips.
Applications: Used to create sound effects, music clips, audio stories, etc.
6. Text-to-music: Generate music from text descriptions.
Applications: Used for automated music creation, creating music based on emotions or storylines, etc.
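To make the six tasks easier to compare, here is a minimal Python sketch that records each one by its input and output type, as described in the list above. This is not Amphion's API: the `AudioTask` class, the `TASKS` table, and the `tasks_accepting` helper are hypothetical names made up for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioTask:
    """Hypothetical record for one task in the list above (not an Amphion class)."""
    name: str
    inputs: tuple  # kinds of input the task accepts
    output: str    # kind of audio the task produces


# Input and output kinds are taken from the task descriptions above.
TASKS = (
    AudioTask("text-to-speech", ("text",), "speech"),
    AudioTask("singing voice synthesis", ("text", "melody"), "singing voice"),
    AudioTask("voice conversion", ("speech",), "speech"),
    AudioTask("singing voice conversion", ("singing voice",), "singing voice"),
    AudioTask("text-to-audio", ("text",), "audio"),
    AudioTask("text-to-music", ("text",), "music"),
)


def tasks_accepting(kind):
    """Return the names of all tasks that accept the given input kind."""
    return [task.name for task in TASKS if kind in task.inputs]


if __name__ == "__main__":
    print(tasks_accepting("text"))
```

Grouping the tasks this way shows that four of the six start from text, while the two conversion tasks start from existing audio.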
Model Support: The toolkit supports multiple models and architectures, such as FastSpeech2, VITS, VALL-E, NaturalSpeech2, and more, for different audio generation tasks.
Vocoder Support: Amphion supports a wide range of neural vocoders, including GAN-based vocoders (e.g., MelGAN, HiFi-GAN), flow-based vocoders (e.g., WaveGlow), diffusion-based vocoders (e.g., DiffWave), and more.
Dataset Support: Amphion unifies data preprocessing for open-source datasets and supports multiple datasets such as AudioCaps, LibriTTS, LJSpeech, and more.
GitHub: https://github.com/open-mmlab/Amphion
Paper: https://arxiv.org/abs/2312.09911
HuggingFace Demo: https://huggingface.co/amphion