AI-Vtuber digital person - Tarogo General Blogs

A highly customizable, end-to-end AI VTuber.
It connects to Bilibili live rooms and uses the Zhipu API as its base language model. It has intent recognition and both long-term and short-term memory (direct memory and associative memory), and it supports building cognitive libraries and song libraries. It also integrates several currently popular projects for speech conversion, text-to-speech, image generation, and digital human driving, and ships with an easy-to-use client.

Features of this project:

1. This project does not place high demands on the local graphics card. A computer that can run stable-diffusion normally should be able to run this project comfortably.
2. This project takes up a fairly large amount of disk space (about 20 GB after full deployment, not counting third-party projects), mainly because the virtual environment is large. This will be addressed in a future release.
3. This project bundles miniconda3 to manage its virtual environment, so users can install additional third-party modules themselves.
4. This project provides a visual client (built on the streamlit framework) that supports environment management, virtual streamer customization, launching the integrated extension projects, several practical utilities, live stream backend monitoring, graph database editing, and other operations.
5. This project provides a one-stop training and inference service for the so-vits-svc4.1 project.
6. This project provides a back-end API server that exposes most of the project's services through GET/POST requests.
7. This project supports building virtual streamer templates, managing templates for multiple characters, and switching between virtual streamer templates in real time.
8. In the current version, the open source projects integrated by this project are: so-vits-svc4.1 (speech conversion), GPT-SoVITS (text-to-speech), UVR5 (vocal separation), fast-whisper (speech recognition), stable-diffusion-webui (image generation), stable-diffusion-comfyui, easyaivtuber (digital human driving), and rembg (background removal).
9. The utilities provided by this project include: a video/audio crawler, speech recognition, vocal separation, text-to-speech, speech conversion, AI painting, and image background removal.
10. This project builds an AI virtual streamer character from three parts: a role prompt template, a cognitive/works knowledge base queried through a knowledge graph, and a knowledge base queried through a vector database (for the technical implementation, see the author's Yuque documentation or blog).
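
To make item 10 a little more concrete, here is a minimal sketch of how a role prompt template and a vector-style knowledge lookup might be combined. This is not the project's actual code: the template fields, the toy knowledge entries, and the functions `embed`, `retrieve`, and `build_prompt` are all assumptions made up for illustration. A bag-of-words cosine similarity stands in for a real embedding model and vector database, and the knowledge graph query is left out.

```python
import math
import re
from collections import Counter

# Hypothetical role prompt template; the field names are illustrative only.
ROLE_TEMPLATE = (
    "You are {name}, a virtual streamer. Personality: {persona}.\n"
    "Relevant knowledge:\n{facts}\n"
    "Viewer says: {message}"
)

# Toy entries standing in for the cognitive/works knowledge base.
KNOWLEDGE = [
    "The streamer's favorite song is a cover released last summer.",
    "The streamer goes live every Friday evening.",
    "The streamer's model was drawn by a friend.",
]


def embed(text):
    """Bag-of-words vector; a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, k=1):
    """Return the k knowledge entries most similar to the query."""
    q = embed(query)
    ranked = sorted(KNOWLEDGE, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]


def build_prompt(name, persona, message, k=1):
    """Fill the role template with retrieved facts and the viewer message."""
    facts = "\n".join(f"- {fact}" for fact in retrieve(message, k))
    return ROLE_TEMPLATE.format(
        name=name, persona=persona, facts=facts, message=message
    )


if __name__ == "__main__":
    print(build_prompt("Demo", "cheerful", "When do you go live?"))
```

The resulting prompt string would then be sent to the language model. Keeping the template separate from the retrieved facts is one way to let a character be swapped without touching the knowledge base.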

If you want to learn more, click the link below the video.
Thank you for watching. If you enjoyed the video, please like and subscribe. Thanks!

GitHub: https://github.com/whoiswennie/AI-Vtuber

YouTube: