The project is based on OpenAI’s Whisper model and wraps it efficiently with FastAPI’s asynchronous features, supporting asynchronous task queuing, file processing, a built-in web crawler, and other custom functionality.
The vision of “Fast-Powerful-Whisper-AI-Services-API” is to create a powerful, out-of-the-box Whisper service API designed for high-performance, highly scalable, and distributed processing requirements. Built around a producer-consumer model at its core, it is ideal for scenarios that require large-scale, efficient automated speech recognition. The project is based on the OpenAI Whisper model and the Faster Whisper model, which offers faster inference at similar accuracy. It supports high-quality speech transcription and translation tasks in multiple languages, and the built-in crawler module makes it easy to process videos from social media platforms such as Douyin and TikTok: you only need to submit a link to create a task.
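As a hedged sketch of what “submit a link to create a task” could look like from a client, the snippet below builds (but does not send) an HTTP request carrying a video URL. The endpoint path `/api/task` and the payload fields are illustrative assumptions, not the project’s documented API; check the repository for the real routes.

```python
import json
import urllib.request

def build_task_request(video_url: str, api_base: str = "http://localhost:8000"):
    """Build (but do not send) a request that submits a video link as a task.

    The route and payload shape here are hypothetical placeholders.
    """
    payload = json.dumps({"url": video_url, "task": "transcribe"}).encode()
    return urllib.request.Request(
        f"{api_base}/api/task",  # hypothetical route, see the project docs
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_task_request("https://www.tiktok.com/@user/video/123")
print(req.full_url, req.get_method())
```

Sending the prepared request with `urllib.request.urlopen(req)` (or any async HTTP client) would then enqueue the video for transcription on a running instance of the service.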
This system achieves efficient resource scheduling and task management through an asynchronous model pool, which supports parallel computing across multiple GPUs, providing a fully localized, highly scalable, and reliable solution. In addition, the project plans a flexible set of custom components and workflow designs, allowing users to define complex multi-step task flows through JSON files or extend functionality by writing custom components in Python. Built-in high-performance asynchronous HTTP, file I/O, and database modules let users write their own services or task processors to expand their business. In the future, the project plans to integrate LLM APIs such as ChatGPT, enabling a complete workflow from automatic speech recognition to natural language processing and analysis.
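The producer-consumer scheduling described above can be sketched with a plain `asyncio.Queue`: producers enqueue transcription tasks, and a fixed pool of worker coroutines (conceptually, one per model instance or GPU) consumes them. The pool size and the stand-in “transcribe” step below are illustrative assumptions, not the project’s actual implementation.

```python
import asyncio

async def worker(name: str, queue: asyncio.Queue, results: list) -> None:
    """Consumer: pull tasks from the shared queue until a None sentinel arrives."""
    while True:
        task = await queue.get()
        if task is None:  # sentinel: shut this worker down
            queue.task_done()
            break
        # In the real service, this step would run a Whisper model instance.
        results.append(f"{name} transcribed {task}")
        queue.task_done()

async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    results: list = []
    pool_size = 2  # e.g. one worker per available GPU (assumed value)
    workers = [
        asyncio.create_task(worker(f"model-{i}", queue, results))
        for i in range(pool_size)
    ]
    # Producer: enqueue pending audio files as tasks.
    for item in ["a.mp3", "b.mp3", "c.mp3"]:
        await queue.put(item)
    for _ in workers:  # one shutdown sentinel per worker
        await queue.put(None)
    await queue.join()
    await asyncio.gather(*workers)
    return results

if __name__ == "__main__":
    print(asyncio.run(main()))
```

Because each worker holds one model instance, the queue naturally load-balances tasks across instances, which is the property that makes multi-GPU parallelism possible in this design.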
GitHub: https://github.com/Evil0ctal/Fast-Powerful-Whisper-AI-Services-API
YouTube: