Video-to-subtitle tool: generate high-quality SRT subtitles with one click

Project name: VideoTextPro
Project function: Text conversion tool
Project description: A text conversion tool optimized for recordings of Douyin live streams. It is mainly used for text extraction and subtitle generation from live-stream replays and recorded videos. It supports multiple video and audio formats, including FLV, MP4, and AVI.
It offers efficient batch processing: it can automatically scan recording folders, skip files that have already been processed, and output multiple subtitle formats (such as SRT, ASS, and TXT).
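
The scan-and-skip behavior described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the function name `find_pending_videos`, the extension set, and the rule that a video counts as processed once a same-named .srt file exists are all assumptions made for illustration.

```python
from pathlib import Path
from typing import List

# Extensions taken from the description above; extend as needed.
VIDEO_EXTENSIONS = {".flv", ".mp4", ".avi"}


def find_pending_videos(video_dir: Path, subtitle_dir: Path) -> List[Path]:
    """Return videos under video_dir that have no matching .srt in subtitle_dir."""
    pending = []
    for path in sorted(video_dir.rglob("*")):
        # Ignore directories and files that are not known video types.
        if not path.is_file() or path.suffix.lower() not in VIDEO_EXTENSIONS:
            continue
        # Assumption: a video is "processed" when <stem>.srt already exists.
        if (subtitle_dir / f"{path.stem}.srt").exists():
            continue
        pending.append(path)
    return pending
```

Calling `find_pending_videos(Path("videos"), Path("subtitles"))` would then return only the files that still need transcription, so a rerun after an interruption does not repeat finished work.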

Project introduction

The project, called video-to-text-conversion, transcribes the audio content of a video file into text in order to generate subtitles automatically. It supports multilingual speech recognition and generates subtitles with a timeline (.srt files).

🧰Project functions

Extract audio from videos and recognize it as text;
Generate standard .srt subtitle files that video players can load easily;
Recognize multiple languages;
Batch-process multiple video files.

Installation steps

Clone the project

git clone https://github.com/ldlkuz/video-to-text-conversion.git
cd video-to-text-conversion

Create a virtual environment and install dependencies

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Usage

Put your video files (for example, .mp4) in the videos/ folder.
Run the main script:
```
python main.py
```
After execution, it will:
- Traverse all videos in the videos/ folder;
- Extract the audio from each video;
- Use Whisper for speech transcription;
- Save the result as an .srt subtitle file in the subtitles/ folder.
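
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the contents of main.py: the function names are hypothetical, the `base` model is an arbitrary default, and the sketch skips a separate audio-extraction step because Whisper can decode the audio track of a video file directly through FFmpeg.

```python
from pathlib import Path


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render Whisper-style segments (dicts with start, end, text) as SRT."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # A blank line separates consecutive subtitle entries.
    return "\n".join(blocks)


def transcribe_folder(video_dir="videos", subtitle_dir="subtitles", model_name="base"):
    """Transcribe every .mp4 in video_dir and write one .srt per video."""
    import whisper  # imported lazily so the helpers above work without it

    model = whisper.load_model(model_name)
    out_dir = Path(subtitle_dir)
    out_dir.mkdir(exist_ok=True)
    for video in sorted(Path(video_dir).glob("*.mp4")):
        # Whisper reads the audio via FFmpeg, so the video path is passed as-is.
        result = model.transcribe(str(video))
        srt_text = segments_to_srt(result["segments"])
        (out_dir / f"{video.stem}.srt").write_text(srt_text, encoding="utf-8")
```

Keeping the timestamp and SRT formatting in small pure functions makes them easy to check without loading a model; only `transcribe_folder` needs Whisper and FFmpeg installed.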

Project structure

main.py: Main program, which processes video files, calls Whisper, and generates subtitles.
videos/: Holds the video files to be processed.
subtitles/: Receives the generated subtitle files.
requirements.txt: List of required Python libraries (mainly openai-whisper, moviepy, ffmpeg-python, etc.).

Supported languages

The Whisper model natively supports multiple languages, not limited to English. You can modify the code to specify the recognition language.
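
One way to specify the recognition language is to pass the `language` option to Whisper's `transcribe` call. The sketch below is an assumption about how the code might be modified, not the project's own implementation; the helper names and the `base` default are hypothetical.

```python
from typing import Dict, Optional


def build_transcribe_options(language: Optional[str] = None) -> Dict[str, str]:
    """Build keyword options for transcribe; no language means auto-detect."""
    if language is None:
        return {}
    return {"language": language}


def transcribe_with_language(video_path: str, language: Optional[str] = None,
                             model_name: str = "base"):
    """Transcribe one file, optionally forcing a language code such as "zh"."""
    import whisper  # imported lazily; requires openai-whisper and FFmpeg

    model = whisper.load_model(model_name)
    return model.transcribe(video_path, **build_transcribe_options(language))
```

For example, `transcribe_with_language("videos/demo.mp4", language="zh")` would force Chinese recognition (the file name here is made up), while omitting `language` leaves Whisper to detect the language from the start of the audio.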

📝Precautions

You need to install FFmpeg (for processing video and audio) and make sure the ffmpeg command can be called from the command line;
If the system has no graphics card, the model runs on the CPU by default, which may be slower;
If you need higher accuracy, you can use Whisper's large model (you need to modify the code yourself to load the corresponding model).
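
The three precautions can be checked or applied in code along these lines. This is a hedged sketch: the helper names are hypothetical, and it assumes PyTorch is present (it is installed as a dependency of openai-whisper).

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable is found on the PATH."""
    return shutil.which("ffmpeg") is not None


def load_whisper_model(model_name: str = "large"):
    """Load a Whisper model on the GPU when available, otherwise on the CPU."""
    import torch  # imported lazily; both packages are heavy
    import whisper

    device = "cuda" if torch.cuda.is_available() else "cpu"
    return whisper.load_model(model_name, device=device)
```

Checking `ffmpeg_available()` before processing gives a clear error early instead of a failure in the middle of a batch. Note that the large model needs considerably more memory and time than the smaller ones, especially on a CPU.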

Project address: https://github.com/ldlkuz/video-to-text-conversion
