PDFMathTranslate: PDF mathematical formula translation tool

The tool uses AI to automatically identify and preserve formulas, charts, tables of contents, and annotations while translating a PDF. It supports multiple languages and many translation services, and is available as a command-line tool, a graphical user interface, and a container-based deployment.
The project currently has about 3.3K stars on GitHub, and the count is still growing rapidly.

1. Project Introduction

PDFMathTranslate is an open source project that aims to extract mathematical formulas from PDF files and translate them into multiple languages. It is especially useful as a reading aid for scientific papers, textbooks, and technical documents. The project combines several modules, including OCR (Optical Character Recognition), formula recognition, and language translation, which substantially lowers the barrier to reading mathematical material across languages.

2. Core functional modules

  1. PDF page screenshot extraction

    • Use fitz (PyMuPDF) to render the PDF to images, page by page;
    • A single page or a selected set of pages can be processed as needed.
  2. Formula area detection

    • Use a trained YOLOv7 model to locate the formula regions on each page;
    • Support batch detection to improve efficiency.
  3. Formula identification and conversion

    • Convert formula regions to LaTeX expressions using MathPix or LaTeX-OCR;
    • Aim for high recognition accuracy across a wide range of complex mathematical expressions.
  4. Multilingual translation

    • Translate formulas and their context into the specified language using OpenAI GPT-3.5 or another translation model;
    • Supports Chinese-English translation and is easy to extend to other languages.
  5. Result output

    • Export recognition results to JSON or TXT, or embed them in HTML pages;
    • The output is easy to read and suitable for subsequent editing.
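
The page-selection and result-output steps above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the `FormulaResult` record, the `parse_page_spec` helper, the "1-3,5" page syntax, and the three export functions are all hypothetical names and formats chosen for the example.

```python
import html
import json
from dataclasses import dataclass, asdict


@dataclass
class FormulaResult:
    """One recognized formula (hypothetical record layout)."""
    page: int          # 1-based page number
    latex: str         # LaTeX produced by the recognition step
    translation: str   # translated surrounding context


def parse_page_spec(spec: str, page_count: int) -> list[int]:
    """Expand a spec such as "1-3,5" into sorted 1-based page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            lo, hi = int(start), int(end)
        else:
            lo = hi = int(part)
        if lo < 1 or hi > page_count or lo > hi:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(lo, hi + 1))
    return sorted(pages)


def to_json(results: list[FormulaResult]) -> str:
    # ensure_ascii=False keeps translated (non-ASCII) text readable.
    return json.dumps([asdict(r) for r in results], ensure_ascii=False, indent=2)


def to_txt(results: list[FormulaResult]) -> str:
    # One line per formula: page, LaTeX, then the translation after a tab.
    return "\n".join(f"[page {r.page}] {r.latex}\t{r.translation}" for r in results)


def to_html(results: list[FormulaResult]) -> str:
    # Escape both fields so LaTeX such as "a < b" cannot break the markup.
    rows = "\n".join(
        f"<tr><td>{r.page}</td><td><code>{html.escape(r.latex)}</code></td>"
        f"<td>{html.escape(r.translation)}</td></tr>"
        for r in results
    )
    return f"<table>\n{rows}\n</table>"
```

For example, `parse_page_spec("1-3,5", 10)` selects pages 1, 2, 3 and 5 of a ten-page document, and the same list of `FormulaResult` records can be passed to any of the three export functions.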

3. Project highlights

  • Highly automated: the whole process from PDF to translated result runs largely automatically;
  • Interdisciplinary: integrates image processing, deep learning, and natural language processing;
  • Practical: suitable for researchers, students, translators, and other user groups;
  • Open source and extensible: you can plug in your own OCR model and translation API for customization.
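
One common way to make translation backends swappable is a small registry of functions. The sketch below only illustrates that idea and is not taken from the project: `TRANSLATORS`, `register_translator`, `translate`, and the placeholder `echo` backend are hypothetical names, and a real backend would call an actual translation API instead of tagging the text.

```python
from typing import Callable, Dict

# Registry of translation backends, keyed by a short name (hypothetical design).
TRANSLATORS: Dict[str, Callable[[str, str], str]] = {}


def register_translator(name: str):
    """Decorator that adds a translate(text, target_lang) function to the registry."""
    def decorator(func: Callable[[str, str], str]) -> Callable[[str, str], str]:
        TRANSLATORS[name] = func
        return func
    return decorator


@register_translator("echo")
def echo_translate(text: str, target_lang: str) -> str:
    # Placeholder backend: tags the text instead of calling a real service.
    return f"[{target_lang}] {text}"


def translate(text: str, target_lang: str, backend: str = "echo") -> str:
    """Look up the named backend and apply it to the text."""
    try:
        func = TRANSLATORS[backend]
    except KeyError:
        raise ValueError(f"unknown translation backend: {backend!r}") from None
    return func(text, target_lang)
```

A custom backend is added by decorating another function with `@register_translator("my_backend")` and passing `backend="my_backend"` to `translate`.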

4. Brief description of usage

git clone https://github.com/Byaidu/PDFMathTranslate.git
cd PDFMathTranslate
pip install -r requirements.txt

After that, run the main program with the desired parameters:

python main.py --pdf_path sample.pdf --lang zh

You can adjust parameters such as recognition range, translation language, and output format in the config.yaml file.
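
To illustrate how command-line flags and file-based settings can fit together, here is a minimal sketch using Python's standard `argparse`. Only the `--pdf_path` and `--lang` flags come from the command above; the `DEFAULTS` keys, the `resolve_settings` helper, and the rule that command-line values take precedence are assumptions made for the example, and the config is passed in as a plain dict rather than read from config.yaml.

```python
import argparse

# Hypothetical defaults standing in for values that config.yaml might hold.
DEFAULTS = {"lang": "zh", "pages": None, "output_format": "json"}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Translate formulas in a PDF (sketch)")
    parser.add_argument("--pdf_path", required=True, help="input PDF file")
    parser.add_argument("--lang", default=None, help="target language code")
    return parser


def resolve_settings(argv, config=None):
    """Merge settings: CLI values override config values, which override defaults."""
    settings = dict(DEFAULTS)
    settings.update(config or {})
    args = build_parser().parse_args(argv)
    settings["pdf_path"] = args.pdf_path
    if args.lang is not None:
        # Only override the configured language when the flag was given.
        settings["lang"] = args.lang
    return settings


if __name__ == "__main__":
    import sys
    print(resolve_settings(sys.argv[1:]))
```

With this layout, a value set in the config applies to every run, while a flag such as `--lang` changes it for a single invocation.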


5. Future Outlook

  • Support more languages and model backends (such as DeepL and Claude);
  • Add parsing of mathematical semantics;
  • Improve the robustness of formula recognition and support handwritten formulas;
  • Develop a web UI that provides a visual, interactive interface.

GitHub: https://github.com/Byaidu/PDFMathTranslate
