UltraRAG is a framework for complex RAG

UltraRAG is a lightweight framework that makes building retrieval-augmented generation (RAG) systems simpler and more efficient. It follows a low-code development model: instead of writing complex code, you describe AI workflows, including conditional logic and loops, in a few dozen lines of YAML configuration.
The framework has a built-in visual development environment in which you can build data processing pipelines, adjust parameters in real time, and turn the resulting logic into an interactive conversational application with one click. Even without deep programming skills, you can deploy an AI system that grounds its answers in your own data, which reduces hallucinations and improves answer accuracy.

Now that large-model applications are increasingly popular, RAG has become close to the standard solution for building question-answering systems over private knowledge. But anyone who has actually built a RAG system knows that the problem is never just “connect a vector database”.

How should the retrieval strategy be designed? Is reranking necessary? How should multi-stage retrieval be organized? How should different modules be combined? How can results be reproduced?

When the process starts to get complicated, the code swells rapidly.

That’s exactly the problem UltraRAG sets out to solve. It is open-sourced by OpenBMB and positioned as:

A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines

What exactly does “Low-Code” mean here?

UltraRAG is indeed low-code, but low-code here does not mean a drag-and-drop UI or a zero-threshold platform.

What it means is:

You define complex RAG pipelines in configuration files such as YAML instead of writing a lot of Python glue code by hand.

You can abstract retrievers, rerankers, generation models, evaluation modules, and so on into “modules”, then combine these modules into a complete pipeline through MCP (Model Context Protocol).

When you want to run experimental comparisons, you don’t need to copy a whole set of scripts; you just adjust the configuration to build a new pipeline.

It lowers the cost of building experiments, not the barrier to programming.
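
To make the idea of swapping configuration instead of copying scripts concrete, here is a minimal Python sketch. It is not UltraRAG's actual schema or API: the step names, the REGISTRY dictionary, and the config layout are all hypothetical, and the "retriever" is a toy keyword matcher over three hard-coded sentences.

```python
import re
from typing import Callable, Dict, List

# Toy corpus (hypothetical); a real system would query an index instead.
CORPUS: List[str] = [
    "UltraRAG is a low-code framework for RAG pipelines.",
    "A reranker reorders retrieved passages by relevance.",
    "Hybrid retrieval combines dense and sparse scores.",
]


def tokens(text: str) -> set:
    """Lowercase word set of a string."""
    return set(re.findall(r"\w+", text.lower()))


def keyword_retrieve(state: dict) -> dict:
    """Rank passages by how many query words they share; drop non-matches."""
    query_words = tokens(state["query"])
    scored = [(len(query_words & tokens(doc)), doc) for doc in CORPUS]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    state["passages"] = [doc for score, doc in ranked if score > 0]
    return state


def truncate(state: dict) -> dict:
    """Keep only the top_k passages (a stand-in for a real reranker)."""
    state["passages"] = state["passages"][: state.get("top_k", 1)]
    return state


def build_prompt(state: dict) -> dict:
    """Concatenate the passages and the question into one prompt string."""
    context = "\n".join(state["passages"])
    state["prompt"] = f"Context:\n{context}\n\nQuestion: {state['query']}"
    return state


# Hypothetical registry: maps step names used in configs to functions.
REGISTRY: Dict[str, Callable[[dict], dict]] = {
    "retrieve": keyword_retrieve,
    "truncate": truncate,
    "prompt": build_prompt,
}


def run_pipeline(config: dict, query: str) -> dict:
    """Run the steps listed in the config, in order, over a shared state."""
    state = {"query": query, **config.get("params", {})}
    for step_name in config["steps"]:
        state = REGISTRY[step_name](state)
    return state


# Two experiment variants differ only in configuration, not in code.
baseline = {"steps": ["retrieve", "prompt"]}
variant = {"steps": ["retrieve", "truncate", "prompt"], "params": {"top_k": 1}}

query = "what does a reranker do"
base_out = run_pipeline(baseline, query)
var_out = run_pipeline(variant, query)
print(len(base_out["passages"]), len(var_out["passages"]))
```

The two dictionaries play the role that YAML files would play in a config-driven framework: adding the truncation step is a one-line change to the config, and the pipeline code itself stays untouched.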

MCP: The core idea of modular composition

At the heart of UltraRAG is MCP (Model Context Protocol).

On top of it, UltraRAG breaks RAG down into a series of composable modules, which in turn enable:

Multi-stage retrieval
Conditional logic
Complex pipeline structures
Innovative RAG architecture experiments

In other words, it is less about helping you “build a Q&A application” and more about helping you “design a reproducible RAG experimental system”.

This is especially important in research-oriented scenarios.
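
The conditional logic and multi-stage retrieval listed above can be illustrated with a short Python sketch. It is a generic pattern, not UltraRAG code: the function iterative_retrieve, its callback parameters, and the toy INDEX are all hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List


def iterative_retrieve(
    query: str,
    retrieve: Callable[[str], List[str]],
    is_sufficient: Callable[[List[str]], bool],
    rewrite: Callable[[str, int], str],
    max_rounds: int = 3,
) -> List[str]:
    """Retrieve in rounds, rewriting the query until the evidence check passes."""
    evidence: List[str] = []
    current = query
    for round_idx in range(max_rounds):       # loop: at most max_rounds stages
        for passage in retrieve(current):
            if passage not in evidence:       # de-duplicate across rounds
                evidence.append(passage)
        if is_sufficient(evidence):           # branch: stop early when satisfied
            break
        current = rewrite(query, round_idx)   # otherwise try a rewritten query
    return evidence


# Toy stand-ins (hypothetical) so the sketch runs without any external service.
INDEX: Dict[str, List[str]] = {
    "ultrarag": ["UltraRAG is open-sourced by OpenBMB."],
    "ultrarag pipeline": ["Pipelines are described in YAML."],
}


def toy_retrieve(q: str) -> List[str]:
    return INDEX.get(q.lower(), [])


def toy_sufficient(evidence: List[str]) -> bool:
    return len(evidence) >= 2


def toy_rewrite(original: str, round_idx: int) -> str:
    return f"{original} pipeline"


result = iterative_retrieve("UltraRAG", toy_retrieve, toy_sufficient, toy_rewrite)
print(result)
```

In a config-driven framework the loop bound and the stopping condition would typically live in the pipeline description rather than in code, which is what makes variants of such a flow cheap to compare.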

It is not a chat app platform

If you are expecting:

A visual drag-and-drop interface
One-click chatbot generation
An AI builder for non-technical people

UltraRAG may disappoint you.

It is positioned more towards research and engineering optimization than towards product-oriented application layers.

Some tools on the market are better suited to getting an application up and running quickly; UltraRAG is better for:

Optimizing retrieval strategies
Building innovative RAG structures
Running experimental comparisons
Designing reproducible pipelines

It focuses on “how to make RAG stronger” rather than “how to go live faster”.

Why is this framework valuable?

Many RAG systems stop at:

Vector retrieval + context concatenation + handing the result to the large model

But once you really start optimizing, you run into questions like:

Dense vs. sparse vs. hybrid retrieval?
Should you add a reranker?
How do you do multi-stage recall?
How do you systematically evaluate whether an improvement is effective?

Without a structured pipeline framework, experiments become confusing and hard to reproduce.

UltraRAG provides an engineered way to express complex RAG structures.
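
As a concrete instance of the dense vs. sparse vs. hybrid question and of systematic evaluation, the following Python sketch fuses two rankings with reciprocal rank fusion (RRF) and compares recall@k. RRF is one common fusion technique chosen here for illustration, not something this article attributes to UltraRAG. The document IDs, the two rankings, and the relevance labels are made-up toy data.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set


def reciprocal_rank_fusion(rankings: Sequence[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: each document scores sum(1 / (k + rank)) over the lists."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


# Toy data (hypothetical): two retrievers return different orderings.
dense = ["d1", "d2", "d3"]
sparse = ["d3", "d4", "d1"]
relevant = {"d1", "d3"}

hybrid = reciprocal_rank_fusion([dense, sparse])

for name, ranking in [("dense", dense), ("sparse", sparse), ("hybrid", hybrid)]:
    print(name, ranking, recall_at_k(ranking, relevant, k=2))
```

On this toy data each retriever alone reaches a recall@2 of 0.5, while the fused ranking reaches 1.0. The point is not the numbers but the shape of the experiment: every variant is scored by the same metric on the same labels, so an improvement can be checked rather than assumed.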

GitHub: https://github.com/OpenBMB/UltraRAG