xiaozhi-esp32: an MCP-based chatbot. The Xiaozhi AI chatbot serves as a voice interaction entry point. It uses large models such as Qwen and DeepSeek and controls multiple kinds of devices through MCP (Model Context Protocol).
xiaozhi-esp32 is an open-source project initiated and maintained by “Brother Shrimp”. It runs on low-cost ESP32-series chips (such as the ESP32-C3, S3, and P4) to build a voice-interactive AI chatbot.
Overview of core functions
- Offline wake-up and real-time conversation
Uses ESP-SR for local wake-word detection, streams speech to the cloud for recognition (ASR) and a large language model (such as Qwen or DeepSeek), then speaks the LLM's reply through TTS.
- Multi-protocol communication
Supports WebSocket or MQTT+UDP for remote message control and device interaction.
- Audio processing
Uses the OPUS codec to reduce the bandwidth needed for voice transmission.
- Voiceprint recognition
Distinguishes between multiple speakers to enable identity-aware interaction.
- Display and interaction
Supports OLED/LCD screens that show expressions, battery level, and other information, plus camera capture and image recognition (added in the latest version).
- Very broad hardware compatibility
Verified on 70+ ESP32 development boards, including the S3 and P4 series, along with various screens, communication modules, and sensor modules.
- MCP-based control of smart hardware
Controls volume, lights, motors, GPIOs, and other peripherals through MCP, and can also control PCs, smart-home devices, and more through cloud commands.
- Multi-language support
Works with Chinese, English, Japanese, and other languages, which suits users worldwide.
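The audio-processing item above credits OPUS with more efficient voice transmission. A rough calculation shows the scale of the saving. The capture format (16 kHz, mono, 16-bit), the 24 kbps Opus target, and the 60 ms frame length used here are illustrative assumptions, not values confirmed by the project. Only the arithmetic is certain.

```python
# Back-of-the-envelope comparison of raw PCM vs. Opus-encoded voice.
# All constants are assumptions chosen for illustration.

def pcm_bitrate_bps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Bitrate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def bytes_per_frame(bitrate_bps: int, frame_ms: int) -> float:
    """Payload size of one audio frame at a given bitrate."""
    return bitrate_bps * frame_ms / 1000 / 8


SAMPLE_RATE_HZ = 16_000   # assumed microphone sample rate
BITS_PER_SAMPLE = 16      # assumed sample width
CHANNELS = 1              # assumed mono capture
OPUS_BITRATE_BPS = 24_000 # assumed encoder target bitrate
FRAME_MS = 60             # assumed frame duration

raw_bps = pcm_bitrate_bps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
raw_frame = bytes_per_frame(raw_bps, FRAME_MS)
opus_frame = bytes_per_frame(OPUS_BITRATE_BPS, FRAME_MS)

print(f"raw PCM:  {raw_bps / 1000:.0f} kbps, {raw_frame:.0f} bytes per frame")
print(f"Opus:     {OPUS_BITRATE_BPS / 1000:.0f} kbps, {opus_frame:.0f} bytes per frame")
print(f"reduction: about {raw_bps / OPUS_BITRATE_BPS:.1f}x")
```

Under these assumptions, each 60 ms frame shrinks from 1920 bytes to 180 bytes, which matters on a Wi-Fi or cellular link shared with other traffic.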
🚀Latest developments
- v1.7.6 (June 24, 2025)
Makes MCP the default protocol, adds camera photo capture, optimizes memory usage, and supports more boards.
- The community continues to grow, with support for devices and platforms such as UNIHIKER K10, Waveshare, M5Stack, and DeepSeek. Some contributors have also implemented Home Assistant integration.
Usage
- Flash prebuilt firmware: GitHub or xiaozhi.me provides precompiled bin files for common boards. Users only need to configure Wi-Fi to try the basic features.
- Build it yourself: Set up an ESP-IDF environment (Linux recommended), adjust sdkconfig, select the firmware variant for your board (such as bread, ml307, etc.), then compile and flash it.
- Advanced development: Add custom boards by following the documentation, or extend functionality through MCP.
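To make "extending functionality through MCP" more concrete, here is a minimal sketch of how a device might expose one tool and answer MCP requests. MCP messages are JSON-RPC 2.0, and the `tools/list` and `tools/call` methods come from the MCP specification. Everything else is hypothetical: the `set_volume` tool, the `register_tool` helper, and the in-memory `state` are invented for illustration. The real firmware is written in C++ on ESP-IDF and will differ from this Python model.

```python
import json
from typing import Callable, Dict

# Hypothetical registry: tool name -> description, argument schema, handler.
TOOLS: Dict[str, dict] = {}
state = {"volume": 50}  # stand-in for real hardware state


def register_tool(name: str, description: str, input_schema: dict,
                  handler: Callable[[dict], str]) -> None:
    TOOLS[name] = {"description": description,
                   "inputSchema": input_schema,
                   "handler": handler}


def set_volume(args: dict) -> str:
    level = int(args["level"])
    if not 0 <= level <= 100:
        raise ValueError("level must be between 0 and 100")
    state["volume"] = level
    return f"volume set to {level}"


register_tool(
    "set_volume",
    "Set the speaker volume (0-100).",
    {"type": "object",
     "properties": {"level": {"type": "integer", "minimum": 0, "maximum": 100}},
     "required": ["level"]},
    set_volume,
)


def _error(req: dict, code: int, message: str) -> str:
    return json.dumps({"jsonrpc": "2.0", "id": req.get("id"),
                       "error": {"code": code, "message": message}})


def handle_message(raw: str) -> str:
    """Dispatch one JSON-RPC 2.0 request and return the JSON reply."""
    req = json.loads(raw)
    method = req.get("method")
    params = req.get("params", {})
    if method == "tools/list":
        result = {"tools": [{"name": name,
                             "description": tool["description"],
                             "inputSchema": tool["inputSchema"]}
                            for name, tool in TOOLS.items()]}
    elif method == "tools/call":
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return _error(req, -32602, f"Unknown tool: {params.get('name')}")
        try:
            text = tool["handler"](params.get("arguments", {}))
            result = {"content": [{"type": "text", "text": text}], "isError": False}
        except Exception as exc:  # tool failures are reported in the result
            result = {"content": [{"type": "text", "text": str(exc)}], "isError": True}
    else:
        return _error(req, -32601, f"Method not found: {method}")
    return json.dumps({"jsonrpc": "2.0", "id": req.get("id"), "result": result})


request = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
           "params": {"name": "set_volume", "arguments": {"level": 30}}}
print(handle_message(json.dumps(request)))
```

Adding a new capability in this model means writing one handler and one `register_tool` call. The dispatcher and the message format stay unchanged, which is the main appeal of routing peripheral control through a single protocol.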
Who it is for
- DIY enthusiasts and the Maker community
- Students or innovators who want to bring large language models to smart hardware
- Developers who need voice interaction, edge inference, or remote control
Overall, xiaozhi-esp32 is a highly integrated, feature-rich open-source project with an active community, well suited to in-depth exploration by developers who enjoy hands-on building and want to implement AI features.
GitHub: https://github.com/78/xiaozhi-esp32
YouTube (English-language introduction):