MCP-based chatbot: the Xiaozhi AI chatbot as a voice entry point

xiaozhi-esp32 is a chatbot based on MCP (Model Context Protocol). The Xiaozhi AI chatbot serves as a voice interaction entry point: it uses the AI capabilities of large models such as Qwen and DeepSeek, and controls multiple kinds of devices through MCP.

xiaozhi-esp32 is an open-source project started and maintained by "Brother Shrimp". It runs on low-cost ESP32-series chips (such as the ESP32-C3, S3, and P4) and turns them into a voice-interactive AI chatbot.

Overview of core functions

  • Offline wake-up and real-time conversation
    ESP-SR provides on-device wake-word detection. Audio is then streamed to the cloud for speech recognition (ASR) and passed to a large language model (such as Qwen or DeepSeek); the LLM's reply is spoken back through TTS.
  • Multi-protocol communication
    Supports WebSocket or MQTT+UDP for remote messaging, control, and device interaction.
  • Audio processing
    Uses the Opus codec to transmit voice efficiently.
  • Voiceprint recognition
    Distinguishes between multiple speakers, so the interaction can depend on who is speaking.
  • Display and interaction
    Supports OLED/LCD displays for showing expressions, battery level, and other information, as well as camera capture and image recognition (added in the latest version).
  • Broad hardware compatibility
    Verified on 70+ ESP32 development boards, including the S3 and P4 series, along with various screens, communication modules, and sensor modules.
  • Smart hardware control via MCP
    Controls volume, lights, motors, GPIOs, and other peripherals through MCP; cloud-side commands can also control PCs, smart home devices, and more.
  • Multi-language support
    Works with Chinese, English, Japanese, and other languages, making it suitable for users worldwide.
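
To make the MCP control feature above more concrete, here is a minimal Python sketch of what a tool invocation looks like on the wire. MCP messages are JSON-RPC 2.0, and a tool is invoked with the `tools/call` method. The tool name `set_volume` and its `volume` argument are hypothetical, chosen only for illustration; the firmware's actual tool names may differ.

```python
import json


def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool name and argument, used only as an example.
request = make_tool_call(1, "set_volume", {"volume": 60})
print(request)
```

The same envelope works for any peripheral: only the `name` and `arguments` fields change from tool to tool.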

🚀 Latest developments

  • v1.7.6 (June 24, 2025)
    Makes MCP the default protocol, adds camera photo capture, optimizes memory usage, and supports more boards.
  • The community keeps growing, with support for UNIHIKER K10, Waveshare, M5Stack, DeepSeek, and more; some contributors have also built Home Assistant integrations.

Usage

  1. Flash prebuilt firmware: GitHub and xiaozhi.me provide precompiled .bin files for common boards. You only need to configure Wi-Fi to try the basic features.
  2. Build it yourself: Set up an ESP-IDF environment (Linux recommended), adjust sdkconfig, select the firmware variant for your board (such as bread or ml307), then compile and flash it.
  3. Advanced development: Add a custom board for new hardware by following the documentation, or extend functionality through MCP.
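
As an illustration of extending functionality through MCP, the sketch below shows the general shape of a tool registry that answers `tools/list` and `tools/call`. It is written in Python for brevity and is not the project's code (the firmware itself is built with ESP-IDF). The `ToolRegistry` class, the `set_lamp` tool, and the `lamp` dictionary standing in for a GPIO pin are all hypothetical.

```python
import json
from typing import Callable, Dict


class ToolRegistry:
    """Hypothetical in-memory registry that dispatches MCP tool requests."""

    def __init__(self) -> None:
        self._tools: Dict[str, dict] = {}

    def add_tool(self, name: str, description: str,
                 input_schema: dict, handler: Callable[..., str]) -> None:
        self._tools[name] = {
            "description": description,
            "inputSchema": input_schema,
            "handler": handler,
        }

    def handle(self, raw: str) -> str:
        """Process one JSON-RPC 2.0 request and return the JSON reply."""
        request = json.loads(raw)
        method = request.get("method")
        params = request.get("params") or {}
        if method == "tools/list":
            result = {"tools": [
                {"name": name,
                 "description": tool["description"],
                 "inputSchema": tool["inputSchema"]}
                for name, tool in self._tools.items()
            ]}
        elif method == "tools/call":
            tool = self._tools.get(params.get("name"))
            if tool is None:
                return self._error(request, -32602, "Unknown tool")
            text = tool["handler"](**(params.get("arguments") or {}))
            result = {"content": [{"type": "text", "text": str(text)}],
                      "isError": False}
        else:
            return self._error(request, -32601, "Method not found")
        return json.dumps({"jsonrpc": "2.0", "id": request.get("id"),
                           "result": result})

    @staticmethod
    def _error(request: dict, code: int, message: str) -> str:
        return json.dumps({"jsonrpc": "2.0", "id": request.get("id"),
                           "error": {"code": code, "message": message}})


# Stand-in for real hardware state; a device would drive a GPIO here.
lamp = {"on": False}


def set_lamp(on: bool) -> str:
    lamp["on"] = on
    return "lamp on" if on else "lamp off"


registry = ToolRegistry()
registry.add_tool(
    "set_lamp",
    "Turn a lamp on or off",
    {"type": "object",
     "properties": {"on": {"type": "boolean"}},
     "required": ["on"]},
    set_lamp,
)

reply = registry.handle(json.dumps({
    "jsonrpc": "2.0", "id": 7, "method": "tools/call",
    "params": {"name": "set_lamp", "arguments": {"on": True}},
}))
print(reply)
```

Keeping the description and input schema next to the handler means a `tools/list` reply can be generated from the same table that dispatches calls, so the two cannot drift apart.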

Who it is for

  • DIY enthusiasts and the maker community
  • Students or innovators who want to bring large language models to smart hardware
  • Developers who need voice interaction, edge inference, or remote control

Overall, xiaozhi-esp32 is a highly integrated, feature-rich open-source project with an active community. It is well suited to developers who enjoy hands-on building and want to add AI features to their hardware.

GitHub: https://github.com/78/xiaozhi-esp32

