Octopus-V2-2B, developed by Nexa AI at Stanford University, is customized for function calling against the Android API.
It adopts a unique functional-token tagging strategy that goes beyond RAG-based methods and is particularly suitable for edge computing devices.
It is 36 times faster than a Llama-7B + RAG solution, outperforms GPT-4, and responds in under one second.
It runs directly on mobile devices and supports a wide range of application scenarios, enabling new approaches to Android system management and device-to-device collaboration.
Its fast, efficient inference is especially suited to scenarios that demand high-performance, precise function calls, such as smart-home control and mobile application development.
🔑 Functional tokens minimize error rates: assigning a unique token to each function greatly reduces function-selection errors and shortens the context by more than 95%.
Excellent accuracy: with only 100 training samples per function, accuracy reaches an impressive 98.095%.
Reduced response time: compared to RAG-based function calling with Llama-7B, this method cuts latency by a factor of 35.
Typical queries run on-device in 1.1 to 1.7 seconds. It can be deployed on cars, headphones, mobile phones, computers, and other devices…
Practical applications: mobile smart devices can be transformed to interact seamlessly with services such as maps and food delivery.
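The functional-token idea described above can be illustrated with a small sketch. This is not the official Octopus-v2 API: the token names (`<fn_0>`, …), the function registry, and the `dispatch` helper are all hypothetical, chosen only to show how a single dedicated token per function lets a model select an action without keeping full function descriptions in the prompt context, as a RAG-based approach would.

```python
# Illustrative sketch (hypothetical names, not the official Octopus-v2 API):
# each API function is assigned one dedicated token, so the model selects a
# function by emitting a single token rather than generating or retrieving
# its full signature and description.

# Hypothetical token-to-function registry for a few Android-style actions.
FUNCTIONAL_TOKENS = {
    "<fn_0>": "take_a_photo",
    "<fn_1>": "get_trending_news",
    "<fn_2>": "set_timer",
}

def dispatch(model_output: str):
    """Parse a model output of the form '<fn_k>(args)' and return the
    resolved function name plus its raw argument string."""
    token, _, args = model_output.partition("(")
    fn_name = FUNCTIONAL_TOKENS[token.strip()]
    return fn_name, args.rstrip(")")

# One functional token resolves directly to a concrete function, which is
# what shrinks the context: no candidate descriptions need to be retrieved.
print(dispatch("<fn_2>(duration='10min')"))  # → ('set_timer', "duration='10min'")
```

Because selection collapses to emitting one token, the error surface shrinks to a single-token classification, which is consistent with the reduced error rate and shorter context the feature list claims.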
Paper: https://arxiv.org/abs/2404.01744
Model download: https://huggingface.co/NexaAIDev/Octopus-v2
Video: