Alibaba launches Qwen1.5-32B model

Qwen1.5-32B is the latest member of the Qwen1.5 language model series, designed to strike a balance between performance, efficiency, and memory footprint.

Its key features include:

1. Balance between parameter count and efficiency:

Qwen1.5-32B has approximately 32 billion parameters, a size chosen to balance strong performance against manageable resource requirements. The model delivers high accuracy on complex tasks while keeping operating costs low and inference fast.

2. Grouped-query attention (GQA):

The Qwen1.5-32B architecture uses grouped-query attention, an optimized attention mechanism in which multiple query heads share each key/value head. This shrinks the key/value cache and improves inference efficiency when processing large amounts of data, giving the model better performance when served.
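The idea behind GQA can be sketched in a few lines of NumPy: each group of query heads attends over a single shared key/value head, so the key/value cache shrinks by the group factor. The head counts and dimensions below are illustrative, not Qwen's actual configuration.

```python
import numpy as np

def grouped_query_attention(q, k, v, num_kv_heads):
    """Minimal grouped-query attention sketch.

    q:    (num_q_heads, seq, d)   per-head query projections
    k, v: (num_kv_heads, seq, d)  shared key/value projections
    Each group of num_q_heads // num_kv_heads query heads reads the
    same K/V head, so the KV cache is num_kv_heads wide, not num_q_heads.
    """
    num_q_heads, seq, d = q.shape
    group = num_q_heads // num_kv_heads
    out = np.empty_like(q)
    for h in range(num_q_heads):
        kv = h // group                        # KV head shared by this query group
        scores = q[h] @ k[kv].T / np.sqrt(d)   # (seq, seq) scaled dot products
        scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
        w = np.exp(scores)
        w /= w.sum(axis=-1, keepdims=True)
        out[h] = w @ v[kv]                     # weighted sum of shared values
    return out

# Toy example: 8 query heads sharing 2 KV heads (group size 4).
rng = np.random.default_rng(0)
q = rng.normal(size=(8, 4, 16))
k = rng.normal(size=(2, 4, 16))
v = rng.normal(size=(2, 4, 16))
out = grouped_query_attention(q, k, v, num_kv_heads=2)
print(out.shape)  # (8, 4, 16): full query head count, but only 2 KV heads cached
```

With standard multi-head attention the KV cache here would hold 8 heads; with GQA it holds 2, a 4x reduction at serving time.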

3. Strong conversational abilities:

The Qwen1.5-32B-Chat variant benefits from post-training techniques, in particular RLHF (Reinforcement Learning from Human Feedback). These significantly enhance its dialogue capabilities, allowing Qwen1.5-32B-Chat to provide a more natural and fluent conversational experience in chat applications.

4. Competitive performance:

Compared with other models of roughly 30 billion parameters, Qwen1.5-32B demonstrates competitive results across multiple benchmarks, including language understanding, generation, and multilingual evaluation in several domains. Although it trails larger models such as Qwen1.5-72B, Qwen1.5-32B still outperforms other models of similar size in most tasks.

5. Multilingual support:

Qwen1.5-32B was evaluated in 12 languages, including Arabic, Spanish, and French, demonstrating solid multilingual understanding and generation. This makes it a versatile language model that can adapt to different language environments and needs.

6. Optimized memory footprint and speed:

Compared to larger models such as Qwen1.5-72B, Qwen1.5-32B requires less memory and runs faster. This makes it practical to deploy a high-performance language model in resource-constrained environments while reducing running costs.
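As a rough illustration of why the smaller model is cheaper to serve, the weight-only memory footprint can be estimated from the parameter count and numeric precision. This back-of-the-envelope sketch ignores the KV cache and activations, and the parameter counts are approximations taken from the model names:

```python
def model_memory_gib(num_params, bytes_per_param):
    """Rough weight-only memory estimate (ignores KV cache and activations)."""
    return num_params * bytes_per_param / 2**30

# Approximate parameter counts inferred from the model names.
MODELS = [("Qwen1.5-32B", 32e9), ("Qwen1.5-72B", 72e9)]
PRECISIONS = [("fp16/bf16", 2), ("int8", 1), ("int4", 0.5)]

for name, params in MODELS:
    for dtype, nbytes in PRECISIONS:
        print(f"{name} @ {dtype}: ~{model_memory_gib(params, nbytes):.0f} GiB")
```

At 16-bit precision the 32B weights need roughly 60 GiB versus roughly 134 GiB for the 72B model, which is the gap between a two-GPU and a multi-node deployment in many setups.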

The launch of Qwen1.5-32B offers an attractive option for applications that need a balance between strong performance and resource efficiency. Its technical optimizations and multilingual support allow it to play an important role in diverse scenarios, especially those that must process large amounts of information quickly and efficiently.

Blog: http://qwenlm.github.io/blog/qwen1.5/
GitHub: http://github.com/QwenLM/Qwen1.5
HF: http://huggingface.co/Qwen
Demo: https://huggingface.co/spaces/Qwen/Qwen1.5-32B-Chat-demo
