Huawei Claims Better AI Training Method Than DeepSeek Using Its Own Chips
In the rapidly evolving landscape of artificial intelligence (AI), companies are constantly pushing the boundaries of innovation. Recently, Huawei claimed that its AI training method surpasses the approach used by DeepSeek, a prominent Chinese AI developer, by leveraging Huawei's proprietary Ascend chips. The claim is particularly noteworthy given ongoing geopolitical tensions and export restrictions on AI technology, which have significantly limited Huawei's access to advanced chips from companies like Nvidia. Let's dive into the details of this claim and its implications for the AI industry.
Background: Huawei's Challenges and Innovations
Huawei, a Chinese technology giant, has faced significant challenges in recent years due to U.S. sanctions that restrict its access to advanced semiconductors and AI chips. Despite these hurdles, Huawei has continued to innovate, focusing on developing its own AI hardware and software solutions. The company's large language model, Pangu, has been at the forefront of this effort, utilizing Huawei's proprietary Ascend chips to improve AI training efficiency.
DeepSeek's Mixture of Experts (MoE) Approach
DeepSeek's models employ the Mixture of Experts (MoE) technique, in which a routing network activates only a small set of specialized "experts" for each token rather than the full model. This reduces unnecessary computation while maintaining strong performance. However, MoE routing can activate the experts unevenly, and when the experts are spread across multiple devices running in parallel, the busiest device becomes a bottleneck[1][5].
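To make the load-imbalance point concrete, here is a minimal sketch of conventional top-k MoE routing in PyTorch. It illustrates the general technique, not DeepSeek's actual code; the dimensions, expert count, and top-k value are all made-up assumptions.

```python
# Minimal sketch of standard top-k MoE routing (illustrative, not DeepSeek's code).
import torch

num_tokens, hidden_dim = 8, 16   # assumed toy dimensions
num_experts, top_k = 8, 2        # assumed configuration

x = torch.randn(num_tokens, hidden_dim)            # token activations
router = torch.nn.Linear(hidden_dim, num_experts)  # gating network

# Each token is scored against every expert and routed to its top-k choices.
logits = router(x)                                 # (num_tokens, num_experts)
weights, chosen = torch.topk(logits.softmax(dim=-1), top_k, dim=-1)

# Nothing constrains how many tokens each expert receives, so the
# per-expert load histogram is typically uneven.
load = torch.bincount(chosen.flatten(), minlength=num_experts)
print("tokens per expert:", load.tolist())
```

Run a few times, the printed histogram shows some experts receiving several tokens and others none; in a multi-device deployment, that skew translates directly into idle hardware.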
Huawei's Mixture of Grouped Experts (MoGE) Innovation
Huawei's Pangu team has introduced an upgraded version of the MoE technique called Mixture of Grouped Experts (MoGE). Rather than letting each token choose freely among all experts, MoGE partitions the experts into groups and constrains each token to select from every group, so each group receives an equal share of the work by construction. This addresses the load imbalance inherent in traditional MoE when experts are sharded across devices, improving training efficiency and, Huawei claims, overall model performance[1].
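The grouped-selection idea can be sketched in a few lines. The following is a hypothetical rendering under the assumption that MoGE selects a fixed number of experts within each group per token; the group count and sizes are invented for illustration, and this is not Huawei's implementation.

```python
# Sketch of grouped expert selection in the spirit of MoGE
# (an illustration of the grouping idea, not Huawei's code).
import torch

num_tokens, hidden_dim = 8, 16
num_groups, experts_per_group, k_per_group = 4, 2, 1  # assumed configuration

x = torch.randn(num_tokens, hidden_dim)
router = torch.nn.Linear(hidden_dim, num_groups * experts_per_group)

logits = router(x).view(num_tokens, num_groups, experts_per_group)
probs = logits.softmax(dim=-1)

# Top-k selection happens *within* each group, not across all experts.
weights, local_idx = torch.topk(probs, k_per_group, dim=-1)

# Convert group-local indices to global expert ids.
group_offsets = torch.arange(num_groups).view(1, -1, 1) * experts_per_group
chosen = (local_idx + group_offsets).flatten(1)

load = torch.bincount(chosen.flatten(), minlength=num_groups * experts_per_group)
print("tokens per expert:", load.tolist())
```

Because every token contributes exactly k_per_group selections to each group, mapping one group to one device guarantees identical per-device token counts regardless of what the router learns.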
Technical Comparison and Implications
| Feature | DeepSeek (MoE) | Huawei (MoGE) |
| --- | --- | --- |
| Architecture | Mixture of Experts (MoE) with Multi-Head Latent Attention (MLA) | Mixture of Grouped Experts (MoGE) |
| Efficiency | Activates only the necessary experts for each token | Groups experts so workload is balanced across groups |
| Performance | Comparable to top American models, with efficient computation | Claimed to outperform DeepSeek by optimizing expert activation |
| Hardware utilization | Can be bottlenecked by uneven load across devices | Optimized for Ascend chips, avoiding cross-device imbalance by design |
The table highlights the technical differences between DeepSeek's MoE approach and Huawei's MoGE innovation. While DeepSeek is known for its efficiency and scalability, Huawei's method aims to further enhance these aspects by optimizing expert workload distribution.
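One simple way to quantify the difference the table describes is an imbalance metric over per-device token counts. The load vectors below are illustrative numbers, not measurements from either system.

```python
import torch

def imbalance(load: torch.Tensor) -> float:
    # Max-over-mean ratio: 1.0 means perfectly balanced devices;
    # higher values mean the busiest device is a bottleneck.
    return (load.max() / load.float().mean()).item()

# Illustrative per-device loads: ungrouped top-k routing is often skewed,
# while grouped selection is equal across groups by construction.
moe_load = torch.tensor([7, 1, 3, 5])
moge_load = torch.tensor([4, 4, 4, 4])
print(imbalance(moe_load), imbalance(moge_load))  # 1.75 vs 1.0
```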
Future Implications and Real-World Applications
The implications of Huawei's MoGE innovation are significant, particularly in the context of ongoing U.S.-China tensions over AI technology. As Chinese companies like Huawei continue to develop their own AI ecosystems, they are reducing their reliance on U.S. technology. This shift could have profound effects on the global AI landscape, potentially leading to a more decentralized market where different regions develop their own AI solutions.
In real-world applications, more efficient AI training methods like MoGE could lead to faster development and deployment of AI models in various industries, such as healthcare, finance, and education. For instance, improved language models could enhance chatbots and virtual assistants, making them more responsive and effective.
Conclusion
Huawei's claim that MoGE surpasses DeepSeek's approach marks a notable moment in AI innovation. By leveraging its own hardware and software stack, Huawei is positioning itself as a leader in the AI race despite international restrictions. As AI continues to evolve, innovations like MoGE will play a crucial role in shaping the future of AI technology and its applications worldwide.
Excerpt
Huawei introduces MoGE, claiming to beat DeepSeek's AI training efficiency through balanced expert workload distribution on its Ascend chips.
Tags
- artificial-intelligence
- machine-learning
- large-language-models
- ai-training
- huawei
- deepseek
Category
- artificial-intelligence