DeepSeek V4 to Run on Huawei Chips, Sidelining Nvidia

DeepSeek will optimize its next frontier model, V4, to run on Huawei-manufactured chips rather than Nvidia hardware. The Information reported the shift on April 3, citing five people with direct knowledge of chip purchase orders from China’s largest technology companies. If the report holds, the trillion-parameter model would be the first frontier system that a top-tier AI lab has built entirely around Chinese-made silicon, a milestone with major implications for the global AI chip market.

Why It Matters

China’s three largest internet platforms – Alibaba, ByteDance, and Tencent – have placed orders for hundreds of thousands of Huawei AI accelerators. ByteDance alone plans to spend over $5.6 billion on Huawei Ascend chips this year. These commitments, confirmed independently by Reuters, mark the most aggressive collective push yet toward a domestic AI hardware supply chain free of American components.

DeepSeek worked closely with Huawei and Cambricon Technologies over several months to rewrite key parts of V4’s codebase for compatibility with Ascend silicon. The company also gave Huawei early access to V4 for performance tuning, a privilege it did not offer Nvidia or AMD. V4 builds on the V3 architecture, released in late 2024, which shocked the industry by matching leading Western models at a fraction of typical training budgets. The new model reportedly scales to roughly one trillion parameters using a mixture-of-experts design with native multimodal capabilities.

On the hardware side, Huawei’s newest accelerator, the Ascend 950PR, reportedly delivers 2.8 times the performance of Nvidia’s H20, the most powerful chip currently cleared for export to China. The 950PR uses Huawei’s proprietary HiBL 1.0 memory and ships with CANN Next, a software stack designed to closely replicate the CUDA programming environment developers already know. Huawei plans to manufacture 750,000 units this year, priced at roughly $6,900 per chip, a steep discount compared to Nvidia’s H100.
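
For a sense of scale, the figures quoted above can be combined in a rough back-of-envelope calculation. The sketch below uses only numbers from this article; the function and constant names are illustrative, and it assumes, purely for the sake of arithmetic, that every chip sells at the quoted unit price and that ByteDance's budget goes entirely to chips at that price.

```python
# Back-of-envelope arithmetic using only figures quoted in this article.
# Assumption: all units sell at the quoted per-chip price (no volume
# discounts, no system or networking costs). Names are illustrative.

UNIT_PRICE_USD = 6_900                 # reported price per Ascend 950PR
PLANNED_UNITS = 750_000                # reported production plan for the year
BYTEDANCE_BUDGET_USD = 5_600_000_000   # reported ByteDance spend on Ascend chips


def production_value(units: int, unit_price: int) -> int:
    """Total value of a production run at a flat unit price."""
    return units * unit_price


def units_affordable(budget: int, unit_price: int) -> int:
    """Whole chips a budget would cover at a flat unit price."""
    return budget // unit_price


if __name__ == "__main__":
    value = production_value(PLANNED_UNITS, UNIT_PRICE_USD)
    chips = units_affordable(BYTEDANCE_BUDGET_USD, UNIT_PRICE_USD)
    print(f"Planned 950PR output at list price: ${value:,}")
    print(f"Chips ByteDance's budget would cover at that price: {chips:,}")
```

Under these assumptions, the planned 950PR run is worth about $5.2 billion, slightly less than ByteDance's reported budget alone. The likeliest reading is that the reported spending also covers other Ascend models, complete systems, or deliveries beyond this year, though the article's sources do not say.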

What’s Next for the Industry

A successful V4 launch would challenge a foundational assumption behind US export controls: that restricting sales of advanced Nvidia chips can slow China’s AI progress. DeepSeek’s V3 and the R1 reasoning model built on it already triggered a 17 percent single-day drop in Nvidia shares in January 2025 by demonstrating competitive performance on cheaper hardware. If V4 delivers similar results on fully domestic silicon, the strategic logic of those restrictions weakens considerably.

The global AI hardware market is splitting along geopolitical lines. Western companies build on Nvidia’s CUDA ecosystem, while Chinese firms increasingly adopt Huawei Ascend and the CANN framework. Each model optimized for Ascend lowers the cost of the next migration, a self-reinforcing cycle that could permanently divide AI infrastructure into rival standards. ByteDance and Alibaba received 950PR samples in January and have run production benchmarks for two months, a sign that adoption has moved well past the pilot stage.

DeepSeek has not announced a release date, though sources told The Information that V4 will likely arrive within weeks. Previous timelines reportedly slipped because of training instability on Ascend hardware, a known limitation that forced DeepSeek back to Nvidia GPUs for its R2 model. Whether DeepSeek has closed that reliability gap will determine whether China’s chip ecosystem earns broader trust among AI developers worldwide.

The race to build frontier AI without Nvidia is no longer theoretical. If V4 delivers, it will show that a frontier-scale model can run on chips designed and fabricated within China, reshaping competition across the entire AI chip industry.

Sources: The Information · Reuters · CNBC · Tom’s Hardware