Huawei unveils new Atlas 350 AI accelerator with 1.56 PFLOPS of FP4 compute and up to 112GB of HBM — claims 2.8x more performance than Nvidia's H20
Built on the Ascend 950PR silicon.
China's mission to become entirely self-reliant in the field of artificial intelligence has reached a new milestone. At the Huawei China Partner Conference 2026 in Shenzhen, the company unveiled its latest AI accelerator: the Atlas 350. This new NPU is based on an in-house Ascend 950PR chip, representing a significant upgrade over the last-gen Ascend 910-class silicon.
Huawei is marketing the Atlas 350 as a high-efficiency workhorse designed for the prefill stage of AI inference. As such, it delivers 1.56 PFLOPS of FP4 throughput, which Huawei claims is 2.87 times higher than Nvidia's China-only H20. That figure can't be directly verified, because Hopper-era cards don't support FP4 natively; the Atlas 350 is the first homegrown Chinese accelerator to be optimized for FP4 precision.
That's already a significant achievement, because even Nvidia only recently started supporting the format with its Blackwell GPUs. FP4 stores each weight in just four bits, so larger models can be deployed within the same memory footprint. Speaking of which, the Atlas 350 comes with 112GB of Huawei's proprietary HBM, known as "HiBL 1.0."
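To put that capacity in perspective, here is a rough back-of-envelope sketch (our own estimate, not an official Huawei figure) of how many model parameters fit into 112GB of HBM at different weight precisions, ignoring the KV cache, activations, and framework overhead:

```python
# Rough capacity estimate: parameters that fit in 112GB of HBM per precision.
# Assumes decimal gigabytes and counts weight storage only (no KV cache,
# activations, or runtime overhead), so real-world capacity is lower.

HBM_BYTES = 112 * 10**9  # Atlas 350's stated 112GB of HiBL 1.0 memory

BYTES_PER_PARAM = {
    "FP16": 2.0,  # 16-bit weights
    "FP8": 1.0,   # 8-bit weights
    "FP4": 0.5,   # 4-bit weights, the format the Atlas 350 is optimized for
}

for fmt, size in BYTES_PER_PARAM.items():
    params_billions = HBM_BYTES / size / 10**9
    print(f"{fmt}: ~{params_billions:.0f}B parameters")
```

In this simplified view, moving from FP8 to FP4 doubles the parameter count that fits on a single card, which is the core appeal of the format for serving large models.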
Even though the Ascend 950PR itself features 128GB of memory with 1.6 TB/s of bandwidth, current reports say the Atlas 350 maxes out at 1.4 TB/s. The memory access granularity has been reduced from 512 bytes to just 128 bytes. The card also supports 2 TB/s of interconnect bandwidth via the new LingQu protocol, 2.5x higher than the previous Ascend 910 series. The Atlas 350 is rated at 600W, 200W more than the H20.
Those specs paint an impressive picture for a homegrown chip, especially one made with U.S. sanctions in place. Huawei is not allowed to access TSMC's CoWoS tech that Nvidia uses to stack HBM near the GPU, so the company is leveraging some other form of advanced packaging. The memory itself is developed in-house and is meant to compete with the likes of SK Hynix and Micron, though the actual supplier hasn't been disclosed.
Precise availability wasn't announced — it rarely is with AI accelerators — but Huawei has kept its prior promise of a Q1 2026 release for the Ascend 950PR. BigGo Finance says the NPU is priced at 111,000 yuan (~$16,000) versus Nvidia's H20, which can go for anywhere between $15,000 and $25,000 in the region. Street pricing doesn't really exist for AI GPUs, so take this particular bit with a grain of salt.
There are a lot more Ascend chips in the pipeline, which we've covered in a dedicated article before. Despite the ambition to gain independence from foreign hardware, Chinese companies still source Nvidia GPUs (and not the nerfed ones), which makes sense given that local silicon isn't quite as competitive yet and the CUDA software stack is far more mature. Huawei's latest efforts, therefore, represent a serious step toward bridging that gap.
