Chinese Firms Foil US AI Sanctions With Older GPUs, Software Tweaks

After Chinese companies lost access to Nvidia's leading-edge A100 and H100 compute GPUs, which are used to train AI models, they had to find ways to train those models without the most advanced hardware. To compensate for the lack of powerful GPUs, Chinese AI developers are simplifying their programs to reduce compute requirements and combining whatever compute hardware they can obtain, the Wall Street Journal reports.

Nvidia cannot sell its A100 and H100 compute GPUs to Chinese entities like Alibaba or Baidu without getting an export license from the U.S. Department of Commerce (and any application would almost certainly be denied). So Nvidia has developed A800 and H800 processors that offer reduced performance and come with handicapped NVLink capabilities, which limit the ability to build the high-performance multi-GPU systems traditionally required to train large-scale AI models.

Because of high costs and the inability to obtain all the GPUs they need, Chinese companies have designed methods to train large-scale AI models across different chip types, something that U.S.-based companies rarely do due to technical challenges and reliability concerns. For example, companies like Alibaba, Baidu, and Huawei have explored using combinations of Nvidia's A100s, V100s, and P100s, and Huawei's Ascends, according to research papers reviewed by the WSJ.

"If it works well, they can effectively circumvent the sanctions," Dylan Patel, chief analyst at SemiAnalysis, is reported to have said.

Anton Shilov
Contributing Writer
  • bit_user
    Meanwhile, a paper published last year by Baidu and Peng Cheng Laboratory demonstrated that researchers were training large language models using a method that could render the additional feature irrelevant.
    Does anyone know which paper that is?