Price of Nvidia's Vera Rubin NVL72 racks skyrockets to as much as $8.8 million apiece, but server makers' margins will be tight — Nvidia is moving closer to shipping entire full-scale systems
Nvidia and other chipmakers will still make plenty of cash.
Prices of Nvidia's Rubin-based rack-scale solutions will rise compared with existing Blackwell-powered racks, and then skyrocket with Rubin Ultra, which doubles the number of GPU packages per rack. This clearly benefits Nvidia's bottom line, but margins for its partners are shrinking, according to DigiTimes. That is not only because it is hard to maintain a 10% margin on items that cost millions, but also because Nvidia keeps reducing the share of the final BOM that server makers and systems integrators are responsible for.
Millions per rack-scale system
Depending on configuration (and who you ask), a Blackwell-based NVL72 rack-scale system costs $2.8 million to $3.4 million for an AI training and HPC-optimized GB200 NVL72 and $6 million to $6.5 million for an AI inference-optimized GB300 NVL72 system, according to a 3DTested source with knowledge of the matter.
Meanwhile, Vera Rubin-based VR200 NVL72 systems are currently quoted at $5 million to $7 million per unit (though these systems include about $1 million of 3D NAND storage), the same source indicated. While some quotes circulating online have merit, keep in mind that many of them represent only the ODM price without a proper warranty, according to the source.
Few companies are currently getting quotes for NVL144 VR300 (Vera Rubin Ultra) systems, because the Rubin Ultra silicon has not taped out yet (according to Nvidia CEO Jensen Huang) and actual systems are not due until sometime in the second half of 2028. Still, reports put NVL144 VR300 rack-scale systems anywhere from $7 million to $8.8 million. Nvidia has never confirmed the list prices of its NVL72 or NVL144 products.
The spread between these figures shows that any specific number should be taken with caution, but the order of magnitude is clear: these are systems costing millions, not hundreds of thousands.
Such increases are generally good for Nvidia and its key silicon suppliers, from makers of tiny retimers like Astera Labs (which supplies hundreds of them per rack) to large HBM4 suppliers like Micron and SK hynix. Apart from the rising cost of Nvidia's own silicon at TSMC and more expensive HBM4 memory, there are many other reasons why Vera Rubin machines cost more than Blackwell systems.
The new systems are considerably more complex than their predecessors: they are more power-hungry and require more expensive components. More importantly, since all Vera Rubin-based VR200 NVL72 systems come with storage, about a million dollars per rack goes to 3D NAND memory makers.
Actual server makers struggle to remain profitable
But while Nvidia and other chipmakers make huge profits on AI, the same cannot be said for the actual system suppliers: Nvidia reportedly intends to supply its partners with fully assembled Level-10 (L10) compute trays, effectively taking over most of the server's core design and manufacturing.
Instead of shipping individual components, Nvidia reportedly plans to deliver pre-built tray modules that include the Vera CPU, Rubin GPUs, memory, networking, power delivery, and liquid cooling. This goes beyond its earlier GB200 approach (L7–L8 integration) and would standardize nearly the entire compute subsystem. Since these trays could account for roughly 90% of a server's cost, ODMs and hyperscalers would no longer design motherboards or cooling systems themselves, focusing instead on rack-level integration, power infrastructure, cooling distribution (e.g., CDUs), and management software. While this certainly reduces their engineering burden, it also reduces their margins, which they are naturally unhappy about. Beyond margins, ODMs lose the ability to differentiate and innovate beyond what Nvidia offers, leaving them to compete on price at the expense of their own profits.
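A back-of-the-envelope sketch makes the squeeze concrete. The $8 million rack price, the 50% component-scenario share, and the exact margin rate below are illustrative assumptions, not confirmed figures; only the ~90% tray share and the ~10% margin come from the reporting above.

```python
# Rough illustration of how pre-built L10 trays shrink the slice of a
# rack's value that an ODM actually builds (and can therefore mark up).
# All figures are illustrative assumptions, not confirmed pricing.

RACK_PRICE = 8_000_000   # assumed sale price of one rack-scale system, USD
ODM_MARGIN_RATE = 0.10   # the ~10% margin the article says is hard to hold

def odm_value_added(rack_price: float, nvidia_tray_share: float) -> float:
    """Portion of the rack's cost the ODM builds itself."""
    return rack_price * (1.0 - nvidia_tray_share)

# Hypothetical component-based scenario: ODM builds roughly half the rack.
loose_parts = odm_value_added(RACK_PRICE, nvidia_tray_share=0.50)
# L10-tray scenario: Nvidia-supplied trays cover ~90% of the rack's cost.
l10_trays = odm_value_added(RACK_PRICE, nvidia_tray_share=0.90)

print(f"ODM-addressable value (components): ${loose_parts:,.0f}")
print(f"ODM-addressable value (L10 trays):  ${l10_trays:,.0f}")
print(f"10% margin on the tray scenario:    ${l10_trays * ODM_MARGIN_RATE:,.0f}")
```

Under these assumptions, the ODM's markup base falls from roughly $4 million to $800,000 per rack, so even a held 10% margin translates into far fewer absolute dollars.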
Shipping pre-assembled trays to server makers could accelerate deployment timelines and reduce development costs, as Nvidia would rely on large EMS partners (likely Foxconn, Quanta, or Wistron) to mass-produce these complex systems. This is particularly relevant given the increasing engineering difficulty of next-gen hardware, including very thick PCBs, high-density designs, as well as rising GPU power consumption — reportedly scaling from 1.4 kW (Blackwell Ultra) to 1.8 kW and potentially higher, which drives the need for tightly integrated cooling solutions.
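As a rough sanity check on those power figures, the per-GPU numbers can be scaled to a full NVL72 rack. Treating the quoted per-GPU draw as the only load is a simplification for this sketch; CPUs, networking, and cooling overhead would push real rack power higher.

```python
# Back-of-the-envelope GPU-only rack power from the per-GPU figures cited
# in the article (1.4 kW for Blackwell Ultra, 1.8 kW reported for Rubin).
# Ignores CPU, networking, and cooling overhead - a deliberate simplification.

GPUS_PER_RACK = 72  # NVL72 configuration

def gpu_rack_power_kw(gpus: int, per_gpu_kw: float) -> float:
    """GPU-only power draw for one rack, in kilowatts."""
    return gpus * per_gpu_kw

blackwell_ultra = gpu_rack_power_kw(GPUS_PER_RACK, 1.4)
rubin = gpu_rack_power_kw(GPUS_PER_RACK, 1.8)

print(f"GPU power, 72 x 1.4 kW: {blackwell_ultra:.1f} kW")
print(f"GPU power, 72 x 1.8 kW: {rubin:.1f} kW")
```

That is roughly 100 kW versus 130 kW of GPU load alone per rack, which is why the later talk of 800V distribution and megawatt-class racks follows naturally.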
However, the shift would also redefine the role of ODMs, turning them from system designers into integrators and service providers, while increasing Nvidia's control over the value chain as well as its margins.
Going forward, this strategy could extend further into rack-scale systems like Kyber NVL576 (the Kyber chassis doubles the number of GPU packages while also doubling the number of compute chiplets per package), especially as the industry moves toward 800V data center architectures and megawatt-class racks. That raises the odds that Nvidia eventually expands its control beyond trays to full rack-level integration, reducing the role of ODMs even further and shrinking their margins even more.