The Blackwell GB300 NVL72 established a new baseline for AI rack cooling: 142 kW per rack, five-layer direct liquid cooling, roughly 450 quick disconnects per rack, 30–35°C CDU supply water. Vera Rubin NVL72 discards most of those constraints. It operates at 45°C supply temperature, eliminates fans and hoses from the tray design entirely, liquid-cools the busbar, and draws 187–227 kW per rack. Each of those changes cascades through the entire facility infrastructure stack — and through the commercial ecosystem that supplies it.
Every generation of NVIDIA rack-scale systems has pushed cooling harder than the previous one. But Vera Rubin NVL72 is not simply an incremented version of Blackwell GB300 with a higher thermal budget. It makes several discontinuous changes to the rack architecture that have substantive downstream consequences for data center facilities, the cooling supply chain, and the operational model of AI cluster operators.
The changes are worth enumerating precisely before analyzing each one, because they do not all point in the same direction. Some make cooling harder — the higher power budget, the busbar that now needs liquid cooling, the 100% liquid-cooled requirement. Some make it meaningfully easier — the 45°C supply temperature, the elimination of fans and hoses, the dramatically reduced assembly and failure-point count. Understanding which changes belong to which category determines whether Vera Rubin is a net increase or net decrease in cooling complexity. The answer is: it is simultaneously both, in different parts of the stack.
| Parameter | Blackwell GB300 NVL72 | Vera Rubin NVL72 | Direction |
|---|---|---|---|
| Rack peak TDP | ~142 kW | 187–227 kW (MaxQ/MaxP) | Harder (more heat per rack) |
| CDU supply temperature | 30–35°C | 45°C max | Easier (warmer water = simpler facility) |
| Tray cooling method | Hose + QD + fan (hybrid) | PCB midplane blind-mate, 100% liquid, no fans | Simpler |
| Air-cooled components | OSFP, NVMe, PDB | None (all liquid) | Simpler |
| Busbar cooling | Air-cooled | Liquid-cooled | Harder (new liquid-cooled component) |
| QD count per rack | ~450 | Reduced (PCB midplane replaces tray QDs) | Simpler (fewer failure points) |
| Assembly time per tray | ~100 minutes | ~5.5 minutes (18× reduction) | Simpler |
| NVLink switch tray hot-swap | Requires shutdown | Zero-downtime maintenance | Simpler |
| Free-cooling viability | Requires <30°C ambient | Viable at ambient up to ~40°C | Much easier |
The table shows a clear pattern: Vera Rubin trades higher absolute power and one new liquid-cooled component (the busbar) for a substantially simpler mechanical architecture everywhere else. The 45°C supply temperature change is the single most consequential operational shift — it is the lever that makes the entire facilities simplification possible.
The supply temperature for the cooling fluid entering a rack CDU is not just a thermal engineering parameter. It determines which facility-level cooling technologies are viable, which climates can achieve free cooling, and how much of a data center's total power budget gets consumed by the cooling plant rather than by compute.
Blackwell GB300 NVL72 requires coolant supply temperatures of 30–35°C at the CDU input. To deliver 30°C supply water to a rack, the facility chilled water plant — the chiller, the cooling towers, and the distribution piping — typically needs to maintain the primary loop at 18–22°C, accounting for the heat exchanger approach temperature in the CDU. This means running mechanical compression chillers essentially continuously. Compression chillers consume roughly 0.3–0.5 kW of facility power for every kW of heat they remove. At 7 MW of rack heat load (50 GB300 NVL72 racks), the chiller plant alone consumes an additional 2–3.5 MW.
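As a sanity check on those numbers, here is a minimal sketch of the arithmetic, using only the rack TDP and chiller-efficiency figures quoted above:

```python
# Chiller plant power draw for a GB300-class deployment (figures from the text).
RACK_KW = 142                        # GB300 NVL72 peak rack TDP
NUM_RACKS = 50
CHILLER_KW_PER_KW_HEAT = (0.3, 0.5)  # facility kW consumed per kW of heat removed

heat_load_kw = RACK_KW * NUM_RACKS   # ~7,100 kW of rack heat
low, high = (heat_load_kw * r for r in CHILLER_KW_PER_KW_HEAT)
print(f"Rack heat load: {heat_load_kw / 1000:.1f} MW")
print(f"Chiller plant draw: {low / 1000:.1f}-{high / 1000:.1f} MW")
```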
Vera Rubin NVL72 raises the supply temperature limit to 45°C. To understand why this is significant, consider the physics of heat rejection. The fundamental constraint on free cooling — rejecting heat directly to ambient air through a dry cooler or fluid cooler without a mechanical refrigeration cycle — is that the supply water must be cooler than the ambient air entering the cooler. If ambient air is at 35°C and the water is at 30°C, the cooler cannot transfer heat from water to air. But if the water is at 45°C and ambient air is at 35°C, free cooling works.
A 10°C increase in acceptable supply temperature transforms free-cooling viability from a seasonal privilege available in cold climates to a year-round operational norm in most temperate and subtropical data center locations. For a 100 MW AI factory, this is the difference between spending $30M per year on chiller power and spending near zero.
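A back-of-the-envelope version of that $30M figure; the electricity price below is an assumed industrial rate, not a figure from NVIDIA or any facility data:

```python
# Rough annual chiller energy cost for a 100 MW cluster.
IT_LOAD_MW = 100
CHILLER_KW_PER_KW_HEAT = 0.35   # midpoint of the 0.3-0.5 range above
HOURS_PER_YEAR = 8760
USD_PER_KWH = 0.10              # assumed industrial electricity price

chiller_mw = IT_LOAD_MW * CHILLER_KW_PER_KW_HEAT
annual_cost_usd = chiller_mw * 1000 * HOURS_PER_YEAR * USD_PER_KWH
print(f"Chiller draw: {chiller_mw:.0f} MW")
print(f"Annual chiller energy cost: ${annual_cost_usd / 1e6:.0f}M")
```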
The arithmetic of facility efficiency savings is stark. NVIDIA's own documentation states that operating at 45°C enables data centers in many climates to use ambient air and closed-loop dry coolers for cooling, reducing the need for compressors, driving down PUE, and unlocking larger energy budgets for compute — specifically enough to allocate up to 10% additional Vera Rubin NVL72 racks for more token generation in the same power budget. For a 100 MW cluster, that 10% headroom translates to roughly 10 MW of additional compute capacity — essentially the equivalent of another 40–50 Vera Rubin racks — obtained purely from the facility efficiency improvement, without any increase in total power budget.
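The rack count is simple division; the raw numbers land slightly above the rough 40–50 figure, which leaves margin for non-rack overhead:

```python
# Racks recoverable from the 10% efficiency headroom (raw division, before any
# allowance for non-rack overhead such as networking or storage).
headroom_kw = 0.10 * 100 * 1000   # 10% of a 100 MW budget
racks_maxp = headroom_kw / 227    # MaxP rack TDP
racks_maxq = headroom_kw / 187    # MaxQ rack TDP
print(f"~{racks_maxp:.0f} (MaxP) to {racks_maxq:.0f} (MaxQ) additional racks")
```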
The supply temperature increase also cascades into the CDU design. A CDU serving a 35°C supply system needs internal heat exchangers and control systems tuned for a narrow approach temperature. A CDU serving a 45°C supply system has more thermal headroom, which means it can operate with less precise coolant conditioning, simpler controls, and in some configurations can tolerate warm-water supply from a dry cooler with no heat exchanger at all — a direct-to-chip circuit using facility water as both primary and secondary loop. This is the warm-water direct liquid cooling architecture, and it is the most operationally simple cooling scenario possible for a high-density AI rack.
The Vera Rubin NVL72 compute tray makes a structural bet that will affect the cooling industry far beyond this product generation. It eliminates fans, hoses, and internal cables from the tray entirely. The only liquid hoses that remain are the supply and return connections at the rack level. Inside the rack, compute trays connect to the cooling and power distribution system via blind-mate PCB connectors when they are physically inserted — no manual coupling required.
This is a departure from the GB300 NVL72, where each compute tray required roughly 100 minutes to assemble due to manual cable routing and hose connection. The new design reduces assembly and servicing time by 18×, meaning a single tray can be inserted and operational in approximately 5.5 minutes. At the scale of a data center deploying hundreds of racks — each with 18 compute trays — this is not a marginal serviceability improvement. It is the difference between a deployment operation measured in weeks and one measured in days.
The fan elimination has consequences beyond assembly time. Fans are the highest-failure-rate mechanical component in a server. In a rack with 18 compute trays, each with multiple fans, the probability of at least one fan failure per year of operation is substantial. Fan failures require either shutting down the affected tray (losing 4 GPUs from the cluster's available compute) or operating it at elevated temperature risk while waiting for a service window. Vera Rubin's fan-free design eliminates this failure mode entirely from the compute and switch trays — a reliability improvement that directly translates to higher cluster availability and fewer unplanned maintenance windows.
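To put a number on "substantial", here is an illustrative calculation; the fan count per tray and the per-fan annual failure rate are assumptions, not published figures:

```python
# Illustrative probability of at least one fan failure per rack-year.
TRAYS = 18
FANS_PER_TRAY = 4            # assumption
P_FAN_FAIL_PER_YEAR = 0.02   # assumption: 2% annual failure rate per fan

p_no_failures = (1 - P_FAN_FAIL_PER_YEAR) ** (TRAYS * FANS_PER_TRAY)
print(f"P(>=1 fan failure per rack-year) ~ {1 - p_no_failures:.0%}")  # ~77%
```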
The engineering question implicit in the fan-free design is: where does the residual air cooling for lower-density components like optical modules come from? The answer in Vera Rubin is that the rack-level architecture provides some managed airflow, but all thermally significant components — GPU, CPU, NVSwitch, busbar — are liquid-cooled. The handful of low-power components left on air dissipate so little heat that residual convection within the rack environment is sufficient.
The busbar is the copper conductor that distributes electrical current from the power shelves at the top and bottom of the rack to the compute trays in between. In GB200 and GB300 generations, the busbar carries substantial current — GB300 runs at 2,900 A — but the current density is low enough that the busbar can be air-cooled, dissipating its resistive heating losses passively.
Vera Rubin NVL72 changes this. The busbar in VR NVL72 is rated for 5,000+ amperes, reflecting the higher per-tray power requirements of the Rubin GPU and Vera CPU. At 5,000 A through the same copper cross-section, resistive heating increases by roughly 3× compared to GB300. The busbar now generates significant heat that cannot be managed by passive air cooling in a fan-free rack. NVIDIA's solution is to liquid-cool the busbar itself — a new component category that did not exist in any previous generation of NVIDIA rack-scale systems.
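The roughly 3× figure falls directly out of the I²R relationship; a one-liner to verify it with the currents quoted above:

```python
# Ohmic loss in a fixed copper cross-section scales with the square of current.
I_GB300_A = 2_900
I_VR_A = 5_000
print(f"Busbar I^2 R loss scaling: {(I_VR_A / I_GB300_A) ** 2:.1f}x")  # ~3.0x
```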
A liquid-cooled busbar must satisfy constraints that conventional busbars do not face: it must carry high current while maintaining electrical isolation between the coolant loop and the energized conductor; it must be mechanically robust enough to flex as trays are inserted and removed; and its cooling channels must maintain adequate coolant flow even when only a subset of trays are installed. The manufacturing and qualification requirements for this component are substantially more complex than for a conventional copper busbar.
The Vera Rubin NVL72 has two defined operating modes, referred to as MaxQ and MaxP, with rack-level TDPs of 187 kW and 227 kW respectively. MaxQ represents a power-optimized operating point that trades some performance headroom for lower peak thermal and electrical demand. MaxP is full performance with no power capping.
The two operating modes are meaningful because they allow data center operators to make a deployment decision based on their facility's thermal and electrical headroom. A facility designed around GB300 NVL72 at 142 kW per rack and 35°C supply water will need facility upgrades to support VR NVL72 at 227 kW. But the same facility can likely support VR NVL72 in MaxQ mode at 187 kW with the upgraded CDUs providing 45°C supply — because the higher supply temperature unlocks more cooling capacity from existing infrastructure.
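A toy version of that deployment decision, treating the two mode TDPs from above as the only constraint; the 200 kW per-rack cooling budget in the example is invented for illustration:

```python
# Which VR NVL72 operating mode fits a facility's per-rack cooling budget?
MODE_TDP_KW = {"MaxQ": 187, "MaxP": 227}

def supported_modes(per_rack_cooling_kw: float) -> list[str]:
    """Return the operating modes whose rack TDP fits the available budget."""
    return [mode for mode, tdp in MODE_TDP_KW.items() if tdp <= per_rack_cooling_kw]

print(supported_modes(200.0))  # ['MaxQ'] -> MaxP would need a facility upgrade
```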
The per-rack power numbers for Vera Rubin also include the liquid-cooled busbar's heat contribution, which in previous generations was rejected to the rack airflow. The power shelf design for VR NVL72 uses three-phase 415–480 VAC input stepping down to 50 VDC for the rack busbar, with six 18.3 kW PSUs per shelf at approximately 94% efficiency — about 1.1 kW of heat per PSU as conversion losses, or roughly 7 kW per fully populated shelf, all of which the cooling system must absorb.
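The per-PSU figure follows directly from the stated rating and efficiency; a quick check:

```python
# Per-PSU conversion loss implied by the stated rating and efficiency.
PSU_OUTPUT_KW = 18.3
EFFICIENCY = 0.94

loss_kw = PSU_OUTPUT_KW * (1 / EFFICIENCY - 1)  # input minus output at full load
print(f"~{loss_kw:.2f} kW of heat per PSU")      # ~1.17 kW, i.e. about 1.1 kW
```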
The 45°C supply temperature is the most discussed cooling feature of Vera Rubin in data center design circles, but the discussion is often imprecise about what exactly it enables and what limitations remain. Let us be specific.
A dry cooler or fluid cooler works by passing facility water through a heat exchanger that rejects heat to ambient air. The fundamental constraint is that the fluid temperature must be higher than the ambient air temperature for heat to transfer from fluid to air — at steady state, the supply water temperature will approach the ambient wet-bulb temperature (for cooling towers) or dry-bulb temperature (for dry coolers).
For a Vera Rubin deployment using a dry cooler with 45°C supply temperature, the dry cooler can reject heat effectively whenever ambient dry-bulb temperature is below roughly 40°C, accounting for a typical approach temperature of 5–8°C through the dry cooler coils. This means that a data center in Phoenix, Arizona (design dry-bulb of 43°C) would still need mechanical chillers on the hottest summer days, but could run free cooling for roughly 95% of annual hours. A data center in Dublin, Ireland (design dry-bulb of 23°C) could run free cooling essentially year-round.
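The approach-temperature arithmetic that produces the roughly 40°C threshold, using the figures above:

```python
# Highest ambient dry-bulb at which a dry cooler can hold a 45 degC supply
# setpoint, for the quoted approach-temperature range.
SUPPLY_C = 45.0
APPROACH_RANGE_C = (5.0, 8.0)

for approach in APPROACH_RANGE_C:
    limit = SUPPLY_C - approach
    print(f"Approach {approach:.0f} degC -> free cooling up to ~{limit:.0f} degC ambient")
# Approach 5 degC -> ~40 degC ambient; approach 8 degC -> ~37 degC ambient
```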
The water savings are substantial. Cooling towers reject heat through evaporation — for a 50 MW data center, evaporative cooling at 35°C supply consumes roughly 40–60 million gallons of water per year. Switching to a dry cooler at 45°C supply eliminates that water consumption almost entirely. For data centers located in water-stressed regions — or for operators facing water use restrictions from municipalities — this is not a marginal improvement. It may be the difference between a project being permitted and not.
GB300 NVL72 left three component categories on air cooling: OSFP optical modules, NVMe storage drives, and power distribution boards. Vera Rubin NVL72 targets 100% liquid cooling — meaning all thermally significant components are on the liquid cooling circuit. The remaining air-cooled peripheral components are either eliminated from the tray design or managed through rack-level convection.
The most significant transition is the NVLink switch trays. In GB300, the switch trays required liquid cooling but retained some fan-assisted airflow. In VR NVL72, the switch trays support zero-downtime maintenance — they can be removed while the rack continues operating with the remaining switch trays maintaining reduced-bandwidth NVLink connectivity. This is a direct consequence of moving to fully liquid-cooled, blind-mate switch tray connections: without hoses to disconnect and fans to manage, the physical insertion and removal of a switch tray becomes a hot-plug operation.
The optical modules are the open question. VR NVL72's networking uses ConnectX-9 SuperNICs delivering 1.6 Tb/s per GPU bandwidth and Spectrum-6 switches with co-packaged optics (CPO). CPO integrates optics directly with the switch silicon, which means the optical I/O is cooled as part of the ASIC's liquid cooling circuit rather than as a separate pluggable module. This is a fundamental change in the optical thermal architecture — and it means the per-module cooling budgets that constrained GB300 OSFP thermal management become irrelevant for the CPO-connected portion of the network.
Over 80 MGX ecosystem partners have committed to supporting VR NVL72, including cooling vendors at every layer of the thermal stack. The commercial dynamics are different from the Blackwell wave in several ways.
Vertiv enters the Vera Rubin cycle in a strong position. It co-developed reference architectures for GB200 NVL72 and was the dominant CDU supplier in the first Blackwell deployment wave. For VR NVL72, Vertiv's Liebert XDU CDUs need to be reconfigured for 45°C supply operation, a different operating point from the 30–35°C systems they were optimized for in the GB200 era. Vertiv has the engineering depth to do this quickly, but competitors that designed for warm-water supply from the start have fewer legacy constraints to work around.
Schneider Electric (including Motivair after its October 2024 acquisition) holds the NVIDIA-co-developed reference design for GB300 NVL72 and has been actively engaging on VR NVL72 architecture. Schneider's EcoStruxure IT Design CFD models and ETAP power analysis tools are part of the reference design package — giving data center operators digital twin simulation of Vera Rubin deployments before physical commissioning. The MQTT-based control plane integration with NVIDIA Mission Control, established for GB300, carries forward to VR NVL72.
nVent brings a specific strength to Vera Rubin that was less prominent in Blackwell: its Project Deschutes-inspired CDU design aligns with the warm-water, high-approach-temperature operating point that 45°C supply demands. nVent's modular row and rack CDUs, launched at SC25, are designed for current and future AI chip generations — and Vera Rubin's 45°C requirement is exactly the operating regime those designs are optimized for.
Modine Manufacturing is scaling aggressively into the chiller and dry-cooler layer of the VR facility stack. Its Cooling AI control systems and 1 MW CDUs are positioned for the facility-level heat rejection that Vera Rubin's 187–227 kW racks require. Modine's five-year order backlog and 50–70% projected annual growth in its data center business reflect hyperscaler commitments to Vera Rubin infrastructure.
| Layer | Function | Primary vendors for VR NVL72 | Key change from GB300 |
|---|---|---|---|
| Cold plate | GPU/CPU die-to-liquid | Boyd, AVC, Cooler Master | New geometry for Rubin package; HBM4 hotspot map different from B300 |
| PCB midplane | Blind-mate power + cooling | New category — PCB/connector OEMs | Replaces per-tray hoses and QDs entirely |
| Liquid-cooled busbar | Power distribution with integrated cooling | Early ecosystem — qualification in progress | New component; no equivalent in any prior generation |
| Rack manifold | Rack-level coolant distribution | NVIDIA UQD08 standard + partners | Carries full rack load including busbar; fewer total connection points |
| CDU (in-row / in-rack) | Liquid-to-liquid heat exchange, 45°C optimized | Vertiv, nVent, Motivair (SE), CoolIT, Boyd, Delta | Must be re-optimized for 45°C supply; warm-water designs advantaged |
| Facility cooling | Dry cooler / chiller, free-cooling preferred | Vertiv, Schneider Electric, Modine, Carrier, Airedale | Dry cooler viable year-round in most climates at 45°C; chiller becomes optional |
NVIDIA has disclosed the architecture of its next-generation rack platform, codenamed Kyber, which will succeed Vera Rubin and house up to 576 Rubin Ultra GPUs per rack. Kyber introduces a further architectural discontinuity in the power and cooling stack: 800 VDC distribution rather than the current 48–54 VDC in-rack busbars.
The shift to 800 VDC has direct cooling consequences. Higher voltage means lower current for the same power — at 800 VDC versus 54 VDC, the same 200 kW rack power requires roughly 15× lower current. That dramatically reduces the resistive heating losses in the busbar and power distribution cables. NVIDIA states that 800 VDC allows over 150% more power to be transmitted through the same copper cross-section compared to the current architecture — which means a Kyber rack can potentially eliminate the liquid-cooled busbar that Vera Rubin requires, replacing it with a conventional air-cooled busbar despite operating at even higher total rack power.
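The current and loss ratios are easy to verify; the conductor resistance cancels out of the comparison:

```python
# Current and conductor-loss comparison at the same rack power.
P_W = 200_000
i_54 = P_W / 54    # ~3,700 A at the current in-rack busbar voltage
i_800 = P_W / 800  # 250 A at 800 VDC

print(f"Current ratio: {i_54 / i_800:.0f}x lower at 800 VDC")            # ~15x
print(f"I^2 R loss ratio: {(i_54 / i_800) ** 2:.0f}x lower at 800 VDC")  # ~219x
```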
For the cooling ecosystem, Kyber's 800 VDC shift redistributes the thermal challenge. The busbar stops being a primary cooling load. The GPU packages continue to be direct-liquid-cooled. But the power conversion efficiency at 800 VDC to the GPU supply voltage is different from the 48 VDC conversion chain — and the heat generated by DC-DC conversion in each compute tray needs to be managed through the same liquid cooling loop or through a new dedicated path. The thermal architecture of Kyber is not yet fully public, but the direction of travel is clear: higher voltage, less copper, simpler power routing, and continued pressure on the cold-plate and CDU layers to handle steadily increasing GPU package power density.
For teams designing or retrofitting data center facilities for Vera Rubin NVL72 deployments, the 45°C supply temperature changes several fundamental assumptions that were correct for Blackwell but no longer apply.
First, the chilled water plant sizing calculation changes. GB300 NVL72 at 35°C supply requires chilled water at roughly 18–22°C from the facility plant, demanding year-round mechanical refrigeration at most locations. VR NVL72 at 45°C supply can be served by a dry cooler whenever ambient dry-bulb stays below roughly 40°C — most climates, most of the year — reducing or eliminating the chiller plant entirely. A facility designed specifically for Vera Rubin in a temperate climate can potentially deploy with only a dry cooler and a small mechanical backup chiller for peak summer days — a capital cost reduction of $2–5M per 10 MW of compute capacity compared to a full chiller plant design.
Second, the rack-level plumbing changes. GB300 deployments require trained technicians to make manual hose and QD connections during tray installation. VR NVL72's blind-mate PCB midplane means that a compute tray can be installed by any qualified server technician — not just a cooling specialist. This has supply chain implications for deployment services and for operational staffing at AI factories that need to swap trays rapidly during production operations.
Third, the monitoring strategy changes. In a GB300 deployment, QD failures and hose failures are the primary cooling failure modes requiring active monitoring. In a VR NVL72 deployment, those failure modes are largely eliminated at the tray level, and monitoring attention shifts toward the rack-level manifold connections, the CDU pump health, and the facility supply water temperature. A thermally-aware control plane for VR NVL72 needs different sensor coverage from one designed for GB300.
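As a sketch of what that shift in sensor coverage might look like; every sensor name and setpoint below is hypothetical and does not come from any NVIDIA or vendor specification:

```python
# Hypothetical sketch of how monitoring focus shifts between generations.
GB300_WATCHLIST = {
    "tray_qd_leak_detect": True,        # ~450 QDs per rack dominate failure modes
    "tray_hose_pressure_drop": True,
    "cdu_supply_temp_setpoint_c": 35.0,
}
VR_NVL72_WATCHLIST = {
    "rack_manifold_leak_detect": True,  # tray-level hoses and QDs are gone
    "cdu_pump_health": True,
    "busbar_coolant_flow": True,        # new liquid-cooled component to watch
    "facility_supply_temp_setpoint_c": 45.0,
}

# Sensors that exist only in the VR NVL72 watchlist:
print(sorted(set(VR_NVL72_WATCHLIST) - set(GB300_WATCHLIST)))
```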
The most important insight about Vera Rubin's cooling architecture is that the 45°C supply temperature is not just a thermal engineering specification. It is a facility architecture decision that NVIDIA made at the chip and system design level — by designing the GPU package and HBM to tolerate higher operating temperatures and tighter thermal margins in exchange for a simpler and more efficient facility infrastructure requirement. That tradeoff is deliberate and consequential.
The result is a rack platform that is more thermally demanding in absolute terms — 187–227 kW versus 142 kW — but operationally simpler at every level of the cooling stack except the liquid-cooled busbar, which is genuinely new. Less plumbing inside the rack. Less need for mechanical chillers outside the building. Dramatically less assembly time per tray. Far fewer failure-prone mechanical components at the interfaces between compute and cooling.
What the Vera Rubin cooling architecture tells us about the direction of AI infrastructure design is something that applies beyond this generation: the engineering constraint has shifted from "how do we make the GPU cool enough" to "how do we make the thermal system simple and reliable enough to operate thousands of racks continuously." Those are different optimization targets — and Vera Rubin is the first NVIDIA rack generation where the second target has clearly started to dominate the design.
Manish KL writes about AI infrastructure, memory systems, accelerator architecture, and cooling systems. Related essays: AI Cluster Reliability Beyond Fault-Tolerant Parallelism · The Next AI Cluster Failure Won't Look Like a GPU Failure · The Cooling Stack Is the New Critical Path: How Blackwell GB300 NVL72 Racks Manage 142 kW
© 2026 Manish KL