
Zoey Lee

NVIDIA's New Products Will Drive New Growth in the Liquid Cooling Market

1. NVIDIA Strengthens Its Arsenal! Pushing into the Inference Market with the Launch of LPU Chips

In 2025, NVIDIA entered into a $20 billion exclusive technology-licensing and team-integration partnership with Groq, absorbing the capabilities of Groq’s LPU (Language Processing Unit) inference architecture. The logic behind this move is crystal clear: the AI industry is shifting from being “training-driven” to “inference-driven.” Now that large language models (LLMs) have reached commercial deployment, the true consumers of computing power are the massive volume of real-time inference requests, not the one-off process of model training.

While NVIDIA already holds an absolute advantage in GPU-based training, specialized LPU architectures offer more deterministic latency and a higher performance-per-watt ratio in inference scenarios that demand ultra-low latency and high energy efficiency. By integrating Groq’s technology, NVIDIA can plug the gaps in its inference product line and construct a heterogeneous computing ecosystem in which “GPUs handle training, while specialized inference chips handle deployment.”

Currently, NVIDIA’s key downstream customers, such as Google and AWS, are accelerating their development of proprietary ASIC chips, particularly for AI inference, where customized architectures promise higher energy efficiency and lower costs while gradually reducing their heavy reliance on NVIDIA GPUs. Against this backdrop, NVIDIA’s launch of LPU inference chips is a strategic maneuver with both defensive and offensive elements: on one hand, it fills the gap in NVIDIA’s own portfolio for ultra-low-latency inference; on the other, by building a full-stack “training plus inference” computing ecosystem, it keeps NVIDIA positioned as the core infrastructure provider for cloud service providers and LLM companies, solidifying its leadership in the global AI chip market.

2. LPU Cabinet Structural Breakdown

While traditional servers adhere to a "single-unit-centric" design philosophy, the LPU cabinet marks a complete shift toward a system-level architecture defined by the "entire cabinet as a single unit." In NVIDIA's design paradigm, the cabinet itself is no longer merely a physical enclosure for housing equipment; rather, it functions as a highly integrated compute unit.

A breakdown of the hardware structure reveals that the LPU cabinet exhibits the quintessential characteristics of high density, modularity, and full liquid cooling. Specifically, at the core compute unit level, a single cabinet integrates approximately 256 LPU chips.

These chips are not deployed in a dispersed manner; instead, they are organized into standardized “Trays,” with each Tray housing approximately 32 chips and a complete cabinet typically comprising eight such Trays. This design not only standardizes manufacturing and maintenance but also facilitates scalable expansion.
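As a quick sanity check on the tray math, here is a minimal Python sketch; the figures (32 chips per tray, eight trays per cabinet) come from this article, while the constant names are merely illustrative labels:

```python
# Minimal model of the cabinet hierarchy described above.
# Figures are from the article; names are illustrative, not NVIDIA's.
CHIPS_PER_TRAY = 32
TRAYS_PER_CABINET = 8

chips_per_cabinet = CHIPS_PER_TRAY * TRAYS_PER_CABINET
print(chips_per_cabinet)  # 32 * 8 = 256 LPU chips per cabinet
```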

Second, in terms of interconnect architecture, the interior of the LPU cabinet employs a high-bandwidth, low-latency, fully interconnected network (resembling the NVLink architecture) to integrate the 256 chips into a single supernode under unified scheduling. At the system level, this means the cabinet behaves not as 256 independent acceleration units but as one logical “giant processor,” vastly improving the parallel efficiency and response speed of inference tasks.
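For a sense of why full interconnection is expensive, consider a back-of-the-envelope count of pairwise links; this assumes one dedicated link per chip pair, which is our simplification rather than a disclosed NVIDIA design (real NVLink-style fabrics use switches instead of full meshes):

```python
# Dedicated point-to-point links needed to fully mesh 256 chips.
# A simplification for intuition only; switched fabrics avoid this.
chips = 256
direct_links = chips * (chips - 1) // 2  # n choose 2
print(f"{direct_links:,}")  # 32,640 links
```

The quadratic growth of that number is precisely why rack-scale fabrics rely on switching layers to present the 256 chips as one schedulable supernode.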

Furthermore, in terms of power delivery, LPU cabinets have entered the ultra-high power density range. At approximately 200 kW per cabinet, power draw is an order of magnitude higher than the 10–20 kW of traditional cabinets.

This shift has two direct consequences. First, the power architecture must be fundamentally redesigned, typically around High Voltage Direct Current (HVDC) or a centralized, high-efficiency AC-to-DC supply scheme. Second, power distribution moves from the data center room level down to the cabinet level, demanding more robust and reliable PDU and busbar designs.
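A rough Ohm’s-law calculation illustrates why distribution voltage matters at this density; the 200 kW load is the article’s figure, while the voltage levels are common industry values we assume for illustration:

```python
# Current draw of a 200 kW cabinet at assumed distribution voltages.
# I = P / V; conversion losses ignored for simplicity.
power_w = 200_000
for volts in (48, 400, 800):
    print(f"{volts} V -> {power_w / volts:,.0f} A")
# 48 V  -> 4,167 A  (impractical busbar cross-sections)
# 400 V ->   500 A
# 800 V ->   250 A  (why HVDC distribution is attractive)
```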

Most critically, this entails a transformation of the cooling system. At a 200 kW per-cabinet heat load, air cooling can no longer provide effective thermal management; LPU cabinets have therefore fully embraced a liquid-cooling architecture that is deeply integrated into the hardware design itself.
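A minimal heat-balance sketch shows the scale of the cooling problem; the 200 kW load comes from this article, the water properties are standard, and the 10 K coolant temperature rise is our assumption rather than a published spec:

```python
# Coolant mass flow needed to remove 200 kW with a 10 K rise,
# from the steady-state heat balance  P = m_dot * c_p * dT.
power_w = 200_000   # per-cabinet heat load (article's figure)
c_p     = 4186      # J/(kg*K), specific heat of water
delta_t = 10        # K, assumed inlet-to-outlet temperature rise

m_dot = power_w / (c_p * delta_t)  # kg/s
print(f"{m_dot:.1f} kg/s (~{m_dot * 60:.0f} L/min of water)")
# ~4.8 kg/s, roughly 290 L/min. Air's volumetric heat capacity is
# about 3,500x lower than water's, which is why air cooling fails here.
```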

3. LPU Cabinet Shipment Volume and Per-Cabinet Liquid Cooling Value Estimation

NVIDIA is driving computing infrastructure into a new “whole-cabinet delivery” phase through its LPU cabinets. Based on data released by Zero-One Technology (01.AI) and partial figures disclosed by TSMC and Samsung, LPU chip shipments are projected to reach approximately 4.5 million units between 2026 and 2027. Assuming the high-density configuration of 256 chips per cabinet, this corresponds to a market of approximately 17,500 cabinets.

Large-scale shipments are expected to commence in the second half of 2026, with an estimated 4,300 to 5,000 complete cabinets shipped that year. Volume is then expected to ramp rapidly in 2027, with approximately 13,000 additional cabinets shipped.
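These projections are internally consistent, as a quick cross-check shows (all inputs below are the article’s estimates, not independent data):

```python
# Cross-checking the cabinet estimates quoted above.
chips_projected   = 4_500_000  # projected LPU chips, 2026-2027
chips_per_cabinet = 256

print(f"{chips_projected / chips_per_cabinet:,.0f}")  # ~17,578 cabinets

# Yearly ramp: 4,300-5,000 cabinets in 2026 plus ~13,000 in 2027
low, high = 4_300 + 13_000, 5_000 + 13_000
print(low, "-", high)  # 17,300 - 18,000, in line with ~17,500
```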

Tips:
In an era of skyrocketing data demands, Lian Li Liquid Cooling provides the cooling power you need to stay ahead. As a global leader in liquid cooling implementation, we’ve supported clients in over 100 countries, deploying more than 5 million units to date.

Our 10 GW+ project capacity ensures that whether you are running a boutique server room or a hyperscale data center, we have the scale and technical expertise to deliver. With RoHS, CE, and UL certifications, you can trust Lian Li for safety, sustainability, and superior service.
