China faces a data center cooling crisis as its AI boom accelerates

  • China is building data centers at breakneck speed to meet AI demand, but faces a cooling crisis because power densities have climbed too high.

  • Huawei's Ascend 910B/910C chips draw around 310 W each; clusters of them can push rack power density above 15 kW, and in some cases close to 30 kW. Nvidia's GB200 GPU consumes up to 2,700 W, and the upcoming Rubin Ultra is expected to demand even more.

  • Traditional fan-based air cooling is no longer effective enough, making liquid cooling, especially cold plate cooling, the dominant trend.

  • According to a 2025 IDC report, China's liquid-cooled server market reached $2.37 billion in 2024, up 67% year over year, and is projected to reach $16.2 billion by 2029 (an average growth rate of 46.8% per year).

  • The government has set a PUE target below 1.25 to optimize energy efficiency; the closer PUE is to 1, the more efficient the facility.

  • Cold plate cooling accounted for 95% of the market in 2024 thanks to low retrofit costs and easy integration. Immersion cooling has potential but requires custom servers and raises operational risk. Alibaba is developing immersion cooling for ultra-heavy workloads.

  • The most common approach today is hybrid: cold plates for power-hungry chips combined with air cooling for secondary components.

  • The challenge: choosing between single-phase (stable, easy to operate) and two-phase (more efficient but more complex) cooling.

  • Another major issue is the lack of unified standards and a dependent supply chain. 3M's decision to stop producing PFAS by 2025 puts coolant supply at risk. PFAS sells for around $20,000 per ton, while quality domestic suppliers are scarce.

  • At the Asia Data Center Summit in July 2025, Zhang Peng, CTO of Sugon Data Innovation, acknowledged that coordinating systems across many differing standards makes building data centers difficult.

  • Despite the many difficulties, liquid cooling is still seen as a necessary stepping stone for AI, and it ties into China's goal of carbon neutrality by 2060.


📌

China faces an energy dilemma as AI drives data centers toward enormous power consumption: Nvidia's GB200 GPU draws up to 2,700 W and racks reach 30 kW, while Huawei's Ascend 910B/910C chips consume about 310 W each, with dense clusters also pushing racks past 15 kW. China's liquid-cooled server market reached $2.37 billion in 2024 and is forecast to hit $16.2 billion in 2029. Direct-to-chip cold plate cooling holds 95% of the market, and hybrid air-liquid cooling has become the main approach. However, a lack of standards, supply chain risks, and cost remain barriers.

https://interestingengineering.com/energy/china-ai-liquid-cooling-data-centers


Behind China’s AI boom lies a cooling crisis
As China races to dominate AI, its data centers are overheating. Cooling tech is under pressure, and the limits are showing.
Updated: Aug 19, 2025 09:13 AM EST

China is ramping up liquid cooling infrastructure to keep its AI ambitions from overheating. Source: IE/Getty
Ni Tao is IE’s columnist, giving exclusive insight into China’s technology and engineering ecosystem. His monthly Inside China column explores the issues that shape discussions and understanding about Chinese innovation, providing fresh perspectives not found elsewhere.
With the rapid rise in demand for AI computing power, China’s data center construction is accelerating at an unprecedented pace. The need to train and run large AI models has transformed high-power servers from niche equipment into standard fixtures.
A key consequence? Surging energy consumption. For example, Huawei’s Ascend 910B and 910C AI chips consume around 310 watts. Clusters often house dozens of these chips, pushing a single rack’s power density beyond 15 kW, and in extreme cases, close to 30 kW.
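To put those rack figures in perspective, a quick back-of-the-envelope calculation (the per-rack chip counts here are illustrative assumptions, not figures from the article):

\[
48 \times 310\,\mathrm{W} \approx 14.9\,\mathrm{kW},
\qquad
96 \times 310\,\mathrm{W} \approx 29.8\,\mathrm{kW}
\]

So a rack holding on the order of 48 to 96 such accelerators lands in the 15-30 kW range even before counting CPUs, memory, storage, and fans.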
Similarly, Nvidia’s GB200 GPUs consume a staggering 2,700 watts per unit under full load, far exceeding predecessors like the A100 and H100. Its Rubin Ultra, expected within two years, is anticipated to demand even higher power densities, intensifying the cooling challenges confronting data centers.
This concentration of heat exposes the limits of traditional air cooling, which relies on fans running at high speed, increasing noise, energy use, and maintenance complexity.
As a result, liquid cooling, particularly cold plate liquid cooling, has become the new norm in China’s data centers due to its efficient heat dissipation and relatively simple retrofit process.
The PUE requirement
According to a 2025 IDC report, China’s liquid-cooled server market reached $2.37 billion in 2024, a 67 percent year-over-year increase. The market is projected to grow at a 46.8 percent compound annual rate between 2024 and 2029, approaching $16.2 billion by 2029.
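The growth rate and the 2029 figure are consistent with each other; as a quick check of the compounding (added here for clarity):

\[
\$2.37\,\text{billion} \times (1.468)^{5} \approx \$2.37\,\text{billion} \times 6.8 \approx \$16.2\,\text{billion}
\]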
This leap is partly fueled by China’s “East Data West Computing” strategy, which shifts computing power from the coastal east to western inland regions like Guizhou and Gansu, where large-scale data centers have sprouted.
At the same time, national policies promoting green data centers require facilities to meet power usage effectiveness (PUE) targets below 1.25. A reading of 1.0 would mean 100 percent of the power is consumed by IT equipment alone, which is ideal but practically impossible. The closer the PUE is to 1, the more efficient the data center.
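For readers unfamiliar with the metric, PUE is simply the ratio of everything the facility draws to what the IT equipment itself consumes:

\[
\mathrm{PUE} = \frac{\text{total facility power}}{\text{IT equipment power}}
\]

A target of 1.25 therefore allows at most about 0.25 W of cooling, power conversion, and other overhead for every watt delivered to servers.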
Liquid cooling mainly comes in two forms: cold plate and immersion cooling. Cold plate cooling stands out for its lower retrofit costs and compatibility with existing infrastructure. It typically uses copper or aluminum plates with liquid channels and fittings, making maintenance processes similar to air cooling and easier for operations teams to adopt.
Immersion cooling, by contrast, offers theoretically higher heat removal capacity and supports greater power densities, but it requires custom servers and racks. The need to manage large liquid tanks during maintenance increases downtime and operational risk, limiting its practicality for many data centers.
Notably, tech titan Alibaba is among the few active developers of immersion cooling to address ultra-high-density workloads. IDC data shows cold plate cooling accounted for 95 percent of China’s liquid-cooled server market in 2024.
As a result, cold plate cooling has become the preferred choice for most facilities. However, it’s too early to bid adieu to air cooling. Instead, engineers in China often use a hybrid, cost-effective approach—cold plates for power-hungry chips and air cooling for memory, power supplies, and other lower-heat components.
The reason? This “liquid plus air” framework balances performance demands with cost concerns and system complexity, reflecting practical engineering trade-offs. This thinking has dominated the expansion of liquid-cooled servers.

Black imgIX server racks. Source: imgix/Unsplash
Engineers must also choose between single-phase and two-phase methods within cold plate cooling. Single-phase cooling circulates liquid coolant that absorbs heat without changing state, offering a simpler design and stable operation.
By contrast, two-phase cooling vaporizes the coolant to absorb heat, providing higher thermal efficiency. But both approaches have notable drawbacks. Aside from upfront system costs, there are additional barriers, including managing flow and pressure across the cold plates and the need for technicians with HVAC (heating, ventilation, and air conditioning) experience. Securing a consistent refrigerant supply is yet another issue engineers must grapple with.
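The efficiency difference comes down to sensible versus latent heat. In textbook terms (a general relation, not a formula from the article), a single-phase loop removes heat only through the coolant's temperature rise, while a two-phase loop also taps the latent heat of vaporization:

\[
\dot{Q}_{\text{single}} = \dot{m}\, c_p\, \Delta T,
\qquad
\dot{Q}_{\text{two-phase}} \approx \dot{m}\, h_{fg}
\]

Because the latent heat h_fg of a typical coolant is far larger per kilogram than the sensible capacity of a modest temperature rise, two-phase systems can move the same heat with much less flow, at the cost of managing boiling, pressure, and refrigerant supply.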
A lack of unified standards
Therefore, the decision between single- and two-phase cooling methods is often based on each data center project’s specific needs, reliability requirements, and cost considerations.
Another key factor impacting liquid cooling is the delivery model. Decoupled delivery, where servers, racks, and cooling systems come from different vendors, offers flexibility but lacks unified interface standards, complicating integration. 
Integrated delivery by a single vendor is now prevalent in China, enabling faster deployment and better compatibility. However, the main downside is that it can lead to vendor lock-in and reduce negotiating power in the long term. Engineers must weigh budgets, project scale, and expansion plans to select the best approach.
Power distribution in liquid-cooled servers requires trade-offs as well. Centralized power supply simplifies management for large operations, while distributed power reduces losses and heat, and is thus better suited to high-density racks.
However, distributed systems demand more sophisticated management and stronger fault tolerance. Engineers must balance energy efficiency, stability, and flexibility to meet policy and operational targets.
Choosing the right coolant is equally critical. Commonly used coolants are ethylene glycol and propylene glycol mixtures, typically in 20-30 percent concentration ranges. 
High concentrations increase viscosity and reduce heat transfer while low concentrations risk freezing, especially in cold, high-altitude regions targeted by the “East Data West Computing” plan. Coolants must also resist microbial growth to prevent system failures. Formulations need to be tailored to local climate, pump capacity, and piping resistance.
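As a rough sketch of how that concentration trade-off plays out when sizing a loop, consider the Python example below; the property values and heat load are illustrative assumptions, not vendor data.

```python
# Rough sizing sketch: how glycol concentration affects required coolant flow
# for a fixed rack heat load. Property values are illustrative assumptions,
# not measured data; real designs use vendor property tables.

RACK_HEAT_KW = 30.0   # heat to remove, kW (high-density rack figure from the article)
DELTA_T_K = 10.0      # allowed coolant temperature rise across the rack, K (assumed)

# (specific heat in kJ/kg.K, density in kg/L) -- approximate room-temperature
# values for ethylene-glycol/water mixtures (assumed)
mixtures = {
    "20% glycol": (3.9, 1.03),
    "30% glycol": (3.7, 1.04),
}

for name, (cp, rho) in mixtures.items():
    mass_flow = RACK_HEAT_KW / (cp * DELTA_T_K)  # kg/s, from Q = m_dot * cp * dT
    vol_flow_lpm = mass_flow / rho * 60.0        # convert to litres per minute
    print(f"{name}: {mass_flow:.2f} kg/s (~{vol_flow_lpm:.0f} L/min)")
```

The higher-concentration mix needs slightly more flow for the same heat load because its specific heat is lower, and its higher viscosity also raises pump power; the payoff is a lower freezing point for cold, high-altitude sites.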
On top of these challenges, supply chain risks add further complexity. While China dominates the supply chains for EV parts, wind turbines, solar panels, and much else, the liquid coolants used in data centers remain largely outside its control.
In late 2022, 3M announced, to the surprise of the global industry community, that it would cease production of fluoropolymers, fluorinated fluids, and PFAS (per- and polyfluoroalkyl substances) additives by 2025, citing tightening environmental regulations and public health concerns worldwide.
The coolant challenge
These fluids are widely used in seals, lubricants, and corrosion prevention in cooling systems. This regulatory-driven move by 3M forces manufacturers to source reliable substitutes.
While some pundits in China hailed 3M's withdrawal as an opportunity for domestic suppliers, they largely underestimated how fraught the process would be with technical, cost, and scheduling challenges.

A rack of servers in a server room. Source: Kevin Ache/Unsplash
Cost reduction, in particular, poses the biggest test. The market price for PFAS is generally around $20,000 per ton and varies significantly among manufacturers. Quality domestic suppliers are scarce. Delays in finding alternatives risk disrupting data center commissioning timelines.
Again, standardization remains another major industry pain point. China lacks unified standards for liquid cooling interfaces, coolant types, piping, and power schemes.
Hundreds of players have crowded into this industry, each promoting its own technical and engineering standards. This often leads to incompatibility between vendors' equipment, raising integration costs and risking mismatches between IT and facility lifecycles. Engineers must prioritize supply chain compatibility and ease of maintenance, sometimes at the expense of efficiency or innovation.
Reflecting on this problem, Zhang Peng, CTO of Sugon Data Innovation, an air-cooled server rack supplier from China, observed at the Asia Data Center Summit in July 2025: “Many types of liquid-cooled servers exist in the market, and matching systems is challenging. Manufacturers’ design standards vary widely across temperature, pressure, and architecture, complicating liquid-cooled data center construction.”
Striking a compromise
For all its shortcomings, liquid cooling continues to advance in China and beyond. It has proven to be a game-changer for data center energy efficiency, resource use, and service density. But while it’s easy to conclude it represents the future, the transition is anything but.
For frontline engineers, this often involves tirelessly juggling heat dissipation, retrofit costs, regulatory compliance, supply chain reliability, and maintenance ease. Simulations and field tests are needed to guide operational targets. Meanwhile, backup cooling designs are necessary to prevent single points of failure, ensuring system stability and business continuity.
Moreover, sustainability considerations extend beyond energy use. The push for greener data centers includes adopting recyclable materials for cooling components, minimizing water consumption, and integrating renewable energy sources.
These trends align with China’s broader environmental goals, especially its commitment to carbon neutrality by 2060. But they pose ever greater challenges for material science and engineering.
While immersion cooling and fully liquid-cooled racks may gain traction as chip power skyrockets, the hybrid cold-plate-plus-air approach currently dominates and shows strong potential for scaling.
Until supply chains stabilize and industry standards align, liquid cooling in China won't be a revolution. It will be, and already is, an exercise in practical compromise.


© Sóng AI - Summaries of AI news and articles