Choosing a CDU – Expert Advice

Get expert advice on how to choose the right CDU for your data center


🚨 Considering a CDU for your data center? Don’t miss this breakdown from one of the industry’s top experts. Dr. Steve Harrington, a leader in computer cooling since 1985 and a member of the ASHRAE SSPC127 committee, explains how to quickly estimate a CDU’s liquid cooling capacity and highlights key things to watch for before you buy.

What does a CDU (Coolant Distribution Unit) do in a direct-to-chip cooling system? It takes heat from the servers via a clean, temperature-controlled coolant loop and transfers it to the dirtier, less tightly controlled building coolant loop. Servers are not designed to work with typical building cooling loops; they require controlled chemistry and filtered water to prevent corrosion, biological growth, and clogging.

Why should you listen to me talk about it? I have been on the ASHRAE SSPC127 committee since its inception, helping to develop standards, and I have designed, built, and deployed CDUs since 2011. I started my career in computer cooling in 1985. Since then, I have designed pumps, valves, heat exchangers, flow meters, rocket engines, medical ventilators, airplane cooling systems, etc. I also taught thermodynamics and aerospace engineering at UCSD.

There is a lot of talk about CDUs these days. What matters? Is more capacity better? Does physical size matter in a data center with 100 kW racks, where real estate costs are negligible? I would argue that uptime is what matters. This is a function of the entire cooling system, and the current approach of siloed procurement—of CDUs, cold plates, coolant, and the Technology Cooling System (TCS) flow networks—is not addressing uptime. More capacity is not necessarily better, because larger cooling loops result in more servers going down when things go wrong.

The most common issues are leaks, contamination, and corrosion.

  • Leaks can be caused by poor CDU design, poor quick-disconnect design, and poor cold plate design or construction.
  • Contamination can stem from poor CDU quality control, inadequate flushing procedures, or improper coolant maintenance.
  • Corrosion is a function of wetted materials, proper system grounding, and coolant maintenance.

All of these problems typically show up after a year or more of operation, so a lot of hardware may be deployed before trouble starts, which creates significant risk. One way to mitigate this risk is to use one CDU per rack, but that approach carries higher installation and maintenance costs. Moreover, any design or manufacturing errors that appear after many racks have been installed will still cause problems. Note that pump failure is unlikely until end of life (5+ years).

There are a number of new CDUs on the market. The hardest part of launching a CDU product is getting someone to trust their $20M+ cluster with your brand-new CDU. As with any new product, there will be issues to work out after deployment—but with a CDU, this may result in downtime on expensive servers. This is why we still see major companies showing the same prototype at trade shows year after year.

Right now, we have 4–5 vendors with significant row-scale CDU experience: Chilldyne, Motivair (now Schneider), Cooltera (now Vertiv), and Nortek. CoolIT has been using Cooltera and Nortek for years for row-scale CDUs.

A CDU seems pretty simple—pump, heat exchanger, and filter. However, the minimum viable product must cool expensive servers for five years with no downtime. Furthermore, there are no standards for how to handle leaks, sensor failures, valve issues, or pump failures. Contrast this with a passenger aircraft or medical ventilator—complex systems that must be very reliable, but with well-established design protocols.

One approach is to buy a CDU from the largest vendor. But many of the larger vendors are primarily in the air-cooling business. They saw liquid cooling as a threat and avoided it until recently, so they don’t have much experience.

CDU specs are all over the place. Ten years ago, each CPU was 120 watts and GPUs were 300 watts—so only a trickle of water was needed to cool each chip. At the time, we rated CDUs in watts at 1 LPM per kW or less, with a water temperature delta (from cold plate out to cold plate in) of 14°C (25°F) or more.

Now, higher power systems need more water, so most older CDU ratings no longer apply. ASHRAE has stepped in to develop a standard rating system based on 1.5 LPM/kW or higher flow rates. The ASHRAE rating uses:

  • 3°C approach (difference between facility water in and TCS water out)
  • A minimum of 1.5 LPM/kW

This will lower the ratings of older CDUs but will be more appropriate for tomorrow’s high-power chips.

Here’s a chart of flow vs. DeltaT (cold plate out minus cold plate in) for water. For PG25 (25% propylene glycol), DeltaT runs 2–3% higher.

Flow (LPM/kW)    DeltaT (°C)
0.5              28.8
0.75             19.2
1                14.4
1.5              9.6
2                7.2
3                4.8
4                3.6
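
This chart follows directly from the heat balance Q = ṁ · cp · ΔT. Here’s a minimal Python sketch, assuming water at roughly 1 kg/L with cp ≈ 4.18 kJ/kg·K; those assumed properties reproduce the chart’s DeltaT ≈ 14.4 ÷ flow to within about half a percent:

```python
# Coolant temperature rise (cold plate out minus cold plate in) per the
# heat balance Q = m_dot * cp * dT, for a given flow per kW of heat load.
RHO_KG_PER_L = 1.0      # water density, ~1 kg/L (assumed property)
CP_KJ_PER_KG_K = 4.18   # water specific heat (assumed property)

def delta_t_c(flow_lpm_per_kw: float) -> float:
    """DeltaT in degC for a given flow in LPM per kW of heat load."""
    m_dot_kg_per_s = flow_lpm_per_kw * RHO_KG_PER_L / 60.0  # mass flow per kW
    return 1.0 / (m_dot_kg_per_s * CP_KJ_PER_KG_K)          # 1 kW / (m_dot * cp)

for flow in (0.5, 0.75, 1, 1.5, 2, 3, 4):
    print(f"{flow:>4} LPM/kW -> DeltaT {delta_t_c(flow):4.1f} degC")
# PG25 has a slightly lower volumetric heat capacity, hence ~2-3% higher DeltaT.
```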

Our CDU300 is flow-limited at 300 LPM, so it is rated at 200 kW under the ASHRAE method (300 LPM ÷ 1.5 LPM/kW). At 200 kW, the approach of the CDU300 is 2°C, so we exceed the ASHRAE spec in this area. Our CDU1500 is rated at 1 MW due to its flow limit of 1500 LPM.

The Vertiv XDU1350 is rated at 1368 kW at a 4°C approach; scaling to 3°C (multiply by ¾) gives about 1026 kW. The Motivair MCDU-40 is rated at 840 kW at an 8.3°C approach; at 3°C the rating drops to about 300 kW.

All these ratings assume the facility flow rate is about the same as the server flow rate.

Key takeaway: Look at the flow and the approach of the CDU.

  • Take the flow rate and divide it by 1.5 LPM/kW to get the maximum capacity based on flow.
  • Then look at the approach at the manufacturer’s kW rating and multiply the rated kW by (3 Ă· approach in °C).

This will give you the approximate capacity at a 3°C approach. (More complex calculations will provide more accuracy.)

The lower of the flow-based and approach-based capacities is the final CDU capacity; see the sketch below.
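
Here’s that rule of thumb as a minimal Python sketch. The 1.5 LPM/kW and 3°C figures come from the ASHRAE method above; the numbers in the usage examples are the ones quoted earlier.

```python
def cdu_capacity_kw(flow_lpm: float, rated_kw: float, approach_c: float) -> float:
    """Approximate CDU capacity at a 3 degC approach and 1.5 LPM/kW.

    flow_lpm:   maximum server-side (TCS) flow the CDU can deliver
    rated_kw:   manufacturer's kW rating
    approach_c: approach (degC) at that manufacturer rating
    """
    flow_limited = flow_lpm / 1.5                     # flow-based capacity
    approach_limited = rated_kw * (3.0 / approach_c)  # approach-based capacity
    return min(flow_limited, approach_limited)        # the lower limit governs

# Chilldyne CDU300: 300 LPM flow limit, 2 degC approach at 200 kW
print(cdu_capacity_kw(flow_lpm=300, rated_kw=200, approach_c=2.0))  # -> 200.0

# Motivair MCDU-40: 840 kW rating at an 8.3 degC approach (its flow limit is
# not quoted above, so only the approach-based term is computed here)
print(840 * 3.0 / 8.3)  # -> ~304 kW at a 3 degC approach
```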

When looking for a new CDU, check out the vendor’s experience. Have they deployed systems at scale for years? Ask for customer references, review their service procedures, and ask how they have recovered from problems.



#chilldyne #cdu #liquidcooling #liquidcoolingexpert #datacentercooling
