Data Center Configuration in the Modern World

Supermicro’s Michael McNerney offers insights on data center configuration in the modern world. This article originally appeared on Solutions Review’s Insight Jam, an enterprise IT community enabling the human conversation on AI.
The modern data center has undergone significant reconstruction in a short period of time. Just 20 years ago, the machines that filled entire rooms had less computing power than the mobile phones that fit in our pockets today. Driven by rapid technological innovation and the demands of today’s digital operations, the pressure on data center infrastructure is at an all-time high, forcing companies to take a much harder look at how best to configure their data centers for optimal performance.
Where We Are & How We Got Here
The data center has undergone its fair share of peaks and valleys since its inception in the mid-20th century. What began as primitive, clunky, oversized machines quickly became the go-to resource for computing technology – but the tide turned a few decades later with the invention of the PC, as tech companies began consolidating their IT services onto desktops rather than entire rooms.
What came next was the internet, and data centers once again saw a massive uptick in production and usage. As IT investments continued to increase, HPC, enterprise tech and cloud services helped propel the importance of data centers even further. Fast forward to the present day, and the AI boom has made them more prevalent than ever before. As companies expect more from their data centers, it is up to the companies building the infrastructure to satisfy their customers and find better ways to position data centers for success.
Shifting Towards Rack-Scale Designs
Configuring data centers to reach maximum efficiency starts with rethinking their composition. Data center designers must think bigger, taking entire racks into account as opposed to individual servers.
This rack-scale approach can yield faster results and lower maintenance costs, but it is more complex than simply loading racks with servers and switches. There are numerous considerations to keep in mind, such as how much power the rack can handle, different air and liquid cooling layouts, and how to fully optimize communication speed between server units. These variables determine how much compute and storage capacity fits within a rack and whether lower-density racks without high-capacity cooling systems are suitable. How designers arrange servers within racks is extremely important, but there are also hybrid rack designs that can alleviate power and storage concerns altogether.
Racks can be designed with different heights depending on the number of servers they need to house. Filling a rack with servers can reduce sprawl and the number of power connections, consolidating the equipment into one efficient stack. Alternatively, filling it with a combination of compute and storage systems on the same switch lets those systems interact at much higher speed with minimal latency. These high-density racks are great for maximizing space and tailoring data centers to an organization’s needs, but designers must also consider how to keep the rack and its servers cool.
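The space-versus-power trade-off above can be sketched with a simple budget calculation. The rack height, power budget, and server figures below are illustrative assumptions, not vendor specifications:

```python
# Hypothetical rack-planning sketch: how many identical servers fit in a
# rack given a power budget and rack height. All figures are
# illustrative assumptions, not vendor specifications.

RACK_UNITS = 42        # assumed standard 42U rack
RACK_POWER_W = 40_000  # assumed 40 kW per-rack power budget

def servers_per_rack(server_units: int, server_power_w: int) -> int:
    """How many servers fit, limited by whichever constraint
    (rack space or rack power) runs out first."""
    by_space = RACK_UNITS // server_units
    by_power = RACK_POWER_W // server_power_w
    return min(by_space, by_power)

# A dense 1U node drawing 1,200 W is power-limited (33 fit, not 42);
# a 2U storage node drawing 800 W is space-limited (21 fit, not 50).
print(servers_per_rack(1, 1200))  # 33
print(servers_per_rack(2, 800))   # 21
```

The point of the sketch is that dense compute racks tend to hit the power (and therefore cooling) ceiling well before they run out of rack units, which is exactly why cooling design drives rack configuration.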
Why We Need Data Center Cooling Technology
When servers are packed into racks, it can be difficult for air cooling to keep up with the amount of heat being generated. However, most modern data centers do not yet run the most advanced GPUs or CPUs, making air cooling a practical solution for the time being. This won’t last forever: high-end CPUs are expected to reach 500W, with GPUs already running in the 700W range. Liquid cooling is becoming a necessity for high-end server designs, and there are three main iterations: direct-to-chip (DTC) cooling, immersion cooling, and rear door heat exchangers.
Direct-to-chip cooling is a very practical method that removes heat right at its source. It includes a cooling distribution unit (CDU), several cooling distribution manifolds (CDMs), and hoses for the hot and cold liquid. Because the cooling capacity of a CDU is on the order of 100 kW, DTC cooling is most effective on a single rack, although longer hoses can extend it to additional racks.
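A back-of-the-envelope calculation shows what a 100 kW CDU implies for coolant flow. The water-like specific heat and the 10 °C loop temperature rise are illustrative assumptions:

```python
# Back-of-the-envelope coolant flow for a direct-to-chip loop.
# Assumes a water-like coolant (cp ≈ 4186 J/(kg·K)) and a 10 °C
# temperature rise across the loop — both illustrative assumptions.

CP_WATER = 4186.0  # specific heat of water, J/(kg*K)

def flow_rate_lpm(heat_kw: float, delta_t_c: float = 10.0) -> float:
    """Mass flow from Q = m_dot * cp * dT, converted to litres/min
    (density ~1 kg per litre for water)."""
    m_dot_kg_s = heat_kw * 1000.0 / (CP_WATER * delta_t_c)
    return m_dot_kg_s * 60.0

# A 100 kW CDU at a 10 °C rise needs roughly 143 L/min of coolant.
print(round(flow_rate_lpm(100.0), 1))  # 143.3
```

This is why CDU capacity, manifold sizing, and hose runs are planned together: pushing the same heat load through a smaller temperature rise, or over longer hose runs to neighboring racks, demands proportionally more flow from the same unit.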
Immersion cooling takes an entirely different approach, submerging the entire server or certain components in a tank of non-conductive, inert dielectric liquid. The liquid either transfers heat while remaining in a liquid state via a liquid-to-liquid heat exchanger, or converts to a gaseous state, carrying away the latent heat. The gaseous agent is transformed back into a fluid via a condenser, cooling the entire system evenly while the waste heat is rejected to the exterior environment.
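The two-phase (boiling) variant can be sketched with the latent-heat relationship described above. The latent-heat value below is an assumed, round-number placeholder for an engineered dielectric fluid; real fluids differ:

```python
# Rough two-phase immersion sketch: how much dielectric fluid boils off
# per second to carry a given heat load. The latent heat below is an
# assumed, illustrative value — real engineered fluids differ.

LATENT_HEAT_J_PER_KG = 100_000.0  # assumed latent heat of vaporization

def boil_off_rate_kg_s(heat_w: float) -> float:
    """In steady state the condenser returns liquid at the same rate it
    evaporates: m_dot = Q / h_fg."""
    return heat_w / LATENT_HEAT_J_PER_KG

# A 10 kW server evaporates roughly 0.1 kg of fluid per second,
# all of which the condenser returns to the tank.
print(boil_off_rate_kg_s(10_000))  # 0.1
```

The steady-state balance is the key property: because evaporation and condensation rates match, the tank level stays constant and the whole server is cooled at an essentially uniform temperature.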
Rear door heat exchangers are a cooling method tailored to data centers that cannot modify or add to their infrastructure. In these cases, a specialized rear door can be added to the hottest racks, cooling the exhaust air as it leaves the back of the servers. The door is equipped with fans and coolant that absorb the heat and feed cooler air back into the data center. Reduced reliance on computer room air conditioning (CRAC) units is the primary benefit of rear door heat exchangers.
These liquid cooling methods operate differently, but they share the same benefits beyond simply cooling servers. Noise reduction, energy savings, and overall cost reduction are all reasons data center manufacturers continue to implement liquid cooling solutions. By drawing less energy from fossil fuel power plants, liquid cooling also lowers carbon emissions, reducing the environmental impact of modern data centers and helping organizations achieve their net-zero goals.
Customized Rack Infrastructure Technology
Rack infrastructure technology has also seen significant innovation alongside cooling and enterprise technology. Effectively managing and optimizing data center performance is paramount, and it begins with enhancing server-to-server communication through AI cluster networks. Ethernet and InfiniBand are the two primary interconnect technologies for rapid data transmission, each with its own unique features and advantages.
The individual characteristics of InfiniBand make it attractive to data scientists with an HPC background. It has low protocol overhead to minimize latency in the fabric, and adaptive routing to restore communication quickly after a link failure. Another key attribute is remote direct memory access (RDMA), which reduces CPU load by allowing servers to access each other’s memory directly. InfiniBand is best utilized in HPC environments and is often tuned for specific workloads. Ethernet is much more traditional and has been around for many years. It offers a much larger and more open ecosystem than InfiniBand, with a variety of protocols and software options. Ethernet is best suited to those with little to no InfiniBand experience who don’t want to learn a new fabric, as it can be applied to everything from simple office networks to complex data center environments.
The introduction of NVLink and UALink also represents new ways in which companies are boosting interconnectivity between servers and racks. Both are designed for HPC and AI environments and provide high-speed interconnects that enhance the performance and efficiency of data centers. Where more compute power is needed, parallel processing combines the resources of multiple machines to handle complex computing tasks. NVLink excels at high-bandwidth connections between NVIDIA GPUs, whereas UALink is an open standard for high-speed interconnectivity between accelerators. Together they enhance the performance of computing systems as a whole.
Conclusion
Since the invention of the internet, the data center has grown in importance for enterprises globally. As we continue to see advancements in HPC and AI applications, data center providers must explore alternative, innovative methods outside of simply building more campuses. We must thoroughly examine data center configuration and implement the proper rack-scale solutions to continue propelling the data center forward and power the next stage of computing.