Over the past few years, we’ve seen the thermal design power (TDP) of all types of chips steadily increase as chip makers struggle to keep Moore’s Law alive.
It (probably) won’t kill latency and could save you a buck
Over the course of five years, we’ve seen chipmakers push processors from 150-200W to as much as 400W in the case of AMD’s 4th-generation Epyc. And during that same period, we’ve seen a rapid rise in accelerated computing architectures that use GPUs and other AI accelerators. Following this trend, it’s not hard to imagine power consumption per socket in excess of 1kW in the next year or two, especially as AMD, Intel and Nvidia work to finalize their Accelerated Processing Unit (APU) architectures, which merge the data center CPU with the GPU. The idea of a 1kW part may seem shocking, and such a chip will almost certainly require direct liquid cooling or perhaps even immersion cooling.
However, higher TDPs are not inherently bad if performance per watt also rises and scales linearly. But just because a CPU can burn 400W under load doesn’t mean it should. While Intel and AMD have increased the TDP of their fourth-generation parts, they’ve also introduced a number of power management improvements that make it easier for users to prioritize raw performance or optimize for efficiency. AMD’s 4th-generation Epyc, for instance, can be tuned either to prioritize consistent performance or to hold power consumption steady by modulating clock speed as more or fewer cores are loaded. Intel, meanwhile, introduced “Optimized Power Mode” to its Sapphire Rapids Xeon Scalable processors, which the company claims can reduce power consumption per socket by as much as 20 percent, in exchange for a performance hit of roughly 5 percent.
According to Intel Fellow Mohan Kumar, the power management feature is particularly effective in scenarios where CPUs are only running at 30-40 percent utilization. With Optimized Power Mode enabled, he says customers can expect a 140W reduction in power consumption on a dual-socket system.
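To put Intel's figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the roughly 5 percent figure is a performance cost that applies at the same time as the full 20 percent power reduction, and that the 140W saving holds around the clock for a whole year. Neither assumption comes from Intel, and the function names are purely illustrative.

```python
# Figures quoted in the article (Intel's claims).
POWER_REDUCTION = 0.20          # up to 20 percent less power per socket
PERFORMANCE_HIT = 0.05          # roughly 5 percent less performance
SAVED_WATTS_DUAL_SOCKET = 140   # claimed saving on a dual-socket system

# Assumption: the server runs, and the saving holds, every hour of the year.
HOURS_PER_YEAR = 24 * 365


def perf_per_watt_gain(power_reduction: float, performance_hit: float) -> float:
    """Relative change in performance per watt versus the untuned baseline."""
    return (1 - performance_hit) / (1 - power_reduction) - 1


def annual_kwh_saved(saved_watts: float, hours: float = HOURS_PER_YEAR) -> float:
    """Energy saved per year, in kWh, for a constant saving in watts."""
    return saved_watts * hours / 1000


if __name__ == "__main__":
    gain = perf_per_watt_gain(POWER_REDUCTION, PERFORMANCE_HIT)
    saved = annual_kwh_saved(SAVED_WATTS_DUAL_SOCKET)
    print(f"Performance per watt change: {gain:+.1%}")
    print(f"Energy saved per server: {saved:.0f} kWh/year")
```

Under those assumptions, performance per watt improves by roughly 19 percent, and each dual-socket server saves on the order of 1,200 kWh a year. Real savings depend on utilization, which is why the figures should be read as an upper bound rather than a forecast.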
Dogma
Of course, power management at the CPU level does not have the best reputation among operators.
“IT operators are risk averse. The general reaction to energy management is – because of the small energy and cost savings, the risk in terms of our service level agreements with our customers is too high. And so, there’s a reluctance to move into using power management,” said Uptime Institute analyst Jay Dietrich. “There is usually an urban legend associated with these beliefs involving an SLA disaster three technological generations in the past.”
While the ability to pack two or three racks of legacy systems into a single cabinet full of Epycs or Xeons is quite attractive (a pitch not unique to AMD), most data centers simply aren’t equipped to power or cool the resulting equipment.
As CPUs and GPUs become more and more power-hungry, the number of systems you can fit into a typical six-kilowatt rack plummets.
Factoring in RAM, storage, networking, and cooling, it’s not hard to imagine a 2U, two-socket 4th-generation Epyc platform consuming well over a kilowatt. That means it will only take five or six nodes, or 10 to 12 rack units, before you blow your rack power budget. Even if you assume those systems won’t all be fully loaded at the same time (they probably won’t) and overprovision the rack, you’ll still run out of power before the rack is half full. And that’s just looking at general-purpose compute nodes. GPU nodes are even more power hungry.
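The rack arithmetic is easy to check with a short Python sketch. The six-kilowatt budget and the 2U node height come from the article. The 42U rack height and the two per-node power figures (1,000W and 1,100W) are assumptions chosen only to bracket "about a kilowatt", and the function names are illustrative.

```python
RACK_BUDGET_W = 6_000   # "typical six-kilowatt rack" from the article
NODE_HEIGHT_U = 2       # 2U, two-socket node from the article
RACK_HEIGHT_U = 42      # assumption: a common full-height rack


def nodes_within_budget(rack_budget_w: int, node_power_w: int) -> int:
    """Whole nodes that fit inside the rack power budget."""
    return rack_budget_w // node_power_w


def rack_fill_fraction(nodes: int, node_height_u: int, rack_height_u: int) -> float:
    """Share of the rack's height occupied by the given number of nodes."""
    return nodes * node_height_u / rack_height_u


if __name__ == "__main__":
    # Assumption: per-node draw of 1,000W and 1,100W as illustrative cases.
    for node_power_w in (1_000, 1_100):
        nodes = nodes_within_budget(RACK_BUDGET_W, node_power_w)
        fill = rack_fill_fraction(nodes, NODE_HEIGHT_U, RACK_HEIGHT_U)
        print(f"{node_power_w}W per node: {nodes} nodes, "
              f"{nodes * NODE_HEIGHT_U}U used, {fill:.0%} of the rack")
```

With either assumed power figure, the budget runs out at five or six nodes, which leaves well over half of a 42U rack empty.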
Of course, data center operators are not blind to this reality, and many are taking steps to upgrade existing infrastructure, and to design new facilities, to support hotter, higher-power systems. But it will take time for operators to adapt, and in the meantime the squeeze may even encourage the adoption of these power management features to reduce operating costs.
Source: The Register
Source: PC Press by pcpress.rs.