Future AI Chips to Demand Over 15kW Each, Reshaping Data Center Power and Cooling Paradigms
Emerging research suggests that the AI processors of the next decade will not only deliver massive performance boosts but also introduce extreme energy demands that could fundamentally alter how data centers are built and operated.
A forward-looking analysis from the TeraByte Interconnection and Package Laboratory (TeraLab) at Korea Advanced Institute of Science and Technology (KAIST) outlines a future where high-bandwidth memory (HBM) paired with AI GPUs could push power requirements beyond 15,000 watts per module by the year 2035.
Escalating Memory Performance Brings Surging Power Needs
According to the roadmap published by TeraLab, the evolution from HBM4 in the mid-2020s to HBM8 in the 2030s comes at a steep energy cost. The projected capabilities of HBM8 include bandwidths reaching 64 terabytes per second, enabled by a staggering 16,384 I/O channels. These memory stacks are expected to offer up to 240GB per unit, with each one drawing 180 watts.
When combined with the power draw of future GPUs, which is expected to exceed 1,200 watts per device, the cumulative energy consumption of a single AI module fully loaded with memory could surpass 15.3 kilowatts, a massive jump from current designs.
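As a rough sanity check, the headline figure can be reproduced with simple arithmetic from the per-component numbers above. The mix of eight GPU dies and 32 HBM8 stacks in the sketch below is a hypothetical configuration chosen for illustration, not a breakdown taken from the TeraLab roadmap; it simply shows how quickly per-device wattages compound at the module level.

    # Back-of-the-envelope module power from the figures cited above.
    # The GPU and HBM stack counts are illustrative assumptions, not
    # values taken from the TeraLab roadmap.

    HBM8_STACK_POWER_W = 180   # per-stack draw cited for HBM8
    GPU_POWER_W = 1_200        # projected per-GPU draw ("exceed 1,200 watts")

    def module_power_kw(num_gpus: int, num_hbm_stacks: int) -> float:
        """Total module power in kilowatts for a given component mix."""
        total_w = num_gpus * GPU_POWER_W + num_hbm_stacks * HBM8_STACK_POWER_W
        return total_w / 1000

    # One hypothetical mix that lands near the cited 15.3 kW projection.
    print(module_power_kw(num_gpus=8, num_hbm_stacks=32))  # 15.36 kW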
AI’s Growing Appetite for Energy
The explosive rise of AI models — from large language models to emerging physics-based systems — is fueling a relentless demand for both compute and memory throughput. This shift is forcing a reevaluation of how power is delivered and dissipated within data centers.
“AI’s progress is running into its biggest limitation — energy efficiency,” said Neil Shah, Vice President of Research at Counterpoint Research. “Each generational leap in AI, from generative to agent-based intelligence, brings exponential increases in computational load.”
Today’s top-tier server GPUs typically operate within the 300W–600W range. Doubling or even tripling that power envelope will require entirely new rack architectures, power distribution models, and facility-level upgrades.
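To see why facility planning changes, consider a rough rack-level estimate. The per-rack module count and the conventional rack power budget used below are assumptions for illustration only, not figures from the report.

    # Illustrative rack-level comparison. The eight-module rack layout and
    # the ~40 kW "conventional" budget are assumptions for illustration.

    MODULE_POWER_KW = 15.3       # projected per-module draw cited above
    MODULES_PER_RACK = 8         # assumed layout
    TYPICAL_RACK_BUDGET_KW = 40  # rough upper bound for a conventional air-cooled rack

    rack_power_kw = MODULE_POWER_KW * MODULES_PER_RACK  # 122.4 kW
    print(f"Projected rack load: about {rack_power_kw:.0f} kW")
    print(f"Conventional air-cooled rack budget: roughly {TYPICAL_RACK_BUDGET_KW} kW")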
Cooling: A Critical Engineering Challenge
As power scales, so too does heat — and conventional cooling systems may not be able to keep up. The KAIST report indicates that air-based cooling methods will become insufficient for next-gen GPUs with dense memory stacks.
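A simple heat-balance estimate, using the standard relation Q = m_dot * c_p * delta_T with textbook air properties, illustrates the scale of the problem. The 15 kW heat load and the 20 K allowable air temperature rise below are assumptions chosen for illustration.

    # Airflow needed to carry away a given heat load with air alone.
    # Q = m_dot * c_p * delta_T  ->  m_dot = Q / (c_p * delta_T)
    # Air properties are standard textbook values; the load and allowable
    # temperature rise are illustrative assumptions.

    AIR_CP = 1005.0    # J/(kg*K), specific heat of air at constant pressure
    AIR_DENSITY = 1.2  # kg/m^3 near room temperature

    def required_airflow_cfm(heat_load_w: float, delta_t_k: float) -> float:
        """Volumetric airflow (CFM) needed to absorb heat_load_w at a delta_t_k rise."""
        mass_flow = heat_load_w / (AIR_CP * delta_t_k)  # kg/s
        volume_flow = mass_flow / AIR_DENSITY           # m^3/s
        return volume_flow * 2118.88                    # m^3/s -> cubic feet per minute

    print(round(required_airflow_cfm(15_000, 20)))  # ~1318 CFM for a single module

On the order of 1,300 cubic feet of air per minute for one module is a very large volume to push through a dense chassis, and the requirement only grows as allowable temperature rise shrinks.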
Limited packaging-level modifications, such as changes to the molding compounds used in chip packaging, may help reduce heat transfer between closely packed components. Even so, such measures are unlikely to provide the cooling efficiency needed for modules drawing over 15kW each.
The industry is likely to move toward advanced cooling technologies, such as liquid immersion or two-phase systems, to maintain safe operational temperatures and ensure performance reliability.