eazyware
Research · May 19, 2025 · 10 min read

AI energy footprint: what the numbers actually say

Training is a spike; inference is the ongoing cost. Where the energy debate has signal, and where it is hype.

Kushal R.
Engineering lead

How much energy does AI actually use? The media narrative has swung from 'AI will boil the oceans' to 'AI is a rounding error' and back. Neither framing is correct. The real numbers matter because they drive decisions for data-center operators, utilities, regulators, and policy-makers. This post is the honest math.

Energy at a glance
[Figure: AI energy, training vs inference vs everything else]
Per large-model training run: 1,000-50,000 MWh (one-time cost)
Per day, inference at production scale: ~10-100 MWh/day for a ChatGPT-scale service (recurring)
Comparisons (annualized):
AI inference (global, all LLMs): ~2-5 TWh/yr
Crypto mining: ~120 TWh/yr
Global data centers (all use): ~350 TWh/yr
Global electricity (total): ~29,000 TWh/yr
Training is a one-time large pulse (1,000-50,000 MWh per frontier run). Inference is a steady 10-100 MWh/day at ChatGPT scale. All global AI inference, annualized, is 2-5 TWh/yr: small next to 350 TWh/yr for all data centers and 29,000 TWh/yr for global electricity.

Training numbers

A single frontier-model training run consumes 1,000 to 50,000 MWh, depending on model size, training duration, and hardware efficiency. GPT-4-scale runs are estimated at ~25,000 MWh; smaller models use proportionally less. For context, 25,000 MWh is roughly the annual electricity of 2,500 US households, at about 10 MWh per household per year.

Critical point: training is a one-time cost per model version. You amortize it across the entire deployment lifetime of that model. A single training run feeding a billion queries is a different energy equation than a training run feeding a million queries.
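The amortization argument is easy to make concrete. Here is a minimal sketch that spreads the ~25,000 MWh GPT-4-scale estimate across the two lifetime query counts mentioned above. The function name is illustrative, and the query counts are the post's hypothetical cases, not measured traffic.

```python
# Amortize a one-time training cost over the queries a model serves.
# 25,000 MWh is the post's GPT-4-scale estimate; the query counts are
# the two illustrative cases from the text, not real traffic data.

WH_PER_MWH = 1_000_000  # 1 MWh = 1,000,000 Wh


def training_wh_per_query(training_mwh: float, lifetime_queries: float) -> float:
    """Training energy attributed to each query, in watt-hours."""
    if lifetime_queries <= 0:
        raise ValueError("lifetime_queries must be positive")
    return training_mwh * WH_PER_MWH / lifetime_queries


if __name__ == "__main__":
    for queries in (1e6, 1e9):
        wh = training_wh_per_query(25_000, queries)
        print(f"{queries:.0e} lifetime queries: {wh:,.0f} Wh of training energy each")
```

At a billion queries the training share is 25 Wh per query; at a million queries it is 25 kWh per query. Same run, a thousandfold difference in the per-query footprint.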

Inference numbers

Inference is the recurring cost. A ChatGPT-scale service at hundreds of millions of queries per day runs perhaps 10-100 MWh/day in total inference energy, depending on model efficiency and query complexity. Annualized, that's roughly 3.7-36.5 GWh per major deployment.
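A short sketch of the conversion from a daily figure to an annual one, plus the per-query energy it implies. The 300 million queries/day value is an assumption picked to stand in for "hundreds of millions"; it is not a reported number.

```python
# Convert a daily inference energy estimate into annual energy and
# into energy per query. The daily range (10-100 MWh) is from the post;
# ASSUMED_QUERIES_PER_DAY is a hypothetical stand-in for
# "hundreds of millions of queries per day".

DAYS_PER_YEAR = 365
ASSUMED_QUERIES_PER_DAY = 300e6  # assumption, not a measured value


def inference_annual_gwh(mwh_per_day: float) -> float:
    """Annual inference energy in GWh for a steady daily load."""
    return mwh_per_day * DAYS_PER_YEAR / 1000


def wh_per_query(mwh_per_day: float, queries_per_day: float) -> float:
    """Average inference energy per query, in watt-hours."""
    if queries_per_day <= 0:
        raise ValueError("queries_per_day must be positive")
    return mwh_per_day * 1_000_000 / queries_per_day


if __name__ == "__main__":
    for daily in (10, 100):
        print(
            f"{daily} MWh/day: {inference_annual_gwh(daily):.2f} GWh/yr, "
            f"{wh_per_query(daily, ASSUMED_QUERIES_PER_DAY):.3f} Wh/query"
        )
```

Under that assumed volume, the daily range works out to a fraction of a watt-hour per query. The per-query figure moves in direct proportion to whatever query volume you plug in.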

Globally, summing all significant LLM deployments, total inference energy in 2025 is estimated at 2-5 TWh/year. This number is growing, probably doubling year-over-year, but it's still a specific, boundable number.

Contextualizing

Global data centers total: ~350 TWh/year. AI inference is roughly 0.6-1.4% of this today. By 2028-2030, estimates suggest AI could be 10-20% of total data center energy, depending on deployment scale and efficiency improvements.
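One way to sanity-check the 2028-2030 estimate is to ask how long year-over-year doubling would take to get there. The sketch below makes two simplifying assumptions that are not in the estimates themselves: growth is exactly 2x per year, and total data center energy stays flat at ~350 TWh (in practice it also grows).

```python
import math

# How many years of compound growth until AI inference reaches a given
# share of data center energy? Assumptions (simplifications, not claims):
#   - inference energy grows by exactly `annual_growth` each year
#   - total data center energy stays flat at 350 TWh/yr

DATA_CENTER_TWH = 350.0


def years_to_reach(current_twh: float, target_twh: float, annual_growth: float = 2.0) -> float:
    """Years of compound growth needed to go from current_twh to target_twh."""
    if current_twh <= 0 or target_twh <= 0:
        raise ValueError("energy values must be positive")
    if annual_growth <= 1.0:
        raise ValueError("annual_growth must be greater than 1")
    if target_twh <= current_twh:
        return 0.0
    return math.log(target_twh / current_twh) / math.log(annual_growth)


if __name__ == "__main__":
    fastest = years_to_reach(5.0, 0.10 * DATA_CENTER_TWH)  # high start, 10% target
    slowest = years_to_reach(2.0, 0.20 * DATA_CENTER_TWH)  # low start, 20% target
    print(f"fastest: {fastest:.1f} years, slowest: {slowest:.1f} years")
```

Under those assumptions the answer falls between about 2.8 and 5.1 years from a 2025 baseline, which is broadly consistent with the 2028-2030 window.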

Global electricity: ~29,000 TWh/year. Current AI inference is roughly 0.007-0.017% of global electricity. The share is growing but starts from a very low baseline.
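The shares above are simple ratios of the post's own estimates. A minimal sketch, so you can swap in updated totals as they change:

```python
# Express AI inference energy as a percentage of larger totals.
# All inputs are the post's estimates (TWh per year).

AI_INFERENCE_TWH = (2.0, 5.0)    # low and high estimate
DATA_CENTERS_TWH = 350.0
GLOBAL_ELECTRICITY_TWH = 29_000.0


def share_percent(part_twh: float, whole_twh: float) -> float:
    """Return part as a percentage of whole."""
    if whole_twh <= 0:
        raise ValueError("whole_twh must be positive")
    return 100.0 * part_twh / whole_twh


if __name__ == "__main__":
    for label, total in (
        ("data centers", DATA_CENTERS_TWH),
        ("global electricity", GLOBAL_ELECTRICITY_TWH),
    ):
        low = share_percent(AI_INFERENCE_TWH[0], total)
        high = share_percent(AI_INFERENCE_TWH[1], total)
        print(f"AI inference share of {label}: {low:.3f}% to {high:.3f}%")
```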

For comparison: cryptocurrency mining is ~120 TWh/year. Residential clothes dryers in the US alone are ~80 TWh/year. Hollywood streaming services are ~70 TWh/year globally. AI energy is real but fits in a larger landscape of significant energy uses.

What it means operationally

For AI service operators: inference energy cost is not a dominant line in most projects. A ChatGPT-scale service drawing 10-100 MWh/day and paying spot electricity rates ($0.05-0.10/kWh) spends roughly $180K-$3.7M/year on inference electricity. Meaningful but not transformational; model API costs typically dwarf electricity costs by 10-50x.
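The electricity bill follows directly from daily energy and price. Here is a rough sketch using the post's 10-100 MWh/day range and $0.05-0.10/kWh rates. It assumes a flat price all year and ignores demand charges and cooling overhead, so treat it as a floor-level estimate rather than a real bill.

```python
# Annual electricity cost for a steady inference load.
# Inputs are the post's ranges; a flat price with no demand charges or
# facility overhead is a simplifying assumption.

DAYS_PER_YEAR = 365
KWH_PER_MWH = 1000


def annual_electricity_cost_usd(mwh_per_day: float, usd_per_kwh: float) -> float:
    """Yearly electricity spend in USD for a constant daily energy draw."""
    if mwh_per_day < 0 or usd_per_kwh < 0:
        raise ValueError("inputs must be non-negative")
    return mwh_per_day * DAYS_PER_YEAR * KWH_PER_MWH * usd_per_kwh


if __name__ == "__main__":
    low = annual_electricity_cost_usd(10, 0.05)    # low load, cheap power
    high = annual_electricity_cost_usd(100, 0.10)  # high load, pricier power
    print(f"annual inference electricity: ${low:,.0f} to ${high:,.0f}")
```

The two corners bracket the plausible spend: low load at the cheap rate, high load at the expensive rate.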

For data center operators: AI workload growth is real and drives capacity planning. Peak-power density for GPU racks (40-80 kW/rack) far exceeds traditional server racks (5-10 kW/rack). The facility implications — cooling, power distribution, rack design — require re-investment regardless of whether the total MWh is large.
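To see why density matters more than total MWh here, consider how many racks it takes to host the same load. The 1 MW IT load below is a hypothetical example; the per-rack figures are the ranges quoted above.

```python
import math

# Rack count needed to host a given IT load at different power densities.
# HYPOTHETICAL_LOAD_KW is an illustrative value; per-rack ranges are from
# the post (GPU racks 40-80 kW, traditional racks 5-10 kW).

HYPOTHETICAL_LOAD_KW = 1000  # 1 MW, chosen only for illustration


def racks_needed(it_load_kw: float, kw_per_rack: float) -> int:
    """Smallest whole number of racks that can carry it_load_kw."""
    if kw_per_rack <= 0:
        raise ValueError("kw_per_rack must be positive")
    return math.ceil(it_load_kw / kw_per_rack)


if __name__ == "__main__":
    for label, densities in (("GPU", (80, 40)), ("traditional", (10, 5))):
        fewest = racks_needed(HYPOTHETICAL_LOAD_KW, densities[0])
        most = racks_needed(HYPOTHETICAL_LOAD_KW, densities[1])
        print(f"{label} racks for 1 MW: {fewest} to {most}")
```

The same megawatt fits in 13-25 GPU racks versus 100-200 traditional racks, so the cooling and power distribution per square meter have to change even if the building's total draw does not.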

For utilities and regulators: large AI training clusters are new localized loads — a new training cluster might demand 100-200 MW continuously, comparable to a small city. Interconnect requests tied to AI/data center growth have exploded. The grid-planning implications are the story, not the global TWh numbers.
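Continuous power and total energy are different lenses on the same cluster. The sketch below converts the 100-200 MW figure into annual energy and into the time needed to deliver a ~25,000 MWh training run. It assumes the cluster runs at full rated power around the clock, which real clusters rarely do.

```python
# Relate a cluster's continuous power draw to energy over time.
# Assumption: the cluster draws its full rated power 24/7 (an upper bound;
# real utilization is lower).

HOURS_PER_YEAR = 8760
HOURS_PER_DAY = 24


def cluster_annual_gwh(power_mw: float) -> float:
    """Energy in GWh consumed in one year at a constant power draw."""
    return power_mw * HOURS_PER_YEAR / 1000


def days_to_consume(energy_mwh: float, power_mw: float) -> float:
    """Days a cluster at power_mw needs to consume energy_mwh."""
    if power_mw <= 0:
        raise ValueError("power_mw must be positive")
    return energy_mwh / power_mw / HOURS_PER_DAY


if __name__ == "__main__":
    for mw in (100, 200):
        print(
            f"{mw} MW: {cluster_annual_gwh(mw):,.0f} GWh/yr, "
            f"{days_to_consume(25_000, mw):.1f} days per 25,000 MWh run"
        )
```

A 100 MW cluster at full draw would pass 25,000 MWh in about ten days. What the grid has to plan for is the steady megawatt demand at one location, not the energy of any single run.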

For policy-makers: regulations targeting AI energy specifically are mostly missing the target. Data center efficiency, grid decarbonization, and hardware efficiency improvements drive the actual outcomes. 'AI tax' proposals capture little of the dynamic and create perverse incentives.

Efficiency is moving fast

Per-query inference energy has dropped roughly 10x in 3 years across comparable capabilities. H200 is 30-40% more efficient than H100 per flop; B200 improves that further. On the software side, speculative decoding, quantization, and better kernels deliver another 2-3x. Combined, these gains have offset much of the growth in deployment volume, but not all of it: total inference energy is still rising.
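Two of the post's figures can be combined into a rough consistency check: a 10x per-query drop over 3 years, and total inference energy that is probably doubling each year. The sketch below treats total energy as query volume times energy per query and assumes the query mix stays comparable, which is a simplification.

```python
# Total energy = query volume x energy per query.
# So yearly total-energy growth = volume growth / efficiency gain.
# Inputs are the post's hedged figures; a constant query mix is assumed.


def annual_factor(total_factor: float, years: float) -> float:
    """Per-year compound factor equivalent to total_factor over `years`."""
    if total_factor <= 0 or years <= 0:
        raise ValueError("total_factor and years must be positive")
    return total_factor ** (1.0 / years)


def implied_volume_growth(total_energy_growth: float, efficiency_gain: float) -> float:
    """Yearly query-volume growth implied by energy growth and efficiency gain."""
    return total_energy_growth * efficiency_gain


if __name__ == "__main__":
    yearly_gain = annual_factor(10.0, 3.0)            # about 2.15x per year
    volume = implied_volume_growth(2.0, yearly_gain)  # about 4.3x per year
    print(f"efficiency gain: {yearly_gain:.2f}x/yr, implied volume growth: {volume:.2f}x/yr")
```

If both figures hold, query volume would need to grow a bit more than 4x per year. That is a useful cross-check whenever a new efficiency or growth claim shows up.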

The honest summary

AI energy is not an existential environmental crisis today. It is also not nothing — it is a meaningful and growing slice of data center load, with localized grid implications that require serious planning. The mature framing is: efficiency investment, grid decarbonization, and load-aware siting, not apocalypse narratives or dismissal.
