03. How It Works
Compute economics and training costs
Training compute for notable AI models has grown approximately 4-5x per year since 2010, according to Epoch AI's analysis of 333 models published through May 2024. GPT-3 (2020) used roughly 3 x 10^23 FLOP. GPT-4 (2023) is estimated at 2 x 10^25 FLOP, and Gemini Ultra (2023) at approximately 5 x 10^25 FLOP. The jump from GPT-3 to GPT-4 is roughly 67x over three years, which is consistent with compounding at about 4x per year.
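The relationship between the per-year growth rate and the gap between two models can be checked with a few lines of arithmetic. The sketch below uses only the GPT-3 and GPT-4 figures quoted above. The three-year spacing is an assumption that treats release years as a stand-in for training dates, and the function name is illustrative.

```python
def implied_annual_growth(flop_start, flop_end, years):
    """Return the constant yearly multiplier that turns flop_start into flop_end."""
    return (flop_end / flop_start) ** (1.0 / years)


GPT3_FLOP = 3e23  # GPT-3 (2020), figure quoted in the text
GPT4_FLOP = 2e25  # GPT-4 (2023), estimate quoted in the text
YEARS_APART = 3   # assumption: release years stand in for training dates

total_ratio = GPT4_FLOP / GPT3_FLOP
per_year = implied_annual_growth(GPT3_FLOP, GPT4_FLOP, YEARS_APART)

print(f"GPT-3 -> GPT-4: about {total_ratio:.0f}x in total")
print(f"Implied growth: about {per_year:.1f}x per year")
```

Under these assumptions the total gap is about 67x and the implied rate is about 4x per year, at the low end of the 4-5x range.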
In dollar terms, training GPT-4 is widely estimated to have cost $50-100 million in compute alone, based on reported hardware and cloud pricing. Frontier models trained at scale on 2026 hardware (H100/H200 clusters) cost more. The critical dynamic is that only organizations with access to large GPU clusters can participate. As of 2026, this means primarily OpenAI (backed by Microsoft), Google DeepMind, Anthropic (backed by Amazon and Google), Meta AI, and xAI, plus a handful of Chinese labs. There is no meaningful academic participation at the true frontier.
Hardware concentration mirrors model concentration. NVIDIA dominates AI accelerator supply with an estimated 80% or more of the market for training chips. The H100 and H200 GPUs (and the successor Blackwell architecture) are produced by TSMC in Taiwan on 4nm and 3nm processes. Export controls imposed by the US in 2022 and expanded in 2023 restrict sales of advanced AI chips to China, which has responded by accelerating domestic chip development through Huawei and others.
Inference costs are lower per query but aggregate to significant sums at scale. A ChatGPT query requires approximately 2.9 watt-hours of electricity versus 0.3 watt-hours for a Google search, according to the International Energy Agency, roughly a 10x difference. At billions of queries per day, this adds up.
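To make "adds up" concrete, the per-query figures can be scaled to a daily and yearly total. This is a rough sketch: the two watt-hour values are the IEA figures quoted above, while the query volume of one billion per day is a hypothetical round number chosen for illustration, not a reported statistic. The function names are also illustrative.

```python
WH_PER_CHATGPT_QUERY = 2.9  # IEA figure quoted in the text
WH_PER_GOOGLE_SEARCH = 0.3  # IEA figure quoted in the text


def daily_energy_mwh(queries_per_day, wh_per_query):
    """Convert a per-query energy cost in watt-hours into MWh per day."""
    return queries_per_day * wh_per_query / 1_000_000  # 1 MWh = 1,000,000 Wh


def annual_energy_twh(queries_per_day, wh_per_query, days=365):
    """Scale the daily total to a year, expressed in TWh."""
    return daily_energy_mwh(queries_per_day, wh_per_query) * days / 1_000_000


QUERIES_PER_DAY = 1_000_000_000  # hypothetical volume, for illustration only

for label, wh in [("ChatGPT-style query", WH_PER_CHATGPT_QUERY),
                  ("Google search", WH_PER_GOOGLE_SEARCH)]:
    per_day = daily_energy_mwh(QUERIES_PER_DAY, wh)
    per_year = annual_energy_twh(QUERIES_PER_DAY, wh)
    print(f"{label}: {per_day:,.0f} MWh/day, {per_year:.2f} TWh/year")
```

At that assumed volume, the ChatGPT-style workload comes to about 2,900 MWh per day, or roughly 1 TWh per year, against about 300 MWh per day for the same number of searches.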
Labor market impact
The labor economics of AI is contested. Early evidence divides roughly as follows:
Productivity augmentation evidence:
GitHub Copilot studies showed developers completing tasks roughly 55% faster. Experimental studies of AI writing tools show measurable output gains for knowledge workers. A 2023 study by Erik Brynjolfsson, Danielle Li, and Lindsey Raymond examining a customer-service AI tool found a 14% average productivity increase, with the largest gains for lower-skilled workers. This suggests AI can reduce skill differentials within a job category.
Displacement evidence:
The McKinsey Global Institute estimated in 2023 that generative AI, combined with other technologies, could automate work activities that absorb 60-70% of employees' time today. The most exposed roles are those involving routine information processing, document drafting, and data entry. Translation, paralegal support, and entry-level coding are already seeing headcount pressure. Fiverr and Upwork reported reduced demand for certain freelance categories post-ChatGPT.
The augmentation vs. displacement divide by job type:
Higher-skill cognitive work (research, strategy, engineering) tends toward augmentation: AI tools raise productivity without eliminating the judgment component. Lower-skill information tasks (transcription, basic data entry, template drafting) face more displacement risk. Manual labor requiring physical dexterity is currently less exposed. Healthcare, education, and trades are considered more resilient.
Historical context:
Prior technology transitions (electrification, computing) displaced some occupations while creating others. AI may follow the same pattern over decades while causing short-term disruption. Whether AI is categorically different because it targets cognitive work specifically is a live research question without resolved consensus.
Energy, water, and carbon footprint
Electricity:
Global data centers consumed approximately 200-250 terawatt-hours per year through about 2020, roughly 1% of global electricity, and the total stayed largely flat thanks to efficiency improvements. Since 2020, AI workloads have broken this plateau. A single ChatGPT-scale query requires nearly 10x the energy of a Google search. Goldman Sachs Research estimates data center power demand will grow 160% by 2030, rising from 1-2% of global electricity to 3-4%. Goldman analysts project AI will represent about 19% of data center power demand by 2028. In the US, data centers used about 3% of national electricity in 2022, and Goldman expects this to reach 8% by 2030.
The CO2 implications are significant. Goldman Sachs estimates data center carbon emissions may more than double between 2022 and 2030 even accounting for renewable energy investments, at a present-value "social cost" of $125-140 billion.
Water:
Less visible than electricity but measurable. Data center cooling systems use large amounts of water. Training GPT-3 in Microsoft's US data centers directly evaporated approximately 700,000 liters of freshwater, according to research published by Li et al. (arXiv:2304.03271, accepted to Communications of the ACM). The same research projects that global AI demand will account for 4.2-6.6 billion cubic meters of water withdrawal in 2027, more than 4-6 times Denmark's total annual withdrawal. Water stress is location-dependent: data centers in arid areas draw from scarcer supplies.
Training vs. inference:
Training a large model is a one-time (though repeatedly iterated) compute spike. Inference, serving millions of queries daily, accumulates steadily. For deployed models with large user bases, cumulative inference energy likely exceeds training energy within months to years of launch.
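The crossover point can be estimated by dividing a training energy budget by daily inference energy. The sketch below is illustrative only: the 1,000 MWh training figure and the query volumes are hypothetical placeholders, not measurements of any real model, and the 2.9 Wh per query is the IEA figure cited earlier in this section. It also ignores retraining runs, which would push the break-even later.

```python
def breakeven_days(training_mwh, wh_per_query, queries_per_day):
    """Days of serving after which cumulative inference energy equals training energy."""
    inference_mwh_per_day = queries_per_day * wh_per_query / 1_000_000
    if inference_mwh_per_day <= 0:
        raise ValueError("daily inference energy must be positive")
    return training_mwh / inference_mwh_per_day


TRAINING_MWH = 1_000  # hypothetical training energy, for illustration only
WH_PER_QUERY = 2.9    # IEA per-query figure cited earlier in this section

for queries_per_day in (1_000_000, 10_000_000, 100_000_000):  # hypothetical volumes
    days = breakeven_days(TRAINING_MWH, WH_PER_QUERY, queries_per_day)
    print(f"{queries_per_day:>11,} queries/day -> break-even after about {days:,.1f} days")
```

With these placeholder inputs the break-even moves from roughly a year at one million queries per day to a few days at one hundred million, which shows how strongly the answer depends on user volume.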
Carbon intensity:
Varies dramatically by grid. A data center powered by Icelandic geothermal or Norwegian hydro has near-zero operational carbon. One powered by coal-heavy grids in parts of the US Midwest or Southeast has much higher emissions. Tech companies are investing in renewables and entering power purchase agreements, but they are also increasing total consumption faster than they are decarbonizing.
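The effect of grid mix follows from a single multiplication: operational emissions equal energy consumed times the carbon intensity of the grid. The sketch below applies that formula to one fixed workload. The intensity values and the 1,000 MWh workload are illustrative placeholders chosen to show the spread, not measured figures for any particular grid or data center, and embodied emissions from hardware and construction are left out.

```python
def operational_emissions_tonnes(energy_mwh, grid_g_co2_per_kwh):
    """Operational CO2 in metric tonnes for a workload on a grid of given intensity."""
    kwh = energy_mwh * 1_000             # 1 MWh = 1,000 kWh
    grams = kwh * grid_g_co2_per_kwh     # intensity is in grams of CO2 per kWh
    return grams / 1_000_000             # 1 tonne = 1,000,000 g


# Illustrative placeholder intensities in gCO2/kWh, not measured values.
ILLUSTRATIVE_GRIDS = {
    "hydro/geothermal-heavy": 25,
    "mixed": 400,
    "coal-heavy": 800,
}

WORKLOAD_MWH = 1_000  # hypothetical workload, for illustration only

for grid, intensity in ILLUSTRATIVE_GRIDS.items():
    tonnes = operational_emissions_tonnes(WORKLOAD_MWH, intensity)
    print(f"{grid}: {tonnes:,.0f} tCO2 for {WORKLOAD_MWH:,} MWh")
```

The same workload differs by more than an order of magnitude in emissions across these placeholder grids, which is why siting and power purchase agreements matter as much as efficiency gains.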