May 22, 2026

TSMC Is The Governor On The AI Bubble

"we need to throw a party for them"

Source: Patrick O'Shaughnessy and Invest Like the Best Podcast

Speaker: Gavin Baker

Recap

Demand Shock - Frontier Labs Reveal Unusual Revenue Pressure

0.2s-40.2s - Baker opens by saying Anthropic added $11B of ARR and compares that with Palantir, Snowflake, and Databricks taking roughly a decade to build their combined businesses, using it as evidence that AI demand is historically unusual (see also 137.8s-180.6s). Treat the number as Baker's claim.
553.0s-606.0s - He says Anthropic would be doing well north of $100B if it had all the compute and ties model quality to token budgets, arguing that rate-limited users receive a weaker experience.

Watts Constraint - Power Becomes A Timed Bottleneck

884.8s-968.1s - On power, he says capitalism will likely solve the watts shortage over time, but regulatory and political blowback are real risks; he expects the shortage to start easing in 2027-2028 and points to orbital compute as a longer-term solution.

Orbital Compute - Space As A Pressure Valve

973.0s-1276.4s - He reframes orbital compute as racks in space, not giant floating data centers, and says inference is a more sensible orbital workload than training.

TSMC Governor - Wafer Discipline As Bubble Control

1319.0s-1661.3s - The core TSMC claim comes in this segment: Baker argues that foundational technologies usually get bubbles, that TSMC's wafer constraint may prevent one, and that TSMC capacity decisions are the main thing he would watch.

Token Pricing - Demand Surfaces Through Usage

2276.5s-2367.9s - He says usage-based pricing is bullish for frontier labs because flat plans rate-limit heavy users, then speculates OpenAI and Anthropic could exceed $200B in ARR this year as more compute comes online and enterprise token pricing expands.

Hardware Finance - GPU Life And Credit Underwriting

2691.4s-2706.2s - He argues prefill/decode disaggregation can extend GPU useful lives by letting older GPUs handle some workloads longer, which could affect private-credit underwriting of AI hardware (see also 2932.5s-3031.3s).

Application Layer - Value Accrues Near The Token Path

3176.8s-3326.2s - On applications, he says value is accruing to energy, data centers, chips, and models, not broadly to applications; companies need to be in the token path or own a narrow hard niche.

The Brief

Baker's sharpest point is that AI demand may be real and still become a bubble if supply is allowed to run too far ahead. In his view, TSMC's wafer discipline is the hidden governor: it keeps Nvidia scarcity valuable, slows reckless overbuild, and forces every frontier model, chip startup, hyperscaler, and application company to compete for constrained compute.

Baker is saying AI is not mainly limited by ideas right now. It is limited by the physical system that makes and powers the computers: chips, wafer capacity, data centers, energy, financing, and time. If too many chips and data centers get built too quickly, the industry can create a bubble. If too few get built, AI companies cannot meet demand. TSMC matters because it controls much of the advanced chip supply, so its capacity decisions decide how fast the whole AI market can grow.

This is one of the cleanest allocation-economy sources in the Genesis set. Baker's argument turns AI from a software story into a capacity-allocation system: TSMC allocates wafers, Nvidia allocates scarce accelerators, frontier labs allocate tokens, data-center builders allocate power and permits, creditors allocate against GPU useful life, and startups fight to be close enough to token flow to capture value. The issue is not whether AI demand exists; it is who controls the bottlenecks that decide how much intelligence can actually be delivered.

TSMC capacity decisions become a better AI-market signal than many model launches, because they shape whether scarcity persists or overbuild begins.

Nvidia's power comes from being the main seller of shortage; that remains valuable while wafers, memory, power, optics, and deployment capacity are constrained.

Frontier labs gain leverage when pricing shifts from flat subscriptions to usage-based token plans, because heavy users reveal true demand instead of being rate-limited.
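The pricing point above can be made concrete with a toy calculation. The sketch below is illustrative only: the three users, the $20 flat fee, the 10M-token cap, and the $3 per million tokens price are hypothetical numbers chosen for the arithmetic, not figures from Baker or any lab, and it assumes enough compute exists to serve all requested tokens under usage pricing.

```python
# All figures below are hypothetical, chosen only to illustrate the mechanism.

def flat_plan(demands, monthly_fee, token_cap):
    """Flat subscription: each user pays the same fee and is rate-limited at token_cap."""
    served = sum(min(d, token_cap) for d in demands)
    revenue = monthly_fee * len(demands)
    return revenue, served

def usage_plan(demands, price_per_million):
    """Usage pricing: every requested token is served and billed.

    Assumes compute is available to serve all demand, which is the
    constraint the rest of the brief is about.
    """
    served = sum(demands)
    revenue = served * price_per_million / 1_000_000
    return revenue, served

# Monthly token demand for a light, a medium, and a heavy user (hypothetical).
demands = [2_000_000, 5_000_000, 50_000_000]

flat_revenue, flat_served = flat_plan(demands, monthly_fee=20.0, token_cap=10_000_000)
usage_revenue, usage_served = usage_plan(demands, price_per_million=3.0)

# Demand that the flat plan's rate limit hides from the provider.
hidden_demand = usage_served - flat_served

print(f"flat:  ${flat_revenue:.2f} revenue, {flat_served:,} tokens served")
print(f"usage: ${usage_revenue:.2f} revenue, {usage_served:,} tokens served")
print(f"demand hidden by the rate limit: {hidden_demand:,} tokens")
```

In this toy setup the heavy user accounts for all of the hidden demand, which is the sense in which usage pricing "reveals" it.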

Technical Need To Knows

TSMC: Taiwan Semiconductor Manufacturing Company, the dominant manufacturer of the most advanced chips. It matters because Baker treats TSMC's capacity discipline as the hidden control knob on how fast AI compute supply can grow.
Wafer: A round slice of silicon that many chips are manufactured on before being cut into individual processors. It matters because if wafer supply is constrained, even huge AI demand cannot instantly become more GPUs.
GPU: A processor originally built for graphics that became the main engine for modern AI training and inference. It matters because Nvidia's GPU scarcity is the economic center of Baker's argument.
Nvidia: The leading seller of AI GPUs and related systems. It matters because Baker argues Nvidia benefits from scarcity as long as wafers, memory, power, and deployment capacity remain constrained.
AI bubble: A period where investment runs too far ahead of real returns. It matters because Baker's core claim is that real demand can still become a bubble if supply expands too quickly.
Watts: Electrical power capacity. It matters because data centers cannot run more AI hardware unless they can secure enough power, grid access, cooling, and permits.
Orbital compute: Putting compute infrastructure in space rather than on Earth. It matters here as a speculative pressure valve for power and land constraints, especially for inference workloads that might tolerate space-based infrastructure.
Training: The expensive process of teaching an AI model from large amounts of data. It matters because training has different power, networking, and latency needs than inference.
Inference: Running a trained model to produce answers for users. It matters because Baker sees inference demand, especially reasoning and coding use, as the workload that keeps pulling more compute into production.
Reasoning models: AI models that spend more compute thinking through a problem before answering. They matter because Baker says they are more compute-hungry during inference than older non-reasoning models.
Token: A small unit of text or data processed by a model. It matters because model revenue, user limits, and compute demand increasingly show up as token usage.
ARR: Annual recurring revenue, a run-rate estimate for subscription or recurring business. It matters because Baker uses Anthropic's claimed ARR growth to argue AI demand is historically unusual.
Usage-based token pricing: Charging customers based on the amount of model usage rather than a flat subscription. It matters because heavy users reveal true compute demand when they pay for more tokens instead of being rate-limited.
Prefill: The part of inference where the model reads the prompt and context. It matters because prefill can potentially run on different hardware than token generation, changing which chips remain useful.
Decode: The part of inference where the model generates new output tokens. It matters because Baker argues decode has different bottlenecks than prefill, creating room for specialist accelerators.
Prefill/decode disaggregation: Splitting the input-reading and answer-generation stages across different hardware. It matters because it could extend GPU useful lives and change hardware-financing assumptions.
Private credit: Non-bank lending to companies or projects. It matters because lenders financing GPUs need to know how long those GPUs will remain economically useful.
Frontier models: The most capable AI models at the leading edge. They matter because Baker argues value and demand still concentrate near the frontier while cheaper models lag behind.
Token path: The business path where user activity directly creates model calls and token usage. It matters because Baker thinks applications closer to token flow capture more value than generic AI wrappers.
DRAM / HBM: Memory technologies; HBM is the high-bandwidth memory used in advanced AI accelerators. They matter because memory supply and performance can bottleneck AI hardware even when GPU demand is strong.
Optical networking: High-speed data movement using light. It matters because very large AI clusters need fast connections, so optics can become a supply-chain choke point alongside wafers and power.
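To see why GPU useful life matters to lenders, a minimal sketch follows. It assumes straight-line depreciation to zero, equal annual principal repayments, and no interest; the $100 cost, 80% advance rate, 5-year tenor, and 4-year versus 6-year useful lives are hypothetical inputs, not numbers from the episode or from any real loan.

```python
# Hypothetical underwriting sketch: how assumed GPU useful life changes
# collateral coverage on a hardware-backed loan. Not a real credit model.

def straight_line_value(cost, useful_life_years, year):
    """Remaining asset value under straight-line depreciation to zero."""
    remaining = max(useful_life_years - year, 0)
    return cost * remaining / useful_life_years

def loan_balance(principal, tenor_years, year):
    """Outstanding balance with equal annual principal repayments (interest ignored)."""
    remaining = max(tenor_years - year, 0)
    return principal * remaining / tenor_years

def collateral_coverage(cost, advance_rate, tenor_years, useful_life_years):
    """Return (year, asset value / loan balance) for each year the loan is outstanding."""
    principal = cost * advance_rate
    rows = []
    for year in range(tenor_years):
        value = straight_line_value(cost, useful_life_years, year)
        balance = loan_balance(principal, tenor_years, year)
        rows.append((year, value / balance))
    return rows

short_life = collateral_coverage(cost=100.0, advance_rate=0.8, tenor_years=5, useful_life_years=4)
long_life = collateral_coverage(cost=100.0, advance_rate=0.8, tenor_years=5, useful_life_years=6)

for label, rows in (("4-year life", short_life), ("6-year life", long_life)):
    worst = min(coverage for _, coverage in rows)
    print(f"{label}: worst coverage over the loan = {worst:.2f}x")
```

Under these assumptions the 4-year case falls below 1.0x coverage before the loan is repaid, while the 6-year case never does. That is the mechanism by which anything that keeps older GPUs economically useful for longer, such as prefill/decode disaggregation, could change what lenders are willing to finance.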

Counterpoints and Caveats

Baker is an investor making a market-structure argument, not a primary source for Anthropic ARR, OpenAI/Anthropic revenue, TSMC contracts, Nvidia potential sales, orbital compute timelines, or GPU useful lives.
The transcript has caption noise, including "Taiwan Smi/Simmy" for TSMC and "Watson wafers" for watts and wafers. Exact quotes need video review before publication.
His power view is internally balanced but optimistic: Gartner and Anthropic-related reporting support power as a real bottleneck, while Baker thinks capitalism and, longer term, orbital compute will ease it. The timing of that easing is the key uncertainty.
The orbital compute section is high-upside but speculative. The source does not provide a primary SpaceX technical plan, signed customer contract, cost curve, maintenance model, or launch schedule.
The TSMC thesis can cut both ways. Capacity discipline may prevent overbuild, but if TSMC is too constrained, it can also push customers toward Intel, Samsung, custom silicon, or geopolitical workarounds that weaken the shortage premium.
The application-layer warning may understate workflows where distribution, proprietary data, trust, regulation, or user habit create durable value even outside the direct token path.

What Folks Are Saying

The official Colossus episode page frames the conversation around "watts and wafers," TSMC's ability to prevent an AI bubble, the frontier-lab prisoner's dilemma, GPU disaggregation, Terafab, and whether value keeps accruing to frontier models. That confirms the wafer-governor read is central, not incidental.
PJFP's commentary on the episode calls the TSMC bottleneck "the only thing standing between today's market and a full-on AI bubble" and highlights the same chain: Anthropic's vertical revenue takeoff, orbital compute as racks in space, usage pricing, prefill/decode, and token-path pressure on applications. Source: PJFP, May 20, 2026.
TrendForce, citing Reuters and Broadcom commentary, separately reports that TSMC is approaching production limits and that Broadcom views TSMC capacity, lasers, and PCBs as 2026 supply-chain choke points. That corroborates Baker's broader constraint map without validating every valuation claim. Source: TrendForce, March 24, 2026.
Gartner predicted that power availability would operationally constrain 40% of AI data centers by 2027, while Data Center Dynamics reports Anthropic's view that U.S. AI may need at least 50GW of electric capacity by 2028. These sources support Baker's claim that watts and permitting are real bottlenecks, though they also make his confidence that capitalism solves watts by 2027-2028 worth testing. Sources: Gartner, Data Center Dynamics.
