Every NVIDIA H100 GPU contains 80 GB of High Bandwidth Memory. Every H200 contains 141 GB. The Blackwell B200 ships with 192 GB. Multiply those numbers by the hundreds of thousands of GPUs being manufactured each quarter, and a staggering truth emerges: AI is consuming memory at a rate that is fundamentally restructuring the global DRAM supply chain.
High Bandwidth Memory—HBM—is not a new technology. SK Hynix shipped the first HBM modules in 2013. But until the generative AI explosion of 2023–2024, it was a niche product serving a small segment of the GPU and high-performance computing market. That is no longer the case. HBM revenue is projected to reach $62 billion in 2026, up from $2.4 billion in 2022. The ripple effects on the broader memory market—including DDR4 and DDR5 pricing on the secondary market—are only beginning to be understood.
HBM is a 3D-stacked memory architecture that bonds multiple DRAM dies vertically using through-silicon vias (TSVs). A single HBM3E stack contains 8 or 12 DRAM dies connected by thousands of microscopic vertical channels, delivering bandwidth that conventional memory architectures cannot match.
| Memory Type | Peak Bandwidth | Capacity | Power Efficiency |
|---|---|---|---|
| DDR5-5600 (Server RDIMM) | 44.8 GB/s per channel | 32–128 GB per module | ~3.7 pJ/bit |
| HBM2e | 460 GB/s per stack | 16 GB per stack | ~2.1 pJ/bit |
| HBM3 | 819 GB/s per stack | 24 GB per stack | ~1.8 pJ/bit |
| HBM3E | 1,180 GB/s per stack | 36 GB per stack | ~1.5 pJ/bit |
| HBM4 (expected late 2026) | ~1,600 GB/s per stack | 48 GB per stack | ~1.2 pJ/bit (est.) |
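The bandwidth column falls directly out of interface width and per-pin signaling rate. Here is a minimal Python sketch of the arithmetic; the pin rates are representative published figures for each generation, and the HBM4 values are assumptions chosen to match the table's expectation, not shipping specifications:

```python
# The per-stack bandwidth figures above are bus width x per-pin data
# rate / 8 (bits to bytes). HBM2e through HBM3E use a 1024-bit interface
# per stack; HBM4 is expected to widen to 2048 bits. The HBM4 pin rate
# below is an assumption matching the ~1,600 GB/s expectation above.

def bandwidth_gbs(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Peak interface bandwidth in GB/s."""
    return bus_width_bits * pin_rate_gbps / 8

print(bandwidth_gbs(1024, 3.6))   # HBM2e stack            -> 460.8
print(bandwidth_gbs(1024, 6.4))   # HBM3 stack             -> 819.2
print(bandwidth_gbs(1024, 9.2))   # HBM3E stack            -> 1177.6
print(bandwidth_gbs(2048, 6.4))   # HBM4 stack (expected)  -> 1638.4
print(bandwidth_gbs(64, 5.6))     # DDR5-5600 DIMM channel -> 44.8
```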
Large language models require enormous memory bandwidth to move weights and activations between memory and compute units during training and inference. During autoregressive inference, generating each token requires streaming essentially the model's entire weight set from memory, so a model with hundreds of billions of parameters must move hundreds of gigabytes per token. Conventional DDR5 cannot keep a GPU's tensor cores fed at that rate. HBM can. Memory bandwidth is the bottleneck that gates AI performance, and HBM is the technology that relieves it, which is why GPU manufacturers have no choice but to integrate more of it into every successive generation.
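A back-of-envelope roofline makes this concrete. The sketch below assumes batch size 1 and that every weight is read exactly once per token, ignoring KV-cache and activation traffic, so the results are optimistic ceilings rather than real-world throughput; the bandwidth figures are published peak specs for each GPU:

```python
# Memory-bandwidth roofline for LLM decode:
#   tokens/s ceiling = memory bandwidth / bytes streamed per token.
# Assumes batch size 1 and every weight read once per token; KV-cache
# and activation traffic are ignored, so real throughput is lower.

def decode_ceiling_tokens_per_s(params_billion: float,
                                bytes_per_param: int,
                                bandwidth_gbs: float) -> float:
    gb_per_token = params_billion * bytes_per_param  # GB streamed per token
    return bandwidth_gbs / gb_per_token

# A 70B-parameter model in FP16 (2 bytes/param) streams ~140 GB per token.
for gpu, bw_gbs in [("H100 SXM (HBM3, ~3,350 GB/s)", 3350),
                    ("H200 (HBM3E, ~4,800 GB/s)", 4800)]:
    print(f"{gpu}: ~{decode_ceiling_tokens_per_s(70, 2, bw_gbs):.0f} tokens/s")
```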
The HBM market is dominated by three manufacturers: SK Hynix, Samsung, and Micron. Their positions are not equal.
SK Hynix supplies the majority of HBM3E for NVIDIA’s data center GPUs and holds an estimated 50–55% market share in HBM. The company has invested heavily in converting conventional DRAM production lines to HBM manufacturing, with its Icheon and Cheongju fabs now dedicating an increasing percentage of wafer output to HBM stacks. SK Hynix’s HBM is sold out through the end of 2026, with allocation commitments already stretching into 2027.
Samsung was slower to ramp HBM3E production and faced yield issues that delayed its qualification with NVIDIA. As of Q1 2026, Samsung has closed the gap and is shipping HBM3E at scale, but it holds roughly 30–35% market share. Samsung’s advantage is its sheer DRAM fabrication capacity—it is the world’s largest memory manufacturer—which gives it the ability to convert more production to HBM as demand warrants.
Micron entered the HBM market later than its Korean competitors but has been aggressively ramping production at its Hiroshima, Japan fab. Micron holds approximately 12–15% of the HBM market. Its 12-high HBM3E stacks have been qualified by NVIDIA and are shipping in volume. Micron is also the primary beneficiary of CHIPS Act funding in the United States, which will expand its domestic HBM production capacity.
Here is the critical dynamic that most market observers underappreciate: HBM and commodity DRAM (DDR4, DDR5) are manufactured in the same fabs, on the same advanced process nodes, competing for the same finite pool of wafer starts.
An HBM3E stack requires 8–12 DRAM dies that meet stringent quality and performance requirements. These dies are purpose-designed for stacking (they carry TSVs that commodity dies lack), but they come off the same production lines and nodes that would otherwise produce DDR5. When a memory manufacturer converts a production line to HBM, every wafer allocated to HBM is a wafer that does not produce DDR5.
Every HBM stack that ships in an H200 consumes die output that could otherwise have become DDR5 modules, and because HBM dies are physically larger and TSV processing reduces yield, each HBM bit consumes substantially more wafer area than a DDR5 bit. The AI memory arms race is a zero-sum reallocation of finite DRAM production capacity.
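To put rough numbers on that trade, here is an illustrative sketch. The stack and module layouts are assumptions chosen for illustration, and the result understates the real cost for the reasons just noted:

```python
# Illustrative die accounting: what one HBM-equipped GPU "costs" the
# DDR5 supply. Assumed layouts: 8-high HBM3E stacks (12-high also
# ships) and a 32 GB DDR5 RDIMM built from sixteen 16 Gb (2 GB) dies.
# HBM dies are larger and TSV processing lowers yield, so this is a
# lower bound on the true wafer-area trade-off.

dies_per_hbm_stack = 8        # 8-high HBM3E, excluding the base logic die
stacks_per_gpu = 6            # e.g., an H200 carries six HBM3E stacks
dies_per_32gb_rdimm = 16      # assumed module layout

gpu_dram_dies = dies_per_hbm_stack * stacks_per_gpu
equivalent_rdimms = gpu_dram_dies / dies_per_32gb_rdimm
print(f"{gpu_dram_dies} DRAM dies per GPU, "
      f"~{equivalent_rdimms:.0f}x 32 GB RDIMMs' worth of die output")
```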
The numbers are striking. Industry analysts estimate that 20–25% of all advanced-node DRAM wafer output is now allocated to HBM production, up from less than 5% in 2022. This diversion is occurring at the same time that DDR5 adoption is accelerating in enterprise servers, creating a supply squeeze on commodity server memory that would not exist without AI-driven HBM demand.
Memory manufacturers are not converting production to HBM out of altruism. The economics are irresistible.
| Product | Revenue per Wafer (Est.) | Margin Profile |
|---|---|---|
| DDR4 RDIMM (32 GB) | $800–$1,200 | 15–25% |
| DDR5 RDIMM (64 GB) | $2,000–$3,000 | 25–35% |
| HBM3E (36 GB stack) | $8,000–$12,000 | 50–60%+ |
HBM generates 4–6x the revenue per wafer compared to DDR5 and 8–10x compared to DDR4. Gross margins on HBM are roughly double those of commodity DRAM. No rational manufacturer would allocate wafer capacity to DDR4 or DDR5 when HBM commands this premium. The market is behaving accordingly.
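A trivial calculation over the midpoints of the table's estimated ranges makes the multiple explicit:

```python
# Revenue-per-wafer multiples implied by the midpoints of the estimates above.
wafer_revenue_usd = {
    "DDR4": (800 + 1_200) / 2,       # $1,000
    "DDR5": (2_000 + 3_000) / 2,     # $2,500
    "HBM3E": (8_000 + 12_000) / 2,   # $10,000
}
for commodity in ("DDR4", "DDR5"):
    ratio = wafer_revenue_usd["HBM3E"] / wafer_revenue_usd[commodity]
    print(f"HBM3E vs {commodity}: {ratio:.0f}x revenue per wafer")
# Midpoints give 10x vs DDR4 and 4x vs DDR5, consistent with the
# 8-10x and 4-6x ranges across the full estimates.
```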
The HBM-driven supply diversion is creating two distinct effects on the commodity DRAM market:
DDR4 is being squeezed from both sides. On the demand side, the enterprise transition to DDR5 is reducing new-server DDR4 requirements. On the supply side, no manufacturer is investing in DDR4 production capacity—every incremental dollar goes to HBM or DDR5.
The result: DDR4 secondary-market prices have fallen 30–40% year over year as enterprises offload DDR4 during server refreshes. But this decline may be temporary. As DDR4 production continues to shrink, the supply of new DDR4 modules will eventually contract to the point where secondary-market pricing stabilizes or even rises for specific SKUs. This pattern has played out in every previous DDR generation transition: DDR3 pricing firmed significantly in 2020–2021 as the major manufacturers wound down production.
| DDR4 Module | Q1 2025 (Secondary) | Q1 2026 (Secondary) | Change (midpoint) |
|---|---|---|---|
| 16 GB 2Rx8 PC4-2666V | $18–$24 | $10–$16 | −38% |
| 32 GB 2Rx4 PC4-3200AA | $32–$44 | $22–$30 | −32% |
| 64 GB 2Rx4 PC4-3200AA | $65–$85 | $42–$58 | −33% |
| 128 GB 4Rx4 PC4-3200 LRDIMM | $180–$240 | $120–$165 | −32% |
DDR5 should be in oversupply right now. Production has been ramping for two years, yields have improved, and new fabs are online. But DDR5 pricing on both primary and secondary markets has been firmer than analysts predicted. The reason is the HBM diversion: wafer capacity that would have produced DDR5 is instead producing higher-margin HBM stacks.
For secondary-market participants, this creates a counterintuitive dynamic. Used DDR5 RDIMMs from server decommissions are holding value better than DDR4 did at the same stage of its lifecycle. A used 64 GB DDR5-4800 RDIMM commands $85–$110 on the secondary market, a modest 25–35% discount to new, compared with the 45–55% discounts that were typical for DDR4 at the equivalent point in its lifecycle.
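The arithmetic behind that comparison, sketched from the quoted ranges (the "new" price here is implied by the quoted discount rather than taken from an observed price list):

```python
# Retained-value math from the secondary-market figures quoted above.
# The "new" price is implied by the quoted discount, not observed.
used_mid = (85 + 110) / 2            # midpoint used price, 64 GB DDR5-4800
discount_mid = (0.25 + 0.35) / 2     # midpoint quoted discount to new
implied_new = used_mid / (1 - discount_mid)

print(f"Implied new price: ~${implied_new:.0f}")               # ~$139
print(f"DDR5 retained value: ~{1 - discount_mid:.0%} of new")  # ~70%
print(f"DDR4 at the same stage: ~{1 - 0.50:.0%} of new")       # ~50%
```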
One question we receive frequently: will there be a secondary market for HBM?
The short answer is: not in any meaningful way. HBM is physically bonded to the GPU package during manufacturing using advanced 2.5D packaging (CoWoS or its equivalents). It cannot be removed, replaced, or resold separately from the GPU. When an H100 is decommissioned, the HBM goes with it—as part of the GPU, not as a standalone memory module.
This means the secondary-market value of HBM is captured entirely in GPU resale pricing. An H100 with 80 GB of HBM3 and an H200 with 141 GB of HBM3E are priced differently on the secondary market not just because of their compute capabilities, but because of the memory capacity embedded in them. The HBM is inseparable from the silicon.
The HBM-driven restructuring of the memory market carries several practical implications for buyers and sellers of secondary enterprise hardware.
For decades, memory was a commodity. DDR modules were interchangeable, widely available, and priced on transparent spot markets. The AI era is changing that calculus. Memory is becoming a strategic bottleneck—the component that determines how large a model can be trained, how fast inference runs, and how many users a deployment can serve.
The implications extend beyond GPUs and servers. Enterprise buyers who need commodity DDR5 for standard server deployments are competing, indirectly, with AI companies whose GPU orders are consuming the wafer capacity that would have produced those DDR5 modules. This is not a temporary disruption. It is a structural reallocation of semiconductor manufacturing priority that will persist as long as AI workloads continue growing.
For secondary-market participants, the takeaway is clear: memory pricing is no longer governed solely by supply-and-demand dynamics within the server memory market. It is now inextricably linked to the AI hardware cycle. Understanding HBM demand is essential to forecasting DDR5 supply—and by extension, secondary-market pricing for every server that uses it.
We source DDR4 and DDR5 RDIMMs across every major OEM configuration. Volume pricing available.
Get in Touch →