Every NVIDIA H100 GPU contains 80 GB of High Bandwidth Memory. Every H200 contains 141 GB. The Blackwell B200 ships with 192 GB. Multiply those numbers by the hundreds of thousands of GPUs being manufactured each quarter, and a staggering truth emerges: AI is consuming memory at a rate that is fundamentally restructuring the global DRAM supply chain.
High Bandwidth Memory—HBM—is not a new technology. SK Hynix shipped the first HBM modules in 2013. But until the generative AI explosion of 2023–2024, it was a niche product serving a small segment of the GPU and high-performance computing market. That is no longer the case. HBM revenue is projected to reach $62 billion in 2026, up from $2.4 billion in 2022. The ripple effects on the broader memory market—including DDR4 and DDR5 pricing on the secondary market—are only beginning to be understood.
HBM is a 3D-stacked memory architecture that bonds multiple DRAM dies vertically using through-silicon vias (TSVs). A single HBM3E stack contains 8 or 12 DRAM dies connected by thousands of microscopic vertical channels, delivering bandwidth that conventional memory architectures cannot match.
| Memory Type | Peak Bandwidth | Capacity | Power Efficiency |
|---|---|---|---|
| DDR5-5600 (Server RDIMM) | 44.8 GB/s per channel | 32–128 GB per module | ~3.7 pJ/bit |
| HBM2e | 460 GB/s per stack | 16 GB per stack | ~2.1 pJ/bit |
| HBM3 | 819 GB/s per stack | 24 GB per stack | ~1.8 pJ/bit |
| HBM3E | 1,180 GB/s per stack | 36 GB per stack | ~1.5 pJ/bit |
| HBM4 (expected late 2026) | ~1,600 GB/s per stack | 48 GB per stack | ~1.2 pJ/bit (est.) |
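The bandwidth column falls directly out of interface width and per-pin signaling rate. Here is a minimal Python sketch of the arithmetic; the pin rates are representative published figures for each generation, and the HBM4 values are assumptions chosen to match the table's expectation, not shipping specifications:

```python
# The per-stack bandwidth figures above are bus width x per-pin data
# rate / 8 (bits to bytes). HBM2e through HBM3E use a 1024-bit interface
# per stack; HBM4 is expected to widen to 2048 bits. The HBM4 pin rate
# below is an assumption matching the ~1,600 GB/s expectation above.

def bandwidth_gbs(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Peak interface bandwidth in GB/s."""
    return bus_width_bits * pin_rate_gbps / 8

print(bandwidth_gbs(1024, 3.6))   # HBM2e stack            -> 460.8
print(bandwidth_gbs(1024, 6.4))   # HBM3 stack             -> 819.2
print(bandwidth_gbs(1024, 9.2))   # HBM3E stack            -> 1177.6
print(bandwidth_gbs(2048, 6.4))   # HBM4 stack (expected)  -> 1638.4
print(bandwidth_gbs(64, 5.6))     # DDR5-5600 DIMM channel -> 44.8
```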
Large language models require enormous memory bandwidth to move weights and activations between memory and compute units during training and inference. During autoregressive inference, generating each token requires streaming essentially the model's entire weight set from memory, so a model with hundreds of billions of parameters must move hundreds of gigabytes per token. Conventional DDR5 cannot keep a GPU's tensor cores fed at that rate. HBM can. Memory bandwidth is the bottleneck that gates AI performance, and HBM is the technology that relieves it, which is why GPU manufacturers have no choice but to integrate more of it into every successive generation.
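A back-of-envelope roofline makes this concrete. The sketch below assumes batch size 1 and that every weight is read exactly once per token, ignoring KV-cache and activation traffic, so the results are optimistic ceilings rather than real-world throughput; the bandwidth figures are published peak specs for each GPU:

```python
# Memory-bandwidth roofline for LLM decode:
#   tokens/s ceiling = memory bandwidth / bytes streamed per token.
# Assumes batch size 1 and every weight read once per token; KV-cache
# and activation traffic are ignored, so real throughput is lower.

def decode_ceiling_tokens_per_s(params_billion: float,
                                bytes_per_param: int,
                                bandwidth_gbs: float) -> float:
    gb_per_token = params_billion * bytes_per_param  # GB streamed per token
    return bandwidth_gbs / gb_per_token

# A 70B-parameter model in FP16 (2 bytes/param) streams ~140 GB per token.
for gpu, bw_gbs in [("H100 SXM (HBM3, ~3,350 GB/s)", 3350),
                    ("H200 (HBM3E, ~4,800 GB/s)", 4800)]:
    print(f"{gpu}: ~{decode_ceiling_tokens_per_s(70, 2, bw_gbs):.0f} tokens/s")
```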
The HBM market is dominated by three manufacturers: SK Hynix, Samsung, and Micron. Their positions are not equal.
SK Hynix supplies the majority of HBM3E for NVIDIA’s data center GPUs and holds an estimated 50–55% market share in HBM. The company has invested heavily in converting conventional DRAM production lines to HBM manufacturing, with its Icheon and Cheongju fabs now dedicating an increasing percentage of wafer output to HBM stacks. SK Hynix’s HBM is sold out through the end of 2026, with allocation commitments already stretching into 2027.
Samsung was slower to ramp HBM3E production and faced yield issues that delayed its qualification with NVIDIA. As of Q1 2026, Samsung has closed the gap and is shipping HBM3E at scale, but it holds roughly 30–35% market share. Samsung’s advantage is its sheer DRAM fabrication capacity—it is the world’s largest memory manufacturer—which gives it the ability to convert more production to HBM as demand warrants.
Micron entered the HBM market later than its Korean competitors but has been aggressively ramping production at its Hiroshima, Japan fab. Micron holds approximately 12–15% of the HBM market. Its 12-high HBM3E stacks have been qualified by NVIDIA and are shipping in volume. Micron is also the primary beneficiary of CHIPS Act funding in the United States, which will expand its domestic HBM production capacity.
Here is the critical dynamic that most market observers underappreciate: HBM and commodity DRAM (DDR4, DDR5) are manufactured in the same fabs, on the same advanced process nodes, competing for the same finite pool of wafer starts.
An HBM3E stack requires 8–12 DRAM dies that meet stringent quality and performance requirements. These dies are purpose-designed for stacking (they carry TSVs that commodity dies lack), but they come off the same production lines and nodes that would otherwise produce DDR5. When a memory manufacturer converts a production line to HBM, every wafer allocated to HBM is a wafer that does not produce DDR5.
Every HBM stack that ships in an H200 consumes die output that could otherwise have become DDR5 modules, and because HBM dies are physically larger and TSV processing reduces yield, each HBM bit consumes substantially more wafer area than a DDR5 bit. The AI memory arms race is a zero-sum reallocation of finite DRAM production capacity.
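To put rough numbers on that trade, here is an illustrative sketch. The stack and module layouts are assumptions chosen for illustration, and the result understates the real cost for the reasons just noted:

```python
# Illustrative die accounting: what one HBM-equipped GPU "costs" the
# DDR5 supply. Assumed layouts: 8-high HBM3E stacks (12-high also
# ships) and a 32 GB DDR5 RDIMM built from sixteen 16 Gb (2 GB) dies.
# HBM dies are larger and TSV processing lowers yield, so this is a
# lower bound on the true wafer-area trade-off.

dies_per_hbm_stack = 8        # 8-high HBM3E, excluding the base logic die
stacks_per_gpu = 6            # e.g., an H200 carries six HBM3E stacks
dies_per_32gb_rdimm = 16      # assumed module layout

gpu_dram_dies = dies_per_hbm_stack * stacks_per_gpu
equivalent_rdimms = gpu_dram_dies / dies_per_32gb_rdimm
print(f"{gpu_dram_dies} DRAM dies per GPU, "
      f"~{equivalent_rdimms:.0f}x 32 GB RDIMMs' worth of die output")
```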
The numbers are striking. Industry analysts estimate that 20–25% of all advanced-node DRAM wafer output is now allocated to HBM production, up from less than 5% in 2022. This diversion is occurring at the same time that DDR5 adoption is accelerating in enterprise servers, creating a supply squeeze on commodity server memory that would not exist without AI-driven HBM demand.
Memory manufacturers are not converting production to HBM out of altruism. The economics are irresistible.
| Product | Revenue per Wafer (Est.) | Margin Profile |
|---|---|---|
| DDR4 RDIMM (32 GB) | $800–$1,200 | 15–25% |
| DDR5 RDIMM (64 GB) | $2,000–$3,000 | 25–35% |
| HBM3E (36 GB stack) | $8,000–$12,000 | 50–60%+ |
HBM generates 4–6x the revenue per wafer compared to DDR5 and 8–10x compared to DDR4. Gross margins on HBM are roughly double those of commodity DRAM. No rational manufacturer would allocate wafer capacity to DDR4 or DDR5 when HBM commands this premium. The market is behaving accordingly.
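A trivial calculation over the midpoints of the table's estimated ranges makes the multiple explicit:

```python
# Revenue-per-wafer multiples implied by the midpoints of the estimates above.
wafer_revenue_usd = {
    "DDR4": (800 + 1_200) / 2,       # $1,000
    "DDR5": (2_000 + 3_000) / 2,     # $2,500
    "HBM3E": (8_000 + 12_000) / 2,   # $10,000
}
for commodity in ("DDR4", "DDR5"):
    ratio = wafer_revenue_usd["HBM3E"] / wafer_revenue_usd[commodity]
    print(f"HBM3E vs {commodity}: {ratio:.0f}x revenue per wafer")
# Midpoints give 10x vs DDR4 and 4x vs DDR5, consistent with the
# 8-10x and 4-6x ranges across the full estimates.
```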
The HBM-driven supply diversion is creating two distinct effects on the commodity DRAM market:
DDR4 is being squeezed from both sides. On the demand side, the enterprise transition to DDR5 is reducing new-server DDR4 requirements. On the supply side, no manufacturer is investing in DDR4 production capacity—every incremental dollar goes to HBM or DDR5.
The result: DDR4 secondary-market prices have fallen 30–40% year over year as enterprises offload DDR4 during server refreshes. But this decline may be temporary. As DDR4 production continues to shrink, the supply of new DDR4 modules will eventually contract to the point where secondary-market pricing stabilizes or even rises for specific SKUs. This pattern has played out in every previous DDR generation transition: DDR3 pricing firmed significantly in 2020–2021 as the major manufacturers wound down production.
| DDR4 Module | Q1 2025 (Secondary) | Q1 2026 (Secondary) | Change (midpoint) |
|---|---|---|---|
| 16 GB 2Rx8 PC4-2666V | $18–$24 | $10–$16 | −38% |
| 32 GB 2Rx4 PC4-3200AA | $32–$44 | $22–$30 | −32% |
| 64 GB 2Rx4 PC4-3200AA | $65–$85 | $42–$58 | −33% |
| 128 GB 4Rx4 PC4-3200 LRDIMM | $180–$240 | $120–$165 | −32% |
DDR5 should be in oversupply right now. Production has been ramping for two years, yields have improved, and new fabs are online. But DDR5 pricing on both primary and secondary markets has been firmer than analysts predicted. The reason is the HBM diversion: wafer capacity that would have produced DDR5 is instead producing higher-margin HBM stacks.
For secondary-market participants, this creates a counterintuitive dynamic. Used DDR5 RDIMMs from server decommissions are holding value better than DDR4 did at the same stage of its lifecycle. A used 64 GB DDR5-4800 RDIMM commands $85–$110 on the secondary market, a modest 25–35% discount to new, compared with the 45–55% discounts that were typical for DDR4 at the equivalent point in its lifecycle.
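The arithmetic behind that comparison, sketched from the quoted ranges (the "new" price here is implied by the quoted discount rather than taken from an observed price list):

```python
# Retained-value math from the secondary-market figures quoted above.
# The "new" price is implied by the quoted discount, not observed.
used_mid = (85 + 110) / 2            # midpoint used price, 64 GB DDR5-4800
discount_mid = (0.25 + 0.35) / 2     # midpoint quoted discount to new
implied_new = used_mid / (1 - discount_mid)

print(f"Implied new price: ~${implied_new:.0f}")               # ~$139
print(f"DDR5 retained value: ~{1 - discount_mid:.0%} of new")  # ~70%
print(f"DDR4 at the same stage: ~{1 - 0.50:.0%} of new")       # ~50%
```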
One question we receive frequently: will there be a secondary market for HBM?
The short answer is: not in any meaningful way. HBM is physically bonded to the GPU package during manufacturing using advanced 2.5D packaging (CoWoS or its equivalents). It cannot be removed, replaced, or resold separately from the GPU. When an H100 is decommissioned, the HBM goes with it—as part of the GPU, not as a standalone memory module.
This means the secondary-market value of HBM is captured entirely in GPU resale pricing. An H100 with 80 GB of HBM3 and an H200 with 141 GB of HBM3E are priced differently on the secondary market not just because of their compute capabilities, but because of the memory capacity embedded in them. The HBM is inseparable from the silicon.
The HBM-driven restructuring of the memory market carries several practical implications for buyers and sellers of secondary enterprise hardware.
For decades, memory was a commodity. DDR modules were interchangeable, widely available, and priced on transparent spot markets. The AI era is changing that calculus. Memory is becoming a strategic bottleneck—the component that determines how large a model can be trained, how fast inference runs, and how many users a deployment can serve.
The implications extend beyond GPUs and servers. Enterprise buyers who need commodity DDR5 for standard server deployments are competing, indirectly, with AI companies whose GPU orders are consuming the wafer capacity that would have produced those DDR5 modules. This is not a temporary disruption. It is a structural reallocation of semiconductor manufacturing priority that will persist as long as AI workloads continue growing.
For secondary-market participants, the takeaway is clear: memory pricing is no longer governed solely by supply-and-demand dynamics within the server memory market. It is now inextricably linked to the AI hardware cycle. Understanding HBM demand is essential to forecasting DDR5 supply—and by extension, secondary-market pricing for every server that uses it.
We source DDR4 and DDR5 RDIMMs across every major OEM configuration. Volume pricing available.
Get in Touch →