Maria Korolov
Contributing writer

Storage constraints add to AI data center bottleneck

Feature
Oct 22, 2025 | 7 mins

Hard-drive lead times have gone from a few weeks to more than a year due to AI demands, and enterprise flash storage prices are expected to rise with surging demand.

Credit: SasinTipchai / Shutterstock

After GPUs, storage capacity has emerged as the next major constraint for AI data centers. Hard-drive lead times are ballooning to more than a year, and enterprise flash storage is also expected to see shortages and price increases, experts say. This is being driven by an explosion in AI inferencing as trained models are put to use.

“AI inference isn’t just a GPU story, it’s a data story,” says Constellation analyst Chirag Mehta. “Expect tight supply into 2026, higher pricing, and a faster move to dense [flash storage] footprints, especially where power, space, and latency are constrained.”

Dell’Oro Group projects the storage drive market, encompassing both HDDs and SSDs, to grow at a CAGR of over 20% over the next five years. “Both technologies will continue to play distinct roles across different tiers of AI infrastructure storage,” says Dell’Oro Group analyst Baron Fung.

According to a TrendForce report released earlier this month, AI inference is creating huge demand for real-time data access, causing both hard disk drive (HDD) and solid-state drive (SSD) suppliers to increase their high-capacity options.

For example, HDD manufacturers are moving to next-generation, heat-assisted magnetic recording, which takes substantial investment, and production lines aren’t yet at full speed. As a result, the average price per gigabyte of HDDs has increased, diminishing their cost advantage over SSDs. Meanwhile, flash storage vendors are upping the capacity of their SSDs to 122 terabytes and higher, leading to lower per-gigabyte prices and better power consumption.

All of this is important because of the explosive growth in AI inference.

AI inference driving storage needs

AI inference is the computing that takes place when a query is sent to an AI model and the model sends back an answer. On the enterprise side, this requires access to vector databases and other data sources used to enrich prompts for better results, a technique known as retrieval-augmented generation (RAG). On the AI model side, whether with a third-party provider or a company’s own on-prem model, these longer prompts increase storage requirements during inference.
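The RAG flow described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor’s API: the `embed` function is a hypothetical stand-in for a real embedding model, and the in-memory dictionary stands in for a real vector database.

```python
def embed(text: str) -> list[float]:
    # Hypothetical embedding: real systems call an embedding model.
    return [float(ord(c) % 7) for c in text[:8]]

def retrieve(query: str, store: dict[str, list[float]], k: int = 2) -> list[str]:
    # Nearest-neighbor lookup against the vector store (dot-product score).
    qv = embed(query)
    def score(doc: str) -> float:
        return sum(a * b for a, b in zip(qv, store[doc]))
    return sorted(store, key=score, reverse=True)[:k]

def build_prompt(query: str, store: dict[str, list[float]]) -> str:
    # Enrich the prompt with retrieved context. This enrichment step is
    # why RAG inflates prompt sizes and, with them, storage traffic.
    context = "\n".join(retrieve(query, store))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = ["HDD lead times exceed a year", "QLC SSDs reach 122 TB"]
store = {d: embed(d) for d in docs}
prompt = build_prompt("Why are SSD prices rising?", store)
```

Every query triggers reads against the vector store, which is why inference traffic lands so heavily on storage rather than only on GPUs.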

And with new reasoning models and agentic AI systems, the number of interactions between data and AI models is only going to increase, putting even greater demands on storage systems.

Another driver? The falling price of each interaction. According to Stanford University’s AI Index report, inference costs are falling by anywhere from nine-fold to 900-fold per year, depending on the task. According to Air Street Capital’s State of AI report, released in October, Google’s flagship models’ intelligence-to-cost ratio is doubling every 3.4 months, and OpenAI’s is doubling every 5.8 months.
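Those doubling times compound quickly. A back-of-envelope calculation, assuming steady exponential improvement, shows the annual gains they imply:

```python
def annual_factor(doubling_months: float) -> float:
    # Annual improvement implied by a doubling period, assuming steady
    # exponential growth: factor = 2 ** (12 / doubling_months).
    return 2 ** (12 / doubling_months)

google = annual_factor(3.4)  # roughly 11.5x per year
openai = annual_factor(5.8)  # roughly 4.2x per year
```

At those rates, a given level of model capability costs an order of magnitude less within a year, which is exactly the dynamic driving usage, and therefore storage demand, upward.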

And the less something costs, the more people use it. For example, Google is now processing more than 1.3 quadrillion tokens per month, up from 10 trillion a year ago. OpenAI doesn’t release numbers on how many tokens it processes, but its revenues hit $4.3 billion in the first half of 2025, up from $3.7 billion for all of 2024, according to news reports.

In fact, there will be severe shortages in high-capacity hard disk drives next year, TrendForce predicts, with lead times surging from weeks to more than a year.

HDDs offer low costs and are typically used for cold storage (data that doesn’t need to be accessed with extremely low latency). SSDs offer better performance for warm and hot storage but come with a higher price tag. But because of HDD shortages, some data centers are shifting some of their cold storage to SSDs, according to TrendForce, and this shift may accelerate as SSD prices come down and HDDs run into constraints.

“HDD bit output is difficult to increase,” says TrendForce analyst Bryan Ao. “AI will generate more data than the growth in HDD output. It is necessary to prepare for this with SSD storage.”

And 256-terabyte QLC SSDs are coming in 2028, he adds. QLC stands for “quad-level cells” and is an update to the triple-level cells, or TLCs, in the previous generation of SSD storage. “Such large-capacity QLC solutions will be more cost-effective to compete with HDDs from a total cost of ownership and performance perspective,” Ao says.
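The density step from TLC to QLC is easy to quantify: four bits per NAND cell instead of three, so roughly a third more capacity from the same cell count, which is where the TCO advantage Ao describes comes from.

```python
# Bits stored per NAND cell by generation.
TLC_BITS = 3
QLC_BITS = 4

# Capacity gain per cell moving from TLC to QLC: ~33% more data
# in the same silicon footprint.
gain = QLC_BITS / TLC_BITS - 1
```

The trade-off, in general, is that packing more bits per cell reduces write endurance, which is why QLC targets read-heavy tiers like AI inference data rather than write-intensive workloads.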

QLC is optimized for network-attached storage and can handle petabyte- and exabyte-scale AI pipelines, according to Roger Corell, senior director for AI and leadership marketing at Solidigm, an SSD manufacturer. “QLC has tremendous savings in terms of space and power,” he says. “And when data center operators are focused on maximizing the power and space envelopes that they have for the AI data center build, they’re looking to get as efficient storage as they can.”

According to the TrendForce report, SSD manufacturers are increasing QLC SSD production, but AI workloads will also expand, “leading to tight supply conditions for enterprise SSDs by 2026.”

Corell says his company is seeing “very, very strong demand.”

But that doesn’t mean that SSDs are going to completely take over, he adds. “I think looking into 2026 and beyond it’s going to take a mix of SSDs and HDDs,” Corell says. “We do believe that there is a place for HDDs, but some of the demands for AI are clearly pointing to QLC as being the optimal storage for AI workloads.”

AI deployment uses multiple storage layers, and each one has different requirements, says Dell’Oro’s Fung. For storing massive amounts of unstructured, raw data, cold storage on HDDs makes more sense, he says. SSDs make sense for warm storage, such as for pre-processing data and for post-training and inference. “There’s a place for each type of storage,” he says.

Planning ahead

According to Constellation’s Mehta, data center managers and other storage buyers should prepare by treating SSD procurement like they do GPUs. “Multi-source, lock in lanes early, and engineer to standards so vendor swaps don’t break your data path.” He recommends qualifying at least two vendors for both QLC and TLC and starting early.

TrendForce’s Ao agrees. “It is better to build inventory now,” he says. “It is difficult to lock in long-term deals with suppliers now due to tight supply in 2026.”

Based on suppliers’ availability, Kioxia, SanDisk, and Micron are in the best position to support 128-terabyte QLC enterprise SSD solutions, Ao says. “But in the longer term, some module houses may be able to provide similar solutions at a lower cost,” Ao adds. “We are seeing more module houses, such as Phison and Pure Storage, supporting these solutions.”

And it’s not just SSD for fast storage and HDD for slow storage. Memory solutions are becoming more complex in the AI era, says Ao. “For enterprise players with smaller-scale business models, it is important to keep an eye on Z-NAND and XL-Flash for AI inference demand,” he says.

These are memory technologies that sit somewhere between SSDs and RAM working memory. “These solutions will be more cost-effective compared to HBM [high bandwidth memory] or even HBF [high bandwidth flash],” he says.

On the positive side, SSDs use standard protocols, says Constellation’s Mehta. “So, interface lock-in is limited,” he says. “The risk is roadmap and supply, not protocol.” He recommends that companies plan ahead for price and lead-time volatility, and for power.

“US data center energy constraints are tightening,” he says. “Storage total-cost-of-ownership conversations now start with watts per terabyte. In 2026, your bottleneck may be lead time or power, not just price. Architect for either constraint, and you’ll make better storage decisions.”
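Framing storage TCO in watts per terabyte, as Mehta suggests, is straightforward arithmetic. The drive figures below are illustrative assumptions for comparison, not measured vendor specs:

```python
def watts_per_tb(active_watts: float, capacity_tb: float) -> float:
    # Power efficiency metric for a storage device: active power
    # draw divided by usable capacity.
    return active_watts / capacity_tb

# Assumed figures for illustration only (not vendor specs):
hdd = watts_per_tb(active_watts=9.5, capacity_tb=30)    # high-capacity HDD
ssd = watts_per_tb(active_watts=25.0, capacity_tb=122)  # dense QLC SSD
```

Under these assumptions, the dense SSD comes out ahead on watts per terabyte despite its higher absolute draw, which is the logic behind dense flash footprints in power-constrained builds.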

Maria Korolov
Contributing writer

Maria Korolov is an award-winning technology journalist with over 20 years of experience covering enterprise technology, mostly for Foundry publications: CIO, CSO, Network World, Computerworld, PCWorld, and others. She is a speaker, a sci-fi author and magazine editor, and the host of a YouTube channel. She ran a business news bureau in Asia for five years and reported for the Chicago Tribune, Reuters, UPI, the Associated Press, and The Hollywood Reporter. In the 1990s, she was a war correspondent in the former Soviet Union and reported from a dozen war zones, including Chechnya and Afghanistan.

Maria won 2025 AZBEE awards for her coverage of Broadcom VMware and Quantum Computing.
