1. Storage Basics: Persistence, Latency, Bandwidth, and IOPS
A good storage mental model starts with a simple distinction: memory is where active state lives right now; storage is where state survives. DRAM and HBM are fast but volatile. Power them off and their contents disappear. HDDs and SSDs are persistent. They survive reboots, failures, and restarts.
AI workloads stress all four of these media at once. Training wants huge sequential throughput. Retrieval wants lots of small random reads. Checkpointing wants sustained write bandwidth. Long-context systems force hot and cold state to move across tiers. That is why storage design in AI is not just “buy faster SSDs.” It is workload matching.
2. NAND Flash from First Principles
SSDs are built on NAND flash. At a very high level, NAND stores charge in cells. The more bits you try to pack into each cell, the cheaper and denser the storage becomes, but the harder it gets to read and write quickly and reliably.
| Type | Bits per cell | Best property | Main compromise |
|---|---|---|---|
| SLC | 1 | Fastest, most durable | Too expensive for broad capacity use |
| MLC | 2 | Balanced performance/endurance | Lower density than TLC/QLC |
| TLC | 3 | Good mainstream tradeoff | More fragile and slower than lower-bit cells |
| QLC | 4 | Maximum density / low cost per bit | Lower endurance and weaker write behavior |
This is why QLC exists: AI infrastructure wants absurd capacity, and capacity has to be paid for. If the hot path does not require the endurance or write behavior of TLC, then QLC becomes attractive because it pushes cost per terabyte down.
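The bits-per-cell tradeoff can be made concrete: every additional bit doubles the number of charge levels a cell must reliably distinguish, which is why reads get slower and endurance drops as density rises. A minimal sketch (the helper name is purely illustrative):

```python
# Each extra bit per cell doubles the number of distinguishable charge
# levels, shrinking the margin between levels and making reads/writes
# slower and less durable.

def voltage_states(bits_per_cell: int) -> int:
    """A cell storing n bits must distinguish 2**n charge levels."""
    return 2 ** bits_per_cell

for name, bits in [("SLC", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4)]:
    print(f"{name}: {voltage_states(bits)} distinguishable charge levels")
```

A QLC cell must separate 16 levels in the same physical cell where SLC separates only 2, which is the physical root of the endurance and write-behavior compromises in the table above.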
Program/erase cycles and endurance
NAND cells wear out. They can only tolerate a limited number of program/erase cycles before reliability degrades. That is why SSD endurance is a real design variable, not a footnote. AI checkpointing, heavy logging, and write-heavy preprocessing can stress endurance much more than read-mostly serving.
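The endurance arithmetic can be sketched with the two ratings vendors usually publish: TBW (terabytes written) and DWPD (drive writes per day over the warranty period). The capacity, rating, and checkpoint cadence below are hypothetical round numbers, not any specific drive's spec:

```python
# Hedged endurance sketch: convert a DWPD rating into a TBW budget, then
# see how long a write-heavy AI job takes to spend it. All figures are
# illustrative, not vendor specifications.

def tbw_from_dwpd(capacity_tb: float, dwpd: float, warranty_years: float) -> float:
    """Total terabytes written implied by a DWPD rating."""
    return capacity_tb * dwpd * warranty_years * 365

def days_until_worn(tbw: float, daily_writes_tb: float) -> float:
    """Days until the endurance budget is exhausted at a steady write rate."""
    return tbw / daily_writes_tb

tbw = tbw_from_dwpd(capacity_tb=7.68, dwpd=1.0, warranty_years=5)  # ~14,016 TBW
# A job checkpointing 2 TB every hour writes 48 TB/day:
print(f"{days_until_worn(tbw, 48):.0f} days of endurance budget")
```

The point of the exercise: a drive rated for five years of "normal" use can be consumed in under a year by an aggressive checkpoint cadence, which is why endurance is a design variable.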
Write amplification
NAND cannot overwrite arbitrarily the way RAM can. Data is written in pages, but erasure happens in larger blocks. That mismatch means the SSD controller often has to move and rewrite more data internally than the host actually requested. That is write amplification. In practical terms, it hurts both performance and endurance.
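Write amplification reduces to a simple ratio: total NAND writes divided by host writes. The host and garbage-collection volumes below are invented for illustration:

```python
# Minimal write-amplification model: the controller writes more to NAND
# than the host requested because garbage collection must relocate
# still-valid pages before a block can be erased. Numbers are illustrative.

def write_amplification(host_writes_gb: float, gc_relocated_gb: float) -> float:
    """WAF = total NAND writes / host writes."""
    return (host_writes_gb + gc_relocated_gb) / host_writes_gb

# Host writes 100 GB; garbage collection relocates another 150 GB internally:
waf = write_amplification(100, 150)     # WAF = 2.5
effective_endurance_fraction = 1 / waf  # the usable endurance budget shrinks by 1/WAF
print(waf, effective_endurance_fraction)
```

At a write amplification factor of 2.5, the drive's physical endurance budget effectively buys only 40% as many host writes, which is how write amplification hurts endurance and performance at the same time.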
3. Inside an SSD: Controller, Channels, FTL, and DRAM
An SSD is not “just fast flash.” It is a storage computer. The controller schedules reads and writes across many NAND packages and channels in parallel, and its firmware handles wear leveling, garbage collection, error correction, and the flash translation layer (FTL) that maps logical addresses to physical pages, often backed by an onboard DRAM cache for the mapping tables.
The right mental model is: SSD performance = flash media + controller architecture + parallelism + firmware policy. Two SSDs with similar NAND may behave very differently because of controller and firmware choices.
This matters in AI because workloads are mixed. A training-stage staging SSD may see huge sequential reads. A vector system may see random reads. A checkpoint SSD may see bursts of large writes. The same device class can look great in one workload and mediocre in another.
4. NVMe and PCIe: How Data Really Moves
NVMe matters because the older SATA/AHCI stack was built for a different era: AHCI exposes a single command queue, 32 entries deep. NVMe was designed for solid-state storage and deep parallelism. It uses PCIe directly and supports up to 64K queues with up to 64K commands each. In plain language: NVMe is the protocol/storage stack that lets SSDs behave more like parallel devices and less like single-file storage appliances.
PCIe lane count matters too. A PCIe x4 SSD is not just “a drive”; it is a device attached to a fabric with defined bandwidth and contention properties. In AI systems, that matters because multiple SSDs, NICs, and accelerators often compete on the same host root complexes and switches.
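Lane math makes the bandwidth point concrete. The sketch below uses the published PCIe link rates and the 128b/130b encoding used from Gen3 onward, ignoring packet and protocol overhead, so real-world numbers land somewhat lower:

```python
# Back-of-envelope PCIe lane bandwidth: raw link rate (GT/s) times the
# 128b/130b encoding efficiency, converted from gigabits to gigabytes.
# Protocol overhead is ignored, so these are upper bounds.

def lane_gbps(gigatransfers_per_s: float) -> float:
    """Usable GB/s per lane for Gen3+ links using 128b/130b encoding."""
    return gigatransfers_per_s * (128 / 130) / 8

for gen, gts in [("Gen3", 8), ("Gen4", 16), ("Gen5", 32)]:
    per_lane = lane_gbps(gts)
    print(f"{gen}: {per_lane:.2f} GB/s per lane, x4 ~ {4 * per_lane:.1f} GB/s")
```

A Gen4 x4 SSD tops out around 7.9 GB/s at the link layer, which is why several such drives plus a 400 Gb NIC can saturate a host's root complex before any single device hits its own limit.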
5. Why HDD Still Matters in AI
HDD is not dead in AI. It simply occupies a different place in the hierarchy. Large AI datasets, logs, archived checkpoints, compliance retention, replay data, and cold object stores do not disappear because GPUs got faster. In fact, the more AI generates and consumes data, the more attractive cheap bulk storage becomes.
Seagate’s public AI-storage positioning leans directly into this point, arguing that high-capacity hard-drive storage paired with SSD caching remains useful for AI pipelines and that NVMe connectivity can make HDD easier to integrate into modern storage fabrics. WD’s own AI Data Cycle also explicitly keeps HDD in the picture as part of the cost-optimized storage mix. That is the real answer: HDD survives because the economics of cheap capacity still matter at AI scale.
6. HBM vs DRAM vs SSD vs HDD: The Hierarchy That Actually Exists
AI pain often comes from pretending these layers are interchangeable when they are not.
| Tier | Persistence | Typical latency class | Typical bandwidth class | Primary AI role |
|---|---|---|---|---|
| HBM | No | sub-microsecond / memory-class | TB/s class | accelerator-local active working set |
| DRAM | No | ~100ns order | tens to hundreds of GB/s | host memory, staging, buffers, metadata |
| NVMe SSD | Yes | tens to hundreds of microseconds | GB/s class | hot persistent data, checkpoints, vectors, spill |
| HDD | Yes | milliseconds | hundreds of MB/s | bulk lake, archive, cold retention |
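A rough back-of-envelope over the table above: the time to move a 1 TB checkpoint at a representative sustained bandwidth for each tier. The bandwidth figures are order-of-magnitude placeholders consistent with the table's bandwidth classes, not measured device numbers:

```python
# Order-of-magnitude illustration of the hierarchy: how long a 1 TB
# transfer takes per tier. Bandwidths are illustrative placeholders.

TIER_BW_GBPS = {         # representative sustained bandwidth, GB/s
    "HBM": 3000.0,       # TB/s class
    "DRAM": 200.0,       # hundreds of GB/s
    "NVMe SSD": 6.0,     # GB/s class
    "HDD": 0.25,         # hundreds of MB/s
}

def transfer_seconds(size_gb: float, bandwidth_gbps: float) -> float:
    return size_gb / bandwidth_gbps

for tier, bw in TIER_BW_GBPS.items():
    print(f"{tier:8s}: {transfer_seconds(1000, bw):10.1f} s for 1 TB")
```

The spread covers roughly four orders of magnitude, which is why pretending the tiers are interchangeable hurts.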
7. How AI Workloads Stress Storage
This is why a storage primer for AI cannot just talk about device peak speed. Workload shape matters. Sequential training data streaming is a different problem from low-latency vector retrieval. Checkpointing is a different problem from log retention. The correct storage design almost always begins with the I/O profile.
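The I/O-profile point can be quantified with the identity throughput = IOPS × request size: the same device looks bandwidth-bound or IOPS-bound depending on request shape. The IOPS figures below are hypothetical:

```python
# Sketch of why I/O profile matters: identical hardware delivers very
# different effective throughput at different request sizes. IOPS
# figures are hypothetical, for illustration only.

def throughput_gbps(iops: float, request_kb: float) -> float:
    """Effective throughput implied by an IOPS rate at a given request size."""
    return iops * request_kb / 1e6  # KB/s -> GB/s

# 4 KB random reads (vector-retrieval-like) at 500k IOPS:
print(throughput_gbps(500_000, 4))   # ~2.0 GB/s
# 1 MB sequential reads (training-streaming-like) at 6k IOPS:
print(throughput_gbps(6_000, 1024))  # ~6.1 GB/s
```

A drive that streams training data at 6 GB/s may deliver a third of that under small random reads, so benchmarking with the wrong request shape produces the wrong purchase.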
8. Seagate, WD, and Sandisk: Three Different Bets
WD: the AI Data Cycle framing
WD’s strongest public contribution is conceptual. Its AI Data Cycle framework argues that AI needs a storage mix aligned to workflow stage, not a single hero medium. WD has also tied specific products like its Gen5 SN861 SSD to AI-heavy deployments, including certification messaging around NVIDIA GB200 NVL72 systems.
Seagate: cheap capacity is strategic, not old-fashioned
Seagate’s public AI storage writing stresses that AI’s data footprint keeps expanding and that large-capacity HDDs paired with faster tiers still make architectural sense. The interesting part of Seagate’s argument is not nostalgia for disks. It is the claim that AI needs economically scalable capacity, and that means storage hierarchy, not flash monoculture.
Sandisk: push flash upward
Sandisk’s public positioning is the most aggressive on flash evolution: very large enterprise SSDs, AI-oriented QLC roadmaps, and High Bandwidth Flash (HBF) as a response to the AI memory wall. That is a useful signal. Sandisk is effectively arguing that flash is being asked to move closer to the memory plane.
9. GPUDirect Storage: When the CPU Becomes Too Expensive in the Path
NVIDIA describes GPUDirect Storage as a direct data path between local or remote storage and GPU memory that avoids extra copies through CPU memory. In practical terms, that means DMA engines near storage or network adapters can move data more directly into GPU memory. The reason this matters is simple: once datasets get large enough, bouncing everything through the CPU becomes both a latency cost and a bandwidth tax.
GPUDirect Storage does not magically erase bad storage design. It simply removes one kind of overhead and exposes the next bottleneck more clearly.
10. Why High Bandwidth Flash Exists
High Bandwidth Flash is revealing because it would not need to exist unless the current hierarchy were genuinely painful. Sandisk describes HBF as a new form of NAND aimed at the AI memory wall, and publicly positions it as delivering far more capacity than HBM while chasing enough bandwidth to matter for AI inference-scale workloads.
Conclusion
AI storage is not an afterthought. It is the persistent half of the memory hierarchy. HDD remains the economic base layer. SSD is the active persistent tier. GPUDirect Storage shortens the path to accelerators. HBF is an attempt to close the widening gap between storage and memory. The bigger the models get, the more storage architecture starts to feel like core AI infrastructure rather than backend plumbing.
Selected references
- Western Digital AI Data Cycle framework
- Seagate on NVMe HDDs and AI storage
- Sandisk on HBF and the AI memory wall
- NVIDIA GPUDirect Storage
- NVIDIA GPUDirect Storage overview guide
This primer combines stable systems explanations with current public vendor positioning where relevant.