The short version
There are two different conversations hidden inside the phrase “flash memory in servers.” The first conversation is about platform firmware, where NOR dominates. The second is about persistent data storage, where NAND dominates. AI servers need both, but for completely different reasons.
What is NOR flash?
NOR flash is non-volatile memory optimized for reliable random reads and direct addressing. It behaves much more like a read-mostly memory device than like a traditional disk. In many systems, the processor or service controller can map it directly and fetch instructions from it.
Why engineers choose NOR
NOR is attractive when the payload is relatively small, must be trustworthy, and needs predictable access. Firmware images fit this pattern almost perfectly.
What it is bad at
NOR is not economical for high-capacity storage. It has lower density and a much higher cost per bit than NAND, so it does not make sense as the bulk storage medium for AI data.
- Fast random reads
- Read-mostly usage pattern
- Often suitable for execute-in-place firmware paths
- Lower density and higher cost per bit
- Usually stores firmware, recovery slots, and secure boot anchors
Where NOR is used in AI servers
In a real AI server, NOR usually stores the motherboard firmware, the BMC image, and firmware for supporting devices such as NICs, DPUs, retimers, or small controller subsystems. This is why NOR matters even in a machine whose main business is moving tensors: without NOR, the box does not boot, recover, or manage itself cleanly.
What is NAND flash?
NAND flash is non-volatile memory optimized for density and capacity. It is designed around page and erase-block operations rather than arbitrary byte writes. That makes it a far better medium for SSDs than for directly executable firmware.
NAND is the technology that turned flash from a firmware medium into a storage tier.
- High density and lower cost per bit
- Ideal for enterprise SSD capacity scaling
- Requires ECC, wear management, and bad-block handling
- Usually accessed through a controller rather than directly by applications
- Dominant persistent local storage medium in AI servers
Where NAND is used in AI servers
In AI systems, NAND usually lives inside NVMe SSDs. Those drives hold dataset shards, checkpoints, container layers, local model caches, vector indexes, telemetry, and OS images. When an AI team says “the flash tier,” they almost always mean NAND-backed SSDs.
2D vs 3D NAND
2D NAND, also called planar NAND, arranges cells on the surface of the silicon. 3D NAND stacks cells vertically across many layers. The shift from 2D to 3D is one of the biggest reasons modern enterprise SSDs reached the capacities AI infrastructure now expects.
Why 2D NAND ran into trouble
As planar NAND cells shrank, manufacturers ran into higher interference, more leakage, tighter process margins, and worse scaling economics. That made continued 2D shrink harder and less rewarding.
Why 3D NAND changed the economics
3D NAND improved density by adding vertical layers instead of only shrinking lateral dimensions. That pushed capacity up and cost per bit down enough to make very large SSDs practical for data-center and AI deployments.
What about 4D NAND?
4D NAND is not usually treated as a completely separate textbook category in the way 2D and 3D NAND are. In practice, it is best understood as an advanced form of 3D NAND combined with architectural changes around how peripheral circuitry, charge-trap structures, and array organization are laid out. Some vendors use the term to highlight that the design is not only vertically stacked, but also optimized in the surrounding cell architecture and control layout.
The important practical takeaway is that 4D NAND is usually still part of the broader 3D NAND era, not a clean break like planar-to-vertical NAND was. For AI servers, that means it matters mainly through the same outcomes engineers already care about: density, endurance behavior, controller efficiency, cost per bit, and enterprise SSD capacity scaling.
| Property | 2D NAND | 3D NAND |
|---|---|---|
| Cell layout | Planar, mostly horizontal | Vertically stacked layers |
| Scaling path | Shrink lateral geometry | Add layers and optimize stack |
| Density potential | More limited at advanced nodes | Far better for large SSD capacities |
| Role in modern AI servers | Mainly historical context or older generations | Dominant enterprise SSD media |
Where flash sits in AI servers
An AI server has multiple storage roles, and different flash technologies map to them differently. The boot path wants reliability and deterministic reads. The data path wants capacity, economics, and fast local persistence.
Boot and recovery plane
NOR is the anchor here. It stores the firmware that lets the board come alive, initializes management processors, and gives the machine a recovery story when the main OS is unavailable.
Main storage plane
NAND, typically inside NVMe SSDs, stores the large persistent state around AI jobs: datasets, local replicas, checkpoints, container images, logs, and warm caches used for rapid restarts or model rollouts.
Management and service plane
Service controllers such as the BMC often have their own NOR-backed firmware. That separation matters because out-of-band management must survive host crashes and remain independently bootable.
How AI workloads use flash
Training
During training, NAND-backed SSDs commonly hold dataset shards, preprocessing outputs, checkpoints, experiment logs, and local scratch space. The hot tensors themselves live in GPU memory or host DRAM, but NAND keeps the job fed and persistent.
Inference
During inference, NAND-backed storage commonly holds local model caches, deployment artifacts, vector indexes, embeddings, and sometimes spill or staging data. It helps fleets warm quickly and keeps model rollout latency under control.
Operations
Both flash types matter for operations. NOR preserves bootability and rollback paths. NAND preserves the evidence and artifacts that let operators restore services, reprovision nodes, and inspect failures.
Raw flash vs SSD-managed flash
This distinction is the one most readers miss. Linux can encounter flash in two completely different forms.
Raw flash
Raw flash means Linux is dealing directly with the NOR or NAND device. In that world the OS must respect erase-before-write behavior, bad blocks, ECC constraints, and geometry details. This is common for motherboard firmware devices, embedded controllers, or BMC storage.
SSD-managed flash
In an NVMe SSD, the controller firmware hides the physical NAND complexity. Linux sees a logical block device or namespace. The SSD controller internally manages the flash translation layer, ECC, bad-block remapping, garbage collection, and wear leveling.
In an AI server, this usually means an NVMe device whose controller happens to be backed by NAND, usually modern 3D NAND.
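To make that abstraction concrete, here is a minimal userspace sketch of what the host can actually see on a controller-managed drive; the device path /dev/nvme0n1 is an assumption for illustration. The only geometry the kernel reports is logical: total capacity and logical block size, never NAND pages, erase blocks, or wear state.

```c
/* Minimal sketch: how the host sees a controller-managed SSD.
 * The device path /dev/nvme0n1 is an assumption for illustration. */
#include <fcntl.h>
#include <linux/fs.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/dev/nvme0n1", O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	uint64_t bytes = 0;
	int logical_block = 0;

	/* Only logical geometry is visible: capacity and logical block size. */
	if (ioctl(fd, BLKGETSIZE64, &bytes) < 0 ||
	    ioctl(fd, BLKSSZGET, &logical_block) < 0) {
		perror("ioctl");
		close(fd);
		return 1;
	}

	printf("capacity: %llu bytes, logical block: %d bytes\n",
	       (unsigned long long)bytes, logical_block);

	/* Nothing here exposes NAND pages, erase blocks, or wear state;
	 * the SSD controller's flash translation layer hides all of that. */
	close(fd);
	return 0;
}
```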
Linux kernel and driver support
The software stack changes depending on whether the flash is raw or controller-managed.
Where these drivers live in the kernel structure
At a high level, Linux splits this problem by abstraction boundary. Raw flash sits in the memory-technology world. SSD-backed flash sits in the block-storage world. That split is why MTD and NVMe feel so different even when the underlying persistent medium is ultimately “flash.”
- `drivers/mtd/` holds the flash-oriented subsystem for raw memory technology devices.
- `drivers/mtd/spi-nor/` is the common home for SPI NOR support.
- `drivers/mtd/nand/raw/` contains raw NAND support.
- `drivers/mtd/nand/spi/` contains SPI NAND support in kernels that split it that way.
- `drivers/spi/` contains SPI controller drivers that sit underneath SPI NOR and some SPI NAND paths.
- `drivers/nvme/host/` contains the NVMe host-side driver stack used for SSDs on PCIe.
- `block/` contains the generic block layer that filesystems and storage drivers use to exchange I/O requests.
- `fs/` contains the filesystems that sit on top of block storage or, in some raw-flash cases, flash-aware layers such as `UBIFS`.
NOR flash on Linux
Raw NOR is usually supported through the MTD subsystem. A common path is MTD plus spi-nor on top of an SPI controller driver. Linux probes the chip, exposes MTD partitions, and supports read, erase, and program operations along with vendor-specific quirks and protections.
The usual pieces are the MTD path plus a bus-specific flash driver such as `spi-nor`:

- `MTD` core for memory technology devices
- `spi-nor` for SPI NOR chips
- `spi-mem` and SPI controller drivers underneath
- Partitioning and firmware update workflows in userspace
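From userspace, a raw NOR device shows up as an MTD character device that reports flash geometry rather than a plain block device. Here is a minimal sketch using the standard MTD ioctl interface; the node /dev/mtd0 is an assumption for illustration.

```c
/* Minimal sketch: how userspace sees a raw MTD (e.g., SPI NOR) device.
 * The node /dev/mtd0 is an assumption for illustration. */
#include <fcntl.h>
#include <mtd/mtd-user.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/dev/mtd0", O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	struct mtd_info_user info;

	/* Unlike a block device, MTD exposes flash geometry directly:
	 * total size, erase-block size, and write (page) size. */
	if (ioctl(fd, MEMGETINFO, &info) < 0) {
		perror("MEMGETINFO");
		close(fd);
		return 1;
	}

	printf("size: %u bytes, erasesize: %u, writesize: %u, type: %s\n",
	       info.size, info.erasesize, info.writesize,
	       info.type == MTD_NORFLASH ? "NOR" : "other");

	close(fd);
	return 0;
}
```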
Raw NAND on Linux
Raw NAND is more complicated. Linux still often uses the MTD stack, but adds raw NAND support, SPI NAND support in some designs, ECC machinery, and layers such as UBI and UBIFS for wear-aware volume management.
- `MTD` core
- `rawnand` core
- `spinand` for SPI NAND devices
- NAND controller drivers
- `UBI` and `UBIFS` in raw NAND deployments
This path is much more common in embedded systems and service controllers than in the host-side storage path of an AI server.
NAND inside SSDs on Linux
When NAND is inside an SSD, Linux usually talks to NVMe, not to NAND directly. The kernel stack is the PCIe subsystem, the nvme host driver, the block layer, and then the filesystem. The SSD firmware is what absorbs the flash complexity below that.
In practice this means NVMe to a controller-managed SSD backed by NAND media.

| Flash case | Typical Linux subsystem | Common drivers or layers | Common AI-server role |
|---|---|---|---|
| Raw NOR | MTD | spi-nor, SPI controller drivers | UEFI, BIOS, BMC, firmware images |
| Raw NAND | MTD | rawnand, spinand, UBI, UBIFS | Embedded controllers, service devices |
| NAND in SSDs | block + NVMe | nvme, PCIe, filesystem | Datasets, checkpoints, logs, model cache |
Kernel init and probe flow
The simplest useful way to understand Linux flash support is to follow the probe path. The exact source files differ by platform, but the high-level bring-up sequence is consistent: the kernel discovers a bus or controller, binds the matching driver, identifies the flash device, allocates subsystem structures, and finally exposes an interface that upper layers can use.
Basic NOR initialization flow
- The platform firmware or device tree describes an SPI controller and a child flash device.
- The SPI controller driver registers itself with the kernel and initializes the bus hardware.
- The SPI core matches the flash child node to the `spi-nor` path.
- `spi-nor` reads JEDEC identification data and determines chip geometry and capabilities.
- The MTD layer allocates an `mtd_info`-style device representation and wires in read, erase, and program callbacks.
- Partitions may then be registered, after which userspace sees MTD devices it can read or update.
In plain English, the SPI controller comes up first, the flash driver learns what chip is attached, and the MTD subsystem becomes the public interface that the rest of Linux uses.
Basic raw NAND initialization flow
- A NAND controller driver initializes its hardware registers, DMA setup, and timing hooks.
- The raw NAND core or SPI NAND layer identifies the device and learns page size, erase-block size, and bad-block rules.
- The driver configures ECC requirements, either through a dedicated ECC engine or controller-integrated logic.
- The MTD core registers the NAND device as a flash-aware object rather than as a normal disk.
- Optional upper layers such as `UBI` and `UBIFS` are attached if the platform uses them.
This is why raw NAND drivers feel more complex than NOR drivers: they are not just discovering capacity, they are also establishing the correctness rules needed to survive bad blocks and bit errors.
Basic NVMe SSD initialization flow
- PCIe enumeration discovers the SSD as a PCIe function.
- The kernel matches the device to the `nvme` host driver.
- The NVMe driver maps controller registers, negotiates queue resources, and sets up admin queues first.
- The driver identifies the controller, then enumerates namespaces that represent logical storage regions.
- The block layer registers those namespaces as block devices.
- Filesystems such as `XFS` or `ext4` mount on top, and applications finally see ordinary storage.
The important difference from raw flash is that Linux never needs to understand the internal NAND geometry. The SSD firmware has already converted the flash medium into a logical namespace abstraction.
Why init routines matter in production
These initialization steps are not just academic. If the wrong SPI flash description is shipped, the NOR path may boot unreliably. If ECC settings are mismatched, a raw NAND design may silently degrade. If PCIe or NVMe initialization is unstable, SSD namespaces may vanish or reset under load. In production AI servers, a large share of “storage weirdness” actually begins in probe, identification, or controller bring-up rather than in steady-state reads and writes.
Kernel code samples
Here are small illustrative code samples that show the shape of what these drivers and subsystems do. These are intentionally simplified, but they mirror the responsibilities you see in Linux: probe the device, identify capabilities, register with the right subsystem, and expose a stable interface to the rest of the system.
SPI NOR probe for platform firmware flash
static int ai_spi_nor_probe(struct spi_device *spi)
{
struct spi_nor *nor;
int ret;
nor = devm_kzalloc(&spi->dev, sizeof(*nor), GFP_KERNEL);
if (!nor)
return -ENOMEM;
nor->dev = &spi->dev;
spi_nor_set_flash_node(nor, spi->dev.of_node);
ret = spi_nor_scan(nor, NULL, NULL);
if (ret)
return ret;
ret = mtd_device_register(&nor->mtd, NULL, 0);
if (ret)
return ret;
dev_info(&spi->dev, "NOR flash ready for firmware slots\n");
return 0;
}
Raw NAND bring-up with ECC setup
static int ai_nand_probe(struct platform_device *pdev)
{
struct nand_chip *chip;
struct mtd_info *mtd;
int ret;
chip = devm_kzalloc(&pdev->dev, sizeof(*chip), GFP_KERNEL);
if (!chip)
return -ENOMEM;
chip->ecc.engine_type = NAND_ECC_ENGINE_TYPE_ON_HOST;
chip->ecc.size = 1024;
chip->ecc.strength = 8;
ret = nand_scan(chip, 1);
if (ret)
return ret;
mtd = nand_to_mtd(chip);
return mtd_device_register(mtd, NULL, 0);
}
NVMe controller initialization for NAND-backed SSDs
static int ai_nvme_probe(struct pci_dev *pdev,
const struct pci_device_id *id)
{
struct nvme_ctrl *ctrl;
int ret;
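	/*
	 * Simplified flow: these helper names echo the in-kernel NVMe host
	 * driver under drivers/nvme/host/, but the real functions have
	 * different signatures and considerably more setup work.
	 */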
ret = pcim_enable_device(pdev);
if (ret)
return ret;
ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
if (ret)
return ret;
ctrl = nvme_init_ctrl(&pdev->dev);
if (IS_ERR(ctrl))
return PTR_ERR(ctrl);
ret = nvme_alloc_admin_tags(ctrl);
if (ret)
return ret;
ret = nvme_start_ctrl(ctrl);
if (ret)
return ret;
return nvme_scan_ns_list(ctrl);
}
Block queue registration for storage traffic
static int ai_nvme_alloc_io_queues(struct nvme_ctrl *ctrl)
{
int qid;
for (qid = 1; qid <= ctrl->queue_count; qid++) {
struct blk_mq_tag_set *set = &ctrl->tagset[qid - 1];
set->ops = &ai_nvme_mq_ops;
set->nr_hw_queues = 1;
set->queue_depth = ctrl->q_depth;
set->numa_node = dev_to_node(ctrl->dev);
if (blk_mq_alloc_tag_set(set))
return -ENOMEM;
}
return 0;
}
Simple MTD firmware-slot write path
static int ai_firmware_slot_write(struct mtd_info *mtd,
loff_t off,
const u8 *image,
size_t len)
{
size_t written = 0;
int ret;
	if (off + len > mtd->size)
		return -EINVAL;
ret = mtd_erase(mtd, &(struct erase_info) {
.addr = 0,
.len = mtd->size,
});
if (ret)
return ret;
ret = mtd_write(mtd, off, len, &written, image);
if (ret || written != len)
return -EIO;
return 0;
}
The hidden vendor stack
The most under-discussed part of flash in AI servers is that “flash” is not one vendor decision. It is a stack of vendor decisions layered on top of one another. The NAND manufacturer, the SSD controller vendor, the server OEM, the BMC silicon vendor, the BIOS supplier, and the accelerator platform integrator all influence the behavior that operators eventually experience as “storage reliability” or “boot stability.”
1. The firmware-island problem
A modern AI server is full of tiny flash-backed control domains. The motherboard boot ROM may be on one NOR device. The BMC may boot from another. NICs, DPUs, retimers, storage controllers, and even SSDs themselves can each carry independent firmware images and update lifecycles. That means fleet reliability can fail at the seams between these firmware islands rather than at the main host OS.
2. The SSD controller is the real storage personality
People often compare NAND types as if the media alone defines behavior. In practice, enterprise SSD behavior is dominated by controller firmware policy: the flash translation layer, garbage collection strategy, overprovisioning choices, thermal policy, queue handling, recovery behavior, and power-fail logic. In other words, the SSD controller firmware is often the hidden operating system underneath AI storage.
That is why two drives with nominally similar 3D NAND can behave very differently under checkpoint bursts, mixed read-write pressure, or long-lived cache workloads. The media matters, but the controller policy often matters more.
3. The vendor map that actually matters
The most useful way to think about vendors is by layer, not by a flat list of brand names.
- NAND manufacturers: vendors such as Samsung, Kioxia, Micron, SK hynix, Solidigm, and Western Digital / SanDisk shape density, endurance classes, and enterprise SSD supply.
- SSD controller vendors: vendors such as Phison, Marvell, and Silicon Motion, along with in-house controller teams at major SSD vendors, shape the actual runtime behavior the host sees.
- Platform firmware vendors: suppliers such as AMI, Phoenix, and Insyde influence BIOS and UEFI behavior, while ASPEED is a major force in BMC silicon and out-of-band management.
- System OEMs and ODMs: vendors such as Dell, HPE, Lenovo, Supermicro, Gigabyte, Quanta, and Wiwynn decide NVMe topology, cooling, serviceability, and firmware integration quality.
- AI platform integrators: ecosystems built around NVIDIA, AMD, Intel, and hyperscaler reference designs influence how much local flash exists per node and which workloads that flash tier must absorb.
4. AI creates a strange endurance profile
AI workloads do not look like traditional enterprise storage mixes. Checkpoints create huge sequential write bursts. Model rollout and local cache warming can create heavy read waves. Telemetry and logging create smaller steady writes. Some systems add intermediate spill or artifact churn on top. The result is a wear profile that can be surprisingly asymmetric and much harsher than generic enterprise assumptions.
This is why write amplification, spare area, drive write per day ratings, and controller garbage-collection policy deserve much more attention in AI storage design than they usually get in high-level infrastructure discussions.
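A quick back-of-the-envelope calculation shows why. The sketch below uses entirely hypothetical numbers (checkpoint size, cadence, drive capacity, and write amplification factor) to estimate the drive writes per day a checkpoint-heavy node implies; plug in your own fleet values.

```c
/* Back-of-the-envelope sketch of checkpoint-driven drive wear.
 * All numbers are hypothetical; substitute your own fleet values. */
#include <stdio.h>

int main(void)
{
	double checkpoint_gb = 500.0;   /* size of one checkpoint written locally */
	double interval_min = 30.0;     /* checkpoint cadence */
	double capacity_tb = 7.68;      /* usable drive capacity */
	double waf = 1.5;               /* assumed write amplification factor */

	/* Host-visible writes per day, in TB. */
	double host_writes_tb_per_day =
		checkpoint_gb * (24.0 * 60.0 / interval_min) / 1000.0;

	/* Physical NAND writes after amplification, and implied DWPD. */
	double nand_writes_tb_per_day = host_writes_tb_per_day * waf;
	double implied_dwpd = nand_writes_tb_per_day / capacity_tb;

	printf("host writes: %.1f TB/day\n", host_writes_tb_per_day);
	printf("NAND writes after WAF: %.1f TB/day\n", nand_writes_tb_per_day);
	printf("implied DWPD: %.2f\n", implied_dwpd);
	return 0;
}
```

With these example numbers the implied rate lands well above the 1 to 3 drive-writes-per-day ratings common for read-intensive or mixed-use enterprise SSDs, which is exactly the kind of mismatch that goes unnoticed until drives wear out earlier than planned.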
5. Thermals and throttling are often misdiagnosed as software issues
AI operators often think in terms of GPU thermals and rack power, but local flash thermals matter too. Enterprise SSDs can throttle under sustained heavy writes, sustained random I/O, or dense front-bay thermal conditions. When they do, model load times stretch, checkpoint latency grows, and the symptoms often look like a software regression rather than a storage thermal event.
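One way to catch this early is to watch the drive's own composite temperature rather than only host-level metrics. The sketch below reads the NVMe SMART / Health log page through the admin passthrough ioctl; the path /dev/nvme0 is an assumption, and in practice a tool such as nvme-cli or a monitoring agent would normally do this for you.

```c
/* Sketch: reading the NVMe composite temperature from the host.
 * The path /dev/nvme0 is an assumption for illustration. */
#include <fcntl.h>
#include <linux/nvme_ioctl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/dev/nvme0", O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	uint8_t log[512] = { 0 };
	struct nvme_admin_cmd cmd = { 0 };

	cmd.opcode = 0x02;                       /* Get Log Page */
	cmd.nsid = 0xffffffff;                   /* controller-wide scope */
	cmd.addr = (uint64_t)(uintptr_t)log;
	cmd.data_len = sizeof(log);
	/* cdw10: log ID 0x02 (SMART / Health) plus (number of dwords - 1) */
	cmd.cdw10 = 0x02 | ((sizeof(log) / 4 - 1) << 16);

	if (ioctl(fd, NVME_IOCTL_ADMIN_CMD, &cmd) < 0) {
		perror("NVME_IOCTL_ADMIN_CMD");
		close(fd);
		return 1;
	}

	/* Composite temperature is bytes 1-2 of the SMART log, in Kelvin. */
	uint16_t kelvin;
	memcpy(&kelvin, &log[1], sizeof(kelvin));
	printf("composite temperature: %d C\n", kelvin - 273);

	close(fd);
	return 0;
}
```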
6. Power-loss protection is not a boring enterprise checkbox
Power-loss protection on enterprise SSDs matters a lot more in AI than people usually admit. When a node is checkpointing or writing deployment artifacts, a sudden power event without robust protection can corrupt in-flight metadata or partially committed writes. In large fleets, those edge cases become inevitabilities rather than hypotheticals.
This is one of the reasons consumer SSD assumptions map badly onto AI infrastructure. The cost of a corrupted checkpoint, a damaged local cache, or a recovery-path failure is far larger than the sticker price difference between a consumer-style drive and a real enterprise one.
7. NOR is tiny, but existential
NOR rarely gets attention because it is small in capacity and invisible during normal operation. But from a fleet-operations perspective, NOR is existential. If that tiny device holds the wrong firmware, the node may fail to boot, fail to recover, or fail to rejoin the cluster after maintenance. NAND shortages hurt capacity planning; NOR mistakes hurt bootability.
The smallest flash device in the server may be the one with the highest operational leverage.
Practical takeaways
If you are a platform engineer, you care about NOR because it defines boot safety, rollback, recovery, and secure-update behavior. If you are a storage or kernel engineer, you care about whether the flash is raw or controller-managed, because that determines whether you work in the MTD world or the NVMe world. If you are an AI systems engineer, you mostly care about NAND-backed SSD behavior because that is the tier that feeds checkpoints, local caches, and deployment artifacts.
- NOR flash is the read-mostly firmware medium for boot and service processors.
- NAND flash is the density-first storage medium that usually appears as SSD capacity.
- 2D NAND is the older planar form; 3D NAND is the stacked form that made modern high-capacity enterprise SSDs viable.
- Linux usually supports raw flash through `MTD`, but supports SSD-backed NAND through `NVMe` and the block layer.