Inside the multilayer ceramic capacitor: a sand-grain-sized passive component made of ancient minerals, manufactured in the billions, and silently responsible for keeping the world's most powerful chips alive.
A single NVIDIA H100 GPU costs $30,000. The rack that houses it costs more still. The data center cooling system, the fiber network, the custom silicon — billions of dollars of engineering stacked together. But holding it all together, hidden beneath every processor, is a component that costs less than a cup of coffee.
The Multilayer Ceramic Capacitor, or MLCC, is an unglamorous rectangle of ceramic and metal typically the size of a grain of sand. It is boring by design. It does one thing: it smooths out electrical noise on power lines. But that one thing — voltage stabilization — is existential for modern computing.
Remove the MLCCs from an AI server and the chips inside would crash almost immediately. Modern processors cannot tolerate supply-voltage fluctuations that last only nanoseconds, far too brief for any human observer to notice. MLCCs absorb and release charge so quickly and so precisely that each chip's power supply appears nearly ideal (a constant, calm, unwavering voltage) in a physical environment that is anything but.
"Every AI server contains tens of thousands of MLCCs. A single H100 GPU board alone can carry over 10,000 of them. They are not optional. They are as fundamental as the transistors they protect."
A capacitor is a device that stores electrical charge between two conductive plates separated by an insulating material (the dielectric). The more layers, the more surface area, the more capacitance. The MLCC simply takes this principle and stacks hundreds of ceramic-and-metal layers into a monolithic block the size of a sesame seed.
The insulating material between the plates is the heart of the MLCC. In the high-capacitance parts used for decoupling, it is almost always a barium titanate (BaTiO₃)-based ceramic. The reason is remarkable: barium titanate is a ferroelectric material, meaning it has a spontaneous electric polarization that can be reversed. This property gives it a dielectric constant in the thousands, compared with roughly 1 for air, allowing enormous capacitance in a tiny volume.
The ceramic is deposited as slurry, dried into thin sheets (sometimes just 1–3 micrometers thick), and then stacked with printed metal electrode layers in between. The entire stack is compressed and fired in kilns at around 1,200°C — a process called sintering — which fuses everything into a single monolithic block.
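The stacking principle can be put into numbers with the ideal parallel-plate formula, C = n · ε₀ · εr · A / d. The sketch below is a rough estimate only: the relative permittivity, electrode overlap area, and layer count are illustrative assumptions rather than figures for any real part, and the function name `mlcc_capacitance` is hypothetical. Only the 2 µm layer thickness is taken from the range quoted above.

```python
# Vacuum permittivity in farads per metre (physical constant).
EPSILON_0 = 8.854e-12


def mlcc_capacitance(rel_permittivity, overlap_area_m2, layer_thickness_m, n_layers):
    """Ideal capacitance of n identical dielectric layers connected in parallel."""
    single_layer = EPSILON_0 * rel_permittivity * overlap_area_m2 / layer_thickness_m
    return n_layers * single_layer


# Assumed, illustrative values (not taken from a datasheet):
#   relative permittivity 3000, electrode overlap 0.8 mm x 0.4 mm,
#   2 micrometre dielectric layers, 200 layers.
example_c = mlcc_capacitance(
    rel_permittivity=3000,
    overlap_area_m2=0.8e-3 * 0.4e-3,
    layer_thickness_m=2e-6,
    n_layers=200,
)
print(f"Estimated capacitance: {example_c * 1e6:.2f} µF")
```

Under these assumptions the estimate lands a little under 1 µF. Halving the layer thickness or doubling the layer count each doubles the result, which is why manufacturers push toward thinner sheets and taller stacks.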
| EIA Code | Size (mm) | Typical Use | Notes |
|---|---|---|---|
| 0201 | 0.6 × 0.3 | Mobile SoCs, fine-pitch BGA | Requires advanced pick-and-place |
| 0402 | 1.0 × 0.5 | Consumer electronics, AI GPUs | Most common in AI server boards |
| 0603 | 1.6 × 0.8 | General purpose, decoupling | Easier to handle and inspect |
| 0805 | 2.0 × 1.25 | Power conversion, bulk filtering | Higher capacitance available |
| 1206 | 3.2 × 1.6 | High-voltage, power supply | Used in VRM input filtering |
| 1210 | 3.2 × 2.5 | Bulk capacitance near power rails | Can reach 100µF+ |
A modern GPU like the H100 contains over 80 billion transistors. Each transistor is a tiny switch that toggles between on and off billions of times per second. Every time billions of transistors switch simultaneously, they demand a sudden surge of current from the power supply — a phenomenon called a current transient.
The power supply (voltage regulator module, or VRM) cannot respond instantly. Its inductance and physical distance mean current delivery lags behind demand by tens or hundreds of nanoseconds. During that lag, voltage droops. If voltage droops below the chip's minimum operating threshold — even for a few nanoseconds — the chip makes computation errors or crashes entirely.
MLCCs solve this by acting as local charge reservoirs placed within millimeters of the chip. When the chip demands current, the MLCC discharges instantly — far faster than the VRM can respond. When the transient is over, the VRM recharges the MLCC for the next event. The chip sees a smooth, stable voltage. The MLCC absorbs all the violence of the switching event.
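The reservoir idea reduces to a one-line estimate: to supply a current step I for a time Δt while letting the rail sag by no more than ΔV, the local capacitance must be at least C = I · Δt / ΔV. The sketch below uses assumed, illustrative numbers (a 100 A step, a 100 ns regulator lag, a 50 mV droop budget) and ignores the capacitors' own ESL and ESR, so it is a lower bound rather than a design rule. The function name is hypothetical.

```python
def required_capacitance(current_step_a, lag_s, allowed_droop_v):
    """Minimum ideal capacitance to carry a current step during the regulator lag.

    Derived from I = C * dV/dt, rearranged to C = I * dt / dV.
    """
    return current_step_a * lag_s / allowed_droop_v


# Assumed values for illustration only.
c_needed = required_capacitance(
    current_step_a=100.0,   # sudden extra current demanded by the chip
    lag_s=100e-9,           # time before the VRM catches up
    allowed_droop_v=0.05,   # droop budget on the rail
)
print(f"Local capacitance needed: {c_needed * 1e6:.0f} µF")
```

With these inputs the requirement comes to about 200 µF, far more than any single small MLCC provides, which is one reason so many are placed in parallel around the chip.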
Electrolytic capacitors (the cylindrical ones in older electronics) use a liquid electrolyte and have significant parasitic inductance and resistance. They are good for bulk energy storage but too slow for high-frequency transients.
An MLCC's ceramic dielectric, combined with its extremely short internal current paths (the layers are micrometers thick), gives it an Equivalent Series Inductance (ESL) in the range of 0.5–2 nanohenries. This low inductance is what keeps MLCCs effective at hundreds of MHz, the frequency range of modern processor switching events.
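Why ESL matters so much can be seen from the standard series R-L-C model of a real capacitor. Below its self-resonant frequency, f = 1 / (2π√(L·C)), the part behaves like a capacitor; above it, the inductance dominates and the part stops decoupling. The sketch below assumes a 1 nF capacitor with 10 mΩ of ESR (both illustrative values) and uses the 0.5 nH ESL from the low end of the range quoted above.

```python
import math


def self_resonant_frequency(capacitance_f, esl_h):
    """Frequency at which capacitive and inductive reactance cancel."""
    return 1.0 / (2.0 * math.pi * math.sqrt(esl_h * capacitance_f))


def impedance_magnitude(freq_hz, capacitance_f, esl_h, esr_ohm):
    """|Z| of a series R-L-C capacitor model at a given frequency."""
    omega = 2.0 * math.pi * freq_hz
    reactance = omega * esl_h - 1.0 / (omega * capacitance_f)
    return math.hypot(esr_ohm, reactance)


# Assumed part: 1 nF, 0.5 nH ESL, 10 milliohm ESR (illustrative only).
C, ESL, ESR = 1e-9, 0.5e-9, 0.010

f_srf = self_resonant_frequency(C, ESL)
z_at_srf = impedance_magnitude(f_srf, C, ESL, ESR)

print(f"Self-resonant frequency: {f_srf / 1e6:.0f} MHz")
print(f"Impedance at resonance:  {z_at_srf * 1e3:.1f} milliohm")
```

For this assumed part the resonance sits at roughly 225 MHz, where the impedance bottoms out at the ESR. A larger capacitance with the same ESL resonates lower, which is why boards mix several capacitor values to cover a wide frequency band.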
An AI server is not one computing device — it is a symphony of interdependent subsystems, each with its own power integrity demands. MLCCs appear at every layer, in different sizes, capacitances, and dielectric classes.
The densest concentration of MLCCs in any AI server is directly around the GPU package. These capacitors — often in 0402 or 0201 packages — are placed in a ring within 5mm of the die edge, and sometimes even in the package substrate itself (embedded capacitors). Their job: supply instantaneous current during compute bursts when thousands of CUDA cores activate simultaneously.
The VRM converts 48V or 12V supply rail voltage down to the GPU's core voltage (typically 0.8–1.1V). This conversion is noisy. Large MLCCs (0805, 1206) on the input and output of the switching converter filter the high-frequency switching noise that would otherwise propagate to the GPU.
HBM3 memory in GPUs like the H100 runs at extremely high bandwidth (3.35 TB/s in aggregate across the GPU's memory stacks). The memory interface is sensitive to noise. MLCCs are placed densely around each HBM stack to maintain clean power at the memory I/O circuits.
NVLink (used for GPU-to-GPU communication at 900 GB/s) and PCIe Gen5 interfaces contain serializer/deserializer (SerDes) circuits running at tens of gigabits per second. These are noise-intolerant. MLCCs in the power domains of these circuits prevent bit errors from power supply noise.
The host CPU(s) — typically dual AMD EPYC or Intel Xeon — have their own extensive MLCC populations. Modern server CPUs with up to 96 cores switching at multi-GHz frequencies create substantial transients. DDR5 memory requires tight voltage regulation (1.1V ±2%), enforced by capacitor arrays along each DIMM slot.
Not all MLCCs are created equal. Their dielectric material determines their electrical behavior — and engineers must choose carefully based on where in the circuit the capacitor lives.
| Dielectric Code | Temp Range | Capacitance Variation | Typical C Range | Primary AI Server Use |
|---|---|---|---|---|
| C0G (NP0) | −55 to +125°C | ±30 ppm/°C (negligible) | 1pF – 100nF | Oscillators, PLL filters, RF bypass |
| X7R | −55 to +125°C | ±15% | 1nF – 47µF | General decoupling near CPUs/GPUs |
| X5R | −55 to +85°C | ±15% | 1µF – 100µF | High-cap bulk decoupling on PCB |
| Y5V | −30 to +85°C | +22%/−82% | Up to 470µF | Rare in servers; high cap, poor stability |
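The capacitance-variation column translates directly into worst-case bounds for a given nominal value. The sketch below covers only the temperature limits from the table; it does not model DC-bias derating or aging, both of which reduce the capacitance of X7R, X5R, and Y5V parts further in practice. The dictionary and function names are hypothetical.

```python
# Capacitance change limits over the rated temperature range,
# expressed as fractions of nominal (values from the table above).
DIELECTRIC_LIMITS = {
    "X7R": (-0.15, 0.15),
    "X5R": (-0.15, 0.15),
    "Y5V": (-0.82, 0.22),
}


def capacitance_bounds(nominal_f, dielectric):
    """Return (min, max) capacitance over temperature for a dielectric code."""
    low, high = DIELECTRIC_LIMITS[dielectric]
    return nominal_f * (1 + low), nominal_f * (1 + high)


for code in ("X7R", "Y5V"):
    c_min, c_max = capacitance_bounds(10e-6, code)
    print(f"{code}: 10 µF nominal -> {c_min * 1e6:.1f} to {c_max * 1e6:.1f} µF")
```

A nominal 10 µF X7R part stays between 8.5 and 11.5 µF across its temperature range, while a Y5V part of the same nominal value can fall to 1.8 µF, which illustrates why Y5V is rare in servers.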
Engineers do not place one type of capacitor and call it done. Modern AI chip power delivery uses a three-stage capacitor hierarchy, with each stage handling a different frequency range of power transients.
The MLCC industry is one of the most concentrated in all of electronics. Three Japanese companies — Murata, TDK/EPCOS, and Taiyo Yuden — control approximately 70% of global MLCC production. South Korean Samsung Electro-Mechanics and Taiwanese Yageo are also major players. There are almost no significant Western manufacturers.
The concentration of production in Japan creates a genuine supply chain risk that the AI industry is only beginning to grapple with. In 2018, a severe MLCC shortage drove prices up 400–600% on certain components. Automotive and consumer electronics companies scrambled to secure supply. AI server manufacturers are now their own kind of mega-customer, ordering MLCCs in quantities previously unheard of for a single end-use application.
"A single NVIDIA DGX H100 system contains 8 GPUs. Each GPU board alone may require 10,000+ MLCCs. One DGX rack — 100,000+ capacitors. A hyperscale cluster of 100,000 GPUs? You're looking at over a billion MLCCs."
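The arithmetic behind these figures is easy to check. The sketch below takes the quoted 10,000 MLCCs per GPU board as a round lower bound and counts GPU boards only; the constant names are hypothetical.

```python
# Figures quoted in the text, treated as round lower bounds.
MLCCS_PER_GPU_BOARD = 10_000
GPUS_PER_DGX_SYSTEM = 8
GPUS_PER_CLUSTER = 100_000

# GPU boards only; CPUs, memory, and networking add more on top.
mlccs_per_system = MLCCS_PER_GPU_BOARD * GPUS_PER_DGX_SYSTEM
mlccs_per_cluster = MLCCS_PER_GPU_BOARD * GPUS_PER_CLUSTER

print(f"Per 8-GPU system (GPU boards only): {mlccs_per_system:,}")
print(f"Per 100,000-GPU cluster:            {mlccs_per_cluster:,}")
```

The GPU boards alone account for 80,000 capacitors per eight-GPU system, so the quoted 100,000+ presumably includes the host CPUs, memory, and other subsystems. The cluster figure comes to one billion from GPU boards alone.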
Despite their simplicity, MLCCs can fail — and when they do in an AI server, the results can be catastrophic or silent and slow.
The trends in AI computing are pushing power delivery to its limits. The H100 draws up to 700W. The next-generation Blackwell B200 is rated at roughly 1,000W. Future chips may demand 2,000W or more. At these power levels, even the best PCB-mounted MLCCs are not fast enough: the inductance of the board traces and package connections creates voltage droop that cannot be filtered away.
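The limit described here follows from V = L · di/dt: any inductance between the capacitor and the die drops voltage in proportion to how quickly the current changes. The sketch below uses assumed, illustrative values (100 pH of loop inductance and a 100 A step over 100 ns), not measurements from any real board, and the function name is hypothetical.

```python
def inductive_droop(loop_inductance_h, current_step_a, rise_time_s):
    """Voltage dropped across a parasitic inductance: V = L * di/dt."""
    return loop_inductance_h * current_step_a / rise_time_s


# Assumed values for illustration only.
droop_v = inductive_droop(
    loop_inductance_h=100e-12,  # board trace plus package connection
    current_step_a=100.0,       # current step at the start of a compute burst
    rise_time_s=100e-9,         # how quickly that step arrives
)
print(f"Inductive droop: {droop_v * 1e3:.0f} mV")
```

Under these assumptions the droop is about 100 mV, a large fraction of a core rail that sits near 1 V. Because the droop scales with inductance, shortening the path from capacitor to transistor is the only lasting fix.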
Chip makers are exploring integrating capacitors directly into the package substrate — and eventually into the silicon itself. Deep Trench Capacitors (DTCs) are etched into silicon wafers, creating capacitance with near-zero inductance. Intel has announced its "PowerVia" backside power delivery technology, which places power delivery on the back of the die, allowing capacitors to be placed directly below the active transistors.
In 2.5D and 3D IC packaging (used in HBM-on-GPU configurations), capacitors are now being embedded directly into the interposer, the silicon or organic bridge connecting the chips. This reduces the distance from capacitor to transistor from millimeters to micrometers, dramatically improving transient response.
There is something almost philosophical about the MLCC. In a world obsessed with the most advanced, most expensive, most complex technology ever built, the thing that keeps it all running is a grain-of-sand-sized block of ancient ceramic — barium, titanium, oxygen — sintered at 1,200 degrees and placed by the thousands with sub-millimeter precision.
The AI revolution is not only built on transformer architectures and massive datasets and GPU compute. It is built on meticulous, unglamorous power integrity engineering. Every token generated by a large language model was enabled by thousands of tiny ceramic capacitors quietly absorbing voltage transients nobody can see, operating at speeds faster than any mechanical system can contemplate, for years at a time, without complaint.
They are not magic. They are just physics, relentlessly applied. And the next time someone tells you the future runs on silicon, you can tell them it also runs on barium titanate — and a whole lot of it.
"The most important component in an AI server is not the GPU. It is the capacitor that makes sure the GPU can do its job without flinching."