The Machines That Build the Machines Pt. 3 — Beyond Lithography

Sources & Method

This series synthesizes public filings and company disclosures (ASML, Applied Materials, Lam Research, Tokyo Electron, KLA, SCREEN, DISCO, Besi, EVG; FY2023–2025 10-K/20-F or local equivalents), supplier teardown reports, SEMI WFE data, and interviews with process integration engineers. Share ranges are triangulated across Gartner, VLSI Research, and company-reported installed base; they are deliberately given as ranges (e.g., ~45–55%) because tool markets are lumpy. All node references are to high-volume logic (N3/N2 class). Numbers are directional, not investment advice.

1. The Pattern Is Not the Product

Lithography prints a shape in photoresist. That shape is roughly 30nm wide at N3, soon 14nm at N2. It is not a transistor. To become a transistor, that shape must be transferred into silicon, wrapped in a dielectric 0.7nm thick (about 3 atoms), doped with ~1e19 atoms/cm³ to within 2nm of an interface, and connected by 14–16 layers of copper wiring, each planarized to <1nm. Lithography enables; deposition, etch, CMP, implant, and clean execute.

In a leading-edge logic flow, a wafer sees ~1,000–1,500 process steps. Lithography accounts for ~80–120 of them. The other ~900–1,400 are non-litho. They consume ~75% of the ~$18–20B of WFE in a 100K WSPM 2nm fab, and ~80% of the cycle time. They also consume >95% of the atomic-scale precision budget.
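The step arithmetic can be checked directly. The sketch below only subtracts the quoted ranges; the function names are illustrative, and the inputs are the article's own estimates rather than measured data.

```python
# Step counts quoted in the text (estimates, low/high).
TOTAL_STEPS = (1000, 1500)   # process steps per wafer
LITHO_STEPS = (80, 120)      # lithography steps per wafer


def non_litho_steps(total, litho):
    """Return the (low, high) range of non-lithography steps."""
    low = total[0] - litho[1]    # fewest total steps, most litho steps
    high = total[1] - litho[0]   # most total steps, fewest litho steps
    return low, high


def litho_step_fraction(total, litho):
    """Return the (low, high) fraction of all steps that are lithography."""
    return litho[0] / total[1], litho[1] / total[0]


if __name__ == "__main__":
    lo, hi = non_litho_steps(TOTAL_STEPS, LITHO_STEPS)
    f_lo, f_hi = litho_step_fraction(TOTAL_STEPS, LITHO_STEPS)
    print(f"non-litho steps: {lo}-{hi}")
    print(f"litho share of steps: {f_lo:.1%}-{f_hi:.1%}")
```

On these inputs lithography is roughly 5–12% of the step count, so the bulk of the flow sits in the non-litho steps regardless of which end of the range applies.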

Synthesis: Winning in semis is no longer about who has EUV. Everyone who matters will have EUV by 2026. Winning is about who can deposit 3 atoms uniformly, etch 60:1 aspect ratio without bowing, polish to 2 angstroms, and measure it all in-line at 100 wafers/hour.

2. Deposition: Building Up, One Layer at a Time

Deposition is ~22–25% of WFE. Two dominant modes: Chemical Vapor Deposition (CVD/PECVD) for bulk films, and Atomic Layer Deposition (ALD) for conformal angstrom control. Applied Materials leads overall at ~35–40% share, followed by Lam (~15–20%), Tokyo Electron (~15%), and ASM International (~10–12% in ALD).

At N2, the gate stack requires a high-k dielectric (HfO2-based) at 1.2–1.5nm EOT, deposited by thermal ALD with uniformity <0.5% 3σ across 300mm, and metal workfunction layers (TiN, TiAlC) at 0.5–1.0nm. One extra atom changes Vt by >30mV. ALD tools therefore run at 250–350°C with precursor pulse/purge cycles of 0.5–2 seconds, with in-situ QCM and OES monitoring. Throughput: ~40–60 WPH for a 4-chamber system.

Epitaxy for SiGe source/drain is another chokepoint. At GAA nanosheets, you grow SiGe/Si superlattices with <1% Ge control, then selectively etch SiGe to release sheets. Applied’s Centura Epi and ASM’s Intrepid control temperature to ±0.5°C to manage strain. A 1°C drift changes Ge incorporation by 0.3% and shifts PMOS performance by ~5%.
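To see why the ±0.5°C window matters, the quoted 1°C sensitivities can be scaled to smaller drifts. This is a rough sketch that assumes the response is linear in temperature over such a small range, which the text does not state; the constant names are illustrative.

```python
# Sensitivities quoted in the text for a 1 degC drift.
# Linear scaling to other drifts is an assumption made for illustration.
GE_SHIFT_PCT_PER_C = 0.3      # change in Ge incorporation, percent per degC
PMOS_SHIFT_PCT_PER_C = 5.0    # change in PMOS performance, percent per degC


def drift_impact(delta_t_c):
    """Return (Ge shift, PMOS shift) in percent for a temperature drift in degC."""
    magnitude = abs(delta_t_c)
    return magnitude * GE_SHIFT_PCT_PER_C, magnitude * PMOS_SHIFT_PCT_PER_C


if __name__ == "__main__":
    ge, pmos = drift_impact(0.5)
    print(f"0.5 degC drift: Ge {ge:.2f}%, PMOS {pmos:.1f}%")
```

Under that linear assumption, running at the edge of the ±0.5°C window already costs about half of the quoted 1°C penalty.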

Synthesis: Deposition is where materials science becomes manufacturing. The moat is not the chamber, it is the recipe library: 15+ years of precursor-surface kinetics that determine nucleation on a 3D structure.

3. Etch: Carving at the Atomic Limit

Etch is ~20–23% of WFE and one of the most concentrated markets, with three vendors holding nearly all of it: Lam Research ~45–55%, Tokyo Electron ~25–30%, Applied ~15–20%. Dielectric etch, conductor etch, and atomic layer etch (ALE) are now distinct disciplines.

At N2, you etch a ~60nm deep, ~8nm wide contact hole through 5 materials with selectivity >100:1 to SiN liner, leaving <1nm sidewall damage. That's done with pulsed plasma at 10–30mTorr, using fluorocarbon chemistry with bias power <50W to control ion energy to <30eV. Lam's Syndion and TEL's Trias etchers use 2–10MHz RF with 10µs pulsing to reduce charging.

ALE removes ~0.5–1.0 angstrom per cycle. A typical gate etch is now ~30 ALE cycles followed by a smoothing clean, not a continuous plasma step. Throughput drops from 250 WPH to ~50 WPH, but variability drops by 3x. This is why the number of etch chambers needed per wafer start has doubled from N7 to N2.
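The ALE numbers imply a small total removal and a large throughput penalty per converted step. A minimal sketch of that arithmetic, using only the figures quoted above (function names are illustrative):

```python
def ale_removed_nm(cycles, angstrom_per_cycle):
    """Total material removed, in nm, for a number of ALE cycles."""
    return cycles * angstrom_per_cycle / 10.0   # 10 angstrom = 1 nm


def chamber_multiplier(continuous_wph, ale_wph):
    """Chambers needed per step, relative to continuous etch, at equal wafer output."""
    return continuous_wph / ale_wph


if __name__ == "__main__":
    print(ale_removed_nm(30, 0.5), "to", ale_removed_nm(30, 1.0), "nm removed")
    print(chamber_multiplier(250, 50), "x chambers for a converted step")
```

A converted step needs about 5x the chambers for the same output. The overall chamber count only doubles because not every etch step in the flow moves to ALE; that split is an inference, not a figure from the text.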

Synthesis: Etch is where Moore's Law lives or dies. You cannot EUV your way out of a bowed profile or striated sidewall.

4. Metrology and Inspection: The Closed Loop

KLA dominates at ~50–55% of process control, with Applied (~15%), ASML HMI (~10%), and Hitachi (~10%). A 2nm flow requires >500 measurement steps. Critical dimensions are measured with CD-SEM at <0.3nm precision; overlay with DUV scatterometry at <0.5nm; films with spectroscopic ellipsometry at 0.1 angstrom.

E-beam inspection finds voids in copper <5nm across a 100mm² die. Each tool costs ~$60–80M and runs ~5 WPH. A fab may have 30–40. Without them, yield learning stretches from 6 months to 18.

Synthesis: Metrology is ~12% of WFE but controls 100% of yield ramp. The feedback loop is the product.

5. CMP: The Flattener

Without CMP, the rest of the fab loses focus—literally and economically—because every downstream step inherits accumulated topography error.

CMP is ~6–8% of WFE and looks simple: a pad, slurry, and pressure. It is not. Applied Materials owns ~60–70% with Mirra, Ebara holds ~25–30%. A leading logic wafer sees CMP 30–40 times: after STI, after each gate, after each of 14–16 metal layers.

The spec at N3: global non-uniformity <5 angstroms across 300mm, dishing <2nm in a 50µm copper pad, erosion <1nm, defectivity <0.01/cm² for scratches >20nm. Pads rotate at 80–120 rpm, downforce at 1–3 psi, slurry flow 150–250 ml/min with particles 30–80nm. In-situ eddy current measures metal clearance to <10 angstroms.

Historically, CMP existed for lithography depth-of-focus. At 193i, DOF ~80nm; at EUV, ~40nm. A 3nm step across field kills focus budget. That drove the first atomic planarization requirements.

CMP and Hybrid Bonding: Why 0.2nm RMS Matters

For thirty years, the purpose of CMP was simple: keep transistor layers flat enough for the next lithography exposure. A few nanometers of topography across a field could defocus the stepper and kill yield. That spec — global planarization to <2nm — was already brutal. But advanced packaging has now rewritten the requirement by an order of magnitude. It is no longer about focus. It is about atomic contact.

Hybrid bonding is the direct fusion of two wafers or dies without solder microbumps. Instead of ~40 micron pitch microbumps, copper pads and surrounding silicon dioxide are polished perfectly flat, activated, and pressed together at room temperature. Van der Waals and hydrogen bonding hold the dielectrics together on contact; a subsequent low-temperature anneal fuses them into a continuous oxide network and grows copper grains across the interface. No underfill, no gap, no bump. The result is <1 micron pitch, routinely 0.8µm today and 0.4µm in R&D. Because areal density scales with the inverse square of pitch, that is ~2,500x the interconnect density of 40 micron microbumps at 0.8µm and ~10,000x at 0.4µm.
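The density gain from pitch scaling follows from geometry: on a square grid, connections per unit area go as the inverse square of pitch. A minimal sketch, assuming a square pad array and the ~40 micron reference pitch quoted above (function names are illustrative):

```python
def pads_per_mm2(pitch_um):
    """Connections per square millimeter for a square grid at the given pitch."""
    pads_per_mm = 1000.0 / pitch_um
    return pads_per_mm ** 2


def density_gain(ref_pitch_um, new_pitch_um):
    """Areal density ratio when moving from ref_pitch_um to new_pitch_um."""
    return (ref_pitch_um / new_pitch_um) ** 2


if __name__ == "__main__":
    for pitch in (0.8, 0.4):
        gain = density_gain(40.0, pitch)
        print(f"{pitch} um pitch: {pads_per_mm2(pitch):,.0f} pads/mm2, {gain:,.0f}x vs 40 um")
```

Halving the pitch quadruples the density, which is why the step from 0.8µm to 0.4µm matters as much as it does.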

The process flow makes CMP the gatekeeper. First, the top metal layer is deposited and patterned with damascene copper pads embedded in SiO2 or SiCN. Then comes the critical CMP step: the wafer must be polished to <0.2nm RMS roughness on the oxide and <0.3nm on the copper, with total topography across any 1mm pad field held to <0.5nm. This is not selective polish; copper and dielectric must co-planarize to angstroms. After polish, the wafer goes through a post-CMP clean designed to remove slurry particles without corroding copper: dilute citric acid, megasonic DIW, and a final brush scrub targeting a particle budget of <10 defects per wafer for particles >30nm. Next is plasma activation, typically N2 or Ar at low power, to create hydrophilic –OH groups on the dielectric. A DIW rinse terminates the surface, then the two wafers are aligned to <100nm overlay accuracy and contacted at room temperature, 1 atm, in a Class 1 environment. The pre-bond is purely surface-energy driven. Finally, the pair is annealed at 150–300°C — most production flows use ~200–250°C for 2 hours — to expand and interdiffuse copper, achieving bond strength >2 J/m².
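The flow above amounts to a chain of pass/fail gates, and a wafer pair that misses any one of them should not be bonded. A hedged sketch of such a gate follows; the thresholds are the ones quoted in the paragraph, while the class and field names are hypothetical and do not come from any tool vendor's software.

```python
from dataclasses import dataclass


@dataclass
class PreBondMeasurement:
    oxide_rms_nm: float          # oxide roughness after CMP
    cu_rms_nm: float             # copper roughness after CMP
    field_topography_nm: float   # total topography across a 1mm pad field
    particles_over_30nm: int     # particles per wafer after post-CMP clean
    overlay_nm: float            # wafer-to-wafer alignment error
    anneal_c: float              # post-bond anneal temperature


def failed_checks(m):
    """Return the names of the quoted limits that a measurement set violates."""
    checks = {
        "oxide_rms": m.oxide_rms_nm < 0.2,
        "cu_rms": m.cu_rms_nm < 0.3,
        "field_topography": m.field_topography_nm < 0.5,
        "particles": m.particles_over_30nm < 10,
        "overlay": m.overlay_nm < 100.0,
        "anneal_window": 150.0 <= m.anneal_c <= 300.0,
    }
    return [name for name, ok in checks.items() if not ok]


if __name__ == "__main__":
    wafer = PreBondMeasurement(0.25, 0.25, 0.4, 12, 80.0, 225.0)
    print(failed_checks(wafer))
```

Returning the list of failed checks, rather than a single boolean, mirrors how a yield engineer would want to see which upstream module (polish, clean, align, anneal) to chase.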

Why the angstroms matter becomes obvious when you model contact mechanics. Dielectric-dielectric bonding initiates first because the oxide fields are larger than the copper pads. Once those surfaces adhere, ~1–2 J/m² of surface energy pulls the two wafers together. The copper pads, intentionally left recessed by ~1–2nm after CMP, are then forced into proximity. During anneal, copper expands ~0.5% more than oxide due to CTE mismatch, closing the gap and forming a metallic bond. If CMP leaves a 0.5nm step in the oxide, a divot, or a particle >30nm, the dielectric cannot bridge. A void forms, and nothing downstream will close it: the bond is surface-energy driven, not thermo-compression, so there is no applied pressure to force the surfaces together. One 0.5nm asperity on a 300mm wafer can propagate into a millimeter-scale unbonded region and kill a high-bandwidth memory stack.
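The CTE argument can be put into numbers with a one-line linear estimate. The sketch below assumes textbook room-temperature expansion coefficients (about 17 ppm/K for copper and about 0.5 ppm/K for SiO2) and a hypothetical 1 micron pad thickness. Neither value comes from the text, and a real pad is laterally constrained, so this is an order-of-magnitude check only.

```python
# Assumed handbook values, not taken from the article.
CTE_CU_PPM_PER_K = 17.0     # approximate linear CTE of copper
CTE_SIO2_PPM_PER_K = 0.5    # approximate linear CTE of silicon dioxide


def differential_strain(anneal_c, ref_c=25.0):
    """Extra linear expansion of copper relative to oxide, as a fraction."""
    delta_cte = (CTE_CU_PPM_PER_K - CTE_SIO2_PPM_PER_K) * 1e-6
    return delta_cte * (anneal_c - ref_c)


def free_expansion_nm(pad_thickness_nm, anneal_c, ref_c=25.0):
    """Unconstrained extra copper growth in nm for a pad of the given thickness."""
    return pad_thickness_nm * differential_strain(anneal_c, ref_c)


if __name__ == "__main__":
    for temp_c in (200.0, 250.0, 300.0):
        strain = differential_strain(temp_c)
        grow = free_expansion_nm(1000.0, temp_c)   # hypothetical 1 um thick pad
        print(f"{temp_c:.0f} C: {strain:.2%} differential strain, {grow:.1f} nm")
```

The estimate lands at a few tenths of a percent, or a few nanometers for a micron-scale pad. That is the same order as the quoted ~0.5% and explains why the post-CMP recess has to be controlled to about a nanometer.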

The physics amplifies CMP imperfections. Copper dishing must be held to <2nm, ideally 1–1.5nm, across patterned density variations from 10% to 90%. Erosion of the surrounding oxide must be <1nm. Why? Because at anneal, copper grains recrystallize and grow. If the pad is recessed >3nm, the expanding copper never touches its counterpart; you get a resistive open or a void filled with native oxide. If the pad is proud >1nm, it contacts early and prevents dielectric bonding, leaving a perimeter leak path. Grain size matters too: CMP-induced scratching creates nucleation sites that pin grains, reducing diffusion. Therefore post-CMP anneal of the copper itself is often done before bonding to enlarge grains to >200nm, making subsequent Cu-Cu diffusion more uniform.
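The recess window described above can be written as a small classifier. This is a sketch of the stated thresholds only. The sign convention and the label names are assumptions, as is treating a pad that is proud by less than 1nm as acceptable, since the text does not spell that case out.

```python
def classify_pad(recess_nm):
    """Classify a post-CMP copper pad by its height relative to the oxide.

    recess_nm > 0 means the copper sits below the oxide (dished);
    recess_nm < 0 means the copper stands proud of the oxide.
    """
    if recess_nm > 3.0:
        return "open_risk"            # copper never reaches its counterpart
    if recess_nm < -1.0:
        return "early_contact_risk"   # copper touches first, oxide cannot bond
    if recess_nm >= 2.0:
        return "out_of_spec"          # misses the <2nm dishing limit
    if 1.0 <= recess_nm <= 1.5:
        return "ideal"                # the 1-1.5nm target band
    return "in_spec"


if __name__ == "__main__":
    for value in (3.5, 2.5, 1.2, 0.5, -1.5):
        print(f"{value:+.1f} nm -> {classify_pad(value)}")
```

The asymmetry is the point: the process tolerates more recess than protrusion, because anneal expansion can close a small gap but cannot pull back a pad that already touched too early.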

This shifts equipment requirements dramatically. Traditional CMP tools were tuned for removal rate and within-wafer uniformity. For hybrid bonding, they become atomic sculpting platforms. Applied Materials’ Mirra Durum and Ebara’s F-REX 300 now run with in-situ optical interferometry and eddy-current endpoint that resolves thickness to <3 angstroms in real time, with pad conditioning cycles controlled by AI to hold removal rate drift <1% wafer-to-wafer. Slurry choice moves from high-rate silica to colloidal silica or ceria with <50nm particles and tightly controlled zeta potential to minimize scratching. Post-CMP cleaning, historically an afterthought, is now co-optimized: SCREEN and TEL single-wafer cleaners use dilute chemistries and Marangoni drying to achieve <10 adders >30nm, validated by SP7 inspection. The activation chamber must hold base pressure <1e-6 Torr with <5ppb O2 and H2O, because any re-oxidation kills hydrophilicity. Bonding tools from Besi and EVG now specify alignment <100nm 3-sigma, force control <5N, and temperature uniformity ±0.5°C across the chuck during pre-bond.

And this is where the subsystems covered in Section 8 enter the story. That <0.2nm RMS spec is not achieved by a polishing head alone. It requires a VAT high-speed vacuum valve holding chamber pressure stable to ±0.1% during plasma activation, because pressure spikes change ion density and activation uniformity. It requires an Edwards iXH dry pump with <1% speed variation and zero hydrocarbon backstreaming, because backstreamed oil deposits on the activated surface and creates hydrophobic spots, while speed fluctuation shows up as pressure ripple. It requires MKS or VAT pressure control holding the activation chamber at 50mTorr ±0.5mTorr for 60 seconds, repeatably. The bond itself happens in a mini-environment purged with N2 and filtered to ISO Class 1; a single 50nm particle landing on the pad creates a 5-micron void after anneal. The 0.2nm target, in other words, is a whole-system stability problem, not a polish problem.
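The 50mTorr ±0.5mTorr requirement is easy to state as a check over a logged pressure trace. A minimal sketch, assuming the trace is a plain list of readings in mTorr; the function name and the sample values are hypothetical.

```python
def pressure_report(samples_mtorr, setpoint_mtorr=50.0, tol_mtorr=0.5):
    """Summarize how a pressure trace compares with the setpoint band."""
    worst = max(abs(p - setpoint_mtorr) for p in samples_mtorr)
    return {
        "worst_excursion_mtorr": worst,
        "band_pct": 100.0 * tol_mtorr / setpoint_mtorr,   # band as % of setpoint
        "in_band": worst <= tol_mtorr,
    }


if __name__ == "__main__":
    trace = [50.0, 50.3, 49.8, 50.1]   # hypothetical readings during activation
    print(pressure_report(trace))
```

Expressed as a fraction, ±0.5mTorr on a 50mTorr setpoint is a ±1% band, held for the full 60-second activation step.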

Economically, this changes what CMP is worth. Historically CMP was ~6–8% of WFE, a necessary planarization tax. With hybrid bonding, CMP becomes the enabler for packaging-led scaling. TSMC’s SoIC, Intel’s Foveros Direct, Samsung’s X-Cube, and the hybrid-bonded HBM stacks on vendor roadmaps all depend on it. A hybrid-bonded HBM interface delivers >3TB/s/mm bandwidth at <0.5pJ/bit, versus ~0.3TB/s/mm for microbump. Without <0.2nm flatness, those stacks delaminate or have >10% voids, which fails reliability. Therefore leading OSATs and foundries now buy dedicated hybrid-bonding CMP tools at $8–10M each, plus matched cleaners and metrology (AFM, bond wave scanners), effectively adding $200–300M per 100K WSPM of 3D capacity. CMP is no longer a back-end step after transistors are done. It is a front-end enabler for the only scaling path left when transistors stop shrinking: stacking. This is why flatness moved from nanometers to angstroms, and why the flattener now sits at the center of the roadmap.

Synthesis: CMP was about depth of focus. Now it is about surface energy. That change raised the precision bar 10x and pulled the entire subsystem stack — pumps, valves, filters, cleaners — into the critical path.

6. Ion Implantation: Doping with Statistics

Implant is ~3–4% of WFE. Applied Materials (~65–70%) and Axcelis (~25–30%) dominate. At N2 GAA, you implant source/drain extensions with dose ~1e15 cm⁻² at <2keV, placing the peak <5nm deep with <1nm straggle. Beam current must be uniform to <0.5% or Vt shifts across wafer.
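The dose and straggle figures can be turned into an as-implanted peak concentration with the standard Gaussian approximation, N_peak = dose / (sqrt(2π) · straggle). The sketch below assumes a purely Gaussian depth profile and ignores channeling and any diffusion during anneal; the 4nm projected range used in the example is a hypothetical value inside the quoted <5nm.

```python
import math


def peak_concentration_cm3(dose_cm2, straggle_nm):
    """Peak of a Gaussian implant profile, in atoms per cubic centimeter."""
    straggle_cm = straggle_nm * 1e-7   # 1 nm = 1e-7 cm
    return dose_cm2 / (math.sqrt(2.0 * math.pi) * straggle_cm)


def concentration_cm3(depth_nm, dose_cm2, range_nm, straggle_nm):
    """Gaussian profile value at a given depth below the surface."""
    peak = peak_concentration_cm3(dose_cm2, straggle_nm)
    z = (depth_nm - range_nm) / straggle_nm
    return peak * math.exp(-0.5 * z * z)


if __name__ == "__main__":
    dose, straggle, rp = 1e15, 1.0, 4.0   # rp is an assumed projected range
    print(f"peak: {peak_concentration_cm3(dose, straggle):.2e} cm^-3")
    print(f"1nm past peak: {concentration_cm3(rp + 1.0, dose, rp, straggle):.2e} cm^-3")
```

With 1nm of straggle the concentration falls by about 40% within a single nanometer of the peak, which is why sub-percent beam uniformity translates so directly into Vt control.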

For power devices and image sensors, high-energy implant goes to 2–3MeV to form deep wells. Throughput varies from 300 WPH (low energy) to 30 WPH (high energy). Cryogenic implant (–100°C) is now used for amorphization to reduce defects.

Synthesis: Implant is brute force with statistical control. The moat is beam purity and thermal management.

7. Strip, Clean, and Thermal

Every etch and implant is followed by a clean. SCREEN (~40–45%), TEL (~35%), and SEMES dominate wet clean. A single-wafer clean tool performs up to 15 chemical steps in 90 seconds, drawing on SPM, APM, DIO3, dilute HF, and IPA dry. Particle removal efficiency must be >99% for 20nm particles while removing less than 0.1nm of SiGe.

Thermal is furnaces and RTP. TEL and Kokusai lead vertical furnaces for anneal at 400–1000°C with ±1°C uniformity across 150 wafers. RTP spikes to 1050°C in <1s for dopant activation.

8. The Subsystem Stack Nobody Sees

The tool is a shell. Inside: VAT vacuum valves (~70% share) cycling >2M times without particle generation; Edwards and Ebara dry pumps holding 10⁻³ Torr with <1% variation; MKS mass flow controllers metering 2 sccm of gas to ±0.5%; Advanced Energy RF generators pulsing at 10µs; Entegris filters removing >99.99999% of particles >5nm from chemicals.

A single EUV scanner has ~4,000 VAT valves, 50 Edwards pumps, and 200 MKS MFCs. A deposition cluster has 12–16 pressure control loops holding ±0.2mTorr. Hybrid bonding tools add another layer: UHV chambers at <5e-9 Torr, all-metal seals, and copper-free pumps to avoid contamination. When a valve sticks for 0.1s, pressure spikes 2mTorr, plasma density shifts 3%, and etch depth drifts 1nm, enough to kill a nanosheet.

Synthesis: The angstrom regime is won in the subsystem layer. Tool OEMs integrate; subsystem specialists dictate variability.

9. Whole-System Visual

Flow is iterative, not linear. Metrology closes every loop. CMP is now upstream of packaging.

10. Oligopoly Map

Process	Leaders	Est. Share	Moat
Deposition CVD/ALD	Applied, Lam, TEL, ASM	AMAT ~35–40%	Precursor chemistry + chamber matching
Etch	Lam, TEL, Applied	Lam ~45–55%	Plasma pulsing recipes for HAR
CMP	Applied, Ebara	AMAT ~60–70%	Endpoint + pad/slurry co-design
Hybrid Bonding CMP/Clean	Applied/Ebara, SCREEN/TEL	Emerging	<0.2nm RMS stability
Metrology/Inspection	KLA, Applied, ASML HMI	KLA ~50–55%	Optics + ML defect classification
Implant	Applied, Axcelis	AMAT ~65–70%	Beam current uniformity
Clean	SCREEN, TEL	SCREEN ~40–45%	Damage-free particle removal
Subsystems	VAT, Edwards, MKS, Entegris	VAT ~70% valves	Uptime + contamination

11. Illustrative 2nm Fab — Where Money Goes

A 100K WSPM greenfield logic fab in 2026 is ~$23–26B total, with $18–20B in tools. Beyond lithography (~$4.5B, 8–10 EUV tools at ~$200M + 25 DUV):

Deposition: ~$4.0–4.5B (200+ chambers)
Etch: ~$3.8–4.2B (180+ chambers)
Metrology/Inspection: ~$2.2–2.6B (120+ tools)
CMP: ~$1.3–1.5B (60–70 platens, including 10–12 hybrid-bonding grade)
Clean: ~$1.1–1.3B
Implant/Thermal: ~$0.9–1.1B
Out of that, subsystems (pumps, valves, RF, filters, chucks) are ~$2.5–3B embedded inside tool pricing.
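The line items above can be summed to check that they reconcile with the tool total and to see what share of spend sits outside lithography. A minimal sketch using only the ranges listed; the dictionary and function names are illustrative.

```python
LITHO_B = 4.5   # lithography spend, $B

# Non-litho line items, $B (low, high), as listed above.
NON_LITHO_B = {
    "Deposition": (4.0, 4.5),
    "Etch": (3.8, 4.2),
    "Metrology/Inspection": (2.2, 2.6),
    "CMP": (1.3, 1.5),
    "Clean": (1.1, 1.3),
    "Implant/Thermal": (0.9, 1.1),
}


def non_litho_total():
    """Return (low, high) non-lithography tool spend in $B."""
    low = sum(lo for lo, _ in NON_LITHO_B.values())
    high = sum(hi for _, hi in NON_LITHO_B.values())
    return low, high


def non_litho_share():
    """Return (low, high) non-litho fraction of total tool spend."""
    low, high = non_litho_total()
    return low / (low + LITHO_B), high / (high + LITHO_B)


if __name__ == "__main__":
    low, high = non_litho_total()
    s_low, s_high = non_litho_share()
    print(f"non-litho: ${low:.1f}-{high:.1f}B")
    print(f"all tools: ${low + LITHO_B:.1f}-{high + LITHO_B:.1f}B")
    print(f"non-litho share: {s_low:.0%}-{s_high:.0%}")
```

The items add up to roughly $13–15B outside lithography and $18–20B overall, so about three quarters of tool spend goes to the processes covered in this part.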

Add ~$200–300M for a dedicated hybrid-bonding line (CMP, clean, plasma activate, Besi/EVG bonders, IR inspection) per 100K WSPM of 3D stacking capacity. That line did not exist at N7.

Conclusion

Beyond lithography is where semis became atomic manufacturing. Deposition places atoms. Etch removes them with single-digit-angstrom damage. CMP flattens a 300mm wafer to tolerances smaller than a protein. Metrology verifies it all.

And now, with hybrid bonding, CMP has moved from supporting cast to protagonist. The <0.2nm RMS requirement forces the entire fab — slurry, pad, cleaner, pump, valve, filter, metrology — to behave as one distributed servo loop. That is why Applied and Ebara are investing in angstrom-level endpoint, why SCREEN re-architected post-CMP cleans for <10 particles, and why VAT and Edwards show up in packaging tool specs for the first time.

The machines that build the machines are no longer just printing patterns. They are engineering surfaces for surface-energy bonding, because when transistors stop shrinking, the stack takes over.