Quantum Series #5: Quantum Simulation — Why Chemistry and Pharma Actually Care

1

Why simulate molecules at all?

Chemistry runs on the energy landscape of molecules. Reaction barriers, binding affinities, spectra, acidity, redox potentials — all are energy differences between electronic states.

Problem statement:

Given atomic positions, find the ground-state electronic energy $E_0$ to within chemical accuracy:

$\approx 1$ kcal/mol $\approx$ 1.6 millihartree $\approx$ 0.043 eV

This tolerance is brutal: errors larger than this flip predictions about which reaction dominates, whether a drug binds, or whether a catalyst works. Classical methods either scale badly with electron count or use approximations that fail for strongly correlated systems — the interesting ones.
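The three figures quoted for chemical accuracy are the same quantity in different units. A minimal sketch of the conversion, assuming rounded CODATA-style factors; the constant and function names are illustrative, not from any library:

```python
# Conversion factors (rounded); names are illustrative assumptions.
HARTREE_IN_KCAL_PER_MOL = 627.509474
HARTREE_IN_EV = 27.211386

def kcal_per_mol_to_mhartree(e_kcal):
    """Convert an energy in kcal/mol to millihartree."""
    return 1000.0 * e_kcal / HARTREE_IN_KCAL_PER_MOL

def kcal_per_mol_to_ev(e_kcal):
    """Convert an energy in kcal/mol to electronvolts (per molecule)."""
    return e_kcal / HARTREE_IN_KCAL_PER_MOL * HARTREE_IN_EV

chem_acc_mha = kcal_per_mol_to_mhartree(1.0)
chem_acc_ev = kcal_per_mol_to_ev(1.0)
print(f"1 kcal/mol = {chem_acc_mha:.3f} mHa = {chem_acc_ev:.4f} eV")
```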

Classical Bottleneck

Exact diagonalization (FCI) costs $\mathcal{O}(\binom{M}{N})$ for $N$ electrons in $M$ spin-orbitals. A FeMoco-class active space needs ~$10^{35}$ determinants. DFT is $\mathcal{O}(N^3)$ but has uncontrolled error for strong correlation.

Quantum Promise

Store the wavefunction in $\mathcal{O}(M)$ qubits, one per spin-orbital. With fault tolerance, phase estimation gives $E_0$ to precision $\varepsilon$ in $\mathcal{O}(1/\varepsilon)$ time. Exponential compression of Hilbert space.

2

The problem's math structure

Electronic Hamiltonian in Second Quantization

$$ H = \sum_{pq} h_{pq} \, a_p^\dagger a_q + \frac{1}{2}\sum_{pqrs} h_{pqrs} \, a_p^\dagger a_q^\dagger a_r a_s + E_{\text{nuc}} $$

$h_{pq}$

One-electron integrals: kinetic + nuclear attraction. $\mathcal{O}(M^2)$ terms.

$h_{pqrs}$

Two-electron integrals: electron repulsion. $\mathcal{O}(M^4)$ terms dominate cost.

$a_p^\dagger, a_q$

Fermionic operators. Must map to qubits: Jordan-Wigner or Bravyi-Kitaev.
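To make the fermion-to-qubit mapping concrete, here is a small sketch of the Jordan-Wigner construction as dense matrices, checked against the canonical anticommutation relations. The helper names and the choice of three modes are assumptions for illustration; real codes use sparse Pauli strings instead of dense matrices.

```python
from functools import reduce
import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
SIGMA_MINUS = np.array([[0.0, 1.0], [0.0, 0.0]])  # maps |1> (occupied) to |0> (empty)

def jw_annihilation(p, n_modes):
    """Matrix of a_p: Z on every mode before p, sigma^- on mode p, identity after."""
    factors = [Z] * p + [SIGMA_MINUS] + [I2] * (n_modes - p - 1)
    return reduce(np.kron, factors)

def anticommutator(a, b):
    return a @ b + b @ a

n_modes = 3
dim = 2 ** n_modes
a = [jw_annihilation(p, n_modes) for p in range(n_modes)]

# Check {a_p, a_q^dagger} = delta_pq and {a_p, a_q} = 0 (matrices are real, so .T is the adjoint).
relations_hold = all(
    np.allclose(anticommutator(a[p], a[q].T), np.eye(dim) * (p == q))
    and np.allclose(anticommutator(a[p], a[q]), 0.0)
    for p in range(n_modes)
    for q in range(n_modes)
)
print("anticommutation relations hold:", relations_hold)
```

The string of Z operators is what enforces the fermionic sign; it is also why a single $a_p^\dagger a_q$ term can touch many qubits under this mapping.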

Why this is hard classically:

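The Hilbert space dimension is $\binom{M}{N}$, which grows exponentially with system size (roughly $2^M/\sqrt{M}$ at half filling). Storing one vector for a FeMoco-class active space needs ~$10^{35}$ amplitudes, and a literal count for ~200 spin-orbitals and ~100 electrons gives $\binom{200}{100} \approx 9 \times 10^{58}$. Either way, that is many orders of magnitude more memory than any classical machine can hold.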
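The combinatorial growth is easy to check directly. A short sketch, assuming 16 bytes per complex amplitude; the function names are illustrative:

```python
from math import comb

def n_determinants(n_spin_orbitals, n_electrons):
    """Number of Slater determinants: ways to place N electrons in M spin-orbitals."""
    return comb(n_spin_orbitals, n_electrons)

def n_determinants_fixed_spin(n_spatial, n_alpha, n_beta):
    """Same count when the numbers of spin-up and spin-down electrons are fixed."""
    return comb(n_spatial, n_alpha) * comb(n_spatial, n_beta)

def statevector_bytes(dimension, bytes_per_amplitude=16):
    """Memory for one dense state vector (assumes complex128 amplitudes)."""
    return dimension * bytes_per_amplitude

h2_minimal = n_determinants(4, 2)            # H2 in a minimal basis: 4 spin-orbitals, 2 electrons
half_filled_200 = n_determinants(200, 100)   # the 200 spin-orbital, 100 electron case
print(h2_minimal)
print(f"{half_filled_200:.2e} determinants, {statevector_bytes(half_filled_200):.2e} bytes")
```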

3

Why quantum resources help

Resource flip

A quantum computer doesn't simulate the wavefunction by writing it down. It is the wavefunction. Preparing a trial state $|\phi\rangle$ whose overlap with the ground state $|\psi_0\rangle$ is $|\langle\psi_0|\phi\rangle|^2 = \gamma$ lets you extract $E_0$ using phase estimation.

Cost: $\tilde{\mathcal{O}}(\gamma^{-1} \varepsilon^{-1})$

vs classical FCI: $\exp(N)$ memory

Where the win actually is

✓ Strong correlation: transition metal centers, bond breaking, excited states, where DFT fails
✓ Active spaces: CAS(16,16) to CAS(24,24), at and beyond the practical limit of classical CASSCF
✗ Weak correlation: HF/DFT already good; no advantage
✗ Astrochemistry scale: 10³⁶ conformers is still a sampling problem

4

NISQ vs Fault-Tolerant: not the same game

Aspect	NISQ Era (now-2035)	Fault-Tolerant (2035+)
Qubits	50-1000 noisy, no QEC	10⁶ physical → 10³-10⁴ logical, QEC
Gate depth	10²-10³ before decoherence	10⁶-10⁹ via QEC
Algorithm	VQE, QAOA, QSCI — variational, shallow	QPE, qubitization — rigorous, deep
Guarantees	Heuristic. No convergence proof. Barren plateaus.	Provable in 1/ε. Requires state prep.
Use case	Benchmarks, small active spaces, learning	FeMoco, nitrogenase, catalyst design

Reality check: VQE "failed" at scale means...

VQE can't reach chemical accuracy for FeMoco-class problems. Gradient variance decays exponentially in qubit count — barren plateaus. Optimization landscape becomes a golf course. You need exponentially many shots to find the hole.

But VQE remains critical for NISQ-era hardware calibration, error characterization, and benchmarking. Running VQE on H₂/LiH validates control systems, measures coherence, and tests compiler stacks. It's the 'hello world' that proves your chip isn't dead, not the path to 200-qubit chemistry.

5

Algorithms: QPE vs VQE vs QSCI

QPE

Fault-Tolerant

Quantum Phase Estimation. Apply controlled powers of $U = e^{-iHt}$ to a trial state $|\phi\rangle$, read out the phase via inverse QFT.

Depth: $\mathcal{O}(1/\varepsilon)$

Qubits: $n + \mathcal{O}(\log 1/\varepsilon)$

Success prob: $|\langle\psi_0|\phi\rangle|^2$

Requires: Good initial state prep, deep circuits
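The role of the initial-state overlap can be seen in a classical calculation of the ideal QPE readout distribution. This is a sketch under assumptions: phases are given directly as fractions in $[0, 1)$ (the eigenvalue of $U$ written as $e^{2\pi i \cdot \text{phase}}$), the two-level spectrum and weights are hypothetical, and no noise is modeled.

```python
import numpy as np

def qpe_outcome_probabilities(phases, weights, n_bits):
    """Ideal textbook QPE: probability of each n_bits readout for a mixture of eigenphases.

    weights[i] is the squared overlap of the input state with eigenstate i.
    """
    size = 2 ** n_bits
    steps = np.arange(size)
    readouts = np.arange(size)
    probs = np.zeros(size)
    for phase, weight in zip(phases, weights):
        # Amplitude of readout k after the inverse QFT:
        # (1/size) * sum_j exp(2*pi*i * j * (phase - k/size))
        detuning = phase - readouts / size
        amplitudes = np.exp(2j * np.pi * np.outer(detuning, steps)).sum(axis=1) / size
        probs += weight * np.abs(amplitudes) ** 2
    return probs

# Hypothetical spectrum: ground state at phase 0.25 (overlap 0.7), one excited state at 0.625.
n_bits = 3
probs = qpe_outcome_probabilities(phases=[0.25, 0.625], weights=[0.7, 0.3], n_bits=n_bits)
best_k = int(np.argmax(probs))
estimated_phase = best_k / 2 ** n_bits
print(np.round(probs, 3), estimated_phase)
```

Each run returns the ground-state phase only with probability equal to the overlap, which is why the expected number of repetitions scales as $1/\gamma$.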

VQE

NISQ

Variational Quantum Eigensolver. Quantum prepares $|\psi(\theta)\rangle$, classical minimizes $\langle H\rangle$.

Depth: Shallow, ansatz-dependent

Qubits: $n$ physical

Shots: $\mathcal{O}(1/\varepsilon^2)$

Problem: Barren plateaus, local minima, ansatz expressivity
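The quantum-classical loop can be sketched on a single qubit. Everything here is a toy assumption: the Hamiltonian coefficients are hypothetical, expectation values are computed exactly instead of from shots, and the optimizer is plain gradient descent with the parameter-shift rule.

```python
import numpy as np

Z = np.array([[1.0, 0.0], [0.0, -1.0]])
X = np.array([[0.0, 1.0], [1.0, 0.0]])
H_TOY = 1.0 * Z + 0.5 * X  # hypothetical one-qubit Hamiltonian

def ansatz_state(theta):
    """RY(theta)|0> = [cos(theta/2), sin(theta/2)]."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

def energy(theta):
    """<psi(theta)|H|psi(theta)>; on hardware this would be estimated from shots."""
    psi = ansatz_state(theta)
    return float(psi @ H_TOY @ psi)

def parameter_shift_gradient(theta):
    """Exact gradient for a single RY rotation from two shifted energy evaluations."""
    return 0.5 * (energy(theta + np.pi / 2) - energy(theta - np.pi / 2))

theta = 0.1
for _ in range(200):  # classical outer loop
    theta -= 0.2 * parameter_shift_gradient(theta)

vqe_energy = energy(theta)
exact_energy = float(np.linalg.eigvalsh(H_TOY)[0])
print(f"VQE: {vqe_energy:.6f}  exact: {exact_energy:.6f}")
```

With one parameter and no noise the landscape is a single cosine, so none of the listed problems appear; they only show up as qubit count and ansatz depth grow.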

QSCI

Hybrid

Quantum Selected CI. Sample quantum state, select important determinants, diagonalize classically.

Depth: Shallow prep + sampling

Classical: $\mathcal{O}(s^3)$ subspace diag

Shots: $M$ samples to find basis

Trade: Heuristic sampling, but avoids deep circuits

QSCI = Quantum Selected CI: Step-by-step

1

Use a shallow quantum circuit to prepare an approximate ground state $|\psi\rangle$. This is your cheap ansatz (maybe ADAPT-VQE or hardware-efficient), trained for a few iterations.

2

Measure $|\psi\rangle$ in the computational basis $M$ times (here $M$ counts shots, not orbitals) → get bitstrings. Each bitstring is a Slater determinant: $|001101\ldots\rangle$ means orbitals 3, 4, 6 occupied, reading left to right.

3

Classically select important determinants from bitstrings. Take top $s$ most frequent or use importance sampling. You now have a subspace basis: $\{|D_i\rangle\}_{i=1}^s$.

4

Build reduced Hamiltonian in that subspace: $H_{ij} = \langle D_i | H | D_j \rangle$. Compute matrix elements classically — only need 1- and 2-body integrals.

5

Diagonalize the $s \times s$ matrix classically → energy. $E = \text{min eig}(H_{\text{sub}})$. This is exact in your selected subspace.

So what:

Quantum finds the important basis states, classical does the linear algebra. Classical cost: $\mathcal{O}(s^3)$ for $s$ selected determinants. If $s \sim 10^4$, this is laptop-scale. This is why QSCI is NISQ-relevant: short circuits + classical post-processing.

Trade-off: Heuristic, no guarantee you sampled the right subspace. If the true ground state has weight on determinants you never measured, your energy is wrong. Bias vs variance.
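The five steps above can be sketched end to end on a toy problem. Assumptions, all hypothetical: the Hamiltonian is a random dense matrix standing in for one built from integrals, the "circuit output" is a noisy copy of the exact ground state, and sampling is done classically with NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
n_qubits = 6
dim = 2 ** n_qubits

# Toy Hamiltonian in the determinant basis: ordered diagonal plus weak symmetric coupling.
coupling = 0.05 * rng.normal(size=(dim, dim))
h_full = np.diag(np.linspace(-1.0, 5.0, dim)) + (coupling + coupling.T) / 2
evals, evecs = np.linalg.eigh(h_full)
e_exact = float(evals[0])

# Step 1 stand-in: an approximate ground state (exact state plus noise).
approx_state = evecs[:, 0] + 0.05 * rng.normal(size=dim)
approx_state /= np.linalg.norm(approx_state)

def qsci_energy(h, state, n_shots, s, rng):
    probs = np.abs(state) ** 2
    probs /= probs.sum()
    samples = rng.choice(len(probs), size=n_shots, p=probs)   # step 2: measure bitstrings
    dets, counts = np.unique(samples, return_counts=True)
    selected = dets[np.argsort(-counts)[:s]]                   # step 3: keep the s most frequent
    h_sub = h[np.ix_(selected, selected)]                      # step 4: project H onto the subspace
    return float(np.linalg.eigvalsh(h_sub)[0]), selected       # step 5: diagonalize classically

e_sub, selected = qsci_energy(h_full, approx_state, n_shots=2000, s=8, rng=rng)
bitstrings = [format(int(d), f"0{n_qubits}b") for d in selected]
print(bitstrings)
print(f"subspace energy {e_sub:.4f} vs exact {e_exact:.4f}")
```

The subspace energy can never fall below the exact one, so a missing determinant shows up as an energy that is too high, never too low.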

Method	Quantum Cost	Classical Cost	Best for	Status 2025
QPE	Deep circuits, QEC	poly($n$)	Fault-tolerant, large active spaces	Future
VQE	Shallow, many shots	poly($n$) for optimization	NISQ demos, H₂, LiH, BeH₂	Active
QSCI	Sampling only	$\mathcal{O}(s^3)$ diagonalization	NISQ intermediate, Cr₂, N₂	Emerging

6

Key explainers: Ansätze, overlaps, and embeddings

7

Why you need chemical accuracy, and barren plateaus

Chemical Accuracy Threshold

1 kcal/mol = 0.043 eV = 1.6 mHa

A 1 kcal/mol error in activation energy → ~5× error in rate at 300 K via $k \propto e^{-E_a/RT}$; a 1.4 kcal/mol error → 10×

DFT typical error: 3-5 kcal/mol

Good for geometries, fails for barrier heights of transition metal reactions

Quantum goal: ±0.5 kcal/mol

Predictive for drug binding $\Delta G$, selectivity in catalysis
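How a barrier error turns into a rate error follows directly from $k \propto e^{-E_a/RT}$. A small sketch, assuming the error enters only through the exponent; function names are illustrative:

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def rate_error_factor(barrier_error_kcal, temperature_k=300.0):
    """Multiplicative error in the rate constant caused by an error in E_a."""
    return math.exp(barrier_error_kcal / (R_KCAL * temperature_k))

def barrier_error_for_factor(factor, temperature_k=300.0):
    """Barrier error (kcal/mol) that produces a given multiplicative rate error."""
    return R_KCAL * temperature_k * math.log(factor)

for err in (0.5, 1.0, 3.0, 5.0):
    print(f"{err:.1f} kcal/mol -> rate off by {rate_error_factor(err):.1f}x at 300 K")
print(f"10x rate error corresponds to {barrier_error_for_factor(10.0):.2f} kcal/mol")
```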

Barren Plateaus in VQE

Theorem: For sufficiently deep random circuits (approximate 2-designs), $\text{Var}[\partial_\theta C] \in \mathcal{O}(2^{-n}/\text{poly}(n))$

Implication: Gradient descent needs $2^n$ shots to see signal. VQE dead for $n > 50$ with hardware-efficient ansätze.

Mitigations: Layer-wise training, parameter initialization, symmetry-preserving ansätze, or give up and use QSCI/QPE.
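The shot-count implication is simple arithmetic. A sketch under stated assumptions: the gradient variance is taken as exactly $2^{-n}$ (polynomial prefactor dropped), and each shot contributes at most unit standard deviation.

```python
def shots_to_resolve_gradient(n_qubits, single_shot_std=1.0, snr=1.0):
    """Shots needed so shot noise (std / sqrt(shots)) drops to the typical gradient size."""
    gradient_scale = 2.0 ** (-n_qubits / 2)  # square root of Var ~ 2^-n
    return (snr * single_shot_std / gradient_scale) ** 2

for n in (10, 20, 50):
    print(f"n = {n:2d}: about {shots_to_resolve_gradient(n):.1e} shots per gradient component")
```

At a hypothetical rate of a million shots per second, the $n = 50$ case would already take decades per gradient component.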

8

What chemistry cares about

Reaction Barriers

$\Delta E^\ddagger$ determines rate. Need <2 kcal/mol error. DFT fails for multi-reference transition states.

Use case: Catalyst design

Binding Affinities

$\Delta G_{\text{bind}} = \Delta H - T\Delta S$. Quantum for $\Delta H_{\text{el}}$, classical for entropy. Critical for drug design.

Use case: Hit-to-lead
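To see why small energy errors matter for affinity, here is a sketch that turns $\Delta G_{\text{bind}}$ into a dissociation constant via $\Delta G_{\text{bind}} = RT \ln K_d$ (1 M standard state). The ligand values are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def binding_free_energy(delta_h, temperature_k, delta_s):
    """Delta G = Delta H - T * Delta S, with H in kcal/mol and S in kcal/(mol K)."""
    return delta_h - temperature_k * delta_s

def dissociation_constant(delta_g_bind, temperature_k=298.15):
    """K_d in mol/L from Delta G_bind = RT ln K_d."""
    return math.exp(delta_g_bind / (R_KCAL * temperature_k))

# Hypothetical ligand: favorable enthalpy, unfavorable entropy.
dg = binding_free_energy(delta_h=-14.0, temperature_k=298.15, delta_s=-0.0134)
kd = dissociation_constant(dg)
kd_with_error = dissociation_constant(dg + 2.0)  # same ligand with a 2 kcal/mol error
print(f"dG = {dg:.2f} kcal/mol, K_d = {kd * 1e9:.0f} nM")
print(f"a 2 kcal/mol error changes K_d by {kd_with_error / kd:.0f}x")
```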

Excited States

UV-Vis, fluorescence, photochemistry. TD-DFT errors are typically 0.3-0.5 eV. Need equation-of-motion (EOM) methods on quantum hardware.

Use case: OLEDs, dyes

Redox Potentials

$E^\circ = -\Delta G / nF$. Transition metals in proteins. Self-interaction error in DFT can shift predicted potentials by ~0.5 V.

Use case: Battery materials
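The relation $E^\circ = -\Delta G / nF$ links potential errors to free-energy errors. A short sketch of the conversion, assuming $\Delta G$ in kcal/mol and the thermochemical calorie; names are illustrative:

```python
FARADAY = 96485.33212     # C/mol
KCAL_IN_JOULES = 4184.0   # thermochemical kilocalorie

def potential_from_free_energy(delta_g_kcal, n_electrons=1):
    """E = -Delta G / (n F), returned in volts."""
    return -delta_g_kcal * KCAL_IN_JOULES / (n_electrons * FARADAY)

def free_energy_from_potential(e_volts, n_electrons=1):
    """Delta G = -n F E, returned in kcal/mol."""
    return -n_electrons * FARADAY * e_volts / KCAL_IN_JOULES

shift_kcal = abs(free_energy_from_potential(0.5))
chem_acc_volts = abs(potential_from_free_energy(1.0))
print(f"0.5 V shift = {shift_kcal:.1f} kcal/mol per electron")
print(f"1 kcal/mol  = {chem_acc_volts * 1000:.1f} mV per electron")
```

A 0.5 V shift is therefore more than ten times the chemical accuracy target for a one-electron process.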

Spin States

High-spin vs low-spin Fe in enzymes. Energy gaps ~1-5 kcal/mol. DFT functional-dependent.

Use case: Hemoglobin, P450

Force Constants

Hessian for IR spectra, zero-point energy. Need analytic gradients from quantum simulation.

Use case: Spectroscopy

9

Pharma pipeline: where quantum fits

1

Target ID & Validation

Classical

Genomics, proteomics, CRISPR screens. Quantum has no role — this is biology, not chemistry.

2

Hit Discovery

Hybrid

HTS of 10⁶ compounds. Classical docking + ML. Quantum for rescoring top 1000: accurate $\Delta G_{\text{bind}}$.

Quantum value:

Reduce false positives. Classical MM-GBSA error ~2 kcal/mol → 30× affinity error. Quantum could cut to <1 kcal/mol.

3

Lead Optimization

Quantum Sweet Spot

SAR for 10-100 analogs. Need accurate relative energies, metabolites, pKa, logP. This is the problem quantum was built for.

Quantum value:

First-principles pKa ±0.2 units. CYP450 metabolism barriers. Covalent inhibitor reactivity. Replace 6-week synthesis + assay cycle with 1-day compute.

4

Preclinical → Clinical

Classical

ADME/T, tox, formulation, trials. Classical MD, PK/PD models, statistics. Quantum irrelevant — too many atoms, too much chaos.

The Economic Case

Drug discovery costs ~$2.6B, 10-15 years. 90% failure rate. If quantum simulation in lead-opt cuts cycle time 20% and improves affinity prediction 2×, expected value: ~$500M per approved drug. Break-even for pharma if quantum cloud costs <$50M/year. Currently: NISQ can't deliver. FTQC 2040+: plausible.

10

Quantum+MM Workflows: Hybrid is the answer

QM/MM/QSCI Stack for Enzyme Active Site

QM

Quantum Core: 50-100 atoms

Ligand + metal center + coordinating residues. Run on quantum computer: VQE for NISQ, QPE for FTQC. Output: $E_{\text{QM}}$, $\rho(r)$, forces.

Qubits: 100-200. Depth: 10³-10⁶.

MM

Molecular Mechanics Shell: 10⁴-10⁵ atoms

Protein + water + membrane. Classical force field: AMBER/CHARMM. Provides electrostatic embedding $V_{\text{MM}\to\text{QM}}$, steric constraints.

Cost: $\mathcal{O}(N)$. GPU: ms/step.

SCF

Self-Consistent Field Iteration

1. QM density → MM polarizes → update $V_{\text{ext}}$. 2. Repeat until $\Delta E < 10^{-6}$ Ha. Typically 3-5 cycles.

$E_{\text{total}} = E_{\text{QM}}[\rho; V_{\text{MM}}] + E_{\text{MM}} + E_{\text{QM/MM}}^{\text{coupling}}$
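The self-consistency loop has a simple control flow, sketched here with a scalar toy model. Every quantity is a hypothetical stand-in: a "QM" dipole that responds linearly to the embedding field, an "MM" environment that polarizes linearly in return, and a made-up energy expression. A real workflow would replace the two marked lines with a quantum solver call and a force-field update.

```python
def qmmm_scf(mu0=1.0, alpha_qm=0.2, f0=0.2, chi_mm=0.1, tol=1e-6, max_cycles=50):
    """Iterate QM response and MM polarization until the energy change is below tol."""
    field = f0          # initial embedding field from the unpolarized environment
    e_prev = None
    for cycle in range(1, max_cycles + 1):
        mu = mu0 + alpha_qm * field    # "QM" step: solve the core in the current embedding
        field = f0 + chi_mm * mu       # "MM" step: environment polarizes, embedding is updated
        energy = -mu0 * field - 0.5 * alpha_qm * field ** 2
        if e_prev is not None and abs(energy - e_prev) < tol:
            return energy, field, cycle
        e_prev = energy
    raise RuntimeError("QM/MM self-consistency loop did not converge")

energy, field, cycles = qmmm_scf()
print(f"converged in {cycles} cycles: E = {energy:.6f}, field = {field:.6f}")
```

Convergence is fast when the mutual polarization is weak (here the product `alpha_qm * chi_mm` is 0.02), which is the regime the few-cycle estimate assumes.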

Why this works

→ Separation of scales: Chemistry happens in 5Å radius. Rest is bath. Quantum computer handles hard part, classical handles big part.
→ Error budget: QM error 0.5 kcal/mol, MM error 1 kcal/mol, coupling error 0.3 kcal/mol → total ~1.2 kcal/mol when independent errors add in quadrature. Still beats DFT.
→ Resource fit: 200 qubits handles a CAS(20,20) active space (40 spin-orbitals, so 40 system qubits plus ancillas). Need FTQC for full protein, but not for active site.
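The ~1.2 kcal/mol total in the error budget matches a root-sum-of-squares combination. A minimal check, assuming the three error sources are independent:

```python
import math

def combined_error(*errors):
    """Root-sum-of-squares combination of independent error sources."""
    return math.sqrt(sum(e * e for e in errors))

total = combined_error(0.5, 1.0, 0.3)   # QM, MM, coupling (kcal/mol)
worst_case = 0.5 + 1.0 + 0.3            # if the errors were fully correlated
print(f"quadrature: {total:.2f} kcal/mol, worst case: {worst_case:.1f} kcal/mol")
```

If the errors are correlated, the linear sum is the safer bound, and the MM term dominates either way.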

11

Bridge to Medical AI: From molecules to patients

Quantum simulation doesn't replace medical AI — it feeds it. High-quality quantum data trains ML models that then screen millions of compounds. The pipeline:

Quantum → ML

Quantum computes ground-truth energies for 100-1000 molecules. Train a message-passing NN or Δ-ML model on them. Now predict energies for millions at DFT cost but quantum accuracy.

ANI, MACE, NequIP: E = NN(R), trained on reference data such as DFT or CCSD(T)
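The Δ-ML idea can be shown on a one-dimensional toy: learn only the difference between a cheap method and an expensive reference from a few labels, then add it back everywhere. Both energy functions and the polynomial model are hypothetical stand-ins; real Δ-ML uses molecular descriptors and neural networks or kernel models.

```python
import numpy as np

def e_cheap(x):
    """Hypothetical low-level method (stand-in for DFT or a force field)."""
    return x ** 2

def e_reference(x):
    """Hypothetical high-accuracy reference (stand-in for quantum ground truth)."""
    return x ** 2 + 0.1 * np.sin(3 * x)

# A small, expensive training set: 30 reference labels.
x_train = np.linspace(-1.0, 1.0, 30)
delta_train = e_reference(x_train) - e_cheap(x_train)
coeffs = np.polyfit(x_train, delta_train, deg=7)  # fit the correction only

def e_delta_ml(x):
    """Cheap energy plus learned correction."""
    return e_cheap(x) + np.polyval(coeffs, x)

# A large screening set, evaluated without further reference calculations.
x_test = np.linspace(-1.0, 1.0, 10_000)
mae_cheap = float(np.mean(np.abs(e_cheap(x_test) - e_reference(x_test))))
mae_delta_ml = float(np.mean(np.abs(e_delta_ml(x_test) - e_reference(x_test))))
print(f"MAE cheap: {mae_cheap:.4f}, MAE delta-ML: {mae_delta_ml:.6f}")
```

The correction is smoother and smaller than the total energy, which is why a few hundred high-accuracy labels can go a long way.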

ML → Clinic

ML models predict binding, toxicity, metabolites. Feed into higher-level models: patient response prediction, trial design, dosage optimization. Quantum data → better drugs → better outcomes.

Example: Quantum pKa → ML bioavailability → Clinical success