
Quantum Computing:
Structured Interference at Scale

Quantum computers don't try every answer at once.

They choreograph interference so wrong answers cancel and right answers amplify. When that choreography is possible, you get speedup. When it isn't, you don't.

“Nature isn't classical, dammit, and if you want to make a simulation of nature, you'd better make it quantum mechanical.” — Richard Feynman

Thesis: This is not a universal speedup machine

Quantum computing is a model for exploiting structured interference over exponentially large state spaces. The hype ignores state preparation, error correction, and readout costs. This thesis frames every section that follows.

| Problem | Best Classical | Quantum | Caveats |
| --- | --- | --- | --- |
| Factoring 2048-bit RSA | $2^{112}$ (GNFS) | $\sim 10^8$ Toffoli | ~20M physical qubits, surface code, 2D grid |
| Unstructured search | $O(N)$ | $O(\sqrt{N})$ | Oracle construction cost usually kills the advantage |
| Linear systems $Ax=b$ | $O(N^{2.37})$ | $O(\log N)$* | *Hides $O(N)$ state prep. Only for sparse, well-conditioned $A$ |
| Chemistry ground state | Exp. in accuracy | Poly? | Heuristic. No proof. Ansatz quality unknown. |

1. The Postulates

Quantum mechanics rests on four postulates. They define the rules of the game: what a state is, how it evolves, what happens when you measure, and how systems compose.

1. State Space

An isolated physical system is associated with a complex Hilbert space. The system is completely described by a unit vector $|\psi\rangle$ in that space.

$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle, \quad |\alpha|^2 + |\beta|^2 = 1$

2. Evolution

The evolution of a closed quantum system is described by a unitary transformation $U$.

$|\psi(t_2)\rangle = U(t_1, t_2) |\psi(t_1)\rangle, \quad U^\dagger U = I$

3. Measurement

Projective measurements are described by Hermitian observables. Measuring $M = \sum_m m P_m$ gives outcome $m$ with probability $p(m) = \langle\psi|P_m|\psi\rangle$ and leaves the system in $P_m|\psi\rangle / \sqrt{p(m)}$.

4. Composite Systems

The state space of a composite system is the tensor product of the component state spaces.

$|\psi\rangle_{AB} = |\psi\rangle_A \otimes |\phi\rangle_B \in \mathcal{H}_A \otimes \mathcal{H}_B$

Why this matters

Unitarity = reversibility = no heat from erasure. Classical logic gates like AND are irreversible and dissipate $kT \ln 2$ per bit erased. Quantum gates are unitary matrices so computation is thermodynamically free until measurement. No-cloning theorem: you cannot copy unknown quantum states. This is why quantum error correction is hard — you can't just duplicate qubits like RAID.
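These rules are small enough to spot-check with plain numpy; a quick sanity sketch (matrices written out inline, no quantum SDK involved):

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard gate

# Postulate 2: evolution is unitary, hence reversible
assert np.allclose(H.conj().T @ H, np.eye(2))

# Postulate 1: states are unit vectors
psi = H @ np.array([1, 0])                      # |+> = H|0>
assert np.isclose(np.linalg.norm(psi), 1)

# Postulate 4: composite states live in the tensor (Kronecker) product
psi_AB = np.kron(np.array([1, 0]), psi)         # |0> ⊗ |+>, dimension 4
assert np.isclose(np.linalg.norm(psi_AB), 1)    # still a unit vector
```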

2. Superposition

Intuition

The Hadamard gate puts you on the equator of the Bloch sphere. From $|0\rangle$ at the north pole, $H$ rotates you to $|+\rangle = \frac{|0\rangle + |1\rangle}{\sqrt{2}}$ — equal probability of measuring 0 or 1, but with phase information that lets amplitudes interfere later.

A qubit can exist in a linear combination of $|0\rangle$ and $|1\rangle$. This is not probability — it's amplitude. The difference: amplitudes can be negative and cancel.

superposition.py
from qiskit import QuantumCircuit, Aer, execute
from qiskit.visualization import plot_histogram

# Hadamard creates superposition: |0⟩ → (|0⟩ + |1⟩)/√2
qc = QuantumCircuit(1, 1)
qc.h(0)  # Equator of Bloch sphere
qc.measure(0, 0)

simulator = Aer.get_backend('qasm_simulator')
result = execute(qc, simulator, shots=1024).result()
counts = result.get_counts()

# Interference example: H-Z-H = X
# H puts |0⟩ in superposition, Z flips the phase of |1⟩,
# the second H makes the amplitudes interfere: deterministic |1⟩
qc2 = QuantumCircuit(1, 1)
qc2.h(0)
qc2.z(0)  # Adds -1 phase to |1⟩ component
qc2.h(0)  # Interference: the |0⟩ amplitudes cancel, the |1⟩ amplitudes add
qc2.measure(0, 0)

Reality check

Measuring destroys superposition. You don't get to read all $2^n$ values. You get one sample from the probability distribution $|\alpha_i|^2$. The art is using interference before measurement to ensure the distribution peaks on the right answer. Without interference, superposition is just expensive randomness.

3. Entanglement

Some composite states cannot be written as tensor products of individual states. These are entangled. Bell states are the canonical examples.

$|\Phi^+\rangle = \frac{|00\rangle + |11\rangle}{\sqrt{2}}$

Measure the first qubit. If you get 0, the second is instantly 0. If you get 1, the second is instantly 1. No classical correlation reproduces the statistics once you measure in rotated bases: that is Bell's theorem.

bell_state.py
from qiskit import QuantumCircuit

# Create Bell state: H on control, then CNOT
qc = QuantumCircuit(2, 2)
qc.h(0)        # Superposition
qc.cx(0, 1)    # Entangle: |00⟩ + |11⟩
qc.measure([0, 1], [0, 1])

# This 2-qubit state requires 4 complex amplitudes classically.
# 50 qubits require 2^50 ≈ 1.1e15 amplitudes = 18 petabytes at 16 bytes each (complex128).
# That's why you can't simulate 50 qubits on your laptop.

So what?

This is why you can't simulate 50 qubits classically: you'd need $2^{50}$ amplitudes to store the statevector. That's 18 PB of double-precision complex numbers, before you even compute a single gate. Entanglement is what gives quantum computers their exponential state space. It is also what makes errors dangerous: one decohered qubit can corrupt every qubit entangled with it.
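The arithmetic behind that claim fits in four lines:

```python
n = 50
bytes_per_amplitude = 16            # complex128: two float64s
total = 2**n * bytes_per_amplitude  # full statevector in memory
print(total / 1e15)                 # ≈ 18.0 petabytes
```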

4. Quantum Fourier Transform

The QFT is the key subroutine in Shor's algorithm. It maps from the computational basis to the Fourier basis, revealing periodic structure in amplitudes.

$$QFT|j\rangle = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} e^{2\pi i j k / N} |k\rangle$$

For $n$ qubits, $N = 2^n$. The QFT requires $O(n^2)$ gates, exponentially faster than the classical FFT's $O(N \log N) = O(n 2^n)$.

Periodicity in amplitudes → spikes in frequency domain

QFT is the lens that makes period visible. If your quantum state has amplitudes that repeat every $r$ indices, the QFT transforms it into peaks at multiples of $N/r$.

[Figure: periodic input amplitudes (r=4) over index j, and the QFT output over frequency k, with peaks at multiples of N/r]
qft.py
from qiskit.circuit.library import QFT

# QFT on 3 qubits
qft = QFT(3)

# Inverse QFT is used in phase estimation
iqft = QFT(3, inverse=True)

# Period finding: if f(x) = f(x+r), QFT peaks at multiples of 2^n/r
# Shor exploits this: factoring → order finding → period finding
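The peak structure is easy to verify classically: up to a sign convention in the exponent, the QFT is the discrete Fourier transform numpy already implements, and the magnitudes agree either way. A sketch:

```python
import numpy as np

N, r = 32, 4
psi = np.zeros(N)
psi[::r] = 1                          # amplitudes repeat every r indices
psi /= np.linalg.norm(psi)            # normalize to a unit vector

phi = np.fft.fft(psi) / np.sqrt(N)    # unitary DFT; |phi| matches the QFT's
peaks = np.flatnonzero(np.abs(phi) > 1e-9)
print(peaks)                          # [ 0  8 16 24], multiples of N/r = 8
```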

5. Shor's Algorithm

Shor's algorithm reduces factoring to order finding, which reduces to period finding, which reduces to QFT. The chain: $N = pq$ → find $r$ such that $a^r \equiv 1 \bmod N$ → $r$ is the period of $f(x) = a^x \bmod N$.

Resource Reality

Requires ~20M physical qubits for RSA-2048 at $10^{-3}$ error rate, 2D grid connectivity, using surface code. This assumes error rates stay at 0.1% and decoding is fast enough. Change assumptions, change the number. At $10^{-4}$ physical error, you might get away with 2M. At $10^{-2}$, you need 200M+.

Link to QFT intuition

Order finding: given $a, N$, find the smallest $r > 0$ such that $a^r \bmod N = 1$. Create the superposition $\frac{1}{\sqrt{Q}} \sum_x |x\rangle|a^x \bmod N\rangle$; computing $a^x \bmod N$ into the second register entangles it with the first. Measure the second register, collapsing the first to a superposition of $x$ values spaced by $r$. QFT on the first register gives peaks at $j \cdot 2^n / r$. Measure, run continued fractions, extract $r$.

shor_outline.py
from math import gcd

def shor_classical_post(N, a, r):
    """After quantum step gives period r"""
    if r % 2 != 0:
        return None  # Try again
    x = pow(a, r//2, N)
    if x == N - 1:
        return None  # Try again
    p = gcd(x + 1, N)
    q = gcd(x - 1, N)
    return p, q

# Quantum part: find r such that a^r ≡ 1 (mod N)
# 1. Create |x⟩|a^x mod N⟩ in superposition
# 2. Measure second register → first becomes periodic
# 3. QFT first register → peaks at multiples of 2^n/r
# 4. Measure → get j, compute r from continued fractions of j/2^n
# Shor's algorithm is interference: amplitudes for non-periods cancel
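To see the whole chain work on a toy modulus, you can substitute brute-force order finding for the quantum step (the `order` and `factor` helpers below are illustrative, not from any library):

```python
from math import gcd

def order(a, N):
    """Brute-force order finding: the only step Shor's quantum core replaces."""
    r, x = 1, a % N
    while x != 1:
        x = (x * a) % N
        r += 1
    return r

def factor(N, a):
    r = order(a, N)
    if r % 2 != 0:
        return None                  # odd period: retry with another a
    x = pow(a, r // 2, N)
    if x == N - 1:
        return None                  # trivial square root: retry with another a
    return gcd(x - 1, N), gcd(x + 1, N)

print(factor(15, 7))  # (3, 5)
```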

6. Grover's Algorithm

Grover's algorithm provides a quadratic speedup for unstructured search. Classically $O(N)$, quantumly $O(\sqrt{N})$. The mechanism is amplitude amplification via reflection about the mean.

Why quadratic matters

Breaks AES-128 down to 64-bit security, since $\sqrt{2^{128}} = 2^{64}$. You still have to run $2^{64}$ sequential Grover iterations, but that is within reach for nation-states. Useless for web search because oracle construction plus setup costs exceed $\sqrt{N}$: you'd spend more time building the quantum database than searching it classically.

[Figure: Grover iteration, geometric view in the plane spanned by |w⟩ and |s⊥⟩. Each iteration rotates |s⟩ by ~2θ toward |w⟩, where sin θ = 1/√N.]
grover.py
from qiskit import QuantumCircuit
import numpy as np

def grover_oracle(n, marked):
    """Flips the phase of the marked bitstring, e.g. marked='101'"""
    qc = QuantumCircuit(n)
    # X on qubits where the target bit is 0, mapping |marked⟩ to |11...1⟩
    zeros = [i for i, b in enumerate(reversed(marked)) if b == '0']
    if zeros:
        qc.x(zeros)
    qc.h(n - 1)
    qc.mcx(list(range(n - 1)), n - 1)  # H·MCX·H = multi-controlled Z
    qc.h(n - 1)
    if zeros:
        qc.x(zeros)
    return qc

def diffusion(n):
    """Reflection about mean: H-X-MCX-X-H"""
    qc = QuantumCircuit(n)
    qc.h(range(n))
    qc.x(range(n))
    qc.h(n-1)
    qc.mcx(list(range(n-1)), n-1)
    qc.h(n-1)
    qc.x(range(n))
    qc.h(range(n))
    return qc

# Grover = (Oracle + Diffusion)^k where k ≈ π/4 * √N
# Interference: marked state amplitude grows, others shrink
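The rotation picture can be checked with a direct statevector simulation in pure numpy; no oracle circuit is needed, since a phase flip is just a sign change at one index (the index 42 is an arbitrary choice):

```python
import numpy as np

n = 6
N = 2**n
marked = 42                           # arbitrary target index
psi = np.full(N, 1 / np.sqrt(N))      # uniform superposition |s⟩

k = int(np.pi / 4 * np.sqrt(N))       # optimal iteration count, here 6
for _ in range(k):
    psi[marked] *= -1                 # oracle: phase-flip the target
    psi = 2 * psi.mean() - psi        # diffusion: reflect about the mean

print(psi[marked] ** 2)               # ≈ 0.997: success probability
```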

7. Why Your Intuition From GPUs Breaks

GPUs do $10^9$ ops per kernel launch with all-to-all memory access. Qubits have nearest-neighbor connectivity on a 2D grid. To run a CNOT between two distant qubits, you first shuttle them adjacent with a chain of SWAPs: on a 10×10 grid, opposite corners are ~18 SWAPs apart, and each SWAP costs 3 CNOTs.

SWAP-mapping overhead

$O(n^2)$ algorithmic gates → $O(n^3)$ physical gates due to limited connectivity on 2D grids. Each SWAP = 3 CNOTs. So your 10,000-gate algorithm becomes 1M physical gates.

Decoherence

T1/T2 ~100µs with ~50ns gates gives a ~2000-gate budget before your state is noise; at $10^{-3}$ per-gate error, fidelity drops below $1/e$ after ~1000 gates. Compare to a GPU kernel with $10^9$ ops at 1GHz = 1s of compute. Quantum computers must finish before the quantum leaks away. You can't checkpoint and resume.
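Both limits, coherence time and accumulated gate error, land near the same ~10³ gate budget. The arithmetic, using the constants from the paragraph above:

```python
import math

T2 = 100e-6           # coherence time, 100 µs
t_gate = 50e-9        # two-qubit gate time, 50 ns
print(T2 / t_gate)    # 2000.0 gates inside one coherence time

p = 1e-3                      # per-gate error rate
G = 1 / -math.log(1 - p)      # F = (1-p)^G reaches 1/e here
print(G)                      # ≈ 999.5 gates before fidelity < 1/e
```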

swap_overhead.py
from qiskit import QuantumCircuit, transpile
from qiskit.providers.fake_provider import FakeBrooklyn
from qiskit.providers.aer.noise import NoiseModel

# Brooklyn: 65-qubit 2D grid, limited connectivity
backend = FakeBrooklyn()
coupling_map = backend.configuration().coupling_map

# Algorithmic circuit: all-to-all
qc = QuantumCircuit(65)
for i in range(64):
    qc.cx(i, 64)  # CNOT from each qubit to qubit 64

# Transpile to hardware: adds SWAPs to route
qc_mapped = transpile(qc, backend, optimization_level=3)
print(f"Algorithmic: {qc.size()} gates")
print(f"Physical: {qc_mapped.size()} gates")  # Expect 10-20x overhead

# Noise model: T1=100µs, T2=100µs, gate_time=50ns
# Coherence limits you to ~2000 gates before fidelity < 1/e
noise_model = NoiseModel.from_backend(backend)
# SWAP = 3 CNOTs, each CNOT adds errors
# Fidelity drops exponentially: F = (1-p)^G

8. The Data-Loading Problem

HHL algorithm headline: $O(\log N)$ for solving $Ax = b$. The headline hides $O(N)$ state prep to load the vector $b$. If loading $b$ takes $O(N)$ time, the end-to-end cost is linear, and classical sparse iterative solvers, which also cost roughly $O(N)$ per iteration, were never beaten.

QRAM doesn't exist yet

Quantum RAM to load classical data in superposition is an unsolved hardware problem. Proposals require $O(N)$ physical qubits and $O(\log N)$ depth, but each access decoheres. Without QRAM, quantum machine learning on classical data is dead on arrival.

| Algorithm | Theoretical | With Data Loading | Verdict |
| --- | --- | --- | --- |
| HHL ($Ax=b$) | $O(\log N)$ | $O(N)$ prep | Worse than classical |
| QPCA | $O(\log N)$ | $O(N)$ prep | Worse than classical |
| QML kernels | $O(\log N)$ | $O(N)$ prep | No advantage |
| Shor, Grover | Exp., quadratic | $O(n)$ input | Input is small |

Quantum advantage requires problems where the input is exponentially compressible (e.g., factoring: input is $n = \log N$ bits) or generated quantumly (e.g., quantum chemistry).
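A toy cost model makes the table's arithmetic explicit (the function and unit costs are illustrative, not measurements):

```python
import math

def hhl_end_to_end(N):
    """O(N) amplitude-encoding prep plus O(log N) solve, at unit cost each."""
    return N + math.log2(N)

for N in (10**3, 10**6, 10**9):
    # Cost per input element tends to 1: the end-to-end cost is linear,
    # so the log N "exponential speedup" vanishes once loading is counted
    print(N, hhl_end_to_end(N) / N)
```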

9. Error Correction

Physical qubits have error rates ~$10^{-3}$. Useful algorithms need logical error rates ~$10^{-15}$. Surface code is the leading scheme: tile qubits in 2D, measure stabilizers, decode errors.

Order-of-magnitude estimates

Distance-$d$ surface code: $\sim 2d^2$ physical qubits per logical qubit. To get $10^{-12}$ logical error rate with $10^{-3}$ physical error, you need $d=25$.

$d=25 \rightarrow \sim 1250:1$ physical-to-logical overhead

Assumes 2D grid connectivity, real-time matching decoder, and $10^{-3}$ physical error rate. If errors are $10^{-2}$, overhead grows to ~10,000:1.

Clock speed

Surface code cycle time ~1µs on superconducting qubits. A single T-gate takes $d$ cycles via magic state distillation. At $d=25$, that's 25µs per T-gate. Shor's algorithm needs ~$10^8$ T-gates → ~2500 seconds. That's 40 minutes if everything works perfectly. A single cosmic ray hit resets you to zero.

surface_code.py
# Surface code resource estimate

def logical_error_rate(p, d):
    """Approx logical error for distance-d surface code"""
    # P_L ≈ 0.1 * (p/p_th)^((d+1)/2), p_th ≈ 0.01
    p_th = 0.01
    if p >= p_th:
        return 1.0  # Above threshold, no protection
    return 0.1 * (p / p_th) ** ((d + 1) / 2)

p_physical = 1e-3
d = 25
p_logical = logical_error_rate(p_physical, d)  # ~1e-14 by this formula
physical_per_logical = 2 * d**2  # ~1250

# Shor-2048 needs ~4000 logical qubits × 1250 = 5M physical
# At 1µs cycle time, 1e8 T-gates × 25 cycles = 2500s runtime

10. NISQ Reality

Noisy Intermediate-Scale Quantum: 50-1000 qubits, no error correction, errors $10^{-2}$ to $10^{-3}$. What can you actually do?

| Claim | Reality |
| --- | --- |
| Quantum ML advantage | No end-to-end speedup demonstrated. Data loading kills it. |
| Chemistry simulation | VQE ansatz quality unknown. Barren plateaus kill training. Best classical methods still win. |
| Optimization (QAOA) | Heuristic. No proven separation. Needs $p \to \infty$ layers, but noise limits you to $p \sim 3$. |
| Quantum supremacy | Google's 2019 result: a sampling task with no application. Classical simulation spoofed it two years later. |

Anti-patterns

VQE for chemistry: You measure energy $E(\theta)$ with $1/\epsilon^2$ shots for precision $\epsilon$. Barren plateaus mean gradients vanish exponentially in $n$. You need exponential shots to find any signal.

QAOA for optimization: Performance guaranteed only as layer count $p \to \infty$. But noise limits you to $p \sim 3$ before fidelity collapses. At $p=3$, you're worse than simulated annealing.

Quantum ML on MNIST: Loading 784 pixels costs 784 operations. Classical CNN inference is 10k FLOPs at 1ns each = 10µs. Your quantum computer hasn't even loaded the image.
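The MNIST arithmetic, spelled out (the per-operation times are rough assumptions, not benchmarks):

```python
pixels = 784
t_q_op = 1e-6                  # assume ~1 µs per quantum operation
t_load = pixels * t_q_op       # ≈ 0.8 ms just to amplitude-encode the image

flops = 10_000
t_flop = 1e-9                  # ~1 ns per FLOP
t_classical = flops * t_flop   # 10 µs for the entire classical inference

print(t_load / t_classical)    # ≈ 78: loading alone is ~80x slower
```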

References

Foundational Papers

  • Nielsen, M. A. & Chuang, I. L. Quantum Computation and Quantum Information. Cambridge University Press, 2010.
  • Shor, P. W. "Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer." SIAM Review 41(2), 1999.
  • Grover, L. K. "A fast quantum mechanical algorithm for database search." STOC, 1996.
  • Preskill, J. "Quantum Computing in the NISQ era and beyond." Quantum 2, 79, 2018.

Resource Estimation

  • Gidney, C. & Ekerå, M. "How to factor 2048 bit RSA integers in 8 hours using 20 million noisy qubits." Quantum 5, 433, 2021.
  • Fowler, A. G. et al. "Surface codes: Towards practical large-scale quantum computation." Phys. Rev. A 86, 2012.

Interactive Learning

  • IBM Quantum Lab: Run real quantum circuits on superconducting hardware.
  • QuEra: Neutral atom quantum computers, 256+ qubits.
  • Qiskit: Open-source SDK for quantum computing.

Quantum computing is not magic. It's interference over exponentially large spaces, constrained by decoherence, connectivity, and error correction. When those constraints align with problem structure, you get speedup. Otherwise you get an expensive random number generator.