1. The Postulates
Quantum mechanics rests on four postulates. They define the rules of the game: what a state is, how it evolves, what happens when you measure, and how systems compose.
1. State Space
An isolated physical system is associated with a complex Hilbert space. The system is completely described by a unit vector $|\psi\rangle$ in that space.
2. Evolution
The evolution of a closed quantum system is described by a unitary transformation $U$.
3. Measurement
Quantum measurements are described by Hermitian operators. Measuring $M = \sum_m m P_m$ gives outcome $m$ with probability $p(m) = \langle\psi|P_m|\psi\rangle$ and leaves the system in $P_m|\psi\rangle / \sqrt{p(m)}$.
4. Composite Systems
The state space of a composite system is the tensor product of the component state spaces.
Why this matters
Unitarity = reversibility = no heat from erasure. Classical logic gates like AND are irreversible and dissipate $kT \ln 2$ per bit erased. Quantum gates are unitary matrices so computation is thermodynamically free until measurement. No-cloning theorem: you cannot copy unknown quantum states. This is why quantum error correction is hard — you can't just duplicate qubits like RAID.
2. Superposition
Intuition
The Hadamard gate puts you on the equator of the Bloch sphere. From $|0\rangle$ at the north pole, $H$ rotates you to $|+\rangle = \frac{|0\rangle + |1\rangle}{\sqrt{2}}$ — equal probability of measuring 0 or 1, but with phase information that lets amplitudes interfere later.
A qubit can exist in a linear combination of $|0\rangle$ and $|1\rangle$. This is not probability — it's amplitude. The difference: amplitudes can be negative and cancel.
from qiskit import QuantumCircuit, Aer, execute
from qiskit.visualization import plot_histogram
# Hadamard creates superposition: |0⟩ → (|0⟩ + |1⟩)/√2
qc = QuantumCircuit(1, 1)
qc.h(0) # Equator of Bloch sphere
qc.measure(0, 0)
simulator = Aer.get_backend('qasm_simulator')
result = execute(qc, simulator, shots=1024).result()
counts = result.get_counts()
# Interference example: H-Z-H = Identity
# H puts in superposition, Z flips phase of |1⟩,
# second H makes amplitudes interfere back to |0⟩
qc2 = QuantumCircuit(1, 1)
qc2.h(0)
qc2.z(0) # Adds -1 phase to |1⟩ component
qc2.h(0) # Interference: |0⟩ and -|0⟩ cancel for |1⟩ output
qc2.measure(0, 0)
Reality check
Measuring destroys superposition. You don't get to read all $2^n$ values. You get one sample from the probability distribution $|\alpha_i|^2$. The art is using interference before measurement to ensure the distribution peaks on the right answer. Without interference, superposition is just expensive randomness.
3. Entanglement
Some composite states cannot be written as tensor products of individual states. These are entangled. Bell states are the canonical examples.
Measure the first qubit. If you get 0, the second is instantly 0. If you get 1, the second is instantly 1. No classical correlation can reproduce this if measurement bases are changed, per Bell's theorem.
from qiskit import QuantumCircuit
# Create Bell state: H on control, then CNOT
qc = QuantumCircuit(2, 2)
qc.h(0) # Superposition
qc.cx(0, 1) # Entangle: |00⟩ + |11⟩
qc.measure_all()
# This 2-qubit state requires 4 complex amplitudes classically.
# 50 qubits require 2^50 ≈ 1.1e15 amplitudes = 9 petabytes at 8 bytes each.
# That's why you can't simulate 50 qubits on your laptop.
So what?
This is why you can't simulate 50 qubits classically — you'd need $2^{50}$ amplitudes to store the statevector. That's 9 PB for double-precision complex numbers, before you even compute a single gate. Entanglement is what gives quantum computers their exponential state space. It's also what makes error correction exponentially hard.
4. Quantum Fourier Transform
The QFT is the key subroutine in Shor's algorithm. It maps from the computational basis to the Fourier basis, revealing periodic structure in amplitudes.
For $n$ qubits, $N = 2^n$. The QFT requires $O(n^2)$ gates, exponentially faster than the classical FFT's $O(N \log N) = O(n 2^n)$.
Periodicity in amplitudes → spikes in frequency domain
QFT is the lens that makes period visible. If your quantum state has amplitudes that repeat every $r$ indices, the QFT transforms it into peaks at multiples of $N/r$.
from qiskit.circuit.library import QFT
import numpy as np
# QFT on 3 qubits
qft = QFT(3)
# Inverse QFT is used in phase estimation
iqft = QFT(3, inverse=True)
# Period finding: if f(x) = f(x+r), QFT peaks at multiples of 2^n/r
# Shor exploits this: factoring → order finding → period finding
5. Shor's Algorithm
Shor's algorithm reduces factoring to order finding, which reduces to period finding, which reduces to QFT. The chain: $N = pq$ → find $r$ such that $a^r \equiv 1 \bmod N$ → $r$ is the period of $f(x) = a^x \bmod N$.
Resource Reality
Requires ~20M physical qubits for RSA-2048 at $10^{-3}$ error rate, 2D grid connectivity, using surface code. This assumes error rates stay at 0.1% and decoding is fast enough. Change assumptions, change the number. At $10^{-4}$ physical error, you might get away with 2M. At $10^{-2}$, you need 200M+.
Link to QFT intuition
Order finding: given $a, N$, find smallest $r > 0$ such that $a^r \bmod N = 1$. Create superposition $\frac{1}{\sqrt{Q}} \sum_x |x\rangle|a^x \bmod N\rangle$. The second register entangles with periodic $x$. Measure it, collapsing the first to a superposition of $x$ values spaced by $r$. QFT on the first register gives peaks at $j \cdot 2^n / r$. Measure, run continued fractions, extract $r$.
from math import gcd
from fractions import Fraction
def shor_classical_post(N, a, r):
"""After quantum step gives period r"""
if r % 2 != 0:
return None # Try again
x = pow(a, r//2, N)
if x == N - 1:
return None # Try again
p = gcd(x + 1, N)
q = gcd(x - 1, N)
return p, q
# Quantum part: find r such that a^r ≡ 1 (mod N)
# 1. Create |x⟩|a^x mod N⟩ in superposition
# 2. Measure second register → first becomes periodic
# 3. QFT first register → peaks at multiples of 2^n/r
# 4. Measure → get j, compute r from continued fractions of j/2^n
# Shor's algorithm is interference: amplitudes for non-periods cancel
6. Grover's Algorithm
Grover's algorithm provides a quadratic speedup for unstructured search. Classically $O(N)$, quantumly $O(\sqrt{N})$. The mechanism is amplitude amplification via reflection about the mean.
Why quadratic matters
Breaks AES-128 to 64-bit security since $\sqrt{2^{128}} = 2^{64}$. Still exhaust 2^64 operations, but now feasible for nation-states. Useless for web search because oracle construction + setup > $\sqrt{N}$. You'd spend more time building the quantum database than searching it classically.
from qiskit import QuantumCircuit
import numpy as np
def grover_oracle(n, marked):
"""Marks target state by flipping its phase"""
qc = QuantumCircuit(n)
# Example: mark |11...1⟩
qc.x(range(n)) # Flip to |00...0⟩
qc.h(n-1)
qc.mcx(list(range(n-1)), n-1) # Multi-controlled Z
qc.h(n-1)
qc.x(range(n))
return qc
def diffusion(n):
"""Reflection about mean: H-X-MCX-X-H"""
qc = QuantumCircuit(n)
qc.h(range(n))
qc.x(range(n))
qc.h(n-1)
qc.mcx(list(range(n-1)), n-1)
qc.h(n-1)
qc.x(range(n))
qc.h(range(n))
return qc
# Grover = (Oracle + Diffusion)^k where k ≈ π/4 * √N
# Interference: marked state amplitude grows, others shrink
7. Why Your Intuition From GPUs Breaks
GPUs do $10^9$ ops per kernel launch with all-to-all memory. Qubits have nearest-neighbor connectivity on a 2D grid. To run a CNOT between qubit 0 and qubit 99, you need ~200 SWAP gates to shuffle them adjacent.
SWAP-mapping overhead
$O(n^2)$ algorithmic gates → $O(n^3)$ physical gates due to limited connectivity on 2D grids. Each SWAP = 3 CNOTs. So your 10,000-gate algorithm becomes 1M physical gates.
Decoherence
T1/T2 ~100µs means ~1000 gate budget before your state is noise. Compare to GPU kernel with $10^9$ ops at 1GHz = 1s of compute. Quantum computers must finish before the quantum leaks away. You can't checkpoint and resume.
from qiskit import QuantumCircuit, transpile
from qiskit.providers.fake_provider import FakeBrooklyn
from qiskit.providers.aer.noise import NoiseModel
# Brooklyn: 65-qubit 2D grid, limited connectivity
backend = FakeBrooklyn()
coupling_map = backend.configuration().coupling_map
# Algorithmic circuit: all-to-all
qc = QuantumCircuit(65)
for i in range(64):
qc.cx(i, 64) # CNOT from each qubit to qubit 64
# Transpile to hardware: adds SWAPs to route
qc_mapped = transpile(qc, backend, optimization_level=3)
print(f"Algorithmic: {qc.size()} gates")
print(f"Physical: {qc_mapped.size()} gates") # Expect 10-20x overhead
# Noise model: T1=100µs, T2=100µs, gate_time=50ns
# Coherence limits you to ~2000 gates before fidelity < 1/e
noise_model = NoiseModel.from_backend(backend)
# SWAP = 3 CNOTs, each CNOT adds errors
# Fidelity drops exponentially: F = (1-p)^G
8. The Data-Loading Problem
HHL algorithm headline: $O(\log N)$ for solving $Ax = b$. Hides $O(N)$ state prep to load vector $b$. If loading $b$ takes $O(N)$ time, you already lost to classical $O(N^2)$ direct solve.
QRAM doesn't exist yet
Quantum RAM to load classical data in superposition is an unsolved hardware problem. Proposals require $O(N)$ physical qubits and $O(\log N)$ depth, but each access decoheres. Without QRAM, quantum machine learning on classical data is dead on arrival.
| Algorithm | Theoretical | With Data Loading | Verdict |
|---|---|---|---|
| HHL: $Ax=b$ | $O(\log N)$ | $O(N)$ prep | Worse than classical |
| QPCA | $O(\log N)$ | $O(N)$ prep | Worse than classical |
| QML Kernels | $O(\log N)$ | $O(N)$ prep | No advantage |
| Shor, Grover | Exp, Quad | $O(n)$ input | Input is small |
Quantum advantage requires problems where the input is exponentially compressible (e.g., factoring: input is $n = \log N$ bits) or generated quantumly (e.g., quantum chemistry).
9. Error Correction
Physical qubits have error rates ~$10^{-3}$. Useful algorithms need logical error rates ~$10^{-15}$. Surface code is the leading scheme: tile qubits in 2D, measure stabilizers, decode errors.
Order-of-magnitude estimates
Distance-$d$ surface code: $\sim 2d^2$ physical qubits per logical qubit. To get $10^{-12}$ logical error rate with $10^{-3}$ physical error, you need $d=25$.
Assumes 2D grid connectivity, real-time matching decoder, and $10^{-3}$ physical error rate. If errors are $10^{-2}$, overhead grows to ~10,000:1.
Clock speed
Surface code cycle time ~1µs on superconducting qubits. A single T-gate takes $d$ cycles via magic state distillation. At $d=25$, that's 25µs per T-gate. Shor's algorithm needs ~$10^8$ T-gates → ~2500 seconds. That's 40 minutes if everything works perfectly. A single cosmic ray hit resets you to zero.
# Surface code resource estimate
import math
def logical_error_rate(p, d):
"""Approx logical error for distance-d surface code"""
# P_L ≈ 0.1 * (p/p_th)^((d+1)/2), p_th ≈ 0.01
p_th = 0.01
if p >= p_th:
return 1.0 # Above threshold, no protection
return 0.1 * (p / p_th) ** ((d + 1) / 2)
p_physical = 1e-3
d = 25
p_logical = logical_error_rate(p_physical, d) # ~1e-12
physical_per_logical = 2 * d**2 # ~1250
# Shor-2048 needs ~4000 logical qubits × 1250 = 5M physical
# At 1µs cycle time, 1e8 T-gates × 25 cycles = 2500s runtime
10. NISQ Reality
Noisy Intermediate-Scale Quantum: 50-1000 qubits, no error correction, errors $10^{-2}$ to $10^{-3}$. What can you actually do?
| Claim | Reality |
|---|---|
| Quantum ML advantage | No end-to-end speedup demonstrated. Data loading kills it. |
| Chemistry simulation | VQE ansatz quality unknown. Barren plateaus kill training. Best classical methods still win. |
| Optimization (QAOA) | Heuristic. No proven separation. Needs $p \to \infty$ layers, but noise limits to $p \sim 3$. |
| Quantum supremacy | Google's 2019 result: sampling task with no application. Classical spoofed it 2 years later. |
Anti-patterns
VQE for chemistry: You measure energy $E(\theta)$ with $1/\epsilon^2$ shots for precision $\epsilon$. Barren plateaus mean gradients vanish exponentially in $n$. You need exponential shots to find any signal.
QAOA for optimization: Performance guaranteed only as layer count $p \to \infty$. But noise limits you to $p \sim 3$ before fidelity collapses. At $p=3$, you're worse than simulated annealing.
Quantum ML on MNIST: Loading 784 pixels costs 784 operations. Classical CNN inference is 10k FLOPs at 1ns each = 10µs. Your quantum computer hasn't even loaded the image.
References
Foundational Papers
- • Nielsen, M. A. & Chuang, I. L. Quantum Computation and Quantum Information. Cambridge University Press, 2010.
- • Shor, P. W. "Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer." SIAM Review 41(2), 1999.
- • Grover, L. K. "A fast quantum mechanical algorithm for database search." STOC, 1996.
- • Preskill, J. "Quantum Computing in the NISQ era and beyond." Quantum 2, 79, 2018.
Resource Estimation
- • Gidney, C. & Ekerå, M. "How to factor 2048 bit RSA integers in 8 hours using 20 million noisy qubits." Quantum 5, 433, 2021.
- • Fowler, A. G. et al. "Surface codes: Towards practical large-scale quantum computation." Phys. Rev. A 86, 2012.
Interactive Learning
- • IBM Quantum Lab: Run real quantum circuits on superconducting hardware.
- • QuEra: Neutral atom quantum computers, 256+ qubits.
- • Qiskit: Open-source SDK for quantum computing.