Code that writes code

Exponential-looking growth in computing emerges when systems can improve their own code. Once a loop exists that can generate, execute, measure, and refine programs, capability compounds: each improvement enhances the next round of search over the program phase space. The driver is not just device density, but a feedback process that compresses errors, increases information flow, and reduces effective entropy in the distribution of behaviors.

In mathematical terms, let the capability at iteration \(t\) be \(C(t)\). A simple model of compounded self-improvement is \[ C(t+1) = (1 + r)\,C(t), \qquad r > 0, \] whose solution is \[ C(t) = C(0)\,(1 + r)^t \approx C(0)\,e^{rt} \quad (\text{for small } r), \] capturing the "exponential-looking" growth. A more general continuous-time feedback model is \[ \frac{dC}{dt} = f(I,C,E), \] where \(I\) measures information flow through the system and \(E\) is the average error. A negative dependence \(\partial f/\partial E < 0\) and a positive dependence \(\partial f/\partial I > 0\) encode that reducing errors and increasing information flow both accelerate capability growth.
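A minimal numerical sketch of this recurrence, where the rate r and starting capability C0 are illustrative placeholders rather than measured quantities:

# Minimal sketch of the compounding model C(t+1) = (1 + r) * C(t).
# r and C0 are illustrative placeholders, not empirical values.
import math

def capability_trajectory(C0: float, r: float, steps: int) -> list[float]:
    traj = [C0]
    for _ in range(steps):
        traj.append(traj[-1] * (1.0 + r))
    return traj

C0, r = 1.0, 0.05
traj = capability_trajectory(C0, r, steps=50)
# The discrete solution C0 * (1 + r)^t tracks the continuous C0 * e^(r t) for small r.
for t in (0, 10, 25, 50):
    print(t, round(traj[t], 3), round(C0 * math.exp(r * t), 3))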

Let the program hypothesis space be a high-dimensional manifold \(\mathcal{H}\) with coordinates \(\theta \in \mathbb{R}^d\) parameterizing programs, and let a loss (or energy-like) function be \(L : \mathcal{H} \to \mathbb{R}_+\). Guided search corresponds to iterates \[ \theta_{t+1} = \theta_t - \eta_t \, \nabla_\theta L(\theta_t) + \xi_t, \] where \(\eta_t\) is a learning rate and \(\xi_t\) represents stochastic exploration. The associated probability density over programs, \(p_t(\theta)\), can be modeled as a Gibbs-like distribution \[ p_t(\theta) = \frac{1}{Z_t} \exp\big(-\beta_t L(\theta)\big), \] with partition function \(Z_t\) and inverse temperature \(\beta_t\) reflecting how strongly the system prefers lower-loss code. As \(t\) increases and \(\beta_t\) grows, the distribution contracts around high-quality programs, reducing effective entropy \[ S[p_t] = - \int_{\mathcal{H}} p_t(\theta)\, \log p_t(\theta)\, d\theta. \]
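To make the contraction concrete, here is a small sketch (the loss values are arbitrary placeholders for \(L(\theta)\) over a finite candidate set): as \(\beta_t\) grows, the Gibbs weights concentrate on low-loss candidates and the entropy \(S[p_t]\) drops.

# Gibbs-like distribution p(theta) ∝ exp(-beta * L(theta)) over a finite candidate set,
# and its Shannon entropy. The losses are arbitrary placeholders.
import numpy as np

def gibbs(losses, beta):
    logits = -beta * np.asarray(losses, dtype=float)
    logits -= logits.max()                # numerical stability
    p = np.exp(logits)
    return p / p.sum()

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

losses = [0.2, 0.5, 0.9, 1.3, 2.0]
for beta in (0.1, 1.0, 10.0):
    p = gibbs(losses, beta)
    print(f"beta={beta:5.1f}  entropy={entropy(p):.3f}  p={np.round(p, 3)}")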

Mechanism of self-improvement

Formally, represent each candidate program by parameters \(\theta \in \mathbb{R}^d\) and define an objective (loss/energy) \(L(\theta)\). Gradient-based guidance is \[ \theta_{t+1} = \theta_t - \eta_t \, \nabla_\theta L(\theta_t), \] while purely heuristic or discrete search can be modeled by a Markov chain with transition kernel \(K(\theta'\mid\theta)\) that satisfies \[ p_{t+1}(\theta') = \int K(\theta'\mid\theta)\,p_t(\theta)\,d\theta, \] where \(p_t(\theta)\) is the probability density over program parameters at iteration \(t\). The stationary distribution \(p_*(\theta)\) often approximates a Boltzmann form \(p_*(\theta) \propto e^{-\beta L(\theta)}\), concentrating mass near minima of \(L\).

Let \(\Theta\) be a discrete hypothesis set of programs, with prior \(P(\theta)\) for \(\theta \in \Theta\). Given observed data or test outcomes \(D\), Bayes' rule updates beliefs via \[ P(\theta \mid D) = \frac{P(D \mid \theta)P(\theta)}{P(D)}, \qquad P(D) = \sum_{\theta' \in \Theta} P(D \mid \theta')P(\theta'). \] The Shannon entropy of the hypothesis distribution is \[ H[P] = -\sum_{\theta \in \Theta} P(\theta)\,\log P(\theta). \] Error-correcting feedback corresponds to sequences of measurements \(D_1, D_2, \dots\) such that \[ H\big[P(\cdot \mid D_1,\dots,D_t)\big] \searrow H_* \quad \text{as} \quad t \to \infty, \] where \(H_*\) is low and the posterior probability mass is concentrated on a small set of high-fidelity programs.
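A toy numerical illustration of this error-correcting feedback, assuming synthetic pass/fail likelihoods: the posterior entropy falls as test outcomes accumulate.

# Toy Bayesian update over a discrete set of candidate programs.
# The pass probabilities are assumed, synthetic values, not real test data.
import numpy as np

rng = np.random.default_rng(0)
n_hyp, true_idx = 8, 3
posterior = np.full(n_hyp, 1.0 / n_hyp)           # uniform prior P(theta)

p_pass = np.full(n_hyp, 0.5)                      # P(test passes | theta), assumed
p_pass[true_idx] = 0.95                           # the "true" program passes most tests

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

for t in range(1, 21):
    outcome = rng.random() < p_pass[true_idx]     # observe one test outcome D_t
    likelihood = p_pass if outcome else (1.0 - p_pass)
    posterior = likelihood * posterior            # Bayes' rule (unnormalized)
    posterior /= posterior.sum()
    if t % 5 == 0:
        print(f"t={t:2d}  H[P]={entropy(posterior):.3f}  best hypothesis={posterior.argmax()}")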

Landauer's principle states that erasing one bit of information incurs a minimum heat dissipation \[ Q_{\min} = k_B T \ln 2, \] where \(k_B\) is Boltzmann's constant and \(T\) is the ambient temperature. If a self-editing system performs \(N_\text{erase}\) bit erasures per update, the minimal thermodynamic cost per update is \[ Q_\text{update} \ge N_\text{erase}\,k_B T \ln 2. \] The computational efficiency can be expressed as useful information gain per unit energy, \[ \eta_\text{info} = \frac{\Delta I}{Q_\text{update}}, \] where \(\Delta I\) is the increase in mutual information between internal model parameters \(\Theta\) and task outcomes \(Y\): \[ I(\Theta;Y) = \sum_{\theta,y} p(\theta,y)\, \log \frac{p(\theta,y)}{p(\theta)\,p(y)}. \] Efficient self-editing corresponds to strategies that maximize \(\eta_\text{info}\) by minimizing unnecessary erasures and maximizing \(\Delta I\).
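A back-of-the-envelope evaluation of the bound, with arbitrary erasure counts and room temperature assumed:

# Landauer bound: minimal heat for N bit erasures at temperature T.
# The erasure counts below are arbitrary illustrative numbers.
import math

k_B = 1.380649e-23            # Boltzmann constant, J/K

def landauer_min_heat(n_erase, T=300.0):
    return n_erase * k_B * T * math.log(2)   # Q >= N * k_B * T * ln 2

for n in (1, 1e9, 1e15):
    print(f"{n:.0e} erasures at 300 K  ->  Q >= {landauer_min_heat(n):.3e} J")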

Let the Kolmogorov complexity (algorithmic information content) of a solution for task \(T\) be \(K(T)\), defined as the length (in bits) of the shortest program that solves \(T\) on a fixed universal Turing machine \(U\). For two tasks \(T_1, T_2\), the shared structure can be quantified by an information-theoretic overlap \[ I(T_1;T_2) \approx K(T_1) + K(T_2) - K(T_1,T_2), \] where \(K(T_1,T_2)\) is the complexity of jointly solving both tasks. Reusable compressed representations correspond to internal codes \(z\) of small description length \(L(z)\) that minimize expected description length over tasks: \[ \min_{z} \, \mathbb{E}_{T \sim \mathcal{D}}\big[ L(z) + L(T \mid z) \big], \] with \(\mathcal{D}\) a task distribution. Lower \(L(z)\) and \(L(T\mid z)\) imply faster adaptation and thus accelerated convergence on new problems.
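Kolmogorov complexity is uncomputable, but a general-purpose compressor gives a crude proxy. The sketch below substitutes zlib compressed lengths for \(K(\cdot)\) in the overlap formula, with made-up "tasks"; it is an illustration of the idea, not a faithful estimator.

# Crude proxy for the overlap I(T1;T2) ≈ K(T1) + K(T2) - K(T1,T2),
# using zlib compressed length in place of the (uncomputable) Kolmogorov complexity.
import os
import zlib

def c(data: bytes) -> int:
    return len(zlib.compress(data, 9))

def overlap_bits(a: bytes, b: bytes) -> int:
    return c(a) + c(b) - c(a + b)

task1 = b"def add(a, b): return a + b\n" * 20      # two closely related "tasks"
task2 = b"def sub(a, b): return a - b\n" * 20
task3 = os.urandom(len(task1))                      # unrelated, incompressible content

print("related tasks  :", overlap_bits(task1, task2))
print("unrelated tasks:", overlap_bits(task1, task3))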

Beginner note: If the math above feels heavy, you do not need to absorb every symbol. The key idea is simple: systems that can test and improve their own code can get better very quickly. Quantum computing plugs into this story by giving those systems a new kind of hardware to explore enormous search spaces more efficiently.

How this page is assembled

This page was assembled with automated assistance: generative tooling produced structure and text, which were reviewed and emitted as static HTML. In other words, the site itself serves as a small example of code that helps author more code. AI for Not Bad.

Abstractly, denote the human editor as \(H\) and the generative model as \(G\). Let \(x\) be an initial specification and \(y\) the final HTML. The interaction can be seen as an alternating minimization over drafts \(y_t\): \[ y_{t+1} = \operatorname*{arg\,min}_{y} \Big( \mathcal{L}_\text{spec}(y \mid x) + \mathcal{L}_\text{style}(y) + \mathcal{L}_\text{error}(y) \Big), \] with updates produced by either \(G\) or \(H\): \[ y_{t+1} = \begin{cases} G(y_t,x,\xi_t) & \text{with probability } p_G, \\ H(y_t,x) & \text{with probability } 1-p_G, \end{cases} \] where \(\xi_t\) captures model stochasticity. Convergence is reached when successive edits satisfy a small-difference condition, e.g. \[ d(y_{t+1},y_t) < \varepsilon, \] for a suitable distance metric \(d\) on documents.

Deep dive: a tiny quantum circuit in code and math

Here is a minimal example of preparing a Bell state using Python-style pseudocode with a Qiskit‑like API, plus the matching math.

# Pseudocode using a Qiskit-like API
from qiskit import QuantumCircuit

qc = QuantumCircuit(2, 2)  # 2 qubits, 2 classical bits

# 1. Put qubit 0 into a superposition
qc.h(0)

# 2. Use it as control for a CNOT on qubit 1
qc.cx(0, 1)

# 3. Measure both qubits
qc.measure(0, 0)
qc.measure(1, 1)

Mathematically, in the \(|00\rangle,|01\rangle,|10\rangle,|11\rangle\) basis:

  1. Start in \(|00\rangle\).
  2. Apply \(H\) to the first qubit: \[ H|0\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle), \] so the joint state becomes \[ |\psi_1\rangle = \tfrac{1}{\sqrt{2}}(|00\rangle + |10\rangle). \]
  3. Apply CNOT with qubit 0 as control and qubit 1 as target: \[ \text{CNOT}|00\rangle = |00\rangle, \quad \text{CNOT}|10\rangle = |11\rangle. \] Therefore \[ |\psi_2\rangle = \tfrac{1}{\sqrt{2}}(|00\rangle + |11\rangle) = |\Phi^+\rangle. \]

When you measure both qubits in the computational basis, you get:

  • \(P(00) = 1/2\)
  • \(P(11) = 1/2\)
  • \(P(01) = P(10) = 0\)

This is “maximal entanglement” in its simplest form.
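To check these numbers empirically, here is a minimal runnable sketch; it assumes the qiskit and qiskit-aer packages are installed, and exact API details may vary between versions.

# Running the Bell circuit on a local simulator (assumes qiskit + qiskit-aer installed).
from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator

qc = QuantumCircuit(2, 2)
qc.h(0)
qc.cx(0, 1)
qc.measure([0, 1], [0, 1])

sim = AerSimulator()
counts = sim.run(qc, shots=2000).result().get_counts()
print(counts)   # expect roughly half '00' and half '11', never '01' or '10'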

Minimal meta-programming sketch

// Pseudocode for a self-improving loop
population = initialize_programs()
history    = []                          // execution log of (program, result, score)
while (time < budget):
  for prog in population:
    result = execute(prog, tests)
    score  = measure(result)
    history.append((prog, result, score))
  models = fit_surrogates(history)        // learn to predict score from program features
  proposals = propose(models)             // synthesize or mutate new programs
  population = select(population, proposals, history) // keep better, diverse candidates
  if converged(population): break
deploy(best(population))

Let the population at iteration \(t\) be \(\mathcal{P}_t = \{\theta_t^{(1)},\dots,\theta_t^{(N)}\}\), and let each candidate have a score (fitness) \(F(\theta)\). The selection step can be modeled by a softmax (Boltzmann) sampling distribution \[ P_t(\theta) = \frac{\exp\big(\beta_t F(\theta)\big)}{\sum_{\theta' \in \mathcal{P}_t} \exp\big(\beta_t F(\theta')\big)}, \] with inverse temperature \(\beta_t\) controlling selection pressure. Proposals (mutations or synthesized programs) \(\tilde{\theta}\) are drawn from a proposal kernel \(q_t(\tilde{\theta} \mid \theta)\), so the expected update of the population distribution is \[ p_{t+1}(\tilde{\theta}) = \sum_{\theta} P_t(\theta)\, q_t(\tilde{\theta} \mid \theta). \] Convergence of the loop can be expressed via a stopping condition such as \[ \operatorname{Var}_{\theta \sim p_t}[F(\theta)] < \delta \quad \text{or} \quad \max_{\theta \in \mathcal{P}_t} F(\theta) - \max_{\theta \in \mathcal{P}_{t-k}} F(\theta) < \epsilon, \] for some window size \(k\) and tolerances \(\delta, \epsilon\).
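A small sketch of the softmax selection and stopping rules just described, with placeholder fitness scores (assuming numpy):

# Softmax (Boltzmann) selection over fitness scores, plus a variance-based stopping test.
# The fitness values are random placeholders for F(theta).
import numpy as np

def selection_probs(fitness, beta):
    z = beta * np.asarray(fitness, dtype=float)
    z -= z.max()                               # numerical stability
    p = np.exp(z)
    return p / p.sum()

def converged(fitness, delta=1e-3):
    return float(np.var(fitness)) < delta      # Var[F] < delta stopping rule

rng = np.random.default_rng(1)
fitness = rng.normal(loc=0.5, scale=0.2, size=10)
parents = rng.choice(len(fitness), size=10, p=selection_probs(fitness, beta=5.0))
print("selected parent indices:", parents)
print("converged:", converged(fitness))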

Hierarchical graph of matter and interactions

Universe
└─ Quantum fields
   ├─ Fermion fields (matter)
   │  ├─ Quarks (color charge: red/green/blue)
   │  │  ├─ Flavors: up, down, charm, strange, top, bottom
   │  │  └─ Bound states (hadrons)
   │  │     ├─ Baryons (3 quarks)
   │  │     │  ├─ Proton: u u d
   │  │     │  ├─ Neutron: u d d
   │  │     │  └─ Antibaryons (3 antiquarks)
   │  │     │     └─ Antiproton: ū ū d̄
   │  │     └─ Mesons (quark + antiquark)
   │  └─ Leptons
   │     ├─ Electron, Muon, Tau
   │     └─ Neutrinos (three types) + corresponding antiparticles (e.g., positron)
   ├─ Boson fields (interaction carriers)
   │  ├─ Gluons (strong interaction, SU(3) color)
   │  ├─ Photon (electromagnetic, U(1))
   │  ├─ W± and Z⁰ (weak interaction)
   │  └─ Gravitational quantum (graviton, hypothetical)
   └─ Scalar field associated with electroweak symmetry breaking
      └─ Scalar boson (mass-generating excitation)

Composite structures
└─ Atomic nucleus: protons + neutrons (held by residual strong force)
   ├─ Atoms: nucleus + electron cloud (quantized orbitals)
   ├─ Molecules: atoms bound via electromagnetic interaction
   └─ Condensed phases: solids, liquids, gases, plasmas, exotic matter

“Matter” typically denotes fermions and their composites. Gauge bosons (such as gluons) mediate interactions and are included for completeness.

In the Standard Model, the fundamental fields can be written as a Lagrangian density \(\mathcal{L}_\text{SM}\) of the form \[ \mathcal{L}_\text{SM} = -\frac{1}{4} \sum_{a} F_{\mu\nu}^a F^{a\,\mu\nu} + \sum_{f} \bar{\psi}_f\big(i\gamma^\mu D_\mu - m_f\big)\psi_f + (D_\mu \phi)^\dagger(D^\mu \phi) - V(\phi) + \mathcal{L}_\text{Yukawa}, \] where \[ F_{\mu\nu}^a = \partial_\mu A_\nu^a - \partial_\nu A_\mu^a + g f^{abc} A_\mu^b A_\nu^c \] encodes the gauge bosons (gluons, \(W^\pm, Z^0\), photon), \(\psi_f\) are fermion fields (quarks and leptons), \(\phi\) is the Higgs scalar field, and \(D_\mu = \partial_\mu - i g A_\mu^a T^a\) is the gauge-covariant derivative.

Color-charged quarks \(q\) interact via the SU(3) gauge field \(G_\mu^a\) with coupling constant \(g_s\), described by \[ \mathcal{L}_\text{QCD} = \bar{q}\big(i\gamma^\mu D_\mu - m_q\big)q - \frac{1}{4} G_{\mu\nu}^a G^{a\,\mu\nu}, \] where \(D_\mu = \partial_\mu - i g_s T^a G_\mu^a\) and \(G_{\mu\nu}^a\) has the same non-Abelian structure as \(F_{\mu\nu}^a\) above. Bound states such as protons and neutrons are color-singlet combinations of three quarks (baryons), while mesons are quark–antiquark pairs \(q \bar{q}\), consistent with overall color neutrality.

Leptons (electron \(e\), muon \(\mu\), tau \(\tau\) and their neutrinos \(\nu_e, \nu_\mu, \nu_\tau\)) couple to the electroweak \(SU(2)_L \times U(1)_Y\) gauge fields. After electroweak symmetry breaking, the Higgs field acquires a vacuum expectation value \[ \langle \phi \rangle = \frac{1}{\sqrt{2}} \begin{pmatrix}0 \\ v \end{pmatrix}, \qquad v \approx 246~\text{GeV}, \] which generates fermion and weak boson masses through Yukawa terms \(y_f \bar{\psi}_f \phi \psi_f\) and gauge interactions, while leaving the photon massless.

Composite structures are organized hierarchically. A nucleus with \(Z\) protons and \(A-Z\) neutrons has baryon number \(B = A\) and charge \(Q = Z e\). An atom adds \(Z\) electrons, leading to a many-body Hamiltonian \[ H = \sum_{i=1}^{Z} \bigg( -\frac{\hbar^2}{2m_e} \nabla_i^2 - \frac{Z e^2}{4\pi \varepsilon_0 r_i} \bigg) + \sum_{1 \le i < j \le Z} \frac{e^2}{4\pi \varepsilon_0 \lvert \mathbf{r}_i - \mathbf{r}_j \rvert} + H_\text{nucleus}, \] whose eigenstates \(\Psi_n\) correspond to quantized orbitals and energy levels \(E_n\) solving \[ H \Psi_n = E_n \Psi_n. \] Molecules and condensed phases arise when these atomic states combine via electromagnetic interactions into multi-atom bound states and extended many-body systems.
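The simplest worked instance of this eigenvalue problem is the hydrogen-like (one-electron) atom, whose levels have the closed form \(E_n = -13.6~\text{eV}\cdot Z^2/n^2\); a quick check:

# Hydrogen-like energy levels E_n = -13.6 eV * Z^2 / n^2 (one-electron special case).
RYDBERG_EV = 13.605693

def hydrogenic_level(n, Z=1):
    return -RYDBERG_EV * Z**2 / n**2

for n in range(1, 5):
    print(f"n={n}: E = {hydrogenic_level(n):8.3f} eV")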

Portfolio as a wave function: density matrices instead of variance–covariance matrices

Below is a compact, code-adjacent sketch of the idea: in a quantum-flavored actuarial / MPT model, the classical variance–covariance matrix \(\Sigma\) is upgraded to a density matrix \(\rho\), and a portfolio is treated as a wave function \(|\psi\rangle\). This mirrors what you might simulate with Stan, but in linear-algebra form.

Classical MPT / actuarial notation

// n assets, vector of random returns R
// mean vector μ, variance–covariance matrix Σ

E[R]   = μ               // n×1
Cov(R) = Σ               // n×n (symmetric, PSD)

// portfolio weights (deterministic)
vector[n] w;
real R_p = dot_product(w, R);     // portfolio return

real mean_p   = dot_product(w, μ);
real var_p    = quad_form(Σ, w);  // w^T Σ w

LaTeX form: \( \mathbb{E}[R] = \mu \), \( \operatorname{Cov}(R) = \Sigma \), \( R_p = w^\top R \), \( \mathbb{E}[R_p] = w^\top \mu \), \( \operatorname{Var}(R_p) = w^\top \Sigma w \).
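The same computation in plain numpy, with made-up inputs (three assets and illustrative \(\mu\), \(\Sigma\), \(w\)):

# Classical portfolio mean and variance: E[R_p] = w^T mu, Var(R_p) = w^T Sigma w.
# All numbers are illustrative placeholders.
import numpy as np

mu = np.array([0.06, 0.04, 0.09])              # expected returns
Sigma = np.array([[0.04, 0.01, 0.00],          # variance–covariance matrix (PSD)
                  [0.01, 0.02, 0.00],
                  [0.00, 0.00, 0.09]])
w = np.array([0.5, 0.3, 0.2])                  # portfolio weights

mean_p = w @ mu
var_p = w @ Sigma @ w
print(f"E[R_p] = {mean_p:.4f}, Var(R_p) = {var_p:.4f}")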

Quantum-style upgrade

We now re-interpret these same objects in a Hilbert-space way: the weight vector \(w\) becomes a normalized amplitude vector \(|\psi\rangle\), the variance–covariance matrix \(\Sigma\) is upgraded to a density matrix \(\rho = |\psi\rangle\langle\psi|\), and the asset returns become a return operator \(\hat R\).

In code-flavored pseudomath:

// amplitudes ψ (complex allowed), normalized
complex psi[n];
// density matrix ρ_ij = ψ_i ψ*_j
complex rho[n, n] = outer_product(psi, conj(psi));

// return operator R̂ (for illustration, take it diagonal)
real R_vals[n];    // possible asset returns
complex R_op[n, n];
for (i in 1:n) {
  for (j in 1:n) R_op[i,j] = (i == j) ? R_vals[i] : 0;
}

// quantum expectation of portfolio return
complex mean_p_q = trace(rho * R_op); // ≈ classical w^T μ

Mathematically,

ρ = |ψ><ψ|
<R̂>_ψ = <ψ| R̂ |ψ> = Tr(ρ R̂)

LaTeX form: \( \rho = |\psi\rangle\langle\psi| \), \( \langle \hat R \rangle_\psi = \langle \psi|\hat R|\psi\rangle = \operatorname{Tr}(\rho \hat R) \).

If \(\hat R\) is diagonal in the \(|e_i\rangle\) basis and we ignore phases, \(|\psi_i|^2\) plays the same role as classical weights \(w_i\). But once off-diagonal elements are allowed, \(\rho\) carries richer cross-asset structure than \(\Sigma\) alone.
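A numpy sketch of the same expectation, with placeholder amplitudes and returns; for a diagonal \(\hat R\) the quantum expectation reduces to the classical weighted mean.

# Density-matrix expectation <R> = Tr(rho R) with rho = |psi><psi|.
# Amplitudes and returns are illustrative placeholders.
import numpy as np

psi = np.array([0.6, 0.8j, 0.0])               # complex amplitudes
psi = psi / np.linalg.norm(psi)                # normalize
rho = np.outer(psi, psi.conj())                # rho_ij = psi_i * conj(psi_j)

R_vals = np.array([0.05, 0.02, 0.10])          # per-asset returns
R_op = np.diag(R_vals)                         # diagonal return operator

mean_q = np.trace(rho @ R_op).real             # Tr(rho R)
mean_classical = (np.abs(psi) ** 2) @ R_vals   # |psi_i|^2 acting like weights w_i
print(mean_q, mean_classical)                  # identical for a diagonal R_op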

Stan-style simulation vs Schrödinger-style evolution

Stan would typically simulate posterior draws of parameters \(\theta\) (e.g., drifts, vols, correlations) and then simulate returns:

// very schematic Stan pseudo-flow
for (m in 1:M) {
  θ[m] ~ posterior(· | data);      // parameters: μ[m], Σ[m], etc.
  R[m] ~ mvnormal(μ[m], Σ[m]);     // draw returns
  R_p[m] = dot_product(w, R[m]);
}
// compute empirical mean/var/VaR of R_p from samples

In the wave-function view, instead of sampling many \(\theta\), we evolve the state itself under a Hamiltonian \(\hat H\) that encodes drift/volatility/market structure:

// time evolution of portfolio state ψ_t
// i ℏ d|ψ_t>/dt = Ĥ |ψ_t>

psi_t+Δt ≈ exp(-i Δt Ĥ / ℏ) * psi_t;

// at horizon T, density matrix ρ_T = |ψ_T><ψ_T|
// expected payoff under operator Π̂
price_0 ≈ discount * trace(ρ_T * Π̂);

This is the code-level version of “instead of running Stan simulations over parameter space, we let the whole portfolio turn into a wave function and flow under \(\hat H\).”
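A minimal numerical version of this evolution, assuming numpy and scipy are available; the Hamiltonian and payoff operator are random toy placeholders, not a calibrated market model.

# Evolve |psi> with U = exp(-i dt H) (hbar = 1) and price via Tr(rho_T Pi).
# H and Pi are toy placeholders, not calibrated market objects.
import numpy as np
from scipy.linalg import expm

n = 3
rng = np.random.default_rng(0)
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = (A + A.conj().T) / 2                     # Hermitian "market" Hamiltonian
Pi = np.diag([1.0, 0.5, 0.0])                # payoff operator at the horizon

psi = np.ones(n, dtype=complex) / np.sqrt(n) # initial portfolio state
dt, steps, discount = 0.1, 10, 0.99
U = expm(-1j * dt * H)                       # one-step propagator
for _ in range(steps):
    psi = U @ psi

rho_T = np.outer(psi, psi.conj())
price_0 = discount * np.trace(rho_T @ Pi).real
print(round(price_0, 4))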

Black–Scholes as the one-asset limit

For a single risky asset in Black–Scholes, under the risk-neutral measure we have

dS_t = r S_t dt + σ S_t dW_t

The price \(V(S,t)\) of a European claim satisfies

∂V/∂t + (1/2) σ^2 S^2 ∂^2V/∂S^2 + r S ∂V/∂S - r V = 0

After changing variables \(x = \ln S\) and switching to forward time \(τ = T - t\), this maps to a heat equation which can be written in imaginary time as

∂φ/∂τ = (1/2) σ^2 ∂^2φ/∂x^2 - V_eff(x) φ
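For reference, the intermediate step is standard: substituting \(x = \ln S\) and \(\tau = T - t\) into the Black–Scholes PDE gives \[ \frac{\partial V}{\partial \tau} = \tfrac{1}{2}\sigma^2 \frac{\partial^2 V}{\partial x^2} + \Big(r - \tfrac{1}{2}\sigma^2\Big)\frac{\partial V}{\partial x} - rV, \] and an exponential substitution \(V = e^{\alpha x + \beta \tau}\,\varphi(x,\tau)\), with constants \(\alpha, \beta\) chosen to cancel the first-order and \(-rV\) terms, absorbs them into the effective potential, yielding the form shown above.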

Under a Wick rotation \(τ → i t\), this resembles a Schrödinger equation

i ℏ ∂ψ/∂t = Ĥ ψ

with \(\hat H\) containing a kinetic term (diffusion from \(σ\)) plus an effective potential.

So Black–Scholes is already half-way to the quantum formalism: it evolves distributions through a linear PDE. The density-matrix formulation just generalizes this to multiple coupled assets and richer dependence than a single \(\Sigma\) can conveniently express.

Rank | System Name | Developer | Qubit Type | Physical Qubits | Key Performance Metrics | Notable Achievements
1 | System Model H2 | Quantinuum | Trapped Ion | 32+ (scalable to 56) | QV: 2²⁵ (33,554,432); 2-qubit fidelity: 99.9% ("three 9s"); Logical qubits: Up to 12 entangled with 0.0011% error rate | World's highest QV; 4× QV gain in 2025 alone; First chemistry sim combining QC, HPC, and AI; Outperforms all in stable, fault-tolerant ops.
2 | Willow | Google Quantum AI | Superconducting | 105 | 2-qubit fidelity: >99.9%; Performs RCS benchmark in <5 min (10²⁵ years classically); Logical qubits: Demonstrated below-threshold error correction | Quantum supremacy on practical tasks; Doubles coherence vs. physical qubits; Outpaces supercomputers by 10 septillion times on sampling.
3 | Zuchongzhi 3.0 | USTC (China) | Superconducting | 105 | 1-qubit fidelity: 99.90%; 2-qubit fidelity: 99.62%; Task speed: Seconds (5.9B years classically) | Rivals Willow in speed; Uses low-noise tantalum/niobium; 15×7 lattice for high connectivity; Major leap in raw performance.
4 | Nighthawk | IBM | Superconducting | 120 | Up to 5,000 two-qubit gates; 2-qubit fidelity: ~99.5%; QV: >2²⁰ (est.); Tunable couplers: 218+ | Path to 2026 quantum advantage; 20% more couplers than Heron; Real-time error decoding in <480 ns; Utility-scale molecular sims with Fugaku supercomputer.
5 | Forte Enterprise | IonQ | Trapped Ion | 36 (Tempo: 64 planned) | 2-qubit fidelity: 99.99% ("four 9s"); #AQ 36 (all-to-all connectivity); 20× performance gains in apps | World-record fidelity for error correction; Efficient logical qubits with fewer physical ones; Used in drug discovery and finance modeling.
6 | Ankaa-3 | Rigetti | Superconducting | 84+ (100+ modular by end-2025) | 2-qubit fidelity: 99.5%; Nanosecond gate speeds; Real-time error correction on 84 qubits | Fastest gate times (vs. microsecond rivals); Sold as QPUs to labs; Chiplet roadmap for utility-scale; 98% median fidelity in square lattice.
7 | Neutral Atom Array | QuEra Computing | Neutral Atom | 3,000 (planned; current: 256) | Logical qubits: 48 with 0.5% error (vs. IBM's 2.9%); High entanglement stability | Fault-tolerant leader; Outperforms Heron in error rates; Scales to 10,000 qubits by 2026; Apps in logistics and materials science.
8 | Majorana 1 | Microsoft (Azure Quantum) | Topological | ~24 logical (scalable to 1M) | Inherent low error rates; High-fidelity Majorana quasiparticles; Logical qubits: 12+ entangled | First topological processor; Million-qubit potential; Integrates with Quantinuum/Atom for chemistry/AI; Below-threshold errors.
9 | Advantage 2 | D-Wave | Quantum Annealing | 7,440 | 15-way connectivity; Quantum supremacy on real-world optimization | Fastest for optimization (e.g., logistics); Not gate-based, but 3-month free trials via Leap; Beats classical on QUBO problems.
10 | Kookaburra | IBM | Superconducting | 1,386 (multi-chip) | Enhanced coherence; High gate fidelity (~99%); Part of Heron R2 upgrades | Massive scale for 2025; Builds on Condor (1,121 qubits); Focus on error-corrected simulations; Quantum-centric supercomputing.

Classical vs Quantum Computer: pseudo-code views

This section strips away hardware details and shows, in pseudo‑code, how a conventional laptop and a quantum processor “feel” different as machines.

Conventional computer (motherboard + CPU + RAM)

// physical picture
// --------------------------------------------------
// motherboard: connects CPU, RAM, storage, peripherals
// CPU: executes instructions sequentially (with some parallelism)
// RAM: holds bits (0 or 1) for active programs

machine ClassicalComputer {
  Motherboard board;
  CPU         cpu;
  RAM         ram;
  Storage     disk;
}

// run a program
function run_classical(program P, input bits_in[]):
  // 1. load code and data into RAM
  ram.load(P.code)
  ram.load(bits_in)

  // 2. CPU executes instructions one by one
  while cpu.instruction_pointer not at END(P):
    instr = ram.fetch(cpu.instruction_pointer)
    cpu.execute(instr, ram)
    cpu.instruction_pointer++

  // 3. read out output bits from RAM
  bits_out = ram.read(P.output_region)
  return bits_out

Quantum processor (QPU + classical control computer)

// physical picture
// --------------------------------------------------
// classical control computer: compiles code, sends pulses
// quantum processing unit (QPU): array of qubits on a chip
// cryostat: keeps QPU near absolute zero

machine QuantumComputer {
  ClassicalControl ctrl;     // compiler, schedulers
  QuantumProcessingUnit qpu; // qubits + control lines
}

// high-level quantum run
function run_quantum(circuit C, classical_input bits_in[]):
  // 1. classical pre-processing
  //    e.g., encode bits_in into initial qubit states
  compiled_pulses = ctrl.compile(C, bits_in)

  // 2. upload pulse schedule to QPU
  qpu.load(compiled_pulses)

  // 3. apply quantum operations (unitaries)
  qpu.apply_pulses()
  // internally, each gate is a unitary U on |ψ>:
  //    |ψ_new> = U |ψ_old>

  // 4. measure qubits
  measurement_record = qpu.measure_all() // collapses |ψ> → classical bits

  // 5. classical post-processing
  result = ctrl.post_process(measurement_record)
  return result

Conceptually, the classical machine shuttles definite bits between RAM and CPU one instruction at a time, while the quantum machine prepares a state, evolves it through unitaries, and only produces classical bits at the final measurement.

Same abstract computation, two execution models

Imagine we want to compute the parity (even/odd) of \(N\) bits.

Classical laptop parity

function parity_classical(bits[]):
  acc = 0
  for b in bits:
    acc = acc XOR b
  return acc  // 0 = even, 1 = odd

Quantum parity sketch (conceptual)

function parity_quantum(bits[]):
  // 1. encode bits into computational basis states
  //    |b_1 b_2 ... b_N>
  |ψ> = |b_1 b_2 ... b_N>

  // 2. use a circuit of CNOTs to copy global parity into an ancilla qubit
  //    |ψ, 0>  →  |ψ, parity(bits)>
  for i in 1..N:
    CNOT(control = qubit_i, target = ancilla)

  // 3. measure only the ancilla qubit
  result = measure(ancilla)
  return result  // 0 = even, 1 = odd

On such a small task, the quantum route is not “better” than the classical one—it is just a different physical implementation of the same logical function. Real quantum advantage typically appears in problems that exploit superposition and interference over huge state spaces.

Deep dive: complexity classes BPP vs BQP

If you like big‑picture theory, the usual cartoon is:

  • BPP (“bounded‑error probabilistic polynomial time”): problems efficiently solvable by a classical computer that can flip random bits, with error probability \(< 1/3\) (or any fixed constant < 1/2).
  • BQP (“bounded‑error quantum polynomial time”): problems efficiently solvable by a quantum computer, again with error probability \(< 1/3\).

Formally, a language \(L\) is in BQP if there is a family of quantum circuits \(\{C_n\}\) of size polynomial in \(n\) such that for all inputs \(x\) of length \(n\):

  • If \(x \in L\), then \(\Pr[C_n(x) = 1] \ge 2/3\).
  • If \(x \notin L\), then \(\Pr[C_n(x) = 1] \le 1/3\).

We know that \(\text{BPP} \subseteq \text{BQP}\) (quantum can simulate classical randomness), but we do not know whether \(\text{BPP} = \text{BQP}\) or \(\text{BPP} \subsetneq \text{BQP}\). Evidence from Shor’s algorithm and other candidates strongly suggests that \(\text{BQP}\) is strictly more powerful than \(\text{BPP}\) for some natural problems (like factoring large integers), but there is no proof yet.

Hardware analogy table: consumer laptop vs quantum processor

The numbers below are intentionally rounded, order‑of‑magnitude style, to give intuition. Real devices vary wildly; treat these as cartoon benchmarks, not spec sheets.

Device | Rough scale | Operation type | Throughput (back-of-envelope) | Notes
Consumer laptop CPU (1 core) | ~3 GHz clock | Classical bit ops | ~3×10⁹ simple ops/s | One scalar instruction stream; vector units add more parallelism.
Consumer laptop (8 cores, SIMD) | 8 cores × 3 GHz × 128-bit SIMD | Classical bit/word ops | ~10¹¹–10¹² basic ops/s | Depends heavily on workload and vectorization.
Mid-range GPU in a laptop | ~10³–10⁴ cores | Classical float ops | ~10¹²–10¹³ FLOP/s peak | Great for dense linear algebra and ML training.
Small noisy quantum processor | ~50–100 physical qubits | Quantum gates on a 2^N-dimensional state | ~10³–10⁴ 2-qubit gates per run before noise | Each gate acts on amplitudes of 2^N basis states in superposition.
Hypothetical fault-tolerant QPU | ~10⁶ physical qubits → 10³ logical | Error-corrected quantum circuits | ~10⁹–10¹² logical gates over long algorithms | Aimed at large chemistry, optimization, or cryptographic tasks.

"How many laptops equal one quantum processor?" (cartoon answers)

The trick is that you cannot fairly compare them just by counting operations per second, because a quantum gate on \(N\) qubits simultaneously transforms amplitudes across \(2^N\) basis states. Still, you can get a feeling from the two worked examples below.

Example 1: simulating 30 qubits on laptops

Storing a generic 30-qubit state on a classical machine means tracking about \(2^{30} \approx 10^9\) complex amplitudes, roughly 16 GB at double precision, and every gate application must sweep that whole array. A small physical QPU with 30–40 qubits applies the same logical layer directly in hardware; it is as if you had a cluster of hundreds to thousands of laptops working together to update all amplitudes every gate.

Example 2: 50‑qubit state

A generic 50-qubit state has \(2^{50} \approx 10^{15}\) amplitudes, on the order of 16 PB of raw memory, far beyond any single machine. So for generic 50-qubit circuits, one real QPU is morally comparable to an enormous classical cluster. For structured problems, clever classical algorithms can do much better—that’s why the “how many laptops” question has no single honest number.

Very rough equivalence table (for mental models only)

Quantum device (noisy) | State size 2^N | Naive RAM to store state | Approx. # of 16 GB laptops for RAM only
20-qubit QPU | ~10⁶ | ~16 MB | < 1 laptop (fits easily)
30-qubit QPU | ~10⁹ | ~16 GB | ~1 laptop (RAM is tight but possible)
40-qubit QPU | ~10¹² | ~16 TB | ~1000 laptops (each 16 GB)
50-qubit QPU | ~10¹⁵ | ~16 PB | ~1,000,000 laptops (each 16 GB)

This table only matches memory capacity, not speed. But it gives a rough intuition: once you pass ~40–50 entangled qubits in a generic circuit, brute‑force classical simulation starts looking like “you’d need an absurd number of consumer laptops.”
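The memory column can be recomputed in a couple of lines, assuming 16 bytes per complex amplitude (double-precision complex):

# RAM needed for a generic N-qubit statevector: 2^N amplitudes * 16 bytes (complex128).
def statevector_bytes(n_qubits: int) -> int:
    return (2 ** n_qubits) * 16

for n in (20, 30, 40, 50):
    b = statevector_bytes(n)
    laptops = b / (16 * 2 ** 30)            # how many 16 GB laptops just to hold it
    print(f"{n} qubits: {b / 2**30:,.3f} GiB  ≈ {laptops:,.1f} laptops of RAM")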

SHA-256 password hashing examples

The table below shows a few simple example passwords and their corresponding SHA-256 hash values. These are illustrative only; never hard‑code real passwords or reuse simple patterns like these in production.

Example password | SHA-256 hash (hex)
password | 5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8
123456 | 8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92
correcthorsebatterystaple | cbe6beb26479b568e5f15b50217c6c83c0ee051dc4e522b9840d8e291d6aaf46
QuantumR0cks! | a7c2a1b4b785bb7b39d6b26bafdc6f919aa0cc47a18eb2f6559bf0369a671a7a

Hashes above were computed with standard SHA-256; you can reproduce them using tools like sha256sum, openssl dgst -sha256, Python’s hashlib.sha256, or any reputable online hash calculator.
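To reproduce the table yourself with the standard library (hashes depend only on the exact byte string, so watch for stray whitespace or newlines):

# Recompute the SHA-256 hashes from the table with Python's standard library.
import hashlib

for pw in ["password", "123456", "correcthorsebatterystaple", "QuantumR0cks!"]:
    digest = hashlib.sha256(pw.encode("utf-8")).hexdigest()
    print(f"{pw:28s} {digest}")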

How long would it take to crack these passwords?

The table below uses deliberately rough, order‑of‑magnitude estimates to show how different passwords fare against brute‑force search on classical hardware and on a future, optimistic quantum attacker using Grover-style speedups. The point is not the exact numbers, but the contrast between weak and strong secrets—and why multi‑factor authentication (MFA) is so valuable.

Password (example only) | SHA-256 hash (first 16 hex chars) | Approx. search space | Brute-force time, classical attacker (10¹² guesses/sec) | Brute-force time, quantum attacker (10¹⁸ "effective" guesses/sec with Grover)
123456 (6 digits, only numbers) | 8d969eef6ecad3c2… | 10⁶ ≈ 1,000,000 | ≈ 10⁻⁶ s (microseconds) | Effectively instant
password (8 lower-case letters) | 5e884898da280471… | 26⁸ ≈ 2×10¹¹ | ≈ 0.2 s (a fraction of a second) | Grover over the full space makes it even easier – effectively trivial
correcthorsebatterystaple (~25 lower-case letters; if chosen as a 4-word combo from 10⁴ words ⇒ ~10¹⁶ possibilities) | cbe6beb26479b568… | ≈ 10¹⁶ (for a simple wordlist model) | ≈ 10⁴ s, i.e. a few hours with a strong classical rig and a very good wordlist | Grover-style speedup over 10¹⁶ gives √10¹⁶ = 10⁸ steps; at 10¹⁸ "effective" ops/s ⇒ ~10⁻¹⁰ s, but this ignores huge practical overheads. Still: quantum helps.
S7q!vP9$Lm@2 (12 chars, ~72-char alphabet: upper/lower/digits/symbols) | 4f9c2b1a7d8e6c5b… (example) | 72¹² ≈ 2×10²² | ≈ 2×10¹⁰ seconds, i.e. 600+ years | Grover reduces the work to √(2×10²²) ≈ 1.4×10¹¹ steps; at 10¹⁸/s ⇒ ≈ 10⁻⁷ s in the toy model, but building such a fault-tolerant quantum machine for real-world hash cracking is far beyond current tech.

Assumptions are deliberately aggressive in favor of the attacker and ignore I/O, memory, and protocol defenses; they’re meant as back‑of‑the‑envelope illustrations, not operational guarantees. Also, SHA‑256 is used here in isolation; real systems should wrap it in slow, memory‑hard KDFs such as bcrypt, scrypt, or Argon2.
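The timing columns follow from two divisions; this sketch uses the same assumed guess rates as the table (10¹² classical, 10¹⁸ "effective" Grover-style) and ignores every practical overhead:

# Back-of-the-envelope brute-force timings under the table's assumed guess rates.
CLASSICAL_RATE = 1e12      # guesses per second (assumed)
QUANTUM_RATE = 1e18        # "effective" guesses per second with Grover (assumed)

def crack_times(search_space: float) -> tuple[float, float]:
    classical = search_space / CLASSICAL_RATE
    grover = (search_space ** 0.5) / QUANTUM_RATE   # Grover: ~sqrt(N) evaluations
    return classical, grover

cases = [("6 digits", 10 ** 6), ("8 lower-case letters", 26 ** 8),
         ("4-word passphrase", 10 ** 16), ("12 chars, 72-symbol alphabet", 72 ** 12)]
for label, space in cases:
    c_time, q_time = crack_times(space)
    print(f"{label:30s} classical ≈ {c_time:.3g} s   Grover-style ≈ {q_time:.3g} s")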

Why MFA (multi‑factor authentication) still matters

Passwords—no matter how long—are only one factor: something you know. MFA adds at least one extra, independent factor, such as:

  • Something you have: a hardware security key (FIDO2/U2F), phone‑based OTP app, or smart card.
  • Something you are: biometrics like fingerprint or face unlock (ideally used as a local unlock for a hardware token, not sent to servers).

Even if an attacker fully recovers your password hash and somehow inverts it, they still need to satisfy the other factor in real time. Concretely, MFA helps block:

  • Password database leaks: A stolen hash alone won’t log in without your second factor.
  • Credential stuffing: Re‑used passwords sprayed across many sites are far less useful when those sites enforce MFA.
  • Quantum‑assisted guessing: If powerful quantum machines eventually make certain brute‑force attacks cheaper, MFA still forces the attacker to compromise a device or biometric, not just a string.

In practice, combining a strong, unique password (or passphrase) with hardware‑token‑based MFA gives you a large safety margin against both classical and foreseeable quantum brute‑force attacks.