Code that writes code

Exponential-looking growth in computing emerges when systems can improve their own code. Once a loop exists that can generate, execute, measure, and refine programs, capability compounds: each improvement enhances the next round of search over the program phase space. The driver is not just device density, but a feedback process that compresses errors, increases information flow, and reduces effective entropy in the distribution of behaviors.

In mathematical terms, let the capability at iteration \(t\) be \(C(t)\). A simple model of compounded self-improvement is \[ C(t+1) = (1 + r)\,C(t), \qquad r > 0, \] whose solution is \[ C(t) = C(0)\,(1 + r)^t \approx C(0)\,e^{rt} \quad (\text{for small } r), \] capturing the "exponential-looking" growth. A more general continuous-time feedback model is \[ \frac{dC}{dt} = f(I,C,E), \] where \(I\) measures information flow through the system and \(E\) is the average error. A negative dependence \(\partial f/\partial E < 0\) and a positive dependence \(\partial f/\partial I > 0\) encode that reducing errors and increasing information flow both accelerate capability growth.
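A minimal numerical sketch of this recurrence, where the rate r and starting capability C0 are illustrative placeholders rather than measured quantities:

# Minimal sketch of the compounding model C(t+1) = (1 + r) * C(t).
# r and C0 are illustrative placeholders, not empirical values.
import math

def capability_trajectory(C0: float, r: float, steps: int) -> list[float]:
    traj = [C0]
    for _ in range(steps):
        traj.append(traj[-1] * (1.0 + r))
    return traj

C0, r = 1.0, 0.05
traj = capability_trajectory(C0, r, steps=50)
# The discrete solution C0 * (1 + r)^t tracks the continuous C0 * e^(r t) for small r.
for t in (0, 10, 25, 50):
    print(t, round(traj[t], 3), round(C0 * math.exp(r * t), 3))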

Let the program hypothesis space be a high-dimensional manifold \(\mathcal{H}\) with coordinates \(\theta \in \mathbb{R}^d\) parameterizing programs, and let a loss (or energy-like) function be \(L : \mathcal{H} \to \mathbb{R}_+\). Guided search corresponds to iterates \[ \theta_{t+1} = \theta_t - \eta_t \, \nabla_\theta L(\theta_t) + \xi_t, \] where \(\eta_t\) is a learning rate and \(\xi_t\) represents stochastic exploration. The associated probability density over programs, \(p_t(\theta)\), can be modeled as a Gibbs-like distribution \[ p_t(\theta) = \frac{1}{Z_t} \exp\big(-\beta_t L(\theta)\big), \] with partition function \(Z_t\) and inverse temperature \(\beta_t\) reflecting how strongly the system prefers lower-loss code. As \(t\) increases and \(\beta_t\) grows, the distribution contracts around high-quality programs, reducing effective entropy \[ S[p_t] = - \int_{\mathcal{H}} p_t(\theta)\, \log p_t(\theta)\, d\theta. \]
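To make the contraction concrete, here is a small sketch (the loss values are arbitrary placeholders for \(L(\theta)\) over a finite candidate set): as \(\beta_t\) grows, the Gibbs weights concentrate on low-loss candidates and the entropy \(S[p_t]\) drops.

# Gibbs-like distribution p(theta) ∝ exp(-beta * L(theta)) over a finite candidate set,
# and its Shannon entropy. The losses are arbitrary placeholders.
import numpy as np

def gibbs(losses, beta):
    logits = -beta * np.asarray(losses, dtype=float)
    logits -= logits.max()                # numerical stability
    p = np.exp(logits)
    return p / p.sum()

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

losses = [0.2, 0.5, 0.9, 1.3, 2.0]
for beta in (0.1, 1.0, 10.0):
    p = gibbs(losses, beta)
    print(f"beta={beta:5.1f}  entropy={entropy(p):.3f}  p={np.round(p, 3)}")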

Mechanism of self-improvement

Formally, represent each candidate program by parameters \(\theta \in \mathbb{R}^d\) and define an objective (loss/energy) \(L(\theta)\). Gradient-based guidance is \[ \theta_{t+1} = \theta_t - \eta_t \, \nabla_\theta L(\theta_t), \] while purely heuristic or discrete search can be modeled by a Markov chain with transition kernel \(K(\theta'\mid\theta)\) that satisfies \[ p_{t+1}(\theta') = \int K(\theta'\mid\theta)\,p_t(\theta)\,d\theta, \] where \(p_t(\theta)\) is the probability density over program parameters at iteration \(t\). The stationary distribution \(p_*(\theta)\) often approximates a Boltzmann form \(p_*(\theta) \propto e^{-\beta L(\theta)}\), concentrating mass near minima of \(L\).

Let \(\Theta\) be a discrete hypothesis set of programs, with prior \(P(\theta)\) for \(\theta \in \Theta\). Given observed data or test outcomes \(D\), Bayes' rule updates beliefs via \[ P(\theta \mid D) = \frac{P(D \mid \theta)P(\theta)}{P(D)}, \qquad P(D) = \sum_{\theta' \in \Theta} P(D \mid \theta')P(\theta'). \] The Shannon entropy of the hypothesis distribution is \[ H[P] = -\sum_{\theta \in \Theta} P(\theta)\,\log P(\theta). \] Error-correcting feedback corresponds to sequences of measurements \(D_1, D_2, \dots\) such that \[ H\big[P(\cdot \mid D_1,\dots,D_t)\big] \searrow H_* \quad \text{as} \quad t \to \infty, \] where \(H_*\) is low and the posterior probability mass is concentrated on a small set of high-fidelity programs.
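A toy numerical illustration of this error-correcting feedback, assuming synthetic pass/fail likelihoods: the posterior entropy falls as test outcomes accumulate.

# Toy Bayesian update over a discrete set of candidate programs.
# The pass probabilities are assumed, synthetic values, not real test data.
import numpy as np

rng = np.random.default_rng(0)
n_hyp, true_idx = 8, 3
posterior = np.full(n_hyp, 1.0 / n_hyp)           # uniform prior P(theta)

p_pass = np.full(n_hyp, 0.5)                      # P(test passes | theta), assumed
p_pass[true_idx] = 0.95                           # the "true" program passes most tests

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

for t in range(1, 21):
    outcome = rng.random() < p_pass[true_idx]     # observe one test outcome D_t
    likelihood = p_pass if outcome else (1.0 - p_pass)
    posterior = likelihood * posterior            # Bayes' rule (unnormalized)
    posterior /= posterior.sum()
    if t % 5 == 0:
        print(f"t={t:2d}  H[P]={entropy(posterior):.3f}  best hypothesis={posterior.argmax()}")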

Landauer's principle states that erasing one bit of information incurs a minimum heat dissipation \[ Q_{\min} = k_B T \ln 2, \] where \(k_B\) is Boltzmann's constant and \(T\) is the ambient temperature. If a self-editing system performs \(N_\text{erase}\) bit erasures per update, the minimal thermodynamic cost per update is \[ Q_\text{update} \ge N_\text{erase}\,k_B T \ln 2. \] The computational efficiency can be expressed as useful information gain per unit energy, \[ \eta_\text{info} = \frac{\Delta I}{Q_\text{update}}, \] where \(\Delta I\) is the increase in mutual information between internal model parameters \(\Theta\) and task outcomes \(Y\): \[ I(\Theta;Y) = \sum_{\theta,y} p(\theta,y)\, \log \frac{p(\theta,y)}{p(\theta)\,p(y)}. \] Efficient self-editing corresponds to strategies that maximize \(\eta_\text{info}\) by minimizing unnecessary erasures and maximizing \(\Delta I\).
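A back-of-the-envelope evaluation of the bound, with arbitrary erasure counts and room temperature assumed:

# Landauer bound: minimal heat for N bit erasures at temperature T.
# The erasure counts below are arbitrary illustrative numbers.
import math

k_B = 1.380649e-23            # Boltzmann constant, J/K

def landauer_min_heat(n_erase, T=300.0):
    return n_erase * k_B * T * math.log(2)   # Q >= N * k_B * T * ln 2

for n in (1, 1e9, 1e15):
    print(f"{n:.0e} erasures at 300 K  ->  Q >= {landauer_min_heat(n):.3e} J")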

Let the Kolmogorov complexity (algorithmic information content) of a solution for task \(T\) be \(K(T)\), defined as the length (in bits) of the shortest program that solves \(T\) on a fixed universal Turing machine \(U\). For two tasks \(T_1, T_2\), the shared structure can be quantified by an information-theoretic overlap \[ I(T_1;T_2) \approx K(T_1) + K(T_2) - K(T_1,T_2), \] where \(K(T_1,T_2)\) is the complexity of jointly solving both tasks. Reusable compressed representations correspond to internal codes \(z\) of small description length \(L(z)\) that minimize expected description length over tasks: \[ \min_{z} \, \mathbb{E}_{T \sim \mathcal{D}}\big[ L(z) + L(T \mid z) \big], \] with \(\mathcal{D}\) a task distribution. Lower \(L(z)\) and \(L(T\mid z)\) imply faster adaptation and thus accelerated convergence on new problems.
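Kolmogorov complexity is uncomputable, but a general-purpose compressor gives a crude proxy. The sketch below substitutes zlib compressed lengths for \(K(\cdot)\) in the overlap formula, with made-up "tasks"; it is an illustration of the idea, not a faithful estimator.

# Crude proxy for the overlap I(T1;T2) ≈ K(T1) + K(T2) - K(T1,T2),
# using zlib compressed length in place of the (uncomputable) Kolmogorov complexity.
import os
import zlib

def c(data: bytes) -> int:
    return len(zlib.compress(data, 9))

def overlap_bits(a: bytes, b: bytes) -> int:
    return c(a) + c(b) - c(a + b)

task1 = b"def add(a, b): return a + b\n" * 20      # two closely related "tasks"
task2 = b"def sub(a, b): return a - b\n" * 20
task3 = os.urandom(len(task1))                      # unrelated, incompressible content

print("related tasks  :", overlap_bits(task1, task2))
print("unrelated tasks:", overlap_bits(task1, task3))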

Beginner note: If the math above feels heavy, you do not need to absorb every symbol. The key idea is simple: systems that can test and improve their own code can get better very quickly. Quantum computing plugs into this story by giving those systems a new kind of hardware to explore enormous search spaces more efficiently.

How this page is assembled

This page was assembled with automated assistance: generative tooling produced structure and text, which were reviewed and emitted as static HTML. In other words, the site itself serves as a small example of code that helps author more code. AI for Not Bad.

Abstractly, denote the human editor as \(H\) and the generative model as \(G\). Let \(x\) be an initial specification and \(y\) the final HTML. The interaction can be seen as an alternating minimization over drafts \(y_t\): \[ y_{t+1} = \operatorname*{arg\,min}_{y} \Big( \mathcal{L}_\text{spec}(y \mid x) + \mathcal{L}_\text{style}(y) + \mathcal{L}_\text{error}(y) \Big), \] with updates produced by either \(G\) or \(H\): \[ y_{t+1} = \begin{cases} G(y_t,x,\xi_t) & \text{with probability } p_G, \\ H(y_t,x) & \text{with probability } 1-p_G, \end{cases} \] where \(\xi_t\) captures model stochasticity. Convergence is reached when successive edits satisfy a small-difference condition, e.g. \[ d(y_{t+1},y_t) < \varepsilon, \] for a suitable distance metric \(d\) on documents.

Deep dive: a tiny quantum circuit in code and math

Here is a minimal example of preparing a Bell state using Python-style pseudocode with a Qiskit‑like API, plus the matching math.

# Pseudocode using a Qiskit-like API
from qiskit import QuantumCircuit

qc = QuantumCircuit(2, 2)  # 2 qubits, 2 classical bits

# 1. Put qubit 0 into a superposition
qc.h(0)

# 2. Use it as control for a CNOT on qubit 1
qc.cx(0, 1)

# 3. Measure both qubits
qc.measure(0, 0)
qc.measure(1, 1)

Mathematically, in the \(|00\rangle,|01\rangle,|10\rangle,|11\rangle\) basis:

  1. Start in \(|00\rangle\).
  2. Apply \(H\) to the first qubit: \[ H|0\rangle = \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle), \] so the joint state becomes \[ |\psi_1\rangle = \tfrac{1}{\sqrt{2}}(|00\rangle + |10\rangle). \]
  3. Apply CNOT with qubit 0 as control and qubit 1 as target: \[ \text{CNOT}|00\rangle = |00\rangle, \quad \text{CNOT}|10\rangle = |11\rangle. \] Therefore \[ |\psi_2\rangle = \tfrac{1}{\sqrt{2}}(|00\rangle + |11\rangle) = |\Phi^+\rangle. \]

When you measure both qubits in the computational basis, you get:

  • \(P(00) = 1/2\)
  • \(P(11) = 1/2\)
  • \(P(01) = P(10) = 0\)

This is “maximal entanglement” in its simplest form.
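To check these numbers empirically, here is a minimal runnable sketch; it assumes the qiskit and qiskit-aer packages are installed, and exact API details may vary between versions.

# Running the Bell circuit on a local simulator (assumes qiskit + qiskit-aer installed).
from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator

qc = QuantumCircuit(2, 2)
qc.h(0)
qc.cx(0, 1)
qc.measure([0, 1], [0, 1])

sim = AerSimulator()
counts = sim.run(qc, shots=2000).result().get_counts()
print(counts)   # expect roughly half '00' and half '11', never '01' or '10'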

Minimal meta-programming sketch

// Pseudocode for a self-improving loop
population = initialize_programs()
history    = []                          // execution log of (program, result, score)
while (time < budget):
  for prog in population:
    result = execute(prog, tests)
    score  = measure(result)
    history.append((prog, result, score))
  models = fit_surrogates(history)        // learn to predict score from program features
  proposals = propose(models)             // synthesize or mutate new programs
  population = select(population, proposals, history) // keep better, diverse candidates
  if converged(population): break
deploy(best(population))

Let the population at iteration \(t\) be \(\mathcal{P}_t = \{\theta_t^{(1)},\dots,\theta_t^{(N)}\}\), and let each candidate have a score (fitness) \(F(\theta)\). The selection step can be modeled by a softmax (Boltzmann) sampling distribution \[ P_t(\theta) = \frac{\exp\big(\beta_t F(\theta)\big)}{\sum_{\theta' \in \mathcal{P}_t} \exp\big(\beta_t F(\theta')\big)}, \] with inverse temperature \(\beta_t\) controlling selection pressure. Proposals (mutations or synthesized programs) \(\tilde{\theta}\) are drawn from a proposal kernel \(q_t(\tilde{\theta} \mid \theta)\), so the expected update of the population distribution is \[ p_{t+1}(\tilde{\theta}) = \sum_{\theta} P_t(\theta)\, q_t(\tilde{\theta} \mid \theta). \] Convergence of the loop can be expressed via a stopping condition such as \[ \operatorname{Var}_{\theta \sim p_t}[F(\theta)] < \delta \quad \text{or} \quad \max_{\theta \in \mathcal{P}_t} F(\theta) - \max_{\theta \in \mathcal{P}_{t-k}} F(\theta) < \epsilon, \] for some window size \(k\) and tolerances \(\delta, \epsilon\).
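A small sketch of the softmax selection and stopping rules just described, with placeholder fitness scores (assuming numpy):

# Softmax (Boltzmann) selection over fitness scores, plus a variance-based stopping test.
# The fitness values are random placeholders for F(theta).
import numpy as np

def selection_probs(fitness, beta):
    z = beta * np.asarray(fitness, dtype=float)
    z -= z.max()                               # numerical stability
    p = np.exp(z)
    return p / p.sum()

def converged(fitness, delta=1e-3):
    return float(np.var(fitness)) < delta      # Var[F] < delta stopping rule

rng = np.random.default_rng(1)
fitness = rng.normal(loc=0.5, scale=0.2, size=10)
parents = rng.choice(len(fitness), size=10, p=selection_probs(fitness, beta=5.0))
print("selected parent indices:", parents)
print("converged:", converged(fitness))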

Hierarchical graph of matter and interactions

Universe
└─ Quantum fields
   ├─ Fermion fields (matter)
   │  ├─ Quarks (color charge: red/green/blue)
   │  │  ├─ Flavors: up, down, charm, strange, top, bottom
   │  │  └─ Bound states (hadrons)
   │  │     ├─ Baryons (3 quarks)
   │  │     │  ├─ Proton: u u d
   │  │     │  ├─ Neutron: u d d
   │  │     │  └─ Antibaryons (3 antiquarks)
   │  │     │     └─ Antiproton: ū ū d̄
   │  │     └─ Mesons (quark + antiquark)
   │  └─ Leptons
   │     ├─ Electron, Muon, Tau
   │     └─ Neutrinos (three types) + corresponding antiparticles (e.g., positron)
   ├─ Boson fields (interaction carriers)
   │  ├─ Gluons (strong interaction, SU(3) color)
   │  ├─ Photon (electromagnetic, U(1))
   │  ├─ W± and Z⁰ (weak interaction)
   │  └─ Gravitational quantum (graviton, hypothetical)
   └─ Scalar field associated with electroweak symmetry breaking
      └─ Scalar boson (mass-generating excitation)

Composite structures
└─ Atomic nucleus: protons + neutrons (held by residual strong force)
   ├─ Atoms: nucleus + electron cloud (quantized orbitals)
   ├─ Molecules: atoms bound via electromagnetic interaction
   └─ Condensed phases: solids, liquids, gases, plasmas, exotic matter

“Matter” typically denotes fermions and their composites. Gauge bosons (such as gluons) mediate interactions and are included for completeness.

In the Standard Model, the fundamental fields can be written as a Lagrangian density \(\mathcal{L}_\text{SM}\) of the form \[ \mathcal{L}_\text{SM} = -\frac{1}{4} \sum_{a} F_{\mu\nu}^a F^{a\,\mu\nu} + \sum_{f} \bar{\psi}_f\big(i\gamma^\mu D_\mu - m_f\big)\psi_f + (D_\mu \phi)^\dagger(D^\mu \phi) - V(\phi) + \mathcal{L}_\text{Yukawa}, \] where \[ F_{\mu\nu}^a = \partial_\mu A_\nu^a - \partial_\nu A_\mu^a + g f^{abc} A_\mu^b A_\nu^c \] encodes the gauge bosons (gluons, \(W^\pm, Z^0\), photon), \(\psi_f\) are fermion fields (quarks and leptons), \(\phi\) is the Higgs scalar field, and \(D_\mu = \partial_\mu - i g A_\mu^a T^a\) is the gauge-covariant derivative.

Color-charged quarks \(q\) interact via the SU(3) gauge field \(G_\mu^a\) with coupling constant \(g_s\), described by \[ \mathcal{L}_\text{QCD} = \bar{q}\big(i\gamma^\mu D_\mu - m_q\big)q - \frac{1}{4} G_{\mu\nu}^a G^{a\,\mu\nu}, \] where \(D_\mu = \partial_\mu - i g_s T^a G_\mu^a\) and \(G_{\mu\nu}^a\) has the same non-Abelian structure as \(F_{\mu\nu}^a\) above. Bound states such as protons and neutrons are color-singlet combinations of three quarks (baryons), while mesons are quark–antiquark pairs \(q \bar{q}\), consistent with overall color neutrality.

Leptons (electron \(e\), muon \(\mu\), tau \(\tau\) and their neutrinos \(\nu_e, \nu_\mu, \nu_\tau\)) couple to the electroweak \(SU(2)_L \times U(1)_Y\) gauge fields. After electroweak symmetry breaking, the Higgs field acquires a vacuum expectation value \[ \langle \phi \rangle = \frac{1}{\sqrt{2}} \begin{pmatrix}0 \\ v \end{pmatrix}, \qquad v \approx 246~\text{GeV}, \] which generates fermion and weak boson masses through Yukawa terms \(y_f \bar{\psi}_f \phi \psi_f\) and gauge interactions, while leaving the photon massless.

Composite structures are organized hierarchically. A nucleus with \(Z\) protons and \(A-Z\) neutrons has baryon number \(B = A\) and charge \(Q = Z e\). An atom adds \(Z\) electrons, leading to a many-body Hamiltonian \[ H = \sum_{i=1}^{Z} \bigg( -\frac{\hbar^2}{2m_e} \nabla_i^2 - \frac{Z e^2}{4\pi \varepsilon_0 r_i} \bigg) + \sum_{1 \le i < j \le Z} \frac{e^2}{4\pi \varepsilon_0 \lvert \mathbf{r}_i - \mathbf{r}_j \rvert} + H_\text{nucleus}, \] whose eigenstates \(\Psi_n\) correspond to quantized orbitals and energy levels \(E_n\) solving \[ H \Psi_n = E_n \Psi_n. \] Molecules and condensed phases arise when these atomic states combine via electromagnetic interactions into multi-atom bound states and extended many-body systems.
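The simplest worked instance of this eigenvalue problem is the hydrogen-like (one-electron) atom, whose levels have the closed form \(E_n = -13.6~\text{eV}\cdot Z^2/n^2\); a quick check:

# Hydrogen-like energy levels E_n = -13.6 eV * Z^2 / n^2 (one-electron special case).
RYDBERG_EV = 13.605693

def hydrogenic_level(n, Z=1):
    return -RYDBERG_EV * Z**2 / n**2

for n in range(1, 5):
    print(f"n={n}: E = {hydrogenic_level(n):8.3f} eV")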

Portfolio as a wave function: density matrices instead of variance–covariance matrices

Below is a compact, code-adjacent sketch of the idea: in a quantum-flavored actuarial / MPT model, the classical variance–covariance matrix \(\Sigma\) is upgraded to a density matrix \(\rho\), and a portfolio is treated as a wave function \(|\psi\rangle\). This mirrors what you might simulate with Stan, but in linear-algebra form.

Classical MPT / actuarial notation

// n assets, vector of random returns R
// mean vector μ, variance–covariance matrix Σ

E[R]   = μ               // n×1
Cov(R) = Σ               // n×n (symmetric, PSD)

// portfolio weights (deterministic)
vector[n] w;
real R_p = dot_product(w, R);     // portfolio return

real mean_p   = dot_product(w, μ);
real var_p    = quad_form(Σ, w);  // w^T Σ w

LaTeX form: \( \mathbb{E}[R] = \mu \), \( \operatorname{Cov}(R) = \Sigma \), \( R_p = w^\top R \), \( \mathbb{E}[R_p] = w^\top \mu \), \( \operatorname{Var}(R_p) = w^\top \Sigma w \).
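The same computation in plain numpy, with made-up inputs (three assets and illustrative \(\mu\), \(\Sigma\), \(w\)):

# Classical portfolio mean and variance: E[R_p] = w^T mu, Var(R_p) = w^T Sigma w.
# All numbers are illustrative placeholders.
import numpy as np

mu = np.array([0.06, 0.04, 0.09])              # expected returns
Sigma = np.array([[0.04, 0.01, 0.00],          # variance–covariance matrix (PSD)
                  [0.01, 0.02, 0.00],
                  [0.00, 0.00, 0.09]])
w = np.array([0.5, 0.3, 0.2])                  # portfolio weights

mean_p = w @ mu
var_p = w @ Sigma @ w
print(f"E[R_p] = {mean_p:.4f}, Var(R_p) = {var_p:.4f}")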

Quantum-style upgrade

We now re-interpret these same objects in a Hilbert-space way: the weight vector \(w\) becomes a normalized amplitude vector \(|\psi\rangle\), the variance–covariance matrix \(\Sigma\) is upgraded to a density matrix \(\rho = |\psi\rangle\langle\psi|\), and the asset returns become a return operator \(\hat R\).

In code-flavored pseudomath:

// amplitudes ψ (complex allowed), normalized
complex psi[n];
// density matrix ρ_ij = ψ_i ψ*_j
complex rho[n, n] = outer_product(psi, conj(psi));

// return operator R̂ (for illustration, take it diagonal)
real R_vals[n];    // possible asset returns
complex R_op[n, n];
for (i in 1:n) {
  for (j in 1:n) R_op[i,j] = (i == j) ? R_vals[i] : 0;
}

// quantum expectation of portfolio return
complex mean_p_q = trace(rho * R_op); // ≈ classical w^T μ

Mathematically,

ρ = |ψ><ψ|
<R̂>_ψ = <ψ| R̂ |ψ> = Tr(ρ R̂)

LaTeX form: \( \rho = |\psi\rangle\langle\psi| \), \( \langle \hat R \rangle_\psi = \langle \psi|\hat R|\psi\rangle = \operatorname{Tr}(\rho \hat R) \).

If \(\hat R\) is diagonal in the \(|e_i\rangle\) basis and we ignore phases, \(|\psi_i|^2\) plays the same role as classical weights \(w_i\). But once off-diagonal elements are allowed, \(\rho\) carries richer cross-asset structure than \(\Sigma\) alone.
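A numpy sketch of the same expectation, with placeholder amplitudes and returns; for a diagonal \(\hat R\) the quantum expectation reduces to the classical weighted mean.

# Density-matrix expectation <R> = Tr(rho R) with rho = |psi><psi|.
# Amplitudes and returns are illustrative placeholders.
import numpy as np

psi = np.array([0.6, 0.8j, 0.0])               # complex amplitudes
psi = psi / np.linalg.norm(psi)                # normalize
rho = np.outer(psi, psi.conj())                # rho_ij = psi_i * conj(psi_j)

R_vals = np.array([0.05, 0.02, 0.10])          # per-asset returns
R_op = np.diag(R_vals)                         # diagonal return operator

mean_q = np.trace(rho @ R_op).real             # Tr(rho R)
mean_classical = (np.abs(psi) ** 2) @ R_vals   # |psi_i|^2 acting like weights w_i
print(mean_q, mean_classical)                  # identical for a diagonal R_op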

Stan-style simulation vs Schrödinger-style evolution

Stan would typically simulate posterior draws of parameters \(\theta\) (e.g., drifts, vols, correlations) and then simulate returns:

// very schematic Stan pseudo-flow
for (m in 1:M) {
  θ[m] ~ posterior(· | data);      // parameters: μ[m], Σ[m], etc.
  R[m] ~ mvnormal(μ[m], Σ[m]);     // draw returns
  R_p[m] = dot_product(w, R[m]);
}
// compute empirical mean/var/VaR of R_p from samples

In the wave-function view, instead of sampling many \(\theta\), we evolve the state itself under a Hamiltonian \(\hat H\) that encodes drift/volatility/market structure:

// time evolution of portfolio state ψ_t
// i ℏ d|ψ_t>/dt = Ĥ |ψ_t>

psi_t+Δt ≈ exp(-i Δt Ĥ / ℏ) * psi_t;

// at horizon T, density matrix ρ_T = |ψ_T><ψ_T|
// expected payoff under operator Π̂
price_0 ≈ discount * trace(ρ_T * Π̂);

This is the code-level version of “instead of running Stan simulations over parameter space, we let the whole portfolio turn into a wave function and flow under \(\hat H\).”
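A minimal numerical version of this evolution, assuming numpy and scipy are available; the Hamiltonian and payoff operator are random toy placeholders, not a calibrated market model.

# Evolve |psi> with U = exp(-i dt H) (hbar = 1) and price via Tr(rho_T Pi).
# H and Pi are toy placeholders, not calibrated market objects.
import numpy as np
from scipy.linalg import expm

n = 3
rng = np.random.default_rng(0)
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = (A + A.conj().T) / 2                     # Hermitian "market" Hamiltonian
Pi = np.diag([1.0, 0.5, 0.0])                # payoff operator at the horizon

psi = np.ones(n, dtype=complex) / np.sqrt(n) # initial portfolio state
dt, steps, discount = 0.1, 10, 0.99
U = expm(-1j * dt * H)                       # one-step propagator
for _ in range(steps):
    psi = U @ psi

rho_T = np.outer(psi, psi.conj())
price_0 = discount * np.trace(rho_T @ Pi).real
print(round(price_0, 4))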

Black–Scholes as the one-asset limit

For a single risky asset in Black–Scholes, under the risk-neutral measure we have

dS_t = r S_t dt + σ S_t dW_t

The price \(V(S,t)\) of a European claim satisfies

∂V/∂t + (1/2) σ^2 S^2 ∂^2V/∂S^2 + r S ∂V/∂S - r V = 0

After changing variables \(x = \ln S\) and switching to forward time \(τ = T - t\), this maps to a heat equation which can be written in imaginary time as

∂φ/∂τ = (1/2) σ^2 ∂^2φ/∂x^2 - V_eff(x) φ
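For reference, the intermediate step is standard: substituting \(x = \ln S\) and \(\tau = T - t\) into the Black–Scholes PDE gives \[ \frac{\partial V}{\partial \tau} = \tfrac{1}{2}\sigma^2 \frac{\partial^2 V}{\partial x^2} + \Big(r - \tfrac{1}{2}\sigma^2\Big)\frac{\partial V}{\partial x} - rV, \] and an exponential substitution \(V = e^{\alpha x + \beta \tau}\,\varphi(x,\tau)\), with constants \(\alpha, \beta\) chosen to cancel the first-order and \(-rV\) terms, absorbs them into the effective potential, yielding the form shown above.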

Under a Wick rotation \(τ → i t\), this resembles a Schrödinger equation

i ℏ ∂ψ/∂t = Ĥ ψ

with \(\hat H\) containing a kinetic term (diffusion from \(σ\)) plus an effective potential.

So Black–Scholes is already half-way to the quantum formalism: it evolves distributions through a linear PDE. The density-matrix formulation just generalizes this to multiple coupled assets and richer dependence than a single \(\Sigma\) can conveniently express.

Rank | System Name | Developer | Qubit Type | Physical Qubits | Key Performance Metrics | Notable Achievements
1 | System Model H2 | Quantinuum | Trapped Ion | 32+ (scalable to 56) | QV: 2²⁵ (33,554,432); 2-qubit fidelity: 99.9% ("three 9s"); Logical qubits: Up to 12 entangled with 0.0011% error rate | World's highest QV; 4× QV gain in 2025 alone; First chemistry sim combining QC, HPC, and AI; Outperforms all in stable, fault-tolerant ops.
2 | Willow | Google Quantum AI | Superconducting | 105 | 2-qubit fidelity: >99.9%; Performs RCS benchmark in <5 min (10²⁵ years classically); Logical qubits: Demonstrated below-threshold error correction | Quantum supremacy on practical tasks; Doubles coherence vs. physical qubits; Outpaces supercomputers by 10 septillion times on sampling.
3 | Zuchongzhi 3.0 | USTC (China) | Superconducting | 105 | 1-qubit fidelity: 99.90%; 2-qubit fidelity: 99.62%; Task speed: Seconds (5.9B years classically) | Rivals Willow in speed; Uses low-noise tantalum/niobium; 15×7 lattice for high connectivity; Major leap in raw performance.
4 | Nighthawk | IBM | Superconducting | 120 | Up to 5,000 two-qubit gates; 2-qubit fidelity: ~99.5%; QV: >2²⁰ (est.); Tunable couplers: 218+ | Path to 2026 quantum advantage; 20% more couplers than Heron; Real-time error decoding in <480 ns; Utility-scale molecular sims with Fugaku supercomputer.
5 | Forte Enterprise | IonQ | Trapped Ion | 36 (Tempo: 64 planned) | 2-qubit fidelity: 99.99% ("four 9s"); #AQ 36 (all-to-all connectivity); 20× performance gains in apps | World-record fidelity for error correction; Efficient logical qubits with fewer physical ones; Used in drug discovery and finance modeling.
6 | Ankaa-3 | Rigetti | Superconducting | 84+ (100+ modular by end-2025) | 2-qubit fidelity: 99.5%; Nanosecond gate speeds; Real-time error correction on 84 qubits | Fastest gate times (vs. microsecond rivals); Sold as QPUs to labs; Chiplet roadmap for utility-scale; 98% median fidelity in square lattice.
7 | Neutral Atom Array | QuEra Computing | Neutral Atom | 3,000 (planned; current: 256) | Logical qubits: 48 with 0.5% error (vs. IBM's 2.9%); High entanglement stability | Fault-tolerant leader; Outperforms Heron in error rates; Scales to 10,000 qubits by 2026; Apps in logistics and materials science.
8 | Majorana 1 | Microsoft (Azure Quantum) | Topological | ~24 logical (scalable to 1M) | Inherent low error rates; High-fidelity Majorana quasiparticles; Logical qubits: 12+ entangled | First topological processor; Million-qubit potential; Integrates with Quantinuum/Atom for chemistry/AI; Below-threshold errors.
9 | Advantage 2 | D-Wave | Quantum Annealing | 7,440 | 15-way connectivity; Quantum supremacy on real-world optimization | Fastest for optimization (e.g., logistics); Not gate-based, but 3-month free trials via Leap; Beats classical on QUBO problems.
10 | Kookaburra | IBM | Superconducting | 1,386 (multi-chip) | Enhanced coherence; High gate fidelity (~99%); Part of Heron R2 upgrades | Massive scale for 2025; Builds on Condor (1,121 qubits); Focus on error-corrected simulations; Quantum-centric supercomputing.

Classical vs Quantum Computer: pseudo-code views

This section strips away hardware details and shows, in pseudo‑code, how a conventional laptop and a quantum processor “feel” different as machines.

Conventional computer (motherboard + CPU + RAM)

// physical picture
// --------------------------------------------------
// motherboard: connects CPU, RAM, storage, peripherals
// CPU: executes instructions sequentially (with some parallelism)
// RAM: holds bits (0 or 1) for active programs

machine ClassicalComputer {
  Motherboard board;
  CPU         cpu;
  RAM         ram;
  Storage     disk;
}

// run a program
function run_classical(program P, input bits_in[]):
  // 1. load code and data into RAM
  ram.load(P.code)
  ram.load(bits_in)

  // 2. CPU executes instructions one by one
  while cpu.instruction_pointer not at END(P):
    instr = ram.fetch(cpu.instruction_pointer)
    cpu.execute(instr, ram)
    cpu.instruction_pointer++

  // 3. read out output bits from RAM
  bits_out = ram.read(P.output_region)
  return bits_out

Quantum processor (QPU + classical control computer)

// physical picture
// --------------------------------------------------
// classical control computer: compiles code, sends pulses
// quantum processing unit (QPU): array of qubits on a chip
// cryostat: keeps QPU near absolute zero

machine QuantumComputer {
  ClassicalControl ctrl;     // compiler, schedulers
  QuantumProcessingUnit qpu; // qubits + control lines
}

// high-level quantum run
function run_quantum(circuit C, classical_input bits_in[]):
  // 1. classical pre-processing
  //    e.g., encode bits_in into initial qubit states
  compiled_pulses = ctrl.compile(C, bits_in)

  // 2. upload pulse schedule to QPU
  qpu.load(compiled_pulses)

  // 3. apply quantum operations (unitaries)
  qpu.apply_pulses()
  // internally, each gate is a unitary U on |ψ>:
  //    |ψ_new> = U |ψ_old>

  // 4. measure qubits
  measurement_record = qpu.measure_all() // collapses |ψ> → classical bits

  // 5. classical post-processing
  result = ctrl.post_process(measurement_record)
  return result

Conceptually, the classical machine shuttles definite bits between RAM and CPU one instruction at a time, while the quantum machine prepares a state, evolves it through unitaries, and only produces classical bits at the final measurement.

Same abstract computation, two execution models

Imagine we want to compute the parity (even/odd) of \(N\) bits.

Classical laptop parity

function parity_classical(bits[]):
  acc = 0
  for b in bits:
    acc = acc XOR b
  return acc  // 0 = even, 1 = odd

Quantum parity sketch (conceptual)

function parity_quantum(bits[]):
  // 1. encode bits into computational basis states
  //    |b_1 b_2 ... b_N>
  |ψ> = |b_1 b_2 ... b_N>

  // 2. use a circuit of CNOTs to copy global parity into an ancilla qubit
  //    |ψ, 0>  →  |ψ, parity(bits)>
  for i in 1..N:
    CNOT(control = qubit_i, target = ancilla)

  // 3. measure only the ancilla qubit
  result = measure(ancilla)
  return result  // 0 = even, 1 = odd

On such a small task, the quantum route is not “better” than the classical one—it is just a different physical implementation of the same logical function. Real quantum advantage typically appears in problems that exploit superposition and interference over huge state spaces.

Deep dive: complexity classes BPP vs BQP

If you like big‑picture theory, the usual cartoon is:

  • BPP (“bounded‑error probabilistic polynomial time”): problems efficiently solvable by a classical computer that can flip random bits, with error probability \(< 1/3\) (or any fixed constant < 1/2).
  • BQP (“bounded‑error quantum polynomial time”): problems efficiently solvable by a quantum computer, again with error probability \(< 1/3\).

Formally, a language \(L\) is in BQP if there is a family of quantum circuits \(\{C_n\}\) of size polynomial in \(n\) such that for all inputs \(x\) of length \(n\):

  • If \(x \in L\), then \(\Pr[C_n(x) = 1] \ge 2/3\).
  • If \(x \notin L\), then \(\Pr[C_n(x) = 1] \le 1/3\).

We know that \(\text{BPP} \subseteq \text{BQP}\) (quantum can simulate classical randomness), but we do not know whether \(\text{BPP} = \text{BQP}\) or \(\text{BPP} \subsetneq \text{BQP}\). Evidence from Shor’s algorithm and other candidates strongly suggests that \(\text{BQP}\) is strictly more powerful than \(\text{BPP}\) for some natural problems (like factoring large integers), but there is no proof yet.

Hardware analogy table: consumer laptop vs quantum processor

The numbers below are intentionally rounded, order‑of‑magnitude style, to give intuition. Real devices vary wildly; treat these as cartoon benchmarks, not spec sheets.

Device | Rough scale | Operation type | Throughput (back-of-envelope) | Notes
Consumer laptop CPU (1 core) | ~3 GHz clock | Classical bit ops | ~3×10⁹ simple ops/s | One scalar instruction stream; vector units add more parallelism.
Consumer laptop (8 cores, SIMD) | 8 cores × 3 GHz × 128-bit SIMD | Classical bit/word ops | ~10¹¹–10¹² basic ops/s | Depends heavily on workload and vectorization.
Mid-range GPU in a laptop | ~10³–10⁴ cores | Classical float ops | ~10¹²–10¹³ FLOP/s peak | Great for dense linear algebra and ML training.
Small noisy quantum processor | ~50–100 physical qubits | Quantum gates on a 2^N-dimensional state | ~10³–10⁴ 2-qubit gates per run before noise | Each gate acts on amplitudes of 2^N basis states in superposition.
Hypothetical fault-tolerant QPU | ~10⁶ physical qubits → 10³ logical | Error-corrected quantum circuits | ~10⁹–10¹² logical gates over long algorithms | Aimed at large chemistry, optimization, or cryptographic tasks.

"How many laptops equal one quantum processor?" (cartoon answers)

The trick is that you cannot fairly compare them just by counting operations per second, because a quantum gate on \(N\) qubits simultaneously transforms amplitudes across \(2^N\) basis states. Still, you can get a feeling from the two worked examples below.

Example 1: simulating 30 qubits on laptops

Storing a generic 30-qubit state on a classical machine means tracking about \(2^{30} \approx 10^9\) complex amplitudes, roughly 16 GB at double precision, and every gate application must sweep that whole array. A small physical QPU with 30–40 qubits applies the same logical layer directly in hardware; it is as if you had a cluster of hundreds to thousands of laptops working together to update all amplitudes every gate.

Example 2: 50‑qubit state

A generic 50-qubit state has \(2^{50} \approx 10^{15}\) amplitudes, on the order of 16 PB of raw memory, far beyond any single machine. So for generic 50-qubit circuits, one real QPU is morally comparable to an enormous classical cluster. For structured problems, clever classical algorithms can do much better—that’s why the “how many laptops” question has no single honest number.

Very rough equivalence table (for mental models only)

Quantum device (noisy) | State size 2^N | Naive RAM to store state | Approx. # of 16 GB laptops for RAM only
20-qubit QPU | ~10⁶ | ~16 MB | < 1 laptop (fits easily)
30-qubit QPU | ~10⁹ | ~16 GB | ~1 laptop (RAM is tight but possible)
40-qubit QPU | ~10¹² | ~16 TB | ~1000 laptops (each 16 GB)
50-qubit QPU | ~10¹⁵ | ~16 PB | ~1,000,000 laptops (each 16 GB)

This table only matches memory capacity, not speed. But it gives a rough intuition: once you pass ~40–50 entangled qubits in a generic circuit, brute‑force classical simulation starts looking like “you’d need an absurd number of consumer laptops.”
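The memory column can be recomputed in a couple of lines, assuming 16 bytes per complex amplitude (double-precision complex):

# RAM needed for a generic N-qubit statevector: 2^N amplitudes * 16 bytes (complex128).
def statevector_bytes(n_qubits: int) -> int:
    return (2 ** n_qubits) * 16

for n in (20, 30, 40, 50):
    b = statevector_bytes(n)
    laptops = b / (16 * 2 ** 30)            # how many 16 GB laptops just to hold it
    print(f"{n} qubits: {b / 2**30:,.3f} GiB  ≈ {laptops:,.1f} laptops of RAM")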

SHA-256 password hashing examples

The table below shows a few simple example passwords and their corresponding SHA-256 hash values. These are illustrative only; never hard‑code real passwords or reuse simple patterns like these in production.

Example password | SHA-256 hash (hex)
password | 5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8
123456 | 8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92
correcthorsebatterystaple | cbe6beb26479b568e5f15b50217c6c83c0ee051dc4e522b9840d8e291d6aaf46
QuantumR0cks! | a7c2a1b4b785bb7b39d6b26bafdc6f919aa0cc47a18eb2f6559bf0369a671a7a

Hashes above were computed with standard SHA-256; you can reproduce them using tools like sha256sum, openssl dgst -sha256, Python’s hashlib.sha256, or any reputable online hash calculator.
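To reproduce the table yourself with the standard library (hashes depend only on the exact byte string, so watch for stray whitespace or newlines):

# Recompute the SHA-256 hashes from the table with Python's standard library.
import hashlib

for pw in ["password", "123456", "correcthorsebatterystaple", "QuantumR0cks!"]:
    digest = hashlib.sha256(pw.encode("utf-8")).hexdigest()
    print(f"{pw:28s} {digest}")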

How long would it take to crack these passwords?

The table below uses deliberately rough, order‑of‑magnitude estimates to show how different passwords fare against brute‑force search on classical hardware and on a future, optimistic quantum attacker using Grover-style speedups. The point is not the exact numbers, but the contrast between weak and strong secrets—and why multi‑factor authentication (MFA) is so valuable.

Password (example only) | SHA-256 hash (first 16 hex chars) | Approx. search space | Brute-force time, classical attacker (10¹² guesses/sec) | Brute-force time, quantum attacker (10¹⁸ "effective" guesses/sec with Grover)
123456 (6 digits, only numbers) | 8d969eef6ecad3c2… | 10⁶ ≈ 1,000,000 | ≈ 10⁻⁶ s (microseconds) | Effectively instant
password (8 lower-case letters) | 5e884898da280471… | 26⁸ ≈ 2×10¹¹ | ≈ 0.2 s (a fraction of a second) | Grover over the full space makes it even easier – effectively trivial
correcthorsebatterystaple (~25 lower-case letters; if chosen as a 4-word combo from 10⁴ words ⇒ ~10¹⁶ possibilities) | cbe6beb26479b568… | ≈ 10¹⁶ (for a simple wordlist model) | ≈ 10⁴ s, i.e. a few hours with a strong classical rig and a very good wordlist | Grover-style speedup over 10¹⁶ gives √10¹⁶ = 10⁸ steps; at 10¹⁸ "effective" ops/s ⇒ ~10⁻¹⁰ s, but this ignores huge practical overheads. Still: quantum helps.
S7q!vP9$Lm@2 (12 chars, ~72-char alphabet: upper/lower/digits/symbols) | 4f9c2b1a7d8e6c5b… (example) | 72¹² ≈ 2×10²² | ≈ 2×10¹⁰ seconds, i.e. 600+ years | Grover reduces the work to √(2×10²²) ≈ 1.4×10¹¹ steps; at 10¹⁸/s ⇒ ≈ 10⁻⁷ s in the toy model, but building such a fault-tolerant quantum machine for real-world hash cracking is far beyond current tech.

Assumptions are deliberately aggressive in favor of the attacker and ignore I/O, memory, and protocol defenses; they’re meant as back‑of‑the‑envelope illustrations, not operational guarantees. Also, SHA‑256 is used here in isolation; real systems should wrap it in slow, memory‑hard KDFs such as bcrypt, scrypt, or Argon2.
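The timing columns follow from two divisions; this sketch uses the same assumed guess rates as the table (10¹² classical, 10¹⁸ "effective" Grover-style) and ignores every practical overhead:

# Back-of-the-envelope brute-force timings under the table's assumed guess rates.
CLASSICAL_RATE = 1e12      # guesses per second (assumed)
QUANTUM_RATE = 1e18        # "effective" guesses per second with Grover (assumed)

def crack_times(search_space: float) -> tuple[float, float]:
    classical = search_space / CLASSICAL_RATE
    grover = (search_space ** 0.5) / QUANTUM_RATE   # Grover: ~sqrt(N) evaluations
    return classical, grover

cases = [("6 digits", 10 ** 6), ("8 lower-case letters", 26 ** 8),
         ("4-word passphrase", 10 ** 16), ("12 chars, 72-symbol alphabet", 72 ** 12)]
for label, space in cases:
    c_time, q_time = crack_times(space)
    print(f"{label:30s} classical ≈ {c_time:.3g} s   Grover-style ≈ {q_time:.3g} s")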

Why MFA (multi‑factor authentication) still matters

Passwords—no matter how long—are only one factor: something you know. MFA adds at least one extra, independent factor, such as:

  • Something you have: a hardware security key (FIDO2/U2F), phone‑based OTP app, or smart card.
  • Something you are: biometrics like fingerprint or face unlock (ideally used as a local unlock for a hardware token, not sent to servers).

Even if an attacker fully recovers your password hash and somehow inverts it, they still need to satisfy the other factor in real time. Concretely, MFA helps block:

  • Password database leaks: A stolen hash alone won’t log in without your second factor.
  • Credential stuffing: Re‑used passwords sprayed across many sites are far less useful when those sites enforce MFA.
  • Quantum‑assisted guessing: If powerful quantum machines eventually make certain brute‑force attacks cheaper, MFA still forces the attacker to compromise a device or biometric, not just a string.

In practice, combining a strong, unique password (or passphrase) with hardware‑token‑based MFA gives you a large safety margin against both classical and foreseeable quantum brute‑force attacks.