Atlas · Methodology

How the verdict is produced.

The claim "this circuit is classically tractable" is only as good as the method that backs it. Atlas does not predict simulability from features: it measures it four independent ways and certifies a circuit only when independent methods agree. That is how "you don't need the QPU" becomes a constructive, checkable result instead of a guess. Here is the full mechanism, including where it abstains.

Atlas (Krenn·IQ) · the methodology behind every result on the Evidence page

1 Four independent estimators

Each estimator measures the same circuit from a different mathematical space. Independence is the point: agreement among methods built on different mathematics is corroboration a single tool cannot fake.

Stim · #T / stabilizer

Magic (non-stabilizerness)

Counts the non-Clifford (T-gate) budget. A pure stabilizer circuit is classically simulable at any size; magic is what can push a circuit past the classical frontier.

Space: the stabilizer polytope · Clifford algebra
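As a rough illustration of the non-Clifford budget described above, the sketch below counts gates outside a fixed Clifford set. The `(name, qubits)` gate format, the `CLIFFORD` set, and the function name are assumptions made for this example; it does not call Stim and is not Atlas's implementation.

```python
from collections import Counter

# Hypothetical gate format for illustration: (name, qubits).
# Gate names treated as Clifford in this sketch (an assumed, non-exhaustive set).
CLIFFORD = {"H", "S", "S_DAG", "X", "Y", "Z", "CX", "CZ", "SWAP"}


def non_clifford_budget(circuit):
    """Count gates outside the Clifford set, grouped by gate name."""
    return Counter(name for name, _ in circuit if name not in CLIFFORD)


circuit = [
    ("H", (0,)),
    ("CX", (0, 1)),
    ("T", (1,)),
    ("CX", (1, 2)),
    ("T", (2,)),
    ("S", (0,)),
]

budget = non_clifford_budget(circuit)
t_count = budget["T"]                      # number of T gates
is_stabilizer = sum(budget.values()) == 0  # True only for a pure Clifford circuit
```

A circuit with an empty budget is a pure stabilizer circuit in the sense used above; a nonzero budget is the "magic" this estimator tracks.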

quimb · MPS bond

Entanglement / bond dimension

Measures the matrix-product-state bond needed to represent the state. Area-law / structured circuits stay cheap; volume-law entanglement makes the bond blow up.

Space: tensor networks · Schmidt rank
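To make "bond needed to represent the state" concrete, the sketch below computes the Schmidt rank of a small state vector across one cut, which is the exact bond dimension an MPS needs at that cut. It is a standard-library toy, assuming big-endian qubit ordering (qubit 0 is the most significant bit) and real amplitudes; the function name is hypothetical, and a production tool would more likely use an SVD through a library such as quimb.

```python
import math


def schmidt_rank(amplitudes, n_qubits, cut, tol=1e-12):
    """Rank of the amplitude matrix across the cut (first `cut` qubits | rest)."""
    rows, cols = 2 ** cut, 2 ** (n_qubits - cut)
    # Reshape the flat state vector into a rows x cols matrix.
    m = [list(amplitudes[r * cols:(r + 1) * cols]) for r in range(rows)]
    rank = 0
    for c in range(cols):
        # Partial pivoting: largest entry in column c among unreduced rows.
        pivot = max(range(rank, rows), key=lambda r: abs(m[r][c]), default=None)
        if pivot is None or abs(m[pivot][c]) <= tol:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, rows):
            f = m[r][c] / m[rank][c]
            for k in range(c, cols):
                m[r][k] -= f * m[rank][k]
        rank += 1
    return rank


# Three 4-qubit states, cut between qubits (0, 1) and (2, 3).
product = [0.0] * 16
product[0] = 1.0                                # |0000>

ghz = [0.0] * 16
ghz[0] = ghz[15] = 1 / math.sqrt(2)             # (|0000> + |1111>) / sqrt(2)

bell_pairs = [0.0] * 16                         # Bell pairs on (0, 2) and (1, 3)
for a in (0, 1):
    for b in (0, 1):
        bell_pairs[a * 8 + b * 4 + a * 2 + b] = 0.5

ranks = [schmidt_rank(s, 4, 2) for s in (product, ghz, bell_pairs)]
```

The product state needs bond 1, the GHZ state bond 2, and the two Bell pairs straddling the cut need bond 4, the maximum for this cut. That is the small-scale version of entanglement making the bond grow.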

cotengra · treewidth

Contraction complexity

Measures the treewidth of the circuit's interaction graph — the cost of contracting it as a tensor network. Low-treewidth topologies contract cheaply regardless of depth.

Space: graph theory · circuit topology
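The idea can be sketched with a qubit-interaction graph (one node per qubit, one edge per pair of qubits that share a two-qubit gate) and the classic min-degree elimination heuristic, which yields an upper bound on treewidth. Both helper names are hypothetical; this is a toy under those assumptions, not how cotengra searches for contraction paths.

```python
def interaction_graph(n_qubits, two_qubit_gates):
    """Adjacency sets: qubits are nodes, each two-qubit gate adds an edge."""
    adj = {q: set() for q in range(n_qubits)}
    for a, b in two_qubit_gates:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def treewidth_upper_bound(adj):
    """Min-degree elimination heuristic: returns an upper bound on treewidth."""
    adj = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    width = 0
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))  # eliminate a min-degree node
        nbrs = adj.pop(v)
        width = max(width, len(nbrs))
        for a in nbrs:
            adj[a].discard(v)
            adj[a] |= nbrs - {a}                 # connect v's neighbours
    return width


chain = interaction_graph(6, [(i, i + 1) for i in range(5)])
ring = interaction_graph(6, [(i, (i + 1) % 6) for i in range(6)])
all_to_all = interaction_graph(5, [(i, j) for i in range(5) for j in range(i + 1, 5)])

widths = [treewidth_upper_bound(g) for g in (chain, ring, all_to_all)]
```

Repeating a gate on the same pair of qubits adds no new edge, so in this simplified graph the bound depends on connectivity rather than on how many layers are applied.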

Pauli-spread

Operator locality

Tracks how a local operator spreads under the circuit (scrambling). Slow spread means locality is preserved and classical methods stay tractable.

Space: Heisenberg picture · operator growth
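A minimal sketch of operator spreading, restricted to Clifford circuits built from H and CX: a Pauli operator is stored as X and Z bit vectors, conjugated gate by gate in reverse circuit order (Heisenberg picture), and its support size is recorded after each step. Signs are ignored, the gate format and function name are assumptions for this example, and non-Clifford gates, which would split the operator into several Pauli terms, are deliberately out of scope.

```python
def pauli_support_history(n_qubits, start_qubit, gates):
    """Support size of Z on `start_qubit` as it is conjugated through `gates`.

    Gates are (name, qubits) with name in {"H", "CX"}; they are applied in
    reverse circuit order, as Heisenberg-picture evolution requires.
    """
    x = [0] * n_qubits
    z = [0] * n_qubits
    z[start_qubit] = 1                      # start from a single-qubit Z
    history = [1]
    for name, qubits in reversed(gates):
        if name == "H":
            (q,) = qubits
            x[q], z[q] = z[q], x[q]         # H swaps X and Z
        elif name == "CX":
            c, t = qubits
            x[t] ^= x[c]                    # X on control copies to target
            z[c] ^= z[t]                    # Z on target copies to control
        else:
            raise ValueError(f"unsupported gate in this sketch: {name}")
        history.append(sum(1 for a, b in zip(x, z) if a or b))
    return history


# Z on qubit 0 is first turned into X by H, then spreads along the CX ladder.
spreading = pauli_support_history(3, 0, [("CX", (1, 2)), ("CX", (0, 1)), ("H", (0,))])

# Z on a CX control never moves, so the operator stays local.
local = pauli_support_history(3, 0, [("CX", (0, 1)), ("CX", (1, 2))])
```

In the first circuit the support grows from one qubit to three; in the second it stays at one, which is the "slow spread" case described above.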

These are open state-of-the-art engines (Stim, quimb, cotengra) plus a measured operator-spread signal. Atlas's contribution is not the engines; it is the layer that runs them as independent witnesses and adjudicates the result. Reproduce: route_adjudicator.py.

2 The Certificate — agreement as convergent validity

When ≥2 independent folds agree "cheap", that is corroboration, not a single prediction. Atlas turns that agreement into a signed, hash-stamped certificate with five levels.

Level	Agreement	Meaning
STRONG	≥2 folds cheap, 0 hard	classically simulable — convergent validity
FIRM	classical majority, named dissenter	simulable, dissent documented
WEAK	a single fold carries it	provisional, thin evidence
SPLIT	even split	on the frontier — the intractable region made observable
NULL	no fold cheap	QPU-required
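One way to read the table above as a decision rule is sketched below. The fold names, the "cheap" / "hard" / "abstain" vote strings, and the order in which the rules are checked are assumptions made for illustration; the table does not fix tie-breaking details, and this is not the logic in `route_adjudicator.py`.

```python
def certificate_level(votes):
    """Map per-fold votes ("cheap", "hard", or "abstain") to a level name."""
    cheap = sum(1 for v in votes.values() if v == "cheap")
    hard = sum(1 for v in votes.values() if v == "hard")
    if cheap == 0:
        return "NULL"      # no fold cheap
    if hard == 0 and cheap >= 2:
        return "STRONG"    # at least two cheap, none hard
    if cheap == hard:
        return "SPLIT"     # even split
    if cheap > hard and hard >= 1:
        return "FIRM"      # classical majority with at least one dissenter
    return "WEAK"          # a single cheap fold carries the verdict


example = {
    "magic": "cheap",
    "mps_bond": "hard",
    "treewidth": "cheap",
    "pauli_spread": "cheap",
}
level = certificate_level(example)
```

With four folds and these rules, every combination of cheap and hard votes lands on exactly one of the five levels; abstentions count toward neither side.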

Each certificate carries a SHA-256 content hash of the circuit, the per-fold signers (axis · cost · vote), the level, and the engine's actual route: an archivable, citable audit artifact, not a black-box verdict. Soundness battery: as the compute budget tightens, the level degrades STRONG → SPLIT → WEAK exactly as the independent folds begin to disagree, and no STRONG certificate is ever issued on a hard circuit (0 false-STRONG).
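The shape of such a record can be sketched with the standard library alone. The field names, the circuit text format, and the signer values below are placeholders invented for this example, and the sketch covers only the content hash, not the signature; the real artifact comes from the `certificate` call referenced in this section.

```python
import hashlib
import json


def content_hash(circuit_text):
    """SHA-256 hex digest of the circuit's text representation."""
    return hashlib.sha256(circuit_text.encode("utf-8")).hexdigest()


def build_certificate(circuit_text, signers, level, route):
    """Assemble a record with the fields the text lists (names are assumed)."""
    return {
        "circuit_sha256": content_hash(circuit_text),
        "signers": signers,   # one entry per fold: axis, cost, vote
        "level": level,
        "route": route,
    }


def matches_circuit(cert, circuit_text):
    """Check that a certificate was issued for exactly this circuit text."""
    return cert["circuit_sha256"] == content_hash(circuit_text)


circuit_text = "H 0\nCX 0 1\n"  # placeholder circuit text
signers = [                     # placeholder values, not measured data
    {"axis": "magic", "cost": 0, "vote": "cheap"},
    {"axis": "treewidth", "cost": 1, "vote": "cheap"},
]
cert = build_certificate(circuit_text, signers, "STRONG", "stabilizer")

# Deterministic serialisation, suitable for archiving alongside the circuit.
archived = json.dumps(cert, sort_keys=True)
```

Because the hash is computed over the circuit content, changing the circuit in any way makes `matches_circuit` return `False`, which is what lets a third party check that a certificate belongs to the circuit it is cited for.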

Reproduce: physics_magnitude_lab.certificate.certificate(n, circuit, budget_log2=30) → level + signers + hash. Soundness: scripts/certificate_validate.py → certificate_validation.json. Full discussion: Evidence §10.

3 The Convergence Map — disagreement made observable

The same call that certifies agreement also names which fold dissents, and in which direction. A SPLIT is not noise — it is the classically-intractable frontier (Leone-region) becoming visible.

When the treewidth axis and the MPS-bond axis disagree, that divergence is not a failure of the tool — it is a measurement of where the cheap structural proxies and the entanglement proxy stop agreeing about the same circuit. Atlas surfaces that divergence as a map rather than hiding it inside a single score. The disagreement tells a researcher exactly where the circuit sits on the simulability frontier, and which resource (magic, entanglement, topology, locality) is the one pushing it across. Two faces, one call: agreement certifies; disagreement maps the frontier.
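A minimal sketch of "naming the dissenter", assuming per-fold votes arrive as a dictionary of hypothetical fold names mapped to "cheap" or "hard". It only reports which folds sit on which side; it does not attempt the treewidth versus MPS divergence metric, which this page describes as still open.

```python
def convergence_map(votes):
    """Report both sides of the vote and name the folds that dissent."""
    cheap = sorted(f for f, v in votes.items() if v == "cheap")
    hard = sorted(f for f, v in votes.items() if v == "hard")
    if len(cheap) == len(hard):
        majority, dissenters = None, {}   # even split: no side to dissent from
    elif len(cheap) > len(hard):
        majority, dissenters = "cheap", {f: "hard" for f in hard}
    else:
        majority, dissenters = "hard", {f: "cheap" for f in cheap}
    return {"cheap": cheap, "hard": hard, "majority": majority, "dissenters": dissenters}


one_dissenter = convergence_map({
    "magic": "cheap",
    "mps_bond": "hard",
    "treewidth": "cheap",
    "pauli_spread": "cheap",
})

frontier = convergence_map({
    "magic": "hard",
    "mps_bond": "hard",
    "treewidth": "cheap",
    "pauli_spread": "cheap",
})
```

In the first case the map names `mps_bond` as the fold pulling toward "hard"; in the second there is no majority, and the two lists show which resources sit on each side of the frontier.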

This is the publishable core — formalising the treewidth↔MPS divergence as a central metric, with concordance against the noise-bound theory of Shao et al. (arXiv:2606.00474). Roadmap-tracked, not yet a closed result.

4 Five honesty principles

The architecture is constrained, by design, so that the failure modes that make confidence scores dangerous cannot occur.

No false-STRONG. A STRONG certificate is never issued on a hard circuit. Soundness is validated as a permanent battery (0 false-STRONG); the dangerous failure — routing an expensive circuit as cheap — is the one Atlas is built to never make.
Honest abstention = MEDIUM. Deciding exact stabilizer membership is provably super-exponential (Leone et al., arXiv:2602.22330). When the evidence splits, Atlas returns a calibrated MEDIUM: that is the theoretical ceiling speaking, not a bug. An exact classifier cannot exist, so feigning one would be the lie.
No black box — an Evidence Ledger. Every verdict ships with the per-estimator signers, costs, votes, and a content hash. The reasoning is re-derivable from named scripts and data; nothing is asserted that cannot be reproduced.
No lying translation. The plain-language layer never states more than the estimators measured. Natural-language summaries are constrained to the ledger — no constructive hallucination beyond what the numbers license.
No anonymity. Every certificate is signed and attributable; the operator and the engine version are on the record. A rating no one stands behind is worth zero.

5 Field context — the four papers, cited honestly

Pre-flight simulability triage went from folklore to an active research topic in mid-2026. Atlas is the measured, multi-engine, QPU-validated point in that space — complementary to the predict-from-features and pure-theory work. We cite this work because honesty about the landscape is the point; Atlas is first as a running product, not first to ask the question.

Work (2026)	What it does	How Atlas relates
Leone, Eisert & Oliviero — `arXiv:2602.22330`	Proves deciding exact stabilizer membership is super-exponential (Ω(2^(n²)) under ETH)	The theoretical basis for why MEDIUM is the honest answer — an exact classifier provably cannot exist
Xing et al. — `arXiv:2606.11620`	Family-aware ML that predicts the MPS-bond threshold from static gate features (~50 ms)	Atlas measures the exact bond/treewidth — a predicted threshold can be wrong (false-security risk); a measured one can't
Del Rey et al. — `arXiv:2605.28986`	Studies T-count & MPS bond as control variables for learning simulability	Atlas uses the same two as decision variables, cross-validated against Stim, in a verdict
Shao et al. — `arXiv:2606.00474`	Pure theory: when a polynomial TN bond suffices under noise (no tool)	Atlas is the running implementation; the Convergence Map aims to be measured against this bound

An independent deep review (primary sources) confirmed all four citations are real and accurately characterised, and judged the self-positioning honest. Full landscape: Evidence §6.
