The claim "this circuit is classically tractable" is worth only as much as the method that backs it. Atlas does not predict simulability from features: it measures it, four independent ways, and certifies the case only when independent methods agree. That is how "you don't need the QPU" becomes a constructive, checkable result instead of a guess. Here is the full mechanism, including where it abstains.
Each estimator measures the same circuit from a different mathematical space. Independence is the point: agreement among methods built on different mathematics is corroboration a single tool cannot fake.
Magic: counts the non-Clifford (T-gate) budget. A pure stabilizer circuit is classically simulable at any size; magic is what can push a circuit past the classical frontier.
Entanglement: measures the matrix-product-state (MPS) bond dimension needed to represent the state. Area-law and otherwise structured circuits stay cheap; volume-law entanglement makes the bond dimension grow exponentially.
Topology: measures the treewidth of the circuit's interaction graph, which sets the cost of contracting it as a tensor network. Low-treewidth topologies contract cheaply regardless of depth.
Locality: tracks how a local operator spreads under the circuit (scrambling). Slow spread means locality is preserved and classical methods stay tractable.
These are open state-of-the-art engines (Stim, quimb, cotengra) plus a measured operator-spread signal. Atlas's contribution is not the engines; it is the layer that runs them as independent witnesses and adjudicates the result. Reproduce: route_adjudicator.py.
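
To make two of the folds concrete, here is a minimal pure-Python sketch of a T-count (magic) measurement and a light-cone bound on operator spread (locality). The gate-list format and the function names `t_count` and `light_cone` are assumptions for illustration only; they are not Atlas's API, and the real engines (Stim, quimb, cotengra) do far more than this toy.

```python
from typing import List, Set, Tuple

# A gate is (name, qubits). This toy format is an assumption, not Atlas's input format.
Gate = Tuple[str, Tuple[int, ...]]

NON_CLIFFORD = {"T", "TDG"}


def t_count(circuit: List[Gate]) -> int:
    """Magic fold (toy): count the non-Clifford T-type gates."""
    return sum(1 for name, _ in circuit if name in NON_CLIFFORD)


def light_cone(circuit: List[Gate], start: int) -> Set[int]:
    """Locality fold (toy): upper bound on the support of an operator that
    starts on qubit `start`, grown gate by gate in circuit order.
    A multi-qubit gate that touches the support absorbs all of its qubits."""
    support = {start}
    for _, qubits in circuit:
        if len(qubits) > 1 and support & set(qubits):
            support |= set(qubits)
    return support


circuit: List[Gate] = [
    ("H", (0,)),
    ("CX", (0, 1)),
    ("T", (1,)),
    ("CX", (1, 2)),
    ("T", (2,)),
    ("CX", (3, 4)),
]

print(t_count(circuit))                # 2
print(sorted(light_cone(circuit, 0)))  # [0, 1, 2]
```

The light cone is only an upper bound on spreading: it says where an operator could have reached, not how much weight actually arrived there, so a measured scrambling signal can be much tighter.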
When ≥2 independent folds agree on "cheap", that is corroboration, not a single prediction. Atlas turns that agreement into a signed, hash-stamped certificate graded at one of five levels.
| Level | Agreement | Meaning |
|---|---|---|
| STRONG | ≥2 folds cheap, 0 hard | classically simulable — convergent validity |
| FIRM | classical majority, named dissenter | simulable, dissent documented |
| WEAK | a single fold carries it | provisional, thin evidence |
| SPLIT | even split | on the frontier — the intractable region made observable |
| NULL | no fold cheap | QPU-required |
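
The table can be read as a small decision procedure over per-fold votes. The sketch below is one possible reading, not the logic in route_adjudicator.py: the function name `adjudicate`, the vote strings, the fold names, and the handling of cases the table leaves open (abstentions, and a cheap minority against a hard majority) are all assumptions.

```python
from typing import Dict, List, Tuple


def adjudicate(votes: Dict[str, str]) -> Tuple[str, List[str]]:
    """Map per-fold votes ('cheap', 'hard', 'abstain') to a certificate level.

    Returns (level, dissenters). Abstentions are ignored when counting.
    The order of the checks below is an assumption, not taken from Atlas.
    """
    cheap = [fold for fold, v in votes.items() if v == "cheap"]
    hard = [fold for fold, v in votes.items() if v == "hard"]

    if not cheap:
        return "NULL", []                      # no fold cheap
    if not hard:
        # >=2 cheap and 0 hard is STRONG; a lone cheap vote is WEAK.
        return ("STRONG" if len(cheap) >= 2 else "WEAK"), []
    if len(cheap) == len(hard):
        return "SPLIT", []                     # even split
    if len(cheap) > len(hard):
        return "FIRM", hard                    # majority cheap, dissent named
    # Assumption: a cheap minority against a hard majority is treated as WEAK.
    return "WEAK", hard


votes = {
    "magic": "cheap",
    "entanglement": "cheap",
    "topology": "cheap",
    "locality": "hard",
}
print(adjudicate(votes))  # ('FIRM', ['locality'])
```

Returning the dissenting folds alongside the level keeps the "named dissenter" part of FIRM explicit instead of folding it into a single score.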
Each certificate carries a SHA-256 content hash of the circuit, the per-fold signers (axis · cost · vote), the level, and the engine's actual route: an archivable, citable audit artifact, not a black-box verdict. Soundness battery: as the compute budget tightens, the level degrades STRONG → SPLIT → WEAK exactly as the independent folds begin to disagree, and no false-STRONG certificate is issued on a hard verdict.
Reproduce: physics_magnitude_lab.certificate.certificate(n, circuit, budget_log2=30) → level + signers + hash. Soundness: scripts/certificate_validate.py → certificate_validation.json. Full discussion: Evidence §10.
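
For readers who want to see what a content-hashed certificate might look like, here is a minimal sketch using only the standard library. It is not the output format of physics_magnitude_lab.certificate.certificate: the helper names `content_hash` and `make_certificate`, the JSON canonicalisation, the field names, and the example values are all hypothetical, and the sketch covers the hash only, not the signature.

```python
import hashlib
import json
from typing import Any, Dict, List


def content_hash(n: int, circuit: List[List[Any]]) -> str:
    """SHA-256 over a canonical JSON encoding of the circuit.

    The encoding (sorted keys, compact separators) is an assumption; any
    fixed, deterministic serialisation would serve the same purpose.
    """
    canonical = json.dumps(
        {"n": n, "circuit": circuit}, sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_certificate(
    n: int,
    circuit: List[List[Any]],
    signers: List[Dict[str, Any]],
    level: str,
    route: str,
) -> Dict[str, Any]:
    """Bundle the hash, per-fold signers, level, and route (hypothetical layout)."""
    return {
        "hash": content_hash(n, circuit),
        "signers": signers,  # one entry per fold: axis, cost, vote
        "level": level,
        "route": route,
    }


circuit = [["H", [0]], ["CX", [0, 1]], ["T", [1]]]
signers = [
    {"axis": "magic", "cost_log2": 1.0, "vote": "cheap"},      # illustrative values
    {"axis": "topology", "cost_log2": 2.0, "vote": "cheap"},   # illustrative values
]
cert = make_certificate(2, circuit, signers, "STRONG", "stabilizer")
print(len(cert["hash"]))  # 64
```

Because the hash is computed over the circuit content rather than a filename, two parties holding the same circuit can recompute it independently and check that a certificate refers to the circuit they think it does.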
The same call that certifies agreement also names which fold dissents, and in which direction. A SPLIT is not noise — it is the classically-intractable frontier (Leone-region) becoming visible.
When the treewidth axis and the MPS-bond axis disagree, that divergence is not a failure of the tool — it is a measurement of where the cheap structural proxies and the entanglement proxy stop agreeing about the same circuit. Atlas surfaces that divergence as a map rather than hiding it inside a single score. The disagreement tells a researcher exactly where the circuit sits on the simulability frontier, and which resource (magic, entanglement, topology, locality) is the one pushing it across. Two faces, one call: agreement certifies; disagreement maps the frontier.
This is the publishable core — formalising the treewidth↔MPS divergence as a central metric, with concordance against the noise-bound theory of Shao et al. (arXiv:2606.00474). Roadmap-tracked, not yet a closed result.
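
One way to picture the treewidth versus MPS divergence is as a per-circuit gap between the two folds' cost estimates, flagged wherever they land on opposite sides of the budget. The sketch below is only an illustration under stated assumptions: the source says the metric is not yet formalised, so the function `divergence_map`, the gap definition, the circuit names, and the cost numbers are all made up for the example and are not measurements. The default `budget_log2=30` mirrors the value shown in the reproduce call.

```python
from typing import Dict, List, Tuple


def divergence_map(
    costs_log2: Dict[str, Tuple[float, float]],
    budget_log2: float = 30.0,
) -> List[dict]:
    """For each circuit, compare (treewidth cost, MPS cost) in log2 units.

    A circuit is flagged as 'frontier' when one fold votes cheap and the
    other votes hard at the given budget. The gap definition is hypothetical.
    """
    rows = []
    for name, (tw, mps) in costs_log2.items():
        tw_vote = "cheap" if tw <= budget_log2 else "hard"
        mps_vote = "cheap" if mps <= budget_log2 else "hard"
        rows.append(
            {
                "circuit": name,
                "treewidth_vote": tw_vote,
                "mps_vote": mps_vote,
                "gap_log2": tw - mps,          # signed: negative means MPS is costlier
                "frontier": tw_vote != mps_vote,
            }
        )
    return rows


# Illustrative numbers only, not measured data.
costs = {
    "ladder": (12.0, 14.0),
    "brickwork_deep": (22.0, 41.0),
    "random_dense": (55.0, 60.0),
}
rows = divergence_map(costs)
print([r["circuit"] for r in rows if r["frontier"]])  # ['brickwork_deep']
```

Keeping the gap signed records the direction of the disagreement, which is the information the text says a researcher needs in order to see which resource is pushing the circuit across.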
The architecture is constrained, by design, so that the failure modes that make confidence scores dangerous cannot occur.
Pre-flight simulability triage went from folklore to an active research topic in mid-2026. Atlas is the measured, multi-engine, QPU-validated point in that space — complementary to the predict-from-features and pure-theory work. We cite this work because honesty about the landscape is the point; Atlas is first as a running product, not first to ask the question.
| Work (2026) | What it does | How Atlas relates |
|---|---|---|
| Leone, Eisert & Oliviero, arXiv:2602.22330 | Proves deciding exact stabilizer membership is super-exponential (Ω(2^(n²)) under ETH) | The theoretical basis for why MEDIUM is the honest answer: an exact classifier provably cannot exist |
| Xing et al., arXiv:2606.11620 | Family-aware ML that predicts the MPS-bond threshold from static gate features (~50 ms) | Atlas measures the exact bond/treewidth: a predicted threshold can be wrong (false-security risk); a measured one can't |
| Del Rey et al., arXiv:2605.28986 | Studies T-count & MPS bond as control variables for learning simulability | Atlas uses the same two as decision variables, cross-validated against Stim, in a verdict |
| Shao et al., arXiv:2606.00474 | Pure theory: when a polynomial TN bond suffices under noise (no tool) | Atlas is the running implementation; the Convergence Map aims to be measured against this bound |
An independent deep review (primary sources) confirmed all four citations are real and accurately characterised, and judged the self-positioning honest. Full landscape: Evidence §6.