Randomised Benchmarking
Average Clifford infidelity across length sweeps with a randomised twirl.
Single-exponential survival fit returns the depolarising channel parameter.
Six vendors, four published benchmark protocols, bootstrap confidence intervals on every cell, and a dated spoof-hardness witness on every XEB number. The scorecard a procurement committee can sign off on.
Hover a cell for methodology, sample size, and dated witness. Darker steel-blue tracks higher fidelity. Amber on Forrelation flags upstream-fit caveats. Every vendor row has the same shape, so a procurement reviewer can read the page in seconds.
The protocols are not a DeployQuantum invention. Each is anchored to a published paper, a published estimator, and a published statistical procedure. The platform binds them to one signed scorecard shape per run.
Average Clifford infidelity across length sweeps with a randomised twirl.
Single-exponential survival fit returns the depolarising channel parameter.
Separates coherent from stochastic error per cycle, parameter-efficient relative to full process tomography.
Pauli-twirled cycle decay returns a sparse Pauli-Lindblad coefficient map per coupling.
Random-circuit sampling primitive with a dated spoof-hardness witness on every cell.
Linear cross-entropy estimator returns a fidelity proxy under a published anti-concentration assumption.
Provable separation rather than a measurement primitive. The lower bound is the proof object.
Classical lower-bound certificate consistent with the published k-fold Forrelation oracle separation.
Anchors are cited in the manifest by paper identifier. This surface carries no person-name attribution.
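For the randomised-benchmarking entry above, the single-exponential survival fit can be sketched as follows. This is a minimal illustration, not the platform's estimator: it assumes the survival model A·p^m + B with the asymptote B fixed at 1/2^n, which turns the fit into a log-linear regression. A production fit would normally leave B free and weight the points. The function name `fit_rb_decay` and the returned keys are hypothetical. The conversion from the depolarising parameter p to an average error per Clifford, r = (d − 1)(1 − p)/d with d = 2^n, is the standard one.

```python
import math


def fit_rb_decay(lengths, survival, n_qubits):
    """Fit survival ~ A * p**m + B with B fixed at 1/d (an assumption)."""
    d = 2 ** n_qubits
    b = 1.0 / d  # assumed asymptote; a fuller fit would estimate it

    # Linearise: log(survival - B) = log(A) + m * log(p)
    xs, ys = [], []
    for m, y in zip(lengths, survival):
        if y > b:  # points at or below the asymptote carry no log signal
            xs.append(m)
            ys.append(math.log(y - b))
    if len(xs) < 2:
        raise ValueError("need at least two points above the asymptote")

    # Ordinary least squares for slope and intercept
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    p = math.exp(slope)  # depolarising channel parameter
    return {
        "p": p,
        "A": math.exp(intercept),
        "B": b,
        "error_per_clifford": (d - 1) / d * (1 - p),
    }
```

Feeding the function a noiseless synthetic decay is a quick sanity check that it recovers the parameter used to generate the data.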
Every XEB cell carries a dated reference. When classical attacks advance, affected scorecards demote with a dated note. The reference date is a field on every scorecard, not a footnote.
Same seed. Same SDK version. Same hash. A third party reruns the script and reproduces the cell byte for byte.
{
  "schema_version": "0.1.0",
  "vendor": "IBM",
  "backend_name": "ibm_torino",
  "vendor_sdk_version": "qiskit-1.x (pinned)",
  "benchmark": "RB_2Q",
  "n_qubits": 2,
  "shots_per_circuit": 1024,
  "transpile": {
    "seed": 7,
    "basis_gates": ["sx", "rz", "cz"],
    "qubit_indices": [12, 13]
  },
  "measured": {
    "primary_metric_name": "r_per_clifford",
    "value": 0.9968,
    "ci_95_lower": 0.9960,
    "ci_95_upper": 0.9976
  },
  "spoof_hardness_witness": {
    "named_assumption": "BFNV anti-concentration plus #P-hardness",
    "defeated_attacks": ["..."],
    "not_defeated_attacks": ["..."],
    "frontier_reference_date": "2026-05-10"
  },
  "fit_residuals_path": "build/output/rb_2q_fit_residuals.json",
  "rb_certificate_hash": "<sha-256>",
  "scorecard_hash": "<sha-256>"
}
Eleven more fields in the manifest. Each one replayable.
dfa7c0777eca70ffb835a1ec6f3f351cee7b2ea848ba3a7f228082d08db80bc7
Pinned in every manifest. The hashes of the classical pre-processing and post-processing scripts are pinned alongside it, and the scorecard hash closes the chain.
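One way a third party might recompute a scorecard hash is sketched below. The canonicalisation rule is an assumption made for illustration, since the page does not state how the manifest is serialised before hashing: keys are sorted, whitespace is stripped, and the `scorecard_hash` field is excluded from its own digest. The function name `scorecard_digest` is hypothetical.

```python
import hashlib
import json


def scorecard_digest(scorecard, hash_field="scorecard_hash"):
    """SHA-256 over a canonical JSON rendering of the scorecard.

    Assumed canonical form: sorted keys, compact separators, and the
    hash field itself left out so the digest does not depend on itself.
    """
    body = {k: v for k, v in scorecard.items() if k != hash_field}
    canonical = json.dumps(
        body, sort_keys=True, separators=(",", ":"), ensure_ascii=True
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Under this rule, two manifests with the same fields in a different key order give the same digest, while changing any pinned value (for example the transpile seed) changes it.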
F_XEB above zero on a noiseless simulator is uninformative about advantage. On a real QPU it is hard to spoof classically only under the BFNV anti-concentration plus #P-hardness conjecture pair, which can fail. The witness demotes when the frontier advances. The Forrelation lower bound is information-theoretic at the oracle level. The platform separates the conjecture-conditional claim from the unconditional one in every scorecard so the buyer reads each on its own footing.
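The linear cross-entropy estimator behind F_XEB can be sketched in a few lines. This illustrates the standard form, 2^n times the mean ideal probability of the sampled bitstrings, minus one. It is not the platform's implementation, and the function name and the dictionary input are hypothetical.

```python
def linear_xeb(ideal_probs, samples, n_qubits):
    """Linear XEB: 2**n * mean(p_ideal(x_i)) - 1 over sampled bitstrings.

    ideal_probs maps bitstring -> ideal output probability of the circuit;
    samples is the list of bitstrings observed from the device.
    """
    if not samples:
        raise ValueError("need at least one sample")
    dim = 2 ** n_qubits
    mean_p = sum(ideal_probs.get(s, 0.0) for s in samples) / len(samples)
    return dim * mean_p - 1.0
```

A sampler that draws bitstrings uniformly at random scores zero in expectation, while samples drawn from the ideal distribution score above zero by construction. That is why a positive score from a noiseless simulator says nothing about advantage.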
Per-vendor scorecards are deferred until Tool 3 and Tool 5 are operational and a real-QPU run is recorded in the hardware-run-evidence block. At this ship gate, the release state is internal-only and the leaderboard you see is the shape the scorecard takes when it ships, not a published vendor ranking.
The scorecard is the product. A vendor brochure is a chart. The buyer reads the same shape an auditor reads: cells with a fidelity number, a CI whisker, a witness date. Marketing positioning lives in vendor brochures. Procurement lives in the scorecard.
No. At this ship gate, the release state is internal-only and the real-QPU run is deferred until Tool 3 and Tool 5 are operational. The leaderboard renders the shape the scorecard takes when shipped. Numbers derive from each vendor's published median two-qubit fidelity in the hardware-roadmap registry, perturbed by a protocol-specific delta. Treat the page as an illustrative envelope, not as a vendor ranking.
A 95% bootstrap confidence interval around the primary metric, computed by resampling the underlying circuit-shot dataset. The whisker is a visual rendering of the CI half-width. Where the CI is wide relative to the difference between two vendors, the chart says they are not separable at the run's shot budget. The CI is reported as a number in every scorecard JSON, not only as a whisker.
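A percentile bootstrap of the kind described above might look like the sketch below. Several details are assumptions rather than the platform's procedure: the resampling unit (here, one value per circuit), the percentile method, the default of 2000 resamples, and the overlap test used for "not separable". Both function names are hypothetical.

```python
import random
from statistics import fmean


def bootstrap_ci(samples, statistic=fmean, n_resamples=2000, alpha=0.05, seed=7):
    """Percentile bootstrap CI for `statistic` over `samples`.

    A fixed seed makes the interval reproducible from run to run.
    """
    if not samples:
        raise ValueError("need at least one sample")
    rng = random.Random(seed)
    n = len(samples)
    # Resample with replacement and recompute the statistic each time
    stats = sorted(
        statistic(rng.choices(samples, k=n)) for _ in range(n_resamples)
    )
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    hi_idx = max(lo_idx, hi_idx)
    return stats[lo_idx], stats[hi_idx]


def separable(ci_a, ci_b):
    """Assumed rule: two cells are separable only if their CIs do not overlap."""
    return ci_a[1] < ci_b[0] or ci_b[1] < ci_a[0]
```

With this rule, two vendors whose intervals overlap would be reported as not separable at the run's shot budget, however far apart their point estimates look.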
The Forrelation protocol is a provable oracle separation, not a fidelity primitive. The cell is a consistency check against the classical lower bound. LB-OK means the run is consistent with the published lower-bound exponent. LB-WIDE means consistent but with upstream fit caveats in the manifest.
Every XEB cell carries a dated frontier reference. When the classical-attack frontier advances, the affected scorecards demote in the next registry update, with a dated note recording which attack class moved and which cells were re-rendered. The witness is not a frozen number. It is a measurement under a stated and dated assumption envelope.
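The demotion rule can be illustrated with a small date comparison. This is a hypothetical sketch: the `demotion_notes` field, the function name, and the rule that only attacks dated after `frontier_reference_date` trigger a demotion are assumptions, not the registry's documented behaviour.

```python
from datetime import date


def demote_if_stale(scorecard, attack_date, attack_class):
    """Return a copy of the scorecard with a dated note if a classical
    attack is newer than the witness's frontier reference date."""
    witness = scorecard["spoof_hardness_witness"]
    reference = date.fromisoformat(witness["frontier_reference_date"])
    if date.fromisoformat(attack_date) <= reference:
        return scorecard  # the witness already accounts for this attack

    note = {
        "attack_class": attack_class,
        "attack_date": attack_date,
        "witness_reference_date": witness["frontier_reference_date"],
    }
    updated = dict(scorecard)  # shallow copy; the input is not mutated
    updated["demotion_notes"] = list(scorecard.get("demotion_notes", [])) + [note]
    return updated
```

Keeping the reference date as a field on the scorecard is what makes this check mechanical. No free-text footnote has to be parsed.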
Request a run. We respond within one business day. The customer-asset attestation is a precondition, and the release-gate state is recorded against your engagement, not against a generic download.