AISI evaluation can be cryptographic, not contractual

#ai #security #cryptography #safety

The AI Safety Institute publishes evaluations of frontier AI models against a stated harm taxonomy. The evaluations cover chemical, biological, radiological, and nuclear assistance; cyber-offence assistance; the capacity to deceive evaluators or to undermine human oversight; and a small number of additional categories the Institute has named in its public papers.

The evaluations sit, in practice, at the structural centre of the UK's AI safety claim. If the AISI says a model has passed, the UK government can act on the model. If the AISI says a model has failed, the government can withhold deployment. The evaluations are, by design, evidence of safety presented to power.

The question this article asks is what kind of evidence they are.

The contractual evaluation and its limit

A model evaluated by the AISI is typically supplied as a hosted endpoint by the model developer, or as downloaded weights in a controlled environment. The AISI runs its evaluation suite against the model. The evaluation produces a structured report. The report is shared with the developer, the relevant ministers, and a subset of regulators on a controlled basis.

The structural property of this arrangement is that the evaluation is, in effect, contractual. The evidence the AISI produces is the AISI's own statement that it ran the evaluation, against a specified snapshot, with specified prompts, and obtained specified results. The credibility of the statement rests on the credibility of the AISI as an institution. Where the AISI is trusted, the statement holds. Where a regulator, an opposition party, an allied government, a court, or a future government wants to verify the statement independently, the structural answer is that they cannot, because the evidence is not formed in a way that supports replay.

This is the same structural limit the Open Audit Record was filed to address in adjacent contexts.

What cryptographic evaluation looks like

A cryptographic evaluation, in the sense this article uses, is an evaluation produced as a chain of signed actions. Each action captures the input, the output, the scoring, and the policy in force at that moment, and the whole chain is sealed under the evaluator's hardware-bound key.

The structural property of a cryptographic evaluation is that any party with the evaluator's published verification key and access to the same model snapshot can recover the chain, verify each action, replay each prompt against that snapshot, and confirm the score. The credibility of the evaluation does not depend on trusting the evaluator. The credibility depends on the arithmetic.
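The chain idea can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not Mickai's implementation: the function names are hypothetical, and HMAC-SHA256 with a demo secret stands in for the evaluator's signature. HMAC is symmetric, so unlike the scheme the article describes it has no public verification key; it is used here only because it ships with the Python standard library.

```python
import hashlib
import hmac
import json

# ASSUMPTION: HMAC-SHA256 with a demo secret stands in for the evaluator's
# hardware-bound signing key. A real deployment would use an asymmetric scheme
# so that verifiers hold only a public key.
DEMO_KEY = b"demo-evaluator-key"
GENESIS = "0" * 64  # placeholder "previous hash" for the first action


def canonical(obj: dict) -> bytes:
    """Serialize deterministically so the same record always hashes the same."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(payload: bytes, key: bytes = DEMO_KEY) -> str:
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append_action(chain: list, body: dict, key: bytes = DEMO_KEY) -> dict:
    """Link a new action to the previous one by hash, then sign it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    unsigned = {"prev": prev_hash, "body": body}
    digest = hashlib.sha256(canonical(unsigned)).hexdigest()
    action = {**unsigned, "hash": digest, "sig": sign(digest.encode("ascii"), key)}
    chain.append(action)
    return action


def verify_chain(chain: list, key: bytes = DEMO_KEY) -> bool:
    """Recompute every hash, link, and signature; any edit breaks the check."""
    prev_hash = GENESIS
    for action in chain:
        unsigned = {"prev": action["prev"], "body": action["body"]}
        digest = hashlib.sha256(canonical(unsigned)).hexdigest()
        if action["prev"] != prev_hash or action["hash"] != digest:
            return False
        if not hmac.compare_digest(action["sig"], sign(digest.encode("ascii"), key)):
            return False
        prev_hash = digest
    return True


chain = []
append_action(chain, {"input": "prompt A", "output": "refusal", "score": 0})
append_action(chain, {"input": "prompt B", "output": "refusal", "score": 0})
print(verify_chain(chain))  # True: intact chain

chain[0]["body"]["score"] = 1  # tamper with a recorded score
print(verify_chain(chain))  # False: the stored hash no longer matches
```

Because each action commits to the hash of the one before it, changing any recorded input, output, or score invalidates that action and every later link.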

This is the evidence form that lets a regulator who does not yet exist verify an evaluation made by a regulator who exists today.

How Mickai produces it

The Mickai Sovereign Intelligence Operating System runs evaluations as a signed action chain on the AISI's own deployment of the SIOS, with the AISI's hardware-bound key.

The evaluation campaign begins with a signed campaign declaration: the harm taxonomy, the model snapshot reference, the prompt corpus identifier, the scoring rubric, the participating evaluator identities, and the campaign window. The campaign declaration is a signed action.
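As a rough illustration of what such a declaration could contain, the sketch below models it as an immutable record with a deterministic digest. The class name, field names, and example values are all hypothetical placeholders that mirror the items listed above; the signing step itself is omitted, and the digest is simply the value a signature would cover.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CampaignDeclaration:
    # Field names are illustrative; they mirror the items listed in the text.
    harm_taxonomy: tuple
    model_snapshot_ref: str
    prompt_corpus_id: str
    scoring_rubric: str
    evaluator_ids: tuple
    window_start: str
    window_end: str

    def digest(self) -> str:
        """SHA-256 over a canonical JSON form of the declaration."""
        payload = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Placeholder values only; none of these identifiers refer to real artefacts.
declaration = CampaignDeclaration(
    harm_taxonomy=("cbrn", "cyber-offence", "deception-and-oversight"),
    model_snapshot_ref="snapshot-placeholder",
    prompt_corpus_id="corpus-placeholder",
    scoring_rubric="rubric-placeholder",
    evaluator_ids=("evaluator-1", "evaluator-2"),
    window_start="2000-01-01T00:00:00Z",
    window_end="2000-01-31T00:00:00Z",
)
declaration_hash = declaration.digest()
print(len(declaration_hash))  # 64 hex characters
```

Freezing the record and hashing a sorted, whitespace-free JSON form means two parties who hold the same declaration always derive the same digest, which is what later actions would reference.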

Each prompt run is a signed action: the prompt text, the model response, the reasoning trace where the model exposes one, the scoring against the rubric, the participating evaluator's signed concurrence or dissent, and the time. Each prompt action references the campaign declaration by signed hash.
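A sketch of one prompt-run record, with an evaluator's concurrence or dissent bound to it, might look like the following. Everything named here is an assumption made for illustration: the field names, the evaluator identifiers, and the use of per-evaluator HMAC secrets as a stand-in for real signing keys.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone

# ASSUMPTION: per-evaluator HMAC secrets stand in for real signing keys.
EVALUATOR_KEYS = {"evaluator-1": b"demo-key-1", "evaluator-2": b"demo-key-2"}


def canonical(obj) -> bytes:
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def prompt_action(campaign_ref, prompt, response, score, trace=None):
    """Build the record for one prompt run, bound to its campaign declaration."""
    return {
        "campaign_ref": campaign_ref,  # hash of the campaign declaration
        "prompt": prompt,
        "response": response,
        "reasoning_trace": trace,  # None when the model exposes no trace
        "score": score,
        "time": datetime.now(timezone.utc).isoformat(),
    }


def _message(action, verdict) -> tuple:
    digest = hashlib.sha256(canonical(action)).hexdigest()
    return digest, f"{digest}:{verdict}".encode("utf-8")


def attest(action, evaluator_id, verdict):
    """An evaluator signs a verdict ('concur' or 'dissent') over the record."""
    if verdict not in ("concur", "dissent"):
        raise ValueError("verdict must be 'concur' or 'dissent'")
    digest, message = _message(action, verdict)
    sig = hmac.new(EVALUATOR_KEYS[evaluator_id], message, hashlib.sha256).hexdigest()
    return {"evaluator": evaluator_id, "verdict": verdict, "action": digest, "sig": sig}


def check_attestation(action, att) -> bool:
    """Confirm the attestation covers this exact record and this exact verdict."""
    digest, message = _message(action, att["verdict"])
    expected = hmac.new(EVALUATOR_KEYS[att["evaluator"]], message, hashlib.sha256).hexdigest()
    return att["action"] == digest and hmac.compare_digest(att["sig"], expected)


campaign_ref = hashlib.sha256(b"demo campaign declaration").hexdigest()
action = prompt_action(campaign_ref, "demo prompt", "demo response", score=0)
dissent = attest(action, "evaluator-2", "dissent")
print(check_attestation(action, dissent))  # True
```

Signing the verdict together with the record digest means a dissent cannot later be relabelled as a concurrence, and neither can be moved onto a different record.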

A red-team probe is a signed action with the same structure, plus a probe-class tag for the harm category being attacked, the attacker identity (the AISI evaluator who composed the probe), and the model's response. A successful probe is signed with a reference to the campaign declaration and carries a high-priority class tag.

A consensus closure is a signed action that aggregates the campaign's prompt and probe records into a structured outcome, with the participating evaluators' signed concurrences.
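One way to picture the closure is as a summary that commits to every underlying record. The sketch below is a simplified illustration, not the Audit Ledger's format: the field names are hypothetical, a score above zero is assumed to mean the model gave assistance the rubric counts as a failure, and the concurrences are shown as plain verdicts rather than signatures.

```python
import hashlib
import json
from collections import Counter


def record_digest(record: dict) -> str:
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def close_campaign(campaign_ref: str, records: list, concurrences: dict) -> dict:
    """Aggregate per-run records into one outcome that commits to all of them."""
    digests = [record_digest(r) for r in records]
    # ASSUMPTION: score > 0 marks a run the rubric counts as a failure.
    failures = Counter(r["category"] for r in records if r["score"] > 0)
    return {
        "campaign_ref": campaign_ref,
        "record_count": len(records),
        # One hash over the ordered record digests; changing, adding,
        # removing, or reordering a record changes this value.
        "records_root": hashlib.sha256("".join(digests).encode("ascii")).hexdigest(),
        "failures_by_category": dict(sorted(failures.items())),
        "concurrences": concurrences,
    }


records = [
    {"kind": "prompt_run", "category": "cbrn", "score": 0},
    {"kind": "red_team_probe", "category": "cyber-offence", "score": 1},
    {"kind": "prompt_run", "category": "cyber-offence", "score": 0},
]
closure = close_campaign("demo-campaign-ref", records, {"evaluator-1": "concur"})
print(closure["record_count"], closure["failures_by_category"])
# 3 {'cyber-offence': 1}
```

A verifier who holds the records can recompute `records_root` and the per-category tallies and compare them with the closure, so the headline outcome cannot drift from the evidence beneath it.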

The whole chain is held in the Audit Ledger under the AISI's key, browser-verifiable offline against the AISI's published verification key.

What a regulator who does not trust the AISI can do

A future government, a court, an opposition party, an allied government, or a regulator in another jurisdiction with a cooperation agreement can recover the campaign chain, replay each prompt against the same signed model snapshot, verify the scoring against the rubric, examine each probe and its outcome, and reach a verifiable view of the campaign's structural soundness.

The view is independent of the AISI. It depends on the arithmetic, the published verification key, and access to the signed snapshot.
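The replay step can be sketched as follows. This is a toy under strong assumptions: `stub_model` and `stub_rubric` are hypothetical stand-ins for the signed model snapshot and the scoring rubric, and the model is assumed to be deterministic, which a real replay would need to arrange (for example by fixing decoding settings) before byte-for-byte comparison is meaningful.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def stub_model(prompt: str) -> str:
    """Hypothetical deterministic stand-in for the signed model snapshot."""
    return "REFUSED" if "restricted" in prompt else "ANSWER: " + prompt


def stub_rubric(response: str) -> int:
    """Hypothetical rubric: 0 for a refusal, 1 for any substantive answer."""
    return 0 if response == "REFUSED" else 1


def replay(records, model, rubric) -> list:
    """Return the indices of records whose replayed response or score differs."""
    mismatches = []
    for i, rec in enumerate(records):
        response = model(rec["prompt"])
        same_response = sha256_hex(response) == rec["response_sha256"]
        same_score = rubric(response) == rec["score"]
        if not (same_response and same_score):
            mismatches.append(i)
    return mismatches


prompts = ["restricted request", "benign request"]
records = [
    {
        "prompt": p,
        "response_sha256": sha256_hex(stub_model(p)),
        "score": stub_rubric(stub_model(p)),
    }
    for p in prompts
]
print(replay(records, stub_model, stub_rubric))  # []

records[1]["score"] = 0  # a recorded score that the rubric does not support
print(replay(records, stub_model, stub_rubric))  # [1]
```

Signature checks show that the records are the ones the evaluator signed; replay is the separate step that tests whether what was signed matches what the model and rubric actually produce.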

This is what cryptographic evaluation means in practice. The Institute's authority does not depend on the audience's trust. The audience can verify.

The case for the UK going first

The UK has, in the AISI, the first national-level evaluator with the institutional standing to produce evidence of this kind. The UK has, in the AI Opportunities Action Plan and the Sovereign AI Fund, a procurement vehicle that can underwrite a deployment of the substrate on the AISI's own infrastructure. The UK has, in Mickai, a Sovereign Intelligence Operating System that produces the signed action chain at scale, under the AISI's key, with the post-quantum signature scheme the NCSC names for the migration.

A second-mover allied evaluator will face the same structural choice the AISI faces now. The first mover defines the evidence form.

What the next AISI campaign can specify

The campaign declaration is a signed action. Every prompt run, red-team probe, and consensus closure is a signed action. The chain is held under the AISI's hardware-bound key, under a key custody arrangement the AISI controls. The signature scheme is FIPS 204 ML-DSA-65 from inception. The verification path is browser-resident WebAssembly against the AISI's published key.
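Written down as a machine-checkable specification, those requirements could be tested before a campaign starts. The sketch below is one possible shape, offered as an illustration only: the dictionary keys and string values are hypothetical labels for the requirements in the paragraph above, not a format the AISI or Mickai has published.

```python
# Hypothetical specification keys; the values restate the requirements in the text.
REQUIRED_ACTION_TYPES = {
    "campaign_declaration",
    "prompt_run",
    "red_team_probe",
    "consensus_closure",
}
REQUIRED_SCHEME = "ML-DSA-65"  # parameter set defined in FIPS 204


def check_spec(spec: dict) -> list:
    """Return human-readable problems; an empty list means the spec conforms."""
    problems = []
    missing = REQUIRED_ACTION_TYPES - set(spec.get("signed_action_types", []))
    if missing:
        problems.append("unsigned action types: " + ", ".join(sorted(missing)))
    if spec.get("signature_scheme") != REQUIRED_SCHEME:
        problems.append("signature scheme must be " + REQUIRED_SCHEME)
    if spec.get("key_custody") != "evaluator-controlled":
        problems.append("key custody must stay with the evaluator")
    if spec.get("verification") != "browser-wasm":
        problems.append("verification path must be browser-resident WebAssembly")
    return problems


draft = {
    "signed_action_types": ["campaign_declaration", "prompt_run", "consensus_closure"],
    "signature_scheme": "ML-DSA-65",
    "key_custody": "evaluator-controlled",
    "verification": "browser-wasm",
}
print(check_spec(draft))  # ['unsigned action types: red_team_probe']
```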

The Mickai Sovereign Intelligence Operating System runs the campaign on this specification. The 101 filed UK patent applications cover the primitives. The AISI's evaluations can move from contractual to cryptographic, on filed substrate, on a UK timeline.

The evidence the AISI presents to government becomes evidence a future government can independently verify. That is the structural property the Institute's authority can be built on.