AI x Crypto Systems

Posted on Jun 1

Model Weight Registry: The Name Is Not the Model

#ai #architecture #blockchain #machinelearning

Disclosure: AI tools were used for source collection and editorial review. The article was written by a human author, who checked the facts, code, and conclusions.

Crypto risk disclosure: This article is a technical explanation, not investment advice. It is not a recommendation to buy, sell or hold any cryptoasset.

A Model Weight Registry should not treat a model name as a model identity. A name, repository, tag, branch, or user-facing label can point to useful software, but stable identity for AI weights starts with the exact bytes being loaded.

That boundary matters for AI x crypto systems because onchain claims are expensive to correct after users rely on them. If a contract, agent, or audit trail says "model X," the next question should be "which file, revision, size, digest, and receipt?"

Byte Identity

A digest identifies a byte sequence under a chosen hash algorithm. NIST FIPS 180-4 specifies the Secure Hash Algorithm family, including SHA-256, while RFC 6920 describes how to name things with hashes.

That guarantee is narrow but useful. A SHA-256 digest can say the file bytes match a recorded value; a digest cannot say the model is safe, aligned, licensed, useful, or trained on the right data.
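A minimal Python sketch of that narrow check, using only the standard library. The function names `sha256_file` and `bytes_match` are illustrative, not part of any registry API.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 digest of a file as lowercase hex, reading in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Stream the file so multi-gigabyte weight files never sit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bytes_match(path: Path, recorded_hex: str) -> bool:
    """True only when the file bytes hash to the recorded value. Nothing more."""
    return sha256_file(path) == recorded_hex.lower()
```

A `True` result here answers one question: are these the recorded bytes? Safety, license, and behavior need separate evidence.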

Weight Receipt

The practical artifact is a receipt that refuses to overclaim. Instead of a full JSON object, the registry can show the receipt as an audit line with named fields:

| Receipt field | Example value | Why the field exists |
| --- | --- | --- |
| Type | `ai.weight_hash_receipt.v1` | Separates this statement from a model card or benchmark |
| Model label | `org/model-name` | Keeps the human pointer visible |
| Source revision | Full commit hash | Avoids a floating branch as identity |
| File path | `model.safetensors` | Names the exact artifact inside the source |
| Format and size | `safetensors`, byte count | Catches conversion and truncation mistakes |
| Hash | `sha256:<64 hex chars>` | Identifies the exact byte sequence |
| Optional content address | CID plus construction notes | Prevents CID/file-hash confusion |
| Issuer and signature | Registry key id, EIP-712 profile | Says who made the statement |
| Limits | Byte identity only | Blocks safety, license, and behavior overclaims |

This receipt is not a universal standard; it is a defensive shape for registries that want to separate exact artifact identity from marketing labels.
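One way to hold those fields in code is a small frozen dataclass with a self-check. This is a sketch, not a schema: the class name, the Python field names, and the `problems` helper are assumptions made for illustration, and the revision check assumes a Git-style commit hash of 40 or 64 hex characters.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class WeightHashReceipt:
    model_label: str      # human pointer, e.g. "org/model-name"
    source_revision: str  # full commit hash, not a branch or tag
    file_path: str        # artifact inside the source, e.g. "model.safetensors"
    file_format: str      # e.g. "safetensors"
    size_bytes: int       # exact byte count of the file
    digest: str           # "sha256:<64 hex chars>"
    issuer_key_id: str    # who made the statement
    limits: str = "byte identity only"
    receipt_type: str = "ai.weight_hash_receipt.v1"

    def problems(self) -> list:
        """Return a list of shape problems; an empty list means the fields look sane."""
        found = []
        if not re.fullmatch(r"[0-9a-f]{40}|[0-9a-f]{64}", self.source_revision):
            found.append("source_revision is not a full commit hash")
        if not re.fullmatch(r"sha256:[0-9a-f]{64}", self.digest):
            found.append("digest is not sha256:<64 hex chars>")
        if self.size_bytes <= 0:
            found.append("size_bytes must be positive")
        return found
```

The check only validates the shape of the fields. Whether the digest matches a real file is a separate step against the actual bytes.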

Canonical Receipt

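If the receipt itself is hashed or signed, its representation matters. RFC 8785 (the JSON Canonicalization Scheme) exists because the same JSON data can be serialized into different byte sequences, so it needs one canonical form before hashing or signing gives a stable result.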

The registry should therefore decide what is hashed: the model file, the canonical receipt, or both. Mixing those claims is how a model registry starts saying "verified" without saying what was verified.
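The sketch below shows why key order and whitespace have to be fixed before hashing a receipt. It is a simplified approximation, not a full RFC 8785 implementation: it sorts keys and strips whitespace with the standard `json` module, and it rejects floats because RFC 8785 number serialization is not reproduced here. The function names are illustrative.

```python
import hashlib
import json


def _reject_floats(value) -> None:
    """Refuse floats, whose canonical form this simplified sketch does not handle."""
    if isinstance(value, float):
        raise TypeError("floats are out of scope for this simplified canonical form")
    if isinstance(value, dict):
        for item in value.values():
            _reject_floats(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            _reject_floats(item)


def canonical_bytes(receipt: dict) -> bytes:
    """Serialize with sorted keys, no insignificant whitespace, UTF-8 output."""
    _reject_floats(receipt)
    text = json.dumps(receipt, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def receipt_digest(receipt: dict) -> str:
    """Digest of the receipt itself, which is a different claim from the model file digest."""
    return "sha256:" + hashlib.sha256(canonical_bytes(receipt)).hexdigest()
```

Two receipts with the same fields in a different key order produce the same `receipt_digest`. A production registry would more likely use a dedicated RFC 8785 library than this approximation.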

Supply-Chain Pattern

Software supply-chain systems already use the subject-plus-digest pattern. SLSA Provenance and the in-toto Statement specification identify each subject by a name and a set of digests inside a provenance statement.

A Model Weight Registry can borrow that habit without pretending the problem is solved. The digest gives artifact identity; the provenance statement gives issuer and process context; neither one proves the model's behavior.
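As a sketch of the subject-plus-digest shape, the function below builds a dictionary following the in-toto Statement v1 layout (`_type`, `subject`, `predicateType`, `predicate`). The predicate type URI and the predicate contents are hypothetical placeholders, not a published predicate.

```python
# Hypothetical predicate type; a real registry would publish and document its own.
PREDICATE_TYPE = "https://example.com/weight-receipt/v1"


def weight_statement(file_path: str, sha256_hex: str, predicate: dict) -> dict:
    """Bind one weight file (name plus digest) to issuer and process context."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [{"name": file_path, "digest": {"sha256": sha256_hex}}],
        "predicateType": PREDICATE_TYPE,
        "predicate": predicate,
    }


def subject_matches(statement: dict, file_path: str, sha256_hex: str) -> bool:
    """True when the statement names this file with this SHA-256 digest."""
    return any(
        subject.get("name") == file_path
        and subject.get("digest", {}).get("sha256") == sha256_hex
        for subject in statement.get("subject", [])
    )
```

The subject answers "which bytes"; the predicate carries "who and how". Neither field says anything about what the model does when it runs.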

Tag Boundary

Container registries make the pointer problem familiar. The OCI descriptor identifies content with a media type, digest, and size, while tags remain convenient names that can move.

AI weights need the same discipline. A label such as "latest," "main," or "production" is an operational pointer. A digest and size are the beginning of stable artifact identity.
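The difference between a moving tag and a pinned descriptor can be sketched in a few lines. The `Descriptor` class mirrors the three OCI descriptor fields named above; the media type string and the in-memory `tags` dictionary are hypothetical stand-ins for a real registry.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    media_type: str
    digest: str  # "sha256:<hex>"
    size: int    # byte count


def describe(data: bytes, media_type: str) -> Descriptor:
    """Compute a descriptor for a byte sequence."""
    return Descriptor(media_type, "sha256:" + hashlib.sha256(data).hexdigest(), len(data))


def matches(data: bytes, descriptor: Descriptor) -> bool:
    """True when the bytes have the descriptor's digest and size."""
    return describe(data, descriptor.media_type) == descriptor


MEDIA_TYPE = "application/vnd.example.weights"  # hypothetical media type
old_weights = b"weights-v1"
new_weights = b"weights-v2"

tags = {"latest": describe(old_weights, MEDIA_TYPE)}  # a tag is a mutable pointer
pinned = tags["latest"]                               # a consumer records the descriptor
tags["latest"] = describe(new_weights, MEDIA_TYPE)    # the tag later moves
```

After the tag moves, `matches(old_weights, pinned)` still holds and `matches(new_weights, pinned)` does not. A consumer who recorded only "latest" cannot tell which bytes they originally relied on.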

Revision Boundary

Model hubs already expose a better path than floating names. Hugging Face Hub download docs describe downloading a file at a specific revision, which can be a branch, a tag, or a commit hash.

The registry should record the resolved commit hash, not just the repository name or a branch. Without the revision and file path, the phrase "we used org/model-name" is a clue, not an identity.
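A hedged sketch of revision pinning: the guard below refuses branch or tag names and only passes a full 40-character commit hash to `hf_hub_download` from the third-party `huggingface_hub` package. The wrapper names `is_pinned` and `download_pinned` are illustrative, and the 40-hex rule is an assumption about Git-style commit identifiers.

```python
import re

FULL_COMMIT = re.compile(r"[0-9a-f]{40}")


def is_pinned(revision: str) -> bool:
    """True for a full lowercase hex commit hash; False for 'main', tags, or short hashes."""
    return FULL_COMMIT.fullmatch(revision) is not None


def download_pinned(repo_id: str, filename: str, revision: str) -> str:
    """Download one file at an exact commit and return its local path."""
    if not is_pinned(revision):
        raise ValueError(f"refusing floating revision: {revision!r}")
    # Imported lazily so the guard above works without the package installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename, revision=revision)
```

The returned file should still be hashed before the digest goes into a receipt. The commit pins the source state; the digest identifies the bytes actually loaded.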

Content Address

Content addressing can help, but a Model Weight Registry should not flatten every content address into "the file hash." IPFS content-addressing docs explain content-derived identifiers, while CID construction depends on representation details.

That caveat belongs in the receipt. A CID is useful when the registry records the CID version, codec, chunking or import method, and the relationship between the CID and the file digest.
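The sketch below makes the CID caveat concrete. It builds and parses one narrow case: a CIDv1 with the `raw` codec (0x55) and a `sha2-256` multihash (0x12) over the whole byte sequence, encoded as base32 with the `b` multibase prefix. Only in that case does the CID embed the plain file SHA-256. A large file imported with chunking typically gets a `dag-pb` (0x70) root CID, whose digest covers a DAG node rather than the file bytes. The function names are illustrative.

```python
import base64
import hashlib

CID_V1 = 0x01
RAW_CODEC = 0x55     # multicodec "raw"
SHA2_256 = 0x12      # multihash code for sha2-256
DIGEST_LENGTH = 32


def raw_cidv1(data: bytes) -> str:
    """CIDv1 for a single raw block: version, codec, multihash, then base32."""
    digest = hashlib.sha256(data).digest()
    binary = bytes([CID_V1, RAW_CODEC, SHA2_256, DIGEST_LENGTH]) + digest
    return "b" + base64.b32encode(binary).decode("ascii").lower().rstrip("=")


def file_digest_from_cid(cid: str):
    """Return the embedded SHA-256 hex only for raw/sha2-256 CIDv1, else None."""
    if not cid.startswith("b"):
        raise ValueError("expected base32 multibase prefix 'b'")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding base32 decoding expects
    binary = base64.b32decode(body)
    header = bytes([CID_V1, RAW_CODEC, SHA2_256, DIGEST_LENGTH])
    if len(binary) != 4 + DIGEST_LENGTH or binary[:4] != header:
        return None  # other codec or hash: not comparable to a plain file hash
    return binary[4:].hex()
```

A `None` result is the point of the receipt field: the CID may still be valid, but it cannot be treated as the file hash without recording how it was constructed.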

Signed Statement

A signed receipt can authenticate who made the claim. EIP-712 supports typed structured-data signing with domain separation, which fits a registry receipt better than an opaque string.

The signature still has a hard limit. A signed false receipt is still false; a signed byte-identity receipt still says nothing about safety, license rights, or training data.
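A sketch of what a typed receipt could look like as EIP-712 typed data. The outer layout (`types`, `primaryType`, `domain`, `message`) follows EIP-712; the domain name, the `WeightHashReceipt` struct, and its field names are assumptions for illustration. Signing is left to a wallet or signing library and is not shown.

```python
RECEIPT_FIELDS = [
    {"name": "modelLabel", "type": "string"},
    {"name": "sourceRevision", "type": "string"},
    {"name": "filePath", "type": "string"},
    {"name": "sizeBytes", "type": "uint256"},
    {"name": "sha256", "type": "bytes32"},
    {"name": "limits", "type": "string"},
]


def receipt_typed_data(message: dict, chain_id: int, verifying_contract: str) -> dict:
    """Wrap a receipt message in an EIP-712 typed-data structure."""
    return {
        "types": {
            "EIP712Domain": [
                {"name": "name", "type": "string"},
                {"name": "version", "type": "string"},
                {"name": "chainId", "type": "uint256"},
                {"name": "verifyingContract", "type": "address"},
            ],
            "WeightHashReceipt": RECEIPT_FIELDS,
        },
        "primaryType": "WeightHashReceipt",
        "domain": {
            "name": "ModelWeightRegistry",  # hypothetical domain name
            "version": "1",
            "chainId": chain_id,
            "verifyingContract": verifying_contract,
        },
        "message": dict(message),
    }


def undeclared_fields(typed_data: dict) -> list:
    """Message keys that are not declared in the primary type."""
    declared = {field["name"] for field in typed_data["types"][typed_data["primaryType"]]}
    return sorted(key for key in typed_data["message"] if key not in declared)
```

Keeping `limits` inside the signed message means the signer commits to the scope of the claim as well as the digest. A field such as `safety` would show up in `undeclared_fields` instead of slipping into the signed statement.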

Format Boundary

File format is part of the receipt because the same tensors can be serialized in more than one way. SafeTensors gives a concrete format context: a file that stores tensor data alongside a JSON header of metadata.

The format field prevents a common mistake: treating a converted artifact as the same object without recording the conversion. Byte identity changes when serialization changes, even if the model is intended to behave similarly.
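To illustrate, the sketch below hand-builds two tiny files in the safetensors layout (an 8-byte little-endian header length, a JSON header, then the tensor bytes) with identical tensor data but different header whitespace. The helper names are illustrative, and the hand-built files are a simplification that may not satisfy every check the real safetensors library applies.

```python
import hashlib
import json
import struct


def build_file(header: dict, tensor_bytes: bytes, indent=None) -> bytes:
    """Serialize header and data in the safetensors layout."""
    header_bytes = json.dumps(header, indent=indent).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + tensor_bytes


def read_header(blob: bytes) -> dict:
    """Parse the JSON header from the front of a safetensors-style blob."""
    (header_length,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8 : 8 + header_length])


header = {"weight": {"dtype": "F32", "shape": [1], "data_offsets": [0, 4]}}
tensor_bytes = struct.pack("<f", 1.0)

compact = build_file(header, tensor_bytes)
indented = build_file(header, tensor_bytes, indent=2)

same_tensors = read_header(compact) == read_header(indented)
same_bytes = hashlib.sha256(compact).digest() == hashlib.sha256(indented).digest()
```

Here `same_tensors` is true and `same_bytes` is false: the two files describe the same tensor but are different artifacts. A receipt for one does not cover the other unless the conversion is recorded.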

Boundary Table

The registry should keep every claim in its lane.

| Field | What it can say | What it cannot say |
| --- | --- | --- |
| Model label | Human-readable pointer | Stable identity |
| Revision | Source state or commit context | File bytes without path and digest |
| Digest | Exact bytes under an algorithm | Quality, safety, or license validity |
| CID | Content-addressed object reference | Raw file hash unless construction matches |
| Signature | Issuer made the statement | Statement is true |
| Model card | Intended use and evaluation context | Exact loaded weights |

This table is the product. A Model Weight Registry becomes useful when a consumer can tell whether a claim is about a name, a file, a receipt, or the model's behavior.

Final Receipt

The safest registry sentence is short: "This receipt identifies these bytes and these limits." Everything else should be linked as separate evidence.

That makes onchain AI claims less brittle. A model name is a pointer; a weight hash receipt is a checkable boundary around the artifact a system actually loaded.

DEV Community