
AI x Crypto Systems


Model Weight Registry: The Name Is Not the Model


Disclosure: AI tools were used for source collection and editorial review. The article was written by a human author, who checked the facts, code, and conclusions.

Crypto risk disclosure: This article is a technical explanation, not investment advice. It is not a recommendation to buy, sell or hold any cryptoasset.

A Model Weight Registry should not treat a model name as a model identity. A name, repository, tag, branch, or user-facing label can point to useful software, but stable identity for AI weights starts with the exact bytes being loaded.

That boundary matters for AI x crypto systems because onchain claims are expensive to correct after users rely on them. If a contract, agent, or audit trail says "model X," the next question should be "which file, revision, size, digest, and receipt?"

Byte Identity

A digest identifies bytes under a chosen algorithm and input. NIST FIPS 180-4 defines secure hash algorithms, while RFC 6920 describes naming information with hashes.

That support is narrow and useful. A SHA-256 digest can say the file bytes match a recorded value; a digest cannot say the model is safe, aligned, licensed, useful, or trained on the right data.
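As a minimal sketch of that narrow claim, the function below streams a file through SHA-256 using only the Python standard library. The function name `sha256_file` and the `sha256:` prefix convention are assumptions made for illustration, not part of any cited standard.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return 'sha256:<hex>' for the exact bytes stored at path."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so multi-gigabyte weight files never need to fit in memory.
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return f"sha256:{digest.hexdigest()}"
```

The return value answers one question only: do these bytes match a recorded value? It carries no information about how the bytes were produced or what the model does.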

Weight Receipt

The practical artifact is a receipt that refuses to overclaim. Instead of a full JSON object, the registry can show the receipt as an audit line with named fields:

| Receipt field | Example value | Why the field exists |
| --- | --- | --- |
| Type | ai.weight_hash_receipt.v1 | Separates this statement from a model card or benchmark |
| Model label | org/model-name | Keeps the human pointer visible |
| Source revision | Full commit hash | Avoids a floating branch as identity |
| File path | model.safetensors | Names the exact artifact inside the source |
| Format and size | safetensors, byte count | Catches conversion and truncation mistakes |
| Hash | sha256:<64 hex chars> | Identifies the exact byte sequence |
| Optional content address | CID plus construction notes | Prevents CID/file-hash confusion |
| Issuer and signature | Registry key id, EIP-712 profile | Says who made the statement |
| Limits | Byte identity only | Blocks safety, license, and behavior overclaims |

This receipt is not a universal standard. It is a defensive shape for registries that want to separate exact artifact identity from marketing labels.
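One way to make those fields concrete is a small typed record that rejects malformed digests at construction time. This is a sketch under assumptions: the class name `WeightReceipt`, the Python field names, and the placeholder values are all hypothetical, and the optional content address and signature are left out to keep it short.

```python
from dataclasses import asdict, dataclass

_HEX = set("0123456789abcdef")


@dataclass(frozen=True)
class WeightReceipt:
    model_label: str        # human pointer, e.g. "org/model-name"
    source_revision: str    # full commit hash, not a branch name
    file_path: str          # exact artifact inside the source
    file_format: str        # e.g. "safetensors"
    size_bytes: int         # byte count of the file
    sha256: str             # "sha256:<64 hex chars>"
    issuer_key_id: str      # who made the statement
    limits: str = "byte identity only"
    type: str = "ai.weight_hash_receipt.v1"

    def __post_init__(self) -> None:
        prefix, _, hex_part = self.sha256.partition(":")
        if prefix != "sha256" or len(hex_part) != 64 or not set(hex_part) <= _HEX:
            raise ValueError("sha256 must look like 'sha256:<64 lowercase hex chars>'")
        if self.size_bytes < 0:
            raise ValueError("size_bytes must be non-negative")


# Placeholder values only; none of these identify a real model file.
receipt = WeightReceipt(
    model_label="org/model-name",
    source_revision="0" * 40,
    file_path="model.safetensors",
    file_format="safetensors",
    size_bytes=1024,
    sha256="sha256:" + "ab" * 32,
    issuer_key_id="registry-key-1",
)
print(asdict(receipt)["limits"])
```

Keeping `limits` as a default field means every receipt carries its own scope statement unless an issuer deliberately changes it.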

Canonical Receipt

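If the receipt itself is hashed or signed, its representation matters. The same JSON value can be serialized to many different byte sequences (key order, whitespace, number formatting), so RFC 8785, the JSON Canonicalization Scheme, exists to fix one representation before hashing and signing.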

The registry should therefore decide what is hashed: the model file, the canonical receipt, or both. Mixing those claims is how a model registry starts saying "verified" without saying what was verified.
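The difference between hashing the file and hashing the receipt can be sketched as follows. Note the assumption: `json.dumps` with sorted keys and compact separators only approximates RFC 8785 for receipts limited to ASCII keys, strings, and integers. It is not a full JCS implementation, and the function names are hypothetical.

```python
import hashlib
import json


def canonical_bytes(receipt: dict) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace.

    Approximation only: full RFC 8785 also fixes number formatting and
    sorts keys by UTF-16 code units, which this sketch does not attempt.
    """
    text = json.dumps(receipt, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def receipt_digest(receipt: dict) -> str:
    """Digest of the receipt statement, which is not the digest of the model file."""
    return "sha256:" + hashlib.sha256(canonical_bytes(receipt)).hexdigest()


# Two receipts with the same content but different key order.
a = {"file_path": "model.safetensors", "size_bytes": 1024}
b = {"size_bytes": 1024, "file_path": "model.safetensors"}
print(receipt_digest(a) == receipt_digest(b))
```

A registry that publishes both values should label them separately, for example "file digest" and "receipt digest", so a consumer knows which one a "verified" badge refers to.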

Supply-Chain Pattern

Software supply-chain systems already use the subject-plus-digest pattern. SLSA Provenance and the in-toto Statement specification bind subjects to names and digests in provenance statements.

A Model Weight Registry can borrow that habit without pretending the problem is solved. The digest gives artifact identity; the provenance statement gives issuer and process context; neither one proves the model's behavior.
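For orientation, here is a sketch of the subject-plus-digest shape used by the in-toto Statement v1 layer. The `_type` value and the `subject`, `predicateType`, and `predicate` keys follow that specification; the predicate type URI and predicate contents below are hypothetical placeholders, not a registered predicate.

```python
def make_statement(name: str, sha256_hex: str, predicate_type: str, predicate: dict) -> dict:
    """Bind one named subject to a digest inside an in-toto style Statement."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        # Each subject pairs a name with a map of algorithm -> hex digest.
        "subject": [{"name": name, "digest": {"sha256": sha256_hex}}],
        "predicateType": predicate_type,
        "predicate": predicate,
    }


statement = make_statement(
    name="model.safetensors",
    sha256_hex="ab" * 32,  # placeholder digest
    predicate_type="https://example.com/weight-receipt/v1",  # hypothetical URI
    predicate={"limits": "byte identity only"},
)
print(statement["subject"][0]["digest"]["sha256"][:8])
```

The subject carries identity and the predicate carries context. Nothing in either half describes how the weights behave at inference time.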

Tag Boundary

Container registries make the pointer problem familiar. The OCI descriptor identifies content with a media type, digest, and size, while tags remain convenient names that can move.

AI weights need the same discipline. A label such as "latest," "main," or "production" is an operational pointer. A digest and size are the beginning of stable artifact identity.
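The pointer problem fits in a few lines. The sketch below uses an in-memory dictionary as a stand-in for a tag table; the media type string and the helper name `describe` are hypothetical, and only the three descriptor fields (media type, digest, size) come from the OCI descriptor shape.

```python
import hashlib

# Hypothetical media type, used here only for illustration.
WEIGHTS_MEDIA_TYPE = "application/vnd.example.weights.safetensors"


def describe(data: bytes) -> dict:
    """Build an OCI-style descriptor: media type, digest, and size."""
    return {
        "mediaType": WEIGHTS_MEDIA_TYPE,
        "digest": "sha256:" + hashlib.sha256(data).hexdigest(),
        "size": len(data),
    }


# A tag table is mutable by design: tags are names that point at descriptors.
tags = {"production": describe(b"weights-v1")}

# A consumer records the descriptor it resolved, not the tag it asked for.
pinned = dict(tags["production"])

# Later the operator repoints the tag at new weights.
tags["production"] = describe(b"weights-v2")

tag_moved = tags["production"]["digest"] != pinned["digest"]
still_v1 = pinned == describe(b"weights-v1")
print(tag_moved, still_v1)
```

The recorded descriptor keeps describing the original bytes after the tag moves, which is why a receipt should store the descriptor and treat the tag as metadata.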

Revision Boundary

Model hubs already expose a better path than floating names. The Hugging Face Hub download docs describe a revision parameter that accepts a branch, a tag, or a commit identifier. Of those, only a full commit hash is a fixed reference, because branches and tags can be moved.

The registry should record the revision, not just the repository name. Without the revision and file path, the phrase "we used org/model-name" is a clue, not an identity.
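A registry client can enforce that rule before it downloads anything. In this sketch, `hf_hub_download` and its `repo_id`, `filename`, and `revision` parameters come from the `huggingface_hub` library; the wrapper name `download_pinned` and the 40-hex-character check (the length of a Git SHA-1 commit hash) are assumptions for illustration.

```python
import re

# A full Git SHA-1 commit hash: exactly 40 lowercase hex characters.
FULL_COMMIT = re.compile(r"[0-9a-f]{40}")


def is_pinned_revision(revision: str) -> bool:
    """True only for a full commit hash; branches, tags, and short hashes fail."""
    return FULL_COMMIT.fullmatch(revision) is not None


def download_pinned(repo_id: str, filename: str, revision: str) -> str:
    """Download one file at an exact commit, refusing floating revisions."""
    if not is_pinned_revision(revision):
        raise ValueError(f"revision {revision!r} is not a full commit hash")
    # Imported lazily so the validator works without the third-party package.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename, revision=revision)


print(is_pinned_revision("main"), is_pinned_revision("0" * 40))
```

The commit hash pins the source state. The file digest still has to be computed and recorded separately, since a revision identifies a repository snapshot, not a byte sequence.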

Content Address

Content addressing can help, but a Model Weight Registry should not flatten every content address into "the file hash." IPFS content-addressing docs explain content-derived identifiers, while CID construction depends on representation details.

That caveat belongs in the receipt. A CID is useful when the registry records the CID version, codec, chunking or import method, and the relationship between the CID and the file digest.
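To see why the construction notes matter, consider the simplest case: a CIDv1 with the raw codec and a SHA-256 multihash over a single block. The sketch below builds that one specific form by hand with the standard library. It is an illustration under that assumption only; a large file imported with chunking becomes a DAG whose root CID does not wrap the file's SHA-256 at all.

```python
import base64
import hashlib


def cidv1_raw_sha256(block: bytes) -> str:
    """CIDv1 (raw codec, sha2-256) for one block, in base32 multibase form."""
    digest = hashlib.sha256(block).digest()
    # Multihash: 0x12 = sha2-256 code, 0x20 = digest length of 32 bytes.
    multihash = bytes([0x12, 0x20]) + digest
    # CID: 0x01 = version 1, 0x55 = raw binary codec.
    cid_bytes = bytes([0x01, 0x55]) + multihash
    # Multibase prefix "b" marks lowercase base32 without padding.
    return "b" + base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")


block = b"hello"
print(cidv1_raw_sha256(block))
print(hashlib.sha256(block).hexdigest())
```

Even here, where the CID contains the same digest, the two strings differ in prefix bytes and encoding. A receipt that stores both should say how one was derived from the other.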

Signed Statement

A signed receipt can authenticate who made the claim. EIP-712 supports typed structured-data signing with domain separation, which fits a registry receipt better than an opaque string.

The signature still has a hard limit. A signed false receipt is still false; a signed byte-identity receipt still says nothing about safety, license rights, or training data.
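The typed-data payload for such a receipt might look like the sketch below. The top-level keys (`types`, `primaryType`, `domain`, `message`) and the `EIP712Domain` fields follow EIP-712; the `WeightReceipt` struct, its field names, and the domain name are hypothetical. Actual signing needs a Keccak-256 implementation and a key, which a wallet or signing library would supply and which is out of scope here.

```python
def receipt_typed_data(message: dict, chain_id: int, verifying_contract: str) -> dict:
    """Build an EIP-712 typed-data payload for a byte-identity receipt."""
    return {
        "types": {
            "EIP712Domain": [
                {"name": "name", "type": "string"},
                {"name": "version", "type": "string"},
                {"name": "chainId", "type": "uint256"},
                {"name": "verifyingContract", "type": "address"},
            ],
            # Hypothetical struct: field names are illustrative, not a standard.
            "WeightReceipt": [
                {"name": "modelLabel", "type": "string"},
                {"name": "sourceRevision", "type": "string"},
                {"name": "filePath", "type": "string"},
                {"name": "sizeBytes", "type": "uint64"},
                {"name": "sha256", "type": "bytes32"},  # a SHA-256 digest is 32 bytes
                {"name": "limits", "type": "string"},
            ],
        },
        "primaryType": "WeightReceipt",
        "domain": {
            "name": "ModelWeightRegistry",  # hypothetical domain name
            "version": "1",
            "chainId": chain_id,
            "verifyingContract": verifying_contract,
        },
        "message": message,
    }


typed = receipt_typed_data(
    message={
        "modelLabel": "org/model-name",
        "sourceRevision": "0" * 40,      # placeholder commit
        "filePath": "model.safetensors",
        "sizeBytes": 1024,
        "sha256": "0x" + "ab" * 32,      # placeholder digest
        "limits": "byte identity only",
    },
    chain_id=1,
    verifying_contract="0x" + "00" * 20,  # placeholder address
)
print(typed["primaryType"])
```

Putting `limits` inside the signed struct means the scope statement is covered by the signature instead of living in unsigned documentation.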

Format Boundary

File format is part of the receipt because model weights are not just names. SafeTensors gives a concrete format context for tensor files and metadata.

The format field prevents a common mistake: treating a converted artifact as the same object without recording the conversion. Byte identity changes when serialization changes, even if the model is intended to behave similarly.

Boundary Table

The registry should keep every claim in its lane.

| Field | What it can say | What it cannot say |
| --- | --- | --- |
| Model label | Human-readable pointer | Stable identity |
| Revision | Source state or commit context | File bytes without path and digest |
| Digest | Exact bytes under an algorithm | Quality, safety, or license validity |
| CID | Content-addressed object reference | Raw file hash unless construction matches |
| Signature | Issuer made the statement | Statement is true |
| Model card | Intended use and evaluation context | Exact loaded weights |

This table is the product. A Model Weight Registry becomes useful when a consumer can tell whether a claim is about a name, a file, a receipt, or the model's behavior.

Final Receipt

The safest registry sentence is short: "This receipt identifies these bytes and these limits." Everything else should be linked as separate evidence.
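On the consumer side, that sentence turns into a load-time check. The sketch below compares a local file against a recorded size and digest and reports which claim failed. The function name `check_against_receipt` and the shape of its return value are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def check_against_receipt(path: Path, size_bytes: int, sha256_hex: str) -> list[str]:
    """Return a list of mismatches; an empty list means the bytes match."""
    problems = []
    actual_size = path.stat().st_size
    if actual_size != size_bytes:
        # Cheap check first: a size mismatch already proves the bytes differ.
        problems.append(f"size: expected {size_bytes}, got {actual_size}")
        return problems
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Stream in 1 MiB chunks so large weight files are never fully in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    if digest.hexdigest() != sha256_hex.lower():
        problems.append("sha256: digest does not match the receipt")
    return problems
```

An empty result supports exactly one statement: the loaded file has the recorded size and digest. Any safety, license, or evaluation claim still needs its own linked evidence.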

That makes onchain AI claims less brittle. A model name is a pointer; a weight hash receipt is a checkable boundary around the artifact a system actually loaded.
