AI provenance systems need one shared record contract

#ai #devops #automation #opensource

AI provenance systems usually fail in a boring way: the data exists, but different parts of the product disagree about what it means. In LineageLens, the extension captures the insertion, the backend stores the provenance record, and the MCP server answers questions from that same record. If one layer renames a field, drops a status, or normalizes a value differently, the record still exists, but the product stops telling one coherent story.

The fix is to treat provenance as a contract, not a payload. Canonical fields matter: prompt, model, tool, file path, line range, capture method, risk score, and outcome. The outcome is especially important because applied, rejected, and errored mean different things. So do full, metadata_only, tunnel_only, and unavailable. Those labels drive dashboards, exports, and assistant answers, so every layer has to spell and interpret them the same way.
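One way to make the contract concrete is a single typed record that every layer imports. The sketch below is an illustration, not LineageLens code: the class and field names, the types, the `schema_version` field, and the choice to treat full, metadata_only, tunnel_only, and unavailable as values of the capture method are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum

SCHEMA_VERSION = 1  # hypothetical versioning scheme


class Outcome(str, Enum):
    APPLIED = "applied"
    REJECTED = "rejected"
    ERRORED = "errored"


class CaptureMethod(str, Enum):
    # Assumption: these labels are the capture method's allowed values.
    FULL = "full"
    METADATA_ONLY = "metadata_only"
    TUNNEL_ONLY = "tunnel_only"
    UNAVAILABLE = "unavailable"


@dataclass(frozen=True)
class ProvenanceRecord:
    prompt: str
    model: str
    tool: str
    file_path: str
    line_start: int
    line_end: int
    capture_method: CaptureMethod
    risk_score: float
    outcome: Outcome
    schema_version: int = SCHEMA_VERSION

    def __post_init__(self):
        # Coerce plain strings to enum members; an unknown label raises
        # ValueError here instead of flowing silently into storage.
        object.__setattr__(self, "capture_method", CaptureMethod(self.capture_method))
        object.__setattr__(self, "outcome", Outcome(self.outcome))
        if self.line_start > self.line_end:
            raise ValueError("line_start must not exceed line_end")
```

Because the enums reject unknown strings, a layer that starts emitting something like "accepted" instead of "applied" fails loudly at construction time rather than producing a record other layers misread.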

The practical takeaway is simple: version the schema, validate at the boundary, and test every surface against the same fixture. If you are building any internal system where capture, storage, and assistant access all sit on top of the same record, consistency is a feature, not a nice-to-have.
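A minimal sketch of those three steps, under stated assumptions: the field names, the version number, the fixture values, and the three surface functions (`dashboard_row`, `export_line`, `assistant_answer`) are hypothetical stand-ins, not LineageLens APIs.

```python
SCHEMA_VERSION = 1  # hypothetical version number
REQUIRED = {
    "schema_version", "prompt", "model", "tool", "file_path",
    "line_range", "capture_method", "risk_score", "outcome",
}
OUTCOMES = {"applied", "rejected", "errored"}
# Assumption: these labels are the allowed capture method values.
CAPTURE_METHODS = {"full", "metadata_only", "tunnel_only", "unavailable"}


def validate_record(raw: dict) -> dict:
    """Reject anything that does not match the contract at the boundary."""
    missing = REQUIRED - set(raw)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    unknown = set(raw) - REQUIRED
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    if raw["schema_version"] != SCHEMA_VERSION:
        raise ValueError(f"unsupported schema_version: {raw['schema_version']}")
    if raw["outcome"] not in OUTCOMES:
        raise ValueError(f"unknown outcome: {raw['outcome']}")
    if raw["capture_method"] not in CAPTURE_METHODS:
        raise ValueError(f"unknown capture_method: {raw['capture_method']}")
    return dict(raw)


# One shared fixture that every surface is tested against (made-up values).
FIXTURE = {
    "schema_version": 1, "prompt": "add retry logic", "model": "example-model",
    "tool": "example-tool", "file_path": "src/app.py", "line_range": [10, 14],
    "capture_method": "metadata_only", "risk_score": 0.3, "outcome": "rejected",
}


# Hypothetical stand-ins for the dashboard, export, and assistant surfaces.
def dashboard_row(record: dict) -> dict:
    return {"file": record["file_path"], "outcome": record["outcome"]}


def export_line(record: dict) -> str:
    return f'{record["file_path"]},{record["outcome"]}'


def assistant_answer(record: dict) -> str:
    return f'The change to {record["file_path"]} was {record["outcome"]}.'


def check_surfaces_agree(raw: dict) -> bool:
    record = validate_record(raw)
    expected = record["outcome"]
    assert dashboard_row(record)["outcome"] == expected
    assert export_line(record).split(",")[-1] == expected
    assert expected in assistant_answer(record)
    return True
```

Running `check_surfaces_agree(FIXTURE)` in each layer's test suite means a renamed field or a new status breaks a test before it reaches a dashboard or an assistant answer.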
