Praveen

Posted on Jun 3

Logs Are Not Audit Artifacts: Why AI-Generated Code Needs a Signed AI BOM

#opensource #discuss #news #showdev

Most teams are trying to solve AI provenance with dashboards. That is the wrong object.

A dashboard is useful while the system is live. It tells you what happened, who touched what, and where the risk seems concentrated. But audit work happens later, after the prompt is forgotten, the model changes, the PR is merged, and someone asks a question the dashboard was never designed to answer:

Can you prove this record still means what it said it meant?

That is a different problem.

A log answers one question: what did the system emit at the time?

A provenance record answers a better one: what happened, with enough context to reconstruct the event?

An AI BOM answers the hardest one: can I trust this exported summary later, after the data has moved, been redacted, or been shared with another team?

That distinction matters.

Why logs are not enough
Logs are operational data. They are great for debugging and incident response. They are not naturally shaped for later trust.

They are usually mutable. They are often provider-specific. They are usually optimized for collection, not for a later proof. And in the AI code provenance case, they miss the exact thing people ask for in review:
What model produced this? What prompt triggered it? Was the record modified after the fact?

If the answer is “we have logs,” that is usually a sign the system still treats provenance like telemetry.

That is why I prefer a different frame.

AI provenance needs an artifact.

Not a chart. Not a usage counter. Not a dashboard screenshot.

An artifact.

What an AI BOM is
AI BOM means AI Bill of Materials.

It is the same general idea as an SBOM, but for AI-generated code. Instead of listing dependencies, it lists AI-generated provenance: what got changed, which model produced it, whether the prompt was captured or redacted, and whether the export itself still verifies.

A useful AI BOM should answer, at minimum:

What was changed?
Which file was touched?
Which model generated the code?
Was the prompt captured, hashed, or redacted?
Is the provenance chain intact?
Can the exported document be verified later?
That is the object we are trying to produce.

In LineageLens, that means the backend now does two things in Plus/Max mode:

It chains provenance records together with prev_hash and record_hash.
It exports a signed AI BOM that can be checked later.
The point is not to make the dashboard prettier. The point is to make the record trustable.

What the integrity layer looks like
Here is the basic flow:

AI tool / editor
|
v
capture layer / proxy
|
v
provenance record
|
+--> record_hash + prev_hash
|
+--> signed AI BOM export
|
v
verifier / auditor / downstream system
The important part is that the exported document is not just a dump of rows. It is a signed summary of the provenance state.

In the current implementation, each provenance record gets a chain hash built from a canonical set of fields. The prompt itself is not copied into the AI BOM export. Instead, the export uses a prompt hash, which gives you disclosure tracking without leaking raw prompt text into the artifact.

A simplified version looks like this:
canonical = {
"uuid": record.uuid,
"workspace_id": record.workspace_id,
"file_path": record.file_path,
"inserted_code": record.inserted_code or "",
"model_name": record.model_name or "",
"prompt_sha256": sha256(prompt_messages),
"timestamp_iso": record.timestamp_iso,
"prev_hash": previous_hash or "",
}

record_hash = sha256(json.dumps(canonical, sort_keys=True).encode()).hexdigest()
If any field changes, the hash changes.

That is the point.

The verify path then recomputes the chain and stops at the first mismatch. If something in the middle was altered, the first broken UUID tells you exactly where the trust boundary failed.

Why prompt hashes instead of raw prompts
This part matters more than people think.

Raw prompts are useful inside the provenance store. They are often essential for internal debugging, review, and governance workflows. But the exported artifact has a different job.

An export is meant to be shared.

Once you are sharing, raw prompts become a liability. They can contain code, internal names, API structure, or sensitive business context. That is why the AI BOM uses a prompt hash in the export. It gives you fingerprinting without dumping private text into every report.

Top comments (1)

Praveen • Jun 3

Drop the Questions below!!