In late April 2026, Microsoft shipped VS Code 1.117. Buried in the release was a change: the github.copilot.chat.generateCommitMessage.addCoAuthoring setting was flipped from off to all by default. That meant "Co-authored-by: Copilot <copilot@github.com>" was now appended to every commit message in the background: silently, without showing up in the commit message editor, and, critically, without verifying that Copilot had generated any of the code.
Developers noticed within days. The backlash was significant. VS Code 1.119 shipped May 3 with the default reverted and a consent requirement added. Microsoft apologized.
The technical fix was straightforward. The governance question it exposed is not.
What the incident actually revealed
The developer anger wasn't really about attribution credit. It was about consent and accuracy. The co-author trailer was added to commits where AI features were disabled. It was added when developers had manually written every line. It attributed work that wasn't done.
But underneath that anger is a more important problem: even when Copilot does write code, a "Co-authored-by" git trailer tells you almost nothing useful from a governance or security standpoint.
It tells you that a tool called Copilot existed somewhere in the developer's editor during some portion of the work that eventually became this commit. That's it.
It doesn't tell you which model generated which lines. It doesn't tell you what the developer prompted for. It doesn't contain the raw model response. It doesn't tell you whether any of the AI-generated lines touched authentication paths, hardcoded credentials, or SQL construction. It says nothing about when generation happened relative to the commit. It carries no risk score.
If you had to defend a specific commit in a security audit six months from now — "which parts of this function were AI-generated, under what prompt, using what model?" — a git trailer gets you nowhere.
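To make the limitation concrete, here is a minimal sketch of everything a parser can recover from such a trailer. The function and type names are illustrative, not part of git or VS Code, and the sample commit message is made up; it assumes the standard `Name <email>` trailer form.

```typescript
// Everything a Co-authored-by trailer can yield: a display name and an email.
type CoAuthor = { name: string; email: string };

// Extracts Co-authored-by trailers from a raw commit message.
// Simplified: scans every line rather than only the final trailer block.
function parseCoAuthors(commitMessage: string): CoAuthor[] {
  const trailer = /^Co-authored-by:\s*(.+?)\s*<([^<>]+)>\s*$/i;
  const found: CoAuthor[] = [];
  for (const line of commitMessage.split(/\r?\n/)) {
    const match = trailer.exec(line);
    if (match) {
      const [, name = "", email = ""] = match;
      found.push({ name, email });
    }
  }
  return found;
}

// Illustrative commit message, not taken from a real repository.
const sampleMessage = [
  "Fix token refresh race",
  "",
  "Co-authored-by: Copilot <copilot@github.com>",
].join("\n");

const coAuthors = parseCoAuthors(sampleMessage);
console.log(coAuthors);
```

The result has two string fields per trailer. There is nowhere in that structure to put a model name, a prompt, a line range, or a timestamp.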
What a real provenance record contains
LineageLens captures provenance at insertion time, not commit time. Each AI code insertion generates a ProviderAgnosticProvenanceEvent structured around schema version lineagelens.provenance-event.v1. Here is what that record contains:
```typescript
type ProviderAgnosticProvenanceEvent = {
  schemaVersion: 'lineagelens.provenance-event.v1';
  eventId: string;
  timestamps: {
    observedAtIso: string;          // when the extension saw the insertion
    insertedAtIso: string;          // when the text hit the buffer
    requestAtIso: string | null;    // when the proxy saw the outbound request
    responseAtIso: string | null;   // when the model responded
  };
  source: {
    ide: string | null;             // 'vscode'
    shim: string;                   // which capture path fired
    toolName: string | null;        // 'Edit', 'Write', 'apply_patch', etc.
    provider: string | null;        // 'anthropic', 'openai', 'google'
    adapterName: string | null;     // 'claude-code', 'copilot', 'cursor', etc.
  };
  capture: {
    level: CaptureStatus;           // 'full' | 'metadata_only' | 'tunnel_only' | 'file_diff'
    promptStatus: 'captured' | 'not-captured';
    capabilities: ProvenanceEventCapability[]; // 10 named slots
  };
  model: {
    name: unknown;
    parameters: Record<string, unknown> | null; // temperature, max_tokens, etc.
  };
  prompt: {
    body: unknown;                  // the full prompt messages array
    system: unknown;                // the system prompt
  };
  diff: {
    insertedText: string;
    chunks: ProvenanceInsertedChunk[];
    netAddedLines: number;
  };
  correlation: {
    confidence: number;             // 0.0–1.0
    timingDifferenceMs: number | null;
    contentSimilarityScore: number | null;
    fileContextMatched: boolean;
  };
};
```
Compare that to what a git trailer gives you: a tool name and an email address.
The 10 capability slots
The most important part of the schema is the capture.capabilities array. Every provenance event gets 10 named capability assessments:
- prompt-body: was the full prompt captured?
- response-body: was the raw model response captured?
- headers: were the request headers available?
- request-id: was a UUID present to link request to insertion?
- session-id: was there session context?
- model: was the model name captured?
- user-agent: was the tool's user-agent available?
- file-diff: was the inserted diff captured? (always 'provided')
- file-context: did the file context match the capture?
- workspace: was workspace context available?
Each entry carries a status: provided, missing, or unknown.
This matters because it tells you precisely what you know about a given insertion and what you don't. A record with prompt-body: missing and promptStatus: 'not-captured' is not the same as no record — it is an explicit declaration that the prompt gap exists. That gap is auditable. An audit trail with explicit gaps is categorically more useful than a label with no gaps declared.
The VS Code co-author trailer has no gap declarations and no granularity at all. There is nothing in it to audit.
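As a rough sketch of how explicit gaps can be read back out of a record: the three status values come from the list above, but the entry shape (`name` plus `status`) and the `declaredGaps` helper are assumptions made for illustration, since the article does not show the actual ProvenanceEventCapability definition.

```typescript
// Status values as listed in the article.
type CapabilityStatus = "provided" | "missing" | "unknown";

// Hypothetical shape for one capability entry; the real type is not shown.
type CapabilityEntry = { name: string; status: CapabilityStatus };

// Returns the names of every capability that was not captured,
// i.e. the gaps the record explicitly declares.
function declaredGaps(capabilities: CapabilityEntry[]): string[] {
  return capabilities
    .filter((entry) => entry.status !== "provided")
    .map((entry) => entry.name);
}

// Illustrative subset of slots for a capture without proxy data:
// the diff is known, the prompt and response are not.
const sampleCapabilities: CapabilityEntry[] = [
  { name: "prompt-body", status: "missing" },
  { name: "response-body", status: "missing" },
  { name: "model", status: "unknown" },
  { name: "file-diff", status: "provided" },
  { name: "workspace", status: "provided" },
];

const gaps = declaredGaps(sampleCapabilities);
console.log(gaps);
```

An auditor reading this output knows what evidence is absent for this insertion, which is a different position from having no record at all.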
Capture time vs. commit time
The harder architectural point: by the time you are in a git commit, you have already lost the evidence.
The prompt body does not live anywhere post-generation. The model name was in the HTTP response header. The raw response body was discarded after the tool processed it. The timing data only exists in the milliseconds between request and file write.
None of that is in the commit. None of it can be recovered retroactively.
LineageLens captures the ProviderAgnosticProvenanceEvent at the insertion event — before the diff even exists as a file change. The observedAtIso timestamp records when the VS Code extension detected the text entering the buffer. The requestAtIso and responseAtIso timestamps come from the proxy intercept that happened seconds or minutes before. By the time you type a commit message, the provenance record has already been stored.
A git trailer is a retroactive label. Provenance is an evidence chain that exists before the label does.
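One way to picture why the timing evidence has to be captured at insertion: the gap between the model response and the buffer write can only be computed if both timestamps were recorded when they happened. The sketch below derives a value like timingDifferenceMs from the schema's timestamp fields. The helper name and the sample timestamps are hypothetical, and this is not necessarily how LineageLens computes the field.

```typescript
// Subset of the timestamps block from the schema.
type InsertionTimestamps = {
  insertedAtIso: string;
  responseAtIso: string | null;
};

// Milliseconds between the model response and the text hitting the buffer.
// Returns null when no proxy timestamp was captured or a timestamp is unparseable.
function timingDifferenceMs(timestamps: InsertionTimestamps): number | null {
  if (timestamps.responseAtIso === null) return null;
  const inserted = Date.parse(timestamps.insertedAtIso);
  const responded = Date.parse(timestamps.responseAtIso);
  if (Number.isNaN(inserted) || Number.isNaN(responded)) return null;
  return inserted - responded;
}

// Illustrative values: the proxy saw the response half a second before the insertion.
const withProxy = timingDifferenceMs({
  insertedAtIso: "2026-05-01T10:00:02.500Z",
  responseAtIso: "2026-05-01T10:00:02.000Z",
});

// Without proxy capture there is nothing to correlate against.
const withoutProxy = timingDifferenceMs({
  insertedAtIso: "2026-05-01T10:00:02.500Z",
  responseAtIso: null,
});

console.log(withProxy, withoutProxy);
```

At commit time neither input exists anymore, so the null case is the best a commit-time tool could ever report.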
What actually changed after the VS Code incident
Microsoft reverted the default. They added a consent gate. They clarified that disableAIFeatures: true now also disables the co-authoring trailer.
None of that gives you a provenance record. You still do not know which lines in a given commit were AI-generated. You still cannot answer "what did Copilot generate in auth.py last month" from git history alone.
The incident forced consent around labeling. That is progress. It did not touch the underlying gap: labeling that something was AI-assisted is not the same as recording what the AI actually did.
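For contrast, here is a minimal sketch of how the auth.py question could be answered if insertion-time records were stored. The flattened record type, the `filePath` field, the `insertionsBy` helper, and all sample values are hypothetical: the schema excerpt does not show where the file path is recorded, so treat that field as an assumption.

```typescript
// Flattened view of a stored event. Most fields mirror the schema;
// filePath is hypothetical, since the excerpt does not show where the path lives.
type StoredInsertion = {
  eventId: string;
  insertedAtIso: string;
  adapterName: string | null;
  filePath: string;
  netAddedLines: number;
};

// Returns insertions by one adapter into one file within [fromIso, toIso).
function insertionsBy(
  records: StoredInsertion[],
  adapterName: string,
  filePath: string,
  fromIso: string,
  toIso: string,
): StoredInsertion[] {
  const from = Date.parse(fromIso);
  const to = Date.parse(toIso);
  return records.filter((record) => {
    const insertedAt = Date.parse(record.insertedAtIso);
    return (
      record.adapterName === adapterName &&
      record.filePath === filePath &&
      insertedAt >= from &&
      insertedAt < to
    );
  });
}

// Illustrative records, not real data.
const sampleRecords: StoredInsertion[] = [
  { eventId: "evt-1", insertedAtIso: "2026-04-14T09:30:00Z", adapterName: "copilot", filePath: "auth.py", netAddedLines: 12 },
  { eventId: "evt-2", insertedAtIso: "2026-04-20T16:05:00Z", adapterName: "cursor", filePath: "auth.py", netAddedLines: 4 },
  { eventId: "evt-3", insertedAtIso: "2026-05-02T11:00:00Z", adapterName: "copilot", filePath: "auth.py", netAddedLines: 7 },
];

const aprilCopilotInsertions = insertionsBy(
  sampleRecords,
  "copilot",
  "auth.py",
  "2026-04-01T00:00:00Z",
  "2026-05-01T00:00:00Z",
);
console.log(aprilCopilotInsertions.map((record) => record.eventId));
```

The query itself is trivial. What makes it possible is that each record was written when the insertion happened, not reconstructed from the commit afterwards.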
The practical implication
If you are on a team shipping AI-generated code — and 84% of development teams are, per the 2026 Stack Overflow Developer Survey — you are almost certainly making three implicit assumptions:
- That your CI pipeline or git history contains enough attribution information to answer an audit question.
- That "Co-authored-by" or an equivalent label satisfies your traceability obligations.
- That you could reconstruct the provenance of a specific function if you had to.
All three assumptions are likely wrong for the same reason: commit-time labeling cannot carry insertion-time evidence.
The enforcement window for EU AI Act Articles 11 (technical documentation) and 12 (record-keeping) opens in August 2026. The question "which AI model generated this code, under what prompt, at what risk level?" is likely to become a routine compliance requirement.
When it does, a git trailer is not going to be a defensible answer.
Try it
LineageLens Base is a free VS Code extension that starts capturing provenance events at insertion time today, even without proxy infrastructure. Lite, Plus, and Max tiers add proxy capture for full prompt, model, and response-body fields. The full architecture details are at lineage-website.vercel.app. The Hashnode post goes deeper on schema design tradeoffs.
One question for the comments: what data would your team actually need to survive a security audit of your AI-generated code? Not in theory — what specific fields would an auditor ask for?