DEV Community


Building a Flight Recorder for AI: Inside the VAP Framework's 5-Layer Provenance Architecture

Your AI made a decision. Now prove it. 🔍

Here's a scenario that's playing out right now in California courtrooms:

An AI hiring tool scores 10,000 job applicants for a Fortune 500 company. Applicant #7,342 — a 58-year-old software engineer with 30 years of experience — gets a "42% match" score and is automatically filtered out. She sues for age discrimination.

Discovery request: "Produce the complete decision record showing why this applicant received this score, including model version, feature weights, input data, and any human review."

The company's response: 🦗

Not because they're hiding something. Because the record doesn't exist. The system logged that a score was produced. It did not log why. The model version active at the moment of scoring? Not recorded. The feature weights? Ephemeral — overwritten on the next training cycle. Human review? The system was designed to operate without it.

This isn't hypothetical. CalMatters reported in March 2026 on exactly this pattern across California's AI employment landscape. The state's updated FEHA regulations prohibit AI-driven discrimination, but the enforcement mechanism assumes evidence that the technology never creates.

Meanwhile, in enterprise security: Kiteworks' 2026 forecast report surveyed 225 organizations deploying AI agents. 33% had zero audit logs for agent operations. 61% had only fragmented logs that couldn't be correlated across systems. A Harvard-led red team of 20+ researchers demonstrated that prompt injection could cause agents to autonomously delete email archives and exfiltrate data — and the affected organizations had no forensic mechanism to reconstruct what happened.

Both cases share the same root cause: the absence of cryptographic provenance infrastructure.

Let's build one.


The Problem, In Code

Here's what "logging" looks like in most AI systems today:

# What most AI systems actually log
import logging
import datetime

logger = logging.getLogger("hiring_ai")

def score_candidate(candidate_id, resume_data):
    score = model.predict(resume_data)  # black box

    # This is the entire "audit trail"
    logger.info(f"{datetime.datetime.now()} | candidate={candidate_id} | score={score}")

    return score

What's wrong with this? Let me count the ways:

┌─────────────────────────────────────────────────────┐
│           Why This "Logging" Is Worthless            │
├─────────────────────────────────────────────────────┤
│  ✗ Timestamp comes from system clock (forgeable)    │
│  ✗ No model version recorded                        │
│  ✗ No input features captured                       │
│  ✗ No confidence score preserved                    │
│  ✗ No human oversight action logged                 │
│  ✗ Admin can DELETE or MODIFY any entry             │
│  ✗ No hash chain — deletion is undetectable         │
│  ✗ No external anchor — must trust the operator     │
│  ✗ No way to prove completeness                     │
└─────────────────────────────────────────────────────┘

A regulator looking at this log can determine that something happened. They cannot determine what, why, when (with certainty), or whether the log is complete.

Now here's what a VAP-compliant provenance record looks like for the same event:

{
  "header": {
    "event_id": "01938a2f-4c1d-7e89-b2a3-1234567890ab",
    "trace_id": "01938a2f-4c1c-7000-8000-000000000001",
    "timestamp_int": "1743004800123456789",
    "timestamp_iso": "2025-03-26T16:00:00.123456789Z",
    "event_type": "SCORING_COMPLETED",
    "hash_algo": "SHA256",
    "sign_algo": "ED25519"
  },
  "provenance": {
    "actor": {
      "type": "AI_MODEL",
      "identifier": "hiring-scorer-v3.2.1",
      "version": "3.2.1",
      "hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
    },
    "input": {
      "sources": ["resume_upload_01938a2e", "job_req_JR-2025-4421"],
      "timestamp": 1743004800100000000,
      "hash": "a7ffc6f8bf1ed76651c14756a061d662f580ff4de43b49fa82d80a4b80f8434a"
    },
    "context": {
      "parameters": {
        "score_threshold": "0.70",
        "feature_set": "v12",
        "bias_mitigation": "demographic_parity_v2"
      },
      "constraints": {
        "protected_attributes_excluded": true,
        "disparate_impact_ratio_limit": "0.80"
      }
    },
    "action": {
      "type": "CANDIDATE_SCORE",
      "decision": {
        "raw_score": "0.42",
        "normalized_score": "0.42",
        "outcome": "FILTERED_OUT",
        "threshold_applied": "0.70"
      },
      "confidence": "0.87",
      "explainability": {
        "method": "SHAP",
        "factors": [
          {"feature": "years_relevant_experience", "weight": "0.31"},
          {"feature": "skill_keyword_match", "weight": "0.28"},
          {"feature": "education_recency", "weight": "-0.22"},
          {"feature": "career_progression_velocity", "weight": "0.05"}
        ]
      }
    },
    "outcome": {
      "result": {"disposition": "AUTO_REJECTED", "queue": "no_human_review"},
      "timestamp": 1743004800123456789,
      "status": "SUCCESS"
    }
  },
  "accountability": {
    "operator_id": "hr-platform-acme-corp",
    "last_approval_by": "system_auto",
    "delegation_chain": [
      {
        "delegator": "jane.doe@acme.example.com",
        "delegatee": "hiring-scorer-v3.2.1",
        "scope": "AUTO_SCORE_AND_FILTER",
        "valid_from": 1740000000000000000,
        "valid_until": 1748000000000000000
      }
    ],
    "override_history": []
  },
  "security": {
    "event_hash": "c3ab8ff13720e8ad9047dd39466b3c89...",
    "prev_hash": "5d41402abc4b2a76b9719d911017c592...",
    "signature": "base64-encoded-ed25519-signature...",
    "merkle_position": 7342,
    "anchor_reference": {
      "type": "RFC3161_TSA",
      "uri": "https://freetsa.org/tsr",
      "timestamp": "2025-03-26T16:05:00Z"
    }
  }
}

Now the regulator can see:

  • ✅ Exactly which model version made the decision (hiring-scorer-v3.2.1)
  • ✅ What inputs were used (hashed, so privacy-preserving but verifiable)
  • ✅ What features drove the score (SHAP explanations, preserved at decision time)
  • ✅ Who delegated authority to the AI (jane.doe@acme.example.com)
  • ✅ Whether a human reviewed the decision (no — override_history is empty)
  • ✅ That the log hasn't been tampered with (hash chain + external anchor)

That's the difference between "trust us, we have logs" and "verify it yourself."


VAP Architecture: 5 Layers, Zero Trust 🏗️

VAP — the Verifiable AI Provenance Framework — is organized around five core layers. Think of them as a stack: each layer builds on the one below it, and you need all five to achieve cryptographically verifiable provenance.

┌─────────────────────────────────────────────────────────┐
│                                                         │
│  Layer 5: EXTERNAL ANCHORING                            │
│  ──────────────────────────────────────────────         │
│  Periodic commitments to independent third parties.     │
│  RFC 3161 TSA, transparency logs, blockchain anchors.   │
│  "Don't trust the operator. Verify against the anchor." │
│                                                         │
│  Layer 4: COMPLETENESS                                  │
│  ──────────────────────────────────────────────         │
│  Hash chain continuity. Mandatory event types.          │
│  Multi-log replication. Omission = detectable.          │
│  "You can't quietly drop events."                       │
│                                                         │
│  Layer 3: ACCOUNTABILITY                                │
│  ──────────────────────────────────────────────         │
│  Delegation chains. Override history.                   │
│  Human oversight tracking. operator_id on everything.   │
│  "Who authorized this? Prove it."                       │
│                                                         │
│  Layer 2: PROVENANCE                                    │
│  ──────────────────────────────────────────────         │
│  Actor identity + version. Input hashes + sources.      │
│  Decision context. Explainability factors.              │
│  "What happened, and why?"                              │
│                                                         │
│  Layer 1: INTEGRITY                                     │
│  ──────────────────────────────────────────────         │
│  SHA-256 hash chains. Ed25519 signatures.               │
│  RFC 6962 Merkle trees. RFC 8785 canonicalization.      │
│  "The math proves nobody changed anything."             │
│                                                         │
└─────────────────────────────────────────────────────────┘

Let's walk through each one with code.


Layer 1: Integrity — The Math That Keeps Everyone Honest 🔐

The foundation of everything in VAP is the hash chain. Every event is cryptographically linked to the previous event. If any event is modified, deleted, or reordered, the chain breaks — and the break is mathematically provable.

Hash Chain Construction

import hashlib
import json

def canonical_json(obj: dict) -> str:
    """
    Deterministic serialization — same input always produces same output.
    Note: sort_keys + compact separators approximates RFC 8785 (JCS); full
    JCS also prescribes specific number formatting and Unicode handling.
    """
    return json.dumps(obj, sort_keys=True, separators=(',', ':'))

def compute_event_hash(event: dict, prev_hash: str) -> str:
    """
    Compute the hash for a VAP event.
    The hash covers: header + provenance + accountability + prev_hash.
    """
    hashable = {
        "header": event["header"],
        "provenance": event["provenance"],
        "accountability": event["accountability"],
        "prev_hash": prev_hash
    }
    canonical = canonical_json(hashable)
    return hashlib.sha256(canonical.encode('utf-8')).hexdigest()

# Build a chain
genesis_hash = "0" * 64  # Genesis block
events = [event_1, event_2, event_3]  # Your VAP events

chain = []
prev = genesis_hash
for event in events:
    event_hash = compute_event_hash(event, prev)
    event["security"]["event_hash"] = event_hash
    event["security"]["prev_hash"] = prev
    chain.append(event)
    prev = event_hash

Chain Verification

Here's the verification function — this is what an auditor runs:

def verify_chain_integrity(events: list) -> bool:
    """
    Verify the entire hash chain.
    Returns False if ANY event has been tampered with.
    """
    for i, event in enumerate(events):
        expected_prev = events[i-1]["security"]["event_hash"] if i > 0 else "0" * 64

        if event["security"]["prev_hash"] != expected_prev:
            print(f"❌ Chain break at event {i}: prev_hash mismatch")
            return False

        recomputed = compute_event_hash(event, expected_prev)
        if event["security"]["event_hash"] != recomputed:
            print(f"❌ Tamper detected at event {i}: hash mismatch")
            return False

    print(f"✅ Chain verified: {len(events)} events, integrity confirmed")
    return True

The key insight: an auditor doesn't need to trust the operator. They take the chain, run verify_chain_integrity(), and the math gives a binary answer. Either the chain is intact, or it's been tampered with. No gray area. No "trust me."
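Here's that binary answer in action. A minimal, self-contained sketch (toy events with the structure trimmed to the hashed sections; helpers repeated from above so it runs standalone) showing that a retroactive edit breaks verification:

```python
import hashlib
import json

def canonical_json(obj):  # as defined above
    return json.dumps(obj, sort_keys=True, separators=(',', ':'))

def compute_event_hash(event, prev_hash):  # as defined above
    hashable = {
        "header": event["header"],
        "provenance": event["provenance"],
        "accountability": event["accountability"],
        "prev_hash": prev_hash,
    }
    return hashlib.sha256(canonical_json(hashable).encode('utf-8')).hexdigest()

# Build a tiny three-event chain
events, prev = [], "0" * 64
for i in range(3):
    e = {"header": {"event_id": i}, "provenance": {"score": i * 0.1},
         "accountability": {"operator_id": "demo"}, "security": {}}
    h = compute_event_hash(e, prev)
    e["security"] = {"event_hash": h, "prev_hash": prev}
    events.append(e)
    prev = h

def chain_ok(events):
    prev = "0" * 64
    for e in events:
        if e["security"]["prev_hash"] != prev:
            return False
        if e["security"]["event_hash"] != compute_event_hash(e, prev):
            return False
        prev = e["security"]["event_hash"]
    return True

print(chain_ok(events))                    # True: untampered chain verifies

events[1]["provenance"]["score"] = 0.99    # retroactive edit to one field...
print(chain_ok(events))                    # False: tamper is detected
```

One changed field anywhere in the hashed sections flips the verdict, and the break pinpoints which event was touched.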

Merkle Tree for Batch Verification

For large event volumes, VAP uses RFC 6962-compliant Merkle trees to efficiently commit batches of events:

def merkle_hash_leaf(data: bytes) -> bytes:
    """RFC 6962 leaf hash: H(0x00 || data)"""
    return hashlib.sha256(b'\x00' + data).digest()

def merkle_hash_node(left: bytes, right: bytes) -> bytes:
    """RFC 6962 node hash: H(0x01 || left || right)"""
    return hashlib.sha256(b'\x01' + left + right).digest()

def merkle_root(leaves: list) -> bytes:
    """Build an RFC 6962 Merkle tree and return its root (MTH)."""
    if not leaves:
        return hashlib.sha256(b'').digest()
    if len(leaves) == 1:
        return merkle_hash_leaf(leaves[0])

    # RFC 6962 splits at k, the largest power of two less than n
    # (it does NOT pad to a power of two by duplicating leaves)
    k = 1
    while k * 2 < len(leaves):
        k *= 2

    return merkle_hash_node(merkle_root(leaves[:k]), merkle_root(leaves[k:]))

The Merkle root is a single hash that commits to an entire batch of events. Publish this root to an external anchor, and you've created a cryptographic proof that those exact events existed at that exact time — without revealing any of the event contents.
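Individual events can then be proven against that root with a Merkle audit path: the verifier needs only one leaf plus log₂(n) sibling hashes. A minimal sketch assuming a power-of-two batch size (the helper names here are illustrative, not from the spec):

```python
import hashlib

def leaf(data: bytes) -> bytes:
    return hashlib.sha256(b'\x00' + data).digest()   # RFC 6962 leaf hash

def node(l: bytes, r: bytes) -> bytes:
    return hashlib.sha256(b'\x01' + l + r).digest()  # RFC 6962 node hash

def audit_path(leaves, index):
    """Collect sibling hashes from a leaf up to the root."""
    level, path = [leaf(d) for d in leaves], []
    while len(level) > 1:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))  # (hash, sibling-is-left)
        level = [node(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return path

def verify_inclusion(leaf_data, path, root):
    """Recompute the root from one leaf plus its audit path."""
    h = leaf(leaf_data)
    for sibling, is_left in path:
        h = node(sibling, h) if is_left else node(h, sibling)
    return h == root

# Batch of 8 events; root built by pairwise hashing (balanced tree)
leaves = [f"event-{i}".encode() for i in range(8)]
level = [leaf(d) for d in leaves]
while len(level) > 1:
    level = [node(level[i], level[i + 1]) for i in range(0, len(level), 2)]
root = level[0]

proof = audit_path(leaves, 5)
print(verify_inclusion(b"event-5", proof, root))   # True
print(verify_inclusion(b"event-9", proof, root))   # False
```

The auditor never sees the other seven events, only three sibling hashes. That's what makes selective disclosure compatible with full verifiability.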


Layer 2: Provenance — The "Why" Behind Every Decision 📋

This is the layer that transforms a black box into a verifiable box. The provenance layer captures the full decision context using a domain-agnostic abstract model:

{
  "provenance": {
    "actor": {
      "type": "AI_MODEL | HUMAN | EXTERNAL_AGENT | HYBRID",
      "identifier": "unique-system-id",
      "version": "semantic-version",
      "hash": "sha256-of-model-parameters"
    },
    "input": {
      "sources": ["array-of-input-source-identifiers"],
      "timestamp": "nanosecond-precision-int64",
      "hash": "sha256-of-input-data"
    },
    "context": {
      "parameters": {},
      "constraints": {},
      "environment": {}
    },
    "action": {
      "type": "string",
      "decision": {},
      "confidence": "0.0-1.0",
      "explainability": {
        "method": "SHAP | LIME | GRADCAM | RULE_TRACE | NONE",
        "factors": []
      }
    },
    "outcome": {
      "result": {},
      "timestamp": "nanosecond-precision-int64",
      "status": "SUCCESS | FAILURE | PARTIAL | PENDING"
    }
  }
}

The schema is deliberately abstract because VAP is a cross-domain framework. The same provenance structure maps to radically different use cases:

  • Finance (VCP profile): actor = algorithm/trader, input = market data, action = order placement
  • Medical (MAP profile): actor = diagnostic AI, input = patient imaging, action = diagnosis suggestion
  • Public Admin (PAP profile): actor = scoring AI, input = application data, action = eligibility decision
  • Automotive (DVP profile): actor = autonomous system, input = LIDAR/camera, action = path planning

Each domain profile extends this abstract model with domain-specific event types, constraints, and compliance mappings.

Why This Matters for the California Hiring Case

Remember applicant #7,342? With VAP provenance, her lawyer doesn't need to reverse-engineer a proprietary model. The provenance record already contains:

# What the lawyer can extract from the VAP provenance log
factors = event["provenance"]["action"]["explainability"]["factors"]

for f in factors:
    print(f"Feature: {f['feature']}, Weight: {f['weight']}")

# Output:
# Feature: years_relevant_experience, Weight: 0.31
# Feature: skill_keyword_match, Weight: 0.28  
# Feature: education_recency, Weight: -0.22    ← 🚩 this penalizes older grads
# Feature: career_progression_velocity, Weight: 0.05

That education_recency feature with a negative weight of -0.22 is now visible evidence. It penalizes candidates whose degrees are older — a direct proxy for age. Without provenance logging, this bias is invisible. With it, the evidence is cryptographically signed and externally anchored.
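The constraints block in the example record also declared a disparate_impact_ratio_limit of 0.80, the classic four-fifths rule. With provenance logs covering the whole applicant pool, checking whether the system actually honored that constraint is a few lines (the selection rates below are hypothetical, aggregated from outcome records):

```python
def disparate_impact_ratio(selection_rates: dict) -> float:
    """Four-fifths rule: the ratio of the lowest group selection rate
    to the highest should be >= 0.80."""
    return min(selection_rates.values()) / max(selection_rates.values())

# Hypothetical pass-rates per age group, tallied from logged outcomes
rates = {"under_40": 0.35, "over_40": 0.21}

ratio = disparate_impact_ratio(rates)
print(f"{ratio:.2f}")   # 0.60: below the 0.80 limit the system claimed to enforce
```

The point isn't the arithmetic. It's that the arithmetic becomes possible at all, because every outcome was recorded with the context needed to aggregate it.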


Layer 3: Accountability — The Delegation Chain Problem 🔗

This layer is critical for AI agent governance. When 51% of surveyed enterprises are already running AI agents in production, the question "who authorized this?" needs a cryptographically verifiable answer.

{
  "accountability": {
    "operator_id": "acme-corp-hr-platform",
    "last_approval_by": "jane.doe@acme.example.com",
    "approval_timestamp": 1743004700000000000,
    "delegation_chain": [
      {
        "delegator": "ciso@acme.example.com",
        "delegatee": "ai-agent-orchestrator-v2",
        "scope": "EMAIL_ACCESS_READ_WRITE",
        "valid_from": 1740000000000000000,
        "valid_until": 1748000000000000000
      },
      {
        "delegator": "ai-agent-orchestrator-v2",
        "delegatee": "email-cleanup-agent-v1.3",
        "scope": "EMAIL_DELETE_OLDER_THAN_90D",
        "valid_from": 1743000000000000000,
        "valid_until": 1743100000000000000
      }
    ],
    "override_history": [
      {
        "original_action": {"scope": "EMAIL_DELETE_OLDER_THAN_90D"},
        "override_action": {"scope": "EMAIL_DELETE_ALL"},
        "override_by": "PROMPT_INJECTION",
        "reason": "Injected instruction in email body",
        "timestamp": 1743004800000000000
      }
    ]
  }
}

☝ïļ This is what the Kiteworks red team scenario looks like when you have provenance infrastructure. The override_history captures the exact moment the agent's scope was hijacked via prompt injection. The delegation_chain shows that the CISO authorized EMAIL_ACCESS_READ_WRITE but the agent was re-delegated to a sub-agent with EMAIL_DELETE_OLDER_THAN_90D scope — which was then overridden to EMAIL_DELETE_ALL.

Without this layer, the incident report says "emails were deleted." With it, investigators can reconstruct the exact authorization chain and pinpoint where the governance failure occurred.
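Mechanically verifying the delegation chain is straightforward. A hedged sketch (verify_delegation is a hypothetical helper, not part of the spec, and real scope-narrowing rules would be profile-specific):

```python
def verify_delegation(chain: list, actor_id: str, action_ts: int) -> list:
    """Check that the chain is continuous, terminates at the acting agent,
    and that every hop was valid when the action occurred."""
    issues = []
    for i, hop in enumerate(chain):
        if i > 0 and hop["delegator"] != chain[i - 1]["delegatee"]:
            issues.append(f"hop {i}: delegator is not the previous delegatee")
        if not (hop["valid_from"] <= action_ts <= hop["valid_until"]):
            issues.append(f"hop {i}: action outside validity window")
    if chain and chain[-1]["delegatee"] != actor_id:
        issues.append("chain does not terminate at the acting agent")
    return issues

chain = [
    {"delegator": "ciso@acme.example.com",
     "delegatee": "ai-agent-orchestrator-v2",
     "scope": "EMAIL_ACCESS_READ_WRITE",
     "valid_from": 1740000000000000000, "valid_until": 1748000000000000000},
    {"delegator": "ai-agent-orchestrator-v2",
     "delegatee": "email-cleanup-agent-v1.3",
     "scope": "EMAIL_DELETE_OLDER_THAN_90D",
     "valid_from": 1743000000000000000, "valid_until": 1743100000000000000},
]

print(verify_delegation(chain, "email-cleanup-agent-v1.3", 1743004800000000000))
# []: structurally valid at the time of the action
```

Note what this check cannot catch on its own: the EMAIL_DELETE_ALL override was structurally outside any delegated scope, which is exactly why the override_history field exists as a separate record.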

EU AI Act Article 14 Compliance

The Accountability layer directly maps to EU AI Act human oversight requirements:

EU AI Act Article 14          →   VAP Accountability Field
──────────────────────────────────────────────────────────
Human oversight enabled       →   operator_id
Intervention capability       →   HALT/OVERRIDE event types  
Override function             →   override_history
Supervisory identification    →   last_approval_by

Layer 4: Completeness — You Can't Quietly Drop Events 🕳️

A provenance system that allows selective logging is worthless. If the operator can choose which events to record and which to silently omit, the logs prove nothing.

VAP enforces completeness through three mechanisms:

1. Hash Chain Continuity

If event #7,342 is missing, the chain from #7,341 to #7,343 breaks:

def detect_omissions(events: list) -> list:
    """Detect gaps in the hash chain that indicate missing events."""
    gaps = []
    for i in range(1, len(events)):
        expected_prev = events[i-1]["security"]["event_hash"]
        actual_prev = events[i]["security"]["prev_hash"]

        if expected_prev != actual_prev:
            gaps.append({
                "position": i,
                "expected": expected_prev[:16] + "...",
                "actual": actual_prev[:16] + "...",
                "verdict": "⚠️ MISSING EVENT(S) DETECTED"
            })

    return gaps

2. UUIDv7 Monotonicity

VAP uses UUIDv7 (RFC 9562) for event IDs. UUIDv7 embeds a millisecond-precision timestamp in the most significant bits, making IDs naturally time-ordered:

import uuid

def extract_timestamp_from_uuidv7(event_id: str) -> int:
    """Extract millisecond timestamp from UUIDv7."""
    u = uuid.UUID(event_id)
    # UUIDv7: first 48 bits are Unix timestamp in ms
    timestamp_ms = u.int >> 80
    return timestamp_ms

# Verify monotonicity
timestamps = [extract_timestamp_from_uuidv7(e["header"]["event_id"]) for e in events]
is_monotonic = all(timestamps[i] <= timestamps[i+1] for i in range(len(timestamps)-1))
print(f"Monotonic ordering: {'✅' if is_monotonic else '❌'}")

3. VCP-XREF Dual Logging

For multi-party scenarios, VCP-XREF enables bilateral completeness verification:

{
  "VCP-XREF": {
    "PrimaryLogServer": "log-a.example.com",
    "ReplicaLogServers": ["log-b.example.com"],
    "EventID": "01938a2f-4c1d-7e89-b2a3-1234567890ab",
    "DeliveryReceipts": [
      {"server": "log-a", "timestamp": "2026-03-24T14:30:00.123Z", "merkle_position": 7342},
      {"server": "log-b", "timestamp": "2026-03-24T14:30:00.127Z", "merkle_position": 7342}
    ],
    "GossipConsensus": {
      "root_hash": "abc123...",
      "consensus_timestamp": "2026-03-24T14:35:00Z",
      "conflict_detected": false
    }
  }
}

If Party A claims an event occurred but Party B's XREF shows no corresponding record — that's a discrepancy, and it's automatically flagged. No manual reconciliation needed.
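The reconciliation logic behind that flag is simple set arithmetic over event IDs. A sketch (the function name and record shapes are illustrative):

```python
def xref_discrepancies(primary_events: list, replica_events: list) -> dict:
    """Flag events present in one log but missing from the other."""
    primary_ids = {e["EventID"] for e in primary_events}
    replica_ids = {e["EventID"] for e in replica_events}
    return {
        "missing_from_replica": sorted(primary_ids - replica_ids),
        "missing_from_primary": sorted(replica_ids - primary_ids),
    }

log_a = [{"EventID": "ev-1"}, {"EventID": "ev-2"}, {"EventID": "ev-3"}]
log_b = [{"EventID": "ev-1"}, {"EventID": "ev-3"}]

print(xref_discrepancies(log_a, log_b))
# {'missing_from_replica': ['ev-2'], 'missing_from_primary': []}
```

In production the comparison would run over Merkle roots rather than raw ID sets, but the principle is the same: either both parties hold the same record, or the discrepancy is mechanically detectable.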


Layer 5: External Anchoring — Don't Trust, Verify 🌍

This is the layer that makes everything else meaningful. Without external anchoring, the entire chain lives under the operator's control. They could theoretically rebuild the whole thing from scratch.

External anchoring solves this by periodically publishing cryptographic commitments — Merkle roots — to independent third parties:

from nacl.signing import SigningKey
import base64
import requests

def anchor_to_external(merkle_root: bytes, private_key: bytes) -> dict:
    """
    Sign the Merkle root and submit to an RFC 3161 TSA.

    After this, the operator CANNOT:
    - Claim the events didn't exist
    - Modify events without detection
    - Insert events that weren't in the original batch

    An independent auditor CAN:
    - Verify the Merkle root matches the published anchor
    - Verify individual events via Merkle audit paths
    - Confirm the timestamp is authentic (TSA-signed)
    """
    # Sign the root
    signing_key = SigningKey(private_key)
    signature = signing_key.sign(merkle_root)

    # Submit to an RFC 3161 Timestamp Authority.
    # (create_tsa_request is a placeholder: it must build a DER-encoded
    #  TimeStampReq, e.g. with asn1crypto; omitted here for brevity.)
    tsa_request = create_tsa_request(merkle_root)
    tsa_response = requests.post(
        "https://freetsa.org/tsr",
        data=tsa_request,
        headers={"Content-Type": "application/timestamp-query"}
    )

    return {
        "merkle_root": merkle_root.hex(),
        "signature": base64.b64encode(signature.signature).decode(),
        "tsa_token": base64.b64encode(tsa_response.content).decode(),
        "anchor_timestamp": "2026-03-24T14:35:00Z"
    }

Anchoring Frequency by Tier

VAP (via its VCP financial profile) defines three conformance tiers:

Tier        Target Users              Clock Sync     Anchor Frequency
─────────────────────────────────────────────────────────────────────
Platinum    HFT / Exchanges           PTPv2 (<1μs)   Every 10 minutes
Gold        Institutional traders     NTP (<1ms)     Every 1 hour
Silver      Retail / MT4/MT5          Best-effort    Every 24 hours

Even at Silver tier, anchoring is mandatory (changed in VCP v1.1). A lightweight or delegated mechanism is acceptable, but the external commitment must exist.
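An auditor can check anchoring cadence directly from the published anchor timestamps. A sketch using the tier limits from the table above (the helper name and sample data are illustrative):

```python
from datetime import datetime, timedelta

# Maximum anchoring interval per conformance tier (from the table above)
TIER_MAX_GAP = {
    "PLATINUM": timedelta(minutes=10),
    "GOLD": timedelta(hours=1),
    "SILVER": timedelta(hours=24),
}

def check_anchor_cadence(anchor_times_iso: list, tier: str) -> list:
    """Return (start, end) pairs of anchor gaps that exceed the tier limit."""
    times = [datetime.fromisoformat(t.replace("Z", "+00:00"))
             for t in anchor_times_iso]
    limit = TIER_MAX_GAP[tier]
    return [(anchor_times_iso[i], anchor_times_iso[i + 1])
            for i in range(len(times) - 1)
            if times[i + 1] - times[i] > limit]

anchors = ["2026-03-24T00:00:00Z", "2026-03-24T01:00:00Z", "2026-03-24T03:30:00Z"]
print(check_anchor_cadence(anchors, "GOLD"))
# [('2026-03-24T01:00:00Z', '2026-03-24T03:30:00Z')]: that gap exceeds 1 hour
```

A missed anchor window doesn't prove tampering, but it marks a span of events whose existence cannot be independently attested, which is exactly what an auditor needs to know.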


The Profile System: One Framework, Many Domains 🌐

VAP itself is domain-agnostic. It defines the abstract layers. Concrete implementations come via domain profiles that extend the base framework:

VAP (Verifiable AI Provenance Framework)
 │
 ├── VCP  (Finance / Algorithmic Trading)    ← v1.1 released
 ├── CAP  (Content / Creative AI)            ← v1.0 released
 ├── CPP  (Capture Provenance)               ← v1.3 released
 ├── LAP  (Legal AI)                         ← v0.4 draft
 ├── DAP  (Defense AI)                       ← v0.1 draft
 ├── DVP  (Automotive)                       ← planned
 ├── MAP  (Medical)                          ← planned
 ├── PAP  (Public Administration)            ← planned
 ├── EIP  (Energy Infrastructure)            ← planned
 ├── AAP  (Aviation)                         ← planned
 └── IAP  (Industry Accountability)          ← planned

Each profile defines:

{
  "profile_extension": {
    "profile_id": "VCP",
    "profile_version": "1.1.0",
    "domain_specific_modules": [
      "VCP-CORE", "VCP-TRADE", "VCP-GOV",
      "VCP-RISK", "VCP-PRIVACY", "VCP-RECOVERY"
    ],
    "domain_specific_events": [
      "SIG", "ORD", "EXE", "CXL", "MOD",
      "POS", "RSK", "HLT", "GOV", "ERR", "RCV", "HBT"
    ],
    "domain_specific_constraints": {
      "timestamp_precision": "NANOSECOND",
      "clock_sync_requirement": "PER_TIER"
    }
  }
}

The profile mechanism means you don't need to reinvent the provenance wheel for every industry. An automotive engineer working on DVP (autonomous driving) inherits the same integrity, completeness, and anchoring guarantees as a quantitative analyst working with VCP (algorithmic trading). Only the domain-specific events and schemas differ.


Why Not Just Use Blockchain? 🤔

Fair question. The short answer: blockchain solves a different problem.

Blockchain:
  - Consensus among UNTRUSTED, UNKNOWN parties
  - Throughput: ~15 TPS (Ethereum) to ~65K TPS (Solana)
  - Latency: seconds to minutes
  - Cost: gas fees per transaction
  - Overkill for: organizational audit trails

VAP:
  - Verification by KNOWN parties (regulators, auditors)
  - Throughput: limited only by local storage I/O
  - Latency: microseconds
  - Cost: compute + storage only
  - Optimized for: high-frequency decision logging

VAP uses blockchain selectively — as one possible external anchoring target. You can anchor your Merkle roots to Bitcoin via OpenTimestamps, to Ethereum via a smart contract, or to a traditional RFC 3161 timestamp authority. The choice is implementation-specific, not framework-mandated.

The design principle: use the lightest-weight mechanism that provides the required trust guarantee for your use case.


Putting It All Together: Full Verification Pipeline 🔧

Here's a complete verification pipeline that an auditor would run against a VAP event chain:

#!/usr/bin/env python3
"""
VAP Chain Verifier — Complete audit verification pipeline.
Checks: hash chain integrity, signature validity, 
        Merkle consistency, external anchor verification.
"""

import base64
import hashlib
import json
from nacl.signing import VerifyKey
from nacl.exceptions import BadSignatureError

# compute_event_hash() and merkle_root() are the functions defined in the
# Integrity section; verify_tsa_token() is a stand-in for full RFC 3161
# token validation.

def verify_full_chain(events: list, public_key: bytes, anchor_roots: list) -> dict:
    """
    Full VAP chain verification.

    Returns a structured audit report.
    """
    report = {
        "total_events": len(events),
        "hash_chain_valid": True,
        "signatures_valid": True,
        "merkle_consistent": True,
        "anchor_verified": True,
        "issues": []
    }

    # 1. Hash chain integrity
    for i, event in enumerate(events):
        expected_prev = events[i-1]["security"]["event_hash"] if i > 0 else "0" * 64

        if event["security"]["prev_hash"] != expected_prev:
            report["hash_chain_valid"] = False
            report["issues"].append(f"Chain break at event {i}")

        recomputed = compute_event_hash(event, expected_prev)
        if event["security"]["event_hash"] != recomputed:
            report["hash_chain_valid"] = False
            report["issues"].append(f"Tamper detected at event {i}")

    # 2. Signature verification
    verify_key = VerifyKey(public_key)
    for i, event in enumerate(events):
        try:
            sig = base64.b64decode(event["security"]["signature"])
            msg = bytes.fromhex(event["security"]["event_hash"])
            verify_key.verify(msg, sig)
        except BadSignatureError:
            report["signatures_valid"] = False
            report["issues"].append(f"Invalid signature at event {i}")

    # 3. Merkle root consistency
    leaf_hashes = [
        bytes.fromhex(e["security"]["event_hash"]) for e in events
    ]
    computed_root = merkle_root(leaf_hashes)

    if computed_root.hex() not in [a["merkle_root"] for a in anchor_roots]:
        report["merkle_consistent"] = False
        report["issues"].append("Merkle root does not match any anchor")

    # 4. External anchor timestamp verification (simplified)
    for anchor in anchor_roots:
        if not verify_tsa_token(anchor["tsa_token"]):
            report["anchor_verified"] = False
            report["issues"].append("TSA token verification failed")

    return report

# Run verification
result = verify_full_chain(events, public_key, anchors)

if not result["issues"]:
    print("✅ AUDIT PASSED — All events verified")
else:
    print("❌ AUDIT FAILED:")
    for issue in result["issues"]:
        print(f"   • {issue}")

That's the complete pipeline. Hash chain → signatures → Merkle consistency → external anchor. If all four pass, the auditor has mathematical proof that the event chain is intact, unmodified, and was committed at the claimed time. No trust required.


The Regulatory Clock Is Ticking ⏰

This isn't a nice-to-have. Multiple jurisdictions are converging on mandatory AI audit trail requirements:

EU AI Act (Regulation 2024/1689):

  • Article 12: Automatic event logging over system lifetime
  • Article 13: Transparency — deployers must interpret outputs
  • Article 14: Human oversight with override capability
  • Article 19: Minimum 6-month log retention
  • Article 86: Right to explanation for affected individuals
  • Enforcement date for high-risk AI: August 2, 2026 (subject to Digital Omnibus extension)
  • Penalties: up to €15M or 3% of global turnover

Colorado AI Act (SB 24-205):

  • Algorithmic impact assessments
  • Disclosure requirements for high-risk AI
  • Enforcement: June 30, 2026

Key gap: The EU AI Act mandates logging but doesn't specify how. The words "tamper-evident," "immutable," and "cryptographic integrity" appear nowhere in the legal text. The harmonized standards from CEN-CENELEC JTC 21 are still in development — prEN ISO/IEC 24970 (the logging standard) won't be finalized until mid-2026 at the earliest.

Organizations face enforcement with no officially harmonized technical standards for compliant logging. That's the gap VAP is designed to fill.


Get Started 🚀

VAP is an open specification published under CC BY 4.0. The flagship VCP profile (algorithmic trading) is at v1.1 with production-ready specifications.

GitHub: github.com/veritaschain/vcp-spec

Specification: github.com/veritaschain/vcp-spec/tree/main/spec/v1.1

IETF Internet-Drafts: datatracker.ietf.org/doc/draft-kamimura-scitt-vcp

Website: veritaschain.org

The cryptographic stack is built entirely on established standards:

  • SHA-256 (FIPS 180-4) for hashing
  • Ed25519 (RFC 8032) for signatures
  • Merkle trees (RFC 6962) for batch commitments
  • JSON Canonicalization (RFC 8785) for deterministic serialization
  • UUIDv7 (RFC 9562) for time-ordered identifiers
  • RFC 3161 for trusted timestamps
  • ML-DSA/Dilithium (NIST FIPS 204) planned for post-quantum readiness

Nothing exotic. Just proven primitives, assembled into a coherent provenance standard for AI systems.

Note: VAP and VCP are early-stage open standards. The IETF Internet-Drafts are individual submissions and do not represent IETF consensus or endorsement. Independent review, third-party implementations, and community feedback are actively welcomed.


Technical questions? Open an issue on GitHub or reach out at technical@veritaschain.org.

Found a bug in the spec? We'd rather know now than after deployment. Technical critique makes protocols stronger — and we credit contributors in the changelog.
