Proving What AI Didn't Generate: Building Cryptographic Refusal Logs with CAP-SRP

You've probably seen the headlines about the Grok crisis in January 2026—xAI's image generator producing thousands of non-consensual sexual images per hour after its safeguards were bypassed. What you might not have seen is what happened next: 35 U.S. state attorneys general demanded proof that xAI's safety filters actually work.

xAI couldn't provide it. Not because they were lying, but because no AI system today can cryptographically prove what it refused to generate.

Think about that for a second. We have C2PA proving content provenance. We have SynthID watermarking generated images. We have elaborate safety filters. But when a regulator asks "prove your system blocked 10,000 CSAM generation attempts yesterday," every AI company in the world has the same answer: "Trust us."

That's not good enough anymore. And as developers building these systems, we need to fix it.

This post introduces CAP-SRP (Content Authenticity Protocol - Safe Refusal Provenance)—an open specification for creating cryptographic proof of AI refusals. I'll walk through the architecture, show you working code, and explain why this matters for every developer working on generative AI.


Table of Contents

  1. The Problem: Positive-Only Attestation
  2. Why Watermarks Can't Solve This
  3. The CAP-SRP Architecture
  4. The Completeness Invariant: Math That Catches Liars
  5. Implementation Deep-Dive
  6. Cryptographic Primitives
  7. Privacy-Preserving Verification
  8. Integration Patterns
  9. The Regulatory Clock is Ticking
  10. Getting Started

The Problem: Positive-Only Attestation

Let's look at what C2PA (the leading content authenticity standard) actually does:

User Request → AI Generation → Content Created → C2PA Manifest Attached
                                      ↓
                              "This image was created by
                               DALL-E 3 on 2026-01-28"

C2PA is excellent at this. It cryptographically binds provenance metadata to content. Adobe, Google, Microsoft, OpenAI—everyone's adopting it.

But here's the gap:

User Request → Safety Filter → REFUSED → ??? 
                                   ↓
                            No content created
                            No manifest possible
                            No cryptographic proof

When content isn't created, there's nothing to attach a manifest to. C2PA's data model literally doesn't have a concept of "null content" or "refusal receipt."

This isn't a C2PA bug—it was designed for content provenance, not system behavior attestation. But it means we're missing a critical piece of the accountability puzzle.

The Trust Gap in Action

Here's what the NY Attorney General's letter to xAI actually asked for:

"Detailed information about how you verify the effectiveness of your safety measures"

And here's what xAI could provide:

  • ❌ Cryptographic proof of refusals
  • ❌ Verifiable count of blocked requests
  • ❌ Third-party auditable refusal logs
  • ✅ Internal logs (unverifiable)
  • ✅ Policy documents (claims, not proof)
  • ✅ "Trust us" statements

This is the accountability gap. And it exists in every generative AI system deployed today.


Why Watermarks Can't Solve This

"But wait," you might say, "can't we just check for missing watermarks?"

Three problems with that:

Problem 1: Watermarks Can Be Stripped

The DeMark attack (arXiv:2601.16473, January 2026) demonstrated this devastatingly:

| Watermark Scheme    | Detection Before | Detection After DeMark |
| ------------------- | ---------------- | ---------------------- |
| StegaStamp          | 100%             | 28.4%                  |
| RivaGAN             | 100%             | 31.2%                  |
| MBRS                | 100%             | 35.7%                  |
| Average (8 schemes) | 100%             | 32.9%                  |

The attack is black-box (no model access needed) and maintains visual quality. Every "robust" watermark tested failed.

Problem 2: Absence Proves Nothing

If I show you an image without a watermark, what do you know?

  • Human created?
  • AI refused to create?
  • AI created + watermark stripped?
  • Non-compliant AI created?

You can't distinguish these cases. A missing watermark is ambiguous by definition.

Problem 3: You Can't Watermark Nothing

This is the fundamental issue. Watermarks mark content. Refusals produce no content. You cannot watermark an image that doesn't exist.

We need to think at a different layer.


The CAP-SRP Architecture

CAP-SRP operates at the system layer, not the content layer:

┌─────────────────────────────────────────────────────────────────┐
│                     CONTENT LAYER                               │
│   ┌─────────────┐    ┌─────────────┐    ┌─────────────┐        │
│   │ Watermarks  │    │    C2PA     │    │   SynthID   │        │
│   │             │    │  Manifests  │    │             │        │
│   └─────────────┘    └─────────────┘    └─────────────┘        │
│                                                                 │
│   Problem: Can only attest to content that EXISTS              │
└─────────────────────────────────────────────────────────────────┘
                              │
                              │ CAP-SRP bridges this gap
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                     SYSTEM LAYER                                │
│   ┌─────────────────────────────────────────────────────────┐  │
│   │                      CAP-SRP                            │  │
│   │                                                         │  │
│   │  Logs DECISIONS, not content:                          │  │
│   │  • GEN_ATTEMPT  (request received)                     │  │
│   │  • GEN          (content created)                      │  │
│   │  • GEN_DENY     (request refused)                      │  │
│   │  • GEN_ERROR    (system failure)                       │  │
│   │                                                         │  │
│   │  Every attempt has exactly one outcome.                │  │
│   │  Cryptographically provable.                           │  │
│   │  Externally anchored.                                  │  │
│   └─────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

The key insight: Log the decision, not the content.

Event Types

CAP-SRP defines four event types:

enum EventType {
  GEN_ATTEMPT = "GEN_ATTEMPT",  // Request received (logged BEFORE evaluation)
  GEN = "GEN",                  // Content successfully generated
  GEN_DENY = "GEN_DENY",        // Request refused by safety system
  GEN_ERROR = "GEN_ERROR"       // System error prevented completion
}

Every GEN_ATTEMPT must have exactly one corresponding outcome event. This is enforced cryptographically.

The Event Flow

┌──────────────────────────────────────────────────────────────────┐
│ Timeline                                                         │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│ t₀ ─────► Request received                                       │
│           │                                                      │
│ t₁ ─────► GEN_ATTEMPT logged ◄─── COMMITMENT POINT               │
│           │                       (before safety evaluation)     │
│           │                                                      │
│ t₂ ─────► Safety evaluation begins                               │
│           │                                                      │
│           ├──► Safe ──────► t₃: Generate content                 │
│           │                      │                               │
│           │                 t₄: GEN logged (links to attempt)    │
│           │                      │                               │
│           │                 t₅: C2PA manifest attached           │
│           │                                                      │
│           └──► Unsafe ────► t₃: GEN_DENY logged                  │
│                                  (links to attempt)              │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

The critical detail: GEN_ATTEMPT is logged BEFORE safety evaluation. This prevents selective logging—you can't evaluate first and then decide whether to record.
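
To make the ordering concrete, here's a minimal sketch of the difference (evaluate_safety, generate, and log are illustrative placeholders, not part of the spec):

# WRONG: evaluate first, log later - an unsafe outcome can simply go unrecorded
def handle_request_selective(prompt: str):
    risk = evaluate_safety(prompt)        # decision exists before any log entry
    if risk.should_deny:
        return {"status": "denied"}       # nothing forces us to record this
    log("GEN", prompt)
    return generate(prompt)

# RIGHT: commit to the log before any decision exists
def handle_request_committed(prompt: str):
    attempt = log("GEN_ATTEMPT", prompt)  # commitment point
    risk = evaluate_safety(prompt)
    if risk.should_deny:
        log("GEN_DENY", attempt)          # outcome must pair with the attempt
        return {"status": "denied"}
    log("GEN", attempt)
    return generate(prompt)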


The Completeness Invariant: Math That Catches Liars

Here's where it gets interesting. CAP-SRP enforces a mathematical invariant:

∀ time_window T:
  COUNT(GEN_ATTEMPT) = COUNT(GEN) + COUNT(GEN_DENY) + COUNT(GEN_ERROR)

In plain English: Every attempt must have exactly one outcome.

This simple equation enables powerful fraud detection:

Detecting Hidden Generations

from typing import List  # "Event" is any record with .type and .attempt_id

def detect_hidden_generations(events: List["Event"]) -> List[str]:
    """Find attempts without recorded outcomes."""
    attempts = {e.attempt_id for e in events if e.type == "GEN_ATTEMPT"}
    outcomes = {e.attempt_id for e in events if e.type in ["GEN", "GEN_DENY", "GEN_ERROR"]}

    orphan_attempts = attempts - outcomes

    if orphan_attempts:
        return [f"ALERT: Attempt {id} has no recorded outcome" for id in orphan_attempts]
    return []

If orphan_attempts is non-empty, the system is hiding results. Maybe it generated something it shouldn't have and didn't want to log it.

Detecting Fabricated Refusals

def detect_fabricated_refusals(events: List["Event"]) -> List[str]:
    """Find outcomes without corresponding attempts."""
    attempts = {e.attempt_id for e in events if e.type == "GEN_ATTEMPT"}
    outcomes = {e.attempt_id for e in events if e.type in ["GEN", "GEN_DENY", "GEN_ERROR"]}

    orphan_outcomes = outcomes - attempts

    if orphan_outcomes:
        return [f"ALERT: Outcome {id} has no recorded attempt" for id in orphan_outcomes]
    return []

If orphan_outcomes is non-empty, the system is fabricating refusal records to inflate its safety metrics.

The Power of Simple Math

This invariant is trivially verifiable by any third party with access to the logs. No special knowledge needed. No trust required. Just count and compare.

Combined with cryptographic signatures and external anchoring, it becomes extremely difficult to cheat:

  • Can't delete attempts (hash chain breaks; see the sketch after this list)
  • Can't fabricate outcomes (no matching attempt signature)
  • Can't backdate entries (external timestamp authority)
  • Can't selectively log (attempt logged before evaluation)
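
The hash-chain check behind the first of these guarantees fits in a few lines. A minimal sketch, treating each log entry as an (event_hash, previous_hash) pair, the same linkage the previous_hash field provides in the implementation below:

from typing import List, Optional, Tuple

def verify_hash_chain(entries: List[Tuple[str, Optional[str]]]) -> bool:
    """Each entry is (event_hash, previous_hash). Deleting, inserting,
    or reordering any event breaks the link to its neighbor."""
    expected_prev: Optional[str] = None
    for event_hash, previous_hash in entries:
        if previous_hash != expected_prev:
            return False  # chain broken at this entry
        expected_prev = event_hash
    return True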

Implementation Deep-Dive

Let's build a working CAP-SRP implementation. I'll use Python, but the concepts apply to any language.

Core Data Structures

from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional
import hashlib
import uuid

class EventType(Enum):
    GEN_ATTEMPT = "GEN_ATTEMPT"
    GEN = "GEN"
    GEN_DENY = "GEN_DENY"
    GEN_ERROR = "GEN_ERROR"

class RiskCategory(Enum):
    CSAM_RISK = "CSAM_RISK"
    NCII_RISK = "NCII_RISK"
    MINOR_SEXUALIZATION = "MINOR_SEXUALIZATION"
    REAL_PERSON_DEEPFAKE = "REAL_PERSON_DEEPFAKE"
    VIOLENCE_GRAPHIC = "VIOLENCE_GRAPHIC"
    HATE_SPEECH = "HATE_SPEECH"
    SELF_HARM = "SELF_HARM"
    ILLEGAL_ACTIVITY = "ILLEGAL_ACTIVITY"

@dataclass
class CAPEvent:
    # Required fields first: dataclasses need defaulted fields to come last
    event_type: EventType
    attempt_id: str  # Links outcomes to attempts
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))  # uuid7() on Python 3.14+
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    # For GEN_ATTEMPT
    prompt_hash: Optional[str] = None  # SHA-256 of prompt (never raw text!)
    model_id: Optional[str] = None
    user_context_hash: Optional[str] = None

    # For GEN
    output_hash: Optional[str] = None
    c2pa_manifest_id: Optional[str] = None

    # For GEN_DENY
    denial_reason_code: Optional[str] = None
    risk_category: Optional[RiskCategory] = None
    risk_score: Optional[float] = None
    policy_version: Optional[str] = None

    # For GEN_ERROR
    error_code: Optional[str] = None
    error_category: Optional[str] = None

    # Integrity fields
    previous_hash: Optional[str] = None
    signature: Optional[str] = None

    def compute_hash(self) -> str:
        """Compute SHA-256 over the event's content fields, including
        previous_hash so the events form a tamper-evident chain."""
        content = "|".join([
            self.event_id,
            self.event_type.value,
            self.timestamp.isoformat(),
            self.attempt_id,
            self.prompt_hash or "",
            self.output_hash or "",
            self.denial_reason_code or "",
            self.error_code or "",
            self.previous_hash or "",
        ])
        return hashlib.sha256(content.encode()).hexdigest()

The Event Logger

from typing import List, Optional, Tuple
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

class CAPEventLogger:
    def __init__(self, signing_key: Ed25519PrivateKey, model_id: str):
        self.signing_key = signing_key
        self.model_id = model_id
        self.events: List[CAPEvent] = []
        self.last_hash: Optional[str] = None

    def _sign_event(self, event: CAPEvent) -> str:
        """Sign event with Ed25519."""
        message = event.compute_hash().encode()
        signature = self.signing_key.sign(message)
        return signature.hex()

    def _hash_prompt(self, prompt: str) -> str:
        """Hash prompt - NEVER store raw prompts!"""
        return hashlib.sha256(prompt.encode()).hexdigest()

    def _hash_user_context(self, user_id: str) -> str:
        """Hash user context for privacy."""
        # Add salt in production!
        return hashlib.sha256(user_id.encode()).hexdigest()

    def log_attempt(self, prompt: str, user_id: str) -> CAPEvent:
        """
        Log a generation attempt BEFORE safety evaluation.

        This is the commitment point - must happen before any
        decision is made about whether to generate.
        """
        attempt_id = str(uuid.uuid4())  # or uuid.uuid7() on Python 3.14+ for time-ordered IDs

        event = CAPEvent(
            event_type=EventType.GEN_ATTEMPT,
            attempt_id=attempt_id,
            prompt_hash=self._hash_prompt(prompt),
            model_id=self.model_id,
            user_context_hash=self._hash_user_context(user_id),
            previous_hash=self.last_hash
        )

        event.signature = self._sign_event(event)
        self.last_hash = event.compute_hash()
        self.events.append(event)

        return event

    def log_generation(self, attempt_id: str, output: bytes, 
                       c2pa_manifest_id: Optional[str] = None) -> CAPEvent:
        """Log successful content generation."""
        event = CAPEvent(
            event_type=EventType.GEN,
            attempt_id=attempt_id,
            output_hash=hashlib.sha256(output).hexdigest(),
            c2pa_manifest_id=c2pa_manifest_id,
            previous_hash=self.last_hash
        )

        event.signature = self._sign_event(event)
        self.last_hash = event.compute_hash()
        self.events.append(event)

        return event

    def log_denial(self, attempt_id: str, risk_category: RiskCategory,
                   risk_score: float, policy_version: str) -> CAPEvent:
        """Log a safety-based refusal."""
        event = CAPEvent(
            event_type=EventType.GEN_DENY,
            attempt_id=attempt_id,
            denial_reason_code=risk_category.value,
            risk_category=risk_category,
            risk_score=risk_score,
            policy_version=policy_version,
            previous_hash=self.last_hash
        )

        event.signature = self._sign_event(event)
        self.last_hash = event.compute_hash()
        self.events.append(event)

        return event

    def log_error(self, attempt_id: str, error_code: str, 
                  error_category: str) -> CAPEvent:
        """Log a system error."""
        event = CAPEvent(
            event_type=EventType.GEN_ERROR,
            attempt_id=attempt_id,
            error_code=error_code,
            error_category=error_category,
            previous_hash=self.last_hash
        )

        event.signature = self._sign_event(event)
        self.last_hash = event.compute_hash()
        self.events.append(event)

        return event

    def verify_completeness(self) -> Tuple[bool, List[str]]:
        """Verify the completeness invariant."""
        issues = []

        attempts = {e.attempt_id for e in self.events 
                   if e.event_type == EventType.GEN_ATTEMPT}
        outcomes = {e.attempt_id for e in self.events 
                   if e.event_type in [EventType.GEN, EventType.GEN_DENY, EventType.GEN_ERROR]}

        # Check for orphan attempts (hidden generations)
        orphan_attempts = attempts - outcomes
        for aid in orphan_attempts:
            issues.append(f"VIOLATION: Attempt {aid} has no recorded outcome")

        # Check for orphan outcomes (fabricated refusals)
        orphan_outcomes = outcomes - attempts
        for aid in orphan_outcomes:
            issues.append(f"VIOLATION: Outcome for {aid} has no recorded attempt")

        return len(issues) == 0, issues

Usage Example

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Initialize
private_key = Ed25519PrivateKey.generate()
logger = CAPEventLogger(signing_key=private_key, model_id="image-gen-v3")

# Simulate a generation flow
def handle_generation_request(prompt: str, user_id: str):
    # Step 1: Log attempt BEFORE evaluation (commitment point)
    attempt = logger.log_attempt(prompt, user_id)

    # Step 2: Safety evaluation
    risk = evaluate_safety(prompt)  # Your safety system

    # Step 3: Log outcome
    if risk.should_deny:
        logger.log_denial(
            attempt_id=attempt.attempt_id,
            risk_category=risk.category,  # e.g. RiskCategory.NCII_RISK
            risk_score=risk.score,
            policy_version="2026-01-28"
        )
        return {"status": "denied", "reason": risk.category}
    else:
        try:
            output = generate_image(prompt)  # Your generation system
            logger.log_generation(
                attempt_id=attempt.attempt_id,
                output=output,
                c2pa_manifest_id=attach_c2pa(output)
            )
            return {"status": "success", "image": output}
        except Exception as e:
            logger.log_error(
                attempt_id=attempt.attempt_id,
                error_code=str(type(e).__name__),
                error_category="GENERATION_FAILURE"
            )
            return {"status": "error", "message": str(e)}

# Later: Verify integrity
is_valid, issues = logger.verify_completeness()
if not is_valid:
    print("INTEGRITY VIOLATIONS DETECTED:")
    for issue in issues:
        print(f"  - {issue}")

Cryptographic Primitives

Why Ed25519?

CAP-SRP specifies Ed25519 (RFC 8032) as the primary signature algorithm:

| Property               | Ed25519     | Why It Matters           |
| ---------------------- | ----------- | ------------------------ |
| Key size               | 32 bytes    | Compact storage          |
| Signature size         | 64 bytes    | Efficient transmission   |
| Sign speed             | ~50,000/sec | Handles high throughput  |
| Deterministic          | Yes         | No RNG-based key leakage |
| Side-channel resistant | Yes         | Timing attack protection |

from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey, Ed25519PublicKey
)

# Key generation
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

# Signing
message = b"event data to sign"
signature = private_key.sign(message)

# Verification
from cryptography.exceptions import InvalidSignature

try:
    public_key.verify(signature, message)
    print("Signature valid")
except InvalidSignature:
    print("Signature INVALID - tampering detected!")

Merkle Trees for Efficient Proofs

For large-scale deployments, we use Merkle trees (same as Certificate Transparency):

import hashlib
from typing import List, Optional

class MerkleTree:
    def __init__(self):
        self.leaves: List[bytes] = []
        self.tree: List[List[bytes]] = []

    def _hash(self, data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    def _hash_pair(self, left: bytes, right: bytes) -> bytes:
        return self._hash(left + right)

    def add_leaf(self, data: bytes) -> int:
        """Add a leaf and return its index."""
        leaf_hash = self._hash(data)
        self.leaves.append(leaf_hash)
        self._rebuild_tree()  # O(n) rebuild for clarity; production trees append incrementally
        return len(self.leaves) - 1

    def _rebuild_tree(self):
        """Rebuild the Merkle tree from leaves."""
        if not self.leaves:
            self.tree = []
            return

        self.tree = [self.leaves.copy()]

        while len(self.tree[-1]) > 1:
            level = self.tree[-1]
            next_level = []

            for i in range(0, len(level), 2):
                left = level[i]
                right = level[i + 1] if i + 1 < len(level) else left
                next_level.append(self._hash_pair(left, right))

            self.tree.append(next_level)

    @property
    def root(self) -> Optional[bytes]:
        """Get the Merkle root."""
        if not self.tree:
            return None
        return self.tree[-1][0]

    def get_proof(self, index: int) -> List[tuple]:
        """
        Get inclusion proof for leaf at index.
        Returns list of (sibling_hash, is_left) tuples.
        """
        if index >= len(self.leaves):
            raise ValueError("Index out of range")

        proof = []
        for level in self.tree[:-1]:
            sibling_index = index ^ 1  # XOR with 1 gives sibling
            if sibling_index < len(level):
                is_left = index % 2 == 1
                proof.append((level[sibling_index], is_left))
            else:
                # Last node of an odd-length level was paired with itself
                proof.append((level[index], False))
            index //= 2

        return proof

    @staticmethod
    def verify_proof(leaf_hash: bytes, proof: List[tuple], root: bytes) -> bool:
        """Verify an inclusion proof."""
        current = leaf_hash
        for sibling, is_left in proof:
            if is_left:
                current = hashlib.sha256(sibling + current).digest()
            else:
                current = hashlib.sha256(current + sibling).digest()
        return current == root

# Usage
tree = MerkleTree()

# Add events
for event in events:
    tree.add_leaf(event.compute_hash().encode())

# Get root for external anchoring
merkle_root = tree.root

# Generate proof for a specific event (event_index is its leaf index)
proof = tree.get_proof(event_index)

# Anyone can verify
is_included = MerkleTree.verify_proof(
    leaf_hash=hashlib.sha256(event.compute_hash().encode()).digest(),
    proof=proof,
    root=merkle_root
)

Why Merkle trees?

  • O(log n) proofs: Prove inclusion of 1 event in 80 million with 27 sibling hashes (under 1 KB; see the arithmetic below)
  • Tamper detection: Any modification breaks the root hash
  • Efficient updates: Append new leaves without recomputing the whole tree (the demo above rebuilds on every insert for simplicity)
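
The proof-size claim is just arithmetic: an inclusion proof carries one 32-byte sibling hash per tree level.

import math

def proof_size_bytes(n_leaves: int, hash_size: int = 32) -> int:
    """Upper bound on Merkle inclusion-proof size."""
    levels = math.ceil(math.log2(n_leaves))  # tree depth
    return levels * hash_size

print(proof_size_bytes(80_000_000))  # 27 levels x 32 bytes = 864 bytes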

Post-Quantum Considerations

For audit trails that need to remain verifiable for decades, consider hybrid signatures:

# Hybrid Ed25519 + ML-DSA (post-quantum)
# Note: ML-DSA libraries are still maturing; treat this as a sketch

class HybridSigner:
    def __init__(self, ed25519_key, ml_dsa_key):
        self.ed25519_key = ed25519_key
        self.ml_dsa_key = ml_dsa_key

    def sign(self, message: bytes) -> dict:
        # Sign once and reuse: ML-DSA signing is randomized, so re-signing
        # would produce a different signature than the one in the binding
        ed_sig = self.ed25519_key.sign(message)
        pq_sig = self.ml_dsa_key.sign(message)
        return {
            "ed25519": ed_sig.hex(),
            "ml_dsa": pq_sig.hex(),
            "binding": hashlib.sha256(ed_sig + pq_sig).hexdigest()
        }

    def verify(self, message: bytes, signature: dict) -> bool:
        # Both must verify (verify_ed25519 / verify_ml_dsa stand in for
        # your libraries' verification calls)
        ed25519_valid = verify_ed25519(message, signature["ed25519"])
        ml_dsa_valid = verify_ml_dsa(message, signature["ml_dsa"])
        return ed25519_valid and ml_dsa_valid

Privacy-Preserving Verification

The Prompt Privacy Problem

Here's a dilemma: to verify a refusal was legitimate, you might want to see what was refused. But prompts can contain:

  • Personal information
  • Proprietary business data
  • Information that itself needs protection

CAP-SRP's Solution: Log Decisions, Not Content

# What we LOG (verifiable, privacy-preserving)
logged_data = {
    "prompt_hash": hashlib.sha256(prompt.encode()).hexdigest(),  # ✓
    "risk_category": "NCII_RISK",  # ✓ Categorical
    "risk_score": 0.94,  # ✓ Confidence
    "policy_version": "2026-01-28"  # ✓ Reference
}

# What we DON'T log
not_logged = {
    "raw_prompt": prompt,  # ✗ Privacy violation
    "user_identity": user.email,  # ✗ Privacy violation  
    "detailed_analysis": safety_report.full_text  # ✗ May reveal prompt
}

The prompt hash enables verification ("yes, this specific prompt was refused") without revealing content.
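
For example, here's a sketch of how an auditor could resolve a dispute: the complainant reveals the prompt, the auditor hashes it and searches the log (using the CAPEvent and EventType types defined earlier):

import hashlib
from typing import List, Optional

def find_refusal(events: List[CAPEvent], disputed_prompt: str) -> Optional[CAPEvent]:
    """Was this specific prompt refused? The log never stores the raw
    prompt; only someone who already knows it can ask this question."""
    target = hashlib.sha256(disputed_prompt.encode()).hexdigest()
    attempt_ids = {e.attempt_id for e in events
                   if e.event_type == EventType.GEN_ATTEMPT
                   and e.prompt_hash == target}
    for e in events:
        if e.event_type == EventType.GEN_DENY and e.attempt_id in attempt_ids:
            return e
    return None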

Zero-Knowledge Proof Integration (Advanced)

For scenarios requiring stronger guarantees:

# Conceptual ZK proof structure
# "I can prove this prompt matched a dangerous pattern without showing you the prompt"

class RefusalZKProof:
    """
    Proves:
    1. prompt_hash corresponds to some prompt P
    2. P matches safety rule R
    3. R is part of policy version V

    Without revealing:
    - The actual prompt P
    - The specific rule pattern
    - Other policy details
    """

    def __init__(self, prompt_hash: str, policy_version: str):
        self.prompt_hash = prompt_hash
        self.policy_version = policy_version

    def generate_proof(self, prompt: str, matched_rule: "SafetyRule") -> bytes:  # SafetyRule: your rule type
        # In production: use zk-STARK library (no trusted setup, quantum-resistant)
        # This is a conceptual placeholder
        pass

    def verify_proof(self, proof: bytes) -> bool:
        # Anyone can verify without seeing the prompt
        pass

GDPR Compliance with Crypto-Shredding

For EU compliance, implement crypto-shredding:

from cryptography.fernet import Fernet
from typing import Dict

class GDPRCompliantLogger:
    def __init__(self, base_logger: CAPEventLogger):
        self.base_logger = base_logger
        self.user_keys: Dict[str, bytes] = {}  # user_id -> encryption_key

    def _get_or_create_user_key(self, user_id: str) -> bytes:
        if user_id not in self.user_keys:
            self.user_keys[user_id] = Fernet.generate_key()
        return self.user_keys[user_id]

    def log_attempt_gdpr(self, prompt: str, user_id: str) -> CAPEvent:
        """Log with encrypted user context."""
        key = self._get_or_create_user_key(user_id)
        fernet = Fernet(key)

        # Encrypt user-specific data
        encrypted_context = fernet.encrypt(user_id.encode())

        # Log with encrypted context (stored as an extra attribute here;
        # a production schema would declare this as a CAPEvent field)
        event = self.base_logger.log_attempt(prompt, user_id)
        event.user_context_encrypted = encrypted_context.hex()

        return event

    def handle_deletion_request(self, user_id: str):
        """
        GDPR Article 17: Right to Erasure

        By deleting the key, all encrypted user data becomes 
        cryptographically inaccessible (crypto-shredding).
        """
        if user_id in self.user_keys:
            # Deleting the key makes all data encrypted under it unreadable.
            # Note: Python bytes are immutable, so reassigning a reference
            # does not scrub the original key from memory; in production,
            # keep keys in an HSM or KMS and destroy them there.
            del self.user_keys[user_id]

            return {"status": "deleted", "method": "crypto_shredding"}

        return {"status": "not_found"}

Note: Per EDPB guidance, encrypted data "remains personal data." Crypto-shredding satisfies technical erasure but consult legal counsel for your jurisdiction.


Integration Patterns

Sidecar Architecture (Recommended)

The cleanest integration is a sidecar that intercepts the generation flow:

┌─────────────────────────────────────────────────────────────┐
│                  Your AI Service                            │
│                                                             │
│  ┌───────────┐    ┌───────────┐    ┌───────────────┐       │
│  │  Request  │───►│  Safety   │───►│  Generation   │       │
│  │  Handler  │    │  Filter   │    │    Engine     │       │
│  └───────────┘    └───────────┘    └───────────────┘       │
│        │               │                   │               │
│        │      CAP-SRP Sidecar             │               │
│        ▼               ▼                   ▼               │
│  ┌─────────────────────────────────────────────────────┐   │
│  │                Event Interceptor                    │   │
│  │                                                     │   │
│  │  • Logs GEN_ATTEMPT before safety check             │   │
│  │  • Logs GEN_DENY when filter triggers               │   │
│  │  • Logs GEN when content created                    │   │
│  │  • Logs GEN_ERROR on failures                       │   │
│  └─────────────────────────────────────────────────────┘   │
│                          │                                 │
└──────────────────────────┼─────────────────────────────────┘
                           │
                           ▼
              ┌─────────────────────────────┐
              │   Transparency Log Service  │
              │   (SCITT / External)        │
              └─────────────────────────────┘

A FastAPI sketch of the interceptor:

# FastAPI sidecar example
# (load_key, safety_evaluator, generator, attach_c2pa, upload, and
#  POLICY_VERSION are your own wiring)
from fastapi import FastAPI, Request
from pydantic import BaseModel
from starlette.middleware.base import BaseHTTPMiddleware
from typing import Callable

app = FastAPI()
logger = CAPEventLogger(signing_key=load_key(), model_id="image-gen-v3")

class GenerateRequest(BaseModel):
    prompt: str

class CAPSRPMiddleware(BaseHTTPMiddleware):
    def __init__(self, app, logger: CAPEventLogger):
        super().__init__(app)
        self.logger = logger

    async def dispatch(self, request: Request, call_next: Callable):
        if request.url.path == "/generate":
            # Reading the body here relies on Starlette caching it for the
            # downstream handler (supported in recent Starlette releases)
            body = await request.json()

            # COMMITMENT POINT: Log attempt before processing
            attempt = self.logger.log_attempt(
                prompt=body["prompt"],
                user_id=request.headers.get("X-User-ID", "anonymous")
            )

            # Add attempt_id to request state
            request.state.cap_attempt_id = attempt.attempt_id

        response = await call_next(request)

        # Outcome logged by endpoint handler
        return response

app.add_middleware(CAPSRPMiddleware, logger=logger)

@app.post("/generate")
async def generate(request: Request, body: GenerateRequest):
    attempt_id = request.state.cap_attempt_id

    # Safety check
    risk = await safety_evaluator.evaluate(body.prompt)

    if risk.should_deny:
        logger.log_denial(attempt_id, risk.category, risk.score, POLICY_VERSION)
        return {"status": "denied", "reason": risk.category.value}

    try:
        output = await generator.generate(body.prompt)
        logger.log_generation(attempt_id, output, attach_c2pa(output))
        return {"status": "success", "image_url": upload(output)}
    except Exception as e:
        logger.log_error(attempt_id, type(e).__name__, "GENERATION")
        raise

Compliance Tiers

CAP-SRP defines three tiers based on assurance requirements:

| Tier     | Local Logging | External Anchoring | Batch Frequency | HSM Required |
| -------- | ------------- | ------------------ | --------------- | ------------ |
| Silver   | ✓             | TSA only           | Daily           | No           |
| Gold     | ✓             | TSA + Multi-log    | Hourly          | Recommended  |
| Platinum | ✓             | TSA + Blockchain   | Real-time       | Yes          |

class ComplianceTier(Enum):
    SILVER = "silver"
    GOLD = "gold"
    PLATINUM = "platinum"

class CAPSRPClient:
    def __init__(self, tier: ComplianceTier, config: dict):
        self.tier = tier
        self.anchors = self._setup_anchors(tier, config)

    def _setup_anchors(self, tier: ComplianceTier, config: dict):
        # TSAnchor, MultiLogAnchor, BlockchainAnchor are placeholder anchor
        # clients; see the interface sketch after this block
        anchors = [TSAnchor(config["tsa_url"])]  # All tiers

        if tier in [ComplianceTier.GOLD, ComplianceTier.PLATINUM]:
            anchors.append(MultiLogAnchor(config["log_urls"]))

        if tier == ComplianceTier.PLATINUM:
            anchors.append(BlockchainAnchor(config["blockchain"]))

        return anchors

    async def anchor_batch(self, merkle_root: bytes):
        """Anchor Merkle root to all configured external services."""
        receipts = []
        for anchor in self.anchors:
            receipt = await anchor.anchor(merkle_root)
            receipts.append(receipt)
        return receipts
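
A minimal sketch of the interface those anchor classes would share, with a hypothetical HTTP transparency-log client; the URL and payload shape are assumptions, not part of the spec:

from abc import ABC, abstractmethod
import httpx

class Anchor(ABC):
    @abstractmethod
    async def anchor(self, merkle_root: bytes) -> dict:
        """Submit a Merkle root to an external service; return a receipt."""

class HTTPLogAnchor(Anchor):
    """Hypothetical anchor that POSTs the root to a transparency log."""
    def __init__(self, url: str):
        self.url = url

    async def anchor(self, merkle_root: bytes) -> dict:
        async with httpx.AsyncClient() as client:
            resp = await client.post(self.url,
                                     json={"merkle_root": merkle_root.hex()})
            resp.raise_for_status()
            return resp.json()  # assumed receipt: timestamp + log index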

The Regulatory Clock is Ticking

Key Deadlines

| Date              | Requirement                        | Jurisdiction    |
| ----------------- | ---------------------------------- | --------------- |
| Already in effect | GB 45438-2025 AI labeling          | China           |
| August 2, 2026    | EU AI Act Article 12 (logging)     | European Union  |
| August 2, 2026    | California AI Transparency Act     | California, USA |
| January 1, 2027   | Large Online Platform requirements | California, USA |

What "Tamper-Evident Logging" Means

EU AI Act Article 12 requires high-risk AI systems to enable:

"automatic recording of events (logs) over the lifetime of the system"

The logs must:

  • Identify risk situations
  • Facilitate post-market monitoring
  • Enable operational oversight
  • Not be altered "in a way which may affect any subsequent evaluation"

That last requirement effectively mandates tamper-evidence. CAP-SRP provides exactly this.

The 35-State Letter as Preview

The January 23, 2026 letter from NY AG Letitia James (representing 35 states) shows where enforcement is heading:

"We request detailed information about:

  • How your safeguards work
  • How you verify their effectiveness
  • What logging and monitoring you perform"

Without CAP-SRP or equivalent, the answer to "verify their effectiveness" is always "trust us."


Getting Started

Quick Start

# Install the SDK
pip install vap-framework

# Or from source
git clone https://github.com/veritaschain/vap-sdk-python
cd vap-sdk-python
pip install -e .

Minimal Example

from vap_sdk import CAPClient, RiskCategory

# Initialize client
client = CAPClient(
    model_id="my-image-generator",
    transparency_service="https://log.veritaschain.org"  # Or self-hosted
)

# In your generation flow
async def generate(prompt: str, user_id: str):
    # 1. Log attempt (commitment point)
    attempt = await client.log_attempt(prompt, user_id)

    # 2. Your safety evaluation
    risk = await your_safety_check(prompt)

    # 3. Log outcome
    if risk.is_unsafe:
        await client.log_denial(attempt.id, risk.category, risk.score)
        return None

    output = await your_generator(prompt)
    await client.log_generation(attempt.id, output)
    return output

Resources


Conclusion

The Grok crisis revealed what we should have known all along: you can't prove a negative with positive-only attestation systems.

C2PA proves what was created. SynthID marks what was generated. But neither can prove what was refused.

CAP-SRP fills this gap with a simple but powerful insight: log the decision, not just the content. By treating refusals as first-class cryptographic events—with the same rigor we apply to generations—we create accountability that actually works.

The math is straightforward:

ATTEMPTS = GENERATIONS + REFUSALS + ERRORS

If that equation doesn't balance, someone's cheating. And with external anchoring, you can't hide it.

The regulatory deadlines are real (August 2026 for EU AI Act). The technical solutions exist. The question isn't whether AI systems will need refusal provenance—it's whether you'll implement it proactively or scramble to comply.

The code is open source. The spec is public. Let's build AI systems that can prove they're safe, not just claim it.


Questions? Issues?


This post is part of the "Verifiable AI Provenance" series. Next up: "Integrating CAP-SRP with C2PA for Complete Content Lifecycle Tracking."

VeritasChain Standards Organization (VSO) is a non-profit standards body developing open specifications for AI accountability. We don't sell products—we write standards.
