DEV Community


VeraSnap Solves the Other Half: Building Cryptographic Proof of Reality in a World That Can't Detect Fakes

Last week we published "Four Events in 24 Hours Exposed the Same Gap." Microsoft said detection doesn't work. Samsung shipped labels for AI content. Canada couldn't verify OpenAI's safety claims. Australia prosecuted deepfakes without forensic infrastructure. All four pointed at the same blind spot: no one can prove what AI refused to generate. CAP-SRP addresses that — the "output side." This article addresses the other half: the "input side." How do you prove a photograph is real?


TL;DR

The AI provenance ecosystem has two halves. CAP-SRP proves what AI systems refused to generate. VeraSnap (implementing the Content Provenance Protocol / CPP) proves what cameras actually captured. Together, they close the loop: every piece of digital media has either a generation provenance chain (C2PA + CAP-SRP) or a capture provenance chain (CPP / VeraSnap) — or it has neither, and that absence is informative.

This article walks through:

  • Why detection-based approaches fail (Microsoft's own data)
  • How VeraSnap's capture-time cryptography sidesteps the detection arms race
  • The CPP event model, hash chains, and Completeness Invariant
  • Screen recapture detection with LiDAR, moiré analysis, and IMU
  • Working code: Swift (iOS) and Kotlin (Android) implementations
  • C2PA integration for interoperability with the Content Credentials ecosystem
  • How this changes each of the four scenarios from last week's article

Table of Contents

  1. The Detection Dead End
  2. Flip the Question: Prove Authenticity, Don't Detect Fakes
  3. CPP Architecture: What VeraSnap Actually Does
  4. The Capture Event Flow
  5. Hash Chains and Case-Based Separation
  6. The Completeness Invariant: Detecting Deleted Evidence
  7. Screen Recapture Detection: Closing the Analog Hole
  8. Hardware-Backed Signatures
  9. External Timestamping: Eliminating Self-Attestation
  10. The 200ms Provenance Gate: What Happens at Share Time
  11. C2PA Integration
  12. Revisiting the Four Events
  13. Implementation: iOS (Swift)
  14. Implementation: Android (Kotlin)
  15. What VeraSnap Proves and What It Doesn't
  16. Developer Takeaways

The Detection Dead End {#the-detection-dead-end}

Let's start where last week's article started: Microsoft's LASER report.

The team evaluated 60 combinations of provenance methods — C2PA metadata, imperceptible watermarking, soft-hash fingerprinting — across images, audio, and video. Only 20 achieved "high-confidence provenance authentication." The report introduced "reversal attacks": taking a real photo, making a trivial AI edit, and watching the provenance system label it as "AI-generated." A real photograph, discredited by the system designed to protect it.

Here's the detection landscape in one matrix:

                ┌─────────────────────────────────────────────┐
                │         CONTENT VERIFICATION MATRIX          │
                ├──────────────┬──────────┬──────────┬────────┤
                │   Method     │ Positive │ Negative │ Conf.  │
                │              │ (is AI?) │ (is real)│ Level  │
                ├──────────────┼──────────┼──────────┼────────┤
                │ C2PA         │ ✅ Yes*  │ ❌ No    │ Med    │
                │ Watermark    │ ⚠️ Maybe │ ❌ No    │ Low    │
                │ Fingerprint  │ ⚠️ Maybe │ ❌ No    │ Low    │
                │ AI Detector  │ ⚠️ Maybe │ ❌ No    │ Low    │
                │ Combined     │ ✅ Yes*  │ ❌ No    │ Med-Hi │
                ├──────────────┼──────────┼──────────┼────────┤
                │ CPP/VeraSnap │ N/A      │ ✅ Yes   │ High   │
                └──────────────┴──────────┴──────────┴────────┘

                * = if metadata preserved (often stripped)

Every existing method answers: "Is this content AI-generated?" None answers: "Is this content captured from reality?"

The distinction matters. Detection is inherently probabilistic — a confidence score, not a proof. Capture provenance is cryptographic — a mathematical guarantee, not an opinion.


Flip the Question: Prove Authenticity, Don't Detect Fakes {#flip-the-question}

The deepfake problem creates what researchers call the Liar's Dividend: when any image could be AI-generated, even authentic images become deniable. Politicians claim leaked photos are fake. Criminals argue surveillance footage was manufactured. Whistleblowers find their evidence dismissed.

The detection arms race can't solve this. As generators improve, detectors fall behind. Microsoft's report confirms it.

VeraSnap takes a fundamentally different approach:

DETECTION APPROACH (arms race):
  Content exists → Run detector → "Maybe fake" / "Maybe real"
                                   (probabilistic, degrading)

PROVENANCE APPROACH (VeraSnap):
  Shutter pressed → Crypto proof generated → "Provably captured"
                    (at capture time)         (mathematical, permanent)

We're not trying to detect fakes. We're proving authenticity. The burden of proof shifts from "prove this is fake" to "does this have capture provenance?"

Images with VeraSnap credentials are in a categorically different evidentiary class. Not because they're "more trusted" — because they have mathematical proof that a physical device captured physical light at a specific moment in time, witnessed by an independent timestamp authority.


CPP Architecture: What VeraSnap Actually Does {#cpp-architecture}

VeraSnap implements the Content Provenance Protocol (CPP), an open specification published as IETF Internet-Draft (draft-vso-cpp-core) and maintained at github.com/veritaschain/cpp-spec.

CPP is part of the VAP (Verifiable AI Provenance) framework — the same framework that houses CAP-SRP. They share identical cryptographic primitives:

┌──────────────────────────────────────────────────────────────┐
│                      VAP Framework v1.2                      │
│                                                              │
│  ┌───────────────┐  ┌───────────────┐  ┌─────────────────┐   │
│  │   CAP-SRP     │  │     CPP       │  │      VCP        │   │
│  │  (AI Refusal  │  │  (Capture     │  │  (Trading       │   │
│  │   Provenance) │  │   Provenance) │  │   Audit Trail)  │   │
│  └───────┬───────┘  └───────┬───────┘  └────────┬────────┘   │
│          │                  │                   │            │
│          └──────────────────┼───────────────────┘            │
│                             │                                │
│  ┌──────────────────────────┴───────────────────────────┐    │
│  │             Shared Cryptographic Layer               │    │
│  │  Ed25519/ES256 │ SHA-256 │ Merkle │ RFC 3161 │ SCITT │    │
│  └──────────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────────┘

The key design principle: all cryptographic commitments happen at the moment of capture, before the system knows what the content looks like. Unlike C2PA, which can be applied to any image at any time, CPP credentials can only be created by a device that was physically present when the shutter was pressed.


The Capture Event Flow {#the-capture-event-flow}

Here's what happens in the ~200ms between tapping the shutter and seeing the preview:

┌─────────────────────────────────────────────────────────────────┐
│                 CPP CAPTURE EVENT FLOW                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  [1] User taps shutter                                          │
│       │                                                          │
│       ▼                                                          │
│  [2] Sensor data acquisition (GPS, accelerometer, ambient light) │
│       │                                                          │
│       ▼                                                          │
│  [3] Raw image → SHA-256 hash (media_hash)                      │
│       │                                                          │
│       ▼                                                          │
│  [4] CPP INGEST event constructed                               │
│       │  ├─ EventID (UUID v7, time-ordered)                     │
│       │  ├─ ChainID (case-specific hash chain)                  │
│       │  ├─ PrevHash (link to previous event)                   │
│       │  ├─ media_hash (SHA-256 of raw image)                   │
│       │  └─ CaptureContext (device, sensors, depth)             │
│       ▼                                                          │
│  [5] EventHash = SHA-256(JCS_canonicalize(event))               │
│       │                                                          │
│       ▼                                                          │
│  [6] Signature = SecureEnclave.sign(EventHash, device_key)      │
│       │           (ES256 / P-256, hardware-backed)              │
│       ▼                                                          │
│  [7] Local storage + anchor queue                               │
│       │                                                          │
│       ▼  (batch, every 30 min)                                  │
│  [8] Merkle tree → RFC 3161 TSA → external timestamp            │
│       │                                                          │
│       ▼                                                          │
│  [9] Verification Pack generated (offline-verifiable)           │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Steps [1]–[6] happen on-device in under 200ms. Steps [7]–[9] happen asynchronously. The critical point: the cryptographic commitment to the image content (step 3) happens before any processing, editing, or network activity. The hash of the raw sensor data is the first thing computed.
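The on-device portion of this flow can be sketched in Python. Field names follow the diagram above, but the helper itself is illustrative rather than VeraSnap's implementation; note the UUIDv4 stand-in where the spec calls for time-ordered UUIDv7:

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone

GENESIS = "0" * 64  # PrevHash of the first event in a chain

def build_ingest_event(raw_image: bytes, chain_id: str,
                       prev_hash: str, capture_context: dict) -> dict:
    """Illustrative CPP INGEST event (steps 3-5 of the capture flow)."""
    event = {
        "EventID": str(uuid.uuid4()),   # spec uses UUIDv7; v4 stands in here
        "EventType": "INGEST",
        "ChainID": chain_id,
        "PrevHash": prev_hash,
        "Timestamp": datetime.now(timezone.utc).isoformat(),
        # Commit to the raw sensor bytes BEFORE any processing (step 3)
        "media_hash": hashlib.sha256(raw_image).hexdigest(),
        "CaptureContext": capture_context,
    }
    # Step 5: hash the canonicalized event (Signature is added afterwards
    # and, like EventHash itself, is excluded from the hashed payload)
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    event["EventHash"] = hashlib.sha256(canonical.encode()).hexdigest()
    return event
```

Chaining the next capture is then just passing the previous `EventHash` as `prev_hash`; step 6 would sign `EventHash` with the hardware key.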


Hash Chains and Case-Based Separation {#hash-chains}

Like CAP-SRP, CPP uses hash chains — but with an important design difference: case-based chain separation.

In CAP-SRP, a single AI system maintains one chain of generation attempts and refusals. In VeraSnap, each "case" (project, investigation, assignment) gets its own independent chain:

Case: "Insurance Claim #4521"
  Event₁: INGEST (front bumper photo)    PrevHash = GENESIS
  Event₂: INGEST (rear damage photo)     PrevHash = Hash(Event₁)
  Event₃: INGEST (odometer reading)      PrevHash = Hash(Event₂)
  Event₄: EXPORT (shared to adjuster)    PrevHash = Hash(Event₃)

Case: "Construction Site Visit 2026-02-25"
  Event₁: INGEST (foundation pour)       PrevHash = GENESIS
  Event₂: INGEST (rebar placement)       PrevHash = Hash(Event₁)
  ...

Why case separation? Privacy and disclosure scope. When you share evidence for Insurance Claim #4521, you share only that chain. Your construction photos, personal captures, and other cases are cryptographically isolated. Each chain is independently verifiable.
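Mechanically, case isolation reduces to keeping one independent chain head per case. A minimal sketch (in-memory map, illustrative names, payload hashing simplified):

```python
import hashlib

GENESIS = "0" * 64

class CaseChains:
    """Toy per-case chain heads: each case is an independent hash chain."""

    def __init__(self) -> None:
        self.heads: dict[str, str] = {}  # case_id -> latest event hash

    def append(self, case_id: str, event_payload: bytes) -> dict:
        # A case not seen before starts at GENESIS: no cross-case linkage
        prev = self.heads.get(case_id, GENESIS)
        event_hash = hashlib.sha256(prev.encode() + event_payload).hexdigest()
        self.heads[case_id] = event_hash
        return {"case": case_id, "prev_hash": prev, "event_hash": event_hash}
```

Sharing the "claim_4521" chain reveals nothing about any other case, because no hash in one chain ever feeds into another.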

The chain structure is identical to CAP-SRP:

import hashlib
import json

def compute_event_hash(event: dict) -> str:
    """
    CPP event hash computation.
    Identical algorithm to CAP-SRP — shared VAP infrastructure.
    """
    # Simplified canonicalization (sorted keys, compact separators).
    # Production CPP uses RFC 8785 (JCS), which additionally fixes
    # number and string serialization rules that json.dumps does not.
    canonical = json.dumps(event, sort_keys=True, separators=(',', ':'))
    return hashlib.sha256(canonical.encode('utf-8')).hexdigest()

def verify_chain(events: list[dict]) -> bool:
    """Verify hash chain integrity."""
    for i, event in enumerate(events):
        # Verify event hash
        stored_hash = event['EventHash']
        computed_hash = compute_event_hash({
            k: v for k, v in event.items()
            if k not in ['EventHash', 'Signature']
        })
        if stored_hash != computed_hash:
            print(f"Hash mismatch at event {i}")
            return False

        # Verify chain linkage
        if i == 0:
            if event['PrevHash'] != '0' * 64:
                print("Genesis event has non-zero PrevHash")
                return False
        else:
            if event['PrevHash'] != events[i-1]['EventHash']:
                print(f"Chain break between events {i-1} and {i}")
                return False

    return True

The Completeness Invariant: Detecting Deleted Evidence {#completeness-invariant}

This is where CPP gets interesting — and where it diverges from simple hash chains.

In CAP-SRP, the Completeness Invariant says: GEN_ATTEMPT = GEN + GEN_DENY + GEN_ERROR. Every attempt has exactly one outcome.

In CPP, the problem is different: how do you detect that someone deleted a photograph from a chain? If a police officer takes 10 photos at a crime scene but submits only the first 8, the truncated hash chain still verifies correctly. The two deleted photos leave no trace in the chain itself.

CPP solves this with the XOR Completeness Invariant:

def compute_xor_checksum(event_hashes: list[str]) -> str:
    """
    CPP Completeness Invariant.
    XOR of all event hashes in a chain.
    Deleting ANY event changes the checksum.
    """
    result = bytes(32)  # 256-bit zero
    for h in event_hashes:
        h_bytes = bytes.fromhex(h)
        result = bytes(a ^ b for a, b in zip(result, h_bytes))
    return result.hex()

def verify_completeness(chain_events: list[dict],
                        committed_checksum: str,
                        committed_count: int) -> bool:
    """
    Verify chain completeness against committed values.

    The checksum and count are committed to an external TSA
    at anchor time. If any event is later deleted:
    - The count won't match
    - The XOR checksum won't match
    - Both are independently detectable
    """
    current_hashes = [e['EventHash'] for e in chain_events]
    current_checksum = compute_xor_checksum(current_hashes)
    current_count = len(chain_events)

    count_ok = current_count == committed_count
    checksum_ok = current_checksum == committed_checksum

    if not count_ok:
        print(f"Event count mismatch: {current_count} vs {committed_count}")
    if not checksum_ok:
        print(f"XOR checksum mismatch: deletion detected")

    return count_ok and checksum_ok

The CPP spec is explicit about the strength of XOR: "XOR is for omission detection, NOT cryptographic commitment." It detects deletion. It doesn't prevent sophisticated attacks where someone replaces one event with another that produces the same XOR. That's why the Merkle tree (committed externally) provides the stronger guarantee. The XOR is a cheap sanity check, updatable in O(1) per new event; the Merkle proof is the authoritative verification.

┌─────────────────────────────────────────────────┐
│         CPP COMPLETENESS VERIFICATION            │
│                                                  │
│  Layer 1: Event Count                            │
│  ├─ Committed: 10 events                        │
│  └─ Current:   10 events  ✅                    │
│                                                  │
│  Layer 2: XOR Checksum                           │
│  ├─ Committed: a7f3...9c2d                      │
│  └─ Current:   a7f3...9c2d  ✅                  │
│                                                  │
│  Layer 3: Merkle Root (TSA-stamped)              │
│  ├─ Committed: 5e82...1fb7 (RFC 3161 receipt)   │
│  └─ Recomputed: 5e82...1fb7  ✅                 │
│                                                  │
│  Result: Chain is COMPLETE and UNMODIFIED        │
└─────────────────────────────────────────────────┘
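For Layer 3, a binary Merkle root over the chain's event hashes can be sketched as below. The pairing order and odd-node duplication rule are illustrative assumptions, not necessarily the spec's exact tree construction:

```python
import hashlib

def _sha256(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def merkle_root(event_hashes: list[str]) -> str:
    """Toy binary Merkle root over hex event hashes.

    Unlike the XOR checksum, the root depends on leaf order and
    position, so substituting or reordering events changes it.
    """
    level = [bytes.fromhex(h) for h in event_hashes]
    if not level:
        return "0" * 64
    while len(level) > 1:
        if len(level) % 2:           # duplicate last node on odd levels
            level.append(level[-1])
        level = [_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0].hex()
```

At anchor time this root (plus the event count and XOR checksum) is what gets committed to the RFC 3161 TSA.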

Screen Recapture Detection: Closing the Analog Hole {#screen-detection}

Every content provenance system has a fundamental vulnerability: the analog hole. Take a deepfake, display it on a high-quality monitor, photograph the monitor with a "trusted" camera app. The photo now has valid capture credentials for synthetic content.

This is the attack that Microsoft's "reversal attacks" describe from the generation side. CPP addresses it from the capture side with multi-modal screen detection:

┌─────────────────────────────────────────────────────────────┐
│              MULTI-MODAL SCREEN DETECTION                    │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌─────────────────┐  Sensor 1: LiDAR Depth Analysis       │
│  │  Real Scene:     │  ├─ Measures actual distance to       │
│  │  Depth varies    │  │   objects in scene                  │
│  │  0.3m to ∞       │  ├─ Screen: flat depth plane          │
│  │                  │  │   (all pixels same distance)        │
│  │  Screen:         │  └─ Real: depth variance > threshold  │
│  │  Flat plane      │                                       │
│  │  ~0.4m uniform   │  Sensor 2: Moiré Pattern Detection    │
│  └─────────────────┘  ├─ Screen pixels create interference  │
│                       │   patterns with camera sensor        │
│                       └─ Frequency analysis in Fourier       │
│                          domain detects periodic artifacts   │
│                                                              │
│  Sensor 3: Rolling Shutter Flicker                          │
│  ├─ Screens refresh at 60/120/144 Hz                        │
│  ├─ CMOS rolling shutter captures different                 │
│  │   refresh phases across scan lines                        │
│  └─ Detectable as horizontal banding                        │
│                                                              │
│  Sensor 4: IMU Human Presence                               │
│  ├─ Accelerometer detects micro-tremors                     │
│  │   characteristic of human hand-holding                    │
│  ├─ Gyroscope detects natural sway                          │
│  └─ Absence of human motion → tripod/mount → flag           │
│                                                              │
│  Combined verdict: REAL_SCENE | SCREEN_DETECTED | UNKNOWN   │
│  (cryptographically bound to capture event)                  │
└─────────────────────────────────────────────────────────────┘
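Sensor 2's core idea, screen pixel grids producing sharp periodic peaks in the image's frequency spectrum, can be sketched with NumPy. The masking radius and cutoff below are toy values chosen for this synthetic example, not VeraSnap's tuning:

```python
import numpy as np

def moire_score(gray: np.ndarray) -> float:
    """Ratio of the strongest high-frequency peak to the mean
    high-frequency magnitude. Periodic screen-grid interference
    concentrates energy in a few bins, driving the ratio up;
    natural textures spread energy broadly."""
    spec = np.abs(np.fft.fftshift(np.fft.fft2(gray - gray.mean())))
    h, w = spec.shape
    cy, cx = h // 2, w // 2
    y, x = np.ogrid[:h, :w]
    # Exclude the low-frequency disk (ordinary scene content)
    mask = (y - cy) ** 2 + (x - cx) ** 2 > (min(h, w) // 8) ** 2
    high = spec[mask]
    return float(high.max() / (high.mean() + 1e-9))
```

In this toy setup, a periodic grid scores orders of magnitude above seeded random texture, so a cutoff near 50 separates the two; a production detector would fuse this with the other three sensors rather than threshold it alone.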

The verdict is embedded in the capture event and signed before the image data is processed. You can't retroactively change the screen detection result without invalidating the hash chain.

Here's the Swift implementation for LiDAR depth analysis:

import ARKit

struct DepthAnalysisResult {
    let isScreen: Bool
    let confidence: Float       // 0.0 - 1.0
    let depthVariance: Float
    let flatPlaneRatio: Float
    let sensorType: String      // "lidar" | "truedepth" | "stereo"
}

func analyzeDepth(frame: ARFrame) -> DepthAnalysisResult {
    guard let depthMap = frame.sceneDepth?.depthMap else {
        return DepthAnalysisResult(
            isScreen: false, confidence: 0.0,
            depthVariance: 0.0, flatPlaneRatio: 0.0,
            sensorType: "none"
        )
    }

    let width = CVPixelBufferGetWidth(depthMap)
    let height = CVPixelBufferGetHeight(depthMap)

    CVPixelBufferLockBaseAddress(depthMap, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(depthMap, .readOnly) }

    let baseAddress = CVPixelBufferGetBaseAddress(depthMap)!
    let buffer = baseAddress.assumingMemoryBound(to: Float32.self)

    // Sample center region (avoid edges where depth is noisy)
    let startX = width / 4
    let startY = height / 4
    let endX = 3 * width / 4
    let endY = 3 * height / 4

    var depths: [Float] = []
    for y in stride(from: startY, to: endY, by: 4) {
        for x in stride(from: startX, to: endX, by: 4) {
            let depth = buffer[y * width + x]
            if depth > 0 && depth < 10.0 { // Valid range: 0-10m
                depths.append(depth)
            }
        }
    }

    guard depths.count > 100 else {
        return DepthAnalysisResult(
            isScreen: false, confidence: 0.0,
            depthVariance: 0.0, flatPlaneRatio: 0.0,
            sensorType: "lidar"
        )
    }

    // Compute statistics
    let mean = depths.reduce(0, +) / Float(depths.count)
    let variance = depths.map { ($0 - mean) * ($0 - mean) }
                        .reduce(0, +) / Float(depths.count)

    // Screen detection: screens have very low depth variance
    // Real scenes: variance typically > 0.05
    // Screens: variance typically < 0.01
    let flatPlaneThreshold: Float = 0.015
    let flatPixels = depths.filter { abs($0 - mean) < flatPlaneThreshold }
    let flatPlaneRatio = Float(flatPixels.count) / Float(depths.count)

    // Screen: >85% of pixels at same depth
    let isScreen = flatPlaneRatio > 0.85 && variance < 0.02

    return DepthAnalysisResult(
        isScreen: isScreen,
        confidence: isScreen ? min(flatPlaneRatio, 0.99) : 1.0 - flatPlaneRatio,
        depthVariance: variance,
        flatPlaneRatio: flatPlaneRatio,
        sensorType: "lidar"
    )
}

Hardware-Backed Signatures {#hardware-signatures}

VeraSnap's cryptographic keys never leave the device's hardware security module. On iOS, that's Apple's Secure Enclave. On Android, it's StrongBox (or TEE as fallback).

import CryptoKit
import Security

class HardwareKeyManager {
    private let keyTag = "org.veritaschain.verasnap.signing"

    /// Generate a key pair inside Secure Enclave.
    /// The private key CANNOT be extracted — ever.
    func generateSigningKey() throws -> SecKey {
        let access = SecAccessControlCreateWithFlags(
            kCFAllocatorDefault,
            kSecAttrAccessibleWhenUnlockedThisDeviceOnly,
            [.privateKeyUsage, .biometryCurrentSet],
            nil
        )!

        let attributes: [String: Any] = [
            kSecAttrKeyType as String: kSecAttrKeyTypeECSECPrimeRandom,
            kSecAttrKeySizeInBits as String: 256,
            kSecAttrTokenID as String: kSecAttrTokenIDSecureEnclave,
            kSecPrivateKeyAttrs as String: [
                kSecAttrIsPermanent as String: true,
                kSecAttrApplicationTag as String: keyTag.data(using: .utf8)!,
                kSecAttrAccessControl as String: access
            ]
        ]

        var error: Unmanaged<CFError>?
        guard let privateKey = SecKeyCreateRandomKey(attributes as CFDictionary, &error) else {
            throw error!.takeRetainedValue() as Error
        }

        return privateKey
    }

    /// Sign data using Secure Enclave — key never leaves hardware.
    func sign(data: Data, with privateKey: SecKey) throws -> Data {
        let hash = SHA256.hash(data: data)
        let hashData = Data(hash)

        var error: Unmanaged<CFError>?
        guard let signature = SecKeyCreateSignature(
            privateKey,
            .ecdsaSignatureDigestX962SHA256,
            hashData as CFData,
            &error
        ) else {
            throw error!.takeRetainedValue() as Error
        }

        return signature as Data
    }
}

This is a critical differentiator from C2PA. C2PA's specification states that "the signer need not be human." A server can sign C2PA credentials. An AI generator can create an image and immediately give it credentials.

VeraSnap's credentials are bound to a physical device's hardware security module with optional biometric binding. You can't generate VeraSnap credentials from a server. You can't add them to an AI-generated image. You can only create them by pressing a shutter button on a device that was physically present.


External Timestamping: Eliminating Self-Attestation {#external-timestamps}

Every device has a system clock. Every system clock can be manipulated. Self-attested timestamps prove nothing.

CPP requires RFC 3161 timestamping from an independent Time Stamp Authority (TSA):

import requests
from asn1crypto import tsp

def request_rfc3161_timestamp(data_hash: bytes,
                               tsa_url: str = "http://timestamp.digicert.com"
                               ) -> bytes:
    """
    Request an RFC 3161 timestamp from an independent TSA.

    The TSA doesn't see the content — only the hash.
    It returns a signed token proving this hash existed at this time.
    """
    # Build TimeStampReq (RFC 3161 §2.4.1)
    message_imprint = tsp.MessageImprint({
        'hash_algorithm': {'algorithm': 'sha256'},
        'hashed_message': data_hash
    })

    ts_req = tsp.TimeStampReq({
        'version': 'v1',
        'message_imprint': message_imprint,
        'cert_req': True  # Request TSA certificate in response
    })

    # Send to TSA
    response = requests.post(
        tsa_url,
        data=ts_req.dump(),
        headers={'Content-Type': 'application/timestamp-query'}
    )

    # Parse TimeStampResp
    ts_resp = tsp.TimeStampResp.load(response.content)

    if ts_resp['status']['status'].native != 'granted':
        raise ValueError(f"TSA rejected request: {ts_resp['status']}")

    # Extract the signed token
    return ts_resp['time_stamp_token'].dump()


def verify_rfc3161_timestamp(token: bytes,
                              expected_hash: bytes) -> dict:
    """
    Verify an RFC 3161 timestamp token.

    Returns the TSA-certified time if valid.
    This verification can happen offline, years later.
    """
    ts_token = tsp.TimeStampToken.load(token)
    tst_info = ts_token['content']['encap_content_info']['content'].parsed

    # Verify the hash matches
    token_hash = tst_info['message_imprint']['hashed_message'].native
    if token_hash != expected_hash:
        raise ValueError("Hash mismatch: timestamp doesn't match content")

    # Extract certified time
    gen_time = tst_info['gen_time'].native

    return {
        'verified': True,
        'timestamp': gen_time.isoformat(),
        'tsa': tst_info['tsa'].human_friendly if tst_info['tsa'] else 'Unknown',
        'serial': tst_info['serial_number'].native
    }

The key property: the TSA never sees the image or its content. It receives only a hash and returns a signed receipt. Privacy is preserved. But the receipt proves — to anyone, at any time, without trusting VeraSnap or the photographer — that this specific hash existed at this specific time.


The 200ms Provenance Gate: What Happens at Share Time {#provenance-gate}

CPP v1.5 introduced the Pre-Publish Verification Extension. Here's the problem it solves: social media platforms strip all metadata on upload. Your C2PA credentials, EXIF data, and CPP proofs are gone the moment you share to X, Instagram, or WhatsApp.

VeraSnap intercepts the share action and, in under 200ms:

  1. Validates the full CPP chain (hash chain + signatures + timestamps)
  2. Composes a VisualMark — a visible provenance indicator in the image pixels
  3. Embeds a verification QR code that links to the Verification Pack
  4. Applies an invisible watermark as a third survival layer
  5. Forwards the composited image to the destination app
┌────────────────────────────────────────────────────────┐
│            200ms PROVENANCE GATE (v1.5)                 │
│                                                         │
│  User taps "Share to X"                                │
│       │                                                 │
│       ▼  [0ms]                                         │
│  VeraSnap intercepts Intent/Share Extension             │
│       │                                                 │
│       ▼  [50ms]                                        │
│  Full CPP verification runs                            │
│  ├─ Hash chain integrity  ✅                           │
│  ├─ Signature validation  ✅                           │
│  ├─ TSA token check       ✅                           │
│  └─ Completeness Invariant ✅                          │
│       │                                                 │
│       ▼  [100ms]                                       │
│  Three-layer composition:                              │
│  ├─ L1: VisualMark (icon in corner)                    │
│  ├─ L2: QR code (links to Verification Pack)           │
│  └─ L3: Invisible watermark (steganographic)           │
│       │                                                 │
│       ▼  [200ms]                                       │
│  Forward to X with all three layers                    │
│                                                         │
│  Platform strips metadata? Doesn't matter.             │
│  The proof is IN the pixels now.                       │
└────────────────────────────────────────────────────────┘

When anyone scans the QR code, they access the complete Verification Pack — the full CPP chain, Merkle proofs, and TSA receipts — and can independently verify everything offline.
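Layer 3's "invisible watermark" can be illustrated with a toy least-significant-bit scheme. Real deployments need watermarks that survive recompression and resizing, which naive LSB embedding does not; this sketch only shows the proof-lives-in-the-pixels idea:

```python
import numpy as np

def embed_bits(pixels: np.ndarray, payload: bytes) -> np.ndarray:
    """Write payload bits into the LSB of the first len(payload)*8 pixels."""
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    out = pixels.flatten().copy()
    # Clear each target pixel's LSB, then OR in one payload bit
    out[: bits.size] = (out[: bits.size] & 0xFE) | bits
    return out.reshape(pixels.shape)

def extract_bits(pixels: np.ndarray, n_bytes: int) -> bytes:
    """Read n_bytes back out of the pixel LSBs."""
    bits = pixels.flatten()[: n_bytes * 8] & 1
    return np.packbits(bits).tobytes()
```

Each pixel changes by at most one intensity level, so the mark is invisible, yet the payload (in practice, a reference to the Verification Pack) survives metadata stripping because it lives in the image data itself.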


C2PA Integration {#c2pa-integration}

VeraSnap has applied to the C2PA Conformance Program and already supports C2PA v2.3-compatible manifest export. The integration maps CPP's proof fields to C2PA's Content Credentials format:

{
  "c2pa_manifest": {
    "claim_generator": "VeraSnap/1.5 CPP/1.5",
    "title": "Capture_20260225_144700.heic",
    "format": "image/heic",
    "assertions": [
      {
        "label": "c2pa.actions",
        "data": {
          "actions": [{
            "action": "c2pa.created",
            "digitalSourceType": "http://cv.iptc.org/newscodes/digitalsourcetype/digitalCapture",
            "softwareAgent": "VeraSnap/1.5"
          }]
        }
      },
      {
        "label": "org.veritaschain.cpp.capture_provenance",
        "data": {
          "cpp_version": "1.5",
          "chain_id": "case_4521_insurance",
          "event_hash": "5e82a9f3...1fb7c42d",
          "media_hash": "a7f3b2c1...9c2de8f0",
          "tsa_token_ref": "rfc3161://digicert/token/2026022514470023",
          "depth_analysis": {
            "sensor": "lidar",
            "verdict": "REAL_SCENE",
            "confidence": 0.97
          },
          "completeness_invariant": {
            "event_count": 10,
            "xor_checksum": "c4a1...7f3e"
          },
          "human_attestation": {
            "biometric_bound": true,
            "method": "face_id"
          }
        }
      }
    ]
  }
}

This means VeraSnap-captured images can flow through the entire C2PA ecosystem — Adobe Content Credentials, Google's provenance tools, Microsoft's media authentication — while carrying the additional CPP guarantees that C2PA alone doesn't provide: external timestamps, depth analysis, completeness verification, and hardware-backed capture binding.


Revisiting the Four Events {#revisiting-four-events}

Let's return to last week's four events and see what changes with VeraSnap in the picture:

1. Microsoft LASER: "Detection doesn't work reliably"

Without VeraSnap: You have a photo. You run three detection tools. One says "probably real," one says "inconclusive," one says "probably AI." Confidence: zero.

With VeraSnap: The photo either has a CPP Verification Pack or it doesn't. If it does: verify the hash chain, check the TSA receipt, confirm the depth analysis. The photo is provably captured. No detection arms race. No probabilistic guessing.

Microsoft's 60-combination matrix becomes less critical when the question shifts from "detect fakes" to "verify authenticity."

2. Samsung S26: "Labels only what AI created"

Without VeraSnap: Samsung labels AI-generated images with C2PA credentials. Unedited photos carry no provenance. Absence of a label tells you nothing.

With VeraSnap: Unedited photos captured with VeraSnap carry CPP credentials — proving they were captured from reality by a physical device. The "nutrition label" analogy gains a critical second half: images have either generation credentials (C2PA/Samsung) or capture credentials (CPP/VeraSnap). Images with neither are the ones that warrant scrutiny.

┌──────────────────────────────────────────────────────┐
│           COMPLETE PROVENANCE LANDSCAPE               │
│                                                       │
│  AI-Generated Content    Real-World Captures          │
│  ┌─────────────────┐    ┌──────────────────┐         │
│  │ C2PA + CAP-SRP  │    │  CPP / VeraSnap  │         │
│  │                 │    │                  │         │
│  │ "Generated by   │    │ "Captured by     │         │
│  │  Firefly at     │    │  iPhone 16 Pro   │         │
│  │  2:47 PM.       │    │  at 2:47 PM.     │         │
│  │  Request logged │    │  LiDAR confirms  │         │
│  │  in audit       │    │  3D scene.       │         │
│  │  trail."        │    │  TSA receipt:    │         │
│  │                 │    │  DigiCert."      │         │
│  └─────────────────┘    └──────────────────┘         │
│                                                       │
│  No Credentials = Unknown Origin                      │
│  ┌─────────────────────────────────────────┐         │
│  │  Could be real. Could be AI. Could be   │         │
│  │  manipulated. No mathematical evidence  │         │
│  │  either way.                            │         │
│  └─────────────────────────────────────────┘         │
└──────────────────────────────────────────────────────┘

3. Canada-OpenAI: "Can't verify safety claims"

Without VeraSnap: OpenAI says it banned the account and refused subsequent requests. Canada has to take their word for it.

With VeraSnap: This is primarily a CAP-SRP problem (refusal provenance), not a capture provenance problem. But VeraSnap complements CAP-SRP in a crucial way: if the Tumbler Ridge investigation involved documentary evidence — surveillance photos, social media posts, digital communications — VeraSnap-credentialed captures would have verifiable timestamps and chain of custody. The investigative evidence itself becomes cryptographically reliable.

4. South Australia: "Law arrived before evidence infrastructure"

Without VeraSnap: Prosecutors rely on endpoint forensics and provider cooperation. If a suspect claims the deepfake was created by someone else, proving provenance is essentially impossible.

With VeraSnap: Two forensic tools become available. On the refusal side (CAP-SRP): which AI services refused to generate similar content for this account? On the capture side (VeraSnap): any real photographs used as source material for the deepfake — were they captured with provenance? Can we trace the original, unmanipulated images to specific devices and moments?

The deepfake attribution chain gains structure:

Source image (VeraSnap proof of capture)
    → Uploaded to AI service (C2PA generation record)
    → AI service refused (CAP-SRP refusal log)
    → Tried different service (CAP-SRP shows attempt)
    → Generation succeeded (C2PA output record)
    → Shared online (VeraSnap QR code survives upload)

Each step is independently verifiable. The investigation becomes structural, not testimonial.


Implementation: iOS (Swift) {#implementation-ios}

Here's a minimal but complete CPP capture implementation for iOS:

import Foundation
import CryptoKit

// MARK: - CPP Event Model

struct CPPEvent: Codable {
    let eventID: String        // UUID v7
    let chainID: String        // Case-specific chain identifier
    let eventType: String      // INGEST | EXPORT | ANCHOR | DELETE
    let timestamp: String      // ISO 8601
    let prevHash: String       // SHA-256 of previous event (or 64 zeros)
    let payload: CPPPayload
    var eventHash: String = ""
    var signature: String = ""

    struct CPPPayload: Codable {
        let mediaHash: String?           // SHA-256 of raw image
        let captureContext: CaptureContext?
        let depthAnalysis: DepthResult?
    }

    struct CaptureContext: Codable {
        let deviceModel: String
        let osVersion: String
        let sensorType: String
        let latitude: Double?
        let longitude: Double?
    }

    struct DepthResult: Codable {
        let sensor: String     // lidar | truedepth | stereo
        let verdict: String    // REAL_SCENE | SCREEN_DETECTED | UNKNOWN
        let confidence: Float
        let depthVariance: Float
    }

    mutating func computeHash() {
        // Canonical JSON (sorted keys) keeps the hash input deterministic;
        // String(describing:) output is not a stable serialization format
        let encoder = JSONEncoder()
        encoder.outputFormatting = [.sortedKeys]
        let payloadJSON = (try? encoder.encode(payload))
            .flatMap { String(data: $0, encoding: .utf8) } ?? ""

        let hashInput = [
            eventID, chainID, eventType, timestamp,
            prevHash, payloadJSON
        ].joined(separator: "|")

        let digest = SHA256.hash(data: Data(hashInput.utf8))
        eventHash = digest.map { String(format: "%02x", $0) }.joined()
    }
}

// MARK: - CPP Chain Manager

class CPPChainManager {
    private var chains: [String: [CPPEvent]] = [:]

    func captureEvent(chainID: String,
                      imageData: Data,
                      depthResult: CPPEvent.DepthResult?,
                      context: CPPEvent.CaptureContext) -> CPPEvent {

        let chain = chains[chainID] ?? []
        let prevHash = chain.last?.eventHash ?? String(repeating: "0", count: 64)

        // Step 1: Hash the raw image FIRST
        let mediaHash = SHA256.hash(data: imageData)
            .map { String(format: "%02x", $0) }.joined()

        // Step 2: Build event
        var event = CPPEvent(
            eventID: UUID().uuidString, // Foundation emits UUID v4; CPP specifies time-ordered UUID v7
            chainID: chainID,
            eventType: "INGEST",
            timestamp: ISO8601DateFormatter().string(from: Date()),
            prevHash: prevHash,
            payload: CPPEvent.CPPPayload(
                mediaHash: mediaHash,
                captureContext: context,
                depthAnalysis: depthResult
            )
        )

        // Step 3: Compute event hash
        event.computeHash()

        // Step 4: Sign with Secure Enclave (simplified)
        // In production: use HardwareKeyManager from earlier
        event.signature = "SE_SIGNATURE_PLACEHOLDER"

        // Step 5: Append to chain
        chains[chainID, default: []].append(event)

        return event
    }

    /// XOR Completeness Invariant
    func computeCompleteness(chainID: String) -> (count: Int, xorChecksum: String) {
        let chain = chains[chainID] ?? []
        var xorResult = [UInt8](repeating: 0, count: 32) // 256-bit zero

        for event in chain {
            // Convert the hex digest back to bytes (Foundation has no Data(hex:) initializer)
            let hashBytes = stride(from: 0, to: event.eventHash.count, by: 2)
                .compactMap { offset -> UInt8? in
                    let start = event.eventHash.index(event.eventHash.startIndex, offsetBy: offset)
                    let end = event.eventHash.index(start, offsetBy: 2)
                    return UInt8(event.eventHash[start..<end], radix: 16)
                }
            guard hashBytes.count == 32 else { continue }

            for i in 0..<32 {
                xorResult[i] ^= hashBytes[i]
            }
        }

        return (chain.count, xorResult.map { String(format: "%02x", $0) }.joined())
    }

Implementation: Android (Kotlin) {#implementation-android}

import java.security.KeyPairGenerator
import java.security.KeyStore
import java.security.MessageDigest
import java.security.Signature
import android.security.keystore.KeyGenParameterSpec
import android.security.keystore.KeyProperties
import java.time.Instant
import java.util.UUID

// CPP Event data class
data class CPPEvent(
    val eventID: String = UUID.randomUUID().toString(),
    val chainID: String,
    val eventType: String,
    val timestamp: String = Instant.now().toString(),
    val prevHash: String,
    val mediaHash: String? = null,
    val depthVerdict: String? = null,
    val depthConfidence: Float? = null,
    var eventHash: String = "",
    var signature: ByteArray = byteArrayOf()
) {
    fun computeHash(): String {
        // Every evidentiary field must be in the hash input; anything left
        // outside it (e.g. the depth verdict) could be altered undetected
        val input = "$eventID|$chainID|$eventType|$timestamp|$prevHash|$mediaHash|$depthVerdict|$depthConfidence"
        val digest = MessageDigest.getInstance("SHA-256")
        eventHash = digest.digest(input.toByteArray())
            .joinToString("") { "%02x".format(it) }
        return eventHash
    }
}

// Hardware-backed key management (StrongBox)
class AndroidKeyManager {
    private val keyAlias = "org.veritaschain.verasnap.signing"

    fun generateKey() {
        val keyStore = KeyStore.getInstance("AndroidKeyStore")
        keyStore.load(null)

        if (!keyStore.containsAlias(keyAlias)) {
            // The EC algorithm is chosen via KeyPairGenerator.getInstance below;
            // KeyGenParameterSpec.Builder has no setAlgorithms method
            val spec = KeyGenParameterSpec.Builder(
                keyAlias,
                KeyProperties.PURPOSE_SIGN or KeyProperties.PURPOSE_VERIFY
            )
                .setDigests(KeyProperties.DIGEST_SHA256)
                .setUserAuthenticationRequired(true)
                .setIsStrongBoxBacked(true) // Dedicated HSM (API 28+); throws StrongBoxUnavailableException without one
                .build()

            val keyPairGenerator = KeyPairGenerator.getInstance(
                KeyProperties.KEY_ALGORITHM_EC, "AndroidKeyStore"
            )
            keyPairGenerator.initialize(spec)
            keyPairGenerator.generateKeyPair()
        }
    }

    fun sign(data: ByteArray): ByteArray {
        val keyStore = KeyStore.getInstance("AndroidKeyStore")
        keyStore.load(null)
        val privateKey = keyStore.getKey(keyAlias, null)

        val signature = Signature.getInstance("SHA256withECDSA")
        signature.initSign(privateKey as java.security.PrivateKey)
        signature.update(data)
        return signature.sign()
    }
}

// Chain manager with Completeness Invariant
class CPPChainManager {
    private val chains = mutableMapOf<String, MutableList<CPPEvent>>()

    fun captureEvent(
        chainID: String,
        imageBytes: ByteArray,
        depthVerdict: String? = null,
        depthConfidence: Float? = null
    ): CPPEvent {
        val chain = chains.getOrPut(chainID) { mutableListOf() }
        val prevHash = chain.lastOrNull()?.eventHash ?: "0".repeat(64)

        // Hash raw image data FIRST
        val mediaHash = MessageDigest.getInstance("SHA-256")
            .digest(imageBytes)
            .joinToString("") { "%02x".format(it) }

        val event = CPPEvent(
            chainID = chainID,
            eventType = "INGEST",
            prevHash = prevHash,
            mediaHash = mediaHash,
            depthVerdict = depthVerdict,
            depthConfidence = depthConfidence
        )
        event.computeHash()

        chain.add(event)
        return event
    }

    fun verifyChain(chainID: String): Boolean {
        val chain = chains[chainID] ?: return false

        for (i in chain.indices) {
            // Verify hash
            val recomputed = chain[i].copy(eventHash = "", signature = byteArrayOf())
            if (recomputed.computeHash() != chain[i].eventHash) {
                return false
            }
            // Verify linkage
            if (i == 0) {
                if (chain[i].prevHash != "0".repeat(64)) return false
            } else {
                if (chain[i].prevHash != chain[i - 1].eventHash) return false
            }
        }
        return true
    }

    fun xorChecksum(chainID: String): String {
        val chain = chains[chainID] ?: return "0".repeat(64)
        val result = ByteArray(32)

        for (event in chain) {
            val hashBytes = event.eventHash.chunked(2)
                .map { it.toInt(16).toByte() }.toByteArray()
            for (i in result.indices) {
                result[i] = (result[i].toInt() xor hashBytes[i].toInt()).toByte()
            }
        }

        return result.joinToString("") { "%02x".format(it) }
    }
}
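Why does an XOR of event hashes catch deletions? Because removing any event changes the accumulated value, and the anchored checksum no longer reproduces. A standalone sketch of the property (event names here are illustrative; in CPP the checksum is paired with `event_count`, since XOR alone would cancel out a deleted duplicate pair):

```kotlin
import java.security.MessageDigest

// XOR-fold a list of 32-byte hashes into a single 256-bit checksum
fun xorOf(hashes: List<ByteArray>): ByteArray {
    val acc = ByteArray(32)
    for (h in hashes) for (i in acc.indices) {
        acc[i] = (acc[i].toInt() xor h[i].toInt()).toByte()
    }
    return acc
}

val sha = MessageDigest.getInstance("SHA-256")
val events = listOf("ingest-1", "ingest-2", "export-1")
    .map { sha.digest(it.toByteArray()) }

val fullChecksum = xorOf(events)
val afterDeletion = xorOf(events.filterIndexed { i, _ -> i != 1 }) // event 2 silently dropped

// A verifier holding the anchored checksum detects the gap
check(!fullChecksum.contentEquals(afterDeletion))
```

Verification is O(n) in the number of events and needs no access to the media itself, only the event hashes.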

What VeraSnap Proves and What It Doesn't {#proves-and-doesnt}

Being precise about boundaries is critical:

┌──────────────────────────┬───────────────────────────┐
│     CPP / VeraSnap       │     CPP / VeraSnap        │
│     PROVES:              │     DOES NOT PROVE:       │
├──────────────────────────┼───────────────────────────┤
│ ✅ Capture timing        │ ❌ Content truthfulness   │
│    (TSA-certified)       │    (scene could be staged)│
│                          │                           │
│ ✅ Device identity       │ ❌ Photographer identity  │
│    (HSM-backed)          │    (device ≠ person)      │
│                          │                           │
│ ✅ No deleted evidence   │ ❌ Scene authenticity     │
│    (Completeness Inv.)   │    (real but misleading)  │
│                          │                           │
│ ✅ 3D scene structure    │ ❌ Context or intent      │
│    (Depth Analysis)      │    (why photo was taken)  │
│                          │                           │
│ ✅ Chain of custody      │ ❌ Legal validity         │
│    (hash chain)          │    (jurisdiction-specific)│
└──────────────────────────┴───────────────────────────┘

A staged crime scene photographed with VeraSnap is still a staged crime scene — but with cryptographic proof it was staged at a specific time and place. VeraSnap provides provenance, not truth. The distinction is legally important.

As the CPP specification states: "Provenance ≠ Truth. But provenance is where trust begins."


Developer Takeaways {#developer-takeaways}

  1. Stop trying to detect fakes. Start proving authenticity. Microsoft's own research shows detection has a ceiling. Provenance has a mathematical floor.

  2. The VAP framework shares infrastructure across domains. Merkle tree builders, TSA clients, JSON canonicalizers, hash chain verifiers — built once, used in VeraSnap (capture), CAP-SRP (AI refusal), and VCP (financial audit).

  3. Hardware security is non-negotiable for capture provenance. If the signing key can be extracted, the provenance can be forged. Secure Enclave / StrongBox solves this.

  4. External timestamps are the single most important upgrade you can add to any evidence system. RFC 3161 is well-supported, easy to integrate, and eliminates self-attestation vulnerabilities.

  5. The analog hole is real. Screen recapture detection isn't perfect, but multi-modal analysis (LiDAR + moiré + flicker + IMU) makes it significantly harder to defeat than any single sensor.

  6. C2PA interoperability matters. Building a closed ecosystem is a dead end. VeraSnap's C2PA manifest export means CPP-credentialed images flow through the existing Content Credentials infrastructure.
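Takeaway 4 is lighter-weight than it sounds. An RFC 3161 request contains little more than a version, a digest, and a flag. A hand-encoded DER sketch for a SHA-256 imprint (production code would normally use a TSP library such as BouncyCastle rather than encoding DER by hand):

```kotlin
import java.security.MessageDigest

// Sketch: build a minimal RFC 3161 TimeStampReq (DER-encoded) for a SHA-256 digest.
// TimeStampReq ::= SEQUENCE { version INTEGER, messageImprint SEQUENCE, certReq BOOLEAN }
fun buildTimeStampRequest(data: ByteArray): ByteArray {
    val digest = MessageDigest.getInstance("SHA-256").digest(data)

    // AlgorithmIdentifier for SHA-256: OID 2.16.840.1.101.3.4.2.1 + NULL params
    val sha256Oid = byteArrayOf(
        0x06, 0x09, 0x60, 0x86.toByte(), 0x48, 0x01, 0x65, 0x03, 0x04, 0x02, 0x01
    )
    val algId = byteArrayOf(0x30, 0x0D) + sha256Oid + byteArrayOf(0x05, 0x00)

    val hashedMessage = byteArrayOf(0x04, 0x20) + digest        // OCTET STRING (32 bytes)
    val imprintLen = (algId.size + hashedMessage.size).toByte()
    val messageImprint = byteArrayOf(0x30, imprintLen) + algId + hashedMessage

    val version = byteArrayOf(0x02, 0x01, 0x01)                 // INTEGER 1
    val certReq = byteArrayOf(0x01, 0x01, 0xFF.toByte())        // BOOLEAN TRUE: return TSA cert

    val body = version + messageImprint + certReq
    return byteArrayOf(0x30, body.size.toByte()) + body         // outer SEQUENCE
}
```

POST those bytes to the TSA with `Content-Type: application/timestamp-query`; the `application/timestamp-reply` response carries the signed TimeStampToken that the event chain stores as its external timestamp proof.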


Availability

VeraSnap is available now:

All core features — cryptographic evidence capture, depth analysis, screen detection, Completeness Invariant — are free.

14 languages supported: English, Japanese, Korean, Spanish, French, German, Portuguese, Simplified Chinese, Traditional Chinese, Arabic, Hindi, Indonesian, Italian, Russian.


Verify, don't trust.

This article is published under CC BY 4.0. Technical specifications are maintained by the VeritasChain Standards Organization (VSO). Based in Tokyo.
