Kalyan Tamarapalli

Posted on Mar 6 • Originally published at ktamarapalli.hashnode.dev

Merkle Manifests: Why Build Servers Lie (How to Cryptographically Prove It)

#security #cybersecurity #devsecops #supplychain

Verifying CI/CD Artifacts Against Human-Signed Source Trees

Introduction: The Build Server Is Not a Source of Truth

Most CI/CD security models assume the build server is honest.

This is a dangerous assumption.

The SolarWinds supply-chain attack demonstrated that a build system can compile malicious code, sign it with legitimate keys, and distribute it as a trusted update — all while appearing compliant with every security control in the pipeline.

From the pipeline’s perspective:

The code was signed
The artifact passed integrity checks
The deployment followed policy

And yet the artifact was malicious.

This reveals a structural flaw:

If the same system that produces artifacts also attests to their integrity, that attestation proves nothing once the system is compromised.

This article introduces Merkle Manifests — a cryptographic pattern that breaks this trust loop by verifying build outputs against a human-signed source of truth, not against the build system’s claims.

Why “Signed by the Server” Is Not Security

Digital signatures answer one question:

Was this artifact signed by this key?

They do not answer:

Was this artifact derived from the intended source code?

If the build server is compromised, it can sign malicious artifacts with legitimate keys.

Cryptography works.

Trust fails.

This is why “code signed by vendor” failed in the SolarWinds compromise.

The failure mode was not cryptographic.

It was epistemological.

The system trusted the signer without verifying the provenance of what was signed.

The Provenance Gap in CI/CD

Modern supply-chain frameworks focus on:

Artifact signing
Provenance metadata
Build attestations

But many of these attestations originate from the same environment that performs the build.

This creates a closed trust loop:

Build system
   ↓
Produces artifact
   ↓
Attests to integrity
   ↓
Pipeline trusts the attestation

Once the build environment is compromised, this loop collapses.

Provenance that originates inside a compromised trust domain cannot establish truth.

Human-Signed Source as the Root of Truth

Merkle Manifests shift the root of trust from the build server to the human developer.

Before code enters the pipeline:

The developer computes a Merkle root hash of the entire source tree.
This root hash is signed using a hardware-backed key.
The signature represents conscious human intent over a specific source state.

This signed root becomes the source of truth.

The build server is no longer trusted to assert correctness.

Instead, it must prove fidelity to the human-signed source state.
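As a rough illustration of what the developer signs, the sketch below wraps a root hash in a small manifest and serializes it to a byte-stable form before signing. The field names (`version`, `hash_algorithm`, `merkle_root`, `signer`), the `manifest_payload` and `sign_manifest` helpers, and the `demo_sign` placeholder are all assumptions made for this example. In a real setup the `sign` callable would hand the payload to a hardware-backed key, and the root would come from the source tree rather than from a dummy value.

```python
import hashlib
import json
from typing import Callable


def manifest_payload(merkle_root_hex: str, signer: str) -> bytes:
    """Serialize the manifest canonically so the signed bytes are reproducible."""
    manifest = {
        "version": 1,                  # hypothetical schema version
        "hash_algorithm": "sha256",
        "merkle_root": merkle_root_hex,
        "signer": signer,
    }
    # Sorted keys and fixed separators give the same bytes on every machine.
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_manifest(payload: bytes, sign: Callable[[bytes], bytes]) -> dict[str, str]:
    """Bundle the payload with a signature produced by the supplied signer."""
    return {"payload": payload.decode("utf-8"), "signature": sign(payload).hex()}


def demo_sign(payload: bytes) -> bytes:
    """Placeholder only: NOT a real signature. A hardware token call would go here."""
    return hashlib.sha256(b"demo-only-not-a-key" + payload).digest()


# Dummy value standing in for the Merkle root of a real source tree.
root = hashlib.sha256(b"example source state").hexdigest()
signed = sign_manifest(manifest_payload(root, "dev@example.com"), demo_sign)
print(signed["payload"])
```

Signing a canonical serialization rather than a loosely formatted file means the verifier can rebuild the exact bytes that were signed.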

How Merkle Manifests Work

A Merkle tree is constructed from the repository:

            Root Hash
           /        \
       Hash A      Hash B
       /   \       /   \
    File1 File2 File3 File4

Structure:

Leaf nodes: hash of each source file
Branch nodes: hash of the concatenated child hashes
Root hash: cryptographic fingerprint of the entire repository

Key properties:

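Any single file modification changes the root hash
Recomputing the root costs one hash per file plus one per branch node, so it scales linearly with the size of the tree
Assuming a collision-resistant hash function, the root uniquely identifies the full source state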

The root hash is what the human signs.
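The construction above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the choice of SHA-256, the `0x00`/`0x01` prefixes that separate leaves from branches, the binding of each file path into its leaf, and the rule that an unpaired node moves up unchanged are all assumptions made for this example. The source tree is modeled as an in-memory mapping from relative path to file bytes.

```python
import hashlib


def leaf_hash(path: str, content: bytes) -> bytes:
    """Leaf node: hash of one file, bound to its path."""
    # The 0x00 prefix marks a leaf; the NUL after the path separates path from content.
    return hashlib.sha256(b"\x00" + path.encode("utf-8") + b"\x00" + content).digest()


def branch_hash(left: bytes, right: bytes) -> bytes:
    """Branch node: hash of the two child hashes, marked with a 0x01 prefix."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(files: dict[str, bytes]) -> str:
    """Return the hex Merkle root of a {relative_path: content} mapping."""
    if not files:
        raise ValueError("cannot compute a root for an empty tree")
    # Sort by path so the result does not depend on traversal order.
    level = [leaf_hash(path, files[path]) for path in sorted(files)]
    while len(level) > 1:
        paired = [branch_hash(level[i], level[i + 1])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            paired.append(level[-1])  # an unpaired node moves up unchanged
        level = paired
    return level[0].hex()


tree = {
    "README.md": b"# demo\n",
    "src/main.py": b"print('hi')\n",
    "src/util.py": b"x = 1\n",
}
root = merkle_root(tree)
tampered = {**tree, "src/util.py": b"x = 2\n"}
print(root)
print(merkle_root(tampered) != root)  # True: one changed file changes the root
```

Binding the path into each leaf means that renaming or moving a file changes the root as well, not only editing its contents.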

Verification Flow at Deployment Time

During deployment approval:

Fetch the build artifact
Fetch the signed Merkle root and verify its signature against the developer's public key
Reconstruct the Merkle root from artifact contents
Compare:

Merkle_Root(Artifact_Contents) ?= Signed_Merkle_Root

If the hashes differ:

The artifact contents no longer match the signed source state, so something in the pipeline modified them
The deployment is blocked

This detects:

Build-time injection
Compiler replacement
Artifact tampering
Malicious pipeline steps

The build server can no longer lie about what it built.
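The comparison step might look like the sketch below. It rests on two assumptions that are stated here rather than enforced: the artifact exposes the files covered by the signed root (for example, it ships the source tree, or the build is reproducible), and the signature over `signed_root` has already been checked elsewhere. The `approve_deployment` name and the compact `merkle_root` helper are hypothetical and exist only to make the example self-contained.

```python
import hashlib
import hmac


def merkle_root(files: dict[str, bytes]) -> str:
    """Compact Merkle root over sorted (path, content) leaves, SHA-256 assumed."""
    level = [hashlib.sha256(b"\x00" + p.encode("utf-8") + b"\x00" + files[p]).digest()
             for p in sorted(files)]
    if not level:
        raise ValueError("empty tree")
    while len(level) > 1:
        nxt = [hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # an unpaired node moves up unchanged
        level = nxt
    return level[0].hex()


def approve_deployment(artifact_files: dict[str, bytes], signed_root: str) -> bool:
    """Allow deployment only if the recomputed root equals the signed root."""
    recomputed = merkle_root(artifact_files)
    # Constant-time comparison; both values are ASCII hex strings.
    return hmac.compare_digest(recomputed, signed_root.lower())


source = {"app/main.py": b"print('ok')\n", "app/conf.py": b"DEBUG = False\n"}
signed_root = merkle_root(source)  # in practice: read from the verified manifest

injected = {**source, "app/main.py": b"print('ok')\nimport backdoor\n"}
extra = {**source, "app/implant.py": b"pass\n"}

print(approve_deployment(source, signed_root))    # True
print(approve_deployment(injected, signed_root))  # False: a file was modified
print(approve_deployment(extra, signed_root))     # False: a file was added
```

Because added files change the leaf set, the check rejects injected files as well as edited ones.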

Why Hashing Archives Is Not Enough

A naive approach is to hash the packaged artifact itself, such as a tar.gz archive.

This is brittle because:

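Archive metadata (owners, permissions, header fields) varies between builds
Compression settings and tool versions change the compressed bytes
File ordering inside the archive is not guaranteed
Embedded timestamps differ on every build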

Merkle trees solve this problem.

Advantages:

Each file hashed independently
Directory structure encoded explicitly
Verification stable across packaging differences

Merkle Manifests verify the content and layout of the files themselves, not the byte-level details of how they were packaged.
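The brittleness is easy to reproduce. The sketch below packs the same two files into two tar.gz archives that differ only in member order and timestamp, then compares a whole-archive hash with per-file content hashes. The `make_targz` and `content_digests` helpers are names invented for this example, and per-file SHA-256 digests stand in for the leaves of a full Merkle tree.

```python
import hashlib
import io
import tarfile


def make_targz(files: dict[str, bytes], mtime: int) -> bytes:
    """Pack files into an in-memory tar.gz, stamping every member with mtime."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def content_digests(archive: bytes) -> dict[str, str]:
    """Hash each regular file's content, ignoring order and archive metadata."""
    digests = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile():
                digests[member.name] = hashlib.sha256(
                    tar.extractfile(member).read()).hexdigest()
    return digests


files = {"a.txt": b"alpha\n", "b.txt": b"beta\n"}
first = make_targz(files, mtime=1_700_000_000)
# Same contents, but reversed member order and a later timestamp.
second = make_targz(dict(reversed(list(files.items()))), mtime=1_700_000_999)

print(hashlib.sha256(first).hexdigest() == hashlib.sha256(second).hexdigest())  # False
print(content_digests(first) == content_digests(second))                        # True
```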

Threat Model Coverage

Merkle Manifests detect:

Source replacement during build
Compiler-injected backdoors
Script-based artifact tampering
Supply-chain attacks targeting build steps

They do not detect:

Malicious source intentionally committed
Logic bombs authored by insiders

This limitation is expected.

Cryptography cannot solve insider malice.

Merkle Manifests narrow the attack surface to human-originated actions.

Performance & Practicality

Merkle hashing thousands of files is computationally cheap relative to build times.

Typical overhead:

Operation                Cost
Hash computation         milliseconds → seconds
Verification             seconds
Typical CI build time    minutes

The overhead is negligible next to the build itself, and a small price for the guarantee it provides.
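A quick way to check the hashing cost on your own hardware is a micro-benchmark like the one below. It is only a rough sketch: the 5,000 synthetic 4 KiB files are an arbitrary assumption, everything is held in memory so disk I/O is excluded, and the measured time will vary by machine.

```python
import hashlib
import os
import time


def hash_files(files: dict[str, bytes]) -> dict[str, str]:
    """Compute a SHA-256 digest for every file in the mapping."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


# Synthetic stand-in for a source tree: 5,000 files of 4 KiB each, in memory.
files = {f"src/file_{i}.txt": os.urandom(4096) for i in range(5000)}

start = time.perf_counter()
digests = hash_files(files)
elapsed = time.perf_counter() - start

print(f"hashed {len(digests)} files in {elapsed * 1000:.1f} ms")
```

A real repository adds file-system reads on top of this, so treat the result as a lower bound on the hashing step.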

Conclusion: Break the Server Trust Loop

Build servers are infrastructure.

Infrastructure is compromisable.

If your security model trusts infrastructure to attest to its own integrity, you are effectively trusting the attacker once compromise occurs.

Merkle Manifests break this loop by anchoring truth in human-signed source states, not server assertions.

Integrity should be proven, not claimed.
