Kalyan Tamarapalli

Posted on • Originally published at ktamarapalli.hashnode.dev

Merkle Manifests: Why Build Servers Lie (How to Cryptographically Prove It)

 Verifying CI/CD Artifacts Against Human-Signed Source Trees


Introduction: The Build Server Is Not a Source of Truth

Most CI/CD security models assume the build server is honest.

This is a dangerous assumption.

The SolarWinds supply-chain attack demonstrated that a build system can compile malicious code, sign it with legitimate keys, and distribute it as a trusted update — all while appearing compliant with every security control in the pipeline.

From the pipeline’s perspective:

  • The code was signed
  • The artifact passed integrity checks
  • The deployment followed policy

And yet the artifact was malicious.

This reveals a structural flaw:

If the same system that produces artifacts also attests to their integrity, integrity becomes meaningless.

This article introduces Merkle Manifests — a cryptographic pattern that breaks this trust loop by verifying build outputs against a human-signed source of truth, not against the build system’s claims.


Why “Signed by the Server” Is Not Security

Digital signatures answer one question:

Was this artifact signed by this key?

They do not answer:

Was this artifact derived from the intended source code?

If the build server is compromised, it can sign malicious artifacts with legitimate keys.

Cryptography works.

Trust fails.

This is why “code signed by vendor” failed in the SolarWinds compromise.

The failure mode was not cryptographic.

It was epistemological.

The system trusted the signer without verifying the provenance of what was signed.
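
To make the gap concrete, here is a minimal sketch. It assumes HMAC-SHA256 as a stand-in for the server's signing key (a real pipeline would use an asymmetric signature), and the key and file contents are hypothetical. The check passes for a tampered artifact just as it does for the intended one.

```python
import hashlib
import hmac

# Assumption: HMAC-SHA256 stands in for the build server's signing key.
# A real pipeline would use an asymmetric scheme; the trust problem is the same.
SERVER_KEY = b"hypothetical-build-server-key"

def sign(artifact: bytes) -> bytes:
    """The server signs whatever bytes it is handed."""
    return hmac.new(SERVER_KEY, artifact, hashlib.sha256).digest()

def verify(artifact: bytes, signature: bytes) -> bool:
    """Answers only one question: was this artifact signed by this key?"""
    return hmac.compare_digest(sign(artifact), signature)

intended = b"print('hello')\n"
tampered = intended + b"import backdoor\n"  # injected during the build

# Both artifacts verify, because the check never looks at the source.
intended_ok = verify(intended, sign(intended))
tampered_ok = verify(tampered, sign(tampered))
```

Nothing in `verify` can tell the two artifacts apart. The signature is valid in both cases because the compromised server produced both.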


The Provenance Gap in CI/CD

Modern supply-chain frameworks focus on:

  • Artifact signing
  • Provenance metadata
  • Build attestations

But many of these attestations originate from the same environment that performs the build.

This creates a closed trust loop:

Build system
   ↓
Produces artifact
   ↓
Attests to integrity
   ↓
Pipeline trusts the attestation

Once the build environment is compromised, this loop collapses.

Provenance that originates inside a compromised trust domain cannot establish truth.


Human-Signed Source as the Root of Truth

Merkle Manifests shift the root of trust from the build server to the human developer.

Before code enters the pipeline:

  1. The developer computes a Merkle root hash of the entire source tree.
  2. This root hash is signed using a hardware-backed key.
  3. The signature represents conscious human intent over a specific source state.

This signed root becomes the source of truth.

The build server is no longer trusted to assert correctness.

Instead, it must prove fidelity to the human-signed source state.
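
One way to sketch the signing step is shown below. The manifest fields, the `signer` callback, and the demo key are all assumptions for illustration. In practice the callback would hand the bytes to a hardware token, and HMAC is only a placeholder so the sketch runs with the standard library.

```python
import hashlib
import hmac
import json
from typing import Callable

def build_manifest(root_hex: str, file_count: int) -> bytes:
    """Serialize the statement the developer signs, in a canonical form."""
    manifest = {"algorithm": "sha256", "file_count": file_count, "merkle_root": root_hex}
    # sort_keys and fixed separators keep the bytes identical across machines.
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()

def sign_manifest(manifest: bytes, signer: Callable[[bytes], bytes]) -> dict:
    """Attach a signature produced by the developer's key to the manifest."""
    return {"manifest": manifest.decode(), "signature": signer(manifest).hex()}

def demo_signer(payload: bytes) -> bytes:
    # Placeholder only: a real signer would call out to a hardware-backed key.
    return hmac.new(b"hypothetical-developer-key", payload, hashlib.sha256).digest()

# Placeholder root; a real one comes from hashing the source tree.
root_hex = hashlib.sha256(b"placeholder for a real Merkle root").hexdigest()
signed = sign_manifest(build_manifest(root_hex, file_count=4), demo_signer)
```

The canonical serialization matters: the verifier must be able to rebuild exactly the bytes that were signed.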


How Merkle Manifests Work

A Merkle tree is constructed from the repository:

            Root Hash
           /        \
       Hash A      Hash B
       /   \       /   \
    File1 File2 File3 File4

Structure:

  • Leaf nodes: hash of each source file
  • Branch nodes: hash of child hashes
  • Root hash: cryptographic fingerprint of the entire repository

Key properties:

  • Any single file modification changes the root hash
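  • Recomputing the root takes one pass over the files, and a single file can be checked against the root with a logarithmic number of hashes
  • The root commits to the full source state: producing a different tree with the same root would require breaking the hash function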

The root hash is what the human signs.
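
A minimal sketch of the construction follows. Several choices here are assumptions rather than part of the pattern itself: SHA-256 as the hash, the file path bound into each leaf, leaves sorted by path, one-byte prefixes to separate leaves from branches, and an odd node promoted unchanged to the next level.

```python
import hashlib

def leaf_hash(path: str, content: bytes) -> bytes:
    # 0x00 marks a leaf; the path is bound in so a rename changes the root.
    return hashlib.sha256(b"\x00" + path.encode() + b"\x00" + content).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    # 0x01 marks a branch, so a leaf can never be passed off as a branch.
    return hashlib.sha256(b"\x01" + left + right).digest()

def merkle_root(files: dict[str, bytes]) -> bytes:
    # Sort by path so the root does not depend on directory listing order.
    level = [leaf_hash(path, files[path]) for path in sorted(files)]
    if not level:
        return hashlib.sha256(b"").digest()
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(node_hash(level[i], level[i + 1]))
            else:
                nxt.append(level[i])  # odd node is promoted unchanged
        level = nxt
    return level[0]

tree = {"File1": b"a", "File2": b"b", "File3": b"c", "File4": b"d"}
root = merkle_root(tree)

# Changing a single file produces a different root.
tampered = dict(tree, File3=b"c plus an injected line")
tampered_root = merkle_root(tampered)
```

With four files this builds the same shape as the diagram: two branch hashes over two leaves each, and one root over the branches.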


Verification Flow at Deployment Time

During deployment approval:

  1. Fetch the build artifact
  2. Fetch the signed Merkle root
  3. Reconstruct the Merkle root from artifact contents
  4. Compare:
MerkleRoot(Artifact_Contents) ?= Signed_Merkle_Root

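If the two values differ:

  • The artifact contents no longer match the human-signed source state, so something after signing (the build server or a pipeline step) changed them
  • The deployment is blocked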

This detects:

  • Build-time injection
  • Compiler replacement
  • Artifact tampering
  • Malicious pipeline steps

The build server can no longer lie about what it built.
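
The comparison step can be sketched as below. It assumes the signature over the root has already been checked against the developer's public key, and that the artifact has been unpacked into a path-to-bytes mapping. The function names and the leaf encoding are illustrative.

```python
import hashlib
import hmac

def merkle_root(files: dict[str, bytes]) -> bytes:
    """Root over (path, content) leaves sorted by path; odd nodes are promoted."""
    level = [hashlib.sha256(b"\x00" + path.encode() + b"\x00" + files[path]).digest()
             for path in sorted(files)] or [hashlib.sha256(b"").digest()]
    while len(level) > 1:
        pairs = [level[i:i + 2] for i in range(0, len(level), 2)]
        level = [hashlib.sha256(b"\x01" + pair[0] + pair[1]).digest()
                 if len(pair) == 2 else pair[0] for pair in pairs]
    return level[0]

def approve_deployment(artifact_files: dict[str, bytes], signed_root: bytes) -> bool:
    """Rebuild the root from the artifact contents and compare it to the signed root."""
    recomputed = merkle_root(artifact_files)
    # Constant-time comparison; a mismatch means the deployment is blocked.
    return hmac.compare_digest(recomputed, signed_root)

source = {"app.py": b"print('hi')\n", "lib/util.py": b"X = 1\n"}
signed_root = merkle_root(source)  # what the developer signed before the build

honest_artifact = dict(source)
tampered_artifact = {**source, "lib/util.py": b"X = 1\nimport backdoor\n"}

honest_ok = approve_deployment(honest_artifact, signed_root)
tampered_ok = approve_deployment(tampered_artifact, signed_root)
```

The verifier never asks the build server what it built. It recomputes the answer from the artifact and checks it against what the human signed.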


Why Hashing Archives Is Not Enough

A naive approach is hashing packaged artifacts such as tar.gz.

This is brittle because:

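  • Archive metadata (owners, permissions, modification times) differs between builds
  • Compression settings and tool versions change the compressed bytes
  • File ordering inside the archive is not guaranteed
  • Embedded timestamps change the hash on every build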

Merkle trees solve this problem.

Advantages:

  • Each file hashed independently
  • Directory structure encoded explicitly
  • Verification stable across packaging differences

Merkle Manifests verify semantic integrity, not packaging artifacts.
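
The brittleness is easy to reproduce. The sketch below builds two tar archives in memory from identical files, differing only in the recorded modification time. The file names and timestamps are made up for the demonstration.

```python
import hashlib
import io
import tarfile

def make_tar(files: dict[str, bytes], mtime: int) -> bytes:
    """Pack files into an uncompressed tar with a fixed modification time."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def per_file_digests(archive: bytes) -> dict[str, str]:
    """Hash each regular file's content, ignoring archive metadata."""
    digests = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        for member in tar.getmembers():
            if member.isfile():
                content = tar.extractfile(member).read()
                digests[member.name] = hashlib.sha256(content).hexdigest()
    return digests

files = {"app.py": b"print('hi')\n", "lib/util.py": b"X = 1\n"}
first = make_tar(files, mtime=1_700_000_000)
second = make_tar(files, mtime=1_700_086_400)  # same content, one day later

archive_hashes_match = hashlib.sha256(first).digest() == hashlib.sha256(second).digest()
file_hashes_match = per_file_digests(first) == per_file_digests(second)
```

The archive hashes disagree even though no file changed, while the per-file digests are identical. A Merkle tree over paths and contents inherits the second behavior.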


Threat Model Coverage

Merkle Manifests detect:

  • Source replacement during build
  • Compiler-injected backdoors
  • Script-based artifact tampering
  • Supply-chain attacks targeting build steps

They do not detect:

  • Malicious source intentionally committed
  • Logic bombs authored by insiders

This limitation is expected.

Cryptography cannot solve insider malice.

Merkle Manifests narrow the attack surface to human-originated actions.


Performance & Practicality

Merkle hashing thousands of files is computationally cheap relative to build times.

Typical overhead:

Operation                Cost
Hash computation         milliseconds to seconds
Verification             seconds
Typical CI build time    minutes

The overhead is negligible next to a build that already takes minutes.
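
A rough way to check this on your own hardware is sketched below. It hashes a synthetic in-memory tree (5,000 files of 4 KiB each, sizes chosen arbitrarily) and folds the leaves into a root. It leaves out disk reads, which a real repository scan would add, so treat the number it prints as a lower bound.

```python
import hashlib
import time

# Synthetic stand-in for a repository: 5,000 files of 4 KiB each, held in memory.
files = {f"src/module_{i}.py": bytes([i % 256]) * 4096 for i in range(5000)}

start = time.perf_counter()

# Leaf pass: one SHA-256 per file over its path and content.
level = [hashlib.sha256(b"\x00" + path.encode() + b"\x00" + data).digest()
         for path, data in sorted(files.items())]
leaf_count = len(level)

# Fold pass: pair up hashes until a single root remains.
while len(level) > 1:
    pairs = [level[i:i + 2] for i in range(0, len(level), 2)]
    level = [hashlib.sha256(b"\x01" + b"".join(pair)).digest()
             if len(pair) == 2 else pair[0] for pair in pairs]

elapsed = time.perf_counter() - start
print(f"hashed {leaf_count} files in {elapsed * 1000:.1f} ms")
```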


Conclusion: Break the Server Trust Loop

Build servers are infrastructure.

Infrastructure is compromisable.

If your security model trusts infrastructure to attest to its own integrity, you are effectively trusting the attacker once compromise occurs.

Merkle Manifests break this loop by anchoring truth in human-signed source states, not server assertions.

Integrity should be proven, not claimed.
