You get a file hash and a Merkle proof that claims it's anchored in some blockchain transaction. How do you verify that claim without trusting the service that gave you the proof?
I built ProofLedger to create these proofs, but the verification should work independently. Anyone can validate a Merkle proof with the right algorithm. Here's how to walk up the tree from leaf to root.
What's Actually in a Merkle Proof
A Merkle proof is a list of sibling hashes that lets you reconstruct the path from your file's hash up to the Merkle root. Each step combines your current hash with a sibling hash to produce the parent hash.
The proof tells you two things for each step:
- The sibling hash (as a hex string)
- Which side to put it on ("left" or "right")
# Example proof structure
proof = {
    "hash": "a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3",
    "merkle_path": [
        {
            "hash": "b5d4045c3f466fa91fe2cc6abe79232a1a57cdf104f7a26e716e0a1e2789df78",
            "position": "right"
        },
        {
            "hash": "c3d4045c3f466fa91fe2cc6abe79232a1a57cdf104f7a26e716e0a1e2789df79",
            "position": "left"
        }
    ]
}
That's it. No magic, no proprietary formats. Just a starting hash and a list of directions.
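Because the format is this small, it is cheap to sanity-check a proof before walking it. The sketch below assumes the field names from the example above ("hash", "merkle_path", "position"); the helper name check_proof_shape is hypothetical and not part of any ProofLedger API.

```python
import re

# A SHA-256 digest written as hex is exactly 64 hex characters
HEX64 = re.compile(r"[0-9a-fA-F]{64}")


def is_hex64(value):
    """True when value is a string of exactly 64 hex characters."""
    return isinstance(value, str) and HEX64.fullmatch(value) is not None


def check_proof_shape(proof):
    """Return a list of problems; an empty list means the proof looks well-formed."""
    problems = []
    if not is_hex64(proof.get("hash")):
        problems.append("hash must be a 64-character hex string")
    for i, step in enumerate(proof.get("merkle_path", [])):
        if not is_hex64(step.get("hash")):
            problems.append(f"step {i}: hash must be a 64-character hex string")
        if step.get("position") not in ("left", "right"):
            problems.append(f"step {i}: position must be 'left' or 'right'")
    return problems


example = {
    "hash": "a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3",
    "merkle_path": [
        {
            "hash": "b5d4045c3f466fa91fe2cc6abe79232a1a57cdf104f7a26e716e0a1e2789df78",
            "position": "right",
        }
    ],
}
print(check_proof_shape(example))  # []
```

This only checks the shape. It says nothing about whether the proof is valid, which still requires walking the path.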
The Hash Combination Algorithm
Each step up the tree combines two 64-character hex strings into one. The critical part is getting the concatenation order right.
import hashlib


def combine_hashes(left_hex, right_hex):
    """Combine two hex hashes into their parent hash."""
    # Concatenate the hex strings
    combined_hex = left_hex + right_hex

    # Convert to bytes, then hash
    combined_bytes = bytes.fromhex(combined_hex)
    parent_hash = hashlib.sha256(combined_bytes).hexdigest()
    return parent_hash


# Example
left = "a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3"
right = "b5d4045c3f466fa91fe2cc6abe79232a1a57cdf104f7a26e716e0a1e2789df78"
parent = combine_hashes(left, right)
print(f"Parent: {parent}")
The order matters. combine_hashes(A, B) produces a different result than combine_hashes(B, A). That's why the proof needs to specify position.
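To see the ordering effect directly, here is a small sketch that hashes the same pair both ways. The helper parent_of is a hypothetical stand-in that follows the same rule as the function above (SHA-256 over the raw bytes of left followed by right), and the two leaf values are made up for illustration.

```python
import hashlib


def parent_of(left_hex, right_hex):
    """SHA-256 over the raw bytes of left followed by right."""
    return hashlib.sha256(bytes.fromhex(left_hex) + bytes.fromhex(right_hex)).hexdigest()


# Two made-up leaves
a = hashlib.sha256(b"A").hexdigest()
b = hashlib.sha256(b"B").hexdigest()

ab = parent_of(a, b)
ba = parent_of(b, a)
print(ab != ba)  # True: swapping the sides changes the parent

# Concatenating the hex strings first gives the same bytes as
# concatenating the decoded bytes, so both styles agree
print(ab == hashlib.sha256(bytes.fromhex(a + b)).hexdigest())  # True
```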
Walking the Full Path
Now you can iterate through the proof steps. Start with your file hash. At each step, combine it with the sibling hash in the specified position.
def verify_merkle_proof(file_hash, merkle_path):
    """Walk up a Merkle tree and return the computed root."""
    current = file_hash.lower()  # Normalize to lowercase

    for step in merkle_path:
        sibling = step["hash"].lower()
        position = step["position"]

        if position == "left":
            # Sibling goes left, current goes right
            current = combine_hashes(sibling, current)
        elif position == "right":
            # Current goes left, sibling goes right
            current = combine_hashes(current, sibling)
        else:
            # Reject anything else instead of silently treating it as "right"
            raise ValueError(f"Unknown position: {position!r}")

        print(f"Step: {current}")

    return current
# Test with our example proof
file_hash = "a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3"
merkle_path = [
    {
        "hash": "b5d4045c3f466fa91fe2cc6abe79232a1a57cdf104f7a26e716e0a1e2789df78",
        "position": "right"
    },
    {
        "hash": "c3d4045c3f466fa91fe2cc6abe79232a1a57cdf104f7a26e716e0a1e2789df79",
        "position": "left"
    }
]

computed_root = verify_merkle_proof(file_hash, merkle_path)
print(f"Computed root: {computed_root}")
The function returns the computed root hash. The proof holds only if that value matches the root stored on-chain.
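If you want to test the walk without a real proof, you can build a tiny tree yourself and check that the path leads back to its root. This is only a sketch: build_path and walk are hypothetical helpers, the file names are invented, and it assumes a tree with a power-of-two number of leaves combined with the same left-plus-right SHA-256 rule described above. How a real service pads an odd number of leaves is not covered here.

```python
import hashlib


def combine(left_hex, right_hex):
    """Parent hash: SHA-256 over the bytes of left followed by right."""
    return hashlib.sha256(bytes.fromhex(left_hex + right_hex)).hexdigest()


def build_path(leaves, index):
    """Return (root, merkle_path) for leaves[index].

    Assumes len(leaves) is a power of two.
    """
    path = []
    level = list(leaves)
    while len(level) > 1:
        if index % 2 == 0:
            # We are the left child, so the sibling sits on the right
            path.append({"hash": level[index + 1], "position": "right"})
        else:
            # We are the right child, so the sibling sits on the left
            path.append({"hash": level[index - 1], "position": "left"})
        # Pair up neighbours to form the next level
        level = [combine(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], path


def walk(leaf, path):
    """Recompute the root from a leaf and its sibling path."""
    current = leaf
    for step in path:
        if step["position"] == "left":
            current = combine(step["hash"], current)
        else:
            current = combine(current, step["hash"])
    return current


names = ("a.pdf", "b.pdf", "c.pdf", "d.pdf")
leaves = [hashlib.sha256(name.encode()).hexdigest() for name in names]
root, path = build_path(leaves, 2)
print(len(path))                      # 2 steps for a 4-leaf tree
print(walk(leaves[2], path) == root)  # True
```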
Complete Verification Function
Here's a complete verifier that hashes the file, checks it against the hash claimed in the proof, and computes the Merkle root:
import hashlib
import json


def hash_file(filepath):
    """Generate SHA-256 hash of a file."""
    sha256_hash = hashlib.sha256()
    with open(filepath, "rb") as f:
        # Read in chunks so large files don't have to fit in memory
        for chunk in iter(lambda: f.read(4096), b""):
            sha256_hash.update(chunk)
    return sha256_hash.hexdigest()
def verify_proof(file_path, proof_data):
    """Verify a complete Merkle proof against a file."""
    # Hash the file
    file_hash = hash_file(file_path)

    # Extract proof components
    claimed_hash = proof_data["hash"]
    merkle_path = proof_data.get("merkle_path", [])

    # Verify file hash matches claim
    if file_hash != claimed_hash.lower():
        return {
            "verified": False,
            "error": f"File hash {file_hash} doesn't match claimed {claimed_hash}"
        }

    # If no merkle path, just verify the hash
    if not merkle_path:
        return {
            "verified": True,
            "file_hash": file_hash,
            "merkle_root": file_hash  # Root is just the file hash
        }

    # Compute the Merkle root
    computed_root = verify_merkle_proof(file_hash, merkle_path)

    # "verified" here means the file hash matches the proof. The caller still
    # has to compare merkle_root with the root read from the chain.
    return {
        "verified": True,
        "file_hash": file_hash,
        "merkle_root": computed_root,
        "path_length": len(merkle_path)
    }
# Usage
with open('proof.json', 'r') as f:
    proof = json.load(f)

result = verify_proof('document.pdf', proof)
print(json.dumps(result, indent=2))
Why This Matters for Evidence
You don't need to trust the timestamping service. The Merkle proof lets you independently verify that your file hash was included in the transaction. You can check the blockchain explorer yourself to confirm the root hash exists on-chain.
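The last step, comparing your computed root with the value you read from the chain, can be sketched as below. How the root is encoded in the transaction depends on the chain and the service, so this assumes you have already copied it out of the explorer as a hex string, possibly with a 0x prefix. The names normalize_hex and root_is_anchored are hypothetical, and the two values are placeholders.

```python
def normalize_hex(value):
    """Trim whitespace, lowercase, and drop an optional 0x prefix."""
    value = value.strip().lower()
    return value[2:] if value.startswith("0x") else value


def root_is_anchored(computed_root, anchored_root):
    """True when the computed root equals the root read from the chain."""
    return normalize_hex(computed_root) == normalize_hex(anchored_root)


# Placeholder values: use your computed root and the value from the transaction
computed_root = "AB" * 32
anchored_root = "0x" + "ab" * 32
print(root_is_anchored(computed_root, anchored_root))  # True
```

Normalizing first avoids a false mismatch caused only by letter case or a prefix added by the explorer.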
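The proof can't be forged after the fact. To fake a proof for a file that wasn't actually anchored, someone would have to find a hash path that ends at a root already on the blockchain, which SHA-256 is designed to make computationally infeasible. Any other path produces a computed root that doesn't match what's on the blockchain.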
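A quick way to convince yourself is to reuse a valid path with a different file hash and watch the root change. The sketch below uses made-up file contents and sibling values, and the helper names combine and walk are hypothetical; it assumes the same left-plus-right SHA-256 rule used throughout this post.

```python
import hashlib


def sha256_hex(data):
    """Hex SHA-256 digest of a bytes value."""
    return hashlib.sha256(data).hexdigest()


def combine(left_hex, right_hex):
    """Parent hash: SHA-256 over the bytes of left followed by right."""
    return sha256_hex(bytes.fromhex(left_hex + right_hex))


def walk(leaf, path):
    """Recompute the root from a leaf and its sibling path."""
    current = leaf
    for step in path:
        if step["position"] == "left":
            current = combine(step["hash"], current)
        else:
            current = combine(current, step["hash"])
    return current


# Made-up siblings standing in for the rest of the tree
path = [
    {"hash": sha256_hex(b"sibling 1"), "position": "right"},
    {"hash": sha256_hex(b"sibling 2"), "position": "left"},
]

original = sha256_hex(b"contract v1")
tampered = sha256_hex(b"contract v2")

# Pretend this is the root that was written to the chain
anchored_root = walk(original, path)

print(walk(original, path) == anchored_root)  # True
print(walk(tampered, path) == anchored_root)  # False
```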
For legal or forensic use, this means opposing counsel can verify your evidence timestamps independently. They don't have to trust ProofLedger, ProofAnchor, or any other service. They just run the algorithm against the blockchain data.
That's the point of cryptographic proofs. Math, not trust.