Downloading a software package, a binary, or a large dataset from the internet means trusting that what you received is what the author published. Checksums are how you verify that trust. The file's author computes a hash before publishing; you compute the same hash after downloading; if the values match, the file is intact and unmodified.
SHA-256 is the algorithm you should use for this. MD5 and SHA-1 checksums still appear for older projects, but both have demonstrated collision vulnerabilities that make them unsuitable for integrity verification where tampering is a concern. The SHA-2 family that includes SHA-256 has no known practical collision vulnerability. If you're choosing an algorithm for new file verification workflows, SHA-256 is the default.
Step 1: Find the Published Checksum
Most serious software projects publish checksums on their download page or in a separate checksums file. Common patterns:
- A file named SHA256SUMS, checksums.txt, or sha256sums.txt listed alongside the download
- A hash value next to the download link, labeled as SHA-256 or sha256
- A detached PGP signature file covering the checksum file, adding a layer of authenticity verification on top of the integrity check
If the project only publishes MD5 checksums, you can still verify against them. An MD5 match tells you the file wasn't accidentally corrupted in transit. It is much weaker evidence against deliberate tampering: MD5 collisions, where someone constructs two different files that share one MD5 hash, are now practical on consumer hardware, so a party able to influence what was originally published could prepare a benign file and a malicious file with the same checksum.
For security-critical downloads (operating system images, cryptographic tools, container runtimes), prefer projects that publish SHA-256 checksums and verify them. The OWASP guidance on secure software distribution covers why this matters in web application and DevOps contexts.
Step 2: Compute the Hash of the Downloaded File
On the command line, this is straightforward on every platform.
Linux and macOS:
sha256sum /path/to/downloaded-file.zip
This prints the SHA-256 hash followed by the filename.
macOS alternative (if sha256sum is not installed by default):
shasum -a 256 /path/to/downloaded-file.zip
Windows (PowerShell):
Get-FileHash -Path "C:\path\to\downloaded-file.zip" -Algorithm SHA256
In scripting and automation contexts, Python provides the hashlib library for computing file hashes programmatically:
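import hashlib

def sha256_file(path):
    """Return the SHA-256 hex digest of the file at path."""
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        # Read 64 KB at a time until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(65536), b''):
            h.update(chunk)
    return h.hexdigest()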
Reading in chunks avoids loading the entire file into memory, which matters for large files like OS images or database dumps. The chunk size of 65536 bytes (64 KB) is a reasonable default that balances memory usage and I/O efficiency.
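On Python 3.11 and later, the standard library's hashlib.file_digest does the chunked reading for you. The sketch below assumes that version may or may not be present, so it falls back to a manual loop on older interpreters. The helper name sha256_of is illustrative only.

```python
import hashlib


def sha256_of(path):
    """Return the SHA-256 hex digest of the file at `path` (hypothetical helper)."""
    with open(path, "rb") as f:
        if hasattr(hashlib, "file_digest"):
            # Python 3.11+: hashlib reads the file in chunks internally.
            return hashlib.file_digest(f, "sha256").hexdigest()
        # Older interpreters: manual chunked loop, 64 KB per read.
        h = hashlib.sha256()
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
        return h.hexdigest()
```

Either branch produces the same digest, so callers do not need to know which one ran.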
For text content or strings (not file hashing), the EvvyTools Hash Generator handles SHA-256 and the other major algorithms entirely in the browser. File hash verification for large files is better done on the command line since you don't upload the file anywhere.

Step 3: Compare the Hashes
The hash you computed in Step 2 should exactly match the hash the project published. Every hexadecimal digit must agree; only letter case may differ, since Get-FileHash prints uppercase hex while sha256sum and shasum print lowercase.
Manual comparison: Copy both values into a text editor and compare them. A SHA-256 hash is 64 hexadecimal characters long. One wrong character anywhere means the file is different. A mismatch could mean the file was corrupted during download, served from a compromised mirror, or modified in transit.
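Comparing 64 characters by eye is error-prone, so a short script can help. The following is a minimal sketch, not a standard tool: hashes_match is a hypothetical helper that strips surrounding whitespace and ignores letter case (hex digits are case-insensitive) before comparing.

```python
import hmac
import string


def hashes_match(published, computed):
    """Compare two SHA-256 hex digests, ignoring case and surrounding whitespace."""
    a = published.strip().lower()
    b = computed.strip().lower()
    # A SHA-256 digest is always 64 hexadecimal characters; reject anything else
    # so that a truncated copy-paste is reported instead of silently failing.
    for value in (a, b):
        if len(value) != 64 or any(c not in string.hexdigits for c in value):
            raise ValueError(f"not a SHA-256 hex digest: {value!r}")
    # compare_digest is a constant-time comparison; plain == would also work here.
    return hmac.compare_digest(a, b)
```

Raising on malformed input, rather than returning False, keeps a copy-paste slip distinguishable from a genuine mismatch.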
Automated comparison on Linux/macOS: If the project provides a SHA256SUMS file, you can verify all listed files in one command:
sha256sum --check SHA256SUMS
This reads the SHA256SUMS file, computes hashes for all the listed files in the current directory, and reports which ones pass and which fail. Files that fail are listed with a clear mismatch message.
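On systems without sha256sum, or inside a Python-based toolchain, the same check can be approximated in a few lines. This sketch assumes the common SHA256SUMS layout of one entry per line, a hex digest followed by a space and then either a second space or a `*` (binary-mode marker) before the filename. The name verify_sums is hypothetical, and this is not a full reimplementation of the coreutils tool.

```python
import hashlib
import os


def verify_sums(sums_path):
    """Check every file listed in a SHA256SUMS-style file.

    Returns a dict mapping each listed filename to True (match) or False
    (mismatch or missing file). Filenames are resolved relative to the
    directory that contains the checksum file.
    """
    base = os.path.dirname(os.path.abspath(sums_path))
    results = {}
    with open(sums_path, "r", encoding="utf-8") as sums:
        for line in sums:
            line = line.rstrip("\n")
            if not line.strip():
                continue  # skip blank lines
            expected, _, rest = line.partition(" ")
            # After the first space comes a second space (text mode) or "*" (binary mode).
            name = rest[1:] if rest[:1] in (" ", "*") else rest
            target = os.path.join(base, name)
            if not os.path.isfile(target):
                results[name] = False
                continue
            h = hashlib.sha256()
            with open(target, "rb") as f:
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)
            results[name] = h.hexdigest() == expected.lower()
    return results
```

One deliberate difference from the command-line tool: sha256sum resolves filenames against the current working directory, while this sketch resolves them against the checksum file's own directory.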
When hashes don't match:
First, download the file again. Corrupted downloads are the most common cause of mismatches, and a second attempt often resolves it. If you downloaded from a mirror, switch to the project's primary server. If the mismatch persists from the official source, report it to the project maintainers and do not use the file: a persistent mismatch from the canonical source could indicate a compromised distribution channel.
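The retry-then-escalate procedure can be sketched in code. This is only an illustration under stated assumptions: fetch_verified is a hypothetical function, and each entry in sources pairs a label with a zero-argument callable that returns the file's bytes, standing in for whatever download mechanism you actually use. It also holds each payload in memory, so it suits small files.

```python
import hashlib


def fetch_verified(sources, expected_sha256):
    """Try each source in order and return the first payload whose hash matches.

    `sources` is a list of (label, fetch) pairs; `fetch` is a placeholder for
    any zero-argument callable that returns the downloaded bytes.
    """
    expected = expected_sha256.strip().lower()
    mismatches = []
    for label, fetch in sources:
        data = fetch()
        actual = hashlib.sha256(data).hexdigest()
        if actual == expected:
            return data
        mismatches.append((label, actual))
    # Every source disagreed with the published hash: stop and escalate
    # to the maintainers rather than using any of the downloads.
    raise RuntimeError(f"hash mismatch from all sources: {mismatches}")
```

Listing the same source twice gives a simple retry; listing the project's primary server after a mirror gives the fallback described above.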
Step 4: Optionally Verify the Checksum File Itself
A SHA-256 hash tells you the file matches what was published. It doesn't tell you the published hash wasn't changed by someone who compromised the project's website or CDN between when the author published and when you downloaded.
For high-security scenarios, project maintainers often sign the checksum file with a PGP key. Verifying the PGP signature confirms the checksum file was signed by someone with the maintainer's private key:
gpg --verify SHA256SUMS.asc SHA256SUMS
This requires the maintainer's public key in your GPG keyring. Most projects that use this approach include instructions for importing their signing key on their download page or security documentation.
For most everyday software downloads, SHA-256 comparison alone is sufficient. PGP verification becomes more important for security-critical tools where you want to verify you have the genuine software from the genuine author, not just software that matches a potentially-compromised published hash.
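If you want to run the signature check from a script, one option is to call gpg through Python's subprocess module. The sketch below assumes gpg is installed and the maintainer's public key is already in your keyring; the function names are illustrative. A zero exit status means the signature checked out against a key in the keyring. It does not by itself establish that the key really belongs to the maintainer.

```python
import subprocess


def gpg_verify_command(signature_path, signed_path):
    """Build the argument list for verifying a detached signature with gpg."""
    return ["gpg", "--verify", signature_path, signed_path]


def signature_is_valid(signature_path, signed_path):
    """Return True if `gpg --verify` exits with status 0 for this pair of files."""
    result = subprocess.run(
        gpg_verify_command(signature_path, signed_path),
        capture_output=True,  # gpg writes its report to stderr
        text=True,
    )
    return result.returncode == 0
```

A typical call would be signature_is_valid("SHA256SUMS.asc", "SHA256SUMS"), run before the checksums in that file are trusted.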
Automating Checksum Verification in Build Pipelines
If you're downloading dependencies or artifacts in a build or CI/CD pipeline, automate the checksum verification rather than checking manually each time.
In a shell script:
EXPECTED_HASH="a7f3b...(paste the published SHA-256 here)"
ACTUAL_HASH=$(sha256sum downloaded-file.zip | awk '{print $1}')

if [ "$EXPECTED_HASH" != "$ACTUAL_HASH" ]; then
    echo "ERROR: Hash mismatch. Expected $EXPECTED_HASH, got $ACTUAL_HASH"
    exit 1
fi

echo "Hash verified successfully"
In Node.js, the built-in crypto module provides crypto.createHash('sha256'), which can hash a file as a stream:
const crypto = require('crypto');
const fs = require('fs');

function sha256File(path) {
  return new Promise((resolve, reject) => {
    const hash = crypto.createHash('sha256');
    const stream = fs.createReadStream(path);
    stream.on('data', chunk => hash.update(chunk));
    stream.on('end', () => resolve(hash.digest('hex')));
    stream.on('error', reject);
  });
}
Automating this step means a compromised download will fail your build rather than proceed silently. For build pipelines that fetch external dependencies, this is a meaningful security improvement with low implementation cost. Pin to a specific version and verify the hash in the same pipeline step.
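One way to keep version pins and hashes together is a small table checked in a single step. This is a sketch under assumptions: the pins mapping (artifact filename to expected SHA-256) and the function names are hypothetical, not any CI system's built-in feature.

```python
import hashlib
import sys
from pathlib import Path


def find_mismatches(pins, artifact_dir):
    """Return names of artifacts that are missing or whose SHA-256 differs from the pin."""
    failures = []
    for name, expected in pins.items():
        path = Path(artifact_dir) / name
        if not path.is_file():
            failures.append(name)
            continue
        # read_bytes() loads the whole file; fine for small artifacts,
        # switch to chunked reading for very large ones.
        actual = hashlib.sha256(path.read_bytes()).hexdigest()
        if actual != expected.strip().lower():
            failures.append(name)
    return failures


def run_check(pins, artifact_dir):
    """Print a report and return a process exit status (0 means all verified)."""
    failures = find_mismatches(pins, artifact_dir)
    if failures:
        print("ERROR: hash mismatch or missing file: " + ", ".join(failures),
              file=sys.stderr)
        return 1
    print("All pinned artifacts verified")
    return 0
```

In a pipeline step you would end the script with sys.exit(run_check(pins, artifact_dir)), so that any mismatch produces a non-zero exit status and stops the build.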
Understanding What Checksums Protect Against
SHA-256 checksums protect against accidental corruption (bit flips, incomplete transfers, disk errors) and against intentional modification by an attacker who compromised the download itself, as long as the hash you compare against came from a source the attacker could not also alter.
They do not protect against a compromised build process. If the author's build server was compromised before the binary was created, the published hash is for the compromised binary. They also don't protect against malicious software published intentionally by the author. A hash match confirms you have what the author published, not that what the author published is safe.
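As a small illustration of the accidental-corruption case mentioned above, the following sketch flips a single bit in some sample bytes and hashes both versions. The sample data is made up for the demonstration.

```python
import hashlib

original = b"example release contents"
# Flip the lowest bit of the first byte to simulate a single-bit transfer error.
corrupted = bytes([original[0] ^ 0x01]) + original[1:]

digest_original = hashlib.sha256(original).hexdigest()
digest_corrupted = hashlib.sha256(corrupted).hexdigest()

# Count how many of the 64 hex characters differ between the two digests.
differing = sum(a != b for a, b in zip(digest_original, digest_corrupted))

print(digest_original)
print(digest_corrupted)
print(f"{differing} of 64 hex characters differ")
```

The two digests bear no visible relationship to each other, which is why even a one-bit error shows up clearly in the comparison step.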
For fully verifiable builds, some projects use reproducible builds: a deterministic build process where anyone can build the project from source and produce a binary with an identical hash. This is the strongest form of verification available and is increasingly common in security-conscious open-source projects.
The underlying mechanics of why SHA-256 is appropriate here and why MD5 and SHA-1 are not, including the collision vulnerabilities that disqualify them from integrity verification, are explained in the full article on How Cryptographic Hash Functions Work: MD5, SHA-1, SHA-256, and SHA-512 Explained.
