When handling contracts, legal files, or technical documentation, multiple versions of the same PDF are often involved. Identifying what has changed between versions manually can be tedious and prone to mistakes.
Fortunately, Spire.PDF for Python can detect and highlight the differences between two PDF files automatically, using only a few lines of code.
This tutorial shows you how to compare PDFs step by step, including setup and optional configuration.
Install the Library
To begin, install the required package from PyPI:
pip install spire.pdf
After installation, you can start comparing PDF documents right away.
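Before running the examples, it can help to confirm that the package is visible to the interpreter you are using. The snippet below is a minimal sketch that relies only on the standard library; the helper name is_importable is illustrative and not part of Spire.PDF.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "spire" is the top-level package used by the imports in this tutorial.
    print("spire available:", is_importable("spire"))
```

If this prints False, the package was most likely installed into a different Python environment than the one running the script.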
Basic Example: Detect Differences Between Two PDFs
The example below compares an original document with an updated version and outputs a comparison file that visually marks the changes:
from spire.pdf.common import *
from spire.pdf import *
# Load the original PDF
original = PdfDocument("C:\\Users\\Administrator\\Desktop\\original.pdf")
# Load the updated PDF
revised = PdfDocument("C:\\Users\\Administrator\\Desktop\\revised.pdf")
# Initialize comparer
comparer = PdfComparer(original, revised)
# Generate comparison result
comparer.Compare("output/CompareResult.pdf")
# Release resources
original.Dispose()
revised.Dispose()
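In the listing above, the Dispose calls are skipped if Compare raises an exception. One way to guarantee cleanup is a small context manager, sketched here. The name disposing is hypothetical; it only assumes that each object exposes a Dispose() method, as PdfDocument does in the example.

```python
from contextlib import contextmanager


@contextmanager
def disposing(*resources):
    """Yield the given objects, then call Dispose() on each, even after an error."""
    try:
        yield resources
    finally:
        # Release in reverse order of acquisition.
        for resource in reversed(resources):
            resource.Dispose()


# Possible usage with the objects from the example above:
# with disposing(PdfDocument("original.pdf"), PdfDocument("revised.pdf")) as (original, revised):
#     PdfComparer(original, revised).Compare("CompareResult.pdf")
```

This sketch stops at the first Dispose() call that fails, which is usually acceptable for short scripts.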
Open the resulting file in a PDF viewer (such as Adobe Acrobat), and you'll see the two versions side by side. Removed content is highlighted in red on the original side, while added content is marked in yellow on the revised side.
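The basic example assumes that both input files exist and that the output folder is already present. Whether Compare creates a missing folder on its own is not something this tutorial confirms, so a defensive check may be worthwhile. The following is a minimal sketch using pathlib; prepare_paths is a hypothetical helper, not a Spire.PDF function.

```python
from pathlib import Path


def prepare_paths(original, revised, output):
    """Check that both inputs exist and make sure the output folder is present."""
    original, revised, output = Path(original), Path(revised), Path(output)
    for pdf in (original, revised):
        if not pdf.is_file():
            raise FileNotFoundError(f"Input PDF not found: {pdf}")
    # Create the output folder (and any parents) if it does not exist yet.
    output.parent.mkdir(parents=True, exist_ok=True)
    return str(original), str(revised), str(output)


# Possible usage before loading the documents:
# original_path, revised_path, output_path = prepare_paths(
#     "original.pdf", "revised.pdf", "output/CompareResult.pdf")
```

Using pathlib also avoids the escaped backslashes needed for hard-coded Windows paths.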
Advanced Options for Comparison
You can further control how the comparison works by adjusting settings before calling the Compare method.
Compare Text Only
If you want to ignore layout or graphical differences and focus purely on text changes:
comparer.PdfCompareOptions.OnlyCompareText = True
Limit Comparison to Specific Pages
For large documents, you may only need to analyze certain sections. You can define page ranges like this:
comparer.PdfCompareOptions.SetPageRanges(1, 3, 1, 3)
# Parameters: (oldStartIndex, oldEndIndex, newStartIndex, newEndIndex)
This allows you to compare only selected pages instead of the entire document.
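Because the four arguments are easy to mix up, it can be useful to validate them before passing them on. The helper below is a sketch under stated assumptions: build_page_ranges is a hypothetical name, and it only checks that each bound is a non-negative integer and that every start does not exceed its end. Whether the library counts pages from 0 or 1 is not verified here.

```python
from typing import Tuple


def build_page_ranges(old_start: int, old_end: int,
                      new_start: int, new_end: int) -> Tuple[int, int, int, int]:
    """Validate the four page bounds and return them in SetPageRanges order."""
    values = (old_start, old_end, new_start, new_end)
    if any(isinstance(v, bool) or not isinstance(v, int) for v in values):
        raise TypeError("page bounds must be integers")
    if any(v < 0 for v in values):
        raise ValueError("page bounds must not be negative")
    if old_start > old_end:
        raise ValueError("old range: start must not exceed end")
    if new_start > new_end:
        raise ValueError("new range: start must not exceed end")
    return values


# Possible usage with the comparer from the example above:
# comparer.PdfCompareOptions.SetPageRanges(*build_page_ranges(1, 3, 1, 3))
```

Validating first turns a swapped or negative bound into an immediate, readable error instead of a surprising comparison result.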
Conclusion
Manually reviewing differences between PDF versions can be inefficient and error-prone. By using Spire.PDF for Python, you can quickly produce a clear visual comparison and identify changes with ease. This method is especially useful for contract reviews, document proofreading, and version tracking in professional workflows.