PDF Automation with Python: Merge, Split, Watermark & More
Working with PDFs daily? Here is how to automate the routine tasks with Python and PyPDF2.
Complete PDF Toolkit
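import os

import PyPDF2


class PDFToolkit:
    @staticmethod
    def merge(pdf_list, output):
        """Merge the PDFs in pdf_list, in order, into a single file."""
        merger = PyPDF2.PdfMerger()
        for pdf in pdf_list:
            merger.append(pdf)
        merger.write(output)
        merger.close()
        return len(pdf_list)

    @staticmethod
    def split(input_pdf, output_dir):
        """Write each page to its own file and return the page count."""
        os.makedirs(output_dir, exist_ok=True)
        reader = PyPDF2.PdfReader(input_pdf)
        for i, page in enumerate(reader.pages, start=1):
            writer = PyPDF2.PdfWriter()
            writer.add_page(page)
            with open(os.path.join(output_dir, f"page_{i}.pdf"), "wb") as f:
                writer.write(f)
        # len() also works for a PDF with no pages, where the loop never runs
        return len(reader.pages)

    @staticmethod
    def extract_text(input_pdf):
        """Return the text of all pages, joined by newlines."""
        reader = PyPDF2.PdfReader(input_pdf)
        # Fall back to "" in case a page yields no text
        return "\n".join(page.extract_text() or "" for page in reader.pages)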
Features
- Merge multiple PDFs into one
- Split PDF into individual pages
- Extract text for analysis
- Add watermarks to documents
- Convert PDF to images
Follow for more automation tools!