DEV Community

Xinglin Ming

Posted on

PDF Automation with Python: Merge, Split, Watermark & More

Working with PDFs every day? Here is how to automate common tasks such as merging, splitting, and text extraction with Python and the PyPDF2 library.

Complete PDF Toolkit

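import os

import PyPDF2  # PdfMerger/PdfReader/PdfWriter are the class names in PyPDF2 2.x and 3.x


class PDFToolkit:
    @staticmethod
    def merge(pdf_list, output):
        """Concatenate the PDFs in pdf_list into output; return the file count."""
        merger = PyPDF2.PdfMerger()
        for pdf in pdf_list:
            merger.append(pdf)
        merger.write(output)
        merger.close()
        return len(pdf_list)

    @staticmethod
    def split(input_pdf, output_dir):
        """Write each page to output_dir/page_N.pdf; return the page count."""
        os.makedirs(output_dir, exist_ok=True)
        reader = PyPDF2.PdfReader(input_pdf)
        for i, page in enumerate(reader.pages, start=1):
            writer = PyPDF2.PdfWriter()
            writer.add_page(page)
            with open(os.path.join(output_dir, f"page_{i}.pdf"), "wb") as f:
                writer.write(f)
        # len() avoids an UnboundLocalError when the PDF has no pages
        return len(reader.pages)

    @staticmethod
    def extract_text(input_pdf):
        """Return the text of all pages, joined by newlines."""
        reader = PyPDF2.PdfReader(input_pdf)
        # "or ''" guards against pages where extraction yields nothing
        return "\n".join(page.extract_text() or "" for page in reader.pages)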

Features

  • Merge multiple PDFs into one
  • Split PDF into individual pages
  • Extract text for analysis
  • Add watermarks to documents
  • Convert PDF to images

Follow for more automation tools!
