Mate Technologies

🚀 PDFTextor — Extract Text from PDFs with a Modern Python GUI

Working with PDFs can be frustrating—especially when you just need the text quickly.

That’s why I built PDFTextor, a simple yet powerful desktop tool that lets you extract text from one or multiple PDF files with ease.

👉 Get it here: https://gum.new/gum/cmk3n0dst002504ky9ulpdf2u

✨ What is PDFTextor?

PDFTextor is a Python-based desktop application that allows you to:

Extract text from single or multiple PDF files
Handle large documents smoothly
Track progress in real time
Cancel extraction anytime
Save extracted text instantly

It’s lightweight, fast, and designed with a clean modern UI.

🔥 Key Features
📂 Multi-PDF Extraction

Select multiple PDF files and extract all text in one go.

⚡ Real-Time Progress Tracking

A progress bar shows exactly how far the extraction has gone.

❌ Cancel Anytime

Stop long operations instantly with a cancel button.

💾 Save Extracted Text

Export results to a .txt file with one click.

🧹 Clean & Reset

Clear the workspace instantly for a new session.

🎨 Modern UI

Built with ttkbootstrap for a sleek, professional look.

🧠 How It Works

PDFTextor uses:

PyPDF2 to read and extract text from PDF pages
Threading to keep the UI responsive
Tkinter + ttkbootstrap for the interface

The extraction runs in a background thread, so the interface stays responsive even while large PDFs are being processed.
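
One common way to arrange this hand-off is sketched below. It is not PDFTextor's actual code: the names `run_in_background` and `drain`, and the use of a `queue.Queue`, are assumptions for illustration. The worker thread puts results on a queue, and the main thread drains it periodically (in a Tkinter app, typically from a callback scheduled with `root.after`).

```python
import queue
import threading


def run_in_background(task, items, results):
    """Start `task` over `items` in a daemon thread; results go to a queue."""
    def worker():
        for item in items:
            results.put(("item", task(item)))
        results.put(("done", None))  # sentinel so the consumer knows to stop

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def drain(results):
    """Collect whatever messages are ready without blocking the caller."""
    messages = []
    while True:
        try:
            messages.append(results.get_nowait())
        except queue.Empty:
            return messages
```

Because `drain` never blocks, the main thread can call it on a timer and return to handling UI events right away, which is what keeps the window responsive.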

🖥️ Code Overview

Here’s the core structure of the app:

class PDFTextExtractor:
    def __init__(self, root):
        self.root = root
        self.root.title("PDFTextor - PDF Text Extractor")

📌 Extraction Logic

The heavy lifting happens in a worker thread:

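from PyPDF2 import PdfReader

def _extract_text_worker(self, file_paths):
    text = ""  # accumulates the text of every page in every file
    for file_path in file_paths:
        reader = PdfReader(file_path)
        for page in reader.pages:
            # Fall back to "" when a page yields no text
            text += page.extract_text() or ""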
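
A self-contained variant of this loop might look like the sketch below. The function name `extract_text_from_pdfs`, the per-file header line, and the `reader_factory` parameter are assumptions added for illustration (the parameter lets the loop be exercised without real PDFs). Only `PdfReader`, `reader.pages`, and `page.extract_text()` come from the snippet above.

```python
def extract_text_from_pdfs(file_paths, reader_factory=None):
    """Return the concatenated text of every page in every PDF."""
    if reader_factory is None:
        # Imported lazily so the function can be tried with a stand-in reader.
        from PyPDF2 import PdfReader
        reader_factory = PdfReader

    chunks = []
    for file_path in file_paths:
        reader = reader_factory(file_path)
        chunks.append(f"--- {file_path} ---")  # hypothetical per-file header
        for page in reader.pages:
            # Guard against a missing result, as the original loop does.
            chunks.append(page.extract_text() or "")
    return "\n".join(chunks)
```

Collecting pieces in a list and joining once at the end avoids repeated string concatenation, which can matter for documents with many pages.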

📊 Progress Updates

def _update_progress(self, value):
    self.progress["value"] = value
    self.progress_label.config(text=f"Extracting... {int(value)}%")
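
The snippet does not show how `value` is computed or how the call reaches the UI thread. One plausible arrangement, sketched here with hypothetical names (`progress_percent`, `process_with_progress`, `notify`), counts pages up front and reports a percentage after each page. Since Tkinter widgets should only be touched from the main thread, a real worker would typically schedule the update with something like `self.root.after(0, self._update_progress, value)` instead of calling it directly.

```python
def progress_percent(pages_done, total_pages):
    """Percentage of pages processed, clamped to the range 0..100."""
    if total_pages <= 0:
        return 100.0  # nothing to do counts as finished
    return max(0.0, min(100.0, 100.0 * pages_done / total_pages))


def process_with_progress(page_counts, notify):
    """Walk files page by page, calling notify(percent) after each page.

    page_counts: pages per file, e.g. len(reader.pages) for each reader.
    notify: callback taking a float; in a Tkinter app it would wrap root.after.
    """
    total = sum(page_counts)
    done = 0
    for count in page_counts:
        for _ in range(count):
            done += 1  # the real page extraction would happen here
            notify(progress_percent(done, total))
```

Measuring progress in pages instead of files keeps the bar moving smoothly when one PDF is much longer than the others.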

❌ Cancel Support

if self.cancel_event.is_set():
    return

This ensures users can safely stop long-running tasks.
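
For context, `cancel_event` is presumably a `threading.Event` that the cancel button sets; the source shows only the check itself. A minimal sketch of the pattern, with a hypothetical `process_pages` function standing in for the extraction loop:

```python
import threading


def process_pages(pages, cancel_event, handle):
    """Process pages in order, stopping early once cancel_event is set.

    Returns the number of pages handled before completion or cancellation.
    """
    handled = 0
    for page in pages:
        if cancel_event.is_set():
            break  # checked between pages, so a page in progress still finishes
        handle(page)
        handled += 1
    return handled
```

Cancellation of this kind is cooperative: the worker stops at the next check, so the delay after clicking cancel is at most the time needed to finish the current page.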

🧩 UI Highlights
Buttons for extract, save, clear, cancel
Scrollable text output area
Progress bar with status updates
Built-in About/Guide window

Everything is designed for simplicity and usability.

📦 Tech Stack
Python
Tkinter
ttkbootstrap
PyPDF2
Multithreading
💡 Why I Built This

Many PDF tools are either:

Too heavy
Too expensive
Or lack batch processing

PDFTextor solves all three problems with a clean, efficient solution.

🚀 Getting Started
Download the app
👉 https://gum.new/gum/cmk3n0dst002504ky9ulpdf2u
Run the application
Click "Extract PDF Text"
Select your PDF files
Sit back and let PDFTextor do the work
🛠️ Future Improvements
OCR support for scanned PDFs
Export to JSON / CSV
Drag & drop support
Dark mode toggle
👨‍💻 Developer

Built by MateTools
🌐 https://matetools.gumroad.com

⭐ Final Thoughts

If you regularly work with PDFs, PDFTextor can save you a ton of time.

Simple. Fast. Reliable.

Give it a try and let me know what you think!

👉 https://gum.new/gum/cmk3n0dst002504ky9ulpdf2u
