DEV Community

Mate Technologies

🖼️ ExtractMate: A Full Python OCR Desktop App (Tkinter + Tesseract)

ExtractMate is a modern Python desktop OCR application that converts images into editable text using Tesseract. It features drag-and-drop uploads, multi-language support, SVG conversion, a dark-themed UI, threaded processing for smooth performance, and export options for TXT, PDF, and Word formats.

Building desktop tools in Python doesn't have to feel outdated or clunky.

In this post, I'm sharing ExtractMate, a complete Image → Text OCR desktop application built with Python, featuring:

🌍 Multi-language OCR (Tesseract)

🖱️ Drag & Drop image upload

🖼️ SVG → PNG auto conversion

📋 Copy extracted text

💾 Export as TXT, PDF, or Word

🎨 Modern dark UI

🚀 Threaded OCR (no freezing)

✨ Animated splash screen

This is a real, usable app, not a demo.

🧰 Tech Stack

Python 3.9+

Tkinter (UI)

pytesseract (OCR engine)

Pillow (image handling)

tkinterDnD2 (drag & drop)

CairoSVG (SVG → PNG)

ReportLab (PDF export)

python-docx (Word export)

pyperclip (clipboard copy)

📦 Install Dependencies
pip install pillow pytesseract pyperclip reportlab python-docx cairosvg tkinterdnd2

⚠️ Make sure Tesseract OCR is installed and added to your system PATH, along with the language data for every OCR language you plan to use.
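Before launching the app, it can help to confirm that the Tesseract binary is actually reachable. The snippet below is a minimal sketch using only the standard library; the helper names `find_tesseract` and `tesseract_version` are my own and are not part of ExtractMate.

```python
import shutil
import subprocess


def find_tesseract(binary="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(binary)


def tesseract_version(path):
    """Return the first line printed by `<path> --version`, or None if the call fails."""
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return None
    # Depending on the Tesseract release, the banner goes to stdout or stderr
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract was not found on PATH; install it before running ExtractMate.")
    else:
        print(f"Found {path}: {tesseract_version(path)}")
```

If the binary lives outside PATH, pytesseract also lets you point at it explicitly through `pytesseract.pytesseract.tesseract_cmd`.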

🚀 Full Source Code (Single File)

Save this as extractmate.py

import os, sys, time, threading, tkinter as tk
from tkinter import filedialog, messagebox, ttk
from PIL import Image
import pytesseract
import pyperclip
import cairosvg  # SVG to PNG conversion
import tempfile  # For temporary files

# PDF/Word Saving
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.lib.pagesizes import A4
from docx import Document

# For drag & drop
try:
    from tkinterdnd2 import DND_FILES, TkinterDnD
except ImportError:
    messagebox.showerror("Missing Module", "Install tkinterDnD2:\npip install tkinterDnD2")
    sys.exit()

# ---------------- Splash Screen ----------------
class SplashScreen:
    def __init__(self, root, duration=1500, fade_speed=0.05):
        self.root = root
        self.duration = duration
        self.fade_speed = fade_speed
        self.opacity = 0.0
        self.progress = 0

        self.splash = tk.Toplevel(root)
        self.splash.overrideredirect(True)
        self.splash.attributes("-topmost", True)
        self.splash.attributes("-alpha", 0.0)

        bg_color = "#121212"
        fg_color = "#00d4ff"
        bar_bg = "#333333"
        bar_fill = "#00d4ff"

        self.splash.configure(bg=bg_color)
        self.label = tk.Label(self.splash, text="ExtractMate", font=("Segoe UI", 32, "bold"),
                              bg=bg_color, fg=fg_color)
        self.label.pack(padx=20, pady=20)

        self.bar_frame = tk.Frame(self.splash, bg=bg_color)
        self.bar_frame.pack(padx=20, pady=(0, 10), fill="x")
        self.bar = tk.Canvas(self.bar_frame, height=12, bg=bar_bg, highlightthickness=0)
        self.bar.pack(fill="x")
        self.fill = self.bar.create_rectangle(0, 0, 0, 12, fill=bar_fill, width=0)

        self.splash.update_idletasks()
        w, h = self.splash.winfo_width(), self.splash.winfo_height()
        ws, hs = self.splash.winfo_screenwidth(), self.splash.winfo_screenheight()
        self.splash.geometry(f"{w}x{h}+{(ws//2)-(w//2)}+{(hs//2)-(h//2)}")
        self.root.withdraw()
        self.fade_in()

    def fade_in(self):
        if self.opacity < 1.0:
            self.opacity += self.fade_speed
            self.splash.attributes("-alpha", self.opacity)
            self.splash.after(30, self.fade_in)
        else:
            self.start_progress()

    def start_progress(self):
        self.update_progress()

    def update_progress(self):
        self.progress += 1
        if self.progress >= 100:
            self.fade_out()
            return
        width = self.bar_frame.winfo_width() * (self.progress / 100)
        self.bar.coords(self.fill, 0, 0, width, 12)
        self.splash.after(int(self.duration / 100), self.update_progress)

    def fade_out(self):
        if self.opacity > 0.0:
            self.opacity -= self.fade_speed
            self.splash.attributes("-alpha", self.opacity)
            self.splash.after(30, self.fade_out)
        else:
            self.close()

    def close(self):
        self.splash.destroy()
        self.root.deiconify()

# ---------------- Main Application ----------------
class ExtractMateApp:
    OCR_LANGUAGES = {
        "English": "eng", "Spanish": "spa", "Brazilian Portuguese": "por",
        "Indonesian": "ind", "French": "fra", "German": "deu",
        "Italian": "ita", "Turkish": "tur", "Arabic": "ara",
        "Vietnamese": "vie", "Russian": "rus", "Thai": "tha", "Korean": "kor"
    }

    SUPPORTED_FORMATS = [".png", ".jpg", ".jpeg", ".bmp", ".svg"]
    MAX_FILES = 10
    MAX_FILESIZE_MB = 20

    def __init__(self, master):
        self.master = master
        self.master.title("ExtractMate - Image to Text")
        self.master.geometry("950x680")
        self.master.configure(bg="#1e1e1e")
        self.style = ttk.Style()
        self._dark_theme()

        self.image_paths = []
        self.invalid_files = []
        self.temp_files = []  # Store temporary PNGs from SVG
        self._build_ui()
        self._enable_drag_and_drop_full_window()

    def _dark_theme(self):
        self.style.theme_use("clam")
        self.style.configure(".", background="#1e1e1e", foreground="#eeeeee",
                             fieldbackground="#2b2b2b", relief="flat")
        self.style.configure("TButton", padding=6, font=("Segoe UI", 11, "bold"))
        self.style.map("TButton", background=[("active", "#00d4ff")])

        self.style.configure("TCombobox",
                             fieldbackground="#2b2b2b",
                             background="#2b2b2b",
                             foreground="#eeeeee",
                             selectforeground="#ffffff",
                             selectbackground="#005f73")

        self.style.configure("Custom.Horizontal.TProgressbar",
                             troughcolor="#2b2b2b",
                             bordercolor="#2b2b2b",
                             background="#00d4ff",
                             lightcolor="#00d4ff",
                             darkcolor="#00a3c7")

    def _build_ui(self):
        title = tk.Label(self.master, text="🔍 ExtractMate", font=("Segoe UI", 20, "bold"),
                         bg="#1e1e1e", fg="#00d4ff")
        title.pack(pady=(10, 5))

        subtitle = tk.Label(self.master,
                            text="Drag & drop images or upload manually. Choose OCR language and extract text.",
                            font=("Segoe UI", 10), bg="#1e1e1e", fg="#aaaaaa")
        subtitle.pack(pady=(0, 10))

        btn_frame = tk.Frame(self.master, bg="#1e1e1e")
        btn_frame.pack(pady=5)

        ttk.Button(btn_frame, text="📁 Upload", command=self.upload_images).grid(row=0, column=0, padx=4)
        ttk.Button(btn_frame, text="🧹 Clear", command=self.clear_images).grid(row=0, column=1, padx=4)
        ttk.Button(btn_frame, text="💬 Extract Text", command=self.start_extraction).grid(row=0, column=2, padx=4)
        ttk.Button(btn_frame, text="📋 Copy All", command=self.copy_text).grid(row=0, column=3, padx=4)
        ttk.Button(btn_frame, text="💾 Save", command=self.save_dialog).grid(row=0, column=4, padx=4)
        ttk.Button(btn_frame, text="❓ About", command=self.show_about).grid(row=0, column=5, padx=4)

        lang_frame = tk.Frame(self.master, bg="#1e1e1e")
        lang_frame.pack(pady=(10, 5))
        tk.Label(lang_frame, text="🌍 OCR Language:", font=("Segoe UI", 11, "bold"),
                 bg="#1e1e1e", fg="#eeeeee").pack(side="left")
        self.lang_var = tk.StringVar(value="English")
        ttk.Combobox(lang_frame, textvariable=self.lang_var,
                      values=list(self.OCR_LANGUAGES.keys()), width=30, state="readonly").pack(side="left", padx=8)

        tk.Label(self.master, text="📌 Uploaded Images:", bg="#1e1e1e", fg="#00d4ff",
                 font=("Segoe UI", 11, "bold")).pack(anchor="w", padx=10)
        self.listbox = tk.Listbox(self.master, width=120, height=6,
                                  bg="#2b2b2b", fg="white", highlightthickness=0)
        self.listbox.pack(padx=10, pady=(0, 10))

        self.progress_var = tk.DoubleVar()
        self.progress_bar = ttk.Progressbar(self.master,variable=self.progress_var,maximum=100,style="Custom.Horizontal.TProgressbar")
        self.progress_bar.pack(fill="x", padx=10, pady=(0, 10))

        output_frame = tk.Frame(self.master, bg="#1e1e1e")
        output_frame.pack(fill="both", expand=True, padx=10, pady=10)
        self.output_canvas = tk.Canvas(output_frame, bg="#101010", highlightthickness=0)
        self.output_canvas.pack(side="left", fill="both", expand=True)
        scrollbar = tk.Scrollbar(output_frame, orient="vertical", command=self.output_canvas.yview)
        scrollbar.pack(side="right", fill="y")
        self.output_canvas.configure(yscrollcommand=scrollbar.set)

        self.output_label = tk.Label(self.output_canvas, text="", font=("Segoe UI", 12),
                                     bg="#101010", fg="#dddddd", anchor="nw", justify="left", wraplength=880)
        self.output_canvas.create_window((0, 0), window=self.output_label, anchor="nw")
        self.output_label.bind("<Configure>",
                               lambda e: self.output_canvas.configure(scrollregion=self.output_canvas.bbox("all")))

    # ---------------- About ----------------
    def show_about(self):
        messagebox.showinfo(
            "About ExtractMate",
            "🔍 ExtractMate - Image to Text OCR Tool\n\n"
            "• Supports Multi-Language OCR\n"
            "• Drag & Drop Upload\n"
            "• Save as TXT, PDF, WORD\n"
            "• Modern Dark UI + Smooth Splash Screen\n\n"
            "Website: https://matetools.gumroad.com\n"
            "License: Personal use only. All rights reserved.\n\n"
            "Developed with ❤️ using Python + Tesseract"
        )

    # ---------------- Drag-and-Drop ----------------
    def _enable_drag_and_drop_full_window(self):
        self.master.drop_target_register(DND_FILES)
        self.master.dnd_bind("<<Drop>>", self._on_drop_files)

    def _on_drop_files(self, event):
        paths = self.master.tk.splitlist(event.data)
        self.add_images(paths)

    # ---------------- Image Upload ----------------
    def upload_images(self):
        files = filedialog.askopenfilenames(title="Select Image(s)",
                                            filetypes=[("Image Files","*.png *.jpg *.jpeg *.bmp *.svg")])
        if files:
            self.add_images(files)

    def add_images(self, files):
        valid_paths = []
        self.invalid_files.clear()
        for f in files:
            name = os.path.basename(f)
            ext = os.path.splitext(f)[1].lower()
            size_mb = os.path.getsize(f) / 1024 / 1024
            if ext not in self.SUPPORTED_FORMATS:
                self.invalid_files.append(f"❌ Unsupported: {name}")
                continue
            if size_mb > self.MAX_FILESIZE_MB:
                self.invalid_files.append(f"⚠ Too large (>{self.MAX_FILESIZE_MB}MB): {name}")
                continue
            # Convert SVG to temporary PNG
            if ext == ".svg":
                # Reserve a temp path and close our handle so CairoSVG can write to it
                fd, temp_png = tempfile.mkstemp(suffix=".png")
                os.close(fd)
                self.temp_files.append(temp_png)  # track for cleanup, even if conversion fails
                try:
                    cairosvg.svg2png(url=f, write_to=temp_png)
                except Exception:
                    self.invalid_files.append(f"❌ Failed to convert SVG: {name}")
                    continue
                f = temp_png
            valid_paths.append(f)

        remaining_slots = self.MAX_FILES - len(self.image_paths)
        if remaining_slots <= 0:
            messagebox.showwarning("Limit Reached", f"Maximum {self.MAX_FILES} images allowed.")
            return

        self.image_paths.extend(valid_paths[:remaining_slots])
        if len(valid_paths) > remaining_slots:
            messagebox.showwarning("Limit Exceeded",
                                   f"Only first {remaining_slots} images were added.")
        self.refresh_listbox()

        if self.invalid_files:
            messagebox.showinfo("Warnings", "\n".join(self.invalid_files))

    def clear_images(self):
        self.image_paths.clear()
        self.invalid_files.clear()
        self.refresh_listbox()
        self.output_label.config(text="")
        # Cleanup temporary files
        for temp in self.temp_files:
            try:
                os.remove(temp)
            except OSError:
                pass  # file already gone or still in use; nothing more to do
        self.temp_files.clear()

    def refresh_listbox(self):
        self.listbox.delete(0, tk.END)
        for f in self.image_paths:
            self.listbox.insert(tk.END, os.path.basename(f))

    # ---------------- OCR Extraction ----------------
    def start_extraction(self):
        if not self.image_paths:
            messagebox.showwarning("Extract Text", "No images uploaded!")
            return
        # Read Tk state on the main thread; the worker only receives plain values
        lang_code = self.OCR_LANGUAGES.get(self.lang_var.get(), "eng")
        paths = list(self.image_paths)
        threading.Thread(target=self.extract_text, args=(lang_code, paths), daemon=True).start()

    def extract_text(self, lang_code, paths):
        all_text = ""
        total = len(paths)

        for idx, path in enumerate(paths, 1):
            name = os.path.basename(path)
            try:
                with Image.open(path) as img:
                    text = pytesseract.image_to_string(img, lang=lang_code)
                if not text.strip():
                    text = f"No text detected in '{name}'."
            except Exception as e:
                # Surface the real failure (missing language data, unreadable file, ...)
                text = f"OCR failed for '{name}': {e}"
            all_text += f"--- {name} ---\n{text}\n\n"

            # Schedule UI updates on the Tk event loop instead of touching widgets here
            self.master.after(0, self.progress_var.set, idx / total * 100)
            time.sleep(0.05)

        self.master.after(0, self._show_result, all_text)

    def _show_result(self, text):
        self.output_label.config(text=text)
        self.progress_var.set(0)

    # ---------------- Copy & Save ----------------
    def copy_text(self):
        if self.output_label.cget("text").strip():
            pyperclip.copy(self.output_label.cget("text"))
            messagebox.showinfo("Copy", "Copied to clipboard!")
        else:
            messagebox.showwarning("Copy", "No text to copy!")

    def save_dialog(self):
        if not self.output_label.cget("text").strip():
            messagebox.showwarning("Save", "Nothing to save!")
            return

        option = messagebox.askyesnocancel("Save Format",
                                           "Choose format:\n\nYes = PDF\nNo = Word (.docx)\nCancel = TXT")

        if option is True:           # PDF
            self.save_pdf()
        elif option is False:        # DOCX
            self.save_docx()
        elif option is None:         # TXT
            self.save_txt()

    def save_txt(self):
        file_path = filedialog.asksaveasfilename(defaultextension=".txt",
                                                 filetypes=[("Text Files","*.txt")])
        if not file_path:
            return
        try:
            with open(file_path, "w", encoding="utf-8") as f:
                f.write(self.output_label.cget("text"))
            messagebox.showinfo("Saved", "Saved as TXT!")
        except Exception as e:
            messagebox.showerror("Error", f"Could not save TXT:\n{e}")

    def save_pdf(self):
        file_path = filedialog.asksaveasfilename(defaultextension=".pdf",
                                                 filetypes=[("PDF Files","*.pdf")])
        if not file_path:
            return

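        from xml.sax.saxutils import escape  # stdlib helper: escapes &, < and >

        text = self.output_label.cget("text")
        doc = SimpleDocTemplate(file_path, pagesize=A4, rightMargin=40, leftMargin=40,
                                topMargin=40, bottomMargin=40)
        styles = getSampleStyleSheet()
        story = []

        for line in text.split("\n"):
            # Escape first so OCR output containing & or < is not parsed as Paragraph markup
            safe = escape(line).replace(" ", "&nbsp;")
            story.append(Paragraph(safe, styles["Normal"]))
            story.append(Spacer(1, 4))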

        try:
            doc.build(story)
            messagebox.showinfo("Saved", "Saved as PDF!")
        except Exception as e:
            messagebox.showerror("Error", f"Could not save PDF:\n{e}")

    def save_docx(self):
        file_path = filedialog.asksaveasfilename(defaultextension=".docx",
                                                 filetypes=[("Word Files","*.docx")])
        if not file_path:
            return
        try:
            doc = Document()
            doc.add_paragraph(self.output_label.cget("text"))
            doc.save(file_path)
            messagebox.showinfo("Saved", "Saved as Word (.docx)!")
        except Exception as e:
            messagebox.showerror("Error", f"Could not save Word:\n{e}")

# ---------------- Run ----------------
if __name__ == "__main__":
    root = TkinterDnD.Tk()
    SplashScreen(root)
    app = ExtractMateApp(root)
    root.mainloop()


Just run:

python extractmate.py

✨ Key Features Explained
🖱️ Drag & Drop Support

Powered by tkinterDnD2, users can drop images anywhere in the window.
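To see the drop handling in isolation, here is a stripped-down sketch of the same pattern. The names `keep_supported` and `run_demo` are illustrative only, and the tkinterdnd2 import sits inside `run_demo` so the filtering helper can be used without a display.

```python
import os

SUPPORTED = {".png", ".jpg", ".jpeg", ".bmp", ".svg"}


def keep_supported(paths):
    """Return only the paths whose extension is in SUPPORTED (case-insensitive)."""
    return [p for p in paths if os.path.splitext(p)[1].lower() in SUPPORTED]


def run_demo():
    """Open a small window that prints the supported files dropped onto it."""
    from tkinterdnd2 import DND_FILES, TkinterDnD

    root = TkinterDnD.Tk()
    root.title("Drop demo")
    root.geometry("400x200")

    def on_drop(event):
        # event.data is a Tcl list; splitlist copes with paths that contain spaces
        paths = root.tk.splitlist(event.data)
        print(keep_supported(paths))

    # Registering the root window makes the whole window a drop target
    root.drop_target_register(DND_FILES)
    root.dnd_bind("<<Drop>>", on_drop)
    root.mainloop()
```

Call `run_demo()` from a script or a REPL to try it; dropped files that are not images are silently ignored in this sketch.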

🌍 Multi-Language OCR

Switch languages dynamically using a dropdown mapped to Tesseract language packs.
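A dropdown entry only works if the matching Tesseract language data is installed. The sketch below shows one way to check that before running OCR; `resolve_lang` and `installed_languages` are hypothetical helpers, and the dictionary is a shortened copy of the app's `OCR_LANGUAGES` mapping.

```python
OCR_LANGUAGES = {"English": "eng", "Spanish": "spa", "French": "fra", "German": "deu"}


def resolve_lang(display_name, installed):
    """Map a dropdown label to its Tesseract code, checking it against `installed`.

    Raises ValueError when the language data is not available.
    """
    code = OCR_LANGUAGES.get(display_name, "eng")
    if code not in installed:
        raise ValueError(f"Tesseract language data for '{code}' is not installed")
    return code


def installed_languages():
    """Ask Tesseract which language packs it can see (requires pytesseract)."""
    import pytesseract

    return set(pytesseract.get_languages(config=""))
```

In the app, the result of `installed_languages()` could be fetched once at startup and used to warn the user before extraction starts.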

🖼️ SVG Compatibility

SVG files are converted into temporary PNGs using CairoSVG. The temporary files are deleted when you click Clear.
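If you would rather have each temporary PNG removed as soon as it has been processed, a context manager is one option. This is a sketch, not what ExtractMate does: `svg_as_png` is a made-up name, and the `convert` parameter exists only so the helper can be exercised without CairoSVG installed.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def svg_as_png(svg_path, convert=None):
    """Yield the path of a temporary PNG rendered from svg_path, then delete it."""
    if convert is None:
        import cairosvg  # imported lazily; only needed for the real conversion

        def convert(src, dst):
            cairosvg.svg2png(url=src, write_to=dst)

    fd, png_path = tempfile.mkstemp(suffix=".png")
    os.close(fd)  # close our handle so the converter can write to the path
    try:
        convert(svg_path, png_path)
        yield png_path
    finally:
        # Runs on success and on failure, so no temp file is left behind
        if os.path.exists(png_path):
            os.remove(png_path)
```

Usage would look like `with svg_as_png("logo.svg") as png: text = pytesseract.image_to_string(Image.open(png))`, where `logo.svg` is a placeholder file name.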

🚀 Threaded OCR

OCR runs in a background thread to keep the UI responsive.
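Tkinter widgets are generally meant to be touched only from the main thread, so a common pattern is to let the worker post results to a queue and have the UI poll it with `after`. The following is a minimal sketch of that pattern under my own naming (`ocr_worker`, `poll_results`, `start_ocr`); `ocr` stands for any callable that turns a path into text.

```python
import queue
import threading


def ocr_worker(paths, ocr, results):
    """Run ocr(path) for each path and post messages to the results queue."""
    total = len(paths)
    for idx, path in enumerate(paths, 1):
        try:
            text = ocr(path)
        except Exception as exc:
            text = f"OCR failed: {exc}"
        results.put(("progress", idx / total * 100, path, text))
    results.put(("done", None, None, None))


def poll_results(widget, results, on_progress, on_done, interval_ms=50):
    """Drain the queue on the Tk main thread; reschedule until 'done' arrives."""
    try:
        while True:
            kind, percent, path, text = results.get_nowait()
            if kind == "done":
                on_done()
                return
            on_progress(percent, path, text)
    except queue.Empty:
        # Nothing left for now: check again shortly without blocking the UI
        widget.after(interval_ms, poll_results, widget, results,
                     on_progress, on_done, interval_ms)


def start_ocr(widget, paths, ocr, on_progress, on_done):
    """Start the worker thread and begin polling from the main thread."""
    results = queue.Queue()
    threading.Thread(target=ocr_worker, args=(paths, ocr, results), daemon=True).start()
    poll_results(widget, results, on_progress, on_done)
```

Here `widget` can be the root window, `on_progress` would update the progress bar and output label, and `on_done` would reset the bar.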

💾 Export Options

Save extracted text as:

.txt

.pdf

.docx
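The three exporters can also be driven by the file extension the user picks, instead of a Yes/No/Cancel prompt. This is a sketch of that alternative, not the app's current behavior; `export_text` and the `EXPORTERS` table are names I made up, and the ReportLab and python-docx imports are kept inside the functions that need them.

```python
import os


def save_txt(text, path):
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)


def save_docx(text, path):
    from docx import Document

    doc = Document()
    for line in text.split("\n"):
        doc.add_paragraph(line)
    doc.save(path)


def save_pdf(text, path):
    from xml.sax.saxutils import escape
    from reportlab.lib.pagesizes import A4
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.platypus import Paragraph, SimpleDocTemplate

    styles = getSampleStyleSheet()
    # Escape each line so characters like & and < are not read as markup
    story = [Paragraph(escape(line) or "&nbsp;", styles["Normal"])
             for line in text.split("\n")]
    SimpleDocTemplate(path, pagesize=A4).build(story)


EXPORTERS = {".txt": save_txt, ".pdf": save_pdf, ".docx": save_docx}


def export_text(text, path):
    """Pick an exporter from the file extension and write `text` to `path`."""
    ext = os.path.splitext(path)[1].lower()
    exporter = EXPORTERS.get(ext)
    if exporter is None:
        raise ValueError(f"Unsupported export format: {ext or '(none)'}")
    exporter(text, path)
```

With this in place, a single `asksaveasfilename` dialog listing all three file types would be enough to cover every export format.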

🧠 Why This Project Is Useful

This project touches real-world Python topics:

GUI architecture

Thread safety

File validation & limits

UX feedback (progress bars)

Temporary file handling

Cross-format exporting
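As an example of the file validation and limits mentioned above, the checks can be pulled out of the UI class into a plain function that is easy to test. This is a sketch: `validate_files` and its parameters are hypothetical and only mirror the rules used in `add_images`.

```python
import os


def validate_files(paths, allowed_exts, max_files, max_size_mb, already_loaded=0):
    """Split paths into (accepted, rejected); rejected holds (path, reason) pairs."""
    accepted, rejected = [], []
    slots = max(0, max_files - already_loaded)
    for path in paths:
        ext = os.path.splitext(path)[1].lower()
        if ext not in allowed_exts:
            rejected.append((path, "unsupported format"))
        elif not os.path.isfile(path):
            rejected.append((path, "not a file"))
        elif os.path.getsize(path) > max_size_mb * 1024 * 1024:
            rejected.append((path, "too large"))
        elif len(accepted) >= slots:
            rejected.append((path, "file limit reached"))
        else:
            accepted.append(path)
    return accepted, rejected
```

The UI layer then only has to show the rejected list in a message box and add the accepted paths to the listbox.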

It's perfect if you're learning:

Desktop app development

OCR workflows

Tkinter beyond basics

🔮 Ideas for Improvements

📄 PDF input OCR

📷 Webcam OCR

🧠 Layout-aware OCR

🌍 Auto language detection

📦 PyInstaller builds for Windows/macOS

🧡 Final Thoughts

ExtractMate proves you can build modern-looking, practical desktop tools with Python.
