DEV Community

Mate Technologies

🚀 Build a Powerful File Search System with Python (Step-by-Step Guide)

In this beginner-friendly tutorial, we'll build a File Search System desktop app using Python.
It can search inside:

✅ .txt

✅ .csv

✅ .xlsx

✅ .pdf

✅ .xml

It also supports:

🔎 Regex search

📂 File type filtering

🚫 Folder exclusions

📊 Progress tracking (ETA + speed)

📀 Export results to CSV

By the end, you'll have a fully working desktop search tool.

👉 GitHub Repository:
https://github.com/rogers-cyber/File-Search-System

📦 Step 1: Install Required Libraries

First, install dependencies:

pip install pandas PyPDF2 ttkbootstrap openpyxl

We use:

tkinter → GUI (built-in)

ttkbootstrap → Modern UI theme

pandas → Read Excel files

openpyxl → Engine pandas uses to open .xlsx files

PyPDF2 → Read PDFs

xml.etree.ElementTree → Read XML (built-in)
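Before moving on, it can help to confirm the installs worked. The snippet below is a minimal sketch and not part of the original project: `missing_modules` and `REQUIRED_MODULES` are hypothetical names, and the check relies only on `importlib.util.find_spec` from the standard library.

```python
import importlib.util

# Import names for the libraries listed above. These are the names used in
# `import` statements, which can differ from the pip package name.
REQUIRED_MODULES = ["tkinter", "ttkbootstrap", "pandas", "openpyxl", "PyPDF2"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

If anything is reported missing, rerun the pip command for that package before continuing.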

πŸ— Step 2 β€” Basic App Setup

Let’s create our main window.

import tkinter as tk
import ttkbootstrap as tb
from ttkbootstrap.constants import *

APP_NAME = "File Search System"
APP_VERSION = "1.0.1"

app = tk.Tk()
style = tb.Style(theme="superhero")

app.title(f"{APP_NAME} {APP_VERSION}")
app.geometry("1230x680")

app.mainloop()  # keep this as the last line; widgets from later steps go above it

💡 What's happening?

ttkbootstrap adds a modern theme

"superhero" gives us a dark professional UI

geometry() sets window size

Run this and you'll see your app window!

📂 Step 3: Add Search Settings Section

We need:

Root folder selector

Search term input

Regex toggle

from tkinter import filedialog

frame1 = tb.Labelframe(app, text="Search Settings", padding=10)
frame1.pack(fill="x", padx=10, pady=6)

root_path = tk.StringVar()

tb.Label(frame1, text="Root Folder:", width=12).pack(side="left")
tb.Entry(frame1, textvariable=root_path, width=50).pack(side="left", padx=6)

tb.Button(
    frame1,
    text="Browse",
    command=lambda: root_path.set(filedialog.askdirectory())
).pack(side="left", padx=6)

💡 What's happening?

StringVar() stores dynamic input values

filedialog.askdirectory() opens folder picker

Labelframe groups UI elements neatly

Add Search Term + Regex Toggle

search_var = tk.StringVar()
regex_var = tk.BooleanVar()

tb.Label(frame1, text="Search Term:", width=12).pack(side="left", padx=(12,0))
tb.Entry(frame1, textvariable=search_var, width=40).pack(side="left", padx=6)

tb.Checkbutton(
    frame1,
    text="Regex",
    variable=regex_var,
    bootstyle="info-round-toggle"
).pack(side="left", padx=12)

🗂 Step 4: Add File Filters

We'll let users:

Filter file types

Exclude folders

frame2 = tb.Labelframe(app, text="Filters", padding=10)
frame2.pack(fill="x", padx=10, pady=6)

file_filter = tk.StringVar(value="*")
exclude_folders = tk.StringVar(value="node_modules .git __pycache__")

tb.Label(frame2, text="File Types:", width=12).pack(side="left")
tb.Entry(frame2, textvariable=file_filter, width=30).pack(side="left", padx=6)

tb.Label(frame2, text="Exclude Folders:", width=16).pack(side="left", padx=(12,2))
tb.Entry(frame2, textvariable=exclude_folders, width=40).pack(side="left", padx=6)

💡 Example filters:
*.txt *.csv *.xlsx *.pdf *.xml
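To see how a filter string like the one above is interpreted, here is a small standalone sketch. `matches_filter` is a hypothetical helper (the app's own version appears in Step 7). It assumes patterns are separated by spaces and that an empty filter means "match everything", which is a choice made for this example only.

```python
import fnmatch


def matches_filter(filename, filter_text):
    """Return True if filename matches any space-separated glob pattern."""
    patterns = filter_text.lower().split()
    if not patterns or "*" in patterns:
        return True  # assumption: an empty filter means "all files"
    name = filename.lower()
    # fnmatchcase skips OS-specific case folding; both sides are lowercased here
    return any(fnmatch.fnmatchcase(name, p) for p in patterns)
```

Lowercasing both the file name and the patterns keeps the behavior the same on Windows, macOS, and Linux, so `Report.PDF` still matches `*.pdf`.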

📊 Step 5: Add Progress Bar

We want:

% progress

ETA

Speed

Match counter

frame3 = tb.Labelframe(app, text="Progress", padding=8)
frame3.pack(fill="x", padx=10)

progress_var = tk.IntVar()

tb.Progressbar(
    frame3,
    variable=progress_var,
    maximum=100,
    length=500
).pack(side="left", padx=10)

eta_lbl = tb.Label(frame3, text="ETA: --")
eta_lbl.pack(side="left", padx=10)

spd_lbl = tb.Label(frame3, text="Speed: -- files/s")
spd_lbl.pack(side="left", padx=10)

counter_lbl = tb.Label(frame3, text="Matches: 0")
counter_lbl.pack(side="right", padx=10)

📄 Step 6: Display Results (Treeview)

We'll show:

File path

Match preview

frame4 = tb.Labelframe(
    app,
    text="Search Results (Click=Copy | Double-Click=Open)",
    padding=10
)
frame4.pack(fill="both", expand=True, padx=10, pady=6)

cols = ("path", "preview")

tree = tb.Treeview(frame4, columns=cols, show="headings")

tree.heading("path", text="File Path")
tree.column("path", width=550, anchor="w")

tree.heading("preview", text="Match Preview")
tree.column("preview", width=500, anchor="w")

tree.pack(side="left", fill="both", expand=True)

πŸ” Step 7 β€” Search Logic (Core Engine)

Now the fun part.

We walk through folders using os.walk().

import os
import fnmatch

def allowed_file(name):
    patterns = file_filter.get().lower().split()
    return "*" in patterns or any(
        fnmatch.fnmatch(name, p) for p in patterns
    )

💡 This filters files based on user input patterns.

Excluding folders

def excluded_dir(path):
    parts = set(path.lower().split(os.sep))
    return any(
        ex in parts for ex in exclude_folders.get().lower().split()
    )
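Note that excluded_dir() checks every component of the path, including the folders above your chosen root. As an alternative sketch, assuming you only want to skip folders by name below the root, you can prune the directory list that `os.walk` hands you so excluded folders are never entered. `walk_files` is a hypothetical helper, not the project's code.

```python
import os


def walk_files(root, excluded_names):
    """Yield file paths under root, skipping directories named in excluded_names."""
    excluded = {name.lower() for name in excluded_names}
    for current, dirs, files in os.walk(root):
        # Editing dirs in place tells os.walk not to descend into those folders
        dirs[:] = [d for d in dirs if d.lower() not in excluded]
        for name in files:
            yield os.path.join(current, name)
```

In the app this could be called as `walk_files(root_path.get(), exclude_folders.get().split())`. Pruning up front also avoids visiting every subfolder of a large excluded tree such as node_modules.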

📘 Step 8: Reading Different File Types

Here's an example for .txt files:

import re

def read_text_file(path, query, use_regex=False):
    with open(path, "r", errors="ignore") as f:
        for i, line in enumerate(f, 1):
            if use_regex:
                if re.search(query, line, re.IGNORECASE):
                    return True, f"Line {i}: {line.strip()}"
            else:
                if query.lower() in line.lower():
                    return True, f"Line {i}: {line.strip()}"
    return False, ""
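One thing to watch with regex mode: a pattern typed by the user may be invalid, and `re.search` would then raise `re.error` in the middle of a scan. A possible refinement, sketched here under that assumption, is to compile the pattern once before scanning and reuse it for every line. `build_matcher` is a hypothetical helper name.

```python
import re


def build_matcher(query, use_regex=False):
    """Return a function line -> bool, or raise ValueError for a bad pattern."""
    if use_regex:
        try:
            pattern = re.compile(query, re.IGNORECASE)
        except re.error as exc:
            raise ValueError(f"Invalid regex {query!r}: {exc}") from exc
        return lambda line: pattern.search(line) is not None
    # Plain mode: case-insensitive substring test, as in read_text_file
    needle = query.lower()
    return lambda line: needle in line.lower()
```

A GUI could catch the `ValueError` once, show it with `messagebox.showerror`, and skip the search instead of failing partway through.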

You would create similar handlers for:

Excel (pandas.read_excel)

PDF (PyPDF2.PdfReader)

XML (ElementTree)
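As an illustration of one such handler, here is a minimal sketch for XML that uses only the standard library. It is not taken from the repository: `read_xml_file` and `search_xml_root` are hypothetical names, and searching element text plus attribute values is an assumption about what "search inside XML" should mean.

```python
import re
import xml.etree.ElementTree as ET


def search_xml_root(root, query, use_regex=False):
    """Search element text and attribute values; return (found, preview)."""
    for elem in root.iter():
        # Check the element's own text, then its attribute values
        candidates = [elem.text or ""] + list(elem.attrib.values())
        for text in candidates:
            text = text.strip()
            if not text:
                continue
            if use_regex:
                hit = re.search(query, text, re.IGNORECASE) is not None
            else:
                hit = query.lower() in text.lower()
            if hit:
                return True, f"<{elem.tag}>: {text}"
    return False, ""


def read_xml_file(path, query, use_regex=False):
    """Same return shape as read_text_file, so the search loop can call either."""
    try:
        root = ET.parse(path).getroot()
    except (ET.ParseError, OSError):
        return False, ""  # malformed or unreadable XML counts as no match
    return search_xml_root(root, query, use_regex)
```

Keeping the same `(found, preview)` return value means the search loop could pick a handler by file extension without any other changes.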

🚀 Step 9: Run Search with Progress Tracking

import time

def run_search():
    root = root_path.get()
    query = search_var.get()
    start = time.time()

    files = []

    for r, d, f in os.walk(root):
        if excluded_dir(r):
            d[:] = []
            continue

        for file in f:
            if allowed_file(file.lower()):
                files.append(os.path.join(r, file))

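Now process the files. This code continues inside run_search(), so keep it indented:

    total = len(files)
    done = 0
    matches = 0

    for path in files:
        # Honor the Regex toggle from Step 3
        found, preview = read_text_file(path, query, regex_var.get())

        done += 1
        progress_var.set(int(done / total * 100))

        elapsed = time.time() - start
        speed = done / elapsed if elapsed else 0
        eta = (total - done) / speed if speed else 0

        spd_lbl.config(text=f"Speed: {speed:.1f} files/s")
        eta_lbl.config(text=f"ETA: {eta:.0f}s")

        if found:
            matches += 1
            tree.insert("", "end", values=(path, preview))
            counter_lbl.config(text=f"Matches: {matches}")

        # Let Tk redraw so the bar and labels update during the loop
        app.update_idletasks()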
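The progress arithmetic can also be pulled out of the loop so it is easy to check without a GUI. This is a sketch with hypothetical helper names (`progress_stats`, `format_eta`); it assumes ETA is estimated as the remaining file count divided by the average speed so far.

```python
def progress_stats(done, total, elapsed):
    """Return (percent, files_per_second, eta_seconds) for a running scan."""
    percent = int(done / total * 100) if total else 0
    speed = done / elapsed if elapsed > 0 else 0.0
    remaining = total - done
    eta = remaining / speed if speed > 0 else 0.0
    return percent, speed, eta


def format_eta(seconds):
    """Format seconds as M:SS for a label such as eta_lbl."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}:{secs:02d}"
```

For example, 50 of 200 files after 10 seconds gives 25%, 5 files/s, and about 30 seconds left, which could be shown with `eta_lbl.config(text=f"ETA: {format_eta(eta)}")`.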

📀 Step 10: Export Results to CSV

import csv
from tkinter import messagebox

def export_results():
    file = filedialog.asksaveasfilename(
        defaultextension=".csv",
        filetypes=[("CSV files","*.csv")]
    )

    if not file:
        return

    with open(file,"w",newline="",encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["File Path","Preview"])
        for row in tree.get_children():
            writer.writerow(tree.item(row)["values"])

    messagebox.showinfo("Success","Results exported!")
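If you want to check the export logic without opening a window, the CSV writing can be separated from the Treeview. The sketch below assumes the rows are plain (path, preview) pairs; `write_results` is a hypothetical helper that accepts any open text file object.

```python
import csv
import io


def write_results(fileobj, rows):
    """Write a header plus (file_path, preview) rows; return the row count."""
    writer = csv.writer(fileobj)
    writer.writerow(["File Path", "Preview"])
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


if __name__ == "__main__":
    # In-memory demo; the app would pass a file opened with newline=""
    buffer = io.StringIO()
    write_results(buffer, [("notes.txt", "Line 3: hello, world")])
    print(buffer.getvalue())
```

Inside export_results(), the rows could be gathered with `[tree.item(r)["values"] for r in tree.get_children()]` and passed to this function.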

🎯 Final Result

You now have:

A modern GUI

Multi-file-type content search

Regex support

Folder exclusions

Real-time performance metrics

CSV export

💡 Bonus Ideas

You can improve it by adding:

🔥 Multithreading for faster scanning

🌈 Highlight matched text

📦 Convert to .exe using PyInstaller

☁️ Cloud storage support
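As a starting point for the multithreading idea, here is a minimal sketch using `concurrent.futures.ThreadPoolExecutor` from the standard library. The function names and the default of 8 workers are assumptions for illustration, it only handles plain-text files, and any speedup depends on your disk and file sizes.

```python
from concurrent.futures import ThreadPoolExecutor


def search_file(path, query):
    """Return (path, preview) for the first matching line, or None."""
    try:
        with open(path, "r", errors="ignore") as f:
            for i, line in enumerate(f, 1):
                if query.lower() in line.lower():
                    return path, f"Line {i}: {line.strip()}"
    except OSError:
        pass  # unreadable files are skipped
    return None


def search_files_parallel(paths, query, max_workers=8):
    """Scan files on a thread pool and return matches in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda p: search_file(p, query), paths)
        return [r for r in results if r is not None]
```

Tkinter widgets should only be updated from the main thread, so collect the matches first and insert them into the Treeview afterwards.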

📌 Full Source Code

👉 GitHub Repository:
https://github.com/rogers-cyber/File-Search-System
