Mate Technologies

Build a File Lock & Duplicate Detector in Python

In this tutorial, we’ll create a Python tool that scans folders, detects locked files, and highlights duplicates. The app features a modern GUI with ttkbootstrap and multi-threaded scanning for speed.

By the end, you’ll have a fully functional desktop application to help manage your files.

  1. Set Up and Install Dependencies

We’ll use the following Python modules. Only ttkbootstrap needs to be installed; the rest ship with Python’s standard library:

tkinter – built-in GUI toolkit

ttkbootstrap – modern themes for tkinter

hashlib – for SHA1 hashing

concurrent.futures – for multi-threading

Install ttkbootstrap via pip:

pip install ttkbootstrap
  2. Import Required Modules

At the top of your script, import all modules:

import os
import hashlib
import threading
import tkinter as tk
from tkinter import filedialog, messagebox
import tkinter.ttk as ttk
import ttkbootstrap as tb
import time
import concurrent.futures
import subprocess
import sys

Explanation:

os and subprocess for interacting with files and system commands.

hashlib to calculate file hashes for duplicate detection.

tkinter + ttkbootstrap for the GUI.

concurrent.futures + threading for multi-threaded scanning.

  3. Define App Info

Set up some basic information for your app:

APP_NAME = "File Lock & Duplicate Detector"
APP_VERSION = "2.0"
APP_AUTHOR = "Mate Technologies"
APP_WEBSITE = "https://matetools.gumroad.com"

These constants are displayed in the GUI and the “About” section.

  4. Create the Main App Class

We’ll define the main app structure with tkinter:

class FileScannerApp:
    def __init__(self, master):
        self.master = master
        self.master.title(f"{APP_NAME} {APP_VERSION}")
        self.master.geometry("1000x650")
        self.style = tb.Style(theme="superhero")  # Dark modern theme

Explanation:

master is the main window.

ttkbootstrap provides the “superhero” dark theme.

  5. Add Variables for State Management
        self.folder_var = tk.StringVar()
        self.status_var = tk.StringVar(value="Idle")
        self.progress_val = tk.DoubleVar(value=0)
        self.hash_map = {}  # SHA1 -> first occurrence
        self.stop_event = threading.Event()
        self.pause_event = threading.Event()

Explanation:

StringVar and DoubleVar are tkinter variables linked to the GUI.

hash_map stores file hashes to detect duplicates.

stop_event and pause_event control the scan flow.
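
How two events can gate a worker loop is sketched below. This is a minimal illustration rather than the app's actual handlers: the worker function is hypothetical, and the convention that a set pause_event means "paused" is an assumption.

```python
import threading
import time

stop_event = threading.Event()
pause_event = threading.Event()  # assumption: set means "paused"

def worker(items, processed):
    """Process items one by one, honoring pause and stop requests."""
    for item in items:
        # Stop takes priority: leave the loop as soon as it is requested
        if stop_event.is_set():
            break
        # While paused, sleep briefly instead of spinning, but stay stoppable
        while pause_event.is_set() and not stop_event.is_set():
            time.sleep(0.05)
        if stop_event.is_set():
            break
        processed.append(item)

processed = []
pause_event.set()                  # start in the paused state
t = threading.Thread(target=worker, args=(range(5), processed))
t.start()
time.sleep(0.2)
paused_snapshot = list(processed)  # nothing is processed while paused
pause_event.clear()                # resume
t.join()
```

With this convention, a "Pause Scan" handler would call pause_event.set(), "Resume Scan" would call pause_event.clear(), and "Stop Scan" would call stop_event.set().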

  6. Build the GUI

We can start adding the UI elements:

        tb.Label(self.master, text=f"{APP_NAME}", font=("Segoe UI", 18, "bold")).pack(pady=(10, 2))
        tb.Label(self.master, text="Scan folders, detect locked files, and highlight duplicates",
                 font=("Segoe UI", 10, "italic"), foreground="#9ca3af").pack(pady=(0, 10))

Explanation:

Two labels: one for the app title, another for a subtitle.

6.1 Top Buttons

        top_frame = tb.Frame(self.master, padding=10)
        top_frame.pack(fill="x")

        tb.Button(top_frame, text="Select Folder", bootstyle="primary", command=self.select_folder).pack(side="left", padx=5)
        tb.Button(top_frame, text="Highlight Duplicates", bootstyle="warning", command=self.highlight_duplicates).pack(side="left", padx=5)
        tb.Button(top_frame, text="Pause Scan", bootstyle="secondary", command=self.pause_scan).pack(side="left", padx=5)
        tb.Button(top_frame, text="Resume Scan", bootstyle="success", command=self.resume_scan).pack(side="left", padx=5)
        tb.Button(top_frame, text="Stop Scan", bootstyle="danger", command=self.stop_scan).pack(side="left", padx=5)
        tb.Button(top_frame, text="About", bootstyle="info", command=self.show_about).pack(side="right", padx=5)

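Explanation:

Each button is bound to a method on the class: selecting a folder to scan, highlighting duplicates, pausing, resuming, or stopping a scan, and opening the “About” info.

Because command=self.select_folder and the other handlers are looked up when each button is created, these methods must be defined on FileScannerApp before the window is built.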

  7. Create the Treeview Table
        table_frame = tb.Frame(self.master)
        table_frame.pack(fill="both", expand=True, padx=10, pady=10)

        self.tree = ttk.Treeview(table_frame, columns=("status", "size", "hash"), show="headings")
        self.tree.heading("status", text="Status")
        self.tree.heading("size", text="Size (MB)")
        self.tree.heading("hash", text="SHA1 Hash")
        self.tree.pack(side="left", fill="both", expand=True)

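Explanation:

Treeview displays one row of file information per scanned file.

Columns: Status (Safe, Locked, Unknown), Size (MB), SHA1 Hash.

Because show="headings" hides the tree column (#0), only these three columns are visible.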

7.1 Add Scrollbar

        scrollbar = ttk.Scrollbar(table_frame, orient="vertical", command=self.tree.yview)
        self.tree.configure(yscrollcommand=scrollbar.set)
        scrollbar.pack(side="right", fill="y")
  8. File Scanning Logic

8.1 Detect File Status

    def get_file_status(self, path):
        try:
            # "r+" requests read and write access without truncating the file
            with open(path, "r+"):
                pass
            return "Safe/Deletable", "green"
        except OSError:  # PermissionError is a subclass of OSError
            return "Locked/Undeletable", "red"

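Explanation:

Tries to open the file in read/write mode ("r+"), which does not truncate or modify it.

If the open fails, the file is considered “Locked/Undeletable.”

This is a heuristic: a failed open can also mean the file is read-only or was removed mid-scan, and on POSIX systems whether a file can be deleted depends on the permissions of its directory rather than of the file itself.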
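
To see the check in isolation, here is a small sketch that rewrites it as a standalone function (an adaptation for illustration, not part of the app) and runs it against a temporary file and a missing path.

```python
import os
import tempfile

def get_file_status(path):
    """Standalone form of the check: can the file be opened for read/write?"""
    try:
        with open(path, "r+"):
            pass
        return "Safe/Deletable", "green"
    except OSError:
        return "Locked/Undeletable", "red"

# A freshly created temporary file is writable by its owner
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"sample")
writable_status = get_file_status(tmp.name)
os.remove(tmp.name)

# After removal the path no longer exists; open() raises FileNotFoundError,
# an OSError subclass, so the check also reports "Locked/Undeletable"
missing_status = get_file_status(tmp.name)
```

The second call shows why the label is only a hint: a missing file is reported the same way as a locked one.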

8.2 Calculate SHA1 Hash

    def file_hash(self, path, block_size=65536):
        sha1 = hashlib.sha1()
        try:
            with open(path, "rb") as f:
                for block in iter(lambda: f.read(block_size), b""):
                    sha1.update(block)
        except Exception:
            return "ERROR"
        return sha1.hexdigest()

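Explanation:

Reads the file in 64 KB blocks (block_size=65536), so large files are never loaded into memory at once.

Returns the SHA1 hex digest, which is used to detect duplicates: files with identical content produce the same digest.

If the file cannot be read, the method returns the string "ERROR" instead of a digest.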
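
As a quick sanity check, the sketch below confirms that block-wise hashing gives the same digest as hashing the whole content at once. It uses a standalone variant of the function without the error handling; the temporary file and its size are arbitrary choices for the demo.

```python
import hashlib
import os
import tempfile

def file_hash(path, block_size=65536):
    """Return the SHA1 hex digest of a file, read in fixed-size blocks."""
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b""
        for block in iter(lambda: f.read(block_size), b""):
            sha1.update(block)
    return sha1.hexdigest()

# Write 200,000 bytes so the file spans several 64 KB blocks
data = b"x" * 200_000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)

chunked_digest = file_hash(tmp.name)
whole_digest = hashlib.sha1(data).hexdigest()
os.remove(tmp.name)
```

Both digests match because update() calls are cumulative: feeding the blocks in order is equivalent to hashing their concatenation.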

8.3 Scan Folder with Multi-Threading

    def scan_folder_thread(self, folder):
        # Collect every file path under the folder, including subfolders
        file_list = [os.path.join(root, f) for root, _, files in os.walk(folder) for f in files]
        with concurrent.futures.ThreadPoolExecutor(max_workers=8) as executor:
            # Map each future back to the path it was submitted for
            futures = {executor.submit(self.process_file, path): path for path in file_list}
            # Results arrive in completion order, not submission order
            for future in concurrent.futures.as_completed(futures):
                full_path, status, color, size_mb, sha1, dup_tag = future.result()

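Explanation:

Uses ThreadPoolExecutor with up to 8 worker threads to speed up scanning.

Files are processed concurrently, and as_completed yields each result as soon as it is ready, so rows arrive in completion order rather than folder order.

process_file is expected to return the six values unpacked here: path, status, color, size in MB, SHA1 hash, and a duplicate tag.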
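
The same pattern can be tried without the GUI. The sketch below is a simplified, standalone variant: its process_file is a hypothetical stand-in that returns only three values (path, size in MB, SHA1 digest) instead of the six the app unpacks, and scan_folder collects results in a dictionary rather than a Treeview.

```python
import concurrent.futures
import hashlib
import os
import tempfile

def process_file(path):
    """Hypothetical worker: return (path, size in MB, SHA1 hex digest)."""
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            sha1.update(block)
    size_mb = os.path.getsize(path) / (1024 * 1024)
    return path, size_mb, sha1.hexdigest()

def scan_folder(folder, max_workers=8):
    """Hash every file under folder in a thread pool; return results keyed by path."""
    file_list = [os.path.join(root, name)
                 for root, _, files in os.walk(folder) for name in files]
    results = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(process_file, p): p for p in file_list}
        # as_completed yields futures as they finish, not in submission order
        for future in concurrent.futures.as_completed(futures):
            path, size_mb, digest = future.result()
            results[path] = (size_mb, digest)
    return results

# Demo on a throwaway folder with three small files, two of them identical
with tempfile.TemporaryDirectory() as folder:
    for name, content in [("a.txt", b"same"), ("b.txt", b"same"), ("c.txt", b"other")]:
        with open(os.path.join(folder, name), "wb") as f:
            f.write(content)
    results = scan_folder(folder)
    digests = {os.path.basename(p): d for p, (_, d) in results.items()}
```

Keying the results by path makes the output independent of the order in which the threads finish.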

  9. Highlight Duplicates
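    def highlight_duplicates(self):
        items = self.tree.get_children()
        hash_map = {}  # SHA1 -> first row seen with that hash
        for item in items:
            sha1 = self.tree.item(item, "values")[2]
            if sha1 == "ERROR":
                continue  # unreadable files share this placeholder, not content
            if sha1 in hash_map:
                self.tree.item(item, tags=("duplicate",))
            else:
                hash_map[sha1] = item

Explanation:

Checks each row’s SHA1 hash, skipping rows where hashing failed ("ERROR") so unreadable files are not flagged as copies of each other.

The first row with a given hash is left untagged; every later row with the same hash gets the duplicate tag.

The tag is only visible once it is styled (e.g., a gold highlight via self.tree.tag_configure("duplicate", background="gold")).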
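
The grouping idea can also be expressed without a Treeview. In the sketch below, find_duplicates is a hypothetical helper and the rows are made-up (path, hash) pairs; it skips the "ERROR" placeholder that file_hash returns for unreadable files.

```python
from collections import defaultdict

def find_duplicates(rows):
    """Group paths by hash; keep only hashes that occur more than once.

    rows is an iterable of (path, sha1) pairs. The "ERROR" placeholder is
    skipped so unreadable files are not reported as copies of each other.
    """
    groups = defaultdict(list)
    for path, sha1 in rows:
        if sha1 == "ERROR":
            continue
        groups[sha1].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}

rows = [
    ("a.txt", "h1"),
    ("b.txt", "h1"),
    ("c.txt", "h2"),
    ("d.bin", "ERROR"),
    ("e.bin", "ERROR"),
]
duplicates = find_duplicates(rows)
```

Unlike the first-occurrence approach, this returns every member of each duplicate group, which is convenient if the user should choose which copy to keep.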

  10. Run the App

Finally, create and run the main window:

if __name__ == "__main__":
    root = tk.Tk()
    app = FileScannerApp(root)
    root.mainloop()

✅ Conclusion

You now have a fully functional File Lock & Duplicate Detector!

Next steps / ideas:

Add file deletion confirmation for duplicates

Export scan results to CSV

Add more themes for the GUI
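
As a starting point for the CSV export idea, here is a minimal sketch. The export_results helper, the column names, and the sample row are all assumptions for illustration; the app itself does not define them.

```python
import csv
import io

def export_results(rows, out):
    """Write (path, status, size_mb, sha1) rows as CSV to a file-like object."""
    writer = csv.writer(out)
    writer.writerow(["path", "status", "size_mb", "sha1"])
    for path, status, size_mb, sha1 in rows:
        writer.writerow([path, status, f"{size_mb:.2f}", sha1])

rows = [("a.txt", "Safe/Deletable", 0.5, "abc123")]
buffer = io.StringIO()  # stands in for a real file in this demo
export_results(rows, buffer)
csv_text = buffer.getvalue()
```

When writing to a real file, open it with newline="" so the csv module controls line endings itself.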

GitHub: FileLockDuplicateDetector
