DupCleaner PRO v1.0.0 is a professional Python desktop app that helps you find, preview, and safely delete duplicate files. In this tutorial, we’ll walk through how it works, step by step, with code snippets and explanations.
Step 1: Project Setup
First, clone the repository from GitHub:
git clone https://github.com/rogers-cyber/DupCleanerPRO.git
cd DupCleanerPRO
Then, create a Python virtual environment (optional but recommended):
python -m venv venv
source venv/bin/activate # Linux/macOS
venv\Scripts\activate # Windows
Install the required Python packages:
pip install ttkbootstrap pillow send2trash
ttkbootstrap: Modern Tkinter themes for your GUI
Pillow: Image handling for thumbnails
send2trash: Safe deletion (moves files to Recycle Bin)
Step 2: Basic GUI Setup
We’ll use ttkbootstrap and Tkinter to create the main window:
import ttkbootstrap as tb
# Create main application window
app = tb.Window(themename="darkly")
app.title("DupCleaner PRO")
app.geometry("1200x680")
# Add title label
tb.Label(app, text="DupCleaner PRO", font=("Segoe UI", 22, "bold")).pack(pady=(10, 2))
app.mainloop()
Explanation:
tb.Window creates the main window with a dark theme
tb.Label adds a title
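app.mainloop() starts the GUI event loop; it blocks until the window closes, so keep it as the last line of the script as you add the later steps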
Step 3: Add File and Folder Selection
Users need to choose files or folders to scan. Let’s add a listbox with buttons:
from tkinter import filedialog
target_paths = []
def add_files():
    files = filedialog.askopenfilenames()
    if files:
        for f in files:
            if f not in target_paths:
                target_paths.append(f)
                target_listbox.insert("end", f)

def add_folder():
    folder = filedialog.askdirectory()
    if folder and folder not in target_paths:
        target_paths.append(folder)
        target_listbox.insert("end", folder)
askopenfilenames() lets users pick multiple files
askdirectory() lets users select a folder
target_paths stores all selected paths
Add a Listbox to show selected paths:
import tkinter as tk
target_listbox = tk.Listbox(app, height=5, width=60)
target_listbox.pack(pady=10)
Step 4: Scanning for Duplicates
We’ll group files by size first, then hash only the files that share a size:
import os, hashlib
from collections import defaultdict
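def file_hash(path, chunk_size=65536):
    # Read in chunks so large files are never loaded into memory at once
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            md5.update(data)
    return md5.hexdigest()

def scan_duplicates(files):
    # Pass 1: group by size; a file with a unique size cannot have a duplicate
    size_map = defaultdict(list)
    for f in files:
        try:
            size_map[os.path.getsize(f)].append(f)
        except OSError:
            continue  # skip files that vanished or cannot be accessed
    duplicates = []
    # Pass 2: hash only the files that share a size
    for group in size_map.values():
        if len(group) > 1:
            hash_map = defaultdict(list)
            for f in group:
                try:
                    hash_map[file_hash(f)].append(f)
                except OSError:
                    continue  # skip files that cannot be read
            for dup_group in hash_map.values():
                if len(dup_group) > 1:
                    duplicates.append(dup_group)
    return duplicates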
Explanation:
size_map groups files by size (fast pre-check)
hash_map groups files by MD5 hash (accurate duplicate detection)
Only groups with more than one file are considered duplicates
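Note that scan_duplicates expects a flat list of file paths, while target_paths from Step 3 can also contain folders. Here is a minimal sketch of one way to bridge the two. It assumes a hypothetical helper named collect_files (not part of the original code) that walks each selected folder with os.walk:

```python
import os

def collect_files(paths):
    """Expand a mix of file and folder paths into a flat list of files.

    collect_files is a hypothetical helper, not taken from the original project.
    """
    found = []
    seen = set()  # avoid listing a file twice if it was picked directly and via its folder
    for p in paths:
        if os.path.isdir(p):
            # Walk the folder tree and gather every file below it
            for root, _dirs, names in os.walk(p):
                for name in names:
                    full = os.path.join(root, name)
                    if full not in seen:
                        seen.add(full)
                        found.append(full)
        elif os.path.isfile(p) and p not in seen:
            seen.add(p)
            found.append(p)
        # Paths that are neither a file nor a folder (e.g. deleted) are ignored
    return found
```

With a helper like this, the scan could be started as scan_duplicates(collect_files(target_paths)).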
Step 5: Display Results in a Treeview
We can show duplicate groups using a Treeview widget:
from ttkbootstrap.constants import *
tree = tb.Treeview(app, columns=("group", "count"), show="headings")
tree.heading("group", text="Group")
tree.heading("count", text="Count")
tree.pack(fill="both", expand=True, padx=10, pady=10)
def show_duplicates(duplicates):
    # Clear old rows before inserting the new results
    tree.delete(*tree.get_children())
    for i, group in enumerate(duplicates, 1):
        tree.insert("", "end", values=(f"Group {i}", len(group)))
Explanation:
Treeview shows duplicate groups in a table format
Selecting a group can later be wired up to show its thumbnails or file list
Step 6: Preview Files with Thumbnails
For image files, we can display thumbnails:
from PIL import Image, ImageTk
thumbnail_cache = []
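# Assumption: the original snippet never defines preview_frame; a plain frame is used here
preview_frame = tb.Frame(app)
preview_frame.pack(fill="x", padx=10, pady=5)

def show_thumbnails(files):
    # Remove the previews from the last selection
    for widget in preview_frame.winfo_children():
        widget.destroy()
    thumbnail_cache.clear()
    for f in files:
        if f.lower().endswith((".png", ".jpg", ".jpeg", ".gif", ".bmp")):
            try:
                img = Image.open(f)
                img.thumbnail((120, 120))
                tk_img = ImageTk.PhotoImage(img)
            except OSError:
                continue  # skip unreadable or corrupt images
            thumbnail_cache.append(tk_img)
            lbl = tk.Label(preview_frame, image=tk_img, text=os.path.basename(f), compound="top")
            lbl.pack(side="left", padx=5, pady=5)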
thumbnail_cache keeps a reference to each PhotoImage; Tkinter does not hold one itself, so without it the images would be garbage-collected and show up blank
Only image files are displayed as thumbnails
Step 7: Safe Deletion
We can delete selected duplicates using send2trash:
from send2trash import send2trash
def delete_files(file_list):
    for f in file_list:
        try:
            send2trash(f)
        except Exception as e:
            print(f"Failed to delete {f}: {e}")
Explanation:
send2trash moves files to the Recycle Bin (Trash on macOS and Linux) instead of deleting them permanently
Files removed by mistake can be restored from there
Step 8: Export Results
Users can save duplicate reports to JSON or TXT:
import json
def export_json(duplicates, path="duplicates.json"):
    data = {f"Group {i+1}": lst for i, lst in enumerate(duplicates)}
    # Explicit UTF-8 so non-ASCII file names are written the same way on every platform
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)

def export_txt(duplicates, path="duplicates.txt"):
    with open(path, "w", encoding="utf-8") as f:
        for i, group in enumerate(duplicates, 1):
            f.write(f"Group {i} ({len(group)} files)\n")
            for file in group:
                f.write(f"{file}\n")
            f.write("\n")
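To make the JSON layout concrete, here is a small sketch that writes a report with export_json and reads it back. The sample paths are made up for illustration, and the temporary directory is only there so the example leaves no files behind:

```python
import json
import os
import tempfile

def export_json(duplicates, path="duplicates.json"):
    # Same mapping as in the tutorial: "Group N" -> list of file paths
    data = {f"Group {i+1}": lst for i, lst in enumerate(duplicates)}
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)

# Made-up sample data, for illustration only
sample = [
    ["a/photo.jpg", "b/photo_copy.jpg"],
    ["x/notes.txt", "y/notes.txt", "z/notes.txt"],
]

with tempfile.TemporaryDirectory() as tmp:
    report_path = os.path.join(tmp, "duplicates.json")
    export_json(sample, report_path)
    # Read the report back to inspect its structure
    with open(report_path, encoding="utf-8") as f:
        report = json.load(f)

for name, files in report.items():
    print(name, len(files), "files")
```

Each key is a group label and each value is the list of paths in that group, so the report can be loaded again with json.load without any custom parsing.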
Step 9: Putting It All Together
Combine all the steps into the main app:
GUI layout
File selection
Scan duplicates button
Preview panel
Delete duplicates button
Export buttons
At the end, call:
app.mainloop()
And your DupCleaner PRO is ready to use!
Step 10: Next Steps / Enhancements
Add keep newest/first file logic
Add progress bar and ETA during scanning
Add custom settings save/load
Add About / Help section
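The first enhancement, keeping one file per group, can be sketched without touching the GUI. The helper below is hypothetical: pick_files_to_delete and its keep parameter are illustrative names, not from the original project. It assumes "newest" means the largest modification time reported by os.path.getmtime, and it returns every other file in each group:

```python
import os

def pick_files_to_delete(duplicates, keep="newest"):
    """Return the files to remove, keeping exactly one file per duplicate group.

    Hypothetical helper; `keep` is either "newest" (by modification time)
    or "first" (the first path in each group).
    """
    to_delete = []
    for group in duplicates:
        if len(group) < 2:
            continue  # nothing to remove from a single-file group
        if keep == "newest":
            keeper = max(group, key=os.path.getmtime)
        elif keep == "first":
            keeper = group[0]
        else:
            raise ValueError(f"unknown keep mode: {keep!r}")
        to_delete.extend(f for f in group if f != keeper)
    return to_delete
```

The returned list could then be passed to delete_files from Step 7, so the kept file is never sent to the trash.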
✅ Summary:
This step-by-step guide walked through building DupCleaner PRO, covering file selection, duplicate detection, thumbnails, safe deletion, and export. With these basics, you can extend the app with features like threaded scanning, more file previews, or enhanced UI.
GitHub Repository: https://github.com/rogers-cyber/DupCleanerPRO
