ChatGPT vs. My System Cleanup Script: Who's Winning?

#tutorial #programming #chatgpt #python

Some time ago, while studying Bash scripting, I wanted to go deeper into the topic and looked for practice by solving any task, however small. One of these was a script that deletes temporary files, old dumps, and node_modules folders from long-forgotten projects. I found it again the other day completely by accident. I tested it on a virtual machine: the script works, but it is terribly hacky and unpleasant to look at.

What was my idea? To check whether ChatGPT could do the same job I did, only more competently. The result was quite instructive: the AI did a great job with the architecture, but it also did its best to wreck the system with a couple of lines. Below I'll walk through how it went.

The task is simple: automatically find and delete unnecessary files according to certain rules. My old script was a monolith with a bunch of repeated find and rm -rf calls and awkward attempts to handle errors. Please don't judge me too harshly; I was just learning Bash and its capabilities.

The main problems of my creation

rm -rf commands built by variable concatenation are a game of Russian roulette (concatenation is the joining of two or more strings into one).

A single space in a path, and the script silently "flies" past the target or deletes the wrong thing.

To change the rules, you have to go straight into the code: there are no proper settings at the top.

The script did not log what exactly it deleted (or did not delete?). It worked in silence, which is always alarming.
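
To make the problem with spaces concrete, here is a small illustration in Python. It only imitates how a shell splits an unquoted variable on whitespace, using str.split(); the directory names are made up for the example, and real shells also apply glob expansion, which this sketch ignores.

```python
# Rough imitation of shell word splitting on an unquoted variable.
# Both directory names below are hypothetical.
dirs = "/tmp/cache /home/user/My Projects/node_modules"

# An unquoted $DIRS in `for dir in $DIRS` is split on whitespace,
# which str.split() mimics closely enough for this demonstration.
words = dirs.split()

for word in words:
    print(f"rm -rf would receive: {word!r}")
```

Two intended paths turn into three words, and neither of the last two is the directory that was meant to be removed.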

I sent ChatGPT this brief: "Write a secure and customizable script to search/delete temporary files, caches, and old logs. Add a whitelist of folders that cannot be accessed. Add logging."

Step-by-step code analysis before and after

I'll start by showing that very hacky script, which I am extremely ashamed of. It was really hard to share this.
My version (I added the comments before writing the article, for better understanding):

#!/bin/bash
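# $DIRS is expanded unquoted in the loop below, so a path containing a space is split into two words
# (and ~ inside the double quotes is not expanded to the home directory)
DIRS="/tmp ~/cache ~/projects/*/node_modules"
# Remove everything at once
for dir in $DIRS; do
    echo "Removing $dir"
    rm -rf "$dir"  # Quoting here is too late: the unquoted $DIRS above has already been word-split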
done
# Find and delete all .log and .tmp files older than 30 days
find ~/ -name "*.log" -mtime +30 -exec rm {} \; 2>/dev/null
find ~/ -name "*.tmp" -mtime +30 -exec rm {} \; 2>/dev/null
echo "Cleanup done!"

This code is a product of colossal laziness (after all, I could have "read the Internet" and done better…). It deletes node_modules recursively without looking, and it silently ignores any errors (2>/dev/null), which can hide existing permission problems.
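
For contrast, here is a minimal sketch of the same "find old .log and .tmp files" step that collects errors instead of discarding them. The function name find_old_files and its parameters are my own, not part of either script, and the age check is only an approximation of what find -mtime +30 does.

```python
import os
import tempfile
import time
from pathlib import Path


def find_old_files(root, suffixes, max_age_days):
    """Return (old_files, errors) instead of throwing errors away."""
    cutoff = time.time() - max_age_days * 24 * 60 * 60
    old_files = []
    errors = []

    # os.walk() skips unreadable directories silently unless onerror is given;
    # here every such error is collected so it can be reported later.
    for dirpath, _dirnames, filenames in os.walk(root, onerror=errors.append):
        for name in filenames:
            path = Path(dirpath) / name
            if path.suffix not in suffixes:
                continue
            try:
                if path.stat().st_mtime < cutoff:
                    old_files.append(path)
            except OSError as exc:
                errors.append(exc)
    return old_files, errors


if __name__ == "__main__":
    # Demonstration on a throwaway directory, not on the real home folder.
    with tempfile.TemporaryDirectory() as tmp:
        old_log = Path(tmp) / "old.log"
        old_log.write_text("stale")
        forty_days_ago = time.time() - 40 * 24 * 60 * 60
        os.utime(old_log, (forty_days_ago, forty_days_ago))
        (Path(tmp) / "fresh.log").write_text("new")

        found, problems = find_old_files(tmp, {".log", ".tmp"}, 30)
        print("Would delete:", [p.name for p in found])
        print("Errors:", problems)
```

Nothing is deleted here; the point is only that permission problems end up in a list you can inspect rather than in /dev/null.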

Python version from ChatGPT

I want to point out in advance that Python was a deliberate choice. I wanted to emphasize the difference in approach between a quick throwaway script and a well-thought-out, reliable tool.

When asked for high-quality, secure software, ChatGPT seems to recognize that Bash is not the right tool for the job. Safety is the main concern for a script like this, and Bash has no built-in safeguards for deleting files. Everything depends on correct quoting and on the hope that no variable is accidentally empty.

I gave the prompt: "Write a safe and configurable system cleanup script. It should remove old files, have a blocklist for dangerous directories (like /, /home, etc.), and log its actions to a file. Avoid using shell commands directly."

Here's what the AI suggested (with my comments and later edits added):

# cleanup_agent.py
import os
import logging
import time
from pathlib import Path
# === CONFIGURATION ===
TARGETS = [
    {
        "path": os.path.expanduser("~/Downloads"),
        "patterns": ["*.tmp", "*.temp", "*.log"],
        "max_age_days": 7
    },
    {
        "path": os.path.expanduser("~/.cache"),
        "patterns": ["*"],
        "max_age_days": 30
    }
]
# Directories that must never be touched (!)
BLOCKLIST = {
    os.path.expanduser("/"),
    os.path.expanduser("~"),
    os.path.expanduser("~/.ssh"),
}
# Logging setup
logging.basicConfig(
    filename=os.path.expanduser('~/cleanup_agent.log'),
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
def safe_remove(target: Path):
    """Safely remove a file or empty folder with logging"""
    try:
        if target.is_file():
            target.unlink()  # Remove file
            logging.info(f"Removed file: {target}")
        elif target.is_dir():
            target.rmdir()  # Remove empty folder only
            logging.info(f"Removed empty folder: {target}")
    except OSError as e:
        # Log error but don't crash
        logging.error(f"Failed to remove {target}: {e}")
def cleanup_directory(config: dict):
    """Clean a directory based on config"""
    base_dir = Path(config["path"])
    if not base_dir.exists():
        logging.warning(f"Directory not found: {base_dir}")
        return
    # Check if directory is in blocklist. BLOCKLIST holds strings, so each entry
    # is converted to Path first; a string is never found in base_dir.parents,
    # and "/" is a parent of every absolute path anyway.
    resolved = base_dir.resolve()
    if any(resolved == Path(blocked).resolve() for blocked in BLOCKLIST):
        logging.error(f"Attempted cleanup of protected directory: {base_dir}. Aborting.")
        return
    current_time = time.time()
    for pattern in config["patterns"]:
        for target in base_dir.rglob(pattern):  # Recursive search
            if not target.exists():
                continue
            # Check file/folder age
            age_days = (current_time - target.stat().st_mtime) / (60 * 60 * 24)
            if age_days > config["max_age_days"]:
                safe_remove(target)
if __name__ == "__main__":
    logging.info("=== Cleanup started ===")
    for job in TARGETS:
        cleanup_directory(job)
    logging.info("=== Cleanup finished ===\n")

Comparative analysis

Where ChatGPT was right
Made the script much safer. It replaced direct rm -rf calls with Python's built-in methods (unlink, rmdir). A BLOCKLIST appeared, which is meant to prohibit any attempt to clean / or $HOME directly.

Added customizability. Instead of hardcoded values, there is a proper config: TARGETS, a list of dictionaries. Need to clean another folder or change the "age"? Just edit the list without touching the logic. In my opinion, this is the right and competent solution.

The script now keeps a full log file. You can see not only what was deleted, but also why something went wrong.
It uses pathlib.Path instead of string concatenation, which is a more correct way to work with paths. It handles the path conventions of different operating systems, and because no shell is involved, spaces and special characters in file names need no escaping.
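
One detail of blocklist checks is easy to get wrong, so here is a minimal sketch of how I would express one. It is not ChatGPT's code: the names PROTECTED_EXACT, PROTECTED_SUBTREES, and is_protected are hypothetical, and the paths are examples. The idea is to compare Path objects with Path objects (a plain string never equals a Path) and to separate "do not clean this directory itself" from "do not touch anything under it", because / is a parent of every absolute path.

```python
from pathlib import PurePosixPath

# Hypothetical split of a blocklist into two kinds of rules.
# PurePosixPath keeps the demonstration independent of the host OS.
# Exact entries protect the directory itself but allow cleaning inside it.
PROTECTED_EXACT = {PurePosixPath("/"), PurePosixPath("/home/user")}
# Subtree entries protect the directory and everything below it.
PROTECTED_SUBTREES = {PurePosixPath("/home/user/.ssh")}


def is_protected(candidate: PurePosixPath) -> bool:
    """Return True if cleanup must not start from this directory."""
    if candidate in PROTECTED_EXACT:
        return True
    for subtree in PROTECTED_SUBTREES:
        if candidate == subtree or subtree in candidate.parents:
            return True
    return False


for example in ("/", "/home/user", "/home/user/Downloads", "/home/user/.ssh/keys"):
    print(example, "->", is_protected(PurePosixPath(example)))
```

With this split, ~/Downloads stays cleanable even though it lives under a protected home directory, while anything inside ~/.ssh is refused.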

Where ChatGPT was not quite right (in my opinion, please correct me if I'm wrong)

A somewhat dangerous recursive search. Initially the AI used base_dir.rglob('*') for the pattern "*" in ~/.cache. This literally means: "walk recursively through EVERYTHING in the cache and check the age of EVERY file". For a cache directory with a huge number of small files, this could easily lead to very long and useless work. I would add a minimum-age condition for such aggressive cleaning.

Imitation of security. The function safe_remove deletes a folder only if it is empty. This is safe, but completely useless for node_modules: a non-empty directory is never removed. rmdir() raises an OSError for it, so it only shows up in the log as a generic "Failed to remove" error. It would be worth stating explicitly in the log that the folder was skipped because it is not empty.
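
A minimal sketch of what that explicit logging could look like. The function name remove_with_reason and the returned status strings are my own invention for illustration; the rest follows the same unlink/rmdir approach as safe_remove.

```python
import logging
import tempfile
from pathlib import Path

logging.basicConfig(level=logging.INFO, format="%(levelname)s - %(message)s")


def remove_with_reason(target: Path) -> str:
    """Remove a file or an empty folder and return a short status string."""
    try:
        if target.is_dir() and not target.is_symlink():
            # any() stops at the first entry, so large folders are cheap to test.
            if any(target.iterdir()):
                logging.warning(f"Skipped non-empty folder: {target}")
                return "skipped-non-empty"
            target.rmdir()
            logging.info(f"Removed empty folder: {target}")
            return "removed-dir"
        target.unlink()
        logging.info(f"Removed file: {target}")
        return "removed-file"
    except OSError as e:
        # Log the error but do not crash, as in the original function.
        logging.error(f"Failed to remove {target}: {e}")
        return "error"


if __name__ == "__main__":
    # Demonstration inside a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        modules = Path(tmp) / "node_modules"
        modules.mkdir()
        (modules / "index.js").write_text("// package code")
        print(remove_with_reason(modules))
```

The non-empty folder now produces a warning that names the reason, instead of an error that looks like a failure of the script.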

Not the most practical patterns. The pattern "*" for ~/.cache is too wide. A narrower list would be more correct, something like ['.bin', 'cache/', 'thumbnails/'] and so on.
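
To show the difference in scope, here is a small sketch that compares a wildcard pattern with a narrower list on a throwaway directory tree. The helper matching_files, the file names, and the two narrow patterns ("*.bin" and "thumbnails/*") are assumptions of mine chosen for the demonstration, not rules from the generated script.

```python
import tempfile
from pathlib import Path


def matching_files(base_dir: Path, patterns: list[str]) -> set[str]:
    """Return relative paths of regular files matched by any pattern."""
    found = set()
    for pattern in patterns:
        # rglob() searches base_dir and all of its subdirectories.
        for path in base_dir.rglob(pattern):
            if path.is_file():
                found.add(path.relative_to(base_dir).as_posix())
    return found


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp)
        (cache / "thumbnails").mkdir()
        (cache / "thumbnails" / "a.png").write_text("img")
        (cache / "app").mkdir()
        (cache / "app" / "state.db").write_text("keep me")
        (cache / "app" / "blob.bin").write_text("junk")

        print("wide:  ", sorted(matching_files(cache, ["*"])))
        print("narrow:", sorted(matching_files(cache, ["*.bin", "thumbnails/*"])))
```

The wide pattern picks up the application's state file along with the junk, while the narrow list leaves it alone.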

What conclusion can be drawn? ChatGPT turned a low-quality and slightly dangerous Bash script into a nearly production-ready utility with a config and logs. But its blind confidence in recursively traversing everything could easily hang the system. The AI structures and secures the code very well, but it seems to lack a concrete understanding of "what exactly should I clean?" As an auxiliary tool for code generation it is indispensable, but you need to know the subject well and review the generated code very carefully to avoid dangerous consequences.

Example of use

As usual, the instructions for the script are in the article (maybe someone will need them?)

Save the code to a file named cleanup_agent.py.
Edit the TARGETS config for the tasks you need. Need to clean Downloads once a week? No problem. Need to clean __pycache__ out of Projects? Add a rule.
Run it and look at the logs.

# Optional: only needed if you add a shebang line and run it as ./cleanup_agent.py
chmod +x cleanup_agent.py
# Run the script
python3 cleanup_agent.py

Check the log output

tail -f ~/cleanup_agent.log

The output in the log will be something like this:

2025-08-19 11:05:32,123 - INFO - === Cleanup started ===
2025-08-19 11:05:32,456 - INFO - Removed file: /home/user/Downloads/old_report.tmp
2025-08-19 11:05:33,001 - ERROR - Failed to remove /home/user/.cache/some_file: [Errno 13] Permission denied
2025-08-19 11:05:33,002 - INFO - === Cleanup finished ===
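
Returning to the configuration step: a rule for Python bytecode caches could look roughly like the sketch below. The entry has the same keys as the items in TARGETS, but the pattern, the age, and the demo_app project are my assumptions. Because the script removes only files and empty folders, the pattern targets the .pyc files inside __pycache__ rather than the folder itself. The demonstration builds a throwaway Projects tree and only lists what the rule would match.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Throwaway stand-in for ~/Projects with one hypothetical project.
    projects = Path(tmp) / "Projects"
    cache_dir = projects / "demo_app" / "__pycache__"
    cache_dir.mkdir(parents=True)
    (cache_dir / "main.cpython-312.pyc").write_bytes(b"")
    (projects / "demo_app" / "main.py").write_text("print('hi')\n")

    # Hypothetical extra entry in the same shape as the TARGETS items.
    pycache_rule = {
        "path": str(projects),
        "patterns": ["__pycache__/*.pyc"],
        "max_age_days": 30,
    }

    # List what the rule would match, without deleting anything.
    matches = sorted(
        p.relative_to(projects).as_posix()
        for pattern in pycache_rule["patterns"]
        for p in Path(pycache_rule["path"]).rglob(pattern)
    )
    print(matches)
```

Source files such as main.py are not matched, so only the bytecode cache would be considered for cleanup.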