DEV Community

smn729
Automate Your Boring File Tasks with Python: 5 Scripts You Can Use Today

Let's be honest: most of the repetitive file operations we do every day don't require our attention. Renaming hundreds of files, cleaning messy CSVs, resizing images one by one — these tasks are necessary but mind-numbing.

I've been collecting and writing Python automation scripts for the past few years, and I want to share five that have saved me the most time. You can copy these directly, adapt them to your workflow, and start automating today.


Before We Start

You'll need Python 3.8+ installed. Each script below is a single file — no complex setup required. Just save it, install any dependencies, and run.

```shell
pip install pillow PyPDF2  # for the image and PDF scripts
```

1. Smart File Organizer

This one is my most-used script. Your Downloads folder is probably a mess (mine definitely was). This script sorts files into folders by type.

```python
#!/usr/bin/env python3
import os
import shutil
import argparse

FILE_CATEGORIES = {
    'Images': ['.jpg', '.jpeg', '.png', '.gif', '.webp', '.svg'],
    'Documents': ['.pdf', '.docx', '.txt', '.md', '.xlsx', '.pptx'],
    'Audio': ['.mp3', '.wav', '.flac', '.aac', '.m4a'],
    'Video': ['.mp4', '.mov', '.avi', '.mkv', '.webm'],
    'Archives': ['.zip', '.tar', '.gz', '.rar', '.7z'],
    'Code': ['.py', '.js', '.ts', '.html', '.css', '.json', '.yaml'],
}

def organize_by_type(source_dir, target_dir=None, dry_run=False):
    if not target_dir:
        target_dir = source_dir

    for filename in os.listdir(source_dir):
        filepath = os.path.join(source_dir, filename)
        if os.path.isfile(filepath):
            ext = os.path.splitext(filename)[1].lower()
            category = 'Other'
            for cat, exts in FILE_CATEGORIES.items():
                if ext in exts:
                    category = cat
                    break

            dest = os.path.join(target_dir, category)
            if dry_run:
                print(f'[DRY RUN] Would move: {filename} -> {category}/')
            else:
                os.makedirs(dest, exist_ok=True)
                shutil.move(filepath, os.path.join(dest, filename))
                print(f'Moved: {filename} -> {category}/')


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Organize files by type')
    parser.add_argument('directory', help='Directory to organize')
    parser.add_argument('--dry-run', action='store_true', help='Preview only')
    args = parser.parse_args()
    organize_by_type(args.directory, dry_run=args.dry_run)
```

Usage:

```shell
# Preview what would happen
python file-organizer.py ~/Downloads --dry-run

# Actually organize
python file-organizer.py ~/Downloads
```
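If you'd rather group files by when they arrived than by what they are, the same pattern works with modification time instead of extension. This is a minimal sketch, not part of the script above; the `YYYY-MM` folder naming is my choice:

```python
import os
import shutil
import time


def organize_by_date(source_dir, dry_run=False):
    """Group files into YYYY-MM folders based on modification time."""
    for filename in os.listdir(source_dir):
        filepath = os.path.join(source_dir, filename)
        if not os.path.isfile(filepath):
            continue
        # e.g. a file last modified in May 2026 goes into "2026-05"
        folder = time.strftime('%Y-%m', time.localtime(os.path.getmtime(filepath)))
        dest = os.path.join(source_dir, folder)
        if dry_run:
            print(f'[DRY RUN] Would move: {filename} -> {folder}/')
        else:
            os.makedirs(dest, exist_ok=True)
            shutil.move(filepath, os.path.join(dest, filename))
```

You could fold this into the organizer behind a flag if you want both modes in one tool.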

2. CSV Cleaner

CSV files from different sources are never consistent. Different separators, extra whitespace, missing values, duplicate rows — this script handles the common pain points.

```python
#!/usr/bin/env python3
import csv
import argparse


def clean_csv(input_file, output_file, **options):
    with open(input_file, 'r', encoding='utf-8') as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        fieldnames = reader.fieldnames

    cleaned = []
    seen = set()

    for row in rows:
        # Strip whitespace from all values
        if options.get('strip'):
            row = {k: v.strip() if isinstance(v, str) else v
                   for k, v in row.items()}

        # Remove empty rows
        if options.get('drop_empty'):
            if all(v == '' or v is None for v in row.values()):
                continue

        # Remove duplicates
        if options.get('drop_duplicates'):
            row_key = tuple(row.values())
            if row_key in seen:
                continue
            seen.add(row_key)

        # Fill missing values (DictReader yields None for short rows)
        if options.get('fill'):
            row = {k: (v if v not in ('', None) else options['fill'])
                   for k, v in row.items()}

        cleaned.append(row)

    with open(output_file, 'w', encoding='utf-8', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(cleaned)

    print(f'Cleaned: {len(rows)} -> {len(cleaned)} rows')


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Clean CSV files')
    parser.add_argument('input', help='Input CSV file')
    parser.add_argument('--output', '-o', default='cleaned.csv')
    parser.add_argument('--strip', action='store_true')
    parser.add_argument('--drop-duplicates', action='store_true')
    parser.add_argument('--drop-empty-rows', action='store_true')
    parser.add_argument('--fill', help='Fill missing values with this')
    args = parser.parse_args()
    clean_csv(args.input, args.output,
              strip=args.strip,
              drop_empty=args.drop_empty_rows,
              drop_duplicates=args.drop_duplicates,
              fill=args.fill)
```

Usage:

```shell
python csv-cleaner.py messy_data.csv --strip --drop-duplicates --output clean.csv
```
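The intro mentions inconsistent separators, which the script above doesn't detect; it assumes commas. The stdlib can guess the delimiter for you. A small sketch using `csv.Sniffer`, assuming the first few kilobytes of the file are representative:

```python
import csv


def detect_dialect(path, sample_bytes=4096):
    """Guess the CSV dialect (delimiter, quoting) from a sample of the file."""
    with open(path, 'r', encoding='utf-8', newline='') as f:
        sample = f.read(sample_bytes)
    # Restricting the candidate delimiters makes the guess more reliable
    return csv.Sniffer().sniff(sample, delimiters=',;\t|')
```

Inside `clean_csv` you could then pass it along, e.g. `csv.DictReader(f, dialect=detect_dialect(input_file))`.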

3. Batch Image Resizer

Need to resize 50 images for a website or social media post? Doing it manually in Photoshop is not the move.

```python
#!/usr/bin/env python3
import os
from PIL import Image
import argparse


def resize_images(directory, max_width, output_format='webp', quality=85):
    for filename in os.listdir(directory):
        if not filename.lower().endswith(('.png', '.jpg', '.jpeg')):
            continue

        filepath = os.path.join(directory, filename)
        with Image.open(filepath) as img:
            # Calculate new height maintaining aspect ratio
            ratio = max_width / img.width
            new_size = (max_width, int(img.height * ratio))

            img_resized = img.resize(new_size, Image.LANCZOS)

        # Save in the chosen format
        new_name = os.path.splitext(filename)[0] + f'.{output_format}'
        output_path = os.path.join(directory, new_name)

        if output_format.lower() == 'webp':
            img_resized.save(output_path, 'WEBP', quality=quality)
        elif output_format.lower() == 'jpg':
            # JPEG has no alpha channel, so drop it before saving
            img_resized.convert('RGB').save(output_path, 'JPEG', quality=quality)
        else:
            img_resized.save(output_path)

        print(f'Resized: {filename} -> {new_name} ({new_size[0]}x{new_size[1]})')


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Batch resize images')
    parser.add_argument('directory', help='Directory with images')
    parser.add_argument('--max-width', type=int, default=1200)
    parser.add_argument('--format', default='webp', choices=['webp', 'jpg', 'png'])
    parser.add_argument('--quality', type=int, default=85)
    args = parser.parse_args()
    resize_images(args.directory, args.max_width, args.format, args.quality)
```

Usage:

```shell
python image-resizer.py ./photos --max-width 800 --format webp --quality 90
```
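One caveat with the ratio math above: an image narrower than `--max-width` gets scaled *up*, which blurs it. A tiny pure helper that caps the target size instead (my addition, not part of the script; you could call it in place of the ratio calculation):

```python
def fit_size(width, height, max_width):
    """Return a (w, h) no wider than max_width, preserving aspect ratio.

    Images that are already narrow enough come back unchanged,
    so nothing ever gets upscaled.
    """
    if width <= max_width:
        return (width, height)
    ratio = max_width / width
    return (max_width, round(height * ratio))
```

For example, `fit_size(2000, 1000, 1200)` gives `(1200, 600)`, while `fit_size(800, 600, 1200)` leaves the image at `(800, 600)`.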

4. PDF Text Extractor

Need to pull text from multiple PDFs for analysis? This script handles it.

```python
#!/usr/bin/env python3
import os
import json
import argparse

import PyPDF2  # pip install PyPDF2


def extract_pdfs(directory, output_format='txt'):
    results = {}

    for filename in os.listdir(directory):
        if not filename.lower().endswith('.pdf'):
            continue

        filepath = os.path.join(directory, filename)
        text = []

        with open(filepath, 'rb') as f:
            reader = PyPDF2.PdfReader(f)
            for page_num, page in enumerate(reader.pages, start=1):
                extracted = page.extract_text()
                # Skip pages with no extractable text (e.g. scanned images)
                if extracted and extracted.strip():
                    text.append(f'--- Page {page_num} ---\n{extracted}')

        results[filename] = '\n'.join(text)

    if output_format == 'json':
        output = json.dumps(results, indent=2, ensure_ascii=False)
        print(output)
    else:
        for pdf_name, content in results.items():
            print(f'\n{"="*60}')
            print(f'FILE: {pdf_name}')
            print(f'{"="*60}')
            print(content)


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Extract text from PDFs')
    parser.add_argument('directory', help='Directory with PDF files')
    parser.add_argument('--format', choices=['txt', 'json'], default='txt')
    args = parser.parse_args()
    extract_pdfs(args.directory, args.format)
```
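The JSON output is keyed by filename, so if you want to feed it into a tabular tool (like the CSV cleaner) you need a flat table first. A minimal conversion sketch; the `filename`/`text` column names are my choice, not part of the script:

```python
import csv
import json


def pdf_json_to_csv(json_path, csv_path):
    """Flatten a {filename: extracted_text} JSON file into a two-column CSV."""
    with open(json_path, 'r', encoding='utf-8') as f:
        data = json.load(f)
    with open(csv_path, 'w', encoding='utf-8', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['filename', 'text'])
        for name, text in data.items():
            writer.writerow([name, text])
```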

5. Batch File Renamer

Renaming files one by one may be the single most tedious task in computing. This script gives you pattern-based renaming with a dry-run mode, so you never accidentally mangle your file names.

```python
#!/usr/bin/env python3
import os
import re
import argparse


def rename_files(directory, pattern, replacement, dry_run=False):
    for filename in os.listdir(directory):
        filepath = os.path.join(directory, filename)
        if not os.path.isfile(filepath):
            continue

        new_name = re.sub(pattern, replacement, filename)
        if new_name == filename:
            continue

        new_path = os.path.join(directory, new_name)
        if dry_run:
            print(f'[DRY RUN] {filename} -> {new_name}')
        elif os.path.exists(new_path):
            # Never silently overwrite an existing file
            print(f'Skipped (target exists): {filename} -> {new_name}')
        else:
            os.rename(filepath, new_path)
            print(f'Renamed: {filename} -> {new_name}')


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Batch rename files')
    parser.add_argument('directory', help='Target directory')
    parser.add_argument('--pattern', '-p', required=True, help='Regex pattern')
    parser.add_argument('--replace', '-r', required=True, help='Replacement')
    parser.add_argument('--dry-run', action='store_true', help='Preview')
    args = parser.parse_args()
    rename_files(args.directory, args.pattern, args.replace, args.dry_run)
```

Usage:

```shell
# Replace spaces with underscores
python batch-renamer.py ./docs --pattern '\s+' --replace '_' --dry-run

# Remove " (1)" suffixes from duplicates
python batch-renamer.py ./downloads --pattern ' ?\(\d+\)' --replace ''
```
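One thing plain regex replacement can't do is keep the results distinct: replacing every `IMG_\d+` with the same prefix would map many files onto one name, which the collision check above would then skip. If you want sequential numbering instead, a pure helper like this (an illustration, not part of the script) computes the mapping:

```python
import os
import re


def numbered_names(filenames, pattern, prefix, start=1, width=3):
    """Map matching old names to prefix_001.ext style names.

    Names are sorted first so numbering is deterministic.
    """
    mapping = {}
    counter = start
    for name in sorted(filenames):
        if not re.search(pattern, name):
            continue
        ext = os.path.splitext(name)[1]
        mapping[name] = f'{prefix}_{counter:0{width}d}{ext}'
        counter += 1
    return mapping
```

You would then loop over the mapping and call `os.rename` on each pair, with the same dry-run and collision checks as above.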

Putting It All Together

These scripts are intentionally simple — each one does one thing and does it well. Here are a few real workflows where I use them together:

Weekly photo processing:

```shell
# 1. Organize incoming photos by type
python file-organizer.py ~/CameraUploads

# 2. Resize for web
python image-resizer.py ~/CameraUploads/Images --max-width 1200 --format webp

# 3. Rename with a consistent pattern
python batch-renamer.py ~/CameraUploads/Images --pattern 'IMG_' --replace 'vacation_'
```

Data pipeline cleanup:

```shell
# 1. Extract data from PDF reports
python pdf-extractor.py ~/Reports --format json > raw_data.json

# 2. Clean and normalize (after converting the JSON output to CSV)
python csv-cleaner.py raw_data.csv --strip --drop-duplicates --fill "N/A"
```

What's Next

If you found these useful, I've put together a bundle of 10 automation scripts with full documentation, proper error handling, and CLI interfaces for every script. It includes a web scraper, email sender, YouTube transcript downloader, text summarizer, and more — all ready to run.

Check it out here: AI Automation Scripts Bundle

The bundle also comes with a requirements.txt for one-command install and consistent logging across all scripts. Every script has a dry-run mode, so you can preview changes before making them.


Got a favorite automation script? Drop it in the comments. I'm always looking for new ideas to add to the collection.
