I'll go first.
Three years ago I wrote a 20-line Python script that watches a folder for new CSV files, deduplicates them, and moves the clean version to Dropbox.
I wrote it in 15 minutes. It has saved me roughly 200 hours since then.
import time
from pathlib import Path
import pandas as pd
import shutil
watch = Path('~/Downloads').expanduser()
done = Path('~/Dropbox/clean-data').expanduser()
done.mkdir(exist_ok=True)
seen = set()
while True:
for f in watch.glob('*.csv'):
if f.name not in seen:
df = pd.read_csv(f).drop_duplicates()
df.to_csv(done / f.name, index=False)
seen.add(f.name)
print(f'Cleaned: {f.name} ({len(df)} rows)')
time.sleep(10)
It's ugly. It's not production-grade. But it runs every single day and I've never had to touch it.
What's yours?
I'm collecting the best ones for a roundup post. The most interesting scripts I'll feature (with credit) in a follow-up article.
Rules:
- Any language
- Must be something you actually use (not a hypothetical)
- Bonus points if it's under 50 lines
Drop your script below. Even a description without code is fine — I'm curious what problems people solve with quick automation.
More from me: 10 Dev Tools I Use Daily | 77 Scrapers on a Schedule | 150+ Free APIs
Top comments (0)