How to automate your boring tasks with scripts: a beginner tutorial for engineers
Automation as Code: A Practical Guide to Scripting, Batching, and CI/CD
Automating repetitive development tasks saves time, reduces human error, and makes your workflow scalable. This guide stitches together scripting for file manipulation, batch operations, database tasks, deployment helpers, and CI/CD automation, with patterns you can reuse in real projects. We’ll cover tool selection, representative patterns, and concrete examples you can adapt.
Choosing the right tool for the job
- Python: Best for cross-platform scripting, data processing, and tasks that require readability and rich ecosystems (requests, pandas, sqlalchemy, click).
- Bash: Fast for simple, shell-native tasks on Unix-like systems; excels at orchestrating other commands and file operations.
- Makefiles: Great for declarative, dependency-driven build and task orchestration; keeps pipelines explicit and portable.
- Node.js / Deno: Useful when your stack revolves around JavaScript/TypeScript; good for tooling with rich npm ecosystems.
- Configuration-as-code (CI/CD): Define pipelines in versioned config files for the CI system you use (GitHub Actions, GitLab CI, CircleCI, Jenkins) and leverage built-in caching, matrix builds, and artifact handling.
Patterns for composable automation
- Small, focused tasks: Create single-responsibility scripts that can be chained or composed.
- Idempotence: Design tasks so running them multiple times yields the same outcome.
- Clear inputs/outputs: Use explicit parameters and write outputs to known locations or stdout in structured formats.
- Dependency graphs: Use Makefiles or orchestrators to express task order and parallelism.
- Dry runs: Support a dry-run mode to preview work without side effects.
- Logging and observability: Emit structured logs, use log levels, and write to rotating files when appropriate.
- Config-driven behavior: Read from a config file (YAML/JSON/INI) to avoid hard-coding paths, and load secrets from environment variables or a secret store rather than committing them.
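To make the idempotence and dry-run patterns concrete, here is a minimal sketch. The helper name ensure_line and the logger name are illustrative assumptions, not part of any library. It appends a line to a file only when the line is missing, and in dry-run mode it reports what it would do without touching the file.

```python
import logging
from pathlib import Path

log = logging.getLogger("automation")  # hypothetical logger name


def ensure_line(path: Path, line: str, dry_run: bool = False) -> bool:
    """Make sure `line` is present in the file at `path`.

    Returns True if the file was changed (or would be changed in dry-run mode).
    """
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    if line in text.splitlines():
        log.info("unchanged path=%s", path)
        return False  # already in the desired state, so do nothing
    if dry_run:
        log.info("dry-run would-append path=%s line=%r", path, line)
        return True
    # Add a separator only when existing content lacks a trailing newline.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    path.write_text(text + separator + line + "\n", encoding="utf-8")
    log.info("appended path=%s line=%r", path, line)
    return True
```

Calling ensure_line(Path(".gitignore"), "build/") twice changes the file at most once, which is the property that makes a task safe to re-run from a Makefile or a CI job.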
File manipulation and batch operations
- Move, copy, sanitize, and transform files with predictable patterns.
- Example approach: Python scripts that take a source glob, destination, and a transform function (e.g., renaming, content replacement).
Example: batch rename and content replace in Python
- Problem: Rename all markdown files and update internal references to a project slug.
- Approach:
  - Use glob to collect files.
  - For each file, compute new name and update contents using regex.
  - Write back, with optional dry-run.
- Skeleton:
  - config: source_dir, pattern, old_slug, new_slug, dry_run
  - steps: collect files -> for each file: read -> replace references -> write -> log
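The skeleton above could be fleshed out roughly as follows. This is a sketch under assumptions: the function name rename_and_fix mirrors the script name used later in this guide, the default pattern "*.md" is a guess, and the slug is treated as a literal string that appears in both file names and file contents.

```python
import re
from pathlib import Path


def rename_and_fix(source_dir, old_slug, new_slug, pattern="*.md", dry_run=False):
    """Rename files containing old_slug and rewrite references inside them.

    Returns a list of (old_path, new_path) pairs for every file that changes.
    In dry-run mode the list is computed but nothing on disk is modified.
    """
    slug_re = re.compile(re.escape(old_slug))  # match the slug literally
    changes = []
    # sorted() materializes the listing before any rename happens.
    for path in sorted(Path(source_dir).rglob(pattern)):
        text = path.read_text(encoding="utf-8")
        # A function replacement avoids treating new_slug as a regex template.
        new_text = slug_re.sub(lambda _match: new_slug, text)
        new_path = path.with_name(path.name.replace(old_slug, new_slug))
        if new_text == text and new_path == path:
            continue  # nothing to do for this file
        if new_path != path and new_path.exists():
            raise FileExistsError(f"refusing to overwrite {new_path}")
        changes.append((path, new_path))
        if dry_run:
            continue
        path.write_text(new_text, encoding="utf-8")
        if new_path != path:
            path.rename(new_path)
    return changes
```

A second run returns an empty list as long as new_slug does not itself contain old_slug; if it does (for example "old" to "old-v2"), the replacement is no longer idempotent and needs a stricter pattern.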
Database tasks
- Common tasks: migrations, seed data, backups, health checks.
- Pattern: idempotent migrations, versioned SQL, or Python-based schema evolution with a lightweight ORM.
Example: simple Python DB task outline
- Use SQLAlchemy or psycopg2 for PostgreSQL.
- Tasks:
  - run_migrations: apply pending migrations when the version table shows the database is behind the latest schema.
  - seed_data: insert initial data if not present.
  - dump_schema: output schema as SQL or JSON for inspection.
- Include robust retry logic and transaction handling.
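As a rough illustration of idempotent seeding with retries and transactions, the sketch below uses the standard-library sqlite3 module as a stand-in for PostgreSQL so it runs anywhere. The roles table and the with_retry helper are hypothetical. With PostgreSQL you would typically swap INSERT OR IGNORE for INSERT ... ON CONFLICT DO NOTHING and retry on your driver's own error types.

```python
import sqlite3
import time


def with_retry(task, attempts=3, delay=0.5, retry_on=(sqlite3.OperationalError,)):
    """Call task() and retry on transient errors with a growing pause."""
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts, surface the original error
            time.sleep(delay * attempt)


def seed_data(conn, roles):
    """Insert only the roles that are missing; return how many were added."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("CREATE TABLE IF NOT EXISTS roles (name TEXT PRIMARY KEY)")
        before = conn.execute("SELECT COUNT(*) FROM roles").fetchone()[0]
        conn.executemany(
            "INSERT OR IGNORE INTO roles (name) VALUES (?)",
            [(role,) for role in roles],
        )
        after = conn.execute("SELECT COUNT(*) FROM roles").fetchone()[0]
    return after - before
```

A typical call would be with_retry(lambda: seed_data(conn, ["admin", "viewer"])); running it again adds nothing, so the task is safe to schedule.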
Deployment helpers
- Treat deployment as a reproducible workflow: package, validate, deploy, verify.
- Patterns:
  - Build artifact creation, then automated validation and smoke tests.
  - Rollback guardrails: capture previous version identifiers and provide a quick revert path.
  - Environment-specific overrides via config.
Example: simple deploy script outline (Python or Bash)
- Steps:
  - Build artifact (e.g., a zip or tar archive).
  - Upload to artifact store or registry.
  - Deploy to target environment with a single command (e.g., kubectl apply, docker compose up, or a cloud CLI).
  - Smoke test: ping service, run a health check endpoint.
  - Notify on success/failure (Slack/email).
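The deploy, verify, and notify steps, together with the rollback guardrail, might be wired up like this. It is only a sketch: release, smoke_test, and notify are placeholder callables you would implement with your own tooling (for example a subprocess call to your deploy command and a chat webhook), and http_ok is a hypothetical helper built on the standard library.

```python
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to the health endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:  # covers URLError, HTTPError, and socket timeouts
        return False


def deploy(
    version: str,
    previous_version: str,
    release: Callable[[str], None],
    smoke_test: Callable[[], bool],
    notify: Callable[[str], None],
) -> bool:
    """Release `version`, verify it, and revert to `previous_version` on failure."""
    release(version)
    if smoke_test():
        notify(f"deployed {version}")
        return True
    release(previous_version)  # quick revert path captured before the deploy
    notify(f"smoke test failed for {version}; rolled back to {previous_version}")
    return False
```

Passing the steps in as functions keeps the orchestration testable without a real cluster, since tests can supply stubs for each step.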
CI/CD automation
- Goals: fast feedback, reproducible environments, and safe promotion of changes.
- Patterns:
  - Matrix builds: test across Python versions, Node versions, or OSes.
  - Caching: leverage language-specific caches (pip, npm, cargo) to speed up pipelines.
  - Artifacts and provenance: attach build metadata (commit hash, branch, environment) to artifacts.
  - Gatekeeping: require test/lint checks before merging.
- Common tools: GitHub Actions, GitLab CI, CircleCI, Jenkins.
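For the provenance pattern, one lightweight option is a JSON sidecar file written next to the artifact. The sketch below assumes GitHub Actions, whose runners set GITHUB_SHA and GITHUB_REF_NAME. DEPLOY_ENV and the ".meta.json" suffix are made-up conventions for this example.

```python
import json
import os
from datetime import datetime, timezone
from pathlib import Path


def build_metadata(env=None):
    """Collect commit, branch, and environment for the current build."""
    env = os.environ if env is None else env
    return {
        "commit": env.get("GITHUB_SHA", "unknown"),
        "branch": env.get("GITHUB_REF_NAME", "unknown"),
        "environment": env.get("DEPLOY_ENV", "dev"),  # hypothetical variable
        "built_at": datetime.now(timezone.utc).isoformat(),
    }


def write_metadata(artifact_path, metadata):
    """Write the metadata next to the artifact and return the sidecar path."""
    sidecar = Path(str(artifact_path) + ".meta.json")
    sidecar.write_text(json.dumps(metadata, indent=2, sort_keys=True), encoding="utf-8")
    return sidecar
```

Other CI systems expose the same information under different variable names, so the lookup keys are the part to adapt.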
Concrete, real-world examples
1) Python file batch processing with Makefile orchestration
- Files: scripts/rename_and_fix.py, config.yaml
- Makefile:
  - targets: dry-run, run, test, lint
  - dependencies: Python env, requirements.txt
- Python script (rename_and_fix.py):
  - Accept source_dir, old_slug, new_slug, dry_run
  - Rename matching files and update internal references with regex
- Usage:
  - make dry-run SOURCE_DIR=projects OLD_SLUG=old NEW_SLUG=new
  - make run SOURCE_DIR=projects OLD_SLUG=old NEW_SLUG=new
2) Bash batch operation: aggregate logs from multiple services
- Script: bin/collect_logs.sh
- Behavior:
  - Accept a list of services, fetch logs via curl or docker logs, consolidate into a single file, rotate daily.
- Pattern: idempotent aggregation, non-destructive outputs, timestamped files.
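The script in this example is Bash, but the consolidation pattern is easy to show in Python, which the other sketches in this guide use. The version below assumes the log lines have already been fetched (by curl, docker logs, or anything else) and handed over per service. The function name and the logs-YYYY-MM-DD.log naming scheme are illustrative choices.

```python
import os
from datetime import date
from pathlib import Path
from typing import Iterable, Mapping


def aggregate_logs(sources: Mapping[str, Iterable[str]], out_dir: Path, day: date) -> Path:
    """Write one consolidated, date-stamped log file and return its path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"logs-{day.isoformat()}.log"
    tmp = target.with_name(target.name + ".tmp")
    with tmp.open("w", encoding="utf-8") as handle:
        for service in sorted(sources):  # stable order keeps reruns identical
            for line in sources[service]:
                handle.write(f"[{service}] {line.rstrip()}\n")
    # Swap the finished file into place so readers never see a partial file.
    os.replace(tmp, target)
    return target
```

Because each day gets its own file and reruns for the same inputs produce the same bytes, earlier days are never touched and repeating a run is harmless.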
3) Go-to task: database migration wrapper
- Tooling: sql-migrate or plain SQL scripts
- Approach:
  - A small CLI that reads a migrations directory, applies unapplied migrations, and updates a schema_version table.
- Benefit: simple, fast, and portable across environments.
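The core of such a wrapper fits in a few lines. The sketch below is not how sql-migrate works internally. It is a plain-SQL variant that uses sqlite3 as a stand-in database and assumes migration files are named so that alphabetical order is the intended order (for example 001_create.sql, 002_index.sql).

```python
import sqlite3
from pathlib import Path


def apply_migrations(conn, migrations_dir):
    """Apply every .sql file not yet recorded in schema_version.

    Returns the names of the migrations applied by this call.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (name TEXT PRIMARY KEY)")
    applied = {row[0] for row in conn.execute("SELECT name FROM schema_version")}
    newly_applied = []
    for path in sorted(Path(migrations_dir).glob("*.sql")):
        if path.name in applied:
            continue  # already recorded, skip it
        # Note: executescript does not wrap the file in a transaction, so a
        # migration that fails halfway may leave partial changes in this sketch.
        conn.executescript(path.read_text(encoding="utf-8"))
        conn.execute("INSERT INTO schema_version (name) VALUES (?)", (path.name,))
        conn.commit()
        newly_applied.append(path.name)
    return newly_applied
```

Recording each file name right after it runs is what makes the command safe to call on every deploy: already-applied migrations are skipped.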
4) CI/CD: GitHub Actions workflow for Python app
- Steps:
  - Set up Python, install dependencies, run lint, run tests against multiple Python versions, build a wheel, and publish if on main.
- Caching: cache pip packages and unit test artifacts.
- Example workflow snippets:
  - uses: actions/setup-python@v5
  - with: python-version: ${{ matrix.python-version }} (the job's matrix lists "3.9" and "3.10" as quoted strings, because YAML parses an unquoted 3.10 as the number 3.1)
  - run: pip install -r requirements.txt
  - run: pytest -q
5) Makefile-powered UI build pipeline
- Targets: install, lint, test, build, deploy
- Dependencies encode order and parallelism, enabling you to run builds locally or in CI with consistent behavior.
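To see how dependencies translate into order and parallelism, the sketch below models the targets as a graph and groups them into waves that could run concurrently. The edges are an assumption made for illustration (the list above names only the targets), and graphlib requires Python 3.9 or newer.

```python
from graphlib import TopologicalSorter

# Each target maps to the targets it depends on (assumed edges).
TARGETS = {
    "install": set(),
    "lint": {"install"},
    "test": {"install"},
    "build": {"lint", "test"},
    "deploy": {"build"},
}


def execution_waves(graph):
    """Group targets into waves; targets in the same wave can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# execution_waves(TARGETS)
# → [['install'], ['lint', 'test'], ['build'], ['deploy']]
```

This is the same reasoning make applies when invoked with -j: lint and test share only the install prerequisite, so they can run side by side.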
Tips to get started quickly
- Start small: pick a frequent manual task and automate the simplest observable success path.
- Make it easy to test: add a dry-run mode and unit-testable components.
- Separate concerns: keep scripts generic and configurable rather than hard-coded for one project.
- Centralize configuration: store paths, environment names, and flags in a config file or environment variables.
- Document usage: add a README with examples for each script and Makefile target.
Illustration: a small automation loop
- Step 1: You write a Python script to extract new user records from a CSV and insert them into a database.
- Step 2: You wrap this in a Makefile target that ensures dependencies are in place and can be run on a schedule.
- Step 3: You hook this into CI to run on every push to a staging branch, with a safety check that logs failed records.
- Step 4: You deploy the updated user seed to staging, smoke-test, then promote to production after approval.
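Step 1 of this loop might look like the sketch below. It assumes a CSV with email and name columns and uses sqlite3 as a stand-in database. The users table, the validation rule, and the logger name are all illustrative. Invalid rows are logged and returned rather than aborting the run, which matches the safety check in Step 3.

```python
import csv
import logging
import sqlite3

log = logging.getLogger("user_import")  # hypothetical logger name


def import_users(conn, csv_file):
    """Insert new users from an open CSV file.

    Returns (inserted_count, failed_rows). Existing emails are left untouched.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (email TEXT PRIMARY KEY, name TEXT NOT NULL)"
    )
    before = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    failed = []
    for row in csv.DictReader(csv_file):
        email = (row.get("email") or "").strip().lower()
        name = (row.get("name") or "").strip()
        if "@" not in email or not name:
            log.warning("skipping invalid record: %r", row)
            failed.append(row)
            continue
        conn.execute(
            "INSERT OR IGNORE INTO users (email, name) VALUES (?, ?)", (email, name)
        )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    return after - before, failed
```

You would call it with an open file, for example with open("users.csv", newline="") as f: import_users(conn, f), and a Makefile target or CI job can fail the build when the failed list is not empty.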
Would you like a ready-to-use starter kit tailored to your stack (Python + PostgreSQL, Bash-based ops, or Node.js with a Makefile), including a minimal Makefile, a couple of scripts, and a sample CI workflow? If so, tell me your environment (OS, language, and CI system), and any repetitive tasks you want automated first.
Rizwan Saleem | https://rizwansaleem.co