DEV Community

Peyton Green

Python Developer AI Toolkit, Part 2: Five CLI scripts that automate the prompts

Part 1 of this series covered the prompt library — 272 prompts for Python and backend development organized by task type. This is Part 2: the five CLI scripts that turn those prompts into automation tools you can run from the command line or wire into your development workflow.

The goal isn't to replace your IDE or your existing toolchain. It's to make AI-assisted code review, test generation, documentation, commit messages, and multi-step analysis available as shell commands — so the prompts run when you need them, not when you remember to open a chat window.

All five scripts share a single dependency (requests) and support both Claude and GPT — you pick the model with a flag and supply the matching API key via environment variable. They're in the scripts/ directory of the AI Dev Toolkit.


Script 1: ai_code_reviewer.py

What it does: Sends a source file to AI and returns a structured code review.

python ai_code_reviewer.py src/app.py
python ai_code_reviewer.py src/app.py --format markdown > review.md
python ai_code_reviewer.py src/app.py --format json | jq '.issues[]'

The review covers four categories: security issues, performance concerns, style and readability, and actionable suggestions. The structured output is the key part — "review this code" produces whatever the model feels like. A structured prompt with defined categories produces consistent output you can act on.

Three output formats:

text (default): readable console output
markdown: Markdown-formatted output suitable for saving as a file or pasting into a PR comment
json: machine-parseable output for scripting
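The json format is what makes the script scriptable. As a sketch of what a downstream filter might look like — assuming an output shape like `{"issues": [{"severity": ..., "message": ...}]}`, which is a hypothetical schema here; check the fields ai_code_reviewer.py actually emits:

```python
import json

def critical_issues(review_json: str, threshold: str = "high") -> list:
    """Return issues at or above a severity threshold.

    The "severity"/"message" keys are assumptions for this sketch --
    adjust them to the real schema of the reviewer's JSON output.
    """
    order = {"low": 0, "medium": 1, "high": 2, "critical": 3}
    floor = order[threshold]
    review = json.loads(review_json)
    return [
        issue for issue in review.get("issues", [])
        if order.get(issue.get("severity", "low"), 0) >= floor
    ]
```

Wrapped in a small CLI that exits nonzero when the list is non-empty, a filter like this could gate a pre-commit hook on high-severity findings.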

# Review every changed file before committing
git diff --name-only HEAD | grep '\.py$' | xargs -I {} python ai_code_reviewer.py {}

# Generate markdown reviews for all files in a PR
mkdir -p reviews
git diff origin/main --name-only | grep '\.py$' | while read -r f; do
  python ai_code_reviewer.py "$f" --format markdown > "reviews/$(basename "$f" .py)-review.md"
done

When I actually use this: Before any PR that touches business logic. Running it on the file before writing the commit message catches issues I'd otherwise miss — especially the edge cases that don't show up in happy-path tests.


Script 2: test_generator.py

What it does: Analyzes a Python source file and generates a pytest test suite for it.

python test_generator.py src/utils.py
python test_generator.py src/utils.py --output tests/test_utils.py
python test_generator.py src/api/routes.py --output tests/test_routes.py --model gpt

The output is runnable pytest code. The generator analyzes function signatures, type hints, and docstrings to infer what to test — including edge cases and error conditions that aren't always obvious when you're writing the function.

# Generate and immediately run the tests
python test_generator.py src/utils.py --output tests/test_utils.py && python -m pytest tests/test_utils.py -v

The generated tests aren't always perfect out of the box. Complex dependency injection, database interactions, and external API calls need adjustments. But for utility functions, data transformations, and validation logic, the first pass is often 80% of what you'd write manually — in about 15 seconds instead of 20 minutes.
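For flavor, here's the kind of first pass you might see for a small validation helper. This is an illustrative sketch, not literal test_generator.py output — the helper `normalize_username` and the cases are invented for this example:

```python
def normalize_username(name: str) -> str:
    """Lowercase and strip whitespace; reject empty results."""
    cleaned = name.strip().lower()
    if not cleaned:
        raise ValueError("username cannot be empty")
    return cleaned


# The kind of cases a generator infers from the signature and docstring:
def test_normalize_strips_and_lowercases():
    assert normalize_username("  Alice ") == "alice"


def test_normalize_already_clean():
    assert normalize_username("bob") == "bob"


def test_normalize_empty_raises():
    import pytest
    with pytest.raises(ValueError):
        normalize_username("   ")
```

The empty-input test is the interesting one: it falls out of the docstring's "reject empty results," which is exactly the kind of error condition a human test author skips when writing tests right after the function.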

The workflow that's changed how I write code: Run test_generator.py on a new module immediately after writing it. The generated tests are a rapid sanity check on whether the function contract matches what I intended. If the generated tests don't make sense for how I expect the function to be used, the function is probably poorly named or badly documented.


Script 3: doc_generator.py

What it does: Generates structured Markdown documentation from Python source files or entire directories.

# Document a single file
python doc_generator.py src/utils.py

# Document an entire package
python doc_generator.py src/ --output docs/

# Generate and redirect to a file
python doc_generator.py src/api/routes.py > docs/routes.md

The output is structured Markdown: module overview, function/class signatures, parameter descriptions, return types, and usage examples. It reads docstrings if they exist and supplements them — or generates from scratch if there are no docstrings.

For packages with clear type hints and function names, the output requires minimal editing. For older codebases with minimal documentation, it produces a first pass that's faster to edit than to write from scratch.
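The mechanics are roughly what you'd do by hand with the inspect module — a minimal sketch of turning a signature and docstring into a Markdown stub. The formatting below is illustrative, not doc_generator.py's actual template:

```python
import inspect


def doc_stub(func) -> str:
    """Render a Markdown stub from a function's signature and docstring."""
    sig = inspect.signature(func)
    lines = [f"### `{func.__name__}{sig}`", ""]
    if func.__doc__:
        lines.append(inspect.cleandoc(func.__doc__))
    else:
        # This is where an AI pass fills the gap for undocumented code.
        lines.append("_No docstring; description generated by the model._")
    return "\n".join(lines)


def slugify(text: str, sep: str = "-") -> str:
    """Lowercase text and join words with a separator."""
    return sep.join(text.lower().split())


print(doc_stub(slugify))
```

The AI layer's value is in the branch where there's no docstring: signatures and type hints alone give it enough context to draft parameter descriptions and usage examples.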

# Document all Python files in a project and write to docs/
find src/ -name "*.py" | while read -r f; do
  out="docs/$(dirname "${f#src/}")/$(basename "$f" .py).md"
  mkdir -p "$(dirname "$out")"
  python doc_generator.py "$f" --output "$out"
done

Script 4: commit_message_ai.py

What it does: Generates a conventional commit message from your staged git diff.

# See the suggested message
git add -p
python commit_message_ai.py

# Copy to clipboard
python commit_message_ai.py --copy

# Commit directly
python commit_message_ai.py --apply

# Install as a git hook (runs automatically on every `git commit`)
python commit_message_ai.py --install

The generated messages follow the Conventional Commits format: type(scope): description. The model reads the actual diff — not just the filenames — and produces a message that accurately describes what changed and why.
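Conventional Commits headers are easy to validate mechanically, which is handy if you want to reject a malformed generated message before it reaches the commit. A small checker — the type list below is the commonly used set, not something the toolkit mandates:

```python
import re

# Common Conventional Commits types; extend to match your team's convention.
COMMIT_RE = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\([a-z0-9_-]+\))?"   # optional (scope)
    r"(!)?: "               # optional breaking-change marker, then ": "
    r".+"                    # non-empty description
)


def is_conventional(header: str) -> bool:
    """Check a commit message's first line against the Conventional Commits shape."""
    return bool(COMMIT_RE.match(header))
```

A check like this could sit between the model's output and `--apply`, falling back to a plain message if the model ever drifts from the format.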

The --install flag is the useful one. It installs the script as a prepare-commit-msg git hook. After that, every time you run git commit, your editor opens with an AI-generated message pre-filled. Keep it, edit it, or replace it — but you're starting from something specific rather than a blank cursor.

# Install and test
python commit_message_ai.py --install
git add .
git commit  # Your editor opens with: "feat(auth): add JWT refresh token rotation"

Caveat: The hook reads staged changes from git diff --cached. If you stage everything with git add ., the message covers the full diff. If you stage selectively with git add -p, the message covers only what you've staged. Either works — the message reflects what's actually in the commit.
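Scope inference is a good illustration of why reading the diff beats reading filenames — though even a filename heuristic gets you partway. Here's a hypothetical fallback that guesses a scope from staged paths; the `src/<scope>/` layout assumption is invented for this sketch:

```python
from collections import Counter
from pathlib import PurePosixPath


def guess_scope(staged_paths):
    """Guess a commit scope from the most common top-level source directory.

    Purely a filename heuristic -- commit_message_ai.py reads the diff
    content itself, but this is a cheap offline fallback.
    """
    candidates = []
    for p in staged_paths:
        segments = PurePosixPath(p).parts
        # src/auth/tokens.py -> "auth"; anything outside src/ is ignored
        if len(segments) >= 2 and segments[0] == "src":
            candidates.append(segments[1])
    if not candidates:
        return None
    scope, _ = Counter(candidates).most_common(1)[0]
    return scope
```

Feeding it the output of `git diff --cached --name-only` mirrors what the hook sees: the scope reflects only what's staged, which is exactly the behavior the caveat above describes.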


Script 5: prompt_chain.py

What it does: Runs multi-step AI workflows defined in YAML configuration files.

python prompt_chain.py chains/code_review.yaml --input src/app.py
python prompt_chain.py chains/bug_analysis.yaml --input "Login fails on Safari with SSO enabled"
python prompt_chain.py chains/refactor_plan.yaml --input src/legacy_module.py

Single-prompt interactions work well for isolated tasks. Multi-step analysis — where the output of one prompt feeds into the next — is more powerful for complex problems. prompt_chain.py lets you define these pipelines in YAML without writing Python to wire them together.

A simple chain (included in the toolkit at chains/code_review.yaml):

name: "Staged Code Review"
description: "Deep code review: issues first, then fix suggestions, then a summary"
steps:
  - name: "identify_issues"
    prompt: |
      Analyze this code and list every issue you find — bugs, security concerns,
      performance problems, style violations. Be specific and exhaustive.
      Code:
      {{input}}
    temperature: 0.2

  - name: "generate_fixes"
    prompt: |
      For each issue identified in the previous analysis, provide a concrete fix.
      Show the specific change needed, not general advice.
      Issues identified:
      {{identify_issues}}
    temperature: 0.3

  - name: "write_summary"
    prompt: |
      Write a two-paragraph summary of the code review: what the code does well,
      and what needs attention before merging. Be direct.
      Full analysis:
      {{generate_fixes}}
    temperature: 0.4
    max_tokens: 512

Run it:

python prompt_chain.py chains/code_review.yaml --input src/payment_processor.py

The chains directory in the toolkit includes five pre-built chains:

  • code_review.yaml — the staged review above
  • bug_analysis.yaml — issue description → root cause hypotheses → debugging steps
  • architecture_review.yaml — design analysis → trade-offs → recommendations
  • test_strategy.yaml — module analysis → test cases → mock requirements
  • refactor_plan.yaml — code smells → refactor candidates → prioritized action plan

You can modify these or build your own. The YAML format supports variable interpolation, step-level temperature and token limits, and referencing any previous step's output in a later prompt via {{step_name}}.
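The interpolation mechanic is simple enough to sketch. This stub substitutes `{{input}}` and `{{step_name}}` placeholders and runs steps against a fake model call, which is how you could dry-run a chain without spending tokens — the `run_step` stub here stands in for the real API request, and the runner is a simplified sketch rather than prompt_chain.py's actual implementation:

```python
import re


def interpolate(template: str, context: dict) -> str:
    """Replace {{name}} placeholders; leave unknown names untouched."""
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: context.get(m.group(1), m.group(0)),
        template,
    )


def run_chain(steps, user_input, run_step=None):
    """Run steps in order; each step's output is visible to later steps."""
    if run_step is None:
        # Stub model call for dry runs; the real script hits the API here.
        run_step = lambda prompt, step: f"[{step['name']} output]"
    context = {"input": user_input}
    for step in steps:
        prompt = interpolate(step["prompt"], context)
        context[step["name"]] = run_step(prompt, step)
    return context


chain = [
    {"name": "identify_issues", "prompt": "Find issues in:\n{{input}}"},
    {"name": "generate_fixes", "prompt": "Fix these:\n{{identify_issues}}"},
]
result = run_chain(chain, "def f(): pass")
```

Swapping the stub for a real completion call is the only change needed to go from dry run to live chain, which makes the step wiring itself trivially testable.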


Putting them together

A realistic workflow for a backend Python module:

# After writing a new module
python test_generator.py src/new_feature.py --output tests/test_new_feature.py
python -m pytest tests/test_new_feature.py -v  # check what it caught

# Before committing
python ai_code_reviewer.py src/new_feature.py --format markdown > .review.md
cat .review.md  # address anything critical

# When committing
git add src/new_feature.py tests/test_new_feature.py
python commit_message_ai.py --apply  # generates and commits with AI message

# If the module is going to be maintained by others
python doc_generator.py src/new_feature.py --output docs/new_feature.md

None of these steps is required, and each is useful on its own. But the combination is the point — it removes friction from the parts of Python development that aren't writing the actual logic.


Setup

All five scripts require only requests:

pip install requests

For Claude (default):

export ANTHROPIC_API_KEY=your_key_here

For GPT:

export OPENAI_API_KEY=your_key_here

Switch models at the command line with --model gpt or --model claude.

The scripts are in scripts/ inside the AI Dev Toolkit. The prompt chains are in scripts/chains/. The 272-prompt library from Part 1 is in prompts/, organized by category.


What's in the full toolkit

The AI Dev Toolkit includes:

  • 272 prompts organized across 10 categories (code generation, debugging, architecture, testing, code review, DevOps, documentation, data/SQL, frontend, AI integration)
  • 5 CLI scripts covered in this article
  • 5 pre-built prompt chains for multi-step workflows
  • Works with Claude and GPT — bring your own API key

Available at kazdispatch.gumroad.com for $29. Less than an hour of a developer consultant's time.

If you found Part 1 useful, Part 2 is the operational half. The prompts give you the right questions; the scripts give you the automation to run them without friction.


Part 1 of this series: Python Developer AI Toolkit, Part 1: How I stopped rewriting the same prompts and packaged 272 that actually work — update URL after Part 1 publishes

Next in the series: pytest fixtures that actually scale — patterns from 2 years of Python CI pipelines
