I got tired of reviewing my own pull requests at 2 AM. So I built a GitHub Action that does it for me. Then I built a cron system to keep that action alive. Then I added 55 more AI agent jobs to that cron system because, honestly, I couldn't stop.
Here's what's actually running, what it costs, and what I'm building toward.
The Code Reviewer That Started It All
The core product: a GitHub Action called sulthonzh/code-reviewer that lives at github.com/sulthonzh/code-reviewer. Every time someone opens a PR on any of my repos, five jobs fire off in sequence:
- Secret scan — checks the diff for leaked API keys, passwords, private keys
- AI review — sends the diff to Z.AI's GLM model, gets back security/quality/style feedback
- Quality gate — runs linting, type checks, test thresholds
- Auto-merge — if the AI approved AND quality passed, merges automatically
- Auto-release — on push to main, cuts a GitHub release with changelog
Here's the real workflow. This runs on 240+ repos right now:
name: AI Code Review
on:
pull_request:
types: [opened, synchronize, ready_for_review]
push:
branches: [main]
concurrency:
group: review-${{ github.ref }}
cancel-in-progress: true
permissions:
pull-requests: write
contents: write
checks: write
statuses: write
env:
ZAI_BASE_URL: "https://api.z.ai/api/coding/paas/v4/"
jobs:
secret-scan:
name: "🔒 Secret Scan"
runs-on: ubuntu-latest
outputs:
secrets_found: ${{ steps.scan.outputs.found }}
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0
- name: Scan diff for secrets
id: scan
uses: sulthonzh/code-reviewer@main
with:
command: secret-scan
github-token: ${{ secrets.GITHUB_TOKEN }}
- name: Block if secrets found
if: steps.scan.outputs.found == 'true'
run: |
echo "::error::Found potential secret(s) in the diff. Remove before merging."
exit 1
ai-review:
name: "🤖 AI Review"
runs-on: ubuntu-latest
needs: secret-scan
outputs:
approved: ${{ steps.review.outputs.approved }}
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0
- name: Detect project context
id: context
uses: sulthonzh/code-reviewer@main
with:
command: detect-context
github-token: ${{ secrets.GITHUB_TOKEN }}
- name: Route model by diff size
id: model
run: |
DIFF_LINES=$(git diff origin/main...HEAD 2>/dev/null | wc -l || echo 0)
if [ "$DIFF_LINES" -gt 500 ]; then
echo "model=glm-5.1" >> "$GITHUB_OUTPUT"
else
echo "model=glm-4.5" >> "$GITHUB_OUTPUT"
fi
- name: Run AI review
id: review
uses: sulthonzh/code-reviewer@main
with:
command: ai-review
model: ${{ steps.model.outputs.model }}
project-type: ${{ steps.context.outputs.project_type }}
zai-api-key: ${{ secrets.ZAI_API_KEY }}
zai-base-url: ${{ env.ZAI_BASE_URL }}
github-token: ${{ secrets.GITHUB_TOKEN }}
quality-gate:
name: "✅ Quality Gate"
runs-on: ubuntu-latest
needs: [secret-scan, ai-review]
outputs:
passed: ${{ steps.gate.outputs.passed }}
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0
- name: Run quality checks
id: gate
uses: sulthonzh/code-reviewer@main
with:
command: quality-gate
github-token: ${{ secrets.GITHUB_TOKEN }}
auto-merge:
name: "🔀 Auto-Merge"
runs-on: ubuntu-latest
needs: [ai-review, quality-gate]
if: >-
needs.ai-review.outputs.approved == 'true' &&
needs.quality-gate.outputs.passed == 'true' &&
github.event_name == 'pull_request'
steps:
- uses: actions/checkout@v6
- name: Approve and merge
uses: sulthonzh/code-reviewer@main
with:
command: auto-merge
github-token: ${{ secrets.GITHUB_TOKEN }}
auto-release:
name: "📦 Auto-Release"
runs-on: ubuntu-latest
if: >-
github.event_name == 'push' &&
github.ref == 'refs/heads/main'
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0
- name: Detect and release
uses: sulthonzh/code-reviewer@main
with:
command: auto-release
github-token: ${{ secrets.GITHUB_TOKEN }}
The model routing bit
Small PRs (under 500 lines diff) hit glm-4.5. Bigger ones get glm-5.1. This isn't arbitrary. The larger model costs more per token but handles cross-file reasoning better. Most PRs are under 500 lines, so the cheap model handles 90% of traffic.
The API endpoint is Z.AI (from 智谱AI, a Chinese AI company). Their GLM models are OpenAI-compatible, so the integration was just pointing the OpenAI SDK at a different base URL. No wrappers, no adapters.
What it actually costs
Per review:
Z.AI API call: ~$0.002
GitHub Actions: ~$0.003 (free tier mostly covers this)
Total: ~$0.006 per review
I'm spending roughly $3-5/month on API calls across all repos. That's less than a coffee.
The Secret Scanning Story
Here's where it got interesting. Before I built the secret-scan job, I ran a manual sweep across 240 public repos. Found 9 repos with real leaked credentials in git history:
- AWS access keys
- MySQL root passwords
- RSA private keys
- Hardcoded JWT secrets
Cleaning them wasn't just git rm. The secrets were in history. I used git filter-repo to rewrite the affected repos, rotated every compromised credential, and added the secret-scan job to the workflow to prevent recurrence.
That job alone has caught three attempted credential pushes in the last month. Worth the entire build.
The Babysitter: OpenClaw Cron Fleet
The code reviewer runs fine on its own. But I kept adding things. A marketing supervisor that publishes blog posts to Dev.to (10 articles so far). A deployment supervisor that ships to Vercel free tier. An IDX stock screener that runs 20+ intraday scans on the Indonesian exchange. A wealth builder that scaffolds SaaS products.
All of these are AI agent jobs running on cron schedules through a system I call OpenClaw.
Current state: 56 jobs, monitored by a guardian process that scans every few hours.
Guardian cycle 2026-06-11 04:48 WIB:
- 56 jobs scanned
- 0 with consecutiveErrors >= 2
- 1 single-error transient (wealth-builder timeout)
- No actions taken
The guardian doesn't just watch. It has rules:
- 1 consecutive error: ignore, probably transient
- 2 consecutive errors: monitor, create incident ticket
- 5+ consecutive errors: auto-heal (restart job, switch model, increase timeout)
This actually worked last week. The marketing supervisor started failing because the GLM model hit rate limits. The guardian detected 2+ consecutive errors, switched the model to glm-4.5-air (lighter, faster), bumped the timeout from 2700s to 3600s. Resolved without me touching anything.
The circuit breaker pattern
Each agent job wraps its API calls in a circuit breaker. Here's the pattern from my IDX screener:
class HealthRecord:
"""Track health of a single component."""
def record_failure(self):
self.consecutive_failures += 1
self.consecutive_successes = 0
if self.consecutive_failures >= 5:
self.circuit_open = True
self.circuit_opened_at = time.time()
def record_success(self, duration_ms: float = 0):
self.consecutive_successes += 1
self.consecutive_failures = 0
if self.consecutive_successes >= 3:
self.circuit_open = False # auto-close after 3 wins
@property
def is_healthy(self):
if not self.circuit_open:
return True
# Half-open: try again after 5 min cooldown
if time.time() - self.circuit_opened_at > 300:
return True
return False
5 failures in a row opens the circuit. 3 successes in a row closes it. 5-minute half-open cooldown lets it retry. This runs in production and has prevented cascading failures during API outages.
The Architecture (What Exists vs. What's Next)
Here's the honest map:
┌─────────────────────────────────────────────────────┐
│ WHAT'S LIVE │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌───────────┐ │
│ │ AI Code │ │ OpenClaw │ │ Guardian │ │
│ │ Reviewer │ │ Cron Fleet │ │ Monitor │ │
│ │ (240 repos) │ │ (56 jobs) │ │ (auto- │ │
│ │ │ │ │ │ heal) │ │
│ └──────────────┘ └──────────────┘ └───────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ Marketing │ │ Secret Scan │ │
│ │ Supervisor │ │ (9 repos │ │
│ │ (10 posts) │ │ cleaned) │ │
│ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────┐
│ WHAT I'M BUILDING TOWARD │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌───────────┐ │
│ │ Wallet │ │ Cloning │ │ Revenue │ │
│ │ Module │ │ Engine │ │ Engine │ │
│ │ (Stripe) │ │ (multi-cloud)│ │ (SaaS) │ │
│ └──────────────┘ └──────────────┘ └───────────┘ │
└─────────────────────────────────────────────────────┘
The bottom row doesn't exist yet. I'm sharing the architecture because it's where this is heading, but I want to be clear about the boundary.
What's next (honest roadmap)
Near term (building now):
- Wallet module with Stripe integration for the code reviewer SaaS
- Better incident response (currently the guardian can restart jobs and switch models; adding credential rotation automation)
Medium term (designing):
- Multi-cloud cloning (snapshot state, deploy to new provider)
- Revenue engine (paid tiers for the code reviewer, API marketplace listing)
Far term (thinking about):
- Swarm coordination between cloned instances
- Knowledge base that actually learns from review patterns over time (currently static prompts)
Why Z.AI and Not OpenAI
Three reasons:
Cost. GLM-4.5 costs roughly 10x less per token than GPT-4o for code review quality that's comparable for the patterns I care about (security, style, common bugs).
Latency. The API responds in under 2 seconds for most diffs. OpenAI was averaging 4-5 seconds.
OpenAI-compatible. Zero code changes to the OpenAI SDK. Just swap
baseURLandapiKey. I could switch back to OpenAI (or add Claude, or Gemini) in about 10 minutes if Z.AI went down.
That last point matters. Vendor lock-in is the enemy of resilience.
Try It
Drop this into .github/workflows/ai-review.yml on any repo:
name: AI Code Review
on:
pull_request:
types: [opened, synchronize]
jobs:
review:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0
- uses: sulthonzh/code-reviewer@main
with:
command: ai-review
zai-api-key: ${{ secrets.ZAI_API_KEY }}
github-token: ${{ secrets.GITHUB_TOKEN }}
You'll need a Z.AI API key from open.z.ai. The free tier covers a few hundred reviews per month.
The code is open source at github.com/sulthonzh/code-reviewer.
Top comments (0)