Most AI coding advice still assumes a human is sitting inside an IDE waiting for autocomplete. That is useful, but it is not the most powerful workflow.
The real unlock is a small coding agent that can run on a VPS, inspect a repository, make a plan, edit files, run tests, summarize failures, and keep working while you are away.
This article shows a practical setup you can build without enterprise infrastructure.
## The goal
The goal is not to create a magical senior engineer. The goal is to automate the boring middle of development:
- create boilerplate
- refactor repetitive files
- write first-pass tests
- update documentation
- scan logs
- open small pull requests
- summarize what changed
If the task requires product judgment, architecture tradeoffs, or security approval, the agent should stop and ask. If the task is mechanical, it should execute.
## Minimal architecture
A useful 24/7 coding agent needs five parts:
- A task queue: where work items live
- A planner: turns a request into steps
- A tool layer: file edits, shell commands, git, test runners
- A memory layer: project notes, conventions, previous failures
- A review gate: prevents unsafe deploys
You can run all of this on a small VPS. The expensive part is usually not compute; it is token usage. That means the design should minimize unnecessary context.
## Recommended stack
For a lean version, I would use:
- Python for orchestration
- GitHub Issues or a local SQLite table as the task queue (see the sketch after this list)
- Docker for isolated execution
- ripgrep for code search
- pytest, npm test, or your normal test command
- one strong model for planning
- one cheaper model for summaries and repetitive transformations
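To make the queue concrete, here is a minimal sketch of the SQLite variant. The table schema (id, repo, request, status) and the helper name are assumptions for illustration, not a fixed API:

```python
# Minimal SQLite task queue. The schema (id, repo, request, status)
# is an illustrative assumption -- adapt it to whatever fields you track.
import sqlite3
from dataclasses import dataclass
from typing import Optional

@dataclass
class Task:
    id: int
    repo: str
    request: str

def get_next_task(db_path: str = "agent/tasks.db") -> Optional[Task]:
    """Fetch the oldest queued task and mark it in progress."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT id, repo, request FROM tasks "
            "WHERE status = 'queued' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE tasks SET status = 'in_progress' WHERE id = ?", (row[0],)
        )
        conn.commit()
        return Task(*row)
    finally:
        conn.close()
```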
A simple folder structure works:
```
agent/
  tasks.db
  main.py
  tools/
    shell.py
    files.py
    git.py
  memory/
    project_rules.md
    failure_log.md
  workspaces/
```
The agent loop is straightforward:
```python
from time import sleep

while True:
    task = get_next_task()
    if not task:
        sleep(60)
        continue

    repo = checkout_repo(task.repo)
    context = collect_relevant_context(repo, task)
    plan = ask_model_for_plan(task, context)

    for step in plan.steps:
        result = execute_step(step)
        if result.failed:
            fix = ask_model_to_debug(step, result.logs)
            execute_step(fix)

    run_tests()
    create_summary()
    open_pull_request_or_request_review()
```
This is not glamorous, but it works.
## The most important rule: give the agent less context
A common mistake is dumping the whole repository into the prompt. That is slow, expensive, and often worse.
Instead, build context in layers:
- project rules
- relevant file tree
- files found by search
- failing test output
- exact task request
For example, if the task is “add CSV export to invoices,” the agent probably needs the invoice model, invoice routes, export utilities, tests, and coding rules. It does not need your entire frontend.
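A sketch of that layered collection, assuming ripgrep is on the PATH and cached rules live in memory/project_rules.md. The helper name matches the loop above, though here it takes a plain search term (e.g. "invoice") instead of a task object, and the file cap is an arbitrary starting point:

```python
# Layered context: project rules first, then only the files that
# actually match the task, found with ripgrep (--files-with-matches
# lists matching paths without printing their contents).
import subprocess
from pathlib import Path

def collect_relevant_context(repo: Path, query: str, max_files: int = 5) -> str:
    rules = Path("memory/project_rules.md").read_text()
    hits = subprocess.run(
        ["rg", "--files-with-matches", query, str(repo)],
        capture_output=True, text=True,
    ).stdout.splitlines()[:max_files]
    sources = "\n\n".join(f"# {path}\n{Path(path).read_text()}" for path in hits)
    return f"Project rules:\n{rules}\n\nRelevant files:\n{sources}"
```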
## Safety gates I recommend
A coding agent with shell access needs boundaries. I use rules like:
- no production credentials in the workspace
- no destructive shell commands unless explicitly approved
- no direct deploys without review
- always run tests before creating a PR
- summarize every file changed
- if tests fail twice, stop and ask for help
The best agent is not the one that never fails. It is the one that fails safely.
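To make the "no destructive shell commands" rule concrete, here is a minimal sketch of a command gate. The blocklist is illustrative and deliberately incomplete; Docker isolation is still the real safety net:

```python
# Reject shell commands whose first token is on a blocklist and
# escalate to a human instead of executing them.
import shlex

BLOCKED = {"rm", "dd", "mkfs", "shutdown", "curl", "wget"}

def execute_step_safely(command: str) -> None:
    tokens = shlex.split(command)
    if not tokens or tokens[0] in BLOCKED:
        raise PermissionError(f"Needs explicit approval: {command!r}")
    # ... run the approved command inside the Docker workspace here
```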
## High-ROI tasks to automate first
Start with tasks where correctness is easy to verify:
1. Test generation
Give the agent one module and ask it to create missing tests. This is great because the test runner provides a clear signal.
2. Documentation updates
Ask it to update README sections, API examples, changelogs, or migration notes after a change.
3. Mechanical refactors
Renaming functions, updating imports, replacing deprecated APIs, and applying formatting rules are perfect agent tasks.
4. Bug reproduction
Give the agent an error message and ask it to write a failing test before attempting a fix.
5. Dependency maintenance
The agent can inspect outdated packages, read changelogs, update versions, run tests, and report risks.
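For a Python project, the first step of that dependency check can be this small. The pip flag and JSON fields shown here are real; what the agent does with the report is up to your loop:

```python
# Ask pip for its JSON report of outdated packages and print the deltas.
import json
import subprocess

report = subprocess.run(
    ["pip", "list", "--outdated", "--format=json"],
    capture_output=True, text=True, check=True,
).stdout

for pkg in json.loads(report):
    print(f"{pkg['name']}: {pkg['version']} -> {pkg['latest_version']}")
```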
## A simple prompt template
Use a consistent system prompt:
```
You are a cautious coding agent working inside a disposable repo checkout.
Your job is to complete the task with the smallest safe change.
Before editing, inspect relevant files.
After editing, run tests.
If you are uncertain or need secrets, stop and ask.
Return a summary with files changed, commands run, and remaining risks.
```
Then pass task-specific context:
```
Task: Add CSV export for invoice list.
Repo rules: Use existing service pattern. No new dependency without approval.
Relevant files: ...
Test command: pytest tests/invoices
```
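Assembling the two halves is plain string work. A sketch, assuming the common chat-messages shape and a hypothetical memory/system_prompt.md file holding the template above:

```python
# Combine the fixed system prompt with task-specific context.
from pathlib import Path

SYSTEM_PROMPT = Path("memory/system_prompt.md").read_text()

def build_messages(task: str, rules: str, files: str, test_cmd: str) -> list[dict]:
    user = (
        f"Task: {task}\n"
        f"Repo rules: {rules}\n"
        f"Relevant files: {files}\n"
        f"Test command: {test_cmd}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user},
    ]
```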
## Cost control
To keep the monthly bill low:
- use a cheaper model for log summaries
- cache project rules
- cap each task at a maximum number of model calls
- truncate logs aggressively
- run small tasks instead of giant tasks
- store previous solutions in memory
In practice, many useful tasks cost cents, not dollars, if the agent retrieves only relevant context.
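Two of those controls fit in a few lines. A sketch; the limits are arbitrary starting points to tune against your own bill:

```python
# Hard cap on model calls per task, plus tail-biased log truncation.
MAX_CALLS_PER_TASK = 12
MAX_LOG_CHARS = 4000

class CallBudget:
    def __init__(self, limit: int = MAX_CALLS_PER_TASK):
        self.remaining = limit

    def spend(self) -> None:
        """Call this before every model request; raises when the cap is hit."""
        if self.remaining <= 0:
            raise RuntimeError("Model-call budget exhausted; stopping task.")
        self.remaining -= 1

def truncate_logs(logs: str, limit: int = MAX_LOG_CHARS) -> str:
    """Keep the tail of the logs, where the failure usually is."""
    return logs if len(logs) <= limit else "...[truncated]...\n" + logs[-limit:]
```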
## Final thought
A 24/7 coding agent is not a replacement for engineering judgment. It is a tireless junior teammate for repetitive work. The winning workflow is human direction plus agent execution plus automated tests.
Start small: one repository, one task type, one review gate. Once that loop is reliable, add more tools.