How I Built an AI Agent That Watches My Logs and Opens Pull Requests While I Sleep 😴🤖

#ai #opensource #python #devops

For a developer, few things are more anxiety-inducing than the Slack notification sound at 3:00 AM: "Production is down."

You groggily open your laptop, pull up the server logs, trace the exception through five different files, add a missing try/except block, push the hotfix, and try to go back to sleep.

I got tired of this. As an engineer obsessed with automation, I decided to build something that solves the problem for me. Enter AutoFixer-Agent.

## What Is AutoFixer?

AutoFixer is an autonomous AI agent, built with Python, that watches your production server logs in real time. When it detects a crash or an exception, it doesn't just alert you: it investigates the stack trace, locates the offending code in your codebase, generates a contextual fix using an LLM, and automatically opens a Pull Request on GitHub.

You wake up to a PR waiting for review, not a broken production environment. ✅

## How It Works Under the Hood 🛠️

The architecture is simple, with four parts:

1. **The Log Watcher:** A background Python daemon constantly tails your error.log.
2. **The Brain (LLM Orchestration):** When an exception is thrown, the agent captures the stack trace and uses the Google Gemini API to analyze the root cause. It maps the error back to the specific line of code in the repository.
3. **The Fixer:** The agent generates a drop-in replacement block of code.
4. **The GitHub Bot:** Using GitHub Actions and the GitHub CLI, the agent branches off main, applies the fix locally, runs sanity checks, and pushes a new Pull Request with a detailed explanation of the bug.
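
To make the "maps the error back to the specific line of code" step concrete, here is a minimal sketch of pulling the innermost frame out of a Python traceback found in log text. This is an illustration under assumptions, not AutoFixer's actual code: the names `FRAME_RE`, `Frame`, and `innermost_frame` are hypothetical, the sample log is made up, and it assumes the log contains standard Python tracebacks. It also works on a string rather than tailing a live file.

```python
import re
from typing import NamedTuple, Optional

# Matches the frame lines Python prints in a traceback, e.g.
#   File "auth/handler.py", line 52, in process_request
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')

TRACEBACK_HEADER = "Traceback (most recent call last):"


class Frame(NamedTuple):
    path: str
    line: int
    func: str


def innermost_frame(log_text: str) -> Optional[Frame]:
    """Return the innermost frame of the most recent traceback, or None."""
    start = log_text.rfind(TRACEBACK_HEADER)
    if start == -1:
        return None  # no traceback in this chunk of log
    frames = [
        Frame(m["path"], int(m["line"]), m["func"])
        for m in FRAME_RE.finditer(log_text, start)
    ]
    # Python prints the outermost call first, so the last frame is where it failed.
    return frames[-1] if frames else None


# Made-up log excerpt, for illustration only.
sample_log = '''2024-01-01 03:00:00 ERROR Unhandled exception
Traceback (most recent call last):
  File "app/server.py", line 10, in handle
    process_request(payload)
  File "auth/handler.py", line 52, in process_request
    user = payload['user_id']
KeyError: 'user_id'
'''

print(innermost_frame(sample_log))
```

The file path and line number from that frame are what the later steps need: they tell the agent which file to open and which line to show the model.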

## The "Aha!" Moment 💡

The hardest part wasn't generating the code; LLMs are great at that now. The hardest part was filling the context window with the right code.

If a generic KeyError happens, the LLM needs to know what dictionary it came from. A naked stack trace is not enough.


```python
# Bad prompt (hallucination-prone): the model sees only the error message.
bad_prompt = "Fix this error: KeyError: 'user_id'"

# Good prompt (context-aware): the error plus the code around the failing line.
good_prompt = """Fix this error: KeyError: 'user_id'
Surrounding code (lines 45-95 of auth/handler.py):
...
def process_request(payload):
    user = payload['user_id']  # <-- line 52
..."""
```

To solve this, AutoFixer dynamically pulls in the surrounding **50 lines of code** from the file named in the stack trace before sending the prompt to the AI. This gives the model enough context to propose a fix grounded in the actual code rather than a hallucinated one.
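
A minimal sketch of that context-gathering step is below. It is an illustration, not the project's implementation: `surrounding_lines` and `build_prompt` are hypothetical names, the window is centered on the failing line (an assumption; the real tool may slice differently), and the demo runs against a throwaway file so it works anywhere.

```python
import tempfile
from pathlib import Path
from typing import Tuple


def surrounding_lines(path: str, line_no: int, window: int = 50) -> Tuple[int, int, str]:
    """Return (first, last, text) for roughly `window` lines around line_no (1-indexed)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    half = window // 2
    first = max(1, line_no - half)           # clamp at the top of the file
    last = min(len(lines), line_no + half)   # clamp at the bottom of the file
    numbered = [
        f"{n:>5} | {text}" + ("  # <-- failing line" if n == line_no else "")
        for n, text in enumerate(lines[first - 1:last], start=first)
    ]
    return first, last, "\n".join(numbered)


def build_prompt(error: str, path: str, line_no: int) -> str:
    """Combine the error message with numbered source context."""
    first, last, snippet = surrounding_lines(path, line_no)
    return (
        f"Fix this error: {error}\n"
        f"Surrounding code (lines {first}-{last} of {path}):\n"
        f"{snippet}"
    )


# Demo on a generated 100-line file (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "handler.py"
    demo_file.write_text(
        "\n".join(f"x{i} = {i}" for i in range(1, 101)), encoding="utf-8"
    )
    prompt = build_prompt("KeyError: 'user_id'", str(demo_file), 52)

print(prompt)
```

Numbering the lines and marking the failing one lets the model refer to exact locations in its answer, which makes the returned patch easier to apply mechanically.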

## Why This Matters

We are moving from **"AI as a pair programmer"** (GitHub Copilot) to **"AI as a DevOps team member."**

Tools like AutoFixer show that we can delegate tedious, high-stress tasks, like 3 AM hotfixes, to autonomous systems that handle the boring parts while we sleep.

## Try it Out!

I've open-sourced the entire project! You can clone it, simulate a crash in your local logs, and watch it generate a GitHub PR in real time.

🔗 **GitHub:** [turfin-logic/autofixer-agent](https://github.com/turfin-logic/autofixer-agent)

If you're into automation, DevSecOps, or AI agents — drop a ⭐ on the repo or contribute. Let's automate the boring (and stressful) stuff together. 💪

Top comments (4)

Harjot Singh • May 31

An agent that watches logs and opens PRs while you sleep is the dream use-case, and it's also exactly where the "let it act autonomously" question gets sharp - reading logs and diagnosing is low-risk, but opening a PR with a proposed fix is the moment it crosses from observer to actor. The thing that makes this safe and trustworthy is that the PR is a proposal, not an auto-merge: the agent does the tedious triage (parse the error, find the likely cause, draft the patch) and you keep the gate. Propose-then-approve, not act-then-hope. That boundary is what makes "while I sleep" actually sleepable.

This is the exact spirit of what I build - Moonshift is a multi-agent pipeline that takes a prompt to a deployed SaaS, and the whole night-shift idea (agents do the work while you're away, you review the result) is literally the brand: the agents run the night shift, you wake up to something to approve. A verify layer gates each step so a bad diagnosis doesn't become a bad merge, and multi-model routing keeps it cheap (~$3 a build, first run free no card). Really like this project. How are you keeping a misdiagnosed log from producing a confidently-wrong PR - confidence threshold before it opens one, or does every alert get a draft and you triage in the morning? The false-positive PR is the thing I'd guard hardest.

Rajesh Bhanushali • May 31

Hey Harjot, thanks man! You totally nailed it with the "propose-then-approve" point. That's exactly the sweet spot. If it auto-merges, I'm definitely not sleeping 😅. Moonshift sounds super cool btw, love the night-shift branding and that $3 build routing is actually insane.

To answer your question on false positives (which is honestly the biggest headache): I rely on a mix of local test loops and a confidence threshold.

Basically, before opening a PR, the agent tries to run the proposed fix locally against the linters and the specific failing test. If it breaks syntax or the test still fails, it just bails out.

On top of that, it assigns itself a confidence score. If it's highly confident AND the local tests pass, it opens the PR. But if it's unsure, it doesn't touch the PR tab at all—it just creates a draft issue with the patch attached so I can triage it over morning coffee.

Tying it to actual compiler/test feedback is the only way I've found to stop it from confidently shipping garbage.
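
As a rough illustration of that gate (a sketch only, not code copied from the repo; `Action`, `FixAttempt`, `decide`, and the 0.8 cutoff are all made up for the example):

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    OPEN_PR = "open_pr"
    DRAFT_ISSUE = "draft_issue"
    BAIL = "bail"


@dataclass
class FixAttempt:
    lint_ok: bool       # linters passed on the patched code
    test_ok: bool       # the previously failing test now passes
    confidence: float   # model's self-reported score in [0, 1]


def decide(attempt: FixAttempt, threshold: float = 0.8) -> Action:
    """Pick what to do with a proposed fix."""
    # Hard gate first: real tool feedback outranks the model's opinion of itself.
    if not (attempt.lint_ok and attempt.test_ok):
        return Action.BAIL
    # Checks passed: only a confident fix becomes a PR.
    if attempt.confidence >= threshold:
        return Action.OPEN_PR
    # Passing but unsure: park the patch for human triage.
    return Action.DRAFT_ISSUE


print(decide(FixAttempt(lint_ok=True, test_ok=True, confidence=0.93)))
print(decide(FixAttempt(lint_ok=True, test_ok=True, confidence=0.55)))
print(decide(FixAttempt(lint_ok=True, test_ok=False, confidence=0.99)))
```

The ordering matters: a high confidence score never rescues a fix that fails the linter or the test.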

Definitely gonna check out Moonshift. Keep building awesome stuff.

Harjot Singh • May 31

Haha exactly, auto-merge at 3am is how you wake up to a fire. Glad the night-shift framing landed; that's literally the origin story (agents work the shift while you sleep, you wake to something to approve, never something already merged). If you ever point Moonshift at a real repo, the free first run is the easiest way to see the propose-then-gate flow end to end. Keep building the log-watcher though, that's a genuinely useful agent, and the PR-as-proposal boundary you've got is the right call.

Rajesh Bhanushali • May 31

haha alright true, waking up to a broken prod because of a 3am auto-merge is the stuff of nightmares 😅 I'll definitely check out Moonshift on a real repo soon to see that propose-then-gate flow in action.

btw, really dig what you guys are building. would love to connect and talk more about agent stuff. you on twitter or discord? drop your handle.