pulkitgovrani

Posted on May 24

Shadow CTO — A GitHub Repo Memory Container Powered by Hermes Agent

#hermesagentchallenge #devchallenge #agents


This is a submission for the Hermes Agent Challenge: Build With Hermes Agent

What I Built

Shadow CTO is a persistent AI memory system that watches any GitHub repository and answers natural language questions about why engineering decisions were made — not just what changed.

Ask it "Why was Redis removed?" and it responds with the actual commit reasoning, PR context, and what problem it solved. Not a hallucination. Not a summary of the README. The actual institutional memory of your codebase, accumulated over time.

The core insight: engineers record what changed in their commits but rarely write down why. Over time that institutional knowledge walks out the door. Shadow CTO fills that gap by giving every repo a persistent AI brain that never quits, never forgets, and never joins another team.

Demo

Live Q&A example (streamed from Hermes memory):

You: "Why was the authentication system changed?"

Shadow CTO: In PR #342 merged on March 14th, the team replaced
JWT-based auth with session cookies after discovering that
the token refresh logic had a race condition under concurrent
requests (issue #289). The decision was driven by a production
incident that caused 2% of users to be logged out unexpectedly
during peak hours. The tradeoff accepted was slightly higher
server memory in exchange for eliminated client-side token state.

The 4 tabs:

Dashboard — Add repos, trigger sync, ask free-form questions with live streaming responses
Decisions — Browse every extracted engineering decision with rationale, type, confidence score, and tags
Patterns — Autonomous failure pattern detection (Hermes flags what keeps breaking)
Autonomous Jobs — View the cron jobs Shadow CTO registered with Hermes's scheduler

Code

github.com/pulkitg/shadow-cto

My Tech Stack

Layer	Technology
AI Memory	Hermes Agent (persistent sessions via `X-Hermes-Session-Id`)
Backend	Python 3.12, FastAPI, APScheduler
Database	SQLite via SQLAlchemy async
Frontend	React 18, Vite, Tailwind CSS
GitHub	PyGithub (commits, PRs, issues)

How I Used Hermes Agent

Hermes is load-bearing in four distinct ways — not just a call to an LLM, but the entire memory and scheduling backbone of the system.

1. Persistent Memory per Repository

Every GitHub repo gets its own Hermes session ID stored in the database. Every commit, PR, and issue ever synced is fed into that session. The X-Hermes-Session-Id header turns Hermes into a stateful brain that accumulates context indefinitely — one per repo:

# hermes_client.py — every chat uses the repo's dedicated session
response = await self._openai.chat.completions.create(
    model=self.model,
    messages=messages,
    extra_headers={"X-Hermes-Session-Id": session_id},  # per-repo memory
)

This is the killer feature. Each repo's Hermes session is its own contained institutional memory. facebook/react and your-team/backend each have completely separate, isolated histories.
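Here is a minimal sketch of how a per-repo session ID could be derived and attached to a request. The helper names (`session_id_for`, `session_headers`) and the choice of `uuid5` are assumptions for illustration. The post only states that each repo's session ID is stored in the database, not how it is generated.

```python
import uuid

# Hypothetical namespace for this sketch; any fixed UUID would work.
SHADOW_CTO_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/shadow-cto")


def session_id_for(repo_full_name: str) -> str:
    """Derive a stable session ID from an 'owner/name' repo string.

    GitHub treats repo names case-insensitively, so normalize before hashing.
    """
    normalized = repo_full_name.strip().lower()
    return f"repo-{uuid.uuid5(SHADOW_CTO_NAMESPACE, normalized)}"


def session_headers(repo_full_name: str) -> dict[str, str]:
    """Build the header dict that would be passed as extra_headers."""
    return {"X-Hermes-Session-Id": session_id_for(repo_full_name)}
```

A deterministic ID means the same repo always maps to the same session, even if the database row is rebuilt, while two different repos never share one.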

2. Decision Extraction at Ingest Time

When a new commit or PR arrives, it's immediately sent to Hermes with a structured system prompt asking it to decide: is this a meaningful engineering decision? If yes, extract the rationale, classify it, and score confidence.

# services/ingestion.py — Hermes reads each event and extracts decisions
INGEST_SYSTEM_PROMPT = """You are the Shadow CTO for {repo_name}. When you receive
a commit, PR, or issue, determine if a meaningful engineering decision was made,
extract the rationale, and classify the decision type. Respond in JSON."""

Because this runs through the same session, Hermes builds cumulative understanding. By the 50th commit, it has context about what came before and can identify reversals, patterns, and contradictions.
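Since the prompt asks for JSON, the reply has to be validated before it lands in the database. The sketch below shows one way to do that. The field names (`is_decision`, `rationale`, `decision_type`, `confidence`, `tags`) are assumed for illustration, based on what the Decisions tab displays, and are not taken from the actual Shadow CTO schema.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Decision:
    rationale: str
    decision_type: str
    confidence: float
    tags: list[str] = field(default_factory=list)


def parse_decision(raw: str) -> Optional[Decision]:
    """Return a Decision, or None if the reply is not a usable decision."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # the model replied with something other than JSON
    if not isinstance(data, dict) or not data.get("is_decision"):
        return None  # not an object, or the model judged it routine
    rationale = str(data.get("rationale", "")).strip()
    if not rationale:
        return None  # a decision without a "why" is not worth storing
    try:
        confidence = float(data.get("confidence", 0.0))
    except (TypeError, ValueError):
        return None
    confidence = min(1.0, max(0.0, confidence))  # clamp to [0, 1]
    tags = data.get("tags")
    return Decision(
        rationale=rationale,
        decision_type=str(data.get("decision_type", "unknown")),
        confidence=confidence,
        tags=[str(t) for t in tags] if isinstance(tags, list) else [],
    )
```

Returning `None` for anything malformed keeps a single bad model reply from breaking the ingest loop.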

3. Streaming Natural Language Q&A

The query endpoint asks Hermes questions against its own accumulated memory and streams the answer back to the frontend using Server-Sent Events. Hermes answers from what it actually knows — not from a RAG database or keyword search:

# routers/query.py — SSE streaming from Hermes memory
async for chunk in hermes.stream_chat(
    messages=messages,
    session_id=repo.hermes_session_id,  # query the right repo's brain
):
    yield f"data: {chunk}\n\n"
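One detail worth handling when streaming model output over SSE: a chunk that contains a newline would end the `data:` field early. The Server-Sent Events format expects one `data:` line per line of text, and the browser joins them back together with newlines. A small framing helper covers this. The name `sse_event` is hypothetical and not part of the project's code.

```python
def sse_event(chunk: str) -> str:
    """Frame one chunk as a Server-Sent Event.

    Each line of the chunk gets its own 'data:' field, and a blank line
    terminates the event, so multi-line chunks survive the round trip.
    """
    lines = chunk.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    return "".join(f"data: {line}\n" for line in lines) + "\n"
```

In the streaming loop this would replace the f-string, for example `yield sse_event(chunk)`.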

4. Autonomous Cron Jobs via Hermes Scheduler

Shadow CTO registers two jobs directly with Hermes's /api/jobs endpoint on startup. They appear in the Hermes jobs dashboard and drive the system's fully autonomous behavior. The daily pattern scan looks like this:

# jobs/cron_setup.py
await hermes.create_job(
    name="shadow-cto-daily-patterns",
    schedule="0 2 * * *",
    prompt=(
        "You are the Shadow CTO. Review the engineering decisions stored "
        "in your memory from the past 30 days. Identify recurring failure "
        "patterns — components that keep breaking, decisions that get reversed, "
        "or technical debt accumulating."
    ),
)

Every night at 2:00 AM (in the scheduler's time zone), Hermes scans the accumulated decision history and surfaces failure patterns with no human prompt. This is the agentic loop: ingest → remember → analyze → alert, running continuously.
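Because registration happens on every startup, it helps to make it idempotent so restarts don't pile up duplicate jobs. The sketch below is one way to do that, under two assumptions: that the payload mirrors the `create_job` arguments shown above, and that the names of already-registered jobs can be fetched somehow. The actual request and response schema of /api/jobs is not confirmed here, and both helper names are hypothetical.

```python
def build_job_payload(name: str, schedule: str, prompt: str) -> dict[str, str]:
    """Build a job payload, rejecting schedules that are not 5-field cron."""
    if len(schedule.split()) != 5:
        raise ValueError(f"expected a 5-field cron expression, got {schedule!r}")
    return {"name": name, "schedule": schedule, "prompt": prompt}


def missing_jobs(
    desired: list[dict[str, str]], existing_names: set[str]
) -> list[dict[str, str]]:
    """Return only the desired jobs whose names are not registered yet."""
    return [job for job in desired if job["name"] not in existing_names]
```

On startup, the app would build its desired jobs, look up what Hermes already has, and call `create_job` only for the result of `missing_jobs`.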

Why Hermes Was the Right Fit

The alternative would have been a vector database plus a RAG pipeline. That approach answers questions about what's in the documents. Hermes's persistent session memory answers questions about what the AI has learned from watching your repo over time. The difference is subtle but critical: Hermes builds understanding instead of only retrieving text.

The cron job integration was the other decisive factor. Registering jobs directly with Hermes means the autonomous analysis is part of the same system that holds the memory, not a separate process calling a stateless API.

Shadow CTO proves that persistent agent memory isn't just a feature — it's the foundation that makes an AI system genuinely useful over time rather than just impressive in a demo.

DEV Community