DEV Community

EmaadS

Bounty Scout: I gave Hermes the job of finding work that pays — and it wrote its own skill to do it

This is a submission for the Hermes Agent Challenge: Build With Hermes Agent.

What I Built

Bounty Scout — a small agent that finds funded open-source bounties worth
actually working on, and gets better at judging them every time it runs.

I didn't want to build another "wrap an LLM in a loop" demo. Hermes Agent's
defining feature is a closed learning loop: after doing a task it can write a
reusable skill, and then improve that skill the next time. So I built the
smallest project that makes that loop the whole point.

The job I gave it is one I genuinely care about: which open-source bounties can an
AI-assisted developer realistically win and get paid for? In 2026 that's a real
filtering problem: many funded issues now explicitly ban AI contributions or
demand human-only proof, and a naive scraper happily wastes your time on them.

The self-improving loop (the actual demo)

| Run | What Hermes did |
|-----|-----------------|
| Run 1 | Scouted GitHub for funded bounties, triaged 20 of them against a 7-axis rubric, wrote a ranked shortlist, and authored a bounty-triage skill from scratch. |
| Run 2 | Loaded the skill it wrote, scored fresh bounties, appended new finds, then edited its own skill, tightening the dollar-amount parsing it found brittle. |

That second row is the magic. Here's the end of Run 2's transcript, in its own words:

4. I improved the `bounty-triage` skill by updating its SKILL.md...
   - "Funded?" score 2 → "Clear cash payout explicitly stated
     (now robustly parsed from title, including decimals)."
   - "Dollars-vs-effort?" → "scoring now includes type check for
     numerical estimated dollar amount."

It noticed its own weakness and patched its own playbook. Run 3 starts smarter than
Run 1 did — with zero changes from me.
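
To make the transcript's "robustly parsed from title, including decimals" concrete, here is a minimal sketch of what such parsing could look like. This is not the code Hermes wrote (the skill lives in SKILL.md as instructions, not Python); the regex, the function names `parse_dollars` and `funded_score`, and the fallback score of 0 are my own assumptions for illustration.

```python
import re
from typing import Optional

# Assumed pattern: "$", then digits with optional thousands commas,
# then an optional decimal part. Not taken from the actual skill file.
_DOLLAR_RE = re.compile(r"\$\s?(\d{1,3}(?:,\d{3})+|\d+)(\.\d+)?")


def parse_dollars(title: str) -> Optional[float]:
    """Return the first dollar amount found in a title, or None if absent."""
    match = _DOLLAR_RE.search(title)
    if match is None:
        return None
    whole = match.group(1).replace(",", "")  # drop thousands separators
    fraction = match.group(2) or ""          # keep decimals when present
    return float(whole + fraction)


def funded_score(title: str) -> int:
    """Toy 'Funded?' axis: 2 when a numeric payout is stated, else 0.

    The type check mirrors the transcript's "type check for numerical
    estimated dollar amount"; the 0 fallback is an assumption.
    """
    amount = parse_dollars(title)
    return 2 if isinstance(amount, float) and amount > 0 else 0
```

With this sketch, a title like "[$1,250.50 bounty] Fix parser" parses to 1250.5, while a title with no dollar sign returns None and scores 0 on the toy axis.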

A slice of what it actually surfaced (it correctly vetoed a security/PIN bounty
as outside an AI's safe zone, and marked the AI-friendly ones as pursue):

| Title | Verdict | Est. payout | Why |
|-------|---------|-------------|-----|
| Attachment Summarizer Service | pursue | $960 | High payout, AI-friendly, good stack fit |
| Low Hanging Fruit Automation | pursue | $700 | Explicitly AI-friendly, small tasks |
| Note Locking: Biometrics/PIN | avoid | $660 | Security topic; needs careful human review |
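
The veto-first ordering behind those verdicts can be sketched in a few lines. In the real project the LLM makes these calls by following the skill, so treat this as a deterministic analogue only: the keyword patterns, the `Bounty` class, and the $700 `pursue_at` threshold are all hypothetical and are not the 7-axis rubric itself.

```python
import re
from dataclasses import dataclass

# Hypothetical veto patterns: AI bans and security-sensitive topics.
VETO_PATTERNS = [
    r"\bno ai\b",
    r"\bhuman[- ]only\b",
    r"\bbiometrics?\b",
    r"\bpin\b",
]


@dataclass
class Bounty:
    title: str
    est_dollars: float


def is_vetoed(text: str) -> bool:
    """True when any veto pattern appears as a whole word or phrase."""
    return any(re.search(p, text, re.IGNORECASE) for p in VETO_PATTERNS)


def triage(bounty: Bounty, pursue_at: float = 700.0) -> str:
    """Apply the veto before any payout-based scoring (assumed ordering)."""
    if is_vetoed(bounty.title):
        return "avoid"
    return "pursue" if bounty.est_dollars >= pursue_at else "maybe"


print(triage(Bounty("Note Locking - Biometrics/PIN", 660.0)))  # avoid
print(triage(Bounty("Attachment Summarizer Service", 960.0)))  # pursue
```

Checking the veto first means a large payout can never outweigh a disqualifying condition, which matches how the $660 security bounty was handled above.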

How I Used Hermes Agent

  • Skill creation + self-improvement — the core. Hermes wrote bounty-triage and then revised it across runs. The skill file in the repo is Hermes's, not mine.
  • Terminal tool — it runs gh search issues to pull live bounty data itself.
  • Autonomous multi-step execution (--yolo) — fetch → triage → write the shortlist → author/refine the skill, all unattended in one shot.
  • OpenRouter backend — model-agnostic; this demo runs on google/gemini-2.5-flash.
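
For a sense of what the terminal-tool step amounts to, here is a rough Python equivalent of a `gh search issues` call. Hermes picks its own query at run time, so the `bounty` label, the limit of 20, the chosen JSON fields, and the helper names are assumptions on my part, not what the agent necessarily ran.

```python
import json
import subprocess


def build_search_command(label: str = "bounty", limit: int = 20) -> list[str]:
    """Build a `gh search issues` invocation (label and limit are assumed)."""
    return [
        "gh", "search", "issues",
        "--label", label,
        "--state", "open",
        "--limit", str(limit),
        "--json", "title,url,repository",
    ]


def parse_issues(raw_json: str) -> list[dict[str, str]]:
    """Flatten gh's JSON output into title/url/repo records."""
    records = []
    for issue in json.loads(raw_json):
        repo = issue.get("repository") or {}
        records.append({
            "title": issue.get("title", ""),
            "url": issue.get("url", ""),
            "repo": repo.get("nameWithOwner", ""),
        })
    return records


def scout(label: str = "bounty", limit: int = 20) -> list[dict[str, str]]:
    """Run the search; requires an authenticated gh CLI on PATH."""
    result = subprocess.run(
        build_search_command(label, limit),
        capture_output=True, text=True, check=True,
    )
    return parse_issues(result.stdout)
```

Splitting command construction from parsing keeps the network call in one small function, so the rest can be checked offline against saved output.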

The whole two-run demo cost about $0.25 in inference.

Demo

[Demo recording: Hermes lists the bounty-triage skill it wrote, ranks real GitHub bounties as pursue/maybe/avoid, then shows the line it improved in its own skill on run 2.]

demo-run-2.txt in the repo is the raw run-2 transcript (skill reuse + the
self-edit). SKILL.bounty-triage.md is the skill Hermes authored and then improved.

Code

👉 Repo: https://github.com/emaadshamsi/bounty-scout

# prereqs: uv, gh (authenticated), OPENROUTER_API_KEY
./scout.sh   # installs Hermes, configures OpenRouter, runs both passes

My Tech Stack

  • Hermes Agent (Nous Research, MIT)
  • OpenRouter (google/gemini-2.5-flash)
  • GitHub CLI (gh) as the live data source
  • uv for an isolated Python 3.11 env
  • Bash glue (scout.sh)

Honest notes

On a cheap fast model the triage prose is solid-but-templated — a stronger model
sharpens the verdicts, but the architecture is the point. Scouting is
GitHub-label-based, so it's broad, not exhaustive. This is a focused demo of the
self-improving loop, not a finished bounty-hunter.

But that loop is the part I'll keep using: an agent that writes down what it learns
and gets sharper on its own is exactly what you want pointed at a messy,
ever-changing problem like "where's the work that pays?"
