I Built a Hermes Agent to Tell Me Which Hackathons to Enter. It Told Me to Enter This One.

#hermesagentchallenge #devchallenge #agents #showdev

Hermes Agent Challenge Submission: Build With Hermes Agent

This is a submission for the Hermes Agent Challenge: Build With Hermes Agent

What I Built

I started AI-assisted building in July 2025. Self-taught, learning as I go. Hackathons and challenges turned out to be a big part of how I got better, structured lessons to build something inside a deadline, and the DEV challenge feed became one of my regular places to look.

But the more I leaned on it, the more I kept hitting the same question: how do I decide which challenge to enter? It is not trivial. Entering the wrong one costs days I do not get back. And you cannot always tell from a challenge page alone whether it fits the stack, tools or style I build with, whether there is enough runway left, or whether the learning is worth the hours. The feed does not sort itself by worth it. So you either check it obsessively, miss things, or enter on a hunch and hope for the best.

Vigil Crest is the filter I wanted. It is a challenge triage agent I talk to on Telegram. I message it, it browses the live challenge feed, and it sends back a verdict on each active challenge. For every one it gives me four short fit lines, Time, Learning, Stack fit, and Timing, a paragraph of reasoning, and a call: enter, skip, or maybe.

One thing I prioritized was the over familiarization. Vigil Crest does not pretend to know me better than it does. Early on it labels its verdicts as first impressions and tells me what it is unsure about. It is built to get better at reading my judgment the more I check in with it. A verdict that knows how sure it is beats a confident guess. That is the major portion of the whole design, not a disclaimer bolted on the end.

It is named Vigil Crest because it keeps watch from a height and reports what it sees.

Demo

The whole thing happens in Telegram. Here is a full run.

It starts with the nudge. Twice a week Vigil Crest sends a short reminder to check the board. I reply when I am ready, and it goes to work: it browses the live challenge feed and reports back what it found.

Then it triages each one. Here is its verdict on the Hermes Agent Challenge itself. Notice the two gauges, STACK FIT and WORTH IT, and that it reasons about deadline and competitive angle rather than just reading the prize off the page.

It is just as willing to tell me to skip. Two of the active challenges closed the same night. Vigil Crest passed on both, and said why: a rushed entry will not compete with work done over several days.

And when something is neither a clear yes nor a clear no, it says so. The GitHub Finish-Up-A-Thon has a relaxed deadline but a hard requirement that does not map cleanly to my work, so it lands on maybe, with the reasoning attached.

Four challenges, four verdicts, each one hedged and explained. That is a single check-in.

Code

github.com/earlgreyhot1701D/vigil-crest

Vigil Crest is not a clone-and-run app. It is a configured Hermes instance. The repo gives you the recipe and the portable parts, the skills, the persona and stack templates, and the build guide. Every user supplies their own persona and stack, because the tool grades against one specific person's judgment.

My Tech Stack

Hermes Agent runs the agent, the persona, the skills, and the schedule.
AWS Bedrock (Claude Sonnet 4.6) for the model, reached through an EC2 instance role so there are no credentials sitting on the box.
EC2 (Ubuntu, t3.micro) as the always-on host for the Hermes gateway.
Telegram as the interface. I talk to Vigil Crest like any other contact.
Playwright for live browsing of the challenge feed.
GitHub public repos as the source for the auto-refreshing part of the stack file.

How I Used Hermes Agent

Hermes Agent is the whole spine of this. A few capabilities carried most of this.

The persona did the judging. Hermes' SOUL.md is where Vigil Crest learned how I pick challenges. Not a generic "rank these by prize money" prompt, but the criteria I use: that excitement is a prerequisite, that time is a genuine cost, that a writing track can lower the barrier when a build track is out of reach, that skipping is a legitimate choice. The agentic part is that the persona reasons from those principles instead of running a scoring formula.

Skills made the triage repeatable. I wrote a triage-challenge skill that browses the live feed, reads the active challenges, and produces the four fit lines and a verdict for each. I also wrote a refresh-stack skill that reads my public GitHub repos and keeps the Languages section of the stack file current. The stack file is split by provenance: the part GitHub can verify is auto-refreshed, the part it cannot (frameworks, cloud, tooling) is hand-curated and marked as such. The agent grades stack fit against that file.

The browser tool, used deliberately. Vigil Crest always browses the challenge feed directly. Search results cache stale states and will tell you there are no active challenges when four are live. The skill reads the rendered page, the active section only.

The model is Claude Sonnet 4.6, served through Bedrock. I picked Sonnet over Opus deliberately: Vigil Crest is meant to run often, and Sonnet 4.6 is strong enough for the triage reasoning while keeping an always-available agent affordable.

Not all of it worked

I built Vigil Crest to also run autonomously: a scheduled job, a change-detection pre-check, the agent waking on its own when something new appeared. It ran into a wall I could not get around. The pre-check script reads the feed with a headless browser, and a headless browser is environment sensitive. It launches fine when I run the script myself and fails to launch inside the scheduler's background process. The script's safety guardrail then correctly reports nothing new, and the scheduler discards the script's error output on a clean exit, so the failure was invisible until I dug for it.

The clean fix is an API-based pre-check, since a plain JSON request behaves the same in a shell and a scheduler. The DEV API is rich, but it does not yet expose challenges as their own endpoint, so a pre-check built on it would need some experimentation. That is a v2 item, and it is written up plainly in the repo.

What I shipped is Vigil Crest as a check-in correspondent. A browser-free nudge runs on a schedule and sends me a short reminder twice a week. The triage happens when I reply. I think this suits the project better than the autonomous version would have. An agent whose value is learning my judgment is probably better off in conversation with me than broadcasting into the void on a timer, since a check-in gives it something to learn from and a timer does not.

I did not go looking for that reframe. I went looking for a workaround and found a better design instead.

As for this challenge:
Vigil Crest looked at the feed, weighed it, and said enter. I am taking its advice.

AI assisted. Human approved. Powered by NLP.