<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Arqam Waheed</title>
    <description>The latest articles on DEV Community by Arqam Waheed (@arqamwd).</description>
    <link>https://dev.to/arqamwd</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3760002%2Feb94d8d9-e8ef-4932-ab99-d07a12fe197b.jpeg</url>
      <title>DEV Community: Arqam Waheed</title>
      <link>https://dev.to/arqamwd</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/arqamwd"/>
    <language>en</language>
    <item>
      <title>I Finally Finished Schedio: Turning a 5-Day Hackathon MVP Into a Live Product</title>
      <dc:creator>Arqam Waheed</dc:creator>
      <pubDate>Sat, 06 Jun 2026 13:20:10 +0000</pubDate>
      <link>https://dev.to/arqamwd/i-finally-finished-schedio-turning-a-5-day-hackathon-mvp-into-a-live-product-3n8k</link>
      <guid>https://dev.to/arqamwd/i-finally-finished-schedio-turning-a-5-day-hackathon-mvp-into-a-live-product-3n8k</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/github-2026-05-21"&gt;GitHub Finish-Up-A-Thon Challenge&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A few weeks ago, I built Schedio as a 5-day hackathon project, which was also, unironically, another GitHub challenge.&lt;/p&gt;

&lt;p&gt;The idea was simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Highlight any text that mentions an event, and turn it into a Google Calendar event in under 5 seconds.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It worked.&lt;/p&gt;

&lt;p&gt;Kind of.&lt;/p&gt;

&lt;p&gt;The first version could parse highlighted text, open a clean event modal, and write to Google Calendar. I even wrote about that original MVP here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/arqamwd/schedio-highlight-to-calendar-in-5-seconds-18pi"&gt;Schedio: Highlight to Calendar in 5 Seconds&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But after the hackathon rush ended, Schedio was still very obviously an MVP.&lt;/p&gt;

&lt;p&gt;The demo was cool, but the product was not finished.&lt;/p&gt;

&lt;p&gt;The Gemini key was too close to the client. OAuth verification was not done. Billing did not exist. Pro was just an idea. Onboarding was basically “install this and figure it out.”&lt;/p&gt;

&lt;p&gt;There was not even a real landing page yet. My original plan was to build one after the Chrome Web Store approval, properly market the extension, get some users, add more features, and slowly turn it into an actual product instead of just a hackathon project sitting in a repo.&lt;/p&gt;

&lt;p&gt;The funny part is that the Chrome Web Store launch failed once because I accidentally uploaded the wrong build. I fixed it, submitted it again, and then it got rejected a second time lol.&lt;/p&gt;

&lt;p&gt;By then, uni exams had started, other hackathons came up, and Schedio slowly drifted into that “I’ll finish it later” state. I never really pushed it beyond the original hackathon MVP.&lt;/p&gt;

&lt;p&gt;It worked for me, but it was never really out there for everyone else to use.&lt;/p&gt;

&lt;p&gt;So for the Finish-Up-A-Thon, I came back to Schedio and tried to do the part of building that usually gets ignored after the fun demo is over — which is TO ACTUALLY FINISH it and put it out there for others.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc70fpwhgkx6dyn49cm9k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc70fpwhgkx6dyn49cm9k.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;Schedio is an AI-powered Chrome extension that turns natural language into real Google Calendar events.&lt;/p&gt;

&lt;p&gt;You can highlight something like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Team standup Friday 3pm, Room 204&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Then right-click and choose &lt;strong&gt;Create Event with Schedio&lt;/strong&gt;, or use the keyboard shortcut. Schedio parses the title, date, time, and location, shows you a quick review modal, and creates the event in Google Calendar when you confirm.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No copy-pasting.&lt;/li&gt;
&lt;li&gt;No switching tabs.&lt;/li&gt;
&lt;li&gt;No manually typing date fields while trying to remember what the original message said.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F88i4x3f1r16owuw7ul4e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F88i4x3f1r16owuw7ul4e.png" alt=" " width="800" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The finished version is now a real product, not just a hackathon repo. It has a live Chrome Web Store listing, a landing page at &lt;code&gt;schedio.org&lt;/code&gt;, approved Google OAuth verification, Lemon Squeezy approval for payments, a Cloudflare Worker backend, server-side Gemini calls, Supabase-backed users and subscriptions, Free/Pro metering, an in-extension upgrade flow, voice-to-calendar as a Pro feature, first-run onboarding, a privacy policy, Terms of Service, rate limiting, input validation, and no Gemini key baked into the client bundle.&lt;/p&gt;

&lt;p&gt;The MVP proved the magic.&lt;/p&gt;

&lt;p&gt;The finished version makes the magic safe, usable, and shippable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Live site:&lt;/strong&gt; &lt;a href="https://schedio.org" rel="noopener noreferrer"&gt;https://schedio.org&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Chrome Web Store:&lt;/strong&gt; &lt;a href="https://chromewebstore.google.com/detail/schedio/nlnkjghkddopgocdbhhkefmjbchlpjnc" rel="noopener noreferrer"&gt;https://chromewebstore.google.com/detail/schedio/nlnkjghkddopgocdbhhkefmjbchlpjnc&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Original hackathon version repo:&lt;/strong&gt; &lt;a href="https://github.com/ArqamWaheed/schedio" rel="noopener noreferrer"&gt;https://github.com/ArqamWaheed/schedio&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The original hackathon MVP repo is still public, but the current production version is now private because it contains live infrastructure, billing flows, and production authentication logic.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fok9n46dl9c5deyrzcd22.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fok9n46dl9c5deyrzcd22.png" alt=" " width="800" height="413"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The core flow is still the same as the original MVP: highlight text, trigger Schedio, review the parsed event, and send it to Google Calendar.&lt;/p&gt;

&lt;p&gt;But everything around that flow evolved. What started as a hackathon extension slowly turned into a real product and brand, with a proper landing page, onboarding, Pro features, subscriptions, backend infrastructure, and a much more polished overall experience.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Comeback Story
&lt;/h2&gt;

&lt;p&gt;The original Schedio was built under pressure. I cared about one question: could I make calendar creation feel instant?&lt;/p&gt;

&lt;p&gt;That question led to the first version. But when I came back to the project, the question changed. It was no longer just “can this work?” It became:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“What would I need to change before I could confidently give this to strangers and turn it into a real brand?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That shift basically defined the entire comeback.&lt;/p&gt;

&lt;p&gt;The first real finishing moment was getting Schedio live on the Chrome Web Store. The earlier review had failed because I uploaded the wrong, non-working build, which is such a small but painful launch mistake. The code can work locally, the demo can be impressive, the idea can be good, and then one bad upload means nobody can actually install it. So I rebuilt, rechecked, uploaded the correct version, and got it listed.&lt;/p&gt;

&lt;p&gt;That made the project feel different immediately.&lt;/p&gt;

&lt;p&gt;Before, Schedio was something I could show.&lt;/p&gt;

&lt;p&gt;Now it was something people could install.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7oive6wchyniodz9crqr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7oive6wchyniodz9crqr.png" alt=" " width="800" height="461"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The next big problem was architecture. The MVP had the classic hackathon shortcut: the Gemini API key was too close to the client. With a browser extension, that is not something you can just hand-wave forever. If a key is in the shipped bundle, it is not really secret.&lt;/p&gt;

&lt;p&gt;So I moved the AI parsing behind a Cloudflare Worker backend using Hono. The extension now sends highlighted text and the user’s Google token to &lt;code&gt;api.schedio.org/parse&lt;/code&gt;. The backend verifies identity, calls Gemini server-side, and returns the parsed event. The Gemini key lives as a Worker secret, not inside the extension.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fndxm9ctfzc46q3or9r10.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fndxm9ctfzc46q3or9r10.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That was the first moment where Schedio stopped feeling like a clever browser hack and started feeling like infrastructure. The product still felt simple from the outside, but the trust boundary had completely changed.&lt;/p&gt;

&lt;p&gt;Once the backend existed, I could finally turn Schedio into a Free/Pro product. Free users get a monthly event cap. Pro users get unlimited events and access to the voice feature. The important part is that the limit is enforced server-side before Gemini is called, so over-limit users do not cost an API request.&lt;/p&gt;

&lt;p&gt;I also made the quota harder to game. The monthly bucket comes from the server’s UTC clock, not the client’s local date, and usage increments through an atomic Postgres function so concurrent highlights do not lose updates.&lt;/p&gt;
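&lt;p&gt;To make that concrete, here is a minimal Python sketch of the idea, not Schedio's actual Worker code. It assumes a hypothetical free cap of 3 and a hypothetical &lt;code&gt;QUOTA_EXCEEDED&lt;/code&gt; error with status 429, and it uses an in-memory SQLite upsert as a stand-in for the atomic Postgres function. The helper names (&lt;code&gt;month_bucket&lt;/code&gt;, &lt;code&gt;handle_parse&lt;/code&gt;) are made up for illustration.&lt;/p&gt;

```python
import sqlite3
from datetime import datetime, timezone

FREE_MONTHLY_CAP = 3  # hypothetical cap for the sketch, not Schedio's real limit


def month_bucket(now=None):
    """Return the quota bucket (e.g. '2026-06') from a UTC clock, never the client's date."""
    now = now or datetime.now(timezone.utc)
    return now.astimezone(timezone.utc).strftime("%Y-%m")


def open_db():
    # In-memory SQLite stands in for the real Postgres table.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE usage (user_id TEXT, bucket TEXT, used_count INTEGER, "
        "PRIMARY KEY (user_id, bucket))"
    )
    return db


def used(db, user_id, bucket):
    row = db.execute(
        "SELECT used_count FROM usage WHERE user_id = ? AND bucket = ?",
        (user_id, bucket),
    ).fetchone()
    return row[0] if row else 0


def increment(db, user_id, bucket):
    # A single statement performs the read-modify-write, so two concurrent
    # highlights cannot both read the same value and overwrite each other.
    db.execute(
        "INSERT INTO usage (user_id, bucket, used_count) VALUES (?, ?, 1) "
        "ON CONFLICT (user_id, bucket) DO UPDATE SET used_count = used_count + 1",
        (user_id, bucket),
    )
    db.commit()


def handle_parse(db, user_id, is_pro, text, call_model, now=None):
    bucket = month_bucket(now)
    if not is_pro and used(db, user_id, bucket) >= FREE_MONTHLY_CAP:
        # Rejected before the model is called, so no API request is spent.
        return 429, {"error": "QUOTA_EXCEEDED"}
    event = call_model(text)
    increment(db, user_id, bucket)
    return 200, event
```

&lt;p&gt;In this sketch the cap check and the increment are still two separate statements, so a burst of simultaneous requests could slip slightly past the cap. Folding the check into the same database call would close that gap.&lt;/p&gt;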

&lt;p&gt;That is not the flashiest part of the project, but it is exactly the kind of thing that separates a demo from a product. A demo only needs to work once. A product has to keep working when users do weird things.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxrh7o1e4dkt9w5nvkl71.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxrh7o1e4dkt9w5nvkl71.png" alt=" " width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Billing was another place where finishing meant doing the boring thing correctly. I used Lemon Squeezy as the Merchant of Record, so Schedio does not touch card numbers, VAT, tax, or PCI directly. The backend has endpoints for checking the current plan, creating a personalized checkout link, and handling subscription webhooks.&lt;/p&gt;

&lt;p&gt;The checkout flow embeds the verified Schedio user ID into Lemon Squeezy custom data. That way, when the webhook comes back, the subscription can be attached to the exact right Google account.&lt;/p&gt;
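&lt;p&gt;Roughly, the webhook side of that looks like the Python sketch below. It is an illustration under assumptions, not the production handler: it assumes the webhook is signed with an HMAC-SHA256 hex digest of the raw request body (my understanding of how Lemon Squeezy signs webhooks), that the checkout's custom data comes back under &lt;code&gt;meta.custom_data&lt;/code&gt;, and that the key is called &lt;code&gt;user_id&lt;/code&gt;, which is a hypothetical name.&lt;/p&gt;

```python
import hashlib
import hmac
import json


def verify_signature(raw_body, signature_hex, secret):
    """Check the HMAC-SHA256 hex signature of the raw request body."""
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature_hex)


def user_id_from_webhook(raw_body, signature_hex, secret):
    """Return the user ID embedded at checkout, or None if the event is unusable."""
    if not verify_signature(raw_body, signature_hex, secret):
        return None  # unsigned or tampered payloads are dropped before parsing
    payload = json.loads(raw_body)
    custom = payload.get("meta", {}).get("custom_data") or {}
    # "user_id" is a hypothetical key name chosen for this sketch.
    return custom.get("user_id")
```

&lt;p&gt;A payment that arrives without that ID returns &lt;code&gt;None&lt;/code&gt; here, which is exactly the orphaned-payment case the in-extension checkout is meant to prevent.&lt;/p&gt;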

&lt;p&gt;I almost made the obvious mistake of putting a raw “Buy Pro” link on the website. But the website does not know who the user is. Identity lives inside the extension. A raw checkout link from the marketing page could create an orphaned payment that the backend cannot map to anyone.&lt;/p&gt;

&lt;p&gt;So the website sells the product, but the actual upgrade flow starts inside the extension, where the user is already authenticated.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5zh6841esgqijpt7bitk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5zh6841esgqijpt7bitk.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The biggest new feature I added was voice-to-calendar. Instead of highlighting text, Pro users can speak an event into the popup. The backend sends the raw audio to Gemini 2.5 Flash multimodal, and Gemini transcribes and extracts the event in one call. No separate speech-to-text step. No transcript first, parse second.&lt;/p&gt;

&lt;p&gt;Just speech into calendar structure.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fackhjtuwoofwn8tf5br5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fackhjtuwoofwn8tf5br5.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I made voice a real Pro anchor, not a fake paywall. I could have used the free Web Speech API, but the multimodal approach is more accurate and has a real marginal cost. So the server enforces the Pro gate before audio reaches Gemini. Free users get a &lt;code&gt;403 PRO_REQUIRED&lt;/code&gt; before the expensive work happens.&lt;/p&gt;

&lt;p&gt;That felt like a real product decision: the feature is better, it costs something, and the paywall protects the cost center before the bill is created.&lt;/p&gt;

&lt;p&gt;The next problem was onboarding. The MVP dropped users into the product and expected them to discover the context menu or shortcut. That is fine when the builder is the user, but it is a terrible experience for anyone installing it fresh.&lt;/p&gt;

&lt;p&gt;So I built a first-run onboarding tab that opens on install. It shows the core habit: highlight text, right-click, review the event, connect Google Calendar. I wanted the tour to appear once without adding a new &lt;code&gt;storage&lt;/code&gt; permission, so I used Chrome’s &lt;code&gt;runtime.onInstalled&lt;/code&gt; event with &lt;code&gt;reason === "install"&lt;/code&gt;. That fires once per install, so there is no extra permission and no extra state to manage.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frhxxwamo1q8myboe2fk2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frhxxwamo1q8myboe2fk2.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I also moved sign-in earlier, but carefully. The onboarding ends with a clear &lt;strong&gt;Connect Google Calendar&lt;/strong&gt; button, not an automatic OAuth popup. There is also a skip option. The connect ask only appears after the user has seen the value.&lt;/p&gt;

&lt;p&gt;Good onboarding is not a wall of text. It is a rehearsal of the product’s best moment.&lt;/p&gt;

&lt;p&gt;The landing page also became part of the finishing arc. I shifted it away from technical explanations and toward outcomes, because that is what actually gets people interested in a product.&lt;/p&gt;

&lt;p&gt;I also had to make the marketing honest. Some features are planned but not built yet, like extra calendar providers and bulk event creation. I did not want to delete the ambition, but I also did not want to lie. So unfinished features got “soon” labels, and shipped features were removed from the future roadmap.&lt;/p&gt;

&lt;p&gt;That sounds tiny, but it matters. A product page should not make claims the product cannot survive.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiquxpj1p6hzmye35kl1k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiquxpj1p6hzmye35kl1k.png" alt=" " width="800" height="1000"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The final wall was trust.&lt;/p&gt;

&lt;p&gt;Google OAuth verification got approved. Lemon Squeezy approved the store. Those two approvals were the moment Schedio stopped being “works on my machine” and became something distributable and monetizable.&lt;/p&gt;

&lt;p&gt;A calendar-writing extension has to earn trust. Google needed the owned domain, hosted privacy policy, write-only scope explanation, demo video, and correct compliance language. Payments needed a real Merchant of Record review. None of that is as fun as building a new AI feature, but that was the actual finish line.&lt;/p&gt;

&lt;p&gt;The original version died near this wall. The newer version finally crossed it.&lt;/p&gt;




&lt;h2&gt;
  
  
  How GitHub Copilot Helped Me
&lt;/h2&gt;

&lt;p&gt;I used GitHub Copilot CLI to generate a large part of the implementation, but I never treated it like autopilot.&lt;/p&gt;

&lt;p&gt;The architecture, product decisions, system boundaries, and overall direction were still mine. Copilot was the accelerator, not the driver. I spent most of the project defining flows, structuring prompts carefully, reviewing generated code, and deciding what should or should not exist in the final product.&lt;/p&gt;

&lt;p&gt;That mattered more as Schedio evolved from a hackathon MVP into a real product.&lt;/p&gt;

&lt;p&gt;The backend migration is a good example. I knew the Gemini key could not stay in the client anymore, but Copilot helped turn that idea into the actual Worker architecture: extension → Cloudflare Worker → Gemini/Supabase/Lemon Squeezy. It helped scaffold routes, tighten request shapes, and keep the extension and backend synced while the architecture evolved.&lt;/p&gt;

&lt;p&gt;It also helped with the parts that are easy to ignore when you are moving quickly: webhook verification, quota tracking, atomic usage increments, CORS restrictions, rate limits, generic error handling, and validation layers. None of those make a flashy demo. All of them make the product safer.&lt;/p&gt;

&lt;p&gt;Copilot was also surprisingly useful for debugging weird launch issues. The best example was the OAuth &lt;code&gt;bad client id&lt;/code&gt; bug. After moving authentication into the real product flow, Google sign-in suddenly broke in development builds. The issue turned out to be Chrome extension IDs: unpacked builds can generate different IDs unless the public extension key is pinned correctly.&lt;/p&gt;

&lt;p&gt;Copilot helped trace the extension ID behavior, compare it against the published Web Store ID, and wire the correct key into the manifest so development and production resolved identically. What started as a vague OAuth failure became a clean one-line fix.&lt;/p&gt;

&lt;p&gt;It also helped with product consistency outside pure code. While redesigning the Chrome Web Store graphics, Copilot helped identify outdated messaging that still implied users needed their own AI key, even though the final product had already removed bring-your-own-key (BYOK) entirely. Leaving those images up would have been misleading, so they got rebuilt before launch.&lt;/p&gt;

&lt;p&gt;That was the part I did not expect initially. Copilot was not just generating code. It was helping keep the product coherent while the scope kept expanding.&lt;/p&gt;

&lt;p&gt;The Terms of Service and SEO work were similar. Copilot helped structure the TOS around the existing privacy policy style, wire the new pages into the build system, and connect everything through the footer and metadata layer. It also helped add Open Graph tags, Twitter cards, sitemaps, robots files, structured data, and asset handling.&lt;/p&gt;

&lt;p&gt;The biggest lesson was that Copilot never replaced the decisions. It simply made implementation dramatically faster once the direction was clear.&lt;/p&gt;

&lt;p&gt;And most of the important decisions were actually restraint:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Do not put a raw checkout link on the website because identity lives in the extension.&lt;/li&gt;
&lt;li&gt;Do not claim planned features are already shipped.&lt;/li&gt;
&lt;li&gt;Do not add unnecessary permissions just because they are convenient.&lt;/li&gt;
&lt;li&gt;Do not process webhooks loosely when strict validation is safer.&lt;/li&gt;
&lt;li&gt;Do not leak backend internals in API errors.&lt;/li&gt;
&lt;li&gt;Do not keep a Gemini key in the client just because it is easier.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The MVP was built with speed.&lt;/p&gt;

&lt;p&gt;The finished product was built with speed plus restraint.&lt;/p&gt;

&lt;p&gt;Once the architecture and product decisions were clear, Copilot accelerated the implementation massively. I still handled the direction, debugging, and review process, but it removed an enormous amount of friction from actually extending and shipping Schedio.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftl4dd5r842y42amyc3ez.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftl4dd5r842y42amyc3ez.png" alt=" " width="799" height="537"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A lot of the final product simply would have taken far longer to build without it.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;A hackathon project is about proving the magic.&lt;/p&gt;

&lt;p&gt;A finished product is about protecting it.&lt;/p&gt;

&lt;p&gt;Schedio already had the magic: highlight text and turn it into a calendar event in seconds. But the comeback was everything around that.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Can users install it?&lt;/li&gt;
&lt;li&gt;Can Google trust it?&lt;/li&gt;
&lt;li&gt;Can payments map to the right account?&lt;/li&gt;
&lt;li&gt;Can a free user hit a limit without the UX feeling broken?&lt;/li&gt;
&lt;li&gt;Can secrets stay secret?&lt;/li&gt;
&lt;li&gt;Can onboarding teach the habit?&lt;/li&gt;
&lt;li&gt;Can the website sell without lying?&lt;/li&gt;
&lt;li&gt;Can the system survive the boring edge cases?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is what I finished.&lt;/p&gt;

&lt;p&gt;And weirdly, that made the project more exciting than the original hackathon version, because now Schedio is not just a demo I can show. It is a product I can actually launch.&lt;/p&gt;

&lt;p&gt;The next steps are smarter recurrence, multiple calendars, Outlook/iCloud/CalDAV support, bulk multi-event parsing, Firefox and Edge support, and eventually a Mac/Safari companion app. But the important part is that Schedio is no longer blocked by the boring stuff.&lt;/p&gt;

&lt;p&gt;The boring stuff is done.&lt;/p&gt;

&lt;p&gt;And that was the real Finish-Up-A-Thon.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next Beyond Development
&lt;/h2&gt;

&lt;p&gt;Now comes the next challenge: &lt;strong&gt;distribution&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Schedio is finally live, installable, verified, and usable by real people. The engineering side finally feels complete enough to push properly, so my next focus is figuring out how to market it, get feedback from real users, and turn it from a finished side project into something people genuinely rely on.&lt;/p&gt;

&lt;p&gt;If anyone has ideas for growth, launch, or distribution strategies for productivity extensions, I would genuinely love to hear them.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>githubchallenge</category>
      <category>githubcopilot</category>
      <category>ai</category>
    </item>
    <item>
      <title>I Made My AI Models Argue, Then Let Hermes Be the Judge</title>
      <dc:creator>Arqam Waheed</dc:creator>
      <pubDate>Sat, 30 May 2026 16:00:54 +0000</pubDate>
      <link>https://dev.to/arqamwd/i-made-my-ai-models-argue-then-let-hermes-be-the-judge-5e6c</link>
      <guid>https://dev.to/arqamwd/i-made-my-ai-models-argue-then-let-hermes-be-the-judge-5e6c</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/hermes-agent-2026-05-15"&gt;Hermes Agent Challenge&lt;/a&gt;: Build With Hermes Agent&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt; — Ask any judgment call and three different AI models argue it out, then Hermes hands down one verdict, a confidence score, and exactly why they split. Every verdict, dissent, and mind-changed-in-debate is written into Hermes' own memory, so the next question re-weights the jurors before they ever vote. The judging is a pure function over that memory: no memory, no weights, no verdict. Three models, one verdict, $0.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;An LLM once talked me into the wrong database with total confidence. One smooth, authoritative answer. I shipped it. It cost me a weekend and a migration I'm still not over.&lt;/p&gt;

&lt;p&gt;The villain here is &lt;strong&gt;single-model overconfidence&lt;/strong&gt;: you get one polished reply, and the disagreement that should have warned you is invisible. You never see the other opinions, because you only asked one model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;So I stopped trusting one model. I convened a jury.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Council takes any judgment call ("Postgres or Mongo?", "is this PR safe to merge?", "is this clause risky?") and asks &lt;strong&gt;three different models&lt;/strong&gt;, lets them disagree, then has Hermes deliver one verdict, a confidence score, and exactly &lt;em&gt;why&lt;/em&gt; they split. Three models, one verdict, $0.&lt;/p&gt;

&lt;p&gt;You ask a question. Council fans it out to three jurors (two free OpenRouter models from different families and one local model via Ollama), and each takes a position with reasons. Then, if they disagree, a &lt;strong&gt;second deliberation round&lt;/strong&gt; runs: each juror sees the others' answers and either holds or changes its mind, so the council &lt;em&gt;debates&lt;/em&gt; instead of just voting once. Hermes then judges the deliberated opinions: a single verdict, a &lt;strong&gt;confidence score&lt;/strong&gt; (high when they agree, low when they split 2-1), and a "why they disagreed" panel. Every verdict is remembered, a &lt;code&gt;council&lt;/code&gt; skill learns which juror to trust for which kind of question, and the agent can even &lt;strong&gt;propose its own&lt;/strong&gt; trust adjustments for you to approve.&lt;/p&gt;
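&lt;p&gt;One way to picture the scoring is the small Python sketch below. It is an assumption about the shape of the idea, not the repo's actual judging code: each juror's position is tallied with a trust weight (1.0 when nothing has been learned yet), and confidence is the winning share of the total weight. The function name &lt;code&gt;judge&lt;/code&gt; and the weight format are hypothetical.&lt;/p&gt;

```python
from collections import defaultdict


def judge(opinions, weights=None):
    """Tally positions by trust weight.

    opinions: {juror_name: position}, must be non-empty
    weights:  {juror_name: trust}; jurors without an entry count as 1.0
    """
    weights = weights or {}
    totals = defaultdict(float)
    for juror, position in opinions.items():
        totals[position] += weights.get(juror, 1.0)
    # The verdict is the position with the most weight behind it.
    verdict = max(totals, key=totals.get)
    confidence = totals[verdict] / sum(totals.values())
    # Jurors who did not back the verdict feed the dissent panel.
    dissenters = sorted(j for j, p in opinions.items() if p != verdict)
    return verdict, round(confidence, 2), dissenters
```

&lt;p&gt;With equal weights, a 2-1 split gives a confidence of 0.67 and a unanimous jury gives 1.0. Raising one juror's trust for a category of question shifts both the verdict and the confidence without touching the jurors themselves.&lt;/p&gt;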

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj3c86rvf61fkb58mr2d7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj3c86rvf61fkb58mr2d7.png" alt="The Council home screen: one input box, a model-agnostic jury behind it" width="800" height="679"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The whole product is one question box. Everything interesting happens behind it, and the rest of this post is mostly pictures of that "behind."&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/tREMaJuJGH4"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Repo:&lt;/strong&gt; &lt;a href="https://github.com/ArqamWaheed/council" rel="noopener noreferrer"&gt;https://github.com/ArqamWaheed/council&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Live demo:&lt;/strong&gt; &lt;a href="https://council-jet-kappa.vercel.app/" rel="noopener noreferrer"&gt;https://council-jet-kappa.vercel.app/&lt;/a&gt;&lt;br&gt;
Hermes orchestration is local-only (no Hermes binary on serverless); the hosted demo runs the same UI via OpenRouter/mock. Run locally for the real &lt;code&gt;hermes -z&lt;/code&gt; path.&lt;/p&gt;

&lt;p&gt;Try "Should a 3-person startup use microservices?" and open the dissent panel.&lt;/p&gt;

&lt;p&gt;Local, one command (runs at $0 in offline mock mode, no key needed):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/ArqamWaheed/council &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd &lt;/span&gt;council &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; ./setup_hermes.sh &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; python server.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Architecture, in pictures
&lt;/h2&gt;

&lt;p&gt;I think the design is easiest to &lt;em&gt;see&lt;/em&gt;, so here's the system as a sequence of images. Each caption is the explanation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9uzdgrtr7rlnb7iy8gbd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9uzdgrtr7rlnb7iy8gbd.png" alt="Convene flow: the browser/CLI sends one question to run_council.py, which calls hermes_run.py three times in parallel, two arrows to OpenRouter (hosted models) and one to Ollama (local model), then a fourth Hermes call to the foreman that returns a single verdict" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The core loop. One question, three independent Hermes subagents (2 hosted + 1 local) fanned out in parallel, then a fourth Hermes run (the foreman) synthesizes one verdict. Every arrow is the same &lt;code&gt;hermes -z&lt;/code&gt; interface; nothing talks to a model directly.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjv22kevlk6ojqgm3ms07.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjv22kevlk6ojqgm3ms07.png" alt="Model-agnostic jury: a single hermes -z interface in the middle, with three model cards plugged into it, openai/gpt-oss-120b:free and z-ai/glm-4.5-air:free via the openrouter provider, and qwen2.5 via the ollama-local provider running on-device" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The bet. A hosted model and an on-device model sit on the same jury, swapped with a single &lt;code&gt;--provider/--model&lt;/code&gt; flag, no code change. This model-agnosticism is the one Hermes property the whole project is built on.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbvs0pa1hoz1ol8g41q3f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbvs0pa1hoz1ol8g41q3f.png" alt="Verdict card: a confidence dial reading 67%, three colour-coded juror chips (two green agreeing, one amber dissenting), a one-line verdict, and a collapsed " width="800" height="1595"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The UX surface. Confidence is high when jurors agree and drops on a 2-1 split. The dissent panel is collapsed by default, and you expand it exactly when the confidence number makes you nervous.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcjyozmnkf3vfuu2fv2fp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcjyozmnkf3vfuu2fv2fp.png" alt="Dissent panel expanded: " width="800" height="283"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The actual product. A confident single answer hides this; Council makes the disagreement the headline. Getting the clustering right here was subtle (see "What I learned" below).&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvc0blkp3zovuds4epww2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvc0blkp3zovuds4epww2.png" alt="Deliberation round: each juror card shows a " width="800" height="1077"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The headline feature: a council that &lt;strong&gt;deliberates, not just votes&lt;/strong&gt;. After round 1, disagreeing jurors get a second Hermes pass where they read each other's arguments and may hold or change their vote. A "⇄ changed" badge marks the ones that moved, and the confidence dial actually climbs when a 2-1 split is talked into agreement.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm81gxa5fsm8gvr6pz0iy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm81gxa5fsm8gvr6pz0iy.png" alt="Reflect/approve flow: a " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The agentic learning loop, human-in-the-loop. Hermes proposes; you approve or dismiss. Approved rules persist client-side and ride along with the next convene call.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjxozezam917mjgn1wcc1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjxozezam917mjgn1wcc1.png" alt="Memory recall: a terminal running  raw `hermes -z " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Persistence the judge can verify. Verdicts are mirrored into Hermes' own memory, so recall is Hermes doing the work; proof lives in &lt;code&gt;docs/hermes-proof/04-memory-recall.txt&lt;/code&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Repo:&lt;/strong&gt; &lt;a href="https://github.com/ArqamWaheed/council" rel="noopener noreferrer"&gt;https://github.com/ArqamWaheed/council&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Interesting files:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;hermes_run.py&lt;/code&gt; (the Hermes CLI driver every juror/judge call goes through)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;run_council.py&lt;/code&gt; (orchestration + the deterministic judge + Hermes foreman + the &lt;code&gt;--reflect&lt;/code&gt; loop)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;skills/council/SKILL.md&lt;/code&gt; (the juror-weighting brain Hermes edits)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;server.py&lt;/code&gt; (the &lt;code&gt;/api/reflect&lt;/code&gt; + &lt;code&gt;/api/learn&lt;/code&gt; endpoints)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;index.html&lt;/code&gt; (the designed verdict UI with the foreman TTS readout and localStorage persistence)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Proof that Hermes is genuinely in the loop (subagent transcripts, skill diff, memory recall) is in &lt;a href="https://github.com/ArqamWaheed/council/tree/main/docs/hermes-proof" rel="noopener noreferrer"&gt;&lt;code&gt;docs/hermes-proof/&lt;/code&gt;&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# hermes_run.py: every juror/judge call is a real Hermes run
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;skills&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;binary&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--provider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;skills&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--skills&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;skills&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-z&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;                       &lt;span class="c1"&gt;# -z = one-shot, final answer on stdout
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;capture_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;stdout&lt;/span&gt;

&lt;span class="c1"&gt;# jurors.py: fan out one Hermes subagent per juror, in parallel
&lt;/span&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;roster&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;opinions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;ask_juror&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;roster&lt;/span&gt;&lt;span class="p"&gt;())))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  How I Used Hermes Agent
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Why Hermes at all: the model-agnostic core.&lt;/strong&gt; Hermes lets you point at any provider and swap with a flag, no code change. Council is built &lt;em&gt;on top of that one property&lt;/em&gt;: the jurors are different models, and Hermes is the only piece that makes "different models" cheap. The clearest proof is the third juror: it runs &lt;strong&gt;locally&lt;/strong&gt; via Ollama while the other two are &lt;strong&gt;hosted&lt;/strong&gt; on OpenRouter, and all three answer through the exact same &lt;code&gt;hermes -z&lt;/code&gt; interface (the model-agnostic diagram above). A hosted model and an on-device model, sitting on the same jury, no code change: that's model-agnosticism you can see. I genuinely didn't see another entry in this challenge exploit it; everyone picked one model and moved on. That's the whole bet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Subagents: one real Hermes run per juror.&lt;/strong&gt; Each juror is a genuine, isolated Hermes invocation on a &lt;em&gt;different&lt;/em&gt; provider+model (&lt;code&gt;hermes -z --provider openrouter --model …&lt;/code&gt; for the two hosted jurors, &lt;code&gt;--provider ollama-local …&lt;/code&gt; for the on-device one), fanned out &lt;strong&gt;in parallel&lt;/strong&gt; so no model's reasoning anchors another's (the convene-flow diagram above). Hermes does the inference; my Python (&lt;code&gt;jurors.py&lt;/code&gt; calling &lt;code&gt;hermes_run.py&lt;/code&gt;) is just the fan-out plumbing, and every juror in the output JSON is tagged &lt;code&gt;"via": "hermes"&lt;/code&gt;. The gotcha worth flagging: Hermes enforces a &lt;strong&gt;64K-context floor&lt;/strong&gt;, which for the local model meant setting both &lt;code&gt;ollama_num_ctx&lt;/code&gt; &lt;em&gt;and&lt;/em&gt; a named &lt;code&gt;custom_providers&lt;/code&gt; entry; without the named provider, &lt;code&gt;--provider ollama&lt;/code&gt; silently routed to the wrong base URL. &lt;code&gt;setup_hermes.sh&lt;/code&gt; encodes the working config so a judge can reproduce it in one command.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A true debate, not just a vote (round 2 is real Hermes work).&lt;/strong&gt; This is the feature I'm proudest of. After round 1, if the jurors disagree, each one gets a &lt;em&gt;second&lt;/em&gt; Hermes run that shows it the others' positions and lead reasons and asks it to hold or change its mind. Real jurors reconsider through the same &lt;code&gt;hermes -z&lt;/code&gt; path as round 1, so the debate is genuine extra agentic work, not a UI flourish; mock jurors reconsider deterministically so the offline demo stays reproducible. The judge then synthesizes the verdict from the &lt;strong&gt;deliberated&lt;/strong&gt; opinions, so a juror that's talked round actually moves the outcome (the deliberation diagram above). It's gated on disagreement (a unanimous round 1 skips it) and toggled with &lt;code&gt;COUNCIL_DEBATE=0&lt;/code&gt;.&lt;/p&gt;
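
&lt;p&gt;A minimal sketch of that gate, assuming round-1 votes arrive as plain strings. &lt;code&gt;should_deliberate&lt;/code&gt; is a hypothetical helper, not code from the repo; only the &lt;code&gt;COUNCIL_DEBATE=0&lt;/code&gt; toggle and the skip-when-unanimous rule come from the description above.&lt;/p&gt;

```python
import os


def should_deliberate(votes, env=None):
    """Return True when a round-2 debate should run.

    Hypothetical helper: only the COUNCIL_DEBATE=0 toggle and the
    skip-when-unanimous rule are taken from the post.
    """
    env = os.environ if env is None else env
    if env.get("COUNCIL_DEBATE", "1") == "0":
        return False  # debate switched off explicitly
    # A unanimous round 1 has exactly one distinct vote, so round 2 is skipped.
    return len(set(votes)) > 1
```

&lt;p&gt;Passing &lt;code&gt;env&lt;/code&gt; explicitly keeps the check testable without touching the real process environment.&lt;/p&gt;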

&lt;p&gt;&lt;strong&gt;Why a skill, not a prompt, for judging.&lt;/strong&gt; The foreman's verdict is itself a Hermes run (&lt;code&gt;hermes -z --skills council&lt;/code&gt;) grounded in &lt;code&gt;skills/council/SKILL.md&lt;/code&gt;, which is &lt;strong&gt;installed into Hermes&lt;/strong&gt; (&lt;code&gt;hermes skills list&lt;/code&gt; shows it). The weighting logic lives in a machine-readable &lt;code&gt;weights&lt;/code&gt; block.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdakiorz2ez87ajz7eqqj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdakiorz2ez87ajz7eqqj.png" alt="The SKILL.md weights block: a small machine-readable table mapping (juror, topic) to multiplier, with a one-line comment that the foreman reads before synthesizing" width="799" height="391"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The judging brain is data, not a buried prompt. &lt;code&gt;--learn&lt;/code&gt; and &lt;code&gt;--reflect&lt;/code&gt; both edit this block, and the installed Hermes copy is kept in sync.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;After a string of security questions, &lt;code&gt;--learn&lt;/code&gt; appended a rule to upweight the local model on that topic (&lt;em&gt;and synced the installed Hermes copy&lt;/em&gt;) because it had caught issues the hosted models missed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python run_council.py &lt;span class="nt"&gt;--learn&lt;/span&gt; &lt;span class="s2"&gt;"Local Juror | security | 1.5"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the next security question that juror's vote counts 1.5×, read straight back by the judge. Counterfactual: a static synthesis prompt can't get better; this does. (The before/after skill diff is in &lt;a href="https://github.com/ArqamWaheed/council/blob/main/docs/hermes-proof/03-skill-learning.txt" rel="noopener noreferrer"&gt;&lt;code&gt;docs/hermes-proof/03-skill-learning.txt&lt;/code&gt;&lt;/a&gt;.)&lt;/p&gt;
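
&lt;p&gt;To make the weighting concrete, here is a rough sketch of how a rule string like the one above could be parsed and applied to a vote tally. The function names, the vote shape (juror name mapped to chosen option), and the default weight of 1.0 are assumptions for illustration; the actual judge in &lt;code&gt;run_council.py&lt;/code&gt; may work differently.&lt;/p&gt;

```python
from collections import defaultdict


def parse_rule(rule):
    """Split 'Juror | topic | multiplier' into a ((juror, topic), float) pair."""
    juror, topic, multiplier = [part.strip() for part in rule.split("|")]
    return (juror, topic), float(multiplier)


def weighted_tally(votes, topic, weights):
    """Sum each option's votes, scaling a juror's vote by its topic weight.

    votes: dict of juror name to chosen option (assumed shape).
    weights: dict of (juror, topic) to multiplier; missing entries count as 1.0.
    """
    totals = defaultdict(float)
    for juror, option in votes.items():
        totals[option] += weights.get((juror, topic), 1.0)
    return dict(totals)


weights = dict([parse_rule("Local Juror | security | 1.5")])
votes = {
    "Hosted A": "rotate keys",
    "Hosted B": "rotate keys",
    "Local Juror": "revoke now",
}
print(weighted_tally(votes, "security", weights))
```

&lt;p&gt;With these made-up votes the tally is 2.0 against 1.5: the local juror's vote counts for more, but one upweighted dissenter still does not override two agreeing jurors.&lt;/p&gt;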

&lt;p&gt;&lt;strong&gt;Letting the agent propose its own learning, now on the web and grounded in evidence.&lt;/strong&gt; &lt;code&gt;python run_council.py --reflect&lt;/code&gt; (and the &lt;strong&gt;"Should the council reweight itself?"&lt;/strong&gt; button in the UI) hands Hermes its &lt;em&gt;own&lt;/em&gt; memory of past verdicts and asks it to propose one weight change, e.g. "the local juror has dissented on three database calls; upweight it." The key fix this round: the proposal is &lt;strong&gt;evidence-grounded&lt;/strong&gt;, since Hermes is fed the actual dissent tally and any rule backed by fewer than two real dissents is rejected, so it can't just parrot the example baked into the skill. You then &lt;strong&gt;Approve or Dismiss&lt;/strong&gt; it (the reflect-flow diagram above). That's the agentic loop done honestly: a single verdict has no ground truth, so the agent surfaces a &lt;em&gt;pattern&lt;/em&gt; and a human confirms it's signal, not overfitting (the exact tension this post closes on). (Offline, it falls back to a deterministic heuristic so it never breaks.)&lt;/p&gt;
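
&lt;p&gt;The evidence check can be sketched in a few lines. This is an illustration under assumed data shapes (a verdict log of dicts with &lt;code&gt;topic&lt;/code&gt; and &lt;code&gt;dissenters&lt;/code&gt; keys, and a proposal dict with &lt;code&gt;juror&lt;/code&gt; and &lt;code&gt;topic&lt;/code&gt;); the helper names are hypothetical, and only the two-dissent threshold comes from the description above.&lt;/p&gt;

```python
from collections import Counter

MIN_DISSENTS = 2  # the post's threshold: fewer than two real dissents is rejected


def dissent_tally(history):
    """Count dissents per (juror, topic) from past verdict records.

    history: list of dicts with 'topic' and 'dissenters' keys (assumed shape).
    """
    tally = Counter()
    for verdict in history:
        for juror in verdict["dissenters"]:
            tally[(juror, verdict["topic"])] += 1
    return tally


def is_grounded(proposal, tally):
    """Accept a proposed rule only if the log backs it with enough dissents."""
    key = (proposal["juror"], proposal["topic"])
    return tally[key] >= MIN_DISSENTS
```

&lt;p&gt;Because the tally is computed from the log rather than taken from the model's output, a proposal that merely repeats the example in the skill fails the check unless the dissents actually happened.&lt;/p&gt;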

&lt;p&gt;&lt;strong&gt;Making learning survive a stateless deploy.&lt;/strong&gt; On a hosted demo the filesystem is read-only, so an approved rule can't be written back to &lt;code&gt;SKILL.md&lt;/code&gt;. Council handles this honestly: approved rules are stored in the browser's &lt;strong&gt;localStorage&lt;/strong&gt; and re-sent with every &lt;code&gt;/api/convene&lt;/code&gt; call, where they're merged into the judge's weights for that request. Locally you get a persistent &lt;code&gt;SKILL.md&lt;/code&gt;; on the web you get per-browser persistence, and either way the learning sticks.&lt;/p&gt;
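
&lt;p&gt;A sketch of the per-request merge, assuming the browser sends its approved rules as a list of objects with &lt;code&gt;juror&lt;/code&gt;, &lt;code&gt;topic&lt;/code&gt;, and &lt;code&gt;multiplier&lt;/code&gt; fields. That payload shape and the &lt;code&gt;merge_weights&lt;/code&gt; name are assumptions, not the repo's actual &lt;code&gt;/api/convene&lt;/code&gt; contract.&lt;/p&gt;

```python
def merge_weights(base, client_rules):
    """Return per-request weights: stored rules overlaid with browser-sent ones.

    base: dict of (juror, topic) to multiplier, e.g. parsed from SKILL.md.
    client_rules: list of dicts as the browser might send them (assumed shape).
    The base dict is copied, so nothing is written back to disk.
    """
    merged = dict(base)
    for rule in client_rules:
        try:
            key = (str(rule["juror"]), str(rule["topic"]))
            merged[key] = float(rule["multiplier"])
        except (KeyError, TypeError, ValueError):
            continue  # skip malformed client input instead of failing the request
    return merged
```

&lt;p&gt;Copying the base weights on every call is what keeps the server stateless: each request sees its own merged view, and the next browser starts from the stored defaults.&lt;/p&gt;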

&lt;p&gt;&lt;strong&gt;Why memory.&lt;/strong&gt; Each verdict is appended to a log &lt;em&gt;and mirrored into Hermes' own &lt;code&gt;MEMORY.md&lt;/code&gt;&lt;/em&gt;, so I can ask &lt;code&gt;hermes -z "what did the council decide about auth?"&lt;/code&gt; and Hermes recalls it from its memory, not from my code (the memory-recall image above). Proof: &lt;a href="https://github.com/ArqamWaheed/council/blob/main/docs/hermes-proof/04-memory-recall.txt" rel="noopener noreferrer"&gt;&lt;code&gt;docs/hermes-proof/04-memory-recall.txt&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The foreman reads the verdict aloud.&lt;/strong&gt; The verdict card has a "the foreman reads the verdict" button (browser SpeechSynthesis, $0); Hermes also ships native TTS via &lt;code&gt;hermes setup tts&lt;/code&gt;. On-theme and memorable: a jury foreman &lt;em&gt;announcing&lt;/em&gt; the decision.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The build itself was agent-run.&lt;/strong&gt; I kept a &lt;code&gt;memory.md&lt;/code&gt; the coding agent read before each task and updated after (so context stayed cheap), committed every increment with Conventional Commits, and built the verdict UI with the &lt;strong&gt;frontend-design&lt;/strong&gt; skill, which is why the confidence dial and colour-coded juror chips read as &lt;em&gt;designed&lt;/em&gt;, not default-template AI slop. The repo's &lt;code&gt;AGENTS.md&lt;/code&gt; + commit history show the process, not just the result.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why these models, and the concessions.&lt;/strong&gt; Two free OpenRouter models from different families (≥64K context, since Hermes rejects smaller contexts at startup) plus a local Ollama juror. Two honest concessions: (1) free models are slower and three calls add latency (~10-20s/verdict); (2) the free tier is &lt;em&gt;aggressively&lt;/em&gt; rate-limited, so I hit 429s constantly while building. Council retries and, if a juror still won't answer, falls back in order (Hermes, then the direct API, then a deterministic stand-in) rather than crashing the verdict, which also means the demo runs &lt;strong&gt;fully offline at $0&lt;/strong&gt;. For a once-a-decision tool, I'll take it. Cost: $0.&lt;/p&gt;
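
&lt;p&gt;The retry-then-fall-back behaviour can be sketched generically. Everything named here (&lt;code&gt;ask_with_fallback&lt;/code&gt;, the backend list, the linear backoff) is a hypothetical illustration rather than Council's actual code; it only shows the shape of trying Hermes first, then a direct API call, then a deterministic stand-in.&lt;/p&gt;

```python
import time


def ask_with_fallback(prompt, backends, retries=2, delay=0.0):
    """Try each backend in order, retrying before moving to the next.

    backends: list of (name, callable) pairs, e.g. Hermes, then a direct API
    call, then a deterministic stand-in. All names here are hypothetical.
    Returns (name, answer) from the first backend that succeeds.
    """
    last_error = None
    for name, call in backends:
        for attempt in range(retries + 1):
            try:
                return name, call(prompt)
            except Exception as exc:  # e.g. a 429 surfaced as an exception
                last_error = exc
                time.sleep(delay * (attempt + 1))  # simple linear backoff
    raise RuntimeError("every backend failed") from last_error
```

&lt;p&gt;Returning the backend name alongside the answer lets the caller tag which path produced each opinion, so a stand-in is never mistaken for a real model run.&lt;/p&gt;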

&lt;p&gt;&lt;strong&gt;License.&lt;/strong&gt; MIT. Fork it, add your own jurors.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I learned (and what's next)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The disagreement is the product.&lt;/strong&gt; A 2-1 split is &lt;em&gt;more&lt;/em&gt; useful than a confident single answer, so the clustering that decides "who actually disagreed" has to be right. A small local model once wrote a vague position ("to facilitate efficient integration…") whose &lt;em&gt;reasons&lt;/em&gt; clearly endorsed Postgres; the first version mis-filed it as a dissenter. The fix: when a juror's stated position is ambiguous, fall back to reading its reasons, and ignore options only mentioned in a comparison ("better &lt;em&gt;than&lt;/em&gt; Mongo" isn't a vote for Mongo). Now agreeing jurors cluster together, and the split count is honest.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Grounded beats glib.&lt;/strong&gt; Letting the agent propose its own weighting only works if the proposal is tied to real evidence; an ungrounded "reflect" just echoes whatever example is in the skill.&lt;/li&gt;
&lt;li&gt;Hermes' 64K-context floor caught a model that would've quietly underperformed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A council should deliberate, not just vote.&lt;/strong&gt; The round-2 debate above was the turning point: letting jurors read each other and reconsider means a juror that's genuinely persuaded moves the verdict, and you watch the confidence dial climb as a 2-1 split becomes unanimous. A one-shot vote can't do that.&lt;/li&gt;
&lt;/ul&gt;
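
&lt;p&gt;The clustering fix from the first bullet can be sketched roughly as follows. This is a simplified illustration, not the repo's implementation: it assumes a known list of option names, treats a position as unambiguous only when it names exactly one option, and handles only the literal "than X" comparison pattern.&lt;/p&gt;

```python
import re


def mentioned_options(text, options):
    """Return the options named in text, skipping ones that only follow 'than'."""
    found = set()
    lowered = text.lower()
    for option in options:
        name = re.escape(option.lower())
        # Drop "than X" comparisons first: "better than Mongo" is no vote for Mongo.
        stripped = re.sub(r"\bthan\s+" + name + r"\b", " ", lowered)
        if re.search(r"\b" + name + r"\b", stripped):
            found.add(option)
    return found


def classify(position, reasons, options):
    """Pick a juror's option from its position, falling back to its reasons."""
    for text in (position, " ".join(reasons)):
        found = mentioned_options(text, options)
        if len(found) == 1:
            return found.pop()
    return None  # still ambiguous: leave the juror unclustered
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; for a juror that stays ambiguous is a deliberate choice in this sketch: an unclustered juror is easier to surface honestly than one filed under the wrong side of the split.&lt;/p&gt;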




</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>agents</category>
      <category>ai</category>
    </item>
    <item>
      <title>Terra Triage: I Built a 3-Agent Wildlife Dispatcher That Learns From Every Referral</title>
      <dc:creator>Arqam Waheed</dc:creator>
      <pubDate>Mon, 20 Apr 2026 06:34:33 +0000</pubDate>
      <link>https://dev.to/arqamwd/terra-triage-i-built-a-3-agent-wildlife-dispatcher-that-learns-from-every-referral-efk</link>
      <guid>https://dev.to/arqamwd/terra-triage-i-built-a-3-agent-wildlife-dispatcher-that-learns-from-every-referral-efk</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for &lt;a href="https://dev.to/challenges/weekend-2026-04-16"&gt;Weekend Challenge: Earth Day Edition&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt; — Snap a photo of an injured animal, the right licensed rehabber gets paged in under 60 seconds. Backboard remembers every accept, decline, and "at capacity" outcome, so the next case re-ranks before it's dispatched. Memory is the product; the ranking is a pure function that cannot compute without it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;Last spring I found a stunned songbird on the sidewalk and spent forty minutes cold-calling vets that don't take wildlife. By the time I reached an actual rehabber, the bird was gone. That's the problem I wanted to solve in a weekend.&lt;/p&gt;

&lt;p&gt;Most dispatch apps pick the closest rehabber. &lt;strong&gt;Terra Triage picks the one who will actually say yes, because Backboard remembers who said no last time.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Terra Triage&lt;/strong&gt; is a three-agent web app for people who just found an injured animal and have no idea who to call. You snap a photo, approve a single consent prompt, and under 60 seconds later a licensed wildlife rehabilitator within range has an email in their inbox with the photo, the GPS, and a one-click "accept / decline / at capacity" magic link. No account, no app, no phone tree.&lt;/p&gt;

&lt;p&gt;The interesting part is not the first dispatch. It's the second one. Every outcome a rehabber returns (accepted, declined, at capacity, unreachable) is written back as a signal into &lt;strong&gt;Backboard&lt;/strong&gt;, and the very next case reranks because of it. If Rehabber A just declined a raptor at 9:42, the 9:51 raptor won't go to them first. The memory is the product.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8pejiepcji2joktivyup.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8pejiepcji2joktivyup.png" alt="Finder triage card rendering species, severity, do/don't list" width="508" height="852"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Three agents, one narrow job each:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Agent&lt;/th&gt;
&lt;th&gt;Job&lt;/th&gt;
&lt;th&gt;Model / Service&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Finder&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Vision triage: species, severity 1-5, safety advice&lt;/td&gt;
&lt;td&gt;Groq Llama-4 Scout (vision), JSON mode&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dispatcher&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Rank rehabbers, send the email, mint magic-link&lt;/td&gt;
&lt;td&gt;Auth0 scoped agent token + Resend&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Memory&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Read and write rehabber signals that drive the ranking&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Backboard&lt;/strong&gt; (primary), Supabase mirror as fallback&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Live URL:&lt;/strong&gt; &lt;a href="https://terra-triage.vercel.app/" rel="noopener noreferrer"&gt;https://terra-triage.vercel.app/&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;60-second walkthrough:&lt;/strong&gt;   &lt;iframe src="https://www.youtube.com/embed/1oT1n1p0tdc"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The flow:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open the website on a phone, snap a photo of an injured animal, and approve the location prompt.&lt;/li&gt;
&lt;li&gt;The Finder agent returns a triage card with species, severity, and first-aid advice.&lt;/li&gt;
&lt;li&gt;A ranked list of nearby rehabbers appears. Each card shows the Backboard-aware score, distance, capacity, and a one-tap &lt;strong&gt;Call&lt;/strong&gt; button for the listed 555-01xx number.&lt;/li&gt;
&lt;li&gt;Tap &lt;strong&gt;Send referral&lt;/strong&gt; on the top pick. Auth0 asks for the &lt;code&gt;referral:send&lt;/code&gt; scope, you consent once, and the dispatcher fires.&lt;/li&gt;
&lt;li&gt;The success pane shows "Referral sent" next to a scoped-token badge and a &lt;strong&gt;View captured email&lt;/strong&gt; link.&lt;/li&gt;
&lt;li&gt;Open the captured email in &lt;code&gt;/demo/inbox/&amp;lt;id&amp;gt;&lt;/code&gt;. Everything a real rehabber would see is there: photo, GPS, triage summary, accept and decline buttons.&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Decline, at capacity&lt;/strong&gt; from inside that email. The magic-link records the outcome and redirects to a thank-you page.&lt;/li&gt;
&lt;li&gt;Switch to &lt;code&gt;/admin&lt;/code&gt;. The memory timeline shows the new signal landing in Backboard, and the same case re-ranks with that rehabber demoted.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;No email leaves the server during this flow. Delivery is gated behind a demo switch for this submission; why, and what the real launch path looks like, are in the sections below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc0k6c34a0j1ifbp9hux5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc0k6c34a0j1ifbp9hux5.png" alt="Dispatch success screen with Auth0 scoped-token badge" width="502" height="321"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr5gkpw09ilxmrwqwo2xu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr5gkpw09ilxmrwqwo2xu.png" alt="Admin memory signals timeline showing Backboard writes in real time" width="800" height="464"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/ArqamWaheed" rel="noopener noreferrer"&gt;
        ArqamWaheed
      &lt;/a&gt; / &lt;a href="https://github.com/ArqamWaheed/terra-triage" rel="noopener noreferrer"&gt;
        terra-triage
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Terra Triage&lt;/h1&gt;
&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Snap a photo of an injured wild animal and a multi-agent system identifies the species, triages the injury, and dispatches the referral to the rehabber most likely to say yes, in under 60 seconds.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;What it does&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;Terra Triage collapses the chaotic gap between &lt;em&gt;"I just found a hurt animal"&lt;/em&gt; and &lt;em&gt;"a trained rehabber is on the way"&lt;/em&gt; into a single guided 60-second flow. It pairs a Groq-powered vision &lt;strong&gt;Finder agent&lt;/strong&gt;, an Auth0-scoped &lt;strong&gt;Dispatcher agent&lt;/strong&gt;, and a Backboard-backed &lt;strong&gt;Memory agent&lt;/strong&gt; so that every referral outcome improves the next ranking. Most dispatch apps pick the closest rehabber. Terra Triage picks the one who will actually accept, because Backboard remembers who said no last time.&lt;/p&gt;
&lt;p&gt;Nationwide coverage is seeded (250 licensed rehabbers, 5 per US state, fictional &lt;code&gt;.example.org&lt;/code&gt; contacts using the NANPA 555-01xx block reserved for fiction) so the ranker has something to rank from day one. Every…&lt;/p&gt;&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/ArqamWaheed/terra-triage" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;Project structure (trimmed):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;src/
├── app/
│   ├── report/                    # Anonymous intake (photo + geo)
│   ├── case/[id]/                 # Reporter-visible case page
│   ├── rehabber/outcome/[token]/  # Magic-link outcome form
│   ├── admin/cases/               # Ops console + memory timeline
│   └── api/
│       ├── admin/seed-demo-case/  # Idempotent demo seeder
│       └── auth/[auth0]/          # Auth0 login / callback / profile
├── lib/
│   ├── agents/
│   │   ├── finder.ts              # Groq vision call, JSON mode
│   │   ├── dispatcher.ts          # Rank + Resend + magic-link
│   │   └── rank-with-memory.ts    # Fuses memory signals into the rank
│   ├── memory/
│   │   ├── backboard.ts           # Real Backboard API client
│   │   └── index.ts               # Backboard-primary, local fallback
│   └── auth/
│       ├── agent-token.ts         # Scoped agent token (PAR or M2M)
│       └── magic-link.ts          # HMAC-signed, single-use tokens
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  How I Built It
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Backboard as the protagonist
&lt;/h3&gt;

&lt;p&gt;Most "memory" integrations I see treat the memory service as a prompt-context bucket: fetch recent history, stuff it into the system message, let the LLM figure it out. Terra Triage does the opposite. &lt;strong&gt;The ranker is a pure scoring function that cannot run until it has read memory&lt;/strong&gt;: no LLM in the hot path, no prose interpretation, just signals feeding a weighted score.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/lib/agents/rank-with-memory.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;rankRehabbersWithMemory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;CaseInput&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;rehabbers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PublicRehabber&lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;RankedRehabber&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;signals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getMemory&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rehabbers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;rankRehabbers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;rehabbers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;signals&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The scorer weights species match (0.35), distance (0.25), capacity (0.20), accept rate (0.15), and response time (0.05). Every input except distance is sourced from Backboard. When a rehabber submits an outcome, &lt;code&gt;applyOutcomeToSignals&lt;/code&gt;, a pure function, computes new values for the relevant keys (&lt;code&gt;capacity&lt;/code&gt;, &lt;code&gt;accept_rate&lt;/code&gt;, &lt;code&gt;species_scope&lt;/code&gt;, &lt;code&gt;response_ms&lt;/code&gt;), and the result is written back. The next ranking reflects it immediately.&lt;/p&gt;
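
&lt;p&gt;To make the weighting concrete, here is a minimal sketch of a scorer with that shape. It is not the repo's code: the &lt;code&gt;Signals&lt;/code&gt; and &lt;code&gt;Candidate&lt;/code&gt; types, the 0..1 normalization of each input, and the linear &lt;code&gt;distanceScore&lt;/code&gt; falloff over 200 km are all assumptions made for illustration. Only the five weights come from the description above.&lt;/p&gt;

```typescript
// Hypothetical signal shape; every field is assumed to be normalized to 0..1.
type Signals = {
  speciesMatch: number; // read from memory
  capacity: number; // read from memory
  acceptRate: number; // read from memory
  responseScore: number; // read from memory, faster responders score higher
};

// Weights as stated in the post; they sum to 1.
const WEIGHTS = {
  speciesMatch: 0.35,
  distance: 0.25,
  capacity: 0.2,
  acceptRate: 0.15,
  responseScore: 0.05,
};

// Assumed normalization: linear falloff that reaches zero at maxKm.
function distanceScore(km: number, maxKm = 200): number {
  return Math.max(0, 1 - km / maxKm);
}

// Pure function: same inputs, same score, no I/O.
function score(km: number, s: Signals): number {
  return (
    WEIGHTS.speciesMatch * s.speciesMatch +
    WEIGHTS.distance * distanceScore(km) +
    WEIGHTS.capacity * s.capacity +
    WEIGHTS.acceptRate * s.acceptRate +
    WEIGHTS.responseScore * s.responseScore
  );
}

type Candidate = { id: string; km: number; signals: Signals };

// Highest score first.
function rank(candidates: Candidate[]): { id: string; score: number }[] {
  return candidates
    .map((c) => ({ id: c.id, score: score(c.km, c.signals) }))
    .sort((a, b) => b.score - a.score);
}
```

&lt;p&gt;Under these assumptions, a rehabber 10 km away with an accept rate of 0.1 scores about 0.83, while one 60 km away with an accept rate of 0.9 and otherwise identical signals scores about 0.89, so the farther rehabber ranks first.&lt;/p&gt;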

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkt9mslzoljxuvi5vsi7v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkt9mslzoljxuvi5vsi7v.png" alt="Before / after ranking on the same case, after a single decline" width="800" height="646"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The engineering lesson I did not expect.&lt;/strong&gt; My first Backboard integration used semantic &lt;code&gt;/memories/search&lt;/code&gt; once per rehabber, per case. That code looks correct, but it costs about $0.80 per triage at hackathon volumes.&lt;/p&gt;

&lt;p&gt;Because all of the memory writes are structured and attributable to a rehabber id, the correct access pattern is a single paginated &lt;code&gt;GET /memories&lt;/code&gt; read, filtered in application code. I rewrote it that way and the cost dropped roughly 800x (to fractions of a cent) with no change in ranking quality. Signals are encoded as &lt;code&gt;TERRA_SIGNAL rehabber=&amp;lt;id&amp;gt; key=&amp;lt;k&amp;gt; value=&amp;lt;json&amp;gt;&lt;/code&gt; so the filter is trivial.&lt;/p&gt;
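
&lt;p&gt;A filter over that encoding could look roughly like the sketch below. The function names (&lt;code&gt;parseSignal&lt;/code&gt;, &lt;code&gt;collectSignals&lt;/code&gt;) are hypothetical, and two things are assumed rather than taken from the repo: rehabber ids and keys contain no spaces, and the memory list arrives oldest first so the last write for a key wins.&lt;/p&gt;

```typescript
type Signal = { rehabberId: string; key: string; value: unknown };

const PREFIX = "TERRA_SIGNAL ";

// Parse one memory string; returns null for anything that is not a well-formed signal.
function parseSignal(content: string): Signal | null {
  if (!content.startsWith(PREFIX)) return null;
  const match = /^rehabber=(\S+) key=(\S+) value=(.+)$/.exec(content.slice(PREFIX.length));
  if (match === null) return null;
  try {
    return { rehabberId: match[1], key: match[2], value: JSON.parse(match[3]) };
  } catch {
    return null; // the value was not valid JSON
  }
}

// One pass over the full list: keep the latest value per (rehabber, key)
// for the rehabbers we are ranking, and ignore everything else.
function collectSignals(memories: string[], wanted: string[]) {
  const wantedIds = new Set(wanted);
  const byRehabber: { [id: string]: { [key: string]: unknown } } = {};
  for (const content of memories) {
    const signal = parseSignal(content);
    if (signal === null || !wantedIds.has(signal.rehabberId)) continue;
    const bucket = byRehabber[signal.rehabberId] ?? {};
    bucket[signal.key] = signal.value;
    byRehabber[signal.rehabberId] = bucket;
  }
  return byRehabber;
}
```

&lt;p&gt;The point of the design is that the list read happens once per triage regardless of how many rehabbers are being ranked, instead of one semantic search per rehabber.&lt;/p&gt;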

&lt;p&gt;The final detail: &lt;code&gt;FallbackMemory&lt;/code&gt; is a tiny proxy that prefers Backboard and mirrors every upsert to a local &lt;code&gt;memory_entries&lt;/code&gt; table tagged &lt;code&gt;source='backboard' | 'local_fallback'&lt;/code&gt;. If Backboard is down mid-demo, the app keeps working and the admin timeline shows a red chip so you can see the failover instead of it hiding behind a stack trace.&lt;/p&gt;
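
&lt;p&gt;The proxy idea fits in a few lines. This is a sketch under stated assumptions, not the project's &lt;code&gt;FallbackMemory&lt;/code&gt;: the &lt;code&gt;MemoryStore&lt;/code&gt; interface is hypothetical, and an in-memory array stands in for the &lt;code&gt;memory_entries&lt;/code&gt; table.&lt;/p&gt;

```typescript
type Source = "backboard" | "local_fallback";
type Entry = { key: string; value: unknown; source: Source };

// Hypothetical minimal interface; upsert may return a promise and may throw.
type MemoryStore = { upsert: (key: string, value: unknown) => unknown };

class FallbackMemory {
  // Stand-in for the local memory_entries table.
  readonly localEntries: Entry[] = [];
  private readonly primary: MemoryStore;

  constructor(primary: MemoryStore) {
    this.primary = primary;
  }

  // Try the primary store first, then mirror the write locally,
  // tagged with where it actually landed.
  async upsert(key: string, value: unknown) {
    let source: Source = "backboard";
    try {
      await this.primary.upsert(key, value);
    } catch {
      source = "local_fallback"; // primary unreachable: record the failover
    }
    this.localEntries.push({ key, value, source });
    return source;
  }
}
```

&lt;p&gt;Because every local row carries its &lt;code&gt;source&lt;/code&gt;, a timeline view can flag failovers without inspecting logs.&lt;/p&gt;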

&lt;h3&gt;
  
  
  Auth0 for Agents: scoped consent for a destructive action
&lt;/h3&gt;

&lt;p&gt;"Send referral" is the one button in this app that can annoy a real human being (it emails a licensed rehabber). I treated it as an agent action that must be authorized, not a server-side formality.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/lib/auth/agent-token.ts (excerpt)&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getAgentToken&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;AgentToken&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getSession&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;tokenSet&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;referral:send&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tokenSet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;accessToken&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user-consented&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;referral:send&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;mintM2MToken&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;audience&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AUTH0_AGENT_AUDIENCE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;referral:send&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;PAR is on when the tenant allows it (&lt;code&gt;AUTH0_PAR=1&lt;/code&gt;), so the browser never sees the full authorization params, only a &lt;code&gt;request_uri&lt;/code&gt; handle. The custom &lt;code&gt;consent_context&lt;/code&gt; query parameter carries human-readable context ("email Marcus at Hudson Valley Raptors on your behalf") into the consent screen. If consent is unavailable, we fall back to a scoped machine-to-machine token rather than silently downgrading the action to a service call.&lt;/p&gt;

&lt;p&gt;The UI surfaces which mode was used with an on-screen badge. The narrator can literally point at it on camera and say "scoped." That visibility is the Auth0 story for me: agents should explain themselves, not hide.&lt;/p&gt;

&lt;p&gt;Rehabbers do not have accounts. Their outcome submission goes through an HMAC-signed, single-use, 72-hour magic link (&lt;code&gt;src/lib/auth/magic-link.ts&lt;/code&gt;). Single-use is enforced with a conditional &lt;code&gt;UPDATE ... WHERE outcome IS NULL&lt;/code&gt;, so when two submissions race on the same token, the database lets exactly one of them win.&lt;/p&gt;
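
&lt;p&gt;For readers who have not built one, a signed magic-link token can be sketched with Node's built-in &lt;code&gt;crypto&lt;/code&gt; module. The token layout (&lt;code&gt;referralId.expiresAtMs.signature&lt;/code&gt;), the function names, and the in-memory &lt;code&gt;outcomes&lt;/code&gt; object that stands in for the conditional update are all assumptions for illustration; the real &lt;code&gt;magic-link.ts&lt;/code&gt; may differ. The sketch also assumes referral ids contain no dots.&lt;/p&gt;

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const TTL_MS = 72 * 60 * 60 * 1000; // 72-hour lifetime, as in the post

function sign(payload: string, secret: string): string {
  return createHmac("sha256", secret).update(payload).digest("hex");
}

// Assumed layout: referralId.expiresAtMs.signature
function issueToken(referralId: string, secret: string, now: number): string {
  const payload = `${referralId}.${now + TTL_MS}`;
  return `${payload}.${sign(payload, secret)}`;
}

// Returns the referral id when the signature matches and the token is unexpired.
function verifyToken(token: string, secret: string, now: number): string | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [referralId, expiresAt, signature] = parts;
  const encoder = new TextEncoder();
  const expected = encoder.encode(sign(`${referralId}.${expiresAt}`, secret));
  const given = encoder.encode(signature);
  // timingSafeEqual throws on unequal lengths, so check length first.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  const expiry = Number(expiresAt);
  if (!Number.isFinite(expiry) || now > expiry) return null;
  return referralId;
}

// In-memory stand-in for a conditional update along the lines of
// UPDATE ... SET outcome = ... WHERE id = ... AND outcome IS NULL
const outcomes: { [referralId: string]: string } = {};
function recordOutcome(referralId: string, outcome: string): boolean {
  if (referralId in outcomes) return false; // token already used
  outcomes[referralId] = outcome;
  return true;
}
```

&lt;p&gt;In a real database the check and the write happen in one statement, which is what makes the single-use guarantee hold under concurrency; the in-memory version only mirrors the outcome.&lt;/p&gt;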

&lt;h3&gt;
  
  
  The rest of the stack
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Finder&lt;/strong&gt;: Groq's &lt;code&gt;meta-llama/llama-4-scout-17b-16e-instruct&lt;/code&gt; over the OpenAI-compatible &lt;code&gt;chat/completions&lt;/code&gt; endpoint, with &lt;code&gt;response_format: { type: "json_object" }&lt;/code&gt;. Sub-second vision triage. The expected JSON shape is spelled out in the system message because Groq does not support strict JSON schemas.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Supabase&lt;/strong&gt;: Postgres, RLS, private &lt;code&gt;photos&lt;/code&gt; bucket with short-lived signed URLs. The Finder hashes the resized JPEG bytes and caches triage results, so demo retries are free.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resend&lt;/strong&gt;: transactional email, gated behind a &lt;code&gt;DEMO_MODE&lt;/code&gt; flag for this submission (more on that below).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Next.js 16&lt;/strong&gt; (app router) + server actions, &lt;strong&gt;Tailwind&lt;/strong&gt; + &lt;strong&gt;shadcn/ui&lt;/strong&gt;, &lt;strong&gt;Leaflet&lt;/strong&gt; for the rehabber map.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3auzpsuwdzecmpv4yhcu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3auzpsuwdzecmpv4yhcu.png" alt="Architecture: three agents, one memory backbone" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Seeding 250 rehabbers without spamming any of them
&lt;/h3&gt;

&lt;p&gt;The list you see in the demo is &lt;strong&gt;250 fictional licensed rehabilitators, five per US state&lt;/strong&gt;, generated from a deterministic script (&lt;code&gt;scripts/generate-rehabber-seed.ts&lt;/code&gt;). Every record uses real capital and largest-city coordinates so the distance math is honest, but every email ends in &lt;code&gt;.example.org&lt;/code&gt; (a domain reserved for documentation under RFC 2606, so it can never belong to a real rehabber) and every phone uses the NANPA &lt;code&gt;555-0100..555-0199&lt;/code&gt; block reserved for fiction. None of those addresses can reach a real person. That is deliberate.&lt;/p&gt;
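
&lt;p&gt;A deterministic generator with those two safety properties might look like the sketch below. It is not the contents of &lt;code&gt;generate-rehabber-seed.ts&lt;/code&gt;: the &lt;code&gt;SeedRehabber&lt;/code&gt; shape, the slug and mailbox naming, and the way phone numbers cycle through the reserved block are placeholders, and coordinates are left out.&lt;/p&gt;

```typescript
type SeedRehabber = { id: string; state: string; email: string; phone: string };

// No randomness: the same state list always yields the same rows.
function generateSeed(states: string[], perState = 5): SeedRehabber[] {
  const rows: SeedRehabber[] = [];
  states.forEach((state, stateIndex) => {
    Array.from({ length: perState }, (_, i) => i).forEach((i) => {
      const n = stateIndex * perState + i;
      // Cycle through 0100..0199; the block holds 100 numbers, so they repeat.
      const line = String(100 + (n % 100)).padStart(4, "0");
      const slug = `${state.toLowerCase()}-rehab-${i + 1}`;
      rows.push({
        id: slug,
        state,
        email: `intake@${slug}.example.org`, // reserved domain, never a real inbox
        phone: `555-${line}`, // fictional block
      });
    });
  });
  return rows;
}
```

&lt;p&gt;Called with all 50 state codes and the default of five per state, this yields 250 rows.&lt;/p&gt;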

&lt;p&gt;Two switches control delivery in production:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;DEMO_MODE=1&lt;/code&gt; short-circuits the dispatcher before Resend is ever called. The rendered email is written to a &lt;code&gt;sent_emails_log&lt;/code&gt; table and surfaced at &lt;code&gt;/demo/inbox/&amp;lt;referral_id&amp;gt;&lt;/code&gt;, a server-rendered viewer behind admin basic-auth. The success pane grows a &lt;strong&gt;View captured email&lt;/strong&gt; link so judges can click straight from the app into the message that &lt;em&gt;would have&lt;/em&gt; been sent. Zero outbound traffic, real referral row, real memory signal, real magic-link outcome loop.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DEMO_REDIRECT_TO=you@example.com&lt;/code&gt; keeps Resend in the loop but rewrites every recipient to a single verified inbox and prefixes the subject &lt;code&gt;[DEMO -&amp;gt; original@address]&lt;/code&gt;. Useful for recording a live walkthrough where you want a real email to arrive on your phone.&lt;/li&gt;
&lt;/ul&gt;
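
&lt;p&gt;The two switches amount to a small routing decision made before any email provider is touched. The sketch below shows one way to express it; &lt;code&gt;planDelivery&lt;/code&gt;, the &lt;code&gt;Delivery&lt;/code&gt; type, and the rule that &lt;code&gt;DEMO_MODE&lt;/code&gt; takes precedence when both are set are assumptions, not the dispatcher's actual code.&lt;/p&gt;

```typescript
type Email = { to: string; subject: string; html: string };
type Env = { DEMO_MODE?: string; DEMO_REDIRECT_TO?: string };

// captured: logged only, never sent. redirected: sent to one inbox. live: sent as addressed.
type Delivery = { kind: "captured" | "redirected" | "live"; email: Email };

function planDelivery(email: Email, env: Env): Delivery {
  // Assumed precedence: capture wins over redirect.
  if (env.DEMO_MODE === "1") return { kind: "captured", email };
  if (env.DEMO_REDIRECT_TO) {
    return {
      kind: "redirected",
      email: {
        ...email,
        to: env.DEMO_REDIRECT_TO,
        // Keep the original recipient visible in the subject line.
        subject: `[DEMO -> ${email.to}] ${email.subject}`,
      },
    };
  }
  return { kind: "live", email };
}
```

&lt;p&gt;Keeping this as a pure function means the caller decides what to do with each &lt;code&gt;kind&lt;/code&gt;, and the routing itself can be tested without a network.&lt;/p&gt;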

&lt;p&gt;Both paths delete the referral row if the send actually fails, so the case page never shows a phantom "awaiting response" card for a message that never left the server.&lt;/p&gt;

&lt;h3&gt;
  
  
  What I cut, and the real path to launch
&lt;/h3&gt;

&lt;p&gt;The biggest thing I cut: &lt;strong&gt;real rehabber contacts.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There is no global registry of licensed wildlife rehabilitators. US coverage is fragmented state-by-state, sometimes county-by-county, and most other countries (mine included) have no centralized list at all.&lt;/p&gt;

&lt;p&gt;The tempting fix is to scrape state-agency PDFs and let an LLM parse them into rows. I refused to ship that for three reasons: (1) scraping public directories into a third-party product violates most of those agencies' terms of use, (2) the data is stale the moment you capture it (licenses lapse, phones change), and (3) language models invent plausible-looking email addresses. Sending a real referral to a hallucinated inbox is worse than returning no results.&lt;/p&gt;

&lt;p&gt;So the 250 rows in this build are honest placeholders that exercise the ranking math without lying to anyone. Production needs a different sourcing path, and I think there are only three real options:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Partner with the &lt;a href="https://ahnow.org" rel="noopener noreferrer"&gt;Animal Help Now&lt;/a&gt; 501(c)(3).&lt;/strong&gt; AHN already runs a consented, maintained database of thousands of rehabbers across the US. A partnership integration (their pipeline, our ranking and memory layer) is the only path that ships real coverage without recreating two decades of stewardship work. This is what I would pursue first, post-hackathon.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A self-serve rehabber portal.&lt;/strong&gt; Licensed rehabbers sign up, verify their license number against the relevant state registry, accept a Terra Triage ToS, and opt in to receive referrals. Growth is slow but consent is unambiguous and the data stays fresh because each rehabber owns their own row. This is the right fallback if #1 does not pan out.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-state agency MoUs.&lt;/strong&gt; Some state wildlife agencies distribute their rehabber lists under explicit terms. Where those terms permit a downstream dispatcher, you sign a memorandum and import. Slow, jurisdiction-by-jurisdiction, but legally clean where it applies.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;What will not change is the consent requirement. Regardless of sourcing path, every rehabber in the live system needs a signed agreement covering referral delivery, PII handling, license verification, and a clear opt-out before they can be ranked. That is table stakes, not a feature.&lt;/p&gt;

&lt;p&gt;The data model for all three paths already exists in this repo (&lt;code&gt;rehabbers&lt;/code&gt; table with &lt;code&gt;active&lt;/code&gt; flag, &lt;code&gt;species_scope&lt;/code&gt;, license metadata). The discovery pipeline and ToS flow are next weekend's work.&lt;/p&gt;




&lt;h2&gt;
  
  
  Prize Categories
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Primary: Best Use of Backboard.&lt;/strong&gt; Memory drives a computed decision, not an LLM prompt. Every rank reads signals first; every outcome writes them back; the admin timeline makes the loop visible on screen. A &lt;code&gt;FallbackMemory&lt;/code&gt; proxy keeps the app alive if Backboard is unreachable and tags the origin so failover is auditable. The cost model went from $0.80 per triage to fractions of a cent after rewriting from per-rehabber semantic search to a single filtered list read.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Secondary: Best Use of Auth0 for Agents.&lt;/strong&gt; The Dispatcher is a first-class OAuth client scoped to &lt;code&gt;referral:send&lt;/code&gt;, with PAR when available and an M2M fallback, and the UI labels which mode was used. Rehabbers authenticate through HMAC-signed, single-use magic links with DB-level replay protection.&lt;/p&gt;

&lt;p&gt;Built solo in a weekend, with GitHub Copilot CLI as co-author and zero paid services.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>weekendchallenge</category>
      <category>ai</category>
      <category>backboard</category>
    </item>
    <item>
      <title>MergeGuardian 9000: I Built an AI Code Reviewer With a 0% Approval Rate</title>
      <dc:creator>Arqam Waheed</dc:creator>
      <pubDate>Tue, 07 Apr 2026 14:19:14 +0000</pubDate>
      <link>https://dev.to/arqamwd/mergeguardian-9000-i-built-an-ai-code-reviewer-with-a-0-approval-rate-5ecm</link>
      <guid>https://dev.to/arqamwd/mergeguardian-9000-i-built-an-ai-code-reviewer-with-a-0-approval-rate-5ecm</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/aprilfools-2026"&gt;DEV April Fools Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;I've opened hundreds of pull requests in my career. Fixed typos. Refactored auth flows. Centered divs. And every single time, some reviewer finds a reason to block the merge. Not because the code is bad. Because the &lt;em&gt;vibes are off&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;By PR #200, I realized the problem wasn't my code. It was that no tool existed to formalize the experience of being told your perfectly working code is somehow insufficient. So I built the tool myself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MergeGuardian 9000&lt;/strong&gt; is an AI-powered pull request review platform with a guaranteed &lt;strong&gt;0.00% approval rate&lt;/strong&gt;. You paste your code, pick a reviewer persona, and within seconds Google Gemini delivers a devastatingly thorough review that finds profoundly absurd reasons to block your merge.&lt;/p&gt;

&lt;p&gt;It looks exactly like a real GitHub PR review. Verdict cards. Status checks. Inline comments. A merge button at the bottom. Except the merge button is permanently disabled. And the status checks are things like "Existential Debt Audit" and "Naming Karma Validation." And the verdict is always one of three options: &lt;code&gt;changes_requested&lt;/code&gt;, &lt;code&gt;blocked&lt;/code&gt;, or &lt;code&gt;spiritually_rejected&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Here's the thing that makes it actually work: &lt;strong&gt;Gemini reads your real code&lt;/strong&gt;. This isn't a random joke generator. Google Gemini analyzes your actual functions, your variable names, your architecture choices, and then finds deeply specific reasons why none of it is merge-worthy. Paste a &lt;code&gt;function add(a, b) { return a + b }&lt;/code&gt; and the Guardian will explain how your function "shows a troubling belief that problems can be solved by combining things."&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fih2dnosk64k6hc4u2dgn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fih2dnosk64k6hc4u2dgn.png" alt=" " width="800" height="892"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Five Horsemen of Code Review
&lt;/h3&gt;

&lt;p&gt;Every enterprise platform needs opinionated reviewers. MergeGuardian ships with five, each backed by its own Gemini system prompt that gives the AI a distinct personality:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Persona&lt;/th&gt;
&lt;th&gt;Title&lt;/th&gt;
&lt;th&gt;Blocking Style&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;🛡️ Guardian Core&lt;/td&gt;
&lt;td&gt;Senior Review Orchestrator&lt;/td&gt;
&lt;td&gt;References fake policies like "Guardian Policy 7.4.2"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;📋 Compliance Beast&lt;/td&gt;
&lt;td&gt;Chief Policy Enforcement Officer&lt;/td&gt;
&lt;td&gt;Sees SOC2 violations in your variable names&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;💀 Staff Engineer of Doom&lt;/td&gt;
&lt;td&gt;Principal Taste Architect&lt;/td&gt;
&lt;td&gt;Has seen better implementations in languages you haven't learned yet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🤖 AI Optimizer&lt;/td&gt;
&lt;td&gt;Metrics &amp;amp; Confidence Analyst&lt;/td&gt;
&lt;td&gt;Your semantic drift score is 0.89. Acceptable range: 0.00 to 0.02.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;😊 Passive-Aggressive Teammate&lt;/td&gt;
&lt;td&gt;Friendly Neighborhood Blocker&lt;/td&gt;
&lt;td&gt;"Just a thought, but have you considered not merging this? Totally up to you! 😊"&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each persona has its own Gemini system prompt, its own blocking patterns, and its own way of making you question your career choices. Same model. Same API. Five completely different voices. That's the fun part of Gemini's system prompt flexibility.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Loading Theater
&lt;/h3&gt;

&lt;p&gt;No enterprise tool is complete without unnecessary ceremony. When you submit a review, the Guardian runs through a 12-stage "Enterprise Review Pipeline":&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foo1atth45yn5y5m3y7kw.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foo1atth45yn5y5m3y7kw.gif" alt=" " width="760" height="323"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The stages include gems like "Validating emotional idempotency" and "Cross-referencing naming karma." A progress bar ticks up from 0% to 100%. The final stage, "Finalizing disappointment," always fails with a red X. Because of course it does.&lt;/p&gt;

&lt;p&gt;Here's the funny part: Gemini 2.0 Flash responds in 1-3 seconds. The loading theater takes longer than the actual AI generation. Enterprise ceremony demands it.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Appeal System
&lt;/h3&gt;

&lt;p&gt;Here's where it gets good. After your merge gets blocked, you can file an appeal. The "Senior Merge Arbitration Officer" reviews your case via a fresh Gemini call and... denies it. With even more elaborate reasoning.&lt;/p&gt;

&lt;p&gt;Not satisfied? Escalate to the "Principal Philosophy of Code Director." Still denied. Final appeal goes to the "Supreme Architect of the Eternal Codebase." Three rounds of escalating absurdity, each powered by a separate Gemini API call with its own system prompt that shifts the AI's entire personality.&lt;/p&gt;
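
&lt;p&gt;The escalation ladder is easy to model as data. A minimal sketch, assuming a hypothetical &lt;code&gt;fileAppeal&lt;/code&gt; helper and result shape (the real endpoint would also pick a system prompt per round and call Gemini, which is omitted here):&lt;/p&gt;

```typescript
// Officer titles from the post, in escalation order.
const APPEAL_OFFICERS = [
  "Senior Merge Arbitration Officer",
  "Principal Philosophy of Code Director",
  "Supreme Architect of the Eternal Codebase",
];

type AppealResult = {
  round: number;
  officer: string;
  status: "denied"; // the only possible outcome
  canEscalate: boolean;
};

// Rounds are 1-based; anything past the last officer is an error.
function fileAppeal(round: number): AppealResult {
  const officer = APPEAL_OFFICERS[round - 1];
  if (officer === undefined) throw new RangeError(`No appeal round ${round}`);
  return {
    round,
    officer,
    status: "denied",
    canEscalate: APPEAL_OFFICERS.length > round,
  };
}
```

&lt;p&gt;Typing &lt;code&gt;status&lt;/code&gt; as the single literal &lt;code&gt;"denied"&lt;/code&gt; means an approval cannot even be represented.&lt;/p&gt;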

&lt;p&gt;Round 3 denials hit different: &lt;em&gt;"We ran your code through a quantum computer. In every possible timeline, this merge was blocked."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzxcai4kookessmlnfic.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzxcai4kookessmlnfic.png" alt=" " width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Code Quality Roast
&lt;/h3&gt;

&lt;p&gt;Click "Run Code Quality Analysis" and Gemini generates a full enterprise metrics dashboard for your code. The AI returns structured JSON with scores, grades, and per-metric roast explanations. Every metric is suspiciously terrible:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Semantic Cohesion: 12%&lt;/strong&gt; ... "Your functions communicate like divorced parents at a school play"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bus Factor Resilience: 3%&lt;/strong&gt; ... "If you get hit by a bus, this code dies alone"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vibe Alignment Score: 8%&lt;/strong&gt; ... "This code has the structural integrity of a house of cards in a wind tunnel"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Overall grade: F. AI confidence: 99.7% certain this should not ship.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxwkch59qnlugw84z7ol1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxwkch59qnlugw84z7ol1.png" alt=" " width="800" height="578"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Bring Your Own Gemini Key 🔑
&lt;/h3&gt;

&lt;p&gt;You can paste your own Google Gemini API key directly in the UI. It stays in your browser's &lt;code&gt;localStorage&lt;/code&gt; and never goes anywhere except the app's own API routes. No &lt;code&gt;.env&lt;/code&gt; file. No cloning repos. Just grab a &lt;a href="https://aistudio.google.com/apikey" rel="noopener noreferrer"&gt;free key from Google AI Studio&lt;/a&gt;, paste it in, and unlock AI-powered reviews instantly.&lt;/p&gt;

&lt;p&gt;The Gemini free tier gives you 60 requests per minute and 1,000 per day. That's enough to get roasted hundreds of times without spending a cent. The entire app runs at zero cost.&lt;/p&gt;

&lt;p&gt;Without a key the app still works perfectly. Our handcrafted fallback engine has 80+ jokes and serves the same JSON shape. But with Gemini the reviews get personal.&lt;/p&gt;
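
&lt;p&gt;Serving "the same JSON shape" from both paths can be sketched as below. Everything here is illustrative: &lt;code&gt;getReview&lt;/code&gt;, &lt;code&gt;fallbackReview&lt;/code&gt;, and the &lt;code&gt;callModel&lt;/code&gt; parameter are hypothetical names, the fallback lines are placeholders rather than the app's real jokes, and validating the model output before trusting it is an added assumption. Only the three verdict values come from the post.&lt;/p&gt;

```typescript
type Verdict = "changes_requested" | "blocked" | "spiritually_rejected";
type Review = { verdict: Verdict; summary: string };

const VERDICTS: Verdict[] = ["changes_requested", "blocked", "spiritually_rejected"];

// Placeholder lines; the app's handcrafted jokes are not reproduced here.
const FALLBACK_LINES = [
  "Placeholder rejection A",
  "Placeholder rejection B",
  "Placeholder rejection C",
  "Placeholder rejection D",
];

// Deterministic pick, so the same paste always gets the same rejection.
function fallbackReview(code: string): Review {
  return {
    verdict: VERDICTS[code.length % VERDICTS.length],
    summary: FALLBACK_LINES[code.length % FALLBACK_LINES.length],
  };
}

// Stand-in for the Gemini request; may return a promise.
type ModelCall = (code: string, apiKey: string) => unknown;
type MaybeReview = { verdict?: unknown; summary?: unknown } | null;

async function getReview(code: string, apiKey: string | null, callModel: ModelCall) {
  if (apiKey === null) return fallbackReview(code);
  const raw = (await callModel(code, apiKey)) as MaybeReview;
  // Accept the model output only if it fits the Review shape.
  const verdict = VERDICTS.find((v) => v === raw?.verdict);
  const summary = raw?.summary;
  if (verdict === undefined || typeof summary !== "string") return fallbackReview(code);
  return { verdict, summary };
}
```

&lt;p&gt;With this shape, a model response that tried to return an approval would fail validation and fall back to a canned rejection.&lt;/p&gt;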

&lt;h3&gt;
  
  
  10 Sample PRs to Get Roasted
&lt;/h3&gt;

&lt;p&gt;Don't have code handy? Pick from 10 pre-loaded PRs including "Fix typo in button label" (still gets blocked), "feat: implement entire todo app" (built during a meeting, naturally rejected), "feat: add vibe-based code generation" (the Guardian has thoughts about vibes), and "feat: decentralized merge approval via blockchain" (the MergeChain has a 0% approval rate by design).&lt;/p&gt;

&lt;h3&gt;
  
  
  Easter Eggs 🫖
&lt;/h3&gt;

&lt;p&gt;Visit &lt;code&gt;/418&lt;/code&gt; and you'll find an ASCII art teapot with animated steam, a tribute to RFC 2324, and a teapot status dashboard showing: Temperature ∞°C, Brew Status: Philosophically Brewing, Capacity: Unlimited Disappointment.&lt;/p&gt;

&lt;p&gt;The 404 page is on brand too. Even our errors reject you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Live demo:&lt;/strong&gt; &lt;a href="https://april-fools-hackathon.vercel.app/" rel="noopener noreferrer"&gt;april-fools-hackathon.vercel.app&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Paste code. Pick a persona. Get blocked. Appeal. Get blocked harder. Share your rejection on Twitter.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkg1ju7kkdueei6chdkhy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkg1ju7kkdueei6chdkhy.png" alt=" " width="800" height="595"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/ArqamWaheed" rel="noopener noreferrer"&gt;
        ArqamWaheed
      &lt;/a&gt; / &lt;a href="https://github.com/ArqamWaheed/april-fools-hackathon" rel="noopener noreferrer"&gt;
        april-fools-hackathon
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;🛡️ MergeGuardian 9000&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;The AI-powered code review platform that blocks every merge — for your own good.&lt;/strong&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Your code compiles, tests pass, but the universe has not consented."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;MergeGuardian 9000 is an enterprise-grade AI pull request review platform with a &lt;strong&gt;0.00% approval rate&lt;/strong&gt;. Paste your code, select a reviewer persona, and watch as the Guardian finds profoundly absurd reasons to block your merge.&lt;/p&gt;
&lt;p&gt;Built for the &lt;a href="https://dev.to/devteam/join-our-april-fools-challenge-for-a-chance-at-tea-rrific-prizes-1ofa" rel="nofollow"&gt;DEV April Fools Challenge 2026&lt;/a&gt;.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;✨ Features&lt;/h2&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;5 Reviewer Personas&lt;/strong&gt; — Each with a unique personality and blocking style:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;🛡️ &lt;strong&gt;Guardian Core&lt;/strong&gt; — Senior Review Orchestrator&lt;/li&gt;
&lt;li&gt;📋 &lt;strong&gt;Compliance Beast&lt;/strong&gt; — Chief Policy Enforcement Officer&lt;/li&gt;
&lt;li&gt;💀 &lt;strong&gt;Staff Engineer of Doom&lt;/strong&gt; — Principal Taste Architect&lt;/li&gt;
&lt;li&gt;🤖 &lt;strong&gt;AI Optimizer&lt;/strong&gt; — Metrics &amp;amp; Confidence Analyst&lt;/li&gt;
&lt;li&gt;😊 &lt;strong&gt;Passive-Aggressive Teammate&lt;/strong&gt; — Friendly Neighborhood Blocker&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Google Gemini AI Integration&lt;/strong&gt; — Uses &lt;code&gt;gemini-2.0-flash&lt;/code&gt; across 3 endpoints with 8+ system prompts for contextually absurd reviews, appeal denials, and code roasts&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Bring&lt;/strong&gt;…&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/ArqamWaheed/april-fools-hackathon" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;h3&gt;
  
  
  Project Structure
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;src/
├── app/
│   ├── api/
│   │   ├── review/route.ts       # Main review endpoint (Gemini AI)
│   │   ├── appeal/route.ts       # Appeal escalation endpoint (Gemini AI)
│   │   └── roast/route.ts        # Code metrics roast endpoint (Gemini AI)
│   ├── 418/page.tsx              # 🫖 Easter egg
│   ├── not-found.tsx             # On-brand 404
│   ├── layout.tsx                # Root layout
│   └── page.tsx                  # Main orchestrator
├── components/
│   ├── PRHeader.tsx              # PR breadcrumb &amp;amp; labels
│   ├── CodeInput.tsx             # Code editor with line numbers
│   ├── SamplePRSelector.tsx      # 10 sample PR picker
│   ├── ReviewerSwitcher.tsx      # 5 persona selector
│   ├── ApiKeyInput.tsx           # Gemini API key input (localStorage)
│   ├── LoadingTheater.tsx        # 12-stage pipeline animation
│   ├── VerdictCard.tsx           # Review verdict display
│   ├── CheckRunList.tsx          # Fake status checks
│   ├── ReviewComments.tsx        # Inline review comments
│   ├── MergeBox.tsx              # Permanently blocked merge button
│   ├── AppealFlow.tsx            # 3-round appeal escalation
│   └── RoastDashboard.tsx        # Enterprise metrics roast
└── lib/
    ├── types.ts                  # TypeScript interfaces
    ├── sample-prs.ts             # 10 sample PRs, 5 personas
    ├── fallback.ts               # Review fallback (80+ jokes)
    ├── appeal.ts                 # Appeal prompts + fallback
    ├── roast.ts                  # Roast prompts + fallback
    ├── prompts.ts                # Gemini prompt builders
    └── ai.ts                     # Gemini API integration
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How I Built It
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Multi-Agent Gemini Architecture
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm6esqz1oy0r0hm8sib3z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm6esqz1oy0r0hm8sib3z.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This isn't a single API call to Gemini with "be funny." MergeGuardian uses &lt;strong&gt;3 distinct Gemini-powered endpoints&lt;/strong&gt;, each with a different AI "role" and system prompt:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Endpoint&lt;/th&gt;
&lt;th&gt;AI Role&lt;/th&gt;
&lt;th&gt;What It Does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;POST /api/review&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Code Reviewer&lt;/td&gt;
&lt;td&gt;Reads your actual code, generates verdict + checks + comments + block reason&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;POST /api/appeal&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Merge Arbitration Officer&lt;/td&gt;
&lt;td&gt;Reviews your appeal against the original block, always denies with escalating absurdity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;POST /api/roast&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Code Quality Analyst&lt;/td&gt;
&lt;td&gt;Generates fake enterprise metrics with devastating per-metric explanations&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Every endpoint follows the same pattern:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Build a persona-specific system prompt&lt;/li&gt;
&lt;li&gt;Send code + context to &lt;code&gt;gemini-2.0-flash&lt;/code&gt; via the Google Generative AI SDK&lt;/li&gt;
&lt;li&gt;Get structured JSON back via &lt;code&gt;responseMimeType: "application/json"&lt;/code&gt;, Gemini's native structured output mode&lt;/li&gt;
&lt;li&gt;If Gemini fails (rate limit, timeout, no key), fall back to a handcrafted template engine&lt;/li&gt;
&lt;/ol&gt;
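&lt;p&gt;Steps 3 and 4 can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the &lt;code&gt;Review&lt;/code&gt; shape and the names &lt;code&gt;isReview&lt;/code&gt;, &lt;code&gt;templateReview&lt;/code&gt;, and &lt;code&gt;resolveReview&lt;/code&gt; are hypothetical, and it assumes the model's reply arrives as a JSON string, or as &lt;code&gt;null&lt;/code&gt; when the call failed.&lt;/p&gt;

```typescript
// Hypothetical response shape; the real schema in the repo may differ.
interface Review {
  verdict: string;
  blockReason: string;
  comments: string[];
}

// Runtime shape check, so a malformed reply is treated like a failed call.
function isReview(value: unknown): value is Review {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as { [key: string]: unknown };
  if (typeof candidate["verdict"] !== "string") return false;
  if (typeof candidate["blockReason"] !== "string") return false;
  return Array.isArray(candidate["comments"]);
}

// Stand-in for the handcrafted template engine (assumed, heavily simplified).
function templateReview(): Review {
  return {
    verdict: "BLOCKED",
    blockReason: "The universe has not consented.",
    comments: ["Just a thought: have you considered not merging?"],
  };
}

// rawJson is the model's reply text, or null when the request failed
// (rate limit, timeout, no key). Either way the caller gets a Review.
function resolveReview(rawJson: string | null): Review {
  if (rawJson === null) return templateReview();
  try {
    const parsed: unknown = JSON.parse(rawJson);
    return isReview(parsed) ? parsed : templateReview();
  } catch {
    return templateReview();
  }
}
```

&lt;p&gt;Because the fallback returns the same shape as a successful reply, the UI code never needs to know which path produced the review.&lt;/p&gt;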

&lt;p&gt;The fallback engines aren't afterthoughts. Each one has its own curated joke bank: 80+ review comments across 5 categories (bureaucratic, anthropomorphic, metrics, passive-aggressive, philosophical), 14 fake checks, 16 block reasons, 18 impossible next steps, 24+ appeal denial rulings, and a full library of fake enterprise metrics. The app is hilarious with or without an API key.&lt;/p&gt;

&lt;p&gt;Here's how the review prompt works under the hood:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Each persona gets a tailored system prompt&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;PERSONA_PROMPTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;guardian_core&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You are balanced but firm. Every PR has potential, but none are ready. Reference fake standards like 'Guardian Policy 7.4.2'...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;compliance_beast&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You see policy violations everywhere. Reference audit trails, SOC2, change management protocols...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;passive_aggressive_teammate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Phrase everything as friendly suggestions that are absolutely requirements. Use 'just a thought' and 'totally up to you' liberally. You are smiling while blocking.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
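&lt;p&gt;One way to turn a persona ID into the final system prompt is sketched below. It is an assumption-laden illustration rather than the repo's code: &lt;code&gt;BASE_PROMPT&lt;/code&gt;, &lt;code&gt;DEFAULT_PERSONA_PROMPT&lt;/code&gt;, and &lt;code&gt;buildSystemPrompt&lt;/code&gt; are hypothetical names, and the prompt strings are shortened stand-ins for the ones shown above.&lt;/p&gt;

```typescript
// Shortened stand-ins for the persona prompts shown above.
const DEFAULT_PERSONA_PROMPT =
  "You are balanced but firm. Every PR has potential, but none are ready.";

const PERSONA_PROMPTS: { [id: string]: string } = {
  guardian_core: DEFAULT_PERSONA_PROMPT,
  compliance_beast: "You see policy violations everywhere.",
};

// Shared instructions prepended to every persona (hypothetical wording).
const BASE_PROMPT =
  "You review pull requests and never approve a merge. Reply in JSON only.";

// Unknown persona IDs fall back to the default persona instead of throwing.
// hasOwnProperty guards against inherited keys such as "constructor".
function buildSystemPrompt(personaId: string): string {
  const known = Object.prototype.hasOwnProperty.call(PERSONA_PROMPTS, personaId);
  const persona = known ? PERSONA_PROMPTS[personaId] : DEFAULT_PERSONA_PROMPT;
  return BASE_PROMPT + "\n\n" + persona;
}
```

&lt;p&gt;Keeping the shared rules in one place means each persona string only has to describe its voice.&lt;/p&gt;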



&lt;p&gt;The appeal system uses escalating round-based prompts. Round 1 is bureaucratic ("Your appeal has been forwarded to the Department of Merge Ethics. Average response time: 6-8 business millennia."). Round 2 gets philosophical. Round 3 goes full existential. Each round is a separate Gemini call with a different system prompt, so the AI's personality genuinely shifts as you escalate.&lt;/p&gt;
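&lt;p&gt;The round-based selection could look something like this. The prompt wording and the names &lt;code&gt;APPEAL_ROUND_PROMPTS&lt;/code&gt; and &lt;code&gt;appealPromptForRound&lt;/code&gt; are invented for illustration; only the three-round progression comes from the description above.&lt;/p&gt;

```typescript
// One system prompt per appeal round; the wording here is invented.
const APPEAL_ROUND_PROMPTS: string[] = [
  "Round 1: respond as a bureaucrat. Cite departments, forms, and waiting periods.",
  "Round 2: respond as a philosopher. Question whether merging is knowable.",
  "Round 3: respond with existential dread. Nothing merges; nothing ever has.",
];

// Rounds are 1-based. Out-of-range values are clamped, so a fourth appeal
// stays existential rather than crashing. Non-finite input counts as round 1.
function appealPromptForRound(round: number): string {
  const safeRound = isFinite(round) ? Math.floor(round) : 1;
  const lastIndex = APPEAL_ROUND_PROMPTS.length - 1;
  const index = Math.min(Math.max(safeRound - 1, 0), lastIndex);
  return APPEAL_ROUND_PROMPTS[index];
}
```

&lt;p&gt;Each appeal request would then send the selected prompt as the system instruction, which is what lets the tone shift between rounds.&lt;/p&gt;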

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6doyfkoh56wi3f69i094.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6doyfkoh56wi3f69i094.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Google AI Toolchain
&lt;/h3&gt;

&lt;p&gt;Building 8+ system prompts for different AI characters is a lot of prompt engineering. &lt;strong&gt;Google AI Studio&lt;/strong&gt; was the backbone of that process. I used the chat playground to prototype every persona voice, swapping system instructions to A/B test whether the Compliance Beast sounded different enough from the Staff Engineer of Doom. I validated that Gemini's structured output mode could handle complex nested JSON. Arrays of checks. Inline comments. Metric objects. All reliably typed. When a prompt needed iteration, I could edit the system instruction and re-run the same user input instantly.&lt;/p&gt;

&lt;p&gt;I also used &lt;strong&gt;Gemini CLI&lt;/strong&gt; (&lt;code&gt;npx @google/gemini-cli&lt;/code&gt;) for rapid prompt testing straight from the terminal. When I wanted to quickly test how a persona responded to a specific code snippet without context-switching to the browser, I'd pipe code directly into Gemini from the command line. Useful for fast iteration on edge cases, like making sure the AI Optimizer persona generates fake metrics with decimal precision even for a one-line function.&lt;/p&gt;

&lt;p&gt;I explored a few other Google AI features during development that didn't make the cut. &lt;strong&gt;Nano Banana&lt;/strong&gt;, Google's image generation model, was tempting. I considered having it generate fake "architecture violation diagrams" as part of the review. Imagine a UML diagram of why your code is spiritually misaligned. But in testing, the text roasts were funnier than any image could be. I also looked at &lt;strong&gt;function calling&lt;/strong&gt; for simulating tool-use patterns in reviews, &lt;strong&gt;code execution&lt;/strong&gt; for actually running the submitted code and roasting the output, and &lt;strong&gt;Google Search grounding&lt;/strong&gt; for finding real coding standards to parody. In each case, the simpler approach won. The comedy comes from Gemini playing a character and committing to the bit, not from adding complexity.&lt;/p&gt;

&lt;p&gt;For deployment, the app is &lt;strong&gt;Google Cloud Run-ready&lt;/strong&gt;. The repo includes a multi-stage &lt;code&gt;Dockerfile&lt;/code&gt; optimized for Next.js standalone output and a &lt;code&gt;cloudbuild.yaml&lt;/code&gt; for automated builds via Google Cloud Build. One &lt;code&gt;gcloud builds submit&lt;/code&gt; and the app is live on Cloud Run with auto-scaling, managed TLS, and the free tier covering 2 million requests per month. The live demo runs on Vercel for convenience, but the Cloud Run configs are there and tested. Full Google stack, top to bottom.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Stack
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Google Gemini API (&lt;code&gt;gemini-2.0-flash&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;AI generation (3 endpoints, 8+ system prompts, structured JSON output)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google AI Studio&lt;/td&gt;
&lt;td&gt;Prompt prototyping, system instruction editing, structured output validation, persona A/B testing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini CLI (&lt;code&gt;npx @google/gemini-cli&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Rapid terminal-based prompt testing during development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Next.js 14 (App Router)&lt;/td&gt;
&lt;td&gt;Framework&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TypeScript (strict mode)&lt;/td&gt;
&lt;td&gt;Language&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tailwind CSS v3&lt;/td&gt;
&lt;td&gt;Styling (custom &lt;code&gt;guardian&lt;/code&gt; color palette)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lucide React&lt;/td&gt;
&lt;td&gt;Icons&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vercel&lt;/td&gt;
&lt;td&gt;Deployment&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Why It's Not Just "Call Gemini and Be Funny"
&lt;/h3&gt;

&lt;p&gt;The entire comedy engine runs on Gemini playing characters. Not templates. Not mad-libs. The AI reads your code, inhabits a persona, and improvises within a structured JSON schema. That's what makes every review different.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Per-persona prompt engineering.&lt;/strong&gt; Five distinct system prompts, each producing genuinely different blocking patterns. The Compliance Beast cites fake audit trails. The AI Optimizer invents metrics to false precision. The Passive-Aggressive Teammate smiles while destroying your confidence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Structured JSON output.&lt;/strong&gt; Gemini doesn't return a blob of text. It returns typed JSON with verdict, checks, comments, block reasons, and next steps via &lt;code&gt;responseMimeType: "application/json"&lt;/code&gt;. Every field maps to its own UI component. No parsing. No regex. No "please format your response as JSON." Just Gemini's native structured output mode. This is a key Google AI feature that made the whole architecture possible, letting AI-generated comedy flow directly into typed React components.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Graceful degradation.&lt;/strong&gt; Every Gemini endpoint has a matching fallback generator that produces the exact same JSON shape. If the API is down, the demo still works perfectly. You'll never see an error state.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Three distinct AI roles.&lt;/strong&gt; The reviewer, the arbitration officer, and the metrics analyst each have different system prompts, different response schemas, and different comedy patterns. This isn't one trick repeated three times.&lt;/p&gt;

&lt;p&gt;Honestly, the whole reason this project exists is that Gemini turned out to be surprisingly good at playing different characters. I started with one API call and ended up with three endpoints because each "reviewer persona" needed its own voice, its own system prompt, its own response format. I prototyped all of them in Google AI Studio first, tweaking system instructions and testing structured output until the JSON was reliable and the jokes were landing. The structured JSON output made it possible to pipe AI-generated comedy directly into typed UI components without parsing nightmares. That rabbit hole is what made the project fun to build.&lt;/p&gt;

&lt;p&gt;And I think it's fun to use because every developer has lived this. The reviewer who blocks your typo fix over "architectural implications." The one who says "just a thought" and then marks it as a blocker. MergeGuardian takes that universal pain and turns it into something you can screenshot, tweet, and argue about in Slack.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prize Category
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Best Google AI Usage
&lt;/h3&gt;

&lt;p&gt;I'm submitting for Best Google AI Usage because Google Gemini isn't a feature of MergeGuardian 9000. It &lt;em&gt;is&lt;/em&gt; MergeGuardian 9000. The entire comedy engine is Gemini playing characters and committing to the bit. Not templates. Not mad-libs. Every review is improvised.&lt;/p&gt;

&lt;p&gt;Here's the full scope of Google AI integration:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3 Gemini-powered API endpoints&lt;/strong&gt;, each acting as a different AI agent. The code review endpoint has 5 persona-specific system prompts. The appeal endpoint has 3 round-based system prompts that shift from bureaucratic to philosophical to existential. The roast endpoint generates structured metric data with AI explanations. That's &lt;strong&gt;8+ unique Gemini system prompts&lt;/strong&gt; across the app.&lt;/p&gt;

&lt;p&gt;Every endpoint uses Gemini's native &lt;strong&gt;structured JSON output&lt;/strong&gt; (&lt;code&gt;responseMimeType: "application/json"&lt;/code&gt;). The AI returns typed objects with verdicts, arrays of checks, inline comments, metric scores, and denial rulings. No string parsing. No regex extraction. Just structured data flowing directly into React components.&lt;/p&gt;

&lt;p&gt;All prompt engineering was done in &lt;strong&gt;Google AI Studio&lt;/strong&gt;. Every persona voice was prototyped in AI Studio's chat playground. I used system instruction swapping to A/B test persona voices, validated complex nested JSON schemas in structured output mode, and iterated appeal escalation prompts until the comedic arc from Round 1 to Round 3 landed right. AI Studio was the prompt workshop. The codebase was just the final deployment.&lt;/p&gt;

&lt;p&gt;I used &lt;strong&gt;Gemini CLI&lt;/strong&gt; (&lt;code&gt;npx @google/gemini-cli&lt;/code&gt;) for fast terminal-based prompt testing. When I needed to check how a specific persona handled a code snippet without opening AI Studio, I'd test it right from the command line. Great for edge cases and quick iterations.&lt;/p&gt;

&lt;p&gt;The app has a &lt;strong&gt;Bring Your Own Key&lt;/strong&gt; feature that links directly to &lt;a href="https://aistudio.google.com/apikey" rel="noopener noreferrer"&gt;Google AI Studio's API key page&lt;/a&gt;. Users grab a free key, paste it in, and unlock AI reviews. The Gemini &lt;strong&gt;free tier&lt;/strong&gt; (60 requests/minute, 1,000/day) runs the entire app at zero cost. No billing required. No API key required for the demo either, since the fallback engine serves the same JSON shape.&lt;/p&gt;

&lt;p&gt;I chose &lt;strong&gt;Gemini 2.0 Flash&lt;/strong&gt; specifically for speed. It responds in 1-3 seconds, which means the fake 12-stage "Enterprise Review Pipeline" loading theater genuinely takes longer than the actual AI generation. The model handles persona-switching through system prompts remarkably well. Five genuinely different reviewer voices from one model.&lt;/p&gt;

&lt;p&gt;I explored other Google AI capabilities too. &lt;strong&gt;Function calling&lt;/strong&gt; for simulating tool-use patterns in reviews. &lt;strong&gt;Code execution&lt;/strong&gt; for actually running submitted code and roasting the output. &lt;strong&gt;Google Search grounding&lt;/strong&gt; for finding real coding standards to parody. &lt;strong&gt;Nano Banana&lt;/strong&gt; for generating fake architecture violation diagrams. In each case, the simpler approach was funnier. The comedy works because Gemini inhabits a character and stays in character. Adding more features would have diluted that.&lt;/p&gt;

&lt;p&gt;The final count: 3 Gemini-powered endpoints, 8+ system prompts, structured JSON on every call, AI Studio for prototyping, Gemini CLI for testing, Cloud Run deployment configs in the repo, BYOK with an AI Studio link, and the entire thing running on the free tier. Every review, every appeal denial, every devastating metric explanation. That's all Google.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;No code was actually approved in the making of this application. Approval rate: 0.00%.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>418challenge</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Schedio – Highlight to Calendar in 5 Seconds</title>
      <dc:creator>Arqam Waheed</dc:creator>
      <pubDate>Sat, 14 Feb 2026 13:36:00 +0000</pubDate>
      <link>https://dev.to/arqamwd/schedio-highlight-to-calendar-in-5-seconds-18pi</link>
      <guid>https://dev.to/arqamwd/schedio-highlight-to-calendar-in-5-seconds-18pi</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/github-2026-01-21"&gt;GitHub Copilot CLI Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;I created a &lt;strong&gt;Google Chrome extension&lt;/strong&gt; that instantly turns any highlighted text on a webpage into a Google Calendar event - no tab switching, no copy-pasting, no friction.&lt;/p&gt;

&lt;p&gt;I came up with this idea because adding events to Google Calendar takes me far more time than it should, and I couldn't find a solution as convenient as I needed. I was sure that other people, including non-technical users, must be facing this issue too, so I decided to change that.&lt;/p&gt;

&lt;p&gt;Schedio was built &lt;strong&gt;almost entirely by GitHub Copilot CLI&lt;/strong&gt;, while I only handled setup tasks like OAuth, agent documentation, the product requirements and a little bit of manual debugging. By prompting Copilot effectively, I focused solely on the design aspects — almost no code had to be written manually, except for minor adjustments like time conversion fixes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With Schedio, you just have to:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Highlight a meeting time&lt;/li&gt;
&lt;li&gt;Right-click → &lt;strong&gt;"Create Event with Schedio"&lt;/strong&gt; (or use the keyboard shortcut)&lt;/li&gt;
&lt;li&gt;Review the pre-filled details in a sleek modal&lt;/li&gt;
&lt;li&gt;Click "Create Event" → the event lands in your Google Calendar instantly&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The project is currently &lt;strong&gt;under review for Chrome Web Store publication&lt;/strong&gt;, but the source is public on my &lt;a href="https://github.com/ArqamWaheed/schedio" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt;. Follow the README for setup instructions!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8w3lr5gapvb6y1y0mucv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8w3lr5gapvb6y1y0mucv.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I will update this post as soon as the review is done and link the Chrome extension for ease of access.&lt;/p&gt;




&lt;h2&gt;
  
  
  📹 Demo
&lt;/h2&gt;

&lt;p&gt;Here’s a &lt;strong&gt;live demo&lt;/strong&gt; of Schedio in action:&lt;/p&gt;

&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/pHS1S02qzhE"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

&lt;p&gt;At the time of writing, Schedio isn’t yet published on the Chrome Web Store, so setup requires following the instructions in my &lt;a href="https://github.com/ArqamWaheed/schedio" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt;. Once you’ve completed the setup, using Schedio is simple:&lt;/p&gt;

&lt;p&gt;1) &lt;strong&gt;Go to Schedio Options&lt;/strong&gt; and enter your &lt;strong&gt;Gemini API key&lt;/strong&gt; to enable AI parsing. There is a public shared API key, but it may be rate-limited, so it’s recommended to add your own — it’s free!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi3hsagztr6vhfijnpmuo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi3hsagztr6vhfijnpmuo.png" alt="Options page"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2) &lt;strong&gt;Highlight&lt;/strong&gt; text on any webpage containing event information.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkt8jsws0k05gj0jl1rbl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkt8jsws0k05gj0jl1rbl.png" alt="Highlight example"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;3) &lt;strong&gt;Right-click&lt;/strong&gt; and select &lt;strong&gt;"Create Event with Schedio"&lt;/strong&gt; (or use the keyboard shortcut &lt;u&gt;Alt+Shift+S&lt;/u&gt;). The shortcut can also be customized through the options page.&lt;/p&gt;

&lt;p&gt;4) A sleek &lt;strong&gt;modal pops up&lt;/strong&gt;, pre-filled with AI-parsed details like title, date, time, and location.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F37yklebtssclwtxv30ee.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F37yklebtssclwtxv30ee.png" alt="Modal example"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;5) &lt;strong&gt;Review&lt;/strong&gt; the details and click &lt;strong&gt;"Create Event"&lt;/strong&gt;. The event is added &lt;strong&gt;instantly to your Google Calendar&lt;/strong&gt;. You’ll only need to link your Google account via OAuth the first time — after that, creating events is seamless.&lt;/p&gt;




&lt;h2&gt;
  
  
  My Experience with GitHub Copilot CLI
&lt;/h2&gt;

&lt;p&gt;Building Schedio was my first time shipping a full Chrome extension, and it involved a lot more moving pieces than I expected. OAuth flows, Chrome extension permissions, background scripts, content script messaging, AI parsing, and Google Calendar integration all had to work together seamlessly.&lt;/p&gt;

&lt;p&gt;I used GitHub Copilot CLI to generate most of the implementation, but I did not treat it like autopilot. I defined the architecture, structured the prompts carefully, and reviewed everything it produced. When something broke, I debugged it myself.&lt;/p&gt;

&lt;p&gt;One issue that stood out was a silent failure when creating calendar events. The modal worked, the parsed data looked correct, but the event simply was not appearing in Google Calendar. There were no clear errors. After tracing logs across the background script and OAuth token flow, I realized the access token was expiring earlier than expected and the refresh logic was not being triggered properly. Copilot had scaffolded the initial OAuth integration, but I had to step in, inspect the token lifecycle, and restructure the flow so the token was validated before every API call. Once fixed, event creation became consistent and instant.&lt;/p&gt;
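&lt;p&gt;The "validate before every API call" idea can be sketched as below. This is not Schedio's actual code: &lt;code&gt;CachedToken&lt;/code&gt;, &lt;code&gt;isTokenUsable&lt;/code&gt;, &lt;code&gt;ensureValidToken&lt;/code&gt;, and the 60-second margin are all assumptions, and the refresh function is injected and synchronous only to keep the example short. In a real extension it would be asynchronous.&lt;/p&gt;

```typescript
// Hypothetical cached-token shape; expiresAtMs is an absolute epoch timestamp.
interface CachedToken {
  accessToken: string;
  expiresAtMs: number;
}

// Treat tokens as expired a little early so a request never starts with a
// token that dies mid-flight. The 60-second margin is an arbitrary choice.
const EXPIRY_MARGIN_MS = 60000;

function isTokenUsable(token: CachedToken | null, nowMs: number): token is CachedToken {
  if (token === null) return false;
  return token.expiresAtMs - EXPIRY_MARGIN_MS > nowMs;
}

// Call this before every Calendar request. fetchFreshToken stands in for
// whatever refresh mechanism the extension uses.
function ensureValidToken(
  cached: CachedToken | null,
  nowMs: number,
  fetchFreshToken: () => CachedToken
): CachedToken {
  return isTokenUsable(cached, nowMs) ? cached : fetchFreshToken();
}
```

&lt;p&gt;Checking on every call, instead of relying on a refresh timer, is what removes the silent-failure window: a stale token is replaced before the request is ever sent.&lt;/p&gt;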

&lt;p&gt;Another time, AI-parsed times were being converted incorrectly for users in different time zones. Instead of patching it blindly, I isolated the formatting logic, tested edge cases, and adjusted the conversion logic to normalize everything before sending it to Google Calendar.&lt;/p&gt;
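&lt;p&gt;One possible way to normalize times, shown here as a hedged sketch and not as the fix Schedio actually shipped, is to send the wall-clock time without a UTC offset and name the IANA time zone separately. The Google Calendar API accepts an offset-free &lt;code&gt;dateTime&lt;/code&gt; when &lt;code&gt;timeZone&lt;/code&gt; is set. &lt;code&gt;ParsedLocalTime&lt;/code&gt;, &lt;code&gt;pad2&lt;/code&gt;, and &lt;code&gt;toCalendarTime&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```typescript
// Wall-clock fields as the AI parser might return them (hypothetical shape).
interface ParsedLocalTime {
  year: number; // four-digit year assumed
  month: number; // 1-12
  day: number;
  hour: number; // 0-23
  minute: number;
}

// Mirrors the start/end object of a Google Calendar event resource.
interface CalendarTime {
  dateTime: string;
  timeZone: string;
}

// Left-pads a non-negative integer to two digits.
function pad2(n: number): string {
  return n > 9 ? String(n) : "0" + n;
}

// Emits the local wall-clock time with no UTC offset and passes the IANA
// zone through unchanged, so no offset arithmetic happens on the client.
function toCalendarTime(parsed: ParsedLocalTime, timeZone: string): CalendarTime {
  const date = parsed.year + "-" + pad2(parsed.month) + "-" + pad2(parsed.day);
  const time = pad2(parsed.hour) + ":" + pad2(parsed.minute) + ":00";
  return { dateTime: date + "T" + time, timeZone: timeZone };
}
```

&lt;p&gt;The zone name could come from the browser, for example via &lt;code&gt;Intl.DateTimeFormat().resolvedOptions().timeZone&lt;/code&gt;, which keeps daylight-saving rules on the server side.&lt;/p&gt;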

&lt;p&gt;Using Copilot CLI did not remove my responsibility for the result, but it helped me ship Schedio way faster than I ever could have before. I felt a lot more productive using Copilot.&lt;/p&gt;

&lt;h2&gt;
  
  
  Beyond Development
&lt;/h2&gt;

&lt;p&gt;The help didn’t stop at coding. Copilot made it possible to ship a complete product fast. I used it to generate branding ideas, logo prompts, privacy policy drafts, and even content for demo posts. Normally, figuring all that out would take hours of brainstorming and trial-and-error. Instead, I could feed suggestions into tools like Nano Banana, tweak them, and get polished results. In just a few days, I went from concept to a fully working, branded extension with marketing-ready copy.&lt;/p&gt;

&lt;p&gt;This approach didn’t just make development faster; it also let me release a polished, full-featured product on my first try while keeping the user experience smooth and seamless.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>githubchallenge</category>
      <category>cli</category>
      <category>githubcopilot</category>
    </item>
  </channel>
</rss>
