Juan Torchia
GreenGate: the agent proposes, the phone decides — building an Auth0 CIBA + Gemini audit for embodied carbon

DEV Weekend Challenge: Earth Day

This is a submission for Weekend Challenge: Earth Day Edition

Construction is responsible for roughly 37% of global energy-related CO₂ emissions (UNEP Global Status Report for Buildings). Most of that footprint is decided at the Bill of Materials stage — before anything gets poured, welded, or framed. Audit a BOM against peer-reviewed emission factors and you can cut 10-30% off a project before ground is broken. The math is a spreadsheet. The consequence of changing a material — the structural class, the fire rating, the supply chain, the local building code — is absolutely not a spreadsheet.

That shape — the agent does the work, the human takes the decision — has a name: CIBA, Client-Initiated Backchannel Authentication, the primitive Auth0 is pushing for its AI Agents product.

This is the weekend I spent building GreenGate: an embodied-carbon BOM auditor where an agent proposes lower-carbon substitutions with real cost deltas, and a site manager approves each one from their phone before the change is applied. Every approval carries a cryptographically-bound description of what's being approved. No "Allow login?" push that could mean anything.

I want to tell this story the way it actually happened — hand-tagged and honest — because the interesting part isn't the architecture diagram. It's the two walls I hit, the escape hatch that made the submission viable, and a polling bug that took me longer to find than it had any right to.


What I Built

GreenGate is a Next.js 16 App Router application that does three things in order:

  1. Ingests a BOM (Bill of Materials) for a construction project — materials, quantities, stocking units. One seeded project, "Casa 80 m² — Buenos Aires", 33 line items covering binders, aggregates, masonry, steel, wood, insulation, finishes, electrical, plumbing.
  2. Audits the BOM against an emission factor database (30 stocking-unit factors derived from ICE Database v3.0 — the public peer-reviewed UK dataset, used under educational license). The audit picks the top-5 hotspots that each contribute ≥5% of the project's total footprint, queries lower-carbon alternatives, then hands the hotspot list to Google Gemini 2.5 Flash with a typed schema and asks for specific substitutions.
  3. Requires human approval per substitution via Auth0 CIBA. The agent never changes a BOM row autonomously. For each suggestion ("replace Portland cement CEM I with GGBS-blended CEM III/A, save 180 kg CO₂e"), a push arrives on the site manager's phone through Auth0 Guardian. The push shows the binding message — the literal text describing the change — bound to the approval request. The user sees what they're approving. If they deny, the BOM doesn't change. If they approve, the suggestion flips to APPROVED and the Approval row gets persisted with the Auth0 auth_req_id as permanent audit trail.
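The hotspot rule from step 2 — top-5 line items that each contribute at least 5% of the project total — is simple enough to sketch. Names and shapes here are illustrative, not the repo's actual code:

```typescript
// Hypothetical sketch of the hotspot-selection rule described above.
interface BomLine {
  materialCode: string;
  totalKgCo2e: number;
}

function pickHotspots(lines: BomLine[], maxCount = 5, minShare = 0.05) {
  const total = lines.reduce((sum, l) => sum + l.totalKgCo2e, 0);
  return lines
    .filter((l) => total > 0 && l.totalKgCo2e / total >= minShare)
    .sort((a, b) => b.totalKgCo2e - a.totalKgCo2e)
    .slice(0, maxCount)
    .map((l) => ({ ...l, share: l.totalKgCo2e / total }));
}
```

The ≥5% floor matters: without it, a long tail of small items would pad the hotspot list and waste the Gemini call on substitutions that can't move the total.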

The prize-narrative claim: every GreenGate action that costs CO₂ or costs money needs two keys — the agent's and a human's. The Auth0 auth_req_id is what makes the audit trail reconstructable from the DB. Someone can ask "who approved replacing the cement on project X, when, and from which device," and the answer is in a single row.


Demo

Live app:
https://web-production-5dcc8.up.railway.app

Login requires Auth0 Guardian on your phone for the MFA-on-login step — the behaviour is identical to an agent approval, just for the session itself.

Demo (320s):

https://www.youtube.com/watch?v=C9zrwHqhMQc

Typical audit stream, copy-pasted from the live UI:

› Session OK · juan@…
› Loaded project "Casa 80 m² — Buenos Aires" · 33 BOM items
› Fetched 30 ICE emission factors
› Project total: 14,500 kg CO₂e · 5 hotspots → CONC-C25 (4,300 kg, 30%) …
› Asking gemini-2.5-flash for substitutions on 5 hotspots…
› Gemini returned 5 proposals
› Persisted 5 suggestions (audit run a3c12f)
✓ Audit complete.

Each of those five suggestions sits in the UI with a "Request approval" button. Clicking it triggers a push to my phone. The push says, for example: Sub CONC-C25 with CONC-GGBS30: save 1,290kg CO2e. I approve on-device. ~6 seconds later the UI flips to ✓ Approved and the Approvals panel on the page grows by one row, showing the auth_req_id, the decision, the timestamp, and the substitution details.

That's the whole product loop in about 20 seconds of video.


Code

Repo:

https://github.com/JuanTorchia/greengate

Stack in one table:

Layer | Choice
--- | ---
Framework | Next.js 16 (App Router, TypeScript strict)
UI | Tailwind v4 + shadcn/ui + Framer Motion
Operational DB | PostgreSQL via Prisma
Emission factors | ICE Database v3.0 (30 factors, JSON-loaded)
LLM | Google Gemini 2.5 Flash (structured output)
Auth | Auth0 Universal Login + CIBA
MFA / Approvals | Auth0 Guardian (push notifications)
Deploy | Railway (managed Postgres + Node runtime)

Folder layout is feature-first, not MVC. The audit, the agent, and the approvals belong to the same domain — they belong together:

src/
  features/
    audit/
      server/        # Server Actions + the audit orchestrator
      components/    # RunAuditButton, ApprovalButton, AnimatedKg
      schemas.ts     # Zod schemas for Gemini output + events
  lib/
    auth0/           # ciba.ts, Management Client helpers
    emissions/       # Mock ICE source + typed interface
    gemini/          # Client + versioned prompt
    prisma.ts
  app/
    app/             # The authenticated dashboard
    api/audit/       # SSE stream route

TypeScript is set to noUncheckedIndexedAccess. If I'm going to trust structured output from an LLM, I want the compiler to be paranoid about array indexing.
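A tiny illustration of what that flag buys you — hypothetical helper, not from the repo:

```typescript
// With noUncheckedIndexedAccess, indexing into an array yields
// `string | undefined`, so the compiler forces a guard before you
// trust the first element of an LLM-produced list.
function firstSuggestionCode(codes: string[]): string {
  const first = codes[0]; // type: string | undefined
  if (first === undefined) {
    throw new Error("LLM returned an empty suggestion list");
  }
  return first;
}
```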


How I Built It

The rest of this post is how the weekend actually went. Not a cleaned-up retrospective — the real ordering, including the walls I hit.

Friday night: the pitch

The category description for "Best Use of Auth0 for Agents" described the exact shape of a problem I'd been wanting to work on. Most agent pitches you see in 2026 are some variant of "chat with your X." Most of them either (a) let the agent take actions with no human gate, or (b) put a dumb "Are you sure?" modal in front of every tool call. Both are bad answers.

CIBA is a different answer. It decouples the agent's initiation of an action from the human's approval of that specific action on a device the human already trusts. The protocol bakes a binding_message into the authorization request, and the authenticator (Guardian, in Auth0's case) is supposed to show that message on the approval surface. If it works end-to-end, you get agents that can scale beyond a single keyboard without becoming runaway autonomous bots.

Scope I aggressively cut before writing a single line:

  • No multi-tenancy. One tenant, one user, seeded project.
  • No file upload. One BOM lives in the seed.
  • No manual BOM edits. The agent proposes, the human approves. That's the loop.
  • No "try the flow without Guardian" fallback. The phone-in-your-hand moment is the whole point; half-measures here would cost more narrative than they'd save in effort.

That aggressive cut is what made it possible to ship Phase 1 (Auth0 CIBA end-to-end) on its own, as a credible submission, before Gemini and the real data even showed up.

Saturday 00:10 — Priority flip: Auth0 first

My original draft plan attacked Snowflake → Gemini → Auth0. I flipped it. The reasoning was simple:

  • If Auth0 CIBA doesn't work, the submission doesn't qualify for its primary prize category.
  • If Gemini fails, I hand-craft one realistic substitution server-side and the roundtrip still demos.
  • If Snowflake fails, a mock ICE source with the same interface works. BOM math is deterministic.

So: the high-risk, prize-carrying piece goes first, while I'm alert. Snowflake eventually got cut to a mock entirely when I realised it would cost a day of its own and add nothing the prize cared about. I say so plainly in the README.

00:40 — Auth0 MCP instead of clicking

Auth0 ships an official MCP server (@auth0/auth0-mcp-server, beta). Which means the "thirty minutes of clicking in the dashboard" part — create tenant, create API, create M2M app, authorize the grant, add the scope — can largely be done by the agent itself. Installed the MCP, authorized it against the tenant with device code, approved a scope set covering clients, resource_servers, client_grants, actions, logs, and forms.

Tiny gotcha: the MCP's init command writes to the Claude Desktop config path, not the Claude Code one. Moved the server block manually, pinned it to the Node 24 binaries I had installed.

Meta-observation I'm deliberately not losing: the project is about agent-with-human-in-the-loop, and I used an agent to set up the authentication infrastructure for the agent I was building. Turtles all the way down, but in a useful way. The tenant-setup work is deterministic config-plumbing that doesn't need a human decision. The runtime decisions (approve this material substitution, or don't) absolutely do. Two different shapes of agent work in one project, and the design makes the difference explicit.

Still on me, physically: install Guardian on my phone, enroll the user. That enrollment is out of scope for any MCP, and that's the right design.

01:00 — The wall

First sign of trouble. I asked the MCP to flip the M2M client's grant_types to include urn:openid:params:grant-type:ciba. Auth0 answered with a wall:

Upgrade your subscription to enable Client Initiated Backchannel Authentication.

Pricing pages confirmed it: CIBA on classic Auth0 is Enterprise, custom-priced, which third-party trackers peg at roughly $30k/year. The 22-day Enterprise trial was the obvious move — except my Auth0 account was years old and the trial was long burned. No promo code, no hackathon code, no path through.

I sat with this for a minute. My plan named MFA OOB Push as the documented fallback: same Auth0 tenant, same Guardian app, different endpoint (/mfa/challenge + grant mfa-oob). Still human-in-the-loop. Still agent-initiated. Still an Auth0 prize story. Just not the canonical CIBA one.

The prize category literally says "Auth0 for Agents" and the canonical Auth0-for-Agents primitive is CIBA, not MFA OOB. Pivoting was the safe call. The narrative cost was real.

01:15 — The escape hatch

Before committing to the pivot I did one more pass through the docs. This paragraph, from Auth0's March 2026 GA announcement for Auth0 for AI Agents:

"Our free tier includes two connected apps in Token Vault, async authorization, and all the core features you need to start building."

Async authorization is CIBA. So the free tier of the AI Agents product includes the exact feature the free tier of classic Auth0 explicitly excludes. Same company, same protocol, different product packaging. Not stated anywhere on the main pricing page.

The catch: my existing tenant had been provisioned through the regular onboarding flow years ago. It didn't have the AI Agents flag. The fix was to spin up a new tenant from inside the existing account — which is free.

If you're reading this and hit the same wall: create a new tenant via the dashboard, not via the MCP or the signup page. The new tenant gets the AI Agents-compatible plan state by default. Thirty seconds of clicking saved me the pivot.

01:30 — Tenant migration

Created greengate.us.auth0.com from the dashboard. Re-ran the MCP init flow — first attempt picked up the old tenant anyway, because the device-code flow follows the active browser session. Logout + fresh browser pointed at manage.auth0.com/dashboard/us/greengate, re-init, and the MCP locked onto the new tenant.

Recreated everything in the new tenant via MCP, in parallel:

  • API https://api.greengate.local with a single scope audit:propose
  • Regular Web App GreenGate Web for Universal Login
  • M2M GreenGate CIBA Client for the agent
  • Application grant linking the M2M to the API

01:40 — A different error

Re-tried the grant_types flip. Different error — which was the good news:

Application must have at least one channel configured in async_approval_notification_channels when CIBA is enabled.

Config error, not a plan error. CIBA was available; I just needed to declare a channel. The MCP server is beta and doesn't expose that field, so I dropped to the dashboard: Application → Advanced Settings → Grant Types → enable CIBA → Save → a new "Client Initiated Backchannel Authentication" section appears below, where you select guardian-push as the channel.

Worth flagging for anyone reproducing this: in the new tenant, the MFA factor tiles are all labelled "ENTERPRISE MFA" or "PRO MFA" in the dashboard — including Guardian Push. The labels appear to be inherited from the classic Auth0 packaging and aren't accurate for the AI Agents product. The factor enables and works on Free anyway. Labels lie; behaviour doesn't.

01:55 — M2: the round trip (the risk-killer)

The single unknown in the entire CIBA flow: would Auth0 Guardian actually render the binding_message on the push? The OpenID CIBA spec (Client-Initiated Backchannel Authentication Flow, Core 1.0) says yes, but Guardian hadn't historically advertised it as a first-class feature. I'd budgeted two attempts before flipping to MFA OOB. This had to work.

Bare POST /bc-authorize from curl:

curl -X POST "https://greengate.us.auth0.com/bc-authorize" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  --data-urlencode "client_id=${AUTH0_CIBA_CLIENT_ID}" \
  --data-urlencode "client_secret=${AUTH0_CIBA_CLIENT_SECRET}" \
  --data-urlencode "audience=https://api.greengate.local" \
  --data-urlencode "scope=openid audit:propose" \
  --data-urlencode "binding_message=PIZZA HAWAIANA 42 BANANA TRACTOR" \
  --data-urlencode 'login_hint={"format":"iss_sub","iss":"https://greengate.us.auth0.com/","sub":"auth0|..."}'

(That binding message is absurd on purpose. If it rendered literally on the phone, there was no ambiguity about whether I was reading my own text back.)

First attempt rejected: binding_message can only contain alphanumerics, whitespace, and +-_.,:#. My test string had parentheses. Fixed, re-sent. Response:

{
  "auth_req_id": "atx_3tnNo46Qa22basx5mnPPcQE3bWoT",
  "expires_in": 300,
  "interval": 5
}

Phone buzzed. Guardian notification, unlocked screen, approved on the device.

Polled /oauth/token with grant_type=urn:openid:params:grant-type:ciba and the auth_req_id. Got back a real RS256 access token. sub = the user. aud = the API. scope = openid audit:propose. End-to-end roundtrip in under a minute.

The second push — the one sent to actually answer the Guardian question — used the full PIZZA HAWAIANA 42 BANANA TRACTOR binding. The literal text rendered on the Guardian approval screen.

That's the prize narrative landing intact. The text describing the action that the user is approving is shown on the approval surface itself, cryptographically bound to the request the agent initiated. The user knows they are approving a cement substitution, not "a login." That difference is the product.

M2 done. The biggest risk in the project was closed at hour five.

02:25 — M3: @auth0/nextjs-auth0 v4

The scaffold pinned @auth0/nextjs-auth0@^4.0.0, so v4 it was. v4 changed the env vars from v3: out goes AUTH0_ISSUER_BASE_URL, in comes AUTH0_DOMAIN (just the hostname). Out goes AUTH0_BASE_URL, in comes APP_BASE_URL. Confirmed this by grepping the installed dist/ for the actual constant names — the docs floating around online still mix the two and you can't trust them.

The wireup in v4 is genuinely tiny. Three files:

// src/lib/auth0.ts
import { Auth0Client } from "@auth0/nextjs-auth0/server";
export const auth0 = new Auth0Client();
// src/middleware.ts
import type { NextRequest } from "next/server";
import { auth0 } from "@/lib/auth0";

export async function middleware(request: NextRequest) {
  return await auth0.middleware(request);
}

export const config = {
  matcher: ["/((?!_next/static|_next/image|favicon.ico).*)"],
};
// src/app/page.tsx (excerpt)
import { auth0 } from "@/lib/auth0";
export default async function Home() {
  const session = await auth0.getSession();
  return session ? <Dashboard user={session.user} /> : <LandingPage />;
}

One trip-up worth flagging: I dropped middleware.ts at the project root on instinct. With src/ layout on Next.js 16, middleware has to live at src/middleware.ts or it's silently ignored. Cost me one round of "why is /auth/login 404ing" before I moved it.

Policy config in Auth0 was set to MFA Always. So the login flow itself becomes: Universal Login → password → MFA push → approve on phone → session. Same Guardian app, same device, two distinct semantic events on the same trust path: "logging in" and later "approving the agent's proposal." That's the shape I wanted for the demo.

02:35 — Railway Postgres, the long way

I wanted a managed Postgres from day one so the same DB would serve dev and the eventual deploy. Two CLI walls:

  1. railway add --database postgres returned Unauthorized. Please run railway login again. Even after a fresh login. Same error on railway deploy --template postgres. Some specific permission for managed-template deployment is broken on this account.
  2. Fell back to railway add --service postgres --image postgres:16, which worked and deployed a raw Postgres container. But the raw image only exposes postgres.railway.internal — no public TCP proxy. Can't connect from local Prisma. Adding a public TCP proxy is a dashboard-only operation, not a CLI one.

Ended up asking for one click in the dashboard ("+ New → Database → Postgres") which provisions the managed template with DATABASE_PUBLIC_URL included. Pulled it via railway variable list --service Postgres --json, dropped it in .env. prisma migrate dev --name init + npm run db:seed succeeded on the first try. 30 materials, 33 BOM items, one Project.

Lesson worth flagging: the Railway CLI is not enough for first-time managed-database provisioning on every account state. Document the click-path honestly in your README.

02:50 — M4 + M5: propose, poll, persist

src/lib/auth0/ciba.ts got real TypeScript implementations of initiateCiba and pollCibaToken. Same HTTP endpoints I'd proved with curl in M2; nothing new, just typed. Shapes:

export interface CibaInitiateInput {
  loginHint: string;        // {"format":"iss_sub","iss":"<tenant>","sub":"<user>"}
  bindingMessage: string;   // Sanitized: alphanumeric + ` +-_.,:#`
  scope: string;            // e.g. "openid audit:propose"
}

export type CibaPollStatus =
  | { status: "pending"; slowDown?: boolean; retryAfterSeconds?: number }
  | { status: "approved"; accessToken: string; idToken?: string; rawResponse: unknown }
  | { status: "denied"; error: string }
  | { status: "expired" };
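For reference, here's a sketch of building the /bc-authorize form body against those shapes. Credential handling and the fetch itself are elided, and the helper name is mine, not the repo's:

```typescript
// Hypothetical helper: assemble the x-www-form-urlencoded body for
// POST /bc-authorize from a CibaInitiateInput plus client credentials.
interface CibaInitiateInput {
  loginHint: string;      // {"format":"iss_sub","iss":"<tenant>","sub":"<user>"}
  bindingMessage: string; // pre-sanitized to the Auth0 charset
  scope: string;          // e.g. "openid audit:propose"
}

function buildBcAuthorizeBody(
  input: CibaInitiateInput,
  clientId: string,
  clientSecret: string,
  audience: string,
): URLSearchParams {
  return new URLSearchParams({
    client_id: clientId,
    client_secret: clientSecret,
    audience,
    scope: input.scope,
    binding_message: input.bindingMessage,
    login_hint: input.loginHint,
  });
}
```

URLSearchParams handles the form-encoding, including the JSON login_hint — the same request the curl in M2 made by hand.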

src/features/audit/server/dummy-proposal.ts exposed two server actions:

  • proposeDummyChange() — upserts the user from the session, picks the seeded project, picks the first BOM item, writes a stub AuditRun + Suggestion, calls initiateCiba with a fixed binding ("Replace cement CEM I with GGBS, save 180 kg CO2e"), returns { authReqId, suggestionId }.
  • checkApproval(authReqId, suggestionId) — a single poll; on a terminal status, persists the Approval row and flips the suggestion.

UI was a client component ProposalButton driving a tiny state machine: idle → initiating → polling → (approved | denied | expired | error). Home page shows the button when signed in plus a list of recent approvals from the DB — so a judge can see the audit trail without opening Prisma Studio.

And it worked. For like three minutes.

02:58 — The slow_down trap

Click, push, approve on phone, UI sat at "Waiting for approval…" forever.

tail'd the dev server log. checkApproval was running every 3 seconds. Every poll was returning HTTP 200. None of them advanced past pending. Hit the token endpoint manually with the same auth_req_id, and Auth0 gave me a response my own code had never surfaced:

{
  "error": "slow_down",
  "error_description": "You are polling faster than allowed. Try again in 140 seconds."
}
Enter fullscreen mode Exit fullscreen mode

My pollCibaToken was mapping both authorization_pending and slow_down into { status: "pending" }. The client loop was banging the token endpoint at 3-second intervals. Every poll inside the cooldown re-extends the rate limit, so we never escape. The approval Auth0 already has on file never surfaces. User-visible symptom: "I clicked Allow on my phone, why is the page still spinning?"

The CIBA spec actually mandates this back-off behaviour — the same slow_down semantics the device authorization grant uses in RFC 8628. I'd cut it on the first pass for speed and paid for it.

Fix: surface slow_down as a distinct state, parse the cooldown out of error_description (Auth0 puts it in plain English), and back off to whatever Auth0 asks for with a 6-second floor (default interval from /bc-authorize is 5s, so 6 leaves headroom):

if (json.error === "slow_down") {
  // Optional chaining keeps this compiling under noUncheckedIndexedAccess
  const seconds = /\d+/.exec(String(json.error_description ?? ""))?.[0];
  const retry = seconds ? parseInt(seconds, 10) : 10;
  return { status: "pending", slowDown: true, retryAfterSeconds: retry };
}

Client reads retryAfterSeconds and schedules the next poll accordingly. Re-tested: click → approve on phone → ~6 seconds later the UI flips to ✓ Approved. The persisted Approval row shows up in the list with auth_req_id, decision, timestamp, and the suggestion details.
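The client-side scheduling can be sketched as a loop with the poller and the sleep injected — illustrative, not the repo's actual component code:

```typescript
// Hypothetical client loop: poll until a terminal status, honouring
// retryAfterSeconds from slow_down responses with a 6-second floor.
type PollResult =
  | { status: "pending"; retryAfterSeconds?: number }
  | { status: "approved" | "denied" | "expired" };

async function pollUntilDone(
  poll: () => Promise<PollResult>,
  sleep: (ms: number) => Promise<void>,
  floorSeconds = 6,
): Promise<PollResult["status"]> {
  for (;;) {
    const result = await poll();
    if (result.status !== "pending") return result.status;
    // Back off to whatever Auth0 asked for, never below the floor.
    const waitSeconds = Math.max(floorSeconds, result.retryAfterSeconds ?? 0);
    await sleep(waitSeconds * 1000);
  }
}
```

Injecting `sleep` keeps the loop testable; in the real component it's a `setTimeout` wrapper.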

Phase 1 done. Auth0 CIBA end-to-end. That's the prize story ready to ship on its own, before anything else.

Sunday 10:30 — Deploy, before the last hour

My plan had a rule: "Railway deploy issues surface late → mitigation: deploy during Phase 1 once login works." I didn't wait. App went up before Phase 2.

  • Added an empty service web alongside the existing Postgres service.
  • Set env vars via CLI, using Railway variable references so the wiring survives reprovisioning: DATABASE_URL=${{Postgres.DATABASE_URL}} (resolves to postgres.railway.internal:5432, no proxy hop) and APP_BASE_URL=https://${{RAILWAY_PUBLIC_DOMAIN}}.

Caveat worth knowing: RAILWAY_PUBLIC_DOMAIN resolves to empty until a domain exists on the service. APP_BASE_URL came out as "https://" until I generated the domain and re-set the var explicitly. Tiny thing, easy to miss.

  • Generated public domain: web-production-5dcc8.up.railway.app.
  • Updated railway.toml from NIXPACKS to RAILPACK (current Railway guidance; Nixpacks is legacy). buildCommand = "npm ci && npm run db:generate && npm run build". startCommand = "npm run db:deploy && npm start". db:deploy runs prisma migrate deploy idempotently on every boot.
  • Added the prod callback URL to the Auth0 Regular Web App via MCP — literally a one-command "add this to this app's allowlist" — so the moment the build finished, production auth worked without a dashboard trip.
  • railway up --service web --detach → built clean on first try. GET / returned 200, /auth/login 307'd to Auth0 with the right prod redirect_uri. Full day of slack left for Phase 2.

11:00 — Phase 2: real data + Gemini

Emissions source first. The scaffold's mock-data.json was empty. Project rule: no fabricated factors. ICE Database or nothing. I populated it with cradle-to-gate (A1-A3) values from ICE Database v3.0 (Circular Ecology, free for educational use). 30 stocking-unit factors covering binders, aggregates, concrete, masonry, steel, wood, roofing, insulation, finishes, electrical, plumbing, openings — plus a deliberately-included set of lower-carbon alternatives: GGBS-blended concrete, AAC blocks, EAF-recycled rebar, cellulose insulation, FSC-certified wood windows, reclaimed-wood doors, recycled porcelain tile. Per-kg ICE values converted to per-stocking-unit using typical product densities, with a __note in the JSON that explicitly documents the conversion so the data is honest about its own construction.
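The per-kg-to-per-stocking-unit conversion is trivial multiplication; the point is making it explicit in the data rather than hiding it. A sketch with made-up numbers — these are not ICE values:

```typescript
// ICE factors are kg CO2e per kg of material, but the BOM counts
// stocking units (bags, m3, sheets). The mass figure is the assumption
// that the JSON __note documents.
function perUnitFactor(kgCo2ePerKg: number, kgPerStockingUnit: number): number {
  return kgCo2ePerKg * kgPerStockingUnit;
}

// e.g. a hypothetical 50 kg cement bag at 0.9 kg CO2e per kg:
const bagFactor = perUnitFactor(0.9, 50); // 45 kg CO2e per bag
```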

createMockEmissionsSource() parses the JSON through Zod at load, caches, serves queryByMaterialCodes() and queryLowerCarbonAlternatives() against an interface that's ready to be swapped for Snowflake the day someone wants to pay for the ICE CSV upload.

Gemini client second. @google/generative-ai v0.24, responseMimeType: "application/json", schema described in the prompt, validated with Zod on receipt, single retry that feeds the validation errors back into the next prompt. That two-call pattern keeps the retry useful instead of just spinning.

Then the audit orchestrator. streamAudit() (an async generator — we'll get to why in a second) loads the BOM, applies factors, computes per-item totals, picks top-5 hotspots that each contribute ≥5% of project total, queries up to 4 lower-carbon alternatives per hotspot category, calls Gemini, validates, persists AuditRun + Suggestion[] in PENDING. Each suggestion is then individually approvable via CIBA — same initiateCiba / pollCibaToken as M2 — but now the binding message is built from the specific suggestion at approval time:

const binding =
  `Sub ${oldCode} with ${newCode}: save ${savings}kg CO2e`
    .replace(/[^a-zA-Z0-9 +\-_.,:#]/g, " ")   // Auth0 charset
    .slice(0, 64);                              // Auth0 max

The message that shows up on the phone is the actual change, sanitized to the Auth0 charset and capped at 64 characters. Not a token ID, not "Approve #42," not "the agent wants to do something." The literal diff.

11:30 — Two Gemini model gotchas

First test: gemini-2.0-flash. Got back HTTP 429, limit 0. Not "you've exceeded the free tier" — literally the project has zero free quota assigned for this model. Switched to gemini-1.5-flash. Got HTTP 404: "models/gemini-1.5-flash is not found for API version v1beta." The 1.5 series is gone from v1beta endpoints.

Current 2026 free tier is gemini-2.5-flash (Pro is paid since April 2026; 2.0 has zero quota by default on new keys; 1.5 is removed from v1beta). Updated .env, .env.example, and the Railway prod variable. First call worked.

Flag for anyone hitting this: the "current" Gemini docs don't always agree with what your API key can actually call. Always start with the smallest model (gemini-2.5-flash or -flash-lite) on a fresh key, and treat cross-model migration as a real, possibly-breaking step — not a one-line config change.

11:45 — Streaming, because silence is broken UX

First version of runAudit() was a single server action that took 5–15 seconds and returned at the end. The user stared at a "Running audit…" spinner with zero signal about whether anything was actually happening. That's the worst thing AI UX is allowed to do in 2026.

Refactored into an async generator streamAudit() that yields a tagged event for each pipeline step:

export type AuditEvent =
  | { event: "session-checked"; userEmail: string }
  | { event: "project-loaded"; projectName: string; bomItemCount: number }
  | { event: "factors-loaded"; factorCount: number; missingCodeCount: number }
  | {
      event: "hotspots-computed";
      totalKgCo2e: number;
      hotspotCount: number;
      hotspotPreview: { materialCode: string; totalKgCo2e: number; share: number }[];
    }
  | { event: "alternatives-loaded"; alternativeCount: number }
  | { event: "gemini-asking"; model: string; hotspotCount: number }
  | { event: "gemini-responded"; rawSuggestionCount: number }
  | { event: "persisted"; suggestionCount: number; auditRunId: string }
  | { event: "complete"; auditRunId: string; totalKgCo2e: number; hotspotCount: number; suggestionCount: number }
  | { event: "error"; message: string };

A POST /api/audit route handler wraps the generator in a ReadableStream and emits server-sent events. The client component fetches the route, reads with a TextDecoder, splits on \n\n, parses each data: line, and appends a coloured transcript line to the page.
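That split-and-parse step can be sketched as a pure function — illustrative; the real component's buffer handling may differ:

```typescript
// Hypothetical SSE chunk parser: split the buffered text on blank lines,
// pull each `data:` payload, and hand back the incomplete tail for the
// next read.
function parseSseChunk(buffer: string): { events: unknown[]; rest: string } {
  const parts = buffer.split("\n\n");
  const rest = parts.pop() ?? ""; // possibly incomplete trailing event
  const events = parts
    .map((block) =>
      block
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).trim())
        .join(""),
    )
    .filter((payload) => payload.length > 0)
    .map((payload) => JSON.parse(payload));
  return { events, rest };
}
```

Keeping the tail and re-prepending it to the next chunk is the part that's easy to forget: a `data:` line can be cut mid-JSON at any read boundary.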

Same business logic as the single-shot version. Same database writes. Same Gemini call. Dramatically better perceived latency, and — this is the part I care about — the user knows what the agent is doing. "Asking gemini-2.5-flash" isn't a black-box spinner. It's "here is the specific step that's taking the time, and here is the model name so you can check if I'm asking the right one."

That transparency matters doubly in a human-in-the-loop product. If the user's going to approve decisions the agent proposes, they need to see the agent working at every step. Not just at the approval moment.


Prize Categories

Best Use of Auth0 for Agents (primary).

CIBA is the centrepiece of the product, not a login bolt-on.

  • Every Suggestion the agent generates is gated by a distinct CIBA request. /bc-authorize is called per-suggestion with a binding_message that describes the specific substitution, sanitized to the Auth0 charset and capped at 64 chars.
  • The binding_message shows up literally on the phone's Guardian approval surface — I verified this end-to-end with "PIZZA HAWAIANA 42 BANANA TRACTOR" and again with real substitution text. The user knows what they're approving.
  • Approval.authReqId is a @unique non-null column. Every approved or denied suggestion has a corresponding auth_req_id that lets anyone reconstruct who approved what, when, from which authenticator.
  • The Auth0 Management API is accessed through the official MCP server for the infrastructure setup — itself an AI-for-agents scenario — demonstrating the second shape of "agent work": deterministic config plumbing that doesn't need human approval, alongside the runtime decisions that do.
  • Same Auth0 tenant and same Guardian device serve both "login MFA" and "agent action approval." Two distinct semantic events on one trust path; the session gate and the action gate share an authenticator without collapsing into a single prompt.

Best Use of Google Gemini (secondary).

  • Gemini 2.5 Flash produces the substitution proposals.
  • Structured JSON output (responseMimeType: "application/json") with the schema embedded in the prompt, validated server-side with Zod.
  • Single-retry pattern: if the first response fails validation, the validation errors are fed back to the model in the retry prompt. This is specifically to handle edge cases where the model returns syntactically-valid-but-semantically-wrong output (e.g., a reasoning field longer than 200 chars, or a savings number that's suspiciously large).
  • The prompt itself is versioned in src/lib/gemini/prompts/audit-suggestion.ts — first-class code, not a magic string floating somewhere in a server action.

Snowflake (noted, not submitted).

Originally in scope, downgraded to a mock EmissionsSource with the same interface to keep Phase 1 shippable on its own. The mock is loaded from ICE Database v3.0 values via a JSON file, parsed through Zod at load time, and served through the same interface a Snowflake client would implement. EMISSIONS_SOURCE=snowflake would be the flag to swap it in when the CSV gets uploaded to a warehouse. Honest in the README about this — no pretending there's a Snowflake integration behind the curtain.


What's Deliberately Not in the Demo

Because it reads better to be explicit about this than to let a judge discover it:

  • No multi-tenant. One Auth0 tenant, one user, one seeded project. The architecture supports multi-tenant (the userId on Project and AuditRun is already there), but wiring the UI around user-scoped resources was cut from this weekend.
  • No BOM upload UI. The seeded BOM is fine for a demo. The BOM format is a straightforward Prisma model (BomItem { material, quantity, unit }) and CSV import is a two-hour follow-up, not a blocker.
  • No real Snowflake. See above. The swap is a single file replacement.
  • No retro-auditing of approved changes. The Approval row is persisted; there's no UI yet to browse historical decisions. /admin/approvals with a filtered table is the obvious next step.
  • No email alerts on denial. If a site manager denies a substitution, the Suggestion goes DENIED in the DB but nobody gets an email. Trivial to add via Postmark/Resend.

The build prioritised ship-ability of the prize-carrying flow over breadth. That's the trade I made and I'd make it again.


Takeaways

  • CIBA is the right shape for agents with real-world consequences. Not "generic human-in-the-loop middleware," not "are you sure?" modals — cryptographically-bound human approvals of specifically-described actions. The binding message is the product.
  • The classic Auth0 pricing page hides CIBA behind Enterprise. The Auth0 for AI Agents product puts it in the free tier. If you're starting a new project that needs CIBA, create a new tenant; don't fight the old one. Thirty seconds of clicks, full canonical prize-narrative unlocked.
  • Publish the agent's progress as a stream, not a spinner. For any AI UX that takes more than two seconds, a running transcript beats a black box every single time. ReadableStream + generators + tagged events is the simplest wiring that delivers it.
  • Documenting your scope cuts is better than faking breadth. ICE mock instead of Snowflake made Phase 1 shippable. I said so in the README. No one has been penalised for being honest in a write-up.

Every action in GreenGate that changes real-world material — and therefore real-world CO₂ — needs two keys. The agent's, and yours. On your phone. With the actual change written on the approval screen, signed into the request.

That's what shipping agents look like when you take the consequences seriously.


Built over one weekend for the DEV.to Earth Day Weekend Challenge, by a full-stack dev from Argentina who has opinions about buildings and cement.
