<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: L. Cordero</title>
    <description>The latest articles on DEV Community by L. Cordero (@earlgreyhot1701d).</description>
    <link>https://dev.to/earlgreyhot1701d</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3683045%2F745698c0-b6f4-42ea-96e9-44a671fa69e0.png</url>
      <title>DEV Community: L. Cordero</title>
      <link>https://dev.to/earlgreyhot1701d</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/earlgreyhot1701d"/>
    <language>en</language>
    <item>
      <title>Can retrieval agents like ChatGPT and Perplexity read your website? Agentis Lux sees what they see.</title>
      <dc:creator>L. Cordero</dc:creator>
      <pubDate>Sun, 28 Jun 2026 21:13:08 +0000</pubDate>
      <link>https://dev.to/earlgreyhot1701d/can-retrieval-agents-like-chatgpt-and-perplexity-read-your-website-agentis-lux-sees-what-they-see-5cac</link>
      <guid>https://dev.to/earlgreyhot1701d/can-retrieval-agents-like-chatgpt-and-perplexity-read-your-website-agentis-lux-sees-what-they-see-5cac</guid>
      <description>&lt;p&gt;&lt;em&gt;I created &lt;a href="https://agentislux.io" rel="noopener noreferrer"&gt;Agentis Lux&lt;/a&gt; for the purposes of entering &lt;a href="https://h01.devpost.com/" rel="noopener noreferrer"&gt;H0 Hackathon&lt;/a&gt; (Vercel + AWS Databases). #H0Hackathon&lt;/em&gt; &lt;a href="https://devpost.com/software/agentis-lux-for-your-second-audience" rel="noopener noreferrer"&gt;See Agentis Lux's Devpost.com entry&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/bv56_XB1E_c"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;It started with a comment at a hackathon.&lt;/p&gt;

&lt;p&gt;A you.com employee said the thing out loud: the web has a second audience now. When you ask ChatGPT or Perplexity a question, a retrieval agent fetches a page and reads its HTML to answer you. Not the laid-out site with the buttons and the hero image. The markup underneath. These agents arrive by the million, and many of them rely on the raw or minimally rendered HTML rather than running your JavaScript, so they often see far less of your page than a person does. &lt;/p&gt;

&lt;p&gt;That comment sent me to build. My first answer to it was &lt;a href="https://github.com/earlgreyhot1701D/hermes-clew" rel="noopener noreferrer"&gt;Hermes Clew&lt;/a&gt;, for the GitLab Duo Agent Platform Challenge. Hermes lived inside GitLab Duo Chat, no frontend, no database: a Python engine that scanned the HTML, JSX, and TSX files in a repo, scored them across six categories, and let an LLM reason over the findings. It proved the core idea. It also told developers how to fix things, lived inside one vendor's chat, and only worked on files in a repo.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://agentislux.io" rel="noopener noreferrer"&gt;Agentis Lux&lt;/a&gt; is what happened when I took that idea to the open web and rebuilt it with a different stance. Any live URL, not just repo files. Its own product on a real cloud architecture, not a chat window. And no fix suggestions, on purpose, where Hermes used to hand them out. Same six-category bones, a new body, a sharper philosophy. It scans your site and shows you what that second audience experiences when it tries to read it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it does
&lt;/h2&gt;

&lt;p&gt;You paste a URL into &lt;a href="https://agentislux.io" rel="noopener noreferrer"&gt;Agentis Lux&lt;/a&gt;. You get a report. The report is written from the agent's point of view.&lt;/p&gt;

&lt;p&gt;Not "this is broken." More like: "an agent landing on this page can't tell which element starts checkout, because it's a styled div and not a button."&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fp25fo9lbju0iq1ps6q17.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fp25fo9lbju0iq1ps6q17.png" alt=" " width="800" height="855"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It reports findings. It does not suggest fixes, and that is on purpose. I know what the agent sees, not what you should change. That is the whole value: visibility, and you decide what to do with it. Awareness, not judgment.&lt;/p&gt;

&lt;p&gt;Six deterministic checks score the frontend out of 100: semantic HTML, form accessibility, ARIA, structured data, content in the HTML, and link and navigation. A parallel set of six API checks runs on the backend.&lt;/p&gt;
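
&lt;p&gt;To make "deterministic" concrete, here is a minimal Python sketch of one pattern-matching check, in the spirit of the styled-div example above. It is not the Agentis Lux engine. The class name &lt;code&gt;ClickableDivCheck&lt;/code&gt;, the function &lt;code&gt;score_page&lt;/code&gt;, the 10-point penalty, and the 100-point starting score are all assumptions made for illustration.&lt;/p&gt;

```python
from html.parser import HTMLParser


class ClickableDivCheck(HTMLParser):
    """Hypothetical check: records div/span elements that carry an onclick handler."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        names = {name for name, _value in attrs}
        if tag in ("div", "span") and "onclick" in names:
            line, _col = self.getpos()
            self.findings.append(f"{tag} with onclick at line {line}")


def score_page(html, penalty=10):
    """Same input, same score: no model call and no randomness involved."""
    check = ClickableDivCheck()
    check.feed(html)
    check.close()
    score = max(0, 100 - penalty * len(check.findings))
    return score, check.findings
```

&lt;p&gt;Under these assumed weights, a page with one &lt;code&gt;&amp;lt;div onclick="buy()"&amp;gt;&lt;/code&gt; loses 10 points, and scanning the same markup twice returns the same tuple both times.&lt;/p&gt;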

&lt;h2&gt;
  
  
  The one idea the architecture is built on
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/earlgreyhot1701D/perseus-clew/blob/main/docs/ARCHITECTURE.md" rel="noopener noreferrer"&gt;Structure is deterministic. Flavor is AI.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The checks and the scoring are pattern matching. No model touches the number. Same input, same score, every time. I only spend AI in two places where a regex can't help: a Bedrock call writes the one-line plain-language verdict, and a second Bedrock layer runs an agent simulation, reasoning about what a retrieval agent would experience on the page and reporting what it could and could not accomplish. Not an autonomous agent clicking around. A simulation of the experience.&lt;/p&gt;

&lt;p&gt;Vercel runs the entire frontend and the edge layer. The Next.js App Router app deploys to Vercel with the /api/scan route as a serverless proxy in front of the AWS backend, so the browser never talks to Lambda directly. Preview deployments on every push meant I could see each change live before it merged, which is most of how a solo builder keeps quality up without a QA team. The custom domain, HTTPS, and CDN were Vercel defaults I didn't have to think about, which kept my attention on the scan engine.&lt;/p&gt;

&lt;p&gt;The AI is constrained, not creative. Low temperature, capped tokens, and a system prompt that encodes the product's own rules: no fixes, no judgment words, no em dashes. The simulation returns structured JSON, and any finding it references is filtered against the deterministic findings, so the model can't invent something the math didn't catch. If it fails validation, it falls back to a template. Math for trust, and the AI is fenced into exactly the two jobs where judgment helps.&lt;/p&gt;
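
&lt;p&gt;The filtering step can be sketched in a few lines of Python. This is an illustration under assumptions, not the project's code: the JSON field names &lt;code&gt;summary&lt;/code&gt; and &lt;code&gt;findings&lt;/code&gt;, the string finding ids, and the function name &lt;code&gt;filter_simulation&lt;/code&gt; are all hypothetical.&lt;/p&gt;

```python
import json


def filter_simulation(raw_json, deterministic_ids):
    """Keep only model-referenced findings that the deterministic checks produced.

    Returns None when the payload does not validate, so the caller can
    fall back to a template instead of showing model output.
    """
    try:
        payload = json.loads(raw_json)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    referenced = payload.get("findings")
    summary = payload.get("summary")
    if not isinstance(referenced, list) or not isinstance(summary, str):
        return None
    allowed = set(deterministic_ids)
    # Anything the model mentions that the checks did not find is dropped.
    kept = [f for f in referenced if isinstance(f, str) and f in allowed]
    return {"summary": summary, "findings": kept}
```

&lt;p&gt;The point of the design is that the allow-list comes from the deterministic pass, so a finding id the model makes up never reaches the report.&lt;/p&gt;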

&lt;p&gt;Math stays math, so you can trust the number. Language and judgment are where AI earns its place.&lt;/p&gt;

&lt;p&gt;This sounds like a philosophy choice. It ended up being an economics choice that fell out of the architecture. The deterministic core runs at any scale for almost nothing, so the free tier can stay free. I only pay for model tokens on the sentence and the simulation, the two places a human reads. I didn't design that in a spreadsheet. It just dropped out of keeping the math and the AI in separate boxes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why DynamoDB, and how I used it
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fh7q1ev5rlj7nvfx4kisa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fh7q1ev5rlj7nvfx4kisa.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The hackathon stack is Vercel on the frontend and AWS on the back, with DynamoDB as the data layer. I wanted to use DynamoDB as a deliberate data model, not a key-value afterthought, because every access pattern in this product is a single key lookup. That is exactly what it is built for.&lt;/p&gt;

&lt;p&gt;Five tables, each with one job:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ScanCache&lt;/strong&gt;, 15-minute TTL, keyed by a hash of the URL, dedupes repeat fetches and keeps Bedrock cost down.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ScanResults&lt;/strong&gt;, 24-hour TTL, keyed by an opaque id, anonymous, results that expire on their own.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BenchmarkScans&lt;/strong&gt;, the 50-site dataset, with a GSI on vertical, rewritten monthly by an EventBridge refresh.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ScanCounters&lt;/strong&gt;, server-side counts, no PII. Reserved for the team tier.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Users&lt;/strong&gt;, reserved for signed-in history. A stub.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Two of those are live on every scan, one holds the benchmark, and two are reserved stubs for later. Two TTLs, two lifetimes, two reasons. Per-vertical rollups use the GSI, not a second database. No joins, no migrations, no idle server.&lt;/p&gt;
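
&lt;p&gt;A rough Python sketch of the two expiring item shapes, assuming a SHA-256 hash of the trimmed URL as the cache key. DynamoDB's TTL feature reads an epoch-seconds number attribute; the attribute names &lt;code&gt;pk&lt;/code&gt; and &lt;code&gt;expires_at&lt;/code&gt;, the normalization rule, and the helper names are assumptions, not the project's schema.&lt;/p&gt;

```python
import hashlib
import time

SCAN_CACHE_TTL_SECONDS = 15 * 60        # ScanCache: 15 minutes
SCAN_RESULT_TTL_SECONDS = 24 * 60 * 60  # ScanResults: 24 hours


def cache_key(url):
    """Hash of the URL. Normalization here is only whitespace trimming (assumed)."""
    return hashlib.sha256(url.strip().encode("utf-8")).hexdigest()


def expiring_item(key, payload, ttl_seconds, now=None):
    """Build an item whose expires_at is an epoch-seconds value for a TTL attribute."""
    now = int(time.time()) if now is None else int(now)
    return {"pk": key, "payload": payload, "expires_at": now + ttl_seconds}
```

&lt;p&gt;With this shape, a cache entry would be &lt;code&gt;expiring_item(cache_key(url), report, SCAN_CACHE_TTL_SECONDS)&lt;/code&gt; and a stored result would use an opaque id with &lt;code&gt;SCAN_RESULT_TTL_SECONDS&lt;/code&gt;. Both reads stay single-key lookups.&lt;/p&gt;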

&lt;p&gt;The write on a live scan is fail-soft and async. The scan returns to you whether or not the write lands, and a failed write goes to CloudWatch instead of your screen. The scan result is the product. Persistence is a side effect.&lt;/p&gt;
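
&lt;p&gt;The fail-soft half of that can be sketched in Python as below. The async half is left out to keep the example short, and &lt;code&gt;persist_fail_soft&lt;/code&gt;, &lt;code&gt;run_scan&lt;/code&gt;, and the injected &lt;code&gt;scan&lt;/code&gt; and &lt;code&gt;write&lt;/code&gt; callables are hypothetical names. Here the standard &lt;code&gt;logging&lt;/code&gt; module stands in for CloudWatch.&lt;/p&gt;

```python
import logging

logger = logging.getLogger("scan")


def persist_fail_soft(write, item):
    """Try the write; on any error, log it and return False instead of raising."""
    try:
        write(item)
        return True
    except Exception:
        logger.exception("scan result write failed; returning the scan anyway")
        return False


def run_scan(url, scan, write):
    """The scan result is returned whether or not persistence succeeded."""
    result = scan(url)
    persist_fail_soft(write, result)
    return result
```

&lt;p&gt;The caller of &lt;code&gt;run_scan&lt;/code&gt; never sees a storage exception. The only trace of a failed write is the log line.&lt;/p&gt;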

&lt;p&gt;(The product is Agentis Lux. The engine is Perseus Clew, part of my Clew suite, which is why the AWS tables carry the PerseusClew prefix.)&lt;/p&gt;

&lt;h2&gt;
  
  
  The bet I made in public
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fmkzlyz10yesynxrjdimx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fmkzlyz10yesynxrjdimx.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Before the engine scanned anything, I wrote down what I expected it to find across 50 sites and committed it to the repo with a timestamp. &lt;a href="https://dev.to/earlgreyhot1701d/predictions-first-data-later-seven-hot-takes-on-ai-agent-readiness-before-i-scan-50-sites-599d"&gt;Predictions first&lt;/a&gt;, data later, so I couldn't move the goalposts.&lt;/p&gt;

&lt;p&gt;Then I scanned ten sites each across e-commerce, SaaS, content and media, US government, and indie builder projects.&lt;/p&gt;

&lt;p&gt;Indie builders won. Mean score 77 out of 100, ahead of government, SaaS, and e-commerce. The single highest score in the whole run was a personal developer portfolio at 91. Scores ran from 34 to 91. Four sites blocked the scan at the door, including OpenAI.&lt;/p&gt;

&lt;p&gt;I missed three of my six predictions. That is the point of pre-registering them. If I had gone six for six you should distrust me, because it would mean I only predicted what I already knew. The misses are where I learned something: that craft beats compliance, that the API is the real blind spot, and that a hand-built personal site reads cleaner to an agent than most of the web's biggest companies.&lt;/p&gt;

&lt;p&gt;The full dataset, including the sites that blocked me, is in &lt;a href="https://github.com/earlgreyhot1701D/perseus-clew/blob/main/docs/BENCHMARK-HYPOTHESES.md" rel="noopener noreferrer"&gt;the repo&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The gaps
&lt;/h2&gt;

&lt;p&gt;Fetching arbitrary user-supplied URLs on a public endpoint is a security problem before it is a feature. The backend does full DNS resolution, blocks private and reserved IPs, validates every redirect hop, forces HTTPS, and caps size and time. That hardening took as long as some of the checks did.&lt;/p&gt;
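
&lt;p&gt;The address-blocking part of that hardening might look roughly like this in Python, using the standard &lt;code&gt;ipaddress&lt;/code&gt; and &lt;code&gt;socket&lt;/code&gt; modules. It is a sketch, not the deployed guard: the function names are hypothetical, and the redirect, HTTPS, size, and time limits described above are not shown.&lt;/p&gt;

```python
import ipaddress
import socket


def is_blocked_ip(ip_text):
    """True for addresses a public scanner should refuse to fetch."""
    ip = ipaddress.ip_address(ip_text)
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def resolve_and_check(hostname, port=443):
    """Resolve every address for the host and reject if any one of them is blocked."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    addresses = sorted({info[4][0] for info in infos})
    blocked = [a for a in addresses if is_blocked_ip(a)]
    if blocked:
        raise ValueError(f"refusing to fetch {hostname}: {blocked}")
    return addresses
```

&lt;p&gt;A check like this only helps if it runs again on every redirect hop and if the fetch then connects to the address that was checked. Otherwise a hostname can resolve to a safe address first and a private one later.&lt;/p&gt;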

&lt;p&gt;Bedrock had to be allowed to fail. If the model is slow or errors, the report still renders, because the AI verdict has a deterministic template under it as a floor. The hero line never breaks, because the score under it was never AI in the first place.&lt;/p&gt;
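
&lt;p&gt;A small Python sketch of that floor, assuming the model call is passed in as a plain function. The template wording, the score thresholds, the 200-character cap, and the names &lt;code&gt;template_verdict&lt;/code&gt; and &lt;code&gt;verdict&lt;/code&gt; are invented for the example.&lt;/p&gt;

```python
def template_verdict(score):
    """Deterministic floor: built from the score alone (wording is invented)."""
    if score >= 80:
        return f"Score {score} of 100: most of this page is present in the HTML."
    if score >= 50:
        return f"Score {score} of 100: parts of this page are present in the HTML."
    return f"Score {score} of 100: little of this page is present in the HTML."


def verdict(score, call_model, max_chars=200):
    """Use the model's sentence when it arrives and validates; otherwise the template."""
    try:
        text = call_model(score)
    except Exception:
        # Slow, throttled, or erroring model: the report still gets a line.
        return template_verdict(score)
    if not isinstance(text, str) or not text.strip() or len(text) > max_chars:
        return template_verdict(score)
    if "\u2014" in text:  # one of the product rules quoted earlier: no em dashes
        return template_verdict(score)
    return text.strip()
```

&lt;p&gt;Because the template depends only on the score, the line under the number renders even when the model call never completes.&lt;/p&gt;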

&lt;p&gt;And also: this is a solo build on a deadline. The backend is JavaScript, not TypeScript. The benchmark page serves a published snapshot instead of querying DynamoDB live. The results view still has heading-hierarchy work. All of it is written down in &lt;a href="https://github.com/earlgreyhot1701D/perseus-clew/blob/main/docs/KNOWN-LIMITATIONS.md" rel="noopener noreferrer"&gt;KNOWN-LIMITATIONS.md&lt;/a&gt;, as choices, with reasons. On a product whose whole thesis is readability, hiding the gaps would be the one move I could not make.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where this sits next to the other tools
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.scrunchai.com" rel="noopener noreferrer"&gt;Scrunch&lt;/a&gt;, recently acquired by Sitecore, works on AI search visibility: whether your brand gets cited when someone asks an AI a question. That is about being found. Agentis Lux is about whether an agent can read and use what it finds. Visibility, not operability.&lt;/p&gt;

&lt;p&gt;Google's experimental &lt;a href="https://developer.chrome.com/docs/lighthouse/overview" rel="noopener noreferrer"&gt;Agentic Browsing audit in Lighthouse&lt;/a&gt; (May 2026) checks the agent-as-actor surface: WebMCP and whether a browser-driving agent can operate your page. Agentis Lux goes deeper on the agent-as-reader surface, the raw HTML a retrieval agent forms an impression from before it ever acts. Different door.&lt;/p&gt;

&lt;p&gt;The agentic web is new enough that Google only added experimental, unscored checks two months ago. That is not a reason this is unoriginal. It is evidence the lane is open.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the tool says about itself
&lt;/h2&gt;

&lt;p&gt;Agents are not one reader. They are a spectrum, from the retrieval crawler that never runs your JavaScript to the browser-driving agent that does. The interesting output is the gap between them, and that is where this goes next: live benchmark querying, score history, and a render mode that shows the delta between what a non-JS agent sees and what a JS-capable one sees.&lt;/p&gt;

&lt;p&gt;The tool scans its own site and publishes the result. It went from 70 to 96 after I fixed what it found, with one finding still open and shown anyway. Because if I scrubbed my own site to a perfect 100, you would have every reason not to trust the number on yours.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fdj8vt7j9zt5sbk56uo23.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fdj8vt7j9zt5sbk56uo23.png" alt=" " width="800" height="416"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Try it on your own site: &lt;a href="https://agentislux.io" rel="noopener noreferrer"&gt;agentislux.io&lt;/a&gt;. The code, the methodology, and the raw benchmark data are in the &lt;a href="https://github.com/earlgreyhot1701D/perseus-clew" rel="noopener noreferrer"&gt;repo&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For your second audience.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Links&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Live: &lt;a href="https://agentislux.io" rel="noopener noreferrer"&gt;agentislux.io&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Demo video (2:57): &lt;a href="https://www.youtube.com/watch?v=bv56_XB1E_c" rel="noopener noreferrer"&gt;youtube.com/watch?v=bv56_XB1E_c&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Code (Perseus Clew engine): &lt;a href="https://github.com/earlgreyhot1701D/perseus-clew" rel="noopener noreferrer"&gt;github.com/earlgreyhot1701D/perseus-clew&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;The earlier proof of concept, Hermes Clew: &lt;a href="https://github.com/earlgreyhot1701D/hermes-clew" rel="noopener noreferrer"&gt;github.com/earlgreyhot1701D/hermes-clew&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;H0 Hackathon: &lt;a href="https://h01.devpost.com/" rel="noopener noreferrer"&gt;h01.devpost.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;More from Clew Labs: &lt;a href="https://earlgreyhot1701d.github.io/Clew-Labs/" rel="noopener noreferrer"&gt;earlgreyhot1701d.github.io/Clew-Labs&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;AI assisted. Human approved. Powered by NLP.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>aws</category>
      <category>showdev</category>
    </item>
    <item>
      <title>My app didn't go "viral". My AWS bill did.</title>
      <dc:creator>L. Cordero</dc:creator>
      <pubDate>Thu, 25 Jun 2026 03:54:05 +0000</pubDate>
      <link>https://dev.to/earlgreyhot1701d/my-app-didnt-go-viral-my-aws-bill-did-434h</link>
      <guid>https://dev.to/earlgreyhot1701d/my-app-didnt-go-viral-my-aws-bill-did-434h</guid>
      <description>&lt;p&gt;&lt;strong&gt;And by viral I mean from $0 to $31.&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;&lt;a href="https://umami.is/" rel="noopener noreferrer"&gt;Umami&lt;/a&gt; told me &lt;a href="//clewdirective.com"&gt;Clew Directive&lt;/a&gt; got 14 visits last month. AWS told me I owed $31 for it. That works out to $2.21 a visitor, which would make it the most expensive free learning-path tool in California.&lt;/p&gt;

&lt;p&gt;Spoiler alert: 14 visits, $31, and not a single one of them was the reason.&lt;/p&gt;

&lt;p&gt;Something was off. Here is how Amazon Q, Claude, and a few hours of reading my own code untangled it. The app turned out to be innocent.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Clew Directive is, quickly
&lt;/h2&gt;

&lt;p&gt;A free, stateless tool that builds you a personalized AI learning-path PDF. You take a 60-second Vibe Check, four questions about your goals and how you learn, and it maps you to free, verified resources and hands you a briefing. No accounts, no database, no paywall, nothing stored about you. It runs on Amazon Nova, which is why it costs close to nothing to operate, which is also why a $31 bill made no sense.&lt;/p&gt;

&lt;p&gt;The name is the Theseus kind of clew. A ball of thread to find your way out of the maze. Less hype, more direction. Live at &lt;a href="https://clewdirective.com" rel="noopener noreferrer"&gt;clewdirective.com&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The number that didn't add up
&lt;/h2&gt;

&lt;p&gt;Twelve visitors, 14 visits, 93% bounce, average session about a minute. Referrers from Bing, Google, Yahoo, GitHub. Visitors from the US, India, Netherlands, Egypt, Ethiopia, Singapore. Mostly crawlers stopping by to say hello.&lt;/p&gt;

&lt;p&gt;A few curious humans and a parade of bots is not a $31 month. So either every visit was doing something enormous, or the bill was never about visits at all.&lt;/p&gt;

&lt;h2&gt;
  
  
  The dashboard lied, politely. An Amazon Q Story
&lt;/h2&gt;

&lt;p&gt;My cost tracker said Clew Directive was running on Claude Sonnet. Sonnet is the expensive one. Case closed, right?&lt;/p&gt;

&lt;p&gt;I opened the repo. Clew Directive does not run Sonnet. The Navigator agent runs Amazon Nova 2 Lite. Scout and Curator run Nova Micro. The IAM policy is scoped to Nova ARNs only, so a Sonnet call from these functions would come back AccessDenied. The app physically cannot bill Sonnet.&lt;/p&gt;

&lt;p&gt;The math agreed. A full learning-path generation on Nova costs about two-tenths of a cent. Fourteen visits, even with the agents fanning out to a few calls each, rounds to lunch money. Nothing here gets you to $31.&lt;/p&gt;

&lt;p&gt;One detail I want to flag, because it set the tone for the whole hunt. The same assistant, Q, that mislabeled the model also quoted me Haiku pricing at a quarter of the real rate. So here is the rule I kept coming back to: trust what a tool retrieves, verify what it remembers. Those are two different things.&lt;/p&gt;

&lt;h2&gt;
  
  
  Asking a better question
&lt;/h2&gt;

&lt;p&gt;The question stopped being "why is my app so expensive" and became "what is actually spending, and why is it wearing my app's name."&lt;/p&gt;

&lt;p&gt;Q pulled the breakdown. The month was 28 million tokens across only 8 active days, and two of those days did 70% of the work. May 24 and 25. Memorial Day weekend.&lt;/p&gt;

&lt;p&gt;The shape of the cost was the real tell (Sonnet only):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cache writes: 4.1M tokens, $15.33 (55%)&lt;/li&gt;
&lt;li&gt;Cache reads: 23.8M tokens, $7.14 (26%)&lt;/li&gt;
&lt;li&gt;Output: 346K tokens, $5.20&lt;/li&gt;
&lt;li&gt;Input: 120K tokens, $0.36&lt;/li&gt;
&lt;/ul&gt;
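
&lt;p&gt;The breakdown above is easy to sanity-check with a few lines of Python. The token counts and dollar amounts are copied from the list. The variable names are arbitrary, and the per-million rates are implied by the figures rather than quoted from a price sheet.&lt;/p&gt;

```python
# Sonnet line items copied from the breakdown above: (tokens, dollars).
line_items = {
    "cache_write": (4_100_000, 15.33),
    "cache_read": (23_800_000, 7.14),
    "output": (346_000, 5.20),
    "input": (120_000, 0.36),
}

total_dollars = sum(cost for _tokens, cost in line_items.values())
total_tokens = sum(tokens for tokens, _cost in line_items.values())

for name, (tokens, cost) in line_items.items():
    dollar_share = 100 * cost / total_dollars
    per_million = cost / (tokens / 1_000_000)
    print(f"{name:12s} {dollar_share:5.1f}% of spend, ${per_million:.2f} per 1M tokens")

# Share of all tokens that were cache traffic rather than fresh input or output.
cached_tokens = line_items["cache_write"][0] + line_items["cache_read"][0]
cache_token_share = 100 * cached_tokens / total_tokens
print(f"total ${total_dollars:.2f}; cache traffic is {cache_token_share:.1f}% of tokens")
```

&lt;p&gt;The four items sum to $28.03 across about 28.4 million tokens, and nearly all of those tokens are cache writes or cache reads.&lt;/p&gt;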

&lt;h2&gt;
  
  
  The fingerprint
&lt;/h2&gt;

&lt;p&gt;A web app serving 14 visitors does not look like that. Heavy cache write up front, heavy cache read after, almost no real input or output, is the signature of an agent reasoning over a big fixed context. It loads that context once, caches it, then re-reads it on every turn.&lt;/p&gt;

&lt;p&gt;Clew Directive does no prompt caching at all. So whatever ran up the bill, it was an agent chewing on a large cached context, not an app answering users. Which pointed me at a very different project.&lt;/p&gt;

&lt;h2&gt;
  
  
  It was me, over a long weekend sprint
&lt;/h2&gt;

&lt;p&gt;Clew Directive had zero commits on May 24 or 25. Last time I touched it was May 9.&lt;/p&gt;

&lt;p&gt;A different repo lit up. vigil-crest. Created May 23, four commits on May 24. And I know exactly what it is, because I &lt;a href="https://dev.to/earlgreyhot1701d/i-built-a-hermes-agent-to-tell-me-which-hackathons-to-enter-it-told-me-to-enter-this-one-jh2"&gt;wrote a whole article about it&lt;/a&gt; and published it on, of course, May 24.&lt;/p&gt;

&lt;p&gt;Vigil Crest is a challenge-triage agent I talk to on Telegram. It browses the live DEV challenge feed and tells me which hackathons are worth my time. Its stack, in my own published words: AWS Bedrock running Claude Sonnet 4.6, reached through an EC2 instance role so no credentials sit on the box, hosted on an always-on t3.micro. Read that back against what CloudTrail handed me: an assumed role, &lt;code&gt;vigil-crest-bedrock-role&lt;/code&gt;, on an EC2 instance, calling &lt;code&gt;claude-sonnet&lt;/code&gt; through the streaming API. (I am not pasting the full ARN. Account IDs stay home.)&lt;/p&gt;

&lt;p&gt;Same project. Same box. Same model. The weekend I shipped it.&lt;/p&gt;

&lt;p&gt;So the roughly $28 of Sonnet spend was vigil-crest, run up while I spent two days hammering on it before submission. Each triage run caches a fat context (the agent persona, the stack file, the rendered challenge feed), then re-reads it across turns. That is the cache-heavy shape, exactly. Whether it was the agent's own test runs or the tooling I built it with, both ran on that one EC2 box under that one role. Real work, priced correctly, just filed under the wrong project.&lt;/p&gt;

&lt;h2&gt;
  
  
  Q was Sherlock. Claude was the Watson who argued back.
&lt;/h2&gt;

&lt;p&gt;I want to shine a light on how this got solved, because neither tool did it alone and neither did I.&lt;/p&gt;

&lt;p&gt;Amazon Q in the console has one thing Claude does not: keys to the building. It reads my live account. CloudTrail, Cost Explorer, the actual deployed config, the IAM principal behind a single call. That is the force multiplier. I do not have CloudTrail memorized and I am not going to hand-read every IAM policy at 9pm. Q walked the crime scene and came back with the role name, the instance, the timestamp, the model, in minutes. That is the legwork no amount of reasoning replaces.&lt;/p&gt;

&lt;p&gt;But access is not the same as the right conclusion. Q pulled clean evidence and attached the wrong story to it three times. It was a runaway process. It was your app. Check your application logs. Every pass, perfect data, wrong suspect. The evidence was never the problem. The narrative on top of it was.&lt;/p&gt;

&lt;p&gt;Claude could not see my account at all. What it could do was refuse the easy story and push the evidence back through the actual code and the math. The repo says Nova. The IAM says AccessDenied for Sonnet. The token shape says agent, not app. It also said, out loud, that it could only see my commits and not my deployments, so part of this was inference and I should confirm it. A tool telling me where its own knowledge stops is worth more than one that sounds certain.&lt;/p&gt;

&lt;p&gt;So Q was the detective with the magnifying glass on the real scene, and Claude was the Watson who kept asking does that actually follow. The twist is that in this case Watson is the one who pushed back on the detective. Sherlock had the keys. Watson had the doubt. I had to point them at each other and refuse to take the first confident sentence either one offered.&lt;/p&gt;

&lt;p&gt;There is a tidy irony in here. Vigil Crest, the agent that ran up the bill, is built on exactly this idea: a verdict that knows how sure it is beats a confident guess. It hedges its calls on purpose. Solving its own bill came down to making my tools do the same thing, separate what they pulled from what they assumed. The agent's whole design philosophy is what cracked the case it caused.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned, hopefully
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The bill was never about traffic.&lt;/strong&gt; Umami is client-side JavaScript. It counts browsers that run my script. It cannot see bots that never run JavaScript, it cannot see API calls, and it has nothing to do with Bedrock spend. I had tied two unrelated numbers together and scared myself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The project label was a guess, not a fact.&lt;/strong&gt; Bedrock charges are account-level. They do not inherit the tags of whatever called them. Unless you set up Application Inference Profiles and call those, every model dollar lands in a bucket marked "no project," and something has to claim it. Mine claimed Clew Directive by assumption.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The cost that bites you is the quiet one.&lt;/strong&gt; The $31 sprint is over. June reads $0. The token burst was loud and self-limiting: it ended when I closed the laptop. The thing that does not end is the EC2 box under Vigil Crest, billing by the hour because it is meant to stay on. A small always-on t3.micro is cheap, but cheap and forgotten is how standing costs sneak up. Know what you keep running, and why.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The AI tools were useful and wrong in the same breath.&lt;/strong&gt; Q read CloudTrail and Cost Explorer cleanly and narrated the wrong story three times, blaming my app on every pass. Claude caught the bad pricing and read the repos, and still had to admit it could see commits, not deployments. The actual work was pinning each claim to a source instead of to a confident sentence. Trust retrieval. Verify recall.&lt;/p&gt;

&lt;p&gt;So no, Clew Directive did not go viral. It served 12 visitors across 14 visits, plus a crowd of crawlers, and cost me almost nothing, which is exactly what it was built to do.&lt;/p&gt;

&lt;p&gt;The bill was me, in a trench coat made of EC2, building the next thing.&lt;/p&gt;

&lt;p&gt;Tell AWS. I want them to know it was &lt;a href="https://github.com/earlgreyhot1701D/vigil-crest" rel="noopener noreferrer"&gt;me&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Clew Directive is free and open source. Find your way out of the AI-course maze at &lt;a href="https://clewdirective.com" rel="noopener noreferrer"&gt;clewdirective.com&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;AI Assisted. Human Approved. Powered by NLP. &lt;/p&gt;

</description>
      <category>aws</category>
      <category>ai</category>
      <category>learning</category>
      <category>infrastructure</category>
    </item>
    <item>
      <title>Breaking Build: Kiro and Claude delivered exactly what I asked, and it wasn't what I wanted</title>
      <dc:creator>L. Cordero</dc:creator>
      <pubDate>Fri, 19 Jun 2026 16:48:42 +0000</pubDate>
      <link>https://dev.to/earlgreyhot1701d/breaking-build-kiro-and-claude-delivered-exactly-what-i-asked-and-it-wasnt-what-i-wanted-27l5</link>
      <guid>https://dev.to/earlgreyhot1701d/breaking-build-kiro-and-claude-delivered-exactly-what-i-asked-and-it-wasnt-what-i-wanted-27l5</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Building in public means showing the part where the robots did great work on the wrong thing.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;The deploy on Agentis Lux succeeded. Green check, no errors, site live. I scanned my own site to grab a "before" shot for a before-and-after, and the scanner handed back a score of 62.&lt;/p&gt;

&lt;p&gt;It handed back 62 for the next site too. And the next one. Same score, same findings, every time, including a finding about a "checkout button" on a site that has no checkout button.&lt;/p&gt;

&lt;p&gt;The build worked. It was running a version of the scanner I'd written weeks ago and abandoned. Everything I'd built since then was sitting in the repo, merged, tested, and not deployed. The deploy pipeline had run exactly once, in May, and never again. AND I NEVER NOTICED!&lt;/p&gt;

&lt;p&gt;So the live site was a confident, well-tested, fully-green stub.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Technically&lt;/em&gt;, nothing went wrong. That's the part I keep mulling over...and over...and over.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mind the gap!
&lt;/h2&gt;

&lt;p&gt;I build with AI agents. I direct, they generate. One agent writes the infrastructure, another audits it, I make the calls and merge. It's fast and it's good, and the failure mode is not what I expected.&lt;/p&gt;

&lt;p&gt;I expected the agents to make mistakes. They mostly don't. What they do instead is build exactly what I asked for, correctly, when what I asked for wasn't what I wanted. The bug isn't in the code. The bug is in the gap between my instruction and my intention, and the agent fills that gap with whatever's most literally true. This exact thing, context engineering, came up at Anthropic's talk at the &lt;a href="https://dev.to/earlgreyhot1701d/aws-summit-los-angeles-2026-why-am-i-always-learning-the-hard-way-46lb"&gt;AWS Summit&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;A human orchestrator, in this case...me, pushes back. "You said deploy, but the pipeline hasn't run since May, did you mean redeploy the current code?" An agent says "deploy succeeded" because the deploy did, in fact, succeed. It answered the question I asked. I asked the wrong question, and it sat squarely in my blind spot.&lt;/p&gt;

&lt;p&gt;I hit this four times on one project in about a week. Same shape every time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Four times it was right and wrong at once
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The stub that shipped.&lt;/strong&gt; The 62 that came back for every single site, the Groundhog Day score. The infrastructure was real, the tests were green, the deploy worked. It just deployed code I'd left behind. "Is it deployed" was true. "Is the thing I built deployed" was the question I forgot to ask. [Lesson: Don't assume.]&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The three doors, one of them real.&lt;/strong&gt; My scanner takes three kinds of input: a URL, a code repo, an API spec. The interface showed three tabs for them. Clean, obvious, exactly what the design implied. Only the URL one was wired up. The other two were built to the spec I gave, which described three tabs, and I'd later decided to ship only URL scanning first and never updated the interface to match. So a visitor clicks "API spec," types something in, and hits a polite wall. The tabs were correct. My scope had moved and the tabs hadn't heard about it. [Lesson: Kiro and Claude can't read my mind!]&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The findings only an engineer could read.&lt;/strong&gt; My whole audience is people who build with AI and may not know what a &lt;code&gt;&amp;lt;ul&amp;gt;&lt;/code&gt; is. The scanner's findings said things like "repeated sibling elements not wrapped in ul or ol." That is a correct finding. It is also useless to the person I built the tool for. I'd asked for accurate, technical, no-fluff findings. I got them. I forgot to ask "can my actual user read this." [Lesson: Don't forget you're building for the end user, a real person, not a theoretical one.]&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The card that rendered nothing.&lt;/strong&gt; A social card route, built, deployed, working. I saved the image and got a zero-byte file. The route fetched three fonts from the web, and when one came back empty instead of failing outright, the image renderer got fed garbage and produced nothing. The catch block that was supposed to handle font failures never fired, because the fetch didn't fail. It "succeeded" with an empty hand. The error handling was correct for the error it was watching for. The actual failure walked in through the one door nobody was watching. [Lesson: Don't skip testing the live workflow.]&lt;/p&gt;
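
&lt;p&gt;A minimal sketch of the guard that would have caught the empty card, written in Python for brevity rather than in the route's own code. The names &lt;code&gt;load_font&lt;/code&gt;, &lt;code&gt;load_fonts&lt;/code&gt;, and &lt;code&gt;EmptyFontError&lt;/code&gt;, and the injected &lt;code&gt;fetch&lt;/code&gt; callable, are hypothetical and not from the project. The only idea here is that an empty body gets treated as a failure, so the error handling has something to catch.&lt;/p&gt;

```python
class EmptyFontError(Exception):
    """Raised when a font fetch 'succeeds' but returns no bytes."""


def load_font(fetch, url):
    # fetch is any callable that takes a URL and returns the body as bytes.
    # It is injected so this sketch runs without a network (an assumption,
    # not how the real route is written).
    data = fetch(url)
    if not data:
        # The fetch itself raised nothing, so raise here instead.
        raise EmptyFontError(f"font at {url} came back empty")
    return data


def load_fonts(fetch, urls):
    # Returns (fonts, problems): usable font bytes, plus one note per failure.
    fonts, problems = [], []
    for url in urls:
        try:
            fonts.append(load_font(fetch, url))
        except EmptyFontError as err:
            problems.append(str(err))
    return fonts, problems


# Fake fetch for illustration: one font comes back empty instead of failing.
fake_bodies = {"a.ttf": b"AAAA", "b.ttf": b"", "c.ttf": b"CCCC"}
fonts, problems = load_fonts(fake_bodies.get, ["a.ttf", "b.ttf", "c.ttf"])
```

&lt;p&gt;With a check like this, the renderer only ever receives fonts that have bytes in them, and the caller can decide whether to fall back to a default font or fail loudly.&lt;/p&gt;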

&lt;h2&gt;
  
  
  The pattern
&lt;/h2&gt;

&lt;p&gt;Every one of these passed its own test. The deploy deployed. The tabs matched the spec. The findings were accurate. The card route ran. If I'd trusted "it works," all four would have shipped.&lt;/p&gt;

&lt;p&gt;The thing that caught them was not better prompting and not a smarter agent. It was me looking at the actual output and asking a simpler question than the agent could ask. Not "did it run." "Is this the thing I wanted." A 62 on every site is suspicious if you bother to scan a second site. Three tabs are a trap if you click the ones you didn't finish. A finding is useless if you read it as your own user instead of as the engineer who wrote it.&lt;/p&gt;
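
&lt;p&gt;The "scan a second site" check is small enough to write down. This is a rough sketch in Python, not something from the repo: &lt;code&gt;looks_stubbed&lt;/code&gt; is a hypothetical helper, and the site names and scores are made up. It assumes that real scans of different sites rarely agree to the point, so identical scores across the board are worth a second look.&lt;/p&gt;

```python
def looks_stubbed(scores):
    # scores: dict mapping a site to its score.
    # One site (or none) cannot show the pattern, so say nothing.
    values = list(scores.values())
    if len(values) in (0, 1):
        return False
    # Every site getting the same number suggests a stub or hard-coded result.
    return len(set(values)) == 1


# Made-up values for illustration only.
groundhog_day = {"site-a.example": 62, "site-b.example": 62, "site-c.example": 62}
plausible = {"site-a.example": 62, "site-b.example": 48, "site-c.example": 91}

stub_suspected = looks_stubbed(groundhog_day)
real_suspected = looks_stubbed(plausible)
```

&lt;p&gt;It would not prove anything on its own. It only turns "is this the thing I wanted" into a question that gets asked every time instead of when I remember.&lt;/p&gt;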

&lt;p&gt;Agents optimize for what you said. The whole job of the human in the loop is to keep checking what you said against what you meant, because the agent can't see the difference and you're the only one who can.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I keep doing it anyway
&lt;/h2&gt;

&lt;p&gt;This reads like I haven't learned the lessons I've been writing about. So, yes and no? The agents did weeks of real work in days. The audit agent caught real bugs the tests missed. The infrastructure is solid. I would not give that back.&lt;/p&gt;

&lt;p&gt;But there's a reason the model is "I direct, they generate" and not "they build, I watch." Direction is not a one-time instruction you hand off. It's the continuous act of holding the work up against intent and saying "close, but that's not it." The agents are extraordinary at "exactly what you asked." Knowing what to ask, and noticing when the answer is technically perfect and quietly wrong, is the part that's still mine.&lt;/p&gt;

&lt;p&gt;The deploy succeeded. Not the deployment I thought it was. And now I know to look twice.&lt;/p&gt;

&lt;p&gt;All four of these are from building Agentis Lux, an agent-readiness scanner. Yes, a tool that tells other people what agents can't read shipped a stub, hid a broken tab, and rendered an empty card. It's in the open if you want to watch me keep catching myself: &lt;a href="https://github.com/earlgreyhot1701D/perseus-clew" rel="noopener noreferrer"&gt;https://github.com/earlgreyhot1701D/perseus-clew&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;AI assisted. Human approved. Powered by NLP.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Built with Kiro, Claude, and a lot of looking at the actual output.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>buildinpublic</category>
      <category>ai</category>
      <category>aws</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Predictions first, data later: seven hot takes on AI agent readiness before I scan 50 sites.</title>
      <dc:creator>L. Cordero</dc:creator>
      <pubDate>Wed, 17 Jun 2026 04:08:50 +0000</pubDate>
      <link>https://dev.to/earlgreyhot1701d/predictions-first-data-later-seven-hot-takes-on-ai-agent-readiness-before-i-scan-50-sites-599d</link>
      <guid>https://dev.to/earlgreyhot1701d/predictions-first-data-later-seven-hot-takes-on-ai-agent-readiness-before-i-scan-50-sites-599d</guid>
      <description>&lt;p&gt;Can a robot read this? Asking for a friend.&lt;/p&gt;

&lt;p&gt;A few months ago, for a hackathon, I built a small tool that checked whether an AI agent could read a website. Deterministic checks underneath, an AI reasoning layer on top. It worked. I called it &lt;a href="https://github.com/earlgreyhot1701D/hermes-clew" rel="noopener noreferrer"&gt;Hermes Clew&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I could have left it there. The hackathon ended and it was only a proof of concept. But the idea didn't leave. The more agents showed up at the products people ship, the more it looked like the question of the next few years. Not "can a person use this." Can an agent.&lt;/p&gt;

&lt;p&gt;So Hermes became two things: Perseus Clew, the engine, and Agentis Lux, the product. (Latin, roughly "light of the agent." It shows you what the agent sees.) I laid out the thesis and the build in my &lt;a href="https://dev.to/earlgreyhot1701d/my-website-has-two-audiences-now-i-only-built-for-one-of-them-136m"&gt;last post&lt;/a&gt;. This is the part I promised to come back for.&lt;/p&gt;

&lt;p&gt;Consider it my flag in the moon sand. I think agents are the question. And I'm willing to write down what I expect to find before I have any data to hide behind.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3bykoshww5lxdk8jwu1m.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3bykoshww5lxdk8jwu1m.gif" alt=" " width="636" height="322"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here's the bet.&lt;/p&gt;

&lt;h2&gt;
  
  
  I'm pre-registering my predictions (i.e. I'm playing myself!)
&lt;/h2&gt;

&lt;p&gt;Before the engine scans a single site, I wrote down what I think it will find. Then I committed it to the repo with a timestamp.&lt;/p&gt;

&lt;p&gt;Scientists call this pre-registration. I call it no take-backs.&lt;/p&gt;

&lt;p&gt;The reason is simple. When the data comes in, it is easy to look at the numbers, find the pattern that flatters the project, and write the post as if that is what I expected all along. Writing the predictions down first makes that impossible. If I'm right, I predicted it. If I'm wrong, the miss stays on the page next to the result. Either way you can check my work.&lt;/p&gt;

&lt;p&gt;That matters more than usual here, because scoring tools have a reputation for marking their own homework. The fix is to make everything inspectable, including the part where I could fool myself.&lt;/p&gt;

&lt;h2&gt;
  
  
  The thesis it all hangs on
&lt;/h2&gt;

&lt;p&gt;The web was built by humans, for humans. Search crawlers and screen readers got partial accommodation later. Agents are showing up to a playground nobody built for them.&lt;/p&gt;

&lt;p&gt;So I expect most sites to have gaps in what an agent can read and do. And I expect those gaps to land in predictable places: wherever no human incentive paid for machine readability. A site tends to be readable to an agent by accident, wherever something else (search ranking, accessibility law) already forced clean structure. The rest is a blind spot.&lt;/p&gt;

&lt;p&gt;That is the engine under every prediction below.&lt;/p&gt;

&lt;h2&gt;
  
  
  The predictions
&lt;/h2&gt;

&lt;p&gt;Once the engine is live I'm scanning 50 sites: ten each across e-commerce, SaaS, content and media, government, and indie/builder projects. Chosen by maximum-variation sampling, documented site by site in the repo, to span size, platform, and build type. Not at random, and not by the score I expected.&lt;/p&gt;

&lt;p&gt;Each prediction has a confidence tag and a line saying what would prove it wrong, because a prediction you can't lose isn't a prediction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. The rendering cliff is the deciding line. (Pretty sure. The one I'd bet money on.)&lt;/strong&gt;&lt;br&gt;
Sites built as heavy client-side JavaScript apps will be hard for agents to read, no matter what kind of site they are. Most AI crawlers don't run JavaScript. Vercel's network data, across hundreds of millions of crawler fetches, found no JavaScript execution at all. If your content only appears after the JS runs, the agent sees an empty shell.&lt;br&gt;
&lt;em&gt;Wrong if:&lt;/em&gt; client-heavy sites score about the same as server-rendered ones on whether the content is in the HTML.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Government beats startups. (Pretty sure.)&lt;/strong&gt;&lt;br&gt;
US government sites will be easier for agents to read than small indie and startup sites. Not because anyone in government set out to court agents, but because federal law (Section 508) forces clean, labeled, semantic markup, and that same structure is what an agent parses. Regulation made them accidentally agent-ready. I'm keeping this run US-only on purpose, so the 508 mechanism is the thing being tested and not a mix of different countries' rules.&lt;br&gt;
&lt;em&gt;Wrong if:&lt;/em&gt; US government sites land at or below indie sites on semantic structure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Structured data is a commerce-and-media thing. (Maybe, leaning pretty sure.)&lt;/strong&gt;&lt;br&gt;
The machine-readable labels that tell an agent what a page is will show up mostly on shopping and news sites, and be close to absent on government and indie sites. Search ranking is the only incentive that paid for them.&lt;br&gt;
&lt;em&gt;Wrong if:&lt;/em&gt; it's spread evenly, or shows up where there's no search incentive.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. E-commerce is the widest spread. (Maybe.)&lt;/strong&gt;&lt;br&gt;
Online stores will have the biggest internal range of any group. The platform mix runs from templated stores to fully custom JavaScript storefronts, so some will be clean and some an agent can't read.&lt;br&gt;
&lt;em&gt;Wrong if:&lt;/em&gt; store scores cluster tightly, or another group spreads wider.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Some sites lock the gate by accident. (Coin toss.)&lt;/strong&gt;&lt;br&gt;
More than a couple of sites will block agents at robots.txt, so the agent doesn't reach the page at all. I think most of this is unintentional, a default or a blanket rule, not a decision to keep agents out.&lt;br&gt;
&lt;em&gt;Wrong if:&lt;/em&gt; almost nobody restricts agents, or the ones who do clearly meant to.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Scores will be all over the map. (Pretty sure.)&lt;/strong&gt;&lt;br&gt;
Overall, scores will spread wide rather than cluster, because there is no settled standard for agent readiness yet. The rules are months old, and how many of us outside developer-tool companies have adopted them? When the rules are this new, sites can't have converged on them.&lt;br&gt;
&lt;em&gt;Wrong if:&lt;/em&gt; scores cluster tightly in one band, which would mean sites are already converging without trying.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7. Good spec, wrong doorstep. (Already seen it, so not a blind call.)&lt;/strong&gt;&lt;br&gt;
Unlike the other six, this one isn't blind. I'd already seen it while picking the sites. Of the ten specs I confirmed, nine live in a GitHub repo, not on the company's own site. Only one company serves its spec from its own domain. So I'm logging it as an expectation I already have reason to hold, not a bet I placed before seeing anything. The pattern: companies that ship an API tend to publish a spec, but not where an agent would look for it. The agent arrives at the front door, where it would actually check, and finds nothing. The spec exists, just somewhere the agent would never think to look. Great spec, wrong doorstep.&lt;br&gt;
&lt;em&gt;Wrong if:&lt;/em&gt; more than one or two of the ten turn out to expose their spec at a discoverable path on their own domain after all.&lt;/p&gt;
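
&lt;p&gt;To make Prediction 5 concrete, here is a small sketch of what "blocked at robots.txt" means, using Python's standard-library &lt;code&gt;urllib.robotparser&lt;/code&gt;. It is an illustration, not how Perseus Clew runs its check. The helper &lt;code&gt;agent_can_fetch&lt;/code&gt; is hypothetical, &lt;code&gt;GPTBot&lt;/code&gt; stands in as one example crawler name, and the URL is a placeholder.&lt;/p&gt;

```python
from urllib.robotparser import RobotFileParser


def agent_can_fetch(robots_lines, agent, url):
    # Parse robots.txt content already in hand (no network call in this sketch).
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)


# A blanket rule, the kind that may be a default rather than a decision.
blanket = ["User-agent: *", "Disallow: /"]
# A narrower rule that only fences off one path.
narrow = ["User-agent: *", "Disallow: /admin"]

page = "https://example.com/pricing"
under_blanket = agent_can_fetch(blanket, "GPTBot", page)
under_narrow = agent_can_fetch(narrow, "GPTBot", page)
```

&lt;p&gt;Under the blanket rule the agent is turned away before it sees a single page, which is the accidental lockout I expect to find. Under the narrow rule the same page stays reachable.&lt;/p&gt;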

&lt;h2&gt;
  
  
  How I'll know if this is interesting
&lt;/h2&gt;

&lt;p&gt;I decided this before seeing any numbers, on purpose, so I can't talk myself into a story later. Both outcomes are findings. They just land at different volumes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Loud:&lt;/strong&gt; scores spread out, the groups look clearly different from each other, and at least one result surprises me (government beating startups would do it). Spread plus a surprise is a story that carries the post on its own.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quiet:&lt;/strong&gt; everything clusters in one band and no group stands apart. That's not a dud. It's a finding: no group has pulled ahead on agent readiness yet, and here's the baseline that says so. Quieter post, real result, and the next scan has something to measure against.&lt;/p&gt;

&lt;p&gt;The one thing I won't do is manufacture variance that isn't there. If the data comes back flat, I report it flat. This is a first reading of something six weeks old. A baseline can't fail by coming out flat. It can only fail by lying about its shape.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this is not
&lt;/h2&gt;

&lt;p&gt;Fifty sites, ten per group, is illustrative. It is not a representative sample of the whole web, and the writeup will say so. It shows patterns and examples, not statistical proof.&lt;/p&gt;

&lt;p&gt;I'm scanning the public surface only. For a SaaS product that means the marketing site, the docs, and the API spec, not the app behind the login. Agents meet your product at the public doors long before any login. I'm scanning the doors.&lt;/p&gt;

&lt;p&gt;And I score what exists. A site with no forms doesn't get marked down for forms. So I'll compare group to group one category at a time, like to like, not on one combined number.&lt;/p&gt;
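
&lt;p&gt;A rough sketch of "score what exists," in Python. It assumes each site's result marks a category as &lt;code&gt;None&lt;/code&gt; when there is nothing to check, and it compares groups one category at a time. The function name, the category keys, and every number are made up for illustration. None of this is scan data or the engine's real schema.&lt;/p&gt;

```python
def category_average(results, category):
    # results: list of dicts mapping a category name to a score, or to None
    # when the site has nothing in that category (for example, no forms).
    scores = [r[category] for r in results if r.get(category) is not None]
    if not scores:
        return None  # nothing to compare for this group
    return sum(scores) / len(scores)


# Made-up numbers for illustration only, not scan results.
government = [{"forms": 80, "semantics": 90}, {"forms": None, "semantics": 70}]
indie = [{"forms": 60, "semantics": 50}, {"forms": 40, "semantics": 70}]

gov_forms = category_average(government, "forms")
gov_semantics = category_average(government, "semantics")
indie_forms = category_average(indie, "forms")
```

&lt;p&gt;The government site with no forms simply drops out of the forms comparison instead of dragging the group down with a zero.&lt;/p&gt;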

&lt;h2&gt;
  
  
  I'm scanning my own front door too
&lt;/h2&gt;

&lt;p&gt;Two of the fifty are Dev.to and Devpost. The place you're reading this, and the place I'm submitting it. They're in the content group, scanned like everything else. I'm interested to see how my favorite platforms read to an agent, and their results go in the writeup with everyone else's.&lt;/p&gt;

&lt;p&gt;And the tool has to pass its own scan. Agentis Lux gets pointed at its own site with the same checks I'm aiming at everyone else. I'm running it twice: once now, before the benchmark, and again after, so you can watch my own front door change. The house doesn't get to skip its own inspection.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;The engine is built, tested, and merged: the six frontend checks, the six API checks, the scan handler, the agent-simulation layer, and the batch engine that runs all 50 sites. The public scan reads the frontend. The API checks run inside the benchmark. That split is the point: it's the doors idea built into the architecture. The last step before the scan runs is getting it deployed to production. You can &lt;a href="https://github.com/earlgreyhot1701D/perseus-clew" rel="noopener noreferrer"&gt;follow the build&lt;/a&gt; if you want to watch it happen. When the scan lands, I'll post the results right next to these predictions, the hits and the misses both.&lt;/p&gt;

&lt;p&gt;Until then, the bets are in writing. They're timestamped in the repo, and when the data lands they'll be sitting right where I left them, right or wrong. I built the first version of this idea months ago, and I've been wrong plenty. If you've got a hunch about which one breaks first, well, ready to place your bet?&lt;/p&gt;




&lt;p&gt;&lt;em&gt;AI assisted. Human approved (all bets are mine). Powered by NLP.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;I created &lt;a href="https://agentislux.io" rel="noopener noreferrer"&gt;Agentis Lux&lt;/a&gt; for the purposes of entering &lt;a href="https://h01.devpost.com/" rel="noopener noreferrer"&gt;H0 Hackathon&lt;/a&gt; (Vercel + AWS Databases). #H0Hackathon&lt;/em&gt; &lt;a href="https://devpost.com/software/agentis-lux-for-your-second-audience" rel="noopener noreferrer"&gt;See Agentis Lux's Devpost.com entry&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>buildinpublic</category>
      <category>ai</category>
      <category>aws</category>
      <category>webdev</category>
    </item>
    <item>
      <title>AWS Summit Los Angeles 2026: Why Am I Always Learning the Hard Way?</title>
      <dc:creator>L. Cordero</dc:creator>
      <pubDate>Mon, 15 Jun 2026 01:36:47 +0000</pubDate>
      <link>https://dev.to/earlgreyhot1701d/aws-summit-los-angeles-2026-why-am-i-always-learning-the-hard-way-46lb</link>
      <guid>https://dev.to/earlgreyhot1701d/aws-summit-los-angeles-2026-why-am-i-always-learning-the-hard-way-46lb</guid>
      <description>&lt;p&gt;I walked into the Kiro lab ready to learn.&lt;/p&gt;

&lt;p&gt;I'd been building a web app (coming soon!) with Kiro for weeks. Next.js on Vercel, API routes talking to DynamoDB, Bedrock handling the AI layer. An &lt;a href="https://h01.devpost.com/" rel="noopener noreferrer"&gt;H0 hackathon&lt;/a&gt; submission with 15 days left on the clock. By then Kiro and I had a rhythm, so the lab wasn't a rescue but a resource. I signed up because I was curious. One saying I keep close: you don't know what you don't know. I went to sharpen how I build, not to confirm I already had it down.&lt;/p&gt;

&lt;p&gt;Clayton Markos was running it, an AWS Senior Technical Instructor. The session had one goal: spec-driven development. And this was new ground. I'd never had Kiro start a project. I scaffold it myself, then bring it in. Letting Kiro generate the structure first was something I hadn't done.&lt;/p&gt;

&lt;p&gt;The task was a weather app. Ninety minutes, build it, deploy it. I was confident I could get it done. Nervous but confident. Building under pressure like this is new to me. 90 minutes? Let's go.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F85gfmwtyxhvtw33om65m.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F85gfmwtyxhvtw33om65m.jpg" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then I watched how the build was supposed to go.&lt;/p&gt;

&lt;p&gt;Spec first. Not a vague prompt and a prayer, an actual spec. The what, the constraints, the boundaries, written down before Kiro touched a line. Then Kiro works inside that.&lt;/p&gt;

&lt;p&gt;I'd been doing the upfront work. Just not in the shape Kiro wants it. I build with Claude as my architecture and build assistant, so I had docs. Plenty of them. But they were Claude docs. Reasoning and notes written for me to read, not specs written for Kiro to build from. Kiro's strength is spec-driven. I'd been handing it Claude-shaped prose and asking it to infer the spec. In some places that worked. In others it drifted, because I'd given it room and no edges.&lt;/p&gt;

&lt;p&gt;I built the weather app. Deployed it by the end of the lab. It shipped, same as my builds always ship. Thankfully. But I didn't come to ship a weather app. I came to learn. And the lab handed me the question underneath all of it. Am I using these tools to their full capability? Do I understand how they work? Am I building so my projects can succeed?&lt;/p&gt;

&lt;h2&gt;
  
  
  The spec is a tension, not a setting
&lt;/h2&gt;

&lt;p&gt;The lab reframed how I get the most out of Kiro. It isn't tighter control or looser reins. It's the spec, and how I develop it. That's one of the levers that decides how well the project holds up.&lt;/p&gt;

&lt;p&gt;Too rigid, and Kiro has no room to make a good call. You've pre-decided everything, including the parts you shouldn't have, and now it's a very expensive autocomplete. Too loose, and it drifts. It fills the gaps with its own guesses and you spend your time pulling it back.&lt;/p&gt;

&lt;p&gt;The sweet spot is narrow. The spec defines the what and the constraints. Kiro decides the how. I already had a version of this in my steering doc, a rule that says propose before you build, ask before you assume. I just hadn't connected it to the spec the way the lab did.&lt;/p&gt;

&lt;h2&gt;
  
  
  The learning curve tax
&lt;/h2&gt;

&lt;p&gt;I'm self-taught. My first prototype was a jury eligibility chatbot, and I started it before I knew what an API was. The whole time, one question. Can I make this work? Turns out I could. I demoed it to my boss at the time.&lt;/p&gt;

&lt;p&gt;Not much has changed. I still pick up a tool by using it, usually with AI in the loop, usually inside something I've already shipped or am racing to ship. The understanding of how to guide the build shows up late, a beat after I needed it. This one is filed in my lessons-learned doc for the next projects, the &lt;a href="https://dev.to/challenges/june-game-jam-2026-06-03"&gt;June Game Jam&lt;/a&gt; and &lt;a href="https://hackthekitty.com/" rel="noopener noreferrer"&gt;Hack the Kitty&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;That's the tax. You don't know what you don't know, so you can't plan around it. You build, and you let the gaps introduce themselves, one expensive and time-consuming surprise at a time.&lt;/p&gt;

&lt;p&gt;And so far, I keep paying it. I haven't found a way to skip the tax or leap the learning curve, at least not one that's mine. Hence, the lab. &lt;/p&gt;

&lt;h2&gt;
  
  
  The rest of the day kept pointing at the same thing
&lt;/h2&gt;

&lt;p&gt;The Anthropic talk, "Effective Context Engineering for AI Agents," was standing room only. Every seat gone, people on the floor along the walls because there was nowhere else to put them. Turns out context about context is in high demand! Worth it.&lt;/p&gt;

&lt;p&gt;Jacqueline Garrahan, Technical Staff at Anthropic, framed the shift for me. Prompt engineering used to be two pieces: &lt;code&gt;system_prompt&lt;/code&gt; and &lt;code&gt;user_message&lt;/code&gt;. Write good instructions, get good output. Context engineering is the wider job: &lt;code&gt;system_prompt&lt;/code&gt;, &lt;code&gt;tool_definitions&lt;/code&gt;, &lt;code&gt;retrieved_documents&lt;/code&gt;, &lt;code&gt;tool_results&lt;/code&gt;. Everything the model can see before it answers. Your prompt is one input now, not the whole show.&lt;/p&gt;
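
&lt;p&gt;Here is how I picture that shift, as a loose Python sketch. The field names come from the framing above. The &lt;code&gt;Context&lt;/code&gt; class, its &lt;code&gt;parts&lt;/code&gt; method, and the example strings are my own hypothetical illustration, not anything shown in the talk or any real API.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    # Field names follow the framing above; the class itself is hypothetical.
    system_prompt: str
    tool_definitions: list = field(default_factory=list)
    retrieved_documents: list = field(default_factory=list)
    tool_results: list = field(default_factory=list)

    def parts(self):
        # Everything the model can see before it answers, as one flat list.
        return [
            self.system_prompt,
            *self.tool_definitions,
            *self.retrieved_documents,
            *self.tool_results,
        ]


# Placeholder strings for illustration only.
prompt_only = Context(system_prompt="You are a site scanner.")
full = Context(
    system_prompt="You are a site scanner.",
    tool_definitions=["fetch_url(url)"],
    retrieved_documents=["robots.txt contents"],
    tool_results=["fetch_url returned 200"],
)
```

&lt;p&gt;In the second case the prompt is one item out of four. The other three shape the answer just as much, and they are the ones I wasn't writing on purpose.&lt;/p&gt;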

&lt;p&gt;Half of what I'd picked up in the Kiro lab was the same idea. What you feed the tool decides what it hands back.&lt;/p&gt;

&lt;p&gt;After the "Prompt to Production: AWS Database Integration in Vercel v0" presentation, I did something that is not like me.&lt;/p&gt;

&lt;p&gt;Hedieh Zandi, a Vercel Product Lead and an H0 sponsor, had just presented. The stack she walked through was the one I'd built my submission on. Next.js on Vercel, API routes straight to DynamoDB, Bedrock for the AI layer. So I walked up and introduced myself. Told her I'd entered the hackathon she'd just been presenting on. That I was watching her present the stack I built on.&lt;/p&gt;

&lt;p&gt;It was the first time I'd explained one of my projects out loud to people who do this for a living. I was nervous. The kind of nervous where you hear your own voice and it sounds like someone else's. I did it anyway, and I'm glad I made myself. One hurdle down.&lt;/p&gt;

&lt;h2&gt;
  
  
  So why the hard way
&lt;/h2&gt;

&lt;p&gt;And that's the question I keep landing on. I don't have a buttoned-up answer, and that bums me out a little.&lt;/p&gt;

&lt;p&gt;Here's what I do have. AI-assisted coding, vibe coding, whatever you want to call it, has moved fast. I started by typing "build a jury chatbot prototype (no mistakes, lol)" into a chat box. Now I'm doing end-to-end spec-driven development. That's not a learning curve. It's closer to a free fall. You learn on the way down, and it leaves scar tissue. It's just how I've done all of it.&lt;/p&gt;

&lt;p&gt;Labs like "Structured Approach to AI Coding with Spec-Driven Development on Kiro" are the net. They find the blind spots I can't see from inside my own workflow. They humble me. Then they hand me enough to build the next thing with a little more confidence than the last.&lt;/p&gt;

&lt;p&gt;I've still got 15 days until submission. Back to the spec.&lt;/p&gt;

&lt;p&gt;AI Assisted. Human Approved. Powered by NLP.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>aws</category>
      <category>learning</category>
    </item>
    <item>
      <title>An Instagram ad promised me a free AI course. Was it a scam?</title>
      <dc:creator>L. Cordero</dc:creator>
      <pubDate>Sun, 07 Jun 2026 16:50:44 +0000</pubDate>
      <link>https://dev.to/earlgreyhot1701d/an-instagram-ad-promised-me-a-free-ai-course-was-it-a-scam-597</link>
      <guid>https://dev.to/earlgreyhot1701d/an-instagram-ad-promised-me-a-free-ai-course-was-it-a-scam-597</guid>
      <description>&lt;p&gt;&lt;strong&gt;A free AI course promoted by Instagram? I almost scrolled right past it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The algorithm dropped it in my feed, and my instinct was the one I've built up over years of dodging nonsense online. Is this real, or is someone about to take my information?&lt;/p&gt;

&lt;p&gt;Then I read the pitch. A free, one-week AI literacy course for any "American worker," taught entirely over text. "No laptop or internet needed. Just your phone." Well. I'm an American worker, I want to learn about AI, and free is very much my price point. Real or fake, I clicked anyway.&lt;/p&gt;

&lt;h2&gt;
  
  
  So what is it?
&lt;/h2&gt;

&lt;p&gt;It's an initiative called &lt;a href="https://beta.dol.gov/ai-ready" rel="noopener noreferrer"&gt;AI Ready&lt;/a&gt; from the U.S. Department of Labor, and the lessons call themselves your AI 101 course. One week, ten minutes a day, delivered entirely by text message. You text READY to 20202 and it runs like a text thread with a patient teacher. A short lesson arrives, it asks you a question, you reply, and the next piece comes back. That's the whole setup. No app to download, no account to build, no laptop required.&lt;/p&gt;

&lt;p&gt;My skeptic still wasn't satisfied. Why is the DOL handing this out for free? Why hadn't I seen a single piece of media about it? I went digging before I trusted it. It checked out, so I signed up. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fihv5rpbbwgj7a7ewqd05.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fihv5rpbbwgj7a7ewqd05.jpg" alt=" " width="800" height="1778"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  A quick note for my dev.to fam outside the US
&lt;/h2&gt;

&lt;p&gt;Some of you are reading from outside the US, and I want to flag this early. Enrollment runs through a US text number, so this program may only work inside the US. I haven't tested it on an international phone, so check before you count on it. Even if it isn't available where you are, you might know someone it would reach, and a free, text-only on-ramp is a model worth seeing wherever you build.&lt;/p&gt;

&lt;h2&gt;
  
  
  What won me over
&lt;/h2&gt;

&lt;p&gt;The first lesson landed while I was at work. A gif popped up on my screen, which is a fun way to get someone's attention, and I'll admit it worked. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fozgedcbgpqcxfna8zalw.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fozgedcbgpqcxfna8zalw.gif" alt=" " width="450" height="241"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The format takes the pressure off. Instead of a blank box waiting for you to know what to say, you answer with A, B, or C. Go quiet for a few hours and it sends a gentle nudge to check in. And when my first scheduled time didn't work, I used the chat to reschedule. Small thing, but it told me someone designed this with the user in mind.&lt;/p&gt;

&lt;p&gt;The lessons themselves are short and easy to follow. Bite-size, not overwhelming, which is the part I think a beginner would appreciate most. And the topics are useful right off the bat. Lesson 4 was "the recipe for a great prompt." Lesson 5 was "put AI to work for you." Useful, not abstract. Somewhere in the week I stopped being suspicious and started being impressed. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxwk97mo5xnws1pfizu61.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxwk97mo5xnws1pfizu61.jpg" alt=" " width="800" height="1778"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The last message doesn't just say goodbye. It hands you links to keep going, with starter courses from AWS, OpenAI, and Microsoft, plus a career explorer. That part may matter most to new learners who want to keep the momentum going.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it isn't
&lt;/h2&gt;

&lt;p&gt;There's a ceiling here. One week at ten minutes a day will not make anyone an AI practitioner, and it won't replace building something with your own hands. It's a door, not a destination. The goal is to get someone through the door without scaring them off, and for that it works. &lt;/p&gt;

&lt;p&gt;It also runs through a third-party platform called Arist, and the course tells you your number is used only for the course and not sold. I always check the privacy language before I sign up, and this one states it plainly. I'd tell a family member to do the same.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pass it on
&lt;/h2&gt;

&lt;p&gt;So here's why I'm sharing a beginner course with a room full of builders.&lt;/p&gt;

&lt;p&gt;Through your wonderful articles, I've learned that most of us in this community are past the beginner stage. We're building, shipping, breaking things, writing about it. We may not be the audience for a ten-minute intro course. But every one of us knows someone who is. A parent, a cousin, a coworker who keeps saying they're "behind on AI" and feels it every time they open their phone.&lt;/p&gt;

&lt;p&gt;There's so much noise pointed at people right now. Paid bootcamps, breathless ads, the steady message that they've already missed the train. A free course that lives in their text messages and asks for ten minutes a day is about the least intimidating on-ramp I've seen so far.&lt;/p&gt;

&lt;p&gt;I think part of the work we're all trying to do is reaching back and bringing someone with us. Sharing the free thing. Lowering the barrier for the person who hasn't started. This is an easy one to pass along.&lt;/p&gt;

&lt;p&gt;If that person is in the US, tell them to text READY to 20202, or send them to &lt;a href="https://arist.link/aiready" rel="noopener noreferrer"&gt;arist.link/aiready&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I was impressed. I think they will be too.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;AI assisted. Human approved. Powered by NLP.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>beginners</category>
      <category>devrel</category>
      <category>community</category>
    </item>
    <item>
      <title>My website has two audiences now. I only built for one of them.</title>
      <dc:creator>L. Cordero</dc:creator>
      <pubDate>Sun, 31 May 2026 00:32:22 +0000</pubDate>
      <link>https://dev.to/earlgreyhot1701d/my-website-has-two-audiences-now-i-only-built-for-one-of-them-136m</link>
      <guid>https://dev.to/earlgreyhot1701d/my-website-has-two-audiences-now-i-only-built-for-one-of-them-136m</guid>
      <description>&lt;p&gt;&lt;em&gt;I created &lt;a href="https://agentislux.io" rel="noopener noreferrer"&gt;Agentis Lux&lt;/a&gt; for the purposes of entering &lt;a href="https://h01.devpost.com/" rel="noopener noreferrer"&gt;H0 Hackathon&lt;/a&gt; (Vercel + AWS Databases). #H0Hackathon&lt;/em&gt; &lt;a href="https://devpost.com/software/agentis-lux-for-your-second-audience" rel="noopener noreferrer"&gt;See Agentis Lux's Devpost.com entry&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The conversation about who reads your website has been shifting. Agents are part of it now. ChatGPT fetches URLs. Perplexity reads content. Shopping agents try to complete purchases. Coding agents hit your API. Most of the sites and APIs they touch were built for humans and tested against humans. The agents showed up later and quietly. When they can't figure something out, they don't complain. They just bounce.&lt;/p&gt;

&lt;p&gt;I heard the phrase "second audience" at a hackathon where you.com was one of the hosts. It stuck. That's what agents are: a second audience the web wasn't designed for and isn't being measured against.&lt;/p&gt;

&lt;p&gt;And now, I want to build something about it. A scanner that tells you what an AI agent experiences when it tries to use your website or your API. The internal name is Perseus Clew and the public product is Agentis Lux. The split is intentional: Perseus Clew is the engine name, &lt;a href="https://earlgreyhot1701d.github.io/Clew-Labs/" rel="noopener noreferrer"&gt;part of a suite of AI builder tools&lt;/a&gt;, and Agentis Lux is the product-facing name (Latin for "light of the agent") that describes what agent users see.&lt;/p&gt;

&lt;p&gt;This isn't a launch post. I just finished a docs phase, and I'm about to write code. Before I do, I want to put this in front of dev.to builders and find out what I'm missing.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it will do
&lt;/h2&gt;

&lt;p&gt;Three layers:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deterministic scanning.&lt;/strong&gt; Twelve check categories — six for frontends, six for APIs — looking at HTML, ARIA, structured data, OpenAPI specs, error responses, idempotency patterns. Same input, same score, every time. The methodology will be published, the weights will be public, and anyone can audit it. AI-readiness scoring tools have a reputation for inflating numbers and hiding their methodology, so the trust floor is making everything inspectable. That's the foundation the rest sits on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;An AI-written verdict.&lt;/strong&gt; After the score, a Bedrock call reads the top findings and writes one sentence about what an agent experiences. Something like: "An agent visiting this page can read your product descriptions, but can't tell which button starts checkout, so it can't finish a purchase on its own." That sentence is what a human reads first. The number is the proof underneath it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;An agent simulation.&lt;/strong&gt; Bedrock runs structured tasks against the scanned content and reports back: here's what an agent could do, here's what it couldn't. Turns findings into a story instead of a spreadsheet.&lt;/p&gt;

&lt;p&gt;The score is deterministic. The explanation is AI-generated and labeled as such, and the simulation is the narrative layer on top. Each layer earns its role: math where consistency is important, AI where judgment helps, simulation where you need to know whether an agent can complete a real task.&lt;/p&gt;
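&lt;p&gt;A minimal sketch of what "same input, same score" could look like in practice. The category names and weights here are hypothetical placeholders, not the twelve real categories or the published weights, and the 0 to 100 scale is also an assumption. The point is only that the scoring function is pure: no clock, no randomness, no network.&lt;/p&gt;

```python
# Hypothetical categories and weights, for illustration only.
# The real check categories and weights are not given in this post.
WEIGHTS = {
    "semantic_html": 3,
    "aria_labels": 2,
    "structured_data": 1,
}


def score(results):
    """Return a 0-100 integer score from per-category pass ratios (0.0 to 1.0).

    Pure function: the same results always produce the same score.
    Categories missing from results count as 0.0.
    """
    total_weight = sum(WEIGHTS.values())
    # Sum in sorted key order so the arithmetic never depends on
    # how the input dict happened to be built.
    weighted = sum(WEIGHTS[name] * results.get(name, 0.0) for name in sorted(WEIGHTS))
    return round(100 * weighted / total_weight)


scan = {"semantic_html": 1.0, "aria_labels": 0.5, "structured_data": 0.0}
print(score(scan))
```

&lt;p&gt;Because the weights live in one plain table, publishing the methodology could be as simple as publishing that table next to the function.&lt;/p&gt;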

&lt;h2&gt;
  
  
  What it will look like
&lt;/h2&gt;

&lt;p&gt;Here's the result view — mock data, real design. The locked palette is Lance Wyman-inspired: cream, deep teal, sienna. Typography pairs Archivo Black with Instrument Serif italic for the AI line. Score on the left as one unit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm2vk8rf7ihhuqw5bye3m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm2vk8rf7ihhuqw5bye3m.png" alt=" " width="800" height="584"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The "AI written" tag is intentional. I want a reader to be able to see which part came from a model and which came from math.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why B2B
&lt;/h2&gt;

&lt;p&gt;Free anonymous scan is the front door. The value is the signed-in tier: track your score over time, see your delta after you ship changes, eventually run it in CI so every PR goes through Agentis Lux before merge.&lt;/p&gt;

&lt;p&gt;I'm entered in the H0 hackathon B2B track (deadline June 29) — submission at &lt;a href="https://h01.devpost.com/?ref_content=default&amp;amp;ref_feature=challenge&amp;amp;ref_medium=portfolio&amp;amp;_gl=1*e0a3or*_gcl_au*NTUwNTc0MTcwLjE3ODAxODMyMTM.*_ga*MzIwNzUxNTg4LjE3NzIzOTgyMjE.*_ga_0YHJK3Y10M*czE3ODAxODMyMTEkbzIxJGcxJHQxNzgwMTgzMjEzJGo1OCRsMCRoMA" rel="noopener noreferrer"&gt;h01.devpost.com&lt;/a&gt; if you want to follow along.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stack
&lt;/h2&gt;

&lt;p&gt;Next.js on Vercel. API routes handle scan initiation and direct DynamoDB reads. AWS Lambda for the scan engine. Bedrock for the AI pieces. DynamoDB for benchmark data, plus ephemeral 24-hour TTL results so shareable links work. EventBridge for monthly benchmark refresh. CDK in TypeScript for the AWS side. Docker so it all runs locally.&lt;/p&gt;

&lt;p&gt;Two DynamoDB tables by design: a 15-minute URL-hash cache so I'm not hammering target sites, and a 24-hour opaque-id result store for shareable links. Different keys, different lifetimes, different purposes. I went back and forth on merging them. The lifetimes don't match, so separate won.&lt;/p&gt;
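&lt;p&gt;Here is a rough sketch of how the two item shapes might differ. The attribute names (&lt;code&gt;url_hash&lt;/code&gt;, &lt;code&gt;result_id&lt;/code&gt;, &lt;code&gt;expires_at&lt;/code&gt;), the SHA-256 choice, and the whitespace-only URL normalization are all assumptions for illustration. It only builds plain dicts and makes no AWS calls. What it does rely on is that DynamoDB's TTL feature reads an expiry time stored as epoch seconds in a number attribute.&lt;/p&gt;

```python
import hashlib
import secrets
import time

CACHE_TTL_SECONDS = 15 * 60        # 15-minute scan cache
RESULT_TTL_SECONDS = 24 * 60 * 60  # 24-hour shareable result


def cache_item(url, score, now=None):
    """Item for the cache table: the same URL always maps to the same key."""
    now = int(time.time()) if now is None else now
    # Assumption: trimming whitespace is the only normalization applied.
    url_hash = hashlib.sha256(url.strip().encode("utf-8")).hexdigest()
    return {"url_hash": url_hash, "score": score, "expires_at": now + CACHE_TTL_SECONDS}


def result_item(score, now=None):
    """Item for the result table: a random id that reveals nothing about the URL."""
    now = int(time.time()) if now is None else now
    result_id = secrets.token_urlsafe(16)
    return {"result_id": result_id, "score": score, "expires_at": now + RESULT_TTL_SECONDS}
```

&lt;p&gt;The cache key is derived from the URL so repeat scans can find it, while the result key is random so a shared link cannot be guessed from the URL. That difference, along with the different lifetimes, is one way to read the case for keeping the tables separate.&lt;/p&gt;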

&lt;p&gt;Built with Kiro.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where I'm uncertain — and what I'd love help with
&lt;/h2&gt;

&lt;p&gt;Three things I don't know:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Is the auth stub the right scope, or am I shipping too much, or too little?&lt;/strong&gt; I'm building auth (email magic link), a user table populated on sign-up, and a scan history view that renders with an empty state, but no trend charts and no score-over-time. That sits in the middle of two clean alternatives. Smaller version: skip auth entirely, generate a per-scan email link for retrieval. Bigger version: if I'm shipping auth, ship the trend chart too because that's the actual recurring value. I picked the middle because I think it answers "who cares about one scan" without pulling trend tracking into the MVP. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Would you pay to track your score over time?&lt;/strong&gt; Anonymous one-shot scans are easy. Recurring value is the business question. Be honest: do you, personally, building what you're building, care enough about agent-readiness to want trend tracking?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Findings only, no fix suggestions.&lt;/strong&gt; Right now the product surfaces what agents see — "an agent can't tell your button is a button" — but doesn't tell you what to do about it, because that would mean knowing your codebase. I think this is a feature for builders who want visibility without being told how to fix it. It could also be the thing that makes a frustrated user close the tab. Which is it?&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;Block 0 starts this week. First end-to-end deploy: Next.js on Vercel, scan Lambda on AWS, the result hero rendering from mock data, the DynamoDB write seam stubbed in. I'll post again when the first real scan runs against a real URL.&lt;/p&gt;

&lt;p&gt;If you're building something adjacent, or you have opinions about agent-readiness as a category, I want to hear from you.&lt;/p&gt;




&lt;p&gt;AI assisted. Human approved. Powered by NLP.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Created for the H0 Hackathon. #H0Hackathon&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>agents</category>
      <category>showdev</category>
    </item>
    <item>
      <title>I Built a Hermes Agent to Tell Me Which Hackathons to Enter. It Told Me to Enter This One.</title>
      <dc:creator>L. Cordero</dc:creator>
      <pubDate>Sun, 24 May 2026 20:24:30 +0000</pubDate>
      <link>https://dev.to/earlgreyhot1701d/i-built-a-hermes-agent-to-tell-me-which-hackathons-to-enter-it-told-me-to-enter-this-one-jh2</link>
      <guid>https://dev.to/earlgreyhot1701d/i-built-a-hermes-agent-to-tell-me-which-hackathons-to-enter-it-told-me-to-enter-this-one-jh2</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/hermes-agent-2026-05-15"&gt;Hermes Agent Challenge&lt;/a&gt;: Build With Hermes Agent&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;I started AI-assisted building in July 2025. Self-taught, learning as I go. Hackathons and challenges turned out to be a big part of how I got better: each one is a structured lesson in building something inside a deadline. The &lt;a href="https://dev.to/challenges"&gt;DEV challenge feed&lt;/a&gt; became one of my regular places to look.&lt;/p&gt;

&lt;p&gt;But the more I leaned on it, the more I kept hitting the same question: how do I decide which challenge to enter? It is not trivial. Entering the wrong one costs days I do not get back. And I cannot always tell from a challenge page alone whether it fits the stack, tools, or style I build with, whether there is enough runway left, or whether the learning is worth the hours. The feed does not sort itself by &lt;em&gt;worth it&lt;/em&gt;. So I either check it obsessively, miss things, or enter on a hunch and hope for the best.&lt;/p&gt;

&lt;p&gt;Vigil Crest is the filter I wanted. It is a challenge triage agent I talk to on Telegram. I message it, it browses the live challenge feed, and it sends back a verdict on each active challenge. For every one it gives me four short fit lines (Time, Learning, Stack fit, and Timing), a paragraph of reasoning, and a call: enter, skip, or maybe.&lt;/p&gt;

&lt;p&gt;One thing I prioritized was avoiding false familiarity. Vigil Crest does not pretend to know me better than it does. Early on it labels its verdicts as first impressions and tells me what it is unsure about. It is built to get better at reading my judgment the more I check in with it. A verdict that knows how sure it is beats a confident guess. That is the core of the whole design, not a disclaimer bolted on the end.&lt;/p&gt;

&lt;p&gt;It is named Vigil Crest because it keeps watch from a height and reports what it sees.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;The whole thing happens in Telegram. Here is a full run.&lt;/p&gt;

&lt;p&gt;It starts with the nudge. Twice a week Vigil Crest sends a short reminder to check the board. I reply when I am ready, and it goes to work: it browses the live challenge feed and reports back what it found.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdwvbenwzgu15gu4gzs8r.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdwvbenwzgu15gu4gzs8r.jpg" alt=" " width="800" height="1778"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then it triages each one. Here is its verdict on the Hermes Agent Challenge itself. Notice the two gauges, STACK FIT and WORTH IT, and that it reasons about deadline and competitive angle rather than just reading the prize off the page.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F01rq476faaozmt2wswqg.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F01rq476faaozmt2wswqg.jpg" alt=" " width="800" height="1778"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It is just as willing to tell me to skip. Two of the active challenges closed the same night. Vigil Crest passed on both, and said why: a rushed entry will not compete with work done over several days.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1xi31oprvfxz2xzp101j.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1xi31oprvfxz2xzp101j.jpg" alt=" " width="800" height="1778"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv1cvwjmykzyvg7bb07a6.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv1cvwjmykzyvg7bb07a6.jpg" alt=" " width="800" height="1778"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And when something is neither a clear yes nor a clear no, it says so. The GitHub Finish-Up-A-Thon has a relaxed deadline but a hard requirement that does not map cleanly to my work, so it lands on maybe, with the reasoning attached.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foct8pxgoop8hop2c3n86.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foct8pxgoop8hop2c3n86.jpg" alt=" " width="800" height="1778"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Four challenges, four verdicts, each one hedged and explained. That is a single check-in.&lt;/p&gt;

&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/earlgreyhot1701D/vigil-crest" rel="noopener noreferrer"&gt;github.com/earlgreyhot1701D/vigil-crest&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Vigil Crest is not a clone-and-run app. It is a configured Hermes instance. The repo gives you the recipe and the portable parts: the skills, the persona and stack templates, and the build guide. Every user supplies their own persona and stack, because the tool grades against one specific person's judgment.&lt;/p&gt;

&lt;h3&gt;
  
  
  My Tech Stack
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hermes Agent&lt;/strong&gt; runs the agent, the persona, the skills, and the schedule.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS Bedrock&lt;/strong&gt; (Claude Sonnet 4.6) for the model, reached through an EC2 instance role so there are no credentials sitting on the box.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EC2&lt;/strong&gt; (Ubuntu, t3.micro) as the always-on host for the Hermes gateway.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Telegram&lt;/strong&gt; as the interface. I talk to Vigil Crest like any other contact.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Playwright&lt;/strong&gt; for live browsing of the challenge feed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt; public repos as the source for the auto-refreshing part of the stack file.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How I Used Hermes Agent
&lt;/h2&gt;

&lt;p&gt;Hermes Agent is the whole spine of this. A few capabilities carried most of the weight.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The persona did the judging.&lt;/strong&gt; Hermes' SOUL.md is where Vigil Crest learned how I pick challenges. Not a generic "rank these by prize money" prompt, but the criteria I use: that excitement is a prerequisite, that time is a genuine cost, that a writing track can lower the barrier when a build track is out of reach, that skipping is a legitimate choice. The agentic part is that the persona reasons from those principles instead of running a scoring formula.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Skills made the triage repeatable.&lt;/strong&gt; I wrote a triage-challenge skill that browses the live feed, reads the active challenges, and produces the four fit lines and a verdict for each. I also wrote a refresh-stack skill that reads my public GitHub repos and keeps the Languages section of the stack file current. The stack file is split by provenance: the part GitHub can verify is auto-refreshed, the part it cannot (frameworks, cloud, tooling) is hand-curated and marked as such. The agent grades stack fit against that file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The browser tool, used deliberately.&lt;/strong&gt; Vigil Crest always browses the challenge feed directly. Search results cache stale states and will tell you there are no active challenges when four are live. The skill reads the rendered page, the active section only.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The model is Claude Sonnet 4.6, served through Bedrock.&lt;/strong&gt; I picked Sonnet over Opus deliberately: Vigil Crest is meant to run often, and Sonnet 4.6 is strong enough for the triage reasoning while keeping an always-available agent affordable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Not all of it worked&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I built Vigil Crest to also run autonomously: a scheduled job, a change-detection pre-check, the agent waking on its own when something new appeared. It ran into a wall I could not get around. The pre-check script reads the feed with a headless browser, and a headless browser is environment-sensitive. It launches fine when I run the script myself and fails to launch inside the scheduler's background process. The script's safety guardrail then correctly reports nothing new, and the scheduler discards the script's error output on a clean exit, so the failure was invisible until I dug for it.&lt;/p&gt;

&lt;p&gt;The clean fix is an API-based pre-check, since a plain JSON request behaves the same in a shell and a scheduler. The DEV API is rich, but it does not yet expose challenges as their own endpoint, so a pre-check built on it would need some experimentation. That is a v2 item, and it is written up plainly in the repo.&lt;/p&gt;
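&lt;p&gt;To make the failure mode concrete, here is a small sketch of a change-detection pre-check that treats "could not read the feed" as its own outcome instead of folding it into "nothing new". This is not the script in the repo. The function names, the status strings, and the idea of fingerprinting a list of challenge slugs are all hypothetical, and &lt;code&gt;fetch_slugs&lt;/code&gt; stands in for whatever reads the feed, browser or API.&lt;/p&gt;

```python
import hashlib
import sys


def fingerprint(slugs):
    """Stable hash of the active challenge list, independent of order."""
    joined = "\n".join(sorted(slugs))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def precheck(fetch_slugs, last_fingerprint):
    """Return (status, fingerprint).

    status is "changed", "unchanged", or "check_failed".
    A fetch error is reported as its own status, never as "unchanged",
    and the previous fingerprint is kept so the next run can still compare.
    """
    try:
        slugs = fetch_slugs()
    except Exception as exc:
        print(f"precheck could not read the feed: {exc}", file=sys.stderr)
        return "check_failed", last_fingerprint
    current = fingerprint(slugs)
    if current == last_fingerprint:
        return "unchanged", current
    return "changed", current
```

&lt;p&gt;A caller could map &lt;code&gt;check_failed&lt;/code&gt; to a non-zero exit code. Whether a given scheduler would then surface the error is something to verify rather than assume.&lt;/p&gt;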

&lt;p&gt;What I shipped is Vigil Crest as a check-in correspondent. A browser-free nudge runs on a schedule and sends me a short reminder twice a week. The triage happens when I reply. I think this suits the project better than the autonomous version would have. An agent whose value is learning my judgment is probably better off in conversation with me than broadcasting into the void on a timer, since a check-in gives it something to learn from and a timer does not.&lt;/p&gt;

&lt;p&gt;I did not go looking for that reframe. I went looking for a workaround and found a better design instead. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;As for this challenge:&lt;/strong&gt;&lt;br&gt;
Vigil Crest looked at the feed, weighed it, and said enter. I am taking its advice.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;AI assisted. Human approved. Powered by NLP.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>agents</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Build Club Week Four: the part of Themis Lex I never explained</title>
      <dc:creator>L. Cordero</dc:creator>
      <pubDate>Sat, 23 May 2026 19:00:21 +0000</pubDate>
      <link>https://dev.to/earlgreyhot1701d/build-club-week-four-the-part-of-themis-lex-i-never-explained-20i6</link>
      <guid>https://dev.to/earlgreyhot1701d/build-club-week-four-the-part-of-themis-lex-i-never-explained-20i6</guid>
      <description>&lt;p&gt;I shipped an AI readiness self-check for California court staff, &lt;a href="https://themislex.org" rel="noopener noreferrer"&gt;ThemisLex.org&lt;/a&gt;. Here is the why behind it. The problem, the design choices, and what four weeks of building in public actually gave me.&lt;/p&gt;




&lt;p&gt;I've written a few posts about building Themis Lex. The deploy war story got the most attention. What I never did was stop and explain why the thing exists in the first place.&lt;/p&gt;

&lt;p&gt;So before the Women in AI Accelerator wraps, here it is. Why Themis Lex. Why I built it the way I did. And what four weeks of building in public actually gave me.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;I work in California courts. I'm a Judicial Services Manager at SBSC. The data my colleagues handle every day is some of the most regulated in the state. Case parties. Witnesses. Victims. Jurors. Sealed records. Personnel files.&lt;/p&gt;

&lt;p&gt;Every think piece this year tells court staff we should be using AI. Not one of them mentions what we actually touch.&lt;/p&gt;

&lt;p&gt;Generic AI guidance is written for a marketing team or a software engineer. It says "redact PII" and moves on. It does not know what a Judicial Assistant does in a courtroom. It does not know that chain of custody is a legal concept with real consequences, not a best practice you can bend. It does not fit the procedural and ethical context that defines court work.&lt;/p&gt;

&lt;p&gt;So a court employee reads the guidance, looks at their actual desk, and still has no answer to the only question that matters. Which part of my job can this help with, and which part must it never touch.&lt;/p&gt;

&lt;p&gt;Themis Lex is the thing I wished existed when I sat down at my desk.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it does
&lt;/h2&gt;

&lt;p&gt;Three inputs. One output. Nothing stored.&lt;/p&gt;

&lt;p&gt;You pick your role, one of three California Superior Court classifications. You describe your workflow in your own words. You pick a data sensitivity level. Themis Lex gives you back two columns. Where AI can safely help you. Where AI must not touch your work. Every item comes with the reason behind it and a plain-language guardrail. Not "redact PII." Instead, "don't paste case numbers or party names into the AI."&lt;/p&gt;

&lt;p&gt;The result downloads as a PDF you can hand to your supervisor.&lt;/p&gt;

&lt;p&gt;No account. No login. No data stored. You arrive, you get your answer, you leave.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I built it this way
&lt;/h2&gt;

&lt;p&gt;A few choices were deliberate. They are the ones I would defend in a room.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stateless, on purpose.&lt;/strong&gt; No database. No accounts. No session storage. For a tool that asks court staff to describe their workflow, "we keep nothing" is not a missing feature. It is the feature. Trust is built into the architecture instead of promised in a policy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The output is the product.&lt;/strong&gt; Not the website. The PDF. Court work runs on paper artifacts and approvals, so Themis Lex produces something a court employee can hand to their supervisor without apologizing for how it looks. It was built to belong in a courthouse.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real role context, not generic advice.&lt;/strong&gt; Under the hood, every assessment combines three things: California judicial branch AI governance principles, the actual public job description for the role you picked, and your own description of your workflow. The job description is what keeps the output from drifting generic. It tells the model what your role actually involves, so the guidance lands on your real job and not an invented one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Honest scope.&lt;/strong&gt; Three roles in version one, not thirty. The rest are visible in the tool with a clear "pending" label. I would rather ship three roles that are right than thirty that are guesses.&lt;/p&gt;

&lt;h2&gt;
  
  
  What four weeks of building in public gave me
&lt;/h2&gt;

&lt;p&gt;Here is the part I did not expect.&lt;/p&gt;

&lt;p&gt;I am not a traditional engineer. I am self-taught. Two years ago I worked in jury services and could not have told you what an IAM role was. Building this alone, the technical decisions were the scariest part. Is the architecture sound. Is the output structured the right way. Am I missing something obvious that a trained engineer would catch on sight.&lt;/p&gt;

&lt;p&gt;The Build Club community is what carried me through that. Posting every week meant I was not deciding in a vacuum. People asked the technical questions I did not know to ask myself. They helped me think through the architecture and the shape of the output. And it mattered more than I expected to hear that the project resonated with people, that the problem was real to them too.&lt;/p&gt;

&lt;p&gt;Building in public sounds like a marketing tactic. For me it was not. The weekly check-ins kept me honest and kept me moving. The feedback made the product better. The community made the technical fear smaller.&lt;/p&gt;

&lt;p&gt;That is what I am taking from these four weeks. Not "I shipped a tool," although the tool is real and I am proud of it. The thing I will keep is the proof that I do not have to build alone to build something worth using.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;Themis Lex is live at &lt;a href="https://themislex.org" rel="noopener noreferrer"&gt;ThemisLex.org&lt;/a&gt;. If you work in or near a court, pick your role and run it. If you don't, run it anyway and tell me where the logic breaks.&lt;/p&gt;

&lt;p&gt;One honest note to set expectations. This is a proof-of-concept MVP, not a finished product. Three roles, one court's job descriptions, a single model call. It does the core thing well and it stops there on purpose. I would rather you meet it as what it is than have me oversell what it isn't. Feedback is welcome, the critical kind most of all.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built solo for the Women in AI Accelerator with &lt;a href="https://buildclub.ai/" rel="noopener noreferrer"&gt;Build Club&lt;/a&gt;. Thanks to &lt;a href="https://www.linkedin.com/in/annieliaoo/" rel="noopener noreferrer"&gt;Annie Liao&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/carolineciaramitaro/" rel="noopener noreferrer"&gt;Caroline Ciaramitaro&lt;/a&gt;, who run a community that is generous with its time and sharp with its feedback. And thanks to my wife, who kept asking "are you almost done" with exactly the right amount of love.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;AI Assisted. Human Approved. Powered by NLP.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0035hmhm1jw031wtgazf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0035hmhm1jw031wtgazf.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>womenintech</category>
      <category>civictech</category>
      <category>buildinpublic</category>
      <category>ai</category>
    </item>
    <item>
      <title>A Builder in Paris: Do Devs Dream of Électrique Chats?</title>
      <dc:creator>L. Cordero</dc:creator>
      <pubDate>Tue, 19 May 2026 21:48:59 +0000</pubDate>
      <link>https://dev.to/earlgreyhot1701d/a-builder-in-paris-do-devs-dream-of-electrique-chats-3hd9</link>
      <guid>https://dev.to/earlgreyhot1701d/a-builder-in-paris-do-devs-dream-of-electrique-chats-3hd9</guid>
      <description>&lt;p&gt;&lt;strong&gt;Six days in Paris, one closed laptop, and a hackathon idea I did not mean to have.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It has been rainy and cold every day we've been here, and it couldn't be more perfect.&lt;/p&gt;

&lt;p&gt;The rain seeps under your layers in a way that surprises you. Crowds thin out, everything looks clean, and Paris in the rain turns out to be romantic in a way I didn't expect to be true. We keep telling each other it feels like we're inside a movie scene, and that we cannot believe we are lucky enough to be here.&lt;/p&gt;

&lt;p&gt;The weather is the funny part. We brought rain gear we did not use in Dublin in September 2024, and used it more in Paris than we ever did in Ireland. We did not bring enough warm layers, because we believed Paris in May would be warm. Two cities, two wrong predictions, both wrong in the right direction.&lt;/p&gt;

&lt;p&gt;I came off a build sprint right before we left, finishing a demo for a buildclub.ai submission, which I do not need to tell anyone is one of my least favorite parts of shipping. Got on the plane, closed the laptop, and didn't look back. Today is May 19, which means six days off the laptop. No building, no brainstorming, no LLM conversations, just Instagram scrolling, checking NBA playoff scores, and taking more pictures than I will ever organize.&lt;/p&gt;

&lt;p&gt;I didn't know I needed it until I had it.&lt;/p&gt;

&lt;h2&gt;
  
  
  On not building
&lt;/h2&gt;

&lt;p&gt;Getting on the off-ramp was easier than I expected, partly because I let work go almost completely, and partly because for the first time in a long time I do not have another project deadline waiting for me when I land. I should also confess that I sprinted to finish work before I sprinted to the airport, twenty-two hours in two days to clear my desk, which is not my finest professional moment, but it did mean that when I closed the laptop, there was nothing pulling me back.&lt;/p&gt;

&lt;p&gt;There was no cinematic moment where I felt the sprint end. It was more that I finished the task, looked up, and noticed that for once there was no next thing.&lt;/p&gt;

&lt;p&gt;My brain did not go quietly into that good night, exactly. AI news kept showing up in my feeds. I saw something about the Elon versus Sam Altman lawsuit and chose not to read it, which is its own small victory. But I stopped reaching back for it. &lt;/p&gt;

&lt;p&gt;On day three of our trip the email about &lt;a href="https://hackthekitty.com/" rel="noopener noreferrer"&gt;The Coding Kitty hackathon&lt;/a&gt; landed in my inbox, and from there my mind started to wander on its own time. Another build percolating, ideas drifting in and out, wandering feet and a wandering mind. Different from building. Adjacent to it.&lt;/p&gt;

&lt;p&gt;My wife said at one point that she appreciated my undivided attention, which is a generous way of pointing out that I usually have at least one hand on a QWERTY keyboard. She was not wrong. I had not realized how much of my attention had been getting routed through a screen until the screen wasn't there.&lt;/p&gt;

&lt;h2&gt;
  
  
  On walking, reading, and thinking
&lt;/h2&gt;

&lt;p&gt;The thing about being off a screen for six days is that you do not stop thinking. You just think differently. Walking does some of the work, reading does some of the work, and the rest happens in the spaces between the two, waiting at a crosswalk, sitting down for a coffee, the moment between closing the book and looking up.&lt;/p&gt;

&lt;p&gt;I have been reading Dan Brown's &lt;em&gt;The Secret of Secrets&lt;/em&gt; on this trip, which turns out to be apropos in a way I did not plan. The novel is about consciousness, whether it lives inside the brain or whether the brain is more like a receiver tuning into something larger. &lt;/p&gt;

&lt;p&gt;Brown spends a fair amount of the book on the sheer scale of what is happening between our ears: three pounds of tissue, eighty-six billion neurons, more compute than any data center on earth. The book is not really about AI, but it is impossible to read it as a builder in 2026 and not feel the question hovering. We are pouring billions of dollars into making machines do something our own grey matter does on a baguette and a glass of vin rouge.&lt;/p&gt;

&lt;p&gt;So I would read a chapter, close the book, and walk. Or I would walk, stop, and the book would surface. We have walked an absurd amount on this trip (nine miles in one day was the high-water mark), and I cannot tell you which idea arrived during which walk, because that is not how it worked.&lt;/p&gt;

&lt;p&gt;The walking, the reading, and my badly broken attempts at French were all running together. I have been mashing English, Spanish, and bad French for six days, asking for the bathroom in the wrong language and apologizing in a third, and the not-quite-fluency turns out to be its own form of thinking. Nothing lands cleanly between those three linguistic worlds. Everything has to be reached for. The reaching is the part that wakes the brain up.&lt;/p&gt;

&lt;p&gt;Somewhere in all that walking and reading, an idea for the next hackathon started to form. Not in a flash. More like terroir. The Coding Kitty email on day three was a vine. &lt;em&gt;The Secret of Secrets&lt;/em&gt; was soil. Paris, with its rain and its walking and its borrowed languages and its closed laptop, was the weather that let it grow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Le Click
&lt;/h2&gt;

&lt;p&gt;Back to the hackathon email. I read it, registered that the theme was cat-related, and felt a small deflation. Cats are not my thing. I should have known a hackathon from &lt;em&gt;Coding Kitty&lt;/em&gt; would lean feline, but somehow I had not put it together. I closed the email and assumed I was out.&lt;/p&gt;

&lt;p&gt;A day or two later I mentioned it to my wife at the Musée de l'Orangerie, in the room with the Monets, because I was excited and could not help myself. She shushed me, lovingly. &lt;em&gt;No AI in the water lilies.&lt;/em&gt; Fair. I shut up and went back to looking at the paintings.&lt;/p&gt;

&lt;p&gt;Here is the thing about me and cats. I am not a cat person. I am allergic to cats. The cat we lived with came as part of the package when I married my wife. Her name was Penelope, and she was my wife's BFF and my long-running frenemy. It took her years to let me pet her, and even then I could barely touch her without my eyes swelling shut or a scratch on the hand. Ninety-nine problems and a cat named Penelope was one of them.&lt;/p&gt;

&lt;p&gt;She passed in February. This is the first trip we have taken where we did not need a cat sitter. We are flying home to a meow-free house, and we both already know how loud that quiet is going to be.&lt;/p&gt;

&lt;p&gt;So when the hackathon email said &lt;em&gt;cats&lt;/em&gt;, I was not the obvious audience. But I had fourteen years of trying to figure out one specific cat, and somewhere between the Orangerie and the Airbnb in the 6th, my brain started turning that into a problem statement.&lt;/p&gt;

&lt;p&gt;Cats are inscrutable. The people who love them are obsessive about understanding them. There is almost no scientific consensus on cat behavior, even among researchers. And humans have a several-thousand-year-old framework for making the unknowable feel readable, which is astrology. Some of it or none of it is real. All of it is useful for naming a feeling. BFF? Frenemy?&lt;/p&gt;

&lt;p&gt;When we got back to the apartment, my wife went to nap. I had an idea and I wanted to push on. I opened the laptop for the first time in six days and started talking to Claude.&lt;/p&gt;

&lt;p&gt;We worked through it. A cat astrology app, but with the deterministic spine doing real work: birth chart math from real ephemeris data, daily nudges tied to actual kitty quirks, behavior logging as the input loop. The astrology is the vocabulary. The structure underneath it is the catnip. &lt;/p&gt;

&lt;p&gt;The name landed in the conversation: Madame Minou. Madame for the fortuneteller persona reading the stars. Minou because it is the French diminutive for cat and it is warm. A little sister to &lt;a href="https://dev.to/earlgreyhot1701d/steep-your-repos-fortune-steeped-in-truth-24ac"&gt;Madame Steep&lt;/a&gt;, the persona I built last month for a fortune-telling app that reads tea leaves over your GitHub repo.&lt;/p&gt;

&lt;p&gt;From the other room I heard my wife wake up. &lt;em&gt;"Are you working?!"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Yes. Yes I am.&lt;/p&gt;

&lt;p&gt;No regrets.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this trip is teaching me
&lt;/h2&gt;

&lt;p&gt;I went to Code with Claude in San Francisco on May 7th. (Ye-yo!) I closed my beloved laptop a few days later, after my buildclub.ai deadline on the 12th, and got on a plane. The London edition is tomorrow, May 20th, and I had a chance to go. I am not going. My wife is supportive, and the seat was mine to take. I chose Paris instead.&lt;/p&gt;

&lt;p&gt;That choice is a quiet relief, actually. Since July of 2025 I have been building, shipping, submitting, winning, and showing up almost continuously, and there have been stretches where the work has felt louder than everything else. &lt;/p&gt;

&lt;p&gt;Saying no to a Claude conference I genuinely wanted to attend, in order to walk around Paris with my wife in the rain, is the kind of choice I am glad I am still able to make.&lt;/p&gt;

&lt;p&gt;But here is the part of the trip that is teaching me in more ways than one. &lt;/p&gt;

&lt;p&gt;Building follows you. It is not always at a keyboard, in a terminal, or inside an LLM conversation. It is in your imagination, your wandering attention, your three-pound brain doing what no data center can do, which is to make connections you did not ask it to make. Six days off the laptop and my brain handed me a hackathon idea I had not been looking for, dedicated to a cat I never quite got to pet.&lt;/p&gt;

&lt;p&gt;I closed the laptop in California. I opened the Chromebook in Paris. In between, I lived a life that was not about building, and the building happened anyway.&lt;/p&gt;

&lt;p&gt;That is the part I want to remember. Building never stops, even when the laptop is shut. Sometimes especially when the laptop is shut.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;In memory of Ms. Penelope Randall. May 2009 to February 17, 2026. Tuxedo. Frenemy. The reason this dev.to article exists.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;AI assisted. Human approved. Powered by NLP.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>buildinpublic</category>
      <category>ai</category>
      <category>hackathon</category>
      <category>learning</category>
    </item>
    <item>
      <title>Code with Claude Extended SF: Heck yeah and then wait, what?</title>
      <dc:creator>L. Cordero</dc:creator>
      <pubDate>Sat, 09 May 2026 17:26:14 +0000</pubDate>
      <link>https://dev.to/earlgreyhot1701d/code-with-claude-extended-sf-heck-yeah-and-wait-what-5dbd</link>
      <guid>https://dev.to/earlgreyhot1701d/code-with-claude-extended-sf-heck-yeah-and-wait-what-5dbd</guid>
      <description>&lt;h2&gt;
  
  
  Heck yeah
&lt;/h2&gt;

&lt;p&gt;I won a golden ticket to Code with Claude Extended (CCE) in San Francisco on May 7th.&lt;/p&gt;

&lt;p&gt;The application said attendees would be selected by lottery. I won the CCE lotto. On April 9th, I got the email: "You're invited." Emphasis on Extended, because demand was so high that Anthropic added a second day.&lt;/p&gt;

&lt;p&gt;The first day of Code with Claude was the "what's new" day. CCE was the "see it in the wild" day. Built for independent developers and early-stage founders. Founders stage. Builder stage. Workshops.&lt;/p&gt;

&lt;p&gt;I was at work when I read the invitation. I had to stop for a second. Once I gathered myself, I couldn't hit the register button fast enough. I'd hate to see what my heart rate was. I wanted to participate so, so badly.&lt;/p&gt;

&lt;p&gt;Heck yeah, I got into CCE.&lt;/p&gt;

&lt;p&gt;Let's gooooooooooo.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wait, what?
&lt;/h2&gt;

&lt;p&gt;Before any workshop started, Boris Cherny gave a talk. The line that hit me first: domain experts can now build their own tools for projects and ideas they have. Iterate, listen, ship. Claude Code launched in February 2025 with a small team, and here we were. Damn cool.&lt;/p&gt;

&lt;p&gt;Then the first workshop started. How We Claude Code, with Thariq Shihipar. Room packed. People standing. Co-opting seats. The wifi was sketchy from the demand.&lt;/p&gt;

&lt;p&gt;The instructor, after a brief overview of the workshop, said "clone the repo and start."&lt;/p&gt;

&lt;p&gt;Everyone around me started typing. The room kept moving.&lt;/p&gt;

&lt;p&gt;I sat there with no frame of reference for what "clone the repo" meant in this workshop context. I'm an AI-assisted builder, not a traditional engineer, and I tend to need instructions to complete steps, not commands. Nobody had handed me the implicit instruction manual that everyone else seemed to have gotten somewhere along the way.&lt;/p&gt;

&lt;p&gt;And me, sitting there at "clone the repo."&lt;/p&gt;

&lt;h2&gt;
  
  
  What I did about it
&lt;/h2&gt;

&lt;p&gt;I opened Claude in my IDE and asked it to clone the repo. It did. Lol.&lt;/p&gt;

&lt;p&gt;Then what? Was I supposed to update files? Run npm? Create a virtual environment? Insert an API key? It was opaque. An outside-looking-in moment.&lt;/p&gt;

&lt;p&gt;So I started asking Claude the questions I actually had:&lt;/p&gt;

&lt;p&gt;Can you explain in plain language what this repo is?&lt;/p&gt;

&lt;p&gt;Can you explain in plain language what the use cases are?&lt;/p&gt;

&lt;p&gt;Can you explain in plain language what the README is asking me to do?&lt;/p&gt;

&lt;p&gt;(Sidebar: those workshop READMEs were fire. I figured that out later, once I had time to read them.)&lt;/p&gt;

&lt;p&gt;As I went through it with Claude in my IDE, my first instinct was: I should build a tool for this. A web app for non-tech audiences who attend AI events trying so hard to keep up and then not quite getting there. Or maybe it's just me, and that's fair too.&lt;/p&gt;

&lt;p&gt;I got distracted trying to design the tool. Started thinking about a PRD. Started thinking about cold start, how to market this problem, who the audience really was.&lt;/p&gt;

&lt;p&gt;Then I went back to my Occam's razor philosophy. Maybe it's not a tool with a PRD and a marketing problem. Maybe it's a prompt. A prompt I build for myself and others that asks Claude to help clone the repo, look at it, really look at it for someone like me, and help parse what the heck it was and what the heck I was supposed to be doing.&lt;/p&gt;

&lt;p&gt;So, no to tool. Yes to prompt.&lt;/p&gt;

&lt;p&gt;I wrote one, Vidi Clew, &lt;a href="https://github.com/earlgreyhot1701D/vidi-clew" rel="noopener noreferrer"&gt;https://github.com/earlgreyhot1701D/vidi-clew&lt;/a&gt;, with Opus 4.7's help, of course. A prompt template I could paste into any fresh Claude conversation in my IDE with the repo open. It told Claude who I was (a plain-language person, AI-assisted builder), what kind of help I'd need, and how to communicate back to me (everyday words, no assumed prerequisites, explain &lt;em&gt;and&lt;/em&gt; walk me through, don't preach).&lt;/p&gt;

&lt;p&gt;That was the product. No website. No app. No install. A prompt and a way to remember it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The prompt
&lt;/h2&gt;

&lt;p&gt;Here it is. Open Claude Code in your IDE with the workshop repo, paste this as your first message, and edit the parts in brackets to match your setup.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;===
Hi Claude, you're going to be my workshop helper today. Here's how I need you to work with me.

WHO I AM
I'm a non-technical person attending a technical workshop. I don't have a CS or engineering background. I process the world in everyday language, not jargon. When I describe things, I'll use the words I have, not the words developers would use. Your job is to meet me where I am.

I'm using a [Windows / Mac] computer. (Edit this so Claude gives you the right step-by-step instructions.)

THE WORKSHOP
[Optional, fill in if you know, leave blank if you don't:]
- Workshop topic: _____
- Repo or materials: _____
- If I don't have these yet, I'll paste them to you partway through when the workshop hands them out. Just keep going from where we are, no need to restart.

WHAT I'LL ASK YOU
Mostly two kinds of questions:
1. "What am I looking at?" when code, files, terms, or windows appear on screen and I don't know what they are.
2. "What am I being asked to do?" when the instructor says something like "clone the repo" or "open a terminal" and I don't know what it means or how to do it.

If I'm so lost I can't even describe what I'm seeing, help me figure out how to ask the question.

HOW I NEED YOU TO ANSWER
1. Plain language, always. Use everyday words. If a technical term is unavoidable, define it in the same sentence ("Vite, that's the tool that runs the website on your computer").
2. Meet me with the words I have. Don't ask me to use the right technical term. Translate my fuzzy description.
3. Assume nothing. Don't say "first, install X" or "open your terminal" without explaining what that means and how to do it on my computer.
4. Explain AND walk me through. When I'm asked to do something, tell me what it means AND give me concrete step-by-step instructions for my computer.
5. Keep me in the room. Quick rescues, not deep lessons. The goal is to get me back to following the workshop, not to teach me everything from scratch.
6. Wait for me to ask. Don't preach or volunteer extra information I didn't ask for.
7. Friendly but not patronizing. I'm not stupid. I just haven't been taught this stuff. Treat me like a smart adult who's missing context.
8. When you ask me a question, give me concrete examples I can choose from. Don't ask open-ended ones if you can ask multiple-choice. "What's on your screen?" is hard. "Is it a black window with text (that's a terminal), a code editor like VS Code, a web browser, or the instructor's slides?" is easy, I just pick the closest one. Plain-language people answer faster when there's a list to pattern-match against. Apply this to every question, not just the first one.
9. Anchor explanations in USE CASES, not just descriptions. When you explain a repo, a tool, a file, or a concept, don't just tell me what it IS, tell me what it's FOR, with a real-world example. "This repo uses Vite and React" is almost meaningless to me. "This looks like the start of a small to-do list app, the kind of thing where you type a task, hit add, and watch it show up in a list. By the end of the workshop you'd have something you could open in a browser." Now I'm oriented. Same for individual pieces: "package.json" isn't "a manifest file declaring dependencies," it's "a list of ingredients this project needs to run, like a recipe." A rundown without use cases leaves me with facts but no picture. Always paint the picture.

START HERE, DON'T JUST SAY "READY"
When I send this message, kick things off by asking me 2 to 3 short orienting questions in plain language, so I have somewhere to start. Always include concrete example answers so I can pattern-match instead of generating from scratch.

Good questions, written the right way:
- "What is the workshop about? Even one sentence, in your own words, like 'AI', 'building websites', or 'honestly, not sure yet'."
- "Has the workshop started yet, or are you still waiting for it to begin?"
- "What's on your screen right now? For example: a black window with text (that's a terminal), a code editor like VS Code, a web browser on a Claude page, the instructor's slides, or something else?"
- "Did the workshop share any links, files, or instructions yet? If so, paste them in. If not, that's fine."

Pick 2 or 3 of these, ask them with the example answers attached, and wait for my responses. Once we're oriented, settle into "wait for me to ask" mode for the rest of the conversation, but keep applying principle 8: every question you ask later should still come with concrete example answers.
===
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Testing it the same day
&lt;/h2&gt;

&lt;p&gt;The next workshop was Ship Your First Managed Agent.&lt;/p&gt;

&lt;p&gt;I opened Claude Code in my IDE with the workshop repo, pasted my prompt as the first message, and went from there. When the instructor said something I didn't understand, I asked Claude in plain language and got a plain-language answer back. The workshop kept moving. I kept moving with it.&lt;/p&gt;

&lt;p&gt;I made it to step one of deploying the agent. Then step two. Then I shipped a working agent. Thirty-four lines of code. The agent could read a 70,000-line log file, call functions on my laptop for live data, and name the specific code commit that caused a fictional outage.&lt;/p&gt;

&lt;p&gt;In the same workshop, someone next to me got stuck. They asked if I could help. I helped them.&lt;/p&gt;

&lt;p&gt;Three hours earlier I'd been worried I was too slow to follow along. Now I was the one helping someone else through the same kind of moment I'd just gotten through myself.&lt;/p&gt;

&lt;p&gt;I had a gay old time.&lt;/p&gt;

&lt;p&gt;That's how I know the prompt worked for me and changed my day.&lt;/p&gt;

&lt;p&gt;The version you see above is already a couple iterations in. After using it, I noticed Claude needed two extra rules to land right for plain-language people: give me multiple-choice options when you ask me questions (open-ended is hard when you're already overwhelmed), and explain things by what they're FOR, not just what they ARE. Both got folded in. The prompt is a living document and I'll keep adjusting it as I find rough edges.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this is for
&lt;/h2&gt;

&lt;p&gt;The prompt isn't fancy. It's a few paragraphs of plain English telling Claude how to be helpful to a plain-language person in a technical room. You can copy it, open Claude Code in your IDE with the workshop repo, and use it at the start of any workshop you walk into. (It adapts to other AI-in-IDE tools with small edits; see the README for notes.)&lt;/p&gt;

&lt;p&gt;It's available in a public repo so anyone can grab it: &lt;a href="https://github.com/earlgreyhot1701D/vidi-clew" rel="noopener noreferrer"&gt;https://github.com/earlgreyhot1701D/vidi-clew&lt;/a&gt;. The official Code with Claude workshop materials are also public, here: &lt;a href="https://github.com/anthropics/cwc-workshops" rel="noopener noreferrer"&gt;https://github.com/anthropics/cwc-workshops&lt;/a&gt;. You can walk through them yourself if you want to try the kind of workshops I was in.&lt;/p&gt;

&lt;p&gt;If you're a plain-language person who's been told "AI is coming for your job" and has no idea what that means, this prompt is for you. If you've ever sat in a technical room and felt the instructor leave you behind at "clone the repo," this prompt is for you. If you've watched everyone else start typing and didn't know what they were typing or why, this prompt is for you.&lt;/p&gt;

&lt;p&gt;It's a small tool. It worked for me three times in one day. I can't say it'll work for everyone. I can say what I saw, which is that an AI-assisted builder walked into a workshop, got stuck, wrote a prompt, used the prompt to follow along, used the prompt to deploy a working agent, and used the prompt to help someone else.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who I am, for context
&lt;/h2&gt;

&lt;p&gt;By the time I walked into Code with Claude Extended I'd been using LLMs since 2023. Over the last six months I'd invested heavily in Claude, both Claude Code and Claude.ai chats, across work, home, travel, and what I'd been calling my AI learning road.&lt;/p&gt;

&lt;p&gt;That road has been mistake-making, learning, and somehow winning hackathons as an AI-assisted builder. My first solo hackathon win in November 2025 was Janus Clew, a dev tool that measures a builder's growth over time. That was the start. Since then I've built sillier things too, like Steep, a deeply unserious repo I shipped for Dev.to's April Fools challenge. Plug in your GitHub repo and Madame Steep reads your repo's fortune through tea leaves.&lt;/p&gt;

&lt;p&gt;So Claude and I run in parallel. That's the description of where I am right now. In six months I might be sprinting alongside another tool. Today, this is the setup.&lt;/p&gt;

&lt;p&gt;That's the context I walked in with. And I still got stuck at "clone the repo."&lt;/p&gt;

&lt;p&gt;That's why I think this prompt was one of the best outputs of my entire day. From what I've seen, the gap doesn't always close just because you've been at it a while. The gap closes when you have something in your pocket that translates the room for you.&lt;/p&gt;

&lt;p&gt;This was mine.&lt;/p&gt;

&lt;p&gt;For you, if you want it. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq3x2kc31lzk6skdr2os5.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq3x2kc31lzk6skdr2os5.jpg" alt=" " width="800" height="791"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Vidi Clew (the prompt): &lt;a href="https://github.com/earlgreyhot1701D/vidi-clew" rel="noopener noreferrer"&gt;https://github.com/earlgreyhot1701D/vidi-clew&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Workshop materials: &lt;a href="https://github.com/anthropics/cwc-workshops" rel="noopener noreferrer"&gt;https://github.com/anthropics/cwc-workshops&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;AI-assisted, human approved. Powered by NLP. &lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>beginners</category>
      <category>learning</category>
    </item>
    <item>
      <title>Build Club Week Two: The PRD doesn't catch everything</title>
      <dc:creator>L. Cordero</dc:creator>
      <pubDate>Sun, 03 May 2026 19:19:31 +0000</pubDate>
      <link>https://dev.to/earlgreyhot1701d/build-club-week-two-the-prd-doesnt-catch-everything-459</link>
      <guid>https://dev.to/earlgreyhot1701d/build-club-week-two-the-prd-doesnt-catch-everything-459</guid>
      <description>&lt;p&gt;Last week I posted that I had no code, just the work that makes the code possible. The PRD, the prompt spec, the architecture doc, the build brief for Kiro. I went into this week thinking I had every decision pre-made.&lt;/p&gt;

&lt;p&gt;Then I started building.&lt;/p&gt;

&lt;p&gt;By Block 2, real testing surfaced a phrase the model was using that no court employee would say. "Strip identifiers" sounds reasonable to a developer. To a court clerk it sounds like nothing, opaque and technical, the kind of thing you'd skip past in a training. Not a bug, exactly. Not in the PRD either. But noticeable.&lt;/p&gt;

&lt;p&gt;By Block 3, I was flagging contrast questions, helper text microcopy, a disabled state that needed verifying. None of these were in the original scope. None of them were stop-the-build issues. All of them were real.&lt;/p&gt;

&lt;p&gt;By Block 4, I was holding a list of polish items in my head while running an active build. Around 11pm I asked Claude how it was tracking everything. The answer was honest: it wasn't, in any reliable way. That was exactly the kind of "I'll remember it" trust I'd called out as a problem in my own build brief, and I'd let it creep in anyway.&lt;/p&gt;

&lt;p&gt;So I started a punchlist.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbztzisyol31vdtumf54t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbztzisyol31vdtumf54t.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The punchlist isn't the PRD. The PRD says what to build. The punchlist captures what surfaces when you actually build it. They have different jobs and they need to live in different documents.&lt;/p&gt;

&lt;p&gt;Mine grew to 14 polish items, 8 things to defer to v2, 4 cleanups for before deployment, and 4 lessons-learned entries by the end of the week. Some of it was scope creep waiting to happen, the kind of "we should add this, it's only an hour" thinking that turns four-week builds into eight-week builds. Some of it was real bugs the PRD couldn't have predicted because I hadn't shipped enough of the thing yet to find them. Some of it was language I knew was wrong the moment I read it back to myself in the voice of an actual court employee.&lt;/p&gt;

&lt;p&gt;The thing I'd do differently next time is start the punchlist on day one, alongside the PRD. Not as a section of the PRD, because they have different jobs, but as a sibling document, ready and waiting. Treating the empty punchlist as a feature instead of a placeholder.&lt;/p&gt;

&lt;p&gt;What the punchlist taught me is that the PRD locks scope and the punchlist holds everything the PRD couldn't see yet. Both are needed. The discipline isn't writing a perfect PRD, it's knowing which document a thought belongs in and putting it there immediately instead of trusting yourself to remember.&lt;/p&gt;

&lt;p&gt;Week three is polish and deployment. The punchlist is the running list. Loading state experience for the latency. Brand banner in the PDF. Mobile responsiveness. Then AWS Amplify deployment to a live URL.&lt;/p&gt;

&lt;p&gt;Building alongside &lt;a href="https://buildclub.ai/" rel="noopener noreferrer"&gt;Build Club in the Women in AI Accelerator&lt;/a&gt;. Tagging &lt;a href="https://www.linkedin.com/in/annieliaoo/" rel="noopener noreferrer"&gt;Annie Liao&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/carolineciaramitaro/" rel="noopener noreferrer"&gt;Caroline Ciaramitaro&lt;/a&gt;, who run a thoughtful, generous community.&lt;/p&gt;

</description>
      <category>womenintech</category>
      <category>buildinpublic</category>
      <category>civictech</category>
      <category>aws</category>
    </item>
  </channel>
</rss>
