From zero to working product in two hours

So last night I went to an Umbraco.AI hackathon, and somehow walked out with a working package.

Not a hand-wavy prototype, not a "here's roughly what it could do" demo, but an actual end-to-end thing that scores form submissions with an LLM, blocks the spammy ones before they hit your inbox, and has a proper backoffice dashboard for the flagged entries.

Two hours.

And I didn't type a single line of code myself.

The problem I picked

AI-generated contact form spam has quietly become the biggest unsolved headache for anyone running a public-facing site. It reads as plausible English, has none of the obvious spam markers, and walks straight past honeypots, reCAPTCHA, and regex filters. Everyone with an Umbraco Forms install has this problem whether they realise it or not, and as far as I could see, nobody in the Umbraco ecosystem had built anything to deal with it at the Forms level.

That felt like a gap worth filling.

The idea was simple enough. For every Forms submission, build a structured prompt, send it to Umbraco.AI's chat service with the host site's configured provider (Anthropic, OpenAI, whatever's plugged in), parse the JSON response, and if the score crosses a configurable threshold, stop the submission dead. The bad submission never reaches the database, never triggers an email workflow, and never sees the success page. Flagged entries land in a JSON store that you can review from a backoffice dashboard.

That's the brief. The challenge was getting it built and demonstrable in the time available.

How I actually built it

I used Claude Code.

For anyone who hasn't come across it (and I'd be surprised if you haven't!), Claude Code is an agentic coding tool that takes plain-English intent and produces actual working code, runs it, tests it, iterates on it. You stay in the driving seat, you make the architectural decisions, you spot when it's gone off-track, you tell it what to fix. But the typing-the-actual-code part isn't your job anymore.

That matters more than it sounds, because I'm not a developer anymore. I've spent thirty-ish years in and around digital delivery, leading teams, scoping projects, understanding the shape of what good looks like, but I've not been the person hands-on-keyboard writing code at one in the morning for a long time. The shape of the platform I know well. The architectural calls I can make.

The plan was what made the difference, not the typing. I went in with a proper PRD, a corpus of test entries covering obvious spam, plausible AI outreach, and genuine enquiries, and a rough hour-by-hour breakdown. Hour one for the scoring service and corpus tuning. Hour two for the form validation handler and the JSON store. Hour three for the backoffice dashboard. Claude Code did the building. I did the thinking, the steering, and the testing.

What actually happened

In practice it was a lot less tidy than the plan.

The biggest surprise was how much the documented APIs had moved on from the version of reality the PRD was written against. The PRD said one thing about how to call the prompting service, the real package said another. The PRD said to use FormValidateNotification with a particular shape, and got the notification right but the blocking technique completely wrong. The PRD assumed Forms had an Alias property. It doesn't. The PRD pointed at a backoffice package that doesn't exist in v17 anymore.

There's something quite humbling about discovering all of that in real time, with a clock running and people drifting over to ask how it's going.

This is where the tool genuinely earns its keep. When something didn't work the way the documentation said it should, I didn't have to dig through forum threads at midnight or guess at the right call to make. I'd describe the symptom, Claude Code would spin up a tiny test probe inside the host application (a small controller, a curl call, whatever was needed to confirm what was actually happening), and within a minute or two we'd know whether the real platform behaved like the docs said it did.

Spoiler, it often didn't.

The way you actually block a submission, it turns out, is to call ModelState.AddModelError inside the validation notification. Forms' surface controller checks ModelState.IsValid before it runs the submit pipeline, so a single bad-model error stops everything. No record saved, no email sent, no success page shown. The user just sees the form re-render with an inline error message and their values still in the boxes.

Once that was working, the rest fell into place.

There were the usual smaller frictions along the way. The backoffice management API uses OpenIddict bearer tokens rather than cookies, so the dashboard's first fetch attempt came back 401 until we extended UmbLitElement, consumed the auth context, and attached the token as an Authorization: Bearer header. ASP.NET Core insists on camel-casing response properties regardless of how the POCOs are written, so the dashboard rendered empty rows until we noticed it was reading FlaggedAtUtc when the wire was sending flaggedAtUtc. The package SDK had to be Microsoft.NET.Sdk.Razor rather than the plain one, otherwise the wwwroot folder gets quietly ignored and the backoffice has nothing to load.

None of those are catastrophic. They're the kind of friction you only meet when you actually try to do the thing, and they each cost a few minutes that I didn't have to spare. The difference, with the tool in the loop, is that the cost is a few minutes rather than the half-hour or hour each of them might have eaten in a previous era.

What I built in

The other thing I leaned into early was a safety net.

A spam filter that breaks your contact form is worse than the spam itself, and I wanted that property baked in from the first commit rather than bolted on later. The scoring service has a per-call timeout. It parses JSON loosely enough to survive markdown fences round the response. It clamps the score to 0-100. It normalises the category. And it default-passes on any timeout, parse error, or exception. The AI scoring is a quality-of-life feature, not a hard gate. If anything at all goes wrong, the submission flows through as normal.

I also added a master kill switch and a configurable threshold, so anyone installing this can decide for themselves how aggressive they want it to be, or turn it off entirely without uninstalling. Small thing, but it matters. The worst thing a package can do is take options away from the people using it.

None of those decisions were the tool's. They were mine. The tool just made them happen.

What I learned

In my experience, the gap between "I understand this conceptually" and "I can ship this in two hours" is almost always made up of the tiny, undocumented frictions. Not the big architectural decisions. Those mostly took care of themselves once the problem was clear. It was the bearer tokens, the JSON casing, the SDK choice, the property that doesn't exist on the object you thought you were dealing with.

The reason I got it over the line in time was a habit I've leaned into hard this year. Write the smoke tests first, in the host application, before you write the package code. Spin up a tiny controller that calls the new API the way you think it should work, hit it with curl, see what actually comes back. By the time I sat down to brief Claude Code on the actual scoring service, I already knew exactly what the underlying API wanted as input and what shape it returned. No assumptions, no surprises at the integration step.

The headline lesson though, walking home, isn't really about Umbraco or spam filtering. It's about what the role of the person sitting in front of these tools is actually becoming.

You don't need to be able to write the code anymore. You need to be able to define the problem clearly, design the safety nets, spot when the output is heading in the wrong direction, and know when to stop and ask a better question. That's a different skill set to the one our industry has been hiring for the last fifteen years, and it's one that experienced engineers, product people, and consultants are already quietly really good at, even if they've never thought of themselves as builders.

What's left in the gap is the harder, more interesting stuff. The judgement about what to build. The discipline to test your assumptions against reality. The instinct for when something is about to go wrong. The willingness to keep going when the first three attempts didn't work. None of that gets easier with better tools. If anything, it matters more, because the tools will happily produce something plausible and broken if you don't know what you're doing.

The reason I got over the line tonight wasn't that I had Claude Code. It was that I knew exactly what I wanted to build, why, and what good looked like along the way.

Where it ends up

The package scored eight and a half out of nine on the test corpus. All the block decisions were correct. The category label drifted on one borderline entry, which is cosmetic. The end-to-end test on the actual Contact page does what it's supposed to do. Legitimate submissions flow through normally, spam submissions get blocked at validation with a friendly inline error, and the flagged entries appear in the dashboard a moment later.

It's a v0.1.0. There's a roadmap of obvious next steps. Per-form prompts. An editor feedback loop so flagged entries marked as false positives become few-shot examples for the next prompt. DB-backed storage instead of a JSON file. Stats and charts. Webhooks out to Slack or SOC tools. None of that is hard, it's just not what fits in two hours.

The bit that's stayed with me though, walking home, isn't the technical journey. It's that someone with twenty years of knowing what good software looks like, but low coding ability of their own, can now sit down in an evening and produce a real, installable, end-to-end working package.

That's a genuinely new shape in our industry, and I don't think most people have fully felt it yet.

Two hours. One end-to-end Umbraco package. Not a line of code typed by me.

What would you point a two-hour hackathon at, if you knew you could actually finish it?