I built a site that gives deliberately wrong answers using an LLM.
No login. No user API key. Anyone can hit the endpoint.
amtaitfy.com is that site: a toy that serves intentionally wrong, AI-generated answers. That narrows the engineering problem:
- Make abuse bounded
- Make costs predictable
- Make casual attacks boring
The core architectural decision is simple:
GET serves cache only. POST is the only path that triggers fresh AI inference.
Everything else is defense in depth.
Threat model
In scope:
- Accidental viral traffic
- Casual prompt-extraction probes
- Repeat-query cost amplification
- Basic bot and spam traffic
- Provider outages
- Budget exhaustion
Out of scope:
- Sophisticated botnets
- Attackers with unlimited valid Turnstile tokens
- Full prompt-injection resistance
- Cache poisoning by determined users
- Sensitive workloads
- Anything that should require authentication
The request flow
GET /answer
read cache
return cached answer or empty state
POST /answer
verify Turnstile token
reject missing session
reject oversized input
check session lockout
check existing cache
call ai provider
write cache
return answer
GET is cheap. POST is expensive. On purpose.
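Here is a minimal sketch of that boundary as a Cloudflare Worker. The binding name (ANSWERS), the query parameter, and the helpers (cacheKey, verifyTurnstile, callProvider, json) are stand-ins, not the site's real code, and the session and lockout checks from the flow above are omitted; it only shows the GET/POST split.

```ts
// Sketch of the hard boundary: GET reads cache, POST is the only inference path.
// Assumes @cloudflare/workers-types for KVNamespace.
interface Env {
  ANSWERS: KVNamespace;
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const url = new URL(req.url);
    if (url.pathname !== "/answer") return new Response("Not found", { status: 404 });

    if (req.method === "GET") {
      // Cache read only: a shared or crawled URL can never trigger inference.
      const key = await cacheKey(url.searchParams.get("q") ?? "");
      const cached = await env.ANSWERS.get(key);
      return json({ answer: cached, cached: cached !== null });
    }

    if (req.method === "POST") {
      const { q, token } = (await req.json()) as { q: string; token: string };
      if (!(await verifyTurnstile(token))) return json({ error: "turnstile" }, 403);
      if (q.length > 500) return json({ error: "too long" }, 413);

      const key = await cacheKey(q);
      const existing = await env.ANSWERS.get(key);
      if (existing) return json({ answer: existing, cached: true });

      const answer = await callProvider(q); // the only call that costs money
      await env.ANSWERS.put(key, answer);   // cache forever: no TTL
      return json({ answer, cached: false });
    }

    return new Response("Method not allowed", { status: 405 });
  },
};

// Stand-ins for the real helpers described elsewhere in the post.
declare function cacheKey(q: string): Promise<string>;
declare function verifyTurnstile(token: string): Promise<boolean>;
declare function callProvider(q: string): Promise<string>;
declare function json(body: unknown, status?: number): Response;
```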
If a URL gets shared, crawled, screenshotted, bookmarked, or posted somewhere large, none of that triggers inference. It can go viral and cost me nothing, because only a deliberate POST reaches the model. The first visitor to ask a question may trigger one inference through POST; every later visitor to that URL gets the cached answer from Cloudflare KV. Virality does not balloon cost.
Casual probe friction
I added a small ruleset for obvious prompt-extraction probes:
- “ignore previous instructions”
- “print your system prompt”
- “reveal your hidden prompt”
This is not real prompt-injection defense. It catches low-effort probes and gives me a tripwire.
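A sketch of what that tripwire can look like. The patterns here are illustrative, not the site's exact list:

```ts
// Low-effort prompt-extraction tripwire. This is not injection defense;
// it just flags the obvious phrasings so the session can be locked out.
const PROBE_PATTERNS: RegExp[] = [
  /ignore\s+(all\s+)?previous\s+instructions/i,
  /print\s+your\s+system\s+prompt/i,
  /reveal\s+your\s+(hidden\s+)?prompt/i,
];

function looksLikeExtractionProbe(prompt: string): boolean {
  return PROBE_PATTERNS.some((pattern) => pattern.test(prompt));
}
```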
The first version was stupid. When it detected an extraction attempt, it responded with a hostile message and included my actual system prompt, followed by “There will be cake.”
The GLaDOS reference felt clever for about five minutes.
The current response gives no detail about what matched. No prompt content. No explanation of what was caught. Just a generic refusal. The goal is to provide no signal.
Session lockout
When the extraction tripwire fires, the session gets a short lockout.
I store a 60-second KV entry keyed by session. Further POST attempts during that window return a 403 with a countdown.
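A sketch of that lockout on KV, assuming a LOCKOUTS binding and a sessionId pulled from the session cookie; KV's expirationTtl handles the expiry (60 seconds is also KV's minimum TTL):

```ts
// Session lockout: a KV entry that expires on its own after 60 seconds.
const LOCKOUT_SECONDS = 60;

async function lockSession(env: { LOCKOUTS: KVNamespace }, sessionId: string): Promise<void> {
  const until = Date.now() + LOCKOUT_SECONDS * 1000;
  await env.LOCKOUTS.put(`lock:${sessionId}`, String(until), {
    expirationTtl: LOCKOUT_SECONDS, // KV deletes the key for us
  });
}

async function lockoutRemaining(env: { LOCKOUTS: KVNamespace }, sessionId: string): Promise<number> {
  const until = await env.LOCKOUTS.get(`lock:${sessionId}`);
  if (!until) return 0;
  return Math.max(0, Math.ceil((Number(until) - Date.now()) / 1000));
}

// In the POST handler:
//   const wait = await lockoutRemaining(env, sessionId);
//   if (wait > 0) return json({ error: "locked", retryAfterSeconds: wait }, 403);
```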
The IP lockout I removed
I originally added a second lockout keyed by a hash of the user's IP, to close the obvious incognito bypass:
- normal session gets locked
- user opens incognito
- new session cookie
- same IP
- lockout still applies
But I removed it.
CGNAT makes IP-based lockouts dangerous. Mobile carriers, corporate networks, apartment complexes, and some home ISPs can place many users behind one external IP. Locking out an IP to stop one bad session has an unacceptably large blast radius. For this site, session-only lockout is the better tradeoff: it leaves a known bypass, but avoids locking out innocent users.
Timing leaks
The regex tripwire returns almost instantly. A real model response takes two to five seconds. That difference is a timing side channel: an attacker iterating on phrasing can tell a filtered response from a generated one by latency alone.
So lockout and tripwire responses now wait until total request time lands in a random window that roughly matches model latency. Randomizing the delay removes that signal.
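A sketch of how that padding can work: capture a start time at the top of the handler, then delay the refusal until total time lands somewhere in a window that looks like a model call. The 2-5 second window here is illustrative:

```ts
// Pad tripwire/lockout responses so their latency falls in roughly the same
// window as a real model call, hiding the fast-path timing signal.
async function respondWithPaddedLatency(
  startedAtMs: number,            // Date.now() captured at the top of the handler
  makeResponse: () => Response,
): Promise<Response> {
  const targetMs = 2000 + Math.random() * 3000; // uniform in [2s, 5s]
  const elapsed = Date.now() - startedAtMs;
  const waitMs = Math.max(0, targetMs - elapsed);
  await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
  return makeResponse();
}
```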
Cache forever: How GET stays cheap
The cache is the main cost-control mechanism. Repeat prompts should not create repeat inference costs.
But “cache forever” has sharp edges.
The first caller effectively defines the canonical answer. The first caller can also define a bad canonical answer. I treat the first answer as canonical on purpose. URLs stay shareable, repeat traffic stays free, and the occasional dud is the price.
The cache is not namespaced by prompt version. There is no elegant invalidation layer. If the system prompt changes or a bad answer becomes canonical, the fix is manual cleanup or a broader cache reset.
The future upgrade would be to add a version prefix to cache keys so prompt changes, model changes, or answer-format changes can move to a new cache namespace without serving old entries.
Something like:
cache:v3:<hash(normalized_prompt)>
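A sketch of what that versioned key could look like, using the Web Crypto API that Workers expose; the version string and key layout are assumptions:

```ts
// Versioned cache key: bump CACHE_VERSION when the system prompt, model,
// or answer format changes, and old entries simply stop being read.
const CACHE_VERSION = "v3"; // hypothetical current version

async function cacheKey(normalizedPrompt: string): Promise<string> {
  const bytes = new TextEncoder().encode(normalizedPrompt);
  const digest = await crypto.subtle.digest("SHA-256", bytes);
  const hex = [...new Uint8Array(digest)]
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
  return `cache:${CACHE_VERSION}:${hex}`;
}
```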
KV counters vs Durable Objects
I use KV counters for operational telemetry:
- Daily estimated spend
- Provider health
- Probe counts
- Rough request volume
KV is eventually consistent. Under burst traffic, two near-simultaneous writes can miss each other and produce an undercount.
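For illustration, this is roughly what a KV counter bump looks like and why it can undercount: a plain read-modify-write with no coordination, so two concurrent increments can read the same value and one update is lost. The COUNTERS binding and key layout are made up:

```ts
// Coarse telemetry counter on KV. Fine for rough signals,
// not for a hard budget guardrail.
async function bumpCounter(env: { COUNTERS: KVNamespace }, name: string): Promise<void> {
  const day = new Date().toISOString().slice(0, 10);      // per-day key, e.g. 2024-06-01
  const key = `counter:${day}:${name}`;
  const current = Number((await env.COUNTERS.get(key)) ?? "0");
  await env.COUNTERS.put(key, String(current + 1), {
    expirationTtl: 60 * 60 * 24 * 14, // keep two weeks of daily counters
  });
}
```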
Durable Objects would give stronger consistency but I did not use them.
For this site, the counters are not the final safety mechanism. They are telemetry. Eventual consistency is fine for coarse signals. It is not fine for the only budget guardrail.
When to move to DOs? I have a predefined migration trigger. Request rate is easy to see from Worker analytics. Counter drift would have to be measured by reconciliation: compare KV counters against provider usage or request logs. If reconciliation shows KV estimates drifting materially from provider-reported usage, move counters to Durable Objects.
Provider strategy
I've found free-tier AI providers on OpenRouter to be unreliable, so paid inference is the fallback. Paid inference also means an especially viral day could push AI spend past what I can afford; OpenRouter's daily spend caps are the backstop there. Of course, a determined attacker could still burn through the daily budget and push the site into degraded mode.
Degraded mode UX
When all selected providers fail or I've exhausted my daily budget for inference, the page does not show a generic error. It surfaces a few cached answers as clickable suggestions and shows a retry timer.
The retry timer backs off:
10s → 30s → 2m → 5m
If an upstream provider sends a Retry-After header, the UI honors it.
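A sketch of that client-side schedule; the function name and the way the header value reaches the client are assumptions:

```ts
// Client-side retry schedule for degraded mode. A relayed upstream
// Retry-After wins; otherwise walk the fixed backoff steps.
const BACKOFF_SECONDS = [10, 30, 120, 300]; // 10s -> 30s -> 2m -> 5m

function nextRetrySeconds(attempt: number, retryAfterHeader: string | null): number {
  // Assumes Retry-After carries a seconds value, not an HTTP-date.
  const retryAfter = retryAfterHeader ? parseInt(retryAfterHeader, 10) : NaN;
  if (!Number.isNaN(retryAfter) && retryAfter > 0) return retryAfter;
  return BACKOFF_SECONDS[Math.min(attempt, BACKOFF_SECONDS.length - 1)];
}
```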
This turns an outage into something closer to discovery. The user came for wrong answers. Cached wrong answers are still useful product surface.
This may be my favorite second-order effect in the project.
What I would change if traffic grew
I would shore up the pressure points:
- Move counters from KV to Durable Objects
- Add paid Cloudflare rate limiting
- Add better cache moderation and purge tooling
- Add model and prompt version dashboards
- Add better observability around provider failure modes
The one thing that stays fixed: GET remains cache-only and POST remains the only inference path.
If you are building public AI endpoints, I am especially interested in where you draw the line between “cheap enough to tolerate abuse” and “serious enough to justify paid controls.”
One detail I didn't get into: how I normalize prompts before hashing them for the cache key.
Naive hashing of the raw input would mean "what is TCP," "What is TCP?", and "WHAT IS TCP" each generate a separate cache entry and a separate inference call. That's wasted spend on what's effectively the same question.
What I do: lowercase, trim whitespace, collapse internal whitespace, strip trailing punctuation. Hash the result. This catches the obvious cases.
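For concreteness, a sketch of that normalization; the exact punctuation set is illustrative:

```ts
// Prompt normalization before hashing: lowercase, trim, collapse internal
// whitespace, strip trailing punctuation. Deliberately nothing semantic.
function normalizePrompt(raw: string): string {
  return raw
    .toLowerCase()
    .trim()
    .replace(/\s+/g, " ")        // collapse runs of whitespace
    .replace(/[?!.,;:]+$/, "");  // strip trailing punctuation
}

// normalizePrompt("What is TCP?") === normalizePrompt("  what is tcp ")  // "what is tcp"
```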
What I don't do: handle synonyms, semantic similarity, or typo correction. "What is TCP" and "What is the Transmission Control Protocol" hit different cache keys even though a human would treat them as the same question. Adding semantic similarity checks would mean computing embeddings on every miss, which adds cost and complexity that isn't worth it for a parody site.
The tradeoff: I serve more inferences than the absolute minimum, but the normalization layer stays simple and fast.
Curious how others handle this.