Prajwal Gaonkar

Posted on Mar 20

Building an Unbreakable Public Form: From Concept to Production Backend

#backend #node #security #systemdesign

If you caught my previous blog post, we explored the high-level concepts of securing public forms: IP blocking, session tokens, idempotency, and CAPTCHAs. We talked about what needed to be done to stop bots and duplicate data.

But conceptual theory only gets you so far. When you sit down to actually write the code, the reality of race conditions, database locks, and user behavior hits you fast. Today, we are going completely under the hood. We're looking at the exact end-to-end system design of how those ideas come together into a real, production-ready Node.js backend.

Want to jump straight to the code? The entire repository is open-source and available here: OP-Prajwal/publicForm

1. The Problem Statement

Building a form for logged-in users is straightforward because you have a user ID. But public forms—like contact pages or recruitment applications—are entirely different.

No Authentication: We don't know who is making the request.
Shared Wi-Fi: We can't strictly ban an IP address permanently without risking locking out an entire dorm room or office of legitimate users.
Bot Spam: Within hours of going live, bots will find your endpoint and try to drop malicious payloads or thousands of fake entries.
The Impatient User: A real human on a laggy 3G connection clicks "Submit", nothing happens instantly, so they furiously click "Submit" six more times. Without safeguards, you now have six identical records in your database.

2. The High-Level Architecture

To solve all of this, we need a layered defense. We can't rely on just a CAPTCHA or just an IP limit. The architecture acts like a funnel, rejecting bad requests at the cheapest possible layer before any data reaches the database.

Interactive architecture diagram available on Eraser.io

The Pipeline:
User → Rate Limiter → CAPTCHA → Idempotency Check → Token Validation → Database

3. Step-by-Step Request Flow

Here is the exact sequence of events when a user interacts with our system:

User loads the form: The React frontend silently makes a GET request to the backend.
Server generates a secure token: The backend generates a token with a cryptographically secure pseudorandom number generator (CSPRNG), saves it to the database, and sends it to the frontend.
User submits the form: The user fills the data, solves the CAPTCHA silently, and hits submit. The frontend generates a unique UUID (the Idempotency Key) and sends everything to the server.
Rate limiting: Express middleware checks if this IP is making too many requests too fast.
CAPTCHA validation: The controller immediately asks Cloudflare if the human actually solved the challenge.
Idempotency check: The database checks if this exact UUID has been seen recently.
Token validation (atomic): The backend verifies the CSPRNG token is valid and immediately burns it so it can never be used again.
Store in DB: Only after surviving this is the data finally inserted into the database.
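
The ordering above can be read as a chain of checks that stops at the first failure. Here is a minimal sketch of that idea, assuming each layer is reduced to a synchronous function returning true or false. The names runPipeline and track, and the stubbed checks, are hypothetical stand-ins for the real middleware and controller logic:

```javascript
// Hypothetical sketch: each layer is a named check; the first failure wins.
function runPipeline(request, checks) {
  for (const { name, check } of checks) {
    if (!check(request)) {
      // Later (more expensive) layers never run for this request.
      return { accepted: false, rejectedBy: name };
    }
  }
  return { accepted: true, rejectedBy: null };
}

// Stub checks in the order described above. Real checks would call the
// rate limiter, Cloudflare, and the database.
const calls = [];
const track = (name, result) => ({
  name,
  check: () => {
    calls.push(name); // record which layers actually ran
    return result;
  },
});

const checks = [
  track("rateLimit", true),
  track("captcha", false), // simulate a failed CAPTCHA
  track("idempotency", true),
  track("token", true),
];

const outcome = runPipeline({ body: {} }, checks);
console.log(outcome, calls);
```

Because the CAPTCHA stub fails, the idempotency and token stubs are never invoked, which is the whole point of putting cheap checks first.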

4. Implementation Details in Node.js

Let's look at how we actually built this backend. We utilized an Express.js middleware-based architecture to keep our routing clean:

router.post("/submit", formLimiter, submitForm);

Behind the scenes:

Token Generation: We used Node's native crypto library to generate tokens (crypto.randomBytes(32).toString('hex')).
Idempotency: When the user clicks submit, the client generates a key using crypto.randomUUID(). We store this key in an Idempotency MongoDB collection.
Atomic Validation: We execute findOneAndUpdate() in Mongoose to verify and "burn" the token in the exact same database operation to prevent race conditions.
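
To make the "verify and burn" step concrete, here is a rough sketch of how a single findOneAndUpdate() call can do both. The field names token and used, the function name burnToken, and the in-memory fake model are assumptions for illustration; the actual schema in the repository may differ:

```javascript
// Sketch only: `TokenModel` is assumed to be a Mongoose model with
// hypothetical fields `token` (string) and `used` (boolean).
async function burnToken(TokenModel, token) {
  // The filter matches only an unused token and the update marks it used.
  // MongoDB applies both as one atomic operation on the document, so two
  // concurrent requests cannot both receive a non-null result.
  const doc = await TokenModel.findOneAndUpdate(
    { token, used: false },
    { $set: { used: true } },
    { new: true }
  );
  return doc !== null;
}

// In-memory stand-in so the sketch runs without a database.
function createFakeTokenModel(tokens) {
  const store = new Map(tokens.map((t) => [t, { token: t, used: false }]));
  return {
    async findOneAndUpdate(filter, update) {
      const doc = store.get(filter.token);
      if (!doc || doc.used !== filter.used) return null;
      Object.assign(doc, update.$set);
      return { ...doc };
    },
  };
}

// Two requests race for the same token: only one of them wins.
const fakeModel = createFakeTokenModel(["abc123"]);
Promise.all([
  burnToken(fakeModel, "abc123"),
  burnToken(fakeModel, "abc123"),
]).then((results) => console.log(results));
```

A separate find() followed by an update() would leave a gap between the two calls in which a second request could still see the token as unused.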

5. CSPRNG Tokens: Why Randomness Matters

A session token is only as good as its un-guessability. If you generate tokens using something predictable like Math.random() or a timestamp, a sophisticated bot script can mathematically guess the next token your server will issue, completely bypassing the requirement to load your frontend.

By using a CSPRNG (Cryptographically Secure Pseudorandom Number Generator), we lean on the operating system's entropy source. crypto.randomBytes(32) gives us 32 bytes (256 bits) of randomness, encoded as a 64-character hex string, which is practically impossible for an attacker to guess or predict.
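
As a quick illustration, the token generation described above fits in a few lines. The function name generateFormToken is hypothetical; the crypto.randomBytes call is the one mentioned earlier in this post:

```javascript
const crypto = require("node:crypto");

// 32 random bytes from Node's CSPRNG, hex-encoded.
// Each byte becomes two hex characters, so the string is 64 characters long.
function generateFormToken() {
  return crypto.randomBytes(32).toString("hex");
}

const tokenA = generateFormToken();
const tokenB = generateFormToken();
console.log(tokenA.length, tokenA !== tokenB);
```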

6. Idempotency Keys: Taming the Double-Click

Idempotency is just a fancy word for a simple concept: doing the exact same thing twice should leave the system in the same state as doing it once, with no additional side effects.

When our user double-clicks "Submit", two identical network requests fire simultaneously. Both carry the exact same payload and the exact same idempotencyKey UUID generated by React.

When the backend receives the first request, it logs the key with a status of PROCESSING. If the second request arrives, our system recognizes the UUID and drops the duplicate request without inserting a cloned record into the main database.
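
For the duplicate to be recognizable, every click on the same form attempt has to carry the same key, so the key must be created once per attempt rather than once per click. Below is a small sketch of that client-side behavior. The helper createSubmission and its send callback are hypothetical; Node's randomUUID is used here so the snippet runs outside a browser, where the equivalent is the global crypto.randomUUID():

```javascript
const { randomUUID } = require("node:crypto");

// Hypothetical client-side helper: one key per form attempt, reused by
// every click until the attempt is reset.
function createSubmission(send) {
  let key = null;
  return {
    submit(payload) {
      if (key === null) key = randomUUID();
      return send({ ...payload, idempotencyKey: key });
    },
    reset() {
      key = null; // the next submit() starts a new attempt with a new key
    },
  };
}

// Stand-in for the network call: records what would be sent.
const sent = [];
const form = createSubmission((body) => sent.push(body));

form.submit({ email: "a@example.com" });
form.submit({ email: "a@example.com" }); // impatient second click
form.reset();                            // user starts a fresh submission
form.submit({ email: "a@example.com" });

console.log(sent.map((body) => body.idempotencyKey));
```

The first two entries share a key and would be treated as one submission by the backend; the third has a new key and counts as a new one.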

7. Defeating Race Conditions

Idempotency sounds easy until you encounter concurrency.

Imagine those two identical requests hit the Node.js server at the exact same microsecond. Both requests check the database. Both see that the Idempotency Key doesn't exist yet. Both try to insert the record.

If we handled this with standard JavaScript if/else logic (check first, then insert), both would succeed. To fix this natively, we don't need complex database transactions. We rely on the database layer to act as our atomic lock.

In MongoDB (Mongoose)

We literally just add a unique: true index to the Idempotency Key in our schema:

const mongoose = require("mongoose");

const idempotencySchema = new mongoose.Schema({
  key: {
    type: String,
    unique: true, // creates a unique index, so MongoDB rejects a second insert of the same key
    required: true,
  },
  status: { type: String, default: "PROCESSING" }
});

The first request successfully claims the lock and inserts the record. The second concurrent request is instantly slapped down by MongoDB throwing an E11000 Duplicate Key Error. We catch that error in our backend and safely drop the duplicate. Race condition solved.
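
Catching that error might look roughly like the sketch below. It assumes a Mongoose model built from the schema above is passed in as IdempotencyModel. The function name, the return values CLAIMED and DUPLICATE, and the in-memory fake model are hypothetical:

```javascript
// Sketch: try to claim the key; a duplicate-key error means another
// request already claimed it.
async function claimIdempotencyKey(IdempotencyModel, key) {
  try {
    await IdempotencyModel.create({ key });
    return "CLAIMED"; // first request: continue processing
  } catch (err) {
    // MongoDB reports a unique-index violation with error code 11000.
    if (err && err.code === 11000) return "DUPLICATE";
    throw err; // anything else is a real failure
  }
}

// In-memory stand-in that mimics the unique index, so the sketch runs
// without a database.
function createFakeIdempotencyModel() {
  const keys = new Set();
  return {
    async create({ key }) {
      if (keys.has(key)) {
        const err = new Error("E11000 duplicate key error");
        err.code = 11000;
        throw err;
      }
      keys.add(key);
      return { key, status: "PROCESSING" };
    },
  };
}

const fakeIdempotency = createFakeIdempotencyModel();
Promise.all([
  claimIdempotencyKey(fakeIdempotency, "same-uuid"),
  claimIdempotencyKey(fakeIdempotency, "same-uuid"),
]).then((results) => console.log(results));
```

One thing to keep in mind: the guarantee comes from the index in the database, not from the schema option itself, so the unique index has to actually exist in MongoDB before traffic arrives.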

In SQL (PostgreSQL / MySQL)

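If you are building this in a relational database, the exact same principle applies. You put a PRIMARY KEY or UNIQUE constraint on the key column:

-- PostgreSQL syntax. MySQL has no UUID type (use CHAR(36)) and treats KEY
-- as a reserved word, so the column name needs backticks there.
CREATE TABLE idempotency_locks (
    key UUID PRIMARY KEY, -- a primary key is implicitly unique
    status VARCHAR(20) DEFAULT 'PROCESSING'
);

-- Or, for an existing table whose key column has no constraint yet:
ALTER TABLE idempotency_locks ADD CONSTRAINT unique_idemp_key UNIQUE (key);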

When two identical INSERT commands execute at the exact same time, the first one writes the row, and the database engine rejects the second one with a unique violation error as soon as the first commits (e.g., SQLSTATE 23505 in Postgres, error 1062 in MySQL).
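
On the Node side, handling the Postgres case could look something like this. The sketch assumes a client object exposing a node-postgres style query(text, values) method and an error carrying the SQLSTATE in its code property. The function name claimKeySql and the fake client are hypothetical:

```javascript
// Sketch: `db` is assumed to behave like a node-postgres pool or client.
async function claimKeySql(db, key) {
  try {
    await db.query(
      "INSERT INTO idempotency_locks (key, status) VALUES ($1, 'PROCESSING')",
      [key]
    );
    return "CLAIMED";
  } catch (err) {
    // SQLSTATE 23505 is unique_violation in PostgreSQL.
    if (err && err.code === "23505") return "DUPLICATE";
    throw err;
  }
}

// In-memory stand-in for the database, mimicking the constraint on `key`.
function createFakeDb() {
  const rows = new Set();
  return {
    async query(text, values) {
      const key = values[0];
      if (rows.has(key)) {
        const err = new Error("duplicate key value violates unique constraint");
        err.code = "23505";
        throw err;
      }
      rows.add(key);
      return { rowCount: 1 };
    },
  };
}

const fakeDb = createFakeDb();
const sqlKey = "0b0e7c1e-5c1d-4c6e-9d55-2f6a3b1f9a10";
Promise.all([claimKeySql(fakeDb, sqlKey), claimKeySql(fakeDb, sqlKey)])
  .then((results) => console.log(results));
```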

8. Rate Limiting: The Outer Shield

If we have all of these protections, why do we still need Rate Limiting?

Because every check costs resources. If a malicious script attempts to submit 5,000 forms a second using fake CAPTCHAs and fake tokens, your Node server is going to spend all of its resources calling Cloudflare and querying MongoDB just to reject them.

The Rate Limiter (express-rate-limit) sits at the very outer edge of the middleware. If an IP exceeds 20 requests a minute, it gets rejected (with a 429 Too Many Requests by default) before the controller ever runs.
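
To show the idea behind a per-IP limit, here is a small fixed-window counter. This illustrates the concept only and is not how express-rate-limit is implemented internally. The function name, the options object, and the injectable clock are assumptions made so the example is deterministic:

```javascript
// Fixed-window counter: each IP gets `limit` requests per `windowMs`.
function createFixedWindowLimiter({ limit, windowMs, now = Date.now }) {
  const hits = new Map(); // ip -> { windowStart, count }

  return function isAllowed(ip) {
    const t = now();
    const entry = hits.get(ip);
    if (!entry || t - entry.windowStart >= windowMs) {
      // First request from this IP, or the previous window has expired.
      hits.set(ip, { windowStart: t, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}

// 20 requests per minute per IP, matching the numbers above.
let fakeTime = 0;
const isAllowed = createFixedWindowLimiter({
  limit: 20,
  windowMs: 60_000,
  now: () => fakeTime,
});

const results = [];
for (let i = 0; i < 21; i++) results.push(isAllowed("203.0.113.7"));

fakeTime = 60_000; // one minute later, a new window starts
const afterReset = isAllowed("203.0.113.7");

console.log(results.filter(Boolean).length, results[20], afterReset);
```

The check is a Map lookup and an integer comparison, which is why it makes sense as the outermost layer.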

9. CAPTCHA: The Bouncer

Why is the CAPTCHA validation the very first thing we do inside the main controller?

We use Cloudflare Turnstile to verify humans. A single call to Cloudflare's API keeps unverified traffic away from MongoDB entirely, which is cheaper and safer for our server than running multiple database queries for every request. If the payload arrives with a missing or invalid CAPTCHA token, we immediately drop the request with a 400 Bad Request. We don't bother checking Idempotency. We don't bother checking the CSPRNG token.

If you aren't human, you don't get past the lobby.
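
A server-side check against Turnstile could be sketched like this. The endpoint and the secret/response field names follow Cloudflare's siteverify API. The function name verifyTurnstile and the injectable fetchImpl parameter are assumptions, added so the example can run without network access:

```javascript
const SITEVERIFY_URL =
  "https://challenges.cloudflare.com/turnstile/v0/siteverify";

// Returns true only when Cloudflare confirms the token.
// `fetchImpl` defaults to the global fetch (available in Node 18+).
async function verifyTurnstile(
  captchaToken,
  secretKey,
  fetchImpl = globalThis.fetch
) {
  if (typeof captchaToken !== "string" || captchaToken.length === 0) {
    return false; // missing token: reject without calling Cloudflare
  }
  const response = await fetchImpl(SITEVERIFY_URL, {
    method: "POST",
    // URLSearchParams is sent as application/x-www-form-urlencoded.
    body: new URLSearchParams({ secret: secretKey, response: captchaToken }),
  });
  const result = await response.json();
  return result.success === true;
}

// Fake fetch for a dry run: pretends Cloudflare rejected the token.
const rejectingFetch = async () => ({
  json: async () => ({ success: false }),
});

verifyTurnstile("token-from-client", "hypothetical-secret", rejectingFetch)
  .then((isHuman) => console.log(isHuman));
```

In the controller, a false result is what would translate into the 400 Bad Request described above.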

10. The Final System Flow Summarized

When you step back, the elegance of the system reveals itself. We built a heavily guarded fortress for a simple form payload:

Rate Limit (Is it spamming?) → CAPTCHA (Is it a bot?) → Idempotency (Is it a double-click duplicate?) → Token (Is it a forged, headless request?) → Database (Safe and clean).

11. Key Learnings: Layered Defense

The biggest takeaway from building this is the concept of layered defense.

No single security measure is flawless. Rate limits can be bypassed using VPNs or distributed botnets. CAPTCHAs can occasionally be solved by advanced solvers. Tokens can be harvested. But when you stack them, an attacker has to defeat every layer on every request, so the cost and difficulty rise sharply. Real-world backend thinking is about assuming every single layer will eventually fail, and ensuring the layer behind it is ready to catch the mistake.

12. Conclusion

What started as an innocent HTML form transformed into a massive system design problem. Public-facing endpoints are arguably some of the most difficult pieces of infrastructure to secure properly because you have absolutely zero trust in the clients interacting with them.

If you are just building a quick side project, a simple reCAPTCHA might be enough. But if you want to build at enterprise grade, learning how to juggle atomic databases, idempotency keys, and session validation is invaluable. It shifts your mindset from "How do I make this work?" to "How do I make this unbreakable?"

DEV Community