DEV Community

NextBlock
Why We Chose JSONB for AI: How NextBlock CMS Bypassed the "HTML Wall" for Next.js 16

If you’ve ever tried to pipe Generative AI directly into a production-grade CMS, you’ve likely hit the "HTML Wall."

Traditional rich-text editors treat documents as unstructured strings. When you ask an LLM to generate a blog post, it usually hands back a blob of HTML. In a modern Next.js 16 environment, that blob is a ticking performance time bomb. You have to sanitize it, parse it, and hope it doesn't break your React Server Components (RSC) or trigger a massive re-render that tanks your Core Web Vitals.

When we built NextBlock Cortex, we made a fundamental engineering decision: AI never touches raw HTML. We store everything as strict, node-based JSONB.

Here is why that decision is the backbone of our performance.

The Problem: The "Sanitization Tax"
The "Old Way" of handling AI content looks like this:

AI generates a string of HTML.

Server receives the string and runs it through a heavy sanitizer like DOMPurify to prevent XSS.

Database stores the string.

Client fetches the string and has to parse it again to render it within a React tree.
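The tax looks roughly like this in code. This is a schematic sketch, not NextBlock internals: `sanitize` is a naive stand-in for a real sanitizer like DOMPurify, and the client-side parsing pass is only described in comments.

```typescript
// The legacy pipeline: every stage re-scans the same HTML string.
// sanitize() is a hypothetical stand-in for DOMPurify.sanitize().
function sanitize(html: string): string {
  // A real sanitizer strips scripts, event handlers, unknown tags, ...
  return html.replace(/<script[\s\S]*?<\/script>/gi, '');
}

function legacyIngest(aiHtml: string): string {
  const clean = sanitize(aiHtml); // pass 1: full string scan on the server
  // The string goes into the DB as-is. Later, the client must run
  // pass 2: parse `clean` into a DOM/React tree before rendering.
  return clean;
}
```

Each pass is linear in the size of the document, and each pass is a chance to silently mangle the content.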

Every step in this process adds latency. Worse, if the AI generates an unsupported tag or a malformed element, your editor engine (like Tiptap or ProseMirror) will aggressively strip it out, leading to data loss and a disjointed user experience.

The NextBlock Solution: The Zero-Validation Pipeline
In NextBlock, we treat AI as a structured data constructor, not a writer. We use constrained decoding to force the LLM to output content that follows a strict Zod-backed JSON schema.

By doing this, we create what we call the Zero-Validation Pipeline:

Direct-to-DB: Because the AI's output is algorithmically guaranteed to be valid JSON that matches our database schema, we skip the parsing and cleaning phase entirely.

Atomic Transactions: The JSONB payload is inserted directly into the PostgreSQL column. If a token doesn't fit the schema, the inference fails before it ever touches our disk.

The Code: Mapping the Schema
By defining our content as a tree of nodes rather than a string, we ensure 1:1 compatibility between the AI, the DB, and the Editor.

TypeScript
// A simplified look at our content schema
import { z } from 'zod';

// Recursive schemas need an explicit type annotation in Zod,
// so the node type is declared up front instead of via z.infer.
type ContentBlock = {
  type: 'heading' | 'paragraph' | 'image' | 'codeBlock';
  attrs?: Record<string, unknown>;
  content?: ContentBlock[];
  text?: string;
};

const NextBlockNodeSchema: z.ZodType<ContentBlock> = z.object({
  type: z.enum(['heading', 'paragraph', 'image', 'codeBlock']),
  attrs: z.record(z.any()).optional(),
  content: z.array(z.lazy(() => NextBlockNodeSchema)).optional(),
  text: z.string().optional(),
});
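Concretely, a short post under this schema is just nested data. Here is an illustrative (hypothetical, not actual NextBlock output) payload, exactly as it would sit in the JSONB column:

```typescript
// A hypothetical article body: an array of nodes, no markup anywhere.
const article = [
  { type: 'heading', attrs: { level: 2 }, text: 'Why JSONB?' },
  { type: 'paragraph', text: 'No sanitizer, no parser, no innerHTML.' },
  {
    type: 'codeBlock',
    attrs: { language: 'sql' },
    text: "SELECT body->0->>'text' FROM posts;",
  },
];
```

Because it is plain JSON, the same value can be queried with PostgreSQL's `->`/`->>` operators, diffed, or handed back to the model, with no tag-stripping anywhere.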
Why Next.js 16 Loves JSONB
The move to JSONB wasn't just about data integrity; it was a pure performance play for the Next.js 16 ecosystem:

Native RSC Rendering: Since the content is already a JSON object, we map it directly to React components. No dangerouslySetInnerHTML, no hydration mismatches, and no intermediate parsing overhead.

Turbopack & use cache: Standardized JSON allows for highly predictable caching. Using the new Next.js 16 use cache directive, we can cache AI-generated blocks at the edge with surgical precision.

Perfect Lighthouse Scores: By eliminating the "sanitization tax" and parsing lag, NextBlock maintains its 100/100 performance guarantee, even on pages heavily populated by AI-generated modules.
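To make the RSC point concrete, here is a dependency-free sketch of that mapping. In a real app each branch would return a React element; plain `{ tag, props, children }` descriptors stand in here (an assumption) to keep the snippet self-contained.

```typescript
type Block = {
  type: 'heading' | 'paragraph' | 'image' | 'codeBlock';
  attrs?: Record<string, unknown>;
  content?: Block[];
  text?: string;
};

type El = {
  tag: string;
  props: Record<string, unknown>;
  children: (El | string)[];
};

// One exhaustive switch replaces the sanitize-parse-hydrate pipeline:
// each node type maps to exactly one element, so nothing unexpected renders.
function renderNode(node: Block): El {
  const children: (El | string)[] = [
    ...(node.text ? [node.text] : []),
    ...(node.content ?? []).map(renderNode),
  ];
  switch (node.type) {
    case 'heading':
      return { tag: `h${node.attrs?.level ?? 2}`, props: {}, children };
    case 'paragraph':
      return { tag: 'p', props: {}, children };
    case 'image':
      return { tag: 'img', props: node.attrs ?? {}, children: [] };
    case 'codeBlock':
      return { tag: 'pre', props: {}, children };
  }
}
```

Because the tree is data, not markup, the mapping runs entirely in Server Components and the output is deterministic for a given payload, which is what makes the caching story above work.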

Conclusion: The Schema is the Product
In the era of AI-native applications, "the schema is the product." By moving away from legacy HTML and embracing strict JSONB, we’ve ensured that NextBlock isn't just faster—it’s smarter. Our data is readable by humans, renderable by React, and queryable by future AI agents without the mess of regex or tag-stripping.

NextBlock is currently in early access. If you're tired of "Frankenstein" WordPress architectures and want a CMS built for the 2026 tech stack, check us out on GitHub.
