Arish singh

Posted on Jul 4

How I Built Velocity — An AI Platform That Turns Plain English Into Production-Grade n8n Workflows

#ai #automation #nextjs #showdev

You describe the automation. Velocity ships the workflow — importable, deployable, running in your n8n instance one click later.

Every automation builder has been there. You know exactly what you want: "when a form is submitted, enrich the lead, drop it in a sheet, and ping Slack." Then you spend the next two hours in the n8n canvas hunting for the right node, guessing parameter names, wiring connections, and debugging "Could not find property option" errors on import.

The knowledge to build that workflow exists. It's just locked behind hundreds of node types, typeVersion quirks, and expression syntax. So I built Velocity — an AI platform that takes a plain-English description and generates a complete, valid, importable n8n workflow JSON. Not a 4-node demo. A production-grade pipeline with real triggers, configured parameters, error notifications, and sticky-note documentation — the depth of an actual published n8n template.

What I Built

Velocity is a full-stack AI automation copilot. The core loop:

Describe your automation in plain English → Gemini generates the full n8n workflow JSON
Chat with the copilot → iterate on the workflow conversationally, with full history
One-click deploy → Velocity pushes the workflow straight into your n8n instance via its REST API and activates it
Paste an error log → get a structured cause / fix / patch explanation
Browse real templates → proxied live from the official n8n template library

Tech stack:

Next.js 16 (App Router, Turbopack) + React 19
Google Gemini (raw REST API — no SDK) with a multi-model failover chain
Supabase (auth + Postgres with Row Level Security)
Redis (chat history hot cache + distributed rate limiting)
n8n public REST API for one-click deployment
Tailwind CSS v4 + Base UI + Motion (Framer Motion v12)
Deployed on Vercel

System Architecture

                        ┌─────────────────────────────┐
                        │        Next.js 16 App        │
                        │                             │
 Browser ──────────────▶│  /api/chat ─────────┐       │
   │                    │  /api/generate-workflow ─┐  │      ┌──────────────┐
   │  Supabase JWT      │  /api/explain-error ──┐  │  │      │    Gemini    │
   │  (Authorization    │  /api/analyze-workflow│  │  ├─────▶│  REST API    │
   │   header)          │  /api/plan-workflow ──┘  │  │      │ + fallback   │
   │                    │                          │  │      │   models     │
   │                    │  /api/deploy-to-n8n ─────┼──┼──┐   └──────────────┘
   │                    │  /api/templates ─────────┼──┼──┼─▶ api.n8n.io
   │                    │  /api/conversations ──┐  │  │  │
   │                    └───────────────────────┼──┼──┼──┼──┘
   │                                            │  │  │  │
   │                          ┌─────────────────▼──▼──┤  └─▶ Your n8n
   │                          │  Supabase Postgres    │      instance
   └─────────────────────────▶│  (RLS: auth.uid())    │      (create +
                              └───────────┬───────────┘       activate)
                                          │
                              ┌───────────▼───────────┐
                              │  Redis                │
                              │  · recent-turns cache │
                              │  · rate-limit counters│
                              └───────────────────────┘

Every AI route follows the same pipeline: verify JWT → rate limit → build prompt → call Gemini (with failover) → repair/parse JSON → persist → respond. Postgres is always the source of truth; Redis is always a cache that fails open.

How I Built It

The System Prompt Is the Product

The single hardest problem wasn't code; it was getting the model to emit workflow JSON that n8n will actually import. n8n is unforgiving: a wrong typeVersion loads a different parameter schema, a hallucinated dropdown value throws "Could not find property option", and referencing a workflow that doesn't exist throws "Could not find workflow".

The fix was treating the system prompt like a schema contract, not a vibe. A few of the rules that came directly from watching real imports fail:

IMPORT SAFETY — these prevent the two most common load errors:
- typeVersion: use 1 for simple core nodes unless you specifically need a
  newer version's fields. A wrong typeVersion loads a different parameter
  schema and breaks with "Could not find property option".
- Do NOT emit resource-locator objects ({"__rl":true,"mode":...,"value":...}).
  Their mode/value pairs are the #1 cause of "Could not find property option".
- NEVER reference another workflow you don't have a real id for.
- "connections" keys are the SOURCE node's NAME (exact, case-sensitive) —
  never its id.

Each of those lines exists because a generated workflow broke without it. The prompt also encodes verified parameter shapes for the most common nodes (HTTP Request, Set, If, Webhook, Schedule Trigger), because "parameters": {} produces workflows that import but do nothing:

- HTTP Request (POST): {"method":"POST","url":"...","authentication":"none",
  "sendBody":true,"body":{"contentType":"json","content":{...}}}
  GOTCHA: POST/PUT/PATCH MUST include "sendBody":true.

And the killer rule for accuracy: when a service has no dedicated n8n node, use a fully-configured HTTP Request node against its REST API instead of guessing. A configured httpRequest beats an empty branded stub every time.
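Two of the import-safety rules above (no resource-locator objects, connection keys must be node names) are mechanical enough to check after generation. Here's a minimal sketch of such a check, not Velocity's actual code: the WfNode and Workflow shapes and the lintWorkflow name are assumptions for illustration, and a check like this only detects a violation, it doesn't fix one.

```typescript
// Hypothetical minimal shapes; real n8n workflows carry more fields.
interface WfNode {
  id?: string;
  name: string;
  type: string;
  typeVersion?: number;
  parameters?: unknown;
}
interface Workflow {
  nodes: WfNode[];
  connections: Record<string, unknown>;
}

// Recursively look for resource-locator objects ({"__rl": true, ...}).
function containsResourceLocator(value: unknown): boolean {
  if (Array.isArray(value)) return value.some(containsResourceLocator);
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    if (obj["__rl"] === true) return true;
    return Object.values(obj).some(containsResourceLocator);
  }
  return false;
}

// Returns human-readable problems; an empty array means neither rule was violated.
function lintWorkflow(wf: Workflow): string[] {
  const problems: string[] = [];
  const names = new Set(wf.nodes.map((n) => n.name));

  for (const node of wf.nodes) {
    if (containsResourceLocator(node.parameters)) {
      problems.push(`node "${node.name}" uses a resource-locator object`);
    }
  }
  // "connections" keys must be the source node's exact name, never its id.
  for (const source of Object.keys(wf.connections)) {
    if (!names.has(source)) {
      problems.push(`connections key "${source}" is not a node name`);
    }
  }
  return problems;
}
```

A non-empty result could be fed back to the model as a retry hint, or simply shown to the user before import.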

One Contract, Two Routes — Killing Prompt Drift

Velocity has two AI surfaces that both emit n8n JSON: the conversational copilot (/api/chat) and the structured one-shot generator (/api/generate-workflow). Early on they had separate prompts, and they drifted — a rule fixed in one route would still break in the other.

The fix was boring and effective: the n8n contract lives in exactly one file, and both routes compose their system prompts from it.

// n8nPrompt.ts — the single source of truth
export const N8N_WORKFLOW_SHAPE = `An n8n workflow is a JSON object that imports cleanly: {...}`;
export const N8N_RULES = `RULES — follow every one: ...`;

// Conversational copilot (used by /api/chat)
export const CHAT_SYSTEM = `You are Velocity, an AI copilot...
${N8N_WORKFLOW_SHAPE}
${N8N_RULES}
...`;

// Structured one-shot (used by /api/generate-workflow)
export const GENERATE_SYSTEM = `You are Velocity, an expert at building n8n workflows...
${N8N_WORKFLOW_SHAPE}
${N8N_RULES}`;

If you have two prompts encoding the same output schema, they will drift apart. Shared constants are the cheapest insurance you'll ever buy.

Surviving the Free Tier: A Model Failover Chain

I run this on Gemini's free tier, and the preview model I use (gemini-3-flash-preview) has an allowance of 20 requests per day. That's not a rate limit; it's a countdown timer. Rather than let the app die at request 21, the Gemini wrapper detects quota-exhausted 429s and transparently fails over to stable models that still have quota:

export const GEMINI_MODEL = "gemini-3-flash-preview";
const FALLBACK_MODELS = ["gemini-2.5-flash", "gemini-2.0-flash"];

// Signals the primary/fallback loop to try the next model.
class QuotaExhaustedError extends Error {}

async function generate(contents, maxTokens, temperature, jsonMode = false) {
  const models = [GEMINI_MODEL, ...FALLBACK_MODELS];
  let lastQuota: QuotaExhaustedError | null = null;

  for (const model of models) {
    try {
      return await generateWithModel(model, contents, maxTokens, temperature, jsonMode);
    } catch (err) {
      // Only a spent quota triggers failover; every other error is terminal.
      if (err instanceof QuotaExhaustedError) { lastQuota = err; continue; }
      throw err;
    }
  }
  throw lastQuota ?? new Error(QUOTA_MESSAGE);
}

The subtle part is distinguishing the two kinds of 429. A regular rate-limit 429 recovers if you back off; a quota 429 will not clear until tomorrow, so retrying is pure waste. The tell is in the response body:

// A 429 whose body mentions quota/billing (or "limit: 0") means the key's
// free-tier allowance is exhausted — retrying won't help.
function isQuotaExhausted(body: string): boolean {
  return /\blimit:\s*0\b|exceeded your current quota|billing/i.test(body);
}

Transient errors (500/502/503/504 and recoverable 429s) get exponential backoff with jitter — and the wrapper honors the API's own Retry-After header and Gemini's RetryInfo body when they suggest a delay. If the suggested delay is longer than the request budget, it gives up early with a human-readable message instead of hanging.
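The delay decision described above can be sketched as a pure function. This is an illustration under stated assumptions, not the wrapper's real code: the names retryAfterMs and nextDelayMs, the 500 ms base, and the 8 s cap are all made up here, only the seconds form of Retry-After is parsed, and the RetryInfo body is left out.

```typescript
// Parse a Retry-After header given in seconds. The HTTP-date form is ignored.
function retryAfterMs(header: string | null): number | null {
  if (header === null) return null;
  const trimmed = header.trim();
  if (trimmed === "") return null;
  const seconds = Number(trimmed);
  return Number.isFinite(seconds) && seconds >= 0 ? seconds * 1000 : null;
}

// Decide how long to wait before retry number `attempt` (0-based).
// Returns null when the wait would exceed the remaining request budget,
// which is the signal to give up early instead of hanging.
function nextDelayMs(
  attempt: number,
  retryAfter: string | null,
  remainingBudgetMs: number,
  random: () => number = Math.random,
): number | null {
  const BASE_MS = 500; // assumed base delay
  const CAP_MS = 8000; // assumed ceiling for the exponential term

  const suggested = retryAfterMs(retryAfter);
  // "Full jitter": a uniform draw from [0, min(cap, base * 2^attempt)).
  const ceiling = Math.min(CAP_MS, BASE_MS * Math.pow(2, attempt));
  const jittered = Math.floor(random() * ceiling);

  // A server-suggested delay wins over our own backoff.
  const delay = suggested !== null ? suggested : jittered;
  return delay > remainingBudgetMs ? null : delay;
}
```

Passing the random source as a parameter keeps the function deterministic in tests; callers would normally rely on the Math.random default.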

I also skipped the SDK entirely. The wrapper is a raw fetch against the Generative Language REST API — ~230 lines including all the retry/failover logic. When your error handling is your reliability story, owning the HTTP layer beats fighting an SDK's opinions.

Parsing JSON From a Model That Doesn't Always Return JSON

Even with responseMimeType: "application/json" forced, model output is often only almost JSON: markdown fences around the object, a stray sentence before it, trailing commas, literal newlines inside string values, and, worst of all, output truncated mid-object at the token limit.

A naive JSON.parse(raw.slice(raw.indexOf("{"), raw.lastIndexOf("}") + 1)) gets past the fences and the stray sentence, but it still fails on trailing commas, literal newlines, and truncated output. So I wrote parseModelJson, a repair parser that extracts the first balanced JSON value with a proper scanner (string-aware, escape-aware), fixes control characters, and, if the input was truncated, closes the open string and any open brackets so the result still parses:

// Reached the end with brackets still open → truncated. Best-effort close.
if (inString) out += '"';
out = removeTrailingCommas(out.replace(/,\s*$/, ""));
while (stack.length) out += stack.pop();
return out;

Then it tries candidates from cleanest to rawest, each with a trailing-comma-stripped retry:

for (const candidate of [extracted, stripped, raw]) {
  if (!candidate) continue;
  const direct = tryParse<T>(candidate);
  if (direct !== undefined) return direct;
  const repaired = tryParse<T>(removeTrailingCommas(candidate));
  if (repaired !== undefined) return repaired;
}
return null;
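The post only shows the tail of the scanner, so here is a minimal sketch of what a string-aware, escape-aware extractor with truncation repair could look like. It is a reconstruction under assumptions, not parseModelJson itself: the extractFirstJson name is made up, it starts at the first "{" or "[" it sees, and the best-effort close does not rescue every truncation point (input cut off right after a key or a colon still won't parse).

```typescript
// Extract the first balanced JSON object/array from `raw`. If the text ends
// while a value is still open (truncation), close it on a best-effort basis.
function extractFirstJson(raw: string): string | null {
  const start = raw.search(/[{[]/);
  if (start === -1) return null;

  const closers: string[] = []; // stack of pending "}" / "]"
  let inString = false;
  let escaped = false;
  let out = "";

  for (let i = start; i < raw.length; i++) {
    const ch = raw[i];
    if (inString) {
      if (escaped) { escaped = false; out += ch; }
      else if (ch === "\\") { escaped = true; out += ch; }
      else if (ch === '"') { inString = false; out += ch; }
      else if (ch === "\n") out += "\\n"; // literal newline inside a string
      else if (ch === "\r") out += "\\r";
      else if (ch === "\t") out += "\\t";
      else out += ch;
      continue;
    }
    out += ch;
    if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") {
      closers.pop();
      if (closers.length === 0) return out; // first balanced value is complete
    }
  }

  // Reached the end with brackets still open: the output was truncated.
  if (escaped) out = out.slice(0, -1); // drop a dangling backslash
  if (inString) out += '"';
  out = out.replace(/,\s*$/, "");
  while (closers.length) out += closers.pop();
  return out;
}
```

Because braces and brackets inside strings are skipped, a value like {"a":"}"} comes back whole instead of being cut at the first "}".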

The same philosophy shows up in the chat flow: when a user asks to deploy "the workflow from our conversation," a depth-tracking scanner walks the assistant messages newest-first, pulls out every balanced {...} slice, and keeps the ones that actually contain a nodes array. Fenced, unfenced, buried in prose — it finds the workflow.
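A depth-tracking scan like the one described above could look roughly like this. It is a sketch under assumptions: the ChatMessage shape and both function names are hypothetical, only top-level {...} slices are collected, and it returns the first matching slice from the newest assistant message rather than every candidate.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Collect every top-level balanced {...} slice in `text`.
// Quotes are only tracked inside an object, so stray quotes in prose are ignored.
function balancedObjectSlices(text: string): string[] {
  const slices: string[] = [];
  let depth = 0;
  let start = -1;
  let inString = false;
  let escaped = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"' && depth > 0) inString = true;
    else if (ch === "{") {
      if (depth === 0) start = i;
      depth++;
    } else if (ch === "}" && depth > 0) {
      depth--;
      if (depth === 0) slices.push(text.slice(start, i + 1));
    }
  }
  return slices;
}

// Walk assistant messages newest-first and return the first parsed object
// that has a `nodes` array, or null if the conversation contains none.
function findLatestWorkflow(messages: ChatMessage[]): Record<string, unknown> | null {
  for (const msg of [...messages].reverse()) {
    if (msg.role !== "assistant") continue;
    for (const slice of balancedObjectSlices(msg.content)) {
      try {
        const parsed: unknown = JSON.parse(slice);
        if (
          parsed !== null &&
          typeof parsed === "object" &&
          Array.isArray((parsed as Record<string, unknown>).nodes)
        ) {
          return parsed as Record<string, unknown>;
        }
      } catch {
        // Not valid JSON; keep scanning the remaining slices.
      }
    }
  }
  return null;
}
```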

Rule of thumb: never let JSON.parse be the last line of defense between an LLM and your user.

Auth: Let Postgres Enforce Ownership, Not Your Route Code

Every API route authenticates the same way: the browser sends its Supabase JWT in the Authorization header, and the server builds a request-scoped Supabase client that carries that token:

export async function getAuth(req: Request): Promise<AuthContext | null> {
  const token = bearerToken(req);
  if (!token) return null;

  // Request-scoped client carrying the user's JWT, so RLS (auth.uid())
  // applies to every query made through it.
  const db = createClient(url, anon, {
    global: { headers: { Authorization: `Bearer ${token}` } },
    auth: { autoRefreshToken: false, persistSession: false },
  });

  const { data, error } = await db.auth.getUser(token);
  if (error || !data.user) return null;
  return { userId: data.user.id, db };
}

The important choice: routes query through this client, not through the service-role admin client. That means Postgres Row Level Security (auth.uid()) applies to every single query. Even if I write a buggy WHERE clause, user A physically cannot read user B's conversations — the database refuses. The service-role key is used in exactly one place: the signup route, where admin user provisioning genuinely needs it.
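getAuth above calls a bearerToken helper that the post doesn't show. A plausible minimal version, offered as an assumption rather than the real implementation, reads the Authorization header and accepts only the "Bearer <token>" form. The parameter is typed structurally here so the sketch doesn't depend on a global Request type; a standard Request object satisfies it.

```typescript
// Anything with a Fetch-style headers.get(); a standard Request qualifies.
type HasHeaders = { headers: { get(name: string): string | null } };

// Return the token from "Authorization: Bearer <token>", or null if the
// header is missing or malformed.
function bearerToken(req: HasHeaders): string | null {
  const header = req.headers.get("authorization");
  if (!header) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  return match ? match[1] : null;
}
```

Returning null for anything malformed lets the route treat "no header" and "bad header" the same way: as an unauthenticated request.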

Redis: Always a Cache, Never the Truth

Chat needs the recent conversation history on every turn. Hitting Postgres for the last 20 messages on every message works, but Redis makes it instant — as long as you're disciplined about what Redis is:

// Prior history: Redis cache, falling back to Postgres (and repriming).
let history = await getCachedTurns(conversationId);
if (history === null) {
  const { data: rows } = await auth.db
    .from("messages")
    .select("role, content")
    .eq("conversation_id", conversationId)
    .order("created_at", { ascending: false })
    .limit(MAX_TURNS);
  history = ((rows ?? []) as Turn[]).reverse();
  await primeCache(conversationId, history);
}

Both turns are persisted to Postgres before the cache is updated. A cache miss, an expired key, a Redis outage — all of it degrades to a Postgres read and a reprime. Nothing is ever lost because nothing important ever lived only in Redis.

The rate limiter follows the same philosophy, except there is no source of truth to fall back to, so it fails open:

// Fails OPEN: if Redis is absent or errors, requests are allowed so chat
// keeps working — rate limiting is a safety layer, never a hard dependency.
export async function checkRateLimit(scope: string): Promise<RateResult> {
  const redis = getRedis();
  if (!redis) return { ok: true };
  ...
}

It runs two layered counters — a global one protecting the shared Gemini key's quota across all users, and a per-user one so a single caller can't hog the whole budget. The defaults (5/min, 20/day) deliberately mirror the Gemini free-tier quota, so the app's rate limit and the upstream quota fail at the same boundary with a friendly message instead of a raw 429.
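To make the two-layer idea concrete, here is a small fixed-window sketch. Several things in it are assumptions: CounterStore is a synchronous stand-in for Redis INCR (a real client would be async and would set a TTL on each key), the key format is invented, and USER_LIMITS is a hypothetical split, since the post only quotes the 5/min and 20/day defaults.

```typescript
// Stand-in for Redis INCR; returns the counter value after incrementing.
interface CounterStore {
  incr(key: string): number;
}
interface Limits {
  perMinute: number;
  perDay: number;
}
interface RateResult {
  ok: boolean;
  reason?: string;
}

const GLOBAL_LIMITS: Limits = { perMinute: 5, perDay: 20 }; // defaults quoted in the post
const USER_LIMITS: Limits = { perMinute: 2, perDay: 10 }; // hypothetical per-user values

// One counter per scope per time window, e.g. "rl:global:min:28333333".
function windowKey(scope: string, unit: "min" | "day", nowMs: number): string {
  const size = unit === "min" ? 60000 : 86400000;
  return `rl:${scope}:${unit}:${Math.floor(nowMs / size)}`;
}

function checkScope(
  store: CounterStore | null,
  scope: string,
  limits: Limits,
  nowMs: number,
): RateResult {
  if (!store) return { ok: true }; // Redis not configured: allow
  try {
    if (store.incr(windowKey(scope, "min", nowMs)) > limits.perMinute) {
      return { ok: false, reason: `${scope}: per-minute limit reached` };
    }
    if (store.incr(windowKey(scope, "day", nowMs)) > limits.perDay) {
      return { ok: false, reason: `${scope}: per-day limit reached` };
    }
    return { ok: true };
  } catch {
    return { ok: true }; // Redis error: fail open
  }
}

// Per-user layer first, so a blocked user doesn't spend the shared budget.
function checkLayers(
  store: CounterStore | null,
  userId: string,
  nowMs: number = Date.now(),
): RateResult {
  const user = checkScope(store, `user:${userId}`, USER_LIMITS, nowMs);
  if (!user.ok) return user;
  return checkScope(store, "global", GLOBAL_LIMITS, nowMs);
}
```

Fixed windows are the simplest option and allow short bursts at window boundaries; a sliding window would be stricter at the cost of more Redis work per request.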

One-Click Deploy: Closing the Loop

Generating JSON the user has to manually import is a demo. Deploying it is a product. /api/deploy-to-n8n accepts either a plain-English prompt (generate first, then deploy) or existing workflow JSON, pushes it to the user's n8n instance via the public REST API, and tries to activate it:

const created = await createWorkflowInN8n(workflowDraft, fallbackName);
const activation = await activateWorkflowInN8n(created.workflowId);

return NextResponse.json({
  workflowId: created.workflowId,
  workflowUrl: created.workflowUrl,   // deep link into the n8n editor
  activated: activation.activated,
  activationError: activation.activated ? undefined : activation.detail,
});

Note the shape of the response: activation failure is not an error. A workflow with a manual trigger can't be activated — that's expected, so the route reports activated: false with the detail and still hands back the editor URL. Modeling "partial success" honestly beat forcing everything into success/failure.

Problems I Faced (aka Lessons Learned)

The hardest bugs were import bugs, not code bugs. A workflow can be perfectly valid JSON and still fail n8n's import with "Could not find property option". The causes were never in my code; they were in what the model invented: resource-locator objects, wrong typeVersions, dropdown values that don't exist. The fix was moving n8n's failure modes into the prompt as explicit prohibitions. Every import error became a new rule. The prompt is a changelog of everything that ever broke.

"Add more validation code" is usually the wrong first move. My instinct on every bad output was to write a post-processor. But post-processors can't fix a hallucinated parameter schema — they can only detect it. Restructuring the prompt (verified parameter shapes, "use httpRequest when unsure") fixed at the source what validation could only reject.

Free-tier quotas are an architecture constraint, not an ops detail. 20 requests/day on the primary model shaped real design decisions: the failover chain, the two-layer rate limiter mirroring the quota, distinguishing quota-429s from rate-429s, and honoring Retry-After. If I'd treated the quota as "someone else's problem," the app would be a coin flip.

LLMs leak their reasoning into output, and you have to tell them not to. Models love to open with "Okay, the user wants a workflow that..." before the JSON. Two mitigations: a DIRECTIVE suffix appended to every final user turn ("Output ONLY the final answer... Begin the answer immediately"), and hard constraints in the chat prompt banning meta-commentary, node-by-node prose dumps, and "here is the workflow" framing. Prompt discipline is output discipline.
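The first mitigation, appending a directive to the final user turn, is easy to sketch. The Turn shape and the withDirective name are assumptions, and the directive text is passed in as a parameter because the post only quotes it in part.

```typescript
interface Turn {
  role: "user" | "assistant";
  content: string;
}

// Return a copy of `turns` with `directive` appended to the last turn,
// but only when that last turn is the user's. The input is not mutated.
function withDirective(turns: Turn[], directive: string): Turn[] {
  const last = turns.length - 1;
  if (last < 0 || turns[last].role !== "user") return turns;
  return turns.map((turn, i) =>
    i === last ? { ...turn, content: `${turn.content}\n\n${directive}` } : turn,
  );
}
```

Appending at request time, rather than storing the suffix with the message, keeps the persisted history free of prompt scaffolding.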

Every external dependency needs a "not configured" story. Supabase clients return null when env vars are missing. hasGemini() gates every AI route. hasN8nConfig() gates deployment. Every route degrades to a clear "add X to .env.local" message instead of a crash. It made local development, demos, and partial deployments painless — you can run the marketing page with zero env vars.

Next.js 16 is not the Next.js in the model's training data. The repo has a standing rule: read the docs shipped in node_modules/next/dist/docs/ before writing framework code, because App Router conventions moved again. When a framework error looks undocumented, check your framework version first — the answer is almost always a recent breaking change.

What's Next

Velocity's core loop — describe → generate → deploy — works end-to-end. What I'm building next:

Workflow analysis & repair — paste any existing workflow and get a structured health report with auto-fixes (the /api/analyze-workflow and /api/explain-error routes are the foundation)
Plan-first generation — show the step plan and required services before generating, so users can edit the plan instead of the JSON
Execution monitoring — pull real run data from connected n8n instances into the dashboard
Template remixing — start from one of the official n8n templates and modify it conversationally

The core insight hasn't changed: automation knowledge shouldn't be locked behind a node catalog and a canvas. You describe the outcome; the machine should handle the wiring.

Built with Next.js 16, React 19, Google Gemini, Supabase, Redis, Tailwind CSS v4, and the n8n REST API.

If you've fought with LLM structured output, n8n imports, or free-tier quotas — I'd genuinely love to hear how you handled it. Drop a comment. 👇

— Arish Singh
