KristinZ

Posted on Jun 8 • Edited on Jun 23

From React Developer to AI Engineer: What Actually Changes

#typescript #ai #webdev #career

A few years ago my manager announced the company was going all-in on AI. Everyone nodded. After the meeting ended, a few of us gathered in the hallway and exchanged a look: who's actually going to do this?

The problem wasn't motivation. The problem was the wall you hit the moment you sit down to figure out how. AI development seemed to be Python territory. Our stack was TypeScript. Nobody on the team wrote Python, and hiring someone who did felt like admitting defeat. So "AI transformation" became a topic that got raised repeatedly and shelved just as repeatedly.

I started digging into it myself. Not out of any particular sense of mission — just the feeling that this was coming whether I was ready or not.

What I found surprised me.

The Insight That Changed Everything

After reading through a lot of material and running experiments, I arrived at one insight that reframed everything:

For most applications, AI development is fundamentally about calling APIs.

Training a frontier-scale model yourself? That's millions of dollars in compute, beyond the reach of most companies. Self-hosting open-source models? Possible, but operationally complex. The practical path for building AI products is to call the APIs that OpenAI, Anthropic, and Google expose. They've wrapped their best models into HTTP endpoints, billed by token.

And calling HTTP endpoints, processing JSON, handling async streams — that's exactly what frontend developers do every day.

Python can do this. So can TypeScript. For someone already at home in the frontend ecosystem, the switching cost is nearly zero.
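To make the "it's just an HTTP call" point concrete, here is a minimal sketch. It assumes an OpenAI-style chat completions request and response shape (`messages` in, `choices[0].message.content` out); other providers use different field names. `chatOnce`, `PostFn`, the endpoint URL and the model name are all placeholders of mine, not part of any SDK.

```typescript
// Minimal shape of an OpenAI-style chat exchange (an assumption of this sketch).
type ChatTurn = { role: 'system' | 'user' | 'assistant'; content: string };

// The subset of fetch() this sketch needs, so a fake can be swapped in for tests.
type PostFn = (
  url: string,
  init: { method: 'POST'; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function chatOnce(
  post: PostFn,
  endpoint: string, // hypothetical: your provider's chat endpoint URL
  apiKey: string,
  model: string, // hypothetical: whichever model name your provider documents
  turns: ChatTurn[],
): Promise<string> {
  const res = await post(endpoint, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${apiKey}` },
    body: JSON.stringify({ model, messages: turns }),
  });
  if (!res.ok) throw new Error(`LLM request failed with HTTP ${res.status}`);

  // Treat the body as untrusted JSON and dig out the first choice's text.
  const data = (await res.json()) as { choices?: { message?: { content?: unknown } }[] };
  const content = data.choices?.[0]?.message?.content;
  if (typeof content !== 'string') throw new Error('Unexpected response shape');
  return content;
}
```

In a browser or a recent Node.js version, the built-in `fetch` should fit the `PostFn` shape, so nothing here goes beyond what a frontend developer already writes for any JSON API.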

More than that: TypeScript has a genuine advantage in AI application development. LLM structured outputs need strict validation — Zod pairs with TypeScript seamlessly. Sharing types between frontend and backend eliminates friction when the LLM response needs to drive UI rendering. This isn't a workaround. It's a good fit.

What You Actually Need to Learn

The shift from frontend to AI engineering involves five new areas. Here's an honest take on each one.

1. How LLMs Work (Less Than You Think)

You don't need to understand the mathematics of transformers to build AI applications. What you do need to understand:

Context windows. Everything you send to the LLM (system prompt, conversation history, retrieved documents) competes for space in a fixed-size context window. Managing what's in that window is a large part of AI engineering. A chat application that never trims its history will eventually break: depending on the provider and SDK, either the request is rejected for exceeding the context limit or older messages are dropped without warning.
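One simple way to manage the window is to keep the system prompt and then keep as many of the most recent turns as fit in a token budget. The sketch below does that. `roughTokenCount` is a crude stand-in I made up (about four characters per token), not a real tokenizer, so treat the numbers as approximate.

```typescript
type HistoryTurn = { role: 'system' | 'user' | 'assistant'; content: string };

// Rough heuristic, NOT a real tokenizer: about 4 characters per token for English text.
function roughTokenCount(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep every system turn, then add turns from newest to oldest until the budget is spent.
function trimHistory(turns: HistoryTurn[], maxTokens: number): HistoryTurn[] {
  const system = turns.filter((t) => t.role === 'system');
  const rest = turns.filter((t) => t.role !== 'system');

  let used = system.reduce((sum, t) => sum + roughTokenCount(t.content), 0);
  const kept: HistoryTurn[] = [];
  for (let i = rest.length - 1; i >= 0; i--) {
    const turn = rest[i];
    if (turn === undefined) continue;
    const cost = roughTokenCount(turn.content);
    if (used + cost > maxTokens) break; // this turn and everything older is dropped
    kept.unshift(turn);
    used += cost;
  }
  return [...system, ...kept];
}
```

Dropping old turns is the bluntest option. Summarizing them into a single turn is a common alternative, at the price of an extra LLM call.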

Tokens, not characters. LLMs think in tokens. Depending on the tokenizer, "TypeScript" might be 1 token and "supercalifragilistic" might be 6. Pricing is per token, rate limits are usually expressed at least partly in tokens, and the context window is measured in tokens. You need an intuition for token counts even if you're not doing math constantly.

Temperature and sampling. Higher temperature = more creative (and less reliable) output. Lower temperature = more deterministic. For structured output (JSON responses, form filling) you want temperature near 0. For creative tasks you want it higher. This is a dial you'll be turning constantly.
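If you want a feel for what the dial does, this small sketch applies the standard softmax-with-temperature formula to some made-up scores for three candidate tokens. It illustrates the math only; real providers layer other sampling settings on top, and they commonly treat a temperature of 0 as "always pick the top token" rather than dividing by zero.

```typescript
// Convert raw model scores (logits) into probabilities at a given temperature.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  if (temperature <= 0) throw new Error('temperature must be > 0 in this sketch');
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

const toyLogits = [2.0, 1.0, 0.5]; // made-up scores for three candidate tokens
const sharpDistribution = softmaxWithTemperature(toyLogits, 0.2); // top token dominates
const flatDistribution = softmaxWithTemperature(toyLogits, 2.0); // choices are much closer
```

At low temperature nearly all the probability lands on the highest-scoring token, which is why structured output behaves better there.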

2. Prompt Engineering (More Engineering Than Art)

Prompt engineering has a bad reputation as something vague and mystical. In practice it's more like writing a tight specification.

The things that actually work:

Be explicit about format. If you want JSON, say so and show an example. If you want a bullet list, say so. LLMs follow formatting instructions much more reliably than they follow vague requests.

Use XML-style delimiters. <context>, <user_input>, <instructions>: wrapping content in tags makes it unambiguous to both the LLM and to you. It also helps with prompt injection defense, although delimiters alone won't stop a determined attacker.
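Here is a minimal sketch of a prompt builder using those tags. The function names are mine, and the escaping step is an assumption about what a reasonable defense looks like: it stops user text from closing a tag early, which reduces the injection surface but does not eliminate it.

```typescript
// Escape angle brackets so untrusted text cannot open or close a delimiter tag.
function escapeForTag(text: string): string {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Instructions are trusted (written by you); context and user input are not.
function buildPrompt(instructions: string, context: string, userInput: string): string {
  return [
    `<instructions>\n${instructions}\n</instructions>`,
    `<context>\n${escapeForTag(context)}\n</context>`,
    `<user_input>\n${escapeForTag(userInput)}\n</user_input>`,
  ].join('\n\n');
}
```

With this in place, a user typing `</user_input>` ends up as literal escaped text inside the tag instead of terminating it.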

Few-shot examples beat long instructions. Showing the LLM two or three examples of input/output pairs works better than explaining what you want in prose. This is counterintuitive but consistent.
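With chat-style APIs, a common way to supply few-shot examples is as alternating user and assistant turns placed before the real input. This is a sketch of that layout; `buildFewShotTurns` and its types are names I made up for illustration.

```typescript
type FewShotTurn = { role: 'system' | 'user' | 'assistant'; content: string };
type FewShotExample = { input: string; output: string };

// Lay out: one system turn, then each example as a user/assistant pair, then the real input.
function buildFewShotTurns(
  task: string,
  examples: FewShotExample[],
  input: string,
): FewShotTurn[] {
  const turns: FewShotTurn[] = [{ role: 'system', content: task }];
  for (const ex of examples) {
    turns.push({ role: 'user', content: ex.input });
    turns.push({ role: 'assistant', content: ex.output });
  }
  turns.push({ role: 'user', content: input });
  return turns;
}
```

The model sees the examples as if it had already answered them that way, so the final answer tends to follow the same format.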

Validate structured output with Zod. When you need the LLM to return JSON with a specific shape, define a Zod schema and use it to validate the response. When validation fails, retry with the error message. This loop is surprisingly reliable.

import { z } from 'zod';

const ResponseSchema = z.object({
  sentiment: z.enum(['positive', 'negative', 'neutral']),
  confidence: z.number().min(0).max(1),
  summary: z.string().max(200),
});

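// callLLM is a placeholder for your provider call: prompt in, raw text out.
async function analyzeSentiment(text: string) {
  const response = await callLLM(`
Analyze the sentiment of this text. Respond with JSON matching this schema:
{ "sentiment": "positive"|"negative"|"neutral", "confidence": 0-1, "summary": "brief explanation" }

Text: <text>${text}</text>
`);

  // JSON.parse throws on malformed JSON and parse() throws a ZodError on a
  // schema mismatch; catch either one to drive the retry described above.
  return ResponseSchema.parse(JSON.parse(response));
}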
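The retry-with-error-message loop can be sketched independently of any particular validation library. In the version below, `generateValidated`, `Validation` and `validateSentiment` are hypothetical names, and the hand-written validator stands in for a Zod schema so the example has no dependencies. With Zod you would wrap `ResponseSchema.safeParse(...)` and map its result into the same shape.

```typescript
type Validation<T> = { ok: true; value: T } | { ok: false; error: string };

async function generateValidated<T>(
  callModel: (prompt: string) => Promise<string>, // placeholder for your LLM call
  basePrompt: string,
  validate: (raw: string) => Validation<T>,
  maxAttempts = 3,
): Promise<T> {
  let prompt = basePrompt;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await callModel(prompt);
    const result = validate(raw);
    if (result.ok) return result.value;
    // Feed the failure back so the model can correct itself on the next try.
    prompt = `${basePrompt}\n\nYour previous reply was rejected: ${result.error}\nReply with valid JSON only.`;
  }
  throw new Error(`No valid response after ${maxAttempts} attempts`);
}

// Hand-written validator standing in for a schema library.
function validateSentiment(raw: string): Validation<{ sentiment: string }> {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === 'object' && parsed !== null && 'sentiment' in parsed) {
      const s = (parsed as { sentiment: unknown }).sentiment;
      if (s === 'positive' || s === 'negative' || s === 'neutral') {
        return { ok: true, value: { sentiment: s } };
      }
    }
    return { ok: false, error: 'sentiment must be positive, negative, or neutral' };
  } catch {
    return { ok: false, error: 'response was not valid JSON' };
  }
}
```

Capping the attempts matters: every retry costs tokens, and a prompt that fails three times in a row usually needs fixing rather than a fourth try.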

3. RAG (The Core Pattern for Real Products)

Vector search and RAG (Retrieval-Augmented Generation) sound intimidating, but the core pattern is straightforward:

When documents are ingested, split them into chunks and convert each chunk to a vector embedding (a high-dimensional array of numbers)
Store those vectors in a database that supports similarity search (pgvector if you're using PostgreSQL)
When a user asks a question, convert the question to a vector and find the most similar chunks
Inject those chunks into the LLM's context and ask it to answer based on them

This is how you give an LLM knowledge of your company's internal documents, your product catalog, or any other data that didn't exist when the LLM was trained.
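The four steps fit in a few dozen lines if you swap the real pieces for toys. In this sketch `toyEmbed` is a deliberately crude stand-in I wrote (plain word counts held in a `Map`), and an in-memory array replaces the vector database. A real embedding is a dense `number[]` returned by an embedding model, but the ingest, retrieve and inject flow has the same shape.

```typescript
type SparseVector = Map<string, number>;
type DocChunk = { id: string; text: string; vector: SparseVector };

// Toy stand-in for an embedding model: word counts instead of learned vectors.
function toyEmbed(text: string): SparseVector {
  const counts: SparseVector = new Map();
  const words: string[] = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  words.forEach((word) => counts.set(word, (counts.get(word) ?? 0) + 1));
  return counts;
}

function cosineSimilarity(a: SparseVector, b: SparseVector): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((value, word) => {
    dot += value * (b.get(word) ?? 0);
    normA += value * value;
  });
  b.forEach((value) => {
    normB += value * value;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Steps 1 and 2: split into chunks (one per paragraph here), embed, and "store" them.
function ingest(docId: string, text: string): DocChunk[] {
  return text
    .split(/\n\s*\n/)
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part, i) => ({ id: `${docId}#${i}`, text: part, vector: toyEmbed(part) }));
}

// Step 3: embed the question and return the k most similar chunks.
function retrieve(question: string, chunks: DocChunk[], k: number): DocChunk[] {
  const q = toyEmbed(question);
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(q, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Step 4: inject the retrieved chunks into the prompt.
function buildRagPrompt(question: string, chunks: DocChunk[]): string {
  const context = chunks.map((c) => `[${c.id}] ${c.text}`).join('\n');
  return `Answer using only the context below.\n\n<context>\n${context}\n</context>\n\nQuestion: ${question}`;
}
```

Word counts only match exact words, which is the opposite weakness from real embeddings. That is part of why production systems end up combining both kinds of signal.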

The gap between a working RAG demo and a production RAG system is mostly about precision. Pure vector search has a blind spot for exact terms — proper nouns, version numbers, model names. Adding BM25 keyword search alongside vector search (hybrid retrieval) fixes most of this. Adding a reranker on top improves precision further.
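Hybrid retrieval needs a way to merge two ranked lists whose scores aren't comparable. The article doesn't prescribe a method; one widely used option is Reciprocal Rank Fusion (RRF), sketched below. It assumes you already have the vector ranking and the keyword ranking as ordered lists of document ids, and it uses the customary constant `k = 60`, which you may want to tune.

```typescript
// Reciprocal Rank Fusion: every list contributes 1 / (k + rank) for each id it contains,
// where rank starts at 1 for the top result. Ids found by both lists accumulate score.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  const fused: { id: string; score: number }[] = [];
  scores.forEach((score, id) => {
    fused.push({ id, score });
  });
  return fused.sort((a, b) => b.score - a.score).map((entry) => entry.id);
}

// Hypothetical rankings for the same query from the two retrievers.
const vectorRanking = ['a', 'b', 'c'];
const keywordRanking = ['c', 'a', 'd'];
const fusedRanking = reciprocalRankFusion([vectorRanking, keywordRanking]);
```

Because RRF only looks at positions, it sidesteps the problem of normalizing BM25 scores against cosine similarities. A reranker would then reorder the top of this fused list.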

For TypeScript developers: pgvector works with any PostgreSQL client you already know. Embeddings are just arrays. The "scary" parts of this stack are mostly familiar infrastructure.
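As a sketch of how familiar this looks, the helper below builds a parameterized similarity query. The `<=>` operator is pgvector's cosine distance, and a vector can be passed as a text literal such as `[0.1,0.2]` and cast to `vector`. The table and column names (`doc_chunks`, `content`, `embedding`) are hypothetical, and the `{ text, values }` shape is chosen to resemble what PostgreSQL clients such as node-postgres accept, so adapt it to the client you use.

```typescript
// pgvector accepts vectors as text literals like '[0.1,0.2,0.3]'.
function toVectorLiteral(embedding: number[]): string {
  if (embedding.some((n) => !isFinite(n))) {
    throw new Error('embedding contains a non-finite number');
  }
  return `[${embedding.join(',')}]`;
}

// Table and column names here are hypothetical; $1 and $2 are bound parameters.
function buildSimilarityQuery(
  embedding: number[],
  limit: number,
): { text: string; values: [string, number] } {
  return {
    text: [
      'SELECT id, content, embedding <=> $1::vector AS distance',
      'FROM doc_chunks',
      'ORDER BY embedding <=> $1::vector',
      'LIMIT $2',
    ].join('\n'),
    values: [toVectorLiteral(embedding), limit],
  };
}
```

Smaller distance means more similar, so the default ascending `ORDER BY` already returns the closest chunks first.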

4. Agents (Where It Gets Interesting)

An agent is an LLM that can take actions (call functions, read files, search the web) and reason about what to do next based on the results.

The core loop:

LLM receives a goal and a list of available tools
LLM decides which tool to call and with what arguments
Tool executes, result goes back to LLM
LLM decides next step (another tool call, or final answer)

This is called the ReAct loop (Reasoning + Acting). In TypeScript:

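// Tool, Message, callLLM, buildSystemPrompt and executeTool are placeholders
// for your own types and helpers.
async function runAgent(goal: string, tools: Tool[]): Promise<string> {
  const messages: Message[] = [
    { role: 'system', content: buildSystemPrompt(tools) },
    { role: 'user', content: goal },
  ];

  // Unbounded for clarity; production code should cap the number of iterations.
  while (true) {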
    const response = await callLLM(messages);

    if (response.type === 'answer') {
      return response.text; // Done
    }

    // Execute the tool the LLM requested
    const result = await executeTool(response.toolName, response.args, tools);

    // Add both the tool call and result to the conversation
    messages.push({ role: 'assistant', content: response.raw });
    messages.push({ role: 'tool', content: result });

    // Loop: LLM will reason about the result and decide next step
  }
}

The hard parts of agent development aren't the code. They're reliability (agents fail in non-obvious ways), handling errors gracefully (tools fail, LLMs make wrong decisions), and knowing when to stop (infinite loops are a real problem). These are engineering problems, not AI research problems.
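Two of those problems have cheap first-line defenses: cap the number of steps, and turn tool failures into observations the model can react to instead of exceptions that kill the loop. The sketch below shows both. Everything in it (`runBoundedAgent`, `AgentTool`, `AgentStep`, the `decide` callback that stands in for the LLM call) is a hypothetical name for illustration.

```typescript
type AgentTool = { name: string; run: (args: string) => Promise<string> };
type AgentStep =
  | { type: 'answer'; text: string }
  | { type: 'tool'; toolName: string; args: string };
type AgentTurn = { role: 'user' | 'assistant' | 'tool'; content: string };

async function runBoundedAgent(
  goal: string,
  tools: AgentTool[],
  decide: (turns: AgentTurn[]) => Promise<AgentStep>, // placeholder for the LLM call
  maxSteps = 5,
): Promise<string> {
  const turns: AgentTurn[] = [{ role: 'user', content: goal }];

  for (let step = 0; step < maxSteps; step++) {
    const next = await decide(turns);
    if (next.type === 'answer') return next.text;

    turns.push({ role: 'assistant', content: `call ${next.toolName}(${next.args})` });
    const tool = tools.find((t) => t.name === next.toolName);
    let observation: string;
    try {
      if (!tool) throw new Error(`unknown tool "${next.toolName}"`);
      observation = await tool.run(next.args);
    } catch (err) {
      // Report the failure to the model instead of crashing the loop.
      observation = `ERROR: ${err instanceof Error ? err.message : String(err)}`;
    }
    turns.push({ role: 'tool', content: observation });
  }

  // Knowing when to stop: give up loudly rather than loop forever.
  throw new Error(`Agent stopped after ${maxSteps} steps without an answer`);
}
```

A step cap is blunt, and a token or wall-clock budget can sit alongside it. The point is that the loop always terminates and the caller always finds out why.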

5. The Production Gap

The gap between a working demo and a production AI application is mostly these four things:

Observability. What was the prompt? What was the output? How many tokens? You need to record this for every LLM call. Traditional application monitoring tools aren't designed for LLMs. Use a purpose-built tool like Langfuse.

Cost control. LLMs are billed by token. A multi-turn agent session can consume tens of thousands of tokens. Without rate limiting and quotas, a single user can exhaust your monthly budget.
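A per-user quota can start as something very small. This sketch keeps token usage in memory and refuses requests that would push a user over the limit. `TokenBudget` is a made-up class, and the in-memory `Map` is an assumption that only holds for a single process; with several instances you would keep the counters in a shared store and reset them per billing period.

```typescript
class TokenBudget {
  private used = new Map<string, number>();
  private readonly limitPerUser: number;

  constructor(limitPerUser: number) {
    this.limitPerUser = limitPerUser;
  }

  // Returns false (and records nothing) if the request would exceed the user's quota.
  tryConsume(userId: string, tokens: number): boolean {
    const current = this.used.get(userId) ?? 0;
    if (current + tokens > this.limitPerUser) return false;
    this.used.set(userId, current + tokens);
    return true;
  }

  remaining(userId: string): number {
    return this.limitPerUser - (this.used.get(userId) ?? 0);
  }
}
```

Checking the budget before each LLM call, using an estimate of the prompt size plus the maximum output you allow, keeps a runaway agent session from turning into a surprise invoice.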

Security. User input goes directly into prompts. Prompt injection is a real attack class. You need input validation, structured prompt design, and output validation.

Streaming. Users expect to see output as it generates — waiting 10 seconds for a complete response feels broken even when it's correct. SSE (Server-Sent Events) over HTTP is the natural fit; it's the same pattern you'd use for any server-to-client push.
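On the client side, the fiddly part of SSE is that network chunks don't line up with event boundaries. Here is a minimal incremental parser as a sketch: it buffers text, splits on the blank line that ends each event, and returns the `data:` payloads. `SseParser` is my own name for it, and it ignores the `event:`, `id:` and `retry:` fields as well as bare carriage-return line endings, which a complete implementation would handle.

```typescript
// Feed raw text chunks as they arrive; get back the data payload of each completed event.
class SseParser {
  private buffer = '';

  push(chunk: string): string[] {
    // Normalize CRLF after appending, so a pair split across two chunks is still caught.
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, '\n');
    const events: string[] = [];

    let boundary = this.buffer.indexOf('\n\n');
    while (boundary !== -1) {
      const rawEvent = this.buffer.slice(0, boundary);
      this.buffer = this.buffer.slice(boundary + 2);

      const dataLines = rawEvent
        .split('\n')
        .filter((line) => line.startsWith('data:'))
        .map((line) => line.slice(5).replace(/^ /, '')); // drop one optional leading space
      if (dataLines.length > 0) events.push(dataLines.join('\n'));

      boundary = this.buffer.indexOf('\n\n');
    }
    return events;
  }
}
```

Each returned payload is typically a small JSON document carrying the next piece of text, so the UI can append it as soon as it arrives.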

The Mental Model Shift

The biggest shift isn't technical. It's accepting that LLMs are non-deterministic.

In most of the software you've written, calling a function with the same inputs gives the same outputs. LLMs don't work that way. The same prompt can produce different responses on different calls. Sometimes those responses are wrong. Sometimes they're confidently wrong.

This changes how you test and how you design. You can't write a unit test that asserts exact output. You test for properties: does the response contain the required fields? Is it within the expected length? Does it pass the validation schema? You build retry logic. You design graceful degradation.

After spending a career where code either works or it doesn't, this is genuinely uncomfortable at first. It gets easier once you stop treating LLM responses like deterministic functions and start treating them like network calls to a smart but unreliable service — you'd never write code that crashes if a network request returns unexpected content.

What Hasn't Changed

The tooling you know still applies. TypeScript's type system. Node.js's async model. PostgreSQL. Docker. REST APIs. CI/CD pipelines.

The patterns you know still apply. Input validation. Error handling. Rate limiting. Caching. Logging.

AI development adds a new category of component — the LLM — with its own quirks and failure modes. But the surrounding infrastructure is the same infrastructure you've been building for years.

The Python wall that stopped my team wasn't real. We had everything we needed.

This article is adapted from the preface and opening chapters of From Frontend to AI Engineering: A Practical Guide to AI Agents, RAG, MCP Servers and LLM Apps in TypeScript (also available on Leanpub), written for frontend and full-stack developers.

Top comments (3)

Hemapriya Kanagala • Jun 23

I think a lot of people still assume AI engineering means jumping straight into Python and ML research, so I liked the point that for most applications you're really building around APIs, data, and workflows.

Thanks for sharing this 😄