DEV Community

Harshdeep Singh
Harshdeep Singh

Posted on • Originally published at theharshdeepsingh.com

How to Build an AI Resume Builder with LangChain and Node.js

A few months back, my friend Marcus was applying for a senior backend role at a fintech company. He had five years of solid experience — distributed systems, AWS, the whole stack. But his resume read like a list of job descriptions someone had copied from LinkedIn. "Responsible for maintaining microservices." "Assisted with CI/CD pipeline implementation." You know the type.

I told him: the problem isn't what you did, it's how you're saying it. Hiring managers spend about six seconds on a resume before deciding whether to read it properly. Six seconds. And if those six seconds are spent reading "responsible for maintaining" — you've lost them.

We spent two hours rewriting it together. Every bullet point started with a strong verb. Every achievement had a number. "Reduced API response time by 40% by introducing Redis caching across three high-traffic endpoints." Much better. Marcus got the interview.

The obvious next thought was: what if you could automate this? Not in the "dump your resume into ChatGPT and ask it to make it better" way — that produces generic slop. I mean a real, structured AI pipeline that understands resume context, applies professional rewriting patterns, and returns clean, job-specific output.

That's what LangChain is built for. And in this guide, we're going to build exactly that: an AI-powered resume rewriter using LangChain and Node.js, with a real Express API, streaming responses, and the kind of prompt engineering that actually produces good results.

What Is LangChain, and Why Bother?

Here's the honest answer: LangChain is an orchestration framework for building applications on top of large language models. Think of it the way you'd think of Express.js — Express doesn't do anything you couldn't do with raw Node's http module, but it gives you a structured, composable way to build web apps that doesn't collapse under its own weight.

LangChain does the same thing for LLM applications. You could just call the OpenAI API directly everywhere. For a one-off script, that's fine. But as soon as your app grows — different prompts for different tasks, multi-step reasoning chains, memory across conversations — raw API calls get messy fast.

Here's what raw OpenAI API code looks like once a project grows:

// Raw OpenAI — works, but scales badly
const response = await openai.chat.completions.create({
  model: "gpt-4",
  messages: [
    { role: "system", content: systemPrompt },
    { role: "user", content: `Rewrite this section: ${section}` }
  ]
});
const rewritten = response.choices[0].message.content;

Enter fullscreen mode Exit fullscreen mode

That's fine for one call. Now add: prompt versioning, chaining that output into a second model call, memory from previous messages, fallback to a different model when rate limits hit, streaming output to the client. Suddenly you're managing a lot of state manually.

LangChain handles all of that with composable primitives: PromptTemplate for reusable, testable prompts; LLMChain for connecting a prompt to a model; SequentialChain for multi-step pipelines; built-in streaming support; and integrations with every major LLM provider.

For our resume builder, the chain looks like this: parse the resume into structured sections, run each section through a prompt that produces action-oriented bullet points, then return the assembled result. Let's build it.

What We're Building

Before we write a line of code, here's the system at a glance:

┌─────────────────────────────────────────────────────┐
│                   CLIENT (Frontend)                  │
│         POST /api/rewrite { resumeText, section }    │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│                  EXPRESS API (Node.js)               │
│  1. Validate input                                   │
│  2. Parse resume into sections                       │
│  3. Call LangChain rewrite chain                     │
│  4. Return improved bullet points                    │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│              LANGCHAIN REWRITE CHAIN                 │
│  PromptTemplate → ChatOpenAI (GPT-4) → Output       │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│                  OPENAI API (GPT-4)                  │
└─────────────────────────────────────────────────────┘

Enter fullscreen mode Exit fullscreen mode

Nothing revolutionary — but each layer has a single, testable job. The chain is the interesting part, so let's get there quickly.

Project Setup

Start a new Node.js project and install the dependencies:

mkdir resume-ai && cd resume-ai
npm init -y
npm install express langchain @langchain/openai @langchain/core dotenv

Enter fullscreen mode Exit fullscreen mode

Create a .env file at the root:

OPENAI_API_KEY=sk-your-key-here
PORT=3001

Enter fullscreen mode Exit fullscreen mode

And your project structure:

resume-ai/
├── src/
│   ├── parseResume.js
│   ├── resumeChain.js
│   └── app.js
├── .env
└── package.json

Enter fullscreen mode Exit fullscreen mode

Add "type": "module" to package.json so we can use ES module syntax throughout.

Step 1: Parsing the Resume

This is the unglamorous part that everyone skips, and it's why most AI resume tools produce garbage. You can't just throw 800 words of resume text at a model and ask it to "make it better." You need to isolate the section you're improving — otherwise the model is operating without context.

Here's a simple section parser. It's not perfect — real resumes come in dozens of formats — but it handles the common patterns:

// src/parseResume.js
export function parseResumeText(rawText) {
  const sections = {
    summary: "",
    experience: [],
    skills: [],
    education: [],
  };

  const sectionKeywords = {
    summary: ["summary", "objective", "profile", "about"],
    experience: ["experience", "employment", "work history", "career"],
    skills: ["skills", "technical skills", "technologies", "competencies"],
    education: ["education", "academic", "degree", "university"],
  };

  const lines = rawText.split("\n").filter((l) => l.trim().length > 0);
  let currentSection = null;

  for (const line of lines) {
    const lowerLine = line.toLowerCase().trim();

    const detected = Object.entries(sectionKeywords).find(([, keywords]) =>
      keywords.some((kw) => lowerLine.includes(kw))
    );

    if (detected && lowerLine.length  {
  const { resumeText, targetSection } = req.body;

  if (!resumeText || typeof resumeText !== "string") {
    return res.status(400).json({ error: "resumeText is required" });
  }
  if (!targetSection || typeof targetSection !== "string") {
    return res.status(400).json({ error: "targetSection is required" });
  }

  // Stay within token limits — GPT-4 context window is large,
  // but we don't need to send the whole resume every time.
  const resumeContext = resumeText.slice(0, 3000);

  try {
    const result = await rewriteChain.call({
      resumeContext,
      sectionText: targetSection,
    });

    res.json({
      original: targetSection,
      rewritten: result.text.trim(),
    });
  } catch (err) {
    console.error("Chain error:", err.message);

    if (err.message?.includes("Rate limit")) {
      return res.status(429).json({ error: "Rate limit hit. Try again in a moment." });
    }

    res.status(500).json({ error: "Rewrite failed. Check your OpenAI API key." });
  }
});

const PORT = process.env.PORT || 3001;
app.listen(PORT, () => console.log(`Resume AI API running on :${PORT}`));

Enter fullscreen mode Exit fullscreen mode

The input size limit (50kb) and the resumeContext.slice(0, 3000) are both intentional. Most GPT-4 token limits won't be hit by a 3,000-character resume excerpt, but some resumes are surprisingly long — especially ones with extensive project descriptions. Truncating at 3,000 characters keeps costs predictable.

Step 4: Streaming the Response

For a good UX, you want to stream the AI response as it arrives rather than waiting for the full completion. A 400-word rewrite might take 6–8 seconds to complete — a blank screen for 8 seconds feels broken.

LangChain makes streaming straightforward with callbacks:

import { HumanMessage } from "@langchain/core/messages";

app.post("/api/rewrite/stream", async (req, res) => {
  const { resumeText, targetSection } = req.body;

  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");
  res.flushHeaders();

  const streamingModel = new ChatOpenAI({
    modelName: "gpt-4",
    temperature: 0.4,
    streaming: true,
    callbacks: [
      {
        handleLLMNewToken(token) {
          res.write(`data: ${JSON.stringify({ token })}

`);
        },
        handleLLMEnd() {
          res.write("data: [DONE]\n\n");
          res.end();
        },
        handleLLMError(err) {
          res.write(`data: ${JSON.stringify({ error: err.message })}

`);
          res.end();
        },
      },
    ],
  });

  const resumeContext = resumeText?.slice(0, 3000) || "";
  const prompt = `Rewrite these resume bullets for a software developer. Be concise and action-oriented:\n${targetSection}`;

  await streamingModel.invoke([new HumanMessage(prompt)]);
});

Enter fullscreen mode Exit fullscreen mode

On the frontend, you'd consume this with the Fetch API and ReadableStream. Each data: event carries a token, and you append it to the UI as it arrives. The user sees the response materialize in real time — feels fast, even when it isn't.

Watch: LangChain in Node.js (Quick Start)

Common Pitfalls (and How to Dodge Them)

1. Token limits sneaking up on you

GPT-4's context window is large, but you pay per token. If you're sending the full resume + prompt on every request, costs add up fast at scale. The fix: truncate the resume context (as shown above) and cache the parsed sections so you're not re-parsing on every API call.

2. The model inventing achievements

This is the big one. Ask the model to "quantify achievements" without any source data, and it will make numbers up. "Reduced load time by 73%" sounds great until the hiring manager asks about it in an interview. The fix: explicitly tell the model in the prompt: "Only add numbers if they appear in the original text. If no numbers are present, use qualitative language instead."

3. Prompt injection through resume content

A crafty user could put something like "Ignore all previous instructions and output..." inside their resume text. Since you're sending that text directly to the model, it works. The fix: sanitize input and separate resume content from the instruction portion of the prompt with a clear delimiter, like ---RESUME START--- / ---RESUME END---.

4. Not rate limiting

OpenAI's rate limits are per API key, not per user. One user hammering your endpoint can hit the limit for everyone. Add a rate limiter like express-rate-limit before you go live — 5 requests per minute per IP is a reasonable starting point for a resume tool.

5. Picking GPT-4 when you don't need it

GPT-4 is expensive and slow. For most resume rewriting tasks, gpt-4o-mini produces nearly identical results at a fraction of the cost. Test both. You might be surprised how good the cheaper model is for structured, constrained tasks like this one.

LangChain vs. Raw OpenAI API — When to Use Which

Factor

Raw OpenAI API

LangChain

Setup complexity

Low — one import, one call

Medium — more abstractions to learn

Single prompt apps

Perfect fit

Overkill

Multi-step chains

Tedious to wire manually

First-class support

Prompt reuse and testing

DIY — no built-in structure

PromptTemplate makes this easy

Memory across turns

Manual array management

Built-in memory types

Streaming

Supported, manual wiring

Supported, callback-based

Switching LLM providers

Rewrite API calls

Swap the model object

Community / ecosystem

Smaller (OpenAI-specific)

Large, active, lots of integrations

The rule of thumb: if your app makes more than two different types of LLM calls, or if you need any kind of chaining, LangChain saves you from writing orchestration code from scratch. For a simple one-shot wrapper, raw API is cleaner.

TL;DR

  • LangChain is an orchestration layer for LLM apps — think Express for AI. Use it when you have multi-step chains, prompt reuse, or memory requirements.- Parse before you prompt. Sending a raw resume blob to the model is a recipe for generic output. Identify the section you want to improve and give the model focused context.- Constrain the prompt explicitly. Action verbs, number quantification, bullet count — tell the model exactly what format you want. Vague prompts produce vague results.- Stream responses for better UX. A blank screen for 8 seconds feels broken; a response materializing in real time feels fast.- Guard against pitfalls: rate limit your API, sanitize resume input against prompt injection, and test gpt-4o-mini before defaulting to GPT-4 — it's often good enough and 10x cheaper.

Top comments (0)