<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Wences Martinez</title>
    <description>The latest articles on DEV Community by Wences Martinez (@wenchodev).</description>
    <link>https://dev.to/wenchodev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2480192%2F435df44d-b5fe-420c-b95d-38181fc0e28f.jpg</url>
      <title>DEV Community: Wences Martinez</title>
      <link>https://dev.to/wenchodev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/wenchodev"/>
    <language>en</language>
    <item>
      <title>Building a production-ready RAG pipeline</title>
      <dc:creator>Wences Martinez</dc:creator>
      <pubDate>Wed, 13 May 2026 08:51:51 +0000</pubDate>
      <link>https://dev.to/wenchodev/building-a-production-ready-rag-pipeline-2jk6</link>
      <guid>https://dev.to/wenchodev/building-a-production-ready-rag-pipeline-2jk6</guid>
<description>&lt;p&gt;&lt;strong&gt;L&lt;/strong&gt;arge &lt;strong&gt;L&lt;/strong&gt;anguage &lt;strong&gt;M&lt;/strong&gt;odels (aka &lt;em&gt;LLMs&lt;/em&gt;) have a memory problem: their knowledge stops at their training cutoff. They don't know your codebase, they don't know last week's tickets…&lt;/p&gt;

&lt;p&gt;When they're missing context they don't say so… they guess, confidently. The polite term is &lt;em&gt;hallucination&lt;/em&gt;; the less polite one is &lt;em&gt;lying with style&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuofgydhm3z3ggylx26ms.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuofgydhm3z3ggylx26ms.gif" alt="LLMs sometimes hallucinate" width="320" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;R&lt;/strong&gt;etrieval-&lt;strong&gt;A&lt;/strong&gt;ugmented &lt;strong&gt;G&lt;/strong&gt;eneration (aka &lt;strong&gt;RAG&lt;/strong&gt;) is how you fix that without retraining anything.&lt;/p&gt;

&lt;p&gt;Think of it as turning a closed-book exam into an open-book one. The LLM is still the writer, but now it has a librarian: a system that fetches the right passages from your data and hands them over before the model puts pen to paper.&lt;/p&gt;

&lt;p&gt;I built &lt;a href="https://www.keystoneapp.dev/" rel="noopener noreferrer"&gt;Keystone&lt;/a&gt; to learn this end-to-end.&lt;/p&gt;

&lt;p&gt;Keystone does two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Ingest a GitHub repository's activity → every PR, commit, issue, and discussion&lt;/li&gt;
&lt;li&gt;Answer questions about &lt;strong&gt;why the codebase looks the way it does&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The first prototype didn't have a retrieval system; it had a giant string.&lt;/p&gt;

&lt;p&gt;That worked for tiny repos. On a real one (&lt;em&gt;+1000 commits&lt;/em&gt;, &lt;em&gt;+500 merged PRs&lt;/em&gt;, tons of issues, plus a tree of &lt;em&gt;~1,200 files&lt;/em&gt;) it broke in four ways at once:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The prompt blew past the &lt;strong&gt;context window&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The model &lt;strong&gt;lost the thread&lt;/strong&gt; halfway through.&lt;/li&gt;
&lt;li&gt;Latency hit double-digit seconds.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Every answer cost ~$0.15 in tokens&lt;/strong&gt; for a query that should cost a fraction of a cent.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's the moment &lt;strong&gt;RAG stops being optional and becomes required&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Below is what I actually built, lifted from the codebase running today 👇🏻&lt;/p&gt;




&lt;h2&gt;
  
  
  What RAG actually is (and what it isn't)
&lt;/h2&gt;

&lt;p&gt;The clean mental model: RAG is just "&lt;em&gt;search, then prompt.&lt;/em&gt;"&lt;/p&gt;

&lt;p&gt;You convert your data into a &lt;strong&gt;search index ahead of time&lt;/strong&gt;. At query time, you look up the most relevant pieces and paste only those into the prompt. That's it!&lt;/p&gt;

&lt;p&gt;We can define two main stages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Retrieval&lt;/strong&gt;: search your data and pull the chunks most relevant to the user's question.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generation&lt;/strong&gt;: send those chunks plus the question to a regular LLM call and let it write the answer.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Everything else (&lt;em&gt;embeddings&lt;/em&gt;, &lt;em&gt;vector databases&lt;/em&gt;, &lt;em&gt;re-ranking&lt;/em&gt;, &lt;em&gt;hybrid search&lt;/em&gt;…) exists because &lt;strong&gt;matching meaning&lt;/strong&gt; is harder than &lt;strong&gt;matching keywords&lt;/strong&gt;.&lt;/p&gt;
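&lt;p&gt;"Matching meaning" bottoms out in simple math: once text becomes vectors, relevance is just how closely two vectors point in the same direction. A minimal cosine-similarity sketch (illustrative only, not Keystone code):&lt;/p&gt;

```typescript
// Cosine similarity over two equal-length vectors:
// 1 means same direction (similar meaning), 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0
  let normA = 0
  let normB = 0
  a.forEach((v, i) => {
    dot += v * b[i]
    normA += v * v
    normB += b[i] * b[i]
  })
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}
```

&lt;p&gt;Embedding models produce those vectors; everything downstream (vector DBs, re-ranking, hybrid search) is about computing and ordering this score at scale.&lt;/p&gt;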




&lt;h2&gt;
  
  
  When to use RAG
&lt;/h2&gt;

&lt;p&gt;Use RAG when the answer depends on facts the model doesn't have:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Private docs, your codebase, last week's tickets, anything post-cutoff or non-public.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It's also how you get grounded answers with citations: the model can point at exactly which chunk of which document it used. That's the &lt;strong&gt;difference between a tool you can ship to users and a demo you can't&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  When NOT to use RAG
&lt;/h2&gt;

&lt;p&gt;Don't use RAG when you want the model to &lt;strong&gt;behave differently&lt;/strong&gt; (tone, format, reasoning style). That's a &lt;em&gt;fine-tuning&lt;/em&gt; or &lt;em&gt;prompting&lt;/em&gt; problem, not a &lt;strong&gt;retrieval&lt;/strong&gt; problem.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;RAG injects knowledge; it does not change behavior.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The second &lt;strong&gt;mistake&lt;/strong&gt; I see: people reach for RAG when the data is small enough to fit in a prompt. If you have 10,000 tokens of context, just paste it. RAG buys you scale at the cost of an extra layer that will leak relevance bugs into your product.&lt;/p&gt;




&lt;h2&gt;
  
  
  The four stages every RAG has
&lt;/h2&gt;

&lt;p&gt;Every &lt;strong&gt;RAG in production&lt;/strong&gt; has the same four stages, and each one breaks in its own special way if you do it naively:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ingestion&lt;/strong&gt;: pull data from somewhere and split it into chunks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embedding&lt;/strong&gt;: turn each chunk into a vector so similarity becomes math.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retrieval&lt;/strong&gt;: search for the chunks closest to the user's question.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Synthesis&lt;/strong&gt;: hand those chunks to an LLM and let it write the answer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnl0x13asg4o1mlm0fal3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnl0x13asg4o1mlm0fal3.png" alt="The four stages of a solid RAG pipeline" width="800" height="342"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The next sections go through each one in order, with what I do in &lt;a href="https://www.keystoneapp.dev/" rel="noopener noreferrer"&gt;Keystone&lt;/a&gt; and what I'd warn you about.&lt;/p&gt;
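&lt;p&gt;Before going stage by stage, here's the whole loop as a runnable toy: a word-count "embedding" stands in for a real model, and stage 4 only builds the prompt instead of calling an LLM. Every name here is illustrative, not Keystone's code:&lt;/p&gt;

```typescript
type Chunk = { sourceId: string; content: string }

// 1. Ingestion: split a document into paragraph chunks
function ingest(doc: string): Chunk[] {
  return doc.split('\n\n').map((content, i) => ({ sourceId: `chunk:${i}`, content }))
}

// 2. Embedding: a toy stand-in that counts vocabulary words
//    (a real system calls an embedding model here)
function embedFake(text: string, vocab: string[]): number[] {
  const words = text.toLowerCase().split(/\W+/)
  return vocab.map(v => words.filter(w => w === v).length)
}

// 3. Retrieval: rank chunks by dot product against the query vector
function retrieve(query: string, chunks: Chunk[], vocab: string[], topK: number) {
  const q = embedFake(query, vocab)
  return chunks
    .map(chunk => ({
      chunk,
      score: embedFake(chunk.content, vocab).reduce((s, x, i) => s + x * q[i], 0)
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
}

// 4. Synthesis: in production this prompt goes to an LLM call
function buildPrompt(query: string, hits: { chunk: Chunk }[]): string {
  const context = hits.map(h => `[${h.chunk.sourceId}] ${h.chunk.content}`).join('\n')
  return `Answer using only this context:\n${context}\n\nQuestion: ${query}`
}
```

&lt;p&gt;Swap the fake embedder for a real model and the prompt builder for an LLM call, and you have the skeleton every section below fills in.&lt;/p&gt;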




&lt;h2&gt;
  
  
  1. Chunking: where most RAG systems fail
&lt;/h2&gt;

&lt;p&gt;This is the section nobody wants to read because chunking sounds &lt;em&gt;boring&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;It's also the section that decides &lt;strong&gt;whether your RAG actually works&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The naive approach is "&lt;em&gt;split text every 500 tokens&lt;/em&gt;." That dies for two reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A PR, for example, is not just 500 tokens of one thing. It's a &lt;strong&gt;title&lt;/strong&gt;, a &lt;strong&gt;body&lt;/strong&gt;, a &lt;strong&gt;list of files&lt;/strong&gt;, a list of &lt;strong&gt;commit messages&lt;/strong&gt;, sometimes &lt;strong&gt;comments&lt;/strong&gt;, &lt;strong&gt;discussions&lt;/strong&gt;. Embedding them as one blob averages five different topics into one vector. Retrieval returns the wrong PR because the &lt;strong&gt;vector is an average of irrelevant stuff.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Not all artifacts are equal&lt;/strong&gt;. A merged PR with five reviews carries more architectural signal than a &lt;em&gt;Fix typo&lt;/em&gt; commit. Treating them with the same chunk size and same metadata throws away the &lt;strong&gt;asymmetry&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I use &lt;strong&gt;&lt;em&gt;typed chunking&lt;/em&gt;&lt;/strong&gt;; different artifact types get different &lt;em&gt;chunkers&lt;/em&gt;, different size &lt;em&gt;budgets&lt;/em&gt;, and different &lt;em&gt;metadata&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;chunkPR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;IngestPullRequest&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;EmbeddingChunk&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;filesStr&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;pr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;files&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;commitsStr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;pr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;commits&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt; | &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;commentsStr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;pr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;comments&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;
    &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt; | Comments: &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;pr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;comments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt; | &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reviewsStr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;pr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;reviews&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;
    &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt; | Reviews: &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;pr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;reviews&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt; | &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`[PR #&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;pr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;] &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;pr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;pr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; | Files: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;filesStr&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; | Commits: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;commitsStr&lt;/span&gt;&lt;span class="p"&gt;}${&lt;/span&gt;&lt;span class="nx"&gt;commentsStr&lt;/span&gt;&lt;span class="p"&gt;}${&lt;/span&gt;&lt;span class="nx"&gt;reviewsStr&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;sourceId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`pr:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;pr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="c1"&gt;// &amp;lt;- stable, dedupable&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="nf"&gt;truncate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;       &lt;span class="c1"&gt;// &amp;lt;- PRs get 4000 chars&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pr&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;pr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;number&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;pr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;merged_at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;pr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;merged_at&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And here's what a &lt;strong&gt;real chunk looks like&lt;/strong&gt; coming out of that function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sourceId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pr:42"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"[PR #42] Replace REST with GraphQL for the data layer: Switched from ..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"metadata"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pr"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"author"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"wencesms92"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"merged_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2025-11-14T10:22:00Z"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
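&lt;p&gt;The &lt;em&gt;truncate&lt;/em&gt; helper that enforces those character budgets isn't shown above; a minimal version (the real one in Keystone may differ) is just a clip with a marker:&lt;/p&gt;

```typescript
// Clip text to a character budget, keeping the budget exact
// by reserving one character for the ellipsis marker.
// A minimal sketch; the actual helper may differ.
function truncate(text: string, maxChars: number): string {
  if (maxChars >= text.length) return text
  return text.slice(0, maxChars - 1) + '…'
}
```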



&lt;p&gt;Issues get their own &lt;em&gt;chunker&lt;/em&gt; with the same shape but a smaller budget (&lt;em&gt;1500 chars&lt;/em&gt;) and different metadata. Same pattern, different parameters.&lt;/p&gt;
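&lt;p&gt;For illustration, an issue chunker following that pattern might look like this (hypothetical field names; not the actual Keystone implementation):&lt;/p&gt;

```typescript
// Hypothetical issue shape; same chunker pattern as chunkPR,
// but a 1500-char budget and issue-specific metadata.
type IngestIssue = {
  number: number
  title: string
  body?: string
  author: string
  state: string
  labels: string[]
}

function chunkIssue(issue: IngestIssue) {
  const labelsStr = issue.labels.length ? ` | Labels: ${issue.labels.join(', ')}` : ''
  const raw = `[Issue #${issue.number}] ${issue.title}: ${issue.body ?? ''}${labelsStr}`
  return {
    sourceId: `issue:${issue.number}`,   // stable, dedupable
    content: raw.slice(0, 1500),         // issues get 1500 chars
    metadata: { type: 'issue', author: issue.author, state: issue.state, number: issue.number }
  }
}
```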

&lt;h3&gt;
  
  
  Three things to notice:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;The &lt;strong&gt;[PR #N] prefix&lt;/strong&gt; is intentional. Embedding models are sensitive to what's at the front of the text, so putting the artifact type and number first lets the model anchor on it. When I tried without the prefix, the same PR ranked lower for queries like &lt;em&gt;"what did PR 42 change?"&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Each &lt;em&gt;sourceId&lt;/em&gt; is stable and globally unique (pr:42, issue:7, readme:root, topology:tree). That &lt;strong&gt;key&lt;/strong&gt; is what makes the upsert work, and it's also what lets a webhook re-embed a single PR after a merge without rebuilding the world. Same chunker, same SQL upsert, just one row.&lt;/li&gt;
&lt;li&gt;Commits get &lt;strong&gt;aggregated&lt;/strong&gt;, not chunked individually. This is the most non-obvious decision in the whole pipeline. If you embed every commit one-by-one, you drown the index in noise. I instead deduplicate commits already present inside a PR (they're embedded with the PR) and then summarize the leftover "orphan" commits into a single chunk:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Orphans = commits not already inside any PR&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prCommitShas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pullRequests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;flatMap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pr&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
  &lt;span class="nx"&gt;pr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;commits&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sha&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;orphans&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;commits&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;prCommitShas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sha&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
  &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nf"&gt;isNoiseCommit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orphans&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;sourceId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;commits:orphan-summary&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;truncate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;`[Commits] &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;orphans&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; standalone commits (not in PRs) | Authors: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;authorsStr&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; | Recent: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;recentMsgs&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="mi"&gt;4000&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;commits&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;orphans&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;orphan&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before the orphan commits roll up, a noise filter strips anything useless:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;NOISE_MSG_PATTERNS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="sr"&gt;/^merge branch/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sr"&gt;/^merge pull request/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sr"&gt;/^wip$/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sr"&gt;/^fix typo/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sr"&gt;/^fixup!/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sr"&gt;/^squash!/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sr"&gt;/^initial commit$/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sr"&gt;/^update &lt;/span&gt;&lt;span class="se"&gt;\S&lt;/span&gt;&lt;span class="sr"&gt;+$/i&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;NOISE_AUTHOR_PATTERNS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\[&lt;/span&gt;&lt;span class="sr"&gt;bot&lt;/span&gt;&lt;span class="se"&gt;\]&lt;/span&gt;&lt;span class="sr"&gt;$/&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sr"&gt;/^dependabot/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sr"&gt;/^renovate/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sr"&gt;/^github-actions/i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
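&lt;p&gt;The &lt;em&gt;isNoiseCommit&lt;/em&gt; check used in the orphan filter just runs a commit's message and author against those patterns (a sketch; the real version may differ):&lt;/p&gt;

```typescript
// Patterns restated here so the sketch is self-contained.
const NOISE_MSG_PATTERNS = [
  /^merge branch/i, /^merge pull request/i, /^wip$/i, /^fix typo/i,
  /^fixup!/i, /^squash!/i, /^initial commit$/i, /^update \S+$/i
]
const NOISE_AUTHOR_PATTERNS = [/\[bot\]$/, /^dependabot/i, /^renovate/i, /^github-actions/i]

type CommitLike = { message: string; author: string }

// A commit is noise if its message OR its author matches any pattern
function isNoiseCommit(c: CommitLike): boolean {
  if (NOISE_MSG_PATTERNS.some(p => p.test(c.message))) return true
  return NOISE_AUTHOR_PATTERNS.some(p => p.test(c.author))
}
```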



&lt;p&gt;Filtering bot commits and merge noise before they hit the embedding API saves cost, keeps the index dense, and stops "what's the architecture" queries from returning seventeen &lt;em&gt;dependabot bumped lodash&lt;/em&gt; chunks.&lt;/p&gt;

&lt;p&gt;So… &lt;strong&gt;don't embed garbage&lt;/strong&gt;!&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Embeddings: picking the right model
&lt;/h2&gt;

&lt;p&gt;The boring truth: most embedding models are &lt;strong&gt;good&lt;/strong&gt; enough. The real trade-off is &lt;em&gt;dimension count × cost × domain fit&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;I went with &lt;a href="https://docs.mistral.ai/models/model-cards/codestral-embed-25-05" rel="noopener noreferrer"&gt;Mistral AI codestral-embed-2505&lt;/a&gt; (&lt;em&gt;1536&lt;/em&gt; dimensions), a code-tuned embedding model that ranks code-adjacent text in a way a general-purpose model does not.&lt;/p&gt;

&lt;p&gt;Two main reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Generous free tier&lt;/strong&gt; → Mistral's free tier is generous enough to run real embedding workloads without hitting a paywall on day one of a side project. OpenAI's free credits evaporate the moment you embed a real dataset.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Domain fit&lt;/strong&gt; → My data is &lt;strong&gt;code-adjacent&lt;/strong&gt;: commit messages, file paths, PR titles.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The call itself is unremarkable, which is the point. The work happens in the &lt;strong&gt;chunking&lt;/strong&gt;, not here:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// At query time, embed the user's question with the same model&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;embedding&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;mistral&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;textEmbeddingModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;codestral-embed-2505&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
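&lt;p&gt;At ingestion time the same model runs over every chunk, and batching those calls is what keeps cost and rate limits sane. A sketch of the batching loop, with a fake embed call standing in for the real API wrapper (all names illustrative):&lt;/p&gt;

```typescript
// A stand-in embed call used for the demo; a real one would hit the
// embedding API with the whole batch of values at once.
async function demoEmbedApi(values: string[]) {
  return values.map(v => [v.length]) // fake 1-dimensional "vectors"
}

// Group chunk contents so each API call embeds many values per request
// instead of one request per chunk.
async function embedInBatches(contents: string[], batchSize: number, callEmbedApi: typeof demoEmbedApi) {
  const out: number[][] = []
  for (let i = 0; contents.length > i; i += batchSize) {
    out.push(...(await callEmbedApi(contents.slice(i, i + batchSize))))
  }
  return out
}
```

&lt;p&gt;Injecting the embed call keeps the batching logic testable without network access; swapping in the real model call changes nothing about the loop.&lt;/p&gt;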






&lt;h2&gt;
  
  
  3. Retrieval (that doesn't suck)
&lt;/h2&gt;

&lt;p&gt;Retrieval can be one query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;vectorStr&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`[&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;,&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;]`&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;projectIdsArray&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`{&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;projectIds&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;,&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;}`&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;$queryRawUnsafe&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;MatchEmbeddingRow&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="s2"&gt;`SELECT pe.id, pe."projectId" as project_id, p.name as project_name,
          pe."sourceId" as source_id, pe.content, pe.metadata,
          (1 - (pe."embedding" &amp;lt;=&amp;gt; $1::vector(1536)))::float as similarity
   FROM "ProjectEmbedding" pe
   JOIN "Project" p ON p.id = pe."projectId"
   WHERE pe."projectId" = ANY($2::text[])
     AND 1 - (pe."embedding" &amp;lt;=&amp;gt; $1::vector(1536)) &amp;gt; 0.3
   ORDER BY pe."embedding" &amp;lt;=&amp;gt; $1::vector(1536)
   LIMIT 12`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;vectorStr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;projectIdsArray&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Five&lt;/strong&gt; things this is doing on purpose:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; is the &lt;em&gt;pgvector&lt;/em&gt; &lt;strong&gt;cosine-distance operator.&lt;/strong&gt; Combined with the &lt;strong&gt;HNSW index&lt;/strong&gt; built on &lt;strong&gt;vector_cosine_ops&lt;/strong&gt;, this query uses the index instead of a sequential scan.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pre-filter&lt;/strong&gt; by &lt;em&gt;projectId = ANY(...)&lt;/em&gt; before the vector search. Permissioning happens before similarity ranking, so &lt;strong&gt;you never see a chunk from a project you don't have access to&lt;/strong&gt;, and the index narrows the search space.&lt;/li&gt;
&lt;li&gt;Threshold of &lt;strong&gt;&lt;em&gt;0.3&lt;/em&gt; similarity&lt;/strong&gt;. Below that, the chunk is more noise than signal. &lt;strong&gt;Lower threshold → more recall → more garbage&lt;/strong&gt; in the prompt. Tune this on real queries, not synthetic ones.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Top 12 results&lt;/strong&gt;. Enough that 2-3 misses still leave a &lt;strong&gt;usable signal&lt;/strong&gt;; small enough that the &lt;strong&gt;prompt stays cheap&lt;/strong&gt;. I started at 25 and it was overkill. The model latched onto the first 5 anyway and the rest were filler.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;JOIN&lt;/em&gt; the Project name in the &lt;strong&gt;SELECT&lt;/strong&gt;. When the query spans multiple repos, the &lt;strong&gt;model needs to know which repo a chunk came from&lt;/strong&gt;. The repo name shows up in the chunk payload, which is what lets the answer cite &lt;em&gt;[repo-A] vs [repo-B]&lt;/em&gt; accurately.&lt;/li&gt;
&lt;/ol&gt;
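&lt;p&gt;For intuition, here's the similarity the query ranks by, sketched in plain TypeScript. pgvector computes this natively through its distance operator; this sketch is purely illustrative and not part of the pipeline:&lt;/p&gt;

```typescript
// Cosine similarity between two embedding vectors.
// pgvector's cosine-distance operator returns 1 minus this value,
// which is why the query filters on `1 - distance` being above 0.3.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0)
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0))
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0))
  return dot / (normA * normB)
}

// Toy vectors: a near-duplicate clears the 0.3 floor, an unrelated one doesn't.
const query = [1, 0, 0]
const similarChunk = [0.9, 0.1, 0]
const unrelatedChunk = [0, 1, 0]
const passesFloor = (v: number[]) => cosineSimilarity(query, v) > 0.3
```

&lt;p&gt;In production the comparison happens inside Postgres, so the vectors never leave the database.&lt;/p&gt;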

&lt;p&gt;No &lt;em&gt;re-ranker&lt;/em&gt;. No keyword pre-filter. One stage.&lt;/p&gt;

&lt;p&gt;The chunking does enough work upfront that a &lt;strong&gt;second-stage ranker hasn't been worth its latency&lt;/strong&gt; yet, and that's a real result, not laziness.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Re-rankers&lt;/em&gt; earn their keep when your chunks are big, noisy, and undifferentiated. My chunks are &lt;strong&gt;small, typed, and prefixed&lt;/strong&gt;.&lt;/p&gt;
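&lt;p&gt;The index migration isn't shown in this post; given the operator and table names in the query above, the HNSW index it relies on would be declared roughly like this (the index name is illustrative):&lt;/p&gt;

```sql
-- Hypothetical DDL matching the query above; tune m / ef_construction
-- for your dataset. vector_cosine_ops pairs with the cosine-distance operator.
CREATE INDEX "ProjectEmbedding_embedding_idx"
  ON "ProjectEmbedding"
  USING hnsw ("embedding" vector_cosine_ops);
```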




&lt;h2&gt;
  
  
  4. Context assembly and the LLM call
&lt;/h2&gt;

&lt;p&gt;This is where Keystone diverges from textbook RAG.&lt;/p&gt;

&lt;p&gt;Classic RAG does this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;embed(query) → search → concat(top_k) → prompt → generate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I do this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;prompt(LLM, tools={search, tree, file}) → LLM decides → up to 10 tool calls → final answer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM is the &lt;strong&gt;orchestrator&lt;/strong&gt;. It sees a system prompt that explains the two data sources, vectorized memory vs. live code, and the available repos. Then it chooses which tool to call.&lt;/p&gt;
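&lt;p&gt;Stripped of the model specifics, that control flow is a plain loop. This is an illustrative sketch, not Keystone's actual code; &lt;em&gt;callModel&lt;/em&gt; and the tool names stand in for the real LLM API and tool set:&lt;/p&gt;

```typescript
// Each turn the model either requests a tool or produces the final answer.
type Turn = { tool: string; input: string } | { answer: string }

function agenticAnswer(
  question: string,
  tools: { [name: string]: (input: string) => string },
  callModel: (history: string[]) => Turn,
): string {
  const history = [question]
  // Hard cap: the model gets at most 10 tool calls before it must answer.
  for (let step = 0; step !== 10; step += 1) {
    const turn = callModel(history)
    if ('answer' in turn) return turn.answer
    // Only results the model asked for enter the conversation.
    history.push(tools[turn.tool](turn.input))
  }
  return 'tool-call budget exhausted'
}

// Scripted demo: one memory search, then a final answer.
const demoTools = { search: (q: string) => 'chunks for: ' + q }
const script: Turn[] = [
  { tool: 'search', input: 'auth middleware' },
  { answer: 'Auth is handled in the middleware layer.' },
]
const answer = agenticAnswer('How is auth handled?', demoTools, () => script.shift() as Turn)
```

&lt;p&gt;The real loop threads proper message objects and token accounting through each step, but the shape, tool calls until the model decides it has enough context, is the same.&lt;/p&gt;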

&lt;p&gt;The split looks like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vectorized memory holds the why&lt;/strong&gt; → PR descriptions, issue threads, commit messages, the artifacts where decisions are explained. Vectors of these stay useful even when the code drifts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live file access holds the what&lt;/strong&gt; → The current &lt;em&gt;package.json&lt;/em&gt;, the current list of plugins, the current value of a constant. Stale vectors of months-old code lie about the present, so for "what" questions I read the file fresh &lt;strong&gt;via the GitHub API&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's what the &lt;em&gt;agentic retrieval&lt;/em&gt; actually looks like in production logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;[&lt;/span&gt;Chat Tool] searchTechnicalMemory: &lt;span class="s2"&gt;"relationship between open-webui, opencode, and openclaw"&lt;/span&gt; across 3 project&lt;span class="o"&gt;(&lt;/span&gt;s&lt;span class="o"&gt;)&lt;/span&gt;

&lt;span class="o"&gt;[&lt;/span&gt;Chat Tool] Found 9 results &lt;span class="o"&gt;[&lt;/span&gt;
  &lt;span class="o"&gt;{&lt;/span&gt; repo: &lt;span class="s1"&gt;'open-webui'&lt;/span&gt;, sourceId: &lt;span class="s1"&gt;'readme:root'&lt;/span&gt;,            similarity: 0.555 &lt;span class="o"&gt;}&lt;/span&gt;,
  &lt;span class="o"&gt;{&lt;/span&gt; repo: &lt;span class="s1"&gt;'openclaw'&lt;/span&gt;,   sourceId: &lt;span class="s1"&gt;'readme:root'&lt;/span&gt;,            similarity: 0.544 &lt;span class="o"&gt;}&lt;/span&gt;,
  &lt;span class="o"&gt;{&lt;/span&gt; repo: &lt;span class="s1"&gt;'opencode'&lt;/span&gt;,   sourceId: &lt;span class="s1"&gt;'readme:root'&lt;/span&gt;,            similarity: 0.503 &lt;span class="o"&gt;}&lt;/span&gt;,
  &lt;span class="o"&gt;{&lt;/span&gt; repo: &lt;span class="s1"&gt;'openclaw'&lt;/span&gt;,   sourceId: &lt;span class="s1"&gt;'topology:tree'&lt;/span&gt;,          similarity: 0.465 &lt;span class="o"&gt;}&lt;/span&gt;,
  &lt;span class="o"&gt;{&lt;/span&gt; repo: &lt;span class="s1"&gt;'open-webui'&lt;/span&gt;, sourceId: &lt;span class="s1"&gt;'topology:tree'&lt;/span&gt;,          similarity: 0.463 &lt;span class="o"&gt;}&lt;/span&gt;,
  &lt;span class="o"&gt;{&lt;/span&gt; repo: &lt;span class="s1"&gt;'opencode'&lt;/span&gt;,   sourceId: &lt;span class="s1"&gt;'topology:tree'&lt;/span&gt;,          similarity: 0.463 &lt;span class="o"&gt;}&lt;/span&gt;,
  &lt;span class="o"&gt;{&lt;/span&gt; repo: &lt;span class="s1"&gt;'opencode'&lt;/span&gt;,   sourceId: &lt;span class="s1"&gt;'commits:orphan-summary'&lt;/span&gt;, similarity: 0.439 &lt;span class="o"&gt;}&lt;/span&gt;,
  &lt;span class="o"&gt;{&lt;/span&gt; repo: &lt;span class="s1"&gt;'openclaw'&lt;/span&gt;,   sourceId: &lt;span class="s1"&gt;'commits:orphan-summary'&lt;/span&gt;, similarity: 0.431 &lt;span class="o"&gt;}&lt;/span&gt;,
  &lt;span class="o"&gt;{&lt;/span&gt; repo: &lt;span class="s1"&gt;'open-webui'&lt;/span&gt;, sourceId: &lt;span class="s1"&gt;'commits:orphan-summary'&lt;/span&gt;, similarity: 0.420 &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model chose to &lt;strong&gt;search across all three repos in a single call&lt;/strong&gt;; it understood the query was cross-project without being told.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The &lt;code&gt;readme:root&lt;/code&gt; chunks rank highest (&lt;em&gt;0.55&lt;/em&gt;, &lt;em&gt;0.54&lt;/em&gt;, &lt;em&gt;0.50&lt;/em&gt;) because &lt;em&gt;READMEs&lt;/em&gt; describe what a project &lt;em&gt;is&lt;/em&gt;, and the query asks exactly that.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;topology:tree&lt;/code&gt; chunks rank next: file structure is the second most useful signal for understanding how three repos relate.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;commits:orphan-summary&lt;/code&gt; chunks come in last but still above the &lt;em&gt;0.3&lt;/em&gt; floor, adding commit-level context without the noise of individual commits.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Two practical effects:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The model can iterate&lt;/strong&gt; → It might search memory, realize the answer needs a file, fetch the file, then answer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The prompt stays small&lt;/strong&gt; → Only the chunks the model actually requested make it into the conversation. No &lt;em&gt;"stuff top-25 into system prompt"&lt;/em&gt; bloat.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The synthesis model itself is &lt;a href="https://docs.mistral.ai/models/model-cards/devstral-small-2-25-12" rel="noopener noreferrer"&gt;Mistral AI devstral-small-latest&lt;/a&gt;: &lt;strong&gt;small&lt;/strong&gt;, &lt;strong&gt;cheap&lt;/strong&gt;, &lt;strong&gt;fast&lt;/strong&gt;. With good retrieval you don't need a frontier model for the writing step. The expensive part of "intelligence" is finding the right context. Writing a coherent paragraph from good context is the easy part.&lt;/p&gt;

&lt;p&gt;Every call gets logged with input/output tokens, step count, and finish reason, both to a usage table and to PostHog. That's the observability layer that lets me actually answer "is retrieval getting better or worse this week?" with a graph instead of a vibe.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1yqp3uunlpzd6l37xkdu.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1yqp3uunlpzd6l37xkdu.gif" alt="Keystone input &amp;amp; output tokens usage" width="200" height="147"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;The pipeline above (&lt;em&gt;typed chunking&lt;/em&gt;, &lt;strong&gt;code-tuned embeddings&lt;/strong&gt;, &lt;strong&gt;HNSW + pgvector&lt;/strong&gt;, and an &lt;strong&gt;LLM that knows when to search&lt;/strong&gt;) is what's running inside &lt;a href="https://www.keystoneapp.dev/" rel="noopener noreferrer"&gt;Keystone&lt;/a&gt; today.&lt;/p&gt;

&lt;p&gt;It's small, opinionated, and it works because every stage has one job and respects the constraints of the next.&lt;/p&gt;

&lt;p&gt;If there's one thing to take away: &lt;strong&gt;ignore the model leaderboards for a week and go obsess over your chunking&lt;/strong&gt;. That's where the wins are.&lt;/p&gt;

&lt;p&gt;The fanciest embedding model in the world &lt;strong&gt;can't rescue data that's been concatenated into mush&lt;/strong&gt;, and the cheapest model is plenty good when the chunks coming in are sharp, typed, and free of noise.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;RAG isn't a magic upgrade for LLMs. It's a librarian, and a librarian is only as good as the way you organized the shelves.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://www.keystoneapp.dev/" rel="noopener noreferrer"&gt;Keystone&lt;/a&gt; is the project I'm building to give software teams a living memory of their codebase: every PR, commit, issue, and decision, queryable in natural language.&lt;/p&gt;

&lt;p&gt;If you have any suggestions, I'd love to hear them in the comments section!&lt;/p&gt;




&lt;p&gt;Thanks for reading! 👋&lt;/p&gt;

&lt;p&gt;Wences.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>rag</category>
      <category>softwareengineering</category>
      <category>typescript</category>
    </item>
    <item>
      <title>Optimizing Nuxt + Prisma in Docker: How we cut our image size by 84%</title>
      <dc:creator>Wences Martinez</dc:creator>
      <pubDate>Thu, 05 Mar 2026 08:29:59 +0000</pubDate>
      <link>https://dev.to/wenchodev/optimizing-nuxt-prisma-in-docker-how-we-cut-our-image-size-by-84-15lb</link>
      <guid>https://dev.to/wenchodev/optimizing-nuxt-prisma-in-docker-how-we-cut-our-image-size-by-84-15lb</guid>
      <description>&lt;p&gt;By now, &lt;a href="//nuxt.com"&gt;Nuxt&lt;/a&gt; hardly needs an introduction as one of the &lt;strong&gt;most used full-stack frameworks&lt;/strong&gt;. For those new to the Vue ecosystem, it is essentially the counterpart to Next.js in the React world.&lt;/p&gt;

&lt;p&gt;So, Nuxt is not just a frontend framework. With Nitro as the server engine, you get SSR, API routes, server middleware… it’s a &lt;strong&gt;real full-stack framework&lt;/strong&gt;. We’ve been using it at Resizes for a long time to build complex applications, and it works really well!&lt;/p&gt;

&lt;p&gt;That sounds really cool, right? But what is the most &lt;strong&gt;efficient&lt;/strong&gt;, &lt;strong&gt;fastest&lt;/strong&gt;, and most &lt;strong&gt;lightweight&lt;/strong&gt; way to build a Docker image for a Nuxt app?&lt;/p&gt;

&lt;p&gt;Let’s talk about it!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5x0tiwxw1w61cgup5qd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5x0tiwxw1w61cgup5qd.png" alt=" " width="800" height="422"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Our Use Case
&lt;/h3&gt;

&lt;p&gt;At &lt;a href="//resiz.es"&gt;Resizes&lt;/a&gt; we have developed several applications with Nuxt 4, and one of them pulls in quite a few modules and dependencies: an &lt;strong&gt;auth&lt;/strong&gt; library, an &lt;strong&gt;ORM&lt;/strong&gt; to handle database migrations and even an &lt;strong&gt;embedded database&lt;/strong&gt;, so dockerizing it efficiently was not as straightforward as it seems.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fihlxxgx3mmylgtair8sj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fihlxxgx3mmylgtair8sj.png" alt="Our Nuxt stack" width="800" height="215"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this post I’ll share how we do it, with &lt;strong&gt;real numbers&lt;/strong&gt; from our GitHub Actions pipeline and the hidden traps we found along the way.&lt;/p&gt;

&lt;h3&gt;
  
  
  The stack
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Nuxt 4 + &lt;a href="https://nitro.build/" rel="noopener noreferrer"&gt;Nitro&lt;/a&gt; (server engine)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.prisma.io/" rel="noopener noreferrer"&gt;Prisma ORM&lt;/a&gt; + SQLite via &lt;a href="https://www.npmjs.com/package/better-sqlite3" rel="noopener noreferrer"&gt;&lt;em&gt;better-sqlite3&lt;/em&gt;&lt;/a&gt; (native C++ module)&lt;/li&gt;
&lt;li&gt;Docker with multi-stage build&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The following &lt;em&gt;package.json&lt;/em&gt; shows some of the dependencies we rely on most in a Nuxt 4 application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;package.json&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"scripts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"dev"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"nuxt dev"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"build"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"nuxt build"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"generate"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"nuxt generate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"preview"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"nuxt preview"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"postinstall"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"nuxt prepare &amp;amp;&amp;amp; prisma generate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"lint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"eslint ."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"lint:fix"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"eslint . --fix"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"dependencies"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"better-sqlite3"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~12.0.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@prisma/client"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~7.2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@prisma/adapter-better-sqlite3"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~7.3.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@anthropic-ai/sdk"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~0.72.1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"better-auth"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~1.4.17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"zod"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~4.3.5"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"prisma"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~7.2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"dotenv"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~17.2.3"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"devDependencies"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"nuxt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~4.2.2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@nuxt/ui"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~4.3.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@nuxt/eslint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~1.12.1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"vue"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~3.5.27"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@vueuse/nuxt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~14.1.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"typescript"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~5.9.3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"eslint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~9.39.2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"eslint-config-prettier"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~10.1.8"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"prettier"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~3.8.1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tsx"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~4.21.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  The Dockerfile: single vs multi stage
&lt;/h3&gt;

&lt;p&gt;Can you deploy a Nuxt application with a basic Dockerfile? Of course you can! But that &lt;strong&gt;doesn’t mean you should&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;While a single-stage Dockerfile will technically build your app, it drags a &lt;strong&gt;huge amount of baggage&lt;/strong&gt; into production. We’re talking about build tools, dev dependencies, source code, and artifacts.&lt;/p&gt;

&lt;p&gt;You end up shipping the entire kitchen sink instead of just the lean, compiled application.&lt;/p&gt;

&lt;p&gt;An example of a basic single-stage Dockerfile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# Single-stage Dockerfile (for comparison only — DO NOT use in production)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; node:24-slim&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; python3 make g++ &lt;span class="nt"&gt;--no-install-recommends&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; /var/lib/apt/lists/&lt;span class="k"&gt;*&lt;/span&gt;

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;
&lt;span class="k"&gt;ARG&lt;/span&gt;&lt;span class="s"&gt; DATABASE_URL="file:./dummy.db"&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . .&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;npm ci
&lt;span class="k"&gt;RUN &lt;/span&gt;npx prisma generate &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npx nuxt build

&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; NODE_ENV=production&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; NITRO_HOST=0.0.0.0&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; NITRO_PORT=3000&lt;/span&gt;

&lt;span class="c"&gt;# Still need to fix Nitro stubs even in single-stage&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; .output/server/node_modules
&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; node_modules .output/server/node_modules

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; docker-entrypoint.sh ./docker-entrypoint.sh&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x ./docker-entrypoint.sh

&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 3000&lt;/span&gt;

&lt;span class="k"&gt;ENTRYPOINT&lt;/span&gt;&lt;span class="s"&gt; ["./docker-entrypoint.sh"]&lt;/span&gt;
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["node", ".output/server/index.mjs"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We’ve done the test and we’ve built our app with the above Dockerfile vs a multi-stage Dockerfile to compare the size of the final image between them.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The results are crazy: &lt;strong&gt;4.05 GB vs 637 MB&lt;/strong&gt;. The multi-stage image is more than &lt;strong&gt;6x smaller&lt;/strong&gt; than the single-stage one, which is the ~84% cut from the title 🫢&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fud6alp9mz18tsi0bbw40.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fud6alp9mz18tsi0bbw40.png" alt="Single-stage vs multi-stage image" width="800" height="94"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F075gv2p9de0z1m6oaqae.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F075gv2p9de0z1m6oaqae.png" alt="Docker Layer Anatomy: Where Does All That Space Go?" width="800" height="209"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  But… how did we do it?
&lt;/h2&gt;

&lt;p&gt;Multi-stage build separates the process into phases:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Build&lt;/strong&gt; stages: have everything needed to compile. They’re temporary images that get &lt;strong&gt;discarded from the final image&lt;/strong&gt;!&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runner&lt;/strong&gt; stage: contains only the minimum code to run the app. It’s the only image pushed to the registry!&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The concrete benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Smaller&lt;/strong&gt; final image.&lt;/li&gt;
&lt;li&gt;Reduced &lt;strong&gt;attack surface&lt;/strong&gt; — no compilers in production.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Faster&lt;/strong&gt; deploys — less pull time on each Kubernetes node.&lt;/li&gt;
&lt;li&gt;Automatic &lt;strong&gt;parallelism&lt;/strong&gt; — Docker builds independent stages in parallel (with BuildKit).&lt;/li&gt;
&lt;li&gt;Granular &lt;strong&gt;caching&lt;/strong&gt; — if you only change code, the dependencies stage is fully cached.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our Dockerfile has 4 stages. Let’s go through each one.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5y3m7sexqhd9du98seti.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5y3m7sexqhd9du98seti.png" alt="Dockerfile multistage steps" width="319" height="858"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Stage 1: Base tools
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# -------- Base --------&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;node:24-slim&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;base&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; python3 make g++ openssl &lt;span class="nt"&gt;--no-install-recommends&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; /var/lib/apt/lists/&lt;span class="k"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You’re probably wondering why the heck we need the &lt;em&gt;python3&lt;/em&gt;, &lt;em&gt;make&lt;/em&gt; &amp;amp; &lt;em&gt;g++&lt;/em&gt; dependencies in a Node.js application.&lt;/p&gt;

&lt;p&gt;Easy, we’re using &lt;em&gt;SQLite&lt;/em&gt; as an embedded database through the super fast &lt;a href="https://www.npmjs.com/package/better-sqlite3" rel="noopener noreferrer"&gt;better-sqlite3&lt;/a&gt; library, and this library needs to &lt;strong&gt;compile native C++ bindings&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;SQLite is a library written in C, and to use it from Node.js, better-sqlite3 includes a “binding” (a bridge between JavaScript and compiled C code).&lt;/p&gt;

&lt;p&gt;This binding is compiled during npm install using node-gyp, which needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;g++&lt;/em&gt; — C++ compiler&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;make&lt;/em&gt; — orchestrates the compilation&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;python3&lt;/em&gt; — because node-gyp is written in Python&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result is a &lt;em&gt;.node&lt;/em&gt; file (binary addon) specific to your architecture and Node version.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;These tools take up ~293MB, but they won’t be in the final image 💪🏼&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Stage 2: Full dependencies
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;
&lt;span class="c"&gt;# -------- Dependencies --------&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;base&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;deps&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; package.json package-lock.json ./&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;npm ci &lt;span class="nt"&gt;--ignore-scripts&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the second stage we’re only copying &lt;em&gt;package.json&lt;/em&gt; and &lt;em&gt;package-lock.json&lt;/em&gt; and running &lt;em&gt;npm ci&lt;/em&gt; with the &lt;em&gt;--ignore-scripts&lt;/em&gt; flag.&lt;/p&gt;

&lt;p&gt;This flag is really important if you want to &lt;strong&gt;avoid any unnecessary drama&lt;/strong&gt;. Tools like Prisma or Nuxt love to run ‘postinstall’ scripts the second they’re downloaded, but since we haven’t copied the full source code yet, those scripts would just crash.&lt;/p&gt;

&lt;p&gt;By skipping those scripts with this flag, we keep the install lightning-fast, let Docker cache the layer perfectly, and save the actual heavy lifting for the Build stage.&lt;/p&gt;
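&lt;p&gt;You can reproduce the drama in miniature (toy package, all names hypothetical): a root &lt;em&gt;postinstall&lt;/em&gt; script that points at a file we haven’t copied yet breaks a plain &lt;em&gt;npm install&lt;/em&gt;, while &lt;em&gt;--ignore-scripts&lt;/em&gt; sails through:&lt;/p&gt;

```shell
# Toy package whose postinstall needs a file that "hasn't been copied yet".
mkdir -p /tmp/ignore-scripts-demo
cd /tmp/ignore-scripts-demo
printf '%s\n' '{"name":"demo","version":"1.0.0","scripts":{"postinstall":"node ./missing-file.js"}}' > package.json

npm install --ignore-scripts  # succeeds: the postinstall script is skipped
npm install || echo "plain install fails: postinstall cannot find its file"
```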

&lt;h2&gt;
  
  
  Stage 3: Building the application
&lt;/h2&gt;

&lt;p&gt;This is where the heavy lifting happens!&lt;/p&gt;

&lt;p&gt;Now that we have our dependencies ready, we perform a &lt;strong&gt;triple threat of critical tasks&lt;/strong&gt;: generating the &lt;strong&gt;Prisma client&lt;/strong&gt;, compiling the &lt;strong&gt;Nuxt bundle&lt;/strong&gt;, and &lt;strong&gt;rebuilding &lt;em&gt;better-sqlite3&lt;/em&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This last step is vital — it ensures our SQLite driver is natively compiled for the Linux environment, avoiding those dreaded “mismatched binary” errors in production.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# -------- Build --------&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;base&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;build&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; NODE_OPTIONS="--max-old-space-size=4096"&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=deps /app/node_modules ./node_modules&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . .&lt;/span&gt;

&lt;span class="k"&gt;ARG&lt;/span&gt;&lt;span class="s"&gt; DATABASE_URL="file:./dummy.db"&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;npx prisma generate &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npx nuxt build &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npm rebuild better-sqlite3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;npx prisma generate&lt;/em&gt;&lt;/strong&gt;: Generates the Prisma Client based on your schema.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;npx nuxt build&lt;/em&gt;&lt;/strong&gt;: Triggers the full production build, bundling the client and server-side code through the Nitro engine into a standalone directory (the &lt;em&gt;.output&lt;/em&gt; folder).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;npm rebuild better-sqlite3&lt;/em&gt;&lt;/strong&gt;: The “secret sauce” for SQLite. This recompiles the native C++ bindings for our database driver directly inside the Linux container, ensuring the binary perfectly matches our production environment and preventing the dreaded “mismatched architecture” errors.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Stage 4: The production image
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# -------- Runtime --------&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;node:24-slim&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;runner&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; openssl &lt;span class="nt"&gt;--no-install-recommends&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; /var/lib/apt/lists/&lt;span class="k"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;

&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; NODE_ENV=production&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; NITRO_HOST=0.0.0.0&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; NITRO_PORT=3000&lt;/span&gt;

&lt;span class="c"&gt;# Minimal Prisma CLI for migrations - must run BEFORE package.json is copied,&lt;/span&gt;
&lt;span class="c"&gt;# otherwise npm sees the app's package.json and installs ALL dependencies.&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; package-lock.json ./&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--ignore-scripts&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="s2"&gt;"prisma@&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;node &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"require('./package-lock.json').packages['node_modules/prisma'].version"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="s2"&gt;"dotenv@&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;node &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"require('./package-lock.json').packages['node_modules/dotenv'].version"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npm cache clean &lt;span class="nt"&gt;--force&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;rm &lt;/span&gt;package-lock.json

&lt;span class="c"&gt;# Copy compiled app (keeps Nitro's runtime packages in .output/server/node_modules/)&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=build /app/.output ./.output&lt;/span&gt;

&lt;span class="c"&gt;# Replace better-sqlite3 with the Linux binary compiled in the build stage.&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; .output/server/node_modules/better-sqlite3
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=build /app/node_modules/better-sqlite3 ./.output/server/node_modules/better-sqlite3&lt;/span&gt;

&lt;span class="c"&gt;# Copy Prisma config + migrations folder for migrate deploy&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=build /app/prisma ./prisma&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=build /app/prisma.config.ts ./prisma.config.ts&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; docker-entrypoint.sh ./docker-entrypoint.sh&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x ./docker-entrypoint.sh

&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 3000&lt;/span&gt;

&lt;span class="k"&gt;ENTRYPOINT&lt;/span&gt;&lt;span class="s"&gt; ["./docker-entrypoint.sh"]&lt;/span&gt;
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["node", ".output/server/index.mjs"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;The final image uses &lt;em&gt;node:24-slim&lt;/em&gt;; no compilers, no Python, no dev tools. Just Node.js and your app.&lt;/p&gt;
&lt;/blockquote&gt;
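&lt;p&gt;The version-pinning trick in that &lt;em&gt;RUN npm install&lt;/em&gt; step reads the exact versions out of the lockfile, so the runtime image can never drift from what the build used. Here it is in isolation, against a minimal made-up lockfile (hypothetical path and version):&lt;/p&gt;

```shell
# Minimal, made-up lockfile with the same shape npm writes.
mkdir -p /tmp/lock-demo
cd /tmp/lock-demo
printf '%s\n' '{"packages":{"node_modules/prisma":{"version":"5.22.0"}}}' > package-lock.json

# The same expression the Dockerfile uses to pin the install:
node -p "require('./package-lock.json').packages['node_modules/prisma'].version"
# prints 5.22.0
```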

&lt;p&gt;Three important details here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We leave .output/server/node_modules intact — except for better-sqlite3, which we delete and replace with a freshly compiled binary (in build stage).&lt;/li&gt;
&lt;li&gt;We install only &lt;strong&gt;prisma&lt;/strong&gt; and &lt;strong&gt;dotenv&lt;/strong&gt; into /app/node_modules, the bare minimum needed to run prisma migrate deploy in the entrypoint script. The rest of the app’s dependencies are already bundled inside .output/server/node_modules by Nitro.&lt;/li&gt;
&lt;li&gt;Automatic Prisma migrations via &lt;em&gt;docker-entrypoint.sh&lt;/em&gt;: this script runs &lt;strong&gt;&lt;em&gt;prisma migrate deploy&lt;/em&gt;&lt;/strong&gt; every time the container starts, ensuring your database always has the latest schema:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/sh&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Applying database migrations..."&lt;/span&gt;

&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;NODE_PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/app/node_modules"&lt;/span&gt;
/app/node_modules/.bin/prisma migrate deploy

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Starting application..."&lt;/span&gt;
&lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$@&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
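&lt;p&gt;The &lt;em&gt;exec "$@"&lt;/em&gt; line is the important part: it replaces the shell with whatever command Docker appends as &lt;em&gt;CMD&lt;/em&gt;, so the Node process becomes PID 1 and receives stop signals like SIGTERM directly instead of through an intermediate shell. You can simulate the ENTRYPOINT/CMD handoff locally (throwaway script, hypothetical paths):&lt;/p&gt;

```shell
# Throwaway entrypoint that mimics the handoff pattern above.
printf '%s\n' '#!/bin/sh' 'echo "migrations would run here"' 'exec "$@"' > /tmp/demo-entrypoint.sh
chmod +x /tmp/demo-entrypoint.sh

# Docker invokes ENTRYPOINT with CMD appended as arguments; simulate that:
/tmp/demo-entrypoint.sh echo "app started"
# prints:
#   migrations would run here
#   app started
```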






&lt;h2&gt;
  
  
  GitHub Actions pipeline results
&lt;/h2&gt;

&lt;p&gt;These are real numbers from our GitHub Actions pipeline when building the app’s image for the AMD64 architecture:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5o8lwtk9i34zje27ih2n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5o8lwtk9i34zje27ih2n.png" alt="GitHub Actions Runner results" width="800" height="229"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 Multi-arch tip: If you build ARM64 images on AMD64 runners via QEMU, expect 4–7x slower builds. Use native ARM runners if speed matters!&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Hidden Gotchas
&lt;/h2&gt;

&lt;p&gt;Nitro can’t bundle everything… some packages must stay in node_modules!&lt;/p&gt;

&lt;p&gt;Nitro uses Rollup/esbuild to bundle your server code into optimized &lt;em&gt;.mjs&lt;/em&gt; files — but not everything can be bundled.&lt;/p&gt;

&lt;p&gt;Modules like &lt;em&gt;better-sqlite3&lt;/em&gt; contain native C++ bindings (.node files), which are platform-specific binaries that simply &lt;strong&gt;cannot be converted into a .mjs file&lt;/strong&gt;. For those, Nitro falls back to copying them — along with their full dependency tree — into &lt;em&gt;.output/server/node_modules/&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;So, we have to leave &lt;em&gt;.output/server/node_modules&lt;/em&gt; intact, except for &lt;em&gt;better-sqlite3&lt;/em&gt;, which we delete and replace with a freshly compiled binary (previously generated with &lt;em&gt;npm rebuild better-sqlite3&lt;/em&gt;).&lt;/p&gt;

&lt;p&gt;We install only &lt;em&gt;prisma&lt;/em&gt; and &lt;em&gt;dotenv&lt;/em&gt; into the root &lt;em&gt;/app/node_modules&lt;/em&gt;, the bare minimum needed to run &lt;em&gt;prisma migrate deploy&lt;/em&gt; in the entrypoint. The rest of the app’s dependencies are already bundled inside the &lt;em&gt;.output/server&lt;/em&gt; folder by Nitro.&lt;/p&gt;

&lt;p&gt;💡 We explicitly mark &lt;em&gt;better-sqlite3&lt;/em&gt;, among other dependencies, as external in the &lt;em&gt;nuxt.config.ts&lt;/em&gt; file, which tells Nitro to skip bundling them entirely and copy them as-is into &lt;em&gt;.output/server/node_modules&lt;/em&gt; directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// nuxt.config.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;defineNuxtConfig&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;nitro&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;esbuild&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;es2020&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;externals&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;external&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;better-sqlite3&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...],&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Dockerizing a full-stack Nuxt application isn’t trivial when you have native modules or want to significantly reduce the final image size, but with a solid multi-stage build it’s completely viable.&lt;/p&gt;

&lt;p&gt;Key takeaways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-stage build&lt;/strong&gt; is not optional — it’s the difference between &lt;strong&gt;637 MB and 4.05 GB&lt;/strong&gt;: an &lt;strong&gt;84% image size reduction&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The numbers speak: &lt;strong&gt;~4 minutes&lt;/strong&gt; for a complete AMD64 build, lightweight final image ready for production.&lt;/li&gt;
&lt;li&gt;Don’t delete &lt;em&gt;.output/server/node_modules&lt;/em&gt; — Nitro copies the externalized packages there and needs them at runtime. Only &lt;strong&gt;replace the ones with native binaries&lt;/strong&gt; (like better-sqlite3) that need to be recompiled for Linux.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’re evaluating Nuxt for your next full-stack project, give it a shot. The Vue ecosystem has a first-class full-stack framework, and with Docker, deploying it is just a matter of minutes!&lt;/p&gt;




&lt;p&gt;This post is based on our real experience deploying Nuxt applications in production at &lt;a href="//resiz.es"&gt;Resizes&lt;/a&gt;. All timings and data are from our GitHub Actions pipeline.&lt;/p&gt;

&lt;p&gt;Feel free to share this post among your community!&lt;/p&gt;

&lt;p&gt;See you! 👋🏻&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>nuxt</category>
      <category>docker</category>
      <category>prisma</category>
    </item>
  </channel>
</rss>
