Tommaso Bertocchi

How to Scan File Uploads in Express

Many Express apps let users upload files.

That usually starts as a product feature:

  • profile pictures
  • resumes
  • PDFs
  • invoices
  • ZIP archives
  • documents sent to internal workflows

But an upload endpoint is also part of your attack surface.

A file can look harmless from its extension alone and still be risky once your app stores it, serves it, unzips it, or sends it to another system.

In this tutorial, we’ll build a simple Express upload route that scans files before storage using:

  • Express for the API
  • Multer for multipart/form-data
  • Pompelmi for file inspection

By the end, you’ll have a route that:

  • accepts a file upload
  • inspects the uploaded bytes
  • blocks suspicious or malicious files
  • only saves files that pass your policy

Why file uploads need scanning

A lot of upload pipelines still trust checks that are too shallow, such as:

  • the filename extension
  • the client-provided MIME type
  • a simple allowlist like .pdf, .jpg, .zip

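That is not enough. None of these checks looks at the file's contents, and all three rely on values the uploader controls.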

A safer pattern is:

  1. receive the file
  2. inspect it immediately
  3. decide whether it is safe enough for your route
  4. only then store or process it

That “inspect first, store later” approach is exactly what we’ll implement here.
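
The four steps above can be sketched without any framework. In this minimal sketch, `handleUpload`, `scan`, and `store` are hypothetical names: `scan` stands in for whatever inspection function you use and is assumed to resolve to an object with a `verdict` field, and `store` stands in for your storage layer.

```javascript
// Minimal sketch of "inspect first, store later".
// `scan` and `store` are injected, hypothetical functions (not a library API).
async function handleUpload(bytes, { scan, store }) {
  // Step 2: inspect the bytes before anything is persisted.
  const report = await scan(bytes);

  // Step 3: decide. Anything other than "clean" is rejected.
  if (report.verdict !== "clean") {
    return { stored: false, verdict: report.verdict };
  }

  // Step 4: only now hand the bytes to storage.
  const location = await store(bytes);
  return { stored: true, verdict: report.verdict, location };
}
```

Because `store` is only reachable after the verdict check, there is no code path that persists a file that was not scanned.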


What we’re using

Express

Express is the HTTP layer for our upload endpoint.

Multer

Multer is a Node.js middleware for handling multipart/form-data, which is the format commonly used for file uploads in Express apps.

For this tutorial, we’ll use memory storage so Multer gives us a Buffer in req.file.buffer. That makes it easy to scan the file before writing anything to disk.

Pompelmi

Pompelmi is an open-source file upload security library for Node.js. It can inspect uploaded files before storage and report a verdict such as clean, suspicious, or malicious.

It is designed to help catch issues such as:

  • MIME spoofing and magic-byte mismatches
  • risky archives
  • deep nesting and archive abuse
  • polyglot files
  • optional YARA-based matches

Project setup

Create a new folder and install the dependencies:

mkdir express-upload-scan
cd express-upload-scan
npm init -y
npm install express multer pompelmi

This tutorial assumes you are running a recent Node.js version supported by Pompelmi.


Build the upload route

Create a file named server.mjs:

import express from "express";
import multer from "multer";
import { mkdir, writeFile } from "node:fs/promises";
import { join } from "node:path";
import { randomUUID } from "node:crypto";
import { scanBytes, STRICT_PUBLIC_UPLOAD } from "pompelmi";

const app = express();
const port = process.env.PORT || 3000;

const upload = multer({
  storage: multer.memoryStorage(),
  limits: {
    fileSize: 10 * 1024 * 1024, // 10 MB
    files: 1,
  },
});

app.get("/", (_req, res) => {
  res.type("html").send(`
    <h1>Upload a file</h1>
    <form action="/upload" method="post" enctype="multipart/form-data">
      <input type="file" name="file" required />
      <button type="submit">Upload</button>
    </form>
  `);
});

app.post("/upload", upload.single("file"), async (req, res, next) => {
  try {
    if (!req.file) {
      return res.status(400).json({ error: "No file uploaded" });
    }

    const report = await scanBytes(req.file.buffer, {
      filename: req.file.originalname,
      mimeType: req.file.mimetype,
      policy: STRICT_PUBLIC_UPLOAD,
      failClosed: true,
    });

    if (report.verdict !== "clean") {
      return res.status(422).json({
        error: "Upload blocked",
        verdict: report.verdict,
        reasons: report.reasons,
      });
    }

    const uploadsDir = join(process.cwd(), "uploads");
    await mkdir(uploadsDir, { recursive: true });

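    // The original name is client-supplied, so keep only a conservative character set.
    const cleanedName = req.file.originalname.replace(/[^a-zA-Z0-9._-]/g, "_");
    const safeName = `${randomUUID()}-${cleanedName}`;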
    const destination = join(uploadsDir, safeName);

    await writeFile(destination, req.file.buffer);

    return res.status(201).json({
      ok: true,
      verdict: report.verdict,
      filename: safeName,
      size: req.file.size,
    });
  } catch (error) {
    next(error);
  }
});

app.use((err, _req, res, _next) => {
  if (err instanceof multer.MulterError) {
    return res.status(400).json({
      error: "Upload rejected by Multer",
      code: err.code,
      message: err.message,
    });
  }

  console.error(err);
  return res.status(500).json({ error: "Internal server error" });
});

app.listen(port, () => {
  console.log(`Server listening on http://localhost:${port}`);
});

Run it with:

node server.mjs

Then open http://localhost:3000 and upload a file.


How this works

Let’s break down the important parts.

1. Multer parses multipart/form-data

This block creates the upload middleware:

const upload = multer({
  storage: multer.memoryStorage(),
  limits: {
    fileSize: 10 * 1024 * 1024,
    files: 1,
  },
});

We are doing three useful things here:

  • using memory storage so the file is available as req.file.buffer
  • limiting the upload size to 10 MB
  • only allowing one file on this route

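Memory storage matters because if you scan after writing to disk, you’ve already accepted and stored the file. In this example, we keep the file in memory just long enough to inspect it first. The trade-off is that every in-flight upload occupies RAM until its request finishes, which is why the size and file-count limits are set in the same place.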

2. Pompelmi scans the uploaded bytes

This is the core step:

const report = await scanBytes(req.file.buffer, {
  filename: req.file.originalname,
  mimeType: req.file.mimetype,
  policy: STRICT_PUBLIC_UPLOAD,
  failClosed: true,
});

Here’s what each option is doing:

  • req.file.buffer: the actual uploaded file bytes
  • filename: useful metadata for policy checks and reporting
  • mimeType: the MIME type the client declared in the multipart request, as reported by Multer; it is a claim to verify against the bytes, not a fact
  • policy: STRICT_PUBLIC_UPLOAD: a strict policy suitable for untrusted public uploads
  • failClosed: true: if inspection fails unexpectedly, block the upload instead of letting it through

This is a much safer default than “best effort” validation on a public endpoint.
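
To make the fail-closed idea concrete, here is a minimal sketch of the behavior written as a plain wrapper. It is not Pompelmi's implementation: `scanFailClosed` and the `scan-error` reason code are hypothetical names, and `scan` is assumed to be any async function that resolves to a report with a string `verdict`.

```javascript
// Hypothetical wrapper illustrating fail-closed behavior around any scanner.
async function scanFailClosed(scan, bytes) {
  try {
    const report = await scan(bytes);
    // A missing or malformed report is treated like a failed scan.
    if (!report || typeof report.verdict !== "string") {
      throw new Error("Scanner returned no verdict");
    }
    return report;
  } catch (_err) {
    // Fail closed: an inspection error blocks the upload instead of passing it.
    return {
      verdict: "suspicious",
      reasons: [{ code: "scan-error", message: "File could not be inspected" }],
    };
  }
}
```

A fail-open version would return a clean verdict from the `catch` branch, which means a scanner crash or timeout silently turns into an accepted upload.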

3. Only clean files get stored

This condition is the boundary between accepted and rejected uploads:

if (report.verdict !== "clean") {
  return res.status(422).json({
    error: "Upload blocked",
    verdict: report.verdict,
    reasons: report.reasons,
  });
}

If the verdict is not clean, the request stops there.

Only after the scan passes do we create the uploads/ directory and write the file to disk.

That ordering is the key idea of the whole tutorial.


Example responses

Clean upload

{
  "ok": true,
  "verdict": "clean",
  "filename": "a9f7f5f9-06d2-4aa9-a73e-31bcb84d9b29-document.pdf",
  "size": 48213
}

Blocked upload

{
  "error": "Upload blocked",
  "verdict": "suspicious",
  "reasons": [
    {
      "code": "mime-mismatch",
      "message": "Detected file signature does not match the declared MIME type"
    }
  ]
}

Your exact reasons output will depend on the file and the policy.


Test it with curl

You can also test the route from the terminal:

curl -F "file=@./test.pdf" http://localhost:3000/upload

Try it with:

  • a normal PDF
  • a renamed file with a misleading extension
  • a ZIP archive
  • a file that should be blocked by your policy

That gives you a quick way to verify the decision boundary of your route.
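
If you prefer a scripted check, the "renamed file" case can be reproduced without hunting for a sample. This is a sketch that assumes Node.js 18 or later (for the global `fetch`, `FormData`, and `Blob`) and the server from this tutorial listening on port 3000. `buildMisleadingUpload` and `sendUpload` are hypothetical helper names.

```javascript
// Builds a multipart form whose file claims to be a JPEG but starts with
// the ZIP local-file-header signature ("PK\x03\x04").
function buildMisleadingUpload() {
  const zipSignature = new Uint8Array([0x50, 0x4b, 0x03, 0x04]);
  const blob = new Blob([zipSignature], { type: "image/jpeg" });

  const form = new FormData();
  // The field name "file" matches upload.single("file") on the route.
  form.append("file", blob, "photo.jpg");
  return form;
}

// Posts the form and returns the HTTP status and parsed JSON body.
async function sendUpload(url, form) {
  const response = await fetch(url, { method: "POST", body: form });
  return { status: response.status, body: await response.json() };
}

// Usage (with the server running):
// const result = await sendUpload("http://localhost:3000/upload", buildMisleadingUpload());
// console.log(result.status, result.body);
```

Whether this particular payload is blocked, and with which reasons, depends on the policy you configured.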


Why this pattern is better than extension checks

A lot of apps still do something like this:

const allowed = [".jpg", ".png", ".pdf"];

That is useful as a UX hint, but it is not a security boundary.

File names can be changed easily.

Client-provided MIME types can also be misleading.

A real upload defense should look at the file itself, not just the label attached to it.
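
To show what "looking at the file itself" means at its simplest, here is a sketch that compares a file's leading bytes against a few well-known signatures. It is an illustration only and far less thorough than a real scanner. `sniffType` and `extensionMatchesContent` are hypothetical helpers, not part of any library used in this tutorial.

```javascript
// Well-known leading byte signatures ("magic bytes") for a few formats.
const SIGNATURES = [
  { type: "pdf", bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] }, // %PDF-
  { type: "png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { type: "jpg", bytes: [0xff, 0xd8, 0xff] },
  { type: "zip", bytes: [0x50, 0x4b, 0x03, 0x04] }, // PK\x03\x04
];

// Returns the detected type, or null if no signature matches.
function sniffType(buffer) {
  for (const { type, bytes } of SIGNATURES) {
    if (buffer.length >= bytes.length && bytes.every((b, i) => buffer[i] === b)) {
      return type;
    }
  }
  return null;
}

// True only when the extension agrees with what the bytes say.
function extensionMatchesContent(filename, buffer) {
  const ext = filename.toLowerCase().split(".").pop();
  const normalized = ext === "jpeg" ? "jpg" : ext;
  return sniffType(buffer) === normalized;
}

const zipBytes = Buffer.from([0x50, 0x4b, 0x03, 0x04]);
console.log(extensionMatchesContent("photo.jpg", zipBytes)); // false
console.log(extensionMatchesContent("report.pdf", Buffer.from("%PDF-1.7\n"))); // true
```

An extension allowlist would accept `photo.jpg` here; the byte check does not. Leading bytes alone are still not a complete defense (polyglot files can satisfy more than one format), which is part of why a dedicated scanner is useful.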


Production notes

The example above is intentionally simple, but here are a few things you should think about before using this in production.

Set tight Multer limits

If you use memory storage, size limits matter.

Keep fileSize, files, and route-specific constraints as small as your product allows.
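
A quick back-of-the-envelope calculation shows why. With memory storage, each in-flight upload can hold up to `fileSize` bytes in RAM, so the worst case grows with concurrency. The concurrency figure below is an illustrative assumption, and `worstCaseUploadMemory` is a hypothetical helper.

```javascript
// Rough upper bound on RAM held by buffered uploads at one moment.
// It ignores any extra copies made by the scanner or by your own code.
function worstCaseUploadMemory({ fileSizeBytes, filesPerRequest, concurrentRequests }) {
  return fileSizeBytes * filesPerRequest * concurrentRequests;
}

const MiB = 1024 * 1024;
const bytes = worstCaseUploadMemory({
  fileSizeBytes: 10 * MiB, // the limit used in this tutorial
  filesPerRequest: 1,
  concurrentRequests: 50, // assumed peak concurrency
});

console.log(`${bytes / MiB} MiB`); // 500 MiB
```

Halving `fileSize` halves that bound, which is usually cheaper than provisioning for the larger number.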

Keep error handling explicit

Multer and your scanner can fail for different reasons.

Return clear errors to clients, but avoid leaking unnecessary internal details.
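
One way to keep that separation explicit is a small function that maps errors to client-safe responses. This is a sketch: `toClientError` is a hypothetical helper, the `LIMIT_*` values are the codes Multer sets on its errors, and using 413 for oversized files is a suggestion rather than what the handler earlier in this tutorial does.

```javascript
// Maps an error to a status code and a body that is safe to send to clients.
function toClientError(err) {
  // Oversized uploads get a dedicated status so clients can react to it.
  if (err && err.code === "LIMIT_FILE_SIZE") {
    return { status: 413, body: { error: "File too large" } };
  }
  // Other Multer-style limit errors are client errors; the code is safe to expose.
  if (err && typeof err.code === "string" && err.code.startsWith("LIMIT_")) {
    return { status: 400, body: { error: "Upload rejected", code: err.code } };
  }
  // Everything else gets a generic message; keep the details in server logs.
  return { status: 500, body: { error: "Internal server error" } };
}
```

Inside an Express error handler, the result could be applied with `res.status(status).json(body)` after logging the original error.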

Store only after inspection

Do not move the write-to-disk step before the scan.

Otherwise you lose the main security benefit.

Match policy to route risk

A public document upload endpoint, an internal admin tool, and an image-only avatar route do not all need the same policy.

Choose the strictness based on the trust level and the downstream processing pipeline.
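
One way to keep those differences visible is a per-route table of constraints. Everything in this sketch is hypothetical: the route names, the `ROUTE_PROFILES` shape, and the numbers. It is not Pompelmi's policy format; it only illustrates choosing limits by route instead of sharing one global setting.

```javascript
const MiB = 1024 * 1024;

// Hypothetical per-route constraints, strictest for the most exposed routes.
const ROUTE_PROFILES = new Map([
  ["avatar", { maxBytes: 2 * MiB, allowedTypes: ["png", "jpg"] }],
  ["publicDocs", { maxBytes: 10 * MiB, allowedTypes: ["pdf"] }],
  ["internalAdmin", { maxBytes: 50 * MiB, allowedTypes: ["pdf", "zip", "png", "jpg"] }],
]);

// `detectedType` is assumed to come from content inspection, not the filename.
function checkAgainstProfile(routeName, { sizeBytes, detectedType }) {
  const profile = ROUTE_PROFILES.get(routeName);
  if (!profile) {
    // Unknown routes are rejected rather than given a permissive default.
    return { allowed: false, reason: "unknown-route" };
  }
  if (sizeBytes > profile.maxBytes) {
    return { allowed: false, reason: "too-large" };
  }
  if (!profile.allowedTypes.includes(detectedType)) {
    return { allowed: false, reason: "type-not-allowed" };
  }
  return { allowed: true };
}
```

With this table, a ZIP archive is acceptable on the internal admin route but rejected on the avatar route, without either route needing to know about the other.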

Consider what happens after upload

Scanning at the boundary is a strong first layer, but also think about what your app does next:

  • Will the file be served back to users?
  • Will another service parse it?
  • Will you unzip it?
  • Will a worker transform it?

The more processing a file triggers, the more important your upload boundary becomes.
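
For the first question, a common hardening step is to serve stored uploads as downloads with content sniffing disabled. The sketch below builds the response headers for that; `downloadHeaders` is a hypothetical helper, while the headers themselves are standard HTTP. Whether a forced download fits depends on your product (an avatar route, for example, needs inline images).

```javascript
// Builds response headers for serving a stored upload as a download.
function downloadHeaders(storedName) {
  // Keep only characters that are safe inside a quoted filename parameter.
  const safe = storedName.replace(/[^a-zA-Z0-9._-]/g, "_");
  return {
    // A generic type avoids the browser rendering the file as HTML or script.
    "Content-Type": "application/octet-stream",
    "Content-Disposition": `attachment; filename="${safe}"`,
    // Tells browsers not to guess a different type from the bytes.
    "X-Content-Type-Options": "nosniff",
  };
}
```

In an Express handler, `res.set(downloadHeaders(name))` before sending the file would apply all three headers at once.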


A smaller version if you already have req.file

If your app already uses Multer somewhere else, the minimal scanning step is just this:

import { scanBytes, STRICT_PUBLIC_UPLOAD } from "pompelmi";

const report = await scanBytes(req.file.buffer, {
  filename: req.file.originalname,
  mimeType: req.file.mimetype,
  policy: STRICT_PUBLIC_UPLOAD,
  failClosed: true,
});

if (report.verdict !== "clean") {
  return res.status(422).json({
    error: "Upload blocked",
    verdict: report.verdict,
    reasons: report.reasons,
  });
}

That is the core integration.


Final thoughts

If your Express app accepts user uploads, don’t treat that endpoint as a boring plumbing detail.

Treat it like a security boundary.

The simplest safe flow is:

  1. receive the file
  2. scan the bytes
  3. block anything that is not clean
  4. store only the files you trust enough to keep

That single change can make your upload pipeline much safer without turning your app architecture upside down.

If you want to try this approach, take a look at Pompelmi.

