sharp on Alpine Without Tears: a 260 MB Image Resize API in Hono
Every site eventually needs on-the-fly image resizing. Cloudinary and imgix charge per transform. imgproxy and thumbor work but drag in a whole opinionated stack. Here's the middle path: ~400 lines of TypeScript, three real dependencies, one 260 MB container, four endpoints that cover 95% of what a static site generator or a CDN origin actually asks for.
📦 GitHub: https://github.com/sen-ltd/image-resize-api
I've watched the "image sizing" line item creep onto four different infra bills over the years. Cloudinary is the usual pick because the SDK is nice, but their free tier doesn't survive contact with a real marketing site, and the paid tier is priced for agencies, not for a blog. The self-hosted alternatives all work, but they pull you in: imgproxy wants you to use its URL-signing scheme, thumbor wants Python and a config file the size of a small novel, and cloud providers' "image handler" Lambdas want you to buy into their whole serverless stack before you can even resize a PNG.
I wanted something that looked like a 12-factor HTTP service with a curl example in the README. Four verbs, libvips under the hood, one Dockerfile.
The stack
Three runtime dependencies. Not nine, not eighteen. Three.
- Hono for the HTTP layer. It ships a request/response shape that's literally the Fetch spec, which means tests drive the app with app.request(url, init) instead of standing up a real socket.
- zod for query-param coercion. Users type ?w=400&quality=85 as strings; zod makes them integers with ranges and a useful error object.
- sharp for everything that touches pixels. sharp is the Node binding for libvips, the streaming image library John Cupitt and collaborators have been polishing since the early 1990s. It's the same engine behind plenty of high-volume thumbnailing pipelines.
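The repo's actual zod schema isn't reproduced in this post, but the kind of coercion it does is easy to sketch by hand. Names and ranges below are illustrative assumptions, not the repo's code:

```typescript
// Sketch of the coercion zod performs on ?w=400&quality=85.
// parseResizeQuery, the field names, and the ranges are illustrative.
interface ResizeQuery {
  w?: number;
  quality: number;
}

function parseResizeQuery(params: URLSearchParams): ResizeQuery {
  const coerceInt = (
    raw: string | null,
    min: number,
    max: number,
  ): number | undefined => {
    if (raw === null) return undefined; // param absent: caller decides the default
    const n = Number(raw);
    if (!Number.isInteger(n) || n < min || n > max) {
      throw new Error(`expected integer in [${min}, ${max}], got "${raw}"`);
    }
    return n;
  };
  return {
    w: coerceInt(params.get('w'), 1, 4096),
    quality: coerceInt(params.get('quality'), 1, 100) ?? 80, // assumed default
  };
}
```

The point is the shape of the result: strings in, validated integers (or a thrown error with the offending value) out. zod gives you the same thing declaratively, plus a structured error object you can serialize straight into a 400 response.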
No ImageMagick. No spawning a subprocess. No @napi-rs/image. No jimp.
Why sharp, not a subprocess
The tempting design for "resize images in a Node service" is:
await execa('convert', ['-resize', '400x300', '-quality', '85', input, output])
I've shipped this pattern. It works. Here's why it's a bad foundation:
Subprocess start is slow. Spawning ImageMagick on a 10 KB thumbnail takes ~80 ms before it touches a single pixel. Doing this at request time means a single Node worker can serve maybe 12 requests/sec. sharp calls into libvips in-process — 3 ms of overhead, and your worker handles hundreds of requests/sec.
Temp files on disk. Subprocess means "write the upload to /tmp/x, run convert, read the output back". Now you need to clean up temp files on crash, handle disk-full, think about symlinks. sharp is Uint8Array → Uint8Array. Nothing hits disk.
ImageMagick has a huge attack surface. It's the library that brought us ImageTragick in 2016. libvips is much smaller (~150k lines of C vs ~500k), has had far fewer CVEs, and has no "delegate" system that shells out to ghostscript on weird inputs.
sharp is the right answer for almost any "Node resizes an image" question. The only real downside is the build, which brings us to the main event.
sharp on alpine without tears
The classic sharp-in-Docker horror story: you FROM node:20-alpine, run npm install, and the build fails because libvips isn't there. Or it "works" but pulls in 300 MB of build tools that ride along into the runtime image. Or it works on x64 and segfaults on arm64 because sharp's prebuilt binary is only compiled for one of them.
The multi-stage Dockerfile that actually behaves:
# --- builder ---
FROM node:20-alpine AS builder
RUN apk add --no-cache vips-dev build-base python3
WORKDIR /build
COPY package.json package-lock.json* ./
RUN npm install --no-audit --no-fund
COPY tsconfig.json vitest.config.ts ./
COPY src ./src
COPY tests ./tests
RUN npm run build
# --- runtime ---
FROM node:20-alpine AS runtime
RUN apk add --no-cache vips # NOTE: vips, not vips-dev
WORKDIR /app
COPY package.json package-lock.json* ./
RUN npm install --omit=dev --no-audit --no-fund \
&& npm cache clean --force && rm -rf /root/.npm
COPY --from=builder /build/dist ./dist
COPY LICENSE ./
RUN addgroup -S app && adduser -S app -G app && chown -R app:app /app
USER app
ENV NODE_ENV=production PORT=8000 HOST=0.0.0.0 MAX_UPLOAD_MB=20
EXPOSE 8000
CMD ["node", "dist/main.js"]
Three things earn their keep here.
vips-dev in builder, vips in runtime. This is the whole trick. vips-dev has the headers and the .pc files sharp's native install wants to see when it has to compile against the local libvips. vips just has the runtime shared libraries. You can't run the service without vips, and you can't build it without vips-dev, but the runtime image is much smaller if you leave vips-dev out.
build-base and python3 are in the builder, not the runtime. build-base is Alpine's meta-package for gcc, make, binutils, libc-dev. python3 is there because node-gyp still uses Python for its bindings. Both are enormous (~200 MB combined on disk). Neither gets into the runtime image.
npm install --omit=dev in the runtime stage, not a node_modules copied from the builder. You could theoretically copy node_modules across, but then sharp's native parts would have been compiled against the builder image's libvips, which works but is opaque to future-you wondering why the image breaks when you bump Alpine. A fresh --omit=dev install in the runtime stage downloads sharp's prebuilt musl binary and calls it a day.
Final image: 264 MB. That's big for a Node service, but it's almost entirely libvips and the prebuilt sharp binary, not our code. A hello-world Hono image is ~180 MB; the delta is 80 MB of libvips and its dependencies, which is what it actually costs to decode every image format ever invented.
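One companion file worth having next to a Dockerfile like this (the repo's own ignore file isn't shown in the post, so treat this as an assumed sketch): a .dockerignore that keeps node_modules, build output, and .git out of the build context, so docker build doesn't ship hundreds of megabytes to the daemon before the first layer runs.

```
# Assumed .dockerignore sketch - keeps the build context small
node_modules
dist
coverage
.git
*.log
```

The Dockerfile above already COPYies files selectively, so this only affects context upload time, not the final image size.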
The four routes
export function createApp(): Hono {
const app = new Hono();
app.use('*', requestLogger());
app.route('/health', healthRoutes());
app.route('/resize', resizeRoutes());
app.route('/thumbnail', thumbnailRoutes());
app.route('/crop', cropRoutes());
app.route('/info', infoRoutes());
app.get('/', (c) => c.html(landingPage()));
app.notFound((c) => c.json({ error: 'not_found' }, 404));
return app;
}
The HTTP layer is maybe 40 lines of glue. The interesting parts live in two files: transformer.ts, which is the pure sharp wrapper, and validators.ts, which does magic-byte sniffing and zod query parsing.
Here's what a resize looks like in the transformer layer — no HTTP, no FormData, just bytes in, bytes out:
export async function resize(
bytes: Uint8Array,
opts: ResizeOptions,
): Promise<Uint8Array> {
const hasDim = opts.width !== undefined || opts.height !== undefined;
let pipeline = sharp(bytes, { failOn: 'error' });
if (hasDim) {
pipeline = pipeline.resize({
width: opts.width,
height: opts.height,
fit: opts.fit,
withoutEnlargement: true, // never upscale past the source
});
}
pipeline = applyOutput(pipeline, opts.format, opts.quality);
return new Uint8Array(await pipeline.toBuffer());
}
There are two design decisions in that snippet worth calling out.
failOn: 'error'. sharp has warning levels for "the image is weird but I can probably decode it" — things like a truncated JPEG or an IDAT chunk with a bad CRC. By default it makes a best effort. In a public-facing API I want the opposite: if the input isn't clean, reject it at the HTTP layer with a 422 rather than silently serving a partially-decoded thumbnail.
withoutEnlargement: true. If someone uploads a 100x100 avatar and asks for 400x400 cover, the default sharp behaviour is to blurrily blow it up. Nobody actually wants that. With this flag, sharp returns the source unchanged when asked to resize larger, and callers get a correct 100x100 image they can lay out as they please.
The fit-mode decision tree
sharp's five fit modes map directly to CSS's object-fit with one extra. Here's the version I keep in my head:
- cover — fill the box exactly, crop whatever overflows. Right for square grid thumbs, hero images, anything where the container shape is fixed and content can be clipped.
- contain — fit inside the box, letterbox with a background if the aspect ratios don't match. Right when you need a known output size AND you refuse to crop the content.
- fill — stretch to the box, ignore aspect ratio. Almost always wrong. I expose it because sharp does and I'd rather document it than silently remove it.
- inside — preserve aspect ratio, never exceed either dimension. This is the "max 400x300" case — you get something that fits inside the box but might be smaller on one axis.
- outside — preserve aspect ratio, at least one dimension matches the box. This is the "minimum 400x300 then maybe crop later" case. Useful when inside would be too small for your layout.
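The inside/outside distinction is just which scale factor you pick. Here's a sketch of the dimension math, an illustration of what libvips computes rather than sharp's actual implementation:

```typescript
// Illustrative dimension math for the 'inside' and 'outside' fit modes.
// fitDimensions is a made-up name; this is not sharp's code.
function fitDimensions(
  srcW: number,
  srcH: number,
  boxW: number,
  boxH: number,
  mode: 'inside' | 'outside',
): { width: number; height: number } {
  const scaleW = boxW / srcW;
  const scaleH = boxH / srcH;
  // 'inside' takes the smaller scale so neither dimension exceeds the box;
  // 'outside' takes the larger scale so both dimensions reach at least the box.
  const scale = mode === 'inside' ? Math.min(scaleW, scaleH) : Math.max(scaleW, scaleH);
  return {
    width: Math.round(srcW * scale),
    height: Math.round(srcH * scale),
  };
}
```

For an 800x600 source and a 400x400 box, 'inside' gives 400x300 (fits within the box) while 'outside' gives 533x400 (covers the box, ready for a later crop).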
Every fit bug I've ever filed in production was because someone typed cover when they meant inside. If you're putting this service behind a templating engine, I'd bake the mode into the URL in your template and not let it be a runtime arg.
Magic bytes matter
The request handler looks almost boring:
// excerpt from the upload handler: form, limit, maxUploadMb, and format
// come from the surrounding handler scope
const entry = form.get('file');
if (!(entry instanceof File)) throw new MissingFileError('file');
if (entry.size > limit) throw new PayloadTooLargeError(maxUploadMb());
const buf = new Uint8Array(await entry.arrayBuffer());
if (buf.byteLength > limit) throw new PayloadTooLargeError(maxUploadMb());
assertImageBytes(buf);
return { bytes: buf, format };
assertImageBytes is the part that matters. It sniffs the first 12 bytes:
const MAGIC = [
{ format: 'jpeg', prefix: [0xff, 0xd8, 0xff] },
{ format: 'png', prefix: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
{ format: 'gif', prefix: [0x47, 0x49, 0x46, 0x38] },
{ format: 'webp', prefix: [0x52, 0x49, 0x46, 0x46] }, // + 'WEBP' at offset 8
{ format: 'tiff', prefix: [0x49, 0x49, 0x2a, 0x00] },
{ format: 'tiff', prefix: [0x4d, 0x4d, 0x00, 0x2a] },
];
// AVIF is an ISOBMFF box — check for 'ftyp' at offset 4 and 'avif'/'avis' at offset 8
Three things to notice.
First, we never trust Content-Type. The header is attacker-controlled and says whatever the curl user wants. The only signal we trust is the file's own opening bytes.
Second, SVG is conspicuously absent. libvips has an SVG backend (librsvg) that dutifully fetches every <image href="..."> inside the SVG, including ones pointing at http://169.254.169.254/latest/meta-data/ on your AWS host. You do not want to handle SVG uploads in a service that runs inside a VPC. If you need SVG rasterization, run it in an isolated container with no network.
Third, WebP is a two-step check. The container is RIFF, so the first four bytes are RIFF, but plenty of other formats (WAV, AVI) also start with RIFF. The WEBP fourcc sits at offset 8, after the 32-bit length field. Without the second check, we'd happily accept a WAV file and then sharp would return a confusing error.
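Putting those three observations together, the sniffing logic is a short chain of prefix checks. This is a sketch in the spirit of assertImageBytes; the function name and return shape here are illustrative, not the repo's exact code:

```typescript
// Sketch of magic-byte sniffing, including the WebP two-step check.
// sniffFormat is an illustrative name, not the repo's API.
function sniffFormat(buf: Uint8Array): string | null {
  const startsWith = (prefix: number[], offset = 0): boolean =>
    prefix.every((b, i) => buf[offset + i] === b);

  if (startsWith([0xff, 0xd8, 0xff])) return 'jpeg';
  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'png';
  if (startsWith([0x47, 0x49, 0x46, 0x38])) return 'gif';
  // WebP: RIFF container, then the 'WEBP' fourcc at offset 8,
  // after the 32-bit length field. RIFF alone also matches WAV and AVI.
  if (startsWith([0x52, 0x49, 0x46, 0x46]) && startsWith([0x57, 0x45, 0x42, 0x50], 8)) {
    return 'webp';
  }
  if (startsWith([0x49, 0x49, 0x2a, 0x00]) || startsWith([0x4d, 0x4d, 0x00, 0x2a])) {
    return 'tiff';
  }
  return null;
}
```

Reads past the end of a short buffer come back as undefined, which never equals a byte value, so truncated uploads fall through to null and get rejected instead of crashing the check.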
Testing without fixture files
One of the joys of sharp is that it has a create API — you can synthesise a valid image in memory without touching the disk. Every test in this repo looks like this:
let red200: Uint8Array;
beforeAll(async () => {
red200 = new Uint8Array(
await sharp({
create: { width: 200, height: 200, channels: 3,
background: { r: 255, g: 0, b: 0 } },
}).png().toBuffer(),
);
});
it('produces a webp with the requested dimensions', async () => {
const out = await resize(red200, {
width: 100, height: 100, fit: 'cover', quality: 80, format: 'webp',
});
const meta = await sharp(out).metadata();
expect(meta.format).toBe('webp');
expect(meta.width).toBe(100);
});
43 tests across four files, no fixtures checked in, nothing to bit-rot. The HTTP tests drive the Hono app with app.request() so there's no socket or port — tests run in 237 ms locally and 335 ms inside the builder container.
Tradeoffs I didn't fix
Image size. 264 MB is the cost of libvips. If you're willing to use Debian slim instead of Alpine, you can get to ~220 MB. If you're willing to build a statically-linked libvips and drop the dev tooling entirely, you might hit 180 MB. I stopped at 264 because it's the path-of-least-resistance build, and the article is supposed to be "sharp on alpine without tears", not "sharp on alpine with a spreadsheet".
No CDN cache layer. Every request is a fresh transform. In production you put a CDN in front of this and use Cache-Control headers so identical (source, params) requests hit the edge. I'm not going to write a two-tier cache in 400 lines of code and neither should you.
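A two-tier cache is out of scope, but the key the edge effectively caches on is worth making concrete. Here's an illustrative sketch (cacheKey and sourceDigest are made-up names, not part of the repo): a digest of the source bytes plus the transform params, with the params sorted so reordered query strings hit the same cached object.

```typescript
import { createHash } from 'node:crypto';

// Illustrative cache key for (source, params) pairs. Sorting the params
// makes ?w=400&fit=inside and ?fit=inside&w=400 map to the same object.
function cacheKey(sourceDigest: string, params: URLSearchParams): string {
  const canonical = [...params.entries()]
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([k, v]) => `${k}=${v}`)
    .join('&');
  return createHash('sha256').update(`${sourceDigest}?${canonical}`).digest('hex');
}
```

In practice a GET-based variant of this service would put that normalization in the URL scheme itself and let Cache-Control plus the CDN do the rest.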
No animated GIF or APNG support. sharp can decode the first frame of an animated GIF and will silently drop the animation. I'm happy with that for a thumbnail API. If you need animated output, you want a video transcoder, not an image resizer.
No SVG. See the security section above.
No auth, no rate limiting. This is a boring single-responsibility service. Put it behind your gateway or API Gateway or Cloudflare — that's where auth and rate limiting belong.
Try in 30 seconds
git clone https://github.com/sen-ltd/image-resize-api
cd image-resize-api
docker build -t image-resize-api .
docker run --rm -p 8000:8000 image-resize-api
In another terminal:
curl -F "file=@photo.jpg" \
"http://localhost:8000/resize?w=400&fit=inside&format=webp&quality=85" \
-o thumb.webp
curl -F "file=@photo.jpg" http://localhost:8000/info | jq
Open http://localhost:8000/ for the upload-and-preview demo. That's the whole surface area. Four verbs, one container, no surprises.
If you've been paying Cloudinary for thumbnails on a side project, this is the direction to go. If you're running imgproxy and it feels like more than you need, this is smaller. If you're spawning ImageMagick in a subprocess, please read the "Why sharp, not a subprocess" section again and then switch.