Ely

Posted on Nov 9 • Originally published at edeckers.Medium

Building Flipr: a URL shortener, one commit at a time

#typescript #kubernetes #architecture #tutorial

Part 1: the simplest thing that could possibly work

Welcome to the first post in a series about building distributed systems on Kubernetes. Our subject is the "Todo application" of systems design: a URL shortener we'll call Flipr 🐬.

The goal is to build an actual, functioning, deployable system, designed to scale from a single instance to a distributed cluster on Kubernetes.

Photo of a happy dolphin in the water, who I like to think is named Flipr — *Photo by Louan García on Unsplash*

What this series is and what it is not

This series focuses on building a distributed, scalable URL shortener and the architectural decisions behind it. To keep that focus sharp, I'm intentionally omitting several things that would be critical in a production system, such as:

- No unit tests: the focus is on architectural evolution, not a production-ready codebase
- Minimal input validation: basic checks only, not production-grade security hardening
- No abuse prevention: there will be rate limiting, but anti-spam measures won't be covered
- No user accounts: this is a simple shortener demonstrating distributed systems concepts

So, if you're looking for a complete, production-ready URL shortener, check GitHub and take your pick. But if you want to understand how distributed systems are built and how to benchmark architectural improvements, you're in the right place!

Requirements & scale targets

Before we write any code, let's define what we're building toward. I'm intentionally keeping the performance targets vague, because actual throughput and latency will depend entirely on the hardware you deploy to.

And while we don't define hard performance targets upfront, we'll benchmark each iteration and compare results throughout this series. This way, we can demonstrate that our architectural evolution actually improves performance and scalability, rather than just adding complexity.

Functional Requirements:

Short Code Specifications:
- Length: 6-7 characters
- Character set: a-z, A-Z, 0-9 (62 characters)
- Namespace: 62^7 ≈ 3.5 trillion possible codes

URL Constraints:
- Max original URL length: 2,048 characters (browser-safe)
- Short code lifetime: Indefinite (no expiration)
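
As a quick sanity check on those namespace figures, the arithmetic can be reproduced in a few lines. This is only a worked calculation: `ALPHABET_SIZE` and `namespaceSize` are names I made up for this illustration, they are not part of the Flipr codebase.

```typescript
// Number of distinct codes for a given length over the 62-character alphabet.
// `namespaceSize` is a hypothetical helper used only for this illustration.
const ALPHABET_SIZE = 26 + 26 + 10; // a-z, A-Z, 0-9

const namespaceSize = (length: number): number => ALPHABET_SIZE ** length;

console.log(namespaceSize(6)); // 56800235584 (~56.8 billion)
console.log(namespaceSize(7)); // 3521614606208 (~3.5 trillion)
```

Both values stay well below `Number.MAX_SAFE_INTEGER`, so plain JavaScript numbers are exact here.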

Design Goals:

- Horizontal scalability: add more instances to handle increased load, rather than relying on bigger servers
- Read-heavy workload: expect redirects to vastly outnumber shortening requests (typically a 10:1 ratio or higher)
- Storage growth: design for millions of short codes with room to grow
- Low latency: keep both shortening and redirects fast (sub-100ms when possible)

So where do we start?

The best place to start is with the simplest possible implementation of the core functionality: shortening URLs and redirecting to them. From here, we can iteratively add features, optimizations, and infrastructure components, each time motivated by a real need that arises from the limitations of the previous version.

Before we dive into code, a quick note on approach (you should skip this if you don't care for me getting all philosophical)

When you start simple and get something working, storage methods, scaling strategies, and database schemas follow naturally from the problems you encounter, not from upfront speculation.

Over my career, I've met countless developers who approach this in the exact reverse order, and I used to be one of them: I'd start with the database model, design the perfect schema, plan the caching strategy, and then try to build the application around it.

Starting with the database model, however, locks you in early. For starters, it decides that you'll be using a database, maybe even a particular brand. But an application should dictate storage, not the other way around.

Now I build production applications differently: I build the simplest thing that works, and let the architecture emerge from real constraints. It's not just an approach for this blog series, it's how I work day-to-day.

I know this isn't some kind of mind-blowing revelation, but if you've never tried it, I highly recommend it. It has allowed me to demo and discuss working software with stakeholders early on, and it has kept my pet projects from stalling out: when you see something working quickly, you keep those sweet, sweet endorphins pumping instead of getting stuck in architecture decisions and losing interest.

Alright, enough with the philosophical digression, let's finally see some code 😅

Flipr 0: let's go ephemeral!

As promised, this first version is intentionally minimal, without any external dependencies such as databases or caching layers: pure business logic and a simple HTTP server. Everything lives in memory, which means it all disappears when the server restarts. Exactly how we like it for now :)

The stack

Nothing fancy, just solid, battle-tested tools that get the job done!

- TypeScript for adding some sanity on top of JavaScript
- Node.js as the runtime
- Express for the HTTP server
- Winston for structured logging
- Zod for request validation

Business logic

Let's start with the core: the Shortener class, which is where all the URL shortening logic lives, completely independent of HTTP, databases, or any infrastructure concerns. By keeping this business logic separate from transport and storage, the code stays clean and testable. As a bonus, we could reuse it in other contexts later, like a CLI tool or a different web framework.

import { validator } from './validator';

const NUMBER_OF_CODE_GENERATOR_RETRIES = 10;

type ShortRecord = {
  code: string;
  url: string;
};

export class CodeRestrictedError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'CodeRestrictedError';
  }
}

export class FailedUrlRetrievalError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'FailedUrlRetrievalError';
  }
}

export class ShortCodeNotFoundError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'ShortCodeNotFoundError';
  }
}

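// Object.create(null) yields an object without inherited keys, so codes such
// as "constructor" or "toString" can't accidentally match Object.prototype.
const lookup: { [key: string]: ShortRecord } = Object.create(null);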

const generateShortCode = (length: number) => {
  const chars =
    'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
  let code = '';
  for (let i = 0; i < length; i++) {
    code += chars.charAt(Math.floor(Math.random() * chars.length));
  }
  return code;
};

const exists = (code: string) => !!lookup[code];

const insert = (code: string, url: string) => {
  lookup[code] = { code, url };
};

const get = (code: string): ShortRecord => lookup[code];

const generateShortCodeWithRetry = (
  length: number,
  validate: (code: string) => boolean,
) => {
  // Generate and check up to NUMBER_OF_CODE_GENERATOR_RETRIES candidates.
  for (let attempt = 0; attempt < NUMBER_OF_CODE_GENERATOR_RETRIES; attempt++) {
    const code = generateShortCode(length);
    if (validate(code)) {
      return code;
    }
  }

  throw new Error('Failed to generate a unique short code');
};

type CodeBlockList = {
  reserved: Set<string>;
  offensive: Set<string>;
  protected: Set<string>;
};

export type ShortenerConfig = {
  shortcodeLength: number;
  codeBlockList: CodeBlockList;
};

export class Shortener {
  private readonly validate: (code: string) => boolean;

  public constructor(private readonly config: ShortenerConfig) {
    this.validate = validator(
      this.config.codeBlockList.reserved,
      this.config.codeBlockList.offensive,
      this.config.codeBlockList.protected,
    );
  }

  private test = (code: string) => this.validate(code) && !exists(code);

  public shorten = (url: string, customCode?: string): ShortRecord => {
    if (customCode && !this.test(customCode)) {
      throw new CodeRestrictedError('Custom code is restricted');
    }

    const newShortCode =
      customCode ||
      generateShortCodeWithRetry(this.config.shortcodeLength, this.test);

    insert(newShortCode, url);

    const r = get(newShortCode);
    if (!r) {
      throw new FailedUrlRetrievalError('Failed to retrieve the shortened URL');
    }

    return {
      code: r.code,
      url: r.url,
    };
  };

  public resolve = (shortCode: string): ShortRecord => {
    if (!exists(shortCode)) {
      throw new ShortCodeNotFoundError('Short code not found');
    }

    return get(shortCode);
  };
}

Let's break this down

Storage: The lookup object is our entire database. It's a simple key-value store mapping short codes to URLs. When the server restarts, everything disappears. Perfect for now.
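
The same three storage helpers (exists, insert, get) can also be sketched on top of a `Map`. One reason to consider it: an object created with `{}` inherits keys such as `constructor` from `Object.prototype`, so user-supplied codes can accidentally match them, whereas a `Map` only contains what was inserted. The names `StoredRecord`, `store`, `existsInStore`, `insertIntoStore` and `getFromStore` are made up for this sketch.

```typescript
type StoredRecord = { code: string; url: string };

// A Map has no inherited keys, so a lookup for "constructor" or "toString"
// only succeeds if that code was actually inserted.
const store = new Map<string, StoredRecord>();

const existsInStore = (code: string): boolean => store.has(code);

const insertIntoStore = (code: string, url: string): void => {
  store.set(code, { code, url });
};

// Returns undefined for unknown codes instead of pretending a record exists.
const getFromStore = (code: string): StoredRecord | undefined =>
  store.get(code);

insertIntoStore('abc1234', 'https://example.com');
console.log(existsInStore('abc1234')); // true
console.log(existsInStore('constructor')); // false
console.log(getFromStore('missing')); // undefined
```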

Code Generation: The generateShortCode function creates random strings from a 62-character alphabet (a-z, A-Z, 0-9). With a default length of 7 characters, that gives us 62^7 = ~3.5 trillion possible codes. Plenty of headroom.
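
`Math.random()` is good enough for this series, but its output is not meant to be unpredictable. If hard-to-guess codes ever become a requirement, a variant built on Node's `crypto.randomInt` might look like the sketch below. `CODE_ALPHABET` and `generateSecureShortCode` are hypothetical names, not something from the repo.

```typescript
import { randomInt } from 'node:crypto';

const CODE_ALPHABET =
  'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';

// Hypothetical variant of generateShortCode: same alphabet and shape, but each
// index comes from Node's CSPRNG instead of Math.random().
const generateSecureShortCode = (length: number): string => {
  let code = '';
  for (let i = 0; i < length; i++) {
    // randomInt(max) returns an integer n with 0 <= n < max.
    code += CODE_ALPHABET.charAt(randomInt(CODE_ALPHABET.length));
  }
  return code;
};

console.log(generateSecureShortCode(7)); // a random 7-character code
```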

Collision Handling: The generateShortCodeWithRetry function tries up to 10 times to generate a valid code. "Valid" means it passes the blocklist check and doesn't already exist in our lookup. If we can't find a valid code after 10 attempts, we throw an error. In practice, with trillions of possible codes and an empty database, collisions are very unlikely.
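
To put a rough number on "very unlikely", here is a back-of-the-envelope calculation. It assumes codes are uniformly distributed and ignores blocklist rejections; `collisionChance` and `exhaustionChance` are helper names invented for this sketch.

```typescript
// Chance that one randomly generated code is already taken, assuming codes
// are uniformly distributed and ignoring blocklist rejections.
const collisionChance = (existingCodes: number, length: number): number =>
  existingCodes / 62 ** length;

// Chance that every one of `attempts` independent candidates collides,
// which is the situation in which generateShortCodeWithRetry would throw.
const exhaustionChance = (
  existingCodes: number,
  length: number,
  attempts = 10,
): number => collisionChance(existingCodes, length) ** attempts;

console.log(collisionChance(1_000_000, 7)); // roughly 2.8e-7
console.log(exhaustionChance(1_000_000, 7)); // vanishingly small
```

So even with a million stored codes at length 7, a single candidate collides less than once in three million tries, and ten collisions in a row is not a realistic failure mode.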

Blocklists: the imported validator function checks three lists:

- Reserved: codes like api, health, and admin that could be used to claim authority and exploit unsuspecting users
- Offensive: self-explanatory
- Protected: codes we might want to use for special purposes later, or that could be used to mislead people, such as names of well-known social media sites
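
The validator module itself isn't shown in this post. Going by how it's called in the Shortener constructor, a minimal version could look roughly like the sketch below. Treat it as a stand-in under assumptions: the case-insensitive matching and the example list entries are my guesses for illustration, and the real module may work differently.

```typescript
// Hypothetical stand-in for './validator'; the real module may differ.
// Returns a predicate that is true when the code appears on none of the lists.
const validator =
  (
    reserved: Set<string>,
    offensive: Set<string>,
    protectedCodes: Set<string>,
  ) =>
  (code: string): boolean => {
    // Assumption: list entries are stored lowercase and matching ignores case.
    const normalized = code.toLowerCase();
    return ![reserved, offensive, protectedCodes].some((list) =>
      list.has(normalized),
    );
  };

const isAllowed = validator(
  new Set<string>(['api', 'health', 'admin']),
  new Set<string>(),
  new Set<string>(['youtube']),
);

console.log(isAllowed('Admin')); // false
console.log(isAllowed('aB3xY9z')); // true
```

Returning a plain predicate keeps the blocklist check easy to combine with other checks, which is how the private test method in Shortener uses it.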

Custom codes: users can optionally provide their own short code (like flipr.io/mysite). We validate it the same way as generated codes. If it's restricted or already taken, we throw a CodeRestrictedError.

Error types: three custom error classes let the HTTP layer distinguish between different failure modes:

- CodeRestrictedError: the custom code is blocked or taken (422 Unprocessable Entity)
- ShortCodeNotFoundError: the code doesn't exist (404 Not Found)
- FailedUrlRetrievalError: something went wrong after insertion (500 Internal Server Error)

This separation keeps the business logic clean. The Shortener class doesn't know anything about HTTP status codes.

Wiring it up with Express

The Express server is straightforward. Here are the two main endpoints:

Shortening a URL:

app.post('/api/shorten', async (req, res) => {
  logger.info('URL shortening request', {
    type: 'shortening_try',
    originalUrl: req.body.url,
    customCode: req.body.custom_code,
  });

  try {
    const { url, custom_code } = URLRequest.parse(req.body);

    const r = shortener.shorten(url, custom_code);
    const shortUrl = `${baseUrl}/${r.code}`;

    logger.info('URL shortened successfully', {
      type: 'shortening_success',
      shortCode: r.code,
    });

    res.json({
      short_code: r.code,
      short_url: shortUrl,
      original_url: r.url,
    });
  } catch (error) {
    logger.error('API shorten error:', { type: 'shortening_error', error });

    if (error instanceof CodeRestrictedError) {
      res.status(422).json({ error: 'Custom code restricted' });
      return;
    }

    if (error instanceof FailedUrlRetrievalError) {
      res.status(500).json({ error: 'Failed to retrieve the shortened URL' });
      return;
    }

    throw error;
  }
});

We validate the request body with Zod, call shortener.shorten(), and return the result. If the shortener throws one of our custom errors, we map it to the appropriate HTTP status code. Any other error, including a Zod validation failure, is rethrown.
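
The `URLRequest` schema isn't shown in this post. To give an idea of the kind of checks such a schema could perform, here is a dependency-free sketch in plain TypeScript. It is not the Zod schema from the repo: `parseShortenRequest` is a hypothetical name, and the http/https restriction and the custom code pattern are assumptions based on the requirements listed earlier.

```typescript
type ShortenRequest = { url: string; custom_code?: string };

const MAX_URL_LENGTH = 2048; // from the URL constraints earlier in this post
const CODE_PATTERN = /^[a-zA-Z0-9]+$/; // same alphabet as generated codes

// Hypothetical hand-rolled request check. Returns the typed request, or
// throws an Error describing the first problem found.
const parseShortenRequest = (body: unknown): ShortenRequest => {
  if (typeof body !== 'object' || body === null) {
    throw new Error('Body must be a JSON object');
  }
  const { url, custom_code } = body as Record<string, unknown>;

  if (typeof url !== 'string' || url.length > MAX_URL_LENGTH) {
    throw new Error('url must be a string of at most 2,048 characters');
  }
  const protocol = new URL(url).protocol; // new URL() throws on malformed input
  if (protocol !== 'http:' && protocol !== 'https:') {
    throw new Error('url must use http or https'); // assumption, see lead-in
  }

  if (custom_code === undefined) {
    return { url };
  }
  if (typeof custom_code !== 'string' || !CODE_PATTERN.test(custom_code)) {
    throw new Error('custom_code may only contain a-z, A-Z and 0-9');
  }
  return { url, custom_code };
};

console.log(parseShortenRequest({ url: 'https://example.com' }));
```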

Resolving a short code:

app.get('/:shortCode', async (req, res) => {
  logger.info('Redirection request', {
    type: 'redirect_try',
    shortCode: req.params.shortCode,
  });

  try {
    const r = shortener.resolve(req.params.shortCode);

    logger.info('Redirection successful', {
      type: 'redirect_success',
      shortCode: req.params.shortCode,
      destinationUrl: r.url,
    });

    res.redirect(302, r.url);
  } catch (error) {
    logger.error('Redirect error:', { type: 'redirect_error', error });

    if (error instanceof ShortCodeNotFoundError) {
      res.status(404).send('<h1>Short URL not found</h1>');
      return;
    }

    throw error;
  }
});

This is the redirect endpoint. When someone visits flipr.io/abc123, we look up the code and send a 302 redirect to the original URL. If the code doesn't exist, we return a 404.

The rest of the Express setup is standard boilerplate: CORS headers, static file serving for a simple web UI, a health check endpoint, and Winston logging throughout. You can see the full server code in this GitHub repo.

Running it locally

Want to try Flipr yourself? The code is open source and ready to run.

Clone and install:

git clone https://github.com/edeckers/flipr-distributed-url-shortener.git
cd flipr-distributed-url-shortener
git checkout part-1-ephemeral
npm install

Start the server:

npm start

The server will start on http://localhost:8000. You'll see Winston logging output in your terminal.

Shorten a URL:

curl -X POST http://localhost:8000/api/shorten \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/some/very/long/url"}'

You'll get back a response like:

{
  "short_code": "aB3xY9",
  "short_url": "http://localhost:8000/aB3xY9",
  "original_url": "https://example.com/some/very/long/url"
}

Test the redirect:

Visit http://localhost:8000/aB3xY9 in your browser, or use curl:

curl -L http://localhost:8000/aB3xY9

Try a custom code:

curl -X POST http://localhost:8000/api/shorten \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "custom_code": "mysite"}'

If you have the web UI enabled, you can also visit http://localhost:3000 in your browser for a simple form interface.

What's next

In the next post, we'll containerize Flipr and create basic Kubernetes configurations using Kustomize. We'll explore how to package the application for deployment and set up the foundation for running it in a distributed environment.

The code evolves in branches on GitHub, so you can follow along commit by commit, deploy it yourself, or jump to any stage that interests you. This is designed as a learning resource for understanding distributed systems architecture, and it's fully open source under the MPL-2.0 license.

If you spot issues, have suggestions, or want to contribute, open an issue or PR on the repo. I'm building this in public as both a learning exercise and a reference implementation for self-hosting.

Wrapping it up

We started with a simple question: what's the minimum viable URL shortener? Turns out, it's about 150 lines of TypeScript. This ephemeral version won't survive a restart, and it won't scale beyond a single instance, but that's fine and intended: we've proven the concept works, and we have a solid foundation to build on.

In the next part of this series, we'll dockerize the application and add a basic Kubernetes configuration. See you then!

Have you ever started a project by building the database schema first, only to realize later that your application logic didn't quite fit? Or do you prefer starting with the data model? I'd love to hear how you approach greenfield projects in the comments!

If the Dutch language doesn't scare you, and you'd like to know more about what keeps me busy aside from writing these blog posts, check my company website branie.it! Maybe we can work together on something someday :)
