<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: qudrat ullah</title>
    <description>The latest articles on DEV Community by qudrat ullah (@qudratullahdev).</description>
    <link>https://dev.to/qudratullahdev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3728686%2F67a54c8a-720d-4dcc-ae85-7ed7d072a5fe.jpg</url>
      <title>DEV Community: qudrat ullah</title>
      <link>https://dev.to/qudratullahdev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/qudratullahdev"/>
    <language>en</language>
    <item>
      <title>Beyond console.log: A Guide to Production-Ready Logging in Node.js</title>
      <dc:creator>qudrat ullah</dc:creator>
      <pubDate>Fri, 17 Apr 2026 16:00:27 +0000</pubDate>
      <link>https://dev.to/qudratullahdev/beyond-consolelog-a-guide-to-production-ready-logging-in-nodejs-4h5d</link>
      <guid>https://dev.to/qudratullahdev/beyond-consolelog-a-guide-to-production-ready-logging-in-nodejs-4h5d</guid>
      <description>&lt;p&gt;As developers, we all have a favorite debugging tool: &lt;code&gt;console.log&lt;/code&gt;. It is simple, it is fast, and it gets the job done when we are trying to figure out why a variable is &lt;code&gt;undefined&lt;/code&gt; on our local machine. But the habits we build in development can become liabilities in production. Relying on &lt;code&gt;console.log&lt;/code&gt; for a live application is like trying to find a specific grain of sand on a beach. It is inefficient, unstructured, and makes debugging real-world issues a nightmare.&lt;/p&gt;

&lt;p&gt;I have seen teams spend hours, sometimes days, sifting through messy, unsearchable log files, all because their logging strategy never matured beyond what they used for local development. Effective logging is not a feature you add at the end. It is a core part of a robust, maintainable, and observable system. Let’s explore how to level up from basic console statements to a professional logging setup that will save you time and headaches.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why &lt;code&gt;console.log&lt;/code&gt; Is Not Enough for Production
&lt;/h2&gt;

&lt;p&gt;When your application is running on a server, handling requests from thousands of users, &lt;code&gt;console.log('User created')&lt;/code&gt; just does not cut it. Here is why it falls short:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. No Structure
&lt;/h3&gt;

&lt;p&gt;A &lt;code&gt;console.log&lt;/code&gt; statement outputs a simple string. While easy for a human to read one line at a time, it is very difficult for a machine to parse. Imagine you want to find all log entries for a specific user, or only show errors that happened after a certain time. With plain text logs, you are stuck using complex regular expressions. This is slow and error-prone.&lt;/p&gt;

&lt;p&gt;Production logs should be structured, typically as JSON. This allows you to easily filter, search, and aggregate logs in a dedicated logging tool.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Before (console.log):&lt;/strong&gt; &lt;code&gt;User 123 failed to update profile.&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;After (Structured Log):&lt;/strong&gt; &lt;code&gt;{"level":"error","time":1678886400000,"pid":456,"hostname":"server-1","userId":123,"msg":"Failed to update profile"}&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
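&lt;p&gt;The payoff is that structured logs can be queried with a real parser instead of a regular expression. Here is a minimal sketch in plain Node.js (the sample log lines are invented for illustration):&lt;br&gt;
&lt;/p&gt;

```javascript
// Filtering structured (JSON) logs is trivial with a real parser.
// Sample log lines as they might appear in a log file (invented data).
const lines = [
  '{"level":"info","userId":42,"msg":"Profile updated"}',
  '{"level":"error","userId":123,"msg":"Failed to update profile"}',
  '{"level":"error","userId":42,"msg":"Payment declined"}',
];

// Keep only error-level entries for a specific user.
const errorsForUser = lines
  .map((line) => JSON.parse(line))
  .filter((entry) => entry.level === 'error')
  .filter((entry) => entry.userId === 42);

console.log(errorsForUser.map((entry) => entry.msg));
// prints: [ 'Payment declined' ]
```

&lt;p&gt;Doing the same query against plain-text logs means writing a fragile regular expression for every question you want to ask.&lt;/p&gt;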

&lt;h3&gt;
  
  
  2. No Log Levels
&lt;/h3&gt;

&lt;p&gt;Not all log messages are equal. A message indicating the server has started is informational. A failed database connection is a critical error. &lt;code&gt;console.log&lt;/code&gt; has no concept of severity. While &lt;code&gt;console.warn&lt;/code&gt; and &lt;code&gt;console.error&lt;/code&gt; exist, they do not offer the granularity needed for a production system.&lt;/p&gt;

&lt;p&gt;Standard log levels include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;&lt;code&gt;fatal&lt;/code&gt;&lt;/strong&gt;: The application is about to crash. A critical, service-ending event.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;&lt;code&gt;error&lt;/code&gt;&lt;/strong&gt;: A serious error occurred, but the application can continue running (e.g., a failed API call to a third party).&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;&lt;code&gt;warn&lt;/code&gt;&lt;/strong&gt;: Something unexpected happened that is not an error but should be monitored (e.g., deprecated API usage).&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;&lt;code&gt;info&lt;/code&gt;&lt;/strong&gt;: Routine information about the application's operation (e.g., server started, user signed in).&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;&lt;code&gt;debug&lt;/code&gt;&lt;/strong&gt;: Detailed information useful only for debugging, typically turned off in production.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;&lt;code&gt;trace&lt;/code&gt;&lt;/strong&gt;: Even more granular information, like detailed function call traces.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Using levels allows you to configure your logger to only output messages of a certain severity. In production, you might set the level to &lt;code&gt;info&lt;/code&gt;, while in development, you might set it to &lt;code&gt;debug&lt;/code&gt;.&lt;/p&gt;
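&lt;p&gt;Under the hood, Pino maps each label to a number (trace=10, debug=20, info=30, warn=40, error=50, fatal=60) and drops anything below the configured threshold. A stripped-down sketch of the mechanism (not Pino's actual implementation):&lt;br&gt;
&lt;/p&gt;

```javascript
// Simplified sketch of how level thresholds work. Pino uses these same
// numeric values internally, but this is NOT Pino's implementation.
const LEVELS = { trace: 10, debug: 20, info: 30, warn: 40, error: 50, fatal: 60 };

function createLogger(minLevel) {
  const threshold = LEVELS[minLevel];
  return {
    log(level, msg) {
      if (LEVELS[level] >= threshold) {
        console.log(JSON.stringify({ level, msg }));
        return true; // emitted
      }
      return false; // suppressed by the threshold
    },
  };
}

const logger = createLogger('info');
logger.log('debug', 'Cache miss for key user:42'); // suppressed at "info"
logger.log('error', 'Database connection failed'); // emitted as JSON
```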

&lt;h3&gt;
  
  
  3. Inflexible Output
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;console.log&lt;/code&gt; always writes to the standard output (&lt;code&gt;stdout&lt;/code&gt;). In a production environment, you need more control. You might want to write logs to a file, ship them to a hosted logging service like Datadog, route them through a pipeline tool like Logstash, or suppress them entirely during tests.&lt;br&gt;
A proper logging library allows you to configure different destinations, called "transports" or "streams".&lt;/p&gt;
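&lt;p&gt;As a taste of what that looks like, here is a transport configuration sketch for Pino (assumes Pino v7 or later; &lt;code&gt;pino-pretty&lt;/code&gt; is a separate package you would install yourself, and the file path is illustrative):&lt;br&gt;
&lt;/p&gt;

```javascript
// Transport configuration sketch (Pino v7+). The destination path and the
// optional pino-pretty target are illustrative, not required.
const pino = require('pino');

const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  transport: {
    targets: [
      // Write raw JSON logs to a file using the built-in pino/file target.
      { target: 'pino/file', options: { destination: './app.log' }, level: 'info' },
      // Pretty-print to the terminal in development (npm install pino-pretty).
      { target: 'pino-pretty', level: 'debug' },
    ],
  },
});
```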
&lt;h2&gt;
  
  
  The Pillars of Good Logging
&lt;/h2&gt;

&lt;p&gt;To build a production-ready logging system, we need to focus on a few key principles.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Structured Data:&lt;/strong&gt; Always log in a machine-readable format like JSON.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Log Levels:&lt;/strong&gt; Use severity levels to categorize your logs.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Context is King:&lt;/strong&gt; Every log entry should contain context to help you trace its origin. The most important piece of context is a unique request identifier.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Configurable Destinations:&lt;/strong&gt; Your application should not care where the logs go. The logging setup should handle routing them to the correct place based on the environment.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Choosing a Library: Pino for Performance
&lt;/h2&gt;

&lt;p&gt;While there are several great logging libraries for Node.js, such as Winston and Bunyan, my go-to choice for new projects is &lt;strong&gt;Pino&lt;/strong&gt;. It is incredibly fast and has very low overhead, which is important in a high-throughput Node.js application. It focuses on doing one thing well: emitting structured JSON logs.&lt;/p&gt;

&lt;p&gt;Let’s get started with a basic Pino setup.&lt;/p&gt;

&lt;p&gt;First, install it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;pino
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, let's create a simple logger instance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// logger.js&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pino&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pino&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;pino&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;level&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;LOG_LEVEL&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;info&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Default to 'info'&lt;/span&gt;
  &lt;span class="na"&gt;formatters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;level&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;label&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;level&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;label&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toUpperCase&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;pino&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stdTimeFunctions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isoTime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exports&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this setup, we configure a few things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  The log &lt;code&gt;level&lt;/code&gt; is set from an environment variable, falling back to &lt;code&gt;info&lt;/code&gt;. This is crucial for controlling log verbosity across different environments.&lt;/li&gt;
&lt;li&gt;  We use the &lt;code&gt;formatters&lt;/code&gt; option to make the level label uppercase for consistency.&lt;/li&gt;
&lt;li&gt;  We set a standard ISO timestamp.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now you can use this logger anywhere in your app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./logger&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Server is starting...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;component&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;database&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Connection is a bit slow.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Failed to connect to Redis&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Redis connection error.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice how we can pass an object as the first argument. Pino merges this object into the final JSON log line, which is the perfect way to add context.&lt;/p&gt;
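&lt;p&gt;Pino takes this further with child loggers: &lt;code&gt;logger.child(bindings)&lt;/code&gt; returns a logger that stamps a fixed set of context fields onto every subsequent entry. Conceptually, a child logger just merges its bindings into each log object (a simplified sketch, not Pino's implementation):&lt;br&gt;
&lt;/p&gt;

```javascript
// Conceptual sketch of a child logger: merge fixed bindings into every
// entry. With Pino itself you would call logger.child({ component: 'database' }).
function childLogger(bindings) {
  return {
    info(fields, msg) {
      const entry = Object.assign({ level: 'info', msg }, bindings, fields);
      console.log(JSON.stringify(entry));
      return entry;
    },
  };
}

const dbLogger = childLogger({ component: 'database' });
const entry = dbLogger.info({ durationMs: 120 }, 'Query completed');
// entry carries component: 'database' without repeating it at each call site
```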

&lt;h2&gt;
  
  
  A Practical Example: Logging in an Express.js App
&lt;/h2&gt;

&lt;p&gt;Let's integrate our logger into a simple Express server. The goal is to automatically log every incoming request and ensure all logs generated while handling that request are tied together with a unique ID.&lt;/p&gt;

&lt;p&gt;We will use &lt;code&gt;pino-http&lt;/code&gt;, a companion library for Pino.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;express pino-http uuid
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, let's set up our server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// server.js&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;express&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;express&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pinoHttp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pino-http&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;v4&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;uuidv4&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;uuid&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./logger&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;express&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Add the pino-http middleware&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;pinoHttp&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="nx"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// Define a custom request ID generator&lt;/span&gt;
  &lt;span class="na"&gt;genReqId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;existingId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;x-request-id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;existingId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;existingId&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;uuidv4&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;X-Request-Id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Set it on the response header&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}));&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// pino-http adds the logger to the request object&lt;/span&gt;
  &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;guest&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;User accessed the home page&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Hello, world!&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;This is a simulated error!&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;An error occurred on the /error route&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Something went wrong.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Server running on http://localhost:3000&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you run this server and hit the &lt;code&gt;/&lt;/code&gt; endpoint, you will see two log lines:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; An &lt;code&gt;info&lt;/code&gt; log from our route handler.&lt;/li&gt;
&lt;li&gt; Another &lt;code&gt;info&lt;/code&gt; log that &lt;code&gt;pino-http&lt;/code&gt; automatically generates when the response is sent, including the status code and response time.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Both log lines will share the same &lt;code&gt;req.id&lt;/code&gt;, which is our unique request identifier. This is incredibly powerful. If a user reports an error, you can ask them for the &lt;code&gt;X-Request-Id&lt;/code&gt; from the response header and instantly find every single log associated with their request, even across multiple microservices if you pass the ID along.&lt;/p&gt;
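&lt;p&gt;Propagating the ID downstream is just a matter of copying it onto outgoing requests. A small sketch (the downstream URL in the commented usage is illustrative):&lt;br&gt;
&lt;/p&gt;

```javascript
// Forward the incoming request's ID to downstream services so logs can be
// correlated across service boundaries.
function buildDownstreamHeaders(req) {
  return {
    'Content-Type': 'application/json',
    // req.id is set by pino-http's genReqId (see the middleware above).
    'X-Request-Id': req.id,
  };
}

// Usage inside a route handler (URL is illustrative):
// await fetch('https://billing.internal/api/charge', {
//   method: 'POST',
//   headers: buildDownstreamHeaders(req),
//   body: JSON.stringify(payload),
// });
```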

&lt;h2&gt;
  
  
  Managing Logs in a Production Environment
&lt;/h2&gt;

&lt;p&gt;Generating logs is only half the battle. You also need a strategy for collecting, storing, and analyzing them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffn6ythqs389kd6sknzq5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffn6ythqs389kd6sknzq5.png" alt="A diagram showing the flow of logs from a Node.js app, through a log agent, to a central logging service for analysis by a developer." width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This diagram shows a typical production logging pipeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Application (&lt;code&gt;Node.js App&lt;/code&gt;)&lt;/strong&gt;: Your application writes JSON logs to &lt;code&gt;stdout&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Log Agent (&lt;code&gt;Log Agent on Server&lt;/code&gt;)&lt;/strong&gt;: A lightweight agent (like Fluentd or Vector) running on the same server collects these logs from &lt;code&gt;stdout&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Central Logging Service&lt;/strong&gt;: The agent forwards the logs to a centralized system like Elasticsearch, Datadog, or Logz.io.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Storage and Analysis&lt;/strong&gt;: The service stores, indexes, and provides a user interface (like Kibana) for searching, visualizing, and creating alerts from the log data.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This approach decouples your application from the logging backend. Your Node.js app's only job is to write structured logs to standard output. The rest is handled by the infrastructure, which is a key principle of the &lt;a href="https://12factor.net/logs" rel="noopener noreferrer"&gt;Twelve-Factor App&lt;/a&gt; methodology.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TD
    A[Node.js App] -- JSON logs to stdout --&amp;gt; B(Log Agent on Server);
    B -- Ships logs --&amp;gt; C{Central Logging Service};
    C -- Stores &amp;amp; Indexes --&amp;gt; D[(Log Database)];
    E[Developer] -- Queries &amp;amp; Visualizes --&amp;gt; C;
    D -- Provides data --&amp;gt; C;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Best Practices and Common Pitfalls
&lt;/h2&gt;

&lt;p&gt;Finally, here are some hard-won lessons from years of managing production systems.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;DO log in JSON.&lt;/strong&gt; I cannot stress this enough. It is the foundation of modern observability.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;DO include a request ID&lt;/strong&gt; in every log entry related to a request.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DON'T log sensitive information.&lt;/strong&gt; Never log passwords, API keys, or personally identifiable information (PII). Use Pino's redaction features to automatically strip sensitive fields from your log objects.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;pino&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; 
  &lt;span class="na"&gt;redact&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;password&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user.email&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; 
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="nx"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Qudrat&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;secret@example.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="na"&gt;password&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;123&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="c1"&gt;// The email and password will be replaced with '[REDACTED]'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;DO log errors with their stack traces.&lt;/strong&gt; The error message alone is often not enough. &lt;code&gt;logger.error({ err: myError }, 'A message')&lt;/code&gt; will automatically include the stack trace when you pass the error object.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;DON'T be too noisy.&lt;/strong&gt; Logging has a cost, both in performance and in storage. Use the &lt;code&gt;info&lt;/code&gt; level for significant events, not for every single function call. Save verbose logging for the &lt;code&gt;debug&lt;/code&gt; level, which you can enable on demand.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Moving beyond &lt;code&gt;console.log&lt;/code&gt; is a sign of a maturing developer. It shows you are thinking not just about making the code work, but about how it will be operated, monitored, and debugged in the real world. By embracing structured logging, you are building more resilient and maintainable applications, and your future self (and your team) will thank you for it.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;About the Author&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hi, I'm &lt;strong&gt;Qudrat Ullah&lt;/strong&gt;, an Engineering Lead with 10+ years building scalable systems across fintech, media, and enterprise. I write about Node.js, cloud infrastructure, AI, and engineering leadership.&lt;/p&gt;

&lt;p&gt;Find me online: &lt;a href="https://www.linkedin.com/in/qudratullah-me/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; · &lt;a href="https://qudratullah.net" rel="noopener noreferrer"&gt;qudratullah.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you found this useful, share it with a fellow engineer or drop your thoughts in the comments.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://qudratullah.net/blog/beyond-consolelog-a-guide-to-production-ready-logging-in-nodejs" rel="noopener noreferrer"&gt;qudratullah.net&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>node</category>
      <category>logging</category>
      <category>observability</category>
      <category>pino</category>
    </item>
    <item>
      <title>From Vague to Valuable: A Practical Guide to Prompting LLMs - Generative AI</title>
      <dc:creator>qudrat ullah</dc:creator>
      <pubDate>Fri, 17 Apr 2026 15:39:20 +0000</pubDate>
      <link>https://dev.to/qudratullahdev/from-vague-to-valuable-a-practical-guide-to-prompting-llms-generative-ai-56ab</link>
      <guid>https://dev.to/qudratullahdev/from-vague-to-valuable-a-practical-guide-to-prompting-llms-generative-ai-56ab</guid>
      <description>&lt;h2&gt;
  
  
  Your First Superpower in Tech
&lt;/h2&gt;

&lt;p&gt;Have you ever tried to get help from an AI chatbot, like ChatGPT, and received a completely useless answer? It feels frustrating. You know it's powerful, but it doesn't seem to understand you.&lt;/p&gt;

&lt;p&gt;Think of a Large Language Model (LLM) as a brilliant intern. They have read almost every book and website in existence. They are incredibly fast and knowledgeable. But they are also very literal. They have no real-world experience and will do &lt;em&gt;exactly&lt;/em&gt; what you ask, even if it's not what you &lt;em&gt;meant&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Your instruction to this intern is called a "prompt". Learning to write good prompts is like learning a new language for communicating with computers. It is one of the most valuable skills you can build today, whether you are a developer, a student, or just curious about technology. It's a true superpower.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Core of a Good Prompt
&lt;/h2&gt;

&lt;p&gt;Getting a great result from an LLM isn't about secret tricks or magic words. It's about being clear and providing the right information. Let's break down the four key ingredients of a perfect prompt.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fevx43vcm8gb9oomkwxkb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fevx43vcm8gb9oomkwxkb.png" alt="A flowchart showing that a vague prompt leads to a poor result, while a good prompt, combining a specific task, context, persona, and format, leads to an excellent result." width="800" height="198"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Be Specific and Clear
&lt;/h3&gt;

&lt;p&gt;Imagine walking into a coffee shop and saying, "Give me coffee." You might get a black filter coffee, an espresso, or a latte. It's a gamble. Instead, you say, "I'd like a large iced Americano with no sugar." Now you get exactly what you want.&lt;/p&gt;

&lt;p&gt;Prompts work the same way. Vague prompts lead to vague answers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A Bad Prompt:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Write code for a button."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is like asking for "coffee". What language? What does it look like? What does it do?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A Good Prompt:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Write the HTML and CSS code for a clickable button. The button's text should be 'Download Report'. It should have a blue background (#3498db), white text, rounded corners (5px), and a light gray border. When a user hovers over it, the background should change to a darker blue (#2980b9)."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;See the difference? We specified the language (HTML/CSS), the text, the colors, the shape, and the behavior. There is no room for guessing.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Provide Context
&lt;/h3&gt;

&lt;p&gt;The LLM does not know what you are working on or what you were thinking about five minutes ago. You have to provide all the necessary background information in your prompt.&lt;/p&gt;

&lt;p&gt;Think of it like asking a friend for directions. You wouldn't just ask, "How do I get to the library?" You'd say, "I'm currently at the corner of Main Street and Park Avenue. What's the quickest way to walk to the central library from here?"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A Bad Prompt:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"How can I make this function faster?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The LLM has no idea what "this function" is.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A Good Prompt:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I'm working on a Python project. I have the following function that searches for a user in a list of a million user objects. It's very slow. How can I make it faster? Here is the code:&lt;/p&gt;


&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;find_user_by_email&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;users_list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;users_list&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;By providing the code and explaining the problem (it's slow with a large list), you give the LLM the context it needs to provide a helpful answer, like suggesting a dictionary for faster lookups.&lt;/p&gt;
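&lt;p&gt;For illustration, the kind of fix the LLM is likely to suggest looks like this. A minimal sketch, assuming a simple &lt;code&gt;User&lt;/code&gt; object with an &lt;code&gt;email&lt;/code&gt; attribute:&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class User:
    email: str
    name: str

def build_email_index(users_list):
    # One O(n) pass builds a dictionary keyed by email.
    return {user.email: user for user in users_list}

def find_user_by_email(index, email):
    # dict.get returns None when the email is absent,
    # matching the behavior of the original loop.
    return index.get(email)

users = [User("a@example.com", "Ada"), User("b@example.com", "Bob")]
index = build_email_index(users)
```

&lt;p&gt;Building the index costs one pass over the list, but every lookup afterwards is O(1) on average instead of O(n).&lt;/p&gt;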

&lt;h3&gt;
  
  
  3. Define the Persona and Format
&lt;/h3&gt;

&lt;p&gt;One of the most powerful features of LLMs is that you can tell them &lt;em&gt;who to be&lt;/em&gt; and &lt;em&gt;how to answer&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Persona:&lt;/strong&gt; Ask the LLM to adopt a role. This helps shape the tone and style of the response.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  "Act as a senior software engineer conducting a code review."&lt;/li&gt;
&lt;li&gt;  "You are a friendly tutor explaining a complex topic to a beginner."&lt;/li&gt;
&lt;li&gt;  "Pretend you are a skeptical project manager and point out potential flaws in my plan."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Format:&lt;/strong&gt; Tell the LLM exactly how you want the output structured.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  "Provide the answer as a JSON object."&lt;/li&gt;
&lt;li&gt;  "Explain the steps in a numbered list."&lt;/li&gt;
&lt;li&gt;  "Create a markdown table comparing these two databases."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;A Good Prompt Combining Both:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Explain the concept of 'git merge' versus 'git rebase'. Act as a patient mentor talking to a junior developer. Use a simple analogy to explain each one. Finally, summarize the pros and cons of each in a markdown table."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This prompt tells the LLM its role (mentor), the target audience (junior developer), the content required (explanation, analogy), and the final output format (a markdown table). This level of detail almost guarantees a high-quality answer.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Iterate and Refine
&lt;/h3&gt;

&lt;p&gt;Your first prompt is often just the beginning of a conversation. Don't be discouraged if the first answer isn't perfect. Use it as a starting point and refine your request.&lt;/p&gt;

&lt;p&gt;Think of it like sculpting. You start with a block of clay and slowly shape it.&lt;/p&gt;

&lt;p&gt;A typical conversation might look like this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You:&lt;/strong&gt; "Give me some ideas for a new mobile app."&lt;br&gt;
&lt;strong&gt;LLM:&lt;/strong&gt; (Gives a generic list: a to-do list app, a fitness tracker, etc.)&lt;br&gt;
&lt;strong&gt;You:&lt;/strong&gt; "Those are okay, but I want something more unique. Focus on apps for people who enjoy gardening."&lt;br&gt;
&lt;strong&gt;LLM:&lt;/strong&gt; (Gives better ideas: a plant identification app, a watering schedule tracker, a community for local gardeners.)&lt;br&gt;
&lt;strong&gt;You:&lt;/strong&gt; "I like the plant identification idea. Can you list the key features for an app like that? Present it as a bulleted list."&lt;/p&gt;

&lt;p&gt;Each follow-up prompt gets you closer to the perfect result.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Couple of Simple "Pro" Tricks
&lt;/h2&gt;

&lt;p&gt;Once you master the basics, you can try these slightly more advanced techniques.&lt;/p&gt;

&lt;h3&gt;
  
  
  Few-Shot Prompting: Show, Don't Just Tell
&lt;/h3&gt;

&lt;p&gt;Sometimes, the best way to get the format you want is to show the LLM a few examples. This is called "few-shot prompting".&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I need to extract the main subject from these sentences. Follow the pattern below.&lt;/p&gt;

&lt;p&gt;Sentence: The quick brown fox jumps over the lazy dog.&lt;br&gt;
Subject: fox&lt;/p&gt;

&lt;p&gt;Sentence: My team is shipping a new feature tomorrow.&lt;br&gt;
Subject: team&lt;/p&gt;

&lt;p&gt;Sentence: Artificial intelligence is transforming the world.&lt;br&gt;
Subject:"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;By providing examples, you train the model on the exact output you expect. It will almost certainly answer "Artificial intelligence".&lt;/p&gt;
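&lt;p&gt;If you assemble few-shot prompts often, it helps to build them programmatically. A minimal sketch in Python, reusing the example pairs from the prompt above:&lt;/p&gt;

```python
def build_few_shot_prompt(instruction, examples, query):
    # Each example is a (sentence, subject) pair; the final sentence is
    # left unanswered so the model completes the pattern.
    lines = [instruction, ""]
    for sentence, subject in examples:
        lines.append(f"Sentence: {sentence}")
        lines.append(f"Subject: {subject}")
        lines.append("")
    lines.append(f"Sentence: {query}")
    lines.append("Subject:")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "I need to extract the main subject from these sentences. "
    "Follow the pattern below.",
    [("The quick brown fox jumps over the lazy dog.", "fox"),
     ("My team is shipping a new feature tomorrow.", "team")],
    "Artificial intelligence is transforming the world.",
)
```

&lt;p&gt;Keeping the examples in a plain list makes it easy to add or swap them as you refine the pattern.&lt;/p&gt;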

&lt;h3&gt;
  
  
  Chain-of-Thought: Ask it to Think Step-by-Step
&lt;/h3&gt;

&lt;p&gt;For problems that require logic or calculation, you can ask the LLM to explain its reasoning process. This often leads to more accurate results because it forces the model to break the problem down. Simply add "Let's think step-by-step" to your prompt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I have 50 apples. I give 10 to my friend, then I buy 25 more. I then split the new total evenly among myself and 4 friends. How many apples does each person get? Let's think step-by-step."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The LLM will first calculate the intermediate steps before giving you the final answer, reducing the chance of a simple math error.&lt;/p&gt;
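&lt;p&gt;You can check the arithmetic the model should walk through. The steps, worked out in plain Python:&lt;/p&gt;

```python
apples = 50
apples -= 10           # give 10 to a friend -> 40
apples += 25           # buy 25 more         -> 65
people = 1 + 4         # me plus 4 friends   -> 5
per_person = apples // people
print(per_person)      # 13
```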

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;Writing a good prompt is a skill, and like any skill, it gets better with practice. Don't worry about getting it perfect on the first try.&lt;/p&gt;

&lt;p&gt;Remember these key principles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Be Specific:&lt;/strong&gt; Avoid vague requests. Detail exactly what you need.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Give Context:&lt;/strong&gt; Provide all the necessary background information. The LLM knows nothing but what you tell it.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Assign a Persona and Format:&lt;/strong&gt; Tell the AI who to be and how to structure its answer.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Iterate:&lt;/strong&gt; Treat it as a conversation. Refine your prompts based on the answers you get.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Show, Don't Tell:&lt;/strong&gt; Use examples (few-shot) to guide the AI's output for complex formatting.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Think Step-by-Step:&lt;/strong&gt; For logic problems, ask the AI to show its work.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Mastering this skill will not only help you get better answers from AI but also make you a clearer communicator and a more effective problem-solver. Now go ahead and give that brilliant intern some clear instructions!&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;About the Author&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hi, I'm &lt;strong&gt;Qudrat Ullah&lt;/strong&gt;, an Engineering Lead with 10+ years building scalable systems across fintech, media, and enterprise. I write about Node.js, cloud infrastructure, AI, and engineering leadership.&lt;/p&gt;

&lt;p&gt;Find me online: &lt;a href="https://www.linkedin.com/in/qudratullah-me/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; · &lt;a href="https://qudratullah.net" rel="noopener noreferrer"&gt;qudratullah.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you found this useful, share it with a fellow engineer or drop your thoughts in the comments.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://qudratullah.net/blog/from-vague-to-valuable-a-practical-guide-to-prompting-llms" rel="noopener noreferrer"&gt;qudratullah.net&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>promptengineering</category>
      <category>llm</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Vector Databases Made Simple: Your First Step Into Modern Data Storage</title>
      <dc:creator>qudrat ullah</dc:creator>
      <pubDate>Fri, 17 Apr 2026 13:24:31 +0000</pubDate>
      <link>https://dev.to/qudratullahdev/vector-databases-made-simple-your-first-step-into-modern-data-storage-oo5</link>
      <guid>https://dev.to/qudratullahdev/vector-databases-made-simple-your-first-step-into-modern-data-storage-oo5</guid>
      <description>&lt;h2&gt;
  
  
  What Are Vector Databases?
&lt;/h2&gt;

&lt;p&gt;Imagine you're organizing your music collection. Instead of just sorting by artist name or genre, you could organize songs by how they "feel" - grouping similar vibes together. Vector databases work similarly, but with data.&lt;/p&gt;

&lt;p&gt;A vector database stores information as mathematical representations called vectors. Think of vectors as coordinates that describe the "essence" or "meaning" of your data. Just like GPS coordinates tell you where something is on Earth, vectors tell you where your data sits in a multi-dimensional space of meaning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Should You Care About Vector Databases?
&lt;/h2&gt;

&lt;p&gt;Traditional databases are great for exact matches. If you search for "John Smith," you get exactly "John Smith." But what if you want to find "similar" things? What if you want to search for "happy songs" or "articles about cooking pasta"?&lt;/p&gt;

&lt;p&gt;This is where vector databases shine. They excel at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Similarity search&lt;/strong&gt;: Finding items that are alike&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Semantic search&lt;/strong&gt;: Understanding meaning, not just keywords&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI applications&lt;/strong&gt;: Powering chatbots, recommendation systems, and more&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How Do Vector Databases Work?
&lt;/h2&gt;

&lt;p&gt;Let me explain with a simple analogy. Imagine you're describing people using only three numbers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Height (in inches)&lt;/li&gt;
&lt;li&gt;Age (in years) &lt;/li&gt;
&lt;li&gt;Income (in thousands)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So John might be [70, 25, 45] and Sarah might be [68, 27, 50]. These numbers are vectors. To find people similar to John, you'd look for vectors with numbers close to his.&lt;/p&gt;

&lt;p&gt;In real vector databases, instead of 3 numbers, you might have 384 or 1,536 numbers describing the "meaning" of text, images, or other data.&lt;/p&gt;
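&lt;p&gt;To make "close" concrete, here is how you could compare John and Sarah using cosine similarity, one of the most common distance measures in vector databases. A minimal sketch with the toy three-number vectors from above:&lt;/p&gt;

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: 1.0 means the vectors point the same way,
    # 0.0 means they are unrelated (orthogonal).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

john = [70, 25, 45]
sarah = [68, 27, 50]
similarity = cosine_similarity(john, sarah)
```

&lt;p&gt;A result near 1.0 means the vectors point in almost the same direction, so John and Sarah count as very similar.&lt;/p&gt;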

&lt;h2&gt;
  
  
  Your First Vector Database Example
&lt;/h2&gt;

&lt;p&gt;Let's build a simple example using Python and a popular vector database called Chroma. Don't worry if you're new to Python - I'll explain each step.&lt;/p&gt;

&lt;p&gt;First, install the required packages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;chromadb sentence-transformers
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, let's create a basic vector database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;chromadb&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sentence_transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SentenceTransformer&lt;/span&gt;

&lt;span class="c1"&gt;# Create a vector database client
&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chromadb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Create a collection (like a table in traditional databases)
&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my_documents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Some sample documents
&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I love pizza and pasta&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The weather is sunny today&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Python is a great programming language&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I enjoy Italian food very much&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;It&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s raining outside right now&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Add documents to our collection
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Documents added to vector database!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now let's search for similar documents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Search for documents similar to a query
&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;query_texts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I like Italian cuisine&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;n_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;  &lt;span class="c1"&gt;# Get top 2 similar results
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Similar documents:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;documents&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;- &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you run this code, it will find documents about Italian food, even though your search didn't use the exact words "pizza" or "pasta."&lt;/p&gt;

&lt;h2&gt;
  
  
  Popular Vector Database Options
&lt;/h2&gt;

&lt;p&gt;If you're just getting started, these are the most approachable options:&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Chroma&lt;/strong&gt; (Best for Learning)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Easy to set up&lt;/li&gt;
&lt;li&gt;Works locally on your computer&lt;/li&gt;
&lt;li&gt;Great documentation&lt;/li&gt;
&lt;li&gt;Free to use&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Pinecone&lt;/strong&gt; (Best for Production)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Cloud-based (no setup required)&lt;/li&gt;
&lt;li&gt;Very fast&lt;/li&gt;
&lt;li&gt;Has a free tier&lt;/li&gt;
&lt;li&gt;Great for real applications&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Weaviate&lt;/strong&gt; (Best for Advanced Features)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Open source&lt;/li&gt;
&lt;li&gt;Lots of built-in features&lt;/li&gt;
&lt;li&gt;Can run locally or in the cloud&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Common Use Cases You'll See
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Chatbots and Q&amp;amp;A Systems&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;AI assistants that answer questions over private data typically use a vector database to retrieve the relevant passages before generating a response, a pattern known as retrieval-augmented generation (RAG).&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;2. Recommendation Systems&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Netflix uses similar technology to recommend movies you might like based on what you've watched.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;3. Image Search&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Google Photos can find "pictures of dogs" even if you never tagged them as dogs.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;4. Document Search&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Companies use vector databases to help employees find relevant documents, even with vague queries like "contract about software licensing."&lt;/p&gt;

&lt;h2&gt;
  
  
  Tips for Getting Started
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Start Small&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Begin with a few dozen documents. Don't try to build the next Google on day one.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Use Pre-trained Models&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Don't create your own vectors from scratch. Use models like &lt;code&gt;sentence-transformers&lt;/code&gt; that are already trained.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Test Your Searches&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Try different queries and see what results you get. This helps you understand how your vector database "thinks."&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Monitor Performance&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;As your database grows, searches might get slower. Most vector databases have settings to help with this.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Beginner Mistakes to Avoid
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Mistake 1&lt;/strong&gt;: Using too many dimensions&lt;br&gt;
More dimensions aren't always better. The dimension count is set by your embedding model, and models producing 384 or 768 dimensions are a solid starting point.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mistake 2&lt;/strong&gt;: Not preprocessing your data&lt;br&gt;
Clean your text data first - remove extra spaces, fix typos, etc.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mistake 3&lt;/strong&gt;: Expecting perfect results immediately&lt;br&gt;
Vector search is about similarity, not exact matches. Results improve as you fine-tune.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;Once you're comfortable with basic vector databases, you can explore:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid search&lt;/strong&gt;: Combining vector search with traditional keyword search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fine-tuning&lt;/strong&gt;: Customizing models for your specific data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Production deployment&lt;/strong&gt;: Moving from local testing to real applications&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Vector databases store data as mathematical representations that capture meaning&lt;/li&gt;
&lt;li&gt;They excel at finding similar items, not just exact matches&lt;/li&gt;
&lt;li&gt;Start with simple tools like Chroma for learning&lt;/li&gt;
&lt;li&gt;Use pre-trained models instead of building from scratch&lt;/li&gt;
&lt;li&gt;Common applications include chatbots, recommendations, and semantic search&lt;/li&gt;
&lt;li&gt;Begin with small datasets and gradually scale up&lt;/li&gt;
&lt;li&gt;Vector databases are becoming essential for modern AI applications&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Vector databases might seem complex at first, but they're just tools for finding similar things. Start with the simple example above, experiment with different queries, and you'll quickly see their power. The future of search and AI heavily relies on this technology, so learning it now puts you ahead of the curve.&lt;/p&gt;

&lt;p&gt;If you'd like me to go deeper on this topic, let me know in the comments. I can follow up with a step-by-step walkthrough of a real-world working application.&lt;/p&gt;

&lt;p&gt;Thank you.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;About the Author&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hi, I'm &lt;strong&gt;Qudrat Ullah&lt;/strong&gt;, an Engineering Lead with 10+ years building scalable systems across fintech, media, and enterprise. I write about Node.js, cloud infrastructure, AI, and engineering leadership.&lt;/p&gt;

&lt;p&gt;Find me online: &lt;a href="https://www.linkedin.com/in/qudratullah-me/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; · &lt;a href="https://qudratullah.net" rel="noopener noreferrer"&gt;qudratullah.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you found this useful, share it with a fellow engineer or drop your thoughts in the comments.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://qudratullah.net/blog/vector-databases-made-simple-your-first-step-into-modern-data-storage" rel="noopener noreferrer"&gt;qudratullah.net&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>vectordatabases</category>
      <category>beginners</category>
      <category>python</category>
      <category>ai</category>
    </item>
    <item>
      <title>Build Your First AI Search in 30 Minutes: A Complete RAG Tutorial</title>
      <dc:creator>qudrat ullah</dc:creator>
      <pubDate>Thu, 16 Apr 2026 20:15:29 +0000</pubDate>
      <link>https://dev.to/qudratullahdev/build-your-first-ai-search-in-30-minutes-a-complete-rag-tutorial-28d9</link>
      <guid>https://dev.to/qudratullahdev/build-your-first-ai-search-in-30-minutes-a-complete-rag-tutorial-28d9</guid>
      <description>&lt;h2&gt;
  
  
  What is RAG and Why Should You Care?
&lt;/h2&gt;

&lt;p&gt;RAG stands for Retrieval-Augmented Generation. Think of it like having a super-smart assistant who can quickly search through your documents and then give you intelligent answers based on what they found.&lt;/p&gt;

&lt;p&gt;Imagine you have thousands of company documents, and instead of spending hours searching through them manually, you could just ask questions like "What's our vacation policy?" and get instant, accurate answers. That's exactly what RAG does.&lt;/p&gt;

&lt;p&gt;The "Retrieval" part finds relevant information, and the "Generation" part creates human-like responses using that information. It's like combining Google search with ChatGPT, but for your own data.&lt;/p&gt;

&lt;h2&gt;
  
  
  How RAG Works: The Simple Explanation
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff51y2npi9rjtdmm4h8p9.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff51y2npi9rjtdmm4h8p9.jpg" alt="RAG workflow diagram showing three steps" width="800" height="375"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;RAG works in three simple steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Store&lt;/strong&gt;: Break your documents into small chunks and convert them into numbers (called embeddings) that computers understand&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search&lt;/strong&gt;: When you ask a question, find the most relevant chunks from your stored data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generate&lt;/strong&gt;: Feed those relevant chunks to an AI model (like GPT) to create a natural answer&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb7bsx71m77eu9u1v35dt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb7bsx71m77eu9u1v35dt.png" alt="What are embeddings" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Think of it like organizing a massive library. Instead of browsing every book, you have a librarian who knows exactly where to find information and can summarize it for you.&lt;/p&gt;
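&lt;p&gt;Before anything is stored, the "Store" step splits each document into overlapping chunks. A minimal sketch of a character-based splitter; the chunk size and overlap values are illustrative, not prescriptive:&lt;/p&gt;

```python
def chunk_text(text, chunk_size=200, overlap=50):
    # Slide a window across the text; each step advances by
    # chunk_size - overlap so adjacent chunks share some context.
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

chunks = chunk_text("word " * 100, chunk_size=100, overlap=20)
```

&lt;p&gt;The overlap ensures that a sentence straddling a chunk boundary still appears whole in at least one chunk.&lt;/p&gt;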

&lt;h2&gt;
  
  
  Building Your First RAG System
&lt;/h2&gt;

&lt;p&gt;Let's build a simple RAG system that can answer questions about your documents. We'll use Python and some helpful libraries.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Install Required Libraries
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;langchain openai chromadb sentence-transformers
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Prepare Your Documents
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# documents.py
&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Our company offers 20 days of paid vacation per year. Employees can carry over up to 5 unused days to the next year.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The office hours are 9 AM to 6 PM, Monday through Friday. Remote work is allowed up to 3 days per week.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Health insurance covers 100% of employee premiums and 80% of family member premiums.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The annual performance review happens in December. Salary increases are effective from January.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Create the RAG System
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fht1jrsdwmq9e1wilscfc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fht1jrsdwmq9e1wilscfc.png" alt="RAG System Steps" width="800" height="458"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# rag_system.py
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.text_splitter&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CharacterTextSplitter&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.embeddings&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HuggingFaceEmbeddings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.vectorstores&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Chroma&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.llms&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chains&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RetrievalQA&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="c1"&gt;# Set your OpenAI API key
&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-api-key-here&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SimpleRAG&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Split documents into smaller chunks
&lt;/span&gt;        &lt;span class="n"&gt;text_splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;texts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text_splitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Create embeddings (convert text to numbers)
&lt;/span&gt;        &lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HuggingFaceEmbeddings&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="c1"&gt;# Store in vector database
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vectorstore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Chroma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Set up the question-answering chain
&lt;/span&gt;        &lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;qa_chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;RetrievalQA&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_chain_type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;chain_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stuff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_retriever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ask_question&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;qa_chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Usage
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;

&lt;span class="n"&gt;rag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SimpleRAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rag&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ask_question&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;How many vacation days do I get?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Test Your RAG System
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# test_rag.py
&lt;/span&gt;&lt;span class="n"&gt;questions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;How many vacation days do I get?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Can I work from home?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;When do performance reviews happen?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What does health insurance cover?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;questions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Q: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;rag&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ask_question&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Making It Better: Pro Tips
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Chunk Your Data Smartly
&lt;/h3&gt;

&lt;p&gt;Don't just split text randomly. Split by paragraphs, sentences, or logical sections:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Better text splitting
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.text_splitter&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;

&lt;span class="n"&gt;text_splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;separators&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
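&lt;p&gt;To see what chunking with overlap actually does, here is a rough pure-Python sketch of the idea. It is a simplification: the real &lt;code&gt;RecursiveCharacterTextSplitter&lt;/code&gt; prefers to break at the listed separators rather than at fixed character offsets.&lt;/p&gt;

```python
def chunk_text(text, chunk_size=20, chunk_overlap=5):
    """Split text into fixed-size chunks, repeating the tail of each chunk
    at the start of the next so no passage loses its surrounding context."""
    chunks = []
    step = chunk_size - chunk_overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
    return chunks

text = "Our company offers 20 days of paid vacation per year."
for chunk in chunk_text(text):
    print(repr(chunk))
```

&lt;p&gt;Notice how each chunk's last few characters reappear at the start of the next one; that overlap is what keeps a sentence readable even when a split lands in the middle of it.&lt;/p&gt;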



&lt;h3&gt;
  
  
  2. Use Better Embeddings
&lt;/h3&gt;

&lt;p&gt;For better search results, use more powerful embedding models:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.embeddings&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAIEmbeddings&lt;/span&gt;

&lt;span class="c1"&gt;# More accurate but costs money
&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Free alternative that works well
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sentence_transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SentenceTransformer&lt;/span&gt;
&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HuggingFaceEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sentence-transformers/all-MiniLM-L6-v2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Add Memory for Conversations
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.memory&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ConversationBufferMemory&lt;/span&gt;

&lt;span class="n"&gt;memory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ConversationBufferMemory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memory_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chat_history&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;qa_chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;RetrievalQA&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_chain_type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;vectorstore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_retriever&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;memory&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Common Mistakes to Avoid
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Mistake 1: Using chunks that are too big or too small&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Too big: the relevant passage gets diluted by off-topic information&lt;/li&gt;
&lt;li&gt;Too small: important context gets lost&lt;/li&gt;
&lt;li&gt;Sweet spot: 500-1500 characters per chunk&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Mistake 2: Not handling edge cases&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;safe_ask_question&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Please ask a valid question.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;qa_chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Sorry, I couldn&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t process your question: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Mistake 3: Ignoring document quality&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clean your documents first&lt;/li&gt;
&lt;li&gt;Remove unnecessary formatting&lt;/li&gt;
&lt;li&gt;Fix typos and inconsistencies&lt;/li&gt;
&lt;/ul&gt;
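&lt;p&gt;Cleaning does not need to be fancy. Here is a minimal sketch of a pre-indexing cleanup pass, assuming markdown-style source documents; adapt the rules to whatever formats you actually ingest:&lt;/p&gt;

```python
import re

def clean_document(raw):
    """Minimal cleanup before indexing: drop markdown markers,
    collapse runs of whitespace and newlines into single spaces."""
    text = raw.replace("**", "").replace("#", "")
    text = re.sub(r"\s+", " ", text)
    return text.strip()

raw = "##  Office   Hours\n\n**9 AM to 6 PM**, Monday  to Friday."
print(clean_document(raw))
```

&lt;p&gt;Run this over every document before chunking; noisy formatting in the source becomes noisy chunks, and noisy chunks become noisy answers.&lt;/p&gt;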

&lt;h2&gt;
  
  
  Scaling Up: Next Steps
&lt;/h2&gt;

&lt;p&gt;Once you have a basic RAG system working, you can:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Add more document types&lt;/strong&gt;: PDFs, Word docs, web pages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Improve the UI&lt;/strong&gt;: Build a web interface with Streamlit or Flask&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use better databases&lt;/strong&gt;: PostgreSQL with pgvector for production&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add authentication&lt;/strong&gt;: Secure your system for multiple users&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor performance&lt;/strong&gt;: Track which questions work well and which don't&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;RAG combines search and AI generation to answer questions about your documents&lt;/li&gt;
&lt;li&gt;You need three main components: document storage, similarity search, and text generation&lt;/li&gt;
&lt;li&gt;Start simple with basic libraries, then improve gradually&lt;/li&gt;
&lt;li&gt;Good document chunking is crucial for accurate results&lt;/li&gt;
&lt;li&gt;Test with real questions your users would ask&lt;/li&gt;
&lt;li&gt;Always handle errors gracefully in production systems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;RAG isn't magic, but it's incredibly powerful when done right. Start with this simple example, experiment with your own documents, and gradually add more features. Before you know it, you'll have built an AI assistant that actually knows about your specific domain.&lt;/p&gt;

&lt;p&gt;The best part? This is just the beginning. RAG technology is evolving rapidly, and mastering the basics now will set you up for even more exciting developments ahead.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;About the Author&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hi, I'm &lt;strong&gt;Qudrat Ullah&lt;/strong&gt;, an Engineering Lead with 10+ years building scalable systems across fintech, media, and enterprise. I write about Node.js, cloud infrastructure, AI, and engineering leadership.&lt;/p&gt;

&lt;p&gt;Find me online: &lt;a href="https://www.linkedin.com/in/qudratullah-me/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; · &lt;a href="https://qudratullah.net" rel="noopener noreferrer"&gt;qudratullah.net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you found this useful, share it with a fellow engineer or drop your thoughts in the comments.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://qudratullah.net/blog/build-your-first-ai-search-in-30-minutes-a-complete-rag-tutorial" rel="noopener noreferrer"&gt;qudratullah.net&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>rag</category>
      <category>ai</category>
      <category>python</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Generative AI vs Agentic AI</title>
      <dc:creator>qudrat ullah</dc:creator>
      <pubDate>Sun, 25 Jan 2026 15:32:52 +0000</pubDate>
      <link>https://dev.to/qudratullahdev/generative-ai-vs-agentic-ai-52e9</link>
      <guid>https://dev.to/qudratullahdev/generative-ai-vs-agentic-ai-52e9</guid>
      <description>&lt;h1&gt;
  
  
  Generative AI vs Agentic AI
&lt;/h1&gt;

&lt;p&gt;Artificial Intelligence is moving fast, and two terms are showing up everywhere:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Generative AI&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Agentic AI&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They sound similar, but they serve &lt;strong&gt;very different purposes&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This article explains the difference in a simple, practical way: no buzzwords, no heavy theory.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  🤖 What Is Generative AI?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Generative AI creates content.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You give it a prompt, and it generates an output such as text, images, or code.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Think of Generative AI as a very smart content creator.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It responds to your request, and then it stops.&lt;/p&gt;

&lt;h3&gt;
  
  
  Common examples
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;ChatGPT writing text&lt;/li&gt;
&lt;li&gt;DALL·E generating images&lt;/li&gt;
&lt;li&gt;GitHub Copilot suggesting code&lt;/li&gt;
&lt;li&gt;AI summarising documents&lt;/li&gt;
&lt;li&gt;AI translating languages&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Simple example
&lt;/h3&gt;

&lt;p&gt;You ask:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Write a job description for a frontend developer.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Generative AI:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generates the text
&lt;/li&gt;
&lt;li&gt;Returns the response
&lt;/li&gt;
&lt;li&gt;Ends the interaction
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;✅ Task completed&lt;br&gt;&lt;br&gt;
❌ No follow-up&lt;br&gt;&lt;br&gt;
❌ No action taken  &lt;/p&gt;

&lt;h2&gt;
  
  
  ⚙️ What Is Agentic AI?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Agentic AI doesn’t just generate responses, it takes action.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of only answering, it can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understand a goal&lt;/li&gt;
&lt;li&gt;Break it into steps&lt;/li&gt;
&lt;li&gt;Use tools&lt;/li&gt;
&lt;li&gt;Make decisions&lt;/li&gt;
&lt;li&gt;Execute tasks&lt;/li&gt;
&lt;li&gt;Repeat until the goal is achieved&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Think of Agentic AI as a digital worker.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Simple example
&lt;/h3&gt;

&lt;p&gt;You say:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Find family-friendly events in London this weekend and email me the best five.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;An Agentic AI might:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Search multiple event websites
&lt;/li&gt;
&lt;li&gt;Open and analyse pages
&lt;/li&gt;
&lt;li&gt;Extract dates, prices, and locations
&lt;/li&gt;
&lt;li&gt;Filter family-friendly events
&lt;/li&gt;
&lt;li&gt;Rank the best options
&lt;/li&gt;
&lt;li&gt;Send you an email
&lt;/li&gt;
&lt;li&gt;Store results for future use
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is not just AI responding — this is &lt;strong&gt;AI working toward a goal&lt;/strong&gt;.&lt;/p&gt;
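&lt;p&gt;The loop above can be sketched in a few lines of Python. This is a toy illustration, not a real agent framework: the plan is hard-coded and the tools (&lt;code&gt;search_events&lt;/code&gt;, &lt;code&gt;filter_family_friendly&lt;/code&gt;) are hypothetical stand-ins for real API calls.&lt;/p&gt;

```python
def search_events(query):
    # Hypothetical tool: a real agent would call event-site APIs here.
    return ["puppet show", "science fair", "late-night club"]

def filter_family_friendly(events):
    # Hypothetical tool: keep only events suitable for families.
    return [e for e in events if "club" not in e]

def run_agent(goal):
    """Minimal agentic loop: work through a plan step by step,
    feeding each tool's output into the next until the goal is done."""
    plan = [search_events, filter_family_friendly]
    result = goal
    for tool in plan:
        result = tool(result)
    return result

print(run_agent("family-friendly events in London this weekend"))
# → ['puppet show', 'science fair']
```

&lt;p&gt;In a production agent, the fixed &lt;code&gt;plan&lt;/code&gt; list is replaced by an LLM that decides which tool to call next based on the results so far; the loop structure stays the same.&lt;/p&gt;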

&lt;h2&gt;
  
  
  🔑 The Key Difference
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Generative AI answers questions.&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Agentic AI completes goals.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  🧩 Side-by-Side Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Generative AI&lt;/th&gt;
&lt;th&gt;Agentic AI&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Generates text or images&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Makes decisions&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Uses tools or APIs&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Executes actions&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Works autonomously&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Has goals&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  🧠 Everyday Analogy
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Generative AI&lt;/strong&gt; is like a skilled writer.&lt;br&gt;&lt;br&gt;
You ask a question, they write an answer, and the task ends.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agentic AI&lt;/strong&gt; is like a personal assistant.&lt;br&gt;&lt;br&gt;
You give a goal, they figure out how to achieve it, and they get it done.&lt;/p&gt;

&lt;h2&gt;
  
  
  👨‍💻 Why This Matters for Developers
&lt;/h2&gt;

&lt;p&gt;If you’re building:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Chatbots or writing tools → &lt;strong&gt;Generative AI&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Content generation platforms → &lt;strong&gt;Generative AI&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Automation systems → &lt;strong&gt;Agentic AI&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;AI assistants or workers → &lt;strong&gt;Agentic AI&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;End-to-end workflows → &lt;strong&gt;Agentic AI&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most real-world business value is shifting toward &lt;strong&gt;agentic systems&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  🚀 Final Takeaway
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Generative AI creates content
&lt;/li&gt;
&lt;li&gt;Agentic AI achieves goals
&lt;/li&gt;
&lt;li&gt;One responds
&lt;/li&gt;
&lt;li&gt;The other acts
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;If Generative AI is the brain,&lt;br&gt;
Agentic AI is the brain &lt;strong&gt;plus hands&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  🧠 One Last Question for You
&lt;/h2&gt;

&lt;p&gt;We’ve moved very quickly from:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;AI that generates answers&lt;br&gt;
to&lt;br&gt;
AI that takes action&lt;br&gt;
The real question now is:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where should AI stop acting on our behalf?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;Should AI agents be fully autonomous?&lt;/li&gt;
&lt;li&gt;Or should humans always stay in the loop?&lt;/li&gt;
&lt;li&gt;Where do &lt;em&gt;you&lt;/em&gt; draw the line between assistant and decision-maker?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👇&lt;br&gt;&lt;br&gt;
&lt;strong&gt;I’d love to hear your thoughts in the comments.&lt;/strong&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How to Fix Chromium on Vercel: A Complete Guide to Solving the “libnspr4.so” Error</title>
      <dc:creator>qudrat ullah</dc:creator>
      <pubDate>Fri, 23 Jan 2026 14:26:47 +0000</pubDate>
      <link>https://dev.to/qudratullahdev/how-to-fix-chromium-on-vercel-a-complete-guide-to-solving-the-libnspr4so-error-4i9j</link>
      <guid>https://dev.to/qudratullahdev/how-to-fix-chromium-on-vercel-a-complete-guide-to-solving-the-libnspr4so-error-4i9j</guid>
      <description>&lt;p&gt;If you're deploying a Next.js app with Playwright and Chromium to Vercel, you've probably hit this error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/tmp/chromium: error &lt;span class="k"&gt;while &lt;/span&gt;loading shared libraries: libnspr4.so: cannot open shared object file: No such file or directory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It works perfectly on your local machine but fails on Vercel. This is one of the most frustrating deployment issues developers face when working with headless browsers in serverless environments.&lt;/p&gt;

&lt;p&gt;After analyzing 50+ GitHub issues and testing multiple solutions, I discovered the root cause and the complete fix. Here's everything you need to know.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why It Works Locally But Fails on Vercel
&lt;/h2&gt;

&lt;p&gt;The fundamental difference between your local environment and Vercel's serverless environment causes this issue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;On your local machine:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;System libraries like &lt;code&gt;libnspr4.so&lt;/code&gt; and &lt;code&gt;libnss3.so&lt;/code&gt; are pre-installed&lt;/li&gt;
&lt;li&gt;You have full file system access&lt;/li&gt;
&lt;li&gt;Complete Node.js environment with all dependencies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;On Vercel serverless:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Minimal Linux container with no system libraries&lt;/li&gt;
&lt;li&gt;Read-only file system except for the &lt;code&gt;/tmp&lt;/code&gt; directory&lt;/li&gt;
&lt;li&gt;Environment variables must exist before modules load&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The critical issue:&lt;/strong&gt; The &lt;code&gt;@sparticuz/chromium&lt;/code&gt; package checks for &lt;code&gt;AWS_LAMBDA_JS_RUNTIME&lt;/code&gt; at import time, so if you set the variable in your code, the module has already loaded and the check has already run. This timing mismatch breaks everything.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Five Problems and Their Solutions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Problem 1: Wrong Package Choice
&lt;/h3&gt;

&lt;p&gt;Many developers try to use the full Playwright package, which doesn't work on Vercel.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don't use:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"playwright"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^1.39.0"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This package exceeds Vercel's 50MB limit and tries to download the browser at runtime, which fails in serverless environments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use instead:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dependencies"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"playwright-core"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1.39.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@sparticuz/chromium"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^131.0.0"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why this works:&lt;/strong&gt; &lt;code&gt;playwright-core&lt;/code&gt; is only about 5MB, and the sparticuz chromium package is about 40MB. Together they stay under Vercel's 50MB limit, and the chromium package provides a pre-bundled Chromium optimized for serverless environments.&lt;/p&gt;

&lt;h3&gt;
  
  
  Problem 2: Environment Variable Timing
&lt;/h3&gt;

&lt;p&gt;This is where most developers get stuck: it is not only which environment variables you set that matters, but when they are set.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The problem:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When you import the package, it immediately checks for &lt;code&gt;AWS_LAMBDA_JS_RUNTIME&lt;/code&gt;. But if you set it in your code like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;chromiumPack&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@sparticuz/chromium&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AWS_LAMBDA_JS_RUNTIME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;nodejs22.x&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The module has already loaded and checked for the variable. It's too late.&lt;/p&gt;
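&lt;p&gt;The failure mode is easy to reproduce without the real package. In this sketch, &lt;code&gt;loadChromiumModule&lt;/code&gt; is a hypothetical stand-in for the import: like the real module, it reads the variable once, at load time.&lt;/p&gt;

```typescript
// Minimal sketch of the timing problem. loadChromiumModule stands in for the
// real package import; it reads the variable once, at "load" time, and never
// re-reads it.
function loadChromiumModule(env: { AWS_LAMBDA_JS_RUNTIME?: string }) {
  // Captured at import time:
  const runtime = env.AWS_LAMBDA_JS_RUNTIME;
  return { detectedRuntime: runtime };
}

const env: { AWS_LAMBDA_JS_RUNTIME?: string } = {}; // variable not set yet
const mod = loadChromiumModule(env);                // module "imports" here
env.AWS_LAMBDA_JS_RUNTIME = "nodejs22.x";           // set afterwards, in code

console.log(mod.detectedRuntime); // undefined: the module never sees the value
```

&lt;p&gt;Whatever you assign after the import simply never reaches the module's load-time check.&lt;/p&gt;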

&lt;p&gt;&lt;strong&gt;The solution:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Set the environment variable in the Vercel Dashboard, not just in your code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Here's how:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to Vercel Dashboard&lt;/li&gt;
&lt;li&gt;Select your project&lt;/li&gt;
&lt;li&gt;Go to Settings → Environment Variables&lt;/li&gt;
&lt;li&gt;Add a new variable: &lt;code&gt;AWS_LAMBDA_JS_RUNTIME&lt;/code&gt; with value &lt;code&gt;nodejs22.x&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Apply to Production, Preview, and Development environments&lt;/li&gt;
&lt;li&gt;Redeploy your application&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Why the Dashboard matters:&lt;/strong&gt; Environment variables set in the Vercel Dashboard are available before any modules load, which is exactly when the chromium package needs them.&lt;/p&gt;

&lt;h3&gt;
  
  
  Problem 3: Libraries Extracted But Not Found (The Critical Fix)
&lt;/h3&gt;

&lt;p&gt;This was the breakthrough that solved the issue. Even when libraries are extracted correctly, Chromium can't find them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The problem:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Libraries are extracted to &lt;code&gt;/tmp/libnspr4.so&lt;/code&gt; and &lt;code&gt;/tmp/libnss3.so&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Chromium executable is in &lt;code&gt;/tmp/chromium&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;But Chromium can't find the libraries because the system doesn't know where to look&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The solution:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Set &lt;code&gt;LD_LIBRARY_PATH&lt;/code&gt; to the executable directory before launching the browser.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Here's the code:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;executablePath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;chromiumPack&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;executablePath&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;execDir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dirname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;executablePath&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// CRITICAL: Set LD_LIBRARY_PATH so Chromium can find libraries&lt;/span&gt;
&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;LD_LIBRARY_PATH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;execDir&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why this works:&lt;/strong&gt; &lt;code&gt;LD_LIBRARY_PATH&lt;/code&gt; tells the Linux system loader where to find shared libraries. By setting it to the same directory where the Chromium executable and libraries are located, Chromium can successfully load the required libraries.&lt;/p&gt;

&lt;p&gt;This was the missing piece that most solutions don't mention. Without this, even with the correct environment variable, Chromium will fail to find the libraries.&lt;/p&gt;
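&lt;p&gt;One defensive refinement (my own variant, not the post's exact code): prepend the executable directory instead of overwriting, so any search paths the platform already set are preserved. &lt;code&gt;withLibraryPath&lt;/code&gt; and the &lt;code&gt;/tmp/chromium-pack&lt;/code&gt; directory are illustrative names.&lt;/p&gt;

```typescript
// Prepend-instead-of-overwrite variant (an assumption, not the post's exact
// code). The loader searches LD_LIBRARY_PATH entries left to right, so the
// executable directory goes first and existing entries stay as fallbacks.
function withLibraryPath(existing: string | undefined, execDir: string): string {
  return existing ? execDir + ":" + existing : execDir;
}

// "/tmp/chromium-pack" is a hypothetical extraction directory for illustration.
process.env.LD_LIBRARY_PATH = withLibraryPath(
  process.env.LD_LIBRARY_PATH,
  "/tmp/chromium-pack"
);

console.log(withLibraryPath(undefined, "/tmp"));    // "/tmp"
console.log(withLibraryPath("/usr/lib64", "/tmp")); // "/tmp:/usr/lib64"
```

&lt;p&gt;Either form works for the fix in the post; prepending just avoids clobbering paths that were already there.&lt;/p&gt;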

&lt;h3&gt;
  
  
  Problem 4: Old Package Version
&lt;/h3&gt;

&lt;p&gt;Using an outdated version of the chromium package can cause compatibility issues with Node.js 22.x.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don't use:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"@sparticuz/chromium"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"119.0.2"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Use:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"@sparticuz/chromium"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^131.0.0"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Version 131.0.0 has better support for Node.js 22.x and includes fixes for common serverless deployment issues.&lt;/p&gt;

&lt;h3&gt;
  
  
  Problem 5: Browser Freezing
&lt;/h3&gt;

&lt;p&gt;Some developers report that the browser freezes after creating a new page, even when everything else is configured correctly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The problem:&lt;/strong&gt; Browser freezes after "Creating new page" message appears.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The solution:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Disable graphics mode before launching the browser:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;chromiumPack&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;setGraphicsMode&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;function&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;chromiumPack&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setGraphicsMode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why:&lt;/strong&gt; Serverless environments don't have GPU support. Disabling graphics mode prevents the browser from trying to use GPU features that aren't available, which causes freezing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Complete Working Solution
&lt;/h2&gt;

&lt;p&gt;Here's the complete implementation that works on Vercel:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. package.json Configuration
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dependencies"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@sparticuz/chromium"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^131.0.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"playwright-core"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1.39.0"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"engines"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"node"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"22.x"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Next.js Configuration
&lt;/h3&gt;

&lt;p&gt;In your &lt;code&gt;next.config.ts&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NextConfig&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;next&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;nextConfig&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;NextConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;serverExternalPackages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;playwright-core&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@sparticuz/chromium&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nx"&gt;nextConfig&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This prevents Next.js from bundling these packages, which is essential for them to work correctly in the serverless environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Browser Launch Code
&lt;/h3&gt;

&lt;p&gt;Create a file &lt;code&gt;lib/browser.ts&lt;/code&gt; with this code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;chromium&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Browser&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;playwright-core&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;chromiumPack&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@sparticuz/chromium&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;path&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;browserInstance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Browser&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getBrowser&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Browser&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;browserInstance&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;browserInstance&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isVercel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;!!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;VERCEL&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;VERCEL_ENV&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;isVercel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Set runtime (fallback if not in Dashboard)&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AWS_LAMBDA_JS_RUNTIME&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AWS_LAMBDA_JS_RUNTIME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;nodejs22.x&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Set graphics mode to prevent freezing&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;chromiumPack&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;setGraphicsMode&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;function&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;chromiumPack&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setGraphicsMode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Get executable and set library path&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;executablePath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;chromiumPack&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;executablePath&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;execDir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dirname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;executablePath&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// CRITICAL: Set LD_LIBRARY_PATH&lt;/span&gt;
    &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;LD_LIBRARY_PATH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;execDir&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="nx"&gt;browserInstance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;chromium&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;chromiumPack&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;executablePath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;headless&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Local development&lt;/span&gt;
    &lt;span class="nx"&gt;browserInstance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;chromium&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;--no-sandbox&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="na"&gt;headless&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;browserInstance&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. API Route Configuration
&lt;/h3&gt;

&lt;p&gt;In your API route file (e.g., &lt;code&gt;app/api/screenshot/route.ts&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NextRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;NextResponse&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;next/server&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;getBrowser&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/lib/browser&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;runtime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;nodejs&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;maxDuration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dynamic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;force-dynamic&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;POST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;NextRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getBrowser&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="c1"&gt;// Your screenshot code here&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;runtime = "nodejs"&lt;/code&gt; ensures the function runs in the Node.js runtime, and &lt;code&gt;maxDuration = 300&lt;/code&gt; gives you 5 minutes for screenshot operations (requires Vercel Pro plan).&lt;/p&gt;
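&lt;p&gt;The handler body is elided above. One way to keep it manageable is to pull input validation into a pure helper, so the only part that needs a live browser is the screenshot call itself. &lt;code&gt;normalizeRequest&lt;/code&gt; and its limits are hypothetical, not part of the original code.&lt;/p&gt;

```typescript
// Hypothetical request-validation helper (not from the post): normalize the
// body before handing it to Playwright, so bad input fails fast and an absurd
// viewport cannot exhaust the function's memory.
type ScreenshotRequest = { url?: string; width?: number; height?: number };

function normalizeRequest(body: ScreenshotRequest) {
  if (!body.url || !/^https?:\/\//.test(body.url)) {
    throw new Error("A valid http(s) url is required");
  }
  // Clamp the viewport into a sane range, defaulting to 1280x720.
  const clamp = (n: number | undefined, fallback: number) =>
    Math.min(4096, Math.max(100, n ?? fallback));
  return {
    url: body.url,
    width: clamp(body.width, 1280),
    height: clamp(body.height, 720),
  };
}

console.log(normalizeRequest({ url: "https://example.com", width: 99999 }));
// { url: 'https://example.com', width: 4096, height: 720 }
```

&lt;p&gt;Inside &lt;code&gt;POST&lt;/code&gt;, you would pass the normalized values to &lt;code&gt;browser.newPage()&lt;/code&gt; and close the page (not the cached browser) in a &lt;code&gt;finally&lt;/code&gt; block.&lt;/p&gt;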

&lt;h3&gt;
  
  
  5. Vercel Dashboard Settings
&lt;/h3&gt;

&lt;p&gt;Before deploying, configure these settings in your Vercel Dashboard:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Required settings:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Environment Variable:&lt;/strong&gt; Add &lt;code&gt;AWS_LAMBDA_JS_RUNTIME&lt;/code&gt; with value &lt;code&gt;nodejs22.x&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Disable Fluid Compute:&lt;/strong&gt; Go to Settings → Functions → Fluid Compute → Turn OFF&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pro Plan:&lt;/strong&gt; Required for 300-second timeouts (Hobby plan only allows 10 seconds)&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Three Critical Fixes
&lt;/h2&gt;

&lt;p&gt;You need all three of these fixes for Chromium to work on Vercel:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Correct packages:&lt;/strong&gt; Use &lt;code&gt;playwright-core&lt;/code&gt; and sparticuz chromium version 131.0.0 or higher&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment variable:&lt;/strong&gt; Set &lt;code&gt;AWS_LAMBDA_JS_RUNTIME=nodejs22.x&lt;/code&gt; in Vercel Dashboard&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Library path:&lt;/strong&gt; Set &lt;code&gt;LD_LIBRARY_PATH&lt;/code&gt; to the executable directory in your code&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Missing any one of these will cause the browser launch to fail at runtime, even if the deployment itself succeeds. They work together:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The environment variable tells the package which runtime to use&lt;/li&gt;
&lt;li&gt;The library path tells Chromium where to find the extracted libraries&lt;/li&gt;
&lt;li&gt;The correct packages ensure everything fits within Vercel's limits&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Quick Troubleshooting Checklist
&lt;/h2&gt;

&lt;p&gt;If you're still experiencing issues, check each of these:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] &lt;code&gt;AWS_LAMBDA_JS_RUNTIME=nodejs22.x&lt;/code&gt; is set in Vercel Dashboard (not just in code)&lt;/li&gt;
&lt;li&gt;[ ] &lt;code&gt;LD_LIBRARY_PATH&lt;/code&gt; is set in code before browser launch&lt;/li&gt;
&lt;li&gt;[ ] Using &lt;code&gt;playwright-core&lt;/code&gt; (not the full playwright package)&lt;/li&gt;
&lt;li&gt;[ ] Using sparticuz chromium version 131.0.0 or higher&lt;/li&gt;
&lt;li&gt;[ ] &lt;code&gt;setGraphicsMode(false)&lt;/code&gt; is called before launching the browser&lt;/li&gt;
&lt;li&gt;[ ] Fluid Compute is disabled in Vercel settings&lt;/li&gt;
&lt;li&gt;[ ] You're on Vercel Pro plan for extended timeouts&lt;/li&gt;
&lt;li&gt;[ ] &lt;code&gt;serverExternalPackages&lt;/code&gt; is configured in &lt;code&gt;next.config.ts&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
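&lt;p&gt;The code-level items on this checklist can also be verified at runtime. The self-check below is hypothetical (not from the post), and the Dashboard-level items (plan, Fluid Compute) cannot be read from inside the function.&lt;/p&gt;

```typescript
// Hypothetical preflight check: report which code-level checklist items hold
// at runtime, so a failing launch points at the item to revisit.
function runPreflight(env: { [key: string]: string | undefined }): string[] {
  const checks = {
    runtimeVarSet: env.AWS_LAMBDA_JS_RUNTIME === "nodejs22.x",
    libraryPathSet: Boolean(env.LD_LIBRARY_PATH),
    onVercel: Boolean(env.VERCEL || env.VERCEL_ENV),
  };
  return Object.entries(checks).map(
    ([name, ok]) => (ok ? "PASS " : "FAIL ") + name
  );
}

runPreflight(process.env).forEach((line) => console.log(line));
```

&lt;p&gt;Logging these three lines at the top of &lt;code&gt;getBrowser&lt;/code&gt; makes the Vercel function logs tell you immediately which precondition was missed.&lt;/p&gt;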

&lt;h2&gt;
  
  
  Why This Solution Works
&lt;/h2&gt;

&lt;p&gt;This solution addresses all the fundamental differences between local and serverless environments:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Package choice:&lt;/strong&gt; &lt;code&gt;playwright-core&lt;/code&gt; and the chromium package are designed for serverless and stay within size limits&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment variable:&lt;/strong&gt; Setting it in the Dashboard ensures it's available when modules load&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Library path:&lt;/strong&gt; &lt;code&gt;LD_LIBRARY_PATH&lt;/code&gt; tells the system where to find shared libraries in the minimal serverless environment&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;The critical insight:&lt;/strong&gt; Both the environment variable (for runtime detection) and the library path (for finding libraries) are required. Most solutions only mention one or the other, but you need both.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Deploying Chromium on Vercel requires three essential components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The right packages (&lt;code&gt;playwright-core&lt;/code&gt; and the chromium package)&lt;/li&gt;
&lt;li&gt;Environment variable configured in Vercel Dashboard&lt;/li&gt;
&lt;li&gt;Library path set in your code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once you have all three configured correctly, your screenshot application will work reliably on Vercel's serverless infrastructure. The key is understanding that serverless environments are fundamentally different from local development, and each difference requires a specific solution.&lt;/p&gt;

&lt;p&gt;This solution was discovered after analyzing 50+ GitHub issues and testing multiple approaches. The breakthrough was realizing that &lt;code&gt;LD_LIBRARY_PATH&lt;/code&gt; must be set to the executable directory, which most documentation doesn't mention.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/Sparticuz/chromium" rel="noopener noreferrer"&gt;sparticuz/chromium GitHub repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://community.vercel.com/t/issue-with-puppeteer-and-chrome-aws-lambda-on-node-js-18-in-vercel/4229" rel="noopener noreferrer"&gt;Vercel Community Discussion&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://playwright.dev" rel="noopener noreferrer"&gt;Playwright Documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;If you found this helpful, feel free to share your experience or ask questions in the comments below!&lt;/p&gt;

</description>
      <category>nextjs</category>
      <category>serverless</category>
      <category>tutorial</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
