DEV Community

Lotfi Jebali
I built a Universal Data Generator to kill the 1,000 row limit. Here is the stack.

I’ve been a developer for years, and I have a recurring nightmare: Seeding the Database.

It doesn't matter if I'm using PostgreSQL, MongoDB, or just need a CSV for a client. The options are always the same:

  1. Write manual scripts (boring, slow).
  2. Use existing tools (great, but they cap you at 1,000 rows unless you pay a monthly subscription).

So, I spent the last few weekends building MockBlast.

It is a Universal Mock Data Generator designed to handle massive datasets without timeouts.

Here is how I engineered it to stream 1M+ rows using Next.js and Nginx.

The Problem: Syntax & Scale

Most generators are dumb.

  • SQL: They don't escape single quotes (O'Reilly breaks your insert).
  • MongoDB: They give you strings instead of ObjectId("...") or ISODate("...").
  • Scale: Browser-based generation exhausts the tab's RAM and crashes after ~50k rows.
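The first two are formatting problems, not data problems. As an illustration (these helpers are my own sketch, not MockBlast's actual source), syntax-aware output boils down to escaping per target dialect:

```typescript
// Hypothetical helpers illustrating syntax-aware output (not MockBlast's code).

// SQL: double single quotes so O'Reilly doesn't terminate the string literal.
function escapeSqlString(value: string): string {
  return value.replace(/'/g, "''");
}

// MongoDB: emit native BSON constructors instead of plain strings.
function mongoRow(name: string, hexId: string, created: Date): string {
  return `{ name: "${name}", _id: ObjectId("${hexId}"), createdAt: ISODate("${created.toISOString()}") }`;
}

// The single quote is doubled, so the INSERT parses cleanly:
console.log(`INSERT INTO users (name) VALUES ('${escapeSqlString("O'Reilly")}');`);
```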

I wanted a single tool where I could paste a schema and stream gigabytes of data in any format.

The Tech Stack (VPS > Serverless)

I started with Serverless, but I hit a wall immediately: Timeouts.
Generating 100,000 rows takes more than 10 seconds. Vercel/AWS Lambda kill the process.

I migrated to a Self-Hosted VPS stack:

  • App: Next.js (App Router).
  • Database: PostgreSQL (Dockerized).
  • Proxy: Nginx (Reverse Proxy & SSL).

This allows me to keep connections open for minutes and stream data directly to the user.

The Secret Sauce: Nginx & Web Streams

To generate 1 Million rows without crashing the server RAM, I don't build the file in memory. I pipe it row-by-row.

1. The Next.js Route (Streaming)

I use the Web Streams API. I also use setTimeout(0) periodically to prevent the Node.js event loop from blocking during heavy CPU tasks.

// app/api/generate/route.ts
import { NextResponse } from "next/server";

export async function POST(req: Request) {
  const { fields, rowCount, format } = await req.json();

  const stream = new ReadableStream({
    async start(controller) {
      const encoder = new TextEncoder();

      // 1. Push Header
      if (format === 'json') controller.enqueue(encoder.encode('[\n'));

      // 2. Loop and Flush
      for (let i = 0; i < rowCount; i++) {
        const row = generateRow(fields); // Heavy logic

        let chunk = "";
        // Escape single quotes so values like O'Reilly don't break the INSERT
        if (format === 'sql') chunk = `('${row.name.replace(/'/g, "''")}', '${row.email}'),\n`;
        if (format === 'mongodb') chunk = `{ name: "${row.name}", _id: ObjectId("...") },\n`;
        // JSON: comma-separate rows between the [ and ] pushed above/below
        if (format === 'json') chunk = `${i > 0 ? ',\n' : ''}  ${JSON.stringify(row)}`;

        controller.enqueue(encoder.encode(chunk));

        // CRITICAL: Let the event loop breathe every 500 rows
        if (i % 500 === 0) await new Promise(r => setTimeout(r, 0));
      }

      // 3. Close
      if (format === 'json') controller.enqueue(encoder.encode('\n]'));
      controller.close();
    }
  });

  return new NextResponse(stream, {
    headers: { "Content-Disposition": `attachment; filename="data.${format}"` }
  });
}
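On the client side, the browser can consume this response incrementally via `res.body.getReader()` instead of waiting for the whole file. A minimal sketch of that drain loop (the stand-in stream here mimics how the route enqueues rows; it is not MockBlast's client code):

```typescript
// Drain a ReadableStream chunk-by-chunk, the same way a browser consumes
// the response body from the route above (res.body.getReader()).
async function collectStream(stream: ReadableStream<Uint8Array>): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let text = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // { stream: true } handles multi-byte characters split across chunks
    text += decoder.decode(value, { stream: true });
  }
  return text + decoder.decode();
}

// A tiny stand-in stream, enqueued the same way the route enqueues rows.
const encoder = new TextEncoder();
const demo = new ReadableStream<Uint8Array>({
  start(controller) {
    controller.enqueue(encoder.encode("('Ada', 'ada@example.com'),\n"));
    controller.enqueue(encoder.encode("('Linus', 'linus@example.com'),\n"));
    controller.close();
  },
});
```

Because Nginx's buffering is disabled (next section), each chunk reaches this loop as soon as the server flushes it.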

2. The Nginx Config (The Real Trick)

By default, Nginx buffers responses. It waits for the whole API call to finish before sending data to the client. This kills the streaming effect and causes timeouts.

I had to explicitly disable buffering in my nginx.conf:

location / {
    proxy_pass http://localhost:3000;

    # CRITICAL for Streaming:
    proxy_buffering off;

    # Increase timeouts for large files (1M+ rows)
    proxy_read_timeout 300s;
    proxy_connect_timeout 300s;
}

Launch Day Stats 🚀

I launched yesterday with a "Users who sign up before January 31st get a lifetime Pro license for free" offer.

The results after 24 hours:

  • Data generations: 63
  • Users: 106

MockBlast's first day stats

What it does

MockBlast is live and Free for up to 10,000 rows (10x the industry standard).

It supports:

  1. PostgreSQL/MySQL: With Foreign Key integrity and JSONB support.
  2. MongoDB: Native BSON types (ObjectId, ISODate).
  3. CSV/JSON: For general use.

If you hate writing seed data, give it a try.

👉 Try MockBlast (Free)
