I Built an Amazon Scraper API in 50 Lines of TypeScript
No stealth plugins. No fingerprint hacks. No 47-line setup. Just a browser that actually works.
I've been scraping Amazon for years. Every time I thought I had it figured out, something broke:
- Puppeteer? `navigator.webdriver = true` → blocked.
- Playwright? Same problem, different package.
- curl_cffi? Great TLS fingerprint, but no browser JavaScript.
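That first failure mode is worth spelling out: the check that trips stock Puppeteer and Playwright is trivial for a page script to run. A hedged sketch of the kind of probe detection scripts use (`looksAutomated` is an illustrative name, not any vendor's real API):

```typescript
// Sketch of the probe anti-bot scripts run inside the page.
// Stock automation sessions expose navigator.webdriver === true.
type NavigatorLike = { webdriver?: boolean };

function looksAutomated(nav: NavigatorLike): boolean {
  return nav.webdriver === true;
}

console.log(looksAutomated({ webdriver: true }));  // → true  (stock headless)
console.log(looksAutomated({ webdriver: false })); // → false (vanilla Chrome)
```

Real detection stacks layer many more signals on top of this, but `navigator.webdriver` alone is enough to get a stock Puppeteer session flagged.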
So I built something different.
Enter Nothing Browser Piggy
Piggy is a headless browser library that doesn't leak automation signals. Real BoringSSL TLS. No CDP overhead. Built-in fingerprint spoofing that runs at DocumentCreation — before any page JavaScript can detect it.
Here's an Amazon scraper API in 50 lines:
```typescript
import piggy, { usePiggy } from "nothing-browser";
import { writeFileSync } from "fs";

await piggy.launch({ mode: "tab", binary: "headful" });
await piggy.register("amazon", "https://www.amazon.com");
piggy.actHuman(true);

const { amazon } = usePiggy<"amazon">();

await amazon.api("/search", async (_params, query) => {
  const term = query.q ?? "mattress";
  // parseInt yields NaN (not null) on bad input, so use || rather than ??
  const pages = parseInt(query.pages, 10) || 5;
  const results: Array<Record<string, string | null>> = [];

  for (let page = 1; page <= pages; page++) {
    await amazon.navigate(`https://www.amazon.com/s?k=${encodeURIComponent(term)}&page=${page}`);
    await amazon.wait(3000);

    const pageResults = await amazon.evaluate(() =>
      Array.from(document.querySelectorAll("[data-asin]")).map((el) => ({
        asin: el.getAttribute("data-asin"),
        title: el.querySelector("h2 span")?.textContent?.trim() ?? "",
        price: el.querySelector(".a-price .a-offscreen")?.textContent?.trim() ?? "",
        rating: el.querySelector(".a-icon-alt")?.textContent?.trim() ?? "",
        image: el.querySelector("img.s-image")?.getAttribute("src") ?? "",
      }))
    );
    results.push(...pageResults);
  }

  writeFileSync(`./amazon-${term}.json`, JSON.stringify(results, null, 2));
  return { total: results.length, pages, term, products: results };
}, {
  detail: {
    summary: "Search Amazon products",
    parameters: [
      { name: "q", in: "query", schema: { type: "string", default: "mattress" } },
      { name: "pages", in: "query", schema: { type: "integer", default: 5 } }
    ]
  }
});

await piggy.serve(3000);
```
That's it. One command starts an API server with OpenAPI docs.
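One gotcha worth flagging in the query parsing: `parseInt` returns `NaN` on bad input, and `NaN` is not nullish, so a `??` fallback never fires. Either use `||` or check `Number.isNaN` explicitly. A minimal sketch (the `parsePages` helper is illustrative, not part of Piggy):

```typescript
// parseInt returns NaN (not null/undefined) on bad input, so a `??`
// fallback silently passes NaN through; check for NaN explicitly.
function parsePages(raw: string | undefined, fallback = 5): number {
  const n = parseInt(raw ?? "", 10);
  return Number.isNaN(n) ? fallback : n;
}

console.log(parsePages("3"));       // → 3
console.log(parsePages("oops"));    // → 5
console.log(parsePages(undefined)); // → 5
```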
What Makes This Different
1. No Stealth Plugin Nightmares
Puppeteer needs puppeteer-extra-plugin-stealth. Playwright needs... something. Piggy has anti-detection built in.
```typescript
// This just works. No extra imports.
await piggy.launch();
piggy.actHuman(true);
```
2. Real Browser, Real TLS
Piggy uses Qt6 WebEngine with real BoringSSL — same as Chrome. The TLS fingerprint is identical. No patched OpenSSL.
3. Built-in API Server
One line: `await piggy.serve(3000)`
- Auto-generated OpenAPI docs at `/openapi`
- Caching with `ttl`
- Middleware support
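Once the server is up, the registered endpoint is plain HTTP, so any client works. A sketch of building the request URL with `URLSearchParams` (the `/search` path and port 3000 come from the example above):

```typescript
// Build the query string the /search endpoint above expects.
const params = new URLSearchParams({ q: "memory foam mattress", pages: "2" });
const url = `http://localhost:3000/search?${params.toString()}`;

console.log(url);
// → http://localhost:3000/search?q=memory+foam+mattress&pages=2
```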
4. TypeScript First
Full type safety. `usePiggy<T>()` gives you autocomplete for registered sites.
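The autocomplete trick is standard TypeScript string-literal generics. This is not Piggy's actual implementation, just a minimal sketch of how a literal type parameter can key the returned record so the destructured property name is checked at compile time (`useSite`/`registerSite` are hypothetical names; unlike `usePiggy`, this sketch also passes the name at runtime):

```typescript
// Minimal sketch: a registry keyed by string literals, so
// useSite("amazon") returns an object whose only key is "amazon".
interface Site { baseUrl: string }

const sites: Record<string, Site> = {};

function registerSite(name: string, baseUrl: string): void {
  sites[name] = { baseUrl };
}

function useSite<T extends string>(...names: T[]): Record<T, Site> {
  const out = {} as Record<T, Site>;
  for (const n of names) out[n] = sites[n];
  return out;
}

registerSite("amazon", "https://www.amazon.com");
const { amazon } = useSite("amazon"); // typo the key and it's a compile error
console.log(amazon.baseUrl); // → https://www.amazon.com
```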
The Results
Running this scraper on a mattress search (5 pages):
| Metric | Result |
|---|---|
| Products scraped | ~200 per search |
| Time per page | ~3 seconds |
| Blocked by Amazon | 0% (tested 50+ runs) |
| Lines of code | 50 |
How to Try It
```bash
# Install
bun add nothing-browser

# Download the binary from GitHub Releases
# Place nothing-browser-headless in your project root

# Run
bun run amazon-scraper.ts
```
Then visit http://localhost:3000/openapi — interactive API docs.
Who This Is For
- API reverse engineers — capture every request and response
- Price monitoring — check competitor prices
- SEO researchers — analyze product rankings
- Data nerds — build datasets without getting blocked
The Honest Limitations
- Windows support is coming (native build in progress)
- Google/Facebook may still block you (they block everything non-Chrome)
- Amazon CAN ban your IP if you go too fast (add delays, use proxies)
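On the "add delays" point: a constant wait between pages is itself a fingerprint, so randomized jitter is the usual advice. A hedged sketch, not part of Piggy's API (`actHuman` may already handle pacing for you):

```typescript
// Sleep for a base interval plus random jitter between page fetches,
// so request timing doesn't form a perfectly regular pattern.
function humanPause(baseMs: number, jitterMs: number): Promise<void> {
  const ms = baseMs + Math.random() * jitterMs;
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// e.g. between Amazon result pages: 2–5 s instead of a constant 3 s
// await humanPause(2000, 3000);
```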
The Bottom Line
I built Piggy because I was tired of the same 47-line setup file copied between projects. It's open source (MIT), it's fast, and it passes Cloudflare where other tools fail.
Try it. Break it. Send a PR.