Custodia-Admin
Posted on • Originally published at pagebolt.dev

Browser automation in Node.js: record narrated videos without managing a browser

Most browser automation tutorials start with npm install puppeteer and end with you debugging why your CI container ran out of memory at 2am.

There's a different model. Send the steps you want the browser to execute — navigate, click, fill, scroll — to an API. Get the result back: a video, a screenshot, a PDF. No browser process to manage. No Chromium binary. No memory leaks.

Here's what that looks like in Node.js.

Record a narrated browser automation video

The highest-value thing you can do with browser automation isn't screenshots — it's video. A narrated recording that shows your app working, in a real browser, with a voice explaining each step.

import fs from 'fs';

const response = await fetch('https://api.pagebolt.dev/v1/video', {
  method: 'POST',
  headers: {
    'x-api-key': process.env.PAGEBOLT_API_KEY,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    steps: [
      { action: 'navigate', url: 'https://yourapp.com', note: 'Opening the app' },
      { action: 'click', selector: '#sign-in', note: 'Signing in' },
      { action: 'fill', selector: '#email', value: 'demo@example.com' },
      { action: 'fill', selector: '#password', value: 'password123' },
      { action: 'click', selector: '#submit', note: 'Submitting' },
      { action: 'wait', ms: 1500 },
      { action: 'screenshot', name: 'dashboard' }
    ],
    audioGuide: {
      enabled: true,
      voice: 'nova',
      provider: 'openai',
      script: 'Welcome to the app. {{1}} Click Sign In to open the form. {{3}} Enter your credentials and submit. {{5}} The dashboard loads in under two seconds.'
    },
    pace: 'slow',
    frame: { enabled: true, style: 'macos' }
  })
});

fs.writeFileSync('demo.mp4', Buffer.from(await response.arrayBuffer()));

The {{N}} markers in the script sync the narration to specific steps. Step 3 is the email fill, so the line that follows {{3}} is spoken as the browser types. The result is an MP4 with a macOS-style browser frame, AI narration, and the slower pacing set by pace: 'slow'.
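
A marker that points past the end of the steps array is easy to introduce when steps are added or removed. The helper below is a small sketch for checking the script before sending it. It assumes markers are 1-based, which is what "Step 3 is the email fill" implies. `findInvalidMarkers` is a hypothetical name for illustration, not part of the PageBolt API.

```javascript
// Hypothetical helper: returns the marker numbers in `script` that do not
// correspond to a step. Assumes {{N}} is 1-based (step 1 is steps[0]).
function findInvalidMarkers(script, steps) {
  const invalid = [];
  for (const match of script.matchAll(/\{\{(\d+)\}\}/g)) {
    const n = Number(match[1]);
    if (n < 1 || n > steps.length) invalid.push(n);
  }
  return invalid;
}

const demoSteps = [
  { action: 'navigate', url: 'https://yourapp.com' },
  { action: 'click', selector: '#sign-in' },
  { action: 'fill', selector: '#email', value: 'demo@example.com' }
];

console.log(findInvalidMarkers('Welcome. {{1}} Sign in. {{3}} Type the email.', demoSteps)); // []
console.log(findInvalidMarkers('Welcome. {{4}} Out of range.', demoSteps)); // [ 4 ]
```

Running a check like this before the request is cheaper than rendering a video and discovering that a narration line never played.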

Voices: nova, alloy, echo, fable, onyx, shimmer (OpenAI) or emma, ava, brian, aria (Azure).

Multi-step automation sequences

For flows that span multiple pages — onboarding, checkout, form submission — use the sequence endpoint. It captures screenshots at each step and returns them as a JSON response:

const response = await fetch('https://api.pagebolt.dev/v1/sequence', {
  method: 'POST',
  headers: {
    'x-api-key': process.env.PAGEBOLT_API_KEY,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    steps: [
      { action: 'navigate', url: 'https://yourapp.com/checkout' },
      { action: 'fill', selector: '#card-number', value: '4242 4242 4242 4242' },
      { action: 'fill', selector: '#expiry', value: '12/27' },
      { action: 'click', selector: '#pay-btn' },
      { action: 'wait', ms: 2000 },
      { action: 'screenshot', name: 'confirmation' }
    ]
  })
});

const { screenshots } = await response.json();
// screenshots: [{ name: 'confirmation', data: 'base64...' }]

Use this for end-to-end smoke tests, automated QA screenshots, or generating before/after comparisons.
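
Since the screenshots come back inline, turning them into files is a decode step. Here is a minimal sketch, assuming each `data` field is a base64-encoded image as the comment in the snippet suggests. The `.png` extension and the `decodeScreenshots` helper name are assumptions for illustration.

```javascript
// Hypothetical helper: converts the `screenshots` array from the sequence
// response into { filename, bytes } pairs ready to write to disk.
// Assumes `data` is base64 and that the images are PNGs.
function decodeScreenshots(screenshots) {
  return screenshots.map(({ name, data }) => ({
    filename: `${name}.png`,
    bytes: Buffer.from(data, 'base64')
  }));
}

// Usage with the `screenshots` value from the sequence response:
//   for (const { filename, bytes } of decodeScreenshots(screenshots)) {
//     fs.writeFileSync(filename, bytes);
//   }
```

Keeping the decode separate from the file write makes it easy to send the bytes somewhere else instead, such as an image-diff step in a smoke test.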

Take a screenshot

The simplest case — a single page capture:

import fs from 'fs';

const response = await fetch('https://api.pagebolt.dev/v1/screenshot', {
  method: 'POST',
  headers: {
    'x-api-key': process.env.PAGEBOLT_API_KEY,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    url: 'https://yourapp.com',
    fullPage: true,
    fullPageScroll: true,  // forces lazy-loaded images to render
    blockBanners: true     // removes cookie consent popups
  })
});

fs.writeFileSync('page.png', Buffer.from(await response.arrayBuffer()));

fullPageScroll is the option people tend to miss: without it, lazy-loaded content below the viewport appears blank in the capture.
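
The video and screenshot snippets write the response body to disk without checking whether the request succeeded, so a failed call would save an error payload as demo.mp4 or page.png. One way to guard against that is a small wrapper, sketched here under the assumption that the API signals failure with a non-2xx HTTP status (the error format is not described in this article). `requestBinary` and its `fetchImpl` parameter are hypothetical names; `fetchImpl` exists only so the function can be exercised without a network call.

```javascript
// Hypothetical wrapper around the POST calls shown in this article.
// Assumes a non-2xx status means failure. The error body format is unknown,
// so it is included in the message as plain text.
async function requestBinary(endpoint, body, apiKey, fetchImpl = fetch) {
  const response = await fetchImpl(`https://api.pagebolt.dev/v1/${endpoint}`, {
    method: 'POST',
    headers: { 'x-api-key': apiKey, 'Content-Type': 'application/json' },
    body: JSON.stringify(body)
  });
  if (!response.ok) {
    const detail = await response.text();
    throw new Error(`PageBolt ${endpoint} request failed (${response.status}): ${detail}`);
  }
  // Binary endpoints (video, screenshot) return the file bytes directly.
  return Buffer.from(await response.arrayBuffer());
}

// Usage:
//   const png = await requestBinary(
//     'screenshot',
//     { url: 'https://yourapp.com', fullPage: true, fullPageScroll: true },
//     process.env.PAGEBOLT_API_KEY
//   );
//   fs.writeFileSync('page.png', png);
```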

Run through bot detection

Some sites serve blank pages to headless browsers or redirect them to a CAPTCHA. Pass stealth: true to mask browser fingerprints:

body: JSON.stringify({
  url: 'https://protected-site.com',
  stealth: true,
  fullPage: true
})

The stealth option works across screenshots, sequences, and video. If your target site uses IP-based blocking, add a proxy parameter with your own proxy URL.
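
Because the same options apply to every endpoint, they can be added in one place. The sketch below uses the `stealth` and `proxy` parameter names mentioned above; the `withStealth` helper is hypothetical, and the proxy URL format in the usage comment is an assumption rather than documented behavior.

```javascript
// Hypothetical helper: adds bot-detection options to any request body
// without mutating the original object.
function withStealth(body, proxyUrl) {
  const options = { ...body, stealth: true };
  // Only set `proxy` when a URL is supplied.
  if (proxyUrl) options.proxy = proxyUrl;
  return options;
}

// Usage (proxy URL format is an assumption):
//   body: JSON.stringify(
//     withStealth({ url: 'https://protected-site.com', fullPage: true },
//                 'http://user:pass@proxy.example.com:8080')
//   )
```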

What you don't need

No npm install puppeteer. No npm install playwright. No:

  • Chromium binary downloaded on install
  • PUPPETEER_SKIP_CHROMIUM_DOWNLOAD environment variable hacks
  • --no-sandbox flags for Docker
  • Browser pool management
  • Memory leak monitoring
  • pm2 restart loops

The browser runs on PageBolt's infrastructure. You send steps, you get output.


Free tier includes 100 requests/month — enough to build and test a full automation workflow before you pay anything. → pagebolt.dev
