Every browser automation project starts the same way. You open a browser, navigate to a URL, and immediately realize you need retry logic, stealth mode, and session persistence before writing any actual automation.
Here are the TypeScript patterns I reach for every time.
Core Browser Factory
export async function withPage<T>(
  config: BrowserConfig,
  fn: (page: Page) => Promise<T>
): Promise<T> {
  const browser = await createBrowser(config);
  try {
    // Context and page creation sit inside the try so a failure here still closes the browser.
    const context = await createContext(browser, config);
    const page = await context.newPage();
    return await fn(page);
  } finally {
    // Closing the browser also closes its contexts and pages.
    await browser.close();
  }
}
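Retry logic pairs naturally with this wrapper. Here is a minimal sketch of a backoff helper; the names withRetry and RetryOptions are mine for illustration and are not taken from the kit, and the doubling delay is just one reasonable policy.

```typescript
// Hypothetical helper: retries an async operation with exponential backoff.
interface RetryOptions {
  attempts: number;     // total tries, including the first one
  baseDelayMs: number;  // wait before the second try; doubles after each failure
}

async function withRetry<T>(
  fn: (attempt: number) => Promise<T>,
  { attempts, baseDelayMs }: RetryOptions
): Promise<T> {
  if (attempts < 1) {
    throw new RangeError('attempts must be at least 1');
  }
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      // Only sleep when another attempt is still coming.
      if (attempt < attempts) {
        const delay = baseDelayMs * 2 ** (attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  // Every attempt failed: surface the most recent error.
  throw lastError;
}
```

Wrapping the whole withPage call, rather than only the callback, means each retry gets a fresh browser, which is usually what you want after a crash or a navigation timeout.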
Stealth patch that most tutorials skip:
await context.addInitScript(() => {
  Object.defineProperty(navigator, 'webdriver', { get: () => undefined });
  Object.defineProperty(navigator, 'plugins', { get: () => [1, 2, 3, 4, 5] });
});
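Without this, headless Chrome is trivially detectable: navigator.webdriver reports true under automation. The patch hides only these two signals, so sites with more thorough fingerprinting may still flag the browser.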
Screenshot with Element Selection
// Full page
const buffer = await captureScreenshot({ url: 'https://example.com', fullPage: true });
// Specific element
const nav = await captureScreenshot({ url: 'https://example.com', selector: 'nav' });
The same function supports PNG, JPEG, and WebP output as well as custom viewports, so one call covers every case.
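For a sense of what happens under the hood, here is a rough sketch of the full-page versus element dispatch. It assumes the page is already open, and it uses small stand-in interfaces (PageLike, LocatorLike) modeled on the Playwright methods involved so it runs without the library; captureFrom is a hypothetical name, not the kit's function. Playwright's own screenshot calls accept 'png' or 'jpeg', so WebP output would presumably need a separate conversion step that this sketch leaves out.

```typescript
// Stand-ins for the parts of Playwright's Page and Locator used below (assumption).
type ImageType = 'png' | 'jpeg';

interface LocatorLike {
  screenshot(options?: { type?: ImageType }): Promise<Uint8Array>;
}

interface PageLike {
  screenshot(options?: { type?: ImageType; fullPage?: boolean }): Promise<Uint8Array>;
  locator(selector: string): LocatorLike;
}

interface CaptureOptions {
  selector?: string;   // when set, capture only the matching element
  fullPage?: boolean;  // ignored when a selector is given
  type?: ImageType;    // defaults to 'png'
}

// Hypothetical helper: one entry point for element and page screenshots.
async function captureFrom(
  page: PageLike,
  options: CaptureOptions = {}
): Promise<Uint8Array> {
  const type = options.type ?? 'png';
  if (options.selector) {
    // Element capture: the locator crops the image to the element's box.
    return page.locator(options.selector).screenshot({ type });
  }
  // Page capture: fullPage scrolls past the viewport to include everything.
  return page.screenshot({ type, fullPage: options.fullPage ?? false });
}
```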
Structured Data Extraction
Define a schema, get typed data back:
const jobs = await extractStructured({
  url: 'https://jobboard.example.com',
  schema: {
    title: { selector: 'h2.job-title', required: true },
    company: { selector: '.company-name' },
    tags: { selector: '.skill-badge', multiple: true },
    link: { selector: 'a', attribute: 'href' },
  },
  listSelector: '.job-card',
});
// [{ title: 'Senior Engineer', company: 'Acme', tags: ['Python', 'AWS'], link: '...' }, ...]
No more writing querySelectorAll loops by hand.
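To show the idea behind a schema like that, here is a simplified sketch of the per-card mapping. Everything in it is an assumption on my part: ElementLike is a cut-down stand-in for the DOM Element interface, extractRows is a hypothetical name, and dropping cards that miss a required field is just one possible policy (throwing would be another).

```typescript
// Cut-down stand-in for the DOM Element members the sketch needs (assumption).
interface ElementLike {
  textContent: string | null;
  getAttribute(name: string): string | null;
  querySelector(selector: string): ElementLike | null;
  querySelectorAll(selector: string): ArrayLike<ElementLike>;
}

interface FieldSpec {
  selector: string;
  attribute?: string;  // read this attribute instead of the text content
  multiple?: boolean;  // collect every match into an array
  required?: boolean;  // drop the whole card when the field is missing
}

type Schema = Record<string, FieldSpec>;
type Row = Record<string, string | string[] | null>;

// Read one value from an element, treating blank strings as missing.
function readValue(el: ElementLike, spec: FieldSpec): string | null {
  const raw = spec.attribute ? el.getAttribute(spec.attribute) : el.textContent;
  const value = raw?.trim();
  return value ? value : null;
}

// Hypothetical helper: apply the schema to each card element.
function extractRows(cards: ArrayLike<ElementLike>, schema: Schema): Row[] {
  const rows: Row[] = [];
  for (const card of Array.from(cards)) {
    const row: Row = {};
    let valid = true;
    for (const [name, spec] of Object.entries(schema)) {
      if (spec.multiple) {
        const values = Array.from(card.querySelectorAll(spec.selector))
          .map((el) => readValue(el, spec))
          .filter((v): v is string => v !== null);
        row[name] = values;
        if (spec.required && values.length === 0) valid = false;
      } else {
        const el = card.querySelector(spec.selector);
        const value = el ? readValue(el, spec) : null;
        row[name] = value;
        if (spec.required && value === null) valid = false;
      }
    }
    if (valid) rows.push(row);
  }
  return rows;
}
```

In a real scraper this logic would run inside the page (for example through an evaluate call) against the elements matched by listSelector.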
Session Persistence
The biggest time-saver for authenticated scraping:
// First run: log in and save
await loginAndSaveSession({
  loginUrl: 'https://app.example.com/login',
  usernameSelector: '#email',
  passwordSelector: '#password',
  submitSelector: 'button[type="submit"]',
  username: 'you@example.com',
  password: process.env.PASS,
  sessionPath: './session.json',
});
// Every subsequent run: no login needed
await withSavedSession('./session.json', async (context) => {
  const page = await context.newPage();
  await page.goto('https://app.example.com/dashboard');
  // Already authenticated
});
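Helpers like these most likely sit on top of Playwright's storage state feature: context.storageState({ path }) writes cookies and localStorage to a file, and browser.newContext({ storageState }) loads them back. The sketch below shows that round trip under that assumption. ContextLike and BrowserLike are stand-in interfaces so the code runs without the library, and saveSession and withSession are illustrative names, not the kit's.

```typescript
// Stand-ins for the Playwright BrowserContext and Browser methods used here (assumption).
interface ContextLike {
  storageState(options?: { path?: string }): Promise<unknown>;
  close(): Promise<void>;
}

interface BrowserLike {
  newContext(options?: { storageState?: string }): Promise<ContextLike>;
}

// After a successful login, persist cookies and localStorage to disk.
async function saveSession(context: ContextLike, sessionPath: string): Promise<void> {
  await context.storageState({ path: sessionPath });
}

// Open a context pre-loaded with the saved state, run fn, and always close the context.
async function withSession<T>(
  browser: BrowserLike,
  sessionPath: string,
  fn: (context: ContextLike) => Promise<T>
): Promise<T> {
  const context = await browser.newContext({ storageState: sessionPath });
  try {
    return await fn(context);
  } finally {
    await context.close();
  }
}
```

Saved sessions expire along with their cookies, so a production version would want to detect a redirect to the login page and log in again.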
Block Resources for 3-5x Speed
await context.route('**/*', async (route) => {
  if (['image', 'media', 'font'].includes(route.request().resourceType())) {
    return route.abort();
  }
  return route.continue();
});
On content-heavy sites this can cut load time from about 4s to under 1s; the gain depends on how much of the page weight is images, media, and fonts.
Page Change Monitoring
Hash-based diffing with persistence:
await startMonitor({
  url: 'https://example.com/product/123',
  selector: '.price, .availability',
  checkIntervalMs: 300_000,
  onChange: async (diff) => {
    console.log(`Changed at ${diff.detectedAt}`);
    console.log(`Was: ${diff.previousContent}`);
    console.log(`Now: ${diff.newContent}`);
    // Send Slack/email alert
  },
});
Good for: price trackers, job monitors, stock alerts, government updates.
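The hash-based comparison at the heart of a monitor like this fits in a few lines. This is a sketch under my own assumptions, not the kit's code: Snapshot, hashContent, and compareSnapshot are hypothetical names, whitespace is normalized before hashing, and the very first check is treated as a baseline rather than a change. The diff fields mirror the ones used in the onChange callback above.

```typescript
import { createHash } from 'node:crypto';

interface Snapshot {
  hash: string;
  content: string;
  checkedAt: string;
}

interface Diff {
  previousContent: string;
  newContent: string;
  detectedAt: string;
}

// Hash the text after collapsing whitespace, so cosmetic reflows don't count as changes.
function hashContent(content: string): string {
  const normalized = content.replace(/\s+/g, ' ').trim();
  return createHash('sha256').update(normalized).digest('hex');
}

// Compare fresh content with the previous snapshot (null on the first run).
function compareSnapshot(
  previous: Snapshot | null,
  content: string,
  now: Date = new Date()
): { snapshot: Snapshot; diff: Diff | null } {
  const snapshot: Snapshot = {
    hash: hashContent(content),
    content,
    checkedAt: now.toISOString(),
  };
  // The first run only establishes a baseline; equal hashes mean nothing changed.
  if (previous === null || previous.hash === snapshot.hash) {
    return { snapshot, diff: null };
  }
  return {
    snapshot,
    diff: {
      previousContent: previous.content,
      newContent: content,
      detectedAt: snapshot.checkedAt,
    },
  };
}
```

For persistence, the returned snapshot can be written to a JSON file after each check and read back on the next run, so a restart does not trigger a false alert.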
PDF Generation
// URL to PDF
const pdf = await generatePdf({ source: 'https://example.com' });
// HTML template to PDF (great for invoices)
const invoicePdf = await generatePdf({
  source: INVOICE_HTML(invoiceData),
  sourceType: 'html',
  displayHeaderFooter: true,
});
The kit includes a complete invoice HTML template you can drop your data into.
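If you would rather roll your own template, a function along these lines is enough to get started. It is a bare-bones sketch: InvoiceData, escapeHtml, and renderInvoiceHtml are hypothetical names and the layout is deliberately minimal, so don't read it as the kit's template. The resulting string is the kind of input you could hand to Playwright's page.setContent before calling page.pdf, which works in headless Chromium.

```typescript
interface InvoiceItem {
  description: string;
  quantity: number;
  unitPrice: number;
}

interface InvoiceData {
  number: string;
  customer: string;
  items: InvoiceItem[];
}

// Escape user-supplied text so it cannot break out of the markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical template: one table row per line item plus a grand total.
function renderInvoiceHtml(data: InvoiceData): string {
  const rows = data.items
    .map((item) => {
      const lineTotal = item.quantity * item.unitPrice;
      return `<tr><td>${escapeHtml(item.description)}</td><td>${item.quantity}</td><td>${lineTotal.toFixed(2)}</td></tr>`;
    })
    .join('\n');
  const total = data.items.reduce(
    (sum, item) => sum + item.quantity * item.unitPrice,
    0
  );
  return `<!doctype html>
<html><body>
<h1>Invoice ${escapeHtml(data.number)}</h1>
<p>Bill to: ${escapeHtml(data.customer)}</p>
<table>
${rows}
</table>
<p>Total: ${total.toFixed(2)}</p>
</body></html>`;
}
```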
These patterns are packaged as a starter kit with 20+ runnable TypeScript scripts under the MIT license.
Playwright Browser Automation Starter Kit - $19 one-time, instant download. Includes: screenshot capture, PDF generation, data extraction, form automation, login flows, page monitoring, anti-detection, and a full scraper template with pagination and retry.