DEV Community

Алексей Спинов

Cheerio.js Cheat Sheet: Extract Data from Any HTML Page in 10 Lines

Cheerio is a fast, lightweight way to parse HTML in Node.js: jQuery-style selectors with no browser overhead.

Install

npm install cheerio

Basic Usage

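// Run as an ES module (.mjs or "type": "module") so top-level await works
import * as cheerio from 'cheerio';

// fetch is built into Node.js 18+
const html = await fetch('https://example.com').then(r => r.text());
const $ = cheerio.load(html);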

// Extract data
const title = $('h1').text();
const links = $('a').map((i, el) => $(el).attr('href')).get();
const prices = $('.price').map((i, el) => $(el).text().trim()).get();
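
The `href` values collected in `links` are often relative, duplicated, or not web links at all. A minimal sketch of cleaning them up with the built-in `URL` class follows. `toAbsoluteLinks` is a hypothetical helper name, not part of Cheerio, and the sample array stands in for the `links` array above.

```javascript
// Hypothetical helper (not part of Cheerio): resolve extracted hrefs
// against the URL of the page they came from, and de-duplicate them.
function toAbsoluteLinks(hrefs, baseUrl) {
  const seen = new Set();
  for (const href of hrefs) {
    try {
      const url = new URL(href, baseUrl);
      // Keep only web links; this drops mailto:, tel:, javascript:, etc.
      if (url.protocol === 'http:' || url.protocol === 'https:') {
        url.hash = ''; // ignore in-page anchors when de-duplicating
        seen.add(url.href);
      }
    } catch {
      // new URL() throws on hrefs it cannot parse; skip those
    }
  }
  return [...seen];
}

// Sample input standing in for the `links` array
const sample = ['/about', 'contact.html#team', 'mailto:hi@example.com', '/about'];
console.log(toAbsoluteLinks(sample, 'https://example.com/blog/'));
```

Pass the same URL you fetched as `baseUrl` so relative paths resolve the way a browser would resolve them.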

Common Patterns

Extract Table Data

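const rows = $('table tr').map((i, row) => {
  const cells = $(row).find('td');
  // Header rows (th only) have no td cells; returning null drops them
  if (cells.length < 2) return null;
  return {
    name: cells.eq(0).text().trim(),
    value: cells.eq(1).text().trim()
  };
}).get();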

Extract All Meta Tags

const meta = {};
$('meta').each((i, el) => {
  // Standard tags use name=""; Open Graph tags (og:*) use property=""
  const name = $(el).attr('name') || $(el).attr('property');
  if (name) meta[name] = $(el).attr('content');
});
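
Once the `meta` object is filled, a common next step is picking a title, description, and image for a link preview. The sketch below assumes the page uses Open Graph or Twitter Card keys, which not every site does. `pickPreview` is a hypothetical helper name, and the sample object stands in for `meta`.

```javascript
// Hypothetical helper: build a link preview from a meta-tag object,
// preferring Open Graph keys and falling back to other common ones.
function pickPreview(meta) {
  // Return the first key that has a non-empty value, else null
  const first = (...keys) => {
    for (const key of keys) {
      if (meta[key]) return meta[key];
    }
    return null;
  };
  return {
    title: first('og:title', 'twitter:title'),
    description: first('og:description', 'twitter:description', 'description'),
    image: first('og:image', 'twitter:image'),
  };
}

// Sample input standing in for the `meta` object
console.log(pickPreview({ 'og:title': 'Example', description: 'A sample page' }));
```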

Extract Structured Data

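const jsonLd = $('script[type="application/ld+json"]')
  .map((i, el) => {
    // Skip blocks with malformed JSON instead of throwing; null is dropped by map
    try { return JSON.parse($(el).html()); } catch { return null; }
  })
  .get();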
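
A page often carries several JSON-LD entities, sometimes wrapped in a top-level `@graph` array, so you usually want to pull out one type such as `Product` or `Article`. A minimal sketch, assuming the `jsonLd` array from above as input; `findByType` is a hypothetical helper name, not part of Cheerio.

```javascript
// Hypothetical helper: find JSON-LD entities of a given schema.org type.
function findByType(blocks, type) {
  const matches = [];
  for (const block of blocks) {
    if (!block || typeof block !== 'object') continue;
    // Some sites wrap several entities in a top-level "@graph" array
    const nodes = Array.isArray(block['@graph']) ? block['@graph'] : [block];
    for (const node of nodes) {
      if (!node || typeof node !== 'object') continue;
      // "@type" may be a single string or an array of strings
      const types = [].concat(node['@type'] ?? []);
      if (types.includes(type)) matches.push(node);
    }
  }
  return matches;
}

// Sample input standing in for the `jsonLd` array
const sampleBlocks = [
  { '@type': 'Organization', name: 'Example Inc' },
  { '@graph': [{ '@type': ['Product', 'Thing'], name: 'Widget' }] },
];
console.log(findByType(sampleBlocks, 'Product'));
```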

When NOT to Use Cheerio

  • Site requires JavaScript rendering → use Playwright
  • Site has a JSON API → use fetch directly (faster, more stable)
  • Site blocks scrapers → use proxy rotation

API-First Rule

Before reaching for Cheerio, always check whether the site has a JSON API. Many popular sites return JSON directly, and that is faster and more stable than parsing HTML.
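
One way to apply this rule is to request JSON first and only parse HTML if the server does not return it. The sketch below is an assumption about how such a check could look, not a universal recipe: many sites ignore the `Accept` header and expose their API on a separate URL. `isJsonContentType` and `fetchJsonOrHtml` are hypothetical helper names, and global `fetch` requires Node.js 18+.

```javascript
// Hypothetical helper: true for application/json and "+json" types
// such as application/ld+json or application/vnd.api+json.
function isJsonContentType(contentType) {
  return /^application\/([\w.-]+\+)?json\b/i.test(contentType || '');
}

// Hypothetical helper: ask for JSON, fall back to raw HTML for Cheerio.
async function fetchJsonOrHtml(url) {
  const res = await fetch(url, { headers: { Accept: 'application/json' } });
  if (!res.ok) throw new Error(`Request failed: ${res.status}`);
  const contentType = res.headers.get('content-type');
  return isJsonContentType(contentType)
    ? { kind: 'json', data: await res.json() }
    : { kind: 'html', html: await res.text() };
}
```

If `kind` comes back as `'html'`, pass the `html` string to `cheerio.load()` as in Basic Usage.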
