dss99911

Posted on • Originally published at dss99911.github.io

Node.js Headless Browsers: Puppeteer and Chrome Launcher

This post looks at how to crawl and scrape web pages from Node.js using a headless browser.

Headless browser options

  • Puppeteer
  • PhantomJS
  • Other libraries

Chrome Launcher

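chrome-launcher is a library that launches Chrome, optionally in headless mode. It only starts the browser; you control it through the DevTools Protocol on the debugging port it reports.

Launching Chrome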

const chromeLauncher = require('chrome-launcher');

chromeLauncher.launch({
    startingUrl: 'https://google.com'
}).then(chrome => {
    console.log(`Chrome debugging port running on ${chrome.port}`);
});

Launching headless Chrome

const chromeLauncher = require('chrome-launcher');

chromeLauncher.launch({
    startingUrl: 'https://google.com',
    chromeFlags: ['--headless', '--disable-gpu']
}).then(chrome => {
    console.log(`Chrome debugging port running on ${chrome.port}`);
});

Puppeteer

Puppeteer controls headless Chrome or Chromium through the DevTools Protocol.

Installation

npm i --save puppeteer

Basic usage

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://example.com');
    await page.screenshot({path: 'example.png'});
    await browser.close();
})();

Running with a visible browser window

const browser = await puppeteer.launch({headless: false});

Using a different version of Chrome

const browser = await puppeteer.launch({executablePath: '/path/to/Chrome'});

Waiting for all resources to load

await page.goto(url, {"waitUntil": "networkidle0"});

DOM manipulation

Document handle

const aHandle = await page.evaluateHandle('document');

querySelector

const element = await page.$(selector); // ElementHandle, or null if nothing matches

querySelectorAll

const elements = await page.$$(selector); // array of ElementHandles; takes no callback

querySelectorAll with a callback

const divsCounts = await page.$$eval('div', divs => divs.length);

Extracting element values

const searchValue = await page.$eval('#search', el => el.value);
const preloadHref = await page.$eval('link[rel=preload]', el => el.href);
const html = await page.$eval('.main-container', e => e.outerHTML);

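Note: `$eval` and `$$eval` can only hand back values that can be serialized, such as strings, numbers, and plain objects. If the callback returns anything else, such as a DOM element, `await` will not give you a usable value. Return the specific properties you need, or use `page.$` or `page.evaluateHandle` to get a handle instead.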

Screenshots

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://example.com');
    await page.screenshot({path: 'example.png'});
    await browser.close();
})();

Generating a PDF

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://news.ycombinator.com', {waitUntil: 'networkidle2'});
    await page.pdf({path: 'hn.pdf', format: 'A4'});
    await browser.close();
})();

Originally published at https://dss99911.github.io
