Quick list / feed scraping in the dev console

swkidd ・1 min read

Scraping list data in the dev console

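// Build an array from every node matched by an XPath expression.
// `each` optionally maps a node to a value (defaults to the node itself).
let xeval = (s, each = e => e) => {
  // ORDERED_NODE_ITERATOR_TYPE returns matches in document order
  const iter = document.evaluate(s, document, null, XPathResult.ORDERED_NODE_ITERATOR_TYPE, null);
  const elems = [];
  let elem; // declared locally so it does not leak as an implicit global
  while ((elem = iter.iterateNext())) {
    elems.push(each(elem));
  }
  return elems;
};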

I was recently helping someone scrape app reviews and came across this tip. The same approach works for scraping any list or feed of similarly structured elements.

The Problem

A lot of sites generate class names on page load, so you can't always use class names or IDs to pull page data. The solution is to use XPath. I was surprised by how easy this is!

  • Scroll down the page a bit and choose an element you want to scrape

  • Inspect the element and copy its XPath

right-click the element in the dev tools window -> Copy -> Copy XPath / Copy full XPath

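  • It will look like:
    '.../div/div[10]/div/div[2]/div[1]/div[1]/span'

  • '[10]' selects only the 10th div at that level (your list item). By deleting '[10]', the path matches that div in every list item, so all like elements will be selected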

  • Pass the edited XPath to xeval to select the page elements
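
The "delete the index" step can also be done in code rather than by hand. Below is a minimal sketch, not part of the original tip: `dropIndex` is a hypothetical helper name, and the path in the example is made up for illustration (the real one comes from the dev tools copy step). It removes only the first occurrence of the given index and leaves the rest of the path alone.

```javascript
// Hypothetical helper (an assumption, not from the original post):
// removes the first positional predicate such as "[10]" from an XPath
// string, so the path matches every sibling at that level instead of
// only the 10th one.
function dropIndex(xpath, index) {
  const predicate = `[${index}]`;
  const at = xpath.indexOf(predicate);
  // Leave the path untouched if the predicate is not present.
  if (at === -1) return xpath;
  return xpath.slice(0, at) + xpath.slice(at + predicate.length);
}

// Made-up path for illustration; paste the one copied from the dev tools.
const single = '/html/body/div/div[10]/div/div[2]/div[1]/div[1]/span';
const all = dropIndex(single, 10);
// all is now '/html/body/div/div/div/div[2]/div[1]/div[1]/span'
```

In the console, the result could then be handed to the xeval helper from the top of the post, for example `xeval(all, e => e.textContent)` to collect the text of each matched element.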

And that's it!

Here's the code
