Bartłomiej Stefański
Originally published at bstefanski.com

🤖 Quickly scrape tweets without API or headless browser

Writing a scraping tool is a boring process: you have to use a headless browser or an API (but that wouldn't be called scraping, would it?).

It takes a lot of time to develop and run such a tool. Whenever possible, it's best to avoid writing a standalone application for that.

My goal was to gather links to some tweets listed on Twitter's search page. I went to the search page, pasted the little snippet below into the DevTools console, and scrolled until I was satisfied with the results.

It will probably stop working in the near future, since it depends entirely on the text content of some DOM nodes, but you can of course inspect Twitter's DOM and adapt it to your needs.


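// Collected tweet links; a Set drops duplicates across scroll events.
const links = new Set();

window.addEventListener('scroll', () => {
  // The selector matches the search results container by its aria-label.
  const timeline = document.querySelector('[aria-label="Timeline: Search timeline"]');
  if (!timeline) return;
  [...timeline.children[0].children].forEach((el) => {
    // The fourth link inside each result is taken to be the tweet's link.
    const singleLink = el.querySelectorAll('a')[3];
    if (singleLink) {
      links.add(singleLink.getAttribute('href'));
    }
  });
});

// Inspect the set (run this line again after scrolling to see what was collected).
console.log(links);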
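
If the fixed link position (the fourth anchor in each result) ever shifts, one possible alternative is to filter links by the shape of their href instead of by index. The sketch below is only an illustration: collectTweetLinks is a hypothetical helper, and it assumes tweet links look like /<username>/status/<numeric id>, which the snippet above does not verify. It takes plain strings so it can be tried outside the browser as well.

```javascript
// Hypothetical helper: keeps only hrefs that look like tweet permalinks.
// Assumption: permalinks have the form /<username>/status/<numeric id>.
const TWEET_PATH = /^\/[^/]+\/status\/\d+$/;

function collectTweetLinks(hrefs, seen = new Set()) {
  for (const href of hrefs) {
    // Skip missing attributes (null) and anything that is not a permalink.
    if (typeof href === 'string' && TWEET_PATH.test(href)) {
      seen.add(href);
    }
  }
  return seen;
}

// Made-up sample input standing in for hrefs read from the page.
const sample = [
  '/home',
  '/someuser/status/12345',
  '/someuser/status/12345/photo/1',
  '/someuser',
  '/someuser/status/12345',
];
console.log([...collectTweetLinks(sample)]);
```

In DevTools, the input could come from [...document.querySelectorAll('a')].map((a) => a.getAttribute('href')) inside the scroll handler, passing the same Set on every call so results accumulate while you scroll.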
