DEV Community

Scraping

Posts

I spent 3 days scraping a site until I tried LLMs for data extraction

6 min read
My web scraping nightmare ended when I let an LLM read the HTML

5 min read
I Thought I Knew Web Scraping — Until I Hit JavaScript

4 min read
Why I Gave Up on Regex and Started Using AI for Web Scraping

5 min read
I Spent a Weekend Fighting Flaky Scrapers — Here’s What Finally Worked

5 min read
Advanced Headless Browser Anti-Bot Techniques: TLS & Canvas

6 min read
Optimizing Chunking and Data Extraction for Zero-Hallucination RAG

4 min read
Track YC Demo Day Companies in Real Time (with code)

5 min read
Architecture of a Rental Aggregator: Scraping and Normalizing 90+ Sources

4 min read
Web Scraping is a Contract

4 comments
8 min read
How I scraped 50k YouTube subtitles in 2 weeks for $7 (and the legal gray zones)

4 min read
API or browser agent? We picked yes.

7 min read
When web scraping breaks: using AI to extract messy data

5 min read
ISP proxies, AI crawlers, and the slow death of datacenter IPs: 2026 in numbers

8 min read
I Tested 15 LLMs for Web Scraping and Built Heuristics Instead

3 min read