DEV Community

# webscraping

Posts

How to test whether your web extraction API is lying to your agent

3 min read
How we built a hiring-intent lead finder using Google as the backend (no login, no ban risk)

4 min read
When an Actor Platform Is Too Much for an LLM Scraping Task

4 min read
How to Know If You Actually Need Mobile Proxies (Without Buying Any)

1
8 min read
I Built a $29/Month API That Turns Any Website Into Structured JSON (No AI Black Box)

2 min read
Parsing robots.txt for 10 AI Crawlers: Wildcards, Partial Blocks, Line Numbers

5 min read
How we built a Reddit comment-tree scraper that returns upvote scores — through a residential proxy

6 min read
How to read a Polymarket Up/Down outcome (and the Price To Beat) before the oracle settles

4 min read
YouTube Transcript Scraper: bulk-download captions for RAG, AI, and show notes

7 min read
Your Scraper Returns 200 OK and Lies. Here's How to Catch It.

5 min read
Detecting Q&A Patterns and Heading Trees in Raw HTML

1
Comments 1
6 min read
When scraping orchestration is the wrong abstraction for LLM workflows

4 min read
Generating scraper logic at runtime instead of writing it per site

3 min read
Twitch Chat Scraper: export any VOD's full chat replay for $1.05/1K

7 min read
Threads Reply Scraper: export the full conversation tree of any public post

8 min read