DEV Community

Cover image for I Built a Free API That Scrapes Any Website Using Plain English — No CSS Selectors
Paras Tejpal
Paras Tejpal

Posted on

I Built a Free API That Scrapes Any Website Using Plain English — No CSS Selectors

I've wasted days of my life maintaining CSS selectors.

You know the drill — you write the perfect scraper, it works great for a week, then the site does a frontend redesign, your selectors break, and you spend another afternoon hunting through the DOM again.

So I built Opticparse — a completely different approach.

How It Works

Instead of selectors, Opticparse:

  1. Opens a real Chromium browser (via Playwright)
  2. Navigates to your URL and waits for JavaScript to load
  3. Screenshots the page
  4. Sends the image + your plain English query to a Vision-Language model (rotating through Groq, Gemini, and GitHub Models to prevent rate limits)
  5. Extracts and returns clean, structured JSON matching your target schema

Because it uses AI vision to look at the page exactly like a human does, it never breaks when the HTML structure changes.


🛠️ The Tech Stack

I built this using:

  • FastAPI (Python): High-performance backend routing
  • Playwright: Handles headless rendering, waits for dynamic content, and takes screenshots
  • OpenAI / Gemini SDKs: Communicates with the vision models
  • Free Key Rotation: Cascading fallback rotation across 6 providers so it runs completely for free

💻 Code Example: Scrape E-Commerce Prices

Here is how simple it is to query. You just pass the URL, a query prompt, and a JSON schema you want it to output:


bash
curl -X POST https://opticparse-python-sg.onrender.com/api/vision-scrape \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_KEY" \
  -d '{
    "target_url": "https://example.com/store",
    "extraction_query": "Extract the product name, price, and currency",
    "response_schema": {
      "type": "object",
      "properties": {
        "product_name": {"type": "string"},
        "price": {"type": "number"},
        "currency": {"type": "string"}
      }
    }
  }'
## 🚀 Get Started

I've open-sourced the code and deployed the API to Render. 

1. **GitHub Repository**: If you want to run it locally or host it yourself: [parastejpal987-cmyk/opticparse](https://github.com/parastejpal987-cmyk/opticparse)
2. **RapidAPI Hub**: Access the free API tiers immediately: [Opticparse on RapidAPI](https://rapidapi.com/parastejpal987cmyk/api/opticparse-ai-vision-web-scraper)

Let me know what you think in the comments! Happy scraping.
Enter fullscreen mode Exit fullscreen mode

Top comments (0)