I had a freelance gig. Collect water management system data for a bunch of towns — infrastructure details, treatment plants, distribution networks, whatever was publicly available. Put it all together in one structured document.
Sounds simple. It's not.
If you've ever tried to manually collect government data from the web, you know the pain. Every town has a different website. Some have PDFs buried three links deep. Some have scanned documents from 2014 that no one has updated since. Some don't even have a proper site — just a PDF uploaded to a random state portal.
So the workflow was basically: Google the town, find the relevant pages, open 15 tabs, copy-paste data into a doc, find a PDF, download it, read through it, extract the relevant numbers, repeat for the next town. For hours.
I wasn't going to do that manually. So I built an automation pipeline using OpenClaw — an open-source autonomous AI agent tool — and got the whole thing working in under an hour.
The Stack
Three pieces:
- Brave Search API — for programmatic web search. I needed to find town-specific water management data without manually googling each one. Brave's API let me run structured queries and get back URLs with relevant results.
- OCR (via an OpenClaw skill) — a lot of the useful data was locked inside scanned PDFs. Government reports, water quality documents, infrastructure plans — stuff that's technically "public" but practically unreadable by machines. The OCR skill extracted text from these documents so the pipeline could actually use the content.
- Claude API — the brain of the operation. Once I had raw search results and extracted PDF text, Claude processed everything, pulled out the relevant water management data, and structured it into a clean, consistent format across all the towns.

How OpenClaw Tied It All Together

The thing about OpenClaw is that it's not just running one API call. It's an autonomous agent — meaning I could describe the task, give it the tools (Brave Search, OCR, Claude), and let it figure out the sequencing. For each town, the agent would:
- Search for water management data using Brave
- Identify which results were useful (web pages vs PDFs vs junk)
- For PDFs — download and run OCR to extract text
- For web pages — pull the relevant content
- Send everything to Claude to extract and structure the data
- Compile the output into a consistent format
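The routing logic the agent worked out can be sketched in a few lines. This is a minimal illustration of the steps above, not OpenClaw's actual code — the function names, query template, and file-type rules are my own assumptions, and the network calls are elided:

```python
# Illustrative sketch of the per-town routing described above.
# Names and heuristics are assumptions, not OpenClaw's real API --
# the agent generated this decision logic itself.
import re

def classify_result(url: str) -> str:
    """Decide how to handle a search hit: OCR for PDFs, scraping for pages."""
    path = url.lower().split("?")[0]  # ignore query strings
    if path.endswith(".pdf"):
        return "pdf"
    if re.search(r"\.(jpg|jpeg|png|gif|zip|xls|xlsx)$", path):
        return "skip"  # junk results: images, spreadsheets, archives
    return "html"

def build_query(town: str) -> str:
    """Hypothetical query template sent to the Brave Search API per town."""
    return f'"{town}" water management treatment plant infrastructure'

# Per-town flow (network calls elided):
#   1. results = brave_search(build_query(town))
#   2. for each result, route on classify_result(result_url):
#        "pdf"  -> download, run the OCR skill to extract text
#        "html" -> fetch the page, keep the relevant content
#   3. send all extracted text to Claude to produce one structured record
```

The point is that this branching never had to be written by hand; the agent made the PDF-vs-webpage call on its own.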
I didn't have to write the logic for "if it's a PDF, do this, if it's a webpage, do that." The agent handled those decisions. I just described what I needed and gave it the tools.
The Part That Blew My Mind
Here's what I didn't expect.
After the pipeline was working, OpenClaw didn't just run the task — it created a skill out of the entire workflow. A reusable, self-contained skill that combined Brave Search, OCR, and Claude into one thing.
So after the initial setup, my workflow became:
Enter town name → get structured water management data.
That's it. One input. The skill handled everything — searching, finding PDFs, scanning them, extracting the data, structuring the output. I didn't have to think about the pipeline anymore. It was just a tool now.
Setup took maybe 40 minutes — configuring the individual pieces, testing on one town, making sure the output format was right. But once the agent built the skill, every town after that was basically instant.
And the output was more consistent than what I would have produced manually, because Claude structured every town's data the same way — same fields, same format. When I do it by hand, I start strong, and by town 15 I'm cutting corners. We all do.
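To make "same fields, same format" concrete, here's the kind of fixed schema that enforces that consistency. The field names are my own illustration — the actual fields came from the task prompt, not from anything shown here:

```python
# Hypothetical output schema -- field names are illustrative guesses,
# not the actual fields from the gig. The value of a fixed schema is
# that town 15 gets the same treatment as town 1.
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class TownWaterRecord:
    town: str
    treatment_plants: int
    daily_capacity_mgd: Optional[float]  # million gallons per day, if reported
    distribution_miles: Optional[float]
    source_urls: list

record = TownWaterRecord(
    town="Example Town",
    treatment_plants=2,
    daily_capacity_mgd=4.5,
    distribution_miles=None,  # missing data stays explicit instead of vanishing
    source_urls=["https://example.gov/water-report.pdf"],
)
```

A schema like this also makes gaps visible: a `None` in a field is information, whereas in a hand-built doc a missing number just silently disappears.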
Why I'm Writing This
I see a lot of posts about AI agents being used for coding — Cursor, Claude Code, Copilot. And that's great. But I think the more underrated use case is stuff like this. Data collection. Research. Document processing. The boring, repetitive work that doesn't require you to be a senior engineer — just someone who knows how to connect a few APIs and let an agent do the manual labor.
If you're a backend developer or honestly anyone who deals with data, look into autonomous agent tools. Not for the hype. For the hours you'll get back.
OpenClaw is open-source. The Brave Search API has a free tier. Claude API costs are minimal for this kind of workload. The barrier to entry is lower than you think.
Tech used:
- OpenClaw (autonomous AI agent)
- Brave Search API
- OCR skill (document scanning)
- Claude API (data extraction and structuring)

Time saved: ~6-8 hours of manual work → under 1 hour of setup + automated execution
If you've used AI agents for non-coding automation, I'd love to hear what you built. Drop it in the comments.