<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Senpeng</title>
    <description>The latest articles on DEV Community by Senpeng (@senke0x).</description>
    <link>https://dev.to/senke0x</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3856563%2F7fcae966-ff20-485e-9449-28cbcaa7586e.jpeg</url>
      <title>DEV Community: Senpeng</title>
      <link>https://dev.to/senke0x</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/senke0x"/>
    <language>en</language>
    <item>
      <title>Scaling Our Daily Twitter Scraping Workflow with actionbook's Cloud Browser Providers</title>
      <dc:creator>Senpeng</dc:creator>
      <pubDate>Sat, 18 Apr 2026 18:11:28 +0000</pubDate>
      <link>https://dev.to/senke0x/scaling-our-daily-twitter-scraping-workflow-with-actionbooks-cloud-browser-providers-1hlh</link>
      <guid>https://dev.to/senke0x/scaling-our-daily-twitter-scraping-workflow-with-actionbooks-cloud-browser-providers-1hlh</guid>
      <description>&lt;h2&gt;
  
  
  What We Were Trying to Do Every Morning
&lt;/h2&gt;

&lt;p&gt;Every morning, our inbox looks like a battlefield. Around a thousand emails pour in overnight, each one an IFTTT-generated notification carrying a single Twitter link. Keyword subscriptions, competitor alerts, KOL tracking, industry signals, all jammed together in an unreadable pile.&lt;/p&gt;

&lt;p&gt;Out of those thousand, roughly fifty are actually worth reading. The other 950 are noise.&lt;/p&gt;

&lt;p&gt;But here is the catch. You cannot tell which is which by looking at the email subject. The IFTTT notification gives you a link and not much else. To filter properly, you have to open each tweet and pull three pieces of information: the actual content, the view count, and the like count. Only then can you decide whether a post deserves a closer look or a follow-up.&lt;/p&gt;

&lt;p&gt;So we did what every engineering team does when faced with a thousand repetitive clicks. We automated it.&lt;/p&gt;

&lt;p&gt;And for a long time, our automation was fine. Not great. Just fine.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where the First Version Broke
&lt;/h2&gt;

&lt;p&gt;Our first crawler was a straightforward Actionbook script running in local mode. Open a batch of tabs, visit each IFTTT-provided link, extract the content and metrics, move on. On paper, we could push it to 30 concurrent tabs on a single machine. In practice, it hit Twitter's rate limit almost immediately.&lt;/p&gt;

&lt;p&gt;Anything above a handful of concurrent requests started coming back with empty pages, challenge screens, or outright blocks. Dropping concurrency to keep the crawler stable meant a full run took around thirty minutes. By the time the data landed, standup was halfway over.&lt;/p&gt;

&lt;p&gt;We tried the usual bag of tricks. Jittered delays. Rotating user agents. Residential proxies. Each one added maintenance cost, and the payoff kept shrinking.&lt;/p&gt;

&lt;p&gt;Eventually we stopped and asked the obvious question. What is actually limiting us here?&lt;/p&gt;

&lt;p&gt;It was not CPU. It was not bandwidth. It was not even the script.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It was the single exit IP.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every request we made went out through the same door. Twitter was not rate-limiting our machine. It was rate-limiting anyone knocking from that address. No amount of local optimization could fix a problem that existed at the network edge.&lt;/p&gt;

&lt;p&gt;The only real way forward was to move to cloud browsers, so that every request could go out from a different IP.&lt;/p&gt;

&lt;h2&gt;
  
  
  Enter &lt;code&gt;--provider&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Around this time, Actionbook shipped exactly what we needed. A &lt;code&gt;--provider&lt;/code&gt; flag that lets you delegate browser sessions to different cloud browser services. Today it supports three backends: Driver, HyperBrowser, and BrowserUse.&lt;/p&gt;

&lt;p&gt;What matters is not which three providers it supports. What matters is that you can switch between them by changing a single flag, without touching the script. That means we can run the same crawler across multiple providers in parallel, and each provider brings its own pool of IPs.&lt;/p&gt;
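
&lt;p&gt;As a dry-run sketch of that idea: fan one scraper across all three backends in a loop. The &lt;code&gt;scrape_slice.sh&lt;/code&gt; name and the invocation shape are hypothetical; only the &lt;code&gt;--provider&lt;/code&gt; flag itself comes from Actionbook.&lt;/p&gt;

```shell
# Dry-run sketch: one scraper, three backends, switched by flag alone.
# scrape_slice.sh and the invocation shape are hypothetical;
# only the --provider flag is real.
for provider in driver hyperbrowser browseruse; do
  echo "actionbook run scrape_slice.sh --provider $provider"
done
```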

&lt;h2&gt;
  
  
  The New SOP: Spreading the Load Across Three Providers
&lt;/h2&gt;

&lt;p&gt;Here is where the economics get interesting. Each of these cloud browser providers offers a free tier. Individually, none of them is generous enough to handle our full daily volume. Together, running in parallel, they comfortably are.&lt;/p&gt;

&lt;p&gt;So we designed the pipeline around that observation.&lt;/p&gt;

&lt;p&gt;At 7 AM, a cron job ingests every IFTTT email from the overnight inbox and extracts the Twitter link embedded in each one. Today that yields around a thousand URLs. The list gets split into three roughly equal slices, one for each provider.&lt;/p&gt;
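
&lt;p&gt;The split step itself is one line of awk. A minimal sketch, assuming the extracted links sit one per line in &lt;code&gt;urls.txt&lt;/code&gt; (the filenames are illustrative):&lt;/p&gt;

```shell
# Round-robin urls.txt into three slices, one per provider.
# NR % 3 sends line 1 to slice_1, line 2 to slice_2, line 3 to slice_0, ...
awk '{ print > ("slice_" (NR % 3) ".txt") }' urls.txt
wc -l slice_*.txt   # sanity check: three roughly equal files
```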

&lt;p&gt;For each slice, we open a single cloud browser session on the corresponding provider. Inside that session, we drive 10 tabs concurrently, each one visiting its assigned tweet URL and pulling the content, view count, and like count.&lt;/p&gt;

&lt;p&gt;Three providers running in parallel. One session each. Ten tabs per session. Thirty real requests in flight at any moment, but split across three completely independent IP pools. If Twitter throttles one provider's egress, the other two keep going without even noticing.&lt;/p&gt;

&lt;p&gt;From the crawler's point of view, none of this complexity exists. Actionbook abstracts away the differences between Driver, HyperBrowser, and BrowserUse. We wrote one scraper. The orchestrator decides which provider gets which slice, spins up the sessions, and collects the results.&lt;/p&gt;

&lt;p&gt;Once everything lands, the pipeline runs a summarization pass over the collected content and applies our relevance filters. The thousand raw URLs collapse into about fifty tweets that genuinely deserve attention, and those show up in the morning briefing channel before anyone walks into standup. The runtime dropped from around thirty minutes to about five, and rate limit errors effectively went to zero.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing Thought
&lt;/h2&gt;

&lt;p&gt;This Twitter pipeline is just one example. Once you have Actionbook's &lt;code&gt;--provider&lt;/code&gt; flag combined with isolated sessions across different cloud browsers, a lot of workflows that used to feel impractical suddenly become straightforward. &lt;/p&gt;

&lt;p&gt;Anything that needs high-volume access to the same domain, anything that needs a clean session per task, anything that needs to sidestep the limits of a single local machine, all of it fits naturally into this pattern.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>productivity</category>
      <category>programming</category>
    </item>
    <item>
      <title>Hermes Searched 4 Platforms at Once. I Just Told It What I Wanted.</title>
      <dc:creator>Senpeng</dc:creator>
      <pubDate>Thu, 16 Apr 2026 07:08:07 +0000</pubDate>
      <link>https://dev.to/senke0x/hermes-searched-4-platforms-at-once-i-just-told-it-what-i-wanted-b9g</link>
      <guid>https://dev.to/senke0x/hermes-searched-4-platforms-at-once-i-just-told-it-what-i-wanted-b9g</guid>
      <description>&lt;p&gt;Hermes Agent is one of the hottest open-source agents right now. Runs local, reasons well, handles multi-step tasks without hand-holding. But its built-in browser tools have a bottleneck: one tab at a time. Want to search multiple platforms? Your agent queues them up and waits through each one sequentially.&lt;/p&gt;

&lt;p&gt;I integrated Actionbook with Hermes and asked it to fetch today's trending topics from Twitter, Reddit, Google, and Hacker News. It spun up four browser tabs in parallel and returned the aggregated results in under a minute.&lt;/p&gt;

&lt;p&gt;One instruction. Four platforms. All at once.&lt;/p&gt;


&lt;div&gt;
  &lt;iframe src="https://loom.com/embed/110663b556014b9bad0757a6c601cc1b"&gt;
  &lt;/iframe&gt;
&lt;/div&gt;


&lt;h2&gt;
  
  
  Integrate Actionbook with Hermes
&lt;/h2&gt;

&lt;p&gt;One line to get Actionbook running in Hermes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npx skills add actionbook/actionbook &amp;amp;&amp;amp; hermes skills install skills-sh/actionbook/actionbook/actionbook -y
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Done.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I told Hermes
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Use Actionbook to operate the browser with chrome remote debugging, launching parallel tasks across the following platforms simultaneously to research [your topic], then synthesize a consolidated trend report.
## Tab Management Rule (Apply to ALL tasks)

&amp;gt; After extracting the required information from any detail page (post, article, thread, comment page), immediately close that tab before opening the next one. Never accumulate more than 1–2 open tabs per platform task at any time.

## Execution

Run all 4 platforms in parallel via Actionbook. Each browser task is independent.

HackerNews
1. Open HackerNews and search for [your topic], filter by this week sorted by popularity.
2. For each high-scored post: open the detail/comments page → extract supporting and opposing viewpoints → close the tab.

Reddit
1. Open Reddit and search for [your topic], filter by Top this week.
2. For each relevant post: open the post → switch comment sorting to Controversial → extract pain points, product comparisons, and real user experiences → close the tab.

Twitter/X
1. Open Twitter/X and search for [your topic], filter for high-engagement original posts (no retweets).
2. For each post: open the reply thread → capture sentiment and key opinions from notable voices → close the tab.

Google
1. Open Google and search for [your topic].
2. For each result (news, blog, announcement): open the page → collect key information → close the tab.
3. Cover 1~2 results.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hermes reads the instruction, builds the URL list, opens all 4 tabs through actionbook, then snapshots and extracts text from each one in parallel. No waiting for one to finish before starting the next.&lt;/p&gt;

&lt;p&gt;You just watch four tabs light up.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why trending data breaks with one tab at a time
&lt;/h2&gt;

&lt;p&gt;Trending content shifts by the minute. If your agent checks Twitter first, then Reddit, then Google, then HN, by the time it finishes the fourth search, the first result is already stale. You end up with a patchwork of different moments instead of one coherent snapshot.&lt;/p&gt;

&lt;p&gt;Parallel tabs fix this. All four searches happen at the same instant. The data you get back is a synchronized time slice across every platform.&lt;/p&gt;
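
&lt;p&gt;The timing difference is easy to demonstrate with a toy sketch, where a one-second &lt;code&gt;sleep&lt;/code&gt; stands in for each platform search:&lt;/p&gt;

```shell
# Four fake "searches" launched in the background finish in about 1s of
# wall time, not 4s: the synchronized time slice in miniature.
start=$(date +%s)
for platform in twitter reddit google hackernews; do
  sleep 1 &amp;
done
wait
echo "wall time: $(( $(date +%s) - start ))s"
```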

&lt;h2&gt;
  
  
  How actionbook handles parallel tabs
&lt;/h2&gt;

&lt;p&gt;Actionbook's daemon manages multiple CDP connections concurrently. Each tab is an independent target. No "active tab" concept, no switching overhead.&lt;/p&gt;

&lt;p&gt;Behind the scenes, here's what Hermes runs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Open 4 tabs at once&lt;/span&gt;
actionbook browser new-tab &lt;span class="s2"&gt;"https://twitter.com/search?q=AI+agents"&lt;/span&gt; &lt;span class="nt"&gt;--session&lt;/span&gt; s1
actionbook browser new-tab &lt;span class="s2"&gt;"https://www.reddit.com/search/?q=AI+agents"&lt;/span&gt; &lt;span class="nt"&gt;--session&lt;/span&gt; s1
actionbook browser new-tab &lt;span class="s2"&gt;"https://www.google.com/search?q=AI+agents+trending+today"&lt;/span&gt; &lt;span class="nt"&gt;--session&lt;/span&gt; s1
actionbook browser new-tab &lt;span class="s2"&gt;"https://news.ycombinator.com"&lt;/span&gt; &lt;span class="nt"&gt;--session&lt;/span&gt; s1

&lt;span class="c"&gt;# Wait for all 4 to load, then extract&lt;/span&gt;
actionbook browser wait-idle &lt;span class="nt"&gt;--session&lt;/span&gt; s1 &lt;span class="nt"&gt;--tab&lt;/span&gt; t1
actionbook browser text &lt;span class="nt"&gt;--session&lt;/span&gt; s1 &lt;span class="nt"&gt;--tab&lt;/span&gt; t1
actionbook browser wait-idle &lt;span class="nt"&gt;--session&lt;/span&gt; s1 &lt;span class="nt"&gt;--tab&lt;/span&gt; t2
actionbook browser text &lt;span class="nt"&gt;--session&lt;/span&gt; s1 &lt;span class="nt"&gt;--tab&lt;/span&gt; t2
&lt;span class="c"&gt;# ... same for t3, t4&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Four pages loaded, four pages extracted. Hermes reads all the results and writes a single summary.&lt;/p&gt;

&lt;h2&gt;
  
  
  What else you can do with parallel tabs
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Morning briefing for newsletter writers.&lt;/strong&gt; A dev I know runs this every morning before coffee. Hermes pulls trending from multiple sources, cross-references the overlapping topics, and outputs a structured briefing. What used to be 30 minutes of tab-switching is now a single instruction the night before.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Competitive monitoring.&lt;/strong&gt; An indie hacker pointed multiple tabs at competitor blogs and changelogs simultaneously. Every Monday, Hermes opens them all, extracts what shipped that week, and drops a comparison into a markdown file. No RSS, no manual checking.&lt;/p&gt;

&lt;p&gt;Multiple tabs, one instruction. This is how agents should be searching the web.&lt;/p&gt;

</description>
      <category>hermes</category>
      <category>actionbook</category>
      <category>browserautomation</category>
      <category>aiagents</category>
    </item>
    <item>
      <title>I Asked Claude Code to Scrape First Round. It Opened 30 Tabs at Once.</title>
      <dc:creator>Senpeng</dc:creator>
      <pubDate>Fri, 10 Apr 2026 11:53:19 +0000</pubDate>
      <link>https://dev.to/senke0x/i-asked-claude-code-to-scrape-first-round-it-opened-30-tabs-at-once-1pf</link>
      <guid>https://dev.to/senke0x/i-asked-claude-code-to-scrape-first-round-it-opened-30-tabs-at-once-1pf</guid>
      <description>&lt;p&gt;I wanted to study how successful startups on First Round Capital craft their taglines. So I gave Claude Code one instruction: use actionbook go to ervery website and extract tagline.&lt;/p&gt;

&lt;p&gt;Instead of clicking through each page one by one like a traditional browser agent would, Claude Code opened 30 tabs simultaneously and returned everything in under a minute.&lt;/p&gt;

&lt;p&gt;This is how agents should have been using browsers all along.&lt;/p&gt;


&lt;div&gt;
  &lt;iframe src="https://loom.com/embed/e8dce4a3f70f4787bcfce5b7b5ed958a"&gt;
  &lt;/iframe&gt;
&lt;/div&gt;


&lt;h2&gt;
  
  
  Set up your agent in one sentence
&lt;/h2&gt;

&lt;p&gt;Tell Claude Code one sentence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Find actionbook on GitHub and install it.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the entire setup.&lt;/p&gt;

&lt;p&gt;After setup, give it one instruction:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Collect all companies URL from firstround.com/companies.
Open 30 tabs at once. Take snapshot of each page and extract the company name and tagline and save as CSV (company name, URL, tagline).
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude Code builds the URL list from the index page, batches them into groups of 30, and manages tab rotation on its own. You just watch 30 tabs open, scrape, close, and repeat.&lt;/p&gt;
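
&lt;p&gt;That rotation maps onto a classic shell pattern. A sketch, assuming the collected links sit one per line in &lt;code&gt;urls.txt&lt;/code&gt; and with the per-batch browser work stubbed out as an echo:&lt;/p&gt;

```shell
# Feed URLs to a stub 30 at a time; each invocation is one batch
# (open 30 tabs, wait, extract, close), here reduced to an echo.
xargs -n 30 sh -c 'echo "batch: $# tabs"' _ < urls.txt
```

&lt;p&gt;With 192 URLs that yields seven invocations: six full batches of 30 and a final one of 12.&lt;/p&gt;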

&lt;h2&gt;
  
  
  Why agents need Actionbook for parallel tabs
&lt;/h2&gt;

&lt;p&gt;Browser agents today work one tab at a time. That's fine for 5 pages but not for triaging hundreds of emails, parsing a thousand tweets, or scanning Reddit threads. One tab at a time, your agent spends most of its time waiting instead of working.&lt;/p&gt;

&lt;p&gt;Actionbook gives agents parallel browser access. 30 tabs open, 30 tabs working, all at once.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Actionbook isolates each tab
&lt;/h2&gt;

&lt;p&gt;Every Actionbook browser command carries an explicit address: &lt;code&gt;--session&lt;/code&gt; and &lt;code&gt;--tab&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;actionbook browser text &lt;span class="nt"&gt;--session&lt;/span&gt; s1 &lt;span class="nt"&gt;--tab&lt;/span&gt; t1
actionbook browser text &lt;span class="nt"&gt;--session&lt;/span&gt; s1 &lt;span class="nt"&gt;--tab&lt;/span&gt; t17
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These two execute at the same time. There's no "active tab" concept. Each tab is an independent target. The daemon behind the CLI manages 30 CDP connections in parallel, one per tab, each with its own page state and lifecycle.&lt;/p&gt;

&lt;p&gt;That's what makes the batch cycle possible:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Open 30 tabs&lt;/span&gt;
actionbook browser new-tab &lt;span class="s2"&gt;"https://firstround.com/review/post-1"&lt;/span&gt; &lt;span class="nt"&gt;--session&lt;/span&gt; s1
actionbook browser new-tab &lt;span class="s2"&gt;"https://firstround.com/review/post-2"&lt;/span&gt; &lt;span class="nt"&gt;--session&lt;/span&gt; s1
&lt;span class="c"&gt;# ... 28 more&lt;/span&gt;

&lt;span class="c"&gt;# Wait + scrape all 30 in parallel&lt;/span&gt;
actionbook browser wait-idle &lt;span class="nt"&gt;--session&lt;/span&gt; s1 &lt;span class="nt"&gt;--tab&lt;/span&gt; t1
actionbook browser text &lt;span class="nt"&gt;--session&lt;/span&gt; s1 &lt;span class="nt"&gt;--tab&lt;/span&gt; t1
&lt;span class="c"&gt;# ... same for t2 through t30&lt;/span&gt;

&lt;span class="c"&gt;# Close the batch, open next 30&lt;/span&gt;
actionbook browser close-tab &lt;span class="nt"&gt;--session&lt;/span&gt; s1 &lt;span class="nt"&gt;--tab&lt;/span&gt; t1
actionbook browser new-tab &lt;span class="s2"&gt;"https://firstround.com/review/post-31"&lt;/span&gt; &lt;span class="nt"&gt;--session&lt;/span&gt; s1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;End result: 192 websites visited, each one snapshotted and parsed. Company name, URL, and tagline extracted into a single CSV. 7 batches, 1 minute.&lt;/p&gt;

&lt;h2&gt;
  
  
  Beyond scraping: real workflows across tabs
&lt;/h2&gt;

&lt;p&gt;Extracting text is the simplest case. The interesting part is what happens when the agent clicks, fills forms, and navigates across 30 tabs at the same time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Product growth workflows.&lt;/strong&gt; One team uses Actionbook to open dozens of Gmail threads, Twitter mentions, and Reddit posts in parallel. Their agent reads everything at once, cross-references the feedback, and updates their growth funnel doc. What used to be a morning of manual triage now takes minutes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Flight deal hunting.&lt;/strong&gt; Another user built a flight booking agent that opens 5 airline sites simultaneously, searches the same route on all of them, compares prices, and returns the cheapest option. The agent fills in departure, destination, and dates on each site at the same time.&lt;/p&gt;

&lt;p&gt;Actionbook lets your agent control the browser in parallel. This is what agents should have been doing from the start.&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>actionbook</category>
      <category>browserautomation</category>
      <category>aiagents</category>
    </item>
    <item>
      <title>Let OpenClaw Use Your ChatGPT GPT-5.4 Pro Model</title>
      <dc:creator>Senpeng</dc:creator>
      <pubDate>Fri, 03 Apr 2026 06:47:11 +0000</pubDate>
      <link>https://dev.to/senke0x/let-openclaw-use-your-chatgpt-gpt-54-pro-model-3l2l</link>
      <guid>https://dev.to/senke0x/let-openclaw-use-your-chatgpt-gpt-54-pro-model-3l2l</guid>
      <description>&lt;p&gt;I run OpenClaw on my Mac Mini as a personal assistant for daily automation. Recently I wanted to give it one more ability: talk to ChatGPT through my actual browser, not the API.&lt;/p&gt;

&lt;p&gt;Here's what it looks like. I send a message in Telegram, OpenClaw opens ChatGPT in Chrome, sends the prompt, reads the response, and brings it back:&lt;/p&gt;


&lt;div&gt;
  &lt;iframe src="https://loom.com/embed/2a519f9ea0a046c1875a82bd16d1e0d2"&gt;
  &lt;/iframe&gt;
&lt;/div&gt;


&lt;h2&gt;
  
  
  Why Operate ChatGPT This Way
&lt;/h2&gt;

&lt;p&gt;Some ChatGPT models, like GPT-5.4 Pro, are only available through the web interface with a Plus or Pro subscription. The API has its own model list and its own pricing. Going through the browser means OpenClaw gets access to every model I can use, including the ones the API doesn't offer.&lt;/p&gt;

&lt;p&gt;When I chat with OpenClaw through Telegram and need research on something, I just ask. OpenClaw opens ChatGPT in the browser, sends my question, reads the response, and replies back to me in Telegram. I don't switch apps or copy-paste anything, and the conversation stays in my ChatGPT history so I can pick it up later.&lt;/p&gt;

&lt;p&gt;Because it uses my real browser session, OpenClaw can select any model I have access to. If OpenAI drops a new model next week, OpenClaw just picks it from the dropdown the same way I would.&lt;/p&gt;

&lt;h2&gt;
  
  
  How I Set It Up: actionbook CLI + Chrome Extension
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://actionbook.dev?utm_source=devto" rel="noopener noreferrer"&gt;actionbook&lt;/a&gt; is a Browser Action Engine I've been building. It gives AI agents pre-computed "action manuals" for websites, semantic descriptions of what's interactive, so agents don't need to parse raw HTML or guess at selectors.&lt;/p&gt;

&lt;p&gt;The key for OpenClaw: the actionbook CLI connects to your &lt;em&gt;existing&lt;/em&gt; Chrome through a Chrome extension. Not a new browser instance. Your running browser, with its active, logged-in session.&lt;/p&gt;

&lt;p&gt;There's only one manual step: install the &lt;a href="https://chromewebstore.google.com/detail/actionbook/bebchpafpemheedhcdabookaifcijmfo?utm_source=devto" rel="noopener noreferrer"&gt;actionbook Chrome extension&lt;/a&gt; from the Web Store. Everything else, I just told OpenClaw to do it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Install actionbook CLI from https://github.com/actionbook/actionbook
run `actionbook setup`, and pick extension mode for browser connection.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;During setup, the wizard detects your environment: OS, shell, installed browsers. The important step is &lt;strong&gt;Browser Mode&lt;/strong&gt;, where it picks between:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;isolated&lt;/strong&gt;: Launch a dedicated browser (clean environment, no setup needed)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;extension&lt;/strong&gt;: Control your existing Chrome&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OpenClaw picked &lt;strong&gt;extension&lt;/strong&gt; as instructed. That's the whole point. I want to control the Chrome that's already logged into ChatGPT, not spin up a fresh one.&lt;/p&gt;

&lt;h2&gt;
  
  
  How OpenClaw Operates ChatGPT
&lt;/h2&gt;

&lt;p&gt;Once actionbook is set up, I just tell OpenClaw what I need:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Use actionbook to open ChatGPT and ask: 
what are the latest trends in AI agents for 2026? Bring me the answer.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I don't tell OpenClaw which button to click or where the input box is. OpenClaw reads the action manual for ChatGPT, takes a snapshot of the page to understand the current layout, and figures out how to operate it on its own. It fills in the prompt, clicks send, waits for the response to finish streaming, and reads the result back to me.&lt;/p&gt;

&lt;p&gt;Here's what it runs behind the scenes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;actionbook search &lt;span class="s2"&gt;"chatgpt"&lt;/span&gt;
actionbook get chatgpt.com:/:default
actionbook browser snapshot
actionbook browser fill &lt;span class="s2"&gt;"your question here"&lt;/span&gt; &lt;span class="nt"&gt;--ref-id&lt;/span&gt; e3
actionbook browser click &lt;span class="nt"&gt;--ref-id&lt;/span&gt; e4
actionbook browser wait-idle
actionbook browser text
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What Else I Use It For
&lt;/h2&gt;

&lt;p&gt;I also use it for parallel GEO testing. When I need to compare how ChatGPT responds to the same prompt under different contexts, I tell OpenClaw to open multiple tabs and send them all at once.&lt;/p&gt;

&lt;p&gt;And this isn't limited to ChatGPT. OpenClaw can operate any page I'm already logged into. That's what makes it a real agent.&lt;/p&gt;




</description>
      <category>openclaw</category>
      <category>chatgpt</category>
      <category>browser</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
