DEV Community

foxck016077
An Apify Actor for Gmail inbox analytics: refresh-token-only OAuth, async router, per-feature quota

I just open-sourced an Apify Actor for Gmail inbox workflow analytics: apify-gmail-inbox-intel. It is not a scraper and not a bulk sender; it is an inbox analytics tool built on the gmail.readonly scope. This post is a design tour, not a tutorial.

If you have ever asked "which client thread did I forget to reply to?" or "what is my average reply turnaround?", this is the kind of workflow it covers.

Why an Apify Actor

I needed three things at once: serverless runtime, pay-per-result billing, and a real input schema. Apify gives me all of them without writing a backend. I get a hosted endpoint, dataset storage, a key-value store for state, and a developer audience that is already paying for actors.

The actor exposes four features through a single entrypoint:

  • thread_search — query Gmail threads by q, paginate, return metadata + message counts
  • reply_metrics — for each thread, compute reply-from-me, reply-from-others, last-reply age, SLA breach flag
  • summarizer — optional OpenAI LLM thread summary (BYO API key)
  • unread_digest — list unread threads in the last N hours, grouped by label
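Concretely, a run input might look like this. A sketch only: the field names other than feature are my assumptions about the schema, not copied from the Actor:

```json
{
  "feature": "reply_metrics",
  "oauth_token": {
    "refresh_token": "…",
    "client_id": "…",
    "client_secret": "…"
  },
  "q": "from:acme.com newer_than:30d",
  "sla_hours": 24
}
```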

Design decision 1: refresh-token-only OAuth

The hardest call early on was OAuth. Two paths:

  1. 3-legged OAuth on the Actor side — Actor hosts callback URL, exchanges code, stores tokens.
  2. Refresh-token-only — user does the OAuth dance once on their own, hands me {refresh_token, client_id, client_secret} as Actor input.

I picked option 2. Reasons:

  • Apify Actors do not have a stable HTTPS callback URL per user. Each run is a job, not a server.
  • "We never store your Gmail tokens" is a far easier privacy story to defend.
  • I do not want to be the holder-of-secrets for someone else's mailbox.

In the Actor, the flow is:

# src/gmail_client.py — sketch
import httpx

httpx_client = httpx.AsyncClient()

async def get_access_token(oauth_token: dict) -> str:
    # exchange the long-lived refresh token for a short-lived access token
    resp = await httpx_client.post(
        "https://oauth2.googleapis.com/token",
        data={
            "grant_type": "refresh_token",
            "refresh_token": oauth_token["refresh_token"],
            "client_id": oauth_token["client_id"],
            "client_secret": oauth_token["client_secret"],
        },
    )
    resp.raise_for_status()  # a revoked refresh token fails loudly here
    return resp.json()["access_token"]

The access token lives in memory only. Job end → process tears down → token gone. Best effort, but nothing in my code path ever writes it to Apify storage.

Design decision 2: one async router, not four actors

Tempting to split into four actors. I did not, for two reasons:

  • Marketing surface area. One actor with four feature enum values gets one Store page, one rating, one review pile. Four actors split everything four ways.
  • Shared OAuth + shared quota. The token exchange, error handling, mask helpers, KVS quota — all reusable.
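Those mask helpers, for instance, can be a few shared lines. The name and shape here are my assumption of what such a helper looks like, not the repo's code:

```python
def mask(secret: str, keep: int = 4) -> str:
    """Redact a credential for logs, keeping only the last few characters."""
    if len(secret) <= keep:
        return "…"
    return "…" + secret[-keep:]
```

So a client secret shows up in run logs as something like "…x9Qz" instead of the full value.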

src/main.py is just a router:

from apify import Actor

from . import digest, reply_metrics, summarizer, thread_search

FEATURES = {
    "thread_search": thread_search.run,
    "reply_metrics": reply_metrics.run,
    "summarizer": summarizer.run,
    "unread_digest": digest.run,
}

async def main() -> None:
    actor_input = await Actor.get_input() or {}
    feature = actor_input.get("feature")
    if feature not in FEATURES:
        raise ValueError(f"Unknown feature: {feature!r}")
    # dispatch to the selected feature module
    await FEATURES[feature](actor_input)

All four features share a single INPUT_SCHEMA.json; the feature enum in that schema decides which handler runs, and each handler validates its own feature-specific fields downstream.
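For reference, the feature part of such a shared schema, sketched in Apify's INPUT_SCHEMA.json format (trimmed to the enum; titles are illustrative, not the Actor's exact file):

```json
{
  "title": "Gmail inbox intel input",
  "type": "object",
  "schemaVersion": 1,
  "properties": {
    "feature": {
      "title": "Feature",
      "type": "string",
      "enum": ["thread_search", "reply_metrics", "summarizer", "unread_digest"],
      "editor": "select"
    }
  },
  "required": ["feature"]
}
```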

Design decision 3: quota lives in Apify KVS

Free tier is 100 threads / month. That counter has to survive across runs. Apify KeyValueStore is the obvious home — no extra DB, persistent, scoped to the Actor.

# src/quota.py — sketch
FREE_LIMIT = 100  # free tier: 100 threads / month

async def check_and_increment(user_id: str, feature: str, n: int) -> None:
    kvs = await Actor.open_key_value_store()
    # month_key() -> "YYYY-MM" string, so the counter resets with the month
    key = f"quota/{user_id}/{month_key()}/{feature}"
    used = (await kvs.get_value(key)) or 0
    if used + n > FREE_LIMIT:
        raise QuotaExceeded(feature, used, FREE_LIMIT)
    await kvs.set_value(key, used + n)

Month roll-over is a string key by year-month — no cron, no migration, no drift. Pro tier flips a flag and skips the check entirely.

Tests

Six pytest tests, asyncio_mode = auto in pytest.ini. Coverage:

  • Router rejects unknown feature
  • Each of the four features short-circuits cleanly with dry_run=True
  • Quota raises after limit, allows under
[pytest]
asyncio_mode = auto

That tiny config line is the difference between "6 tests pass" and "6 tests error: missing event loop". Learned it the hard way.

Pricing model

  • Free: 100 threads / month
  • Pro: $19 / month (5000 threads metadata + 100 LLM summaries)
  • Pay-per-result add-on: $0.50 / 1,000 thread metadata, $0.005 / summary
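As a sanity check on the add-on math (constants copied from the list above; the function is just illustrative arithmetic, not the billing code):

```python
PRO_INCLUDED_THREADS = 5000
ADDON_PER_1000_THREADS = 0.50  # dollars
ADDON_PER_SUMMARY = 0.005      # dollars

def monthly_overage(threads: int, extra_summaries: int = 0) -> float:
    # threads beyond the Pro allowance are billed per 1,000
    extra_threads = max(0, threads - PRO_INCLUDED_THREADS)
    return (extra_threads / 1000 * ADDON_PER_1000_THREADS
            + extra_summaries * ADDON_PER_SUMMARY)
```

For example, 7,000 threads plus 40 extra summaries in a month would add about $1.20 on top of the flat $19.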

Apify handles billing. I handle code.

What I would do differently

  • Webhook trigger — right now unread_digest runs on demand. A scheduled trigger + Slack/Discord delivery is the obvious next product.
  • Label-level rules — reply_metrics is global. A per-label SLA matrix would be more useful for sales teams.
  • Multi-account fan-out — one run, multiple OAuth tokens, one combined dataset.

Code

If you build automation workflows alongside this kind of inbox tooling, I keep a small Gumroad with practical n8n templates (lead auto-responder, content pipeline, competitor monitor): https://foxck.gumroad.com. Not required, just adjacent.

Happy to take feedback on the OAuth-only design — was there a reason to go full 3-legged that I am missing?
