⚠️ This article contains affiliate advertising (promotions). A portion of revenue generated through linked sites is paid to the author, but this does not affect the purchase price for readers in any way.
Hey, I'm a working engineer running a side hustle in tech writing and e-commerce. Here's the bottom line upfront: by the time you finish this article, you'll have a Python script that picks "side-hustle promo noise" out of your X (Twitter) timeline and logs the accounts behind it as mute candidates, plus a Claude Haiku classifier that labels each tweet as signal or noise for roughly ¥0.02 per tweet. Both are copy-paste ready; just swap in your API keys. My own information-gathering time dropped from 90 minutes to 12 minutes a day (7-day average; details below).
Why Manual Muting Breaks Down on X: The 30-Item Wall
Muting on X via the GUI is one entry at a time. In my case, the 480 accounts I follow are roughly 70% genuinely useful and 30% promotional, which makes many of them "almost good" accounts. Muting at the account level kills the useful tweets too. So I turned to keyword muting, which becomes unmanageable by hand past 30–40 keywords. Add "free," "limited time," "LINE sign-up," and "#RT please" to the list and you start catching legitimate tech tweets as collateral damage.
One failure story: early on I added "side hustle" as a mute keyword, missed an entire high-quality thread squarely in my interest zone, missed the viral wave, and conservatively lost about ¥3,000 in affiliate opportunity. Word filters don't have the precision. That's the starting point for this article.
Extracting "Promo Templates" Mechanically with Tweepy and Filter Rules
First, using the X API v2 (read access is available even on the free tier) and Tweepy, I pull tweets equivalent to my home timeline and numerically score structural features common in promotional content. The trick is to score on three axes — emoji density, URL count, and call-to-action verbs — rather than keyword matching.
import os
import re

import tweepy

# get_home_timeline is a user-context endpoint, so an app-only bearer token is
# not enough; pass OAuth 1.0a user credentials instead.
# The environment variable names below are placeholders.
client = tweepy.Client(
    consumer_key=os.environ["X_API_KEY"],
    consumer_secret=os.environ["X_API_SECRET"],
    access_token=os.environ["X_ACCESS_TOKEN"],
    access_token_secret=os.environ["X_ACCESS_TOKEN_SECRET"],
)

# Call-to-action terms: sign up, follow, giveaway, handout, limited, free, reply, DM
NUDGE_VERBS = ["登録", "フォロー", "プレゼント", "配布", "限定", "無料", "リプ", "DM"]
EMOJI = re.compile(r"[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def spam_score(text: str) -> float:
    emoji_n = len(EMOJI.findall(text))
    url_n = len(re.findall(r"https?://", text))
    nudge_n = sum(text.count(v) for v in NUDGE_VERBS)
    length = max(len(text), 1)
    # Weighted sum of emoji density, URL count, and nudge verbs, capped at 1.0
    score = (emoji_n / length) * 3 + url_n * 0.25 + nudge_n * 0.2
    return round(min(score, 1.0), 3)


# Fetch the home timeline and score the most recent posts
tl = client.get_home_timeline(max_results=50, tweet_fields=["text", "author_id"])
for tw in tl.data or []:
    s = spam_score(tw.text)
    flag = "NOISE" if s >= 0.35 else "keep"
    print(f"[{flag}] score={s} :: {tw.text[:40]}")
On my own data (1,200 recent tweets), a threshold of 0.35 captures 87% of promo templates with 9% false positives. That 9% false-positive rate stings. Tech influencers use a lot of emojis, so filtering on structure alone kills good tweets. That's why a second stage has Claude judge what each tweet actually means.
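If you want to check a threshold like 0.35 against your own timeline, one way is to hand-label a small sample and compute both rates. The following is a minimal sketch under assumptions: the function name `evaluate_threshold` and the sample values are hypothetical, and "false positives" is taken here to mean the share of non-promo tweets flagged as NOISE, which may not be exactly how the figures above were computed.

```python
from typing import Iterable, Tuple


def evaluate_threshold(
    labeled: Iterable[Tuple[float, bool]], threshold: float
) -> Tuple[float, float]:
    """Return (capture_rate, false_positive_rate) for one threshold.

    labeled: (spam_score, is_promo) pairs from a hand-labeled sample.
    capture_rate: share of promo tweets scoring at or above the threshold.
    false_positive_rate: share of non-promo tweets scoring at or above it.
    """
    promo_total = promo_hit = clean_total = clean_hit = 0
    for score, is_promo in labeled:
        flagged = score >= threshold
        if is_promo:
            promo_total += 1
            promo_hit += flagged
        else:
            clean_total += 1
            clean_hit += flagged
    capture = promo_hit / promo_total if promo_total else 0.0
    false_pos = clean_hit / clean_total if clean_total else 0.0
    return capture, false_pos


# Made-up scores for illustration only, not the measurements quoted above
sample = [(0.9, True), (0.5, True), (0.2, True), (0.4, False), (0.1, False), (0.0, False)]
for th in (0.25, 0.35, 0.45):
    capture, false_pos = evaluate_threshold(sample, th)
    print(f"threshold={th:.2f} capture={capture:.0%} false_pos={false_pos:.0%}")
```

Sweeping a few thresholds this way makes the trade-off visible before you commit to one value.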
Why I Chose Claude Haiku Over GPT: ~¥200 per 10,000 Tweets
For pure classification, GPT-4o-mini would work — but I went with Claude Haiku 4.5 for cost and Japanese nuance. Each tweet runs roughly ¥0.02 in combined input/output tokens; processing 10,000 tweets a month comes to about ¥200. If you only pass the "borderline" tweets that survive the first-stage score filter to the LLM, measured API spend drops to the ¥50-per-month range.
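The arithmetic behind those figures can be sketched as follows. The function name `monthly_cost_yen` is hypothetical, and the 25% share is not a measured value; it is simply what the ¥200 and ¥50 figures imply when taken together.

```python
def monthly_cost_yen(tweets: int, yen_per_tweet: float, llm_share: float = 1.0) -> float:
    """Estimated monthly LLM spend when only llm_share of tweets reach the model."""
    if not 0.0 <= llm_share <= 1.0:
        raise ValueError("llm_share must be between 0 and 1")
    return tweets * llm_share * yen_per_tweet


# Every tweet classified, using the per-tweet figure quoted above
full = monthly_cost_yen(10_000, 0.02)
# Only borderline tweets classified (assumed share: 25%)
borderline_only = monthly_cost_yen(10_000, 0.02, llm_share=0.25)
print(f"all tweets: ~¥{full:.0f}/month, borderline only: ~¥{borderline_only:.0f}/month")
```

Plug in your own per-tweet cost and the share of tweets your first stage lets through to get a number for your timeline.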
import json
import os

from anthropic import Anthropic

# Named "claude" so it does not shadow the Tweepy client when both blocks share a file
claude = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

# The literal braces in the JSON example are doubled so str.format leaves them alone
PROMPT = """You are a classifier helping a side-hustle account operator gather information.
Classify the following tweet as either SIGNAL or NOISE, with a reason in 15 characters or fewer.
- SIGNAL: Technical insight, real revenue data, primary source, concrete how-to
- NOISE: Hollow promotion, info-product funnel, motivational fluff, freebie bait
Output JSON only: {{"label": "...", "reason": "..."}}
Tweet: ===\n{tweet}\n==="""


def classify(tweet: str) -> dict:
    msg = claude.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=80,
        messages=[{"role": "user", "content": PROMPT.format(tweet=tweet)}],
    )
    # Raises json.JSONDecodeError if the model returns anything other than bare JSON
    return json.loads(msg.content[0].text)


samples = [
    # "[Totally free] Side-hustle manual handed out, now only! Reply to get it, follow required"
    "【完全無料】今だけ副業マニュアル配布!リプで受け取り🔥フォロー必須",
    # "X API v2 rate limit: once per 15 minutes on the free tier. Filtered stream had already gone paid"
    "X API v2のレート制限、無料枠だと15分で1回。filtered streamは有料化済みでした",
]
for t in samples:
    print(classify(t), "::", t[:30])
Running this, the first sample returns {"label": "NOISE", "reason": "freebie bait"} and the second returns {"label": "SIGNAL", "reason": "primary API limit info"}. The semantic stage catches clever info-product tweets that the spam score alone missed. In my testing, the two-stage setup brought false positives from 9% down to 1.8%.
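One way to wire the two stages together is to call the LLM only for scores in a band around the threshold. This is a sketch under assumptions rather than my exact production code: the band edges `LOW` and `HIGH` and the function name `route` are hypothetical, and the two stand-in functions exist only so the example runs without API keys. In practice you would pass `spam_score` and `classify` from the blocks above.

```python
from typing import Callable, Dict, List

LOW, HIGH = 0.20, 0.60  # hypothetical band edges around the 0.35 threshold


def route(
    text: str,
    score_fn: Callable[[str], float],
    classify_fn: Callable[[str], Dict[str, str]],
    low: float = LOW,
    high: float = HIGH,
) -> str:
    """Label one tweet as "NOISE" or "keep", calling the LLM only inside the band."""
    score = score_fn(text)
    if score < low:
        return "keep"   # clearly clean: skip the LLM
    if score >= high:
        return "NOISE"  # clearly promotional: skip the LLM
    # Borderline score: let the semantic classifier decide
    label = classify_fn(text).get("label", "")
    return "NOISE" if label == "NOISE" else "keep"


# Stand-ins for spam_score and classify, for demonstration only
FAKE_SCORES = {"clean tweet": 0.05, "borderline tweet": 0.40, "spam tweet": 0.90}
llm_calls: List[str] = []


def fake_score(text: str) -> float:
    return FAKE_SCORES[text]


def fake_classify(text: str) -> Dict[str, str]:
    llm_calls.append(text)
    return {"label": "SIGNAL", "reason": "stub"}


for t in FAKE_SCORES:
    print(t, "->", route(t, fake_score, fake_classify))
print("LLM calls:", len(llm_calls))
```

A band covers both failure modes: flagged tweets near the threshold get a second opinion, which is where the false positives live, and so do unflagged tweets that scored just under it.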
Running It Every Morning at 7 AM with GitHub Actions and Accumulating Results in a CSV
This is the core of a low-maintenance workflow. Running it locally every day doesn't stick, so I added a schedule trigger with cron: '0 22 * * *' (22:00 UTC, which is 7 AM JST) to .github/workflows/mute.yml and append the author_id of every NOISE-flagged tweet to mute_candidates.csv. Once a week I skim the CSV and actually mute only accounts that have been flagged NOISE three or more times. This lets me make the call on "useful but spammy" accounts based on data rather than gut feeling.
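The append-and-count step could look roughly like this. It is a minimal sketch, assuming a two-column layout (date, author_id) for mute_candidates.csv; the column layout and the function names `append_candidates` and `repeat_offenders` are hypothetical.

```python
import csv
from collections import Counter
from datetime import date
from pathlib import Path
from typing import Iterable, List


def append_candidates(path: Path, author_ids: Iterable[str], day: date) -> None:
    """Append one row per NOISE-flagged tweet: ISO date, author_id."""
    new_file = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["date", "author_id"])  # header on first write only
        for author_id in author_ids:
            writer.writerow([day.isoformat(), author_id])


def repeat_offenders(path: Path, min_flags: int = 3) -> List[str]:
    """Return author_ids flagged at least min_flags times, most flagged first."""
    with path.open(newline="", encoding="utf-8") as f:
        counts = Counter(row["author_id"] for row in csv.DictReader(f))
    return [author for author, n in counts.most_common() if n >= min_flags]
```

Call `append_candidates(Path("mute_candidates.csv"), flagged_ids, date.today())` at the end of each run and `repeat_offenders(Path("mute_candidates.csv"))` during the weekly review. Keep in mind that a GitHub Actions runner starts from a fresh checkout each time, so the CSV only accumulates if the workflow commits it back to the repository or stores it somewhere persistent.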
Here are my measured logs from week two (from my own time-tracking app):
- Before: ~90 min/day browsing the timeline (including ~40 min scrolling past promo tweets)
- After (7-day average): 12 min/day. Average of 63 NOISE tweets auto-filtered daily
- Downstream effect: Used the reclaimed time to write tech articles; saw my first affiliate conversion on day 4 (details below)
One more failure story: I mixed up UTC and JST for the cron schedule and spent the first three days running at 4 AM local time — while I was asleep, everything hit the API rate limit and failed silently. With the free X API tier, even reads are tightly capped, so trimming max_results and running once a day as a batch job is the realistic approach.
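To avoid repeating that mix-up, you can derive the UTC cron expression from the JST time instead of doing the subtraction in your head. A small sketch using only the standard library; the function name `utc_cron_for_jst` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=9), "JST")  # fixed offset; Japan has no daylight saving time


def utc_cron_for_jst(hour: int, minute: int = 0) -> str:
    """Return a daily cron expression in UTC for a given JST wall-clock time."""
    local = datetime(2024, 1, 1, hour, minute, tzinfo=JST)  # any date works for a fixed offset
    utc = local.astimezone(timezone.utc)
    return f"{utc.minute} {utc.hour} * * *"


print(utc_cron_for_jst(7))  # 0 22 * * *
```

Paste the printed expression into the workflow file, and check the first scheduled run's timestamp in the Actions tab to confirm it fired when you expected.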
Reinvesting the Freed 78 Minutes: Where the Side Hustle Time Goes
To be honest, cutting information-gathering time doesn't generate a single yen on its own. The value lies in redirecting those 78 minutes toward output. I used them to bump my tech article cadence to three posts a week and cleaned up the CTAs at the end of each article — and affiliate conversions started moving in the first month.
What also quietly compounds: not letting the side-hustle income sit idle. I apply the same "automate it" mindset I used for muting — side-hustle earnings flow automatically into a brokerage automatic investment account. Zero fees, ¥100 minimum, so even a few thousand yen a month gets onto the compounding track. People who automate their information intake tend to benefit most from automating where their money goes too. If you don't have an account yet, opening a free account at SBI Securities or Rakuten Securities (no annual fee, no maintenance fee) lets you consolidate your side-hustle deposit account and auto-investment in one place.
My step-by-step guide for account setup and auto-investment configuration (A8 tracking link): https://example-a8.net/affiliate-securities ※ Replace with your own verified tracking URL in production.
The Fastest Path to Running This in 30 Minutes (3 Steps)
- pip install tweepy anthropic → set your X API credentials and ANTHROPIC_API_KEY as environment variables
- Use spam_score as a coarse pre-filter; pass only tweets that score below 0.35 but still look suspicious to classify. This is what keeps API spend in the ¥50/month range
- GitHub Actions cron runs on UTC, so for 7 AM JST use 0 22 * * *. For the first week, manually eyeball the CSV to verify your false-positive rate
What determines side-hustle revenue isn't silencing noise — it's what you produce with the time you clear. For me that was tech articles and automatic investing. Start by running the two code blocks above and counting how many NOISE tweets are in your own timeline. If this was useful, let me know with a like — next up I'll write the prompt-engineering edition for improving classification accuracy without fine-tuning.
🛠 Related Links (Author's Work)
For those who want to put Claude / GitHub Actions development automation like this article to work in their own environment immediately:
- AI development automation kit & prompt collection (copy-paste-ready configs and CLAUDE.md examples) → https://itsuya.gumroad.com/l/agentrules260619
- Free tool suite for instantly resolving dev errors — DevToolBox → https://1280itsuya.github.io/devtools/
※ Links to the author's own products and sites (includes promotions).