DEV Community

Septim Labs

I read the X (Twitter) algorithm source for 4 days and built a Claude Code sub-agent that scores drafts before posting

Twitter's recommendation algorithm has been open-source since March 2023. I spent four days reading it. Here's what those four days taught me — and the Claude Code sub-agent I built so you don't have to.

The repos are twitter/the-algorithm (the Scala graph + ranking stack) and twitter/the-algorithm-ml (the Python ML training and scoring code). Together they expose the weights, the signal definitions, and the candidate-source pipelines. They do not expose the live production graph, the user-side embeddings, or the real-time engagement velocity. That asymmetry — what's published vs. what's runtime — is the entire point of this article.

The five signals that actually move distribution

The heavy ranker is the model that scores candidate tweets after retrieval. Its action-weight map is the single most cited table in the repo, defined in the-algorithm-ml/projects/home/recap/README.md and surfaced through the scoring code at the-algorithm-ml/projects/home/recap/model/model_and_loss.py.

| Signal | Weight | What it counts |
| --- | --- | --- |
| reply_engaged_by_author | +75 | The reader replied AND the author replied to that reply |
| reply | +13.5 | The reader replied to the tweet |
| good_profile_click | +12 | Reader clicked the author's profile and engaged on it |
| good_click | +11 | Reader clicked an in-tweet link and dwelled |
| good_click_v2 | +10 | Same as above with stricter dwell threshold |
| retweet | +1 | Reader retweeted |
| fav (like) | +0.5 | Reader liked |
| video_playback50 | +0.005 | Reader watched 50 percent of an attached video |
| negative_feedback_v2 | -74 | Mute, block, "not interested", report |

Source: the-algorithm-ml/projects/home/recap/README.md (the weights table is the canonical reference; the loss function consumes them via the WeightedBCELoss defined in model_and_loss.py).

If you've never opened the file, that's the entire pre-tweet distribution game on one page. A like is worth less than one percent of a reply that the author engages with (0.5 / 75 ≈ 0.67%). A single mute cancels roughly 148 likes (74 / 0.5).
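The arithmetic behind those two ratios can be sketched in a few lines. The weights are copied from the table in this article. Treating the final score as a linear weighted sum of per-action probabilities is an assumption about how the weights are combined, and `weighted_score` is a hypothetical helper name, not a function from the repo.

```python
# Weights copied from the article's table (sourced there to the recap README).
WEIGHTS = {
    "reply_engaged_by_author": 75.0,
    "reply": 13.5,
    "good_profile_click": 12.0,
    "good_click": 11.0,
    "good_click_v2": 10.0,
    "retweet": 1.0,
    "fav": 0.5,
    "video_playback50": 0.005,
    "negative_feedback_v2": -74.0,
}


def weighted_score(probabilities: dict[str, float]) -> float:
    """Sum weight * predicted probability over the actions supplied.

    Assumption: per-action probabilities are combined linearly.
    Actions missing from `probabilities` contribute zero.
    """
    return sum(WEIGHTS[action] * p for action, p in probabilities.items())


# A like relative to an author-engaged reply.
like_vs_engaged_reply = WEIGHTS["fav"] / WEIGHTS["reply_engaged_by_author"]

# How many likes one unit of negative feedback cancels.
likes_per_mute = abs(WEIGHTS["negative_feedback_v2"]) / WEIGHTS["fav"]

print(f"like / author-engaged reply = {like_vs_engaged_reply:.4f}")  # 0.0067
print(f"likes cancelled by one mute = {likes_per_mute:.0f}")         # 148
```

The same dictionary can be reused to compare any two signals; only the two ratios printed here are claims made in the text.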

Two things that jumped out

The +75 weight is asymmetric. reply_engaged_by_author only fires when the author replies to a replier. A reply alone is +13.5. The author's reply on top of the reader's reply carries a weight more than five times larger (75 / 13.5 ≈ 5.6). Translation: bare replies earn a modest reward; conversations the author participates in earn the big one. The fastest legal lever on the algorithm is answering the replies on your own posts within the first hour.

Negative feedback at -74 nearly mirrors the top positive. The system punishes a mute or block almost as hard as it rewards an author-engaged reply. The implication: writing for the median follower of your followers is dangerous. One mute from someone on the edge of your follower-of-follower graph can erase the entire upside from about 148 likes. The honest read of the weights is that being aggressively un-controversial is rewarded more than people assume.
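To make the asymmetry concrete, here is a small expected-value sketch. The four weights come from the published table. The two probability profiles are invented for illustration, and the linear weighted sum is an assumption about how the signals combine, not a quote from the ranker.

```python
# Subset of the published weights relevant to this comparison.
W_FAV = 0.5
W_REPLY = 13.5
W_REPLY_ENGAGED = 75.0
W_NEGATIVE = -74.0


def expected_score(p_fav: float, p_reply: float,
                   p_reply_engaged: float, p_negative: float) -> float:
    """Linear weighted sum over four per-reader probabilities (assumed form)."""
    return (W_FAV * p_fav + W_REPLY * p_reply
            + W_REPLY_ENGAGED * p_reply_engaged + W_NEGATIVE * p_negative)


# Hypothetical draft A: calm, invites replies, author answers some of them.
calm = expected_score(p_fav=0.05, p_reply=0.01,
                      p_reply_engaged=0.005, p_negative=0.001)

# Hypothetical draft B: spicier, twice the likes, but ten times the mutes.
spicy = expected_score(p_fav=0.10, p_reply=0.01,
                       p_reply_engaged=0.005, p_negative=0.010)

print(f"calm  = {calm:.3f}")   # 0.461
print(f"spicy = {spicy:.3f}")  # -0.180
```

With these made-up numbers, doubling the like rate adds 0.025 to the score while the extra mute risk subtracts 0.666, which is the shape of the trade-off the weights imply.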

What the source can — and can't — tell you

I want to be explicit about this because the "I read the algorithm" genre is full of overclaiming.

A pre-post scorer that reads only the draft text and media CAN evaluate:

  • Reply pull — does the post end with a question, a wrong-on-purpose claim, an unfinished thought, or a stance worth pushing back on
  • Conversation depth — is the topic one where the author can plausibly reply to 5-15 replies (the lever for reply_engaged_by_author)
  • Negative-feedback risk — does the draft contain mute-triggers (engagement-bait phrasing, lecture tone, identity attacks, low-effort dunks)
  • Media + dwell — does the post include an image, native video, or chart (video_playback50 is microscopic per unit but compounds with dwell on good_click_v2)
  • Link tax — out-links suppress reach in the candidate-retrieval stage; the draft either eats that cost or it doesn't
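A few of the checks in this list can be approximated with plain text heuristics. The sketch below is hypothetical: the `DraftSignals` fields, the phrase list, and the URL regex are illustrative stand-ins, not Pulse's actual rubric, and conversation depth is left out because it needs judgment rather than pattern matching.

```python
import re
from dataclasses import dataclass

# Hypothetical phrase list; a real mute-trigger list would be longer and tuned.
BAIT_PHRASES = ("like if you agree", "retweet if", "follow for more")

URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


@dataclass
class DraftSignals:
    ends_with_question: bool   # crude proxy for reply pull
    has_out_link: bool         # proxy for link tax
    has_media: bool            # proxy for media + dwell
    bait_hits: int             # crude proxy for negative-feedback risk


def extract_signals(text: str, media_count: int = 0) -> DraftSignals:
    """Derive text-only signals from a draft. Heuristics, not a ranker."""
    lowered = text.lower()
    return DraftSignals(
        ends_with_question=text.rstrip().endswith("?"),
        has_out_link=bool(URL_PATTERN.search(text)),
        has_media=media_count > 0,
        bait_hits=sum(phrase in lowered for phrase in BAIT_PHRASES),
    )


draft = "Likes are worth 0.5 and a mute is -74. What are you optimizing for?"
print(extract_signals(draft))
```

Everything here is computed from the draft alone, which is the boundary the list draws: nothing in `DraftSignals` depends on who reads the post or what happens after publishing.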

A pre-post scorer CANNOT evaluate:

  • SimClusters: reader-side topic embeddings (the-algorithm/src/scala/com/twitter/simclusters_v2). These are computed from the reader's behavior, not the post.
  • Real-Graph weights: author-side, per-follower edge weights (the-algorithm/src/scala/com/twitter/interaction_graph). Fixed at post time and unknowable from the text.
  • TweepCred: account-quality score (the-algorithm/src/scala/com/twitter/graph/batch/job/tweepcred). Also fixed at post time.
  • In-network vs. out-of-network reach: depends on retrieval, not ranking.
  • Real-time engagement velocity: the first 30 minutes shape the rest of the distribution and only exist after publish.

The honesty is the moat. Any scorer that claims to measure SimClusters from a draft is lying. Any scorer that claims to predict reach is lying. What a scorer can do is grade the draft against the five published levers a writer actually controls. That's it. That's the product.

The Claude Code sub-agent (Pulse)

I built Pulse as a single markdown file at ~/.claude/agents/pulse.md. About 6,000 characters. It's a Claude Code sub-agent: one file, no install, no API key, no SaaS.

Inside Claude Code:

/agents pulse score this draft: "the +75 weight in twitter's heavy ranker only activates when the author replies to a replier. likes are worth 0.5. one mute is -74. you are not writing for likes."

It returns a 0-100 score, a per-axis breakdown, a ship/rewrite decision, and 3-5 concrete rewrites, each tied to the axis it lifts.

The sub-agent format is documented at docs.claude.com — Claude Code reads files in ~/.claude/agents/ and exposes them via the /agents command. No background process. No telemetry. The rubric and prompt are in the file; you can read them.
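For readers who have not seen the format, a sub-agent file is a markdown file with a short YAML frontmatter block followed by the prompt. The sketch below renders a skeleton of that shape. The `name` and `description` frontmatter keys follow the Claude Code sub-agent documentation; the description text, the prompt body, and the `render_agent_file` helper are hypothetical placeholders, not the contents of the actual Pulse file.

```python
from pathlib import Path


def render_agent_file(name: str, description: str, prompt: str) -> str:
    """Return the text of a sub-agent markdown file: frontmatter, then prompt."""
    return (
        "---\n"
        f"name: {name}\n"
        f"description: {description}\n"
        "---\n\n"
        f"{prompt.strip()}\n"
    )


# Placeholder content only; the real Pulse prompt and rubric are not shown here.
text = render_agent_file(
    name="pulse",
    description="Scores a draft post against a five-axis rubric before posting.",
    prompt="You grade drafts on five axes and return a 0-100 score.",
)

# Where the article says the file lives; nothing is written unless you uncomment.
target = Path.home() / ".claude" / "agents" / "pulse.md"
# target.parent.mkdir(parents=True, exist_ok=True)
# target.write_text(text, encoding="utf-8")
print(text)
```

Because the whole agent is one readable text file, auditing it is a matter of opening it in an editor.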

The rubric

Five axes. The weights were chosen to mirror the relative impact of the published signals, not to be precise about them.

| Axis | Weight | Maps to |
| --- | --- | --- |
| Reply pull | 35% | reply (+13.5) and the trigger for reply_engaged_by_author (+75) |
| Conversation depth | 25% | The author-side lever on reply_engaged_by_author |
| Negative-feedback risk | 17% | negative_feedback_v2 (-74) |
| Media + dwell | 15% | video_playback50, good_click, good_click_v2 |
| Link tax | 8% | Candidate-retrieval penalty on out-links (suppression behavior documented in the-algorithm/src/scala/com/twitter/timelineranker) |

Each axis maps to a file path in the public repos. The agent cites them in its output, so a reviewer can verify any individual claim against the source.
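One plausible way to turn the rubric into a single number is a weighted average of per-axis grades. This is a sketch under stated assumptions: the axis weights are the ones in the rubric table, but the per-axis grades, the "higher is always better" convention, and the `overall_score` helper are hypothetical, since the article does not spell out how Pulse combines the axes internally.

```python
# Axis weights from the rubric table, as fractions of the total score.
AXIS_WEIGHTS = {
    "reply_pull": 0.35,
    "conversation_depth": 0.25,
    "negative_feedback_risk": 0.17,
    "media_dwell": 0.15,
    "link_tax": 0.08,
}


def overall_score(axis_scores: dict[str, float]) -> float:
    """Weighted average of per-axis scores, each on a 0-100 scale.

    Assumption: every axis is scored so that higher is better, which means
    a risky draft gets a LOW negative_feedback_risk score.
    """
    missing = AXIS_WEIGHTS.keys() - axis_scores.keys()
    if missing:
        raise ValueError(f"missing axis scores: {sorted(missing)}")
    return sum(AXIS_WEIGHTS[axis] * axis_scores[axis] for axis in AXIS_WEIGHTS)


# Hypothetical per-axis grades for one draft.
example = {
    "reply_pull": 80,
    "conversation_depth": 70,
    "negative_feedback_risk": 90,
    "media_dwell": 40,
    "link_tax": 100,
}
print(round(overall_score(example), 1))  # 74.8
```

The five weights sum to 100%, so a draft graded 100 on every axis scores 100 overall, and reply pull plus conversation depth alone account for 60 of those points.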

Calibration

I ran Pulse against five posts I'd previously shipped with known performance. Four of them had a clear outcome, and all four predictions matched: the winners scored 71+ and the flops scored under 50. The fifth post was a wash (borderline score, borderline performance), so I'm not counting it as a hit.

No false positives (no flop that scored high). No false negatives (no winner that scored low). Five posts is not a calibration set; it's a sanity check. The honest claim is that Pulse's failure modes haven't surfaced yet on a sample that small.
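The bookkeeping behind a sanity check like this is simple enough to sketch. The two thresholds are the ones quoted above; the `predict` and `tally` helpers are hypothetical, and the sample list is made-up placeholder data, NOT the five posts from this article.

```python
from collections import Counter

HIT_THRESHOLD = 71    # winners in the article scored 71 or higher
FLOP_THRESHOLD = 50   # flops in the article scored under 50


def predict(score: float) -> str:
    """Map a 0-100 score to a prediction, leaving a borderline band."""
    if score >= HIT_THRESHOLD:
        return "hit"
    if score < FLOP_THRESHOLD:
        return "flop"
    return "borderline"


def tally(results: list[tuple[float, str]]) -> Counter:
    """Count matches, mismatches, and skipped borderline cases.

    Each item is (score, observed outcome), where the outcome is
    "hit", "flop", or "borderline".
    """
    counts = Counter()
    for score, outcome in results:
        predicted = predict(score)
        if "borderline" in (predicted, outcome):
            counts["skipped"] += 1
        elif predicted == outcome:
            counts["match"] += 1
        elif predicted == "hit":
            counts["false_positive"] += 1
        else:
            counts["false_negative"] += 1
    return counts


# Made-up placeholder data, NOT the five posts from the article.
sample = [(82, "hit"), (74, "hit"), (45, "flop"), (38, "flop"), (60, "borderline")]
print(dict(tally(sample)))  # {'match': 4, 'skipped': 1}
```

Keeping a separate "skipped" bucket makes the borderline band explicit, so an ambiguous case is neither counted as a win for the scorer nor hidden.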

The honest caveat

The Twitter algorithm repos were last meaningfully committed in early 2024. X's production ranker has almost certainly drifted since then — new signals, retuned weights, retired branches. Pulse scores against the published weights, not the live ranker. That's a directional tool, not a promissory one.

The argument for using it anyway: the five levers the public source exposes — reply pull, conversation depth, negative-feedback risk, media, link tax — are structural to how recommender systems work. The exact weights drift; the topology of what's worth measuring doesn't. A 2026 production ranker that no longer weights reply_engaged_by_author at exactly +75 is almost certainly still rewarding author-engaged replies more than likes. Pulse grades the structure, not the magic number.

If that's not enough certainty for you, don't buy it. The source is free. Read it yourself.

Where to get it

The studio shipped Pulse as a single Claude Code sub-agent file. Founding-week pricing is $49 through Monday 2026-05-26, then $79. The team license is $999 for unlimited seats.

Page: septimlabs.com/pulse

Open-source sample of the agent format and a redacted slice of the rubric: github.com/septimlabs-code/septim-agents-pack-sample

For context on how the studio builds sub-agents in general, the catalog is at septimlabs.com and the broader sub-agents pack is at septimlabs.com/agents-pack.

Cold close

Read the source. The +75 isn't a secret, just unread.

If you write on X and you've never opened the-algorithm-ml/projects/home/recap/README.md, you are guessing at a game where the rulebook is public. Spend four days. Or spend $49 and skip to the part where a sub-agent grades your drafts against what the rulebook actually says.

Either path is fine. Both end at the same place: stop writing for likes.
