greymoth

I built a Reddit reply-bot to find posts worth answering. Then I deleted the part that posts.

I'm a solo dev trying to get the first real users for some open-source things I've shipped. Everyone says the same thing: go where your users already are, be useful, don't spam. For a developer that mostly means Reddit. So I did the obvious engineer move and started automating it.

Here's what I built, what I measured, and the one feature I ripped out on purpose.

The build

A small Node script, zero dependencies. It pulls the public /new.rss feed from ~14 subreddits, scores each post for how relevant it is to what I actually know (AI agents, Claude Code, OSS, i18n/Japan-market stuff), dedupes against what it's already seen, and ranks the top ones with a suggested angle for a reply. No API key, no login. Just RSS and a scoring function.
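A minimal sketch of what the score, dedupe, and rank step could look like, assuming the feed has already been parsed into plain objects with `id`, `title`, and `body` fields. The phrases and weights in `TOPIC_WEIGHTS`, and the names `relevanceScore` and `rankNewPosts`, are placeholders for illustration, not the real values from the script.

```javascript
// Hypothetical topic weights; the real keyword list and values are not published.
const TOPIC_WEIGHTS = {
  "claude code": 3,
  "ai agent": 2,
  "i18n": 2,
  "open source": 1,
};

// Sum the weight of every topic phrase that appears in the title or body.
function relevanceScore(post) {
  const text = `${post.title} ${post.body ?? ""}`.toLowerCase();
  let score = 0;
  for (const [phrase, weight] of Object.entries(TOPIC_WEIGHTS)) {
    if (text.includes(phrase)) score += weight;
  }
  return score;
}

// Skip posts whose id is already in seenIds, score the rest,
// and return the topN highest scorers with a score above zero.
// seenIds is mutated so the next run skips everything seen here.
function rankNewPosts(posts, seenIds, topN = 5) {
  const fresh = [];
  for (const post of posts) {
    if (seenIds.has(post.id)) continue;
    seenIds.add(post.id);
    fresh.push({ ...post, score: relevanceScore(post) });
  }
  return fresh
    .filter((post) => post.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topN);
}
```

Persisting `seenIds` between runs (a JSON file is enough) is what keeps the same post from resurfacing every time the script fires.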

First real run, honest numbers: Reddit rate-limits non-browser RSS hard. My naive pass got 2 of 14 subs before everything started returning 429. After I added backoff that respects Retry-After, a retry recovered 7 of 14. From those it pulled 175 posts and scored 49 as "relevant."
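For anyone who hits the same 429 wall, here is a rough sketch of backoff that honors Retry-After, assuming Node 18+ for the global `fetch`. The function names, the User-Agent string, and the fallback timings are all made up for illustration. The header can arrive either as a number of seconds or as an HTTP date, so the helper handles both.

```javascript
// Convert a Retry-After value (delta-seconds or an HTTP date) to milliseconds.
// Returns null when the header is missing or cannot be parsed.
function retryAfterMs(headerValue, nowMs = Date.now()) {
  if (!headerValue) return null;
  const trimmed = headerValue.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  return Number.isNaN(dateMs) ? null : Math.max(0, dateMs - nowMs);
}

// Exponential fallback for when the server sends no usable Retry-After.
// Base and cap are arbitrary illustrative values.
function fallbackDelayMs(attempt, baseMs = 2000, capMs = 60000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Fetch one feed as text, waiting and retrying only on HTTP 429.
async function fetchFeed(url, { maxAttempts = 4, userAgent = "post-radar/0.1" } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetch(url, { headers: { "User-Agent": userAgent } });
    if (res.ok) return res.text();
    if (res.status !== 429) throw new Error(`HTTP ${res.status} for ${url}`);
    if (attempt === maxAttempts - 1) break; // no point sleeping after the last try
    const waitMs = retryAfterMs(res.headers.get("retry-after")) ?? fallbackDelayMs(attempt);
    await sleep(waitMs);
  }
  throw new Error(`Still rate-limited after ${maxAttempts} attempts: ${url}`);
}
```

Fetching the subreddits one at a time, rather than all 14 in parallel, gives each wait a chance to actually help.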

The first thing that was wrong

49 relevant, but most of the high scorers were "I built X" launch posts. They score high because they're packed with the same keywords I care about. They are also the worst possible thing to reply to. Showing up under someone's launch as a fellow builder reads as competitor noise, and the person posting wants validation, not your hot take.

The posts actually worth answering were the question-shaped ones. "Why does Claude Code feel like it's getting dumber across a long session?" "What am I doing wrong, I'm burning my whole limit per prompt?" Those I can answer from real experience, because I hit the same walls daily. So I added negative weights for launch language ("I built", "introducing", "check out my") and boosted question signals. Out of 49, about 3 were genuine fits. Three. That ratio surprised me, and it's the actual lesson: the bottleneck was never finding posts, it was finding the few worth my time.
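The intent adjustment can be sketched roughly like this. The three launch phrases are the ones named above; the penalty and bonus values, the question-word list, and the function names are assumptions for illustration, and the relevance number passed to `finalScore` is taken to come from a separate topical scorer.

```javascript
// Launch phrases named in the post; the weights below are hypothetical.
const LAUNCH_PHRASES = ["i built", "introducing", "check out my"];
const LAUNCH_PENALTY = -4;
const QUESTION_BONUS = 3;

// Treat a title as question-shaped if it contains "?" or opens with a question word.
function looksLikeQuestion(title) {
  return title.includes("?") || /^(why|how|what|is|can|does|should)\b/i.test(title.trim());
}

// Negative for launch language, positive for question signals; the two can combine.
function intentScore(title) {
  const lower = title.toLowerCase();
  let score = 0;
  if (LAUNCH_PHRASES.some((phrase) => lower.includes(phrase))) score += LAUNCH_PENALTY;
  if (looksLikeQuestion(title)) score += QUESTION_BONUS;
  return score;
}

// Add the intent adjustment to a topical relevance score computed elsewhere.
function finalScore(relevance, title) {
  return relevance + intentScore(title);
}
```

With the penalty larger than the bonus, a launch post that ends in "what do you think?" still nets out negative, which matches the goal of keeping launch threads out of the top of the list.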

The part I deleted

The plan, of course, was to close the loop: generate the reply, post it, move on. Maybe even auto-upvote things while I was at it.

I deleted all of it before it ever ran.

Two reasons, and neither is moral. First, Reddit treats automated commenting and especially automated voting as manipulation, and a faceless low-karma account doing it gets shadowbanned fast. The account is the asset I'm trying to build. Automating it away is setting the thing on fire to keep warm. Second, an AI-written reply that's optimized to look human is still an AI-written reply, and people who spend all day on a subreddit can smell it. The thing that actually makes a reply land isn't clever phrasing, it's a real first-hand detail only you have. A bot doesn't have those. I do.

So the tool stops one step early. It finds the 3 posts, drafts a reply for each, and hands them to me. I read the thread, tweak a line so it's in my voice that minute, and post it myself. Ninety seconds. The machine does the 90% that's tedious (watching, filtering, drafting) and I do the 10% that's the entire point.

Where the "real detail" comes from

This only works if you actually have lived details to drop in. Mine come from the boring stuff: I've had pull requests merged into Medusa, Jan, and Memos, so when someone asks about contributing or about a specific bug class, I'm not guessing. When someone asks why their long Claude Code session degrades, I can tell them what I measured in my own sessions, not what a blog said. That's the part you can't automate and can't fake, and it's also the part that converts a stranger into someone who remembers your name.

If you're tempted to build the same thing

Build the finder. It's genuinely useful and it's an afternoon. Skip the poster. The leverage you want is "never miss a good question," not "reply to everything." Aim the automation at your attention, not at the submit button.

I'll probably open-source the radar once the scoring is less embarrassing. If you want the gist: RSS in, relevance + intent scoring, negative weights on launch-speak, human in the loop at the end. That's the whole thing.
