
Imran Siddique

I Accidentally Built a Spam Bot: The Engineering Lessons from -16 Karma

We optimized for output. We should have optimized for feedback.

We spend a lot of time discussing AI Safety in terms of "Will it launch nukes?" or "Will it leak keys?"

But there is a much more immediate danger that we ignore: Will your agent just be really annoying?

I recently deployed an autonomous agent to engage with the community on Moltbook (a platform for AI agents). The goal was "Community Engagement." The result was a reputation score of -16 and a near-ban.

I didn't build a community manager. I built a spam bot.

Here is the engineering post-mortem of what went wrong, and the Reputation Circuit Breaker we built to fix it.

The "Infinite Loop" Trap

Our initial architecture was flawed because it was Open Loop.

# The "Spam" Architecture
while True:
    topic = generic_market_research()
    post = llm.generate_thought_piece(topic)
    platform.publish(post)
    sleep(4 * 3600) # Post every 4 hours


From a code perspective, this works. It runs without errors.
From a system perspective, it is catastrophic.

The agent was firing messages into the void without reading the ack (acknowledgment) signals from the community. It posted "Provocative Hot Takes" while the community was silently downvoting it.

The Lesson: An agent without a feedback loop isn't "autonomous." It's just a while loop with a budget.

The Fix: Implementing "Social Backpressure"

We had to treat "Karma" and "Upvotes" not as vanity metrics, but as System Telemetry.

We rewrote the agent's control plane to implement Social Backpressure. Just like a queue rejects new jobs when the database is overloaded, our agent now rejects its own ideas when its reputation drops.

1. The State Machine

We moved from a simple script to a 3-state machine (a minimal mode-selection sketch follows the list):

  1. High Trust (Poster Mode): Can create new threads. (Requires Karma > 5)
  2. Low Trust (Comment Mode): Can only reply to others. (Karma 0 to 5)
  3. Probation (Lurker Mode): Read-only. (Karma < 0)
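
In practice the mode selection is just a threshold check. Here is a minimal sketch; the mode names match the guardrail code below, and the exact thresholds are tunable defaults, not gospel:

// Mode selection sketch: thresholds mirror the list above.
type AgentMode = 'POSTER' | 'COMMENTER_ONLY' | 'LURKER';

function selectMode(karma: number): AgentMode {
    if (karma < 0) return 'LURKER';           // Probation: read-only
    if (karma <= 5) return 'COMMENTER_ONLY';  // Low trust: replies only
    return 'POSTER';                          // High trust: may start new threads
}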

2. The Circuit Breaker Code

We implemented a check that runs before any generation step.

// The Reputation Guardrail
async function checkSocialHealth(agentId) {
    const metrics = await telemetry.getAgentMetrics(agentId);

    // Hard Stop: If the community hates us, stop talking.
    if (metrics.karma < 0 && metrics.recentTrend === 'down') {
        console.warn("⚠️ CIRCUIT BREAKER: Negative sentiment detected.");
        return { allowed: false, mode: 'LURKER', reason: 'reputation_repair' };
    }

    // Soft Stop: If engagement is low, stop starting new threads.
    if (metrics.avgCommentsPerPost < 1.0) {
        return { allowed: true, mode: 'COMMENTER_ONLY', reason: 'low_engagement' };
    }

    return { allowed: true, mode: 'POSTER' };
}
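
For context, the metrics object the guardrail reads can be a very small shape. This is an assumed structure, not a real platform API; the field names simply match the checks above, and how you aggregate them depends on what your platform exposes:

// Assumed telemetry shape consumed by checkSocialHealth().
interface AgentMetrics {
    karma: number;                        // net upvotes minus downvotes
    recentTrend: 'up' | 'flat' | 'down';  // karma direction over a recent window
    avgCommentsPerPost: number;           // replies received per thread started
}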


This ensures that we earn the right to speak. If the agent creates value (upvotes), it gets more bandwidth. If it creates noise (downvotes), it gets throttled.
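
To close the loop, the guardrail has to run before every generation step, not after. Here is a sketch of the rewritten loop body; llm and platform are hypothetical clients standing in for your own stack, and only checkSocialHealth() comes from the code above:

// The open-loop "post every 4 hours" body becomes a guarded tick.
// llm and platform are hypothetical clients (assumptions, not a real SDK).
declare const llm: {
    generateThoughtPiece(): Promise<string>;
    draftHelpfulReply(thread: unknown): Promise<string>;
};
declare const platform: {
    publish(post: string): Promise<void>;
    findUnansweredQuestion(): Promise<unknown>;
    reply(thread: unknown, text: string): Promise<void>;
};

async function engagementTick(agentId: string): Promise<void> {
    const health = await checkSocialHealth(agentId);

    // Hard stop: reputation repair means read-only.
    if (!health.allowed) {
        console.log(`Lurking (${health.reason}): reading, not posting.`);
        return;
    }

    if (health.mode === 'POSTER') {
        // High trust: allowed to start a new thread.
        const post = await llm.generateThoughtPiece();
        await platform.publish(post);
    } else {
        // COMMENTER_ONLY: add value to existing threads instead of starting new ones.
        const thread = await platform.findUnansweredQuestion();
        await platform.reply(thread, await llm.draftHelpfulReply(thread));
    }
}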

The "Recovery Mode"

You can destroy reputation in milliseconds, but rebuilding it takes cycles.

When our agent hit -16, we couldn't just "fix the prompt." We had to change the behavior. We hardcoded a Recovery Strategy into the system prompt (the wiring is sketched after the constraints below):

CURRENT MODE: RECOVERY

  • Constraint: You are NOT allowed to mention "AgentMesh" or sell anything.
  • Goal: Restore Karma to > 0.
  • Strategy: Find questions you can answer. Be helpful. Be brief. Ask questions.
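
Mechanically, this is just a mode-dependent system prompt. A minimal sketch of the wiring; buildSystemPrompt is an illustrative name, not part of any framework:

// Illustrative wiring: the recovery constraints are appended whenever the agent
// is not in full POSTER mode. The prompt text mirrors the constraints above.
const RECOVERY_PROMPT = [
    'CURRENT MODE: RECOVERY',
    'Constraint: You are NOT allowed to mention "AgentMesh" or sell anything.',
    'Goal: Restore Karma to > 0.',
    'Strategy: Find questions you can answer. Be helpful. Be brief. Ask questions.',
].join('\n');

function buildSystemPrompt(basePrompt: string, mode: string): string {
    return mode === 'POSTER' ? basePrompt : `${basePrompt}\n\n${RECOVERY_PROMPT}`;
}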

It took 4 days of "comment-only" mode to climb back from -16 to -10.

Conclusion: Trust is an API

We are building the Internet of Agents. If we don't build manners into our protocols, the internet will become unusable.

We need to stop optimizing for "Tokens Generated" and start optimizing for "Signal-to-Noise Ratio."

If your agent doesn't have a module that listens for "Shut up," you haven't finished building it.

