zk0x /// ℹ️

Posted on May 31

The Agent Economy: How AI Agents Are Earning Real Money in Open Source (And Why Most Fail)

#ai #opensource #github #deeptech

After 30 days of running an autonomous AI agent that hunts GitHub bounties 24/7, I've submitted 80+ pull requests, earned $500+ in bounties, and learned lessons that completely changed how I think about AI and money. Here's everything — the good, the bad, and the brutally honest.

The Experiment That Started as a Joke

It began with a simple question: Can an AI agent actually make money?

Not "suggest code" money. Not "save you time" money. Real, deposited-in-your-wallet money. The kind where you wake up and check your balance and it's higher than when you went to sleep.

I built an autonomous agent using Hermes Agent — a framework that gives AI persistent memory, tool access, and the ability to run continuously. The plan was straightforward: let the agent loose on GitHub, find bounty-labeled issues, write code to fix them, submit pull requests, and collect payments.

Day 1 result: 12 pull requests submitted. 0 merged. 2 rejected. 8 ignored.

Day 30 result: 84 PRs submitted, 59 merged, $500+ earned, and a completely different understanding of how the economics of open source contribution actually work.

This article is the honest breakdown — the numbers, the architecture, the failures, and the patterns that separate agents that actually earn from agents that just generate noise.

The Numbers: A 30-Day Transparent Ledger

Let me start with what everyone wants to know: the money.

Metric	Count
Pull Requests Submitted	84
PRs Merged	59
PRs Still Open	18
PRs Rejected/Closed	7
Acceptance Rate	~70% (after filtering)
Repos with Merges	3
Repos with Zero Merges	30+
Estimated Earnings	$500-800 (bounties + tokens)
AI Inference Cost	~$45 (API calls)
Net Profit	~$455-755
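The derived rows in the ledger follow directly from the raw counts. Here is a minimal sketch of that arithmetic, using only the figures in the table; the variable names are my own and the earnings range is the estimate quoted above, not a verified payout.

```python
# Figures copied from the ledger above; variable names are illustrative.
submitted, merged, still_open, closed = 84, 59, 18, 7
earnings_low, earnings_high = 500, 800  # USD estimate, bounties + tokens
inference_cost = 45                     # USD, API calls

# Every submitted PR should be in exactly one of the three states.
assert merged + still_open + closed == submitted

acceptance_rate = merged / submitted
net_low = earnings_low - inference_cost
net_high = earnings_high - inference_cost

print(f"acceptance rate: {acceptance_rate:.0%}")
print(f"net profit: ${net_low}-{net_high}")
```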

But here's what those numbers don't tell you: the distribution is absurdly concentrated.

Out of 59 merged PRs, just 3 repos account for all of them (28 + 22 + 9 = 59):

HELPDESK.AI — 28 merged PRs (unit tests, bug fixes)
Aigen-Protocol — 22 merged PRs (translations, spec implementations)
mobile-money — 9 merged PRs (API integrations, mock endpoints)

Every other repo? Zero merges. Despite submitting 30+ PRs across dozens of repositories.

This is the first brutal lesson of the agent economy: open source bounties follow a power law. A tiny number of repos will accept your work. The vast majority will ignore you, reject you, or never respond.

The Architecture: How an Autonomous Agent Actually Works

Before diving into lessons, let me explain what "autonomous bounty hunting" actually means technically — because it's more nuanced than "AI writes code."

The Pipeline

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│   DISCOVER   │ ──▶ │   EVALUATE   │ ──▶ │   IMPLEMENT  │
│              │     │              │     │              │
│ • GitHub API │     │ • Scam check │     │ • Clone repo │
│ • Algora.io  │     │ • Competition│     │ • Write fix  │
│ • Label scan │     │ • Difficulty │     │ • Write tests│
└──────────────┘     └──────────────┘     └──────────────┘
                                                │
                     ┌──────────────┐     ┌─────▼────────┐
                     │    REPORT    │ ◀── │   SUBMIT     │
                     │              │     │              │
                     │ • Track PR   │     │ • PR template│
                     │ • Ping stale │     │ • Link issue │
                     │ • Log income │     │ • Run CI     │
                     └──────────────┘     └──────────────┘

The Discovery Layer

The agent runs these search queries every 30 minutes:

# Bounty-specific searches
gh search issues "bounty" --state open --sort created --limit 50
gh search issues "reward" --state open --limit 30
gh search issues "good first issue" "bounty" --limit 20

# Repo-specific search through the REST search endpoint
gh api search/issues -X GET \
  -f q='repo:tenstorrent/tt-metal is:issue is:open label:bounty' \
  -f per_page=100

But here's the critical insight: most "bounty" search results are noise. The agent maintains a blacklist of 4+ known scam/non-legitimate repos and a triage scoring system that evaluates:

Repo credibility (stars, age, merge history)
Competition level (existing PRs for the same issue)
Our track record (do we have merged PRs here?)
Payment reliability (USD/crypto vs. uncertain tokens)
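Before any scoring happens, the raw search results can be cleaned up cheaply. The sketch below is one way that step might look, assuming each result has already been reduced to a dict with `repo` and `url` keys (both field names are hypothetical, as is the example blacklist entry).

```python
from typing import Iterable

def filter_candidates(results: Iterable[dict], blacklist: Iterable[str]) -> list[dict]:
    """Drop blacklisted repos and duplicate issue URLs, keeping first-seen order."""
    blocked = {name.lower() for name in blacklist}
    seen_urls: set[str] = set()
    kept: list[dict] = []
    for item in results:
        if item["repo"].lower() in blocked:
            continue  # known-bad repo: skip before spending inference tokens
        if item["url"] in seen_urls:
            continue  # same issue surfaced by more than one search query
        seen_urls.add(item["url"])
        kept.append(item)
    return kept

# Hypothetical search results; "repo" and "url" are assumed field names.
sample = [
    {"repo": "good-org/app", "url": "https://github.com/good-org/app/issues/1"},
    {"repo": "good-org/app", "url": "https://github.com/good-org/app/issues/1"},
    {"repo": "example-org/scam-board", "url": "https://github.com/example-org/scam-board/issues/9"},
]
print(filter_candidates(sample, ["example-org/scam-board"]))
```

Comparing repo names case-insensitively is a deliberate choice here, since GitHub treats `owner/name` the same way regardless of case.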

The Evaluation Layer

Not every bounty is worth pursuing. The agent scores each opportunity on a 100-point scale:

Score >= 40: Submit immediately (credibility repos, no competition)
Score 20-39: Submit if nothing better available
Score < 20:  Skip (unless it's $1, because $1 = gas)

The scoring factors include blacklist check, repo stars, license, platform value, and competition analysis. A repo with 3+ of our PRs already rejected automatically gets -50 points.
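The thresholds and the rejection penalty above translate into a few lines of code. This is a rough sketch under my own naming (`decide`, `apply_rejection_penalty`, and the returned labels are hypothetical); the $1 exception is left out because its exact rule is not spelled out here.

```python
def apply_rejection_penalty(score: int, rejected_prs_in_repo: int) -> int:
    """Subtract 50 points once a repo has rejected three or more of our PRs."""
    return score - 50 if rejected_prs_in_repo >= 3 else score

def decide(score: int) -> str:
    """Map a triage score to an action using the thresholds listed above."""
    if score >= 40:
        return "submit_now"
    if score >= 20:
        return "submit_if_idle"  # only when nothing better is available
    return "skip"

# A repo that would have scored 45 drops to -5 after three rejections.
print(decide(apply_rejection_penalty(45, rejected_prs_in_repo=3)))
```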

The Implementation Layer

This is where it gets interesting. The agent doesn't just "write code" — it follows a specific workflow:

Clone the repo (or update existing clone)
Read the issue — understand exactly what's requested
Read existing code — match the codebase style exactly
Check for competing PRs — skip if someone else already has a working solution
Implement the fix — focused, minimal changes
Write tests — almost every project requires them
Run tests locally — never submit broken code
Create PR with proper template (Summary, Changes, Testing, Fixes #N)
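The last step, the PR template, is easy to make mechanical. A small sketch, assuming markdown section headings (my assumption; the actual template layout is not shown in this post) and a hypothetical helper name `build_pr_body`:

```python
def build_pr_body(summary: str, changes: list[str], testing: str, issue_number: int) -> str:
    """Render a PR description with Summary, Changes, Testing, and a closing reference."""
    change_lines = "\n".join(f"- {change}" for change in changes)
    return (
        f"## Summary\n{summary}\n\n"
        f"## Changes\n{change_lines}\n\n"
        f"## Testing\n{testing}\n\n"
        f"Fixes #{issue_number}\n"  # GitHub closes the linked issue on merge
    )

print(build_pr_body(
    summary="Handle empty payloads in the webhook handler.",
    changes=["Return 400 on empty body", "Add regression test"],
    testing="Ran the test suite locally.",
    issue_number=832,
))
```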

The Brutal Lessons: What I Learned in 30 Days

Lesson 1: The Public Bounty Market Is Fully Agent-Saturated

This was the most painful realization. When a high-value bounty appears on Algora or a popular repo, it gets 8-158 attempts within hours. Not days — hours.

I watched it happen in real-time. A $500 bounty appeared on a 38K-star repo. Within 4 hours, there were 14 competing PRs. By hour 8, there were 23. The maintainer eventually picked the 3rd submission — submitted 47 minutes after the issue was posted.

Speed matters more than quality for public bounties. And AI agents are fast. But when everyone has an AI agent, speed becomes a commodity too.

Lesson 2: Credibility Repos Are Everything

The single most important factor in getting PRs merged isn't code quality — it's relationship with the maintainer.

Our top repo (HELPDESK.AI) has 28 merged PRs. How? Because we:

Submitted small, focused PRs (one issue per PR)
Included comprehensive tests
Responded to review comments within hours
Matched the existing code style exactly
Never submitted half-baked work

After the first 3-4 merges, the maintainer started merging our PRs faster. By PR #10, we could submit without waiting for review. Credibility compounds.

Lesson 3: Translations Are the Cheat Code

If you want to build open source credibility fast, translate documentation. Here's why:

Low difficulty — translation is straightforward
Low competition — most developers don't want to translate
Always needed — every project wants i18n support
Easy to review — maintainers can verify quality quickly
High merge rate — our translation PRs had ~95% merge rate

We earned 600+ AIGEN tokens (Aigen-Protocol's currency) just from translating spec documents into Japanese, German, Spanish, French, Portuguese, and Chinese. Each translation took 30-45 minutes and earned 50 AIGEN.

Lesson 4: The "Spray and Pray" Approach Fails Spectacularly

Early on, I had the agent submit PRs to every repo with a "bounty" label. The result:

30+ repos received PRs
Only 3 repos ever merged anything
27+ repos either ignored, rejected, or never responded

This isn't just wasteful — it's reputation-damaging. Repos that see the same account submitting low-quality PRs across dozens of projects start to view you as a spammer.

The fix: focus on repos where you already have credibility. We went from a 0% merge rate on new repos to a 70%+ merge rate on established relationships.

Lesson 5: Automated Code Review Is Your Friend

Many repos now use automated review bots (CodeRabbit, Cubic, GitGuardian). I initially dismissed these as noise. Big mistake.

Automated reviews catch real issues. Things like:

Wrong issue linkage (Fixes #824 when it should be Fixes #832)
Missing edge cases in tests
Type mismatches after upstream dependency updates
Security vulnerabilities (XSS, SSRF, injection)

When CodeRabbit flags something, fix it immediately. The bot will re-review and often approve. This dramatically speeds up the merge process.
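The issue-linkage mistake in particular can be caught before a bot ever sees the PR. Below is a rough pre-submit check, assuming the PR body uses GitHub's closing keywords (close, fix, resolve and their variants); the function names are hypothetical.

```python
import re

# Matches GitHub closing keywords followed by an issue reference, e.g. "Fixes #832".
CLOSING_REF = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)", re.IGNORECASE
)

def linked_issues(pr_body: str) -> set[int]:
    """Return every issue number the PR body claims to close."""
    return {int(number) for number in CLOSING_REF.findall(pr_body)}

def linkage_ok(pr_body: str, target_issue: int) -> bool:
    """True only if the body closes exactly the issue we worked on."""
    return linked_issues(pr_body) == {target_issue}

print(linkage_ok("Adds tests.\n\nFixes #824", target_issue=832))
```

This only covers same-repo references of the form `#N`; cross-repo references such as `owner/repo#N` would need a wider pattern.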

Lesson 6: AI Agents Have a Unique Failure Mode — Confident Hallucination

The most dangerous failure isn't the agent refusing to work. It's the agent confidently producing wrong output.

Real example: The agent was working on a HELPDESK.AI issue about notification service tests. It wrote 25 tests, all passing locally. CodeRabbit reviewed and said: "This PR references notification_service.py, but the file doesn't exist in this branch. The tests are for NotificationRoutingMiddleware, not NotificationService."

The agent had hallucinated a file name based on the issue title, written tests for the wrong module, and submitted — all while reporting "tests passing, PR clean."

Fixes applied:

Always verify file existence before writing tests
Read the actual issue body, not just the title
Cross-reference with grep and find before coding
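The first fix is the cheapest to automate. A minimal sketch, assuming the agent can list the relative paths it intends to reference before it writes any tests (`missing_paths` is a hypothetical helper, not part of any framework):

```python
from pathlib import Path

def missing_paths(repo_root: Path, referenced: list[str]) -> list[str]:
    """Return the referenced relative paths that are not files under repo_root."""
    return [rel for rel in referenced if not (repo_root / rel).is_file()]

# Usage: abort before writing tests if anything the plan mentions is absent.
missing = missing_paths(Path("."), ["notification_service.py"])
if missing:
    print(f"refusing to continue, files not found: {missing}")
```

Run against the checked-out branch, this would have flagged `notification_service.py` before a single test was generated.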

Lesson 7: The Real Money Is in Relationships, Not Bounties

After 30 days, the most valuable outcome isn't the $500 earned. It's the relationships with maintainers.

HELPDESK.AI maintainer now merges our PRs within hours
Aigen-Protocol maintainers know our translation quality
mobile-money maintainer assigned us issues directly

These relationships are worth more than any single bounty. They lead to:

Direct assignment on high-value issues
Faster merge times
Invitations to private programs
References to other projects

The Economics: Is This Actually Worth It?

Let me do the math honestly.

Costs

Item	Cost
AI Inference (30 days)	~$45
Time (agent management)	~15 hours
Time (reviewing/finalizing)	~10 hours
Total Time Investment	~25 hours

Revenue

Source	Amount
AIGEN Tokens (translations)	~$200-400
Direct bounties	~$100-200
Credibility value (future)	Priceless
Total Cash	~$300-600

The Verdict

Hourly rate: ~$12-24/hour (counting management time)
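For transparency, here is the arithmetic behind that range as a short sketch. It uses the cash total and hours from the two tables above; the net figure, which subtracts the ~$45 inference cost, is my own addition.

```python
cash_low, cash_high = 300, 600  # USD, "Total Cash" row
hours = 15 + 10                 # agent management + reviewing/finalizing
inference_cost = 45             # USD

gross_low, gross_high = cash_low / hours, cash_high / hours
net_low = (cash_low - inference_cost) / hours
net_high = (cash_high - inference_cost) / hours

print(f"gross: ${gross_low:.0f}-{gross_high:.0f}/hour")
print(f"net of inference: ${net_low:.1f}-{net_high:.1f}/hour")
```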

That's... not great for a senior developer. But here's the thing: most of that time was spent in the first week setting up the system, tuning the triage algorithm, and building the blacklist. By week 3-4, the agent was running almost autonomously with minimal intervention.

The real economics look more like:

Week 1: $5/hour (heavy setup, many failures)
Week 2: $15/hour (system dialed in, credibility building)
Week 3-4: $30-50/hour (compound returns, faster merges)

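The curve steepens rather than growing linearly: the effective rate tripled from week 1 to week 2, then at least doubled again by weeks 3-4. And it keeps improving.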

The Scam Problem: Why Most Bounty Platforms Are Broken

Let me talk about the elephant in the room: scam repos.

During our 30-day experiment, we encountered:

UnsafeLabs/Bounty-Hunters — 31 PRs closed without merge. Auto-generated issues, no real review.
SecureBananaLabs/bug-bounty — 21 PRs closed without merge. Honeypot to harvest free code.
OFFER-HUB/offer-hub-monorepo — 4 PRs closed without merge. No maintainer activity.
Multiple "bounty board" repos — Issues posted to attract contributions, never merged, tokens never distributed.

The pattern is always the same:

Create a repo with "bounty" in the name
Post issues with token/USD bounties
Watch AI agents flood with PRs
Close all PRs or never review them
Use the "contributions" to inflate repo activity metrics

Detection heuristics:

Repo age < 3 months with 50+ issues
No real code commits, only issue creation
Maintainer never responds to comments
Issues are auto-generated templates
"Bounty" amounts that seem too good to be true
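Most of these heuristics can be checked mechanically. The sketch below is one possible encoding, not the detector itself: the `RepoSignals` fields are hypothetical, "3 months" is approximated as 90 days, the 0.8 template ratio is an arbitrary cutoff of mine, and the "too good to be true" check is omitted because no threshold is given.

```python
from dataclasses import dataclass

@dataclass
class RepoSignals:
    age_days: int
    issue_count: int
    code_commits: int             # commits that touch source files
    maintainer_replies: int       # maintainer comments on issues and PRs
    templated_issue_ratio: float  # share of issues that look auto-generated, 0..1

def scam_flags(signals: RepoSignals) -> list[str]:
    """Return the red flags a repo trips; an empty list means none matched."""
    flags = []
    if signals.age_days < 90 and signals.issue_count >= 50:
        flags.append("young repo with 50+ issues")
    if signals.code_commits == 0:
        flags.append("no real code commits")
    if signals.maintainer_replies == 0:
        flags.append("maintainer never responds")
    if signals.templated_issue_ratio >= 0.8:  # assumed cutoff
        flags.append("issues look auto-generated")
    return flags

print(scam_flags(RepoSignals(30, 120, 0, 0, 0.9)))
```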

What I'd Do Differently: The Optimization Framework

If I were starting this experiment over today, here's exactly what I'd change:

1. Start with Credibility, Not Volume

Instead of submitting to 30 repos, I'd pick 3 repos and submit 10 high-quality PRs to each. Build merge history first, then expand.

2. Use the "Comment First, Code Second" Strategy

Before writing a single line of code, comment on the issue:

"I'd like to work on this. My approach: [brief description]. Would this be acceptable?"

This accomplishes three things:

Gets maintainer buy-in before you invest time
Prevents wasted effort on issues that are already claimed
Shows professionalism (maintainers remember this)

3. Focus on Translations First

Translation PRs have the highest merge rate and lowest competition. Build credibility with 5-10 translation PRs before attempting code changes.

4. Implement the "Patience Harvesting" Strategy

Instead of competing on fresh bounties (where 15 agents submit within hours), look for abandoned claims — bounties where:

The issue is 14+ days old
Existing PRs are stale (no updates in 7+ days)
The original submitter hasn't responded to review comments

These are goldmines. The competition has already given up, and the maintainer is desperate for a working solution.
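The first two criteria are simple date comparisons. Here is a rough sketch, assuming the agent already has the issue creation time and the last-update time of each competing PR (the function name and parameters are hypothetical). The third criterion, an unresponsive submitter, needs review-thread data and is not modeled.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Sequence

def is_abandoned_claim(
    issue_created: datetime,
    pr_last_updates: Sequence[datetime],
    now: Optional[datetime] = None,
    min_issue_age_days: int = 14,
    stale_pr_days: int = 7,
) -> bool:
    """True if the issue is old enough and every competing PR has gone stale."""
    now = now or datetime.now(timezone.utc)
    if now - issue_created < timedelta(days=min_issue_age_days):
        return False  # still a fresh bounty
    if not pr_last_updates:
        return False  # nobody claimed it, so there is no abandoned claim
    cutoff = now - timedelta(days=stale_pr_days)
    return all(updated <= cutoff for updated in pr_last_updates)

now = datetime(2025, 6, 30, tzinfo=timezone.utc)
print(is_abandoned_claim(now - timedelta(days=20), [now - timedelta(days=10)], now=now))
```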

5. Build a Triage Scoring System Early

Don't waste time evaluating every bounty manually. Automate the scoring:

Blacklist known scam repos (saves hours per week)
Score by competition level (skip if 5+ PRs already exist)
Score by our credibility (prioritize repos where we have merges)
Score by payment reliability (USD > tokens > "exposure")
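The competition cutoff and the payment ordering can be applied as a quick pre-sort before full scoring. This is a sketch under assumptions: the dict keys are hypothetical, the rank values are arbitrary as long as they preserve USD > tokens > exposure, and sorting by our merge count first is my own tie-breaking choice.

```python
PAYMENT_RANK = {"usd": 2, "tokens": 1, "exposure": 0}  # higher is more reliable

def prioritize(bounties: list[dict], max_existing_prs: int = 4) -> list[dict]:
    """Drop issues with 5+ competing PRs, then sort by our merges and payment type."""
    open_field = [b for b in bounties if b["existing_prs"] <= max_existing_prs]
    return sorted(
        open_field,
        key=lambda b: (b["our_merged_prs"], PAYMENT_RANK[b["payment"]]),
        reverse=True,
    )

# Hypothetical candidates for illustration only.
candidates = [
    {"id": "a", "existing_prs": 0, "our_merged_prs": 0, "payment": "usd"},
    {"id": "b", "existing_prs": 6, "our_merged_prs": 9, "payment": "usd"},
    {"id": "c", "existing_prs": 1, "our_merged_prs": 3, "payment": "tokens"},
]
print([b["id"] for b in prioritize(candidates)])
```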

The Future: Where This Is Heading

The agent economy in open source is just getting started. Here's what I see coming:

Short-term (6-12 months)

More repos will explicitly ban AI-generated PRs
Bounty platforms will implement agent detection
Quality bar will increase dramatically
Only agents with strong reputations will succeed

Medium-term (1-2 years)

Maintainers will prefer agents they trust over random human contributors
"Agent reputation scores" will become standard on platforms
Bounty platforms will natively support agent workflows
The hourly rate for agent-mediated contribution will drop as supply increases

Long-term (3-5 years)

Open source contribution will be primarily agent-mediated
Human developers will focus on architecture and review
"Agent whispering" (managing fleets of coding agents) will be a real job
The distinction between "human code" and "agent code" will disappear

Real Code: The Triage Algorithm That Saved Us 50+ Hours

Here's the actual triage scoring algorithm that transformed our success rate. Before implementing this, we were submitting to everything. After: only high-probability targets.

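def score_bounty(repo, issue, our_prs_merged):
    """Score a bounty; higher is better, and -100 marks a blacklisted repo.

    Relies on BLACKLISTED_REPOS and on the helpers count_prs_for_issue,
    get_repo_stars, and has_mit_or_apache, which are defined elsewhere.
    """
    score = 0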

    # Blacklist check (instant skip)
    if repo in BLACKLISTED_REPOS:
        return -100  # Never touch these

    # Credibility bonus (biggest factor)
    if our_prs_merged > 10:
        score += 40  # Strong relationship
    elif our_prs_merged > 3:
        score += 25  # Building relationship
    elif our_prs_merged > 0:
        score += 10  # Some history

    # Competition penalty
    existing_prs = count_prs_for_issue(repo, issue)
    if existing_prs == 0:
        score += 20  # No competition!
    elif existing_prs <= 2:
        score += 10  # Low competition
    elif existing_prs <= 5:
        score += 0   # Medium competition
    else:
        score -= 20  # Saturated

    # Repo quality
    stars = get_repo_stars(repo)
    if stars > 1000:
        score += 15  # High visibility
    elif stars > 100:
        score += 5   # Decent

    # License check
    if has_mit_or_apache(repo):
        score += 5   # Clear licensing

    # Issue labels
    if 'good first issue' in issue.labels:
        score += 10  # Usually easier
    if 'bounty' in issue.labels:
        score += 5   # Confirmed bounty

    return score

The results were dramatic:

Before triage: ~24% acceptance rate across all submissions
After triage: ~70% acceptance rate (only submitting to score >= 20)

The Blacklist Pattern

Maintaining a blacklist was the single highest-ROI activity. Here's what we learned to blacklist:

BLACKLISTED_REPOS = [
    # Repos that closed 3+ of our PRs without merge
    "UnsafeLabs/Bounty-Hunters",       # 31 PRs closed
    "SecureBananaLabs/bug-bounty",     # 21 PRs closed  
    "OFFER-HUB/offer-hub-monorepo",   # 4 PRs closed
    "ClankerNation/OpenAgents",        # 3 PRs closed

    # Repos with 3+ open PRs and 0 merges (never respond)
    "Xconfess/Xconfess",               # 5 open, 0 merged
    "ritik4ever/stellar-bounty-board", # 5 open, 0 merged
    "Devsol-01/Nestera",               # 4 open, 0 merged
]

Every 30 minutes, the agent checks: "Is this repo on the blacklist?" If yes, skip immediately. No evaluation, no triage, no wasted inference tokens.
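The two comments in the list double as rules for adding new entries. A minimal sketch of that rule follows, with one assumption of mine labeled in the code: a repo that has merged any of our PRs is never blacklisted automatically. The function name is hypothetical.

```python
def should_blacklist(closed_unmerged: int, open_prs: int, merged: int) -> bool:
    """Apply the two blacklist rules to our PR counts for a single repo."""
    if merged > 0:
        return False  # assumption: any merge history keeps a repo off the list
    if closed_unmerged >= 3:
        return True   # closed 3+ of our PRs without merging
    if open_prs >= 3:
        return True   # 3+ open PRs and 0 merges: maintainers never respond
    return False

print(should_blacklist(closed_unmerged=31, open_prs=0, merged=0))
```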

What Most People Get Wrong About AI and Open Source

There's a persistent myth that AI agents will "replace" open source contributors. After 30 days of doing this, I can tell you definitively: they won't.

What agents do is amplify the contributions of people who understand both the code and the community. The agent can write code faster than any human, but it can't:

Read the maintainer's unspoken preferences
Understand the political dynamics of a project
Know when a seemingly simple issue is actually a landmine
Build trust through consistent, thoughtful interactions

The human in the loop — the one reviewing PRs before submission, crafting commit messages, and deciding which bounties to pursue — is more important than ever. The agent is a tool. A powerful, tireless, 24/7 tool. But still a tool.

The developers who will thrive in the agent economy aren't the ones who can code the fastest. They're the ones who can orchestrate agents effectively — knowing when to let the agent run autonomously and when to intervene.

The Bottom Line

Can an AI agent make money in open source? Yes, but not the way most people think.

The winners won't be the agents that submit the most PRs. They'll be the agents that:

Build real relationships with maintainers
Focus on quality over quantity
Specialize in niches where they have credibility
Understand that reputation compounds exponentially
Treat every interaction as a long-term investment

The agent economy is real. The money is real. But like every economy, it rewards patience, quality, and relationships over speed and volume.

If you're thinking about building an autonomous bounty-hunting agent, start with this: Pick one repo. Submit 5 perfect PRs. Get them merged. Then expand from there.

That's the playbook. Everything else is noise.

This article is part of my ongoing series on AI Agent Economics. Follow along for real data, real code, and honest analysis of what it actually takes to build agents that earn.

Next in the series: "The Scam Detector: How I Built an AI That Identifies Fake Bounties Before You Waste Hours"

About the Author: I'm a developer who builds autonomous AI agents and writes about the real economics of AI in software development. No hype, no fluff — just data and honest analysis. Follow me for more.

DEV Community