Tracepilot

Build a Bounty Verification Agent That Tests PRs & Validates Evidence

What we're building: An automated agent that verifies submitted GitHub bounty PRs, runs tests, captures evidence, and produces a clear QA report — all in one traceable pipeline.

Prerequisites

  1. Node.js and npm
  2. A GitHub personal access token that can read the target repository
  3. An OpenAI API key
  4. A TracePilot API key

Step 1: Set Up Your Project

mkdir bounty-verifier
cd bounty-verifier
npm init -y
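# The examples use ES module imports, so mark the package as a module
npm pkg set type=module
# dotenv is needed by index.js in Step 5
npm install openai tracepilot-sdk node-fetch dotenv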

Create your environment file:

# .env
GITHUB_TOKEN=ghp_your_token_here
OPENAI_API_KEY=sk-your-key
TRACEPILOT_API_KEY=tp_live_your_key
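
A missing key tends to surface later as an opaque 401 from one of the APIs. One way to fail early is a small startup check. This is a minimal sketch: missingEnvVars is a hypothetical helper, not part of any SDK used here.

```javascript
// Variables the tutorial's .env file is expected to define.
const REQUIRED_VARS = ['GITHUB_TOKEN', 'OPENAI_API_KEY', 'TRACEPILOT_API_KEY'];

// Hypothetical helper: return the names that are unset or blank in env.
function missingEnvVars(env, names = REQUIRED_VARS) {
  return names.filter(name => !env[name] || env[name].trim() === '');
}

// Usage at the top of index.js, after dotenv has loaded:
// const missing = missingEnvVars(process.env);
// if (missing.length > 0) throw new Error(`Missing env vars: ${missing.join(', ')}`);
```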

Step 2: Build the PR Verification Core

Create verifier.js. It does not run tests itself; it reads the PR's mergeable state and the results of the GitHub check runs on its head commit:

import fetch from 'node-fetch';

const GITHUB_API = 'https://api.github.com';

// GET a GitHub API URL with the token from .env; throw on HTTP errors
// so a 401 or 404 is not mistaken for PR data.
async function githubGet(url) {
  const res = await fetch(url, {
    headers: {
      Authorization: `token ${process.env.GITHUB_TOKEN}`,
      Accept: 'application/vnd.github+json'
    }
  });
  if (!res.ok) throw new Error(`GitHub API returned ${res.status} for ${url}`);
  return res.json();
}

export async function verifyPR(prUrl) {
  // Parse owner/repo/PR number from URL ([^/]+ keeps each part to one path segment)
  const match = prUrl.match(/github\.com\/([^/]+)\/([^/]+)\/pull\/(\d+)/);
  if (!match) throw new Error('Invalid PR URL');

  const [, owner, repo, prNumber] = match;

  // Fetch PR details
  const pr = await githubGet(`${GITHUB_API}/repos/${owner}/${repo}/pulls/${prNumber}`);

  // Check mergeable status (null means GitHub has not computed it yet)
  if (pr.mergeable === false) {
    return {
      passed: false,
      evidence: { prTitle: pr.title, mergeConflict: true },
      summary: '❌ PR has merge conflicts'
    };
  }

  // Get check runs. The PR object has no checks_url field;
  // check runs are listed per commit, so query the PR's head SHA.
  const checks = await githubGet(
    `${GITHUB_API}/repos/${owner}/${repo}/commits/${pr.head.sha}/check-runs`
  );

  // Note: every() is true for an empty list, so a PR with no checks passes here
  const allPassed = checks.check_runs.every(
    run => run.conclusion === 'success'
  );

  return {
    passed: allPassed,
    evidence: {
      prTitle: pr.title,
      mergeable: pr.mergeable,
      checksPassed: allPassed,
      totalChecks: checks.check_runs.length,
      failedChecks: checks.check_runs
        .filter(r => r.conclusion !== 'success')
        .map(r => ({ name: r.name, status: r.conclusion }))
    },
    summary: allPassed
      ? `✅ ${checks.check_runs.length} checks passed`
      : `❌ Some checks failed`
  };
}
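
Requiring conclusion === 'success' for every run is strict: a check run that is still in progress has a null conclusion, and GitHub also reports conclusions such as skipped and neutral, which many teams do not treat as failures. Here is a minimal sketch of a more explicit classification. Accepting skipped and neutral is a policy assumption, and summarizeCheckRuns is a hypothetical helper, not something the original verifier defines.

```javascript
// Conclusions treated as non-failing. This set is an assumption; adjust it to your bounty rules.
const OK_CONCLUSIONS = new Set(['success', 'skipped', 'neutral']);

// Hypothetical helper: bucket the check_runs array from the GitHub check-runs response.
function summarizeCheckRuns(checkRuns) {
  const pending = checkRuns.filter(run => run.status !== 'completed');
  const completed = checkRuns.filter(run => run.status === 'completed');
  const failed = completed.filter(run => !OK_CONCLUSIONS.has(run.conclusion));

  return {
    total: checkRuns.length,
    pending: pending.map(run => run.name),
    failed: failed.map(run => ({ name: run.name, conclusion: run.conclusion })),
    // Pass only when at least one check exists, none are pending, and none failed.
    passed: checkRuns.length > 0 && pending.length === 0 && failed.length === 0
  };
}

// Example input shaped like entries of checks.check_runs
const summary = summarizeCheckRuns([
  { name: 'build', status: 'completed', conclusion: 'success' },
  { name: 'lint', status: 'completed', conclusion: 'skipped' },
  { name: 'e2e', status: 'in_progress', conclusion: null }
]);
console.log(summary);
```

Reporting pending runs separately lets the agent retry later instead of rejecting a PR whose CI simply has not finished.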

Step 3: Add the AI Analysis Layer

Next, create analyzer.js. It asks the model to judge the PR evidence against the bounty requirements and reply in JSON:

import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function analyzePR(prData, bountyRequirements) {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      {
        role: 'system',
        content: 'You are a QA bounty verifier. Analyze PRs against bounty requirements. Respond in JSON with: { passed: boolean, reasoning: string, evidence: object }'
      },
      {
        role: 'user',
        content: `Bounty requirements: ${bountyRequirements}\n\nPR data: ${JSON.stringify(prData, null, 2)}`
      }
    ],
    response_format: { type: 'json_object' }
  });

  return JSON.parse(response.choices[0].message.content);
}
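
JSON mode makes malformed replies unlikely, but the model can still omit a field or return the wrong type, and a bare JSON.parse would then either throw or pass bad data downstream. The sketch below shows one defensive way to handle that; parseVerdict is a hypothetical helper, and falling back to a failed verdict is a design assumption, not the only reasonable choice.

```javascript
// Hypothetical helper: parse and validate the model's JSON reply.
// Returns a failed verdict instead of throwing, so one bad reply does not crash the run.
function parseVerdict(raw) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    return { passed: false, reasoning: `Model reply was not valid JSON: ${err.message}`, evidence: {} };
  }

  // The system prompt asks for { passed: boolean, reasoning: string, evidence: object }.
  if (data === null || typeof data !== 'object' || typeof data.passed !== 'boolean') {
    return { passed: false, reasoning: 'Model reply is missing a boolean "passed" field', evidence: {} };
  }

  return {
    passed: data.passed,
    reasoning: typeof data.reasoning === 'string' ? data.reasoning : '',
    evidence: data.evidence !== null && typeof data.evidence === 'object' ? data.evidence : {}
  };
}

// In analyzer.js this could replace the final line:
// return parseVerdict(response.choices[0].message.content);
```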

Step 4: Wire Everything Together with TracePilot

Create verifier-agent.js. Each step runs inside a TracePilot span, so a failed run shows which step broke and with what inputs:

import { TracePilot } from 'tracepilot-sdk';
import { verifyPR } from './verifier.js';
import { analyzePR } from './analyzer.js';

const tp = new TracePilot(process.env.TRACEPILOT_API_KEY);

// Exported so index.js can import it
export async function verifyBountyPR(prUrl, bountyRequirements) {
  await tp.startTrace('bounty-verifier');

  // Step 1: Fetch and verify PR
  const { result: prResult, spanId: prSpan } = await tp.wrapOpenAI(
    () => verifyPR(prUrl),
    [{ role: 'user', content: `Verify PR: ${prUrl}` }],
    null,
    1
  );

  if (!prResult.passed) {
    return prResult; // Early exit with evidence
  }

  // Step 2: AI analysis of PR against requirements
  const { result: analysis, spanId: analysisSpan } = await tp.wrapOpenAI(
    () => analyzePR(prResult.evidence, bountyRequirements),
    [{ role: 'user', content: 'Analyze PR against bounty requirements' }],
    prSpan,  // Link to parent span
    2
  );

  // Step 3: Generate final report
  const { result: report } = await tp.wrapToolCall(
    'generate-report',
    () => generateReport(prUrl, prResult, analysis),
    analysisSpan,
    3,
    false  // Not destructive: only builds the report object
  );

  return report;
}

// prUrl is passed in explicitly because it is not otherwise in scope here
function generateReport(prUrl, prResult, analysis) {
  return {
    timestamp: new Date().toISOString(),
    prUrl,
    prTitle: prResult.evidence.prTitle,
    technicalChecks: prResult.passed ? '✅ All passing' : '❌ Some failed',
    aiAnalysis: analysis.passed ? '✅ Meets requirements' : '❌ Needs revision',
    evidence: {
      githubChecks: prResult.evidence,
      aiReasoning: analysis.reasoning
    },
    verdict: (prResult.passed && analysis.passed) 
      ? '✅ APPROVED — Ready for bounty payout'
      : '❌ REJECTED — See details above',
    bountyAmount: '300 MRG'
  };
}
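
To see what the parent/child span links provide without calling any external service, here is a small in-memory stand-in. It is not the TracePilot SDK and does not reproduce its API; createLocalTracer and its wrap method are hypothetical names used only to illustrate how each step can record its parent span, timing, and outcome.

```javascript
// Hypothetical in-memory tracer for illustration only (not the TracePilot SDK).
function createLocalTracer() {
  const spans = [];
  let nextId = 1;

  return {
    spans,
    // Run fn, record a span linked to parentSpanId, and return the result with the new span id.
    async wrap(name, fn, parentSpanId = null) {
      const span = { spanId: nextId++, name, parentSpanId, startedAt: Date.now() };
      spans.push(span);
      try {
        const result = await fn();
        span.status = 'ok';
        return { result, spanId: span.spanId };
      } catch (err) {
        span.status = 'error';
        span.error = err.message;
        throw err;
      } finally {
        span.durationMs = Date.now() - span.startedAt;
      }
    }
  };
}

// Usage: chain two steps so the second span points at the first.
async function demo() {
  const tracer = createLocalTracer();
  const first = await tracer.wrap('verify-pr', async () => ({ passed: true }));
  await tracer.wrap('analyze-pr', async () => ({ passed: true }), first.spanId);
  return tracer.spans;
}

demo().then(spans => console.log(spans.map(s => `${s.name} (parent: ${s.parentSpanId})`)));
```

A stand-in like this is also handy in unit tests, where the pipeline logic can be exercised without network access or API keys.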

Step 5: Run It

// index.js
import 'dotenv/config';
import { verifyBountyPR } from './verifier-agent.js';

const prUrl = 'https://github.com/mergeos-bounties/mergeos/pull/64';
const requirements = `
- PR must pass all CI checks
- Code must be well-documented
- Must include test coverage
- No merge conflicts
`;

verifyBountyPR(prUrl, requirements)
  .then(report => console.log(JSON.stringify(report, null, 2)))
  .catch(console.error);

Then run it:

node index.js
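
If the verifier runs in CI, it helps to turn the result into a process exit code so that a rejected PR fails the job. A minimal sketch, assuming the report shape from Step 4 (a verdict string that starts with ✅ on approval) and the early-exit shape from verifyPR (a boolean passed); exitCodeFor is a hypothetical helper.

```javascript
// Hypothetical helper: map a verification result to a process exit code.
function exitCodeFor(result) {
  if (typeof result.verdict === 'string') {
    return result.verdict.startsWith('✅') ? 0 : 1;
  }
  // Early-exit results carry a boolean passed instead of a verdict string.
  return result.passed === true ? 0 : 1;
}

// Usage in index.js:
// verifyBountyPR(prUrl, requirements)
//   .then(report => { process.exitCode = exitCodeFor(report); });
```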

Adding Observability

You already have TracePilot wired in. Here's what happens:

  1. Every step is traced — PR fetch, AI analysis, report generation
  2. Token costs tracked — See exactly how much each AI call cost
  3. Time-travel debugging — If the AI analysis fails, open the dashboard, fork the trace at step 2, edit the prompt, and replay
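
Token cost is plain arithmetic over the usage field that the OpenAI chat completions response returns (prompt_tokens and completion_tokens). The sketch below shows the calculation; the per-million-token prices are placeholder values, not real pricing, and estimateCostUsd is a hypothetical helper rather than anything TracePilot exposes.

```javascript
// Placeholder prices in USD per one million tokens. Replace with your model's current pricing.
const EXAMPLE_PRICES = { inputPerMillion: 1.0, outputPerMillion: 2.0 };

// usage has the shape of response.usage from openai.chat.completions.create
function estimateCostUsd(usage, prices = EXAMPLE_PRICES) {
  const inputCost = (usage.prompt_tokens / 1e6) * prices.inputPerMillion;
  const outputCost = (usage.completion_tokens / 1e6) * prices.outputPerMillion;
  return inputCost + outputCost;
}

const cost = estimateCostUsd({ prompt_tokens: 1200, completion_tokens: 300 });
console.log(`Estimated cost: $${cost.toFixed(6)}`);
```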

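To trace an extra step, wrap it in a tool call. In the snippet below, parentSpanId stands for the spanId returned by whichever step came before: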

// Before — standard logging
console.log('PR verified:', prResult);

// After — trace it
const { spanId } = await tp.wrapToolCall('log-verification', 
  () => console.log('PR verified:', prResult),
  parentSpanId,
  4
);

Open tracepilotai.com/dashboard — every verification run appears as a structured trace. Click into any span to see exact inputs, outputs, and timing.

Next Steps

You got this. Here's what to do next:

  1. Add more checks — Clone the repo, run tests locally, verify diff size
  2. Add bounty evidence capture — Screenshot passing checks, save test output as artifacts
  3. Build a web dashboard
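
As one example of an extra check, diff size can be verified from fields the verifier already fetches: the single-PR response from the GitHub API includes additions, deletions, and changed_files. This is a minimal sketch; the limits are made-up defaults and checkDiffSize is a hypothetical helper.

```javascript
// Hypothetical limits; pick values that match your bounty rules.
const DEFAULT_LIMITS = { maxChangedLines: 500, maxChangedFiles: 20 };

// pr is the object returned by GET /repos/{owner}/{repo}/pulls/{number}
function checkDiffSize(pr, limits = DEFAULT_LIMITS) {
  const changedLines = pr.additions + pr.deletions;
  const problems = [];

  if (changedLines > limits.maxChangedLines) {
    problems.push(`${changedLines} changed lines exceeds limit of ${limits.maxChangedLines}`);
  }
  if (pr.changed_files > limits.maxChangedFiles) {
    problems.push(`${pr.changed_files} changed files exceeds limit of ${limits.maxChangedFiles}`);
  }

  return { passed: problems.length === 0, changedLines, changedFiles: pr.changed_files, problems };
}

const sizeCheck = checkDiffSize({ additions: 420, deletions: 130, changed_files: 7 });
console.log(sizeCheck);
```

The returned problems array can be merged into the evidence object so the final report explains why an oversized PR was flagged.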

Debugging AI agents shouldn't feel like reading The Matrix.
Join other engineers who are building reliable autonomous workflows in our community: TracePilot Discord
