Bringing AI Agents to CI/CD: Using ToolHive and Buildkite to Bring Intelligence to Vulnerability Scanning

Continuous Integration and Continuous Deployment (CI/CD) pipelines have traditionally relied on deterministic scripts and predefined workflows, offering predictable results. What if your CI/CD pipeline could think, analyze, and make intelligent decisions? What if it could adapt to complex scenarios, understand context, and provide insights beyond simple pass/fail results?

This is where agentic workflows come in. With the new ToolHive Buildkite Plugin, you can now seamlessly integrate AI agents into your CI/CD pipelines using the Model Context Protocol (MCP).

Use Case: Agentic Vulnerability Scanning

Traditional CI/CD treats all issues equally. A medium-severity CVE gets the same treatment whether it's in a critical path or an unused dependency. Agents change this by understanding:

Context: Where and how a vulnerability can be exploited in your specific architecture.
Impact: The actual risk to your application, not just a generic score.
Remediation: Specific steps that work for your codebase, not generic advice.

Instead of a binary pass/fail, you get nuanced analysis with actionable recommendations. The agent doesn't just tell you there's a problem – it explains why it matters and how to fix it.

In the context of CI/CD, this means your pipeline can become more than just a series of tests – it becomes an intelligent system that understands your codebase, security posture, and deployment requirements. The technology stack for this approach: MCP, ToolHive, and Buildkite.

ToolHive: The MCP Engine

ToolHive is your starting point for running MCP in production. It handles:

Server Lifecycle: Starting, stopping, and managing MCP server instances.
Transport Methods: Supporting multiple communication protocols (stdio, SSE, streamable-http).
Security: Managing secrets, permissions, and isolation.
Discovery: Providing a registry of available MCP servers.

Buildkite: The CI/CD Platform

Buildkite provides a flexible, scalable CI/CD platform that is well suited to running agentic workflows because of its:

Plugin Architecture: Extensible system for adding functionality.
Container Support: Native Docker/Podman integration.
Parallel Execution: Ability to run multiple agents simultaneously.
Artifact Management: Built-in support for storing and sharing results.

How the ToolHive Buildkite Plugin Works

To bring agentic vulnerability scanning to life, we created the ToolHive Buildkite Plugin. It bridges these technologies and lets you spawn MCP servers directly in your CI/CD pipeline. Here's how it works:

Plugin Configuration

In your Buildkite pipeline, you simply add the plugin to any step:

steps:
  - label: "🔍 Security Analysis"
    command: "run-security-scan"
    plugins:
      - StacklokLabs/toolhive#v0.0.2:
          server: "osv"  # OSV vulnerability database server
          transport: "sse"
          proxy-port: 8080

Automatic MCP Server Provisioning

When the pipeline runs, the plugin:

Downloads ToolHive if not already available.
Spawns the MCP server in a containerized environment.
Configures networking to make the server accessible to your agent.
Manages the server lifecycle, ensuring proper cleanup after execution.
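
Since the server is provisioned before your command runs, an agent may want to confirm the proxy port is reachable before connecting. The following is a minimal sketch, assuming the proxy listens on localhost at the `proxy-port` from the configuration above; `wait_for_port` is a hypothetical helper for illustration, not part of ToolHive or the plugin.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not reachable yet (refused or timed out); wait and retry.
            time.sleep(interval)
    return False
```

An agent could call `wait_for_port("localhost", 8080)` at startup and exit with a clear error if it returns False, rather than failing later on the first tool call.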

Agent Connection

Your AI agent can then connect to the MCP server and use its tools. Here's a snippet from a simple Python agent using the PydanticAI framework:

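import asyncio

from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerSSE

# Placeholder model name; substitute whichever model your agent is configured for
model = "anthropic:claude-3-5-sonnet-latest"

# Connect to the MCP server spawned by the plugin
osv_server = MCPServerSSE(url="http://localhost:8080/sse")

# Create an agent with access to OSV tools
agent = Agent(
    model=model,
    mcp_servers=[osv_server],
    system_prompt="You are a security analyst...",
)


async def main():
    # Keep the MCP connection open for the duration of the run
    # (run_mcp_servers applies to PydanticAI releases that accept mcp_servers)
    async with agent.run_mcp_servers():
        # The agent can now use OSV tools to analyze vulnerabilities
        result = await agent.run("Analyze security vulnerabilities...")
    return result


asyncio.run(main())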

Real-World Example: Intelligent Vulnerability Scanning

Let's look at a concrete example from our demo repository that showcases agentic vulnerability scanning by comparing a traditional approach with an agentic one.

Traditional CI/CD Security Scanning:

# Run a vulnerability scanner
osv-scanner --json > results.json

# Check if any high severity vulnerabilities exist
if grep -q '"severity": "HIGH"' results.json; then
  echo "High severity vulnerabilities found!"
  exit 1
fi

This approach is limited in several ways: it makes a binary pass/fail decision, gives no context or explanation, applies no intelligent categorization, and offers no actionable recommendations. In short, it’s functional, but not very useful.

Agentic Security Scanning:

steps:
  - label: "🔍 Intelligent Vulnerability Analysis"
    command: |
      uv run buildkite-demo-agent \
        --packages-file examples/packages.json \
        --fail-on-vulnerabilities \
        --severity-threshold high
    plugins:
      - StacklokLabs/toolhive#v0.0.2:
          server: "osv"
          transport: "sse"

This agent provides considerably more value. It uses Claude Code to understand the context of each vulnerability, classifies the severity of CVEs based on actual impact rather than CVSS scores alone, and offers detailed explanations with specific remediation steps.
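
The `--fail-on-vulnerabilities` and `--severity-threshold` flags in the step above suggest a gate that turns the agent's assessment into a build result. The demo agent's actual logic isn't shown here, so the following is only a sketch under assumptions: the severity labels, their ordering, and the `meets_threshold` and `exit_code` helpers are all hypothetical.

```python
# Ordered from least to most severe. These labels are an assumption for this sketch.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def meets_threshold(severity: str, threshold: str) -> bool:
    """True if the assessed severity is at or above the configured threshold."""
    return SEVERITY_ORDER.index(severity.lower()) >= SEVERITY_ORDER.index(threshold.lower())


def exit_code(severities: list[str], threshold: str, fail_on_vulnerabilities: bool = True) -> int:
    """Map the agent's assessed severities to a process exit code for the CI step."""
    if not fail_on_vulnerabilities:
        # Report-only mode: never fail the build.
        return 0
    return 1 if any(meets_threshold(s, threshold) for s in severities) else 0
```

The point of gating on the agent's assessed severity, rather than grepping for a raw score, is that a finding the agent downgrades (for example, in an unused dependency) no longer blocks the build, while one it upgrades does.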

The Agent's Intelligence in Action

Here's what happens when the agent analyzes a vulnerability:

Query OSV Database: The agent uses MCP tools to query vulnerability data.
Contextual Analysis: Claude analyzes the vulnerability considering:
- The specific package and version
- The type of vulnerability (RCE, DoS, data exposure)
- The ecosystem and common usage patterns
Intelligent Categorization: Instead of relying solely on CVSS scores, the agent considers:
- Exploitability in your environment
- Actual impact on your application
- Availability of patches or workarounds
Structured Output: Returns actionable information.
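
One way to picture that structured output is as a small record per vulnerability that a later pipeline step can read. This is a hypothetical shape, not the demo agent's actual schema: the `VulnerabilityReport` class and its field names are assumptions drawn from the factors listed above.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class VulnerabilityReport:
    """Hypothetical per-vulnerability result; field names are illustrative only."""
    vuln_id: str
    package: str
    version: str
    vuln_type: str          # e.g. "RCE", "DoS", "data exposure"
    assessed_severity: str  # the agent's judgment, not just the CVSS score
    rationale: str          # why it matters in this environment
    patch_available: bool
    remediation: list[str] = field(default_factory=list)


def to_json(report: VulnerabilityReport) -> str:
    """Serialize a report so it can be stored as a build artifact."""
    return json.dumps(asdict(report), indent=2)
```

Serializing to JSON keeps the result easy to upload as a build artifact or annotate onto the build, while the `rationale` and `remediation` fields carry the explanation that a plain pass/fail check lacks.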

Conclusion

There’s a lot of energy around MCP, and we need to channel that into putting the protocol to work in production environments. Agentic vulnerability scanning is a compelling use case and it hints at the potential of agentic workflows in CI/CD pipelines. If you want to give this a try, we created the ToolHive Buildkite Plugin to be simple and secure. Of course, we also encourage you to check out ToolHive for other MCP use cases; you can explore our GitHub repo or reach out directly via Discord.