That’s exactly why I started experimenting with block/goose as an AI-powered SRE agent.
In this post, I’ll explain how I configured Goose to behave like a real SRE: querying AWS CloudWatch, reasoning about incidents, and operating inside a fully reproducible Nix environment.
What is block/goose?
block/goose is an autonomous agent framework designed to execute workflows using LLMs plus external tools. Think of it as a CLI-native AI operator that can:
- Call tools (APIs, CLIs, MCP servers)
- Maintain session context
- Execute multi-step reasoning loops
What makes Goose interesting for SRE is that it does not try to replace tools — it orchestrates them.
Why an AI SRE Agent?
A real SRE spends most of their time doing:
- Investigations across logs, metrics, and alarms
- Repeating the same diagnostic steps
- Cross-referencing infra state with recent changes
- Reducing operational toil
This repo turns Goose into an agent that can:
- Query CloudWatch logs, metrics, and alarms
- Reason about AWS infrastructure health
- Follow structured incident workflows
- Act as a first-response investigator
This is not ChatOps fluff — it’s an operational assistant.
Architecture Overview
The setup has three pillars:
- Goose as the reasoning engine
- MCP servers as tool providers
- Nix Flakes as the environment orchestrator
Goose (LLM Agent)
├── CloudWatch MCP Server
├── AWS CLI
├── GitHub MCP (optional)
└── Local tooling (jq, httpie, docker, task)
Everything runs inside a reproducible dev shell.
The Reproducible Environment (Nix)
Just like all my other projects, this one is driven by a flake.nix.
Why? Because SRE tooling is fragile:
- AWS CLI versions
- Node vs Python tooling
- MCP servers via uvx or npx
Nix eliminates all of that drift.
Once inside the shell, you already have:
- AWS credentials
- Goose CLI
- CloudWatch tooling
- JSON processing tools
No README ritual. No “install this first”.
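For orientation, here is a minimal sketch of what such a flake.nix can look like. This is my own approximation, not the repo's actual flake; the package names (goose-cli, awscli2, uv, nodejs, jq, httpie, docker, go-task) are nixpkgs attributes I would expect to use, and goose-cli in particular may need pinning or an overlay depending on your channel.

# flake.nix (sketch, not the repo's actual flake)
{
  description = "Reproducible SRE agent shell: goose + AWS + MCP tooling";

  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs }:
    let
      system = "x86_64-linux";
      # Unfree packages are unlocked via NIXPKGS_ALLOW_UNFREE=1 plus --impure (see below),
      # so nothing is hard-coded here.
      pkgs = import nixpkgs { inherit system; };
    in {
      devShells.${system}.default = pkgs.mkShell {
        packages = with pkgs; [
          goose-cli # Block's goose agent; pin or overlay it if your nixpkgs lacks this attribute
          awscli2   # AWS CLI v2 for credentials and ad-hoc queries
          uv        # provides uvx, which launches the CloudWatch MCP server
          nodejs    # provides npx, which launches the GitHub and Sentry MCP servers
          jq        # JSON processing
          httpie    # quick HTTP checks
          docker    # docker CLI; the daemon socket comes from the host
          go-task   # task runner ("task" on the PATH)
        ];
      };
    };
}

The uv and nodejs packages matter because the MCP servers configured below are launched with uvx and npx respectively.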
Configuring Goose as an SRE Agent
The heart of the system is the Goose configuration.
Provider & Model
The model name, the MCP server tokens, and one Nix switch are supplied as environment variables:
NIXPKGS_ALLOW_UNFREE=1
GOOGLE_GEMINI_MODEL_NAME=gemini-2.5-flash
GITHUB_PERSONAL_ACCESS_TOKEN=
SENTRY_ACCESS_TOKEN=
Gemini 2.5 Flash is fast, cheap, and good at tool orchestration, which makes it a good fit for SRE tasks.
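How these variables reach Goose is not shown above. One common pattern, and an assumption on my part rather than this repo's actual wiring, is to have the dev shell source an untracked .env file from its shellHook:

# sketch: exporting the settings above inside the dev shell (assumed pattern, not the repo's wiring)
{ pkgs ? import <nixpkgs> { } }:
pkgs.mkShell {
  shellHook = ''
    if [ -f .env ]; then
      set -a        # auto-export every variable sourced next
      source .env   # GOOGLE_GEMINI_MODEL_NAME, GITHUB_PERSONAL_ACCESS_TOKEN, SENTRY_ACCESS_TOKEN, ...
      set +a
    fi
  '';
}

Keeping .env untracked keeps the tokens out of git and out of the world-readable Nix store.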
CloudWatch as a First-Class Tool
This is where Goose becomes useful.
CloudWatch MCP Extension
GOOSE_PROVIDER: google
GOOSE_MODEL: gemini-2.5-flash
extensions:
  cloudwatch:
    enabled: true
    type: stdio
    name: cloudwatch
    description: AWS CloudWatch Observability (metrics/logs/alarms) via awslabs.cloudwatch-mcp-server
    cmd: uvx
    args:
      - awslabs.cloudwatch-mcp-server@latest
    envs:
      region: us-east-1
      profile: default
      FASTMCP_LOG_LEVEL: ERROR
    env_keys: []
    timeout: 30000
    bundled: null
    available_tools: []
  github:
    enabled: true
    type: stdio
    name: github
    description: GitHub MCP Server
    cmd: npx
    args:
      - -y
      - "@modelcontextprotocol/server-github"
    timeout: 30000
  sentry:
    enabled: true
    type: stdio
    name: sentry
    description: Sentry MCP Server
    cmd: npx
    args:
      - -y
      - mcp-remote@latest
      - https://mcp.sentry.dev/mcp
    timeout: 30000
This gives the agent the ability to:
- Fetch metrics
- Query logs
- Inspect alarms
- Reason over time-series data
This is the same data an SRE would manually inspect — just faster.
How Goose Behaves Like an SRE
Instead of prompting Goose with generic questions, I treat it like a teammate.
Examples:
- “Investigate elevated 5xx errors in the last 30 minutes.”
- “Check if latency correlates with a deploy.”
- “Summarize CloudWatch alarms for this service.”
Goose:
- Chooses the right tool
- Queries CloudWatch
- Interprets the output
- Produces a human-readable diagnosis
No dashboards. No clicking. Just answers.
Why This Works
1. Tools, Not Plugins
Goose doesn’t hallucinate metrics — it queries real AWS APIs.
2. Reproducibility
Anyone can clone the repo and get the same SRE agent.
3. Composability
You can add:
- GitHub MCP (for correlating PRs)
- PagerDuty
- Terraform state inspection
- Custom runbooks
The agent evolves with your infrastructure.
Entering the Environment
nix --extra-experimental-features 'nix-command flakes' develop --impure
Why --impure?
- AWS credentials
- Docker socket
- Local MCP binaries
This is intentional and controlled.
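Concretely, pure flake evaluation cannot see the caller's environment: builtins.getEnv returns an empty string for every variable, which is why the NIXPKGS_ALLOW_UNFREE=1 switch from earlier, the active AWS profile, and similar host state only take effect under --impure. A small illustration (my own sketch, not code from the repo):

# why-impure.nix (illustration only)
# In pure evaluation, builtins.getEnv always returns "".
# With --impure it reads the invoking user's environment, e.g.:
#   nix eval --impure --expr 'builtins.getEnv "AWS_PROFILE"'
let
  awsProfile = builtins.getEnv "AWS_PROFILE"; # "" unless evaluated with --impure
in
if awsProfile == "" then
  "host environment hidden (pure evaluation)"
else
  "using AWS profile: ${awsProfile}"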
What This Is Not
- ❌ A replacement for on-call engineers
- ❌ A magic “fix production” bot
- ❌ Another ChatOps toy
This is:
- A first-response investigator
- A context-gathering machine
- A force multiplier for SREs
Conclusion
SRE work is pattern-based, repetitive, and highly procedural.
That makes it perfect for AI agents — as long as they:
- Use real tools
- Run in real environments
- Respect operational boundaries
With block/goose, MCP servers, and Nix, you get exactly that.
If you’re curious about AI agents that actually do ops, this is a solid place to start.
