For a long time, my “performance engineering workflow” as a Tech Lead looked like this:
- Log into New Relic
- Run a handful of NRQL queries
- Inspect slow transactions and error traces
- Map issues back to Drupal code
- Estimate effort and impact
- Create JIRA tickets with enough context
- Post updates in Teams
It’s valuable work, but it is also repetitive, mechanical, and very interrupt‑friendly. It was quietly costing me 2–3 hours every week.
So I automated it.
This post walks through the workflow I built using Claude Code, the New Relic MCP, Jira, and Microsoft Teams to:
- Continuously analyze performance and error data
- Generate structured root cause analysis with code references
- Create and prioritize Jira tickets
- Notify the team with severity‑specific alerts
- Optionally draft pull requests for straightforward fixes
Why This Was Worth Automating
As a Tech Lead on a large Drupal platform, my time is best spent on:
- Architecture and design decisions
- Reviewing high‑impact changes
- Mentoring and unblocking engineers
- Shaping priorities with product and leadership
But performance issues don’t care about calendars.
Every incident or regression forced me back into the same manual loop: query New Relic, decipher traces, reverse‑engineer root causes, and turn them into actionable tickets. It was important, but it wasn’t leverage.
The workflow in this post exists to do one thing: remove the mechanical part of performance engineering while keeping the judgment and risk decisions in human hands.
High‑Level Architecture
At a high level, the system looks like this:
- New Relic collects APM metrics, traces, and error data from our Drupal application.
- Claude Code (via the New Relic MCP) pulls that data, analyzes it, and decides what’s worth acting on.
- Jira receives structured issues with metrics, root causes, effort estimates, and links back to New Relic.
- Microsoft Teams gets severity‑color‑coded notifications so the right people see the right issues at the right time.
- GitHub (optionally) receives draft pull requests for straightforward fixes the AI can safely propose.
This architecture keeps responsibilities clear:
- New Relic is the source of truth.
- AI is used for interpretation and orchestration.
- Jira and Teams are where work and communication actually happen.
- Humans stay firmly in the decision loop.
The Workflow End‑to‑End
Phase 1: Data Collection from New Relic
On demand (or via a scheduled script), the workflow starts with a single instruction, e.g.:
“Execute New Relic performance analysis for production last 1 hour”
Behind the scenes, Claude Code uses the New Relic MCP to run a focused set of NRQL queries:
- Slow transactions: endpoints with response time above a threshold
- Error rates: exception types, messages, and affected routes
- Database performance: slow queries and N+1‑style patterns
- Open incidents and application health: alerts, Apdex, throughput
The goal here is not to recreate the entire dashboard - it’s to pull just enough data for a useful decision.
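For a rough sense of what that query set looks like, here is an illustrative version of it in Python. The app name, thresholds, and time window are placeholders, not our real configuration:

```python
# Illustrative NRQL queries the analysis step runs via the New Relic MCP.
# 'my-drupal-app' and the numeric thresholds are placeholders.
NRQL_QUERIES = {
    "slow_transactions": """
        SELECT average(duration), percentile(duration, 95), count(*)
        FROM Transaction
        WHERE appName = 'my-drupal-app' AND duration > 2
        FACET name SINCE 1 hour ago
    """,
    "error_rates": """
        SELECT count(*)
        FROM TransactionError
        WHERE appName = 'my-drupal-app'
        FACET error.class, transactionName SINCE 1 hour ago
    """,
    "database_time": """
        SELECT average(databaseDuration), percentile(databaseDuration, 95)
        FROM Transaction
        WHERE appName = 'my-drupal-app'
        FACET name SINCE 1 hour ago
    """,
    "health": """
        SELECT apdex(duration, t: 0.5), rate(count(*), 1 minute)
        FROM Transaction
        WHERE appName = 'my-drupal-app' SINCE 1 hour ago
    """,
}
```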
Phase 2: AI‑Powered Root Cause Analysis
Claude Code then takes that raw telemetry and turns it into something my team can act on:
- Groups slow transactions into meaningful units (e.g., /reports/latest, specific admin pages)
- Connects issues to Drupal modules, controllers, or custom code paths
- Distinguishes between:
  - one‑off spikes vs. consistent degradation
  - user‑facing vs. admin‑only issues
  - backend jobs vs. interactive requests
- Hypothesizes root causes:
  - N+1 queries
  - missing cache tags/contexts
  - heavy external API calls
  - misconfigured database access patterns
The important part: this analysis is always explainable. If the suggestion is wrong, it is wrong in a way that is obvious when you read the ticket.
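For context, what this phase hands to the later phases is essentially one small structured record per issue. A sketch of that shape, with illustrative field names rather than our exact schema:

```python
from dataclasses import dataclass, field

@dataclass
class PerformanceFinding:
    """One analyzed issue, as passed from the analysis step to ticketing.

    Field names are illustrative; the real contract lives in the prompt/output
    agreement between Claude Code and the downstream scripts.
    """
    transaction: str              # e.g. "/reports/latest"
    pattern: str                  # "one-off spike" or "consistent degradation"
    audience: str                 # "user-facing", "admin-only", "background job"
    suspected_cause: str          # e.g. "N+1 query in a custom views handler"
    affected_code: list[str] = field(default_factory=list)  # modules / file paths
    evidence: dict = field(default_factory=dict)            # metrics backing the hypothesis
```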
Phase 3: Priority, Impact, and Story Points
Performance work always competes with feature work, so the system needs to express impact in a language the team understands.
The workflow classifies each issue as Critical, High, or Medium based on thresholds like:
- Response time ranges
- Error rate percentages
- Whether the issue is user‑visible
- Whether functionality is partially or fully degraded
From there, it estimates story points (1/2/3/5/8) using a simple heuristic:
- Complexity (single query tweak vs. cross‑module change)
- Scope (one endpoint vs. a subsystem)
- Risk (low‑risk cache change vs. behavior‑changing refactor)
- Effort (hours vs. days)
These are not perfect, but they are consistent – which is often more useful than “perfect but ad hoc”.
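A minimal sketch of that classification and sizing logic, with made‑up thresholds standing in for the real ones:

```python
def classify_priority(avg_response_s: float, error_rate_pct: float, user_visible: bool) -> str:
    """Map raw metrics to a priority. Thresholds here are illustrative only."""
    if error_rate_pct >= 5 or (user_visible and avg_response_s >= 5):
        return "Critical"
    if error_rate_pct >= 1 or avg_response_s >= 2:
        return "High"
    return "Medium"

def estimate_story_points(complexity: int, scope: int, risk: int) -> int:
    """Pick a 1/2/3/5/8 point value from rough 1 (low) to 3 (high) scores per axis."""
    score = complexity + scope + risk
    for points, ceiling in [(1, 3), (2, 4), (3, 6), (5, 8)]:
        if score <= ceiling:
            return points
    return 8
```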
Phase 4: Jira Ticket Generation
For every actionable issue, the workflow calls the Jira API to create a ticket in the current active sprint.
Each ticket includes:
- A descriptive title (e.g. [Performance] /reports/latest endpoint – 650% response time increase)
- A summary of the issue and affected environment
- Metrics:
  - average and p95 response time
  - error rate
  - timeframe and estimated impact
- Root cause analysis in plain language
- Affected code (modules, file paths, and line numbers where possible)
- Suggested fix (cache changes, query optimizations, config sync, etc.)
- Deep links back to the relevant New Relic views
This is the difference between “we should look into that spike” and “here is an actionable story ready for a sprint board”.
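Under the hood, ticket creation is a plain Jira Cloud REST API v3 call. A stripped‑down sketch, where the project key, priority names, and credential variables are placeholders for your own setup (story points usually live in an instance‑specific custom field, omitted here):

```python
import os
import requests

def create_performance_ticket(summary: str, description_text: str, priority: str) -> str:
    """Create a Jira issue via the Cloud REST API v3 and return its key.

    JIRA_BASE_URL / JIRA_EMAIL / JIRA_API_TOKEN and the "PERF" project key
    are placeholders; priority names must match your Jira priority scheme.
    """
    payload = {
        "fields": {
            "project": {"key": "PERF"},
            "issuetype": {"name": "Story"},
            "summary": summary,
            "priority": {"name": priority},
            # API v3 descriptions use Atlassian Document Format (ADF).
            "description": {
                "type": "doc",
                "version": 1,
                "content": [
                    {"type": "paragraph",
                     "content": [{"type": "text", "text": description_text}]}
                ],
            },
        }
    }
    resp = requests.post(
        f"{os.environ['JIRA_BASE_URL']}/rest/api/3/issue",
        json=payload,
        auth=(os.environ["JIRA_EMAIL"], os.environ["JIRA_API_TOKEN"]),
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["key"]
```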
Phase 5: Teams Notifications and Optional PRs
Once tickets are created, the workflow posts an Adaptive Card to the right Teams channel.
Every card includes:
- Short description of the issue
- Key metrics (response time, error rate, environment)
- Link to the Jira ticket (and through that, to New Relic)
- Story points and priority
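Posting the card itself is a single call to a Teams incoming webhook. A minimal sketch, assuming a TEAMS_WEBHOOK_URL environment variable and a severity‑to‑color mapping of your choosing:

```python
import os
import requests

# Adaptive Card text colors used to signal severity; the mapping is a choice, not a standard.
SEVERITY_COLORS = {"Critical": "attention", "High": "warning", "Medium": "good"}

def notify_teams(title: str, metrics_line: str, jira_url: str, priority: str) -> None:
    """Post a severity-colored Adaptive Card to a Teams incoming webhook."""
    card = {
        "type": "AdaptiveCard",
        "$schema": "http://adaptivecards.io/schemas/adaptive-card.json",
        "version": "1.4",
        "body": [
            {"type": "TextBlock", "text": title, "weight": "Bolder",
             "color": SEVERITY_COLORS.get(priority, "default")},
            {"type": "TextBlock", "text": metrics_line, "wrap": True},
            {"type": "TextBlock", "text": f"Priority: {priority}", "wrap": True},
        ],
        "actions": [{"type": "Action.OpenUrl", "title": "Open Jira ticket", "url": jira_url}],
    }
    payload = {
        "type": "message",
        "attachments": [{
            "contentType": "application/vnd.microsoft.card.adaptive",
            "content": card,
        }],
    }
    requests.post(os.environ["TEAMS_WEBHOOK_URL"], json=payload, timeout=30).raise_for_status()
```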
For certain well‑scoped cases (like adding cache metadata or adjusting a specific query), there is also an optional step:
“Yes, attempt to fix this issue”
When I explicitly opt in, Claude Code will:
- Read the relevant files
- Propose a code change
- Run local commands/tests where available
- Open a draft PR linked to the Jira ticket
Nothing merges automatically. Manual review and CI are still required.
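Claude Code drives this step interactively, but the end result is an ordinary draft pull request. For reference, the equivalent GitHub REST call looks roughly like this (the repository, base branch, and token variable are placeholders):

```python
import os
import requests

def open_draft_pr(branch: str, jira_key: str, title: str, body: str) -> str:
    """Open a draft pull request against main and return its URL.

    "my-org/my-drupal-repo" and GITHUB_TOKEN are placeholders for your own setup.
    """
    resp = requests.post(
        "https://api.github.com/repos/my-org/my-drupal-repo/pulls",
        headers={
            "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
            "Accept": "application/vnd.github+json",
        },
        json={
            "title": f"[{jira_key}] {title}",
            "head": branch,
            "base": "main",
            "body": body,
            "draft": True,  # draft PRs require review before they can merge
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["html_url"]
```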
Guardrails and Risk Management
The part that made this usable in a real production environment was not “more automation”; it was more guardrails.
Some of the key ones:
- No confidential data in prompts or artifacts
  - No Jira IDs in public logs or documentation
  - No customer identifiers or business metrics
  - No secrets, URLs, or internal hostnames
- New Relic as source of truth
  - The AI analyzes existing metrics; it does not invent data
- Human‑in‑the‑loop by design
  - Every ticket is reviewed before the team sees it
  - PRs are suggestions, not actions
- Opinionated thresholds
  - Hard lines for Critical / High / Medium to avoid alert fatigue
  - The system prefers fewer, higher‑quality tickets to a noisy firehose
These are boring details, but they are also the difference between “cool demo” and “thing we actually trust”.
ROI: What This Changed in Practice
In terms of time:
- Manual performance checks and ticket creation went from ~2–3 hours/week down to about 15 minutes of review.
- Incident response is faster because there is less friction between “we saw something weird in New Relic” and “there is a ticket with a clear owner and plan”.
In terms of quality:
- We miss fewer issues, especially slow degradations.
- Tickets come with better context and suggested fixes.
- Standups focus more on trade‑offs (“Do we take this now or next sprint?”) and less on “What exactly is going on?”.
And in terms of team dynamics:
- Developers can pick up performance work without having to live in New Relic.
- The platform feels more “observed” without feeling more “surveilled”.
Implementation Notes
At a high level, this stack uses:
- Application: Drupal on Acquia
- Monitoring: New Relic APM, NRQL queries, incidents
- AI: Claude Code with the New Relic MCP
- Ticketing: Jira Cloud (REST API v3)
- Communication: Microsoft Teams (incoming webhooks)
- Version Control: GitHub (for optional PR automation)
From a configuration perspective, the main work is:
- Wiring the New Relic account and NRQL queries into the MCP
- Wiring Jira, Teams, and GitHub credentials through environment variables
- Defining thresholds and ticket templates that reflect your actual workflow
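As a rough picture, the glue around the MCP ends up reading configuration along these lines (variable names are mine, not a standard):

```python
import os

# Hypothetical environment variables the glue scripts read.
# Names are illustrative; use whatever your secrets management dictates.
CONFIG = {
    "new_relic_account_id": os.environ["NEW_RELIC_ACCOUNT_ID"],
    "new_relic_api_key": os.environ["NEW_RELIC_API_KEY"],
    "jira_base_url": os.environ["JIRA_BASE_URL"],
    "jira_email": os.environ["JIRA_EMAIL"],
    "jira_api_token": os.environ["JIRA_API_TOKEN"],
    "teams_webhook_url": os.environ["TEAMS_WEBHOOK_URL"],
    "github_token": os.environ.get("GITHUB_TOKEN", ""),  # only needed for optional PRs
    "slow_transaction_threshold_s": float(os.environ.get("SLOW_TXN_THRESHOLD_S", "2")),
    "error_rate_critical_pct": float(os.environ.get("ERROR_RATE_CRITICAL_PCT", "5")),
}
```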
The exact code in my setup is specific to our environment and not open‑sourced (yet), but the pattern is portable to any stack where you have:
- Structured telemetry
- A programmable AI agent
- A ticketing system
- A chat/notification channel
Credits
This workflow only exists because of the people around me.
- Shannon Lal, our CTO, pushed us to adopt Claude Code and gave me the runway to experiment with this in a real system.
- Maria Parra Pino and Ruslana Zagrai, two of the developers on the team, were the ones stress‑testing the idea, calling out edge cases, and helping refine it into something that is actually usable day‑to‑day.
And of course, thanks to New Relic for building a platform and MCP integration that made it possible to treat APM data as something to automate against, not just stare at.
When This Pattern Makes Sense
In my experience, this kind of automation works well when:
- The workflow is repeatable and well‑understood.
- The inputs are observable and reliable (like APM telemetry).
- The outputs can be expressed as structured work (tickets, PRs, notifications).
- You are willing to keep humans in the loop.
If you’re running New Relic and Jira already, the leap from “manual checks” to “AI‑assisted performance engineering” is more about design and guardrails than about exotic technology.
If you end up building a variant of this, I’d genuinely love to hear what worked, what broke, and what you did differently.

