Samson Tanimawo

How AI Is Changing SRE Workflows (Without Replacing SREs)

I get asked this question a lot: 'Is AI going to replace SREs?' Short answer: no. Long answer: AI is changing what SREs spend their time on, and the SREs who adapt will have a huge edge.

What AI is actually good at in SRE workflows

1. First-pass triage. AI can look at 50 alerts and tell you the 5 most likely to be related to an ongoing incident. That is usually much faster than correlating them by hand.

2. Log summarization. 100,000 log lines into a 20-line summary highlighting anomalies. The summary isn't always right, but it's a starting point.

3. Runbook generation. Given an alert type and historical incidents, AI can draft a runbook. You edit it; you don't write from scratch.

4. Post-mortem first drafts. AI pulls from chat logs, ticket history, and monitoring data, then generates a structured timeline. A human polishes it.

5. Routine query generation. 'Show me the error rate for service X grouped by endpoint for the last 24 hours.' AI writes the query, you run it.
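
For log summarization (item 2), it helps to shrink the input before a model ever sees it. Here is a minimal sketch, not a definitive implementation: it masks digit runs and long hex IDs so that similar lines collapse into one template, then lists the rarest templates first, since rare lines are often where the anomalies are. The function names and masking rules are assumptions made up for illustration; they don't come from any particular tool.

```python
import re
from collections import Counter

# Hypothetical masking rules: long hex strings become <id>, digit runs become <n>.
_HEX_ID = re.compile(r"\b[0-9a-f]{8,}\b", re.IGNORECASE)
_DIGITS = re.compile(r"\d+")


def to_template(line):
    """Mask variable parts so similar log lines share one template."""
    line = _HEX_ID.sub("<id>", line)
    return _DIGITS.sub("<n>", line).strip()


def condense_logs(lines, top=20):
    """Return up to `top` (template, count) pairs, rarest templates first."""
    counts = Counter(to_template(line) for line in lines if line.strip())
    # Sort by count ascending, then alphabetically so the order is stable.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))[:top]


if __name__ == "__main__":
    sample = [
        "GET /users/1 200 12ms",
        "GET /users/2 200 15ms",
        "ERROR db timeout after 3000ms",
    ]
    for template, count in condense_logs(sample):
        print(count, template)
```

The condensed list, not the raw logs, is what you would hand to the model to summarize. The summary still needs a human check, but the model starts from a few dozen lines instead of 100,000.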

What AI is bad at

1. Judgment calls under pressure. When multiple things could be wrong, a good SRE uses instincts built from years of experience. AI guesses.

2. Novel failures. AI is pattern-matching on history. A truly new failure mode looks like noise to it.

3. Organizational politics. 'Who do I wake up at 3 AM' is not a technical question. AI doesn't help here.

4. Accountability. When something breaks, someone needs to own the decisions that got made. AI can't own anything.

The new SRE workflow

AI does the first 30% of the work on an incident (triage, log summarization, queries). A human does the crucial middle 40% (judgment, decision-making, stakeholder communication). AI does the last 30% (writing up, following up, documenting).

This lets SREs handle 2-3x more incidents with the same quality, not by working harder but by delegating the mechanical parts.

What to learn

If you're an SRE and you're not using AI tools in your workflow, you're leaving 30% of your productivity on the table. Not because AI is magic, but because the boring parts of your job don't need a human.

Start small: feed an alert into ChatGPT or Claude and ask for possible causes. See what you get. If the answers are useful, build the same step into your actual on-call tooling.
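
To make that first experiment repeatable, you can turn the alert into a prompt with a few lines of code. The sketch below only builds the text; it does not call any model API, so you paste the result into ChatGPT or Claude yourself. The function name, the alert fields, and the prompt wording are all illustrative assumptions, so adjust them to whatever your alerting system actually emits.

```python
import json
from typing import Optional


def build_triage_prompt(alert: dict, recent_changes: Optional[list] = None) -> str:
    """Render an alert (and optional recent changes) as a plain-text prompt."""
    parts = [
        "You are assisting an on-call SRE. Given the alert below, list the",
        "three most likely causes, most likely first. For each cause, name one",
        "check that would confirm or rule it out. Say so if you are unsure.",
        "",
        "Alert (JSON):",
        json.dumps(alert, indent=2, sort_keys=True),
    ]
    if recent_changes:
        parts += ["", "Recent changes:"]
        parts += [f"- {change}" for change in recent_changes]
    return "\n".join(parts)


if __name__ == "__main__":
    # Hypothetical alert payload; real field names depend on your tooling.
    alert = {"name": "HighErrorRate", "service": "checkout", "severity": "page"}
    print(build_triage_prompt(alert, ["deploy checkout v42"]))
```

Asking for a confirming check alongside each cause keeps the judgment call with you: the model proposes, you verify.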


Written by Dr. Samson Tanimawo
BSc · MSc · MBA · PhD
Founder & CEO, Nova AI Ops. https://novaaiops.com
