Pranay Ravi

Posted on May 21

Automated 25 Minutes of My Morning With a Prompt (Not a Script)

#devops #monitoring #ai #productivity

Every serious engineering org I've worked in has the same split personality.

One side: A modern observability stack. AppDynamics, Datadog, whatever the current favorite is. Real-time metrics, distributed traces, beautiful dashboards, alert routing. Years of investment. It works.

The other side: A long tail of legacy monitoring tools that predate the current stack by a decade. Tools that don't speak OpenTelemetry. Tools that can't push to a webhook. Tools whose output format is, and will remain, a 73-row HTML table emailed at 11:15 UTC every morning.

These two sides don't talk to each other. And in most orgs, someone fills that gap manually for 25–30 minutes every morning.

That someone was me.

The Gap Nobody Draws on the Architecture Diagram

The box that should exist on every enterprise architecture diagram but never does:

"Dave reads his email for 25 minutes"

This is what I call an observability seam — the boundary between your modern monitoring stack and legacy outputs it can't ingest. These seams exist because:

Legacy tools were built before modern observability conventions
Integration work is permanently lower priority than feature work
The cost is diffuse (distributed across many humans' mornings) rather than concentrated, so it never hits "fix it" priority

The insight I kept coming back to: this isn't a technology problem. It's an architectural surface area nobody drew. Once you see it that way, the solution becomes tractable.

What the Legacy Tail Actually Looked Like

My specific environment had four data sources, none of which fed the observability stack:

Source	Format	Frequency
Replication latency alerts	Email (body text)	Multiple times daily
DB backup status report	Email (73-row HTML table)	Daily at 11:15 UTC
Infrastructure/AWS notifications	Email (subject-line pattern)	20–50 per day
Deployment Review	Confluence page backed by Jira macros	Checked manually

None of them exposes an API the observability stack can consume, and none of them feeds my SIEM or APM. The only integration surface available was: it arrives as email, or it lives at a known URL.

That constraint is also the opportunity.

The Architecture: Agent as Integration Layer

The core decision was to treat an LLM agent not as a productivity tool, but as a seam-closing integration layer — something that sits between heterogeneous, unstructured legacy outputs and a human who needs a consistent, actionable daily summary.

The agent does four things each morning:

1. Replication Latency — Signal extraction from noise

Find the latest alert, read the body, and report whether the body is actually populated.

This sounds trivial. It isn't. Replication alerts sometimes fire with empty bodies — the email arrived, but the content didn't. Manually, this is easy to miss because you see the email and assume it's fine. The agent makes "body is empty" an explicit flagged state. That's catching a monitoring failure, not just monitoring an alert.
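As a rough illustration of the "body is empty" state, here is a minimal Python sketch of the same check done deterministically. The function names are hypothetical, and it assumes the alert body arrives as a string that may contain HTML markup and non-breaking spaces; the agent itself does this through the prompt, not through this code.

```python
import re
from html import unescape


def visible_text(body):
    """Strip tags and entities and return whatever text a reader would see."""
    text = re.sub(r"<[^>]+>", " ", body or "")  # crude tag removal, enough for a sketch
    return unescape(text).replace("\xa0", " ").strip()


def describe_alert(body):
    """Return the visible body text, or an explicit marker when there is none."""
    text = visible_text(body)
    return text if text else "Body is empty"
```

The point of the explicit marker is that "email arrived, content did not" becomes a reportable state instead of something inferred from silence.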

2. Backup Report — Structured extraction from semi-structured HTML

Parse the 73-row HTML table, tally the BACKUP_STATUS column, surface only the rows that didn't complete.

Output: "69 completed, 0 failed, 3 with no backup — db01, db02, db05."

Reading time: three seconds. A human eyeballing 73 rows is slower and occasionally misses things.
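For comparison, the tally step can be sketched in plain Python with the standard library's HTML parser. This is an illustrative sketch, not the agent's implementation: the DB_NAME column name and the "COMPLETED" status value are assumptions, since the post only names the BACKUP_STATUS column.

```python
from collections import Counter
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the cell text of every <tr> as a list of strings."""

    def __init__(self):
        super().__init__()
        self.rows, self._row, self._cell = [], None, None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row, self._cell = None, None


def tally_backups(html, status_col="BACKUP_STATUS", name_col="DB_NAME"):
    """Return (counts per status, names of databases that did not complete)."""
    parser = TableRows()
    parser.feed(html)
    header, *rows = parser.rows
    # Columns are located by header name, so a shifted column order still works.
    s, n = header.index(status_col), header.index(name_col)
    counts = Counter(row[s] for row in rows)
    not_completed = [row[n] for row in rows if row[s] != "COMPLETED"]
    return counts, not_completed
```

Calling `tally_backups(report_html)` yields the counts and the list of databases to name in the one-line summary.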

3. Infrastructure Notifications — Classification at scale

Twenty to fifty emails daily, each following the pattern STATUS - Alert (Server Name : X and DB Name : Y).

The agent classifies every subject line, counts by status, and surfaces only the non-SUCCESS entries with a direct link to each email.

Output: "21 SUCCESS, 1 WARNING on SERVER04U / DB09."

That one line is the entire actionable output of a day's worth of notification emails.
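A deterministic version of this classification is easy to sketch and is useful as a cross-check on the agent's counts. The regular expression below is an assumption built from the subject pattern quoted above (it accepts both "Server Name" and "Server"), and the UNPARSED bucket is a hypothetical addition so that unfamiliar subjects are surfaced rather than dropped.

```python
import re
from collections import Counter

SUBJECT_RE = re.compile(
    r"^\s*(?P<status>[A-Z]+)\s*-\s*Alert\s*"
    r"\(\s*Server(?: Name)?\s*:\s*(?P<server>.+?)"
    r"\s+and\s+DB(?: Name)?\s*:\s*(?P<db>.+?)\s*\)\s*$"
)


def summarize(subjects):
    """Count subjects by status and list everything that needs attention."""
    counts, attention = Counter(), []
    for subject in subjects:
        m = SUBJECT_RE.match(subject)
        if m is None:
            # A subject that does not fit the pattern is itself worth a look.
            counts["UNPARSED"] += 1
            attention.append(subject)
            continue
        counts[m["status"]] += 1
        if m["status"] != "SUCCESS":
            attention.append(f'{m["status"]} on {m["server"]} / {m["db"]}')
    return counts, attention
```

This is also where the regex approach shows its limits: any subject that drifts from the pattern lands in UNPARSED until someone updates the expression.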

4. Deployment Review — Live data from a known URL

Read the Confluence page, identify the Jira macro structure, query the underlying filters directly, and report live issue counts. This bridges the gap between "someone updated a Confluence page" and "here's the actual current state of production deployments."

Why an LLM Instead of a Python Script?

A reasonable engineer will ask: couldn't this be IMAP + regex + cron?

Yes — and in many cases that's the right answer. Large enterprises already run Airflow, Power Automate, Splunk SOAR, Logic Apps. This isn't automation entering a vacuum.

The honest comparison:

Approach	Strength	Breaks when…
Python + regex + cron	Deterministic, auditable, fast	Email format changes, new alert pattern appears
ETL / SOAR pipeline	Scalable, governed, integrated	Requires schema agreement upstream
LLM agent	Tolerates format variance, low setup cost	Output is nondeterministic, harder to audit

What the LLM reduces is integration friction — not the need for integration.

The backup report's column order can shift and the AWS notification subject-line pattern can vary without breaking the agent. A new data source can be added in a sentence of English rather than a week of schema work.

That flexibility has a cost: you trade determinism for adaptability. For a morning triage digest where a missed edge case surfaces in the next run, that tradeoff is acceptable. For a system that triggers automated remediation, it isn't.

Right mental model: LLM agents lower the cost of closing observability seams. They don't replace deterministic pipelines where correctness guarantees matter.

Design Principles Worth Keeping

Independent section isolation

Each source is handled independently. A connector outage in one section doesn't abort the others — the agent renders an "unavailable" note and continues. A partial report is vastly more valuable than a silent failure.
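In code terms, this is the familiar pattern of wrapping each section in its own error boundary. A minimal sketch, assuming each section is a hypothetical callable that returns its markdown text:

```python
def render_report(sections):
    """sections: list of (title, callable) pairs; each callable returns markdown text."""
    parts = []
    for title, build in sections:
        try:
            body = build()
        except Exception as exc:  # one failing source must not abort the rest
            body = f"⚠ connector unavailable ({type(exc).__name__}: {exc})"
        parts.append(f"## {title}\n{body}")
    return "\n\n".join(parts)
```

A section that raises still gets its heading and a visible note, so the reader can tell "nothing to report" apart from "could not check".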

Absence as a signal

Each section includes a sanity check that fires if no emails are found for that source.

"No replication alerts received today" is bolded as a warning, not silently skipped.

This paid off in month two, when the backup report email silently stopped arriving after a mail routing change. The sanity check flagged it immediately. Manual triage would have assumed a clean run.
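The same idea expressed as code, as a hedged sketch: the source keys and the shape of the inputs are hypothetical, and the warning text is whatever each section should print when nothing arrived.

```python
def absence_warnings(expected, received_counts):
    """Return a bold warning line for every expected source with zero messages today.

    expected: dict mapping a source key to the warning text for that source.
    received_counts: dict mapping a source key to the number of messages found.
    """
    return [
        f"**{warning}**"
        for source, warning in expected.items()
        if received_counts.get(source, 0) == 0
    ]
```

The key design choice is that a source missing from `received_counts` is treated as zero, so a source that was never searched is flagged the same way as one that returned nothing.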

Delta orientation

The report answers "what changed or failed since yesterday", not "what is the current state of everything."

Confirming sameness has no information value but costs the same attention as confirming change. A delta-oriented report routes attention only where it's needed.
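To make the delta idea concrete, here is a small sketch that compares two snapshots and keeps only what changed. It assumes yesterday's per-database statuses were saved somewhere, which the post does not describe, and the status values are illustrative.

```python
def status_delta(yesterday, today):
    """Return only the entries whose status changed, appeared, or disappeared.

    Both arguments map a name to a status; the result maps a name to a
    (before, after) pair, with None standing for "not present".
    """
    delta = {}
    for name in sorted(set(yesterday) | set(today)):
        before, after = yesterday.get(name), today.get(name)
        if before != after:
            delta[name] = (before, after)
    return delta
```

Entries that are identical on both days never reach the reader, which is the whole point of the principle.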

Runtime date resolution

The prompt never hardcodes a date. "Today" is resolved against the local clock at execution time. Small thing. Prevents a class of subtle bugs where a stale prompt runs with yesterday's scope.
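For readers wiring up something similar in code, resolving "today" at run time might look like the sketch below. The function name is hypothetical, and the `now` parameter exists only so the behavior can be shown with a fixed clock.

```python
from datetime import datetime, time, timedelta, timezone


def start_of_today(now=None):
    """Local midnight of the day the task fires, as a timezone-aware datetime."""
    now = now or datetime.now().astimezone()  # read the clock at call time
    # Reuses the current UTC offset; on a DST-change day midnight may differ by an hour.
    return datetime.combine(now.date(), time.min, tzinfo=now.tzinfo)


# Example with a fixed clock (UTC-5) instead of the real one:
fixed = datetime(2024, 1, 15, 9, 0, tzinfo=timezone(timedelta(hours=-5)))
print(start_of_today(fixed).isoformat())  # 2024-01-15T00:00:00-05:00
```

Nothing date-like is stored anywhere, so a prompt or script written months ago still scopes itself to the day it actually runs.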

The Prompt (Sanitized)

The prompt itself is the deliverable worth preserving:

You are producing the daily 9 AM operations report. Run silently — this prompt is self-contained.

REQUIRED CONNECTORS
- Microsoft 365 / Outlook (email search + read_resource on mail:// URIs)
- Atlassian (getConfluencePage; optionally searchJiraIssuesUsingJql)

If any connector is missing or errors out, render that section with a clear
"⚠ connector unavailable" note and continue. Never abort the whole report.

SCOPE OF "TODAY"
"Today" = the calendar day the task fires, in local time.
Always pass afterDateTime: "today" to outlook_email_search.

====
SECTION 1 — Replication Latency Monitoring (latest alert today)
====
1. Search for latest alert today.
2. read_resource on its URI for the full body.
3. Report: count of alerts today, receivedDateTime, sender, and either the
   body text or "Body is empty" if it's only whitespace.

====
SECTION 2 — DB Backup Report (failure check)
====
1. Find today's report email.
2. Parse rows; tally BACKUP_STATUS column.
3. State explicitly: "X completed, Y failed, Z with no backup."
   **Bold** the failure count if > 0.

====
SECTION 3 — Infra notifications today
====
1. Fetch alerts from sender, limit 50.
2. Parse subject pattern: "STATUS - Alert (Server : X and DB : Y)"
3. Summarize counts by status. **Bold** anything not SUCCESS.
4. List non-SUCCESS items with a webLink to each.

====
SECTION 4 — Deployment Review (Confluence)
====
1. getConfluencePage, contentFormat="html".
2. If JIRA tool available, run each filter and include live issue counts.

====
OUTPUT FORMAT
====
- Plain markdown. One H1 with today's date. Four H2 sections in order.
- **Bold** anything requiring attention.
- End with "Sources:" — no duplicate links.
- Tone: factual. No filler. No apology lines.

====
SANITY CHECKS
====
- Section 1 zero results: **"No replication alerts today — monitoring may be down."**
- Section 2 zero results: **"Backup report email not received yet."**
- Section 3 zero results: **"No infra notifications today — verify alerting pipeline."**
- Section 4 failure: include the direct Confluence URL for manual access.

Three things worth noting:

Section independence is explicit. Each section is instructed to fail gracefully and continue. Fault isolation encoded directly in the prompt.

Sanity checks are the highest-value lines. Flagging "no email received today" catches the failure mode where the upstream monitoring system has broken silently — the exact failure mode that checklist-style human review tends to miss.

The prompt is a versioned artifact. As the environment changes — new email formats, new Confluence pages, connector upgrades — the prompt evolves. Treat it as a living document.

Honest Tradeoffs

This wouldn't be a useful post without naming the downsides.

LLM output is nondeterministic. Identical inputs can produce slightly different summaries across runs. I spot-check the raw email count against the agent's tally once a week.

Token truncation is real. A 73-row HTML table pushed through a context window can get silently trimmed. Defensive prompting mitigates but doesn't eliminate this.
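One cheap guard, sketched here under assumptions, is to count the table rows deterministically from the raw HTML and compare that number with the sum of the agent's tally. The helper names are hypothetical, and the single header row is an assumption about the report layout.

```python
import re


def count_data_rows(html, header_rows=1):
    """Count <tr> elements in the raw HTML, minus the assumed header rows."""
    return len(re.findall(r"<tr[\s>]", html, flags=re.IGNORECASE)) - header_rows


def check_tally(expected_rows, tally):
    """Return None when the tally covers every row, else a bold warning line."""
    reported = sum(tally.values())
    if reported == expected_rows:
        return None
    return (f"**Tally covers {reported} of {expected_rows} rows; "
            "the table may have been truncated.**")
```

A mismatch does not say which rows were lost, only that the summary cannot be trusted for that run.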

The desktop-up dependency. In my setup, scheduled tasks fire only when the desktop app is open. If the laptop is asleep at 9 AM, the report runs once you open it. Acceptable for a daily digest; not for time-critical monitoring.

Trust takes time to calibrate. The first week I ran this, I verified everything manually anyway. By week four, I'd stopped double-checking every run, because I had enough incident data to know the report was reliable. That calibration period is part of the deployment.

The Pattern Generalizes

Once you see the shape — agent reads heterogeneous sources at a known cadence, filters to deltas, surfaces only what requires human attention — it appears everywhere:

On-call handoff summaries (PagerDuty + Slack + Jira + APM, joined and summarized)
Sprint health reports (blocked tickets, aging tickets, no-owner tickets — weekly, automated)
Security alert triage (SOC mailers often emit 200 events to find 3 that matter)
Compliance evidence collection (SOC 2 renewals are largely a gathering problem, not an analysis problem)

The domain changes. The architecture doesn't.

The Mental Model Shift

The efficiency gain is real — roughly 25 minutes a day × 250 working days ≈ 100 hours annually. But that's not the most interesting outcome.

From push to pull. Most monitoring tools send you everything and you decide what matters. The cost of sending is zero for the system and enormous for the human. An agent flips that: I'll ask each morning for the delta.

From dashboards to digests. A dashboard is only valuable if someone looks at it. A daily digest that synthesizes five sources and filters to a single page is more operationally useful than five perfect dashboards, because it solves the actual constraint — attention — rather than the assumed one — visualization.

From checklists to deltas. The morning health check used to be: open these five places, confirm each is green. That's cognitive load spent confirming sameness. The right version is: "only tell me what changed."

What This Is, and Isn't

This is not an argument for replacing your observability infrastructure with LLM prompts. Your APM is doing things this agent will never do: real-time anomaly detection, distributed trace correlation, infrastructure topology mapping. The agent doesn't compete with that.

What the agent does is address the seam — the gap between the modern observability stack and the legacy outputs it can't ingest.

Making the seam visible is the first step. Closing it is the second. The agent is the bridge.

If your morning starts with "let me just check a few things in my email" — you have an observability seam. The tooling to close it is available now, and the implementation cost is a well-written prompt.

What observability seams do you have in your environment? Curious what data sources people are working around. Drop a comment.
