Discord as My Operator Pager: One Timeline for a Whole Studio

#architecture #automation #devops #honeydrunk

A deploy of mine broke, and I found out about it the way you never want to: days later, by stumbling onto it myself. The failure had been recorded. An email went out. A status flipped red in a dashboard somewhere. Every system did its job. I just never saw any of it, because the signal was scattered across so many places that none of them was the place I actually looked.

That's the failure mode of running a lot of automation alone. Under my repos there's a constant hum of machinery: deploys firing, CI pipelines passing and failing, scheduled jobs waking up to do work, secrets quietly aging toward expiry. When you're one person, the hard question stops being "can I build it" and becomes "do I know what it's doing right now." For a long time the honest answer was no.

The signal was everywhere and nowhere. GitHub notifications were supposed to be my feed, except they were buried under a huge issue backlog, so the inbox that should have told me a deploy broke was the same inbox screaming about forty open issues. Actions failures arrived as email, which I learned to skim past because most of them were noise. Some things only showed up if I went and looked: opening a dashboard, checking a deploy status, remembering to verify a job actually ran. There was no single place I could glance at and trust to tell me the truth. So I missed things. The information existed the whole time. It just had no home.

This post is about giving it one. I made Discord the place where everything my automation does reports in, and I gave that reporting a structure so the important events stay glanceable.

When the Signal Has No Home

The pattern was worse than any one missed alert.

A deploy would go out and I'd find out it worked by using the thing, not by being told. A CI run would fail on a repo I hadn't touched in a week, and the email telling me sat unread between two GitHub digest mails. A scheduled job would skip a night and I'd discover the gap days later. None of these are catastrophic on their own. The problem is the pattern: every kind of operational event lived in a different place, in a different format, with a different chance of me ever seeing it.

On a team, someone is usually watching the boards while someone else builds. Solo, the watching is also my job, layered on top of the building, and human attention does not scale across a dozen notification surfaces. I needed to collapse all of those surfaces into one.

The requirement was simple to state. I wanted a single timeline where, if something happened that I'd care about, it showed up, and where the things I'd care about most were easy to pick out from the things that are just routine chatter.

What a Webhook Is, and Why Discord

The mechanism underneath all of this is a webhook, and it's worth a plain definition because the whole design rests on it.

A Discord webhook is a URL you POST a message to. You send an HTTP POST to that URL with a small JSON body, and the message appears in a Discord channel. That's the entire contract. There's no bot to host, no gateway connection to keep alive, no library to babysit. Any piece of automation that can make an HTTP request can speak into a channel, which means every part of my studio (a GitHub Action in the cloud, a script on a machine at home) can report in with the same trivial primitive.
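
To make that contract concrete, here is a minimal sketch in Python using only the standard library. The function names and the User-Agent string are illustrative assumptions, not part of any real emitter; the only Discord-specific detail relied on is that a webhook accepts a JSON body with a `content` field.

```python
import json
import urllib.request


def build_webhook_request(webhook_url: str, content: str) -> urllib.request.Request:
    """Build (but do not send) the POST that delivers `content` to a channel."""
    body = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: a descriptive User-Agent, in case the default
            # urllib one is rejected somewhere along the way.
            "User-Agent": "operator-pager-sketch/0.1",
        },
        method="POST",
    )


def post_alert(webhook_url: str, content: str, timeout: float = 10.0) -> int:
    """Send the message and return the HTTP status code."""
    request = build_webhook_request(webhook_url, content)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Splitting the request construction from the send keeps the part that can be checked offline separate from the part that touches the network.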

Discord earns the role for a boring reason: I'm already in it, on my phone and my desktop, all day. The pager has to be somewhere I already look. Anywhere else just becomes one more dashboard I forget to open. Channels give me natural separation. Mentions give me a way to make the loud things loud. It was already running, so I made it load-bearing.

I'll borrow a word for the messages it carries: operator alerts. An operator alert is a message meant for me, the person running the studio, telling me something about the state of my own systems. A deploy finished. A pipeline failed. A secret is about to expire. The audience is exactly one person, and that fact ends up mattering a lot, which I'll get to.

The Taxonomy: Every Event Has a Known Home

A single channel with everything dumped into it would just be the email inbox again, reinvented. The structure is the point.

So I split operator alerts into channels by signal type. Each kind of event has a known home, and I know what each home means without reading closely. The split looks roughly like this (these are examples of the shape, not a literal channel list):

Deploys. Something went out, where it went, and whether it landed.
CI failures. A pipeline broke, on which repo, with a link to the run.
Security and credential events. A secret or credential did something I need to know about.
Rotation and escalation. A scheduled or sensitive job needs my attention now.

The reason this works is that the channel itself carries meaning before I read a word. A new message in the deploys channel is routine and I can let it scroll by. A new message in the security channel makes me stop. By routing each event type to its own home, the important signals stop competing with the routine ones for the same glance. Glanceability is a property of the layout, not of any single message.

Each emitter POSTs a structured message to the webhook for the right channel. A Discord webhook is bound to a single channel, and the emitter knows what kind of event it's reporting, so it knows which webhook URL to POST to. That routing decision is the taxonomy, encoded.
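
One way that encoding could look, as a sketch rather than the actual implementation: an enum of signal types and a table mapping each one to the name of a stored webhook secret. The signal names, the secret names, and the message layout below are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Signal(Enum):
    DEPLOY = "deploy"
    CI_FAILURE = "ci-failure"
    SECURITY = "security"
    ESCALATION = "escalation"


# Hypothetical secret names, one webhook (and so one channel) per signal type.
WEBHOOK_SECRET_NAME = {
    Signal.DEPLOY: "DISCORD_WEBHOOK_DEPLOYS",
    Signal.CI_FAILURE: "DISCORD_WEBHOOK_CI",
    Signal.SECURITY: "DISCORD_WEBHOOK_SECURITY",
    Signal.ESCALATION: "DISCORD_WEBHOOK_ESCALATION",
}


@dataclass(frozen=True)
class Event:
    signal: Signal
    source: str            # e.g. a repo or job name
    summary: str
    link: Optional[str] = None


def secret_name_for(event: Event) -> str:
    """The routing decision: which stored webhook URL this event posts to."""
    return WEBHOOK_SECRET_NAME[event.signal]


def render(event: Event) -> str:
    """A short, uniform message body for any event."""
    line = f"[{event.signal.value}] {event.source}: {event.summary}"
    return f"{line}\n{event.link}" if event.link else line
```

Because the table is keyed by the enum, a new signal type with no entry fails with a `KeyError` at routing time instead of quietly landing in the wrong channel.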

Two Stores, Because Alerts Come From Two Places

There's a credential wrinkle here that shaped the implementation, and it's a good illustration of how a small system grows real edges.

Those webhook URLs are secrets. Anyone holding one can post into your channel, so they have to be stored and handed to the emitters carefully. The catch is that my alerts originate from two very different places, and those two places have different ways of holding a secret.

Some alerts come from GitHub Actions, running in the cloud. Those read their webhook URLs from organization secrets, GitHub's built-in store for values that workflows are allowed to use. Other alerts come from a local automation worker running on my own hardware. That worker reads its webhook URLs from Azure Key Vault, a cloud secret store. Two execution contexts, two secret stores, the same destination channels on the other end.

I could have forced everything through one store, but the contexts genuinely live in different trust domains, and each already had a native, well-guarded place to keep a secret. So I let each emitter use the store that fits where it runs. The taxonomy is shared. The credential plumbing is local to the emitter.
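
A sketch of how that split might be kept behind one interface: each emitter gets a lookup function for its own store, and everything downstream only sees "give me the URL for this name." This assumes the workflow maps each organization secret into an environment variable, and that the worker has the `azure-identity` and `azure-keyvault-secrets` packages installed. The helper names and the underscore-to-hyphen naming convention are hypothetical.

```python
import os
from typing import Callable, Mapping, Optional

SecretLookup = Callable[[str], Optional[str]]


def env_lookup(environ: Mapping[str, str] = os.environ) -> SecretLookup:
    """GitHub Actions side: secrets arrive as environment variables."""
    return lambda name: environ.get(name) or None


def key_vault_name(name: str) -> str:
    """Key Vault secret names allow letters, digits and hyphens, not
    underscores, so translate (assumed convention)."""
    return name.replace("_", "-").lower()


def key_vault_lookup(vault_url: str) -> SecretLookup:
    """Local worker side: read from Azure Key Vault."""
    # Imported lazily so the Actions side does not need the Azure SDK.
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    client = SecretClient(vault_url=vault_url, credential=DefaultAzureCredential())
    return lambda name: client.get_secret(key_vault_name(name)).value


def resolve_webhook_url(name: str, lookup: SecretLookup) -> str:
    """Fail loudly if this execution context cannot see the webhook."""
    url = lookup(name)
    if not url:
        raise LookupError(f"webhook secret {name!r} is not available here")
    return url
```

The error message names the secret but never echoes a value, so a misconfigured emitter cannot leak a URL through its own failure.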

Redaction Is Not Optional

There's one rule in this system I treat as absolute: secrets get redacted before anything is posted.

The reason is uncomfortable once you say it out loud. The whole point of a security and credential channel is to tell me when something is wrong with a secret. If the message announcing that a token is expiring includes the token, then my alerting surface just leaked a credential into a chat application. An alert that leaks the thing it's alerting about is its own incident, and a worse one than the event that triggered it.

So redaction is a property of the emitter, enforced before the POST, every time. The emitter's job is to describe an event in enough detail that I can act, and never enough detail to compromise anything. "A secret in this vault is expiring in three days" is the alert. The secret's value never goes near the channel. I'd rather an alert be slightly too vague than carry a payload I'd regret.
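
A minimal sketch of redaction as a step that runs before every POST. The patterns here are illustrative assumptions, not a complete list: exact values the emitter already knows are secret, anything shaped like a Discord webhook URL, and `key=value` pairs with suspicious key names.

```python
import re
from typing import Iterable

REDACTED = "[REDACTED]"

# Discord webhook URLs end in /api/webhooks/<id>/<token>.
WEBHOOK_URL = re.compile(r"https://(?:\w+\.)?discord(?:app)?\.com/api/webhooks/\S+")
# Hypothetical catch-all for "token=...", "password: ..." and similar.
ASSIGNMENT = re.compile(r"(?i)\b(token|secret|password|api[_-]?key)\b(\s*[=:]\s*)\S+")


def redact(text: str, known_secrets: Iterable[str] = ()) -> str:
    """Return `text` with anything secret-looking replaced by a marker."""
    # Longest first, so a secret that contains another is removed whole.
    for secret in sorted((s for s in known_secrets if s), key=len, reverse=True):
        text = text.replace(secret, REDACTED)
    text = WEBHOOK_URL.sub(REDACTED, text)
    return ASSIGNMENT.sub(lambda m: f"{m.group(1)}{m.group(2)}{REDACTED}", text)
```

Pattern matching like this is a backstop. The stronger guarantee is the one described above: build the message from the event's name and metadata, so the value is never in the string to begin with.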

The Boundary: Operator Alerts and Customer Notifications Are Different Concerns

This is the part I hold most firmly, so I'll state it plainly as two separate things.

Operator alerts are signal for me, the person running the studio. The audience is one. The job is to tell me what my automation is doing so I can intervene when I need to. Discord is that pager, and this entire post is about it.

Customer-facing notifications are a product capability, the features that email or notify the people who use what I build. The audience is users. The job is reliable, durable delivery to people who are not me. That's a separate system entirely, with its own delivery guarantees, its own retries, its own failure handling, and its own place in the architecture.

These two have different audiences, different reliability needs, and different failure modes, so they live in different systems. If my operator pager misses a message, I notice the next time I glance and I go look. If a customer notification system misses a message, that's a product defect with someone else's expectations attached. Mixing them would drag the casual reliability of a personal pager into a place that demands real guarantees, or drag heavyweight delivery machinery into something that just needs to ping my phone. Keeping them apart lets each one be exactly as serious as it should be.

Discord is the operator pager. The customer notification product is its own thing, and it's a different post.

What Shipped, and What I Learned

The taxonomy is in place, both webhook stores are wired, redaction is enforced at the emitter, and the most recent piece to land was a credential pager: alerts specifically for credential and secret events. A secret nearing expiry or a rotation that needs attention now routes to its own channel instead of dissolving into the general stream. The events that are easiest to forget and most painful to miss got the most deliberate home.
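
As an illustration of what a credential pager message could be built from, here is a sketch that turns a secret's name and expiry time into alert text, or into nothing if the expiry is still far off. The function name, the seven-day default window, and the message wording are assumptions. Note that the function never receives the secret's value, so it has nothing to leak.

```python
from datetime import datetime, timedelta
from typing import Optional


def expiry_alert(
    secret_name: str,
    vault_name: str,
    expires_on: datetime,
    now: datetime,
    warn_within: timedelta = timedelta(days=7),
) -> Optional[str]:
    """Return alert text if the secret is expired or close to it, else None."""
    remaining = expires_on - now
    if remaining > warn_within:
        return None  # still far off: stay quiet
    where = f"Secret '{secret_name}' in vault '{vault_name}'"
    if remaining <= timedelta(0):
        return f"{where} has expired."
    if remaining < timedelta(days=1):
        return f"{where} expires in less than a day."
    days = remaining.days
    unit = "day" if days == 1 else "days"
    return f"{where} expires in {days} {unit}."
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check deterministic and easy to test.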

The thing I learned building this is that scattered signal gets solved by choosing one surface and committing to it, not by polishing the surfaces you already ignore. The technology here is almost embarrassingly simple. A webhook is a URL you POST to. All the real work lived in the taxonomy, the discipline of redaction, and the boundary that keeps my pager out of my product.

A solo operator can't watch a dozen places. So I stopped trying, and built the one place worth watching.