Error Handling Patterns for Email Agents

#email #ai #architecture #api

A REST API fails synchronously — you get a 4xx and handle it on the spot. Email fails on a delay: the send call returns success, and the actual failure shows up minutes later as a bounce. An autonomous agent that only handles the first kind of error is half-built, and the half it's missing is the half that gets its sending paused. Here's the full failure surface for an email agent, and the retry design that survives it.

The context: agents built on Agent Accounts — hosted mailboxes the agent owns, currently in beta — where the agent itself decides when to send. That autonomy is exactly why error handling can't be an afterthought: a human notices when replies stop landing; a loop doesn't.

Channel one: errors at send time

Some failures do come back synchronously on the send request, and each maps to a different response from your code:

Status	Body	What your agent should do
`429`	`"rate limit exceeded"`	Back off and retry; raise quotas via policy if it recurs
`400`	`"domain is not verified"`	Stop — finish domain verification, retrying is pointless
`400`	Text indicating sending is paused for the account	Stop all sends; a reputation pause is in effect
`403`	`"send blocked by abuse restriction"`	Stop and contact support — there's no quota to wait out

The pattern worth internalizing: only the 429 is retryable. The two 400s and the 403 are states, not transient failures, and a naive exponential-backoff-everything loop will hammer an endpoint that can't succeed. The 403 deserves special respect — abuse restrictions can be scoped to a sender, a domain, an organization, an application, or a single grant, and Nylas applies the most specific match. An application-level block affects every account under that application, not just the one you sent from. Recovery is a support conversation, not a timer: include the application ID, the grant ID, and one example error response so the abuse team can locate the restriction. The good news is that once it's cleared, sends succeed on the next attempt — there's no propagation delay to wait through.
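A minimal sketch of that triage, assuming the response bodies contain the substrings shown in the table. The exact wording of the pause response isn't given here, so matching on "paused" is a guess, and the function names and action labels are hypothetical rather than part of any SDK:

```python
import random


def classify_send_error(status: int, body: str) -> str:
    """Map a failed send response to an action label (labels are illustrative)."""
    text = body.lower()
    if status == 429:
        return "retry"                  # the only transient failure in the table
    if status == 400 and "domain is not verified" in text:
        return "stop_fix_domain"        # a state: retrying cannot succeed
    if status == 400 and "paused" in text:
        return "stop_paused"            # assumed substring for the pause response
    if status == 403:
        return "stop_contact_support"   # abuse restriction: no quota to wait out
    return "stop_unknown"               # anything unrecognized: do not retry blindly


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter; attempt counts from 0."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

The design choice is that retry is opt-in: a response has to be recognized as transient before the loop is allowed to try again, so an unknown error halts instead of hammering the endpoint.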

Channel two: bounces, minutes later

For mail sent through connected mailboxes (Google, Microsoft, iCloud, Yahoo — the 4 providers that generate Non-Delivery Reports), the message.bounce_detected webhook delivers the failure your send call never saw:

{
  "type": "message.bounce_detected",
  "data": {
    "grant_id": "<NYLAS_GRANT_ID>",
    "object": {
      "bounced_addresses": "no-such-user@example.com",
      "bounce_reason": "The email account that you tried to reach does not exist.",
      "type": "mailbox_unavailable",
      "code": "550",
      "bounce_date": "Mon, 08 Jun 2026 14:21:00 +0000"
    }
  }
}

The code field is the branch point — and note it's a string, so compare "550", not 550. Codes in the 500 range are hard bounces: the address is gone, and the only correct move is a suppression list. Codes in the 400 range are soft bounces — full mailbox, throttled server — safe to retry with a cap:

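def handle_bounce(obj: dict, suppression: set[str]) -> None:
    """Route one message.bounce_detected object by its SMTP code."""
    address = obj["bounced_addresses"]
    code = str(obj.get("code", ""))  # the payload carries the code as a string
    if code.startswith("5"):
        suppression.add(address)     # hard bounce: never send again
    elif code.startswith("4"):
        # soft bounce: capped retry; schedule_retry is your own queue helper
        schedule_retry(address, max_attempts=3)
    # any other code is neither suppressed nor retried here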

Two blind spots to plan around. Bounce detection works by finding the NDR in the sender's mailbox, so standard IMAP and Exchange (EWS) accounts — which don't reliably generate NDRs — produce no message.bounce_detected events at all. And detection is asynchronous by nature: the NDR can arrive minutes after the original send, so your handler can't assume any ordering relative to the send call that caused it.

For Agent Account sends specifically, the deliverability signal comes through 4 transactional triggers instead: message.transactional.delivered, .bounced, .complaint, and .rejected. Subscribe to all of them — they're your only real-time window into the rates described next.
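One way to turn those four triggers into per-account counts is sketched below. It assumes the transactional events use the same envelope as the message.bounce_detected example above (a top-level type and a data.grant_id); that shape is an assumption, so check a real payload first. The record_event helper and the tallies structure are hypothetical:

```python
from collections import Counter
from typing import Optional

PREFIX = "message.transactional."
OUTCOMES = {"delivered", "bounced", "complaint", "rejected"}


def record_event(event: dict, tallies: dict) -> Optional[str]:
    """Count one transactional webhook per grant; return the outcome, or None if ignored."""
    event_type = event.get("type", "")
    if not event_type.startswith(PREFIX):
        return None                      # some other webhook, e.g. message.bounce_detected
    outcome = event_type[len(PREFIX):]
    if outcome not in OUTCOMES:
        return None                      # unknown trigger: skip rather than miscount
    # Assumed envelope: the grant ID lives at data.grant_id.
    grant_id = event.get("data", {}).get("grant_id", "unknown")
    tallies.setdefault(grant_id, Counter())[outcome] += 1
    return outcome
```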

The numbers that decide whether you keep sending

This is where error handling stops being per-message and becomes per-account. Nylas tracks each Agent Account's rolling bounce and complaint rates, with explicit thresholds documented in the usage limits guide:

Bounce rate: under 2% is healthy; at 5% the account goes under review; at 10% sending is paused.
Complaint rate: under 0.1% is healthy; at 0.1% the account goes under review; at 0.5% sending is paused.
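Those bands are easy to encode for your own dashboards. The sketch below is illustrative: the function names are hypothetical, and the "elevated" label for a bounce rate between 2% and 5% is an assumption, since the thresholds above don't name that range:

```python
def bounce_band(rate: float) -> str:
    """Classify a hard-bounce rate (0.05 means 5%) against the thresholds above."""
    if rate >= 0.10:
        return "paused"
    if rate >= 0.05:
        return "review"
    if rate < 0.02:
        return "healthy"
    return "elevated"  # 2% to 5%: not named in the thresholds; label is an assumption


def complaint_band(rate: float) -> str:
    """Classify a complaint rate (0.001 means 0.1%) against the thresholds above."""
    if rate >= 0.005:
        return "paused"
    if rate >= 0.001:
        return "review"
    return "healthy"
```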

The measurement details change how you design around these. The bounce rate counts only hard bounces to addresses that don't exist (full mailboxes, greylisting, and other transient rejections don't touch it), so a suppression list directly protects the metric. The denominator is a recent representative send volume rather than a fixed time window, which keeps the rate meaningful whether the account sends a hundred messages a day or a million. Complaints are counted only against recipient domains that send complaint feedback to senders, so the measured complaint rate likely understates real spam-folder activity.

One more asymmetry matters operationally: "under review" is completely silent to your application. Sending continues, no error changes shape, and the only place the trend is visible is your own webhook-derived telemetry. By the time the API starts returning the pause response, the silent phase is already over. And a pause doesn't clear itself on a timer — it requires contacting support with the cause and the fix. The cheap-looking shortcut of "just retry everything and let the bounces sort themselves out" converts directly into a multi-day outage for your agent.

The 0.1% complaint threshold deserves a moment of arithmetic: on a low-volume account, a handful of recipients clicking "mark as spam" is enough to land you under review. For an agent doing outreach, honoring unsubscribes immediately isn't politeness — it's uptime.
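To make the arithmetic concrete, here is a rough calculation. It assumes the rate is simply complaints divided by the number of sends in the measured volume; the actual denominator is chosen by the platform, so treat the outputs as order-of-magnitude only. The helper name is hypothetical:

```python
import math
from fractions import Fraction

REVIEW_RATE = Fraction(1, 1000)  # 0.1% complaint threshold for review
PAUSE_RATE = Fraction(5, 1000)   # 0.5% complaint threshold for pause


def complaints_to_reach(sends: int, rate: Fraction) -> int:
    """Smallest complaint count whose share of `sends` is at or above `rate`."""
    return max(1, math.ceil(sends * rate))


for sends in (200, 1000, 5000):
    print(sends, complaints_to_reach(sends, REVIEW_RATE), complaints_to_reach(sends, PAUSE_RATE))
# prints:
# 200 1 1
# 1000 1 5
# 5000 5 25
```

Under that assumption, a single complaint against 200 sends is already 0.5%, so a low-volume account has almost no margin between "healthy" and "paused".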

Putting it together: the agent's own circuit breaker

The resilient design mirrors what the platform does, one layer earlier:

Validate before sending. Skip any address that's ever hard-bounced.
Treat send errors as states. Retry 429 with backoff; halt on pause, verification, and abuse responses.
Consume the deliverability webhooks. Track your own bounce and complaint counts per account.
Trip your own breaker first. If your measured rates climb toward the review thresholds, pause your outbound queue before the platform pauses it for you — you'll see the trend in your own telemetry first.
Authenticate the domain. Missing DKIM, SPF, or DMARC shows up as a higher bounce rate, because recipient servers refuse the mail outright.
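The breaker step above can be sketched as a rolling window over recent send outcomes. Everything here is an assumption for illustration: the SendBreaker class, the outcome labels, the window size, and the trip points (set a little below the review thresholds of 5% and 0.1%) are hypothetical, and a fixed-size window is only an approximation of however the platform picks its denominator:

```python
from collections import deque


class SendBreaker:
    """Trips when locally measured rates approach the platform's review thresholds."""

    def __init__(self, window: int = 1000, min_samples: int = 50,
                 bounce_trip: float = 0.04, complaint_trip: float = 0.0008):
        self.outcomes = deque(maxlen=window)  # oldest outcomes fall off automatically
        self.min_samples = min_samples        # avoid tripping on the first few sends
        self.bounce_trip = bounce_trip
        self.complaint_trip = complaint_trip

    def record(self, outcome: str) -> None:
        """Record one outcome: 'delivered', 'hard_bounce', or 'complaint'."""
        self.outcomes.append(outcome)

    def rate(self, outcome: str) -> float:
        if not self.outcomes:
            return 0.0
        return sum(1 for o in self.outcomes if o == outcome) / len(self.outcomes)

    @property
    def is_open(self) -> bool:
        """True means: pause the outbound queue and alert a human."""
        if len(self.outcomes) < self.min_samples:
            return False
        return (self.rate("hard_bounce") >= self.bounce_trip
                or self.rate("complaint") >= self.complaint_trip)
```

Check is_open before each send and feed record() from the deliverability webhooks. Only hard bounces are counted, matching how the platform's bounce rate is described above.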

The bounce handling recipe covers the webhook wiring end to end.

Start with the suppression list — it's an afternoon of work and it protects both failure channels at once. Then ask the harder question: if your agent's bounce rate doubled overnight, would anything in your system notice before the 10% line did?