Protecting Sender Reputation When Agents Send Email

#email #security #saas #api

How many angry recipients does it take to stop an AI agent from sending email? Fewer than you'd guess: on Agent Accounts, a complaint rate of 0.5% pauses sending outright, and 0.1% — one spam report per thousand messages — is enough to put the account under review. An autonomous sender can cross either line in an afternoon, so reputation deserves a place in your architecture, not just your deliverability checklist.

The thresholds that pause you

Agent Accounts (currently in beta) are Nylas-hosted mailboxes, and the platform tracks rolling bounce and complaint rates per account. The send limits page publishes the exact lines:

Bounce rate	State
Under 2%	Healthy
5% or above	Under review
10% or above	Sending paused

Complaint rate	State
Under 0.1%	Healthy
0.1% or above	Under review
0.5% or above	Sending paused

A few details keep these fair. Only hard bounces (sends to addresses that don't exist) count toward the bounce rate; soft bounces from full mailboxes or greylisting don't. Complaints are counted when a recipient clicks "Mark this email as spam" or drags your mail to junk, and only recipient domains that send complaint feedback to senders contribute. The denominator is your recent representative send volume rather than a fixed time window, so the math stays meaningful at a hundred messages a day or a million.

Do the arithmetic for a small sender and the margin gets uncomfortable. An agent whose recent send volume is 2,000 messages needs just two spam reports to reach the 0.1% review line, and ten to reach the 0.5% pause. Low volume doesn't mean low risk; it means each unhappy recipient carries more weight.
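To make that arithmetic repeatable for your own volume, here is a minimal Python sketch. The function name `complaints_to_reach` and the constants are illustrative, not part of any Nylas SDK; the two thresholds are the complaint lines from the table above, held as exact fractions so rounding can't shift the answer.

```python
from fractions import Fraction
from math import ceil

# Complaint thresholds from the table above, as exact fractions.
COMPLAINT_REVIEW = Fraction(1, 1000)  # 0.1% -> under review
COMPLAINT_PAUSE = Fraction(5, 1000)   # 0.5% -> sending paused


def complaints_to_reach(volume: int, threshold: Fraction) -> int:
    """Smallest number of complaints whose rate is at or above `threshold`.

    `volume` stands in for the recent representative send volume; how the
    platform actually measures that volume is not modelled here.
    """
    if volume <= 0:
        raise ValueError("volume must be positive")
    # At least one complaint is needed before any rate exists at all.
    return max(1, ceil(volume * threshold))


if __name__ == "__main__":
    for volume in (200, 2_000, 50_000):
        review = complaints_to_reach(volume, COMPLAINT_REVIEW)
        pause = complaints_to_reach(volume, COMPLAINT_PAUSE)
        print(f"{volume} sends: review at {review}, pause at {pause}")
```

At 200 sends, a single complaint reaches both lines at once, which is the small-sender risk in its most extreme form.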

The part that surprises teams: pauses don't clear on a timer. Once sending is paused, resuming requires contacting support with the bounce or complaint source identified and the fix applied. There's no quiet period you can wait out, which means prevention is the entire game.

What enforcement looks like in code

"Under review" is silent to your application — sends keep succeeding while the clock runs. Enforcement and its neighbors surface as error responses on the send call, and they're worth distinguishing because the right reaction differs for each:

Status	Body	What it means	What to do
`400`	Text mentioning an account-level suspension or paused sending	Reputation enforcement paused the account	Stop sending, find the bounce/complaint source, contact support
`400`	`"domain is not verified"`	The `from` domain hasn't finished DNS verification	Fix the MX/TXT records — no amount of retrying helps
`429`	`"rate limit exceeded"`	A per-account or per-domain rate limit was hit	Back off and retry, or raise quotas through a policy

Your agent's send wrapper should treat these as three different branches: a 429 is retryable with backoff, the verification 400 is a configuration bug, and the pause 400 should trip a hard stop on the whole outbound loop.
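One way to express that branching is a small classifier that the send wrapper calls on every failed response. This is a sketch under assumptions: the `SendAction` names are invented for illustration, and because the table describes the pause body only as text mentioning a suspension or paused sending, the substring match for that case is a guess that should be checked against real responses.

```python
from enum import Enum


class SendAction(Enum):
    RETRY_WITH_BACKOFF = "retry_with_backoff"
    FIX_CONFIGURATION = "fix_configuration"
    HARD_STOP = "hard_stop"
    UNKNOWN = "unknown"


def classify_send_failure(status: int, body: str) -> SendAction:
    """Map a failed send response to the reaction the agent should take."""
    text = body.lower()
    if status == 429:
        # Rate limit: the only branch where retrying helps.
        return SendAction.RETRY_WITH_BACKOFF
    if status == 400 and "domain is not verified" in text:
        # DNS verification is incomplete; retrying cannot fix this.
        return SendAction.FIX_CONFIGURATION
    if status == 400 and ("suspen" in text or "paused" in text):
        # Assumed wording for reputation enforcement (see lead-in).
        return SendAction.HARD_STOP
    if status == 403 and "abuse restriction" in text:
        # Abuse restrictions, covered later in this article, also need support.
        return SendAction.HARD_STOP
    # Anything else should be logged and inspected, not retried blindly.
    return SendAction.UNKNOWN
```

Returning `UNKNOWN` instead of defaulting to a retry keeps an unrecognised error from being hammered in a loop.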

Watch the rates before the platform does

The only real-time window into the underlying rates is the transactional deliverability webhook family: `message.transactional.delivered`, `message.transactional.bounced`, `message.transactional.complaint`, and `message.transactional.rejected`. These are the same events the rate calculation is built from.

Wire those four into your own telemetry and add a circuit breaker: if bounces or complaints start climbing, pause your agent's outbound loop yourself. You'll see the problem before the platform acts on it, and a self-imposed pause needs no support ticket to lift. Subscription works like any other webhook:

curl --request POST \
  --url "https://api.us.nylas.com/v3/webhooks" \
  --header "Authorization: Bearer $NYLAS_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "trigger_types": [
      "message.transactional.delivered",
      "message.transactional.bounced",
      "message.transactional.complaint",
      "message.transactional.rejected"
    ],
    "callback_url": "https://yourapp.example.com/webhooks/deliverability"
  }'
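The circuit breaker itself can be sketched in a few lines of Python. Everything here is an assumption for illustration: the class name, the window size, the minimum sample, and the trip thresholds are invented (chosen to sit below the platform's review lines), and the code assumes your webhook handler can extract the trigger type string from each notification and that delivered events are subscribed too. It also treats every bounced event as a hard bounce, which may overcount.

```python
from collections import deque


class ReputationBreaker:
    """Rolling view of deliverability events; opens when a rate looks unsafe."""

    def __init__(self, window=5000, min_sends=200,
                 max_bounce_rate=0.02, max_complaint_rate=0.0005):
        # All four defaults are illustrative assumptions, not platform values.
        self.events = deque(maxlen=window)  # oldest events fall off the end
        self.min_sends = min_sends
        self.max_bounce_rate = max_bounce_rate
        self.max_complaint_rate = max_complaint_rate

    def record(self, trigger_type: str) -> None:
        """Feed one webhook trigger type, e.g. 'message.transactional.bounced'."""
        kind = trigger_type.rsplit(".", 1)[-1]
        if kind in ("delivered", "bounced", "complaint"):
            self.events.append(kind)

    def rates(self):
        """Return (sends, bounce_rate, complaint_rate) over the window."""
        delivered = self.events.count("delivered")
        bounced = self.events.count("bounced")
        complaints = self.events.count("complaint")
        sends = delivered + bounced
        bounce_rate = bounced / sends if sends else 0.0
        complaint_rate = complaints / delivered if delivered else 0.0
        return sends, bounce_rate, complaint_rate

    def is_open(self) -> bool:
        """True when the agent's outbound loop should pause itself."""
        sends, bounce_rate, complaint_rate = self.rates()
        if sends < self.min_sends:
            return False  # too little data to judge
        return (bounce_rate >= self.max_bounce_rate
                or complaint_rate >= self.max_complaint_rate)
```

The outbound loop would check `is_open()` before each send. The `min_sends` floor is a design choice that keeps a single early bounce from halting a brand-new account.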

Isolate domains so one tenant can't sink the rest

Reputation accrues to domains, which makes domain layout a blast-radius decision. The provisioning guide recommends a dedicated subdomain for production (agents.yourcompany.com) so agent traffic can't drag down your primary marketing domain. From there, three patterns scale the idea:

Reputation sharding. Split high-volume outbound across sales-a.yourcompany.com, sales-b.yourcompany.com, and so on, so trouble on one domain doesn't contaminate the others.
Per-customer domains. In multi-tenant apps, provision each customer's agents on that customer's own verified domain. One tenant's sloppy list hygiene then burns only their reputation — a single application can manage accounts across any number of registered domains, so this costs you nothing architecturally.
Environment separation. Register agents.staging.yourcompany.com and agents.yourcompany.com as separate domains, which can live in the same application, so load tests and integration suites never spend the production domain's reputation.
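For reputation sharding, the assignment of senders to shard domains should be stable, so one sender's history stays on one domain. A minimal Python sketch, assuming a hypothetical `shard_domain_for` helper and the example domains from the list above:

```python
import hashlib

# Example shard domains taken from the text; substitute your own.
SHARD_DOMAINS = ["sales-a.yourcompany.com", "sales-b.yourcompany.com"]


def shard_domain_for(key: str, domains=SHARD_DOMAINS) -> str:
    """Pick a shard domain for `key` (e.g. a campaign or agent ID).

    A cryptographic hash is used only because it is stable across
    processes and restarts, unlike Python's built-in hash() for strings.
    """
    if not domains:
        raise ValueError("at least one shard domain is required")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(domains)
    return domains[index]
```

Note that adding a domain to the list reshuffles some existing keys, so a persisted mapping may be the better choice once shards carry real history.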

Domain authentication is part of the same story: missing or misconfigured SPF, DKIM, or DMARC shows up as a higher bounce rate from recipient servers that refuse the mail outright. The TXT records you publish during domain registration carry that configuration, so verify before you scale up volume.

The other block: abuse restrictions

Separate from reputation math, an abuse restriction can be applied when the operations team identifies out-of-policy use. Matching sends fail immediately with `403 Forbidden` and the body `"send blocked by abuse restriction"`. There's no threshold to manage here, and the scoping deserves attention: a restriction can target a sender address, a domain including its subdomains, a grant, an application, or an entire organization. At the application level it stops every Agent Account under that app, not just the misbehaving one. Recovery is through support with your application ID, grant ID, and an example error; once cleared, sends succeed on the next attempt.

The hygiene that keeps you healthy

The boring practices work: validate recipient addresses before sending and skip anything that has hard-bounced before; honor unsubscribes immediately; use double opt-in for lists you care about. For an agent, encode these as code-level rules around the send call — an LLM won't independently decide that a recipient bounced last Tuesday.
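As one possible shape for those code-level rules, the Python sketch below gates every send on a suppression check. The `SuppressionList` class and its fields are hypothetical, and the regular expression is only a rough shape check, not full address validation.

```python
import re
from dataclasses import dataclass, field

# Rough shape check only: something@something.tld with no spaces.
_ADDRESS_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


@dataclass
class SuppressionList:
    """Addresses the agent must never email again."""
    hard_bounced: set = field(default_factory=set)
    unsubscribed: set = field(default_factory=set)

    def check(self, recipient: str):
        """Return (allowed, reason) for a candidate recipient."""
        address = recipient.strip().lower()
        if not _ADDRESS_SHAPE.match(address):
            return False, "malformed address"
        if address in self.hard_bounced:
            return False, "previous hard bounce"
        if address in self.unsubscribed:
            return False, "unsubscribed"
        return True, "ok"

    def record_hard_bounce(self, recipient: str) -> None:
        self.hard_bounced.add(recipient.strip().lower())

    def record_unsubscribe(self, recipient: str) -> None:
        self.unsubscribed.add(recipient.strip().lower())
```

Calling `check()` inside the send wrapper, rather than describing the rule in the prompt, means the model cannot skip it.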

Two questions that come up often:

Do these thresholds apply to connected Gmail or Microsoft accounts? No — they're specific to Agent Account grants (provider: "nylas"), where Nylas owns the SMTP path. Deliverability for connected grants is governed by the upstream provider's own rules.

How does this interact with the daily send quota? They're independent ceilings. The free plan caps volume at 200 messages per account per day (paid plans have no daily cap by default, and a policy can set a stricter quota), while bounce and complaint rates judge quality at any volume. You can be paused well under quota, and you can hit quota with a spotless reputation.

A worthwhile exercise before your agent's volume grows: sketch what happens to your product if its sending paused today, and how long until you'd notice. If the answers are "everything stops" and "when a customer complains," start with the webhook subscription above — it's one API call and it turns both answers around.