RFC-WF-0020
Rate Limits, Abuse Controls & Operational Safety (RAOS)
Status: Draft Standard
Version: 1.0.0
Date: 20 Nov 2025
Category: Standards Track
Author: FullAgenticStack Initiative
Dependencies: RFC-WF-0003 (CCP), RFC-WF-0004 (ACSM), RFC-WF-0007 (OoC), RFC-WF-0008 (RCP), RFC-WF-0009 (TMSI), RFC-WF-0010 (IDS), RFC-WF-0015 (PPGP)
License: Open Specification (Public, Royalty-Free)
Abstract
This document specifies Rate Limits, Abuse Controls & Operational Safety (RAOS) for WhatsApp-first systems. RAOS defines normative requirements for throttling, abuse detection, safe degradation, and “lockdown” operational modes across command execution, observability queries, recovery actions, and token/confirmation flows. RAOS reduces risk from spam, brute force, denial-of-service, and operational runaway (e.g., repeated retries or agent loops) while preserving WhatsApp-first operability.
Index Terms— rate limiting, abuse prevention, denial-of-service, throttling, safe degradation, operational safety, conversational systems.
I. Introduction
WhatsApp-first systems expose powerful operations through a conversational interface. Without explicit abuse controls, the same convenience becomes an attack vector: flooding commands, brute-forcing confirmation tokens, scraping observability, or repeatedly triggering recovery actions. RAOS standardizes a minimum set of controls to keep systems stable and auditable under abuse, while maintaining a usable conversational experience.
RAOS also covers operational safety for automation and agents: preventing runaway loops, uncontrolled retries, and mass effects.
II. Scope
RAOS specifies:
- Rate limiting requirements by actor/tenant/command class
- Abuse detection signals and escalation actions
- Safe degradation strategies for OoC and RCP
- Token brute-force protections
- Retry/reprocess safety limits and loop prevention
- Lockdown / maintenance modes controllable via WhatsApp (admin)
- Evidence/telemetry requirements for abuse events
RAOS does not prescribe a specific algorithm (token bucket, leaky bucket); it defines behavioral constraints.
III. Normative Language
The key words MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY are normative wherever they appear in this document.
IV. Definitions
Rate Limit: A rule that caps request frequency within a time window.
Abuse Event: A detected suspicious pattern (spam, brute force, scraping, loops).
Safe Degradation: Reducing system capabilities/verbosity without breaking core operability.
Lockdown Mode: A governed operational mode restricting high-risk actions temporarily.
V. Design Goals
RAOS MUST ensure:
- G1. Stability: prevent resource exhaustion via conversation ingress.
- G2. Safety: prevent brute force and runaway automation.
- G3. Usability: degrade gracefully; don’t hard-fail normal users unnecessarily.
- G4. Auditability: record abuse signals and actions as evidence/telemetry.
- G5. Governance: allow privileged operators to control modes via WhatsApp.
VI. Rate Limiting Model (Normative Minimum)
A. Required Dimensions
Rate limits MUST be enforceable across at least:
- actor (user/admin/agent id)
- tenant (tenant_id)
- command class (read-only vs mutation vs destructive vs admin)
- endpoint type (ingress, OoC queries, RCP actions)
B. Minimum Buckets
Implementations MUST define separate limits for:
- Read-only commands (OoC/status queries)
- State-mutating commands
- Destructive commands (S2)
- Admin high-impact commands (S3)
- Recovery actions (RCP)
- Confirmation attempts / token submissions
C. Default Principle (Normative)
Destructive/admin/recovery flows MUST have stricter limits than read-only flows.
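RAOS does not mandate an algorithm, so the following is only a minimal sketch of how the dimensions and buckets above could be combined. It assumes a token bucket keyed by (tenant, actor, command class). The class names, the `LIMITS` table, and every number in it are hypothetical placeholders, chosen only so that destructive, admin, and recovery classes end up stricter than read-only ones. Endpoint type and tenant-wide aggregate limits are left out for brevity.

```python
import time
from dataclasses import dataclass

# Hypothetical limits per command class: (burst capacity, refill tokens/second).
# Values are illustrative only; RAOS does not define concrete numbers.
LIMITS = {
    "read_only": (30, 1.0),
    "mutation": (10, 0.2),
    "destructive": (3, 0.02),
    "admin": (3, 0.02),
    "recovery": (3, 0.02),
    "confirmation": (5, 0.05),
}


@dataclass
class Bucket:
    capacity: float
    refill_per_sec: float
    tokens: float
    updated_at: float


class RateLimiter:
    """Token bucket per (tenant_id, actor_id, command_class)."""

    def __init__(self, limits=LIMITS, clock=time.monotonic):
        self._limits = limits
        self._clock = clock
        self._buckets = {}

    def allow(self, tenant_id, actor_id, command_class):
        capacity, refill = self._limits[command_class]
        now = self._clock()
        key = (tenant_id, actor_id, command_class)
        bucket = self._buckets.get(key)
        if bucket is None:
            # New buckets start full so a first request is never rejected.
            bucket = Bucket(capacity, refill, capacity, now)
            self._buckets[key] = bucket
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - bucket.updated_at
        bucket.tokens = min(bucket.capacity,
                            bucket.tokens + elapsed * bucket.refill_per_sec)
        bucket.updated_at = now
        if bucket.tokens >= 1.0:
            bucket.tokens -= 1.0
            return True
        return False
```

Because each command class has its own bucket, an actor who exhausts the destructive budget can still run status queries, which keeps the system operable while the stricter limit holds.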
VII. Abuse Detection Signals (Minimum Set)
Systems MUST detect at least:
- A1 Spam ingress: high message rate from an actor
- A2 Repeated denied authz: repeated scope denials
- A3 Confirmation brute force: repeated token/phrase failures
- A4 OoC scraping: high-frequency observability queries
- A5 Recovery loops: repeated retries/reprocess without convergence
- A6 Agent runaway: repeated agent proposals/actions beyond thresholds
Detection MAY be threshold-based or risk-scored, but MUST be explainable and auditable.
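As one possible reading of the threshold-based option, the sketch below counts events per (signal, actor) in a sliding time window. When the threshold is crossed it returns a small finding that states which signal fired, the observed count, and the limit, so the detection stays explainable and auditable. The `THRESHOLDS` values and the `AbuseDetector` name are assumptions made for illustration, not part of RAOS.

```python
from collections import defaultdict, deque

# Hypothetical thresholds: signal -> (max events, window in seconds).
THRESHOLDS = {
    "A1": (20, 10),    # spam ingress
    "A2": (5, 60),     # repeated denied authz
    "A3": (3, 300),    # confirmation brute force
    "A4": (30, 60),    # OoC scraping
    "A5": (5, 600),    # recovery loops
    "A6": (10, 300),   # agent runaway
}


class AbuseDetector:
    def __init__(self, thresholds=THRESHOLDS):
        self._thresholds = thresholds
        self._events = defaultdict(deque)  # (signal, actor_id) -> timestamps

    def record(self, signal, actor_id, now):
        """Record one event; return a finding dict if the threshold is exceeded."""
        max_events, window = self._thresholds[signal]
        events = self._events[(signal, actor_id)]
        events.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while events and events[0] <= now - window:
            events.popleft()
        if len(events) > max_events:
            return {
                "signal": signal,
                "actor": actor_id,
                "count": len(events),
                "threshold": max_events,
                "window_seconds": window,
            }
        return None
```

A risk-scored detector could replace the fixed thresholds, provided it still emits the inputs that led to the decision.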
VIII. Mitigation Actions (Normative)
Upon abuse detection, implementations MUST support the following mitigations:
A. Throttle
Reduce allowed rate for the offending actor/tenant.
B. Challenge / Step-up Escalation
For sensitive actions, require step-up earlier or more frequently.
C. Degradation
For OoC, reduce detail level (summary-only) under stress or suspicious activity.
D. Temporary Suspension (Scoped)
Temporarily suspend only the specific actor or capability class (not whole tenant) when possible.
E. Lockdown Mode (Admin-Controlled)
Enable a mode where:
- destructive/admin/recovery actions are restricted to a minimal operator set
- confirmation tokens become mandatory for broader classes
- OoC output detail is reduced
Lockdown MUST be enabled/disabled via WhatsApp admin commands (subject to ACSM step-up).
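The lockdown rules above can be read as a small policy gate. The sketch below is an illustration under stated assumptions: `ModeState`, `set_lockdown`, and `evaluate` are hypothetical names, the `step_up_passed` flag stands in for whatever ACSM step-up result an implementation has, and the choice of which classes need tokens outside lockdown is assumed rather than taken from this RFC.

```python
from dataclasses import dataclass, replace

HIGH_RISK = frozenset({"destructive", "admin", "recovery"})


@dataclass(frozen=True)
class ModeState:
    admins: frozenset      # actors allowed to toggle lockdown
    operators: frozenset   # minimal operator set that keeps high-risk access
    lockdown: bool = False


def set_lockdown(state, actor_id, enabled, step_up_passed):
    """Toggle lockdown; only an admin who passed step-up may do so."""
    if actor_id not in state.admins or not step_up_passed:
        raise PermissionError("lockdown changes need an admin with step-up")
    return replace(state, lockdown=enabled)


def evaluate(state, actor_id, command_class):
    """Return the gate decision for one command under the current mode."""
    if not state.lockdown:
        return {
            "allowed": True,
            # Assumption: outside lockdown only these classes need a token.
            "require_token": command_class in ("destructive", "admin"),
            "ooc_detail": "full",
        }
    allowed = command_class not in HIGH_RISK or actor_id in state.operators
    return {
        "allowed": allowed,
        # Tokens become mandatory for every non-read-only class.
        "require_token": command_class != "read_only",
        # OoC output is reduced to summaries.
        "ooc_detail": "summary",
    }
```

Returning a new state object rather than mutating it in place makes each mode change easy to log as a distinct evidence record.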
IX. Confirmation Token Hardening
A. Attempt Limits
Systems MUST enforce attempt limits for confirmation tokens per:
- actor
- command_id
- time window
Exceeding limits MUST trigger a mitigation (throttle or temporary suspension).
B. Token Properties
Tokens MUST be:
- short-lived
- single-use
- envelope-bound (command_id + idempotency_key)
C. Side-Channel Safety
Systems SHOULD avoid giving attackers a high-resolution oracle (“token wrong by 1 char”). Responses SHOULD be generic, while evidence logs remain precise.
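To make the three hardening rules concrete, here is a hedged sketch of a token store. Tokens expire after a short TTL, are deleted on first successful use, are keyed by the (command_id, idempotency_key) envelope, and stop being accepted after a fixed number of failed attempts. `ConfirmationTokens`, `TOKEN_TTL_SECONDS`, `MAX_ATTEMPTS`, and the six-digit format are all assumptions for illustration. The precise reason returned by `verify` is meant for evidence logs only; the user would see the generic message.

```python
import hashlib
import hmac
import secrets

TOKEN_TTL_SECONDS = 120  # hypothetical lifetime
MAX_ATTEMPTS = 3         # hypothetical attempt limit per token
GENERIC_FAILURE = "Confirmation failed. Please request a new confirmation."


class ConfirmationTokens:
    def __init__(self):
        self._pending = {}  # (command_id, idempotency_key) -> record

    def issue(self, actor_id, command_id, idempotency_key, now):
        token = f"{secrets.randbelow(10**6):06d}"
        self._pending[(command_id, idempotency_key)] = {
            "actor": actor_id,
            # Store only a digest so a leaked store does not leak tokens.
            "digest": hashlib.sha256(token.encode()).digest(),
            "expires_at": now + TOKEN_TTL_SECONDS,
            "attempts": 0,
        }
        return token

    def verify(self, actor_id, command_id, idempotency_key, token, now):
        """Return (ok, reason); reason is for evidence logs, not the user."""
        key = (command_id, idempotency_key)
        rec = self._pending.get(key)
        if rec is None or rec["actor"] != actor_id:
            return False, "unknown_envelope"
        if now >= rec["expires_at"]:
            del self._pending[key]
            return False, "expired"
        if rec["attempts"] >= MAX_ATTEMPTS:
            return False, "attempt_limit_exceeded"
        rec["attempts"] += 1
        digest = hashlib.sha256(token.encode()).digest()
        # Constant-time comparison avoids a timing oracle.
        if hmac.compare_digest(digest, rec["digest"]):
            del self._pending[key]  # single-use
            return True, "confirmed"
        return False, "mismatch"
```

In this sketch the attempt window is the token lifetime. A caller that sees `attempt_limit_exceeded` would also report an A3 signal so that a throttle or temporary suspension can follow.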
X. Recovery Safety and Loop Prevention
A. Retry Caps
RCP retry MUST be capped per command (attempt counter), and MUST require escalating verification or operator intervention beyond a threshold.
B. Backoff Requirements
Retry/reprocess flows SHOULD implement backoff.
C. Non-Converging Workflows
If a workflow fails repeatedly, the system MUST:
- stop automatic retries
- mark status as blocked or equivalent
- present next operator actions via OoC (evidence-backed)
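The retry cap, backoff, and non-convergence rules can be sketched together. The example below assumes capped exponential backoff with full jitter, which is one common choice and not something RAOS requires. `Workflow`, `on_failure`, the status strings other than blocked, and all constants are hypothetical.

```python
import random
from dataclasses import dataclass

MAX_AUTO_RETRIES = 5        # hypothetical cap on automatic attempts
BASE_DELAY_SECONDS = 2.0    # hypothetical backoff base
MAX_DELAY_SECONDS = 300.0   # hypothetical backoff ceiling


@dataclass
class Workflow:
    command_id: str
    attempts: int = 0
    status: str = "running"


def on_failure(wf, rng=random.random):
    """Record a failed attempt.

    Returns the delay in seconds before the next automatic retry, or None
    when automatic retries stop and an operator has to take over.
    """
    wf.attempts += 1
    if wf.attempts >= MAX_AUTO_RETRIES:
        # Non-converging: stop retrying and surface the workflow via OoC.
        wf.status = "blocked"
        return None
    ceiling = min(MAX_DELAY_SECONDS, BASE_DELAY_SECONDS * 2 ** wf.attempts)
    wf.status = "retry_scheduled"
    # Full jitter: pick a delay uniformly in [0, ceiling).
    return rng() * ceiling
```

Keeping the attempt counter on the workflow record, rather than in the retry loop, means the cap survives process restarts as long as the record is persisted.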
D. Compensation Safety
Compensation (R3) MUST be rate-limited strictly and MUST require step-up.
XI. Agent Safety Controls (AIP Alignment)
If agents exist:
- agents MUST have proposal rate limits
- agent-driven recovery recommendations MUST be capped
- “agent loop detection” MUST exist (A6)
Agents MUST NOT continuously propose the same failed plan without human intervention; repeated proposals MUST degrade to “suggestion-only” mode.
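One way to detect an agent repeating the same failed plan is to fingerprint each plan and count failures per fingerprint. The sketch below does this under assumptions: plans are JSON-serializable dicts, two failures of an identical plan trigger the downgrade, and only an explicit human reset restores proposal rights. `AgentGuard` and its methods are hypothetical names.

```python
import hashlib
import json

MAX_REPEATED_FAILURES = 2  # hypothetical threshold for signal A6


class AgentGuard:
    def __init__(self):
        self._failed = {}              # (agent_id, fingerprint) -> failures
        self._suggestion_only = set()  # agents downgraded by loop detection

    @staticmethod
    def fingerprint(plan):
        # Canonical JSON so key order does not change the fingerprint.
        canonical = json.dumps(plan, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode()).hexdigest()

    def record_failure(self, agent_id, plan):
        key = (agent_id, self.fingerprint(plan))
        self._failed[key] = self._failed.get(key, 0) + 1
        if self._failed[key] >= MAX_REPEATED_FAILURES:
            self._suggestion_only.add(agent_id)

    def mode(self, agent_id):
        return "suggestion_only" if agent_id in self._suggestion_only else "propose"

    def human_reset(self, agent_id):
        """Called after a human has reviewed the failing plan."""
        self._suggestion_only.discard(agent_id)
        self._failed = {k: v for k, v in self._failed.items() if k[0] != agent_id}
```

Exact-match fingerprints miss plans that differ only trivially, so a real deployment would likely normalize plans further before hashing.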
XII. Evidence and Telemetry Requirements
Abuse and mitigation events MUST be recorded as:
- telemetry signals (metrics/logs/traces), and
- evidence artifacts or evidence-linked observations where relevant
At minimum, systems MUST record:
- actor/tenant
- trigger signal (A1–A6)
- mitigation action taken
- time window
- affected command(s) if applicable
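The minimum record above maps naturally onto a small immutable structure. The following is a sketch only: the field names, the mitigation labels, and the JSON serialization are assumptions, since RAOS lists what must be recorded but not a wire format.

```python
import json
from dataclasses import asdict, dataclass

SIGNALS = {"A1", "A2", "A3", "A4", "A5", "A6"}
# Hypothetical labels for the mitigations in Section VIII.
MITIGATIONS = {"throttle", "step_up", "degrade", "suspend", "lockdown"}


@dataclass(frozen=True)
class AbuseEvent:
    tenant_id: str
    actor_id: str
    signal: str              # trigger signal, A1 to A6
    mitigation: str          # mitigation action taken
    window_start: str        # ISO 8601 timestamps bounding the time window
    window_end: str
    command_ids: tuple = ()  # affected command(s), if applicable

    def __post_init__(self):
        if self.signal not in SIGNALS:
            raise ValueError(f"unknown signal: {self.signal}")
        if self.mitigation not in MITIGATIONS:
            raise ValueError(f"unknown mitigation: {self.mitigation}")

    def to_json(self):
        # Sorted keys give a stable representation for evidence hashing.
        return json.dumps(asdict(self), sort_keys=True)
```

A redacted OoC summary could then be produced by dropping or masking `actor_id` and `command_ids` before display.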
OoC SHOULD allow privileged users to query recent abuse actions in a redacted summary form.
XIII. Policy Binding (PPGP)
RAOS limits and thresholds SHOULD be configurable via policy packs (PPGP), including per-tenant overrides.
If a policy pack is missing for high-impact actions, the system MUST fail closed for those actions (consistent with governance fail-closed principles).
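The fail-closed rule can be illustrated with a small lookup. The sketch below assumes a policy pack shaped as a dict with `defaults` and per-tenant `tenants` overrides; that shape, the `resolve_limit` name, and the built-in low-impact defaults are hypothetical, since the real schema belongs to PPGP.

```python
HIGH_IMPACT = frozenset({"destructive", "admin", "recovery"})

# Hypothetical built-in fallbacks, used for low-impact classes only.
BUILTIN_LOW_IMPACT = {
    "read_only": {"max_per_minute": 30},
    "mutation": {"max_per_minute": 10},
}


def resolve_limit(policy_pack, tenant_id, command_class):
    """Return the limit for a command class, or None if it must be denied."""
    if policy_pack is not None:
        # Per-tenant override wins over the pack-wide default.
        override = policy_pack.get("tenants", {}).get(tenant_id, {}).get(command_class)
        if override is not None:
            return override
        base = policy_pack.get("defaults", {}).get(command_class)
        if base is not None:
            return base
    if command_class in HIGH_IMPACT:
        # Fail closed: no policy means no high-impact action.
        return None
    return BUILTIN_LOW_IMPACT.get(command_class)
```

A caller treats `None` as a denial, so a missing or incomplete pack blocks destructive, admin, and recovery actions while status queries keep working.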
XIV. Relationship to Other RFCs
- CCP (0003): RAOS constrains confirmation and command ingress.
- ACSM (0004): mitigation may require step-up and scope enforcement.
- OoC (0007): RAOS defines OoC scraping controls and safe degradation.
- RCP (0008): RAOS limits recovery loops and compensation abuse.
- TMSI (0009): RAOS mitigates DoS/abuse threats and token brute force.
- IDS (0010): replay safety complements dedupe but does not replace RAOS.
- PPGP (0015): policy binding for thresholds and modes.
XV. Security Considerations
Overly strict limits can create self-inflicted outages; overly lax limits enable abuse. Implementations MUST balance stability and availability. Lockdown mode MUST be protected by strong admin controls (for example, the ACSM step-up required for lockdown commands) so that attackers cannot maliciously “lock” the system.
XVI. Conclusion
RAOS standardizes the operational safety layer for WhatsApp-first systems: rate limits, abuse detection, mitigation strategies, and governed lockdown controls. These mechanisms protect both the system and its users from spam, brute force, scraping, and runaway automation—while preserving conversational operability and evidence-backed accountability.
References
[1] RFC-WF-0003, Conversational Command Protocol (CCP).
[2] RFC-WF-0004, Administrative Command Security Model (ACSM).
[3] RFC-WF-0007, Observability over Conversation (OoC).
[4] RFC-WF-0008, Recovery & Compensation Protocol (RCP).
[5] RFC-WF-0009, Threat Model & Security Invariants (TMSI).
[6] RFC-WF-0010, Idempotency & Delivery Semantics (IDS).
[7] RFC-WF-0015, Policy Packs & Governance Profiles (PPGP).
Concepts and Technologies
Rate limiting buckets, abuse signal detection, token brute-force protection, safe degradation, lockdown modes, retry caps and backoff, recovery loop prevention, agent runaway detection, evidence-backed abuse reporting.