RFC-WF-0014
Channel Adapter & Message Normalization (CAMN)
Status: Draft Standard
Version: 1.0.0
Date: 20 Nov 2025
Category: Standards Track
Author: FullAgenticStack Initiative
Dependencies: RFC-WF-0001 (WFCS), RFC-WF-0002 (WWCS), RFC-WF-0003 (CCP), RFC-WF-0005 (CRCD), RFC-WF-0006 (EAS), RFC-WF-0010 (IDS), RFC-WF-0013 (WKD)
License: Open Specification (Public, Royalty-Free)
## Abstract
This document specifies Channel Adapter & Message Normalization (CAMN) for WhatsApp-first systems. CAMN defines normative requirements for converting raw WhatsApp inbound/outbound traffic (text, audio, image, document, interactive replies) into a canonical, audit-friendly Normalized Message Event model. CAMN standardizes message identity, deduplication hints, media handling boundaries, STT attachment semantics, and trace bindings so that higher layers (CCP/IDS/EAS/OoC/RCP) can operate consistently across implementations.
Index Terms— channel adapter, message normalization, multimodal input, STT, media ingestion, deduplication, trace binding, WhatsApp-first.
## I. Introduction
WhatsApp-first compliance requires multimodal parity (WFCS) and safe command execution (CCP/IDS). However, implementations often break at the very first step: inconsistent mapping of WhatsApp message payloads into internal events, missing stable identifiers, lack of media provenance, or non-deterministic STT attachment. CAMN standardizes this “edge contract” to reduce ambiguity and to make dedupe, evidence, and observability reliable.
CAMN does not attempt to replace vendor payloads. It defines the canonical internal event representation after ingestion.
## II. Scope
CAMN specifies:
- Canonical normalized message event schema (minimum fields)
- Identity and trace binding rules for inbound/outbound messages
- Deduplication and replay handling at the adapter boundary (IDS alignment)
- Multimodal handling requirements (text/audio/image/document)
- STT attachment requirements for audio
- Media download/storage boundaries and security constraints
- Mapping requirements for interactive replies (menus/buttons) to CCP initiation
CAMN does not define which WhatsApp provider you use (Cloud API, BSP, etc.), only what you MUST emit internally.
## III. Normative Language
The key words MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY in this document are normative.
## IV. Definitions
Channel Adapter: Component that receives/sends WhatsApp messages and emits normalized events.
Normalized Message Event (NME): Canonical internal representation of a WhatsApp message.
Media Artifact: Stored representation of, or reference to, downloaded media (audio/image/document).
STT Attachment: The transcription object produced from an audio message.
## V. Design Goals
CAMN MUST ensure:
- G1. Deterministic Ingestion: same message → same normalized event identity
- G2. Traceability: every command can link back to conversation/message IDs
- G3. Multimodal Consistency: audio/media are first-class inputs, not “special cases”
- G4. Dedupe-Friendly: adapter emits enough identity to support IDS exactly-once effects
- G5. Audit-Ready: normalized events are suitable for evidence binding (EAS)
## VI. Normalized Message Event (NME) Model
### A. Required Fields (Inbound)
The adapter MUST emit an NME with at least:
- `event_id` (unique, stable)
- `direction=inbound`
- `channel=whatsapp`
- `conversation_id` (stable conversation/thread id)
- `message_id` (provider message id or stable derived id)
- `sender` (actor identifier; phone or internal actor id)
- `received_at` (timestamp)
- `message_type` (`text|audio|image|document|interactive|location|contact|unknown`)
- `content` (type-specific payload; may be empty for media-only)
- `media[]` (optional array for attachments)
- `interactive` (optional structure for replies/selections)
- `integrity` (hash/fingerprint fields; optional but recommended)
- `trace` (`correlation_id`/`causation_id` if available; else generated)
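The minimum inbound field set can be checked mechanically at the adapter boundary. The following is a minimal sketch, not a normative validator: the function name `validate_inbound_nme` and the error strings are assumptions made for illustration, and the check treats `media`, `interactive`, and `integrity` as optional, as the list above does.

```python
from typing import Any

# Fields the list above marks as required for inbound events.
# media, interactive, and integrity are optional, so they are not checked.
REQUIRED_INBOUND_FIELDS = (
    "event_id", "direction", "channel", "conversation_id", "message_id",
    "sender", "received_at", "message_type", "content", "trace",
)

MESSAGE_TYPES = {
    "text", "audio", "image", "document",
    "interactive", "location", "contact", "unknown",
}


def validate_inbound_nme(event: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the minimum is met."""
    problems = [
        f"missing field: {name}"
        for name in REQUIRED_INBOUND_FIELDS
        if name not in event
    ]
    # Only check fixed values when the field is present, to avoid double reports.
    if "direction" in event and event["direction"] != "inbound":
        problems.append("direction must be 'inbound'")
    if "channel" in event and event["channel"] != "whatsapp":
        problems.append("channel must be 'whatsapp'")
    if "message_type" in event and event["message_type"] not in MESSAGE_TYPES:
        problems.append("message_type is not a recognized value")
    return problems
```

An adapter could run a check like this just before emitting and route failures to a dead-letter path rather than dropping them silently.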
### B. Required Fields (Outbound)
Outbound messages MUST be emitted as NMEs with:
- `direction=outbound`
- `recipient`
- `sent_at`
- `outbound_message_id` (provider id once available, or pending reference)
- linkage to the inbound trigger (`in_reply_to.message_id` when applicable)
### C. Canonical Example (JSON)
```json
{
  "event_id": "uuid",
  "direction": "inbound",
  "channel": "whatsapp",
  "conversation_id": "conv_123",
  "message_id": "wamid.xxx",
  "sender": { "actor_id": "user_456", "display": "+55..." },
  "received_at": "2026-02-22T00:00:00Z",
  "message_type": "audio",
  "content": {},
  "media": [
    {
      "media_id": "provider_media_id",
      "kind": "audio",
      "mime": "audio/ogg",
      "size_bytes": 182332,
      "sha256": "optional",
      "storage_ref": "s3://bucket/key-or-local-ref"
    }
  ],
  "stt": {
    "status": "completed",
    "transcript": "cancelar pedido 204",
    "confidence": 0.86,
    "engine": "implementation-defined",
    "completed_at": "2026-02-22T00:00:02Z"
  },
  "trace": {
    "correlation_id": "corr_789",
    "causation_id": "msg_wamid.xxx"
  }
}
```
---
## VII. Identity, Dedupe, and Replay Handling
### A. Stable Identity
`message_id` MUST be stable for a given inbound message delivery. If the provider supplies an ID, it MUST be used. If not, the adapter MUST derive a stable ID from canonicalized content + sender + timestamp bucket (implementation-defined) and MUST document that derivation.
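Since the derivation is implementation-defined, one possible shape is sketched below. Everything specific here is an assumption for illustration: the 60-second bucket width, NFKC plus whitespace collapsing as the canonicalization, the `derived_` prefix, and the function names. A real adapter would document its own choices as the rule above requires.

```python
import hashlib
import unicodedata
from typing import Optional

# Assumed bucket width; the specification leaves this to the implementation.
BUCKET_SECONDS = 60


def canonicalize(text: str) -> str:
    """Unicode-normalize (NFKC), trim, and collapse whitespace runs."""
    return " ".join(unicodedata.normalize("NFKC", text).split())


def stable_message_id(provider_id: Optional[str], sender: str,
                      content: str, received_epoch: int) -> str:
    """Use the provider id when present; otherwise derive a deterministic id."""
    if provider_id:
        return provider_id
    bucket = received_epoch // BUCKET_SECONDS
    # The unit separator keeps field boundaries unambiguous in the hash input.
    material = "\x1f".join([sender, canonicalize(content), str(bucket)])
    digest = hashlib.sha256(material.encode("utf-8")).hexdigest()
    return f"derived_{digest[:32]}"
```

One limitation of any timestamp bucket: two deliveries of the same message that straddle a bucket boundary derive different ids, so this fallback narrows duplicates rather than eliminating them.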
### B. Adapter-Level Dedupe (Recommended)
The adapter SHOULD perform lightweight dedupe of identical inbound deliveries (same provider message id) to reduce downstream load.
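A lightweight adapter-level dedupe can be as small as a time-bounded set of seen ids. The sketch below is one way to do it in memory, under stated assumptions: the class name `SeenMessages`, the 10-minute window, and the entry cap are all illustrative, and a multi-instance deployment would need a shared store instead.

```python
import time
from collections import OrderedDict
from typing import Optional


class SeenMessages:
    """Remembers recently seen message ids for a bounded time window."""

    def __init__(self, ttl_seconds: float = 600.0, max_entries: int = 100_000):
        self.ttl_seconds = ttl_seconds
        self.max_entries = max_entries
        # Insertion order doubles as age order because ids are never re-inserted.
        self._seen: "OrderedDict[str, float]" = OrderedDict()

    def is_duplicate(self, message_id: str, now: Optional[float] = None) -> bool:
        """Return True for a repeat within the TTL; otherwise record the id."""
        now = time.monotonic() if now is None else now
        # Evict expired entries starting from the oldest.
        while self._seen:
            _, seen_at = next(iter(self._seen.items()))
            if now - seen_at < self.ttl_seconds:
                break
            self._seen.popitem(last=False)
        if message_id in self._seen:
            return True
        self._seen[message_id] = now
        if len(self._seen) > self.max_entries:
            self._seen.popitem(last=False)  # cap memory under floods
        return False
```

Because entries expire and the cache is per-process, a duplicate can still slip through, which is why this is only a load-reduction measure.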
### C. IDS Alignment
Even if adapter dedupe exists, downstream command execution MUST still implement IDS exactly-once effects (adapter dedupe is not sufficient).
---
## VIII. Multimodal Requirements
### A. Audio
For `message_type=audio`:
* The system MUST accept audio as a first-class input (WFCS).
* The adapter MUST emit an NME immediately and MAY later emit an update event when STT completes, or embed STT if synchronous.
* The system MUST bind the final STT transcript to the original message via `message_id`.
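For the asynchronous case, the follow-up event only needs to carry the transcript and the original `message_id`. The sketch below illustrates that binding; the `event_type` discriminator and the function name `stt_update_event` are hypothetical and not defined by CAMN, while the `stt` field names follow the canonical example.

```python
import copy
from typing import Any


def stt_update_event(original: dict[str, Any], transcript: str,
                     confidence: float, completed_at: str) -> dict[str, Any]:
    """Build a follow-up event that carries the transcript for an audio NME."""
    if original.get("message_type") != "audio":
        raise ValueError("STT attachments apply to audio messages only")
    return {
        "event_type": "stt_update",  # hypothetical discriminator, not part of CAMN
        "conversation_id": original["conversation_id"],
        "message_id": original["message_id"],  # binding key to the original message
        "stt": {
            "status": "completed",
            "transcript": transcript,
            "confidence": confidence,
            "completed_at": completed_at,
        },
        # Copy the trace so later mutation of either event cannot affect the other.
        "trace": copy.deepcopy(original.get("trace", {})),
    }
```

Consumers can then join the update to the earlier NME on `message_id` alone, regardless of how long transcription took.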
### B. Images/Documents
For `image` and `document`:
* The adapter MUST capture enough metadata to fetch/store media (subject to security).
* The adapter SHOULD compute `sha256` after download for tamper-evidence and dedupe.
* The system MUST NOT require a web upload as a substitute (WFCS/WWCS parity).
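Computing `sha256` after download does not require holding the whole file in memory. The following sketch hashes a stream in chunks and returns an updated media entry; the helper name `attach_sha256` and the 64 KiB chunk size are assumptions, and the in-memory stream stands in for a downloaded file.

```python
import hashlib
import io
from typing import Any, BinaryIO


def attach_sha256(media_entry: dict[str, Any], stream: BinaryIO,
                  chunk_size: int = 64 * 1024) -> dict[str, Any]:
    """Return a copy of the media entry with sha256 and size_bytes filled in."""
    digest = hashlib.sha256()
    size = 0
    # Read fixed-size chunks until the stream returns an empty bytes object.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
        size += len(chunk)
    updated = dict(media_entry)
    updated["sha256"] = digest.hexdigest()
    updated["size_bytes"] = size
    return updated


# Usage with an in-memory stream standing in for a downloaded file.
example = attach_sha256(
    {"media_id": "provider_media_id", "kind": "image"},
    io.BytesIO(b"abc"),
)
```

The same digest can serve both purposes named above: comparing it later detects tampering, and comparing it across messages detects repeated uploads of identical content.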
### C. Interactive Replies
For interactive selections (menus/buttons):
* The adapter MUST normalize user choice into an `interactive.selection` payload containing:
* selected option id/value
* display label
* original menu context reference (if available)
* The command layer MUST be able to map this to CCP initiation deterministically (WWCS).
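The normalization of a selection can be sketched as a small pure function. The raw payload shape assumed here (`interactive.button_reply` or `interactive.list_reply` with `id` and `title`, plus `context.id` for the originating menu) is modeled loosely on common provider webhooks and should be treated as an assumption; the output key names `option_id`, `label`, and `menu_context_ref` are likewise illustrative.

```python
from typing import Any


def normalize_selection(raw: dict[str, Any]) -> dict[str, Any]:
    """Map a raw interactive reply to an `interactive.selection` structure."""
    interactive = raw.get("interactive", {})
    reply = interactive.get("button_reply") or interactive.get("list_reply")
    if not isinstance(reply, dict) or "id" not in reply:
        raise ValueError("no selection found in interactive payload")
    context = raw.get("context")
    context_ref = context.get("id") if isinstance(context, dict) else None
    return {
        "selection": {
            # Values are coerced to plain strings and never interpreted here.
            "option_id": str(reply["id"]),
            "label": str(reply.get("title", "")),
            "menu_context_ref": context_ref,  # None when the provider omits it
        }
    }
```

Keeping `option_id` as an opaque string and resolving it against declared commands downstream is what makes the mapping to CCP initiation deterministic; the label is for display and audit only.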
---
## IX. Media Handling Boundaries and Security
### A. Download Policy
Media download MAY be deferred, but the NME MUST still contain a resolvable `media_id` reference.
### B. Storage References
If media is stored, the NME MUST include `storage_ref` and SHOULD include a content hash.
### C. Redaction
NME and media metadata MUST respect privacy and redaction policies. OoC outputs MUST avoid leaking raw media content unless privileged and explicitly requested.
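One way to apply this at the OoC boundary is to build a redacted view of the event rather than mutating the stored NME. The sketch below is an assumption-heavy illustration: the function names, the two boolean flags, and the choice to mask all but the last two phone digits are not defined by this document, and a real policy would likely cover more fields.

```python
import copy
from typing import Any


def mask_phone(value: str) -> str:
    """Mask every digit except the last two, keeping other characters."""
    digits_total = sum(ch.isdigit() for ch in value)
    out, seen = [], 0
    for ch in value:
        if ch.isdigit():
            seen += 1
            out.append(ch if seen > digits_total - 2 else "*")
        else:
            out.append(ch)
    return "".join(out)


def ooc_view(event: dict[str, Any], privileged: bool = False,
             media_requested: bool = False) -> dict[str, Any]:
    """Return a copy of the event that is safe to show in OoC output."""
    view = copy.deepcopy(event)
    sender = view.get("sender")
    if not privileged and isinstance(sender, dict) and "display" in sender:
        sender["display"] = mask_phone(str(sender["display"]))
    # Media locations are exposed only when privileged AND explicitly requested.
    if not (privileged and media_requested):
        for item in view.get("media", []):
            item.pop("storage_ref", None)
    return view
```

Working on a deep copy keeps the stored event intact for evidence purposes while the redacted view is what leaves the system.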
---
## X. Binding to CCP, EAS, OoC
### A. CCP Binding
Every CCP Command Envelope trace MUST be able to reference:
* `conversation_id`
* one or more `message_id`s that originated the command
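A command that spans several messages (for example, an audio request followed by a confirmation button) can collect its origin references as sketched below. The function name `command_trace` and the output key `message_ids` are hypothetical; the actual envelope layout is defined by CCP, not here.

```python
from typing import Any


def command_trace(events: list[dict[str, Any]]) -> dict[str, Any]:
    """Collect the conversation and message ids that originated a command."""
    if not events:
        raise ValueError("a command needs at least one originating message")
    conversation_ids = {event["conversation_id"] for event in events}
    if len(conversation_ids) != 1:
        raise ValueError("originating messages span multiple conversations")
    # dict.fromkeys removes repeats while preserving arrival order.
    message_ids = list(dict.fromkeys(event["message_id"] for event in events))
    return {
        "conversation_id": conversation_ids.pop(),
        "message_ids": message_ids,
    }
```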
### B. EAS Binding
Evidence artifacts MUST reference the same `conversation_id` and `message_id`s (directly or via trace ids), ensuring audit continuity from raw WhatsApp input to executed effect.
### C. OoC Binding
OoC queries SHOULD allow lookup by message id (or conversation id + index) as a convenience path to find the associated command_id and evidence chain.
---
## XI. Conformance Requirements
An implementation is CAMN-compliant if it:
1. Emits NMEs for all inbound message types used by the system (text/audio/image/document at minimum)
2. Preserves stable `conversation_id` and `message_id` bindings
3. Supports STT attachment binding for audio messages
4. Provides metadata sufficient for dedupe and evidence traceability
5. Supports deterministic mapping of interactive replies to CCP initiation
---
## XII. Relationship to Other RFCs
* **WFCS (0001):** requires multimodal and full WhatsApp operability.
* **WWCS (0002):** interactive/menu mapping expectations.
* **CCP (0003):** consumes normalized events to create envelopes.
* **CRCD (0005):** maps interactive selections to command declarations.
* **EAS (0006):** binds evidence to conversation/message trace.
* **IDS (0010):** relies on stable message identity and replay assumptions.
* **WKD (0013):** may advertise adapter/event schemas as part of discovery.
---
## XIII. Security Considerations
* Media can contain sensitive data; enforce strict storage/access policies.
* Do not treat STT transcripts as authoritative without confirmation for mutations.
* Normalize carefully to avoid injection through “interactive payload” fields.
* Rate-limit inbound floods at adapter boundary.
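Rate limiting at the adapter boundary is commonly done with a token bucket, typically one per sender. The sketch below shows that technique in its simplest in-process form; the class name, the parameters, and the idea of keying buckets by sender are assumptions, and this document does not prescribe a particular algorithm.

```python
import time
from typing import Optional


class TokenBucket:
    """Allows short bursts up to `capacity` and a sustained `rate_per_second`."""

    def __init__(self, rate_per_second: float, capacity: float):
        self.rate = rate_per_second
        self.capacity = capacity
        self.tokens = capacity  # start full so the first burst is accepted
        self.updated_at: Optional[float] = None

    def allow(self, now: Optional[float] = None) -> bool:
        """Consume one token if available; return whether the message passes."""
        now = time.monotonic() if now is None else now
        if self.updated_at is not None:
            # Refill in proportion to elapsed time, never above capacity.
            elapsed = max(0.0, now - self.updated_at)
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

An adapter might keep a dictionary of buckets keyed by sender and reject or queue a message when `allow()` returns `False`, still recording the rejection so the flood itself remains observable.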
---
## XIV. Conclusion
CAMN standardizes the WhatsApp edge contract: how raw messages become canonical normalized events with stable identity, multimodal attachments, and trace bindings. This allows CCP/IDS/EAS/OoC/RCP to remain deterministic and auditable across implementations, reducing fragility at the most failure-prone boundary.
---
## References
[1] RFC-WF-0001, *WhatsApp-First Compliance Core (WFCS).*
[2] RFC-WF-0002, *Web-to-WhatsApp Conversion Standard (WWCS).*
[3] RFC-WF-0003, *Conversational Command Protocol (CCP).*
[4] RFC-WF-0005, *Command Registry & Capability Declaration (CRCD).*
[5] RFC-WF-0006, *Evidence Artifact Schema (EAS).*
[6] RFC-WF-0010, *Idempotency & Delivery Semantics (IDS).*
[7] RFC-WF-0013, *Well-Known Discovery & Interop Endpoints (WKD).*
---
## Concepts and Technologies
Channel adapters, normalized message events, multimodal ingestion, STT binding, media artifact hashing, interactive reply normalization, trace/correlation IDs, adapter dedupe, evidence traceability.