Resurface Claude Code Usage Across Your Team with CloudWatch OTEL (No Lambda)

I've been building AI tooling infrastructure to empower a team of 50+ software engineers to do vibe coding safely. We went from 3 engineers using AI full-time to 50+ in 6 months — including non-engineers. (I co-presented on this journey at AgentCon Hong Kong 2026.)

One thing we learned: you can't improve what you can't measure. Once you give a team AI coding tools, you want visibility into how they're being used — not to evaluate individual engineers, but to understand adoption patterns. Which tools are people reaching for? How large are the prompts? What tool calls are being made? Making these metrics visible to everyone helps the team learn from each other and helps champions pull others forward.

This post is about the plumbing: how to get that telemetry data from coding agents into CloudWatch with minimal infrastructure.

"But we already have an LLM gateway." If your team routes AI traffic through a gateway such as LiteLLM or Amazon Bedrock, you already have token-level usage data. But if your engineers are on coding plans (Claude Team/Max, OpenCode Go, GitHub Copilot seats, ChatGPT Codex), the LLM calls bypass your gateway entirely. You lose visibility into the interesting stuff: how many tool calls per session, prompt sizes, which tools are being invoked, who's active at what times. That's where OTEL telemetry fills the gap.

AI coding tools are shipping with built-in OpenTelemetry support. Claude Code, Claude CoWork, GitHub Copilot, Gemini CLI, and Cursor (via hooks) all export metrics, traces, and log events over OTLP/HTTP — token counts, tool durations, model latency, the works. Kiro has an open feature request for native OTEL support too.

There's one catch: CloudWatch's OTLP endpoints require SigV4 signing. These tools' OTEL SDKs can't do that. Neither can most OTEL SDKs without an AWS-specific exporter or a collector sidecar.

The usual fix is a Lambda function that receives OTLP, signs it, and forwards it. That means cold starts, packaging, and another thing to maintain.

Here's a simpler way: API Gateway REST API with AWS Service Integration. APIGW signs the request with SigV4 using an execution role. No Lambda. No collector. No code.

Timeline: CloudWatch has supported OTLP ingestion for traces and logs for some time (availability varies by region — check the OTLP endpoints doc). Native OTLP metrics support launched April 2, 2026 in public preview, completing all three pillars of observability via OTLP.

Architecture

AI Coding Tool (OTEL SDK)
  ↓ OTLP/HTTP + x-api-key
API Gateway REST API
  ├→ POST /v1/metrics  → AWS Integration → monitoring (SigV4) → CloudWatch Metrics
  ├→ POST /v1/traces   → AWS Integration → xray (SigV4)      → X-Ray / CloudWatch Logs
  └→ POST /v1/logs     → AWS Integration → logs (SigV4)       → CloudWatch Logs

The client sends standard OTLP/HTTP requests with an API key. APIGW validates the key, assumes an IAM role, signs the request with SigV4, and forwards it to the CloudWatch OTLP endpoint. That's it.
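To make the request shape concrete, here is a minimal Python sketch of what a client-side exporter sends to the proxy. The endpoint URL and API key are placeholders, `build_otlp_metrics_request` and the scope name are hypothetical, and the token value is made up. The body follows the OTLP/JSON encoding for a single Sum data point. The request is only constructed, not sent.

```python
import json
import time
import urllib.request

# Placeholder values: substitute your own stage URL and API key.
ENDPOINT = "https://xxx.execute-api.us-west-2.amazonaws.com/prod"
API_KEY = "your-api-key"


def build_otlp_metrics_request(endpoint: str, api_key: str, tokens: int) -> urllib.request.Request:
    """Build (but do not send) an OTLP/HTTP JSON request with one counter data point."""
    now_ns = str(time.time_ns())  # OTLP/JSON encodes 64-bit integers as strings
    payload = {
        "resourceMetrics": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": "claude-code"}},
            ]},
            "scopeMetrics": [{
                "scope": {"name": "example.scope"},  # hypothetical scope name
                "metrics": [{
                    "name": "claude_code.token.usage",
                    "sum": {
                        "aggregationTemporality": 1,  # 1 = delta
                        "isMonotonic": True,
                        "dataPoints": [{"asInt": str(tokens), "timeUnixNano": now_ns}],
                    },
                }],
            }],
        }],
    }
    return urllib.request.Request(
        url=f"{endpoint}/v1/metrics",
        data=json.dumps(payload).encode("utf-8"),
        # The API key is the only credential the client holds; APIGW adds SigV4 later.
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


request = build_otlp_metrics_request(ENDPOINT, API_KEY, tokens=142)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. In practice the tool's own OTEL SDK does all of this for you, so the sketch is mainly useful for smoke-testing a freshly deployed stage.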

Why This Works

API Gateway REST API has an integration type called AWS Service Integration. It can call any AWS service API and sign the request with SigV4 using an execution role. The CloudWatch OTLP endpoints are standard AWS service endpoints:

| Signal | Endpoint | Service |
|---|---|---|
| Metrics | monitoring.{region}.amazonaws.com/v1/metrics | monitoring |
| Traces | xray.{region}.amazonaws.com/v1/traces | xray |
| Logs | logs.{region}.amazonaws.com/v1/logs | logs |

APIGW's integration URI format maps directly:

arn:aws:apigateway:{region}:monitoring:path/v1/metrics
arn:aws:apigateway:{region}:xray:path/v1/traces
arn:aws:apigateway:{region}:logs:path/v1/logs
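The mapping between a signal, its service name, and the integration URI is mechanical, so it can be sketched as a small lookup. This is only an illustration of the naming pattern described above; `OTLP_SIGNAL_SERVICES`, `integration_uri`, and `otlp_endpoint` are hypothetical names, not part of the template.

```python
# Service names per signal, as listed in the endpoint table.
OTLP_SIGNAL_SERVICES = {
    "metrics": "monitoring",
    "traces": "xray",
    "logs": "logs",
}


def integration_uri(region: str, signal: str) -> str:
    """Return the API Gateway AWS-integration URI for an OTLP signal."""
    service = OTLP_SIGNAL_SERVICES[signal]  # raises KeyError for unknown signals
    return f"arn:aws:apigateway:{region}:{service}:path/v1/{signal}"


def otlp_endpoint(region: str, signal: str) -> str:
    """Return the CloudWatch OTLP endpoint for the same signal."""
    service = OTLP_SIGNAL_SERVICES[signal]
    return f"https://{service}.{region}.amazonaws.com/v1/{signal}"


for signal in OTLP_SIGNAL_SERVICES:
    print(integration_uri("us-west-2", signal), "->", otlp_endpoint("us-west-2", signal))
```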

Setup

The full infrastructure is defined in a CloudFormation template (link at the bottom). Here's what it creates:

IAM Execution Role

APIGW needs an IAM role to sign requests to CloudWatch. The policy is scoped to only the actions and resources needed for OTLP ingestion:

Parameters:
  OtlpLogGroupName:
    Type: String
    Default: "otlp-logs"
    Description: CloudWatch Logs log group for OTLP log ingestion

Resources:
  OtlpExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service: apigateway.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: otlp-metrics
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  - cloudwatch:PutMetricData
                Resource: "*"
        - PolicyName: otlp-traces
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  - xray:PutTraceSegments
                  - xray:PutTelemetryRecords
                Resource: "*"
        - PolicyName: otlp-logs
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  - logs:PutLogEvents
                  - logs:CreateLogStream
                  - logs:DescribeLogStreams
                Resource:
                  - !Sub "arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:${OtlpLogGroupName}:*"

Note: cloudwatch:PutMetricData doesn't support resource-level ARNs. The cloudwatch:namespace condition key exists but does not apply to the OTLP ingestion path — metrics are accepted regardless of namespace. X-Ray PutTraceSegments also doesn't support resource-level restrictions. Logs permissions are scoped to a specific log group via the OtlpLogGroupName parameter.

API Gateway with AWS Service Integration

Each OTLP signal gets its own resource with an AWS integration:

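MetricsMethod:
  Type: AWS::ApiGateway::Method
  Properties:
    RestApiId: !Ref Api               # the REST API, also referenced by the usage plan
    ResourceId: !Ref MetricsResource  # assumed logical name for the /v1/metrics resource
    HttpMethod: POST
    AuthorizationType: NONE           # access is gated by the API key instead
    ApiKeyRequired: true
    Integration:
      Type: AWS                       # AWS service integration, signed with SigV4
      IntegrationHttpMethod: POST
      Uri: !Sub "arn:aws:apigateway:${AWS::Region}:monitoring:path/v1/metrics"
      Credentials: !GetAtt OtlpExecutionRole.Arn  # execution role used for signing
      PassthroughBehavior: WHEN_NO_MATCH
      ContentHandling: CONVERT_TO_TEXT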

Same pattern for /v1/traces (service: xray) and /v1/logs (service: logs).

API Key Authentication

Protect the endpoint with an API key so only your tools can send telemetry:

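ApiKey:
  Type: AWS::ApiGateway::ApiKey
  Properties:
    Enabled: true

UsagePlan:
  Type: AWS::ApiGateway::UsagePlan
  Properties:
    ApiStages:
      - ApiId: !Ref Api
        Stage: !Ref Stage

# Links the key to the usage plan; a key that belongs to no plan is rejected.
# The logical name UsagePlanKey is illustrative.
UsagePlanKey:
  Type: AWS::ApiGateway::UsagePlanKey
  Properties:
    KeyId: !Ref ApiKey
    KeyType: API_KEY
    UsagePlanId: !Ref UsagePlan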

Configure Your Tools

The proxy works with any tool that supports standard OTEL environment variables. Here's how to configure each:

Disclaimer: I personally use Claude Code routed through a custom LLM gateway (not a coding plan), since some coding plans aren't available in the region I live in. The configurations below are based on each tool's official documentation — your mileage may vary.

Claude Code

Official monitoring docs

export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=otlp
export OTEL_LOGS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_PROTOCOL=http/json
export OTEL_EXPORTER_OTLP_ENDPOINT=https://xxx.execute-api.us-west-2.amazonaws.com/prod
export OTEL_EXPORTER_OTLP_HEADERS=x-api-key=your-api-key
export OTEL_SERVICE_NAME=claude-code

For short-lived tasks, lower the export interval so data flushes before the process exits:

export OTEL_METRIC_EXPORT_INTERVAL=1000
export OTEL_LOGS_EXPORT_INTERVAL=1000

For traces (beta):

export CLAUDE_CODE_ENHANCED_TELEMETRY_BETA=1
export OTEL_TRACES_EXPORTER=otlp
export OTEL_TRACES_EXPORT_INTERVAL=1000

Enforcing OTEL across your team: Claude Code supports managed settings via managed-settings.json, deployable through MDM (Jamf, Intune, etc.). This lets you enforce OTEL configuration org-wide — engineers don't need to set environment variables manually, and they can't opt out.
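As a rough sketch of what such a managed settings file could contain, the snippet below generates one from the same variables shown in the exports above. It assumes the file accepts a top-level `env` object, as Claude Code settings files do. `build_managed_settings` is a hypothetical helper, and the endpoint and key are placeholders. Check the managed settings docs for the file location on each OS before deploying.

```python
import json


def build_managed_settings(endpoint: str, api_key: str) -> dict:
    """Return a settings document whose env block mirrors the shell exports."""
    return {
        "env": {
            "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
            "OTEL_METRICS_EXPORTER": "otlp",
            "OTEL_LOGS_EXPORTER": "otlp",
            "OTEL_EXPORTER_OTLP_PROTOCOL": "http/json",
            "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
            "OTEL_EXPORTER_OTLP_HEADERS": f"x-api-key={api_key}",
            "OTEL_SERVICE_NAME": "claude-code",
        }
    }


settings = build_managed_settings(
    "https://xxx.execute-api.us-west-2.amazonaws.com/prod", "your-api-key"
)
# Write this output to managed-settings.json and ship it through your MDM.
print(json.dumps(settings, indent=2))
```

Generating the file from a script keeps the endpoint and key in one place, which makes key rotation a matter of re-running the script and pushing the result.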

Claude CoWork (Team & Enterprise)

CoWork monitoring docs — configure via Admin Settings → Cowork → Monitoring:

  • OTLP endpoint: your APIGW URL
  • OTLP protocol: http/json
  • OTLP headers: x-api-key=your-api-key

CoWork streams user prompts, tool/MCP invocations, file access, human approval decisions, and API request details. It shares the same OTel event schema as Claude Code via the Claude Agent SDK — you can distinguish them by terminal.type (cowork vs cli).

GitHub Copilot CLI

Copilot CLI OTel reference — available since Copilot CLI 1.0.4:

export COPILOT_OTEL_ENDPOINT=https://xxx.execute-api.us-west-2.amazonaws.com/prod
export COPILOT_OTEL_HEADERS=x-api-key=your-api-key

Gemini CLI

Gemini CLI telemetry docs

export GEMINI_CLI_OTEL_EXPORT_ENDPOINT=https://xxx.execute-api.us-west-2.amazonaws.com/prod

Cursor (via Hooks)

Cursor doesn't have native OTEL export yet, but the community cursor-otel-hook project captures agent activity via Cursor's hook system and exports traces to any OTLP endpoint. Configure via otel_config.json:

{
  "OTEL_EXPORTER_OTLP_ENDPOINT": "https://xxx.execute-api.us-west-2.amazonaws.com/prod/v1/traces",
  "OTEL_EXPORTER_OTLP_PROTOCOL": "http/json",
  "OTEL_EXPORTER_OTLP_HEADERS": { "x-api-key": "your-api-key" }
}

What You Get

CloudWatch receives standard OTLP data. For Claude Code specifically:

  • Metrics: claude_code.token.usage (by token.type: input/output/cache_read/cache_creation), claude_code.cost.usage (USD), claude_code.session.count, claude_code.lines_of_code.count
  • Traces (beta): Spans linking each user prompt → API requests → tool executions
  • Log events: claude_code.user_prompt, claude_code.tool_decision, claude_code.tool_result, claude_code.api_request — all tagged with session.id and service.name=claude-code

Here's what real Claude Code log events look like after flowing through the proxy into CloudWatch Logs. This is actual data from an E2E test — a single prompt that triggered a Bash tool call:

claude_code.user_prompt — emitted when the user sends a prompt:

{
  "resource": {
    "attributes": {
      "host.arch": "arm64",
      "os.type": "linux",
      "service.name": "claude-code",
      "service.version": "2.1.114",
      "os.version": "6.17.0-1010-aws"
    }
  },
  "scope": {
    "name": "com.anthropic.claude_code.events",
    "version": "2.1.114"
  },
  "body": "claude_code.user_prompt",
  "attributes": {
    "event.sequence": 0,
    "user.id": "1c257d04...",
    "prompt_length": "40",
    "terminal.type": "non-interactive",
    "event.name": "user_prompt",
    "event.timestamp": "2026-04-18T11:23:13.187Z",
    "prompt": "<REDACTED>",
    "session.id": "846ab649-8bba-471e-8ec5-8756116d0840",
    "prompt.id": "88475ee2-59c2-4137-9201-5540c6a6cad1"
  }
}

claude_code.tool_result — emitted after each tool execution:

{
  "body": "claude_code.tool_result",
  "attributes": {
    "tool_name": "Bash",
    "tool_result_size_bytes": "899",
    "tool_input": "{\"command\":\"ls\",\"description\":\"List files in current directory\"}",
    "duration_ms": "95",
    "success": "true",
    "session.id": "846ab649-8bba-471e-8ec5-8756116d0840",
    "prompt.id": "88475ee2-59c2-4137-9201-5540c6a6cad1"
  }
}

claude_code.api_request — emitted after each API call with token counts and cost:

{
  "body": "claude_code.api_request",
  "attributes": {
    "model": "claude-sonnet-4-5-20250929",
    "input_tokens": "142",
    "output_tokens": "61",
    "cache_read_tokens": "0",
    "cache_creation_tokens": "25848",
    "cost_usd": "0.098271",
    "duration_ms": "4950",
    "speed": "normal",
    "session.id": "846ab649-8bba-471e-8ec5-8756116d0840",
    "prompt.id": "88475ee2-59c2-4137-9201-5540c6a6cad1"
  }
}

All events share the same prompt.id, linking them into a single interaction. The event.sequence field orders events within a prompt. Every record carries service.name=claude-code in resource attributes, so isolating Claude Code telemetry in a mixed pipeline is trivial — just filter on that in CloudWatch Logs Insights:

fields @timestamp, body, attributes.model, attributes.cost_usd, attributes.duration_ms
| filter resource.attributes.`service.name` = 'claude-code'
| filter body = 'claude_code.api_request'
| sort @timestamp desc
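If you export the events for offline analysis, the shared prompt.id makes it easy to roll them up per interaction. The sketch below uses trimmed copies of the three sample events from this section. `summarize_by_prompt` is a hypothetical helper, and the numeric attributes are parsed explicitly because they arrive as strings.

```python
from collections import defaultdict

PROMPT_ID = "88475ee2-59c2-4137-9201-5540c6a6cad1"

# Trimmed copies of the sample events; attribute values are strings on the wire.
events = [
    {"body": "claude_code.user_prompt",
     "attributes": {"prompt_length": "40", "prompt.id": PROMPT_ID}},
    {"body": "claude_code.tool_result",
     "attributes": {"tool_name": "Bash", "duration_ms": "95", "prompt.id": PROMPT_ID}},
    {"body": "claude_code.api_request",
     "attributes": {"input_tokens": "142", "output_tokens": "61",
                    "cost_usd": "0.098271", "prompt.id": PROMPT_ID}},
]


def summarize_by_prompt(records: list) -> dict:
    """Group events by prompt.id and total up tools, tokens, and cost."""
    summary = defaultdict(lambda: {"events": 0, "tools": [], "tokens": 0, "cost_usd": 0.0})
    for record in records:
        attrs = record["attributes"]
        entry = summary[attrs["prompt.id"]]
        entry["events"] += 1
        if record["body"] == "claude_code.tool_result":
            entry["tools"].append(attrs["tool_name"])
        elif record["body"] == "claude_code.api_request":
            entry["tokens"] += int(attrs["input_tokens"]) + int(attrs["output_tokens"])
            entry["cost_usd"] += float(attrs["cost_usd"])
    return dict(summary)


summary = summarize_by_prompt(events)
print(summary[PROMPT_ID])
```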

Region Availability

CloudWatch OTLP endpoints are available in most regions but not all. The OTLP metrics preview launched in 5 regions:

| Signal | Regions | Docs |
|---|---|---|
| Metrics (preview) | us-east-1, us-west-2, ap-southeast-1, ap-southeast-2, eu-west-1 | Announcement |
| Traces | Most commercial regions | OTLP Endpoints |
| Logs | Most commercial regions | OTLP Endpoints |

Tested and confirmed:

Region Metrics Traces Logs
us-east-1
us-west-2
ap-southeast-1
ap-east-1 (Hong Kong)

If your primary region doesn't support it, deploy the proxy in a supported region. The APIGW endpoint is accessible from anywhere.

For the full list of CloudWatch service endpoints by region, see the AWS General Reference.

Gotchas

X-Ray traces require manual setup. The CloudFormation template creates the proxy endpoints, but X-Ray traces need two additional steps that aren't in the template:

  1. Set CloudWatch Logs as the trace segment destination:
aws xray update-trace-segment-destination --destination CloudWatchLogs
  2. Create a CloudWatch Logs resource policy allowing X-Ray to write to the aws/spans log group:
aws logs put-resource-policy \
  --policy-name XRayAccessPolicy \
  --policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"xray.amazonaws.com"},"Action":["logs:PutLogEvents","logs:CreateLogGroup","logs:CreateLogStream"],"Resource":"*"}]}'

Without these, traces will return AccessDeniedException.

CloudWatch Logs supports bearer token auth. The /v1/logs endpoint supports bearer token authentication without SigV4 — but only for logs. Metrics and traces still require SigV4, which is why the APIGW proxy is needed for a unified endpoint.

Use http/json, not http/protobuf. CloudWatch accepts both formats, but API Gateway's CONVERT_TO_TEXT content handling can corrupt binary protobuf payloads in transit. Set OTEL_EXPORTER_OTLP_PROTOCOL=http/json to avoid this. JSON is also easier to debug in APIGW execution logs. Most coding tools default to protobuf, so you'll need to override this explicitly.
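To see why a text conversion is risky for binary payloads, consider this small Python illustration. It assumes, purely for demonstration, that the conversion behaves like a UTF-8 decode followed by a re-encode. API Gateway's actual internals may differ, but any step that treats arbitrary bytes as text has the same failure mode.

```python
import json

# Three bytes shaped like a protobuf field: tag 0x08, then varint 300 (0xAC 0x02).
binary_payload = bytes([0x08, 0xAC, 0x02])

# Assumed conversion for illustration: decode as UTF-8, then encode again.
as_text = binary_payload.decode("utf-8", errors="replace")
round_tripped = as_text.encode("utf-8")

# 0xAC is not valid UTF-8 on its own, so it is swapped for U+FFFD (3 bytes).
print(binary_payload.hex(), "->", round_tripped.hex())

# A JSON body is already valid UTF-8 text, so the same round trip is lossless.
json_payload = json.dumps({"resourceMetrics": []}).encode("utf-8")
json_round_tripped = json_payload.decode("utf-8", errors="replace").encode("utf-8")
print(json_round_tripped == json_payload)
```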

API Gateway payload limit. REST API has a 10 MB payload limit. OTLP batches from coding tools are well under this, but keep it in mind if you're aggregating from multiple sources. CloudWatch's own limits are 1 MB for metrics and logs, 5 MB for traces (full limits).
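If you do aggregate batches yourself, a pre-flight size check is cheap. The sketch below encodes the limits quoted above. `fits_limits` is a hypothetical helper, and treating "MB" as 1,000,000 bytes is a deliberately conservative assumption, since the services may count in MiB.

```python
MB = 1_000_000  # conservative reading of "MB"

# Per-signal CloudWatch limits and the API Gateway REST API limit quoted above.
CLOUDWATCH_LIMIT_BYTES = {"metrics": 1 * MB, "logs": 1 * MB, "traces": 5 * MB}
APIGW_LIMIT_BYTES = 10 * MB


def fits_limits(signal: str, body: bytes) -> bool:
    """Return True if the body is within both the APIGW and CloudWatch limits."""
    limit = min(CLOUDWATCH_LIMIT_BYTES[signal], APIGW_LIMIT_BYTES)
    return len(body) <= limit


print(fits_limits("logs", b"x" * 1_000))
print(fits_limits("metrics", b"x" * (MB + 1)))
```

The CloudWatch limit is the tighter one for every signal, so in practice the 10 MB API Gateway ceiling only matters if those service limits change.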

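REST API, not HTTP API. Only REST API offers the generic AWS integration type that can proxy to an arbitrary AWS service with SigV4. HTTP API supports only a fixed list of first-class service integrations, and the CloudWatch OTLP endpoints are not among them.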

Cost

This is about as cheap as it gets for a telemetry pipeline:

| Component | Cost |
|---|---|
| API Gateway | ~$3.50 / million requests |
| CloudWatch Metrics | Standard CW pricing (free during OTel metrics preview) |
| CloudWatch Logs | Standard CW pricing |
| Lambda | $0 (there is none) |

No idle cost. No provisioned capacity. Pure pay-per-request.

For comparison: a Lambda-based OTLP forwarder would add ~$0.20/million invocations plus compute time, but gives you retry logic and transformation capabilities. At typical coding agent volumes (a few hundred requests/day per developer), the cost difference is negligible — the real win is operational simplicity.
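A back-of-the-envelope calculation shows how small the numbers are. The sketch below uses the per-million prices quoted above; the team size and the 300 requests per developer per day are illustrative assumptions, not measurements.

```python
APIGW_USD_PER_MILLION = 3.50   # API Gateway figure from the cost table
LAMBDA_USD_PER_MILLION = 0.20  # Lambda request charge only; compute time is extra


def monthly_requests(developers: int, requests_per_dev_per_day: int, days: int = 30) -> int:
    """Total proxy requests per month for a team."""
    return developers * requests_per_dev_per_day * days


def monthly_cost_usd(requests: int, usd_per_million: float) -> float:
    """Request-based cost for a given monthly volume."""
    return requests / 1_000_000 * usd_per_million


requests = monthly_requests(developers=50, requests_per_dev_per_day=300)
apigw_cost = monthly_cost_usd(requests, APIGW_USD_PER_MILLION)
lambda_cost = monthly_cost_usd(requests, LAMBDA_USD_PER_MILLION)
print(f"{requests} requests/month, APIGW ${apigw_cost:.3f}, Lambda requests ${lambda_cost:.3f}")
```

Under these assumptions a 50-person team generates 450,000 requests a month, which keeps the API Gateway bill under $2. CloudWatch ingestion and storage are billed separately.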

When NOT to Use This

This proxy is optimized for simplicity. It's the right choice for low-to-moderate telemetry volumes from coding agents and developer tools. But it has tradeoffs:

Approach Complexity Cost Retries Multi-destination Transformation
This proxy (APIGW) Minimal ~$3.50/M req
OTel Collector Medium Compute cost
Lambda forwarder Medium ~$0.20/M + compute
ADOT SDK (in-app) Low Free (SigV4 native)
SaaS (Datadog, etc.) Low $$$ N/A

Consider an OTel Collector or Lambda forwarder instead if you need:

  • High throughput — thousands of requests/second from many sources
  • Retry and buffering — this proxy is fire-and-forget; if CloudWatch returns an error, the data is lost. OTEL SDKs have built-in retry, but only for transient failures
  • Multi-destination routing — fan out to CloudWatch + Datadog + S3 simultaneously
  • Payload transformation — filter, enrich, or redact telemetry before ingestion
  • Compliance requirements — audit trails, guaranteed delivery, or data residency controls

For most coding agent monitoring use cases (a team of 5-50 developers), this proxy handles the volume comfortably.

Security Considerations

The proxy uses API key authentication — simple but not the strongest option. Here's how to harden it:

Attach AWS WAF to the REST API. Add rate limiting, IP allowlisting, or geo-blocking to prevent abuse. A single WAF WebACL with a rate-based rule (e.g., 1000 req/5min per IP) costs ~$6/month and stops most abuse patterns.

Rotate API keys. APIGW supports multiple API keys per usage plan. Create a new key, distribute it, then disable the old one — zero downtime rotation.

Consider IAM auth for internal use. If your tools run inside AWS (EC2, ECS, Lambda), switch AuthorizationType from NONE to AWS_IAM and drop the API key entirely. The caller signs requests with SigV4 using their IAM role — no shared secrets. This doesn't work for external tools like Claude Code on developer laptops, but it's ideal for CI/CD pipelines or server-side agents.

Egress control. If you're running coding agents in a controlled environment, restrict outbound traffic to only your APIGW endpoint. This prevents telemetry from leaking to unauthorized collectors.

Beyond Coding Agents

This proxy works with any OTEL SDK that supports OTLP/HTTP. If your tool can set OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_HEADERS, it can ship telemetry to CloudWatch through this proxy.

Potential use cases:

  • AI coding agents (Claude Code, CoWork, Copilot, Cursor, Gemini CLI) — track token usage, costs, and tool calls across your org
  • Internal tools — ship metrics without embedding AWS credentials in client apps
  • CI/CD pipelines — export build/test telemetry to CloudWatch
  • On-premises services — send OTLP from outside AWS without running ADOT Collector

For apps running inside AWS with IAM roles available, consider the ADOT SDK for collector-less telemetry with native SigV4 signing — no proxy needed.

Source Code & One-Click Deploy

The CloudFormation template and full documentation are on GitHub:

👉 gabrielkoo/otlp-cloudwatch-proxy

One-click deploy to supported regions:

| Region | Deploy |
|---|---|
| US East (N. Virginia) | Launch Stack |
| US West (Oregon) | Launch Stack |
| Asia Pacific (Singapore) | Launch Stack |
| Asia Pacific (Sydney) | Launch Stack |
| Europe (Ireland) | Launch Stack |

Built and validated on a Saturday morning with Claude Code + OpenClaw. Zero Lambda functions were harmed in the making of this article.

Further Reading

  • AWS Guidance for Claude Code with Amazon Bedrock — Monitoring — A comprehensive (and admittedly overkill) reference implementation using ECS Fargate + ALB + ADOT Collector + Lambda + DynamoDB + Kinesis + Athena. Great if you want to see the full spectrum of what can be measured: per-user token tracking, quota monitoring, cost dashboards, and an analytics data lake. If you need all of that, use it. If you just need telemetry flowing to CloudWatch, the one-template proxy in this post will do.
  • Claude Code Monitoring Docs — Official OTEL configuration reference, including all metrics, events, and traces.
  • Claude Code Managed Settings — How to deploy managed-settings.json via MDM for org-wide OTEL enforcement.
  • CloudWatch OTLP Endpoints — AWS docs on native OTLP ingestion for metrics, traces, and logs.
