<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Clive Ying</title>
    <description>The latest articles on DEV Community by Clive Ying (@jiemying).</description>
    <link>https://dev.to/jiemying</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4013204%2F8bc1f880-07b0-4af0-a74b-278d34fa7939.png</url>
      <title>DEV Community: Clive Ying</title>
      <link>https://dev.to/jiemying</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jiemying"/>
    <language>en</language>
    <item>
      <title>Run Claude Code for Your Whole Team — Zero API Keys on Developer Laptops</title>
      <dc:creator>Clive Ying</dc:creator>
      <pubDate>Fri, 03 Jul 2026 08:03:11 +0000</pubDate>
      <link>https://dev.to/jiemying/run-claude-code-for-your-whole-team-zero-api-keys-on-developer-laptops-3847</link>
      <guid>https://dev.to/jiemying/run-claude-code-for-your-whole-team-zero-api-keys-on-developer-laptops-3847</guid>
      <description>&lt;p&gt;Here's a thing that happens at almost every company that starts using AI coding tools: one developer gets an Anthropic API key, tells two colleagues, and within a week half the team has their own keys stored in &lt;code&gt;.env&lt;/code&gt; files, shell profiles, and CI jobs. Someone leaves. Their key is still active somewhere. You have no idea what it's calling or how much it's spending.&lt;/p&gt;

&lt;p&gt;There's a better model. This post walks through an open-source AWS setup that lets your whole engineering team use Claude Code without a single API key or AWS credential ever touching a developer laptop — and how a few architectural tricks make it work without the usual enterprise friction.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the gateway actually does
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://code.claude.com/docs/en/claude-apps-gateway" rel="noopener noreferrer"&gt;Claude Apps Gateway&lt;/a&gt; is a self-hosted proxy you run in your own AWS account. The architecture is deliberately simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌ your AWS account (private VPC) ──────────────────────────────┐
│  internal IPv4 ALB → ECS Fargate (claude gateway)            │
│      ├─ OIDC → your IdP (Cognito / Okta / Entra / …)         │
│      ├─ RDS PostgreSQL (sign-in + rate-limit + spend state)   │
│      └─ Amazon Bedrock upstream (via ECS task role)           │
└──────────────────────────────────────────────────────────────┘
   developers reach the ALB over your private network
   (corporate VPN / Direct Connect / TGW — or a bundled reference Client VPN)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Developers run &lt;code&gt;claude /login&lt;/code&gt;, complete corporate SSO in a browser (MFA, conditional access — whatever your IdP enforces), and get a short-lived JWT. That's their credential. The gateway holds the Bedrock IAM role; it never leaves AWS. Offboarding is removing someone from your IdP.&lt;/p&gt;

&lt;p&gt;The repo that sets this all up is at &lt;a href="https://github.com/jiem-ying/claude-apps-gateway-aws" rel="noopener noreferrer"&gt;github.com&lt;/a&gt; — a self-contained CloudFormation stack you can copy into your own repo and deploy as-is.&lt;/p&gt;




&lt;h2&gt;
  
  
  Up and running in one command
&lt;/h2&gt;

&lt;p&gt;The repo ships a set of &lt;code&gt;.env&lt;/code&gt; profiles covering the common starting points:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;You have…&lt;/th&gt;
&lt;th&gt;Profile&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Nothing yet (greenfield)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;managed-newcognito-collector-vpn.env&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Existing Cognito pool&lt;/td&gt;
&lt;td&gt;&lt;code&gt;managed-existingcognito-byotelemetry.env&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Okta / Entra / any OIDC IdP&lt;/td&gt;
&lt;td&gt;&lt;code&gt;byo-oidc-notelemetry.env&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No public domain (just testing)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;selfsigned-fallback.env&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Copy one, fill in ~5 values (your domain, Route 53 hosted zone ID, and IdP details), and run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cp &lt;/span&gt;config/managed-newcognito-collector-vpn.env config/my.env
&lt;span class="nv"&gt;$EDITOR&lt;/span&gt; config/my.env      &lt;span class="c"&gt;# domain, zone, region — that's mostly it&lt;/span&gt;

&lt;span class="nb"&gt;source &lt;/span&gt;config/my.env
./deploy-all.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The orchestrator chains everything: IdP stack → gateway stack → optional ADOT collector → optional Client VPN. It prints a summary at the end with the gateway URL, CloudWatch dashboard link, and &lt;code&gt;.ovpn&lt;/code&gt; path if VPN was bundled.&lt;/p&gt;

&lt;p&gt;If you already have your own IdP, OTLP collector, and private network path, skip &lt;code&gt;deploy-all.sh&lt;/code&gt; entirely and call &lt;code&gt;deploy.sh&lt;/code&gt; directly with your existing endpoints — nothing bundled deploys.&lt;/p&gt;




&lt;h2&gt;
  
  
  Three architectural tricks worth borrowing
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Public certificate + private address
&lt;/h3&gt;

&lt;p&gt;The ALB is &lt;strong&gt;internal and IPv4-only&lt;/strong&gt; — that's not optional. Claude Code's &lt;code&gt;/login&lt;/code&gt; flow explicitly rejects any gateway that resolves to a public IP address, so the isolation is enforced by the client, not by policy.&lt;/p&gt;

&lt;p&gt;The certificate, however, needs to be browser-trusted, and distributing a self-signed CA to every laptop is painful. The workaround: issue a &lt;strong&gt;DNS-validated public ACM certificate&lt;/strong&gt; for your domain, then point the public A-record for that name at the &lt;strong&gt;internal ALB&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The domain resolves to private IPs (&lt;code&gt;10.20.x.x&lt;/code&gt;), reachable only over your private network. But the cert was validated against a public DNS name, so browsers trust it. Zero &lt;code&gt;NODE_EXTRA_CA_CERTS&lt;/code&gt;, zero keychain imports, zero fingerprint prompts.&lt;/p&gt;
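&lt;p&gt;As a rough illustration of the "private address only" rule, here is a minimal Python sketch that checks whether every address a hostname resolved to is a private IPv4 address. It is not Claude Code's actual check, which the post does not show; the function name &lt;code&gt;all_private_ipv4&lt;/code&gt; and the sample addresses are assumptions made for the example.&lt;/p&gt;

```python
import ipaddress


def all_private_ipv4(addresses):
    """Return True only if every address is IPv4 and classified as private."""
    if not addresses:
        return False
    for text in addresses:
        ip = ipaddress.ip_address(text)
        # Reject IPv6 outright, then anything the ipaddress module does not
        # classify as private (RFC 1918 plus loopback and similar ranges).
        if ip.version != 4 or not ip.is_private:
            return False
    return True


# Hypothetical answers a resolver might return for the gateway hostname.
print(all_private_ipv4(["10.20.1.15", "10.20.2.27"]))  # True
print(all_private_ipv4(["10.20.1.15", "8.8.8.8"]))     # False
```

&lt;p&gt;Running a check like this against your own DNS answers before rollout can catch a record that accidentally points at a public address.&lt;/p&gt;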

&lt;h3&gt;
  
  
  2. Config lives in the task definition, not SSM
&lt;/h3&gt;

&lt;p&gt;The rendered &lt;code&gt;gateway.yaml&lt;/code&gt; config is passed as an ECS task-definition environment variable. Any config change (model allowlist, telemetry endpoint, RBAC policies) therefore produces a new task-definition revision, and the resulting service update replaces the running tasks so they pick it up.&lt;/p&gt;

&lt;p&gt;We tried an earlier version that fetched config from SSM at runtime. It broke in a subtle way: if you updated the telemetry endpoint in SSM, the running tasks kept using the old one until manually recycled. Config-in-taskdef makes the deploy process the source of truth.&lt;/p&gt;

&lt;p&gt;The constraint is a 4096-byte ECS env var limit. The current render is ~3000 bytes. &lt;code&gt;deploy.sh&lt;/code&gt; fails fast with a byte count if the rendered config exceeds the limit.&lt;/p&gt;
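&lt;p&gt;The fail-fast size check can be sketched in a few lines of Python. This is an illustration only: the real check lives in &lt;code&gt;deploy.sh&lt;/code&gt; and is not shown in this post, and the name &lt;code&gt;check_config_size&lt;/code&gt; is hypothetical. The one detail worth copying is counting encoded bytes instead of characters, since non-ASCII text takes more than one byte per character.&lt;/p&gt;

```python
ENV_VAR_LIMIT_BYTES = 4096  # the limit quoted in this post


def check_config_size(rendered, limit=ENV_VAR_LIMIT_BYTES):
    """Return the UTF-8 byte size of the config, or raise if it is too large."""
    size = len(rendered.encode("utf-8"))  # bytes, not characters
    if size > limit:
        raise ValueError(f"rendered config is {size} bytes, limit is {limit}")
    return size


# A tiny stand-in for the rendered gateway config.
print(check_config_size("example: value\n"))
```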

&lt;h3&gt;
  
  
  3. GPG-verified binary download in Docker
&lt;/h3&gt;

&lt;p&gt;The Dockerfile uses a two-stage build. Stage 1 imports Anthropic's release signing key (fingerprint hardcoded), downloads a signed manifest, verifies the detached signature, downloads the &lt;code&gt;claude&lt;/code&gt; Linux binary, and SHA256-checks it against the manifest. Only the verified binary is copied into the runtime stage.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# stage 1: verify&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;gpg &lt;span class="nt"&gt;--import&lt;/span&gt; anthropic-release-key.asc &lt;span class="se"&gt;\
&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; gpg &lt;span class="nt"&gt;--verify&lt;/span&gt; manifest.json.sig manifest.json &lt;span class="se"&gt;\
&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;sha256sum&lt;/span&gt; &lt;span class="nt"&gt;--check&lt;/span&gt; &amp;lt;&lt;span class="o"&gt;(&lt;/span&gt;jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'...'&lt;/span&gt; manifest.json&lt;span class="o"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# stage 2: minimal runtime&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; debian:stable-slim&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=verifier /usr/local/bin/claude /usr/local/bin/claude&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The runtime image is &lt;code&gt;debian:stable-slim&lt;/code&gt; with only &lt;code&gt;ca-certificates&lt;/code&gt; added. No verification tooling ships to production.&lt;/p&gt;
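&lt;p&gt;For readers who want the checksum step outside a Dockerfile, the comparison can be sketched in Python as below. The manifest layout is not shown in this post, so the digest is passed in directly; &lt;code&gt;matches_manifest&lt;/code&gt; and the sample payload are assumptions for the example. This covers only the SHA-256 comparison and does not replace the GPG signature check on the manifest itself.&lt;/p&gt;

```python
import hashlib
import hmac


def matches_manifest(binary_bytes, expected_hex):
    """Compare the SHA-256 of the downloaded bytes with the expected digest."""
    actual_hex = hashlib.sha256(binary_bytes).hexdigest()
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual_hex, expected_hex.strip().lower())


payload = b"example binary contents"
trusted = hashlib.sha256(payload).hexdigest()  # stands in for the manifest entry
print(matches_manifest(payload, trusted))          # True
print(matches_manifest(payload + b"x", trusted))   # False
```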




&lt;h2&gt;
  
  
  RBAC without a new system
&lt;/h2&gt;

&lt;p&gt;Group-based access control is two env vars away:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;DENY_TOOL_GROUP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;contractors
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;DENY_TOOLS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"mcp__bash,mcp__computer"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;deploy.sh&lt;/code&gt; renders these into a &lt;code&gt;managed.policies&lt;/code&gt; block in the gateway config — a first-match policy list that denies those tools to the specified IdP group, with a catch-all that leaves everyone else unrestricted:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;managed&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;policies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;contractors-deny-shell&lt;/span&gt;
      &lt;span class="na"&gt;match&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;groups&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;contractors"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;deny&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mcp__bash"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mcp__computer"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;everyone-else&lt;/span&gt;
      &lt;span class="na"&gt;match&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Model allowlists work the same way — engineering gets Opus, contractors get Haiku. Policy changes take effect on redeploy; a user's new group membership takes effect on their next &lt;code&gt;claude /login&lt;/code&gt;.&lt;/p&gt;
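&lt;p&gt;To make "first-match" concrete, here is a small Python sketch that evaluates a policy list shaped like the YAML above. It is a model of the described semantics, not the gateway's implementation; the helper names and the rule that any shared group counts as a match are assumptions.&lt;/p&gt;

```python
POLICIES = [
    {"name": "contractors-deny-shell",
     "match": {"groups": ["contractors"]},
     "deny": {"tools": ["mcp__bash", "mcp__computer"]}},
    {"name": "everyone-else", "match": {}},
]


def policy_matches(policy, user_groups):
    """An empty match block matches everyone; otherwise any shared group matches."""
    wanted = policy.get("match", {}).get("groups")
    if not wanted:
        return True
    return any(group in user_groups for group in wanted)


def denied_tools(policies, user_groups):
    """Return the deny list of the first matching policy (first match wins)."""
    for policy in policies:
        if policy_matches(policy, user_groups):
            return policy.get("deny", {}).get("tools", [])
    return []


print(denied_tools(POLICIES, ["contractors"]))   # ['mcp__bash', 'mcp__computer']
print(denied_tools(POLICIES, ["engineering"]))   # []
```

&lt;p&gt;Because evaluation stops at the first match, order matters: under this model a user in both &lt;code&gt;contractors&lt;/code&gt; and &lt;code&gt;engineering&lt;/code&gt; hits the contractor rule and never reaches the catch-all.&lt;/p&gt;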




&lt;h2&gt;
  
  
  Observability: metrics vs. Logs Insights
&lt;/h2&gt;

&lt;p&gt;The bundled ADOT collector exports OTLP events to CloudWatch. Only &lt;code&gt;user.email&lt;/code&gt; and &lt;code&gt;user.groups&lt;/code&gt; are promoted to CloudWatch EMF metric &lt;strong&gt;dimensions&lt;/strong&gt; on &lt;code&gt;token.usage&lt;/code&gt; and &lt;code&gt;cost.usage&lt;/code&gt; events. Every distinct dimension value creates a new custom metric, and CloudWatch charges per metric. So: keep dimensions low-cardinality.&lt;/p&gt;
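&lt;p&gt;A quick way to see how dimension choice drives the metric count is to count distinct dimension combinations. The Python sketch below assumes every metric name is emitted for every combination, which is a simplification; the function name and the sample events are hypothetical.&lt;/p&gt;

```python
def custom_metric_count(events, dimension_keys, metric_names):
    """Count metrics as (distinct dimension combinations) x (metric names)."""
    combos = {tuple(event.get(key) for key in dimension_keys) for event in events}
    return len(combos) * len(metric_names)


events = [
    {"user.email": "a@example.com", "user.groups": "engineering"},
    {"user.email": "b@example.com", "user.groups": "engineering"},
    {"user.email": "a@example.com", "user.groups": "engineering"},
]
metrics = ["token.usage", "cost.usage"]
print(custom_metric_count(events, ["user.groups"], metrics))                # 2
print(custom_metric_count(events, ["user.email", "user.groups"], metrics))  # 4
```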

&lt;p&gt;High-cardinality slicing (per-user spend breakdowns, per-role analysis) is done in &lt;strong&gt;CloudWatch Logs Insights&lt;/strong&gt; over &lt;code&gt;/aws/claude-gateway/events&lt;/code&gt;. Logs Insights is billed per GB scanned, which is typically far cheaper for occasional ad-hoc queries than keeping a custom metric for every dimension value.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CloudWatch Metrics  → "what is team A spending per day?"  (dashboard, alarms)
Logs Insights       → "which user spiked spend on Tuesday?"  (ad-hoc)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Log forwarding is off by default, so a fresh deploy is metrics-only. Enable it with &lt;code&gt;FORWARD_LOGS=true&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Gotchas we hit
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;VPN tunnel MTU.&lt;/strong&gt; If &lt;code&gt;/login&lt;/code&gt; hangs silently, suspect the VPN MTU. With the tunnel interface at 1500, the larger TLS handshake packets get fragmented and dropped. Fix on macOS: &lt;code&gt;sudo ifconfig utunN mtu 1300&lt;/code&gt;, where &lt;code&gt;utunN&lt;/code&gt; is your tunnel interface. The setting resets on every VPN reconnect, so put it in a connect script.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stale client state on hostname change.&lt;/strong&gt; When switching from a self-signed cert to the managed ACM cert (which changes the hostname), both &lt;code&gt;~/.claude/remote-settings.json&lt;/code&gt; and the macOS keychain entry &lt;code&gt;Claude Code-credentials&lt;/code&gt; still pin the old host. Delete both, or run &lt;code&gt;claude /logout&lt;/code&gt;, before reconnecting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ACM validation hangs for ~90 minutes&lt;/strong&gt; if &lt;code&gt;PUBLIC_HOSTED_ZONE_ID&lt;/code&gt; is a private zone, or if &lt;code&gt;DOMAIN_NAME&lt;/code&gt; isn't under the zone you specified. &lt;code&gt;deploy.sh&lt;/code&gt; now preflights both and fails immediately — but if you hit this on an older version, delete the stuck ACM cert and redeploy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RDS-generated passwords in Postgres DSNs.&lt;/strong&gt; RDS can generate passwords containing &lt;code&gt;#&lt;/code&gt;, &lt;code&gt;%&lt;/code&gt;, &lt;code&gt;&amp;amp;&lt;/code&gt;, and other URL-structural characters. The &lt;code&gt;entrypoint.sh&lt;/code&gt; assembles the &lt;code&gt;postgres://&lt;/code&gt; DSN at runtime and percent-encodes the password in pure bash — no Python, no Perl in the image.&lt;/p&gt;
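&lt;p&gt;The repo does this encoding in pure bash, and that script is not reproduced here. To show the idea in a few lines, the sketch below uses Python's standard &lt;code&gt;urllib.parse.quote&lt;/code&gt; instead; the function name &lt;code&gt;build_dsn&lt;/code&gt;, the host &lt;code&gt;db.internal&lt;/code&gt;, and the sample password are all made up for the example.&lt;/p&gt;

```python
from urllib.parse import quote


def build_dsn(user, password, host, port, database):
    """Assemble a postgres:// DSN with the user and password percent-encoded."""
    # safe="" forces every reserved character, including "/", to be encoded.
    return (
        f"postgres://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


print(build_dsn("gateway", "p#ss%w/rd@1", "db.internal", 5432, "gateway"))
# postgres://gateway:p%23ss%25w%2Frd%401@db.internal:5432/gateway
```

&lt;p&gt;Without the encoding, the &lt;code&gt;#&lt;/code&gt; would start a URL fragment and the &lt;code&gt;@&lt;/code&gt; would be read as the end of the credentials, so the DSN would either fail to parse or point at the wrong host.&lt;/p&gt;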




&lt;h2&gt;
  
  
  What it costs
&lt;/h2&gt;

&lt;p&gt;No gateway license. You pay for the AWS infrastructure plus normal Bedrock per-token pricing — the same as if you were calling Bedrock directly.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resource&lt;/th&gt;
&lt;th&gt;Rough cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2× ECS Fargate tasks (HA across AZs)&lt;/td&gt;
&lt;td&gt;~$9/mo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RDS PostgreSQL &lt;code&gt;db.t4g.micro&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;~$12/mo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Internal ALB&lt;/td&gt;
&lt;td&gt;~$16/mo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Regional NAT gateway&lt;/td&gt;
&lt;td&gt;data-dependent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Bedrock inference&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;per-token, same as always&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Roughly &lt;strong&gt;$40/month&lt;/strong&gt; of fixed infrastructure for a team of any size. Scale ECS to 0 or tear down the stack when you don't need it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;The stack is MIT-licensed, self-contained, and deploys into your own account — no SaaS dependency, no license server, no phoning home.&lt;/p&gt;

&lt;p&gt;Source is at the link above. If you try it and hit something that isn't in the gotchas list, open an issue — the troubleshooting guide in &lt;code&gt;docs/GUIDE.md&lt;/code&gt; grows with every team that runs it.&lt;/p&gt;




</description>
      <category>ai</category>
      <category>aws</category>
      <category>claude</category>
    </item>
  </channel>
</rss>
