<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: littlemex</title>
    <description>The latest articles on DEV Community by littlemex (@littlemex63454).</description>
    <link>https://dev.to/littlemex63454</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3965705%2F60bbaa61-173e-4518-a64b-08aa5661d878.jpg</url>
      <title>DEV Community: littlemex</title>
      <link>https://dev.to/littlemex63454</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/littlemex63454"/>
    <language>en</language>
    <item>
      <title>Stratoclave: a tenant-aware credit gateway for Amazon Bedrock — now with OpenAI codex support</title>
      <dc:creator>littlemex</dc:creator>
      <pubDate>Wed, 03 Jun 2026 11:21:13 +0000</pubDate>
      <link>https://dev.to/littlemex63454/stratoclave-a-tenant-aware-credit-gateway-for-amazon-bedrock-now-with-openai-codex-support-266</link>
      <guid>https://dev.to/littlemex63454/stratoclave-a-tenant-aware-credit-gateway-for-amazon-bedrock-now-with-openai-codex-support-266</guid>
      <description>&lt;p&gt;If you let a team share a single AWS account for Amazon Bedrock, you quickly run into questions Bedrock alone does not answer: who called which model, under whose budget, through which identity. &lt;strong&gt;Stratoclave&lt;/strong&gt; is a small OSS gateway that puts those answers in front of Bedrock without dragging in Postgres, Redis, or a SaaS control plane.&lt;/p&gt;

&lt;p&gt;I originally wrote it for myself: I just wanted per-user credits in front of Bedrock for personal use of Claude Code. It has since grown to also cover OpenAI codex / GPT-5.x via Bedrock's &lt;code&gt;bedrock-mantle&lt;/code&gt; endpoint.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Repo:&lt;/strong&gt; &lt;a href="https://github.com/littlemex/stratoclave" rel="noopener noreferrer"&gt;&lt;code&gt;littlemex/stratoclave&lt;/code&gt;&lt;/a&gt; (Apache 2.0, alpha)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;What it actually does&lt;/h2&gt;

&lt;p&gt;Stratoclave is a single FastAPI service on ECS Fargate that exposes two inference routes:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Route&lt;/th&gt;
&lt;th&gt;Wire format&lt;/th&gt;
&lt;th&gt;Backend&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;POST /v1/messages&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Anthropic Messages API&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;bedrock:Converse&lt;/code&gt; in us-east-1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;POST /openai/v1/responses&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;OpenAI Responses API&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;bedrock-mantle&lt;/code&gt; in us-east-2 / us-west-2&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Both routes share the same DynamoDB-backed credit reservation, the same &lt;code&gt;messages:send&lt;/code&gt; / &lt;code&gt;responses:send&lt;/code&gt; RBAC scopes, the same audit log, and the same three identity paths (Cognito password, AWS SSO via Vouch-by-STS, long-lived &lt;code&gt;sk-stratoclave-*&lt;/code&gt; keys).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9k6ple6vs3yikuciu54t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9k6ple6vs3yikuciu54t.png" alt="Stratoclave architecture: clients to CloudFront to ALB to ECS Fargate, with DynamoDB, Cognito, Bedrock, bedrock-mantle (us-west-2 / us-east-2 cross-region), STS, and CloudWatch Logs." width="800" height="429"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The control plane is one AWS region (us-east-1) and one Fargate task. Bedrock for OpenAI is cross-region, but no second control-plane region is deployed.&lt;/p&gt;

&lt;p&gt;The web console login screen redirects to the Cognito Hosted UI for password / SSO sign-in; CLI users instead run &lt;code&gt;stratoclave auth login&lt;/code&gt; and then bring this tab into focus with &lt;code&gt;stratoclave ui open&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwc8e7eqehupzj5g59li8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwc8e7eqehupzj5g59li8.png" alt="Stratoclave web console login screen with the language switcher" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;The thing I actually wanted: per-tenant, per-user credits&lt;/h2&gt;

&lt;p&gt;This is the reason the project exists. Every inference call atomically reserves &lt;code&gt;max_tokens + input_estimate&lt;/code&gt; from the caller's budget with a conditional &lt;code&gt;UpdateItem&lt;/code&gt;, invokes the upstream, then refunds the difference once the real token counts come back. &lt;code&gt;UsageLogs&lt;/code&gt; always records the actual spend, not the reservation. Concurrent requests cannot race past the quota: each conditional write either commits or fails.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ygz9dhy3joe5w0aev70.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ygz9dhy3joe5w0aev70.png" alt="Credit reservation flow: 4 lanes (Client / Backend / DynamoDB / Bedrock or bedrock-mantle), 8 steps top-to-bottom from POST to refund + UsageLogs row." width="800" height="713"&gt;&lt;/a&gt;&lt;/p&gt;
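
&lt;p&gt;A minimal sketch of the reserve-then-refund cycle, using an in-memory ledger in place of DynamoDB. The class and function names are illustrative and not taken from the repository; the lock only stands in for the atomicity a conditional &lt;code&gt;UpdateItem&lt;/code&gt; provides, and refunding a failed call in full is an assumption.&lt;/p&gt;

```python
import threading


class ConditionalCheckFailed(Exception):
    """Stands in for DynamoDB's ConditionalCheckFailedException."""


class CreditLedger:
    """In-memory stand-in for one budget row updated with a conditional write."""

    def __init__(self, remaining):
        self.remaining = remaining
        self._lock = threading.Lock()  # models the atomicity of a single UpdateItem

    def reserve(self, amount):
        # Roughly equivalent to:
        #   UpdateExpression    = "SET remaining = remaining - :amt"
        #   ConditionExpression = "remaining >= :amt"
        with self._lock:
            if not self.remaining >= amount:
                raise ConditionalCheckFailed("insufficient credit")
            self.remaining -= amount

    def refund(self, amount):
        with self._lock:
            self.remaining += amount


def run_call(ledger, max_tokens, input_estimate, invoke):
    """Reserve up front, invoke the upstream, refund the unused part."""
    reserved = max_tokens + input_estimate
    ledger.reserve(reserved)
    try:
        actual = invoke()  # real token count reported by the upstream
    except Exception:
        ledger.refund(reserved)  # assumption: a failed call is refunded in full
        raise
    ledger.refund(max(reserved - actual, 0))
    return actual  # the figure a usage log row would record


ledger = CreditLedger(remaining=10_000)
spent = run_call(ledger, max_tokens=4_096, input_estimate=500, invoke=lambda: 1_200)
print(spent, ledger.remaining)
```

&lt;p&gt;The point of reserving the worst case first is that a second request arriving mid-call sees the already-reduced balance, so two calls cannot both pass a check against the same credit.&lt;/p&gt;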

&lt;p&gt;The pipeline lives in one file (&lt;code&gt;backend/mvp/_pipeline.py&lt;/code&gt;) and is shared by both routes. The OpenAI Responses route applies an extra reasoning-effort multiplier (1× / 2× / 4× / 8× for &lt;code&gt;low&lt;/code&gt; / &lt;code&gt;medium&lt;/code&gt; / &lt;code&gt;high&lt;/code&gt; / &lt;code&gt;xhigh&lt;/code&gt;) to the upfront reservation, because reasoning traces can inflate output by an order of magnitude. The minimum reservation is 8,192 tokens regardless of multiplier.&lt;/p&gt;
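
&lt;p&gt;As a rough illustration of that arithmetic, here is a small sketch. The function name is hypothetical, and applying the multiplier to the whole &lt;code&gt;max_tokens + input_estimate&lt;/code&gt; sum (rather than to the output part only) is an assumption.&lt;/p&gt;

```python
# Multipliers and the floor come from the description above.
EFFORT_MULTIPLIER = {"low": 1, "medium": 2, "high": 4, "xhigh": 8}
MIN_RESERVATION = 8192


def reservation_tokens(max_tokens, input_estimate, effort):
    """Tokens to reserve up front for one Responses call (illustrative)."""
    base = max_tokens + input_estimate
    # Assumption: the multiplier scales the whole base, then the floor applies.
    return max(base * EFFORT_MULTIPLIER[effort], MIN_RESERVATION)


print(reservation_tokens(1_000, 200, "low"))
print(reservation_tokens(4_096, 500, "high"))
```

&lt;p&gt;Under these assumptions a small low-effort request is lifted to the 8,192-token floor, while the high-effort request reserves four times its base.&lt;/p&gt;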

&lt;p&gt;Personal usage history shows per-call token counts, model names, and credit spend drawn from the same &lt;code&gt;UsageLogs&lt;/code&gt; table.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F20polzqhgomp4vdjm1s7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F20polzqhgomp4vdjm1s7.png" alt="Stratoclave web console: personal usage statistics page listing recent inference calls with model, token counts, and credit amounts consumed." width="800" height="520"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Vouch by STS: passwordless login without holding an IdP secret&lt;/h2&gt;

&lt;p&gt;The single behaviour I am proudest of. The CLI signs an &lt;code&gt;sts:GetCallerIdentity&lt;/code&gt; request locally with SigV4, the backend forwards the signed payload to STS verbatim, and the backend trusts only the &lt;code&gt;Arn&lt;/code&gt; / &lt;code&gt;UserId&lt;/code&gt; / &lt;code&gt;Account&lt;/code&gt; STS returns. No IdP refresh token ever touches the backend.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe2idyughmazfngvd9qfm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe2idyughmazfngvd9qfm.png" alt="Vouch by STS: 4 lanes (CLI / Backend / STS / DynamoDB+Cognito). CLI signs locally, backend replays to STS, STS returns canonical Arn/UserId/Account, backend resolves identity-type gates and mints an access_token." width="800" height="542"&gt;&lt;/a&gt;&lt;/p&gt;
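
&lt;p&gt;The backend half of this exchange can be sketched as two checks: refuse to replay anything that is not a &lt;code&gt;GetCallerIdentity&lt;/code&gt; call to an STS host, then read only the three identity fields from the answer. This is not the project's code. The host pattern, the function names, and the assumption that STS is asked to reply in JSON are all illustrative, and the network call itself is left out.&lt;/p&gt;

```python
import json
import re
from urllib.parse import parse_qs, urlencode, urlsplit

# Hosts the backend is willing to replay a signed request to (assumption:
# the global endpoint plus regional endpoints in the standard partition).
STS_HOST = re.compile(r"sts(\.[a-z0-9-]+)?\.amazonaws\.com")


def check_signed_request(url, body):
    """Refuse anything that is not a GetCallerIdentity call to STS over HTTPS."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not STS_HOST.fullmatch(parts.hostname or ""):
        raise ValueError("signed request does not target STS")
    if parse_qs(body).get("Action") != ["GetCallerIdentity"]:
        raise ValueError("only GetCallerIdentity may be replayed")


def identity_from_sts(response_text):
    """Keep only the fields STS vouches for; ignore everything else."""
    result = json.loads(response_text)["GetCallerIdentityResponse"]["GetCallerIdentityResult"]
    return {key: result[key] for key in ("Arn", "UserId", "Account")}


# Example inputs; the ARN and account ID are placeholders.
body = urlencode({"Action": "GetCallerIdentity", "Version": "2011-06-15"})
check_signed_request("https://sts.us-east-1.amazonaws.com/", body)

sample = json.dumps({"GetCallerIdentityResponse": {"GetCallerIdentityResult": {
    "Arn": "arn:aws:sts::123456789012:assumed-role/Dev/alice",
    "UserId": "AROAEXAMPLE:alice",
    "Account": "123456789012",
}}})
identity = identity_from_sts(sample)
print(identity["Arn"])
```

&lt;p&gt;The host check matters because the backend forwards a request it did not build: without it, a client could point the replay at a server it controls and have that server answer with any identity.&lt;/p&gt;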

&lt;p&gt;The pattern is the same one &lt;a href="https://developer.hashicorp.com/vault/docs/auth/aws" rel="noopener noreferrer"&gt;HashiCorp Vault has used for a decade&lt;/a&gt; in its AWS &lt;code&gt;iam&lt;/code&gt; auth method. Anything that populates &lt;code&gt;~/.aws/credentials&lt;/code&gt; works the same way: &lt;code&gt;aws sso login&lt;/code&gt;, &lt;code&gt;saml2aws&lt;/code&gt;, Entra ID / Okta / ADFS SAML federation, even a regular IAM user with long-lived keys (default DENY unless explicitly allowed per trusted account). EC2 instance profiles are rejected by default because they cannot be attributed to a single human.&lt;/p&gt;

&lt;p&gt;A full backend compromise cannot pivot into the customer's IAM Identity Center or SAML IdP. The worst-case blast radius is bounded to Stratoclave's own resources — Bedrock overspend, DynamoDB tampering, impersonation within this deployment.&lt;/p&gt;

&lt;p&gt;The trusted-accounts admin page is where AWS account IDs and &lt;code&gt;allowed_role_patterns&lt;/code&gt; (fnmatch) are managed — this is the allowlist that gates SSO logins from outside accounts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fekmsdcn43r9q3umyu1wv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fekmsdcn43r9q3umyu1wv.png" alt="Stratoclave web console: admin trusted-accounts list showing AWS account IDs registered for SSO with allowed role patterns." width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;
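
&lt;p&gt;A small sketch of how such an allowlist could be evaluated with Python's &lt;code&gt;fnmatch&lt;/code&gt; module. The table contents and helper names are made up for illustration, and matching the patterns against the role name (rather than the full ARN) is an assumption.&lt;/p&gt;

```python
from fnmatch import fnmatchcase

# Hypothetical allowlist: account ID to allowed_role_patterns.
TRUSTED_ACCOUNTS = {
    "123456789012": ["AWSReservedSSO_Developer_*", "ci-*"],
}


def role_name_from_arn(arn):
    """Return ROLE from arn:aws:sts::ACCOUNT:assumed-role/ROLE/SESSION, else None."""
    parts = arn.split(":", 5)
    if len(parts) != 6:
        return None
    kind, _, rest = parts[5].partition("/")
    if kind != "assumed-role":
        return None  # IAM users and other principals are not matched here
    return rest.split("/")[0]


def is_allowed(account, arn):
    """Default deny: unknown accounts and unmatched roles are rejected."""
    patterns = TRUSTED_ACCOUNTS.get(account)
    if patterns is None:
        return False
    role = role_name_from_arn(arn)
    return role is not None and any(fnmatchcase(role, p) for p in patterns)


print(is_allowed(
    "123456789012",
    "arn:aws:sts::123456789012:assumed-role/AWSReservedSSO_Developer_abc123/alice",
))
```

&lt;p&gt;&lt;code&gt;fnmatchcase&lt;/code&gt; is used instead of &lt;code&gt;fnmatch&lt;/code&gt; so that matching stays case-sensitive on every platform.&lt;/p&gt;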

&lt;h2&gt;Three little things that turned out to matter more than expected&lt;/h2&gt;

&lt;h3&gt;1. &lt;code&gt;stratoclave codex -- "..."&lt;/code&gt; (and &lt;code&gt;stratoclave claude -- "..."&lt;/code&gt;)&lt;/h3&gt;

&lt;p&gt;A wrapper subcommand that mints a 30-minute ephemeral &lt;code&gt;responses:send&lt;/code&gt; (or &lt;code&gt;messages:send&lt;/code&gt;) key, hands it to the child process via env, and revokes the key on exit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;stratoclave codex &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="s2"&gt;"Write a hello-world Python function"&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;INFO] Launching codex via Stratoclave proxy &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;openai.gpt-5.4, &lt;span class="nv"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;sk-stratoclave-...&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;INFO] Child process uses an ephemeral responses-only API key&lt;span class="p"&gt;;&lt;/span&gt;
       the Cognito bearer is not exported and the user&lt;span class="s1"&gt;'s
       ~/.codex/config.toml is untouched.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The child gets a key scoped to exactly one route; the user's Cognito bearer never leaves the parent process. MCP servers and tool subprocesses started by codex cannot pivot back into the user's stratoclave admin endpoints because the env they inherit doesn't carry the right credentials.&lt;/p&gt;

&lt;p&gt;The same wrapper exists for Claude Code (&lt;code&gt;stratoclave claude&lt;/code&gt;). They share the env-scrub list and the revoke-on-exit lifecycle through one Rust struct (&lt;code&gt;ChildLauncher&lt;/code&gt;) so a fix to one applies to both.&lt;/p&gt;
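
&lt;p&gt;The real launcher is a Rust struct, so the following is only a Python sketch of the same lifecycle: mint, scrub the environment, run the child, revoke no matter how the child exits. The scrub prefixes, the environment variable names handed to the child, and the &lt;code&gt;mint_key&lt;/code&gt; / &lt;code&gt;revoke_key&lt;/code&gt; callbacks are assumptions for illustration.&lt;/p&gt;

```python
import os
import subprocess
import sys

# Hypothetical scrub list: drop anything that could carry broader credentials.
SCRUB_PREFIXES = ("AWS_", "STRATOCLAVE_")


def child_env(parent_env, api_key, base_url):
    """Copy the parent environment minus scrubbed variables, plus the scoped key."""
    env = {k: v for k, v in parent_env.items() if not k.startswith(SCRUB_PREFIXES)}
    env["OPENAI_API_KEY"] = api_key    # assumed variable names for the child
    env["OPENAI_BASE_URL"] = base_url
    return env


def launch(argv, mint_key, revoke_key, base_url, parent_env=None):
    """Run argv with an ephemeral key and revoke the key on exit."""
    key = mint_key()
    try:
        env = child_env(os.environ if parent_env is None else parent_env, key, base_url)
        return subprocess.run(argv, env=env).returncode
    finally:
        revoke_key(key)  # runs on normal exit, failure, and Ctrl-C alike


revoked = []
code = launch(
    [sys.executable, "-c", "import os; print(sorted(k for k in os.environ if k.startswith('OPENAI_')))"],
    mint_key=lambda: "sk-stratoclave-example",
    revoke_key=revoked.append,
    base_url="https://gateway.example/openai/v1",
)
print(code, revoked)
```

&lt;p&gt;Putting the revoke in a &lt;code&gt;finally&lt;/code&gt; block is what keeps an interrupted session from leaving a live key behind; the 30-minute lifetime is then only a backstop.&lt;/p&gt;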

&lt;h3&gt;2. &lt;code&gt;/.well-known/stratoclave-config&lt;/code&gt;&lt;/h3&gt;

&lt;p&gt;One unauthenticated discovery endpoint that drives the entire CLI bootstrap:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;stratoclave setup https://&amp;lt;your&amp;gt;.cloudfront.net
&lt;span class="nv"&gt;$ &lt;/span&gt;stratoclave auth sso &lt;span class="nt"&gt;--profile&lt;/span&gt; your-aws-sso-profile     &lt;span class="c"&gt;# or `auth login --email`&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;stratoclave codex &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="s2"&gt;"..."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The endpoint returns Cognito IDs, default model names, and OpenAI base path / supported regions when &lt;code&gt;CODEX_ENABLED=true&lt;/code&gt;. Old CLI binaries hitting a new backend deserialize cleanly because every new field is &lt;code&gt;Optional&lt;/code&gt;.&lt;/p&gt;
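
&lt;p&gt;The compatibility rule can be illustrated with a short Python sketch (the CLI itself is Rust, and every field name below is hypothetical). Newer fields default to &lt;code&gt;None&lt;/code&gt;, and keys the client does not know about are ignored, so either side can be upgraded first.&lt;/p&gt;

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class GatewayConfig:
    # Fields assumed to exist since the first release (hypothetical names).
    cognito_client_id: str
    default_model: str
    # Fields added later: optional, so an older backend may simply omit them.
    openai_base_path: Optional[str] = None
    openai_regions: Optional[list] = None


def parse_config(payload):
    """Build a GatewayConfig, silently dropping keys this client does not know."""
    known = {f.name for f in fields(GatewayConfig)}
    return GatewayConfig(**{k: v for k, v in payload.items() if k in known})


old_backend = {"cognito_client_id": "abc", "default_model": "claude"}
new_backend = dict(old_backend, openai_base_path="/openai/v1", future_field=True)

print(parse_config(old_backend))
print(parse_config(new_backend))
```

&lt;p&gt;Only the original required fields can break a client, which is why they should stay stable while everything newer is optional.&lt;/p&gt;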

&lt;h3&gt;3. The CLI is the source of truth for "what works through the proxy"&lt;/h3&gt;

&lt;p&gt;Anything that speaks Anthropic Messages or OpenAI Responses with a custom &lt;code&gt;base_url&lt;/code&gt; works. The CLI is just a quality-of-life wrapper. If you prefer using the OpenAI SDK directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://&amp;lt;your&amp;gt;.cloudfront.net/openai/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sk-stratoclave-xxxxxxxx...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;# mint via web console or CLI
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;responses&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai.gpt-5.4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same for Anthropic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Anthropic&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://&amp;lt;your&amp;gt;.cloudfront.net&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sk-stratoclave-xxxxxxxx...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude Desktop's Cowork (Gateway mode) and Cline / Continue / Aider with &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; work the same way.&lt;/p&gt;

&lt;p&gt;The personal API-keys page is where you manage long-lived &lt;code&gt;sk-stratoclave-*&lt;/code&gt; keys yourself — the same key-shape the wrapper subcommands mint ephemerally and revoke on exit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwnr41aka1sz0j1hqihlr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwnr41aka1sz0j1hqihlr.png" alt="Stratoclave web console: personal API key management page listing active sk-stratoclave keys with their scope, creation date, and individual revoke buttons." width="800" height="681"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Screenshots: admin walkthrough&lt;/h2&gt;

&lt;p&gt;The admin flow from dashboard to a provisioned tenant member, top to bottom.&lt;/p&gt;

&lt;p&gt;The dashboard summarises tenants, users, recent activity, and credit consumption.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyrnvzqbdiu8luhvmsyjx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyrnvzqbdiu8luhvmsyjx.png" alt="Stratoclave web console main dashboard showing summary tiles for active users, total tokens consumed, credit budgets, and recent activity." width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The new-user form collects email, role (&lt;code&gt;admin&lt;/code&gt; / &lt;code&gt;team_lead&lt;/code&gt; / &lt;code&gt;user&lt;/code&gt;), tenant assignment, and an initial credit budget.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxtqklu7ynjlh7bu3d4ni.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxtqklu7ynjlh7bu3d4ni.png" alt="Stratoclave web console admin new-user creation form with fields for email address, role selection, tenant, and initial credit limit." width="800" height="756"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The user detail view shows assigned role, tenant, remaining vs total credit, and the user's own API keys.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9pk218ssuzxdjjs3ozgy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9pk218ssuzxdjjs3ozgy.png" alt="Stratoclave web console admin user detail page showing role / auth, tenant, credit balance, and the three admin actions: reassign tenant, override credit, delete user." width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The tenant detail view lists members, their credit balances, and the tenant-wide monthly cap.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fds87afqxxw593quvafb1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fds87afqxxw593quvafb1.png" alt="Stratoclave web console admin tenant detail page listing the tenant's members with their roles, credit usage, and the tenant-wide monthly budget ceiling." width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The admin usage-logs page is the audit trail. Filter by &lt;code&gt;tenant_id&lt;/code&gt;, &lt;code&gt;user_id&lt;/code&gt;, and ISO-8601 &lt;code&gt;since&lt;/code&gt; / &lt;code&gt;until&lt;/code&gt; — backed by a PK Query when &lt;code&gt;tenant_id&lt;/code&gt; is set, a GSI Query when &lt;code&gt;user_id&lt;/code&gt; is set, and a Scan otherwise (truncated at 100 rows).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ynbu874teub67y0p1ac.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ynbu874teub67y0p1ac.png" alt="Stratoclave web console admin usage-logs page with filter controls for tenant_id, user_id, since, until, and an empty results table for the current filter window." width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;
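
&lt;p&gt;The choice of access path for the usage-logs filters can be sketched as a small planner. It only decides which DynamoDB operation to use and does not call AWS. The index name is hypothetical, and how the real backend treats a request with both &lt;code&gt;tenant_id&lt;/code&gt; and &lt;code&gt;user_id&lt;/code&gt; set is an assumption (here the partition-key query wins and &lt;code&gt;user_id&lt;/code&gt; becomes a filter).&lt;/p&gt;

```python
SCAN_LIMIT = 100  # the Scan fallback is truncated at 100 rows


def plan_usage_query(tenant_id=None, user_id=None, since=None, until=None):
    """Pick the cheapest access path for a set of audit-log filters."""
    if tenant_id:
        plan = {"operation": "Query", "index": None, "key": {"tenant_id": tenant_id}}
        if user_id:
            plan["filter"] = {"user_id": user_id}  # assumption, see lead-in
    elif user_id:
        # "user_id-index" is a made-up GSI name.
        plan = {"operation": "Query", "index": "user_id-index", "key": {"user_id": user_id}}
    else:
        plan = {"operation": "Scan", "index": None, "key": {}, "limit": SCAN_LIMIT}
    plan["time_range"] = (since, until)  # ISO-8601 strings compare in time order
    return plan


print(plan_usage_query(tenant_id="acme")["operation"])
print(plan_usage_query(user_id="u-1")["index"])
print(plan_usage_query(since="2026-06-01T00:00:00Z")["operation"])
```

&lt;p&gt;Ordering the branches this way keeps the expensive full-table Scan as the last resort, which is also why it is the only path with a hard row limit.&lt;/p&gt;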

&lt;h2&gt;What it deliberately does not do&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No multi-provider fan-out.&lt;/strong&gt; It is Bedrock-shaped. If you need OpenAI direct, Vertex, Gemini, Ollama, and so on in one proxy, &lt;a href="https://github.com/BerriAI/litellm" rel="noopener noreferrer"&gt;LiteLLM&lt;/a&gt; is the right tool — it speaks 100+ providers and has a much richer commercial budgeting tier.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Postgres, no Redis.&lt;/strong&gt; All state is in DynamoDB. That keeps the deployment small and the failure modes few; it also caps the budgeting feature set at what fits a key-value store.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No us-east-1 escape for the control plane.&lt;/strong&gt; All Stratoclave infrastructure runs in us-east-1; only the &lt;code&gt;bedrock-mantle&lt;/code&gt; calls for OpenAI are cross-region (us-east-2 / us-west-2).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No SaaS dependency.&lt;/strong&gt; No external control plane, no telemetry, no license server. The deployment is yours.&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Stratoclave&lt;/th&gt;
&lt;th&gt;LiteLLM Proxy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Providers&lt;/td&gt;
&lt;td&gt;Amazon Bedrock (Claude family + OpenAI GPT-5.x)&lt;/td&gt;
&lt;td&gt;100+ (OpenAI, Anthropic, Bedrock, Vertex, Azure, Gemini, Ollama, …)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;State&lt;/td&gt;
&lt;td&gt;DynamoDB only (serverless)&lt;/td&gt;
&lt;td&gt;Postgres required, Redis recommended&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RBAC&lt;/td&gt;
&lt;td&gt;admin / team_lead / user, tenant-scoped&lt;/td&gt;
&lt;td&gt;Proxy / Internal User / Team, global / team / user / key / model budgets&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API keys&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;sk-stratoclave-*&lt;/code&gt;, scope narrowing, cap of 5 active, immediate revoke&lt;/td&gt;
&lt;td&gt;Virtual keys with &lt;code&gt;expires / max_budget / rpm_limit / tpm_limit / models&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SSO / STS&lt;/td&gt;
&lt;td&gt;Built-in (Vouch by STS, covers &lt;code&gt;aws sso&lt;/code&gt;, &lt;code&gt;saml2aws&lt;/code&gt;, IAM users)&lt;/td&gt;
&lt;td&gt;Enterprise tier (Okta / Entra ID / OIDC / SAML)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deploy&lt;/td&gt;
&lt;td&gt;AWS CDK v2, Fargate from 256 CPU / 512 MiB&lt;/td&gt;
&lt;td&gt;Docker / Helm / ECS / EKS / Cloud Run&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;License&lt;/td&gt;
&lt;td&gt;Apache 2.0 (everything OSS)&lt;/td&gt;
&lt;td&gt;Dual license (MIT + Commercial); SSO / audit are commercial&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CLI integration&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;stratoclave claude --&lt;/code&gt; / &lt;code&gt;stratoclave codex --&lt;/code&gt; ephemeral wrappers&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; / &lt;code&gt;OPENAI_BASE_URL&lt;/code&gt; env override&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;When this is the right tool&lt;/h2&gt;

&lt;p&gt;You're an AWS-native team that already has IAM Identity Center / &lt;code&gt;saml2aws&lt;/code&gt;, you only call Bedrock, and you do not want to run an RDBMS for a proxy. You want per-tenant credit, per-user override, an audit trail, and the option to mint short-lived keys for CI without touching Postgres.&lt;/p&gt;

&lt;p&gt;You're not? Pick LiteLLM. Stratoclave is opinionated and small on purpose.&lt;/p&gt;

&lt;h2&gt;Quick start&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Deploy to your AWS account&lt;/span&gt;
git clone https://github.com/littlemex/stratoclave.git
&lt;span class="nb"&gt;cd &lt;/span&gt;stratoclave
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_PROFILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-admin-profile
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;us-east-1 &lt;span class="nv"&gt;CDK_DEFAULT_REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;us-east-1
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CDK_DEFAULT_ACCOUNT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws sts get-caller-identity &lt;span class="nt"&gt;--query&lt;/span&gt; Account &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;iac &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; ./scripts/deploy-all.sh

&lt;span class="c"&gt;# Build the CLI (pre-built binaries TBD)&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; ../cli &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; cargo build &lt;span class="nt"&gt;--release&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PWD&lt;/span&gt;&lt;span class="s2"&gt;/target/release:&lt;/span&gt;&lt;span class="nv"&gt;$PATH&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# Bootstrap and use&lt;/span&gt;
stratoclave setup https://&amp;lt;your&amp;gt;.cloudfront.net
stratoclave auth sso &lt;span class="nt"&gt;--profile&lt;/span&gt; &amp;lt;your-aws-sso-profile&amp;gt;
stratoclave codex &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="s2"&gt;"Hello, who are you?"&lt;/span&gt;
stratoclave claude &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="s2"&gt;"Summarise this repository"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Status&lt;/h2&gt;

&lt;p&gt;Alpha. Public HTTP surfaces, DynamoDB schemas, and CDK construct props may change without notice until &lt;code&gt;v0.1.0&lt;/code&gt; is cut. Issues and pull requests welcome.&lt;/p&gt;

&lt;p&gt;If you read this far and the tradeoffs match your situation, I would be very glad to hear how the deploy goes.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>opensource</category>
      <category>security</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
