DEV Community


Bedrock for AI Coding Tools: Mantle vs Gateway vs LiteLLM — A Decision Guide for AWS Credit Burners

You have AWS credits. You want to use them on AI coding tools — OpenCode, Codex CLI, Claude Code, whatever. Amazon Bedrock has the models. But how do you actually connect them?

There are three approaches, and picking the wrong one wastes time. Here's the decision guide I wish I had.

All data in this post is as of March 2026. Model counts and API support may change — check amazonbedrockmodels.github.io for the latest.

TL;DR

  • Just want it to work? Mantle + OpenCode. Five minutes, zero infra.
  • Need Claude models via OpenAI API? bedrock-access-gateway on Lambda.
  • Need Claude Code specifically? LiteLLM. It's the only path.
  • Codex CLI? Broken with all three. Wait for LiteLLM to fix a tool translation bug.

The three paths

Decision flowchart: Mantle vs bedrock-access-gateway vs LiteLLM

One thing all three have in common: your API keys and code context stay within your AWS account or your own infrastructure. Third-party AI gateways exist (Bifrost, Portkey, etc.), but they require routing your Bedrock API keys and code context through someone else's servers. Self-hosted or AWS-native — that's the baseline.

1. Bedrock Mantle — no self-hosted infra required

Mantle is AWS's native OpenAI-compatible endpoint. No Lambda, no container, no proxy — just set your base URL and API key:

export OPENAI_BASE_URL="https://bedrock-mantle.us-east-1.api.aws/v1"
export OPENAI_API_KEY="your-bedrock-api-key"
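With those two variables set, a quick sanity check is to list the models your key can see. This is an illustrative command (it assumes `curl` and `jq` are installed; the IDs returned depend on what's enabled in your account):

```shell
# List model IDs available on the Mantle endpoint
curl -s "$OPENAI_BASE_URL/models" \
  -H "Authorization: Bearer $OPENAI_API_KEY" | jq -r '.data[].id'
```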

What's on Mantle: 38 open-weight models — DeepSeek, Mistral, Qwen, GLM, NVIDIA Nemotron, MiniMax, Moonshot Kimi, Google Gemma, OpenAI gpt-oss, and Writer Palmyra.

What's NOT on Mantle: Anthropic Claude, Amazon Nova, Meta Llama, AI21, Cohere — the proprietary/first-party models are absent.

API coverage: Mantle exposes Chat Completions (/v1/chat/completions) and Responses API (/v1/responses). No Anthropic Messages API (/v1/messages).

The Responses API is limited — only 4 models support it: openai.gpt-oss-120b-1:0, openai.gpt-oss-20b-1:0, openai.gpt-oss-120b, and openai.gpt-oss-20b. Every other model is Chat Completions only. I verified this by scraping all 102 model card pages in the AWS documentation.
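For the handful of models that do support it, a Responses API call is standard OpenAI usage. A minimal sketch against the Mantle endpoint above (the prompt is arbitrary; the response shape follows the OpenAI Responses schema):

```shell
# Minimal Responses API call — only works with the gpt-oss models on Mantle
curl -s "$OPENAI_BASE_URL/responses" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "openai.gpt-oss-120b", "input": "Say hello in one word."}'
```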

Cost: Standard Bedrock on-demand pricing. No gateway markup, no infra costs.

Best for: OpenCode or any tool that speaks OpenAI Chat Completions.

2. bedrock-access-gateway — self-hosted, all models

bedrock-access-gateway (or my fork, bedrock-access-gateway-function-url) gives you an OpenAI-compatible proxy backed by all Bedrock models — including Claude, Nova, and Llama.

Deploy it as a Lambda Function URL or on ECS, and you get:

export OPENAI_BASE_URL="https://your-lambda-url.lambda-url.us-west-2.on.aws/api/v1"
export OPENAI_API_KEY="your-gateway-api-key"

The tradeoff: you maintain infrastructure. But you get access to every Bedrock model through a single OpenAI-compatible endpoint.
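For example, once the gateway is deployed, you can reach Claude through the plain Chat Completions schema. A sketch — the model ID here is illustrative; use whatever your gateway's `/api/v1/models` endpoint actually lists:

```shell
# Chat Completions request to the gateway, backed by a Bedrock Claude model
curl -s "$OPENAI_BASE_URL/chat/completions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic.claude-3-5-sonnet-20241022-v2:0",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```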

Cost: Bedrock on-demand pricing + Lambda/ECS compute costs (minimal for Lambda Function URLs — you pay per invocation).

Best for: When you need Claude or Nova through OpenAI-compatible tools, or want full control over routing, caching, and logging.

3. LiteLLM — the universal translator

LiteLLM is the Swiss Army knife. It translates between API schemas — OpenAI, Anthropic, Bedrock native, and more. It's the only option that gives you Anthropic Messages API (/v1/messages) compatibility with Bedrock models.

This matters because Claude Code uses the Anthropic API schema, not OpenAI's. If you want to run Claude Code against Bedrock, LiteLLM is your best (and arguably only) option. I tested this end-to-end: Claude Code CLI → LiteLLM → Bedrock Converse API — it works, including streaming responses.
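The wiring on the Claude Code side is just environment variables. A sketch, assuming LiteLLM is running locally on its default port 4000 and `claude-sonnet` is a model alias you've defined in your LiteLLM config:

```shell
# Point Claude Code at the LiteLLM proxy instead of the Anthropic API
export ANTHROPIC_BASE_URL="http://localhost:4000"
export ANTHROPIC_AUTH_TOKEN="your-litellm-key"
export ANTHROPIC_MODEL="claude-sonnet"
claude
```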

Is it perfect? No. Setup is more complex (Python process or Docker container, optional PostgreSQL for analytics), and you're adding another layer of abstraction. But it's the most flexible gateway available.
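For reference, a minimal LiteLLM config that maps an Anthropic-style model name onto a Bedrock model might look like this. The model ID, region, and key are illustrative — adjust them to what's enabled in your account:

```yaml
model_list:
  - model_name: claude-sonnet
    litellm_params:
      model: bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0
      aws_region_name: us-west-2

general_settings:
  master_key: your-litellm-key
```

Run it with `litellm --config config.yaml` and point your tools at `http://localhost:4000`.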

Cost: Bedrock on-demand pricing + your compute costs for hosting LiteLLM. No per-call markup from LiteLLM itself (open source).

Best for: Claude Code, or when you need both OpenAI and Anthropic API compatibility from a single proxy.

Tool compatibility matrix

| Tool | API Schema | Mantle | bedrock-access-gateway | LiteLLM |
| --- | --- | --- | --- | --- |
| OpenCode | OpenAI Chat | ✅ | ✅ | ✅ |
| Codex CLI | OpenAI Responses | ❌ Auth issues | ❌ No Responses API | ⚠️ Tool bug |
| Claude Code | Anthropic Messages | ❌ No support | ❌ Wrong schema | ✅ |

Note: Anthropic-native tools like Kiro CLI also work through LiteLLM's Anthropic Messages API translation.

A note on Codex CLI

Codex CLI requires the Responses API (/v1/responses), which limits your options:

  • Mantle: Only the 4 OpenAI gpt-oss models support Responses API. Even with those, I hit 401 auth errors (Bearer token not passed correctly through Codex's HTTPS transport) and tool type rejections (web_search type not supported — only function and mcp).
  • bedrock-access-gateway: No Responses API at all — /v1/responses returns 404. The gateway only implements Chat Completions.
  • LiteLLM: Supports Responses API (v1.66.3+) and has an official Codex CLI tutorial. However, as of v1.82.5, there's a tool translation bug: Codex CLI sends built-in tool types that LiteLLM converts to Bedrock Converse format with empty toolSpec.name fields, causing Bedrock validation errors. The Responses API itself works fine when tested with standard function tools. This should be fixable on the LiteLLM side.

Quick setup: OpenCode + Mantle

If you just want to burn AWS credits on a coding CLI today, here's the fastest path. OpenCode (v1.2.27+) works with Mantle out of the box:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "bedrock-mantle": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Bedrock Mantle",
      "options": {
        "baseURL": "https://bedrock-mantle.us-east-1.api.aws/v1",
        "apiKey": "{env:BEDROCK_API_KEY}"
      },
      "models": {
        "openai.gpt-oss-120b": { "name": "GPT OSS 120B" },
        "zai.glm-5": { "name": "GLM 5 (744B/40B MoE)" },
        "qwen.qwen3-coder-480b-a35b-instruct": { "name": "Qwen3 Coder 480B" },
        "deepseek.v3.2": { "name": "DeepSeek V3.2" },
        "mistral.mistral-large-3-675b-instruct": { "name": "Mistral Large 3" }
      }
    }
  },
  "model": "bedrock-mantle/openai.gpt-oss-120b"
}

Save to ~/.config/opencode/opencode.json, set BEDROCK_API_KEY, and you're coding.

The bottom line

| | Mantle | bedrock-access-gateway | LiteLLM |
| --- | --- | --- | --- |
| Infra to maintain | None | Lambda/ECS | Container/process |
| Models available | 38 (open-weight) | All Bedrock | All Bedrock |
| OpenAI Chat API | ✅ | ✅ | ✅ |
| OpenAI Responses API | ⚠️ gpt-oss only | ❌ | ✅ |
| Anthropic Messages API | ❌ | ❌ | ✅ |
| OpenCode | ✅ | ✅ | ✅ |
| Codex CLI | ❌ | ❌ | ⚠️ Tool bug |
| Claude Code | ❌ | ❌ | ✅ |
| Extra cost | None | ~$0 (Lambda) | Your compute |
| Setup time | 5 min | 30 min | 1 hr |

Track available Mantle models

I maintain amazonbedrockmodels.github.io — a catalog of every Bedrock model with API support badges and endpoint support (Mantle vs Runtime), scraped from the AWS documentation.


Burning AWS credits on something interesting? I'd love to hear what tools and models you're using — drop a comment.