OpenAI on Amazon Bedrock: I simulated the migration from my current stack and the numbers don't add up like the announcement promises
The "correct solution" for reducing OpenAI costs is to stop calling OpenAI directly. I know that sounds weird. Let me explain why Bedrock might end up costing you more — slower, with more friction — than staying exactly where you are.
When the announcement dropped — OpenAI and AWS CEOs sharing a stage, 274 points on HN, everyone losing their minds — my first instinct was the same one I had when I migrated from Vercel to Railway: I'm going to test this myself before I say anything. That Vercel migration ate a full weekend and taught me more about real infrastructure than months of reading docs ever did. With Bedrock it took less time to reach a conclusion, but it was just as educational.
Spoiler: I didn't migrate. And it wasn't out of laziness.
OpenAI Amazon Bedrock migration costs: what the announcement promises vs. what I actually found
The pitch is clean: you access OpenAI models (GPT-4o, o1, o3-mini) from your existing AWS infrastructure, using the same IAM you already have, without managing third-party API keys, with consolidated billing and Bedrock's availability guarantees. For a company with a security and compliance team, that's genuinely valuable.
For me — running a stack on Railway, Next.js, and PostgreSQL with direct OpenAI API calls — that's worth... let me calculate it.
My current stack has these measurable characteristics:
- ~4,200 calls/month to GPT-4o with 2k–8k token contexts
- Measured latency to first token: 380ms at p50, 720ms at p95
- Real cost over the last 30 days: $18.40 USD across input and output tokens
- Zero auth overhead: API key in an environment variable, one line of config
Before simulating the migration, I documented that baseline cold. I didn't want to fool myself later by comparing apples to oranges.
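If you want to reproduce that kind of baseline, here's a minimal sketch of the harness, assuming a prompts.json dump of real production prompts (the file name and the percentile helper are my stand-ins, not anything official):

```typescript
// Baseline harness: time-to-first-token percentiles against the direct API.
// prompts.json is a stand-in for a dump of real production prompts.
import OpenAI from 'openai';
import { readFileSync } from 'node:fs';

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const prompts: string[] = JSON.parse(readFileSync('prompts.json', 'utf8'));

const ttft: number[] = [];
for (const prompt of prompts) {
  const start = performance.now();
  const stream = await client.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: prompt }],
    max_tokens: 500,
    stream: true,
  });
  for await (const _chunk of stream) {
    ttft.push(performance.now() - start); // record the FIRST chunk only
    break; // abort the rest of the stream
  }
}

const sorted = [...ttft].sort((a, b) => a - b);
const pct = (p: number) => sorted[Math.min(sorted.length - 1, Math.floor(p * sorted.length))];
console.log(`p50: ${pct(0.5).toFixed(0)}ms  p95: ${pct(0.95).toFixed(0)}ms`);
```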
The simulation: moving my real calls to Bedrock
To simulate the migration I used an AWS account I already had active (leftover from when I was doing infra work back in 2022) and enabled the GPT-4o model in Bedrock from the console. The enablement process itself already has friction: you have to accept model-specific terms, wait for per-model approval, and configure the right IAM permissions. That took me 40 minutes the first time.
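Before writing any application code, it's worth checking which models your account can actually see in a region. A quick sketch with the Bedrock control-plane SDK, which is a separate package from the runtime client used below:

```typescript
// List the foundation models visible to this account in us-east-1.
// Uses @aws-sdk/client-bedrock (control plane), not client-bedrock-runtime.
import { BedrockClient, ListFoundationModelsCommand } from '@aws-sdk/client-bedrock';

const bedrock = new BedrockClient({ region: 'us-east-1' });
const { modelSummaries } = await bedrock.send(new ListFoundationModelsCommand({}));
console.log(modelSummaries?.map((m) => m.modelId).join('\n'));
```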
The SDK client changes:
```typescript
// Current stack: direct OpenAI call
// Simple, predictable, no surprises
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: prompt }],
  max_tokens: 500,
});
```
```typescript
// Same call via Bedrock
// Notice the signature change and the AWS credentials overhead
import { BedrockRuntimeClient, InvokeModelCommand } from '@aws-sdk/client-bedrock-runtime';

// AWS credentials are resolved at runtime from the environment:
// IAM role, env vars, or ~/.aws/credentials — each with its own latency
const bedrockClient = new BedrockRuntimeClient({
  region: 'us-east-1', // GPT-4o on Bedrock only available in us-east-1 at time of testing
});

// The body has to be serialized — no syntactic sugar
const command = new InvokeModelCommand({
  modelId: 'openai.gpt-4o', // different format from direct calls
  contentType: 'application/json',
  accept: 'application/json',
  body: JSON.stringify({
    messages: [{ role: 'user', content: prompt }],
    max_tokens: 500,
  }),
});

const rawResponse = await bedrockClient.send(command);

// You need to deserialize manually — another step that fails silently if you forget
const response = JSON.parse(new TextDecoder().decode(rawResponse.body));
```
That signature change isn't just cosmetic. It's a breaking point for any generic wrapper you've built on top of the official OpenAI SDK.
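One way to contain the damage is a thin seam: hide both SDKs behind a minimal interface so the rest of the codebase never imports either directly. A sketch, reusing the client and bedrockClient instances from the snippets above; the ChatProvider shape and the assumption that Bedrock returns an OpenAI-style JSON body are mine:

```typescript
// A minimal seam: the app codes against ChatProvider, never against either SDK.
interface ChatProvider {
  complete(prompt: string, maxTokens?: number): Promise<string>;
}

const openaiProvider: ChatProvider = {
  async complete(prompt, maxTokens = 500) {
    const res = await client.chat.completions.create({
      model: 'gpt-4o',
      messages: [{ role: 'user', content: prompt }],
      max_tokens: maxTokens,
    });
    return res.choices[0]?.message?.content ?? '';
  },
};

const bedrockProvider: ChatProvider = {
  async complete(prompt, maxTokens = 500) {
    const raw = await bedrockClient.send(new InvokeModelCommand({
      modelId: 'openai.gpt-4o',
      contentType: 'application/json',
      accept: 'application/json',
      body: JSON.stringify({
        messages: [{ role: 'user', content: prompt }],
        max_tokens: maxTokens,
      }),
    }));
    const parsed = JSON.parse(new TextDecoder().decode(raw.body));
    // Assumes the body mirrors the OpenAI response schema; verify before trusting it
    return parsed.choices?.[0]?.message?.content ?? '';
  },
};
```

With that seam in place, switching providers becomes a config flag instead of a codebase-wide refactor.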
The real numbers from the simulation
I ran the same set of 50 prompts against both endpoints — same texts, same model, same max_tokens — and measured:
| Metric | OpenAI direct | OpenAI via Bedrock | Delta |
|---|---|---|---|
| p50 latency (ms) | 382 | 534 | +40% |
| p95 latency (ms) | 718 | 1,240 | +72% |
| Cost per 1M input tokens | $2.50 | $3.00* | +20% |
| IAM cold start (first req) | 0ms | 340ms | — |
| Initial setup | ~2 min | ~40 min | — |
*Estimated pricing with Bedrock's markup at time of testing. Bedrock applies a surcharge on top of OpenAI's base price; it's not a pure pass-through.
The IAM cold start surprised me the most. The first call of each session carries a credential resolution overhead that simply doesn't exist with direct OpenAI. In a serverless context — which is where Bedrock theoretically makes the most sense — that 340ms stacks on top of your function's cold start. If you've read my post on how the Microsoft-OpenAI deal affects real API costs, this is the same pattern: corporate deals generate layers, and every layer has latency.
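If you end up on Bedrock in a serverless function anyway, you can at least pull credential resolution out of the request path. A sketch under the assumption that you're on AWS SDK v3, where the constructed client exposes its memoized credential provider as config.credentials:

```typescript
// Sketch: resolve AWS credentials during Lambda init, not on the first request.
// Assumes AWS SDK v3, where config.credentials is the client's memoized provider.
import { BedrockRuntimeClient } from '@aws-sdk/client-bedrock-runtime';

const bedrockClient = new BedrockRuntimeClient({ region: 'us-east-1' });

// Kicked off at module load; warm invocations find it already resolved
const credentialsReady = bedrockClient.config.credentials();

export async function handler(event: unknown) {
  await credentialsReady;
  // ... bedrockClient.send(...) as usual, without that first-request hit
  return { statusCode: 200 };
}
```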
The gotchas that don't appear in the announcement
1. The lock-in flips but doesn't disappear
Bedrock's sales pitch is escaping OpenAI lock-in. My take: what you're actually doing is trading model lock-in for platform lock-in. Now you depend on AWS to enable the models you need, at whatever price AWS negotiates, with whatever regional availability AWS decides on.
When OpenAI launched o3-mini, I had it in my stack in 20 minutes: I changed one line of config. On Bedrock, new OpenAI models have to go through AWS's enablement process, which historically takes days or weeks. For a project where I'm iterating on models constantly, that's real friction.
I already dug into the lock-in problem in infra when I simulated a domain hijacking attack on GoDaddy — depending on a third party for something critical always has a price that doesn't show up on the pricing page.
2. IAM is a vector of complexity, not just security
I spent 25 minutes debugging an AccessDeniedException that turned out to be an incomplete IAM policy. The error message doesn't tell you which permission is missing; it just tells you something failed. I had to go to CloudTrail, filter by the exact timestamp, and reconstruct the permission chain from there.
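For reference, a minimal policy that should cover plain InvokeModel calls looks roughly like this. Treat it as a sketch rather than a copy of my working setup, and scope the Resource down to the exact model ARNs you use:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "arn:aws:bedrock:us-east-1::foundation-model/*"
    }
  ]
}
```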
With direct OpenAI, if the API key is wrong, the error is clear, immediate, and self-explanatory. The simplicity of debugging is not a minor detail when you're working alone at 11pm.
3. "Consolidated" pricing has a minimum floor
For small projects — under $50/month in LLM spend — the operational overhead of maintaining an active AWS account, properly configured IAM policies, Bedrock monitoring in CloudWatch, and per-service billing tracking consumes dev time worth far more than anything the move could save you. And in this case there's nothing to save: the markup means Bedrock costs more than going direct. On my $18.40/month bill, that 20% markup alone adds roughly $3.70 a month.
This connects to something I understood during my Railway migration: "enterprise" infrastructure has an operational cost that doesn't scale down. Bedrock is enterprise infrastructure. For a team of 10+ people with compliance requirements and centralized billing, it makes sense. For me today, it doesn't.
4. Streaming behaves differently
I tested calls with streaming enabled — I rely on streaming for the user experience in my text generation features — and the chunk behavior on Bedrock is not identical to the official SDK's. Chunks arrive in different sizes, which broke my markdown parsing logic on the client. It's not a bug; it's an implementation difference that no announcement mentions.
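To make the difference concrete, here are the two streaming loops side by side, reusing the clients from the earlier snippets. The shape of the Bedrock chunk payload is my assumption (I'm guessing it mirrors the OpenAI delta format), so verify it against what you actually receive before wiring it into UI logic:

```typescript
import { InvokeModelWithResponseStreamCommand } from '@aws-sdk/client-bedrock-runtime';

// Hypothetical UI hook: whatever appends text to your client-side buffer
const render = (text: string) => process.stdout.write(text);

// OpenAI SDK: typed chunks, deltas arrive small and frequent
const stream = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: prompt }],
  max_tokens: 500,
  stream: true,
});
for await (const chunk of stream) {
  render(chunk.choices[0]?.delta?.content ?? '');
}

// Bedrock: raw bytes per event; sizes and timing differ from the OpenAI SDK
const res = await bedrockClient.send(new InvokeModelWithResponseStreamCommand({
  modelId: 'openai.gpt-4o',
  contentType: 'application/json',
  accept: 'application/json',
  body: JSON.stringify({
    messages: [{ role: 'user', content: prompt }],
    max_tokens: 500,
  }),
}));
for await (const event of res.body ?? []) {
  if (event.chunk?.bytes) {
    const payload = JSON.parse(new TextDecoder().decode(event.chunk.bytes));
    // Payload shape is an assumption on my part, not confirmed anywhere
    render(payload.choices?.[0]?.delta?.content ?? '');
  }
}
```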
I found something similar when I analyzed the Mercor voice data situation: the implementation details that don't appear in the announcement are exactly the ones that end up biting you.
FAQ: OpenAI on Amazon Bedrock for independent devs
Are OpenAI prices on Bedrock the same as on the direct API?
No. Bedrock applies a markup on top of OpenAI's base price. At the time of my simulation, GPT-4o on Bedrock cost ~$3.00 per million input tokens versus $2.50 on the direct API. The exact markup can vary and AWS doesn't document it prominently; you have to do the comparison manually from the pricing calculator.
Do I need an AWS account to access OpenAI via Bedrock?
Yes, mandatory. There's no access to Bedrock without an AWS account, configured IAM, and individually enabled models. If you already have infrastructure on AWS, that cost is already paid. If you don't, it's a new cost.
Is OpenAI latency on Bedrock comparable to the direct API?
In my tests, no. p50 was 40% higher and p95 was 72% higher. The IAM overhead and Bedrock's additional proxy layer add latency that simply doesn't exist in a direct call. For use cases where latency matters — real-time chat, response streaming — that difference is perceptible to the user.
Does Bedrock support all OpenAI models?
As of publishing this, no. GPT-4o and some models from the o1/o3 family are available, but not the full OpenAI catalog. New models have to go through AWS's enablement process before being available on Bedrock, creating a lag relative to direct availability.
Does it make sense to migrate if I'm already using other Bedrock models (Claude, Llama)?
Yes, and this is the case where the value proposition holds up best. If you already have active Bedrock infrastructure, configured IAM, and consolidated billing, adding GPT-4o to the same stack has a low marginal cost. The problem is standing all of that up from scratch just for OpenAI.
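If you're in that camp, Bedrock's Converse API is the cleanest way to treat GPT-4o as just one more modelId. A sketch, assuming the model supports Converse and using the openai.gpt-4o id from this post's simulation:

```typescript
// Sketch: Bedrock's Converse API gives one call shape across every enabled model.
import { BedrockRuntimeClient, ConverseCommand } from '@aws-sdk/client-bedrock-runtime';

const runtime = new BedrockRuntimeClient({ region: 'us-east-1' });

async function ask(modelId: string, prompt: string): Promise<string> {
  const res = await runtime.send(new ConverseCommand({
    modelId,
    messages: [{ role: 'user', content: [{ text: prompt }] }],
    inferenceConfig: { maxTokens: 500 },
  }));
  return res.output?.message?.content?.[0]?.text ?? '';
}

// Swapping models is a one-argument change:
// await ask('anthropic.claude-3-5-sonnet-20240620-v1:0', '...');
// await ask('openai.gpt-4o', '...');
```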
Does streaming work the same on Bedrock as in the OpenAI SDK?
Not exactly. Streaming chunks on Bedrock have different size behavior than the official SDK. If you have UI logic that depends on chunk size or timing — progressive markdown parsing, typing indicators — you'll need to adjust that logic. It's not a blocker, but it's work that the migration documentation doesn't mention.
My take: the deal is real, the value proposition for independent devs isn't
My thesis, straight up: the OpenAI-AWS deal is genuinely interesting for companies with infra teams, active compliance requirements, and centralized billing in AWS. For an independent dev or small team already calling the OpenAI API directly, Bedrock adds friction, adds cost, and adds latency without giving back anything that matters in that context.
What the announcement sells is operational simplicity for people who already carry operational complexity. If the problem Bedrock solves is "managing multiple API keys from multiple vendors," that problem only exists when you have multiple vendors and a security team auditing every credential. If you have one API key in an environment variable and Railway manages it for you, the problem doesn't exist and Bedrock solves nothing.
There's something else that makes me uncomfortable: whenever two giants announce an integration from the same stage, the numbers that show up in the deck are the ones that look good for both of them. The numbers I found — 40% more latency, 20% more cost, 40 minutes of setup versus 2 — don't appear in any press release.
This connects to my analysis of pgBackRest and its maintenance changes: infrastructure decisions that look neutral rarely are. Someone always comes out ahead.
My decision today? I'm staying on the direct API. If in six months the Bedrock markup drops, the IAM cold start disappears, and the model catalog reaches parity, I'll reevaluate. But I don't migrate for an announcement; I migrate for the numbers. And today's numbers say no.
If you made it this far and you're evaluating the same thing, do what I did: measure first. Pull your real calls from the last month, see what it costs you and how long it takes, and then open the Bedrock console. The announcement can wait; production infrastructure can't.
This article was originally published on juanchi.dev