yoko / Naoki Yokomachi

Using OpenRouter's OpenAI-Compatible Models (Grok 4.1 Fast) with Strands Agents

This article is an AI-assisted translation of a Japanese technical article.

Introduction

I'm building a personal AI agent called TONaRi ("tonari" means "next to" in Japanese — named with the idea of an AI that stands next to you and supports your daily life). It's built with Strands Agents + Amazon Bedrock AgentCore, with a VRM-powered 3D avatar frontend using AITuberKit.

In a previous article, I wrote about cost reduction through sub-agent splitting.
https://dev.to/yokomachi/28-tool-definitions-cutting-ai-agent-costs-with-sub-agent-splitting-4dbp

This time, I took cost reduction a step further by making it possible to switch the LLM itself to Grok 4.1 Fast via OpenRouter.

Cost Comparison

Let's compare the costs between Claude Haiku 4.5 (Amazon Bedrock), which I had been using as the main model, and Grok 4.1 Fast (OpenRouter), the new alternative.

Token type    Claude Haiku 4.5 (Bedrock)    Grok 4.1 Fast (OpenRouter)
Input         $1.10 / 1M tokens             $0.20 / 1M tokens
Output        $5.50 / 1M tokens             $0.50 / 1M tokens

That's a significant difference: Grok 4.1 Fast is 5.5x cheaper per input token and 11x cheaper per output token. As I mentioned in the previous article, per-token LLM pricing is by far the biggest cost driver, so reducing the unit price, while keeping quality at an acceptable level, has the greatest impact.
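To make the gap concrete, the prices from the table can be applied to a workload. The sketch below assumes a hypothetical monthly volume of 50M input tokens and 5M output tokens; the volume and the dictionary keys are illustrative only, not measured usage.

```python
# Prices in USD per 1M tokens, taken from the comparison table.
PRICES = {
    "claude-haiku-4.5-bedrock": {"input": 1.10, "output": 5.50},
    "grok-4.1-fast-openrouter": {"input": 0.20, "output": 0.50},
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of the given token volumes for one model."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 50M input tokens and 5M output tokens per month.
haiku = monthly_cost("claude-haiku-4.5-bedrock", 50_000_000, 5_000_000)
grok = monthly_cost("grok-4.1-fast-openrouter", 50_000_000, 5_000_000)
print(f"Haiku 4.5: ${haiku:.2f}, Grok 4.1 Fast: ${grok:.2f}, ratio: {haiku / grok:.1f}x")
# prints: Haiku 4.5: $82.50, Grok 4.1 Fast: $12.50, ratio: 6.6x
```

The overall ratio depends on the input/output mix: an input-heavy agent lands closer to 5.5x, an output-heavy one closer to 11x.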

Switching Models in Strands Agents

Strands Agents is an open-source agent SDK provided by AWS, and it supports models beyond Bedrock. With the OpenAIModel class, you can use models from any service that exposes an OpenAI-compatible API, such as OpenRouter. If you need broader provider support, LiteLLMModel is also an option. Since OpenRouter serves Grok 4.1 Fast through its OpenAI-compatible API, we use the OpenAIModel class directly.

Creating an OpenAIModel

First, add the openai dependency.

dependencies = [
    "strands-agents>=1.23.0",
    "openai>=1.0.0",
    # ...
]

Then create the model instance via OpenRouter.

from strands.models.openai import OpenAIModel

model = OpenAIModel(
    client_args={
        "api_key": "your-openrouter-api-key",
        "base_url": "https://openrouter.ai/api/v1",
    },
    model_id="x-ai/grok-4.1-fast",
)
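In a deployed agent you would probably not hardcode the API key. One possible approach is to build client_args from an environment variable and fail early when it is missing. This is a minimal sketch: the variable name OPENROUTER_API_KEY and the helper openrouter_client_args are assumptions for illustration, not something Strands Agents or OpenRouter requires.

```python
import os
from typing import Mapping

OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


def openrouter_client_args(env: Mapping[str, str] = os.environ) -> dict:
    """Build the client_args dict for OpenAIModel from environment variables.

    OPENROUTER_API_KEY is an assumed variable name chosen for this example.
    """
    api_key = env.get("OPENROUTER_API_KEY", "").strip()
    if not api_key:
        # Fail at startup rather than on the first model call.
        raise RuntimeError("Set OPENROUTER_API_KEY before creating the model")
    return {"api_key": api_key, "base_url": OPENROUTER_BASE_URL}
```

The result can then be passed as OpenAIModel(client_args=openrouter_client_args(), model_id="x-ai/grok-4.1-fast"), which keeps the key out of the source tree.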

The created model can be passed to an Agent with the exact same interface as a Bedrock model.

from strands import Agent

agent = Agent(
    model=model,  # Works the same whether BedrockModel or OpenAIModel
    system_prompt="You are a personal AI assistant.",
    tools=my_tools,  # my_tools: the list of tools defined elsewhere in your app
)
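Because both model classes plug into Agent the same way, the choice of provider can be reduced to configuration. The following is a sketch of one way to do that, not the actual TONaRi code: the environment variable names (LLM_PROVIDER, OPENROUTER_MODEL_ID, BEDROCK_MODEL_ID, OPENROUTER_API_KEY) and both helper functions are hypothetical.

```python
import os
from typing import Mapping


def select_model_spec(env: Mapping[str, str] = os.environ) -> dict:
    """Pick the provider and model ID from (hypothetical) environment variables."""
    provider = env.get("LLM_PROVIDER", "bedrock").lower()
    if provider == "openrouter":
        model_id = env.get("OPENROUTER_MODEL_ID", "x-ai/grok-4.1-fast")
        return {"provider": "openrouter", "model_id": model_id}
    if provider == "bedrock":
        # No default: the Bedrock model ID depends on your account and region.
        return {"provider": "bedrock", "model_id": env["BEDROCK_MODEL_ID"]}
    raise ValueError(f"Unknown LLM_PROVIDER: {provider!r}")


def build_model(spec: dict):
    """Create the Strands model object for the selected provider.

    Imports are done lazily so only the chosen provider's module is loaded.
    """
    if spec["provider"] == "openrouter":
        from strands.models.openai import OpenAIModel

        return OpenAIModel(
            client_args={
                "api_key": os.environ["OPENROUTER_API_KEY"],
                "base_url": "https://openrouter.ai/api/v1",
            },
            model_id=spec["model_id"],
        )
    from strands.models import BedrockModel

    return BedrockModel(model_id=spec["model_id"])
```

With this in place, the agent would be created as Agent(model=build_model(select_model_spec()), ...), and switching back to Bedrock becomes a configuration change rather than a code change.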

Wrap Up

I switched the model used for everyday conversations to Grok 4.1 Fast, and my impression is that quality isn't a major issue for casual conversation. However, the model sometimes ignores or misinterprets application-specific conversation tags (this AI agent uses tags like [happy] or [bow] to trigger facial expressions and motions), so that still needs tuning.
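One way to find out how often the tags go wrong is to separate recognized tags from unrecognized ones before the reply reaches the avatar. The sketch below is an illustration under assumptions, not the TONaRi implementation: the tag whitelist contains only the two tags mentioned above, and split_tags is a hypothetical helper.

```python
import re

# Hypothetical whitelist; only [happy] and [bow] are mentioned in this article.
KNOWN_TAGS = {"happy", "bow"}
TAG_PATTERN = re.compile(r"\[([A-Za-z_]+)\]")


def split_tags(reply: str) -> tuple[str, list[str], list[str]]:
    """Return (text without tags, recognized tags, unrecognized tags)."""
    found = [name.lower() for name in TAG_PATTERN.findall(reply)]
    known = [name for name in found if name in KNOWN_TAGS]
    unknown = [name for name in found if name not in KNOWN_TAGS]
    # Remove every bracketed tag, then collapse the gaps the removal leaves.
    # Note: this also strips any other [word] the model writes in brackets.
    text = re.sub(r"\s{2,}", " ", TAG_PATTERN.sub("", reply)).strip()
    return text, known, unknown


text, known, unknown = split_tags("[happy] Good morning! [wave] Ready for today?")
print(text)     # Good morning! Ready for today?
print(known)    # ['happy']
print(unknown)  # ['wave']
```

Logging the unknown list per model would give a rough measure of how well each model follows the tag instructions, which could inform prompt tuning.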

I also had concerns about tool calling via AgentCore Gateway, but it's been working surprisingly well without any major adjustments.

I'll continue monitoring and consider trying other models or implementing model-specific routing if needed.
