<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: yoko / Naoki Yokomachi</title>
    <description>The latest articles on DEV Community by yoko / Naoki Yokomachi (@yokomachi).</description>
    <link>https://dev.to/yokomachi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F637404%2F00fb320f-dc2f-4e97-9bd1-8b2b578b2209.jpg</url>
      <title>DEV Community: yoko / Naoki Yokomachi</title>
      <link>https://dev.to/yokomachi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/yokomachi"/>
    <language>en</language>
    <item>
      <title>Building an AWS Cost Visualization Workflow with Strands Agents Skills and AgentCore Code Interpreter</title>
      <dc:creator>yoko / Naoki Yokomachi</dc:creator>
      <pubDate>Fri, 03 Apr 2026 01:26:25 +0000</pubDate>
      <link>https://dev.to/aws-builders/building-an-aws-cost-visualization-workflow-with-strands-agents-skills-and-agentcore-code-2d2</link>
      <guid>https://dev.to/aws-builders/building-an-aws-cost-visualization-workflow-with-strands-agents-skills-and-agentcore-code-2d2</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;I'm currently developing a personal AI agent called TONaRi. It also has an X (Twitter) account where it posts tech news and more.&lt;br&gt;
&lt;a href="https://x.com/tonari_with" rel="noopener noreferrer"&gt;https://x.com/tonari_with&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The agent's core architecture is built on &lt;a href="https://strandsagents.com/" rel="noopener noreferrer"&gt;Strands Agents&lt;/a&gt; + &lt;a href="https://aws.amazon.com/bedrock/agentcore/" rel="noopener noreferrer"&gt;Amazon Bedrock AgentCore&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffyz2pxpy9x1s8ia4nuwt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffyz2pxpy9x1s8ia4nuwt.png" alt="Architecture overview" width="800" height="520"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this article, I combined AgentCore Code Interpreter with Strands Agents' Agent Skills to implement a workflow that retrieves AWS cost data and generates chart images using code. Check out the video demo below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://x.com/_cityside/status/2035339843014987845" rel="noopener noreferrer"&gt;https://x.com/_cityside/status/2035339843014987845&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Although this was an addition to an existing web application codebase, I hope it also serves as a useful reference for building something similar from scratch.&lt;/p&gt;

&lt;p&gt;Here are the main technologies used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AgentCore Code Interpreter&lt;/strong&gt;: One of Amazon Bedrock AgentCore's building blocks that executes code in a sandboxed environment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent Skills (SKILL.md)&lt;/strong&gt;: Externalized prompts that are loaded on demand&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost Explorer API&lt;/strong&gt;: An API for retrieving AWS cost data, called from an agent tool&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;S3&lt;/strong&gt;: Stores chart images generated by Code Interpreter, served to the frontend via Presigned URLs&lt;/li&gt;
&lt;/ul&gt;
&lt;h1&gt;
  
  
  Amazon Bedrock AgentCore Code Interpreter
&lt;/h1&gt;

&lt;p&gt;Amazon Bedrock AgentCore Code Interpreter (hereafter "Code Interpreter") is one of the building blocks that allows agents hosted on AgentCore Runtime to safely execute code in a sandboxed environment.&lt;br&gt;
&lt;a href="https://aws.amazon.com/blogs/machine-learning/introducing-the-amazon-bedrock-agentcore-code-interpreter/" rel="noopener noreferrer"&gt;https://aws.amazon.com/blogs/machine-learning/introducing-the-amazon-bedrock-agentcore-code-interpreter/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Key features include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Code execution in a sandboxed environment&lt;/li&gt;
&lt;li&gt;Pre-installed libraries such as pandas, numpy, and matplotlib&lt;/li&gt;
&lt;li&gt;In addition to the default access-restricted environment, you can create user-defined environments with public internet access or VPC connectivity&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this project, I use Code Interpreter to have the agent dynamically generate chart images from data using matplotlib.&lt;/p&gt;
&lt;h1&gt;
  
  
  Strands Agents Skills
&lt;/h1&gt;

&lt;p&gt;Agent Skills is a mechanism originally proposed by Anthropic. In a nutshell, it works like this: you define procedures you want the agent to execute in Markdown files (similar to system prompts), then inject only the metadata into the system prompt. The agent dynamically loads the Skill files based on the metadata and executes the procedures. This approach helps reduce token consumption and prevents context pollution.&lt;/p&gt;

&lt;p&gt;As of March 2026, Agent Skills are now available in Strands Agents as well:&lt;br&gt;
&lt;a href="https://strandsagents.com/docs/user-guide/concepts/plugins/skills/" rel="noopener noreferrer"&gt;https://strandsagents.com/docs/user-guide/concepts/plugins/skills/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For this project, I defined the following workflow as a Skill:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Call the Cost Explorer API tool to retrieve cost data for the user-specified period&lt;/li&gt;
&lt;li&gt;Call the cost visualization tool

&lt;ul&gt;
&lt;li&gt;2-1. Convert cost data into a chart image using Code Interpreter&lt;/li&gt;
&lt;li&gt;2-2. Upload the image to S3&lt;/li&gt;
&lt;li&gt;2-3. Return the S3 presigned URL&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h1&gt;
  
  
  Processing Flow
&lt;/h1&gt;

&lt;p&gt;Here's a simplified overview of the processing flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User: "Show me this month's AWS costs"
  ↓
Main Agent
  ├─ ① skills tool: Load skill
  ├─ ② get_aws_cost tool: Call Cost Explorer API
  └─ ③ execute_python tool
     └─ ③-1 Generate matplotlib chart via Code Interpreter
        ③-2 Upload to S3
        ③-3 Return presigned URL
  ↓
Frontend: Detect S3 image URL in text → Display inline in chat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Implementation
&lt;/h1&gt;

&lt;h2&gt;
  
  
  get_aws_cost: Cost Data Retrieval Tool
&lt;/h2&gt;

&lt;p&gt;The AWS cost retrieval tool is defined as an agent tool using the &lt;code&gt;@tool&lt;/code&gt; decorator. The logic is separated from the Code Interpreter chart image generation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;strands&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;

&lt;span class="n"&gt;_ce_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ce&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ap-northeast-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_aws_cost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;period&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;monthly&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;months&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;group_by_service&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Retrieve AWS cost data from Cost Explorer.

    Use this tool to fetch cost data. Then pass the result to execute_python
    to create matplotlib charts for visualization.

    Args:
        period: Granularity - &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;monthly&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; or &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;daily&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.
        months: Number of months to look back (default: 1, max: 6).
        group_by_service: If True, break down costs by AWS service.

    Returns:
        JSON string with cost data.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;ce&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_ce_client&lt;/span&gt;
    &lt;span class="c1"&gt;# ...
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ce&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_cost_and_usage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;TimePeriod&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Start&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;End&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;Granularity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MONTHLY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Metrics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;UnblendedCost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;GroupBy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DIMENSION&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SERVICE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  execute_python: Code Execution Tool
&lt;/h2&gt;

&lt;p&gt;Similarly, Code Interpreter code execution is defined as an agent tool using the &lt;code&gt;@tool&lt;/code&gt; decorator. To reliably capture matplotlib figures, the tool automatically injects capture code before and after the agent-generated code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;bedrock_agentcore.tools.code_interpreter_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;code_session&lt;/span&gt;

&lt;span class="n"&gt;CODE_INTERPRETER_REGION&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CODE_INTERPRETER_REGION&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ap-northeast-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;OUTPUT_BUCKET&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CODE_INTERPRETER_OUTPUT_BUCKET&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ap-northeast-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;_s3_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_REGION&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ap-northeast-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute_python&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Execute Python code in a sandboxed environment. Use this to run data analysis,
    generate charts with matplotlib, or perform calculations.

    Available libraries: pandas, numpy, matplotlib, json, datetime.
    Use ONLY matplotlib for plotting (not seaborn).
    Use English for all chart labels and titles (Japanese fonts are not available).

    IMPORTANT for chart generation:
    - Do NOT call plt.savefig() — images are auto-captured from open figures.
    - Do NOT call plt.close() — closing figures prevents image capture.
    - Just create figures with plt.subplots() and leave them open.
    - Do NOT use boto3 — the sandbox has no AWS credentials.

    Args:
        code: Python code to execute.
        description: Optional description of what the code does.

    Returns:
        JSON string with execution results including stdout, stderr, and image URLs.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Automatically inject matplotlib image capture code
&lt;/span&gt;    &lt;span class="n"&gt;img_code&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
import matplotlib
matplotlib.use(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Agg&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)
&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;
import matplotlib.pyplot as plt, base64, io, json as _json
_imgs = []
for _i in plt.get_fignums():
    _b = io.BytesIO()
    plt.figure(_i).savefig(_b, format=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;png&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, bbox_inches=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tight&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, dpi=100)
    _b.seek(0)
    _imgs.append({{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;i&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;: _i, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;d&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;: base64.b64encode(_b.read()).decode()}})
if _imgs:
    print(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;_IMG_&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; + _json.dumps(_imgs) + &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;_END_&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)
plt.close(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;all&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;code_session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CODE_INTERPRETER_REGION&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;code_client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;code_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;executeCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;img_code&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;language&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;python&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clearContext&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="c1"&gt;# Extract images from stdout using _IMG_..._END_ markers
&lt;/span&gt;        &lt;span class="c1"&gt;# Upload to S3 and return presigned URLs
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Creating the SKILL.md
&lt;/h2&gt;

&lt;p&gt;Now that the tools are defined, we create the Agent Skill that defines how to call them. The directory structure looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;agentcore/
├── skills/
│   └── aws-cost/
│       └── SKILL.md
├── app.py
└── ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The SKILL.md file contains YAML frontmatter and a Markdown-formatted prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-cost&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analyze&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;and&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;visualize&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;AWS&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;cost&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;using&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;get_aws_cost"&lt;/span&gt;
  &lt;span class="s"&gt;for data retrieval and execute_python for matplotlib chart generation&lt;/span&gt;
&lt;span class="na"&gt;allowed-tools&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;get_aws_cost execute_python&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# AWS Cost Analysis Skill&lt;/span&gt;

Two-step process: fetch data with &lt;span class="sb"&gt;`get_aws_cost`&lt;/span&gt;,
then visualize with &lt;span class="sb"&gt;`execute_python`&lt;/span&gt;.

&lt;span class="gu"&gt;## Critical Rules&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; &lt;span class="gs"&gt;**NEVER call plt.savefig()**&lt;/span&gt; — images are auto-captured from open figures.
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**NEVER call plt.close()**&lt;/span&gt; — closing figures prevents image capture.
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**Use English for ALL text**&lt;/span&gt; in charts — Japanese fonts are unavailable.

&lt;span class="gu"&gt;## Step 1: Fetch Data&lt;/span&gt;
(How to call get_aws_cost)

&lt;span class="gu"&gt;## Step 2: Visualize&lt;/span&gt;
(matplotlib code template)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Integrating with the Agent
&lt;/h2&gt;

&lt;p&gt;The tools are passed via the &lt;code&gt;tools&lt;/code&gt; parameter, and the Skill is initialized with the &lt;code&gt;AgentSkills&lt;/code&gt; plugin and passed to the agent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;strands&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AgentSkills&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.agent.code_interpreter&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;execute_python&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;src.agent.aws_cost&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_aws_cost&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize the Skills plugin
&lt;/span&gt;&lt;span class="n"&gt;skills_plugin&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentSkills&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skills&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./skills/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Create the agent
&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;other_tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;execute_python&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;get_aws_cost&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;plugins&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;skills_plugin&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I'll skip the frontend implementation details, but essentially it detects image URLs in the agent's response and automatically fetches and displays them inline.&lt;/p&gt;

&lt;h1&gt;
  
  
  Demo
&lt;/h1&gt;

&lt;p&gt;Here's what it looks like when the skill is actually running. Since the chart-generating code is dynamically created by the agent, the output varies depending on how you phrase your instructions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbez6nxpakl4bzypunzt2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbez6nxpakl4bzypunzt2.png" alt="Demo screenshot" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here's the video demo again from the beginning of the article:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://x.com/_cityside/status/2035339843014987845" rel="noopener noreferrer"&gt;https://x.com/_cityside/status/2035339843014987845&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Wrapping Up
&lt;/h1&gt;

&lt;p&gt;That's how I implemented an AWS cost charting feature using Agent Skills + Code Interpreter. (Admittedly, you could just look at the Cost Explorer console for the same information, but this was more of a proof of concept...)&lt;/p&gt;

&lt;p&gt;In this implementation, I used the default Code Interpreter tool, which restricts public internet access. However, by using a user-defined Code Interpreter tool, you could enable more flexible code execution. I'd love to explore the possibilities further.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>architecture</category>
      <category>aws</category>
    </item>
    <item>
      <title>Using OpenRouter's OpenAI-Compatible Models (Grok 4.1 Fast) with Strands Agents</title>
      <dc:creator>yoko / Naoki Yokomachi</dc:creator>
      <pubDate>Sun, 15 Mar 2026 02:05:39 +0000</pubDate>
      <link>https://dev.to/yokomachi/using-openrouters-openai-compatible-models-grok-41-fast-with-strands-agents-8l3</link>
      <guid>https://dev.to/yokomachi/using-openrouters-openai-compatible-models-grok-41-fast-with-strands-agents-8l3</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;This article is an AI-assisted translation of a Japanese technical article.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;I'm building a personal AI agent called TONaRi ("tonari" means "next to" in Japanese — named with the idea of an AI that stands next to you and supports your daily life). It's built with Strands Agents + Amazon Bedrock AgentCore, with a VRM-powered 3D avatar frontend using AITuberKit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmvxtux6cbpf8b0oqys5g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmvxtux6cbpf8b0oqys5g.png" alt=" " width="800" height="520"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In a previous article, I wrote about cost reduction through sub-agent splitting.&lt;br&gt;
&lt;a href="https://dev.to/yokomachi/28-tool-definitions-cutting-ai-agent-costs-with-sub-agent-splitting-4dbp"&gt;https://dev.to/yokomachi/28-tool-definitions-cutting-ai-agent-costs-with-sub-agent-splitting-4dbp&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This time, I took cost reduction a step further by making it possible to switch the LLM itself to Grok 4.1 Fast via OpenRouter.&lt;/p&gt;
&lt;h2&gt;
  
  
  Cost Comparison
&lt;/h2&gt;

&lt;p&gt;Let's compare the costs between Claude Haiku 4.5 (Amazon Bedrock), which I had been using as the main model, and Grok 4.1 Fast (OpenRouter), the new alternative.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Claude Haiku 4.5 (Bedrock)&lt;/th&gt;
&lt;th&gt;Grok 4.1 Fast (OpenRouter)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Input&lt;/td&gt;
&lt;td&gt;$1.10 / 1M tokens&lt;/td&gt;
&lt;td&gt;$0.20 / 1M tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Output&lt;/td&gt;
&lt;td&gt;$5.50 / 1M tokens&lt;/td&gt;
&lt;td&gt;$0.50 / 1M tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That's a significant difference. As I mentioned in the previous article, LLM per-token pricing is by far the biggest cost driver, so reducing the unit price — while maintaining an acceptable quality balance — has the greatest impact.&lt;/p&gt;
&lt;h2&gt;
  
  
  Switching Models in Strands Agents
&lt;/h2&gt;

&lt;p&gt;Strands Agents is an open-source agent SDK provided by AWS, and it supports models beyond Bedrock. Using the &lt;code&gt;OpenAIModel&lt;/code&gt; class, you can directly use models from any service that provides an OpenAI-compatible API, such as OpenRouter. If you need broader provider support, &lt;code&gt;LiteLLMModel&lt;/code&gt; is also an option. Since Grok 4.1 Fast is OpenAI-compatible, we use the &lt;code&gt;OpenAIModel&lt;/code&gt; class directly.&lt;/p&gt;
&lt;h3&gt;
  
  
  Creating an OpenAIModel
&lt;/h3&gt;

&lt;p&gt;First, add the &lt;code&gt;openai&lt;/code&gt; dependency.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dependencies = [
    "strands-agents&amp;gt;=1.23.0",
    "openai&amp;gt;=1.0.0",
    # ...
]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then create the model instance via OpenRouter.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;strands.models.openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAIModel&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAIModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;client_args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;api_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-openrouter-api-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;base_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://openrouter.ai/api/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;model_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x-ai/grok-4.1-fast&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The created model can be passed to an Agent with the exact same interface as a Bedrock model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;strands&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt;

&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Works the same whether BedrockModel or OpenAIModel
&lt;/span&gt;    &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a personal AI assistant.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;my_tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Wrap Up
&lt;/h2&gt;

&lt;p&gt;So I switched the model used for everyday conversations to Grok 4.1 Fast, and my impression is that quality isn't a major issue for casual conversation. However, application-specific conversation tags (this AI agent uses tags like &lt;code&gt;[happy]&lt;/code&gt; or &lt;code&gt;[bow]&lt;/code&gt; to trigger facial expressions and motions) sometimes get ignored or misinterpreted by the model, so that still needs tuning.&lt;/p&gt;

&lt;p&gt;I also had concerns about tool calling via AgentCore Gateway, but it's been working surprisingly well without any major adjustments.&lt;/p&gt;

&lt;p&gt;I'll continue monitoring and consider trying other models or implementing model-specific routing if needed.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>ai</category>
      <category>strandsagents</category>
      <category>openrouter</category>
    </item>
    <item>
      <title>28 TOOL DEFINITIONS! — Cutting AI Agent Costs with Sub-Agent Splitting</title>
      <dc:creator>yoko / Naoki Yokomachi</dc:creator>
      <pubDate>Sat, 07 Mar 2026 12:53:02 +0000</pubDate>
      <link>https://dev.to/yokomachi/28-tool-definitions-cutting-ai-agent-costs-with-sub-agent-splitting-4dbp</link>
      <guid>https://dev.to/yokomachi/28-tool-definitions-cutting-ai-agent-costs-with-sub-agent-splitting-4dbp</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;This article is an AI-assisted translation of a Japanese technical article.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;I'm building a personal AI agent called TONaRi ("tonari" means "next to" in Japanese — named with the idea of an AI that stands next to you and supports your daily life). It's built with Strands Agents + Amazon Bedrock AgentCore, with a VRM-powered 3D avatar frontend using AITuberKit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8jjnj0rszxp1bhp89f2t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8jjnj0rszxp1bhp89f2t.png" alt=" " width="800" height="520"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As I kept adding tools to make my personal AI agent more useful for daily tasks, the input tokens per API call ballooned — and so did the cost.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbk9v2dt1e2hjyuxal3tu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbk9v2dt1e2hjyuxal3tu.png" alt=" " width="800" height="426"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;It's lower now, but the projection was heading toward $120/month&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In this article, I'll walk through the input token bloat problem caused by too many tools and how I tackled it by splitting into sub-agents.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture Overview
&lt;/h2&gt;

&lt;p&gt;Here's a high-level look at TONaRi's architecture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Frontend (Next.js + VRM 3D Avatar)
  → Next.js API Route
    → AgentCore Runtime (Strands Agent)
      → AgentCore Gateway → Lambda functions (tools)
      → AgentCore Memory (STM/LTM)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent runs as a container deployed on Bedrock AgentCore Runtime. External tools are implemented as Lambda functions accessed through AgentCore Gateway. Adding a new tool is as simple as writing a Lambda function and registering it as a Gateway target.&lt;/p&gt;

&lt;h2&gt;
  
  
  All the Tools
&lt;/h2&gt;

&lt;p&gt;AgentCore Gateway lets you expose Lambda functions as agent tools.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;strands.tools.mcp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MCPClient&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp_proxy_for_aws.client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;aws_iam_streamablehttp_client&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_mcp_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gateway_url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;MCPClient&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_transport&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;aws_iam_streamablehttp_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;gateway_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;aws_region&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;aws_service&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bedrock-agentcore&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;MCPClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;create_transport&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here are all the tools I've connected:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Domain&lt;/th&gt;
&lt;th&gt;Tools&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Task Management&lt;/td&gt;
&lt;td&gt;List, Add, Complete, Update&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Calendar&lt;/td&gt;
&lt;td&gt;List events, Check availability, Create, Update, Delete, Suggest schedule&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gmail&lt;/td&gt;
&lt;td&gt;Search, Get, Create draft, Archive&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Notion&lt;/td&gt;
&lt;td&gt;Search pages, Get page, Create, Update, Query DB, Get DB&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Twitter&lt;/td&gt;
&lt;td&gt;Get today's tweets, Post&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Diary&lt;/td&gt;
&lt;td&gt;Save, Get&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Date Utils&lt;/td&gt;
&lt;td&gt;Get current datetime, Calculate date, List date range&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web Search&lt;/td&gt;
&lt;td&gt;Web search&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;28&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each tool can be called individually, but the real power is chaining. For example, saying "Search for a recipe, save the bookmark to Notion, create a shopping list, and add grocery shopping to my tasks" triggers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Web search tool finds a recipe&lt;/li&gt;
&lt;li&gt;Saves the URL to a Notion bookmark page&lt;/li&gt;
&lt;li&gt;Creates a shopping list from the recipe and saves it to a Notion memo page&lt;/li&gt;
&lt;li&gt;Adds a grocery shopping task to TONaRi's task list&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The AI agent sits between tools and interprets vague user requests to orchestrate across them — this is the most useful aspect of using an AI agent day-to-day.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Input Token Explosion
&lt;/h2&gt;

&lt;p&gt;Behind the convenience, costs were quietly piling up. When calling the Bedrock API, input tokens consist of four main components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;System prompt&lt;/strong&gt;: Agent character settings, behavior rules&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool definitions&lt;/strong&gt;: Name, description, and JSON schema for every tool&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long-term memory (LTM)&lt;/strong&gt;: Episodes and facts extracted from past conversations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conversation history (STM)&lt;/strong&gt;: Current session content&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The biggest culprit was tool definitions. I had Claude Code calculate it — the 28 tools directly connected to the agent consumed about 5,000 tokens.&lt;/p&gt;

&lt;h3&gt;
  
  
  Breaking Down the Numbers
&lt;/h3&gt;

&lt;p&gt;Here's a rough breakdown of input tokens per call for the monolithic agent:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Estimated Tokens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;System prompt (character + all domain rules)&lt;/td&gt;
&lt;td&gt;~3,500&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tool definitions (28 tools × schema)&lt;/td&gt;
&lt;td&gt;~5,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LTM search results&lt;/td&gt;
&lt;td&gt;~1,500&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Conversation history (10 turns)&lt;/td&gt;
&lt;td&gt;Variable (~5,000–30,000)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The system prompt, tools, and LTM are essentially fixed costs sent with every message — that's 10,000 tokens per call. With about 100 calls per day, the monthly fixed cost alone is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;10,000 tokens × 100 calls/day × 30 days = 30,000,000 tokens/month
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At Claude Haiku 4.5's Bedrock input token rate ($1.10/1M tokens for Japan cross-region inference), that's $33/month in fixed costs alone. As a solo developer, having ~$33/month go toward tool definitions that might not even be used on a given call was painful.&lt;/p&gt;

&lt;h2&gt;
  
  
  Splitting into Sub-Agents
&lt;/h2&gt;

&lt;p&gt;To reduce the number of tool definitions the main agent loads, I created domain-specific sub-agents and had the main agent call them via the &lt;code&gt;@tool&lt;/code&gt; decorator.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Before: Monolithic]
Main Agent
├── System prompt (all domain rules)
└── 28 tools ← sent every single call

[After: Sub-agent split]
Main Agent
├── System prompt (generic rules only)
├── DateTool (3 tools)      ← frequently used, kept in main
├── TavilySearch (1 tool)   ← same
├── task_agent      ← defined as @tool (4 tools)
├── calendar_agent  ← defined as @tool (6 tools)
├── gmail_agent     ← defined as @tool (4 tools)
├── notion_agent    ← defined as @tool (6 tools)
├── diary_agent     ← defined as @tool (2 tools)
├── briefing_agent  ← defined as @tool (multi-domain tools)
└── twitter_agent   ← defined as @tool (2 tools)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Sub-Agent Implementation
&lt;/h3&gt;

&lt;p&gt;With Strands Agents' &lt;code&gt;@tool&lt;/code&gt; decorator, you can define a sub-agent as a tool for the main agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;calendar_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Google Calendar sub-agent. Handles listing, availability checks, creating, updating, and deleting events.

    Args:
        request: A request related to the owner&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s calendar
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;BedrockModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;model_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;jp.anthropic.claude-haiku-4-5-20251001-v1:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ap-northeast-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;streaming&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a Google Calendar specialist assistant...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;_calendar_tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# calendar tools only
&lt;/span&gt;            &lt;span class="n"&gt;callback_handler&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Calendar operation error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  System Prompt Reduction
&lt;/h3&gt;

&lt;p&gt;By splitting sub-agents by domain, domain-specific rules moved from the main system prompt to each sub-agent's prompt.&lt;/p&gt;

&lt;p&gt;Before: Main prompt contained all domain rules&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- Calendar rules (duplicate checks, deletion confirmation, etc.)
- Gmail rules (draft only, date search caveats, etc.)
- Notion rules (property formats, database mappings, etc.)
- Briefing procedure (5 detailed sections)
- Diary creation flow (interview → generate → save)
- ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After: Main prompt only has sub-agent list and delegation rules&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;## Sub-agent Coordination
- task_agent: Task management (list, add, complete, update)
- calendar_agent: Google Calendar (get, create, update, delete events)
- gmail_agent: Gmail (search, get, create drafts)
- ...

### Delegation Rules
- Describe requests to sub-agents in detail
- Rephrase sub-agent results in your own words
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This reduced the system prompt from ~7,400 characters to ~3,800 characters.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cost Reduction
&lt;/h3&gt;

&lt;p&gt;Comparing the main agent's fixed cost per call:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Before (Monolithic)&lt;/th&gt;
&lt;th&gt;After (Sub-agent split)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;System prompt&lt;/td&gt;
&lt;td&gt;~3,500 tokens&lt;/td&gt;
&lt;td&gt;~2,000 tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tool definitions&lt;/td&gt;
&lt;td&gt;28 tools (~5,000 tokens)&lt;/td&gt;
&lt;td&gt;12 tools (~2,500 tokens)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LTM search results&lt;/td&gt;
&lt;td&gt;~1,500 tokens&lt;/td&gt;
&lt;td&gt;~1,500 tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Fixed cost total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~10,000 tokens&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~6,000 tokens&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Those 4,000 tokens weren't deleted — they moved to the sub-agents. Here's the per-call input token cost for each sub-agent:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Sub-agent&lt;/th&gt;
&lt;th&gt;Prompt&lt;/th&gt;
&lt;th&gt;Tool Defs&lt;/th&gt;
&lt;th&gt;Request Message&lt;/th&gt;
&lt;th&gt;Total&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;task_agent&lt;/td&gt;
&lt;td&gt;~400&lt;/td&gt;
&lt;td&gt;~400&lt;/td&gt;
&lt;td&gt;~100&lt;/td&gt;
&lt;td&gt;~900&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;calendar_agent&lt;/td&gt;
&lt;td&gt;~400&lt;/td&gt;
&lt;td&gt;~850&lt;/td&gt;
&lt;td&gt;~100&lt;/td&gt;
&lt;td&gt;~1,350&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gmail_agent&lt;/td&gt;
&lt;td&gt;~400&lt;/td&gt;
&lt;td&gt;~400&lt;/td&gt;
&lt;td&gt;~100&lt;/td&gt;
&lt;td&gt;~900&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;notion_agent&lt;/td&gt;
&lt;td&gt;~400&lt;/td&gt;
&lt;td&gt;~700&lt;/td&gt;
&lt;td&gt;~100&lt;/td&gt;
&lt;td&gt;~1,200&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;briefing_agent&lt;/td&gt;
&lt;td&gt;~500&lt;/td&gt;
&lt;td&gt;~2,500&lt;/td&gt;
&lt;td&gt;~100&lt;/td&gt;
&lt;td&gt;~3,100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;diary_agent&lt;/td&gt;
&lt;td&gt;~400&lt;/td&gt;
&lt;td&gt;~200&lt;/td&gt;
&lt;td&gt;~100&lt;/td&gt;
&lt;td&gt;~700&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;twitter_agent&lt;/td&gt;
&lt;td&gt;~400&lt;/td&gt;
&lt;td&gt;~150&lt;/td&gt;
&lt;td&gt;~100&lt;/td&gt;
&lt;td&gt;~650&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you just add everything up, the "After" total is actually higher. But the key insight is reducing tokens sent on &lt;em&gt;every&lt;/em&gt; call. For example, the briefing_agent loads Gmail, Calendar, and task tools all at once and has complex rules — it's expensive, but it only runs once a day. Before, all those definitions were loaded on every single call. Now they only load when needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Monthly Cost Impact
&lt;/h3&gt;

&lt;p&gt;Estimating with ~100 calls per day:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Main agent fixed cost reduction (every call)]
  4,000 tokens/call × 100 calls/day × 30 days = 12,000,000 tokens/month

[Sub-agent additional cost (only when invoked)]
  Assuming ~60% of calls (60/day) trigger one sub-agent
  Average 900 tokens/call × 60 calls/day × 30 days = 1,620,000 tokens/month
  *briefing_agent (~3,100 tokens) runs once/day, calculated separately
  briefing: 3,100 tokens × 30 days = 93,000 tokens/month

[Net savings]
  12,000,000 - 1,620,000 - 93,000 = 10,287,000 tokens/month
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At Claude Haiku 4.5's Bedrock input token rate ($1.10/1M tokens, Japan cross-region inference), that's roughly &lt;strong&gt;$11/month in input token savings&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Other Optimizations
&lt;/h2&gt;

&lt;p&gt;I also made several complementary changes:&lt;/p&gt;

&lt;h3&gt;
  
  
  Conversation Window Reduction
&lt;/h3&gt;

&lt;p&gt;Changed &lt;code&gt;SlidingWindowConversationManager&lt;/code&gt;'s &lt;code&gt;window_size&lt;/code&gt; from 15 to 10.&lt;br&gt;
Savings: $3–5/month&lt;/p&gt;

&lt;h3&gt;
  
  
  LTM Search Result Reduction
&lt;/h3&gt;

&lt;p&gt;Reduced &lt;code&gt;top_k&lt;/code&gt; across LTM strategies (total 18 → 10 results).&lt;br&gt;
Savings: $2–3/month&lt;/p&gt;

&lt;h3&gt;
  
  
  Lightweight Pipeline Agents
&lt;/h3&gt;

&lt;p&gt;For automated tasks like scheduled tweets and news collection, I was using the full main agent. I replaced these with lightweight dedicated agents that share memory but carry only minimal tools.&lt;br&gt;
Savings: $2–3/month&lt;/p&gt;

&lt;h3&gt;
  
  
  Total Savings
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Optimization&lt;/th&gt;
&lt;th&gt;Est. Monthly Savings&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Sub-agent splitting&lt;/td&gt;
&lt;td&gt;$11&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Conversation window reduction&lt;/td&gt;
&lt;td&gt;$3–5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LTM result reduction&lt;/td&gt;
&lt;td&gt;$2–3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lightweight pipeline agents&lt;/td&gt;
&lt;td&gt;$2–3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$18–22&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;So I managed to cut costs to some degree, but it's still expensive...! &lt;br&gt;
If you have any clever cost reduction ideas, I'd love to hear them.&lt;/p&gt;

&lt;p&gt;(Fortunately I was recently selected as an AWS Community Builder, so I'm hoping for some AWS credits!)&lt;/p&gt;

</description>
      <category>aws</category>
      <category>ai</category>
      <category>python</category>
      <category>strandsagents</category>
    </item>
    <item>
      <title>Controlling VRM Character Motions for an AI Agent on the Web</title>
      <dc:creator>yoko / Naoki Yokomachi</dc:creator>
      <pubDate>Sat, 21 Feb 2026 13:00:22 +0000</pubDate>
      <link>https://dev.to/yokomachi/controlling-vrm-character-motions-for-an-ai-agent-on-the-web-3gga</link>
      <guid>https://dev.to/yokomachi/controlling-vrm-character-motions-for-an-ai-agent-on-the-web-3gga</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;This article is an AI-assisted translation of a Japanese technical article.&lt;br&gt;
&lt;a href="https://zenn.dev/yokomachi/articles/202602_vrm-motion-control-on-web" rel="noopener noreferrer"&gt;https://zenn.dev/yokomachi/articles/202602_vrm-motion-control-on-web&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;I'm currently working on a personal AI agent project and decided to use a 3D model as the user interface.&lt;br&gt;
Since I didn't have the knowledge to build everything from scratch, I leveraged &lt;a href="https://github.com/tegnike/aituber-kit" rel="noopener noreferrer"&gt;AITuberKit&lt;/a&gt;, an OSS project I'd been aware of for a while, to quickly set up the frontend.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fukjb50ylfi6h5k0t4c5r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fukjb50ylfi6h5k0t4c5r.png" alt=" " width="800" height="378"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;VRM model creation: &lt;a href="https://vroid.com/studio" rel="noopener noreferrer"&gt;VRoid Studio&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Web frontend: Next.js, TypeScript&lt;/li&gt;
&lt;li&gt;VRM rendering &amp;amp; control: &lt;a href="https://github.com/pixiv/three-vrm" rel="noopener noreferrer"&gt;three-vrm&lt;/a&gt; (v3.0.0), Three.js&lt;/li&gt;
&lt;li&gt;Base kit: &lt;a href="https://github.com/tegnike/aituber-kit" rel="noopener noreferrer"&gt;AITuberKit&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Agent implementation: Strands Agents, Amazon Bedrock AgentCore &lt;em&gt;Not covered in detail in this article&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h1&gt;
  
  
  VRM and VRoid Studio
&lt;/h1&gt;

&lt;p&gt;VRM is a file format designed for 3D avatars.&lt;br&gt;
With &lt;a href="https://vroid.com/studio" rel="noopener noreferrer"&gt;VRoid Studio&lt;/a&gt;, you can create characters and export them in VRM format without any 3D modeling knowledge.&lt;br&gt;
In my case, my only prior experience was creating characters in video games, but I was able to create two models (male and female) in about an hour — that's how easy it is.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://x.com/_cityside/status/2019742015617994773" rel="noopener noreferrer"&gt;https://x.com/_cityside/status/2019742015617994773&lt;/a&gt;&lt;/p&gt;
&lt;h1&gt;
  
  
  What AITuberKit Can Do
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://github.com/tegnike/aituber-kit" rel="noopener noreferrer"&gt;AITuberKit&lt;/a&gt; is an OSS that displays VRM models in a web browser and bundles features like LLM-powered chat, facial expression control, and speech synthesis.&lt;/p&gt;

&lt;p&gt;Here are some of the key features AITuberKit provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;VRM model display, facial expression control, and lip-sync&lt;/li&gt;
&lt;li&gt;LLM-powered chatbot functionality&lt;/li&gt;
&lt;li&gt;Speech synthesis API integration&lt;/li&gt;
&lt;li&gt;YouTube streaming integration&lt;/li&gt;
&lt;li&gt;Multimodal input&lt;/li&gt;
&lt;li&gt;etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For my project, since I'm building it as a personal AI agent, I'm using AITuberKit's base features like VRM display control and chatbot functionality while adding heavy customizations on top.&lt;/p&gt;
&lt;h1&gt;
  
  
  Implementing Motion Control
&lt;/h1&gt;

&lt;p&gt;Here's where we get to the main topic.&lt;br&gt;
AITuberKit supports switching facial expressions (smile, angry face, etc.) out of the box, so I decided to implement additional body motions (bowing, extending a hand, etc.).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://x.com/_cityside/status/2016874430056845502" rel="noopener noreferrer"&gt;https://x.com/_cityside/status/2016874430056845502&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Architecture Overview
&lt;/h2&gt;

&lt;p&gt;Here's the overall picture of the motion control system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LLM Response
  ↓ Streaming parser
  ├─ [emotion] Emotion tag → ExpressionController → Facial expression control
  └─ [bow/present] Motion tag → GestureController → Bone control
                                         ↑
                                    EmoteController (conflict resolution)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;EmoteController&lt;/code&gt; sits between facial expressions and motions to handle conflicts between them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Motion Definitions
&lt;/h2&gt;

&lt;p&gt;Motions are implemented by defining bone rotations as keyframes.&lt;/p&gt;

&lt;p&gt;Here's an example definition for a bow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/features/emoteController/gestureController.ts&lt;/span&gt;
&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;BoneRotation&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;bone&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;VRMHumanBoneName&lt;/span&gt;
  &lt;span class="nx"&gt;rotation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;THREE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Quaternion&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;GestureKeyframe&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;
  &lt;span class="nx"&gt;bones&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;BoneRotation&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;GestureDefinition&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;keyframes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;GestureKeyframe&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;
  &lt;span class="nx"&gt;holdDuration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;
  &lt;span class="nx"&gt;closeEyes&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For the bow motion, three bones — spine, chest, and neck — are each rotated forward to create a more natural-looking bow rather than simply bending at the waist.&lt;br&gt;
The arm bones are also adjusted to achieve a natural posture.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/features/emoteController/gestureController.ts&lt;/span&gt;
&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_gestures&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;bow&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;keyframes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;bones&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;bone&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;spine&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;rotation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;THREE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Quaternion&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;setFromEuler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;THREE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Euler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;bone&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;chest&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;rotation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;THREE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Quaternion&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;setFromEuler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;THREE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Euler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;bone&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;neck&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;rotation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;THREE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Quaternion&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;setFromEuler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;THREE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Euler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="c1"&gt;// Arm bones are also adjusted (omitted)&lt;/span&gt;
      &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;holdDuration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;closeEyes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Close eyes during the bow&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Triggering Motions from LLM Responses
&lt;/h1&gt;

&lt;p&gt;The character's expressions are controlled by having the LLM output emotion and motion tags in its responses.&lt;/p&gt;

&lt;p&gt;Emotion tags are implemented by default in AITuberKit. The LLM response looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[happy]Thank you so much!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Motion tags are a custom addition. They appear in the response just like emotion tags:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Welcome! [bow]What kind of fragrance are you looking for today?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When both emotion and motion tags appear simultaneously, both are triggered.&lt;br&gt;
For example, &lt;code&gt;[happy][bow]&lt;/code&gt; results in the character bowing with a smile.&lt;/p&gt;

&lt;p&gt;The system prompt includes the following instructions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="s2"&gt;`
## Emotional Expression
The format for conversation text is as follows. Choose the single most appropriate emotion for the entire response and prepend the emotion tag at the beginning.
[{neutral|happy|angry|sad|relaxed|surprised}]{conversation text}
`&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Handling Conflicts Between Expressions and Motions
&lt;/h1&gt;

&lt;p&gt;Simply applying both facial expressions and motions at the same time can cause unexpected behavior, so I've added the following controls.&lt;br&gt;
For example, having the eyes open during a bow looked unnatural, so I set &lt;code&gt;closeEyes: true&lt;/code&gt; to close the eyes on the motion control side.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;EmoteController&lt;/code&gt; manages this by passing flags between controllers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/features/emoteController/emoteController.ts&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;updateExpression&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isEmotionActive&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_expressionController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isEmotionActive&lt;/span&gt;
  &lt;span class="c1"&gt;// Skip auto-blink if the motion is closing eyes and expression is neutral&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;skipAutoBlink&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_gestureController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isClosingEyes&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;isEmotionActive&lt;/span&gt;
  &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_expressionController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;skipAutoBlink&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;updateGesture&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isEmotionActive&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_expressionController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isEmotionActive&lt;/span&gt;
  &lt;span class="c1"&gt;// Skip motion eye-close if an emotion expression is active&lt;/span&gt;
  &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_gestureController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;isEmotionActive&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The emotion expressions and the motion's eye-close feature are mutually exclusive.&lt;br&gt;
When the emotion is &lt;code&gt;neutral&lt;/code&gt;, the motion side closes the eyes. When an emotion is active, the motion's eye-close is disabled and control is handed to the expression side.&lt;/p&gt;

&lt;h1&gt;
  
  
  Wrapping Up
&lt;/h1&gt;

&lt;p&gt;Using a chat UI as the frontend for an AI agent is a very common approach, but even a simple model like this feels lively just by having it move around, which makes it really fun.&lt;br&gt;
That said, controlling motions can be quite tricky — figuring out which bones to rotate and by how much is surprisingly difficult.&lt;br&gt;
For more complex motions, you could look into purchasing motion packs, which might be a good option.&lt;/p&gt;

</description>
      <category>vrm</category>
      <category>threejs</category>
      <category>nextjs</category>
      <category>typescript</category>
    </item>
  </channel>
</rss>
