<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Milad Rezaeighale</title>
    <description>The latest articles on DEV Community by Milad Rezaeighale (@miladrezaei).</description>
    <link>https://dev.to/miladrezaei</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2220284%2Feda20ba1-dadc-4f2e-979c-957dc2818bb1.png</url>
      <title>DEV Community: Milad Rezaeighale</title>
      <link>https://dev.to/miladrezaei</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/miladrezaei"/>
    <language>en</language>
    <item>
      <title>MCPfying Tools Securely at Scale with Bedrock AgentCore Gateway</title>
      <dc:creator>Milad Rezaeighale</dc:creator>
      <pubDate>Tue, 03 Feb 2026 09:44:54 +0000</pubDate>
      <link>https://dev.to/aws-builders/mcpfying-tools-securely-at-scale-with-bedrock-agentcore-gateway-e3d</link>
      <guid>https://dev.to/aws-builders/mcpfying-tools-securely-at-scale-with-bedrock-agentcore-gateway-e3d</guid>
      <description>&lt;p&gt;As organizations move from single-agent experiments to production-grade agentic systems, the bottleneck is rarely the model. It’s the tool layer: how teams expose capabilities, how agents discover the right tools, how invocation is standardized across heterogeneous backends, and how governance is enforced consistently as usage scales.&lt;/p&gt;

&lt;p&gt;In this article, I describe an enterprise pattern for “MCP-fying” internal tools using &lt;a href="https://aws.amazon.com/bedrock/agentcore/" rel="noopener noreferrer"&gt;Amazon Bedrock AgentCore Gateway&lt;/a&gt;—treating it as a centralized MCP front door for tool discovery and invocation. The goal is not to wrap one function, but to establish a repeatable approach that reduces duplicated integrations, supports multi-team ownership, and creates a foundation for secure, scalable tool access across the organization.&lt;/p&gt;

&lt;h2&gt;
  
  
  AgentCore Gateway as an enterprise tool layer
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/bedrock/agentcore/" rel="noopener noreferrer"&gt;Amazon Bedrock AgentCore Gateway&lt;/a&gt; is a managed &lt;strong&gt;tool front door&lt;/strong&gt; that exposes organizational capabilities as discoverable, invokable tools through an &lt;strong&gt;MCP-compatible&lt;/strong&gt; interface. Instead of every agent framework integrating separately with every backend service, you register backends behind the gateway as targets, define tool schemas (contracts) once, and let clients interact through one consistent surface.&lt;/p&gt;

&lt;p&gt;Clients typically use three MCP-style operations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tool discovery&lt;/strong&gt; (what tools exist).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool search/filtering&lt;/strong&gt; (find the right tool at scale).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool invocation&lt;/strong&gt; (run a tool with inputs).&lt;/li&gt;
&lt;/ul&gt;
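
&lt;p&gt;On the wire these are JSON-RPC calls. As a rough sketch (the method names &lt;code&gt;tools/list&lt;/code&gt; and &lt;code&gt;tools/call&lt;/code&gt; follow the MCP specification; the tool name and arguments here are hypothetical), an MCP client builds messages like:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import json

def mcp_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 message of the kind MCP clients send."""
    msg = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        msg["params"] = params
    return msg

# Tool discovery: list everything the gateway exposes
list_tools = mcp_request("tools/list")

# Tool invocation: run one tool with inputs (hypothetical name/arguments)
call_tool = mcp_request(
    "tools/call",
    params={"name": "get_weather", "arguments": {"city": "Berlin"}},
    request_id=2,
)

print(json.dumps(list_tools))
print(json.dumps(call_tool))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In practice you rarely build these messages by hand; the MCP client library used later in this article (via Strands) does it for you. Search/filtered discovery is a Gateway capability layered on top of plain discovery.&lt;/p&gt;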

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj8mgr2wyp86dtplyjabd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj8mgr2wyp86dtplyjabd.png" alt="AgentCore Gateway as the MCP tool front door with identity, IAM execution, and observability" width="800" height="385"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Key capabilities of AgentCore Gateway
&lt;/h2&gt;

&lt;p&gt;AgentCore Gateway provides a set of capabilities designed to standardize and simplify tool integration across teams and agent frameworks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unified MCP endpoint&lt;/strong&gt; – A stable entry point that exposes tools through a consistent contract for discovery, search, and invocation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Protocol translation &amp;amp; request routing&lt;/strong&gt; – Converts MCP tool calls into the appropriate backend action and routes requests to the correct target/tool implementation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Composition (many tools, one front door)&lt;/strong&gt; – Aggregates tools from multiple backends so agents integrate once with the gateway instead of many services directly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Targets&lt;/strong&gt; for enterprise backends – Connect common enterprise surfaces as tool targets, such as:
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;AWS Lambda&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OpenAPI-defined APIs&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Smithy models&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;In this article we focus on MCPfying tools with AWS Lambda.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Managed operations&lt;/strong&gt; – Centralizes telemetry and operational visibility (for example via Amazon CloudWatch) for troubleshooting and governance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalable discovery&lt;/strong&gt; – Supports narrowing the toolset at runtime (semantic search / filtered discovery) to reduce tool overload and improve tool selection in large catalogs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OK, enough theory. Let’s build!&lt;/p&gt;

&lt;h2&gt;
  
  
  What we’ll build
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Configure identity (inbound) and Create the Gateway&lt;/li&gt;
&lt;li&gt;Add a Lambda target + IAM execution role (outbound)&lt;/li&gt;
&lt;li&gt;Connect a Strands agent to the Gateway (MCP client)&lt;/li&gt;
&lt;li&gt;Invoke tools through the agent&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The full notebook (end-to-end) is available in &lt;a href="https://github.com/miladrezaei-ai/amazon-bedrock-agentcore-samples/tree/main/AgentCore-gateway/mcpfying-lambda-into-mcp-tools" rel="noopener noreferrer"&gt;my repo&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Configure identity (inbound) and Create the Gateway
&lt;/h2&gt;

&lt;p&gt;For this implementation I use &lt;a href="https://github.com/aws/bedrock-agentcore-starter-toolkit" rel="noopener noreferrer"&gt;bedrock_agentcore_starter_toolkit&lt;/a&gt;, AWS’s starter toolkit for Amazon Bedrock AgentCore. Besides deploying Python agents to AgentCore Runtime with “zero infrastructure” to manage, it provides helper operations for creating Gateways, authorizers, and targets, so you can go from local code to a working setup quickly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import logging
from bedrock_agentcore_starter_toolkit.operations.gateway.client import GatewayClient

client = GatewayClient(region_name=os.environ["AWS_DEFAULT_REGION"])

cognito_authorizer = client.create_oauth_authorizer_with_cognito("agentcore-gateway-test")

# Create Gateway (MCP) and capture identifiers
gateway = client.create_mcp_gateway(authorizer_config=cognito_authorizer["authorizer_config"])
gateway_id = gateway["gatewayId"]
gateway_url = gateway["gatewayUrl"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We’ll need &lt;strong&gt;gateway_id&lt;/strong&gt; and &lt;strong&gt;gateway_url&lt;/strong&gt; later.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Add a Lambda target + IAM execution role (outbound)
&lt;/h2&gt;

&lt;p&gt;When you use &lt;strong&gt;bedrock_agentcore_starter_toolkit&lt;/strong&gt; with &lt;strong&gt;create_mcp_gateway_target&lt;/strong&gt; and no explicit payload, the toolkit automatically provisions an AWS Lambda function as a target that exposes two example tools: get_weather and get_time. You can see that Lambda function in your console.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;lambda_target_1 = client.create_mcp_gateway_target(
    gateway=gateway,
    target_type="lambda"  # helper creates/uses a default lambda + tool schema
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this tutorial, we’ll go a step further by creating a custom Lambda-backed target with a simple tool that returns a random number (Option B in the notebook). To do that, we need to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create an AWS Lambda function and copy its function ARN.&lt;/li&gt;
&lt;li&gt;Create a Gateway target for that Lambda using create_mcp_gateway_target.&lt;/li&gt;
&lt;li&gt;Define the tool schema (contract) so the Gateway knows the tool name, inputs, and what output to expect.&lt;/li&gt;
&lt;/ol&gt;
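
&lt;p&gt;The Lambda side of step 1 can stay tiny. A minimal sketch of the handler backing get_random_number (the exact event and context shape the Gateway passes should be verified against the AgentCore documentation; this only assumes the tool takes no inputs and returns an integer, matching the tool schema defined next):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import random

def lambda_handler(event, context):
    """Handler for the custom Gateway target (sketch).

    The tool declared in the target schema takes no inputs and
    returns an integer, so the handler just returns a random number.
    """
    return random.randint(1, 100)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;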

&lt;p&gt;&lt;strong&gt;Explicit target payload (more realistic)&lt;/strong&gt;&lt;br&gt;
This is the part that matters most for enterprise usage: you’re defining the tool contract (schema) explicitly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;lambda_target_configuration = {
    "lambdaArn": "arn:aws:lambda:REGION:ACCOUNT_ID:function:agentCoreGatewayCustomLambda",
    "toolSchema": {
        "inlinePayload": [
            {
                "name": "get_random_number",
                "description": "Return a random number",
                "inputSchema": {"type": "object", "properties": {}, "required": []},
                "outputSchema": {"type": "integer"},
            }
        ]
    },
}

lambda_target_2 = client.create_mcp_gateway_target(
    gateway=gateway,
    target_type="lambda",
    target_payload=lambda_target_configuration,
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  3. Connect a Strands agent to the Gateway (MCP client)
&lt;/h2&gt;

&lt;p&gt;Cool! Now let’s create an agent with &lt;a href="https://strandsagents.com/latest/" rel="noopener noreferrer"&gt;Strands&lt;/a&gt; and connect it to the AgentCore Gateway as an MCP client. Two important details from the working flow: refresh the access token before connecting (Cognito tokens expire), and keep the MCP client running while the agent is operating (start it once, don’t recreate it per call).&lt;br&gt;
After connecting, your first sanity check is to list the tools exposed by the Gateway. If ListTools doesn’t return what you expect, the issue is usually the Authorization header, the /mcp suffix, or the Gateway target/tool configuration, not the agent itself.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from strands import Agent
from strands.models import BedrockModel
from strands.tools.mcp.mcp_client import MCPClient
from mcp.client.streamable_http import streamablehttp_client

# refresh token
access_token = client.get_access_token_for_cognito(cognito_authorizer["client_info"])

# ensure /mcp suffix
mcp_url = gateway_url if gateway_url.endswith("/mcp") else f"{gateway_url}/mcp"

mcp_client = MCPClient(
    lambda: streamablehttp_client(
        url=mcp_url,
        headers={"Authorization": f"Bearer {access_token}"},
    )
)

mcp_client.start()
tools = mcp_client.list_tools_sync()

# Bedrock model for the agent
model = BedrockModel(model_id="eu.amazon.nova-pro-v1:0")  # choose your model
agent = Agent(model=model, tools=tools)

print("Loaded tools:", agent.tool_names)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You’ll see tools like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;&amp;lt;TargetName&amp;gt;___get_weather&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;&amp;lt;TargetName&amp;gt;___get_time&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;&amp;lt;TargetName&amp;gt;___get_random_number&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
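
&lt;p&gt;The Gateway builds these names by joining the target name and the tool name with a triple underscore. A small helper (hypothetical, but matching the naming shown above) makes them easy to split when you want to route or log per target:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def split_gateway_tool_name(full_name):
    """Split a Gateway tool name of the form 'TargetName___tool_name'."""
    target, sep, tool = full_name.partition("___")
    if not sep:
        # No target prefix present; treat the whole string as the tool name
        return None, full_name
    return target, tool

print(split_gateway_tool_name("TestGatewayTargetc7c8080f___get_time"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;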

&lt;h2&gt;
  
  
  4. Invoke tools through the agent
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;response = agent("Get the time for ECT")
print(response)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should be able to see this in the output:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt; The user has requested the time for ECT, which stands for Eastern Caribbean Time. I need to use the &lt;code&gt;TestGatewayTargetc7c8080f___get_time&lt;/code&gt; tool to get the current time for this timezone.  Tool #1: TestGatewayTargetc7c8080f___get_time&lt;br&gt;
The current time in Eastern Caribbean Time (ECT) is 2:30 PM.Response:  The current time in Eastern Caribbean Time (ECT) is 2:30 PM.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This validates the full chain:&lt;br&gt;
&lt;strong&gt;Agent&lt;/strong&gt; → &lt;strong&gt;MCP client&lt;/strong&gt; → &lt;strong&gt;AgentCore Gateway&lt;/strong&gt; → &lt;strong&gt;Lambda target&lt;/strong&gt; → &lt;strong&gt;tool response&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cleanup (important)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here we delete the Gateway and its targets, and remove the Cognito user pool/domain created for the tutorial to avoid leaving unused resources behind (and any potential costs).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ognito_idp = boto3.client("cognito-idp")

# Stop the MCP client if it's running
try:
    if streamable_http_mcp_client.is_running():
        streamable_http_mcp_client.stop()
        print("✓ MCP client stopped")
except:
    pass

# Deletinbg User Pool
try:
    cognito_idp = boto3.client("cognito-idp")
    cognito_idp.delete_user_pool_domain(
        UserPoolId=COGNITO_USER_POOL_ID,
        Domain=DOMAIN_PREFIX 
    )
    print(f"✓ Deleted Cognito domain: {DOMAIN_PREFIX}")

    cognito_idp.delete_user_pool(UserPoolId=COGNITO_USER_POOL_ID)
    print(f"✅ Deleted Cognito user pool: {COGNITO_USER_POOL_ID}")

except Exception as e:
    msg = str(e).lower()
    if "notfound" in msg or "not found" in msg:
        print("ℹ️ Cognito user pool already deleted (nothing to clean up).")
    else:
        print(f"❌ Failed to delete Cognito user pool: {e}")
        raise

# Deleting the gateway and its targets
try:
    client.cleanup_gateway(gatewayID)
    print("✅ Cleanup complete! (gateway + targets deleted)")
except Exception as e:
    msg = str(e).lower()
    if "notfound" in msg or "not found" in msg:
        print("ℹ️ Gateway already deleted (nothing to clean up).")
    else:
        print(f"❌ Cleanup failed: {e}")
        raise
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Practical notes from this implementation&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tool schema is the real product surface.&lt;/strong&gt; Treat it like an API contract (names, descriptions, input schema quality).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Target naming matters.&lt;/strong&gt; The final tool name includes the target name prefix; keep it stable and readable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keep the MCP client session alive&lt;/strong&gt; while the agent is running, otherwise tool calls will fail.&lt;/li&gt;
&lt;li&gt;You get enterprise-friendly operational behavior because the tool access surface is centralized (and can be governed consistently later).&lt;/li&gt;
&lt;/ul&gt;
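
&lt;p&gt;Since the tool schema is a contract, it pays to enforce it mechanically at call time. A minimal, dependency-free sketch that checks the required properties of an inputSchema before invoking a tool (a production setup would use a full JSON Schema validator instead; the schema here is hypothetical):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def check_required_inputs(input_schema, arguments):
    """Return the list of required properties missing from a tool call."""
    required = input_schema.get("required", [])
    return [name for name in required if name not in arguments]

schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
    "required": ["city"],
}

print(check_required_inputs(schema, {}))                  # missing 'city'
print(check_required_inputs(schema, {"city": "Berlin"}))  # nothing missing
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;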

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;This article shows how &lt;a href="https://aws.amazon.com/blogs/machine-learning/introducing-amazon-bedrock-agentcore-gateway-transforming-enterprise-ai-agent-tool-development/" rel="noopener noreferrer"&gt;Amazon Bedrock AgentCore Gateway&lt;/a&gt; acts as an enterprise “tool front door”: agents don’t integrate with every backend directly—instead they connect once over &lt;a href="https://modelcontextprotocol.io/docs/getting-started/intro" rel="noopener noreferrer"&gt;Model Context Protocol&lt;/a&gt; to discover, search, and invoke tools through a single stable endpoint.&lt;/p&gt;

&lt;p&gt;The article then walks through a practical build: creating a gateway, wiring identity (inbound auth via &lt;a href="https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/identity-overview.html" rel="noopener noreferrer"&gt;AgentCore Identity&lt;/a&gt; and an IdP like &lt;a href="https://aws.amazon.com/pm/cognito/" rel="noopener noreferrer"&gt;Amazon Cognito&lt;/a&gt;), connecting an AWS Lambda target with an IAM execution role (outbound auth), defining tool schemas (contracts), and validating everything end-to-end by connecting a &lt;a href="https://strandsagents.com/latest/" rel="noopener noreferrer"&gt;Strands Agents&lt;/a&gt; MCP client to list tools and run invocations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/miladrezaei-ai/amazon-bedrock-agentcore-samples/tree/935d75fcd57b173f942ae6c1b6a677560ff8279c" rel="noopener noreferrer"&gt;My GitHub (full notebook + code)&lt;br&gt;
&lt;/a&gt;&lt;/p&gt;

</description>
      <category>agents</category>
      <category>aws</category>
      <category>mcp</category>
      <category>agentcore</category>
    </item>
    <item>
      <title>Amazon Bedrock AgentCore Setup Confusion: Which IAM Role Do I Need?</title>
      <dc:creator>Milad Rezaeighale</dc:creator>
      <pubDate>Wed, 07 Jan 2026 13:14:32 +0000</pubDate>
      <link>https://dev.to/aws-builders/amazon-bedrock-agentcore-setup-confusion-which-iam-role-do-i-need-1pk1</link>
      <guid>https://dev.to/aws-builders/amazon-bedrock-agentcore-setup-confusion-which-iam-role-do-i-need-1pk1</guid>
      <description>&lt;p&gt;If you’re trying to deploy an agent into Amazon Bedrock AgentCore Runtime and you see a CLI flag like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;agentcore configure --entrypoint my_agent.py -er &amp;lt;YOUR_IAM_ROLE_ARN&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;…it’s easy to get stuck.&lt;/p&gt;

&lt;p&gt;Because the value you pass to &lt;strong&gt;-er&lt;/strong&gt; is not your IAM user, and it’s not your SSO role. It’s a separate Execution Role that AgentCore Runtime assumes to run your agent.&lt;/p&gt;

&lt;p&gt;Even after publishing my earlier article on &lt;a href="https://dev.to/aws-builders/from-demos-to-business-value-taking-agents-to-production-with-amazon-bedrock-agentcore-2pdj"&gt;building an agent with AgentCore&lt;/a&gt;, I noticed there’s still a common point of confusion for many people. So I decided to write this article and explain what role you need to create!&lt;/p&gt;

&lt;p&gt;Once you create that role correctly, deployment becomes straightforward.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This guide is based on the official AWS documentation for AgentCore Runtime permissions: &lt;a href="https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/runtime-permissions.html" rel="noopener noreferrer"&gt;IAM Permissions for AgentCore Runtime&lt;br&gt;
&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What you actually need (2 identities)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1) Your “caller identity”&lt;/strong&gt;&lt;br&gt;
This is the identity you use to run the CLI:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;IAM User, or&lt;/li&gt;
&lt;li&gt;SSO Role (IAM Identity Center)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This identity needs permission to deploy and configure resources, and often &lt;code&gt;iam:PassRole&lt;/code&gt; so it can hand the execution role to AgentCore.&lt;/p&gt;
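
&lt;p&gt;As a sketch, the PassRole statement on the caller identity typically looks like the following (the account ID and role-name pattern are placeholders; the &lt;code&gt;iam:PassedToService&lt;/code&gt; condition scopes it to the AgentCore service principal, which is worth verifying against the current AWS docs):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowPassExecutionRole",
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::123456789012:role/AgentCoreRuntimeExecutionRole-*",
      "Condition": {
        "StringEquals": {
          "iam:PassedToService": "bedrock-agentcore.amazonaws.com"
        }
      }
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;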

&lt;p&gt;&lt;strong&gt;2) The “AgentCore Runtime execution role” (the important one)&lt;/strong&gt;&lt;br&gt;
This is the role AgentCore uses at runtime to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;pull images from ECR (if applicable),&lt;/li&gt;
&lt;li&gt;write logs to CloudWatch,&lt;/li&gt;
&lt;li&gt;send traces to X-Ray,&lt;/li&gt;
&lt;li&gt;publish metrics,&lt;/li&gt;
&lt;li&gt;call Bedrock models,&lt;/li&gt;
&lt;li&gt;get workload access tokens.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the ARN you pass via &lt;strong&gt;-er&lt;/strong&gt;.&lt;/p&gt;
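
&lt;p&gt;A quick sanity check before deploying can save a confusing error: the value must look like an IAM &lt;em&gt;role&lt;/em&gt; ARN, not a user or assumed-role ARN. A rough, hypothetical validator:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import re

# Rough shape of an IAM role ARN (sketch; partition fixed to "aws")
ROLE_ARN_RE = re.compile(r"^arn:aws:iam::\d{12}:role/[\w+=,.@/-]+$")

def looks_like_role_arn(value):
    """Quick sanity check for the value passed to -er."""
    return bool(ROLE_ARN_RE.match(value))

print(looks_like_role_arn("arn:aws:iam::123456789012:role/AgentCoreRuntimeExecutionRole"))  # True
print(looks_like_role_arn("arn:aws:sts::123456789012:assumed-role/MySSORole/session"))      # False
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;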
&lt;h2&gt;
  
  
  Step-by-step: Create the AgentCore Runtime Execution Role in AWS Console
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Step 1 — Create the Role&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to AWS Console → IAM&lt;/li&gt;
&lt;li&gt;Click Roles → Create role&lt;/li&gt;
&lt;li&gt;Choose Custom trust policy&lt;/li&gt;
&lt;li&gt;Paste this trust policy (replacing 123456789012 with your account ID and us-east-1 with your region):
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "Version":"2012-10-17",
  "Statement": [
    {
      "Sid": "AssumeRolePolicy",
      "Effect": "Allow",
      "Principal": {
        "Service": "bedrock-agentcore.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "123456789012"
        },
        "ArnLike": {
          "aws:SourceArn": "arn:aws:bedrock-agentcore:us-east-1:123456789012:*"
        }
      }
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
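
&lt;p&gt;If you prefer scripting over the console, the same trust policy can be generated and passed to boto3 (a sketch; the &lt;code&gt;create_role&lt;/code&gt; call is left commented out because it needs live AWS credentials):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import json

def build_trust_policy(account_id, region):
    """Render the AgentCore Runtime trust policy for one account/region."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AssumeRolePolicy",
                "Effect": "Allow",
                "Principal": {"Service": "bedrock-agentcore.amazonaws.com"},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"aws:SourceAccount": account_id},
                    "ArnLike": {
                        "aws:SourceArn": f"arn:aws:bedrock-agentcore:{region}:{account_id}:*"
                    },
                },
            }
        ],
    }

policy = build_trust_policy("123456789012", "us-east-1")
# boto3.client("iam").create_role(
#     RoleName="AgentCoreRuntimeExecutionRole",
#     AssumeRolePolicyDocument=json.dumps(policy),
# )
print(json.dumps(policy, indent=2))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;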


&lt;ol&gt;
&lt;li&gt;Name the role something clear, for example &lt;strong&gt;AgentCoreRuntimeExecutionRole-&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Create the role.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Step 2 — Attach the correct permissions policy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is where most people get confused.&lt;/p&gt;

&lt;p&gt;You want the policy titled “AgentCore Runtime execution role” (NOT the “direct deploy execution role”, and NOT the “starter toolkit” caller policy).&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open the role you just created&lt;/li&gt;
&lt;li&gt;Go to &lt;strong&gt;Permissions tab&lt;/strong&gt; &lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Add permissions&lt;/strong&gt; → &lt;strong&gt;Create inline policy&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Choose &lt;strong&gt;JSON&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Paste the following policy JSON:
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ECRImageAccess",
            "Effect": "Allow",
            "Action": [
                "ecr:BatchGetImage",
                "ecr:GetDownloadUrlForLayer"
            ],
            "Resource": [
                "arn:aws:ecr:us-east-1:123456789012:repository/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:DescribeLogStreams",
                "logs:CreateLogGroup"
            ],
            "Resource": [
                "arn:aws:logs:us-east-1:123456789012:log-group:/aws/bedrock-agentcore/runtimes/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:DescribeLogGroups"
            ],
            "Resource": [
                "arn:aws:logs:us-east-1:123456789012:log-group:*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:logs:us-east-1:123456789012:log-group:/aws/bedrock-agentcore/runtimes/*:log-stream:*"
            ]
        },
        {
            "Sid": "ECRTokenAccess",
            "Effect": "Allow",
            "Action": [
                "ecr:GetAuthorizationToken"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "xray:PutTraceSegments",
                "xray:PutTelemetryRecords",
                "xray:GetSamplingRules",
                "xray:GetSamplingTargets"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Effect": "Allow",
            "Resource": "*",
            "Action": "cloudwatch:PutMetricData",
            "Condition": {
                "StringEquals": {
                    "cloudwatch:namespace": "bedrock-agentcore"
                }
            }
        },
        {
            "Sid": "GetAgentAccessToken",
            "Effect": "Allow",
            "Action": [
                "bedrock-agentcore:GetWorkloadAccessToken",
                "bedrock-agentcore:GetWorkloadAccessTokenForJWT",
                "bedrock-agentcore:GetWorkloadAccessTokenForUserId"
            ],
            "Resource": [
                "arn:aws:bedrock-agentcore:us-east-1:123456789012:workload-identity-directory/default",
                "arn:aws:bedrock-agentcore:us-east-1:123456789012:workload-identity-directory/default/workload-identity/agentName-*"
            ]
        },
        {
            "Sid": "BedrockModelInvocation",
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": [
                "arn:aws:bedrock:*::foundation-model/*",
                "arn:aws:bedrock:us-east-1:123456789012:*"
            ]
        }
    ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ol&gt;
&lt;li&gt;Replace 123456789012 and us-east-1 with your account ID and region.&lt;/li&gt;
&lt;li&gt;Click Next, give the policy a name, and save it.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Step 3 — Copy the Role ARN (this is what -er needs)&lt;/strong&gt;&lt;br&gt;
In IAM → Roles → open your role → copy ARN.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deploy using the role ARN in your CLI&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;agentcore configure --entrypoint my_agent.py -er YOUR-ROLE_ARN
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Please note that &lt;strong&gt;my_agent.py&lt;/strong&gt; must be replaced with your own entry file, the one where you define your AgentCore setup.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Summary&lt;/strong&gt;&lt;br&gt;
The key unlock is understanding:&lt;/p&gt;

&lt;p&gt;✅ -er expects the AgentCore Runtime execution role ARN&lt;br&gt;
❌ It is NOT your user/SSO identity ARN&lt;/p&gt;

&lt;p&gt;Once that role exists (trust + runtime policy), deployment works.&lt;/p&gt;

</description>
      <category>aiagents</category>
      <category>bedrockagentcore</category>
      <category>iamrole</category>
      <category>generativeai</category>
    </item>
    <item>
      <title>From Demos to Business Value: Taking Agents to Production with Amazon Bedrock AgentCore</title>
      <dc:creator>Milad Rezaeighale</dc:creator>
      <pubDate>Mon, 11 Aug 2025 11:05:02 +0000</pubDate>
      <link>https://dev.to/aws-builders/from-demos-to-business-value-taking-agents-to-production-with-amazon-bedrock-agentcore-2pdj</link>
      <guid>https://dev.to/aws-builders/from-demos-to-business-value-taking-agents-to-production-with-amazon-bedrock-agentcore-2pdj</guid>
      <description>&lt;p&gt;We’re living in the era of agents and agentic workflows. Frameworks like &lt;a href="https://www.langchain.com/" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt;, &lt;a href="https://www.llamaindex.ai/" rel="noopener noreferrer"&gt;LlamaIndex&lt;/a&gt;, &lt;a href="https://www.crewai.com/" rel="noopener noreferrer"&gt;CrewAI&lt;/a&gt;, and others make it easier than ever to design complex single- or multi-agent systems that can plan, reason, and act. It’s exciting to see these frameworks powering demos that wow technical teams and spark imagination.&lt;/p&gt;

&lt;p&gt;But here’s the catch: no matter how clever the prompt chaining is, or how impressive the reasoning looks on screen, it doesn’t create real business value until it’s deployed into production and embedded into the company’s workflows. For executives, a polished demo is nice — but a production-ready agent that’s delivering measurable outcomes is what really matters.&lt;/p&gt;

&lt;p&gt;This is where &lt;a href="https://aws.amazon.com/bedrock/agentcore/" rel="noopener noreferrer"&gt;Amazon Bedrock AgentCore&lt;/a&gt; comes in. It enables you to deploy and operate highly effective agents securely, at scale, using any framework or model — including open-source options like LangChain or LlamaIndex. With AgentCore, you can accelerate AI agents into production with the scale, reliability, and security essential for real-world use. It offers tools to enhance agent capabilities, purpose-built infrastructure to scale securely, and controls to ensure trustworthiness. Best of all, its services are composable and framework-agnostic, so you don’t have to choose between open-source flexibility and enterprise-grade robustness.&lt;/p&gt;

&lt;h2&gt;
  
  
  From Theory to Practice
&lt;/h2&gt;

&lt;p&gt;We’ve talked about why production deployment matters and how Amazon Bedrock AgentCore is designed to make it easier, faster, and more secure. Now, without any further explanation, let’s get straight to the point. In this article, we’ll keep things simple by using the &lt;strong&gt;AgentCore Starter Toolkit&lt;/strong&gt;, which is ideal for quick prototyping and testing. In the following steps, I’ll walk you through how to use it to deploy your own agent into production with AgentCore.&lt;/p&gt;

&lt;p&gt;Before starting, ensure your AWS CLI is configured and authenticated. You can either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;AWS SSO&lt;/strong&gt; via &lt;code&gt;aws configure sso&lt;/code&gt;, or&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;access keys&lt;/strong&gt; via &lt;code&gt;aws configure&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This configuration must be done in the same environment where you will run your Python script so that boto3 can authenticate and invoke your Bedrock AgentCore runtime successfully.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1 – Configuration
&lt;/h2&gt;

&lt;p&gt;First, install the Bedrock AgentCore Starter Toolkit. This toolkit gives you a ready-made environment to quickly prototype and test agents before taking them to production.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install bedrock-agentcore-starter-toolkit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once installed, you’ll have access to CLI commands and project templates that speed up setup so you can focus on building and deploying your agent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2 – Create Your Project Folder
&lt;/h2&gt;

&lt;p&gt;Next, set up a simple project structure for your agent. This will keep your code, dependencies, and package definition organized for deployment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Project Folder Structure&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;your_project_directory/
├── my_agent.py     
├── requirements.txt     
└── __init__.py          

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;File Contents&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;my_agent.py&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from strands import Agent
from bedrock_agentcore.runtime import BedrockAgentCoreApp
from strands.models import BedrockModel
import json

model_id = "eu.anthropic.claude-3-7-sonnet-20250219-v1:0"
model = BedrockModel(
    model_id=model_id,
)

agent = Agent(
    model=model
)

app = BedrockAgentCoreApp()

@app.entrypoint
def invoke(payload):
    """
    Invoke the agent with a payload
    """
    user_input = payload.get("prompt")
    print("User input:", user_input)
    response = agent(user_input)
    return response.message['content'][0]['text']

if __name__ == "__main__":
    app.run()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;requirements.txt&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;strands-agents
bedrock-agentcore
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This minimal setup defines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;my_agent.py — where your agent’s logic lives and integrates with AgentCore.&lt;/li&gt;
&lt;li&gt;requirements.txt — listing dependencies so they can be installed in the runtime environment.&lt;/li&gt;
&lt;li&gt;__init__.py — ensures the folder is treated as a Python package.&lt;/li&gt;
&lt;/ul&gt;
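If you prefer to script this setup, a small stdlib-only helper can create the layout above (file names and requirements contents are taken from this guide; the helper itself is just a convenience, not part of the toolkit):

```python
from pathlib import Path

def scaffold_agent_project(root: str) -> Path:
    """Create the minimal AgentCore project layout shown above."""
    project = Path(root)
    project.mkdir(parents=True, exist_ok=True)
    # Entrypoint and package marker start empty; my_agent.py gets your agent code.
    (project / "my_agent.py").touch()
    (project / "__init__.py").touch()
    # Dependencies are installed into the runtime container at deploy time.
    (project / "requirements.txt").write_text("strands-agents\nbedrock-agentcore\n")
    return project

if __name__ == "__main__":
    created = scaffold_agent_project("my_agent_project")
    print(sorted(p.name for p in created.iterdir()))
```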

&lt;h2&gt;
  
  
  Step 3 – Configure Your Agent
&lt;/h2&gt;

&lt;p&gt;Before deploying, you need to tell the Starter Toolkit which IAM role your agent should use when running in production. This role must have the necessary AgentCore Runtime permissions (see &lt;a href="https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/runtime-permissions.html" rel="noopener noreferrer"&gt;Permissions for AgentCore Runtime&lt;/a&gt;).&lt;br&gt;
Run the &lt;strong&gt;agentcore configure&lt;/strong&gt; command shown in the next step, replacing &lt;strong&gt;YOUR_IAM_ROLE_ARN&lt;/strong&gt; with the ARN of your IAM role.&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 4 – Deploy Your Agent
&lt;/h2&gt;

&lt;p&gt;Now that your agent is configured, it’s time to deploy it into production using AgentCore.&lt;/p&gt;

&lt;p&gt;Run the following command, replacing &lt;strong&gt;YOUR_IAM_ROLE_ARN&lt;/strong&gt; with your IAM role ARN:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;agentcore configure --entrypoint my_agent.py -er &amp;lt;YOUR_IAM_ROLE_ARN&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generate a &lt;strong&gt;Dockerfile&lt;/strong&gt; and a &lt;strong&gt;.dockerignore&lt;/strong&gt; file for containerizing your agent&lt;/li&gt;
&lt;li&gt;Create a &lt;strong&gt;.bedrock_agentcore.yaml&lt;/strong&gt; configuration file with your agent’s runtime settings&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While configuring your agent, you’ll be prompted to provide the URI of the Amazon ECR repository where the Docker image will be uploaded. You can either create this repository yourself in the AWS Console and enter its URI, or simply press Enter to have AgentCore create one for you automatically.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsmqoducth60d0wtloww1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsmqoducth60d0wtloww1.png" alt="ECR repository" width="800" height="55"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You will also be prompted to confirm your dependencies; press Enter to let AgentCore use requirements.txt. For authorization, accept the default (no) to keep IAM-based authorization.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fra9lafpsu9soxv9o3ku3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fra9lafpsu9soxv9o3ku3.png" alt="agentCore configuration" width="800" height="278"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After completing the prompts, you’ll see a &lt;strong&gt;configuration summary&lt;/strong&gt; showing your agent name, AWS region, account ID, execution role, ECR repository, and authorization method. The configuration is then saved locally in a &lt;strong&gt;.bedrock_agentcore.yaml&lt;/strong&gt; file for use during deployment.&lt;/p&gt;
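For orientation, the generated file looks roughly like the sketch below. The exact keys can vary by toolkit version, and every value here (agent name, account ID, region, ECR URI) is an illustrative placeholder, not output from a real deployment:

```yaml
# Illustrative sketch of .bedrock_agentcore.yaml -- actual keys may differ by version
default_agent: my_agent
agents:
  my_agent:
    name: my_agent
    entrypoint: my_agent.py
    aws:
      region: eu-central-1
      execution_role: arn:aws:iam::123456789012:role/AgentCoreRuntimeRole
      ecr_repository: 123456789012.dkr.ecr.eu-central-1.amazonaws.com/my-agent
```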

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc0p91pdvirgctki9tyk1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc0p91pdvirgctki9tyk1.png" alt="agentCore configuration" width="800" height="216"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now you’re ready to launch your agent in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5 – Launch Your Agent
&lt;/h2&gt;

&lt;p&gt;With your configuration complete, you can now deploy your agent to AWS with a single command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;agentcore launch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Build&lt;/strong&gt; a Docker image containing your agent code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Push&lt;/strong&gt; the image to Amazon ECR&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Create a Bedrock AgentCore runtime&lt;/strong&gt; in your AWS account&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deploy&lt;/strong&gt; your agent to the cloud so it’s ready for production use&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpnrt4pr9667kk02casbe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpnrt4pr9667kk02casbe.png" alt="agentCore-deployed" width="800" height="244"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once complete, you’ll have a production-ready agent running on Amazon Bedrock AgentCore, fully integrated with your chosen framework and secured by AWS IAM.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 6 – Invoke the Agent
&lt;/h2&gt;

&lt;p&gt;To test our deployed agent, we’ll create a new file named &lt;strong&gt;test.py&lt;/strong&gt; in the same folder as our project and run the invocation from there.&lt;br&gt;
This script sends a natural-language prompt to the agent and processes the streamed response.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import boto3
import json

# Initialize the Bedrock AgentCore client in the same region as your agent
agentcore_client = boto3.client('bedrock-agentcore', region_name='eu-central-1')

# Your Agent Runtime ARN (from the deployment step)
# You can find this in the Bedrock console under your agent’s runtime details,
# or in the deployment confirmation message.
AGENT_RUNTIME = "YOUR_AGENT_RUNTIME_ARN"

# Prompt to send to the agent
PROMPT = "Please explain how I can become a professional football player."

# Invoke the agent
boto3_response = agentcore_client.invoke_agent_runtime(
    agentRuntimeArn=AGENT_RUNTIME,
    qualifier="DEFAULT",
    payload=json.dumps({"prompt": PROMPT})
)

# The response is streamed in chunks; read them all into memory
response_body = boto3_response['response']
all_chunks = [chunk for chunk in response_body]

# Combine chunks into one string
complete_response = b''.join(all_chunks).decode('utf-8')

# Attempt to parse JSON output
try:
    response_json = json.loads(complete_response)
    print(response_json)
except json.JSONDecodeError:
    print("Raw response:")
    print(complete_response)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;How it works:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;boto3.client('bedrock-agentcore')&lt;/strong&gt; – Creates a client to communicate with the AgentCore Runtime service.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;invoke_agent_runtime()&lt;/strong&gt; – Sends the prompt to the agent and streams back the response.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;StreamingBody reading&lt;/strong&gt; – The output is returned in small chunks, which we merge before decoding.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JSON parsing&lt;/strong&gt; – If the response is in JSON format, we parse it; otherwise, we display the raw text.&lt;/li&gt;
&lt;/ul&gt;
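The chunk-assembly and JSON-fallback logic does not depend on boto3, so you can sanity-check it in isolation. A small sketch (the helper name assemble_response is ours, not part of any SDK) that mimics a StreamingBody with a plain list of byte chunks:

```python
import json

def assemble_response(chunks):
    """Join streamed byte chunks, then try JSON before falling back to raw text."""
    complete = b"".join(chunks).decode("utf-8")
    try:
        return json.loads(complete)
    except json.JSONDecodeError:
        return complete  # not JSON: hand back the raw text unchanged

# A JSON payload split across chunks parses back into a Python object:
print(assemble_response([b'{"answer": ', b'"Practice daily."}']))  # {'answer': 'Practice daily.'}
# Plain text is returned as-is:
print(assemble_response([b"plain ", b"text"]))  # plain text
```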

&lt;p&gt;Save the file as test.py in your project folder, then run it from your terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python test.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see the agent’s JSON response (or raw output) printed in the terminal.&lt;/p&gt;

&lt;p&gt;This approach ensures you receive the complete, assembled agent output, whether it’s plain text or structured JSON.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;Amazon Bedrock AgentCore bridges the gap between impressive agent demos and real-world business impact. By following the steps in this guide, you can go from idea to production-ready agent quickly, while leveraging AWS’s scalability, reliability, and security. The sooner your agent moves into production, the sooner it can start delivering measurable value to your business.&lt;/p&gt;

&lt;p&gt;Whether you’re experimenting with a single-agent workflow or orchestrating multi-agent systems, AgentCore gives you the tools to operationalize your ideas with confidence. Now it’s your turn—deploy your agent, test it, and see how it performs in the real world.&lt;/p&gt;

</description>
      <category>agentcore</category>
      <category>amazonbedrock</category>
      <category>amazonwebservices</category>
      <category>deployagents</category>
    </item>
    <item>
      <title>Understanding Amazon Bedrock Pricing: From On-Demand to Fine-Tuning</title>
      <dc:creator>Milad Rezaeighale</dc:creator>
      <pubDate>Mon, 05 May 2025 10:05:48 +0000</pubDate>
      <link>https://dev.to/aws-builders/understanding-amazon-bedrock-pricing-from-on-demand-to-fine-tuning-316d</link>
      <guid>https://dev.to/aws-builders/understanding-amazon-bedrock-pricing-from-on-demand-to-fine-tuning-316d</guid>
      <description>&lt;p&gt;As generative AI continues to revolutionize industries, Amazon Bedrock emerges as a pivotal platform, providing seamless access to a plethora of foundation models (FMs) from leading AI providers such as Anthropic, Meta, Mistral AI, and Amazon itself. Its serverless architecture and unified API simplify the deployment of AI applications. However, understanding its pricing nuances is crucial for optimizing both performance and cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  Model Inference
&lt;/h2&gt;

&lt;p&gt;When utilizing foundation models (FMs) in Amazon Bedrock for inference, there are two primary approaches: &lt;strong&gt;On-Demand&lt;/strong&gt; and &lt;strong&gt;Provisioned Throughput&lt;/strong&gt;. &lt;/p&gt;

&lt;h3&gt;
  
  
  On-Demand
&lt;/h3&gt;

&lt;p&gt;In the On-Demand model, Amazon Bedrock operates on a pay-as-you-go basis, making it ideal for scenarios where usage patterns are unpredictable. For instance, if you're launching a new LLM application without a clear forecast of user engagement, this model offers flexibility without long-term commitments. Each foundation model (FM) available through Bedrock has its own pricing structure based on token usage. When the model is invoked, Bedrock calculates the number of input and output tokens processed and multiplies these by the respective per-token rates defined for that model. &lt;/p&gt;

&lt;h3&gt;
  
  
  Prompt caching
&lt;/h3&gt;

&lt;p&gt;In addition to standard token pricing, Amazon Bedrock also offers a &lt;strong&gt;prompt caching&lt;/strong&gt; feature. This allows repeated prompts within a short window to be served from cache, reducing both latency and cost—especially useful when parts of your input remain the same across multiple requests.&lt;/p&gt;

&lt;p&gt;Let’s take a look at the current pricing for Amazon Nova Micro on the Bedrock pricing page. (Note: Pricing is subject to change, so it’s always a good idea to refer to the &lt;a href="https://aws.amazon.com/bedrock/pricing/" rel="noopener noreferrer"&gt;official AWS Bedrock pricing page&lt;/a&gt; for the latest rates.)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frhdhzl6he6epft8oa8w0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frhdhzl6he6epft8oa8w0.png" alt="Amazon Nova pricing" width="800" height="66"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For example, &lt;strong&gt;Amazon Nova Micro&lt;/strong&gt;—a lightweight text generation model—charges &lt;strong&gt;$0.000035 per 1,000 input tokens and $0.00014 per 1,000 output tokens&lt;/strong&gt; when used in &lt;strong&gt;on-demand mode&lt;/strong&gt;. If a portion of your prompt is cached, the cached input tokens are charged at a reduced rate of &lt;strong&gt;$0.00000875 per 1,000&lt;/strong&gt;, offering substantial savings for repeated instructions or context. When running &lt;strong&gt;batch inference&lt;/strong&gt;, input and output costs drop even further to &lt;strong&gt;$0.0000175&lt;/strong&gt; and &lt;strong&gt;$0.00007 per 1,000 tokens&lt;/strong&gt;, respectively—making it a cost-efficient choice for large-scale jobs. While these prices seem small, they can quickly add up when you’re processing thousands of requests per day.&lt;/p&gt;
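Using the rates quoted above, a quick back-of-the-envelope script makes the trade-offs concrete (the numbers are hard-coded from this example; always check the official pricing page before relying on them):

```python
# Nova Micro example rates in USD per 1,000 tokens, copied from the text above.
RATES = {
    "on_demand": {"input": 0.000035, "output": 0.00014},
    "batch":     {"input": 0.0000175, "output": 0.00007},
}
CACHED_INPUT = 0.00000875  # cached input tokens, per 1,000

def monthly_cost(mode, input_tokens, output_tokens):
    """Token cost in USD for one pricing mode."""
    r = RATES[mode]
    return (input_tokens / 1000) * r["input"] + (output_tokens / 1000) * r["output"]

# 10M input + 2M output tokens per month:
print(round(monthly_cost("on_demand", 10_000_000, 2_000_000), 4))  # 0.63
print(round(monthly_cost("batch", 10_000_000, 2_000_000), 4))      # 0.315

# Caching 8M of the 10M input tokens cuts the input cost further:
saving = (RATES["on_demand"]["input"] - CACHED_INPUT) * 8_000_000 / 1000
print(round(saving, 4))  # 0.21
```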

&lt;p&gt;In addition to text-based models, Amazon Bedrock includes support for image and video generation, with pricing based on output type and quality. For example, generating images through Amazon Nova Canvas or Stability AI models ranges from a few cents depending on resolution and quality level—higher resolutions or premium outputs cost more. &lt;/p&gt;

&lt;h3&gt;
  
  
  Batch processing - potential to reduce inference costs
&lt;/h3&gt;

&lt;p&gt;If you plan to handle a high volume of prompts or images in one scheduled run, batch inference can reduce the cost per token or per image. Say you have 1,000 customer support transcripts to summarize. Instead of sending each document individually, which is both time-consuming and more expensive, you can use batch inference to process them all at once, with each document treated as a separate prompt within a single batch job. The main advantage is the reduced per-token cost compared to on-demand inference, which makes batch processing ideal for scheduled or background jobs that don’t require real-time output.&lt;/p&gt;

&lt;h2&gt;
  
  
  Provisioned Throughput
&lt;/h2&gt;

&lt;p&gt;For applications that require consistent, high-performance inference—especially in production environments—&lt;strong&gt;Provisioned Throughput&lt;/strong&gt; is a valuable option. Unlike the on-demand model where you pay per token, Provisioned Throughput reserves dedicated capacity for your chosen foundation model, ensuring low latency and predictable response times. You are billed for the reserved model units on an hourly basis, regardless of how much you use them, which makes this approach ideal for steady, high-volume workloads. Bedrock also offers discounts based on commitment: the longer the reservation term (e.g., 1-month or 6-month plans), the lower the hourly rate.&lt;/p&gt;
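To decide between the two modes, a break-even estimate helps. Both rates below are purely illustrative placeholders (not AWS prices); plug in the real hourly rate and blended token rate for your model and region:

```python
# Break-even sketch: at what monthly token volume does reserved capacity beat
# on-demand? Both rates below are illustrative placeholders, not AWS prices.
HOURLY_PROVISIONED = 20.0     # hypothetical $/hour for one provisioned model unit
HOURS_PER_MONTH = 730
ON_DEMAND_PER_1K = 0.000175   # hypothetical blended $/1,000 tokens

provisioned_monthly = HOURLY_PROVISIONED * HOURS_PER_MONTH
break_even_tokens = provisioned_monthly / ON_DEMAND_PER_1K * 1000

print(f"Provisioned capacity: ${provisioned_monthly:,.0f}/month")
print(f"Break-even volume: {break_even_tokens:,.0f} tokens/month")
```

Below the break-even volume, on-demand is cheaper; above it, the flat hourly reservation wins, and commitment discounts move the threshold lower still.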

&lt;h2&gt;
  
  
  Which Pricing Model Should You Choose?
&lt;/h2&gt;

&lt;p&gt;If you're just starting out or expect fluctuating usage, &lt;strong&gt;On-Demand&lt;/strong&gt; gives you the flexibility to pay only for what you use—perfect for development, experimentation, or unpredictable traffic. If you’re processing large volumes of requests in scheduled jobs, &lt;strong&gt;Batch Inference&lt;/strong&gt; offers the same flexibility with better cost-efficiency. For steady, production-level workloads that demand consistent performance and low latency, &lt;strong&gt;Provisioned Throughput&lt;/strong&gt; is the most reliable choice, especially when combined with long-term commitments for additional savings.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fine-Tuning: Customizing Models
&lt;/h2&gt;

&lt;p&gt;When you want to fine-tune a model in Amazon Bedrock, the cost structure differs from standard inference and comes with a few additional components:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Training Cost:&lt;/strong&gt; For text models, you’re charged per 1,000 tokens processed during training. For image or multimodal models, pricing is typically based on the number of images used.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Storage Fee:&lt;/strong&gt; After fine-tuning, the custom model is stored in your account, and a monthly storage fee applies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Inference Cost:&lt;/strong&gt; You can’t run fine-tuned models in on-demand mode. Instead, you must use &lt;strong&gt;Provisioned Throughput&lt;/strong&gt;, which is billed hourly—even if the model isn’t actively being used.&lt;/p&gt;

&lt;p&gt;For example, let’s consider fine-tuning Amazon Nova Micro using a small dataset.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fua9181lbciy76nytv3dn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fua9181lbciy76nytv3dn.png" alt="Amazon Nova model fine-tuning pricing" width="800" height="84"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Pricing for model customization (fine-tuning)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Let’s say you’re fine-tuning a model with 100,000 tokens (about 75,000 words, or 150+ pages of content). That’s still on the small side for deep fine-tuning, but it’s a realistic starting point.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Training Cost (One-time)&lt;/strong&gt; You’re charged based on the number of tokens processed during training. → Example: 100,000 tokens × $0.001 per 1,000 tokens = &lt;strong&gt;$0.10 (one-time)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model Storage (Monthly)&lt;/strong&gt; Once the model is fine-tuned, storing it incurs a fixed monthly cost. → Example: &lt;strong&gt;$1.95 per month&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Provisioned Throughput for Inference (Hourly)&lt;/strong&gt; Fine-tuned models must use provisioned throughput—you pay even if no requests are made. → Example: &lt;strong&gt;$108.15 per hour&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
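Putting the three example figures together shows where the money actually goes. A sketch using only the example rates quoted above:

```python
# First-month cost for the fine-tuning example above, using the example rates
# quoted in the text (training per-token rate, storage fee, provisioned hourly rate).
training_tokens = 100_000
training_rate_per_1k = 0.001   # $ per 1,000 training tokens
storage_per_month = 1.95       # $ per month for the stored custom model
provisioned_per_hour = 108.15  # $ per hour while capacity is reserved

training_cost = (training_tokens / 1000) * training_rate_per_1k  # one-time
one_day_inference = provisioned_per_hour * 24                    # a single day of capacity

print(f"Training (one-time): ${training_cost:.2f}")     # $0.10
print(f"Storage (monthly):   ${storage_per_month:.2f}")
print(f"One day provisioned: ${one_day_inference:.2f}") # $2595.60
```

Even one day of provisioned capacity dwarfs the training and storage fees, which is why the hourly throughput commitment deserves the most scrutiny when budgeting for a fine-tuned model.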

&lt;h2&gt;
  
  
  Special Case Pricing: What You Should Know
&lt;/h2&gt;

&lt;p&gt;When exploring Amazon Bedrock's pricing structure, it's essential to be aware of certain exceptional costs that can significantly impact your overall expenditure. Beyond the standard charges for on-demand usage and provisioned throughput, there are additional fees associated with model customization. For instance, fine-tuning a model on your proprietary data incurs costs based on the number of tokens processed during training. Moreover, once a model is fine-tuned, storing it attracts a monthly storage fee. These costs are separate from the inference charges and can accumulate over time, especially if multiple custom models are maintained.&lt;/p&gt;

&lt;p&gt;Another area to consider is the inference of fine-tuned models. Unlike base models that can be used on-demand, fine-tuned models require provisioned throughput, meaning you need to reserve dedicated capacity, which is billed hourly regardless of usage. This can lead to higher costs, particularly if the reserved capacity isn't fully utilized. Additionally, importing models trained outside of Bedrock may involve compatibility evaluations and associated fees. It's crucial to factor in these exceptional costs when planning your AI infrastructure to avoid unexpected charges.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧠 Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Amazon Bedrock offers a flexible and modular pricing structure that adapts to various use cases—from quick experiments to production-grade AI applications. Whether you're using foundation models as-is or customizing them through fine-tuning, understanding the cost breakdown is crucial to optimizing both performance and spend. With the right usage pattern, you can scale your AI applications efficiently without surprises in your billing dashboard.&lt;/p&gt;

</description>
      <category>amazonbedrock</category>
      <category>inferencepricing</category>
      <category>foundationmodels</category>
      <category>bedrockexplained</category>
    </item>
    <item>
      <title>Unifying or Separating Endpoints in Generative AI Applications on AWS</title>
      <dc:creator>Milad Rezaeighale</dc:creator>
      <pubDate>Wed, 27 Nov 2024 21:19:31 +0000</pubDate>
      <link>https://dev.to/aws-builders/unifying-or-separating-endpoints-in-generative-ai-applications-on-aws-g2g</link>
      <guid>https://dev.to/aws-builders/unifying-or-separating-endpoints-in-generative-ai-applications-on-aws-g2g</guid>
      <description>&lt;p&gt;When building generative AI applications on AWS, one critical decision is how to manage multiple components. For example, you might have a &lt;strong&gt;retrieval-augmented generation (RAG)&lt;/strong&gt; pipeline for context and a &lt;strong&gt;fine-tuned model&lt;/strong&gt; for specific tasks. Should these components share a single endpoint, or should you give each one its own? Both approaches have their pros and cons, and the right choice depends on your use case.&lt;/p&gt;

&lt;p&gt;In this article, I’ll break down the &lt;strong&gt;unified endpoint&lt;/strong&gt; vs. &lt;strong&gt;separated endpoint&lt;/strong&gt; designs, so you can make an informed decision for your architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Unified Endpoint Approach
&lt;/h2&gt;

&lt;p&gt;With a unified endpoint, you deploy a single API Gateway and route requests to the appropriate model based on paths, methods, or query parameters.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpy7eun0o97kp9itcwloz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpy7eun0o97kp9itcwloz.png" alt="The Unified Endpoint Approach" width="800" height="363"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here’s how it works:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use a single API Gateway, like &lt;a href="https://api.example.com" rel="noopener noreferrer"&gt;https://api.example.com&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Backend logic (usually a Lambda function) handles routing. For instance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;POST /rag routes traffic to the RAG pipeline.&lt;/li&gt;
&lt;li&gt;POST /fine-tuned invokes the fine-tuned model.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
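As a sketch of that routing layer, a minimal Lambda-style handler for the two paths above could look like this. The event shape follows the standard API Gateway proxy format, and invoke_rag / invoke_fine_tuned are placeholders for your real pipeline calls:

```python
import json

def invoke_rag(body):         # placeholder for the real RAG pipeline call
    return {"pipeline": "rag", "echo": body}

def invoke_fine_tuned(body):  # placeholder for the real fine-tuned model call
    return {"pipeline": "fine-tuned", "echo": body}

ROUTES = {
    ("POST", "/rag"): invoke_rag,
    ("POST", "/fine-tuned"): invoke_fine_tuned,
}

def handler(event, context=None):
    """Route an API Gateway proxy event to the matching backend."""
    key = (event.get("httpMethod"), event.get("path"))
    target = ROUTES.get(key)
    if target is None:
        return {"statusCode": 404, "body": json.dumps({"error": "unknown route"})}
    result = target(json.loads(event.get("body") or "{}"))
    return {"statusCode": 200, "body": json.dumps(result)}

# POST /rag is dispatched to the RAG placeholder:
print(handler({"httpMethod": "POST", "path": "/rag", "body": '{"q": "hi"}'}))
```

Adding a new model is then just one more entry in the route table, which is the flexibility advantage described below.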

&lt;h2&gt;
  
  
  Why Choose Unified?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Cost-Effective:&lt;/strong&gt; Operating one gateway is cheaper than managing multiple.&lt;br&gt;
&lt;strong&gt;Simplified Integration:&lt;/strong&gt; Clients use one URL for all requests, reducing complexity.&lt;br&gt;
&lt;strong&gt;Flexible:&lt;/strong&gt; Adding new routes for additional models or services is straightforward.&lt;/p&gt;

&lt;h2&gt;
  
  
  Potential Drawbacks
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Routing Overhead:&lt;/strong&gt; You need backend logic to manage and direct requests.&lt;br&gt;
&lt;strong&gt;Shared Bottlenecks:&lt;/strong&gt; High traffic to one pipeline might impact the other unless autoscaling is configured carefully.&lt;br&gt;
Unified endpoints are great for early-stage projects or MVPs where simplicity and cost savings matter most.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Separated Endpoint Approach
&lt;/h2&gt;

&lt;p&gt;In a separated design, each model gets its own API Gateway. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://rag.example.com" rel="noopener noreferrer"&gt;https://rag.example.com&lt;/a&gt; for the RAG pipeline.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://fine-tuned.example.com" rel="noopener noreferrer"&gt;https://fine-tuned.example.com&lt;/a&gt; for the fine-tuned model.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9qmehr3hh94m8m2j3yrr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9qmehr3hh94m8m2j3yrr.png" alt="The Separated Endpoint Approach" width="800" height="375"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Choose Separated?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Scalability:&lt;/strong&gt; Each gateway can scale independently, ensuring reliable performance.&lt;br&gt;
&lt;strong&gt;Reliability:&lt;/strong&gt; Issues in one model don’t affect the other.&lt;br&gt;
&lt;strong&gt;No Routing Logic:&lt;/strong&gt; Each gateway directly connects to its respective model, simplifying backend code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Trade-Offs
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Higher Costs:&lt;/strong&gt; Operating multiple gateways adds to your AWS bill.&lt;br&gt;
&lt;strong&gt;More Complex Integration:&lt;/strong&gt; Clients need to manage multiple URLs, which can complicate development.&lt;/p&gt;

&lt;p&gt;Separated endpoints are ideal for production systems with high traffic or strict performance requirements.&lt;/p&gt;

&lt;h2&gt;
  
  
  Which Approach Is Right for You?
&lt;/h2&gt;

&lt;p&gt;It depends on your application’s stage and requirements:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Unified Endpoints If:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You’re in the early stages or building an MVP.&lt;/li&gt;
&lt;li&gt;Traffic for both models is predictable and not too high.&lt;/li&gt;
&lt;li&gt;Cost savings and simplicity are top priorities.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use Separated Endpoints If:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your application handles high traffic or requires independent scaling.&lt;/li&gt;
&lt;li&gt;Reliability and modularity are critical.&lt;/li&gt;
&lt;li&gt;You’re running a production-grade system with strict SLAs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  A Hybrid Approach?
&lt;/h2&gt;

&lt;p&gt;In many cases, starting with a unified endpoint and transitioning to separated endpoints as your app scales can be the best option. This approach lets you balance simplicity and cost in the beginning with scalability and performance later on.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Architecting generative AI applications on AWS involves trade-offs, and there’s no one-size-fits-all solution. Unified endpoints keep things simple and cost-effective for small or early-stage projects, while separated endpoints shine in production systems with demanding workloads.&lt;/p&gt;

&lt;p&gt;If you’re just starting out, consider trying a unified endpoint and evolving your architecture as needed. AWS services like API Gateway and Lambda give you the flexibility to adapt and scale your design over time.&lt;/p&gt;

&lt;p&gt;What’s your preference—unified or separated endpoints? Let’s discuss in the comments below!&lt;/p&gt;

</description>
      <category>llmops</category>
      <category>fmops</category>
      <category>machinelearning</category>
      <category>amazonwebservices</category>
    </item>
    <item>
      <title>Fine-Tuning and Deploying Custom AI Models on Amazon Bedrock: A Practical Guide</title>
      <dc:creator>Milad Rezaeighale</dc:creator>
      <pubDate>Mon, 25 Nov 2024 12:32:14 +0000</pubDate>
      <link>https://dev.to/aws-builders/fine-tuning-and-deploying-custom-ai-models-on-amazon-bedrock-a-practical-guide-39m6</link>
      <guid>https://dev.to/aws-builders/fine-tuning-and-deploying-custom-ai-models-on-amazon-bedrock-a-practical-guide-39m6</guid>
      <description>&lt;p&gt;In the rapidly evolving field of Generative AI, the ability to fine-tune and deploy custom models is a crucial skill that enables businesses to tailor solutions to their unique needs. &lt;a href="https://aws.amazon.com/bedrock/" rel="noopener noreferrer"&gt;Amazon Bedrock&lt;/a&gt;, a powerful service within the Amazon Web Services (AWS) ecosystem, simplifies this process by offering a robust platform for building, fine-tuning, and deploying large language models (LLMs). Whether you’re looking to enhance a model's performance for a specific task or deploy it at scale, Amazon Bedrock provides the tools and infrastructure to do so efficiently.&lt;/p&gt;

&lt;p&gt;Amazon Bedrock provides a seamless environment for fine-tuning and deploying these models, simplifying what can often be a complex process. If you're new to the concept of fine-tuning or want to delve deeper into its mechanics, I highly recommend &lt;a href="https://towardsdatascience.com/stepping-out-of-the-comfort-zone-through-domain-adaptation-a-deep-dive-into-dynamic-prompting-4860c6d16224" rel="noopener noreferrer"&gt;A Deep Dive into Fine-Tuning&lt;/a&gt; which offers an excellent explanation.&lt;/p&gt;

&lt;p&gt;In this article, I will guide you through the process of fine-tuning a language model using Amazon Bedrock. We'll focus on the most critical sections of the code, providing a clear understanding of the key components and steps involved in the fine-tuning process. The goal is to highlight the essential elements so you can grasp how the general workflow is implemented, without diving into every line of code.&lt;/p&gt;

&lt;p&gt;For those who want to dive directly into the code or explore it further, the complete implementation is available in &lt;a href="https://github.com/miladrezaei-ai/bedrock-custom-model-finetuning" rel="noopener noreferrer"&gt;my GitHub repository&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use Case: Summarizing Doctor-Patient Dialogues
&lt;/h2&gt;

&lt;p&gt;For this example, we'll focus on a dataset containing doctor-patient dialogues sourced from the &lt;a href="https://github.com/microsoft/clinical_visit_note_summarization_corpus" rel="noopener noreferrer"&gt;ACI-Bench dataset&lt;/a&gt;. Our task is to train the model to summarize these dialogues into structured clinical notes. The foundation model selected for this fine-tuning is &lt;strong&gt;Cohere's command-light-text-v14&lt;/strong&gt;, which excels at generating concise and coherent text summaries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Objective&lt;/strong&gt;: In this walkthrough, we will:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Set up the necessary AWS resources.&lt;/li&gt;
&lt;li&gt;Prepare and upload the fine-tuning dataset to S3.&lt;/li&gt;
&lt;li&gt;Create and submit a fine-tuning job.&lt;/li&gt;
&lt;li&gt;Purchase provisioned throughput.&lt;/li&gt;
&lt;li&gt;Test our fine-tuned model.&lt;/li&gt;
&lt;li&gt;Clean up.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Step 1: Set up the necessary AWS resources
&lt;/h2&gt;

&lt;p&gt;Before we begin, we need to ensure we have the necessary AWS SDK installed and configured. We'll use &lt;a href="https://aws.amazon.com/sdk-for-python/" rel="noopener noreferrer"&gt;boto3&lt;/a&gt;, the AWS SDK for Python, to interact with various AWS services:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import boto3
import json
import os
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: Prepare and upload the fine-tuning dataset to S3
&lt;/h2&gt;

&lt;p&gt;In this step, we prepare the dataset by formatting it into the &lt;a href="https://jsonlines.org/" rel="noopener noreferrer"&gt;JSON Lines (JSONL)&lt;/a&gt; structure required for fine-tuning on Amazon Bedrock. Each line in the JSONL file must include a &lt;strong&gt;Prompt&lt;/strong&gt; and a &lt;strong&gt;Completion&lt;/strong&gt; field.&lt;br&gt;
&lt;/p&gt;
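&lt;p&gt;The conversion loop below iterates over a pandas DataFrame named &lt;code&gt;train_dataset&lt;/code&gt;. To make the formatting concrete, here is a self-contained illustration of the same prompt/completion shaping applied to a single invented stand-in row (the real DataFrame comes from the ACI-Bench files):&lt;/p&gt;

```python
import json
import pandas as pd

# Invented single-row stand-in for the ACI-Bench training split;
# the real dataset has 'dialogue' and 'note' columns as used below
train_dataset = pd.DataFrame({
    "dialogue": ["[doctor] How are you?\n[patient] Dizzy when standing up."],
    "note": ["Patient reports orthostatic dizziness."],
})

# Same shaping logic as the conversion loop, applied to one row
row = train_dataset.iloc[0]
entry = {
    "completion": row["note"],
    "prompt": f"Summarize the following conversation.\n\n{row['dialogue']}",
}
print(json.dumps(entry))
```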

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Define output path for JSONL
output_file_name = 'clinical_notes_fine_tune.jsonl'
output_file_path = os.path.join('dataset', output_file_name)
output_dir = os.path.dirname(output_file_path)

# Prepare and save the dataset in the fine-tuning JSONL format
with open(output_file_path, 'w') as outfile:
    for _, row in train_dataset.iterrows():
        formatted_entry = {
            "completion": row['note'],  # Replace 'note' with the correct column name
            "prompt": f"Summarize the following conversation.\n\n{row['dialogue']}"  # Replace 'dialogue' as needed
        }
        json.dump(formatted_entry, outfile)
        outfile.write('\n')
    print(f"Dataset has been reformatted and saved to {output_file_path}.")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each record in the resulting JSONL file has the following shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "completion": "&amp;lt;Summarized clinical note&amp;gt;",
    "prompt": "Summarize the following conversation:\n\n&amp;lt;Doctor-patient dialogue&amp;gt;"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
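&lt;p&gt;Before uploading, it is worth sanity-checking that every line of the file parses as JSON and carries the two fields Bedrock expects. This small helper is not part of the original code, just a suggested validation step:&lt;/p&gt;

```python
import json

def validate_jsonl(path):
    """Ensure every line is valid JSON with 'prompt' and 'completion' fields."""
    with open(path) as f:
        for i, line in enumerate(f, 1):
            record = json.loads(line)  # raises on malformed JSON
            missing = {"prompt", "completion"} - record.keys()
            if missing:
                raise ValueError(f"line {i} is missing fields: {missing}")
    return True
```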



&lt;p&gt;To make the dataset accessible for fine-tuning, it needs to be uploaded to an Amazon S3 bucket. The code ensures that the S3 bucket exists, creating it if necessary. Once the bucket is verified, the fine-tuning dataset, saved in JSON Lines format, is uploaded to the specified bucket. This step is essential, as Amazon Bedrock accesses the dataset from S3 during the fine-tuning process.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Define the file path and S3 details
bucket_name = 'bedrock-finetuning-bucket25112024'
s3_key = output_file_name  # Use the JSONL file name from the previous step as the object key

# Specify the region
region = 'us-east-1'  # Change this if needed

# Initialize S3 client with the specified region
s3_client = boto3.client('s3', region_name=region)

# Check if the bucket exists
try:
    existing_buckets = s3_client.list_buckets()
    bucket_exists = any(bucket['Name'] == bucket_name for bucket in existing_buckets['Buckets'])

    if not bucket_exists:
        # Create the bucket based on the region
        try:
            if region == 'us-east-1':
                # For us-east-1, do not specify LocationConstraint
                s3_client.create_bucket(Bucket=bucket_name)
                print(f"Bucket {bucket_name} created successfully in us-east-1.")
            else:
                # For other regions, specify the LocationConstraint
                s3_client.create_bucket(
                    Bucket=bucket_name,
                    CreateBucketConfiguration={'LocationConstraint': region}
                )
                print(f"Bucket {bucket_name} created successfully in {region}.")
        except Exception as e:
            print(f"Error creating bucket: {e}")
            raise e
    else:
        print(f"Bucket {bucket_name} already exists.")

    # Upload the file to S3
    try:
        s3_client.upload_file(output_file_path, bucket_name, s3_key)
        print(f"File uploaded to s3://{bucket_name}/{s3_key}")
    except Exception as e:
        print(f"Error uploading to S3: {e}")

except Exception as e:
    print(f"Error: {e}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3: Create and submit a fine-tuning job
&lt;/h2&gt;

&lt;p&gt;With the dataset uploaded to Amazon S3 and the necessary resources in place, the next step is to create and submit the fine-tuning job. This involves specifying the pre-trained foundation model, the job details, and the fine-tuning parameters.&lt;/p&gt;

&lt;p&gt;In this example, we fine-tune the &lt;strong&gt;Cohere command-light-text-v14&lt;/strong&gt; model to summarize medical conversations. Below is the configuration used to submit the job:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Define the job parameters
base_model_id = "cohere.command-light-text-v14:7:4k"
job_name = "cohere-Summarizer-medical-finetuning-job-v1"
model_name = "cohere-Summarizer-medical-Tuned-v1"

# Submit the fine-tuning job
bedrock.create_model_customization_job(
    customizationType="FINE_TUNING",
    jobName=job_name,
    customModelName=model_name,
    roleArn=role_arn,
    baseModelIdentifier=base_model_id,
    hyperParameters={
        "epochCount": "3",  # Number of passes over the dataset
        "batchSize": "16",  # Number of samples per training step
        "learningRate": "0.00005",  # Learning rate for weight updates
    },
    trainingDataConfig={"s3Uri": f"s3://{bucket_name}/{s3_key}"},
    outputDataConfig={"s3Uri": f"s3://{bucket_name}/finetuned/"}
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key Parameters:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Base Model:&lt;/strong&gt; The pre-trained model (cohere.command-light-text-v14) serves as the foundation for customization.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Job Name and Model Name:&lt;/strong&gt; These identifiers help track the fine-tuning job and the resulting fine-tuned model for future deployments.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Hyperparameters:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;epochCount:&lt;/strong&gt; Specifies the number of training cycles. For demonstration, three epochs are used, but more epochs may yield better results for larger datasets.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;batchSize:&lt;/strong&gt; Determines how many samples are processed in each training step. A value of 16 balances memory usage and training efficiency.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;learningRate:&lt;/strong&gt; Sets the pace at which the model learns. Lower values ensure stable training but may require more time to converge.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Training and Output Configuration:&lt;/strong&gt; The trainingDataConfig points to the S3 location of the dataset, and the outputDataConfig specifies where the fine-tuned model artifacts will be stored.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Considerations:&lt;/strong&gt;&lt;br&gt;
The parameters, especially the hyperparameters, can be adjusted to optimize the fine-tuning process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Smaller datasets&lt;/strong&gt; may benefit from lower batchSize values.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complex tasks&lt;/strong&gt; may require more epochs to achieve convergence.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Learning rates&lt;/strong&gt; should be fine-tuned to balance training stability and speed.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This step officially kicks off the fine-tuning process, allowing Amazon Bedrock to handle the heavy lifting of training your model with the provided data and configuration.&lt;/p&gt;

&lt;p&gt;The status of the fine-tuning job can also be checked programmatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;status = bedrock.get_model_customization_job(jobIdentifier="cohere-Summarizer-medical-finetuning-job-v1")["status"]
print(f"Job status: {status}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
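&lt;p&gt;Rather than re-running that check by hand, a simple polling loop (a sketch; per the Bedrock API, customization jobs terminate in Completed, Failed, or Stopped) can wait for a terminal state:&lt;/p&gt;

```python
import time

def wait_for_job(bedrock_client, job_name, interval=60):
    """Poll a model customization job until it reaches a terminal state."""
    while True:
        status = bedrock_client.get_model_customization_job(
            jobIdentifier=job_name
        )["status"]
        print(f"Job status: {status}")
        if status in ("Completed", "Failed", "Stopped"):
            return status
        time.sleep(interval)  # avoid hammering the API while the job runs
```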



&lt;p&gt;The status of the fine-tuning job can also be monitored in the Bedrock console: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8xnx5kkuac1f80rdjk6g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8xnx5kkuac1f80rdjk6g.png" alt="Training job in custom model - Amazon Bedrock" width="800" height="147"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Purchase provisioned throughput
&lt;/h2&gt;

&lt;p&gt;To use the model for inference, you need to purchase "Provisioned Throughput." In the Amazon Bedrock sidebar of the AWS console, go to "Custom Models," choose the "Models" tab, select the model you have trained, and then click "Purchase Provisioned Throughput."&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn9ogalt3wi2vwotec329.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn9ogalt3wi2vwotec329.png" alt="Purchase provisioned throughput" width="800" height="150"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Give the provisioned throughput a name, select a commitment term (you can choose "No Commitment" for testing), and then click "Purchase Provisioned Throughput." You will be able to see the estimated price as well. Once this is set up, you'll be able to use the model for inference.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4jhrcj1yxqyhrmqdfbz3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4jhrcj1yxqyhrmqdfbz3.png" alt="Commitment in Amazon Bedrock" width="800" height="422"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To access your deployed model's endpoint, you'll need its ARN. Go to the "Provisioned Throughput" section under Inference in the sidebar. Select the name of your fine-tuned model, and on the new page, copy the ARN for use in the next step. Keep in mind that provisioning throughput may take a few minutes to complete.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm6ush6m03e1x9hpdf4et.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm6ush6m03e1x9hpdf4et.png" alt="Custom model's ARN" width="800" height="212"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: Test our fine-tuned model
&lt;/h2&gt;

&lt;p&gt;In the next step, we will make a request to the model for inference. Be sure to replace YOUR_MODEL_ARN with the ARN you copied earlier.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Initialize Bedrock runtime client
bedrock_runtime = boto3.client(service_name="bedrock-runtime", region_name=region)  # same region as the earlier steps

# Define a prompt for model inference, prefixed with the same instruction used during training
prompt = """Summarize the following conversation.

[doctor] Good morning, Mr. Smith. How have you been feeling since your last visit?  
[patient] Good morning, doctor. I've been okay overall, but I’ve been struggling with persistent fatigue and some dizziness.  
[doctor] I see. Is the dizziness occurring frequently or only under specific circumstances?  
[patient] It’s mostly when I stand up quickly or after I've been walking for a while.  
[doctor] Have you noticed any changes in your heart rate or shortness of breath during these episodes?  
[patient] No shortness of breath, but I do feel my heart racing sometimes.  

[doctor] How about your medications? Are you taking them as prescribed?  
[patient] Yes, but I missed a few doses of my beta-blocker last week due to travel.  
[doctor] That could explain some of the symptoms. I’ll need to check your blood pressure and do an EKG to assess your heart rhythm.  
[patient] Okay, doctor.  

[doctor] How has your diet been? Are you still following the low-sodium plan we discussed?  
[patient] I’ve been trying, but I’ve slipped up a bit during holidays with family meals.  
[doctor] I understand. We’ll reinforce that, as it’s critical for managing your hypertension.  
[patient] Yes, I’ll make sure to get back on track.  

[doctor] Let’s discuss the results from your last bloodwork. Your cholesterol levels were slightly elevated, and your hemoglobin A1c suggests borderline diabetes.  
[patient] I see. What does that mean for me?  
[doctor] It means we need to focus on dietary changes and consider starting a low-dose statin. I’ll also refer you to a nutritionist for better meal planning.  
[patient] That makes sense. Thank you, doctor.  

[doctor] Lastly, you mentioned experiencing more frequent leg swelling recently. Is that still a concern?  
[patient] Yes, especially after long days at work.  
[doctor] That could be a sign of fluid retention. I’ll adjust your diuretic dose and monitor your progress over the next two weeks.  
[patient] Thank you, doctor.  

[doctor] All right, let’s get those tests done and review everything at our next appointment. Do you have any other concerns?  
[patient] No, I think that’s all for now.  
[doctor] Great. See you in two weeks. 
"""

# Define the inference request body
body = {
    "prompt": prompt,
    "temperature": 0.5,
    "p": 0.9,
    "max_tokens": 80,
}

# Specify the ARN of the custom model
custom_model_arn = "YOUR_MODEL_ARN" #Put your model ARN here

# Invoke the custom model for inference
try:
    response = bedrock_runtime.invoke_model(
        modelId=custom_model_arn,
        body=json.dumps(body)
    )

    # Read and parse the response
    response_body = response['body'].read().decode('utf-8')
    result = json.loads(response_body)

    # Extract the summary from the response
    summary_text = result['generations'][0]['text']
    print("Extracted Summary:", summary_text)
except Exception as e:
    print(f"Error invoking model: {e}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I tested the model with the dialogue shown in the prompt above, which is designed to reflect a real-world doctor-patient interaction emphasizing symptoms, medication adherence, and a follow-up plan, to evaluate its ability to generate concise and meaningful summaries for medical dialogues.&lt;/p&gt;

&lt;p&gt;You can also test the inference directly from the &lt;strong&gt;Playground&lt;/strong&gt; in the Amazon Bedrock console. To do this, navigate to &lt;strong&gt;Chat/Text&lt;/strong&gt; under the Playground section, select your fine-tuned model, and enter your desired prompt.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ovdwrek28ebjfsbq0at.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ovdwrek28ebjfsbq0at.png" alt="Playground in Amazon Bedrock" width="800" height="638"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Input to the model:&lt;/strong&gt; the same doctor-patient dialogue used in the inference request above.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model's Response:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5rel7hu2u06mwafvhw0e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5rel7hu2u06mwafvhw0e.png" alt="Amazon Bedrock playground" width="800" height="211"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 6: Clean up
&lt;/h2&gt;

&lt;p&gt;To avoid incurring additional costs, please ensure that you &lt;strong&gt;remove any provisioned throughput&lt;/strong&gt;. You can remove provisioned throughput by navigating to the Provisioned Throughput section from the sidebar in the Amazon Bedrock console. Select the active provisioned throughput and delete it.&lt;/p&gt;
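&lt;p&gt;The same cleanup can be done from code. A sketch using the Bedrock control-plane calls &lt;code&gt;delete_provisioned_model_throughput&lt;/code&gt; and &lt;code&gt;delete_custom_model&lt;/code&gt; (the ARN is the one copied in Step 4; the helper name is my own):&lt;/p&gt;

```python
def cleanup(bedrock_client, provisioned_model_arn, custom_model_name):
    """Delete the hourly-billed provisioned throughput first, then the custom model."""
    bedrock_client.delete_provisioned_model_throughput(
        provisionedModelId=provisioned_model_arn
    )
    bedrock_client.delete_custom_model(modelIdentifier=custom_model_name)
```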

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Fine-tuning and deploying custom AI models on Amazon Bedrock unlocks the potential to create tailored solutions for specific use cases, such as summarizing medical dialogues. This guide has walked you through every step of the process, from preparing your dataset and configuring fine-tuning parameters to testing your model and deploying it for real-world inference. By leveraging the robust infrastructure and tools provided by Amazon Bedrock, you can streamline the fine-tuning process and focus on delivering impactful AI-driven solutions.&lt;/p&gt;

&lt;p&gt;The steps outlined in this article illustrate how even a relatively small, structured dataset can yield meaningful results with careful preparation and parameter tuning. Whether you're exploring summarization, classification, or other NLP tasks, Amazon Bedrock makes advanced model customization accessible and efficient.&lt;/p&gt;

&lt;p&gt;As you begin your fine-tuning journey, remember to experiment with hyperparameters and test your model rigorously to ensure optimal performance. Lastly, always clean up unused resources to avoid unnecessary costs. For further exploration, check out the complete implementation in &lt;a href="https://github.com/miladrezaei-ai/bedrock-custom-model-finetuning" rel="noopener noreferrer"&gt;my GitHub repository&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;With Amazon Bedrock, the possibilities for building intelligent, custom AI models are endless—empowering businesses to innovate and thrive in the evolving AI landscape.&lt;/p&gt;

</description>
      <category>bedrock</category>
      <category>foundationmodel</category>
      <category>llm</category>
      <category>aws</category>
    </item>
  </channel>
</rss>
