<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: saikrishna1729</title>
    <description>The latest articles on DEV Community by saikrishna1729 (@saikrishna1729).</description>
    <link>https://dev.to/saikrishna1729</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1005099%2F3fde8fa5-21fe-4e3c-b036-eb57d5701f18.png</url>
      <title>DEV Community: saikrishna1729</title>
      <link>https://dev.to/saikrishna1729</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/saikrishna1729"/>
    <language>en</language>
    <item>
      <title>Understanding the Model Context Protocol (MCP) and using it with Amazon Q Developer CLI</title>
      <dc:creator>saikrishna1729</dc:creator>
      <pubDate>Tue, 06 May 2025 06:10:31 +0000</pubDate>
      <link>https://dev.to/saikrishna1729/using-mcp-servers-in-amazon-q-developer-cli-4hc8</link>
      <guid>https://dev.to/saikrishna1729/using-mcp-servers-in-amazon-q-developer-cli-4hc8</guid>
      <description>&lt;h2&gt;
  
  
  What is the Model Context Protocol (MCP)?
&lt;/h2&gt;

&lt;p&gt;MCP is an open protocol developed by Anthropic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Purpose:&lt;/strong&gt; It is designed to standardize the way applications provide contextual information (like data from files, APIs, or other tools) to Large Language Models (LLMs). Think of it as giving LLMs controlled access to the specific information they need to perform complex tasks accurately.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Analogy:&lt;/strong&gt; MCP can be compared to a USB-C port for AI applications. Just as USB-C offers a standard connection for various devices and peripherals, MCP provides a standardized interface for connecting AI models to different data sources and tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Benefits:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Aims to unlock more practical AI applications to enhance productivity.&lt;/li&gt;
&lt;li&gt;Enables application vendors to develop "MCP servers".&lt;/li&gt;
&lt;li&gt;Allows users to employ AI applications that consume services and interact with tools provided by these MCP servers.&lt;/li&gt;
&lt;li&gt;In essence, MCP facilitates a standardized communication layer, making it easier for AI models to access and utilize external tools and data sources required for their tasks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For further details, refer to the official introduction: &lt;a href="https://modelcontextprotocol.io/introduction" rel="noopener noreferrer"&gt;https://modelcontextprotocol.io/introduction&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The protocol has become popular because it provides significant value to developers, whether used in their daily workflows or built into GenAI applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Basic Components of MCP
&lt;/h2&gt;

&lt;p&gt;The Model Context Protocol relies on two fundamental components:&lt;br&gt;
&lt;strong&gt;MCP Server:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This is the &lt;strong&gt;server-side implementation&lt;/strong&gt; of the protocol.&lt;/li&gt;
&lt;li&gt;It is responsible for exposing relevant tools or data sources.&lt;/li&gt;
&lt;li&gt;The primary purpose of these tools is to provide additional context to the LLM when requested by a client.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Below is the official list of MCP servers available to consume - &lt;a href="https://github.com/modelcontextprotocol/servers" rel="noopener noreferrer"&gt;MCP Servers&lt;/a&gt;&lt;/p&gt;
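
&lt;p&gt;To make the server side concrete, below is a minimal sketch of an MCP server in Python. It assumes the official &lt;code&gt;mcp&lt;/code&gt; SDK package and its FastMCP helper; the tool itself is only an illustrative example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# pip install mcp
from mcp.server.fastmcp import FastMCP

# Name the server; clients see this when they connect.
mcp = FastMCP("demo-server")

@mcp.tool()
def add(a: int, b: int):
    """Add two numbers and return the sum."""
    return a + b

if __name__ == "__main__":
    # FastMCP runs over STDIO by default, which is how most
    # desktop MCP clients launch local servers.
    mcp.run()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;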

&lt;p&gt;&lt;strong&gt;MCP Client:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This is a designated application (like an AI agent or chatbot interface) that interacts with an MCP Server.&lt;/li&gt;
&lt;li&gt;It requests information or tool execution from the server to augment the LLM's context.&lt;/li&gt;
&lt;li&gt;Communication methods between the client and server can be:&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;STDIO (Standard Input/Output)&lt;/strong&gt;: For direct process communication.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SSE (Server-Sent Events)&lt;/strong&gt;: Often used via Web API calls for web-based applications.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Examples: &lt;a href="https://claude.ai/download" rel="noopener noreferrer"&gt;Claude Desktop&lt;/a&gt;, &lt;a href="https://docs.aws.amazon.com/amazonq/latest/qdeveloper-ug/command-line-installing.html" rel="noopener noreferrer"&gt;Amazon Q&lt;/a&gt;, &lt;a href="https://code.visualstudio.com/" rel="noopener noreferrer"&gt;VS Code agent mode&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this post, we will see how to set up the Amazon Q Developer CLI and configure one of the MCP servers published by AWS.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1 : Install Amazon Q Developer CLI&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Installation instructions here - &lt;a href="https://docs.aws.amazon.com/amazonq/latest/qdeveloper-ug/command-line-installing.html" rel="noopener noreferrer"&gt;Setup&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2 : Login to Amazon Q cli&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;q login&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You will then be prompted to choose a license. To get started, you can choose the free one.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp4uhek7t9kh5232ddy0k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp4uhek7t9kh5232ddy0k.png" alt="Image description" width="800" height="129"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Then, you will be redirected to log in to your AWS Builder account. Please log in using your credentials.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Then, allow Q Developer access to the command line as shown below.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fexusly5lnew5r8kdlwe6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fexusly5lnew5r8kdlwe6.png" alt="Image description" width="800" height="280"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Interact with Amazon Q Developer on AWS-related topics&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Type the command below to start a Q Developer chat session.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;q chat&lt;/code&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Example : Provide best practice to deploy a static website on AWS and services to use&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr902y2owedrwzae01vi1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr902y2owedrwzae01vi1.png" alt="Image description" width="800" height="590"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Yeah! It generates the recommendation quite beautifully.&lt;/p&gt;

&lt;h2&gt;
  
  
  Now let's do some MCP fun
&lt;/h2&gt;

&lt;p&gt;AWS is developing MCP servers for various use cases. The full list is here: &lt;br&gt;
&lt;a href="https://awslabs.github.io/mcp/" rel="noopener noreferrer"&gt;AWS MCP Servers&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So, let's load one of them, the "AWS Documentation MCP Server", which reads the latest AWS documentation and returns the context to the LLM used by Amazon Q.&lt;/p&gt;

&lt;p&gt;Below are the steps to configure MCP servers for Amazon Q.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Install uv, a fast Python package manager built in Rust (similar to pip)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.astral.sh/uv/getting-started/installation/#installation-methods" rel="noopener noreferrer"&gt;Steps to install UV&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 2: Create the mcp.json file and copy the server configuration
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Refer to the link &lt;a href="https://awslabs.github.io/mcp/servers/aws-documentation-mcp-server/" rel="noopener noreferrer"&gt;AWS Documentation MCP Server&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Copy the JSON snippet below&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "mcpServers": {
    "awslabs.aws-documentation-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.aws-documentation-mcp-server@latest"],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "disabled": false,
      "autoApprove": []
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;Create the mcp.json file at the path &lt;strong&gt;~/.aws/amazonq/mcp.json&lt;/strong&gt;. If the directory does not exist yet, create it first with &lt;code&gt;mkdir -p ~/.aws/amazonq&lt;/code&gt;.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;touch ~/.aws/amazonq/mcp.json&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Then paste the JSON snippet above into it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Launch Amazon Q and have fun with MCP
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm3qoo8qmpzmjbosimex4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm3qoo8qmpzmjbosimex4.png" alt="Image description" width="800" height="372"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's ask a question about an AWS service, as below.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Suggest me a best service to use in AWS for relational database , this should be with low cost and highly available. Refer to the latest documentation available&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd4w8c87nrqhxl48ksep3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd4w8c87nrqhxl48ksep3.png" alt="Image description" width="800" height="234"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It will prompt you to trust the required tools; you can trust them selectively.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9xkmrvbgiw0c9pv4jfa1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9xkmrvbgiw0c9pv4jfa1.png" alt="Image description" width="800" height="854"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can explore other MCP servers provided by AWS or the wider community to further enhance Amazon Q's capabilities for different tasks.&lt;/p&gt;

&lt;p&gt;Hope you got a good idea of what MCP is and how to start using it with Amazon Q Developer. Happy Q'ing!&lt;/p&gt;

</description>
      <category>amazonq</category>
      <category>aws</category>
      <category>mcp</category>
      <category>genai</category>
    </item>
    <item>
      <title>Bedrock Cross-Region Inference: Tackling ratelimits and regional availability of inference</title>
      <dc:creator>saikrishna1729</dc:creator>
      <pubDate>Fri, 18 Apr 2025 14:36:08 +0000</pubDate>
      <link>https://dev.to/saikrishna1729/bedrock-cross-region-inference-tackling-ratelimits-and-regional-availability-of-inference-59i8</link>
      <guid>https://dev.to/saikrishna1729/bedrock-cross-region-inference-tackling-ratelimits-and-regional-availability-of-inference-59i8</guid>
      <description>&lt;p&gt;Keeping your AI applications online and running smoothly, even when lots of people use them at once, is super important.&lt;/p&gt;

&lt;p&gt;Good news! Bedrock has a cool feature called &lt;strong&gt;cross-region inference&lt;/strong&gt; that makes building resilient and highly available GenAI applications much easier.&lt;/p&gt;

&lt;h2&gt;
  
  
  So, What Exactly is Cross-Region Inference?
&lt;/h2&gt;

&lt;p&gt;Imagine you have a popular AI app powered by Bedrock.&lt;/p&gt;

&lt;p&gt;Normally, all the AI thinking happens in one specific AWS Region. But what if that region gets really busy (since it is a shared resource)? Or what if there's a temporary issue? Your app might slow down or stop working for some users.&lt;/p&gt;

&lt;p&gt;Amazon Bedrock's cross-region inference is like having backup locations ready to help. It automatically sends your requests for AI processing (like asking the model a question or asking it to generate text) to other available AWS Regions within the same general area (like the US or Europe).&lt;/p&gt;

&lt;p&gt;This means your application doesn't just rely on one spot. It can tap into resources from other spots nearby, helping ensure your users get a consistent and speedy experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Makes Your AI App Resilient
&lt;/h2&gt;

&lt;p&gt;The main reason cross-region inference is a game-changer is the resiliency it adds. By spreading the work across multiple AWS Regions, it gives you several key benefits that make your AI applications much more robust:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Handles Traffic Jumps Easily&lt;/strong&gt;: When your app suddenly gets popular, cross-region inference helps handle the rush. It automatically sends requests to regions with available capacity. You don't have to guess how much traffic you'll get or build complex systems to manage it yourself. Bedrock checks your main region first and, if needed, smartly sends the request to another region that can handle it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better Model Availability&lt;/strong&gt;: If one region is facing a temporary capacity crunch, your requests can be sent to another. This greatly increases the chances that your request will be completed successfully, helping keep your service running continuously for your users.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Increased Throughput&lt;/strong&gt;: By using compute power from different regions, your application can process more requests overall. This means your app can handle a higher volume of activity without performance dropping.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic Failover&lt;/strong&gt;: If processing a request in the initial region fails for some reason, Bedrock will automatically try to send it to another working region within the group. This built-in safety net makes your AI applications much more reliable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Extra Cost for Resiliency&lt;/strong&gt;: This is fantastic! There's no additional charge for using cross-region inference. You pay the same price per token (the units used for processing text) as you would if the request was handled only in your original region.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Getting Started: It's Not Automatic, But It Is Simple!
&lt;/h2&gt;

&lt;p&gt;So, is this cross-region magic automatic as soon as you use Bedrock? Not quite.&lt;/p&gt;

&lt;p&gt;The routing of the request across regions is automatic once you tell Bedrock you want to use this feature, but you need to make a small change in your code to enable it.&lt;/p&gt;

&lt;p&gt;The key to enabling cross-region inference is using something called &lt;strong&gt;inference profile IDs&lt;/strong&gt;. Instead of telling Bedrock the specific, single-region address (ARN) of an AI model, you use a special ID that represents the model and tells Bedrock it can use the cross-region capability.&lt;/p&gt;

&lt;p&gt;Here’s how you get started specifically:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Discover the Right Inference Profile IDs:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Amazon Bedrock provides special IDs called "system-defined" inference profiles for models that support cross-region inference. These IDs cover specific geographical areas (like a group of US regions or a group of EU regions).&lt;/p&gt;

&lt;p&gt;How to find them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AWS Console:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Go to the Amazon Bedrock console.&lt;/li&gt;
&lt;li&gt;Look for an option like "Cross-region Inference" on the left menu. Here, you can browse the available profiles for your regions and easily copy the IDs.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS CLI&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;You can use a command-line tool.&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;aws bedrock list-inference-profiles&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Look for profiles listed with the type "SYSTEM_DEFINED". These IDs will often start with a prefix like &lt;code&gt;us.&lt;/code&gt; or &lt;code&gt;eu.&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Boto3 (Python SDK):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If you're coding in Python, you can use the AWS SDK:
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;

&lt;span class="c1"&gt;# Replace "your-aws-region" with the region you're working from
&lt;/span&gt;&lt;span class="n"&gt;bedrock_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bedrock&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-aws-region&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bedrock_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_inference_profiles&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;


&lt;ul&gt;
&lt;li&gt;This will list the available profiles, including the system-defined cross-region ones.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
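
&lt;p&gt;If you only want the cross-region profile IDs, you can filter for the system-defined type. Here is a small sketch, assuming the &lt;code&gt;typeEquals&lt;/code&gt; filter and the &lt;code&gt;inferenceProfileSummaries&lt;/code&gt; response shape of this API:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# Ask the API for system-defined profiles only and print their IDs.
response = bedrock.list_inference_profiles(typeEquals="SYSTEM_DEFINED")
for profile in response["inferenceProfileSummaries"]:
    print(profile["inferenceProfileId"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;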

&lt;p&gt;&lt;strong&gt;Use the Inference Profile ID in Your Code:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Once you have the inference profile ID (it will look something like &lt;code&gt;us.anthropic.claude-3-5-sonnet-20240620-v1:0&lt;/code&gt;), you simply use this ID instead of the regular model ARN when you make your requests to the Bedrock API (using &lt;code&gt;InvokeModel&lt;/code&gt; or &lt;code&gt;Converse&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Using a single-region ARN
# model_id = "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0"
# ... call bedrock_runtime.invoke_model(modelId=model_id, ...)
&lt;/span&gt;
&lt;span class="c1"&gt;# You would use:
&lt;/span&gt;
&lt;span class="c1"&gt;# Using the cross-region inference profile ID
&lt;/span&gt;&lt;span class="n"&gt;inference_profile_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;us.anthropic.claude-3-5-sonnet-20240620-v1:0&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="c1"&gt;# Example ID
&lt;/span&gt;
&lt;span class="c1"&gt;# Replace "your-source-region" with the region you are making the request from
&lt;/span&gt;&lt;span class="n"&gt;bedrock_runtime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bedrock-runtime&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-source-region&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bedrock_runtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;converse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;modelId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;inference_profile_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# Use the profile ID here!
&lt;/span&gt;    &lt;span class="n"&gt;system&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful AI assistant.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tell me something interesting.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Bedrock sees the inference profile ID and knows it has the flexibility to route your request to any available region within that profile's set.&lt;/p&gt;

&lt;p&gt;So, while you don't manage the traffic routing yourself (Bedrock does that automatically!), you do need to make the initial step of changing your code to use the inference profile ID instead of the standard model ARN. It's a straightforward change that unlocks powerful resiliency.&lt;/p&gt;

&lt;h2&gt;
  
  
  Monitoring Your Applications
&lt;/h2&gt;

&lt;p&gt;Amazon Bedrock gives you visibility into how cross-region inference is working. If your request gets re-routed to another region, this information is recorded in your AWS CloudTrail logs and Amazon Bedrock Model Invocation Logs.&lt;/p&gt;

&lt;p&gt;You'll find details like an &lt;code&gt;inferenceRegion&lt;/code&gt; key which tells you where the request was actually processed. By looking at these logs (for example, in Amazon CloudWatch), you can see when requests are being served from different regions. This helps you understand your application's traffic flow and how effectively cross-region inference is handling demand spikes.&lt;/p&gt;
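
&lt;p&gt;As an illustration, here is one way to aggregate that field with a CloudWatch Logs Insights query from Python. It assumes you have enabled model invocation logging to a CloudWatch log group; the log group name below is a placeholder for whatever you configured.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import time
import boto3

logs = boto3.client("logs", region_name="us-east-1")

# Placeholder: the log group you configured for model invocation logging.
query_id = logs.start_query(
    logGroupName="/aws/bedrock/modelinvocations",
    startTime=int(time.time()) - 3600,  # last hour
    endTime=int(time.time()),
    queryString=(
        "fields @timestamp, inferenceRegion "
        "| filter ispresent(inferenceRegion) "
        "| stats count(*) by inferenceRegion"
    ),
)["queryId"]

# Poll until the query finishes, then print the per-region counts.
while True:
    result = logs.get_query_results(queryId=query_id)
    if result["status"] == "Complete":
        break
    time.sleep(1)

for row in result["results"]:
    print({field["field"]: field["value"] for field in row})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;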

&lt;h2&gt;
  
  
  Important Things to Keep in Mind
&lt;/h2&gt;

&lt;p&gt;While cross-region inference is great for resiliency, here are a few points to be aware of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Latency:&lt;/strong&gt; There might be a slight delay (double-digit milliseconds in testing) if a request needs to be re-routed to another region.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Residency:&lt;/strong&gt; Your main data stays in your source region. However, the input prompts and output results for a specific inference request might be processed in another region within the same geographical group (like US or EU). Make sure this fits with any data location rules or compliance requirements you have.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rest assured, all data transfer between regions happens over AWS's secure network.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Supported Models and Regions:&lt;/strong&gt; This feature works with a specific list of models and within defined geographical sets of regions (US, EU, etc.). Check the Bedrock documentation to confirm that the models you want to use and the regions you operate in are supported.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html" rel="noopener noreferrer"&gt;AWS Link&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>bedrock</category>
      <category>llm</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Get handson with Latest LLM models for Free with Github</title>
      <dc:creator>saikrishna1729</dc:creator>
      <pubDate>Wed, 16 Apr 2025 14:03:10 +0000</pubDate>
      <link>https://dev.to/saikrishna1729/get-handson-with-latest-llm-models-for-free-with-github-205n</link>
      <guid>https://dev.to/saikrishna1729/get-handson-with-latest-llm-models-for-free-with-github-205n</guid>
      <description>&lt;p&gt;If you want to develop a generative AI application, you can use GitHub Models to find and experiment with AI models for free. &lt;/p&gt;

&lt;p&gt;Yes, you heard it right: for FREE!&lt;/p&gt;


&lt;p&gt;GitHub now provides an easy way to use various LLM models through a single inference endpoint. All that you need is a GitHub profile to get started.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prerequisites:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Create a GitHub profile at github.com or use your existing GitHub account.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Steps to Access the Playground:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="https://github.com/marketplace/models" rel="noopener noreferrer"&gt;github.com/marketplace/models&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Click "Model: Select a Model" (typically at the top left of the page) and choose a desired model from the dropdown menu.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Once you click on the model, you will be redirected to the playground. Here you can access the chat interface to test and experience the model directly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqnbfh130yqivrjlgkvia.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqnbfh130yqivrjlgkvia.png" alt="Image description" width="800" height="484"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6vfqqnxcuigvorpbqxpp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6vfqqnxcuigvorpbqxpp.png" alt="Image description" width="800" height="523"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Steps to Get a Developer Token for API Use:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you want to use this in your code for experimentation, you can get a free API Key (Developer Token).&lt;/p&gt;

&lt;p&gt;In the model's playground or page, click on the "Use this model" option.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv2qdhgo4a2ljba7ir2a2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv2qdhgo4a2ljba7ir2a2.png" alt="Image description" width="703" height="394"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then, click on the "Get Developer Token" option as shown below. This token allows you to experiment with the model for free in your own code via the API.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3g1wncq5w3nhxlejwccu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3g1wncq5w3nhxlejwccu.png" alt="Image description" width="703" height="394"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That's it! You now have access to experiment with the model from your own code.&lt;/p&gt;

&lt;p&gt;The GitHub Models catalog includes various models, potentially including recent ones like GPT-4.1 (check the marketplace for current availability).&lt;/p&gt;

&lt;p&gt;Hope this post is helpful for you.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Unlocking Private Connections in Azure AI Foundry 🔐</title>
      <dc:creator>saikrishna1729</dc:creator>
      <pubDate>Tue, 15 Apr 2025 11:51:42 +0000</pubDate>
      <link>https://dev.to/saikrishna1729/unlocking-private-connections-in-azure-ai-foundry-35l4</link>
      <guid>https://dev.to/saikrishna1729/unlocking-private-connections-in-azure-ai-foundry-35l4</guid>
      <description>&lt;p&gt;Microsoft has introduced &lt;strong&gt;Azure AI Foundry&lt;/strong&gt;, rebranding the existing Azure AI Studio. This platform serves as a new central hub for designing, customizing, and managing AI applications and agents effectively at scale.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is Azure AI Foundry?
&lt;/h3&gt;

&lt;p&gt;With Azure AI Foundry, you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Explore a wide variety of models, services, and capabilities.&lt;/li&gt;
&lt;li&gt;Build AI applications tailored to your specific goals.&lt;/li&gt;
&lt;li&gt;Facilitate scalability, transforming proofs of concept into full-fledged production applications with ease.&lt;/li&gt;
&lt;li&gt;Leverage continuous monitoring and refinement features to support long-term success.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Core Components
&lt;/h3&gt;

&lt;p&gt;Azure AI Foundry primarily consists of two components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Azure AI Foundry Hub&lt;/strong&gt;: This is the core, region-specific infrastructure used to interface with AI models. You need to create a Hub first.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Azure AI Foundry Project&lt;/strong&gt;: Built upon a Hub, Projects are where you deploy specific AI models (like Phi-3, DeepSeek, Mistral). These are also region-specific.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The native &lt;strong&gt;Azure AI Foundry SDK&lt;/strong&gt; aims to provide a simpler and more unified experience for developers building Generative AI applications.&lt;/p&gt;
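
&lt;p&gt;For context, inference against a serverless model endpoint from a Project typically looks like the sketch below. It uses the &lt;code&gt;azure-ai-inference&lt;/code&gt; package; the endpoint URL and key environment variable are placeholders for your own deployment's values.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# pip install azure-ai-inference
import os
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential

# Placeholders: your serverless endpoint and its key.
client = ChatCompletionsClient(
    endpoint="https://xxx.region.models.ai.azure.com",
    credential=AzureKeyCredential(os.environ["AZURE_AI_KEY"]),
)

response = client.complete(messages=[UserMessage(content="Hello!")])
print(response.choices[0].message.content)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;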

&lt;h3&gt;
  
  
  The Challenge: Private Connections
&lt;/h3&gt;

&lt;p&gt;While the documentation for this service is comprehensive, the guidance on configuring &lt;strong&gt;private network connections&lt;/strong&gt; for model inference seemed a bit unclear in my experience. This post aims to fill that gap.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step-by-Step: Setting Up Private Endpoints
&lt;/h3&gt;

&lt;p&gt;To establish a private connection for inference to models deployed in an Azure AI Foundry Project, follow these steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Navigate to your &lt;strong&gt;Azure AI Foundry Hub&lt;/strong&gt; resource within the Azure portal.&lt;/li&gt;
&lt;li&gt; From the left-side menu, select &lt;strong&gt;Settings&lt;/strong&gt;, then &lt;strong&gt;Networking&lt;/strong&gt;. Click on the &lt;strong&gt;Private endpoint connections&lt;/strong&gt; tab and select &lt;strong&gt;+ Private endpoint&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt; When filling out the forms to create the private endpoint:

&lt;ul&gt;
&lt;li&gt;On the &lt;strong&gt;Basics&lt;/strong&gt; tab, ensure the selected &lt;strong&gt;Region&lt;/strong&gt; matches the region of your virtual network.&lt;/li&gt;
&lt;li&gt;On the &lt;strong&gt;Resource&lt;/strong&gt; tab, select &lt;code&gt;amlworkspace&lt;/code&gt; as the &lt;strong&gt;Target sub-resource&lt;/strong&gt;. (Internally, it leverages Azure Machine Learning Workspace infrastructure).&lt;/li&gt;
&lt;li&gt;On the &lt;strong&gt;Virtual Network&lt;/strong&gt; tab, select the target &lt;strong&gt;Virtual network&lt;/strong&gt; and &lt;strong&gt;Subnet&lt;/strong&gt; you wish to connect from.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt; Configure any additional network settings as required, review your settings on the &lt;strong&gt;Review + create&lt;/strong&gt; tab, and then click &lt;strong&gt;Create&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This process should automatically configure the necessary DNS records within your private DNS zones associated with the virtual network.&lt;/p&gt;

&lt;p&gt;And that's it! No further configuration should be necessary. Attempts to reach your serverless model endpoint (e.g., &lt;code&gt;xxx.region.models.ai.azure.com&lt;/code&gt;) should now resolve privately.&lt;/p&gt;

&lt;h3&gt;
  
  
  How it Works
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe9ta2jj1xugfqm63h2sx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe9ta2jj1xugfqm63h2sx.png" alt="Image description" width="800" height="285"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The reason this works is that Azure automatically creates a &lt;strong&gt;CNAME&lt;/strong&gt; DNS record for your endpoint, similar to:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;&amp;lt;deploymentname&amp;gt;.&amp;lt;ProjectID&amp;gt;.models.&amp;lt;region&amp;gt;.privatelink.api.azureml.ms&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The Private Endpoint you created specifically handles DNS resolution for this &lt;code&gt;.privatelink&lt;/code&gt; address, ensuring traffic stays within your private network.&lt;/p&gt;
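
&lt;p&gt;A quick way to verify this from a machine inside the connected virtual network is to resolve the endpoint hostname and check that it returns a private IP. A small sketch (the hostname is a placeholder for your deployment's endpoint):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import ipaddress
import socket

# Placeholder: replace with your serverless model endpoint hostname.
host = "xxx.region.models.ai.azure.com"

# Inside the VNet this should follow the privatelink CNAME and land on a
# private IP; outside the VNet it resolves to a public one.
for info in socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP):
    ip = info[4][0]
    print(ip, "private" if ipaddress.ip_address(ip).is_private else "public")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;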




&lt;p&gt;Hopefully, this guide clarifies the process for setting up private connections in Azure AI Foundry.&lt;/p&gt;

&lt;p&gt;Documentation for reference  : &lt;a href="https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/configure-private-link?tabs=azure-portal" rel="noopener noreferrer"&gt;Azure AI Foundry Private Endpoints&lt;/a&gt; &lt;/p&gt;

</description>
      <category>azureaifoundry</category>
      <category>privateaccess</category>
    </item>
    <item>
      <title>AWS Bedrock : Interface Claude LLM using Python</title>
      <dc:creator>saikrishna1729</dc:creator>
      <pubDate>Sat, 10 Aug 2024 14:32:37 +0000</pubDate>
      <link>https://dev.to/saikrishna1729/aws-bedrock-interface-claude-llm-using-python-d1g</link>
      <guid>https://dev.to/saikrishna1729/aws-bedrock-interface-claude-llm-using-python-d1g</guid>
      <description>&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;Amazon Bedrock, a fully managed service by AWS, empowers developers to rapidly build and scale generative AI applications using foundation models (FMs). It offers a diverse selection of Large Language Models (LLMs) from leading providers like Amazon, Anthropic, AI21 Labs, and Meta.&lt;/p&gt;

&lt;p&gt;In this guide, I'll walk you through the simple steps to get started with Amazon Bedrock using the AWS Python SDK.&lt;/p&gt;




&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;p&gt;Before we dive in, make sure you have the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AWS Credentials&lt;/strong&gt;: Ensure your machine has properly configured AWS credentials with the necessary IAM policy:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "Version": "2012-10-17",
    "Statement": [\n
        {
            "Sid": "BedrockFullAccess",
            "Effect": "Allow",
            "Action": ["bedrock:*"],
            "Resource": "*"
        }
    ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Python 3.10 or later, with &lt;strong&gt;pip&lt;/strong&gt; installed.&lt;/li&gt;
&lt;li&gt;Install the Python packages below: &lt;strong&gt;boto3&lt;/strong&gt; for the AWS SDK, &lt;strong&gt;chainlit&lt;/strong&gt; for a simple UI framework. &lt;a href="https://docs.chainlit.io/get-started/installation" rel="noopener noreferrer"&gt;More about Chainlit&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install boto3
pip install chainlit 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Enable Bedrock Foundation Models:&lt;/strong&gt; Enable the Bedrock foundation models in the AWS Console. &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/getting-started.html#getting-started-model-access" rel="noopener noreferrer"&gt;Follow the instructions here&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;We will be using a &lt;strong&gt;Claude 3&lt;/strong&gt; LLM (Haiku, in the code below) for this application in the&lt;br&gt;
  &lt;strong&gt;us-east-1&lt;/strong&gt; AWS region. (Refer to the documentation for the regions where Bedrock is available if you want to use some other region.)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdd9fuh2rv89gfov1jl41.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdd9fuh2rv89gfov1jl41.png" alt="Image description" width="800" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Note:&lt;/strong&gt; During setup, you'll be asked to provide company and use case information. Complete this step as required.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Please allow a few minutes for the model to become available after enabling it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F86jwbyhvwoxeq539panz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F86jwbyhvwoxeq539panz.png" alt="Image description" width="800" height="267"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Interfacing with Python
&lt;/h3&gt;

&lt;p&gt;Once the prerequisites are in place, create a Python file (e.g., &lt;code&gt;main.py&lt;/code&gt;) with the following code:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Replace &lt;code&gt;question&lt;/code&gt; variable in the code with any other sample question you want LLM to answer.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import boto3
import json

def chat_with_bedrock(p_message):
    bedrock = boto3.client(service_name="bedrock-runtime", region_name="us-east-1")

    messages = [{
        "role": "user",
        "content": p_message
    }]

    body = json.dumps({
    "max_tokens": 256,
    "messages": messages,
    "anthropic_version": "bedrock-2023-05-31"
    })

    response = bedrock.invoke_model(body=body, modelId="anthropic.claude-3-haiku-20240307-v1:0")

    response_body = json.loads(response.get("body").read())
    return response_body.get("content")

# replace the question with your query
question = "Who is US President in 2020"

response = chat_with_bedrock(question)

print(response[0].get("text"))

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sample response &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The President of the United States in 2020 is Donald Trump. He was elected in 2016 and his current term runs until January 20, 2021.&lt;br&gt;
Some key facts about Donald Trump's presidency in 2020:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;He is the 45th President of the United States.&lt;/li&gt;
&lt;li&gt;He is a member of the Republican Party.&lt;/li&gt;
&lt;li&gt;His vice president is Mike Pence.&lt;/li&gt;
&lt;li&gt;Major events in 2020 included the COVID-19 pandemic, economic crisis, protests against police brutality, and the 2020 presidential election.&lt;/li&gt;
&lt;li&gt;He ran for re-election against Democratic candidate Joe Biden in the 2020 election.&lt;/li&gt;
&lt;li&gt;The 2020 election took place on November 3, 2020. Biden won both the popular vote and electoral college.&lt;/li&gt;
&lt;li&gt;However, Trump did not concede the election and made unsubstantiated claims of widespread voter fraud before leaving office.
So in summary, Donald Trump served as president throughout 2020, but lost his bid for re-election to Joe Biden, who was inaugurated as the 46th president on January 20, 2021.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;This example demonstrates how to interface with AWS Bedrock using the AWS SDK for Python. You can further enhance this by implementing a prompt template or experimenting with different models available in Bedrock.&lt;/p&gt;
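
&lt;p&gt;For instance, since we installed &lt;strong&gt;chainlit&lt;/strong&gt; in the prerequisites, here is a minimal sketch of wrapping the same function in a chat UI. It assumes chainlit's &lt;code&gt;@cl.on_message&lt;/code&gt; callback API; save it as &lt;code&gt;app.py&lt;/code&gt; and start it with &lt;code&gt;chainlit run app.py&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import chainlit as cl

# Assumes chat_with_bedrock() from main.py above is importable
# (guard its example call with: if __name__ == "__main__":).
from main import chat_with_bedrock

@cl.on_message
async def on_message(message: cl.Message):
    # Forward the user's text to Bedrock and reply with the first content block.
    answer = chat_with_bedrock(message.content)
    await cl.Message(content=answer[0].get("text")).send()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;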

&lt;p&gt;&lt;strong&gt;Tip:&lt;/strong&gt; To use a different model in Bedrock, request access and update the &lt;code&gt;modelId&lt;/code&gt; in the code accordingly.&lt;/p&gt;

&lt;p&gt;Thank you!&lt;/p&gt;

</description>
      <category>awsbedrock</category>
      <category>aws</category>
      <category>genai</category>
    </item>
  </channel>
</rss>
