<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: AKASH Ramesh CB student</title>
    <description>The latest articles on DEV Community by AKASH Ramesh CB student (@akash_rameshcbstudent_7).</description>
    <link>https://dev.to/akash_rameshcbstudent_7</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3600648%2Fe1a1c94e-cd5c-4bf8-8f04-00f3d2e95b10.png</url>
      <title>DEV Community: AKASH Ramesh CB student</title>
      <link>https://dev.to/akash_rameshcbstudent_7</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/akash_rameshcbstudent_7"/>
    <language>en</language>
    <item>
      <title>How to Build an AI Image Captioning App with Azure AI Vision and Streamlit</title>
      <dc:creator>AKASH Ramesh CB student</dc:creator>
      <pubDate>Fri, 07 Nov 2025 08:42:19 +0000</pubDate>
      <link>https://dev.to/akash_rameshcbstudent_7/how-to-build-an-ai-image-captioning-app-with-azure-ai-vision-and-streamlit-4206</link>
      <guid>https://dev.to/akash_rameshcbstudent_7/how-to-build-an-ai-image-captioning-app-with-azure-ai-vision-and-streamlit-4206</guid>
      <description>&lt;p&gt;As a developer, I'm always looking for ways to build impactful projects. One of the most powerful applications of AI is its ability to make the digital world more accessible.&lt;/p&gt;

&lt;p&gt;I was inspired by Microsoft's mission to empower everyone, so I built a simple web app that helps describe the world for those who are visually impaired.&lt;/p&gt;

&lt;p&gt;This application uses Microsoft Azure AI Vision to generate human-readable captions for any image you upload. And the best part? We can build the entire web app in about 30 lines of Python using Streamlit.&lt;/p&gt;

&lt;p&gt;Let's get started!&lt;/p&gt;

&lt;p&gt;What You'll Need&lt;/p&gt;

&lt;p&gt;Python: Make sure you have Python 3.7+ installed.&lt;/p&gt;

&lt;p&gt;An Azure Account: You'll need one to create an "AI Vision" resource. You can get a free account to start.&lt;/p&gt;

&lt;p&gt;A Few Python Libraries: We'll install them with pip.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;pip install streamlit requests pillow&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Step 1: Get Your Azure AI Vision Keys&lt;/p&gt;

&lt;p&gt;Before we can code, we need to tell Azure who we are.&lt;/p&gt;

&lt;p&gt;Go to the Azure portal and click "Create a resource."&lt;/p&gt;

&lt;p&gt;Search for "AI Vision" and create one.&lt;/p&gt;

&lt;p&gt;Once it's deployed, go to the "Keys and Endpoint" blade.&lt;/p&gt;

&lt;p&gt;Copy your VISION_API_KEY (one of the keys) and your VISION_ENDPOINT. We'll need these for our code.&lt;/p&gt;
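&lt;p&gt;Hardcoding keys works for a quick demo, but it is safer to read them from environment variables so they never end up in version control. Here is a minimal sketch (the variable names VISION_API_KEY and VISION_ENDPOINT are just the same names the script below uses):&lt;/p&gt;

```python
import os

# Read the Azure credentials from environment variables instead of
# hardcoding them; the placeholder defaults make a missing key obvious.
VISION_API_KEY = os.environ.get("VISION_API_KEY", "YOUR_API_KEY_HERE")
VISION_ENDPOINT = os.environ.get("VISION_ENDPOINT", "YOUR_ENDPOINT_HERE")

if VISION_API_KEY == "YOUR_API_KEY_HERE":
    print("Warning: VISION_API_KEY is not set")
```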

&lt;p&gt;Step 2: The Code Walkthrough&lt;/p&gt;

&lt;p&gt;Here is the complete Python script. You can save this as app.py. I'll break down what each part does.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import streamlit as st
import requests
from PIL import Image
import io

# --- 1. SET UP AZURE CREDENTIALS ---
# (Paste your key and endpoint here)
VISION_API_KEY = "YOUR_API_KEY_HERE"
VISION_ENDPOINT = "YOUR_ENDPOINT_HERE"

# (This is the specific API endpoint we'll hit)
analyze_url = f"{VISION_ENDPOINT.rstrip('/')}/vision/v3.2/analyze"  # strip any trailing slash from the endpoint


# --- 2. CREATE THE STREAMLIT UI ---
st.title("🖼️ AI Image Captioning with Azure")
st.write("Upload an image and let Azure's AI describe it for you!")

uploaded_file = st.file_uploader("Choose an image...", type=["jpg", "jpeg", "png"])


# --- 3. RUN THE ANALYSIS ---
if uploaded_file:
    # A. Display the uploaded image
    image = Image.open(uploaded_file)
    st.image(image, caption="Uploaded Image", use_container_width=True)

    # B. Convert image to bytes for the API
    image_bytes = io.BytesIO()
    image.convert("RGB").save(image_bytes, format="JPEG")  # convert to RGB so PNGs with transparency can be saved as JPEG
    image_bytes = image_bytes.getvalue() # Get the byte value

    # C. Set up headers and parameters for the API call
    headers = {
        "Ocp-Apim-Subscription-Key": VISION_API_KEY,
        "Content-Type": "application/octet-stream"
    }
    params = {"visualFeatures": "Description"}

    st.write("Analyzing image...")

    # D. Make the API call to Azure
    response = requests.post(analyze_url, headers=headers, params=params, data=image_bytes)
    response.raise_for_status() # Raise an error if the call fails
    analysis = response.json()

    # E. Display the result
    captions = analysis["description"]["captions"]
    if captions:
        caption = captions[0]["text"]
        confidence = captions[0]["confidence"]

        # Display the caption in a green success box!
        st.success(f"**Caption:** {caption} (Confidence: {confidence:.2f})")
    else:
        st.warning("No caption found for this image.")


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Step 3: Run Your App!&lt;/p&gt;

&lt;p&gt;Save the code as app.py. Open your terminal in the same folder and run:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;streamlit run app.py&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;Your browser will automatically open, and you'll have a working web app!&lt;/p&gt;

&lt;p&gt;Example Output&lt;/p&gt;

&lt;p&gt;When you run the app and upload an image, you'll see a result that looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi1txhy784qyj1wlxen10.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi1txhy784qyj1wlxen10.png" alt=" " width="800" height="391"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;How It Works&lt;/p&gt;

&lt;p&gt;This project is a perfect example of how Microsoft Azure makes complex AI simple. We didn't have to train a single model.&lt;/p&gt;

&lt;p&gt;All the heavy lifting is done by the requests.post call. We send the image bytes to the Azure endpoint, and Azure's pre-trained model analyzes them and sends back a JSON response containing the description.&lt;/p&gt;

&lt;p&gt;We just ask for the "Description" feature in our params, and the API does the rest.&lt;/p&gt;
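&lt;p&gt;To make that concrete, here is roughly the JSON shape the v3.2 Analyze endpoint returns for the Description feature, and the same extraction logic the app uses (the caption and confidence values below are made up for illustration, not real API output):&lt;/p&gt;

```python
# Illustrative sample of the response shape for visualFeatures=Description
# (field values are invented; the structure matches what the app parses).
analysis = {
    "description": {
        "tags": ["outdoor", "dog", "grass"],
        "captions": [
            {"text": "a dog sitting in the grass", "confidence": 0.93}
        ],
    }
}

# Same extraction as the app: take the first caption and its confidence.
captions = analysis["description"]["captions"]
if captions:
    best = captions[0]
    print(f"Caption: {best['text']} (Confidence: {best['confidence']:.2f})")
```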

&lt;p&gt;Final Thoughts&lt;/p&gt;

&lt;p&gt;This simple app shows the power of leveraging cloud-based AI: in just a few minutes, we built an accessibility tool that can have a real impact. Beyond captions, the same Analyze API can also detect objects and tag scenes, and Azure AI Vision's Read API handles extracting text from images (OCR).&lt;/p&gt;
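&lt;p&gt;If you want to experiment with extra features, the call can be wrapped in a small helper. This is a sketch under the same v3.2 endpoint assumptions as the app (it only hits the network when given real credentials; the function names are my own):&lt;/p&gt;

```python
import requests

def build_analyze_url(endpoint: str) -> str:
    """Build the v3.2 Analyze URL, tolerating a trailing slash on the endpoint."""
    return f"{endpoint.rstrip('/')}/vision/v3.2/analyze"

def analyze_image(endpoint: str, key: str, image_bytes: bytes,
                  features: str = "Description,Objects,Tags") -> dict:
    """POST image bytes to Azure AI Vision and return the parsed JSON."""
    response = requests.post(
        build_analyze_url(endpoint),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/octet-stream",
        },
        params={"visualFeatures": features},
        data=image_bytes,
        timeout=30,
    )
    response.raise_for_status()  # surface HTTP errors instead of parsing bad JSON
    return response.json()
```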

&lt;p&gt;I highly recommend exploring the Azure AI Services. The possibilities are endless.&lt;/p&gt;

&lt;p&gt;Project URL: &lt;a href="https://github.com/akash7ashy/vision" rel="noopener noreferrer"&gt;https://github.com/akash7ashy/vision&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thanks for reading!&lt;/p&gt;

</description>
      <category>azure</category>
      <category>tutorial</category>
      <category>ai</category>
      <category>python</category>
    </item>
  </channel>
</rss>
