Ron Dagdag

Posted on Mar 3 • Edited on Mar 4

AI Agents on the Fly: AI Agents into Web Apps

#azure #ai #aiops #genai

This article is part of Azure Spring Clean 2025 Series

The technology industry has seen service delivery models progress from traditional on-premises solutions to increasingly sophisticated cloud-based offerings. As artificial intelligence (AI) matures, we are entering the era of agents on demand: Agent as a Service (AaaS), the next step in the as-a-service evolution.

Agents on Demand: A Cultural Perspective

The concept of summoning specialized agents on demand has long captured our imagination in popular culture:

Mission Impossible's IMF Team: Highly specialized agents assembled for specific missions
The A-Team: "If you have a problem, if no one else can help, and if you can find them..."
James Bond's Q Branch: On-demand technical expertise and gadgets
Men in Black: Specialized agents appearing exactly when needed
Ghost Protocol: Agents operating independently when official channels are down

These cultural touchstones reflect our enduring fascination with the idea of specialized agents ready to tackle specific challenges at a moment's notice - a concept now becoming reality with AI.

The As-a-Service Evolution

1. Infrastructure as a Service (IaaS)

Basic building blocks of cloud computing
Virtual machines, storage, and networking
Pay-as-you-go infrastructure

2. Platform as a Service (PaaS)

Development and deployment platforms
Managed runtime environments
Simplified application lifecycle management

3. Software as a Service (SaaS)

Ready-to-use applications
Subscription-based model
Minimal setup and maintenance

4. Functions as a Service (FaaS)

Event-driven compute execution
Serverless architecture
Automatic scaling of individual functions
Pay only for execution time
Popular implementations like AWS Lambda, Azure Functions

5. Agent as a Service (AaaS)

The newest paradigm combining autonomous AI agents with cloud-based delivery, representing a significant leap forward in service automation and intelligence.

Agent as a Service represents a paradigm shift in how we deploy and utilize AI agents. Unlike traditional static AI services, these agents are:

Instantly Available: Spin up exactly when needed
Purpose-Built: Configured for specific tasks or domains
Ephemeral: Exist only for the duration of their mission
Resource Efficient: Consume resources only when active
Highly Specialized: Optimized for their designated purpose

What Exactly is AaaS?

AaaS delivers AI agents that can perform tasks independently. These agents use machine learning, meaning algorithms that learn patterns from data, and natural language processing (NLP), which lets them understand and generate human language, to respond to user requests, automate complex workflows, and improve operational efficiency. Like SaaS, AaaS is delivered from the cloud, which makes it scalable and accessible for businesses of any size.

The Anatomy of an AI Agent

To understand AaaS, it helps to know what makes up an AI agent:

Reasoning Engine: Powered by large language models (LLMs), this is the brain that processes inputs and makes decisions.
Knowledge Base: A repository of data (similar to an organized database) that the agent pulls from to complete tasks.
Memory: Short-term memory that helps agents manage ongoing conversations or tasks (comparable to session state in a web application).
Tools and Actions: External APIs or software that agents use to execute tasks (much like invoking Azure Function Apps for specific operations).
Planning Module: Breaks down high-level goals into manageable steps (comparable to task orchestrators like Azure Logic Apps).
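
To make the component list concrete, here is a minimal sketch in plain Python. It is a toy, not how Azure AI Agent Service is built: the class name ToyAgent, the " then " splitting rule used for planning, and the keyword match used in place of an LLM are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyAgent:
    """Illustrative only: each member mirrors one component from the list above."""
    knowledge_base: Dict[str, str]                   # Knowledge Base: facts to look up
    tools: Dict[str, Callable[[str], str]]           # Tools and Actions: callables to invoke
    memory: List[str] = field(default_factory=list)  # Memory: short-term session record

    def plan(self, goal: str) -> List[str]:
        # Planning Module stand-in: a real agent would ask the LLM to decompose
        # the goal; here we simply split on the word "then".
        return [step.strip() for step in goal.split(" then ")]

    def reason(self, step: str) -> str:
        # Reasoning Engine stand-in: use a tool whose name appears in the step,
        # otherwise fall back to a knowledge-base lookup.
        for name, tool in self.tools.items():
            if name in step:
                return tool(step)
        return self.knowledge_base.get(step, "unknown")

    def run(self, goal: str) -> List[str]:
        results = []
        for step in self.plan(goal):
            result = self.reason(step)
            self.memory.append(f"{step} -> {result}")  # remember what happened
            results.append(result)
        return results


agent = ToyAgent(
    knowledge_base={"capital of France": "Paris"},
    tools={"shout": lambda text: text.upper()},
)
answers = agent.run("capital of France then shout hello")
print(answers)
```

The point of the sketch is the separation of roles: swapping the keyword match for a model call changes only reason(), while the knowledge base, tools, and memory stay as they are.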

AI Agent Considerations

Knowledge: Providing agents with the right context
Actions: Giving agents access to the tools needed to complete tasks
Security: Ensuring agents have access to only the data and services they need
Evaluation: Ensuring agents complete tasks correctly
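
The security consideration can be sketched as a least-privilege tool registry: each agent receives only the tools it has been granted. The ScopedTools class and the tool names below are hypothetical and exist only to illustrate the idea; a managed service would typically enforce this through identity and access controls rather than application code.

```python
from typing import Callable, Dict, Iterable


class ScopedTools:
    """Expose only an allow-listed subset of tools to an agent (hypothetical helper)."""

    def __init__(self, registry: Dict[str, Callable[..., object]], allowed: Iterable[str]):
        allowed = set(allowed)
        unknown = allowed - set(registry)
        if unknown:
            raise KeyError(f"Unknown tools requested: {sorted(unknown)}")
        # Copy only the granted tools; everything else is unreachable from here.
        self._tools = {name: registry[name] for name in allowed}

    def call(self, name: str, *args, **kwargs):
        if name not in self._tools:
            raise PermissionError(f"Tool '{name}' is not granted to this agent")
        return self._tools[name](*args, **kwargs)


registry = {
    "read_report": lambda: "Q1 revenue: 42",
    "delete_report": lambda: "deleted",
}

# This agent may read reports but not delete them.
reader_tools = ScopedTools(registry, allowed=["read_report"])
print(reader_tools.call("read_report"))
```

Calling reader_tools.call("delete_report") raises PermissionError, so a misbehaving prompt cannot reach a tool the agent was never given.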

Azure AI Agent Service Overview

Azure AI Agent Service is a fully managed service that simplifies the development, deployment, and scaling of AI agents. It eliminates the complexity of managing infrastructure while providing robust tools for building extensible AI agents.

Managed agent micro-services
Used for rapid development and automation, extensive data connections, flexible model selection, and enterprise-grade security.

Key Features

Automatic Tool Calling
- Server-side handling of tool calls
- Simplified integration with external services
- Streamlined workflow automation
Secure Data Management
- Thread-based conversation state management
- Built-in security features
- Compliant data handling
Pre-built Tool Integration
- Bing integration
- Azure AI Search compatibility
- Azure Functions support
- Code interpreter tools
- File handling capabilities

Infrastructure Management

Automated provisioning
Dynamic scaling
Load balancing
Resource optimization

Security and Compliance

Identity and access management
Network isolation
Encryption at rest and in transit
Compliance certifications

Operational Excellence

Automated updates
Performance monitoring
Disaster recovery
Backup management

Benefits of Using Azure AI Agent Service

Serverless Operations
- No infrastructure management
- Automatic scaling from zero
- Pay-per-execution pricing
- Built-in fault tolerance
- Instant deployment
- Global availability
Simplified Development
- Reduced code complexity
- Fast implementation
- Built-in best practices
Enterprise-Grade Security
- Managed authentication
- Secure data handling
- Compliance-ready infrastructure
Scalability
- Automatic resource management
- Elastic computing capabilities
- Cost-effective scaling
Simplified Operations
- Zero-touch deployments
- Automated scaling
- Built-in monitoring
- Simplified maintenance
Enterprise-Grade Architecture
- Microservices design
- Container orchestration
- Service mesh integration
- Event-driven patterns
Cost Optimization
- Pay-per-use pricing
- Resource optimization
- Automatic scaling
- Shared infrastructure

Key Components

Understanding these components is crucial for effective implementation:

Agent: A custom AI that pairs a large language model with defined instructions and tools.
Thread: A conversation session between an agent and a user. It stores messages and truncates them when needed to fit the model's context.
Message: Content created by either the agent or the user.
Run: The activation of an agent, based on the thread's contents and the configured tools.
Tools: Services and functions that extend the agent's capabilities.
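
As a rough mental model, the relationships between these components can be sketched with a few dataclasses. This is not the service's data model: the max_messages limit, the count-based truncation rule, and the echo-style respond() method are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    role: str      # "user" or "agent"
    content: str


@dataclass
class Thread:
    max_messages: int = 4   # hypothetical limit, for illustration only
    messages: List[Message] = field(default_factory=list)

    def add(self, role: str, content: str) -> Message:
        message = Message(role, content)
        self.messages.append(message)
        # Truncate: keep only the most recent messages.
        self.messages = self.messages[-self.max_messages:]
        return message


@dataclass
class Agent:
    instructions: str

    def respond(self, thread: Thread) -> str:
        # Stand-in for the model call: echo the latest user message.
        last_user = next(m for m in reversed(thread.messages) if m.role == "user")
        return f"[{self.instructions}] {last_user.content}"


@dataclass
class Run:
    agent: Agent
    thread: Thread
    status: str = "queued"

    def execute(self) -> "Run":
        # A run activates the agent against the current thread contents.
        self.thread.add("agent", self.agent.respond(self.thread))
        self.status = "completed"
        return self


thread = Thread(max_messages=4)
agent = Agent(instructions="echo")
for text in ["one", "two", "three"]:
    thread.add("user", text)
    Run(agent, thread).execute()

print([m.content for m in thread.messages])
```

After three exchanges the thread holds only the last four messages, which shows why a thread, not the agent, is the unit that carries conversation state.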

AI Agent Service Steps

Step 1: Create an Agent - provide what the agent needs, such as tools and documents to search. Alternatively, reuse an existing agent by its ID.

Step 2: Create a Thread - track the session

Step 3: Run the Agent - send the message, including attachments if necessary

Step 4: Check the Run status - is the process complete?

Step 5: Display the Agent’s Response - read the response, including any generated images

Step 6: Clean up if necessary - delete the Agent if it's no longer needed
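
The six steps can be walked through end to end with an in-memory stand-in, so the flow is visible without an Azure subscription. FakeAgentsClient below is not the SDK: its method names loosely mirror the calls used later in this article, but its signatures and behavior are assumptions made for illustration.

```python
import itertools
from types import SimpleNamespace


class FakeAgentsClient:
    """In-memory stand-in so the six steps can run locally. Not the real SDK."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.agents = {}
        self.threads = {}

    def create_agent(self, name, instructions):
        agent = SimpleNamespace(id=f"agent_{next(self._ids)}", name=name,
                                instructions=instructions)
        self.agents[agent.id] = agent
        return agent

    def create_thread(self):
        thread = SimpleNamespace(id=f"thread_{next(self._ids)}", messages=[])
        self.threads[thread.id] = thread
        return thread

    def create_message(self, thread_id, role, content):
        self.threads[thread_id].messages.append({"role": role, "content": content})

    def create_and_process_run(self, thread_id, agent_id):
        if agent_id not in self.agents:
            return SimpleNamespace(status="failed", last_error="unknown agent")
        prompt = self.threads[thread_id].messages[-1]["content"]
        self.create_message(thread_id, "agent", f"You said: {prompt}")
        return SimpleNamespace(status="completed", last_error=None)

    def list_messages(self, thread_id):
        return list(self.threads[thread_id].messages)

    def delete_agent(self, agent_id):
        self.agents.pop(agent_id, None)


def ask_once(client, prompt):
    agent = client.create_agent(name="demo-agent", instructions="Echo the user.")  # Step 1
    try:
        thread = client.create_thread()                                            # Step 2
        client.create_message(thread.id, role="user", content=prompt)
        run = client.create_and_process_run(thread.id, agent_id=agent.id)          # Step 3
        if run.status == "failed":                                                 # Step 4
            return f"Run failed: {run.last_error}"
        return client.list_messages(thread.id)[-1]["content"]                      # Step 5
    finally:
        client.delete_agent(agent.id)                                              # Step 6


client = FakeAgentsClient()
reply = ask_once(client, "hello")
print(reply)
```

Placing Step 6 in a finally block means the agent is deleted whether the run succeeds, fails, or raises, which is the behavior the "on the fly" pattern depends on.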

Sample Project: Agent On The Fly, a web app with agents

Visit GitHub repository

The file AgentOnTheFly.py demonstrates a practical blueprint for embedding AI agents into a web application that combines retrieval-augmented generation (RAG) with on-the-fly code generation. This pattern works as follows:

Dynamic Agent Instantiation: Upon uploading a document, the application automatically instantiates an AI agent that creates a vector search index to extract contextual information from the file.
Session-Based Context: The agent establishes session context that preserves state during interaction, ensuring that relevant data is available throughout the conversation.
On-the-Fly Code Generation: By processing user instructions, the agent can generate and execute Python code in real time to perform analyses or create visual outputs.
Automated Resource Cleanup: After completing its task—such as downloading a generated image—the agent terminates, ensuring efficient resource management.

This approach embodies an "agent for hire" model on the fly: you upload a document and the system dynamically creates the necessary backend processes to provide intelligent, context-aware assistance exactly when needed.

1. RAG Implementation

The Retrieval-Augmented Generation capability allows agents to dynamically access and utilize knowledge from uploaded documents, enhancing their responses with contextual information.

            # Upload the file
            file = project_client.agents.upload_file_and_poll(
                file_path=temp_file_path,
                purpose=FilePurpose.AGENTS
            )
            print(f"Uploaded file, file ID: {file.id}")

            # Create vector store if not exists or if new file is uploaded
            if not st.session_state["vector_store_id"]:
                vector_store = project_client.agents.create_vector_store_and_poll(
                    file_ids=[file.id],
                    name=f"vectorstore_{file_obj.name}"
                )
                st.session_state["vector_store_id"] = vector_store.id
                print(f"Created vector store, vector store ID: {vector_store.id}")

            # Create file search tool
            file_search_tool = FileSearchTool(vector_store_ids=[st.session_state["vector_store_id"]])

            agent = project_client.agents.create_agent(
                model=model,
                name="rag-agent",
                instructions="You are a helpful agent which provides answer ONLY from the search.",
                tools=file_search_tool.definitions,
                tool_resources=file_search_tool.resources,
            )
            st.session_state["rag_agent_id"] = agent.id
            print(f"Created agent, agent ID: {agent.id}")

        # Create thread and message
        thread = project_client.agents.create_thread()
        message = project_client.agents.create_message(
            thread_id = thread.id,
            role = "user",
            content = prompt
        )
        print(f"Created message, message ID: {message.id}")

        # Run the agent
        run = project_client.agents.create_and_process_run(
            thread_id=thread.id,
            assistant_id=st.session_state["rag_agent_id"]
        )

        # Check the run status
        if run.status == "failed":
            progress_bar.empty()
            return f"Run failed: {run.last_error}"

        # Get the last message from the agent
        messages = project_client.agents.list_messages(thread_id=thread.id)
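
To give a feel for what the file search tool does conceptually, here is a toy retrieval step in plain Python. It is a simplified stand-in, not the service's implementation: word counts take the place of learned embeddings, and the sample chunks and the embed, cosine, and search helpers are invented for illustration.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    # Toy "embedding": lowercase word counts. Real services use learned dense vectors.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(chunks: List[str], query: str, top_k: int = 1) -> List[Tuple[float, str]]:
    # Score every chunk against the query and return the best matches first.
    query_vec = embed(query)
    scored = [(cosine(embed(chunk), query_vec), chunk) for chunk in chunks]
    return sorted(scored, reverse=True)[:top_k]


chunks = [
    "Invoices are due within 30 days of receipt.",
    "The office is closed on public holidays.",
    "Refunds are processed within 5 business days.",
]
best_score, best_chunk = search(chunks, "When are invoices due?")[0]
print(best_chunk)
```

In a RAG flow, the top-scoring chunks are what get handed to the model as context, which is why an instruction such as "answer ONLY from the search" can be followed at all.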

2. Code Interpretation Capabilities

The code interpretation feature allows AI agents to generate and execute Python code in real-time, enabling data analysis, visualization, and other programmatic tasks based on user requests.

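        # Module-level imports this excerpt relies on (azure-ai-projects, azure-identity):
        #   import time
        #   from azure.ai.projects import AIProjectClient
        #   from azure.ai.projects.models import CodeInterpreterTool, ToolSet
        #   from azure.identity import DefaultAzureCredential
        # Initiate AI Project client
        project_client = AIProjectClient.from_connection_string(
            credential = DefaultAzureCredential(),
            conn_str = conn_str
        )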

        # Add Code Interpreter to the Agent's ToolSet
        toolset = ToolSet()
        code_interpreter_tool = CodeInterpreterTool()
        toolset.add(code_interpreter_tool)

        # Initiate Agent Service
        agent = project_client.agents.create_agent(
            model = model,
            name = "code-interpreter-agent",
            instructions = "You are a helpful data analyst. You can use Python to perform required calculations.",
            toolset = toolset
        )
        print(f"Created agent, agent ID: {agent.id}")

        # Create a thread
        thread = project_client.agents.create_thread()
        print(f"Created thread, thread ID: {thread.id}")

        # Create a message
        message = project_client.agents.create_message(
            thread_id = thread.id,
            role = "user",
            content = prompt
        )
        print(f"Created message, message ID: {message.id}")

        # Run the agent
        run = project_client.agents.create_and_process_run(
            thread_id = thread.id,
            assistant_id = agent.id
        )

        # Check the run status
        if run.status == "failed":
            project_client.agents.delete_agent(agent.id)
            print(f"Deleted agent, agent ID: {agent.id}")
            progress_bar.empty()
            return f"Run failed: {run.last_error}"

        time.sleep(10)  # Increase delay if needed

        # Get the last message from the agent
        messages = project_client.agents.list_messages(thread_id=thread.id)
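
The excerpt above waits a fixed ten seconds before reading messages. One alternative, sketched here under assumptions, is to poll for a terminal status with a timeout. The wait_for_status helper, the get_status callable, and the set of terminal status names are hypothetical; whether any extra wait is needed at all depends on how the SDK call you use processes the run.

```python
import time
from typing import Callable, Iterable


def wait_for_status(
    get_status: Callable[[], str],
    done: Iterable[str] = ("completed", "failed", "cancelled", "expired"),
    timeout: float = 30.0,
    interval: float = 0.5,
) -> str:
    """Poll get_status() until it returns a terminal value or the timeout elapses."""
    terminal = set(done)
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still '{status}' after {timeout} seconds")
        time.sleep(interval)


# Simulated status source: two non-terminal states, then "completed".
_states = iter(["queued", "in_progress", "completed"])
final_status = wait_for_status(lambda: next(_states), interval=0.01)
print(final_status)
```

Compared with a fixed sleep, polling returns as soon as the work is done and fails loudly when it takes too long, instead of silently reading an incomplete thread.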

3. Combined Approach: Code Interpretation + RAG

This combined approach integrates both capabilities:

Document Knowledge Access: Agents can search and retrieve information from uploaded documents
Live Code Execution: Python code is generated and run in real-time
Data-Driven Analysis: Perform analysis on document contents directly
Interactive Visualizations: Create charts and graphs based on document data
Automated Problem Solving: Generate solutions that combine document knowledge with computational capabilities

        project_client = AIProjectClient.from_connection_string(
            credential=DefaultAzureCredential(),
            conn_str=conn_str
        )

        # Save and upload file
        temp_file_path = f"temp_{file_obj.name}"
        with open(temp_file_path, "wb") as f:
            f.write(file_obj.getvalue())

        # Upload the file
        file = project_client.agents.upload_file_and_poll(
            file_path=temp_file_path,
            purpose=FilePurpose.AGENTS
        )
        print(f"Uploaded file, file ID: {file.id}")

        # Create vector store
        vector_store = project_client.agents.create_vector_store_and_poll(
            file_ids=[file.id],
            name=f"vectorstore_{file_obj.name}"
        )
        print(f"Created vector store, vector store ID: {vector_store.id}")

        # Setup tools
        toolset = ToolSet()
        code_interpreter_tool = CodeInterpreterTool()
        toolset.add(code_interpreter_tool)
        print("Added Code Interpreter tool")

        file_search_tool = FileSearchTool(vector_store_ids=[vector_store.id])
        toolset.add(file_search_tool)
        print("Added File Search tool")


        # Create agent with both capabilities
        agent = project_client.agents.create_agent(
            model=model,
            name="rag-code-interpreter-agent",
            instructions="You are a helpful agent that can analyze documents and generate Python code based on the document content. Use the file search to extract relevant information and then generate appropriate Python code for analysis when needed.",
            toolset = toolset
        )

        print(f"Created agent, agent ID: {agent.id}")

        # Create thread and message
        thread = project_client.agents.create_thread()
        message = project_client.agents.create_message(
            thread_id=thread.id,
            role="user",
            content=prompt
        )
        print(f"Created message, message ID: {message.id}")

        # Run the agent
        run = project_client.agents.create_and_process_run(
            thread_id=thread.id,
            assistant_id=agent.id
        )

        if run.status == "failed":
            project_client.agents.delete_agent(agent.id)
            project_client.agents.delete_vector_store(vector_store.id)
            os.remove(temp_file_path)
            return f"Run failed: {run.last_error}"

        # Get the last message from the agent
        messages = project_client.agents.list_messages(thread_id=thread.id)
        print(f"Messages: {pformat(messages)}")
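
The excerpt above cleans up only on the failure path. One way to guarantee cleanup on every path is to register each resource's teardown as soon as the resource is created, for example with contextlib.ExitStack. The sketch below is a local simulation: the temporary file is real, while the vector store and agent deletions are represented by log entries, and run_with_cleanup is a hypothetical function, not code from the sample project.

```python
import os
import tempfile
from contextlib import ExitStack

cleanup_log = []


def run_with_cleanup(fail: bool):
    # Callbacks registered on the stack run in reverse order when the block
    # exits, whether it returns normally or raises.
    with ExitStack() as stack:
        # Create a temporary file and register its removal immediately.
        fd, temp_path = tempfile.mkstemp(prefix="temp_")
        os.close(fd)
        stack.callback(os.remove, temp_path)

        # Stand-ins for deleting the vector store and the agent.
        stack.callback(cleanup_log.append, "vector store deleted")
        stack.callback(cleanup_log.append, "agent deleted")

        message = "Run failed: simulated error" if fail else "Run completed"
        return message, temp_path


message, temp_path = run_with_cleanup(fail=True)
print(message, cleanup_log, os.path.exists(temp_path))
```

With real resources, the two log callbacks would be replaced by the corresponding delete calls, registered right after each create call so that nothing is left behind if a later step fails.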

Project Summary

The Agent on the Fly project demonstrates the practical implementation of Agent as a Service (AaaS) principles through a streamlined, user-friendly web application. This project showcases:

Core Capabilities

Document Intelligence: Upload any document and instantly access an AI agent that can understand, analyze, and respond to questions about its contents
Dynamic AI Agent Creation: Agents are created on-demand, precisely when needed, and only exist for the duration of their task
Serverless Architecture: Built on Azure's serverless infrastructure for optimal resource utilization and scalability
Combined RAG and Code Generation: Unique integration of retrieval capabilities with real-time code execution
Interactive Data Analysis: Ask questions about your data and receive visual insights without writing a single line of code

Technical Implementation

The project leverages several key technologies:

Azure OpenAI Service: Powers the underlying language model capabilities
Azure AI Search: Enables efficient vector search and document retrieval
Python Runtime Environment: Executes generated code within secure boundaries
Web-based Interface: Provides an intuitive entry point for users of all technical levels

Top comments (2)

neonrehan • Mar 11 • Edited

Thank you for this write up! We were just trying to build a new business model for agentic AI, and the term "Agents as a Service" is a perfect way to sum up the business opportunity. Also, the evolution of MSA to Agent-MSA is a fascinating and modular way of looking at it when building a Rapid-Prototyping factory or capability in your organization. Do you have an example of a "clean up" step, where an Agent goes through a teardown, gives back resources, or stops consuming resources after the job is complete or the user terminates it? A project or use-case would illuminate this for me.

Ron Dagdag • Mar 15

Thanks. For the clean up step, it really depends on the resources the agent connects to. I recommend separating that out, so the agent calls other services such as an API, Azure Functions, or AWS Lambda, and those functions are responsible for releasing the resources they consume. This way there's separation of concerns. The agent is not responsible for connecting to or disconnecting from a database; it identifies the steps and executes them by calling other services. An AI agent should not care about database connections; the API should be responsible for maintaining them.