Derek

LangChain x AI Agent A2Z: Agent Deployment Tutorial on How to Bring Your Agent Live

Abstract

Deploying a framework-based AI agent online is often difficult. In this blog, we will introduce
how to deploy a LangChain-based AI agent and bring it from a local script to an online service with a /chat endpoint. The tutorial covers two examples, content-builder-agent and deep_research, from the LangChain DeepAgents repo (github: langchain-ai/deepagents). The agent deployment templates are available on GitHub and can be deployed easily via the A2Z Deployment Doc and the A2Z Deployment Platform. After deployment, the agent is live and exposes a "/chat" endpoint.

Tutorial

Step 1. Convert the LangChain DeepAgents Agent to a LiveRuntime

The agent in its original implementation has two skills, blog-post and social-media,
and is created with the create_deep_agent factory function.

# from deepagents import create_deep_agent  (assumed import path; see the repo example)

def create_content_writer():
    """Create a content writer agent configured by filesystem files."""
    return create_deep_agent(
        memory=["./AGENTS.md"],           # Loaded by MemoryMiddleware
        skills=["./skills/"],             # Loaded by SkillsMiddleware
        tools=[generate_cover, generate_social_image],  # Image generation
        subagents=load_subagents(EXAMPLE_DIR / "subagents.yaml"),  # Custom helper
        backend=FilesystemBackend(root_dir=EXAMPLE_DIR),
    )

Step 2. Create a BaseLiveRuntime and Implement an Async Generator

The BaseLiveRuntime object produces a FastAPI app that exposes a /chat endpoint taking messages-format input.
To bring the agent online, BaseLiveRuntime takes two arguments: the first is the agent object defined by your
framework of choice (LangChain, CrewAI, OpenAI Agents SDK, etc.); the second is an async generator that defines how the
agent processes the input, such as agent.run, agent.invoke, or a custom function.
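Before wiring in a real framework, the adapter pattern itself can be sketched with a stubbed agent. Everything here is illustrative: `FakeAgent`, the chunk schema, and the separator are stand-ins, not the actual ai_agent_marketplace API.

```python
import asyncio
import json
from typing import Any, AsyncGenerator

SEPARATOR = "\n"  # stand-in for the runtime's streaming separator

class FakeAgent:
    """Illustrative stand-in for a framework agent exposing invoke()."""
    def invoke(self, payload: dict) -> dict:
        user = payload["messages"][-1]["content"]
        return {"messages": [{"role": "assistant", "content": f"echo: {user}"}]}

async def stream_generator(agent: Any, user_query: str) -> AsyncGenerator[str, None]:
    """Adapt a blocking invoke() call into newline-delimited JSON chunks."""
    result = agent.invoke({"messages": [{"role": "user", "content": user_query}]})
    for message in result.get("messages", []):
        yield json.dumps({"type": "assistant", "content": message["content"]}) + SEPARATOR
        await asyncio.sleep(0)  # hand control back to the event loop between chunks

async def collect(agent: Any, query: str) -> list:
    return [chunk async for chunk in stream_generator(agent, query)]

chunks = asyncio.run(collect(FakeAgent(), "hello"))
```

The runtime consumes the generator chunk by chunk, which is what lets the FastAPI /chat endpoint stream output instead of blocking until the agent finishes.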

Define Runtime
from ai_agent_marketplace.runtime.base import *

async def content_builder_stream_generator(
    agent: Any,
    user_query: str,
    **kwargs
) -> AsyncGenerator[str, None]:
    """
    """
    ## more

runtime = BaseLiveRuntime(
    agent=agent,
    stream_handler=content_builder_stream_generator
)

# Returns a FastAPI-based app with a /chat endpoint
app = runtime.app

Create a Streaming Adapter

Define an async generator that adapts your LangChain agent's output into streaming chunks.

The async generator takes two parameters: agent, a customized agent object, and user_query,
which is parsed from the messages object posted to the "/chat" endpoint.
Inside the async generator, the agent is called via agent.invoke({"messages": messages}).

from ai_agent_marketplace.runtime.base import *
from typing import Any, AsyncGenerator
import json
import uuid
import asyncio

async def deepagents_stream_generator(
    agent: Any,
    user_query: str,
    **kwargs
) -> AsyncGenerator[str, None]:
    """
    Universal async adapter for LangChain agent
    """

    # Send initial streaming message
    initial_content = "Task started. Research may take a few minutes."
    initial_chunk = json.dumps(
        assembly_message(
            type=MESSAGE_TYPE_ASSISTANT,
            format=OUTPUT_FORMAT_TEXT,
            content=initial_content,
            content_type=CONTENT_TYPE_MARKDOWN,
            section=SECTION_ANSWER,
            message_id=str(uuid.uuid4()),
            template=TEMPLATE_STREAMING_CONTENT_TYPE,
        )
    )

    yield initial_chunk + STREAMING_SEPARATOR_DEFAULT
    await asyncio.sleep(0)

    try:
        # Call LangChain agent
        messages = [{"role": "user", "content": user_query}]
        result = agent.invoke({"messages": messages})

        output_messages = result.get("messages", [])

        for message in output_messages:
            message_id, content, role = extract_message_content_langchain(message)

            output_chunk = json.dumps(
                assembly_message(
                    type=MESSAGE_TYPE_ASSISTANT,
                    format=OUTPUT_FORMAT_TEXT,
                    content=content,
                    content_type=CONTENT_TYPE_MARKDOWN,
                    section=SECTION_ANSWER,
                    message_id=message_id,
                    template=TEMPLATE_STREAMING_CONTENT_TYPE,
                )
            )

            yield output_chunk + STREAMING_SEPARATOR_DEFAULT

    except Exception as exc:
        # Surface the error to the client instead of silently ending the stream
        yield json.dumps({"type": "error", "content": str(exc)}) + STREAMING_SEPARATOR_DEFAULT
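The extract_message_content_langchain helper above comes from the runtime package. A hypothetical equivalent, handling both plain-dict messages and LangChain-style message objects, might look like this (a sketch, not the package's actual implementation):

```python
import uuid
from typing import Any, Tuple

def extract_message_content(message: Any) -> Tuple[str, str, str]:
    """Return (message_id, content, role) from a LangChain-style message.

    Handles plain dicts ({"role": ..., "content": ...}) and objects that
    expose .content/.type/.id attributes, as LangChain messages do.
    """
    if isinstance(message, dict):
        message_id = message.get("id") or str(uuid.uuid4())
        return message_id, str(message.get("content", "")), message.get("role", "assistant")
    message_id = getattr(message, "id", None) or str(uuid.uuid4())
    content = getattr(message, "content", "")
    role = getattr(message, "type", "assistant")
    return message_id, str(content), role

# Example with a plain-dict message
mid, content, role = extract_message_content({"role": "user", "content": "hi", "id": "m-1"})
```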

Step 3. Deploy the Agent Live

Go to the deployment workspace (DeepNLP AI Agent A2Z Deployment)

  1. Choose the GitHub tab
  2. Public URL: https://github.com/aiagenta2z/agent-mcp-deployment-templates
  3. Entry point command shell:
uvicorn langchain_deepagents.deep_research.research_agent_server:app
  4. Set the environment variables
# Set API keys
export GOOGLE_API_KEY="..."      # For image generation
export TAVILY_API_KEY="..."      # For web search (optional)
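Since a missing key only shows up at the first tool call, it can help to fail fast at startup. A minimal check (a hypothetical helper, not part of the platform):

```python
import os

def check_required_env(required, optional=()):
    """Return missing required variable names; warn for unset optional ones."""
    missing = [name for name in required if not os.environ.get(name)]
    for name in optional:
        if not os.environ.get(name):
            print(f"warning: optional variable {name} is not set")
    return missing

os.environ.setdefault("GOOGLE_API_KEY", "test-key")  # simulate a configured key
missing = check_required_env(["GOOGLE_API_KEY"], optional=["TAVILY_API_KEY"])
```

Calling this at module import time in the server file makes a misconfigured deployment fail immediately rather than on the first request.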

Step 4. Click Deploy and You Will Get the URL

Deployment of the LangChain Content Builder Agent

Get the product /chat POST URL:

https://langchain-ai.aiagenta2z.com/content-builder-agent/chat

Architecture Summary

LangChain Agent
        ↓
Streaming Adapter (Async Generator)
        ↓
BaseLiveRuntime
        ↓
FastAPI App (/chat)
        ↓
Streaming JSON Response

Step 5. Test the Deployed Agent with curl

Case 1: Simple Math

curl -X POST "http://localhost:8000/chat" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Calculate 1+1 result"}]}'

Sample Streaming Output

{"type":"assistant","format":"text","content":"Task Started...","section":"answer","message_id":"670d3458-a539-406f-a786-1afc0f0fc201","content_type":"text/markdown","template":"streaming_content_type"}
{"type":"assistant","format":"text","content":"Calculate 1+1 result","section":"answer","message_id":"701be311-37e3-4ee1-9519-6d8e65b47f59","content_type":"text/markdown","template":"streaming_content_type"}
{"type":"assistant","format":"text","content":"1 + 1 = 2","section":"answer","message_id":"lc_run--019c55fe-4ed2-7da3-9e05-0a8758aa10cc-0","content_type":"text/markdown","template":"streaming_content_type"}
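Each line of the stream is an independent JSON object. A minimal client-side parser, assuming a newline separator (the actual STREAMING_SEPARATOR_DEFAULT may differ):

```python
import json

def parse_stream(raw: str, separator: str = "\n") -> list:
    """Split a streamed /chat response into its JSON chunks, skipping blank lines."""
    chunks = []
    for line in raw.split(separator):
        line = line.strip()
        if line:
            chunks.append(json.loads(line))
    return chunks

raw = (
    '{"type":"assistant","content":"Task Started..."}\n'
    '{"type":"assistant","content":"1 + 1 = 2"}\n'
)
messages = parse_stream(raw)
contents = [m["content"] for m in messages]
```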

Case 2: Research Task

curl -X POST "http://localhost:8000/chat" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"research context engineering approaches used to build AI agents"}]}'

Sample Streaming Output (Truncated)

{"type":"assistant","content":"Task Started..."}
{"type":"assistant","content":"Updated todo list ..."}
{"type":"assistant","content":"Updated file /research_request.md"}
{"type":"assistant","content":"Here is a comprehensive report on context engineering approaches..."}

The response is streamed incrementally as the agent reasons, calls tools, and produces final output.


Deploy and Test Examples

You can also test the publicly deployed example:

curl -X POST "https://deepagents.aiagenta2z.com/deep_research/chat" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"research context engineering approaches used to build AI agents"}]}'
{"type":"assistant","content":"Task Started..."}
{"type":"assistant","content":"Updated todo list ..."}
{"type":"assistant","content":"Updated file /research_request.md"}
{"type":"assistant","content":"Here is a comprehensive report on context engineering approaches..."}
