Derek

LangChain x AI Agent A2Z: Agent Deployment Tutorial on How to Bring Your Agent Live

Abstract

Deploying a framework-based AI agent online is often difficult. In this blog, we will introduce
how to deploy a LangChain-based AI agent and bring it from a local script to an online service with a /chat endpoint. The tutorial covers two examples, content-builder-agent and deep_research, from the LangChain DeepAgents repo (github: langchain-ai/deepagents). The agent deployment templates are available on GitHub and can be deployed easily via the A2Z Deployment Doc and the A2Z Deployment Platform. After deployment, the agent is live and exposes a "/chat" endpoint.

Tutorial

Step 1. Convert the LangChain DeepAgents Agent to a LiveRuntime

The agent in its original implementation has two skills, blog-post and social-media,
and is created with the create_deep_agent factory function.

# from deepagents import create_deep_agent  (assumed import path; see the repo example)

def create_content_writer():
    """Create a content writer agent configured by filesystem files."""
    return create_deep_agent(
        memory=["./AGENTS.md"],           # Loaded by MemoryMiddleware
        skills=["./skills/"],             # Loaded by SkillsMiddleware
        tools=[generate_cover, generate_social_image],  # Image generation
        subagents=load_subagents(EXAMPLE_DIR / "subagents.yaml"),  # Custom helper
        backend=FilesystemBackend(root_dir=EXAMPLE_DIR),
    )

Step 2. Create a BaseLiveRuntime and Implement an Async Generator

The BaseLiveRuntime object produces a FastAPI app that exposes a /chat endpoint taking messages-format input.
To bring the agent online, BaseLiveRuntime takes two arguments: the first is the agent object defined by your
framework of choice (LangChain, CrewAI, OpenAI Agents SDK, etc.); the second is an async generator that defines how the
agent processes the input, such as agent.run, agent.invoke, or a custom function.
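Before wiring in a real framework, the adapter pattern itself can be sketched with a stubbed agent. Everything here is illustrative: `FakeAgent`, the chunk schema, and the separator are stand-ins, not the actual ai_agent_marketplace API.

```python
import asyncio
import json
from typing import Any, AsyncGenerator

SEPARATOR = "\n"  # stand-in for the runtime's streaming separator

class FakeAgent:
    """Illustrative stand-in for a framework agent exposing invoke()."""
    def invoke(self, payload: dict) -> dict:
        user = payload["messages"][-1]["content"]
        return {"messages": [{"role": "assistant", "content": f"echo: {user}"}]}

async def stream_generator(agent: Any, user_query: str) -> AsyncGenerator[str, None]:
    """Adapt a blocking invoke() call into newline-delimited JSON chunks."""
    result = agent.invoke({"messages": [{"role": "user", "content": user_query}]})
    for message in result.get("messages", []):
        yield json.dumps({"type": "assistant", "content": message["content"]}) + SEPARATOR
        await asyncio.sleep(0)  # hand control back to the event loop between chunks

async def collect(agent: Any, query: str) -> list:
    return [chunk async for chunk in stream_generator(agent, query)]

chunks = asyncio.run(collect(FakeAgent(), "hello"))
```

The runtime consumes the generator chunk by chunk, which is what lets the FastAPI /chat endpoint stream output instead of blocking until the agent finishes.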

Define Runtime
from ai_agent_marketplace.runtime.base import *

async def content_builder_stream_generator(
    agent: Any,
    user_query: str,
    **kwargs
) -> AsyncGenerator[str, None]:
    """
    """
    ## more

runtime = BaseLiveRuntime(
    agent=agent,
    stream_handler=content_builder_stream_generator
)

# Returns a FastAPI-based app with a /chat endpoint
app = runtime.app

Create a Streaming Adapter

Define an async generator that adapts your LangChain agent's output into streaming chunks.

The async generator takes two parameters: agent, a customized agent object, and user_query,
which is parsed from the messages object posted to the "/chat" endpoint.
Inside the async generator, the agent is called via agent.invoke({"messages": messages}).

from ai_agent_marketplace.runtime.base import *
from typing import Any, AsyncGenerator
import json
import uuid
import asyncio

async def deepagents_stream_generator(
    agent: Any,
    user_query: str,
    **kwargs
) -> AsyncGenerator[str, None]:
    """
    Universal async adapter for LangChain agent
    """

    # Send initial streaming message
    initial_content = "Task started. Research may take a few minutes."
    initial_chunk = json.dumps(
        assembly_message(
            type=MESSAGE_TYPE_ASSISTANT,
            format=OUTPUT_FORMAT_TEXT,
            content=initial_content,
            content_type=CONTENT_TYPE_MARKDOWN,
            section=SECTION_ANSWER,
            message_id=str(uuid.uuid4()),
            template=TEMPLATE_STREAMING_CONTENT_TYPE,
        )
    )

    yield initial_chunk + STREAMING_SEPARATOR_DEFAULT
    await asyncio.sleep(0)

    try:
        # Call LangChain agent
        messages = [{"role": "user", "content": user_query}]
        result = agent.invoke({"messages": messages})

        output_messages = result.get("messages", [])

        for message in output_messages:
            message_id, content, role = extract_message_content_langchain(message)

            output_chunk = json.dumps(
                assembly_message(
                    type=MESSAGE_TYPE_ASSISTANT,
                    format=OUTPUT_FORMAT_TEXT,
                    content=content,
                    content_type=CONTENT_TYPE_MARKDOWN,
                    section=SECTION_ANSWER,
                    message_id=message_id,
                    template=TEMPLATE_STREAMING_CONTENT_TYPE,
                )
            )

            yield output_chunk + STREAMING_SEPARATOR_DEFAULT

    except Exception as exc:
        # Surface the error to the client instead of silently ending the stream
        yield json.dumps({"type": "error", "content": str(exc)}) + STREAMING_SEPARATOR_DEFAULT
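The extract_message_content_langchain helper above comes from the runtime package. A hypothetical equivalent, handling both plain-dict messages and LangChain-style message objects, might look like this (a sketch, not the package's actual implementation):

```python
import uuid
from typing import Any, Tuple

def extract_message_content(message: Any) -> Tuple[str, str, str]:
    """Return (message_id, content, role) from a LangChain-style message.

    Handles plain dicts ({"role": ..., "content": ...}) and objects that
    expose .content/.type/.id attributes, as LangChain messages do.
    """
    if isinstance(message, dict):
        message_id = message.get("id") or str(uuid.uuid4())
        return message_id, str(message.get("content", "")), message.get("role", "assistant")
    message_id = getattr(message, "id", None) or str(uuid.uuid4())
    content = getattr(message, "content", "")
    role = getattr(message, "type", "assistant")
    return message_id, str(content), role

# Example with a plain-dict message
mid, content, role = extract_message_content({"role": "user", "content": "hi", "id": "m-1"})
```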

Step 3. Deploy the Agent Live

Go to the deployment workspace (DeepNLP AI Agent A2Z Deployment)

  1. Choose the GitHub tab
  2. Public URL: https://github.com/aiagenta2z/agent-mcp-deployment-templates
  3. Entry point command shell:
uvicorn langchain_deepagents.deep_research.research_agent_server:app
  4. Set the environment variables
# Set API keys
export GOOGLE_API_KEY="..."      # For image generation
export TAVILY_API_KEY="..."      # For web search (optional)
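Since a missing key only shows up at the first tool call, it can help to fail fast at startup. A minimal check (a hypothetical helper, not part of the platform):

```python
import os

def check_required_env(required, optional=()):
    """Return missing required variable names; warn for unset optional ones."""
    missing = [name for name in required if not os.environ.get(name)]
    for name in optional:
        if not os.environ.get(name):
            print(f"warning: optional variable {name} is not set")
    return missing

os.environ.setdefault("GOOGLE_API_KEY", "test-key")  # simulate a configured key
missing = check_required_env(["GOOGLE_API_KEY"], optional=["TAVILY_API_KEY"])
```

Calling this at module import time in the server file makes a misconfigured deployment fail immediately rather than on the first request.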

Step 4. Click Deploy and You Will Get the URL

Deployment of the LangChain Content Builder Agent

Get the product /chat POST URL:

https://langchain-ai.aiagenta2z.com/content-builder-agent/chat

Architecture Summary

LangChain Agent
        ↓
Streaming Adapter (Async Generator)
        ↓
BaseLiveRuntime
        ↓
FastAPI App (/chat)
        ↓
Streaming JSON Response

Step 5. Test the Deployed Agent with curl

Case 1: Simple Math

curl -X POST "http://localhost:8000/chat" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Calculate 1+1 result"}]}'

Sample Streaming Output

{"type":"assistant","format":"text","content":"Task Started...","section":"answer","message_id":"670d3458-a539-406f-a786-1afc0f0fc201","content_type":"text/markdown","template":"streaming_content_type"}
{"type":"assistant","format":"text","content":"Calculate 1+1 result","section":"answer","message_id":"701be311-37e3-4ee1-9519-6d8e65b47f59","content_type":"text/markdown","template":"streaming_content_type"}
{"type":"assistant","format":"text","content":"1 + 1 = 2","section":"answer","message_id":"lc_run--019c55fe-4ed2-7da3-9e05-0a8758aa10cc-0","content_type":"text/markdown","template":"streaming_content_type"}
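Each line of the stream is an independent JSON object. A minimal client-side parser, assuming a newline separator (the actual STREAMING_SEPARATOR_DEFAULT may differ):

```python
import json

def parse_stream(raw: str, separator: str = "\n") -> list:
    """Split a streamed /chat response into its JSON chunks, skipping blank lines."""
    chunks = []
    for line in raw.split(separator):
        line = line.strip()
        if line:
            chunks.append(json.loads(line))
    return chunks

raw = (
    '{"type":"assistant","content":"Task Started..."}\n'
    '{"type":"assistant","content":"1 + 1 = 2"}\n'
)
messages = parse_stream(raw)
contents = [m["content"] for m in messages]
```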

Case 2: Research Task

curl -X POST "http://localhost:8000/chat" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"research context engineering approaches used to build AI agents"}]}'

Sample Streaming Output (Truncated)

{"type":"assistant","content":"Task Started..."}
{"type":"assistant","content":"Updated todo list ..."}
{"type":"assistant","content":"Updated file /research_request.md"}
{"type":"assistant","content":"Here is a comprehensive report on context engineering approaches..."}

The response is streamed incrementally as the agent reasons, calls tools, and produces final output.


Deploy and Test Examples

You can also test the publicly deployed example:

curl -X POST "https://deepagents.aiagenta2z.com/deep_research/chat" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"research context engineering approaches used to build AI agents"}]}'
{"type":"assistant","content":"Task Started..."}
{"type":"assistant","content":"Updated todo list ..."}
{"type":"assistant","content":"Updated file /research_request.md"}
{"type":"assistant","content":"Here is a comprehensive report on context engineering approaches..."}
