Gerardo Moreno

Run Autogen Generated Code in the Cloud

Autogen is an open-source framework for building agentic workflows. While agents are powerful on their own, their full potential is realized when they're equipped with the right tools, one of the most important being the ability to execute code. However, this capability also introduces risks: agents sometimes make mistakes, and when your own machine is their playground, those mistakes can be costly, such as running code that deletes important files. To mitigate these risks, it's essential to give agents a safe, containerized environment for running their code.

Autogen allows you to run code inside a Docker container on your local machine. But if you don't have Docker installed or need more scalability, running code in the cloud is a great alternative.

In this tutorial, we'll use e2b, a service that provides sandboxes specifically designed for running AI agent-generated code! To use e2b, you'll need to create an account; the free tier is more than enough.
We'll work from an example in the documentation of the recently released Autogen 0.4. Our focus is on integrating e2b, without diving too deeply into the specifics of Autogen syntax. You can refer to the Autogen 0.4 documentation for more details on that.
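
Before wiring e2b into Autogen, it helps to verify your environment with a standalone sandbox run. Here's a minimal sketch, assuming you've installed the e2b-code-interpreter package and exported E2B_API_KEY (from your e2b dashboard) and OPENAI_API_KEY as environment variables; the kill() call assumes the current (v1) Python SDK:

import os

from e2b_code_interpreter import Sandbox

# The SDK reads E2B_API_KEY from the environment; the model client later
# reads OPENAI_API_KEY the same way.
assert os.environ.get("E2B_API_KEY"), "Set E2B_API_KEY from your e2b dashboard"
assert os.environ.get("OPENAI_API_KEY"), "Set OPENAI_API_KEY for the model client"

# Spin up a sandbox, run a trivial snippet, and tear it down again.
sbx = Sandbox()
execution = sbx.run_code("print(2 + 2)")
print(execution.logs.stdout)  # captured stdout lines from the sandbox
sbx.kill()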

Here’s the code from the documentation, with additional imports necessary for using the e2b Sandbox:

import asyncio
from dataclasses import dataclass

from autogen_core.application import SingleThreadedAgentRuntime
from autogen_core.base import MessageContext
from autogen_core.components import (
    DefaultTopicId,
    RoutedAgent,
    default_subscription,
    message_handler,
)
from autogen_core.components.code_executor import extract_markdown_code_blocks
from autogen_core.components.model_context import BufferedChatCompletionContext
from autogen_core.components.models import (
    AssistantMessage,
    ChatCompletionClient,
    OpenAIChatCompletionClient,
    SystemMessage,
    UserMessage,
)
from e2b_code_interpreter import Sandbox


@dataclass
class Message:
    content: str


@default_subscription
class Assistant(RoutedAgent):
    def __init__(self, model_client: ChatCompletionClient) -> None:
        super().__init__("An assistant agent.")
        self._model_client = model_client
        self._model_context = BufferedChatCompletionContext(buffer_size=5)
        self._system_messages = [
            SystemMessage(
                content="""Write Python script in markdown block, and it will be executed. 
                Always save figures to file in the current directory. Do not use plt.show().""",
            )
        ]

    @message_handler
    async def handle_message(self, message: Message, ctx: MessageContext) -> None:
        await self._model_context.add_message(
            UserMessage(content=message.content, source="user")
        )
        messages = self._system_messages + (await self._model_context.get_messages())
        result = await self._model_client.create(
            messages,
            cancellation_token=ctx.cancellation_token,
        )
        print(f"\n{'-'*80}\nAssistant:\n{result.content}")
        await self._model_context.add_message(
            AssistantMessage(content=result.content, source="assistant")
        )
        await self.publish_message(Message(content=result.content), DefaultTopicId())


Defining the Executor

Now, let's define the Executor agent, which contains the logic for running code in the e2b Sandbox. Its handle_message function is triggered whenever a new message is received, and it only executes code if code blocks are found in the message (using extract_markdown_code_blocks).

For each code block, we call the helper function execute_code_block_in_sandbox, which runs the code and collects the logs. These logs serve as feedback for the model.

@default_subscription
class Executor(RoutedAgent):
    def __init__(self) -> None:
        super().__init__("An executor agent.")
        # One e2b sandbox is created per Executor and reused across messages.
        self.sbx = Sandbox()

    @message_handler
    async def handle_message(self, message: Message, ctx: MessageContext) -> None:
        code_blocks = extract_markdown_code_blocks(message.content)

        result = ""
        for code_block in code_blocks:
            result += self.execute_code_block_in_sandbox(code_block)

        # Publish the collected logs so the Assistant can react to them.
        if result:
            await self.publish_message(Message(content=result), DefaultTopicId())

    def execute_code_block_in_sandbox(self, code_block):
        if code_block.language == "python":
            # run_code executes Python inside the sandbox's code interpreter.
            execution = self.sbx.run_code(code_block.code)
            return ("".join(execution.logs.stdout) or "") + (
                execution.error.value if execution.error else ""
            )
        elif code_block.language == "bash":
            # commands.run executes a shell command inside the sandbox.
            execution = self.sbx.commands.run(code_block.code)
            return (execution.stdout or "") + (execution.error or "")
        else:
            return f"Unsupported language: {code_block.language}"
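
If you're curious about what extract_markdown_code_blocks actually hands the executor, you can try it on its own. Here's a quick sketch (the sample message is just an illustration; as used above, the parser returns code blocks with a language and a code attribute):

from autogen_core.components.code_executor import extract_markdown_code_blocks

# Build a markdown message containing one fenced Python block.
fence = "```"
sample = f"Here is the script:\n{fence}python\nprint('hello')\n{fence}\n"

for block in extract_markdown_code_blocks(sample):
    print(block.language)  # the fence's info string, e.g. "python"
    print(block.code)      # the code inside the fence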

Currently, our executor only handles Python and Bash code blocks, but that's sufficient for this example.
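
If you want to give the model richer feedback, the helper can also surface stderr and tracebacks. Here's a minimal variation, assuming the e2b result objects expose execution.logs.stderr, execution.error.traceback, and execution.stderr as in the current SDK:

    def execute_code_block_in_sandbox(self, code_block):
        # Variation of the helper above that also returns stderr/tracebacks.
        if code_block.language == "python":
            execution = self.sbx.run_code(code_block.code)
            output = "".join(execution.logs.stdout) + "".join(execution.logs.stderr)
            if execution.error:
                output += execution.error.traceback
            return output
        elif code_block.language == "bash":
            execution = self.sbx.commands.run(code_block.code)
            return (execution.stdout or "") + (execution.stderr or "")
        return f"Unsupported language: {code_block.language}"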

Running the Setup

Finally, let’s write the code to run the setup. We register both the Assistant and Executor agents using Autogen's core syntax and publish a message to the agents.

async def main() -> None:
    runtime = SingleThreadedAgentRuntime()

    await Assistant.register(
        runtime,
        "assistant",
        lambda: Assistant(OpenAIChatCompletionClient(model="gpt-4")),
    )

    await Executor.register(runtime, "executor", lambda: Executor())

    runtime.start()
    await runtime.publish_message(
        Message("Create a plot of NVIDIA vs TSLA stock returns YTD from 2024-01-01."),
        DefaultTopicId(),
    )
    await runtime.stop_when_idle()

if __name__ == "__main__":
    asyncio.run(main())

This script sets up the runtime, registers the agents, and publishes a test message asking the assistant to create a stock plot. The Executor will handle any code blocks generated by the assistant and execute them in the sandbox.

Flow Diagram

Here's a recap of what just happened internally:

(Flow diagram: the Assistant publishes generated code, the Executor runs it in the e2b sandbox and publishes the execution logs back to the Assistant.)

Downloading the image

Make sure you don't leave your NVIDIA vs. TSLA plot behind in the sandbox! Reconnect to the running sandbox and download the file:

from e2b_code_interpreter import Sandbox

# Reconnect to the sandbox that the Executor left running.
running_sandboxes = Sandbox.list()
sdb_info = running_sandboxes[0]
sdb = Sandbox(sandbox_id=sdb_info.sandbox_id)

# Confirm the generated PNG is there.
print(sdb.commands.run("ls -l").stdout)

# Download the file and save it locally.
content = sdb.files.read('nvidia_vs_tsla_cumulative_returns_2024.png', format='bytes')
with open('NVIDIA_vs_TSLA_YTD_2024.png', 'wb') as file:
    file.write(content)

(Generated plot: NVIDIA vs. TSLA cumulative returns, YTD 2024)
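
Once the plot is saved locally, you can shut the sandbox down instead of letting it idle until its timeout. A one-line cleanup, assuming the kill() method of the current e2b SDK:

# Tear down the sandbox now that the file is saved locally.
sdb.kill()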
