Industrial operations face a critical challenge: according to Aberdeen Research, the average cost of unplanned downtime exceeds $260,000 per hour. Yet many industrial environments operate in conditions where cloud connectivity is unreliable or impossible: underground mines without signal, remote manufacturing plants with legacy infrastructure, offshore oil platforms far out at sea, and facilities with strict data sovereignty requirements.
Ana Cunha, Senior Developer Advocate at AWS, and David Victoria, AWS Community Hero and Senior Cloud Architect at Caylent, tackled this challenge head-on in their DEV301 session. Their solution? Building AI agents that operate fully offline at the edge while seamlessly scaling to leverage cloud capabilities when connectivity returns.
The Cloud Dependency Problem
The challenge is straightforward but critical. Modern AI agents require cloud connectivity to function: they need access to large language models for reasoning and decision-making. In a factory setting, when an internet connection is available, agents can analyze sensor data through Amazon Bedrock and everything works perfectly. But the moment that connection drops, the entire system fails.
As Ana explained: "What do we do? So what do agents require to work offline? First of all, agents are powered by large language models so they need to think, they need to reason locally. Agents are powered by LLMs, but they need the ability to take actions, and agents do that through tools. So they also need to be able to use those tools locally, not through internet. They need to keep context between different sessions so they have memory, and then finally, once we have connectivity again to sync that information to and from the cloud."
The Solution: Strands Agents SDK + Ollama
The hybrid architecture combines Strands Agents SDK for orchestration with Ollama for local model inference. When offline, agents use lightweight models running through Ollama and custom local tools. When online, they seamlessly transition to powerful models like Claude on Amazon Bedrock.
Why Ollama?
Ollama is an open source tool that enables running lightweight LLMs on local computers. David demonstrated running 8 billion parameter models like Qwen on his MacBook. The catalog includes models from Google (Gemma), Mistral AI, Meta (Llama), DeepSeek, Phi, and many others - all capable of running in local facilities without internet connectivity.
Why Strands Agents SDK?
As David emphasized: "Strands Agents is an open source Python and - since yesterday also TypeScript - SDK for building agents using just few lines of code. Basically, Strands Agents is a solution for builders, people that actually write code."
Strands isn't vendor-locked. While it defaults to Amazon Bedrock with Claude Sonnet, it supports Ollama, first-party APIs like Anthropic and OpenAI, and other vendors like Groq. This flexibility allows seamless switching between local and cloud models.
Understanding the Agentic Loop
Agents aren't just models - they're sophisticated systems that iterate through a reasoning and action cycle until they achieve their goal. The agentic loop consists of three key phases:
Reasoning: The agent receives user input, processes it with a large language model, and decides whether to call additional tools or respond directly.
Actions: The agent executes tools to gather information or perform tasks, adding context that helps complete the user's request.
Response: Once satisfied with the gathered information, the agent provides the final output to the user.
As David put it: "That's why I say that the agents are loops with superpowers."
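The three phases above can be sketched as a plain-Python loop. This is a minimal illustration only, not the Strands SDK internals: the model callable, tool registry, and stop condition are all stand-ins.

```python
# Minimal sketch of the agentic loop: reason -> act -> respond.
# `model` and the tool registry are stand-ins, not Strands internals.

def run_agent_loop(model, tools: dict, user_input: str, max_steps: int = 5) -> str:
    context = [f"user: {user_input}"]
    for _ in range(max_steps):
        # Reasoning: the model decides on a tool call or a final answer
        decision = model(context)
        if decision["type"] == "respond":
            # Response: goal reached, return the final output
            return decision["text"]
        # Actions: execute the chosen tool and feed the result back as context
        tool_name, args = decision["tool"], decision["args"]
        result = tools[tool_name](**args)
        context.append(f"tool {tool_name}: {result}")
    return "Stopped: step limit reached"


# A toy "model" that calls read_sensor once, then answers
def toy_model(context):
    if not any(line.startswith("tool ") for line in context):
        return {"type": "tool", "tool": "read_sensor", "args": {"device_id": "pump-1"}}
    return {"type": "respond", "text": f"Done after {len(context)} context entries"}
```

Calling `run_agent_loop(toy_model, {"read_sensor": lambda device_id: "42.0 C"}, "Check pump-1")` performs one reasoning pass, one tool call, and a final response, mirroring the loop David described.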
Building Offline Agents: Five Key Capabilities
1. Local Model Integration with Ollama
The demo repository shows how to configure an agent with Ollama using a simple configuration dataclass. The setup specifies the Ollama host (typically localhost:11434), the model ID (like qwen3-nothink:4b), temperature settings, and keep-alive parameters. The ModelRouter class handles initialization and can switch between Ollama and Amazon Bedrock models dynamically.
The key difference from cloud-based agents? Just specifying the Ollama model instead of using Strands' default. Everything else - tools, memory, session management - works identically.
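A router along these lines can be sketched in plain Python. The configuration fields (host, model ID, temperature, keep-alive) follow the session's description; the actual demo returns Strands `OllamaModel`/`BedrockModel` instances, which this sketch only stands in for with plain config data.

```python
from dataclasses import dataclass


@dataclass
class OllamaConfig:
    """Local inference settings, matching the fields described in the session."""
    host: str = "http://localhost:11434"
    model_id: str = "qwen3-nothink:4b"
    temperature: float = 0.2
    keep_alive: str = "10m"


class ModelRouter:
    """Toy router that switches between local and cloud model configs.

    In the demo, get_model() would return a Strands OllamaModel or
    BedrockModel instance; here it returns plain dictionaries."""

    VALID_MODES = {"local", "cloud"}

    def __init__(self, ollama: OllamaConfig, bedrock_model_id: str):
        self.ollama = ollama
        self.bedrock_model_id = bedrock_model_id
        self.mode = "local"  # offline-first default

    def set_mode(self, mode: str) -> bool:
        if mode not in self.VALID_MODES:
            return False
        self.mode = mode
        return True

    def get_model(self) -> dict:
        if self.mode == "local":
            return {"provider": "ollama", "host": self.ollama.host,
                    "model_id": self.ollama.model_id}
        return {"provider": "bedrock", "model_id": self.bedrock_model_id}
```

The offline-first default means the agent always has a working model; the cloud becomes an upgrade path rather than a dependency.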
2. Custom Local Tools
Tools are how agents take action in the physical world. Creating a custom tool is remarkably simple with Strands' @tool decorator pattern:
```python
from datetime import datetime

from strands import tool

# default_registry is the demo's in-memory IoT device registry


@tool
def read_sensor(device_id: str) -> str:
    """Read current sensor values from an IoT device.

    Args:
        device_id: The unique identifier of the sensor device to read
    """
    device = default_registry.get(device_id)
    if device is None:
        available_ids = default_registry.list_device_ids()
        return f"Error: Device '{device_id}' not found. Available devices: {', '.join(available_ids)}"
    reading = device.read()
    timestamp = datetime.now().isoformat()
    return (
        f"Sensor Reading:\n"
        f"  Device ID: {device.device_id}\n"
        f"  Value: {reading} {device.unit}\n"
        f"  Timestamp: {timestamp}"
    )
```
The @tool decorator transforms any Python function into an agent-callable tool. The docstring is critical - it tells the agent what the tool does and what parameters it needs. David demonstrated tools for reading sensors, controlling actuators, and listing available devices. The possibilities are endless: analyzing images, querying databases, creating work orders, or any operation you can code in Python.
3. Structured Outputs for System Integration
Industrial systems often require specific data formats - JSON for REST APIs, XML for SOAP envelopes, or structured data for ERP and MES systems. The demo uses Pydantic models to ensure agents return precisely formatted data.
The SCADA extraction tool demonstrates this perfectly. It parses unstructured SCADA report text and extracts validated production metrics conforming to a ProductionMetrics schema. The tool includes comprehensive validation error handling, providing detailed feedback when extracted data doesn't match the expected format. Instead of verbose text responses, the agent returns structured dictionaries matching the exact fields and types that external systems expect, enabling autonomous integration without human intervention.
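The pattern can be illustrated with a stdlib-only sketch (the demo itself uses Pydantic for schema validation). The field names and report format below are hypothetical examples, not the repository's actual ProductionMetrics schema.

```python
import re
from dataclasses import dataclass


@dataclass
class ProductionMetrics:
    # Hypothetical example fields; the demo's actual schema may differ
    units_produced: int
    defect_rate_pct: float


def extract_metrics(report: str) -> ProductionMetrics:
    """Parse an unstructured report line into a validated, typed record."""
    m = re.search(r"produced\s+(\d+)\s+units.*?defect rate\s+([\d.]+)%", report, re.S)
    if m is None:
        raise ValueError("Report does not match the expected SCADA format")
    metrics = ProductionMetrics(int(m.group(1)), float(m.group(2)))
    # Validation with detailed feedback, as the demo's tool provides
    if not 0.0 <= metrics.defect_rate_pct <= 100.0:
        raise ValueError(f"Defect rate out of range: {metrics.defect_rate_pct}")
    return metrics
```

The downstream system receives typed fields rather than prose, which is what makes the ERP/MES handoff possible without a human reading the agent's answer.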
4. Session Management and Memory
Agents need to remember context across interactions. If a machine showed abnormal temperature readings, the agent should recall that information even after an app restart or power outage. The demo provides a clean implementation using Strands' FileSessionManager:
```python
from pathlib import Path

from strands.session.file_session_manager import FileSessionManager


def create_session_manager(
    session_id: str,
    storage_dir: str = "./edge_sessions",
) -> FileSessionManager:
    """Create a FileSessionManager with automatic directory creation."""
    storage_path = Path(storage_dir)
    storage_path.mkdir(parents=True, exist_ok=True)
    return FileSessionManager(
        session_id=session_id,
        storage_dir=str(storage_path),
    )
```
The session manager automatically creates storage directories if they don't exist and persists conversation history to the local filesystem. Each session is isolated by a unique session ID, ensuring that different conversations don't interfere with each other. When connectivity returns, sessions can sync to cloud storage like Amazon DynamoDB or Amazon S3, providing a complete offline-to-online workflow.
5. Model Context Protocol (MCP) Integration
MCP servers, created by Anthropic, expose tools and data sources that agents can use. Instead of writing custom code to interact with databases or external systems, agents can leverage existing MCP servers.
David demonstrated using the SQLite MCP server for local database operations. The database tools class wraps MCP operations into agent-friendly functions for logging telemetry, querying records with optional filters, and performing aggregation queries. The agent gained instant access to database operations - creating tables, inserting data, running queries, calculating averages and aggregations - all through natural language. When asked to "calculate the average value for each device ID in the device telemetry table," the agent constructed the correct SQL query automatically.
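Under the hood, the aggregation David asked for is ordinary SQL. Here is a stdlib `sqlite3` sketch of the query the agent generated; the table and column names are assumptions based on the session's description.

```python
import sqlite3


def average_by_device(conn: sqlite3.Connection) -> dict:
    """Average telemetry value per device ID, as the agent's generated SQL would."""
    rows = conn.execute(
        "SELECT device_id, AVG(value) FROM device_telemetry GROUP BY device_id"
    ).fetchall()
    return {device_id: avg for device_id, avg in rows}

# In the demo these operations go through the SQLite MCP server rather than
# a direct sqlite3 connection; the SQL itself is the same either way.
```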
Real-World Demo: Edge Operator Agent
Ana and David demonstrated a production-ready edge operator agent managing IoT devices in an industrial facility. The agent architecture brings together the model router, session manager, IoT tools, database tools via MCP, and SCADA extraction capabilities into a unified conversational interface.
The agent could read sensor data (temperature, humidity, pressure), control actuators (opening/closing valves), perform multi-step operations (check humidity AND validate if it's in acceptable range), maintain conversation context across sessions, store telemetry data in local databases, and extract structured data from unstructured SCADA reports.
The most impressive moment? Watching the agent seamlessly switch from local Ollama models to Amazon Bedrock when connectivity returned:
```python
def set_model_mode(self, mode: str) -> tuple[bool, str]:
    """Switch between local and cloud model modes."""
    success = self.model_router.set_mode(mode)
    if not success:
        return False, f"Unknown mode: {mode}"
    if self._agent is not None:
        # Swap the underlying model on the live agent
        self._agent.model = self.model_router.get_model()
    return True, f"Successfully switched to {mode} mode"
```
The switch comes down to a single line of code. The latency difference was noticeable, but the functionality remained identical.
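In practice, the switch can be driven by a connectivity probe. A minimal sketch, with the probe injected so the decision logic stays testable; the endpoint, port, and timeout here are assumptions, not values from the demo.

```python
import socket
from typing import Callable


def cloud_reachable(host: str = "bedrock-runtime.us-east-1.amazonaws.com",
                    port: int = 443, timeout: float = 2.0) -> bool:
    """Best-effort TCP probe; any failure means 'treat as offline'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_mode(probe: Callable[[], bool]) -> str:
    """Pick 'cloud' when the probe succeeds, otherwise fall back to 'local'."""
    return "cloud" if probe() else "local"
```

Injecting the probe (`choose_mode(cloud_reachable)` in production, a lambda in tests) keeps the offline-first fallback easy to verify without real network access.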
The Kiro Advantage
Throughout the demo, David credited Kiro - AWS' new agentic IDE - for accelerating development. By connecting the Strands Agents MCP server to Kiro, they gained access to up-to-date documentation and code examples even though Strands is a relatively new library.
As David noted: "Without the Strands Agents MCP server, Kiro would be unable to accomplish the task." The MCP server provided real-time documentation access, ensuring accurate code generation for the latest Strands features.
He also demonstrated using steering docs - instructions that guide Kiro's behavior - and agent hooks that automatically create git commits after each task completion. These features transformed Kiro from a coding assistant into a true development partner.
Key Takeaways
Start with the hybrid mindset: Design agents to work offline first, then add cloud capabilities as an enhancement rather than a requirement.
Leverage existing tools: Strands provides HTTP requests, system operations, file management, and human-in-the-loop capabilities out of the box. Check the documentation before building custom tools.
Use MCP servers: Don't reinvent the wheel. If an MCP server exists for your database or external system, use it instead of writing custom integration code.
Implement human-in-the-loop: For critical operations like deleting data or controlling industrial equipment, always include approval workflows. Agents should augment human decision-making, not replace it.
Session management is essential: Industrial environments experience power outages and connectivity issues. Persistent memory ensures agents can recover gracefully.
Model selection matters: Lightweight models like DeepSeek 8B work remarkably well for specific industrial tasks. You don't always need the largest models.
DevOps principles apply: As David noted, "AI agents are software - DevOps principles apply here too." Use version control, testing, and monitoring just as you would for any production system.
David's closing advice captured the session's essence: "Don't wait for your cloud to think, but use the cloud to think bigger."
Resources
- GitHub Repository: All code examples, Jupyter notebooks, and the full Edge Operator Agent are available at https://github.com/davidvictoria/reinvent-dev301-edge-agents
- Strands Agents SDK: Open source documentation and resources at https://strandsagents.com/
- Ollama: Download and explore the model catalog at ollama.ai
About This Series
This post is part of DEV Track Spotlight, a series highlighting the incredible sessions from the AWS re:Invent 2025 Developer Community (DEV) track.
The DEV track featured 60 unique sessions delivered by 93 speakers from the AWS Community - including AWS Heroes, AWS Community Builders, and AWS User Group Leaders - alongside speakers from AWS and Amazon. These sessions covered cutting-edge topics including:
- π€ GenAI & Agentic AI - Multi-agent systems, Strands Agents SDK, Amazon Bedrock
- π οΈ Developer Tools - Kiro, Kiro CLI, Amazon Q Developer, AI-driven development
- π Security - AI agent security, container security, automated remediation
- ποΈ Infrastructure - Serverless, containers, edge computing, observability
- β‘ Modernization - Legacy app transformation, CI/CD, feature flags
- π Data - Amazon Aurora DSQL, real-time processing, vector databases
Each post in this series dives deep into one session, sharing key insights, practical takeaways, and links to the full recordings. Whether you attended re:Invent or are catching up remotely, these sessions represent the best of our developer community sharing real code, real demos, and real learnings.
Follow along as we spotlight these amazing sessions and celebrate the speakers who made the DEV track what it was!