Morgan Willis for AWS


What is an Agent Harness? A Hands-On Guide With AgentCore harness

To have an AI agent, you need to build a harness first. But what exactly is an agent harness?

The AI world doesn't seem to agree on a single definition for this, but this is how I've been thinking about it: an agent harness is the system that wraps around a model to turn it into an agent. This includes the orchestration loop, tool connections, memory, compute, observability, and anything else it needs. It's everything between "I have a model" and "I have a running agent."

That sounds like a lot, and it can be. Teams spend days writing harness code for every agent they build with frameworks like the Strands Agents SDK. But the tooling has caught up and made this much easier. A service like the managed harness in Amazon Bedrock AgentCore lets you define your agent as configuration and builds the harness for you, taking you from idea to running agent with custom tools and capabilities in minutes.

In this post, I'll break down what agent harnesses are, what AgentCore harness is, and how I built an AI trends analyst without writing any code myself.

What Is an Agent Harness?

Every agentic product you've used has a harness powering it. Claude Code, Kiro, and Codex are all coding assistants that take a model and wrap a custom harness around it. The harness is what gives the model a filesystem, tool access, a code sandbox, memory across sessions, context management, and the ability to reason in a loop. Without it, the model just generates text. With it, the model writes code, runs tests, browses the web, and completes multi-step tasks. It turns out, the harness has been there all along. We're just now giving it a formal name.

The simplest way I've seen it put is Agent = Model + Harness. In other words, any part of your agent that isn't the model is part of the harness.

In practice, building a harness means picking an agent framework (Strands Agents, LangGraph, LlamaIndex, CrewAI) and writing code that configures the model, defines a system prompt, wires up tools, sets up memory, performs context engineering, adds observability, and handles authentication. That's the job of an AI engineer. The final program might be 50 lines or 5,000, depending on your use case.
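To make that wiring concrete, here's a framework-agnostic sketch of the pieces an AI engineer typically sets up by hand. Every name in it is invented for illustration; this is not the API of Strands Agents or any of the other frameworks above:

```python
# Illustrative only: these names are invented for the sketch, not a real SDK.
from dataclasses import dataclass, field

def search_posts(query: str) -> str:
    """A stand-in tool the agent could call."""
    return f"top posts about {query!r}"

@dataclass
class HarnessConfig:
    model_id: str                               # which model to call
    system_prompt: str                          # defines agent behavior
    tools: dict = field(default_factory=dict)   # tool name -> callable
    max_turns: int = 10                         # safety limit on the loop

config = HarnessConfig(
    model_id="example-model",
    system_prompt="You are a tech trend analyst.",
    tools={"search_posts": search_posts},
)

# Memory: the message history the harness maintains across turns.
memory = [{"role": "system", "content": config.system_prompt}]
```

The real version also needs the orchestration loop that feeds tool results back to the model, plus auth and observability hooks, which is where the line count climbs.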

But beyond the agent code itself, you also need infrastructure underneath the agent for all of this to work: compute to host it, a sandbox so the agent can execute code safely, session management so users can come back tomorrow, auth so the agent is secured and can call external APIs, and observability so you know what happened when things break at 2 AM. So you package up your agent code, deploy it to an agent hosting environment, and keep it running. Only then do you have a running agent.

Those are the two layers of work: the agent code and the infrastructure to run it. For many use cases that don't require a complex harness setup, the pieces are standard. Many teams are building agent platforms, or agent factories, where you provide basic configuration for your specific agent and it's built out for you without writing the code yourself. That's exactly what AgentCore harness does.

AgentCore harness: An Agent Factory

Think of it like an agent factory where you declare config (model, prompt, tools, memory settings) and AgentCore harness compiles it into a running agent.

Under the hood, AgentCore harness takes your config and assembles a fully wired Strands Agents agent: the orchestration loop, tool execution, memory management, context handling, streaming, and error recovery are all handled. Then it runs that agent in an isolated microVM with its own CPU, memory, filesystem, and shell, without you provisioning anything.

It tackles both layers of work: the agent code and the infrastructure to host and run it.

Let's Get Hands-On: The AI Trends Analyst Agent

I built an agent with AgentCore harness that browses HackerNews and dev.to and tells me what the AI community is talking about. It pulls the top posts with a built-in browser tool, then uses a built-in code interpreter tool to analyze frequency and sentiment and produce a ranked summary of trending topics with a chart. I didn't write any of the agent code myself.

I used the agentcore CLI, which has a really nice interactive agentcore create command for providing your agent harness configuration. It walks you through all of the options and, at the end, creates a configuration file for you.

  1. Install the agentcore CLI:
npm install -g @aws/agentcore@preview
  2. Run agentcore create and follow the prompts.
  3. Inspect the config file it produces. This was mine:
{
  "name": "TrendsAgentHarness",
  "model": {
    "provider": "bedrock",
    "modelId": "global.anthropic.claude-sonnet-4-6"
  },
  "tools": [
    {
      "type": "agentcore_browser",
      "name": "browser"
    },
    {
      "type": "agentcore_code_interpreter",
      "name": "code-interpreter"
    }
  ],
  "skills": [],
  "authorizerType": "AWS_IAM"
}
  4. Modify the system-prompt.md file to define agent behavior. This was my trends agent's system prompt:
You are a tech trend analyst. 
Every time you are invoked, browse HackerNews and dev.to for today's top posts related to AI and developer tools. 
Use the code interpreter to analyze the results: count topic frequency, identify clusters, and produce a ranked list of the top 5 trending topics with a brief summary of each. 
Include a bar chart of topic frequency. Be concise.

Now you have the entire agent definition.

  5. To deploy it, run:
agentcore deploy
  6. To invoke it, run:
agentcore invoke --harness TrendsAgentHarness \
  --session-id "$(uuidgen)" \
  "What's trending in AI today?"

When the agent ran, it opened a browser, navigated to HackerNews, scrolled through posts, did the same on dev.to, pulled the results into the code interpreter, ran Python to cluster topics and build a chart, and streamed the final summary back to my terminal, which I then saved into a markdown file to read. All of it ran in an isolated microVM that spun up just for this session.

This is what the report it generated looks like:

[Report screenshot]

The harness supports Bedrock, OpenAI, and Google Gemini models. You can switch providers between turns of the same session and the conversation continues with context intact.

When I first got my hands on this, I started looking for the catch. I've spent enough time wiring agent orchestration code to be suspicious when someone says "just provide a config." But the JSON above and two CLI commands are genuinely the whole workflow. The browser tool does eat a lot of tokens, but the core loop worked on the first try.

Plug In Your Tools

AgentCore harness comes with a browser and code interpreter built in, powered by AgentCore Browser and AgentCore Code Interpreter. That is what I used for my trends analyst, but you can connect whatever tools your agent needs:

  • MCP servers — connect to any MCP-compatible tool server
  • AgentCore Gateway — connect to APIs you've registered with AgentCore
  • Custom tools — define inline function tools the agent can call

Adding a new tool or MCP server is a config change.
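For example, attaching an MCP server might look like one more entry in the tools array of the config shown earlier. The "mcp" type and "serverUrl" field below are illustrative guesses, not the documented schema; check the AgentCore harness docs for the real field names:

```json
{
  "tools": [
    { "type": "agentcore_browser", "name": "browser" },
    {
      "type": "mcp",
      "name": "internal-tools",
      "serverUrl": "https://mcp.example.internal/sse"
    }
  ]
}
```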

Bring Your Own Skills

You can also include skills for your agent. An agent skill is a bundle of markdown instructions and scripts that gives the agent domain knowledge on demand: how to work with Excel files, use a specific API, or follow a particular workflow. You package the skill into your agent's environment, point the harness at it, and the agent can use it.

The agent picks up the skill's instructions and follows them. This gives you the ability to provide domain expertise to custom agents without writing all the orchestration logic yourself. Just provide the skill.
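As a rough illustration, a skill could be a folder containing a markdown file plus the scripts it references. The layout and file names below are hypothetical, not the official skill format:

```
excel-reports/
├── SKILL.md          # instructions the agent reads on demand
└── build_report.py   # helper script the instructions point to
```

Here `SKILL.md` might read: "When asked for a spreadsheet report, gather the data as CSV, run `python build_report.py data.csv out.xlsx`, and return the path of the generated file."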

What You Get Out of the Box

| Capability | What It Means |
| --- | --- |
| Isolated microVM per session | Dedicated CPU, memory, and filesystem. No cross-session data leakage. |
| Shell access | Run commands directly on the VM without model reasoning or token cost. |
| Persistent filesystem | Suspend mid-task and resume exactly where you left off within a single session's lifetime. |
| Model-agnostic | Bedrock, OpenAI, Gemini. Switch mid-session without losing context. |
| Tool connectivity | MCP servers, AgentCore Gateway, browser, code interpreter, custom tools. |
| Skills | Package domain knowledge as markdown + scripts. |
| Custom environments | Bring your own container image with your dependencies and runtimes. |
| Observability | Every action auto-traced via AgentCore Observability. |

How Does This Actually Work? Strands Agents Under the Hood

When you submit a config to AgentCore harness, something has to actually build the agent and run the loop. That something is Strands Agents, the open-source agent harness SDK from AWS.

Strands handles the agent loop: take in a user message, reason about it, decide which tool to call, execute the tool, feed the result back to the model, repeat until the task is done, and then stream back the response. It's a model-driven approach where the model makes the decisions and Strands provides the runtime that carries them out.
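That loop can be sketched in a few lines of plain Python. `call_model` below is a stand-in for a real model call and the message protocol is made up; the point is the shape of the reason/act cycle, not the Strands API:

```python
# A toy version of the agent loop: reason, maybe call a tool, feed the
# result back, repeat. `call_model` fakes an LLM; the dict protocol is invented.

def call_model(messages):
    # Pretend the model requests a tool once, then produces a final answer.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "add", "args": {"a": 2, "b": 3}}
    return {"answer": f"The result is {messages[-1]['content']}"}

TOOLS = {"add": lambda a, b: a + b}

def run_agent(user_message, max_turns=5):
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):                  # bounded so it can't spin forever
        decision = call_model(messages)
        if "answer" in decision:                # model decided the task is done
            return decision["answer"]
        result = TOOLS[decision["tool"]](**decision["args"])  # execute the tool
        messages.append({"role": "tool", "content": result})  # feed result back
    raise RuntimeError("hit max_turns without an answer")
```

Calling `run_agent("what is 2 + 3?")` returns `"The result is 5"`: the first model call requests the tool, the second sees the tool result in the history and answers.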

AgentCore harness takes your config and assembles a Strands agent from the components Strands offers. Your declarations become a Strands agent program that would have taken you hours or days to write by hand.

But one of my favorite parts of AgentCore harness is that you have an escape hatch if you need it. When you outgrow configuration and need custom orchestration logic, specialized routing, or multi-agent coordination, you can export the harness to Strands code and keep running on the same platform.

If you want to learn more about Strands, the SDK docs and the blog post on the model-driven approach are good starting points.

Frequently Asked Questions

Do I need a harness to use Claude Code or ChatGPT?
No. If you're calling a model with a prompt and getting a response, you're already using a harness someone else built (the consumer-facing application). You need to build your own when you're creating custom agents that call tools, connect to internal MCP servers, maintain state across turns, execute code, or run autonomously over multiple steps.

Is an agent harness the same as an agent framework?
No. A framework gives you building blocks (tool interfaces, model connectors, loop patterns). A harness is the assembled, running system: framework code plus compute, sandboxing, memory, and observability. You use a framework to build a harness, or you let a managed harness build one for you.

Can I build a harness without a framework?
You can, but you'd be writing the orchestration loop, tool dispatch, error handling, and context management from scratch. Frameworks like Strands Agents exist specifically so you don't have to.

What happens when I outgrow the features AgentCore harness provides?
You export to code and customize from there. AgentCore lets you export your harness config to Strands Agents code, and then you deploy back to the same platform.

Go Build Something

As of this post's publication, AgentCore harness is in public preview in four regions: US West (Oregon), US East (N. Virginia), Europe (Frankfurt), and Asia Pacific (Sydney).


My trends analyst agent took about 5 minutes from idea to first invocation. I'm already thinking about what to build next. Your turn to get started.

Top comments (2)

Esin Saribudak

I need to run this trends agent for myself. Thanks for sharing, Morgan!

Morgan Willis AWS

do it! it's super easy!