Udit

Posted on May 26

Building an Agent Harness from Scratch: The Loop and the Tools

#agents #ai #llm #tutorial

How I built a secure coding agent for the terminal, and why the best way to understand agents is to build a small harness yourself.

LLMs are powerful, but by themselves they are not really “agents.”

A model can generate text. It can explain code. It can suggest commands. It can describe how to fix a bug.

But it cannot actually inspect your project, read files, edit code, run tests, or iterate on the result unless something around it gives it those abilities.

That “something” is what I call an Agent Harness.

An Agent Harness is the runtime layer that connects an LLM to:

user input
conversation history
tools
tool results
safety rules
approvals
model/provider configuration
session state

In simple words:

An Agent Harness is the system that turns a language model into an acting system.

I recently built my own terminal coding agent, uai-agent, and the biggest thing I learned is this:

Building a small harness from scratch is the best way to understand agents.

Not reading papers.

Not only using LangChain or existing frameworks.

Not only prompting ChatGPT.

Actually building the loop yourself teaches you what agents really are.

In this article, I want to break down the two most important parts of an Agent Harness:

The Loop
The Tools

These two pieces are the heart of the system.

What Is an Agent Harness?

A basic LLM app usually looks like this:

User input
→ LLM
→ Response

That is just chat.

An agent harness looks more like this:

User input
→ LLM
→ Tool call?
→ Execute tool
→ Send tool result back to LLM
→ LLM continues
→ Repeat until final answer

This repeated cycle is what gives the agent the ability to act.

In my project, the harness runs inside the terminal. The goal is to let an AI coding assistant work inside the current workspace while still giving the developer control.

The agent can:

read files
write files
edit files
run safe shell commands
use workspace context
switch models/providers
save and load sessions
ask for approval before risky actions

The project structure looks like this:

uai-agent/
├── index.js              # Main CLI and agent loop
├── config.js             # Provider, model, and approval configuration
├── config/
│   ├── SYSTEM.md         # Agent behavior instructions
│   └── tools.js          # Tool schemas exposed to the model
├── tools/
│   ├── bash.js           # Controlled shell execution
│   ├── fsOps.js          # File read/write/edit operations
│   └── toolCall.js       # Tool dispatching
├── utils/
│   ├── approval.js       # Approval workflow
│   ├── pathSecurity.js   # Workspace restrictions
│   ├── userAppend.js     # Context tag processing
│   └── commands.js       # Slash command handlers
└── test/

The important idea is that the model is not directly touching my computer.

The model only produces structured tool requests.

The harness decides:

whether the tool exists
whether the arguments are valid
whether the operation is safe
whether the user must approve it
how to execute it
how to send the result back to the model

That separation is the core of a good agent harness.

Part 1: The Loop

The loop is the heart of the agent.

Without a loop, the model can only answer once.

With a loop, the model can:

receive a task
inspect the project
call a tool
observe the output
make another decision
continue until the task is done

A simplified version of the loop looks like this:

while (true) {
  const userInput = await askUser()

  messages.push({
    role: "user",
    content: userInput
  })

  // Inner loop: keep calling the model until it stops requesting tools
  while (true) {
    const response = await model.chat({
      messages,
      tools
    })

    if (!response.hasToolCalls) {
      messages.push({
        role: "assistant",
        content: response.content
      })

      print(response.content)
      break
    }

    // Record the assistant's tool request, then the matching tool results
    messages.push(response.assistantMessage)

    const toolResults = await executeTools(response.toolCalls)

    messages.push(...toolResults)
  }
}

That is the basic shape of an agent.

The important thing is not just calling the model.

The important thing is maintaining the conversation state.

In my actual harness, the conversation is stored in a message array:

const msgArray = [
  {
    role: "system",
    content: systemPrompt
  }
]

Every user message, assistant response, tool call, and tool result gets added to this array.

That means the model can see the history of what happened.

For example:

System: You are a coding assistant.
User: Find why the tests are failing.
Assistant: I need to inspect package.json.
Assistant tool_call: read package.json
Tool: contents of package.json
Assistant: I should run npm test.
Assistant tool_call: bash npm test
Tool: test output
Assistant: The failure is in fsOps.test.js...

This is what makes the agent feel continuous.

The LLM is stateless by itself.

The harness gives it memory through the message array.
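To make the role of the message array concrete, here is a minimal runnable sketch of a single user turn. The names `makeFakeModel`, `runTurn`, and the `maxSteps` cap are hypothetical and not taken from uai-agent; the scripted stub stands in for a real provider call, and the message shapes assume an OpenAI-compatible chat format.

```javascript
// Hypothetical stub: returns scripted replies instead of calling a real API.
function makeFakeModel(replies) {
  let index = 0
  return {
    async chat() {
      return replies[index++]
    }
  }
}

// Runs one user turn: call the model until it stops requesting tools.
// maxSteps is an assumed safety cap so a confused model cannot loop forever.
async function runTurn({ model, messages, executeTool, userInput, maxSteps = 10 }) {
  messages.push({ role: "user", content: userInput })

  for (let step = 0; step < maxSteps; step++) {
    const reply = await model.chat({ messages })

    if (!reply.toolCalls || reply.toolCalls.length === 0) {
      messages.push({ role: "assistant", content: reply.content })
      return reply.content
    }

    // Record the assistant's request first, then one tool message per call.
    messages.push({ role: "assistant", content: reply.content ?? null, tool_calls: reply.toolCalls })
    for (const call of reply.toolCalls) {
      const output = await executeTool(call)
      messages.push({ role: "tool", tool_call_id: call.id, content: output })
    }
  }

  return "Stopped: step limit reached."
}

// Usage: the model first asks for a tool, then answers.
const demoMessages = [{ role: "system", content: "You are a coding assistant." }]
const demoModel = makeFakeModel([
  { content: null, toolCalls: [{ id: "call_1", name: "read", args: { filePath: "./package.json" } }] },
  { content: "The project is named demo." }
])

runTurn({
  model: demoModel,
  messages: demoMessages,
  executeTool: async (call) => `stub output for ${call.name}`,
  userInput: "What is this project called?"
}).then((answer) => {
  console.log(answer)
  console.log(demoMessages.map((message) => message.role).join(" -> "))
})
```

The second log line shows the order in which roles accumulate in the array, which is the history the model sees on its next call.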

The Real Agent Loop in `uai-agent`

In index.js, the main loop is basically:

(async () => {
  while (true) {
    await main()
  }
})()

Inside main(), the harness does several things:

initialize the model client
ask the user for input
handle slash commands like /clear, /model, /save
attach extra context from tags like @./file.js or @workspace
send messages and tools to the model
stream the assistant response
collect tool calls if the model requests them
ask for approval if needed
execute the tools
send tool results back into the conversation
save the session

The conceptual flow is:

User
 ↓
Add context
 ↓
Send messages + tools to model
 ↓
Stream assistant response
 ↓
Did model request tools?
 ↓
Yes → approve → execute → append tool result → loop again
No  → print final answer → wait for next user

This is the most important mental model for building agents.

The loop is not magic.

It is just controlled repetition.

Why the Loop Matters

A single LLM response is not enough for coding work.

Imagine asking:

Fix the failing tests.

The model needs to do multiple steps:

1. Inspect the project.
2. Read package.json.
3. Run the test command.
4. Read the failing test file.
5. Read the source file.
6. Edit the source file.
7. Run tests again.
8. Report the fix.

That cannot be done in one normal chat completion unless the harness lets the model call tools and observe results.

The loop is what creates this behavior:

Think → Act → Observe → Think → Act → Observe

Or in coding-agent terms:

Prompt → Tool call → Tool result → Next prompt

This is the agent pattern.

Part 2: Tools

If the loop is the heart of the agent, tools are the hands.

Tools define what the agent can actually do.

In my harness, I started with four practical coding tools:

read
bash
write
edit

These are simple, but they are enough to build a useful coding assistant.

Tool 1: `read`

The read tool lets the agent inspect a file.

{
  name: "read",
  description: "Read the contents of a file.",
  parameters: {
    filePath: "string"
  }
}

The model might request:

{
  "filePath": "./package.json"
}

The harness then reads the file and sends the content back as a tool result.

This is important because the model should not guess what is inside your codebase.

A coding agent should inspect before editing.

Tool 2: `bash`

The bash tool lets the agent run terminal commands.

Example:

{
  "command": "npm test"
}

But this is also the most dangerous tool.

A shell command can do real damage if you are careless.

So in my harness, bash is heavily restricted.

The code uses an allowlist of commands such as:

const ALLOWED_COMMANDS = new Set([
  "ls",
  "dir",
  "pwd",
  "echo",
  "cat",
  "head",
  "tail",
  "find",
  "grep",
  "wc",
  "git",
  "npm",
  "node",
  "true",
  "false",
  "seq"
])

It also blocks dangerous patterns like:

const BLOCKED_COMMANDS = [
  "rm -rf /",
  "mkfs",
  "dd if=",
  "shutdown",
  "reboot",
  "sudo",
  "cat /etc/shadow"
]

The harness does not simply trust the model.

It validates the command first.

That is a very important principle:

The model can request an action, but the harness decides whether the action is allowed.
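As a rough illustration of how the two lists might be combined, the sketch below checks the blocklist first and the allowlist second. `checkCommand` is a hypothetical helper, not the actual function in tools/bash.js, and the lists are shortened copies of the ones above so the snippet runs on its own.

```javascript
// Shortened copies of the lists above, repeated so the sketch is self-contained.
const ALLOWED_COMMANDS = new Set(["ls", "cat", "grep", "git", "npm", "node"])
const BLOCKED_COMMANDS = ["rm -rf /", "mkfs", "dd if=", "shutdown", "reboot", "sudo", "cat /etc/shadow"]

// Hypothetical helper: returns { ok, reason } instead of throwing,
// so the reason can be sent back to the model as a tool result.
function checkCommand(command) {
  const trimmed = command.trim()
  if (trimmed === "") {
    return { ok: false, reason: "Empty command" }
  }

  // Blocklist: reject if any dangerous pattern appears anywhere in the string.
  const blocked = BLOCKED_COMMANDS.find((pattern) => trimmed.includes(pattern))
  if (blocked) {
    return { ok: false, reason: `Blocked pattern: ${blocked}` }
  }

  // Allowlist: the program name (first whitespace-separated token) must be known.
  const program = trimmed.split(/\s+/)[0]
  if (!ALLOWED_COMMANDS.has(program)) {
    return { ok: false, reason: `Command not allowed: ${program}` }
  }

  return { ok: true }
}

console.log(checkCommand("npm test"))
console.log(checkCommand("curl https://example.com"))
console.log(checkCommand("cat /etc/shadow"))
```

Returning a reason string rather than throwing lets the model see why a request was refused and try a different approach.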

Tool 3: `write`

The write tool lets the agent create or overwrite a file.

{
  name: "write",
  parameters: {
    filePath: "string",
    content: "string"
  }
}

This is useful for generating new files, tests, docs, or config files.

But again, the harness checks the path before writing.

The model should not be able to write anywhere on your machine.

It should only operate inside the current workspace.

Tool 4: `edit`

The edit tool replaces exact text in a file.

{
  name: "edit",
  parameters: {
    filePath: "string",
    oldContent: "string",
    newContent: "string"
  }
}

This is safer than asking the model to rewrite entire files every time.

The model must provide the exact old content and the replacement.

A simplified implementation looks like this:

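import fs from "node:fs"

function editFile(filePath, oldContent, newContent) {
  const data = fs.readFileSync(filePath, "utf-8")

  if (!data.includes(oldContent)) {
    return "Error: oldContent not found. No changes made."
  }

  // Replaces the first occurrence only. The replacer function keeps
  // "$&"-style patterns in newContent from being expanded.
  fs.writeFileSync(
    filePath,
    data.replace(oldContent, () => newContent),
    "utf-8"
  )

  return "File edited successfully."
}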

This gives the agent precision.

It also reduces accidental edits.
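One possible refinement, shown here only as a sketch, is to refuse edits where the old text appears more than once, since the model may have meant a different occurrence. `applyEdit` is a hypothetical pure helper that works on strings rather than files; whether uai-agent performs this uniqueness check is an assumption, not something the simplified code above shows.

```javascript
// Hypothetical pure helper: works on strings so it can be tested without touching disk.
function applyEdit(data, oldContent, newContent) {
  if (oldContent === "") {
    return { ok: false, error: "oldContent must not be empty." }
  }

  // Count non-overlapping occurrences of the exact old text.
  const occurrences = data.split(oldContent).length - 1

  if (occurrences === 0) {
    return { ok: false, error: "oldContent not found. No changes made." }
  }
  if (occurrences > 1) {
    return { ok: false, error: `oldContent matches ${occurrences} places. Add more surrounding context.` }
  }

  // A replacer function avoids special handling of "$" sequences in newContent.
  return { ok: true, data: data.replace(oldContent, () => newContent) }
}

const source = "const a = 1\nconst b = 1\n"
console.log(applyEdit(source, "= 1", "= 2"))
console.log(applyEdit(source, "const b = 1", "const b = 2"))
```

The first call is rejected as ambiguous, while the second succeeds because the old text identifies exactly one location.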

Tool Schemas: How the Model Knows What It Can Do

The model does not magically know your tools.

You have to describe them.

In config/tools.js, each tool is defined as a function schema.

For example, the read tool:

{
  type: "function",
  function: {
    name: "read",
    description: "Read the contents of a file.",
    strict: true,
    parameters: {
      type: "object",
      properties: {
        filePath: {
          type: "string",
          description: "The path to the file to read."
        }
      },
      required: ["filePath"],
      additionalProperties: false
    }
  }
}

This schema does three things:

tells the model the tool exists
explains when to use it
defines the exact argument shape

The important part is:

strict: true

and:

additionalProperties: false

This pushes the model to produce clean structured arguments.

The model should not call:

{
  "path": "./index.js",
  "random": true
}

It should call:

{
  "filePath": "./index.js"
}

Tool schemas are the contract between the model and the harness.
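Since not every OpenAI-compatible provider necessarily enforces strict schemas, it can be worth having the harness check arguments against the same contract before running a tool. The following is a minimal sketch under that assumption; `validateArgs` is a hypothetical helper that only understands flat objects with primitive property types, not a full JSON Schema validator.

```javascript
// The read tool's parameter schema from above.
const readParameters = {
  type: "object",
  properties: {
    filePath: { type: "string", description: "The path to the file to read." }
  },
  required: ["filePath"],
  additionalProperties: false
}

// Hypothetical minimal validator: handles only flat objects whose property
// types are "string", "number", or "boolean".
function validateArgs(schema, args) {
  if (typeof args !== "object" || args === null || Array.isArray(args)) {
    return ["Arguments must be an object"]
  }

  const errors = []

  for (const name of schema.required ?? []) {
    if (!(name in args)) errors.push(`Missing required argument: ${name}`)
  }

  for (const [name, value] of Object.entries(args)) {
    const property = schema.properties[name]
    if (!property) {
      if (schema.additionalProperties === false) errors.push(`Unexpected argument: ${name}`)
      continue
    }
    if (typeof value !== property.type) {
      errors.push(`Argument ${name} must be of type ${property.type}`)
    }
  }

  return errors
}

console.log(validateArgs(readParameters, { path: "./index.js", random: true }))
console.log(validateArgs(readParameters, { filePath: "./index.js" }))
```

An empty array means the arguments match the contract; any error strings can be returned to the model as the tool result so it can correct the call.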

Tool Dispatching: Turning Model Requests into Real Actions

Once the model emits a tool call, the harness has to execute it.

In my project, this happens in tools/toolCall.js.

The idea is to keep a map of tool names to handlers:

const toolHandlers = new Map()

toolHandlers.set("read", async (input) => {
  return readFile(input.filePath)
})

toolHandlers.set("write", async (input) => {
  return writeFile(input.filePath, input.content)
})

toolHandlers.set("edit", async (input) => {
  return editFile(
    input.filePath,
    input.oldContent,
    input.newContent
  )
})

toolHandlers.set("bash", async (input) => {
  return bash(input.command)
})

Then, when the model asks for a tool, the harness does:

const handler = toolHandlers.get(toolName)

if (!handler) {
  return {
    role: "tool",
    tool_call_id: toolCallId,
    content: `Unknown tool: ${toolName}`
  }
}

const output = await handler(input)

return {
  role: "tool",
  tool_call_id: toolCallId,
  content: output
}

This is a clean design because adding a new tool becomes simple:

implement the tool
register the handler
add the schema
write tests
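A slightly fuller sketch of the dispatcher is shown here, assuming the OpenAI-compatible tool call shape in which `function.arguments` arrives as a JSON string. `dispatchToolCall` and the stub handler are hypothetical; the point is that every path, including malformed JSON and handler failures, still produces a tool message tied to the original call id.

```javascript
// Stub handler standing in for the real read implementation.
const toolHandlers = new Map([
  ["read", async (input) => `contents of ${input.filePath}`]
])

// Takes one tool call shaped like { id, function: { name, arguments } }
// and always resolves to a tool message, never throws.
async function dispatchToolCall(toolCall) {
  const { id, function: fn } = toolCall
  const toolMessage = (content) => ({ role: "tool", tool_call_id: id, content })

  const handler = toolHandlers.get(fn.name)
  if (!handler) {
    return toolMessage(`Unknown tool: ${fn.name}`)
  }

  // Arguments arrive as a JSON string and may be malformed.
  let input
  try {
    input = JSON.parse(fn.arguments)
  } catch {
    return toolMessage("Error: tool arguments were not valid JSON.")
  }

  try {
    return toolMessage(String(await handler(input)))
  } catch (error) {
    // Report failures to the model instead of crashing the loop.
    return toolMessage(`Error: ${error.message}`)
  }
}

dispatchToolCall({
  id: "call_1",
  function: { name: "read", arguments: '{"filePath":"./index.js"}' }
}).then((message) => console.log(message))
```

Keeping errors inside the tool message means the conversation history stays valid: each tool call the assistant made has a matching result, even when the tool failed.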

The Most Important Safety Rule: The Model Requests, the Harness Decides

One mistake people make when building their first agent is giving the model too much direct power.

That is dangerous.

A model can hallucinate.

A model can misunderstand.

A model can produce risky commands.

A model can accidentally leak sensitive information.

So the harness must be the authority.

There are several safety layers.

1. Workspace-Scoped File Access

The agent should only operate inside the current project.

In utils/pathSecurity.js, paths are resolved against the current working directory.

The idea is:

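import fs from "node:fs"
import path from "node:path"

function resolveWorkspacePath(filePath) {
  const root = fs.realpathSync(process.cwd())
  // realpathSync follows symlinks; it throws if the path does not exist yet
  const candidate = fs.realpathSync(path.resolve(root, filePath))
  const relative = path.relative(root, candidate)

  // A plain startsWith(root) check would also accept sibling
  // directories such as "<root>-backup", so compare relative paths instead
  const outside =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative)

  if (outside) {
    return {
      ok: false,
      reason: "Path outside workspace is not allowed"
    }
  }

  return {
    ok: true,
    realPath: candidate
  }
}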

This prevents the model from reading or writing files like:

/etc/passwd
~/.ssh/id_rsa
../some-other-project/.env

This is critical.

If you are building a coding agent, do not skip workspace restrictions.

2. `.gitignore` Protection

My harness can also treat files matched by .gitignore as unsafe.

Why?

Because .gitignore often lists entries like:

.env
node_modules
dist
secrets.json
coverage

If a file is ignored, there is a good chance it should not be casually sent to an LLM.

So the harness checks .gitignore patterns and can require approval or deny access.

That is a practical safety feature many simple agents miss.
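To give a feel for what such a check involves, here is a deliberately simplified matcher. It is a sketch, not full gitignore semantics: negation with "!" and "**" are not handled, and `isIgnored` and `segmentRegex` are hypothetical names rather than functions from uai-agent. A real harness might prefer a dedicated library or `git check-ignore`.

```javascript
// Converts one simple pattern (with optional "*" wildcards) to a RegExp
// that must match a whole path segment.
function segmentRegex(pattern) {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, "[^/]*")
  return new RegExp(`^${escaped}$`)
}

// Simplified check, NOT full gitignore semantics: no negation ("!") and no "**".
// relativePath uses "/" separators and is relative to the workspace root.
function isIgnored(relativePath, gitignoreText) {
  const segments = relativePath.split("/").filter(Boolean)
  const patterns = gitignoreText
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "" && !line.startsWith("#") && !line.startsWith("!"))

  return patterns.some((raw) => {
    const trimmed = raw.replace(/\/$/, "")   // drop the trailing "/" of directory patterns
    const anchored = trimmed.includes("/")   // a leading or inner "/" anchors to the root
    const parts = trimmed.replace(/^\//, "").split("/")

    if (anchored) {
      // Anchored pattern: compare against the leading segments of the path.
      return parts.every((part, i) => segments[i] !== undefined && segmentRegex(part).test(segments[i]))
    }

    // Bare name: matches any segment, so "node_modules" covers files inside it.
    const regex = segmentRegex(parts[0])
    return segments.some((segment) => regex.test(segment))
  })
}

const gitignore = ".env\nnode_modules\ndist\n# build logs\n*.log\n/build/out\n"
console.log(isIgnored(".env", gitignore))
console.log(isIgnored("node_modules/lodash/index.js", gitignore))
console.log(isIgnored("src/index.js", gitignore))
```

When the function returns true, the harness can fall back to asking for approval or denying the read, as described above.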

3. Bash Validation

The bash tool validates commands before execution.

It blocks:

shell operators
redirection
command substitution
absolute paths
parent traversal
dangerous git operations
unsafe npm commands
destructive commands

For example, the harness rejects commands with shell metacharacters:

const SHELL_METACHARS = /[;&|`$<>\n\r]/

That means the model cannot do:

cat package.json && rm -rf .

It also executes commands using execFile, not a shell:

execFileAsync(file, args, {
  cwd: process.cwd(),
  shell: false
})

This is much safer than passing arbitrary strings to a shell.
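Because execFile takes a program and an argument array rather than one string, the command has to be split before it runs. The sketch below shows one way that step might look, rejecting metacharacters, absolute paths, and parent traversal along the way. `parseCommand` is a hypothetical helper; it deliberately does not support quoting, only recognizes POSIX-style absolute paths, and does not inspect values embedded in flags such as `--file=/etc/passwd`.

```javascript
const SHELL_METACHARS = /[;&|`$<>\n\r]/

// Hypothetical parser: turns a command string into { file, args } for execFile,
// or returns a reason for rejecting it. Quoting is deliberately unsupported.
function parseCommand(command) {
  if (SHELL_METACHARS.test(command)) {
    return { ok: false, reason: "Shell metacharacters are not allowed" }
  }

  const [file, ...args] = command.trim().split(/\s+/)
  if (!file) {
    return { ok: false, reason: "Empty command" }
  }

  for (const arg of args) {
    if (arg.startsWith("/")) {
      return { ok: false, reason: `Absolute path not allowed: ${arg}` }
    }
    if (arg.split("/").includes("..")) {
      return { ok: false, reason: `Parent traversal not allowed: ${arg}` }
    }
  }

  return { ok: true, file, args }
}

console.log(parseCommand("cat package.json && rm -rf ."))
console.log(parseCommand("grep -r TODO src"))
console.log(parseCommand("cat ../secret.txt"))
```

A successful result can be passed straight to execFile as `file` and `args`, so no shell ever interprets the string.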

4. Approval Workflow

A good harness should not always auto-execute everything.

In config.js, I use approval modes:

export const autoApprove = {
  default: "auto",

  bash: {
    promptExecution: true,
    promptSending: true
  },

  read: {
    promptExecution: false,
    promptSending: true
  },

  write: {
    promptExecution: true,
    promptSending: false
  },

  edit: {
    promptExecution: true,
    promptSending: false
  }
}

The modes are:

auto   → safe operations can run automatically, risky ones ask
manual → use per-tool settings
block  → block all tool calls
allow  → approve everything, useful only for testing

This gives the developer control.

For example, if the agent wants to run a shell command, the CLI can ask:

Execute this tool call? (y/N):

And after getting the output, it can ask whether to send that output back to the model.

This matters because command output may contain sensitive information.
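The four modes can be read as a small decision function. The sketch below is one interpretation, not the logic in utils/approval.js: `decideApproval` is a hypothetical name, the `phase` values "execution" and "sending" are assumed labels for the two prompts, and how "auto" tells safe from risky is not specified here, so the caller passes an assumed `isRisky` flag.

```javascript
// Per-tool settings copied from the config above.
const autoApprove = {
  default: "auto",
  bash: { promptExecution: true, promptSending: true },
  read: { promptExecution: false, promptSending: true },
  write: { promptExecution: true, promptSending: false },
  edit: { promptExecution: true, promptSending: false }
}

// Hypothetical decision function. Returns "run", "ask", or "deny".
// phase is "execution" (before running) or "sending" (before returning output).
function decideApproval({ mode, toolName, phase, isRisky = false }, config = autoApprove) {
  switch (mode) {
    case "block":
      return "deny"
    case "allow":
      return "run"
    case "manual": {
      const settings = config[toolName]
      if (!settings) return "ask" // unknown tools default to asking
      const shouldPrompt = phase === "execution" ? settings.promptExecution : settings.promptSending
      return shouldPrompt ? "ask" : "run"
    }
    case "auto":
      // Assumption: the caller decides what counts as risky.
      return isRisky ? "ask" : "run"
    default:
      return "ask"
  }
}

console.log(decideApproval({ mode: "manual", toolName: "read", phase: "execution" }))
console.log(decideApproval({ mode: "manual", toolName: "read", phase: "sending" }))
console.log(decideApproval({ mode: "block", toolName: "bash", phase: "execution" }))
```

Falling back to "ask" for unknown modes or tools keeps the default on the cautious side.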

Context Tags: A Small Feature That Makes the Agent Feel Useful

One feature I added is context tags.

The user can type:

Review @./README.md and suggest improvements.

or:

Look at @workspace and tell me how this project is structured.

The harness detects these tags before sending the message to the model.

In utils/userAppend.js, it parses patterns like:

@./file.js
@workspace

Then it adds the file contents or workspace listing to the user message.

This is a nice middle ground between manual copy-paste and fully autonomous file reading.

The user can explicitly attach context, and the agent can still use tools later if it needs more.
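A tag parser along these lines can be quite small. The following sketch is an assumption about how such parsing might work, not the code in utils/userAppend.js: `extractTags` and `appendContext` are hypothetical names, the `<file>` and `<workspace>` wrappers are an invented format, and the file reader and workspace lister are passed in so the example runs without touching disk.

```javascript
// Matches "@workspace" or "@./relative/path" at the start of the input or after whitespace,
// so an email address like "me@example.com" is not treated as a tag.
const TAG_PATTERN = /(?:^|\s)@(workspace\b|\.\/\S+)/g

function extractTags(userInput) {
  const tags = []
  for (const match of userInput.matchAll(TAG_PATTERN)) {
    if (match[1] === "workspace") {
      tags.push({ type: "workspace" })
    } else {
      // Drop sentence punctuation that trails the path, e.g. "@./a.js."
      tags.push({ type: "file", path: match[1].replace(/[.,;:!?)]+$/, "") })
    }
  }
  return tags
}

// Appends the referenced context to the user message.
// readFile and listWorkspace are injected; a real harness would apply
// the same path checks here as it does for the read tool.
function appendContext(userInput, { readFile, listWorkspace }) {
  const blocks = extractTags(userInput).map((tag) =>
    tag.type === "workspace"
      ? `<workspace>\n${listWorkspace()}\n</workspace>`
      : `<file path="${tag.path}">\n${readFile(tag.path)}\n</file>`
  )
  return blocks.length === 0 ? userInput : `${userInput}\n\n${blocks.join("\n\n")}`
}

console.log(extractTags("Review @./README.md and suggest improvements."))
console.log(extractTags("Look at @workspace and tell me how this project is structured."))
```

Because attached files bypass the model's own tool calls, the injected reader is a natural place to reuse the workspace and .gitignore checks.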

Provider-Agnostic Model Configuration

Another design decision: I did not want the harness to be locked to one provider.

The project uses OpenAI-compatible chat completion APIs, so the config can support multiple providers.

In config.js, providers are registered like:

export const models = {
  openai: {
    apiKey: keys.OPENAI_API_KEY,
    baseURL: keys.OPENAI_BASE_URL,
    gpt54mini: {
      model: "gpt-5.4-mini"
    }
  }
}
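With a registry shaped like this, switching models comes down to a lookup. The sketch below assumes a hypothetical "provider/alias" selection string and a hypothetical `resolveModel` helper; the `keys` object holds placeholder values rather than real credentials, and the result is the set of fields an OpenAI-compatible client would typically need.

```javascript
// Placeholder stand-in for the keys object; real values would come from the environment.
const keys = {
  OPENAI_API_KEY: "sk-placeholder",
  OPENAI_BASE_URL: "https://api.openai.com/v1"
}

const models = {
  openai: {
    apiKey: keys.OPENAI_API_KEY,
    baseURL: keys.OPENAI_BASE_URL,
    gpt54mini: { model: "gpt-5.4-mini" }
  }
}

// Hypothetical helper: turns "provider/alias" into the settings a client needs.
function resolveModel(selection, registry = models) {
  const [providerName, alias] = selection.split("/")
  const provider = registry[providerName]
  if (!provider) {
    throw new Error(`Unknown provider: ${providerName}`)
  }

  // Model entries are objects with a "model" field; apiKey and baseURL are plain strings.
  const entry = provider[alias]
  if (!entry || typeof entry !== "object" || !entry.model) {
    throw new Error(`Unknown model alias: ${selection}`)
  }

  return { apiKey: provider.apiKey, baseURL: provider.baseURL, model: entry.model }
}

console.log(resolveModel("openai/gpt54mini"))
```

Adding another provider then means adding one more top-level key with its own apiKey, baseURL, and model entries, without touching the loop.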

System Prompt: The Agent’s Operating Manual

Tools define what the agent can do.

The system prompt defines how the agent should behave.

In config/SYSTEM.md, I describe rules like:

be concise and action-oriented
inspect files before changing them
prefer small focused edits
do not invent tools
operate inside the workspace
avoid destructive operations
use bash only when needed
follow command restrictions

This is important because tool schemas alone are not enough.

You also need behavioral instructions.

The system prompt is like the agent’s operating manual.

But the system prompt is not security.

Security must still live in code.

A good rule is:

Prompt for behavior.
Code for enforcement.

Final Thoughts

An agent is not just an LLM.

An agent is an LLM inside a loop, connected to tools, controlled by a harness.

The loop gives the agent continuity.

Tools give the agent the ability to act.

Safety rules keep that action under control.

Building my own harness made agents feel much less mysterious. Under the hood, the core idea is simple:

User asks
Model decides
Harness validates
Tool executes
Result returns
Model continues

That is the whole pattern.

Of course, production systems add many more layers: memory, planning, retries, tracing, evals, permissions, sandboxes, and deployment infrastructure.

But the core is still the same.

If you want to understand agents deeply, build a small harness yourself.

uai-agent follows that pattern: it runs inside your workspace, gives the model practical coding tools, and keeps the developer in control with safety checks and approvals.

If you are interested, check out the repo here:

GitHub: https://github.com/uditrajput03/uai-agent

And if you are building your own harness, start with the loop and the tools.

Everything else grows from there.
