Introduction
"Agent infrastructure should be lightweight, composable, and provider-agnostic."
This is the 60th article in the "One Open Source Project a Day" series. Today, we are exploring OpenHarness.
Over the past few articles, we covered OpenAI's Symphony (an agent orchestration spec), Addy Osmani's Agent Skills (an engineering discipline skill set), and Anthropic's Financial Services (a financial industry agent suite). Together, they paint a clear trend: AI agents are evolving from "chat assistants" into "engineering infrastructure that can execute workflows."
OpenHarness is an excellent implementation of exactly that infrastructure layer. Built in Python by HKUDS (HKU Data Science Lab) at the University of Hong Kong, it provides four core capabilities—tool-use, skill loading, memory management, and multi-agent coordination—and supports everything from Claude to DeepSeek to a locally running Ollama instance. Its 12.2k Stars reflect the developer community's endorsement of its philosophy: lightweight, composable, and provider-agnostic.
What You Will Learn
- The five core pillars of OpenHarness (Agent Loop, Harness Toolkit, Context & Memory, Governance, Swarm)
- How to install and launch an AI agent with 43+ built-in tools using a single command
- What the MEMORY.md persistent memory mechanism is and how it enables cross-session context recovery
- How the Governance layer ensures safe agent execution through permission modes and hooks
- How the ohmo personal agent automates coding tasks via Feishu, Slack, and Telegram
Prerequisites
- Basic Python familiarity (pip install, command line)
- Basic understanding of AI agents (knowing that LLMs can call tools is sufficient)
- An API key for Anthropic or OpenAI (or an existing subscription)
Project Background
Project Introduction
OpenHarness is an open-source Python framework designed to provide AI agents with core lightweight infrastructure: Tool-Use, Skills, Memory, and Multi-Agent Coordination.
Its design philosophy rests on three keywords:
- Lightweight: No complex DSLs or heavyweight framework dependencies—core logic is clean and readable
- Composable: 43+ tools, a skill system, and MCP integrations are all loaded on demand
- Provider-Agnostic: The same codebase runs on Claude, DeepSeek, and Ollama without modification
The project also ships ohmo, a personal agent built on OpenHarness. It bridges Feishu, Slack, Telegram, and Discord, running on your existing Claude Code or GitHub Copilot subscription to autonomously create branches, write code, run tests, and open pull requests.
Author/Team Introduction
- Team: HKUDS (HKU Data Science Lab, University of Hong Kong)
- Background: HKUDS is a research group at HKU with deep experience in recommender systems, graph neural networks, and large-model applications, with multiple well-known open-source projects to their name
- Project Positioning: Academic rigor combined with engineering pragmatism—114 passing unit tests and 6 E2E test suites back up the claims
Project Data
- ⭐ GitHub Stars: 12,200+
- 🍴 Forks: 2,000+
- 📝 Commits: 380+
- 🧪 Tests: 114 passing unit tests + 6 E2E suites
- 🔧 Built-in Tools: 43+
- 📄 License: MIT
- 🌐 Repository: HKUDS/OpenHarness
Main Features
Core Utility
OpenHarness plays the role of the "OS kernel" for AI agents. It is not a chat interface for end users—it provides developers with the foundational, essential runtime capabilities needed to build AI agents.
Think of it like the Linux kernel: you don't use the kernel directly, but every application you build depends on the kernel for process scheduling, file system access, and networking. OpenHarness provides the equivalent for AI agents: tool execution scheduling, persistent memory, permission control, and sub-agent coordination.
Use Cases
- Personal Developer Workflow Automation: Send ohmo a message on Telegram; the AI automatically creates a GitHub branch, writes code, runs tests, and opens a PR.
- Building Domain-Specific Agents: Use OpenHarness as a base to develop your own agent applications, loading skills for financial analysis, code review, or document generation on demand.
- Multi-Model Comparison and Switching: Switch seamlessly between Claude, GPT-4, DeepSeek, and a local Ollama model with the same agent code, comparing output quality and cost.
- Enterprise-Grade Agent Governance: Use permission modes and hooks to control agent access to the file system and shell commands in team environments.
- Multi-Agent Collaboration Systems: Use the Swarm module to launch a team of sub-agents, decomposing complex tasks into parallel subtasks for faster execution.
Quick Start
Installation:
# Method 1: One-command install script
curl -fsSL https://raw.githubusercontent.com/HKUDS/OpenHarness/main/scripts/install.sh | bash
# Method 2: pip install
pip install openharness-ai
Initial Setup:
# Interactive provider configuration (Claude / OpenAI / DeepSeek / etc.)
oh setup
# Example: configure Claude
# Provider: anthropic
# API Key: sk-ant-...
# Model: claude-opus-4-6
Basic Usage:
# Launch interactive terminal UI
oh
# Single-task execution (non-interactive)
oh -p "Analyze the Python code in the current directory and find all unhandled exceptions"
# JSON output format (for pipes and script integration)
oh -p "List all TODO comments" --output-format json
# Dry-run mode (preview config, execute nothing)
oh --dry-run
ohmo Personal Agent:
# Install ohmo
pip install ohmo
# Configure messaging platform (Feishu / Slack / Telegram / Discord)
ohmo setup --platform telegram
# Start listening
ohmo start
# Now send a message on Telegram:
# "Fix the login bug on the feature/login-fix branch and open a PR when done"
# ohmo automatically: creates branch → writes code → runs tests → opens PR
Core Characteristics (Five Pillars)
1. Agent Loop (Loop Engine)
The heart of OpenHarness—a streaming tool-call cycle that handles every round of interaction with the LLM:
# Conceptual Agent Loop structure
while not done:
    response = llm.stream(messages, tools=available_tools)
    if response.has_tool_calls:
        # Execute multiple tool calls in parallel
        results = parallel_execute(response.tool_calls)
        messages.append(tool_results(results))
    else:
        # Model delivers its final response; the loop ends
        done = True
    yield response.text
Key capabilities:
- Streaming output: Display results as they generate, minimizing perceived latency
- Exponential backoff retry: Automatically retries on API rate limits without user interruption
- Parallel tool execution: Multiple tool calls execute simultaneously for significant speedups
- Token counting and cost tracking: Real-time display of token consumption and API cost per call
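The parallel-execution capability above maps naturally onto a standard thread pool. A minimal sketch of the idea (the call shapes and `parallel_execute` signature are illustrative, not OpenHarness's internal implementation):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_execute(tool_calls, tools):
    """Run independent tool calls concurrently, returning results in call order.

    `tool_calls` is a list of {"name": ..., "args": {...}} dicts and `tools`
    maps tool names to plain Python callables (illustrative shapes only).
    """
    with ThreadPoolExecutor(max_workers=max(len(tool_calls), 1)) as pool:
        futures = [pool.submit(tools[c["name"]], **c["args"]) for c in tool_calls]
        # f.result() blocks until each call finishes, preserving input order
        return [f.result() for f in futures]
```

Because each tool call is independent, the wall-clock time of a round approaches the slowest single call rather than the sum of all of them.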
2. Harness Toolkit (Tool Suite)
43+ built-in tools covering the vast majority of everyday agent tasks:
| Category | Example Tools |
|---|---|
| File Operations | read_file, write_file, edit_file, list_dir, search_files |
| Shell Commands | bash_execute, python_execute, node_execute |
| Web | web_search, web_fetch, web_screenshot, parse_html |
| MCP Integration | Connect to any MCP server (HTTP/SSE transport) |
| On-Demand Skills | Dynamically load expertise from Markdown skill files |
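A toolkit like this is typically backed by a name-to-callable registry that the agent loop dispatches into. A minimal sketch of that pattern (the decorator and `call_tool` are hypothetical names, not OpenHarness's actual API):

```python
TOOLS = {}

def tool(name):
    """Decorator that registers a function under `name` for dispatch."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("read_file")
def read_file(path):
    """Return the text content of a file (one of the file-operation tools)."""
    with open(path, encoding="utf-8") as f:
        return f.read()

def call_tool(name, **kwargs):
    """Dispatch a tool call by name, as the agent loop would."""
    return TOOLS[name](**kwargs)
```

Loading tools "on demand" then amounts to populating (or pruning) this registry before the loop starts.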
3. Context & Memory
One of OpenHarness's most thoughtfully engineered modules:
- CLAUDE.md discovery and injection: On startup, automatically scans the working directory for a CLAUDE.md file and injects it as system context (familiar to Claude Code users)
- Auto-Compaction: When context approaches the model's limit, automatically compresses conversation history while preserving key information
- MEMORY.md persistent memory: Important things the agent learns during a session are written to MEMORY.md and automatically restored on the next launch—enabling genuine cross-session memory
- Session resumption: Pick up exactly where you left off without re-explaining the background
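The CLAUDE.md discovery step can be sketched as follows. This assumes an upward directory search similar to Claude Code's behavior; OpenHarness's actual lookup rules may differ:

```python
from pathlib import Path

def discover_context(start="."):
    """Return the contents of the nearest CLAUDE.md, searching upward from `start`."""
    here = Path(start).resolve()
    for directory in [here, *here.parents]:
        candidate = directory / "CLAUDE.md"
        if candidate.exists():
            return candidate.read_text(encoding="utf-8")
    return ""  # no project context file found
```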
# Example MEMORY.md (auto-maintained by the agent)
## Project Memory
### User Preferences
- Python must run in the conda `dev_base` environment
- Commit messages should be in English
- Test coverage requirement: > 80%
### Known Issues
- `auth.py:142` has a known race condition — pending fix
- PostgreSQL connection pool needs max_conn adjustment under high concurrency
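Mechanically, this kind of persistence reduces to reading the file at startup and appending notes as they are learned. A minimal sketch (`load_memory` and `remember` are hypothetical helper names, not the framework's API):

```python
from pathlib import Path

def load_memory(path):
    """Return persisted memory as text, or "" on a first-ever launch."""
    p = Path(path)
    return p.read_text(encoding="utf-8") if p.exists() else ""

def remember(path, note, section="## Project Memory"):
    """Append a note under `section`, creating the header on first write."""
    text = load_memory(path)
    if section not in text:
        text += f"{section}\n"
    Path(path).write_text(text + f"- {note}\n", encoding="utf-8")
```

On the next launch, `load_memory` is injected into the system context, which is what makes the memory survive across sessions.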
4. Governance Layer
In production environments, letting an agent freely access the file system and execute shell commands is dangerous. OpenHarness's Governance module provides:
- Multi-level permission modes: From read-only to fully autonomous, configurable per scenario
- Path-level command rules: Precisely control which directories the agent can read/write and which commands it can execute
- PreToolUse/PostToolUse hooks: Insert custom logic before and after tool execution (logging, auditing, secondary confirmation)
- Interactive approval dialogs: For high-risk operations (deleting files, running deployment commands), display a confirmation prompt for user approval
# Governance configuration example (conceptual)
governance:
  mode: restricted
  allowed_paths:
    read: ["./src", "./docs"]
    write: ["./output"]
  forbidden_commands:
    - "rm -rf"
    - "git push --force"
  hooks:
    pre_tool_use:
      - log_tool_call    # Log all tool calls
    post_tool_use:
      - validate_output  # Validate tool outputs
  require_approval:
    - shell_execute      # Require user approval for shell execution
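The forbidden-command check in the config above boils down to a PreToolUse-style guard that inspects arguments before dispatch. A simplified illustration (`guarded_execute` and its signature are hypothetical; `run_tool` stands in for the real dispatcher):

```python
def guarded_execute(tool_name, args, run_tool,
                    forbidden=("rm -rf", "git push --force")):
    """Refuse dangerous shell commands before they ever reach the tool."""
    command = args.get("command", "")
    if tool_name == "bash_execute" and any(bad in command for bad in forbidden):
        raise PermissionError(f"blocked by governance policy: {command}")
    # Safe call: hand off to the normal tool dispatcher
    return run_tool(tool_name, args)
```

PostToolUse hooks work symmetrically, wrapping the return path to validate or audit tool outputs.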
5. Swarm Coordination
For complex tasks requiring parallel processing, a single agent is too slow. The Swarm module enables multi-agent collaboration:
# Swarm usage example (conceptual)
from openharness import Swarm, Agent
swarm = Swarm()
# Register a team of specialist agents
swarm.register("code_analyst", Agent(skills=["code-review"]))
swarm.register("security_auditor", Agent(skills=["security"]))
swarm.register("doc_writer", Agent(skills=["documentation"]))
# Delegate tasks—execute in parallel
results = await swarm.delegate({
    "code_analyst": "Analyze code quality in the src/ directory",
    "security_auditor": "Scan for potential security vulnerabilities",
    "doc_writer": "Generate an API documentation draft",
})
Project Advantages
| Feature | OpenHarness | LangChain / LlamaIndex | AutoGen |
|---|---|---|---|
| Learning Curve | Low (just run oh) | High (many abstraction layers) | Medium |
| Core Codebase Size | Lightweight | Hundreds of thousands of lines | Medium |
| Provider Support | 10+ providers (including local models) | Many, but complex configuration | Primarily OpenAI |
| Memory Mechanism | Native MEMORY.md persistence | Requires external integration | Limited |
| Multi-Agent | Swarm native support | Via agent framework | Core feature |
| Governance/Permissions | Built-in multi-level + hooks | Not built-in | Limited |
| MCP Support | Native (HTTP/SSE transport) | Plugin-based | None |
Detailed Analysis
1. Multi-Provider Support: Truly Provider-Agnostic
OpenHarness supports model providers across three tiers:
Anthropic-compatible (via Anthropic SDK):
oh setup
# Provider: anthropic → Claude series
# Provider: moonshot → Kimi
# Provider: glm → Zhipu GLM
# Provider: minimax → MiniMax
OpenAI-compatible (via OpenAI SDK):
# Provider: openai → GPT-4, GPT-4o
# Provider: openrouter → Multi-model aggregator
# Provider: dashscope → Alibaba Qwen
# Provider: deepseek → DeepSeek series
# Provider: groq → Ultra-fast Llama inference
# Provider: ollama → Local open-source models
# Provider: github → GitHub Models
Subscription bridges (no API key needed—reuse existing subscriptions):
# Provider: claude-code → Reuse your Claude Code subscription
# Provider: codex → Reuse your GitHub Copilot (Codex CLI) subscription
This means zero additional API costs if you already subscribe to Claude Code or GitHub Copilot.
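"OpenAI-compatible" in practice means the same client SDK pointed at a different base URL. A sketch of that pattern; the endpoint values below are the providers' commonly documented defaults, so treat them as illustrative and verify against each provider's docs:

```python
# Map provider names to OpenAI-compatible API endpoints (illustrative values).
ENDPOINTS = {
    "openai": "https://api.openai.com/v1",
    "deepseek": "https://api.deepseek.com",
    "ollama": "http://localhost:11434/v1",  # local models need no real key
}

def client_config(provider, api_key="unused-for-local"):
    """Return the kwargs an OpenAI-compatible client would be constructed with."""
    return {"base_url": ENDPOINTS[provider], "api_key": api_key}
```

Switching providers is then a one-line config change rather than a code change, which is the substance of the "provider-agnostic" claim.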
2. The 43-Tool Arsenal
OpenHarness's built-in tools give AI agents the ability to genuinely "get things done":
File System (~10 tools):
read_file, write_file, edit_file, list_dir, search_files,
create_dir, delete_file, move_file, copy_file, get_file_info
Shell (~5 tools):
bash_execute, python_execute, node_execute, get_env, set_env
Web (~8 tools):
web_search, web_fetch, web_screenshot, parse_html,
download_file, check_url, get_headers
Code (~8 tools):
lint_code, format_code, run_tests, build_project,
git_status, git_commit, git_diff, git_log
MCP (~5 tools):
mcp_connect, mcp_list_tools, mcp_call_tool,
mcp_list_resources, mcp_read_resource
Other (~7 tools):
token_count, cost_estimate, task_spawn, memory_read,
memory_write, skill_load, context_compress
3. ohmo: From Framework to Product
If OpenHarness is the "engine," ohmo is the "first production car" built on it:
User sends a message on Telegram
↓
ohmo receives the message
↓
OpenHarness Agent Loop kicks in
↓
Tools invoked (git, bash, file ops, etc.)
↓
Auto: creates branch → writes code → runs tests → opens PR
↓
ohmo replies on Telegram: "Task complete. PR #47 is open for review."
This flow demonstrates OpenHarness's true value: it encapsulates all the messy, unglamorous infrastructure work of "making an AI agent actually do things," freeing higher-level applications like ohmo to focus entirely on business logic.
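The whole flow compresses into a thin bridge: receive a message, run the agent loop, reply with the result. A schematic sketch (callback names are hypothetical, not ohmo's actual interface):

```python
def handle_message(text, run_agent, reply):
    """Bridge one chat message into an agent run and send the result back.

    `run_agent` stands in for the OpenHarness agent loop; `reply` sends a
    message back on the originating platform (Telegram, Slack, etc.).
    """
    result = run_agent(text)  # e.g. create branch -> write code -> tests -> PR
    reply(f"Task complete. {result}")
```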
Project Links & Resources
Official Resources
- 🌟 GitHub: https://github.com/HKUDS/OpenHarness
- 🤖 ohmo Personal Agent: Included in the main repository
- 🛠️ Install Script: scripts/install.sh
Target Audience
- Python developers who want to build their own AI agents without building the infrastructure from scratch
- AI application explorers who want to compare outputs across model providers and find the best price-performance ratio
- Enterprise architects who need a governable, auditable AI agent runtime
- Personal productivity enthusiasts who want to realize the dream of "send a message, AI handles the code"
Summary
Key Takeaways
- Five pillars: Agent Loop, Harness Toolkit (43+ tools), Context & Memory (MEMORY.md persistence), Governance (multi-level permissions + hooks), Swarm (multi-agent coordination)
- Truly provider-agnostic: 10+ providers, including direct reuse of Claude Code and GitHub Copilot subscriptions
- MEMORY.md mechanism is the most distinctive design—giving agents genuine long-term memory
- From HKUDS at HKU: academic rigor meets engineering practicality (114 tests + 6 E2E suites)
- ohmo is the best-practice showcase—a complete path from "infrastructure framework" to "usable product"
One-Line Review
OpenHarness does the least glamorous but most important work in the AI agent space: making tool-use, memory, permissions, and multi-agent coordination clean and reliable—so the applications built on top can stand on its shoulders with elegance.
Find more useful knowledge and interesting products on my Homepage