Introduction
"Agent infrastructure should be lightweight, composable, and provider-agnostic."
This is the 60th article in the "One Open Source Project a Day" series. Today, we are exploring OpenHarness.
Over the past few articles, we covered OpenAI's Symphony (an agent orchestration spec), Addy Osmani's Agent Skills (an engineering discipline skill set), and Anthropic's Financial Services (a financial industry agent suite). Together, they paint a clear trend: AI agents are evolving from "chat assistants" into "engineering infrastructure that can execute workflows."
OpenHarness is an excellent implementation of exactly that infrastructure layer. Built in Python by HKUDS (HKU Data Science Lab) at the University of Hong Kong, it provides four core capabilities—tool-use, skill loading, memory management, and multi-agent coordination—and supports everything from Claude to DeepSeek to a locally running Ollama instance. Its 12.2k Stars reflect the developer community's endorsement of its philosophy: lightweight, composable, and provider-agnostic.
What You Will Learn
- The five core pillars of OpenHarness (Agent Loop, Harness Toolkit, Context & Memory, Governance, Swarm)
- How to install and launch an AI agent with 43+ built-in tools using a single command
- What the MEMORY.md persistent memory mechanism is and how it enables cross-session context recovery
- How the Governance layer ensures safe agent execution through permission modes and hooks
- How the ohmo personal agent automates coding tasks via Feishu, Slack, and Telegram
Prerequisites
- Basic Python familiarity (pip install, command line)
- Basic understanding of AI agents (knowing that LLMs can call tools is sufficient)
- An API key for Anthropic or OpenAI (or an existing subscription)
Project Background
Project Introduction
OpenHarness is an open-source Python framework designed to provide AI agents with core lightweight infrastructure: Tool-Use, Skills, Memory, and Multi-Agent Coordination.
Its design philosophy rests on three keywords:
- Lightweight: No complex DSLs or heavyweight framework dependencies—core logic is clean and readable
- Composable: 43+ tools, a skill system, and MCP integrations are all loaded on demand
- Provider-Agnostic: The same codebase runs on Claude, DeepSeek, and Ollama without modification
The project also ships ohmo, a personal agent built on OpenHarness. It bridges Feishu, Slack, Telegram, and Discord, running on your existing Claude Code or GitHub Copilot subscription to autonomously create branches, write code, run tests, and open pull requests.
Author/Team Introduction
- Team: HKUDS (HKU Data Science Lab, University of Hong Kong)
- Background: HKUDS is a research group at HKU with deep experience in recommender systems, graph neural networks, and large-model applications, with multiple well-known open-source projects to their name
- Project Positioning: Academic rigor combined with engineering pragmatism—114 passing unit tests and 6 E2E test suites back up the claims
Project Data
- ⭐ GitHub Stars: 12,200+
- 🍴 Forks: 2,000+
- 📝 Commits: 380+
- 🧪 Tests: 114 passing unit tests + 6 E2E suites
- 🔧 Built-in Tools: 43+
- 📄 License: MIT
- 🌐 Repository: HKUDS/OpenHarness
Main Features
Core Utility
OpenHarness plays the role of the "OS kernel" for AI agents. It is not a chat interface for end users—it provides developers with the foundational, essential runtime capabilities needed to build AI agents.
Think of it like the Linux kernel: you don't use the kernel directly, but every application you build depends on the kernel for process scheduling, file system access, and networking. OpenHarness provides the equivalent for AI agents: tool execution scheduling, persistent memory, permission control, and sub-agent coordination.
Use Cases
- Personal Developer Workflow Automation: Send ohmo a message on Telegram; the AI automatically creates a GitHub branch, writes code, runs tests, and opens a PR.
- Building Domain-Specific Agents: Use OpenHarness as a base to develop your own agent applications, loading skills for financial analysis, code review, or document generation on demand.
- Multi-Model Comparison and Switching: Switch seamlessly between Claude, GPT-4, DeepSeek, and a local Ollama model with the same agent code, comparing output quality and cost.
- Enterprise-Grade Agent Governance: Use permission modes and hooks to control agent access to the file system and shell commands in team environments.
- Multi-Agent Collaboration Systems: Use the Swarm module to launch a team of sub-agents, decomposing complex tasks into parallel subtasks for faster execution.
Quick Start
Installation:
# Method 1: One-command install script
curl -fsSL https://raw.githubusercontent.com/HKUDS/OpenHarness/main/scripts/install.sh | bash
# Method 2: pip install
pip install openharness-ai
Initial Setup:
# Interactive provider configuration (Claude / OpenAI / DeepSeek / etc.)
oh setup
# Example: configure Claude
# Provider: anthropic
# API Key: sk-ant-...
# Model: claude-opus-4-6
Basic Usage:
# Launch interactive terminal UI
oh
# Single-task execution (non-interactive)
oh -p "Analyze the Python code in the current directory and find all unhandled exceptions"
# JSON output format (for pipes and script integration)
oh -p "List all TODO comments" --output-format json
# Dry-run mode (preview config, execute nothing)
oh --dry-run
ohmo Personal Agent:
# Install ohmo
pip install ohmo
# Configure messaging platform (Feishu / Slack / Telegram / Discord)
ohmo setup --platform telegram
# Start listening
ohmo start
# Now send a message on Telegram:
# "Fix the login bug on the feature/login-fix branch and open a PR when done"
# ohmo automatically: creates branch → writes code → runs tests → opens PR
Core Characteristics (Five Pillars)
1. Agent Loop (Loop Engine)
The heart of OpenHarness—a streaming tool-call cycle that handles every round of interaction with the LLM:
# Conceptual Agent Loop structure
while not done:
    response = llm.stream(messages, tools=available_tools)
    if response.has_tool_calls:
        # Execute multiple tool calls in parallel
        results = parallel_execute(response.tool_calls)
        messages.append(tool_results(results))
    else:
        # Model delivers its final response; the loop ends
        done = True
    yield response.text
Key capabilities:
- Streaming output: Display results as they generate, minimizing perceived latency
- Exponential backoff retry: Automatically retries on API rate limits without user interruption
- Parallel tool execution: Multiple tool calls execute simultaneously for significant speedups
- Token counting and cost tracking: Real-time display of token consumption and API cost per call
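The parallel-execution capability above maps naturally onto a standard thread pool. A minimal sketch of the idea (the call shapes and `parallel_execute` signature are illustrative, not OpenHarness's internal implementation):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_execute(tool_calls, tools):
    """Run independent tool calls concurrently, returning results in call order.

    `tool_calls` is a list of {"name": ..., "args": {...}} dicts and `tools`
    maps tool names to plain Python callables (illustrative shapes only).
    """
    with ThreadPoolExecutor(max_workers=max(len(tool_calls), 1)) as pool:
        futures = [pool.submit(tools[c["name"]], **c["args"]) for c in tool_calls]
        # f.result() blocks until each call finishes, preserving input order
        return [f.result() for f in futures]
```

Because each tool call is independent, the wall-clock time of a round approaches the slowest single call rather than the sum of all of them.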
2. Harness Toolkit (Tool Suite)
43+ built-in tools covering the vast majority of everyday agent tasks:
| Category | Example Tools |
|---|---|
| File Operations | read_file, write_file, edit_file, list_dir, search_files |
| Shell Commands | bash_execute, python_execute, node_execute |
| Web | web_search, web_fetch, web_screenshot, parse_html |
| MCP Integration | Connect to any MCP server (HTTP/SSE transport) |
| On-Demand Skills | Dynamically load expertise from Markdown skill files |
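A toolkit like this is typically backed by a name-to-callable registry that the agent loop dispatches into. A minimal sketch of that pattern (the decorator and `call_tool` are hypothetical names, not OpenHarness's actual API):

```python
TOOLS = {}

def tool(name):
    """Decorator that registers a function under `name` for dispatch."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("read_file")
def read_file(path):
    """Return the text content of a file (one of the file-operation tools)."""
    with open(path, encoding="utf-8") as f:
        return f.read()

def call_tool(name, **kwargs):
    """Dispatch a tool call by name, as the agent loop would."""
    return TOOLS[name](**kwargs)
```

Loading tools "on demand" then amounts to populating (or pruning) this registry before the loop starts.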
3. Context & Memory
One of OpenHarness's most thoughtfully engineered modules:
- CLAUDE.md discovery and injection: On startup, automatically scans the working directory for a CLAUDE.md file and injects it as system context (familiar to Claude Code users)
- Auto-Compaction: When context approaches the model's limit, automatically compresses conversation history while preserving key information
- MEMORY.md persistent memory: Important things the agent learns during a session are written to MEMORY.md and automatically restored on the next launch—enabling genuine cross-session memory
- Session resumption: Pick up exactly where you left off without re-explaining the background
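The CLAUDE.md discovery step can be sketched as follows. This assumes an upward directory search similar to Claude Code's behavior; OpenHarness's actual lookup rules may differ:

```python
from pathlib import Path

def discover_context(start="."):
    """Return the contents of the nearest CLAUDE.md, searching upward from `start`."""
    here = Path(start).resolve()
    for directory in [here, *here.parents]:
        candidate = directory / "CLAUDE.md"
        if candidate.exists():
            return candidate.read_text(encoding="utf-8")
    return ""  # no project context file found
```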
# Example MEMORY.md (auto-maintained by the agent)
## Project Memory
### User Preferences
- Python must run in the conda `dev_base` environment
- Commit messages should be in English
- Test coverage requirement: > 80%
### Known Issues
- `auth.py:142` has a known race condition — pending fix
- PostgreSQL connection pool needs max_conn adjustment under high concurrency
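Mechanically, this kind of persistence reduces to reading the file at startup and appending notes as they are learned. A minimal sketch (`load_memory` and `remember` are hypothetical helper names, not the framework's API):

```python
from pathlib import Path

def load_memory(path):
    """Return persisted memory as text, or "" on a first-ever launch."""
    p = Path(path)
    return p.read_text(encoding="utf-8") if p.exists() else ""

def remember(path, note, section="## Project Memory"):
    """Append a note under `section`, creating the header on first write."""
    text = load_memory(path)
    if section not in text:
        text += f"{section}\n"
    Path(path).write_text(text + f"- {note}\n", encoding="utf-8")
```

On the next launch, `load_memory` is injected into the system context, which is what makes the memory survive across sessions.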
4. Governance Layer
In production environments, letting an agent freely access the file system and execute shell commands is dangerous. OpenHarness's Governance module provides:
- Multi-level permission modes: From read-only to fully autonomous, configurable per scenario
- Path-level command rules: Precisely control which directories the agent can read/write and which commands it can execute
- PreToolUse/PostToolUse hooks: Insert custom logic before and after tool execution (logging, auditing, secondary confirmation)
- Interactive approval dialogs: For high-risk operations (deleting files, running deployment commands), display a confirmation prompt for user approval
# Governance configuration example (conceptual)
governance:
  mode: restricted
  allowed_paths:
    read: ["./src", "./docs"]
    write: ["./output"]
  forbidden_commands:
    - "rm -rf"
    - "git push --force"
  hooks:
    pre_tool_use:
      - log_tool_call    # Log all tool calls
    post_tool_use:
      - validate_output  # Validate tool outputs
  require_approval:
    - shell_execute      # Require user approval for shell execution
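The forbidden-command check in the config above boils down to a PreToolUse-style guard that inspects arguments before dispatch. A simplified illustration (`guarded_execute` and its signature are hypothetical; `run_tool` stands in for the real dispatcher):

```python
def guarded_execute(tool_name, args, run_tool,
                    forbidden=("rm -rf", "git push --force")):
    """Refuse dangerous shell commands before they ever reach the tool."""
    command = args.get("command", "")
    if tool_name == "bash_execute" and any(bad in command for bad in forbidden):
        raise PermissionError(f"blocked by governance policy: {command}")
    # Safe call: hand off to the normal tool dispatcher
    return run_tool(tool_name, args)
```

PostToolUse hooks work symmetrically, wrapping the return path to validate or audit tool outputs.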
5. Swarm Coordination
For complex tasks requiring parallel processing, a single agent is too slow. The Swarm module enables multi-agent collaboration:
# Swarm usage example (conceptual)
from openharness import Swarm, Agent
swarm = Swarm()
# Register a team of specialist agents
swarm.register("code_analyst", Agent(skills=["code-review"]))
swarm.register("security_auditor", Agent(skills=["security"]))
swarm.register("doc_writer", Agent(skills=["documentation"]))
# Delegate tasks—execute in parallel
results = await swarm.delegate({
    "code_analyst": "Analyze code quality in the src/ directory",
    "security_auditor": "Scan for potential security vulnerabilities",
    "doc_writer": "Generate an API documentation draft",
})
Project Advantages
| Feature | OpenHarness | LangChain / LlamaIndex | AutoGen |
|---|---|---|---|
| Learning Curve | Low (just run oh) | High (many abstraction layers) | Medium |
| Core Codebase Size | Lightweight | Hundreds of thousands of lines | Medium |
| Provider Support | 10+ providers (including local models) | Many, but complex configuration | Primarily OpenAI |
| Memory Mechanism | Native MEMORY.md persistence | Requires external integration | Limited |
| Multi-Agent | Swarm native support | Via agent framework | Core feature |
| Governance/Permissions | Built-in multi-level + hooks | Not built-in | Limited |
| MCP Support | Native (HTTP/SSE transport) | Plugin-based | None |
Detailed Analysis
1. Multi-Provider Support: Truly Provider-Agnostic
OpenHarness supports model providers across three tiers:
Anthropic-compatible (via Anthropic SDK):
oh setup
# Provider: anthropic → Claude series
# Provider: moonshot → Kimi
# Provider: glm → Zhipu GLM
# Provider: minimax → MiniMax
OpenAI-compatible (via OpenAI SDK):
# Provider: openai → GPT-4, GPT-4o
# Provider: openrouter → Multi-model aggregator
# Provider: dashscope → Alibaba Qwen
# Provider: deepseek → DeepSeek series
# Provider: groq → Ultra-fast Llama inference
# Provider: ollama → Local open-source models
# Provider: github → GitHub Models
Subscription bridges (no API key needed—reuse existing subscriptions):
# Provider: claude-code → Reuse your Claude Code subscription
# Provider: codex → Reuse your GitHub Copilot (Codex CLI) subscription
This means zero additional API costs if you already subscribe to Claude Code or GitHub Copilot.
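"OpenAI-compatible" in practice means the same client SDK pointed at a different base URL. A sketch of that pattern; the endpoint values below are the providers' commonly documented defaults, so treat them as illustrative and verify against each provider's docs:

```python
# Map provider names to OpenAI-compatible API endpoints (illustrative values).
ENDPOINTS = {
    "openai": "https://api.openai.com/v1",
    "deepseek": "https://api.deepseek.com",
    "ollama": "http://localhost:11434/v1",  # local models need no real key
}

def client_config(provider, api_key="unused-for-local"):
    """Return the kwargs an OpenAI-compatible client would be constructed with."""
    return {"base_url": ENDPOINTS[provider], "api_key": api_key}
```

Switching providers is then a one-line config change rather than a code change, which is the substance of the "provider-agnostic" claim.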
2. The 43-Tool Arsenal
OpenHarness's built-in tools give AI agents the ability to genuinely "get things done":
File System (~10 tools):
read_file, write_file, edit_file, list_dir, search_files,
create_dir, delete_file, move_file, copy_file, get_file_info
Shell (~5 tools):
bash_execute, python_execute, node_execute, get_env, set_env
Web (~8 tools):
web_search, web_fetch, web_screenshot, parse_html,
download_file, check_url, get_headers
Code (~8 tools):
lint_code, format_code, run_tests, build_project,
git_status, git_commit, git_diff, git_log
MCP (~5 tools):
mcp_connect, mcp_list_tools, mcp_call_tool,
mcp_list_resources, mcp_read_resource
Other (~7 tools):
token_count, cost_estimate, task_spawn, memory_read,
memory_write, skill_load, context_compress
3. ohmo: From Framework to Product
If OpenHarness is the "engine," ohmo is the "first production car" built on it:
User sends a message on Telegram
↓
ohmo receives the message
↓
OpenHarness Agent Loop kicks in
↓
Tools invoked (git, bash, file ops, etc.)
↓
Auto: creates branch → writes code → runs tests → opens PR
↓
ohmo replies on Telegram: "Task complete. PR #47 is open for review."
This flow demonstrates OpenHarness's true value: it encapsulates all the messy, unglamorous infrastructure work of "making an AI agent actually do things," freeing higher-level applications like ohmo to focus entirely on business logic.
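The whole flow compresses into a thin bridge: receive a message, run the agent loop, reply with the result. A schematic sketch (callback names are hypothetical, not ohmo's actual interface):

```python
def handle_message(text, run_agent, reply):
    """Bridge one chat message into an agent run and send the result back.

    `run_agent` stands in for the OpenHarness agent loop; `reply` sends a
    message back on the originating platform (Telegram, Slack, etc.).
    """
    result = run_agent(text)  # e.g. create branch -> write code -> tests -> PR
    reply(f"Task complete. {result}")
```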
Project Links & Resources
Official Resources
- 🌟 GitHub: https://github.com/HKUDS/OpenHarness
- 🤖 ohmo Personal Agent: Included in the main repository
- 🛠️ Install Script: scripts/install.sh
Target Audience
- Python developers who want to build their own AI agents without building the infrastructure from scratch
- AI application explorers who want to compare outputs across model providers and find the best price-performance ratio
- Enterprise architects who need a governable, auditable AI agent runtime
- Personal productivity enthusiasts who want to realize the dream of "send a message, AI handles the code"
Summary
Key Takeaways
- Five pillars: Agent Loop, Harness Toolkit (43+ tools), Context & Memory (MEMORY.md persistence), Governance (multi-level permissions + hooks), Swarm (multi-agent coordination)
- Truly provider-agnostic: 10+ providers, including direct reuse of Claude Code and GitHub Copilot subscriptions
- MEMORY.md mechanism is the most distinctive design—giving agents genuine long-term memory
- From HKUDS at HKU: academic rigor meets engineering practicality (114 tests + 6 E2E suites)
- ohmo is the best-practice showcase—a complete path from "infrastructure framework" to "usable product"
One-Line Review
OpenHarness does the least glamorous but most important work in the AI agent space: making tool-use, memory, permissions, and multi-agent coordination clean and reliable—so the applications built on top can stand on its shoulders with elegance.
Find more useful knowledge and interesting products on my Homepage