Oresztesz Margaritisz

Posted on Jul 1 • Edited on Jul 6

Harness Engineering - Technology Landscape

#ai #sre #agents #programming

Harness engineering is the discipline of designing the environments, constraints, and feedback loops around AI coding agents that make them reliable at scale. The formula is: Agent = Model + Harness. The harness is everything that isn't the model: the infrastructure that governs how the agent operates, what it can access, and how it self-corrects.
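To make the formula concrete, here is a minimal sketch of a harness wrapped around a model. Everything in it is hypothetical and for illustration only: the `Harness` class, the `scripted_model` stand-in (a real setup would call an LLM here), and the "tool name, space, argument" action format are assumptions, not the API of any tool listed in this article.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types for illustration; not taken from any specific framework.
Model = Callable[[str], str]  # prompt in, proposed action out
Tool = Callable[[str], str]   # argument in, observation out


@dataclass
class Harness:
    """Everything that is not the model: allowed tools, limits, feedback."""
    tools: Dict[str, Tool]
    max_steps: int = 5
    log: List[str] = field(default_factory=list)

    def run(self, model: Model, task: str) -> str:
        prompt = task
        for _ in range(self.max_steps):
            action = model(prompt)
            self.log.append(action)  # record every proposed action
            if action.startswith("done:"):
                return action[len("done:"):].strip()
            name, _, arg = action.partition(" ")
            if name not in self.tools:
                # Constraint: unknown tools are rejected, and the error
                # is fed back so the model can correct itself.
                prompt = f"{task}\nerror: tool '{name}' is not allowed"
                continue
            # Feedback loop: the tool result becomes the next prompt.
            prompt = f"{task}\nobservation: {self.tools[name](arg)}"
        return "gave up: step limit reached"


def scripted_model(prompt: str) -> str:
    """Stand-in for an LLM call: tries a forbidden tool, then self-corrects."""
    if "observation:" in prompt:
        return "done: " + prompt.split("observation: ", 1)[1]
    if "error:" in prompt:
        return "read README.md"
    return "rm -rf /"


harness = Harness(tools={"read": lambda path: f"contents of {path}"})
result = harness.run(scripted_model, "summarize the README")
print(result)
print(harness.log)
```

The model only proposes actions. The harness decides which ones run, caps the number of steps, and turns errors into feedback, which is the part the rest of this landscape is about.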

In this article I'm collecting the technologies I came across while researching harness engineering. The list is not exhaustive, but it should give you a good starting point for exploring the landscape. It is meant to be a living document, so I'll keep it updated.

SDD

Spec-Driven Development

OpenSpec - Spec-driven development for AI coding assistants
spec-kit - Toolkit to get started with Spec-Driven Development
BMAD - Breakthrough Method for Agile AI Driven Development

Orchestration

Symphony - Isolated autonomous implementation runs for teams
Conductor - Multi-agent workflows with GitHub Copilot SDK
OpenSquilla - Token-efficient AI agent with higher intelligence density
Fabro - Version-controlled workflow graphs orchestrating AI agents, commands, and human gates
pi.dev - Minimal agent harness - adapt to your workflow, not vice versa
smolagents - Barebones library for agents that think in code
hive - Multi-agent harness for production AI workloads
bernstein - Audit-grade multi-agent orchestration, HMAC-chained audit log
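The "HMAC-chained audit log" idea mentioned for the last entry can be sketched with the Python standard library. This is a generic illustration of the technique, not bernstein's implementation: the `append` and `verify` helpers, the entry layout, and the all-zero `GENESIS` value are assumptions made for this example.

```python
import hashlib
import hmac
import json
from typing import Dict, List

GENESIS = "0" * 64  # assumed placeholder MAC preceding the first entry


def _mac(key: bytes, prev_mac: str, event: Dict[str, str]) -> str:
    # Each MAC covers the previous MAC plus the canonical JSON of the event,
    # so changing or removing an earlier entry invalidates every later one.
    payload = prev_mac.encode() + json.dumps(event, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append(log: List[dict], key: bytes, event: Dict[str, str]) -> None:
    prev_mac = log[-1]["mac"] if log else GENESIS
    log.append({"event": event, "mac": _mac(key, prev_mac, event)})


def verify(log: List[dict], key: bytes) -> bool:
    prev_mac = GENESIS
    for entry in log:
        expected = _mac(key, prev_mac, entry["event"])
        if not hmac.compare_digest(expected, entry["mac"]):
            return False
        prev_mac = entry["mac"]
    return True


key = b"demo-key"  # in practice a secret the agents cannot read
audit_log: List[dict] = []
append(audit_log, key, {"agent": "planner", "action": "create_plan"})
append(audit_log, key, {"agent": "coder", "action": "edit_file"})

intact_ok = verify(audit_log, key)
audit_log[0]["event"]["action"] = "delete_repo"  # simulate tampering
tampered_ok = verify(audit_log, key)
print(intact_ok, tampered_ok)
```

Because the key is needed to recompute any MAC, an agent that can write to the log file but cannot read the key cannot rewrite history without verification failing.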

AST and Code Parsers

CocoIndex Code - AST-based lightweight code search engine CLI; saves 70% of tokens
CodeGraphContext - MCP server indexing local code into a graph database
ast-grep - CLI tool for code structural search, lint and rewriting in Rust
Graphify - Turn code and docs into queryable AI knowledge graphs
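What the tools in this section have in common is that they match on syntax trees rather than raw text. As a rough sketch of that idea, the example below uses Python's built-in `ast` module (not any of the tools above) to find every call that passes `shell=True`. The `find_shell_true_calls` helper and the sample source are made up for illustration, and `ast.unparse` needs Python 3.9 or newer.

```python
import ast
from typing import List, Tuple

SOURCE = '''
import subprocess

def deploy(cmd):
    print("deploying")
    subprocess.run(cmd, shell=True)

def build(cmd):
    subprocess.run(cmd)
'''


def find_shell_true_calls(source: str) -> List[Tuple[int, str]]:
    """Return (line number, call text) for every call passing shell=True."""
    tree = ast.parse(source)
    hits = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            is_true = isinstance(kw.value, ast.Constant) and kw.value.value is True
            if kw.arg == "shell" and is_true:
                hits.append((node.lineno, ast.unparse(node)))
    return hits


hits = find_shell_true_calls(SOURCE)
print(hits)
```

A text search for `shell=True` would also match comments and strings. The tree-based match only reports real call sites, and an agent receives a line number and the matched expression instead of a whole file.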

Sandboxes

Nvidia OpenShell - Sandboxed agent runtime with hardware-enforced isolation and policy
forkd - fork() for AI agent microVMs; spawn 100 children in ~100ms
opensandbox - Secure, fast, extensible sandbox runtime for AI agents
Daytona - Elastic infrastructure for running AI-generated code; sub-90ms creation

Working with skills

Skill Reference

Agent Skills - Standardized open format for extending AI agent capabilities via SKILL.md files

Skill Syntax Validation

skill-validator - Validates skill content against the Agent Skill specification
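As a rough idea of what validating a skill involves, the sketch below checks that a SKILL.md file starts with a frontmatter block containing `name` and `description`. It assumes those two fields are required and uses a deliberately naive `key: value` parser. The real specification has more rules, and this is not how skill-validator works internally. The `parse_frontmatter` and `check_skill` helpers and the sample skill are hypothetical.

```python
from typing import Dict, List

SKILL_MD = """---
name: pdf-report
description: Builds a PDF report from a CSV file.
---

# PDF report

Steps the agent should follow go here.
"""


def parse_frontmatter(text: str) -> Dict[str, str]:
    """Parse simple 'key: value' lines between the leading '---' delimiters."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing delimiter found


def check_skill(text: str) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    fields = parse_frontmatter(text)
    required = ("name", "description")  # assumed required fields
    return [f"missing required field: {k}" for k in required if not fields.get(k)]


problems_ok = check_skill(SKILL_MD)
problems_bad = check_skill("# no frontmatter")
print(problems_ok, problems_bad)
```

A real validator would use a proper YAML parser and enforce the naming and length limits from the specification. The point here is only that skills are plain files that can be linted deterministically before an agent loads them.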

Skill Dependency Management

skills (Vercel Labs) - Open agent skills CLI; supports OpenCode, Claude Code, Codex, Cursor and 68+ more

Skill Security Scanners

DefenseClaw - Security governance for agentic AI; scan capabilities, inspect traffic, audit evidence
SkillSpector - Security scanner for AI agent skills; detects vulnerabilities before installation

Knowledge Base

OKF

OKF Ecosystem Tools - Open Knowledge Format ecosystem; tools, spec and docs for AI agent knowledge bases
okflint - Deterministic compliance linter for OKF bundles; profile-based rule enforcement
okf (superops-team) - CLI for git-based OKF knowledge bases; keeps bundles fresh via git hooks

Technology Specifics

Context7 - Up-to-date documentation for LLMs and AI code editors
Use LangChain Docs Programmatically
NVIDIA Agent Skills - Official NVIDIA-verified skills catalog for CUDA-X, AI Blueprints and platform tools

Token Control

rtk - CLI proxy that reduces LLM token consumption by 60-90% on common dev commands; single Rust binary
Headroom Desktop - macOS menu bar app that cuts Claude Code and Codex token costs by ~50%
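One common way to save tokens is to compact command output before it reaches the model. The sketch below illustrates that general idea only. It is not how rtk or Headroom work, and the `compact_output` function and its `max_lines` parameter are made up for this example. It collapses runs of identical lines and, if the result is still long, keeps only the head and tail.

```python
from itertools import groupby


def compact_output(text: str, max_lines: int = 6) -> str:
    """Collapse consecutive duplicate lines, then keep only head and tail."""
    max_lines = max(max_lines, 2)  # need at least one head and one tail line
    collapsed = []
    for line, group in groupby(text.splitlines()):
        n = sum(1 for _ in group)
        collapsed.append(line if n == 1 else f"{line}  [x{n}]")
    if len(collapsed) <= max_lines:
        return "\n".join(collapsed)
    head = max_lines // 2
    tail = max_lines - head
    omitted = len(collapsed) - max_lines
    marker = f"... {omitted} lines omitted ..."
    return "\n".join(collapsed[:head] + [marker] + collapsed[-tail:])


raw = "\n".join(["PASS test_a"] * 50 + ["FAIL test_b", "1 failed, 50 passed"])
compact = compact_output(raw)
print(compact)

long_raw = "\n".join(f"line {i}" for i in range(10))
truncated = compact_output(long_raw, max_lines=4)
print(truncated)
```

The failing test and the summary line survive while the 50 repeated lines shrink to one, which is usually what the agent needs to decide its next step.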

Observability

Langfuse - Open source AI observability platform; LLM evals, metrics, tracing, prompt management
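Claude Code - OTEL - OpenTelemetry (OTEL) telemetry export from Claude Code
SigNoz - Open source, OpenTelemetry-native observability platform; logs, metrics, traces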
Phoenix - AI observability and evaluation by Arize; traces, evals, datasets
agentops - Python SDK for AI agent monitoring, LLM cost tracking, benchmarking
