Understand OpenClaw by Building One - 2: Gear up Your Agent

#ai #learning #agents

All code snippets and working code bases are available at this repo.

Beyond Tools

Tools are part of agents' code asset. But every time you want it to do something new, you have to write code, restart the server, and redeploy.

How to extend its capability & knowledge base without changing its code?

Skills - Dynamic Capabilities Loading

Skills are lazy loaded capabilities at runtime. It isn't something Openclaw invented, but an open standard. Reference the official document for more info.

The pattern is simple: a SKILL.md file with YAML frontmatter for metadata loaded up front and markdown for instructions loaded when needed.

def create_skill_tool(skill_loader):
    # Discover skills and get metadata
    skill_metadata = skill_loader.discover_skills()

    # Build XML description of available skills
    skills_xml = "<skills>\n"
    for meta in skill_metadata:
        skills_xml += f'  <skill name="{meta.name}">{meta.description}</skill>\n'
    skills_xml += "</skills>"

    # Tool loads full content only when called
    @tool(name="skill", description=f"Load skill. {skills_xml}", ...)
    async def skill_tool(skill_name: str, session) -> str:
        return skill_loader.load_skill(skill_name).content

Two Approaches to Skills

Openclaw doesn't implement skills with a separate tool. Instead, it uses system prompt injection with file reading:

Tool Approach: Dedicated skill tool lists available skills and loads content. The tool schema includes skill metadata in its description. Self-contained skill discovery and loading.
System Prompt Approach: Skill metadata (id, name, description) injected into system prompt. Agent uses standard read tool to read SKILL.md. No specialized skill tool needed, simpler tool registry.

Slash Commands: User Control

Sometimes you want direct control, not a conversation. Slash commands let you manage the session itself: list skills, show session info, clear history. The implementation is fairly simple.

class Command(ABC):
    name: str
    aliases: list[str] = []

    @abstractmethod
    async def execute(self, args: str, session) -> str:
        pass

class CommandRegistry:
    async def dispatch(self, input: str, session) -> str | None:
        """Parse and execute a slash command. Returns None if not a command."""
        if not input.startswith("/"):
            return None
        # Parse: /command args
        parts = input[1:].split(None, 1)
        cmd_name, args = parts[0], parts[1] if len(parts) > 1 else ""
        if cmd_name in self._commands:
            return await self._commands[cmd_name].execute(args, session)
        return None

Integration in the main loop — check commands before sending to LLM:

async def run(self) -> None:
    while True:
        user_input = await get_input()
        # Check for slash commands first
        cmd_response = await self.command_registry.dispatch(user_input, self.session)
        if cmd_response is not None:
            self.console.print(cmd_response)
            continue

        # Normal chat
        response = await self.session.chat(user_input)
        self.display_agent_response(response)

Slash Commands and Session History

Slash commands may or may not be added to the session history (message log sent to the LLM). This is a design decision — commands are user controls, not conversation content. Either approach is valid depending on your use case.

Web Tools: Connect to the World

Your agent lives in a terminal. But the information it needs lives on the web.

Two tools bridge this gap:

websearch: Search the web and get structured results
webread: Fetch and extract content from URLs

@tool(...)
async def websearch(query: str, session) -> str:
    results = await provider.search(query)
    output = []
    for i, r in enumerate(results, 1):
        output.append(f"{i}. **{r.title}**\n   {r.url}\n   {r.snippet}")
    return "\n\n".join(output)