BoxAgnts Tool System (7) — Skill Templates, Agent Proxies, and Cron Scheduling

#ai #agents #rust #webassembly

BoxAgnts' tool system already supports the secure execution and invocation of individual tools through three layers: instruction-level isolation in the WASM sandbox, the unified abstraction of the Tool trait, and multi-model adaptation in the Provider layer. But a complete Agent system requires three additional capabilities: knowledge reuse (how to keep results consistent when the AI faces repetitive tasks), task decomposition (how to break through the context window limits of a single conversation), and automated execution (how to trigger tasks on a schedule). These three capabilities are provided by Skill templates, Agent sub-agents, and Cron scheduling, respectively.

Skill Templates: Why You Need a "Tool That Isn't a Tool"

Consider this scenario: a user says "review the Rust code in the src/ directory." The AI needs to execute a sequence of four operations: use file-glob to find all .rs files, use file-read to read each one, use file-grep to check for potential problems, and output the results in a specific format. Each step can be completed with existing tools, but if the AI has to work out the process from scratch every time, the output format and quality will vary from run to run.

Skill solves exactly this problem. A Skill is a Markdown-format prompt template stored in extensions/skills/<name>/SKILL.md. The AI calls the skill-tool tool, passing a skill name; the system returns the expanded prompt text, and the AI executes subsequent operations accordingly.

Taking the code-review skill as an example, its YAML frontmatter defines the metadata:

---
name: code-review
description: Perform deep review of code changes and output a structured report
when_to_use: Use when the user requests code review or quality assessment
tools: read, bash, glob, grep
args:
  - name: target
    description: File or directory to review; leave empty to review git staged changes
    required: false
---

The body contains specific work instructions, covering review dimensions (logical correctness, security, performance, maintainability), output format (Markdown tables), and constraints (read-only, no code modifications).

The Skill execution flow is:

1. The AI receives the user request and determines it matches a Skill's when_to_use condition
2. The AI calls skill-tool, passing skill="code-review" and args="src/" (the user-specified target path)
3. SkillTool reads code-review/SKILL.md, strips the YAML header, and replaces $ARGUMENTS in the body with "src/"
4. SkillTool returns the complete prompt text to the AI
5. The AI follows the instructions in the prompt, calling file-glob, file-read, file-grep, etc., and outputs review results in table format

Key code:

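// tools/src/skill/skill_tool.rs
async fn execute(&self, input: Value, ctx: &ToolContext) -> ToolResult {
    let params: SkillInput = serde_json::from_value(input)?;

    // The name "list" returns the available skills instead of a prompt.
    if params.skill == "list" {
        return list_skills(&search_dirs(ctx)).await;
    }

    // Locate SKILL.md for the requested skill and read its raw contents.
    let (_path, raw) = find_and_read_skill(&params.skill, &search_dirs(ctx)).await?;
    // Drop the YAML frontmatter so that only the instruction body remains.
    let content = strip_frontmatter(&raw);
    // Substitute the caller's arguments; a missing value becomes an empty string.
    let prompt = content.replace("$ARGUMENTS", &params.args.unwrap_or_default());

    ToolResult::success(prompt)
}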
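The helper strip_frontmatter is not shown in the listing above. The following is a minimal, self-contained sketch of how the strip-and-substitute step could work, assuming the frontmatter is delimited by `---` lines with Unix line endings. The body of strip_frontmatter and the wrapper expand_skill are illustrative assumptions, not the BoxAgnts implementation.

```rust
/// Hypothetical helper: drops a leading `---` ... `---` YAML frontmatter block.
/// Assumes Unix line endings; files without frontmatter are returned unchanged.
fn strip_frontmatter(raw: &str) -> &str {
    // Frontmatter must start on the very first line.
    let Some(rest) = raw.strip_prefix("---\n") else {
        return raw;
    };
    // Find the closing delimiter on its own line.
    match rest.find("\n---\n") {
        Some(end) => rest[end + "\n---\n".len()..].trim_start_matches('\n'),
        // No closing delimiter: treat the whole file as the body.
        None => raw,
    }
}

/// Hypothetical wrapper: expands a skill template by substituting the
/// `$ARGUMENTS` placeholder. A missing argument becomes an empty string.
fn expand_skill(raw: &str, args: Option<&str>) -> String {
    strip_frontmatter(raw).replace("$ARGUMENTS", args.unwrap_or_default())
}
```

Calling expand_skill with the contents of a SKILL.md file and Some("src/") yields the instruction body with every $ARGUMENTS occurrence replaced by src/. Note that the result is plain text: nothing is executed at this point.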

The core difference between a Skill and a Tool is who does the executing. A Tool is executed by the BoxAgnts runtime: the system calls tool.execute(), gets the result, and returns it to the AI. A Skill is executed by the AI itself: the system only replaces template variables and returns text, and the AI then decides on and carries out the subsequent tool invocations. This means a Skill defines not only "what to do" but also "how to do it" and "what output format to use", which makes it a higher-level abstraction.

Agent Sub-Agents: Divide and Conquer Complex Tasks

A single AI conversation hits two ceilings when handling large-scale tasks: context window and attention decay.

The context window ceiling is straightforward — if your project has 100 Rust files totaling 50,000 lines of code, the conversation history of reviewing all files will fill a 200K token context within a few turns. Attention decay is a more subtle problem: LLMs show significantly degraded information retrieval for content in the middle of long contexts (the so-called "lost in the middle" problem); by the time it's processing the 10th file, information from the 1st file may already be ignored.

BoxAgnts' Agent sub-agent mechanism targets both of these problems. AgentTool allows the main Agent to create sub-agents, decomposing complex tasks into independent subtasks:

// tools/src/agent/mod.rs
struct AgentInput {
    description: String,         // subtask description
    prompt: String,              // complete instructions for the subtask
    tools: Option<Vec<String>>,  // tools available to the sub-agent (default: all minus AgentTool)
    max_turns: Option<u32>,     // max turns, default 10
    model: Option<String>,      // model override (sub-agent can use a different model)
    run_in_background: bool,    // whether to execute asynchronously in the background
}

Sub-Agent execution modes are divided into synchronous and asynchronous:

Synchronous mode (run_in_background = false): The main Agent blocks after invocation, waiting for the sub-agent to complete its task and return results. Suitable for scenarios where the main Agent needs subtask results to continue.

Asynchronous mode (run_in_background = true): The main Agent immediately receives an agent_id; the sub-agent runs independently in the background. The main Agent can continue processing other tasks and later query results via agent_id. Suitable for scenarios with multiple independent subtasks to process in parallel.

A practical example: a user asks to "comprehensively review this project."

Main Agent:
  │
  ├── Create sub-Agent A: "Review backend/src/ Rust code, focusing on logic and security"
  │     └── Sub-Agent A: Independent Query Loop, using file-read/file-grep/bash
  │     └── Returns: Markdown table listing 15 issues (3 critical, 7 medium, 5 minor)
  │
  ├── Create sub-Agent B: "Review frontend/src/ Vue components, focusing on performance and accessibility"
  │     └── Sub-Agent B: Independent Query Loop
  │     └── Returns: Markdown table listing 8 issues
  │
  └── Aggregate results of A and B, output comprehensive report

Each sub-agent has an independent context window (no shared conversation history), so there's no cross-contamination. Multiple sub-agents can execute in parallel (asynchronous mode); total time depends on the slowest one.
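The two execution modes can be sketched with plain threads. This is an illustration under stated assumptions, not the BoxAgnts implementation: AgentPool, run_sub_agent, and the numeric ids are hypothetical names, and a real sub-agent would drive its own query loop against a model rather than format a string.

```rust
use std::collections::HashMap;
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for one sub-agent run; a real sub-agent would run
/// its own query loop with a fresh message history.
fn run_sub_agent(prompt: &str) -> String {
    format!("report for: {prompt}")
}

/// Tracks background sub-agents by id (illustrative only).
#[derive(Default)]
struct AgentPool {
    next_id: u32,
    running: HashMap<u32, JoinHandle<String>>,
}

impl AgentPool {
    /// Asynchronous mode: start the sub-agent and return its id immediately.
    fn spawn_background(&mut self, prompt: String) -> u32 {
        let id = self.next_id;
        self.next_id += 1;
        // Each thread owns its prompt, so no state is shared between sub-agents.
        self.running
            .insert(id, thread::spawn(move || run_sub_agent(&prompt)));
        id
    }

    /// Blocks until the given sub-agent finishes. Returns None for unknown
    /// ids or if the sub-agent panicked.
    fn wait(&mut self, id: u32) -> Option<String> {
        self.running.remove(&id)?.join().ok()
    }

    /// Synchronous mode: spawn and wait in one step.
    fn run_blocking(&mut self, prompt: String) -> Option<String> {
        let id = self.spawn_background(prompt);
        self.wait(id)
    }
}
```

With two calls to spawn_background followed by two calls to wait, both sub-agents run concurrently and the caller is blocked only as long as the slower of the two.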

Recursion safety is an important constraint. Sub-agents' tool lists exclude AgentTool itself by default — preventing infinite recursion of creating sub-sub-agents. If multi-level delegation is genuinely needed (main Agent → sub-agent → sub-sub-agent), the AgentTool can be explicitly included in the sub-agent's tools list.
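A minimal sketch of that default, assuming tools are identified by name. The registry name "agent" for AgentTool and the function resolve_sub_agent_tools are hypothetical, and dropping unregistered names from an explicit list is an added assumption rather than documented behavior.

```rust
/// Assumed registry name of AgentTool; the real identifier may differ.
const AGENT_TOOL: &str = "agent";

/// Resolves the tool names a sub-agent may use.
/// - `requested = None`: every registered tool except AgentTool.
/// - `requested = Some(list)`: the listed tools that are actually registered,
///   which is the only way AgentTool can reach a sub-agent.
fn resolve_sub_agent_tools(registered: &[&str], requested: Option<&[&str]>) -> Vec<String> {
    match requested {
        None => registered
            .iter()
            .filter(|name| **name != AGENT_TOOL)
            .map(|name| name.to_string())
            .collect(),
        Some(list) => list
            .iter()
            .filter(|name| registered.contains(*name))
            .map(|name| name.to_string())
            .collect(),
    }
}
```

Making exclusion the default means that multi-level delegation has to be requested deliberately for each sub-agent instead of being inherited by accident.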

Context Compression

Long-running, multi-tool, multi-turn Agent conversations can produce massive message histories. Even if each tool invocation's result is small (e.g., file-read returning the content of one function), after 50 rounds the total token count becomes large, squeezing the model's reasoning space.

BoxAgnts' AutoCompactState handles this problem. It monitors the total size of message history and accumulated tool results, automatically triggering compression when approaching the model's context limit:

Detected context pressure (total message tokens approaching 80% of context_window)
  │
  ▼
1. Filter compressible messages
   - Prioritize compressing old tool_result ContentBlocks (tool execution results)
   - Preserve the most recent N rounds of conversation in full
   - Preserve all user and assistant messages (don't compress conversations)
  │
  ▼
2. Generate summaries
   - Old tool_results are replaced with: "[Earlier tool result from file-read: read src/main.rs, returned 42 lines of Rust code]"
  │
  ▼
3. Recalculate token count
   - If still over the limit, expand the range of compressed rounds
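The flow above can be sketched as follows. This is a simplified illustration, not AutoCompactState itself: the Message enum, the estimate of roughly 4 characters per token, and the wording of the summary are assumptions. It also counts individual messages instead of conversation rounds and leaves out the final step of widening the compressed range.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Message {
    User(String),
    Assistant(String),
    ToolResult { tool: String, content: String },
}

impl Message {
    fn text(&self) -> &str {
        match self {
            Message::User(t) | Message::Assistant(t) => t,
            Message::ToolResult { content, .. } => content,
        }
    }
}

/// Rough token estimate (assumption: about 4 characters per token).
fn estimate_tokens(history: &[Message]) -> usize {
    history.iter().map(|m| m.text().chars().count() / 4).sum()
}

/// Replaces old tool results with one-line summaries, oldest first, until the
/// history fits under 80% of `context_window`. The last `keep_recent` messages
/// and all user/assistant messages are left untouched.
/// Returns the number of tool results that were compacted.
fn compact(history: &mut [Message], context_window: usize, keep_recent: usize) -> usize {
    let limit = context_window * 8 / 10;
    let compressible = history.len().saturating_sub(keep_recent);
    let mut compacted = 0;
    for i in 0..compressible {
        if estimate_tokens(history) <= limit {
            break;
        }
        if let Message::ToolResult { tool, content } = &mut history[i] {
            let summary = format!(
                "[Earlier tool result from {tool}: {} characters omitted]",
                content.chars().count()
            );
            // Only replace when the summary is actually shorter.
            if summary.len() < content.len() {
                *content = summary;
                compacted += 1;
            }
        }
    }
    compacted
}
```

Working oldest first and stopping as soon as the estimate drops below the threshold keeps as many tool results intact as the limit allows.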

A separate configuration item, tool_result_budget, defaults to 50,000 characters. When the cumulative character count of all tool results exceeds this value, the earliest tool_result is truncated and replaced.
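A character budget of this kind could be enforced along the following lines. The function name, the placeholder text, and the choice to replace a whole result at a time are assumptions made for illustration.

```rust
/// Placeholder text for a dropped result (assumed wording).
const TRUNCATED: &str = "[tool result truncated]";

/// Enforces a cumulative character budget over tool results, oldest first.
/// Returns how many results were replaced by the placeholder.
fn enforce_tool_result_budget(results: &mut [String], budget: usize) -> usize {
    let mut total: usize = results.iter().map(|r| r.chars().count()).sum();
    let mut replaced = 0;
    for r in results.iter_mut() {
        if total <= budget {
            break;
        }
        let len = r.chars().count();
        // Replacing a result shorter than the placeholder would not help.
        if len > TRUNCATED.len() {
            total = total - len + TRUNCATED.len();
            *r = TRUNCATED.to_string();
            replaced += 1;
        }
    }
    replaced
}
```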

The trade-off of the compression strategy is: tool results may contain details the AI needs for subsequent decisions (e.g., reading a specific field from a configuration file), and summarization loses this information. But for typical usage patterns — where the most recent few rounds' tool results remain the most relevant — this trade-off is acceptable.

Cron Scheduling

The final dimension of tool execution is time. Not all AI tasks are triggered by users in real time — scenarios like "generate today's code quality report at 9 AM every morning" or "check server logs for anomalies every 6 hours" require scheduled execution.

BoxAgnts' Cron system is built on tokio-cron-scheduler:

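use tokio_cron_scheduler::{Job, JobSchedulerError};

pub async fn schedule_job(state: AppState, job_cfg: JobConfig) -> Result<(), JobSchedulerError> {
    // Job::new_async expects a closure that returns a pinned, boxed future.
    let cron_job = Job::new_async(job_cfg.cron.as_str(), move |_uuid, _lock| {
        Box::pin(async move {
            // On trigger:
            // 1. Create a new AI conversation session
            // 2. Inject job_cfg.prompt as the user message
            // 3. Execute the complete Agent loop (identical to user-triggered conversations)
            // 4. Record JobLog { id, executed_at, success, message, error }
        })
    })?;
    // Assumption: the shared JobScheduler is stored in AppState as `scheduler`.
    state.scheduler.add(cron_job).await?;
    Ok(())
}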

Each Job's configuration includes:

{
  "name": "Daily Code Quality Report",
  "cron": "0 9 * * *",
  "prompt": "Check code changes in the src/ directory and generate today's quality report",
  "model": "claude-sonnet-4-5",
  "timeout": 300,
  "enabled": true
}

Key design points:

Timeout protection: Each Job has an independent timeout setting. If the AI conversation doesn't complete within the configured timeout (300 seconds, i.e. 5 minutes, in the example above), the system cancels that execution and logs a timeout entry. This prevents a runaway Agent from consuming all resources.
Scheduler persistence: Job configurations and recent execution logs are stored in SQLite. All Jobs are automatically reloaded after a service restart.
Execution independence: Each Cron trigger creates an independent conversation session with no shared message history. This is consistent with the Agent sub-agent isolation model, so context from one run cannot pollute another.
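The timeout behavior can be sketched with the standard library alone. This is an illustration under assumptions: the JobLog field types and the function run_with_timeout are hypothetical, and because a plain thread cannot be cancelled, the worker is merely abandoned on timeout here, whereas an async runtime could drop the task.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, SystemTime};

/// Mirrors the JobLog fields named above; the field types are assumptions.
#[derive(Debug)]
struct JobLog {
    id: u64,
    executed_at: SystemTime,
    success: bool,
    message: Option<String>,
    error: Option<String>,
}

/// Runs `job` on a worker thread and stops waiting after `timeout`.
fn run_with_timeout<F>(id: u64, timeout: Duration, job: F) -> JobLog
where
    F: FnOnce() -> Result<String, String> + Send + 'static,
{
    let executed_at = SystemTime::now();
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that error.
        let _ = tx.send(job());
    });
    let (success, message, error) = match rx.recv_timeout(timeout) {
        Ok(Ok(output)) => (true, Some(output), None),
        Ok(Err(e)) => (false, None, Some(e)),
        Err(mpsc::RecvTimeoutError::Timeout) => (false, None, Some("timeout".to_string())),
        // The sender was dropped without sending, which means the job panicked.
        Err(mpsc::RecvTimeoutError::Disconnected) => {
            (false, None, Some("job panicked".to_string()))
        }
    };
    JobLog { id, executed_at, success, message, error }
}
```

Every outcome, including a timeout or a panic, produces a JobLog entry, so a failed run still leaves a record.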

Permission Filtering

Different Agents may need different permission levels. BoxAgnts supports filtering by tool permission level:

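pub async fn filter_tools_for_agent(
    tools: Arc<Vec<Arc<dyn Tool>>>,
    access: &str,
) -> Arc<Vec<Arc<dyn Tool>>> {
    match access {
        "full" => tools,
        "read-only" => Arc::new(
            tools
                .iter()
                // Keep read-only and permission-free tools, plus ask-user-question.
                .filter(|t| {
                    matches!(
                        t.permission_level(),
                        PermissionLevel::ReadOnly | PermissionLevel::None
                    ) || t.name() == "ask-user-question"
                })
                // Clone the Arc handles (cheap) so a new Vec can be collected.
                .cloned()
                .collect(),
        ),
        // Unknown access values fall back to the full tool set.
        _ => tools,
    }
}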

This enables creating "read-only Agents" — they can use read-only tools like file-read, file-glob, file-grep, web-fetch, but cannot use write or execute tools like file-write, file-edit, bash. For an Agent that only does code review, this restriction is natural.

Summary

BoxAgnts' advanced orchestration layer consists of three mechanisms, each addressing a key gap in Agent systems:

Skill templates solve the knowledge reuse problem. Best practices for "how to do something" are solidified as Markdown prompt templates; the AI calls skill-tool to get the expanded instructions, then autonomously executes subsequent operations. The core difference from Tool is the execution subject — Tools are executed by the system, Skills are executed by the AI following instructions.
Agent sub-agents solve the context window and attention decay problems. The main Agent creates sub-agents to handle independent subtasks, each with its own context window, avoiding the "lost in the middle" effect in long conversations. Synchronous mode is used when results are needed before continuing; asynchronous mode is used for parallel processing. AgentTool is excluded from sub-agents by default to prevent infinite recursion.
Cron scheduling solves the temporal automation problem. Each Job has independent timeout protection, SQLite persistence, and isolated conversation sessions. Even if one scheduled task goes rogue, it won't affect other tasks or the main conversation.

AutoCompactState's context compression and PermissionLevel's permission filtering serve as infrastructure supporting these three mechanisms: the former automatically compresses old tool results when message history approaches token limits; the latter allows different Agents to have different tool permission levels.