Guyoung Studio

BoxAgnts Tool System (5) — WASM Tool Development: From Hello World to Production Deployment

WASM sandboxing provides BoxAgnts with instruction-level security isolation, while the tool registration chain enables zero-configuration auto-discovery. On top of these two foundations, developers only need to focus on one thing: writing programs that follow standard CLI conventions. This article is hands-on: it walks through building a complete base64 encoding tool, then covers compilation, deployment, testing, and some common pitfalls.


Why Base64 as the Example

Base64 encoding/decoding makes a good example tool. The logic is simple enough not to distract, yet it covers the typical characteristics of an AI Agent tool: multiple input parameters (mode, input source, output target), error handling (invalid base64 strings), file I/O, and strict output format requirements. The patterns used here carry over to most other WASM tools.

The complete example code is located in the BoxAgnts repository at examples/tool-sample-base64-component/.


Cargo.toml Configuration

[package]
name = "tool-sample-base64-component"
version = "1.0.0"
edition = "2021"

[[bin]]
name = "base64"
path = "src/main.rs"

[dependencies]
clap = { version = "4", features = ["derive", "string"] }
base64 = "0.22"
serde_json = "1"

The dependencies are minimal: clap handles CLI argument parsing, base64 handles encoding/decoding logic, and serde_json handles structured output. There are no WASM-specific dependencies — Wasmtime provides the runtime environment on the host side; the WASM tool itself doesn't need to know it's running in a sandbox.

The WASM compilation target needs to be specified in .cargo/config.toml (or via the command-line --target flag):

[build]
target = "wasm32-wasip2"

Core Code

The main function structure is shown below. The helpers validate_args and read_input are called but not listed here (see the repository for the full code):

use clap::{Parser, ValueEnum};
use base64::{engine::general_purpose, Engine as _};
use serde_json::json;

#[derive(Copy, Clone, Debug, PartialEq, ValueEnum)]
enum Mode { Encode, Decode }

#[derive(Copy, Clone, Debug, PartialEq, ValueEnum)]
enum Alphabet { Standard, UrlSafe }

#[derive(Parser, Debug)]
#[command(name = "base64")]
#[command(version)]
#[command(about = "Strict Base64 encode/decode tool")]
struct Args {
    #[arg(long, value_enum, required = true)]
    mode: Mode,

    #[arg(long, conflicts_with = "file_path")]
    input: Option<String>,

    #[arg(long, conflicts_with = "input")]
    file_path: Option<String>,

    #[arg(long)]
    output_file: Option<String>,

    #[arg(long, value_enum, default_value = "standard")]
    alphabet: Alphabet,

    #[arg(long, default_value_t = false)]
    no_padding: bool,
}

// Print a JSON error object to stderr and exit with a non-zero status.
// json! escapes quotes and newlines, so the message cannot break the JSON.
fn fail(message: String) -> ! {
    eprintln!("{}", json!({ "error": true, "content": message }));
    std::process::exit(1);
}

fn main() {
    let args = Args::parse();

    // validate_args and read_input are defined in the full source (see the repository).
    if let Err(e) = validate_args(&args) {
        fail(e.to_string());
    }
    let input_bytes = read_input(&args).unwrap_or_else(|e| fail(e.to_string()));

    // All four presets share the concrete type GeneralPurpose, so no trait object is needed.
    let engine: &general_purpose::GeneralPurpose = match (&args.alphabet, args.no_padding) {
        (Alphabet::Standard, false) => &general_purpose::STANDARD,
        (Alphabet::Standard, true) => &general_purpose::STANDARD_NO_PAD,
        (Alphabet::UrlSafe, false) => &general_purpose::URL_SAFE,
        (Alphabet::UrlSafe, true) => &general_purpose::URL_SAFE_NO_PAD,
    };

    let result = match args.mode {
        Mode::Encode => engine.encode(&input_bytes),
        Mode::Decode => {
            // Report non-UTF-8 input instead of silently decoding an empty string.
            let input_str = std::str::from_utf8(&input_bytes)
                .unwrap_or_else(|_| fail("Input is not valid UTF-8".to_string()));
            match engine.decode(input_str.trim()) {
                // Decoded bytes that are not valid UTF-8 are replaced lossily.
                Ok(bytes) => String::from_utf8_lossy(&bytes).into_owned(),
                Err(e) => fail(format!("Invalid base64: {}", e)),
            }
        }
    };

    if let Some(output_file) = &args.output_file {
        if let Err(e) = std::fs::write(output_file, &result) {
            fail(format!("Write failed: {}", e));
        }
        let message = format!("Written to {}", output_file);
        println!("{}", json!({ "error": false, "content": message }));
    } else {
        println!("{}", json!({ "error": false, "content": result }));
    }
}

Several implementation details worth noting:

JSON output format. WASM tools return JSON objects via stdout, with the convention {"error": bool, "content": "..."}. BoxAgnts' WasmTool::execute() automatically parses this JSON and maps it to ToolResult. If stdout is not valid JSON, the entire text is treated as the content of a successful result.

Parameter conflict handling. input and file_path are mutually exclusive — conflicts_with lets clap reject both appearing simultaneously at parse time, rather than deferring the check to business logic.

Error output to stderr. When a tool fails, its error output should go to stderr, not stdout. BoxAgnts captures both streams separately: stderr content is used for error reporting, stdout for tool results.
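The output convention only holds if the content value is escaped correctly, since a quote or newline in the payload would otherwise produce invalid JSON. As a dependency-free illustration of the escaping that serde_json does for you, here is a minimal sketch; escape_json and tool_result are hypothetical helper names, not part of BoxAgnts.

```rust
/// Escape a string so it can sit inside a JSON string literal.
fn escape_json(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for ch in raw.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters must use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the {"error": ..., "content": ...} object (hypothetical helper).
fn tool_result(error: bool, content: &str) -> String {
    format!(r#"{{"error":{},"content":"{}"}}"#, error, escape_json(content))
}

fn main() {
    // A payload with quotes and a newline still yields one valid JSON object.
    println!("{}", tool_result(false, "say \"hi\"\n"));
}
```

In a real tool it is simpler to let serde_json build the object; the sketch only shows what goes wrong when a payload is interpolated into a format string unescaped.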


Compilation and Deployment

# Compile
cargo build --target wasm32-wasip2 --release

# Artifact location
ls target/wasm32-wasip2/release/base64.wasm

After compilation, copy directly to the extensions directory:

cp target/wasm32-wasip2/release/base64.wasm \
   app/extensions/tools/base64-component.wasm

The filesystem change is captured by the notify event watcher, which triggers the hot-reload flow: sandbox execution of --help, output parsing, ToolSpec generation, and registration in the global tool table. The total latency from file copy to tool availability is typically within 100 milliseconds, most of which is spent by Wasmtime compiling the WASM module into its .cwasm cache.
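To make the output-parsing step more concrete, here is a rough sketch of pulling long option names out of clap-style --help text. It is only an illustration under assumptions: extract_long_options is a hypothetical function, and the parser and ToolSpec fields that BoxAgnts actually uses may look quite different.

```rust
/// Collect long option names (without the leading "--") listed under an
/// "Options:" heading in CLI help text.
fn extract_long_options(help: &str) -> Vec<String> {
    let mut in_options = false;
    let mut names = Vec::new();
    for line in help.lines() {
        let trimmed = line.trim();
        if trimmed.is_empty() {
            continue;
        }
        if !line.starts_with(char::is_whitespace) {
            // Unindented lines are headings such as "Usage:" or "Options:".
            in_options = trimmed == "Options:";
            continue;
        }
        if !in_options {
            continue;
        }
        // Take the first token that starts with "--" on each option line.
        if let Some(token) = trimmed.split_whitespace().find(|t| t.starts_with("--")) {
            let name = token
                .trim_start_matches("--")
                .split(|c: char| !(c.is_ascii_alphanumeric() || c == '-' || c == '_'))
                .next()
                .unwrap_or("");
            if !name.is_empty() {
                names.push(name.to_string());
            }
        }
    }
    names
}

fn main() {
    let help = "Usage: base64 [OPTIONS] --mode <MODE>\n\nOptions:\n      --mode <MODE>\n  -h, --help  Print help\n";
    println!("{:?}", extract_long_options(help));
}
```

A description line that wraps and happens to contain a "--" token would be misread by this sketch, so a production parser would need to track column positions as well.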


Cross-Language Development

Although the example uses Rust, WASM tools can be written in any language that supports wasm32-wasi. Here's a comparison using Go to write a simple file-read tool:

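// Go version of file-read (compiled with TinyGo)
package main

import (
    "encoding/json"
    "fmt"
    "os"
)

// result mirrors the {"error": bool, "content": "..."} convention.
type result struct {
    Error   bool   `json:"error"`
    Content string `json:"content"`
}

// emit writes one JSON object to w. json.Marshal escapes quotes,
// backslashes, and control characters in the content.
func emit(w *os.File, isError bool, content string) {
    out, _ := json.Marshal(result{Error: isError, Content: content})
    fmt.Fprintln(w, string(out))
}

func main() {
    if len(os.Args) < 2 {
        emit(os.Stderr, true, "Missing file path")
        os.Exit(1)
    }
    data, err := os.ReadFile(os.Args[1])
    if err != nil {
        emit(os.Stderr, true, err.Error())
        os.Exit(1)
    }
    emit(os.Stdout, false, string(data))
}

Compile it with TinyGo:

# Compile (TinyGo names its WASI target "wasi"; newer releases also accept "wasip1")
tinygo build -target=wasi -o file-read.wasm main.go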

The Go and Rust versions of file-read behave identically — they output the same JSON format, run under the same sandbox constraints, and are called by the same WasmTool::execute(). This is the core value of WASM as a tool distribution format: define a simple output convention, and different language implementations are automatically compatible.


Common Issues

File I/O Paths

The filesystem seen by WASM tools is not the host's complete filesystem. If RunOption.work_dir is set to /home/user/project, then ./src/main.rs inside the WASM tool accesses the host's /home/user/project/src/main.rs. Attempting to access /etc/passwd will fail because it falls outside the mapped directory scope.
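The mapping rule can be modelled with a purely lexical check. The sketch below is an assumption-laden illustration of the behaviour described above, not the preopen mechanism Wasmtime actually uses and not BoxAgnts code; resolve_in_work_dir is a hypothetical name.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolve `guest_path` relative to `work_dir`, returning `None` when the
/// result would fall outside `work_dir`.
fn resolve_in_work_dir(work_dir: &Path, guest_path: &str) -> Option<PathBuf> {
    let mut resolved = work_dir.to_path_buf();
    let mut depth = 0usize; // number of components below work_dir
    for component in Path::new(guest_path).components() {
        match component {
            Component::CurDir => {}
            Component::Normal(part) => {
                resolved.push(part);
                depth += 1;
            }
            Component::ParentDir => {
                if depth == 0 {
                    return None; // ".." would climb above work_dir
                }
                resolved.pop();
                depth -= 1;
            }
            // Absolute paths and Windows prefixes are outside the mapped directory.
            Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    Some(resolved)
}

fn main() {
    let work_dir = Path::new("/home/user/project");
    println!("{:?}", resolve_in_work_dir(work_dir, "./src/main.rs"));
    println!("{:?}", resolve_in_work_dir(work_dir, "/etc/passwd"));
}
```

The check is lexical only and does not follow symlinks, which is one reason real sandboxes enforce this at the runtime level rather than in tool code.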

stdout Buffering

Whether WASM stdout is line-buffered or fully buffered depends on the WASI implementation. If a tool writes JSON and exits without explicitly flushing, the final chunk of output may be lost. For single-shot outputs of small JSON this is typically not a problem, but if a tool produces large output (e.g., file-read reading a 100MB file), consider segmenting the output or using a streaming protocol.
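One low-cost precaution is to flush explicitly before the tool exits. The following is a minimal sketch using only the Rust standard library; write_result is a hypothetical helper name, and whether a flush is strictly required depends on the runtime, as noted above.

```rust
use std::io::{self, Write};

/// Write the whole payload plus a newline, then flush, so nothing is left
/// in a buffer when the process exits.
fn write_result<W: Write>(mut out: W, payload: &str) -> io::Result<()> {
    out.write_all(payload.as_bytes())?;
    out.write_all(b"\n")?;
    out.flush()
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    // Locking once avoids re-acquiring the stdout lock for every write.
    write_result(stdout.lock(), r#"{"error":false,"content":"aGVsbG8="}"#)
}
```

Taking a generic writer instead of stdout directly keeps the function testable with an in-memory buffer.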

Encoding Issues

println! in the WASI environment outputs UTF-8 by default. If a tool needs to output non-UTF-8 encoded text (e.g., reading a GBK-encoded file), encoding must be handled manually, and the result should be Base64-wrapped in the content field.
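As a rough sketch of the Base64-wrapping idea, the code below passes valid UTF-8 through unchanged and wraps anything else. The encoder is hand-rolled only to keep the example self-contained (a real tool would use the base64 crate already listed in Cargo.toml), and content_for with its boolean flag is a hypothetical design, not a BoxAgnts convention.

```rust
const TABLE: &[u8; 64] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard-alphabet Base64 with padding.
fn encode_base64(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into a 24-bit group, padding with zeros.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(TABLE[(n >> 18) as usize & 63] as char);
        out.push(TABLE[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { TABLE[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { TABLE[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Return the text as-is when it is valid UTF-8; otherwise Base64-wrap the
/// raw bytes. The boolean reports whether wrapping happened.
fn content_for(bytes: &[u8]) -> (String, bool) {
    match std::str::from_utf8(bytes) {
        Ok(text) => (text.to_string(), false),
        Err(_) => (encode_base64(bytes), true),
    }
}

fn main() {
    // 0xD6 0xD0 is a GBK-encoded character and is not valid UTF-8.
    let (content, wrapped) = content_for(&[0xD6, 0xD0]);
    println!("content={} wrapped={}", content, wrapped);
}
```

How the caller learns that the content is wrapped is left open here; carrying the flag in the optional metadata field is one possibility, but that is an assumption.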


Testing Tools

During development, you can test WASM tools directly using BoxAgnts' CLI, without going through an AI conversation:

# Simulate tool registration — view the ToolSpec parsed by the system
boxagnts tool:validate path/to/tool.wasm

# Simulate tool execution — pass JSON parameters
boxagnts tool:execute path/to/tool.wasm '{"mode":"encode","input":"hello"}'

This is far faster than testing through AI conversations and lets you directly see Wasmtime-level error messages (if sandbox startup fails).


Tools vs. Skills

WASM tools are suitable for deterministic computational tasks: encoding/decoding, file operations, database queries, regex matching. But if a task's core isn't "computation" but "guiding the AI's thought process" — such as code reviews, architecture suggestions, writing guidance — it's not a good fit for a WASM tool. These scenarios should use Skills, which are pure Markdown prompt templates loaded by the system and injected into the AI's context; the AI then makes autonomous decisions and executes actions accordingly.


Summary

BoxAgnts' WASM tool development workflow removes complexity: developers don't need to learn any BoxAgnts-specific APIs or configuration formats. They only need to follow two conventions:

  1. --help output must contain standard CLI help blocks (Usage:, Options:, Arguments:, or Commands:) for the system to auto-extract the Schema.

  2. stdout must carry a JSON object of the form {"error": bool, "content": "..."}, with an optional metadata field for passing structured rendering information to the frontend.

Beyond these, the tool code is entirely an ordinary CLI program. This is a watershed in developer experience — traditional Agent frameworks require developers to understand the framework's Tool base class, Schema declaration format, and callback registration patterns. BoxAgnts replaces all of these with "just write proper --help output."

Cross-language support is another unique advantage. Rust, Go, Python, C — any language that can compile to wasm32-wasi can be used to develop BoxAgnts tools. The compiled .wasm file is placed in the extensions directory, and the hot-reload mechanism automatically handles the remaining registration and caching steps.
