DEV Community

JoeStrout

MiniClaw: A Tiny LLM Agent for Mini Micro

Agents are all the rage these days. Claude Code, specialized for coding, was one of the first and is perhaps still the most heavily used. Then OpenClaw burst onto the scene, able to do all sorts of general computer-use things, and caused a shortage of Mac Minis. More recently, Hermes Agent has become a common favorite, with over 93 thousand stars on GitHub.

All of these agents work in fundamentally the same way. A "harness" serves as the main program for an LLM: it controls the LLM's context, so that it always knows what it needs to know, and provides tools the LLM can use, so that it can always do what it needs to do.

I covered accessing LLMs from Mini Micro back in 2022, and again in 2023, so why don't we take it to the logical next step, and create an agent in Mini Micro?

MiniClaw logo

Introducing MiniClaw

Yesterday I sat down and created MiniClaw. It consists mainly of three files:

  1. instructions.txt: these are the instructions to the LLM
  2. agent.ms: the main program, which invokes the LLM and manages its context
  3. tools.ms: code for the tools the agent can use to read, write, and manipulate files

So what can it do? Well, MiniClaw can read any file accessible within Mini Micro, which means the /sys disk, plus whatever minidisk or folder you have mounted as /usr and /usr2. It can also write files (only) under /usr/workspace. So, similar to Claude Code or most other agents, you can use it to create and modify pretty much any kind of text file. Or you can just ask it to explain and summarize things for you. For example, I asked it "tell me about the pictures on the sys disk", and it wrote out a nice summary:

Screen shot of /sys/pics summary

On another occasion, I asked it to create a .md (Markdown) file describing all the demos found in /sys/demo. But then, in a later session, I decided that the document it created was too wordy, so I asked it to shorten it:

Screen shot of agent shortening DEMO_GUIDE.md

The gray text gives us some hints as to what the code is doing: it shows when we call the LLM, how much data we get back as a response, and what tool the LLM is using (and why).

Some tasks, like this one, take only a couple of tool calls. Others take more. The LLM will keep invoking tools, occasionally printing some messages for us about its work, until it figures the task is complete (or that it's unable to complete it).

How it works

The complete agent.ms file is only 263 lines long, divided into 14 functions. That's a bit too long to go over line by line here, but we'll hit the highlights, and I encourage you to check the source file for details.

The big picture is this:

  • Each time we call the LLM, we give it our instructions (always the same), and the prompt (varies each turn).
  • The prompt includes messages from the user, previous tool calls made by the agent, and the results of those calls -- all this stuff is called the "history". It also includes the current task, so the LLM is clear on what it's supposed to be doing.
  • The LLM gives us a response in JSON format: either a tool call, a question for the user, an intermediate message, or a final message (indicating it's done).
  • We run any tool calls the LLM has asked for, and append the call and results to the history.
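For concreteness, here's the sort of JSON the LLM might send back when it wants to use a tool. The exact field names below are illustrative, inferred from the tool descriptions and the response-handling code; check instructions.txt for the real format:

```
{
    "type": "tool_call",
    "name": "head_file",
    "arguments": { "path": "/usr/workspace/DEMO_GUIDE.md", "lines": 20 }
}
```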

And that's pretty much it. The main loop looks like this:

    while true
        if not currentUserInput then
            text.color = color.gray; print "==> ", ""
            text.color = "#00AA00"
            globals.currentUserInput = input
            text.color = color.gray
        end if
        response = getResponse
        respData = json.parse(response)
        if respData == null then
            addToHistory("**IMPORTANT:** You must format your response as a JSON object!")
        else
            handleResponse respData
        end if
    end while

currentUserInput is the instruction the agent is working on; it's empty at the start of the run, or when the agent says it's finished. So then we get more input from the user. (Half the code above is just fiddling with the text color to be fancy.)

Then we call getResponse to get the LLM's response to the current context (instructions plus prompt as described above), and try to parse it as JSON. Occasionally the LLM will forget to format its response as JSON; if that happens, we just add a stern reminder to the history (so the LLM will see it) and try again. Otherwise, we call handleResponse:

handleResponse = function(data)
    addToHistory data
    if data.type == "message" then
        printNicely data.content
        globals.lastMessage = data.content
    else if data.type == "question" then
        printNicely data.content
        addToHistory ["", "--- User response ---", input, "--- End user response ---"].join(EOL)
    else if data.type == "finish" then
        printNicely data.content
        globals.currentUserInput = ""
    else if data.type == "tool_call" then
        handleToolCall data
    else
        addToHistory "ERROR: invalid response type """ + data.type +
          """; must be ""message"", ""question"", ""finish"", or ""tool_call""."
    end if  
end function

I've removed some of the error handling and text-coloring above for clarity, but this is the gist of it. We just switch based on the type of response we got from the LLM; it should be one of the four types we put in the instructions. Again, note that when we want to give the LLM more information -- like the user's response to a question -- we just add it to the history.

Let's talk about that addToHistory function for a moment. Its job is mainly just to append the given string(s) to a list of strings, so they can be included in the context. But for any agent, context management is very important! Too much context burns through tokens and degrades LLM performance. So, our addToHistory function limits how much history it remembers.

addToHistory = function(entry)
    history.push entry
    globals.historyLen += entry.len
    while historyLen > 4096 and history.len > 8
        globals.historyLen -= history[0].len
        history.pull  // discard element 0
    end while
end function

Probably the next most important function is promptInput, which calculates the "prompt" part of the context -- the part that varies from turn to turn.

promptInput = function
    lines = []
    lines.push "# Task/User Input"
    lines.push currentUserInput
    lines.push ""
    lines.push "# Current State"
    lines.push "Date/time: " + dateTime.now
    if history then
        lines.push ""
        lines.push "# Recent history"
        lines += history
    end if
    return lines.join(EOL)
end function

Simple, right? It's just composing a bit of Markdown calling out the current user input, the current state (which, for this version of MiniClaw, is only the date/time), and the history. This is appended to the static instructions and sent to the LLM.
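So the full context each turn is, conceptually, just the static instructions with this prompt appended -- something like the following sketch (where `instructions` is assumed to hold the text of instructions.txt; the actual variable names in agent.ms may differ):

```
// Conceptual sketch only; see getResponse in agent.ms for the real code.
context = instructions + EOL + EOL + promptInput
// getResponse then sends `context` to the LLM and returns its reply text.
```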

The Tools

The functions above are going to be pretty standard for any agent. What determines what the agent actually does are the tools and instructions provided to it. In MiniClaw, the tools are separated out into their own file, tools.ms.

This file begins with a few small helper functions: err, errMissingArg, and okResult, which generate result maps to be returned to the LLM; plus resolvePath and isWriteable, which help the tool code deal with files properly. Then it has a function for each tool:

  • list_files
  • read_file
  • head_file
  • tail_file
  • write_file
  • delete_file
  • move_file
  • make_dir

Each of these takes a map containing arguments (which we've gotten by parsing the JSON from the LLM), does its thing if it can, and then returns a map of results -- usually from one of the err functions, or from okResult, with details (like the path of the affected file) added in. This lets the LLM know whether its attempt to use a tool was successful.
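To make that pattern concrete, the helpers probably look something like this. This is a sketch -- the field names "ok" and "error" are my assumptions; see tools.ms for the real ones:

```
// Illustrative sketches of the result-map helpers; the real ones are in tools.ms.
err = function(msg)
    return {"ok": false, "error": msg}
end function

errMissingArg = function(argName)
    return err("Missing required argument: " + argName)
end function

okResult = function
    return {"ok": true}
end function
```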

As an example, let's look at head_file, whose job is to return the first n lines of a text file. This tool is described in the instructions file as:

{
    "name": "head_file",
    "desc": "Return the first n lines of a UTF-8 file.  Use this to examine large or unknown text files.",
    "arguments": { "path": "string", "lines": "int" }
},

We give the agent the name of the tool, a description including advice on when to use it, and info on the expected arguments. So, the actual MiniScript function is expecting its args map to contain "path" and "lines":

head_file = function(args)
    path = args.get("path"); if not path then return errMissingArg("path")
    if not file.exists(path) then return err("Invalid path `" + path + "`")
    lines = args.get("lines", 10)
    data = file.readLines(path)
    if data.len > lines then data = data[:lines]
    result = okResult
    result.path = path
    result.content = data.join(EOL)
    return result
end function

It pulls those arguments out of the map, does some simple validation on them, reads the file, and returns the requested data as the "content" string of the result map.

The other tools all work in a similar fashion.
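For instance, a write_file tool could follow the same pattern, with one extra check via the isWriteable helper. This is a sketch, not the actual implementation in tools.ms:

```
// Sketch of a write_file tool; the real implementation in tools.ms may differ.
write_file = function(args)
    path = args.get("path"); if not path then return errMissingArg("path")
    content = args.get("content")
    if content == null then return errMissingArg("content")
    if not isWriteable(path) then return err("Writing is only allowed under /usr/workspace")
    file.writeLines path, content.split(EOL)
    result = okResult
    result.path = path
    return result
end function
```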

Trying it out

You can download the MiniClaw source files from GitHub, but in order to access the LLM (gpt-5.4-nano) it uses, you'll need to set up an API key:

  1. Log in to platform.openai.com.
  2. Click on API Keys on the left.
  3. Click Create new secret key, give it a name like "MiniClaw", and copy the key it shows you.
  4. Paste that into a file called api_key.secret next to agent.ms.

Then you can mount that directory in Mini Micro, and run "agent".
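At startup, the agent reads this key from disk -- roughly like the following sketch (the actual loading code in agent.ms may differ):

```
// Sketch: load the API key saved in step 4 above.
keyLines = file.readLines("api_key.secret")
if not keyLines then
    print "Missing api_key.secret -- see the setup steps above."
    exit
end if
apiKey = keyLines[0]
```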

As for cost, I wouldn't worry about it too much... I used this thing a lot yesterday and today while developing it, and it cost under 40 cents. gpt-5.4-nano is pretty cheap, and seems smart enough for everything I've tried so far.

Taking it further

This is where it gets fun: add your own tools! This version of MiniClaw only does basic file creation/manipulation, as you can see from the tool list above. But you could make your own MiniClaw do anything Mini Micro is capable of. Examples:

  • Display a picture
  • Play a sound (or series of sounds -- making music?)
  • Launch a program
  • Access the web or web services via http
  • Do math
  • Call Wolfram Alpha for help

And adding more tools is pretty easy: just create a function for it in tools.ms, and add a description of the tool to instructions.txt. That's it; the LLM should be smart enough to invoke it when the time is right.
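As a sketch, here's what a hypothetical show_picture tool might look like in tools.ms. The tool name, and the choice to draw on the gfx display, are my own for illustration, not part of MiniClaw:

```
// Hypothetical new tool: load an image and draw it centered on the gfx display.
show_picture = function(args)
    path = args.get("path"); if not path then return errMissingArg("path")
    if not file.exists(path) then return err("Invalid path `" + path + "`")
    img = file.loadImage(path)
    if img == null then return err("Not a valid image: `" + path + "`")
    gfx.clear
    gfx.drawImage img, 480 - img.width/2, 320 - img.height/2
    result = okResult
    result.path = path
    return result
end function
```

You'd also add a matching entry to instructions.txt -- a name, a short description of when to use it, and an "arguments" map with "path": "string" -- just like the head_file entry shown earlier.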

If you want to switch to a different LLM provider (here's a handy guide to AI models for Hermes), you might have to adjust or rewrite the getResponse function, which formats the input for the LLM and then digs the actual response text out of the JSON package it's buried in. But this is totally doable. You could even run a local LLM, if that's your thing, and connect to it at localhost.
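For example, if you run a local server such as Ollama, a rewritten getResponse might look roughly like this. It's a sketch under several assumptions: that Ollama's /api/generate endpoint is listening on its default port, that `instructions` holds the contents of instructions.txt, and that Mini Micro's http.post returns the response body as a string:

```
import "json"

// Sketch: query a local Ollama server instead of OpenAI.
getResponse = function
    body = {"model": "llama3.2", "stream": false}
    body.prompt = instructions + EOL + EOL + promptInput
    raw = http.post("http://localhost:11434/api/generate", json.toJSON(body))
    data = json.parse(raw)
    if data == null or not data.hasIndex("response") then return ""
    return data.response
end function
```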

There's also a lot that could be done to improve MiniClaw's user interface. Right now it just prints stuff to the text display as it comes in (albeit with pretty colors, supporting the basic Markdown commonly used by LLMs). You could instead make a structured display, keeping the current task up top, showing some info about the context on the side, and neatly formatted responses (perhaps drawn with proportional fonts into a PixelDisplay) below.

My main goal with MiniClaw was to create an agent that is simple and small enough (and MiniScript enough!) to be easily understood and modified. And the great thing is, Mini Micro is a safe sandbox environment, assuming you only mount minidisks or folders you aren't worried about. So go nuts and have fun!
