AIClaw is a self-hosted agent runtime, so "the answer" is often not the most useful output. In real workflows you usually want the file the agent created: a screenshot, a CSV, a Markdown report, a PDF, or some artifact produced by a tool run.
AIClaw already persists conversations, execution steps, and files in the same runtime. What makes that useful is that generated files are not treated as an afterthought. They are attached to the conversation, returned from chat APIs, and rendered directly in the web chat UI.
Project: https://github.com/chowyu12/aiclaw
The Problem
Many agent demos stop at plain text:
- a tool writes a file somewhere under /tmp
- the model mentions that the file exists
- the user has to go find it manually
- the next turn may not reliably carry that artifact forward
That breaks down fast once agents start using browser automation, code execution, shell commands, or document-style outputs.
AIClaw's current design closes that gap in three places:
- request-time file loading for uploaded files and URL-based files
- tool-output persistence for files created during execution
- chat/UI attachment rendering for both user-provided and agent-generated artifacts
How Files Enter The Runtime
AIClaw's chat request model supports file references through files on ChatRequest, with two transfer modes:
- local_file
- remote_url
On the server side, internal/agent/file.go loads those inputs into the execution context. Local uploads are resolved from persisted file records, and remote URLs can be fetched into temporary storage. Text and document types also get text extraction when possible, so the model can work with the content instead of only a file pointer.
There is one design detail here that matters a lot in practice: AIClaw also loads prior conversation files with ListFilesByConversation(...), so files stay in the conversation context instead of disappearing after one turn.
That makes attachments part of the working session, not just part of one HTTP request.
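As a rough illustration of the two transfer modes, here is a minimal sketch of parsing and validating a request body that carries file references. Only `local_file` and `remote_url` come from the description above; the struct names and the JSON field names (`transfer_mode`, `file_id`, `url`, `message`) are assumptions made for this example and may not match AIClaw's actual ChatRequest.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// fileRef is a hypothetical shape for one entry in the request's files list.
// The field names are assumptions, not AIClaw's real schema.
type fileRef struct {
	TransferMode string `json:"transfer_mode"`
	FileID       string `json:"file_id,omitempty"` // assumed: points at a persisted upload
	URL          string `json:"url,omitempty"`     // assumed: source for a remote fetch
}

// chatRequest is a hypothetical, trimmed-down request body.
type chatRequest struct {
	Message string    `json:"message"`
	Files   []fileRef `json:"files"`
}

// validate checks that a reference carries the field its transfer mode needs.
func (f fileRef) validate() error {
	switch f.TransferMode {
	case "local_file":
		if f.FileID == "" {
			return errors.New("local_file requires file_id")
		}
	case "remote_url":
		if f.URL == "" {
			return errors.New("remote_url requires url")
		}
	default:
		return fmt.Errorf("unknown transfer mode %q", f.TransferMode)
	}
	return nil
}

// parseChatRequest decodes the body and rejects malformed file references.
func parseChatRequest(body []byte) (chatRequest, error) {
	var req chatRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return req, err
	}
	for i, f := range req.Files {
		if err := f.validate(); err != nil {
			return req, fmt.Errorf("files[%d]: %w", i, err)
		}
	}
	return req, nil
}

func main() {
	body := []byte(`{"message":"Summarize both files","files":[
		{"transfer_mode":"local_file","file_id":"example-upload-id"},
		{"transfer_mode":"remote_url","url":"https://example.com/data.csv"}]}`)
	req, err := parseChatRequest(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(req.Files), req.Files[0].TransferMode, req.Files[1].TransferMode)
}
```

Validating the mode up front keeps the loader's two branches (resolve a stored record, or fetch a URL) from having to guess which field a caller meant to set.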
How Tool Output Becomes A Conversation Attachment
The interesting part is what happens after a tool runs.
In internal/agent/tool_call.go, AIClaw wraps each tool call with file-aware persistence logic:
- it snapshots the sandbox directory before the tool runs
- it executes the tool
- it checks whether the tool returned a structured file result
- if not, it scans the sandbox for newly created files
- it persists those files into the uploads area and creates file records in storage
That gives AIClaw two ways to capture artifacts:
- Explicit file results
- New files detected from the sandbox after execution
The sandbox scan intentionally skips script carrier files like .py, .js, .sh, .rb, and .ts, so it focuses on outputs rather than execution scaffolding.
This is a pragmatic design choice. It means tool authors do not always need a perfect custom return path for every artifact-producing workflow. If a tool or interpreter creates a real output file, AIClaw still has a chance to preserve it.
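The snapshot-then-scan fallback can be sketched in a few lines of Go. This is a minimal illustration of the idea, not the code in internal/agent/tool_call.go: the function names `snapshot` and `newArtifacts` are hypothetical, and the skip list simply mirrors the extensions named above.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
	"strings"
)

// scriptExts mirrors the script carrier extensions the scan is described as skipping.
var scriptExts = map[string]bool{".py": true, ".js": true, ".sh": true, ".rb": true, ".ts": true}

// snapshot (hypothetical helper) records the relative path of every regular file under root.
func snapshot(root string) (map[string]struct{}, error) {
	seen := make(map[string]struct{})
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.Type().IsRegular() {
			rel, err := filepath.Rel(root, path)
			if err != nil {
				return err
			}
			seen[rel] = struct{}{}
		}
		return nil
	})
	return seen, err
}

// newArtifacts (hypothetical helper) returns paths present in after but not in
// before, leaving out script files, in sorted order.
func newArtifacts(before, after map[string]struct{}) []string {
	var out []string
	for rel := range after {
		if _, existed := before[rel]; existed {
			continue
		}
		if scriptExts[strings.ToLower(filepath.Ext(rel))] {
			continue
		}
		out = append(out, rel)
	}
	sort.Strings(out)
	return out
}

func main() {
	dir, err := os.MkdirTemp("", "sandbox")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	before, err := snapshot(dir)
	if err != nil {
		panic(err)
	}

	// Stand-in for a tool run: one script carrier file and one real output.
	if err := os.WriteFile(filepath.Join(dir, "run.py"), []byte("print(1)"), 0o644); err != nil {
		panic(err)
	}
	if err := os.WriteFile(filepath.Join(dir, "chart.png"), []byte("png"), 0o644); err != nil {
		panic(err)
	}

	after, err := snapshot(dir)
	if err != nil {
		panic(err)
	}
	fmt.Println(newArtifacts(before, after)) // prints [chart.png]
}
```

Note that a path-only diff like this one catches new files but not files overwritten in place; detecting those would need modification times or content hashes in the snapshot. Whether AIClaw tracks that is not covered here.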
Why This Matters For Actual Agent Work
This file flow is especially useful for tools that naturally produce artifacts:
- browser can produce screenshots
- code_interpreter can generate plots, tables, and exported files
- shell-oriented tools can write reports or transformed datasets
- sub-agents can bubble their files back to the parent run
sub_agent output is also handled explicitly in the same tool-call path, so the parent conversation can inherit generated files from delegated work instead of losing them inside nested execution.
That is the difference between "the agent said it did something" and "the result is now attached to the conversation and available to open."
API And Streaming Behavior
The chat API returns files as first-class response data.
In internal/handler/chat.go, the non-streaming response includes:
- message
- steps
- files
- plan
The streaming shape in model.StreamChunk also includes files, so generated artifacts can be delivered as part of the same execution session rather than requiring a separate polling flow.
This fits the rest of AIClaw's runtime model well:
- execution steps show what happened
- plan state shows task progress
- attachments carry the concrete output
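On the client side, consuming a stream that carries files alongside text could look roughly like the sketch below. The types here are hypothetical stand-ins: the real model.StreamChunk is only known to include files, so the `Content`, `ID`, and `Name` fields, and the idea that the same file might appear in more than one chunk, are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// fileInfo is a hypothetical attachment descriptor.
type fileInfo struct {
	ID   string
	Name string
}

// streamChunk is a hypothetical stand-in for one streamed chunk:
// a piece of text plus any files reported with it.
type streamChunk struct {
	Content string
	Files   []fileInfo
}

// result is what the client has once the stream ends.
type result struct {
	Message string
	Files   []fileInfo
}

// collect concatenates the text pieces and gathers files in arrival order,
// keeping only the first occurrence of each file ID.
func collect(chunks []streamChunk) result {
	var text strings.Builder
	seen := make(map[string]bool)
	var files []fileInfo
	for _, c := range chunks {
		text.WriteString(c.Content)
		for _, f := range c.Files {
			if seen[f.ID] {
				continue
			}
			seen[f.ID] = true
			files = append(files, f)
		}
	}
	return result{Message: text.String(), Files: files}
}

func main() {
	chunks := []streamChunk{
		{Content: "Chart "},
		{Content: "ready.", Files: []fileInfo{{ID: "f1", Name: "chart.png"}}},
		{Files: []fileInfo{{ID: "f1", Name: "chart.png"}, {ID: "f2", Name: "report.md"}}},
	}
	r := collect(chunks)
	fmt.Println(r.Message, len(r.Files)) // prints Chart ready. 2
}
```

The point of the sketch is that text and attachments arrive through one channel, so a client needs no second request to learn which files a run produced.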
What The Chat UI Does With It
The Vue chat page under web/src/views/chat/Index.vue renders attachments directly inside the message bubble.
Current behavior is straightforward and useful:
- images render as preview cards with thumbnails
- non-image files render as clickable file cards
- file type and size are shown in the UI
- user-side pending uploads and URL attachments are visible before send
That matters because it keeps the artifact in the same visual flow as the model response and the execution timeline. You do not have to jump into another admin page just to confirm that a generated file exists.
A Practical Workflow
Here is a realistic AIClaw flow using this design:
- Upload a CSV to the conversation.
- Ask the agent to analyze it and generate a summary report plus a chart.
- Let a code-oriented tool write the chart image and report file.
- Receive the assistant answer, execution steps, and the generated attachments in the same conversation.
- Open the files directly from the chat UI or reuse them in the next turn.
The same pattern applies to browser screenshots, exported markdown notes, or files produced by a delegated sub-agent.
Why I Like This Design
What stands out in AIClaw's implementation is that attachments are part of the runtime contract, not a bolt-on download feature.
- The executor loads files into context.
- The tool layer persists produced artifacts.
- The store keeps them associated with the conversation.
- The API returns them.
- The chat UI renders them.
That end-to-end path is what makes an agent platform usable for work that produces assets instead of only text.
If you're building local-first or self-hosted agents, this is one of the surfaces worth getting right early. Text answers are cheap. Reliable artifact handling is what makes the system operational.