AIClaw's Generated File Attachments Keep Tool Output In The Chat Loop

#ai #opensource #agents #webdev

AIClaw is a self-hosted agent runtime, and in that kind of system the text answer is often not the most useful output. In real workflows you usually want the file the agent created: a screenshot, a CSV, a Markdown report, a PDF, or some other artifact produced by a tool run.

AIClaw already persists conversations, execution steps, and files in the same runtime. What makes that useful is that generated files are not treated as an afterthought. They are attached to the conversation, returned from chat APIs, and rendered directly in the web chat UI.

Project: https://github.com/chowyu12/aiclaw

The Problem

Many agent demos stop at plain text:

a tool writes a file somewhere under /tmp
the model mentions that the file exists
the user has to go find it manually
the next turn may not reliably carry that artifact forward

That breaks down fast once agents start using browser automation, code execution, shell commands, or document-style outputs.

AIClaw's current design closes that gap in three places:

request-time file loading for uploaded files and URL-based files
tool-output persistence for files created during execution
chat/UI attachment rendering for both user-provided and agent-generated artifacts

How Files Enter The Runtime

AIClaw's chat request model supports file references through the files field on ChatRequest, with two transfer modes:

local_file
remote_url

On the server side, internal/agent/file.go loads those inputs into the execution context. Local uploads are resolved from persisted file records, and remote URLs can be fetched into temporary storage. Text and document types also get text extraction when possible, so the model can work with the content instead of only a file pointer.
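As a rough illustration of the two transfer modes, the sketch below models a request with file references and dispatches on the mode. Only ChatRequest, files, local_file, and remote_url come from AIClaw; the FileRef type, its field names, the JSON tags, and resolveSource are assumptions made for this example, not the project's actual definitions.

```go
package main

import (
	"errors"
	"fmt"
)

// TransferMode mirrors the two modes named in the article.
type TransferMode string

const (
	ModeLocalFile TransferMode = "local_file"
	ModeRemoteURL TransferMode = "remote_url"
)

// FileRef is a hypothetical shape for one entry in ChatRequest.Files.
type FileRef struct {
	TransferMode TransferMode `json:"transfer_mode"`
	FileID       string       `json:"file_id,omitempty"` // assumed: ID of a persisted upload
	URL          string       `json:"url,omitempty"`     // assumed: source of a remote file
}

// ChatRequest shows only the fields relevant to this example.
type ChatRequest struct {
	Message string    `json:"message"`
	Files   []FileRef `json:"files"`
}

// resolveSource describes where a file's bytes would come from.
// A real loader would read the stored upload or fetch the URL here.
func resolveSource(ref FileRef) (string, error) {
	switch ref.TransferMode {
	case ModeLocalFile:
		if ref.FileID == "" {
			return "", errors.New("local_file requires a file ID")
		}
		return "stored upload " + ref.FileID, nil
	case ModeRemoteURL:
		if ref.URL == "" {
			return "", errors.New("remote_url requires a URL")
		}
		return "download from " + ref.URL, nil
	default:
		return "", fmt.Errorf("unsupported transfer mode %q", ref.TransferMode)
	}
}

func main() {
	req := ChatRequest{
		Message: "Summarize these files.",
		Files: []FileRef{
			{TransferMode: ModeLocalFile, FileID: "42"},
			{TransferMode: ModeRemoteURL, URL: "https://example.com/data.csv"},
		},
	}
	for _, f := range req.Files {
		src, err := resolveSource(f)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		fmt.Println(src)
	}
}
```

Rejecting unknown modes up front keeps a malformed request from reaching the model with a file pointer that nothing can resolve.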

There is one design detail here that matters a lot in practice: AIClaw also loads prior conversation files with ListFilesByConversation(...), so files stay in the conversation context instead of disappearing after one turn.

That makes attachments part of the working session, not just part of one HTTP request.
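To make the carry-forward behavior concrete, here is a minimal sketch of merging earlier conversation files with the files attached to the current request. The method name ListFilesByConversation is taken from AIClaw; its signature, the File and FileStore types, the in-memory store, and filesForTurn are all assumptions for illustration.

```go
package main

import "fmt"

// File is a hypothetical file record.
type File struct {
	ID             string
	ConversationID string
	Name           string
}

// FileStore is an assumed interface; only the method name comes from AIClaw.
type FileStore interface {
	ListFilesByConversation(conversationID string) ([]File, error)
}

// memStore is an in-memory stand-in for real storage.
type memStore struct{ files []File }

func (m memStore) ListFilesByConversation(conversationID string) ([]File, error) {
	var out []File
	for _, f := range m.files {
		if f.ConversationID == conversationID {
			out = append(out, f)
		}
	}
	return out, nil
}

// filesForTurn returns prior conversation files followed by the current
// request's files, dropping duplicates by ID.
func filesForTurn(store FileStore, conversationID string, current []File) ([]File, error) {
	prior, err := store.ListFilesByConversation(conversationID)
	if err != nil {
		return nil, fmt.Errorf("list conversation files: %w", err)
	}
	seen := make(map[string]bool, len(prior)+len(current))
	merged := make([]File, 0, len(prior)+len(current))
	add := func(files []File) {
		for _, f := range files {
			if !seen[f.ID] {
				seen[f.ID] = true
				merged = append(merged, f)
			}
		}
	}
	add(prior)
	add(current)
	return merged, nil
}

func main() {
	store := memStore{files: []File{
		{ID: "f1", ConversationID: "c1", Name: "sales.csv"},
		{ID: "f2", ConversationID: "c1", Name: "chart.png"},
		{ID: "f9", ConversationID: "c2", Name: "other.txt"},
	}}
	current := []File{{ID: "f3", ConversationID: "c1", Name: "notes.md"}}

	files, err := filesForTurn(store, "c1", current)
	if err != nil {
		panic(err)
	}
	for _, f := range files {
		fmt.Println(f.ID, f.Name)
	}
}
```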

How Tool Output Becomes A Conversation Attachment

The interesting part is what happens after a tool runs.

In internal/agent/tool_call.go, AIClaw wraps each tool call with file-aware persistence logic:

it snapshots the sandbox directory before the tool runs
it executes the tool
it checks whether the tool returned a structured file result
if not, it scans the sandbox for newly created files
it persists those files into the uploads area and creates file records in storage

That gives AIClaw two ways to capture artifacts:

Explicit file results
New files detected from the sandbox after execution
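The two capture paths can be sketched as a small wrapper around a tool call. This is a simplified model, not the code in internal/agent/tool_call.go: ToolResult, Tool, runWithCapture, and the listSandbox callback are hypothetical names, and persistence into the uploads area is left out.

```go
package main

import "fmt"

// ToolResult is a hypothetical result type. Files holds outputs that the
// tool reported explicitly.
type ToolResult struct {
	Text  string
	Files []string
}

// Tool is any function that runs inside the sandbox.
type Tool func() (ToolResult, error)

// runWithCapture snapshots the sandbox, runs the tool, and returns the
// files to attach. Explicit file results win; otherwise it falls back to
// the files that appeared in the sandbox during the run.
func runWithCapture(tool Tool, listSandbox func() []string) (ToolResult, []string, error) {
	before := make(map[string]bool)
	for _, p := range listSandbox() {
		before[p] = true
	}

	res, err := tool()
	if err != nil {
		return res, nil, err
	}
	if len(res.Files) > 0 {
		return res, res.Files, nil
	}

	var created []string
	for _, p := range listSandbox() {
		if !before[p] {
			created = append(created, p)
		}
	}
	return res, created, nil
}

func main() {
	// An in-memory stand-in for the sandbox directory listing.
	sandbox := []string{"input.csv"}
	list := func() []string { return append([]string(nil), sandbox...) }

	// This tool writes a file but does not report it.
	plot := func() (ToolResult, error) {
		sandbox = append(sandbox, "chart.png")
		return ToolResult{Text: "plotted"}, nil
	}

	_, files, err := runWithCapture(plot, list)
	if err != nil {
		panic(err)
	}
	fmt.Println(files)
}
```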

The sandbox scan intentionally skips script carrier files like .py, .js, .sh, .rb, and .ts, so it focuses on outputs rather than execution scaffolding.

This is a pragmatic design choice. It means tool authors do not always need a perfect custom return path for every artifact-producing workflow. If a tool or interpreter creates a real output file, AIClaw still has a chance to preserve it.
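The sandbox scan described above amounts to a before/after directory diff with an extension filter. The sketch below shows one way to write it using the Go standard library; the function names are assumptions, and the extension list is the one named in this post.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
	"strings"
)

// scriptExts lists the script carrier extensions that the scan skips.
var scriptExts = map[string]bool{
	".py": true, ".js": true, ".sh": true, ".rb": true, ".ts": true,
}

// snapshot records every regular file under root, keyed by relative path.
func snapshot(root string) (map[string]bool, error) {
	seen := make(map[string]bool)
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.Type().IsRegular() {
			rel, err := filepath.Rel(root, path)
			if err != nil {
				return err
			}
			seen[rel] = true
		}
		return nil
	})
	return seen, err
}

// newOutputs returns paths present in after but not in before, sorted,
// with script carrier files excluded.
func newOutputs(before, after map[string]bool) []string {
	var out []string
	for rel := range after {
		if before[rel] || scriptExts[strings.ToLower(filepath.Ext(rel))] {
			continue
		}
		out = append(out, rel)
	}
	sort.Strings(out)
	return out
}

func main() {
	dir, err := os.MkdirTemp("", "sandbox")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	write := func(name string) {
		if err := os.WriteFile(filepath.Join(dir, name), []byte("x"), 0o644); err != nil {
			panic(err)
		}
	}

	write("input.csv")
	before, err := snapshot(dir)
	if err != nil {
		panic(err)
	}

	// Simulate a tool run that writes a script and a real output.
	write("run.py")
	write("chart.png")

	after, err := snapshot(dir)
	if err != nil {
		panic(err)
	}
	fmt.Println(newOutputs(before, after)) // [chart.png]
}
```

A path-only diff like this detects new files but not files that were modified in place; catching those would need sizes, modification times, or hashes in the snapshot.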

Why This Matters For Actual Agent Work

This file flow is especially useful for tools that naturally produce artifacts:

browser can produce screenshots
code_interpreter can generate plots, tables, and exported files
shell-oriented tools can write reports or transformed datasets
sub-agents can bubble their files back to the parent run

Output from sub_agent is also handled explicitly in the same tool-call path, so the parent conversation inherits files generated by delegated work instead of losing them inside nested execution.

That is the difference between "the agent said it did something" and "the result is now attached to the conversation and available to open."

API And Streaming Behavior

The chat API returns files as first-class response data.

In internal/handler/chat.go, the non-streaming response includes:

message
steps
files
plan

The streaming shape in model.StreamChunk also includes files, so generated artifacts can be delivered as part of the same execution session rather than requiring a separate polling flow.
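For a sense of what this looks like on the wire, here is a hedged sketch of the two shapes. The top-level keys message, steps, files, and plan, and the presence of files on the stream chunk, come from the description above. Everything else is assumed for the example: the FileInfo fields, the simplified step and plan types, and the delta and done fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// FileInfo is an assumed attachment descriptor.
type FileInfo struct {
	ID       string `json:"id"`
	Name     string `json:"name"`
	MimeType string `json:"mime_type"`
	Size     int64  `json:"size"`
}

// ChatResponse carries the four top-level parts of the non-streaming reply.
type ChatResponse struct {
	Message string     `json:"message"`
	Steps   []string   `json:"steps"` // assumed: simplified to step labels
	Files   []FileInfo `json:"files"`
	Plan    []string   `json:"plan"` // assumed: simplified to plan items
}

// StreamChunk is an assumed streaming shape that can also carry files.
type StreamChunk struct {
	Delta string     `json:"delta,omitempty"`
	Files []FileInfo `json:"files,omitempty"`
	Done  bool       `json:"done"`
}

// topLevelKeys returns the sorted JSON keys that v serializes to.
func topLevelKeys(v any) ([]string, error) {
	raw, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(raw, &obj); err != nil {
		return nil, err
	}
	keys := make([]string, 0, len(obj))
	for k := range obj {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys, nil
}

func main() {
	chart := FileInfo{ID: "f1", Name: "chart.png", MimeType: "image/png", Size: 20480}

	resp := ChatResponse{
		Message: "Report ready.",
		Steps:   []string{"code_interpreter"},
		Files:   []FileInfo{chart},
	}
	keys, err := topLevelKeys(resp)
	if err != nil {
		panic(err)
	}
	fmt.Println(keys) // [files message plan steps]

	// A final stream chunk can deliver the same attachment.
	out, err := json.Marshal(StreamChunk{Files: []FileInfo{chart}, Done: true})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```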

This fits the rest of AIClaw's runtime model well:

execution steps show what happened
plan state shows task progress
attachments carry the concrete output

What The Chat UI Does With It

The Vue chat page at web/src/views/chat/Index.vue renders attachments directly inside the message bubble.

The current behavior is straightforward:

images render as preview cards with thumbnails
non-image files render as clickable file cards
file type and size are shown in the UI
user-side pending uploads and URL attachments are visible before send

That matters because it keeps the artifact in the same visual flow as the model response and the execution timeline. You do not have to jump into another admin page just to confirm that a generated file exists.
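The rendering rules above boil down to two small decisions per attachment: which card to show and how to label its size. The real logic lives in the Vue component; the sketch below restates it in Go only to keep this post's examples in one language. The cardKind and formatSize helpers and the card names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// cardKind picks a card type from the MIME type: images get a preview
// card, everything else gets a clickable file card.
func cardKind(mimeType string) string {
	if strings.HasPrefix(mimeType, "image/") {
		return "image-preview"
	}
	return "file-card"
}

// formatSize renders a byte count with 1024-based units.
func formatSize(n int64) string {
	const unit = 1024
	if n < unit {
		return fmt.Sprintf("%d B", n)
	}
	div, exp := int64(unit), 0
	for v := n / unit; v >= unit; v /= unit {
		div *= unit
		exp++
	}
	return fmt.Sprintf("%.1f %cB", float64(n)/float64(div), "KMGTPE"[exp])
}

func main() {
	fmt.Println(cardKind("image/png"), formatSize(20480))     // image-preview 20.0 KB
	fmt.Println(cardKind("application/pdf"), formatSize(512)) // file-card 512 B
}
```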

A Practical Workflow

Here is a realistic AIClaw flow using this design:

Upload a CSV to the conversation.
Ask the agent to analyze it and generate a summary report plus a chart.
Let a code-oriented tool write the chart image and report file.
Receive the assistant answer, execution steps, and the generated attachments in the same conversation.
Open the files directly from the chat UI or reuse them in the next turn.

The same pattern applies to browser screenshots, exported Markdown notes, or files produced by a delegated sub-agent.

Why I Like This Design

What stands out in AIClaw's implementation is that attachments are part of the runtime contract, not a bolt-on download feature.

The executor loads files into context.
The tool layer persists produced artifacts.
The store keeps them associated with the conversation.
The API returns them.
The chat UI renders them.

That end-to-end path is what makes an agent platform usable for work that produces assets instead of only text.

If you're building local-first or self-hosted agents, this is one of the surfaces worth getting right early. Text answers are cheap. Reliable artifact handling is what makes the system operational.