Bharat Parsiya

Posted on May 26 • Originally published at bnap.dev

AI Agents Need Artifacts, Not Activity.

#ai #agents #productivity #workflows

You know that slightly cursed feeling when an AI coding agent says "done", but you still have no idea what changed?

It ran commands. It explored files. It produced a confident little summary. Maybe it even used the word "implemented" with alarming calm.

Then you open the repo.

No commit. No doc. No issue update. No test result. No saved draft.

Just vibes and terminal dust.

I have done this to myself more than once. I let an agent "investigate" something for twenty minutes, read the final message, feel briefly productive, and then realise tomorrow-me has nothing useful to pick up.

So I started using one boring rule.

The rule is simple

For any serious agent task, the task does not count until it produces one durable artifact.

Especially if the work touches revenue, product, operations, or anything I will have to explain later.

That artifact can be:

A merged or review-ready code change
A written spec
A campaign brief
A bug report with reproduction steps
A customer research summary
A decision log
A checklist with owners and dates
A published post
A saved draft that someone can actually use

Not a chat summary. Not "I looked into it." Not "next steps: continue investigation." Something another human or agent can pick up tomorrow without needing the original conversation.

If the work disappears when the chat window closes, it was not work. It was rehearsal.

Why agents fake progress so well

Humans do this too, to be fair. We have all spent an afternoon "researching" and ended with twelve tabs, zero decisions, and a vague sense that we are now more informed.

Agents just do it faster.

The default agent loop rewards motion. Search. Read. Think. Run. Retry. Summarize. Each step looks productive in the transcript, so it feels like the task is advancing. But unless the loop is forced to write something durable, the output often stays trapped inside the session.

This is extra dangerous because AI agents are very good at sounding finished. A human might say, "I poked around but I need to write this up." An agent will often say, "I analyzed the issue and identified opportunities," which sounds like work until you ask where the artifact is.

There usually isn't one.

The artifact is the handoff

The real value of an artifact is not that it looks neat. It is that it makes the next step cheaper.

A bug report with exact reproduction steps means the developer does not have to rediscover the bug. A campaign brief with audience, channel, message, and metric means the marketer does not have to reverse-engineer the strategy. A code change with a test result means the reviewer can inspect behavior instead of guessing what the agent thought it did.

Artifacts turn agent work from a private conversation into shared company memory.

That matters because the best use of agents is not one magic assistant doing everything. It is many small loops handing clean state to each other: one agent investigates, one writes, one reviews, one ships, one monitors.

That only works if each loop leaves a clean object behind.

Otherwise you get a very expensive game of telephone.

My "done" checklist

When I give an agent a task now, I want the closing update to answer five questions:

Question	Good answer
What changed?	A file, document, issue, draft, or published asset exists.
Where is it?	There is a path, link, ticket, or commit reference.
How was it checked?	There is a build, test, review, preview, or sanity check.
What remains?	Either nothing, or a specific blocker with an owner.
Who owns the next action?	A named human, agent, or team.

If those questions are not answered, the task is not done. It may be in progress. It may be blocked. It may be waiting for review. But it is not done.

This sounds obvious until you watch how often agent workflows skip it.

Tiny artifacts are fine

This does not mean every task needs a 20-page document. Please no.

The trick is to ask for the smallest artifact that proves the work moved forward.

If the task is "check whether this bug is real," the artifact can be a five-line repro note. If the task is "write the next blog post," the artifact is the Markdown file and the cross-post copy. If the task is "evaluate this feature idea," the artifact is a decision memo with the audience, risk, and next test.

Small is good. Small means it actually gets created.

The artifact just has to survive the session and reduce future uncertainty.

The prompt I use now

The best prompt change is boring:

Do the task and leave a durable artifact.

Before marking it done, report:
- what artifact you created or changed
- where it lives
- how you verified it
- what remains, if anything

For coding agents, I add:

Do not call the work complete unless the change is in files and the smallest relevant verification has run.

For marketing or strategy agents, I add:

The deliverable must include audience, goal, channel, message, owner, acceptance criteria, and success metric.

None of this makes the agent smarter. It makes the work inspectable. That is usually more important.

The annoying part

The artifact rule also exposes weak tasks.

If you cannot name the artifact you want, you probably have not defined the work clearly enough.

"Improve growth" is not a task. "Draft three LinkedIn posts promoting the new agent workflow article, each with a different hook and one success metric" is a task.

"Look into onboarding" is not a task. "Audit the first-run flow and produce a ranked list of the top five drop-off risks" is a task.

Agents do better when the finish line is an object, not a mood.

My current bias

I am increasingly convinced that the next leap in AI productivity will come less from fancier prompts and more from boring operational discipline.

Clear issue. Clear owner. Clear artifact. Clear verification. Clear next action.

That is not glamorous. It does not demo as well as an agent opening fifty tabs and pretending to be a tiny intern with caffeine problems. But it compounds.

The more agent work becomes durable, reviewable, and handoff-friendly, the less it feels like babysitting a chatbot and the more it feels like running an actual team.

Feel free to connect or reach out if you have questions — or if you have a better name than "terminal dust" for all the work agents do that never lands anywhere.