DEV Community: Alex Shev

Stop Teaching Terminal Commands. Teach Terminal Workflows.

Alex Shev — Mon, 01 Jun 2026 15:35:38 +0000

Most terminal education teaches commands.

That makes sense at the beginning.

You need to know what grep does. You need to know how find works. You need to understand pipes, redirects, exit codes, environment variables, and shell scripts.

But after a while, the command is not the hard part.

The hard part is the workflow.

Not:

grep -R "TODO" .

But:

Inspect the repo.
Find the risky files.
Decide what needs changing.
Run the smallest useful command.
Verify the result.
Recover cleanly if the command was wrong.

That is the part most people are not taught.

And it is also the part AI agents need most.

Commands are vocabulary

Knowing terminal commands is useful.

But commands are vocabulary.

Workflows are how you actually get work done.

A beginner asks:

What command lists files?

A more experienced developer asks:

What am I trying to learn from the filesystem, and what is the safest way to inspect it?

Those are different questions.

For example, this command is easy to memorize:

ls

But the real workflow might be:

pwd
ls -la
find . -maxdepth 2 -type f | head
git status --short

That sequence tells you where you are, what is around you, what kind of project you are in, and whether the workspace is already dirty.

The command is small.

The judgment around the command is the skill.

A terminal workflow has stages

I like this simple model:

inspect -> decide -> run -> verify -> recover

That pattern shows up everywhere.

1. Inspect

Before you change anything, learn the shape of the problem.

pwd
git status --short
find . -maxdepth 2 -type f | sort | head -50

This is not busywork.

It prevents the most common terminal mistake: running the right command in the wrong place.
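
One way to guard against that mistake is a small preflight check that refuses to continue unless the current directory looks like the project you expect. This is a minimal sketch: the function name `require_project_root` and the marker-file convention are assumptions made for illustration, not part of any standard tool.

```shell
# Preflight guard: succeed only if the current directory contains a marker
# file that identifies the expected project (hypothetical convention).
require_project_root() {
  local marker="${1:-package.json}"   # assumed default; pass your own marker
  if [ ! -e "$marker" ]; then
    echo "Refusing to continue: no $marker in $(pwd)" >&2
    return 1
  fi
  echo "OK: $(pwd) contains $marker"
}
```

A call such as `require_project_root package.json && npm test` then fails early in the wrong directory instead of running the right command in the wrong place.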

2. Decide

Once you inspect, choose the smallest useful action.

Do not start with the most powerful command.

Start with the command that gives you more information or makes a reversible change.

For example:

git diff -- path/to/file

before:

git add .

Or:

ffprobe input.mp4

before:

ffmpeg -i input.mp4 output.mp4

3. Run

Run the command with a clear expectation.

You should know what success looks like before you press Enter.

Bad:

Let's try this and see what happens.

Better:

This should produce one MP4 in the output folder, keep the source file unchanged, and exit non-zero if the input is invalid.
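
An expectation like that can be written down as a check that runs together with the command. The sketch below is one possible shape, assuming a single source file and a single output file; `run_expecting` is a hypothetical helper name.

```shell
# run_expecting SRC OUT CMD [ARGS...]
# Runs CMD, then checks the stated expectation: CMD exited zero, OUT exists
# and is non-empty, and SRC has the same checksum as before the run.
run_expecting() {
  local src="$1" out="$2" before after
  shift 2
  before=$(cksum < "$src") || return 1
  "$@" || { echo "FAIL: command exited non-zero" >&2; return 1; }
  [ -s "$out" ] || { echo "FAIL: $out missing or empty" >&2; return 1; }
  after=$(cksum < "$src")
  [ "$before" = "$after" ] || { echo "FAIL: $src was modified" >&2; return 1; }
  echo "OK: $out created, $src unchanged"
}
```

For the conversion above, that would look like `run_expecting input.mp4 output.mp4 ffmpeg -i input.mp4 output.mp4`.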

4. Verify

The command finishing is not the same as the workflow being done.

If you converted a video, inspect the output.

ffprobe output.mp4

If you changed code, run a relevant test.

npm test

If you edited content, check the final rendered page, not just the local Markdown.

Verification is where a terminal habit becomes a production workflow.

5. Recover

Every useful workflow needs a recovery path.

What happens if the command fails?

What happens if it succeeds but produces the wrong output?

What happens if the workspace already had unrelated changes?

For example:

git diff
git restore --staged .

or:

mkdir -p output_failed
mv broken-output.mp4 output_failed/

Recovery should not be improvised after something breaks.

It should be part of the workflow.
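
As a sketch of what "part of the workflow" could mean for the `output_failed` example above: a helper that moves a suspect file aside instead of deleting it. The name `quarantine` and the timestamp prefix are assumptions for illustration.

```shell
# quarantine FILE [DIR]
# Moves a suspect output into a holding folder instead of deleting it, and
# prefixes a timestamp so files from different runs are kept apart.
# Prints the new location on success.
quarantine() {
  local file="$1" dir="${2:-output_failed}" dest
  [ -e "$file" ] || { echo "Nothing to quarantine: $file" >&2; return 1; }
  mkdir -p "$dir" || return 1
  dest="$dir/$(date +%Y%m%d-%H%M%S)-$(basename "$file")"
  mv -- "$file" "$dest" && echo "$dest"
}
```

Because the file is moved rather than removed, the failed output stays available for inspection after the workflow stops.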

Why this matters more with AI agents

AI agents are very good at discovering commands.

They can usually find the right shell syntax faster than a human can.

But they can still fail in familiar ways:

running commands before inspecting the project
using broad commands when a narrow one would do
trusting success output too early
overwriting files without a recovery plan
skipping platform-specific verification
treating a tool capability as a finished workflow

That is why "give the agent terminal access" is not enough.

Tool access gives the agent hands.

The workflow tells the agent what careful work looks like.

Example: teaching `grep` vs teaching search

You can teach someone this:

grep -R "stripe" .

That is a command.

But a real search workflow might be:

git status --short
find . -maxdepth 3 -type f | sort | head -100
grep -R --exclude-dir=node_modules --exclude-dir=.git "stripe" .
grep -R --exclude-dir=node_modules --exclude-dir=.git --include="*.ts" --include="*.tsx" "createCheckout" .

The difference is not just syntax.

The workflow says:

check the workspace first
avoid dependency folders
search broadly, then narrow by file type
search both product terms and implementation terms
do not edit until you know where the real boundary is
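
The search rules in that list can be folded into one small wrapper so they are applied the same way every time. This is a sketch, not a standard tool: `search_code` is a hypothetical name, and the excluded folders are the two used in the commands above.

```shell
# search_code PATTERN [EXT...]
# Recursive search from the current directory that always skips dependency
# and VCS folders. Optional extensions (e.g. ts tsx) narrow by file type.
search_code() {
  local pattern="$1" n ext
  shift
  # Rewrite each remaining argument "ext" into "--include=*.ext".
  n=$#
  while [ "$n" -gt 0 ]; do
    ext="$1"; shift
    set -- "$@" "--include=*.$ext"
    n=$((n - 1))
  done
  grep -R -n --exclude-dir=node_modules --exclude-dir=.git "$@" -e "$pattern" .
}
```

`search_code stripe` is the broad pass; `search_code createCheckout ts tsx` is the narrowed one.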

That is what developers actually do.

The command is only one piece.

Example: teaching `ffmpeg` vs teaching media prep

You can teach:

ffmpeg -i input.mov output.mp4

That is fine for a demo.

But if the file needs to be uploaded to a platform, the workflow matters more:

Inspect input.
Check duration, codec, pixel format, audio, and dimensions.
Convert to a safe default.
Inspect output.
Check file size.
Upload.
Verify the real platform preview.
Publish.
Verify the final post.

The important lesson is not "use FFmpeg."

The important lesson is:

Never trust a media conversion until the destination platform accepts it.

That is workflow knowledge.

This is where Terminal Skills fits

Terminal Skills is useful because it gives agents a way to package workflows, not just commands.

A good skill does not only say:

Run this command.

It says:

Use this when the task looks like this.
Inspect these inputs.
Run these steps.
Stop in these cases.
Verify these outputs.
Report the result in this format.

That is the missing layer between raw tool access and reliable automation.

For example, a media skill can teach an agent how to prepare upload-safe video.

A Git skill can teach an agent how to inspect a dirty worktree without destroying user changes.

A deployment skill can teach an agent how to build, test, deploy, and verify a live URL.

The point is not that the terminal becomes easier.

The point is that the workflow becomes repeatable.

A useful skill has a definition of done

This is the part I care about most.

A command can finish successfully and still not solve the problem.

A skill should define done.

For a code change:

Done means the targeted files were changed, tests passed, no unrelated files were touched, and the result was summarized with exact file paths.

For a video conversion:

Done means the output file is readable, uses the expected codec and pixel format, is under the target size, and has been accepted by the upload composer.

For a content workflow:

Done means the draft exists, the links work, the target platform format is respected, and publishing approval was recorded.

That last line matters.

In real work, verification and approval are part of the system.

They are not optional admin details.
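
Parts of a definition of done can be checked mechanically. As one example, the "no unrelated files were touched" condition from the code-change definition above could be sketched like this; `unrelated_files` and the idea of allowed path prefixes are assumptions for illustration.

```shell
# unrelated_files PREFIX...
# Reads changed paths on stdin (one per line, for example the output of
# `git diff --name-only`) and prints those outside every allowed prefix.
# Exit status is 0 only when nothing unrelated was touched.
unrelated_files() {
  local changed prefix allowed found=0
  while IFS= read -r changed; do
    [ -n "$changed" ] || continue
    allowed=0
    for prefix in "$@"; do
      case "$changed" in "$prefix"*) allowed=1 ;; esac
    done
    if [ "$allowed" -eq 0 ]; then
      echo "$changed"
      found=1
    fi
  done
  return "$found"
}
```

Used as `git diff --name-only | unrelated_files src/ tests/`, it turns one line of the definition into a pass or fail result with the offending paths listed.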

The better way to teach terminal work

I still think people should learn commands.

But I would not stop there.

Teach commands inside workflows.

Instead of:

Here are 20 useful Linux commands.

Teach:

Here is how to inspect an unfamiliar repo.
Here is how to safely search a codebase.
Here is how to prepare a video for upload.
Here is how to debug a broken script.
Here is how to verify a deployment.
Here is how to recover after a bad command.

That is closer to how developers actually work.

It is also closer to what AI agents need.

Because the future of terminal automation is not agents memorizing more commands.

It is agents following better workflows.

If you are building with AI agents, ask this:

What terminal task do I keep explaining from scratch?

That is probably not a prompt problem.

It is probably a workflow that should become a skill.

Disclosure: I used AI assistance to draft and edit this article, then reviewed the examples, commands, and claims before publishing.

How I Made AI Video Uploads Boring with a Terminal Skill

Alex Shev — Sat, 30 May 2026 23:18:20 +0000

AI video demos are getting easier to generate.

Publishing them is still weirdly fragile.

Last week I had a simple task: take an AI-generated demo video, attach it to a post, and publish it without the platform silently dropping the media.

The video looked fine locally.

The upload did not.

Wrong codec. Too large. Slow processing. Missing preview. No clear error. The browser said the upload was okay, but the composer did not actually register the file.

That is the kind of problem I do not want to solve from memory at 11 PM.

So I turned it into a Terminal Skill.

Not a giant automation platform. Not a magic agent prompt. Just a small repeatable workflow that takes a messy video file and produces an X-safe default MP4 with checks before and after the conversion.

Here is the use case.

The actual problem

AI video tools often export files that are technically valid but awkward for social platforms.

Common issues:

file is too large
codec is accepted by QuickTime but not by the platform
pixel format is not yuv420p
metadata is not optimized for web playback
resolution is higher than needed
audio track is missing or weird
duration is fine visually but platform processing hangs

You can fix most of this with FFmpeg.

But the problem is not knowing that FFmpeg exists.

The problem is remembering the exact flags, running them consistently, checking the output, and not trusting the browser upload until the platform shows a real preview.

That is where a Terminal Skill helps.

What I mean by a Terminal Skill

For this workflow, a Terminal Skill is a small folder with:

one script that does the conversion
one validation step before conversion
one validation step after conversion
a short SKILL.md explaining when to use it
predictable input and output paths
logs that make it obvious what happened

The important part is not the script itself.

The important part is that the workflow becomes reusable by a human or an agent without rediscovering the rules every time.

The skill answers:

When should I use this?
What input does it expect?
What output should it produce?
How do I know it worked?
When should I stop instead of publishing?

That last question matters a lot.

For external platforms, "the command succeeded" is not the same as "the post is safe to publish."

The folder structure

I kept the structure boring:

x-safe-video/
  SKILL.md
  make-x-safe.sh
  examples/
  output/

The script does the mechanical work.

The SKILL.md documents the operating rules:

# X-Safe Video

Use this when preparing generated video for X/Twitter upload.

Input:
- MP4, MOV, or WebM
- preferably under 2 minutes

Output:
- MP4
- H.264 video
- AAC audio if audio exists
- yuv420p pixel format
- faststart metadata
- scaled down if needed

Stop if:
- ffprobe cannot read the file
- output has no video stream
- output is larger than the target platform limit
- upload composer does not show a real video preview
- platform does not show Uploaded 100%

That is the difference between a script and a skill.

A script says:

Run this command.

A skill says:

Here is the workflow, the boundary, and the definition of done.

The conversion script

Here is the simplified version.

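#!/usr/bin/env bash
# Stop on errors, on unset variables, and on failures inside pipelines.
set -euo pipefail

# First argument is the input video; OUTDIR can be overridden from the environment.
INPUT="${1:?Usage: ./make-x-safe.sh input-video}"
OUTDIR="${OUTDIR:-./output}"
mkdir -p "$OUTDIR"

if [[ ! -f "$INPUT" ]]; then
  echo "Input file not found: $INPUT" >&2
  exit 1
fi

# Derive the output name from the input name, minus its extension.
BASENAME="$(basename "$INPUT")"
NAME="${BASENAME%.*}"
OUTPUT="$OUTDIR/${NAME}_x_safe.mp4"

# Validation before conversion: with set -e, the script stops here
# if ffprobe cannot read the file.
echo "Inspecting input..."
ffprobe -v error \
  -select_streams v:0 \
  -show_entries stream=codec_name,width,height,pix_fmt,duration \
  -of default=noprint_wrappers=1 \
  "$INPUT"

# Re-encode to H.264 video and AAC audio.
# Note: -y overwrites an existing output file with the same name.
echo "Converting to X-safe MP4..."
ffmpeg -y -i "$INPUT" \
  -vf "scale='min(1280,iw)':-2" \
  -c:v libx264 \
  -profile:v high \
  -pix_fmt yuv420p \
  -preset medium \
  -crf 23 \
  -movflags +faststart \
  -c:a aac \
  -b:a 128k \
  "$OUTPUT"

# Validation after conversion: print what was actually produced.
echo "Inspecting output..."
ffprobe -v error \
  -select_streams v:0 \
  -show_entries stream=codec_name,width,height,pix_fmt,duration \
  -of default=noprint_wrappers=1 \
  "$OUTPUT"

# Report the size in whole megabytes (integer division rounds down).
BYTES=$(wc -c < "$OUTPUT")
MB=$((BYTES / 1024 / 1024))

echo "Output: $OUTPUT"
echo "Size: ${MB}MB"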

This is not the most advanced FFmpeg command in the world.

That is the point.

The goal is not to make the cleverest media pipeline. The goal is to make a reliable default that works under pressure.
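
The script prints the output size but does not enforce the "output is larger than the target platform limit" stop condition from SKILL.md. A minimal sketch of that check follows. `check_size` is a hypothetical helper, and the limit is a parameter because no specific platform value is assumed here.

```shell
# check_size FILE MAX_MB
# Fails when FILE is larger than MAX_MB megabytes (1024 * 1024 bytes each).
# MAX_MB is whatever limit the target platform documents.
check_size() {
  local file="$1" max_mb="$2" bytes limit
  [ -f "$file" ] || { echo "Missing file: $file" >&2; return 1; }
  # The arithmetic expansion strips the padding some wc versions add.
  bytes=$(($(wc -c < "$file")))
  limit=$((max_mb * 1024 * 1024))
  if [ "$bytes" -gt "$limit" ]; then
    echo "STOP: $file is $bytes bytes, limit is $limit" >&2
    return 1
  fi
  echo "OK: $bytes bytes (limit $limit)"
}
```

Calling it after the conversion makes the size limit a real stop condition rather than a number someone has to read.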

Why these flags matter

The important pieces:

-c:v libx264

H.264 is still the safest default for social uploads.

-pix_fmt yuv420p

This avoids the classic "works locally, fails elsewhere" problem with pixel formats.

-movflags +faststart

This moves metadata to the beginning of the file so web playback can start faster.

-vf "scale='min(1280,iw)':-2"

This caps the width at 1280 pixels and leaves narrower videos at their original width. The -2 makes the height follow the aspect ratio and stay even, which H.264 with yuv420p requires.

-crf 23

Good enough quality without creating a monster file.

Could I tune this per video? Yes.

Do I want to think about that every time I need to publish a 20-second AI demo? No.
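
To make the scale expression above concrete, here is the arithmetic as a small function. It is a sketch of the expected result, not FFmpeg's own code: `scaled_dims` is a hypothetical name, and the rounding is an approximation that agrees with FFmpeg for common aspect ratios.

```shell
# scaled_dims IN_WIDTH IN_HEIGHT
# Prints the WIDTHxHEIGHT expected from scale='min(1280,iw)':-2.
# Width is capped at 1280; height follows the aspect ratio and is rounded
# to a multiple of 2 (approximation of FFmpeg's rounding).
scaled_dims() {
  local iw="$1" ih="$2" w h
  if [ "$iw" -gt 1280 ]; then w=1280; else w="$iw"; fi
  # Round w*ih/iw to the nearest multiple of 2 using integer arithmetic.
  h=$(( (w * ih + iw) / (2 * iw) * 2 ))
  echo "${w}x${h}"
}

scaled_dims 1920 1080   # prints 1280x720
scaled_dims 640 360     # prints 640x360
```

A 1080p export is scaled down to 720p, while a 640x360 clip passes through at its original size.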

The verification step is the real skill

The conversion is only half the workflow.

The platform check matters more.

My rule now:

Do not trust "upload succeeded."
Trust only the composer state.

For an AI video post, I want to see:

video preview visible
upload progress completed
platform disclosure selected if needed
post button enabled
final published post shows the video

If the browser automation says the file was uploaded but the composer does not show the video, the workflow stops.

That sounds obvious, but this is where a lot of automation breaks.

The automation checks the API call or the file input state, not the state the user actually sees in the composer.

The skill's job is to keep that distinction explicit.

Making it agent-friendly

The next step is making the workflow easy for an AI agent to use.

That means the skill needs plain instructions:

## Agent Instructions

1. Run `./make-x-safe.sh <video>`.
2. Read the output path from stdout.
3. Confirm `ffprobe` shows:
   - codec_name=h264
   - pix_fmt=yuv420p
4. Check file size.
5. Upload through the real platform composer.
6. Verify visual preview before posting.
7. After posting, open the final URL and verify the video is embedded.

Notice what is missing:

No vague "make it work."

No "post when ready."

No giant prompt with 40 edge cases.

The skill gives the agent a small operating procedure with a clear stop condition.

That is much easier to trust.
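
Step 3 of the agent instructions can also be enforced rather than read by eye. The sketch below parses the `key=value` lines that the script's ffprobe call prints. `verify_probe` is a hypothetical helper, and treating empty input as "no video stream" is an assumption about how the check is wired up.

```shell
# verify_probe
# Reads ffprobe key=value lines on stdin (the format produced by
# -of default=noprint_wrappers=1) and enforces the two checks from step 3.
# Empty input, meaning no video stream was reported, also fails.
verify_probe() {
  local line codec="" pix=""
  while IFS= read -r line; do
    case "$line" in
      codec_name=*) codec="${line#codec_name=}" ;;
      pix_fmt=*)    pix="${line#pix_fmt=}" ;;
    esac
  done
  if [ "$codec" != "h264" ]; then
    echo "STOP: codec_name is '${codec:-missing}', expected h264" >&2
    return 1
  fi
  if [ "$pix" != "yuv420p" ]; then
    echo "STOP: pix_fmt is '${pix:-missing}', expected yuv420p" >&2
    return 1
  fi
  echo "OK: h264 / yuv420p"
}
```

It is meant to sit at the end of a pipe, for example `ffprobe -v error -select_streams v:0 -show_entries stream=codec_name,pix_fmt -of default=noprint_wrappers=1 "$OUTPUT" | verify_probe`.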

What changed

Before the skill:

Generate video.
Try upload.
Watch it fail.
Search old commands.
Re-encode.
Try again.
Hope the preview appears.

After the skill:

Generate video.
Run one command.
Check output.
Upload.
Verify preview.
Publish.
Verify final post.

The workflow is not glamorous.

It is better than glamorous: it is boring.

And boring is what I want from production media prep.

The bigger lesson

This is why I keep coming back to Terminal Skills.

A useful AI workflow is usually not one huge autonomous agent.

It is a set of small, documented, reusable capabilities:

prepare a video for upload
inspect a repo
generate thumbnails
validate a draft
resize images
check links
publish only after approval

Each skill removes one fragile piece of manual memory.

Each skill gives the agent a narrower job.

Each skill creates a cleaner definition of done.

That is the part I think a lot of AI tooling conversations miss.

The future is not just "agents can use tools."

The useful version is:

Agents can use well-defined skills with clear boundaries.

That is how the work becomes repeatable.

And for AI video uploads, repeatable beats clever every time.

If you build with AI agents, what is the workflow you keep fixing manually?

That is probably your next Terminal Skill.

Disclosure: I used AI assistance to draft and edit this article, then reviewed the workflow, commands, and claims before publishing.

Approval Gates for AI Agents: Draft Approval Is Not Publish Approval

Alex Shev — Wed, 27 May 2026 16:52:45 +0000

One of the easiest ways to make an AI agent dangerous is to give it a vague approval.

Not because the agent is malicious.

Because humans use words like "ok", "approved", and "ship it" casually.

In a normal conversation, that is fine. In an automated workflow, it can become a bug.

If an agent is drafting an article, writing a tweet, preparing a pull request, generating social posts, or touching an external platform, the question is not just:

Did the human approve this?

The better question is:

What exactly did the human approve?

That distinction matters more than most agent tooling discussions admit.

The small approval bug that becomes a big workflow problem

Imagine this simple flow:

An agent prepares 10 social posts.
The human says "looks good".
The agent publishes all 10 immediately.

Maybe that was correct.

Maybe it was not.

The human might have meant:

The copy looks good.

The agent interpreted it as:

Publish everything now.

That is not a model intelligence problem. It is a workflow design problem.

When an action leaves the local workspace, ambiguity gets expensive.

Publishing, emailing, commenting, deploying, charging a card, posting to social media, and modifying production systems should not depend on a loose interpretation of "ok."

They need explicit gates.

The approval types I separate now

For agent workflows, I like separating approval into at least four different meanings.

1. Approval to draft

This means:

Yes, prepare the thing.

The agent can research, outline, write, generate assets, create local files, and prepare a proposal.

But it cannot publish.

This is the safest default for content work.

For example:

Prepare a DEV.to draft about approval gates.

That should create a local markdown draft or an unpublished draft, depending on the workflow.

It should not post the article publicly.

2. Approval of the draft

This means:

The content direction is acceptable.

The human may be approving the argument, the structure, the tone, or the asset selection.

But that still does not automatically mean:

Publish it right now.

This is where many agent systems get sloppy.

Draft approval is not publish approval.

3. Approval to publish

This needs to be explicit.

The agent should know:

what platform
what asset or draft
whether to publish now or schedule
whether comments/replies/source attribution are included
whether the approval applies to one post or a batch

For example:

Publish this DEV.to article now.

or:

Post only the first X drafts, with the source as the first reply.

That is much safer than letting the agent infer a public action from a vague "ok."

4. Approval for automation reminders

This one is subtle.

A reminder or cron job should often prepare work, not perform external actions.

For example:

Every morning, find 3 candidate topics and ask me which one to use.

That is different from:

Every morning, publish a post.

The first one keeps the human in the loop.

The second one creates a recurring external action, which is much riskier.

Most teams should start with reminder-based automation before action-based automation.

Why this matters for CLI and agent workflows

CLI workflows make this even more important because agents can act fast.

A coding agent can edit files, run scripts, create branches, call APIs, deploy apps, write comments, and open browser sessions.

That speed is useful only if the boundaries are clear.

The workflow should define what the agent may do locally without asking and what requires a human gate.

For example, I am comfortable letting an agent do this freely:

inspect a repo
write a local draft
run tests
generate local assets
prepare a post
create a report
summarize findings

I want a stronger gate before this:

publish to DEV.to
post on X
send email
comment on Reddit
deploy to production
spend money
modify customer data

The difference is not whether the agent is capable.

The difference is whether the action is reversible, private, and low-risk.

Local work is cheap to revise.

External work creates a public or operational footprint.

A simple pattern: declare the gate in the task

One practical fix is to write the gate directly into the task.

Instead of:

Write an article about approval gates.

Use:

Prepare a local draft only. Do not publish. Return the file path and wait for explicit publish approval.

Instead of:

Make comments for Reddit.

Use:

Prepare comment drafts only. Do not post. Recommend the safest first set and wait for approval.

Instead of:

Deploy this.

Use:

Create a preview deployment first. Do not promote to production until I explicitly approve production.

This sounds boring, but boring is good here.

Good agent workflows are often just normal operational discipline written down clearly enough that the agent cannot guess wrong.

If the workflow supports metadata, make the gate machine-readable:

gate: draft_only
external_actions: false
next_approval_required: publish_to_devto
scope: single_article

That tiny block is not bureaucracy. It gives the agent a state it can report, verify, and refuse to exceed.
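
A script can read that block before doing anything external. This is a minimal sketch, assuming the gate is stored as plain `key: value` lines in a file. The helper names `gate_value` and `may_act_externally` are hypothetical.

```shell
# gate_value FILE KEY
# Prints the value of the first "KEY: value" line in a gate file.
gate_value() {
  sed -n "s/^$2:[[:space:]]*//p" "$1" | head -n 1
}

# may_act_externally FILE
# Succeeds only when the gate file explicitly says "external_actions: true".
# A missing file or a missing key counts as "no".
may_act_externally() {
  [ -f "$1" ] && [ "$(gate_value "$1" external_actions)" = "true" ]
}
```

The check fails closed: anything other than an explicit `true` keeps the agent in local-only mode, and `gate_value FILE next_approval_required` gives it the exact approval to ask for.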

The agent should report its current gate

The agent should also say what state the work is in.

For example:

Status: local draft only.
No external publication happened.
Next gate: human approval to publish.

or:

Status: posted.
Verified final URL.
No additional replies published.

This makes the workflow auditable.

It also prevents the human from having to infer what happened.

When agents are doing real work, "done" is not enough.

The agent should report:

what changed
where the artifact is
what was verified
what did not happen
what approval is needed next

That last part is important.

Good agents should not only complete tasks. They should make the boundary of the next task clear.
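
The five report items listed above can be given a fixed shape so that none of them is silently dropped. A small sketch, where `report_status` and the field labels are assumptions for illustration:

```shell
# report_status CHANGED ARTIFACT VERIFIED NOT_DONE NEXT_APPROVAL
# Prints the five-part report and refuses to print a partial one.
report_status() {
  local field
  if [ "$#" -ne 5 ]; then
    echo "report_status: expected 5 fields, got $#" >&2
    return 1
  fi
  for field in "$@"; do
    [ -n "$field" ] || { echo "report_status: empty field" >&2; return 1; }
  done
  printf 'Changed: %s\nArtifact: %s\nVerified: %s\nDid not happen: %s\nNext approval: %s\n' "$@"
}
```

Requiring the "did not happen" and "next approval" fields is the point: the report cannot be produced without stating the boundary.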

Batch work needs an even stronger gate

Batch publishing is where approval ambiguity gets especially risky.

If an agent prepares 25 comments, does approval mean:

publish all 25 now?
publish the safest 5?
publish one per day?
publish only after checking each target again?
publish drafts after human edits?

Those are very different actions.

For batch workflows, I like adding two fields:

Approval scope:
Cadence decision:

Example:

Approval scope: first 5 comments only
Cadence decision: publish today, one by one, stop on warning

or:

Approval scope: all 10 posts approved as drafts
Cadence decision: schedule one per day at 10 AM

That tiny bit of structure prevents a lot of mistakes.

It also gives the agent a concrete stop condition.
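
Here is one way an approval scope and a "stop on warning" cadence could look in a script. It is a sketch under stated assumptions: `publish_one` is a stand-in for whatever actually publishes an item, and `publish_batch` is a hypothetical wrapper around it.

```shell
# publish_one is a placeholder for the real publishing step (hypothetical).
# It should return non-zero on any warning, such as a rate limit.
publish_one() {
  echo "published: $1"
}

# publish_batch LIMIT
# Reads one item per line on stdin, publishes at most LIMIT of them in
# order, and stops at the first warning instead of continuing.
publish_batch() {
  local limit="$1" count=0 item
  while [ "$count" -lt "$limit" ] && IFS= read -r item; do
    if ! publish_one "$item"; then
      echo "STOP: warning on '$item' after $count published" >&2
      return 1
    fi
    count=$((count + 1))
  done
  echo "done: $count published, approval scope was $limit"
}
```

With 25 prepared comments and an approval scope of 5, `publish_batch 5 < comments.txt` (a hypothetical file) publishes five and leaves the other twenty untouched.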

This belongs inside skills

This is one reason I care about reusable agent skills.

A skill should not only say:

Here is how to perform the task.

It should also say:

Here is what requires approval.
Here is what can be done locally.
Here is how to verify the result.
Here is when to stop.

That is the difference between a tool and a workflow.

A tool gives the agent power.

A skill gives the agent operating rules.

It can encode approval gates as reusable policy, not just task steps.

For example, a publishing skill should define:

draft-only mode
review mode
publish mode
source/comment behavior
verification steps
rollback or correction process
rate-limit and spam-warning stop conditions

Without that, the agent has to infer the process from the conversation.

That is exactly where mistakes happen.

The rule I use

My current rule is simple:

If the action is external, public, paid, destructive, or hard to undo, the approval must name the action.

"Looks good" is enough for a draft.

It is not enough for publication.

"Ok" is enough to continue local work.

It is not enough to spend money, post publicly, email someone, or modify production.

For those actions, the approval should be explicit:

Publish this now.
Send this email.
Deploy to production.
Post these 5 comments.
Charge this card.

The goal is not to slow the agent down.

The goal is to make speed safe.

Final thought

The fix is not complicated: write the gates down.

Make the agent report which gate it is at.

And never let draft approval silently become publish approval.

I am collecting and building practical examples of this kind of agent workflow discipline at Terminal Skills: reusable skills that teach agents not only which tools to use, but how to work safely and repeatably.

Disclosure: I used AI assistance while drafting this article, then reviewed and edited it manually.

MCP Gave AI Agents Tools. A2A Gives Them Coworkers.

Alex Shev — Mon, 25 May 2026 15:30:15 +0000

Most AI agent conversations eventually run into the same question:

What happens when one agent is not enough?

A single agent can write code, search docs, run tests, call APIs, summarize files, and generate reports.

That already feels useful.

But real work rarely fits into one neat box.

A software project might need:

one agent to research requirements
one agent to write code
one agent to generate tests
one agent to review the result
one agent to deploy or monitor the system

A customer support workflow might need:

a billing agent
a technical support agent
a sales agent
a human handoff agent

A content workflow might need:

a research agent
a writing agent
an SEO agent
an editor
a publishing assistant

At that point, the interesting problem is no longer just:

Can an agent use a tool?

It becomes:

Can agents work with each other?

That is where the A2A protocol becomes interesting.

And that is why the a2a-protocol skill on Terminal Skills is worth looking at.

MCP vs A2A in one sentence

MCP gave agents tools.

A2A gives agents coworkers.

That is the simplest way I think about it.

MCP, or Model Context Protocol, is mostly about connecting an agent to tools, APIs, files, databases, search systems, and external data sources.

A2A, or Agent2Agent, is about connecting one agent to another agent.

The difference matters.

If an agent needs to query a database, call a calendar API, or inspect a repo, MCP is a good fit.

If an agent needs to delegate work to another autonomous agent with its own capabilities, state, task lifecycle, and output format, A2A is the better mental model.

A tool usually does one specific thing.

An agent can own a whole domain of work.

Why agents need a collaboration layer

Right now, many agent workflows are still stitched together with prompts.

Something like:

First research the topic.
Then write a draft.
Then review it.
Then create a final version.

That can work for small tasks.

But it gets messy as soon as the workflow becomes more serious.

What happens when research takes five minutes?
What happens when the writing agent needs structured data, not a paragraph?
What happens when the reviewer rejects the output?
What happens when the task needs human input?
What happens when the work should continue asynchronously?

At that point, a prompt chain is not enough.

You need something closer to a protocol.

The A2A model introduces concepts like:

an Agent Card for discovery
declared skills and capabilities
task lifecycle states
messages between agents
artifacts as outputs
streaming updates
push notifications
cancellation
structured data exchange

That sounds boring in the best possible way.

Boring protocols are what turn demos into infrastructure.

The Agent Card is the underrated part

One of the most useful ideas in A2A is the Agent Card.

An Agent Card is basically a description of what an agent is, where it lives, and what it can do.

It describes identity, capabilities, skills, endpoints, and auth without exposing the agent's internal tools, memory, or private state.

It can include:

the agent name
description
endpoint URL
version
capabilities
skills
input modes
output modes
authentication requirements

A standard discovery endpoint like this:

/.well-known/agent-card.json

may not sound exciting.

But it solves a real problem.

If agents are going to collaborate, they need to know who they are talking to.

A coding agent should be able to discover a code review agent.
A support router should be able to discover a billing agent.
A research agent should be able to discover a writer agent.

Without a discovery layer, every multi-agent workflow becomes custom glue.

With one, agents can start to behave more like services.
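
For a sense of scale, the sketch below writes an illustrative Agent Card to a file. Every value in it is hypothetical, and the field names are an assumption based on the list above, so check them against the version of the A2A specification you target. Authentication details are left out on purpose.

```shell
# write_agent_card FILE
# Writes an illustrative Agent Card. All values are made up, and the field
# names are assumptions to verify against the A2A specification.
write_agent_card() {
  cat > "$1" <<'EOF'
{
  "name": "code-review-agent",
  "description": "Reviews a diff and returns structured findings.",
  "url": "https://agents.example.com/code-review",
  "version": "0.1.0",
  "capabilities": { "streaming": true, "pushNotifications": false },
  "defaultInputModes": ["text/plain"],
  "defaultOutputModes": ["application/json"],
  "skills": [
    {
      "id": "review-diff",
      "name": "Review diff",
      "description": "Checks a unified diff for bugs and style issues.",
      "tags": ["code-review"]
    }
  ]
}
EOF
}
```

Served at the discovery path shown earlier, a file like this is what another agent would fetch before deciding whether to delegate a task.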

Example: a code pipeline made of agents

Imagine a simple software pipeline.

You do not want one giant agent doing everything.

Instead, you split the workflow:

Code writer generates the implementation.
Test writer creates tests.
Code reviewer checks both.
Orchestrator coordinates the flow.

Each agent can have a narrow responsibility.

The code writer does not need to know everything about test strategy.
The test writer does not need to own product requirements.
The reviewer does not need to write the initial implementation.

This makes the workflow easier to inspect.

If the tests are weak, improve the test agent.
If the review is shallow, improve the review agent.
If the handoff fails, improve the orchestrator.

That is much cleaner than debugging one giant prompt with five hidden roles inside it.

Example: a customer support router

A2A also makes sense for support.

A support router agent receives a customer message:

I was charged twice and my account still says inactive.

The router should not try to solve everything itself.

It can delegate:

billing question -> billing agent
account activation -> technical agent
refund eligibility -> policy agent
unclear or sensitive issue -> human handoff

The router's job is classification, coordination, and response assembly.

The specialized agents own their own domains.

This is how real teams work.

Not everyone does everything.

The interesting part is making agents follow that same pattern.
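
The delegation table above can be sketched as a tiny router. Keyword matching stands in for real classification here, and the function name `route_message` and the keyword lists are assumptions for illustration. The agent names mirror the list above.

```shell
# route_message TEXT
# Prints one target agent per line for a customer message.
# Keyword matching is a placeholder for real classification.
route_message() {
  local text routed=0
  text=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$text" in
    *charged*|*invoice*|*billing*) echo "billing agent"; routed=1 ;;
  esac
  case "$text" in
    *inactive*|*activation*) echo "technical agent"; routed=1 ;;
  esac
  case "$text" in
    *refund*) echo "policy agent"; routed=1 ;;
  esac
  # Anything the keywords do not cover goes to a person.
  [ "$routed" -eq 1 ] || echo "human handoff"
}
```

The example message from this section, "I was charged twice and my account still says inactive.", routes to both the billing agent and the technical agent, which is why the router also needs to assemble one response from several answers.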

Where Terminal Skills fits

This is where Terminal Skills becomes useful.

The a2a-protocol skill is not just a link to a spec.

It gives an AI coding agent a practical operating guide for building A2A systems:

when to use A2A
how to define an Agent Card
how to build a server
how to build a client
how to handle task states
when to use streaming
how to orchestrate multiple agents
how A2A differs from MCP
what implementation guidelines to follow

That matters because agents often fail not from lack of intelligence, but from lack of workflow structure.

A model may understand A2A in general.

But a skill gives it a repeatable path:

terminal-skills install a2a-protocol

After that, the agent has a focused reference for the task instead of relying on vague memory or generic web knowledge.

This is the bigger idea behind Terminal Skills:

Agents do not just need more context. They need reusable operational instructions.

A2A is not a replacement for MCP

It is tempting to turn every new protocol into a winner-takes-all comparison.

I do not think that is the right framing here.

A2A and MCP solve different problems.

A practical way to separate them:

| Need | Better fit |
| --- | --- |
| Call a tool or API | MCP |
| Read from a data source | MCP |
| Give an agent access to external systems | MCP |
| Delegate a task to another agent | A2A |
| Discover agent capabilities | A2A |
| Coordinate long-running agent work | A2A |

MCP is agent-to-tool.

A2A is agent-to-agent.

In real systems, you probably want both.

An agent might use MCP internally to access tools, while exposing an A2A interface so other agents can delegate work to it.

That combination is where things get interesting.

What makes this useful for developers

For developers, the value is not the protocol acronym.

The value is fewer fragile workflows.

Instead of hardcoding every handoff, you can start thinking in terms of:

discoverable agents
declared capabilities
explicit task states
structured messages
cancellable work
streaming progress
artifacts as outputs

That is closer to software engineering than prompt engineering.

And honestly, that is the point.

The next generation of agent systems will not be built from one giant prompt.

They will look more like networks of specialized workers with clear contracts between them.

Some of those workers will call tools.

Some will call other agents.

The quality of the system will depend less on one model being magical and more on whether the handoffs are well designed.

The small shift that matters

The first wave of AI agents was about giving models tools.

That was a huge step.

But tools are not the whole story.

If agents are going to handle real work, they need coordination.
They need delegation.
They need discovery.
They need contracts.
They need ways to say:

I can do this task.
Here is how to call me.
Here is what I return.
Here is how I report progress.
Here is how I fail.

That is why A2A is worth watching.

Not because every project needs a multi-agent architecture today.

Most do not.

But because the direction is clear:

Agents are moving from isolated assistants to interoperable workers.

MCP gave agents tools.

A2A gives them coworkers.

And skills like a2a-protocol help agents learn how to build that future without starting from a blank prompt.

If you want to try it, the skill is here:

https://terminalskills.io/skills/a2a-protocol

Install:

terminal-skills install a2a-protocol

Then ask your coding agent to build one tiny A2A loop: Agent Card -> server -> client -> delegated task.

Start small.

One focused agent.
One clear capability.
One verifiable handoff.

That is usually where the real workflow begins.

From Blender Demos to Agent Toolchains: Why Terminal Skills Matter

Alex Shev — Tue, 19 May 2026 04:35:35 +0000

Most AI + Blender demos still follow the same pattern:

Ask the model for a prompt.
Generate a scene or script.
Hope the result looks close enough.
Try again when it breaks.

That can be useful for experiments.

But it is not how real creative work usually gets done.

Blender is not just an image generator. It is a full production environment with scenes, objects, cameras, lights, materials, modifiers, animation timelines, render settings, exporters, and a Python API.

So the interesting question is not:

Can an AI agent describe a Blender workflow?

The better question is:

Can an AI agent actually operate Blender as part of a repeatable toolchain?

That is where terminal-native skills become interesting.

The gap between “knowing Blender” and using Blender

Modern AI models can explain Blender concepts very well.

They can tell you what a bevel modifier does.
They can describe three-point lighting.
They can write a Python script that creates a simple scene.
They can explain camera focal lengths, materials, render engines, and file exports.

But knowing the tool is not the same as reliably using the tool.

If you ask an agent to create a product render in Blender, a lot can go wrong:

the camera may not frame the object
the lights may be too weak or too harsh
the material names may be inconsistent
the render settings may be missing
the script may assume the wrong scene state
the output file may never be verified
the workflow may work once and fail tomorrow

That is the difference between a demo and a production workflow.

A demo can be impressive once.

A workflow needs to be repeatable.

Why Blender is a good test case for AI agents

Blender is creative, but it is also deeply scriptable.

That makes it a useful benchmark for agent workflows.

It is not enough for the agent to say something plausible. At the end, there should be an actual artifact:

a .blend file
a rendered image
an animation preview
an exported asset
a contact sheet
a set of named cameras
a reusable scene setup

Either the output exists or it does not.

That makes Blender less forgiving than a text-only task, and that is exactly why it is valuable.

It forces the agent to move from language into execution.

The role of Terminal Skills

Terminal Skills is an open-source catalog of skills for AI agents.

The idea is simple:

Agents do not just need more prompts. They need reusable operational workflows.

A skill can teach an agent how to perform a specific type of work:

when to use the workflow
what inputs are expected
which commands or scripts matter
what conventions to follow
how to verify the result
what failure modes to avoid
what output should be returned

That is different from just giving the agent a tool.

A tool gives the agent capability.

A skill gives the agent a path.

For Blender, that path matters a lot.

From GUI work to agent-operable workflows

Blender has a powerful GUI, and artists should absolutely use it.

But a GUI is not always the best interface for an AI agent.

Agents work best when they can:

run a command
inspect files
read logs
modify scripts
verify outputs
repeat the process

That is why terminal-native workflows are such a natural fit.

A terminal workflow gives the agent a clean feedback loop:

intent → command/script → output → verification → next step

Instead of guessing inside a visual interface, the agent can perform concrete operations and check whether they worked.

For example, a Blender skill might help an agent:

create a clean scene setup
generate camera variants
apply consistent material conventions
create lighting presets
render previews
export assets
validate that the output file exists
return a short summary of what changed

The human still controls taste and direction.

The agent handles the repeatable production layer.

What belongs inside a Blender skill

A useful Blender skill is not just a prompt template.

It should behave more like a small operating manual for the agent.

It should define:

what the workflow is for
what inputs are required
which files the agent may create or modify
which commands or scripts should be run
what naming conventions to follow
what output artifacts must exist
how to verify those artifacts
what common failure modes to check before reporting success

For example, instead of giving the agent a vague request like this:

Make a Blender product scene.

A skill can define a stronger contract:

Create or update the scene.
Save the .blend file.
Render a preview.
Confirm the preview file exists.
Return the paths and a short summary of what changed.

That contract is the important part.

It gives the agent a definition of done that is stronger than “the response sounds plausible.”
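The contract above can be checked mechanically. Here is a minimal sketch, assuming the skill knows which artifact paths it promised. The helper name check_done and the example paths are hypothetical, not part of any real skill.

```shell
#!/bin/sh
# Sketch: report whether each promised artifact exists and is non-empty.
# check_done is a hypothetical helper name used only for illustration.
check_done() {
    missing=0
    for path in "$@"; do
        if [ -s "$path" ]; then
            echo "OK       $path"
        else
            echo "MISSING  $path"
            missing=1
        fi
    done
    # Non-zero exit if anything promised is absent, so callers can stop.
    return "$missing"
}
```

An agent could end its run with something like check_done scenes/product.blend renders/preview.png and only report success when the exit status is zero.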

The skill is the interface

A lot of agent tooling conversations focus on connectors.

Can the agent access this app?
Can it call this API?
Can it run this command?
Can it control this environment?

Those questions matter.

But access is not the whole workflow.

If an agent can run Blender from the terminal, that is useful. But the more important layer is the operating pattern around that access:

what should the agent do first?
what should it avoid touching?
how should files be named?
when should it render a preview?
what should it check before saying done?
what should it hand back to the human?

That is why I like thinking about skills as interfaces for work.

They make the task boundary explicit.

The agent is not just dropped into a powerful tool and told to figure it out.

It gets a workflow it can execute, inspect, and repeat.

A better definition of done

For many AI tasks, “done” is too fuzzy.

The model stops writing, so the interaction feels complete.

But production work needs a stronger definition.

For a Blender workflow, “done” might mean:

the .blend file was saved
the preview render exists
the output path was returned
the scene contains named cameras and lights
the agent reports what changed
the human has something concrete to review

This is where terminal-native skills become especially useful.

They can push the agent toward evidence-based completion.

Not just:

Here is a script you could run.

But:

I ran the workflow, created these artifacts, checked these outputs, and here is what needs review.

That difference is small in a demo.

It is huge in real work.

Why this matters for reproducibility

One-off AI outputs can be impressive, but they are hard to build on.

If the process is hidden inside a long prompt and a lucky generation, it is difficult to answer basic questions:

Can we run this again next week?
Can another agent follow the same process?
Can we change one input and keep the rest consistent?
Can we debug why the output failed?
Can we tell which step created which artifact?

Terminal-native workflows are not glamorous, but they help with all of that.

Commands can be rerun.
Files can be inspected.
Logs can be read.
Outputs can be checked.
Conventions can be documented.

A skill wraps those pieces into something the agent can reuse.

That is the real value.

Not magic.

Repeatability.

Why this matters beyond Blender

Blender is just one example.

The same pattern applies to many agent workflows:

video processing
data cleanup
documentation updates
screenshot generation
test automation
asset exports
report generation
deployment checks

In each case, the problem is not only whether the model understands the task.

The problem is whether the agent has a reliable way to perform the task.

That usually requires more than a prompt.

It requires operational knowledge:

steps
defaults
constraints
checks
outputs
failure handling

That is what skills are good at packaging.

Skills make agent work more auditable

One underrated benefit of terminal-native skills is auditability.

When an agent uses a repeatable workflow, it can leave evidence:

which files were created
which commands were run
which checks passed
where the output was saved
what still needs human review

That makes agent work easier to trust.

Not because the agent becomes magically perfect.

Because the workflow becomes visible.

For creative work, that matters.

A human should not have to guess whether the agent actually rendered the scene, exported the asset, or just stopped after writing a script.

The output should be inspectable.
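One way to leave that kind of evidence is a plain-text log of every command the agent runs. This is a rough sketch, not how Terminal Skills does it. The names run_logged and AUDIT_LOG are assumptions made up for this example.

```shell
#!/bin/sh
# Sketch: run a command and append what happened to an audit log.
# AUDIT_LOG and run_logged are hypothetical names for illustration.
AUDIT_LOG="${AUDIT_LOG:-./agent-audit.log}"

run_logged() {
    status=0
    "$@" || status=$?
    # One line per command: UTC timestamp, exit status, then the command.
    printf '%s exit=%s cmd=%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$status" "$*" >> "$AUDIT_LOG"
    # Pass the original exit status through so failures stay visible.
    return "$status"
}
```

A reviewer can then read the log to see which commands ran and which ones failed, instead of trusting the agent's summary.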

The practical takeaway

If you are building with AI agents, do not only ask:

What tools can my agent access?

Ask:

What repeatable workflows can my agent follow?

Blender makes this obvious because the final result is concrete.

A good agent workflow should not end with “here is some code you could run.”

It should end with an artifact, a check, and a clear next step.

That is the shift Terminal Skills is designed around:

less one-off prompting
more reusable workflows
less hidden improvisation
more executable, verifiable work

Agents do not need to become artists.

But they can become much better production assistants.

And for tools like Blender, that is already a very useful place to start.

The bigger point

Blender is useful here because it makes the gap visible.

If the render file does not exist, the workflow failed.
If the camera misses the object, the workflow failed.
If the agent cannot explain what changed, the workflow is hard to trust.

That same lesson applies to other agent work too.

Terminal Skills is about turning repeatable work into reusable operational knowledge: not just what the agent should know, but how it should act, check itself, and report the result.

If you want to explore the catalog, Terminal Skills is open-source and available at terminalskills.io.

AI assistance was used while drafting this article. The final structure, edits, and publishing decisions are human-reviewed.

How AI Agents Can Use Blender Like a Real Tool, Not Just Generate Prompts

Alex Shev — Sat, 16 May 2026 13:44:21 +0000

Most AI demos around 3D creation still have the same shape:

Write a prompt.
Hope the model understands the scene.
Regenerate until it looks close enough.

That is not a workflow.

That is trial and error with prettier tooling.

Trial and error can be useful for concept art.

But Blender is not just a visual output machine.

Blender is a real production tool. It has scenes, objects, modifiers, materials, cameras, lighting, animation timelines, render settings, file exports, and a Python API that can control a large part of it.

So when an AI agent works with Blender, the goal should not be “make a nice image from a prompt.”

The goal should be:

Let the agent perform repeatable 3D work inside Blender.

That difference matters.

Because prompts are vague.

Tools are executable.

The problem: agents “know” Blender, but they do not always operate Blender well

A capable AI model can explain Blender concepts all day:

what a bevel modifier does
how materials work
how to set up a camera
how to create a simple animation
how to use Python inside Blender

That knowledge is useful.

But knowledge is not the same as a reliable workflow.

If you ask an agent:

Create a cinematic product render in Blender.

it may generate a decent Python script.

Or it may forget a render setting.

Or create a camera that points in the wrong direction.

Or use objects with inconsistent scale.

Or produce something that only works in one local Blender version.

Or fail silently because the script assumes a scene state that does not exist.

This is the gap between “the model knows Blender” and “the agent can actually use Blender as a tool.”

For real work, that gap gets expensive fast.

A better mental model: Blender skills as reusable workflows

Instead of asking the agent to invent a Blender workflow from scratch every time, give it a reusable skill.

A skill is not just a longer prompt.

A good skill packages the operational knowledge around a task:

what the agent should do
what files or inputs it should expect
which commands or scripts to run
what quality checks matter
what output should be produced
what common failure modes to avoid

For Blender, that might mean a skill for:

creating a clean scene setup
adding materials consistently
generating camera views
setting lighting presets
rendering thumbnails
exporting assets
automating turntable animations
building simple procedural objects
validating that the render actually exists

The agent still reasons.

But it does not need to rediscover the workflow every time.

It can follow a known path.

Example: scene setup should be boring

Scene setup is one of those tasks that sounds simple until you repeat it often.

You may want the same basics every time:

clear the default cube
set units
create a camera
add a key light and fill light
set world color
configure resolution
set render engine
save the file
optionally render a preview

This is exactly the kind of thing that should not depend on a fresh model improvisation.

A Blender skill can turn that into a repeatable operation:

Use the Blender scene setup skill.
Create a 16:9 product render scene.
Use a dark studio background.
Add camera, key light, fill light, and ground plane.
Return the .blend file and preview render.

Now the agent has a workflow boundary.

It knows the task is not “be creative forever.”

It knows the expected output.

It knows what success looks like.

That is a much better interface than a vague creative prompt.

Example: materials are easier when the agent has conventions

Materials are another good case.

If you ask an agent to “make it look premium,” it can go in many directions.

Sometimes that is fine.

But production work usually needs conventions:

use named materials
avoid random node spaghetti
keep roughness/metallic values sane
separate glass, metal, plastic, rubber, and emission materials
make assets readable under different lighting
document what was created

A skill can encode those conventions.

Then the agent can apply them consistently instead of guessing from scratch.

This is especially useful when multiple agents or multiple projects touch the same assets.

The output becomes easier to review.

And easier to fix.

Example: camera automation is not just aesthetics

Camera placement is one of the quickest ways to make a 3D result look broken.

The scene can be good, but if the camera is inside an object, too far away, pointed at the wrong target, or using the wrong focal length, the render fails as a deliverable.

A reusable camera skill can define practical defaults:

frame the selected object
use a target empty
set focal length by shot type
create front, side, top, and hero views
verify the object is visible
save camera names clearly
render contact sheets for review

That gives the agent a real operation:

Generate 4 review cameras for this model and render a contact sheet.

Not:

Make the camera look good.

The second version is subjective.

The first version is work.

Why terminal-based skills fit Blender well

Blender is not only a GUI app.

It can run headless.

It can execute Python scripts.

It can render from the command line.

That makes it a strong fit for agent workflows.

An agent can:

prepare a script
run Blender in background mode
generate or modify a scene
render a preview
inspect whether output files exist
return the result for review

A small workflow might look like this:

blender -b template.blend --python scripts/setup_scene.py -- --output renders/preview.png
test -f renders/preview.png

The agent runs Blender headless, lets the script create or update the scene, checks that the preview render exists, and returns the image for review.

That is a much stronger loop than “generate something and hope it worked.”
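The check can be made a little stricter. A plain test -f passes for an empty file, and it also passes for a stale render left over from an earlier run. Here is a small sketch that guards against both. The helper name run_and_verify is hypothetical.

```shell
#!/bin/sh
# Sketch: a run counts as successful only if the command exits 0 AND
# writes a fresh, non-empty output file.
# run_and_verify is a hypothetical helper name.
run_and_verify() {
    output=$1
    shift
    # Remove any stale file so an old render cannot pass the check.
    rm -f "$output"
    if ! "$@"; then
        echo "FAILED: command exited non-zero" >&2
        return 1
    fi
    # -s is stricter than -f: the file must exist and be non-empty.
    if [ ! -s "$output" ]; then
        echo "FAILED: $output missing or empty" >&2
        return 1
    fi
    echo "OK: $output"
}
```

With the command from above, the call would look like run_and_verify renders/preview.png blender -b template.blend --python scripts/setup_scene.py -- --output renders/preview.png.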

This is exactly where terminal skills become useful.

They give the agent a practical bridge between natural language and repeatable execution.

Instead of treating Blender as a mystical creative box, the agent treats it like a real toolchain.

That is the upgrade.

The key is not replacing artists

This is not about replacing 3D artists.

It is about removing repetitive setup work and making agent output easier to trust.

Artists still make judgment calls:

composition
style
taste
realism
brand fit
final polish

But agents can help with the mechanical layers:

generate scene variants
batch render previews
create consistent lighting setups
prepare template files
automate exports
produce review contact sheets
validate file outputs

That is useful because it gives humans more time for the parts where taste matters.

The agent handles the repeatable work.

The human reviews, directs, and improves.

Prompting is still part of the workflow — just not the whole workflow

Prompts are not bad.

They are still how we describe intent.

The issue is pretending the prompt is the entire production system.

For Blender work, the better structure is:

Intent → Skill → Execution → Output → Review

The prompt describes the goal.

The skill defines the workflow.

The tool executes the work.

The output proves what happened.

The review decides what comes next.

That loop is much healthier than endless prompt regeneration.

What this unlocks

When agents use Blender through reusable skills, a lot of workflows become more realistic:

product mockups
simple 3D thumbnails
procedural scenes
animation previews
asset cleanup
batch rendering
format conversion
lighting tests
camera studies
social media visuals

None of this requires the agent to be magically perfect.

It requires the agent to have a reliable way to act.

That is the point.

The future of AI agents in creative tools will not just be bigger models writing prettier prompts.

It will be agents using real tools through small, reliable workflows.

Blender is one of the best places to see that shift.

If you are building agent workflows around tools like Blender, the question is not only:

Can the model describe the task?

The better question is:

Can the agent perform the task, check the output, and repeat it tomorrow?

That is where skills matter.

And that is why AI agents should use Blender like a real tool — not just another prompt box.

I am building and collecting practical AI agent skills at Terminal Skills, including workflows for tools like Blender, FFmpeg, and other command-line-first creative/dev tools.

If you are experimenting with agents that need to do real work instead of just generate text, the useful shift is to think in skills, not only prompts.

AI assistance was used while drafting this article. The final structure, edits, and publishing decisions are human-reviewed.

Your AI Agent Does Not Need More Context. It Needs a Smaller Workflow.

Alex Shev — Wed, 13 May 2026 22:38:18 +0000

A lot of AI agent workflows are becoming expensive for a very boring reason:

We keep giving the agent too much context and not enough direction.

The usual pattern looks like this:

Here is my whole repo.
Here are 12 tools.
Here are 9 docs.
Here is the bug.
Please figure it out.

Sometimes it works.

But it is also how you end up with huge token usage, messy tool calls, slow runs, and an agent that spends half the session rediscovering what a human already knows.

More context feels safer.

In practice, it often makes the workflow worse.

The problem is not context. The problem is unfiltered context.

AI agents do need context.

They need the right files, the right constraints, the right examples, and the right definition of done.

What they do not need is every possible thing that might be relevant.

That is where many workflows go wrong.

A small task becomes a giant investigation because the agent has no boundary.

For example, imagine asking an agent to update a pricing component.

Bad version:

Read the app and update the pricing page.

Better version:

Update the pricing card component.
Only inspect files under components/pricing and app/pricing.
Do not change billing logic.
Run the component test and TypeScript check.
Return a short summary plus changed files.

The second prompt is not just shorter.

It is a smaller workflow.

That is the part that matters.
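A boundary like this can also be checked after the fact. The sketch below reads changed file paths and prints any that fall outside the two directories named in the prompt. The helper name out_of_scope is an assumption for illustration.

```shell
#!/bin/sh
# Sketch: read changed file paths on stdin and print any that fall
# outside the allowed directories. out_of_scope is a hypothetical
# helper; the two prefixes come from the example prompt.
out_of_scope() {
    while IFS= read -r file; do
        case "$file" in
            components/pricing/*|app/pricing/*) ;;   # in scope
            *) echo "$file" ;;                       # outside the boundary
        esac
    done
}
```

In a Git repo, git diff --name-only | out_of_scope would list every changed file the agent was not supposed to touch. Empty output means it stayed inside the boundary.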

Tools make this easier and harder

MCP and other tool protocols are useful because they give agents cleaner access to external systems.

That is a real improvement.

But tool access also creates a new problem:

The agent can now search more, read more, call more APIs, and collect more context than it actually needs.

A connected agent is powerful.

A connected agent with no workflow boundary is expensive.

This is why I think a lot of teams are asking the wrong question.

The question is not only:

What tools can my agent access?

The better question is:

What is the smallest reliable workflow this task needs?

That one question changes how you design the agent.

A workflow beats a giant prompt

When I say “workflow,” I mean a repeatable operating pattern:

when to use it
which files or tools are in scope
which files or tools are out of scope
what steps to follow
what checks to run
what output format to return
what should trigger a human handoff

This is different from a one-off prompt.

A prompt is a request.

A workflow is a habit.

And agents need good habits more than they need another 2,000 words of instructions.

Example: debugging without reading the universe

Here is a simple debugging workflow I keep in mind all the time.

Instead of telling the agent:

Debug this issue.

I want the agent to follow a narrow loop:

1. Reproduce the error.
2. Identify the smallest failing command or test.
3. Inspect only the files directly involved.
4. Make one focused change.
5. Re-run the failing check.
6. Stop if the fix requires product judgment.

That workflow does two useful things.

First, it keeps the agent from wandering through unrelated code.

Second, it creates a clear stopping point.

A lot of agent waste happens because there is no stopping point.

The agent keeps searching, summarizing, and patching because “done” was never defined.

Example: code review with a smaller surface area

The same idea applies to code review.

A vague agent review sounds like this:

Review this PR.

That can produce a long list of generic comments.

A smaller workflow is better:

Review only for:
- data loss risks
- auth or permission mistakes
- broken edge cases
- missing tests for changed behavior

Ignore style unless it affects correctness.
Return only high-confidence findings.

This turns the agent from a noisy reviewer into a useful filter.

It also saves attention.

The goal is not to make the agent say more.

The goal is to make the agent say fewer, better things.

This is where skills help

This is why I like packaging repeatable workflows as skills.

A skill can tell the agent:

For this kind of task, use this workflow.
Use these tools.
Avoid these traps.
Verify with these checks.
Return this output.

That is much more useful than repeatedly writing giant prompts.

For example, a media-processing skill could teach the agent the standard FFmpeg workflow.

A deployment skill could teach it the exact deploy and verification steps.

A code-review skill could teach it what kinds of issues matter and what kinds of comments to ignore.

A security skill could tell it when to stop and ask for human approval.

The skill is not magic.

It is just reusable judgment.

And reusable judgment is exactly what most agent setups are missing.

The smallest workflow test

Before giving an agent a task, I like to ask:

What is the smallest workflow that can finish this safely?

Then I reduce the task until the answer is clear.

That usually means:

fewer files
fewer tools
fewer open-ended instructions
more explicit checks
a clearer definition of done

This sounds less impressive than “agent with full repo access and every tool connected.”

But it works better.

Small workflows are easier to test.

Small workflows are easier to repeat.

Small workflows are easier to trust.

My rule of thumb

If an agent keeps burning tokens, I do not immediately reach for a better model.

I ask:

Is the task too broad?
Did I give it too many tools?
Did I define done?
Did I tell it what not to inspect?
Can this become a reusable skill?

Most of the time, the fix is not more intelligence.

It is less ambiguity.

Why this matters

AI agents are getting better fast.

But better models will not remove the need for workflow design.

If anything, stronger agents make workflow design more important because they can do more damage, spend more tokens, and move faster in the wrong direction.

The next layer of useful agent work is not just bigger context windows.

It is smaller, clearer, reusable workflows.

That is what I am collecting at Terminal Skills: practical examples of skills that give agents narrower, repeatable ways to do real work.

https://terminalskills.io

AI Receptionist Cost in 2026: How to Calculate ROI for Small Business Automation

Alex Shev — Thu, 07 May 2026 23:43:57 +0000

This is a practical breakdown for builders, consultants, and operators thinking about AI phone automation for small businesses.

The goal is not to hype “AI receptionists,” but to show what actually drives cost: call volume, integrations, booking workflows, follow-up, and missed-call recovery.

Small-business owners are not asking whether AI can answer the phone anymore. They are asking what an AI receptionist costs, what is included, and whether it pays for itself faster than hiring another front-desk person.

The honest answer is that AI receptionist cost depends on what you expect the system to do. A basic voice bot that takes messages is one category. A real AI employee that answers calls, texts missed callers, books appointments, updates your CRM, and follows up with leads is a different investment.

This guide breaks down AI receptionist pricing in practical terms so you can compare monthly cost against recovered calls, booked jobs, after-hours leads, and staff time saved.

What Affects AI Receptionist Cost?

Most pricing differences come from capability, not from the word “AI.” A cheap system may answer calls but fail when the caller asks a real question. A stronger system understands your business rules and turns conversations into outcomes.

The main drivers are:

call volume and whether pricing is per minute or flat monthly
voice quality and how natural the conversation feels
whether it can text, email, and follow up after the call
calendar, CRM, payment, and dispatch integrations
setup work: scripts, FAQs, service areas, offers, and escalation rules
reporting on missed calls, booked appointments, and revenue impact

Basic AI Answering vs. Full AI Receptionist

Basic call handling

Entry-level tools are useful if you only need a greeting, simple routing, or voicemail replacement. They are usually cheaper, but they often stop at message-taking. For businesses that depend on booked appointments, that is not enough.

Full AI receptionist

A full AI receptionist behaves more like a trained front-desk employee. It answers common questions, qualifies the lead, checks availability, books the next step, sends confirmations, and escalates edge cases. That is where ROI usually appears.

If an AI receptionist only saves labor, the math is decent. If it captures leads that were previously lost, the math gets much better.

How to Calculate ROI

Use a simple model before buying anything. Start with the calls you already receive, not fantasy traffic.

Ask:

How many calls do you miss each week?
How many after-hours calls go to voicemail?
What percentage of callers become booked appointments?
What is one booked job or consultation worth?
How much staff time goes into answering repeat questions?

If the AI receptionist recovers just 5 to 10 extra leads per month and each booked customer is worth hundreds or thousands of dollars, the system can pay for itself quickly.

This is why phone-heavy businesses often see ROI faster than companies with low call volume.
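The questions above reduce to simple arithmetic. Here is a rough sketch of that math in whole dollars. The function name monthly_net and every number passed to it are hypothetical examples, not benchmarks.

```shell
#!/bin/sh
# Sketch: monthly net gain from recovered leads, in whole dollars.
# monthly_net is a hypothetical helper; all inputs are example values.
monthly_net() {
    recovered_leads=$1    # extra leads captured per month
    booking_rate_pct=$2   # percent of those leads that book
    value_per_booking=$3  # revenue from one booked job
    monthly_cost=$4       # monthly cost of the AI receptionist
    recovered=$(( recovered_leads * booking_rate_pct * value_per_booking / 100 ))
    echo $(( recovered - monthly_cost ))
}
```

For example, monthly_net 8 50 600 1500 prints 900: eight recovered leads at an assumed 50% booking rate and $600 per job bring in $2,400 against a $1,500 monthly cost. A negative result means the system is not yet paying for itself at that volume.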

Where Small Businesses Usually Overpay

Some businesses overpay for software that looks advanced but does not connect to operations. Others underpay for a cheap bot, then still need staff to clean up every conversation.

The goal is not the lowest monthly invoice. The goal is the lowest cost per qualified appointment.

Before choosing a tool, ask whether it can handle your actual calls:

pricing questions
service-area checks
emergency routing
appointment changes
lead qualification

If it cannot, the “savings” may disappear into manual cleanup.

Best Fit Businesses

AI receptionist systems tend to fit best when the phone is directly connected to revenue or operations.

Good examples include:

home service companies with technicians in the field
med spas, clinics, salons, and appointment-based local businesses
professional services that miss calls during client work
restaurants and hospitality businesses with repetitive phone questions
any company paying for Google Ads, Local Services Ads, or social traffic

Frequently Asked Questions

How much does an AI receptionist cost for a small business?

Pricing varies widely. Simple AI answering tools may be relatively low-cost, while full-service implementations with scheduling, CRM integration, follow-up, and reporting cost more.

Many full-service AI receptionist implementations can land around $1,000–$3,000 per month, depending on scope.

The real cost depends on call volume, voice quality, scheduling, CRM integrations, and whether the system can book appointments instead of only taking messages.

Is an AI receptionist cheaper than hiring a human receptionist?

Usually yes, especially when the goal is 24/7 coverage. The cost of a full-time receptionist includes salary, payroll taxes, benefits, training, management, and coverage gaps.

An AI receptionist can provide round-the-clock call coverage and lead capture for a predictable monthly cost.

What gives an AI receptionist the fastest ROI?

The fastest ROI usually comes from recovering missed calls, booking after-hours leads, reducing voicemail leakage, and following up instantly with people who would otherwise call a competitor.

The Bottom Line

An AI receptionist should be judged by booked outcomes, not novelty.

If it answers calls, responds after hours, follows up instantly, and gets more prospects onto the calendar, it is not just a phone tool. It is a revenue-protection layer.

Originally published on AIEmployees:
https://aiemployees.us/blog/ai-receptionist-cost-small-business-2026

How I Built a CLI Skill to Batch-Process YouTube Shorts

Alex Shev — Thu, 07 May 2026 23:38:39 +0000

Last month, I had to process 16 YouTube Shorts.

Trim intros. Normalize audio. Add watermarks. Export multiple formats. Generate thumbnails.

Doing that manually in Premiere would have taken me most of an afternoon.

So I built a CLI skill instead.

It took about 2 hours to put together. On my machine, the batch itself finished in under 3 minutes once everything was set up.

This is the kind of repeatable workflow I think of as a Terminal Skill: small, documented, reusable automation that turns a messy manual task into one command.

Here’s the exact structure I used.

What I mean by a CLI skill

For me, a CLI skill is a reusable shell workflow with:

input validation
sensible defaults
predictable output
error handling
lightweight docs

Instead of retyping a long FFmpeg command every time, I run one script and get the same result every time.

# Instead of this:
ffmpeg -i input.mp4 -ss 00:00:02 -to 00:00:35 -vf "scale=1080:1920" -af "loudnorm=I=-14" -c:v libx264 -preset fast output.mp4

# I run this:
./process-short.sh input.mp4

That difference sounds small, but it removes the part that always breaks in real work: remembering the flags, the order, and the output steps.

Step 1: List the manual steps first

Before I wrote anything, I wrote down the full workflow:

trim the first 2 seconds
cut after 35 seconds
scale to 1080×1920
normalize audio to -14 LUFS
add watermark
export MP4
export WebM
generate a thumbnail

That gave me a real pipeline instead of a vague automation idea.

Step 2: Build one script that does the boring work

#!/bin/bash
set -euo pipefail

TRIM_START="00:00:02"
TRIM_END="00:00:35"
RESOLUTION="1080:1920"
AUDIO_TARGET="-14"
WATERMARK="./assets/watermark.png"
THUMB_TIME="00:00:05"

INPUT="${1:?Usage: process-short.sh <input.mp4>}"

if [[ ! -f "$INPUT" ]]; then
    echo "Error: File '$INPUT' not found." >&2
    exit 1
fi

BASENAME=$(basename "$INPUT" .mp4)
OUTDIR="./output/${BASENAME}"
mkdir -p "$OUTDIR"

ffmpeg -y -ss "$TRIM_START" -to "$TRIM_END" -i "$INPUT" \
    -vf "scale=${RESOLUTION}" \
    -af "loudnorm=I=${AUDIO_TARGET}" \
    -c:v libx264 -preset fast -crf 23 \
    "${OUTDIR}/trimmed.mp4"

if [[ -f "$WATERMARK" ]]; then
    ffmpeg -y -i "${OUTDIR}/trimmed.mp4" -i "$WATERMARK" \
        -filter_complex "overlay=W-w-20:H-h-20" \
        "${OUTDIR}/${BASENAME}_final.mp4"
else
    cp "${OUTDIR}/trimmed.mp4" "${OUTDIR}/${BASENAME}_final.mp4"
fi

ffmpeg -y -i "${OUTDIR}/${BASENAME}_final.mp4" \
    -c:v libvpx-vp9 -crf 30 -b:v 0 \
    "${OUTDIR}/${BASENAME}.webm"

ffmpeg -y -i "${OUTDIR}/${BASENAME}_final.mp4" \
    -ss "$THUMB_TIME" -frames:v 1 \
    "${OUTDIR}/${BASENAME}_thumb.jpg"

rm -f "${OUTDIR}/trimmed.mp4"

A few choices mattered a lot:

set -euo pipefail so failures don’t get ignored
-y because this is a pipeline, not an interactive tool
temp file cleanup so output folders stay usable
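
To make the pipefail part of that first choice concrete, here is a minimal demo. It is a sketch, not part of the script above: `false` stands in for a failing ffmpeg call, and the two function names are made up for illustration.

```shell
#!/bin/bash
# Demo only: compare the exit status of a failing pipeline
# with and without pipefail. `false` stands in for a failing command.

without_pipefail() {
    (
        set +o pipefail
        status=0
        false | cat || status=$?
        echo "exit status: $status"
    )
}

with_pipefail() {
    (
        set -o pipefail
        status=0
        false | cat || status=$?
        echo "exit status: $status"
    )
}

without_pipefail   # prints "exit status: 0" (the failure is hidden by cat)
with_pipefail      # prints "exit status: 1" (the failure surfaces)
```

Without pipefail, a pipeline reports only the status of its last command, so an earlier failure can pass unnoticed.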

Step 3: Make it batch-capable

One file is a demo. A directory is the real use case.

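#!/bin/bash
# Run process-short.sh on every .mp4 in a directory and print a summary.
# No -e here: one bad file should not stop the rest of the batch.
set -uo pipefail

INPUT_DIR="${1:?Usage: batch-process.sh <directory>}"
PROCESSED=0
FAILED=0

for file in "$INPUT_DIR"/*.mp4; do
    # An unmatched glob stays literal, so skip anything that is not a file.
    [[ -f "$file" ]] || continue
    if ./process-short.sh "$file"; then
        PROCESSED=$((PROCESSED + 1))
    else
        echo "FAILED: $file" >&2
        FAILED=$((FAILED + 1))
    fi
done

echo "Batch complete: $PROCESSED processed, $FAILED failed"

# Exit non-zero if any file failed, so callers can detect it.
[[ "$FAILED" -eq 0 ]]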

This is where it actually became useful. I didn’t want a cool script. I wanted to stop babysitting repetitive exports.

Step 4: Add docs, even if it’s just for yourself

I also added a tiny SKILL.md with:

what the script does
requirements
usage
config variables
output files
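
A minimal sketch of what such a file could contain, written from a shell function so it can be regenerated. The section names follow the list above. The wording inside each section is an illustration derived from the scripts in this post, not the actual file, and `write_skill_doc` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: write a minimal SKILL.md skeleton.
# The contents are illustrative and derived from the scripts in this post.

write_skill_doc() {
    local target="${1:-SKILL.md}"
    cat > "$target" <<'EOF'
# process-short

## What it does
Trims, scales, normalizes audio, adds a watermark, and exports
MP4, WebM, and a thumbnail for one input clip.

## Requirements
- bash
- ffmpeg

## Usage
./process-short.sh <input.mp4>
./batch-process.sh <directory>

## Config variables
TRIM_START, TRIM_END, RESOLUTION, AUDIO_TARGET, WATERMARK, THUMB_TIME

## Output files
./output/<name>/<name>_final.mp4
./output/<name>/<name>.webm
./output/<name>/<name>_thumb.jpg
EOF
}

# Example: write the doc into a scratch directory.
doc_dir="$(mktemp -d)"
write_skill_doc "$doc_dir/SKILL.md"
echo "Wrote $doc_dir/SKILL.md"
```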

That sounds boring, but it matters. A script without docs becomes archaeology in two weeks.

Step 5: Test annoying edge cases

This was the part that caught real bugs:

empty files
files with missing or unexpected audio tracks
short clips
weird filenames with spaces and brackets

That testing found multiple issues immediately. If I had skipped it, the batch version would have failed silently later.

One of the first things I ran into was how quickly “works on one file” falls apart on real footage. A weird filename, a silent clip, or a shorter-than-expected video is enough to break the whole flow if you never test for it.
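
The filename case is the easiest one to check without any real footage. The sketch below creates empty placeholder files with awkward names and runs them through the same name derivation the script uses. `output_dir_for` is a hypothetical helper that mirrors the BASENAME and OUTDIR lines; no ffmpeg is involved.

```shell
#!/bin/bash
# Sketch: confirm that name handling survives spaces and brackets.
# The files are empty stand-ins, not real videos.

output_dir_for() {
    # Same derivation as process-short.sh: strip the directory and .mp4 suffix.
    local base
    base=$(basename "$1" .mp4)
    printf './output/%s\n' "$base"
}

fixture_dir="$(mktemp -d)"
touch "$fixture_dir/plain.mp4" \
      "$fixture_dir/my clip (final) [v2].mp4" \
      "$fixture_dir/-leading-dash.mp4"

for file in "$fixture_dir"/*.mp4; do
    printf '%s -> %s\n' "$(basename "$file")" "$(output_dir_for "$file")"
done
```

Every expansion is quoted, which is what keeps a name like "my clip (final) [v2].mp4" in one piece.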

The simplified script above assumes the input has an audio stream. In my real version, I added a small ffprobe check before applying loudnorm, because silent clips need a different path.
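
A check like that could look roughly like this. It is a sketch under assumptions, not the real version: `has_audio` and `audio_args` are hypothetical names, and dropping the audio track with -an is just one possible "different path" for silent clips. It assumes ffprobe (shipped with FFmpeg) is on PATH when the functions are called.

```shell
#!/bin/bash
# Sketch: choose audio options based on whether the input has an audio stream.

has_audio() {
    # ffprobe prints one line per audio stream; empty output means no audio.
    local streams
    streams=$(ffprobe -v error -select_streams a \
        -show_entries stream=index -of csv=p=0 "$1")
    [[ -n "$streams" ]]
}

audio_args() {
    # Emits the audio-related ffmpeg arguments, one per line.
    if has_audio "$1"; then
        printf '%s\n' -af "loudnorm=I=-14"
    else
        printf '%s\n' -an   # assumption: silent clips get no audio track
    fi
}

# Usage inside a script (needs bash 4+ for mapfile; not run here):
#   mapfile -t AUDIO_OPTS < <(audio_args "$INPUT")
#   ffmpeg -y -i "$INPUT" "${AUDIO_OPTS[@]}" out.mp4
```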

The result

Before: most of an afternoon for 16 videos

After: one command, then the machine does the repetitive part

./batch-process.sh ./raw-videos/

It’s still less flexible than Premiere. If I want one-off polish, I’ll still do it manually.

But for repeatable batch work, the tradeoff is absolutely worth it.

That’s why I like building CLI skills.

Not because they’re clever. Because they turn something fragile and repetitive into something boring and reliable.

If you build little terminal workflows or CLI skills like this too, I’d genuinely love to hear what you’ve automated.

What would you automate first?

MCP Is Not the Product — Reusable Skills Are

Alex Shev — Tue, 24 Mar 2026 22:20:07 +0000

Right now, a lot of people are talking about MCP.

And I get why.

It’s a clean idea: connect an AI agent to tools, data, and actions through a standard interface, and suddenly the model can actually do things.

That matters.

But I think a lot of people are still confusing the plumbing with the product.

MCP is useful. MCP is not the product.

The product is what happens after the connection exists.

The product is a reusable skill.

Tools are not enough

Giving an agent access to tools sounds powerful.

But in practice, raw tool access is messy.

If you just hand an agent ten tools, you usually get one of these outcomes:

it uses the wrong one
it uses the right one in the wrong order
it calls the same thing three times with weak assumptions
it produces output that technically worked, but isn’t reusable

That’s because tools are still too low-level.

A tool is a capability.
A skill is a workflow.

That difference is everything.

A simple example

I’ve seen this in boring, practical work more than once.

An agent with raw shell access and FFmpeg access can absolutely process a video.

But “can process a video” is not the same as “can reliably produce the same short-form output every time.”

The raw-tool version tends to drift:

wrong trim point
audio forgotten
watermark missing
output naming inconsistent
thumbnail skipped

The skill version is much narrower, but much more useful.

It knows the sequence.
It knows the defaults.
It knows what “done” looks like.

That’s the part people actually want in production.

Not possibility.
Reliability.
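
One way to make "done" checkable is a small verification step at the end of the skill. The sketch below assumes a hypothetical output layout (a final MP4, a WebM, and a thumbnail per clip) and a made-up function name, `verify_short`. Adjust both to whatever your skill actually produces.

```shell
#!/bin/bash
# Sketch: a "definition of done" check for one processed clip.
# The file names are an assumed layout, not a fixed convention.

verify_short() {
    local outdir="$1" name="$2" missing=0 f
    for f in "${name}_final.mp4" "${name}.webm" "${name}_thumb.jpg"; do
        if [[ ! -s "${outdir}/${f}" ]]; then
            echo "MISSING or empty: ${outdir}/${f}" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Demo with placeholder files: the thumbnail is deliberately absent.
demo_dir="$(mktemp -d)"
echo data > "$demo_dir/clip_final.mp4"
echo data > "$demo_dir/clip.webm"
if verify_short "$demo_dir" clip; then
    echo "done"
else
    echo "not done"
fi
```

The demo reports the missing thumbnail on stderr and prints "not done", which is the kind of drift the raw-tool version tends to miss.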

What a skill actually is

A skill is not just “the agent can run a command.”

A useful skill packages:

the goal
the right sequence of actions
defaults
constraints
output expectations
failure handling
context about when to use it

Compare these two ideas:

Tool: FFmpeg is available

Skill: Turn a raw clip into a 1080x1920 short with trimmed intro, normalized audio, watermark, and thumbnail

The second one is what people actually want.

No one wakes up thinking:
“I hope my agent has access to a media binary.”

They want the result.

Why this matters for AI agents

The current wave of agent demos still over-indexes on tool access.

You see a lot of:

database access
filesystem access
browser access
shell access
API access

That’s all necessary.
But it’s still not enough.

Because once the novelty wears off, teams ask a more practical question:

Can this do the same useful task reliably more than once?

That’s where most agents still fall apart.

They can improvise.
They can explore.
They can sometimes solve a task.

But reliable repeated execution comes from skills.

Skills are what turn “the agent figured it out once” into “the agent can do this whenever I need it.”

MCP helps. Skills deliver.

This is why I think the framing matters.

MCP is important because it standardizes access.

That’s good.

But once access is standardized, the real differentiation shifts somewhere else:

which workflows are packaged well
which tasks are reusable
which skills are trustworthy
which outputs are predictable
which agent behaviors are actually worth repeating

In other words:

MCP gives agents hands. Skills give them habits.

And habits are what make a system valuable.

The more useful pattern

The better pattern is:

use MCP to expose capabilities
use skills to package outcomes

That way the agent is not just connected.
It is directed.

And that is a much more practical way to build.

Why I care about this

This is exactly why I’m bullish on skill-based systems.

Because once you start packaging repeatable terminal workflows into reusable skills, a lot of noisy AI hype suddenly becomes much simpler.

You stop asking:

what tools can this model access?

And start asking:

what useful work can this system repeat reliably?

That is a better question.

And in my experience, it leads to much better products.

Tools unlock possibility. Skills unlock reliability.

And reliability is what people actually pay for.

If you want to see how we think about packaging repeatable terminal workflows into reusable skills, that’s exactly what we’re building at Terminal Skills.

How I Built a CLI Skill to Batch-Process YouTube Shorts

Alex Shev — Wed, 18 Mar 2026 04:16:33 +0000

Last month, I had to process 16 YouTube Shorts.

Trim intros. Normalize audio. Add watermarks. Export multiple formats. Generate thumbnails.

Doing that manually in Premiere would have taken me most of an afternoon.

So I built a CLI skill instead.

It took about 2 hours to put together. On my machine, the batch itself finished in under 3 minutes once everything was set up.

Here’s the exact structure I used.

What I mean by a CLI skill

For me, a CLI skill is a reusable shell workflow with:

input validation
sensible defaults
predictable output
error handling
lightweight docs

Instead of retyping a long FFmpeg command every time, I run one script and get the same result every time.

# Instead of this:
ffmpeg -i input.mp4 -ss 00:00:02 -to 00:00:35 -vf "scale=1080:1920" -af "loudnorm=I=-14" -c:v libx264 -preset fast output.mp4

# I run this:
./process-short.sh input.mp4

That difference sounds small, but it removes the part that always breaks in real work: remembering the flags, the order, and the output steps.

Step 1: List the manual steps first

Before I wrote anything, I wrote down the full workflow:

trim the first 2 seconds
cut after 35 seconds
scale to 1080×1920
normalize audio to -14 LUFS
add watermark
export MP4
export WebM
generate a thumbnail

That gave me a real pipeline instead of a vague automation idea.

Step 2: Build one script that does the boring work

#!/bin/bash
set -euo pipefail

TRIM_START="00:00:02"
TRIM_END="00:00:35"
RESOLUTION="1080:1920"
AUDIO_TARGET="-14"
WATERMARK="./assets/watermark.png"
THUMB_TIME="00:00:05"

INPUT="${1:?Usage: process-short.sh <input.mp4>}"

if [[ ! -f "$INPUT" ]]; then
    echo "Error: File '$INPUT' not found." >&2
    exit 1
fi

BASENAME=$(basename "$INPUT" .mp4)
OUTDIR="./output/${BASENAME}"
mkdir -p "$OUTDIR"

ffmpeg -y -ss "$TRIM_START" -to "$TRIM_END" -i "$INPUT" \
    -vf "scale=${RESOLUTION}" \
    -af "loudnorm=I=${AUDIO_TARGET}" \
    -c:v libx264 -preset fast -crf 23 \
    "${OUTDIR}/trimmed.mp4"

if [[ -f "$WATERMARK" ]]; then
    ffmpeg -y -i "${OUTDIR}/trimmed.mp4" -i "$WATERMARK" \
        -filter_complex "overlay=W-w-20:H-h-20" \
        "${OUTDIR}/${BASENAME}_final.mp4"
else
    cp "${OUTDIR}/trimmed.mp4" "${OUTDIR}/${BASENAME}_final.mp4"
fi

ffmpeg -y -i "${OUTDIR}/${BASENAME}_final.mp4" \
    -c:v libvpx-vp9 -crf 30 -b:v 0 \
    "${OUTDIR}/${BASENAME}.webm"

ffmpeg -y -i "${OUTDIR}/${BASENAME}_final.mp4" \
    -ss "$THUMB_TIME" -frames:v 1 \
    "${OUTDIR}/${BASENAME}_thumb.jpg"

rm -f "${OUTDIR}/trimmed.mp4"

A few choices mattered a lot:

set -euo pipefail so failures don’t get ignored
-y because this is a pipeline, not an interactive tool
temp file cleanup so output folders stay usable

Step 3: Make it batch-capable

One file is a demo. A directory is the real use case.

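#!/bin/bash
# Run process-short.sh on every .mp4 in a directory and print a summary.
# No -e here: one bad file should not stop the rest of the batch.
set -uo pipefail

INPUT_DIR="${1:?Usage: batch-process.sh <directory>}"
PROCESSED=0
FAILED=0

for file in "$INPUT_DIR"/*.mp4; do
    # An unmatched glob stays literal, so skip anything that is not a file.
    [[ -f "$file" ]] || continue
    if ./process-short.sh "$file"; then
        PROCESSED=$((PROCESSED + 1))
    else
        echo "FAILED: $file" >&2
        FAILED=$((FAILED + 1))
    fi
done

echo "Batch complete: $PROCESSED processed, $FAILED failed"

# Exit non-zero if any file failed, so callers can detect it.
[[ "$FAILED" -eq 0 ]]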

This is where it actually became useful. I didn’t want a cool script. I wanted to stop babysitting repetitive exports.

Step 4: Add docs, even if it’s just for yourself

I also added a tiny SKILL.md with:

what the script does
requirements
usage
config variables
output files

That sounds boring, but it matters. A script without docs becomes archaeology in two weeks.

Step 5: Test annoying edge cases

This was the part that caught real bugs:

empty files
videos without audio
short clips
weird filenames with spaces and brackets

That testing found multiple issues immediately. If I had skipped it, the batch version would have failed silently later.

The result

Before: most of an afternoon for 16 videos

After: one command, then the machine does the repetitive part

./batch-process.sh ./raw-videos/

It’s still less flexible than Premiere. If I want one-off polish, I’ll still do it manually.

But for repeatable batch work, the tradeoff is absolutely worth it.

That’s why I like building CLI skills.

Not because they’re clever. Because they turn something fragile and repetitive into something boring and reliable.

If you build little terminal workflows like this too, I’d genuinely love to hear what you’ve automated. I keep more reusable terminal workflow patterns on Terminal Skills.

What would you automate first?

8 FFmpeg Recipes I Use Every Week (That Most Developers Don't Know Exist)

Alex Shev — Wed, 11 Mar 2026 15:21:39 +0000

I've been using FFmpeg almost every day for the past year.

Mostly for boring real work: cutting Shorts, cleaning voice tracks, exporting web versions, generating thumbnails, and fixing weird media issues at the last minute when something breaks five minutes before publish.

Most FFmpeg tutorials cover the basics and stop there. These are the commands I actually come back to in production.

Here are 8 recipes I use constantly. Copy-paste ready.

1. Extract Audio from Video (and Clean It Up)

# Extract audio only
ffmpeg -i video.mp4 -vn -acodec libmp3lame -q:a 2 audio.mp3

# Extract + normalize loudness to -16 LUFS (a common podcast/streaming target; broadcast specs such as EBU R128 use -23)
ffmpeg -i video.mp4 -vn -af "loudnorm=I=-16:TP=-1.5:LRA=11" -ar 44100 clean_audio.mp3

When I use it: Podcast edits, voiceover cleanup, and those annoying cases where a track sounds fine in headphones but way too quiet after upload. The loudnorm filter saved me from re-exporting more than once.

2. Create a GIF from a Video Clip (That Doesn't Look Terrible)

Most GIF conversions look like they were made in 2004. The trick is a two-pass approach with a custom palette.

# Generate optimized palette first
ffmpeg -ss 00:00:05 -t 3 -i video.mp4 \
    -vf "fps=15,scale=480:-1:flags=lanczos,palettegen" \
    palette.png

# Then use that palette for the GIF
ffmpeg -ss 00:00:05 -t 3 -i video.mp4 -i palette.png \
    -filter_complex "fps=15,scale=480:-1:flags=lanczos[x];[x][1:v]paletteuse" \
    output.gif

# Cleanup
rm palette.png

Why it matters: Single-pass GIFs use a generic 256-color palette. The two-pass method generates a palette optimized for your specific clip.

I learned this the hard way after making a product GIF that looked fine in preview and terrible after export. Washed-out gradients, ugly banding, weird skin tones. Two-pass fixed it immediately.
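
If the intermediate palette.png is a nuisance, the same two stages can run in one FFmpeg call by splitting the stream inside the filtergraph. This is a sketch of that widely used variant, not a replacement for the recipe above. `gif_filter` and `make_gif` are hypothetical helper names, and the fps and width defaults simply mirror the values used here.

```shell
#!/bin/bash
# Sketch: palettegen + paletteuse in a single filtergraph (no palette.png).

gif_filter() {
    # Builds the filtergraph: one branch generates the palette,
    # the other branch applies it.
    local fps="${1:-15}" width="${2:-480}"
    printf 'fps=%s,scale=%s:-1:flags=lanczos,split[a][b];[a]palettegen[p];[b][p]paletteuse\n' \
        "$fps" "$width"
}

make_gif() {
    # make_gif <input> <start> <duration> <output>
    ffmpeg -y -ss "$2" -t "$3" -i "$1" -filter_complex "$(gif_filter)" "$4"
}

# Example (not run here): make_gif video.mp4 00:00:05 3 output.gif
gif_filter 15 480
```

The trade-off is memory: the palette can only be computed once every frame has been seen, so FFmpeg has to buffer the clip's frames. For clips of a few seconds that is rarely a problem.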

3. Batch Convert an Entire Folder

# Convert all MKV files to MP4 (re-encode at high quality; lower the CRF for less loss)
for f in *.mkv; do
    ffmpeg -i "$f" -c:v libx264 -crf 18 -c:a aac "${f%.mkv}.mp4"
done

# Convert all WAV to MP3 at 192kbps
for f in *.wav; do
    ffmpeg -i "$f" -codec:a libmp3lame -b:a 192k "${f%.wav}.mp3"
done

The pattern: ${f%.ext} strips the original extension. This is bash string manipulation, not FFmpeg — but it's the glue that makes FFmpeg scriptable.
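
A few related expansions, shown on a filename with a space and two dots. This is plain bash with no FFmpeg involved. `swap_ext` is a hypothetical helper and assumes the name actually has an extension.

```shell
#!/bin/bash
# Sketch: bash suffix/prefix stripping on a filename.

f="my clip.final.mkv"

echo "${f%.mkv}.mp4"   # my clip.final.mp4  (strip a known extension, add a new one)
echo "${f%.*}"         # my clip.final      (strip the last extension, whatever it is)
echo "${f%%.*}"        # my clip            (strip everything from the first dot)
echo "${f##*.}"        # mkv                (keep only the last extension)

swap_ext() {
    # swap_ext <file> <new-extension>
    printf '%s.%s\n' "${1%.*}" "$2"
}

swap_ext "$f" mp4      # my clip.final.mp4
```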

This one saved me the most time in practice. Converting one file is nothing. Converting 40 files before lunch is where FFmpeg starts earning its keep.

4. Split a Video into Equal Chunks

Perfect for breaking long recordings into social-media-sized pieces.

# Split into 60-second chunks
ffmpeg -i long_video.mp4 -c copy -map 0 \
    -segment_time 60 -f segment \
    -reset_timestamps 1 \
    chunk_%03d.mp4

Output: chunk_000.mp4, chunk_001.mp4, chunk_002.mp4, etc.

The -c copy flag means no re-encoding, so the split runs at roughly disk speed even on large files. Because the streams are copied as-is, cuts can only land on keyframes, so chunks might be slightly longer or shorter than 60 seconds.
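
If chunk length has to be exact, one option is to re-encode and force a keyframe at every boundary. The sketch below only builds and prints that command so you can inspect it before running it. `split_exact_cmd` is a hypothetical helper, and the codec and CRF choices are assumptions borrowed from the other recipes in this post.

```shell
#!/bin/bash
# Sketch: print an ffmpeg command that splits on exact boundaries
# by forcing a keyframe every N seconds (this re-encodes, so it is slower).

split_exact_cmd() {
    local input="$1" seconds="${2:-60}"
    local cmd=(ffmpeg -i "$input"
        -c:v libx264 -crf 18 -c:a aac
        -force_key_frames "expr:gte(t,n_forced*${seconds})"
        -f segment -segment_time "$seconds" -reset_timestamps 1
        "chunk_%03d.mp4")
    # %q prints each argument shell-quoted, so the line can be pasted back in.
    printf '%q ' "${cmd[@]}"
    printf '\n'
}

split_exact_cmd long_video.mp4 60
```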

5. Add Subtitles (Burned In)

# From an SRT file
ffmpeg -i video.mp4 -vf "subtitles=captions.srt:force_style='FontSize=24,FontName=Arial,PrimaryColour=&HFFFFFF,OutlineColour=&H000000,Outline=2'" output.mp4

# Quick one-liner subtitle (no SRT file needed)
ffmpeg -i video.mp4 -vf "drawtext=text='Hello World':fontsize=36:fontcolor=white:x=(w-text_w)/2:y=h-th-40:box=1:boxcolor=black@0.6:boxborderw=8" output.mp4

Pro tip: The force_style parameter in the SRT method lets you override subtitle styling without editing the SRT file. Useful when you get captions from auto-transcription services and need them to look consistent.

6. Picture-in-Picture (Two Videos Overlaid)

# Main video with small webcam overlay in bottom-right
ffmpeg -i main.mp4 -i webcam.mp4 \
    -filter_complex "[1:v]scale=320:240[pip];[0:v][pip]overlay=W-w-20:H-h-20" \
    -c:a copy \
    pip_output.mp4

Variations:

Top-left: overlay=20:20
Center: overlay=(W-w)/2:(H-h)/2
Delayed pop-in (hidden for the first second): overlay='if(lt(t,1),W,W-w-20)':H-h-20
Animated (slide in over one second): overlay='W-(w+20)*min(t,1)':H-h-20

I use this for tutorial videos — screen recording as the main video, webcam as the PiP.

7. Speed Up / Slow Down Video

# 2x speed (video + audio)
ffmpeg -i input.mp4 -filter_complex "[0:v]setpts=0.5*PTS[v];[0:a]atempo=2.0[a]" -map "[v]" -map "[a]" fast.mp4

# 0.5x speed (slow motion)
ffmpeg -i input.mp4 -filter_complex "[0:v]setpts=2.0*PTS[v];[0:a]atempo=0.5[a]" -map "[v]" -map "[a]" slow.mp4

# 4x speed (chain two atempo=2.0 stages)
ffmpeg -i input.mp4 -filter_complex "[0:v]setpts=0.25*PTS[v];[0:a]atempo=2.0,atempo=2.0[a]" -map "[v]" -map "[a]" 4x.mp4

The gotcha: setpts changes video speed, atempo changes audio speed. They're separate filters. If you only use setpts, you get a fast video with normal-speed audio — which is funny exactly once.

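Also: older FFmpeg builds only accept atempo values between 0.5 and 2.0. Current builds accept up to 100.0, but anything above 2.0 skips samples instead of blending them, so for 4x I still chain two atempo=2.0 filters. I still have to look that up sometimes because it’s one of those FFmpeg details that never stays in my head.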
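
For speeds that are not a neat power of two, the chain can be computed instead of remembered. A minimal sketch, assuming awk is available: `atempo_chain` is a hypothetical helper that keeps every stage inside the 0.5 to 2.0 range.

```shell
#!/bin/bash
# Sketch: build an atempo filter chain for an arbitrary speed factor,
# keeping each stage between 0.5 and 2.0.

atempo_chain() {
    awk -v speed="$1" 'BEGIN {
        speed = speed + 0                 # force numeric; junk input becomes 0
        if (speed <= 0) exit 1
        chain = ""
        while (speed > 2.0) { chain = chain "atempo=2,";   speed /= 2.0 }
        while (speed < 0.5) { chain = chain "atempo=0.5,"; speed /= 0.5 }
        printf "%satempo=%g\n", chain, speed
    }'
}

atempo_chain 4      # atempo=2,atempo=2
atempo_chain 3      # atempo=2,atempo=1.5
atempo_chain 0.25   # atempo=0.5,atempo=0.5
```

The output drops straight into the audio half of the filtergraph, with setpts set to 1/speed on the video half.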

8. Generate a Video from Images (Slideshow / Timelapse)

# From numbered images (img001.png, img002.png, etc.)
ffmpeg -framerate 24 -i img%03d.png -c:v libx264 -pix_fmt yuv420p slideshow.mp4

# From a folder of images (simple slideshow)
ffmpeg -framerate 1/3 -pattern_type glob -i '*.jpg' \
    -c:v libx264 -vf "fps=25,format=yuv420p" -pix_fmt yuv420p \
    slideshow.mp4

When I use it: Generating timelapse videos from screenshot sequences. Also useful for turning design mockups into a quick presentation video.

The -framerate 1/3 means each image shows for 3 seconds. Change the denominator to control display time.

The Cheat Sheet

Task	Key flags
Extract audio	`-vn` (no video)
No re-encode	`-c copy`
Quality control	`-crf 18` (lower = better, 0-51)
Normalize audio	`-af "loudnorm=I=-16"`
Scale video	`-vf "scale=1920:1080"`
Trim	`-ss START -to END`
Overwrite	`-y`

If you want more of these, I keep a small collection of terminal-first FFmpeg patterns on terminalskills.io.

What's your go-to FFmpeg recipe? Mine is still the two-pass GIF trick — not glamorous, but I use it constantly because bad GIF exports are way more common than they should be. If you have a better one, drop it in the comments 👇
