Tobrun Van Nuland

Building Interactive Programs inside Claude Code

This is something I've been discovering as I go, and I thought it was worth sharing more broadly. The pattern is simple but surprisingly powerful: build a CLI that Claude can reason about, and let it decide how to invoke it based on your natural language prompt.

I stumbled into this while building an Android QA agent (you describe a test scenario in natural language and Claude executes it on a device), but the patterns I found apply far beyond mobile testing. They're general-purpose building blocks for making any CLI tool feel like an intelligent, interactive program.

The Pattern: Claude as Your CLI's User

The idea is to build a simple CLI and put Claude in front of it. Your CLI doesn't need to be smart. It just needs to accept flags and do its job. The intelligence lives in a skill, a markdown file that tells Claude how to map natural language to CLI invocations.

In my case, the CLI wraps adb and records commands. Claude uses it like a human would, except it reads a skill file first to decide which flags to pass. The user never thinks about flags. They just describe what they want.

Prompt-Driven Feature Activation

This is where it gets interesting. Instead of exposing flags to the user, you teach Claude to detect intent from the prompt and activate features automatically.

The skill file is just a markdown document with simple rules:

- Check the user's prompt for any of these keywords (case-insensitive): "track performance", "frame rate", "fps", "rendering".
- If any keyword matches, add `--perf` to the command.

That's the entire mechanism. Claude reads the skill, scans the user's prompt, and adjusts the CLI invocation. The user says "measure performance while scrolling through the list" and the right flags get passed: no documentation to read, no syntax to remember.
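
Put together, a skill for this can be as small as the sketch below. I'm following Claude Code's SKILL.md layout here; the tool name and keyword lists are illustrative rather than copied verbatim from my project.

```markdown
---
name: android-qa
description: Run QA scenarios on a connected Android device
---

## Feature activation

- Check the user's prompt for these keywords (case-insensitive):
  "track performance", "frame rate", "fps", "rendering".
  If any match, add `--perf` to the command.
- Check for: "trace", "tracing", "enable tracing".
  If any match, add `--trace` to the command.
```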

You can stack these. In my project, saying "track performance and enable tracing" activates two independent features from a single sentence. Each feature has its own keyword list in the skill file, and Claude composes them naturally.

The underlying CLI stays simple: it accepts `--perf` and `--trace` flags, writes the config to a lock file, and the teardown script reads that lock file to know what to capture. The skill layer is what turns this mechanical flag-passing into something that feels conversational.
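
The CLI half of that can be nearly trivial. Here's a minimal sketch in Python; the script name, flag set, and lock file path are all illustrative:

```python
#!/usr/bin/env python3
"""scripts/start_session.py (sketch): record which features are active."""
import argparse
import json
import pathlib
import sys

LOCK_FILE = pathlib.Path(".qa-session.lock")  # illustrative path

def main() -> None:
    parser = argparse.ArgumentParser(description="Start a QA recording session")
    parser.add_argument("--perf", action="store_true",
                        help="capture frame-rate / rendering stats")
    parser.add_argument("--trace", action="store_true",
                        help="capture a trace for the session")
    args = parser.parse_args()

    # The lock file doubles as a mutex: refuse to start a second session.
    if LOCK_FILE.exists():
        sys.exit("A session is already running; stop it first.")

    # The teardown script reads this to know what to capture.
    LOCK_FILE.write_text(json.dumps({"perf": args.perf, "trace": args.trace}))

if __name__ == "__main__":
    main()
```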

Human-in-the-Loop Decisions

Claude Code's `AskUserQuestion` tool lets you build programs that pause for user input when they hit a genuine ambiguity and continue autonomously when there's nothing to ask.

For example: my tool needs to know which Android device to target. If one device is connected, it just picks it. If there are multiple, it shows a dropdown and asks. This is a pattern you can apply anywhere: selecting a deploy target, choosing a database, picking a branch. The tool stays autonomous by default but defers to the user exactly when it should.
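
In the skill file, that decision is just another rule. A sketch of how it can read:

```markdown
## Device selection

1. Run `adb devices` and collect the listed serials.
2. If exactly one device is connected, use it without asking.
3. If several are connected, use AskUserQuestion to present the serials
   as options and let the user pick one.
```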

Session Control: Skills Start It, Hooks Guarantee the Stop

A useful pattern for any tool that needs setup and teardown: use a skill to start the process, and a Claude Code hook to guarantee cleanup.

The skill tells Claude to call a start script before doing any work. This script creates a lock file that tracks the session state. When Claude finishes, it calls a stop script that reads the lock file, does the teardown, and cleans up.

But what if the user hits Ctrl+C, or Claude forgets? A Stop hook in `.claude/settings.json` catches that.
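
A sketch of the wiring (the script path is illustrative; see the Claude Code hooks docs for the full schema):

```json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "python3 scripts/stop_session.py"
          }
        ]
      }
    ]
  }
}
```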

The lock file does double duty: it's a mutex preventing overlapping sessions, and a state store telling the stop script what to clean up. If Claude already stopped gracefully, the lock file is gone and the hook is a no-op. This pattern works for anything with lifecycle management — recording sessions, server processes, temporary resources.
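
And the stop script itself, continuing the Python sketch from above (the teardown details are tool-specific, so they're stubbed here):

```python
#!/usr/bin/env python3
"""scripts/stop_session.py (sketch): tear down based on the lock file."""
import json
import pathlib
import sys

LOCK_FILE = pathlib.Path(".qa-session.lock")  # same path the start script wrote

def main() -> None:
    # If Claude already stopped gracefully, the lock file is gone and
    # this script (and the Stop hook that runs it) is a no-op.
    if not LOCK_FILE.exists():
        sys.exit(0)

    session = json.loads(LOCK_FILE.read_text())

    if session.get("perf"):
        pass  # pull the performance stats recorded during the session
    if session.get("trace"):
        pass  # pull and save the trace

    LOCK_FILE.unlink()  # release the mutex

if __name__ == "__main__":
    main()
```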

Building the Tool from Within

Here's the part that still surprises me. I build this tool from the same Claude Code session I use to run it. Claude is smart enough to distinguish between "run this test on the device" and "add a new feature to the tool."

I haven't manually created any of the skill files in this project. They've all been generated by Claude as a byproduct of iterating on the CLI. You describe a behavior, Claude implements the script, then writes the skill that teaches itself how to use it. It's a self-reinforcing cycle.

The Takeaway

Technically, everything I've described maps to existing Claude Code features: skills, hooks, and the AskUserQuestion tool. But the way you arrive at them matters. You don't design a skill spec upfront. You build a CLI interactively, discover the interaction patterns through use, and let the skills emerge.

The recipe:

  1. Build a simple CLI that accepts flags and does one thing well
  2. Write a skill that maps natural language keywords to those flags
  3. Use AskUserQuestion for genuine ambiguities that need human input
  4. Add a hook for lifecycle guarantees (cleanup, finalization)
  5. Iterate from within: let Claude build the next feature while you use the current one

If you're looking for an idea, think about a manual process that could benefit from automation: pulling data from JIRA, running a deployment checklist, performing QA on a mobile device, auditing accessibility... anything where you follow a series of steps that a CLI could drive.

Build the CLI first, keep it simple. Then let Claude use it. You'll be surprised how quickly the skills emerge from real usage, and how naturally the tool evolves when your primary user can reason about what it does.

The project I built with this approach is open source at github.com/tobrun/android-qa-agent.
