Daniel Petro

I Was Mid-Setup When I Built the Skill That Finished Setting It Up

I was halfway through setting up Minitap's mobile-use SDK when I stopped and built the skill that would finish setting it up for me.

Let me explain. I wasn't reading docs—nobody does that anymore. I was pasting setup instructions into Claude, asking it to help me configure iOS dependencies, validate my Android SDK, set up API keys. The kind of workflow every developer uses now.

But I kept hitting validation issues. "Did the Xcode command line tools actually install? Is my simulator configured correctly? Did that API key get set?" My AI assistant would say "done," but I'd have to manually verify each step. When you're using AI to execute setup tasks, you realize pretty quickly: you need deterministic validation wrapped around that AI execution.

So mid-setup, I built an interactive wizard skill with built-in validation. Then I used that skill to finish my own setup. What was taking 30-60 minutes of back-and-forth now takes 3-5 minutes with confidence at each step.

The interactive setup wizard in action

Here's what I learned about building self-validating skills for AI-first workflows—and why this pattern matters when AI is actually executing tasks, not just generating code suggestions.

The Setup Problem

The mobile-use SDK is powerful. It lets AI agents control phones like humans—visually understanding interfaces, tapping buttons, navigating apps. The kind of thing that could finally make mobile test automation not suck.

But here's what mobile development looks like in 2026: Web developers ship multiple times a day while we're still fighting device fragmentation, slow simulators, and app store review cycles. The iteration loop is 10-100x slower than web.

This is where AI-first SDLC becomes critical. When AI is a first-class participant in development workflows—not just autocomplete, but actively executing tasks—it needs the right infrastructure. Tools that guide AI execution and validate outcomes.

AI agents that can visually understand mobile interfaces—like what Minitap is building—could change that. Test automation that adapts when UI changes instead of breaking. Debugging at web-speed because the AI can see what's happening.

But the best technology doesn't win if developers can't get started.

Here's the modern setup experience: paste docs into your AI coding assistant (or use something like context7 to load the whole repo), ask it to help with setup, execute commands it suggests, hit validation issues, iterate. Nobody's reading docs linearly anymore.

Getting mobile-use running meant:

  • Figuring out if you need iOS, Android, or both
  • Choosing between Platform mode and Local mode for your LLM
  • Installing the right dependencies for your target platform
  • Configuring device connections (physical device? simulator? cloud?)
  • Creating a properly structured project

None of this is hard. But when you're using AI to execute these steps, you hit a problem: the AI will confidently say "done" without actually verifying anything worked.

That's the validation gap. And in an AI-first workflow, it's not just annoying—it breaks the whole experience.

So mid-setup, actively stuck, I built something to solve it.

Building the Solution (Mid-Setup)

Here's the thing about building developer tools: the best time to build them is when you're actively feeling the pain.

I was literally stuck in setup—asking Claude to help me configure things, manually checking if they worked, feeding errors back. That's when the pattern became obvious: I need the AI to execute setup tasks, but I need deterministic code to validate each step actually worked.

Not "did the LLM say it worked?" but "does the command return the right exit code? Does the file exist? Is the service responding?"

So I built a skill that encoded the validation logic I was manually running. Decision tree for platform/LLM/device choices, deterministic checks after each step, specific remediation for known failure modes. Then I used that skill to finish my own setup.
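Concretely, that validation logic boils down to a handful of small checks. Here's a minimal sketch of the three kinds I mean (exit codes, files, services), not the actual skill code; the Ollama URL is just the usual local default, used as an example of a "service responding" check:

```python
import os
import subprocess
import urllib.request

def command_succeeds(cmd: list[str]) -> bool:
    """'Did it work?' = the command exists and exits with code 0."""
    try:
        return subprocess.run(cmd, capture_output=True, timeout=30).returncode == 0
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return False

def file_exists(path: str) -> bool:
    """Did that .env or config file actually get written?"""
    return os.path.isfile(path)

def service_responds(url: str) -> bool:
    """Is the service actually up, e.g. a local Ollama instance?"""
    try:
        with urllib.request.urlopen(url, timeout=5):
            return True
    except OSError:
        return False

print(command_succeeds(["xcode-select", "-p"]))    # Xcode command line tools installed?
print(file_exists(".env"))                         # credentials written?
print(service_responds("http://localhost:11434"))  # Ollama default port (example check)
```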

Claude giving instructions for Xcode setup

I started by mapping out the decision tree:

What platform? → iOS/Android/Both
  → Check dependencies
    → iOS: xcode-select -p, xcrun simctl list
    → Android: adb devices, $ANDROID_HOME
  → What LLM mode?
    → Platform: Validate API keys
    → Local: Check Ollama installation
  → What device type?
    → Validate connectivity
  → Generate config + example project
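One way to make that tree executable is to encode it as plain data the wizard can walk step by step. A rough sketch, with the commands lifted from the tree above and the structure itself purely illustrative:

```python
# The decision tree as data: each answer maps to the deterministic checks
# (or follow-up steps) that must pass before moving on.
SETUP_TREE = {
    "platform": {
        "ios": [["xcode-select", "-p"], ["xcrun", "simctl", "list"]],
        "android": [["adb", "devices"]],  # plus: is $ANDROID_HOME set?
        "both": [["xcode-select", "-p"], ["xcrun", "simctl", "list"],
                 ["adb", "devices"]],
    },
    "llm_mode": {
        "platform": "validate API keys",
        "local": "check the Ollama installation",
    },
    "device": "validate connectivity, then generate config + example project",
}
```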

The skill needed to handle common failure modes I'd just experienced:

  • Missing Xcode command line tools
  • Android SDK path not set
  • No devices connected
  • API keys not configured

I built it as an interactive wizard that asks questions in the right order, validates each step, and generates a working project pre-configured for your choices.
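Under the hood, an interactive wizard like this is just a loop: ask, run the step, check deterministically, suggest a fix, re-check. A stripped-down sketch (prompts and function names are mine, not the skill's):

```python
import shutil
import subprocess

def exits_zero(cmd: list[str]) -> bool:
    """True only if the command exists and exits with code 0."""
    try:
        return subprocess.run(cmd, capture_output=True, timeout=30).returncode == 0
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return False

def ask(question: str, options: list[str]) -> str:
    """Keep asking until the answer is one of the allowed options."""
    while True:
        answer = input(f"{question} {options}: ").strip().lower()
        if answer in options:
            return answer

def require(name: str, check, fix: str) -> None:
    """Never trust 'done': block until the deterministic check passes."""
    while not check():
        print(f"[{name}] not ready. Suggested fix: {fix}")
        input("Apply the fix, then press Enter to re-check... ")
    print(f"[{name}] ok")

platform = ask("Target platform?", ["ios", "android", "both"])

if platform in ("ios", "both"):
    require("Xcode command line tools",
            lambda: exits_zero(["xcode-select", "-p"]),
            "xcode-select --install")
if platform in ("android", "both"):
    require("adb", lambda: shutil.which("adb") is not None,
            "Install Android platform-tools and add them to your PATH")
```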

Inside the Skill

Here's how the skill actually works (8 phases):

Phase 1: Gather Requirements — Asks about target platform (iOS/Android/Both), LLM mode (Platform API vs. Local), and device type (physical/simulator/cloud).

Phase 2: Check Prerequisites — Runs deterministic checks for Python 3.12+, UV package manager, platform tools (libimobiledevice for iOS, ADB for Android, Appium).

Phase 3: Install Missing Dependencies — If checks fail, provides exact installation commands for macOS/Linux: brew install libimobiledevice, npm install -g appium, etc.

Phase 4: Create Project — Initializes project using UV: uv init, then adds dependencies: uv add minitap-mobile-use python-dotenv.

Phase 5: Configure Credentials — Guides .env setup for API keys (Platform mode) or config file editing (Local mode with Ollama).
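For Platform mode, "credentials configured" should itself be a check rather than an assumption. A hedged sketch, where MINITAP_API_KEY is a placeholder name for whatever keys your provider and the SDK actually expect:

```python
import os
from dotenv import load_dotenv  # from the python-dotenv dependency added in Phase 4

load_dotenv()  # reads .env from the project root

# Placeholder name: check whichever keys your chosen LLM provider actually needs.
REQUIRED_KEYS = ["MINITAP_API_KEY"]

missing = [key for key in REQUIRED_KEYS if not os.getenv(key)]
if missing:
    raise SystemExit(f"Missing credentials in .env: {', '.join(missing)}")
print("Credentials present.")
```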

Phase 6: Device-Specific Setup — Platform-tailored instructions: Xcode signing for iOS, developer options for Android.

Phase 7: Create Starter Script — Generates main.py with correct imports and configuration based on chosen mode (PlatformTaskRequest vs. AgentProfile).

Phase 8: Verify Setup — Runs validation commands to confirm everything works before declaring success.

The key insight: each phase validates before proceeding. Not "did the AI say it worked?" but actual deterministic checks:

  • python --version returns 3.12+? ✓
  • which uv finds the binary? ✓
  • adb devices shows connected device? ✓

This is the foundation of reliable AI-first workflows—deterministic validation wrapped around AI execution.
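As code, checks like these are only a few lines. This is a sketch rather than the skill's exact implementation, and the "at least one device attached" heuristic is my own simplification:

```python
import shutil
import subprocess
import sys

# python --version returns 3.12+?
assert sys.version_info >= (3, 12), f"Need Python 3.12+, found {sys.version.split()[0]}"

# which uv finds the binary?
assert shutil.which("uv"), "uv not found on PATH"

# adb devices shows a connected device? (Android targets only)
assert shutil.which("adb"), "adb not found on PATH"
out = subprocess.run(["adb", "devices"], capture_output=True, text=True).stdout
attached = [l for l in out.splitlines()[1:] if l.strip() and l.split()[-1] == "device"]
assert attached, "No Android device or emulator attached"

print("All checks passed.")
```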

Interactive wizard asking platform and LLM mode questions

Before and After

Before (traditional docs):

  1. Read platform requirements
  2. Wonder if you have everything installed
  3. Try to start a project
  4. Hit an error
  5. Google the error
  6. Install missing thing
  7. Try again
  8. Hit another error
  9. Repeat 4-8 until it works or you give up

Time: 30-60 minutes. Success rate: ~70%.

After (with skill):

  1. Tell your AI assistant: "setup minitap for iOS"
  2. Answer a few questions
  3. Get a working project

Time: 3-5 minutes. Success rate: ~95%.

The skill doesn't just save time—it eliminates the "did I miss something?" anxiety that kills momentum.

Skill validating dependencies

Skills Are the Future

As I was building this, I kept wondering why it felt so natural. Then I saw Vercel CEO Guillermo Rauch's announcement about skills.

This is the insight: npm solved JavaScript's "copy-paste library" problem. Skills solve AI's "copy-paste instructions" problem.

Before npm:

  • Download library manually
  • Hope dependencies work together
  • Update everything manually
  • Repeat across projects

After npm: npm install and it just works.

Skills are following the exact same pattern. And the infrastructure already exists:

  • skills.sh — Central registry with 22,000+ skills
  • Standardized installation — npx add-skill or npx skills add
  • Agent-agnostic — Works across Claude Code, Cursor, Windsurf, Copilot
  • Open format — Not vendor-locked

The npm parallel isn't aspirational—it's already here. What's still emerging is the focus on self-validating skills with deterministic checks. Most skills in the registry are prompt templates. Self-validating skills that encode verification logic are the next evolution.

Installing the skill I made with npx add-skill

(I'm writing a deep dive on self-validating skill architecture for my Substack—how to build skills with deterministic validation in AI-first SDLC.)

This changes how we think about developer experience:

  • Felt friction with a tool? Build a skill, submit a PR
  • Have deep expertise in something obscure? Package it as a skill
  • Want to onboard developers faster? Ship skills alongside your SDK

The best DevRel isn't just content anymore. It's shipping actual solutions that make AI agents smarter.

Why I Built This (And Why It Matters)

I built this because I needed it RIGHT THEN. Not after reflecting on the experience—mid-setup, actively stuck, using AI to try to get unstuck.

And that's the meta-narrative here: I used AI (Claude) to build a skill that makes AI better at setup tasks. The whole thing happened in an AI-first workflow. I wasn't writing code in a traditional IDE—I was collaborating with Claude to build the validation logic, test it on my own environment, iterate on failure modes.

This is how development works now. Nobody reads docs cover-to-cover. We paste them into AI chats, ask for help with specific steps, iterate when things break. But that workflow NEEDS validation, because AI agents will confidently tell you something worked when it didn't.

The skill took about a day to build and test. I contributed it to the mobile-use repo (merged January 20) because other developers are pasting those same docs into their AI assistants and hitting the same validation gaps.

I even had Claude Code run the test after setup, and it output a nice GIF of the whole thing running on my iPhone!

Automation Results with Claude Code

The Takeaway

If you're building developer tools in 2026 and you're not thinking about skills, you're missing how developers actually work now.

They're not reading your docs—they're pasting them into Claude, Cursor, Windsurf. They're using AI agents to execute setup tasks. And when those AI agents hit validation issues (and they will), you want a skill ready that guides them through with deterministic checks at each step.

The future isn't just great documentation. It's packaging your expertise as self-validating skills that AI agents can execute with confidence.

And the beautiful thing? You can build these in the moment you feel the pain. Mid-setup, mid-deployment, mid-debugging. That's when you understand the failure modes. That's when you know what needs validation.

Feel pain. Build skill. Share with community. That's the ecosystem Rauch is talking about. That's what I'm excited to be part of.


Try the skill:

npx add-skill minitap-ai/mobile-use

Works with Claude Code, Cursor, Windsurf—any AI coding assistant that supports skills.

Check out the skill: github.com/minitap-ai/mobile-use/tree/main/skills/mobile-use-setup

Follow my work: @DanielPetroAI

Top comments (1)

Daniel Petro

This was so much fun to build and write about. I really hope someone finds it helpful, whether you're thinking about building for or with AI, or actually setting up the mobile-use SDK for a tighter AI dev loop!