How we turned a working agency's daily workflows into installable Claude skills

#ai #productivity #mcp #claude

I run a small digital marketing agency. Most weeks look the same from the inside: a technical SEO audit for one client, schema markup for another, an ad account review, a batch of content briefs, and the monthly reports that pull it all together. None of these jobs are mysterious. Each one follows a procedure I have refined over years of doing it wrong and then doing it a little less wrong. The problem was never knowing what to do. The problem was that every time I handed one of these jobs to a general-purpose AI, I started from an empty prompt and re-explained the entire procedure from scratch.

So we tried the obvious fix first: a big library of prompts. It did not hold up. A prompt captures the words you happened to type that day, but not the procedure underneath them, and it forgets everything the moment the chat ends. Paste the same prompt next week and the model has no memory of the fifty edge cases you learned the hard way. This is the story of what we built instead, and the distinction between skills, MCP, and agents that finally made it click. If you build with Claude or any MCP-compatible model, some of this might save you a detour.

Why a prompt pack was the wrong shape

The insight that took us too long to reach: the value in an agency is not the prompt, it is the accumulated judgment. Which schema type actually earns a rich result versus which one just validates. The order you run an ad account check so you do not waste the first hour. The specific way a client report needs to read so a non-technical owner trusts it. A prompt is a thin snapshot of that. What we wanted was a way to package the judgment itself, so the model loads it on demand and behaves like someone who has done the job a hundred times.

That packaging has three shapes, and getting them straight is most of the battle.

Skills, MCP, and agents are not the same thing

Here is the distinction as we use it day to day.

A skill is a ready-to-run capability that does one job. In practice it is a folder the model loads when the task matches: instructions plus any small scripts or reference files the procedure needs. It carries the judgment, not just the wording. Think "run a backlink profile analysis" or "write the robots policy for AI crawlers." One job, done the way you would do it.
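To make "a folder the model loads" concrete, here is a minimal sketch in Python of reading one skill folder into memory. The layout is an assumption for illustration, not a description of any particular product: the instructions file is assumed to be called SKILL.md, and `Skill` and `load_skill` are hypothetical names.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Skill:
    name: str             # taken from the folder name
    instructions: str     # the procedure, in plain language
    resources: list[str]  # scripts and reference files shipped with the skill


def load_skill(folder: Path, instructions_file: str = "SKILL.md") -> Skill:
    """Read one skill folder. The SKILL.md file name is an assumed convention."""
    instructions = (folder / instructions_file).read_text(encoding="utf-8")
    # Everything else in the folder is treated as a supporting resource.
    resources = sorted(
        p.relative_to(folder).as_posix()
        for p in folder.rglob("*")
        if p.is_file() and p.name != instructions_file
    )
    return Skill(name=folder.name, instructions=instructions, resources=resources)
```

The instructions stay separate from the supporting files, so a human can read the procedure without opening any script.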

An MCP connector is the plumbing. MCP (the Model Context Protocol) is how the model actually reaches an external service: a real, production-hardened integration with an API. A skill can tell the model how to think about Google Search Console data. The MCP connector is what lets it read that data at all. Skills are knowledge; MCP is reach.

An agent is an expert worker that runs a whole job end to end. It composes skills and MCP connectors, makes decisions between steps, and hands back a finished result instead of a single answer. Where a skill does one thing, an agent owns the outcome.

The mistake we made early was trying to cram all three into one giant prompt. Once we split them, each piece got simpler and easier to trust. A skill you can read and audit. A connector you can test in isolation. An agent you can watch make decisions.
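A rough sketch of that three-way split, in Python. Everything here is hypothetical and chosen only to show the shape: `SearchConnector` stands in for an MCP connector, `low_ctr_skill` for a skill, `audit_agent` for an agent, and the thresholds are placeholder values rather than recommendations.

```python
from dataclasses import dataclass
from typing import Protocol


class SearchConnector(Protocol):
    """Reach: anything that can fetch query rows for a site (hypothetical interface)."""

    def fetch_queries(self, site: str) -> list[dict]: ...


@dataclass
class FakeConnector:
    """Offline stand-in for a real connector, so the other pieces can be tested alone."""

    rows: list[dict]

    def fetch_queries(self, site: str) -> list[dict]:
        return self.rows


def low_ctr_skill(rows: list[dict], min_impressions: int = 100,
                  max_ctr: float = 0.01) -> list[str]:
    """Knowledge: one job, no I/O. Flags queries that are seen often but rarely clicked."""
    flagged = []
    for row in rows:
        impressions = row["impressions"]
        if impressions == 0 or impressions < min_impressions:
            continue  # too little data to judge
        if row["clicks"] / impressions <= max_ctr:
            flagged.append(row["query"])
    return flagged


def audit_agent(site: str, connector: SearchConnector) -> str:
    """Agent: composes reach and knowledge, then decides what to hand back."""
    flagged = low_ctr_skill(connector.fetch_queries(site))
    if not flagged:
        return f"{site}: no low-CTR queries found"
    return f"{site}: review titles for {', '.join(flagged)}"
```

Because the skill never touches the network, it can be checked against a handful of hand-written rows, and the connector can be swapped for a fake without changing the judgment.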

Two concrete skills

Abstract taxonomy is easy to nod along to, so here are two real skills from what we ended up with.

The first controls AI crawler traffic. It decides which crawlers to allow (GPTBot, ClaudeBot, PerplexityBot, Google-Extended and others), generates the matching robots.txt and ai.txt, and writes the corresponding Cloudflare WAF or nginx rules. The part that makes it a skill rather than a snippet: it verifies bots by reverse DNS, so a scraper that simply sets its User-Agent to "GPTBot" does not get waved through. That reverse-DNS check is exactly the kind of hard-won detail a plain prompt drops.
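For readers who have not met the technique, the reverse-DNS check is usually done as forward-confirmed reverse DNS: look up the hostname for the connecting IP, check it belongs to the crawler operator, then resolve that hostname forward and confirm it maps back to the same IP. Below is a minimal Python sketch, not the skill's actual code. The suffix table is a placeholder ("ExampleBot" and ".crawler.example" are made up); real values must come from each operator's documentation, and an operator that publishes IP ranges instead of hostnames would need a different check.

```python
import ipaddress
import socket

# Hypothetical table: bot name -> hostname suffixes its operator documents.
TRUSTED_SUFFIXES = {"ExampleBot": (".crawler.example",)}


def reverse_lookup(ip: str) -> str:
    """PTR lookup: IP address -> hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> list[str]:
    """Forward lookup: hostname -> IPv4 and IPv6 addresses."""
    return sorted({info[4][0] for info in socket.getaddrinfo(hostname, None)})


def is_verified_bot(ip: str, claimed_bot: str,
                    reverse=reverse_lookup, forward=forward_lookup,
                    suffixes=TRUSTED_SUFFIXES) -> bool:
    """True only if the PTR hostname has a trusted suffix AND resolves back to ip."""
    allowed = suffixes.get(claimed_bot)
    if not allowed:
        return False  # unknown bot name: nothing to verify against
    try:
        hostname = reverse(ip).rstrip(".").lower()
        if not hostname.endswith(tuple(allowed)):
            return False
        client = ipaddress.ip_address(ip)
        # The forward step defeats a scraper that controls its own PTR record.
        return any(ipaddress.ip_address(addr) == client for addr in forward(hostname))
    except (OSError, ValueError):
        return False  # lookup failed or address unparsable: treat as unverified
```

The resolver functions are passed in as parameters so the logic can be exercised with fake lookups. In production the result would normally be cached, since two DNS queries per request is expensive.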

The second is answer-engine optimization: getting a page cited by LLMs and surfaced in AI-generated answers rather than only in the classic blue links. It is not a trick. It is a checklist of structure and evidence that makes a page easy for a model to quote correctly, applied the same way every time. Boring on purpose, which is the point.

I am deliberately not attaching before-and-after numbers to either of these. The honest version is that they encode a repeatable procedure. What they do not do is guarantee a ranking, and I would distrust anyone who claimed otherwise.

What we learned

A skill is a procedure, not a personality. The prompts that read the best were the least reusable. The ones that worked were plain, almost dull, and described steps rather than vibes. Write for the second time you will run it, not the first.

Split knowledge from reach. Keeping "how to think about the data" (the skill) separate from "how to fetch the data" (the MCP connector) made both easier to fix. When something broke, we knew which half to look at.

The judgment is the moat, and it only comes from real work. We could only write these because we had done the jobs on real client sites first. A skill distilled from actual work carries the edge cases. A skill written from imagination carries confidence and not much else.

Name the boundary out loud. Every skill got an explicit "this is what it does not do." That single line prevented more bad output than any clever instruction, because it stopped the model from wandering past its competence.
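One way to keep that rule from eroding is to check for it mechanically. The sketch below is a Python illustration under an assumed convention: each skill's instructions contain a line beginning with "Does not do" (optionally as a heading). The convention and the `missing_boundary` helper are hypothetical.

```python
import re

# Assumed convention: a line that starts with "Does not do", with or without "#" marks.
BOUNDARY_PATTERN = re.compile(r"^\s*#*\s*does not do\b", re.IGNORECASE | re.MULTILINE)


def missing_boundary(skills: dict[str, str]) -> list[str]:
    """Return the names of skills whose instructions never state a boundary."""
    return sorted(
        name for name, text in skills.items()
        if not BOUNDARY_PATTERN.search(text)
    )
```

A check like this only proves the line exists. Whether the boundary is the right one still takes someone who has done the job.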

Where this landed

We packaged the system up as forgehouse, because enough people asked to buy the thing we use rather than hire us to run it. These are the same skills, agents, and connectors we run on client work, not a separate demo. That is also why the site leans on a records room of shipped work instead of testimonials: the proof was already there, so we just filed it.

If you want to poke at the idea before anything else, the SEO starter set is on GitHub for free: github.com/development-candavarci/seo-starter. Clone it, read how a skill is actually structured, and steal whatever is useful.

The short version, if you take one thing away: stop writing prompts you will paste again next week, and start writing down the procedure. The model does not need your best wording. It needs your judgment, in a shape it can load.

Top comments (1)

Skillselion • Jul 5

The "name the boundary out loud" point is the one I'd underline twice. Most skills that misfire don't fail on the happy path, they fail because the model wanders one step past what the procedure actually covers and nobody told it where the edge was. An explicit "does not do" line is cheaper insurance than any amount of clever instruction, and it also makes the skill honest about its own scope when someone else picks it up.

Your split of knowledge (skill) from reach (MCP) matches what I keep seeing too: the skills that travel between people and projects are the narrow single-job ones, because they carry procedure, not your personal setup. The moment a skill assumes your creds, your file layout, or your one client's quirks, it stops being portable and starts being a config. The AEO one is a good example of a procedure that generalizes precisely because it's boring and structural.