Andrew Hrushetskyi

Posted on Jun 2 • Edited on Jun 10

Feel Behind on Skills, Hooks, Agents and Others? The Mental Model That Explains It All

#ai #productivity #agents

You're on your usual route through the Zone.

Same broken road, same dead trees, same nothing.
The wind drags through the dead branches. Your boots squelch in the wet gravel. Somewhere far off something creaks and you don't even bother turning your head.
The Geiger counter ticks along, slow and lazy. Nothing dangerous. Just background.
You're half bored, half on edge. That's the Zone for you - you stop feeling the fear, but it never actually goes anywhere.
You're cold. You just want to be home already.

And then you see, far off, something moving in the fog!
You raise your rifle, get ready to shoot. And suddenly you hear:

-- Don't shoot! We are not mutants!
-- (another voice): Listen! My name is Opus. Here are my friends: Sonnet and Haiku. We are reptiloids and we flew here from the planet Nibiru to study your planet. We have no ill intentions and pose no threat to you. Please, let us go.
-- (the last voice): Here, I've got an intergalactic modem, I'm giving it to you (puts it on the ground). You can write us any letters anytime. We're very well-read and we know a lot. Just please don't shoot!

You're standing there dumbfounded.
The rifle's still trained on them, but you have no idea what to say back, so you just nod. They start backing away quietly and after a couple of minutes they vanish somewhere into the fog.

You walk over to where they were standing and grab their modem.

And a few days later, once you've finally made it out of the Zone, you start checking out their miracle modem.

And as you test out this new gadget (that is, send your reptiloid friends intergalactic emails), you discover a few properties of these reptiloids:

These reptiloids are very, very well-read. They've read and listened to the works of all of humanity's philosophers, the speeches of Hitler, Putin, Trump, Zelensky, and even the Lion King from the cartoon of the same name. These reptiloids have watched every episode of anime, all the Epstein files, all of reddit. They are incredibly well-read.

But! At the same time, when you write them an email like "remember last week I wrote you about ...", they can't recall it. It just so happened that they remember the whole internet, but your conversations with them they don't memorize and don't remember. And every time you send some email, you have to, within that same email, resend the entire history of the correspondence with all the attachments every single time. The whole context all over again, every time.

And of course it's terribly boring to manually copy the previous conversations by hand every single time.

This is routine nonsense that can and should be automated. So you decided to write a middleman program (an agent), which at the bottom will have a text input, and will keep your conversation history in memory. And you'll just type something of your own into the text input, and the program will take what you wrote, pull the history of your correspondence out of memory, glue your short prompt and the whole conversation history into one big prompt and send the reptiloid this big prompt. And when it sends back a reply - display it nicely as a chat and save its reply into memory. And so on, round and round.

Congrats. You've just invented the browser-based ChatGPT of 2022.

The key to everything that follows: LLMs are stateless!

LLMs are stateless. Somewhere inside themselves they remember the entire internet in the form it was in at the moment of their training, but they don't remember your correspondence with them. You have to send them everything all over again every time.

Every time you send a message, it's the sending of a completely new message, just with the conversation history attached.
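A minimal sketch of such a middleman program, to make the "resend everything" idea concrete. The names ChatAgent, SendToModel and fakeModel are made up for this illustration, and the fake model only stands in for the real network call:

```typescript
// One message in the correspondence.
type Message = { role: "user" | "assistant"; text: string };

// Hypothetical stand-in for the real call to the model:
// it receives ONE big prompt and returns ONE reply, nothing else.
type SendToModel = (fullPrompt: string) => string;

class ChatAgent {
  // The only place where the conversation is remembered.
  private history: Message[] = [];
  private send: SendToModel;

  constructor(send: SendToModel) {
    this.send = send;
  }

  ask(userText: string): string {
    this.history.push({ role: "user", text: userText });
    // Glue the whole history (including the new message) into one prompt.
    const fullPrompt = this.history
      .map((m) => `${m.role}: ${m.text}`)
      .join("\n");
    const reply = this.send(fullPrompt);
    this.history.push({ role: "assistant", text: reply });
    return reply;
  }
}

// A fake model that just reports how many lines it was sent.
const fakeModel: SendToModel = (prompt) =>
  `I received ${prompt.split("\n").length} line(s)`;

const chat = new ChatAgent(fakeModel);
const first = chat.ask("Hello"); // the prompt has 1 line
const second = chat.ask("Remember me?"); // the prompt has 3 lines: the history went along
```

The model never sees anything except fullPrompt. The "memory" lives entirely in the history array on your side.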

Here's what it would look like if your middleman program didn't save the chat history and didn't resend it from scratch each time:

Let's try to guess what the regional office employee is going to ask =)

Tools

Life was getting better and better and more and more interesting. The reptiloids replied incredibly fast and helped find answers to many questions. Thanks to your primitive agent (the middleman program), which was more like a chat, communicating with the reptiloids was incredibly convenient.

But at some point you wanted more: you wanted the reptiloids to write code instead of you.

At first you started by writing to the reptiloids what code you wanted them to write. They'd ask you to send a couple of files for context. You'd send it to them, manually. Then they'd send you that code. And you'd go and paste that code into the file manually. And at some point this got boring too, and you wanted to go even further - to automate even this ctrl+c + ctrl+v.

So you improved your middleman program and it started working as follows:

When you type some text into the text field and press enter - your program takes your text, and to this text it adds:

Dear Opus. To make it easier for you to do this task, you have a few options. You can use any of them:

- Read. If you need to read the contents of some file - send me the keyword «!_&Read&_!» and along with it the path to the file you want to read.
- Write. If you want to write something into one of the files - send me the keyword «!_&Write&_!» and along with it the line to append to, and the content to append.
- Grep. If you want to perform a search through the file system - send me the keyword «!_&Grep&_!» and along with it the pattern to search by.
- Fetch. If you want to get the contents of some web page of our earthly clunky internet - send me the keyword «!_&Fetch&_!» and a url address.
- .... and many others.

Then your agent glues together your prompt that you typed into the text field + the conversation history + a description of the available Tools like in the example above. And it sends this whole solid prompt to the reptiloid.

The reptiloid reads all of this. If the information you provided is enough for it, then it sends back «!_&Write&_! ./index.ts line:23 Content: const abracadabra...». Your agent, that is, your program, parses this reply, pulls the arguments out of this reply, and simply does the work on the file system.

If the reptiloid decides that the information given to it isn't enough, then it sends the keyword for Grep. Your middleman program parses the reply, performs the Grep, then takes the results of that search, adds them together with the whole conversation history to a new prompt, and resends all of this as a new email/request to the reptiloid.

Reptiloids (LLMs) live on Nibiru, they don't have direct access to your computer, to its file system, or to google, or to anything else. Reptiloids can only receive messages and reply to messages.

But your middleman program automatically builds a big prompt with all the details, it's your middleman program that pastes into your prompt the description of the Tools it can provide to the reptiloid. Then the reptiloid sends an order to your middleman program in a special format, and it's already your middleman program that performs the actions on the file system.
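The parse-and-execute step described above can be sketched like this. It's only an illustration under assumptions: the file system is an in-memory Map, Write overwrites the whole file instead of appending to a line, and the function names are invented for the example:

```typescript
// A parsed order from the model, e.g. "«!_&Read&_!» ./notes.txt".
type ToolCall = { tool: string; args: string };

// Matches the made-up keyword format from the prompt above.
const TOOL_PATTERN = /^«!_&(\w+)&_!»\s*([\s\S]*)$/;

function parseToolCall(reply: string): ToolCall | null {
  const match = TOOL_PATTERN.exec(reply.trim());
  if (match === null) return null; // a plain text answer, not an order
  return { tool: match[1] ?? "", args: match[2] ?? "" };
}

// Stand-in for the real file system, so the sketch runs anywhere.
const files = new Map<string, string>([["./notes.txt", "buy milk"]]);

function executeTool(call: ToolCall): string {
  switch (call.tool) {
    case "Read": {
      const content = files.get(call.args);
      return content === undefined
        ? `Error: no such file ${call.args}`
        : content;
    }
    case "Write": {
      // Assumed argument layout for this sketch: "<path> <content>".
      const space = call.args.indexOf(" ");
      if (space === -1) return "Error: Write needs a path and content";
      const path = call.args.slice(0, space);
      files.set(path, call.args.slice(space + 1));
      return `Wrote ${path}`;
    }
    default:
      return `Error: unknown tool ${call.tool}`;
  }
}

// One step of the loop: either the reply is an order (run it and feed the
// result into the next prompt) or it is the final answer for the user.
function handleReply(reply: string): { done: boolean; text: string } {
  const call = parseToolCall(reply);
  if (call === null) return { done: true, text: reply };
  return { done: false, text: executeTool(call) };
}
```

Notice that the model's side of this is pure text. Everything that touches files happens inside executeTool, in your program.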

Here's how Claude Code "reports" what it did according to the LLM's orders

And here's how this can be represented schematically:

Hooks

With the invention of Tools, your life became much more convenient. Now you don't have to copy-paste everything manually. Your middleman program does it instead of you.

But you still have to run the linter manually.

What's more, you'd like to restrict the reptiloid's ability to read the .env file or run git reset --hard & git push --force.

Of course you could hardcode such functionality right into the code of the agent itself. But in one project there's one situation, one set of needs, in another project there are others.

So you decided to create hooks:

You extended your middleman program as follows:

In your project (I emphasize: right in the code of your project), in a special file in a special little folder, you write down a list of hooks. In each hook you write:

- the condition for when exactly this hook should intervene
- what needs to be done

For example, to restrict access to the .env file you write a hook into that file, where you write:

- the condition for when to intervene: before the Read tool is executed
- what needs to be done: check whether among the list of files for Read that the reptiloid sent there's a file named .env. If there isn't, then continue. Or if there is .env, then reject the execution of the tool.

And after this, every time the reptiloid sends a reply that contains an order to your agent, the agent, before executing Read, first runs your hook.

To run the linter you add another hook to that file, where you write:

- the condition for when to intervene: after the Write tool is executed
- what needs to be done: npm run lint

I gave simple examples. But the whole point is that such an architecture of the agent allows you to configure the work of this agent very flexibly for each project.

Heads up!:

Hooks are not something that goes as context to the LLM! Hooks are a condition and code that needs to be run when it's met, which the middleman program runs regardless of whatever the LLM said. Your middleman program executes your hooks without asking the LLM about it and executes them even if the LLM is somehow against it.

Here's how this can be represented schematically:

If we take an analogy, you can imagine that writing a hook for claude code is like writing document.addEventListener("the type of event for when to intervene", (event) => what to do). And if we take this analogy, inside the "what to do" you can write event.preventDefault() (that is, abort the default execution).
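Here's a small sketch of the two hooks from the examples above, in the same addEventListener spirit. The Hook type, the runTool function and the block flag are assumptions of this illustration, not the configuration format of any real agent, and the linter is only logged instead of actually being launched:

```typescript
type ToolCall = { tool: string; args: string };
type HookEvent = "PreToolUse" | "PostToolUse";

// A hook: when to intervene (event + tool name) and what to do.
// Returning { block: true } from a PreToolUse hook cancels the tool call,
// much like event.preventDefault() in the analogy.
type Hook = {
  event: HookEvent;
  tool: string;
  run: (call: ToolCall) => { block: boolean; reason?: string };
};

// Commands the hooks "ran"; a real agent would spawn a process instead.
const log: string[] = [];

const hooks: Hook[] = [
  {
    event: "PreToolUse",
    tool: "Read",
    run: (call) =>
      call.args.endsWith(".env")
        ? { block: true, reason: "reading .env is not allowed" }
        : { block: false },
  },
  {
    event: "PostToolUse",
    tool: "Write",
    run: () => {
      log.push("npm run lint");
      return { block: false };
    },
  },
];

// The agent wraps every tool execution in hooks. The model is never asked.
function runTool(
  call: ToolCall,
  execute: (call: ToolCall) => string
): string {
  for (const hook of hooks) {
    if (hook.event === "PreToolUse" && hook.tool === call.tool) {
      const verdict = hook.run(call);
      if (verdict.block) {
        return `Blocked by hook: ${verdict.reason ?? "no reason given"}`;
      }
    }
  }
  const result = execute(call);
  for (const hook of hooks) {
    if (hook.event === "PostToolUse" && hook.tool === call.tool) {
      hook.run(call);
    }
  }
  return result;
}
```

The hooks array is plain code and data on your side. Nothing from it goes into the prompt, which is exactly why the model can't talk its way around it.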

Heads up:

I've covered only the two most commonly used hooks: PreToolUse (before the tool is executed) and PostToolUse (after the tool is executed). But there are also Stop, Notification, SubagentStop and others.

MCP servers

You got on a roll. And wanted even more. You wanted the middleman program not only to add the contents of files or make changes to those files according to the reptiloid's replies.

You wanted the middleman program to also send queries to the database according to the reptiloid's replies, or even interact with the browser according to the reptiloid's replies, or go grab information from Figma and add it to the context for the reptiloid according to the reptiloid's orders.

And in general, all of this could be done as a Tool. But that's all hardcode. In the case of Read/Write hardcode makes sense, because it's basic functionality that's definitely needed everywhere. Various additional tools, though, are something there can be a whole lot of, all very different. And for one project it's this, for another project it's that.

So you did it differently:

First, from your mason friends you learned that they came up with a special protocol for servers, MCP (special rules by which a server receives questions and sends replies, much better adapted to the work of reptiloids than, for example, sending REST API requests).

Second, you added to your middleman program the ability to work with such MCP servers from Figma, from Playwright, and from others.

Now in the settings of each separate project you add only those MCP servers that are needed in this project.

And then when you type something into the middleman program, it sends both your prompt, and the conversation history, and the list of basic tools, and additionally also the list of available MCP servers, and then the reptiloid, if it considers it necessary, sends an order to the agent to send a query to one of the MCP servers. The middleman program parses this reply, sends such a query to that MCP server, and adds its reply to the next prompt.
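The flow above can be sketched roughly like this. Treat it as a toy: the McpServer interface, the server and tool names (figma, get_frame) and the "server/tool" addressing are all invented for the illustration, and the real protocol exchanges structured messages with a running server rather than calling in-memory functions:

```typescript
// Hypothetical, heavily simplified view of an MCP server: a name,
// a list of tool descriptions, and a way to call one of those tools.
interface McpServer {
  name: string;
  listTools(): { name: string; description: string }[];
  callTool(tool: string, input: string): string;
}

// A fake server so the sketch runs without any network.
const fakeFigma: McpServer = {
  name: "figma",
  listTools: () => [
    { name: "get_frame", description: "Return a frame by id" },
  ],
  callTool: (tool, input) => `figma.${tool}(${input}) -> <frame data>`,
};

// Per-project settings: only the servers this project needs.
const projectServers: McpServer[] = [fakeFigma];

// This text is what gets glued into the prompt next to the basic tools.
function describeServers(servers: McpServer[]): string {
  const lines: string[] = [];
  for (const server of servers) {
    for (const tool of server.listTools()) {
      lines.push(`${server.name}/${tool.name}: ${tool.description}`);
    }
  }
  return lines.join("\n");
}

// The model ordered a call like "figma/get_frame": find the server,
// forward the query, and return the reply for the next prompt.
function routeCall(
  servers: McpServer[],
  target: string,
  input: string
): string {
  const slash = target.indexOf("/");
  if (slash === -1) return `Error: expected "server/tool", got "${target}"`;
  const serverName = target.slice(0, slash);
  const toolName = target.slice(slash + 1);
  const server = servers.find((s) => s.name === serverName);
  if (server === undefined) {
    return `Error: MCP server "${serverName}" is not configured`;
  }
  return server.callTool(toolName, input);
}
```

Adding an integration to a project now means adding one more entry to projectServers. The agent's own code doesn't change.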

Why Model Context Protocol? Why not another protocol?

Because after meeting the reptiloid Opus you started researching the topic of reptiloids more seriously and started going to the masons' gatherings. And there you learned that besides Opus from Nibiru, there's also Codex from the planet OpenAI, DeepSeek from the planet Hammer&Sickle, Grok from the planet FromTheHeartToTheSun, and others. And it turned out that the other masons were too lazy to make a separate protocol (and therefore a separate server) for each separate reptiloid, so they agreed to use the same protocol regardless of which reptiloid they're friends with. And you thought, like, "well why not? I'll use it too."

An important clarification: an MCP server is literally a server, it's not an abstraction. An MCP server can be run not only locally (like the ones from Playwright or Next.js devtools), but also remotely: for example Figma, GitHub, Jira.

Subagents

Everything is already very, very good. But there's a problem:

The size of the message you can send the reptiloid at once is very big, but still limited.

At some point the conversation history with all the contents of files, with the history of what was written where into which file, with the history of queries to MCP servers and their replies, all of this, all this conversation history with the reptiloid becomes so enormous that it simply doesn't fit into one message to the reptiloid.

Worse than that. Even if the message size didn't exceed the limit - still, when it's too big - that's bad. The reptiloid, even though it's very powerful, is still an alien and even with well-picked context it might, instead of the expected result, send back some gibberish. And when that context is too tangled and contradictory, that is, when your agent crammed in everything - the risk of getting humanoid anime instead of js code in reply grows even stronger.

And so it turns out:

- if the middleman program trims part of the conversation history, then context might get lost and the probability that the reptiloid sends gibberish grows
- if the middleman program doesn't trim part of the conversation history, but just crams everything in indiscriminately, the probability that the reptiloid sends, once again, gibberish, also grows. Worse than that, sooner or later the conversation history will become too big and the reptiloid will stop replying altogether

What to do?

One option is to launch a new instance of the middleman program. In it you again send the reptiloid the list of tools, the list of MCP servers and everything else, plus the task that the reptiloid assigned to the middleman program in the main chat (the main instance, the main session), and the reptiloid works on that task in a new session. Once the new instance of the middleman program (that is, the subagent) finishes its correspondence with the reptiloid, it returns only the brief results to the main session, to the main correspondence.

So that's what you did. You made it so that now your program, in addition to the information about the basic tools and the information about the MCP servers, also adds to your prompt:

Dear Opus! This is your friend Andrii!
As friends we must help one another.
So help me do this: ${promptFromUser}.
Remember that you can order me to perform one action or another, namely ${toolsWithGuide}.
Also remember that you can order me to send a query to an MCP server, namely ${mcpServersWithGuide}.

MOREOVER, dear friend Opus, if you think that none of the tools
or MCP servers I described above is enough to solve this problem, then describe in your own words what I should do and I will surely do it.

But after Opus sends a task to you, to you specifically, your middleman program doesn't wait for you specifically to do that task. After Opus sends a task to you, to you specifically, your middleman program creates a new instance of itself (of the middleman program, that is, launches a subagent), and passes this task into that new instance (into that subagent) as a starting prompt. And then when that new instance finishes the correspondence with Opus where Opus solves its own task with the hands of your subagent, that new instance of the agent (the subagent) returns to the first correspondence only the result of solving that task (and not the whole conversation history). And your middleman program writes back to Opus:

Dear Opus!
You and I were talking about:
${chatHistory}.

The task you assigned me, I diligently completed. Here's what came out of it ${answerFromSubagentChat}.
Taking this into account, please carry out the original task that I assigned you.

This sounds a bit cyberpunk, even a bit cheaty. But as we know, the reptiloid Opus doesn't remember previous conversations, and clearer context is much better for it. Everything we want to tell it - we have to tell it in this one message. So this is also a fairly popular approach.
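To see why the main context stays clean, here's a toy version of that delegation. The "DELEGATE:" prefix, the Agent class and the scripted fakeModel are assumptions made up for this sketch. A real subagent would also go through many tool calls before answering:

```typescript
type SendToModel = (fullPrompt: string) => string;

// Made-up marker: a reply starting with it means
// "do this sub-task for me in a separate correspondence".
const DELEGATE = "DELEGATE:";

class Agent {
  history: string[] = [];
  private send: SendToModel;

  constructor(send: SendToModel) {
    this.send = send;
  }

  run(task: string): string {
    this.history.push(`user: ${task}`);
    // Hard cap on steps so a confused model can't loop forever.
    for (let step = 0; step < 10; step++) {
      const reply = this.send(this.history.join("\n"));
      this.history.push(`assistant: ${reply}`);
      if (!reply.startsWith(DELEGATE)) return reply;

      // Fresh instance, fresh empty history: this is the subagent.
      const subTask = reply.slice(DELEGATE.length).trim();
      const subagent = new Agent(this.send);
      const result = subagent.run(subTask);

      // Only the result comes back, not the subagent's whole history.
      this.history.push(`user: subagent result: ${result}`);
    }
    return "Error: step limit reached";
  }
}

// A scripted fake model so the sketch is deterministic.
const fakeModel: SendToModel = (prompt) => {
  if (prompt.includes("subagent result:")) {
    return "Final answer using the summary";
  }
  if (prompt.includes("Refactor the project")) {
    return "DELEGATE: find all usages of oldApi";
  }
  return "Found 3 usages of oldApi";
};

const main = new Agent(fakeModel);
const answer = main.run("Refactor the project");
// main.history now holds 4 entries; the subagent's own messages are gone.
```

However long the subagent's correspondence gets, the main session grows by exactly one line: the summary.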

Fixed project information

A very important part of the context is the explanation of who you are, what your project is about and what kind of answer you want.
And so as not to copy this explanation manually every time, you decided to write it in README.md and make it so that your middleman program inserts the contents of this README.md into your message to the reptiloids every time.

But then you realized that README.md isn't quite convenient for this and that for the information you want to pass to the reptiloids it's better to use a different file with a different name. And so you decided to create CLAUDE.md and reworked your middleman program so that in each correspondence with the reptiloid it inserts into that correspondence the contents of the CLAUDE.md file, and not README.md.

On-demand knowledge loading (Skills)

Time went on. The number of notes for the reptiloid in CLAUDE.md kept growing. And at some point it became too enormous.

So you decided to get around this in the following way:

In CLAUDE.md you left only the most important information for the reptiloids. And everything else you cut out of there and split into many many smaller little markdown files. In each little file at the top you wrote a short name, a short description (max 1024 characters). And below - that very knowledge already in its full completeness.

And you reworked your middleman program in the following way:

Your middleman program sends the reptiloid your prompt, the list of available tools, MCP servers, and also sends the list of those little .md files and says:

Dear Opus. Here's the list of available knowledge:
${knowledges.list("name", "description") /* the content is not passed */}.
If you need any of it, then write me the keyphrase "I need the knowledge named ___"
and I'll send it to you.

When Opus decides that it needs one of those pieces of knowledge, then it sends the corresponding reply. Then your middleman program parses that reply, and goes to carry out the order - pulls out that little file, attaches it to the conversation history and resends it as a new letter to Opus. And it generates the next reply already taking into account that piece of knowledge.
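A sketch of that two-step loading, using the keyphrase from the example prompt. The Skill type, the sample skills and the function names are invented here, and the "files" are an in-memory array so the example runs without a disk:

```typescript
// One little markdown file: a short header plus the full knowledge.
type Skill = { name: string; description: string; body: string };

// Stand-in for the folder of .md files.
const skills: Skill[] = [
  {
    name: "db-migrations",
    description: "How we write and run database migrations",
    body: "Step 1: create a migration file. Step 2: run it locally first.",
  },
  {
    name: "release",
    description: "Release checklist",
    body: "Bump the version, update the changelog, tag the commit.",
  },
];

// Only names and descriptions go into the first prompt.
// The bodies stay out of the context until someone asks for them.
function listSkills(all: Skill[]): string {
  return all.map((s) => `- ${s.name}: ${s.description}`).join("\n");
}

const KEYPHRASE = "I need the knowledge named ";

// If the reply asks for a piece of knowledge, return its full text so it
// can be attached to the next prompt; otherwise return null.
function loadRequestedSkill(all: Skill[], reply: string): string | null {
  const at = reply.indexOf(KEYPHRASE);
  if (at === -1) return null;
  const rest = reply.slice(at + KEYPHRASE.length).trim();
  const requested = rest.split(/\s+/)[0] ?? "";
  const skill = all.find((s) => s.name === requested);
  return skill === undefined
    ? `Error: unknown knowledge "${requested}"`
    : skill.body;
}
```

The saving is in listSkills: two short lines go into every prompt instead of two full documents, and a body is paid for only in the conversations that actually need it.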

Why are skills called skills, and not on-demand knowledge loading?

I also agree that on-demand knowledge loading fits this feature much better.
But if you tell investors: "We added to the middleman program the ability to split knowledge from one big file into many small ones and to send the LLM only the list of these files at the start so as not to overload the context. And we named this feature On-demand Knowledge Loading" - nobody will give you any money.

But the statement: "Our agents are now not just a mix of a constructor for building prompts and a parser that then performs one action or another. Our agents are now something new and incredible: they can have SKILLS. This is a new revolution! A new breakthrough! The world has changed irreversibly. We've overtaken our competitors threefold, gotten twice as close to inventing AGI, and we're about to take over the world. Bring the money while we're still taking it!" attracts investments much better. So it just historically worked out that they were named Skills after all, and not On-demand Knowledge Loading. Frankly, it's time to get used to the fact that in the 21st century the names for tools are given by marketers, not engineers.

Commands

In the process of using your toolset, you noticed that you write some prompts regularly. That there are situations when you write the very same thing every time. Only the argument changes (for example, the path to a file).

At first, you kept a list of template prompts in a separate notebook and copied them manually whenever you needed one. Eventually that got tedious, so you built something better:

Inside your project, you created a small commands folder. Each template prompt got its own .md file. Then you updated your middleman program to read that folder - so now, instead of copying anything, you can just type /filename in the chat like a slash command, follow it with an argument, hit enter, and the agent automatically expands it into the full prompt text, as if you'd typed it yourself.
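The expansion itself is a few lines of string handling. In this sketch the commands folder is an in-memory Map, the review command is an invented example, and the $ARGUMENTS placeholder is just the convention assumed here for marking where the argument goes:

```typescript
// Stand-in for the commands folder: file name (without .md) -> template.
const commands = new Map<string, string>([
  ["review", "Review the file $ARGUMENTS for bugs and list them by severity."],
]);

// Turns "/review src/index.ts" into the full prompt text.
// Anything that is not a known slash command is passed through untouched.
function expandCommand(input: string): string {
  if (!input.startsWith("/")) return input; // an ordinary prompt
  const space = input.indexOf(" ");
  const commandName = space === -1 ? input.slice(1) : input.slice(1, space);
  const args = space === -1 ? "" : input.slice(space + 1).trim();
  const template = commands.get(commandName);
  if (template === undefined) return input; // unknown command
  // Replace every occurrence of the placeholder with the argument.
  return template.split("$ARGUMENTS").join(args);
}
```

The model never learns that a command was involved. It simply receives the expanded text, as if you'd typed it yourself.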

Here, by the way, the name conveys the essence pretty well.

Remarks

- The material was written by a human. The reptiloids only helped with translation to English.
- For a clean conscience I have to clarify: in real life Opus, Sonnet and Haiku don't show up to people in the Zone, they live in data centers (and not on Nibiru), and they usually receive the list of available tools from the agent in JSON format, as a separate field in the payload with its own special syntax, and not as that «!_&Read&_!» inside the prompt like I gave as an example.
- Sorry if the thumbnail looks a bit too crazy. I'm experimenting.
- Despite all this development of middleman programs, LLMs have remained LLMs. And they still hallucinate. Even with a ton of settings, a ton of correctly picked contexts etc., they still hallucinate quite often. And they hallucinate very dangerously, because they do it very plausibly (the mistakes look very similar to the truth and are very hard to tell apart). The author is merely reviewing the existing features and is not calling for YOLO-mode vibe-coding. Especially in production.

The End

Let's sum up:

- LLMs are stateless - they don't remember your previous messages. The whole conversation history is resent every time by the middleman program (the agent).
- Tools are not magic: the agent describes to the LLM which "orders" it can send, the LLM sends them, and then the agent performs the real actions on the computer. And the agent is not some "smart" thing, but an ordinary program that, besides button presses, uses the LLM's orders as a trigger to perform actions.
- Hooks - a way to intervene in the execution of a tool "before" or "after" it runs (or at any other moment of the lifecycle) and add your own logic (linter, access restrictions, etc.).
- MCP servers - servers that work according to a special standard, the Model Context Protocol, so as not to hardcode every integration into the agent itself.
- Subagents - these are sort of a "meta-tool". That is, instead of Read, Write, or calling an MCP, the agent opens a new subagent, which starts its own correspondence with the LLM and, once it reaches a result, returns only a short summary rather than the whole history, to keep the context in the main agent clean.
- CLAUDE.md - a file with the fixed context of the project or folder, which the middleman program inserts into every prompt. Sort of an analog of README.md, but for the LLM.
- Skills - On-demand knowledge loading - the same CLAUDE.md, but cut into pieces that are loaded on demand.
- Commands - template prompts that are convenient to call via /prompt-name.