Rizèl Scarlett for Entire

Posted on May 3 • Edited on May 8

Turning Agent History into Procedural Memory

#ai #agents #agentskills #entire

For about a year, my primary coding agent was goose. Since I worked at Block and served as a Developer Advocate for the project, I was deeply embedded in its ecosystem. I contributed code and provided product feedback that shaped how it functioned.

Then, I moved to a company called Entire that provides the infrastructure for the agentic software development lifecycle. To do my job well, I have to dogfood our product across the agentic ecosystem. This means I am constantly switching between Claude Code, Codex, and other agents that support hooks to contribute to docs, investigate and resolve bugs, understand new features, and produce content.

Switching between AI agents made me realize every agent has tradeoffs. Some are faster or more polished, but I find myself deeply missing a specific goose feature called recipes.

The Problem with Operational Glue

Recipes are reusable, shareable workflows. At the core, they are YAML files that describe a process you want goose to run again. You can write the YAML files manually, but I always preferred the magic of clicking a button to package a successful session into a recipe.

My work in Developer Relations is creative, but it’s built on repeatable systems. For example, writing a blog post, building a code demo, and creating a video are creative. The publishing process is not. Publishing a blog post involves a series of tiny, forgettable steps: checking the folder structure, adding the correct front matter, wiring up the metadata, dropping the image in the right asset folder, opening the PR. Each of these steps takes a few minutes, but those minutes add up and become hours of operational glue.

At Block, I automated as much of that as possible. I had goose generating release notes in CI/CD and creating documentation tickets in Asana. Some of these ran on a schedule, others I triggered manually. The point was always the same: if I found myself explaining a process to an agent more than once, it was an operational smell that needed to become a reusable asset.
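To make the "tiny, forgettable steps" concrete, here is a minimal sketch of one of them, checking front matter, written as a script instead of a prompt. The required keys (`title`, `date`, `description`, `image`) and the `missing_front_matter` helper are hypothetical; your repo's conventions will differ.

```python
# Hypothetical set of keys a blog post must declare; adjust to your repo.
REQUIRED_KEYS = {"title", "date", "description", "image"}


def missing_front_matter(markdown_text, required=REQUIRED_KEYS):
    """Return the required front matter keys that a post does not declare."""
    lines = markdown_text.splitlines()
    # Front matter is assumed to be a block delimited by '---' at the top.
    if not lines or lines[0].strip() != "---":
        return sorted(required)
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the front matter block
        # Only top-level "key: value" lines count; skip nested lines and comments.
        if ":" in line and not line.startswith((" ", "\t", "#")):
            keys.add(line.split(":", 1)[0].strip())
    return sorted(required - keys)


post = "---\ntitle: Hello\ndate: 2025-05-03\n---\nBody text"
print(missing_front_matter(post))
```

A check like this is the deterministic part of the workflow. The agent still handles the parts that need judgment, such as wording and structure.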

While my use cases focus on content and community, this pattern is universal. In many fields, employees find themselves frequently explaining the same sequence to an agent, so why not automate this into repeatable workflows?

For engineers, those repeated conversations may look like:

Upgrading a dependency safely
Bootstrapping a new microservice
Triaging a production error
Writing a design doc or RFC
Preparing a release PR

The inputs and thinking may vary, but the process stays the same: the conventions, the file paths, the validation steps, the commands you run, the people you always notify. And this level of automation is necessary today, when employers are demanding more output.

The System of Record

I'm constantly jumping between different agents. Each one has its own process for automating workflows, but none of the automation tools hit the mark for me like goose did. While I don’t have access to my treasured repeatable workflows anymore, I do have access to the unique and valuable agent session data that Entire collects. Entire is a CLI-first system of record for agent-assisted development. It captures the context behind your work: the sessions, prompts, responses, tool calls, file changes, and Checkpoints. A Checkpoint is a specific moment where work is tied back to git. It connects the "why" of the agent session to the "what" of the final commit.

I realized this data isn't just for review, audit, or to sit quietly in the background. It's a source of truth that can be used for building better workflows. I thought, “What if I could use my Entire session history to recreate that ‘package up a session’ magic, but in a way that works across any agent, and works retrospectively?”

The most popular way people are currently building reusable workflows is with Skills, so I built an orchestrator skill called Session-to-Skill. It creates Skills for me based on repeated behavior.

The Before and After

Before, I used to say:

“Look at past blog posts in this repo, check the folder structure, and the front matter.”
“I want to add a new blog post. Here’s the content: [insert content copied from Google Doc here]”
“Create a new PR. Make sure we’ve pulled the latest from main and branch off main before you create this PR.”
“Why did you make the word Checkpoints lowercase when I purposely had them capitalized? Please restore that.”
“Does the OG image work? What’s the path for me to check that again?”

Now, I can say:

“Create a blog post from this content: [insert content copied from Google Doc here].”

This is possible because I prompted my agent to use the Session-to-Skill Skill: "Look at my past sessions where I set up blog posts. Find the repeated steps and conventions, then draft a Skill from that data, so I can create blog posts quickly in the future." My agent created a Skill called Create-blog, which included requirements to properly format the blog, open a PR, and return the path to confirm the OG image rendered.

Well, that’s kind of dumb...

Some may push back on the idea of building an orchestrator Skill, because at any moment in a session you can prompt any agent to turn that session into a Skill.

The reality is I don’t have perfect foresight. Most reusable workflows are recognized later. After the third time I publish a blog post, I realize I have been doing the same thing over and over again. By then, the valuable evidence is spread across past sessions.

There is also the issue of quality. Asking an agent to summarize a transcript often leads to overfitting and noise. The resulting Skill might include accidental details, temporary file paths, or one-off preferences that happened to be present in that single session.

Instead, my Skill extracts the answers to the following questions:

What was the reusable behavior?
What should a future agent know before attempting this again?

I don't have to remember the session ID from six weeks ago. I just know the work happened. The Skill uses Entire to search my session metadata, checkpoints, and explanations of prior work to find the durable pattern.
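The difference between summarizing one transcript and looking across several sessions can be sketched in a few lines. This is an illustration of the idea only, not how Session-to-Skill or Entire work internally: the session data is made up, and `durable_steps` is a hypothetical helper that keeps a step only if it recurs in enough sessions.

```python
from collections import Counter


def durable_steps(sessions, min_sessions=2):
    """Return steps seen in at least `min_sessions` sessions, in first-seen order."""
    counts = Counter()
    first_seen = []
    for steps in sessions:
        # Deduplicate within a session so one noisy session cannot inflate a count.
        for step in dict.fromkeys(steps):
            if step not in counts:
                first_seen.append(step)
            counts[step] += 1
    return [step for step in first_seen if counts[step] >= min_sessions]


# Made-up step lists standing in for three past blog-publishing sessions.
sessions = [
    ["pull latest main", "create branch", "add front matter",
     "add OG image", "fix typo in README", "open PR"],
    ["pull latest main", "create branch", "add front matter",
     "add OG image", "open PR"],
    ["pull latest main", "create branch", "add front matter",
     "use /tmp/draft.md", "open PR"],
]

print(durable_steps(sessions))
```

With a single session as input and `min_sessions=1`, the one-off "fix typo in README" step survives, which is the overfitting problem described above. Across three sessions, only the repeated steps remain.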

Procedural Memory as Infrastructure

My approach creates procedural memory for agents. Procedural memory is the answer to the question, "How do I do this kind of work well, here, in this repo, with this team?"

Most daily engineering work is not net-new. You may receive a new ticket, but somebody has likely solved a similar problem before.

By using Entire's data to generate Skills, I get a layer of determinism and portability. The agent starts with a template based on real work rather than a generic prompt. It encodes patterns that have already succeeded. And because Skills are portable files, I can take my blog-publishing Skill from Claude Code to Codex without re-explaining my workflow and share it with teammates.
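Portability comes from the Skill being a plain file. As a rough sketch, assuming the common convention of a SKILL.md file with `name` and `description` front matter followed by Markdown instructions, a list of durable steps could be rendered like this. The `render_skill` helper, the skill name, and the steps are hypothetical, and this is not the output format of my Session-to-Skill Skill.

```python
import json


def render_skill(name, description, steps):
    """Render a minimal SKILL.md-style document from a list of steps."""
    lines = [
        "---",
        f"name: {name}",
        # json.dumps quotes the text so colons in it do not break the YAML.
        f"description: {json.dumps(description)}",
        "---",
        "",
        f"# {name}",
        "",
        "## Steps",
        "",
    ]
    lines += [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    return "\n".join(lines) + "\n"


skill_text = render_skill(
    "create-blog",
    "Publish a blog post: format it, open a PR, and return the OG image path.",
    ["Pull the latest from main and branch off main",
     "Add the post with the correct front matter",
     "Open a PR and return the path to check the OG image"],
)
print(skill_text)
```

Because the result is just text, it can be committed to the repo, reviewed like any other change, and picked up by whichever agent supports Skills.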

With all this said, I urge readers to stop treating agent sessions as disposable and start turning that history into infrastructure.

Check out Entire at entire.io

Top comments (2)

Mads Hansen • May 4

The “operational glue” framing is spot on.

A lot of teams talk about agents as if the magic is only in reasoning. In practice, the compounding value is often in turning repeated context into a reusable operating surface: conventions, validation steps, required metadata, review paths, and the small bits nobody remembers until they break.

I also like the point about retrospective workflow extraction. Asking an agent to create a reusable process from one successful session can overfit badly. Looking across repeated behavior is much closer to how teams actually standardize work.

This same idea shows up when teams expose internal data or APIs to agents. The first version is usually one-off glue. The production version needs a stable tool contract, context, auditability, and ownership.

In other words: if you explain it twice, it probably wants to become infrastructure.

Fraser Young • May 7

Thank you for this!