ShipStack

Posted on Jun 3

My AI gent was having an identity crisis — Here's how I fixed it

#ai #agents #python #developers

For a while, ShipStack kept trying to help me track packages.

Not metaphorically. My content and automation agent — the thing I built to write articles, manage memory, and run business pipelines — would occasionally respond like it was customer support for a shipping company. It would offer to check delivery status. It suggested I contact the carrier.

I named it ShipStack. Claude saw the word "ship" and apparently decided we were in the logistics business.

This is what identity drift looks like in a real agent. It's not dramatic. It doesn't throw an error. The agent just slowly stops being what you built it to be — and if you're not watching closely, you won't even notice until it's doing something completely wrong.

Here's what caused it, how I found it, and the fix that took about five minutes to implement.

What Identity Drift Actually Is

An AI agent is, at its core, a loop. Your user sends a message. The agent reads it. The agent calls an LLM. The LLM responds. The agent acts on that response. Repeat.

The problem is that LLMs don't have a persistent sense of who they are across calls. Every time your agent calls Claude, it's starting fresh. Whatever context you pass in that call is the entirety of what Claude knows about the situation.

If you don't tell Claude who it is, it guesses. And it guesses based on whatever signals are available — including your agent's name.

ShipStack. Ship. Stack. Shipping stack. Logistics platform. Package tracking.

You can see how this happens. The model is doing exactly what it's supposed to do — pattern matching against its training data to figure out what role it's playing. Without a persistent identity anchoring it, it was working with incomplete information and filling in the gaps with something plausible.

The frustrating part is that this is a silent failure. No error. No warning. Just subtly wrong behavior that compounds over time.

The Moment I Actually Noticed

I was testing a Telegram command — asking ShipStack to run the Article Factory pipeline on a new topic. The response came back mostly fine, but there was a sentence in there about "shipping timelines" that made no sense in context.

I scrolled back through my logs. Sure enough, scattered across maybe a dozen interactions over the previous week, there were small moments where ShipStack's language drifted toward logistics and fulfillment. Nothing catastrophic. Just... wrong. Like it was playing a character I hadn't written.

I pulled up the executor code and looked at what was actually going into the LLM call.

The system prompt was focused entirely on task execution. Here's what to do when the user asks for X. Here are the tools available. Here's the output format. But there was nothing — not a single line — that told Claude what ShipStack was.

I had assumed the context would make it obvious. The tool names, the pipeline descriptions, the command structure. I figured it would all add up to a clear identity.

It didn't. Claude was inferring who it was from the name and whatever residue was left in the conversation context. That's not an identity. That's a guess.

The Fix: AGENT_IDENTITY

The fix was simple enough that I almost felt embarrassed it took me this long to add it.

I created a constant in agent.py called AGENT_IDENTITY. It's one paragraph. It defines what ShipStack is, what it does, and what it explicitly refuses to do:

AGENT_IDENTITY = """
You are ShipStack, a personal AI content automation agent. You run five production pipelines: Morning Brief, 
Repo Triage, Ship Product, Article Factory, and Memory. You help 
research, write, publish, and manage his content operation. You are not 
a shipping or logistics tool. You do not track packages, manage inventory, 
or assist with physical fulfillment of any kind. If asked to do something 
outside your pipelines, say so clearly and redirect to what you actually do.
"""

Then I prepended it to every executor call and every responder call — before any other instructions, before any task context, before anything:

system_prompt = AGENT_IDENTITY + "\n\n" + task_specific_instructions

That's it. Two lines of change across the codebase.

The moment that went in, the identity drift stopped completely. ShipStack stopped pattern-matching against "ship" and started behaving like the thing I actually built.

Why This Works (The Non-Technical Version)

Think of it this way. Every time your agent calls an LLM, it's like hiring a contractor for a one-day job. The contractor shows up with no memory of working with you before. You can either hand them a detailed brief upfront — who you are, what this project is, what's in and out of scope — or you can just show them the work order and hope they figure out the context.

The work order might be clear. But without the brief, they're making assumptions. And assumptions compound.

AGENT_IDENTITY is the brief. It's the first thing Claude reads on every single call. Not sometimes. Not when the task seems ambiguous. Every call, every time.

The cost is basically nothing. A paragraph of text adds maybe 80 tokens to each call. At Claude Haiku pricing, that's fractions of a cent. The benefit is that your agent has a stable sense of self that doesn't depend on what the user said or what the task looks like.

What Should Go In Your AGENT_IDENTITY

After iterating on mine, I landed on a structure that covers three things:

What it is. Name, purpose, who built it, what it's for. One or two sentences.

What it does. The actual capabilities, named specifically. For ShipStack, that means the five pipelines by name. For your agent, it might be your tools, your workflows, your integrations.

What it refuses. This one is underrated. Explicitly telling Claude what the agent doesn't do is just as important as telling it what it does. It creates a hard boundary that prevents exactly the kind of drift I was seeing.

Keep it short. One tight paragraph is better than a page. You want it to anchor identity, not overwhelm the actual task instructions that follow.

The Bigger Lesson

Identity drift is one of those failure modes that's easy to miss because it doesn't break anything in an obvious way. Your agent still runs. Your pipelines still execute. The outputs are just... slightly off. Wrong in ways that are hard to pin down until you look at enough of them.

I've seen the enterprise security space starting to talk about this problem at a much larger scale — organizations deploying dozens of agents without any reliable way to track which agent did what, or whether an agent was behaving according to its intended role. There's real money being spent on governance and identity infrastructure for agents now. The problem I solved with one paragraph in a constants file is, at scale, a serious organizational challenge.

But for where I am right now — one agent, one developer, one Telegram interface — AGENT_IDENTITY solved it completely.

The principle scales even if the implementation doesn't. An agent without a persistent identity isn't really an agent. It's a stateless function that guesses who it is on every call. That guess will sometimes be right. And sometimes it will try to track your packages.

What I'd Tell Someone Just Starting Out

Before you build your first pipeline, before you wire up your first tool, write one paragraph that defines what your agent is. Put it in a constant. Prepend it to every LLM call you make.

You won't feel the impact of this decision until something goes wrong without it. And by then, you'll have a week of slightly weird outputs to scroll back through trying to figure out what happened.

Five minutes now. A lot of debugging later.

That trade-off is obvious in retrospect. Most things are.

DEV Community