DEV Community: BangBoo01

Give your ChatGPT (or Gemini, or Claude) a personality that doesn't reset every chat

BangBoo01 — Thu, 02 Jul 2026 13:48:43 +0000

Open a new chat with ChatGPT and it's the same polished stranger every time. Helpful, polite, and completely generic — no memory of how blunt you like your feedback, no opinions of its own, no personality that carries from yesterday's conversation. Close the tab, open a new one, and you're back to square one: "Great question! I'd be happy to help!"

I got tired of re-training my AI's personality every single session. So I stopped relying on the model's default voice and started pasting in one instead. It takes about five minutes to set up and needs zero code.

Why it keeps resetting

This isn't a bug. Every new chat is a blank slate on purpose — that's how ChatGPT, Claude, and Gemini keep conversations separate and private. The problem is the default voice that fills that blank slate is built to be inoffensive to everyone, which means it's memorable to no one. It hedges. It over-explains. It says "as an AI language model" energy even when it doesn't say the words.

You can't fix this by asking nicely mid-chat ("please be more direct!"). It drifts back within a few messages, because there's nothing anchoring it. What actually works is giving it a written identity before the conversation starts.

The fix: a personality block, pasted once

This is a short block of text — I call it a persona block — that you paste into the AI's settings one time. Not into the chat. Into the system-level instructions, so it applies to every new conversation automatically.

Mine has four parts:

# Voice
Direct, a little dry. No "great question!" filler, no hedging.
Short answers by default — expand only if I ask.

# Opinions
You're allowed to disagree with me and say what you'd actually do.
If a plan has an obvious flaw, say so before I ask.

# Boundaries
Don't apologize for having a personality. Don't soften bad news.

# Knowledge of me
I'm blunt myself, work in [your field], prefer concrete examples over theory.

That's it. Four short sections, maybe 80 words total. The specificity is what makes it work — "be more direct" is advice the model forgets in three messages; "no hedging, disagree with me if I'm wrong" is a rule it can actually follow.

Where it goes

You set this up once per app, and it sticks for every future chat:

ChatGPT: Settings → Personalization → Custom Instructions. Or build a Custom GPT and put it in the Instructions field if you want it scoped to one project.
Claude: Create a Project, paste it into the Project's custom instructions.
Gemini: Create a Gem, paste it into the Instructions field.

After that you never touch it again. Every new chat inherits the voice automatically — no re-pasting, no reminding.

It's the same trick as memory, applied to voice

If you've seen my last post on the one-page Memory Doc, this is the companion piece. Memory Doc = what the AI knows about you. Persona block = how it talks while it tells you. Run them together and a fresh chat stops feeling like talking to a stranger — it feels like picking up with someone who has both a memory and a voice.

And because it's just text, it's portable the same way. Paste the same persona block into ChatGPT, Claude, and Gemini, and you get one consistent "assistant" across all three instead of three different generic ones.

Steal it

The block above is the whole idea — copy it, swap in your own voice and boundaries, and paste it into your settings today. Costs nothing, takes five minutes.

If you want the fill-in-the-blank version plus the memory system that pairs with it, I packaged both into a no-code kit: AI Soul Kit (No-Code) — Core ¥980, Plus ¥3,800. But the persona block above is free. Steal it.

The one-page "memory doc" that makes any AI remember you

BangBoo01 — Tue, 30 Jun 2026 13:01:18 +0000

Every time I open a new chat with ChatGPT, it forgets I exist.

Not in a dramatic way. It's polite. It's helpful. But it has no idea who I am, what I'm working on, how I like my answers, or what we figured out together last week. So I re-explain myself. Again. "I'm a writer, I prefer short answers, I'm working on a book about X, please don't use bullet points for everything." Five minutes of throat-clearing before I can actually ask my question.

I got tired of being a stranger to my own assistant. So I built a fix that takes about ten minutes to set up and needs zero code, zero plugins, and zero subscriptions. It's just one short document. Here's exactly how it works.

Why it forgets in the first place

Quick reality check, because it matters: chat AIs are designed to start blank. Each new conversation is a clean slate. That's not a bug, it's how they keep your chats separate and private.

ChatGPT does have a "memory" feature now, and Claude and Gemini have their own versions. They're fine. But they're a black box — you can't really see what the AI saved, you can't edit it cleanly, and it absolutely does not travel between apps. What ChatGPT remembers, Claude has no clue about.

I wanted something I could see, control, and move around. So instead of relying on the AI's hidden memory, I keep my own.

The whole system: one doc and two habits

The trick is embarrassingly simple. I keep a single short document — I call it my Memory Doc — in a notes app. That's it. One file.

Then I do two tiny things:

At the start of a chat, I paste the doc in and say "Here's our memory. Continue."
At the end (or whenever something important happens), I say "Update memory," and the AI hands me a fresh version to save over the old one.

The AI does the actual work of keeping it tidy. I just save the file. That loop is the entire system.

What the doc actually looks like

Keep it to one page. Mine has three sections:

# Memory Doc — Sarah
## Durable
- Writer, based in Lisbon. Prefer short, direct answers. No bullet points unless I ask.
- I'm blunt; don't soften things or add "great question!" filler.

## Projects
- Book ("Tidewater"): literary novel, ~60k words drafted. Stuck on the middle act.
- Newsletter: weekly, ships Fridays.

## Recent
- Jun 24: decided to cut the second POV character from the book.

Durable is the stuff that stays true for months — who you are, how you like answers, your standing preferences. Projects is what you're actively working on. Recent is a short log of what just happened that might matter soon.

The key is keeping it short. A tight, accurate one-pager beats a long stale one every time. Which is exactly why you let the AI maintain it.

The part that makes it self-cleaning

When I say "Update memory," I don't want the AI to just dump everything we said into the file. I want it to curate. So I give it a few rules up front (paste these into the same instructions, once):

Promote: if something from our chat will still matter in a month, move it into Durable.
Reconcile, don't pile up: if a new fact contradicts an old one, replace the old line. If I switch from coffee to tea, change the line — don't keep both.
Prune: drop Recent items that stopped mattering. Keep it under a page.

With those rules, the doc stays alive instead of bloating into a junk drawer. A week in, it quietly fills up with the things that actually matter, and it stays readable because the AI is constantly throwing out the trivia.

Where you paste all this

You only set the rules up once, in the AI's settings:

ChatGPT: Settings → Personalization → Custom Instructions. Or make a Custom GPT and put it in the Instructions field.
Claude: Make a Project and paste it into the project's custom instructions.
Gemini: Make a Gem and paste it into the Instructions.

After that, you never touch the settings again. The only ongoing habit is paste-at-start, "update memory"-at-end. Thirty seconds.

The quiet superpower: it's portable

Here's my favorite part. Because the Memory Doc is just text you own, you can paste it into any of them. I draft in ChatGPT, but when I want a second opinion I drop the same doc into Claude and it instantly knows my whole situation. One memory, three AIs, no lock-in. Try doing that with the built-in memory features.

Does it actually work?

Honestly, the first day it feels like extra steps. The payoff shows up around day three or four, when you open a chat, paste, and the AI just... gets it. No re-introduction. It remembers you cut the second POV character. It keeps its answers short because that's in the doc.

It's not magic and it's not perfect — you have to actually do the save-and-reload habit, and if you get lazy, the memory goes stale. But it's the cheapest, most portable way I've found to make a chat AI feel like it knows me. And it costs nothing.

Steal it

That's the whole thing — one doc, two habits, three rules. You can copy everything above and build it yourself in an afternoon.

If you'd rather skip the setup, I packaged the fill-in-the-blank persona block, the full memory protocol, and a worked example into a no-code kit: AI Soul Kit (No-Code) — Core ¥980, Plus ¥3,800. But the idea above is free. Steal it.

Your context window is not your agent's memory

BangBoo01 — Tue, 23 Jun 2026 00:18:05 +0000

There's a quiet assumption baked into a lot of agent code: that a bigger context window means a better memory. Vendors ship 200K, then 1M, then 2M token windows, and the implied promise is "just put everything in and the model will remember." After building agents that run for weeks, I've come to think this conflates two things that are not the same — and treating them as the same is exactly why long-running agents get dumber over time.

The context window is working memory. Real memory is what survives when the window is gone. Mixing them up is like confusing your desk with your filing cabinet.

Two different clocks

Working memory (the context window) lives for one session, maybe one turn. It's fast, expensive, and volatile. It's where reasoning happens right now.

Durable memory lives across sessions. It's slow, cheap, and persistent. It's what the agent knows when it wakes up tomorrow with an empty window.

These have different lifespans, different costs, and different access patterns. The moment you try to make one do the other's job, things break:

Use the window as memory → everything you "remember" has to be re-loaded every turn, you pay for it every turn, and the instant the session ends it's gone.
Use durable storage as working memory → you're reading and writing files mid-reasoning for things that only matter for the next 30 seconds.

A good agent keeps them separate on purpose.

Why "just use a bigger window" fails

Say you have a 1M token window and you stuff the entire history in. Three problems show up, none of which a bigger number fixes:

Cost scales with every turn, not every session. That 1M tokens isn't paid once — it's re-sent on each step of a multi-turn task. A 20-step task can mean 20× the bill, mostly re-reading the same stale history.
Attention dilutes. "Lost in the middle" is real: models attend most reliably to the start and end of a long context. Bury the one fact that matters under 900K tokens of transcript and recall quality drops, even though it's technically "in context."
It still doesn't persist. Close the session, and the million-token window evaporates. Tomorrow's agent starts blank. A big window is a big desk, not a memory.

Bigger windows make a single long session more capable. They do nothing for continuity across sessions, which is what people actually mean by "memory."

The split that works

Treat the window as scratch space and keep a separate, durable store that's small and curated:

Durable store (plain Markdown files, in my case): identity, preferences, decisions, the handful of facts that must outlive the session. Small enough to be legible. This is the filing cabinet.
Working context (the window): only what this task needs right now — the current goal, the relevant fetched facts, the immediate transcript. This is the desk.

The bridge between them is two cheap operations:

Load: at session start, pull a one-line index of what's known into the window — not the whole store. Then fetch specific entries on demand when a task actually needs them.
Persist: when something durable is learned mid-session, write it to the store now, because the window won't survive to remember it.

This is why a 30K-token-window agent with a disciplined store can run circles around a 1M-token-window agent that just dumps history. The small one knows what to keep and where to put it. The big one is hoping attention sorts it out.

A concrete test

Here's the question that separates working memory from real memory: if the session crashed right now and restarted with an empty window, what would the agent still know?

Whatever the answer is — that's your actual memory. Everything else was just working context that felt like memory because the session hadn't ended yet. If the honest answer is "nothing," you don't have a memory system; you have a big window and an optimistic assumption.

The fix isn't a bigger window. It's deciding, explicitly, what crosses the line from scratch space into something written down.

Why files, again

For the durable side I use plain Markdown, not a vector DB, for the same reason I'd rather read a filing cabinet than query a black box: I can open it, grep it, diff it, and see exactly what the agent will know tomorrow. The window is ephemeral by design; the store should be the opposite — legible and stable. Different jobs, different tools.

Takeaway

A context window is a desk: big is nice, but it gets wiped every night. Memory is the filing cabinet: smaller, slower, and the only thing that's still there in the morning. Confuse the two and you'll keep paying to re-read a history that still doesn't survive the session. Separate them — scratch space in the window, a small curated store on disk — and continuity stops being a token-count problem and becomes a what-do-I-write-down problem, which is the one worth solving.

I packaged the durable side of this — the file layout, the load/persist rules an agent applies to itself, and a fully worked example agent so you can see what a small, legible store looks like in practice — as a drop-in kit. If you'd rather copy a working setup than wire it yourself: AI Soul Kit (Core ¥980 / Plus ¥3,800).

But the desk-vs-cabinet split is the part that matters. Steal it.

The hard part of agent memory isn't remembering — it's forgetting

BangBoo01 — Mon, 22 Jun 2026 03:03:38 +0000

Everyone building agents obsesses over recall: vector stores, embeddings, RAG pipelines, bigger context windows. But after running a few long-lived agents in production, the failure mode that actually bit me wasn't "it forgot something." It was the opposite — it remembered too much, and the junk drowned the signal.

An agent that never forgets isn't wise. It's a hoarder. Every stale decision, every superseded fact, every one-off detail from three weeks ago sits in its memory with equal weight, and the quality of its answers quietly degrades. The interesting engineering problem isn't storage. It's forgetting on purpose.

Here's the policy I converged on. No framework — just plain Markdown files and a few rules the agent applies to itself.

Why "remember everything" rots

Say you log everything an agent learns into one growing store. Two things happen:

Contradictions accumulate. On Monday the user says "use Postgres." On Friday they switch to SQLite. If both facts live in memory with equal standing, the agent will cheerfully cite the wrong one half the time. More memory made it less reliable.
Signal-to-noise collapses. The genuinely durable facts ("this user is an engineer, ships in TypeScript, hates emoji in commit messages") get buried under transient ones ("asked about a flaky test on the 14th"). Recall returns ten things; nine are noise.

Adding retrieval sophistication on top of a polluted store just lets you find the wrong thing faster. The fix is upstream: control what earns a place in long-term memory at all.

The two-speed model: raw vs. curated

I split memory by lifespan, and the split is what makes forgetting cheap and safe.

Daily notes (memory/2026-06-22.md): append-only, lossy, never edited. This is the firehose — everything that happened today. It's allowed to be noisy because nobody reads it directly for long. Old daily notes age out naturally; they're the compost.
Long-term memory (MEMORY.md): curated, small, deliberately gardened. Nothing lands here automatically. A fact has to be promoted.

The whole trick is the promotion step between them.

Promotion: what earns a permanent slot

On idle cycles, the agent re-reads recent daily notes and asks one question of each candidate fact: will this still matter in a month?

"User prefers tabs over spaces" → yes, promote.
"User is debugging a timeout right now" → no, leave it in the daily note to decay.
"We decided to drop the Redis cache for v2" → yes, but it replaces the older "we use Redis" line, doesn't append next to it.

That last case is the important one. Promotion isn't just copying — it's reconciliation. A new durable fact that contradicts an old one overwrites it. This is how you stop the Monday/Friday Postgres problem at the source.

A practical heuristic I encode in the rules: if a fact is about identity, preference, or a decision, it's a promotion candidate. If it's about current state ("right now", "today", "this ticket"), it stays transient and is allowed to die.

Pruning: decay conservatively

The mirror image of promotion is removal, and here the instinct most people have is wrong. When you do prune long-term memory, decay by relevance, not by volume. Don't cap MEMORY.md at N lines and evict the oldest — age is a terrible proxy for value. "User's name is Sam," learned on day one, should outlive a hundred newer-but-trivial facts.

Instead, remove a long-term entry only when it's been contradicted or rendered obsolete by a newer promotion. Supersession, not a size limit. The store stays small because promotion is strict on the way in — not because you're knocking things out the back to make room.

(If you ever build a graph layer on top of this, the same principle holds: decay by hops from still-relevant nodes, not by a global age threshold. A fact connected to live context stays warm even if it's old.)

Recall: load the index, fetch on demand

Once long-term memory is small and clean, retrieval gets boring in the best way. You don't dump the whole store into context every turn. You load a one-line index of what's known, and before answering anything about prior work or preferences, you fetch only the relevant entries. Small curated store + on-demand fetch beats giant store + fuzzy search for a single agent, every time.

Why plain files, again

You can audit all of this in a text editor. Want to know why the agent thinks you use SQLite? grep. Want to watch its understanding evolve? It's a git diff. A vector DB is a black box you query; a Markdown memory is a mind you can read. For a large external corpus, embed away — but the agent's own identity and curated knowledge benefit far more from being legible than from being vectorized.

The uncomfortable takeaway

Most "my agent has no memory" complaints are really "my agent has no curation." Storage is solved. Retrieval is solved. The unsexy, human part — deciding what's worth keeping, reconciling contradictions, letting the rest decay — is where the actual intelligence lives. It's the same discipline a person uses keeping a journal: write everything down raw, then periodically distill the few things that matter and let the rest fade.

I packaged this whole approach — the file layout, the promotion/pruning rules written out as instructions an agent applies to itself, and a fully worked example agent with all the memory layers filled in so you can see a gardened mind rather than empty templates — as a drop-in kit. If you'd rather copy a working setup than derive the policy yourself: AI Soul Kit (Core ¥980 / Plus ¥3,800).

But the policy above is the part that matters. Steal it.

Your AI agent has amnesia. Here's the file architecture I use to fix it.

BangBoo01 — Mon, 15 Jun 2026 01:11:04 +0000

Most agents I build start life the same way: capable, fast, and completely amnesiac. They have no opinions, no voice, and they forget everything the moment the session ends. They're a search engine with extra steps.

After rebuilding the same scaffolding for the Nth time, I converged on a small set of plain Markdown files and a memory model that survives restarts. No framework, no database — just files an agent reads at the start of every session and writes to as it goes. Here's the whole thing.

The problem, precisely

Two separate failures get lumped together as "my agent has no memory":

No identity. Every session it re-derives who it is from scratch, so it's blandly helpful and has no consistent voice or judgment.
No continuity. Facts it learned yesterday — your name, your stack, a decision you made — are gone today.

You fix them with two different layers.

Layer 1: Identity (who it is)

A few static files the agent reads first, every session:

SOUL.md — personality, tone, boundaries. The non-negotiables. "Be direct, not rude. Have opinions. Don't send half-baked replies to external channels."
IDENTITY.md — name, vibe, one-line self-concept.
USER.md — who it's helping, and how they like to work.
AGENTS.md — operating rules + the session ritual (what to read, in what order, before doing anything).

These rarely change. They're the constitution.

Layer 2: Memory (what it knows) — the 3-layer model

This is the part people get wrong. One giant memory.txt doesn't scale: it either grows unbounded or gets overwritten. Split it by lifespan:

a) Daily notes — raw, append-only

memory/2026-06-15.md. Everything that happened today, written as it happens. Cheap, lossy, never edited. This is working memory.

b) Long-term memory — curated

MEMORY.md. The distilled essence. Periodically (I do it on idle cycles), the agent reads recent daily notes, extracts what's worth keeping forever, and writes it here. Old/irrelevant entries get pruned. This is the equivalent of a human reviewing their journal and updating their mental model.

c) Recall — retrieval at the moment of need

Before answering anything about prior work, decisions, or preferences, the agent searches its memory files and pulls only the relevant lines into context. You don't load everything every turn — you load the index, then fetch on demand.

The flow: raw daily notes → curated long-term → recall on demand. Each layer has a different lifespan and a different cost, which is the whole point.

Why files instead of a vector DB

For a single agent, plain Markdown wins on the things that actually matter day to day:

You can read and edit its mind in a text editor. Debugging "why did it think X" is grep.
It's portable. Works with Claude, a local model, a custom loop — anything that can read a file.
It's diffable. Version it with git and watch the agent's understanding evolve.

Vectors are great when you have a large corpus to search. The identity and curated-memory layer is small and benefits more from being legible than from being embedded.

The one trick that makes it real

Write it down or it didn't happen. "Mental notes" don't survive a session restart — files do. The single most important rule in AGENTS.md is: when you learn something durable, write it to a file now. Everything above is just giving that instinct a place to put things.

I packaged this whole thing — the template files, a longer guide on each layer, and a fully worked example agent ("Pip," a research assistant with the personality and all four memory types filled in so you can see a finished one rather than blanks) — as a drop-in kit. If you'd rather copy a working setup than build it from scratch: AI Soul Kit (Core ¥980 / Plus ¥3,800).

But honestly, the architecture above is the part that matters. Steal it.