DEV Community: Sam

The forbidden fruit of vibe coding isn’t bad code.

Sam — Fri, 26 Jun 2026 14:20:03 +0000

It’s working code.

Because once something works, your brain **wants** to _trust_ it.

The button clicks.
The page loads.
The dashboard renders.
The demo looks real.

And suddenly, it’s tempting to believe the project is further along than it actually is.

But working is not the same as ready.
Working is not the same as secure.
Working is not the same as understood.
And working is definitely not the same as remembered.

That was the surprising part for me.

AI-assisted coding helped me move fast, but it also created a new problem: forgotten assumptions.

The shortcuts.
The warnings.
The “we’ll fix that later” moments.
The decisions that made sense in one session but got fuzzy in the next.

The internet is noticing this too

“A lot of security is contextual.”

Read the full piece in The Verge

— Jack Cable, security researcher

“Speed without control is a liability, not an advantage.”

Read GitLab’s AI Accountability Report announcement

— Manav Khurana, GitLab

That was the surprising part for me.

AI-assisted coding made it easier to build fast.

But it also made it easier to forget why certain decisions were made in the first place.

And when a quick prototype starts becoming real software, that forgotten context starts to matter.

I wrote more about that here:

Empirical - Your vibe-coded app got serious faster than expected

Curious how others are handling this: where do you keep the “don’t forget this before this ships” stuff when working with AI coding tools?

Your AI Coding Agent Needs Scar Tissue

Sam — Mon, 15 Jun 2026 11:38:48 +0000

The most expensive AI mistake is not when your coding agent gets something wrong.

It is when it gets the same thing wrong again tomorrow. That is the part that starts to wear you down. Not because the model failed once.

That happens.

The frustrating part is when you already corrected it.

You explained the repo pattern.
You told it why that migration broke.
You pointed out the weird CI issue.
You showed it the dependency that already failed.

The agent fixed the task.
The session ended.

Then two days later, a new session suggests the same bad idea like none of it ever happened.

That is the problem I have been thinking about lately. AI coding agents do not just need bigger context windows.

They need scar tissue.

What I mean by scar tissue

Scar tissue is remembered failure.
It is not generic documentation.
It is not a massive chat transcript.

It is not another bloated AGENTS.md file that gets stuffed into every prompt whether it is relevant or not.

Scar tissue is the durable memory of what went wrong, why it went wrong, and what should not be repeated.

Examples:

Do not use this migration pattern in this repo.
It passes locally but breaks staging because of X.

Do not replace this middleware.
It looks redundant, but it protects the admin route.

Do not use this package again.
We tried it and it failed on Vercel because of native dependencies.

The Stripe webhook handler must preserve the raw body.
Normal JSON parsing breaks signature verification.

This test failure usually means the mock user is missing a role.
Do not rewrite the auth flow first.

That kind of knowledge is incredibly valuable.

But most of the time, it disappears.

It lives in someone’s head.
Or buried in Slack.
Or trapped in yesterday’s AI session.
Or hidden somewhere in a pull request comment nobody will ever read again.

Context is not the same thing as learning

A lot of AI coding workflows still treat context like the solution to everything.

Add more files.
Add more instructions.
Add more docs.
Add more examples.
Add more project history.

Eventually the prompt becomes a junk drawer. The agent has more text, but not necessarily more judgment. That is the distinction I care about. Context tells the agent what is nearby. Scar tissue tells the agent what it learned the hard way.

Those are not the same thing.

The old pattern

This is what a lot of AI coding sessions look like:

Session 1:
Agent suggests bad approach.
Developer corrects it.
Agent fixes the issue.
Session ends.

Session 2:
Agent has no memory of the correction.
Agent suggests the same bad approach.
Developer loses trust.

The model did not technically “forget.”

It never had durable memory in the first place. It only had temporary working space. Once the session ended, the lesson vanished.

The better pattern

This is the pattern I want instead:

Session 1:
Agent suggests bad approach.
Developer corrects it.
The lesson gets stored as a durable project memory.

Session 2:
Agent starts a similar task.
The relevant scar gets retrieved.
Agent avoids the old mistake.
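
A minimal sketch of that loop in Python, under some loud assumptions: `Scar`, `ScarStore`, and the keyword-overlap scoring are hypothetical names and choices made for illustration, not Empirical's API. A real system would more likely use semantic search; word overlap just keeps the example self-contained.

```python
import re
from dataclasses import dataclass, field


def _tokens(text: str) -> set[str]:
    """Lowercase word set used for naive keyword matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass
class Scar:
    lesson: str  # what should not be repeated
    reason: str  # why it went wrong


@dataclass
class ScarStore:
    """Hypothetical in-memory store; a real one would persist to disk."""
    scars: list[Scar] = field(default_factory=list)

    def record(self, lesson: str, reason: str) -> None:
        """Session 1: keep the correction instead of losing it."""
        self.scars.append(Scar(lesson, reason))

    def retrieve(self, task: str, limit: int = 2, min_overlap: int = 2) -> list[Scar]:
        """Session 2: return scars sharing at least min_overlap words with the task."""
        task_words = _tokens(task)
        scored = []
        for scar in self.scars:
            overlap = len(task_words & _tokens(scar.lesson + " " + scar.reason))
            if overlap >= min_overlap:
                scored.append((overlap, scar))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [scar for _, scar in scored[:limit]]


store = ScarStore()
store.record(
    "The Stripe webhook handler must preserve the raw body.",
    "Normal JSON parsing breaks signature verification.",
)
store.record(
    "Do not replace this middleware.",
    "It looks redundant, but it protects the admin route.",
)

# Only the webhook scar is surfaced for a webhook task.
relevant = store.retrieve("Refactor the Stripe webhook handler")
```

The `min_overlap` threshold is there so that a shared filler word like "the" does not drag an unrelated scar into the prompt.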

That is a different kind of AI coding workflow.

Not just faster.
Not just cheaper.
Not just fewer tokens.

More experienced.

Why this matters more as agents get better

The better coding agents get, the more this matters. When agents only wrote tiny snippets, forgetting was annoying. Now they can touch real architecture.

They can refactor files.
They can generate migrations.
They can write tests.
They can modify production-adjacent code.

That makes repeated mistakes more expensive.

If an AI agent is going to operate inside a real codebase, it needs more than instructions.

It needs a memory of consequences. It needs to remember the things that hurt.

Where Empirical fits

This is one of the use cases I am exploring with Empirical.

Empirical is a memory layer for AI tools.

Instead of stuffing every lesson, decision, preference, and warning into a giant prompt, Empirical lets an agent retrieve the specific memory it needs when it needs it.

For coding agents, that means the memory layer can hold things like:

Project decisions
Repo conventions
Failed approaches
Bug history
CI/CD quirks
Security gotchas
Dependency warnings
“Never do that again” lessons

That is the stuff that usually gets lost between sessions.

And it is also the stuff that makes a developer more useful over time.

Why should an AI coding agent be any different?

The future is not just smarter agents

I do not think the next leap in coding agents is only going to come from smarter models.

Some of it will come from better memory.

Not memory as a transcript dump.
Not memory as “load the whole repo into context.”

Memory as accumulated judgment.
Memory as operational history.
Memory as scar tissue.

Because the real win is not just an agent that can write code. The real win is an agent that remembers why the last fix failed.

I wrote more about the idea here:

https://empirical.gauzza.com/blog/ai-coding-agent-scar-tissue-your-ai-coding-agent-needs-scar-tissue/

Maybe Bigger Context Windows Aren't the Answer

Sam — Fri, 05 Jun 2026 13:59:12 +0000

When humans need information, we don't load everything we know into our heads at once.

We ask questions.

We look things up.

We pull in details when they become relevant.

AI systems should probably work the same way.

Recently I updated Empirical's CLI documentation system.

Before, I could have dumped the entire command reference into every agent session and called it a day.

Instead, the installer adds a tiny instruction:

empirical doc
empirical doc <topic>
empirical doctor

That's it.

The agent doesn't get the entire manual.

It gets a pointer to the manual.

When it needs help with memory commands, it runs:

empirical doc memory

When it needs installation help:

empirical doc install

When it needs to discover what's available:

empirical doc

The detailed documentation is loaded only when needed.

Less context. Better timing.
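
The shape of that pattern can be sketched in a few lines of Python. Everything here is hypothetical (the `DOCS` table, the `POINTER` text, and the `lookup` function); it is not how the `empirical` CLI is implemented, only an illustration of giving the agent a pointer and loading a topic when it asks.

```python
from typing import Optional

# Hypothetical topic table standing in for a real manual.
DOCS = {
    "memory": "How to save, search, and delete memories. " * 20,
    "install": "How to install and configure the tool. " * 20,
}

# The only documentation text that is always present in the agent's context.
POINTER = "Run `doc` to list topics, or `doc <topic>` for details."


def lookup(topic: Optional[str] = None) -> str:
    """Return the topic list, or the full text for a single topic."""
    if topic is None:
        return "Topics: " + ", ".join(sorted(DOCS))
    if topic not in DOCS:
        return f"Unknown topic {topic!r}. " + lookup()
    return DOCS[topic]


# Compare what is always loaded against the size of the whole manual.
always_loaded = len(POINTER)
full_manual = sum(len(text) for text in DOCS.values())
```

The agent pays for `POINTER` on every request and for one topic only when it calls `lookup("memory")` or similar.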

What's interesting is that this is becoming a pattern throughout Empirical.

The CLI uses on-demand documentation.

Memory retrieval works the same way.

Conversation context works the same way.

Instead of shoving everything into the prompt and hoping the model finds what matters, Empirical tries to surface only the information relevant to the current task.

I've started thinking of this as progressive disclosure for AI.

Not bigger context.

Better context.

The future may not belong to systems that remember everything. It may belong to systems that know what not to load until it's actually needed.

This idea has become one of the guiding principles behind Empirical.

We're exploring what happens when AI systems retrieve information as needed instead of carrying everything around all the time.

If that sounds interesting, I'd love for you to take a look at Empirical and share your feedback:

https://empirical.gauzza.com

Seen this ChatGPT warning before? Here’s a fix.

Sam — Thu, 04 Jun 2026 13:27:08 +0000

I kept seeing this little warning under ChatGPT’s message box:

ChatGPT gets less accurate and may forget details in long conversations.

For a while, I ignored it.

Then I realized I was keeping months of notes in one long ChatGPT thread.

Dates. Numbers. Observations. Things I actually cared about.

And that warning started to feel a lot less theoretical.

The problem

Long chats feel like memory.

You can scroll back. You can ask for summaries. The thread is still there.

But a chat is not a database.

Eventually, I started noticing small details drift. A date would be off. A number would show up that I did not remember entering. The answer sounded confident, but parts of it were not from my actual notes.

That is the scary part.

Not that ChatGPT forgot.

That it filled in the blanks.

What I changed

I stopped keeping the important stuff only inside the chat.

I connected Empirical and asked ChatGPT to save the tracking data as separate memories.

That changed the setup.

The record no longer lived inside one conversation.

The conversation became one way to reach the record.

So if I start a new ChatGPT thread later, it can pull the saved context back from Empirical instead of relying on one long chat to remember everything.
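
To make the "chat is not a database" point concrete, here is a small sketch using Python's built-in `sqlite3`. It is not how Empirical stores memories; the table, column names, and sample values are made up for illustration. The point is only that a stored record returns exactly what was entered.

```python
import sqlite3

# In-memory database for the example; pass a file path to persist across runs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (day TEXT, metric TEXT, value REAL, note TEXT)")


def save(day: str, metric: str, value: float, note: str = "") -> None:
    """Store one observation as its own row, outside any chat thread."""
    conn.execute(
        "INSERT INTO observations VALUES (?, ?, ?, ?)",
        (day, metric, value, note),
    )


def history(metric: str) -> list[tuple[str, float]]:
    """Return every stored value for a metric, oldest first."""
    rows = conn.execute(
        "SELECT day, value FROM observations WHERE metric = ? ORDER BY day",
        (metric,),
    )
    return rows.fetchall()


# Placeholder values, not real tracking data.
save("2026-01-12", "example_metric", 81.9, "second entry")
save("2026-01-05", "example_metric", 82.4)
```

Calling `history("example_metric")` returns the two rows in date order with the exact numbers that were saved, no matter how long ago they were entered.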

👉 Add Empirical to ChatGPT

Why it helps

I asked ChatGPT directly whether saving the data into Empirical actually helps with the long-chat forgetting problem.

Here’s the part that mattered:

Even if a future conversation does not contain the whole old chat, the saved context can still be retrieved from Empirical.

That is the fix.

The chat can forget.

The record does not have to disappear with it.

The takeaway

If you use ChatGPT for something important, do not let the only copy live inside one thread.

Use the chat for conversation.

Use memory for the stuff you actually want to keep.

For me, that means Empirical now holds the record, and ChatGPT, Claude, or Codex can pull from it when needed.

If you use coding agents too:

👉 Install Empirical for coding agents

Original post:
Seen this ChatGPT warning before? Here’s a fix.

Last night at 11:30 pm I screwed up. It led to an unexpected win.

Sam — Fri, 29 May 2026 13:42:53 +0000

Last night at 11:30 pm I screwed up. It led to an unexpected win.

I'd been heads-down in Empirical on something else for a while. Hadn't touched the public site in days. When I finally went to ship, I pushed and watched two weeks of UX work vanish. The site reverted to its pre-redesign state right in front of me.

The new pages were gone. It wasn't caching. Not in main. Not in any branch I could reach. Wherever the work was, it wasn't anywhere I could realistically untangle at midnight.

So I asked Empirical what it remembered.

It pointed me at an unreachable WIP commit floating in the void after the cleanup. One git cherry-pick later, the redesign was back. Four minutes of recovery. A lot longer spent panicking before I thought to ask.
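
For anyone in the same spot without a memory layer, plain Git can list commits that no branch points to. The sketch below is not how Empirical located the commit (the post does not say); it shells out to `git fsck --unreachable --no-reflogs` and parses the output. The function names are illustrative.

```python
import subprocess


def parse_unreachable_commits(fsck_output: str) -> list[str]:
    """Extract commit hashes from `git fsck --unreachable` output."""
    commits = []
    for line in fsck_output.splitlines():
        parts = line.split()
        # Lines of interest look like: "unreachable commit <sha>"
        if len(parts) == 3 and parts[0] == "unreachable" and parts[1] == "commit":
            commits.append(parts[2])
    return commits


def find_unreachable_commits(repo_path: str = ".") -> list[str]:
    """Run git in repo_path and return the hashes of unreachable commits."""
    result = subprocess.run(
        ["git", "fsck", "--unreachable", "--no-reflogs"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_unreachable_commits(result.stdout)
```

Each hash can then be inspected with `git show <sha>` and restored with `git cherry-pick <sha>`, as in the recovery above. Git eventually garbage-collects unreachable objects, so this only works while the commit still exists.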

Unexpected win for Empirical. The use case I never would have pitched, never would have asked for, and absolutely needed at 11:34 on a Tuesday. The hero I didn't want, but the one I needed.

Full write-up👇

Empirical saved my ass. | Empirical Blog

First-person incident report on recovering lost frontend work by querying Empirical memory, locating an unreachable WIP commit, and restoring the missing public-site redesign.

empirical.gauzza.com

What's the dumbest thing you've ever done to your own repo at midnight?

I Cut Coding Agent Context Usage by 22–45% by Killing Context Bloat

Sam — Tue, 12 May 2026 19:02:23 +0000

A lot of AI coding workflows degrade the exact same way.

At first, everything feels incredible.

Your coding agent:

understands the project
moves insanely fast
eliminates boilerplate
compounds your momentum

Then a few weeks later:

AGENTS.md turns into a novel.

Prompts get bloated.

The model starts missing obvious things.

Responses become inconsistent.

Token usage quietly becomes absurd.

I kept running into this while building Empirical.

Eventually I realized the problem wasn’t:

“The model needs more context.”

The problem was:

“The model is carrying too much irrelevant context at once.”

That distinction changed everything.

The Hidden Failure Mode of Coding Agents

Most teams solve AI memory like this:

“Just add it to the prompt.”

And over time the context fills up with:

Permanent Context Soup

architecture decisions
coding standards
deployment notes
UI preferences
old implementation details
temporary fixes
abandoned experiments
half-finished thoughts

Eventually every request drags all of it around forever.

Even when most of it has absolutely nothing to do with the current task.

That creates a brutal signal-to-noise problem.

The model starts treating temporary junk and critical architecture decisions with equal importance.

You can actually feel the degradation happen.

Symptoms:

the agent gets fuzzier
architecture drift increases
outputs become inconsistent
you spend more time correcting than building

Bigger Context Windows Aren’t the Real Solution

I think the industry is optimizing the wrong thing right now.

Everyone keeps pushing toward:

Bigger Everything

million-token windows
infinite memory
larger context sizes
stuffing more into prompts

But humans don’t work like that either.

Good engineering teams don’t bring every document into every meeting.

Most information is situational.

Most memory should stay dormant until it becomes relevant.

That was the shift for me.

Not:

“How do I fit more into context?”

But:

“How do I load only what matters right now?”

What Worked Better

I started treating AI memory more like layered working memory instead of permanent prompt stuffing.

1. Lean Persistent Context

Keep permanent instructions extremely small.

Only things like:

architecture principles
coding philosophy
project identity
non-negotiables

That layer should stay lean on purpose.

2. Retrieved Context

Pull implementation knowledge dynamically based on:

Relevance Signals

semantic similarity
current task
related code paths
previous work in the same area

Only relevant context enters the active prompt.

3. Session Context

Use temporary working memory for:

Active Work

bugs
in-progress features
short-lived implementation decisions

Then let it expire naturally instead of polluting long-term memory forever.
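
A rough sketch of the three layers in Python. All names (`PERSISTENT`, `KNOWLEDGE`, `SessionNote`, `build_context`) and the sample entries are assumptions for illustration. The word-overlap check is a stand-in for the relevance signals listed above, such as semantic similarity, which this example does not implement.

```python
import time
from dataclasses import dataclass

# Layer 1: lean persistent context, always included.
PERSISTENT = ["Non-negotiable: never commit secrets."]

# Layer 2: long-term knowledge, included only when relevant to the task.
KNOWLEDGE = [
    "Migrations: run them through the staging pipeline first.",
    "UI: the dashboard uses the shared table component.",
]


@dataclass
class SessionNote:
    """Layer 3: short-lived working memory with an expiry timestamp."""
    text: str
    expires_at: float


def _words(text: str) -> set[str]:
    """Lowercased words with surrounding punctuation stripped."""
    return {w.strip(".,:").lower() for w in text.split()}


def build_context(task: str, notes: list[SessionNote], now: float) -> list[str]:
    """Assemble the active prompt from the three layers."""
    task_words = _words(task)
    # Require two shared words so a lone filler word does not count as relevance.
    retrieved = [k for k in KNOWLEDGE if len(task_words & _words(k)) >= 2]
    # Expired session notes simply drop out instead of accumulating.
    live = [n.text for n in notes if n.expires_at > now]
    return PERSISTENT + retrieved + live


now = time.time()
notes = [
    SessionNote("In progress: fixing dashboard pagination bug.", now + 3600),
    SessionNote("Old note from last week.", now - 1),
]
context = build_context("Fix the dashboard table sorting", notes, now)
```

For this task the migration note and the expired session note never reach the prompt; only the persistent rule, the dashboard note, and the live session note do.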

What Changed

The biggest surprise wasn’t even the token savings.

It was how much sharper the agents became once the noise disappeared.

After reducing context bloat:

responses became more focused
architecture stayed more consistent
prompt babysitting dropped significantly
outputs drifted less between sessions

The token reduction was just the measurable side effect.

Results

| Workflow | Context Reduction |
| --- | --- |
| Smaller focused tasks | ~22% |
| Larger iterative workflows | Up to ~45% |

That compounds fast once agents start looping.
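
As a back-of-the-envelope illustration of how the savings add up, assume the context is resent on every agent iteration. The workload numbers below (10,000 context tokens per call, 50 iterations) are hypothetical and not measurements from the post; only the 22% and 45% figures come from the results above.

```python
def tokens_saved(context_tokens: int, reduction: float, iterations: int) -> int:
    """Total context tokens avoided when each iteration resends the context."""
    return round(context_tokens * reduction * iterations)


# Hypothetical workload: 10,000 context tokens per call, 50 agent iterations.
small = tokens_saved(10_000, 0.22, 50)  # 110,000 tokens avoided
large = tokens_saved(10_000, 0.45, 50)  # 225,000 tokens avoided
```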

The Bigger Realization

I think a lot of AI tooling is accidentally recreating bad human organizational habits.

We already know what happens when people dump everything into:

Organizational Chaos

giant docs
giant meetings
giant Slack threads
giant Notion pages

Clarity collapses.

Coding agents seem to behave better when memory works more like human working memory:

Better Memory Pattern

small active focus
relevant recall
long-term memory separated from immediate attention

That mattered far more than raw context size.

Full Breakdown

I wrote the complete breakdown here:

retrieval architecture
layered memory strategy
implementation lessons
where the 22–45% savings actually came from

→ Reducing Coding Agent Context Usage by 22–45% with Retrieval-Based Memory Systems

I’ve been using Empirical as my memory layer across AI tools.

Sam — Fri, 08 May 2026 16:36:24 +0000

ChatGPT memory helps.
Local MD files help.

But neither travels cleanly across everything I use, and packing too much into MD files eats context and tokens.

With Empirical, I keep my AGENTS.md lean and let Codex pull context dynamically when it actually needs it.

I can open ChatGPT on my phone, connected to Empirical, and it pulls the same memory context and writing tone I use in Codex or any other connected AI tool.
That means:

less repeated setup
cleaner, cheaper prompts
more consistent output across sessions

This is just the tip of the iceberg.

I wrote up a Codex example here:

How I Used Codex + Empirical to Lock In My Writing Voice | Empirical Blog

April 30 note on using Empirical with Codex to define a repeatable writing voice through guided questions and live revision.

empirical.gauzza.com

I Needed Memory That Survives Context Windows. Memory That Moves Across Environments

Sam — Thu, 09 Apr 2026 13:05:00 +0000

I kept running into the same thing with AI tools:

Great context disappears
I repeat myself constantly
Every tool remembers different stuff (or nothing)
When I move between tools, my context doesn't follow me

So I built Empirical.

It started in a pretty common place: I was iterating on a Philly-style hoagie roll recipe.

I wanted the AI to remember what I liked, what failed, and what I wanted to try next without re-explaining it every time.

I originally thought Empirical would be its own chatbot. I started down that path, then realized I was solving the wrong problem. Reinventing the wheel.

I didn’t need another chat interface.
I needed a memory layer I could use everywhere.

So I changed lanes and focused on MCP tools.

Now I use Empirical memory across:

Coding CLIs
ChatGPT
Claude Web
Claw Agents

Same memory, different interfaces. Now if ChatGPT is no longer _cool_ or Claude leaks its entire codebase, I can switch to the latest hot thing and all my context and memories move with me.

Real examples that made this click for me

I can take a pic of a bourbon, say “I like this,” and that preference is saved as persistent memory.

I can send health data and query/chat over it later to help spot patterns.

I can write a PRD while going on a walk with ChatGPT, then pull it back up in a CLI session at my desk.

What’s next

I’m now working on connecting Empirical to more sources so memory reflects more of my actual life/workflow.

Current focus:

better pattern recognition over time
stronger multimodal memory (text + image + structured data)
cleaner memory workflows for agents

If this clicks with you, I'd love for you to check it out and give it a try:

Empirical