Money-Saving Tips Every OpenClaw User Should Know

#ai #llm #openclaw #productivity

Another AI Horror Story

Like many AI horror stories — except I didn't wake up to a $1,000 bill.

Last week, I suddenly received an email from OpenAI saying my balance had run out and I needed to top it up. I was startled, because my OpenClaw connects via OAuth — it shouldn't be touching my Credits. But as OpenClaw's token consumption suddenly spiked, it started drawing from my Credit balance. Whether this was caused by OpenClaw's fallback strategy or OpenAI's own fallback mechanism, I'm not entirely sure. Either way, it happened.

My first instinct was to panic — I deleted and reinstalled the entire OpenClaw app, then started a brand new conversation in Discord. That worked, but it was a bit of an overreaction. Today I'll walk you through a technique that every OpenClaw user should know, yet most people never use.

Root Cause

Here's what's actually happening. The model itself has no memory. Every time you talk to it, it has no idea who you are, who it is, or what you've discussed before. You need to supply context — telling it who it is, who you are, what you've talked about, and what you need it to do right now. Everything beyond your current message can be called "context." This context is typically maintained by a Harness. Common harnesses include OpenClaw, Hermes Agent, Claude Code, and others.

"Harness" is a fairly technical term, so most people just call them Agents.

Multi-layer Context

Context itself is layered:

App layer: This layer records who you are and what the app is. Ask "Who are you?" in ChatGPT and it'll say it's ChatGPT. Ask the same in OpenClaw and it'll say it's OpenClaw — even if OpenClaw is running on an OpenAI model under the hood.
Session layer: This is the context for your current chat. That's why something you said in one chat isn't necessarily recognized in another. The upside is isolation — you can have one chat where it plays a lawyer and another where it acts as a therapist.

At the app layer, OpenClaw injects a large amount of context on your behalf, including but not limited to:

AGENTS.md
SOUL.md
USER.md
MEMORY.md

Context Compression

Here's the problem. As the app is used over time, and as individual chats run longer, your context keeps growing — until costs start to spike. Sometimes the agent triggers Context Compression to relieve the pressure. But you'll quickly notice the agent seems to have gotten dumber. It hasn't completely forgotten your instructions; it just remembers some things and misses others. That's because compression loses detail from your earlier instructions. And even after compression, the context can still be enormous.

So if compression isn't enough, what can we do?

Solutions

Build Good Habits

First, ask yourself whether you're relying on a single chat to handle repetitive tasks. For example, you might have opened one chat and kept asking it to translate documents or visit certain websites on a schedule. Over time, you dread closing that chat because rebuilding it feels like too much work.

If that sounds familiar, it's time to build a better habit:

Whenever something repeats more than three times, distill it into a Skill.

If that feels too involved, you can switch to Hermes Agent — now you understand why it's become so popular. That said, I still prefer manually crafting my own Skills in OpenClaw.

Also make a habit of locking long-term information into preloaded context files. For example:

If you've designed a persona you love, make sure those details are in SOUL.md.
If you want certain personal information — like your address or phone number — to be remembered long-term without having to repeat it, make sure it's recorded in USER.md.

Reset Context

Once you've built those habits, you can confidently reset your context whenever costs start to spike again. The reset has two steps:

App layer: Clear MEMORY.md and the memory folder. Since your habits have already been distilled into Skills, this memory is usually not critical and can be safely deleted.
Session layer: When you finish a major task, or when a chat has been running for over a week, use /new to start a fresh chat.

That clears the two largest components of your context.

Benefits

Doing this has two direct benefits:

Dramatically reduces token consumption, lowering your usage costs.
Reduces OpenClaw's error rate. The cleaner the context, the easier it is for the model to focus on the current task — fewer omissions, fewer misunderstandings. Fewer do-overs means higher effective productivity.