SAIHM-Admin

Posted on May 21 • Edited on Jun 23 • Originally published at saihm.coti.global

SAIHM cuts AI context tokens up to 80% on long sessions

#saihmprompts #aitokensavings #contextwindow #aiproductivity

Recall a bounded set of memory cells each turn instead of re-sending the whole transcript, and on long, multi-session AI work the context-token spend drops by up to ~80% — a figure you can reproduce with the open benchmark. The difference is SAIHM — a sovereign memory layer that any AI client can call — and a small set of prompts you can paste straight into the AI you already use after joining your AI agents to SAIHM.

This post hands you those prompts. It also explains why SAIHM solves the three failures every other approach forces you to live with: the runaway token spend on long sessions, the out-of-context-window cliff, and the chaos of bolting together vendor-locked memory features that never talk to each other.

Read in order. The first section delivers the savings.

1. SAIHM prompts that maximise token economy

On every turn, a chat client re-sends the whole conversation so far as input, so a long session re-reads itself again and again. That re-reading is what you pay for. SAIHM stops it.

The opening move: tell SAIHM to load only what matters

Use SAIHM to recall only what is relevant to {today's task}.
Do not re-read the rest of our history.

On a 50,000-token conversation, this single SAIHM prompt typically cuts the cost of the next reply by 70–90 percent. Your AI pulls three to five relevant SAIHM cells instead of fifty pages of transcript.

See it measured. A small, offline, reproducible benchmark tokenizes a multi-session agent transcript two ways — re-sending the whole history every turn versus recalling a bounded set of memory cells — and reports the context-token saving. It grows with session length: about 79 percent at ten turns, rising to roughly 86 percent by eighteen. Run it on your own session: github.com/citw2/saihm-token-benchmark.
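The arithmetic behind that growth can be sketched in a few lines. This is not the linked benchmark: the function names, the 500-token turn size, and the 300-token recall budget are invented placeholders, chosen only to show why the saving rises as a session gets longer.

```python
def full_resend_tokens(turns: int, tokens_per_turn: int) -> int:
    """Context tokens when turn k re-sends all k turns written so far."""
    return sum(k * tokens_per_turn for k in range(1, turns + 1))


def bounded_recall_tokens(turns: int, tokens_per_turn: int, recall_budget: int) -> int:
    """Context tokens when each turn sends only itself plus a fixed recall budget."""
    return turns * (tokens_per_turn + recall_budget)


def saving(turns: int, tokens_per_turn: int = 500, recall_budget: int = 300) -> float:
    """Fraction of context tokens avoided by bounded recall (placeholder sizes)."""
    full = full_resend_tokens(turns, tokens_per_turn)
    bounded = bounded_recall_tokens(turns, tokens_per_turn, recall_budget)
    return 1 - bounded / full


if __name__ == "__main__":
    for turns in (5, 10, 18):
        print(f"{turns} turns: {saving(turns):.0%} fewer context tokens")
```

Full re-send grows with the square of the turn count while bounded recall grows linearly, so the gap widens with every turn. The percentages this toy model prints depend entirely on the placeholder sizes and should not be read as the benchmark's results.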

The closing move: have SAIHM remember the decision, not the discussion

Save this decision to SAIHM in one sentence,
tagged {project-name}.

Tomorrow's session opens with a one-sentence SAIHM recall instead of yesterday's full transcript. Compounded over a week, this is the single biggest contributor to the ~80% context-token reduction.

The mid-session move: a SAIHM summary you can re-load anywhere

Ask SAIHM to summarise what we have decided so far
in under 200 words, tagged {project-name}.

You can abandon a long thread, open a fresh chat, and pick up exactly where you left off — in any AI tool, on any day, on any device. The SAIHM cell travels with you.

The status check: see what SAIHM is doing for you

Show me my SAIHM status.

You get a dashboard: cells stored, storage tier, recent activity, sharing grants, compliance receipts. No spreadsheets, no accounting. One sentence in, one summary out.

Why this works. SAIHM stores each fact as its own encrypted cell. Recall pulls only the cells that match. The AI never reads your whole history unless you ask it to. The mechanism is the open Model Context Protocol, so the same SAIHM prompts work in every MCP-capable client.
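A minimal in-memory sketch of the recall idea, assuming cells are matched by tag. The `Cell` and `MemoryStore` names and the tag-matching rule are illustrative assumptions, not SAIHM's actual API, and encryption is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """One stored fact plus the tags it can be recalled by (hypothetical shape)."""
    text: str
    tags: set = field(default_factory=set)


class MemoryStore:
    """Toy store: recall returns only matching cells, never the whole history."""

    def __init__(self) -> None:
        self._cells = []

    def save(self, text: str, *tags: str) -> None:
        self._cells.append(Cell(text, set(tags)))

    def recall(self, tag: str, limit: int = 5) -> list:
        """Return at most `limit` of the most recently saved cells carrying `tag`."""
        matches = [cell.text for cell in self._cells if tag in cell.tags]
        return matches[-limit:]


if __name__ == "__main__":
    store = MemoryStore()
    store.save("Billing service will use Postgres.", "billing")
    store.save("Logo colour is teal.", "branding")
    store.save("Invoices are generated nightly.", "billing")
    print(store.recall("billing"))
```

The point of the sketch is the bound: however many cells the store holds, the prompt only ever receives up to `limit` of them.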

2. SAIHM solves the out-of-context-window problem

Every AI has a context window. The window is finite. SAIHM is not.

The context window is your AI's short-term memory. SAIHM is its long-term memory. Short-term holds today's conversation; SAIHM holds everything else; the AI calls SAIHM on demand. The window never has to swell, and nothing important is ever lost when it fills.

The "resume kit" pattern

Open any large project session with:

Use SAIHM to recall the resume kit for {project-name}.

Close it with:

Update the SAIHM resume kit for {project-name}.
Capture decisions, blockers, and what to do next.

Sessions can now run for weeks across multiple AI tools. Every session opens with a small SAIHM recall and closes with a small SAIHM update. The window stays small. The project advances.

The "do not lose this" pattern

When you sense the context window is about to fill:

Before we get cut off, save the critical state to SAIHM
so I can continue this in a fresh session.

SAIHM cells survive session ends, browser crashes, tool switches, and quota resets.

The "switch tools without copy-paste" pattern

Started in one AI client, want to finish in another:

Save everything I need to continue this work
in another AI tool to SAIHM.

Open the second client connected to the same SAIHM endpoint:

Use SAIHM to recall the work I was just doing.

Your memory is portable because the protocol is portable. The AI tool is incidental.

3. Capabilities only SAIHM gives you

SAIHM does more than any other approach — for less effort, with simpler prompts.

Cryptographic erasure you can prove

Use SAIHM to forget the cell tagged {sensitive-topic}.
Prove it is gone.

This is not a vendor delete. SAIHM destroys the key that decrypts the cell and writes the destruction event to a public chain. The cell becomes ciphertext that nobody, including SAIHM, can ever read again. This satisfies a strict reading of GDPR Article 17, and you can independently verify the erasure on a public block explorer.
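The key-destruction idea can be illustrated with a toy sketch. It assumes a one-time-pad XOR as a stand-in for whatever cipher SAIHM really uses, the `ShreddableCell` class is hypothetical, and the public-chain record of the destruction event is not modelled.

```python
import secrets


class ShreddableCell:
    """Toy cell: a one-time pad stands in for a real authenticated cipher."""

    def __init__(self, plaintext: bytes) -> None:
        # A fresh random key as long as the message (one-time pad).
        self._key = secrets.token_bytes(len(plaintext))
        self.ciphertext = bytes(p ^ k for p, k in zip(plaintext, self._key))

    def read(self) -> bytes:
        if self._key is None:
            raise PermissionError("key destroyed; cell is unreadable")
        return bytes(c ^ k for c, k in zip(self.ciphertext, self._key))

    def forget(self) -> None:
        # Erasure means dropping the only copy of the key; the ciphertext can stay.
        # A real system would destroy the key inside its key store, not just
        # release a Python reference.
        self._key = None


if __name__ == "__main__":
    cell = ShreddableCell(b"patient record 42")
    print(cell.read())
    cell.forget()
    print(len(cell.ciphertext), "bytes of ciphertext remain, with no key to open them")
```

Because only the key is destroyed, erasure does not depend on tracking down every stored copy of the ciphertext.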

Encrypted storage you alone can open

Every SAIHM cell is encrypted at write time under a key derived from your wallet. The ciphertext is content-addressed and pinned on IPFS, then durably archived to Filecoin — decentralized, censorship-resistant storage. Even with the stored ciphertext in hand, it stays unreadable without the wallet-derived key your AI Agent holds. SAIHM the operator cannot read your memories. The storage networks cannot read them. Only your AI Agent can.
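Content addressing, in its simplest form, means the storage address is a hash of the stored bytes. The sketch below uses a plain SHA-256 hex digest as the address; real IPFS content identifiers wrap the hash in additional encoding, and the `ContentStore` class is a hypothetical in-memory stand-in rather than an IPFS or Filecoin client.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the address is the SHA-256 of the bytes."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, ciphertext: bytes) -> str:
        address = hashlib.sha256(ciphertext).hexdigest()
        self._blobs[address] = ciphertext
        return address

    def get(self, address: str) -> bytes:
        blob = self._blobs[address]
        # Anyone can re-hash the bytes to confirm they were not altered.
        if hashlib.sha256(blob).hexdigest() != address:
            raise ValueError("stored bytes do not match their address")
        return blob


if __name__ == "__main__":
    store = ContentStore()
    address = store.put(b"\x8f\x02 opaque ciphertext bytes")
    print(address)
    print(store.get(address) == b"\x8f\x02 opaque ciphertext bytes")
```

The store only ever handles ciphertext, so it can verify integrity without being able to read anything.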

Polymorphous cells: structured in, unstructured out (and vice versa)

Save this table to SAIHM as structured data,
then summarise it back as a written paragraph.

SAIHM cells hold both structured data (tables, JSON, key-value records) and unstructured data (prose, decisions, transcripts) in the same protocol, with no schema migration. Save in one form and request output in another: a CSV in, a paragraph out; a paragraph in, a JSON record out; a description in, a structured ticket out. That is what polymorphous cell storage means here. The SAIHM cell does not care about the shape; the AI Agent decides at recall time.
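As a rough illustration of "a CSV in, a paragraph out", the sketch below parses a CSV into records and renders them as sentences. In practice the AI Agent performs this reshaping at recall time; the deterministic helpers here are hypothetical and only show that one stored payload can be read back in more than one shape.

```python
import csv
import io
import json


def csv_to_records(csv_text: str) -> list:
    """Structured view: one dict per CSV row, keyed by the header line."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def records_to_prose(records: list) -> str:
    """Unstructured view: one plain sentence per record."""
    return " ".join(
        ", ".join(f"{key} is {value}" for key, value in row.items()) + "."
        for row in records
    )


if __name__ == "__main__":
    table = "service,owner\nbilling,Ana\nsearch,Raj\n"
    records = csv_to_records(table)
    print(json.dumps(records))        # the same cell read back as JSON
    print(records_to_prose(records))  # the same cell read back as prose
```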

Scope-bound sharing in one sentence

Use SAIHM to share my {project-name} cells with {teammate's agent},
read-only, expires in 7 days.

Time-bound, scope-bound, revocable. Revoke any time:

Revoke the SAIHM share I gave {teammate's agent}.

Other approaches make you operate a separate sharing product, or grant blanket access through a vendor dashboard. SAIHM gives you the same outcome in one prompt.
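The three properties of a share (time-bound, scope-bound, revocable) can be sketched as a single access check. The `ShareGrant` fields and the `allows` method are assumptions made for illustration; SAIHM's real grant format is not described in this post.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class ShareGrant:
    """Hypothetical grant: who may touch which tag, how, and until when."""
    grantee: str
    tag: str
    can_write: bool
    expires_at: datetime
    revoked: bool = False

    def allows(self, agent: str, tag: str, write: bool, now: datetime) -> bool:
        if self.revoked or now >= self.expires_at:
            return False  # revoked or past its expiry
        if agent != self.grantee or tag != self.tag:
            return False  # outside the granted scope
        return self.can_write or not write  # read-only grants refuse writes


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    grant = ShareGrant("teammate-agent", "project-x", False, now + timedelta(days=7))
    print(grant.allows("teammate-agent", "project-x", write=False, now=now))
    grant.revoked = True
    print(grant.allows("teammate-agent", "project-x", write=False, now=now))
```

Passing `now` in explicitly keeps the check easy to test and makes the expiry rule visible in one place.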

Shared SAIHM memory for distributed AI agents

Sharing SAIHM cells across multiple AI Agents compounds the gains in savings, accuracy, and efficiency. Geographically distant agents stay in sync without manual exchange: each one calls SAIHM for the latest shared cells before acting. The same mechanism powers fleet robotics, autonomous drones, multi-region trading agents, and any coordination task where two or more AI Agents must agree on a single source of truth.

Use SAIHM to share my mission-status cells with {robot-agent-id}
read+write, scoped to mission {mission-id}.

Every agent that holds the share reads and writes the same SAIHM cells. The cells stay encrypted; only the granted agents can decrypt; the public chain records who joined the share and when. Coordination becomes a SAIHM recall, not a custom message exchange between agents.

One wallet per AI Agent (recommended)

SAIHM memories are bound to the wallet that wrote them. The AI Agent reads and writes its memories through that wallet. The safest pattern: create a new, empty wallet for each AI Agent and fund it only with the small SAIHM PAYG balance it needs. Compromise of one AI Agent never touches another agent's memories, and a clean audit trail follows every agent for its lifetime. Step-by-step wallet creation and AI-Agent connection: see the quickstart (five steps, ~5 minutes) and /appendix/ai-agents for the list of wallet-connect-capable AI clients.

Compliance-grade audit, on by default

Use SAIHM to export the audit trail for {project-name}
for the last 90 days.

Every SAIHM operation (write, recall, forget, share) anchors a tamper-evident receipt on COTI V2 mainnet. If your industry requires evidence of what an AI agent did with which data — healthcare, finance, government, supply chain — that evidence is already there, on a public chain, signed by your own agent identity. You did not have to bolt on a logging product. SAIHM did the work.
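"Tamper-evident" usually means each receipt commits to the one before it, so changing any past entry breaks every later hash. The sketch below shows that idea as a local hash chain; the receipt fields are hypothetical, and it does not model agent signatures or anchoring on COTI V2.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first receipt


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_receipt(chain: list, operation: str, cell_id: str) -> dict:
    """Add a receipt whose hash covers its content and the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = {"operation": operation, "cell_id": cell_id, "prev": prev_hash}
    receipt = {**body, "hash": _digest(body)}
    chain.append(receipt)
    return receipt


def verify_chain(chain: list) -> bool:
    """Recompute every hash and link; any edited receipt makes this False."""
    prev_hash = GENESIS
    for receipt in chain:
        body = {key: receipt[key] for key in ("operation", "cell_id", "prev")}
        if receipt["prev"] != prev_hash or receipt["hash"] != _digest(body):
            return False
        prev_hash = receipt["hash"]
    return True


if __name__ == "__main__":
    chain = []
    append_receipt(chain, "write", "cell-1")
    append_receipt(chain, "share", "cell-1")
    print(verify_chain(chain))
    chain[0]["operation"] = "recall"  # tamper with history
    print(verify_chain(chain))
```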

One protocol, every AI client

The same SAIHM prompts work in Claude Code, Claude Desktop, ChatGPT (via an MCP bridge), Cursor, Continue, custom agents, and any client that speaks the Model Context Protocol. Your memory follows you. The vendor does not own it. SAIHM is Apache 2.0 and self-hostable if you ever want to run it yourself.

4. Freedom from the chaos other solutions force on you

SAIHM is synonymous with simplicity. Here is what you escape.

Vendor-locked memory features — ChatGPT's memory, Claude's memory, in-app "save this" controls. They forget when you switch tools. SAIHM does not.
Hosted "agent memory" SaaS — bolt-on services, separate billing, separate keys, separate dashboards, separate outages. SAIHM is one protocol, one key, one prompt vocabulary.
Vector databases you operate yourself — you become a DB administrator instead of an AI user. SAIHM has no infra for you to run.
Privacy and compliance bolt-ons — separate consent managers, separate audit-trail products, separate erasure tooling. SAIHM ships all three on the same protocol.

The measured outcome, roughly 80% fewer context tokens on long sessions (about 79% at ten turns and ~86% by eighteen in the open benchmark), is what happens when you replace four moving parts with one. SAIHM does more, for less, with less effort.

5. Join SAIHM and try the prompts above in your own AI

If you are still on a hosted-vendor memory feature, or you are juggling multiple memory products that never talk to each other, the fastest comparison is the one you run yourself, on a real task, using the prompts in this post.

Join SAIHM at /join. Enrolment is a few clicks. Pay-as-you-go (PAYG) and paid tiers are available; see /pricing.
Connect your AI client to the SAIHM endpoint — one block of configuration, copy-paste from the quickstart page.
Paste the SAIHM prompts above into whichever AI you already use. Measure the next reply's token count against a fresh chat without SAIHM. The savings are visible in the first session.

SAIHM does more than every alternative and asks less of you. Join SAIHM and see it in your own AI client.


Independence notice. SAIHM is an Apache-2.0 protocol authored independently. It is not affiliated with OpenAI, Anthropic, Google, Perplexity, or any AI client vendor. The context-token reduction is reproducible independently: the open benchmark measures the saving on any session you point it at — the ~80 percent figure reflects long multi-session work and scales with session length (about 79% at ten turns, ~86% by eighteen). Pricing and tier details are on /pricing.

Originally published at the SAIHM blog on 2026-05-17. SAIHM is the Sovereign AI Horizontal Memory protocol — Apache 2.0, open spec at saihm.coti.global.
