DEV Community

Cover image for I Built an API That Gives AI Agents Persistent Memory - Here's How
Tim Bassler
Tim Bassler

Posted on

I Built an API That Gives AI Agents Persistent Memory - Here's How

Every AI agent I've worked with has the same problem: it forgets everything.

You give it a 20-minute research task. It gathers sources, analyzes them, starts writing a report - and then it crashes. Or times out. Or you accidentally close the tab. When you restart it, it has no idea what it already did. It starts from scratch.

I kept thinking: video games solved this decades ago. Save states. Why don't agents have save states?

So I built one.

The problem

Most agents are still stateless in practice.

They may have short-term context during a run, but if the process dies or the session resets, the workflow is gone unless you built your own persistence layer around it.

That creates a few ugly failure modes:

  • A research agent crashes after finding 18 good sources and has to start over
  • A coding agent loses its place halfway through a task
  • A multi-agent pipeline cannot hand off cleanly between workers
  • A human approval step breaks the workflow because there is no durable checkpoint to resume from
  • Debugging is painful because there is no step-by-step execution history to replay

You can store blobs in Redis or Postgres yourself.

But once you want resumable workflow history, replay, step labels, agent attribution, diffs, analytics, SDKs, and a clean API, the “simple persistence layer” starts turning into its own product.

I wanted something purpose-built for this.

What SnapState does

SnapState is an API that lets agents save checkpoints at each step of a multi-step workflow and resume exactly where they left off.

The core loop is simple:

  1. Agent starts a task
  2. After each meaningful step, it saves a checkpoint (state, step number, label)
  3. If interrupted, it calls resume and gets back the latest state
  4. It skips completed steps and continues from where it left off

Here's what a checkpoint save looks like:

import { SnapStateClient } from 'snapstate-sdk';

const client = new SnapStateClient({
  apiKey: 'snp_your_key_here',
});

// After gathering sources
await client.save({
  workflowId: 'wf_research_ai_trends',
  step: 1,
  label: 'sources_gathered',
  state: {
    sources: ['arxiv.org/abs/2401.1234', 'github.com/trending'],
    query: 'AI agent frameworks 2026',
    step: 1
  }
});
Enter fullscreen mode Exit fullscreen mode

And resuming after a crash:

// At the start of every run
let state = { step: 0 };
try {
  const resumed = await client.resume('wf_research_ai_trends');
  state = resumed.latestCheckpoint.state;
  console.log(`Resuming from step ${state.step}`);
} catch (e) {
  console.log('Starting fresh');
}

// Skip completed steps
if (state.step < 1) {
  // gather sources...
}
if (state.step < 2) {
  // analyze sources...
}
Enter fullscreen mode Exit fullscreen mode

The simplest integration - a system prompt

You don't even need to write code to integrate SnapState with your agent. If you're using Claude Desktop, Cline, or any LLM with tool access, just add this to your system prompt:

You have access to SnapState for persistent workflow state.
At the start of every task, call resume_workflow to check for prior state.
After each major step, call save_checkpoint with your current progress.
Save everything needed to continue without repeating work.
Enter fullscreen mode Exit fullscreen mode

The agent figures out the rest. It saves after each step, checks for existing state before starting, and resumes automatically.

What's under the hood

The full stack includes:

  • REST API - save, resume, replay checkpoints with diff tracking and ETags
  • JavaScript SDK - npm install snapstate-sdk
  • Python SDK - pip install snapstate-sdk
  • MCP Server - works natively with Claude Desktop and Cline
  • Agent identity - tag checkpoints with agent IDs for multi-agent workflows
  • Analytics - workflow stats, failure patterns, per-agent metrics
  • Self-service auth - signup, email verification, API key management

Built with Node.js/Fastify, Redis, PostgreSQL, and Cloudflare R2 for cold storage.

Try it

The free tier gives you 10,000 checkpoint writes per month, 1GB storage, and 5,000 resume calls. No credit card required.

Quick start:

npm install snapstate-sdk
Enter fullscreen mode Exit fullscreen mode
import { SnapStateClient } from 'snapstate-sdk';

const client = new SnapStateClient({
  apiKey: 'snp_your_key_here', // get one at snapstate.dev/get-started
});

await client.save({
  workflowId: 'wf_hello',
  step: 1,
  label: 'first_checkpoint',
  state: { message: 'it works' }
});

const resumed = await client.resume('wf_hello');
console.log(resumed.latestCheckpoint.state);
// { message: 'it works' }
Enter fullscreen mode Exit fullscreen mode

Or in Python:

from snapstate_sdk import SnapStateClient

client = SnapStateClient(api_key="snp_your_key_here")

client.save(
    workflow_id="wf_hello",
    step=1,
    label="first_checkpoint",
    state={"message": "it works"}
)

resumed = client.resume("wf_hello")
print(resumed.latest_checkpoint.state)
# {'message': 'it works'}
Enter fullscreen mode Exit fullscreen mode

I'd love to hear feedback on the API design. What features would make this useful for your agent workflows?

Top comments (0)