Tony Spiro

Posted on Jun 18 • Originally published at cosmicjs.com

Cosmic as Agent Memory: Structured, Versioned, and Queryable

#ai #webdev #typescript #programming

AI agents get better the more they run. Every conversation turn, every task completed, every prompt refined adds to a growing body of context that shapes the next output. The compounding effect is real: an agent with 100 turns of memory and a versioned prompt history behaves meaningfully differently from one starting cold.

This post walks through using Cosmic as the memory layer for AI agents, with TypeScript examples. Agent messages, system prompts, findings, and instructions are all stored as structured, versioned, API-accessible Objects. Each new turn adds to the record, and each prompt edit is tracked.

What Agent Memory Actually Needs

The compounding loop only works if the memory layer has the right properties. Most agent frameworks handle working memory well. The gap is episodic and semantic memory: what the agent learned, did, and produced across sessions.

Researchers at Elastic recently published a breakdown of agent memory tiers: working memory (in-context), episodic memory (past interactions), semantic memory (knowledge), and procedural memory (learned behaviors). Good persistent agent memory needs four properties:

Structured: queryable by type, status, date, or custom field, not just full-text search
Versioned: you need to know what the agent wrote at each point in time, not just the latest state
API-accessible: any model, any framework, any language should be able to read and write it
Human-reviewable: agents make mistakes; a human needs to inspect and correct outputs without touching a database

Objects as Agent Outputs

When an agent produces output, storing it as a structured Object gives you a queryable record with typed fields, a draft/published workflow so a human can review before promoting to production, a full audit trail of every change, REST API access from any runtime, and a dashboard UI where non-technical team members can inspect, edit, or approve agent outputs.

Here's a simple research agent that stores its findings as Cosmic Objects:

import { createBucketClient } from '@cosmicjs/sdk'

const cosmic = createBucketClient({
  bucketSlug: process.env.COSMIC_BUCKET_SLUG!,
  readKey: process.env.COSMIC_READ_KEY!,
  writeKey: process.env.COSMIC_WRITE_KEY!,
})

async function storeAgentFinding({
  topic,
  summary,
  sourceUrl,
  confidenceScore,
}: {
  topic: string
  summary: string
  sourceUrl: string
  confidenceScore: number
}) {
  const result = await cosmic.objects.insertOne({
    title: topic,
    type: 'agent-findings',
    status: 'draft', // human review before publishing
    metadata: {
      summary,
      source_url: sourceUrl,
      confidence_score: confidenceScore,
      reviewed: false,
    },
  })
  return result.object
}

The output is immediately visible in the dashboard. A team member can review the summary, edit it, toggle reviewed to true, and publish, all without touching code.
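Because the agent writes these records unattended, it can help to validate inputs before they reach the store. The following is a minimal sketch, not part of the Cosmic SDK: `buildFindingPayload` and its interfaces are hypothetical names, and the 0 to 1 range check assumes the confidence score convention used in this post.

```typescript
// Hypothetical helper (not part of the Cosmic SDK): validates agent output
// and builds the same payload shape that storeAgentFinding passes to insertOne.
interface FindingInput {
  topic: string
  summary: string
  sourceUrl: string
  confidenceScore: number
}

interface FindingPayload {
  title: string
  type: 'agent-findings'
  status: 'draft'
  metadata: {
    summary: string
    source_url: string
    confidence_score: number
    reviewed: boolean
  }
}

function buildFindingPayload(input: FindingInput): FindingPayload {
  const topic = input.topic.trim()
  if (topic.length === 0) {
    throw new Error('topic must not be empty')
  }
  // The comparison is false for NaN, so NaN is rejected as well.
  if (!(input.confidenceScore >= 0 && input.confidenceScore <= 1)) {
    throw new Error('confidenceScore must be between 0 and 1')
  }
  // Assumption: findings always cite an http(s) source.
  if (!/^https?:\/\/\S+$/.test(input.sourceUrl)) {
    throw new Error('sourceUrl must be an http(s) URL')
  }
  return {
    title: topic,
    type: 'agent-findings',
    status: 'draft', // human review before publishing
    metadata: {
      summary: input.summary.trim(),
      source_url: input.sourceUrl,
      confidence_score: input.confidenceScore,
      reviewed: false,
    },
  }
}
```

The returned object could then be handed to `cosmic.objects.insertOne`, so a malformed score or URL fails fast instead of landing in the review queue.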

Storing Prompts, Context, and Conversation Memory

Agent outputs are only part of the picture. The other half is what goes into the agent: system prompts, conversation history, and session context.

System Prompts as Objects

Instead of hardcoding system prompts in your codebase, store them as Objects. This gives you version control for prompts (draft a new version, test it, publish when ready, roll back if behavior degrades), wording that non-engineers can edit without a deploy, and a separate prompt for each environment with zero code changes.

// Fetch the active system prompt for an agent
const { object: promptObject } = await cosmic.objects
  .findOne({
    type: 'agent-prompts',
    slug: 'content-research-agent',
    status: 'published',
  })
  .props('title,metadata.prompt_text,metadata.version')

const systemPrompt = promptObject.metadata.prompt_text

When you want to update the prompt, you edit it in the dashboard, save a new version, and publish. The agent picks it up on the next run with no deployment required.
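Since the prompt now lives outside the codebase, the agent may want a guard for the case where the lookup yields nothing usable, for example an unpublished prompt or an empty field. A minimal sketch follows, assuming the caller passes whatever the lookup produced (or `null` after a caught error). `resolveSystemPrompt`, `PromptObject`, and `FALLBACK_PROMPT` are hypothetical names, not SDK exports.

```typescript
// Shape assumed from the query above; only the fields read here are typed.
interface PromptObject {
  metadata?: {
    prompt_text?: string
    version?: number
  }
}

// Hypothetical fallback wording; replace with something suited to your agent.
const FALLBACK_PROMPT = 'You are a careful research assistant.'

function resolveSystemPrompt(
  promptObject: PromptObject | null | undefined,
  fallback: string = FALLBACK_PROMPT,
): string {
  const text = promptObject?.metadata?.prompt_text
  // Treat a missing or whitespace-only prompt as "not available".
  if (typeof text === 'string' && text.trim().length > 0) {
    return text
  }
  return fallback
}
```

With this in place, `const systemPrompt = resolveSystemPrompt(promptObject)` keeps the agent running on a known default rather than failing on an undefined field.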

Conversation Context and Message History

For agents that need to maintain state across sessions, store the conversation history as structured Objects:

async function storeMessage({
  sessionId,
  role,
  content,
  agentName,
}: {
  sessionId: string
  role: 'user' | 'assistant' | 'system'
  content: string
  agentName: string
}) {
  await cosmic.objects.insertOne({
    title: `${agentName} / ${sessionId} / ${role}`,
    type: 'agent-messages',
    status: 'published',
    metadata: {
      session_id: sessionId,
      role,
      content,
      agent_name: agentName,
    },
  })
}

// Retrieve full conversation context for a session
// (sessionId is the same identifier that was passed to storeMessage)
const { objects: messages } = await cosmic.objects
  .find({
    type: 'agent-messages',
    'metadata.session_id': sessionId,
  })
  .props('metadata.role,metadata.content,created_at')
  .sort('created_at')

The agent can reconstruct its full conversation history on every run. The history is human-readable in the dashboard, editable when needed, and queryable across sessions.
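Before stored messages go back to a model, they usually need to be mapped to role/content pairs and trimmed to fit the context window. The sketch below is one way to do that, under stated assumptions: `created_at` is an ISO 8601 string, and a character budget stands in for a real token count. `toChatMessages` and its types are hypothetical names.

```typescript
type Role = 'user' | 'assistant' | 'system'

// Shape assumed from the .props() selection in the query above.
interface StoredMessage {
  created_at: string // assumed ISO 8601
  metadata: { role: Role; content: string }
}

interface ChatMessage {
  role: Role
  content: string
}

// Keeps the most recent messages whose combined length fits within maxChars,
// returned oldest-first so they can be passed to a chat-style model API.
function toChatMessages(stored: StoredMessage[], maxChars: number): ChatMessage[] {
  // Sort a copy newest-first so the input array is left untouched.
  const newestFirst = stored
    .slice()
    .sort((a, b) => Date.parse(b.created_at) - Date.parse(a.created_at))

  const kept: ChatMessage[] = []
  let used = 0
  for (const msg of newestFirst) {
    const { role, content } = msg.metadata
    // Stop at the first message that would exceed the budget.
    if (used + content.length > maxChars) break
    used += content.length
    kept.unshift({ role, content }) // prepend to restore chronological order
  }
  return kept
}
```

Dropping the oldest turns first is a simple policy. Summarizing older turns into a single message is a common alternative when early context matters.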

Querying Agent Memory

The real power is in retrieval. Because each agent output is a structured Object with typed metafields, you can query across your entire agent history:

// Get the 50 most recent unreviewed findings
const { objects } = await cosmic.objects
  .find({
    type: 'agent-findings',
    'metadata.reviewed': false,
  })
  .props('id,title,metadata,created_at')
  .sort('-created_at')
  .limit(50)

// Get high-confidence findings (score of 0.85 or higher), highest score first
const { objects: topFindings } = await cosmic.objects
  .find({
    type: 'agent-findings',
    'metadata.confidence_score': { $gte: 0.85 },
  })
  .props('id,title,metadata')
  .sort('-metadata.confidence_score')

You are filtering by structured properties, sorting by custom scores, and scoping by review status. The agent's memory is queryable the same way any other content in your system is queryable.
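Neither query above restricts results by date. If only recent findings matter, one option is a small client-side filter over the returned Objects. This is a sketch that assumes `created_at` comes back as an ISO 8601 string. `withinLastDays` is a hypothetical helper, and a server-side date filter would be preferable where the API supports one.

```typescript
interface Timestamped {
  created_at: string // assumed ISO 8601
}

// Returns the objects created within the last `days` days relative to `now`.
// `now` is injectable so the function is deterministic in tests.
function withinLastDays<T extends Timestamped>(
  objects: T[],
  days: number,
  now: Date = new Date(),
): T[] {
  const cutoff = now.getTime() - days * 24 * 60 * 60 * 1000
  return objects.filter((o) => {
    const createdAt = Date.parse(o.created_at)
    // Unparseable timestamps yield NaN and are excluded.
    return !isNaN(createdAt) && createdAt >= cutoff
  })
}
```

For example, `withinLastDays(objects, 7)` narrows the unreviewed findings to the past week.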

Versioning: Know What the Agent Said When

A full revision history for every Object matters for auditability. If an agent's output informed a business decision, you need to know exactly what it said at the time of that decision, not just the current state. The same applies to prompts. When a prompt change shifts agent behavior, you can trace exactly which version was active and when. That's the kind of audit trail that matters as agents take on more consequential tasks.
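To make the tracing step concrete, here is a minimal sketch of resolving which prompt version was active at a given moment. It assumes you have already exported the revision history into a simple list. The `PromptRevision` shape and `revisionActiveAt` are hypothetical and do not reflect the actual format of Cosmic's revision data.

```typescript
// Hypothetical revision record; the real revision format may differ.
interface PromptRevision {
  version: number
  publishedAt: string // assumed ISO 8601
  promptText: string
}

// Returns the latest revision published at or before `at`,
// or undefined if nothing had been published yet.
function revisionActiveAt(
  revisions: PromptRevision[],
  at: Date,
): PromptRevision | undefined {
  let active: PromptRevision | undefined
  for (const rev of revisions) {
    const published = Date.parse(rev.publishedAt)
    // Skip unparseable dates and revisions published after the target time.
    if (isNaN(published) || published > at.getTime()) continue
    if (active === undefined || published > Date.parse(active.publishedAt)) {
      active = rev
    }
  }
  return active
}
```

Given the timestamp of an agent output, this answers "which prompt produced it?" without relying on the revisions being stored in order.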

Using the MCP Server

Cosmic ships a native MCP Server, which means any agent running in Claude, Cursor, Windsurf, or any MCP-compatible runtime can read and write Objects directly, with no custom API wrapper needed. The MCP Server exposes all 18 Cosmic tools to your agent: create objects, update objects, query by type, filter by metadata, manage media, and more.

Schema Design for Agent Context and Memory

The key to making this work well is defining clean Object types upfront. Three schemas cover most agent context and memory use cases:

agent-findings: summary (textarea), source_url (text), confidence_score (number 0-1), agent_name (text), session_id (text), reviewed (switch), tags (references)

agent-messages: role (select: user/assistant/system), content (textarea), agent_name (text), session_id (text)

agent-prompts: prompt_text (textarea), version (number), notes (textarea)
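On the TypeScript side, these three schemas can be mirrored as interfaces so that reads and writes are type-checked. The sketch below is one possible mapping, with assumptions noted in comments: switches map to booleans, and `tags` references are represented as an array of Object ids. The type guard `isAgentMessageMetadata` is a hypothetical helper for validating data read back from the API.

```typescript
type Role = 'user' | 'assistant' | 'system'

interface AgentFindingMetadata {
  summary: string
  source_url: string
  confidence_score: number // expected range 0-1
  agent_name: string
  session_id: string
  reviewed: boolean // switch metafield, assumed to map to a boolean
  tags: string[] // assumption: references represented as Object ids
}

interface AgentMessageMetadata {
  role: Role
  content: string
  agent_name: string
  session_id: string
}

interface AgentPromptMetadata {
  prompt_text: string
  version: number
  notes: string
}

// Runtime check for data read back from the API, where shapes are not guaranteed.
function isAgentMessageMetadata(value: unknown): value is AgentMessageMetadata {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>
  return (
    (v.role === 'user' || v.role === 'assistant' || v.role === 'system') &&
    typeof v.content === 'string' &&
    typeof v.agent_name === 'string' &&
    typeof v.session_id === 'string'
  )
}
```

A guard like this is most useful on the read path, since a human editor in the dashboard can change a field in ways the agent code did not anticipate.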

What You Get Out of the Box

You could build this with Postgres and a custom schema. A headless CMS gives you the surrounding pieces without extra work: a dashboard UI for every agent output with no custom admin to build, built-in revision history with no extra tables, a REST API ready to consume from any runtime, a draft/published workflow, media handling, and access that is agnostic to model, framework, and language.

Read the full post on the Cosmic blog for the complete walkthrough, including the copy-paste schema setup and getting-started steps.

Top comments (4)

Hayrullah Kar • Jun 18

This is a refreshing take on the agent memory bottleneck. Everyone jumps straight to vector DBs for semantic retrieval, but they completely neglect the operational reality of running agents in production: debugging, human-in-the-loop validation, and prompt regression.

Treating a headless CMS/structured API store as the episodic memory layer is brilliant for one major reason: human observability. When an agent goes off the rails, a non-technical product manager can literally log into a dashboard, fix the corrupted memory object or roll back a borked system prompt version, and save the session without a single database migration.

My only slight critique is on the latency side for real-time conversation state. For high-frequency, multi-turn chat loops, hitting a full API endpoint to append every single raw message string can introduce noticeable network lag compared to local Redis states. But for episodic findings, system prompts, and structured tool outputs? This architecture is incredibly clean.

The native MCP server integration makes it an absolute no-brainer for Cursor/Claude stacks. Love the blueprint!

Tony Spiro • Jun 18

Thank you Hayrullah! It's been transformative for us at Cosmic, hoping others get on the bandwagon and use their CMS to store and audit agent context.
