The OpenAI Agents SDK gives you execution primitives: tools, handoffs, guardrails. What it doesn’t give you is memory. By default, every agent run is isolated. The agent doesn’t know what it decided last time. It doesn’t remember the user’s preferences. It has no concept of project history. You either manage context manually — which scales poorly — or you pay for a proprietary cloud memory solution that puts your data off-premises.
VEKTOR is the third option: local-first, one-time-purchase, zero-cloud persistent memory that integrates in three lines. Your agent gets a permanent, growing brain. Your data stays on your server. Your context window stays clean.
```typescript
import { createMemory } from 'vektor-slipstream';

const memory = await createMemory({ provider: 'openai' });
await memory.remember('User wants to deploy on Vercel.');
```
That’s it for the baseline. But the real power comes from wiring VEKTOR into your agent’s tool loop — so it remembers and recalls automatically, without any manual context management.
Wiring memory into the tool loop
```typescript
import { Agent, tool } from '@openai/agents';
import { z } from 'zod';
import { createMemory } from 'vektor-slipstream';

const memory = await createMemory({ provider: 'openai' });

// Give the agent memory tools
const rememberTool = tool({
  name: 'remember',
  description: 'Save important information to long-term memory',
  parameters: z.object({ content: z.string(), importance: z.number() }),
  execute: async ({ content, importance }) => {
    await memory.remember(content, { importance });
    return 'Remembered.';
  },
});

const recallTool = tool({
  name: 'recall',
  description: 'Retrieve relevant memories for the current task',
  parameters: z.object({ query: z.string() }),
  execute: async ({ query }) => {
    const memories = await memory.recall(query, { topK: 5 });
    return memories.map(m => m.content).join('\n');
  },
});

const agent = new Agent({
  name: 'persistent-agent',
  model: 'gpt-4o',
  tools: [rememberTool, recallTool],
  instructions:
    'You have persistent memory. Always recall context before responding. Save important decisions.',
});
```
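To make the contract between the tools and the memory layer concrete, here is a minimal in-memory stand-in for the `remember`/`recall` interface the tools above call into. This is a sketch, not VEKTOR's implementation: VEKTOR ranks by embedding similarity and persists to SQLite, while this stand-in uses naive keyword overlap purely for illustration.

```typescript
// Hypothetical stand-in for the memory interface assumed above.
// recall() scores by keyword overlap here; VEKTOR uses vector similarity.
type MemoryItem = { content: string; importance: number };

class InMemoryStore {
  private items: MemoryItem[] = [];

  async remember(content: string, opts: { importance?: number } = {}) {
    this.items.push({ content, importance: opts.importance ?? 1 });
  }

  async recall(query: string, opts: { topK?: number } = {}) {
    const terms = query.toLowerCase().split(/\s+/);
    return this.items
      .map(m => ({
        ...m,
        score: terms.filter(t => m.content.toLowerCase().includes(t)).length,
      }))
      .filter(m => m.score > 0)
      .sort((a, b) => b.score - a.score || b.importance - a.importance)
      .slice(0, opts.topK ?? 5);
  }
}
```

Anything satisfying this shape can back the `remember` and `recall` tools, which is what makes the memory layer swappable.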
Local Transformers.js — no API calls for vectors
Most memory solutions require you to call an embedding API for every write and recall. At scale, this is a hidden cost that compounds quickly — 10,000 memory operations per month can cost $50–200 in embedding API calls alone.
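As a back-of-envelope check on how that compounds, the monthly spend is just operations × tokens per operation × the provider's per-token rate. The numbers below are illustrative assumptions, not any specific provider's pricing.

```typescript
// Back-of-envelope hosted-embedding cost. Both inputs are assumptions
// for illustration, not a real provider's rate card.
const opsPerMonth = 10_000;
const avgTokensPerOp = 500;      // assumed average memory size
const pricePer1kTokens = 0.02;   // assumed $ per 1K embedding tokens
const monthlyCost = (opsPerMonth * avgTokensPerOp / 1_000) * pricePer1kTokens;
// monthlyCost === 100, i.e. $100/month for embeddings alone
```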
VEKTOR generates embeddings locally using Transformers.js — running the embedding model directly on your hardware via WebAssembly. First run downloads the model (~80MB). Every subsequent embedding is free, instant, and private.
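VEKTOR's internals aren't shown here, but the underlying mechanics are standard: Transformers.js exposes a `pipeline('feature-extraction', ...)` API that produces a normalized sentence vector, and recall then ranks stored vectors against the query vector by cosine similarity. A sketch of that ranking step (with normalized vectors, cosine reduces to a dot product, but the general form is shown):

```typescript
// Cosine similarity — how recall can rank stored memory vectors
// against a query vector. Sketch only; not VEKTOR's actual code.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```

Identical vectors score 1, orthogonal vectors score 0, and recall simply returns the top-K highest-scoring memories.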
- Three lines to integrate — no infra to configure
- Local SQLite — one file, zero database overhead
- Zero embedding costs — Transformers.js runs on your hardware
- AUDN curation — no contradictions accumulate
- Works with any OpenAI-compatible agent framework