Siddick FOFANA
How to Build a Durable AI Agent in 10 Lines of TypeScript

Your AI agent crashes after 2 minutes of work. You lose everything. It doesn't have to be this way.


I've seen it happen in every production AI project.

You build a beautiful agent. It works perfectly on your machine. You ship it.
Then it runs for 90 seconds, hits a timeout, crashes — and all context is gone.
Your user stares at a spinner. Your logs show nothing useful.

The problem isn't your LLM. It's your architecture.

AI agents are stateful, long-running processes. And we've been treating them like stateless HTTP handlers.

Let me show you how to fix this in 10 lines.


The Problem: Agents Are Ephemeral by Default

// Standard agent — everything lives in RAM
const agent = new StreamingToolAgent({ goal: 'Research assistant' }, llm)
const result = await agent.run('Research the top 10 AI papers of 2025')
// If this crashes at step 8/10... you restart from zero.

The issue is that AI agent state lives only in memory. The moment the process dies — OOM, timeout, deployment, network blip — you lose everything.


The Solution: @orka-js/durable

import { DurableAgent, MemoryDurableStore } from '@orka-js/durable'
import { StreamingToolAgent } from '@orka-js/agent'
import { OpenAIAdapter } from '@orka-js/openai'

const llm = new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! })
const agent = new StreamingToolAgent({ goal: 'Research assistant', tools: [] }, llm)
const store = new MemoryDurableStore()
const durable = new DurableAgent(agent, store, { maxRetries: 3 })

// That's it. Your agent is now durable.
const job = await durable.run('job-001', 'Research the top 10 AI papers of 2025')
console.log(job.status)  // 'completed'
console.log(job.result)  // The full research

That's literally 10 lines. Let me break down what just happened.


What DurableAgent Actually Does

1. Every job has a persistent ID

const job = await durable.run('job-001', 'Research AI papers')

The job-001 identifier is the key insight. If this job already ran and completed, calling run('job-001', ...) again returns the cached result immediately — no LLM call made.

This is idempotency. It's the foundation of reliable distributed systems, now applied to AI agents.
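The pattern itself is simple enough to sketch in plain TypeScript. Here's a minimal, self-contained illustration of idempotent job execution using a Map as a stand-in store — this is the concept, not the library's actual internals:

```typescript
// Idempotency sketch: a store keyed by job ID short-circuits repeat runs.
type Job = { status: 'completed'; result: string }

const jobs = new Map<string, Job>()

async function runIdempotent(
  jobId: string,
  work: () => Promise<string>
): Promise<Job> {
  const cached = jobs.get(jobId)
  if (cached) return cached // already ran: no LLM call, no recomputation

  const result = await work()
  const job: Job = { status: 'completed', result }
  jobs.set(jobId, job)
  return job
}

async function main() {
  let calls = 0
  const work = async () => { calls++; return 'research done' }
  await runIdempotent('job-001', work)
  await runIdempotent('job-001', work) // second call hits the store
  console.log(calls) // 1
}
main()
```

The second call with the same ID never touches the LLM — which is exactly why retrying a crashed pipeline with the same job IDs is safe.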

2. Automatic retry with backoff

const durable = new DurableAgent(agent, store, {
  maxRetries: 3,
  retryDelayMs: 2000, // wait 2s between retries
})

If your agent fails (rate limit, network error, LLM timeout), DurableAgent retries automatically. No try/catch soup. No manual retry loops.
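Conceptually, the retry loop looks something like this — an illustrative sketch of the `maxRetries` / `retryDelayMs` behavior, not the library's actual implementation:

```typescript
// Retry-with-delay sketch: maxRetries extra attempts, waiting between each.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

async function withRetries<T>(
  fn: () => Promise<T>,
  maxRetries: number,
  retryDelayMs: number
): Promise<T> {
  let lastError: unknown
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn() // success: return immediately
    } catch (err) {
      lastError = err
      if (attempt < maxRetries) await sleep(retryDelayMs) // back off, then retry
    }
  }
  throw lastError // all attempts exhausted: surface the last failure
}
```

A rate-limited LLM call that fails twice and succeeds on the third attempt resolves normally; the caller never sees the intermediate errors.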

3. Human-in-the-loop: pause and resume

This is where it gets genuinely powerful.

// Start a long research job
const job = await durable.run('analysis-42', 'Analyze Q4 financial reports')

// Pause it — maybe you need human approval before continuing
await durable.pause('analysis-42')

// Hours later, after your manager reviews the intermediate output...
const resumed = await durable.resume('analysis-42')
console.log(resumed.result)

You can build approval workflows where an agent pauses waiting for a human decision, then resumes exactly where it left off. No state reconstruction. No prompt re-injection hacks.
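Under the hood this is just a small state machine over persisted job records. Here's a hedged sketch of the idea — hypothetical field names, not the library's schema — showing why resuming is cheap: the checkpoint is already in the store, so pausing persists nothing extra.

```typescript
// Pause/resume sketch: status transitions over a persisted job record.
type JobStatus = 'running' | 'paused' | 'completed'

interface JobRecord {
  status: JobStatus
  checkpoint: string[] // steps completed so far (already persisted as the job ran)
}

const store = new Map<string, JobRecord>()

function pause(jobId: string): void {
  const job = store.get(jobId)
  if (!job || job.status !== 'running') throw new Error('job is not running')
  job.status = 'paused' // nothing else to save: the checkpoint is already durable
}

function resume(jobId: string): JobRecord {
  const job = store.get(jobId)
  if (!job || job.status !== 'paused') throw new Error('job is not paused')
  job.status = 'running' // continue from job.checkpoint, not from scratch
  return job
}
```

Because the checkpoint outlives the process, the "hours later" in the example above could just as well be days — or a different server entirely.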


Production: Switch to Redis in One Line

MemoryDurableStore is great for local dev. For production:

import { RedisDurableStore } from '@orka-js/durable'
import { createClient } from 'redis'

const redis = createClient({ url: process.env.REDIS_URL })
await redis.connect()

const store = new RedisDurableStore(redis) // <- one line change
const durable = new DurableAgent(agent, store, { maxRetries: 3 })

Your agent jobs now survive:

  • Server restarts
  • Deployments
  • OOM crashes
  • Cloud instance preemptions

Scheduled Agents: Cron for AI

One more thing: you can schedule an agent to run on a cron expression.

const durable = new DurableAgent(agent, store)

// Run every day at 9am
durable.schedule('daily-briefing', '0 9 * * *', 'Generate the daily news summary')

No cron infrastructure to set up. No separate worker process. Just declare it and forget it.


Stream Events While Persisting State

If your agent supports streaming (it should), DurableAgent preserves it:

const stream = durable.stream('job-002', 'Write a full market analysis')

for await (const event of stream) {
  if (event.type === 'text') process.stdout.write(event.content)
  if (event.type === 'tool_call') console.log('Using tool:', event.name)
  if (event.type === 'done') console.log('Job saved:', event.jobId)
}

The user sees a streaming response. The job state is persisted in the background. Best of both worlds.
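The trick is that persisting and streaming aren't competing concerns: each event can be appended to a durable log before it's yielded to the consumer. A self-contained sketch of that pattern (illustrative only — a Map stands in for Redis, and the event shapes are assumptions):

```typescript
// Streaming-while-persisting sketch: log every event, then yield it.
type AgentEvent =
  | { type: 'text'; content: string }
  | { type: 'done'; jobId: string }

const eventLog = new Map<string, AgentEvent[]>() // stand-in for a durable store

async function* streamAndPersist(
  jobId: string,
  source: AsyncIterable<AgentEvent>
): AsyncGenerator<AgentEvent> {
  const log: AgentEvent[] = []
  eventLog.set(jobId, log)
  for await (const event of source) {
    log.push(event) // persisted before the consumer even sees it
    yield event     // streamed to the user with no extra round trip
  }
}
```

If the process dies mid-stream, a fresh process can replay the log up to the last persisted event instead of starting the job over.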


The Mental Model Shift

Before → After
Agent = function call → Agent = persistent job
Failure = start over → Failure = resume from checkpoint
Long tasks = scary → Long tasks = no problem
Human review = impossible → Human review = pause/resume
Cron = separate infra → Cron = built-in

Get Started

npm install @orka-js/durable @orka-js/agent @orka-js/openai
import { DurableAgent, MemoryDurableStore } from '@orka-js/durable'
import { StreamingToolAgent } from '@orka-js/agent'
import { OpenAIAdapter } from '@orka-js/openai'

const llm = new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! })
const agent = new StreamingToolAgent({ goal: 'Assistant', tools: [] }, llm)
const durable = new DurableAgent(agent, new MemoryDurableStore(), { maxRetries: 2 })

const job = await durable.run('my-first-durable-job', 'Hello, world!')
console.log(job.status, job.result)

That's it. Your agents are now production-grade.


OrkaJS is a TypeScript-first framework for building production AI agents. Modular, typed, and provider-agnostic.

If this solved a problem you've had, share it — other devs are fighting the same battle right now.
