Agam Pandey for Mem0

Posted on Jan 26 • Originally published at mem0.ai

Building Memory-First AI Reminder Agents with Mem0 and Claude Agent SDK

#agents #ai #llm #productivity

Traditional reminder bots accept a command, schedule a job, send a notification, and forget the interaction ever happened. They don't learn whether you prefer morning or evening notifications. They don't track whether you typically snooze by 5 minutes or 30. Once you mark something done, the context vanishes from the system entirely.

This works for alarms. It breaks down when you need an assistant that remembers your patterns across weeks and months without losing track of what actually needs to happen.

A real reminder assistant balances reliability with personalization. It never forgets an active reminder, never re-triggers a completed one, and still learns patterns like "this user snoozes work tasks by 15 minutes on average" or "personal reminders cluster around 8pm."

At Mem0, we built RecallAgent, a Slack reminder bot, to explore what happens when memory is a core architectural component rather than an add-on feature. The result is a pattern for building AI agents that remember without hallucinating, adapt without drifting, and archive without losing history.

This post explains the system in two parts: the product experience and why it behaves differently from typical bots, then the engineering details in enough depth that you could build something similar.

The agent uses the Claude Agent SDK, delivers through Slack events, and stores state in a relational database (SQLite locally, Supabase in production). Long-term personalization comes from the Mem0 memory model, and the FastAPI backend runs on hosted services like Render's free tier.

Part I: The product story

RecallAgent behaves like a modern Slack assistant. You message it naturally:

"Remind me to pay rent tomorrow."

"Push that to 6pm."

"Snooze this by 10 minutes."

"What's coming up this week?"

Reminder overview: list, upcoming schedule, and rescheduled items in one view.

When a reminder fires, it pings you directly in Slack with context, buttons, and inline response options. You can mark it done, snooze it, or ask for details without breaking conversation flow.

Notification controls:complete or snooze a reminder without breaking the conversation.

The difference emerges over time. If you regularly confirm work reminders around 10am, RecallAgent starts suggesting that time when you create a reminder without specifying one. If you snooze by roughly 15 minutes consistently, the agent learns that interval and reflects it back. When you complete a reminder, it stops influencing future behavior but remains searchable in your history.

Most AI systems either forget too aggressively (losing valuable patterns) or remember too loosely (creating contradictions in state). RecallAgent avoids both by separating what is true from what is remembered.

Truth vs. memory

In RecallAgent, reminders are not memories. They are facts. If a reminder exists, it must fire. If it's marked done, it must never fire again. This invariant lives in the database and is injected into the system prompt on every turn. The model sees actual reminder state, not a recollection of it.

Memory stores something different: behavior, preferences, and context. Memory answers questions like "how does this user phrase reminders," "what time patterns do they confirm," "how often do they snooze," and "what categories do they implicitly use." Mem0 stores these signals and keeps a lightweight record of active and completed reminders so the assistant can recall history without treating it as truth. We write active and archived reminders into Mem0 categories, but the model never treats those as authoritative. The database is truth. Mem0 is the personalization layer.

This boundary lets the system be both adaptive and correct. The agent can learn without becoming the source of truth.

System components

RecallAgent has four major parts: Slack as the interface, a FastAPI backend that orchestrates everything, a relational database that holds reminder state, and Mem0 for long-term memory. An LLM-based agent sits at the center, but it never directly mutates state. It reasons, decides, and calls tools. All changes happen in deterministic code, with the model driven by the Claude Agent SDK and a constrained tool surface.

High-level architecture: Slack to FastAPI to agent, grounded by DB and Mem0.

This structure is intentionally conservative. When you're building agents that deal with time, notifications, and trust, boring architecture is a feature.

Part II: Engineering details

This section provides the blueprint. We start where Slack sends events and move inward through orchestration, memory, and state.

Slack integration

Slack is the entry point where messages and button clicks hit your backend. Slack delivers events at least once, so duplicate messages are expected behavior, not a corner case.

The backend exposes three HTTP endpoints: POST /slack/events for Event API messages, POST /slack/commands for slash commands, and POST /slack/interactions for button clicks. Every incoming request is authenticated using Slack's signing secret to prevent replay attacks, following Slack's verification guide. Before any business logic runs, events are deduplicated using a short-TTL cache keyed by Slack's event ID. Without deduplication, you will double-create reminders in production.

Slack request verification and event deduplication:

body = await request.body()
if not verify_slack_signature(x_slack_signature, x_slack_request_timestamp, body):
    return JSONResponse({"error": "invalid_signature"}, status_code=401)

payload = await request.json()
event_id = payload.get("event_id")
if is_duplicate_slack_event(event_id):
    return JSONResponse({"ok": True})

Slack-specific formatting (mentions, channel tags, markup) is stripped immediately. The agent only sees clean text. This simplifies intent handling and reduces prompt noise.

The agent loop

Once a message is normalized, it enters a single orchestration loop that prioritizes deterministic state handling before model reasoning. We first resolve any pending state transitions stored in memory: a missing-time follow-up, a personalized time suggestion from the memory layer, or a clarification between multiple matching reminders. Short replies like "yes," "no," or "the second one" are mapped to explicit state updates using simple rules, not another model call. Only after the state is unambiguous do we invoke the LLM to choose a tool and produce a structured action. This ordering is deliberate. Reminder flows break when generative output can override incomplete or ambiguous state.

Deterministic resolution before the model is called, including Mem0-based time suggestion:

# suggested_time is set when the reminder is created without a time
# using get_common_times_by_category(), which reads Mem0 prefs first
pending = pending_actions.get(user_id)
if pending and pending.get("type") == "confirm_time":
    # suggested_time is derived from Mem0 preferences (DB fallback)
    suggested_time = pending.get("suggested_time")
    if user_message.strip().lower() in {"yes", "ok", "sure"} and suggested_time:
        due_str = f"{pending['due_str']} {suggested_time}"
        return execute_create_reminder(..., due_str=due_str, allow_unconfirmed=True)
    if message_mentions_time(user_message):
        return execute_create_reminder(..., due_str=user_message, allow_unconfirmed=True)
    return "What time should I set it for?"

# Only after pending state is resolved
response = client.messages.create(...)

Memory retrieval

Memory lookup is intentional, not automatic. The system first inspects the user's intent and only retrieves Mem0 signals when personalization will help. Listing reminders rarely needs memory; creating or clarifying a reminder often does. When retrieval is warranted, we query Mem0 for just two categories (preferences and behavior summaries), then cache that result with a short TTL. This keeps prompts small, latency predictable, and memory noise out of the model's decision path.

Memory retrieval flow: intent check, Mem0 query, and short TTL caching.

The result is a prompt that is personalized without becoming overloaded or inconsistent.

Claude Agent integration

We integrate the Claude Agent SDK as the orchestration layer, but keep its surface area narrow. The agent does not mutate state directly. It reasons, selects a tool, and returns structured inputs. Deterministic SQL-backed code executes the change, logs it, and syncs memory. This separation keeps the system trustworthy when the model inevitably makes mistakes.

The toolset is intentionally small and typed. Each tool maps to a concrete function in agent-backend/main.py, with the database as source of truth and Mem0 as long-term context. Without this tool boundary, you lose auditability and open the door to hallucinated updates.

Tool Name	What It Does
`create_reminder`	Create a reminder with natural-language time parsing and category inference
`update_reminder`	Change title, description, or due time; marks reschedules for behavior tracking
`mark_done`	Complete a reminder and move its memory to the archived category
`snooze_reminder`	Push a reminder forward and record the snooze interval
`list_reminders`	Return DB-backed lists (active, completed, all) with stable formatting
`search_reminders`	Search reminders by title or description in the database
`delete_reminder`	Delete a reminder only when the user explicitly asks
`set_preference`	Write long-term preferences into Mem0 memory
`get_preferences`	Read preferences from Mem0 and return them to the agent
`list_rescheduled_reminders`	Surface reminders with reschedule history from Mem0 or DB
`clarify_reminder`	Ask the user to disambiguate when multiple reminders match

The system prompt

The system prompt is not creative. It is contractual. It tells the model explicitly that the database is the source of truth for reminders, Mem0 provides personalization only, the model must never invent or assume reminder state, and all state changes must happen via tools. Claude gets a narrow toolset: create, update, snooze, list, mark done. When it decides an action is required, it emits a structured tool call. No state is mutated inside the prompt.

Tool definitions and a sample tool invocation:

TOOLS = [
    {
        "name": "create_reminder",
        "description": "Create a reminder with title and due date",
        "input_schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "due_str": {"type": "string"}
            },
            "required": ["title", "due_str"]
        }
    },
    # ...
]

# Example tool invocation returned by the model
# {"tool_use_id": "...", "name": "create_reminder", "input": {"title": "Pay rent", "due_str": "tomorrow 9am"}}

This design makes the system auditable, testable, and resilient to model drift.

Claude Agent SDK call with tools and system prompt wired in:

response = client.messages.create(
    model=CLAUDE_MODEL,
    max_tokens=MAX_TOKENS,
    tools=TOOLS,
    system=system_prompt,
    messages=messages
)

Prompt building

The prompt is assembled from two sources with different guarantees. First, we load current reminder state from the database and render it directly into the system prompt. This makes the model's view of reality deterministic. It can't hallucinate or override actual reminder status. Second, we enrich with Mem0 signals (preferences and behavior summaries) so the agent can adapt without drifting. We intentionally keep that memory slice small and targeted. The result is a prompt that feels personalized but stays grounded. It suggests a default time or interprets a user's pattern without blurring the boundary between memory and state.

System prompt assembly with DB truth and Mem0 context:

db_reminders = db.list_active_reminders(user_id)
db_rescheduled = db.list_rescheduled_reminders(user_id)

system_prompt = f"""You are a proactive, friendly reminder companion in Slack. You help users stay organized while learning their habits and preferences over time.

## PERSONALITY & TONE
- Be conversational, supportive, and concise
- Use natural language (avoid robotic responses)
- Celebrate completions and encourage productivity
- Match the user's communication style (formal/casual)
- Proactively suggest improvements based on patterns

## CURRENT CONTEXT
**Active Reminders:**
{json.dumps([{
    'id': _reminder_value(r, "id", 0),
    'title': _reminder_value(r, "title", 2),
    'description': _reminder_value(r, "description", 3),
    'due_at': format_due_datetime(_reminder_value(r, "due_at_epoch", 4)),
    'status': _reminder_value(r, "status", 5),
    'category': _reminder_value(r, "category", 6)
} for r in db_reminders], indent=2) if db_reminders else "No active reminders"}

**Rescheduled Active Reminders:**
{json.dumps([{
    'id': _reminder_value(r, "id", 0),
    'title': _reminder_value(r, "title", 2),
    'description': _reminder_value(r, "description", 3),
    'due_at': format_due_datetime(_reminder_value(r, "due_at_epoch", 4)),
    'status': _reminder_value(r, "status", 5),
    'category': _reminder_value(r, "category", 6),
    'reschedule_count': _reminder_value(r, "reschedule_count", 10)
} for r in db_rescheduled], indent=2) if db_rescheduled else "No rescheduled reminders"}

**User Patterns:**
- Preferences: {json.dumps(mem0_context['preferences'], indent=2)}
- Behavior: {json.dumps(mem0_context['behavior'], indent=2)}
- Recent context: {json.dumps(mem0_context['conversation_history'][-3:], indent=2)}

**Time Context:**
- Current: {datetime.now().strftime("%A, %B %d, %Y at %I:%M %p")}
- Timezone: {DEFAULT_TIMEZONE}
- Suggested times: {json.dumps(common_times, indent=2)}

## CORE BEHAVIORS
1. Natural language first: parse "tomorrow at 3", "next Monday", "in 2 hours" automatically
2. Smart defaults: if no time given, suggest category-appropriate time from user patterns/common times, then confirm
3. Clarify ambiguity: use clarify_reminder tool when multiple matches exist
4. Proactive insights: notice patterns and suggest improvements when appropriate
5. DB is ground truth: always use DB-backed tools for reminder status/times; Mem0 is context only
6. Clean responses: use tool summaries verbatim; never expose internal IDs or storage details
7. Respect user intent: only delete when explicitly requested
8. Accept short-term reminders (minutes) without refusing; never scold the user
9. Never change or round user-provided times; preserve exact minutes/hours. If unclear, ask a brief clarification
10. If the user asks for archived/completed reminders, call list_reminders with status="completed"

## RESPONSE GUIDELINES
- Confirmations: "Got it! I'll remind you about {{title}} on {{date}}"
- Lists: always call list_reminders and return its formatted summary verbatim with no extra text
- Errors: be helpful, not apologetic ("Let me help you fix that...")
- Follow-ups: suggest related actions when relevant

Keep it human, helpful, and focused on the user's goals."""

Reminder creation and missing times

Users frequently omit times: "tomorrow," "later," "in the evening." Guessing is dangerous. Asking repeatedly is annoying. RecallAgent resolves this by combining inference with confirmation. If no time is supplied, the system infers a category and looks up the user's most common confirmed times for that category from behavior memory. It proposes a default and asks for explicit confirmation before scheduling. This preserves user trust while reducing friction.

The database

All reminders live in a relational database (SQLite locally, Supabase in production). The schema tracks reminders, statuses, audit logs, behavior stats, and a short conversation window.

Database schema overview for reminders, preferences, audit logs, and behavior stats.

The database makes the system reliable. If Mem0 is unavailable, reminders still fire. If the model misbehaves, state remains correct. A useful rule emerged during development: if losing the data would break correctness, it belongs in the database. If losing it would only reduce personalization, it belongs in memory.

Mem0 implementation

Mem0 stores long-term signals and a mirrored reminder history using explicit categories: active reminders, archived reminders, user preferences, behavior summaries, and optional conversation memory. When a reminder is marked done, its active memory is removed and an archived memory is written instead. This avoids stale memory influencing future behavior while keeping history searchable. The mirror is never treated as authoritative state. It improves recall and personalization without deciding what should fire. Behavior is summarized rather than logged raw. The model sees patterns, not noise. RecallAgent learns over time without accumulating contradictions.

Notifications and archiving

Reminder delivery runs as a separate execution path from the conversational agent. A background polling loop queries for due reminders within a lead-time window and pushes Slack notifications with action buttons. Each reminder stores last_notified_at so we can enforce idempotent delivery and avoid duplicate pings across retries or restarts.

Archiving is handled out-of-band through a scheduled cron job that calls an endpoint to update overdue reminders in the SQL store from active to completed. This decouples time-based state transitions from chat traffic and keeps the system correct even if the web process sleeps.

Slack app setup

The backend endpoints are only half the story. To make the bot real, you register a Slack app and point it to your public URLs. This happens in the Slack app dashboard at api.slack.com/apps. Once the app is created, you enable Events, Commands, and Interactivity and paste in the endpoints your FastAPI service exposes. For RecallAgent, the app expects /slack/events for event callbacks, /slack/commands for slash commands, and /slack/interactions for button actions. The signing secret and bot token are stored in the backend environment so the service can verify requests and post responses back into Slack.

Slack requires a public HTTPS endpoint. For local development, expose your server with a tunneling tool like ngrok and use its URL in the Slack app settings. For production, deploy to a hosted service (Render, Fly, Railway, or similar) and use that stable URL for event and interaction callbacks. Free-tier hosting often introduces cold starts after inactivity, so the first request in a new conversation can have higher latency.

Slack app settings showing Events and Interactivity URLs.

Why this architecture works

This architecture is valuable not because it sends reminders, but because it scales trust. It lets you personalize without letting memory become truth, and it lets you archive without erasing history. The agent can learn from behavior while the system remains deterministic, auditable, and safe.

That balance is what makes memory layers like Mem0 useful. The Claude platform gives the model its reasoning power, but the reliability comes from the contract we enforce around state and tools. If you're building agents meant to live beyond a demo, this boundary between state and memory isn't optional. It's the difference between a one-off demo and a system people trust daily.

Memory only creates value when it is deliberate. We chose Mem0 to hold durable signals like preferences and behavior, while the SQL store remains the single source of truth for time and state. That separation lets the agent feel human without becoming unpredictable. If you build on this pattern, your assistant will not just respond. It will remember the right things, for the right reasons, and earn the trust that makes long-lived agents possible.

View Mem0's documentation to learn more about integrating our memory features. You can also reach out to us at founders@mem0.ai.

DEV Community