Persist and Restore Conversations With Optional Encryption: JSONL Codec for LLM Message Histories

#hermeschallenge #ai #python #agents

Your conversational agent handles multi-session interactions. The user starts a conversation on Monday, comes back on Wednesday, and expects the agent to remember the context. Without persistence, every new session starts from scratch.

You need to save conversation history between sessions. You also need that history file to be protected, because it will contain sensitive information: users say sensitive things.

conversation-codec is a JSONL-based conversation persistence layer with optional Fernet encryption.

The Shape of the Fix

import os

from conversation_codec import ConversationCodec

codec = ConversationCodec(
    path="./conversations/user-abc123.jsonl",
    # Optional: encrypt with a key from your secrets manager
    encryption_key=os.environ.get("CONVERSATION_KEY"),
)

# Load existing conversation history
messages = codec.load() or []

# Continue the conversation
messages.append({"role": "user", "content": user_input})
response = call_llm(messages)
messages.append({"role": "assistant", "content": response.content})

# Save after each turn
codec.save(messages)

Load on session start. Append each turn. Save after each turn. When encryption is enabled, the JSONL file is encrypted at rest. If the file is accessed without the key, it is unreadable.

What It Does NOT Do

conversation-codec does not manage encryption keys. You provide the key from your secrets manager. Key rotation, expiry, and storage are your responsibility. The codec just uses whatever key you give it.
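
As an illustration of what "providing the key" can look like, here is a minimal standard-library sketch. A Fernet key is 32 random bytes, URL-safe base64-encoded. The helper names `generate_key` and `is_valid_fernet_key` are hypothetical and not part of conversation-codec. With `cryptography` installed, `Fernet.generate_key()` produces a key of the same shape.

```python
import base64
import os


def generate_key() -> str:
    """Return a new Fernet-compatible key: 32 random bytes, URL-safe base64."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


def is_valid_fernet_key(key: str) -> bool:
    """Check the key's shape only: it must decode to exactly 32 bytes."""
    try:
        return len(base64.urlsafe_b64decode(key.encode("ascii"))) == 32
    except (ValueError, UnicodeEncodeError):
        # binascii.Error (bad padding) is a subclass of ValueError
        return False


# Generate once, store it in your secrets manager, and expose it to the
# process (for example as the CONVERSATION_KEY environment variable).
key = generate_key()
```

Checking the key shape at startup turns a misconfigured secret into an early, obvious failure instead of an error on the first save.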

It does not implement conversation summarization. As the conversation grows, the full JSONL file grows with it. Old turns are not summarized or pruned automatically. For context window management, use llm-context-rotate to trim the message list before each API call, while conversation-codec persists the full history.

It does not handle concurrent writes safely. If two processes write to the same conversation file simultaneously, the file may be corrupted. Use one writer per conversation, or add your own locking layer.
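
One possible shape for "your own locking layer" is a lock file created with an exclusive-create flag. This is a sketch under stated assumptions, not something the library ships: `conversation_lock` is a hypothetical helper, and a process that crashes while holding the lock leaves a stale `.lock` file that you would need to clean up yourself.

```python
import os
import time
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def conversation_lock(path: str, timeout: float = 5.0, poll: float = 0.05):
    """Hold an exclusive lock file next to `path` for the duration of the block."""
    lock_path = Path(str(path) + ".lock")
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails with FileExistsError if the lock exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not lock {path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        lock_path.unlink(missing_ok=True)


# Usage (assuming `codec` is a ConversationCodec for the same path):
# with conversation_lock("./conversations/user-abc123.jsonl"):
#     codec.append({"role": "user", "content": "hi"})
```

Wrapping the whole load-modify-save cycle in the lock matters: locking only the write would still let two processes overwrite each other's appended turns.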

Inside the Library

The codec supports two modes: plaintext JSONL and Fernet-encrypted JSONL.

import json
from pathlib import Path
from typing import Callable

class ConversationCodec:
    def __init__(self, path: str, encryption_key: str | None = None, redact_fn: Callable | None = None):
        self._path = Path(path)
        self._path.parent.mkdir(parents=True, exist_ok=True)
        self._encryption_key = encryption_key
        self._redact = redact_fn

        if encryption_key:
            try:
                from cryptography.fernet import Fernet
            except ImportError as exc:
                raise ImportError("Install 'cryptography' for encryption: pip install cryptography") from exc
            # Normalize a str key to bytes; a malformed key raises ValueError here
            key = encryption_key.encode() if isinstance(encryption_key, str) else encryption_key
            self._fernet = Fernet(key)
        else:
            self._fernet = None

    def save(self, messages: list[dict]) -> None:
        # Apply redaction if configured
        if self._redact:
            messages = [self._redact(m) for m in messages]

        jsonl = "\n".join(json.dumps(m) for m in messages)

        if self._fernet:
            data = self._fernet.encrypt(jsonl.encode())
            self._path.write_bytes(data)
        else:
            self._path.write_text(jsonl)

    def load(self) -> list[dict] | None:
        if not self._path.exists():
            return None

        if self._fernet:
            encrypted = self._path.read_bytes()
            jsonl = self._fernet.decrypt(encrypted).decode()
        else:
            jsonl = self._path.read_text()

        messages = []
        for line in jsonl.splitlines():
            if line.strip():
                messages.append(json.loads(line))

        return messages if messages else None

    def append(self, message: dict) -> None:
        """Convenience wrapper: load the full history, add one message, rewrite the file."""
        messages = self.load() or []
        messages.append(message)
        self.save(messages)

    def clear(self) -> None:
        self._path.unlink(missing_ok=True)

    def message_count(self) -> int:
        messages = self.load()
        return len(messages) if messages else 0

The redact_fn parameter accepts any callable that takes a message dict and returns a cleaned version. You can pass llm_pii_redact.redact_message directly:

from llm_pii_redact import PIIRedact
from conversation_codec import ConversationCodec

pii = PIIRedact()
codec = ConversationCodec(
    path="./conversations/user-abc.jsonl",
    redact_fn=pii.redact_message,
    encryption_key=load_key_from_vault(),
)
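
If you would rather not pull in a redaction library, any function with the same shape works. The following is a minimal sketch of a custom `redact_fn` that masks email addresses. The name `redact_emails` and the deliberately simple regex are illustrative assumptions, not part of conversation-codec or llm-pii-redact.

```python
import re

# Simple illustrative pattern; real-world PII detection needs more than this.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact_emails(message: dict) -> dict:
    """Return a copy of the message with email addresses masked."""
    cleaned = dict(message)  # shallow copy; the caller's dict is left untouched
    content = cleaned.get("content")
    if isinstance(content, str):
        cleaned["content"] = EMAIL_RE.sub("[EMAIL]", content)
    return cleaned


original = {"role": "user", "content": "Reach me at jane@example.com please"}
redacted = redact_emails(original)

# Pass it the same way as any other redactor:
# codec = ConversationCodec(path="./conversations/user-abc.jsonl", redact_fn=redact_emails)
```

Returning a copy keeps the in-memory message intact, so the model still sees the original text while only the persisted version is redacted. That matches how save() in the listing above redacts the list it writes rather than the list you passed in.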

When to Use It

Use it for multi-session conversational agents where users expect continuity. Customer support agents that remember previous tickets. Tutoring agents that remember what topics were covered. Personal assistants that remember preferences.

Use it with encryption for any agent that handles sensitive information. User conversations regularly contain PII, health information, financial details, and confidential business data. Encrypting the file at rest is a basic hygiene measure.

Use it with redact_fn for layered protection: redact sensitive patterns before writing, then encrypt what remains. An attacker who gets the file without the key sees only ciphertext; an attacker who also gets the key sees only the redacted version.

Skip it for stateless agents where each request is independent. If there is no meaningful continuity from one session to the next, persistence adds overhead without value.

Install

pip install git+https://github.com/MukundaKatta/conversation-codec

# With encryption support
pip install conversation-codec cryptography

A fuller example pairs conversation-codec with llm-context-rotate: the codec persists the full history, while ContextRotate keeps the sliding window that is sent to the model.

from conversation_codec import ConversationCodec
from llm_context_rotate import ContextRotate

codec = ConversationCodec(
    path=f"./conversations/{user_id}.jsonl",
    encryption_key=get_user_key(user_id),
)

ctx = ContextRotate(max_turns=20)

# Load existing history
saved_messages = codec.load() or []
for msg in saved_messages:
    if msg["role"] == "user":
        ctx.add_user(msg["content"])
    else:
        ctx.add_assistant(msg["content"])

async def handle_turn(user_message: str) -> str:
    ctx.add_user(user_message)
    codec.append({"role": "user", "content": user_message})

    response = await call_llm(messages=ctx.messages())
    response_text = extract_text(response)

    ctx.add_assistant(response_text)
    codec.append({"role": "assistant", "content": response_text})

    return response_text

Sibling Libraries

Library	What it solves
`llm-context-rotate`	Sliding window for the in-memory conversation
`llm-pii-redact`	PII redaction to pass as redact_fn
`llm-redact-secrets`	Credential redaction to pass as redact_fn
`agent-state-checkpoint`	Checkpoint arbitrary state (not just messages)
`agent-step-log`	Structured step log alongside the conversation

The persistence stack: conversation-codec for message history, llm-context-rotate for in-memory windowing, llm-pii-redact as the redact_fn, agent-state-checkpoint for other run state.

What's Next

Per-message encryption: instead of encrypting the whole file, encrypt each JSONL line individually. Allows appending without decrypting and re-encrypting the full file, and enables selective message access with different keys.
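
A rough sketch of how per-line encryption could work, assuming the cipher's output contains no newline bytes (Fernet tokens are URL-safe base64, so they qualify). The functions `append_encrypted` and `load_encrypted` are hypothetical and do not exist in the library today. The cipher is passed in as plain callables so the sketch stays independent of any one encryption package.

```python
import json
from pathlib import Path
from typing import Callable


def append_encrypted(path: Path, message: dict, encrypt: Callable[[bytes], bytes]) -> None:
    """Encrypt one message and append it as its own line."""
    token = encrypt(json.dumps(message).encode("utf-8"))
    with path.open("ab") as f:
        f.write(token + b"\n")


def load_encrypted(path: Path, decrypt: Callable[[bytes], bytes]) -> list[dict]:
    """Decrypt each non-empty line independently and parse it as JSON."""
    if not path.exists():
        return []
    lines = path.read_bytes().splitlines()
    return [json.loads(decrypt(line)) for line in lines if line.strip()]
```

With `cryptography` installed, a `Fernet` instance's `encrypt` and `decrypt` methods fit these parameters directly. Appending then costs one small encryption instead of rewriting the whole file.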

Compression: gzip the JSONL before encryption. Long conversations can become large files (1-10MB for a 200-turn conversation). Compression typically reduces this by 60-70%.
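
The compression step could look roughly like this. It is a sketch using the standard-library `gzip` module; `pack` and `unpack` are hypothetical names, not existing codec methods. Compression has to happen before encryption, because ciphertext looks random and does not compress.

```python
import gzip
import json


def pack(messages: list[dict]) -> bytes:
    """Serialize to JSONL and gzip it; encrypt the result afterwards if needed."""
    jsonl = "\n".join(json.dumps(m) for m in messages)
    return gzip.compress(jsonl.encode("utf-8"))


def unpack(data: bytes) -> list[dict]:
    """Inverse of pack(): gunzip, then parse one JSON object per line."""
    jsonl = gzip.decompress(data).decode("utf-8")
    return [json.loads(line) for line in jsonl.splitlines() if line.strip()]


history = [{"role": "user", "content": "Tell me about the refund policy."}] * 50
packed = pack(history)
raw_size = len("\n".join(json.dumps(m) for m in history).encode("utf-8"))
ratio = len(packed) / raw_size  # the actual ratio depends on your conversations
```

The sample history is artificially repetitive, so measure on real conversations before relying on any particular ratio.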

Message search: codec.search(query) that decrypts and searches for messages matching a keyword or date range. Useful when users ask "what did we discuss about topic X last month?" without loading the entire history.
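
Until something like that exists, a keyword filter over the loaded list is a few lines. This sketch assumes messages are dicts with a string `content` field, as in the examples above. The function `search_messages` is hypothetical, and date-range filtering is left out because the stored messages shown here carry no timestamp.

```python
def search_messages(messages: list[dict], query: str) -> list[dict]:
    """Return messages whose content contains the query, ignoring case."""
    needle = query.casefold()
    return [
        m for m in messages
        if isinstance(m.get("content"), str) and needle in m["content"].casefold()
    ]


history = [
    {"role": "user", "content": "Can we revisit the Pricing tiers?"},
    {"role": "assistant", "content": "Sure, which tier first?"},
    {"role": "user", "content": "Also, what about onboarding?"},
]
hits = search_messages(history, "pricing")

# With the codec: search_messages(codec.load() or [], "pricing")
```

Unlike the planned feature, this version still loads and decrypts the entire history first.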

Built as part of the agent-stack family: composable Python primitives for production LLM agents.