Mukunda Rao Katta

Posted on May 25

Persist LLM Conversations Safely: JSONL, Redaction, and Encryption

#hermeschallenge #ai #python #agents

Multi-Turn Agents Lose State on Crash

A single-turn agent is stateless. Each run starts fresh. If it crashes, you restart it with the same input and try again.

A multi-turn agent is different. It has conversation history. The fifth message depends on messages one through four. If the process crashes between turn four and turn five, and you did not persist anything, you restart from scratch. The user loses context. If the conversation was long, they lose a lot.

Persisting conversation history solves this. But naive persistence has two problems:

The conversation may contain secrets: API keys passed in user messages, PII in model responses, authentication tokens from tool results. Writing those to a plain JSONL file is a data exposure risk.
Plain JSONL sitting on disk is readable by anything with file access. If this is a shared host, a multi-tenant environment, or a device that leaves the building, plaintext history is a liability.

conversation-codec handles both. It writes conversation history to JSONL with an optional redaction pass before writing and optional Fernet encryption at rest. The API covers the read/write lifecycle. You bring the messages; it handles the storage.

Main Code Example

Install:

pip install conversation-codec

For Fernet encryption, install the extras:

pip install "conversation-codec[crypto]"

Basic write and read

from conversation_codec import ConversationCodec

codec = ConversationCodec(path="conversations/session-42.jsonl")

messages = [
    {"role": "user", "content": "What is the revenue for Q1 2024?"},
    {"role": "assistant", "content": "Based on the report, Q1 2024 revenue was $4.2M."},
]

codec.write(messages)

# Later, restore and extend
loaded = codec.read()
loaded.append({"role": "user", "content": "What about Q2?"})
codec.write(loaded)

write() overwrites the file with the full current history. If you want append semantics for streaming:

codec.append({"role": "assistant", "content": "Q2 revenue was $5.1M."})
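
To make the two write modes concrete, here is a minimal standard-library sketch of what "overwrite" and "append" mean for a plain JSONL file. It is not the library's implementation. The helper names `write_all`, `append_one`, and `read_all` are hypothetical and exist only for this illustration.

```python
import json
import os
import tempfile


def write_all(path, messages):
    """Overwrite the file with one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for message in messages:
            f.write(json.dumps(message, ensure_ascii=False) + "\n")


def append_one(path, message):
    """Add a single message as a new line without rewriting earlier ones."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(message, ensure_ascii=False) + "\n")


def read_all(path):
    """Return every message in the file, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Demo in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "session-42.jsonl")
write_all(path, [{"role": "user", "content": "What is the revenue for Q1 2024?"}])
append_one(path, {"role": "assistant", "content": "Q1 2024 revenue was $4.2M."})
history = read_all(path)
```

Overwriting costs more I/O as the history grows, but the file always reflects exactly the list you passed in. Appending is cheaper per turn, but the file only stays consistent if every turn is appended exactly once.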

Redaction before write

Pass a callable that receives each message and returns a cleaned copy:

import re

def redact(message):
    content = message.get("content", "")
    # Redact anything that looks like a bearer token
    content = re.sub(r"Bearer\s+[A-Za-z0-9\-._~+/]+=*", "Bearer [REDACTED]", content)
    # Redact email addresses
    content = re.sub(r"[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}", "[EMAIL]", content)
    return {**message, "content": content}

codec = ConversationCodec(
    path="conversations/session-42.jsonl",
    redact=redact,
)

codec.write([
    {"role": "user", "content": "My email is alice@example.com and my token is Bearer eyJhbGc..."},
    {"role": "assistant", "content": "Got it, Alice."},
])
# Written file contains [EMAIL] and Bearer [REDACTED]

The redact callable receives the raw message dict and must return a message dict. Returning None drops the message entirely (useful for filtering tool results that should never persist).
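
As a sketch of the drop-on-None behavior described above, the callable below filters out tool results and masks bearer tokens in everything else. The `"tool"` role name is an assumption about your message schema, and the list comprehension at the end only simulates what a writer would keep. It does not call the library.

```python
import re

TOKEN_PATTERN = re.compile(r"Bearer\s+[A-Za-z0-9\-._~+/]+=*")


def redact_and_filter(message):
    """Drop tool results entirely; mask bearer tokens in everything else."""
    if message.get("role") == "tool":  # assumed role name for tool output
        return None  # None means "do not persist this message"
    content = message.get("content", "")
    if isinstance(content, str):  # leave non-string content untouched
        content = TOKEN_PATTERN.sub("Bearer [REDACTED]", content)
    return {**message, "content": content}


sample = [
    {"role": "user", "content": "Call the API with Bearer abc123.def"},
    {"role": "tool", "content": "raw upstream payload"},
    {"role": "assistant", "content": "Done."},
]

# Simulate what would be persisted after applying the callable.
kept = [m for m in (redact_and_filter(msg) for msg in sample) if m is not None]
```

You would pass `redact_and_filter` as the `redact=` argument in the same way as the `redact` function above.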

Fernet encryption at rest

from cryptography.fernet import Fernet
from conversation_codec import ConversationCodec

key = Fernet.generate_key()  # generate once and store securely, not next to the file; a new key cannot read older files

codec = ConversationCodec(
    path="conversations/session-42.jsonl",
    fernet_key=key,
)

codec.write(messages)
# File on disk is encrypted ciphertext

loaded = codec.read()  # decrypts on read automatically

Encryption and redaction compose. Pass both and the codec redacts first, then encrypts. On read, it decrypts and returns the messages as they were stored, so redacted values stay redacted. The original secrets are not recoverable from the file.

Resume after crash

import anthropic
from conversation_codec import ConversationCodec

codec = ConversationCodec(path="conversations/session-42.jsonl")
client = anthropic.Anthropic()

# Try to resume. If the file doesn't exist, start fresh.
messages = codec.read() or []

def run_turn(user_input):
    messages.append({"role": "user", "content": user_input})
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=messages,
    )
    assistant_message = {"role": "assistant", "content": response.content[0].text}
    messages.append(assistant_message)
    codec.write(messages)  # persist after every turn
    return assistant_message["content"]

# If this crashes between turns, the next run calls codec.read() and picks up where it left off
run_turn("What is the Q1 revenue?")
run_turn("And Q2?")

Writing after every turn is the key pattern. If you batch writes, you risk losing every turn since the last write. For long-running sessions, writing every turn is the right default.
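
Writing every turn only helps if a crash in the middle of a write cannot leave a half-written file behind. Whether conversation-codec already guards against that is not covered here, so treat the following as a sketch of one common technique: write to a temporary file in the same directory, then swap it into place with `os.replace`. The function name `atomic_write_jsonl` is hypothetical.

```python
import json
import os
import tempfile


def atomic_write_jsonl(path, messages):
    """Write messages to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            for message in messages:
                f.write(json.dumps(message) + "\n")
            f.flush()
            os.fsync(f.fileno())  # push bytes to disk before the rename
        # After this call the path holds either the old file or the new one.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


target = os.path.join(tempfile.mkdtemp(), "session-42.jsonl")
atomic_write_jsonl(target, [{"role": "user", "content": "Q1?"}])
atomic_write_jsonl(target, [
    {"role": "user", "content": "Q1?"},
    {"role": "assistant", "content": "$4.2M"},
])
with open(target, encoding="utf-8") as f:
    lines = f.read().splitlines()
```

The temp file lives in the same directory as the target because a rename is only atomic within one filesystem.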

What It Does NOT Do

conversation-codec does not search or query conversation history. If you want to retrieve past conversations by topic, date range, or semantic similarity, you need a vector database or a search index alongside this. The codec writes and reads. It does not index.

It does not deduplicate messages. If you call append() with the same message twice, it writes twice. Deduplication is your responsibility before calling write() or append().
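
One way to handle this yourself is to drop exact repeats of the previous message before persisting. This is a sketch under one assumption: that duplicates come from retries and therefore appear back to back. The helper name `dedupe_consecutive` is hypothetical.

```python
import json


def dedupe_consecutive(messages):
    """Drop a message only when it exactly repeats the one before it."""
    result = []
    previous_key = None
    for message in messages:
        # Canonical JSON makes dict key order irrelevant to the comparison.
        key = json.dumps(message, sort_keys=True)
        if key != previous_key:
            result.append(message)
        previous_key = key
    return result


history = [
    {"role": "user", "content": "And Q2?"},
    {"role": "user", "content": "And Q2?"},  # retry duplicate
    {"role": "assistant", "content": "Q2 revenue was $5.1M."},
    {"role": "user", "content": "And Q2?"},  # legitimate repeat, kept
]
cleaned = dedupe_consecutive(history)
```

Only consecutive repeats are removed on purpose. A user can legitimately send the same text twice in one conversation, and a global dedupe would silently delete the second occurrence.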

It does not manage keys. You generate and store the Fernet key yourself. If you lose the key, the encrypted file is unreadable. There is no key escrow, no key rotation helper, no key management. This is intentional: key management is a system-level concern that varies by deployment.
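
Since key handling is left to you, a small loader that fails loudly is a reasonable starting point. The sketch below reads the key from an environment variable and checks that it has the shape of a Fernet key (32 random bytes, url-safe base64 encoded). The variable name `CONVERSATION_FERNET_KEY` and the function `load_fernet_key` are hypothetical. In a real deployment your secret manager sets the variable, not the script.

```python
import base64
import os


def load_fernet_key(env_var="CONVERSATION_FERNET_KEY"):
    """Return the key as bytes, or fail loudly if it is missing or malformed."""
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError(f"{env_var} is not set; refusing to continue without a key")
    try:
        raw = base64.urlsafe_b64decode(value)
    except ValueError as exc:  # covers bad padding and non-ASCII input
        raise RuntimeError(f"{env_var} is not valid url-safe base64") from exc
    if len(raw) != 32:
        raise RuntimeError(f"{env_var} must decode to 32 bytes, got {len(raw)}")
    return value.encode("ascii")


# Demo only: put a throwaway key in the environment so the loader has something to read.
os.environ["CONVERSATION_FERNET_KEY"] = base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")
key = load_fernet_key()
```

The returned bytes are what you would pass as `fernet_key=`. Raising on a missing key, instead of quietly writing plaintext, keeps a misconfigured deployment from downgrading itself.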

It does not version conversation files. If you write a new format and try to read an old file, you may get parse errors. Schema migration is out of scope.

It does not handle concurrent writers. Two processes writing to the same file can overwrite or interleave each other's output and corrupt it. Use an inter-process file lock or a separate file per agent instance.
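
If a separate file per instance is not an option, a lock file is the simplest guard. The sketch below uses `os.open` with `O_CREAT | O_EXCL`, which fails when the lock file already exists. The name `exclusive_lock` and the `.lock` suffix are hypothetical, and the sketch does not recover from a stale lock left by a crashed process.

```python
import contextlib
import os
import tempfile
import time


@contextlib.contextmanager
def exclusive_lock(path, timeout=5.0, poll=0.05):
    """Hold `<path>.lock` for the duration of the block, or raise TimeoutError."""
    lock_path = path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the lock file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)  # release the lock for the next writer


session = os.path.join(tempfile.mkdtemp(), "session-42.jsonl")
with exclusive_lock(session):
    # Only one process at a time reaches this point for a given session file.
    with open(session, "a", encoding="utf-8") as f:
        f.write('{"role": "user", "content": "Hello"}\n')
```

Wrap every `write()` or `append()` call on a shared file in the same lock. A lock that only some writers respect protects nothing.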

Design Reasoning

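JSONL is the right format for conversations for the same reason it is the right format for logs: it is streamable, appendable, and line-delimited. You can process a JSONL conversation file with jq, grep, or any log aggregation tool without needing to load it all into memory. That applies to unencrypted files. Once Fernet encryption is on, the content on disk is ciphertext and has to be read back through the codec first.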
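
The streaming property is easy to see in a few lines. This sketch parses one message per line with a generator, so only the current line is held in memory. It applies to plaintext files only, and `iter_messages` is a hypothetical helper, not part of the library.

```python
import io
import json


def iter_messages(lines):
    """Yield one parsed message per non-empty line, without building a list."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# An in-memory stand-in for an open plaintext JSONL file.
sample = io.StringIO(
    '{"role": "user", "content": "What is the revenue for Q1 2024?"}\n'
    '{"role": "assistant", "content": "Q1 2024 revenue was $4.2M."}\n'
    '{"role": "user", "content": "What about Q2?"}\n'
)

# Count user turns while holding only one message at a time.
user_turns = sum(1 for message in iter_messages(sample) if message["role"] == "user")
```

With a real file, pass the open file object instead of the `StringIO` stand-in.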

The redact callable approach is intentional. Different deployments have different secrets. One system needs to redact bearer tokens. Another needs to redact credit card numbers. A third needs to redact names. A fixed regex would never cover all cases. A callable puts the redaction logic in the caller's hands.

Fernet is a symmetric authenticated encryption scheme from the cryptography package. It encrypts with AES-128 in CBC mode and authenticates with HMAC-SHA256 in one step. If the file is tampered with, decryption raises InvalidToken rather than returning garbage. It is a reasonable default for "encrypt this file on disk" without needing to choose AES modes yourself.

Writing after every turn rather than at the end of the session is the safer default. Sessions end unexpectedly. Writes are cheap. The only reason to batch writes is if you are in a latency-critical path where the file I/O matters, and in that case you already know what you are doing.

When This Applies (and When It Does Not)

Use this when:

You have a multi-turn agent that needs to survive process restarts
The conversation history may contain secrets or PII that should not persist in plaintext
You are on a shared host or in a regulated environment where data-at-rest encryption is required
You want the simplest possible conversation persistence without a database dependency

Skip it when:

You are already persisting conversations to a database (this is not a database replacement)
You need to search, filter, or embed past conversations (add a vector DB alongside this)
You are running stateless single-turn agents (no state to persist)
You need multi-writer concurrent access (use a database with proper locking instead)

Install or Quick-Start

pip install conversation-codec           # basic JSONL write/read + redaction
pip install "conversation-codec[crypto]" # adds Fernet encryption support

Minimal example:

from conversation_codec import ConversationCodec

codec = ConversationCodec(path="my-session.jsonl")
codec.write([{"role": "user", "content": "Hello"}, {"role": "assistant", "content": "Hi there."}])
print(codec.read())

GitHub: MukundaKatta/conversation-codec

Siblings Table

| Library | What it does | How it pairs with `conversation-codec` |
| --- | --- | --- |
| `agenttap` | Wire-level LLM request/response capture | Tap captures full payloads; codec persists the cleaned conversation |
| `agent-citation` | Claim-level citation tracking | Serialize citation store alongside conversation JSONL in the same session dir |
| `llm-redact-secrets` | Pattern-based secret redaction | Use as the `redact=` callable in `ConversationCodec` |
| `agent-scratchpad` | Keyed in-memory notepad for agent state | Persist scratchpad state in a separate JSONL alongside conversation |
| `agent-state-checkpoint` | Durable JSON checkpoint for agent state | Checkpoint agent state; codec checkpoints the conversation; both go in the same session dir |

What is Next

The next planned feature is partial read support. Today codec.read() loads the entire file. For very long sessions (1000+ turns), that is wasteful if you only need the last 20 turns for the next model call. A codec.read(last_n=20) that reads only the tail of the JSONL file would be more efficient.
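
Until something like that exists, a tail read can be approximated for plaintext files by reading blocks backward from the end. This is a sketch, not the planned API. The function `read_last_n` is hypothetical, and it assumes one message per line with no encryption, since the encrypted on-disk layout is not described here.

```python
import json
import os
import tempfile


def read_last_n(path, n, block_size=4096):
    """Return the last n messages by reading blocks backward from the end."""
    if n <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        buffer = b""
        # Read earlier blocks until the buffer holds more than n newlines
        # (so the last n lines are complete) or the file start is reached.
        while position > 0 and buffer.count(b"\n") <= n:
            step = min(block_size, position)
            position -= step
            f.seek(position)
            buffer = f.read(step) + buffer
    pieces = buffer.split(b"\n")
    if position > 0:
        pieces = pieces[1:]  # the first piece may be a partial line
    lines = [piece for piece in pieces if piece.strip()]
    return [json.loads(line) for line in lines[-n:]]


# Demo: 50 turns, small block size to force several backward reads.
path = os.path.join(tempfile.mkdtemp(), "long-session.jsonl")
with open(path, "w", encoding="utf-8") as f:
    for i in range(50):
        f.write(json.dumps({"role": "user", "content": f"turn {i}"}) + "\n")

tail = read_last_n(path, 3, block_size=64)
```

Working in bytes and splitting on `b"\n"` avoids decoding a multi-byte character that straddles a block boundary.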

Key rotation is also on the roadmap. If you rotate your Fernet key, you need to decrypt the old file and re-encrypt with the new key. A codec.rekey(new_key) helper would handle that without requiring you to implement the read-decrypt-write cycle yourself.

For the Hermes Agent Challenge sprint, conversation-codec is the persistence layer that makes multi-turn agents reliable. Pair it with agenttap for full wire-level traces, agent-citation for claim traceability, and llm-redact-secrets for the redaction callable. Together they give you a conversation that is durable, auditable, and safe to store.
