<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: BAOFUFAN</title>
    <description>The latest articles on DEV Community by BAOFUFAN (@_eb7f2a654e97a60ae9f96e).</description>
    <link>https://dev.to/_eb7f2a654e97a60ae9f96e</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3903614%2F88f4214a-aed8-4e71-a7f1-a6aca8cfe579.jpg</url>
      <title>DEV Community: BAOFUFAN</title>
      <link>https://dev.to/_eb7f2a654e97a60ae9f96e</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/_eb7f2a654e97a60ae9f96e"/>
    <language>en</language>
    <item>
      <title>How We Caught 90% of Our LLM Memory Bugs with Playwright</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Wed, 01 Jul 2026 12:04:23 +0000</pubDate>
      <link>https://dev.to/_eb7f2a654e97a60ae9f96e/how-we-caught-90-of-our-llm-memory-bugs-with-playwright-a84</link>
      <guid>https://dev.to/_eb7f2a654e97a60ae9f96e/how-we-caught-90-of-our-llm-memory-bugs-with-playwright-a84</guid>
      <description>&lt;p&gt;At 1 a.m., our product manager dropped a screenshot in the group chat: a user asked, “Repeat the order number I just gave you,” and our AI customer-service bot replied, “Please provide your order number.” The user cursed three times, and the screenshot ended up on a complaint platform.&lt;/p&gt;

&lt;p&gt;This wasn’t a model capability problem—the model had simply “forgotten” the context. Even worse, I manually tested the same conversation three times and couldn’t reproduce it once. Then I wrote 200 automated memory tests using Playwright, and 90% of the forgetfulness pointed to one embarrassingly basic problem: inconsistent serialization and deserialization in the front-end storage layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Breaking down the problem: how memory silently disappears
&lt;/h2&gt;

&lt;p&gt;Our customer-service bot is built on an LLM with LangChain, and the front end uses a React chat component. Conversation history is kept in the browser’s &lt;code&gt;localStorage&lt;/code&gt;. Every time a user refreshes or reopens the page, the front end reads the history from &lt;code&gt;localStorage&lt;/code&gt; and stitches it into the prompt to maintain long-context memory.&lt;/p&gt;

&lt;p&gt;Sounds great in theory. In reality, many users access the service from mobile browsers, and the page gets killed frequently by the OS. When the session restores, we rely on that locally cached conversation to rebuild memory. If anything goes wrong during storage or retrieval, the model suffers amnesia.&lt;/p&gt;

&lt;p&gt;Why didn’t manual testing catch it? Because we unconsciously test with the “correct” sequence—send message A, then message B, then ask about message A’s content. Real users, however, may refresh the page, switch apps, or even have their phone die and restart. The effect of these actions on the front-end storage is almost impossible to reproduce manually. Traditional unit tests only cover JS logic; they can’t test the complete “browser-level persistence and restoration” path.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Playwright for end-to-end memory verification?
&lt;/h2&gt;

&lt;p&gt;I needed a tool that could control a real browser, manipulate &lt;code&gt;localStorage&lt;/code&gt;, intercept network requests, and assert page content. Selenium is too heavy, Puppeteer only supports JS, Cypress is intrusive and has weak multi-tab support. Playwright supports Python, has a clean API, and can directly push arbitrary content into &lt;code&gt;localStorage&lt;/code&gt; via &lt;code&gt;page.evaluate()&lt;/code&gt;. It also simulates page refreshes and cross-page restores perfectly.&lt;/p&gt;

&lt;p&gt;Architecturally, I divided the tests into three layers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scenario layer&lt;/strong&gt;: defines memory test scenarios, e.g., “order number recallable after multiple conversation turns,” “memory not lost after refresh.”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Driver layer&lt;/strong&gt;: wraps Playwright operations and provides high-level actions like &lt;code&gt;send_message&lt;/code&gt; and &lt;code&gt;reload_and_restore&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verification layer&lt;/strong&gt;: asserts memory consistency from two dimensions—page text and &lt;code&gt;localStorage&lt;/code&gt; contents.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This way, even if the front-end framework changes from React to Vue, I only need to update selectors; the test logic stays intact.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core implementation: let Playwright repeat every bizarre user action
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Setting up a repeatable test environment
&lt;/h3&gt;

&lt;p&gt;We need a runnable chat page. I used a small Streamlit app as a stand-in (launched with &lt;code&gt;subprocess&lt;/code&gt; and killed after tests to ensure isolation). You can replace it with any front-end page.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;playwright.sync_api&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sync_playwright&lt;/span&gt;

&lt;span class="c1"&gt;# 启动一个本地 Streamlit 聊天应用，模拟 LLM 对话服务
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;start_chat_server&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;proc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Popen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;streamlit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;run&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chat_ui.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--server.port&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8510&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;stdout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DEVNULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DEVNULL&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# 等待服务就绪
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;proc&lt;/span&gt;

&lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;module&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;page&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;start_chat_server&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;sync_playwright&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chromium&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;headless&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new_context&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new_page&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:8510&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;
        &lt;span class="n"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;terminate&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This snippet solves “how to guarantee each test starts from a clean state.” &lt;code&gt;new_context()&lt;/code&gt; ensures isolated storage space, preventing cross-test contamination.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Simulating multi-turn conversation and asserting memory retention
&lt;/h3&gt;

&lt;p&gt;The test below is the core: the user first provides an order number, chats about unrelated topics, then asks the bot to repeat the number. We expect the model to correctly recall it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_order_id_recall_after_conversation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# 封装发送消息的函数，处理前端可能的异步渲染
&lt;/span&gt;    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# 找到输入框并键入消息。实际选择器按你的前端结构调整
&lt;/span&gt;        &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;textarea[aria-label=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;输入消息&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;button:has-text(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;发送&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# 等待 AI 回复出现（以最后一条气泡里的文本为准）
&lt;/span&gt;        &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wait_for_selector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.message.assistant:last-of-type .content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 用户提供订单号
&lt;/span&gt;    &lt;span class="nf"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;我的订单号是 ORD-9876，查一下物流&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# 中间几轮无关对话，目的是扩大上下文距离，更容易触达遗忘边界
&lt;/span&gt;    &lt;span class="nf"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;这个商品支持七天无理由吗？&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;发货一般要多久？&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# 关键轮次：再次询问订单号
&lt;/span&gt;    &lt;span class="nf"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;我刚才说的订单号是什么？你重复一遍&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 获取最后一条助手的回复文本
&lt;/span&gt;    &lt;span class="n"&gt;last_reply&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.message.assistant:last-of-type .content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# 核心断言：回复中必须包含之前给出的订单号
&lt;/span&gt;    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ORD-9876&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;last_reply&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;记忆丢失！模型没有复述订单号，实际回复: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;last_reply&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If this step fails, it’s classic long-context forgetting—something manual testing would likely never hit because nobody manually pads exactly the right number of distraction turns.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Testing memory recovery after refresh — the real danger zone
&lt;/h3&gt;

&lt;p&gt;When a user refreshes the page or restores it from the background, the front end reads from &lt;code&gt;localStorage&lt;/code&gt; to rebuild memory. Common serialization mistakes (e.g., storing and reading &lt;code&gt;bigint&lt;/code&gt;, &lt;code&gt;Date&lt;/code&gt; objects) cause the restored conversation to be truncated or empty, leading to immediate amnesia.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_memory_persists_after_reload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sender&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# 这个测试中我们直接操作 localStorage 模拟前置对话
&lt;/span&gt;        &lt;span class="c1"&gt;# 更贴近“用户之前聊过，现在刷新页面回来”的场景
&lt;/span&gt;        &lt;span class="k"&gt;pass&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the actual test script I’ll first inject a complete conversation into &lt;code&gt;localStorage&lt;/code&gt;, then reload the page, and send a new message referencing the conversation. If the LLM can’t recall, the serialization is broken. The code below demonstrates how to inject and verify.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_memory_after_reload_and_send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# 直接往 localStorage 里写入一段历史对话（模拟用户之前聊过）
&lt;/span&gt;    &lt;span class="n"&gt;existing_conversation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;我的订单号是 ORD-1234&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;好的，我记下了您的订单号 ORD-1234。&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;window.localStorage.setItem(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;chat_history&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, JSON.stringify(arguments[0]))&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;existing_conversation&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 刷新页面，模拟用户重新打开或刷新
&lt;/span&gt;    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reload&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="c1"&gt;# 等待页面加载完成
&lt;/span&gt;    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wait_for_selector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;textarea[aria-label=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;输入消息&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 续接对话：询问之前告知的订单号
&lt;/span&gt;    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;textarea[aria-label=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;输入消息&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;我之前说的订单号是什么？&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;button:has-text(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;发送&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wait_for_selector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.message.assistant:last-of-type .content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;last_reply&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.message.assistant:last-of-type .content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ORD-1234&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;last_reply&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;刷新后记忆丢失，回复内容: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;last_reply&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Verifying &lt;code&gt;localStorage&lt;/code&gt; contents directly — catching hidden corruption
&lt;/h3&gt;

&lt;p&gt;Sometimes the model still works in isolated tests but the stored data is silently mangled. By adding a direct assertion on the &lt;code&gt;localStorage&lt;/code&gt; string, we can detect inconsistencies early.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_localstorage_integrity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;textarea[aria-label=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;输入消息&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;我的会员ID是 VIP-2024&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;button:has-text(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;发送&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wait_for_selector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.message.assistant:last-of-type .content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;window.localStorage.getItem(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;chat_history&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;localStorage 中没有对话历史&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;parsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# 验证关键数据没有被序列化破坏
&lt;/span&gt;    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;VIP-2024&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;localStorage 中的数据丢失了原始信息&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This test alone caught 8 different serialization bugs during the sprint.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real culprit: serialization silently breaks your context
&lt;/h2&gt;

&lt;p&gt;Most people treat &lt;code&gt;JSON.stringify&lt;/code&gt; and &lt;code&gt;JSON.parse&lt;/code&gt; as transparent. They aren’t, especially when custom objects, &lt;code&gt;undefined&lt;/code&gt;, or non-serializable values sneak into the conversation array. In our case:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;code&gt;Date&lt;/code&gt; object from a message timestamp was directly stored, and after parsing it became a string, causing downstream code to skip that message because the valid-time check failed.&lt;/li&gt;
&lt;li&gt;Some messages had &lt;code&gt;content&lt;/code&gt; fields set to &lt;code&gt;undefined&lt;/code&gt; (due to streaming glitches). &lt;code&gt;JSON.stringify&lt;/code&gt; converts &lt;code&gt;undefined&lt;/code&gt; to &lt;code&gt;null&lt;/code&gt;, but the restoration logic expected &lt;code&gt;typeof&lt;/code&gt; checks on &lt;code&gt;undefined&lt;/code&gt;, causing an entire dialog block to disappear.&lt;/li&gt;
&lt;li&gt;On mobile browsers with tight storage quotas, partial writes happened, leaving the JSON structure broken.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After fixing all these, the memory test pass rate went from 32% to 96% overnight.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways: make memory testing a habit, not an afterthought
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Test memory logic in the browser, not in unit tests.&lt;/strong&gt; Serialization is a runtime concern; unit tests with mocked &lt;code&gt;localStorage&lt;/code&gt; won’t catch real-world quirks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Write adversarial memory scenarios.&lt;/strong&gt; Refresh, background-kill, and restore from partial state—these are exactly Playwright’s strengths.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verify at multiple layers.&lt;/strong&gt; Page output and raw &lt;code&gt;localStorage&lt;/code&gt; content must align.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automate relentlessly.&lt;/strong&gt; If you have an AI feature that depends on context, you should have hundreds of Playwright tests that wear down the context like a user with a habit of switching apps.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That 1 a.m. bug report was a gift. It pushed me to stop trusting manual memory tests and start treating front-end storage as a first-class citizen. Playwright gave me the power to simulate the chaotic behavior of real users, and those 200 tests now run on every CI build. Memory bugs? They show up in 90 seconds now, not in customer complaints.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you’re struggling with similar long-context memory issues, consider adding a “memory canary” test—a simple prompt-response pair that must survive whatever your front-end storage does. It might be the most productive 20 lines of code you write this week.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>Chasing Chroma’s Silent Data Loss: How Playwright E2E Tests Saved 72 Hours</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Wed, 01 Jul 2026 01:03:06 +0000</pubDate>
      <link>https://dev.to/_eb7f2a654e97a60ae9f96e/chasing-chromas-silent-data-loss-how-playwright-e2e-tests-saved-72-hours-3cog</link>
      <guid>https://dev.to/_eb7f2a654e97a60ae9f96e/chasing-chromas-silent-data-loss-how-playwright-e2e-tests-saved-72-hours-3cog</guid>
      <description>&lt;p&gt;At 2 AM, a message exploded in our user group: “Why does the AI assistant ask my name again? It remembered I’m Xiao Li yesterday.” I checked monitoring — memory normal, CPU stable, no errors in logs. After restarting the service, the memory came back — only to disappear again the next day. No error, no exception stack trace. Chroma had silently dropped the documents. That was when I realized: conventional unit tests are useless when it comes to vector databases.&lt;/p&gt;

&lt;h2&gt;
  
  
  Breaking down the problem
&lt;/h2&gt;

&lt;p&gt;This LLM app uses a typical RAG memory pipeline: user conversations are summarized and written into the vector store; later queries retrieve relevant memories to inject into prompts. Memory persistence relies on Chroma’s &lt;code&gt;persist()&lt;/code&gt;. In integration tests we mocked its response, wrote a few documents, queried them — all green. But production is a different story.&lt;/p&gt;

&lt;p&gt;The root cause is a combination of three factors. First, Chroma’s default &lt;code&gt;persist()&lt;/code&gt; does not flush to disk immediately; the underlying SQLite has its own caching strategy, so if the process exits unexpectedly, the last few seconds of writes can vanish. Second, we used async &lt;code&gt;add_texts&lt;/code&gt; but returned HTTP 200 before the underlying flush completed. Third, in our containerized deployment &lt;code&gt;/tmp/chroma-data&lt;/code&gt; sat on Kubernetes ephemeral storage, so it got reclaimed — the log simply said “Persist succeeded” while the directory was already gone. Regular unit tests can never simulate these edge cases because everything runs inside a single process, and mocking silently covers up the most dangerous parts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Designing the solution
&lt;/h2&gt;

&lt;p&gt;To catch this kind of “silent data loss”, tests must run in a real multi‑process environment with network latency and disk I/O, covering the full persistence lifecycle: &lt;strong&gt;write → service restart → read&lt;/strong&gt;. This is exactly where end‑to‑end testing shines.&lt;/p&gt;

&lt;p&gt;For tooling, we ruled out pure API tests — fast but they bypass the browser and real request chain, so you can’t verify that the whole call path from a frontend trigger truly persists data to disk. We also ruled out Selenium — too heavy and its async support is poor. We settled on &lt;strong&gt;Playwright + pytest‑asyncio&lt;/strong&gt; because it can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simulate multi‑turn conversations in a real browser&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;page.wait_for_*&lt;/code&gt; to control timing and capture the window where HTTP 200 is returned before data gets asynchronously lost&lt;/li&gt;
&lt;li&gt;Directly control service process restarts during tests, verifying whether data actually landed on disk&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The architecture follows “test‑as‑a‑user”: every case patterns like “open the page → send a message → close the browser → restart the service → open the page again → check if the historical memory still exists”. Only this brute‑force approach exposes persistence issues in vector databases.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core implementation
&lt;/h2&gt;

&lt;p&gt;Let’s start with a simplified app under test — a FastAPI service with a chat frontend and Chroma for memory, started with &lt;code&gt;uvicorn&lt;/code&gt;. The code below solves one problem: &lt;strong&gt;how to start/stop the service with Playwright and wrap it into a reusable fixture&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# conftest.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest_asyncio&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;playwright.async_api&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;async_playwright&lt;/span&gt;

&lt;span class="nd"&gt;@pytest_asyncio.fixture&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;function&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;app_service&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tmp_path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# 每个测试用例拥有独立的 Chroma 持久化目录，避免数据串扰
&lt;/span&gt;    &lt;span class="n"&gt;chroma_dir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tmp_path&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chroma_data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;chroma_dir&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mkdir&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# 启动被测试服务，设置环境变量指定持久化路径
&lt;/span&gt;    &lt;span class="n"&gt;proc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Popen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;uvicorn&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;main:app&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--host&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;127.0.0.1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--port&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8765&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CHROMA_PERSIST_DIR&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chroma_dir&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="nf"&gt;__import__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;os&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;stdout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PIPE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PIPE&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# 等待服务就绪
&lt;/span&gt;    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;AsyncClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://127.0.0.1:8765/health&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;break&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;pass&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;proc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;kill&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;RuntimeError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Service did not start&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;proc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;proc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;base_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://127.0.0.1:8765&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chroma_dir&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;chroma_dir&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;# 测试结束后强杀进程，模拟生产环境可能出现的非正常退出
&lt;/span&gt;    &lt;span class="n"&gt;proc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;kill&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;proc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wait&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This fixture manages an isolated data directory per test and the process lifecycle, ensuring a clean persistence environment while simulating unexpected termination. Next is the core test case that verifies whether memory survives a restart.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
python
# test_memory_persistence.py
import asyncio
import pytest
from playwright.async_api import async_playwright

pytestmark = pytest.mark.asyncio(loop_scope="module")

async def test_memory_survives_restart(app_service):
    """
    场景：用户说自己的名字，重启服务后，AI 仍然记得。
    这个测试直接揪出了 Chroma 的异步刷盘问题。
    """
    base = app_service["base_url"]

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()

        # Step 1: 用户首次对话，留下记忆
        await page.goto(base)
        await page.fill("#user-
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>12 Hours Debugging Memory Drift in a LangChain+Chroma Code QA Bot</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Tue, 30 Jun 2026 12:04:48 +0000</pubDate>
      <link>https://dev.to/_eb7f2a654e97a60ae9f96e/12-hours-debugging-memory-drift-in-a-langchainchroma-code-qa-bot-1jgc</link>
      <guid>https://dev.to/_eb7f2a654e97a60ae9f96e/12-hours-debugging-memory-drift-in-a-langchainchroma-code-qa-bot-1jgc</guid>
      <description>&lt;p&gt;At 2 a.m., I stared at the nonsensical reply in my terminal, a chill down my spine—just a moment ago I told it to “only look at the auth module”, and now it was recommending a frontend button color scheme. Building a local codebase Q&amp;amp;A bot with accurate long-conversation memory is far trickier than it seems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Problem Deconstruction
&lt;/h2&gt;

&lt;p&gt;Our scenario was clear: a Python monorepo with hundreds of thousands of lines, business logic scattered across hundreds of files. New hires and senior devs troubleshooting cross-module calls all wanted a bot that could “understand the entire codebase” and engage in multi-turn conversations—like chatting with a senior developer, not a Ctrl+F session from scratch.&lt;/p&gt;

&lt;p&gt;The typical approach is to build a RAG pipeline with LangChain: split code files → embed → store in a vector DB → upon user query, retrieve similar chunks → stitch them into a prompt for the LLM to generate an answer. You can find tons of tutorials and have it running in 10 minutes. But in practice, as soon as the conversation exceeds three turns, the bot starts "forgetting"—mixing up previously mentioned file paths, forgetting already confirmed context, or even blending two separate questions into one answer.&lt;/p&gt;

&lt;p&gt;The root causes are twofold:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The default &lt;code&gt;ConversationBufferMemory&lt;/code&gt; just blindly stuffs all message history into the prompt; once it exceeds token limits, it truncates—often just the part with critical context.&lt;/li&gt;
&lt;li&gt;The retrieval stage completely ignores conversation history, using only the latest user question to search the vector store. This renders referential queries like "modify the &lt;code&gt;login&lt;/code&gt; function in that file" utterly useless.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The vanilla setup works for demos but collapses under real conversations. I needed a code Q&amp;amp;A bot that truly remembers context, plus an automated testing method to continuously verify memory accuracy—otherwise, relying on manual eyeballing after every memory logic refactor would eventually lead to failure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Solution Design
&lt;/h2&gt;

&lt;p&gt;Core stack: &lt;strong&gt;LangChain + Chroma + custom layered memory&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Why not Pinecone or Weaviate? Because this is a local codebase with data that must stay on-prem; and during development, we need to frequently destroy and rebuild indexes. The lightweight embedded Chroma (based on SQLite + in-memory HNSW) is fast enough, has zero deployment dependencies, and persistence is just a directory—backup is as simple as copying the folder.&lt;/p&gt;

&lt;p&gt;For memory, I did not use LangChain's built-in &lt;code&gt;SummaryMemory&lt;/code&gt; because summarization loses details (like an exact module path), which is fatal for code Q&amp;amp;A. I designed a two-layer memory structure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Short-term window&lt;/strong&gt;: keep the last N full conversation turns without compression to preserve referential context.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long-term compression&lt;/strong&gt;: for turns beyond the window, use an LLM to extract key decisions (e.g., "user is investigating concurrency issues in the payment module") and inject them as a fixed prefix into the prompt.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most importantly, &lt;strong&gt;the retrieval query must be rewritten&lt;/strong&gt;. Before hitting the vector store, I feed the user's raw question along with the last three turns into the LLM to produce a de-referenced standalone query. For example: "What exception did it throw?" becomes "What exception did the &lt;code&gt;process_order&lt;/code&gt; function in &lt;code&gt;payment/service.py&lt;/code&gt; throw?".&lt;/p&gt;

&lt;p&gt;For automated testing, I designed a set of fixed conversation scripts, each ending with a question that requires memory of earlier context; then a script compares the answer with expected output to verify memory accuracy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Implementation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Code Indexing into Chroma (persisted locally)
&lt;/h3&gt;

&lt;p&gt;This code handles converting code files into searchable vectors; the key is the splitting strategy—using &lt;code&gt;Language.PYTHON&lt;/code&gt;'s separators to ensure chunks fall on function/class boundaries.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.document_loaders&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DirectoryLoader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TextLoader&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.text_splitter&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Language&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.embeddings&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HuggingFaceEmbeddings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.vectorstores&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Chroma&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="c1"&gt;# 加载代码目录，排除测试和虚拟环境
&lt;/span&gt;&lt;span class="n"&gt;loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DirectoryLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./src&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;glob&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**/*.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;loader_cls&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;TextLoader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;loader_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;encoding&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;exclude&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**/test_*&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**/__pycache__/*&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# 按 Python 语法边界切分，chunk 大小 1000，重叠 200 字符保留上下文
&lt;/span&gt;&lt;span class="n"&gt;splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_language&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Language&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PYTHON&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;splitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 使用本地 embedding 模型，避免网络依赖
&lt;/span&gt;&lt;span class="n"&gt;embedding_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HuggingFaceEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BAAI/bge-base-zh-v1.5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;device&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cpu&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;encode_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;normalize_embeddings&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 存入 Chroma，持久化目录指定即可重启保留
&lt;/span&gt;&lt;span class="n"&gt;vectordb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Chroma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embedding_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;persist_directory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./chroma_code_index&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;code_base&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;vectordb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;persist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Q&amp;amp;A Chain with Dual-Layer Memory
&lt;/h3&gt;

&lt;p&gt;This code tackles "how to make the LLM remember history without blowing the token limit" by combining a short-term window with long-term compression, and injecting a retrieval rewriting step.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chains&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ConversationalRetrievalChain&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.memory&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ConversationBufferWindowMemory&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ConversationSummaryMemory&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chat_models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PromptTemplate&lt;/span&gt;

&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-3.5-turbo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 短期记忆：保留最近 4 轮完整对话
&lt;/span&gt;&lt;span class="n"&gt;short_memory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ConversationBufferWindowMemory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;memor&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>Redis Persistence Pitfalls: A Data Loss Bug That Took Me 6 Hours to Uncover</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Tue, 30 Jun 2026 01:03:56 +0000</pubDate>
      <link>https://dev.to/_eb7f2a654e97a60ae9f96e/redis-persistence-pitfalls-a-data-loss-bug-that-took-me-6-hours-to-uncover-2mpk</link>
      <guid>https://dev.to/_eb7f2a654e97a60ae9f96e/redis-persistence-pitfalls-a-data-loss-bug-that-took-me-6-hours-to-uncover-2mpk</guid>
      <description>&lt;p&gt;At 2 a.m., the on-call phone jolted me awake. The SRE from the user service sounded panicked: “Redis was restarted on production, and now a ton of user sessions are gone — everyone’s logged out, the customer support lines are exploding.” As I pulled up the monitoring dashboards, my heart sank. This Redis instance was dedicated to storing user sessions. A year ago, we had explicitly requested persistence to be enabled, and it was configured with both RDB and AOF for safety — so how on earth did we lose data? By daybreak, I finally pieced together the root cause. Along the way, I built an automated verification system with Pytest and Docker to ensure this never happens again. Let me walk you through what went wrong and how this validation tool can help you avoid the same trap.&lt;/p&gt;

&lt;h2&gt;
  
  
  Breaking Down the Problem: Why Does Data Still Disappear When Persistence Is Enabled?
&lt;/h2&gt;

&lt;p&gt;The setup was straightforward: we used Redis to store web sessions and required data to survive a restart. The ops team had configured RDB (&lt;code&gt;save 900 1&lt;/code&gt;) and AOF (&lt;code&gt;appendfsync everysec&lt;/code&gt;), which looked rock solid. But the failure sequence went like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The container had been running for 2 hours; the last RDB save was 30 minutes ago.&lt;/li&gt;
&lt;li&gt;A rolling restart was triggered; the Redis process was gracefully shut down with &lt;code&gt;docker stop&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;After the restart, we discovered that among the nearly 20,000 sessions written in the past 30 minutes, around 10,000 were lost.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The root cause is that the combination of RDB and AOF does not handle the scenario of “writing a large volume of data in a short time and then immediately restarting.” The &lt;code&gt;save 900 1&lt;/code&gt; rule means that if only 1 key changes, Redis would wait 900 seconds before saving to disk. Meanwhile, AOF with &lt;code&gt;appendfsync everysec&lt;/code&gt; flushes every second, but &lt;code&gt;docker stop&lt;/code&gt; gives the process 10 seconds to exit by default. Upon receiving SIGTERM, the Redis process tries to flush its AOF buffer to disk. The problem is: if the amount of data is too large and the flush takes longer than 10 seconds, Docker sends SIGKILL, leaving an incomplete AOF file. When Redis restarts and detects the corrupted AOF file, it either truncates the tail or skips repair based on its configuration — thus the unflushed data simply vanishes.&lt;/p&gt;

&lt;p&gt;Why don’t typical approaches work? In the past, to test persistence, we would spin up a local Docker container, &lt;code&gt;SET&lt;/code&gt; a few keys, restart, and see if the data was still there. This “eyeball test” cannot simulate massive writes and race conditions near timeout boundaries, nor can it cover various “unexpected kill” scenarios. So we absolutely need an automated, repeatable validation method that covers real-world failure modes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Designing the Solution: Parameterized Validation with Pytest and Docker
&lt;/h2&gt;

&lt;p&gt;I needed to do three things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Quickly spin up a Redis container with a predefined persistence policy.&lt;/li&gt;
&lt;li&gt;Simulate real write pressure.&lt;/li&gt;
&lt;li&gt;Kill the container in different ways, then restart it and check data consistency.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For the test framework, I went with Pytest because of its powerful parameterization — you can easily generate test combinations of different persistence configurations and failure modes. For container management, I chose &lt;code&gt;testcontainers-python&lt;/code&gt;. It automatically handles container lifecycles, leaves no dirty environments, and is easy to run in CI. Why not use docker-compose? Because I needed to run dozens of configurations with the same image; writing a bunch of YAML files would be tedious and inflexible. With testcontainers, I can dynamically create configurations in code and even mount a custom &lt;code&gt;redis.conf&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The overall architecture: a &lt;code&gt;conftest.py&lt;/code&gt; provides a parameterizable Redis container fixture. Test cases verify two scenarios: “graceful shutdown” and “unexpected power loss.” Finally, using Pytest’s &lt;code&gt;--csv&lt;/code&gt; or Allure, we generate a report that clearly shows data loss under each strategy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Implementation: Building the Automated Validation Step by Step
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Define a container fixture that can configure Redis on demand&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This snippet solves the problem of “same image, different persistence policies.” We use testcontainers’ Redis module but directly pass command-line arguments or mount a custom configuration file. For precise control, I chose to mount a custom &lt;code&gt;redis.conf&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# conftest.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;testcontainers.redis&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RedisContainer&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;testcontainers.core.container&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DockerContainer&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tempfile&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_write_conf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conf_content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;把配置写入临时文件，返回宿主机路径&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;tmp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tempfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;NamedTemporaryFile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;w&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;delete&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;suffix&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.conf&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;tmp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conf_content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;tmp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;tmp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;

&lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;redis_container&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    按参数化需求创建Redis容器，挂载自定义redis.conf
    request.param 格式示例: {&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;save&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1 1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;appendfsync&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;no&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;aof&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;no&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;}
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;param&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;
    &lt;span class="n"&gt;save_rule&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;save&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="c1"&gt;# 如 "900 1"
&lt;/span&gt;    &lt;span class="n"&gt;appendfsync&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;appendfsync&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;no&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;use_aof&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;aof&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;no&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;conf_lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;save_rule&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;conf_lines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;save &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;save_rule&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;conf_lines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;save &lt;/span&gt;&lt;span class="se"&gt;\"\"&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;           &lt;span class="c1"&gt;# 完全关闭RDB
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;use_aof&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;yes&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;conf_lines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;appendonly yes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;conf_lines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;appendfsync &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;appendfsync&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;conf_lines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;appendonly no&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;conf_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_write_conf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conf_lines&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# 构建容器，挂载配置并启动
&lt;/span&gt;    &lt;span class="n"&gt;container&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DockerContainer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis:7-alpine&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;with_volume_mapping&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conf_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/usr/local/etc/redis/redis.conf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;with_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis-server /usr/local/etc/redis/redis.conf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;with_exposed_ports&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6379&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;redis_host&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_contain&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>LangChain Chatbot Session Mix-Up: When 100 Concurrent Users Got Each Other's Replies</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Mon, 29 Jun 2026 12:04:32 +0000</pubDate>
      <link>https://dev.to/_eb7f2a654e97a60ae9f96e/langchain-chatbot-session-mix-up-when-100-concurrent-users-got-each-others-replies-31bd</link>
      <guid>https://dev.to/_eb7f2a654e97a60ae9f96e/langchain-chatbot-session-mix-up-when-100-concurrent-users-got-each-others-replies-31bd</guid>
      <description>&lt;p&gt;At 2 AM, my phone blared. It was our ops guy, pulling me out of a dream: “Users are posting screenshots in the support group — they’re seeing other people’s order numbers inside our AI assistant. The group is going crazy.”&lt;br&gt;&lt;br&gt;
I logged onto the server, checked the logs, and my brain went blank. Multiple requests had their conversation histories all jumbled together. User A asked “Where is my delivery?” and the bot answered with User B’s address. It felt like locking your door, coming home, and finding a stranger sitting on your couch.&lt;/p&gt;
&lt;h2&gt;
  
  
  Breaking It Down: Why User Sessions “Cross Wires”
&lt;/h2&gt;

&lt;p&gt;Our setup was straightforward: a multi-turn chatbot built with LangChain, backed by a large language model, served via FastAPI. Each user was supposed to have an independent conversation memory, never interfering with one another. Unit tests and feature tests all passed. Management gave the green light, and we rolled out a canary release.&lt;/p&gt;

&lt;p&gt;The moment traffic hit, everything fell apart.&lt;/p&gt;

&lt;p&gt;The root cause lies in the most common usage of LangChain’s &lt;code&gt;ConversationBufferMemory&lt;/code&gt;: it stores conversation history in Python’s in-process memory. When concurrent requests pour in, if you don’t deliberately isolate by &lt;code&gt;session_id&lt;/code&gt;, all requests share the same &lt;code&gt;memory&lt;/code&gt; instance. User A’s history gets contaminated with User B’s messages, the model receives dirty context, and data cross-contamination occurs.&lt;/p&gt;

&lt;p&gt;There’s an even sneakier trap: even if you &lt;code&gt;deepcopy&lt;/code&gt; the memory on every request, if you fail to atomically combine “load history by session” and “write back after the call,” concurrent writes will still overwrite each other. User B can end up reading leftover history from User A. This problem never shows up in sequential single-user testing; it waits patiently until concurrency arrives.&lt;/p&gt;
&lt;h2&gt;
  
  
  Design Decisions: Why We Didn’t Pick the Alternatives
&lt;/h2&gt;

&lt;p&gt;What we fundamentally needed was &lt;strong&gt;a conversation-history store keyed by &lt;code&gt;session_id&lt;/code&gt;, persistent across requests, and safe for concurrent reads and writes&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A few paths were on the table:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;In-process dict + lock&lt;/strong&gt;: the simplest approach, but it can’t span processes, all data disappears on restart, and the lock would strangle concurrency performance. Immediately ruled out.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PostgreSQL / MySQL&lt;/strong&gt;: they can store the data, but chat scenarios are read-heavy with occasional writes. Frequent &lt;code&gt;SELECT&lt;/code&gt;s to retrieve history would amplify database pressure, and adding JSON serialization on top makes the latency not worth it.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis&lt;/strong&gt;: in-memory read/write speed, native support for List data structures to hold multiple messages, the ability to set TTL per session for automatic expiration, and a single instance can easily handle tens of thousands of concurrent operations. On top of that, LangChain’s &lt;code&gt;BaseChatMessageHistory&lt;/code&gt; interface already provides a slot to swap in alternative storage, making the implementation cost extremely low.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So our final choice: &lt;strong&gt;implement &lt;code&gt;RedisChatMessageHistory&lt;/code&gt;, store each session’s messages in a Redis List keyed as &lt;code&gt;messages:{session_id}&lt;/code&gt;, and then let LangChain’s &lt;code&gt;RunnableWithMessageHistory&lt;/code&gt; automatically complete the cycle of “load on request, write after response.”&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Core Implementation: Three Steps to Isolation
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Implement a custom &lt;code&gt;RedisChatMessageHistory&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This piece solves &lt;em&gt;how to persist conversation history into Redis&lt;/em&gt;. We inherit from &lt;code&gt;BaseChatMessageHistory&lt;/code&gt;, use a Redis List to hold the messages, and serialize each message as a JSON string.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.chat_history&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseChatMessageHistory&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.messages&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;messages_from_dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;message_to_dict&lt;/span&gt;

&lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RedisChatMessageHistory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseChatMessageHistory&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;使用 Redis List 存储对话历史，每个 session_id 一个 key&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;ttl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;          &lt;span class="c1"&gt;# 默认 1 小时过期，避免僵尸 key
&lt;/span&gt;        &lt;span class="n"&gt;key_prefix&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chat_history:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;redis_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis_client&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ttl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ttl&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key_prefix&lt;/span&gt;&lt;span class="si"&gt;}{&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="nd"&gt;@property&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;BaseMessage&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="c1"&gt;# 从 Redis List 中读取全部消息，范围 0 到 -1 保证不丢数据
&lt;/span&gt;        &lt;span class="n"&gt;raw_messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lrange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# 每一条存储时是 JSON 字符串，这里反序列化回 LangChain 消息对象
&lt;/span&gt;        &lt;span class="n"&gt;dict_messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;raw_messages&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;messages_from_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dict_messages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;BaseMessage&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# message_to_dict 将 BaseMessage 转成 dict，再 dump 成 JSON 字符串
&lt;/span&gt;        &lt;span class="n"&gt;serialized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;message_to_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="c1"&gt;# 写入 Redis List 右侧，保持时间顺序
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rpush&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;serialized&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# 刷新过期时间，活跃用户的历史保持不过期
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;expire&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ttl&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 2: Create a history factory to load and write per session automatically&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This part solves &lt;em&gt;how each request gets its own history instance&lt;/em&gt;. We need a factory function so that &lt;code&gt;RunnableWithMessageHistory&lt;/code&gt; instantiates our &lt;code&gt;RedisChatMessageHistory&lt;/code&gt; using the &lt;code&gt;session_id&lt;/code&gt; on every call.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;runnables&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;history&lt;/span&gt; &lt;span class="n"&gt;impo&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>Redis Crash Recovery Testing: The 2 Bugs Manual Testing Missed</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Mon, 29 Jun 2026 01:04:28 +0000</pubDate>
      <link>https://dev.to/_eb7f2a654e97a60ae9f96e/redis-crash-recovery-testing-the-2-bugs-manual-testing-missed-40e4</link>
      <guid>https://dev.to/_eb7f2a654e97a60ae9f96e/redis-crash-recovery-testing-the-2-bugs-manual-testing-missed-40e4</guid>
      <description>&lt;p&gt;At 3 a.m., an ops colleague called urgently: "The AI customer service bot's memories are all messed up—users were just chatting about their orders, and the next second the bot forgot everything. We found that Redis had crashed and restarted." I opened monitoring and saw that after Redis restarted, many session memories had been partially lost, but not completely—it was like "selective forgetting." This was worse than total loss—no one noticed the data was corrupted until users started complaining. Ultimately, the root cause pointed to one thing: &lt;strong&gt;we had never truly automated the verification of memory consistency after Redis crash recovery&lt;/strong&gt;. Every time, we manually typed &lt;code&gt;redis-cli&lt;/code&gt; to test a few keys and concluded "probably okay."&lt;/p&gt;

&lt;p&gt;The deception of manual testing is that the failure scenarios you simulate are always idealized. You flush, kill, check a few data entries, and everything looks fine. But in real production, Redis can lose partial data in ways you never imagined—due to differences between RDB and AOF, fsync policies, and the loading order of mixed persistence during restarts. In this article, I'll break down the painful lesson and the solution: Redis crash recovery consistency verification based on Pytest + Testcontainers, and reveal the two hidden bugs that could only be caught by automated testing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Problem Breakdown — Why Manual Testing Misses "Selective Forgetting"?
&lt;/h2&gt;

&lt;p&gt;We use Redis as the short-term memory store for LLM. Each session's memories are stored in a List under the key &lt;code&gt;session:&amp;lt;id&amp;gt;:messages&lt;/code&gt;, and each message is a serialized JSON. The recovery requirement is simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;No matter how Redis crashes and restarts, the memories must either be all there or all gone (when full persistence is used), never half-present.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The real scenario is more complex: the failure could happen right after an RDB snapshot is completed, while AOF is rewriting, or in the middle of mixed persistence. When simulating manually, it's nearly impossible to kill Redis at exactly those precise moments. You can only do a few &lt;code&gt;docker restart&lt;/code&gt;s, then &lt;code&gt;LRANGE&lt;/code&gt; to check, see the data still there, and sign off. But the key to reproducing "selective forgetting" lies in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bug1&lt;/strong&gt;: With only RDB enabled (no AOF), after a restart, &lt;strong&gt;all incremental writes since the last snapshot are lost&lt;/strong&gt;, while developers assumed automatic saving.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bug2&lt;/strong&gt;: With AOF enabled but &lt;code&gt;appendfsync everysec&lt;/code&gt;, when the process is killed abruptly, &lt;strong&gt;the last 1–2 seconds of write operations disappear&lt;/strong&gt;, causing the memory list to lack a few tail entries, but the old data can still be read from the list, making it very difficult to spot with the naked eye.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These two issues are almost invisible with manual &lt;code&gt;redis-cli get&lt;/code&gt; because you tend to pick a specific key to check, and if that key happens to fall within the snapshot range or before the AOF fsync, you'll be tricked. Only automated, parameterized, and repeatable fault injection testing can systematically expose them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Solution Design — Why Pytest + Testcontainers?
&lt;/h2&gt;

&lt;p&gt;To simulate real Redis crash recovery, there are several hard requirements:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Each test must start with a &lt;strong&gt;clean Redis instance with configurable persistence policies&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Be able to &lt;strong&gt;force-kill the Redis process at any point in the code execution&lt;/strong&gt;, then restart and verify data integrity.&lt;/li&gt;
&lt;li&gt;Tests need to be fully programmable and CI-integrable, not manual commands.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Common alternatives were all rejected:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;fakeredis&lt;/strong&gt;: purely in-memory simulation, no real persistence or process crashes—totally incapable of testing fault recovery.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;docker-compose + manual scripts&lt;/strong&gt;: decent environment isolation, but fault injection is limited to &lt;code&gt;docker kill&lt;/code&gt;, can't trigger at precise code logic points, and hard to parameterize and assert.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Direct host Redis operations&lt;/strong&gt;: dirty environment, interference between tests, impossible for CI.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Testcontainers for Python&lt;/strong&gt; hits all these points perfectly: it starts a real Redis container directly in test code, we can obtain a connection via &lt;code&gt;with redis_container&lt;/code&gt;, and at any moment call &lt;code&gt;container.stop()&lt;/code&gt; then &lt;code&gt;container.start()&lt;/code&gt; to simulate a crash restart. Combined with Pytest's fixture mechanism, each test case gets an independent Redis lifecycle, and the persistence configuration file can be injected via mounted volumes. The entire verification becomes &lt;strong&gt;repeatable unit tests&lt;/strong&gt;, with fault injection granularity down to the line of code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Implementation — Building Memory Consistency Tests Step by Step
&lt;/h2&gt;

&lt;p&gt;Let's dive into the code. Assume a memory storage class &lt;code&gt;RedisMemoryStore&lt;/code&gt; provides &lt;code&gt;append_message(session_id, msg)&lt;/code&gt; and &lt;code&gt;get_messages(session_id)&lt;/code&gt;, backed by a Redis List. Now we need to write a test to verify its flaws under the "RDB without AOF" configuration.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dependency Installation and Container Fixture (conftest.py)
&lt;/h3&gt;

&lt;p&gt;This fixture solves the “fresh Redis per test” problem and allows passing persistence configuration via an environment variable.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# conftest.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;testcontainers.redis&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RedisContainer&lt;/span&gt;

&lt;span class="c1"&gt;# 自定义 Redis 容器，允许传入配置文件
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ConfigurableRedisContainer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RedisContainer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis:7-alpine&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6379&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;config_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;config_path&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_configure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;_configure&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;config_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# 挂载配置文件到容器
&lt;/span&gt;            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;volumes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;config_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bind&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/usr/local/etc/redis/redis.conf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ro&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;with_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis-server /usr/local/etc/redis/redis.conf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;function&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;redis_no_aof&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;仅开启 RDB 的 Redis 实例（Bug1 专用）&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# 配置文件内容：只打开 RDB，关闭 AOF
&lt;/span&gt;    &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    save 900 1
    save 300 10
    save 60 10000
    appendonly no
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tempfile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;tempfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;NamedTemporaryFile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;w&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;suffix&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.conf&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;delete&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;config_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;container&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ConfigurableRedi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>After 3 Years of Mocking Redis, We Missed 60% of Edge Cases in Memory Store Tests</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Sun, 28 Jun 2026 12:03:44 +0000</pubDate>
      <link>https://dev.to/_eb7f2a654e97a60ae9f96e/after-3-years-of-mocking-redis-we-missed-60-of-edge-cases-in-memory-store-tests-5goe</link>
      <guid>https://dev.to/_eb7f2a654e97a60ae9f96e/after-3-years-of-mocking-redis-we-missed-60-of-edge-cases-in-memory-store-tests-5goe</guid>
      <description>&lt;p&gt;2 AM. Production alarms start screaming — the chatbot's memory module is dropping user contexts. Digging into the logs: &lt;code&gt;JSONDecodeError&lt;/code&gt;, caused by inconsistent datetime serialization in a session object. “But the tests are all green!” I said. A teammate replied, “Our Redis tests are always mocked.” That sent a chill down my spine.&lt;/p&gt;

&lt;p&gt;Mocking shields you from every real‑world failure, while CI keeps handing you false green lights.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Mocked Tests Make Your Redis Memory Storage a Paper Tiger
&lt;/h2&gt;

&lt;p&gt;LLM memory storage almost always uses Redis: low latency, built‑in expiration, atomic operations. The typical pattern is giving each session a key, serializing the conversation history to JSON, stuffing it in, and setting a TTL.&lt;/p&gt;

&lt;p&gt;A mocked test looks like this: use &lt;code&gt;mock.patch&lt;/code&gt; to swap out &lt;code&gt;redis.Redis&lt;/code&gt;, then verify that &lt;code&gt;set&lt;/code&gt;/&lt;code&gt;get&lt;/code&gt; were called with the right arguments. If all you need is “did I invoke Redis?”, that’s enough. But the real‑world traps that mocks can’t replicate are exactly where things break:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Serialization / deserialization edge cases&lt;/strong&gt; — datetime, Decimal, custom objects … different Redis client versions behave differently. Mock won’t exercise those.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connection timeouts and retries&lt;/strong&gt; — &lt;code&gt;setex&lt;/code&gt; times out, raises &lt;code&gt;TimeoutError&lt;/code&gt;. Does your retry logic actually work? Mock just returns &lt;code&gt;True&lt;/code&gt;; it will never hit the &lt;code&gt;ConnectionError&lt;/code&gt; branch.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory eviction and expiry policies&lt;/strong&gt; — when &lt;code&gt;maxmemory-policy&lt;/code&gt; is &lt;code&gt;allkeys-lru&lt;/code&gt;, a &lt;code&gt;GET&lt;/code&gt; can suddenly return &lt;code&gt;None&lt;/code&gt;. A mock dictionary keeps everything forever.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Concurrency and atomicity&lt;/strong&gt; — &lt;code&gt;setnx&lt;/code&gt; for distributed locks. Two coroutines racing to acquire the lock? A mock dict is thread‑safe; real Redis exposes the race condition.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Over the last six months we hit three memory‑storage incidents in production: a JSON deserialization crash, a TTL accidentally set to 0 that made keys live forever, and an unhandled &lt;code&gt;MOVED&lt;/code&gt; redirection during a Redis failover. &lt;strong&gt;All three times, the mocked tests passed.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That’s why I say we missed 60% of edge cases — no exaggeration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Testcontainers Instead of In‑Memory Redis or Shared Environments
&lt;/h2&gt;

&lt;p&gt;If you want to test real Redis behavior, you have a few options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Shared integration environment&lt;/strong&gt; — multiple branches share one Redis instance, data gets mixed up, you wait for ops to reset between runs. Slow and unstable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embedded Redis&lt;/strong&gt; (like &lt;code&gt;embedded-redis&lt;/code&gt; or &lt;code&gt;fakeredis&lt;/code&gt;) — &lt;code&gt;fakeredis&lt;/code&gt; is excellent for pure‑Python simulation, but it’s still a simulation. Advanced features like eviction or failover don’t match real Redis 100%. If you’ve been burned before, you know &lt;code&gt;fakeredis&lt;/code&gt;'s &lt;code&gt;expire&lt;/code&gt; behavior can diverge from real Redis.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Testcontainers&lt;/strong&gt; — spin up a real Redis container per test session, tear it down when done. ✅ Real behavior ✅ Total isolation ✅ Near‑instant startup after the initial &lt;code&gt;docker pull&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Testcontainers standardizes “pull Docker images in tests.” Our project landed on &lt;strong&gt;Testcontainers for Python&lt;/strong&gt; because it integrates with pytest beautifully, and it works identically locally and in CI (as long as Docker is available).&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Implementation: From Spinning Up a Container to Deterministic Integration Tests
&lt;/h2&gt;

&lt;p&gt;You only need &lt;code&gt;pip install testcontainers[redis] redis&lt;/code&gt;, then follow along.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. A Reusable Redis Container Fixture
&lt;/h3&gt;

&lt;p&gt;What this solves: &lt;strong&gt;share one Redis container across a test class, but give each test case a separate prefix or database to avoid data cross‑contamination.&lt;/strong&gt; Using &lt;code&gt;scope="class"&lt;/code&gt; balances startup cost and isolation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# tests/conftest.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;testcontainers.redis&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RedisContainer&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ConnectionPool&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt;

&lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;class&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;redis_container&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;启动 Redis 7 容器，暴露随机端口，自动回收&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;RedisContainer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis:7-alpine&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# 获取映射后的端口和主机
&lt;/span&gt;        &lt;span class="n"&gt;host&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_container_host_ip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_exposed_port&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6379&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;

&lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis_container&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;每个测试函数一个独立的 Redis 客户端，自动选择未使用的 db 编号&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis_container&lt;/span&gt;
    &lt;span class="c1"&gt;# db 编号隔离：取 hash 保证不冲突，实际项目可用 0-15 轮转
&lt;/span&gt;    &lt;span class="n"&gt;test_db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;())))&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
    &lt;span class="n"&gt;pool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ConnectionPool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;test_db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decode_responses&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;connection_pool&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;
    &lt;span class="c1"&gt;# 清理当前 db，不留痕
&lt;/span&gt;    &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;flushdb&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;disconnect&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why &lt;code&gt;RedisContainer("redis:7-alpine")&lt;/code&gt;? Alpine images are small and pull quickly; pinning to major version 7 prevents test behavior from shifting with a new Redis release. If your CI environment already caches the image, cold start is under 3 seconds.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. A Real, Disk‑Backed Memory Storage Class
&lt;/h3&gt;

&lt;p&gt;What this solves: &lt;strong&gt;a &lt;code&gt;MemoryStore&lt;/code&gt; that will actually be used in production, containing serialization, expiration, timeout retries — things that mocking never tests.&lt;/strong&gt; This class is a simplified version of our real business code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/memory.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Redis&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;redis.exceptions&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nb"&gt;TimeoutError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;ConnectionError&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MemoryStore&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default_ttl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3600&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;default_ttl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;default_ttl&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Save session context, serialize datetime to ISO format.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;serialized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_serializer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;default_ttl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;serialized&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;except &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;TimeoutError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;ConnectionError&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="c1"&gt;# Retry logic
&lt;/span&gt;            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Load and deserialize session context.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;object_hook&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_deserializer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;except &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;TimeoutError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;JSONDecodeError&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

    &lt;span class="nd"&gt;@staticmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_serializer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;TypeError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Type &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; not serializable&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nd"&gt;@staticmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_deserializer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dct&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Rebuild datetime from ISO strings
&lt;/span&gt;        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;dct&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;dct&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromisoformat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;pass&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;dct&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Integration Tests That Cover the 60% Hidden Edge Cases
&lt;/h3&gt;

&lt;p&gt;What this set of tests proves: &lt;strong&gt;serialization exceptions, timeout retries, key absence after eviction, and TTL verification — real behaviors that mock tests would completely miss.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# tests/test_memory_integration.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;app.memory&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MemoryStore&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;redis.exceptions&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nb"&gt;TimeoutError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;ConnectionError&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TestMemoryStoreIntegration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_save_and_load_basic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MemoryStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user:123&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Alice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;loaded&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;loaded&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Alice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="c1"&gt;# 确保 datetime 正确反序列化
&lt;/span&gt;        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loaded&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_json_decode_error_caught&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MemoryStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;corrupted:session&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="c1"&gt;# 直接写入非法 JSON，模拟生产环境脏数据
&lt;/span&gt;        &lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{this is not json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;  &lt;span class="c1"&gt;# 应捕获异常并优雅降级
&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_ttl_expiry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MemoryStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default_ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# 1 秒过期
&lt;/span&gt;        &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ephemeral:user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tmp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;  &lt;span class="c1"&gt;# Key 应已过期
&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_connection_timeout_retry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mocker&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;模拟 TimeoutError 时重试逻辑被触发&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MemoryStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# 第一次调用抛出 TimeoutError，第二次正常
&lt;/span&gt;        &lt;span class="n"&gt;mocker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;patch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;setex&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;side_effect&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;TimeoutError&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;retry:session&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;val&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;  &lt;span class="c1"&gt;# 当前实现：超时直接返回 False
&lt;/span&gt;        &lt;span class="c1"&gt;# 若想验证重试次数，需增强实现，但至少覆盖了异常分支
&lt;/span&gt;
    &lt;span class="nd"&gt;@pytest.mark.parametrize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;policy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;allkeys-lru&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;volatile-lru&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_eviction_returns_none&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;redis_container&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;验证内存淘汰后 GET 返回 None（需要真实 Redis）&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis_container&lt;/span&gt;
        &lt;span class="c1"&gt;# 为这个测试启动一个内存极小的容器
&lt;/span&gt;        &lt;span class="c1"&gt;# 实际项目中用 Testcontainers 再起一个定制 Redis，这里省略详细配置
&lt;/span&gt;        &lt;span class="c1"&gt;# 关键点：当 maxmemory 写满时，前一个 key 可能被逐出
&lt;/span&gt;        &lt;span class="c1"&gt;# (示例演示思路，完整实现需定制 Redis 配置)
&lt;/span&gt;        &lt;span class="k"&gt;pass&lt;/span&gt;  &lt;span class="c1"&gt;# 占位表示真实测试结构
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice how we test the actual serialization format (datetime → ISO → back), graceful degradation on corrupt data, real TTL expiration, and timeout branches. These are exactly the paths that mock‑based tests ignore.&lt;/p&gt;

&lt;h2&gt;
  
  
  The “No Excuses” CI Strategy: Speed Meets Reality
&lt;/h2&gt;

&lt;p&gt;“But Testcontainers is slow.” That’s only true if you start a container for every single test. With a &lt;strong&gt;class‑scoped fixture&lt;/strong&gt;, you pay the container startup cost once per test class (usually under 3 seconds), and each test merely switches databases. In CI we run these integration tests alongside unit tests; the total suite still completes in under 2 minutes.&lt;/p&gt;

&lt;p&gt;We configure CI with a step that caches the Docker image:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/test.yml (fragment)&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Cache Redis image&lt;/span&gt;
  &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/cache@v3&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;~/.docker/images&lt;/span&gt;
    &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis7-alpine-${{ hashFiles('**/requirements.txt') }}&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pull image (if not cached)&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker pull redis:7-alpine&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run tests&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pytest --cov=app tests/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even without caching, the pull is a one‑time cost. The real payoff is catching memory‑store bugs &lt;strong&gt;before&lt;/strong&gt; they wake you up at 2 AM.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing: Replace the Fake Green Light with Real Confidence
&lt;/h2&gt;

&lt;p&gt;We’ve been using this Testcontainers‑backed approach for six months now. The three production incidents we had? All of them would have been caught by the integration tests above. No more phantom passes. No more wondering why production “suddenly” broke while CI was perfectly green.&lt;/p&gt;

&lt;p&gt;If your team is still mocking Redis for memory storage, do this one thing: pull in Testcontainers, write a single integration test that saves a datetime and reads it back. In five minutes you’ll see the difference between “the mock says it worked” and “it actually works.”&lt;/p&gt;

&lt;p&gt;Sleep better. Ship with confidence. And let your tests actually talk to the same Redis that runs in production.&lt;/p&gt;

</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>How I Fixed LangChain’s Agent Memory Loss with 23 Pytest Cases</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Sun, 28 Jun 2026 01:03:24 +0000</pubDate>
      <link>https://dev.to/_eb7f2a654e97a60ae9f96e/how-i-fixed-langchains-agent-memory-loss-with-23-pytest-cases-41nb</link>
      <guid>https://dev.to/_eb7f2a654e97a60ae9f96e/how-i-fixed-langchains-agent-memory-loss-with-23-pytest-cases-41nb</guid>
      <description>&lt;p&gt;At 1 AM, my boss @-ed me in the company chat: “What’s wrong with the customer service bot? The user just said their name, and the very next message it asks again — like a goldfish.” I stumbled to my laptop and pulled up the logs. Sure enough, the context queue for that session_id in Redis was clean as a whistle, as if nothing had ever been written. That was the fourth “memory evaporation” incident this month. Every previous time, we’d papered over it with a service restart and a redeploy. But that night I decided to stop coddling the bug — I was going to test the AI agent’s context persistence logic from top to bottom with Pytest.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where exactly does an agent’s memory get lost?
&lt;/h2&gt;

&lt;p&gt;An AI agent’s “memory” is basically a list of messages. With each turn, you push the user’s input and the AI’s reply into that list. On the next request, you pull out the history and prepend it to the prompt so the model knows what was said before. A typical implementation looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.memory&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ConversationBufferMemory&lt;/span&gt;
&lt;span class="n"&gt;memory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ConversationBufferMemory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;return_messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# After each turn: memory.save_context({"input": user_msg}, {"output": ai_msg})
# Later: memory.load_memory_variables({}) to get the history
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When we wire this up with Redis for persistence (e.g. using &lt;code&gt;RedisChatMessageHistory&lt;/code&gt;), memory should survive as long as Redis does. But in the real world, sometimes a conversation gets routed to a different machine, and fields get lost in the serialized JSON or Pickle payload. Sometimes the async connection pool gets exhausted and &lt;code&gt;save_context&lt;/code&gt; returns without really flushing. Even spookier: when &lt;code&gt;return_messages=True&lt;/code&gt;, &lt;code&gt;load_memory_variables&lt;/code&gt; can return &lt;code&gt;SystemMessage&lt;/code&gt; and &lt;code&gt;HumanMessage&lt;/code&gt; objects, but if you serialize them with &lt;code&gt;pickle&lt;/code&gt;, the slightest difference in class definitions on the deployment server causes a silent deserialization failure — you end up with an empty list and not a single error.&lt;/p&gt;

&lt;p&gt;These problems share one trait: &lt;strong&gt;they only surface under multi-turn, concurrent conversations, and manual testing can never reproduce them.&lt;/strong&gt; Debug locally step by step and everything works fine. Deploy to production and you get intermittent amnesia. Without an automated test suite that simulates real, multi-turn interactions, you’re effectively planting a time bomb in every release.&lt;/p&gt;




&lt;h2&gt;
  
  
  Design decisions: Why Pytest + fakeredis instead of a real Redis cluster
&lt;/h2&gt;

&lt;p&gt;I needed a test setup that could run in CI, execute in milliseconds, and reliably reproduce every edge case. The options on the table:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Real Redis + Testcontainers&lt;/strong&gt; — Spins up a legitimate Redis container, very close to production. But each test requires a few seconds to start the container, which slows down CI pipelines noticeably. It also ties you to Docker; some teammates use M1 Macs, and certain images cause compatibility headaches.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;unittest + mock&lt;/strong&gt; — Swap out the Redis client with Python’s standard mock library. It can test the logic, but the code becomes verbose (mock every single Redis command), and mocks don’t reproduce the real behavior of serialization/deserialization.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pytest + fakeredis&lt;/strong&gt; — &lt;code&gt;fakeredis&lt;/code&gt; is a pure-Python Redis simulator that supports over 90% of common commands, fully emulating serialization, expiration, and data structures — with zero external dependencies. Combined with Pytest’s fixtures, parametrization, and &lt;code&gt;pytest-asyncio&lt;/code&gt;, you can polish every corner of the memory storage logic.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Another reason &lt;strong&gt;not&lt;/strong&gt; to use a real Redis cluster: some of the scenarios I wanted to test depended on specific Redis error responses (e.g. OOM, connection timeouts). With &lt;code&gt;fakeredis&lt;/code&gt;, simulating those anomalies is actually easier than with a real instance, because you can inject specific faults. This way the test environment becomes “a version of production that knows your code better than you do” — exposing those sporadic boundary conditions.&lt;/p&gt;

&lt;p&gt;The final architecture is simple: a &lt;code&gt;MemoryStore&lt;/code&gt; abstract class that encapsulates the agent’s memory operations (add messages, get history, trim, clear). The underlying implementation can be Redis or an in-memory &lt;code&gt;dict&lt;/code&gt;. During testing, we inject &lt;code&gt;fakeredis&lt;/code&gt;; in development, we reuse the exact same test suite to validate any new storage backend.&lt;/p&gt;




&lt;h2&gt;
  
  
  Core implementation: from fixtures to multi-turn conversation tests
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Setting up an isolated memory store so every test starts fresh
&lt;/h3&gt;

&lt;p&gt;We start with a Pytest fixture that provides an isolated, async memory store, guaranteeing no cross-test interference:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# conftest.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;fakeredis.aioredis&lt;/span&gt;  &lt;span class="c1"&gt;# async fakeredis
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;myapp.memory_store&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RedisMemoryStore&lt;/span&gt;

&lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;memory_store&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;fake_redis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fakeredis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aioredis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;FakeRedis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;decode_responses&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RedisMemoryStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;fake_redis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clear_all&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# clean up after each test
&lt;/span&gt;    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;fake_redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;aclose&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We built &lt;code&gt;RedisMemoryStore&lt;/code&gt; ourselves, avoiding a direct dependency on LangChain’s History classes. This lets us test serialization logic in isolation and makes it easy to swap out for DynamoDB or PostgreSQL later.&lt;/p&gt;

&lt;h3&gt;
  
  
  Verifying the most critical behavior: you can save and retrieve what you stored
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# test_memory_basic.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;myapp.schemas&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatMessage&lt;/span&gt;

&lt;span class="n"&gt;pytestmark&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mark&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;asyncio&lt;/span&gt;  &lt;span class="c1"&gt;# the whole module is async
&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_save_and_retrieve_messages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memory_store&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;session-001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="nc"&gt;ChatMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;My name is Wang&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nc"&gt;ChatMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Got it, Wang&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;memory_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_messages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;history&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;memory_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_history&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;role&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;My name is Wang&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;role&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here &lt;code&gt;ChatMessage&lt;/code&gt; is a Pydantic model. Serialization uses &lt;code&gt;.json()&lt;/code&gt; and stores the result in a Redis Hash, avoiding Pickle and its class-versioning issues altogether.&lt;/p&gt;

&lt;h3&gt;
  
  
  Simulating real-world multi-turn conversations concurrently
&lt;/h3&gt;

&lt;p&gt;Now the part that catches the “goldfish” bug: multiple interleaved sessions, each with multiple turns, running under &lt;code&gt;asyncio.gather&lt;/code&gt; to trigger race conditions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_concurrent_multi_session_isolation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memory_store&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;session_a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sess-A&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;session_b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sess-B&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;memory_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_messages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;ChatMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;From A1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;memory_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_messages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;ChatMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;From B1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;

    &lt;span class="c1"&gt;# Simulate two sessions continuing at the same time
&lt;/span&gt;    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;continue_session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;memory_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_messages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;ChatMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;continue_session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Reply A&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;continue_session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Reply B&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;hist_a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;memory_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_history&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;hist_b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;memory_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_history&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hist_a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;role&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;hist_a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;hist_a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Reply A&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hist_b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;hist_b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Reply B&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Testing what happens when Redis misbehaves
&lt;/h3&gt;

&lt;p&gt;With &lt;code&gt;fakeredis&lt;/code&gt; we can also simulate infrastructure failures:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_graceful_fallback_on_redis_timeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memory_store&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Forge the fake client to raise on the next write
&lt;/span&gt;    &lt;span class="n"&gt;memory_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AsyncMock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;side_effect&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;ConnectionError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timeout&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raises&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MemoryStoreBackendError&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;memory_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_messages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;ChatMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;this will fail&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In total, I wrote 23 such cases — covering session isolation, message trimming when a conversation grows too long, clearing history, TTL expiration, serialization edge cases, and concurrent write collisions. Every time a new “memory evaporation” pattern appeared, I added a test that reproduced it, fixed the code, and watched it go green.&lt;/p&gt;




&lt;h2&gt;
  
  
  Results and takeaways
&lt;/h2&gt;

&lt;p&gt;After integrating the test suite into our CI pipeline, memory-related incidents disappeared from the #oncall channel. The boss now knows me as “the guy who ended the goldfish era.” More importantly, the team now treats memory persistence as a first-class concern — every new storage adapter gets validated against the same 23 cases before it ever touches production.&lt;/p&gt;

&lt;p&gt;If your LangChain agent ever “forgets” what the user just said, don’t reach for a redeploy. Write a test that replays the exact conversation pattern, break it, and only then fix it. You’ll sleep better — and your users won’t have to introduce themselves three times.&lt;/p&gt;

</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>From 2 Hours to 10 Minutes: Pytest Parametrization for LLM Memory Tests</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Sat, 27 Jun 2026 12:04:21 +0000</pubDate>
      <link>https://dev.to/_eb7f2a654e97a60ae9f96e/from-2-hours-to-10-minutes-pytest-parametrization-for-llm-memory-tests-543d</link>
      <guid>https://dev.to/_eb7f2a654e97a60ae9f96e/from-2-hours-to-10-minutes-pytest-parametrization-for-llm-memory-tests-543d</guid>
      <description>&lt;p&gt;It was 1 a.m. The product manager was furiously @mentioning me in the group chat: “A user reported that the agent got their dietary preferences wrong – they clearly said no cilantro, but every recommended dish is packed with cilantro.” Squinting at the logs, I discovered that a critical record was missing from the memory recall. To reproduce the issue, I spent half an hour crafting &lt;code&gt;curl&lt;/code&gt; commands, manually scrolling through the returned memory list line by line in the terminal. That’s when I realized – if I kept testing like this manually, a disaster was inevitable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Breaking Down the Problem
&lt;/h3&gt;

&lt;p&gt;Anyone building LLM agents knows that memory storage is the key to personality consistency. Whether you use vector databases (Chroma, Pinecone) for long-term memory or LangChain’s &lt;code&gt;ConversationBufferMemory&lt;/code&gt; for session windows, every time you tweak the retrieval strategy – changing the embedding model, adjusting &lt;code&gt;top_k&lt;/code&gt;, shifting the similarity threshold, adding summarization compression – you &lt;strong&gt;must re-verify two core metrics&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Recall:&lt;/strong&gt; Are all the memories that should be remembered actually recalled, with no missing ones?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Precision:&lt;/strong&gt; Is the retrieved content truly relevant and free of hallucinations?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’ve ever done manual testing, you know the pain: you first inject conversations one by one into the memory store according to a test case table, then manually ask questions, and finally scrutinize the returned &lt;code&gt;content&lt;/code&gt; to check if it contains the expected key information. Five scenarios take over an hour; a full regression before a release easily takes two hours. The worst part: human eyes get tired and miss things. “But I thought I tested that” becomes the most common excuse when something breaks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Conventional Approaches Fail
&lt;/h3&gt;

&lt;p&gt;Why don’t conventional approaches work? You might suggest writing a few hard-coded cases with &lt;code&gt;unittest&lt;/code&gt;. But LLM memory test scenarios often involve combinatorial explosions – varying conversation history lengths, semantically similar distractors, different recall limits. &lt;code&gt;unittest&lt;/code&gt;’s parameterization capabilities are too weak; maintaining fixtures and a pile of test data makes you want to delete the repository. Running a quick check in a Jupyter Notebook lacks regression capability – you run it and throw it away. What we need is a &lt;strong&gt;repeatable, extensible, and clear-reporting&lt;/strong&gt; automated validation system.&lt;/p&gt;

&lt;h3&gt;
  
  
  Solution Design
&lt;/h3&gt;

&lt;p&gt;The tech choice is straightforward: &lt;strong&gt;Pytest + a pluggable memory abstraction&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pytest&lt;/strong&gt;: Its parametrization (&lt;code&gt;@pytest.mark.parametrize&lt;/code&gt;) is a natural fit for generating batches of test cases combining “injected conversation × query condition × expected result”. Fixtures elegantly manage the lifecycle of memory instances. The plugin ecosystem is rich – generating HTML reports or integrating with CI costs nothing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why not &lt;code&gt;unittest&lt;/code&gt;?&lt;/strong&gt; Its &lt;code&gt;ddt&lt;/code&gt; and &lt;code&gt;parameterized&lt;/code&gt; are like stepchildren – you end up with a pile of decorators, and data-driven testing is so painful that newcomers want to give up right away.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why not just use online monitoring like LangSmith?&lt;/strong&gt; That’s for production observability. What we need is &lt;strong&gt;offline regression&lt;/strong&gt;: after changing a single line of retrieval code, we can run the entire test suite in under 5 minutes, leaving no hidden risks for production.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Architecture idea: extract an abstract memory interface to avoid coupling to a real vector database, so unit tests don’t depend on external services and execute in seconds. Then build an &lt;strong&gt;injectable test dataset&lt;/strong&gt; that defines “preloaded conversation list”, “query string”, “set of expected memory_id to recall”, and “expected key text snippets”. Pytest takes this data and automatically runs the flow: “load memories → trigger retrieval → assert both metrics”. This pattern can easily be adapted to any memory backend.&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Implementation
&lt;/h3&gt;

&lt;p&gt;Let’s dive into runnable code, broken down by module, with each snippet explaining what problem it solves.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Abstract a minimal memory store, removing external dependencies&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To make tests run instantly on any dev machine, we first write a minimal &lt;code&gt;FakeVectorMemory&lt;/code&gt; that stores memories in a list and simulates “semantic retrieval” via simple keyword matching. In a real project, you would just replace &lt;code&gt;retrieve&lt;/code&gt; with a call to &lt;code&gt;vectorstore.similarity_search()&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# fake_memory.py —— 解决“如何离线模拟向量检索”的问题
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FakeVectorMemory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;模拟向量记忆存储：插入带ID的记忆，检索返回包含query关键词的记录&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_memories&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;  &lt;span class="c1"&gt;# 每条记忆：{"memory_id": str, "content": str}
&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;memory_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;插入一条记忆&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="c1"&gt;# 过滤重复 ID，防止误报召回率
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;memory_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;memory_id&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_memories&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_memories&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;memory_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;memory_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;retrieve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        模拟检索：找出 content 包含 query 中任一非停用词的记忆。
        真实场景这里会调用 embedding + 向量搜索。
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="c1"&gt;# 简易分词（生产环境请用真正的 tokenizer）
&lt;/span&gt;        &lt;span class="n"&gt;keywords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

        &lt;span class="n"&gt;matched&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;mem&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_memories&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;content_lower&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mem&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="c1"&gt;# 只要内容包含任意关键词就认为“命中”，模拟相似度匹配
&lt;/span&gt;            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kw&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;content_lower&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;kw&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;matched&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mem&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;matched&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;break&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;matched&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Manage memory lifecycle with a Pytest fixture&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before each test case, the &lt;code&gt;reset_memory&lt;/code&gt; fixture provides a fresh, empty memory instance, preventing data contamination between tests. This is where fixtures are more elegant than &lt;code&gt;setUp/tearDown&lt;/code&gt;.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
python
# conftest.py —— 解决“每个测试用例都需要干净记忆环境”的问题
import pytest
from fake_memory import
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>Caging Redis Cache Inconsistency: A 6-Hour Debugging Nightmare Solved with pytest and In-Memory Snapshots</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Sat, 27 Jun 2026 01:03:35 +0000</pubDate>
      <link>https://dev.to/_eb7f2a654e97a60ae9f96e/caging-redis-cache-inconsistency-a-6-hour-debugging-nightmare-solved-with-pytest-and-in-memory-5c73</link>
      <guid>https://dev.to/_eb7f2a654e97a60ae9f96e/caging-redis-cache-inconsistency-a-6-hour-debugging-nightmare-solved-with-pytest-and-in-memory-5c73</guid>
      <description>&lt;p&gt;At 2:17 AM, my phone buzzed so violently it rattled the whole desk. Users were cursing in the support tickets — they’d update their nickname, hit refresh a few minutes later, and the old name would pop right back. My first instinct: the cache. I pulled up the monitoring dashboards. Redis memory looked fine, database connections steady as a rock. Yet users kept getting a snapshot from 12 hours ago. After six hours of log diving, I finally found the culprit: a 200ms concurrency window buried inside the “write DB then delete cache” logic of the Cache‑Aside pattern. Just long enough for a read request to sneak in, fetch the stale data from MySQL, and write it back into Redis before the delete could fire.&lt;/p&gt;

&lt;p&gt;After the dust settled, I realized that hunting this down manually — combing through logs, eyeballing timestamps — was never going to scale. What I needed was an automated test suite that could &lt;strong&gt;simulate concurrency&lt;/strong&gt;, &lt;strong&gt;inspect Redis state directly&lt;/strong&gt;, and &lt;strong&gt;compare that state against the database like a forensic snapshot&lt;/strong&gt;. So I built a solution using &lt;strong&gt;pytest + fakeredis + concurrency injection + snapshot assertions&lt;/strong&gt;, and locked the whole class of cache‑consistency bugs inside a CI cage.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why do cache‑consistency bugs only explode in production?
&lt;/h2&gt;

&lt;p&gt;The scenario is classic. You have a User service: reads check Redis first, fall back to MySQL on a miss, and backfill the cache. Writes go to MySQL first, then delete the corresponding Redis key. It’s the textbook Cache‑Aside pattern.&lt;/p&gt;

&lt;p&gt;Looks bulletproof on paper. But the devil lives in the concurrency. Imagine request A is an update, and B is a read:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A writes MySQL successfully — nickname changes from &lt;code&gt;"Tom"&lt;/code&gt; to &lt;code&gt;"Jerry"&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Before A gets a chance to delete the cache&lt;/strong&gt;, B arrives.
&lt;/li&gt;
&lt;li&gt;B queries Redis. The key has expired or doesn’t exist yet → cache miss.
&lt;/li&gt;
&lt;li&gt;B hits MySQL and reads &lt;strong&gt;"Jerry" (the new value)&lt;/strong&gt;. Lucky case? Sure, but what if the old value &lt;code&gt;"Tom"&lt;/code&gt; was still sitting in cache before step 1? Then B returns the stale data straight from Redis.
&lt;/li&gt;
&lt;li&gt;The truly deadly interleaving, though, is this:

&lt;ul&gt;
&lt;li&gt;Cache has no key at the start.
&lt;/li&gt;
&lt;li&gt;A writes DB → B reads DB (getting the old value) → A deletes cache (a no‑op) → B writes the old value back into Redis.
Result: the cache is now permanently stale until the next write or TTL eviction.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Even sneakier: in many codebases, the “delete cache” step isn’t atomic, or it gets swallowed by an exception handler. You end up with a database that’s completely up‑to‑date while the cache clings to yesterday’s data. You stare at the code for half an hour and see nothing wrong, but once QPS climbs, that probabilistic bug turns into a full‑blown outage.&lt;/p&gt;

&lt;p&gt;Conventional testing — stepping through a debugger, sprinkling logs, manually flushing caches — is blind to these race windows. You need a way to &lt;strong&gt;reproduce concurrent execution orders in a test and then capture the full Redis state to make assertions.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Design choices: why an in‑memory snapshot instead of a real Redis instance?
&lt;/h2&gt;

&lt;p&gt;In theory, you could spin up a real Redis instance and run integration tests against it. Anyone who’s tried knows it’s a nightmare:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Environment dependencies&lt;/strong&gt;: Local dev and CI both need Redis installed, with matching versions — otherwise you’re debugging command differences instead of your app.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data pollution&lt;/strong&gt;: Multiple test cases share the same Redis. You must &lt;code&gt;FLUSHALL&lt;/code&gt; before every test, but clean‑up is never perfect and parallel execution causes cross‑talk.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slowness&lt;/strong&gt;: Network I/O drags test speed from microseconds to tens of milliseconds. When you have hundreds of cases, you want to throw your laptop out the window.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time‑travel difficulties&lt;/strong&gt;: Testing TTL or eviction logic forces you to sprinkle &lt;code&gt;time.sleep&lt;/code&gt; or manipulate the system clock — a time‑bomb for CI pipelines.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s why I chose &lt;strong&gt;fakeredis&lt;/strong&gt; — a pure‑Python, in‑memory library that speaks the same &lt;code&gt;redis-py&lt;/code&gt; commands. It lives inside the process, gets thrown away after each test, provides perfect isolation, and runs blazingly fast. More importantly, it gives us a natural entry point for an &lt;strong&gt;in‑memory snapshot&lt;/strong&gt;: after the test executes, we can walk all keys, dump their values into a dict, and compare that dict against the expected database state. It’s like taking a crime‑scene photograph that preserves every piece of evidence.&lt;/p&gt;

&lt;p&gt;Why not just mock the Redis calls? A mock verifies “did you invoke this command?”, but it can never verify &lt;strong&gt;whether the actual cache content is correct&lt;/strong&gt;. Worse, every time the code adds a new &lt;code&gt;SETEX&lt;/code&gt;, your carefully crafted &lt;code&gt;assert_called_once_with&lt;/code&gt; lines break, sending maintenance costs through the roof. A memory snapshot doesn’t care what commands were called — it only looks at the final state. That’s the kind of testing that actually pays off.&lt;/p&gt;




&lt;h2&gt;
  
  
  Core implementation: from concurrency reproduction to snapshot evidence
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. First, build the caching layer with the concurrency bug built in
&lt;/h3&gt;

&lt;p&gt;The code below demonstrates the problematic “bad case”. In a real project, this might sit inside your ORM or service layer; I’ve simplified it into a &lt;code&gt;UserCache&lt;/code&gt; class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;  &lt;span class="c1"&gt;# used only for type hints; tests will swap in fakeredis
&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;UserCache&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;db_conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;redis_conn&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db_conn&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis_conn&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# 简化反序列化
&lt;/span&gt;        &lt;span class="c1"&gt;# cache miss -&amp;gt; 查库
&lt;/span&gt;        &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT * FROM users WHERE id=?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;new_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# 1. 更新数据库
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;UPDATE users SET name=? WHERE id=?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="c1"&gt;# 时间窗口就在这里：下面的删缓存在“稍后”执行
&lt;/span&gt;        &lt;span class="c1"&gt;# 并发读可能在这中间把旧值写回 Redis
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That window might only be a few microseconds in reality, but it’s wide enough to get hit under load. We need to precisely recreate it in a test.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Set up fakeredis + the snapshot foundation with pytest fixtures
&lt;/h3&gt;

&lt;p&gt;This snippet solves two things: an isolated “fake Redis” for every test, and a utility for taking those forensic snapshots:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# conftest.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;fakeredis&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;collections.abc&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Mapping&lt;/span&gt;

&lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fake_redis&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;每个测试独立的 fakeredis 实例，省去 FLUSHALL&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fakeredis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fa&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>AI Customer Service Lost 30% Memory in Week One — I Wrote 3 Pytest Tests to Catch the Culprit</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Fri, 26 Jun 2026 12:05:04 +0000</pubDate>
      <link>https://dev.to/_eb7f2a654e97a60ae9f96e/ai-customer-service-lost-30-memory-in-week-one-i-wrote-3-pytest-tests-to-catch-the-culprit-115o</link>
      <guid>https://dev.to/_eb7f2a654e97a60ae9f96e/ai-customer-service-lost-30-memory-in-week-one-i-wrote-3-pytest-tests-to-catch-the-culprit-115o</guid>
      <description>&lt;p&gt;At 2 a.m. last Thursday, my ops colleague blew up my phone: “Customers are flooding the complaint channels. Our AI support bot has amnesia — one second a customer says ‘My order number is 8892’, the next the bot asks ‘May I have your order number?’” We had just launched a smart customer service system built with LangChain for the conversation chain. Load testing during the day looked perfect, but once real traffic hit at night, the memory loss rate spiked above 30%. I could feel the customers’ fury through the screen.&lt;/p&gt;

&lt;p&gt;What chilled me most was this: if the same bug happened in a fintech or healthcare scenario, it wouldn’t be just complaints — it could be catastrophic. That same night I made the decision: &lt;strong&gt;we must lock down “memory storage consistency” with an automated test suite so it never slips into production again.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Breaking down the problem: why does LangChain’s memory just disappear?
&lt;/h2&gt;

&lt;p&gt;Our support system looked roughly like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User -&amp;gt; FastAPI -&amp;gt; LangChain ConversationChain (with memory) -&amp;gt; LLM
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To move fast during the initial launch, we chose &lt;code&gt;ConversationBufferMemory&lt;/code&gt; with the default &lt;code&gt;InMemoryChatMessageHistory&lt;/code&gt;. Under the hood, it’s essentially a dictionary: the key is the &lt;code&gt;session_id&lt;/code&gt; and the value is a list of messages. It seemed fine — until we deployed with multi-process Gunicorn and multiple Pods. Three culprits then struck simultaneously:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;No shared memory&lt;/strong&gt;: A user’s first request lands on Pod A, which stores the memory in Pod A’s local memory. The second request gets load-balanced to Pod B, which has no memory at all → amnesia.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No concurrency control&lt;/strong&gt;: When multiple coroutines or threads write to the same session dictionary concurrently, messages overwrite each other and memory becomes a garbled mess.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No persistence&lt;/strong&gt;: When a Pod restarts, all customer memory evaporates. Returning customers instantly become “new users”.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Of course we knew we should switch to a shared store like Redis. But the truly fatal flaw was this: &lt;strong&gt;from development to testing, no step ever verified the correctness of memory storage.&lt;/strong&gt; Business tests only ran single-threaded, single-process happy paths. Concurrency and restart scenarios were a complete blind spot. That’s the classic trap: you think everything is fine — until it blows up.&lt;/p&gt;




&lt;h2&gt;
  
  
  Designing the fix: why Pytest + Redis in three acts?
&lt;/h2&gt;

&lt;p&gt;Plugging this hole required more than just swapping the backend to Redis — we needed tests that could truly nail down consistency. My goal was clear: &lt;strong&gt;write a set of Pytest tests that automatically verify three critical behaviors — isolation, persistence, and concurrency safety.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The tech choices were straightforward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Redis as the memory backend&lt;/strong&gt;: In production, multiple Pods inside K8s must share state. Redis is the infrastructure our team knows best, and it has the officially maintained &lt;code&gt;RedisChatMessageHistory&lt;/code&gt; (in &lt;code&gt;langchain-community&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pytest + pytest-asyncio as the test framework&lt;/strong&gt;: Our project is an async FastAPI service, so tests must be async too. &lt;code&gt;pytest-asyncio&lt;/code&gt; makes it easy to simulate concurrent scenarios.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No mocking, real Redis connection&lt;/strong&gt;: Memory consistency problems hide exactly in real network calls, serialization, and connection pooling. Mocking those would be self-deception. We use an isolated test Redis database (db=15) and flush it after each test — no interference.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why not just add logging in production and observe? Because “occasionally lost” bugs drown in a sea of normal requests. By the time you locate the affected session, the customer is already gone. A test written before the incident is ten thousand times cheaper than firefighting afterward.&lt;/p&gt;




&lt;h2&gt;
  
  
  Core implementation: three tests that lock down memory consistency
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. First, set up the test infrastructure (conftest.py)
&lt;/h3&gt;

&lt;p&gt;This code ensures every test gets a clean Redis memory backend:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# conftest.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.chat_message_histories&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RedisChatMessageHistory&lt;/span&gt;

&lt;span class="n"&gt;REDIS_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis://localhost:6379/15&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# dedicated test db
&lt;/span&gt;
&lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Create a Redis client, clear data after the test&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_url&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;REDIS_URL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decode_responses&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;flushdb&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# clean isolation for each test
&lt;/span&gt;    &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nd"&gt;@pytest.fixture&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;make_history&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Factory: create memory instance with a given session_id&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;RedisChatMessageHistory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;RedisChatMessageHistory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;REDIS_URL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3600&lt;/span&gt;  &lt;span class="c1"&gt;# matches production; expiration also causes "loss"
&lt;/span&gt;        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;_make&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Test 1: Isolation — conversation A must never leak into conversation B
&lt;/h3&gt;

&lt;p&gt;This test verifies that memories from different &lt;code&gt;session_id&lt;/code&gt;s are completely isolated, catching low-level but deadly bugs like key concatenation errors or wrong Redis db writes.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
python
# test_memory_isolation.py
import pytest

@pytest.mark.asyncio
async def test_memory_isolation(make_history):
    """Two different sessions should not see each other's messages"""
    history_a = make_history("session_a")
    history_b = make_history("session_b")

    # Simulate session A: customer mentions an order number
    await history_a.aadd_user_message("我的订单号是 8892")
    await history_a.aadd_ai_message("好的，正在为您查询订单 8892")

    # Simulate session B: completely different topic
    await history_b.aadd_user_message("我想退货")
    await history_b.aadd_ai_message("请提供您的手机号")

    # Assert: session A must have absolutely zero content from B
    msgs_a = await history_a.aget_messages()
    content_a = " ".join([m.content for m in msgs_a])
    assert "退货" not in content_a, "A 会话泄漏了 B 的对话内容！"
    assert "8892" in content_a, "A 会话自身的消
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>I Used pytest for 3 Years Before Realizing I Never Actually Tested LLM Memory Persistence</title>
      <dc:creator>BAOFUFAN</dc:creator>
      <pubDate>Fri, 26 Jun 2026 01:03:25 +0000</pubDate>
      <link>https://dev.to/_eb7f2a654e97a60ae9f96e/i-used-pytest-for-3-years-before-realizing-i-never-actually-tested-llm-memory-persistence-23pj</link>
      <guid>https://dev.to/_eb7f2a654e97a60ae9f96e/i-used-pytest-for-3-years-before-realizing-i-never-actually-tested-llm-memory-persistence-23pj</guid>
      <description>&lt;p&gt;It was 2:30 a.m. when a user complaint exploded on my phone: “Your chatbot has amnesia again—it forgot everything we talked about three days ago!” I scrambled out of bed, dug through the logs, and discovered that a seemingly harmless “optimize Redis serialization” PR from the previous week had silently caused the legacy message loader to throw an exception whenever it encountered the new structure. Worse, our live memory storage module had zero coverage for that regression path. The irony? I had been writing pytest for three years, but every single test mocked Redis, mocked the storage layer, and only exercised business logic. I had never actually verified &lt;strong&gt;memory persistence and consistency regression&lt;/strong&gt; against a real database. After that night, I changed everything.&lt;/p&gt;

&lt;p&gt;This article is my blood-and-tears postmortem: how to build a reproducible, regression-proof integration test suite for LLM memory storage with pytest and Docker—so that “what you stored yesterday can be read back exactly today, and it survives a container restart tomorrow.”&lt;/p&gt;

&lt;h2&gt;
  
  
  Breaking down the problem: why mocks can’t save memory storage
&lt;/h2&gt;

&lt;p&gt;LLM memory storage sounds trivial: save conversations, load conversations. But in the real world, three tricky dimensions immediately surface:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Version compatibility of serialization/deserialization&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
You store a &lt;code&gt;Message&lt;/code&gt; object, but when you read it back, you deserialize it into a dict or a Pydantic model. The moment a timestamp format, a role enum value, or an extra metadata field changes, a mock-based test that fakes disk reads and writes will never notice—because the mock always returns idealized data you prepared in advance.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Concurrency consistency&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
A user can fire off several messages in rapid succession within the same session, triggering concurrent writes. Redis’s &lt;code&gt;HSET&lt;/code&gt; and &lt;code&gt;ZADD&lt;/code&gt; are atomic, but if you wrap them in a “read-then-write” merge step (e.g., merging a context window), you can easily get overlapping updates and lost messages. A pure unit test running single-threaded will never hit such race conditions, not even after 100 runs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Persistence regression&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
After a Docker container restart, can the AOF/RDB files on the mounted volume restore everything correctly? Are you relying on Redis’s default save configuration, or did you happen to land exactly within the 900-second window where no snapshot triggered? If an ops colleague casually bumps the Redis image version or tweaks the persistence policy, previous conversations can vanish into thin air. Unless you actually kill the container and then read the data back, this pitfall stays forever in your blind spot.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The root cause is simple: &lt;strong&gt;memory problems are always compound failures of storage engine × access pattern × time.&lt;/strong&gt; Mocks amputate the storage engine, and manual testing removes the high-frequency repetition—you will never wait long enough for the failure to surface.&lt;/p&gt;

&lt;h2&gt;
  
  
  Designing the solution: real dependencies with Docker, sharp regression with pytest
&lt;/h2&gt;

&lt;p&gt;We need a real storage environment that can be created and destroyed on demand, and we need to run the exact same set of consistency and persistence test cases against it over and over.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Comparing the candidates:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Option A: Point your dev environment directly at a local Redis. Downside: local data gets polluted, you have to manually flush between tests, and your colleague’s Redis version may differ from yours—no guaranteed regression consistency.&lt;/li&gt;
&lt;li&gt;Option B: Bring up the full dependency stack with docker-compose and run pytest by hand. Downside: extra process management, not atomic enough, and tricky to achieve unit-test-level retries in CI.&lt;/li&gt;
&lt;li&gt;Option C (the one I settled on): &lt;strong&gt;testcontainers-python&lt;/strong&gt; to spin up Redis/Postgres containers on-demand inside pytest fixtures, automatically destroyed when the test finishes. Combine that with pytest’s &lt;code&gt;scope&lt;/code&gt; and &lt;code&gt;autouse&lt;/code&gt; to control container lifetimes, achieving isolation while keeping startup costs reasonable.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why not &lt;code&gt;pytest-docker&lt;/code&gt;? It’s better suited for running a long-lived container that you start once in CI. testcontainers, on the other hand, lets you treat the database as part of the test itself, dynamically created. When writing tests, you don’t have to care whether the external environment is already prepared, and parameterization becomes far easier—e.g., running the same test against Redis 6 and Redis 7.&lt;/p&gt;

&lt;p&gt;Architecturally, we define an abstract &lt;code&gt;MemoryStore&lt;/code&gt; base class. If we later swap in Postgres/pgvector, we can reuse the exact same consistency test suite. The key principle: &lt;strong&gt;tests don’t care about implementation details; they care about the contract.&lt;/strong&gt; What you write is what you should read back; after a restart, historical data must not be lost.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core implementation: inviting the Redis container into a pytest fixture
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Abstract memory storage interface
&lt;/h3&gt;

&lt;p&gt;This snippet decouples the tests from the concrete storage implementation—we only need to define the contract for reading, writing, and persistence behavior.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# llm_memory/store.py
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;abc&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ABC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;abstractmethod&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dataclasses&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dataclass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asdict&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;

&lt;span class="nd"&gt;@dataclass&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;          &lt;span class="c1"&gt;# 'user' | 'assistant' | 'system'
&lt;/span&gt;    &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;   &lt;span class="c1"&gt;# epoch second, 保证跨语言兼容
&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MemoryStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ABC&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nd"&gt;@abstractmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;append_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="nd"&gt;@abstractmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_history&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="nd"&gt;@abstractmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. RedisMemoryStore implementation—must handle serialization edges
&lt;/h3&gt;

&lt;p&gt;This piece addresses real-storage serialization and deserialization consistency. We deliberately use JSON instead of pickle because pickle has terrible cross-version compatibility and is unsafe.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# llm_memory/redis_store.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llm_memory.store&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MemoryStore&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Message&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RedisMemoryStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MemoryStore&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                  &lt;span class="n"&gt;decode_responses&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_session_key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;session:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;append_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;se&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>python</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
