<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sayantan Bhattacharjee</title>
    <description>The latest articles on DEV Community by Sayantan Bhattacharjee (@sayantan_bhattacharjee_1a).</description>
    <link>https://dev.to/sayantan_bhattacharjee_1a</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3885472%2Fca90f857-62c0-4fd7-84a1-27f7d68ee7f4.jpg</url>
      <title>DEV Community: Sayantan Bhattacharjee</title>
      <link>https://dev.to/sayantan_bhattacharjee_1a</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sayantan_bhattacharjee_1a"/>
    <language>en</language>
    <item>
      <title>RecallPal</title>
      <dc:creator>Sayantan Bhattacharjee</dc:creator>
      <pubDate>Sat, 18 Apr 2026 04:26:46 +0000</pubDate>
      <link>https://dev.to/sayantan_bhattacharjee_1a/recallpal-1c9a</link>
      <guid>https://dev.to/sayantan_bhattacharjee_1a/recallpal-1c9a</guid>
      <description>&lt;h1&gt;
  
  
  I Built a Real-Time Face Recognition Memory Aid for Dementia Patients — Here's How the Memory Layer Actually Works
&lt;/h1&gt;

&lt;p&gt;My grandmother stopped recognizing my father about two years before she passed. She knew someone was in the room. She could tell he cared about her. But the name, the relationship, the decades of shared history — gone. That experience sat with me for a long time.&lt;/p&gt;

&lt;p&gt;So when I started thinking about what kind of system I actually wanted to build, the answer was straightforward: something that bridges that gap in real time. Not a research prototype. A working system that a caregiver could set up in an afternoon and that would tell a dementia patient — quietly, clearly — who just walked into the room and why they matter.&lt;/p&gt;

&lt;p&gt;The result is &lt;strong&gt;Dementia Assist&lt;/strong&gt;: a real-time face recognition system backed by a persistent memory layer that stores not just &lt;em&gt;who&lt;/em&gt; someone is, but &lt;em&gt;what you know about them&lt;/em&gt; — their relationship to the patient, their interests, how long it's been since their last visit, and a dynamically generated conversation starter to help break the ice.&lt;/p&gt;




&lt;h2&gt;
  
  
  How the System Hangs Together
&lt;/h2&gt;

&lt;p&gt;The architecture is deliberately simple. A Next.js frontend captures webcam frames once per second and POSTs them as base64 JPEGs to a Flask backend. The backend runs them through a FaceNet embedding model via DeepFace, compares the result against a pickled embedding database, and — if there's a match — queries a memory store for everything we know about that person. The response comes back with a name, a confidence score, relationship context, and a suggestion like &lt;em&gt;"Ask Sarah about her painting class."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Face recognition&lt;/strong&gt;: DeepFace + FaceNet (128-dimensional embeddings, Euclidean distance matching)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory&lt;/strong&gt;: &lt;a href="https://github.com/vectorize-io/hindsight" rel="noopener noreferrer"&gt;Hindsight&lt;/a&gt; Cloud with a local JSON fallback&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backend&lt;/strong&gt;: Python 3.9, Flask, flask-cors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontend&lt;/strong&gt;: Next.js 14, TypeScript, Tailwind CSS&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Voice&lt;/strong&gt;: Web Speech API for audio announcements when a face is recognized&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The whole thing starts with a single script — &lt;code&gt;./run.sh&lt;/code&gt; on Linux/macOS, &lt;code&gt;run.bat&lt;/code&gt; on Windows — which installs dependencies, starts both servers, and opens the browser. I wanted caregivers to be able to run this without touching a terminal again after setup.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Core Technical Problem: Recognition Alone Is Not Enough
&lt;/h2&gt;

&lt;p&gt;Early on I made a mistake that I think a lot of people make when building systems like this: I treated face recognition as the hard problem and memory as an afterthought.&lt;/p&gt;

&lt;p&gt;Recognition is actually fairly tractable. DeepFace with FaceNet gives you solid 128-d embeddings. You store them in a pickle file, compute Euclidean distances at inference time, set a threshold, done. What I underestimated was &lt;em&gt;what you do with the name once you have it.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A dementia patient doesn't just need to know a name. They need context. &lt;em&gt;Who is this person? When did I last see them? What do we talk about?&lt;/em&gt; If you flash "Sarah" on a screen with no surrounding information, you've built a party trick, not an aid.&lt;/p&gt;

&lt;p&gt;That's where I knew I needed &lt;a href="https://vectorize.io/what-is-agent-memory" rel="noopener noreferrer"&gt;agent memory&lt;/a&gt; — not a database query, but something that could store rich, evolving context about each person and retrieve it intelligently at recognition time.&lt;/p&gt;

&lt;p&gt;I looked at a few options and landed on &lt;a href="https://hindsight.vectorize.io/" rel="noopener noreferrer"&gt;Hindsight&lt;/a&gt; for the memory layer. What sold me was the combination of a clean Python client, an isolated bank-per-application model, and the fact that it degrades gracefully — if you don't have an API key, it falls back to a local JSON store with the exact same interface. That meant I could build and test everything offline without mocking anything out.&lt;/p&gt;




&lt;h2&gt;
  
  
  How Memory Storage and Retrieval Actually Works
&lt;/h2&gt;

&lt;p&gt;Each person in the system has two representations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;A face embedding&lt;/strong&gt; (or several) stored in &lt;code&gt;face_db.pkl&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A memory record&lt;/strong&gt; stored in Hindsight (or &lt;code&gt;memory_store.json&lt;/code&gt; locally)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The memory record looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;PREDEFINED_PEOPLE&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Sayantan&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Sayantan is a 21-year-old computer science student interested in AI and &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;coding. He enjoys chess and music and often helps with technical work.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;age&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="mi"&gt;21&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;relation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Friend&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;likes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AI&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Coding&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Chess&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Music&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;notes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Helpful and calm personality&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On startup, &lt;code&gt;seed_initial_data()&lt;/code&gt; checks whether each predefined person already exists in Hindsight and skips them if so — idempotent seeding, no duplicates. New people enrolled live go through the same &lt;code&gt;store_person()&lt;/code&gt; path:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;store_person&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;relation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;notes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;likes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_now_iso&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_store&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;people&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;
    &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_store&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;people&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;       &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;age&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;        &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;relation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="n"&gt;relation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;likes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="n"&gt;likes&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;notes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="n"&gt;notes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;last_seen&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;last_seen&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;first_seen&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;first_seen&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;added&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;added&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One thing I got wrong initially: I was storing names with their original casing in the face database but querying Hindsight with whatever case the recognition engine returned. This caused silent misses — the face would be recognized, the Hindsight lookup would return nothing, and the UI would show a name with no context. The fix was straightforward but important: &lt;strong&gt;normalize everything to lowercase at every write path&lt;/strong&gt;. The face engine now lowercases all keys on load and on every &lt;code&gt;add_person()&lt;/code&gt; call.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Recognition Pipeline: Crop First, Embed Second
&lt;/h2&gt;

&lt;p&gt;The other place I lost hours was "No face detected" errors from DeepFace. The original approach sent the full webcam frame directly to &lt;code&gt;DeepFace.represent()&lt;/code&gt; — which works fine when the person is close and centered, and fails silently when they're not.&lt;/p&gt;

&lt;p&gt;The fix was to introduce an OpenCV Haar cascade as a pre-processing step. The cascade detects the face first, crops it with 20% padding, and only then hands the region to DeepFace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_crop_largest_face&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndarray&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndarray&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;gray&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cvtColor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COLOR_BGR2GRAY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;faces&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_FACE_CASCADE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;detectMultiScale&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;gray&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scaleFactor&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;minNeighbors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;minSize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;faces&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;faces&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;pad_x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pad_y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.20&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;pad_y&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;pad_y&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;pad_x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;pad_x&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With &lt;code&gt;enforce_detection=False&lt;/code&gt; on the cropped input (since the cascade already confirmed a face exists), DeepFace failure rates dropped dramatically. Separately, matching was improved by comparing the query embedding against the &lt;em&gt;average&lt;/em&gt; embedding per person rather than against each stored sample individually — this reduces noise from outlier captures significantly.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Suggestion Engine
&lt;/h2&gt;

&lt;p&gt;Once a face is matched and memory is retrieved, the backend generates a conversation starter. The logic is layered:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Recency first&lt;/strong&gt;: if the patient hasn't seen this person in over a month, say so — &lt;em&gt;"You haven't seen Sarah in over a month. They might have news to share!"&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Likes second&lt;/strong&gt;: pick a random item from the &lt;code&gt;likes&lt;/code&gt; list — &lt;em&gt;"Ask Sarah about her painting class."&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Notes keyword scan&lt;/strong&gt;: if the notes mention "dog", "travel", "wedding", surface a relevant prompt.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generic fallback&lt;/strong&gt;: &lt;em&gt;"Ask Sarah about their day."&lt;/em&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_suggestion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Ask &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; about their day&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;last_seen_dt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_parse_last_seen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;last_seen&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;last_seen_dt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;delta_days&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;last_seen_dt&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;days&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;delta_days&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You haven&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t seen &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; in over a month. They might have news to share!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;likes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;likes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;likes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Ask &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; about &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;likes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is deliberately simple. The goal isn't to be clever — it's to give a patient a single, low-friction way to start a conversation without having to remember context they no longer have.&lt;/p&gt;




&lt;h2&gt;
  
  
  What a Real Interaction Looks Like
&lt;/h2&gt;

&lt;p&gt;A caregiver sets up the laptop in the living room. The patient's daughter walks in. Within about one second:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The camera crops her face, generates an embedding, matches it against the database&lt;/li&gt;
&lt;li&gt;The backend queries Hindsight for her memory record&lt;/li&gt;
&lt;li&gt;The UI displays: &lt;strong&gt;Sarah · Daughter · 87% match&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Below that: &lt;em&gt;"Ask Sarah about her painting class"&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;The Web Speech API reads aloud: &lt;em&gt;"This is Sarah, your daughter."&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If someone completely new walks in, the UI shows an amber badge: &lt;em&gt;"Unknown person detected."&lt;/em&gt; A caregiver can click "Add This Person," capture 5–10 photos from different angles using the in-modal webcam, fill in name, relationship, and notes, and the person is enrolled immediately. The next scan cycle will recognize them.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Memory is the product, recognition is the plumbing.&lt;/strong&gt; Face recognition is a solved problem with off-the-shelf tools. What makes a system like this genuinely useful is the richness of the context layer. Invest there.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Normalize your keys at the boundary.&lt;/strong&gt; The lowercase bug cost me a full afternoon. Any time you have two systems storing the same identifier (face DB and Hindsight in this case), pick a canonical form at the write path and never deviate from it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Crop before you embed.&lt;/strong&gt; Sending full frames to a model that expects a face is a reliability trap. A lightweight cascade detector as a pre-filter is cheap and it eliminates an entire class of errors.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Average embeddings beat per-sample matching.&lt;/strong&gt; A single bad capture — eyes closed, bad angle — will skew your nearest-neighbor result if you're comparing against all stored samples individually. Averaging collapses noise. It's one extra line per person on load and it makes matching noticeably more stable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Graceful degradation is a feature, not a fallback.&lt;/strong&gt; The local JSON memory store isn't a hack — it's a first-class deployment mode. Some caregivers will not have API keys and will not create accounts. If your system requires a cloud service to function at all, you've locked out your most constrained users. Build the offline path properly from day one.&lt;/p&gt;




&lt;p&gt;The codebase is available on GitHub. If you're building anything in the assistive technology space and want to talk through the memory architecture, I'm reachable. There's a lot more to do here — multi-patient support, mobile-first UI, more nuanced memory retrieval — but the core loop works, and the people it's designed for don't have time to wait for perfect.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mentalhealth</category>
      <category>showdev</category>
      <category>sideprojects</category>
    </item>
  </channel>
</rss>
