<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: wellallyTech</title>
    <description>The latest articles on DEV Community by wellallyTech (@wellallytech).</description>
    <link>https://dev.to/wellallytech</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2750397%2F35d7e98d-b131-48d3-838d-317b14dc6f7e.jpg</url>
      <title>DEV Community: wellallyTech</title>
      <link>https://dev.to/wellallytech</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/wellallytech"/>
    <language>en</language>
    <item>
      <title>Your Browser is the Doctor: Privacy-First Skin Screening with WebLLM &amp; WebGPU 🚀</title>
      <dc:creator>wellallyTech</dc:creator>
      <pubDate>Wed, 13 May 2026 01:30:00 +0000</pubDate>
      <link>https://dev.to/wellallytech/your-browser-is-the-doctor-privacy-first-skin-screening-with-webllm-webgpu-26j7</link>
      <guid>https://dev.to/wellallytech/your-browser-is-the-doctor-privacy-first-skin-screening-with-webllm-webgpu-26j7</guid>
      <description>&lt;p&gt;In the world of digital health, privacy isn't just a feature—it's a requirement. When dealing with sensitive medical data like dermatological photos, users are often (rightfully) hesitant to upload their images to a remote server. Enter &lt;strong&gt;Edge AI&lt;/strong&gt; and the revolution of &lt;strong&gt;WebGPU&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;By leveraging &lt;strong&gt;WebLLM&lt;/strong&gt; and the &lt;strong&gt;TVM (Tensor Virtual Machine)&lt;/strong&gt; stack, we can now run sophisticated vision models directly inside the browser. This approach enables high-performance, real-time &lt;strong&gt;privacy-preserving AI&lt;/strong&gt; where the image never leaves the user's device. In this guide, we’ll explore how to implement a skin lesion screening tool using &lt;strong&gt;WebGPU&lt;/strong&gt; and &lt;strong&gt;TypeScript&lt;/strong&gt;, moving the heavy lifting from the cloud to the client's GPU.&lt;/p&gt;

&lt;h2&gt;
  
  
  🏗 The Architecture: High-Performance Edge Inference
&lt;/h2&gt;

&lt;p&gt;Traditional web-based AI relies on network round-trips to a remote inference API. Our solution uses the browser's hardware acceleration via &lt;strong&gt;WebGPU&lt;/strong&gt;, allowing us to execute compiled model kernels at near-native speeds.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TD
    A[User Image/Camera] --&amp;gt; B{WebGPU Support?}
    B -- No --&amp;gt; C[Fallback: CPU/Wasm]
    B -- Yes --&amp;gt; D[Canvas API / Image Preprocessing]
    D --&amp;gt; E[WebLLM / TVM Runtime]
    E --&amp;gt; F[VLM / Vision Model Shards]
    F --&amp;gt; G[GPU-Accelerated Inference]
    G --&amp;gt; H[Screening Report &amp;amp; Insights]
    H --&amp;gt; I[UI Display]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🛠 Prerequisites
&lt;/h2&gt;

&lt;p&gt;To follow along, you'll need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Tech Stack&lt;/strong&gt;: WebLLM, WebGPU-capable browser (Chrome 113+), TypeScript, and Vite.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;A Vision Model&lt;/strong&gt;: We’ll use a quantized version of a vision-language model (VLM) compatible with the TVM runtime.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🚀 Step 1: Initializing the WebGPU Engine
&lt;/h2&gt;

&lt;p&gt;First, we need to check for WebGPU compatibility and initialize the &lt;code&gt;WebLLM&lt;/code&gt; engine. Unlike calling a standard REST API, we are downloading the actual model weights into the browser's cache (Cache Storage or IndexedDB) and memory.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;webllm&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@mlc-ai/web-llm&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;initializeScreeningEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;modelId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Llama-3-8B-Vision-Instruct-q4f16_1-MLC&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Example VLM&lt;/span&gt;

  &lt;span class="c1"&gt;// Progress callback to update the UI during heavy model download&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;initProgressCallback&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;report&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;webllm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;InitProgressReport&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Loading Model: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;report&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; - &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;report&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;progress&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;%`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;engine&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;webllm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CreateMLCEngine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;initProgressCallback&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🖼 Step 2: Processing Pixels for the Model
&lt;/h2&gt;

&lt;p&gt;Skin screening requires high-fidelity input. We use the browser's &lt;code&gt;CanvasRenderingContext2D&lt;/code&gt; to resize and normalize the image before passing it to the WebGPU buffer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;processImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;imageElement&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;HTMLImageElement&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;canvas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createElement&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;canvas&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2d&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Standardize input size for the vision encoder&lt;/span&gt;
  &lt;span class="nx"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;width&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;448&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;height&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;448&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;drawImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;imageElement&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;448&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;448&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Convert to Base64 for WebLLM vision input&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toDataURL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;image/jpeg&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🧠 Step 3: Local Inference
&lt;/h2&gt;

&lt;p&gt;Now for the magic. We send the processed image and a prompt to our local model. Because the &lt;strong&gt;TVM runtime&lt;/strong&gt; compiles the model into kernels tuned for the user's GPU, inference typically completes in well under a second (see the benchmarks below).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;runScreening&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;webllm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;MLCEngine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;imageBase64&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;webllm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ChatCompletionMessageParam&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Identify potential skin lesions in this image and provide a preliminary risk assessment.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;image_url&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;image_url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;imageBase64&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;];&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reply&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Keep it deterministic for medical screening&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  💡 The "Official" Way to Scale
&lt;/h2&gt;

&lt;p&gt;While building a prototype in the browser is exciting, productionizing Edge AI requires handling model versioning, weight sharding, and cross-device performance optimization. &lt;/p&gt;

&lt;p&gt;For advanced implementation patterns, performance benchmarks on different GPU architectures, and production-ready Edge AI templates, I highly recommend checking out the technical deep dives at &lt;strong&gt;&lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;WellAlly Tech Blog&lt;/a&gt;&lt;/strong&gt;. It's an incredible resource for developers looking to bridge the gap between "cool demo" and "robust healthcare application."&lt;/p&gt;

&lt;h2&gt;
  
  
  📈 Optimization &amp;amp; Benchmarking
&lt;/h2&gt;

&lt;p&gt;With this &lt;strong&gt;TypeScript&lt;/strong&gt; + &lt;strong&gt;TVM&lt;/strong&gt; stack, we observed the following once the model weights are cached in the browser's &lt;code&gt;CacheStorage&lt;/code&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Cold Start&lt;/strong&gt;: 5-10 seconds (Model loading).&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Inference Time&lt;/strong&gt;: ~200ms - 800ms (depending on GPU).&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Data Egress&lt;/strong&gt;: 0KB (Completely private).&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  🎯 Conclusion
&lt;/h2&gt;

&lt;p&gt;The browser is no longer just a document viewer; it's a powerful AI execution environment. By combining &lt;strong&gt;WebLLM&lt;/strong&gt; and &lt;strong&gt;WebGPU&lt;/strong&gt;, we can build healthcare tools that are fast, cost-effective, and—most importantly—private by design.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's next?&lt;/strong&gt; &lt;br&gt;
Try integrating this with a mobile PWA to create a "Skin Journal" app that alerts users to changes in their skin over time, all without a single server-side database. &lt;/p&gt;

&lt;p&gt;🥑 &lt;strong&gt;Found this helpful?&lt;/strong&gt; Follow me for more "Learning in Public" notes on Edge AI, and don't forget to visit &lt;strong&gt;&lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;WellAlly Tech&lt;/a&gt;&lt;/strong&gt; for more high-level architecture insights!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webgpu</category>
      <category>webllm</category>
      <category>typescript</category>
    </item>
    <item>
      <title>Your Heartbeat, Your Privacy: Running Fine-Tuned Llama-3 on Mac with Apple MLX 🍎</title>
      <dc:creator>wellallyTech</dc:creator>
      <pubDate>Tue, 12 May 2026 01:20:00 +0000</pubDate>
      <link>https://dev.to/wellallytech/your-heartbeat-your-privacy-running-fine-tuned-llama-3-on-mac-with-apple-mlx-4h6n</link>
      <guid>https://dev.to/wellallytech/your-heartbeat-your-privacy-running-fine-tuned-llama-3-on-mac-with-apple-mlx-4h6n</guid>
      <description>&lt;p&gt;Data privacy in healthcare isn't just a "nice-to-have" feature; it's a fundamental right. When dealing with sensitive medical data—from heart rate variability to personal diagnostic logs—sending this information to a cloud-based API can feel like a gamble. This is where &lt;strong&gt;Edge AI&lt;/strong&gt; and &lt;strong&gt;Local LLMs&lt;/strong&gt; change the game. By leveraging the power of &lt;strong&gt;Apple Silicon&lt;/strong&gt; and the &lt;strong&gt;Apple MLX framework&lt;/strong&gt;, you can now run production-grade, medically fine-tuned models like &lt;strong&gt;Llama-3-8B&lt;/strong&gt; directly on your MacBook.&lt;/p&gt;

&lt;p&gt;In this tutorial, we will explore how to implement a high-performance local inference pipeline. We’ll focus on using &lt;strong&gt;LoRA (Low-Rank Adaptation)&lt;/strong&gt; for domain-specific medical tasks and utilize the unified memory architecture of M1/M2/M3 chips to achieve lightning-fast response times without a single byte leaving your machine. If you're looking for &lt;strong&gt;Edge AI privacy&lt;/strong&gt; solutions or &lt;strong&gt;Apple MLX optimization&lt;/strong&gt; techniques, you're in the right place.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture: Why MLX?
&lt;/h2&gt;

&lt;p&gt;Traditional AI frameworks like PyTorch or TensorFlow are great, but they aren't optimized for the unified memory architecture of Apple Silicon. &lt;strong&gt;MLX&lt;/strong&gt;, developed by Apple's machine learning research team, allows the CPU and GPU to share the same memory pool, eliminating the bottleneck of copying data between devices.&lt;/p&gt;
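
&lt;p&gt;To make that concrete, here's a minimal, self-contained sketch (not part of the health pipeline itself): the same MLX arrays can be consumed by both the GPU and the CPU, with no explicit transfer step in either direction.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import mlx.core as mx

# Arrays live in unified memory: no .to("cuda")-style copies needed
a = mx.random.normal((4096, 4096))
b = mx.random.normal((4096, 4096))

c = a @ b    # lazily recorded
mx.eval(c)   # materialized on the default device (the GPU)

# The very same buffers can be consumed by the CPU, without a transfer
d = mx.add(a, b, stream=mx.cpu)
mx.eval(d)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;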

&lt;h3&gt;
  
  
  Local Medical AI Flow
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TD
    A[User Input: Heartbeat/Health Data] --&amp;gt; B{Privacy Filter}
    B --&amp;gt;|Stay Local| C[Apple MLX Runtime]
    C --&amp;gt; D[Llama-3-8B Base Model]
    E[Medical LoRA Adapters] --&amp;gt; D
    D --&amp;gt; F[Local Unified Memory - GPU/CPU]
    F --&amp;gt; G[Instant Medical Insight]
    G --&amp;gt; H[Encrypted Local Storage]
    subgraph Apple Silicon Mac
    C
    D
    E
    F
    end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;To follow along, ensure your setup meets these requirements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Hardware&lt;/strong&gt;: A Mac with M1, M2, or M3 chip (16GB RAM recommended).&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Environment&lt;/strong&gt;: Python 3.10+, &lt;code&gt;pip&lt;/code&gt;, and &lt;code&gt;huggingface-cli&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Tech Stack&lt;/strong&gt;: &lt;code&gt;Apple MLX&lt;/code&gt;, &lt;code&gt;Llama-3-8B&lt;/code&gt;, &lt;code&gt;LoRA/QLoRA&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 1: Setting up the MLX Environment 🛠️
&lt;/h2&gt;

&lt;p&gt;First, let's create a dedicated environment and install the necessary libraries. MLX is rapidly evolving, so staying updated is key.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a virtual environment&lt;/span&gt;
python &lt;span class="nt"&gt;-m&lt;/span&gt; venv mlx_env
&lt;span class="nb"&gt;source &lt;/span&gt;mlx_env/bin/activate

&lt;span class="c"&gt;# Install MLX and dependencies&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;mlx-lm huggingface_hub hf_transfer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: Converting and Loading Llama-3
&lt;/h2&gt;

&lt;p&gt;Llama-3-8B is a powerhouse, but to run it efficiently on a Mac, we typically use 4-bit quantization (the same trick QLoRA relies on for fine-tuning). We will load a pre-converted MLX model or convert standard Llama-3 weights ourselves.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mlx_lm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;generate&lt;/span&gt;

&lt;span class="c1"&gt;# Loading the model and tokenizer
# You can use a medical-fine-tuned Llama-3 from Hugging Face
&lt;/span&gt;&lt;span class="n"&gt;model_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mlx-community/Meta-Llama-3-8B-Instruct-4bit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; 
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ Model loaded successfully into Unified Memory!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3: Integrating Medical LoRA Adapters
&lt;/h2&gt;

&lt;p&gt;For "Medical Knowledge," we don't just want a general-purpose model. We want one that understands clinical terminology. We can swap in "adapters" (LoRA) that have been trained on medical datasets like PubMed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# logic to include local adapters if you have them fine-tuned
# MLX-LM makes it easy to apply adapters during generation
&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Interpret this heart rate data: 110bpm at rest, history of hypertension.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;formatted_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;|begin_of_text|&amp;gt;&amp;lt;|start_header_id|&amp;gt;user&amp;lt;|end_header_id|&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;|eot_id|&amp;gt;&amp;lt;|start_header_id|&amp;gt;assistant&amp;lt;|end_header_id|&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;formatted_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;temp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt; &lt;span class="c1"&gt;# Low temperature for medical consistency
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Medical Analysis: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
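
&lt;p&gt;If you've actually fine-tuned adapters (for example with &lt;code&gt;mlx_lm.lora&lt;/code&gt;), &lt;code&gt;load()&lt;/code&gt; accepts an &lt;code&gt;adapter_path&lt;/code&gt; argument. Here's a minimal sketch; the adapter directory is hypothetical:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from mlx_lm import load, generate

# Hypothetical local path to LoRA adapters fine-tuned on medical text
adapter_path = "./adapters/medical_lora"

# load() applies the adapter weights on top of the 4-bit base model
model, tokenizer = load(
    "mlx-community/Meta-Llama-3-8B-Instruct-4bit",
    adapter_path=adapter_path,
)

# Reuse the chat-formatted prompt from the snippet above
response = generate(model, tokenizer, prompt=formatted_prompt, max_tokens=200)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;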



&lt;h2&gt;
  
  
  Advanced Patterns: Going Beyond Local Scripts
&lt;/h2&gt;

&lt;p&gt;Running a script is one thing; building a production-ready health app is another. When you need to scale these edge solutions or integrate them into enterprise workflows, you need to consider state management, model versioning, and secure data orchestration.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🥑 &lt;strong&gt;Pro-Tip&lt;/strong&gt;: For more production-ready examples and advanced patterns regarding local-first AI architectures, check out the deep dives over at &lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;WellAlly Tech Blog&lt;/a&gt;. They provide excellent resources on bridging the gap between experimental notebooks and robust AI infrastructure.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Step 4: Quantization for Efficiency
&lt;/h2&gt;

&lt;p&gt;If you're running on a MacBook Air with 8GB or 16GB of RAM, every bit counts. MLX allows you to quantize models yourself to find the "sweet spot" between accuracy and memory usage.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Example command for 4-bit quantization&lt;/span&gt;
python &lt;span class="nt"&gt;-m&lt;/span&gt; mlx_lm.convert &lt;span class="nt"&gt;--hf-path&lt;/span&gt; meta-llama/Meta-Llama-3-8B &lt;span class="nt"&gt;--q-bits&lt;/span&gt; 4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why This Matters for the Future 🚀
&lt;/h2&gt;

&lt;p&gt;By keeping the data on the &lt;strong&gt;edge&lt;/strong&gt;, we solve three major problems:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Privacy&lt;/strong&gt;: Zero data leaves the device.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Latency&lt;/strong&gt;: No network round-trips to a server in Virginia.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Cost&lt;/strong&gt;: Why pay per-token API fees when your M3 Max can do it for free while you sleep?&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Local LLMs are no longer a pipe dream for Mac users. With Apple MLX and Llama-3, we have the tools to build empathetic, intelligent, and most importantly, &lt;strong&gt;private&lt;/strong&gt; medical assistants. &lt;/p&gt;

&lt;p&gt;What are you planning to build with local AI? Whether it's a private therapist, a heart-health monitor, or a secure document analyzer, the power is now literally in your hands.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Drop a comment below with your thoughts or questions, and don't forget to star the MLX repo!&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;For more technical insights on Edge AI and privacy-first development, visit &lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;wellally.tech/blog&lt;/a&gt;.&lt;/em&gt; 💻✨&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>privacy</category>
      <category>webdev</category>
    </item>
    <item>
      <title>From Messy Med-Notes to Clinical Insights: Building an AI-Powered EMR with FHIR and LlamaIndex 🚀</title>
      <dc:creator>wellallyTech</dc:creator>
      <pubDate>Mon, 11 May 2026 01:20:00 +0000</pubDate>
      <link>https://dev.to/wellallytech/from-messy-med-notes-to-clinical-insights-building-an-ai-powered-emr-with-fhir-and-llamaindex-1ddb</link>
      <guid>https://dev.to/wellallytech/from-messy-med-notes-to-clinical-insights-building-an-ai-powered-emr-with-fhir-and-llamaindex-1ddb</guid>
      <description>&lt;p&gt;Have you ever tried to make sense of a decade's worth of personal medical records? Between the cryptic PDFs, various lab result formats, and scattered doctor's notes, it's a data engineering nightmare. In the world of &lt;strong&gt;Precision Medicine&lt;/strong&gt;, the gap between raw data and actionable insights is huge. &lt;/p&gt;

&lt;p&gt;Today, we’re going to bridge that gap. We'll build a sophisticated &lt;strong&gt;Personal Electronic Medical Record (EMR) Vector Store&lt;/strong&gt; using the &lt;strong&gt;FHIR Standard&lt;/strong&gt;, &lt;strong&gt;LlamaIndex&lt;/strong&gt;, and &lt;strong&gt;Qdrant&lt;/strong&gt;. We aren't just doing basic RAG (Retrieval-Augmented Generation); we’re diving into structured medical data cleaning and &lt;strong&gt;Hybrid Search&lt;/strong&gt; optimization to ensure that when you ask about your "HbA1c levels," the AI doesn't hallucinate a random number.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Pro-Tip&lt;/strong&gt;: Building production-grade healthcare AI requires more than just a &lt;code&gt;VectorStoreIndex&lt;/code&gt;. For advanced medical data patterns and production-ready RAG architectures, I highly recommend checking out the deep dives over at &lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;WellAlly Blog&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Architecture: From FHIR to Embeddings 🏗️
&lt;/h2&gt;

&lt;p&gt;The biggest challenge in medical AI is &lt;strong&gt;interoperability&lt;/strong&gt;. We use the &lt;strong&gt;HL7 FHIR (Fast Healthcare Interoperability Resources)&lt;/strong&gt; standard to ensure our data has a predictable schema before it hits the vector database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TD
    A[Raw Medical Data/PDFs] --&amp;gt; B{FHIR Converter}
    B --&amp;gt;|Structured JSON| C[Data Cleaning &amp;amp; Normalization]
    C --&amp;gt; D[LlamaIndex Ingestion Pipeline]
    D --&amp;gt; E[Embedding Model: MedCPT/OpenAI]
    E --&amp;gt; F[(Qdrant Vector Store)]
    G[User Query] --&amp;gt; H[Hybrid Search Logic]
    H --&amp;gt;|Semantic| F
    H --&amp;gt;|Keyword/Metadata| F
    F --&amp;gt; I[Context-Augmented Response]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 1: Parsing the FHIR Standard 🧬
&lt;/h2&gt;

&lt;p&gt;FHIR organizes data into "Resources" (e.g., &lt;code&gt;Patient&lt;/code&gt;, &lt;code&gt;Observation&lt;/code&gt;, &lt;code&gt;Condition&lt;/code&gt;). Instead of dumping a giant JSON into a vector store, we need to extract the "Human Readable" narrative and the "Coded" clinical values.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fhir.resources.observation&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Observation&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;transform_fhir_to_readable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fhir_json&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Extracts clinical meaning from FHIR Observation resources.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;obs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Observation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse_obj&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fhir_json&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Extracting the 'What' and the 'Value'
&lt;/span&gt;    &lt;span class="n"&gt;test_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;obs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;coding&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;display&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;obs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;valueQuantity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;obs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;valueQuantity&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;N/A&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;unit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;obs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;valueQuantity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unit&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;obs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;valueQuantity&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
    &lt;span class="n"&gt;date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;obs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;effectiveDateTime&lt;/span&gt;

    &lt;span class="c1"&gt;# Create a dense string for the LLM to understand context
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Observation: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;test_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; measured on &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;. Result: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Example usage
&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;resourceType&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Observation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;coding&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;display&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Glucose&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]},&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;valueQuantity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;unit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mg/dL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;transform_fhir_to_readable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
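
&lt;p&gt;Step 2 below assumes a &lt;code&gt;documents&lt;/code&gt; list of LlamaIndex &lt;code&gt;Document&lt;/code&gt; objects. One way to bridge the gap is to wrap each cleaned FHIR string in a &lt;code&gt;Document&lt;/code&gt; with filterable metadata; this is a sketch, and the metadata fields are illustrative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from llama_index.core import Document

# Hypothetical list of raw FHIR Observation dicts from your EMR export
fhir_records = [raw_data]

documents = [
    Document(
        text=transform_fhir_to_readable(rec),
        # Metadata powers Qdrant payload filtering later (e.g., by category)
        metadata={"resource_type": rec["resourceType"], "category": "Lab Results"},
    )
    for rec in fhir_records
]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;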






&lt;h2&gt;
  
  
  Step 2: High-Performance Indexing with Qdrant ⚡
&lt;/h2&gt;

&lt;p&gt;Medical queries require extreme precision. We’ll use &lt;strong&gt;Qdrant&lt;/strong&gt; as our vector database because of its robust support for payload filtering and hybrid search.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.core&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StorageContext&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VectorStoreIndex&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.vector_stores.qdrant&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;QdrantVectorStore&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;qdrant_client&lt;/span&gt;

&lt;span class="c1"&gt;# 1. Initialize Qdrant Client
&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;qdrant_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;QdrantClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;localhost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6333&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 2. Setup Vector Store with LlamaIndex
&lt;/span&gt;&lt;span class="n"&gt;vector_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;QdrantVectorStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;personal_emr&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;storage_context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;StorageContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_defaults&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 3. Ingest cleaned FHIR documents
# (Assuming 'documents' is a list of LlamaIndex Document objects)
&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;VectorStoreIndex&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;storage_context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;storage_context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;show_progress&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 3: Hybrid Search Tuning for Medical Terms 🔍
&lt;/h2&gt;

&lt;p&gt;Pure semantic search (vector distance) often fails on specific medical codes like &lt;code&gt;ICD-10&lt;/code&gt; or &lt;code&gt;LOINC&lt;/code&gt;. If you search for "Type 2 Diabetes," a vector search might return general "health" articles. We need &lt;strong&gt;Hybrid Search&lt;/strong&gt; (Dense Vector + BM25/Sparse Vector).&lt;/p&gt;

&lt;p&gt;In LlamaIndex, we can optimize the retriever:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.core.retrievers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;VectorIndexRetriever&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.core.query_engine&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RetrieverQueryEngine&lt;/span&gt;

&lt;span class="c1"&gt;# Create a retriever with metadata filtering and hybrid search
&lt;/span&gt;&lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;VectorIndexRetriever&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;similarity_top_k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# We can filter by 'category' like 'Lab Results' or 'Medications'
&lt;/span&gt;    &lt;span class="c1"&gt;# filters=MetadataFilters(...) 
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;query_engine&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;RetrieverQueryEngine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_args&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;response_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;compact&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; 
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;query_engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What are my latest blood glucose trends?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
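
&lt;p&gt;To go beyond plain vector retrieval, the Qdrant integration can also maintain a sparse (BM25-style) representation next to the dense vectors. A sketch, assuming the optional &lt;code&gt;fastembed&lt;/code&gt; dependency is installed and a fresh, hypothetically named collection is created with hybrid support enabled from the start:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.vector_stores.qdrant import QdrantVectorStore

# enable_hybrid must be set when the collection is first created
hybrid_store = QdrantVectorStore(
    client=client,
    collection_name="personal_emr_hybrid",
    enable_hybrid=True,
)
storage_context = StorageContext.from_defaults(vector_store=hybrid_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Fuse dense similarity with exact-term matching (ICD-10, LOINC codes)
query_engine = index.as_query_engine(
    vector_store_query_mode="hybrid",
    similarity_top_k=5,
    sparse_top_k=10,
)
print(query_engine.query("Any observations coded with LOINC 4548-4 (HbA1c)?"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The sparse leg catches exact tokens like LOINC codes that dense embeddings tend to blur together.&lt;/p&gt;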






&lt;h2&gt;
  
  
  Why this matters for Precision Medicine 🥑
&lt;/h2&gt;

&lt;p&gt;By transforming unstructured data into a FHIR-compliant vector store, we enable:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Longitudinal Analysis&lt;/strong&gt;: Tracking symptoms over years, not just days.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Cross-Reference Checks&lt;/strong&gt;: Instantly checking if a new prescription conflicts with a condition buried in a note from 2018.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Data Sovereignty&lt;/strong&gt;: You own the vector store; you control your health narrative.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For those looking to scale this into a production environment (handling millions of medical records or implementing HIPAA-compliant RAG), the &lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;WellAlly Blog&lt;/a&gt; offers fantastic resources on advanced prompt engineering and orchestration for healthcare systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion 🏁
&lt;/h2&gt;

&lt;p&gt;We’ve moved past the "PDF-to-Text" basics. By leveraging &lt;strong&gt;FHIR&lt;/strong&gt; for data integrity and &lt;strong&gt;LlamaIndex + Qdrant&lt;/strong&gt; for semantic retrieval, we’ve built the foundation for a truly intelligent personal health assistant. &lt;/p&gt;

&lt;p&gt;Next steps? Try adding an agentic layer that can calculate BMI trends or flag abnormal results automatically! &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's your biggest challenge with medical data? Let's discuss in the comments! 👇&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>rag</category>
      <category>python</category>
      <category>opensource</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>Quantified Self: Building a Real-Time Health Dashboard with RedisTimeSeries, Go, and Grafana 🚀</title>
      <dc:creator>wellallyTech</dc:creator>
      <pubDate>Sun, 10 May 2026 01:30:00 +0000</pubDate>
      <link>https://dev.to/wellallytech/quantified-self-building-a-real-time-health-dashboard-with-redistimeseries-go-and-grafana-39a1</link>
      <guid>https://dev.to/wellallytech/quantified-self-building-a-real-time-health-dashboard-with-redistimeseries-go-and-grafana-39a1</guid>
      <description>&lt;p&gt;Are you tired of being locked into the "Walled Gardens" of big tech? If you own an &lt;strong&gt;Apple Watch&lt;/strong&gt;, a &lt;strong&gt;Garmin&lt;/strong&gt; for your runs, and a &lt;strong&gt;Whoop&lt;/strong&gt; for recovery, you know the struggle: your health data is scattered across three different ecosystems. &lt;/p&gt;

&lt;p&gt;In the world of &lt;strong&gt;Quantified Self&lt;/strong&gt;, data fragmentation is the enemy. To truly understand our physiology, we need a unified view. Today, we’re going to build a high-performance, real-time health monitoring pipeline. We’ll be using &lt;strong&gt;RedisTimeSeries&lt;/strong&gt; for ultra-fast data ingestion, &lt;strong&gt;Go&lt;/strong&gt; for our backend collector, and &lt;strong&gt;Grafana&lt;/strong&gt; to visualize our metrics in a beautiful, geeky dashboard.&lt;/p&gt;

&lt;p&gt;By the end of this guide, you’ll have a production-grade setup to sync and analyze your &lt;strong&gt;wearable data integration&lt;/strong&gt; and &lt;strong&gt;real-time health monitoring&lt;/strong&gt; metrics in one place.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture 🏗️
&lt;/h2&gt;

&lt;p&gt;The goal is to create a seamless flow from various wearable APIs to a centralized time-series database. Here is how the data flows through our system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TD
    A[Wearable Devices: Apple/Garmin/Whoop] --&amp;gt;|Webhook/API| B[Go Ingestion Service]
    B --&amp;gt;|TS.ADD| C[(RedisTimeSeries)]
    C --&amp;gt;|Query| D[Grafana Dashboard]
    D --&amp;gt;|Visualization| E[User]

    subgraph "Infrastructure (Docker)"
    C
    D
    end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Prerequisites 🛠️
&lt;/h2&gt;

&lt;p&gt;Before we dive into the code, ensure you have the following installed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Docker &amp;amp; Docker Compose&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Go&lt;/strong&gt; (1.20+)&lt;/li&gt;
&lt;li&gt;  A basic understanding of &lt;strong&gt;Redis&lt;/strong&gt; commands&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 1: Spin up the Stack with Docker 🐳
&lt;/h2&gt;

&lt;p&gt;We’ll use the &lt;code&gt;redis/redis-stack&lt;/code&gt; image, which comes pre-packaged with the &lt;strong&gt;RedisTimeSeries&lt;/strong&gt; module, alongside a standard Grafana image.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# docker-compose.yml&lt;/span&gt;
&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;3.8'&lt;/span&gt;
&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;redis&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis/redis-stack:latest&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;6379:6379"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8001:8001"&lt;/span&gt; &lt;span class="c1"&gt;# RedisInsight UI&lt;/span&gt;

  &lt;span class="na"&gt;grafana&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;grafana/grafana:latest&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3000:3000"&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;GF_AUTH_ANONYMOUS_ENABLED=true&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;redis&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run &lt;code&gt;docker-compose up -d&lt;/code&gt; to get your infrastructure humming.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Ingesting Data with Go 🐹
&lt;/h2&gt;

&lt;p&gt;We need a lightweight service to receive data (e.g., Heart Rate, HRV, Steps) and push it into Redis. RedisTimeSeries uses the &lt;code&gt;TS.ADD&lt;/code&gt; command, which is perfect for high-frequency wearable data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"context"&lt;/span&gt;
    &lt;span class="s"&gt;"fmt"&lt;/span&gt;
    &lt;span class="s"&gt;"github {dot} com/redis/go-redis/v9"&lt;/span&gt;
    &lt;span class="s"&gt;"time"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Background&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;rdb&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Options&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Addr&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"localhost:6379"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="c"&gt;// Simulate receiving a Heart Rate data point&lt;/span&gt;
    &lt;span class="n"&gt;heartRate&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;72.5&lt;/span&gt;
    &lt;span class="n"&gt;timestamp&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UnixMilli&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;sensorID&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="s"&gt;"apple_watch_series_9"&lt;/span&gt;

    &lt;span class="c"&gt;// TS.ADD key timestamp value [RETENTION retentionPeriod] [LABELS label value...]&lt;/span&gt;
    &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;rdb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Do&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"TS.ADD"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"health:heart_rate:%s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sensorID&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
        &lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;heartRate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="s"&gt;"LABELS"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"heart_rate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sensorID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to add data: %v&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"✅ Data point ingested successfully!"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3: Visualizing in Grafana 📊
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt; Open Grafana at &lt;code&gt;http://localhost:3000&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt; Add a &lt;strong&gt;Data Source&lt;/strong&gt; and search for &lt;strong&gt;Redis&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt; Set the address to &lt;code&gt;redis:6379&lt;/code&gt; (the Docker Compose service name; use &lt;code&gt;localhost:6379&lt;/code&gt; if Grafana runs directly on your host).&lt;/li&gt;
&lt;li&gt; Create a new Dashboard and add a "Time series" panel.&lt;/li&gt;
&lt;li&gt; Use the Query: &lt;code&gt;TS.RANGE health:heart_rate:apple_watch_series_9 - +&lt;/code&gt; (see the read-back sketch below).&lt;/li&gt;
&lt;/ol&gt;
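
&lt;p&gt;If you want to sanity-check the series from code before building panels, here is a minimal read-back sketch using the Python &lt;code&gt;redis&lt;/code&gt; client (the key name matches the ingestion example above):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import redis

rdb = redis.Redis(host="localhost", port=6379)

# TS.RANGE key - +  returns [timestamp, value] pairs for the whole series
points = rdb.execute_command("TS.RANGE", "health:heart_rate:apple_watch_series_9", "-", "+")
for ts, value in points:
    print(ts, float(value))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;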

&lt;h2&gt;
  
  
  Moving to Production: The "Official" Way 🥑
&lt;/h2&gt;

&lt;p&gt;While this setup is great for a weekend project, managing multi-source data at scale requires handling rate limits, data normalization, and OAuth2 flows for different wearable vendors. &lt;/p&gt;

&lt;p&gt;If you're looking for more production-ready patterns, advanced data synchronization logic, or how to handle massive scale in health-tech apps, I highly recommend checking out the technical deep-dives over at the &lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;&lt;strong&gt;WellAlly Tech Blog&lt;/strong&gt;&lt;/a&gt;. They cover the nuances of building resilient health data pipelines that go far beyond basic CRUD operations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion 🏁
&lt;/h2&gt;

&lt;p&gt;You’ve just broken down the walls of your wearable ecosystem! By leveraging &lt;strong&gt;RedisTimeSeries&lt;/strong&gt;, you have a database that can handle thousands of heart rate samples per second with negligible latency. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Next Steps:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Try adding &lt;strong&gt;Garmin Connect API&lt;/strong&gt; integration.&lt;/li&gt;
&lt;li&gt;  Implement &lt;code&gt;TS.CREATERULE&lt;/code&gt; in Redis to automatically downsample your data (e.g., calculating hourly average heart rate); see the sketch after this list.&lt;/li&gt;
&lt;li&gt;  Add alerting in Grafana to notify you if your HRV (a common recovery signal) drops too low!&lt;/li&gt;
&lt;/ul&gt;
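
&lt;p&gt;As a starting point for that downsampling rule, here is a rough sketch (again via the Python &lt;code&gt;redis&lt;/code&gt; client; the destination key name is just a suggestion):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import redis

rdb = redis.Redis(host="localhost", port=6379)

src = "health:heart_rate:apple_watch_series_9"
dst = src + ":1h_avg"

# The destination series must exist before the rule; keep 90 days of hourly averages
rdb.execute_command("TS.CREATE", dst, "RETENTION", 90 * 24 * 3600 * 1000)

# TS.CREATERULE sourceKey destKey AGGREGATION avg bucketDuration(ms)
rdb.execute_command("TS.CREATERULE", src, dst, "AGGREGATION", "avg", 3600 * 1000)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;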

&lt;p&gt;Happy hacking, and stay healthy! 🏃‍♂️💨&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Love this tutorial? Follow me for more "Learning in Public" sessions. Don't forget to star the repo and share your own dashboards in the comments!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>redis</category>
      <category>ai</category>
      <category>webdev</category>
      <category>python</category>
    </item>
    <item>
      <title>Privacy-First AI: How to Run Local Llama-3 on iPhone to Analyze Your HealthKit Data via MLX</title>
      <dc:creator>wellallyTech</dc:creator>
      <pubDate>Sat, 09 May 2026 01:25:00 +0000</pubDate>
      <link>https://dev.to/wellallytech/privacy-first-ai-how-to-run-local-llama-3-on-iphone-to-analyze-your-healthkit-data-via-mlx-15f7</link>
      <guid>https://dev.to/wellallytech/privacy-first-ai-how-to-run-local-llama-3-on-iphone-to-analyze-your-healthkit-data-via-mlx-15f7</guid>
      <description>&lt;p&gt;Privacy is no longer just a "feature"—it is a fundamental requirement, especially when dealing with sensitive medical information. With the rise of &lt;strong&gt;Edge AI&lt;/strong&gt; and &lt;strong&gt;Local LLMs&lt;/strong&gt;, we no longer have to choose between high-level intelligence and data sovereignty. &lt;/p&gt;

&lt;p&gt;In this tutorial, we will explore how to leverage the &lt;strong&gt;MLX framework&lt;/strong&gt; and &lt;strong&gt;Apple Silicon’s Unified Memory Architecture&lt;/strong&gt; to run a quantized &lt;strong&gt;Llama-3&lt;/strong&gt; model directly on an iPhone. By the end of this guide, you’ll be able to perform deep trend analysis on your &lt;strong&gt;HealthKit&lt;/strong&gt; data without a single byte of information ever leaving your device. This is the ultimate synthesis of &lt;strong&gt;Privacy-First Health&lt;/strong&gt; and cutting-edge machine learning. 🚀&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Local Inference?
&lt;/h2&gt;

&lt;p&gt;Sending heart rate variability, sleep cycles, and glucose levels to a cloud-based API (like OpenAI or Anthropic) poses significant security risks. By using &lt;strong&gt;Llama-3&lt;/strong&gt; locally via &lt;strong&gt;MLX&lt;/strong&gt;, we achieve:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;No Network Latency&lt;/strong&gt;: No round-trip to a server.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Total Privacy&lt;/strong&gt;: Raw health data never leaves the device.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Offline Capability&lt;/strong&gt;: Your health insights work in airplane mode.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  The Architecture: From Sensors to Insights
&lt;/h2&gt;

&lt;p&gt;The data flow relies on the tight integration between iOS's native HealthKit and the high-performance MLX-Swift bindings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TD
    A[iPhone Sensors/Apple Watch] --&amp;gt;|Encrypted Data| B(HealthKit Store)
    B --&amp;gt;|Query| C[Swift Application]
    C --&amp;gt;|Context Injection| D[Prompt Builder]
    E[Llama-3 8B Quantized] --&amp;gt;|Loaded via MLX| F[MLX-Swift Engine]
    D --&amp;gt;|Local Inference| F
    F --&amp;gt;|Local Insights| G[User Interface]
    G --&amp;gt;|Feedback| D
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;To follow this advanced guide, you'll need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Device&lt;/strong&gt;: iPhone 15 Pro/Pro Max or any M-series iPad (8GB+ RAM recommended).&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Tools&lt;/strong&gt;: Xcode 15+, Python 3.10 (for model conversion).&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Tech Stack&lt;/strong&gt;: &lt;code&gt;MLX-Swift&lt;/code&gt;, &lt;code&gt;HealthKit&lt;/code&gt;, &lt;code&gt;Llama-3 (4-bit/8-bit quantized)&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Step 1: Accessing Sensitive Health Data
&lt;/h2&gt;

&lt;p&gt;First, we need to request authorization from the user to access their health metrics. In this example, we’ll focus on &lt;strong&gt;Step Count&lt;/strong&gt; and &lt;strong&gt;Sleep Analysis&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;&lt;span class="kd"&gt;import&lt;/span&gt; &lt;span class="kt"&gt;HealthKit&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="kt"&gt;HealthDataManager&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;healthStore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;HKHealthStore&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="kd"&gt;func&lt;/span&gt; &lt;span class="nf"&gt;requestPermissions&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;healthTypes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;Set&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="kt"&gt;HKObjectType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;quantityType&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;forIdentifier&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stepCount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="kt"&gt;HKObjectType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;categoryType&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;forIdentifier&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sleepAnalysis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="n"&gt;healthStore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;requestAuthorization&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;toShare&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;read&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;healthTypes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;success&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;success&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"✅ HealthKit Access Granted"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"❌ Access Denied: &lt;/span&gt;&lt;span class="se"&gt;\(&lt;/span&gt;&lt;span class="kt"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;describing&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 2: Preparing Llama-3 for Apple Silicon
&lt;/h2&gt;

&lt;p&gt;Standard 16-bit weights are too heavy for mobile RAM. We must use &lt;strong&gt;MLX&lt;/strong&gt; to convert and quantize Llama-3 into a format optimized for Apple silicon's unified memory and Metal GPU.&lt;/p&gt;

&lt;p&gt;Run this on your Mac before importing to your Xcode project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install MLX and tools&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;mlx-lm

&lt;span class="c"&gt;# Convert Llama-3 to 4-bit quantization&lt;/span&gt;
python &lt;span class="nt"&gt;-m&lt;/span&gt; mlx_lm.convert &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--hf-path&lt;/span&gt; meta-llama/Meta-Llama-3-8B-Instruct &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-q&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--q-bits&lt;/span&gt; 4 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output-path&lt;/span&gt; ./Llama-3-8B-4bit-MLX
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
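
&lt;p&gt;Before bundling the weights into Xcode, it is worth smoke-testing the quantized model on your Mac. Here is a minimal sketch using the &lt;code&gt;mlx_lm&lt;/code&gt; Python API (the prompt is just an example):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from mlx_lm import load, generate

# Load the quantized weights produced by the conversion step above
model, tokenizer = load("./Llama-3-8B-4bit-MLX")

# Quick sanity check before shipping anything to the device
reply = generate(model, tokenizer, prompt="Summarize: 8h sleep, 12k steps, resting HR 58.", max_tokens=64)
print(reply)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;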






&lt;h2&gt;
  
  
  Step 3: Local Inference with MLX-Swift
&lt;/h2&gt;

&lt;p&gt;Now, let's look at the core logic where we feed the HealthKit data into the local Llama-3 model. We use &lt;code&gt;MLXLLM&lt;/code&gt; to manage the model lifecycle.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;&lt;span class="kd"&gt;import&lt;/span&gt; &lt;span class="kt"&gt;MLX&lt;/span&gt;
&lt;span class="kd"&gt;import&lt;/span&gt; &lt;span class="kt"&gt;MLXLMCommon&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;func&lt;/span&gt; &lt;span class="nf"&gt;generateHealthInsights&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;healthData&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;String&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Load the model from the app bundle&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;modelPath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;Bundle&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;resourceURL&lt;/span&gt;&lt;span class="o"&gt;!.&lt;/span&gt;&lt;span class="nf"&gt;appendingPathComponent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Llama-3-8B-4bit-MLX"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;modelConfiguration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;ModelConfiguration&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;directory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;modelPath&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;try!&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="kt"&gt;LLMModelFactory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shared&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loadContainer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;configuration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;modelConfiguration&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"""
    &amp;lt;|begin_of_text|&amp;gt;&amp;lt;|start_header_id|&amp;gt;system&amp;lt;|end_header_id|&amp;gt;
    You are a private health assistant. Analyze the following user health data and provide 3 actionable insights. 
    Be concise and professional.
    &amp;lt;|eot_id|&amp;gt;&amp;lt;|start_header_id|&amp;gt;user&amp;lt;|end_header_id|&amp;gt;
    Data: &lt;/span&gt;&lt;span class="se"&gt;\(&lt;/span&gt;&lt;span class="n"&gt;healthData&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="s"&gt;
    &amp;lt;|eot_id|&amp;gt;&amp;lt;|start_header_id|&amp;gt;assistant&amp;lt;|end_header_id|&amp;gt;
    """&lt;/span&gt;

    &lt;span class="c1"&gt;// Perform inference locally on GPU&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;try!&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nv"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nv"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;GenerateParameters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Health Analysis: &lt;/span&gt;&lt;span class="se"&gt;\(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The "Official" Way: Advanced Patterns &amp;amp; Optimization
&lt;/h2&gt;

&lt;p&gt;While the implementation above works for a prototype, production-grade local AI requires sophisticated memory management and prompt engineering to avoid "Out of Memory" (OOM) crashes on iOS. &lt;/p&gt;

&lt;p&gt;For a deep dive into &lt;strong&gt;advanced production patterns&lt;/strong&gt;, including &lt;strong&gt;KV-cache optimization&lt;/strong&gt; and &lt;strong&gt;Model Distillation&lt;/strong&gt; for even smaller footprints, check out the expert resources at: &lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;&lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;WellAlly Tech Blog: Production-Ready Edge AI&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;At WellAlly, we explore how to bridge the gap between "it works on my machine" and "it works flawlessly for millions of users." Our research into local-first architectures served as a primary inspiration for the techniques used in this guide. 🥑&lt;/p&gt;




&lt;h2&gt;
  
  
  Performance Considerations 📈
&lt;/h2&gt;

&lt;p&gt;Running Llama-3 8B (4-bit) on an iPhone 15 Pro yields approximately &lt;strong&gt;8-12 tokens per second&lt;/strong&gt;. &lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Cloud (GPT-4o)&lt;/th&gt;
&lt;th&gt;Local (Llama-3 via MLX)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Privacy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Conditional&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Absolute&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Per Token&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Free&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Latency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1s - 5s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~100ms (First Token)&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reliability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Depends on WiFi&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Works Anywhere&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Key Optimizations:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Unified Memory&lt;/strong&gt;: MLX allows the GPU and CPU to share the same memory space, preventing expensive data copies.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Quantization&lt;/strong&gt;: Moving from 16-bit to 4-bit reduces the memory footprint from ~15GB to ~4.5GB, fitting comfortably within the 8GB RAM of modern iPhones (rough math below).&lt;/li&gt;
&lt;/ol&gt;
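
&lt;p&gt;The back-of-the-envelope math behind those numbers (real checkpoints add quantization scales, embeddings, and metadata, so treat this as an estimate):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;params = 8e9  # Llama-3 8B

fp16_gb = params * 2 / 1e9    # 2 bytes per weight -&amp;gt; ~16 GB decimal (~15 GiB)
int4_gb = params * 0.5 / 1e9  # 4 bits per weight  -&amp;gt; ~4 GB
int4_gb *= 1.12               # rough ~12% overhead for scales and metadata

print(f"fp16: ~{fp16_gb:.0f} GB | 4-bit: ~{int4_gb:.1f} GB")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;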

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The future of health tech is local. By combining &lt;strong&gt;HealthKit&lt;/strong&gt;'s rich data ecosystem with the raw power of &lt;strong&gt;MLX&lt;/strong&gt; and &lt;strong&gt;Llama-3&lt;/strong&gt;, we can build applications that are both incredibly smart and unfailingly private.&lt;/p&gt;

&lt;p&gt;Are you ready to move your AI workloads to the edge? Start by experimenting with the MLX-Swift examples and don't forget to share your results with the community!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Questions?&lt;/strong&gt; Drop a comment below or join the discussion over at &lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;WellAlly Tech&lt;/a&gt;! 💻✨&lt;/p&gt;

</description>
      <category>ai</category>
      <category>swift</category>
      <category>ios</category>
      <category>llama3</category>
    </item>
    <item>
      <title>Stop Slouching! Build a Real-Time Spine Posture Corrector with MediaPipe &amp; Electron 🖥️💻</title>
      <dc:creator>wellallyTech</dc:creator>
      <pubDate>Fri, 08 May 2026 01:40:00 +0000</pubDate>
      <link>https://dev.to/wellallytech/stop-slouching-build-a-real-time-spine-posture-corrector-with-mediapipe-electron-4jj</link>
      <guid>https://dev.to/wellallytech/stop-slouching-build-a-real-time-spine-posture-corrector-with-mediapipe-electron-4jj</guid>
      <description>&lt;p&gt;We’ve all been there: deeply focused on a debugging session, only to realize four hours later that our spine is shaped like a question mark. "Tech Neck" is a real productivity killer. But as developers, why buy a posture corrector when we can build one?&lt;/p&gt;

&lt;p&gt;In this tutorial, we are diving deep into &lt;strong&gt;Computer Vision&lt;/strong&gt; and &lt;strong&gt;Real-time Posture Monitoring&lt;/strong&gt; using the power of &lt;strong&gt;MediaPipe&lt;/strong&gt;, &lt;strong&gt;OpenCV&lt;/strong&gt;, and &lt;strong&gt;Electron&lt;/strong&gt;. We will transform your webcam into a smart health assistant that detects slouching and provides personalized stretching advice. By leveraging advanced &lt;strong&gt;landmark detection&lt;/strong&gt;, we can calculate the exact angle of your cervical spine to keep you upright and healthy.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Architecture 🏗️
&lt;/h2&gt;

&lt;p&gt;The system works by capturing a video stream, processing each frame to find human pose landmarks, and calculating the angle between the ear, shoulder, and hip.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TD
    A[Webcam Feed] --&amp;gt; B[OpenCV Pre-processing]
    B --&amp;gt; C[MediaPipe BlazePose Engine]
    C --&amp;gt; D{Landmark Analysis}
    D --&amp;gt;|Bad Posture| E[Trigger Notification]
    D --&amp;gt;|Good Posture| F[Keep Monitoring]
    E --&amp;gt; G[Local LLM / Stretch Logic]
    G --&amp;gt; H[Electron UI Alert]
    H --&amp;gt; A
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Prerequisites 🛠️
&lt;/h2&gt;

&lt;p&gt;To follow along, make sure you have the following installed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MediaPipe&lt;/strong&gt;: For high-fidelity body tracking.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenCV&lt;/strong&gt;: To handle image processing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Electron&lt;/strong&gt;: To package our solution into a desktop app.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Python/Node.js&lt;/strong&gt;: Depending on how you bridge the CV logic to the UI.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Step 1: Detecting Pose Landmarks with MediaPipe
&lt;/h2&gt;

&lt;p&gt;The heart of our application is the &lt;strong&gt;MediaPipe Pose&lt;/strong&gt; model. It provides 33 3D landmarks. For spine correction, we specifically care about landmarks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;11&lt;/code&gt; (Left Shoulder) &amp;amp; &lt;code&gt;12&lt;/code&gt; (Right Shoulder)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;7&lt;/code&gt; (Left Ear) &amp;amp; &lt;code&gt;8&lt;/code&gt; (Right Ear)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;mediapipe&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;mp&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="n"&gt;mp_pose&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;solutions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pose&lt;/span&gt;
&lt;span class="n"&gt;pose&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mp_pose&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Pose&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;min_detection_confidence&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;min_tracking_confidence&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;calculate_angle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Calculates the angle between three points.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# First point (Ear)
&lt;/span&gt;    &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Mid point (Shoulder)
&lt;/span&gt;    &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# End point (Hip/Vertical)
&lt;/span&gt;
    &lt;span class="n"&gt;radians&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;arctan2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;arctan2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;angle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;radians&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mf"&gt;180.0&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pi&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;angle&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;180.0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;angle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;360&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;angle&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;angle&lt;/span&gt;

&lt;span class="c1"&gt;# Example usage in a loop
# angle = calculate_angle(ear_coords, shoulder_coords, [shoulder_coords[0], 0])
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: The Core Logic – Detecting the "Slouch"
&lt;/h2&gt;

&lt;p&gt;The "Forward Head Posture" is usually identified when the angle between your ear and shoulder (relative to the vertical axis) exceeds 20-30 degrees.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Extracting coordinates
&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pose&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pose_landmarks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;landmarks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pose_landmarks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;landmark&lt;/span&gt;

    &lt;span class="c1"&gt;# Get coordinates for the ear and shoulder
&lt;/span&gt;    &lt;span class="n"&gt;ear&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;landmarks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mp_pose&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PoseLandmark&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LEFT_EAR&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
           &lt;span class="n"&gt;landmarks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mp_pose&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PoseLandmark&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LEFT_EAR&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;shoulder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;landmarks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mp_pose&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PoseLandmark&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LEFT_SHOULDER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                &lt;span class="n"&gt;landmarks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mp_pose&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PoseLandmark&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LEFT_SHOULDER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Calculate the angle relative to a vertical line
&lt;/span&gt;    &lt;span class="n"&gt;posture_angle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;calculate_angle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ear&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shoulder&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;shoulder&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;posture_angle&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🚨 Warning: Slouching detected!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3: Wrapping it in Electron ⚛️
&lt;/h2&gt;

&lt;p&gt;To make this useful for daily work, we wrap the Python logic in an &lt;strong&gt;Electron&lt;/strong&gt; wrapper. This allows the app to sit in the system tray and send desktop notifications when your posture slips.&lt;/p&gt;
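
&lt;p&gt;The bridge itself can be as simple as the Python process printing one JSON event per line, which the Electron main process reads from the child's stdout and turns into desktop notifications. A minimal sketch of the Python side (the event shape is my own convention, not a fixed protocol):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json
import sys
import time

def emit_posture_event(angle, slouching):
    """One JSON object per line; Electron parses these from stdout."""
    event = {"ts": time.time(), "angle": round(angle, 1), "slouching": slouching}
    sys.stdout.write(json.dumps(event) + "\n")
    sys.stdout.flush()  # flush so Electron sees events immediately

# Inside the frame loop from Step 2:
# emit_posture_event(posture_angle, posture_angle &amp;gt; 25)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;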

&lt;p&gt;If you're looking for more production-ready examples and advanced architectural patterns for integrating AI with desktop environments, I highly recommend checking out the technical deep-dives at &lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;WellAlly Blog&lt;/a&gt;. They cover excellent strategies for optimizing local model performance which inspired the latency reduction techniques used here! 🥑&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 4: Adding "Smart" Stretch Advice
&lt;/h2&gt;

&lt;p&gt;Instead of a boring "Sit up!" message, we can integrate local LLM logic (like a simple prompt to Ollama) to suggest a specific stretch based on the duration of slouching; a minimal sketch follows the table.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Duration&lt;/th&gt;
&lt;th&gt;Posture State&lt;/th&gt;
&lt;th&gt;Suggestion&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;5 mins&lt;/td&gt;
&lt;td&gt;Mild Slouch&lt;/td&gt;
&lt;td&gt;"Roll your shoulders back."&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15 mins&lt;/td&gt;
&lt;td&gt;Heavy Slouch&lt;/td&gt;
&lt;td&gt;"Stand up and do a doorway stretch."&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;30 mins&lt;/td&gt;
&lt;td&gt;Persistent&lt;/td&gt;
&lt;td&gt;"Time for a 2-minute break!"&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
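
&lt;p&gt;A minimal sketch of that Ollama call (assuming a local Ollama server with the &lt;code&gt;llama3&lt;/code&gt; model already pulled):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import requests

def stretch_advice(minutes_slouching):
    """Ask a local Ollama model for one specific stretch suggestion."""
    prompt = (
        f"I have been slouching at my desk for {minutes_slouching} minutes. "
        "Suggest exactly one short stretch, in one sentence."
    )
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": prompt, "stream": False},
        timeout=30,
    )
    return resp.json()["response"]

# print(stretch_advice(15))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;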




&lt;h2&gt;
  
  
  Conclusion 🚀
&lt;/h2&gt;

&lt;p&gt;Building a posture corrector isn't just a fun weekend project; it's a practical way to apply &lt;strong&gt;Computer Vision&lt;/strong&gt; to improve your daily life. By combining &lt;strong&gt;MediaPipe's&lt;/strong&gt; speed with &lt;strong&gt;Electron's&lt;/strong&gt; accessibility, we've created a tool that saves your back while you write code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's next?&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Add a "Dashboard" to track your posture score over a week.&lt;/li&gt;
&lt;li&gt; Integrate a "Privacy Mode" that blurs the background.&lt;/li&gt;
&lt;li&gt; Check out &lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;wellally.tech/blog&lt;/a&gt; for more advanced tutorials on AI and developer wellness.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Happy coding, and stay upright!&lt;/strong&gt; 🥑✨&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Feel free to drop a comment below if you have questions about the coordinate math or the Electron-Python bridge!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>opensource</category>
      <category>discuss</category>
    </item>
    <item>
      <title>🚀 Data Wrangling Magic: Healing Conflicting Oura Ring &amp; Garmin Time-Series Data with LLMs</title>
      <dc:creator>wellallyTech</dc:creator>
      <pubDate>Thu, 07 May 2026 00:50:00 +0000</pubDate>
      <link>https://dev.to/wellallytech/data-wrangling-magic-healing-conflicting-oura-ring-garmin-time-series-data-with-llms-3c3o</link>
      <guid>https://dev.to/wellallytech/data-wrangling-magic-healing-conflicting-oura-ring-garmin-time-series-data-with-llms-3c3o</guid>
      <description>&lt;p&gt;In the world of &lt;strong&gt;Data Engineering&lt;/strong&gt;, handling &lt;strong&gt;heterogeneous time-series data&lt;/strong&gt; is a recurring nightmare. Whether you are a biohacker trying to optimize your sleep or a data scientist building a health app, syncing data from an &lt;strong&gt;Oura Ring&lt;/strong&gt; and a &lt;strong&gt;Garmin&lt;/strong&gt; watch often results in nasty timestamp overlaps, conflicting heart rate readings, and "ghost" activity logs. &lt;/p&gt;

&lt;p&gt;Standard interpolation works for smooth curves, but what happens when Garmin says you were running at 140 BPM while Oura says you were napping? This is where &lt;strong&gt;LLM-based data cleaning&lt;/strong&gt; enters the chat. In this guide, we'll build a pipeline using &lt;strong&gt;Pandas&lt;/strong&gt;, &lt;strong&gt;Dask&lt;/strong&gt;, and &lt;strong&gt;Instructor&lt;/strong&gt; to automatically resolve data conflicts and fix outliers with the power of GPT-4.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Pro-Tip&lt;/strong&gt;: For more production-ready engineering patterns and advanced health-tech data architectures, definitely check out the deep dives over at the &lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;WellAlly Blog&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🏗️ The Architecture: The "Smart" Cleaning Pipeline
&lt;/h2&gt;

&lt;p&gt;Before we dive into the code, let's visualize how we move from messy, overlapping CSVs to a unified, clean time-series dataset.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TD
    A[Oura Ring Data] --&amp;gt; C(Time-Series Alignment)
    B[Garmin Connect Data] --&amp;gt; C
    C --&amp;gt; D{Conflict Detected?}
    D -- No --&amp;gt; E[Linear Interpolation]
    D -- Yes --&amp;gt; F[Instructor/LLM Agent]
    F --&amp;gt; G[Context-Aware Repair]
    E --&amp;gt; H[Final Merged Dataframe]
    G --&amp;gt; H
    H --&amp;gt; I[Dask for Parallel Processing]
    I --&amp;gt; J[Clean Bio-Data API/Dashboard]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🛠️ The Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Pandas&lt;/strong&gt;: Our bread and butter for data manipulation.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Dask&lt;/strong&gt;: To handle larger-than-memory datasets and parallelize LLM calls.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Instructor&lt;/strong&gt;: A brilliant library that uses &lt;strong&gt;Pydantic&lt;/strong&gt; to force LLMs to return structured data.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Python&lt;/strong&gt;: The glue holding it all together. 🐍&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🛠️ Step 1: Defining the Conflict Schema
&lt;/h2&gt;

&lt;p&gt;Standard cleaning scripts fail because they lack "context." An LLM, however, understands that if your &lt;code&gt;step_count&lt;/code&gt; is 5000 but your &lt;code&gt;heart_rate&lt;/code&gt; is 55, one of those sensors is lying. &lt;/p&gt;

&lt;p&gt;We use &lt;strong&gt;Instructor&lt;/strong&gt; to define a schema that the LLM must follow when resolving conflicts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;instructor&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize Instructor
&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;instructor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_openai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DataRepair&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;resolved_value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(...,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The corrected value for the metric.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;reasoning&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(...,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Explanation of why this device&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s data was chosen or why an average was used.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;is_outlier&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(...,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Whether the original data point was a sensor malfunction.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;resolve_conflict_with_llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;val_oura&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;val_garmin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;response_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DataRepair&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a specialized bio-data engineer.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Conflict in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: Oura says &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;val_oura&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, Garmin says &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;val_garmin&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;. Context: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🕒 Step 2: Time-Series Alignment with Pandas
&lt;/h2&gt;

&lt;p&gt;First, we need to get both datasets onto the same temporal grid. Garmin might log every second, while Oura logs every 5 minutes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;align_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;oura_df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;garmin_df&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Standardize time to UTC
&lt;/span&gt;    &lt;span class="n"&gt;oura_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;oura_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ts&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="n"&gt;dt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tz_localize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;garmin_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;garmin_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ts&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="n"&gt;dt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tz_localize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Reindex to 1-minute intervals and join
&lt;/span&gt;    &lt;span class="n"&gt;merged&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;merge_asof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;oura_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort_values&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;garmin_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort_values&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;direction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;nearest&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;tolerance&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2min&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;merged&lt;/span&gt;

&lt;span class="c1"&gt;# Example of finding a "Conflict"
# If Heart Rate difference &amp;gt; 15 BPM, we flag it for the LLM
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🚀 Step 3: Scaling the "Repair" with Dask
&lt;/h2&gt;

&lt;p&gt;Calling an LLM for every single row is expensive and slow. We use &lt;strong&gt;Dask&lt;/strong&gt; to only send the "conflict rows" to the LLM in parallel.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dask.dataframe&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;dd&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_chunk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Logic to identify conflicts
&lt;/span&gt;    &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;conflict&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hr_oura&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hr_garmin&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;

    &lt;span class="c1"&gt;# Apply LLM resolution only where needed
&lt;/span&gt;    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;repair_row&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;conflict&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;resolve_conflict_with_llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HeartRate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hr_oura&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hr_garmin&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; 
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Activity level: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;activity_type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;resolved_value&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hr_garmin&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# Default to Garmin
&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hr_cleaned&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;repair_row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;

&lt;span class="c1"&gt;# Convert to Dask and compute
# dask_df = dd.from_pandas(merged_df, npartitions=4)
# clean_df = dask_df.map_partitions(process_chunk).compute()
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🥑 Why This Matters: The Biohacking Context
&lt;/h2&gt;

&lt;p&gt;Data cleaning isn't just about deleting &lt;code&gt;NaN&lt;/code&gt; values anymore. In the era of LLMs, we can perform &lt;strong&gt;Semantic Data Cleaning&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;Imagine your Garmin registers a spike in stress because you were watching a horror movie, but your Oura Ring knows your body temperature is normal. A hard-coded script can't distinguish that; an LLM backed by a Pydantic schema can.&lt;/p&gt;
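&lt;p&gt;To make these judgment calls auditable, the resolution schema can capture &lt;em&gt;why&lt;/em&gt; one source was trusted. Here is a minimal sketch of such a schema; the class and field names are illustrative, not necessarily the ones backing &lt;code&gt;resolve_conflict_with_llm&lt;/code&gt; above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from pydantic import BaseModel, Field

# Illustrative sketch: force the LLM to name the trusted sensor and justify it,
# so a Garmin "horror movie" stress spike can be overruled by Oura's normal temp.
class ConflictResolution(BaseModel):
    resolved_value: float = Field(..., description="The value to keep")
    trusted_source: str = Field(..., description="'oura' or 'garmin'")
    reasoning: str = Field(..., description="One-sentence physiological justification")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;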

&lt;p&gt;If you are looking for more "production-ready" examples of how to build these types of agents—specifically for personal health optimization—I highly recommend browsing the technical articles at &lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;wellally.tech/blog&lt;/a&gt;. They cover the intersection of AI, health data, and robust software engineering in much more detail.&lt;/p&gt;




&lt;h2&gt;
  
  
  🎯 Conclusion
&lt;/h2&gt;

&lt;p&gt;By combining the structural power of &lt;strong&gt;Pandas&lt;/strong&gt; with the reasoning capabilities of &lt;strong&gt;LLMs (via Instructor)&lt;/strong&gt;, we transform a messy data-engineering headache into a clean, high-fidelity stream of insights.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's next?&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Try adding a "Confidence Score" to your Pydantic model.&lt;/li&gt;
&lt;li&gt; Use Dask to process years of historical data.&lt;/li&gt;
&lt;li&gt; Let me know in the comments: How do you handle sensor conflicts in your projects? 💬&lt;/li&gt;
&lt;/ol&gt;
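
&lt;p&gt;For the first idea, adding a confidence score is a one-line schema change. A minimal, hypothetical sketch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from pydantic import BaseModel, Field

# Hypothetical extension: bound the score so the LLM cannot return nonsense,
# then route low-confidence repairs to manual review instead of auto-applying.
class ScoredResolution(BaseModel):
    resolved_value: float
    confidence: float = Field(..., ge=0.0, le=1.0, description="0 = wild guess, 1 = certain")

# if res.confidence &lt; 0.6: flag_for_review(row)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;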

&lt;p&gt;&lt;strong&gt;Happy coding!&lt;/strong&gt; 🚀💻&lt;/p&gt;

</description>
      <category>python</category>
      <category>opensource</category>
      <category>machinelearning</category>
      <category>ai</category>
    </item>
    <item>
      <title>From Scans to Structured Data: Converting Medical Reports to JSON with Pydantic &amp; LLMs</title>
      <dc:creator>wellallyTech</dc:creator>
      <pubDate>Wed, 06 May 2026 01:30:00 +0000</pubDate>
      <link>https://dev.to/wellallytech/from-scans-to-structured-data-converting-medical-reports-to-json-with-pydantic-llms-4am7</link>
      <guid>https://dev.to/wellallytech/from-scans-to-structured-data-converting-medical-reports-to-json-with-pydantic-llms-4am7</guid>
      <description>&lt;p&gt;Have you ever looked at a stack of physical medical reports and wished you could just "Ctrl+F" your health history? 📑 We’ve all been there. Every hospital has a different layout, different units, and cryptic abbreviations that make manual data entry a nightmare.&lt;/p&gt;

&lt;p&gt;In the world of &lt;strong&gt;data engineering&lt;/strong&gt;, turning unstructured "messy" documents into &lt;strong&gt;structured data extraction&lt;/strong&gt; pipelines is a superpower. Today, we’re going to build a robust system that uses &lt;strong&gt;Pydantic&lt;/strong&gt;, &lt;strong&gt;Instructor&lt;/strong&gt;, and &lt;strong&gt;Azure Form Recognizer&lt;/strong&gt; to transform scanned medical reports into standardized JSON (following medical standards like LOINC) with 100% type safety. 🚀&lt;/p&gt;

&lt;h2&gt;
  
  
  Why "Prompting" isn't enough
&lt;/h2&gt;

&lt;p&gt;If you just throw OCR text at an LLM and ask for "JSON," it will eventually fail. It might hallucinate a field, change a data type, or forget a closing bracket. To build production-grade health tech, we need &lt;strong&gt;validation&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;By combining &lt;strong&gt;Pydantic&lt;/strong&gt; for schema definition and &lt;strong&gt;Instructor&lt;/strong&gt; for steering the LLM, we ensure that the output isn't just "JSON-like"—it's a strictly typed Python object.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Architecture: From Pixels to Patterns
&lt;/h2&gt;

&lt;p&gt;Here is how the data flows from a blurry JPEG to a clean, queryable database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TD
    A[Scanned Report/Image] --&amp;gt;|OCR Extraction| B[Azure AI Document Intelligence]
    B --&amp;gt;|Raw Text &amp;amp; Tables| C[Instructor + LLM]
    C --&amp;gt;|Schema Enforcement| D[Pydantic Model]
    D --&amp;gt;|Validation Check| E{Is it Valid?}
    E --&amp;gt;|No| C
    E --&amp;gt;|Yes| F[Standardized JSON - LOINC Compatible]
    F --&amp;gt;|Storage| G[PostgreSQL/Vector DB]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 1: Defining the Medical Schema
&lt;/h2&gt;

&lt;p&gt;First, we define exactly what a "Medical Test" looks like. We want to capture the test name, the result, the unit, and that pesky reference range.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MedicalTestResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;test_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(...,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The name of the test, e.g., &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Hemoglobin&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; or &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;HbA1c&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(...,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The numerical result of the test&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(...,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The measurement unit, e.g., &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;g/dL&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; or &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;mmol/L&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;is_normal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(...,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Flag indicating if the result is within the reference range&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;reference_range&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The normal range provided by the lab&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;HealthReport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;patient_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;report_date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;hospital_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;MedicalTestResult&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: Extracting Text with Azure Form Recognizer
&lt;/h2&gt;

&lt;p&gt;Before the LLM can "understand" the report, we need to extract the text. &lt;strong&gt;Azure AI Document Intelligence&lt;/strong&gt; (formerly Form Recognizer) is fantastic at handling tables in scanned PDFs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;azure.ai.formrecognizer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DocumentAnalysisClient&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;azure.core.credentials&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AzureKeyCredential&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_raw_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DocumentAnalysisClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;endpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_AZURE_ENDPOINT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;credential&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;AzureKeyCredential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;poller&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;begin_analyze_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prebuilt-layout&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;poller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;result&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="c1"&gt;# Returns the full text content
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3: The Magic Hook (Instructor + LLM)
&lt;/h2&gt;

&lt;p&gt;This is where the magic happens. Instead of using the raw OpenAI SDK, we use &lt;strong&gt;Instructor&lt;/strong&gt;. It patches the OpenAI client so that it returns a Pydantic object directly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;instructor&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="c1"&gt;# Patch the client to add 'response_model' support
&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;instructor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;patch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;parse_report_with_llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw_text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;HealthReport&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;response_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;HealthReport&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a medical data specialist. Extract data into the specified JSON format. Map common names to LOINC standards where possible.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Extract data from this report: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;raw_text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="c1"&gt;# If validation fails, it will automatically retry!
&lt;/span&gt;    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
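
&lt;p&gt;Putting the two steps together is now just function composition. A hypothetical end-to-end run, assuming the Azure credentials above are configured; the file name is illustrative:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Hypothetical end-to-end run of the two functions defined above.
raw_text = extract_raw_text("lab_report.pdf")
report = parse_report_with_llm(raw_text)

for result in report.results:
    status = "OK" if result.is_normal else "CHECK"
    print(f"[{status}] {result.test_name}: {result.value} {result.unit}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;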






&lt;h2&gt;
  
  
  🥑 Level Up: Advanced Patterns
&lt;/h2&gt;

&lt;p&gt;While this setup works for basic reports, production environments often require handling multi-page documents, redacting PII (Personally Identifiable Information), and mapping values to global standards like &lt;strong&gt;LOINC&lt;/strong&gt; or &lt;strong&gt;SNOMED CT&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For a deeper dive into scaling these pipelines and implementing advanced medical data architectures, I highly recommend checking out the &lt;strong&gt;&lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;WellAlly Tech Blog&lt;/a&gt;&lt;/strong&gt;. They have some incredible resources on high-performance data engineering and production-ready AI patterns that go far beyond a simple tutorial.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;By structuring this data, we move from &lt;strong&gt;"Pictures of Documents"&lt;/strong&gt; to &lt;strong&gt;"Actionable Insights."&lt;/strong&gt; 📈 &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Trend Analysis&lt;/strong&gt;: Plot your glucose levels over 5 years.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Early Detection&lt;/strong&gt;: Use algorithms to spot patterns across different hospitals.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Portability&lt;/strong&gt;: Easily share your data with new doctors without carrying a physical folder.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Structuring messy medical data doesn't have to be a headache. With the &lt;strong&gt;Pydantic + Instructor&lt;/strong&gt; stack, you get the reasoning power of an LLM with the strictness of a compiler. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What are you building next?&lt;/strong&gt; Are you going to automate your lab results or perhaps build a custom health dashboard? Let me know in the comments below! 👇&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Happy coding! If you enjoyed this post, don't forget to ❤️ and 🦄 it!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>pydantic</category>
      <category>openai</category>
    </item>
    <item>
      <title>From Snore to Score: Real-time Sleep Apnea Detection using Whisper v3 and FFT on the Edge 😴🔊</title>
      <dc:creator>wellallyTech</dc:creator>
      <pubDate>Tue, 05 May 2026 00:45:00 +0000</pubDate>
      <link>https://dev.to/wellallytech/from-snore-to-score-real-time-sleep-apnea-detection-using-whisper-v3-and-fft-on-the-edge-k4</link>
      <guid>https://dev.to/wellallytech/from-snore-to-score-real-time-sleep-apnea-detection-using-whisper-v3-and-fft-on-the-edge-k4</guid>
      <description>&lt;p&gt;Have you ever wondered what’s actually happening while you sleep? Beyond the dreams of flying or forgetting your pants at a meeting, your breathing patterns tell a vital story about your health. Traditional sleep studies (polysomnography) involve being strapped to a dozen wires in a cold clinic. But what if we could use &lt;strong&gt;real-time sleep apnea detection&lt;/strong&gt;, &lt;strong&gt;Whisper v3&lt;/strong&gt;, and &lt;strong&gt;Fast Fourier Transform (FFT)&lt;/strong&gt; to turn your smartphone into a clinical-grade monitor?&lt;/p&gt;

&lt;p&gt;In this tutorial, we are building a non-invasive sleep quality analyzer. By combining the physical precision of &lt;strong&gt;audio signal processing&lt;/strong&gt; with the deep learning power of OpenAI's Whisper v3, we can filter out ambient noise (like a whirring fan) and focus specifically on the frequency signatures of snoring and obstructive sleep events. &lt;/p&gt;




&lt;h2&gt;
  
  
  The Architecture: Physics Meets AI 🏗️
&lt;/h2&gt;

&lt;p&gt;The biggest challenge in audio-based health tech is "noise." A car driving by or a blanket rustling can look like a breathing event to a naive model. Our solution uses a dual-stage pipeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;FFT (Fast Fourier Transform)&lt;/strong&gt;: Analyzes the frequency spectrum to identify the "texture" of the sound.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Whisper v3&lt;/strong&gt;: Processes the temporal sequence to identify specific breathing patterns and distinguish between regular snoring and apnea events.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TD
    A[Raw Audio Input - React Native] --&amp;gt; B{FFmpeg Stream}
    B --&amp;gt; C[FFT Analysis - Librosa]
    C --&amp;gt;|High-Freq Noise| D[Filter Out]
    C --&amp;gt;|Low-Freq Snore Signature| E[Whisper v3 Encoder]
    E --&amp;gt; F[Pattern Recognition]
    F --&amp;gt; G[Apnea Event Detection]
    G --&amp;gt; H[React Native Dashboard]
    H --&amp;gt; I[Weekly Health Report]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Prerequisites 🛠️
&lt;/h2&gt;

&lt;p&gt;To follow this build, you'll need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tech Stack&lt;/strong&gt;: Whisper v3 (Large-v3 or Distil-Whisper for edge), Librosa (Python), FFmpeg, and React Native.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment&lt;/strong&gt;: A Python backend (FastAPI/Flask) for the heavy lifting or a specialized ONNX runtime for true edge performance.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Step 1: Extracting the "Signature" with FFT
&lt;/h2&gt;

&lt;p&gt;Before we talk to the AI, we need to see the sound. Snoring usually sits in the &lt;strong&gt;20Hz to 2kHz&lt;/strong&gt; range, with specific harmonic peaks. We use &lt;code&gt;Librosa&lt;/code&gt; to perform a Short-Time Fourier Transform (STFT).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;librosa&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;analyze_snore_density&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;audio_path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Load audio (sampled at 16kHz for Whisper compatibility)
&lt;/span&gt;    &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;librosa&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;audio_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Calculate Short-Time Fourier Transform
&lt;/span&gt;    &lt;span class="n"&gt;stft&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;librosa&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stft&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# Convert to decibels
&lt;/span&gt;    &lt;span class="n"&gt;db_spec&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;librosa&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;amplitude_to_db&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stft&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ref&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Calculate Spectral Centroid to identify "heavy" sounds
&lt;/span&gt;    &lt;span class="n"&gt;centroid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;librosa&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;feature&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;spectral_centroid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# If the energy is concentrated in low frequencies, it's likely a snore/breath
&lt;/span&gt;    &lt;span class="n"&gt;is_breathing_event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;centroid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1500&lt;/span&gt; 
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;is_breathing_event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;db_spec&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
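
&lt;p&gt;Usage is a single call; the file name below is hypothetical. The point of this stage is to act as a cheap gate in front of the expensive model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Hypothetical usage: only escalate low-frequency events to Whisper.
is_breathing_event, db_spec = analyze_snore_density("night_segment_0042.wav")
if is_breathing_event:
    print("Low-frequency signature detected, escalating to Whisper...")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;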






&lt;h2&gt;
  
  
  Step 2: Contextual Analysis with Whisper v3
&lt;/h2&gt;

&lt;p&gt;Whisper isn't just for transcribing podcasts. Its encoder is incredibly robust at understanding audio context. By feeding the filtered audio segments into &lt;strong&gt;Whisper v3&lt;/strong&gt;, we can classify the &lt;em&gt;type&lt;/em&gt; of sound.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AutoModelForSpeechSeq2Seq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AutoProcessor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pipeline&lt;/span&gt;

&lt;span class="n"&gt;device&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cuda:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="c1"&gt;# or "cpu"
&lt;/span&gt;&lt;span class="n"&gt;model_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai/whisper-large-v3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize the pipeline
&lt;/span&gt;&lt;span class="n"&gt;pipe&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;automatic-speech-recognition&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;classify_audio_segment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;audio_data&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# We use Whisper to "transcribe" the environment. 
&lt;/span&gt;    &lt;span class="c1"&gt;# In a specialized health model, we'd use the hidden states.
&lt;/span&gt;    &lt;span class="c1"&gt;# Here, we look for non-speech tokens and patterns.
&lt;/span&gt;    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;audio_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_timestamps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Logic: If Whisper detects long silences followed by gasping sounds
&lt;/span&gt;    &lt;span class="c1"&gt;# (often transcribed as [breathing] or [gasping] tags), 
&lt;/span&gt;    &lt;span class="c1"&gt;# we flag a potential Apnea event.
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
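
&lt;p&gt;The comments above hint at the detection logic without implementing it. One way to operationalize it is to scan the timestamped output of &lt;code&gt;pipe(...)&lt;/code&gt; for long silent gaps between chunks. The sketch below assumes the standard &lt;code&gt;transformers&lt;/code&gt; output shape with per-chunk &lt;code&gt;(start, end)&lt;/code&gt; timestamps; the 10-second threshold is illustrative, not a clinical criterion:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Sketch: flag long gaps between transcribed chunks as candidate apnea events.
# Assumes pipe(audio, return_timestamps=True) returns
# {"text": ..., "chunks": [{"timestamp": (start, end), "text": ...}, ...]}.
def flag_apnea_candidates(result, gap_threshold_s=10.0):
    events = []
    chunks = result.get("chunks", [])
    for prev, curr in zip(chunks, chunks[1:]):
        prev_end = prev["timestamp"][1]
        curr_start = curr["timestamp"][0]
        if prev_end is not None and curr_start is not None and curr_start - prev_end &gt; gap_threshold_s:
            events.append({"start": prev_end, "end": curr_start, "gap_s": curr_start - prev_end})
    return events
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;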






&lt;h2&gt;
  
  
  Step 3: Bridging to the Edge with React Native
&lt;/h2&gt;

&lt;p&gt;On the mobile side, we use &lt;code&gt;react-native-ffmpeg&lt;/code&gt; to downsample the microphone input in real-time before sending it to our analysis engine.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;FFmpegKit&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ffmpeg-kit-react-native&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;processAudioForAnalysis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;inputPath&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;outputPath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;RNFS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;CachesDirectoryPath&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/processed_audio.wav`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// Convert to 16kHz, Mono, PCM 16-bit (Whisper's favorite format)&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;FFmpegKit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`-i &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;inputPath&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; -ar 16000 -ac 1 -c:a pcm_s16le &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;outputPath&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;outputPath&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The "Official" Way to Scale 🚀
&lt;/h2&gt;

&lt;p&gt;Building a prototype is easy, but making it production-ready (handling multiple users, ensuring privacy, and optimizing latency) is where the real challenge lies. &lt;/p&gt;

&lt;p&gt;If you are looking for advanced signal processing patterns, high-performance AI deployment strategies, or more production-ready examples of edge-computing, I highly recommend checking out the &lt;strong&gt;&lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;WellAlly Tech Blog&lt;/a&gt;&lt;/strong&gt;. It's a goldmine for developers looking to bridge the gap between "it works on my machine" and "it works for a million users."&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion: Data-Driven Sleep 💤
&lt;/h2&gt;

&lt;p&gt;By combining &lt;strong&gt;Whisper v3&lt;/strong&gt; and &lt;strong&gt;FFT&lt;/strong&gt;, we move away from simple "noise detection" toward "intelligent audio analysis." This setup allows users to track their health without wearing a single sensor.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaways:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;FFT&lt;/strong&gt; acts as our first-line filter, saving computational power.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Whisper v3&lt;/strong&gt; provides the deep contextual understanding needed to differentiate a cough from a life-threatening apnea event.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Edge Computing&lt;/strong&gt; ensures that sensitive bedroom audio never has to leave the device if configured correctly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Are you ready to build the future of health-tech? Drop a comment below or share your results if you try this stack! 🥑💻&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>reactnative</category>
      <category>whisper</category>
    </item>
    <item>
      <title>Calories from Pixels: Building a Precision Food Tracking Pipeline with GPT-4o Vision &amp; SAM</title>
      <dc:creator>wellallyTech</dc:creator>
      <pubDate>Mon, 04 May 2026 01:30:00 +0000</pubDate>
      <link>https://dev.to/wellallytech/calories-from-pixels-building-a-precision-food-tracking-pipeline-with-gpt-4o-vision-sam-2733</link>
      <guid>https://dev.to/wellallytech/calories-from-pixels-building-a-precision-food-tracking-pipeline-with-gpt-4o-vision-sam-2733</guid>
      <description>&lt;p&gt;We’ve all been there: staring at a delicious plate of &lt;em&gt;Beef Wellington&lt;/em&gt; or a complex &lt;em&gt;Poke Bowl&lt;/em&gt;, wondering exactly how many calories are hiding behind those textures. Manual logging is a chore, and most AI calorie counters fail because they can't distinguish between the food and the plate—or worse, they miss the side of fries entirely. 🍟&lt;/p&gt;

&lt;p&gt;In this guide, we are building a high-precision &lt;strong&gt;Computer Vision&lt;/strong&gt; pipeline. By combining Meta's &lt;strong&gt;Segment Anything Model (SAM)&lt;/strong&gt; for surgical object isolation and &lt;strong&gt;GPT-4o Vision&lt;/strong&gt; for semantic understanding and volume estimation, we’re moving from "guessing" to "calculating." We will use &lt;strong&gt;FastAPI&lt;/strong&gt; to glue it all together and &lt;strong&gt;PostgreSQL&lt;/strong&gt; to persist our nutritional logs.&lt;/p&gt;

&lt;p&gt;If you are looking to master &lt;strong&gt;Food Calorie Estimation&lt;/strong&gt; using cutting-edge &lt;strong&gt;GPT-4o Vision&lt;/strong&gt; and &lt;strong&gt;SAM&lt;/strong&gt; workflows, you’re in the right place! 🥑&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture 🏗️
&lt;/h2&gt;

&lt;p&gt;The secret sauce here is &lt;strong&gt;preprocessing&lt;/strong&gt;. Instead of feeding a messy, high-resolution photo directly to the LLM, we use SAM to generate masks. This tells the AI exactly &lt;em&gt;what&lt;/em&gt; to look at, significantly improving the accuracy of volume and macro estimation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TD
    A[User App / Image Upload] --&amp;gt;|POST /analyze| B(FastAPI Backend)
    B --&amp;gt; C{SAM Module}
    C --&amp;gt;|Identify Objects| D[Generate Bounding Boxes &amp;amp; Masks]
    D --&amp;gt; E[Crop &amp;amp; Process Segments]
    E --&amp;gt; F[GPT-4o Vision API]
    F --&amp;gt;|Reasoning: Type, Mass, Calories| G[Pydantic Validation]
    G --&amp;gt; H[(PostgreSQL Storage)]
    H --&amp;gt; I[Response: Caloric Breakdown]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;To follow along, you'll need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Python 3.10+&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;OpenAI API Key&lt;/strong&gt; (for GPT-4o)&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Segment Anything (SAM) Weights&lt;/strong&gt; (ViT-H or ViT-L)&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;FastAPI &amp;amp; SQLAlchemy&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 1: Isolating Food with SAM
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Segment Anything Model (SAM)&lt;/strong&gt; allows us to generate high-quality masks for any object in an image. By isolating the food items, we reduce "background noise" (like the table or napkins) that often confuses vision models.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;segment_anything&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sam_model_registry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SamPredictor&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FoodSegmenter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;checkpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sam_vit_h_4b8939.pth&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cuda&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cuda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_available&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cpu&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sam&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sam_model_registry&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vit_h&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;checkpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;checkpoint&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sam&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predictor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SamPredictor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sam&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_masks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image_array&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predictor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_array&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# For simplicity, we use automatic mask generation or center-point prompting
&lt;/span&gt;        &lt;span class="c1"&gt;# Here we assume we've identified the main dish areas
&lt;/span&gt;        &lt;span class="n"&gt;masks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;logits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predictor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;point_coords&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([[&lt;/span&gt;&lt;span class="n"&gt;image_array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image_array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]]),&lt;/span&gt;
            &lt;span class="n"&gt;point_labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
            &lt;span class="n"&gt;multimask_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;masks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
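
&lt;p&gt;The architecture diagram calls for cropping segments before the vision call, which &lt;code&gt;get_masks&lt;/code&gt; alone doesn't do. A minimal sketch of that glue, assuming the returned mask is a boolean &lt;code&gt;(H, W)&lt;/code&gt; array:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import numpy as np

# Sketch: zero out everything outside the best SAM mask so the vision model
# only "sees" the food. Assumes mask is a boolean (H, W) array from get_masks().
def apply_mask(image_array: np.ndarray, mask: np.ndarray) -&gt; np.ndarray:
    isolated = image_array.copy()
    isolated[~mask] = 0  # black out background pixels
    return isolated
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;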



&lt;h2&gt;
  
  
  Step 2: The GPT-4o Vision Brain 🧠
&lt;/h2&gt;

&lt;p&gt;Once we have our isolated food item, we send it to &lt;strong&gt;GPT-4o&lt;/strong&gt;. We use a specific prompt designed to force the model to think about &lt;em&gt;density&lt;/em&gt; and &lt;em&gt;volume&lt;/em&gt; relative to standard objects (like the plate size).&lt;/p&gt;

&lt;h3&gt;
  
  
  Defining the Schema
&lt;/h3&gt;

&lt;p&gt;Using Pydantic ensures our AI output is structured and ready for our database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FoodItem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;estimated_weight_g&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;
    &lt;span class="n"&gt;calories&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;protein_g&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;
    &lt;span class="n"&gt;carbs_g&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;
    &lt;span class="n"&gt;fat_g&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;
    &lt;span class="n"&gt;confidence_score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;NutritionAnalysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FoodItem&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;total_calories&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;health_score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The API Call
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;analyze_food_vision&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_base64&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analyze this food item. Estimate volume in cm3, then weight in grams based on density. Provide a JSON response for calories, protein, fat, and carbs.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data:image/jpeg;base64,&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;image_base64&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;
                &lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;response_format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json_object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
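
&lt;p&gt;Note that this call uses raw JSON mode, so nothing yet guarantees the output matches the schema from the previous step. A hypothetical validation pass closes that gap with Pydantic v2's &lt;code&gt;model_validate_json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Hypothetical glue: never trust the raw string, parse it against the schema.
# Raises pydantic.ValidationError if the LLM drifted from the contract.
async def analyze_and_validate(image_base64: str) -&gt; NutritionAnalysis:
    raw_json = await analyze_food_vision(image_base64)
    return NutritionAnalysis.model_validate_json(raw_json)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;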



&lt;h2&gt;
  
  
  Advanced Patterns &amp;amp; Production Scaling 🚀
&lt;/h2&gt;

&lt;p&gt;Building a prototype is easy, but making it production-ready is where the real challenge lies. You need to handle rate limiting, image compression, and model fallbacks. &lt;/p&gt;

&lt;p&gt;For a deeper dive into &lt;strong&gt;Production AI Architectures&lt;/strong&gt; and more robust implementations of multimodal pipelines, I highly recommend checking out the technical breakdowns at &lt;strong&gt;&lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;WellAlly Tech Blog&lt;/a&gt;&lt;/strong&gt;. They cover advanced patterns for scaling FastAPI backends and optimizing LLM latency that are crucial for high-traffic health apps.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Integrating the Pipeline with FastAPI
&lt;/h2&gt;

&lt;p&gt;Now we wrap everything into a clean endpoint.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;UploadFile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;File&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/analyze-meal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;analyze_meal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;UploadFile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;File&lt;/span&gt;&lt;span class="p"&gt;(...)):&lt;/span&gt;
    &lt;span class="c1"&gt;# 1. Load Image
&lt;/span&gt;    &lt;span class="n"&gt;contents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;nparr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;frombuffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uint8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;imdecode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nparr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IMREAD_COLOR&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 2. Run SAM (Optional Pre-processing)
&lt;/span&gt;    &lt;span class="c1"&gt;# segmenter = FoodSegmenter()
&lt;/span&gt;    &lt;span class="c1"&gt;# mask = segmenter.get_masks(img)
&lt;/span&gt;
    &lt;span class="c1"&gt;# 3. Call GPT-4o Vision
&lt;/span&gt;    &lt;span class="c1"&gt;# (Assume image conversion to base64 here)
&lt;/span&gt;    &lt;span class="n"&gt;analysis_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;analyze_food_vision&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encoded_image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 4. Store in PostgreSQL
&lt;/span&gt;    &lt;span class="c1"&gt;# db.save(analysis_result)
&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;success&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;analysis_result&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
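
&lt;p&gt;To sanity-check the endpoint before wiring up a frontend, FastAPI's &lt;code&gt;TestClient&lt;/code&gt; is enough. A minimal smoke-test sketch, assuming the app above lives in &lt;code&gt;main.py&lt;/code&gt; and a &lt;code&gt;meal.jpg&lt;/code&gt; sits next to it (both file names are placeholders):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Smoke test for the /analyze-meal endpoint defined above.
# main.py and meal.jpg are assumed names, not part of the pipeline itself.
from fastapi.testclient import TestClient

from main import app

client = TestClient(app)

def test_analyze_meal():
    with open("meal.jpg", "rb") as f:
        response = client.post(
            "/analyze-meal",
            files={"file": ("meal.jpg", f, "image/jpeg")},
        )
    assert response.status_code == 200
    assert response.json()["status"] == "success"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;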



&lt;h2&gt;
  
  
  Conclusion: The Future of Visual Dietetics 🍎
&lt;/h2&gt;

&lt;p&gt;By combining &lt;strong&gt;SAM&lt;/strong&gt;'s spatial awareness with &lt;strong&gt;GPT-4o&lt;/strong&gt;'s world knowledge, we've created a system that doesn't just "see" food—it understands it. This pipeline can be extended to recognize kitchen utensils for scale reference or even detect degrees of "doneness" to adjust caloric density.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaways:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;SAM&lt;/strong&gt; is essential for precision; it prevents the LLM from hallucinating calories based on the tablecloth.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Structured Outputs&lt;/strong&gt; (JSON mode) are non-negotiable for building real applications.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;FastAPI&lt;/strong&gt; provides the asynchronous speed needed for a smooth user experience.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Are you building something in the Vision AI space? Drop a comment below or share your results! And don't forget to visit &lt;strong&gt;&lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;WellAlly Tech&lt;/a&gt;&lt;/strong&gt; for more advanced engineering guides. Happy coding! 💻🚀&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>react</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Unifying Your Health Data: Building a High-Frequency ETL Pipeline for Apple HealthKit and Google Health Connect 🏃‍♂️📊</title>
      <dc:creator>wellallyTech</dc:creator>
      <pubDate>Sun, 03 May 2026 01:25:00 +0000</pubDate>
      <link>https://dev.to/wellallytech/unifying-your-health-data-building-a-high-frequency-etl-pipeline-for-apple-healthkit-and-google-214a</link>
      <guid>https://dev.to/wellallytech/unifying-your-health-data-building-a-high-frequency-etl-pipeline-for-apple-healthkit-and-google-214a</guid>
      <description>&lt;p&gt;Ever wondered why your Apple Watch says you're a marathon runner while Google Fit thinks you're a couch potato? Building a &lt;strong&gt;Quantified Self Data Lake&lt;/strong&gt; is the ultimate dream for data nerds, but syncing &lt;strong&gt;Apple HealthKit&lt;/strong&gt; and &lt;strong&gt;Google Health Connect&lt;/strong&gt; involves more than just a simple API call. Between mismatched sampling frequencies and the eternal nightmare of &lt;strong&gt;timezone conversions&lt;/strong&gt;, creating a reliable &lt;strong&gt;high-frequency ETL pipeline&lt;/strong&gt; is a true test of your &lt;strong&gt;Data Engineering&lt;/strong&gt; mettle.&lt;/p&gt;

&lt;p&gt;In this tutorial, we are going to architect a robust system that pulls raw telemetry from mobile SDKs, processes it through &lt;strong&gt;Apache Hop&lt;/strong&gt;, and stores it in a structured &lt;strong&gt;PostgreSQL&lt;/strong&gt; data lake. We'll solve the "heterogeneous data" problem and ensure your heart rate variability (HRV) and step counts are synchronized with millisecond precision. 🚀&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture: From Pulse to Postgres
&lt;/h2&gt;

&lt;p&gt;To handle the high-frequency nature of health data (which can generate thousands of rows per hour), we need a decoupled architecture. We use a Python-based middleware to bridge the mobile SDKs and an orchestration layer to handle the heavy lifting.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TD
    A[Apple HealthKit SDK] --&amp;gt;|JSON Stream| B(Python FastAPI Middleware)
    C[Google Health Connect] --&amp;gt;|Batch Export| B
    B --&amp;gt;|Raw Storage| D[(PostgreSQL Staging)]

    subgraph ETL Orchestration
    E[Apache Hop] --&amp;gt;|Extract &amp;amp; Normalize| D
    E --&amp;gt;|Transform Timezones| F{Data Validator}
    F --&amp;gt;|Load| G[(Quantified Self Data Lake)]
    end

    G --&amp;gt; H[Grafana / Superset Visualization]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Prerequisites 🛠️
&lt;/h2&gt;

&lt;p&gt;Before we dive into the code, ensure you have the following in your toolkit:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Python 3.10+&lt;/strong&gt;: For our middleware and data validation.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;PostgreSQL&lt;/strong&gt;: Our destination data lake.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Apache Hop&lt;/strong&gt;: The successor to Kettle/PDI for visual ETL orchestration.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;HealthKit/Health Connect SDKs&lt;/strong&gt;: Configured in your mobile project.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 1: Handling Heterogeneous Data with Pydantic
&lt;/h2&gt;

&lt;p&gt;The biggest hurdle is that Apple and Google represent data differently. Apple uses "Samples," while Google often aggregates into "Records." We'll use &lt;strong&gt;Pydantic&lt;/strong&gt; to enforce a unified schema before the data even hits our staging area.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Union&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;HealthMetric&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;source_device&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="c1"&gt;# 'apple_watch' or 'pixel_watch'
&lt;/span&gt;    &lt;span class="n"&gt;metric_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;   &lt;span class="c1"&gt;# 'step_count', 'heart_rate', 'active_calories'
&lt;/span&gt;    &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;
    &lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
    &lt;span class="n"&gt;end_time&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
    &lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;UTC&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

&lt;span class="c1"&gt;# Example: Validating a high-frequency Heart Rate sample
&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source_device&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;apple_watch_s8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;metric_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;heart_rate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;72.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;unit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;count/min&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;start_time&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2023-10-27T10:00:00Z&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;end_time&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2023-10-27T10:00:00Z&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timezone&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;America/New_York&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;metric&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HealthMetric&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Validated: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metric_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; at &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
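
&lt;p&gt;With the schema in place, the middleware endpoint itself stays thin: FastAPI validates every incoming record against &lt;code&gt;HealthMetric&lt;/code&gt; before your code even runs. A minimal sketch; the &lt;code&gt;/ingest&lt;/code&gt; route and the &lt;code&gt;save_to_staging&lt;/code&gt; helper are placeholder names of ours, not a fixed API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Hypothetical ingestion endpoint: mobile SDKs POST batches of samples here.
# Assumes the HealthMetric model above is defined in (or imported into) this module.
from typing import List

from fastapi import FastAPI

app = FastAPI()

@app.post("/ingest")
async def ingest_metrics(metrics: List[HealthMetric]):
    # Anything that reaches this line already passed Pydantic validation,
    # so malformed Apple/Google payloads were rejected upstream with a 422.
    # save_to_staging(metrics)  # bulk INSERT into the staging table (Step 2)
    return {"accepted": len(metrics)}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;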



&lt;h2&gt;
  
  
  Step 2: Designing the Data Lake Schema
&lt;/h2&gt;

&lt;p&gt;We need a schema that supports "Upsert" operations. Why? Because mobile devices often resync old data when they reconnect to Wi-Fi. We don't want duplicate steps (as much as we'd like the extra credit! 😅).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;health_metrics_raw&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;gen_random_uuid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;external_id&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;UNIQUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;-- Hash of (source + metric + start_time)&lt;/span&gt;
    &lt;span class="n"&gt;source_device&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;metric_type&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="nb"&gt;NUMERIC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;unit&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;ts_start&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="nb"&gt;TIME&lt;/span&gt; &lt;span class="k"&gt;ZONE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ts_end&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="nb"&gt;TIME&lt;/span&gt; &lt;span class="k"&gt;ZONE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_metric_time&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;health_metrics_raw&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metric_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_start&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
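
&lt;p&gt;The &lt;code&gt;UNIQUE&lt;/code&gt; constraint on &lt;code&gt;external_id&lt;/code&gt; is what makes the load idempotent: a re-synced sample updates in place instead of duplicating. A minimal upsert sketch using &lt;code&gt;psycopg2&lt;/code&gt; (connection setup elided, and the helper name is ours):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Idempotent load: ON CONFLICT turns a duplicate insert into an update.
UPSERT_SQL = """
INSERT INTO health_metrics_raw
    (external_id, source_device, metric_type, value, unit,
     ts_start, ts_end, timezone)
VALUES (%s, %s, %s, %s, %s, %s, %s, %s)
ON CONFLICT (external_id) DO UPDATE
SET value = EXCLUDED.value,
    ts_end = EXCLUDED.ts_end;
"""

def upsert_metric(conn, row: tuple) -&gt; None:
    # conn is an open psycopg2 connection to the staging database
    with conn.cursor() as cur:
        cur.execute(UPSERT_SQL, row)
    conn.commit()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;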



&lt;h2&gt;
  
  
  Step 3: Orchestration with Apache Hop
&lt;/h2&gt;

&lt;p&gt;While Python is great for ingestion, &lt;strong&gt;Apache Hop&lt;/strong&gt; shines in complex ETL logic. We create a pipeline that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Reads&lt;/strong&gt; from the PostgreSQL staging table.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Standardizes Units&lt;/strong&gt;: Converts everything to a single canonical unit per metric (e.g., all energy to kilocalories).&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Timezone Normalization&lt;/strong&gt;: Uses the &lt;code&gt;timezone&lt;/code&gt; field to convert all &lt;code&gt;ts_start&lt;/code&gt; to a localized user timeline and a universal UTC timeline.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Deduplication&lt;/strong&gt;: Uses a "Unique Rows" transform based on the &lt;code&gt;external_id&lt;/code&gt; (one way to derive it is sketched right after this list).&lt;/li&gt;
&lt;/ol&gt;
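
&lt;p&gt;The dedup key itself is cheap to compute on the middleware side. A sketch matching the &lt;code&gt;Hash of (source + metric + start_time)&lt;/code&gt; comment in the schema; the delimiter and hash algorithm here are our choices, not mandated:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Deterministic external_id: identical samples always hash to the same key,
# so a re-sync collides on the UNIQUE constraint instead of duplicating rows.
import hashlib

def make_external_id(source: str, metric: str, start_iso: str) -&gt; str:
    raw = f"{source}|{metric}|{start_iso}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

# Example:
# make_external_id("apple_watch_s8", "heart_rate", "2023-10-27T10:00:00Z")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;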

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Pro Tip:&lt;/strong&gt; For advanced data engineering patterns and more production-ready examples of high-throughput pipelines, check out the deep-dive articles at &lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;WellAlly Tech Blog&lt;/a&gt;. They have fantastic resources on scaling PostgreSQL for time-series data! 🥑&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Step 4: The Timezone Trap 🕰️
&lt;/h2&gt;

&lt;p&gt;When you fly from New York to Tokyo, your "daily steps" become a mess. If you store data in UTC, your 10k steps might appear split across two days. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; Always store the &lt;strong&gt;IANA timezone&lt;/strong&gt; alongside the UTC timestamps, then derive a &lt;strong&gt;local day&lt;/strong&gt; column. One caveat: PostgreSQL rejects &lt;code&gt;AT TIME ZONE&lt;/code&gt; inside a &lt;code&gt;GENERATED&lt;/code&gt; column because the expression is not immutable, so populate the column during the ETL load instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;ALTER TABLE health_metrics_raw ADD COLUMN local_day DATE;

-- Run as part of each ETL load; AT TIME ZONE is only STABLE,
-- so it cannot appear in a GENERATED ALWAYS expression.
UPDATE health_metrics_raw
SET local_day = (ts_start AT TIME ZONE timezone)::DATE
WHERE local_day IS NULL;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This allows you to query "Steps per day" based on where you physically were at that moment.&lt;/p&gt;
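<p></p>
&lt;p&gt;As a quick sketch, reusing the &lt;code&gt;psycopg2&lt;/code&gt; connection from the upsert example (an assumption of ours, not a requirement):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# "Steps per day, where I actually was": group on the precomputed local_day.
DAILY_STEPS_SQL = """
SELECT local_day, SUM(value) AS steps
FROM health_metrics_raw
WHERE metric_type = 'step_count'
GROUP BY local_day
ORDER BY local_day;
"""

def daily_steps(conn):
    with conn.cursor() as cur:
        cur.execute(DAILY_STEPS_SQL)
        return cur.fetchall()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;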

&lt;h2&gt;
  
  
  Conclusion: Data-Driven Wellness
&lt;/h2&gt;

&lt;p&gt;By building this pipeline, you've moved from "guessing" your health to "engineering" your wellness. You now have a unified, deduplicated, and timezone-aware data lake ready for analysis or even for training your own ML models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's next?&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Connect &lt;strong&gt;Grafana&lt;/strong&gt; to your Postgres instance for real-time dashboards.&lt;/li&gt;
&lt;li&gt; Implement &lt;strong&gt;Anomaly Detection&lt;/strong&gt; to alert you when your resting heart rate spikes (a naive starting point is sketched after this list).&lt;/li&gt;
&lt;/ol&gt;
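
&lt;p&gt;For the anomaly-detection idea, a rolling z-score over the data lake is a reasonable first pass. A sketch using &lt;code&gt;pandas&lt;/code&gt;; the 30-day window and 3-sigma threshold are illustrative assumptions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Flag days where resting heart rate sits more than 3 rolling standard
# deviations above the trailing 30-day mean. Thresholds are illustrative.
import pandas as pd

def flag_hr_spikes(df: pd.DataFrame) -&gt; pd.DataFrame:
    # df: columns ['local_day', 'resting_hr'] pulled from the data lake
    df = df.sort_values("local_day").copy()
    rolling = df["resting_hr"].rolling(window=30, min_periods=7)
    df["zscore"] = (df["resting_hr"] - rolling.mean()) / rolling.std()
    return df[df["zscore"] &gt; 3]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;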

&lt;p&gt;Did you run into issues with the HealthKit background delivery? Or maybe Google's OAuth 2.0 flow is giving you a headache? Let’s chat in the comments below! 👇&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you enjoyed this build, don't forget to follow for more "Learning in Public" tutorials and visit &lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;wellally.tech/blog&lt;/a&gt; for the full source code of this pipeline!&lt;/em&gt; 🚀💻&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>ai</category>
      <category>python</category>
      <category>react</category>
    </item>
    <item>
      <title>Beyond Simple Image Recognition: Building a Precise AI Nutritionist with GPT-4o and Segment Anything (SAM)</title>
      <dc:creator>wellallyTech</dc:creator>
      <pubDate>Sat, 02 May 2026 01:20:00 +0000</pubDate>
      <link>https://dev.to/wellallytech/beyond-simple-image-recognition-building-a-precise-ai-nutritionist-with-gpt-4o-and-segment-29ml</link>
      <guid>https://dev.to/wellallytech/beyond-simple-image-recognition-building-a-precise-ai-nutritionist-with-gpt-4o-and-segment-29ml</guid>
      <description>&lt;p&gt;We've all been there: you take a photo of your lunch with a generic calorie-tracking app, and it tells you your 500-gram lasagna is a "medium slice of cake." 🤦‍♂️ The struggle with &lt;strong&gt;AI nutrition tracking&lt;/strong&gt; isn't just identifying the food; it's the spatial awareness—understanding volume, portion size, and the hidden ingredients in complex dishes.&lt;/p&gt;

&lt;p&gt;In this tutorial, we are leveling up. We are building a sophisticated &lt;strong&gt;Visual RAG (Retrieval-Augmented Generation)&lt;/strong&gt; pipeline. By combining the semantic power of &lt;strong&gt;GPT-4o Vision&lt;/strong&gt; with the surgical precision of Meta's &lt;strong&gt;Segment Anything Model (SAM)&lt;/strong&gt;, we can isolate individual ingredients and cross-reference them with a nutritional database to provide professional-grade calorie and macronutrient auditing. If you are looking for production-ready patterns for AI vision systems, be sure to check out the deep dives over at &lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;WellAlly Tech Blog&lt;/a&gt;, where we explore high-performance AI architectures.&lt;/p&gt;




&lt;h2&gt;
  
  
  🏗️ The Architecture: Precision Vision Pipeline
&lt;/h2&gt;

&lt;p&gt;Standard vision models often treat an image as a single "bag of pixels." Our pipeline treats it as a structured scene. We use SAM to generate precise masks, calculate the relative area of food items, and then feed those high-context crops to GPT-4o for final reasoning.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TD
    A[User Uploads Meal Photo] --&amp;gt; B{SAM Engine}
    B --&amp;gt;|Segment| C[Isolated Food Masks]
    B --&amp;gt;|Calculate| D[Relative Volume/Area]
    C --&amp;gt; E[GPT-4o Vision Analysis]
    D --&amp;gt; E
    E --&amp;gt; F[Semantic Food Tags]
    F --&amp;gt; G[PostgreSQL Nutrition DB]
    G --&amp;gt; H[Final Nutrient Report]
    H --&amp;gt; I[User Feedback Loop]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🛠️ The Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;GPT-4o&lt;/strong&gt;: Our "Reasoning Engine" for identifying complex food types and textures.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;SAM (Segment Anything Model)&lt;/strong&gt;: To precisely delineate where one food item ends and another begins.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;FastAPI&lt;/strong&gt;: For the high-performance asynchronous API layer.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;PostgreSQL&lt;/strong&gt;: Storing our ground-truth nutritional data for RAG.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  👨‍💻 Step 1: Defining the Structured Output
&lt;/h2&gt;

&lt;p&gt;To ensure our pipeline is reliable, we need &lt;strong&gt;GPT-4o&lt;/strong&gt; to return structured data. We’ll use Pydantic to define what a "Meal Analysis" looks like.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FoodItem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(...,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Name of the food item&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;estimated_weight_grams&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(...,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Estimated weight based on volume&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(...,&lt;/span&gt; &lt;span class="n"&gt;ge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;le&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ingredients&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MealReport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FoodItem&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;total_calories&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;macros&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;default_factory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;protein&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;carbs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
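
&lt;p&gt;Once the vision call in Step 2 returns, its JSON reply validates straight into this schema. A minimal sketch, assuming Pydantic v2 (&lt;code&gt;model_validate_json&lt;/code&gt;; on v1, use &lt;code&gt;parse_raw&lt;/code&gt; instead):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Validate the model's JSON reply against the MealReport schema above.
from pydantic import ValidationError

def parse_meal_report(raw_json: str) -&gt; MealReport | None:
    # MealReport is the schema defined in Step 1
    try:
        return MealReport.model_validate_json(raw_json)
    except ValidationError as err:
        # Malformed output: retry the call or fall back to manual entry
        print(f"Could not parse meal report: {err}")
        return None
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;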






&lt;h2&gt;
  
  
  🧠 Step 2: The SAM + GPT-4o Synergy
&lt;/h2&gt;

&lt;p&gt;The magic happens when we don't just send a raw photo. We send the photo plus the coordinates/masks generated by SAM. This helps GPT-4o "focus" its attention on specific regions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;UploadFile&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/analyze-meal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;analyze_meal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;UploadFile&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# 1. Process image with SAM (Pseudo-code for the segmentation step)
&lt;/span&gt;    &lt;span class="c1"&gt;# masks, scores = sam_model.predict(image)
&lt;/span&gt;
    &lt;span class="c1"&gt;# 2. Extract metadata and prepare for GPT-4o
&lt;/span&gt;    &lt;span class="n"&gt;image_bytes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a professional nutritionist. Analyze the image and segmented areas to provide a precise nutrient breakdown.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analyze this meal. Note that I have segmented the main protein from the side carbs.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data:image/jpeg;base64,&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;encode_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_bytes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;
                &lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;response_format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json_object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
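
&lt;p&gt;Where do the portion estimates come from? The relative pixel area of each SAM mask is a workable first proxy for weight. A rough sketch; the &lt;code&gt;reference_grams&lt;/code&gt; full-plate calibration constant is entirely our assumption:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Rough portion sizing: each boolean SAM mask's pixel count, relative to the
# whole plate, scaled by an assumed full-plate weight (reference_grams).
import numpy as np

def estimate_weights(masks: list[np.ndarray], reference_grams: float = 400.0) -&gt; list[float]:
    areas = [float(mask.sum()) for mask in masks]  # True pixels per mask
    total = sum(areas) or 1.0
    return [reference_grams * (area / total) for area in areas]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;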






&lt;h2&gt;
  
  
  🥗 Step 3: Improving Accuracy with Visual RAG
&lt;/h2&gt;

&lt;p&gt;The hardest part of nutrition AI is "hallucination." GPT-4o might think a sauce is tomato-based when it's actually a high-calorie chili oil. By implementing a &lt;strong&gt;Visual RAG&lt;/strong&gt; pattern, we take the labels identified by GPT-4o and query our &lt;strong&gt;PostgreSQL&lt;/strong&gt; database for verified nutritional profiles.&lt;/p&gt;

&lt;p&gt;For even more advanced implementations of RAG in multimodal environments, I highly recommend checking out the technical guides at &lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;wellally.tech/blog&lt;/a&gt;. They cover how to optimize vector embeddings for visual features, which is a game-changer for this specific use case. 🥑&lt;/p&gt;

&lt;h3&gt;
  
  
  The SQL Query Strategy
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Querying verified nutrients based on AI tags&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;calories_per_100g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;protein&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;carbs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fat&lt;/span&gt; 
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;nutrition_db&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;food_tag&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="k"&gt;ANY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ARRAY&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'grilled_chicken'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'quinoa'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'broccoli'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
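
&lt;p&gt;The final audit step just joins the two sources: the verified per-100g profile from PostgreSQL and the weight estimated from the mask area. A small sketch with hypothetical shapes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Combine a verified DB row with the SAM-derived weight estimate.
# db_row mirrors the SELECT above; the dict shape is our assumption.
def audited_calories(db_row: dict, estimated_weight_grams: float) -&gt; float:
    return db_row["calories_per_100g"] * estimated_weight_grams / 100.0

# Example: 180 g of grilled chicken at 165 kcal/100 g is roughly 297 kcal:
# audited_calories({"calories_per_100g": 165}, 180.0)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;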






&lt;h2&gt;
  
  
  🚀 Conclusion: The Future of Precision Health
&lt;/h2&gt;

&lt;p&gt;By combining &lt;strong&gt;Segment Anything (SAM)&lt;/strong&gt; and &lt;strong&gt;GPT-4o&lt;/strong&gt;, we move from "guessing" to "calculating." This pipeline allows for:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Overlapping Food Detection&lt;/strong&gt;: Distinguishing between the rice and the curry on top.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Volume Estimation&lt;/strong&gt;: Using mask areas as a proxy for portion size.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Auditability&lt;/strong&gt;: Users can see exactly which parts of the image were identified as which food.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Building these types of &lt;strong&gt;Computer Vision Calorie Estimation&lt;/strong&gt; tools is just the beginning. As multimodal models become faster and more efficient, we will see these pipelines moving directly to edge devices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's next?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Try integrating a depth-sensing camera (LiDAR) for far more accurate volume estimation.&lt;/li&gt;
&lt;li&gt;  Add a feedback loop where the user can correct the AI to fine-tune the local embeddings.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you enjoyed this tutorial, drop a comment below and let me know what you're building! And don't forget to visit &lt;a href="https://www.wellally.tech/blog" rel="noopener noreferrer"&gt;WellAlly Tech&lt;/a&gt; for more cutting-edge AI development content. Happy coding! 💻🔥&lt;/p&gt;

</description>
      <category>ai</category>
      <category>chatgpt</category>
      <category>webdev</category>
      <category>python</category>
    </item>
  </channel>
</rss>
