<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Monika Sonnad Math</title>
    <description>The latest articles on DEV Community by Monika Sonnad Math (@monikasonnadmath).</description>
    <link>https://dev.to/monikasonnadmath</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3966298%2F26d5c20b-7a1b-4d3f-88fc-746328f6c712.jpeg</url>
      <title>DEV Community: Monika Sonnad Math</title>
      <link>https://dev.to/monikasonnadmath</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/monikasonnadmath"/>
    <language>en</language>
    <item>
      <title>What Working in Healthcare Tech Taught Me About Writing Software That Actually Matters</title>
      <dc:creator>Monika Sonnad Math</dc:creator>
      <pubDate>Fri, 12 Jun 2026 15:51:33 +0000</pubDate>
      <link>https://dev.to/monikasonnadmath/what-working-in-healthcare-tech-taught-me-about-writing-software-that-actually-matters-2955</link>
      <guid>https://dev.to/monikasonnadmath/what-working-in-healthcare-tech-taught-me-about-writing-software-that-actually-matters-2955</guid>
      <description>&lt;p&gt;I've worked in two industries as a software developer — telecom and healthcare. Both involve large-scale distributed systems, event-driven architecture, real-time data pipelines. The technical problems are genuinely similar.&lt;/p&gt;

&lt;p&gt;But the way I think about writing software is completely different now. And I think healthcare did that to me.&lt;/p&gt;

&lt;p&gt;*&lt;em&gt;**In Telecom, the Stakes Were Scale&lt;/em&gt;*&lt;br&gt;
In Telecom I was building Kafka pipelines that handled hundreds of millions of events a day. The engineering problems were real and interesting — latency, fault tolerance, schema evolution, consumer lag. When something went wrong, throughput dropped, dashboards went red, and the on-call engineer had a bad night.&lt;/p&gt;

&lt;p&gt;That's stressful. But the worst case scenario of most failures was: some events got delayed, some metrics were stale, some retry queues backed up. You fixed it, you wrote a post-mortem, you moved on.&lt;/p&gt;

&lt;p&gt;The users of the system — the telecom customers — mostly never knew anything had happened.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In Healthcare, the Stakes Are Different&lt;/strong&gt;&lt;br&gt;
My first week at my Organisation now, I was shown around the clinical operations. I saw the actual documents my code would be processing — patient referrals, discharge summaries, operation notes. Real documents, with real patient names, real diagnoses, real medication lists.&lt;/p&gt;

&lt;p&gt;And I realised something that sounds obvious but hadn't fully landed before: the output of my code connects to a real person's care.&lt;/p&gt;

&lt;p&gt;A wrong patient URN doesn't just fail a unit test. It means a document gets filed against the wrong patient record. In the best case someone notices and fixes it manually. In a worse case, a clinician makes a decision based on incomplete or incorrect information.&lt;/p&gt;

&lt;p&gt;I'm not saying this to be dramatic. I'm saying it because it genuinely changed how I write code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Three Things That Changed&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. I stopped tolerating silent failures&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In a lot of systems, silent failures are acceptable. A job fails, logs an error, moves on. You catch it in the morning during the daily review.&lt;/p&gt;

&lt;p&gt;In healthcare I became almost allergic to this pattern. If something goes wrong I want to know immediately and I want to know exactly what went wrong. Not a generic exception. Not a null pointer that bubbles up three layers. The specific thing that failed, why it failed, and what state the data was left in.&lt;/p&gt;

&lt;p&gt;This led me to design the OCR service I built at Kingsbridge with explicit failure modes. If the system can't confidently extract a patient URN, it doesn't guess — it flags the document for manual review. That's a deliberate design decision, not a fallback. The uncertainty is surfaced, not hidden.&lt;/p&gt;

&lt;p&gt;I've taken this thinking back into every system I build now. Silent failures are technical debt you pay in trust.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. I started thinking about the person at the end of the pipeline&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In telecom I thought about throughput, latency, and uptime. These are the right things to think about.&lt;/p&gt;

&lt;p&gt;In healthcare I started asking a different question before designing any feature: who is going to use this, and what will they do with the output?&lt;/p&gt;

&lt;p&gt;For the OCR service the answer was: a clinical admin team member, who will take this Excel output and use it to match documents to patient records. That person is not a developer. They don't know what OCR is. They don't care about my architecture. They care about whether the output is correct and whether the errors are clearly labelled so they can handle the exceptions without calling IT.&lt;/p&gt;

&lt;p&gt;That shift in perspective — from "does the system work" to "does the system work for this specific person in this specific context" — is something I now apply everywhere.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. I push back more&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This one surprised me.&lt;/p&gt;

&lt;p&gt;I always thought I was reasonably good at asking questions about requirements. But healthcare gave me a much lower tolerance for building things I wasn't sure about.&lt;/p&gt;

&lt;p&gt;The cost of building the wrong thing in a clinical context is high. Not just wasted engineering time — potentially wrong data in production, potentially a process that introduces errors that are hard to find later. So I started pushing back in planning meetings more. Not aggressively, but persistently. "What happens if the document doesn't have a label in the expected position?" "What should the system do if the confidence score is below threshold?" "Who is responsible for reviewing the exceptions?"&lt;/p&gt;

&lt;p&gt;These questions feel slower in the moment. They make planning meetings longer. But they make the actual software much more reliable because you've thought through the edge cases before you've written a single line of code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Unexpected Benefit&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Something I didn't expect: writing software this way is more satisfying.&lt;/p&gt;

&lt;p&gt;When I shipped the OCR service and it went into production, I knew exactly what it did, what it didn't do, how it failed, and who was responsible for each failure mode. I knew the clinical staff using it had been involved in testing it on real documents. I knew the exception handling matched their actual workflow.&lt;/p&gt;

&lt;p&gt;That's a different feeling from shipping something that works in the demo but has a bunch of edge cases you're not sure about.&lt;/p&gt;

&lt;p&gt;I think most developers want to write software that matters. Healthcare has a way of making that very concrete, very quickly. The feedback loop between what you build and whether it actually helps someone is short and visible.&lt;/p&gt;

&lt;p&gt;That's a privilege, honestly.&lt;/p&gt;

&lt;p&gt;*&lt;em&gt;**What I'd Tell Developers Thinking About Healthcare Tech&lt;/em&gt;*&lt;br&gt;
It's not as inaccessible as it looks. You don't need a medical background. You need the same things you need in any domain: curiosity about the actual problem, willingness to ask questions, and the discipline to handle edge cases properly.&lt;/p&gt;

&lt;p&gt;What you get in return is a clarity of purpose that's harder to find in other domains. The work matters in a way that's tangible. And that tends to make you a better engineer.&lt;/p&gt;

</description>
      <category>healthtech</category>
      <category>programming</category>
      <category>career</category>
      <category>webdev</category>
    </item>
    <item>
      <title>I Work in Healthcare Tech. Here's Why I Built a RAG Tool for Clinical Documents.</title>
      <dc:creator>Monika Sonnad Math</dc:creator>
      <pubDate>Wed, 03 Jun 2026 10:24:19 +0000</pubDate>
      <link>https://dev.to/monikasonnadmath/i-work-in-healthcare-tech-heres-why-i-built-a-rag-tool-for-clinical-documents-3630</link>
      <guid>https://dev.to/monikasonnadmath/i-work-in-healthcare-tech-heres-why-i-built-a-rag-tool-for-clinical-documents-3630</guid>
      <description>&lt;p&gt;I didn't set out to build a RAG application. I set out to solve an annoying problem I kept watching happen.&lt;br&gt;
I work as a senior software developer in healthcare technology in Belfast. A big part of that job is understanding what actually slows them down, and figuring out where software can help. Not what's technically impressive what's genuinely useful.&lt;br&gt;
One thing I kept noticing: a lot of time gets spent navigating documents. Not reading them carefully and thoughtfully — just navigating. Ctrl+F for a patient name. Scrolling to find a medication dosage. Hunting for a follow-up instruction buried in paragraph four of page seven of a discharge summary.&lt;br&gt;
It sounds small. Multiply it by every clinical document, every working day, and it adds up fast.&lt;br&gt;
When I started thinking about RAG as a solution, the first thing I did was slow down and think about what "accuracy" means in a clinical setting — because it means something different here than it does in most software contexts.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;The Problem With Most AI Demonstrations in Healthcare&lt;/strong&gt;&lt;br&gt;
Most AI demos in healthcare go like this: upload a document, ask a question, get a fluent confident answer. It looks impressive. The problem is that fluent and confident doesn't mean accurate. Language models are optimised to produce coherent text. Left to their own devices they'll fill gaps, make inferences, and sometimes just invent things — in a way that reads exactly like a real answer.&lt;br&gt;
In most contexts that's an acceptable tradeoff. In a clinical context it isn't. A patient's discharge medications, their follow-up appointments, their documented allergies — these things need to come from the actual document, not from a model's best guess based on what usually appears in documents like this.&lt;br&gt;
So before I wrote a single line of code I made two decisions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Temperature 0. No creativity. Deterministic responses only. The model either finds the answer in the document or it says it doesn't know.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Explicit system prompt. The model is told directly: answer only from the provided context. If the answer isn't there, say so clearly. Do not guess.&lt;br&gt;
These aren't complicated decisions. But they're the right ones for this use case, and I see a lot of healthcare AI demos where nobody made them.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;How RAG Actually Works (Without the Hype)&lt;/strong&gt;&lt;br&gt;
RAG stands for Retrieval-Augmented Generation. The idea is straightforward:&lt;br&gt;
Instead of asking a language model a question and hoping it knows the answer from training, you first retrieve relevant sections from your actual documents, then ask the model to answer based only on what you retrieved.&lt;br&gt;
The pipeline looks like this:&lt;br&gt;
Document (PDF)&lt;br&gt;
  ↓&lt;br&gt;
Extract text&lt;br&gt;
  ↓&lt;br&gt;
Split into chunks&lt;br&gt;
  ↓&lt;br&gt;
Embed each chunk as a vector&lt;br&gt;
  ↓&lt;br&gt;
Store in a vector database&lt;/p&gt;

&lt;p&gt;↓ (at query time)&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Question&lt;br&gt;
      ↓&lt;br&gt;
Embed the question as a vector&lt;br&gt;
      ↓&lt;br&gt;
Find the most similar chunks (semantic search)&lt;br&gt;
      ↓&lt;br&gt;
Send those chunks + question to the LLM&lt;br&gt;
      ↓&lt;br&gt;
Get an answer grounded in the document&lt;br&gt;
The key word is grounded. The LLM doesn't know what's in your documents from training — it only knows what you retrieved and passed to it. If the answer isn't in the retrieved chunks, a well-instructed model will tell you that.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Building It: The Decisions That Mattered&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Chunking strategy&lt;/strong&gt;&lt;br&gt;
How you split documents into chunks matters more than most tutorials acknowledge. Too small and you lose context — a medication dosage split from its drug name is useless. Too large and retrieval gets imprecise.&lt;br&gt;
I landed on 500-character chunks with 50-character overlap. The overlap is important — it means the boundary between two chunks always has context from both sides, so you don't lose meaning at the split point.&lt;br&gt;
splitter = RecursiveCharacterTextSplitter(&lt;br&gt;
    chunk_size=500,&lt;br&gt;
    chunk_overlap=50,&lt;br&gt;
    separators=["\n\n", "\n", ". ", " ", ""],&lt;br&gt;
)&lt;br&gt;
The separator hierarchy matters too. We try to split at paragraph boundaries first, then sentence boundaries, then spaces. Splitting mid-sentence is a last resort.&lt;br&gt;
Embedding model&lt;br&gt;
I used OpenAI's text-embedding-3-small. It's fast, cheap, and good enough for document retrieval. For production clinical systems handling complex medical terminology you'd want to evaluate domain-specific embeddings — but for a general-purpose tool this works well.&lt;br&gt;
The system prompt&lt;br&gt;
This is where clinical RAG lives or dies:&lt;br&gt;
system_prompt = (&lt;br&gt;
    "You are a helpful assistant that answers questions about clinical documents. "&lt;br&gt;
    "Answer only from the provided context. If the answer is not in the context, "&lt;br&gt;
    "say so clearly — do not guess or make up information. "&lt;br&gt;
    "In a clinical setting, accuracy matters more than completeness."&lt;br&gt;
)&lt;br&gt;
That last sentence — accuracy matters more than completeness — is doing a lot of work. It gives the model permission to say "I don't know" rather than producing a plausible-sounding answer. In my testing, including it made a real difference to the rate of hallucinations on edge cases.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;What I Learned Building This&lt;/strong&gt;&lt;br&gt;
The messy document problem is real. Most RAG tutorials use clean, well-structured PDFs. Clinical documents are not clean or well-structured. Scanned discharge summaries with inconsistent formatting, referral letters with abbreviations that differ between hospitals, guidelines with tables that don't chunk cleanly — all of these degrade retrieval quality. I've started thinking about pre-processing pipelines to handle this better and may add that to the project.&lt;br&gt;
Confidence matters. I added a simple confidence indicator based on how many relevant chunks were retrieved. It's a rough heuristic — more retrieved chunks suggests a more answerable question — but it gives users a signal about how much to trust the response. In a clinical context, knowing when to verify against the source document is as important as the answer itself.&lt;br&gt;
The UI needs to be dead simple. Clinical staff are not developers. If the interface requires any technical knowledge to operate it won't get used. The Streamlit app has three interactions: upload a file, type a question, read the answer. That's it.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;The Code&lt;/strong&gt;&lt;br&gt;
The project is open source on GitHub: github.com/Monika-Sonnadmath/clinical-rag&lt;br&gt;
It has a Python API for developers who want to embed it in their own pipelines, and a Streamlit web app for anyone who just wants to use it directly.&lt;br&gt;
pip install -r requirements.txt&lt;br&gt;
export OPENAI_API_KEY=sk-...&lt;br&gt;
streamlit run app.py&lt;br&gt;
Upload a PDF, ask a question, get an answer. That's the whole thing.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;What's Next&lt;/strong&gt;&lt;br&gt;
A few things I want to add:&lt;br&gt;
• Local LLM support via Ollama — so it works without an API key and keeps documents entirely on-premises. For clinical use cases, data leaving the building is a concern.&lt;br&gt;
• Multi-document querying — ask a question across a folder of documents at once&lt;br&gt;
• Better pre-processing — handling scanned documents with variable quality, which is very much a solved problem in OCR but needs connecting to the retrieval pipeline&lt;br&gt;
If you work in healthcare tech and have thoughts on what would make this more useful, I'd genuinely like to hear from you.&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>programming</category>
      <category>rag</category>
    </item>
  </channel>
</rss>
