Managing multiple medications is a high-stakes challenge, especially for the elderly or patients with complex chronic conditions. Traditional pill organizers help, but they can't provide real-time verification. What if we could use Computer Vision and Edge AI to ensure the right person takes the right pill at the right time?
In this tutorial, we'll build a "Visual Audit System" using YOLOv10 for high-speed object detection, TensorRT for hardware acceleration, and MQTT for instant alerting. With state-of-the-art real-time object detection, a standard webcam becomes a life-saving healthcare assistant.
The Architecture: From Pixels to Alerts
Our system follows a streamlined pipeline: capturing frames, detecting medicine labels, validating them against a JSON-based medication schedule, and broadcasting the status.
graph TD
A[Fixed Camera Stream] --> B[OpenCV Image Pre-processing]
B --> C[YOLOv10 Inference - TensorRT]
C --> D{Medicine Detected?}
D -- Yes --> E[Compare with Prescription Schedule]
E -- Match Found --> F[Log Compliance & Update UI]
E -- Mismatch/Missing --> G[MQTT Alert Trigger]
G --> H[Mobile Notification / Caregiver Dashboard]
D -- No --> B
Why YOLOv10?
Released in 2024, YOLOv10 introduces an NMS-free training strategy (consistent dual assignments), significantly reducing inference latency. Coupled with TensorRT, it lets us run complex vision tasks on low-power edge devices (like a Jetson Nano) with impressive efficiency.
Prerequisites
Before we dive in, ensure you have the following stack ready:
- Python 3.9+
- YOLOv10 (via ultralytics or the official repo)
- TensorRT (for GPU acceleration)
- OpenCV (for frame manipulation)
- Eclipse Paho (for MQTT communication)
1. Setting up the Vision Engine
First, we need to load our model. For a production-ready environment, we export the YOLOv10 model to a TensorRT engine to squeeze out maximum FPS.
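The export step itself isn't shown below, so here is a minimal sketch. The weights filename is an assumption; the ultralytics import is deferred inside the function so the snippet can be read and loaded even on a machine without the library or a GPU:

```python
def export_to_tensorrt(weights="yolov10n_medication.pt"):
    """One-off export of a trained .pt checkpoint to a TensorRT .engine file."""
    # Deferred import: ultralytics (and a CUDA GPU) are only needed when exporting
    from ultralytics import YOLO

    model = YOLO(weights)
    # half=True builds an FP16 engine; requires TensorRT installed on the machine
    model.export(format="engine", half=True)
```

Run this once on the target device (TensorRT engines are hardware-specific), then point the detector at the resulting .engine file.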
import cv2
from ultralytics import YOLO
import paho.mqtt.client as mqtt

# Load the YOLOv10 model (preferably the .engine file for TensorRT)
model = YOLO("yolov10n_medication.engine")

def detect_medication(frame):
    # Run inference on a single frame
    results = model.predict(source=frame, conf=0.45, verbose=False)
    detections = []
    for r in results:
        for box in r.boxes:
            cls_id = int(box.cls[0])
            label = model.names[cls_id]
            conf = float(box.conf[0])
            detections.append({"label": label, "confidence": conf})
    return detections
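Note that detect_medication can return the same label several times when multiple boxes fire on one package. A small helper to collapse duplicates, keeping the highest-confidence hit per label (this helper is my own addition, not part of the original pipeline):

```python
def dedupe_detections(detections):
    """Collapse repeated labels, keeping the highest-confidence hit for each."""
    best = {}
    for d in detections:
        label, conf = d["label"], d["confidence"]
        if label not in best or conf > best[label]:
            best[label] = conf
    return [{"label": label, "confidence": conf} for label, conf in best.items()]
```

Feeding the deduplicated labels into the audit step keeps the compliance logic simple.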
2. The Logic Layer: Matching against the Schedule
Vision alone isn't enough; we need context. We'll simulate a medication schedule and compare it against our real-time detections.
# Mock medication schedule
SCHEDULE = {
    "morning": ["Aspirin", "Vitamin_D3"],
    "evening": ["Metformin"]
}

def audit_compliance(detected_labels, current_slot="morning"):
    expected = set(SCHEDULE[current_slot])
    found = set(detected_labels)
    missing = expected - found
    if not missing:
        return "COMPLIANT", []
    return "NON_COMPLIANT", list(missing)
3. Communication via MQTT
When the system detects a missed dose, it must notify the caregiver immediately. MQTT fits well here: it is lightweight, tolerates flaky networks, and its publish/subscribe model lets any number of dashboards listen on the same topic.
def send_alert(status, missing_pills):
    # paho-mqtt 1.x style; with paho-mqtt >= 2.0, pass
    # mqtt.CallbackAPIVersion.VERSION2 as the first argument
    client = mqtt.Client("MedicationAudit")
    client.connect("broker.hivemq.com", 1883)
    message = f"Status: {status} | Missing: {', '.join(missing_pills)}"
    client.publish("healthcare/alerts/pill_monitor", message)
    print(f"Alert Sent: {message}")
    client.disconnect()
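On the receiving end, a caregiver dashboard subscribes to the same topic. The broker plumbing mirrors send_alert, so the part worth showing is turning the flat message string back into structured data. The parse_alert helper below is my own addition, assuming the exact "Status: ... | Missing: ..." format produced above:

```python
def parse_alert(message):
    """Invert the 'Status: X | Missing: a, b' format produced by send_alert."""
    status_part, missing_part = message.split(" | ")
    status = status_part.removeprefix("Status: ")
    missing_raw = missing_part.removeprefix("Missing: ")
    # An empty missing list serializes as an empty string; filter it out
    missing = [m for m in missing_raw.split(", ") if m]
    return {"status": status, "missing": missing}
```

On the dashboard side, hook this into `client.on_message` after calling `client.subscribe("healthcare/alerts/pill_monitor")`.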
The "Official" Way: Advanced Patterns
Building a basic detector is one thing, but making it robust enough for a clinical or home-care environment requires handling occlusion, varying lighting, and edge-case "false positives."
For more production-ready examples, advanced deployment patterns on NVIDIA Jetson, or integrating this with a full-stack FHIR healthcare dashboard, check out the technical deep-dives on the WellAlly Blog, which cover how to scale AI-driven vision systems from prototype to enterprise grade.
4. Bringing it All Together
Here is the main loop that ties the camera feed to our inference and alerting logic.
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # 1. Detection
    detections = detect_medication(frame)
    labels = [d["label"] for d in detections]

    # 2. Audit
    status, missing = audit_compliance(labels)

    # 3. Visual feedback
    color = (0, 255, 0) if status == "COMPLIANT" else (0, 0, 255)
    cv2.putText(frame, f"Status: {status}", (50, 50),
                cv2.FONT_HERSHEY_SIMPLEX, 1, color, 2)

    if status == "NON_COMPLIANT":
        # In a real app, use a debounce timer before alerting
        send_alert(status, missing)

    cv2.imshow("MedAudit Vision", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
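The comment in the loop mentions a debounce timer for good reason: without one, a 30 FPS camera would fire send_alert dozens of times per second. A minimal cooldown sketch (the class name and the 300-second window are my own choices, not from the original):

```python
import time

class AlertDebouncer:
    """Suppress repeat alerts for the same missing pills within a cooldown window."""

    def __init__(self, cooldown_s=300, clock=time.monotonic):
        self.cooldown_s = cooldown_s
        self.clock = clock          # injectable clock makes this testable
        self._last_sent = {}        # frozenset(missing) -> last alert timestamp

    def should_alert(self, missing):
        key = frozenset(missing)
        now = self.clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self.cooldown_s:
            return False
        self._last_sent[key] = now
        return True
```

In the main loop, the alert line becomes `if status == "NON_COMPLIANT" and debouncer.should_alert(missing): send_alert(status, missing)`.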
Conclusion
By combining YOLOv10 with TensorRT, we've built a system that doesn't just "see": it understands and acts. This approach minimizes human error and provides peace of mind for families and healthcare providers.
What's next?
- Multi-camera support: For monitoring different rooms.
- Re-ID: Ensuring the person taking the medicine is actually the patient.
- Cloud Sync: Storing logs in a secure database.
Are you working on AI for healthcare? Drop a comment below or share your thoughts on the most challenging part of edge-AI deployment! And don't forget to visit wellally.tech/blog for more pro-tips.