QWEN Cloud Hackathon - #1 Technical Deep dive on what I am building

#qwen #qwenvision #bestmodel #alibabacloud

The Problem: A week ago I went to my grandmother who is suffering from demantia. At first I thought it was normal at this second but i wanted to know the cause so i looked up in google and saw that over 55 million patentis are navigating a world where they wake up and cannot recognize their own children's faces. They miss critical life spanning medications and suffer catastrophic falls when unmonitored.

We cannot solve a dynamic, high stakes human crisis with a single, static LLM prompt or a standard chatbot wrapper. If a healthcare AI hallucinates a medication dosage or fails to recognize a family caregiver, the consequences are life threatening.

That's why we are introducing Smriti - Building a 4-Agent Elderly Care system with QwenVL, MemoAssistant and persevere thinking.

The Solution - 4 Specialized QwenNative Agents working together:

Agent1: - Vision Agent (Qwen-VL) -> Responsible For Recognition, medicine label reading, hazard detection.It is optimized to perform high speed facial detection, medicine label reading, and spatial hazard analysis (such as identifying a water spill on the floor or an open stove burner).

Agent2: - Memory Agent(MemoAssistant): Persistent KeyValue Storage Across sessions.The Memory Agent uses MemoAssistant to manage persistent key-value storage across sessions. It caches family profiles, authorized medical staff, and historical routines so the system doesn't have to re-evaluate static profiles on every frame loop.

Agent3: - Guardial Agent(Qwen 3.6 Max + Persevere Thinking) - Hallunication blocking with reasoning traces.It utilizes Qwen's trillion-parameter MoE architecture and forces a reasoning trace calculation using the preserve_thinking parameter. This layer acts as a strict hallucination blocker by forcing the model to explicitly evaluate safety weights and confidence before executing actions.

MOE STRUCTURE

Internal Reasoning for Guardrail Agent:

Analyzing Vision Agent input: Unidentified individual detected in bedroom at 23:15.
Cross-referencing Memory Agent database: No matching family token or caregiver schedule found.
Checking risk matrix: Midnight proximity to patient bed presents high hazard potential.
Evaluating validation metrics: Match confidence is 12.5%.
Action: Threshold (85%) unmet. Halt direct patient voice output; route to Caregiver Agent.

Agent4: - When the Guardrail Agent detects an anomaly or drops below the 85% confidence score, it forwards the state payload to the Caregiver Agent via the Model Context Protocol (MCP). This pushes live notifications, image clips, and the system's reasoning logs directly to a web-based dashboard for immediate human approval.

How Agents Communicate - Vision -> Memory -> Guardrail -> Response in under 2 seconds.

One of the reason to build this - Only in reddit there are like 70K poeple in the subreddit channel of dementia.