Muhammad Asim Hanif

Posted on May 23

MedReport Agent — AI-Powered Medical Report Interpreter

#hermesagentchallenge #devchallenge #agents #agentskills

This is a submission for the Hermes Agent Challenge: Build With Hermes Agent

What I Built

I built MedReport Agent, an AI-powered medical report interpreter that helps patients understand their lab reports in simple and clear language.

Many patients receive lab reports such as blood tests, liver function tests, kidney function tests, and thyroid panels, but often cannot interpret the medical terms, abbreviations, values, and reference ranges in them.

MedReport Agent addresses this by letting users upload a medical report as a PDF or image. The system reads the report, extracts the lab values, compares them with reference ranges, flags abnormal results, and generates easy-to-understand explanations in both English and Urdu.

The goal of this project is not to replace doctors. Instead, it helps patients understand their reports better and prepare useful questions before visiting a healthcare professional.

The system provides:

Medical report upload support
OCR-based text extraction
Automatic report type detection
Lab value extraction
Reference range comparison
Abnormal value highlighting
English and Urdu explanations
Doctor questions generation
Clear next-step guidance
Agent audit logs
Privacy-focused local processing

Demo

Demo Video:

Screenshots:
1. Upload screen

2. Agent processing/progress screen

3. Final results dashboard

4. Agent audit logs section

Code

GitHub Repository:
https://github.com/codedbyasim/MedReport

The project structure includes:

MedReport/
├── backend/
│   ├── main.py
│   ├── hermes_agent.py
│   ├── medreport_skill.yaml
│   ├── database.py
│   ├── ocr_processor.py
│   ├── llm_client.py
│   ├── chroma_kb.py
│   ├── requirements.txt
│   └── Dockerfile
│
├── frontend/
│   ├── src/
│   │   ├── App.tsx
│   │   ├── index.css
│   │   └── main.tsx
│   ├── package.json
│   └── Dockerfile
│
├── skills/
│   └── medical/medreport-interpreter/
│       └── SKILL.md
│
├── Model/
│   └── qwen2.5-1.5b-instruct-q4_k_m.gguf
│
├── docker-compose.yml
├── README.md
├── DOCUMENTATION.md
├── hackathon_evaluation.md
├── LICENSE
└── SRS.txt

My Tech Stack

Layer	Technology
Agent Workflow	Hermes Agent
Backend	Python, FastAPI, Uvicorn
Frontend	React, TypeScript, Vite
OCR	PyMuPDF, EasyOCR
Local LLM	Qwen2.5 GGUF
LLM Runtime	llama-cpp-python
Knowledge Retrieval	ChromaDB
Styling	CSS, responsive dashboard UI
Deployment	Docker, Docker Compose
License	MIT

How I Used Hermes Agent

Hermes Agent is the core of MedReport Agent. I used it to build a real multi-step agentic workflow instead of a simple chatbot-style application.

The agent controls the complete medical report analysis pipeline from upload to final explanation.

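The workflow follows these steps: text extraction, report type detection, lab value extraction, reference range comparison, and generation of explanations and doctor questions.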

Each step is handled as a separate tool or module. This makes the system more reliable, easier to debug, and closer to a real agentic application.
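To make the "one tool per step" idea concrete, here is a minimal sketch of a pipeline runner that calls each step in order and writes one audit entry per call. It does not use the Hermes Agent API. The `AuditedPipeline` class, the two stand-in tools, and the log format are all hypothetical names chosen for illustration, not code from the repository.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class AuditedPipeline:
    """Runs named steps in order and records one audit entry per step."""

    steps: list[tuple[str, Callable[[Any], Any]]]
    audit_log: list[dict] = field(default_factory=list)

    def run(self, payload: Any) -> Any:
        for name, tool in self.steps:
            try:
                payload = tool(payload)
                self.audit_log.append({"step": name, "status": "ok"})
            except Exception as exc:
                # Record the failure before re-raising so the log stays complete.
                self.audit_log.append(
                    {"step": name, "status": "error", "detail": str(exc)}
                )
                raise
        return payload


# Hypothetical stand-in tools; the real ones would wrap OCR, parsing, and so on.
def extract_text(report: dict) -> dict:
    return {**report, "text": report["raw"].strip()}


def detect_type(report: dict) -> dict:
    kind = "CBC" if "hemoglobin" in report["text"].lower() else "unknown"
    return {**report, "report_type": kind}


pipeline = AuditedPipeline(
    steps=[("extract_text", extract_text), ("detect_type", detect_type)]
)
result = pipeline.run({"raw": "  Hemoglobin 10.2 g/dL  "})
```

Because every step shares the same call shape, adding a stage means appending one `(name, function)` pair, and a failed run still leaves a log entry showing which step broke.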

Agentic Capabilities Used

I used Hermes Agent for:

Planning: The agent follows a structured medical report analysis workflow.
Tool use: Each stage of the pipeline is handled by a dedicated tool.
Multi-step reasoning: The system connects OCR output, parsed values, reference ranges, and retrieved knowledge to generate the final explanation.
Self-correction: If the standard parser fails to extract values, the agent falls back to an LLM-based parsing strategy.
Audit logging: Every major tool call is logged so the workflow remains transparent.
Skill-based configuration: The workflow is defined using a Hermes skill configuration file.
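The self-correction item above can be sketched as a small wrapper that tries the primary parser first and calls a fallback only when nothing was extracted. This is an illustration under assumptions: `parse_with_fallback`, `regex_parser`, and `fake_llm_parser` are hypothetical names, and the "LLM" here is a stub so the example runs without a model.

```python
import re
from typing import Callable

Parser = Callable[[str], list[dict]]


def parse_with_fallback(
    text: str, primary: Parser, fallback: Parser, audit_log: list[str]
) -> list[dict]:
    """Try the primary parser; use the fallback only if it finds no values."""
    values = primary(text)
    if values:
        audit_log.append("primary_parser: ok")
        return values
    audit_log.append("primary_parser: no values, trying fallback")
    values = fallback(text)
    audit_log.append(f"fallback_parser: {len(values)} value(s)")
    return values


def regex_parser(text: str) -> list[dict]:
    # Only understands the "Name: value" form.
    pattern = r"([A-Za-z ]+):\s*(\d+(?:\.\d+)?)"
    return [
        {"name": m.group(1).strip(), "value": float(m.group(2))}
        for m in re.finditer(pattern, text)
    ]


def fake_llm_parser(text: str) -> list[dict]:
    # Stub standing in for a real model call; it handles "Name is value".
    pattern = r"([A-Za-z]+) is (\d+(?:\.\d+)?)"
    return [
        {"name": m.group(1), "value": float(m.group(2))}
        for m in re.finditer(pattern, text)
    ]


log: list[str] = []
parsed = parse_with_fallback("Hemoglobin is 10.2", regex_parser, fake_llm_parser, log)
```

Logging both attempts keeps the fallback visible in the audit trail, so a reader can tell which values came from the slower path.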

Main Features

Medical Report Upload

Users can upload a PDF or image of their medical report through the web dashboard.

OCR Processing

The backend extracts text from uploaded reports. Digital PDFs are handled using PDF text extraction, while scanned images can be processed using OCR.
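A rough sketch of that routing decision, using the two libraries listed in the tech stack (PyMuPDF and EasyOCR), might look like the following. The function names, the 20-character threshold, the suffix list, and the 200 DPI render setting are assumptions made for illustration. The repository's `ocr_processor.py` may work differently.

```python
from pathlib import Path

# Assumed set of image types that always go through OCR.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tiff"}


def needs_ocr(filename: str, embedded_text: str = "", min_chars: int = 20) -> bool:
    """Images always need OCR; PDFs need it when they carry almost no text."""
    if Path(filename).suffix.lower() in IMAGE_SUFFIXES:
        return True
    return len(embedded_text.strip()) < min_chars


def extract_text(path: str, languages: tuple[str, ...] = ("en",)) -> str:
    """Return report text, using OCR only when direct extraction is not enough."""
    if Path(path).suffix.lower() in IMAGE_SUFFIXES:
        import easyocr  # imported lazily: loading OCR models is slow

        reader = easyocr.Reader(list(languages))
        return "\n".join(reader.readtext(path, detail=0))

    import fitz  # PyMuPDF

    with fitz.open(path) as doc:
        embedded = "\n".join(page.get_text() for page in doc)
        if not needs_ocr(path, embedded):
            return embedded

        # Scanned PDF: render each page to PNG bytes and run OCR on it.
        import easyocr

        reader = easyocr.Reader(list(languages))
        lines: list[str] = []
        for page in doc:
            png_bytes = page.get_pixmap(dpi=200).tobytes("png")
            lines.extend(reader.readtext(png_bytes, detail=0))
        return "\n".join(lines)
```

Trying direct extraction first keeps digital PDFs fast and exact, and reserves the slower, error-prone OCR path for scans and photos.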

Report Type Detection

The system identifies the type of medical report, such as:

CBC
LFT
RFT
Lipid profile
Thyroid profile
Glucose or diabetes-related reports
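One simple way to detect these report types is keyword scoring over the extracted text. The sketch below is an assumption about how such detection could work, not the project's actual logic. The keyword lists are illustrative and incomplete, and `detect_report_type` is a hypothetical name.

```python
import re

# Illustrative keyword lists only; a real system would need broader coverage.
REPORT_KEYWORDS = {
    "CBC": ("hemoglobin", "wbc", "platelet", "rbc", "hematocrit"),
    "LFT": ("alt", "ast", "bilirubin", "alkaline phosphatase"),
    "RFT": ("creatinine", "urea", "bun"),
    "Lipid profile": ("cholesterol", "triglyceride", "hdl", "ldl"),
    "Thyroid profile": ("tsh", "t3", "t4"),
    "Glucose": ("glucose", "hba1c"),
}


def detect_report_type(text: str) -> str:
    """Return the report type whose keywords appear most often, or 'unknown'."""
    lowered = text.lower()
    scores = {}
    for report_type, keywords in REPORT_KEYWORDS.items():
        # Word boundaries stop "ast" from matching inside "fasting".
        scores[report_type] = sum(
            1 for kw in keywords if re.search(rf"\b{re.escape(kw)}s?\b", lowered)
        )
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"
```

For example, `detect_report_type("TSH 2.1 T4 8.0")` returns `"Thyroid profile"`, while text with no known keywords returns `"unknown"` so a later step can decide how to proceed.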

Lab Value Extraction

The parser extracts medical test names, values, and units from the report text.
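As a minimal sketch of this step, a line-oriented regular expression can pull out name, value, and unit when each result sits on its own line in the form "name value unit". That layout is an assumption: real reports vary widely, and lines with extra columns (such as a printed reference range) are simply skipped here. `LabValue`, `LAB_LINE`, and `extract_lab_values` are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class LabValue:
    name: str
    value: float
    unit: str


# name, optional ":" or "=", numeric value, optional unit, then end of line.
LAB_LINE = re.compile(
    r"^\s*(?P<name>[A-Za-z][A-Za-z0-9 ()/-]*?)\s*[:=]?\s+"
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>\S+)?\s*$"
)


def extract_lab_values(text: str) -> list[LabValue]:
    """Parse 'name value unit' lines; lines that do not fit are ignored."""
    results = []
    for line in text.splitlines():
        match = LAB_LINE.match(line)
        if match:
            results.append(
                LabValue(
                    name=match["name"].strip(),
                    value=float(match["value"]),
                    unit=match["unit"] or "",
                )
            )
    return results


sample = "CBC Report\nHemoglobin: 10.2 g/dL\nWBC 7.1 10^3/uL\nNotes: none"
values = extract_lab_values(sample)
```

Header and free-text lines fall through without a match, which is the kind of case where an LLM-based fallback parser becomes useful.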

Reference Range Comparison

Extracted values are compared with stored reference ranges to determine whether a result is normal, low, high, or critical.
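The comparison itself can be sketched as a small classification function. The `ReferenceRange` structure and `classify` function are hypothetical, and the numbers used below are placeholders for illustration only, not clinical reference values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReferenceRange:
    low: float
    high: float
    critical_low: Optional[float] = None
    critical_high: Optional[float] = None


def classify(value: float, ref: ReferenceRange) -> str:
    """Return 'critical', 'low', 'high', or 'normal' for a single value."""
    # Critical limits are checked first because they lie outside low/high.
    if ref.critical_low is not None and value < ref.critical_low:
        return "critical"
    if ref.critical_high is not None and value > ref.critical_high:
        return "critical"
    if value < ref.low:
        return "low"
    if value > ref.high:
        return "high"
    return "normal"


# Placeholder limits for illustration only; NOT clinical reference values.
example_range = ReferenceRange(low=12.0, high=16.0, critical_low=7.0, critical_high=20.0)
status = classify(10.2, example_range)
```

Keeping the critical limits optional lets tests without defined critical thresholds reuse the same function.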

Bilingual Explanation

The system generates patient-friendly explanations in:

English
Urdu

This makes the project more useful for local users who may not be comfortable with English medical terminology.

Doctor Questions

The agent generates useful questions that patients can ask their doctor during a consultation.

Agent Audit Logs

The dashboard shows the workflow logs so users and developers can understand what the agent did at each step.

Why This Project Is Useful

Medical reports are often difficult for non-specialists to understand. A patient may see values such as hemoglobin, WBC, platelets, ALT, AST, bilirubin, creatinine, glucose, TSH, or HbA1c, but may not know what they mean.

MedReport Agent converts this complex medical information into simple explanations.

This can help patients:

Understand their report better
Identify abnormal values
Ask their doctors better questions
Reduce confusion caused by medical jargon
Access explanations in their local language

The project is especially useful in regions where medical literacy is low and where patients may not always get enough time with doctors to discuss every value in detail.

What Makes It Different

MedReport Agent is different from a normal chatbot because it does not depend on one single prompt.

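Instead, it runs a complete agentic pipeline of separate steps: text extraction, report type detection, lab value extraction, reference range comparison, and explanation generation.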

This makes the output more structured and transparent.

The project also focuses on:

Local processing
Privacy
Urdu support
Medical report understanding
Transparent agent workflow
Practical healthcare use case

Challenges I Faced

One challenge was handling different medical report formats because labs write test names and values in different ways.

Another challenge was extracting useful text from both digital PDFs and scanned images.

It was also important to design the system in a way that gives helpful explanations without pretending to be a doctor.

To solve these challenges, I used a modular pipeline where each step has a clear responsibility.

How to Run Locally

git clone https://github.com/codedbyasim/MedReport
cd MedReport
docker-compose up --build

What I Learned

While building this project, I learned that agentic systems are most useful when they are connected to real tools and real workflows.

Hermes Agent helped me design the project as a proper pipeline instead of a basic AI wrapper.

I learned about:

Agentic workflow design
OCR integration
Tool-based architecture
Local LLM usage
Retrieval-based medical explanation
Error handling and fallback strategies
User-centered healthcare UI design

Future Improvements

In the future, I would like to add:

Support for more medical test categories
Better handwritten report recognition
PDF export of final explanation
Voice explanation in Urdu
Mobile app version
Patient history comparison
Doctor-side dashboard
More local languages such as Punjabi, Sindhi, and Pashto

Final Thoughts

MedReport Agent shows how Hermes Agent can power a practical real-world application.

The project combines OCR, local LLMs, medical reference ranges, retrieval-based knowledge, bilingual explanation, and an agentic workflow into one complete system.

It is designed to help patients understand their reports better and approach doctors with more confidence.

Disclaimer: MedReport Agent is not a replacement for professional medical advice, diagnosis, or treatment. It is an educational assistant that helps users understand medical reports in simple language. Users should always consult a qualified medical professional for healthcare decisions.
