HuggingFace Build Small Hackathon: My First Health Sprint

#ai #medical #hackathon #programming

Chapter One: Why Build Small
Building in the health space is something I had always dreaded. The data protection rules, the scary HIPAA laws, and the penalties sounded like a nightmare for someone like me who just likes to build, build, build.

The Build Small hackathon, though, seemed like just the event to build and test things in this space because of its one rule: use models under 32B parameters to build an app and publish it as a Gradio Space. I picked the Backyard AI track, which asks you to build something for someone real who would actually use it. Someone close to me recently developed a cardiac condition and was told by their doctor to monitor their blood pressure and their heart. I knew that we might eventually need to pick a cardiologist, and just in case, I wanted that cardiologist to have the full picture.

The main use case was to give the next doctor a clean starting point: a clinical summary the patient could hand over at the first appointment instead of trying to reconstruct six months of readings from memory in a waiting room. In case the cuff readings weren't capturing something, I also wanted to use a phone to record the heartbeat and roughly detect conditions like bradycardia, tachycardia, AFib, and PVCs. I explain this below.

What I built
I built Heartline, a clinical summary generator app whose output you can pass to the doctor. It takes two inputs: BP cuff readings, if you have them (the person I was building for does), and audio recordings of the heart captured through the phone, which it converts to a spectrogram and classifies four ways (bradycardia, tachycardia, AFib, and PVC).
I added the audio input because not everyone has a cuff at home. That was actually what pushed the scope wider. If you only have your phone, the audio needs to carry the whole load: bradycardia, tachycardia, AFib, PVCs, all of it. So the app works both ways. If you have a cuff, log your readings and the model gets more to work with. If you don't, audio alone still gets you something. The clinical summary just reflects whatever data was actually available.
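To make the "audio to spectrogram" step concrete, here is a minimal sketch of a magnitude spectrogram in plain Python. It is not Heartline's actual preprocessing: the frame length, hop size, Hann window, and the test tone are all assumptions picked for illustration, and a real pipeline would use an FFT library instead of this naive DFT.

```python
import cmath
import math


def hann(n: int) -> list[float]:
    """Hann window of length n, used to reduce spectral leakage."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(samples: list[float], frame_len: int = 64, hop: int = 32) -> list[list[float]]:
    """Return a magnitude spectrogram as a list of frames.

    Each frame holds frame_len // 2 + 1 magnitudes (DC up to Nyquist).
    Uses a naive O(n^2) DFT for clarity, not speed.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        # Apply the window to one frame of audio.
        chunk = [samples[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(n_bins):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Hypothetical input: one second of a 100 Hz tone sampled at 800 Hz.
sample_rate = 800
tone = [math.sin(2 * math.pi * 100 * t / sample_rate) for t in range(sample_rate)]
spec = spectrogram(tone)

# The strongest frequency bin in the first frame, converted back to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

The resulting grid of frames by frequency bins is the kind of image-like input a classifier can be trained on.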

The training data was synthetic. Because of all the health laws, I wasn't about to feed anyone's real health information into a model. I generated realistic BP log scenarios from AHA guidelines. For the audio classifier, I started from normal PhysioNet heart sound recordings, added the signals for each of the four classes, and mixed in iPhone microphone noise, because if this was going to work for the person I built it for, it had to work on a phone held to a chest in a kitchen, not in a hospital. The model reached an impressive AUROC of 0.95 on the four-class classification.
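Mixing recorded noise into clean audio is usually done at a chosen signal-to-noise ratio. The sketch below shows one common way to do that; the function names, the 10 dB target, and the stand-in signals are assumptions for illustration, not the settings used for Heartline.

```python
import math
import random


def rms(signal: list[float]) -> float:
    """Root-mean-square amplitude of a signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(clean: list[float], noise: list[float], snr_db: float) -> list[float]:
    """Scale `noise` so the clean-to-noise ratio equals `snr_db`, then add it."""
    if len(noise) < len(clean):
        raise ValueError("noise clip must be at least as long as the clean clip")
    noise = noise[: len(clean)]
    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise clip is silent")
    # SNR in dB is 20 * log10(rms_clean / rms_noise), so solve for the noise level.
    target_noise_rms = rms(clean) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [c + gain * n for c, n in zip(clean, noise)]


# Hypothetical stand-ins: a sine wave for the heart sound, Gaussian noise for the mic.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]
mixed = mix_at_snr(clean, noise, snr_db=10.0)

# Recover the SNR that was actually achieved, as a sanity check.
residual = [m - c for m, c in zip(mixed, clean)]
achieved_snr_db = 20 * math.log10(rms(clean) / rms(residual))
```

Sampling the target SNR from a range per training clip is a common variation, so the classifier sees both quiet and noisy rooms.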

The language model was a fine-tuned OpenBMB MiniCPM at 1B parameters. Its one job was to take the BP logs and the audio classifier output and write something a cardiologist would actually want to read at the start of an appointment. The fine-tuning data was once again synthetic: it was generated constitutionally from the objective section of a SOAP note using a teacher model, deepseek:v4-pro, so that the output reads like a clinically grounded objective mini summary for the doctor.
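As a rough illustration of what "taking the BP logs and audio classifier output" could look like before the text reaches the model, here is a sketch that condenses both inputs into a few plain lines. The `BPReading` class, the `build_model_input` function, and the output layout are all hypothetical; the category thresholds follow the published AHA blood pressure categories, and applying them to the mean reading is my own simplification.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class BPReading:
    systolic: int   # mmHg
    diastolic: int  # mmHg


def aha_category(systolic: float, diastolic: float) -> str:
    """Map a reading to an AHA blood pressure category."""
    if systolic > 180 or diastolic > 120:
        return "hypertensive crisis"
    if systolic >= 140 or diastolic >= 90:
        return "stage 2 hypertension"
    if systolic >= 130 or diastolic >= 80:
        return "stage 1 hypertension"
    if systolic >= 120:
        return "elevated"
    return "normal"


def build_model_input(readings: list[BPReading], audio_label: Optional[str]) -> str:
    """Summarize whatever data is available; either input may be missing."""
    lines = []
    if readings:
        sys_avg = mean(r.systolic for r in readings)
        dia_avg = mean(r.diastolic for r in readings)
        lines.append(f"BP readings: {len(readings)}")
        lines.append(
            f"Mean BP: {sys_avg:.0f}/{dia_avg:.0f} mmHg ({aha_category(sys_avg, dia_avg)})"
        )
    else:
        lines.append("BP readings: none provided")
    lines.append(f"Audio classifier: {audio_label or 'no recording provided'}")
    return "\n".join(lines)


# Hypothetical example data, not real patient readings.
with_cuff = build_model_input([BPReading(128, 78), BPReading(134, 84)], "bradycardia")
audio_only = build_model_input([], "AFib")
```

Keeping the missing-input cases explicit is what lets the summary reflect only the data that was actually collected.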

Chapter Four: Fully Private and HIPAA Compliant, the Story of My Deployment

It runs through wllama and WebGPU, with the spectrogram model running through ONNX.

The main things I learned were how to generate good data, constitution-based fine-tuning, spectrogram and audio analysis, and the need for fully local models.

Here is the Space: https://huggingface.co/spaces/build-small-hackathon