Jamie Likernon

How I Unified 3 Fragmented Medical APIs Into a Single Python SDK

I built MedKit because medical data is notoriously difficult to work with. If you want to correlate a drug's FDA label with its latest clinical trial phases and related research papers, you usually have to juggle three different APIs, handle idiosyncratic JSON schemas, and deal with inconsistent identifier types.

MedKit is a unified Python SDK that transforms these fragmented sources (OpenFDA, PubMed, and ClinicalTrials.gov) into a single, programmable platform.

Key Features:

- Unified Client: One MedKit() client to rule them all. No more juggling multiple API keys or manual correlation.
- Clinical Synthesis (med.conclude()): Aggregates data to give a "snapshot" verdict on a drug or condition, including an evidence strength score (0.0–1.0).
- Interaction Engine: Catches drug-drug contraindications using cross-label mentions (brand vs. generic).
- Medical Relationship Graph: Visualizes connections between drugs, trials, and research papers as a knowledge graph.
- Intelligence Layer: Natural-language routing (med.ask()) to query data in plain English.
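The interaction engine above hinges on normalizing brand names to their generic equivalents before comparing label mentions. A minimal sketch of that idea (the mapping table, drug names, and function names are illustrative assumptions, not MedKit's actual API):

```python
# Hypothetical brand-to-generic mapping; a real engine would source
# this from OpenFDA label data rather than a hard-coded dict.
BRAND_TO_GENERIC = {"coumadin": "warfarin", "advil": "ibuprofen"}

def normalize(name: str) -> str:
    """Map a brand name to its generic form; pass generics through unchanged."""
    n = name.strip().lower()
    return BRAND_TO_GENERIC.get(n, n)

def interacts(drug_a: str, label_mentions_b: list[str]) -> bool:
    """True if drug_a (brand or generic) appears among drug B's label mentions."""
    target = normalize(drug_a)
    return any(normalize(m) == target for m in label_mentions_b)
```

The key point is that both sides of the comparison are normalized, so "Coumadin" on one label still matches "warfarin" on another.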
Why Use It?

Most healthcare developers spend 80% of their time just cleaning and joining data. MedKit handles the plumbing (caching, schema normalization, relationship mapping) so you can focus on the analysis or the application logic.

Tech Stack: Python (Sync/Async), Disk/Memory caching, and a provider-based architecture for easy extensibility.
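To illustrate what a provider-based architecture can look like, here is one common way to structure it in Python (the Protocol, class names, and method signatures here are my assumptions for the sketch, not MedKit's actual interfaces):

```python
from typing import Protocol

class Provider(Protocol):
    """Minimal provider interface: each data source exposes a name and fetch()."""
    name: str
    def fetch(self, query: str) -> dict: ...

class RegistryClient:
    """Unified client that fans a query out to every registered provider."""
    def __init__(self) -> None:
        self._providers: dict[str, Provider] = {}

    def register(self, provider: Provider) -> None:
        self._providers[provider.name] = provider

    def query_all(self, query: str) -> dict[str, dict]:
        # Results are keyed by provider name so callers can correlate sources.
        return {name: p.fetch(query) for name, p in self._providers.items()}
```

Adding a new source (say, a pharmacogenomics provider) then only requires implementing the Provider interface and registering it, with no changes to the client itself.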

I'd love your thoughts on the med.conclude() synthesis logic and the other features, and to hear which additional providers (e.g., pharmacogenomics) you'd find useful.

A more detailed description is available in my GitHub repo.

Repo: https://github.com/interestng/medkit
PyPI: pip install medkit-sdk

Any support for this post, and stars or follows on the GitHub repo, is much appreciated!
Looking forward to your feedback!

Top comments (2)

klement Gunndu

The evidence strength score in med.conclude() is interesting — how are you weighting conflicting results across sources? Clinical trial phase vs. published paper often disagree on efficacy.

Jamie Likernon • Edited

Hey! The current implementation doesn't actually resolve conflicts; it uses a purely additive scoring model in which Phase III trials (0.4), FDA approval (0.3), and papers (0.1 each, capped at 0.5) are simply summed. The code is completely open source, so feel free to dig into it more deeply! Do you reckon I should implement contradiction detection and penalize disagreement rather than rewarding all evidence equally? I was planning to tackle that in a later phase of the project, since I assumed it was a little less important than some other things.
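For readers following along, the additive model described in that reply can be sketched in a few lines (the function name and signature are hypothetical; the weights are the ones stated above, and I've clamped the sum to the documented 0.0–1.0 range):

```python
def evidence_strength(phase3_trials: int, fda_approved: bool, paper_count: int) -> float:
    """Additive evidence score: Phase III presence (0.4), FDA approval (0.3),
    papers (0.1 each, capped at 0.5), with the total clamped to [0.0, 1.0]."""
    score = 0.0
    if phase3_trials > 0:
        score += 0.4
    if fda_approved:
        score += 0.3
    # Each paper adds 0.1, but the paper contribution is capped at 0.5.
    score += min(0.1 * paper_count, 0.5)
    return min(score, 1.0)
```

Note that because the model only adds, a drug with one contradictory trial and many supportive papers scores the same as one with uniformly positive evidence, which is exactly the limitation the comment above raises.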