I decided to build a clinical AI system from scratch. Not using the OpenAI API. Not a tutorial project. An actual system where I train my own model, build my own search pipeline, and deploy the whole thing.
This post is about Day 1 — setting up everything before writing a single line of ML code.
Why I'm doing this
Most AI projects I see from students are just API wrappers. You call GPT-4, get an answer, show it in a UI. That's fine, but it doesn't teach you how any of it actually works.
I wanted to understand what happens under the hood — how models get trained, how retrieval works, how you actually serve a model in production. So I picked a real problem (clinical decision support) and decided to build the whole thing myself.
What I'm building
MedMind — a system that takes a clinical question, searches a database of medical knowledge, and generates an answer using a model I trained on real medical exam questions.
The full stack:
- Download and clean a real medical dataset
- Fine-tune a language model on that data
- Build a RAG pipeline with a vector database
- Evaluate the model honestly
- Serve it with FastAPI
- Build a UI with Streamlit
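The steps above can be sketched as a skeleton of how the pieces will eventually connect. Every function here is a stub with made-up behavior, not the real implementation:

```python
# Placeholder sketch of the end-to-end flow: retrieve -> generate -> answer.
# All logic below is fake; it only shows the shape of the pipeline.

def retrieve(question: str, k: int = 3) -> list[str]:
    """Stub: will eventually query the vector database for relevant passages."""
    toy_knowledge = {
        "chest pain": "Acute coronary syndrome is a common cause of chest pain.",
        "fever": "Fever in children is most often viral in origin.",
    }
    return [text for key, text in toy_knowledge.items() if key in question.lower()][:k]

def generate(question: str, context: list[str]) -> str:
    """Stub: will eventually call the fine-tuned model with question + context."""
    return f"Answer based on {len(context)} retrieved passage(s)."

def answer(question: str) -> str:
    return generate(question, retrieve(question))

print(answer("What causes chest pain?"))  # "Answer based on 1 retrieved passage(s)."
```

Each folder in the project maps to one of these stubs, which made it easier to see where every later piece of code would live.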
Setting up the environment
First thing — Python version matters a lot in ML. I installed Python 3.11 specifically because PyTorch and the HuggingFace libraries have the best support for it.
Then I created a virtual environment. This keeps all the libraries for this project separate from everything else on my machine:
python -m venv venv
venv\Scripts\activate # Windows
source venv/bin/activate # macOS/Linux
Then installed the core libraries:
pip install torch transformers datasets peft trl accelerate
pip install chromadb sentence-transformers
pip install fastapi uvicorn streamlit
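After installing, I like to sanity-check that every package actually landed in the virtual environment. This is just a convenience script of my own, not part of the project:

```python
import importlib.util

# Check that each core package is importable without actually importing it
# (find_spec only looks up the module, so it's fast and side-effect free).
packages = [
    "torch", "transformers", "datasets", "peft", "trl", "accelerate",
    "chromadb", "sentence_transformers", "fastapi", "uvicorn", "streamlit",
]

for name in packages:
    status = "OK" if importlib.util.find_spec(name) is not None else "MISSING"
    print(f"{name:24s} {status}")
```

Note the underscore: the pip package is sentence-transformers, but the importable module is sentence_transformers.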
Each of these does something specific:
- transformers — gives access to pre-trained models like OPT, Mistral, LLaMA
- peft — lets you fine-tune models efficiently using LoRA
- trl — makes instruction fine-tuning easier
- chromadb — the vector database for storing medical knowledge
- sentence-transformers — converts text to vectors for search
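The last two are the heart of retrieval, and the idea is simple enough to show with toy numbers: sentence-transformers turns text into a vector, and chromadb finds stored vectors closest to a query vector. The three-dimensional "embeddings" below are made up purely for illustration (real ones have hundreds of dimensions):

```python
import math

# Made-up 3-dim "embeddings" standing in for real sentence-transformer output.
docs = {
    "aspirin": [0.9, 0.1, 0.0],
    "insulin": [0.1, 0.9, 0.1],
    "mri":     [0.0, 0.2, 0.9],
}

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query = [0.8, 0.2, 0.1]  # pretend embedding of "pain relief drug"
best = max(docs, key=lambda name: cosine(query, docs[name]))
print(best)  # aspirin has the highest similarity to the query
```

chromadb does essentially this lookup, just over millions of vectors with an index instead of a brute-force max.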
Project structure
I organized everything before writing code:
medmind/
├── data/ ← data scripts
├── training/ ← fine-tuning
├── rag/ ← retrieval pipeline
├── eval/ ← evaluation
├── api/ ← FastAPI backend
└── frontend/ ← Streamlit UI
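Rather than clicking folders into existence, I scaffolded the tree with a few lines of pathlib (again, just my own convenience script):

```python
from pathlib import Path

# Create the project skeleton; exist_ok=True makes it safe to re-run.
folders = ["data", "training", "rag", "eval", "api", "frontend"]

root = Path("medmind")
for name in folders:
    (root / name).mkdir(parents=True, exist_ok=True)

print(sorted(p.name for p in root.iterdir()))
```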
For training — Colab
My PC has no GPU. Training a language model on CPU would take weeks. So I set up Google Colab with a free T4 GPU. This is what most people without expensive hardware do — it's completely normal.
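The first cell I run in any Colab notebook checks that the GPU is actually attached (on Colab: Runtime, then Change runtime type, then T4 GPU). A minimal check, wrapped in a try/except so it also runs on machines without PyTorch:

```python
# Verify PyTorch can see a GPU; fall back gracefully if not.
try:
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
    if device == "cuda":
        print("GPU:", torch.cuda.get_device_name(0))
    else:
        print("No GPU found; training here would crawl on CPU")
except ImportError:
    device = None
    print("PyTorch is not installed in this environment")
```

Forgetting this check and silently training on CPU is a classic Colab mistake, so I run it before anything else.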