This is the post where things got real. Training an actual language model, watching the loss go down, pushing it to HuggingFace with my name on it.
The plan
I couldn't afford to train from scratch — that takes thousands of GPU hours and costs thousands of dollars. Instead I used fine-tuning: take an existing pre-trained model and train it further on my medical data.
The model I chose: facebook/opt-1.3b — 1.3 billion parameters, open source, no access restrictions.
The technique: LoRA (Low-Rank Adaptation). Instead of updating all 1.3 billion parameters, LoRA freezes them, adds small low-rank adapter matrices alongside the existing weights, and trains only those. You go from training 1.3 billion parameters to training about 4 million. Nearly the same result, roughly 100x cheaper.
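To make that concrete, here's a minimal sketch of the idea (not the library's actual implementation): the pretrained weight matrix stays frozen, and the adapter is just a pair of small matrices whose product gets added on top. The 2048 matches OPT-1.3b's hidden size; the rest of the numbers are purely illustrative.

import torch

d, r, alpha = 2048, 8, 16      # hidden size (OPT-1.3b), LoRA rank, scaling factor

W = torch.randn(d, d)          # frozen pretrained weight, never updated
A = torch.randn(r, d) * 0.01   # trainable low-rank factor (initialized small)
B = torch.zeros(d, r)          # trainable low-rank factor (starts at zero)

x = torch.randn(1, d)
y = x @ W.T + (x @ A.T) @ B.T * (alpha / r)   # original output + low-rank update

print(W.numel())               # 4,194,304 frozen parameters in this one matrix
print(A.numel() + B.numel())   # 32,768 trainable parameters in its adapter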
Why Google Colab
My laptop has no GPU. Training even a small LLM on CPU takes days. Google Colab gives you a free Tesla T4 GPU with 15GB of memory. You get 30 hours per week for free. This is what I used.
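Before starting a run, it's worth checking what GPU the session actually gave you. A quick sanity check in a Colab cell (PyTorch comes preinstalled there):

import torch

# Confirm the runtime has a GPU attached (Runtime > Change runtime type > GPU)
assert torch.cuda.is_available(), "No GPU -- switch the Colab runtime type"
print(torch.cuda.get_device_name(0))                             # e.g. "Tesla T4"
print(round(torch.cuda.get_device_properties(0).total_memory / 1e9, 1), "GB")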
The training code
The key parts:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model
from trl import SFTTrainer, SFTConfig

# Load base model and tokenizer
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")

# Add LoRA adapters (only these small matrices get trained)
lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling factor applied to the update
    target_modules=["q_proj", "v_proj"],  # attach adapters to the attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Train
trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    args=SFTConfig(num_train_epochs=3, learning_rate=2e-4),
)
trainer.train()
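One thing the snippet glosses over is where train_dataset comes from. SFTTrainer trains on a "text" column by default, so the preparation looks roughly like this. The file name and prompt format below are placeholders, not my exact data:

from datasets import load_dataset

# Placeholder file name and fields -- swap in your own data
raw = load_dataset("json", data_files="medical_qa.jsonl", split="train")

def to_text(example):
    # SFTTrainer trains on a "text" column by default
    return {"text": f"Question: {example['question']}\nAnswer: {example['answer']}"}

train_dataset = raw.map(to_text)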
The results
Training took 1.5 hours on the free T4 GPU. Here's what the loss looked like:
Step 100: Loss 1.163
Step 500: Loss 0.994
Step 1000: Loss 0.967
Step 1700: Loss 0.944 ← training complete
Loss going down means the model is learning. Training and validation loss decreased together, which suggests the model was generalizing rather than just memorizing the training set.
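The snippet above only shows the training split. To get validation loss logged alongside it, you hold out a slice of the data and point the trainer at it. Roughly how that's wired up; the 10% split and step intervals here are illustrative, not my exact settings:

# Hold out part of the data so the trainer reports eval loss as it goes
split = train_dataset.train_test_split(test_size=0.1)

trainer = SFTTrainer(
    model=model,
    train_dataset=split["train"],
    eval_dataset=split["test"],
    args=SFTConfig(
        num_train_epochs=3,
        learning_rate=2e-4,
        eval_strategy="steps",    # "evaluation_strategy" on older transformers versions
        eval_steps=100,
        logging_steps=100,
    ),
)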
Publishing to HuggingFace
model.push_to_hub("Yakhilesh/medmind-opt-medical")
tokenizer.push_to_hub("Yakhilesh/medmind-opt-medical")
That's it. My model is now publicly available at:
huggingface.co/Yakhilesh/medmind-opt-medical
Anyone can download and use it. The adapter weights are only 12.6MB — small because LoRA only saves the adapter, not the entire base model.
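Using it from the other side is just as short: load the base model, then apply the adapter from the Hub on top of it. The prompt below is a made-up example to show the call pattern, not something from my eval set:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
model = PeftModel.from_pretrained(base, "Yakhilesh/medmind-opt-medical")   # the 12.6MB adapter
tokenizer = AutoTokenizer.from_pretrained("Yakhilesh/medmind-opt-medical")

# Made-up prompt, just to show the call pattern
inputs = tokenizer("Question: What causes iron-deficiency anemia?\nAnswer:", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(output[0], skip_special_tokens=True))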
What I actually learned
Fine-tuning is more about data quality than model architecture. My 1.3B model, trained for 1.5 hours, picked up genuine medical patterns, and the loss curve backs that up.
