Akhilesh

I trained my own LLM and published it on HuggingFace

This is the post where things got real. Training an actual language model, watching the loss go down, pushing it to HuggingFace with my name on it.

The plan

I couldn't afford to train from scratch — that takes thousands of GPU hours and costs thousands of dollars. Instead I used fine-tuning: take an existing pre-trained model and train it further on my medical data.

The model I chose: facebook/opt-1.3b — 1.3 billion parameters, open source, no access restrictions.

The technique: LoRA (Low-Rank Adaptation). Instead of updating all 1.3 billion parameters, LoRA freezes the base model and adds small trainable matrices on top, and only those get trained. You go from training 1.3 billion parameters to training a few million (the exact count depends on the rank and which layers you target). Nearly the same quality, at a fraction of the cost.
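
To see where the savings come from, here's a back-of-the-envelope sketch. The numbers assume OPT-1.3B's hidden size of 2048 and 24 decoder layers, and the total depends on which modules you adapt:

# Rough LoRA parameter math for OPT-1.3B (hidden size 2048, 24 decoder layers)
hidden = 2048
layers = 24
r = 8  # LoRA rank

full_per_matrix = hidden * hidden               # one frozen attention projection: ~4.2M params
lora_per_matrix = (hidden * r) + (r * hidden)   # its LoRA adapter (A and B matrices): ~33K params

# Adapting q_proj and v_proj in every layer:
print(lora_per_matrix * 2 * layers)             # ~1.6M trainable parameters vs 1.3B frozen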

Why Google Colab

My laptop has no GPU. Training even a small LLM on CPU takes days. Google Colab gives you a free Tesla T4 GPU with 15GB of memory. You get 30 hours per week for free. This is what I used.
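
If you go this route, it's worth confirming the GPU is actually attached before you start training. A quick check (PyTorch comes preinstalled on Colab):

import torch

print(torch.cuda.is_available())        # should be True
print(torch.cuda.get_device_name(0))    # should say something like "Tesla T4"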

The training code

The key parts:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model
from trl import SFTTrainer, SFTConfig

# Load the base model and its tokenizer
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")

# Add LoRA adapters: small trainable matrices on the attention
# projections, while the 1.3B base weights stay frozen
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM"
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # sanity check: only the adapters should be trainable

# Train on the medical dataset
trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    args=SFTConfig(
        output_dir="outputs",
        num_train_epochs=3,
        learning_rate=2e-4
    )
)
trainer.train()
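
One thing the snippet glosses over is train_dataset: the medical Q&A data mentioned above. In recent trl versions SFTTrainer looks for a text column by default, so the dataset can be as simple as a list of formatted strings. A minimal sketch, with the example rows and the prompt format entirely made up:

from datasets import Dataset

# Hypothetical rows -- the real medical Q&A pairs come from the data-prep step
rows = [
    {"text": "Question: What are common symptoms of anemia?\nAnswer: Fatigue, pale skin, shortness of breath..."},
    {"text": "Question: What does a CBC test measure?\nAnswer: Red cells, white cells, hemoglobin, platelets..."},
]
train_dataset = Dataset.from_list(rows)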

The results

Training took 1.5 hours on the free T4 GPU. Here's what the loss looked like:

Step 100:  Loss 1.163
Step 500:  Loss 0.994
Step 1000: Loss 0.967
Step 1700: Loss 0.944  ← training complete

Loss going down means the model is learning. Training and validation loss decreased together, which suggests the model was generalizing rather than just memorizing the training set.
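
The code above only shows the training set; to get a validation curve you hand the trainer a held-out split and an evaluation schedule. Roughly like this (values are illustrative, and older transformers versions spell the argument evaluation_strategy):

trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,          # held-out medical examples
    args=SFTConfig(
        output_dir="outputs",
        num_train_epochs=3,
        learning_rate=2e-4,
        eval_strategy="steps",          # run evaluation every eval_steps
        eval_steps=100
    )
)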

Publishing to HuggingFace

model.push_to_hub("Yakhilesh/medmind-opt-medical")
tokenizer.push_to_hub("Yakhilesh/medmind-opt-medical")
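
This assumes you're already authenticated with HuggingFace. In a Colab notebook that's one extra cell, where you paste a write-access token from your account settings:

from huggingface_hub import notebook_login

notebook_login()  # prompts for a HuggingFace access token with write permission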

That's it. My model is now publicly available at:
huggingface.co/Yakhilesh/medmind-opt-medical

Anyone can download and use it. The adapter weights are only 12.6MB — small because LoRA only saves the adapter, not the entire base model.
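
Using it means loading the base model first and applying the adapter on top. A minimal sketch with PEFT (the prompt format should match however the training data was formatted):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
model = PeftModel.from_pretrained(base, "Yakhilesh/medmind-opt-medical")
tokenizer = AutoTokenizer.from_pretrained("Yakhilesh/medmind-opt-medical")

prompt = "Question: What does a CBC test measure?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=80)
print(tokenizer.decode(output[0], skip_special_tokens=True))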

What I actually learned

Fine-tuning is more about data quality than model architecture. My 1.3B model, trained for 1.5 hours on a free GPU, picked up genuine medical patterns, and the loss numbers back that up.
