Fine-tuning lets you customize an LLM for your specific use case. Here's a practical walkthrough using OpenAI's API and Hugging Face.
## When to Fine-Tune vs. Prompt-Engineer
**Fine-tune when:**

- You need consistent output format
- Few-shot prompting isn't enough
- You want shorter prompts
- You have domain-specific knowledge to bake in

**Don't fine-tune when:**

- RAG can solve the problem
- You have fewer than 50 examples
- The base model works well with good prompts
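As an illustrative sketch only (the function name and the hard 50-example threshold are my assumptions, not an official rule), the criteria above can be folded into a simple decision helper:

```python
def should_fine_tune(num_examples: int, rag_can_solve: bool,
                     good_prompts_suffice: bool) -> bool:
    """Illustrative decision helper; the 50-example cutoff mirrors the rule of thumb above."""
    # If retrieval or plain prompting already works, skip the cost of fine-tuning
    if rag_can_solve or good_prompts_suffice:
        return False
    # Below ~50 examples, fine-tuning tends to overfit or underperform
    return num_examples >= 50

print(should_fine_tune(200, rag_can_solve=False, good_prompts_suffice=False))  # True
```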
## Fine-Tuning with OpenAI

### Step 1: Prepare Training Data
```python
import json

training_data = [
    {
        "messages": [
            {"role": "system", "content": "You are a code reviewer giving concise feedback."},
            {"role": "user", "content": "Review: def add(a,b): return a+b"},
            {"role": "assistant", "content": "1. Add type hints: `def add(a: int, b: int) -> int`\n2. Add docstring\n3. Consider input validation"}
        ]
    }
]

with open("training.jsonl", "w") as f:
    for item in training_data:
        f.write(json.dumps(item) + "\n")
```
### Step 2: Upload and Train
```python
from openai import OpenAI

client = OpenAI()

# Upload the training file, then start the fine-tuning job
file = client.files.create(file=open("training.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    training_file=file.id,
    model="gpt-4o-mini-2024-07-18",
    hyperparameters={"n_epochs": 3, "batch_size": 1, "learning_rate_multiplier": 1.8},
)
print(client.fine_tuning.jobs.retrieve(job.id).status)
```
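Fine-tuning jobs take minutes to hours, so rather than printing the status once you'll usually poll until the job reaches a terminal state. A minimal sketch with the status lookup injected as a function so it's easy to test; in practice you'd pass `lambda: client.fine_tuning.jobs.retrieve(job.id).status`:

```python
import time

# Terminal statuses reported by the OpenAI fine-tuning API
TERMINAL = {"succeeded", "failed", "cancelled"}

def wait_for_job(get_status, poll_seconds=30, sleep=time.sleep):
    """Poll get_status() until the job hits a terminal state; return the final status."""
    while True:
        status = get_status()
        if status in TERMINAL:
            return status
        sleep(poll_seconds)

# Example with a fake status source standing in for the API client:
states = iter(["validating_files", "running", "succeeded"])
print(wait_for_job(lambda: next(states), sleep=lambda s: None))  # succeeded
```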
### Step 3: Use Your Model
```python
# Replace the placeholder model ID with the one reported by your completed job
response = client.chat.completions.create(
    model="ft:gpt-4o-mini-2024-07-18:your-org::abc123",
    messages=[
        {"role": "system", "content": "You are a code reviewer."},
        {"role": "user", "content": "Review: passwords = open('passwords.txt').read()"}
    ]
)
print(response.choices[0].message.content)
```
## Fine-Tuning with Hugging Face (Local)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer
from peft import LoraConfig

model_name = "meta-llama/Llama-2-7b-hf"  # gated model: requires accepting the license on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Note: newer transformers versions prefer quantization_config=BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True, device_map="auto")

# LoRA: train small adapter matrices on the attention projections instead of all 7B weights
lora_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
                         lora_dropout=0.05, task_type="CAUSAL_LM")

# `dataset` must be a datasets.Dataset with a "text" column, loaded beforehand
trainer = SFTTrainer(
    model=model, train_dataset=dataset, peft_config=lora_config,
    args=TrainingArguments(output_dir="./results", num_train_epochs=3,
                           per_device_train_batch_size=4, learning_rate=2e-4),
    tokenizer=tokenizer, dataset_text_field="text", max_seq_length=512,  # newer trl versions move these into SFTConfig
)
trainer.train()
trainer.save_model("./my-fine-tuned-model")
```
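The `dataset` passed to `SFTTrainer` above needs a `"text"` column, which the snippet never builds. One hedged way to produce it from chat-style examples; the `###` prompt template here is an illustrative assumption, not Llama-2's official format:

```python
def to_text(example):
    """Flatten a chat-style example into a single training string.

    The ### markers are an illustrative instruction format, not a
    requirement of Llama-2 or trl.
    """
    parts = []
    for msg in example["messages"]:
        if msg["role"] == "system":
            parts.append(f"### System:\n{msg['content']}")
        elif msg["role"] == "user":
            parts.append(f"### Instruction:\n{msg['content']}")
        else:
            parts.append(f"### Response:\n{msg['content']}")
    return {"text": "\n\n".join(parts)}

example = {"messages": [
    {"role": "user", "content": "Review: def add(a,b): return a+b"},
    {"role": "assistant", "content": "Add type hints and a docstring."},
]}
print(to_text(example)["text"])
```

With the `datasets` library, `dataset = dataset.map(to_text)` would add the column before training.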
## Data Quality Validation
```python
import json

def validate_training_data(filepath):
    issues = []
    with open(filepath) as f:
        for i, line in enumerate(f, 1):  # 1-based line numbers for readable reports
            try:
                data = json.loads(line)
                msgs = data["messages"]
                if len(msgs) < 2:
                    issues.append(f"Line {i}: need at least 2 messages")
                elif msgs[-1]["role"] != "assistant":
                    issues.append(f"Line {i}: last message should be from the assistant")
            except (json.JSONDecodeError, KeyError) as e:
                issues.append(f"Line {i}: {e}")
    return issues
```
## Key Takeaways
- Start with 50-100 high-quality examples
- Quality > quantity for training data
- Use LoRA for efficient local fine-tuning
- OpenAI fine-tuning is easiest for production
- Always evaluate against a held-out test set
- Fine-tuning + RAG together is often the best approach
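The held-out evaluation in the takeaways starts with a deterministic split of your examples. A minimal sketch in plain Python, not tied to either API (the helper name and 20% default are my choices):

```python
import random

def train_test_split(examples, test_fraction=0.2, seed=42):
    """Shuffle and split examples into (train, test) lists, deterministic for a given seed."""
    shuffled = examples[:]  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

examples = [{"id": i} for i in range(100)]
train, test = train_test_split(examples)
print(len(train), len(test))  # 80 20
```

Fine-tune on `train` only, then compare the base and fine-tuned models' outputs on `test` before shipping.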