Originally published on AI Tech Connect.
What this recipe gives you Fine-tuning has a reputation as a dark art reserved for teams with eight-figure GPU budgets. It is not. With LoRA and QLoRA you can adapt a 7B or 8B open-weight model on a single consumer-grade card, in an afternoon, for the price of a couple of cups of coffee. The hard part was never the compute. The hard part is knowing whether you should fine-tune at all, building a dataset that matches how you will actually call the model, and proving the result is better rather than merely different. This is a recipe you can keep and reuse across models and tasks. The methods here are stable: low-rank adaptation, 4-bit quantisation, an 80/10/10 split, an eval set built first. The specific model you point it at — Llama, Qwen, Mistral, Gemma — barely changes the steps. Here…
Top comments (0)