Eshan Roy (eshanized)


I Built an Automated SLM Fine-Tuning Engine with Python and Unsloth πŸš€

Fine-tuning Small Language Models (SLMs) like Llama 3.2 or Phi-4 is easier than ever thanks to tools like Unsloth. But the setup is still a pain. You have to find the right Colab notebook, format your data into JSONL, debug generic errors, and figure out which model actually fits your use case.

I wanted to fix that friction. So I built SLMGen, an open-source tool that automates the entire process.

Your Data β†’ Best Model β†’ Matched. One notebook. Zero setup.

Here's how I built it and how it works under the hood.

πŸ—οΈ The Architecture

SLMGen is a full-stack application:

  • Backend: Python 3.11 with FastAPI (because Python is the language of AI).
  • Frontend: Next.js 16 (for a snappy, modern UI).
  • Authentication: Supabase.
  • Engine: A custom Recommendation Engine & Notebook Generator.

The core magic happens in libslmgen, the Python backend. Let's dive into the cool parts.

🧠 1. The Recommendation Engine

Most people don't know which model to pick. "Should I use Phi-4 or Llama 3.2? Is Mistral 7B too big for my edge device?"

I built a 100-point scoring system to mathematically answer that question. It analyzes your dataset and intent to pick the perfect model.

Here is the scoring logic from core/recommender.py:

def get_recommendations(task, deployment, stats, characteristics):
    # ... (the scores below are computed for each candidate model in the catalog)
    # 50 points: Does the model fit the task? (e.g. Code vs Creative Writing)
    task_score = _score_task_fit(model, task)

    # 30 points: Does it fit the hardware? (e.g. Edge vs Cloud)
    deploy_score = _score_deployment_fit(model, deployment)

    # 20 points: Does the data match the model strengths? (e.g. Multilingual -> Qwen)
    data_score = _score_data_fit(model, stats, characteristics)

    total = task_score + deploy_score + data_score
    # ...
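Once each model has a total out of 100, picking a winner is just a sort. A toy version of that last step (model names and scores invented for illustration, not actual SLMGen output):

```python
# Toy ranking over per-model totals -- the scores here are invented.
scores = {"phi-4-mini": 78, "llama-3.2-3b": 85, "qwen-2.5-7b": 62}

# Highest total wins; ties would need a tie-breaker in a real engine.
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
best_model, best_score = ranked[0]
print(best_model, best_score)  # llama-3.2-3b 85
```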

It doesn't just guess. It looks at your data features:

  • Multilingual? β†’ Bonus for Qwen 2.5 (it's great at non-English).
  • JSON Output? β†’ Bonus for Phi-4 or Qwen (strong structured output).
  • Edge Deployment? β†’ Big penalty for 7B models, big bonus for Gemma 2B or SmolLM.
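Those bonuses can be sketched as a small rule table. This is my own illustration of the idea, not the actual `_score_data_fit` implementation; the flag names, model ids, and point values are assumptions:

```python
# Hypothetical sketch of the 20-point data-fit scoring described above.
# Flag names, model ids, and bonus values are illustrative assumptions.
DATA_FIT_RULES = {
    "multilingual": {"qwen-2.5": 10},
    "json_output": {"phi-4-mini": 8, "qwen-2.5": 8},
}

def score_data_fit(model_id: str, characteristics: dict) -> int:
    """Sum the bonuses for every data characteristic the model is strong at."""
    score = 0
    for flag, bonuses in DATA_FIT_RULES.items():
        if characteristics.get(flag):
            score += bonuses.get(model_id, 0)
    return min(score, 20)  # cap at the 20-point budget

print(score_data_fit("qwen-2.5", {"multilingual": True, "json_output": True}))  # 18
```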

πŸ” 2. Dataset Intelligence

Before you train, you should know what you're training on. SLMGen scans your JSONL file for:

  • Quality Issues: Duplicates, empty responses, short inputs.
  • Personality: Is your data "Formal & Expert" or "Casual & Helpful"?
  • Hallucination Risk: It scans for vague language ("might," "probably") vs. grounded language ("according to," numbers/dates).

This is done via heuristic analysis in core/personality.py and core/risk.py. It's not an LLM judging an LLMβ€”it's fast, deterministic signal processing on text.
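A minimal version of such a heuristic looks like this. The word lists, patterns, and scoring formula are my own illustration under stated assumptions, not the actual `core/risk.py`:

```python
import re

# Illustrative signal lists -- the real core/risk.py likely uses richer ones.
VAGUE = {"might", "maybe", "probably", "possibly", "perhaps"}
GROUNDED_PATTERNS = [r"\baccording to\b", r"\b\d+(\.\d+)?\b"]

def hallucination_risk(text: str) -> float:
    """Ratio of vague markers to all markers; 0.0 = fully grounded."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    vague = sum(w in VAGUE for w in words)
    grounded = sum(len(re.findall(p, lowered)) for p in GROUNDED_PATTERNS)
    total = vague + grounded
    return vague / total if total else 0.0

print(hallucination_risk("It might probably work, maybe."))  # 1.0
print(hallucination_risk("According to the 2023 report, 42 passed."))  # 0.0
```

Because it is pure string matching, it runs in microseconds per sample and gives the same answer every time, which is exactly the deterministic property the tool wants.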

πŸͺ„ 3. Self-Contained Notebooks

This is my favorite feature. Usually, to run a Colab notebook, you have to:

  1. Download a notebook.
  2. Upload your JSONL file to Colab.
  3. Fix the file path in the code.

SLMGen removes steps 2 and 3. It embeds your dataset directly into the notebook as a base64 string.

# From core/notebook.py
def generate_notebook(dataset_jsonl, ...):
    # Encode the dataset so it travels inside the notebook itself
    dataset_b64 = base64.b64encode(dataset_jsonl.encode()).decode()

    # Build the code cell; dedent so the generated cell has no stray indentation
    code_cell = textwrap.dedent(f"""
        import base64
        import json

        DATASET_B64 = "{dataset_b64}"

        # Decode on the fly
        dataset_str = base64.b64decode(DATASET_B64).decode()
        raw_data = [json.loads(line) for line in dataset_str.splitlines()]
    """)
    # ...

When you open the notebook, you just click "Run All". No file uploads. No path errors. It just works.
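The embed-and-decode round trip is easy to sanity-check outside Colab. A standalone sketch (not the project's test suite) with a two-line toy dataset:

```python
import base64
import json

# A tiny JSONL dataset, as it would be passed to generate_notebook().
dataset_jsonl = "\n".join([
    json.dumps({"instruction": "Say hi", "response": "Hi!"}),
    json.dumps({"instruction": "Add 2+2", "response": "4"}),
])

# Encode (generator side) ...
dataset_b64 = base64.b64encode(dataset_jsonl.encode()).decode()

# ... and decode (notebook side), exactly as in the generated cell.
dataset_str = base64.b64decode(dataset_b64).decode()
raw_data = [json.loads(line) for line in dataset_str.splitlines()]

print(len(raw_data), raw_data[0]["response"])  # 2 Hi!
```

Base64 inflates the payload by about 33%, which is a fine trade for small fine-tuning datasets in exchange for a notebook with zero external dependencies.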

⚑ 4. Unsloth Integration

For the actual training, I rely on Unsloth, which its authors report makes fine-tuning about 2x faster with up to 70% less memory.

SLMGen automatically configures the correct LoRA target modules for every supported model structure (Llama, Phi, Gemma, etc.), so you never face a "layer not found" error.
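A per-family lookup might look like the sketch below. The Llama entry follows the module names used in standard Unsloth examples; the Gemma entry and the matching logic are my assumptions, and the real code surely covers more families:

```python
# Illustrative mapping of model family -> LoRA target modules.
# The Llama entry matches common Unsloth examples; treat the rest as assumptions.
LORA_TARGETS = {
    "llama": ["q_proj", "k_proj", "v_proj", "o_proj",
              "gate_proj", "up_proj", "down_proj"],
    "gemma": ["q_proj", "k_proj", "v_proj", "o_proj",
              "gate_proj", "up_proj", "down_proj"],
}

def target_modules_for(model_id: str) -> list[str]:
    """Pick target modules by matching the model family in the model id."""
    for family, targets in LORA_TARGETS.items():
        if family in model_id.lower():
            return targets
    raise ValueError(f"Unsupported model family for {model_id!r}")

print(target_modules_for("unsloth/Llama-3.2-3B")[:2])  # ['q_proj', 'k_proj']
```

Hard-failing on an unknown family is deliberate: a loud error at generation time beats a cryptic "layer not found" deep inside a training run.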

🌟 Supported Models

Right now, SLMGen supports 11 top-tier SLMs:

  • Phi-4 Mini (Reasoning powerhouse)
  • Llama 3.2 (1B/3B)
  • Gemma 2 (2B)
  • Mistral 7B
  • Qwen 2.5 (Best for coding/multilingual)
  • And more...

πŸ’» Try it out

The project is fully open source. You can run it locally or deploy it yourself.

I'd love to hear your feedback! If this helps you fine-tune your first model, drop a star on the repo! ⭐
