DEV Community

Cover image for I Fine-Tuned a Security Reasoning Model That Runs on a 4GB Laptop (No GPU, No Cloud)
Nguuma Tyokaha
Nguuma Tyokaha

Posted on

I Fine-Tuned a Security Reasoning Model That Runs on a 4GB Laptop (No GPU, No Cloud)

The Problem: Security AI Needs to Stay On Your Machine

Every time you paste a suspicious log, a CVE description, or an internal config into a cloud LLM, that data leaves your machine.

For security work red team engagements, incident response, air-gapped environments that's a real problem. You can't send client data to an API. You can't pipe internal logs to OpenAI.

But local security models have been terrible. They either:

  • Require expensive hardware (A100, 80GB VRAM)
  • Don't reason

they pattern-match and hallucinate CVE numbers

  • Have no training signal for the AI-native threats that actually matter in 2025–2026

So I built one that doesn't have those problems.

What I Built

security-slm-unsloth-1.5b a fine-tuned DeepSeek-R1-Distill-Qwen-1.5B model that:

  • Runs offline on a 4GB RAM laptop, CPU only
  • Thinks before it answers 100% chain-of-thought (<think>) activation
  • Covers 2026 AI-native attack classes: MCP tool poisoning, Crescendo jailbreaks, agentic lateral movement, LLM-assisted SSRF
  • Ships as a 1.2GB GGUF fits on a USB drive
  • Trained on free Google Colab T4 in ~30 minutes

Results after fine-tuning:

Metric Before After
Eval score (/ 10) 2.2 6.0
Improvement +173%
<think> block rate 20–60% 100%
Avg response length 50–150 words 200–500 words

Why DeepSeek-R1-Distill-1.5B?

Most domain fine-tuning guides tell you to pick the biggest model you can fit. That's wrong for this use case.

The key insight: DeepSeek-R1-Distill-Qwen-1.5B is the smallest model that reliably produces structured <think> reasoning chains. That matters for security work because you need the model to show its work a black-box answer to "is this a path traversal?" is useless. You need the reasoning chain so you can audit it.

The <think> block is not cosmetic. It's the model's reasoning substrate. Strip it and you get pattern-matching, not analysis.

The Technical Approach

1. Reasoning-Preserving Training Format

Every training sample enforces the reasoning structure explicitly:

{
  "instruction": "An AI agent received this tool-call response: {\"file\": \"../../../../etc/passwd\"}. Is this a path traversal attack?",
  "content": "<think>\nStep 1 Identify the pattern: the string '../../../../etc/passwd' is a classic path traversal sequence...\nStep 2 Assess the trust boundary: tool-call responses should be treated as untrusted input...\nStep 3 Determine severity: /etc/passwd exposes system user accounts...\nStep 4 Evaluate agent response options: block, sanitize, or escalate...\nStep 5 Select mitigation: reject the response, log the event, alert the operator...\n</think>\n\nYes, this is a path traversal attack. The sequence '../../../..' attempts to escape the intended directory scope..."
}
Enter fullscreen mode Exit fullscreen mode

Minimum 5 reasoning steps per sample. Non-negotiable.

2. Full Projection-Layer LoRA

Most fine-tuning tutorials only target attention projections (q_proj, v_proj). That's not enough for security reasoning you need to update the feed-forward reasoning layers too.

target_modules = [
    "q_proj", "k_proj", "v_proj", "o_proj",  # attention
    "gate_proj", "up_proj", "down_proj"        # feed-forward reasoning
]
Enter fullscreen mode Exit fullscreen mode

All 7 layers. LoRA rank r=16. This modifies ~1% of parameters while injecting domain knowledge into both attention and reasoning pathways.

3. Dual-Axis Dataset Design

Every threat scenario is a matched red/blue pair same attack, both perspectives:

# Threat Red Team Blue Team
1 MCP Security Tool description injection → ENV exfiltration Validation schema with scope enforcement
2 Prompt Hijacking Payload splitting across 3 turns (bypasses LlamaGuard) Semantic drift monitor with cross-turn context
3 Agentic Security Recursive tool-call loop → resource exhaustion Token budget circuit breaker + HITL escalation
4 RAG Poisoning Malicious PDF overwrites system prompt AWS IAM least-privilege scoped to single S3 prefix
5 Crescendo Attack 6-turn conversational escalation jailbreak Cross-turn intent accumulation with LlamaGuard
6 Lateral Movement Search→Email→Storage chain abuse Inter-tool permission boundary enforcement
7 LLM SSRF URL-fetching LLM → EC2 metadata credential theft SSRF-safe HTTP client + IP allowlist

This dual-axis approach means the model doesn't become purely offensive — it can reason from both sides of the same attack.

4. Quantisation Decision

Q4_K_M was selected after analysing the quality/size tradeoff at 1.5B scale:

Format RAM Quality Decision
Q8_0 ~1.8GB 99.9% Too large for 4GB headroom
Q4_K_M ~1.2GB ~99% Selected
Q4_0 ~1.0GB ~97% Measurable quality loss
Q2_K ~0.7GB ~90% Not suitable for reasoning

At 1.5B parameters, Q4_K_M retains ~99% of full-precision quality. The quality cliff only appears at Q2_K for this model size.


Training on Free Colab in 30 Minutes

The full pipeline runs on a free Google Colab T4 (15GB VRAM). Unsloth handles the memory efficiency training uses under 3GB VRAM.

from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/deepseek-r1-distill-qwen-1.5b-unsloth-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj","k_proj","v_proj","o_proj","gate_proj","up_proj","down_proj"],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
)
Enter fullscreen mode Exit fullscreen mode

Key hyperparameters:

  • Learning rate: 2e-4
  • Batch size: 2 (effective 8 with gradient accumulation × 4)
  • Epochs: 2
  • Checkpoint every 25 steps (crash protection on free Colab sessions)
  • Final training loss: 2.66

Try It Now 3 Ways

Ollama (one command, no Python)

ollama run hf.co/Nguuma/security-slm-unsloth-1.5b
Enter fullscreen mode Exit fullscreen mode

Python (llama-cpp-python)

# pip install llama-cpp-python huggingface_hub
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="Nguuma/security-slm-unsloth-1.5b",
    filename="security-slm-finetuned-deepseek-r1-distill-qwen-1.5b.Q4_K_M.gguf",
    local_dir="./models",
)

llm = Llama(model_path=model_path, n_ctx=2048, n_threads=4, verbose=False)

response = llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are a Cybersecurity assistant with Blue and Red team security reasoning. Think step by step before answering.",
        },
        {
            "role": "user",
            "content": 'An AI agent received this tool-call response: {"file": "../../../../etc/passwd"}. Is this a path traversal attack? What should the agent do?',
        },
    ],
    max_tokens=512,
    temperature=0.7,
)

print(response["choices"][0]["message"]["content"])
Enter fullscreen mode Exit fullscreen mode

Prompt format (for any inference engine)

<|im_start|>system
You are a Cybersecurity assistant with Blue and Red team security reasoning. Think step by step before answering.
<|im_end|>
<|im_start|>user
Your question here
<|im_end|>
<|im_start|>assistant
<think>
Enter fullscreen mode Exit fullscreen mode

Always open the assistant turn with <think> this triggers the reasoning chain.

What It's Good At

  • Analysing suspicious logs and tool-call responses for attack patterns
  • Drafting detection rules (Sigma, YARA, KQL) from attack descriptions
  • Reasoning through MCP and agentic attack surfaces
  • Walking through CVE-analogous scenarios step by step
  • Generating incident response playbook outlines
  • CTF challenge reasoning with explained steps

What It's Not

  • Not a general security encyclopedia it's a specialist
  • Not a substitute for a professional pentest
  • Not trained on every CVE highly specific CVE details may be wrong

What's Next

Areas I want to expand:

  1. DPO alignment pairs chosen/rejected samples to reduce hallucination on specific CVE numbers
  2. Multi-turn adversarial chains full 5-turn attack simulations with attacker/defender roles
  3. Framework-specific coverage LangChain, AutoGen, CrewAI, MCP server implementations
  4. Higher LoRA rank (r=32) more capacity for complex multi-step reasoning

If you work in security and want to contribute scenarios or feedback on the threat coverage, open an issue on the HuggingFace repo or drop a comment below.

Links

Built on free infrastructure. Runs on commodity hardware. Stays on your machine.

Top comments (0)