DEV Community

Cover image for Run Gemma-4 12B on WSL2 with llama.cpp
0xkoji
0xkoji

Posted on

Run Gemma-4 12B on WSL2 with llama.cpp

1. update WSL environment

sudo apt update && sudo apt upgrade -y
Enter fullscreen mode Exit fullscreen mode

2. install dependencies

If you don't use -hf option, you don't need to install libssl-dev in this step.

sudo apt install build-essential cmake git libssl-dev -y
Enter fullscreen mode Exit fullscreen mode

If nvidia-smi shows a GPU/GPUs on your terminal, you will need to install the tooklit. This will take some time.

sudo apt install nvidia-cuda-toolkit -y
Enter fullscreen mode Exit fullscreen mode

3. clone the repo

Build llama-cli and llama-server. This step also will take some time.
If you don't plan to use -hf option, you don't need to use -DLLAMA_OPENSSL=ON.

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON -DLLAMA_OPENSSL=ON
cmake --build build --config Release

# no GPU
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
Enter fullscreen mode Exit fullscreen mode

4. run the model

Run gemma-4-12b-it with cli and server.

unsloth/gemma-4-12b-it-GGUF Β· Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co
./build/bin/llama-cli -hf unsloth/gemma-4-12b-it-GGUF:UD-Q4_K_XL
Enter fullscreen mode Exit fullscreen mode
> hello

[Start thinking]
The user said "hello".
The user is initiating a conversation.
Respond politely and offer assistance.

    *   "Hello! How can I help you today?"
    *   "Hi there! What's on your mind?"
    *   "Hello! Is there anything I can assist you with?"
[End thinking]

Hello! How can I help you today?

[ Prompt: 19.5 t/s | Generation: 11.8 t/s ]
Enter fullscreen mode Exit fullscreen mode

or run web-ui

./build/bin/llama-server -hf unsloth/gemma-4-12b-it-GGUF:UD-Q4_K_XL --port 8080
Enter fullscreen mode Exit fullscreen mode

optional download model from huggingface

mkdir -p models
wget -O models/gemma-4-12b-it-UD-Q4_K_XL.gguf https://huggingface.co/unsloth/gemma-4-12b-it-GGUF/resolve/main/gemma-4-12b-it-UD-Q4_K_XL.gguf
Enter fullscreen mode Exit fullscreen mode

Top comments (0)