I remember using ChatGPT for the first time to write a reply when I received appreciation from the leadership team for my work at my previous company. Nowadays AI is part of day-to-day life and has made my life easier. I started wondering whether I could run an LLM locally on my laptop, so I installed the Ollama desktop app for Windows. My laptop, with just 16 GB of RAM, handled small models fine for basic email-writing tasks. But with a 1B-parameter model running alongside my regular apps like Teams and Chrome, the laptop would frequently become unresponsive. On my other laptop with a dedicated graphics card, I was able to run models up to 8B parameters smoothly.
I thought: why can't we use the Intel GPU to handle the GPU-heavy work on this laptop? I started exploring and found a reference to Intel's ipex-llm project on GitHub. It provides a portable package that you can extract and use to run Ollama locally on an Intel GPU. I did this setup on Ubuntu 24.04 running in Windows WSL. Here is the step-by-step process:
- Update the GPU driver on the machine
Follow the steps below to install the required packages from Intel.
A. Refresh the package index and install software-properties-common (which provides add-apt-repository)
sudo apt-get update
sudo apt-get install -y software-properties-common
B. Add the intel-graphics Personal Package Archive (PPA)
sudo add-apt-repository -y ppa:kobuk-team/intel-graphics
C. Install compute-related packages
sudo apt-get install -y libze-intel-gpu1 libze1 intel-metrics-discovery intel-opencl-icd clinfo intel-gsc
D. Install media-related packages
sudo apt-get install -y intel-media-va-driver-non-free libmfx-gen1 libvpl2 libvpl-tools libva-glx2 va-driver-all vainfo
E. Verify the installation
clinfo | grep "Device Name"
If the output does not list your Intel GPU device, there could be a permissions issue with the user you are using; run the commands below to add your user to the render group.
sudo gpasswd -a ${USER} render
newgrp render
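To double-check the fix, confirm your user is now in the render group and re-run the clinfo check:
id -nG ${USER} | grep render
clinfo | grep "Device Name"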
With the steps above, we have installed the Intel graphics packages in Ubuntu running inside WSL.
- Download the Ollama portable tgz file from this link.
- Extract the file
tar -xvf [Downloaded tgz file path]
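For example, if the downloaded archive were named ollama-ipex-llm-ubuntu.tgz (the actual file name depends on the release you downloaded), the command would be:
tar -xvf ollama-ipex-llm-ubuntu.tgz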
- Go to the extracted folder and run start-ollama.sh
cd PATH/TO/EXTRACTED/FOLDER
./start-ollama.sh
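If you want to reach the server from outside WSL (for example from an app running on Windows), you can try setting the standard Ollama environment variable before starting it; I am assuming here that the portable build honors OLLAMA_HOST like a regular Ollama install:
export OLLAMA_HOST=0.0.0.0:11434
./start-ollama.sh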
- Open another terminal and run your model
cd PATH/TO/EXTRACTED/FOLDER
./ollama run llama3.2:1b
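Since it is a regular Ollama server underneath, you can also talk to it over the HTTP API from another terminal; the examples below assume the default port 11434:
curl http://localhost:11434/api/tags
curl http://localhost:11434/api/generate -d '{"model": "llama3.2:1b", "prompt": "Write a one-line greeting.", "stream": false}'
The first call lists the models the server knows about, and the second sends a quick non-streaming prompt to the model we just ran.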
- You can verify the GPU usage from the Windows Task Manager.
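If you prefer checking from inside WSL instead of Task Manager, you can optionally install intel-gpu-tools and watch the GPU load while a prompt is running; note that this package is not part of the steps above and may or may not report correctly under WSL:
sudo apt-get install -y intel-gpu-tools
sudo intel_gpu_top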
Conclusion
I was able to run small models like qwen3:1.7b, qwen3:0.6b, llama3.2:1b, and gemma3:1b smoothly. Running deepseek-r1:1.5b gave garbage responses, and I managed to run gemma3:4b only once; after that it kept failing. That is about what you can expect from a machine with 16 GB of RAM and an i5 processor. It was a good learning experience, and I connected the locally running Ollama to LibreChat and played with it.
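For the LibreChat side, the locally running Ollama server can be added as a custom endpoint. As a quick sanity check that the server answers OpenAI-style chat requests (which is what LibreChat expects), something like the following should work, assuming the portable build exposes Ollama's usual OpenAI-compatible API:
curl http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "llama3.2:1b", "messages": [{"role": "user", "content": "Hello"}]}'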