DEV Community

Ajeet Singh Raina

Running Llama 2 on NVIDIA Jetson Nano with GPU using Docker

Ollama is a rapidly growing developer tool for running large language models (LLMs) locally, with 10,000 Docker Hub pulls in a short period of time. In this post we will use it to run Llama 2, an open LLM from Meta trained on a massive dataset of text and code. Llama 2 can generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.

To run Ollama on a Jetson Nano, you will need to install the following software:

  • Docker Engine
  • NVIDIA Container Toolkit
  • The Ollama Docker image (ollama/ollama)

Installing Docker

To install Docker on a Jetson Nano, follow these steps:

Update the package list:

sudo apt update

Install Docker:

curl -sSL https://get.docker.com/ | sudo sh

Add your user to the Docker group:

sudo groupadd docker
sudo usermod -aG docker $USER

Log out and back in for the changes to take effect.
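Before moving on, it is worth confirming the installation took. A quick sanity check might look like this (a sketch; the exact version string will vary, and the hello-world smoke test needs network access):

```shell
# Sanity check (sketch): confirm Docker is installed and the daemon responds.
if command -v docker >/dev/null 2>&1; then
    echo "docker: $(docker --version)"
    # hello-world is Docker's own smoke-test image
    docker run --rm hello-world >/dev/null 2>&1 && echo "daemon: OK"
else
    echo "docker: not found - rerun the install script above"
fi
```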

Installing the NVIDIA Container Toolkit

To let Docker containers use the Jetson's GPU, install the NVIDIA Container Toolkit with apt.

Configure the repository

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
    | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
    | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
    | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update

Install the NVIDIA Container Toolkit packages

sudo apt-get install -y nvidia-container-toolkit
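To verify the package landed, you can query dpkg (a sketch; assumes the Debian-based JetPack userland):

```shell
# Check whether the toolkit package is registered with dpkg (sketch).
if dpkg -s nvidia-container-toolkit >/dev/null 2>&1; then
    echo "nvidia-container-toolkit: installed"
else
    echo "nvidia-container-toolkit: not installed yet"
fi
```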

Configure Docker to use Nvidia driver

sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
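The `nvidia-ctk` command above writes the runtime entry into Docker's daemon configuration. You can confirm it was registered before starting any containers (a sketch; assumes the default config path):

```shell
# Confirm the NVIDIA runtime entry was written to the daemon config (sketch).
CONF=/etc/docker/daemon.json
if [ -f "$CONF" ] && grep -q '"nvidia"' "$CONF"; then
    echo "nvidia runtime: configured in $CONF"
else
    echo "nvidia runtime: not found in $CONF"
fi
```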

Start the container

sudo docker run -d --gpus=all --runtime=nvidia -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
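Once the container is up, the Ollama server listens on the published port 11434, and a plain GET to the root path returns the status string "Ollama is running". A quick health check (sketch):

```shell
# Health check (sketch): the container publishes port 11434 on the host.
# When the server is up, a GET to / returns "Ollama is running".
if curl -s --max-time 5 http://localhost:11434/; then
    echo
else
    echo "API not reachable - check 'docker ps' and 'docker logs ollama'"
fi
```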

Run a model locally

Now you can run a model:

sudo docker exec -it ollama ollama run llama2
pulling manifest
pulling 8daa9615cce3... 100% |████████████████████████████| (3.8/3.8 GB, 2.4 MB/s)
pulling 8c17c2ebb0ea... 100% |████████████████████████████████| (7.0/7.0 kB, 3.7 kB/s)
pulling 7c23fb36d801... 100% |████████████████████████████████| (4.8/4.8 kB, 2.0 kB/s)
pulling bec56154823a... 100% |█████████████████████████████████████| (59/59 B, 18 B/s)
pulling e35ab70a78c7... 100% |█████████████████████████████████████| (90/90 B, 32 B/s)
pulling 09fe89200c09... 100% |██████████████████████████████████| (529/529 B, 180 B/s)
verifying sha256 digest
writing manifest
removing any unused layers
success

The command sudo docker exec -it ollama ollama run llama2 starts the Llama 2 model inside the ollama container and drops you into an interactive prompt, so you can chat with the model directly from the command line.

To use the Llama 2 model, you can send it text prompts and it will generate text in response. For example, to generate a poem about a cat, pass the prompt as an argument to ollama run:

docker exec -it ollama ollama run llama2 "Write a poem about a cat."

This will generate a poem about a cat and print it to the console.

You can also use the Llama 2 model to translate languages, write different kinds of creative content, and answer your questions in an informative way.

Experiment with different prompts to test the capabilities of the Llama 2 model.

Here are some examples of prompts you can use with the Llama 2 model:

  • Translate the sentence "Hello, world!" into Spanish.
  • Write a short story about a robot who falls in love with a human.
  • Generate a list of ideas for new products.
  • Answer the question "What is the meaning of life?"
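Beyond the interactive CLI, the same prompts can be sent to the Ollama server's HTTP API on port 11434. A minimal sketch that builds the JSON body for the /api/generate endpoint (the commented-out curl call assumes the container started earlier is still running):

```shell
# Build a JSON body for Ollama's /api/generate endpoint (sketch).
# "stream": false asks for one complete JSON response instead of a token stream.
PROMPT="Write a poem about a cat."
PAYLOAD=$(printf '{"model": "llama2", "prompt": "%s", "stream": false}' "$PROMPT")
echo "$PAYLOAD"

# With the container running, send it like this:
# curl -s http://localhost:11434/api/generate -d "$PAYLOAD"
```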

Ollama and the Llama 2 model are both still evolving quickly, but together they already have the potential to be a powerful local toolkit for a variety of tasks.

Top comments (2)

andyD

Yeah so this doesn't even use the GPU on the Jetson. If it was a little more than just a rehash of hub.docker.com/r/ollama/ollama page you'd quickly see that it's just using the CPUs because of the way the Jetson works....

andyD

Great info, thanks. When I run this on my Jetson however I see 100% CPU usage and 0% GPU usage... what am I missing to run the model on the Ampere GPU on this board?