Ollama is a rapidly growing development tool, with 10,000 Docker Hub pulls in a short period of time. It is an open-source tool for running large language models (LLMs), such as Meta's Llama 2, locally on your own hardware. With it, you can generate text, translate languages, write different kinds of creative content, and get informative answers to your questions.
To run Ollama on a Jetson Nano, you will need to install the following software:
- Docker Engine
- NVIDIA Container Toolkit
- Ollama Docker image
Installing Docker
To install Docker on a Jetson Nano, follow these steps:
Update the package list:
sudo apt update
Install Docker:
sudo curl -sSL https://get.docker.com/ | sh
Add your user to the Docker group:
sudo groupadd docker
sudo usermod -aG docker $USER
Log out and back in for the changes to take effect.
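As a quick sanity check after logging back in, you can confirm Docker runs without sudo. This is a sketch: hello-world is Docker's standard test image, and the result is recorded rather than letting the script abort if Docker isn't available yet.

```shell
# Sanity check: after re-login, this should succeed without sudo.
# Record the result instead of failing, so the check always exits cleanly.
if docker run --rm hello-world >/dev/null 2>&1; then
  DOCKER_OK=yes
else
  DOCKER_OK=no
fi
echo "docker without sudo: $DOCKER_OK"
```

If this prints "no", the group change has not taken effect yet; a full logout (or reboot) is sometimes needed on the Jetson.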
Installing the NVIDIA Container Toolkit
Configure the apt repository:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
| sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
| sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
| sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
Install the NVIDIA Container Toolkit packages
sudo apt-get install -y nvidia-container-toolkit
Configure Docker to use the NVIDIA runtime:
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
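To confirm the NVIDIA runtime was registered with Docker, you can grep the docker info output. A hedged check: the exact output format varies across Docker versions, so this only looks for the runtimes line and falls back to a placeholder if Docker isn't reachable.

```shell
# Look for the runtimes line in Docker's info output; "nvidia" should appear there.
# If Docker is not running, fall back to a placeholder rather than aborting.
RUNTIMES=$(docker info 2>/dev/null | grep -i 'runtimes' || echo "docker info unavailable")
echo "Runtimes: $RUNTIMES"
```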
Start the container
sudo docker run -d --gpus=all --runtime=nvidia -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
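Once the container is up, the Ollama server listens on port 11434 (the port published with -p in the docker run command above). A quick reachability sketch that tolerates the server not running:

```shell
# Ollama's API listens on port 11434 by default (published with -p above).
OLLAMA_URL="http://localhost:11434"
# The root endpoint replies with a short status message when the server is up;
# the fallback keeps this sketch from failing when it is not.
curl -fsS "$OLLAMA_URL" 2>/dev/null || echo "Ollama not reachable at $OLLAMA_URL"
```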
Run model locally
Now you can run a model:
sudo docker exec -it ollama ollama run llama2
pulling manifest
pulling 8daa9615cce3... 100% |████████████████████████████| (3.8/3.8 GB, 2.4 MB/s)
pulling 8c17c2ebb0ea... 100% |████████████████████████████████| (7.0/7.0 kB, 3.7 kB/s)
pulling 7c23fb36d801... 100% |████████████████████████████████| (4.8/4.8 kB, 2.0 kB/s)
pulling bec56154823a... 100% |█████████████████████████████████████| (59/59 B, 18 B/s)
pulling e35ab70a78c7... 100% |█████████████████████████████████████| (90/90 B, 32 B/s)
pulling 09fe89200c09... 100% |██████████████████████████████████| (529/529 B, 180 B/s)
verifying sha256 digest
writing manifest
removing any unused layers
success
The command sudo docker exec -it ollama ollama run llama2 will start the Llama 2 model inside the ollama container. This allows you to interact with the model directly from the command line.
To use the Llama 2 model, you can send it text prompts and it will generate text in response. The ollama CLI takes the prompt as a positional argument rather than a flag. For example, to generate a poem about a cat, you would run the following command:
sudo docker exec -it ollama ollama run llama2 "Write a poem about a cat."
This will generate a poem about a cat and print it to the console.
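The same container also exposes a REST API, so you can send prompts without docker exec. A sketch using Ollama's /api/generate endpoint; setting "stream": false asks for a single JSON response instead of a token stream, and the fallback keeps the command from failing if the server is down:

```shell
# Build the request payload for Ollama's /api/generate endpoint.
PAYLOAD='{"model": "llama2", "prompt": "Write a poem about a cat.", "stream": false}'
# Send it to the running container; fall back gracefully if the server is not up.
curl -s http://localhost:11434/api/generate -d "$PAYLOAD" || echo "Ollama server not running"
```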
You can also use the Llama 2 model to translate languages, write different kinds of creative content, and answer questions in an informative way.
Experiment with different prompts to test the capabilities of the model.
Here are some examples of prompts you can use with Llama 2:
- Translate the sentence "Hello, world!" into Spanish.
- Write a short story about a robot who falls in love with a human.
- Generate a list of ideas for new products.
- Answer the question "What is the meaning of life?"
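The prompts above can also be scripted. A sketch that feeds each one to the container in a loop, using the container name from the docker run command above; sudo -n avoids hanging on a password prompt in scripts, and the fallback skips quietly if the container isn't running:

```shell
COUNT=0
for p in \
  'Translate the sentence "Hello, world!" into Spanish.' \
  'Generate a list of ideas for new products.'
do
  COUNT=$((COUNT + 1))
  echo "Prompt $COUNT: $p"
  # -i without -t since this is non-interactive; -n makes sudo fail rather than prompt.
  sudo -n docker exec -i ollama ollama run llama2 "$p" 2>/dev/null \
    || echo "(ollama container not running)"
done
```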
The Llama 2 model is still under active development, but it has the potential to be a powerful tool for a variety of tasks.
Top comments (2)
Yeah so this doesn't even use the GPU on the Jetson. If it was a little more than just a rehash of hub.docker.com/r/ollama/ollama page you'd quickly see that it's just using the CPUs because of the way the Jetson works....
Great info, thanks. When I run this on my Jetson, however, I see 100% CPU usage and 0% GPU usage... what am I missing to run the model on the Ampere GPU on this board?