DEV Community

Blacknight318

Posted on • Originally published at saltyoldgeek.com

Raspberry Pi 5 Ollama Llama3.x Performance

This post contains affiliate links. If you make a purchase through these links, I may earn a small commission at no extra cost to you. Thanks for supporting Salty Old Geek! It helps cover the costs of running the blog.

Why a Raspberry Pi 5

Back in April of this year, I posted an article about setting up Ollama, Llama 3, and OpenWebUI. In parts one and two I looked at building a 1U unit with an external GPU using some adapters, and at setting up a server from an old gaming PC with the same external GPU (an Nvidia Quadro K620). This worked, but the card was under-spec'd with only 2GB of VRAM. In the search for a cost-effective node structure I've narrowed it down to two options, the first of which we'll go over in this post: the Raspberry Pi 5 with 8GB of RAM. Let's dive in!

Testing

To provide a reference point between setups, I started with a prompt designed to work the CPU and produce a meaningful run time.

The test prompt and benchmark times:

  • Test prompt

    • "Write a detailed 500-word essay on the importance of renewable energy in combating climate change."
  • Llama 3.1 model

  • Llama 3.2 model
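To make runs comparable, you can use Ollama's own stats: `ollama run <model> --verbose` prints a total duration and an eval rate at the end of each generation, and the `/api/generate` endpoint reports `eval_count` (tokens generated) and `eval_duration` (in nanoseconds). A minimal sketch of the tokens-per-second math follows; the numbers in the example are illustrative, not my Pi results:

```python
def eval_rate(eval_count: int, eval_duration_ns: int) -> float:
    """Tokens per second, from Ollama's eval_count / eval_duration fields.

    eval_duration is reported in nanoseconds, so convert to seconds first.
    """
    return eval_count / (eval_duration_ns / 1e9)

# Illustrative only: 350 tokens generated over 70 seconds of eval time
print(round(eval_rate(350, 70_000_000_000), 2))  # → 5.0
```

This is the same figure the `--verbose` flag prints as "eval rate", so either route gives you a consistent number to compare across models and machines.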

NUMA, is it worth the patch?

While looking for ways to improve performance, if not with shorter response times then with better thermal behavior, I came upon an article by Jeff Geerling, "NUMA Emulation speeds up Pi 5 (and other improvements)", and decided to give it a try. Note that I tested this months after Jeff's post had been published, so the patch might now be standard in the Raspberry Pi kernel; in my case, it made no meaningful difference.
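If you want to check whether your kernel is already presenting the emulated NUMA nodes the patch adds, the node directories under sysfs are a quick signal. A small sketch, assuming a Linux sysfs layout (the path is standard Linux, not Pi-specific); an unpatched Pi 5 kernel typically exposes a single node or none:

```python
import glob

def numa_node_count() -> int:
    """Count NUMA nodes the kernel exposes under sysfs (Linux only).

    Returns 0 on systems where /sys/devices/system/node is absent.
    """
    return len(glob.glob("/sys/devices/system/node/node[0-9]*"))

print(f"NUMA nodes exposed by the kernel: {numa_node_count()}")
```

More than one node reported here suggests the emulation is active, which is worth confirming before attributing any benchmark difference to the patch.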

TLDR

For the best value, the Raspberry Pi 5 with the Argon Neo NVMe case is my second choice (second only to an M-series MacBook or Mac Mini with 16GB of RAM). If you're comfortable with slightly longer response times or some prompt tweaking, then the Pi setup running Ollama with llama3.2:latest is the sweet spot. I may go into cost in more detail in a future post. That said, if you want more power I'd go for the MacBook or Mac Mini, any M series with 16GB of RAM or more, and you'd still come in under the price of an equivalent PC with NVIDIA GPU(s) for the job.

I hope you found this post helpful; if so, consider buying me a coffee. Till next time, fair winds and following seas.
