Korak Kurani

How to Set Up a Local Ubuntu Server to Host Ollama Models with a WebUI

Are you ready to set up a powerful local server to host Ollama models and interact with them via a sleek WebUI? This guide will take you through each step, from preparing your Ubuntu server to installing Ollama and integrating OpenWebUI for seamless interaction.

Whether you're a beginner or an experienced user, this comprehensive guide will make the process straightforward and error-free. Let's get started!

Installing Ubuntu Server on a PC

Before diving into the server setup, you need to install Ubuntu Server on your PC. Follow these steps to get started:

Step 1: Download Ubuntu Server ISO

  1. Visit the Ubuntu Server Download Page (https://ubuntu.com/download/server).
  2. Download the latest version of the Ubuntu Server ISO file.

Step 2: Create a Bootable USB Drive

Use tools like Rufus (Windows) or dd (Linux/Mac) to create a bootable USB drive:

  • For Rufus: Select the ISO file and your USB drive, then click "Start."
  • For dd on Linux/Mac:

     sudo dd if=/path/to/ubuntu-server.iso of=/dev/sdX bs=4M status=progress
    

    Replace /dev/sdX with your USB device (you can identify it with the commands shown below).
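
If you're not sure which device name to use, list the block devices first. This is a generic check; the device names on your machine will differ:

   lsblk
   # or, for partition-level detail:
   sudo fdisk -l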

Step 3: Boot from USB and Install Ubuntu Server

  1. Insert the USB drive into the PC and restart it.
  2. Enter the BIOS/UEFI (usually by pressing DEL, F2, or F12 during startup).
  3. Set the USB drive as the primary boot device and save the changes.
  4. Follow the on-screen instructions to install Ubuntu Server.
  • Select your language, keyboard layout, and network configuration.
  • Partition the disk as needed (guided options work for most setups).
  • Set up a username, password, and hostname for the server.

Complete the installation and reboot the system. Remove the USB drive during the reboot.
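
If you chose the OpenSSH server option during installation, you can manage the server remotely from here on. A minimal check, assuming the username you created is ubuntu (adjust to your own):

   # On the server: find its IP address
   ip a

   # From another machine on the same network
   ssh ubuntu@<server-ip>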


Setting Up Your Ubuntu Server

Step 1: Update and Install Essential Packages

To ensure your server is up-to-date and has the necessary tools, run the following commands:

sudo apt update && sudo apt upgrade -y
sudo apt install build-essential dkms linux-headers-$(uname -r) software-properties-common -y

Step 2: Add NVIDIA Repository and Install Drivers

If your server includes an NVIDIA GPU, follow these steps to install the appropriate drivers:

  • Add the NVIDIA PPA:
   sudo add-apt-repository ppa:graphics-drivers/ppa -y
   sudo apt update
  • Detect the recommended driver:
   ubuntu-drivers devices

Example output:

   driver   : nvidia-driver-560 - third-party non-free recommended
  • Install the recommended driver:
   sudo apt install nvidia-driver-560 -y
   sudo reboot
  • Verify the installation:
   nvidia-smi

This should display GPU details and driver version. If not, revisit the steps.
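
If nvidia-smi is missing or fails, it can also help to confirm that the NVIDIA kernel module actually loaded; these are generic checks that don't depend on the driver version:

   lsmod | grep nvidia
   cat /proc/driver/nvidia/version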


Step 3: Configure NVIDIA GPU as Default

If your system has an integrated GPU, disable it to ensure NVIDIA is the default:

  • Identify GPUs:
   lspci | grep -i vga
  • Blacklist the integrated GPU driver:
   sudo nano /etc/modprobe.d/blacklist-integrated-gpu.conf

Add the following lines based on your GPU type:

For Intel:

   blacklist i915
   options i915 modeset=0

For AMD:

   blacklist amdgpu
   options amdgpu modeset=0
  • Update and reboot:
   sudo update-initramfs -u
   sudo reboot

Verify again with:

nvidia-smi
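
As an extra sanity check, confirm that the blacklisted integrated-GPU module is no longer loaded. The exact module name depends on your hardware, so adjust the pattern as needed:

   lsmod | grep -E 'i915|amdgpu'
   # No output for your integrated GPU means the blacklist took effect.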

Installing and Setting Up Ollama

Step 1: Install Ollama

Download and install Ollama using the following command:

curl -fsSL https://ollama.com/install.sh | sh
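
On Linux, the install script also sets up a systemd service for Ollama. A quick way to confirm it is installed and listening on its default port (11434):

   ollama --version
   systemctl status ollama
   curl http://127.0.0.1:11434   # should respond with "Ollama is running"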

Step 2: Add Models to Ollama

Ollama allows you to work with different models. For example, to add the llama3 model, run:

ollama pull llama3
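
Once the model has downloaded, you can test it interactively or through Ollama's local REST API. A minimal example; the prompt text is arbitrary:

   # Interactive chat in the terminal
   ollama run llama3

   # One-off request against the local API
   curl http://127.0.0.1:11434/api/generate -d '{
     "model": "llama3",
     "prompt": "Why is the sky blue?",
     "stream": false
   }'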

Setting Up OpenWebUI for Seamless Interaction

To enhance your experience with Ollama, integrate OpenWebUI, a user-friendly interface for interacting with models. OpenWebUI runs as a Docker container, so Docker needs to be installed first.
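
If Docker is not yet installed, one simple option on Ubuntu is the docker.io package from the default repositories (Docker's own packages work just as well; this is only a minimal sketch):

   sudo apt install docker.io -y
   sudo systemctl enable --now docker
   sudo docker --version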

  • Run the following Docker command to set up OpenWebUI:
   sudo docker run -d --network=host -v open-webui:/app/backend/data \
       -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
       --name open-webui --restart always \
       ghcr.io/open-webui/open-webui:main
  • This command sets up a containerized WebUI with:

    • Data persistence via the open-webui volume.
    • Ollama base URL configuration for model interaction.
  • Access the WebUI through your server's IP address. With --network=host, OpenWebUI listens on port 8080 by default, so the address looks like http://<server-ip>:8080.
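
If ufw (the default Ubuntu firewall) is enabled, you may also need to allow that port from your local network. A hedged example assuming a 192.168.1.0/24 home subnet; adjust to your own:

   sudo ufw status
   sudo ufw allow from 192.168.1.0/24 to any port 8080 proto tcp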


Testing and Troubleshooting

Verify NVIDIA GPU Functionality

Use nvidia-smi to confirm the GPU is functioning properly. If you encounter errors like Command not found, revisit the driver installation process.

Common Errors and Fixes

Error: ERROR:root:aplay command not found

  • Fix: Install alsa-utils:
  sudo apt install alsa-utils -y

Error: udevadm hwdb is deprecated. Use systemd-hwdb instead.

  • Fix: Rebuild the hardware database with systemd-hwdb (the replacement for udevadm hwdb), then update system packages:
  sudo systemd-hwdb update
  sudo apt update && sudo apt full-upgrade -y

Optional: CUDA Setup for Compute Workloads

For advanced compute tasks, install CUDA tools:

  • Install CUDA:
   sudo apt install nvidia-cuda-toolkit -y
  • Verify CUDA installation:
   nvcc --version
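
To confirm that Ollama is actually using the GPU rather than falling back to the CPU, load a model and watch the GPU from a second terminal. Both commands are standard, though their output format varies by version:

   # Terminal 1: load a model
   ollama run llama3

   # Terminal 2: the ollama process should appear with GPU memory allocated
   watch -n 1 nvidia-smi

   # Ollama can also report where a loaded model is running (CPU vs. GPU)
   ollama ps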

Congratulations! You've set up a robust local Ubuntu server for hosting Ollama models and interacting with them via OpenWebUI. This setup is perfect for experimenting with AI models in a controlled, local environment.

If you encounter any issues, double-check the steps and consult the documentation. Enjoy exploring the possibilities of Ollama and OpenWebUI!


Top comments (4)

joja

How would you make it go into a low-energy mode after 15 minutes, once Ollama stops the model? And wake up on a remote connection?

Korak Kurani

That is a good idea. I did not add anything like that because it's only used by me, so I just turn off the server completely whenever it's not in use.

Stu De

I think you missed a section about installing Docker, and maybe opening a firewall port to allow access to the WebUI from your local network and adding the --listen flag to webui.py.

Korak Kurani

You are totally right, thanks for the note :)

I will update the blog as soon as I can.

