
Santhosh

Run LLMs Locally Using Ollama (Open Source)

In this article, I'll walk you through the steps of running open-source large language models on your system with the Ollama package. Ollama is compatible with macOS, Linux, and Windows.

What you'll find in this article:

  1. What is Ollama?
  2. Installing Ollama on macOS
  3. Running Ollama
  4. Downloading models locally
  5. Commands
  6. Summary

1. What is Ollama?

Put simply: running large language models on your own computer is usually difficult, but Ollama makes it easy.

More specifically, Ollama simplifies running open-source large language models such as LLaMA2, Gemma, Mistral, and Phi on your personal system by managing the technical configuration, environment, and storage requirements for you. Tired of dealing with complex setups? Ollama lets you unleash the power of open-source LLMs on your machine with ease.

Ollama tackles these challenges by:

  • Bundling everything you need: Ollama combines the model weights, configurations, and data into a single package, making setup easier.

  • Optimizing resource usage: Ollama helps your computer use its resources efficiently, including your GPU, when running LLMs.

➡️ GitHub repository: repo

➡️ Ollama official webpage: webpage

Ollama

2. Installing Ollama on macOS

Ollama works on Windows, macOS, and Linux. This quick guide walks you through the installation process, focusing on macOS.

Ollama Download

➡️ Go to the Ollama download page and download the file: downloads

  • Download the file.
  • Extract the zip file.
  • Drag and drop the Ollama app into your Applications folder.
  • Now, run Ollama.

Ollama Installation

  • After installation, click Finish, and Ollama will appear in the menu bar.

If you wish to run Ollama in a Docker container, skip the following explanation and proceed to ➡️ Ollama Docker Image Blog: docker image

3. Running Ollama

  • Start Ollama with ollama serve. Use this when you want to start Ollama without running the desktop application.

Ollama Serve

  • List your downloaded models: ollama list (for a programmatic equivalent, see the Python sketch at the end of this section).

Ollama List

  • ➡️ Browse the Ollama Library: Models

On this page, you’ll find numerous models ready for download, available in various parameter sizes.

Ollama Models
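
When ollama serve is running, Ollama also exposes a local HTTP API (by default at http://localhost:11434). Below is a minimal Python sketch, using only the standard library, that checks whether the server is reachable and prints your locally downloaded models; the /api/tags endpoint, default port, and response shape are assumptions based on Ollama's published REST API, so check the current docs if anything differs.

```python
# Minimal sketch: ask a locally running Ollama server which models are installed.
# Assumes the default address used by `ollama serve` (http://localhost:11434).
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"

try:
    # /api/tags lists the models currently stored on this machine.
    with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags", timeout=5) as resp:
        data = json.load(resp)
    for model in data.get("models", []):
        print(model["name"])
except OSError as exc:
    print(f"Ollama does not appear to be running: {exc}")
```

This is the programmatic equivalent of ollama list, which is handy when an application needs to verify its model is available before sending prompts.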

4. Downloading models locally

Once Ollama has been set up, you can open your macOS terminal and download various models locally.

Before you download a model locally, make sure your hardware has enough memory to load it. For testing, smaller models in the 7B range are recommended; they are adequate for integrating a model into applications.

⚠️ It is strongly recommended to have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.

Download specific models:

  • ollama run llama2
  • ollama run gemma
  • ollama run mistral
  • ollama run model_name (replace model_name with the model you want). A Python sketch for calling a downloaded model from your own code follows at the end of this section.

Ollama Llama Model

Ollama Llama Model Download
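
Once a model has been pulled, you can send it prompts from your own code through the same local API. Here is a hedged Python sketch using the /api/generate endpoint; the payload fields (model, prompt, stream) and the response field follow Ollama's REST API docs, and llama2 is just an example model you would need to have downloaded first.

```python
# Minimal sketch: send a single prompt to a locally downloaded model via
# Ollama's REST API and print the reply. Assumes `ollama serve` is running
# on the default port and that `ollama run llama2` has already pulled llama2.
import json
import urllib.request

payload = {
    "model": "llama2",   # any model you have downloaded locally
    "prompt": "Explain what Ollama does in one sentence.",
    "stream": False,     # return one JSON object instead of a token stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# The generated text comes back in the "response" field.
print(result["response"])
```

If you prefer a higher-level interface, there is also an official Python client package (installable with pip as ollama), but the raw HTTP calls above avoid any extra dependencies.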

5. Commands

  • Get help inside an interactive ollama run session: /? or /help

Ollama Help

  • Remove a model: ollama rm model_name (e.g., ollama rm llama2). A sketch for scripting this from Python follows at the end of this section.

Ollama remove model
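
For housekeeping scripts, the same CLI commands can be driven from Python. The sketch below is just one way to do it: it shells out to the ollama binary installed earlier, and llama2 is an example model name you would replace with whatever you actually want to delete.

```python
# Minimal sketch: script the Ollama CLI to inspect and remove local models.
# Assumes the `ollama` binary from the installation above is on your PATH.
import subprocess

MODEL_TO_REMOVE = "llama2"  # example only; substitute the model you want to delete

# Show what is installed before removing anything (same as running `ollama list`).
subprocess.run(["ollama", "list"], check=True)

# Remove the model (same as running `ollama rm llama2`).
subprocess.run(["ollama", "rm", MODEL_TO_REMOVE], check=True)
```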

6. Summary

Ollama is a tool that simplifies running large language models (LLMs) on your local computer. It handles the technical configuration and downloads the necessary files to get you started. This guide covers installing Ollama on macOS and downloading pre-trained models like Llama2 and Gemma. Make sure you have enough RAM to run these models (at least 8 GB for the 7B models).
