Running large language models locally has never been easier thanks to Ollama. It allows you to download, run, and manage LLMs on your own machine with minimal configuration. In this guide, you’ll learn how to install Ollama on both Linux and Windows, configure it properly, and run your first model.
🚀 What is Ollama?
Ollama is a lightweight runtime that lets you run open-source LLMs locally (like Llama, Mistral, Gemma, and others). It handles model downloading, optimization, and inference through a simple CLI and API.
Key benefits:
Runs models locally (privacy-first)
Simple CLI interface
Supports multiple LLMs
Works on Linux, Windows (WSL2), and macOS
Built-in model management
🐧 How to Install Ollama on Linux
1. System Requirements
Before installing, make sure you have:
Linux distro (Ubuntu recommended)
64-bit system
At least 8GB RAM (16GB+ recommended for larger models)
GPU optional (NVIDIA improves performance)
2. Install via official script
Open terminal and run:
curl -fsSL https://ollama.com/install.sh | sh
This script will:
Download Ollama
Install binaries
Set up system service (if supported)
3. Verify installation
ollama --version
If installed correctly, you should see version output.
4. Run your first model
ollama run llama3
This will:
Download the model (first run only)
Start interactive chat session
5. Useful Linux commands
ollama list # show installed models
ollama pull mistral # download model
ollama rm llama3 # remove model
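The same model inventory is also available over Ollama's local HTTP API. Here is a minimal Python sketch that queries the /api/tags endpoint (the endpoint and the "models" response field come from Ollama's public API; the helper names are illustrative):

```python
import json
import urllib.request

def model_names(tags_json: dict) -> list[str]:
    # Pull the model names out of the /api/tags response payload.
    return [m["name"] for m in tags_json.get("models", [])]

def list_installed_models(host: str = "http://localhost:11434") -> list[str]:
    # GET /api/tags returns the locally installed models as JSON.
    with urllib.request.urlopen(f"{host}/api/tags") as resp:
        return model_names(json.load(resp))

# Usage (with the Ollama server running):
# print(list_installed_models())  # e.g. ['llama3:latest', 'mistral:latest']
```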
🪟 How to Install Ollama on Windows
Windows installation is slightly different: you can use either the native app or WSL2.
Option 1: Native Windows Installation (Recommended)
1. Download installer
Go to:
👉 https://ollama.com/download
Download the Windows installer (.exe).
2. Install
Run the installer
Follow setup wizard
Ollama will install system services automatically
3. Verify installation
Open PowerShell:
ollama --version
4. Run a model
ollama run llama3
Option 2: Install via WSL2 (Advanced users)
If you want Linux-like performance:
1. Enable WSL2
wsl --install
Restart your system.
2. Install Ubuntu
From Microsoft Store, install Ubuntu.
3. Install Ollama inside WSL
curl -fsSL https://ollama.com/install.sh | sh
4. Run model
ollama run mistral
⚡ Running Models with Ollama
Once installed, you can run different models:
ollama run llama3
ollama run mistral
ollama run gemma
You can also pass prompts directly:
ollama run llama3 "Explain quantum computing in simple terms"
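One-shot prompts like this make Ollama easy to script from other programs. A minimal Python sketch using subprocess (it assumes ollama is on your PATH; the function names are illustrative):

```python
import subprocess

def build_run_cmd(model: str, prompt: str) -> list[str]:
    # Assemble the one-shot CLI invocation shown above.
    return ["ollama", "run", model, prompt]

def ask(model: str, prompt: str) -> str:
    # Run the model once and capture its reply from stdout.
    result = subprocess.run(
        build_run_cmd(model, prompt),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

# Usage (with Ollama installed and the model pulled):
# print(ask("llama3", "Explain quantum computing in simple terms"))
```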
🔌 Using Ollama API (Local AI Server)
Ollama runs a local API server automatically.
Example request:
curl http://localhost:11434/api/generate -d '{
"model": "llama3",
"prompt": "Write a blog about AI"
}'
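By default, /api/generate streams its answer as one JSON object per line, each carrying a fragment of the reply in a "response" field. A minimal Python client sketch (endpoint and field names per Ollama's API; error handling omitted):

```python
import json
import urllib.request

def join_stream(lines) -> str:
    # Concatenate the "response" fragments from streamed JSON lines.
    return "".join(json.loads(line)["response"]
                   for line in lines if line.strip())

def generate(model: str, prompt: str,
             host: str = "http://localhost:11434") -> str:
    # POST the prompt and reassemble the streamed reply.
    payload = json.dumps({"model": model, "prompt": prompt}).encode()
    req = urllib.request.Request(
        f"{host}/api/generate", data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return join_stream(resp.read().decode().splitlines())

# Usage (with the Ollama server running):
# print(generate("llama3", "Write a blog about AI"))
```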
This makes Ollama usable for:
apps
bots
automation
coding assistants
🧠 Performance Tips
Use smaller models (3B–8B) for CPU-only machines
Enable GPU acceleration on NVIDIA systems
Close heavy apps to free RAM
Use quantized models for better speed
🧩 Common Issues & Fixes
❌ Command not found
Restart terminal
Check PATH variables
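A quick Python check for the PATH problem (assuming the binary is named ollama):

```python
import shutil

def ollama_on_path() -> bool:
    # shutil.which searches PATH the same way your shell does.
    return shutil.which("ollama") is not None

print("ollama found:", ollama_on_path())
```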
❌ Slow performance
Use smaller model
Ensure GPU drivers are installed
❌ Model download stuck
ollama rm <model>       # remove the stalled download
ollama pull <model>     # pull it again
🔥 Final Thoughts
Ollama is one of the easiest ways to run local AI models on Linux and Windows in 2026. Whether you're a developer, researcher, or AI enthusiast, it provides a simple yet powerful way to bring LLMs directly to your machine without relying on cloud APIs.
If you're building AI tools or experimenting with local inference, Ollama is the fastest way to get started.
Source: https://inferencerig.com/setup/how-to-install-ollama-on-linux-and-windows-complete-setup-guide/