DEV Community

kabeer1choudary
The Ultimate Guide: Installing Ollama on Fedora 43

Running large language models (LLMs) locally isn’t just for the privacy-obsessed anymore—it’s for anyone who wants a snappy, custom coding assistant without a monthly subscription. If you’re rocking Fedora 43, you’re already using one of the most cutting-edge distros out there.

Here is how to get Ollama up and running with full NVIDIA acceleration and hook it into VS Code for a seamless dev experience.

There’s something uniquely satisfying about seeing your GPU fans spin up because your local AI is thinking. Let’s get you there in seven steps, plus a bonus VS Code integration.

Step 1: Open the Gates (RPM Fusion)

Fedora is known for its commitment to free, open-source software, which means the proprietary NVIDIA drivers aren't there by default. We need to add the RPM Fusion repositories to get the "non-free" goodies.

Run this in your terminal:

```shell
sudo dnf5 install https://mirrors.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm
sudo dnf5 install https://mirrors.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm
```

Note: We’re using dnf5 here, the faster next-generation package manager that ships as the default in Fedora 43.
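Before moving on, it doesn’t hurt to confirm both repositories actually registered. A quick sanity check (assuming you’re on Fedora with dnf5 on your PATH; the fallback message is just for other systems):

```shell
# List the configured repos and look for the two RPM Fusion entries.
if command -v dnf5 >/dev/null 2>&1; then
    dnf5 repolist | grep -i rpmfusion
else
    echo "dnf5 not found -- this check only applies on Fedora"
fi
```

You should see both the free and nonfree repositories in the output.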

Step 2: The Driver Dance

Now, we install the NVIDIA drivers and the CUDA toolkit. This is what allows Ollama to talk to your GPU instead of making your CPU do all the heavy lifting.

```shell
sudo dnf5 install akmod-nvidia xorg-x11-drv-nvidia-cuda
```

A word of caution: Fedora moves fast. Sometimes, when the kernel updates, the NVIDIA modules need a moment to "catch up" (rebuild). If you update your system and things look wonky, it’s usually because the driver is still compiling in the background.
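If you’d rather check than guess, this small sketch asks modinfo whether the NVIDIA module exists yet for your kernel (the `akmods --force` hint is the standard way to trigger a manual rebuild):

```shell
# Report the NVIDIA kernel module version if it has been built,
# otherwise hint that akmods may still be compiling it.
if modinfo -F version nvidia >/dev/null 2>&1; then
    echo "NVIDIA module ready: $(modinfo -F version nvidia)"
else
    echo "NVIDIA module not built yet -- akmods may still be working."
    echo "To trigger a rebuild manually: sudo akmods --force"
fi
```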

Step 3: The Classic Reboot

You know the drill. For the kernel to start using those new NVIDIA drivers, you need a fresh start:

```shell
sudo reboot
```

Step 4: The Moment of Truth

Once you’re back in, let’s make sure Fedora and your GPU are on speaking terms. Run this command:

```shell
nvidia-smi
```

If you see a table showing your GPU name and VRAM usage, congratulations—you’ve passed the hardest part.
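If the full table is more than you need, nvidia-smi can also print just the essentials. A minimal sketch, guarded in case the driver never landed:

```shell
# Print only the GPU name and total VRAM, in CSV form.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
else
    echo "nvidia-smi not found -- the driver install may have failed"
fi
```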

Step 5: Installing Ollama

Ollama makes installation incredibly easy with a one-liner script. This script handles the heavy lifting: it downloads the binary, creates an ollama user/group, and sets up a systemd service so it starts automatically.

```shell
curl -fsSL https://ollama.com/install.sh | sh
```
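Once the script finishes, it’s worth confirming the binary is on your PATH and the service came up. A quick check (the guard covers the case where the installer failed):

```shell
# Verify the ollama binary is installed and the systemd unit is active.
if command -v ollama >/dev/null 2>&1; then
    ollama --version
    systemctl is-active ollama
else
    echo "ollama not found on PATH -- re-run the install script"
fi
```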

Step 6: Verifying the GPU Link

Just because Ollama is installed doesn't mean it’s using your GPU. It might be falling back to your CPU if it can't find the drivers. Let's check the logs:

```shell
journalctl -u ollama -b | grep "NVIDIA"
```

You’re looking for a line that confirms an NVIDIA GPU was detected. If you see it, you're golden.

Step 7: Your First Local Run

Let's test it with a lightweight, high-performance model. Qwen 2.5 Coder (0.5B version) is tiny but surprisingly "smart" for its size, making it perfect for a quick test.

```shell
ollama run qwen2.5-coder:0.5b
```

Once it downloads, you can chat with it directly in your terminal. Ask it to write a Python script; you'll be impressed by the speed.
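The same model is also reachable through Ollama’s local REST API on port 11434, which is handy for scripting. A sketch using the `/api/generate` endpoint (the prompt is just an example, and the guard covers the server being down):

```shell
# One-shot completion against the local Ollama server via its REST API.
if curl -s --max-time 2 http://localhost:11434/api/tags >/dev/null 2>&1; then
    curl -s http://localhost:11434/api/generate -d '{
        "model": "qwen2.5-coder:0.5b",
        "prompt": "Write a Python one-liner that reverses a string.",
        "stream": false
    }'
else
    echo "Ollama server not reachable on localhost:11434"
fi
```

This is the same API the VS Code integration below talks to.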

Integrating with VS Code (The "Continue" Extension)

Running AI in a terminal is cool, but having it inside your IDE is where the real productivity happens. We’ll use the Continue extension, which is an open-source powerhouse for local AI.

  • Install the Extension: Search for "Continue" in the VS Code Marketplace and install it.
  • Configure Local Access: Open the Continue sidebar, click the gear icon (settings), and select your local config file.
  • Edit the Config: Replace or add the following to your config.yaml (or config.json depending on your version) to point it toward your local Ollama instance:
```yaml
name: Local Config
version: 1.0.0
schema: v1
models:
  - name: Qwen2.5-Coder 0.5B
    provider: ollama
    model: qwen2.5-coder:0.5b
    apiBase: http://localhost:11434
    roles:
      - chat
      - edit
      - apply
      - autocomplete
      - embed
```

Final Thoughts

You now have a fully private, incredibly fast AI coding assistant running on your local hardware. No data leaving your machine, no latency issues, and total control.
