I run a lot of local LLMs. Ollama, OpenWebUI, coding models.
llama3, qwen-coder models, deepseek-coder, whatever I'm testing that week.
The problem is Ollama doesn't always clean up after itself.
Pull a few models. Run them. Close the terminal. Think you're done.
You're not. There are now three or four ollama and ollama_llama_server
processes still sitting in the background, each one holding
6-8 GB of RAM hostage while I'm trying to actually get stuff done.
I'd notice it when my machine started lagging mid-session.
Open a terminal, run ps aux, and get back a wall of text
200 lines deep. Scroll through it. Find the PID. Copy it.
Run kill. Realize there are two more and repeat the whole loop until everything is finally cleaned up.
Four commands. Every single time. For months:
```bash
ps aux | grep ollama   # step 1: find it
# scroll through output, locate PID
# e.g. 47823
kill 47823             # step 2: kill it
# still running? find the next one
kill 47824             # step 3: kill that one too
# check again...
```
That was my struggle for months. Then I learned about pkill.
pkill exists and I ignored it
pkill lets you kill any process by name. No PID hunting.
No ps aux wall of text. One command.
```bash
pkill ollama
```
That's it. Every Ollama process: the server, the model runners,
all of it. A fresh start.
But before you run that blindly, always do a pgrep first.
It shows you exactly what you're about to kill without
actually killing anything:
```bash
pgrep -l ollama
```
Output:
```
47823 ollama
47824 ollama_llama_server
47825 ollama_llama_server
```
Three processes. All Ollama. All eating RAM. Now you know
what you're working with before you pull the trigger.
Then kill them all in one shot:
```bash
pkill ollama
```
Verify they're gone:
```bash
pgrep -l ollama
# no output = all clear
```
Done. The whole workflow takes 3 seconds instead of 3 minutes.
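The preview-then-kill habit is easy to wrap in a small function. A minimal sketch; `safe_pkill` is a name I made up, not a standard command, and it uses -f so the pattern matches full command lines:

```shell
#!/usr/bin/env bash
# safe_pkill: show what matches before killing it.
# Hypothetical helper -- just the pgrep/pkill pair from above, glued together.
safe_pkill() {
  local pattern="$1"
  local matches
  # -f matches against the full command line; -l lists names alongside PIDs
  matches=$(pgrep -f -l "$pattern")
  if [ -z "$matches" ]; then
    echo "no processes match '$pattern'"
    return 1
  fi
  printf 'killing:\n%s\n' "$matches"
  pkill -f "$pattern"
}
```

Usage: `safe_pkill ollama`. You still get the preview, but it's one command instead of two.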
The flags I actually use
```bash
# Preview before killing — always do this first
pgrep -l ollama

# Kill all matching processes by name
pkill ollama

# Exact match only — won't catch ollama_llama_server
pkill -x ollama

# Force kill if it's frozen and won't respond
pkill -9 ollama

# Kill a specific model process by full command string
pkill -f "ollama_llama_server"
```
The -f flag is especially useful when you have multiple
models running and only want to kill one of them.
pkill -f "codellama" will only hit the codellama process
and leave everything else running.
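Since -f matches anywhere in the full command line, it pays to preview with pgrep -a (which prints the whole command line) before a targeted kill. A sketch wrapping that up; kill_model is my own made-up name:

```shell
#!/usr/bin/env bash
# kill_model: stop one model runner by matching its full command line.
# Hypothetical helper name; just pgrep -a -f plus pkill -f underneath.
kill_model() {
  # -a prints each match's full command line so you can confirm the target
  if ! pgrep -a -f "$1"; then
    echo "no runner matching '$1'"
    return 1
  fi
  pkill -f "$1"
}
```

`kill_model codellama` then hits only the codellama runner and shows you exactly what matched before it dies.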
Why this matters more with local LLMs
With normal processes you might have one or two orphans at a time.
With Ollama you can easily have five or six: the main server,
plus a separate runner process for each model you've touched
in a session. kill with a PID was never designed for
cleaning up that many processes at once, and the runner processes kept adding up.
pkill was.
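To see the scale difference for yourself, spawn a handful of dummy workers (plain sleep processes standing in for the server and model runners) and clear them with a single command:

```shell
#!/usr/bin/env bash
# Five dummy 'sleep' workers stand in for the Ollama server plus runners.
secs=600
for i in 1 2 3 4 5; do
  sleep "$secs" &
done

pgrep -c -f "sleep $secs"            # -c prints the match count (the five workers)
pkill -f "sleep $secs"               # one command, all of them gone
sleep 1
pgrep -f "sleep $secs" >/dev/null || echo "all clear"
```

One pkill, five processes down. With individual kills that would have been five rounds of find-PID, copy, kill.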
If you're running local LLMs regularly, this is the command
that keeps your RAM clean between sessions. Add it to muscle
memory alongside ollama list and ollama ps.
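One way to bake it into muscle memory is a small function in ~/.bashrc. The name ollama_clean is my own invention; rename it to whatever sticks:

```shell
# Drop into ~/.bashrc. Previews first, then kills; says so if there's nothing to do.
ollama_clean() {
  # pgrep -l lists PID + name; if nothing matches, report and stop
  pgrep -l ollama || { echo "nothing to clean"; return 0; }
  pkill ollama
}
```

Then a single `ollama_clean` between sessions keeps the stray runners from piling up.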
Full pkill/pgrep reference with every flag:
→ bashsnippets.xyz/snippets/kill-a-process.html
Free bash scripts for Linux and devs every week:
→ bashsnippets.xyz
Also posted Short #5 on this yesterday if you want to see
the before/after in 30 seconds:
@BashSnippets on YouTube