Njeri Kimaru

Does RamaLama make AI boring? Running AI models with RamaLama

What is RamaLama?

RamaLama is an open source command-line tool that makes running AI models locally simple by treating them like containers.
It runs models with Podman or Docker, with no configuration needed.
It detects GPUs and uses them to accelerate inference.
It is compatible with runtimes such as llama.cpp, OpenVINO, vLLM, whisper.cpp, and many more.

Installing RamaLama

RamaLama is easy to install. On Fedora it ships as a distribution package; after installing, check the version you are using.

sudo dnf install python3-ramalama

ramalama version
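Once installed, a couple of other subcommands are handy for sanity-checking the setup (a quick sketch; both are standard RamaLama subcommands):

```shell
# List the models currently pulled into local storage
ramalama ls

# Print environment details (container engine, accelerator detection, etc.)
ramalama info
```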

RamaLama supports multiple model registries (transports):

1. Ollama

Ollama is the quickest and easiest registry to get started with.
Here are a few AI models I ran using Ollama:

ramalama run granite3-moe

ramalama run ollama://llama4:scout
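Beyond `run`, the same models can be served as a local REST endpoint with `ramalama serve`, which is useful for pointing other tools at the model (a sketch; the model name matches the one above):

```shell
# Download the model once so later invocations start faster
ramalama pull ollama://llama4:scout

# Serve it as a local, OpenAI-compatible REST endpoint
ramalama serve ollama://llama4:scout
```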

2. Hugging Face

Some Hugging Face models require you to log in.
Here are some that don't require logging in:

ramalama run huggingface://instructlab/granite-7b-lab-Q4_K_M.gguf

ramalama run huggingface://microsoft/Phi-3-mini-4k-instruct-q4.gguf
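For the gated models that do require a login, RamaLama has a `login` subcommand; a sketch, assuming you have generated an access token in your Hugging Face account settings:

```shell
# Authenticate against Hugging Face before pulling gated models
ramalama login --token <YOUR_HF_TOKEN> huggingface
```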

3. ModelScope

ModelScope worked quite well too, but I had to upgrade RamaLama's version first.

sudo dnf upgrade ramalama

Here is one of the ModelScope models I used:

ramalama run modelscope://Qwen/Qwen2.5-7B-Instruct-GGUF/qwen2.5-7b-instruct-q4_k_m.gguf

4. OCI registries

Let's start with: what is OCI?
OCI (Open Container Initiative) is a standard that defines how container images should be packaged and distributed.
There are several OCI registries:

  • quay.io
  • docker.io
  • GitHub Container Registry (ghcr.io)

For GitHub, I had to log in first and get an authentication token. Afterwards, I pushed a model and then accessed it through ghcr.io:

ramalama convert ollama://mistral oci://ghcr.io/njeri-kimaru/mistral:gguf

ramalama run oci://ghcr.io/njeri-kimaru/mistral:gguf

  • Google Container Registry (gcr.io)
  • Amazon Elastic Container Registry (ECR)
  • RamaLama Container Registry (rlcr.io)
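To publish to one of these registries yourself, the general pattern is to authenticate with your container engine, then convert and push, as I did with ghcr.io above. A sketch using quay.io (the namespace and model name are illustrative):

```shell
# Log in to the registry with podman (docker login works too)
podman login quay.io

# Package a pulled model as an OCI image and push it to the registry
ramalama convert ollama://tinyllama oci://quay.io/<your-namespace>/tinyllama:latest
ramalama push oci://quay.io/<your-namespace>/tinyllama:latest
```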

5. URL-based sources

RamaLama also supports loading models directly from URLs instead of registries.

They include:

  • https:// → download from the internet
  • file:// → load from your local machine
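Here is a quick sketch of the file:// transport, assuming you already have a GGUF file on disk (the path is illustrative):

```shell
# Run a model straight from a local GGUF file, no registry involved
ramalama run file:///home/user/models/granite-7b-lab-Q4_K_M.gguf
```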

6. Hosted APIs

For a hosted model like OpenAI's, you need a secret key from the OpenAI API keys page, and you'll have to pay for usage for the model to run successfully.
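Hosted providers typically expect the secret key via an environment variable rather than on the command line; a sketch using OpenAI's convention (the key value is a placeholder):

```shell
# Export the secret key obtained from the OpenAI API keys page
export OPENAI_API_KEY="sk-..."
```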
