What is RamaLama?
RamaLama is an open-source command-line tool that makes running AI models locally simple by treating them like containers.
RamaLama runs models with Podman or Docker, with no configuration needed.
It is GPU-optimized, using any available accelerator to speed up inference.
It is compatible with llama.cpp, OpenVINO, vLLM, whisper.cpp, and many more.
Installing RamaLama
RamaLama is easy to install.
After installing, check which version you are running.
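As a minimal sketch, assuming you install with the script published in the RamaLama README (PyPI also works):

```shell
# Install via the project's install script (Linux/macOS)...
curl -fsSL https://ramalama.ai/install.sh | bash

# ...or install from PyPI instead
pip install ramalama

# Confirm the installed version
ramalama version
```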
RamaLama supports multiple model registries (transports):
1. Ollama
It is the quickest and easiest registry to use.
Here are a few AI models I ran using Ollama.
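For example (a sketch; `tinyllama` stands in for whichever Ollama model you want):

```shell
# Pull and run a model through the ollama:// transport
ramalama pull ollama://tinyllama
ramalama run ollama://tinyllama

# Ollama is the default transport, so the prefix can be omitted
ramalama run tinyllama
```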
2. Hugging Face
Some Hugging Face models require you to log in.
Here are some that don't require logging in:
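A sketch of both cases; the repo and file below are example names, and RamaLama's Hugging Face transport accepts either the `huggingface://` or the shorter `hf://` prefix:

```shell
# Run a public GGUF model from Hugging Face (no login needed)
ramalama run hf://TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf

# For gated models, authenticate first with the Hugging Face CLI
huggingface-cli login
```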
3. ModelScope
ModelScope worked quite well too, but I had to upgrade my RamaLama version first.
Here are some of the ModelScope models I used:
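A sketch of the upgrade plus a run through the `modelscope://` transport (the model name is just an example):

```shell
# Upgrade to a recent RamaLama; older releases lack ModelScope support
pip install --upgrade ramalama

# Run a model hosted on ModelScope
ramalama run modelscope://Qwen/Qwen2.5-0.5B-Instruct-GGUF
```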
4. OCI registries
Let's start with what OCI is.
OCI (Open Container Initiative) is a specification that defines how containers and their images should be packaged and distributed.
There are several OCI registries:
- quay.io
- docker.io (Docker Hub)
- GitHub Container Registry (ghcr.io)
- Google Container Registry (gcr.io)
- Amazon Elastic Container Registry (ECR)
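As a sketch of the round trip through an OCI registry (the registry, namespace, and model names below are placeholders, not published images):

```shell
# Authenticate against the registry
ramalama login quay.io

# Push a locally pulled model as an OCI artifact
ramalama push tinyllama oci://quay.io/yourname/tinyllama:latest

# Run it back from the registry on any machine
ramalama run oci://quay.io/yourname/tinyllama:latest
```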
5. RamaLama Container Registry
RLCR only works when:
- The model is explicitly published there
- You use the exact full path
- The tag exists (e.g., :latest)
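The general shape of such a reference looks like this (a hypothetical pattern, with placeholders for the parts that must match exactly):

```shell
# <registry>, <namespace>, <model>, and the tag must all be spelled exactly
# as published, or the pull fails
ramalama run oci://<registry>/<namespace>/<model>:latest
```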
6. URL-based sources
RamaLama also supports loading models directly from URLs instead of registries.
They include:
- https:// → download from the internet
- file:// → load from your local machine
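Both schemes can be sketched as follows (the URL and path are example placeholders):

```shell
# Fetch a GGUF model directly over HTTPS
ramalama run https://example.com/models/mymodel.gguf

# Load a model already on the local disk
ramalama run file:///home/user/models/mymodel.gguf
```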
7. Hosted APIs
A hosted model like OpenAI's requires a secret API key, which you generate from the OpenAI API-keys page; usage is billed, so the model only runs successfully once payment is set up.
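A sketch of that workflow against OpenAI's REST API (the key value is a placeholder, and `gpt-4o-mini` is just one example of a billable model):

```shell
# Export the secret key generated on the API-keys page
export OPENAI_API_KEY="sk-..."

# Send a chat request to the hosted model
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}]}'
```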