DEV Community

Njeri Kimaru


Does RamaLama make AI boring? Running AI models with RamaLama

What is RamaLama?

RamaLama is an open source command-line tool that makes running AI models locally simple by treating them like containers.
RamaLama runs models with Podman or Docker, with no configuration needed.
It is GPU-optimized, which accelerates performance.
It is compatible with llama.cpp, OpenVINO, vLLM, whisper.cpp, and many more.

Installing ramalama

RamaLama is easy to install.
After installing, check the version you are using.
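A minimal sketch of the install and version check, using the project's install script (pip also works):

```shell
# Install RamaLama with the official install script
curl -fsSL https://ramalama.ai/install.sh | bash

# Alternatively: pip install ramalama

# Confirm the installation and check the version
ramalama version
```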

RamaLama supports multiple model registries (transports):

1. Ollama

It is the quickest and easiest registry to use.
Here are a few AI models I ran using Ollama.
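For example, pulling and running a small model with the `ollama://` transport (tinyllama is just an illustrative model name):

```shell
# Pull a small model from the Ollama registry
ramalama pull ollama://tinyllama

# Start an interactive chat with it
ramalama run ollama://tinyllama
```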

2. Hugging Face

Some Hugging Face models require you to log in.
Here are some that don't require logging in:
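A sketch of running a public GGUF model via the `huggingface://` transport (the repository and file path below are an example; substitute any public GGUF model):

```shell
# Run a public GGUF model straight from Hugging Face; no login needed
ramalama run huggingface://afrideva/Tiny-Vicuna-1B-GGUF/tiny-vicuna-1b.q2_k.gguf

# hf:// is accepted as a shorthand for huggingface://
```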

3. Modelscope

ModelScope worked quite well too,
but I had to upgrade RamaLama's version first.

Here are some of the ModelScope models I used:
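The upgrade plus a `modelscope://` run might look like this (the model path is an illustrative placeholder, not one from the post):

```shell
# Upgrade RamaLama if modelscope:// is not recognized by your version
pip install --upgrade ramalama

# Run a GGUF model from ModelScope
ramalama run modelscope://some-org/some-model-GGUF/model.q4_k_m.gguf
```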

4. OCI registries

Let's start with what OCI is.
OCI (Open Container Initiative) is a standard, or specification, that defines how containers and their images should be packaged and distributed.
There are several OCI registries:

  • quay.io
  • docker.io
  • GitHub Container Registry (ghcr.io)
  • Google Container Registry (gcr.io)
  • Amazon Elastic Container Registry (ECR)
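With the `oci://` transport you address a model the same way you would a container image; the registry path below is a hypothetical placeholder:

```shell
# Pull a model packaged as an OCI artifact from a registry such as quay.io
ramalama pull oci://quay.io/myorg/mymodel:latest

# Then run it from the same reference
ramalama run oci://quay.io/myorg/mymodel:latest
```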

5. RamaLama Container Registry

RLCR only works when:

  • the model is explicitly published there,
  • you use the exact full path, and
  • the tag exists (e.g., :latest).
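Putting those three conditions together, a run against RLCR takes the full registry path plus tag; the path below is a hypothetical example, not a published model:

```shell
# Full path and an existing tag are both required
ramalama run oci://quay.io/ramalama/some-model:latest
```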

6. URL-based source

RamaLama also supports loading models directly from URLs instead of registries.

They include:

  • https:// → download from the internet
  • file:// → load from your local machine
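Both URL forms take the place of a registry reference; the URLs and file paths below are placeholders:

```shell
# Download and run a model file directly over HTTPS
ramalama run https://example.com/models/model.gguf

# Load a model already on your local machine
ramalama run file:///home/user/models/model.gguf
```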

7. Hosted API

A hosted model such as one from OpenAI requires a secret key, which you get from the OpenAI API keys page, and you will have to pay for API usage for your model to run successfully.
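The secret key is typically supplied through an environment variable before invoking the client; the key below is a placeholder:

```shell
# Export the secret key so tools can authenticate against the hosted API
export OPENAI_API_KEY="sk-your-key-here"
```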
