Bessouat40
Build and deploy a RAG pipeline as a REST API in under 5 minutes with RAGLight

Classic Problem

If you've ever built a RAG pipeline, you know how it usually ends: the tutorial shows you how to retrieve documents and generate answers, then leaves you to "wrap it in FastAPI yourself."

I got tired of writing the same boilerplate every time, so I built it once inside RAGLight, an open-source Python library for building RAG and Agentic RAG pipelines.

Latest Feature: Expose a RAG as REST API

raglight serve: one command to expose your RAG pipeline as a fully functional REST API.

What you get out of the box

```shell
pip install raglight
raglight serve --port 8000
```

That's it. You now have a running HTTP server with:

  • POST /generate: ask a question, get an answer
  • POST /ingest: index a local folder or a GitHub repository
  • POST /ingest/upload: upload files directly via a multipart form
  • GET /collections: list available collections
  • GET /health: health check
  • Swagger UI at http://localhost:8000/docs

Configuration via environment variables

The entire pipeline is configured through RAGLIGHT_* environment variables. No code to write.

Create a .env file:

```shell
# LLM
RAGLIGHT_LLM_PROVIDER=Ollama
RAGLIGHT_LLM_MODEL=llama3.2
RAGLIGHT_LLM_API_BASE=http://localhost:11434

# Embeddings
RAGLIGHT_EMBEDDINGS_PROVIDER=HuggingFace
RAGLIGHT_EMBEDDINGS_MODEL=all-MiniLM-L6-v2

# Vector store
RAGLIGHT_PERSIST_DIR=./raglight_db
RAGLIGHT_COLLECTION=default

# Retrieval
RAGLIGHT_K=5
```

Then start the server:

```shell
raglight serve --port 8000
```

RAGLight picks up the .env automatically.
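For illustration, this is roughly the pattern such a variable follows on the Python side (a sketch only, not RAGLight's internal code; `load_retrieval_k` is a hypothetical helper name):

```python
import os

def load_retrieval_k(default: int = 5) -> int:
    """Read the RAGLIGHT_K retrieval depth from the environment.

    Illustrative helper (not part of RAGLight's public API): falls back
    to `default` when the variable is unset.
    """
    return int(os.getenv("RAGLIGHT_K", str(default)))
```

The same read-with-fallback pattern applies to every RAGLIGHT_* variable, which is why no code is needed to configure the server.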


Step-by-step demo

1. Index your documents

Point the API at a local folder:

```shell
curl -X POST http://localhost:8000/ingest \
  -H "Content-Type: application/json" \
  -d '{"data_path": "./my_docs"}'
```

Or index a GitHub repository directly:

```shell
curl -X POST http://localhost:8000/ingest \
  -H "Content-Type: application/json" \
  -d '{"github_url": "https://github.com/Bessouat40/RAGLight", "github_branch": "main"}'
```

Or upload files directly (useful when the API is on a remote server):

```shell
curl -X POST http://localhost:8000/ingest/upload \
  -F "files=@report.pdf" \
  -F "files=@notes.txt"
```

2. Ask a question

```shell
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"question": "What are the main features of RAGLight?"}'
```

Response:

```json
{
  "answer": "RAGLight provides a modular RAG pipeline with support for multiple LLM providers, vector stores, and document types..."
}
```
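If you are calling the API from Python instead of curl, the /generate endpoint wraps in a few standard-library lines (a minimal sketch; the base URL and timeout are assumptions to adjust for your deployment):

```python
import json
import urllib.request

API_URL = "http://localhost:8000"  # adjust for your deployment

def ask(question: str, timeout: float = 60.0) -> str:
    """POST a question to /generate and return the answer string."""
    req = urllib.request.Request(
        f"{API_URL}/generate",
        data=json.dumps({"question": question}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["answer"]
```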

3. Check available collections

```shell
curl http://localhost:8000/collections
```

```json
{
  "collections": ["default", "default_classes"]
}
```

Docker Compose

If you want to deploy the API on a server, here's a minimal docker-compose.yml:

```yaml
services:
  raglight-api:
    image: python:3.12-slim
    command: >
      bash -c "pip install raglight && raglight serve"
    ports:
      - "8000:8000"
    env_file: .env
    extra_hosts:
      - "host.docker.internal:host-gateway"
    volumes:
      - ./raglight_db:/app/raglight_db
```

The extra_hosts line allows the container to reach Ollama running on your host machine.
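One practical consequence: inside the container, localhost refers to the container itself, so an .env that points the LLM at http://localhost:11434 will not reach a host-side Ollama. With the host-gateway mapping above, point the container at the host instead (an illustrative override, assuming Ollama runs on the host machine):

```shell
# .env override for the containerized API: reach Ollama on the Docker host
RAGLIGHT_LLM_API_BASE=http://host.docker.internal:11434
```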

Just copy your .env, run docker-compose up, and the API is live.


Supported providers

| Component | Supported |
| --- | --- |
| LLM | Ollama, OpenAI, Mistral, Gemini, LM Studio |
| Embeddings | HuggingFace (local), Ollama, OpenAI, Gemini |
| Vector store | ChromaDB (local or remote) |
| Knowledge sources | Local folders, GitHub repos, file upload |

What's also in RAGLight

Beyond raglight serve, the library includes:

  • Agentic RAG: iterative retrieval with reasoning loops and MCP tool support
  • Hybrid search: combines BM25 keyword search and semantic search via Reciprocal Rank Fusion
  • Multimodal RAG: index PDFs with images using Vision-Language Models
  • Builder API: fine-grained control over every component
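For context on the hybrid-search bullet: Reciprocal Rank Fusion merges the BM25 and semantic result lists by scoring each document as the sum of 1/(k + rank) over the lists it appears in. A minimal sketch of the standard formula (not RAGLight's actual implementation; k = 60 is the constant from the original RRF paper):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into one ranking.

    Each document's fused score is the sum of 1 / (k + rank) over every
    list it appears in (ranks start at 1, best first).
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["doc_a", "doc_b", "doc_c"]       # keyword ranking
semantic = ["doc_b", "doc_c", "doc_a"]   # embedding ranking
print(reciprocal_rank_fusion([bm25, semantic]))  # → ['doc_b', 'doc_a', 'doc_c']
```

doc_b wins because it ranks highly in both lists, even though neither list puts it unambiguously first, which is exactly the behavior hybrid search is after.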

Links

Feedback welcome :)
