Bessouat40
Build and deploy a RAG pipeline as a REST API in under 5 minutes with RAGLight

Classic Problem

If you've ever built a RAG pipeline, you know how it usually ends: the tutorial shows you how to retrieve documents and generate answers, then leaves you to "wrap it in FastAPI yourself."

I got tired of writing the same boilerplate every time, so I built it once inside RAGLight, an open-source Python library for building RAG and Agentic RAG pipelines.

Latest Feature: Expose a RAG as REST API

raglight serve: one command to expose your RAG pipeline as a fully functional REST API.

What you get out of the box

```shell
pip install raglight
raglight serve --port 8000
```

That's it. You now have a running HTTP server with:

  • POST /generate: ask a question, get an answer
  • POST /ingest: index a local folder or a GitHub repository
  • POST /ingest/upload: upload files directly via a multipart form
  • GET /collections: list available collections
  • GET /health: health check
  • Swagger UI at http://localhost:8000/docs

Configuration via environment variables

The entire pipeline is configured through RAGLIGHT_* environment variables. No code to write.

Create a .env file:

```shell
# LLM
RAGLIGHT_LLM_PROVIDER=Ollama
RAGLIGHT_LLM_MODEL=llama3.2
RAGLIGHT_LLM_API_BASE=http://localhost:11434

# Embeddings
RAGLIGHT_EMBEDDINGS_PROVIDER=HuggingFace
RAGLIGHT_EMBEDDINGS_MODEL=all-MiniLM-L6-v2

# Vector store
RAGLIGHT_PERSIST_DIR=./raglight_db
RAGLIGHT_COLLECTION=default

# Retrieval
RAGLIGHT_K=5
```

Then start the server:

```shell
raglight serve --port 8000
```

RAGLight picks up the .env automatically.
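For illustration, this is roughly the pattern such a variable follows on the Python side (a sketch only, not RAGLight's internal code; `load_retrieval_k` is a hypothetical helper name):

```python
import os

def load_retrieval_k(default: int = 5) -> int:
    """Read the RAGLIGHT_K retrieval depth from the environment.

    Illustrative helper (not part of RAGLight's public API): falls back
    to `default` when the variable is unset.
    """
    return int(os.getenv("RAGLIGHT_K", str(default)))
```

The same read-with-fallback pattern applies to every RAGLIGHT_* variable, which is why no code is needed to configure the server.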


Step-by-step demo

1. Index your documents

Point the API at a local folder:

```shell
curl -X POST http://localhost:8000/ingest \
  -H "Content-Type: application/json" \
  -d '{"data_path": "./my_docs"}'
```

Or index a GitHub repository directly:

```shell
curl -X POST http://localhost:8000/ingest \
  -H "Content-Type: application/json" \
  -d '{"github_url": "https://github.com/Bessouat40/RAGLight", "github_branch": "main"}'
```

Or upload files directly (useful when the API is on a remote server):

```shell
curl -X POST http://localhost:8000/ingest/upload \
  -F "files=@report.pdf" \
  -F "files=@notes.txt"
```

2. Ask a question

```shell
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"question": "What are the main features of RAGLight?"}'
```

Response:

```json
{
  "answer": "RAGLight provides a modular RAG pipeline with support for multiple LLM providers, vector stores, and document types..."
}
```
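If you are calling the API from Python instead of curl, the /generate endpoint wraps in a few standard-library lines (a minimal sketch; the base URL and timeout are assumptions to adjust for your deployment):

```python
import json
import urllib.request

API_URL = "http://localhost:8000"  # adjust for your deployment

def ask(question: str, timeout: float = 60.0) -> str:
    """POST a question to /generate and return the answer string."""
    req = urllib.request.Request(
        f"{API_URL}/generate",
        data=json.dumps({"question": question}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["answer"]
```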

3. Check available collections

```shell
curl http://localhost:8000/collections
```

```json
{
  "collections": ["default", "default_classes"]
}
```

Docker Compose

If you want to deploy the API on a server, here's a minimal docker-compose.yml:

```yaml
services:
  raglight-api:
    image: python:3.12-slim
    command: >
      bash -c "pip install raglight && raglight serve"
    ports:
      - "8000:8000"
    env_file: .env
    extra_hosts:
      - "host.docker.internal:host-gateway"
    volumes:
      - ./raglight_db:/app/raglight_db
```

The extra_hosts line allows the container to reach Ollama running on your host machine.
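One practical consequence: inside the container, localhost refers to the container itself, so an .env that points the LLM at http://localhost:11434 will not reach a host-side Ollama. With the host-gateway mapping above, point the container at the host instead (an illustrative override, assuming Ollama runs on the host machine):

```shell
# .env override for the containerized API: reach Ollama on the Docker host
RAGLIGHT_LLM_API_BASE=http://host.docker.internal:11434
```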

Just copy your .env, run docker-compose up, and the API is live.


Supported providers

| Component | Supported |
| --- | --- |
| LLM | Ollama, OpenAI, Mistral, Gemini, LM Studio |
| Embeddings | HuggingFace (local), Ollama, OpenAI, Gemini |
| Vector store | ChromaDB (local or remote) |
| Knowledge sources | Local folders, GitHub repos, file upload |

What's also in RAGLight

Beyond raglight serve, the library includes:

  • Agentic RAG: iterative retrieval with reasoning loops and MCP tool support
  • Hybrid search: combines BM25 keyword search and semantic search via Reciprocal Rank Fusion
  • Multimodal RAG: index PDFs with images using Vision-Language Models
  • Builder API: fine-grained control over every component
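For context on the hybrid-search bullet: Reciprocal Rank Fusion merges the BM25 and semantic result lists by scoring each document as the sum of 1/(k + rank) over the lists it appears in. A minimal sketch of the standard formula (not RAGLight's actual implementation; k = 60 is the constant from the original RRF paper):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into one ranking.

    Each document's fused score is the sum of 1 / (k + rank) over every
    list it appears in (ranks start at 1, best first).
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["doc_a", "doc_b", "doc_c"]       # keyword ranking
semantic = ["doc_b", "doc_c", "doc_a"]   # embedding ranking
print(reciprocal_rank_fusion([bm25, semantic]))  # → ['doc_b', 'doc_a', 'doc_c']
```

doc_b wins because it ranks highly in both lists, even though neither list puts it unambiguously first, which is exactly the behavior hybrid search is after.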

Links

Feedback welcome :)
