<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Bessouat40</title>
    <description>The latest articles on DEV Community by Bessouat40 (@bessouat40).</description>
    <link>https://dev.to/bessouat40</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3806333%2F632729df-b4b2-4f1b-ad36-7b2a254ba538.png</url>
      <title>DEV Community: Bessouat40</title>
      <link>https://dev.to/bessouat40</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/bessouat40"/>
    <language>en</language>
    <item>
      <title>Build and deploy a RAG pipeline as a REST API in under 5 minutes with RAGLight</title>
      <dc:creator>Bessouat40</dc:creator>
      <pubDate>Wed, 04 Mar 2026 18:27:47 +0000</pubDate>
      <link>https://dev.to/bessouat40/build-and-deploy-a-rag-pipeline-as-a-rest-api-in-under-5-minutes-with-raglight-5hk0</link>
      <guid>https://dev.to/bessouat40/build-and-deploy-a-rag-pipeline-as-a-rest-api-in-under-5-minutes-with-raglight-5hk0</guid>
      <description>&lt;h2&gt;
  
  
  The classic problem
&lt;/h2&gt;

&lt;p&gt;If you've ever built a RAG pipeline, you know how it usually ends: the tutorial shows you how to retrieve documents and generate answers, then leaves you to "wrap it in FastAPI yourself."&lt;/p&gt;

&lt;p&gt;I got tired of writing the same boilerplate every time, so I built it once inside &lt;a href="https://github.com/Bessouat40/RAGLight" rel="noopener noreferrer"&gt;RAGLight&lt;/a&gt;, an open-source Python library for building RAG and Agentic RAG pipelines.&lt;/p&gt;

&lt;h2&gt;
  
  
  Latest feature: expose a RAG pipeline as a REST API
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;raglight serve&lt;/code&gt; is a single command that exposes your RAG pipeline as a REST API.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Install and start the server&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;raglight
raglight serve &lt;span class="nt"&gt;--port&lt;/span&gt; 8000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. You now have a running HTTP server with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;POST /generate&lt;/code&gt;: ask a question, get an answer&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;POST /ingest&lt;/code&gt;: index a local folder or a GitHub repository&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;POST /ingest/upload&lt;/code&gt;: upload files directly via multipart form&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;GET /collections&lt;/code&gt;: list available collections&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;GET /health&lt;/code&gt;: health check&lt;/li&gt;
&lt;li&gt;Swagger UI at &lt;code&gt;http://localhost:8000/docs&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
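
&lt;p&gt;If you would rather call these routes from Python than from curl, a minimal standard-library sketch could look like this. The &lt;code&gt;ENDPOINTS&lt;/code&gt; table and the &lt;code&gt;build_request&lt;/code&gt; and &lt;code&gt;call&lt;/code&gt; helpers are hypothetical names introduced for illustration, not part of RAGLight, and the sketch assumes the server listens on &lt;code&gt;localhost:8000&lt;/code&gt; and answers with JSON.&lt;/p&gt;

```python
import json
import urllib.request

# Routes as listed above; the dictionary keys are labels chosen for this sketch.
ENDPOINTS = {
    "generate": ("POST", "/generate"),
    "ingest": ("POST", "/ingest"),
    "upload": ("POST", "/ingest/upload"),
    "collections": ("GET", "/collections"),
    "health": ("GET", "/health"),
}


def build_request(name, payload=None, base_url="http://localhost:8000"):
    """Build (but do not send) a request for one of the routes above."""
    method, path = ENDPOINTS[name]
    data = None
    headers = {}
    if payload is not None:
        # JSON-encode the payload, as the curl examples do with -d.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(base_url + path, data=data, headers=headers, method=method)


def call(request, timeout=30):
    """Send the request and decode the response body as JSON (assumed format)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

&lt;p&gt;With the server running, &lt;code&gt;call(build_request("collections"))&lt;/code&gt; should return the parsed body of &lt;code&gt;GET /collections&lt;/code&gt;.&lt;/p&gt;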




&lt;h2&gt;
  
  
  Configuration via environment variables
&lt;/h2&gt;

&lt;p&gt;The entire pipeline is configured through &lt;code&gt;RAGLIGHT_*&lt;/code&gt; environment variables. No code to write.&lt;/p&gt;

&lt;p&gt;Create a &lt;code&gt;.env&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# LLM
RAGLIGHT_LLM_PROVIDER=Ollama
RAGLIGHT_LLM_MODEL=llama3.2
RAGLIGHT_LLM_API_BASE=http://localhost:11434

# Embeddings
RAGLIGHT_EMBEDDINGS_PROVIDER=HuggingFace
RAGLIGHT_EMBEDDINGS_MODEL=all-MiniLM-L6-v2

# Vector store
RAGLIGHT_PERSIST_DIR=./raglight_db
RAGLIGHT_COLLECTION=default

# Retrieval
RAGLIGHT_K=5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then start the server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;raglight serve &lt;span class="nt"&gt;--port&lt;/span&gt; 8000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;RAGLight picks up the &lt;code&gt;.env&lt;/code&gt; automatically.&lt;/p&gt;
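
&lt;p&gt;Because everything hinges on that file, it can help to sanity-check it before starting the server. The sketch below is not how RAGLight loads its configuration; it is a small, hypothetical pre-flight check that parses &lt;code&gt;KEY=VALUE&lt;/code&gt; lines and reports missing keys. Which keys count as required is an assumption made for this example.&lt;/p&gt;

```python
# Assumption for this sketch: these two keys must always be set.
REQUIRED = ("RAGLIGHT_LLM_PROVIDER", "RAGLIGHT_LLM_MODEL")

ENV_TEXT = """
# LLM
RAGLIGHT_LLM_PROVIDER=Ollama
RAGLIGHT_LLM_MODEL=llama3.2
RAGLIGHT_LLM_API_BASE=http://localhost:11434

# Retrieval
RAGLIGHT_K=5
"""


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_keys(settings, required=REQUIRED):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not settings.get(key)]


settings = parse_env(ENV_TEXT)
top_k = int(settings["RAGLIGHT_K"])  # raises ValueError if K is not an integer
```

&lt;p&gt;Reading your real file with &lt;code&gt;parse_env(open(".env").read())&lt;/code&gt; and checking that &lt;code&gt;missing_keys(...)&lt;/code&gt; is empty catches typos before the server starts.&lt;/p&gt;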




&lt;h2&gt;
  
  
  Step-by-step demo
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Index your documents
&lt;/h3&gt;

&lt;p&gt;Point the API at a local folder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/ingest &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"data_path": "./my_docs"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or index a GitHub repository directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/ingest &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"github_url": "https://github.com/Bessouat40/RAGLight", "github_branch": "main"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
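
&lt;p&gt;The two JSON bodies above can also be built programmatically. In this sketch the field names (&lt;code&gt;data_path&lt;/code&gt;, &lt;code&gt;github_url&lt;/code&gt;, &lt;code&gt;github_branch&lt;/code&gt;) come from the curl examples, while the &lt;code&gt;ingest_payload&lt;/code&gt; helper and its rule of accepting exactly one source per call are assumptions of the sketch, not documented API behaviour.&lt;/p&gt;

```python
import json


def ingest_payload(data_path=None, github_url=None, github_branch="main"):
    """Build the JSON payload for POST /ingest from one source."""
    if (data_path is None) == (github_url is None):
        # Sketch-level rule: require exactly one of the two sources.
        raise ValueError("pass either data_path or github_url, not both or neither")
    if data_path is not None:
        return {"data_path": data_path}
    return {"github_url": github_url, "github_branch": github_branch}


def to_body(payload):
    """Encode the payload the same way curl -d sends it."""
    return json.dumps(payload).encode("utf-8")


local_body = to_body(ingest_payload(data_path="./my_docs"))
repo_body = to_body(ingest_payload(github_url="https://github.com/Bessouat40/RAGLight"))
```

&lt;p&gt;Either body can then be sent with any HTTP client, using the &lt;code&gt;Content-Type: application/json&lt;/code&gt; header shown in the curl commands.&lt;/p&gt;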



&lt;p&gt;Or upload files directly (useful when the API is on a remote server):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/ingest/upload &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-F&lt;/span&gt; &lt;span class="s2"&gt;"files=@report.pdf"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-F&lt;/span&gt; &lt;span class="s2"&gt;"files=@notes.txt"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
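
&lt;p&gt;For readers without curl, the multipart upload can be reproduced with the Python standard library. This is a minimal sketch: the form field name &lt;code&gt;files&lt;/code&gt; comes from the curl example, while &lt;code&gt;encode_multipart&lt;/code&gt; and &lt;code&gt;upload_request&lt;/code&gt; are hypothetical helpers, and every part is sent as &lt;code&gt;application/octet-stream&lt;/code&gt;, which is an assumption about what the server accepts.&lt;/p&gt;

```python
import urllib.request
import uuid


def encode_multipart(files, field_name="files"):
    """Encode a list of (filename, bytes) pairs as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for filename, content in files:
        header = (
            "--" + boundary + "\r\n"
            'Content-Disposition: form-data; name="' + field_name + '"; '
            'filename="' + filename + '"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + content + b"\r\n")
    # The closing boundary ends with two extra dashes.
    parts.append(("--" + boundary + "--\r\n").encode("utf-8"))
    return b"".join(parts), "multipart/form-data; boundary=" + boundary


def upload_request(files, url="http://localhost:8000/ingest/upload"):
    """Build (but do not send) the POST /ingest/upload request."""
    body, content_type = encode_multipart(files)
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
```

&lt;p&gt;A call such as &lt;code&gt;upload_request([("notes.txt", open("notes.txt", "rb").read())])&lt;/code&gt; yields a request that &lt;code&gt;urllib.request.urlopen&lt;/code&gt; can send.&lt;/p&gt;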



&lt;h3&gt;
  
  
  2. Ask a question
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/generate &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"question": "What are the main features of RAGLight?"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"answer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"RAGLight provides a modular RAG pipeline with support for multiple LLM providers, vector stores, and document types..."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
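
&lt;p&gt;In application code you usually want only the answer text. The sketch below assumes the response always has the shape shown above, a JSON object with an &lt;code&gt;answer&lt;/code&gt; field; the &lt;code&gt;generate_request&lt;/code&gt;, &lt;code&gt;extract_answer&lt;/code&gt;, and &lt;code&gt;ask&lt;/code&gt; helpers are hypothetical names, not RAGLight functions.&lt;/p&gt;

```python
import json
import urllib.request

GENERATE_URL = "http://localhost:8000/generate"  # assumption: local server on port 8000


def generate_request(question, url=GENERATE_URL):
    """Build the POST /generate request for one question (nothing is sent yet)."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


def extract_answer(raw_body):
    """Return the "answer" string from a /generate response body."""
    payload = json.loads(raw_body)
    if "answer" not in payload:
        raise KeyError("response has no 'answer' field")
    return payload["answer"]


def ask(question, timeout=120):
    """Send the question and return the answer text; requires a running server."""
    with urllib.request.urlopen(generate_request(question), timeout=timeout) as response:
        return extract_answer(response.read())
```

&lt;p&gt;The generous timeout is a guess: generation with a local model can take a while, so tune it to your setup.&lt;/p&gt;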



&lt;h3&gt;
  
  
  3. Check available collections
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:8000/collections
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Response:&lt;br&gt;
&lt;/p&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"collections"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"default"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"default_classes"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Docker Compose
&lt;/h2&gt;

&lt;p&gt;If you want to deploy the API on a server, here's a minimal &lt;code&gt;docker-compose.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;raglight-api&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
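    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python:3.12-slim&lt;/span&gt;
    &lt;span class="na"&gt;working_dir&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/app&lt;/span&gt;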
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="s"&gt;bash -c "pip install raglight &amp;amp;&amp;amp; raglight serve"&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8000:8000"&lt;/span&gt;
    &lt;span class="na"&gt;env_file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.env&lt;/span&gt;
    &lt;span class="na"&gt;extra_hosts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;host.docker.internal:host-gateway"&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./raglight_db:/app/raglight_db&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



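&lt;p&gt;The &lt;code&gt;extra_hosts&lt;/code&gt; line lets the container reach Ollama running on your host machine. Inside the container, &lt;code&gt;localhost&lt;/code&gt; refers to the container itself, so set &lt;code&gt;RAGLIGHT_LLM_API_BASE=http://host.docker.internal:11434&lt;/code&gt; in the &lt;code&gt;.env&lt;/code&gt; file that Compose loads.&lt;/p&gt;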

&lt;p&gt;Just copy your &lt;code&gt;.env&lt;/code&gt;, run &lt;code&gt;docker-compose up&lt;/code&gt;, and the API is live.&lt;/p&gt;
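
&lt;p&gt;Since the container runs &lt;code&gt;pip install&lt;/code&gt; before &lt;code&gt;raglight serve&lt;/code&gt;, the API is not reachable the instant the container starts. A deployment script can poll the health route until it answers. This is a minimal sketch: &lt;code&gt;health_check&lt;/code&gt; and &lt;code&gt;wait_until_healthy&lt;/code&gt; are hypothetical helpers, and treating HTTP 200 on &lt;code&gt;/health&lt;/code&gt; as "ready" is an assumption.&lt;/p&gt;

```python
import time
import urllib.request


def health_check(url="http://localhost:8000/health", timeout=2):
    """Return True if GET /health answers with HTTP 200, False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError, HTTPError and socket errors are all OSError subclasses
        return False


def wait_until_healthy(check=health_check, attempts=30, delay=2.0, sleep=time.sleep):
    """Poll check() until it succeeds or the attempts run out."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt + 1 != attempts:
            sleep(delay)  # no need to sleep after the final attempt
    return False
```

&lt;p&gt;Calling &lt;code&gt;wait_until_healthy()&lt;/code&gt; right after bringing the stack up gives the install step roughly a minute with the defaults; raise &lt;code&gt;attempts&lt;/code&gt; on slow connections.&lt;/p&gt;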




&lt;h2&gt;
  
  
  Supported providers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Supported&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;LLM&lt;/td&gt;
&lt;td&gt;Ollama, OpenAI, Mistral, Gemini, LM Studio&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Embeddings&lt;/td&gt;
&lt;td&gt;HuggingFace (local), Ollama, OpenAI, Gemini&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vector store&lt;/td&gt;
&lt;td&gt;ChromaDB (local or remote)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Knowledge sources&lt;/td&gt;
&lt;td&gt;Local folders, GitHub repos, file upload&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  What else is in RAGLight
&lt;/h2&gt;

&lt;p&gt;Beyond &lt;code&gt;raglight serve&lt;/code&gt;, the library includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Agentic RAG&lt;/strong&gt;: iterative retrieval with reasoning loops and MCP tool support&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid search&lt;/strong&gt;: combines BM25 keyword search and semantic search with Reciprocal Rank Fusion&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multimodal RAG&lt;/strong&gt;: index PDFs with images using Vision-Language Models&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Builder API&lt;/strong&gt;: fine-grained control over every component&lt;/li&gt;
&lt;/ul&gt;
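
&lt;p&gt;To give a feel for the hybrid search item, here is a generic sketch of Reciprocal Rank Fusion. It is not RAGLight's implementation: the function name, the document ids, and the constant &lt;code&gt;k=60&lt;/code&gt; (a commonly used default) are assumptions for illustration. Each document scores the sum of &lt;code&gt;1 / (k + rank)&lt;/code&gt; over every ranking it appears in.&lt;/p&gt;

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list.

    rankings: list of lists, each ordered best-first.
    Returns ids sorted by fused score, ties broken alphabetically.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Higher-ranked documents contribute a larger share.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]      # hypothetical keyword-search results
semantic_hits = ["b", "c", "d"]  # hypothetical vector-search results
fused = reciprocal_rank_fusion([bm25_hits, semantic_hits])
# fused == ["b", "c", "a", "d"]: documents found by both retrievers rise to the top
```

&lt;p&gt;Because only ranks are used, the raw BM25 and cosine scores never need to be put on a common scale.&lt;/p&gt;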




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/Bessouat40/RAGLight" rel="noopener noreferrer"&gt;github.com/Bessouat40/RAGLight&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Documentation: &lt;a href="https://raglight.mintlify.app" rel="noopener noreferrer"&gt;raglight.mintlify.app&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;REST API docs: &lt;a href="https://raglight.mintlify.app/documentation/rest-api" rel="noopener noreferrer"&gt;raglight.mintlify.app/documentation/rest-api&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Feedback welcome :)&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>rag</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
