Building a RAG Chatbot from Scratch: Part 1
Choosing the Right Engine
Hello everyone! Today, I’m kicking off a short series where I’ll be documenting my journey of building a specialized chatbot. Unlike a standard chatbot that provides general answers, I want this one to have a very specific "job": answering questions based on the 2024 Indonesian Government Financial Reports compiled by the Ministry of Finance.
You might be wondering: "What’s the difference between a regular chatbot and a RAG-based chatbot?" The primary difference lies in the information source and how the AI formulates its response.
Understanding the RAG Difference
In a standard AI setup, the process is quite linear:
As you can see, the user asks a question, and the AI responds based on the data it was trained on. However, for my project, I am adding a critical component that prevents the AI from needing to "guess" or rely on outdated training data.

By adding Stored Information (our 2024 Financial Report), the general-purpose AI becomes a Specialized AI. It will only provide answers relevant to the context found in that stored data. We will discuss what happens when a user asks something "out of context" in future articles, but today, my focus is on selecting the right AI model.
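The flow above can be sketched in a few lines of Python. This is only an illustration of the idea, not the implementation we'll build in this series: the keyword-overlap retrieval and all names and document text here are placeholders (a real RAG system would retrieve with embeddings):

```python
# Illustrative sketch of the RAG flow described above. The naive
# keyword-overlap retrieval is a stand-in for a real embedding search,
# and all document text here is placeholder content.

def retrieve(question, documents, top_k=2):
    """Return the top_k documents sharing the most words with the question."""
    q_words = set(question.lower().split())
    def score(doc):
        return len(q_words & set(doc.lower().split()))
    return sorted(documents, key=score, reverse=True)[:top_k]

def build_prompt(question, context_chunks):
    """Constrain the model to the retrieved context only."""
    context = "\n".join(context_chunks)
    return (
        "Answer ONLY using the context below. If the answer is not "
        "in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

documents = [
    "Section 1 covers total state revenue for fiscal year 2024.",
    "The report is compiled annually by the Ministry of Finance.",
    "Section 3 covers capital expenditure by ministry.",
]
question = "What was total state revenue in 2024?"
chunks = retrieve(question, documents)
prompt = build_prompt(question, chunks)
print(prompt)
```

The key point is the second function: the model never sees anything except what retrieval hands it, which is exactly what turns a general-purpose AI into a specialized one.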
Selecting the Model: Why Nebula Lab?
When looking for a model, I felt overwhelmed by the different platforms—GPT, Claude, and Gemini all live in different ecosystems. I initially looked at OpenRouter, a popular API aggregator. However, after some research and a tip from a friend, I discovered Nebula Lab.
Nebula Lab (ai-nebula.com) is an API aggregator that offers not just LLMs, but also marketing tools. Here is why I decided to switch from OpenRouter to Nebula:
- Cost-Effectiveness: Their prices are significantly lower. For example, GPT-5.2 is listed at $1.40 USD per 1M tokens. Compared to official OpenAI pricing, Nebula is genuinely more affordable.
- No Platform Fees: Unlike some aggregators that charge a 5% platform fee, Nebula Lab doesn't tack on extra costs.
- Model Variety: They host all the heavy hitters, including OpenAI, Google, and Anthropic.
- Clean UI: The interface is simple and easy to navigate.
- Clear Documentation: For a beginner, their documentation is straightforward and easy to implement.
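To make the pricing bullet concrete, here is a quick back-of-the-envelope estimate at the listed $1.40 per 1M tokens. The per-turn token count is my own rough assumption, not a measured figure:

```python
# Back-of-the-envelope cost estimate at the listed $1.40 per 1M tokens.
# The per-turn token count below is an assumption, for illustration only.
PRICE_PER_MILLION_USD = 1.40

def estimate_cost(tokens):
    """Cost in USD for a given number of tokens at the listed flat rate."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_USD

TOKENS_PER_TURN = 1_500  # assumed: question + retrieved context + answer
per_turn = estimate_cost(TOKENS_PER_TURN)
per_thousand_turns = estimate_cost(TOKENS_PER_TURN * 1_000)
print(f"~${per_turn:.4f} per turn, ~${per_thousand_turns:.2f} per 1,000 turns")
# prints: ~$0.0021 per turn, ~$2.10 per 1,000 turns
```

Even if the real token counts end up several times higher once report context is injected, the per-conversation cost stays comfortably small for a hobby project.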
Testing the API
Getting started was incredibly easy:
- Navigate to the Model Center.
- Select API Key on the left sidebar.
- Generate your key.
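One habit worth adopting right away: instead of pasting the generated key into your source code, load it from an environment variable. A minimal sketch (`NEBULA_API_KEY` is just an illustrative name I chose, not something the platform mandates):

```python
import os

# Keep the generated key out of your source code: read it from an
# environment variable instead. NEBULA_API_KEY is an illustrative name.
api_key = os.environ.get("NEBULA_API_KEY", "")
if api_key:
    print("Key loaded, ends with:", api_key[-4:])
else:
    print("NEBULA_API_KEY is not set; export it before running the client.")
```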

To ensure everything was working, I tested the connection using two methods provided in their documentation:
1. Testing via CURL

I ran the following command in my terminal. (Note: the backslash line continuations below are for bash-style shells such as Git Bash; in Windows Command Prompt or PowerShell, run the command on a single line and adjust the quoting.)
```bash
curl https://llm.ai-nebula.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-5.2",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

The response came back instantly, and the GPT-5.2 model answered exactly as expected.
2. Testing via Python

I then used Python (version 3.13.2) for a more integrated test:
```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-xxxxxxxxxxxxxxxxxxx",  # Replace with your actual key
    base_url="https://llm.ai-nebula.com/v1",
)

response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)

print(response.choices[0].message.content)
```

Success! The code ran without a hitch. I'm really impressed with Nebula Lab's variety and ease of use.
What's Next?
In the next article, we’ll start building the actual chatbot and gradually begin injecting our financial data to transform this from a simple API call into a full-fledged RAG system.
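As a small teaser of where this is headed, the "injecting data" step could look roughly like this: retrieved report passages go into a system message, while the completion call stays the same as in the test above. Everything here is an illustrative sketch, not the final design:

```python
# Illustrative sketch of context injection: retrieved report passages go
# into a system message; the completion call itself stays unchanged.
# All names and text here are placeholders.

def build_messages(question, context_chunks):
    """Build a chat message list that pins the model to the given context."""
    context = "\n\n".join(context_chunks)
    return [
        {
            "role": "system",
            "content": (
                "You answer questions about the 2024 Indonesian Government "
                "Financial Reports. Use ONLY the context below; if the answer "
                "is not there, say so.\n\nContext:\n" + context
            ),
        },
        {"role": "user", "content": question},
    ]

messages = build_messages(
    "Who compiles the report?",
    ["The report is compiled annually by the Ministry of Finance."],
)
print(messages[0]["content"][:80])
```

The system message is also where we'll eventually handle the "out of context" questions mentioned earlier, by instructing the model to refuse rather than guess.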
If you want to try it out yourself, check out Nebula Lab here: https://ai-nebula.com/




