LLM Explained for Beginners

#learnai #oxlo #ai

We're going to build a command-line "Explain Like I'm Five" tutor. It takes any complex topic and breaks it down into simple terms. If you have never called an LLM API before, this is the shortest path from zero to working code.

What you'll need

You need Python 3.10 or newer, the OpenAI SDK, and an Oxlo.ai API key. Install the SDK with pip and grab a key from the portal.

pip install openai

Your Oxlo.ai API key is available at https://portal.oxlo.ai. The free tier gives you 60 requests per day across 16 models, which is plenty for this tutorial.
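
The snippets in this post paste the key straight into the script to keep things short. If you would rather keep it out of your source file, one option is to read it from an environment variable. This is a minimal sketch: the variable name OXLO_API_KEY and the helper load_api_key are my own choices for illustration, not something the platform requires.

```python
import os


def load_api_key(env=None, var_name="OXLO_API_KEY"):
    """Return the API key stored in an environment variable.

    OXLO_API_KEY is a hypothetical name chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key
```

You would then pass api_key=load_api_key() when creating the client instead of the literal string.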

Step 1: Make your first API call

Before we add any logic, we need to prove the connection works. I always start with a single hardcoded prompt.

from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "user", "content": "What is an LLM?"},
    ],
)

print(response.choices[0].message.content)
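
If that first call fails, the traceback can be hard to read for a beginner. Here is a rough sketch of a smoke test that wraps the same call and returns a short status instead of raising. The helper name check_connection is hypothetical, and catching every exception is a deliberate shortcut for a one-off check, not a pattern for production code.

```python
def check_connection(client, model="llama-3.3-70b"):
    """Send one tiny prompt and report whether the round trip worked.

    `client` is expected to be an OpenAI-style client like the one above.
    Returns (ok, detail) instead of raising, so the caller can print a hint.
    """
    try:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "Reply with the word ok."}],
        )
    except Exception as exc:  # broad on purpose: this is only a smoke test
        return False, f"{type(exc).__name__}: {exc}"
    if not response.choices:
        return False, "The API answered but returned no choices."
    return True, response.choices[0].message.content or ""
```

Calling check_connection(client) with the client from the snippet above gives you a True/False flag plus either the model's text or the name of the error.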

Step 2: Define the system prompt

An LLM has no fixed personality until you give it one. We will use a system message to tell it to act as a patient teacher.

SYSTEM_PROMPT = """You are a friendly teacher explaining topics to a complete beginner.
Break concepts into short sentences.
Use analogies from everyday life.
Never use jargon without defining it first."""
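
To make it concrete where the system prompt ends up, here is a small sketch of the message list for a single question. The helper build_messages is a name I made up for illustration. The only thing it shows is the ordering: the system message comes first, then the user's question.

```python
def build_messages(system_prompt, user_text):
    """Return the message list for a single question.

    The system message goes first so it frames everything that follows.
    """
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]


demo = build_messages("You are a friendly teacher.", "What is an LLM?")
for message in demo:
    print(f"{message['role']:>6}: {message['content']}")
```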

Step 3: Build the interactive loop

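Now we wrap the call in a loop so the user can ask multiple questions. The API itself is stateless, so we keep the conversation history in a list and send the whole list with every request. That is what lets the model refer back to earlier turns.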

from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

SYSTEM_PROMPT = """You are a friendly teacher explaining topics to a complete beginner.
Break concepts into short sentences.
Use analogies from everyday life.
Never use jargon without defining it first."""

messages = [{"role": "system", "content": SYSTEM_PROMPT}]

while True:
    user_input = input("\nTopic: ").strip()
    if not user_input:
        continue  # skip empty lines instead of spending a request on them
    if user_input.lower() in ["exit", "quit"]:
        break

    messages.append({"role": "user", "content": user_input})

    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=messages,
    )

    reply = response.choices[0].message.content
    print(f"\n{reply}")

    messages.append({"role": "assistant", "content": reply})
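
One thing to keep in mind with this loop: the list grows with every question, and the whole list is sent on each request. A possible way to cap it is sketched below. The helper trim_history and the default of six exchanges are assumptions of mine, not a recommendation from the platform.

```python
def trim_history(messages, max_turns=6):
    """Keep the system message plus the most recent `max_turns` exchanges.

    One exchange is a user message followed by the assistant reply,
    so at most 2 * max_turns non-system messages survive.
    """
    system = [m for m in messages if m["role"] == "system"]
    dialogue = [m for m in messages if m["role"] != "system"]
    # A slice of [-0:] would keep everything, so handle zero explicitly.
    keep = dialogue[-2 * max_turns:] if max_turns > 0 else []
    return system + keep
```

If you try it, call messages = trim_history(messages) at the end of each loop iteration, after the assistant reply has been appended, so the kept slice always starts with a user message.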

Step 4: Add streaming

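Waiting for the full response feels slow. Streaming sends tokens as they are generated, so the text appears while the model is still writing. On Oxlo.ai this works the same way as with the OpenAI API: pass stream=True and iterate over the chunks.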

from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

SYSTEM_PROMPT = """You are a friendly teacher explaining topics to a complete beginner.
Break concepts into short sentences.
Use analogies from everyday life.
Never use jargon without defining it first."""

messages = [{"role": "system", "content": SYSTEM_PROMPT}]

while True:
    user_input = input("\nTopic: ").strip()
    if not user_input:
        continue  # skip empty lines instead of spending a request on them
    if user_input.lower() in ["exit", "quit"]:
        break

    messages.append({"role": "user", "content": user_input})

    stream = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=messages,
        stream=True,
    )

    print("\n")
    reply_chunks = []
    for chunk in stream:
        # Some chunks can arrive with no choices or no text, so skip those.
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text:
            print(text, end="", flush=True)
            reply_chunks.append(text)

    reply = "".join(reply_chunks)
    messages.append({"role": "assistant", "content": reply})
    print("\n")
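
If you want to reuse or test the streaming part on its own, the chunk handling can be pulled into a function. This is a sketch, not the only way to do it: collect_stream is a hypothetical helper, and it assumes each chunk has the same shape the loop above relies on (choices[0].delta.content).

```python
import sys


def collect_stream(stream, out=sys.stdout):
    """Print streamed text as it arrives and return the full reply.

    Each chunk is expected to look like the SDK's streaming chunks:
    chunk.choices[0].delta.content holds the next piece of text, or None.
    """
    pieces = []
    for chunk in stream:
        if not chunk.choices:  # some chunks carry no choices at all
            continue
        text = chunk.choices[0].delta.content
        if text:
            out.write(text)
            out.flush()
            pieces.append(text)
    return "".join(pieces)
```

Inside the loop you would then write reply = collect_stream(stream). Because the function only needs an iterable of chunk-like objects, you can feed it fake chunks and check the result without spending any requests.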

Run it

Save the final script as tutor.py and run it. Here is a session in which I asked about neural networks.

$ python tutor.py

Topic: neural networks

A neural network is like a team of tiny decision makers. Imagine you are trying to guess if a photo shows a cat or a dog. Each tiny decision maker looks at one small clue, like pointy ears or a fluffy tail. They vote, and the majority vote becomes the answer. An LLM is just a very large neural network trained on text.

Topic: quit

Next steps

You now have a working mental model of what an LLM does: it receives a list of messages, predicts a reply one token at a time, and returns text. Here are two concrete ways to push this further. First, swap the model string to qwen-3-32b or kimi-k2.6 to see how different models change the explanation style. Second, add function calling by defining a tools list so the tutor can look up current facts before answering. Both patterns work out of the box on Oxlo.ai because the platform is fully OpenAI SDK compatible, and the flat per-request pricing means you can send long system prompts without watching token costs climb. See https://oxlo.ai/pricing for details.
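
To give the function calling idea some shape, here is a rough sketch of a tools list in the OpenAI chat completions format, plus a small dispatcher that runs the matching local function. Everything named here is hypothetical: lookup_fact is a stub that returns placeholder text, and run_tool_call is my own helper. I have not checked which models on the platform will actually emit tool calls.

```python
import json


def lookup_fact(topic):
    """Stub tool for illustration; a real one would query a search API."""
    return f"(stub) no live data available for {topic}"


# Tool description in the OpenAI chat completions "tools" format.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "lookup_fact",
            "description": "Look up a current fact about a topic.",
            "parameters": {
                "type": "object",
                "properties": {"topic": {"type": "string"}},
                "required": ["topic"],
            },
        },
    }
]

# Maps tool names the model may request to local Python functions.
LOCAL_FUNCTIONS = {"lookup_fact": lookup_fact}


def run_tool_call(name, arguments_json):
    """Run the local function the model asked for and return its result."""
    if name not in LOCAL_FUNCTIONS:
        return f"Unknown tool: {name}"
    arguments = json.loads(arguments_json or "{}")
    return LOCAL_FUNCTIONS[name](**arguments)
```

In the loop you would pass tools=TOOLS to the create call. When the returned message has tool_calls, each one carries function.name and function.arguments (a JSON string), which you hand to run_tool_call. The result goes back to the model as a message with role "tool" and the matching tool_call_id, followed by a second create call to get the final explanation.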