Unlocking Education and Research with LLMs

#learnai #oxlo #ai

A Literature Learning Agent that turns raw academic abstracts into structured summaries, critical questions, and Anki flashcards can save students and researchers hours. In this tutorial, I will walk through the exact Python script I use to process papers. We will call Oxlo.ai's OpenAI-compatible endpoint; because Oxlo.ai charges per request rather than per token, long abstracts do not inflate the bill.

What you'll need

Python 3.10 or newer
The OpenAI SDK: pip install openai
An Oxlo.ai API key from https://portal.oxlo.ai

I also recommend setting your key in an environment variable named OXLO_API_KEY so you do not commit it to git.
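If you want the script to stop early when the variable is missing, rather than sending a placeholder key and getting an authentication error back, a small helper can do the check. This is a minimal sketch; the name require_api_key is hypothetical and not part of the OpenAI SDK or Oxlo.ai.

```python
import os


def require_api_key(env=os.environ, name="OXLO_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    `env` is any mapping; it defaults to the real process environment
    and can be replaced with a plain dict in tests.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the script")
    return key
```

You could then pass api_key=require_api_key() when constructing the client.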

Step 1: Scaffold the client

First, I create an OpenAI client pointed at Oxlo.ai and verify that the connection works with a one-line test. I use llama-3.3-70b here because it is a reliable general-purpose model for instruction following.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key=os.getenv("OXLO_API_KEY", "YOUR_OXLO_API_KEY"),
)

# Verify connectivity
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Say hello"}],
)

print(response.choices[0].message.content)

Step 2: Define the research system prompt

The system prompt is the only magic in this agent. It forces the model to return a predictable structure that I can parse later.

SYSTEM_PROMPT = """You are a research assistant helping graduate students understand academic papers.
When given an abstract, produce exactly the following sections:

1. Plain-Language Summary: One paragraph, no jargon.
2. Key Findings: A bulleted list of the three most important results.
3. Methodology Critique: One paragraph noting strengths and one weakness.
4. Follow-Up Questions: Three Socratic questions the student should ask their advisor.
5. Anki Flashcards: Exactly five question-and-answer pairs formatted as:
   Q: [question]
   A: [answer]

Do not include any text outside these sections."""
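To show what "parse later" could look like, here is a minimal sketch that splits a response into the five sections by their numbered headings. It assumes the model follows the prompt literally and writes each heading as "N. Title:" at the start of a line; if the model wraps headings in markdown bold or renames them, those sections will simply be missing from the result. The names SECTION_TITLES and split_sections are hypothetical.

```python
import re

# Titles copied from the system prompt, in the order the model should emit them.
SECTION_TITLES = [
    "Plain-Language Summary",
    "Key Findings",
    "Methodology Critique",
    "Follow-Up Questions",
    "Anki Flashcards",
]


def split_sections(analysis: str) -> dict[str, str]:
    """Map each section title found in the text to its body."""
    # Match lines such as "2. Key Findings:" and capture the title.
    titles = "|".join(re.escape(t) for t in SECTION_TITLES)
    pattern = re.compile(rf"^[ \t]*\d+\.[ \t]*({titles})[ \t]*:", re.MULTILINE)

    matches = list(pattern.finditer(analysis))
    sections = {}
    for i, match in enumerate(matches):
        # A section body runs from the end of its heading to the next heading.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(analysis)
        sections[match.group(1)] = analysis[match.end():end].strip()
    return sections
```

Checking that all five titles are present in the returned dict is a cheap way to detect a response that drifted from the format.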

Step 3: Build the analysis function

Next, I wrap the API call in a function that accepts an abstract and returns the structured analysis. I keep the temperature low so the output stays consistent from run to run; it is not fully deterministic, because the model still samples.

def analyze_abstract(abstract: str) -> str:
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Analyze this abstract:\n\n{abstract}"},
        ],
        temperature=0.2,
    )
    return response.choices[0].message.content

Step 4: Generate structured flashcards with JSON mode

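I run a second pass over the same abstract with deepseek-v3.2 and JSON mode to get machine-readable flashcards. Oxlo.ai supports the response_format parameter, so I can ask for a JSON object that my script consumes without regex. As in the OpenAI API, the json_object type only requests syntactically valid JSON; the shape of that JSON comes from the schema described in the prompt, so the script still relies on the model returning the expected keys.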

import csv
import json

FLASHCARD_PROMPT = """You are a flashcard generator. Given an academic abstract, emit exactly 5 Anki flashcards as a JSON object.
The JSON must match this schema:
{
  "flashcards": [
    {"front": "...", "back": "..."}
  ]
}
Keep each front under 120 characters and each back under 240 characters."""

def generate_flashcards(abstract: str) -> list[dict]:
    response = client.chat.completions.create(
        model="deepseek-v3.2",
        messages=[
            {"role": "system", "content": FLASHCARD_PROMPT},
            {"role": "user", "content": abstract},
        ],
        response_format={"type": "json_object"},
        temperature=0.2,
    )
    data = json.loads(response.choices[0].message.content)
    return data["flashcards"]
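Because JSON mode does not enforce the schema, it can be worth checking the parsed cards before writing them out. The following is a minimal sketch under the limits stated in the prompt (5 cards, fronts under 120 characters, backs under 240); the function name validate_flashcards is hypothetical.

```python
def validate_flashcards(cards, max_front=120, max_back=240, expected=5):
    """Return a list of problems found; an empty list means the cards look fine."""
    if not isinstance(cards, list):
        return ["'flashcards' is not a list"]

    problems = []
    if len(cards) != expected:
        problems.append(f"expected {expected} cards, got {len(cards)}")

    for i, card in enumerate(cards):
        if not isinstance(card, dict):
            problems.append(f"card {i} is not an object")
            continue
        for field, limit in (("front", max_front), ("back", max_back)):
            value = card.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"card {i}: missing or empty '{field}'")
            elif len(value) >= limit:
                # The prompt asks for text *under* the limit, so equal is too long.
                problems.append(f"card {i}: '{field}' has {len(value)} characters, limit is {limit}")
    return problems
```

One way to use it is to call validate_flashcards(data.get("flashcards")) after json.loads and either retry the request or skip the abstract when the list is not empty.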

def save_flashcards(cards: list[dict], filename: str = "flashcards.csv"):
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Front", "Back"])
        for card in cards:
            writer.writerow([card["front"], card["back"]])
    print(f"Saved {len(cards)} flashcards to {filename}")
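One caveat when importing into Anki: as far as I know, its text importer treats every row as a note, so a "Front,Back" header row may show up as a card of its own. A hedged alternative is to produce tab-separated text without a header. This is a sketch only, and the function name cards_to_tsv is hypothetical.

```python
import csv
import io


def cards_to_tsv(cards):
    """Render cards as tab-separated text, one card per line, with no header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for card in cards:
        # Collapse newlines, tabs, and repeated spaces so each card stays on one row.
        front = " ".join(card["front"].split())
        back = " ".join(card["back"].split())
        writer.writerow([front, back])
    return buffer.getvalue()
```

Writing the returned string to a .txt file with UTF-8 encoding gives a file you can try in Anki's import dialog, choosing tab as the field separator if it is not detected.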

Step 5: Tie it together

I add a small main block that defines an example abstract, prints the analysis and the flashcards, and writes the CSV. You can swap in any abstract from arXiv or PubMed.

if __name__ == "__main__":
    sample_abstract = (
        "We introduce a novel transformer architecture that reduces attention complexity "
        "from quadratic to linear via kernelized positional sampling. Experiments on WikiText-103 "
        "and BookCorpus show a 2.3x speedup at 4096-token context lengths with no loss in perplexity. "
        "However, the method underperforms on tasks requiring fine-grained positional reasoning. "
        "This work suggests a new trade-off between efficiency and positional fidelity in large language models."
    )

    print("=== Analysis ===")
    analysis = analyze_abstract(sample_abstract)
    print(analysis)
    print("\n")

    print("=== Flashcards ===")
    cards = generate_flashcards(sample_abstract)
    for c in cards:
        print(f"Q: {c['front']}")
        print(f"A: {c['back']}")
        print()

    save_flashcards(cards)

Run it

Save the full script as literature_agent.py and run it with your Oxlo.ai key exported.

export OXLO_API_KEY="sk-oxlo.ai-..."
python literature_agent.py

When I run this against the sample abstract, I get output similar to the following. Your exact wording will vary slightly because of sampling.

=== Analysis ===
1. Plain-Language Summary: This paper presents a new way to make transformers faster by changing how they handle position information, cutting the computational cost while keeping text quality the same on long documents.

2. Key Findings:
   - Attention complexity drops from quadratic to linear using kernelized positional sampling.
   - A 2.3x speedup is measured at 4096-token contexts on WikiText-103 and BookCorpus.
   - Perplexity remains unchanged compared to standard transformers.

3. Methodology Critique: The experiments are thorough on language modeling benchmarks, but the evaluation lacks downstream task testing on retrieval or reasoning benchmarks. One weakness is the reported drop in fine-grained positional reasoning tasks.

4. Follow-Up Questions:
   - Why does kernelized sampling preserve perplexity yet hurt positional reasoning?
   - What downstream tasks beyond language modeling were considered, and why were they omitted?
   - How does this approach compare to sparse attention patterns such as Longformer?

5. Anki Flashcards:
   Q: What is the main contribution of this paper?
   A: A linear-complexity transformer using kernelized positional sampling.
   Q: On which datasets was the model evaluated?
   A: WikiText-103 and BookCorpus.
   ...

=== Flashcards ===
Q: What complexity does the proposed attention reduce?
A: From quadratic to linear.
Q: What is the measured speedup at 4096 tokens?
A: 2.3x.
...

Saved 5 flashcards to flashcards.csv

Wrap up

From here, you can extend the agent with a PDF extraction step, using pypdf (the maintained successor to PyPDF2) or PyMuPDF, so it ingests full papers instead of just abstracts. Another solid upgrade is swapping in kimi-k2.6 for the analysis step when you need deeper reasoning over multi-page methodology sections: its 131K context window on Oxlo.ai handles entire articles at the same flat per-request price.
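As a starting point for the PDF step, here is a minimal sketch that assumes pypdf, the maintained successor to PyPDF2 (pip install pypdf). The helper names clean_page_text and pdf_to_text are hypothetical, and the cleanup is deliberately crude: extracted PDF text usually needs more work than this, especially for two-column layouts.

```python
import re


def clean_page_text(text: str) -> str:
    """Rejoin words hyphenated across line breaks, then collapse whitespace.

    Caveat: this also joins genuine hyphenated compounds that happen to
    break at the end of a line.
    """
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    return re.sub(r"\s+", " ", text).strip()


def pdf_to_text(path: str) -> str:
    """Extract and clean the text of every page in a PDF file."""
    # Imported lazily so the rest of the script runs without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(path)
    pages = [clean_page_text(page.extract_text() or "") for page in reader.pages]
    # Skip pages that yielded no text, such as scanned images.
    return "\n\n".join(page for page in pages if page)
```

The returned string can be passed to the analysis function in place of an abstract, provided it fits in the context window of the model you choose.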