I built a lightweight literature synthesis agent that ingests academic paper abstracts and returns structured summaries, method tags, and limitation flags. It runs entirely on Oxlo.ai's flat per-request pricing (https://oxlo.ai/pricing), which keeps costs predictable when I feed it long PDF excerpts or multi-paper context windows. If you are a researcher or grad student tired of token-metering your literature reviews, this tutorial will get you running in under ten minutes.
What you'll need
Before starting, grab your Oxlo.ai API key from https://portal.oxlo.ai. You will also need Python 3.10+ and the OpenAI SDK.
- Python 3.10+
- The OpenAI SDK: pip install openai
- An Oxlo.ai API key
Step 1: Instantiate the client
I start by instantiating the client. Oxlo.ai is fully OpenAI SDK compatible, so the only change is the base URL.
from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "You are a helpful research assistant."},
        {"role": "user", "content": "List three best practices for prompt engineering."},
    ],
)
print(response.choices[0].message.content)
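Hardcoding the key is fine for a first test, but it is easy to commit by accident. Here is a minimal sketch of reading it from the environment instead. The variable name OXLO_API_KEY and the helper load_api_key are my own choices for illustration, not something Oxlo.ai prescribes.

```python
import os


def load_api_key(env=None, var_name="OXLO_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    `var_name` is a hypothetical name chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key
```

With that helper in place, the client line could become client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key=load_api_key()).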
Step 2: Define the system prompt
Next, I define the system prompt that shapes every analysis. Keeping this in a dedicated variable makes it easy to iterate without touching the request logic.
SYSTEM_PROMPT = """You are an academic literature analyzer. Given a paper title and abstract, output a JSON object with exactly these keys:
- summary: one paragraph describing the core contribution.
- methods: a list of techniques, datasets, or experimental setups used.
- limitations: a list of stated or implied limitations.
- related_work_queries: three concrete search queries to find related papers.
Be concise, technical, and accurate. Output only valid JSON."""
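Since the prompt promises exactly four keys, it can help to check a reply against that contract before trusting it. The following is a rough sketch of such a check. The names REQUIRED_KEYS and validate_analysis are hypothetical, and the expected types are my reading of the prompt rather than anything the API enforces.

```python
# Expected keys and types, taken from the wording of the system prompt.
REQUIRED_KEYS = {
    "summary": str,
    "methods": list,
    "limitations": list,
    "related_work_queries": list,
}


def validate_analysis(result: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key, expected_type in REQUIRED_KEYS.items():
        if key not in result:
            problems.append(f"missing key: {key}")
        elif not isinstance(result[key], expected_type):
            problems.append(f"{key} should be a {expected_type.__name__}")
    # Flag anything the prompt did not ask for.
    extra = sorted(set(result) - set(REQUIRED_KEYS))
    problems.extend(f"unexpected key: {key}" for key in extra)
    return problems
```

Returning a list of problems rather than raising makes it easy to log a bad reply and retry the request.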
Step 3: Analyze a single paper
Now I wrap the call in a function that accepts a title and abstract, then returns parsed JSON. I use Llama 3.3 70B because it handles structured instructions reliably.
import json


def analyze_paper(title: str, abstract: str) -> dict:
    """Send one title and abstract to the model and return the parsed JSON."""
    user_message = f"Title: {title}\n\nAbstract: {abstract}"
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
        # Ask for a JSON object so the reply can be parsed directly.
        response_format={"type": "json_object"},
    )
    raw = response.choices[0].message.content
    return json.loads(raw)
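The function above calls json.loads directly on the reply, which raises if the model wraps its answer in a Markdown code fence or returns something that is not JSON. A more defensive parser could look like the sketch below. The helper parse_json_reply is hypothetical, and I am assuming the most common failure is a fenced reply. I have not measured how often that happens on Oxlo.ai.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker


def parse_json_reply(raw: str) -> dict:
    """Parse a model reply as a JSON object, tolerating a surrounding fence."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (with or without a language tag).
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # Drop the closing fence if there is one.
        text = text.rsplit(FENCE, 1)[0]
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model reply was not valid JSON: {exc}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("Expected a JSON object at the top level.")
    return parsed
```

Swapping json.loads(raw) for parse_json_reply(raw) gives one exception type to catch when deciding whether to retry.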
Step 4: Synthesize multiple papers
For literature reviews, I often need to compare several papers at once. Because Oxlo.ai uses request-based pricing, I can pack multiple full abstracts into a single prompt without paying for extra input tokens. This is where Oxlo.ai wins over token-based billing for long-context research tasks. I switch to Kimi K2.6 here because it reasons well over long inputs.
COMPARE_PROMPT = """You are a literature review synthesizer. Given several paper abstracts, output a JSON object with exactly these keys:
- synthesis: one paragraph integrating the main findings.
- method_comparison: a list comparing the techniques used in each paper.
- gaps: a list of research gaps not addressed by the provided papers.
Output only valid JSON."""
def synthesize_papers(papers: list[dict]) -> dict:
    """Compare several papers in a single request and return the parsed JSON."""
    # Number each paper so the model can refer to them as [1], [2], ...
    blocks = []
    for idx, paper in enumerate(papers, 1):
        blocks.append(f"[{idx}] {paper['title']}\n{paper['abstract']}")
    user_message = "\n\n".join(blocks)
    response = client.chat.completions.create(
        model="kimi-k2.6",
        messages=[
            {"role": "system", "content": COMPARE_PROMPT},
            {"role": "user", "content": user_message},
        ],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)
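Flat pricing removes the cost concern, but each model still has a finite context window, so a very large paper list may need to be split across requests. One simple approach, sketched here, is greedy batching by character count. The helper batch_papers and the 20,000-character default are assumptions for illustration. I have not checked the actual context limits of the models used here.

```python
def batch_papers(papers: list[dict], max_chars: int = 20000) -> list[list[dict]]:
    """Greedily group papers so each batch stays under a character budget.

    A single paper larger than the budget still gets a batch of its own.
    """
    batches, current, used = [], [], 0
    for paper in papers:
        size = len(paper["title"]) + len(paper["abstract"])
        # Start a new batch when adding this paper would exceed the budget.
        if current and used + size > max_chars:
            batches.append(current)
            current, used = [], 0
        current.append(paper)
        used += size
    if current:
        batches.append(current)
    return batches
```

Each batch can then be passed to synthesize_papers on its own, and the per-batch syntheses merged in a final request if needed.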
Step 5: Generate BibTeX entries
Finally, I add a helper to generate BibTeX entries. DeepSeek V3.2 handles the formatting consistently, and I can call it in the same script without managing a second provider.
def generate_bibtex(title: str, authors: list[str], year: str, venue: str) -> str:
    """Ask the model for a BibTeX entry built from the given metadata."""
    user_message = (
        f"Generate a clean BibTeX entry for the following paper.\n\n"
        f"Title: {title}\n"
        f"Authors: {', '.join(authors)}\n"
        f"Year: {year}\n"
        f"Venue: {venue}\n\n"
        f"Output only the BibTeX block."
    )
    response = client.chat.completions.create(
        model="deepseek-v3.2",
        messages=[
            {"role": "system", "content": "You generate valid BibTeX citations."},
            {"role": "user", "content": user_message},
        ],
    )
    # Plain text reply; no JSON mode here because BibTeX is not JSON.
    return response.choices[0].message.content
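Because the BibTeX reply is free text, a light sanity check before pasting it into a .bib file can catch the obvious failures. The sketch below strips any Markdown fence lines and checks that the entry starts with an entry type and has balanced braces. The helper clean_bibtex is hypothetical, and this is a shallow check, not a BibTeX parser.

```python
import re

FENCE = "`" * 3  # Markdown code fence marker


def clean_bibtex(text: str) -> str:
    """Strip fence lines and run two cheap structural checks."""
    lines = [
        line for line in text.strip().splitlines()
        if not line.strip().startswith(FENCE)
    ]
    entry = "\n".join(lines).strip()
    # An entry should open with something like @article{ or @inproceedings{.
    if not re.match(r"@\w+\s*\{", entry):
        raise ValueError("Reply does not start with a BibTeX entry type.")
    if entry.count("{") != entry.count("}"):
        raise ValueError("Unbalanced braces in BibTeX entry.")
    return entry
```

It can wrap the existing call, for example clean_bibtex(generate_bibtex(...)), so malformed replies fail loudly instead of corrupting the bibliography.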
Run it
I test the analyzer with a well-known paper. The JSON output is ready to drop into a note-taking app or reference manager.
if __name__ == "__main__":
    paper = {
        "title": "Attention Is All You Need",
        "abstract": (
            "The dominant sequence transduction models are based on complex recurrent or convolutional neural networks "
            "that include an encoder and a decoder. The best performing models also connect the encoder and decoder through "
            "an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention "
            "mechanisms, dispensing with recurrence and convolutions entirely."
        ),
    }
    result = analyze_paper(paper["title"], paper["abstract"])
    print(json.dumps(result, indent=2))
Example output:
{
  "summary": "The authors introduce the Transformer, a novel sequence transduction architecture that replaces recurrent and convolutional layers with multi-head self-attention, achieving state-of-the-art translation quality while enabling significantly faster training.",
  "methods": [
    "Multi-head self-attention",
    "Positional encoding",
    "Encoder-decoder architecture",
    "WMT 2014 English-German and English-French datasets"
  ],
  "limitations": [
    "Quadratic complexity with respect to sequence length",
    "Positional encoding may not generalize to lengths unseen during training"
  ],
  "related_work_queries": [
    "efficient attention mechanisms long sequences transformer",
    "non-autoregressive neural machine translation",
    "positional encoding generalization length extrapolation"
  ]
}
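To drop a result like this into a note-taking app, one option is to render it as Markdown. The following is a small sketch of that step. The helper to_markdown and the heading layout are my own choices, and it assumes the result dict uses the four keys from the system prompt.

```python
def to_markdown(title: str, result: dict) -> str:
    """Render an analysis dict as a Markdown note."""
    lines = [f"# {title}", "", result.get("summary", ""), ""]
    sections = [
        ("Methods", "methods"),
        ("Limitations", "limitations"),
        ("Related work queries", "related_work_queries"),
    ]
    for heading, key in sections:
        lines.append(f"## {heading}")
        # Missing keys render as an empty section rather than failing.
        lines.extend(f"- {item}" for item in result.get(key, []))
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Calling to_markdown(paper["title"], result) returns a string that can be written to a .md file.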
Wrap-up
This agent is already useful for my weekend literature sweeps. Two concrete next steps: wire it to the arXiv API to auto-ingest papers by keyword, or swap in deepseek-r1-671b-moe on Oxlo.ai for deeper methodological critique when evaluating complex experimental designs. Because Oxlo.ai does not charge by the token, I can stuff entire related-work sections into context without watching a meter run up.
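For the arXiv step, here is a rough starting point that builds a query URL for the public arXiv API and parses the Atom feed it returns into the title and abstract dicts that synthesize_papers expects. The helper names are hypothetical, the search uses the all: field prefix as a simple default, and this sketch leaves out fetching, paging, and error handling.

```python
import urllib.parse
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace used by the feed


def arxiv_query_url(keyword: str, max_results: int = 5) -> str:
    """Build an arXiv API query URL that searches all fields for a keyword."""
    params = {
        "search_query": f"all:{keyword}",
        "start": 0,
        "max_results": max_results,
    }
    return "http://export.arxiv.org/api/query?" + urllib.parse.urlencode(params)


def parse_arxiv_feed(xml_text: str) -> list[dict]:
    """Extract title and abstract from each entry of an arXiv Atom feed."""
    root = ET.fromstring(xml_text)
    papers = []
    for entry in root.iter(f"{ATOM}entry"):
        title = entry.findtext(f"{ATOM}title", default="")
        abstract = entry.findtext(f"{ATOM}summary", default="")
        # Feed text is hard-wrapped; collapse whitespace runs to single spaces.
        papers.append({
            "title": " ".join(title.split()),
            "abstract": " ".join(abstract.split()),
        })
    return papers
```

The feed text can be fetched with urllib.request.urlopen(arxiv_query_url("transformers")).read().decode("utf-8") and passed to parse_arxiv_feed.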