LLMs for Language Understanding

We're building a multilingual text understanding agent that takes unstructured text and returns a structured analysis: detected language, sentiment, named entities, and a one-sentence summary. This is useful for support teams, researchers, and developers who need to extract signal from large volumes of text without maintaining separate NLP pipelines.

What you'll need

Before starting, make sure you have the following:

Python 3.10 or newer installed locally.
An Oxlo.ai API key from https://portal.oxlo.ai.
The OpenAI SDK installed: pip install openai

Step 1: Configure the Oxlo.ai client

Instantiate the OpenAI-compatible client and point it at Oxlo.ai. I read the API key from the OXLO_API_KEY environment variable and fall back to a placeholder string, which you can replace with your key while prototyping. Keep real keys out of source control.

from openai import OpenAI
import os

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key=os.environ.get("OXLO_API_KEY", "YOUR_OXLO_API_KEY")
)
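
With the fallback string in the snippet above, a missing key only surfaces as an authentication error on the first request. Below is a minimal sketch of failing fast at startup instead. The load_api_key helper is hypothetical (it is not part of the OpenAI SDK or Oxlo.ai), and it treats the placeholder string as "not set", which is an assumption about how you want to handle it.

```python
import os


def load_api_key(env=None, var_name="OXLO_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for this tutorial; `env` is injectable for testing.
    """
    env = os.environ if env is None else env
    key = (env.get(var_name) or "").strip()
    # Treat the tutorial's placeholder as "not configured" (an assumption).
    if not key or key == "YOUR_OXLO_API_KEY":
        raise RuntimeError(f"Set the {var_name} environment variable before running.")
    return key
```

To use it, pass api_key=load_api_key() when constructing the client.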

Step 2: Define the system prompt

The system prompt turns the model into a structured extraction engine. Keep the instructions tight so the output is predictable and easy to parse.

SYSTEM_PROMPT = """You are a precise language understanding engine. Analyze the user-provided text and return a single JSON object with these exact keys:

- language: ISO 639-1 code of the detected language (e.g., "en", "es", "zh")
- sentiment: one of "positive", "neutral", or "negative"
- confidence: float between 0.0 and 1.0 indicating your confidence in the sentiment label
- entities: array of objects, each with "text", "type" (PERSON, ORG, PRODUCT, LOCATION), and "start_index"
- summary: one concise sentence capturing the main point

Rules:
- Do not output markdown code fences.
- Do not include explanatory text outside the JSON.
- If the text is ambiguous, use your best judgment and reflect uncertainty in the confidence score."""

Step 3: Build the analysis function

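Wrap the API call in a reusable function. We use JSON mode (the response_format argument) so the response is syntactically valid JSON. JSON mode does not guarantee that the keys or value types match the prompt, which is why Step 5 adds validation. The model is Qwen 3 32B, chosen for its strong multilingual reasoning, and the low temperature reduces run-to-run variation.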

import json

def analyze_text(text: str) -> dict:
    response = client.chat.completions.create(
        model="qwen-3-32b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": text},
        ],
        response_format={"type": "json_object"},
        temperature=0.1,
    )
    
    raw = response.choices[0].message.content
    return json.loads(raw)
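
The function above passes the response content straight to json.loads. In practice the content field can be None (for example on an empty completion), and a model may still wrap its output in Markdown fences despite the prompt. Below is a sketch of a more defensive parser. The parse_model_json helper is hypothetical, and the fence stripping is an assumption about what a misbehaving model might emit, not documented Oxlo.ai behavior.

```python
import json
import re

# Matches a leading Markdown fence (three backticks plus optional language tag)
# or a trailing fence at the very end of the string.
_FENCE = re.compile(r"^\s*`{3}[a-zA-Z]*\s*|\s*`{3}\s*$")


def parse_model_json(raw):
    """Parse model output into a dict, raising ValueError on anything else."""
    if raw is None or not raw.strip():
        raise ValueError("Model returned no content")
    cleaned = _FENCE.sub("", raw).strip()
    # json.JSONDecodeError subclasses ValueError, so callers can catch one type.
    parsed = json.loads(cleaned)
    if not isinstance(parsed, dict):
        raise ValueError(f"Expected a JSON object, got {type(parsed).__name__}")
    return parsed
```

To try it, replace the final json.loads(raw) call in analyze_text with parse_model_json(raw).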

Step 4: Process documents in batch

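Real workloads rarely involve a single string. Here is a small loop that processes a list of raw documents. Each call is wrapped in try/except so that one failed document is recorded as an error instead of aborting the batch. Because Oxlo.ai uses flat per-request pricing, running a long document through this pipeline costs the same as a short one.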

documents = [
    "Oxlo.ai's new pricing model is a game changer for our agentic workflows. No more surprise bills from token-based providers.",
    "El servicio al cliente fue excelente, aunque la entrega del paquete llegó tres días tarde a Barcelona.",
    "DeepSeek V4 Flash handles 1M context windows efficiently, but I need to benchmark latency against our current stack.",
]

results = []
for doc in documents:
    try:
        parsed = analyze_text(doc)
        results.append({"input": doc, "output": parsed})
    except Exception as e:
        results.append({"input": doc, "error": str(e)})

print(json.dumps(results, indent=2, ensure_ascii=False))
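
The loop above sends one request at a time, so total latency grows linearly with the number of documents. Each call is I/O-bound, so a thread pool is one way to overlap them. Below is a sketch using only the standard library. The analyze_batch name and the max_workers=4 default are illustrative choices, not recommendations from Oxlo.ai, and you should check your account's rate limits before raising concurrency.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_batch(documents, analyze, max_workers=4):
    """Run `analyze` over `documents` concurrently, preserving input order.

    `analyze` is any callable that takes a string and returns a dict,
    such as the tutorial's analyze_text.
    """

    def safe_analyze(doc):
        # Record failures per document so one error does not abort the batch.
        try:
            return {"input": doc, "output": analyze(doc)}
        except Exception as e:
            return {"input": doc, "error": str(e)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(safe_analyze, documents))
```

Usage: results = analyze_batch(documents, analyze_text).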

Step 5: Validate the output

To keep the pipeline robust, add a lightweight validator that checks for the required keys and a valid sentiment label before the result reaches downstream systems.

REQUIRED_KEYS = {"language", "sentiment", "confidence", "entities", "summary"}

def validate_analysis(result: dict) -> dict:
    missing = REQUIRED_KEYS - result.keys()
    if missing:
        raise ValueError(f"Missing keys: {missing}")
    
    if result["sentiment"] not in {"positive", "neutral", "negative"}:
        raise ValueError(f"Invalid sentiment: {result['sentiment']}")
    
    return result

# Validate the results already collected in Step 4 (no new API calls)
for item in results:
    if "error" not in item:
        try:
            item["output"] = validate_analysis(item["output"])
        except ValueError as e:
            item["validation_error"] = str(e)

print(json.dumps(results, indent=2, ensure_ascii=False))
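
The validator above checks key presence and the sentiment label only. If downstream code relies on the other fields, a stricter variant might also check types and ranges. Below is a sketch based on the schema described in the system prompt. The validate_analysis_strict name is hypothetical, and the exact checks are one reasonable reading of that schema.

```python
REQUIRED_KEYS = {"language", "sentiment", "confidence", "entities", "summary"}
REQUIRED_ENTITY_KEYS = {"text", "type", "start_index"}
ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}
ALLOWED_ENTITY_TYPES = {"PERSON", "ORG", "PRODUCT", "LOCATION"}


def validate_analysis_strict(result):
    """Raise ValueError unless `result` matches the prompt's schema."""
    if not isinstance(result, dict):
        raise ValueError("Result must be a JSON object")
    missing = REQUIRED_KEYS - result.keys()
    if missing:
        raise ValueError(f"Missing keys: {sorted(missing)}")
    if result["sentiment"] not in ALLOWED_SENTIMENTS:
        raise ValueError(f"Invalid sentiment: {result['sentiment']}")

    confidence = result["confidence"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0.0 <= confidence <= 1.0:
        raise ValueError(f"Invalid confidence: {confidence!r}")

    if not isinstance(result["entities"], list):
        raise ValueError("entities must be a list")
    for entity in result["entities"]:
        if not isinstance(entity, dict) or not REQUIRED_ENTITY_KEYS.issubset(entity):
            raise ValueError(f"Malformed entity: {entity!r}")
        if entity["type"] not in ALLOWED_ENTITY_TYPES:
            raise ValueError(f"Invalid entity type: {entity['type']}")
    return result
```

It raises the same exception type as validate_analysis, so it can be swapped into the loop above without other changes.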

Run it

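Here is the complete script. Save it as language_understanding.py, export your Oxlo.ai key as OXLO_API_KEY, and run python language_understanding.py. In this version, validation runs inside analyze_text, so an invalid response is recorded as an error by the batch loop.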

from openai import OpenAI
import os
import json

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key=os.environ.get("OXLO_API_KEY", "YOUR_OXLO_API_KEY")
)

SYSTEM_PROMPT = """You are a precise language understanding engine. Analyze the user-provided text and return a single JSON object with these exact keys:

- language: ISO 639-1 code of the detected language (e.g., "en", "es", "zh")
- sentiment: one of "positive", "neutral", or "negative"
- confidence: float between 0.0 and 1.0 indicating your confidence in the sentiment label
- entities: array of objects, each with "text", "type" (PERSON, ORG, PRODUCT, LOCATION), and "start_index"
- summary: one concise sentence capturing the main point

Rules:
- Do not output markdown code fences.
- Do not include explanatory text outside the JSON.
- If the text is ambiguous, use your best judgment and reflect uncertainty in the confidence score."""

REQUIRED_KEYS = {"language", "sentiment", "confidence", "entities", "summary"}

def validate_analysis(result: dict) -> dict:
    missing = REQUIRED_KEYS - result.keys()
    if missing:
        raise ValueError(f"Missing keys: {missing}")
    if result["sentiment"] not in {"positive", "neutral", "negative"}:
        raise ValueError(f"Invalid sentiment: {result['sentiment']}")
    return result

def analyze_text(text: str) -> dict:
    response = client.chat.completions.create(
        model="qwen-3-32b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": text},
        ],
        response_format={"type": "json_object"},
        temperature=0.1,
    )
    raw = response.choices[0].message.content
    parsed = json.loads(raw)
    return validate_analysis(parsed)

documents = [
    "Oxlo.ai's new pricing model is a game changer for our agentic workflows. No more surprise bills from token-based providers.",
    "El servicio al cliente fue excelente, aunque la entrega del paquete llegó tres días tarde a Barcelona.",
    "DeepSeek V4 Flash handles 1M context windows efficiently, but I need to benchmark latency against our current stack.",
]

results = []
for doc in documents:
    try:
        parsed = analyze_text(doc)
        results.append({"input": doc, "output": parsed})
    except Exception as e:
        results.append({"input": doc, "error": str(e)})

print(json.dumps(results, indent=2, ensure_ascii=False))

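Example output (the start_index values are generated by the model and are not reliable character offsets; in the first result, "token-based providers" actually begins at character 101 of the input, not 85):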

[
  {
    "input": "Oxlo.ai's new pricing model is a game changer for our agentic workflows. No more surprise bills from token-based providers.",
    "output": {
      "language": "en",
      "sentiment": "positive",
      "confidence": 0.95,
      "entities": [
        {"text": "Oxlo.ai", "type": "ORG", "start_index": 0},
        {"text": "token-based providers", "type": "ORG", "start_index": 85}
      ],
      "summary": "The user praises Oxlo.ai's request-based pricing as a major improvement for agentic workloads."
    }
  },
  {
    "input": "El servicio al cliente fue excelente, aunque la entrega del paquete llegó tres días tarde a Barcelona.",
    "output": {
      "language": "es",
      "sentiment": "neutral",
      "confidence": 0.72,
      "entities": [
        {"text": "Barcelona", "type": "LOCATION", "start_index": 78}
      ],
      "summary": "Customer service was excellent, but the package delivery to Barcelona was delayed by three days."
    }
  },
  {
    "input": "DeepSeek V4 Flash handles 1M context windows efficiently, but I need to benchmark latency against our current stack.",
    "output": {
      "language": "en",
      "sentiment": "neutral",
      "confidence": 0.68,
      "entities": [
        {"text": "DeepSeek V4 Flash", "type": "PRODUCT", "start_index": 0}
      ],
      "summary": "The user notes DeepSeek V4 Flash's efficient 1M context handling but wants latency benchmarks before committing."
    }
  }
]
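
Language models count characters poorly, so it is safer to treat model-reported start_index values as hints rather than exact offsets. Where exact positions matter, one option is to recompute them from the input with str.find. Below is a sketch. The fix_entity_offsets helper is hypothetical, and it assumes each entity's text appears verbatim in the input; entities that do not are marked with -1.

```python
def fix_entity_offsets(text, entities):
    """Return copies of `entities` with start_index recomputed from `text`."""
    fixed = []
    # Next search position per entity string, so repeated mentions
    # map to successive occurrences instead of all pointing at the first.
    search_from = {}
    for entity in entities:
        needle = entity["text"]
        index = text.find(needle, search_from.get(needle, 0))
        if index != -1:
            search_from[needle] = index + len(needle)
        # str.find returns -1 when the text is not found; keep that as a marker.
        fixed.append({**entity, "start_index": index})
    return fixed
```

Usage: item["output"]["entities"] = fix_entity_offsets(item["input"], item["output"]["entities"]).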

Wrap up

This agent gives you a reusable language understanding layer in about sixty lines of code. Because Oxlo.ai uses flat per-request pricing, you can feed it entire paragraphs or long-context transcripts without watching token costs scale.

Two concrete next steps: wire this function into a FastAPI endpoint so other services can POST text for analysis, or swap in llama-3.3-70b or kimi-k2.6 to compare accuracy on your specific domain. Check https://oxlo.ai/pricing to see how request-based billing fits your volume.
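
For the model comparison, a small harness that runs the same documents through each model and flags disagreements is often enough to show where to look closer. Below is a sketch. The compare_sentiment function is hypothetical, and it assumes you have changed analyze_text to accept a model argument (the version in this tutorial hard-codes qwen-3-32b).

```python
def compare_sentiment(documents, models, analyze):
    """Collect each model's sentiment label per document and flag disagreements.

    `analyze(text, model)` is assumed to return a dict with a "sentiment" key.
    """
    report = []
    for doc in documents:
        labels = {}
        for model in models:
            try:
                labels[model] = analyze(doc, model)["sentiment"]
            except Exception as e:
                # Keep going so one failing model does not hide the others.
                labels[model] = f"error: {e}"
        report.append({
            "input": doc,
            "labels": labels,
            # True only when every model produced the same label.
            "agree": len(set(labels.values())) == 1,
        })
    return report
```

Agreement between models is not the same as accuracy. To measure accuracy you would still need a set of hand-labeled examples from your own domain.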