We are building a lightweight sentiment analysis pipeline that ingests raw social media posts, classifies each by sentiment, extracts mentioned entities, and aggregates everything into a structured report. It is aimed at teams that need brand monitoring without the cost surprises that token-based billing can bring on high-volume streams.
What you'll need
- Python 3.10 or newer
- An Oxlo.ai API key from https://portal.oxlo.ai
- The OpenAI SDK. Install it with:
pip install openai
Step 1: Configure the Oxlo.ai client and test a single post
I start by verifying the connection with one sample tweet. I use llama-3.3-70b because it handles classification reliably.
import os
from openai import OpenAI
client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key=os.getenv("OXLO_API_KEY"))
test_post = "Just spent 20 minutes on hold with @AcmeSupport. Worst customer service ever. Never again."
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "Classify the sentiment of the following social media post as positive, neutral, or negative. Respond with one word only."},
        {"role": "user", "content": test_post},
    ],
)
print(response.choices[0].message.content)
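Even with a "one word only" instruction, a reply can come back with stray whitespace, capitalization, or a trailing period. A minimal sketch of a normalizer follows; the helper name `normalize_label` and the "unknown" fallback are my own assumptions, not part of the OpenAI SDK or the Oxlo.ai API.

```python
ALLOWED_LABELS = {"positive", "neutral", "negative"}


def normalize_label(text: str) -> str:
    """Map a raw model reply to one of the allowed labels, or "unknown"."""
    # Drop surrounding whitespace and common wrapping punctuation, then lowercase.
    cleaned = text.strip().strip(".!\"'").lower()
    return cleaned if cleaned in ALLOWED_LABELS else "unknown"


print(normalize_label(" Negative.\n"))  # prints: negative
```

Anything that falls through to "unknown" is worth logging, since it usually means the model ignored the instruction.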
Step 2: Define the system prompt for structured analysis
To get consistent, parseable output, I lock the model into a JSON schema via the system prompt. I do not rely on the model to guess the format.
SYSTEM_PROMPT = """
You are a social media sentiment analyst. Analyze the provided post and return a JSON object with exactly these keys:
- sentiment: one of "positive", "neutral", or "negative"
- entities: list of brand names, product names, or handles mentioned
- severity: integer from 1 to 10, where 10 means extremely urgent or damaging
- summary: one sentence explaining the reason for the sentiment
Return only valid JSON. Do not wrap it in markdown.
"""
Step 3: Analyze a batch of posts
Now I loop over a list of posts. Because Oxlo.ai uses flat per-request pricing, sending long posts or full complaint threads does not inflate costs the way token-based billing would. That matters when you monitor long-form content.
import os
import json
from openai import OpenAI
client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key=os.getenv("OXLO_API_KEY"))
SYSTEM_PROMPT = """
You are a social media sentiment analyst. Analyze the provided post and return a JSON object with exactly these keys:
- sentiment: one of "positive", "neutral", or "negative"
- entities: list of brand names, product names, or handles mentioned
- severity: integer from 1 to 10, where 10 means extremely urgent or damaging
- summary: one sentence explaining the reason for the sentiment
Return only valid JSON. Do not wrap it in markdown.
"""
posts = [
    "Absolutely love the new update from @TechCorp. Dark mode is a game changer.",
    "Three failed deliveries in a row. @FastShip is a joke.",
    "Anyone else notice the battery drain on the latest Model X firmware? Not great.",
]
results = []
for post in posts:
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": post},
        ],
    )
    raw = response.choices[0].message.content
    results.append(json.loads(raw))
print(json.dumps(results, indent=2))
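The loop above calls `json.loads` directly, so a single malformed reply raises and stops the whole batch. A more forgiving parser could look like the sketch below. `parse_model_json` is a hypothetical helper of mine, and the fence-stripping step assumes the model occasionally wraps its JSON in a markdown code fence despite the prompt.

```python
import json

FENCE = "`" * 3  # a markdown code fence marker


def parse_model_json(raw: str):
    """Parse a model reply as JSON; return None when it is not valid JSON."""
    text = raw.strip()
    # Tolerate a reply wrapped in a markdown fence despite the prompt's instruction.
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line and any language tag
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


raw_replies = ['{"sentiment": "neutral"}', "Sorry, I cannot help with that."]
parsed = [parse_model_json(r) for r in raw_replies]
kept = [p for p in parsed if p is not None]
print(len(kept))  # prints: 1
```

Skipped replies are dropped silently here; in a monitoring job I would count them so that a sudden spike in unparseable output is visible.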
Step 4: Aggregate results into a monitoring report
Raw JSON per post is useful, but stakeholders want a summary. I aggregate by entity and sentiment, and compute average severity.
from collections import defaultdict
def aggregate(post_results):
    report = defaultdict(lambda: {"positive": 0, "neutral": 0, "negative": 0, "total_severity": 0, "count": 0})
    for r in post_results:
        for entity in r.get("entities", []):
            ent = report[entity]
            ent[r["sentiment"]] += 1
            ent["total_severity"] += r.get("severity", 5)
            ent["count"] += 1
    for entity, data in report.items():
        data["avg_severity"] = round(data["total_severity"] / data["count"], 2) if data["count"] else 0
    return dict(report)
summary = aggregate(results)
print(json.dumps(summary, indent=2))
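For a stakeholder summary, it can help to rank entities so the most worrying ones appear first. The sketch below works on the per-entity dictionaries that `aggregate` returns. The helper `rank_by_concern` and its score (share of negative posts multiplied by average severity) are my own illustrative choices, not an established metric.

```python
def rank_by_concern(report: dict, top_n: int = 3) -> list[tuple[str, float]]:
    """Return (entity, score) pairs, highest score first.

    score = (negative posts / total posts) * average severity
    """
    scored = []
    for entity, data in report.items():
        count = data.get("count", 0)
        if count == 0:
            continue  # nothing to score for this entity
        negative_share = data.get("negative", 0) / count
        scored.append((entity, round(negative_share * data.get("avg_severity", 0), 2)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hand-written report in the same shape that aggregate() produces.
sample_report = {
    "AcmeSupport": {"positive": 0, "neutral": 0, "negative": 1, "total_severity": 8, "count": 1, "avg_severity": 8.0},
    "TechCorp": {"positive": 1, "neutral": 0, "negative": 0, "total_severity": 3, "count": 1, "avg_severity": 3.0},
    "FastShip": {"positive": 0, "neutral": 0, "negative": 1, "total_severity": 9, "count": 1, "avg_severity": 9.0},
}
print(rank_by_concern(sample_report))
# prints: [('FastShip', 9.0), ('AcmeSupport', 8.0), ('TechCorp', 0.0)]
```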
Run it
I tie everything together in a single script with sample data. Below is the full program, followed by the terminal output.
import os
import json
from collections import defaultdict
from openai import OpenAI
client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key=os.getenv("OXLO_API_KEY"))
SYSTEM_PROMPT = """
You are a social media sentiment analyst. Analyze the provided post and return a JSON object with exactly these keys:
- sentiment: one of "positive", "neutral", or "negative"
- entities: list of brand names, product names, or handles mentioned
- severity: integer from 1 to 10, where 10 means extremely urgent or damaging
- summary: one sentence explaining the reason for the sentiment
Return only valid JSON. Do not wrap it in markdown.
"""
posts = [
    "Just spent 20 minutes on hold with @AcmeSupport. Worst customer service ever.",
    "Absolutely love the new update from @TechCorp. Dark mode is a game changer.",
    "Three failed deliveries in a row. @FastShip is a joke.",
    "Anyone else notice the battery drain on the latest Model X firmware? Not great.",
]
results = []
for post in posts:
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": post},
        ],
    )
    results.append(json.loads(response.choices[0].message.content))
def aggregate(post_results):
    report = defaultdict(lambda: {"positive": 0, "neutral": 0, "negative": 0, "total_severity": 0, "count": 0})
    for r in post_results:
        for entity in r.get("entities", []):
            ent = report[entity]
            ent[r["sentiment"]] += 1
            ent["total_severity"] += r.get("severity", 5)
            ent["count"] += 1
    for entity, data in report.items():
        data["avg_severity"] = round(data["total_severity"] / data["count"], 2) if data["count"] else 0
    return dict(report)
print("=== Raw Results ===")
print(json.dumps(results, indent=2))
print("\n=== Aggregated Report ===")
print(json.dumps(aggregate(results), indent=2))
=== Raw Results ===
[
  {
    "sentiment": "negative",
    "entities": ["AcmeSupport"],
    "severity": 8,
    "summary": "The user is frustrated after a long hold time with customer service."
  },
  {
    "sentiment": "positive",
    "entities": ["TechCorp"],
    "severity": 3,
    "summary": "The user praises the new dark mode feature in a recent update."
  },
  {
    "sentiment": "negative",
    "entities": ["FastShip"],
    "severity": 9,
    "summary": "The user expresses anger over repeated failed deliveries."
  },
  {
    "sentiment": "negative",
    "entities": ["Model X"],
    "severity": 5,
    "summary": "The user reports dissatisfaction with battery drain on new firmware."
  }
]

=== Aggregated Report ===
{
  "AcmeSupport": {
    "positive": 0,
    "neutral": 0,
    "negative": 1,
    "total_severity": 8,
    "count": 1,
    "avg_severity": 8.0
  },
  "TechCorp": {
    "positive": 1,
    "neutral": 0,
    "negative": 0,
    "total_severity": 3,
    "count": 1,
    "avg_severity": 3.0
  },
  "FastShip": {
    "positive": 0,
    "neutral": 0,
    "negative": 1,
    "total_severity": 9,
    "count": 1,
    "avg_severity": 9.0
  },
  "Model X": {
    "positive": 0,
    "neutral": 0,
    "negative": 1,
    "total_severity": 5,
    "count": 1,
    "avg_severity": 5.0
  }
}
Wrap-up
Two concrete ways to push this further. First, wire the script to a real-time feed like the X API or a Reddit subreddit stream and run it on a schedule with cron or a lightweight task runner. Second, swap to qwen-3-32b or kimi-k2.6 if you need multilingual sentiment detection or deeper reasoning on sarcastic posts, still on the same flat per-request pricing. See https://oxlo.ai/pricing for plan details.
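To make the model swap a configuration change rather than a code edit, one option is to read the model name from the environment. This is a sketch only: the `OXLO_MODEL` variable and the `resolve_model` helper are hypothetical names I made up for illustration, not something Oxlo.ai or the OpenAI SDK defines.

```python
import os

DEFAULT_MODEL = "llama-3.3-70b"


def resolve_model(env=os.environ) -> str:
    """Pick the model name from a hypothetical OXLO_MODEL variable, else the default."""
    # An unset or blank variable falls back to the default model.
    return env.get("OXLO_MODEL", "").strip() or DEFAULT_MODEL


print(resolve_model({"OXLO_MODEL": "qwen-3-32b"}))  # prints: qwen-3-32b
```

The returned value would then be passed as the `model` argument to `client.chat.completions.create`.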