# Prose Balance Review with Nebius Llama-3.3-70B

## What is "Balance Review"?

Balance review goes beyond spell-check. It detects structural imbalances in writing:
- One paragraph significantly longer than others
- Bullet point granularity inconsistent across a list
- Weak conclusion or overlong introduction
This requires moderate reasoning: more than a tag suggester needs, less than a full editorial rewrite. Nebius's llama-3.3-70b at $0.10/1M input tokens is the right fit.
## Where It Fits in the AI Routing Table
| Task | Model | Cost |
|---|---|---|
| Tag suggestions | Groq llama-3.3-70b | Free tier |
| Bulk summarization | DeepInfra llama-3.1-70b | $0.07/1M |
| Prose balance review | Nebius llama-3.3-70b | $0.10/1M |
| Document-level improvements | Claude Haiku | $0.25/1M |
| Design decisions | Claude Sonnet | $3.00/1M |
Nebius sits between DeepInfra (cost-optimized) and Claude Haiku (quality-optimized).
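The tiers above can be encoded as a plain lookup table. A minimal sketch in TypeScript; the task names, `Route` shape, and `route` helper are illustrative, not from the real codebase, though the providers and costs mirror the table:

```typescript
// Hypothetical routing table mirroring the cost tiers above.
type Task =
  | "tag_suggestions"
  | "bulk_summarization"
  | "balance_review"
  | "doc_improvements"
  | "design_decisions";

interface Route {
  provider: string;
  model: string;
  costPer1M: number; // USD per 1M input tokens; 0 = free tier
}

const ROUTES: Record<Task, Route> = {
  tag_suggestions: { provider: "groq", model: "llama-3.3-70b", costPer1M: 0 },
  bulk_summarization: { provider: "deepinfra", model: "llama-3.1-70b", costPer1M: 0.07 },
  balance_review: { provider: "nebius", model: "meta-llama/Llama-3.3-70B-Instruct", costPer1M: 0.10 },
  doc_improvements: { provider: "anthropic", model: "claude-haiku", costPer1M: 0.25 },
  design_decisions: { provider: "anthropic", model: "claude-sonnet", costPer1M: 3.00 },
};

// Resolve a task to its provider/model/cost entry.
function route(task: Task): Route {
  return ROUTES[task];
}
```

Each new task only needs one new row here; nothing else in the pipeline changes.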
## Supabase Edge Function

```typescript
// ai-hub/index.ts (action: "notes.balance_review")
case "notes.balance_review": {
  const { content } = body;
  const response = await fetch(
    "https://api.studio.nebius.ai/v1/chat/completions",
    {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${Deno.env.get("NEBIUS_API_KEY")}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: "meta-llama/Llama-3.3-70B-Instruct",
        messages: [
          {
            role: "system",
            content: `Analyze the writing balance and suggest up to 3 improvements.
Return JSON: {"issues": [{"type": "paragraph_balance|list_granularity|conclusion_strength", "description": "...", "suggestion": "..."}]}`,
          },
          { role: "user", content: content.slice(0, 2000) },
        ],
        max_tokens: 300,
        temperature: 0.3,
        response_format: { type: "json_object" },
      }),
    },
  );
  // Surface upstream failures instead of parsing an error body as a reply.
  if (!response.ok) {
    throw new Error(`Nebius API error: ${response.status}`);
  }
  const data = await response.json();
  const result = JSON.parse(data.choices[0].message.content);
  return new Response(JSON.stringify(result), {
    headers: { "Content-Type": "application/json" },
  });
}
```
`response_format: { type: "json_object" }` guarantees the reply is syntactically valid JSON, so `JSON.parse` won't throw. It does not guarantee the exact schema, so the key names are still worth checking before use.
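A light validation pass covers the gap between "valid JSON" and "the shape the prompt asked for". A sketch; the `BalanceIssue` type and `parseReview` helper are illustrative names, not part of the real edge function:

```typescript
// Defensive parse of a JSON-mode reply: JSON.parse cannot fail on a
// json_object response, but the shape can still drift from the prompt.
interface BalanceIssue {
  type: string; // "paragraph_balance" | "list_granularity" | "conclusion_strength"
  description: string;
  suggestion: string;
}

function parseReview(raw: string): BalanceIssue[] {
  const parsed = JSON.parse(raw);
  const issues = Array.isArray(parsed?.issues) ? parsed.issues : [];
  // Keep only well-formed entries and honor the "up to 3" prompt contract.
  return issues
    .filter((i: any) =>
      typeof i?.type === "string" &&
      typeof i?.description === "string" &&
      typeof i?.suggestion === "string")
    .slice(0, 3);
}
```

Malformed or missing `issues` degrades to an empty list instead of a 500 in the edge function.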
## Flutter Side

```dart
// note_editor_page.dart
Future<void> _requestBalanceReview() async {
  setState(() => _reviewLoading = true);
  try {
    final response = await Supabase.instance.client.functions.invoke(
      'ai-hub',
      body: {
        'action': 'notes.balance_review',
        'content': _controller.text,
      },
    );
    final issues = (response.data['issues'] as List? ?? [])
        .cast<Map<String, dynamic>>();
    if (mounted) {
      setState(() {
        _reviewIssues = issues;
        _reviewLoading = false;
      });
    }
  } catch (e) {
    // Guard against setState after dispose, same as the success path.
    if (mounted) {
      setState(() => _reviewLoading = false);
    }
  }
}
```
A "Balance Check" button below the editor triggers this. Results render as issue cards.
## Nebius Quick Reference

| Property | Value |
|---|---|
| Endpoint | api.studio.nebius.ai/v1/ (OpenAI-compatible) |
| Models | Llama-3.3-70B, DeepSeek-V3, others |
| Input cost | $0.10/1M tokens |
| JSON mode | ✅ supported |
| Japanese quality | Solid at 70B scale |
## The Routing Pattern

The consistent theme across Groq, DeepInfra, and Nebius: all use OpenAI-compatible APIs. Your routing layer is just a URL swap:

```typescript
const PROVIDER_URLS = {
  groq: "https://api.groq.com/openai/v1",
  deepinfra: "https://api.deepinfra.com/v1/openai",
  nebius: "https://api.studio.nebius.ai/v1",
};
```
Build once, tune per task.
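Because all three providers expose the same OpenAI-style chat-completions path, the swap reduces to building the URL from the map. A sketch; the `chatCompletionUrl` helper is an illustrative name:

```typescript
const PROVIDER_URLS = {
  groq: "https://api.groq.com/openai/v1",
  deepinfra: "https://api.deepinfra.com/v1/openai",
  nebius: "https://api.studio.nebius.ai/v1",
} as const;

type Provider = keyof typeof PROVIDER_URLS;

// All three providers accept the OpenAI chat-completions path,
// so only the base URL (and API key) changes per provider.
function chatCompletionUrl(provider: Provider): string {
  return `${PROVIDER_URLS[provider]}/chat/completions`;
}
```

The request body (`model`, `messages`, `response_format`, ...) stays identical across providers; only the base URL and key differ.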
Building in public: https://my-web-app-b67f4.web.app/