If you've used NeoVoice AI for audio transcription and analysis, you know the API already delivered a lot: full transcription, sentiment analysis, topic identification, and executive summary — all in a single POST call.
Now, NeoVoice takes it further. By adding a single boolean parameter to your request body, the API generates structured meeting minutes with title, participants, decisions, tasks, and pending items. And if you need a formal document, you can receive the minutes as a ready-to-download .docx file.
The best part? Zero breaking changes. If you already integrate NeoVoice, your implementation keeps working exactly as before.
The Architecture of the Change
The design of this feature is elegantly simple. The endpoint remains the same POST /analyze_audio. The transcription flow is identical. The only difference is two new optional parameters in the JSON body:
{
  "audio_base64": "...",
  "filename": "sprint_meeting.mp3",
  "language_code": "en-US",
  "generate_ata": true,
  "output_format": "json"
}
generate_ata (boolean, default false) — When true, the AI analyzes the transcription with a prompt specialized in meeting minutes instead of the standard analytics prompt. The AI model is the same; the difference is 100% in the system prompt.
output_format (string, "json" | "docx", default "json") — Only relevant when generate_ata: true. Controls whether the response comes back as structured JSON or as a .docx binary.
The generate_ata parameter is checked strictly (an is True comparison): only the JSON boolean true enables minutes, and a string such as "true" does not. Requests that omit the field behave exactly as before, so the change is backward-compatible by omission.
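Because a string "true" is silently treated as off, it can help to catch that mistake on the client before the request is sent. The helper below is only a sketch: `build_payload` is a hypothetical name, not part of any NeoVoice SDK, and it assumes that `audio_base64` expects standard Base64 of the raw file bytes, as the body above suggests.

```python
import base64


def build_payload(audio_bytes, filename, language_code="en-US",
                  generate_ata=False, output_format="json"):
    """Build the JSON body for POST /analyze_audio (hypothetical helper)."""
    # Only the JSON boolean true turns minutes on, so reject "true", 1, etc.
    if not isinstance(generate_ata, bool):
        raise TypeError("generate_ata must be a bool")
    if output_format not in ("json", "docx"):
        raise ValueError("output_format must be 'json' or 'docx'")

    payload = {
        # Assumption: standard Base64 text of the raw audio bytes.
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
        "filename": filename,
        "language_code": language_code,
    }
    # Leave the new fields out unless minutes are requested, so the
    # request stays identical to the original analytics call.
    if generate_ata:
        payload["generate_ata"] = True
        payload["output_format"] = output_format
    return payload
```

Passing `generate_ata="true"` raises a `TypeError` here instead of quietly falling back to the analytics mode.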
The Three Response Modes
Mode 1: Analytics (original behavior)
Without generate_ata or with generate_ata: false. Nothing changes:
import requests

url = "https://neovoice-ai.p.rapidapi.com/analyze_audio"
headers = {
    "X-RapidAPI-Key": "YOUR_KEY",
    "X-RapidAPI-Host": "neovoice-ai.p.rapidapi.com"
}
payload = {
    "audio_url": "https://example.com/meeting.mp3",
    "language_code": "en-US"
}

response = requests.post(url, json=payload, headers=headers)
data = response.json()

print(data["transcript"])
print(data["analytics"]["overall_sentiment"])
print(data["analytics"]["main_topics"])
print(data["analytics"]["summary"])
Response:
{
  "status": "success",
  "transcript": "...",
  "analytics": {
    "overall_sentiment": "positive",
    "main_topics": ["sprint planning", "delivery deadline", "module X"],
    "summary": "The team aligned priorities for the next sprint..."
  }
}
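The sample response carries a status field, so a client may want to check it before reading the other keys. The sketch below assumes that any value other than "success" means the request failed; the article does not show what an error response looks like, and `read_analytics` is a hypothetical helper name.

```python
def read_analytics(data):
    """Return (transcript, analytics) from a Mode 1 response dict."""
    # Assumption: any status other than "success" signals a failure.
    if data.get("status") != "success":
        raise RuntimeError("unexpected status: %r" % data.get("status"))

    analytics = data.get("analytics") or {}
    return data.get("transcript", ""), {
        "overall_sentiment": analytics.get("overall_sentiment"),
        "main_topics": analytics.get("main_topics", []),
        "summary": analytics.get("summary", ""),
    }
```

Called as `transcript, analytics = read_analytics(response.json())`, it keeps the key lookups in one place.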
Mode 2: Smart Minutes as JSON
Add generate_ata: true:
payload = {
    "audio_url": "https://example.com/meeting.mp3",
    "language_code": "en-US",
    "generate_ata": True,
    "output_format": "json"
}

response = requests.post(url, json=payload, headers=headers)
data = response.json()

print(data["ata"]["titulo"])
print(data["ata"]["participantes"])
print(data["ata"]["decisoes"])
print(data["ata"]["tarefas"])
print(data["ata"]["pendentes"])
Response:
{
  "status": "success",
  "transcript": "...",
  "ata": {
    "titulo": "Sprint Alignment Meeting",
    "participantes": ["Ana (PO)", "Bruno (dev)", "Carlos (QA)"],
    "decisoes": [
      "Delivery deadline approved for July 15th",
      "Module X will be prioritized over module Y"
    ],
    "tarefas": [
      {
        "responsavel": "Bruno",
        "acao": "Open PR for module X",
        "prazo": "06/28"
      },
      {
        "responsavel": "Carlos",
        "acao": "Create regression test suite",
        "prazo": "06/30"
      }
    ],
    "pendentes": [
      "Waiting for legal response on clause 4"
    ]
  }
}
The structure is predictable and parseable. Each task has responsavel (assignee), acao (action), and prazo (deadline) — perfect for feeding project boards, Jira, Notion, or any task management system.
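As an illustration of feeding a task tool, the sketch below maps each entry of `tarefas` to a row with English keys. The function name `tasks_to_rows` and the output keys are hypothetical; a real integration would rename them to whatever the target system (Jira, Notion, and so on) expects.

```python
def tasks_to_rows(ata):
    """Map the minutes' task list to rows with English keys."""
    rows = []
    for task in ata.get("tarefas", []):
        rows.append({
            "assignee": task.get("responsavel", ""),
            "action": task.get("acao", ""),
            "deadline": task.get("prazo", ""),
        })
    return rows
```

With the response above, `tasks_to_rows(data["ata"])` yields two rows, one for Bruno and one for Carlos.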
Mode 3: Minutes as Word Document (.docx)
Switch output_format to "docx":
payload = {
    "audio_url": "https://example.com/meeting.mp3",
    "language_code": "en-US",
    "generate_ata": True,
    "output_format": "docx"
}

response = requests.post(url, json=payload, headers=headers)

# Response is binary, so save it directly to a file
with open("meeting_minutes.docx", "wb") as f:
    f.write(response.content)

print("Minutes saved as meeting_minutes.docx")
The response comes with Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document and Content-Disposition: attachment; filename="ata_reuniao.docx". It's not JSON. Handle it as a blob/binary on the frontend.
The generated document is a real .docx — opens in Word, Google Docs, LibreOffice. Formatted with sections, lists, and task tables.
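Since the same endpoint can return either JSON or a binary document, a client can branch on the Content-Type header rather than assume one or the other. This is a minimal sketch, assuming that anything not labeled with the .docx MIME type is a UTF-8 JSON body; `decode_body` is a hypothetical helper.

```python
import json

DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def decode_body(content_type, body):
    """Return ("docx", bytes) or ("json", dict) based on Content-Type."""
    # The header may carry parameters such as "; charset=utf-8".
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime == DOCX_MIME:
        return "docx", body
    # Assumption: every other response is a JSON document.
    return "json", json.loads(body.decode("utf-8"))
```

With requests this would be called as `decode_body(response.headers.get("Content-Type", ""), response.content)`, writing the bytes to disk only when the first element is "docx".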
Technical Limits
| Parameter | Value |
|---|---|
| Maximum audio size | 100 MB |
| Maximum processed duration | 7 minutes |
| Accepted formats | .wav, .mp3, .m4a, .mp4, .ogg, .opus, .flac, .aac, .wma, .webm, .amr |
| Languages | pt-BR, en-US and others supported by the speech engine |
Format detection uses magic bytes — it doesn't rely on the file extension. If you rename an .mp3 to .wav, the API detects the real format and converts it correctly.
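To make the idea concrete, here is an illustrative sniffer for a few well-known signatures. It is not NeoVoice's implementation, which the article does not show, and it covers only a subset of the accepted formats; `sniff_audio_format` is a hypothetical name.

```python
def sniff_audio_format(head):
    """Guess a format from the first bytes of a file, or return None."""
    # WAV: RIFF container whose form type at offset 8 is "WAVE".
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"
    # MP3 files that start with an ID3v2 tag (untagged MP3 is not handled here).
    if head[:3] == b"ID3":
        return "mp3"
    if head[:4] == b"fLaC":
        return "flac"
    # Ogg is a container; it holds both .ogg and .opus streams.
    if head[:4] == b"OggS":
        return "ogg"
    if head[:5] == b"#!AMR":
        return "amr"
    return None
```

A FLAC file renamed to `talk.wav` still begins with `fLaC`, so the sniffer reports "flac" regardless of the extension.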
Why This Is Actually Great
Most transcription APIs stop at text — they hand you a transcript and that's it. If you want something structured, you need to build your own LLM pipeline, write prompts, parse outputs, deal with hallucinations and broken JSON.
NeoVoice does all of that in one call. The same call that already existed. With one extra field. No new endpoint, no new auth, no new deployment. If you already consume the API, it's literally adding "generate_ata": true to your payload.
And the .docx export is generated server-side in memory — no disk writes, no dependency on external conversion services. Pure binary in the HTTP response.
Get Started
Visit NeoVoice AI — Audio Transcription and Analysis with AI to learn more about the product, see demos, and try it for free directly in your browser.
To integrate via API, grab your key at 👉 NeoVoice AI on RapidAPI
First 3 uses are free. Happy coding! 🚀