DEV Community

NjC-IA

NeoVoice AI: From Transcription to Smart Meeting Minutes with a Single Flag

If you've used NeoVoice AI for audio transcription and analysis, you know the API already delivers a lot in a single POST call: full transcription, sentiment analysis, topic identification, and an executive summary.

Now, NeoVoice takes it further. By adding a single boolean parameter to your request body, the API generates structured meeting minutes with a title, participants, decisions, tasks, and pending items. And if you need a formal document, you can receive the minutes as a ready-to-download .docx file.

The best part? Zero breaking changes. If you already integrate NeoVoice, your implementation keeps working exactly as before.


The Architecture of the Change

The design of this feature is deliberately simple. The endpoint is still POST /analyze_audio, and the transcription flow is identical. The only difference is two new optional parameters in the JSON body:

{
  "audio_base64": "...",
  "filename": "sprint_meeting.mp3",
  "language_code": "en-US",
  "generate_ata": true,
  "output_format": "json"
}

generate_ata (boolean, default false): "ata" is Portuguese for meeting minutes. When true, the AI analyzes the transcription with a prompt specialized in meeting minutes instead of the standard analytics prompt. The AI model is the same; the only difference is the system prompt.

output_format (string, "json" | "docx", default "json"): Only relevant when generate_ata is true. Controls whether the response comes back as structured JSON or as a .docx binary.

The generate_ata check is a strict is True comparison: it only accepts the JSON boolean true, not strings such as "true". Requests without the field behave exactly as before, so the change is backward-compatible by omission.
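
Because only a real boolean enables the minutes mode, it can help to validate the payload on the client before sending it. The following is a minimal sketch, not part of the NeoVoice API: build_minutes_payload is a hypothetical helper, and it assumes audio_base64 carries the standard Base64 encoding of the raw file bytes, which the article does not spell out.

```python
import base64
import json


def build_minutes_payload(audio_bytes, filename, language_code="en-US",
                          generate_ata=False, output_format="json"):
    """Build a request body using the field names from the example above."""
    # Reject "true"/"false" strings and 0/1: only a real bool serializes
    # to the JSON true/false that the strict check is described as accepting.
    if not isinstance(generate_ata, bool):
        raise TypeError("generate_ata must be a bool")
    if output_format not in ("json", "docx"):
        raise ValueError("output_format must be 'json' or 'docx'")
    return {
        # Assumption: standard Base64 of the raw audio bytes
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
        "filename": filename,
        "language_code": language_code,
        "generate_ata": generate_ata,
        "output_format": output_format,
    }


payload = build_minutes_payload(b"fake audio bytes", "sprint_meeting.mp3",
                                generate_ata=True)
print(json.dumps(payload, indent=2))
```

Python's True serializes to the JSON literal true, so passing this dictionary through json.dumps (or the json= argument of requests.post) produces the boolean the API expects.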


The Three Response Modes

Mode 1: Analytics (original behavior)

Omit generate_ata or set it to false, and nothing changes:

import requests

url = "https://neovoice-ai.p.rapidapi.com/analyze_audio"
headers = {
    "X-RapidAPI-Key": "YOUR_KEY",  # replace with your RapidAPI key
    "X-RapidAPI-Host": "neovoice-ai.p.rapidapi.com"
}

# No generate_ata field: the API returns the standard analytics response
payload = {
    "audio_url": "https://example.com/meeting.mp3",
    "language_code": "en-US"
}

response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()  # stop on HTTP errors before parsing the body
data = response.json()

print(data["transcript"])
print(data["analytics"]["overall_sentiment"])
print(data["analytics"]["main_topics"])
print(data["analytics"]["summary"])

Response:

{
  "status": "success",
  "transcript": "...",
  "analytics": {
    "overall_sentiment": "positive",
    "main_topics": ["sprint planning", "delivery deadline", "module X"],
    "summary": "The team aligned priorities for the next sprint..."
  }
}
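
Indexing the parsed body directly raises a KeyError if a field is missing. As a hedged sketch, the hypothetical helper below (extract_analytics is not part of the API) reads the same fields defensively. It assumes that any status other than "success" signals a failure, since the article only shows the success case.

```python
def extract_analytics(data):
    """Return (transcript, analytics) from a parsed analytics response."""
    # Assumption: any "status" other than "success" means the request failed.
    if data.get("status") != "success":
        raise RuntimeError("NeoVoice request failed: %r" % (data,))
    analytics = data.get("analytics") or {}
    return data.get("transcript", ""), {
        "overall_sentiment": analytics.get("overall_sentiment"),
        "main_topics": analytics.get("main_topics", []),
        "summary": analytics.get("summary", ""),
    }


sample = {
    "status": "success",
    "transcript": "...",
    "analytics": {
        "overall_sentiment": "positive",
        "main_topics": ["sprint planning", "delivery deadline", "module X"],
        "summary": "The team aligned priorities for the next sprint...",
    },
}
transcript, analytics = extract_analytics(sample)
print(analytics["overall_sentiment"], analytics["main_topics"])
```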

Mode 2: Smart Minutes as JSON

Add generate_ata: true:

# Reuses requests, url, and headers from the Mode 1 example
payload = {
    "audio_url": "https://example.com/meeting.mp3",
    "language_code": "en-US",
    "generate_ata": True,  # Python True is sent as JSON true
    "output_format": "json"
}

response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()  # stop on HTTP errors before parsing the body
data = response.json()

print(data["ata"]["titulo"])
print(data["ata"]["participantes"])
print(data["ata"]["decisoes"])
print(data["ata"]["tarefas"])
print(data["ata"]["pendentes"])

Response:

{
  "status": "success",
  "transcript": "...",
  "ata": {
    "titulo": "Sprint Alignment Meeting",
    "participantes": ["Ana (PO)", "Bruno (dev)", "Carlos (QA)"],
    "decisoes": [
      "Delivery deadline approved for July 15th",
      "Module X will be prioritized over module Y"
    ],
    "tarefas": [
      {
        "responsavel": "Bruno",
        "acao": "Open PR for module X",
        "prazo": "06/28"
      },
      {
        "responsavel": "Carlos",
        "acao": "Create regression test suite",
        "prazo": "06/30"
      }
    ],
    "pendentes": [
      "Waiting for legal response on clause 4"
    ]
  }
}

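The structure is predictable and parseable. The keys are in Portuguese: titulo (title), participantes (participants), decisoes (decisions), tarefas (tasks), and pendentes (pending items). Each task has responsavel (assignee), acao (action), and prazo (deadline), which makes it straightforward to feed project boards, Jira, Notion, or any task management system.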
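
As one possible way to hand the tasks to another tool, the sketch below flattens the tarefas list into CSV, which most boards can import. The tasks_to_csv helper and the English column names are illustrative choices, not something the API provides.

```python
import csv
import io


def tasks_to_csv(ata):
    """Flatten the "tarefas" list of a minutes object into CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["assignee", "action", "deadline"])
    for task in ata.get("tarefas", []):
        writer.writerow([
            task.get("responsavel", ""),
            task.get("acao", ""),
            task.get("prazo", ""),
        ])
    return buffer.getvalue()


# Same task data as the sample response above
sample_ata = {
    "titulo": "Sprint Alignment Meeting",
    "tarefas": [
        {"responsavel": "Bruno", "acao": "Open PR for module X", "prazo": "06/28"},
        {"responsavel": "Carlos", "acao": "Create regression test suite", "prazo": "06/30"},
    ],
}
print(tasks_to_csv(sample_ata))
```

Using .get with defaults keeps the export working even if the model leaves out a deadline or assignee for some task.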

Mode 3: Minutes as Word Document (.docx)

Switch output_format to "docx":

payload = {
    "audio_url": "https://example.com/meeting.mp3",
    "language_code": "en-US",
    "generate_ata": True,
    "output_format": "docx"
}

response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()  # don't save an error body as a .docx file

# Response is binary, so save the raw bytes directly to a file
with open("meeting_minutes.docx", "wb") as f:
    f.write(response.content)

print("Minutes saved as meeting_minutes.docx")

The response comes with Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document and Content-Disposition: attachment; filename="ata_reuniao.docx". It's not JSON, so don't try to parse it as JSON; on the frontend, handle it as a blob/binary.

The generated document is a real .docx that opens in Word, Google Docs, and LibreOffice. It is formatted with sections, lists, and task tables.
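
Since the same endpoint can answer with JSON or with a binary document, a client can branch on the Content-Type header instead of guessing. This is a rough sketch under two assumptions the article does not confirm: that a successful call returns HTTP 200, and that any non-docx body is JSON. handle_minutes_response is a hypothetical helper.

```python
import json

DOCX_MIME = ("application/vnd.openxmlformats-officedocument"
             ".wordprocessingml.document")


def handle_minutes_response(status_code, headers, body,
                            docx_path="meeting_minutes.docx"):
    """Save a .docx body to disk, or parse a JSON body, based on Content-Type."""
    # Assumption: anything other than HTTP 200 is treated as a failure.
    if status_code != 200:
        raise RuntimeError("request failed with HTTP %d" % status_code)
    # Drop parameters such as "; charset=utf-8" before comparing
    content_type = headers.get("Content-Type", "").split(";")[0].strip().lower()
    if content_type == DOCX_MIME:
        with open(docx_path, "wb") as f:
            f.write(body)
        return docx_path
    # Assumption: every other successful body is JSON
    return json.loads(body.decode("utf-8"))


result = handle_minutes_response(
    200, {"Content-Type": "application/json"}, b'{"status": "success"}')
print(result)
```

With requests, the three arguments would be response.status_code, response.headers, and response.content.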
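
A .docx file is a ZIP archive whose main part is word/document.xml, so a client can run a cheap sanity check on the downloaded bytes before handing them to a user. The sketch below uses only the standard library; looks_like_docx is a hypothetical helper, and the archive built at the end is a stand-in for a real download, not a valid Word document.

```python
import io
import zipfile


def looks_like_docx(data):
    """Return True if the bytes are a ZIP archive containing word/document.xml."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return False
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        return "word/document.xml" in archive.namelist()


# Build a tiny stand-in archive in memory to exercise the check
stand_in = io.BytesIO()
with zipfile.ZipFile(stand_in, "w") as archive:
    archive.writestr("word/document.xml", "<w:document/>")

print(looks_like_docx(stand_in.getvalue()))
print(looks_like_docx(b'{"status": "error"}'))
```

This only confirms the container layout; it does not validate the document's contents.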


Technical Limits

Parameter | Value
Maximum audio size | 100 MB
Maximum processed duration | 7 minutes
Accepted formats | .wav, .mp3, .m4a, .mp4, .ogg, .opus, .flac, .aac, .wma, .webm, .amr
Languages | pt-BR, en-US, and others supported by the speech engine

Format detection uses magic bytes (the signature at the start of the file) rather than the file extension. If you rename an .mp3 to .wav, the API detects the real format and converts it correctly.
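
To illustrate what magic-byte detection means, here is a small client-side sketch that recognizes a few well-known file signatures and checks the size limit before uploading. It is not NeoVoice's implementation, it covers only some of the accepted formats, and it assumes "100 MB" means 100,000,000 bytes.

```python
MAX_AUDIO_BYTES = 100 * 1000 * 1000  # assumption: "100 MB" read as decimal megabytes


def sniff_audio_format(header):
    """Guess a format name from the first bytes of a file, or return None."""
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:4] == b"fLaC":
        return "flac"
    if header[:4] == b"OggS":
        return "ogg"   # Ogg container, used by .ogg and .opus files
    if header[:5] == b"#!AMR":
        return "amr"
    if header[:4] == b"\x1a\x45\xdf\xa3":
        return "webm"  # EBML header, shared by WebM and Matroska
    if header[4:8] == b"ftyp":
        return "mp4"   # ISO base media file, used by .mp4 and .m4a
    if header[:3] == b"ID3":
        return "mp3"   # ID3v2 tag, typical at the start of .mp3 files
    return None


def within_size_limit(num_bytes):
    """Return True if the file size fits the documented maximum."""
    return num_bytes <= MAX_AUDIO_BYTES


# A WAV file renamed to "meeting.mp3" still starts with RIFF....WAVE
renamed_wav = b"RIFF\x24\x00\x00\x00WAVEfmt "
print(sniff_audio_format(renamed_wav))
print(within_size_limit(len(renamed_wav)))
```

The file name never enters the check, which is why renaming a file does not change the detected format.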


Why This Is Actually Great

Most transcription APIs stop at text: they hand you a transcript and that's it. If you want something structured, you need to build your own LLM pipeline, write prompts, parse outputs, and deal with hallucinations and broken JSON.

NeoVoice does all of that in one call, the same call that already existed, with one extra field. No new endpoint, no new auth, no new deployment. If you already consume the API, the whole change is adding "generate_ata": true to your payload.

And the .docx export is generated server-side in memory, with no disk writes and no dependency on external conversion services. The binary goes straight into the HTTP response.


Get Started

Visit NeoVoice AI — Audio Transcription and Analysis with AI to learn more about the product, see demos, and try it for free directly in your browser.

To integrate via API, grab your key at 👉 NeoVoice AI on RapidAPI

First 3 uses are free. Happy coding! 🚀
