Build an AI meeting-notes tool
Meeting notes take two steps: transcribe the audio, then summarize the transcript into key points
and action items. Whisper handles transcription locally, with no per-use fee; APIVAI (GPT-5.5 or
Claude Sonnet) handles summarization cheaply through an OpenAI-compatible API call.
1. Transcribe (local Whisper)
from faster_whisper import WhisperModel

# On a machine without a GPU, use device="cpu", compute_type="int8" instead
stt = WhisperModel("small", device="cuda", compute_type="float16")

def transcribe(audio_path):
    # vad_filter=True drops silent stretches before decoding
    segs, _ = stt.transcribe(audio_path, vad_filter=True)
    return "".join(s.text for s in segs).strip()
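If the notes should point back to where something was said, the segments can be kept with their timestamps instead of being joined. A minimal sketch, assuming only that each segment exposes `start` (in seconds) and `text`, as faster-whisper segments do; `fmt_ts` and `format_segments` are hypothetical helper names, not part of the library:

```python
def fmt_ts(seconds):
    """Render a time offset in seconds as HH:MM:SS."""
    total = int(seconds)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"

def format_segments(segments):
    """Return one '[HH:MM:SS] text' line per non-empty segment."""
    lines = []
    for seg in segments:
        text = seg.text.strip()
        if text:  # skip segments that are only whitespace
            lines.append(f"[{fmt_ts(seg.start)}] {text}")
    return "\n".join(lines)
```

To use it, pass the first value returned by `stt.transcribe(audio_path, vad_filter=True)` to `format_segments` in place of the join in `transcribe`.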
2. Summarize (APIVAI)
from openai import OpenAI

ai = OpenAI(api_key="YOUR_APIVAI_API_KEY", base_url="https://api.apivai.com/v1")

def summarize(transcript):
    r = ai.chat.completions.create(model="gpt-5.5", messages=[
        {"role": "system", "content": "Summarize this meeting transcript. Output: 1) a 3-bullet summary, 2) decisions, 3) action items with owners. Keep it tight."},
        {"role": "user", "content": transcript}])
    return r.choices[0].message.content

print(summarize(transcribe("meeting.wav")))
Make it useful
- Speakers/long meetings: split long transcripts into chunks, summarize each chunk, then summarize the summaries.
- Action items: ask for a structured list (owner, task, due date) that you can push to a tracker.
- Languages: GPT-5.5 summarizes and translates multilingual meetings well.
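For the action-items tip, one option is to ask the model for JSON and parse the reply defensively before pushing anything to a tracker. This is a sketch under assumptions: the prompt wording, the `owner`/`task`/`due` field names, and the `parse_action_items` helper are illustrative, not a schema defined by APIVAI. Models sometimes wrap JSON in a Markdown code fence, so the parser strips one if present.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker the model may wrap its JSON in

# Hypothetical system prompt; adjust the wording and fields to your tracker
ACTION_PROMPT = (
    "Extract action items from this meeting transcript. Reply with only a JSON "
    'array of objects with keys "owner", "task", "due" (use null if unknown).'
)

def parse_action_items(reply):
    """Parse the model's reply into a list of {owner, task, due} dicts."""
    text = reply.strip()
    if text.startswith(FENCE):
        # Drop the surrounding backticks and an optional "json" language tag
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]
    items = json.loads(text)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of action items")
    # Keep only well-formed entries that actually name a task
    return [
        {"owner": i.get("owner"), "task": i.get("task"), "due": i.get("due")}
        for i in items
        if isinstance(i, dict) and i.get("task")
    ]
```

Send `ACTION_PROMPT` as the system message in the same `ai.chat.completions.create` call used for summaries, then pass `r.choices[0].message.content` to `parse_action_items`. A `json.JSONDecodeError` means the model did not return valid JSON and the request is worth retrying.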
Why this stack
- Local Whisper = no per-minute audio cost, plus privacy (audio stays on your machine).
- APIVAI = cheap, OpenAI-compatible summarization; Claude Sonnet for very long transcripts.
FAQ
Does APIVAI transcribe audio? No — run Whisper locally for transcription; APIVAI provides the
summarization model (Claude/GPT).
Which model for summaries? GPT-5.5 for speed/cost; Claude Sonnet for very long transcripts.
How do I handle long meetings? Chunk and do a two-pass summary (summarize chunks, then combine).
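The two-pass approach from the last answer can be sketched as follows. Assumptions: chunks are sized in characters rather than tokens, the 12000-character default is an arbitrary placeholder to tune against the model's context window, and `chunk_text` and `summarize_long` are hypothetical helper names. The `summarize` argument is any function that maps text to a summary, such as the one defined in step 2.

```python
def chunk_text(text, max_chars=12000):
    """Split text on whitespace into chunks of at most max_chars.

    A single word longer than max_chars is kept whole in its own chunk.
    """
    chunks, current, size = [], [], 0
    for word in text.split():
        # +1 accounts for the joining space when the chunk is non-empty
        extra = len(word) + (1 if current else 0)
        if current and size + extra > max_chars:
            chunks.append(" ".join(current))
            current, size = [], 0
            extra = len(word)
        current.append(word)
        size += extra
    if current:
        chunks.append(" ".join(current))
    return chunks

def summarize_long(transcript, summarize, max_chars=12000):
    """Two-pass summary: summarize each chunk, then summarize the partials."""
    chunks = chunk_text(transcript, max_chars)
    if len(chunks) <= 1:
        return summarize(transcript)  # short enough for a single call
    partials = [summarize(c) for c in chunks]
    return summarize("\n\n".join(partials))
```

With the functions from steps 1 and 2, a call would look like `summarize_long(transcribe("meeting.wav"), summarize)`. Splitting on whitespace can cut a sentence in half at a chunk boundary; splitting on segment or paragraph boundaries is a possible refinement.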
Get started
Transcribe with Whisper, summarize with APIVAI. Examples: APIVAI examples repo.