Building AI Meeting Notes: My First Real Coding Adventure
Three weeks ago I walked out of coding bootcamp with a shiny certificate and basically zero clue how the real world worked. Then my friend asked me to help her build an AI tool that takes meeting recordings and turns them into actual usable notes. I said yes because, honestly, I had no idea what I was getting into.
What followed was the wildest rabbit hole I've ever fallen down. I learned more in these three weeks than I did in the entire 14-week bootcamp. And the biggest shocker? There's this thing called Global API that gives you access to 184 different AI models through one simple endpoint. I had genuinely never heard of it before. Nobody in my cohort mentioned it. None of the YouTube tutorials I watched mentioned it. Yet it changed everything about how I think about building with AI.
Let me tell you the whole story, because if you're a new dev like me, this might save you weeks of confusion.
The Problem That Started It All
My friend Sarah runs a small consulting firm. She sits in three to five meetings a day, and her team was spending literal hours afterward writing up notes. She tried Otter, tried Fireflies, tried Notion AI. Some worked okay. Some were expensive. None of them were great at handling her specific format.
"Can you build me something custom?" she asked.
Now, three months ago, I would have laughed. AI was that magical thing in closed products that I had no access to. Then someone in my Discord server casually dropped a link to Global API and said "you can literally just call any model you want." I was shocked. Blew my mind, honestly. The whole time I thought you needed separate accounts with OpenAI, Anthropic, Google, and a dozen other companies to even experiment. Turns out that's not true at all.
The First Wall I Hit (Model Overload)
Here's the thing nobody warns you about. When you have 184 models to pick from, you don't feel empowered. You feel paralyzed. I spent two whole days just trying to figure out which model to use for meeting transcripts. I was reading benchmark papers, watching comparison videos, spiraling hard.
Then I finally made a spreadsheet. I'm a visual learner, so I literally had to lay it all out. And that's when a few things jumped out at me.
The pricing spread is insane. Some models cost almost nothing, and some cost a small fortune. We're talking prices ranging from $0.01 all the way up to $3.50 per million tokens. If you're new to this like I was, a token is a chunk of text that averages about three-quarters of an English word, so "per million tokens" basically means "per roughly 750,000 words." So even the expensive models are technically cheap, but the difference between cheap and expensive can still wreck your budget if you're processing a lot of meetings.
For my use case, I started comparing the most popular options side by side. Here's what I scribbled down in my notebook:
DeepSeek V4 Flash came in at $0.27 input and $1.10 output, with a 128K context window. For context, "context window" just means how much text the model can read at once. 128K is enormous. That's basically a short book's worth of transcript.
DeepSeek V4 Pro was $0.55 input and $2.20 output, but bumped the context up to 200K. So if Sarah ever had a 6-hour meeting that she wanted notes from, this could probably handle it in one shot.
Qwen3-32B sat at $0.30 input and $1.20 output with a 32K context. Cheaper, but that context felt limiting for long meetings.
GLM-4 Plus surprised me at $0.20 input and $0.80 output with 128K context. The price-to-context ratio on this one looked really tasty.
And then there's GPT-4o, the one everyone talks about. $2.50 input and $10.00 output for 128K context.
I had no idea the popular models were that much more expensive until I saw them in a list next to the alternatives. My whole mental model was wrong.
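To make the per-million-token math concrete, here's a rough sketch of the kind of calculation my spreadsheet was doing. The prices and context sizes are the ones from my list above. Everything else is an assumption for illustration: the 0.75 words-per-token rule of thumb, treating "128K" as 128,000 tokens, and the made-up 9,000-word transcript with an 800-token summary.

```python
# Prices in USD per million tokens (input, output) plus context window in tokens,
# copied from the list above. "128K" is treated as 128,000 for simplicity.
MODELS = {
    "DeepSeek V4 Flash": (0.27, 1.10, 128_000),
    "DeepSeek V4 Pro": (0.55, 2.20, 200_000),
    "Qwen3-32B": (0.30, 1.20, 32_000),
    "GLM-4 Plus": (0.20, 0.80, 128_000),
    "GPT-4o": (2.50, 10.00, 128_000),
}

WORDS_PER_TOKEN = 0.75  # rough rule of thumb, not a real tokenizer count


def estimate_tokens(word_count):
    """Approximate how many tokens a transcript of `word_count` words uses."""
    return round(word_count / WORDS_PER_TOKEN)


def estimate_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in USD, given per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def compare(word_count, output_tokens=800):
    """Return (model, cost, fits_in_context) tuples, cheapest first."""
    input_tokens = estimate_tokens(word_count)
    rows = []
    for name, (in_price, out_price, context) in MODELS.items():
        fits = input_tokens + output_tokens <= context
        cost = estimate_cost(input_tokens, output_tokens, in_price, out_price)
        rows.append((name, cost, fits))
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    # Hypothetical one-hour meeting of about 9,000 words.
    for name, cost, fits in compare(9_000):
        print(f"{name:<18} ${cost:.5f}  fits in context: {fits}")
```

The absolute numbers per meeting are tiny, which is the point: the gap only starts to matter once you multiply it by a lot of meetings.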
The Code That Actually Worked
Once I picked a model to start with, the actual code was way simpler than I expected. I almost didn't believe it when it ran on the first try. Here's the basic structure I used, and I want to share it because if you're a fellow bootcamp grad staring at AI APIs, this is genuinely all you need to get started:
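```python
import os

import openai

# One client for every model: point the standard OpenAI library at the
# Global API base URL and read the key from an environment variable.
client = openai.OpenAI(
    base_url="https://global-apis.com/v1",
    api_key=os.environ["GLOBAL_API_KEY"],
)

# Swap the model string to try a different model.
response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V4-Flash",
    messages=[{"role": "user", "content": "Summarize this meeting transcript into action items, decisions, and key discussion points..."}],
)
print(response.choices[0].message.content)
```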
That's it. That's the whole thing. You set up the client once with that base URL, you grab your API key from your environment variables, and then you just call any of those 184 models. The OpenAI Python library everyone already knows just works with it. I kept waiting for the catch. There wasn't one.
I started with DeepSeek V4 Flash because it was cheap and had plenty of context. For testing prompts and iterating, I couldn't ask for a better starting point. When I had to do my "real" run on a long meeting transcript, I switched to the Pro version just to be safe with the 200K context window.
The Moment My Brain Broke
I want to pause and talk about something that genuinely floored me. In the old world, picking a model meant committing to that model. You'd build your whole app around GPT-4o, then realize it's too expensive, and you'd have to rewrite tons of code to switch. Not here.
Because Global API is unified, I can swap the model name in that one line and try a completely different model. Want to compare DeepSeek V4 Pro to GLM-4 Plus? Change one string. Want to run the same prompt through 10 models and pick the cheapest one that gives acceptable results? Just loop through them.
I built a little test harness in like 20 minutes that ran Sarah's typical meeting transcript through five different models and ranked them by cost and quality. That's something I literally could not have done with traditional API access. The whole closed-off nature of the AI world kind of dissolved for me in that moment.
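For the curious, here's a stripped-down sketch of what a harness like that can look like. It's not my exact code. It assumes `client` is an OpenAI-compatible client like the one in the snippet earlier, that the response includes a `usage` object with token counts (some providers may leave it out), and that you supply your own price table. The function name `run_comparison` is just something I made up for this example.

```python
def run_comparison(client, models, prompt, prices):
    """Send the same prompt to every model and rank the results by cost.

    client: an OpenAI-compatible client (assumed, as in the earlier snippet).
    models: list of model ID strings to try.
    prices: dict of model ID -> (input, output) USD per million tokens.
    """
    results = []
    for model in models:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        usage = response.usage  # token counts reported by the API
        in_price, out_price = prices[model]
        cost = (
            usage.prompt_tokens * in_price
            + usage.completion_tokens * out_price
        ) / 1_000_000
        results.append(
            {
                "model": model,
                "cost": cost,
                "notes": response.choices[0].message.content,
            }
        )
    # Cheapest first; judging quality still means reading the notes yourself.
    return sorted(results, key=lambda result: result["cost"])
```

Cost is the easy half to automate. For quality I just read the outputs side by side and had Sarah pick her favorites.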
The Numbers That Sold Me
Look, I'm a bootcamp grad. I don't have unlimited money. Every dollar matters. So when I saw the cost analysis for meeting-notes workloads, I did a literal double take.
The benchmarks I kept seeing claimed 40 to 65 percent cost reduction compared to just throwing generic prompts at expensive models. That's not a small number. For Sarah's business, processing maybe 20 hours of meetings a week, that difference is hundreds of dollars a month. Real money. Bootcamp-grad money. Coffee money times a thousand.
The other numbers that made me feel like I wasn't crazy for attempting this project:
- Average latency around 1.2 seconds. So when she pastes a transcript in, the notes come back faster than she can refill her coffee.
- Throughput of 320 tokens per second. I had to look up what tokens per second meant, but essentially it's how fast the model spits out text. 320 is fast.
- Average benchmark score of 84.6 percent across various tests. I don't fully understand every benchmark yet, but 84.6 percent sounded pretty solid compared to the alternatives.
- Setup time under 10 minutes. From "let me try this API" to "I have a working prototype" was literally a coffee break. I timed it.
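Here's a back-of-the-envelope way to combine the latency and throughput numbers from that list. I'm assuming the 1.2 seconds is the wait before the first token shows up and the 320 tokens per second is the generation speed after that. That's my reading of the numbers, not something I verified, and the 800-token summary length is a made-up example.

```python
def estimated_response_seconds(output_tokens, latency_s=1.2, tokens_per_second=320):
    """Rough total time: initial wait plus time to generate the output."""
    return latency_s + output_tokens / tokens_per_second


if __name__ == "__main__":
    # Hypothetical 800-token set of notes.
    print(f"{estimated_response_seconds(800):.1f} seconds")
```

Under those assumptions, a set of notes a few hundred tokens long still lands in a handful of seconds.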
The Mistakes I Made So You Don't Have To
I want to share a few things I learned the hard way, because the bootcamp didn't teach me any of this and I wish someone had.
First, caching is your best friend. I was making the same API call over and over during testing because I kept tweaking my prompt. I didn't realize I could just cache the input tokens and save about 40 percent of my cost. Once I figured that out, my testing bill dropped noticeably. If you're calling the same big transcript multiple times, look into caching. It adds up.
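One caveat on the code below: the input-token caching I mentioned happens on the provider side, and I don't want to guess at how it's configured. What I can sketch is a different, simpler trick that helps during testing: a local cache on disk that skips the API call entirely when the model and messages are exactly the same as a previous run. The `.notes_cache` folder and the function names are my own inventions, and `client` is assumed to be an OpenAI-compatible client.

```python
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path(".notes_cache")  # hypothetical folder for cached responses


def cache_key(model, messages):
    """Stable hash of the request, so identical requests share a key."""
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_completion(client, model, messages, cache_dir=CACHE_DIR):
    """Return cached text for an identical request, else call the API once."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key(model, messages)}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))["content"]

    response = client.chat.completions.create(model=model, messages=messages)
    content = response.choices[0].message.content
    path.write_text(json.dumps({"content": content}), encoding="utf-8")
    return content
```

This only helps when nothing in the request changes. The moment you tweak the prompt, the key changes and you pay for a fresh call, which is exactly what you want.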
Second, streaming responses makes a huge difference for user experience. Even though the model only takes 1.2 seconds on average, if you wait for the entire output before showing anything, the user feels like they're waiting forever. When I switched to streaming, where the response appears word by word as it's generated, Sarah said it "feels way faster" even though the total time is identical. Magic trick. Free magic trick.
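Streaming with the OpenAI Python library mostly comes down to passing `stream=True` and looping over the chunks. Here's a minimal sketch, assuming `client` is the OpenAI-compatible client from earlier. The `stream_notes` helper and its `on_text` callback are names I made up so the same function can print to a terminal or push text to a web page.

```python
def _print_piece(piece):
    """Default callback: print each piece as it arrives, without newlines."""
    print(piece, end="", flush=True)


def stream_notes(client, model, messages, on_text=_print_piece):
    """Stream a chat completion, handing each text piece to `on_text`.

    Returns the full text once the stream ends.
    """
    stream = client.chat.completions.create(
        model=model,
        messages=messages,
        stream=True,  # ask for incremental chunks instead of one response
    )
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # some chunks carry no choices (e.g. metadata only)
        piece = chunk.choices[0].delta.content
        if piece:  # content can be None on the first or last chunk
            parts.append(piece)
            on_text(piece)
    return "".join(parts)
```

The total time doesn't change, but the first words show up almost right away, and that's what people actually notice.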
Third, there's a tier called GA-Economy that's specifically designed for simple queries, and it can drop your costs by about 50 percent. I didn't even know this existed until I dug into the docs. If you're building a tool that mostly does routine work, this is a no-brainer. If you need heavy reasoning, you obviously need the bigger models, but most meeting notes are pretty straightforward summarization tasks.
Fourth, set up fallbacks from the start. Rate limits are real. Sometimes a model is overloaded. Sometimes there's a brief outage. If your tool just throws an error in those moments, your users will hate it. I added a simple fallback that retries with a different model if the first one fails. Five extra lines of code, massive quality of life improvement.
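Here's roughly what a fallback like that can look like. Treat it as a sketch rather than my exact code: `client` is assumed to be an OpenAI-compatible client, `complete_with_fallback` is a made-up name, and catching every `Exception` is deliberately blunt. In a real app you'd probably catch narrower errors, such as the OpenAI library's `RateLimitError` or `APIConnectionError`, and add a short delay between attempts.

```python
def complete_with_fallback(client, models, messages):
    """Try each model in order and return (model_used, text) from the first success."""
    last_error = None
    for model in models:
        try:
            response = client.chat.completions.create(
                model=model,
                messages=messages,
            )
            return model, response.choices[0].message.content
        except Exception as error:  # broad on purpose; narrow this in real code
            last_error = error
    # Every model failed, so surface the last error instead of hiding it.
    raise RuntimeError("All models failed") from last_error
```

Returning the model name along with the text is handy for logging, so you can see how often the fallback actually kicks in.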
Fifth, and this is more of a soft tip, monitor quality even if you don't have a fancy system for it. Sarah sends me a Slack message every now and then with "this one was great" or "this one missed the point." I keep a running tally. It's manual and scrappy but it tells me if my prompts are drifting in the wrong direction.
What I Actually Built
I won't bore you with every line of code, but here's the basic flow. Sarah uploads a meeting transcript (or pastes text from a transcription service). The backend takes that transcript, sends it to the model with a carefully crafted prompt asking for action items, decisions made, and key discussion points. The model returns structured notes. Sarah can then edit them, share them, or just copy them into her project management tool.
The prompt engineering took longer than the actual coding. I learned that small changes in wording can completely change the output. Saying "list action items" gets you a list. Saying "identify action items, including the person responsible and the deadline if mentioned" gets you a much more useful list. Specifics matter. I went through probably 30 versions of the prompt before I found one that consistently produced what Sarah wanted.
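To show what "specifics matter" looks like in code, here's a simplified sketch of building the request. The wording is an illustration of the idea, not the exact prompt from version 30, and `build_messages` is a made-up helper name.

```python
SYSTEM_PROMPT = (
    "You turn meeting transcripts into structured notes with three sections: "
    "Action Items, Decisions, and Key Discussion Points. "
    "For each action item, include the person responsible and the deadline "
    "if one is mentioned. If something is not stated in the transcript, "
    "say so instead of guessing."
)


def build_messages(transcript):
    """Build the chat messages for one transcript."""
    cleaned = transcript.strip()
    if not cleaned:
        raise ValueError("transcript is empty")
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Meeting transcript:\n\n{cleaned}"},
    ]
```

Keeping the instructions in one constant made it much easier to try a new version: change the string, rerun the same transcript, compare.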
Why I'm Telling You All This
If you're a bootcamp grad like me, you might be intimidated by AI. I was. The bootcamp taught me React, Node, some Python, basic databases. It did not teach me how to integrate AI into a product. It did not teach me about model pricing. It did not teach me that there are unified APIs that let you access dozens of models through one interface.
I had no idea this was so accessible. Genuinely. I thought AI was locked behind enterprise contracts and PhD-level understanding. I was wrong, and I wasted two months being afraid of something I could have been building with.
The thing that changed my perspective the most was finding Global API. Not because it's the flashiest tool, but because it removed the biggest barrier I was facing. Instead of needing accounts and billing setups with five different companies, I made one account, got one key, and had access to 184 models. The complexity dropped by like 90 percent overnight.
I'm not going to pretend my little meeting notes tool is going to disrupt any industry. It's a small project for my friend's small business. But the lessons I learned scaling it, picking models, optimizing costs, building a real product that real people use every day, those lessons were worth more than any bootcamp module.
If you're curious about the API I used, check out Global API at global-apis.com. They have free credits to get started so you can experiment without committing any money. I'm not getting paid to say this, I just wish someone had pointed me in that direction three months ago when I was still staring at model comparison spreadsheets with no idea how to actually call any of them.
The whole "AI is hard and inaccessible" thing is, in my experience, mostly a myth. You just need the right entry point. Go build something. You'll learn way more by doing than by reading another tutorial. I promise.