<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Abhinav Goyal</title>
    <description>The latest articles on DEV Community by Abhinav Goyal (@trippyengineer).</description>
    <link>https://dev.to/trippyengineer</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3838558%2Fe0e004a5-f468-4112-a4ed-505b9072de82.jpg</url>
      <title>DEV Community: Abhinav Goyal</title>
      <link>https://dev.to/trippyengineer</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/trippyengineer"/>
    <language>en</language>
    <item>
      <title>MemPalace Benchmark Claims Don't Hold Up - A Technical Breakdown</title>
      <dc:creator>Abhinav Goyal</dc:creator>
      <pubDate>Sun, 12 Apr 2026 17:39:00 +0000</pubDate>
      <link>https://dev.to/trippyengineer/mempalace-benchmark-claims-dont-hold-up-a-technical-breakdown-3l91</link>
      <guid>https://dev.to/trippyengineer/mempalace-benchmark-claims-dont-hold-up-a-technical-breakdown-3l91</guid>
      <description>&lt;p&gt;This past week, MemPalace went viral on GitHub — an open-source AI memory system fronted by actress Milla Jovovich, claiming 100% on LongMemEval and 100% on LoCoMo. I was evaluating it for a production agentic AI pipeline and decided to dig into the actual code and community audits before integrating anything. Here's what I found.&lt;br&gt;
To be fair, the core idea is solid. MemPalace stores your LLM conversation history locally using ChromaDB, organized into a spatial hierarchy:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wings&lt;/strong&gt; — people or projects&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Halls&lt;/strong&gt; — memory types&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rooms&lt;/strong&gt; — conversation threads&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tunnels&lt;/strong&gt; — cross-connections between memories&lt;/p&gt;

&lt;p&gt;Instead of dumping your entire memory store to the LLM (the naive approach), it sends only the top 15 semantically relevant memories (~800 tokens). That's a claimed 250x token reduction vs. brute-force context stuffing. Fully offline, MIT-licensed, costs ~$0.70/year to run.&lt;/p&gt;

&lt;p&gt;The spatial retrieval does measurably outperform flat ChromaDB search. The privacy-first architecture is real. This part is genuinely good work.&lt;/p&gt;
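&lt;p&gt;To make the idea concrete, here's a minimal toy sketch of wing-first routing. The names mirror the article, the word-overlap score is a crude stand-in for real vector similarity, and none of this is MemPalace's actual code. The point it illustrates: routing to a wing first shrinks the candidate pool before ranking, which is where the edge over flat search comes from.&lt;/p&gt;

```python
# Toy sketch of hierarchical ("palace") routing. Wing/Hall names follow
# the article; the scoring function is a naive word-overlap proxy for
# real embedding similarity.

def score(query, text):
    """Crude relevance proxy: number of shared lowercase words."""
    return len(set(query.lower().split()).intersection(text.lower().split()))

# Memory store organised as wing, then hall, then a list of memories.
palace = {
    "project-alpha": {
        "decisions": ["chose chromadb for vector storage"],
        "people": ["alice owns the retrieval layer"],
    },
    "project-beta": {
        "decisions": ["deferred the mobile client"],
    },
}

def spatial_retrieve(query, wing, top_k):
    """Route to one wing first, then rank only that wing's memories."""
    candidates = []
    for hall_memories in palace[wing].values():
        candidates.extend(hall_memories)
    ranked = sorted(candidates, key=lambda m: score(query, m), reverse=True)
    return ranked[:top_k]
```

&lt;p&gt;Flat search would rank every memory in every wing; routing means the similarity comparison only ever sees one wing's contents.&lt;/p&gt;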

&lt;p&gt;&lt;strong&gt;The Benchmark Problem&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Here's where it breaks down.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LongMemEval: 100% → 96.6%&lt;/strong&gt;&lt;br&gt;
The team identified exactly which questions were failing, engineered fixes targeting those specific questions, then retested on the same dataset. Classic overfitting to a benchmark. After GitHub Issue #29 surfaced this publicly, they revised the score to 96.6% without announcement. The community caught it via commit history.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LoCoMo: 100% (trivially gamed)&lt;/strong&gt;&lt;br&gt;
They ran evaluation with top_k=50 on a dataset containing only 19–32 items. When your retrieval window exceeds the entire dataset size, you retrieve everything by default. This isn't a memory-system benchmark — it's a retrieval window that swallows the entire test set.&lt;/p&gt;
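&lt;p&gt;A toy sketch (not MemPalace's evaluation code) makes the problem concrete: with any scoring function at all, a top-k window at least as large as the dataset returns every item, so "100% recall" is guaranteed before retrieval quality ever enters the picture.&lt;/p&gt;

```python
# Why top_k=50 on a 19-32 item dataset is meaningless: once the
# retrieval window covers the whole dataset, "retrieval" returns
# everything and recall is 100% by construction.

def retrieve(scored_items, top_k):
    """Return the items from the top_k highest-scored (score, item) pairs."""
    ranked = sorted(scored_items, key=lambda pair: pair[0], reverse=True)
    return [item for _, item in ranked[:top_k]]

# 25 items; the scores are irrelevant, because with top_k=50
# every single item comes back regardless of ranking quality.
dataset = [(i * 0.01, f"memory-{i}") for i in range(25)]
hits = retrieve(dataset, top_k=50)
assert len(hits) == len(dataset)  # the window swallows the test set
```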

&lt;p&gt;&lt;strong&gt;Real-World Performance&lt;/strong&gt;&lt;br&gt;
One developer ran manual end-to-end tests by actually asking questions through an LLM connected to MemPalace. Correct answer rate: approximately 17%. Three independent audits reached the same conclusion: solid ChromaDB wrapper, broken marketing claims.&lt;br&gt;
&lt;strong&gt;README vs. Codebase Table&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;tr&gt;
    &lt;th&gt;README Claim&lt;/th&gt;
    &lt;th&gt;Code Reality&lt;/th&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;Contradiction detection&lt;/td&gt;
    &lt;td&gt;knowledge_graph.py has zero contradiction logic&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;Palace structure drives benchmark scores&lt;/td&gt;
    &lt;td&gt;LongMemEval scores are ChromaDB's default embedding performance; palace routing sits above this&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;MCP Claude Desktop integration&lt;/td&gt;
    &lt;td&gt;stdout bug corrupts JSON stream, breaks Claude Desktop on first use&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The Crypto Context&lt;/strong&gt;&lt;br&gt;
The primary author is Ben Sigman, a crypto CEO. Milla Jovovich had 7 commits across 2 days at launch. A memecoin spawned within days of the GitHub release. Celebrity face + inflated benchmarks + viral launch + token = a pattern the community rightly recognizes. The MIT license means no software rug-pull, but the marketing playbook is straight from crypto launch culture.&lt;br&gt;
&lt;strong&gt;How It Compares to Obsidian / Logseq&lt;/strong&gt;&lt;br&gt;
Worth noting for anyone using PKM tools: these aren't competitors; they solve different problems.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;tr&gt;
    &lt;th&gt;&lt;/th&gt;
    &lt;th&gt;MemPalace&lt;/th&gt;
    &lt;th&gt;Obsidian&lt;/th&gt;
    &lt;th&gt;Logseq&lt;/th&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;Storage format&lt;/td&gt;
    &lt;td&gt;ChromaDB binary vectors&lt;/td&gt;
    &lt;td&gt;Plain Markdown&lt;/td&gt;
    &lt;td&gt;Plain Markdown&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;Human readable&lt;/td&gt;
    &lt;td&gt;No&lt;/td&gt;
    &lt;td&gt;Yes&lt;/td&gt;
    &lt;td&gt;Yes&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;Portability&lt;/td&gt;
    &lt;td&gt;Low (Python API only)&lt;/td&gt;
    &lt;td&gt;Very high&lt;/td&gt;
    &lt;td&gt;Moderate&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;Best for&lt;/td&gt;
    &lt;td&gt;LLM agent memory&lt;/td&gt;
    &lt;td&gt;Human PKM&lt;/td&gt;
    &lt;td&gt;Journaling/outlining&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The practical hybrid: use Obsidian/Logseq as your human knowledge layer, feed structured data into a vector store only for agent retrieval. Don't get locked into a binary format.&lt;/p&gt;
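&lt;p&gt;The hybrid wiring can be as simple as a chunker that walks the Markdown vault and keeps the source path attached, so the agent layer can always point back to a human-readable file. This is my own illustrative sketch (the function name and the paragraph-splitting rule are assumptions, not any particular tool's API); the actual embedding and indexing stay in whatever vector store you plug in.&lt;/p&gt;

```python
import pathlib

def chunk_markdown(vault_dir):
    """Split each Markdown note into paragraph chunks, keeping the
    source filename so agent retrieval results always trace back to
    the plain-text note. Embedding/indexing is left to the vector
    store of your choice."""
    chunks = []
    for path in sorted(pathlib.Path(vault_dir).glob("*.md")):
        text = path.read_text(encoding="utf-8")
        for para in text.split("\n\n"):
            para = para.strip()
            if para:
                chunks.append({"source": path.name, "text": para})
    return chunks
```

&lt;p&gt;The Markdown stays the source of truth; the vector store is a disposable index you can rebuild at any time.&lt;/p&gt;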

&lt;p&gt;&lt;strong&gt;Verdict&lt;/strong&gt;&lt;br&gt;
MemPalace has a genuinely interesting spatial memory architecture. The local-first, privacy-respecting design is real. But benchmarks were manipulated, multiple advertised features don't exist in the codebase, and the launch was engineered around a celebrity and a memecoin.&lt;/p&gt;

&lt;p&gt;It's a version-0.1 ChromaDB wrapper with good ideas and dishonest marketing. Revisit in 3–6 months once independent benchmark reproductions exist and the known bugs are fixed.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>opensource</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>The reason your live AI demo spins has nothing to do with your model</title>
      <dc:creator>Abhinav Goyal</dc:creator>
      <pubDate>Mon, 23 Mar 2026 12:28:37 +0000</pubDate>
      <link>https://dev.to/trippyengineer/the-reason-your-live-ai-demo-spins-has-nothing-to-do-with-your-model-2hil</link>
      <guid>https://dev.to/trippyengineer/the-reason-your-live-ai-demo-spins-has-nothing-to-do-with-your-model-2hil</guid>
      <description>&lt;p&gt;There's a specific kind of fear before a live demo.&lt;/p&gt;

&lt;p&gt;Not general anxiety. The 30-seconds-before-you-hit-run kind. Where &lt;br&gt;
you're suddenly aware of every API call, every network hop, every &lt;br&gt;
dependency you didn't stress-test. You smile. You keep talking. &lt;br&gt;
Somewhere in the background, something is computing.&lt;/p&gt;

&lt;p&gt;And you're just hoping it finishes before the silence gets awkward.&lt;/p&gt;

&lt;p&gt;I've been in that position more times than I want to admit. I work &lt;br&gt;
at a global health non-profit. My audience is usually program managers, &lt;br&gt;
M&amp;amp;E specialists, researchers. People who are genuinely skeptical about &lt;br&gt;
whether AI belongs in their workflows. People who need to &lt;em&gt;see&lt;/em&gt; it work.&lt;/p&gt;

&lt;p&gt;One spinner and you've set back AI adoption in that room by six months.&lt;/p&gt;

&lt;p&gt;So I started asking a question I should have asked much earlier:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;What actually has to happen live — and what am I running live out of habit?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That question rewired everything.&lt;/p&gt;


&lt;h2&gt;
  
  
  The split
&lt;/h2&gt;

&lt;p&gt;Most live AI demos fail for the same reason. Not bad models. Not flaky &lt;br&gt;
APIs. The architecture itself — generation and performance running in &lt;br&gt;
the same process, at the same time, in front of people.&lt;/p&gt;

&lt;p&gt;Audio synthesis. Script generation. API calls. All happening live, &lt;br&gt;
while a room waits.&lt;/p&gt;

&lt;p&gt;The fix isn't faster tools. It's removing the dependency entirely.&lt;/p&gt;

&lt;p&gt;Split it into two phases:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 1 — before anyone arrives:&lt;/strong&gt; narration scripts, audio synthesis, &lt;br&gt;
slide timing, the entire orchestration layer. All of it. Pre-computed, &lt;br&gt;
cached, done.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 2 — during the talk:&lt;/strong&gt; only the things that genuinely have to &lt;br&gt;
happen live. Real workflows. Real data. Real outputs.&lt;/p&gt;

&lt;p&gt;By the time I walk into the room, the orchestration overhead is gone. &lt;br&gt;
What's left is only the actual work — and it runs with nothing in its way.&lt;/p&gt;


&lt;h2&gt;
  
  
  Phase 1 — Before the room fills (15–20 min)
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; core.pre_generate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Per slide, in order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;slide_reader.py&lt;/code&gt; pulls text and structure from the PPTX&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;script_agent.py&lt;/code&gt; sends it to Claude → 3–4 sentence narration back&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;voice_engine.py&lt;/code&gt; synthesises audio via Edge TTS (free, no API key)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ffprobe&lt;/code&gt; measures the actual duration of each generated file&lt;/li&gt;
&lt;li&gt;Everything lands in &lt;code&gt;cache/manifest.json&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That last step matters more than it sounds. The first version used word-count estimates for timing, and it was constantly wrong: natural pauses and variation in delivery pace threw it off. Switching to &lt;code&gt;ffprobe&lt;/code&gt; measuring actual media duration fixed it permanently. Now timing is a property of the content, not a guess about it.&lt;/p&gt;

&lt;p&gt;The pipeline is resumable. Re-run it and completed slides skip.&lt;/p&gt;
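&lt;p&gt;The duration measurement and the resume check can be sketched like this. The ffprobe flags are the standard incantation for printing only the media duration; the function names and manifest shape are my own illustration, not the repo's exact code.&lt;/p&gt;

```python
import subprocess

def ffprobe_cmd(audio_path):
    """ffprobe invocation that prints only the duration in seconds."""
    return ["ffprobe", "-v", "error", "-show_entries", "format=duration",
            "-of", "default=noprint_wrappers=1:nokey=1", str(audio_path)]

def probe_duration(audio_path):
    """Measure the actual media duration, the single timing source."""
    out = subprocess.run(ffprobe_cmd(audio_path), capture_output=True,
                         text=True, check=True).stdout
    return float(out.strip())

def needs_generation(manifest, slide_id):
    """Resumability: skip a slide when the manifest already holds a
    completed entry for it (i.e. its duration has been measured)."""
    entry = manifest.get(slide_id)
    return entry is None or "duration" not in entry
```

&lt;p&gt;Re-running the pipeline then just walks the slides and regenerates only the ones where &lt;code&gt;needs_generation&lt;/code&gt; is true.&lt;/p&gt;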


&lt;h2&gt;
  
  
  Phase 2 — During the talk
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; core.orchestrator
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;manifest → play audio (pygame)
         → wait actual duration
         → PyAutoGUI advances slide
         → fire n8n webhook (if demo slide)
         → next
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;No API calls. No generation. Reads the manifest, plays files, fires &lt;br&gt;
webhooks at the right moments.&lt;/p&gt;

&lt;p&gt;One thing that took longer than expected: PyAutoGUI focus management. &lt;br&gt;
Controlling another application's window from a separate Python process &lt;br&gt;
sounds trivial. Window focus, keypress timing, different PowerPoint &lt;br&gt;
versions — all of it needed explicit settle delays and focus checking. &lt;br&gt;
Budget an afternoon for this.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pynput&lt;/code&gt; handles keyboard controls globally without stealing PowerPoint &lt;br&gt;
focus. SPACE pauses, D skips a demo countdown, Q quits.&lt;/p&gt;


&lt;h2&gt;
  
  
  Phase 3 — Q&amp;amp;A
&lt;/h2&gt;

&lt;p&gt;Final slide, mic opens automatically.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SpeechRecognition → Google STT
→ Claude (full conversation history maintained across turns)
→ Edge TTS → plays answer aloud → loops
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The decision to maintain the full &lt;code&gt;messages&lt;/code&gt; array across Q&amp;amp;A turns &lt;br&gt;
was the smallest change with the biggest effect. Follow-up questions &lt;br&gt;
get answered in full session context. Claude remembers what it said &lt;br&gt;
two questions ago. That made the Q&amp;amp;A feel like a completely different &lt;br&gt;
product from the single-turn version.&lt;/p&gt;


&lt;h2&gt;
  
  
  The three live workflows
&lt;/h2&gt;

&lt;p&gt;These are the parts that actually run live. They fire via webhook at configured slide numbers. All three ship as importable JSON — connect credentials and they run immediately.&lt;/p&gt;
&lt;h3&gt;
  
  
  Email Pipeline
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Webhook (triggered by orchestrator)
→ Gmail: fetch latest email
→ Claude: classify intent + draft reply
→ Escalation Router
→ Gmail Draft + CC logic + Sheets log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This is the slide where the phone went face down.&lt;/p&gt;

&lt;p&gt;n8n fired. Gmail opened on the secondary screen. Claude read an &lt;br&gt;
incoming email, classified the intent, drafted a reply, routed it &lt;br&gt;
for escalation. 4 seconds. Nobody touched a keyboard.&lt;/p&gt;
&lt;h3&gt;
  
  
  Meeting Pipeline
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Webhook (paste any transcript)
→ Claude: action items, decisions, risks, follow-up email
→ Pull attendees from Sheets
→ Gmail + Slack + Sheets log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;I pasted a transcript from a meeting that happened that morning. By the time I finished talking over the slide — action items extracted, follow-ups written to 6 attendees, Slack notified, row in Sheets.&lt;/p&gt;

&lt;p&gt;Someone in the back: &lt;em&gt;"wait, that just actually happened?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Yeah.&lt;/p&gt;
&lt;h3&gt;
  
  
  Evidence Intelligence Engine
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Webhook (research question)
→ Claude: decompose into sub-queries
→ Perplexity Web Search  ─┐
→ Perplexity Academic    ─┘ parallel
→ Claude: is this evidence strong enough?
→ [yes] → Google Doc brief
→ [no]  → rewrite queries → retry (max 2 rounds)
→ Slack + Sheets log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The quality gate is what gets the most questions. Claude evaluates &lt;br&gt;
whether it actually has enough evidence to synthesise before writing &lt;br&gt;
the brief. If not, it rewrites the queries and runs again. Caps at &lt;br&gt;
two retries then forces synthesis anyway.&lt;/p&gt;

&lt;p&gt;Live. In the room. Under 90 seconds.&lt;/p&gt;

&lt;p&gt;The question afterward wasn't "how did you build this?"&lt;/p&gt;

&lt;p&gt;It was: &lt;strong&gt;"can I have this?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That's the only metric that matters.&lt;/p&gt;


&lt;h2&gt;
  
  
  One thing I'd change immediately
&lt;/h2&gt;

&lt;p&gt;Built with ElevenLabs initially. Swapped to Microsoft Edge TTS when I realised the quality was close enough for narration and Edge TTS costs literally $0. If you're building voice into an AI pipeline for a resource-constrained organisation, start there. Upgrade only if you actually hit a ceiling.&lt;/p&gt;


&lt;h2&gt;
  
  
  Structure
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ai-presentation-orchestrator/
├── core/
│   ├── orchestrator.py      main runtime controller
│   ├── pre_generate.py      pre-generation pipeline
│   ├── regenerate.py        selective slide regeneration
│   ├── diagnose.py          pre-flight checks
│   └── logger.py
│
├── agents/
│   ├── script_agent.py
│   ├── slide_controller.py
│   └── slide_reader.py
│
├── integrations/
│   ├── voice_engine.py      Edge TTS + pygame
│   ├── heygen_engine.py     avatar video (optional)
│   ├── n8n_trigger.py
│   └── slack_notifier.py
│
├── n8n/                     3 importable workflow JSONs
├── docs/
│   ├── RUNBOOK.md           day-of checklist
│   └── architecture.svg
└── cache/                   auto-created, gitignored
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Run it
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/TrippyEngineer/ai-presentation-orchestrator
&lt;span class="nb"&gt;cd &lt;/span&gt;ai-presentation-orchestrator
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt

&lt;span class="nb"&gt;cp &lt;/span&gt;env.example .env
&lt;span class="c"&gt;# ANTHROPIC_API_KEY is the only required key&lt;/span&gt;

python &lt;span class="nt"&gt;-m&lt;/span&gt; core.diagnose      &lt;span class="c"&gt;# before anything else&lt;/span&gt;
python &lt;span class="nt"&gt;-m&lt;/span&gt; core.pre_generate  &lt;span class="c"&gt;# 15–20 min before the talk&lt;/span&gt;
python &lt;span class="nt"&gt;-m&lt;/span&gt; core.orchestrator  &lt;span class="c"&gt;# when the room fills up&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Install ffmpeg separately (it isn't a pip package):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Windows: &lt;a href="https://gyan.dev/ffmpeg/builds" rel="noopener noreferrer"&gt;gyan.dev/ffmpeg/builds&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Mac: &lt;code&gt;brew install ffmpeg&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Linux: &lt;code&gt;sudo apt install ffmpeg&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/TrippyEngineer" rel="noopener noreferrer"&gt;
        TrippyEngineer
      &lt;/a&gt; / &lt;a href="https://github.com/TrippyEngineer/ai-presentation-orchestrator" rel="noopener noreferrer"&gt;
        ai-presentation-orchestrator
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Fully automated AI presentation system — pre-generates narration, fires live n8n demo workflows, and answers audience questions via voice Q&amp;amp;A. Zero manual input during the talk.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;AI Presentation Orchestrator&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Separate the compute from the performance.&lt;/strong&gt;&lt;br&gt;
Pre-generate everything. Cache it. Walk in and press Enter.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://python.org" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/fba7e48fe18b6c58e553ffa88922b1d239e140dfb80911ff12a5a70753cd0dd9/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f507974686f6e2d332e31302532422d3337373661623f6c6f676f3d707974686f6e266c6f676f436f6c6f723d7768697465" alt="Python 3.10+"&gt;&lt;/a&gt;
&lt;a href="https://github.com/TrippyEngineer/ai-presentation-orchestrator/LICENSE" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/6c290d3fa30f4a51454757590f2beec29a83cccdfcd9945e2c0d387af01477f3/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4c6963656e73652d4d49542d323263353565" alt="License: MIT"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;What This Project Demonstrates&lt;/h2&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Two-phase system design&lt;/strong&gt; — generation and runtime are fully decoupled. All LLM calls, TTS synthesis, and video rendering happen before the presentation. The runtime reads a manifest and executes deterministically.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache-first pipeline&lt;/strong&gt; — every output is stored with a manifest. Partial runs resume from where they stopped. Nothing is regenerated unless explicitly forced.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-modal orchestration&lt;/strong&gt; — synchronises audio playback, slide advancement, and live webhook triggers from a single timing source (actual media duration).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Modular integration layer&lt;/strong&gt; — TTS, avatar video, n8n, Slack, and Google APIs are all isolated modules. Swapping any one of them does not affect the core pipeline.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live voice Q&amp;amp;A&lt;/strong&gt; — final slide activates a mic listener. Audience questions are captured via speech recognition, answered by Claude in persona, and spoken aloud via TTS. Conversation…&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/TrippyEngineer/ai-presentation-orchestrator" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;





&lt;p&gt;Six talks since I rebuilt around this principle. Zero spinners.&lt;/p&gt;

&lt;p&gt;The audience doesn't know there's a manifest. They don't know about &lt;br&gt;
the cache. They see real workflows execute in 4 seconds. They hear &lt;br&gt;
spoken answers before they've finished processing that it worked.&lt;/p&gt;

&lt;p&gt;Nothing hesitates — because by the time the lights go down, the only &lt;br&gt;
thing left to compute is the actual work.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Abhinav Goyal — building AI infrastructure for global health programs, &lt;br&gt;
where reliability isn't optional. Drop questions below, especially if you've hit the live demo &lt;br&gt;
reliability problem from a different angle.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>automation</category>
      <category>n8n</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
