The "Remote Ollama Server" feature caught my eye. I run a Mac Mini (M-series, 64GB) as a 24/7 AI agent host — Ollama serves qwen3:30b and deepseek-r1:70b locally, and having meeting transcription pull from the same local inference server would be a clean topology.
Two questions from an actual local-first practitioner:
Model selection for summarization — you mention supporting 7B+ models. Which architectures have you found produce the best meeting summaries? I'd imagine instruction-tuned models (like Qwen or Mistral) outperform base models significantly here, but the "structured clinical notes" use case (StenoAI Med) must have very different requirements from a casual standup summary.
Transcript drift in long meetings — Whisper-based transcription tends to accumulate errors over 60+ minute sessions (speaker confusion, repeated phrases, timestamp drift). Do you do any post-processing correction, or is the raw Whisper output the canonical transcript?
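To make "post-processing correction" concrete for the repeated-phrase case, here is a minimal sketch. It assumes the transcript is available as plain text; `collapse_repeats`, `max_ngram`, and `min_repeats` are hypothetical names for illustration, not anything from StenoAI or Whisper. It only collapses a phrase that repeats back to back at least three times, so a legitimate doubled word like "that that" is left alone.

```python
def _norm(word: str) -> str:
    # Compare words case-insensitively and ignore trailing punctuation.
    return word.lower().strip(".,!?;:")


def collapse_repeats(text: str, max_ngram: int = 6, min_repeats: int = 3) -> str:
    """Collapse runs where the same phrase (1..max_ngram words) repeats
    back to back at least `min_repeats` times, keeping a single copy."""
    words = text.split()
    keys = [_norm(w) for w in words]
    out = []
    i = 0
    while i < len(words):
        # Try longer phrases first so a repeated sentence wins over a repeated word.
        for n in range(max_ngram, 0, -1):
            chunk = keys[i:i + n]
            if len(chunk) < n:
                continue
            repeats = 1
            j = i + n
            while keys[j:j + n] == chunk:
                repeats += 1
                j += n
            if repeats >= min_repeats:
                out.extend(words[i:i + n])  # keep the first copy only
                i = j
                break
        else:
            # No looping phrase starts here; keep the word as-is.
            out.append(words[i])
            i += 1
    return " ".join(out)
```

This does nothing for speaker confusion or timestamp drift; those would need diarisation and re-alignment, which are separate problems.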
The privacy angle resonates — I built a browser-only finance tool where all data stays in IndexedDB specifically because financial data + cloud = trust barrier. Same principle applies even more strongly to meeting recordings.
@maxxmini great questions.
Remote Ollama - Yes, some enterprise users have started using this feature. At the moment it only allows our roster of models, but very soon (likely today) we will support any Ollama model you are running. We are advising users to go with Qwen 30B as well.
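For anyone wiring up something similar themselves, a rough sketch of talking to a remote Ollama host could look like this. The host address and the helper names are hypothetical, and this is not how StenoAI does it internally; `/api/tags` and `/api/generate` are Ollama's standard REST endpoints, and 11434 is its default port.

```python
import json
import urllib.request

# Hypothetical LAN address of the machine running Ollama.
OLLAMA_HOST = "http://192.168.1.50:11434"


def build_generate_request(host: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def list_models(host: str) -> list[str]:
    """Return the names of the models the remote server has pulled."""
    with urllib.request.urlopen(f"{host}/api/tags", timeout=10) as resp:
        return [m["name"] for m in json.load(resp)["models"]]


def summarize(host: str, model: str, transcript: str) -> str:
    """Ask the remote model for a summary and return the generated text."""
    prompt = f"Summarize this meeting transcript:\n\n{transcript}"
    req = build_generate_request(host, model, prompt)
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.load(resp)["response"]
```

Calling `list_models(OLLAMA_HOST)` first is a cheap way to check that the server is reachable and that a model such as `qwen3:30b` is actually available before sending a long transcript.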
Model selection - What I've found is that model choice matters less past the 7B mark; prompt engineering and extraction strategies matter more. There is still work to be done here.
StenoAI Med - I've started testing StenoAI Med. It is going to live within the product and be activated by a toggle within advanced that swaps out summary templates and branding. It is essentially StenoAI under the hood, but with more hardened controls and med-specific models.
Transcript Drift - We used the smallest Whisper model for performance reasons, and we don't have diarisation yet. Post-processing correction is interesting; could you elaborate?
We'd be happy for you to try the product; a German govt department recently bought Mac minis because they were happy with StenoAI :) We'd be even happier if you joined our Discord and started giving feedback on the roadmap. It's privacy-aware users like yourself who are really helping us drive great features. discord.com/invite/DZ6vcQnxxu