⚡ TL;DR
interm.ai is my cross-platform AI assistant that listens to your calls in real time and whispers suggestions à la Cluely – but tuned for interviews & sales demos.
The desktop app ships with Electron v30, a Swift command-line helper for macOS-level audio/screenshot capture (signed & notarized 🛡️), and a marketing site served from Bolt.new. Two “faithful” walk-through videos were generated with Claude Code, so visitors see the magic before installing. Grab the beta at interm.ai.
What I Built 🚀
| Piece | Stack | Why It Matters |
|---|---|---|
| Desktop app | Electron 30 + React | One codebase → .dmg, .exe, AppImage. Native-ish UX with hot reload. |
| Real-time AI | Google Speech-to-Text → GPT-4o streaming (Deepgram migration planned) | Sub-500 ms latency keeps suggestions feeling instant. |
| macOS helper | Swift CLI (Audio & Screenshot) | Grants system-audio capture that Electron can’t access alone. |
| Marketing & demos | Bolt.new site + Claude Code videos | Ship pages in hours and embed polished demos—no Figma export needed. |
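To make the sub-500 ms target concrete, here is a minimal sketch of a per-stage latency budget check. The `LatencyBudget` type, the stage names, and the millisecond values are all hypothetical placeholders for illustration, not measurements from interm.ai.

```swift
/// Hypothetical helper: sums per-stage latencies and compares them to a limit.
struct LatencyBudget {
    let limit: Duration
    private(set) var stages: [(name: String, elapsed: Duration)] = []

    init(limit: Duration) { self.limit = limit }

    /// Record how long one pipeline stage took.
    mutating func record(_ name: String, _ elapsed: Duration) {
        stages.append((name: name, elapsed: elapsed))
    }

    /// Total time across all recorded stages.
    var total: Duration { stages.reduce(.zero) { $0 + $1.elapsed } }

    /// True when the whole pipeline fits inside the limit.
    var isWithinBudget: Bool { total <= limit }

    /// Name of the stage that took longest, if any were recorded.
    var slowestStage: String? { stages.max { $0.elapsed < $1.elapsed }?.name }
}

// Placeholder numbers, chosen only to exercise the helper.
var budget = LatencyBudget(limit: .milliseconds(500))
budget.record("speech-to-text", .milliseconds(180))
budget.record("llm-first-token", .milliseconds(250))
budget.record("render", .milliseconds(20))
print(budget.isWithinBudget, budget.slowestStage ?? "none")
```

Tracking the slowest stage is useful when deciding where a backend swap (such as the planned Deepgram migration) would help most.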
Key Features ✨
- Dual-stream capture – splits mic & system audio for cleaner transcripts (essential on Windows).
- Contextual suggestions – GPT-4o looks at the live transcript and CRM data to surface next-best-action tips.
- Swift screenshot – grabs the active window every 5 s so suggestions can reference what’s actually on-screen.
- Privacy first – all raw audio stays local unless the user opts in to cloud sync.
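The dual-stream and contextual-suggestion features above can be sketched roughly as follows: merge the two per-stream transcripts by timestamp, then prepend CRM fields to build the text handed to the model. Everything here (`Segment`, `mergeTranscripts`, `buildContext`, the "You"/"Them" labels, the sample CRM fields) is a hypothetical illustration under those assumptions, not the app's actual code.

```swift
import Foundation

/// Which capture stream a segment came from (speaker labels are an assumption).
enum Source: String {
    case mic = "You"
    case system = "Them"
}

struct Segment {
    let source: Source
    let start: TimeInterval // seconds since the call started
    let text: String
}

/// Interleave the two per-stream transcripts by start time.
func mergeTranscripts(mic: [Segment], system: [Segment]) -> [Segment] {
    (mic + system).sorted { $0.start < $1.start }
}

/// CRM fields first (sorted by key), then the last `maxLines` lines of dialogue.
func buildContext(segments: [Segment], crm: [String: String], maxLines: Int = 20) -> String {
    let crmLines = crm.sorted { $0.key < $1.key }.map { "\($0.key): \($0.value)" }
    let dialogue = segments.suffix(maxLines).map { "\($0.source.rawValue): \($0.text)" }
    return (["[CRM]"] + crmLines + ["[Transcript]"] + dialogue).joined(separator: "\n")
}

let mic = [Segment(source: .mic, start: 4.0, text: "Happy to walk you through pricing.")]
let system = [Segment(source: .system, start: 1.5, text: "What does the team plan cost?")]
let context = buildContext(
    segments: mergeTranscripts(mic: mic, system: system),
    crm: ["Account": "Acme", "Stage": "Demo"]
)
print(context)
```

Capping the dialogue at `maxLines` keeps the prompt small, which matters when the goal is a fast first token.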
Architecture Overview 🛠️
Swift CLI: Capturing macOS Audio & Screenshots 🎙️🖼️
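// AudioScreenshot.swift (sketch; needs macOS 14+ and Screen Recording permission)
import AppKit
import ScreenCaptureKit
@main
struct AudioScreenshot {
    static func main() async throws {
        let dir = FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask)[0]
            .appendingPathComponent("interm", isDirectory: true)
        try FileManager.default.createDirectory(at: dir, withIntermediateDirectories: true)
        // 1️⃣ System audio: an SCStream with `capturesAudio = true` delivers buffers to an SCStreamOutput (wiring omitted in this excerpt)
        let config = SCStreamConfiguration()
        config.capturesAudio = true
        // 2️⃣ Poll the active window every 5 s (approximated as the first on-screen window of the frontmost app)
        while true {
            let content = try await SCShareableContent.current
            if let pid = NSWorkspace.shared.frontmostApplication?.processIdentifier,
               let window = content.windows.first(where: { $0.isOnScreen && $0.owningApplication?.processID == pid }) {
                let filter = SCContentFilter(desktopIndependentWindow: window)
                let image = try await SCScreenshotManager.captureImage(contentFilter: filter, configuration: config)
                try save(image, to: dir.appendingPathComponent("frame.png")) // → ~/Library/Caches/interm/frame.png
            }
            try await Task.sleep(for: .seconds(5))
        }
    }
    /// Write a CGImage to disk as PNG.
    static func save(_ image: CGImage, to url: URL) throws {
        guard let png = NSBitmapImageRep(cgImage: image).representation(using: .png, properties: [:]) else {
            throw CocoaError(.fileWriteUnknown)
        }
        try png.write(to: url)
    }
}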
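One way the helper could tell the Electron process that a new frame is ready is to print one JSON object per line on stdout. The post does not describe the actual protocol, so the `FrameEvent` shape and its field names below are assumptions made purely for illustration.

```swift
import Foundation

/// Hypothetical message the helper could emit for its parent process.
struct FrameEvent: Codable, Equatable {
    let type: String       // e.g. "frame"
    let path: String       // where the PNG was written
    let capturedAt: Double // Unix timestamp in seconds
}

/// Encode one event as a single line of JSON (the default formatting emits no newlines).
func encodeLine(_ event: FrameEvent) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys, .withoutEscapingSlashes]
    let data = try encoder.encode(event)
    return String(decoding: data, as: UTF8.self)
}

let event = FrameEvent(
    type: "frame",
    path: "~/Library/Caches/interm/frame.png",
    capturedAt: 1_700_000_000
)
let line = try encodeLine(event)
print(line) // the parent can split stdout on "\n" and JSON-parse each line
```

Newline-delimited JSON keeps the parent side simple: it only has to buffer stdout until a line break before parsing.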
Demo 🎞️
Roadmap 🗺️
- On-device fallback with a tiny LLM for offline flights.
- CRM webhook – push call highlights straight into HubSpot.
- Switch speech backend to Deepgram once their low-latency best-word model exits beta.
- Open-source the Swift CLI (after tidying up the build script).


