Giacomo Verdi

Voice Notes Transcriber - Email Your Audio, Get Smart Transcriptions

This is a submission for the Postmark Challenge: Inbox Innovators.

What I Built

I built Voice Notes Transcriber, an AI-powered system that transforms voice memos sent via email into searchable, organized text notes.

Simply email an audio file to your Postmark inbound address, and the system automatically:

  • 🎙️ Transcribes audio using Google Speech-to-Text API
  • 🤖 Generates summaries and extracts action items with AI
  • 🏷️ Auto-categorizes notes using NLP
  • 🔍 Makes everything searchable
  • 📊 Provides a beautiful dashboard to manage your notes
  • 🔄 Optionally syncs to Notion

It solves the problem of voice notes being quick to create but hard to organize and search through later.

Demo

Live App: https://voice-notes.jugaad.digital/

Test Credentials:

How to Test:

  1. Log in with the test credentials
  2. Send an audio file (MP3, WAV, M4A) to: your-address@inbound.postmarkapp.com
  3. Wait about 30 seconds for processing (longer recordings take more time)
  4. See your transcribed note appear in the dashboard!

Screenshots

Dashboard View

Audio Player with Transcription

Email Processing Flow

Code Repository

🎙️ Voice Notes Transcriber & Organizer

An intelligent system that automatically transcribes voice notes sent by email, using Postmark's inbound email parsing and the Google Speech-to-Text API, and organizes them with AI-based categorization.

✨ Features

Core Features

  • 📧 Email-to-Transcription: Send voice notes as email attachments to get instant transcriptions
  • 🎯 AI Processing: Automatic transcription using Google Speech-to-Text
  • 📝 Smart Summaries: Generates summaries and extracts action items
  • 🏷️ Auto-Categorization: Intelligent categorization based on content
  • 🔍 Full-Text Search: Search across transcriptions, summaries, and metadata
  • 📱 Responsive Dashboard: Clean web interface for managing notes

Advanced Features

  • 🔄 Notion Integration: Sync transcribed notes with your Notion workspace
  • 🎡 Audio Playback: Built-in audio player with waveform visualization
  • 🌐 Multilingual Support: Transcribe audio in multiple languages
  • 📊 Analytics Dashboard: Track usage patterns and insights
  • …

How I Built It

Tech Stack

  • Backend: Node.js, Express, PostgreSQL, Redis
  • Frontend: React, Vite, TailwindCSS
  • AI/ML: Google Speech-to-Text API, Google Cloud AI
  • Email: Postmark Inbound Email Parsing
  • Storage: Google Cloud Storage (optional) or local
  • Infrastructure: Docker, Docker Compose, Nginx

Postmark Implementation

The core feature uses Postmark's inbound email parsing webhook:


javascript
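// Webhook endpoint that receives emails from Postmark.
// This is a controller method; postmarkService, storageService, userService
// and transcriptionQueue are application modules defined elsewhere.
async handleInboundEmail(req, res) {
  // Validate webhook signature for security before touching the payload
  if (!postmarkService.validateWebhookSignature(req)) {
    return res.status(401).json({ error: 'Invalid signature' });
  }

  const inboundEmail = req.body;

  // Resolve the sender to an account. userService.findByEmail is a
  // hypothetical helper; the original excerpt does not show this lookup.
  const user = await userService.findByEmail(inboundEmail.From);
  if (!user) {
    return res.status(403).json({ error: 'Unknown sender' });
  }

  // Extract audio attachments; fall back to [] when the email has none
  const audioAttachments = (inboundEmail.Attachments ?? []).filter(att =>
    ['audio/mpeg', 'audio/wav', 'audio/mp4'].includes(att.ContentType)
  );

  // Process each audio file
  for (const attachment of audioAttachments) {
    // Decode base64 audio
    const audioBuffer = Buffer.from(attachment.Content, 'base64');

    // Save to Google Cloud Storage or local
    const audioUrl = await storageService.uploadAudio(audioBuffer);

    // Queue transcription job
    await transcriptionQueue.add({
      audioUrl,
      userId: user.id,
      emailSubject: inboundEmail.Subject
    });
  }

  // Send confirmation email
  await postmarkService.sendProcessingConfirmation(inboundEmail.From);

  // Acknowledge receipt so the request does not hang
  return res.status(200).json({ queued: audioAttachments.length });
}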
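
The attachment filter in the handler matches three exact MIME types, but mail clients do not always label audio consistently (for example `audio/x-m4a`, or a generic `application/octet-stream`). Below is a minimal sketch of a more tolerant filter, assuming the payload uses Postmark's inbound attachment fields `Name`, `ContentType`, and `Content`. The names `AUDIO_TYPES`, `AUDIO_EXTENSIONS`, `isAudioAttachment`, and `extractAudio`, as well as the sample payload, are illustrative and not taken from the project's repository.

```javascript
// Sketch only: the MIME and extension lists are assumptions, not an
// exhaustive or project-confirmed set.
const AUDIO_TYPES = new Set([
  'audio/mpeg', 'audio/mp3',
  'audio/wav', 'audio/x-wav', 'audio/wave',
  'audio/mp4', 'audio/x-m4a', 'audio/m4a',
]);
const AUDIO_EXTENSIONS = new Set(['mp3', 'wav', 'm4a']);

function isAudioAttachment(att) {
  // Drop parameters such as '; name="memo.mp3"' and normalize case
  const type = String(att.ContentType || '').split(';')[0].trim().toLowerCase();
  if (AUDIO_TYPES.has(type)) return true;

  // Some clients send a generic type; fall back to the file extension
  if (type === 'application/octet-stream') {
    const ext = String(att.Name || '').split('.').pop().toLowerCase();
    return AUDIO_EXTENSIONS.has(ext);
  }
  return false;
}

function extractAudio(inboundEmail) {
  // Attachments may be missing entirely, so default to an empty list
  return (inboundEmail.Attachments ?? [])
    .filter(isAudioAttachment)
    .map(att => ({
      name: att.Name,
      buffer: Buffer.from(att.Content, 'base64'), // decode base64 audio
    }));
}

// Usage with a hand-built payload (all values are made up)
const sample = {
  From: 'me@example.com',
  Subject: 'Standup notes',
  Attachments: [
    {
      Name: 'memo.M4A',
      ContentType: 'application/octet-stream',
      Content: Buffer.from('fake-audio').toString('base64'),
    },
    { Name: 'photo.png', ContentType: 'image/png', Content: '' },
  ],
};

const audio = extractAudio(sample);
console.log(audio.map(a => a.name));
```

Keeping the check in a small pure function makes it easy to unit test without a running server, and the extension fallback only applies to the generic type so that a mislabeled image is not treated as audio.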

Top comments (1)

Dotallio

This is actually super useful for organizing all those scattered voice notes. Which use case do you see getting the most traction so far?