A Voice Memo App With MediaRecorder, IndexedDB, and Live Waveform Rendering
MediaRecorder captures microphone audio as WebM. IndexedDB stores the blobs (localStorage is too small for audio). An AnalyserNode feeds a Canvas for the live waveform during recording. Web Speech API provides best-effort transcription. Each API has its own quirks — together they make a fully client-side voice memo tool.
Voice memo apps exist natively on every phone, but web-based ones are surprisingly rare because audio is awkward on the web platform. The APIs exist, but they don't compose cleanly — you have to wire together four different browser APIs and deal with each one's edge cases.
🔗 Live demo: https://sen.ltd/portfolio/voice-memo/
📦 GitHub: https://github.com/sen-ltd/voice-memo
Features:
- MediaRecorder audio capture
- IndexedDB blob storage
- Live waveform during recording (Canvas + AnalyserNode)
- Playback with scrubber
- Web Speech API transcription (best-effort)
- Label + tag + delete
- Export recordings
- Japanese / English UI (language-aware recognition)
- Zero dependencies, 55 tests
Recording with MediaRecorder
async function startRecording() {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream, { mimeType: getBestMimeType() });
  const chunks = [];

  recorder.ondataavailable = (e) => chunks.push(e.data);
  recorder.onstop = () => {
    const blob = new Blob(chunks, { type: recorder.mimeType });
    saveToIndexedDB(blob);
    stream.getTracks().forEach((t) => t.stop()); // release the microphone
  };

  recorder.start();
}
The getBestMimeType() helper picks the first format the browser supports, based on this preference list:
export function supportedMimeTypes() {
  return [
    'audio/webm;codecs=opus',
    'audio/webm',
    'audio/ogg;codecs=opus',
    'audio/mp4',
  ].filter((t) => MediaRecorder.isTypeSupported(t)); // arrow avoids an unbound static-method call
}
Chrome prefers WebM/Opus, Safari prefers MP4/AAC. Supporting both means falling back to whichever is available.
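getBestMimeType() itself isn't shown above; a plausible sketch (the check function is injectable here purely to make it testable — the real code would call MediaRecorder.isTypeSupported directly) is:

```javascript
// Hypothetical sketch of getBestMimeType(): return the first type the
// browser can record, or '' to let MediaRecorder fall back to its default.
const PREFERRED_TYPES = [
  'audio/webm;codecs=opus',
  'audio/webm',
  'audio/ogg;codecs=opus',
  'audio/mp4',
];

function getBestMimeType(isSupported = (t) => MediaRecorder.isTypeSupported(t)) {
  return PREFERRED_TYPES.find(isSupported) ?? '';
}
```

With this ordering, Chrome matches the first entry and Safari falls through to 'audio/mp4'.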
Live waveform with AnalyserNode
For the visual feedback during recording, we tap the audio stream through a Web Audio AnalyserNode and read its time-domain data every frame:
const audioCtx = new AudioContext();
const source = audioCtx.createMediaStreamSource(stream);
const analyser = audioCtx.createAnalyser();
analyser.fftSize = 2048;
source.connect(analyser);

const buffer = new Uint8Array(analyser.fftSize);

function draw() {
  analyser.getByteTimeDomainData(buffer);
  drawLiveWaveform(canvas, buffer);
  if (recording) requestAnimationFrame(draw);
}
draw();
The buffer holds PCM samples in the 0-255 range (128 = silence). Rendering is a simple line plot of sample values across the canvas width.
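drawLiveWaveform isn't shown above; a hypothetical sketch that separates the sample-to-pixel mapping (pure, testable) from the canvas stroke might look like:

```javascript
// Map byte samples (0-255, 128 = silence) to canvas coordinates.
function samplesToPoints(samples, width, height) {
  const step = width / samples.length;
  return Array.from(samples, (v, i) => ({
    x: i * step,
    y: (v / 255) * height, // 128 lands near mid-height
  }));
}

// Hypothetical drawLiveWaveform: stroke the mapped points as a polyline.
function drawLiveWaveform(canvas, samples) {
  const ctx = canvas.getContext('2d');
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  ctx.beginPath();
  for (const { x, y } of samplesToPoints(samples, canvas.width, canvas.height)) {
    ctx.lineTo(x, y);
  }
  ctx.stroke();
}
```

Keeping the mapping pure also makes it reusable for the static playback waveform.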
IndexedDB, not localStorage
Audio blobs are hundreds of KB to megabytes. localStorage has a 5-10 MB per-origin limit and only stores strings — you'd have to base64-encode the blob, which adds 33% overhead. IndexedDB stores binary blobs natively and has a much higher quota:
async function saveMemo(memo) {
  const db = await openDB();
  const tx = db.transaction('memos', 'readwrite');
  const store = tx.objectStore('memos');
  await promisify(store.put(memo));
  return memo.id;
}
The wrapper turns IndexedDB's verbose callback API into promises. IndexedDB also supports secondary indexes, so sorting by createdAt is trivial:
const index = store.index('createdAt');
const memos = await promisify(index.getAll());
memos.sort((a, b) => b.createdAt - a.createdAt); // newest first
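Neither promisify nor openDB is shown above; a minimal sketch, assuming a 'memos' store keyed by id with a createdAt index (database and store names are illustrative), could be:

```javascript
// Wrap any IDBRequest-style object (onsuccess/onerror) in a Promise.
function promisify(request) {
  return new Promise((resolve, reject) => {
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// Hypothetical openDB: create the object store and index on first run.
function openDB() {
  return new Promise((resolve, reject) => {
    const req = indexedDB.open('voice-memos', 1);
    req.onupgradeneeded = () => {
      const store = req.result.createObjectStore('memos', { keyPath: 'id' });
      store.createIndex('createdAt', 'createdAt');
    };
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}
```

onupgradeneeded only fires when the version number is higher than what's on disk, so the schema setup runs once per version.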
Web Speech API transcription
The tricky part: Web Speech API transcribes live microphone input, not arbitrary blobs. To transcribe a saved memo, we play it back through the speakers while SpeechRecognition listens to the microphone. Not ideal (requires a quiet room), but works:
async function transcribeMemo(memo) {
  const SR = window.SpeechRecognition || window.webkitSpeechRecognition;
  const recog = new SR();
  recog.lang = getSpeechLang(); // 'ja-JP' or 'en-US'
  recog.continuous = true;
  recog.interimResults = false;

  const url = URL.createObjectURL(memo.blob);
  const audio = new Audio(url);
  let transcript = '';
  recog.onresult = (e) => {
    for (let i = e.resultIndex; i < e.results.length; i++) {
      transcript += e.results[i][0].transcript;
    }
  };

  const done = new Promise((r) => (recog.onend = r));
  recog.start();
  audio.play();
  await new Promise((r) => (audio.onended = r));
  recog.stop();
  await done; // late results can still arrive until recognition ends
  URL.revokeObjectURL(url);
  return transcript;
}
A better approach would be to send the blob to Whisper or another off-device speech model, but that requires a server. For a pure-browser, zero-dependency tool, playback-through-microphone is the pragmatic fallback.
Series
This is entry #71 in my 100+ public portfolio series.
- 📦 Repo: https://github.com/sen-ltd/voice-memo
- 🌐 Live: https://sen.ltd/portfolio/voice-memo/
- 🏢 Company: https://sen.ltd/
