Felix Zeller

Real-Time Breath Detection in the Browser: Spectral Centroid, Dual-Path State Machines, and a Nasty iOS Bug

Microphone-based breath detection sounds simple until you actually try it. Energy goes up, energy goes down: that's a breath, right? In practice, you run into continuous breathing with no silence gaps, background noise that drifts over time, and (on iOS) an AnalyserNode that silently returns all zeros inside WKWebView. This post walks through how @shiihaa/breath-detection handles each of these problems.

The library was extracted from shii·haa, a breathwork and biofeedback app built by Felix Zeller, a Swiss physician (intensive care and internal medicine) who also holds a psychology degree (Diplom-Psychologe). It's MIT-licensed, has zero dependencies, and ships TypeScript types.

Installation

npm install @shiihaa/breath-detection

The Core Idea: Spectral Centroid for Inhale/Exhale Classification

Most breath detectors only measure when breathing occurs — they can't tell you whether a given phase is an inhale or an exhale. This one can, and the reason is grounded in physiology.

When you inhale through your nose, air moves through narrow nasal passages and turbinates. That turbulence generates higher-frequency acoustic energy. When you exhale, airflow is slower and more laminar — the spectral energy shifts down.

Phase    Airflow     Centroid range
Inhale   Turbulent   ~800–2500 Hz
Exhale   Laminar     ~200–800 Hz

The library computes the spectral centroid from a 4096-point FFT over the 150–2500 Hz band on every tick. If the centroid of phase A is meaningfully higher than phase B, it labels A as inhale and B as exhale. The BreathCycle object even exposes a labelSwapped flag for cases where the centroid evidence flipped the initial threshold-based guess.
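To make the centroid idea concrete, here is a minimal sketch of a band-limited spectral centroid and a centroid-based labeling rule. Both functions (`spectralCentroid`, `labelPhases`) are hypothetical helpers written for this post, not the library's internals, and the 40 Hz minimum difference simply mirrors the `centroidThreshold` option from the quick start.

```typescript
// Hypothetical helper, not part of @shiihaa/breath-detection.
// Returns the spectral centroid (Hz) of the bins whose frequency lies in [loHz, hiHz].
function spectralCentroid(
  magnitudes: ArrayLike<number>, // one magnitude per FFT bin (fftSize / 2 bins)
  sampleRate: number,
  fftSize: number,
  loHz = 150,
  hiHz = 2500,
): number {
  const binHz = sampleRate / fftSize; // width of one bin in Hz
  const first = Math.ceil(loHz / binHz);
  const last = Math.min(Math.floor(hiHz / binHz), magnitudes.length - 1);
  let weighted = 0;
  let total = 0;
  for (let i = first; i <= last; i++) {
    weighted += i * binHz * magnitudes[i]; // frequency weighted by magnitude
    total += magnitudes[i];
  }
  return total > 0 ? weighted / total : 0; // 0 when the band is silent
}

// Assumed decision rule: the phase with the clearly higher centroid is the inhale.
function labelPhases(
  centroidA: number,
  centroidB: number,
  minDiffHz = 40,
): 'A-inhale' | 'B-inhale' | 'uncertain' {
  const diff = centroidA - centroidB;
  if (Math.abs(diff) < minDiffHz) return 'uncertain';
  return diff > 0 ? 'A-inhale' : 'B-inhale';
}
```

With a 4096-point FFT at an assumed 48 kHz sample rate, each bin spans about 11.7 Hz, so the 150–2500 Hz band covers roughly 200 bins.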

The Detection Pipeline

Microphone → FFT → Energy + Centroid → State Machine → Breath Cycles
                                              ↑
                                     Peak Counter (fallback)

State machine (primary path): Energy crosses the calibrated threshold → active phase begins. Energy drops to near-noise-floor → silent phase begins. One active + one silent = one breath cycle. This works well for deliberate breathwork with natural pauses.
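The primary path can be sketched as a two-state machine with hysteresis. This is an illustration under stated assumptions: `detectCycles`, its two thresholds, and the fixed tick length are hypothetical and not taken from the library's source.

```typescript
type Phase = 'active' | 'silent';

interface Cycle {
  activeMs: number; // time spent above the threshold
  silentMs: number; // time spent near the noise floor
}

// Hypothetical sketch: the thresholds are assumptions, not the library's values.
function detectCycles(
  energy: number[],     // one energy sample per tick
  tickMs: number,       // duration of one tick
  onThreshold: number,  // rising to or above this starts the active phase
  offThreshold: number, // falling to or below this starts the silent phase
): Cycle[] {
  const cycles: Cycle[] = [];
  let phase: Phase = 'silent';
  let seenActive = false; // ignore leading silence before the first breath
  let activeTicks = 0;
  let silentTicks = 0;

  for (const e of energy) {
    if (phase === 'silent' && e >= onThreshold) {
      // A new active phase closes the previous active + silent pair.
      if (seenActive) {
        cycles.push({ activeMs: activeTicks * tickMs, silentMs: silentTicks * tickMs });
      }
      phase = 'active';
      seenActive = true;
      activeTicks = 0;
      silentTicks = 0;
    } else if (phase === 'active' && e <= offThreshold) {
      phase = 'silent';
    }
    if (phase === 'active') activeTicks++;
    else if (seenActive) silentTicks++;
  }
  return cycles;
}
```

Using separate on and off thresholds is a design choice in this sketch: it keeps energy that hovers around a single threshold from toggling the phase on every tick.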

Peak fallback (secondary path): Some people breathe continuously without any silence gap — the energy never fully drops. In that case, the library counts energy peaks instead. Two peaks = one breath cycle, with the trough between them treated as the phase boundary. The method field in BreathCycle tells you which path fired: 'threshold' or 'peak'.
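A rough sketch of the peak-counting idea, assuming a simple local-maximum test with a minimum height. `findPeaks` and `cyclesFromPeaks` are hypothetical names for this post; the library's actual peak picking may smooth or debounce differently.

```typescript
// Indices of local maxima at or above minHeight (hypothetical helper).
function findPeaks(energy: number[], minHeight: number): number[] {
  const peaks: number[] = [];
  for (let i = 1; i < energy.length - 1; i++) {
    const rising = energy[i] > energy[i - 1];
    const notExceeded = energy[i] >= energy[i + 1];
    if (energy[i] >= minHeight && rising && notExceeded) peaks.push(i);
  }
  return peaks;
}

interface PeakCycle {
  firstPeak: number;  // index of the first energy peak
  trough: number;     // lowest sample between the peaks (phase boundary)
  secondPeak: number; // index of the second energy peak
}

// Pair consecutive peaks (1+2, 3+4, ...) and locate the trough between them.
function cyclesFromPeaks(energy: number[], minHeight: number): PeakCycle[] {
  const peaks = findPeaks(energy, minHeight);
  const cycles: PeakCycle[] = [];
  for (let k = 0; k + 1 < peaks.length; k += 2) {
    const a = peaks[k];
    const b = peaks[k + 1];
    let trough = a;
    for (let i = a; i <= b; i++) {
      if (energy[i] < energy[trough]) trough = i;
    }
    cycles.push({ firstPeak: a, trough, secondPeak: b });
  }
  return cycles;
}
```

An unpaired trailing peak is left out until its partner arrives, which matches the "two peaks = one breath cycle" rule described above.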

Quick Start

import { BreathDetector } from '@shiihaa/breath-detection';

const detector = new BreathDetector({
  thresholdFactor: 0.35,   // 0 = sensitive, 1 = strict
  enableCentroid: true,
  centroidThreshold: 40,   // Hz difference for confident labeling
  minCycleGapSeconds: 2.5,
});

detector.onCycle((cycle) => {
  console.log(`${cycle.inhaleMs}ms in / ${cycle.exhaleMs}ms out`);
  console.log(`Rate: ${(60000 / cycle.cycleMs).toFixed(1)} breaths/min`);
  console.log(`Method: ${cycle.method}`); // 'threshold' or 'peak'
  console.log(`Centroid A: ${cycle.centroidA1}Hz, B: ${cycle.centroidA2}Hz`);
});

detector.onPhase((event) => {
  // fires every tick — useful for live UI feedback
  console.log(`Phase: ${event.phase}, Energy: ${event.energy.toFixed(3)}`);
});

const ok = await detector.start();
if (!ok) throw new Error('Mic access denied'); // a top-level `return` is not valid in a module

// 6-second calibration: 2s silence + 4s breathing
const cal = await detector.calibrate();
console.log(`Noise floor: ${cal.noiseFloor.toFixed(4)}`);
console.log(`Breath max: ${cal.breathMax.toFixed(4)}`);

detector.startDetection();

Auto-Recalibration

One underrated problem: users change rooms, switch from earbuds to the laptop mic, or the HVAC kicks on. The library re-samples the noise floor every 10 seconds during active detection, so the threshold adapts without requiring a fresh calibration call.
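One way such periodic re-sampling could work is sketched below. Everything here is an assumption made for illustration: the `createAdaptiveThreshold` helper, the use of the quietest 10% of a window as the noise estimate, and the threshold formula `noiseFloor + factor * (breathMax - noiseFloor)`. The library may estimate the floor differently.

```typescript
// Hypothetical sketch of periodic noise-floor re-estimation.
function createAdaptiveThreshold(
  initialNoiseFloor: number,
  breathMax: number,
  factor = 0.35,     // plays the role of thresholdFactor in this sketch
  windowMs = 10_000, // re-estimate the floor once per window
) {
  let noiseFloor = initialNoiseFloor;
  let samples: number[] = [];
  let windowStart: number | null = null;

  // Feed one energy sample with its timestamp; returns the current threshold.
  return function update(energy: number, nowMs: number) {
    if (windowStart === null) windowStart = nowMs;
    samples.push(energy);
    if (nowMs - windowStart >= windowMs) {
      // Assumption: the quietest 10% of the window approximates background noise.
      const sorted = [...samples].sort((a, b) => a - b);
      noiseFloor = sorted[Math.floor(sorted.length * 0.1)];
      samples = [];
      windowStart = nowMs;
    }
    return { noiseFloor, threshold: noiseFloor + factor * (breathMax - noiseFloor) };
  };
}
```

Taking a low percentile rather than the window mean keeps the breaths themselves from inflating the noise estimate.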

The iOS Problem (and the Fix)

If you're building a Capacitor app, there's a widely reported but poorly documented bug: AnalyserNode.getByteFrequencyData() returns all zeros inside WKWebView, even when getUserMedia succeeds and the microphone is actually capturing audio.

// ❌ Broken on iOS WKWebView
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const ctx = new AudioContext();
const analyser = ctx.createAnalyser();
ctx.createMediaStreamSource(stream).connect(analyser);

const data = new Uint8Array(analyser.frequencyBinCount);
analyser.getByteFrequencyData(data);
console.log(data); // [0, 0, 0, 0, ...] — even when mic is live

The companion plugin @shiihaa/capacitor-audio-analysis routes microphone capture through native AVAudioEngine in Swift, computes RMS and band energy there, and emits the results as Capacitor events. BreathDetector can then consume those values instead of touching the Web Audio API.

npm install @shiihaa/capacitor-audio-analysis
npx cap sync

Then start the native capture and listen for audioData events:

import { AudioAnalysis } from '@shiihaa/capacitor-audio-analysis';

await AudioAnalysis.start({ gain: 8.0 });

const handle = await AudioAnalysis.addListener('audioData', (data) => {
  console.log('RMS:', data.rms);         // smoothed 0–1
  console.log('Band energy:', data.bandEnergy); // 150–2500 Hz proxy
});

BreathCycle Object

Every completed cycle delivers:

interface BreathCycle {
  inhaleMs: number;      // inhale duration
  exhaleMs: number;      // exhale duration
  holdInMs: number;      // breath hold after inhale (0 if none)
  holdOutMs: number;     // breath hold after exhale (0 if none)
  cycleMs: number;       // total cycle duration
  peakEnergy: number;    // 0–1
  confidence: number;    // 0–100
  labelSwapped: boolean; // centroid overrode threshold labeling
  centroidA1: number;    // spectral centroid for phase 1 (Hz)
  centroidA2: number;    // spectral centroid for phase 2 (Hz)
  method: 'threshold' | 'peak';
  timestamp: number;
}

What It's Designed For

The library was built for guided breathwork (box breathing, 4-7-8, coherence breathing) and biofeedback applications where you need reliable per-cycle data with inhale/exhale distinction. It's not a medical device. Confidence scores and the labelSwapped flag give you enough signal to decide whether to trust a given cycle for real-time feedback or discard it.
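As a small example of using the confidence score, the sketch below averages the breathing rate over trusted cycles only. `CycleSummary` is a local subset of the BreathCycle fields, `breathingRate` is a hypothetical helper, and the cutoff of 60 is an arbitrary illustration rather than a library recommendation.

```typescript
// Local subset of the BreathCycle fields used by this sketch.
interface CycleSummary {
  cycleMs: number;    // total cycle duration
  confidence: number; // 0–100
}

// Mean breathing rate (breaths/min) over cycles at or above minConfidence.
// Returns null when no cycle is trusted enough to report a rate.
function breathingRate(cycles: CycleSummary[], minConfidence = 60): number | null {
  const trusted = cycles.filter((c) => c.confidence >= minConfidence);
  if (trusted.length === 0) return null;
  const meanMs = trusted.reduce((sum, c) => sum + c.cycleMs, 0) / trusted.length;
  return 60000 / meanMs;
}
```

In a live UI, a null result is a natural cue to show "listening" instead of a possibly misleading number.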
