Niv Dvir
How I Built a Cochlear Spiral Spectrogram That Visualizes Music Like the Inner Ear

What if you could see music the way your inner ear hears it?

I built a visualization system that maps audio frequencies onto a Fermat spiral — the same geometric curve that describes how the human cochlea arranges its frequency-sensitive hair cells. The result reveals the hidden geometry of harmony: you can literally see the difference between a major and minor chord.

The Core Idea

Traditional spectrograms show frequency vs. time as a rectangular heatmap. They're useful but clinical — they don't capture the feeling of music.

The cochlea (your inner ear) isn't rectangular. It's a spiral. High frequencies resonate at the base (the outermost turn of the coil) and low frequencies at the apex (the inner tip), spaced logarithmically, just like musical octaves.

So I asked: what if we visualize frequencies on an actual spiral?

How It Works

1. Audio Analysis (scipy FFT)

  • 381 logarithmically spaced frequency bins (20 Hz to 8 kHz)
  • ISO 226 equal-loudness contours for perceptual accuracy
  • 60 FPS frame-by-frame analysis
# Simplified core: FFT → cochlear frequency mapping
import numpy as np
from scipy.fft import rfft, rfftfreq

def analyze_frame(samples, sample_rate=44100, n_bins=381):
    spectrum = np.abs(rfft(samples))
    freqs = rfftfreq(len(samples), 1/sample_rate)

    # Logarithmic bins: 20 Hz to 8 kHz (cochlear range)
    bin_edges = np.logspace(np.log10(20), np.log10(8000), n_bins + 1)
    amplitudes = np.zeros(n_bins)

    for i in range(n_bins):
        mask = (freqs >= bin_edges[i]) & (freqs < bin_edges[i+1])
        if mask.any():
            amplitudes[i] = spectrum[mask].mean()

    return amplitudes
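The equal-loudness step from the bullet list isn't shown in the snippet above. As a rough stand-in (the standard A-weighting curve is a simpler approximation than the full ISO 226 contours the article uses, and this helper is my own sketch, not the project's code), per-bin perceptual weighting could look like:

```python
import numpy as np

def a_weight(freqs):
    # IEC A-weighting curve as linear gain, normalized to 1.0 at 1 kHz;
    # a rough stand-in for the full ISO 226 equal-loudness contours
    f2 = np.asarray(freqs, dtype=float) ** 2
    ra = (12194.0**2 * f2**2) / (
        (f2 + 20.6**2)
        * np.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
        * (f2 + 12194.0**2)
    )
    return ra * 10 ** (2.0 / 20)  # +2.00 dB offset so 1 kHz maps to gain 1.0
```

Multiplying each bin's amplitude by `a_weight(bin_center_freqs)` de-emphasizes low bass and very high frequencies the way human hearing does, so the visualization tracks perceived loudness rather than raw energy.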

2. Spiral Mapping (Fermat Spiral)

Each frequency bin gets a position on a Fermat spiral: r = sqrt(θ)

Low frequencies sit at the outer edge, high frequencies spiral inward toward the center. (In the real cochlea the geometry is mirrored: low frequencies resonate at the inner apex and high frequencies at the outer base, but the tonotopic idea is the same.)

# Map frequency bins to spiral coordinates (Fermat spiral: r = sqrt(theta))
# theta runs from 8*pi down to 0 so that bin 0 (20 Hz) lands at the
# outer edge and the highest bins spiral inward
theta = np.linspace(8 * np.pi, 0, n_bins)
r = np.sqrt(theta)
x = r * np.cos(theta)
y = r * np.sin(theta)

3. Chromesthesia Color Mapping

Colors follow a chromesthesia mapping — the neurological phenomenon where people "see" sounds as colors:

  • Low frequencies (bass) → warm reds/oranges
  • Mid frequencies (voice, guitar) → greens/yellows
  • High frequencies (cymbals, harmonics) → cool blues/cyans
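As a minimal sketch of this idea (the project's actual palette is its own; the linear HSV hue ramp below is my assumption), mapping bin index to color could be as simple as:

```python
import colorsys

def bin_to_rgb(bin_index, n_bins=381):
    # Hypothetical hue ramp: red/orange (bass) -> yellow/green (mids) -> cyan (highs)
    hue = 0.5 * bin_index / (n_bins - 1)  # 0.0 = red, ~0.17 = yellow, 0.5 = cyan
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)
```

Here `bin_to_rgb(0)` gives pure red for the lowest bass bin and `bin_to_rgb(380)` gives cyan for the highest bin, with the mids landing in the yellow/green band in between.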

4. Temporal Features (The Secret Sauce)

Static spectrograms miss the movement of music. I added 5 temporal features, each validated across 1,704 audio samples:

| Feature | What it does | Optimal parameter |
| --- | --- | --- |
| Melodic trails | Short glowing trails following the melody | 10 frames, 0.70 decay |
| Rhythm pulses | Radial pulse on beat hits | 0.50 intensity, 0.25 decay |
| Harmonic auras | Sustained glow for held chords | 4.0 s blend time |
| Atmospheric context | Background mood from a 60 s window | 0.35 influence |
| Harmonic connections | Lines between harmonically related notes | Octave + fifth detection |
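To make the trail mechanism concrete, here is a sketch of how a per-frame decay could work (the buffer logic is my assumption; only the 0.70 decay constant comes from the table):

```python
import numpy as np

DECAY = 0.70  # per-frame trail decay, from the table above

def update_trail(glow, frame_amplitudes, decay=DECAY):
    # Fade the previous glow and keep whichever is brighter:
    # the faded trail or the new frame's amplitudes
    return np.maximum(glow * decay, frame_amplitudes)

glow = np.zeros(381)  # one glow value per frequency bin
```

After ten silent frames a past onset has faded to 0.70**10, about 2.8% of its peak, which is why a 0.70 decay pairs naturally with the ~10-frame trail length in the table.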

Why Harmony Looks Beautiful

This is the magical part. When notes are harmonically related (octaves, fifths, thirds), they land at symmetric positions on the spiral. A major chord creates a visually balanced, symmetric pattern. Dissonance creates asymmetric, chaotic (but still beautiful) patterns.
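One way to see why (a quick numeric sketch, reusing the 20 Hz to 8 kHz range and 8π total angle from the mapping above): with logarithmic bin spacing, a fixed musical interval such as an octave always corresponds to the same angular offset on the spiral, regardless of where it starts, so harmonically related notes always sit in the same geometric relationship to each other.

```python
import numpy as np

F_LO, F_HI, TOTAL_THETA = 20.0, 8000.0, 8 * np.pi

def freq_to_theta(f):
    # Logarithmic position of f within the 20 Hz - 8 kHz range,
    # scaled to the spiral's total angle
    return TOTAL_THETA * np.log(f / F_LO) / np.log(F_HI / F_LO)

# The angular offset of an octave is identical everywhere on the spiral:
octave = freq_to_theta(880.0) - freq_to_theta(440.0)
print(np.degrees(octave))  # same value for 110->220, 220->440, 440->880, ...
```

A perfect fifth (a 3:2 frequency ratio) gets its own fixed angle the same way, which is what makes a major chord's three notes land in a repeatable, balanced pattern.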

Different musical traditions create remarkably different visual signatures:

  • Classical harmony → orderly radial symmetry
  • Arabic maqam → quarter-tone asymmetry with unique geometric beauty
  • EDM/electronic → explosive, pulsing energy patterns

Try It: The Wellspring

I also built a crowdsourcing platform called The Wellspring where people can rate how well these visualizations capture the music. The goal: build an open dataset for AI-powered audio visualization evaluation.

Tech Stack

  • Audio analysis: scipy (FFT), librosa
  • Rendering: PIL (2D), PyVista (3D optional)
  • Video encoding: FFmpeg (H.264, CRF 18, 60 FPS)
  • Web platform: React 18 + TypeScript, Node/Express, PostgreSQL

What's Next

I'm working on browser-based creation tools so anyone can create their own audio-visual harmony — no installation needed. The vision: a global community of creators exploring the intersection of sound and moving image.

The ancient dance between rhythm and movement, renewed with modern tools.


Channel: youtube.com/@NivDvir-ND
The Wellspring: synesthesia-labeler.onrender.com

I'd love to hear your thoughts — especially from anyone working on audio visualization, creative coding, or signal processing!
