What if you could see music the way your inner ear hears it?
I built a visualization system that maps audio frequencies onto a Fermat spiral, a layout inspired by the coiled shape of the human cochlea, where frequency-sensitive hair cells are arranged along a spiral. The result reveals the hidden geometry of harmony: you can literally see the difference between a major and minor chord.
The Core Idea
Traditional spectrograms show frequency vs. time as a rectangular heatmap. They're useful but clinical — they don't capture the feeling of music.
The cochlea (part of your inner ear) isn't rectangular. It's a spiral. High frequencies resonate at the base (the outer end of the coil) and low frequencies at the apex (the inner tip), spaced roughly logarithmically, just like musical octaves.
So I asked: what if we visualize frequencies on an actual spiral?
How It Works
1. Audio Analysis (scipy FFT)
- 381 logarithmically-spaced frequency bins (20 Hz — 8 kHz)
- ISO 226 equal-loudness contours for perceptual accuracy
- 60 FPS frame-by-frame analysis
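To make the 60 FPS figure concrete: at 44.1 kHz, one video frame corresponds to a hop of 44100 / 60 = 735 samples. A 735-sample FFT only resolves about 60 Hz, which is coarse for bins near 20 Hz, so one common approach is to analyze a longer, overlapping window at each hop. The sketch below assumes that approach; `iter_frames`, the 4096-sample window, and the Hann taper are illustrative choices, not confirmed settings of this project.

```python
import numpy as np

def iter_frames(audio, sample_rate=44100, fps=60, window_len=4096):
    """Yield one Hann-windowed analysis frame per video frame.

    window_len is a hypothetical choice: longer than the hop, so the FFT
    has finer frequency resolution than a 735-sample frame would give.
    """
    hop = sample_rate / fps                  # 735.0 samples at 44.1 kHz, 60 FPS
    window = np.hanning(window_len)
    n_frames = int(len(audio) / hop)
    # Zero-pad the end so the last frames still get a full window
    padded = np.concatenate([audio, np.zeros(window_len)])
    for k in range(n_frames):
        start = int(round(k * hop))
        yield padded[start:start + window_len] * window

# One second of a 440 Hz tone yields 60 analysis frames
sr = 44100
t = np.arange(sr) / sr
frames = list(iter_frames(np.sin(2 * np.pi * 440 * t), sample_rate=sr))
```

Each yielded frame can be passed as `samples` to the `analyze_frame` function that follows.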
```python
# Simplified core: FFT → cochlear frequency mapping
import numpy as np
from scipy.fft import rfft, rfftfreq

def analyze_frame(samples, sample_rate=44100, n_bins=381):
    """Return the mean FFT magnitude in each logarithmic frequency bin."""
    spectrum = np.abs(rfft(samples))
    freqs = rfftfreq(len(samples), 1/sample_rate)

    # Logarithmic bins: 20 Hz to 8 kHz (cochlear range)
    bin_edges = np.logspace(np.log10(20), np.log10(8000), n_bins + 1)
    amplitudes = np.zeros(n_bins)
    for i in range(n_bins):
        mask = (freqs >= bin_edges[i]) & (freqs < bin_edges[i+1])
        if mask.any():  # bins containing no FFT frequency stay at zero
            amplitudes[i] = spectrum[mask].mean()
    return amplitudes
```
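The equal-loudness step from the list above is not shown in the simplified core. The ISO 226 contours are tabulated data that I won't reproduce here, so as a stand-in this sketch uses A-weighting (the analytic curve from IEC 61672), which is a cruder approximation of the ear's sensitivity and not the method named above. The helper `a_weighting_db` and the use of geometric bin centers are assumptions made for illustration.

```python
import numpy as np

def a_weighting_db(freqs_hz):
    """A-weighting gain in dB (analytic form), close to 0 dB at 1 kHz."""
    f2 = np.asarray(freqs_hz, dtype=float) ** 2
    num = (12194.0 ** 2) * f2 ** 2
    den = ((f2 + 20.6 ** 2)
           * np.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
           * (f2 + 12194.0 ** 2))
    return 20.0 * np.log10(num / den) + 2.0

# Same bin layout as analyze_frame: 381 log-spaced bins, 20 Hz to 8 kHz
n_bins = 381
bin_edges = np.logspace(np.log10(20), np.log10(8000), n_bins + 1)
centers = np.sqrt(bin_edges[:-1] * bin_edges[1:])   # geometric bin centers
gain = 10.0 ** (a_weighting_db(centers) / 20.0)     # dB -> linear amplitude factor

# Apply to the output of analyze_frame:
# weighted = amplitudes * gain
```

The effect is to attenuate bass bins strongly (roughly 19 dB at 100 Hz) while leaving the 1 to 5 kHz region nearly untouched.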
2. Spiral Mapping (Fermat Spiral)
Each frequency bin gets a position on a Fermat spiral: r = sqrt(θ)
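In the mapping below, the lowest bin sits at the center of the spiral (θ = 0) and higher frequencies wind outward, mirroring the cochlea, where low frequencies peak at the apex in the middle of the coil.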
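```python
# Map frequency bins to spiral coordinates
import numpy as np

n_bins = 381                               # same bin count as the analysis step
theta = np.linspace(0, 8 * np.pi, n_bins)  # four full turns
r = np.sqrt(theta)                         # Fermat spiral: r = sqrt(θ)
x = r * np.cos(theta)
y = r * np.sin(theta)
```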
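The snippet above places bins by index. If you want to ask where a specific pitch lands, a continuous version of the same idea is handy. This is a minimal sketch assuming angle grows linearly with log-frequency over the 20 Hz to 8 kHz range; `freq_to_xy` is a hypothetical helper, and it differs from the per-bin mapping by a fraction of a bin.

```python
import numpy as np

F_MIN, F_MAX = 20.0, 8000.0   # analysis range from the binning step
THETA_MAX = 8 * np.pi         # four turns, as in the snippet above

def freq_to_xy(freq_hz):
    """Place a frequency on the spiral: log-frequency -> angle, r = sqrt(angle)."""
    frac = np.log(freq_hz / F_MIN) / np.log(F_MAX / F_MIN)  # 0 at 20 Hz, 1 at 8 kHz
    theta = frac * THETA_MAX
    r = np.sqrt(theta)
    return r * np.cos(theta), r * np.sin(theta)

# Example: where A4 and the A one octave higher land
a4 = freq_to_xy(440.0)
a5 = freq_to_xy(880.0)
```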
3. Chromesthesia Color Mapping
Colors follow a mapping inspired by chromesthesia, the neurological phenomenon in which people "see" sounds as colors:
- Low frequencies (bass) → warm reds/oranges
- Mid frequencies (voice, guitar) → greens/yellows
- High frequencies (cymbals, harmonics) → cool blues/cyans
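One simple way to get the red-to-green-to-blue progression listed above is to sweep the hue channel in HSV space across the bins. This is a sketch under that assumption; the function name `bin_colors` and the hue endpoints (0.0 for red, 0.62 for blue) are illustrative, not the exact palette used in the videos.

```python
import colorsys
import numpy as np

def bin_colors(n_bins=381, hue_low=0.0, hue_high=0.62):
    """Return an (n_bins, 3) uint8 RGB array sweeping red -> green -> blue."""
    hues = np.linspace(hue_low, hue_high, n_bins)
    # Full saturation and value; only the hue changes with frequency
    rgb = [colorsys.hsv_to_rgb(h, 1.0, 1.0) for h in hues]
    return (np.array(rgb) * 255).astype(np.uint8)

colors = bin_colors()
```

Scaling each color by its bin's amplitude before drawing would make loud bins bright and quiet bins fade to black.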
4. Temporal Features (The Secret Sauce)
Static spectrograms miss the movement of music. I added 5 temporal features, each validated across 1,704 audio samples:
| Feature | What it does | Optimal parameter |
|---|---|---|
| Melodic trails | Short glowing trails following melody | 10 frames, 0.70 decay |
| Rhythm pulses | Radial pulse on beat hits | 0.50 intensity, 0.25 decay |
| Harmonic auras | Sustained glow for held chords | 4.0s blend time |
| Atmospheric context | Background mood from 60s window | 0.35 influence |
| Harmonic connections | Lines between harmonically related notes | Octave + fifth detection |
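As an illustration of the first row, here is one way a trail with "10 frames, 0.70 decay" could work: keep the last 10 amplitude frames and fade each by 0.70 per frame of age. This is my reading of those parameters, not the project's confirmed implementation; the class name `MelodicTrail` and the choice to combine frames with a per-bin maximum are assumptions.

```python
from collections import deque
import numpy as np

class MelodicTrail:
    """Blend the most recent frames so that older ones fade out geometrically."""

    def __init__(self, length=10, decay=0.70):
        self.history = deque(maxlen=length)   # newest frame at index 0
        self.decay = decay

    def update(self, amplitudes):
        self.history.appendleft(np.asarray(amplitudes, dtype=float))
        # Age 0 is the current frame (weight 1); age k gets weight decay**k
        weighted = [frame * self.decay ** age
                    for age, frame in enumerate(self.history)]
        return np.max(weighted, axis=0)

trail = MelodicTrail()
first = trail.update([1.0, 0.0])    # a note in bin 0
second = trail.update([0.0, 1.0])   # melody moves to bin 1; bin 0 fades to 0.70
```

With these values a note is down to about 4% of its original brightness by the time it leaves the 10-frame window.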
Why Harmony Looks Beautiful
This is the magical part. Because the bins are logarithmically spaced, a musical interval (octave, fifth, third) always spans the same angle on the spiral, so harmonically related notes land at consistent, symmetric positions. A major chord creates a visually balanced, symmetric pattern. Dissonance creates asymmetric, chaotic (but still beautiful) patterns.
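To put rough numbers on this, the sketch below computes the angle each interval spans, assuming the layout from the mapping step: four turns (8π) spread evenly in log-frequency over 20 Hz to 8 kHz. The helper `interval_angle_deg` is hypothetical, and the result ignores the small rounding introduced by discrete bins.

```python
import numpy as np

OCTAVES = np.log2(8000 / 20)          # about 8.64 octaves in the analyzed range
DEG_PER_OCTAVE = 4 * 360 / OCTAVES    # four turns spread over that range

def interval_angle_deg(ratio):
    """Angle on the spiral between two notes with the given frequency ratio."""
    return np.log2(ratio) * DEG_PER_OCTAVE

intervals = [("octave", 2), ("perfect fifth", 3 / 2),
             ("major third", 5 / 4), ("minor third", 6 / 5)]
for name, ratio in intervals:
    print(f"{name:14s} {interval_angle_deg(ratio):6.1f} degrees")
```

Under these assumptions an octave spans about 167 degrees, so octave-related notes sit on nearly opposite sides of the spiral, and a fifth spans about 97 degrees. Because the angle depends only on the frequency ratio, transposing a chord rotates its shape rather than distorting it.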
Different musical traditions create remarkably different visual signatures:
- Classical harmony → orderly radial symmetry
- Arabic maqam → quarter-tone asymmetry with unique geometric beauty
- EDM/electronic → explosive, pulsing energy patterns
Try It: The Wellspring
I also built a crowdsourcing platform called The Wellspring where people can rate how well these visualizations capture the music. The goal: build an open dataset for AI-powered audio visualization evaluation.
Tech Stack
- Audio analysis: scipy (FFT), librosa
- Rendering: PIL (2D), PyVista (3D optional)
- Video encoding: FFmpeg (H.264, CRF 18, 60 FPS)
- Web platform: React 18 + TypeScript, Node/Express, PostgreSQL
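For the encoding step, one common pattern is to pipe raw RGB frames from PIL straight into FFmpeg instead of writing thousands of image files. The sketch below assumes that pattern and uses the settings listed above (H.264, CRF 18, 60 FPS); the function names and the `yuv420p` output format (chosen for player compatibility) are my assumptions, and `ffmpeg` must be on your PATH.

```python
import subprocess

def ffmpeg_command(width, height, out_path, fps=60, crf=18):
    """Build an FFmpeg command that reads raw RGB frames from stdin."""
    return [
        "ffmpeg", "-y",
        "-f", "rawvideo", "-pix_fmt", "rgb24",       # input: raw 8-bit RGB
        "-s", f"{width}x{height}", "-r", str(fps),
        "-i", "-",                                   # read frames from stdin
        "-c:v", "libx264", "-crf", str(crf),         # H.264 at the given CRF
        "-pix_fmt", "yuv420p",
        out_path,
    ]

def encode(frames, width, height, out_path):
    """frames: iterable of PIL.Image objects in RGB mode, each width x height."""
    cmd = ffmpeg_command(width, height, out_path)
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    for img in frames:
        proc.stdin.write(img.tobytes())
    proc.stdin.close()
    return proc.wait()
```

Calling `encode(frames, 1920, 1080, "out.mp4")` with a generator of rendered frames keeps memory use flat, since only one frame is held at a time.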
What's Next
I'm working on browser-based creation tools so anyone can create their own audio-visual harmony — no installation needed. The vision: a global community of creators exploring the intersection of sound and moving image.
The ancient dance between rhythm and movement, renewed with modern tools.
Channel: youtube.com/@NivDvir-ND
The Wellspring: synesthesia-labeler.onrender.com
I'd love to hear your thoughts — especially from anyone working on audio visualization, creative coding, or signal processing!