One sound. Any sound. The gem listens. The music appears.
babbled notes
Make any sound. Hum. Tap. Breathe. Whistle.
Gemma 4 finds the music inside it and plays it back as piano, cello, marimba, or drums.
No keyboard. No music theory. No pitch-perfect voice.
Built for anyone who has ever felt shut out of making music.
GitHub: https://github.com/brookehoward2008-droid/Babbled-notes-v2
Live app: https://ai.studio/apps/4d235490-15ac-47a5-9599-f82aa85a2b57
The problem
Most music tools require two hands, ten fingers, perfect pitch, or years of training.
That shuts out a huge part of the world. People who are non-verbal. People with ALS, cerebral palsy, locked-in syndrome, quadriplegia, Parkinson's. People who have always heard music inside them and had no way to get it out.
babbled notes gives them a door.
A single breath. A tongue click. A finger tap. A hum with a tremor in it.
The app takes whatever you can give and turns it into a real musical composition, rendered in real time by a synthesized instrument of your choice.
The NeuralGem
At the center of the app is the NeuralGem, a canvas visualizer with three states:
- IDLE: a breathing silver ring. Waiting.
- RECORDING: a crystallizing polygon. Sides grow with your audio level. Color shifts purple → cyan as the sound builds.
- LOCKED: a hexagon. Facets lit in your mood color. The gem has heard you.
The gem is not decoration. It tells you what the app is doing without words.
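
The three states can be read as a small pure function from app state and audio level to what the canvas should draw. Here is a minimal sketch, assuming a level normalized to 0..1. The names (`GemState`, `gemVisual`), the side counts, and the RGB values are illustrative assumptions, not taken from the app.

```typescript
type GemState = "idle" | "recording" | "locked";

interface GemVisual {
  shape: "ring" | "polygon" | "hexagon";
  sides: number; // 0 for the ring
  color: string; // CSS rgb() string
}

// Illustrative colors (assumptions, not the app's palette).
const PURPLE: [number, number, number] = [128, 0, 255];
const CYAN: [number, number, number] = [0, 255, 255];

// Linear blend between two RGB colors, t in 0..1.
function lerpColor(
  a: [number, number, number],
  b: [number, number, number],
  t: number
): string {
  const mix = (x: number, y: number) => Math.round(x + (y - x) * t);
  return `rgb(${mix(a[0], b[0])}, ${mix(a[1], b[1])}, ${mix(a[2], b[2])})`;
}

function gemVisual(
  state: GemState,
  level: number,
  moodColor: string = "rgb(255, 255, 255)"
): GemVisual {
  // Clamp the audio level so a noisy reading cannot break the drawing.
  const t = Number.isFinite(level) ? Math.min(1, Math.max(0, level)) : 0;
  switch (state) {
    case "idle":
      return { shape: "ring", sides: 0, color: "rgb(192, 192, 192)" };
    case "recording":
      // More sound means more sides (3..12) and a shift from purple to cyan.
      return {
        shape: "polygon",
        sides: 3 + Math.round(t * 9),
        color: lerpColor(PURPLE, CYAN, t),
      };
    case "locked":
      return { shape: "hexagon", sides: 6, color: moodColor };
  }
}
```

Keeping this mapping free of canvas calls means the drawing code only has to render whatever `gemVisual` returns on each animation frame.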
Who it is built for
| Profile | What they give | What they get |
|---|---|---|
| Non-verbal autism | Sustained hum, single tone | Cello or piano melody |
| Cerebral palsy | Tremor-affected taps | Percussive rhythm, drum or marimba |
| ALS | Minimal breath | Ambient drone pad with gentle melody |
| Locked-in syndrome | Single eye-blink switch click | One-trigger composition, looping |
| Quadriplegia | Hard puff / soft puff | Two-dynamic melody: accent and soft |
| Parkinson's | Tremor vocal hum | Composition that treats tremor as vibrato |
| Apraxia of speech | Broken phonation bursts | Legato phrase bridging the gaps |
| AAC / pre-verbal | Rising or falling hum | Interval-based melodic response |
How it works
1. TAP THE ORB → microphone opens
2. MAKE A SOUND → Web Audio API captures + analyzes in real time (FFT pitch, RMS amplitude, onset detection)
3. TAP AGAIN → recording stops
4. GEMMA 4 READS → receives audio + DSP digest simultaneously; returns mood, voice, articulation, Lilt score
5. THE GEM LOCKS → mood-colored hexagon appears
6. MUSIC PLAYS → synthesized instrument renders the Lilt score
7. EDIT ANYTIME → piano roll + live Lilt code editor, re-render without re-recording
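
Step 2 names three analyses. Two of them, RMS amplitude and onset detection, are simple enough to sketch over raw samples. The code below is a simplified energy-threshold detector with assumed parameters (frame size, threshold), not the app's actual DSP code. In a browser the samples could come from something like `AnalyserNode.getFloatTimeDomainData`.

```typescript
// Root-mean-square amplitude of one frame of samples in [-1, 1].
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  const sumSquares = frame.reduce((acc, s) => acc + s * s, 0);
  return Math.sqrt(sumSquares / frame.length);
}

// Onset times in seconds: the start of each frame whose RMS reaches the
// threshold after a quieter frame. A tap, click, or puff shows up as one onset.
function detectOnsets(
  samples: Float32Array,
  sampleRate: number,
  frameSize: number = 512,
  threshold: number = 0.05
): number[] {
  const onsets: number[] = [];
  let wasLoud = false;
  for (let start = 0; start + frameSize <= samples.length; start += frameSize) {
    const loud = rms(samples.subarray(start, start + frameSize)) >= threshold;
    if (loud && !wasLoud) onsets.push(start / sampleRate);
    wasLoud = loud;
  }
  return onsets;
}
```

A fixed threshold is the crudest choice. For very quiet input such as a minimal breath, a threshold relative to the measured noise floor would likely behave better.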
Why Gemma 4
The app sends two things to the model at once:
- Raw audio: the actual recorded sound
- DSP digest: structured analysis of onset times, dominant frequency, pitch name, amplitude, tempo estimate
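
Here is a sketch of what such a digest could look like as a TypeScript type, with two of the derived fields computed. The field names are hypothetical: the post lists what the digest contains, not its schema. The pitch name uses the standard equal-temperament mapping with A4 = 440 Hz, and the tempo estimate is a simple median of the gaps between onsets, which is an assumption about the method.

```typescript
interface DspDigest {
  onsetTimes: number[];      // seconds from the start of the recording
  dominantFrequency: number; // Hz
  pitchName: string;         // e.g. "A3"
  amplitude: number;         // RMS, 0..1
  tempoBpm: number | null;   // null when there are fewer than two onsets
}

const NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"];

// Nearest equal-tempered note name for a frequency (MIDI 69 = A4 = 440 Hz).
function pitchName(hz: number): string {
  if (!(hz > 0)) throw new RangeError("frequency must be positive");
  const midi = Math.round(69 + 12 * Math.log2(hz / 440));
  const octave = Math.floor(midi / 12) - 1;
  return `${NOTE_NAMES[((midi % 12) + 12) % 12]}${octave}`;
}

// Tempo from the median gap between onsets (upper median for even counts).
function tempoFromOnsets(onsets: number[]): number | null {
  if (onsets.length < 2) return null;
  const gaps: number[] = [];
  for (let i = 1; i < onsets.length; i++) gaps.push(onsets[i] - onsets[i - 1]);
  gaps.sort((a, b) => a - b);
  const median = gaps[Math.floor(gaps.length / 2)];
  return median > 0 ? 60 / median : null;
}

function buildDigest(onsetTimes: number[], dominantFrequency: number, amplitude: number): DspDigest {
  return {
    onsetTimes,
    dominantFrequency,
    pitchName: pitchName(dominantFrequency),
    amplitude,
    tempoBpm: tempoFromOnsets(onsetTimes),
  };
}
```

A single sustained hum produces one onset and therefore `tempoBpm: null`, which tells the model explicitly that there is no rhythm to read.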
Gemma 4 (gemma-4-26b-a4b-it) reads both together and returns fast enough that a user with ALS or limited stamina hears their composition without waiting. That responsiveness matters. A slow model breaks the experience.
The system prompt enforces a strict JSON Lilt score every time. No freeform text. No guessing.
{
  "mood": "gentle",
  "articulation": "legato",
  "voice": "cinematic cello",
  "notes": [
    { "note": "A3", "duration": 1.2, "velocity": "soft", "time": 0.0 },
    { "note": "C4", "duration": 0.8, "velocity": "normal", "time": 1.2 }
  ],
  "explanation": "A slow exhale, barely a sound. But steady. Like resolve."
}
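
Because the model's reply is consumed by code, it helps to validate it before anything is played. The following is a hypothetical type guard for the score shape shown above. The allowed velocity values (`soft`, `normal`, `accent`) are inferred from the examples in this post, and the real app may accept more.

```typescript
type Velocity = "soft" | "normal" | "accent";

interface LiltNote {
  note: string;     // pitch name such as "A3"
  duration: number; // seconds
  velocity: Velocity;
  time: number;     // start time in seconds
}

interface LiltScore {
  mood: string;
  articulation: string;
  voice: string;
  notes: LiltNote[];
  explanation: string;
}

const VELOCITIES: readonly string[] = ["soft", "normal", "accent"];
const NOTE_RE = /^[A-G][#b]?-?\d$/;

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function isLiltNote(v: unknown): v is LiltNote {
  return (
    isRecord(v) &&
    typeof v.note === "string" && NOTE_RE.test(v.note) &&
    typeof v.duration === "number" && v.duration > 0 &&
    typeof v.velocity === "string" && VELOCITIES.indexOf(v.velocity) !== -1 &&
    typeof v.time === "number" && v.time >= 0
  );
}

// Parse the raw model reply and reject anything that is not a complete score.
function parseLiltScore(raw: string): LiltScore {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new Error("Model reply is not valid JSON");
  }
  if (!isRecord(data)) throw new Error("Score must be a JSON object");
  for (const key of ["mood", "articulation", "voice", "explanation"]) {
    if (typeof data[key] !== "string") throw new Error(`Missing string field: ${key}`);
  }
  if (!Array.isArray(data.notes) || !data.notes.every(isLiltNote)) {
    throw new Error("Score has missing or malformed notes");
  }
  return data as unknown as LiltScore;
}
```

Failing loudly here gives the app a clear point to retry the request or fall back, instead of handing a half-formed score to the synthesizer.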
Disability profiles tested
32 real DSP profiles. 7 disability categories. 3 difficulty levels.
Beginner: one event, one sound, one note
Intermediate: 2-3 events, some rhythm or pitch shift
Advanced: 4+ events, dynamics, intentional pattern
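
The event-count part of these levels is mechanical and can be written down directly. This is a small sketch with hypothetical names. It looks only at the number of detected events and ignores the qualitative criteria (rhythm, dynamics, intentional pattern), which need human or model judgment.

```typescript
type Difficulty = "beginner" | "intermediate" | "advanced";

// Map a count of detected sound events to the level it would fall under.
function difficultyFromEvents(eventCount: number): Difficulty {
  if (!Number.isInteger(eventCount) || eventCount < 1) {
    throw new RangeError("eventCount must be a positive integer");
  }
  if (eventCount === 1) return "beginner";
  if (eventCount <= 3) return "intermediate";
  return "advanced";
}
```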
NV-01 Autism - slow exhale breath (beginner)
NV-02 Autism - single sustained hum (beginner)
NV-03 Autism - two-tone hum shift (intermediate)
NV-04 Autism - melodic hum phrase (advanced)
NV-05 Apraxia - disrupted single vowel (beginner)
NV-06 Apraxia - broken phonation bursts (intermediate)
NV-07 Apraxia - vowel glide attempt (advanced)
NV-08 Selective mutism - barely audible (beginner)
NV-09 Selective mutism - nose exhale (intermediate)
PH-01 Cerebral palsy - single finger tap (beginner)
PH-02 Cerebral palsy - tremor cluster (intermediate)
PH-03 Cerebral palsy - intentional beat (advanced)
PH-04 ALS - minimal breath control (beginner)
PH-05 ALS - pulsed breath pattern (intermediate)
PH-06 Locked-in - single switch click (beginner)
PH-07 Locked-in - two-click phrase (intermediate)
PH-08 Locked-in - morse-style rhythm (advanced)
PH-09 Quadriplegia - single breath puff (beginner)
PH-10 Quadriplegia - hard/soft contrast (intermediate)
PH-11 Quadriplegia - rhythmic phrase (advanced)
PH-12 Parkinson's - tremor hum (beginner)
PH-13 Parkinson's - vocal tremor melody (advanced)
MX-01 Whistle - single clear pitch (beginner)
MX-02 Whistle - two-note call (intermediate)
MX-03 Whistle - pentatonic phrase (advanced)
MX-04 Tongue click - single event (beginner)
MX-05 Tongue click - 4/4 rhythm (intermediate)
MX-06 Tongue click - syncopated groove (advanced)
MX-07 AAC - rising hum intention (intermediate)
MX-08 AAC - call and response (advanced)
MX-09 SCI C4 - head tap (beginner)
MX-10 SCI C4 - two-tap intentional gap (intermediate)
Run them yourself: node test-runner.mjs
Stack
- Gemma 4 (gemma-4-26b-a4b-it): multimodal audio + DSP digest to Lilt JSON
- Web Audio API: mic capture, FFT/RMS DSP, synthesized playback
- React + Vite + TypeScript: frontend
- Express + @google/genai SDK: backend (API key stays server-side)
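
On the backend, sending the audio and the DSP digest together amounts to building one request with two parts. The sketch below only builds that payload as plain data, so it runs without the SDK or an API key. The `inlineData` / `text` part shape follows the Gemini API convention used by `@google/genai`. The function name, the default MIME type, and the digest fields are assumptions, not the app's actual code.

```typescript
interface InlineDataPart {
  inlineData: { mimeType: string; data: string }; // data is base64-encoded
}
interface TextPart {
  text: string;
}
type Part = InlineDataPart | TextPart;

interface Content {
  role: "user";
  parts: Part[];
}

// Build the two-part user message: raw audio first, DSP digest second.
function buildContents(
  audioBase64: string,
  digest: Record<string, unknown>,
  mimeType: string = "audio/webm" // assumed recording format
): Content[] {
  if (audioBase64.length === 0) throw new Error("No audio to send");
  return [
    {
      role: "user",
      parts: [
        { inlineData: { mimeType, data: audioBase64 } },
        { text: `DSP digest:\n${JSON.stringify(digest)}` },
      ],
    },
  ];
}
```

In an Express route, the result would be passed as `contents` to the SDK's `generateContent` call alongside the model name, with the API key read from the server environment so it never reaches the browser.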
What the Lilt format looks like
A3 ! soft @ 0.00s
C4 ! normal @ 1.20s
E4 ! accent @ 2.10s
G4 ! soft @ 3.40s
Each line is a note trigger: pitch, velocity, timestamp. The piano roll renders from this. The code is editable live. Change a velocity, move a timestamp, swap a note, hit compile. The music changes without re-recording.
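
Since every line follows the same `pitch ! velocity @ time` pattern, a parser for it fits in a few lines. This sketch is inferred from the four example lines only. The real Lilt grammar may allow more (durations, rests, comments), and the names below are hypothetical. It also includes the standard equal-temperament conversion from a note name to a frequency, which any synthesizer needs before it can play a trigger.

```typescript
type Velocity = "soft" | "normal" | "accent";

interface Trigger {
  note: string; // e.g. "A3"
  velocity: Velocity;
  time: number; // seconds
}

// Matches lines such as "A3 ! soft @ 0.00s".
const LINE_RE = /^([A-G][#b]?)(-?\d)\s*!\s*(soft|normal|accent)\s*@\s*(\d+(?:\.\d+)?)s$/;

function parseLilt(source: string): Trigger[] {
  const triggers: Trigger[] = [];
  source.split("\n").forEach((raw, index) => {
    const line = raw.trim();
    if (line === "") return; // skip blank lines
    const m = LINE_RE.exec(line);
    if (!m) throw new SyntaxError(`Lilt line ${index + 1}: cannot parse "${line}"`);
    triggers.push({ note: m[1] + m[2], velocity: m[3] as Velocity, time: Number(m[4]) });
  });
  // Edits can move timestamps around, so return triggers in playback order.
  return triggers.sort((a, b) => a.time - b.time);
}

const SEMITONES: Record<string, number> = { C: 0, D: 2, E: 4, F: 5, G: 7, A: 9, B: 11 };

// Equal-temperament frequency in Hz, with A4 = 440 Hz (MIDI note 69).
function noteToHz(note: string): number {
  const m = /^([A-G])([#b]?)(-?\d)$/.exec(note);
  if (!m) throw new SyntaxError(`Unknown note name: ${note}`);
  const accidental = m[2] === "#" ? 1 : m[2] === "b" ? -1 : 0;
  const midi = 12 * (Number(m[3]) + 1) + SEMITONES[m[1]] + accidental;
  return 440 * Math.pow(2, (midi - 69) / 12);
}
```

Re-running `parseLilt` on the edited text is all a "compile" step needs to do in this model, which is why the music can change without a new recording.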
The gem crystallizes. The music plays. You made that.
You made that with a breath.
GitHub: https://github.com/brookehoward2008-droid/Babbled-notes-v2
Live app: https://ai.studio/apps/4d235490-15ac-47a5-9599-f82aa85a2b57
by Brooke Chauntel