<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: BABBLED77</title>
    <description>The latest articles on DEV Community by BABBLED77 (@brookehoward2008droid).</description>
    <link>https://dev.to/brookehoward2008droid</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3920837%2F481e45f0-fa9d-40f5-8633-55f6bd47d839.jpeg</url>
      <title>DEV Community: BABBLED77</title>
      <link>https://dev.to/brookehoward2008droid</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/brookehoward2008droid"/>
    <language>en</language>
    <item>
      <title>Building an Accessibility Agent with Hermes Agent: Sound to Music for People Who Couldn't Before</title>
      <dc:creator>BABBLED77</dc:creator>
      <pubDate>Sun, 24 May 2026 22:17:44 +0000</pubDate>
      <link>https://dev.to/brookehoward2008droid/building-an-accessibility-agent-with-hermes-agent-sound-to-music-for-people-who-couldnt-before-382l</link>
      <guid>https://dev.to/brookehoward2008droid/building-an-accessibility-agent-with-hermes-agent-sound-to-music-for-people-who-couldnt-before-382l</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;💎 &lt;em&gt;One sound. Any sound. The agent hears it. Music appears.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h1&gt;
  
  
  𝕓𝕒𝕓𝕓𝕝𝕖𝕕 𝕟𝕠𝕥𝕖𝕤
&lt;/h1&gt;

&lt;h2&gt;
  
  
  ✦ An Accessibility Agent Built on Hermes Agent Principles ✦
&lt;/h2&gt;

&lt;p&gt;This post explores how &lt;strong&gt;Hermes Agent&lt;/strong&gt; -- Nous Research's open-source autonomous agent platform -- maps perfectly to a real-world accessibility problem: giving people who cannot use traditional music tools a way to make music with any sound their body can produce.&lt;/p&gt;

&lt;p&gt;The result is &lt;strong&gt;babbled notes v2&lt;/strong&gt;: a Gemma 4-powered agent that turns a hum, a breath, a tap, or a tongue click into a real musical composition.&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/brookehoward2008-droid/Babbled-notes-v2" rel="noopener noreferrer"&gt;https://github.com/brookehoward2008-droid/Babbled-notes-v2&lt;/a&gt;&lt;br&gt;
📐 &lt;strong&gt;Agent docs:&lt;/strong&gt; &lt;a href="https://github.com/brookehoward2008-droid/Babbled-notes-v2/blob/main/HERMES.md" rel="noopener noreferrer"&gt;https://github.com/brookehoward2008-droid/Babbled-notes-v2/blob/main/HERMES.md&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  ◈ What Is Hermes Agent
&lt;/h2&gt;

&lt;p&gt;Hermes Agent is an autonomous system by Nous Research -- not a coding copilot tethered to an IDE, not a chatbot wrapper around a single API. It is a server-side agent with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;✦  Multi-platform reach    Telegram, Discord, Slack, WhatsApp, Signal, Email, CLI
✦  Persistent memory       Learns from past work, reapplies solutions automatically
✦  Scheduled automations   Natural language cron: "send me a briefing every morning"
✦  Subagent delegation     Parallel agents with isolated contexts, no context bleed
✦  Five sandbox backends   Local, Docker, SSH, Singularity, Modal
✦  Web capabilities        Search, browser automation, vision, image gen, TTS
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install with one command, then run the setup step:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What makes Hermes Agent different from a chatbot: it operates independently on your server, can schedule work, delegate subtasks to subagents, and persist knowledge across sessions. That is a real agent architecture -- not a prompt-response loop.&lt;/p&gt;




&lt;h2&gt;
  
  
  💎 Why Hermes Agent and Accessibility Belong Together
&lt;/h2&gt;

&lt;p&gt;Most music tools require two hands, ten fingers, perfect pitch, or years of training.&lt;/p&gt;

&lt;p&gt;That shuts out a huge part of the world. People who are non-verbal. People with ALS, cerebral palsy, locked-in syndrome, quadriplegia, Parkinson's. People who have always heard music inside them and had no way to get it out.&lt;/p&gt;

&lt;p&gt;Hermes Agent's architecture is exactly what an accessibility tool needs:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Hermes Agent capability&lt;/th&gt;
&lt;th&gt;Accessibility use case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-platform&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;User triggers music generation from Telegram with a voice message -- no keyboard needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Persistent memory&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Agent remembers "this user has Parkinson's -- treat tremor as vibrato, always use cello"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scheduled automations&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;"Generate a new composition every morning at 7am" -- ambient music therapy, automated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Subagent delegation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;One subagent handles DSP analysis; another handles Gemma 4 reasoning -- no context bleed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Web + browser&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Agent could automatically post generated Lilt scores to a shared Notion page or email&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A user with locked-in syndrome can send a single message to Hermes Agent via Telegram. Hermes Agent delegates to the babbled notes subagent. A composition comes back to their phone. No laptop required. No mouse. No keyboard.&lt;/p&gt;




&lt;h2&gt;
  
  
  ◈ The Agent Loop: How babbled notes Maps to Hermes Agent
&lt;/h2&gt;

&lt;p&gt;Hermes Agent's architecture -- perceive, reason, act, remember -- is exactly the loop babbled notes runs on every sound.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────────┐
│                    HERMES AGENT ORCHESTRATION                   │
│                                                                 │
│  User message (Telegram / CLI)                                  │
│       "Turn my hum into music"                                  │
│              |                                                  │
│              v                                                  │
│  ┌───────────────────────────┐                                  │
│  │   babbled notes subagent  │                                  │
│  │                           │                                  │
│  │  PERCEIVE                 │                                  │
│  │  Web Audio API            │                                  │
│  │  FFT + onset detection    │                                  │
│  │  -&amp;gt; DspDigest             │                                  │
│  │          |                │                                  │
│  │  REASON                   │                                  │
│  │  Gemma 4 reads audio      │                                  │
│  │  + DspDigest              │                                  │
│  │  -&amp;gt; Lilt score            │                                  │
│  │          |                │                                  │
│  │  ACT                      │                                  │
│  │  Synthesizer plays music  │                                  │
│  └───────────────────────────┘                                  │
│              |                                                  │
│  Hermes Agent delivers result to user                           │
│  + stores preference in persistent memory                       │
└─────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
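
&lt;p&gt;The loop in the diagram can be sketched in a few lines of Python. This is an illustration only, not code from the repository: &lt;code&gt;run_agent_loop&lt;/code&gt;, the three stage callables, and the &lt;code&gt;memory&lt;/code&gt; dict are all hypothetical names standing in for the Web Audio analysis, the Gemma 4 call, and the synthesizer.&lt;/p&gt;

```python
def run_agent_loop(audio, perceive, reason, act, memory):
    """One pass of perceive -> reason -> act -> remember.

    All names here are hypothetical stand-ins, not the babbled notes API.
    """
    digest = perceive(audio)                      # PERCEIVE: audio -> DspDigest-like dict
    score = reason(digest, memory.get("prefs"))   # REASON: digest + stored preference -> score
    result = act(score)                           # ACT: play or render the score
    memory["last_score"] = score                  # REMEMBER: keep the outcome for next time
    return result


# Stub stages so the sketch runs on its own.
def fake_perceive(audio):
    return {"duration": len(audio) / 10, "events": len(audio)}


def fake_reason(digest, prefs):
    return {"voice": prefs or "piano", "notes": digest["events"]}


def fake_act(score):
    return f"played {score['notes']} notes on {score['voice']}"


memory = {"prefs": "cinematic cello"}
print(run_agent_loop([0.1, 0.2, 0.1], fake_perceive, fake_reason, fake_act, memory))
```

&lt;p&gt;Keeping each stage behind a plain callable mirrors the isolation shown in the diagram: the reasoning step only ever sees the digest and the stored preference, never the raw stage internals.&lt;/p&gt;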






&lt;h2&gt;
  
  
  💎 Integrating babbled notes With Hermes Agent
&lt;/h2&gt;

&lt;p&gt;Here is how to wire babbled notes into Hermes Agent as a callable subagent skill.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Run babbled notes
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/brookehoward2008-droid/Babbled-notes-v2.git
&lt;span class="nb"&gt;cd &lt;/span&gt;Babbled-notes-v2
npm &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;span class="c"&gt;# add GEMINI_API_KEY to .env.local&lt;/span&gt;
npm run dev
&lt;span class="c"&gt;# server running at http://localhost:3000&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Install Hermes Agent
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Create a babbled notes skill
&lt;/h3&gt;

&lt;p&gt;Save this as &lt;code&gt;~/.hermes/skills/babbled_notes.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
Babbled Notes skill for Hermes Agent
Converts a DSP sound description into a musical Lilt score via Gemma 4.
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_music&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;pitch_hz&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;duration_s&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;amplitude&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;user_prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Ask babbled notes to compose music from a sound description.
    pitch_hz: dominant frequency in Hz (e.g. 220 for A3)
    duration_s: how long the sound lasted
    amplitude: loudness 0.0-1.0 (0.1 = soft breath, 0.9 = loud tap)
    user_prompt: optional intent hint (&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;make it a cello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;slow and gentle&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;)
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;
    &lt;span class="c1"&gt;# build note name from Hz
&lt;/span&gt;    &lt;span class="n"&gt;names&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;C&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;C#&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;D&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;D#&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;E&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;F&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;F#&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;G&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;G#&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A#&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;B&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;midi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pitch_hz&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;440&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;69&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;midi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;127&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;midi&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;pitch_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;names&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;midi&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}{&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;midi&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="n"&gt;digest&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;duration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;duration_s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;averageEnergy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;amplitude&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;peakOnsetCount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;events&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;time&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;frequency&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;pitch_hz&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
             &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pitchName&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;pitch_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;amplitude&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;amplitude&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:3000/api/interpret&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dspDigest&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;digest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;userPrompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user_prompt&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;90&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_music_from_profile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;profile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Generate music for a known disability profile.
    profile: one of &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;breath&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hum&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tremor&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tap&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;click&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;puff&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;whistle&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;profiles&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;breath&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;180&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="mf"&gt;2.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.03&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;minimal breath, ambient drone&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hum&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;220&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="mf"&gt;3.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.11&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gentle sustained hum, cello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tremor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;196&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.08&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tremor hum, treat as vibrato&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tap&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;440&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;single finger tap, percussive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;click&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;800&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tongue click, sharp and short&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;puff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="mf"&gt;1.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;breath puff, soft and round&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;whistle&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1047&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;single whistle note, clear pitch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;hz&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dur&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;profiles&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;profile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;profiles&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hum&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;generate_music&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hz&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dur&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
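
&lt;p&gt;As a sanity check on the Hz-to-note step inside &lt;code&gt;generate_music&lt;/code&gt;, the same formula can be pulled out into a standalone helper. &lt;code&gt;hz_to_note_name&lt;/code&gt; is a hypothetical name used only for this sketch; the arithmetic mirrors the skill above (MIDI number = 12 * log2(f / 440) + 69, with 20 Hz as the floor).&lt;/p&gt;

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def hz_to_note_name(pitch_hz):
    """Map a frequency in Hz to the nearest equal-tempered note name."""
    # Clamp to 20 Hz so log2 never sees zero or a negative value.
    midi = round(12 * math.log2(max(pitch_hz, 20) / 440) + 69)
    midi = max(0, min(127, midi))   # stay inside the MIDI range
    octave = midi // 12 - 1         # MIDI 60 is C4
    return f"{NOTE_NAMES[midi % 12]}{octave}"


for hz in (196, 220, 440, 1047):
    print(hz, hz_to_note_name(hz))
```

&lt;p&gt;These four inputs are the pitches of the tremor, hum, tap, and whistle profiles; they come out as G3, A3, A4, and C6.&lt;/p&gt;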



&lt;h3&gt;
  
  
  Step 4: Use it via Hermes Agent
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;you: Generate music from a hum at A3
hermes: [calls generate_music(220, 3.0, 0.11, "gentle hum")]

Gemma 4 returned:
  mood: pensive
  voice: cinematic cello
  notes: A3 soft @ 0.00s / C4 normal @ 1.20s / A2 soft @ 0.00s (drone)
  explanation: A sustained A natural, barely above a whisper.
               The cello holds it, lets it breathe.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: Add persistent memory
&lt;/h3&gt;

&lt;p&gt;Tell Hermes Agent your preference once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;you: Remember that I have Parkinson's -- always treat tremor as vibrato,
     always use cinematic cello voice
hermes: Noted. I'll apply that to all future music generation for you.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hermes Agent's persistent memory stores this, and the agent can carry it into every future &lt;code&gt;generate_music&lt;/code&gt; call (for example through &lt;code&gt;user_prompt&lt;/code&gt;) -- no re-explaining needed.&lt;/p&gt;
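
&lt;p&gt;This post does not show how Hermes Agent passes remembered preferences into a skill call, so the following is only one plausible shape, with hypothetical names throughout: the stored preferences are appended to the &lt;code&gt;user_prompt&lt;/code&gt; argument that &lt;code&gt;generate_music&lt;/code&gt; already accepts.&lt;/p&gt;

```python
def build_user_prompt(request_hint, remembered_prefs):
    """Combine the current request with stored preferences into one prompt.

    Hypothetical helper: remembered_prefs is a plain list of strings that
    stands in for whatever the agent's memory returns.
    """
    candidates = [request_hint, *remembered_prefs]
    parts = [p.strip() for p in candidates if p and p.strip()]
    return "; ".join(parts)


prefs = ["treat tremor as vibrato", "always use cinematic cello voice"]
print(build_user_prompt("slow and gentle", prefs))
```

&lt;p&gt;The combined string would then be handed to the skill as &lt;code&gt;user_prompt&lt;/code&gt;, so the preference travels with every request without the user repeating it.&lt;/p&gt;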


&lt;h2&gt;
  
  
  ◈ The Perception Layer in Detail
&lt;/h2&gt;

&lt;p&gt;The perception layer is the first stage of the babbled notes subagent: it turns raw microphone input into the DspDigest that the reasoning step consumes.&lt;/p&gt;

&lt;p&gt;Web Audio API runs in real time during recording:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Microphone  -&amp;gt;  AnalyserNode (FFT 256 bins)  -&amp;gt;  peak bin -&amp;gt; Hz -&amp;gt; note name
                ScriptProcessor              -&amp;gt;  RMS amplitude
                Onset detector               -&amp;gt;  timestamps when sound starts
                                             -&amp;gt;  DspDigest (JSON)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
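
&lt;p&gt;The two numeric steps in the diagram, peak bin to Hz and RMS amplitude, are short enough to sketch. This is written in Python for consistency with the skill above and is not the browser code from the repository; the function names are hypothetical, and the 48000 Hz sample rate and FFT size of 512 (which yields 256 frequency bins) are assumptions made for the example.&lt;/p&gt;

```python
import math


def peak_bin_to_hz(magnitudes, sample_rate, fft_size):
    """Return the centre frequency of the loudest FFT bin."""
    peak = max(range(len(magnitudes)), key=lambda i: magnitudes[i])
    # Each bin spans sample_rate / fft_size Hz.
    return peak * sample_rate / fft_size


def rms(samples):
    """Root-mean-square amplitude of a block of samples in the range -1..1."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Assumed values: 48 kHz audio, fft_size 512 (256 usable bins).
spectrum = [0.0] * 256
spectrum[2] = 0.9                              # loudest energy sits in bin 2
print(peak_bin_to_hz(spectrum, 48000, 512))    # 187.5
print(rms([0.5, -0.5, 0.5, -0.5]))             # 0.5
```

&lt;p&gt;Under these assumed settings each bin is 93.75 Hz wide, so nearby pitches such as G3 (196 Hz) and A3 (220 Hz) fall into the same bin. A larger FFT size would give finer pitch resolution at the cost of more samples per frame.&lt;/p&gt;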



&lt;p&gt;The onset detection threshold is set deliberately low (0.01 RMS) to catch breath inputs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rms&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;lastOnset&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;events&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;time&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;elapsed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;frequency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;freq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="na"&gt;pitchName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;note&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;amplitude&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;rms&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A breath at 0.02 RMS in a quiet room barely registers. The threshold sits at 0.01 because it needs to catch sounds that are 5x quieter than a normal speaking voice. The second condition is a 0.1-second gap between onsets, so one sustained sound is not logged as many events.&lt;/p&gt;
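
&lt;p&gt;The onset rule can also be tried outside the browser. This Python version is an illustration under assumptions, not the repository's code: &lt;code&gt;detect_onsets&lt;/code&gt; and its parameters are hypothetical names, frames are (time, RMS) pairs, and the first sound is always allowed through.&lt;/p&gt;

```python
def detect_onsets(frames, threshold, min_gap_s=0.1):
    """Return the times at which a new sound starts.

    frames: iterable of (elapsed_seconds, rms) pairs, in time order.
    threshold: RMS level a frame must exceed to count as sound.
    min_gap_s: minimum spacing between onsets, to avoid double triggers.
    """
    onsets = []
    last_onset = None
    for elapsed, level in frames:
        loud_enough = level > threshold
        spaced_out = last_onset is None or elapsed - last_onset > min_gap_s
        if loud_enough and spaced_out:
            onsets.append(elapsed)
            last_onset = elapsed
    return onsets


# A quiet breath (0.02 RMS) followed by a tap (0.45 RMS).
frames = [(0.00, 0.02), (0.05, 0.02), (0.50, 0.45), (0.55, 0.40)]
print(detect_onsets(frames, threshold=0.01))   # [0.0, 0.5]
print(detect_onsets(frames, threshold=0.1))    # [0.5]
```

&lt;p&gt;With the lower threshold the breath is kept as its own event; with the higher one it is dropped and only the tap survives. In both runs the frame 0.05 s after an onset is suppressed by the minimum gap.&lt;/p&gt;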




&lt;h2&gt;
  
  
  💎 The Reasoning Layer: Gemma 4
&lt;/h2&gt;

&lt;p&gt;Gemma 4 (&lt;code&gt;gemma-4-26b-a4b-it&lt;/code&gt;) is the reasoning engine. It receives both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Raw audio&lt;/strong&gt; (base64 WebM) -- texture, tremor quality, breath shape&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DspDigest&lt;/strong&gt; (JSON) -- precise onset timing, Hz, amplitude&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And returns a complete Lilt score -- a musical composition in structured JSON:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mood"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pensive"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"articulation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"legato"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"voice"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cinematic cello"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"notes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"note"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"velocity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"soft"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="nl"&gt;"time"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"note"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"C4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"velocity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"normal"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"time"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"note"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"velocity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"soft"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="nl"&gt;"time"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"voice"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"synthesizer ambient"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"explanation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A breath, barely a sound. Steady. Like resolve."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The reasoning follows the Lilt Contract -- guidelines Gemma 4 interprets, not hardcoded rules:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;◈  Slow, soft, hummed  -&amp;gt;  pensive/gentle + cello/piano + legato
◈  Sharp, rhythmic     -&amp;gt;  energetic/tight + marimba/drums + staccato
◈  Always harmonious pitches: C major, A minor, pentatonic
◈  Always include a synthesizer ambient drone layer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  ◈ 32 Profiles Tested Across 7 Disability Categories
&lt;/h2&gt;

&lt;p&gt;The agent was validated against 32 live Gemma 4 responses -- no simulated data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Result: 32 / 32 passed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Tests&lt;/th&gt;
&lt;th&gt;Notes per score&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Non-verbal autism (NV)&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;3-6 notes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Physical disabilities (PH)&lt;/td&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;3-7 notes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mixed / cross-profile (MX)&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;3-7 notes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;node test-runner.mjs   &lt;span class="c"&gt;# run all 32 yourself&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Full results in &lt;code&gt;test-results.json&lt;/code&gt; -- 1,347 lines of live Gemma 4 output.&lt;/p&gt;




&lt;h2&gt;
  
  
  ◈ The NeuralGem: Agent State Without Words
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;◇  IDLE        breathing silver ring. the agent is waiting.
◈  RECORDING   crystallizing polygon, purple to cyan.
               Hermes subagent is running perception layer.
⬡  PROCESSING  hexagon forming. Gemma 4 is reasoning.
⬡  LOCKED      hexagon, facets lit in the mood color.
               The agent has decided. Music is loading.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No text labels. Shape and color carry all the state. For users who cannot read, or who have cognitive differences: the gem is the interface.&lt;/p&gt;




&lt;h2&gt;
  
  
  💎 What Hermes Agent Makes Possible Next
&lt;/h2&gt;

&lt;p&gt;With Hermes Agent as the orchestration layer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;◈  Telegram trigger       User sends voice note to Hermes Agent bot
                          Hermes transcribes audio -&amp;gt; babbled notes API -&amp;gt; Lilt score sent back

◈  Persistent memory      Agent knows: "this user uses breath puffs, always cello,
                          always soft dynamics" -- applied every session without re-explaining

◈  Scheduled music        "Every morning at 7am, generate a new ambient piece
                          from my baseline breath profile" -- Hermes cron triggers babbled notes

◈  Subagent pipeline      Agent 1: DSP analysis on uploaded audio file
                          Agent 2: Gemma 4 reasoning with profile context
                          Agent 3: Delivery to user's preferred channel

◈  Multi-platform         Same music generation accessible from phone, desktop,
                          Slack workspace, or Discord server -- wherever the user is
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  ◈ Stack
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────────────────────────────────────┐
│  ORCHESTRATION  Hermes Agent (Nous Research)         │  ← autonomous agent platform
│  REASONING      Gemma 4  gemma-4-26b-a4b-it          │  ← the agent's brain
│  PERCEPTION     Web Audio API (FFT, RMS, onset)      │  ← the agent's ears
│  ACTION         Web Audio API (synthesis)            │  ← the agent's voice
│  FRONTEND       React + Vite + TypeScript            │
│  BACKEND        Express + @google/genai SDK          │  ← API key stays here
└──────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;blockquote&gt;
&lt;p&gt;💎 &lt;em&gt;The gem crystallizes.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Hermes delegates. Gemma 4 decides.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;The music plays.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;You made that. You made that with a breath.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/brookehoward2008-droid/Babbled-notes-v2" rel="noopener noreferrer"&gt;https://github.com/brookehoward2008-droid/Babbled-notes-v2&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Agent architecture:&lt;/strong&gt; &lt;a href="https://github.com/brookehoward2008-droid/Babbled-notes-v2/blob/main/HERMES.md" rel="noopener noreferrer"&gt;https://github.com/brookehoward2008-droid/Babbled-notes-v2/blob/main/HERMES.md&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;by Brooke Chauntel&lt;/em&gt;&lt;/p&gt;

</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>agents</category>
    </item>
    <item>
      <title>babbled notes: a sound-to-music agent for people who could not make music before</title>
      <dc:creator>BABBLED77</dc:creator>
      <pubDate>Sun, 24 May 2026 21:47:57 +0000</pubDate>
      <link>https://dev.to/brookehoward2008droid/babbled-notes-a-sound-to-music-agent-for-people-who-could-not-make-music-before-c0p</link>
      <guid>https://dev.to/brookehoward2008droid/babbled-notes-a-sound-to-music-agent-for-people-who-could-not-make-music-before-c0p</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;💎 &lt;em&gt;You make a sound. Any sound. The agent hears it. Music comes back.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h1&gt;
  
  
  𝕓𝕒𝕓𝕓𝕝𝕖𝕕 𝕟𝕠𝕥𝕖𝕤
&lt;/h1&gt;

&lt;p&gt;Hum into a microphone. Tap your desk. Exhale slowly. Click your tongue. Whistle once.&lt;/p&gt;

&lt;p&gt;A Gemma 4 agent reads what you made, decides what music lives inside it, and plays it back as piano, cello, marimba, or drums.&lt;/p&gt;

&lt;p&gt;You chose nothing. The agent chose everything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Built for people who have never been able to make music before&lt;/strong&gt; -- people who are non-verbal, people with ALS, cerebral palsy, locked-in syndrome, quadriplegia, Parkinson's. People who have always heard music inside them and had no way to get it out.&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/brookehoward2008-droid/Babbled-notes-v2" rel="noopener noreferrer"&gt;https://github.com/brookehoward2008-droid/Babbled-notes-v2&lt;/a&gt;&lt;br&gt;
🎵 &lt;strong&gt;Agent architecture:&lt;/strong&gt; &lt;a href="https://github.com/brookehoward2008-droid/Babbled-notes-v2/blob/main/HERMES.md" rel="noopener noreferrer"&gt;HERMES.md&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  ◈ Why this is an agent, not a tool
&lt;/h2&gt;

&lt;p&gt;A tool does what you tell it. You configure it. You choose the settings. You push the button.&lt;/p&gt;

&lt;p&gt;An agent perceives its environment, reasons about what it observes, and takes action on its own judgment.&lt;/p&gt;

&lt;p&gt;babbled notes runs a full agent loop on every sound:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Perceive&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Web Audio API reads the mic: FFT pitch analysis, RMS amplitude, onset detection. Outputs a structured DspDigest.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reason&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Gemma 4 (&lt;code&gt;gemma-4-26b-a4b-it&lt;/code&gt;) receives the raw audio AND the DspDigest. Decides mood, instrument voice, articulation, and note timing.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Act&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Web Audio API synthesizer plays the composition. Real instruments. Real time.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reflect&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;User edits the Lilt score. Agent re-renders without re-recording.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The user never chooses a key, a tempo, a voice, or a mood. The agent reads the sound and decides all of it.&lt;/p&gt;


&lt;h2&gt;
  
  
  💎 The NeuralGem
&lt;/h2&gt;

&lt;p&gt;The agent communicates its state through the NeuralGem -- a canvas visualizer with no text labels:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;IDLE       -&amp;gt;  breathing silver ring. waiting for input.

RECORDING  -&amp;gt;  crystallizing polygon. sides grow as your audio level rises.
              color shifts purple to cyan as the sound builds.

PROCESSING -&amp;gt;  hexagon forming. the agent is reading your sound.

LOCKED     -&amp;gt;  hexagon. facets lit in the mood color the agent chose.
              the agent has heard you. music is loading.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For users who are non-verbal, have cognitive differences, or who cannot read: shape and color carry all the information. No labels to parse. No configuration panel to navigate. Tap once to start. Tap once to stop.&lt;/p&gt;
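
&lt;p&gt;The four gem states can be read as a small state machine. The TypeScript sketch below is an illustration only: the names &lt;code&gt;GemState&lt;/code&gt;, &lt;code&gt;GemEvent&lt;/code&gt;, &lt;code&gt;TRANSITIONS&lt;/code&gt; and &lt;code&gt;nextState&lt;/code&gt; are hypothetical, and the return from LOCKED to IDLE once playback ends is an assumption, not something taken from the project's source.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hypothetical names; the states mirror the four gem shapes described above.
type GemState = "idle" | "recording" | "processing" | "locked";
type GemEvent = "tap" | "scoreReceived" | "playbackEnded";

// Allowed transitions. Any event not listed for a state is ignored.
const TRANSITIONS: { [S in GemState]: { [E in GemEvent]?: GemState } } = {
  idle: { tap: "recording" },              // tap once to start
  recording: { tap: "processing" },        // tap once to stop
  processing: { scoreReceived: "locked" }, // the model has returned a score
  locked: { playbackEnded: "idle" },       // assumption: reset after playback
};

function nextState(state: GemState, event: GemEvent): GemState {
  const target = TRANSITIONS[state][event];
  return target === undefined ? state : target;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Keeping the transitions in one table means a stray tap during PROCESSING or LOCKED simply leaves the state unchanged.&lt;/p&gt;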




&lt;h2&gt;
  
  
  ◈ How the agent reasons
&lt;/h2&gt;

&lt;p&gt;The agent sends two things to Gemma 4 simultaneously:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Raw audio&lt;/strong&gt; (base64 WebM)&lt;br&gt;
The actual sound. Gemma 4 can hear the texture -- a tremor in a hum, the scrape of a breath, the sharp crack of a tongue click. These textures do not survive FFT analysis. They live in the audio.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. DspDigest&lt;/strong&gt; (structured JSON)&lt;br&gt;
What the perception layer already calculated precisely:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;3.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"averageEnergy"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.11&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"peakOnsetCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"events"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"time"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;"frequency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;220&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"pitchName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"amplitude"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.11&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"time"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;"frequency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;261&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"pitchName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"C4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"amplitude"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.13&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two onsets. A3 moving to C4. 1.6 seconds apart. Average energy 0.11 -- a soft sound.&lt;/p&gt;
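
&lt;p&gt;The &lt;code&gt;pitchName&lt;/code&gt; values in the digest follow from the frequencies. A minimal TypeScript sketch of that mapping, assuming standard equal temperament with A4 = 440 Hz; the function name &lt;code&gt;frequencyToPitchName&lt;/code&gt; is hypothetical and the project's own perception layer may compute this differently:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;const NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"];

// Convert a frequency in Hz to the nearest equal-tempered pitch name.
// Assumes A4 = 440 Hz, which is MIDI note 69.
function frequencyToPitchName(hz: number): string {
  // 12 semitones per octave, measured relative to A4.
  const midi = Math.round(69 + 12 * Math.log2(hz / 440));
  const name = NOTE_NAMES[((midi % 12) + 12) % 12];
  const octave = Math.floor(midi / 12) - 1; // MIDI 60 is C4
  return name + octave;
}

console.log(frequencyToPitchName(220)); // "A3"
console.log(frequencyToPitchName(261)); // "C4"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
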

&lt;p&gt;Gemma 4 reads both and decides: this is a sustained hum that rose in pitch. Mood: pensive. Voice: cinematic cello. Articulation: legato. Two melody notes, one drone pad underneath. Timestamps aligned to the 1.6-second interval in the digest.&lt;/p&gt;

&lt;p&gt;The agent's output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mood"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pensive"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"articulation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"legato"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"voice"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cinematic cello"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"liltCode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A3 ! soft @ 0.00s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;C4 ! normal @ 1.60s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"notes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"note"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"velocity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"soft"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="nl"&gt;"time"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"note"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"C4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"velocity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"normal"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"time"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.6&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"note"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;3.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"velocity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"soft"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="nl"&gt;"time"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"voice"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"synthesizer ambient"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"explanation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A rising hum -- two tones, a minor third apart. The cello holds the first note soft, lifts into the second. The drone underneath gives it weight."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent turned a 3.2-second hum into a composition with a two-note melody and an ambient drone underneath. The user made one sound. The agent made the music.&lt;/p&gt;




&lt;h2&gt;
  
  
  ◈ The Lilt Contract
&lt;/h2&gt;

&lt;p&gt;The agent's reasoning follows a set of guidelines built into the system prompt. These are not hardcoded rules -- Gemma 4 interprets them against what it actually heard:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Slow, soft, or hummed sounds:
  mood = "gentle" or "pensive"
  voice = "cinematic cello" or "grand piano"
  articulation = "legato"

Sharp, rhythmic, or tapped sounds:
  mood = "energetic" or "tight"
  voice = "marimba" or "drum kit"
  articulation = "staccato"

Always keep pitches harmonious (C major, A minor, or pentatonic).
Timestamps must align with DSP onsets but feel musically polished.
Always include a drone layer using "synthesizer ambient" voice.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
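
&lt;p&gt;Most of these guidelines call for judgment, but the last one can be checked mechanically. A minimal TypeScript sketch, assuming the score has the JSON shape shown in the agent's output earlier; the &lt;code&gt;ScoreNote&lt;/code&gt; and &lt;code&gt;LiltScore&lt;/code&gt; types and the &lt;code&gt;hasDroneLayer&lt;/code&gt; helper are hypothetical names, not part of the project:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hypothetical types mirroring the JSON score shown earlier in this post.
interface ScoreNote {
  note: string;
  duration: number;
  velocity: string;
  time: number;
  voice?: string; // per-note override of the score's main voice
}

interface LiltScore {
  mood: string;
  articulation: string;
  voice: string;
  notes: ScoreNote[];
  explanation: string;
}

// True when at least one note uses the "synthesizer ambient" voice.
function hasDroneLayer(score: LiltScore): boolean {
  return score.notes.some(function (n) {
    return n.voice === "synthesizer ambient";
  });
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A check like this could run on the server before a score reaches the synthesizer, so a response that skips the drone is caught early.&lt;/p&gt;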



&lt;p&gt;A tremor-affected tap does not fit cleanly into either category, and the agent tends to read it as closer to a soft sound than a sharp one. In the same way, a Parkinson's tremor in a hum becomes vibrato in the cello voice. A morse-style rhythm gets staccato articulation, but the agent may still choose "grand piano" if the pattern feels musical rather than percussive.&lt;/p&gt;

&lt;p&gt;The agent makes judgment calls. That is the point.&lt;/p&gt;




&lt;h2&gt;
  
  
  ◈ The Lilt format
&lt;/h2&gt;

&lt;p&gt;The agent outputs in Lilt -- a flat timestamp-based musical notation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A3 ! soft   @ 0.00s
C4 ! normal @ 1.60s
E4 ! accent @ 2.80s
A2 ! soft   @ 0.00s   [synthesizer ambient]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each line: pitch, velocity flag, timestamp, optional voice override.&lt;/p&gt;
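
&lt;p&gt;A minimal TypeScript sketch of how lines in this layout could be parsed into note objects. It assumes exactly the syntax shown above (pitch, &lt;code&gt;!&lt;/code&gt;, velocity flag, &lt;code&gt;@&lt;/code&gt;, seconds, optional voice in square brackets). The &lt;code&gt;LiltNote&lt;/code&gt; type and the &lt;code&gt;parseLiltLine&lt;/code&gt; and &lt;code&gt;parseLilt&lt;/code&gt; functions are hypothetical names, not the project's actual parser:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hypothetical shape for one parsed Lilt line.
interface LiltNote {
  note: string;     // pitch name, e.g. "A3"
  velocity: string; // velocity flag, e.g. "soft", "normal", "accent"
  time: number;     // onset in seconds
  voice?: string;   // optional voice override from [brackets]
}

// Matches: PITCH ! VELOCITY @ SECONDSs   [optional voice]
const LILT_LINE = /^([A-G][#b]?\d)\s*!\s*(\w+)\s*@\s*(\d+(?:\.\d+)?)s(?:\s*\[([^\]]+)\])?$/;

// Parse one line; returns null when the line is not valid Lilt.
function parseLiltLine(line: string): LiltNote | null {
  const match = LILT_LINE.exec(line.trim());
  if (match === null) return null;
  const parsed: LiltNote = {
    note: match[1],
    velocity: match[2],
    time: Number(match[3]),
  };
  if (match[4] !== undefined) parsed.voice = match[4].trim();
  return parsed;
}

// Parse a whole Lilt document, skipping lines that do not match.
function parseLilt(code: string): LiltNote[] {
  const notes: LiltNote[] = [];
  for (const line of code.split("\n")) {
    const parsed = parseLiltLine(line);
    if (parsed !== null) notes.push(parsed);
  }
  return notes;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Skipping invalid lines instead of throwing is one possible design choice for a live editor, where the text is often half-typed.&lt;/p&gt;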

&lt;p&gt;The piano roll renders from this. The code is editable live. Change a velocity, shift a timestamp, swap a pitch, add a note. The synthesizer re-renders immediately. No new recording. No new API call.&lt;/p&gt;

&lt;p&gt;This is the feedback loop. The agent interprets. The user adjusts. The agent re-renders.&lt;/p&gt;




&lt;h2&gt;
  
  
  💎 Who the agent serves
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Profile&lt;/th&gt;
&lt;th&gt;What they give&lt;/th&gt;
&lt;th&gt;What the agent produces&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;💜 Non-verbal autism&lt;/td&gt;
&lt;td&gt;Sustained hum, single tone&lt;/td&gt;
&lt;td&gt;Cello or piano melody in that pitch&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;💙 Cerebral palsy&lt;/td&gt;
&lt;td&gt;Tremor-affected taps&lt;/td&gt;
&lt;td&gt;Percussive or piano rhythm&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🤍 ALS&lt;/td&gt;
&lt;td&gt;Minimal breath control&lt;/td&gt;
&lt;td&gt;Ambient drone with gentle melody over it&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;💛 Locked-in syndrome&lt;/td&gt;
&lt;td&gt;Single eye-blink switch click&lt;/td&gt;
&lt;td&gt;One-trigger composition, loops&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;💚 Quadriplegia&lt;/td&gt;
&lt;td&gt;Hard puff / soft puff contrast&lt;/td&gt;
&lt;td&gt;Two-dynamic melody: accent and soft&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🧡 Parkinson's&lt;/td&gt;
&lt;td&gt;Tremor vocal hum&lt;/td&gt;
&lt;td&gt;Cello composition that treats tremor as vibrato&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🩷 Apraxia of speech&lt;/td&gt;
&lt;td&gt;Broken phonation bursts&lt;/td&gt;
&lt;td&gt;Legato phrase bridging the silence between bursts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;💎 AAC / pre-verbal&lt;/td&gt;
&lt;td&gt;Rising or falling hum&lt;/td&gt;
&lt;td&gt;Interval-based melodic response&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔵 Spinal cord injury C4&lt;/td&gt;
&lt;td&gt;Head tap on mic&lt;/td&gt;
&lt;td&gt;Beat-based composition from impact events&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⚪ Selective mutism&lt;/td&gt;
&lt;td&gt;Barely audible breath&lt;/td&gt;
&lt;td&gt;Gentle drone that validates the smallest input&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The agent does not have a "minimum input" requirement. A breath at 0.02 RMS amplitude -- almost nothing -- produces a composition. This was a deliberate design decision. The quietest input a person can give must be enough.&lt;/p&gt;




&lt;h2&gt;
  
  
  ◈ 32 profiles tested
&lt;/h2&gt;

&lt;p&gt;The agent was validated against 32 real DSP profiles representing the disability communities it was built for.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Three difficulty levels:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Beginner     -- one event, one sound. proves the agent handles the minimum.
Intermediate -- 2-3 events, some rhythm or pitch shift.
Advanced     -- 4+ events, dynamics, intentional pattern.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Results across all 32 profiles:&lt;/strong&gt; 32 passed. 0 failed.&lt;/p&gt;

&lt;p&gt;Every result is a live Gemma 4 response -- no simulated data, no hardcoded fallback. The test suite fires real DSP payloads at the running Express server and logs every decision the agent made.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;node test-runner.mjs   &lt;span class="c"&gt;# run all 32 profiles yourself&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Full results in &lt;code&gt;test-results.json&lt;/code&gt; on GitHub.&lt;/p&gt;




&lt;h2&gt;
  
  
  ◈ Technical stack
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Gemma 4 (gemma-4-26b-a4b-it)   reasoning engine
Web Audio API                   perception layer + action layer (synthesis)
React + Vite + TypeScript       frontend / state machine
Express + @google/genai SDK     backend agent server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The API key stays server-side. The browser never sees it.&lt;/p&gt;




&lt;h2&gt;
  
  
  ◈ How to run it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/brookehoward2008-droid/Babbled-notes-v2.git
&lt;span class="nb"&gt;cd &lt;/span&gt;Babbled-notes-v2
npm &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add a free Gemini API key to &lt;code&gt;.env.local&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GEMINI_API_KEY=your_key_here
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm run dev
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;code&gt;http://localhost:3000&lt;/code&gt;. Allow microphone access. Tap the silver ring. Make any sound. Wait 30-60 seconds for Gemma 4 to reason. The music plays.&lt;/p&gt;

&lt;p&gt;No API key? The app runs in simulation mode, with the full UI available and audio playing back immediately.&lt;/p&gt;




&lt;h2&gt;
  
  
  ◈ Agent architecture (detailed)
&lt;/h2&gt;

&lt;p&gt;Full technical breakdown in &lt;a href="https://github.com/brookehoward2008-droid/Babbled-notes-v2/blob/main/HERMES.md" rel="noopener noreferrer"&gt;HERMES.md&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Perception layer: FFT signal chain, onset detector, DspDigest schema&lt;/li&gt;
&lt;li&gt;Reasoning layer: dual-input Gemma 4 call, Lilt Contract, JSON extraction&lt;/li&gt;
&lt;li&gt;Action layer: per-voice synthesis chains, scheduling via AudioContext&lt;/li&gt;
&lt;li&gt;Feedback loop: live Lilt editor, re-render without re-recording&lt;/li&gt;
&lt;li&gt;State machine: idle / recording / processing / playing&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;💎 &lt;em&gt;The gem crystallizes. The music plays. You made that.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;You made that with a breath.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/brookehoward2008-droid/Babbled-notes-v2" rel="noopener noreferrer"&gt;https://github.com/brookehoward2008-droid/Babbled-notes-v2&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Agent docs:&lt;/strong&gt; &lt;a href="https://github.com/brookehoward2008-droid/Babbled-notes-v2/blob/main/HERMES.md" rel="noopener noreferrer"&gt;https://github.com/brookehoward2008-droid/Babbled-notes-v2/blob/main/HERMES.md&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;by Brooke Chauntel&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>hermeschallenge</category>
      <category>agents</category>
    </item>
    <item>
      <title>babbled notes: any sound becomes music. built for people who couldn't before.</title>
      <dc:creator>BABBLED77</dc:creator>
      <pubDate>Sat, 23 May 2026 22:58:41 +0000</pubDate>
      <link>https://dev.to/brookehoward2008droid/babbled-notes-turning-hums-taps-and-breath-into-editable-music-code-with-gemma-4-2i22</link>
      <guid>https://dev.to/brookehoward2008droid/babbled-notes-turning-hums-taps-and-breath-into-editable-music-code-with-gemma-4-2i22</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;💎 &lt;em&gt;One sound. Any sound. The gem listens. The music appears.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h1&gt;
  
  
  𝕓𝕒𝕓𝕓𝕝𝕖𝕕 𝕟𝕠𝕥𝕖𝕤
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Make any sound. Hum. Tap. Breathe. Whistle.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Gemma 4 finds the music inside it and plays it back as piano, cello, marimba, or drums.&lt;/p&gt;

&lt;p&gt;No keyboard. No music theory. No pitch-perfect voice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Built for anyone who has ever felt shut out of making music.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/brookehoward2008-droid/Babbled-notes-v2" rel="noopener noreferrer"&gt;https://github.com/brookehoward2008-droid/Babbled-notes-v2&lt;/a&gt;&lt;br&gt;
🎵 &lt;strong&gt;Live app:&lt;/strong&gt; &lt;a href="https://ai.studio/apps/4d235490-15ac-47a5-9599-f82aa85a2b57" rel="noopener noreferrer"&gt;https://ai.studio/apps/4d235490-15ac-47a5-9599-f82aa85a2b57&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  ◈ The problem
&lt;/h2&gt;

&lt;p&gt;Most music tools require two hands, ten fingers, perfect pitch, or years of training.&lt;/p&gt;

&lt;p&gt;That shuts out a huge part of the world. People who are non-verbal. People with ALS, cerebral palsy, locked-in syndrome, quadriplegia, Parkinson's. People who have always heard music inside them and had no way to get it out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;babbled notes gives them a door.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A single breath. A tongue click. A finger tap. A hum with a tremor in it.&lt;/p&gt;

&lt;p&gt;The app takes whatever you can give and turns it into a real musical composition, rendered in real time by a synthesized instrument of your choice.&lt;/p&gt;


&lt;h2&gt;
  
  
  💎 The NeuralGem
&lt;/h2&gt;

&lt;p&gt;At the center of the app is the &lt;strong&gt;NeuralGem&lt;/strong&gt;, a canvas visualizer with three states:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;IDLE       →  a breathing silver ring. waiting.
RECORDING  →  a crystallizing polygon. sides grow with your audio level.
             color shifts purple → cyan as the sound builds.
LOCKED     →  a hexagon. facets lit in your mood color.
             the gem has heard you.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The gem is not decoration. It tells you what the app is doing without words.&lt;/p&gt;




&lt;h2&gt;
  
  
  ◈ Who it is built for
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Profile&lt;/th&gt;
&lt;th&gt;What they give&lt;/th&gt;
&lt;th&gt;What they get&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;💜 Non-verbal autism&lt;/td&gt;
&lt;td&gt;Sustained hum, single tone&lt;/td&gt;
&lt;td&gt;Cello or piano melody&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;💙 Cerebral palsy&lt;/td&gt;
&lt;td&gt;Tremor-affected taps&lt;/td&gt;
&lt;td&gt;Percussive rhythm, drum or marimba&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🤍 ALS&lt;/td&gt;
&lt;td&gt;Minimal breath&lt;/td&gt;
&lt;td&gt;Ambient drone pad with gentle melody&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;💛 Locked-in syndrome&lt;/td&gt;
&lt;td&gt;Single eye-blink switch click&lt;/td&gt;
&lt;td&gt;One-trigger composition, looping&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;💚 Quadriplegia&lt;/td&gt;
&lt;td&gt;Hard puff / soft puff&lt;/td&gt;
&lt;td&gt;Two-dynamic melody: accent and soft&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🧡 Parkinson's&lt;/td&gt;
&lt;td&gt;Tremor vocal hum&lt;/td&gt;
&lt;td&gt;Composition that treats tremor as vibrato&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🩷 Apraxia of speech&lt;/td&gt;
&lt;td&gt;Broken phonation bursts&lt;/td&gt;
&lt;td&gt;Legato phrase bridging the gaps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;💎 AAC / pre-verbal&lt;/td&gt;
&lt;td&gt;Rising or falling hum&lt;/td&gt;
&lt;td&gt;Interval-based melodic response&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  ◈ How it works
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1.  TAP THE ORB      →  microphone opens
2.  MAKE A SOUND     →  Web Audio API captures + analyzes in real time
                        (FFT pitch, RMS amplitude, onset detection)
3.  TAP AGAIN        →  recording stops
4.  GEMMA 4 READS    →  receives audio + DSP digest simultaneously
                        returns: mood, voice, articulation, Lilt score
5.  THE GEM LOCKS    →  mood-colored hexagon appears
6.  MUSIC PLAYS      →  synthesized instrument renders the Lilt score
7.  EDIT ANYTIME     →  piano roll + live Lilt code editor, re-render without re-recording
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
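&lt;p&gt;Step 2 mentions RMS amplitude and onset detection. The sketch below shows one simple way to compute both on a plain array of samples. It is an illustration only: the real app reads from the Web Audio API, and the frame size, the threshold, and the function names &lt;code&gt;rms&lt;/code&gt; and &lt;code&gt;detectOnsets&lt;/code&gt; are assumptions, not the app's actual detector.&lt;/p&gt;

```typescript
// Root-mean-square amplitude of one frame of samples in the range [-1, 1].
function rms(frame: number[]): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (const sample of frame) sumSquares += sample * sample;
  return Math.sqrt(sumSquares / frame.length);
}

// Naive energy-based onset detector (hypothetical, not the app's algorithm).
// Reports the start time in seconds of every frame whose RMS rises above
// `threshold` when the previous frame was at or below it.
function detectOnsets(
  samples: number[],
  sampleRate: number,
  frameSize = 1024,
  threshold = 0.1,
): number[] {
  const onsets: number[] = [];
  let wasLoud = false;
  for (let start = 0; samples.length - start >= frameSize; start += frameSize) {
    const isLoud = rms(samples.slice(start, start + frameSize)) > threshold;
    if (isLoud) {
      if (!wasLoud) onsets.push(start / sampleRate);
    }
    wasLoud = isLoud;
  }
  return onsets;
}
```

&lt;p&gt;A fixed threshold is the simplest choice. A very quiet input such as a minimal breath would likely need a lower or adaptive threshold.&lt;/p&gt;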






&lt;h2&gt;
  
  
  💎 Why Gemma 4
&lt;/h2&gt;

&lt;p&gt;The app sends two things to the model at once:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Raw audio&lt;/strong&gt;: the actual recorded sound&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DSP digest&lt;/strong&gt;: structured analysis of onset times, dominant frequency, pitch name, amplitude, tempo estimate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Gemma 4 (&lt;code&gt;gemma-4-26b-a4b-it&lt;/code&gt;) reads both together and returns fast enough that a user with ALS or limited stamina hears their composition without a long wait. That responsiveness matters. A slow model breaks the experience.&lt;/p&gt;

&lt;p&gt;The system prompt enforces a strict JSON Lilt score every time. No freeform text. No guessing.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mood"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gentle"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"articulation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"legato"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"voice"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cinematic cello"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"notes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"note"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"velocity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"soft"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"time"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"note"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"C4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"velocity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"normal"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"time"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"explanation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A slow exhale, barely a sound. But steady. Like resolve."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
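&lt;p&gt;Because the model's reply is consumed as data, it helps to check its shape before playing anything. The sketch below types the JSON example above and rejects replies that do not match. The field names are taken from that example. The set of allowed velocities (&lt;code&gt;soft&lt;/code&gt;, &lt;code&gt;normal&lt;/code&gt;, &lt;code&gt;accent&lt;/code&gt;) is an assumption drawn from the samples in this post, and &lt;code&gt;parseLiltScore&lt;/code&gt; is a hypothetical helper, not a function from the repository.&lt;/p&gt;

```typescript
// Velocity values are assumed from the examples in the post.
type Velocity = "soft" | "normal" | "accent";

interface LiltNote {
  note: string;      // pitch name such as "A3"
  duration: number;  // presumably seconds
  velocity: Velocity;
  time: number;      // onset time, presumably seconds
}

interface LiltScore {
  mood: string;
  articulation: string;
  voice: string;
  notes: LiltNote[];
  explanation: string;
}

const VELOCITIES: string[] = ["soft", "normal", "accent"];

function isLiltNote(value: unknown): value is LiltNote {
  if (typeof value !== "object" || value === null) return false;
  const n = value as { [key: string]: unknown };
  const checks = [
    typeof n.note === "string",
    typeof n.duration === "number",
    typeof n.time === "number",
    typeof n.velocity === "string",
  ];
  if (!checks.every(Boolean)) return false;
  return VELOCITIES.includes(n.velocity as string);
}

// Parses the model's reply and throws if it does not match the expected shape.
function parseLiltScore(json: string): LiltScore {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("Lilt score must be a JSON object");
  }
  const obj = data as { [key: string]: unknown };
  for (const key of ["mood", "articulation", "voice", "explanation"]) {
    if (typeof obj[key] !== "string") {
      throw new Error("Lilt score field '" + key + "' must be a string");
    }
  }
  const notes = obj.notes;
  if (!Array.isArray(notes)) throw new Error("Lilt score field 'notes' must be an array");
  if (!notes.every(isLiltNote)) throw new Error("Lilt score contains a malformed note");
  return obj as unknown as LiltScore;
}
```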






&lt;h2&gt;
  
  
  ◈ Disability profiles tested
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;32 real DSP profiles. 7 disability categories. 3 difficulty levels.&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Beginner: one event, one sound, one note&lt;br&gt;
Intermediate: 2-3 events, some rhythm or pitch shift&lt;br&gt;
Advanced: 4+ events, dynamics, intentional pattern&lt;/p&gt;
&lt;/blockquote&gt;
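&lt;p&gt;Read by event count alone, the three levels above reduce to a small classifier. This is a sketch under that simplification: it ignores the rhythm, pitch, and dynamics cues the levels also mention, and both the function name &lt;code&gt;difficultyFor&lt;/code&gt; and the handling of a recording with no events are assumptions.&lt;/p&gt;

```typescript
type Difficulty = "beginner" | "intermediate" | "advanced";

// Classifies by number of detected events only: 1, 2-3, or 4 and up.
// Assumption: a recording with zero events is also treated as beginner.
function difficultyFor(eventCount: number): Difficulty {
  if (eventCount >= 4) return "advanced";
  if (eventCount >= 2) return "intermediate";
  return "beginner";
}
```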

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NV-01  Autism — slow exhale breath         (beginner)
NV-02  Autism — single sustained hum       (beginner)
NV-03  Autism — two-tone hum shift         (intermediate)
NV-04  Autism — melodic hum phrase         (advanced)
NV-05  Apraxia — disrupted single vowel    (beginner)
NV-06  Apraxia — broken phonation bursts   (intermediate)
NV-07  Apraxia — vowel glide attempt       (advanced)
NV-08  Selective mutism — barely audible   (beginner)
NV-09  Selective mutism — nose exhale      (intermediate)
PH-01  Cerebral palsy — single finger tap  (beginner)
PH-02  Cerebral palsy — tremor cluster     (intermediate)
PH-03  Cerebral palsy — intentional beat   (advanced)
PH-04  ALS — minimal breath control        (beginner)
PH-05  ALS — pulsed breath pattern         (intermediate)
PH-06  Locked-in — single switch click     (beginner)
PH-07  Locked-in — two-click phrase        (intermediate)
PH-08  Locked-in — morse-style rhythm      (advanced)
PH-09  Quadriplegia — single breath puff   (beginner)
PH-10  Quadriplegia — hard/soft contrast   (intermediate)
PH-11  Quadriplegia — rhythmic phrase      (advanced)
PH-12  Parkinson's — tremor hum            (beginner)
PH-13  Parkinson's — vocal tremor melody   (advanced)
MX-01  Whistle — single clear pitch        (beginner)
MX-02  Whistle — two-note call             (intermediate)
MX-03  Whistle — pentatonic phrase         (advanced)
MX-04  Tongue click — single event         (beginner)
MX-05  Tongue click — 4/4 rhythm           (intermediate)
MX-06  Tongue click — syncopated groove    (advanced)
MX-07  AAC — rising hum intention          (intermediate)
MX-08  AAC — call and response             (advanced)
MX-09  SCI C4 — head tap                   (beginner)
MX-10  SCI C4 — two-tap intentional gap    (intermediate)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run them yourself: &lt;code&gt;node test-runner.mjs&lt;/code&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  ◈ Stack
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Gemma 4 (gemma-4-26b-a4b-it)   multimodal audio + DSP digest to Lilt JSON
Web Audio API                   mic capture, FFT/RMS DSP, synthesized playback
React + Vite + TypeScript       frontend
Express + @google/genai SDK     backend (API key stays server-side)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  💎 What the Lilt format looks like
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A3 ! soft   @ 0.00s
C4 ! normal @ 1.20s
E4 ! accent @ 2.10s
G4 ! soft   @ 3.40s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each line is a note trigger: pitch, velocity, timestamp. The piano roll renders from this. The code is editable live. Change a velocity, move a timestamp, swap a note, hit compile. The music changes without re-recording.&lt;/p&gt;
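&lt;p&gt;To make the format concrete, here is a minimal sketch of how lines like the ones above could be parsed and turned into frequencies for a synthesizer. The grammar is inferred from the four example lines only. Support for sharps and flats, the names &lt;code&gt;parseLilt&lt;/code&gt; and &lt;code&gt;pitchToHz&lt;/code&gt;, and the A4 = 440 Hz equal-temperament tuning are assumptions, not details confirmed by the app.&lt;/p&gt;

```typescript
interface LiltLine {
  pitch: string;     // e.g. "A3"
  velocity: string;  // e.g. "soft"
  time: number;      // seconds from the start
}

// Matches lines such as "A3 ! soft   @ 0.00s" (grammar inferred from the examples).
const LILT_LINE = /^([A-G][#b]?\d)\s*!\s*(\w+)\s*@\s*(\d+(?:\.\d+)?)s$/;

function parseLilt(source: string): LiltLine[] {
  const result: LiltLine[] = [];
  for (const raw of source.split("\n")) {
    const line = raw.trim();
    if (line === "") continue; // skip blank lines
    const match = LILT_LINE.exec(line);
    if (match === null) throw new Error("Cannot parse Lilt line: " + line);
    result.push({ pitch: match[1], velocity: match[2], time: Number(match[3]) });
  }
  return result;
}

// Equal-temperament frequency, assuming A4 = 440 Hz.
function pitchToHz(pitch: string): number {
  const match = /^([A-G])([#b]?)(\d)$/.exec(pitch);
  if (match === null) throw new Error("Unknown pitch: " + pitch);
  const semitones: { [name: string]: number } = { C: 0, D: 2, E: 4, F: 5, G: 7, A: 9, B: 11 };
  const accidental = match[2] === "#" ? 1 : match[2] === "b" ? -1 : 0;
  // MIDI note number: C4 is 60, A4 is 69.
  const midi = 12 * (Number(match[3]) + 1) + semitones[match[1]] + accidental;
  return 440 * Math.pow(2, (midi - 69) / 12);
}
```

&lt;p&gt;With helpers like these, an edit in the code view only needs a re-parse and a re-schedule, which is why no new recording is required.&lt;/p&gt;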




&lt;blockquote&gt;
&lt;p&gt;💎 &lt;em&gt;The gem crystallizes. The music plays. You made that.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;You made that with a breath.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/brookehoward2008-droid/Babbled-notes-v2" rel="noopener noreferrer"&gt;https://github.com/brookehoward2008-droid/Babbled-notes-v2&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Live app:&lt;/strong&gt; &lt;a href="https://ai.studio/apps/4d235490-15ac-47a5-9599-f82aa85a2b57" rel="noopener noreferrer"&gt;https://ai.studio/apps/4d235490-15ac-47a5-9599-f82aa85a2b57&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;by Brooke Chauntel&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>gemmachallenge</category>
      <category>gemma</category>
    </item>
  </channel>
</rss>
