Building an AI Music Assistant with Sonic Pi and MCP

Kelsea Blackwell

Ever wanted to say, “Play me a funky bassline in C minor”, and have your computer just do it?

That’s exactly what this project is about. We're merging the expressive world of music with the precise world of code — and then adding AI to the mix. By combining Sonic Pi (a live-coded music studio) with the Model Context Protocol (MCP), we’re building an interface where natural language becomes actual sound.

Why This Matters

If you live at the crossroads of code and creativity (hi, welcome), you know the frustration of context-switching between the rigid syntax of programming and the freeform weirdness of making art. Sonic Pi already lets us write music like we write code — but what if our tools could understand us when we speak in musical ideas?

Imagine: an assistant that knows music theory and your synth settings. That can jam with you. That can help you experiment, explore, and play.

The Building Blocks

Sonic Pi: Our Musical Engine

# This is what Sonic Pi understands
use_synth :fm
play :C4
sleep 0.5
play :E4

MCP: The AI Communication Layer

// This is what AI assistants understand
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "play_note",
  "params": {
    "note": "C4",
    "synth": "fm",
    "attack": 0.1
  }
}

Our job is to build the bridge — letting AI express music in a format Sonic Pi understands, and vice versa.
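
To make that concrete, here's a minimal sketch of the Sonic Pi-facing half of the bridge, using the rosc crate. The /run-code address and UDP port 4557 match older Sonic Pi releases; Sonic Pi 4+ negotiates its ports and an auth token through a daemon, so read this as an illustration of the idea, not the project's exact wiring.

use std::net::UdpSocket;
use rosc::{encoder, OscMessage, OscPacket, OscType};

// Send a string of Sonic Pi code to the server over OSC (assumption:
// the legacy /run-code endpoint on UDP port 4557).
fn run_code(code: &str) -> std::io::Result<()> {
    let socket = UdpSocket::bind("127.0.0.1:0")?;
    let msg = OscPacket::Message(OscMessage {
        addr: "/run-code".to_string(),
        args: vec![
            OscType::String("sonic-pi-mcp".to_string()), // client id
            OscType::String(code.to_string()),           // code to evaluate
        ],
    });
    let buf = encoder::encode(&msg).expect("valid OSC packet");
    socket.send_to(&buf, "127.0.0.1:4557")?;
    Ok(())
}

With something like that in place, playing a note is just run_code("play :C4").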

The Architecture

Think of the system like a band with three key roles:

1. The Interpreter (MCP Layer)
   - Parses natural language
   - Translates intent into structured musical actions
   - Handles real-time AI interaction

2. The Conductor (Core Logic)
   - Maintains musical state
   - Coordinates rhythm, timing, and structure
   - Converts music theory into actual notes

3. The Performer (OSC Layer)
   - Talks directly to Sonic Pi
   - Sends commands like `play :C4`
   - Tracks playback in real time

pub struct SonicPiMCP {
    interpreter: MCPServer,    // parses MCP requests from the AI
    conductor: MusicState,     // tracks key, tempo, and song structure
    performer: SonicPiClient,  // speaks OSC to Sonic Pi
}

Building the Brain

Here’s where it gets spicy: we need to give the assistant tools that actually understand music.

#[tool(description = "Create a chord progression")]
async fn create_progression(&self,
    #[tool(param)] key: String,
    #[tool(param)] progression: Vec<String>,
    #[tool(param)] style: Option<String>
) -> Result<String> {
    // Resolve roman numerals ("i", "VI", ...) into concrete chords in `key`.
    let chord_sequence = music_theory::resolve_progression(&key, &progression)?;

    // Fall back to a neutral pattern if no style was requested.
    let pattern = if let Some(style) = style {
        StylePattern::from_name(&style)?
    } else {
        StylePattern::default()
    };

    self.performer.play_pattern(chord_sequence, pattern)
}

This isn’t just code generation — it’s musical composition, dynamically shaped by style, key, and feel.
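
For a sense of what resolve_progression might be doing under the hood, here's a hypothetical sketch that maps roman numerals onto a natural-minor scale and stacks triads. The key table and numeral set are deliberately toy-sized; the real implementation in the repo is the source of truth.

// Hypothetical sketch of resolve_progression: map roman numerals to MIDI
// triads in a natural-minor key.
fn resolve_progression(key: &str, numerals: &[&str]) -> Option<Vec<Vec<u8>>> {
    // MIDI root per key (assumption: a few minor keys for illustration).
    let root: u8 = match key {
        "Gm" => 55, // G3
        "Am" => 57,
        "Cm" => 60,
        _ => return None,
    };
    let scale: [u8; 7] = [0, 2, 3, 5, 7, 8, 10]; // natural minor intervals
    let degree = |n: &str| -> Option<usize> {
        match n {
            "i" => Some(0), "ii" => Some(1), "III" => Some(2), "iv" => Some(3),
            "v" => Some(4), "VI" => Some(5), "VII" => Some(6),
            _ => None,
        }
    };
    numerals
        .iter()
        .map(|&n| {
            let d = degree(n)?;
            // Stack a 1-3-5 triad on the degree; bump an octave when we wrap.
            let note = |step: usize| {
                let idx = d + step;
                root + scale[idx % 7] + if idx >= 7 { 12 } else { 0 }
            };
            Some(vec![note(0), note(2), note(4)])
        })
        .collect()
}

Feed it ["i", "III", "VII", "v"] in "Gm" and you'd get back G minor, B♭ major, F major, and D minor triads as MIDI note numbers.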

Making It Musical

Groove. Swing. Imperfection. These are what make music feel real.

pub struct MusicalParameters {
    tempo: f32,                     // beats per minute
    velocity: u8,                   // MIDI velocity, 0-127
    groove: Option<GroovePattern>,
    feel: Option<Feel>,
    swing: f32,                     // 0.5 = straight, ~0.67 = triplet swing
    humanize: f32,                  // amount of random timing jitter
}

impl MusicalParameters {
    pub fn with_feel(mut self, feel: &str) -> Result<Self> {
        match feel.to_lowercase().as_str() {
            "laid_back" => {
                self.swing = 0.67;
                self.humanize = 0.2;
                self.velocity = 90;
            },
            "aggressive" => {
                self.swing = 0.52;
                self.humanize = 0.1;
                self.velocity = 110;
            },
            // Unknown feel names are rejected rather than silently ignored.
            _ => return Err(Error::UnknownFeel(feel.to_string())),
        }
        Ok(self)
    }
}

These subtle tweaks make the AI feel less like a metronome and more like a collaborator.
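
If you're curious how those numbers could translate into actual timing, here's one plausible sketch (not the project's code) that swings the off-beat 8th notes and adds a little jitter, using the rand crate.

use rand::Rng;

// Hypothetical groove transform: `onsets` are note start times in beats.
fn apply_groove(onsets: &[f32], swing: f32, humanize: f32) -> Vec<f32> {
    let mut rng = rand::thread_rng();
    onsets
        .iter()
        .map(|&t| {
            let beat = t.floor();
            let frac = t - beat;
            // Off-beat 8ths (frac = 0.5) land at the swing point instead:
            // swing = 0.5 is straight time, ~0.67 is a triplet shuffle.
            let swung = if (frac - 0.5).abs() < 1e-3 { beat + swing } else { t };
            // Humanize: random jitter, up to ±5% of a beat at humanize = 1.0.
            swung + rng.gen_range(-humanize..=humanize) * 0.05
        })
        .collect()
}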

Real-Time Feedback

Now we’re getting into live performance territory. The assistant can listen and respond in real time.

use futures::StreamExt; // assumption: the cue receiver is a futures Stream

impl SonicPiClient {
    pub async fn monitor_playback(&self) -> Result<()> {
        // Subscribe to the OSC cues Sonic Pi emits as it plays.
        let mut cue_receiver = self.create_cue_receiver()?;

        while let Some(cue) = cue_receiver.next().await {
            match cue.path.as_str() {
                "/beat" => self.update_timing(cue.args)?,
                "/synth/started" => self.note_on(cue.args)?,
                "/synth/finished" => self.note_off(cue.args)?,
                _ => debug!("Unknown cue: {:?}", cue),
            }
        }
        Ok(())
    }
}

This is the real magic: feedback loops. The system hears itself and adapts.
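
As one example of what that adaptation can look like, here's a hypothetical tempo tracker that estimates the live tempo from successive /beat cues. The BeatTracker struct and its smoothing constant are my own invention, not the project's update_timing.

use std::time::Instant;

// Hypothetical: estimate actual playback tempo from incoming /beat cues.
struct BeatTracker {
    last_beat: Option<Instant>,
    bpm_estimate: f32,
}

impl BeatTracker {
    fn new(initial_bpm: f32) -> Self {
        Self { last_beat: None, bpm_estimate: initial_bpm }
    }

    // Call this on every incoming /beat cue.
    fn on_beat(&mut self) {
        let now = Instant::now();
        if let Some(prev) = self.last_beat {
            let secs = now.duration_since(prev).as_secs_f32();
            if secs > 0.0 {
                // Exponential smoothing: one late cue shouldn't lurch the tempo.
                let instant_bpm = 60.0 / secs;
                self.bpm_estimate = 0.9 * self.bpm_estimate + 0.1 * instant_bpm;
            }
        }
        self.last_beat = Some(now);
    }
}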

The Magic Moment

When everything works, it feels like this:

You: "Create a chill lofi beat in G minor"

[AI calls:]
- create_drum_loop(style="lofi", tempo=85)
- create_progression("Gm", ["i", "III", "VII", "v"])
- add_effect("vinyl_crackle", mix=0.3)

[Sonic Pi starts vibing with a mellow beat]

You: "Add some jazzy piano"

[AI layers in tasteful piano voicings, following the chord structure...]

That’s not just generative music. That’s collaboration.

Try It Yourself

Want to get weird with it? Here’s your on-ramp:

git clone https://github.com/TrippingKelsea/sonic-pi-mcp
cd sonic-pi-mcp
cargo build --release
./target/release/sonic-pi-mcp

  • Make sure you’ve got Sonic Pi installed
  • Point your AI assistant at the MCP server
  • Start jamming

What’s Next?

This is barely scratching the surface. Imagine:

  • AI jam sessions that evolve in real-time
  • A musical copilot that suggests reharmonizations
  • Ambient sound installations powered by GPT
  • Teaching kids music theory with code and conversation
  • Real-time backing for live musicians (stay tuned for the next post!)

Join the Experiment

This project lives at the messy, magical intersection of code, sound, and intelligence. It’s weird, it’s delightful, and it wants your brain on it.

Whether you’re a musician who dabbles in code, or a coder who dreams in 7ths and swing — you belong here.

For more information, check out the project on GitHub: https://github.com/TrippingKelsea/sonic-pi-mcp
