Building an AI Music Assistant with Sonic Pi and MCP
Ever wanted to say, “Play me a funky bassline in C minor”, and have your computer just do it?
That’s exactly what this project is about. We're merging the expressive world of music with the precise world of code — and then adding AI to the mix. By combining Sonic Pi (a live-coded music studio) with the Model Context Protocol (MCP), we’re building an interface where natural language becomes actual sound.
Why This Matters
If you live at the crossroads of code and creativity (hi, welcome), you know the frustration of context-switching between the rigid syntax of programming and the freeform weirdness of making art. Sonic Pi already lets us write music like we write code — but what if our tools could understand us when we speak in musical ideas?
Imagine: an assistant that knows music theory and your synth settings. That can jam with you. That can help you experiment, explore, and play.
The Building Blocks
Sonic Pi: Our Musical Engine
# This is what Sonic Pi understands
use_synth :fm
play :C4
sleep 0.5
play :E4
MCP: The AI Communication Layer
// This is what AI assistants understand
{
  "jsonrpc": "2.0",
  "method": "play_note",
  "params": {
    "note": "C4",
    "synth": "fm",
    "attack": 0.1
  }
}
Our job is to build the bridge — letting AI express music in a format Sonic Pi understands, and vice versa.
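To make the bridge concrete, here is a minimal sketch of one way a `play_note` request could reach Sonic Pi: encode the parameters as an OSC message with the `rosc` crate and send it to Sonic Pi's default incoming OSC port (4560), where a `live_loop` can pick it up as a cue. The `send_play_note` helper and the `/play_note` address are illustrative; the project itself may wire this differently.

use std::net::UdpSocket;
use rosc::{encoder, OscMessage, OscPacket, OscType};

/// Forward one `play_note` request to Sonic Pi as an OSC message.
/// Assumes Sonic Pi is listening for incoming OSC on its default port 4560.
fn send_play_note(note: &str, synth: &str, attack: f32) -> std::io::Result<()> {
    let packet = OscPacket::Message(OscMessage {
        addr: "/play_note".to_string(),
        args: vec![
            OscType::String(note.to_string()),
            OscType::String(synth.to_string()),
            OscType::Float(attack),
        ],
    });
    // rosc turns the message into the raw bytes Sonic Pi expects on the wire.
    let bytes = encoder::encode(&packet).expect("OSC message should encode");
    let socket = UdpSocket::bind("0.0.0.0:0")?; // any free local port will do
    socket.send_to(&bytes, "127.0.0.1:4560")?;
    Ok(())
}

On the Sonic Pi side, a `live_loop` that calls `sync "/osc*/play_note"` can then read those three arguments and turn them into `use_synth` and `play` calls.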
The Architecture
Think of the system like a band with three key roles:
- The Interpreter (MCP Layer)
  - Parses natural language
  - Translates intent into structured musical actions
  - Handles real-time AI interaction
- The Conductor (Core Logic)
  - Maintains musical state
  - Coordinates rhythm, timing, and structure
  - Converts music theory into actual notes
- The Performer (OSC Layer)
  - Talks directly to Sonic Pi
  - Sends commands like `play :C4`
  - Tracks playback in real-time
pub struct SonicPiMCP {
    interpreter: MCPServer,    // the Interpreter: parses and answers MCP requests
    conductor: MusicState,     // the Conductor: musical state, timing, and theory
    performer: SonicPiClient,  // the Performer: the OSC link to Sonic Pi
}
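To make the division of labor concrete, here is a rough sketch of how a single request might flow through those three roles. Every method below (`next_onset`, `record_note`, `send`) is a hypothetical name for illustration, not the crate's real API.

impl SonicPiMCP {
    /// Illustrative request flow: the Interpreter hands over a parsed intent,
    /// the Conductor decides what and when, the Performer makes it sound.
    async fn handle_play_note(&mut self, note: &str, synth: &str) -> Result<()> {
        // The Conductor tracks musical state: what is playing, and where we are in time.
        let onset = self.conductor.next_onset();
        self.conductor.record_note(note, onset);
        // The Performer translates that decision into actual Sonic Pi code.
        self.performer
            .send(&format!("use_synth :{synth}\nplay :{note}"))
            .await
    }
}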
Building the Brain
Here’s where it gets spicy: we need to give the assistant tools that actually understand music.
#[tool(description = "Create a chord progression")]
async fn create_progression(&self,
    #[tool(param)] key: String,
    #[tool(param)] progression: Vec<String>,
    #[tool(param)] style: Option<String>
) -> Result<String> {
    // Turn Roman numerals in the requested key into concrete chords.
    let chord_sequence = music_theory::resolve_progression(&key, &progression)?;
    // Use the named style's rhythmic pattern, or fall back to a sensible default.
    let pattern = if let Some(style) = style {
        StylePattern::from_name(&style)?
    } else {
        StylePattern::default()
    };
    self.performer.play_pattern(chord_sequence, pattern)
}
This isn’t just code generation — it’s musical composition, dynamically shaped by style, key, and feel.
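For a feel of what `resolve_progression` has to do, here is a toy resolver for natural-minor keys that works in MIDI note numbers. It is a sketch of the idea only; a real `music_theory` module would also need major keys, sevenths, inversions, borrowed chords, and friendlier key parsing.

fn resolve_progression(key_root_midi: u8, progression: &[&str]) -> Option<Vec<[u8; 3]>> {
    // Natural minor scale as semitone offsets from the key's root.
    const MINOR_SCALE: [u8; 7] = [0, 2, 3, 5, 7, 8, 10];

    // Map a Roman numeral to a scale degree (case-insensitive for brevity;
    // in natural minor the triad quality follows from the scale anyway).
    fn degree(numeral: &str) -> Option<usize> {
        match numeral.to_lowercase().as_str() {
            "i" => Some(0), "ii" => Some(1), "iii" => Some(2), "iv" => Some(3),
            "v" => Some(4), "vi" => Some(5), "vii" => Some(6),
            _ => None,
        }
    }

    progression.iter().map(|numeral| {
        let d = degree(numeral)?;
        // Stack diatonic thirds: scale degrees d, d+2, d+4, wrapping up an octave.
        let pitch = |step: usize| {
            let idx = (d + step) % 7;
            let octave = ((d + step) / 7) as u8 * 12;
            key_root_midi + MINOR_SCALE[idx] + octave
        };
        Some([pitch(0), pitch(2), pitch(4)])
    }).collect()
}

// G minor with G4 = MIDI 67:
// resolve_progression(67, &["i", "III", "VII", "v"])
//   -> i = G Bb D, III = Bb D F, VII = F A C, v = D F A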
Making It Musical
Groove. Swing. Imperfection. These are what make music feel real.
pub struct MusicalParameters {
    tempo: f32,
    velocity: u8,
    groove: Option<GroovePattern>,
    feel: Option<Feel>,
    swing: f32,    // 0.5 = perfectly straight; higher values delay the off-beats
    humanize: f32, // 0.0 = machine-perfect timing; higher values loosen it up
}
impl MusicalParameters {
    pub fn with_feel(mut self, feel: &str) -> Result<Self> {
        match feel.to_lowercase().as_str() {
            "laid_back" => {
                self.swing = 0.67;
                self.humanize = 0.2;
                self.velocity = 90;
            },
            "aggressive" => {
                self.swing = 0.52;
                self.humanize = 0.1;
                self.velocity = 110;
            },
            // Refuse feels we don't know instead of silently guessing.
            _ => return Err(Error::UnknownFeel(feel.to_string())),
        }
        Ok(self)
    }
}
These subtle tweaks make the AI feel less like a metronome and more like a collaborator.
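How those numbers reach the timeline is up to the playback layer. Here is one plausible way to apply them to note onsets; the `apply_feel` helper and its beats-as-f32 convention are assumptions for illustration, not the project's actual scheduler.

/// Illustrative onset adjustment. `beat` is the nominal onset in beats
/// (0.0, 0.5, 1.0, 1.5, ... for straight eighth notes); `jitter` is a
/// per-note random value in [-1.0, 1.0] supplied by the caller.
fn apply_feel(beat: f32, swing: f32, humanize: f32, jitter: f32) -> f32 {
    let downbeat = beat.floor();
    let phase = beat - downbeat;
    // Swing: push the off-beat eighth (phase 0.5) out to `swing` within its beat,
    // so 0.5 stays perfectly straight and ~0.67 gives a triplet shuffle.
    let swung = if (phase - 0.5).abs() < 1e-6 { downbeat + swing } else { beat };
    // Humanize: drift the onset by up to +/- humanize * 0.03 beats (about 20 ms at 85 BPM).
    swung + jitter * humanize * 0.03
}

With the laid_back preset above, straight eighth notes at 0.0 and 0.5 land at 0.0 and roughly 0.67, and every hit drifts a few milliseconds off the grid instead of snapping to it.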
Real-Time Feedback
Now we’re getting into live performance territory. The assistant can listen and respond in real time.
impl SonicPiClient {
    pub async fn monitor_playback(&self) -> Result<()> {
        let mut cue_receiver = self.create_cue_receiver()?;
        while let Some(cue) = cue_receiver.next().await {
            match cue.path.as_str() {
                "/beat" => self.update_timing(cue.args)?,
                "/synth/started" => self.note_on(cue.args)?,
                "/synth/finished" => self.note_off(cue.args)?,
                _ => debug!("Unknown cue: {:?}", cue),
            }
        }
        Ok(())
    }
}
This is the real magic: feedback loops. The system hears itself and adapts.
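For a sense of what could sit behind `create_cue_receiver`, here is one way to feed that cue stream: Sonic Pi can emit OSC itself (via `use_osc` and `osc` inside a `live_loop`), and a small listener task can decode those packets with `rosc` and push them down a channel. The `Cue` struct, the port number, and the channel wiring are assumptions for this sketch, not the project's actual internals.

use rosc::{OscPacket, OscType};
use tokio::net::UdpSocket;
use tokio::sync::mpsc;

/// Minimal stand-in for the cues consumed by `monitor_playback`.
pub struct Cue {
    pub path: String,
    pub args: Vec<OscType>,
}

/// Listen for OSC that Sonic Pi sends back, e.g. from a live_loop running
/// `use_osc "localhost", 4561` and `osc "/beat", tick`. Port 4561 is an
/// arbitrary choice for this sketch.
pub async fn cue_listener(tx: mpsc::Sender<Cue>) -> std::io::Result<()> {
    let socket = UdpSocket::bind("127.0.0.1:4561").await?;
    let mut buf = [0u8; 1024];
    loop {
        let (len, _from) = socket.recv_from(&mut buf).await?;
        // Decode the raw datagram into an OSC packet and forward it as a cue.
        if let Ok((_, OscPacket::Message(msg))) = rosc::decoder::decode_udp(&buf[..len]) {
            let cue = Cue { path: msg.addr, args: msg.args };
            if tx.send(cue).await.is_err() {
                break; // the receiving side hung up, so stop listening
            }
        }
    }
    Ok(())
}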
The Magic Moment
When everything works, it feels like this:
You: "Create a chill lofi beat in G minor"
[AI calls:]
- create_drum_loop(style="lofi", tempo=85)
- create_progression("Gm", ["i", "III", "VII", "v"])
- add_effect("vinyl_crackle", mix=0.3)
[Sonic Pi starts vibing with a mellow beat]
You: "Add some jazzy piano"
[AI layers in tasteful piano voicings, following the chord structure...]
That’s not just generative music. That’s collaboration.
Try It Yourself
Want to get weird with it? Here’s your on-ramp:
git clone https://github.com/TrippingKelsea/sonic-pi-mcp
cd sonic-pi-mcp
cargo build --release
./target/release/sonic-pi-mcp
- Make sure you’ve got Sonic Pi installed
- Point your AI assistant at the MCP server (see the example config below)
- Start jamming
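How you point your assistant at the server depends on the MCP client. As one example, Claude Desktop reads an mcpServers entry from its claude_desktop_config.json; the server name and path below are placeholders for wherever your release build ended up.

{
  "mcpServers": {
    "sonic-pi": {
      "command": "/absolute/path/to/sonic-pi-mcp/target/release/sonic-pi-mcp",
      "args": []
    }
  }
}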
What’s Next?
This is barely scratching the surface. Imagine:
- AI jam sessions that evolve in real-time
- A musical copilot that suggests reharmonizations
- Ambient sound installations powered by GPT
- Teaching kids music theory with code and conversation
- Real-time backing for live musicians (stay tuned for the next post!)
Join the Experiment
This project lives at the messy, magical intersection of code, sound, and intelligence. It’s weird, it’s delightful, and it wants your brain on it.
Whether you’re a musician who dabbles in code, or a coder who dreams in 7ths and swing — you belong here.
For more information, check out the sonic-pi-mcp repository: https://github.com/TrippingKelsea/sonic-pi-mcp