Bondi Sonoro: A Build Log of Real Data, Generative Music, and the MTA.me Mechanic
Live demo: bondi-sonoro.vercel.app
Code: github.com/JuanTorchia/bondi-sonoro
Previous chapter (trains, static schedules): amba-trenes-sonoros.vercel.app
Why this post is longer than the last one
When I published AMBA Trenes Sonoros, I closed it with something that now sounds almost prophetic to me:
"If the Ministry ever opens a real-time feed, swapping the source is 10 lines of code."
Spoiler: it's not ten. It's several thousand. In between there are architectural decisions, weird bugs, a post-mortem of a Vercel deploy that broke over a PolySynth<any>, two complete rewrites of the sonification engine, and one exact moment where what sounded like a metronome turned into music.
This post is the complete build log of chapter 2. I want to cover not just what I built, but what I tried, what broke, and why the decisions landed the way they did. If you ever thought a "creative" project was just making an output look pretty — this is the opposite of that. It's architecture, all the way down.
Step one: hunting the data
The first thing that changed after the trains post was a reader replying:
"Buses do have real-time. Look it up."
So I looked it up. I ended up at api-transporte.buenosaires.gob.ar. Turns out the Buenos Aires City Government has been publishing a public transport API for years, with:
- Live GPS positions for every bus in CABA and the greater Buenos Aires metro area.
- Arrival predictions per stop.
- Operational alerts.
- Static GTFS with official routes.
- Full GTFS-RT (protobuf feed).
The weird part: Google Maps and Moovit both use it in production, but outside the transit-tech world almost nobody seems to build new things on top of it. The friction is minimal — you sign up, they email you a free client_id and client_secret, and you're off.
A raw first request to the vehiclePositionsSimple endpoint:
```bash
curl "https://apitransporte.buenosaires.gob.ar/colectivos/vehiclePositionsSimple?client_id=XXX&client_secret=YYY"
```
Response: 1.1 MB of JSON, 3,197 active vehicles. Each one with:
```json
{
  "route_id": "764",
  "latitude": -34.78668,
  "longitude": -58.249,
  "speed": 9.72,
  "timestamp": 1776129272,
  "id": "1881",
  "direction": 0,
  "agency_name": "MICRO OMNIBUS QUILMES S.A.C.I. Y F.",
  "agency_id": 72,
  "route_short_name": "159C",
  "trip_headsign": "a Est. Lanus x Gimnasia"
}
```
I had the data. Now I had to decide what story to tell with it.
The inspiration, revisited
Conductor by Alexander Chen from 2011 is the unavoidable reference. Each NYC subway line is drawn as a string stretched between stations. When a train leaves a station, that station "plucks" the string connecting it to the next one — the string vibrates, you hear a note, and the next train responds on another part of the network.
The collective effect is emergent music: nobody composes it, it just falls out of traffic. And the most beautiful part is that the crossings matter. When two lines meet at a transfer point, both strings interact. Counterpoint without a score.
I made two decisions before writing a single line of code:
- Back to the original aesthetic. Strings, not dots. Strings that vibrate. Strings that ring when others cross them.
- Make it feel like a game. Black background, neon, subtle scanlines. No Google Maps-style basemap. The map is an instrument, not a GPS.
The first decision that shapes everything else
I could've built this as a pure SPA, hitting the API from the browser with credentials baked in. Plenty of people do that. It's wrong.
My reasoning:
- The GCBA credentials are free but personal. Exposing them in the client invites abuse, even accidental abuse. A server-side proxy keeps them in one place.
- The raw feed weighs 1.1 MB and covers every bus in the greater metro area. I only wanted CABA. If the client downloads the full feed, I'm just torching bandwidth.
- Polling from many browsers at once would hammer the GCBA upstream. With a proxy + cache, a thousand of my users look like one request to them.
So: Next.js with App Router and a Route Handler as a proxy. The client hits /api/positions, the server is the only one that knows the credentials, filters the payload, and caches it.
```typescript
// app/api/positions/route.ts
import { NextResponse } from "next/server";

// CURATED_PREFIXES, prefixOf, and the UpstreamVehicle type live in lib/
// and are imported here; omitted for brevity.

export const revalidate = 30;

export async function GET() {
  const url = `https://apitransporte.buenosaires.gob.ar/colectivos/vehiclePositionsSimple?client_id=${process.env.BA_TRANSPORT_CLIENT_ID}&client_secret=${process.env.BA_TRANSPORT_CLIENT_SECRET}`;
  const res = await fetch(url, { next: { revalidate: 30 } });
  const upstream: UpstreamVehicle[] = await res.json();

  const filtered = upstream
    .filter(v => CURATED_PREFIXES.has(prefixOf(v.route_short_name)))
    .map(v => ({
      id: v.id,
      lineShort: prefixOf(v.route_short_name),
      lat: v.latitude,
      lon: v.longitude,
      speed: v.speed,
      direction: v.direction,
      headsign: v.trip_headsign,
      timestamp: v.timestamp,
    }));

  return NextResponse.json(
    { generatedAt: Date.now(), vehicles: filtered },
    { headers: { "Cache-Control": "public, s-maxage=30, stale-while-revalidate=60" } }
  );
}
```
What comes out of the proxy is no longer 1.1 MB — it's ~30 KB. The revalidate: 30 combined with s-maxage=30 makes Next cache the response on Vercel Edge for 30 seconds, so the GCBA gets exactly one fetch from me every 30 seconds regardless of how many users I have.
The routes: static GTFS and the 200 MB zip
The real-time feed gives me positions, but it doesn't draw routes. To have "strings," I need the shapes.txt from the CABA static GTFS.
The dataset lives at data.buenosaires.gob.ar/dataset/colectivos-gtfs. A zip with routes.txt, trips.txt, shapes.txt, and friends. I downloaded it.
```bash
curl -L -o /tmp/colectivos.zip "https://cdn.buenosaires.gob.ar/.../colectivos-gtfs.zip"
# 209 MB
```
Two hundred and nine megabytes. Running that on every Vercel build would be a terrible idea. And it's semi-static anyway — routes change rarely. Decision:
- I run the parser manually on my machine with `pnpm gtfs:fetch`.
- The script extracts, simplifies to ~200 points per line (lite Douglas-Peucker), projects to the lines I care about, and writes `data/routes.json` (~250 KB).
- The JSON gets committed to the repo.
- Vercel reads that JSON and never touches the internet during build.
This has a name: "data as build artifact". When the source changes slowly and the app changes fast, there's no reason to make the build depend on the network.
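For reference, the "simplifies to ~200 points" step is a classic Douglas-Peucker pass. This is an illustrative sketch, not the repo's exact code — the real script also reprojects coordinates and caps the point count:

```typescript
type Pt = { x: number; y: number };

// Perpendicular distance from p to the infinite line through a and b.
function perpDist(p: Pt, a: Pt, b: Pt): number {
  const dx = b.x - a.x, dy = b.y - a.y;
  const len = Math.hypot(dx, dy);
  if (len === 0) return Math.hypot(p.x - a.x, p.y - a.y);
  return Math.abs(dy * p.x - dx * p.y + b.x * a.y - b.y * a.x) / len;
}

// Recursive Douglas-Peucker: keep points farther than `epsilon`
// from the chord between the endpoints, drop the rest.
function simplify(points: Pt[], epsilon: number): Pt[] {
  if (points.length < 3) return points;
  let maxDist = 0, idx = 0;
  for (let i = 1; i < points.length - 1; i++) {
    const d = perpDist(points[i], points[0], points[points.length - 1]);
    if (d > maxDist) { maxDist = d; idx = i; }
  }
  if (maxDist <= epsilon) return [points[0], points[points.length - 1]];
  const left = simplify(points.slice(0, idx + 1), epsilon);
  const right = simplify(points.slice(idx), epsilon);
  return [...left.slice(0, -1), ...right];
}
```

A straight run of GPS points collapses to its two endpoints; a real corner survives. Tune `epsilon` until the typical line lands near the point budget.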
First bug I didn't expect
My curated list had 20 iconic lines: 60, 152, 29, 7, 39, 132, etc. I ran the parser:
```
[routes] couldn't find route_id for line 60
[routes] couldn't find route_id for line 152
[routes] couldn't find route_id for line 29
...
```
What? I grepped routes.txt:
```
"152","16","21A","JNAMBA021","Ejercito de los Andres - Rotonda Dardo Rocha Tigre",3
```
The actual route_short_name is 21A, 96AG, 621R9, etc. These are variant/branch IDs. "Line 60" in the Buenos Aires sense splits into dozens of sub-routes with suffixes. The human name "60" just doesn't exist as-is.
A more careful grep showed that variants follow the pattern `<number><optional letter>`:
```
10A 15A 17A 19A 20A 20B 23A 24A 24B 24C
29A 29B 29C 34A 37A 39A 39B 39F 42A
44A 45A 46A 50A 53A 53B 55A 56A 59A 59B 59D
60C 60F 60G 61A 64A 65A 67A 68A 68B
92A 92C 92D 101A 101B 101C 105A 108A
111B 111D 111E 132A 132B 132C 140A 140B 140C
151A 152A 152B 152C 160A
```
I fixed the matcher: for a curated line "152" I look for any route_short_name matching /^152[A-Z]?$/. I take the first variant that has an associated shape. Result:
```
[routes] ✓ 60: 201 points
[routes] ✓ 152: 201 points
[routes] ✓ 29: 201 points
...
[routes] ✓ parsed: 20/20 lines
```
Data loaded. On to act two.
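The matcher is small enough to show in full. A sketch under my own naming — the function name is mine, not necessarily the repo's:

```typescript
// Given a curated human-facing line number ("152") and the GTFS
// route_short_name values, return the first matching variant
// (e.g. "152A"). Returns undefined when nothing matches.
function findVariant(curated: string, shortNames: string[]): string | undefined {
  const re = new RegExp(`^${curated}[A-Z]?$`);
  return shortNames.find(name => re.test(name));
}
```

The anchored `$` matters: without it, curated line "15" would happily swallow "152A".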
Turning the city into sound
I needed to decide what note each line plays. Two golden rules:
Rule 1: Major pentatonic
Buses don't coordinate with each other. Every line fires notes independently. If I use a chromatic scale (with semitones), the probability of dissonance explodes with every simultaneous bus.
The major pentatonic (C, D, E, G, A) has zero semitone intervals between its notes. Any simultaneous combination sounds consonant. It's the same trick used in kindergarten xylophones: "no matter how you hit it, it never sounds ugly."
In distributed systems language: if you can't coordinate producers, design the protocol so that any message is valid. The pentatonic scale is the protocol that eliminates an entire category of musical bugs by design.
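Concretely, assigning notes is just modular indexing over the pentatonic pitch classes. The octave spread here is my assumption — the repo may voice the lines differently:

```typescript
// C major pentatonic pitch classes, spread over two octaves so
// neighbouring lines don't all sit on the same pitch.
const PENTATONIC = ["C", "D", "E", "G", "A"];

function noteForLine(lineIndex: number): string {
  const pc = PENTATONIC[lineIndex % PENTATONIC.length];
  const octave = 3 + Math.floor(lineIndex / PENTATONIC.length) % 2;
  return `${pc}${octave}`;
}
```

Whatever two values this returns, sounding them together stays consonant — that's the whole point of restricting the alphabet.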
Rule 2: Karplus-Strong
Tone.js has a lot of synths. I went with PluckSynth because it implements the Karplus-Strong algorithm, the classic primitive for plucked-string synthesis. Mathematically it's a delay line with filtered feedback. What matters: it sounds exactly like a string being plucked.
```typescript
// lib/sonify.ts
const pluck = new Tone.PluckSynth({
  attackNoise: 0.8,
  dampening: 3500,
  resonance: 0.9,
});

// when a bus crosses:
pluck.triggerAttack(note);
```
Each line has its own PluckSynth routed through a shared reverb. The aesthetic coherence — code, audio, visual — starts with choosing the right primitives.
First attempt: buses as metronomes
First version: I drew the 20 strings in SVG, placed each bus on its nearest polyline, and every time a bus "progressed" enough, it plucked its own string.
```typescript
// pseudo
if (bondiProgressed > THRESHOLD) {
  pluck(bondi.line, note);
}
```
I hit play. Result: near-total silence, and then an annoying burst every 30 seconds.
What was happening? Two bugs stacked on top of each other:
- The threshold was compared per-frame, but the smoothing that moved the bus toward its new position only advanced 3.5% of the diff per frame. It never cleared the 0.5% threshold in a single frame.
- When the poll hit every 30s, `serverProgress` jumped all at once → that 0.5% accumulated in one frame → all 20 buses fired simultaneously → one giant chord, then silence.
It was a metronome, not music.
Interim fix: per-vehicle accumulation
First pass: instead of comparing against the previous frame, compare against the last pluck for THAT specific bus. Let the small transitions accumulate.
```typescript
// Compare against the last pluck for THIS bus; buses we haven't
// seen yet fall back to their current progress so get() never
// produces NaN on the first frame.
const last = lastPluckProgress.get(state.id) ?? state.progress;
const sinceLastPluck = Math.abs(state.progress - last);
if (sinceLastPluck > PLUCK_DELTA) {
  pluck(...);
  lastPluckProgress.set(state.id, state.progress);
}
```
Better. It was making sound now. But still bursting every 30s. And something else was bothering me: each bus was plucking its own string — the exact opposite of what I wanted. I wanted crossings.
The "aha" moment: intersections
I re-read Conductor carefully. The string doesn't sound from its own movement, it sounds when another string crosses it. A train on line 4 passing through the station where it crosses line N plucks line N's string. Your own line does nothing. The music is a product of the network, not of each line in isolation.
That changes everything. It means:
- I need to precompute intersections between all the strings.
- When a bus advances and its position crosses an intersection point, I pluck the OTHER line at that point, not its own.
Implementation:
```typescript
// lib/intersections.ts
export function buildIntersectionIndex(lines) {
  const byLine = new Map<string, Intersection[]>();
  for (let i = 0; i < lines.length; i++) {
    for (let j = i + 1; j < lines.length; j++) {
      const A = lines[i];
      const B = lines[j];
      // For each segment pair (A[a], A[a+1]) x (B[b], B[b+1])
      // we compute the 2D intersection. If it exists, we store:
      //   - progress along A where it happens
      //   - progress along B where it happens
      //   - XY point on screen
      //   - cross-reference: when A crosses, B sounds;
      //                      when B crosses, A sounds
    }
  }
  // Sort each line's intersections by progress
  // for O(log n) range-scan when a bus advances.
  for (const arr of byLine.values()) arr.sort((a, b) => a.progress - b.progress);
  return { byLine };
}
```
For 20 lines, 20 × 19 / 2 = 190 pairs, each with ~200 × 200 segment combinations = ~7.6M operations. Runs in ~50 ms on mount. After that, it gets used thousands of times per second with a simple range scan.
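The segment-pair test inside those loops is the standard parametric 2D intersection. A minimal version — illustrative, not the repo's exact code:

```typescript
type XY = { x: number; y: number };

// Intersection of segments p1–p2 and p3–p4, if any.
// Returns the point plus the parametric positions t (along A's
// segment) and u (along B's), which feed each line's progress value.
function segmentIntersection(p1: XY, p2: XY, p3: XY, p4: XY):
    { point: XY; t: number; u: number } | null {
  const d1 = { x: p2.x - p1.x, y: p2.y - p1.y };
  const d2 = { x: p4.x - p3.x, y: p4.y - p3.y };
  const denom = d1.x * d2.y - d1.y * d2.x;
  if (denom === 0) return null; // parallel or collinear: no single crossing
  const t = ((p3.x - p1.x) * d2.y - (p3.y - p1.y) * d2.x) / denom;
  const u = ((p3.x - p1.x) * d1.y - (p3.y - p1.y) * d1.x) / denom;
  if (t < 0 || t > 1 || u < 0 || u > 1) return null; // crossing lies outside a segment
  return { point: { x: p1.x + t * d1.x, y: p1.y + t * d1.y }, t, u };
}
```

`t` and `u` are exactly what gets converted into "progress along line A" and "progress along line B" before sorting.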
On each tick frame:
```typescript
const crossed = intersectionsCrossed(index, bondiLine, previousProgress, newProgress);
for (const hit of crossed) {
  // hit.other is the OTHER line. We pluck that one.
  pluck(hit.other, noteOf(hit.other));
}
```
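The core of `intersectionsCrossed` is a sweep over the per-line sorted array. A binary search for the lower bound gives the O(log n) part; a linear filter behaves identically for illustration (shape of `Hit` is my assumption):

```typescript
type Hit = { progress: number; other: string };

// Return the crossings a bus swept between two progress values,
// in either direction of travel. `sorted` is that line's
// intersection list, ordered by progress.
function intersectionsCrossed(sorted: Hit[], from: number, to: number): Hit[] {
  const lo = Math.min(from, to);
  const hi = Math.max(from, to);
  // Half-open on the low side so a crossing isn't fired twice
  // when the next frame starts exactly where this one ended.
  return sorted.filter(h => h.progress > lo && h.progress <= hi);
}
```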
I hit play. That's when it sounded like I wanted. For the first time the map felt like an instrument.
The next problem: poll pulse
But there were still bursts every 30 seconds. Mental trace:
- Between polls: buses "advance" very little (slow smoothing).
- Poll arrives: `serverProgress` jumps to the new position.
- Smoothing now has a massive diff → in the next frame it moves A LOT → crosses many intersections → many plucks at once.
The bug was architectural: I was treating a position correction as if it were movement. They're two different things.
The fix was to split into two distinct phases:
```typescript
// PHASE 1: real simulated advance, based on the speed reported by the feed.
// This is the ONLY phase that fires plucks.
const effectiveSpeed = Math.max(state.speed, DEFAULT_SPEED_MS);
const progressDelta = (effectiveSpeed * dt) / pLine.lengthMeters;
const sign = state.direction === 1 ? -1 : 1;
const simulatedProgress = state.progress + sign * progressDelta;

const crossed = intersectionsCrossed(index, line, state.progress, simulatedProgress);
// ...fire plucks...

// PHASE 2: correction toward serverProgress. Silent — fires no plucks.
const drift = state.serverProgress - simulatedProgress;
state.progress = simulatedProgress + drift * CORRECTION_RATE;
```
This has two beautiful effects:
- Buses move continuously even when the poll takes 30 seconds. The simulation advances them frame by frame using their reported speed and the actual length of their route.
- When the poll arrives, the correction is silent. The bus re-centers toward its real position at 2% per frame, without firing any plucks. The music keeps flowing.
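The two phases compose into one pure step function, which makes the invariant easy to check: simulation moves, correction only nudges. A sketch with my own constants, not the repo's exact values:

```typescript
const DEFAULT_SPEED_MS = 4;    // m/s floor for stopped or unreported buses
const CORRECTION_RATE = 0.02;  // 2% of the drift per frame

type BusState = {
  progress: number;       // 0..1 along the route
  serverProgress: number; // last position reported by the feed
  speed: number;          // m/s from the feed
  direction: 0 | 1;
};

// One frame: returns the progress the bus plucked through (phase 1)
// and the silently corrected progress (phase 2).
function step(s: BusState, dt: number, lengthMeters: number) {
  const speed = Math.max(s.speed, DEFAULT_SPEED_MS);
  const sign = s.direction === 1 ? -1 : 1;
  const simulated = s.progress + (sign * speed * dt) / lengthMeters;
  const corrected = simulated + (s.serverProgress - simulated) * CORRECTION_RATE;
  return { simulated, corrected };
}
```

Crossings are detected on the `progress → simulated` span only; the `simulated → corrected` nudge never fires a note, no matter how big the poll jump was.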
It went from metronome to concert.
The final push: density
It was sounding good, but with 9-20 active buses it still felt sparse. A user actually told me: "it barely makes any sound, I'm hearing something every 10-15 seconds."
Two final moves:
Double the lines: from 20 to 40
More lines = more intersections with the same buses. I added 20 more trunk lines (15, 17, 19, 20, 23, 26, 34, 37, 42, 44, 45, 46, 50, 53, 55, 56, 64, 65, 105, 160). The data/routes.json file grew from 250 KB to ~500 KB — still a trivial payload.
Auto-pluck as a bass pulse
While intersections provide melody, I added a self-pluck every 1.2% of progress traveled. Low intensity (0.25–0.55 vs 0.5–1 for crossings). It comes across as a soft bass, a steady pulse underneath which the intersection events make shapes.
```typescript
const advancedSinceSelf = Math.abs(simulatedProgress - state.lastSelfPluckProgress);
if (advancedSinceSelf > SELF_PLUCK_INTERVAL) {
  if (canPluck(state.lineShort, now, 260)) {
    const intensity = Math.max(0.25, Math.min(0.55, state.speed / 14));
    engineRef.current?.pluck(state.lineShort, note, intensity);
  }
  state.lastSelfPluckProgress = simulatedProgress;
}
```
And finally, a global rate limit: maximum 12 plucks per second (rolling 1-second window). If there's a storm of simultaneous crossings, the excess gets dropped. The music stays dense but readable.
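The limiter is a rolling one-second window. Roughly this, with names and constants of my own choosing:

```typescript
const MAX_PLUCKS_PER_SECOND = 12;
const recentPlucks: number[] = []; // timestamps (ms) of recent plucks

// Returns true if a pluck is allowed at time `now`; excess events
// inside the rolling 1-second window are simply dropped.
function tryPluck(now: number): boolean {
  // Evict events that have fallen out of the window.
  while (recentPlucks.length > 0 && now - recentPlucks[0] >= 1000) {
    recentPlucks.shift();
  }
  if (recentPlucks.length >= MAX_PLUCKS_PER_SECOND) return false;
  recentPlucks.push(now);
  return true;
}
```

Dropping (rather than queueing) the excess is deliberate: a delayed pluck would no longer correspond to the crossing that caused it.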
The final architecture, as a diagram
```
┌──────────────────────────────┐
│ Static GTFS (GCBA)           │  209 MB zip, downloaded
│ routes / trips / shapes      │  ONCE with pnpm gtfs:fetch
└──────────────┬───────────────┘
               │
               ▼
┌──────────────────────────────┐
│ scripts/build-routes.ts      │  simplifies to ~200 pts
│ numeric prefix matcher       │  per line (40 lines)
└──────────────┬───────────────┘
               │ writes JSON
               ▼
┌──────────────────────────────┐
│ data/routes.json (~500 KB)   │  committed to the repo
└──────────────┬───────────────┘
               │ static import
               ▼
┌──────────────────────────────┐
│ app/page.tsx (RSC)           │
└──────────────┬───────────────┘
               │ renders
               ▼
┌──────────────────────────────┐  poll    ┌────────────────────────────────┐
│ PlayerShell (Client)         │─────────▶│ /api/positions (Route Handler) │
│ ├─ ConductorEngine (Tone.js) │  30s     │ server-side proxy with creds   │
│ ├─ StringsMap (SVG)          │          └──────────────┬─────────────────┘
│ └─ IntersectionIndex (memo)  │                         │ fetch, cached 30s
└──────────────┬───────────────┘                         ▼
               │                          ┌────────────────────────────────┐
               │                          │ apitransporte.buenosaires      │
               │                          │ vehiclePositionsSimple         │
               │                          └────────────────────────────────┘
               ├─ 30 fps simulation based on reported speed
               ├─ crossing detection → pluck the crossed line
               ├─ auto-pluck every 1.2% self-progress
               ├─ global rate limit: 12 plucks/s
               └─ silent correction toward serverProgress
```
Key files
If you want to read the code, here are the hot spots:
- `lib/intersections.ts` — the MTA.me mechanic: precomputes all crossings, exposes `intersectionsCrossed(index, line, from, to)`.
- `lib/projection.ts` — `makeProjector` (lat/lon → SVG), `nearestOnPolyline` (snap bus to its route), `polylineMeters` (real length in meters to calibrate the simulation).
- `lib/sonify.ts` — `ConductorEngine`, one `PluckSynth` per line, shared reverb, mute/volume.
- `components/strings-map.tsx` — the heart: polling, simulation, crossing detection, SVG render, wobble visuals, pluck rings.
- `app/api/positions/route.ts` — the credentialed proxy.
Full pedagogical docs in /docs/arquitectura.md.
Real bugs, with commits
An honest list of what broke during development:
| Bug | Symptom | Fix | Commit |
|---|---|---|---|
| React 19 RC + framer-motion | Vercel deploy broken at `npm install` | Switch to React 19 stable + `.npmrc` `legacy-peer-deps=true` | trains: `fix(deps)` |
| `Tone.PolySynth<MetalSynth>` not assignable | TypeScript build error | Type the voice as `PolySynth<any>` | trains: `fix(sonify)` |
| GTFS fetch with no timeout | Vercel build hanging forever | `AbortController` + 15s timeout | trains: `fix(build-gtfs)` |
| GTFS URL returning 404 | `[routes]` responded 404 | Follow redirects with `curl -L`, find the real CDN URL | bondi: `feat:...` |
| Lines not matching | `couldn't find route_id for line 60` | Numeric prefix matcher + optional suffix | same |
| Plucks not firing | Total silence with few active buses | Compare against last-pluck-per-bus, not previous frame | same |
| Bursts every 30s | 20 notes firing at once on poll arrival | Separate simulation (→ plucks) from correction (→ silent) | same |
| Sparse music | Too sparse with ~20 buses | 40 lines + auto-pluck + global rate limit | same |
Every bug is a lesson. I left them visible in the repo — commit by commit, no cheating.
What I took from both chapters
Chapter 1 (Trenes Sonoros) taught me that when the ideal data doesn't exist, the work is adapting the problem to the available material — and saying so out loud. I made an honest piece with scheduled timetables.
Chapter 2 (Bondi Sonoro) taught me that when the ideal data does exist, the work is deciding what story to tell with it. And that architectural decisions are also aesthetic decisions: where the code runs, how data flows, what timbre you pick, what scale you use — it's all part of the same work.
Both chapters are part of the same craft: reading the data you have and deciding what to say with it. Sometimes you work with little and make it sound full. Sometimes you work with plenty and make it sound meaningful.
What's still open
- v3 with live shapes: the GCBA also publishes the full GTFS-RT in protobuf format, with richer signal (delays, cancellations). Consuming it as protobuf instead of simplified JSON would give access to events I'm not sonifying yet.
- Sympathetic resonance between intersections: when line A plucks line B, have line B lightly pluck line C if they're very close. A second layer of emergent reverberation.
- Recording + export: let the user hit "record" and generate a WAV of N minutes as a unique musical piece from that exact moment in the city.
- Other cities: Rosario, Córdoba, Mendoza all have static GTFS. If any of them ever publish a public GTFS-RT, the code is ready to go.
- Single-line mode: isolate the 60 or the 152 and listen to its own song across the day.
All of it lives in the mental backlog.
Closing thought
This project didn't make me money. It didn't go viral. It took more hours than I should probably admit. But there's one thing I took from it that applies directly to serious paid work:
The most instructive projects are the ones nobody asked for.
When there's no deliverable, there's no scope creep, no "just ship something that works." There's only you, the problem, and decisions made slowly. These experiments are where you sharpen the craft. Then you use it on the real job.
If you code, clone the repo, try changing the scale, add a line, fork it for your city. The code is MIT, the data belongs to the Argentine state, the music is collective, and the learning is yours.
Useful links
- 🎧 Demo: bondi-sonoro.vercel.app
- 💻 Code: github.com/JuanTorchia/bondi-sonoro
- 🚂 Chapter 1 (trains): amba-trenes-sonoros.vercel.app · repo
- 📡 GCBA API: apitransporte.buenosaires.gob.ar/console/
- 📜 Route data: CC-BY 2.5 AR / Gobierno de la Ciudad
- 🏙️ Original inspiration: Conductor (mta.me) by Alexander Chen
- 🎼 Tone.js: tonejs.github.io
- 🧮 @turf/turf: turfjs.org