DEV Community: Arjun

I reverse-engineered my motorcycle's Bluetooth protocol to put Google Maps on the dashboard

Arjun — Thu, 18 Jun 2026 03:41:11 +0000

My motorcycle has a Bluetooth instrument cluster. It pairs with the manufacturer's phone app and shows turn-by-turn navigation right on the dash, which sounds great until you actually use it. The nav is routed through a maps provider I don't love, the app is clunky, and there's no way to extend any of it.

I kept thinking: it's just my bike talking to my phone over Bluetooth. How locked down can it really be? So one weekend I decided to find out, and a few weeks later I had Google Maps navigation running on the cluster through an app I wrote myself.

Here's how that went.

There are no docs

Of course there aren't. It's a proprietary protocol, and the only reference that exists is the manufacturer's own app, in compiled form. So step one was just watching.

I started with a GATT walk on the live bike, which is the Bluetooth equivalent of knocking on every door to see what's there. The cluster exposes one vendor service with two characteristics: one the phone writes to, one the bike sends notifications back on. That's the entire conversation surface.

Then I captured the actual bytes going across. Android can log every Bluetooth packet through its HCI snoop log, so I paired the phone with the bike, rode around, and pulled the capture. Now I had real traffic, and absolutely no idea what any of it meant.
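
For anyone who wants to pick a capture like that apart by hand, here is a minimal sketch of walking a btsnoop file, which is the format Android's HCI snoop log is written in. It assumes the standard layout (a 16-byte file header, then records with a 24-byte header each, all big-endian). The names `parseBtsnoop` and `SnoopRecord` are made up for this example and do not come from any tool in this post.

```typescript
// Sketch: split a btsnoop capture into raw packet records.
// Function and type names are illustrative only.

interface SnoopRecord {
  originalLength: number; // packet length as seen on the wire
  flags: number;          // bit 0: 0 = sent by host, 1 = received by host
  data: Uint8Array;       // captured bytes (may be shorter than originalLength)
}

// "btsnoop" followed by a NUL byte
const MAGIC = [0x62, 0x74, 0x73, 0x6e, 0x6f, 0x6f, 0x70, 0x00];

function parseBtsnoop(bytes: Uint8Array): SnoopRecord[] {
  if (bytes.length < 16) throw new Error("file too short");
  for (let i = 0; i < MAGIC.length; i++) {
    if (bytes[i] !== MAGIC[i]) throw new Error("not a btsnoop file");
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const records: SnoopRecord[] = [];
  let offset = 16; // skip magic (8) + version (4) + datalink type (4)
  while (offset + 24 <= bytes.length) {
    const originalLength = view.getUint32(offset);     // big-endian by default
    const includedLength = view.getUint32(offset + 4);
    const flags = view.getUint32(offset + 8);
    // offset + 12: cumulative drops (4 bytes), offset + 16: timestamp (8 bytes)
    const start = offset + 24;
    const end = start + includedLength;
    if (end > bytes.length) break; // truncated trailing record, stop here
    records.push({ originalLength, flags, data: bytes.slice(start, end) });
    offset = end;
  }
  return records;
}
```

Each record's `data` is still a raw HCI packet, so the vendor characteristic writes have to be dug out of the ATT layer inside it. A tool like Wireshark does that part for you.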

Reading the app to read the protocol

You can stare at hex forever and still guess wrong. The faster path was the app itself. I pulled the APK, ran it through JADX to decompile it, and got something close to readable source. Most of the class names weren't even obfuscated, which was a gift.

From there it was cross-referencing: take a message I saw on the wire, find the code that builds it, and work out what each byte is. Frida helped a lot here. It lets you hook a running app and watch functions get called with their real arguments, so I could catch the exact moment the app turned "next turn is a left in 200m" into bytes and shipped them to the bike.

Slowly the shape came out. Every message is exactly 30 bytes: a fixed header byte, an ASCII character for the message type, a body, a checksum, and a terminator. I confirmed the checksum the right way, by finding the function that computes it in the decompiled source, instead of reverse-guessing it from samples. The bike turned out to be response-driven too. It sends nothing until the phone writes to it first, which is why a passive listener captures total silence and had me convinced the thing was dead the first time I tried.
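
To make that layout concrete, here is a minimal sketch of building and checking a frame of that shape. Only the 30-byte length and the field order come from the description above. The header value, terminator value, padding byte, one-byte checksum width, and the checksum rule itself are all placeholders invented for illustration, not the cluster's real values.

```typescript
// Sketch of a fixed-length frame: header, type char, body, checksum, terminator.
// HEADER, TERMINATOR, PAD and the checksum rule are PLACEHOLDERS.

const FRAME_LENGTH = 30;              // from the post: every message is 30 bytes
const HEADER = 0xa5;                  // placeholder value
const TERMINATOR = 0x7f;              // placeholder value
const PAD = 0x00;                     // placeholder padding for short bodies
const BODY_LENGTH = FRAME_LENGTH - 4; // assumes a one-byte checksum

// Placeholder checksum: sum of the given bytes, modulo 256.
function checksum(bytes: Uint8Array): number {
  let sum = 0;
  for (let i = 0; i < bytes.length; i++) sum = (sum + bytes[i]) & 0xff;
  return sum;
}

function buildFrame(type: string, body: Uint8Array): Uint8Array {
  if (type.length !== 1 || type.charCodeAt(0) > 0x7f) {
    throw new Error("type must be a single ASCII character");
  }
  if (body.length > BODY_LENGTH) throw new Error("body too long");
  const frame = new Uint8Array(FRAME_LENGTH);
  frame[0] = HEADER;
  frame[1] = type.charCodeAt(0);
  for (let i = 2; i < 2 + BODY_LENGTH; i++) frame[i] = PAD;
  frame.set(body, 2);
  // Checksum covers everything before it: header, type, and padded body.
  frame[FRAME_LENGTH - 2] = checksum(frame.subarray(0, FRAME_LENGTH - 2));
  frame[FRAME_LENGTH - 1] = TERMINATOR;
  return frame;
}

function isValidFrame(frame: Uint8Array): boolean {
  return (
    frame.length === FRAME_LENGTH &&
    frame[0] === HEADER &&
    frame[FRAME_LENGTH - 1] === TERMINATOR &&
    frame[FRAME_LENGTH - 2] === checksum(frame.subarray(0, FRAME_LENGTH - 2))
  );
}
```

A fixed length plus a checksum means the receiver can reject a corrupted or misaligned write without any length negotiation, which fits a simple two-characteristic service.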

Being wrong, on the record

The part I'm actually proud of isn't the protocol. It's how often I was wrong along the way and caught it.

Early on, a Bluetooth analyzer confidently labeled the bike's characteristics as a known digital-key security spec. I almost wrote that down as fact. It was just the tool pattern-matching on a UUID and guessing. The real thing was a plain vendor service with nothing fancy about it.

I also assumed for a while that the bike had its own SIM and was quietly phoning telemetry home to the manufacturer's cloud, which shaped a whole chunk of my thinking about where data came from. Then I actually checked the hardware and the spec sheet. No SIM. The bike talks to nothing but the paired phone. That one assumption had been steering me wrong for days.

So I started keeping a running log of every claim: what I assumed, what turned out to be true, and what evidence corrected it. Reverse engineering is mostly the discipline of not believing yourself too early.

Then I built the app

Once I understood the protocol, I got to the fun part. REDLINE is an Android app in Kotlin and Jetpack Compose. It intercepts Google Maps turn-by-turn notifications, encodes each maneuver into the cluster's frame format, and writes it over Bluetooth, so the dash shows Google's directions instead of the stock app's. On top of that it reads the telemetry the bike emits and turns it into a live dashboard, records every ride with stats and a speed graph, exports trips, and renders a clock to the cluster when nav is idle.

It's about 14k lines with 205 tests, and it runs entirely on the phone with no account and no cloud. The same frame inspector I used to reverse the protocol ships inside the app, because the work isn't really finished.

What it taught me

The protocol was never the hard part. The hard part was staying honest: treating the analyzer's label as a hypothesis, checking the hardware instead of trusting the spec in my head, writing down the wrong turns so I wouldn't repeat them. That habit has quietly made me better at regular software work too, where the bug is almost never where you first assume it is.

If you want the full breakdown, the frame formats, the tooling, and the walked-back assumptions, it's all written up here: https://www.arjunp.pro/projects/suzuki-connect-re.html

Stop trusting ‘looks about right’: I gave my AI agent a way to verify its UI against Figma

Arjun — Wed, 17 Jun 2026 16:47:51 +0000

I do a lot of UI work, and like a lot of people lately I've been letting an AI agent take the first pass. Point it at a Figma file, let it write the components, come back to something that's 90% there. On a good day that's a huge time saver.

The problem is the other 10%, and where it hides.

It's never an obvious break. It's the padding that's 12px instead of 16. A font weight that's 500 where the design says 600. A border radius that's a couple of pixels off. A gradient that starts at the wrong stop. Each one tiny, but together they're the difference between "looks like the design" and "looks like it was built by someone who sort of saw the design once." And the only way I was catching any of it was opening the Figma frame and the browser side by side and squinting back and forth like it's a spot-the-difference puzzle.

That got old fast. What bugged me most was that the agent had no idea it was wrong. It would read the design, write the code, and confidently tell me it matched. It couldn't check its own work. Every Figma tool I tried could feed it data about a node, but none of them could answer the actual question: does the thing you just built look like the thing the designer drew?

So I stopped squinting and built the missing piece. It's a local tool called figma-connect, and the part I care about is one function: verify_node.

The actual idea

verify_node takes the code the agent wrote, renders it in a real browser, and compares it pixel for pixel against the live Figma node. Pass or fail, with the diff image attached. That's it. The agent finally has a mirror.

There's a read side too (it can pull geometry, auto-layout, fills, type, tokens, components, the usual), but honestly the read part is table stakes. Plenty of tools do that. The verify part is the bit I hadn't seen anywhere, and it's the bit that changed how the agent behaves.

The whole thing runs on my laptop. It works with Figma in the browser, so there's no desktop app, no cloud API, and no design files leaving my machine.

How the check actually works

Give it a node id and the candidate code. It mounts the code in headless Chromium with Playwright, exports the matching node from Figma, and diffs them. I run three different comparisons at once, because I tried each one alone first and each one lied to me in its own way.

Raw pixel diffing catches the most: the 4px shift, the wrong radius, the moved gradient. But it's hysterical about a one-pixel global offset, screaming that everything's broken when the whole thing just nudged sideways. So I layered SSIM on top, which scores structural similarity and tracks closer to what a human would actually call "close." And then a text and accessibility pass with axe-core, because more than once I had a render that was pixel-perfect and had quietly dropped a label or lowercased a heading. Looking right and being right are not the same thing, and I learned that the annoying way.
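
Here is a minimal sketch of the first two comparisons, working on grayscale images stored as flat arrays of values from 0 to 255. It is a simplification under stated assumptions: the SSIM here is computed over the whole image as a single window, whereas real implementations average it over small local windows, and the function names are invented for this example. The accessibility pass is left out because it needs a real DOM.

```typescript
// Sketch: raw pixel mismatch and a single-window SSIM on grayscale pixels.
// Names are illustrative; real SSIM implementations use local windows.

// Fraction of pixels whose values differ by more than `tolerance`.
function mismatchRatio(a: number[], b: number[], tolerance: number = 0): number {
  if (a.length !== b.length) throw new Error("images must be the same size");
  if (a.length === 0) return 0;
  let bad = 0;
  for (let i = 0; i < a.length; i++) {
    if (Math.abs(a[i] - b[i]) > tolerance) bad++;
  }
  return bad / a.length;
}

// Global SSIM: 1 means structurally identical, lower means less similar.
function globalSsim(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("images must be the same non-zero size");
  }
  const n = a.length;
  let meanA = 0;
  let meanB = 0;
  for (let i = 0; i < n; i++) {
    meanA += a[i];
    meanB += b[i];
  }
  meanA /= n;
  meanB /= n;
  let varA = 0;
  let varB = 0;
  let cov = 0;
  for (let i = 0; i < n; i++) {
    const da = a[i] - meanA;
    const db = b[i] - meanB;
    varA += da * da;
    varB += db * db;
    cov += da * db;
  }
  varA /= n;
  varB /= n;
  cov /= n;
  // Standard stabilizing constants for 8-bit images.
  const c1 = (0.01 * 255) * (0.01 * 255);
  const c2 = (0.03 * 255) * (0.03 * 255);
  return ((2 * meanA * meanB + c1) * (2 * cov + c2)) /
    ((meanA * meanA + meanB * meanB + c1) * (varA + varB + c2));
}
```

The two numbers answer different questions. The mismatch ratio counts how many pixels changed, while SSIM compares brightness, contrast, and structure, which is why it helps to read them together rather than trusting either alone.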

The output is a labeled EXPECTED / ACTUAL / DIFF image. I did that on purpose. A bare similarity score is something an agent will happily rationalize ("0.94, close enough!"). A picture of exactly what's wrong is not.

The real win is that "looks about right" stopped being good enough. The agent now has a gate it has to pass before it can call something done.
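
A gate like that can be sketched as a function that only passes when every check passes and otherwise says why. This is an illustration, not the tool's actual code: the field names and threshold values are hypothetical.

```typescript
// Sketch: combine the three checks into one pass/fail verdict with reasons.
// Field names and thresholds are hypothetical.

interface CheckResults {
  mismatchRatio: number;    // fraction of differing pixels, 0..1
  ssim: number;             // structural similarity, 1 = identical
  a11yViolations: string[]; // e.g. rule ids reported by an accessibility pass
}

interface Thresholds {
  maxMismatchRatio: number;
  minSsim: number;
}

interface Verdict {
  pass: boolean;
  reasons: string[]; // empty when pass is true
}

function gate(results: CheckResults, limits: Thresholds): Verdict {
  const reasons: string[] = [];
  if (results.mismatchRatio > limits.maxMismatchRatio) {
    reasons.push("pixel mismatch " + results.mismatchRatio + " exceeds " + limits.maxMismatchRatio);
  }
  if (results.ssim < limits.minSsim) {
    reasons.push("SSIM " + results.ssim + " is below " + limits.minSsim);
  }
  for (let i = 0; i < results.a11yViolations.length; i++) {
    reasons.push("accessibility: " + results.a11yViolations[i]);
  }
  return { pass: reasons.length === 0, reasons };
}
```

Returning reasons alongside the boolean follows the same logic as attaching the diff image: a failure the agent can read is harder to rationalize away than a bare score.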

The stuff that actually ate my week

The render-and-diff idea took an afternoon. Making it trustworthy took way longer, because a verifier that fails for dumb reasons is worse than no verifier at all. The first time it cried wolf, I stopped trusting it, and that defeats the whole point.

The first thing that got me was fonts. I kept getting failures on text that looked identical, and I burned an embarrassing amount of time before I realized I was screenshotting before the web fonts had loaded. The render was comparing a fallback font against the design's real font and flagging the difference. Gating the capture on document.fonts.ready killed that entire class of false failures.
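
The fix amounts to one extra await before the screenshot. Here is a minimal sketch, written against a small `PageLike` interface modeled on the `evaluate` and `screenshot` methods of a Playwright page so the example stays self-contained. The interface and the `captureAfterFonts` name are assumptions for this example.

```typescript
// Sketch: never screenshot until the page reports its fonts are loaded.
// PageLike is a stand-in modeled on a Playwright page, not the real type.

interface PageLike {
  evaluate(expression: string): Promise<unknown>;
  screenshot(): Promise<Uint8Array>;
}

async function captureAfterFonts(page: PageLike): Promise<Uint8Array> {
  // document.fonts.ready is a promise that resolves once font loading and
  // layout have settled. Mapping it to `true` keeps the result serializable.
  await page.evaluate("document.fonts.ready.then(() => true)");
  return page.screenshot();
}
```

The ordering is the whole point. If the screenshot runs first, the render can still be showing a fallback font and every text node fails the diff.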

Then there was the wait strategy. I was waiting on networkidle before capturing, which is fine until you hit a page with a long-poll or a streaming connection, and then it just never idles. The verification would hang forever. I ripped out the blanket wait and replaced it with explicit readiness signals.
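
One way to express explicit readiness signals is as named promises, each with a deadline, so a signal that never arrives fails loudly instead of hanging the run. This is a sketch of that pattern, not the tool's implementation. The deadline is an addition for illustration, and the function names and signal names are hypothetical.

```typescript
// Sketch: wait on named readiness signals, each bounded by a deadline.
// The timeout is an assumption added for illustration.

function withDeadline<T>(ready: Promise<T>, timeoutMs: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`${label} not ready after ${timeoutMs} ms`));
    }, timeoutMs);
    ready.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Resolves once every signal has resolved; rejects on the first failure or timeout.
async function waitForReadiness(
  signals: { [name: string]: Promise<unknown> },
  timeoutMs: number,
): Promise<void> {
  const names = Object.keys(signals);
  await Promise.all(names.map((name) => withDeadline(signals[name], timeoutMs, name)));
}
```

Unlike a blanket network-idle wait, this only depends on the things the capture actually needs, so a long-poll or streaming connection elsewhere on the page no longer matters.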

The one I'm still not fully done with is fidelity versus budget. The summary of a node that I hand to the agent can't carry every property at full precision, or it blows the context window. So I had to make calls about what to keep exact and what to approximate, and then be honest about it. The digest now carries explicit flags for gradients, shadows, strokes, opacity, and masks, so the agent knows when a value is the real thing and when it's a best guess.
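
As a rough picture of what such flags could look like, here is a minimal sketch of a digest entry that records, per flagged property, whether the value it carries is exact. The five property names come from the list above. The type names, field names, and helper function are hypothetical.

```typescript
// Sketch: a budgeted digest that is explicit about what it approximated.
// Type and field names are hypothetical.

type FlaggedProperty = "gradients" | "shadows" | "strokes" | "opacity" | "masks";

const ALL_FLAGGED: FlaggedProperty[] = ["gradients", "shadows", "strokes", "opacity", "masks"];

interface NodeDigest {
  nodeId: string;
  summary: string;
  // true = the digest carries the exact value, false = a best-guess approximation
  exact: Record<FlaggedProperty, boolean>;
}

// Properties the agent should treat as unverified, in a stable order.
function approximatedProperties(digest: NodeDigest): FlaggedProperty[] {
  const result: FlaggedProperty[] = [];
  for (let i = 0; i < ALL_FLAGGED.length; i++) {
    if (!digest.exact[ALL_FLAGGED[i]]) result.push(ALL_FLAGGED[i]);
  }
  return result;
}
```

With flags like these, an agent can decide to fetch the full value for anything marked approximate before it writes code that depends on it.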

Under the hood, briefly

Small pnpm monorepo. A Figma plugin lives in the file (it's the only thing that can actually read the document). A local bridge daemon indexes the file into SQLite with full-text search, updates itself as the design changes, and exposes everything over MCP. A separate harness does the rendering and diffing. The agent talks to the daemon through a little stdio shim so the file stays indexed between sessions. Around 15k lines of TypeScript, 35 tools, all read-only except the verify step, bound to localhost only.

What it still can't do

Being honest about the edges, because I hate posts that pretend their thing is finished.

The search is lexical, not semantic, so it matches words that literally appear in a layer's name or text. A vibes-based query won't find a generically named group. The digest is budgeted, hence the fidelity flags. And it only reads; it never writes back to Figma, on purpose.
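
To show why a lexical search misses generically named layers, here is a small in-memory stand-in. The real index is SQLite full-text search, so this is only an approximation of the behavior: the `Layer` shape, the tokenizer, and the rule that every query word must appear are all assumptions for this example.

```typescript
// Sketch: literal word matching over layer names and text.
// A stand-in for a full-text index, with hypothetical names.

interface Layer {
  id: string;
  name: string;
  text: string;
}

function tokenize(input: string): string[] {
  return input.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// A layer matches only if every query word literally appears in its name or text.
function lexicalSearch(layers: Layer[], query: string): Layer[] {
  const wanted = tokenize(query);
  if (wanted.length === 0) return [];
  return layers.filter((layer) => {
    const have = tokenize(layer.name + " " + layer.text);
    return wanted.every((word) => have.indexOf(word) >= 0);
  });
}
```

A layer named "Pricing card" is found by the query "pricing", but a hero section that a designer left as "Group 12" is invisible to the query "hero section" because neither word appears anywhere in it.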

If you've ever watched an AI spit out UI that's subtly, confidently wrong and had no way to catch it except your own eyes, this was my attempt at giving it the feedback loop it was missing.

Full writeup with screenshots and the architecture: https://www.arjunp.pro/projects/figma-connect.html