It all started with some casual research. I was deep down the prediction market rabbit hole, trying to figure out if there was any real alpha in playing options strategies based on crypto crowd expectations.
Then I spotted it: the discrepancies. You could buy "YES" on one platform, grab "NO" on another, and pocket a risk-free spread. The spreads weren't massive, but they were definitely there.
Just to give you an idea, here is one of the tastiest setups from that time:
Imagine an NBA market like "New York at Chicago Winner?". On Kalshi you could buy "YES" for an average of 21.0¢, while on PredictFun you could scoop up the opposing "NO" for 22.3¢. Lock in both sides for roughly 43.3¢ total and you guarantee a $1.00 payout. Scaled up: drop $47k to secure both sides and walk away with about $109k. That's a massive +129% net yield (+$61k profit) after fees on a single event. A literal free money glitch.
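The math behind that setup is dead simple. Here's a minimal sketch (helper name and flat `fee_rate` are mine, standing in for whatever the platforms actually charge):

```rust
// Hypothetical helper (not from the original tracker): checks whether buying
// YES on one venue and NO on another locks in a guaranteed profit.
// Prices are in cents per $1.00 contract; `fee_rate` is an assumed flat fee.
fn arb_yield(yes_cents: f64, no_cents: f64, fee_rate: f64) -> Option<f64> {
    let cost = yes_cents + no_cents; // total cost to lock both sides
    if cost >= 100.0 {
        return None; // no edge: the combined legs cost more than the payout
    }
    let payout = 100.0 * (1.0 - fee_rate); // $1.00 payout minus fees
    Some((payout - cost) / cost) // net yield on capital deployed
}
```

With the prices above and zero fees this comes out a bit over +130% gross; fees shave it down to the +129% net from the screenshot.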
That sweet, sweet arbitrage FOMO hit me hard. My inner degen whispered: "We’re building a real-time monitor. Right now." I teamed up with Claude as my AI co-pilot, delegated the boilerplate, and started slapping together a tracker. After a quick recon, I pulled together 6 platforms: Polymarket, Kalshi, PredictFun, Proba, Limitless, and Opinion.
The tech side was actually pretty elegant. I whipped up a lightweight Rust architecture: REST requests to scrape all active markets, and WebSockets to pump the order books straight to the frontend. Watch the tape, analyze the spread, ape in. Everything worked flawlessly... until I got to the matching engine.
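The order-book side doesn't need much more than this kind of snapshot that the WebSocket messages get folded into (an assumed shape, not any platform's actual schema):

```rust
use std::collections::BTreeMap;

// Minimal order-book snapshot sketch: ask price in cents -> available size,
// kept sorted by price so the best (lowest) ask is always the first key.
struct Book {
    asks: BTreeMap<u32, f64>,
}

impl Book {
    // Best ask = the cheapest price you can buy at right now.
    fn best_ask(&self) -> Option<(u32, f64)> {
        self.asks.iter().next().map(|(&p, &s)| (p, s))
    }
}
```

Compare `best_ask` on the YES book of one platform against the NO book of another, and the spread check falls out for free.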
Matching the exact same real-world event across different platforms turned out to be the boss fight that completely fried my brain.
I designed a 4-stage filtering pipeline:
- Xref: PredictFun actually provides direct links to Poly/Kalshi markets. Easy money: 100% accuracy right out of the gate using a BFS traversal of the relationship graph.
- Normalized: If the normalized question string matched 1:1 across platforms, it's a match. This got me to roughly 95% accuracy.
- Fuzzy Jaccard: For the stragglers. I used an inverted index to find pairs with ≥3 shared words, calculated the Jaccard similarity, and slapped on a strict Entity Guard so it wouldn't accidentally merge "France vs Brazil" with completely different events. Finally, I used Union-Find to stitch transitive chains together.
- Merge: The final boss. Fusing markets that landed in both the Xref and Normalized buckets.
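Stages 2 and 3 above can be sketched roughly like this (all names here are mine, not the original code, and the Entity Guard crudely approximates "entities" as capitalized tokens in the raw question):

```rust
use std::collections::HashSet;

// Stage 2: lowercase, strip punctuation, collapse whitespace.
fn normalize(q: &str) -> String {
    q.to_lowercase()
        .chars()
        .filter(|c| c.is_alphanumeric() || c.is_whitespace())
        .collect::<String>()
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
}

// Stage 3: Jaccard similarity over the word sets of two normalized questions.
fn jaccard(a: &str, b: &str) -> f64 {
    let sa: HashSet<&str> = a.split_whitespace().collect();
    let sb: HashSet<&str> = b.split_whitespace().collect();
    let inter = sa.intersection(&sb).count() as f64;
    let union = sa.union(&sb).count() as f64;
    if union == 0.0 { 0.0 } else { inter / union }
}

// Entity Guard: only allow a fuzzy match when both raw questions mention the
// same capitalized tokens, so "France vs Brazil" can't merge with a
// different France event.
fn entity_guard(a_raw: &str, b_raw: &str) -> bool {
    let ents = |s: &str| -> HashSet<String> {
        s.split_whitespace()
            .filter(|w| w.chars().next().map_or(false, |c| c.is_uppercase()))
            .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()).to_lowercase())
            .collect()
    };
    ents(a_raw) == ents(b_raw)
}
```

The inverted index and Union-Find sit on top of these: the index just cuts the candidate pairs down to those sharing ≥3 words before `jaccard` runs, and Union-Find glues accepted pairs into transitive clusters.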
Sounds like a gigabrain setup, right? But I just couldn't hit that holy grail of 98% accuracy. Edge cases and false positives kept creeping in, entirely because platforms phrase their questions so wildly differently. Honestly, my patience just ran out. I was trying to speedrun the build while hyped on FOMO, and I hit a wall.
The takeaway? Regex and Jaccard aren't enough. You need heavy LLM artillery for this. The real solution is classifying every single question against a strict template, deploying an AI agent to audit and read the actual description rules of each market, and aggressively caching the verified matches. No big deal, just processing 2M+ variations! 😅
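The caching piece is the cheap part. A hypothetical sketch of what "aggressively caching the verified matches" could look like (the struct and its keying are my assumption, not a built artifact):

```rust
use std::collections::HashMap;

// Hypothetical verified-match cache: once an LLM (or a human) confirms that a
// platform question maps to a canonical event, store the mapping so the
// expensive check never runs twice for the same question.
struct MatchCache {
    verified: HashMap<(String, String), String>, // (platform, question) -> event id
}

impl MatchCache {
    fn new() -> Self {
        Self { verified: HashMap::new() }
    }

    fn record(&mut self, platform: &str, question: &str, event_id: &str) {
        self.verified
            .insert((platform.to_string(), question.to_string()), event_id.to_string());
    }

    fn lookup(&self, platform: &str, question: &str) -> Option<&String> {
        self.verified.get(&(platform.to_string(), question.to_string()))
    }
}
```

Since market questions are immutable once listed, a hit here means you never pay the LLM twice, which is what makes 2M+ variations survivable.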
I originally planned to ship this monitor to the public, but because of those inaccuracies, I tossed it in the drawer. PredictWit is on pause for now, but the idea is still sitting there, waiting for its time.

