We Lost $323. Then We Found a 75% Win Rate. Here's Everything That Happened.

#ai #agentaichallenge #agentskills #webdev

Originally published at nulllimit.gg/blog/kalshi-trading-bot

This isn't a hedge fund story. There's no VC money, no quant team, no Bloomberg terminal. It's two of us — me and Mewtwo, my AI ops agent powered by Claude — building an automated trading engine from a Lenovo ThinkCentre M75n running Ubuntu in my living room.
The target: Kalshi's binary prediction market for Solana price movements. Specifically KXSOL15M — will SOL's price be higher in the next 15 minutes? You bet YES or NO. You're right or you're wrong. No hedging. No partial positions.
For an AI system, that's actually perfect. It's a pure classification problem. The question was whether we could design one good enough.

Phase 1: Paper Trading (April 4–12)
Before we risked a dollar, we built a scoring engine using 8 technical indicators: RSI, EMA crossovers, Volume Spikes, Momentum, Bollinger Bands, HTF Alignment, Order Book Imbalance, and Sentiment Analysis. Each signal contributes points to a directional score. When the score passes a threshold, the system places a bet.
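A minimal sketch of how that kind of point-based scoring can work. The weights, the 60-point threshold, and the function name `scoreSignals` are placeholders for illustration, not the engine's real values.

```javascript
// Hypothetical point values per indicator (assumed, not the real weights).
const WEIGHTS = {
  rsi: 15, emaCross: 20, volumeSpike: 10, momentum: 15,
  bollinger: 10, htfAlignment: 15, orderBook: 10, sentiment: 5,
};

// votes maps each indicator name to "UP", "DOWN", or null (no opinion).
function scoreSignals(votes, threshold = 60) {
  let up = 0;
  let down = 0;
  for (const [name, vote] of Object.entries(votes)) {
    const points = WEIGHTS[name] ?? 0;
    if (vote === "UP") up += points;
    else if (vote === "DOWN") down += points;
  }
  // The net score is the margin between the two directions.
  const direction = up >= down ? "UP" : "DOWN";
  const score = Math.abs(up - down);
  return { direction, score, trade: score >= threshold };
}

const result = scoreSignals({
  rsi: "UP", emaCross: "UP", volumeSpike: "UP",
  momentum: "UP", htfAlignment: "UP", sentiment: "DOWN",
});
console.log(result); // UP wins 75 points to 5, so the net score is 70
```

Netting the two directions against each other is one design choice among several; a system could just as well score UP and DOWN independently and require a minimum on the winning side.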
We ran 69 paper trades. The patterns were clear immediately.
Session Breakdown:
| Session | Win Rate | Decision |
| --- | --- | --- |
| NY Afterhours | 68% | Golden window: keep |
| NY Market | 59% | Keep, raise threshold |
| NY Evening | 38% | Disabled |
| NY Overnight | 38% | Disabled |
Direction Breakdown:
| Direction | Win Rate | Decision |
| --- | --- | --- |
| UP signals | 69% | Keep |
| DOWN signals | 45% | Disabled |
The biggest decision: we had added a hard RSI filter to skip trades when RSI was in a danger zone. Logical. Protective. And completely wrong. Out of 11 trades the filter blocked, 10 would have been winners. 9% accuracy rate. We deleted it immediately.

Your best defense can sometimes be your worst offense. The RSI filter was logically sound. It had a 9% accuracy rate. Delete it.
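The 9% figure is just a counterfactual tally: for every trade the filter blocked, ask what would have happened without it. A small sketch of that check, where `filterAccuracy` and the outcome labels are illustrative names:

```javascript
// A filter is "right" only when the trade it blocked would have lost.
function filterAccuracy(blockedOutcomes) {
  if (blockedOutcomes.length === 0) return null;
  const correctBlocks = blockedOutcomes.filter((o) => o === "loss").length;
  return correctBlocks / blockedOutcomes.length;
}

// The paper-trading numbers: 11 blocked trades, 10 would have won.
const blocked = [...Array(10).fill("win"), "loss"];
const accuracy = filterAccuracy(blocked);
console.log(`${(accuracy * 100).toFixed(0)}%`); // 1 of 11, which rounds to 9%
```

Logging blocked trades alongside executed ones is what makes this measurable at all; a filter whose rejections are never recorded cannot be audited.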

Phase 2: The Technical Hell of Going Live (April 12–15)
Going live was supposed to be simple. Write live-trade.js, flip a switch. It took 3+ days of debugging before the first real trade landed.
Problem 1: live-trade.js didn't exist. The previous AI session had described building it. It never actually wrote the file. When the first qualifying signal fired — crash. File not found.
Problem 2: Kalshi ask prices returning undefined. The /markets list endpoint returns prices as yes_ask_dollars — dollar values, not cents. We were looking for the old format. Fixed by reading yes_ask_dollars directly.
Problem 3: Markets were "initialized" but not tradeable. Kalshi creates markets in advance with status initialized. These show yes_ask_dollars: 0.0000. Fixed by filtering for openTime ≤ now < closeTime and checking ask price after a secondary fetch.
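A hedged sketch of the checks behind Problems 2 and 3. The `yes_ask_dollars` field is the one named above; the `openTime` and `closeTime` properties (ISO strings here) and both function names are assumptions about how a bot might normalize the market object, not Kalshi's documented response shape.

```javascript
// Read the ask as a dollar amount; returns null when the field is missing.
function askDollars(market) {
  const ask = Number.parseFloat(market.yes_ask_dollars);
  return Number.isFinite(ask) ? ask : null;
}

// Tradeable means: inside the open/close window AND showing a real ask.
// Pre-created markets report an ask of 0.0000 and are rejected here.
function isTradeable(market, now = new Date()) {
  const open = new Date(market.openTime).getTime();
  const close = new Date(market.closeTime).getTime();
  const t = now.getTime();
  const inWindow = open <= t && t < close;
  const ask = askDollars(market);
  return inWindow && ask !== null && ask > 0;
}
```

Checking the ask in addition to the time window matters because a market can be inside its window on paper while still not quoting a usable price.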
Problem 4: Trades never resolved. Kalshi marks filled orders as status: "executed" with settlement_status: "N/A" — not "filled" and "settled" as our code expected. Fixed by falling back to price-movement resolution.
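One way to picture the price-movement fallback, as a sketch only. The `"executed"` status is the value described above; the function names, the price inputs, and the tie rule (a flat price counts as "not higher") are assumptions, so check the market's own settlement rules before relying on them.

```javascript
// An order counts as filled when Kalshi reports it as "executed".
function isFilled(order) {
  return order.status === "executed";
}

// Decide the outcome from the underlying price move over the window.
// Assumption: the market asks "higher?", so an unchanged price is not UP.
function resolveByPriceMovement(direction, openPrice, closePrice) {
  if (!Number.isFinite(openPrice) || !Number.isFinite(closePrice)) {
    return "unresolved";
  }
  const wentUp = closePrice > openPrice;
  if (direction === "UP") return wentUp ? "win" : "loss";
  if (direction === "DOWN") return wentUp ? "loss" : "win";
  return "unresolved";
}
```

A fallback like this only approximates the exchange's settlement, so it is worth reconciling against the account balance whenever that is available.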
Problem 5: balance.json kept resetting to $500. A subtle bug in loadBalance() was re-creating the default balance on certain error conditions. Added explicit try/catch with logging.
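A sketch of the safer pattern, assuming `balance.json` holds an object like `{ "balance": 500 }` (the exact file shape is a guess). The key idea is that only a missing file creates the default; every other failure is logged and rethrown.

```javascript
const fs = require("node:fs");

// Assumed default for a first run.
const DEFAULT_BALANCE = { balance: 500 };

function loadBalance(path) {
  try {
    return JSON.parse(fs.readFileSync(path, "utf8"));
  } catch (err) {
    if (err.code === "ENOENT") {
      // First run: the file does not exist yet, so create it.
      fs.writeFileSync(path, JSON.stringify(DEFAULT_BALANCE));
      return { ...DEFAULT_BALANCE };
    }
    // Bad JSON, permissions, etc.: log and rethrow so the real balance
    // is never silently replaced with the default.
    console.error(`loadBalance failed for ${path}: ${err.message}`);
    throw err;
  }
}
```

Failing loudly costs a missed trade or two; silently resetting to $500 corrupts every sizing decision that follows.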
Problem 6: Cron job was reading the wrong file. The payload was reading from pre-check.js instead of running radar.js. The bot wasn't actually trading. Fixed.

Phase 3: The Losing Streak
5 consecutive live losses. $323 lost. Balance: $167.84.

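| # | Date | Direction | Score | Session | P&L |
| --- | --- | --- | --- | --- | --- |
| 1 | Apr 13, 6:55 PM | UP | 65 | Afterhours | −$124.46 |
| 2 | Apr 14, 9:50 AM | UP | 70 | NY Market | −$49.88 |
| 3 | Apr 15, 5:20 PM | DOWN | 75 | Afterhours | −$49.40 |
| 4 | Apr 15, 6:10 PM | UP | 60 | Afterhours | −$49.50 |
| 5 | Apr 15, 6:40 PM | UP | 60 | Afterhours | −$49.68 |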

Trade 1 used $124 sizing — we hadn't fixed the 25% compounding logic yet. Trades 4 and 5 had RSI overbought conditions and price above the Bollinger upper band. We were taking UP signals in conditions that directly contradicted an UP thesis. That's not a bad market. That's a bad system.

Phase 4: 15 Optimization Rounds in One Day
We built a backtesting system from scratch. The Node.js version crashed repeatedly. We ported it to Python — it hit errors. We ported it back to Node.js. Then we ran 15 rounds of optimization.
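| Round | Win Rate | Trades | Notes |
| --- | --- | --- | --- |
| 1 | 51.16% | 215 | Initial run: coin flip |
| 2–4 | 22% | 19–27 | Too strict, mostly losses |
| 5–6 | n/a | 0 | Filters too aggressive |
| 7–8 | 50.0% | 182 | Breakeven with both directions |
| 10–11 | 49.73% | 185 | Stuck at coin flip |
| 12 | n/a | 0 | Over-optimized to nothing |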
After Round 12 we realized: this is not a filtering problem. RSI, EMA, Bollinger Bands, Momentum: whenever the filters let enough trades through to measure, these indicators produced a win rate of roughly 50%. Tightening the filters only shrank the sample or made the results worse. The signal set doesn't have a predictive edge. Everything had to change.
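"Coin flip" can be checked with a quick calculation. The sketch below uses the normal approximation to the binomial, which is a standard technique brought in here for illustration rather than something the backtester ran. The 110-win count is derived from Round 1's 51.16% of 215 trades.

```javascript
// z-score of an observed win rate against a fair 50% coin flip.
function zScoreVsCoinFlip(wins, trades) {
  const observed = wins / trades;
  const standardError = Math.sqrt(0.25 / trades); // sqrt(p(1-p)/n) at p = 0.5
  return (observed - 0.5) / standardError;
}

// Round 1: 110 wins out of 215 trades is 51.16%.
const z = zScoreVsCoinFlip(110, 215);
console.log(z.toFixed(2)); // about 0.34, far below the usual 1.96 cutoff
```

A z-score that small means a fair coin would produce a result at least this good most of the time, which is why no amount of threshold tuning was going to rescue that signal set.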

Phase 5: The Radical Shift (Round 13)
We threw out every indicator we'd built and started over with price action patterns and mean reversion logic.
The new signal set:

Bearish Engulfing → DOWN (+100 pts): A large red candle that completely engulfs the previous green candle.
Bearish Pin Bar / Shooting Star → DOWN (+80 pts): Long upper wick, small body. Price rejected hard at the highs.
Extreme Undershoot vs 50-EMA → UP (+120 pts): Price significantly below the 50-period EMA. Mean reversion pull toward fair value. Our strongest single signal.
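The three patterns above can be sketched as follows. The point values come from the list; everything else is an assumption made for illustration: the candle shape `{ open, high, low, close }`, the wick-to-body ratio of 2, and the 1% undershoot cutoff.

```javascript
// Standard exponential moving average, seeded with the first close.
function ema(closes, period) {
  const k = 2 / (period + 1);
  return closes.reduce((prev, c, i) => (i === 0 ? c : c * k + prev * (1 - k)), closes[0]);
}

// Red candle whose body fully covers the previous green candle's body.
function isBearishEngulfing(prev, cur) {
  const prevGreen = prev.close > prev.open;
  const curRed = cur.close < cur.open;
  return prevGreen && curRed && cur.open >= prev.close && cur.close <= prev.open;
}

// Long upper wick, small body, little lower wick (wickRatio is assumed).
function isShootingStar(c, wickRatio = 2) {
  const body = Math.abs(c.close - c.open);
  const upperWick = c.high - Math.max(c.open, c.close);
  const lowerWick = Math.min(c.open, c.close) - c.low;
  return body > 0 && upperWick >= wickRatio * body && lowerWick <= body;
}

// Last close sits more than pct below the EMA (pct = 1% is assumed).
function isExtremeUndershoot(closes, period = 50, pct = 0.01) {
  if (closes.length < period) return false;
  return closes[closes.length - 1] < ema(closes, period) * (1 - pct);
}

// Combine the patterns into directional points for the latest candle.
function scoreCandles(candles) {
  if (candles.length < 2) return { up: 0, down: 0 };
  const prev = candles[candles.length - 2];
  const cur = candles[candles.length - 1];
  let up = 0;
  let down = 0;
  if (isBearishEngulfing(prev, cur)) down += 100;
  if (isShootingStar(cur)) down += 80;
  if (isExtremeUndershoot(candles.map((c) => c.close))) up += 120;
  return { up, down };
}
```

Definitions of these candle patterns vary between sources, so the exact comparisons here (bodies only, inclusive bounds) are one reasonable reading rather than a canonical one.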

Round 13 results: 4 trades, 4 wins, 100% win rate. Too few trades — but the direction was clear. We relaxed the filters.
Round 14: 203 trades, 57.64% win rate, +$310.
Then we got surgical. We removed every signal and session that wasn't pulling its weight:
| Removed | WR | Reason |
| --- | --- | --- |
| Bullish Engulfing | 33% | Liability: actively hurting us |
| Bullish Pin Bar | 52.94% | Not strong enough |
| NY Market session | 50% | Coin flip: cut it |
| Extreme Overshoot | 62.96% | Below threshold |
| NY Evening | 57.69% | Below threshold |
Final result: 97 trades, 75% win rate, +$330 PnL, 6% max drawdown.
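For reference, this is roughly how headline numbers like these fall out of a list of per-trade results. The sketch assumes max drawdown means the largest peak-to-trough drop as a fraction of the peak balance, which is a common definition but may not be the exact one behind the 6% figure; `summarize` is an illustrative name.

```javascript
// pnls: profit or loss per trade in dollars, in chronological order.
function summarize(pnls, startingBalance) {
  let balance = startingBalance;
  let peak = startingBalance;
  let maxDrawdown = 0;
  let wins = 0;
  for (const pnl of pnls) {
    balance += pnl;
    if (pnl > 0) wins += 1;
    peak = Math.max(peak, balance);
    maxDrawdown = Math.max(maxDrawdown, (peak - balance) / peak);
  }
  return {
    trades: pnls.length,
    winRate: pnls.length > 0 ? wins / pnls.length : 0,
    pnl: balance - startingBalance,
    maxDrawdown,
  };
}

// Toy data, not real backtest output.
const summary = summarize([10, -5, 10, 10], 100);
console.log(summary.winRate, summary.pnl); // 0.75 25
```

Win rate alone hides sizing effects, which is why PnL and drawdown are worth reporting next to it.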

The Final Deployed Strategy
Active Signals:

Bearish Engulfing → DOWN (+100 pts)
Bearish Pin Bar → DOWN (+80 pts)
Extreme Undershoot vs 50-EMA → UP (+120 pts)

Active Sessions:

NY Afterhours (4–8 PM EDT): 71.43% WR
NY Overnight (midnight–9 AM EDT): 75.61% WR (+20pt bonus)

Filters:

HTF 1-hour trend: skip UP if bearish, skip DOWN if bullish
Score threshold: 70+
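Put together, the session and filter rules above amount to a single gate. This sketch takes the Eastern-time hour as an input to sidestep timezone handling, and it assumes the overnight bonus is added before the threshold check; the function names and the `"neutral"` trend value are illustrative.

```javascript
const SCORE_THRESHOLD = 70;
const OVERNIGHT_BONUS = 20;

// hourET: hour of day (0-23) in US Eastern time.
function activeSession(hourET) {
  if (hourET >= 16 && hourET < 20) return "NY Afterhours"; // 4-8 PM
  if (hourET >= 0 && hourET < 9) return "NY Overnight"; // midnight-9 AM
  return null; // every other session is disabled
}

// signal: { direction: "UP" | "DOWN", score }
// htfTrend: "bullish" | "bearish" | "neutral" (1-hour trend)
function shouldTrade(signal, hourET, htfTrend) {
  const session = activeSession(hourET);
  if (session === null) return false;
  if (signal.direction === "UP" && htfTrend === "bearish") return false;
  if (signal.direction === "DOWN" && htfTrend === "bullish") return false;
  const bonus = session === "NY Overnight" ? OVERNIGHT_BONUS : 0;
  return signal.score + bonus >= SCORE_THRESHOLD;
}

console.log(shouldTrade({ direction: "UP", score: 120 }, 17, "neutral")); // true
```

Keeping the gate as one pure function makes it easy to run the same logic in the backtester and the live bot, so the two cannot drift apart.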

Risk Management:

Base trade size: $10
Compounding: +25% of last win
Max size: $500
3 consecutive wins → reset to $10
20% drawdown from peak → cap at $10
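The sizing rules above leave some room for interpretation, so treat this as one possible reading rather than the deployed logic. It assumes "+25% of last win" means adding a quarter of the last win's profit to the previous size, and that any loss resets the size to the base; neither detail is spelled out in the rules.

```javascript
const BASE_SIZE = 10;
const MAX_SIZE = 500;
const COMPOUND_RATE = 0.25;
const STREAK_RESET = 3;
const DRAWDOWN_CAP = 0.2;

// state: { lastSize, lastPnl, winStreak, balance, peakBalance }
function nextTradeSize(state) {
  const { lastSize, lastPnl, winStreak, balance, peakBalance } = state;
  const drawdown = peakBalance > 0 ? (peakBalance - balance) / peakBalance : 0;
  if (drawdown >= DRAWDOWN_CAP) return BASE_SIZE; // 20% off peak: cap at base
  if (lastPnl <= 0) return BASE_SIZE; // assumed: a loss resets sizing
  if (winStreak >= STREAK_RESET) return BASE_SIZE; // bank the streak
  return Math.min(MAX_SIZE, lastSize + COMPOUND_RATE * lastPnl);
}

const size = nextTradeSize({
  lastSize: 10, lastPnl: 8, winStreak: 1, balance: 518, peakBalance: 518,
});
console.log(size); // 10 + 25% of the $8 win
```

Checking the drawdown cap first means no streak or compounding rule can override it, which is the conservative ordering.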

What This Actually Teaches

Live testing before backtesting is expensive tuition. We lost $323 learning lessons that a proper backtesting system could have taught us for free. Build the backtester first. Always.
Lagging indicators don't predict; they describe. RSI, EMA crossovers, Bollinger Bands, Momentum: across the first 12 optimization rounds, every configuration that still produced a meaningful number of trades landed at roughly 50%. In our backtest, price action patterns and mean reversion were the first signals to show a real edge.
Session matters more than signal quality. NY Market — the "obvious" trading window — was our worst performer. Overnight and Afterhours were where the edge lived. The market behaves differently when Wall Street isn't watching.
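Directional asymmetry is real. With the original indicators, DOWN signals were unreliable from day one (45% in paper trading). With price action patterns the asymmetry flipped for candles: the bearish patterns made the final cut, while Bullish Engulfing won only 33% of the time. Mean reversion UP was our strongest single signal. Don't fight the asymmetry: disable what doesn't work and double down on what does.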
AI + human collaboration actually works. Mewtwo wrote 95% of the code and ran all the analysis. I provided strategic direction, caught errors the AI missed, and made the calls on when to pivot. Neither of us could have done this alone at this speed.
Persistence past Round 12 is the whole game. We rebuilt the backtester three times. We ran 15 optimization rounds in a single day. We nearly gave up after Round 12. Round 13's radical shift changed everything. The breakthrough was one pivot away.

What's Next
The strategy is deployed. The bot is live on the M75n, running every 5 minutes, placing real orders. The backtested win rate is 75%. Now we need live validation across 30+ trades to know whether that holds in production.
If it does: Can this apply to other Kalshi markets — BTC, ETH, macro events? Can the mean reversion logic generalize to Polymarket or Metaculus? What happens when we expand the candle pattern set?
The bot is running. The results will tell us if we were right.

Want to build your own autonomous system?
This entire engine — architecture, debugging, strategy discovery — was built by one human and one AI in under two weeks. The barrier isn't capital or education. It's the willingness to build something that might not work, test it ruthlessly, and iterate.
→ Get an AI agent setup: nulllimit.gg/agents
→ Read the full framework: nulllimit.gg/ebooks
