Alex Chen

Building an Autonomous Crypto Trading Bot

I've been spending too much time inside trading bot codebases lately. Most of them are one of two things: a 200-line Jupyter notebook that someone calls a "system," or a sprawling monorepo where the strategy logic and exchange integration are so tangled that you can't swap exchanges without rewriting half the code.

A few weeks ago I went deep on AlphaStrike, a production-grade crypto perpetual futures bot. Not because the returns were headline-grabbing (though a 2.4 Sharpe is nothing to sneeze at), but because the architecture solves problems most of us hand-wave past. I want to walk through what's interesting, what's novel, and what I'd steal for my own projects.

The Problem Space

Algorithmic crypto trading sounds simple at the whiteboard: read prices, predict direction, place orders, manage risk. In practice, every layer of that stack will try to kill you.

  • Exchanges are inconsistent. WEEX, Binance, Hyperliquid — every one has different symbol formats, different REST paradigms, different WebSocket lifecycles, different ways of representing a position.
  • Models decay. A signal that worked last quarter doesn't work this quarter. Pretending otherwise is how accounts get blown up.
  • Volatility is non-stationary. Static leverage and fixed position sizes are a lie you tell yourself until you wake up at -40% drawdown.
  • Pure quant is fragile. Numbers don't know that the SEC just sued the second-largest exchange.

AlphaStrike's design isn't trying to be the smartest bot. It's trying to be the bot that's still alive in 12 months. That's a different optimization target, and it shows.

The Architecture, Top-Down

```
EXCHANGE → DATA GATEWAY → FEATURE LAYER → FEATURE VALIDATOR
                                                    │
                                                    ▼
EXECUTION ← RISK LAYER ← STRATEGY LAYER ← ML LAYER
```

Eight stages, every one of them able to halt the pipeline on its own. That's the first lesson: every layer is a potential circuit breaker. If features fail validation (PSI drift, KS test, CUSUM), no signal reaches the model. If the risk layer flags exposure, no order reaches the exchange. Fail-closed by default.

Let me walk through the four pieces I actually want to talk about.

1. Exchange Abstraction Done Right

This is where most trading bots rot. AlphaStrike defines two Protocol classes — ExchangeRESTProtocol and ExchangeWebSocketProtocol — and every adapter (WEEX, Hyperliquid, Binance, generic OpenAPI) implements them. The trading logic only talks to the unified protocol.

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class ExchangeRESTProtocol(Protocol):
    async def get_ticker(self, symbol: str) -> UnifiedTicker: ...
    async def place_order(self, order: UnifiedOrder) -> UnifiedOrderResult: ...
    async def get_positions(self, symbol: str | None = None) -> list[UnifiedPosition]: ...
    async def set_leverage(self, symbol: str, leverage: int) -> bool: ...
```

The unified data models (UnifiedOrder, UnifiedPosition, UnifiedCandle) are the contract. Every adapter has a mappers.py that translates between exchange-native shapes and the unified shapes. Symbol normalization happens at the adapter boundary — internally everything is BTCUSDT, externally it becomes cmt_btcusdt or whatever WEEX wants this week.
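A mapper at that boundary can be tiny. Here's a hedged sketch of what symbol normalization might look like; the function names and the `cmt_` prefix convention are assumptions based on the article, not AlphaStrike's actual `mappers.py`:

```python
# Illustrative adapter-boundary symbol mapper (hypothetical names).
# Internally everything is the unified form; the native form only exists
# inside the adapter.
_WEEX_PREFIX = "cmt_"

def to_exchange_symbol(unified: str) -> str:
    """BTCUSDT -> cmt_btcusdt (exchange-native shape)."""
    return _WEEX_PREFIX + unified.lower()

def to_unified_symbol(native: str) -> str:
    """cmt_btcusdt -> BTCUSDT (internal unified shape)."""
    return native.removeprefix(_WEEX_PREFIX).upper()
```

If the exchange changes its format, you change two functions in one file and nothing else.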

Why I care: I've shipped trading code where exchange-specific assumptions leaked into the strategy. It's death by a thousand `if exchange == "binance"` cuts. The Protocol-based approach keeps the boundary honest. You add a new exchange by writing one adapter file, not by hunting through the codebase.

2. The ML Layer That Doesn't Trust Itself

The signal pipeline runs 12 categories of weak signals — order flow, microstructure, volatility, correlation, sentiment, seasonality, statistical, price action, volume, derivatives, alternative, macro — and combines them through a regime-aware ensemble. This is the explicitly Renaissance/Medallion-inspired bit, and the backtest deltas are real:

| Metric | Single Signal | 12-Category Ensemble |
| --- | --- | --- |
| Sharpe | 1.2 | 2.4 |
| Win Rate | 52% | 58% |
| Max Drawdown | -15% | -8% |

But the part I find genuinely novel is the signal decay tracker. Every signal logs its predictions, the system records outcomes, and signals get auto-retired when their rolling accuracy drops below 48%. Weight is (edge × 2)², so signals with real edge get amplified and weak signals fade out without anyone touching code.

```python
edge = accuracy - 0.5            # 0.52 accuracy → 0.02 edge
weight = (edge * 2) ** 2         # quadratic weighting of strong signals
if accuracy < 0.48:
    signal.retire()
```

This is the right way to do it. Most "ensemble" systems use static weights tuned once and forgotten. Here the weights are alive — they update with reality. Models that lose their edge get fired by the system itself.
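A minimal self-retiring tracker built on that rule might look like this. The class name, window size, and rolling-window mechanics are my assumptions; only the 0.48 retirement threshold and the quadratic weight come from the article:

```python
# Hypothetical sketch of a self-retiring signal tracker.
from collections import deque

class SignalTracker:
    def __init__(self, window: int = 200, retire_below: float = 0.48):
        self.outcomes: deque[bool] = deque(maxlen=window)  # True = correct call
        self.retire_below = retire_below
        self.retired = False

    def record(self, correct: bool) -> None:
        self.outcomes.append(correct)
        # Only judge a signal once the rolling window is full.
        if len(self.outcomes) == self.outcomes.maxlen and self.accuracy() < self.retire_below:
            self.retired = True  # auto-retire: stops contributing to the ensemble

    def accuracy(self) -> float:
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 0.5

    def weight(self) -> float:
        if self.retired:
            return 0.0
        edge = self.accuracy() - 0.5
        return (edge * 2) ** 2   # quadratic: strong edges dominate, weak ones fade
```

The quadratic shape is the key design choice: a 52%-accurate signal gets weight 0.0016 while a 60%-accurate one gets 0.04, a 25x difference from an 8-point accuracy gap.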

3. Dynamic Leverage as a First-Class Citizen

Static leverage is the crypto equivalent of running with scissors while drunk. AlphaStrike treats leverage as a continuous control variable:

```text
leverage = base × vol_factor × dd_factor × perf_factor

vol_factor  = normal_vol / current_vol     # clamped to [0.3, 1.5]
dd_factor   ∈ {1.0, 0.7, 0.5, 0.3}         # tiered by drawdown depth
perf_factor = half_kelly_fraction          # ranges 0.6 to 1.2
```

Real scenarios from the doc:

| Conditions | Leverage |
| --- | --- |
| Normal | 5.0x |
| High vol (5%) | 2.0x |
| In 12% drawdown | 2.5x |
| Strong perf + low vol | 9.0x |
| All bad (high vol + DD + losing) | 1.0x |
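As a sanity check, the multiplicative rule reproduces those scenarios. Here's a sketch; the clamp bounds follow the article, while the exact drawdown tier cutoffs and function names are my assumptions:

```python
# Hypothetical sketch of the multiplicative leverage rule.
def vol_factor(normal_vol: float, current_vol: float) -> float:
    # Leverage scales down as realized vol rises, clamped to [0.3, 1.5].
    return max(0.3, min(1.5, normal_vol / current_vol))

def dd_factor(drawdown: float) -> float:
    # Tiered reduction as drawdown deepens (tier cutoffs are assumptions).
    if drawdown < 0.05:
        return 1.0
    if drawdown < 0.10:
        return 0.7
    if drawdown < 0.15:
        return 0.5
    return 0.3

def compute_leverage(base: float, normal_vol: float, current_vol: float,
                     drawdown: float, perf_factor: float) -> float:
    return base * vol_factor(normal_vol, current_vol) * dd_factor(drawdown) * perf_factor
```

With base 5.0 and a normal vol of 2%: a vol spike to 5% gives 5.0 × 0.4 = 2.0x, and a 12% drawdown at normal vol gives 5.0 × 0.5 = 2.5x, matching the table.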

The leverage state lives in data/state/leverage_state.json so it survives restarts. When the system reduces from 5x to 2x because volatility spiked, the next process boot doesn't forget. That detail matters more than it sounds — most bots reset to defaults on restart and quietly take on more risk than the operator thinks.
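Persisting that state is a few lines. This sketch uses the file path named in the article, but the JSON field names and function signatures are my assumptions:

```python
# Sketch of leverage-state persistence across restarts (field names assumed).
import json
from pathlib import Path

STATE_PATH = Path("data/state/leverage_state.json")

def save_leverage_state(leverage: float, reason: str) -> None:
    STATE_PATH.parent.mkdir(parents=True, exist_ok=True)
    STATE_PATH.write_text(json.dumps({"leverage": leverage, "reason": reason}))

def load_leverage_state(default: float = 5.0) -> float:
    # On boot, resume the last saved leverage instead of resetting to default.
    if STATE_PATH.exists():
        return json.loads(STATE_PATH.read_text()).get("leverage", default)
    return default
```

The load-on-boot default is the subtle part: if the file is missing you fall back to base leverage, but if it exists you inherit the system's last risk posture, not the optimistic default.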

4. The LLM Layer That Knows Its Place

Here's the part that surprised me. AlphaStrike has an LLM decision layer — a local Ollama-served qwen2.5:1.5b — but its design philosophy is the opposite of what's currently fashionable. The LLM does not generate signals. It does not pick trades. It does not "reason about the market."

It only intervenes when performance degrades. When the rolling win rate drops below 40%, drawdown crosses 15%, or you stack 5 consecutive losses, the system hands the LLM a structured performance report and a tightly scoped tool palette:

  • `adjust_conviction(symbol, threshold, reason)`
  • `adjust_position_size(symbol, multiplier, reason)`
  • `adjust_leverage(new_leverage, reason)`
  • `disable_shorts(symbol, reason)`
  • `disable_asset(symbol, duration_hours, reason)`
  • `no_action(reason)`
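The triggers themselves are simple threshold checks. A sketch, using the thresholds quoted above (the dataclass and function names are mine):

```python
# Hypothetical sketch of the LLM-invocation triggers described above.
from dataclasses import dataclass

@dataclass
class PerfSnapshot:
    rolling_win_rate: float      # e.g. 0.38 = 38%
    drawdown: float              # e.g. 0.16 = 16%
    consecutive_losses: int

def should_invoke_llm(p: PerfSnapshot) -> bool:
    # Any one degradation condition is enough to hand the LLM the report.
    return (
        p.rolling_win_rate < 0.40
        or p.drawdown > 0.15
        or p.consecutive_losses >= 5
    )
```

In steady state the LLM is never called at all, which is exactly the point.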

Example LLM response when SOL has a 25% win rate, a 22% drawdown, and a 7-loss streak:

```json
[
  {"tool": "adjust_position_size", "params": {"symbol": "SOL", "multiplier": 0.3}},
  {"tool": "adjust_conviction", "params": {"symbol": "SOL", "new_threshold": 85}},
  {"tool": "disable_shorts", "params": {"symbol": "SOL"}},
  {"tool": "send_alert", "params": {"severity": "critical"}}
]
```

That's the right shape for LLMs in financial systems: bounded actions, explicit triggers, no inference loops touching live capital. The model doesn't have to be smart, it has to be defensive. A 1.5B parameter local model is more than enough when the action space is six tools wide.
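Bounding the action space also means validating it. A minimal sketch of whitelist filtering before dispatch; the tool names come from the palette above, while the filtering function itself is my illustration:

```python
# Hypothetical sketch: reject any LLM tool call outside the fixed palette
# before anything touches live capital.
ALLOWED_TOOLS = {
    "adjust_conviction", "adjust_position_size", "adjust_leverage",
    "disable_shorts", "disable_asset", "no_action",
}

def filter_tool_calls(calls: list[dict]) -> list[dict]:
    """Drop unknown actions entirely; never execute what you can't name."""
    return [c for c in calls if c.get("tool") in ALLOWED_TOOLS]
```

However clever the model gets, the worst it can do is pick the wrong item from a short, reversible menu.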

What I Took Away

Three things I'm stealing:

  1. Protocol-based exchange abstraction. No more `if exchange ==` chains. Define the contract once, swap implementations behind it. This generalizes way past trading.

  2. Self-retiring signals with quadratic edge weighting. Static feature weights are tech debt the moment you ship them. Make signal decay a first-class concept and let the data prune your own model.

  3. LLM-as-circuit-breaker, not LLM-as-strategist. The hype-cycle take is "use the LLM to pick trades." The mature take is "use the LLM to recognize when your quant system is dying and apply targeted, reversible, well-typed interventions." The hype-cycle take blows up your account. The mature take saves it.

What I'd build next: an offline evaluation harness for the LLM's tool-call decisions. Right now the LLM's interventions only get evaluated by their downstream P&L impact, which is noisy and slow. A counterfactual replay framework — "what would have happened if the LLM had done nothing, or chosen a different tool?" — would let you tune the trigger thresholds and the prompt without burning real capital. That's where I'd put the next two weeks of engineering time.

Trading bots are not magic. They're software systems that have to survive volatility, exchange flakiness, model decay, and operator panic. The systems that survive are the ones that take all four threats seriously at the architecture level — not the ones with the prettiest backtest curve.
