KingGyu

Posted on May 2

Autarch: AI Strategy Evolution, Deterministic Trade Execution

#ai #cryptocurrency #architecture

Disclaimer: this is research and architecture software, not financial advice. The bundled strategies are not a profitability claim. Cryptocurrency trading involves significant risk.

TL;DR

Autarch is an open-source Bybit USDT perpetual trading workbench built around one boundary:

"LLM trading" does not have to mean an LLM presses Buy or Sell. It can mean an LLM evolves future strategy while deterministic code owns live execution.

Claude/Codex agents generate, review, backtest, and rank strategy candidates. A Python asyncio runner executes selected strategy files with no LLM calls in the live loop. The handoff is visible through strategy manifests, signal code, leaderboards, active/next pointers, cached data, and append-only evidence logs.

GitHub:

https://github.com/KingGyuSuh/autarch

Why I Built It

Most "AI trading bot" framing collapses two very different jobs into one box.

In older discretionary trading, a trader might personally decide and execute each order. In modern quant trading, that is usually not the shape of the work. Humans design policies, define constraints, validate assumptions, monitor behavior, and let execution systems place trades under those rules.

That distinction matters for LLMs.

If an LLM participates in a trading system, the interesting question is not "can the model press the Buy button?" The more useful question is: can the model help evolve future policy without sitting inside the live execution path?

Research work benefits from generative systems. They can inspect evidence, form hypotheses, critique candidate strategies, compare backtests, and revise code.

Live execution has a different job. It should be explicit, bounded, inspectable, and boring.

Autarch is my attempt to preserve both truths in one architecture.

The project started from a simple design rule:

Let AI improve the strategy, but do not let generative uncertainty directly own irreversible execution.

That rule shaped the whole repository. Autarch is not "an LLM that trades for you." It is a workbench where AI agents can propose, review, and backtest strategies, while live execution stays deterministic, inspectable, and bounded by explicit risk controls.

System paper:

https://github.com/KingGyuSuh/autarch/blob/main/docs/AUTARCH.md

The Problem

Generative models are useful in research loops. They can search through hypotheses, explain tradeoffs, inspect logs, write candidate code, compare results, and critique their own work.

Live execution has different needs. It benefits from narrow responsibility, explicit state, deterministic behavior, and clear authority boundaries.

Those two sets of requirements should not be forced into the same runtime path.

In a trading system, that distinction matters. A model can be useful for strategy evolution without being allowed to improvise in the live order path.

The Autarch Split

Autarch is organized into two planes with an evidence boundary between them.

Evolution Plane

The Evolution Plane is where Claude/Codex harness agents work.

In the current implementation, the harness runs producer/reviewer pairs:

trade-strategy creates or revises strategy candidates.
backtest evaluates the strategy pool against cached market data.
strategy-run compares the top leaderboard candidate against the currently active strategy and writes a proposed next strategy pointer when appropriate.

This side is allowed to be creative and iterative because it does not place live orders. It produces artifacts that can be inspected.
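As a rough illustration of the strategy-run comparison step, the sketch below picks the top-scoring candidate and proposes it only if it beats the active strategy by a margin. The row shape (`id`, `score`), the `min_margin` parameter, and the function name are assumptions for illustration, not the format Autarch actually uses in leaderboard.toml.

```python
def propose_next(leaderboard, active_id, min_margin=0.0):
    """Return the strategy id to propose as next, or None to keep the active one.

    `leaderboard` is a list of dicts with hypothetical keys "id" and "score".
    """
    if not leaderboard:
        return None

    top = max(leaderboard, key=lambda row: row["score"])
    if top["id"] == active_id:
        # The active strategy is already the best candidate.
        return None

    active_score = next(
        (row["score"] for row in leaderboard if row["id"] == active_id), None
    )
    # If the active strategy is still ranked, require a clear improvement.
    if active_score is not None and top["score"] - active_score <= min_margin:
        return None

    return top["id"]
```

In this sketch the function only returns an id. Writing the pointer file stays a separate, inspectable step.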

Evidence Boundary

The handoff is deliberately plain:

strategy/pool/<id>/manifest.toml
strategy/pool/<id>/signal.py
strategy/leaderboard.toml
strategy-script/active/<pair>.toml
strategy-script/next/<pair>.toml
config/trade.toml
data/*.jsonl
raw-data/*.csv

These files answer the questions that matter:

What strategy exists?
Which strategy is active?
Which strategy is proposed next?
Why was it ranked highly?
What evidence has the runner recorded?
What risk posture is currently configured?

The boundary could become a database, a queue, a signed manifest, or a dashboard later. The important part is not the medium. The important part is that the handoff is visible and accountable.
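To make the "visible and accountable" idea concrete, here is a minimal sketch of an append-only JSONL evidence log of the kind that data/*.jsonl suggests. The record fields and helper names are assumptions for illustration, not Autarch's actual schema.

```python
import json
from pathlib import Path


def append_evidence(path, record):
    """Append one JSON record as a single line; existing lines are never rewritten."""
    line = json.dumps(record, sort_keys=True)
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")


def read_evidence(path):
    """Read every record back in the order it was written."""
    with Path(path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Because each record is one line, the log can be inspected with ordinary text tools, and a later evolution cycle can replay it in order.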

Execution Plane

The Execution Plane is deterministic Python.

strategy-script/runner.py runs one asyncio coroutine per configured pair. Each coroutine:

Checks current positions.
Waits for native TP/SL closure if a position is already open.
Records closure evidence.
Applies a pending next/<pair>.toml strategy pointer only at this boundary, when no position is open.
Loads the active strategy manifest and signal.py.
Fetches Bybit klines.
Evaluates entry_signal(candles, params, context).
Routes any entry through bybit-script/place_order.py.

The runner does not call an LLM.

After entry, the position is managed by Bybit native TP/SL. The runner polls, records evidence, and continues.
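The loop above can be sketched as one coroutine per pair. This is a simplified illustration, not the contents of runner.py: `FakeExchange`, `load_strategy`, and `place_order` are hypothetical stand-ins, there is no network access, and closure-evidence recording and next-pointer handling are left out for brevity.

```python
import asyncio


class FakeExchange:
    """Stand-in for an exchange client (hypothetical interface, no network)."""

    def __init__(self):
        self.open_pairs = set()

    async def has_open_position(self, pair):
        return pair in self.open_pairs

    async def fetch_klines(self, pair, timeframe, limit):
        # Rising closes, oldest first.
        return [{"close": float(i)} for i in range(limit)]


async def run_pair(pair, exchange, load_strategy, place_order, cycles=1, poll_seconds=0.0):
    for _ in range(cycles):
        if await exchange.has_open_position(pair):
            # An open position is left to exchange-side TP/SL; just poll.
            await asyncio.sleep(poll_seconds)
            continue
        manifest, entry_signal = load_strategy(pair)
        candles = await exchange.fetch_klines(
            pair, manifest["timeframe"], manifest["kline_limit"]
        )
        decision = entry_signal(candles, manifest.get("params", {}), {"pair": pair})
        if decision is not None:
            await place_order(pair, decision, manifest)


async def run_all(pairs, exchange, load_strategy, place_order, cycles=1):
    # One coroutine per configured pair, all sharing one event loop.
    await asyncio.gather(
        *(run_pair(p, exchange, load_strategy, place_order, cycles) for p in pairs)
    )


def demo():
    exchange = FakeExchange()
    orders = []

    def load_strategy(pair):
        manifest = {"timeframe": "5", "kline_limit": 3, "params": {}}

        def entry_signal(candles, params, context=None):
            if candles[-1]["close"] > candles[0]["close"]:
                return {"side": "Buy", "rationale": "last close above first close"}
            return None

        return manifest, entry_signal

    async def place_order(pair, decision, manifest):
        # Record the order instead of sending it anywhere.
        orders.append((pair, decision["side"]))
        exchange.open_pairs.add(pair)

    asyncio.run(run_all(["BTCUSDT", "ETHUSDT"], exchange, load_strategy, place_order, cycles=2))
    return orders
```

Calling `demo()` runs two cycles per pair. The first cycle enters a position, and the second only polls because the position is already open. Nothing in the loop calls a model.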

Strategy Format

Each strategy has a manifest:

id = "ema_cross_v1"
description = "..."
pairs = ["BTCUSDT", "ETHUSDT"]
leverage = 5
tp_pct = 0.012
sl_pct = 0.008
timeframe = "5"
kline_limit = 120

[params]
# strategy-specific parameters

And a deterministic signal function:

def entry_signal(candles, params, context=None):
    # Return None for no entry.
    # Return {"side": "Buy" or "Sell", "rationale": "..."} for an entry.
    ...

The signal code is constrained. It should be deterministic for identical inputs. It should not make network calls, perform external IO, or depend on wall-clock time. The live runner should evaluate strategy logic, not host a hidden research session.
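As an illustration of a signal that satisfies these constraints, here is a minimal EMA-crossover sketch. The candle shape (dicts with a "close" key, oldest first) and the `fast` and `slow` parameter names are assumptions for illustration. The bundled ema_cross_v1 strategy may be implemented differently.

```python
def ema(values, period):
    """Exponential moving average series, seeded with the first value."""
    alpha = 2.0 / (period + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(out[-1] + alpha * (v - out[-1]))
    return out


def entry_signal(candles, params, context=None):
    """Pure function of its inputs: no network, no IO, no clock."""
    fast, slow = params["fast"], params["slow"]
    closes = [float(c["close"]) for c in candles]
    if len(closes) < slow + 1:
        # Not enough history to compare two consecutive bars.
        return None

    fast_ema, slow_ema = ema(closes, fast), ema(closes, slow)
    prev_diff = fast_ema[-2] - slow_ema[-2]
    curr_diff = fast_ema[-1] - slow_ema[-1]

    if prev_diff <= 0 < curr_diff:
        return {"side": "Buy", "rationale": "fast EMA crossed above slow EMA"}
    if prev_diff >= 0 > curr_diff:
        return {"side": "Sell", "rationale": "fast EMA crossed below slow EMA"}
    return None
```

Because the function reads only `candles` and `params`, the same inputs give the same decision in a backtest and in the live runner.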

Risk Gates

The project keeps safety posture in config/trade.toml.

The default configuration includes:

armed = false, so live order placement is rejected until explicitly enabled
mandatory TP/SL
leverage caps
minimum TP and SL distance floors
minimum reward/risk ratio
fixed margin fraction
global maximum concurrent positions
active pair list

The harness never calls place_order.py. Only the execution runner places entries, and only through the configured order gate.

This does not make trading safe. It makes the authority boundary explicit.
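A gate of this kind can be sketched as a pure check that returns rejection reasons. The config keys, the example values, and the `check_order` name below are assumptions for illustration. They are not the actual schema of config/trade.toml or the logic in place_order.py.

```python
# Hypothetical config values for illustration only.
EXAMPLE_CONFIG = {
    "armed": False,
    "active_pairs": ["BTCUSDT", "ETHUSDT"],
    "max_leverage": 10,
    "min_tp_pct": 0.005,
    "min_sl_pct": 0.003,
    "min_reward_risk": 1.2,
    "max_concurrent_positions": 2,
}


def check_order(order, config, open_positions):
    """Return a list of rejection reasons; an empty list means the order may pass."""
    reasons = []
    if not config["armed"]:
        reasons.append("not armed")

    tp, sl = order.get("tp_pct"), order.get("sl_pct")
    if tp is None or sl is None:
        reasons.append("TP and SL are mandatory")
        return reasons  # The remaining checks need both values.

    if order["pair"] not in config["active_pairs"]:
        reasons.append("pair not active")
    if order["leverage"] > config["max_leverage"]:
        reasons.append("leverage above cap")
    if tp < config["min_tp_pct"]:
        reasons.append("TP distance below floor")
    if sl < config["min_sl_pct"]:
        reasons.append("SL distance below floor")
    if sl > 0 and tp / sl < config["min_reward_risk"]:
        reasons.append("reward/risk below minimum")
    if open_positions >= config["max_concurrent_positions"]:
        reasons.append("too many open positions")
    return reasons
```

With the default `armed = False`, an otherwise valid order comes back with the single reason "not armed". That mirrors the idea that live placement is rejected until it is explicitly enabled.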

Why This Architecture Matters

The point of Autarch is not that one strategy, exchange, or scoring formula is correct.

The point is the shape of the system:

Let the creative layer mutate future policy.
Make the policy handoff inspectable.
Keep the action layer narrow and deterministic.
Record evidence so the next evolution cycle can learn from what happened.

That pattern applies beyond trading. Any agentic system that separates "thinking about future behavior" from "taking irreversible action" can benefit from a similar boundary.

What I Want Feedback On

I am especially interested in critique around:

whether the Evolution Plane / Execution Plane split is clear enough
whether file-based handoffs are a good first boundary
whether strategy adoption should require stronger review or signatures
how to score a changing strategy pool without overfitting recent data
where human approval should sit in the loop
what should be made more formally verifiable

Links