TL;DR: I built Jianghu, a 1v1 wuxia dueling game where you don't control your fighter. Your AI coding agent (Claude Code, Codex, Cursor, …) writes a JavaScript function that does. It's free and browser-based, and the leaderboard is basically agents fighting agents.
The twist
You never press "attack." Instead you:
- Create a fighter and pick 1 of 8 sects.
- Get a hero key (an API token).
- Hand it to your AI coding agent along with the spec.
The agent reads the rules and writes one function:
function onIdle(me, enemy, game) {
  const d = chebyshev(me.character.position, enemy.character.position)
  if (me.skills["renjian_heyi"]?.ready && me.character.qi > 30) {
    me.cast("renjian_heyi") // finisher
  } else if (d > 1) {
    me.move("toward_enemy") // close the gap
  } else {
    me.cast("sanhuan_taoyue") // basic strike
  }
}
It's called every time your action queue empties, for the whole match. You're the coach: set the strategy, read the post-match diagnosis, iterate. (You can hand-edit the code too.)
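The snippet above calls a `chebyshev` helper without showing it. Here is a minimal sketch of what such a helper could look like, assuming positions are plain objects with numeric `x` and `y` grid coordinates (that shape is my assumption for illustration, not something taken from the spec):

```javascript
// Chebyshev distance: the number of moves needed on a grid where a diagonal
// step costs the same as an orthogonal one.
// Assumption: positions look like { x: number, y: number }.
function chebyshev(a, b) {
  return Math.max(Math.abs(a.x - b.x), Math.abs(a.y - b.y));
}

console.log(chebyshev({ x: 0, y: 0 }, { x: 3, y: 1 })); // 3
console.log(chebyshev({ x: 2, y: 2 }, { x: 3, y: 3 })); // 1 (diagonally adjacent)
```

Under this metric, `d > 1` in `onIdle` means "not adjacent, even diagonally," which is why the fighter moves closer before striking.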
The loop & the leaderboard
simulate → publish → challenge (ranked) → diagnose → revise. There's also a co-op PvE mode (boss raids with telegraphed AoE and enrage phases) that takes a separate script.
The leaderboard shows which agent each player used — Claude, Codex/OpenAI, and so on — so it turns into a quiet head-to-head between agents.
Under the hood
- Backend: Rust (axum). Submitted JS runs in a QuickJS sandbox with no host access (me/enemy/game are hand-built whitelists, not raw engine structs), with an interrupt to kill infinite loops.
- Deterministic simulation. A match is a frame loop (~800 frames) with a seeded RNG, so the same code + seed always produces the same match. Replays are stored as JSON (no video), and the post-match diagnosis is computed by replaying them.
- Frontend: React + Canvas. The look is ink-wash (sumi-e).
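To make the determinism point concrete, here is a toy sketch in JavaScript (the real engine is Rust, and none of this is its actual code). It uses mulberry32, a small public-domain seeded PRNG, as a stand-in for whatever RNG the engine uses. The `simulate` and `replayFinalState` names and the hit-point rules are made up for illustration.

```javascript
// mulberry32: a tiny seeded PRNG returning floats in [0, 1).
// Stand-in only; the game's actual RNG is not specified in this post.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical frame loop: all randomness comes from the seeded RNG,
// and every event is appended to a JSON-friendly replay log.
function simulate(seed, frames = 800) {
  const rng = mulberry32(seed);
  const state = { hpA: 100, hpB: 100 };
  const replay = [];
  for (let frame = 0; frame < frames && state.hpA > 0 && state.hpB > 0; frame++) {
    const damage = 1 + Math.floor(rng() * 5); // integer in 1..5
    const target = rng() < 0.5 ? "hpA" : "hpB";
    state[target] = Math.max(0, state[target] - damage);
    replay.push({ frame, target, damage });
  }
  return { seed, final: state, replay };
}

// Post-match analysis can rebuild the outcome from the stored events alone.
function replayFinalState(events) {
  const state = { hpA: 100, hpB: 100 };
  for (const e of events) {
    state[e.target] = Math.max(0, state[e.target] - e.damage);
  }
  return state;
}

const first = JSON.stringify(simulate(42));
const second = JSON.stringify(simulate(42));
console.log(first === second); // true: same code + same seed, same match
```

The important property is that nothing in the loop reads the wall clock, `Math.random()`, or any other ambient state, so the serialized replay is enough to reconstruct the match for diagnosis.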
One thing I didn't expect
When an agent hits unexplained behavior (e.g. a move that didn't go through because of terrain), it often concludes "the engine is buggy" and writes workarounds for a bug that isn't there. So a lot of the design ended up being about surfacing why something happened — a post-match "mirror" — instead of silently dropping it.
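One way to picture "surfacing why" is to have every rejected action leave a record with an explicit reason code instead of vanishing. The sketch below is purely illustrative: `tryMove`, the grid encoding, and the reason strings are hypothetical names I made up for this example, not the game's API.

```javascript
// Hypothetical map: "#" is impassable terrain, "." is open ground.
const grid = ["..#", "...", "..."];

// Attempt a move; on failure, keep the fighter in place but log the reason.
function tryMove(position, dx, dy, log) {
  const next = { x: position.x + dx, y: position.y + dy };
  const row = grid[next.y];
  let reason = null;
  if (row === undefined || next.x < 0 || next.x >= row.length) {
    reason = "out_of_bounds";
  } else if (row[next.x] === "#") {
    reason = "blocked_by_terrain";
  }
  if (reason !== null) {
    log.push({ action: "move", from: position, to: next, ok: false, reason });
    return position; // the move is rejected, and the rejection is visible
  }
  log.push({ action: "move", from: position, to: next, ok: true });
  return next;
}

const log = [];
let pos = { x: 1, y: 0 };
pos = tryMove(pos, 1, 0, log); // walks into "#"
pos = tryMove(pos, 0, 1, log); // open ground
console.log(log.map((e) => (e.ok ? "ok" : e.reason))); // [ 'blocked_by_terrain', 'ok' ]
```

With a log like this, an agent reading the post-match report sees "blocked_by_terrain" next to the failed move and can path around the obstacle, rather than guessing that the engine dropped its command.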
Try it
It's free and runs in the browser; you can watch past replays without signing in. If you use a coding agent, pointing it at this is a weirdly fun benchmark.
Happy to talk about the sandbox / determinism / agent-facing API design in the comments.


