Show HN: SleepyQuant – a 12-agent crypto quant running on one Mac
Hey everyone,
SleepyQuant is a solo experiment I've been running for the last couple of weeks: 12 local AI agents coordinating a paper crypto trading book on a single Apple M1 Max. No cloud inference, no API bills, no vendor black box. Every agent prompt, every losing trade, every round-trip gets written up weekly.
Stack (all local):
- Apple M1 Max, 64 GB RAM
- MLX Qwen 2.5 32B Q8 as the primary agent model
- DeepSeek R1 14B Q8 as a lazy-loaded reasoning lane for research tasks
- Priority queue on the MLX inference lock so user chat preempts automation
- FastAPI backend, SwiftUI macOS app, SQLite for state, ChromaDB for agent memory
- Binance paper via ccxt, spot + futures, 70/30 allocation, 10x leverage on the futures lane
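For anyone curious what "priority queue on the inference lock" can look like, here is a minimal sketch, not the actual SleepyQuant code. It assumes a thread-based backend. The names `PriorityLock`, `USER_CHAT` and `AUTOMATION` are made up for illustration. Waiters are served by priority first and arrival order second, so a chat request that arrives while automation jobs are queued goes next.

```python
import heapq
import itertools
import threading
from contextlib import contextmanager

# Hypothetical priority levels: lower number is served first.
USER_CHAT = 0
AUTOMATION = 10


class PriorityLock:
    """A mutex whose waiters are served by priority, then by arrival order."""

    def __init__(self):
        self._cond = threading.Condition()
        self._waiters = []                 # heap of (priority, ticket)
        self._tickets = itertools.count()  # tie-breaker that preserves FIFO order
        self._held = False

    @contextmanager
    def hold(self, priority):
        with self._cond:
            entry = (priority, next(self._tickets))
            heapq.heappush(self._waiters, entry)
            # Proceed only when the lock is free AND we are the best-ranked waiter.
            while self._held or self._waiters[0] != entry:
                self._cond.wait()
            heapq.heappop(self._waiters)
            self._held = True
        try:
            yield
        finally:
            with self._cond:
                self._held = False
                self._cond.notify_all()  # wake everyone; only the best-ranked proceeds


# Usage: wrap every model call in the lock with the caller's priority.
inference_lock = PriorityLock()

def run_inference(prompt, priority):
    with inference_lock.hold(priority):
        # Placeholder for the real model call.
        return f"generated for: {prompt}"
```

Note that this only lets chat jump the queue. It does not interrupt a generation that is already running, so worst-case chat latency is still one full automation call.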
What's deliberately boring:
- The paper book is roughly $78 equivalent. Not a typo. The real-mode transition gate requires three consecutive green days before anything touches real capital, and even then the first real trade is capped tiny. If the strategy can't handle $78, I'd rather find out for free.
- Tight scalp TP/SL (2.0% / -1.5% on futures) with a hard -8% daily drawdown stop.
- Every losing trade gets a post-mortem. The failure vault is public in the weekly newsletter, with root-cause classification (technical / news / execution slippage) and the exact param changes shipped as a response.
- Funding rate guard — refuses to open futures positions when our side is paying extreme funding. Shipped after the scanner was quietly bleeding basis points for three days straight.
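The guards in this list reduce to a handful of small checks. The sketch below is illustrative only. The TP/SL, drawdown and green-day numbers come from the list above, but the function names and the `max_funding_paid` threshold are assumptions, since the post does not say what counts as "extreme" funding. The funding sign follows the usual perpetual-futures convention: a positive rate means longs pay shorts.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RiskConfig:
    take_profit: float = 0.020          # +2.0% (from the post)
    stop_loss: float = -0.015           # -1.5% (from the post)
    max_daily_drawdown: float = -0.08   # -8% hard daily stop (from the post)
    green_days_required: int = 3        # real-mode gate (from the post)
    max_funding_paid: float = 0.0005    # ASSUMPTION: 0.05% per interval is "extreme"


def exit_reason(pnl_fraction: float, cfg: RiskConfig) -> Optional[str]:
    """Return why a position should close, or None to keep holding.

    pnl_fraction is the position's return on whatever basis the strategy uses.
    """
    if pnl_fraction >= cfg.take_profit:
        return "take_profit"
    if pnl_fraction <= cfg.stop_loss:
        return "stop_loss"
    return None


def daily_stop_hit(day_start_equity: float, equity_now: float, cfg: RiskConfig) -> bool:
    """True once the book is down by the daily drawdown limit or more."""
    change = (equity_now - day_start_equity) / day_start_equity
    return change <= cfg.max_daily_drawdown


def funding_allows_entry(side: str, funding_rate: float, cfg: RiskConfig) -> bool:
    """Refuse new futures positions when our side would pay too much funding."""
    # Positive funding: longs pay shorts. Negative funding: shorts pay longs.
    paid_by_us = funding_rate if side == "long" else -funding_rate
    return paid_by_us <= cfg.max_funding_paid


def real_mode_unlocked(daily_pnl: Sequence[float], cfg: RiskConfig) -> bool:
    """True only if the most recent N days were all strictly green."""
    recent = daily_pnl[-cfg.green_days_required:]
    return len(recent) == cfg.green_days_required and all(p > 0 for p in recent)
```

Keeping each guard a pure function of its inputs makes it easy to replay a losing day from the failure vault against a changed threshold.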
Agents (one role each):
A COO / dispatcher, a trading lead, separate futures + spot executors, a CFO, a CTO with filesystem + shell tools, an R&D / failure analyst, a legal / compliance officer, a resource monitor, a QA engineer, a news intelligence watcher, and a content / SEO writer.
Each agent has a focused system prompt + a small set of skill handlers. The COO routes CEO requests to the right specialist instead of one monolithic agent trying to do everything.
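As a rough picture of the routing layer, here is a minimal sketch in which each agent is a system prompt plus a dictionary of skill handlers, and the dispatcher maps a skill name to the one agent that owns it. The `Agent` and `Dispatcher` names and the `pnl_report` skill are hypothetical. How a free-text request gets classified into a skill is not covered in this post, so the caller passes the skill name explicitly.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

Handler = Callable[[str], str]


@dataclass
class Agent:
    role: str
    system_prompt: str
    skills: Dict[str, Handler] = field(default_factory=dict)


class Dispatcher:
    """Routes a request to the single agent that owns the requested skill."""

    def __init__(self) -> None:
        self._routes: Dict[str, Agent] = {}

    def register(self, agent: Agent) -> None:
        for skill in agent.skills:
            if skill in self._routes:
                owner = self._routes[skill].role
                raise ValueError(f"skill {skill!r} already owned by {owner}")
            self._routes[skill] = agent

    def dispatch(self, skill: str, request: str) -> Tuple[str, str]:
        agent = self._routes.get(skill)
        if agent is None:
            raise LookupError(f"no agent handles skill {skill!r}")
        # Return the role too, so the caller can log who answered.
        return agent.role, agent.skills[skill](request)


# Usage with a made-up CFO skill:
cfo = Agent(
    role="cfo",
    system_prompt="You track P&L and risk budgets.",
    skills={"pnl_report": lambda req: f"P&L report for: {req}"},
)
coo = Dispatcher()
coo.register(cfo)
```

Enforcing one owner per skill is the part that keeps each role focused: overlapping skills fail at registration instead of surfacing later as ambiguous routing.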
Live paper P&L widget + weekly newsletter: https://sleepyquant.rest
Two things I'd genuinely like feedback on, so please weigh in below:
Is 12 agents worth the routing overhead? Or would a single bigger agent with tool use be cleaner at this scale? I keep flip-flopping and would love to hear from anyone who's been through the same decomposition choice.
MLX unload strategies on Apple Silicon? Right now my reasoning model auto-unloads after 2 minutes idle, which works but feels crude. If you're running MLX in production on a Mac, how do you free RAM when you need it back?
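To make the question concrete, the idle-timeout approach is roughly this shape. It is a framework-agnostic sketch, not the real code: `LazyModel` is a made-up name, the `loader` callable stands in for whatever actually loads the weights, and the sketch is not thread-safe. Dropping the reference and running the garbage collector frees the Python objects. Whether the inference framework then hands the memory back to the OS is a separate matter that this sketch does not address.

```python
import gc
import time
from typing import Any, Callable, Optional


class LazyModel:
    """Loads a model on first use and drops it after a period of inactivity."""

    def __init__(
        self,
        loader: Callable[[], Any],
        idle_seconds: float = 120.0,               # the 2-minute idle window
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._idle_seconds = idle_seconds
        self._clock = clock
        self._model: Optional[Any] = None
        self._last_used = 0.0

    @property
    def loaded(self) -> bool:
        return self._model is not None

    def get(self) -> Any:
        """Return the model, loading it if needed, and reset the idle timer."""
        if self._model is None:
            self._model = self._loader()
        self._last_used = self._clock()
        return self._model

    def unload_if_idle(self) -> bool:
        """Call periodically (e.g. from a background task). True if it unloaded."""
        if self._model is None:
            return False
        if self._clock() - self._last_used < self._idle_seconds:
            return False
        self._model = None   # drop the only reference held by this wrapper
        gc.collect()         # collect any reference cycles right away
        return True
```

The injectable `clock` is there so the timeout can be tested without sleeping for two minutes.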
Try it or follow along:
- Live paper P&L widget + weekly write-up: https://sleepyquant.rest
- Subscribe to the weekly post-mortem newsletter — Beehiiv, free, one email per week, no upsells, no signals, no affiliate links
- Cadence: every Tuesday. If the book dies, I'll write that up too
Happy to answer questions in the comments about the architecture, the failure vault, the priority queue design, or why local-first LLM agents are worth the effort on a 64 GB machine. Fire away.