What if we could treat financial candlestick data the way GPT treats natural language? That's exactly what Kronos does.
Built by researchers from Tsinghua and Nanjing University, Kronos is the first open-source foundation model specifically designed for financial market K-line (candlestick) data. It was accepted at AAAI 2026 and has gained 14,800+ GitHub stars.
The Problem with General-Purpose Models
General-purpose time series foundation models (TSFMs) like TimeMoE or Moirai don't perform well on financial data. Financial time series have:
- Low signal-to-noise ratio — markets are inherently noisy
- Strong non-stationarity — statistical properties change over time
- Higher-order dependencies — complex patterns beyond simple autocorrelation
Kronos addresses these challenges with a domain-specific approach.
Two-Stage Framework
Stage 1: Custom Tokenizer
Kronos converts continuous multi-dimensional OHLCV (Open, High, Low, Close, Volume) data into hierarchical discrete tokens using Binary Spherical Quantization (BSQ). This preserves both price dynamics and trading activity patterns.
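To build intuition for the quantization step: BSQ projects a latent vector onto the unit hypersphere and snaps each coordinate to a sign bit, so every vector maps to one of 2^d codewords. This is a minimal sketch of that idea, not Kronos's actual tokenizer implementation (the real model quantizes learned latent embeddings, and the packing of bits into token ids here is illustrative):

```python
import numpy as np

def bsq_quantize(z: np.ndarray) -> np.ndarray:
    """Binary spherical quantization sketch: project onto the unit
    hypersphere, then snap each coordinate to +/- 1/sqrt(d)."""
    d = z.shape[-1]
    u = z / np.linalg.norm(z, axis=-1, keepdims=True)  # unit sphere
    return np.sign(u) / np.sqrt(d)  # nearest binary codeword

def bsq_token_id(code: np.ndarray) -> int:
    """Illustrative: pack the sign pattern into an integer token id."""
    bits = (code > 0).astype(int)
    return int("".join(map(str, bits)), 2)

z = np.array([0.3, -1.2, 0.7, 0.1])  # toy 4-dim latent vector
code = bsq_quantize(z)
print(bsq_token_id(code))  # → 11 (one of 2^4 = 16 possible tokens)
```

With a realistic latent dimension the codebook is huge (2^d entries) yet needs no stored codebook vectors, which is the practical appeal of BSQ over standard vector quantization.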
Stage 2: Autoregressive Transformer
A decoder-only transformer (GPT-style) is pre-trained on 12 billion+ K-line records from 45 global exchanges. It covers:
- Asset types: Stocks, futures, forex, options, cryptocurrency
- Timeframes: 1min, 5min, 15min, 30min, 60min, daily, weekly, biweekly
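The decoding loop is the same as in language models: each predicted token is appended to the context and conditions the next prediction. A schematic sketch with a stand-in for the transformer (the probability function here is random; in Kronos it would be the pre-trained decoder):

```python
import numpy as np

rng = np.random.default_rng(0)

def next_token_probs(context: list[int], vocab_size: int) -> np.ndarray:
    """Stand-in for the model: a distribution over the next K-line
    token. In Kronos this comes from a GPT-style decoder."""
    logits = rng.normal(size=vocab_size)
    return np.exp(logits) / np.exp(logits).sum()

def generate(context: list[int], steps: int, vocab_size: int = 16) -> list[int]:
    """Autoregressive sampling: feed the growing sequence back in,
    sample one token per future K-line."""
    out = list(context)
    for _ in range(steps):
        p = next_token_probs(out, vocab_size)
        out.append(int(rng.choice(vocab_size, p=p)))
    return out

tokens = generate([3, 7, 1], steps=5)  # 3 context + 5 forecast tokens
```

Because generation is sampling-based, running it several times yields a distribution over futures rather than a single point forecast.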
Benchmark Results
The zero-shot performance improvements are significant:
| Metric | Improvement | Compared To |
|---|---|---|
| Price prediction RankIC | 93% | TimeMoE (SOTA TSFM) |
| Volatility prediction MAE | 9% | Previous TSFM |
| Synthetic data fidelity | 22% | Previous models |
| Return prediction (full-shot) | 60% | DLinear |
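RankIC, the headline metric above, is the Spearman rank correlation between predicted and realized returns: it rewards getting the *ordering* of assets right rather than exact values. A minimal sketch using only NumPy (assumes no tied values; the toy inputs are invented for illustration):

```python
import numpy as np

def rank_ic(pred: np.ndarray, realized: np.ndarray) -> float:
    """Rank information coefficient: Spearman correlation between
    predicted and realized returns (no tie handling)."""
    rp = pred.argsort().argsort().astype(float)      # ranks of predictions
    rr = realized.argsort().argsort().astype(float)  # ranks of outcomes
    rp -= rp.mean()
    rr -= rr.mean()
    return float((rp * rr).sum() / np.sqrt((rp**2).sum() * (rr**2).sum()))

pred = np.array([0.02, -0.01, 0.03, 0.00])   # model's predicted returns
real = np.array([0.01, -0.02, 0.00, 0.03])   # what actually happened
print(rank_ic(pred, real))  # → 0.2
```

A RankIC of 1.0 means the model ranked every asset correctly; values persistently above zero are what signal-driven strategies look for.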
Quick Start
```python
from model import Kronos, KronosTokenizer, KronosPredictor

# Load the pretrained tokenizer and model from Hugging Face
tokenizer = KronosTokenizer.from_pretrained("NeoQuasar/Kronos-Tokenizer-base")
model = Kronos.from_pretrained("NeoQuasar/Kronos-small")
predictor = KronosPredictor(model, tokenizer, max_context=512)

# Forecast 120 steps beyond the end of the context window
pred_df = predictor.predict(
    df=x_df,             # OHLCV context DataFrame
    x_timestamp=x_ts,    # timestamps for the context window
    y_timestamp=y_ts,    # timestamps for the forecast horizon
    pred_len=120,
)
```
Three model sizes are available on Hugging Face:
- Kronos-mini (4.1M params, context 2048) — lightweight, CPU-friendly
- Kronos-small (24.7M params, context 512) — general prediction
- Kronos-base (102.3M params, context 512) — highest accuracy
All under MIT license.
Fine-Tuning
Fine-tuning scripts were released in August 2025:
```bash
# Tokenizer fine-tuning
python finetune_tokenizer.py --data_path ./data/your_market/ --epochs 50

# Model fine-tuning (multi-GPU)
torchrun --nproc_per_node=4 finetune_model.py \
  --model_name NeoQuasar/Kronos-base \
  --data_path ./data/your_market/ \
  --distributed
```
The release also includes Qlib-based examples for China A-shares and a built-in backtesting pipeline.
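The repo's Qlib pipeline is more involved, but the core of signal backtesting can be reduced to a few lines: turn predictions into positions, apply them to realized returns, and charge a cost on every position change. A toy long/short sketch (cost value and inputs are invented for illustration):

```python
import numpy as np

def backtest(signals: np.ndarray, returns: np.ndarray, cost: float = 0.0005) -> float:
    """Toy backtest: long when the predicted return is positive,
    short when negative; subtract a cost per unit of position change."""
    pos = np.sign(signals)                        # -1, 0, or +1 each bar
    trades = np.abs(np.diff(pos, prepend=0.0))    # turnover per bar
    pnl = pos * returns - trades * cost
    return float(pnl.sum())

preds = np.array([0.01, 0.02, -0.01, 0.005])   # model's predicted returns
rets = np.array([0.008, -0.004, -0.012, 0.003])  # realized returns
print(round(backtest(preds, rets), 6))  # → 0.0165
```

Even this toy version shows why transaction costs matter: the sign flips at bars 3 and 4 each incur double the one-way cost.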
Limitations to Know
- Context length is 512 tokens for small/base models
- Cannot predict black swan events or regime shifts
- Kronos-large (499M params) remains private
- It's a probabilistic tool, not a profit guarantee
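The 512-token context for the small/base models means long histories must be truncated before prediction. A simple sketch of keeping only the most recent window (the `max_context` name mirrors the Quick Start parameter; this helper itself is not part of the Kronos API):

```python
import numpy as np
import pandas as pd

def trim_context(df: pd.DataFrame, max_context: int = 512) -> pd.DataFrame:
    """Keep only the most recent rows so the model sees at most
    max_context K-lines; older history is dropped."""
    return df.tail(max_context).reset_index(drop=True)

bars = pd.DataFrame({"close": np.arange(1000.0)})  # 1000 toy bars
trimmed = trim_context(bars)
print(len(trimmed))  # → 512
```

If older history matters for your use case, it has to be summarized into features or handled by the mini model's longer 2048-token context instead.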
Links
- GitHub: shiyu-coder/Kronos
- Paper: arXiv 2508.02739
- AAAI 2026: Proceedings
- Live Demo: BTC/USDT Prediction
Disclaimer: Not financial advice. Kronos is a research and analytical tool.