Lesson 4: Data Download and Management
Duration: 1.5 hours
Learning Objectives: Master market data download and management
Difficulty: Beginner
Course Overview
Historical data is the foundation of quantitative trading. Without data, you cannot backtest the historical performance of strategies. This lesson will teach you how to download, manage, and maintain market data, preparing you for upcoming backtesting practice.
4.1 Timeframe Selection
What is a Timeframe?
A timeframe is the length of time that each candle (candlestick) represents. Common timeframes include:
Timeframe | Description | Daily Candles | Data Volume (30 days) |
---|---|---|---|
1m | 1 minute | 1,440 | 43,200 candles |
5m | 5 minutes | 288 | 8,640 candles |
15m | 15 minutes | 96 | 2,880 candles |
1h | 1 hour | 24 | 720 candles |
4h | 4 hours | 6 | 180 candles |
1d | 1 day | 1 | 30 candles |
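The candle counts in the table follow from simple arithmetic: minutes per day divided by minutes per candle, multiplied by the number of days. A minimal sketch of that calculation, assuming gap-free data; the helper names `candles_per_day` and `candles_for_period` are illustrative and not part of Freqtrade:

```python
# Minutes per candle for the timeframes listed in the table.
TIMEFRAME_MINUTES = {"1m": 1, "5m": 5, "15m": 15, "1h": 60, "4h": 240, "1d": 1440}

MINUTES_PER_DAY = 24 * 60


def candles_per_day(timeframe: str) -> int:
    """Return how many candles one full day contains at the given timeframe."""
    return MINUTES_PER_DAY // TIMEFRAME_MINUTES[timeframe]


def candles_for_period(timeframe: str, days: int) -> int:
    """Return the candle count for `days` full days, assuming no gaps."""
    return candles_per_day(timeframe) * days


for tf in TIMEFRAME_MINUTES:
    print(f"{tf:>3}: {candles_per_day(tf):>5} per day, {candles_for_period(tf, 30):>6} in 30 days")
```

Real downloads can contain fewer candles than this if the exchange was offline or the pair was listed partway through the period.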
Timeframes and Trading Styles
Different timeframes suit different trading styles:
1. Ultra Short-term (1m - 5m)
Characteristics:
- Very high trading frequency (dozens to hundreds of trades per day)
- Transaction costs consume a large share of returns
- Requires fast execution and is sensitive to latency
- Noisy signals with many false positives
Suitable for:
- Full-time traders
- High-frequency trading systems
- Accounts with low transaction fees
Risks:
- Overtrading
- Transaction costs eroding profits
- High psychological pressure
2. Short-term (15m - 1h)
Characteristics:
- Mainly day trading (5-20 trades per day)
- Balances signal quality and quantity
- Moderate transaction cost impact
- Suitable for beginners to practice
Suitable for:
- Quantitative trading beginners
- Those with some time to monitor markets
- Those who want to see results quickly
Why it is recommended:
- Fast data download
- Fast backtesting
- Good signal quality
- Controllable transaction costs
3. Medium to Long-term (4h - 1d)
Characteristics:
- Swing trading (few trades per week)
- Few but high-quality signals
- Transaction costs are a small share of returns
- Overnight risk must be considered
Suitable for:
- Part-time traders
- Those who don't want frequent monitoring
- Those with larger capital
Risks:
- Large single-trade volatility
- Overnight news risk
- Low capital utilization efficiency
Data Volume vs Signal Quality
This is a classic trade-off:
Aspect | Smaller Timeframe | Larger Timeframe |
---|---|---|
Signals | More signals, but more noise | Fewer signals, but higher quality |
Trading frequency | Frequent trading, high transaction fees | Less trading, low transaction fees |
Data volume | Large data volume, high storage pressure | Small data volume, easy storage |
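To make the fee side of this trade-off concrete, here is a rough sketch of how trade frequency turns into fee drag. The fee rate of 0.1% per side and the trades-per-day figures are hypothetical values chosen for illustration, and `daily_fee_drag` is not a Freqtrade function; check your own exchange's fee schedule.

```python
def daily_fee_drag(trades_per_day: float, fee_per_side: float) -> float:
    """Fraction of the position size paid in fees per day.

    Each trade is counted as a round trip (one entry plus one exit), and
    every trade is assumed to use the same position size.
    """
    return trades_per_day * 2 * fee_per_side


ASSUMED_FEE = 0.001  # hypothetical 0.1% per side

# Hypothetical trade frequencies for the three trading styles.
for label, trades in [("1m-5m", 50), ("15m-1h", 10), ("4h-1d", 0.5)]:
    drag = daily_fee_drag(trades, ASSUMED_FEE)
    print(f"{label}: {drag:.1%} of position size per day in fees")
```

Under these assumptions a strategy on the smallest timeframes has to earn far more per day just to break even, which is why low-fee accounts matter most for ultra short-term trading.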
Selection Recommendations
Beginner Recommendations:
- Main timeframe: 5m or 15m
- Auxiliary timeframe: 1h (for trend confirmation)
- Data period: 30-90 days
Advanced Users:
- Choose based on strategy type
- Multi-timeframe combinations
- Prepare at least 6 months of data
4.2 Downloading Historical Data
Activate Environment
First, make sure the Freqtrade environment is activated:
# Activate Conda environment
conda activate freqtrade
# Verify environment
freqtrade --version
Basic Download Commands
1. Download Default Trading Pairs
# Download the last 30 days of 5-minute data for the pairs configured in config.json
freqtrade download-data -c config.json --days 30 --timeframes 5m
Output Example:
2025-09-30 10:00:00 - freqtrade.data.history - INFO - Downloading pair BTC/USDT, interval 5m.
2025-09-30 10:00:05 - freqtrade.data.history - INFO - BTC/USDT, 5m: 8640 candles downloaded.
2. Download Specified Trading Pairs
# Download specified trading pairs
freqtrade download-data \
-c config.json \
--pairs BTC/USDT ETH/USDT BNB/USDT \
--days 30 \
--timeframes 5m
3. Download Multiple Timeframes
# Download multiple timeframes simultaneously
freqtrade download-data \
-c config.json \
--pairs BTC/USDT ETH/USDT \
--days 90 \
--timeframes 1m 5m 15m 1h 1d
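If you prefer to drive these downloads from a script, the commands can be assembled programmatically. The sketch below only builds the argument list, using the same flags shown in this section (`-c`, `--pairs`, `--days`, `--timeframes`); `build_download_command` is a hypothetical helper, not part of Freqtrade.

```python
import shlex


def build_download_command(config, timeframes, days, pairs=None):
    """Build the argument list for `freqtrade download-data`."""
    cmd = ["freqtrade", "download-data", "-c", config]
    if pairs:
        # Omit --pairs to fall back to the pairs configured in the config file.
        cmd += ["--pairs", *pairs]
    cmd += ["--days", str(days), "--timeframes", *timeframes]
    return cmd


cmd = build_download_command("config.json", ["5m", "1h"], 90, pairs=["BTC/USDT", "ETH/USDT"])
print(shlex.join(cmd))

# To actually run it (requires Freqtrade installed and the environment activated):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with pair names that contain a slash.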
Batch Download
Download Multiple Trading Pairs
Create a pair list file, pairs.json:
{
"exchange": {
"pair_whitelist": [
"BTC/USDT",
"ETH/USDT",
"BNB/USDT",
"SOL/USDT",
"XRP/USDT"
]
}
}
Download using the list:
freqtrade download-data \
-c pairs.json \
--exchange binance \
--days 90 \
--timeframes 5m 1h
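When the pair list changes often, the file can be generated rather than edited by hand. This is a small sketch that writes the same structure as the pairs.json shown above; `write_pair_config` is an illustrative helper, and the demo writes to a temporary directory so it does not touch your real files.

```python
import json
import tempfile
from pathlib import Path


def write_pair_config(path, pairs):
    """Write a minimal config file containing only the pair whitelist."""
    config = {"exchange": {"pair_whitelist": list(pairs)}}
    Path(path).write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config


# Demo in a throwaway directory; point `target` at your own pairs.json in practice.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "pairs.json"
    written = write_pair_config(target, ["BTC/USDT", "ETH/USDT"])
    loaded = json.loads(target.read_text(encoding="utf-8"))
    print(loaded["exchange"]["pair_whitelist"])
```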
Download Longer Time Range
# Download the last 180 days of data (about 6 months)
freqtrade download-data -c config.json --days 180 --timeframes 5m
# Download 1 year of data
freqtrade download-data -c config.json --days 365 --timeframes 1h
Download by Specified Date Range
Use the --timerange parameter to control the date range precisely:
# Download data from September 1 to September 30, 2025
freqtrade download-data \
-c config.json \
--timerange 20250901-20250930 \
--timeframes 5m
# From a certain date to now
freqtrade download-data \
-c config.json \
--timerange 20250801- \
--timeframes 5m
# All data before a certain date
freqtrade download-data \
-c config.json \
--timerange -20250930 \
--timeframes 5m
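The three forms above share one pattern: YYYYMMDD on either side of a hyphen, with either side left empty for an open range. A minimal sketch that formats such strings from dates; `build_timerange` is a hypothetical helper for scripting, not a Freqtrade API.

```python
from datetime import date
from typing import Optional


def build_timerange(start: Optional[date] = None, end: Optional[date] = None) -> str:
    """Format a --timerange value as YYYYMMDD-YYYYMMDD; either side may be open."""
    if start is None and end is None:
        raise ValueError("at least one of start or end is required")
    if start and end and start > end:
        raise ValueError("start must not be after end")
    fmt = "%Y%m%d"
    left = start.strftime(fmt) if start else ""
    right = end.strftime(fmt) if end else ""
    return f"{left}-{right}"


print(build_timerange(date(2025, 9, 1), date(2025, 9, 30)))  # 20250901-20250930
print(build_timerange(start=date(2025, 8, 1)))               # 20250801-
print(build_timerange(end=date(2025, 9, 30)))                # -20250930
```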
Incremental Updates
If data was downloaded previously, running the command again fetches only the candles that are newer than the last one stored locally:
# First download (September 1-15)
freqtrade download-data -c config.json --timerange 20250901-20250915 --timeframes 5m
# Incremental update (will only download September 16-30)
freqtrade download-data -c config.json --timerange 20250901-20250930 --timeframes 5m
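Conceptually, an incremental update resumes from the candle after the last one already on disk. The sketch below illustrates that idea only; it is not Freqtrade's actual implementation, and the function names and example timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def next_download_start(last_candle_open: datetime, timeframe_minutes: int) -> datetime:
    """Return the open time of the first candle that is not stored yet."""
    return last_candle_open + timedelta(minutes=timeframe_minutes)


def candles_to_fetch(last_candle_open: datetime, requested_end: datetime, timeframe_minutes: int) -> int:
    """Count candles opening after the last stored one and before requested_end."""
    start = next_download_start(last_candle_open, timeframe_minutes)
    if start >= requested_end:
        return 0
    step = timedelta(minutes=timeframe_minutes)
    return -(-(requested_end - start) // step)  # ceiling division on timedeltas


# Hypothetical state: 5m data stored through the end of September 15 (UTC),
# and the new request runs up to (but not including) October 1.
last_stored = datetime(2025, 9, 15, 23, 55, tzinfo=timezone.utc)
requested_end = datetime(2025, 10, 1, tzinfo=timezone.utc)

print(next_download_start(last_stored, 5).isoformat())
print(candles_to_fetch(last_stored, requested_end, 5))
```

In this example only 15 days of candles need to be fetched instead of the full 30.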
4.3 Data Format and Storage
Data Storage Formats
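Freqtrade supports several data storage formats (json, jsongz, feather, and parquet). This lesson compares two of them:
1. JSON Format (the default in older Freqtrade releases; recent releases default to Feather)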
Advantages:
- Human readable
- Easy to debug
- Good compatibility
Disadvantages:
- Larger files
- Slower reading
Example File:
user_data/data/binance/BTC_USDT-5m.json
File Content Snippet:
[
[1693526400000, 25945.32, 25950.00, 25940.00, 25948.15, 124.5],
[1693526700000, 25948.15, 25955.20, 25945.00, 25952.30, 98.3]
]
Format: [timestamp, open, high, low, close, volume], where the timestamp is in milliseconds since the Unix epoch (UTC)
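Because the JSON format is a plain array of rows, it can be inspected with the standard library alone. This sketch parses the snippet above and converts the millisecond timestamp to a UTC datetime; the `COLUMNS` tuple and `parse_candles` helper are illustrative names, not Freqtrade code.

```python
import json
from datetime import datetime, timezone

COLUMNS = ("timestamp", "open", "high", "low", "close", "volume")

# The two rows from the file content snippet.
raw = """
[
  [1693526400000, 25945.32, 25950.00, 25940.00, 25948.15, 124.5],
  [1693526700000, 25948.15, 25955.20, 25945.00, 25952.30, 98.3]
]
"""


def parse_candles(text):
    """Turn the JSON array of rows into a list of dicts with a UTC datetime."""
    candles = []
    for row in json.loads(text):
        candle = dict(zip(COLUMNS, row))
        # Timestamps are stored in milliseconds since the Unix epoch.
        candle["date"] = datetime.fromtimestamp(candle["timestamp"] / 1000, tz=timezone.utc)
        candles.append(candle)
    return candles


candles = parse_candles(raw)
print(candles[0]["date"].isoformat(), candles[0]["close"])
# 2023-09-01T00:00:00+00:00 25948.15
```

The two timestamps are 300,000 ms apart, which is exactly one 5-minute candle.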
2. Parquet Format (Recommended)
Advantages:
- Small files (high compression)
- Fast reading
- Suitable for large data volumes
Disadvantages:
- Binary format, not directly viewable
- Requires additional library support
Configuration Method:
Add to config.json:
{
"dataformat_ohlcv": "parquet",
"dataformat_trades": "parquet"
}
Example File:
user_data/data/binance/BTC_USDT-5m.parquet
Data Directory Structure
Standard data directory structure:
user_data/
└── data/
    └── binance/                  # Exchange name
        ├── BTC_USDT-1m.json      # BTC/USDT 1-minute data
        ├── BTC_USDT-5m.json      # BTC/USDT 5-minute data
        ├── BTC_USDT-1h.json      # BTC/USDT 1-hour data
        ├── ETH_USDT-5m.json      # ETH/USDT 5-minute data
        └── .metadata/            # Metadata directory
View Downloaded Data
Use the list-data command to view local data:
# View all downloaded data
freqtrade list-data -c config.json
# View specific trading pairs
freqtrade list-data -c config.json --pairs BTC/USDT ETH/USDT
Output Example:
┏━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Pair     ┃ Timeframe ┃ From             ┃ To               ┃ Candles ┃
┡━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ BTC/USDT │ 5m        │ 2025-09-01 00:00 │ 2025-09-30 23:55 │   8,640 │
│ BTC/USDT │ 1h        │ 2025-09-01 00:00 │ 2025-09-30 23:00 │     720 │
│ ETH/USDT │ 5m        │ 2025-09-01 00:00 │ 2025-09-30 23:55 │   8,640 │
└──────────┴───────────┴──────────────────┴──────────────────┴─────────┘
Data Integrity Check
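Check whether the data has gaps. With --show-timerange, list-data reports the first and last candle for each pair and timeframe; compare the reported range and candle count with what you expect for that period:
# Show the available time range for each pair and timeframe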
freqtrade list-data -c config.json --show-timerange
If gaps are found, re-download:
freqtrade download-data -c config.json --timerange 20250901-20250930 --timeframes 5m
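For JSON data you can also look for gaps directly by checking that consecutive timestamps are exactly one timeframe apart. A minimal sketch, assuming timestamps in milliseconds as in the JSON format described earlier; `find_gaps` is a hypothetical helper, and the sample series is made up for illustration.

```python
def find_gaps(timestamps_ms, timeframe_minutes):
    """Return (first_missing_ms, last_missing_ms, missing_count) for each hole."""
    step = timeframe_minutes * 60 * 1000
    ordered = sorted(timestamps_ms)
    gaps = []
    for prev, curr in zip(ordered, ordered[1:]):
        missing = (curr - prev) // step - 1
        if missing > 0:
            gaps.append((prev + step, curr - step, missing))
    return gaps


# Hypothetical 5m series with two candles missing after the second one.
start = 1693526400000
series = [start + i * 300_000 for i in (0, 1, 4, 5)]
print(find_gaps(series, 5))
# [(1693527000000, 1693527300000, 2)]
```

An empty list means the series is continuous between its first and last candle; it does not tell you whether the first or last candle is where you expected it to be.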
4.4 Data Updates and Maintenance
Scheduled Update Strategies
Method 1: Manual Updates
Run the update manually every week or month:
# Update the last 7 days of data
freqtrade download-data -c config.json --days 7 --timeframes 5m 1h
Method 2: Using Cron Scheduled Tasks (Linux/macOS)
Edit crontab:
crontab -e
Add a scheduled task (runs daily at 2 AM):
0 2 * * * /path/to/conda/envs/freqtrade/bin/freqtrade download-data -c /path/to/config.json --days 7 --timeframes 5m
Method 3: Using Windows Task Scheduler
- Open "Task Scheduler"
- Create Basic Task
- Trigger: Daily 2:00 AM
- Action: Start program
- Program: C:\Users\YourName\anaconda3\envs\freqtrade\Scripts\freqtrade.exe
- Arguments: download-data -c C:\path\to\config.json --days 7 --timeframes 5m
Data Cleanup
Delete unwanted data:
# Delete specific trading pair data
rm user_data/data/binance/DOGE_USDT-5m.json
# Clear entire data directory
rm -rf user_data/data/binance/*
Storage Space Management
Estimate Storage Needs
Number of Pairs | Timeframes | Days | Format | Estimated Size |
---|---|---|---|---|
1 | 5m | 30 | JSON | ~2 MB |
1 | 5m | 365 | JSON | ~24 MB |
10 | 5m + 1h | 365 | JSON | ~300 MB |
50 | 1m + 5m + 1h | 365 | JSON | ~2 GB |
50 | 1m + 5m + 1h | 365 | Parquet | ~500 MB |
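To estimate storage for your own setup, one rough approach is to measure the bytes per candle from a file you have already downloaded and scale by the expected candle count. The sketch below calibrates from the first table row (about 2 MB for 30 days of 5m data), which is an approximation; real sizes vary with the pair, price precision, and format, and `estimate_size_mb` is an illustrative helper only.

```python
TIMEFRAME_MINUTES = {"1m": 1, "5m": 5, "15m": 15, "1h": 60, "4h": 240, "1d": 1440}


def estimate_size_mb(pairs, timeframes, days, bytes_per_candle):
    """Estimate disk usage in MB (10**6 bytes) for gap-free data."""
    candles_per_pair = sum(days * 1440 // TIMEFRAME_MINUTES[tf] for tf in timeframes)
    return pairs * candles_per_pair * bytes_per_candle / 1_000_000


# Calibration (approximate): ~2 MB of JSON for 8,640 candles (1 pair, 5m, 30 days).
bytes_per_candle = 2_000_000 / 8_640

one_year = estimate_size_mb(pairs=1, timeframes=["5m"], days=365, bytes_per_candle=bytes_per_candle)
print(f"1 pair, 5m, 365 days: about {one_year:.1f} MB")
```

For a better estimate, replace the calibration value with the size of one of your own data files divided by its candle count.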
Optimize Storage Space
- Use Parquet format (saves 60-80% space)
- Keep only needed timeframes
- Regularly delete old data (keep 6-12 months)
- Delete data for pairs you no longer trade
Practical Tasks
Task 1: Basic Data Download
# Download BTC/USDT recent 30 days 5-minute data
conda activate freqtrade
freqtrade download-data -c config.json \
--pairs BTC/USDT \
--days 30 \
--timeframes 5m
Verify that the download succeeded:
freqtrade list-data -c config.json --pairs BTC/USDT
Task 2: Multiple Pairs and Timeframes
# Download BTC, ETH, BNB 5m and 1h data
freqtrade download-data -c config.json \
--pairs BTC/USDT ETH/USDT BNB/USDT \
--days 30 \
--timeframes 5m 1h
Task 3: View Data Statistics
# View all downloaded data
freqtrade list-data -c config.json
# Record the following information:
# - How many trading pairs?
# - How many candles per pair?
# - Start and end dates of data?
Task 4: Test Error Handling
Try downloading a non-existent trading pair:
freqtrade download-data -c config.json \
--pairs INVALIDPAIR/USDT \
--days 30 \
--timeframes 5m
Observe error messages and understand how Freqtrade handles invalid pairs.
Task 5: Data Format Conversion
Convert JSON format to Parquet:
# Convert data format
freqtrade convert-data \
--format-from json \
--format-to parquet \
-c config.json
Compare file size differences:
# Linux/macOS
du -sh user_data/data/binance/*.json
du -sh user_data/data/binance/*.parquet
# Windows
dir user_data\data\binance
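A cross-platform alternative is to total the file sizes per extension with a short script. This is a sketch using only the standard library; `size_by_suffix` is a hypothetical helper, and the demo runs on placeholder files in a temporary directory rather than real market data.

```python
import tempfile
from pathlib import Path


def size_by_suffix(data_dir):
    """Sum file sizes in bytes per file extension inside data_dir (non-recursive)."""
    totals = {}
    for path in Path(data_dir).iterdir():
        if path.is_file():
            totals[path.suffix] = totals.get(path.suffix, 0) + path.stat().st_size
    return totals


# Demo on a throwaway directory with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "BTC_USDT-5m.json").write_bytes(b"x" * 1000)
    (Path(tmp) / "BTC_USDT-5m.parquet").write_bytes(b"x" * 300)
    demo_totals = size_by_suffix(tmp)
    print(demo_totals)
```

To compare your real data, call `size_by_suffix("user_data/data/binance")` after running the conversion.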
Quiz
Basic Questions
- Approximately how many candles are there in 30 days of 1-minute data?
- What's the difference between --days 30 and --timerange 20250901-20250930?
- If a strategy uses the 15m timeframe but only 5m data is downloaded, what will happen in backtesting?
Answers
- 43,200 candles (1,440 candles/day × 30 days)
- --days 30 covers the 30 days back from today; --timerange specifies an exact date range
- Backtesting will fail because data for the required timeframe is missing
Advanced Questions
- Why is Parquet format more suitable than JSON for large data volumes?
- How much storage space is needed to download 1 year of 1-minute data (1 trading pair)?
- How does incremental download work?
Thinking Questions
- If an exchange goes down during a period, will downloaded data have gaps?
- Will data for the same trading pair differ between exchanges?
- Why recommend 5m or 15m instead of 1m for beginners?
Common Issues and Solutions
Issue 1: Slow Download Speed
Cause: Network issues or exchange rate limiting
Solution:
# Use proxy (if needed)
export HTTP_PROXY=http://127.0.0.1:7890
export HTTPS_PROXY=http://127.0.0.1:7890
freqtrade download-data -c config.json --days 30 --timeframes 5m
Issue 2: Download Failure
Error Message:
Exchange binance does not support fetching OHLCV data for BTC/USDT
Cause: Incorrect trading pair name or exchange doesn't support it
Solution:
# Check exchange supported pairs
freqtrade list-pairs -c config.json --quote USDT
# Use correct trading pair name
freqtrade download-data -c config.json --pairs BTC/USDT --days 30 --timeframes 5m
Issue 3: Incomplete Data
Symptom: Backtesting reports missing data
Solution:
# Re-download complete data
freqtrade download-data -c config.json \
--timerange 20250901-20250930 \
--timeframes 5m \
--erase # Force re-download
Issue 4: Insufficient Disk Space
Solution:
- Delete unwanted data
- Convert to Parquet format
- Keep only commonly used timeframes
Data Download Best Practices
1. Timeframe Selection
- Beginner: 5m + 1h (balances speed and quality)
- Advanced: 1m + 5m + 15m + 1h (multi-timeframe analysis)
- Professional: Full cycle (1m to 1d)
2. Data Period
- Strategy Development: 30-90 days (fast iteration)
- Strategy Validation: 180-365 days (stability testing)
- Production: 365+ days (covers various market conditions)
3. Number of Trading Pairs
- Beginner: 1-3 mainstream coins (BTC/ETH/BNB)
- Advanced: 5-10 trading pairs (diversified testing)
- Professional: 20-50 trading pairs (portfolio)
4. Storage Optimization
- Prioritize Parquet format
- Regularly delete data older than 6 months
- Keep data only for actively traded pairs
Reference Documentation
Freqtrade Official Documentation
Related Documentation
- TESTING_GUIDE.md - Data download section
- CONFIG_EXPLANATION.md - Exchange configuration
Key Points Summary
- Timeframe determines trading style: Short cycle = high frequency, long cycle = swing
- Beginners should start with 5m or 15m: Balances speed and quality
- Regularly update data: Keep data current
- Use Parquet format: Save space, improve speed
- Data integrity check: Avoid backtesting errors
Next Lesson Preview
Lesson 5: First Strategy Backtest
In the next lesson, we will:
- Run first complete strategy backtest
- Learn to interpret backtest reports
- Understand key performance indicators
- Analyze exit reason statistics
Preparation:
- Ensure 30 days of BTC/USDT 5m data is downloaded
- Confirm that the Strategy001 strategy exists
- Read the basic backtesting section of TESTING_GUIDE.md
Learning Verification Standards:
- Can independently download historical data for any trading pair
- Understand applicable scenarios for different timeframes
- Can use list-data to view local data
- Can estimate data storage space requirements
After completing these tasks, you have the data foundation needed for backtesting! Ready to enter the exciting backtesting practice session!