Introduction
As applications grow, traditional relational databases such as MySQL may struggle with analytical workloads involving millions of records and complex aggregations. While MySQL excels at Online Transaction Processing (OLTP), ClickHouse® is purpose-built for Online Analytical Processing (OLAP), enabling lightning-fast analytical queries on massive datasets.
Migrating data from MySQL to ClickHouse® allows organizations to build high-performance reporting systems, dashboards, and real-time analytics without impacting transactional workloads.
In this guide, you'll learn several approaches to migrate data from MySQL to ClickHouse®, along with their advantages, limitations, and ideal use cases.
Why Migrate from MySQL to ClickHouse®?
MySQL and ClickHouse® are designed for different workloads.
| Feature | MySQL | ClickHouse® |
|---|---|---|
| Storage Model | Row-based | Columnar |
| Best For | Transactions (OLTP) | Analytics (OLAP) |
| Query Speed | Fast for row lookups | Extremely fast for large scans |
| Aggregation Performance | Moderate | Extremely fast |
| Scalability | Primarily Vertical | Optimized for analytical scaling |
| Typical Use Cases | Applications and transactional systems | Reporting, dashboards, and analytics |
Migrating from MySQL to ClickHouse® makes sense when:
- Analytical queries are becoming slow in MySQL.
- You need real-time dashboards over large datasets.
- Reporting queries are impacting your production database.
- You regularly process millions or billions of rows.
Migration Architecture
MySQL
│
▼
Export / Synchronization
│
▼
Data Transformation
│
▼
ClickHouse®
│
▼
Dashboards / Analytics
Migration Methods
There are multiple ways to migrate data depending on your requirements.
Method 1: CSV Export and Import (Recommended for Beginners)
This is the simplest approach for performing a one-time migration of historical data.
Step 1: Export Data from MySQL
Run the following command inside MySQL:
SELECT *
INTO OUTFILE '/tmp/employees.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM employees;
Alternatively, export using the MySQL CLI:
mysql -u root -p mydb \
-e "SELECT * FROM orders" \
| sed 's/\t/,/g' > /tmp/orders.csv
Step 2: Create the Destination Table in ClickHouse®
Before importing data, create the corresponding MergeTree table. Remember to map MySQL data types to ClickHouse® equivalents.
CREATE TABLE employees
(
id UInt32,
name String,
department String,
salary Float64
)
ENGINE = MergeTree
ORDER BY id;
Step 3: Import the CSV File
Using clickhouse-client:
clickhouse-client \
--query "INSERT INTO employees FORMAT CSV" \
< /tmp/employees.csv
Using the HTTP API:
curl -u default:password \
"http://localhost:8123/?query=INSERT+INTO+default.employees+FORMAT+CSV" \
--data-binary @/tmp/employees.csv
Step 4: Verify the Import
SELECT count()
FROM default.employees;
Example output:
| count() |
|---|
| 5 |
Method 2: Using the MySQL Table Engine
ClickHouse® can directly connect to and query MySQL tables without requiring an intermediate export.
Step 1: Create a MySQL Engine Table
CREATE TABLE mysql_employees
(
id UInt32,
name String,
department String,
salary Float64
)
ENGINE = MySQL(
'localhost:3306',
'default',
'employees',
'user',
'password'
);
You can now query the MySQL table directly.
SELECT *
FROM mysql_employees;
Step 2: Create the Destination MergeTree Table
CREATE TABLE employees
(
id UInt32,
name String,
department String,
salary Float64
)
ENGINE = MergeTree
ORDER BY id;
Step 3: Copy the Data
Once the MySQL engine is configured, migrate the data using a single query.
INSERT INTO employees
SELECT *
FROM mysql_employees;
This is one of the fastest and cleanest approaches for one-time migrations.
Step 4: Verify the Migration
SELECT *
FROM employees
LIMIT 5;
For most one-time migrations, this is the recommended approach because everything happens inside ClickHouse® without intermediate files.
Method 3: Using ClickPipes (ClickHouse Cloud)
If you're using ClickHouse Cloud, ClickPipes provides a fully managed ingestion service.
Migration steps:
- Create a ClickPipe.
- Connect your MySQL database.
- Select the tables you want to synchronize.
- Start continuous replication.
Advantages
- Fully managed service
- Continuous incremental synchronization
- Minimal operational maintenance
Limitation
- Available only for ClickHouse Cloud.
Method 4: Real-Time Migration with Kafka and Debezium
For continuous synchronization between MySQL and ClickHouse®, Change Data Capture (CDC) using Debezium is the most robust solution.
Debezium captures every INSERT, UPDATE, and DELETE operation from MySQL and streams the changes into Kafka, where ClickHouse® consumes them.
Architecture
MySQL
│
▼
Debezium (CDC)
│
▼
Kafka
│
▼
ClickHouse Kafka Engine
│
▼
Materialized View
│
▼
MergeTree Table
This architecture enables near real-time synchronization.
Best suited for:
- Event streaming
- Real-time dashboards
- IoT platforms
- Monitoring systems
- Operational analytics
MySQL to ClickHouse® Data Type Mapping
| MySQL Type | ClickHouse® Type |
|---|---|
| INT | Int32 / UInt32 |
| BIGINT | Int64 / UInt64 |
| VARCHAR(n) | String |
| TEXT | String |
| DECIMAL(p,s) | Decimal(p,s) |
| FLOAT | Float32 |
| DOUBLE | Float64 |
| DATETIME | DateTime |
| DATE | Date |
| TINYINT(1) | UInt8 (Boolean) |
| JSON | String |
Verifying the Migration
Always verify that both databases contain identical data after migration.
Row Count Validation
MySQL:
SELECT COUNT(*)
FROM orders;
ClickHouse®:
SELECT count()
FROM default.orders;
Both queries should return identical counts.
If they don't, investigate:
- Data type mismatches
- NULL handling differences
- Missing records during export
- Failed import batches
Best Practices
1. Choose the Right ORDER BY Key
Unlike MySQL's primary key, ClickHouse® uses ORDER BY to determine how data is physically sorted.
Choose columns that are frequently used in filtering conditions.
2. Use LowCardinality for Repetitive Strings
Columns with relatively few distinct values benefit from better compression.
country LowCardinality(String),
status LowCardinality(String)
3. Partition by Time
For time-series data, partition by month or day.
PARTITION BY toYYYYMM(order_date)
4. Remove AUTO_INCREMENT
ClickHouse® doesn't support AUTO_INCREMENT.
Instead, simply store identifiers as:
UInt32
or
UInt64
5. Handle NULL Values Carefully
ClickHouse® columns are NOT NULL by default.
If nullable columns exist in MySQL, either use Nullable(Type) or replace NULL values during migration.
Example:
SELECT
id,
COALESCE(department, 'Unknown') AS department
FROM default.mysql_employee;
Common Challenges
Data Type Mismatches
Ensure MySQL data types are correctly mapped before migration.
Duplicate Records
Implement deduplication strategies or unique identifiers when performing incremental loads.
Large Tables
Split large datasets into smaller batches to reduce memory consumption and improve reliability.
Character Encoding
Use UTF-8 encoding consistently to prevent text corruption.
Performance Tips
- Use the MergeTree engine for analytical workloads.
- Batch inserts instead of inserting one row at a time.
- Compress data whenever possible.
- Choose an efficient partitioning strategy.
- Avoid unnecessary indexes—ClickHouse® relies on sorting and data-skipping indexes instead.
Quick Reference
| Scenario | Recommended Method |
|---|---|
| One-time migration | CSV Export & Import |
| Direct access to MySQL | MySQL Table Engine |
| Full table migration | INSERT INTO ... SELECT |
| Continuous synchronization | ClickPipes |
| Real-time streaming | Kafka + Debezium |
Conclusion
Migrating data from MySQL to ClickHouse® is an effective way to accelerate analytical workloads while keeping your transactional database unchanged.
In this architecture, MySQL continues handling application writes and transactional operations, while ClickHouse® powers dashboards, reporting, and large-scale analytical queries.
For most organizations, using the MySQL Table Engine together with INSERT INTO ... SELECT offers the simplest and most reliable solution for one-time migrations.
If continuous synchronization is required, Debezium with Kafka provides a production-ready Change Data Capture (CDC) pipeline capable of replicating changes in near real time.
With proper schema design, efficient partitioning, and the right migration strategy, ClickHouse® can deliver dramatically faster analytical performance from the very first day.
Top comments (0)