Introduction
ClickHouse® provides a rich collection of built-in system tables that expose valuable information about the server's internal operations. These tables make it easier to monitor queries, storage, replication, background tasks, and overall server health without relying entirely on external monitoring tools.
Whether you're investigating slow queries, monitoring disk usage, tracking replication, or diagnosing performance bottlenecks, system tables provide real-time visibility into what's happening inside your ClickHouse server.
In this article, we'll explore the most useful system tables and learn how to use them with practical SQL examples for effective troubleshooting.
What Are System Tables?
The system database is a built-in database that contains read-only tables describing the internal state of a ClickHouse server.
These tables automatically expose metadata, runtime statistics, logs, and configuration information.
Common information available includes:
- Database and table metadata
- Column definitions
- Running queries
- Historical query logs
- Storage parts
- Background merges
- Mutations
- Replication status
- Disk usage
- Server errors
- Configuration details
Unlike user-created tables, system tables are automatically maintained by ClickHouse and can be queried using standard SQL.
Commonly Used System Tables
Below are some of the most useful system tables for troubleshooting and monitoring.
system.tables
Contains metadata about every table.
Useful information includes:
- Database name
- Table name
- Storage engine
- Total rows
- Storage settings
system.columns
Displays metadata for every column.
Information includes:
- Column name
- Data type
- Default values
- Compression codec
- Expressions
system.parts
Provides details about MergeTree data parts.
Useful for:
- Row counts
- Disk usage
- Active parts
- Storage growth
system.processes
Shows all currently running queries.
Useful for monitoring:
- Active SQL statements
- Execution time
- Memory usage
- Current user
system.query_log
Stores historical query execution information.
Useful for:
- Slow query analysis
- Query history
- Performance optimization
- User activity
system.errors
Contains server error statistics.
Useful for:
- Identifying recurring problems
- Debugging server issues
- Monitoring error frequency
system.merges
Displays currently running background merge operations.
Useful for:
- Monitoring MergeTree maintenance
- Identifying merge bottlenecks
system.mutations
Tracks asynchronous UPDATE and DELETE operations.
Useful for:
- Mutation progress
- Pending updates
- Failed mutations
system.replicas
Shows replication status for replicated tables.
Useful for:
- Replication lag
- Replica health
- Queue size
- Read-only replicas
system.disks
Displays configured storage disks.
Useful for:
- Free space
- Total capacity
- Disk health
When Should You Use System Tables?
System tables become extremely valuable whenever performance or operational issues occur.
Typical troubleshooting scenarios include:
- Finding slow queries
- Investigating long-running queries
- Monitoring storage growth
- Tracking background merges
- Checking mutation progress
- Verifying replication health
- Monitoring disk usage
- Investigating server errors
These tables provide direct visibility into the database without requiring additional monitoring software.
Troubleshooting with System Tables
1. Identify Slow Queries
Review the slowest executed queries.
SELECT
query,
query_duration_ms
FROM system.query_log
ORDER BY query_duration_ms DESC
LIMIT 10;
This helps identify:
- Expensive queries
- Poorly optimized SQL
- Queries requiring indexes or redesign
2. Monitor Running Queries
View currently executing queries.
SELECT
query,
elapsed,
memory_usage
FROM system.processes
ORDER BY elapsed DESC;
Useful for:
- Long-running queries
- Memory-heavy queries
- Current server activity
3. Monitor Background Merges
View ongoing MergeTree merges.
SELECT *
FROM system.merges;
Useful for understanding whether background merges are affecting server performance.
4. Track Pending Mutations
Check UPDATE and DELETE operations that have not yet completed.
SELECT *
FROM system.mutations
WHERE is_done = 0;
Useful for:
- Monitoring asynchronous mutations
- Identifying stalled operations
5. Verify Replication Health
Inspect replica status.
SELECT
database,
table,
queue_size,
is_readonly
FROM system.replicas;
Useful for:
- Replication lag
- Queue backlog
- Replica availability
6. Monitor Disk Usage
View storage utilization.
SELECT
name,
free_space,
total_space
FROM system.disks;
Useful for:
- Preventing storage exhaustion
- Capacity planning
- Monitoring available disk space
Additional Monitoring Queries
View Largest Tables
SELECT
database,
table,
formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE active = 1
GROUP BY
database,
table
ORDER BY sum(bytes_on_disk) DESC;
View Recent Query History
SELECT
event_time,
query,
query_duration_ms
FROM system.query_log
ORDER BY event_time DESC
LIMIT 20;
List All Databases
SELECT *
FROM system.databases;
View Configured Storage Disks
SELECT *
FROM system.disks;
Check Server Errors
SELECT *
FROM system.errors;
View ClickHouse Version
SELECT version();
Example Troubleshooting Workflow
Imagine users report that dashboards have become noticeably slower.
A systematic troubleshooting approach could be:
- Inspect system.processes to identify long-running queries.
- Review system.query_log for recently executed slow queries.
- Check system.merges to determine whether background merges are consuming resources.
- Verify replica health using system.replicas.
- Monitor available disk space using system.disks.
Following this workflow helps isolate performance bottlenecks efficiently.
Best Practices
For effective monitoring and troubleshooting:
- Monitor system.processes regularly.
- Review system.query_log to optimize slow queries.
- Track system.parts to monitor storage growth.
- Monitor system.merges and system.mutations to ensure background operations complete successfully.
- Check system.replicas regularly in replicated clusters.
- Monitor system.disks to avoid storage bottlenecks.
- Enable query logging in production environments.
- Integrate system tables into dashboards such as Grafana for continuous monitoring.
- Configure alerts for abnormal query durations, replication lag, disk usage, and server errors.
Conclusion
ClickHouse® system tables are one of the most powerful built-in tools for monitoring and troubleshooting database performance.
Tables such as system.processes, system.query_log, system.parts, system.mutations, and system.replicas provide deep visibility into query execution, storage, replication, and server health.
By making system tables part of your regular monitoring strategy, you can quickly diagnose issues, optimize performance, and maintain a reliable, high-performing ClickHouse deployment.
Top comments (0)