Gabor Koos

Posted on Sep 20 • Originally published at blog.gaborkoos.com

The Database Zoo: Time Series Databases

#database

This post is part of The Database Zoo: Exotic Data Storage Engines , a series exploring purpose-built databases engineered for specific workloads. Each post dives into a different type of specialized engine, explaining the problem it solves, the design decisions behind its architecture, how it stores and queries data efficiently, and real-world use cases. The goal is to show not just what these databases are, but why they exist and how they work under the hood.

Introduction

Time-series data is everywhere in modern systems. Unlike traditional transactional data, which tends to be structured and relatively static, time-series data is continuous, high-volume, and temporal. Common examples include:

Metrics from computing systems: CPU usage, memory consumption, disk I/O, network latency, and error rates, often collected at sub-second intervals from thousands of servers or containers.
IoT sensor readings: Temperature, humidity, pressure, motion, or energy consumption from sensors embedded in devices, vehicles, or industrial equipment. These streams can be massive, especially in large-scale deployments.
Financial tick data: High-frequency trades, stock prices, currency exchange rates, and market orders, which are inherently time-dependent and must be processed in near real-time.
Event logs and telemetry: Application logs, user interactions, system events, and audit trails that are continuously generated, often across distributed services.

These workloads share several key characteristics that distinguish them from traditional relational or NoSQL use cases:

High ingestion rates: Millions of data points can arrive per second, requiring the database to handle rapid writes efficiently.
Time-ordered access patterns: Queries often request ranges of data over a period (e.g., last 24 hours), which benefits from sequential storage layouts and specialized indexing.
Retention and downsampling: Systems frequently need to retain detailed recent data while summarizing or discarding older data, demanding automated retention policies and rollups.
Compression needs: Storing high-volume time-series data without optimization can consume vast amounts of disk space, making compression essential for efficiency.

While general-purpose SQL and NoSQL databases are flexible and reliable for many workloads, they struggle with these characteristics. Relational databases often become write-bound when handling millions of inserts per second, and range queries over time can be slow. Many NoSQL engines, while horizontally scalable, lack native support for time-centric queries, retention policies, or specialized compression schemes.

The goal of this post is to explain why Time-Series Databases exist, how they solve the unique challenges of temporal data, and what trade-offs they make. We will explore:

The architectural principles that enable high-throughput ingestion, efficient storage, and fast querying.
Core storage formats, indexing strategies, and compression techniques used in TSDBs.
Common query patterns and optimizations that make analysis of time-series data feasible at scale.
Real-world examples and use cases, showing how popular engines implement these ideas.

By the end of this post, you will understand not only the technical foundations of TSDBs but also when and why to choose a specialized engine for time-dependent workloads, rather than relying on general-purpose databases.

Why General-Purpose Databases Struggle

Even the most robust relational and NoSQL databases face challenges when dealing with time-series workloads. The patterns and scale of temporal data create performance bottlenecks, storage inefficiencies, and operational complexity that general-purpose engines were not designed to handle.

High Ingestion Rates

Time-series data is inherently write-heavy. Monitoring thousands of servers or IoT devices can generate millions of measurements per second. In a traditional RDBMS, each insert involves:

Writing to a table with a fixed schema
Maintaining indexes for query acceleration
Logging the transaction for durability (WAL or similar mechanisms)
Ensuring ACID guarantees

This sequence can become a bottleneck, limiting write throughput and increasing latency. Many NoSQL stores handle high write volumes better but still require careful sharding and partitioning to avoid hotspots, and lack native support for efficient temporal queries.

Range Queries Over Time

Time-series workloads are dominated by range queries: fetching all values of a metric over a time interval. Relational databases excel at point lookups or joins but scanning millions or billions of rows for a time range can be slow, even with indexes. Similarly, key-value or document stores may require multiple queries or application-side filtering to reconstruct a time range, adding overhead and complexity.

Retention and Downsampling

Applications often need to retain detailed recent data for monitoring or analysis while summarizing or discarding older data. In general-purpose systems, this is cumbersome:

Manual deletion scripts or batch jobs must be scheduled
Downsampling (aggregating data into coarser intervals) requires complex queries
Storage costs increase rapidly without automated policies

Time-series databases automate retention and downsampling, reducing operational burden and storage footprint.

Compression Needs

High-volume time-series data can quickly overwhelm storage. Traditional row-oriented databases store each data point with full metadata and padding, wasting space. Without specialized compression:

Disk usage balloons with millions of metrics per day
Cache efficiency suffers, impacting query performance
Backups and replication become more resource-intensive

Specialized TSDBs implement delta encoding, run-length encoding, Gorilla-style compression, and columnar layouts to store data efficiently while preserving query performance.

Summary

General-purpose SQL and NoSQL databases are flexible and reliable for a wide range of workloads, but time-series data presents unique challenges:

Massive, continuous write streams
Predominantly sequential, time-based queries
Retention and aggregation requirements
Storage efficiency at scale

These challenges set the stage for Time-Series Databases, purpose-built engines that optimize for the characteristics of temporal data. By designing around these specific workloads, TSDBs achieve high write throughput, fast queries, and efficient storage that general-purpose systems cannot match.

Core Architecture

Time-Series Databases are designed from the ground up to handle high-volume, time-ordered data efficiently. Unlike general-purpose databases, TSDBs make deliberate design choices that optimize ingestion speed, storage efficiency, and temporal query performance. Let's break down the key architectural elements.

Storage Layouts

TSDBs typically adopt storage layouts that exploit the sequential nature of time-series data:

Append-only log structure: Data points are written sequentially to disk or memory. This approach simplifies writes, reduces lock contention, and allows high throughput.
Time-partitioned chunks: Data is grouped into fixed intervals (e.g., hourly or daily segments). Partitioning enables efficient range scans, retention, and compaction.
Columnar layouts: Some TSDBs store each field (timestamp, value, tags) in separate columns. Columnar storage improves compression and accelerates aggregations over large ranges.
Hybrid in-memory and on-disk storage: Recent data may reside in memory for low-latency access, while older chunks are persisted on disk for durability and cost-efficiency.

This storage design allows TSDBs to ingest millions of points per second, while keeping historical data queryable with minimal overhead.

Indexing Strategies

Efficient indexing is critical for fast range queries and tag-based filtering:

Time-first primary keys: Each data point is keyed by series ID (metric + tags) and timestamp, allowing sequential retrieval of time ranges.
Secondary indexes on dimensions: Tags such as host, region, or sensor type enable filtering and grouping without scanning all data.
In-memory indexes for hot data: Recent or frequently accessed data may be indexed in memory for sub-millisecond query response.
Disk-based indexes for historical data: On-disk structures allow efficient access to older chunks without consuming excessive memory.

Some TSDBs, like InfluxDB, implement series-to-chunk maps to efficiently track which data partitions contain specific series, enabling faster query execution.

Compression Techniques

Time-series data exhibits temporal and numeric patterns that TSDBs exploit to reduce storage:

Delta encoding: Stores differences between consecutive timestamps or values rather than absolute numbers. This is effective since time intervals are often regular and values change gradually.
Run-length encoding (RLE): Efficiently stores repeated or constant values over time. RLE is a simple, lossless data compression technique that replaces sequences of identical, consecutive data values (called "runs") with a single data value and a count of its occurrences.
Gorilla-style compression: Optimized for floating-point time-series, encoding both timestamps and values using minimal bits while preserving precision. Instead of storing absolute timestamps, the system stores the difference between consecutive timestamps, and then the difference of those differences (delta-of-delta). This is efficient because time-series data often has small time intervals between data points. Floating-point numbers (like sensor readings) are often stored using a common and efficient bitwise operation called XOR. This method takes advantage of the similarity in many successive floating-point values in time-series data to compress them effectively. Gorilla compression uses a combination of these two techniques (ie, delta-of-delta for timestamps and XOR for values) to achieve high compression ratios while allowing fast decompression during queries.
Columnar block compression: Compresses each field independently, allowing fast scans and aggregation without full decompression.

These techniques can reduce storage requirements by an order of magnitude compared to naïve row-oriented storage, while maintaining fast query performance.

Retention and Downsampling

A core feature of TSDBs is automated retention and rollup:

Retention policies: Automatically expire or drop old data after a configurable interval.
Downsampling: Aggregates older data into coarser intervals (e.g., average CPU usage per hour), keeping storage growth manageable while retaining historical trends.
Continuous queries: Precompute aggregates or transformations in the background for frequently accessed time ranges.

These features relieve engineers from manually managing data lifecycles, while ensuring that queries on recent or long-term data remain efficient.

Summary

The core architecture of TSDBs revolves around four pillars:

Sequential, append-only storage optimized for writes.
Time- and dimension-based indexing for fast retrieval.
Advanced compression schemes to minimize disk usage without sacrificing speed.
Retention and downsampling mechanisms to manage storage and maintain query performance over time.

Together, these architectural choices allow TSDBs to handle workloads that would overwhelm general-purpose databases, delivering high-throughput ingestion, efficient storage, and fast temporal queries.

Query Execution and Patterns

Time-series workloads are dominated by temporal queries, aggregations, and filtering by dimensions or tags. TSDBs optimize both the query model and the execution engine to make these operations efficient, even over billions of data points.

Common Query Types

Range Queries

Fetch all data points for a metric or series over a specific time interval.

Example: CPU usage for the last 24 hours.

Optimized by: time-ordered storage, partition pruning, and sequential reads.

Aggregations

Compute min, max, average, sum, percentiles, or custom functions over a range of points.

Example: average temperature per hour for a week.

Optimized by: columnar storage and pre-aggregated chunks (downsampling).

Grouping by Tags / Dimensions

Organize metrics by metadata, e.g., host, region, device type.

Example: average memory usage per host across a data center.

Optimized by: secondary indexes and series-to-chunk maps.

Downsampling and Interval Aggregation

Aggregate data into coarser time intervals for long-term trends.

Example: 1-minute averages rolled up into hourly summaries.

Optimized by: continuous queries or materialized aggregates.

Alerting / Threshold Queries

Identify points exceeding thresholds or patterns requiring action.

Example: trigger alert if latency > 200ms for 5 consecutive minutes.

Optimized by: in-memory indexing and efficient scan algorithms.

Query Execution Strategies

TSDBs translate these queries into efficient execution plans tailored for temporal data:

Chunk/Segment Scanning

Queries access only relevant time-partitioned chunks.
Reduces disk I/O and memory usage by skipping irrelevant data.

Compression-aware scanning

Many TSDBs can aggregate data directly on compressed blocks without decompressing fully.
Minimizes CPU overhead and accelerates queries.

Merge of in-memory and on-disk data

Recent data in memory is combined with historical on-disk chunks seamlessly.
Ensures low-latency access to the most recent measurements.

Parallel execution

Queries often run in parallel across multiple chunks, partitions, or nodes.
Improves throughput for large-scale analytics over long time ranges.

Query pushdown

Filters and aggregates are pushed as close to the storage engine as possible.
Minimizes data movement and leverages compression/indexing for efficiency.

Example Query Patterns

Single series range query: Retrieve all CPU metrics for a server in the past 6 hours.
Multi-series aggregation: Compute the 95th percentile of latency for each service across all data centers.
Tag filtering: Find disk usage metrics only for servers in a specific region or cluster.
Downsampling: Generate hourly averages from per-second metrics for dashboard visualization.

Key Takeaways

Time-series queries are highly structured and predictable:

Most queries are time-bound, scanning a contiguous subset of the dataset.
Aggregation and grouping are common, rather than complex joins.
Efficient execution relies on storage layout, indexing, and compression.

By aligning storage and query execution with the temporal nature of the data, TSDBs achieve performance that general-purpose databases cannot match for these workloads.

Popular Time-Series Database Engines

Several purpose-built Time-Series Databases have emerged over the past decade, each optimized for specific workloads, ingestion rates, and query patterns. Here, we highlight a few widely adopted engines:

InfluxDB

Overview:

InfluxDB is a high-performance, open-source TSDB designed for real-time metrics and analytics. It features a SQL-like query language (InfluxQL) and supports continuous queries, retention policies, and downsampling.

Architecture Highlights:

Storage engine: Time-partitioned, append-only storage with compression.
Indexes: Series-keyed indexing allows fast retrieval by measurement and tag.
Query execution: Supports aggregations, transformations, and joins within time windows.
Retention & downsampling: Built-in policies automate lifecycle management.

Trade-offs:

Excellent for real-time monitoring and high-throughput ingestion.
Limited for multi-tenant, long-term analytics without careful planning.
Query language less expressive than full SQL for complex joins.

Use Cases:

Monitoring systems, IoT telemetry, financial tick data.

TimescaleDB

Overview:

TimescaleDB is a PostgreSQL extension that adds time-series capabilities to a relational database. It leverages PostgreSQL’s ecosystem while optimizing storage and query execution for temporal data.

Architecture Highlights:

Hypertables: Abstract large time-series tables into time-partitioned chunks transparently.
Indexes: Uses PostgreSQL indexing with optimizations for time + dimension queries.
Query execution: Compatible with full SQL, enabling joins, window functions, and complex analytics.
Compression & retention: Native compression and policies reduce storage for historical data.

Trade-offs:

Offers strong relational capabilities alongside time-series optimization.
Ingestion throughput may be lower than specialized TSDBs for extremely high-frequency metrics.
Requires PostgreSQL knowledge for advanced configurations.

Use Cases:

Infrastructure monitoring, business analytics, financial data, IoT applications requiring relational joins.

Prometheus

Overview:

Prometheus is an open-source TSDB focused on monitoring and alerting in cloud-native environments. It uses a pull-based metric collection model and provides a powerful query language, PromQL.

Architecture Highlights:

Storage engine: Append-only log with compression optimized for numeric metrics.
Indexes: Time + series labels for fast lookups.
Query execution: PromQL enables complex temporal queries, aggregations, and alert expressions.
Retention & downsampling: Configurable retention; integrates with remote storage for long-term data.

Trade-offs:

Designed for ephemeral metric storage and monitoring, not general analytics.
Horizontal scaling is possible but requires federation or remote storage.
Limited support for very high cardinality metrics (many unique label combinations).

Use Cases:

Cloud infrastructure monitoring, service health dashboards, real-time alerting.

Other Notable Engines

OpenTSDB: Built on HBase, optimized for large-scale metric storage and aggregation.

Graphite: Focused on simple metrics collection and visualization, widely used in DevOps monitoring.

VictoriaMetrics: High-performance, cost-efficient TSDB with focus on large-scale deployments and long-term storage.

Key Takeaways

While each TSDB has its own approach, they share common traits:

Time-focused storage: All optimize for sequential writes and time-ordered reads.
Compression and retention: Reducing storage overhead and maintaining query speed is a priority.
Workload alignment: Each engine balances ingestion rate, query expressiveness, and scalability differently.
Trade-offs: Choosing a TSDB involves considering ingestion speed, query complexity, retention requirements, and operational overhead.

Understanding the distinctions between engines helps engineers select the right tool for their specific time-series workloads, rather than forcing a general-purpose database to do the job.

Trade-offs and Considerations

Time-Series Databases excel at workloads that would challenge general-purpose systems, but their optimizations come with compromises. Understanding these trade-offs is essential when selecting or designing a TSDB for your application.

Ingestion vs. Query Complexity

High-throughput ingestion is a primary goal for most TSDBs, often achieved through append-only storage, sequential writes, and minimal indexing on write paths.
Complex queries, especially ad hoc joins or multi-metric correlations, may be slower because the storage layout and indexes are optimized for time-based scans rather than relational joins.
Engineers must balance the need for fast writes with the types of queries required, sometimes precomputing aggregates or using hybrid systems (e.g., TimescaleDB) to handle complex queries efficiently.

Storage Efficiency vs. Latency

Compression reduces disk usage dramatically but may introduce CPU overhead during queries.
Some TSDBs allow querying directly on compressed blocks, while others require partial decompression.
Decision points include how long data must be retained, query frequency, and acceptable latency for analysis versus monitoring dashboards.

Retention and Downsampling Trade-offs

Automated retention policies and downsampling reduce storage costs and speed up queries on historical data.
However, downsampling may lose granularity, limiting fine-grained historical analysis.
Choosing retention and aggregation strategies depends on your business requirements, regulatory constraints, and storage budgets.

Scalability Considerations

TSDBs often scale horizontally via sharding or partitioning by time intervals or series keys.
Some engines (InfluxDB, Prometheus) may encounter challenges with high cardinality metrics (many unique tag combinations).
Distributed TSDBs (TimescaleDB multi-node, VictoriaMetrics) handle scale better but introduce network overhead and operational complexity.

Operational Complexity

Specialized TSDBs require domain knowledge for tuning retention, compression, and partitioning.
Backup, replication, and disaster recovery procedures can differ significantly from general-purpose databases.
Engineers must consider monitoring the database itself, especially for high-volume workloads, to avoid ingestion bottlenecks or query slowdowns.

Ecosystem and Tooling

Consider the query language (InfluxQL, SQL, PromQL) and integration with visualization or alerting tools (Grafana, Chronograf, custom dashboards).
Ecosystem maturity affects community support, libraries, and operational best practices, which are important for production deployments.

Key Takeaways

Choosing a Time-Series Database is about matching workload characteristics to engine strengths:
High-frequency metrics → prioritize ingestion-optimized engines like InfluxDB or VictoriaMetrics.
Complex queries with relational joins → consider SQL-compatible extensions like TimescaleDB.
Cloud-native monitoring → Prometheus offers a strong ecosystem and real-time alerting.

No TSDB is perfect for every scenario. Understanding ingestion patterns, query complexity, storage constraints, and operational overhead is crucial for selecting the right tool. When designed and deployed carefully, a TSDB provides unparalleled performance and efficiency for temporal data, enabling insights that general-purpose databases cannot deliver.

Use Cases and Real-World Examples

Time-Series Databases are not just academic exercises, they solve pressing, high-volume, temporal data problems across industries. Here are some concrete use cases that illustrate why TSDBs are indispensable.

Infrastructure and Application Monitoring

Scenario: Monitoring servers, containers, and applications in real-time.

Metrics: CPU, memory, disk I/O, network latency, request rates.

Challenges: Millions of metrics per second, low-latency queries, alerting on thresholds.

TSDB Benefits:

InfluxDB, Prometheus, and VictoriaMetrics ingest high-frequency metrics efficiently.
Downsampling and retention policies reduce storage while keeping recent data detailed.
Real-time queries and aggregations enable dashboards and alerting pipelines.

Example: A major SaaS company uses Prometheus to monitor microservices, triggering alerts when latency exceeds thresholds, while storing long-term trends in TimescaleDB for capacity planning.

IoT Sensor Data

Scenario: Collecting readings from millions of connected devices (temperature, humidity, GPS).

Challenges: Continuous ingestion from devices, variable reporting intervals, and long-term storage.

TSDB Benefits:

Sequential, append-only writes handle bursts of sensor data.
Partitioning by time intervals simplifies retrieval and downsampling.
Compression techniques reduce storage cost for massive datasets.

Example: A smart city project uses TimescaleDB to store traffic sensor and environmental data, aggregating it hourly for urban planning analytics.

Financial Tick Data and Trading Analytics

Scenario: High-frequency trading platforms storing price, volume, and order book data.

Challenges: Millisecond-level ingestion, historical analysis for backtesting, and low-latency queries.

TSDB Benefits:

Ingestion-optimized engines handle millions of events per second.
Range queries enable efficient retrieval of historical time windows.

Example: Hedge funds using InfluxDB for intraday market data and TimescaleDB for end-of-day historical analysis.

Event Logging and Telemetry

Scenario: Logging application events, API requests, or user interactions.

Challenges: Sequential write-heavy workloads, large volumes, and querying trends over time.

TSDB Benefits:

Append-only structures and compression efficiently store vast log streams.
Retention policies automatically purge stale data.
Downsampling or aggregation enables long-term trend analysis without overwhelming storage.

Example: A SaaS company stores API logs in VictoriaMetrics, allowing engineers to analyze usage patterns and detect anomalies.

Key Takeaways

Time-Series Databases shine when workloads involve high-frequency writes, sequential or range queries, and aggregation-heavy analysis. Across infrastructure monitoring, IoT, finance, and telemetry, TSDBs provide:

High write throughput to handle real-time streams.
Efficient storage and retention to keep historical data manageable.
Fast temporal queries and aggregations for dashboards, alerts, and analytics.

By choosing a purpose-built TSDB rather than forcing general-purpose databases to handle temporal workloads, engineers gain scalability, performance, and operational simplicity.

Example Workflow: Ingesting and Querying CPU Usage with InfluxDB

This section provides a concrete end-to-end example of a time-series workflow, using InfluxDB to illustrate how data moves from ingestion to visualization, highlighting the architecture and optimizations discussed earlier.

Scenario

We want to monitor CPU usage across a small cluster of three servers, collecting metrics every second. This example demonstrates the complete TSDB workflow:

Data ingestion
Storage layout and indexing
Query execution
Retention and downsampling
Visualization

By following this workflow, we can see how InfluxDB's architecture enables high-throughput writes, efficient storage, fast queries, and automated data lifecycle management.

Step 1: Ingestion

Metrics are collected using Telegraf (InfluxDB's agent) or a custom script. Each data point includes:

measurement: cpu
tags: host=server1
fields: usage_user=12.5, usage_system=3.2
timestamp: 2025-09-18T10:15:00Z

Data is sent via HTTP API or written directly through InfluxDB's client libraries.

Key Concepts Illustrated:

Append-only writes: Data is appended sequentially, reducing lock contention.
Sequential storage: Enables high-throughput ingestion.
Minimal write overhead: Indexing and compression happen efficiently in the background.

Step 2: Storage and Indexing

InfluxDB organizes data into time-partitioned chunks called TSM (Time-Structured Merge) files, and maintains indexes for fast retrieval.

Storage highlights:

In-memory WAL (Write-Ahead Log): Recent writes are stored in memory before flushing to disk.
TSM chunks on disk: Store older data efficiently with compression.
Tag-based indexing: Each metric can be filtered by tags like host, region, or service.

This architecture enables efficient sequential writes and fast range queries, even on millions of data points.

Step 3: Query Execution

Example query: "Average CPU usage for server1 over the last 5 minutes."

InfluxQL:

SELECT mean(usage_user)
FROM cpu
WHERE host='server1' AND time > now() - 5m

Execution steps:

Identify chunks containing server1 using the tag index.
Merge in-memory WAL and on-disk TSM chunks.
Read only the relevant time range.
Apply compression-aware aggregation (mean) directly on compressed data when possible..
Return the aggregated result.

Step 4: Retention and Downsampling

Retention Policies:

Keep 7 days of raw CPU metrics.
Older data is automatically deleted.

Downsampling via Continuous Queries:

Compute hourly averages for historical data.
Store aggregated data in a downsampled measurement to reduce storage and maintain query performance.

This ensures recent data remains granular, while historical data is summarized efficiently.

Step 5: Visualization

Query results can be fed into Grafana dashboards:

Real-time metrics: Display per-second CPU usage for recent activity.
Historical trends: Use hourly averages to visualize long-term patterns.
Alerts: Trigger notifications if CPU usage exceeds thresholds for a specified duration.

Conclusion

Time-Series Databases are purpose-built engines that address the unique challenges of temporal data: high-volume writes, time-ordered queries, automated retention, and efficient storage. Unlike general-purpose relational or NoSQL databases, TSDBs are designed from the ground up to handle continuous, sequential, and often high-frequency workloads with minimal operational overhead.

Through our InfluxDB workflow example, we've seen how a TSDB handles the full lifecycle of time-series data: from ingesting per-second CPU metrics, organizing them in time-partitioned and indexed storage, executing compression-aware queries, managing retention and downsampling, to visualizing insights in real-time dashboards. This end-to-end perspective highlights the architectural optimizations—append-only writes, in-memory WAL, TSM storage, tag-based indexing, and automated rollups—that make TSDBs uniquely suited for temporal workloads.

When choosing a time-series database, it's essential to balance ingestion throughput, query complexity, retention requirements, and operational considerations. Engines like InfluxDB, TimescaleDB, Prometheus, and VictoriaMetrics each make different trade-offs, reflecting the diversity of time-series use cases from infrastructure monitoring and IoT telemetry to financial tick data and event logging.

Ultimately, understanding the core principles and trade-offs behind TSDBs empowers engineers to select the right tool for their workloads, ensuring that temporal data is captured efficiently, queried rapidly, and stored sustainably. By leveraging purpose-built time-series engines, teams can gain actionable insights from data streams that would overwhelm general-purpose databases, unlocking performance, scalability, and observability in systems that rely on real-time temporal information.