Why TimescaleDB?
Time-series data is everywhere: IoT sensors, stock prices, server metrics, user analytics. PostgreSQL alone struggles with billions of time-stamped rows. TimescaleDB extends PostgreSQL with automatic partitioning, compression, and continuous aggregates — while keeping full SQL compatibility.
Timescale Cloud free tier: 25 GB storage, 1 CPU, 1 month trial. Self-hosted: completely free forever.
Getting Started
Option 1: Timescale Cloud (Managed)
Sign up at timescale.com — get a managed PostgreSQL + TimescaleDB instance in 30 seconds.
Option 2: Self-Hosted (Docker)
docker run -d --name timescaledb \
-p 5432:5432 \
-e POSTGRES_PASSWORD=password \
timescale/timescaledb:latest-pg16
Create a Hypertable
-- Regular PostgreSQL table
CREATE TABLE sensor_data (
    time        TIMESTAMPTZ NOT NULL,
    sensor_id   TEXT NOT NULL,
    temperature DOUBLE PRECISION,
    humidity    DOUBLE PRECISION
);
-- Convert to hypertable — this is the magic
SELECT create_hypertable('sensor_data', by_range('time'));
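Behind the scenes, `create_hypertable` splits the table into time-based chunks (the default `chunk_time_interval` is 7 days), and queries only touch the chunks that overlap their time range. As a rough illustration only — the epoch alignment here is a simplification, not TimescaleDB's actual internals — here is how timestamps map onto fixed-width chunk ranges:

```python
from datetime import datetime, timedelta

# Simplified sketch of chunk routing (NOT TimescaleDB's real internals):
# rows land in fixed 7-day windows aligned to an epoch.
CHUNK_INTERVAL = timedelta(days=7)
EPOCH = datetime(1970, 1, 1)  # simplification; real alignment differs

def chunk_range(ts: datetime) -> tuple[datetime, datetime]:
    """Return the [start, end) window of the chunk ts falls into."""
    n = (ts - EPOCH) // CHUNK_INTERVAL
    start = EPOCH + n * CHUNK_INTERVAL
    return start, start + CHUNK_INTERVAL

print(chunk_range(datetime(2024, 5, 1)))
```

Because each chunk covers a bounded time range, a `WHERE time > NOW() - INTERVAL '24 hours'` filter can skip every chunk older than a day instead of scanning the whole table.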
Insert Data
INSERT INTO sensor_data (time, sensor_id, temperature, humidity)
VALUES
(NOW(), 'sensor-1', 22.5, 45.0),
(NOW() - INTERVAL '1 hour', 'sensor-1', 21.8, 47.2),
(NOW() - INTERVAL '2 hours', 'sensor-2', 23.1, 42.8);
Python Example
import psycopg2
from datetime import datetime, timedelta
import random
conn = psycopg2.connect("postgresql://postgres:password@localhost:5432/postgres")
cur = conn.cursor()
# Generate sample data
for i in range(10000):
    cur.execute(
        "INSERT INTO sensor_data VALUES (%s, %s, %s, %s)",
        (
            datetime.now() - timedelta(minutes=i),
            f"sensor-{random.randint(1, 10)}",
            20 + random.random() * 10,
            40 + random.random() * 20,
        ),
    )
conn.commit()
# Time-bucket query — aggregate by 1 hour
cur.execute("""
SELECT time_bucket('1 hour', time) AS bucket,
sensor_id,
AVG(temperature) AS avg_temp,
MAX(temperature) AS max_temp
FROM sensor_data
WHERE time > NOW() - INTERVAL '24 hours'
GROUP BY bucket, sensor_id
ORDER BY bucket DESC;
""")
for row in cur.fetchall():
    print(f"{row[0]} | {row[1]} | avg: {row[2]:.1f}°C | max: {row[3]:.1f}°C")
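Conceptually, `time_bucket()` truncates each timestamp down to the start of its fixed-width bucket. A minimal pure-Python sketch of that behavior (the `2000-01-03` default origin mirrors TimescaleDB's documented default, chosen so weekly buckets start on a Monday):

```python
from datetime import datetime, timedelta

def time_bucket(width: timedelta, ts: datetime,
                origin: datetime = datetime(2000, 1, 3)) -> datetime:
    """Truncate ts down to the start of its bucket, mimicking the
    SQL time_bucket(width, ts). Illustration only."""
    n = (ts - origin) // width
    return origin + n * width

# 10:37 falls into the 10:00-11:00 hourly bucket
print(time_bucket(timedelta(hours=1), datetime(2024, 5, 1, 10, 37)))
# → 2024-05-01 10:00:00
```

Grouping by these truncated timestamps is what turns raw readings into one aggregate row per hour per sensor.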
Continuous Aggregates (Auto-Refresh Materialized Views)
CREATE MATERIALIZED VIEW hourly_stats
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS bucket,
sensor_id,
AVG(temperature) AS avg_temp,
MIN(temperature) AS min_temp,
MAX(temperature) AS max_temp,
COUNT(*) AS readings
FROM sensor_data
GROUP BY bucket, sensor_id;
-- Auto-refresh every hour
SELECT add_continuous_aggregate_policy('hourly_stats',
start_offset => INTERVAL '3 hours',
end_offset => INTERVAL '1 hour',
schedule_interval => INTERVAL '1 hour');
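Each run of the policy re-materializes a sliding window from `now - start_offset` up to `now - end_offset`; buckets newer than `end_offset` stay unmaterialized until a later run. The window arithmetic, sketched in Python:

```python
from datetime import datetime, timedelta

def refresh_window(now: datetime, start_offset: timedelta,
                   end_offset: timedelta) -> tuple[datetime, datetime]:
    """Range of data a continuous-aggregate refresh job materializes:
    from now - start_offset up to now - end_offset. Illustration only."""
    return now - start_offset, now - end_offset

# With start_offset = 3h and end_offset = 1h, a run at 12:00
# re-materializes buckets between 09:00 and 11:00.
print(refresh_window(datetime(2024, 1, 1, 12, 0),
                     timedelta(hours=3), timedelta(hours=1)))
```

Keeping `end_offset` at least one bucket wide avoids repeatedly re-materializing the still-changing current bucket.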
Compression (90%+ Space Savings)
ALTER TABLE sensor_data SET (
timescaledb.compress,
timescaledb.compress_segmentby = 'sensor_id'
);
SELECT add_compression_policy('sensor_data', INTERVAL '7 days');
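`compress_segmentby` tells TimescaleDB which column to group rows by before encoding: within each segment, the remaining columns are stored as per-segment arrays, which is why values that vary smoothly within one sensor compress so well. A conceptual sketch of just the grouping step (not the real codec):

```python
from itertools import groupby
from operator import itemgetter

# Conceptual sketch only: group rows by the segmentby column and pack
# each remaining column into a per-segment array, the way columnar
# compression arranges data before encoding it.
rows = [
    ("sensor-1", 22.5),
    ("sensor-2", 23.1),
    ("sensor-1", 21.8),
]
rows.sort(key=itemgetter(0))  # groupby requires sorted input
segments = {
    sensor: [temp for _, temp in group]
    for sensor, group in groupby(rows, key=itemgetter(0))
}
print(segments)  # {'sensor-1': [22.5, 21.8], 'sensor-2': [23.1]}
```

A good `segmentby` column is one you filter or group by often — here, queries for a single `sensor_id` can decompress only that sensor's segments.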
Node.js Example
const { Client } = require("pg");

// Top-level await isn't available in CommonJS, so wrap the work
// in an async function.
async function main() {
  const client = new Client({
    connectionString: "postgresql://postgres:password@localhost:5432/postgres",
  });
  await client.connect();

  // Query with time_bucket
  const result = await client.query(`
    SELECT time_bucket('15 minutes', time) AS bucket,
           AVG(temperature) AS avg_temp
    FROM sensor_data
    WHERE time > NOW() - INTERVAL '6 hours'
    GROUP BY bucket
    ORDER BY bucket DESC
    LIMIT 24
  `);

  result.rows.forEach((row) => {
    console.log(`${row.bucket}: ${parseFloat(row.avg_temp).toFixed(1)}°C`);
  });

  await client.end();
}

main().catch(console.error);
Use Cases
- IoT monitoring — sensor data with automatic downsampling
- Financial data — OHLCV candles with time_bucket
- DevOps metrics — server/container metrics with retention policies
- Analytics — user event tracking with fast aggregation queries
Need to scrape time-series data from the web? I build production-ready scrapers for price tracking, monitoring, and data collection. Check out my Apify actors or email spinov001@gmail.com for custom solutions.
What time-series data are you working with? Share in the comments!