In the world of Observability and Big Data, ClickHouse is currently the undisputed king. It's incredibly fast, compresses data like magic, and can handle petabytes of logs without breaking a sweat.
So, when I started building LogWard (an open-source Datadog alternative), everyone told me: "Just use ClickHouse."
I didn't. I chose PostgreSQL (specifically, TimescaleDB).
And honestly? For 99% of developers and SMBs, I think it's the better choice. Here's why.
The Context: The "Middle Ground" Problem
I built LogWard because I needed a self-hosted log management tool that fits on a $5 DigitalOcean droplet.
- ELK Stack (Elasticsearch) was out immediately. It eats 4GB of RAM just to say "Hello".
- Loki is great, but I find LogQL hard to teach to junior developers, and the setup (Promtail + Loki + Grafana) feels disjointed.
I needed something lightweight, transactional, and easy to manage.
ClickHouse is a Formula 1 Car... in a Traffic Jam
ClickHouse is amazing. I use it for analytics in other projects. But for a self-hosted, "install-and-forget" logging tool for developers, it has drawbacks:
- Operational Complexity: Managing ZooKeeper (or ClickHouse Keeper) and sharding is not trivial for a solo dev or a small team.
- Resource Usage: While lighter than Elastic, it still wants plenty of CPU and RAM for its aggressive background merges.
- The "Update/Delete" Problem: Logs are mostly append-only, but sometimes you need to delete old data or handle GDPR "Right to be Forgotten" requests. In Postgres, that's a DELETE FROM logs WHERE.... In columnar stores, mutations are expensive, heavy operations (see the sketch below).
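For illustration, here's a minimal sketch of what a GDPR erasure could look like in plain Postgres. The user_id column is a hypothetical field, not necessarily part of LogWard's actual schema:

-- Hypothetical example: erase every log line tied to one user.
-- Assumes the logs table carries a user_id column (illustrative only).
DELETE FROM logs WHERE user_id = 'a1b2c3d4';

In Postgres this is a single transactional statement; in a columnar store the same request turns into a heavyweight mutation.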
Enter TimescaleDB: The "Boring" Solution
TimescaleDB is an extension that turns PostgreSQL into a time-series powerhouse. It gives you Hypertables (automatic partitioning by time) and Compression that rivals columnar stores.
Here's why I chose it for LogWard:
1. It's Just Postgres (SQL)
Developers know SQL. If I want to find all error logs from a specific service in the last hour, I don't need to learn a new query language (like LogQL or Splunk SPL).
SELECT * FROM logs
WHERE service = 'api-gateway'
AND level = 'error'
AND time > NOW() - INTERVAL '1 hour';
Every developer on the planet can read this.
2. Automatic Partitioning & Retention
With one command, TimescaleDB automatically splits my huge logs table into smaller chunks based on time.
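That one command is create_hypertable. A minimal sketch, assuming a simplified logs schema (column names are illustrative, not LogWard's exact table):

CREATE TABLE logs (
  time    TIMESTAMPTZ NOT NULL,
  service TEXT,
  level   TEXT,
  message TEXT
);

-- Turn the plain table into a hypertable, chunked by time.
SELECT create_hypertable('logs', 'time');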
Even better, implementing a retention policy (e.g., "Delete logs older than 14 days") is literally one line of code:
SELECT add_retention_policy('logs', INTERVAL '14 days');
No cron jobs, no scripts. The database handles the cleanup in the background.
3. Compression is Insane
Logs are highly repetitive text. TimescaleDB compresses chunks of data after a certain period. In my production tests with LogWard, I'm seeing 90-95% compression rates. A 100GB log file becomes ~5-7GB on disk. That makes storing logs on cheap SSDs viable.
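Enabling it takes a couple of statements. A sketch under assumed settings; segmenting by service is my guess at a sensible grouping, not necessarily LogWard's exact configuration:

-- Enable columnar compression on the hypertable, grouping rows by service.
ALTER TABLE logs SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'service'
);

-- Compress chunks older than 7 days automatically, in the background.
SELECT add_compression_policy('logs', INTERVAL '7 days');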
4. Ingestion Speed
Can Postgres handle the write load? Using COPY or batched inserts (which I implemented in Fastify), a single Postgres instance can easily handle 50k-100k rows per second. Unless you're Netflix or Uber, you likely don't have that scale. And if you do, you have the budget for Splunk.
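For reference, a batched insert can be as simple as a multi-row VALUES statement; this is a generic sketch, not the exact query the Fastify ingester builds:

-- One round trip, many rows: far cheaper than inserting row by row.
INSERT INTO logs (time, service, level, message) VALUES
  (NOW(), 'api-gateway', 'info',  'request completed'),
  (NOW(), 'api-gateway', 'error', 'upstream timeout'),
  (NOW(), 'billing',     'info',  'invoice generated');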
Conclusion: Use the Tool That Fits the Constraints
If you're ingesting 1 billion logs a day, please use ClickHouse (or Datadog). But if you're building a SaaS, a startup, or a side project, and you want a logging system that:
- Runs on Docker with 512MB RAM
- Uses standard SQL
- Backs up easily with standard tools (pg_dump)
Then Postgres + TimescaleDB is the pragmatic winner.
This architecture is exactly what powers LogWard. It's open-source, and you can spin it up with Docker Compose to see how it performs on your machine.
💻 GitHub Repo: https://github.com/logward-dev/logward
☁️ Free Cloud Alpha: https://logward.dev
Let me know in the comments: Would you trust Postgres with your logs, or is ClickHouse the only way?