Part 1 of a series on building a metrics pipeline into ClickHouse
Collecting metrics is easy.
Shipping them to an analytical database without losing your mind is the hard part.
## The Goal
At the outset, the task seemed straightforward:
Collect system metrics (CPU, memory, GPU) and store them in ClickHouse for analysis.
This is a common observability use case.
You collect metrics, send them somewhere, and run queries on top.
Simple enough.
But in practice, it didn’t go as planned.
## The Initial Approach: Telegraf
I started with Telegraf.
It’s widely used for collecting system metrics and has a plugin-based architecture, which makes it a natural first choice.
This was also where I first came across TOML.
At first, it felt like I just needed to “write a config and run it.”
But very quickly, I realized:
Configuration isn’t just syntax; it defines how your system behaves.
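To make that concrete, here is the kind of minimal Telegraf config involved. Treat it as a hedged sketch, not a working recipe: the ClickHouse URL, the `metrics` table name, and the choice of the generic `http` output are illustrative assumptions. In fact, the mismatch between Telegraf’s JSON output envelope and the `JSONEachRow` format ClickHouse expects is exactly the kind of friction described below.

```toml
# Sketch of a minimal Telegraf config: collect CPU and memory,
# then try to push into ClickHouse over its HTTP interface.
[agent]
  interval = "10s"

[[inputs.cpu]]
  percpu = false
  totalcpu = true

[[inputs.mem]]

# No native ClickHouse output, so fall back to the generic HTTP output.
# The URL-encoded INSERT query and table name here are assumptions.
[[outputs.http]]
  url = "http://localhost:8123/?query=INSERT%20INTO%20metrics%20FORMAT%20JSONEachRow"
  method = "POST"
  data_format = "json"
```

The catch: `data_format = "json"` wraps metrics in an envelope that `JSONEachRow` won’t parse as-is, so this wiring needs extra massaging on one side or the other.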
## What I Was Trying to Build
The idea was simple:
- Collect host-level metrics (CPU, memory, etc.)
- Collect GPU metrics
- Push everything into ClickHouse
- Run analytical queries on top
Essentially, a basic observability pipeline.
## Where Things Started Breaking
On paper, Telegraf looked like it should work.
In reality, I ran into a few issues:
- No native ClickHouse output plugin, and no straightforward workaround for pushing data in
- Debugging wasn’t very intuitive
- Configurations became rigid as complexity increased
At some point, I was spending more time trying to make the tool fit the use case than actually solving the problem.
## A Shift in Perspective
This is where something important clicked.
Up until this point, I was thinking in terms of:
Write config → Run tool → Expect output
But that approach wasn’t working.
What I needed instead was a clearer understanding of how data actually flows:
Data source → Transformation → Destination
The problem wasn’t just the tool; it was the lack of control over how data moved through the system.
## Why I Decided to Move Away
At this stage, it became clear that I needed:
- More control over data transformations
- Better visibility into how data flows
- A system that is easier to debug
Telegraf, while powerful, didn’t give me that level of flexibility for this use case.
## What’s Next
That’s when I decided to try a different approach using Vector.
Instead of treating configuration as static setup, Vector treats it as a pipeline.
In the next part, I’ll walk through:
- How Vector pipelines work
- Why the sources → transforms → sinks model made a difference
- And what changed when I adopted that approach
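As a preview, a Vector pipeline for this use case looks roughly like the following. This is a sketch under assumptions: the component names (`host`, `tag_env`, `ch`), the endpoint, and the table are placeholders, while the option names follow Vector’s documented `host_metrics` source, `remap` transform, and `clickhouse` sink.

```toml
# Sketch of a Vector pipeline: source → transform → sink.
# Component names, endpoint, and table are illustrative placeholders.

[sources.host]
type = "host_metrics"          # collects CPU, memory, disk, etc.

[transforms.tag_env]
type = "remap"                 # VRL transform; add a tag to each metric
inputs = ["host"]
source = '.tags.env = "dev"'

[sinks.ch]
type = "clickhouse"            # Vector ships a native ClickHouse sink
inputs = ["tag_env"]
endpoint = "http://localhost:8123"
table = "metrics"
```

Notice how the `inputs` fields make the data flow explicit; that explicitness is the point of the next post.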
## Series Overview
This post is part of a series:
- Part 1: Telegraf struggles and initial setup (this post)
- Part 2: Moving to Vector and understanding pipelines
- More parts in this series will be published soon.
## Final Thought
What started as a simple setup turned into a deeper lesson:
Tools don’t solve problems; understanding systems does.
Once that became clear, the direction forward was much easier.