If you’ve worked with data lakes for a while, you’ve probably heard names like Apache Iceberg, Delta Lake, or Apache Hudi.
They’re often mentioned together - but what problem do they actually solve, and why do modern systems like ClickHouse care about them?
This post walks through:
- What a data lake really is (and why it breaks down)
- What Apache Iceberg is and what it is not
- Why metadata matters so much
- How writers and readers work together
- Where tools like ClickHouse fit into the picture
- A real-world example tying everything together
By the end, you should have a clear mental model of Iceberg and open table formats.
What is a Data Lake?
A data lake is typically built on cheap, scalable object storage like S3 (or S3-compatible systems such as MinIO).
At its core, a data lake is:
- A place to store large amounts of raw data
- Usually in file formats like Parquet, ORC, or Avro
- Cheap and flexible
But here’s the key thing:
Object storage has no concept of tables, transactions, or schemas.
From the storage system’s perspective:
- A “table” is just a folder
- Files can appear at any time
- There is no guarantee a file is complete
- There is no notion of “latest data”
This is where problems start.
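To make that concrete, here is a minimal sketch using boto3 against a placeholder bucket, prefix, and endpoint: all object storage gives you back is a flat listing of keys, with no schema, no transactions, and no notion of a committed version.

```python
import boto3

# From S3's perspective, a "table" is just a key prefix with objects under it.
# Bucket, prefix, and endpoint below are placeholders (e.g. a local MinIO).
s3 = boto3.client("s3", endpoint_url="http://localhost:9000")

resp = s3.list_objects_v2(Bucket="my-bucket", Prefix="warehouse/analytics/events/")
for obj in resp.get("Contents", []):
    # All we get back are keys, sizes, and timestamps - nothing tells us
    # whether a file is complete or belongs to the "current" table version.
    print(obj["Key"], obj["Size"], obj["LastModified"])
```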
Why Data Lakes Become Painful at Scale
Early data lakes relied on conventions:
- Folder-based partitions (date=2025-01-01/)
- Manual rules for writers and readers
- “Just don’t read while writing”
This works… until it doesn’t.
Common issues:
- Readers see partial writes
- Queries mix old and new data
- Schema changes break jobs
- Deletes and reprocessing are unsafe
- Multiple engines step on each other
In short:
Storage can hold files, but it can’t manage tables.
Enter Apache Iceberg
Apache Iceberg is an open table format designed to bring database-like guarantees to data lakes.
Important clarification:
- 🚫 Iceberg is not a database
- 🚫 Iceberg is not a query engine
- 🟢 Iceberg is a table format
Its job is to define:
- What files belong to a table
- Which version of the table is current
- How readers and writers coordinate safely
You can think of Iceberg as the brain of a data lake table.
The Core Idea: Metadata Over File Scanning
Iceberg introduces a metadata-driven model:
- Data files → Actual Parquet files in object storage
- Metadata files → Describe schemas, partitions, and snapshots
- Snapshots → Define an exact version of the table
- Manifests → List data files and their statistics
A crucial idea:
Query engines using Iceberg never discover data files by scanning object storage; they rely entirely on Iceberg metadata to locate valid files.
This is what enables consistency, time travel, and safe concurrent access.
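To see what that looks like in practice, here is a hedged sketch using the PyIceberg library. The catalog name, REST endpoint, and table name are placeholders; the point is that the current snapshot and the exact list of data files come from metadata, not from listing the bucket.

```python
from pyiceberg.catalog import load_catalog

# Placeholder catalog configuration (a REST catalog is assumed here).
catalog = load_catalog("demo", **{"type": "rest", "uri": "http://localhost:8181"})
table = catalog.load_table("analytics.events")

# The latest committed snapshot defines exactly one version of the table.
snap = table.current_snapshot()
print("snapshot:", snap.snapshot_id, "committed at:", snap.timestamp_ms)

# The data files for that snapshot come from manifests, not from an S3 listing.
for task in table.scan().plan_files():
    print(task.file.file_path, task.file.record_count)
```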
Who Are “Writers” in a Data Lake?
A writer is not just “something that uploads files”.
A real writer:
- Writes data files (e.g., Parquet)
- Updates table metadata
- Atomically commits a new snapshot
Typical writers include:
- Apache Spark
- Apache Flink
- Streaming pipelines fed by Apache Kafka
If someone uploads files directly to S3 without updating metadata, they are bypassing the table format - and breaking consistency.
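As a concrete (and hedged) illustration, this is roughly what a Spark writer looks like in PySpark. The catalog name `lake`, the warehouse path, the table name, and the runtime package version are all assumptions; the important part is that the append writes Parquet files and commits a new Iceberg snapshot in one operation.

```python
from pyspark.sql import SparkSession

# Minimal sketch: a Spark session with an Iceberg catalog named "lake".
# Package version, catalog type, and warehouse path are placeholders.
spark = (
    SparkSession.builder
    .appName("iceberg-writer")
    .config("spark.jars.packages",
            "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.6.0")
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.type", "hadoop")
    .config("spark.sql.catalog.lake.warehouse", "s3a://my-bucket/warehouse")
    .getOrCreate()
)

df = spark.createDataFrame(
    [("2025-01-01", "device-1", 42.0)],
    ["event_date", "device_id", "value"],
)

# This single call writes Parquet data files AND atomically commits a new
# snapshot to the table's metadata - that is what makes it a real writer.
# (Assumes lake.analytics.events already exists.)
df.writeTo("lake.analytics.events").append()
```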
How Readers Use Iceberg Metadata
Readers do not guess which files to read.
A reader:
- Reads Iceberg metadata
- Finds the latest snapshot
- Gets an exact list of valid files
- Applies pruning using file statistics
- Reads only the required Parquet files
This makes queries:
- Correct
- Predictable
- Efficient
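Continuing the PyIceberg sketch from above (same placeholder catalog and table), a filtered read shows the reader side: the filter and column selection are checked against file-level statistics in the metadata, so only the relevant Parquet files are opened.

```python
from pyiceberg.catalog import load_catalog

catalog = load_catalog("demo", **{"type": "rest", "uri": "http://localhost:8181"})
table = catalog.load_table("analytics.events")

# The scan resolves the latest snapshot, prunes files using their statistics,
# and only then reads the remaining Parquet files into an Arrow table.
arrow_table = table.scan(
    row_filter="event_date >= '2025-01-01'",
    selected_fields=("event_date", "device_id", "value"),
).to_arrow()

print(arrow_table.num_rows)
```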
Where ClickHouse Fits In
ClickHouse is primarily a column-oriented analytical database.
It can play two roles:
1. Native database
- Owns its own storage (MergeTree)
- Ingests data directly
- Best performance
2. External reader / query engine
- Reads data from S3
- Can query Iceberg tables
- Relies on Iceberg metadata for correctness
This is why Iceberg works well in multi-engine environments:
- Spark writes data
- ClickHouse reads it
- No direct coordination required
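For example, ClickHouse ships an iceberg table function (exact names and options vary across versions), which can be called from the clickhouse-connect Python client. The host, bucket path, and credentials below are placeholders; treat this as a sketch, not a copy-paste recipe.

```python
import clickhouse_connect

# Placeholder connection to a local ClickHouse server.
client = clickhouse_connect.get_client(host="localhost", port=8123)

# The iceberg() table function reads Iceberg metadata to find the valid
# Parquet files; no MergeTree storage is involved on the ClickHouse side.
result = client.query(
    """
    SELECT count() AS events
    FROM iceberg(
        'https://my-bucket.s3.amazonaws.com/warehouse/analytics/events/',
        'ACCESS_KEY', 'SECRET_KEY'
    )
    """
)
print(result.result_rows)
```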
A Real-World Scenario
Imagine this pipeline:
- Edge devices produce events
- Events flow into Kafka
- Spark or Flink processes the stream
- Data is written as Parquet to S3
- Iceberg commits a new snapshot
- ClickHouse queries the table
Without Iceberg:
- ClickHouse might read half-written files
- Queries may mix old and new data
- Fixing bad data is risky
With Iceberg:
- Writers commit atomic snapshots
- Readers always see a consistent view
- Old versions remain accessible
- Multiple engines can safely coexist
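Sticking with the hedged Spark setup from earlier (the same hypothetical lake catalog, plus placeholder Kafka broker, topic, and table names), the streaming half of this pipeline can be sketched like this. Each micro-batch the sink commits becomes one atomic Iceberg snapshot that ClickHouse can then read.

```python
# Minimal Structured Streaming sketch: Kafka in, Iceberg out.
# "spark" is the session configured in the writer sketch above.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
    .option("subscribe", "edge-events")                  # placeholder topic
    .load()
)

query = (
    events.selectExpr("CAST(value AS STRING) AS raw_event", "timestamp")
    .writeStream
    .format("iceberg")                                   # Iceberg streaming sink
    .outputMode("append")
    .option("checkpointLocation", "s3a://my-bucket/checkpoints/events")
    .toTable("lake.analytics.raw_events")                # each micro-batch = one snapshot
)
```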
What About Delta Lake and Hudi?
Iceberg is not alone.
Other open table formats include:
- Delta Lake
- Apache Hudi
They all solve the same core problem:
Making object storage behave like a real table.
They differ mainly in philosophy:
- Iceberg → metadata-first, engine-neutral
- Delta Lake → strong Spark ecosystem integration
- Hudi → incremental and streaming-heavy use cases
Final Mental Model (Worth Remembering)
- Object storage → holds files
- Table formats → define tables and versions
- Writers → create data + metadata
- Readers → trust metadata, not folders
Or in one line:
Object storage stores data.
Table formats make it trustworthy.
Closing Thoughts
Apache Iceberg doesn’t make queries magically faster.
What it does is far more important:
It makes data lakes correct, consistent, and usable at scale.
That’s why engines like ClickHouse can safely query data lakes today - something that was extremely fragile just a few years ago.