If you've worked with Hive, you've felt these pains:
- "What did the data look like last Friday?" → Impossible to answer
- Rename a column → Every downstream query breaks
- Write and read the same table simultaneously → Inconsistency or lock contention
These aren't usage problems — they're fundamental architectural limits. Data lakes sit on object storage (S3/GCS/ADLS), and object storage has no transactions, no schema management, no version history.
Apache Iceberg exists to close these gaps.
What Is a Table Format?
A table format is a metadata layer on top of object storage. It tracks which files belong to a table, their schema, partition layout, and change history. It doesn't replace Parquet — it manages Parquet files.
Iceberg's three-layer metadata architecture:
┌────────────────────────────────────────────────────────┐
│ Iceberg Table Format                                   │
├────────────────────────────────────────────────────────┤
│ Layer 1: Catalog                                       │
│ └─ Pointer to current table state (Hive/Glue/Nessie)   │
├────────────────────────────────────────────────────────┤
│ Layer 2: Metadata Files                                │
│ └─ Schema, partition specs, snapshot history           │
├────────────────────────────────────────────────────────┤
│ Layer 3: Data Files                                    │
│ └─ Actual Parquet/ORC/Avro files + manifests           │
└────────────────────────────────────────────────────────┘
Every write operation creates a new snapshot; old snapshots are preserved. This is how time travel works.
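The snapshot mechanism can be sketched as a toy model. This is illustrative Python, not Iceberg's actual data structures: the point is only that each commit appends a new immutable snapshot while older snapshots stay readable.

```python
# Toy snapshot history (illustrative sketch, not Iceberg's real implementation).
import time

class Table:
    def __init__(self):
        self.snapshots = []  # full history, ordered oldest -> newest

    def commit(self, data_files):
        # Each snapshot records the complete file list as of that commit;
        # earlier snapshots are never mutated.
        parent = self.snapshots[-1]["files"] if self.snapshots else []
        self.snapshots.append({
            "snapshot_id": len(self.snapshots) + 1,
            "committed_at": time.time(),
            "files": parent + list(data_files),
        })

    def current(self):
        return self.snapshots[-1]

t = Table()
t.commit(["data-001.parquet"])
t.commit(["data-002.parquet"])
# Snapshot 1 still sees one file; snapshot 2 sees both.
```

Reading an old snapshot is then just reading an old entry of the history list, which is exactly what makes time travel cheap.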
Four Core Capabilities
1. ACID Transactions
MERGE INTO prod.orders t
USING staging.orders_delta s ON t.order_id = s.order_id
WHEN MATCHED AND s.status = 'cancelled' THEN UPDATE SET t.status = 'cancelled'
WHEN NOT MATCHED THEN INSERT *;
Implementation: write new Parquet files → write a manifest → atomically swap the snapshot pointer in the catalog via compare-and-swap (CAS). If the CAS fails (a concurrent writer committed first), the newly written files become orphans and are removed later by garbage collection. Full ACID without a lock manager.
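The commit protocol boils down to one conditional pointer swap. A minimal sketch, assuming a toy in-memory catalog (Iceberg's real protocol involves metadata files and retries, which are omitted here):

```python
# Toy optimistic-concurrency commit: the swap succeeds only if the catalog
# still points at the snapshot the writer started from.
class Catalog:
    def __init__(self):
        self.current_snapshot = 0

    def compare_and_swap(self, expected, new):
        if self.current_snapshot != expected:
            return False  # a concurrent writer won; this writer's files are orphans
        self.current_snapshot = new
        return True

cat = Catalog()
ok1 = cat.compare_and_swap(expected=0, new=1)  # first writer commits
ok2 = cat.compare_and_swap(expected=0, new=2)  # stale writer is rejected
```

In practice a rejected writer refreshes the table state and retries the commit; only the cheap metadata swap is serialized, never the data writes themselves.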
2. Time Travel
-- Query data as of 3 days ago
SELECT * FROM prod.orders TIMESTAMP AS OF '2026-03-20 00:00:00';
-- Query a specific snapshot
SELECT * FROM prod.orders VERSION AS OF 8765432109;
-- View snapshot history
SELECT * FROM prod.orders.history;
Use cases: debugging data quality issues, rollback after bad writes, comparing current vs historical data.
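How `TIMESTAMP AS OF` resolves to a snapshot can be sketched as a scan over the ordered history: take the last snapshot committed at or before the requested time. The snapshot IDs and timestamps below are made up for illustration:

```python
# Sketch of TIMESTAMP AS OF resolution over a table's snapshot history.
history = [
    {"snapshot_id": 101, "committed_at": "2026-03-18T00:00:00"},
    {"snapshot_id": 102, "committed_at": "2026-03-19T12:00:00"},
    {"snapshot_id": 103, "committed_at": "2026-03-21T08:00:00"},
]

def snapshot_as_of(history, ts):
    # ISO-8601 strings of equal length compare correctly as strings.
    candidates = [s for s in history if s["committed_at"] <= ts]
    if not candidates:
        raise ValueError("no snapshot at or before " + ts)
    return candidates[-1]["snapshot_id"]

snapshot_as_of(history, "2026-03-20T00:00:00")  # resolves to 102
```

Once the snapshot is picked, the engine simply reads the file list that snapshot recorded; no data is reconstructed or replayed.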
3. Schema Evolution Without Breaking Downstream
ALTER TABLE prod.orders ADD COLUMN discount DECIMAL(5,2); -- Safe
ALTER TABLE prod.orders RENAME COLUMN price TO sale_price; -- Safe
ALTER TABLE prod.orders DROP COLUMN legacy_field; -- Safe
ALTER TABLE prod.orders ALTER COLUMN amount TYPE BIGINT; -- Safe (widening)
ALTER TABLE prod.orders ALTER COLUMN amount TYPE INT; -- REJECTED (narrowing)
The key: Iceberg tracks columns by field ID, not name. Renaming price to sale_price doesn't change the ID of that column in existing Parquet files — Iceberg maps "field 42 = sale_price" transparently. Downstream queries are unaffected.
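The field-ID indirection can be shown with a toy model (illustrative dictionaries, not Iceberg's real schema objects): data files reference stable IDs, and only the name-to-ID mapping changes on rename.

```python
# Sketch of field-ID-based column resolution across a rename.
schema = {"order_id": 1, "price": 2}   # current name -> stable field ID
parquet_row = {1: "ord-9", 2: 19.99}   # data files key values by field ID only

def read_column(row, schema, name):
    return row[schema[name]]

before = read_column(parquet_row, schema, "price")      # reads field 2

# The rename rewrites only the schema mapping; data files are untouched.
schema = {"order_id": 1, "sale_price": 2}
after = read_column(parquet_row, schema, "sale_price")  # still reads field 2
```

The same indirection is why dropping and re-adding a column with the same name does not resurrect old data: the re-added column gets a fresh field ID.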
4. Partition Evolution Without Data Rewrite
-- Originally partitioned by day
CREATE TABLE prod.events (...)
PARTITIONED BY (days(event_time));
-- Data grew; switch to hourly partitioning — NO data rewrite needed!
ALTER TABLE prod.events
REPLACE PARTITION FIELD days(event_time) WITH hours(event_time);
New writes go into hourly partitions; historical data stays in daily partitions. The query engine understands both partition layouts simultaneously.
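A simplified sketch of how two partition specs coexist: each file is pruned using the spec it was written with, so daily-partitioned history and hourly-partitioned new data can serve the same scan. File names and partition values are made up:

```python
# Sketch of multi-spec partition pruning (simplified; real Iceberg stores a
# spec ID per data file and evaluates transforms from the table metadata).
def day_of(ts):  return ts[:10]   # '2026-03-20'
def hour_of(ts): return ts[:13]   # '2026-03-20T14'

files = [
    {"path": "old-1.parquet", "spec": "daily",  "partition": "2026-03-19"},
    {"path": "new-1.parquet", "spec": "hourly", "partition": "2026-03-20T14"},
]

def matches(file, ts):
    # Prune each file with the transform of the spec it was written under.
    if file["spec"] == "daily":
        return file["partition"] == day_of(ts)
    return file["partition"] == hour_of(ts)

scan = [f["path"] for f in files if matches(f, "2026-03-20T14:05:00")]
# Only 'new-1.parquet' survives pruning for this point-lookup timestamp.
```

Because pruning happens per file against its own spec, neither spec change nor historical data ever forces a rewrite.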
Why Everyone Is Betting on Iceberg
Snowflake supports Iceberg Tables natively: Snowflake-managed tables stored in the open Iceberg format, readable by Spark/Flink/Trino without export.
Databricks has Delta Lake (its own table format), but its Universal Format (UniForm) feature writes Iceberg-compatible metadata alongside Delta, and the company acquired Tabular, founded by Iceberg's original creators.
AWS provides first-class Iceberg support in Glue, Athena, and EMR.
Three reasons driving this convergence:
- True multi-engine interoperability: Write in Snowflake, read in Spark, stream in Flink — no format conversion, no data copy
- Open format = no vendor lock-in: Your data lives in your S3 as Parquet; vendors provide compute engines, not data custody
- Database-grade features on cheap storage: ACID, time travel, schema evolution — previously warehouse-only features — are now available on S3
Migration Paths
New tables: Just use Iceberg from the start
CREATE TABLE catalog.db.new_table USING iceberg AS SELECT * FROM source;
Migrate existing Hive/Parquet tables (in-place, no data copy):
spark.sql("CALL catalog.system.migrate('db.legacy_table')")
Safe shadow migration: Create Iceberg table → bulk load history → dual-write → validate consistency → cut over reads → decommission old table.
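The "validate consistency" step of a shadow migration can be sketched as comparing row counts plus an order-independent fingerprint of both tables. The tables below are hypothetical two-row samples; a real check would run over keyed exports of both tables:

```python
# Sketch of a consistency check between a legacy table and its Iceberg shadow.
import hashlib

def table_fingerprint(rows):
    count, digest = 0, 0
    for row in rows:
        h = hashlib.sha256(repr(sorted(row.items())).encode()).hexdigest()
        digest ^= int(h, 16)  # XOR makes the fingerprint order-independent
        count += 1
    return count, digest

legacy = [{"order_id": 1, "status": "paid"}, {"order_id": 2, "status": "new"}]
shadow = [{"order_id": 2, "status": "new"}, {"order_id": 1, "status": "paid"}]

consistent = table_fingerprint(legacy) == table_fingerprint(shadow)  # True
```

Only after this check passes (ideally over several dual-write cycles) should reads cut over to the Iceberg table.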
Conclusion: The Data Lake's Operating System
Iceberg is not a query engine, and it is not a file format like Parquet. It is the table format layer for data lakes, playing the role of file system plus transaction manager on top of object storage.
With Iceberg, data lakes finally get what they've always lacked:
- ✅ Database-grade transactions and consistency
- ✅ Any-point-in-time data history
- ✅ Painless schema evolution
- ✅ Multi-engine interoperability, zero vendor lock-in
For data platform teams, Iceberg is becoming table stakes (pun intended). If your customers are still running bare Hive+Parquet data lakes, an Iceberg migration proposal will be a genuinely valuable technical upgrade conversation to have.
References: Apache Iceberg Official Docs | Snowflake Iceberg Tables | Databricks Managed Iceberg | AWS Glue Iceberg Support