We set out to build an open lakehouse: Iceberg tables on AWS S3, Spark/Glue for pipelines, Snowflake for analytics compute power. What could go wrong? Everything—from query performance implosions to uncloneable dynamic tables. Here's the unfiltered journey, including why we pivoted to Snowflake-managed Iceberg via Horizon Catalog and abandoned automated dynamic tables for explicit, observable incremental processing.
TL;DR
- ❌ Phase 1 failed: Glue-generated Iceberg files (32-64MB, far below Snowflake's sweet spot) caused 5-10x slower Snowflake queries
- ✅ Phase 2 wins: Snowflake-managed Iceberg auto-compacts to 256-512MB, 2-5x faster, ~35% cost savings
- ⚠️ Dynamic Tables gotcha: Cannot clone, opaque refresh timing—unusable for production DevOps
- ✅ Solution: Explicit Streams + Tasks on log tables—boring, debuggable, production-ready
- 🎯 Key decision: Data team owns Bronze→Silver only; business owns Gold (saved endless remodeling debates)
- 💰 Reality check: "Cloud-neutral" = readable across engines, not free migration
Our Team Context (Yours Will Differ)
Before diving into architecture decisions, here's who we are:
- Skills: Strong SQL/dbt, limited Spark/Scala experience
- Priorities: Ship fast, avoid operational black boxes
- Constraint: No dedicated DevOps for Glue cluster tuning
This shaped every decision below. A team fluent in Spark would have made different trade-offs.
🎯 The Goal
Unify data from SAP HANA (change data capture), Salesforce Data Cloud, and raw event streams into a single cloud-neutral lakehouse—no proprietary lock-in, full cross-tool interoperability.
Why Iceberg?
We evaluated Delta Lake, Apache Hudi, and Apache Iceberg a couple of years ago. Here's the comprehensive comparison that drove our decision:
| Criterion | Iceberg | Delta Lake | Hudi |
|---|---|---|---|
| Partition Evolution | ✅ Change without rewrite | ❌ Requires full table rewrite | ❌ Not supported |
| Primary Keys | ✅ Native support | ❌ Not supported | ✅ Supported |
| Automated Compaction | ✅ MAINTAIN ICEBERG TABLE | ⚠️ Manual (OPTIMIZE tuning) | ⚠️ Manual/semi-auto |
| Schema Evolution | ✅ Full (add/drop/rename/reorder) | ✅ Add/drop columns | ⚠️ Limited |
| Engine Compatibility | ✅ Spark, Trino, Flink, Snowflake, Dremio | ⚠️ Spark-first, limited others | ⚠️ Spark-first |
| Platform Support | ✅ AWS Glue/Athena, Azure, GCP, Snowflake | ✅ AWS, Azure (Fabric native), Databricks | ⚠️ AWS, limited Azure |
| File Format Flexibility | ✅ Parquet, ORC, Avro | ⚠️ Parquet only | ✅ Parquet, ORC, Avro |
| Community Governance | ✅ Apache Foundation (vendor-neutral) | ⚠️ Databricks-controlled | ✅ Apache Foundation |
Why We Chose Iceberg (and Ruled Out Delta/Hudi)
Iceberg's winning factors for our context:
Partition evolution without rewrites: Our SAP data's partitioning strategy evolved over time (daily → monthly as data matured). Iceberg lets us change partition specs without rewriting billions of rows. Delta requires full table rewrite—a multi-day, multi-TB operation we couldn't afford.
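To make the "no rewrite" point concrete, here is roughly what that spec change looks like in Spark SQL with the Iceberg extensions enabled (catalog, table, and column names are illustrative, not our production ones):

```sql
-- Swap daily partitioning for monthly on an Iceberg table.
-- Only the partition spec metadata changes; existing data files stay
-- in place, and new writes use the new layout.
ALTER TABLE glue_catalog.silver.vbak
  REPLACE PARTITION FIELD days(created_at) WITH months(created_at);
```

Because old files keep the old spec, queries spanning both layouts still prune correctly, which is exactly what Delta's rewrite requirement takes away.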
Snowflake-managed support: Only Iceberg offers native managed tables in Snowflake Horizon Catalog with automatic compaction. Delta/Hudi would lock us into external table limitations with the Phase 1 performance issues.
Automated compaction: Snowflake's MAINTAIN ICEBERG TABLE handles file optimization automatically. Delta requires manual compaction tuning in Spark, expertise our SQL-first team doesn't have.
Primary key enforcement: Iceberg supports primary keys natively, critical for our SAP source data integrity (customer IDs, order numbers). Delta lacks this; you must enforce it in application logic.
Vendor-neutral governance: Apache Foundation stewardship means no single vendor controls the spec. Delta's governance is tied to Databricks' business interests.
Bottom line: Iceberg was the safe bet for multi-engine flexibility without vendor lock-in. If you're Azure-only with Fabric, Delta is pragmatic. If you're Databricks-native, Delta is the path of least resistance. But for AWS + Snowflake + future optionality, Iceberg was the only choice.
Why Snowflake for Compute?
Three differentiators that closed the decision:
1. Separation of Storage & Compute
Scale workloads independently to meet business demands while enabling detailed chargeback per team or domain—without disrupting other workloads. → Deep dive: Understanding Snowflake Virtual Warehouses
2. Automatic Caching + Smart Pruning (RELY Operators)
Sub-second query performance on petabyte-scale data through intelligent result/metadata caching and constraint-based optimization. → Deep dive: RELY Constraint for Star-Schema Queries
3. Compute Billing Precision
POC result: Our thousands of BI dashboards were charged only for computation time, not for data transfer to BI tools. Estimated 40–60% cost savings vs. competitors billing full query duration.
🏛️ Medallion Architecture
Our lakehouse follows the classic three-layer medallion model, tailored for SAP source systems:
┌─────────────────────────────────────────────────────────────┐
│ DATA SOURCES │
│ SAP HANA (CDC) │ Salesforce Data Cloud │ Raw Streams │
└────────────┬──────────────────┬─────────────────────┬───────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────────────┐
│ 🥉 BRONZE LAYER — AWS Glue Spark → Iceberg writes to S3 │
│ Raw ingestion, no transformation, full fidelity │
└─────────────────────────┬───────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ 🥈 SILVER LAYER — Iceberg Tables (AWS-hosted, S3) │
│ Matches source table structures │
│ e.g., KNA1 (Customers), MARA (Materials), VBAK (Orders) │
└─────────────────────────┬───────────────────────────────────┘
│ Stream on log_table
▼
┌─────────────────────────────────────────────────────────────┐
│ 🥇 GOLD LAYER — Snowflake-Managed Iceberg (Horizon Cat.) │
│ Proper dimensions & facts with business names │
│ e.g., DIM_CUSTOMER, FACT_SALES_ORDER, DIM_PRODUCT │
└─────────────────────────┬───────────────────────────────────┘
│
▼
BI Tools / ML / Fabric Export
Sounds ideal? Early reality: Snowflake choked on our Glue-generated files.
The Architecture Decision Nobody Talks About: Layer Ownership
Here's what we learned the hard way: Don't own Gold if you don't have to.
Why this boundary matters:
- Silver = Source truth: Matches SAP table structures (KNA1, MARA, VBAK). Data engineering controls quality, structure, and change tracking.
- Gold = Business semantics: DIM_CUSTOMER, FACT_SALES_ORDER. Business teams decide how to model, aggregate, and interpret.
- Boundary = Contract: Silver provides clean, change-tracked source data with explicit SLAs; business teams own downstream transformations.
This Decision Saved Us From:
❌ Endless "why did the customer count change?" debates (business definition shifts, not data quality issues)
❌ Remodeling dimensions every quarter when business logic evolves
❌ Being the bottleneck for every dashboard request
❌ Owning interpretations of business rules we don't fully understand
Your team may differ, but define the ownership boundary early or you'll own every downstream interpretation forever. The tools (dbt, Iceberg, Snowflake) don't enforce this—you must.
Phase 1: Glue-Managed Iceberg + Snowflake External Tables ❌
The Performance Trap
We started simple: Spark jobs in Glue created Iceberg tables stored in Glue Catalog; Snowflake linked them as external Iceberg tables.
What Broke: File Size Mismatch
Snowflake's Iceberg scanner is optimized for specific file characteristics per official recommendations:
| Parameter | Snowflake Recommendation | What Glue Produced |
|---|---|---|
| File size | 256 – 512 MB | 32 – 64 MB (many small files) |
| Row group size | 16 – 256 MB | < 16 MB |
| Row groups per file | Multiple (for parallelism) | Often 1 |
The result: Full table scans instead of pruned reads, query times 5–10x slower than expected.
┌──────────────────────────────────────────────────────┐
│ PHASE 1 ARCHITECTURE │
│ │
│ AWS Glue Spark ──writes──▶ Iceberg (Glue Catalog) │
│ │ │
│ S3 Parquet files │
│ (small, fragmented) │
│ │ │
│ Snowflake ◀──external table──────┘ │
│ (slow scans, no auto-compaction) │
└──────────────────────────────────────────────────────┘
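Fixing this on the Glue side would have meant tuning Iceberg write properties and running compaction ourselves. A rough sketch of what that entails, run from Spark (catalog and table names are illustrative):

```sql
-- Aim new writes at Snowflake-friendly file and row-group sizes
ALTER TABLE glue_catalog.silver.vbak SET TBLPROPERTIES (
  'write.target-file-size-bytes'       = '536870912',  -- ~512 MB data files
  'write.parquet.row-group-size-bytes' = '134217728'   -- ~128 MB row groups
);

-- Periodically compact the small files already written
CALL glue_catalog.system.rewrite_data_files(table => 'silver.vbak');
```

None of this is exotic, but it is an ongoing operational job with monitoring and scheduling attached, which is the real cost behind Option A below.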
🔀 Decision Point: Invest in Spark Tuning or Snowflake-Managed?
Option A: Hire Spark expertise, tune file compaction settings
- Timeline: 3-6 months
- Ongoing cost: Maintain Spark expertise, monitor file sizes
- Risk: Our team lacks Spark internals experience
Option B: Snowflake-managed Iceberg via Horizon Catalog
- Timeline: 2 weeks
- Ongoing cost: Snowflake MAINTAIN ICEBERG TABLE compute
- Upside: Handles compaction automatically, team stays in SQL/dbt comfort zone
We chose Option B: Our team's strength is SQL/dbt, not Spark internals. Let Snowflake handle the file lifecycle.
Phase 2: Snowflake-Managed Iceberg + Horizon Catalog ✅
The Pivot
Snowflake-managed Iceberg tables put Snowflake in charge of the table lifecycle on your S3 bucket—Horizon Catalog governs metadata, access, and interoperability.
┌──────────────────────────────────────────────────────────┐
│ PHASE 2 ARCHITECTURE │
│ │
│ AWS Glue Spark ──writes──▶ Snowflake Horizon Catalog │
│ │ │
│ Horizon manages Iceberg metadata │
│ Auto-compaction to 256-512MB files │
│ │ │
│ S3 (your bucket) │
│ Optimal Parquet layout │
│ │ │
│ Snowflake ◀──native read─────────┘ │
│ (2–5x faster, full pruning, cloneable*) │
│ │
│ BI Tools ◀── Snowflake compute │
│ Glue/Spark ◀── Iceberg open format (bidirectional) │
└──────────────────────────────────────────────────────────┘
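For reference, moving a table under Snowflake management is a single DDL statement. A minimal sketch (the external volume, base location, and column list are illustrative):

```sql
-- CATALOG = 'SNOWFLAKE' makes this a Snowflake-managed Iceberg table:
-- Horizon owns the metadata, data lands in your own S3 bucket via the
-- external volume, and compaction is handled for you.
CREATE ICEBERG TABLE gold.dim_customer (
  customer_id   NUMBER,
  customer_name STRING
)
  CATALOG         = 'SNOWFLAKE'
  EXTERNAL_VOLUME = 'lakehouse_s3_vol'
  BASE_LOCATION   = 'gold/dim_customer';
```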
Wins
✅ 2–5x query speedup vs. Phase 1 external tables
✅ Automatic compaction to optimal file sizes via MAINTAIN ICEBERG TABLE
✅ Bidirectional access: Glue/Spark can write, Snowflake reads natively; BI tools use Snowflake compute
✅ Open format preserved: Gold layer exportable to Fabric/Polaris later
Trade-off: Snowflake compute is billed for maintenance runs—but total ops cost is lower than Phase 1's slow queries burning warehouse credits.
⚠️ Why We Abandoned Dynamic Tables for Production
Dynamic Tables on Iceberg sounded perfect—zero-code pipelines, automatic refresh. We hit two walls that forced us back to explicit patterns:
Problem 1 — No Cloning Support:
-- This FAILS silently on dynamic Iceberg tables
CREATE DATABASE prod_clone CLONE prod_db;
-- Dynamic Iceberg tables are simply skipped in the clone
DB/schema clones skip dynamic Iceberg tables entirely. This is a DevOps killer—no dev/test environment parity, no blue-green deploys.
Problem 2 — Opaque Refresh Timing:
Incremental refresh latency was unpredictable and nearly impossible to debug for SLA enforcement. Monitoring refresh lag and debugging failures was guesswork with no visibility into what triggered refreshes or why they were delayed.
🔀 Trade-off: Zero-Code vs. Zero-Surprise
Dynamic Tables promise automation but hide:
- When refreshes actually run
- What triggered the refresh
- How to debug failures in production at 3am
Streams + Tasks = More code, but debuggable in production.
Architecture decision-makers: Optimize for production support, not dev convenience.
Resolution: Abandoned dynamic tables for Silver → Gold. Back to explicit Streams + Tasks.
Our Solution: Log Table Pattern
┌─────────────────────────────────────────────────────┐
│ SILVER → GOLD PIPELINE (Current) │
│ │
│ Silver Layer │
│ │ │
│ ├──▶ silver.log_table (tracks all changes) │
│ │ │
│ ▼ │
│ CREATE STREAM log_changes │
│ ON TABLE silver.log_table; ◀── stream │
│ │ │
│ ▼ │
│ CREATE TASK gold_refresh ◀── task │
│ SCHEDULE = '5 MINUTE' │
│ AS MERGE INTO gold.customer_360 │
│ USING log_changes ...; │
│ │ │
│ ▼ │
│ Gold Layer (Snowflake-Managed Iceberg) │
└─────────────────────────────────────────────────────┘
-- The explicit, cloneable, debuggable pattern
CREATE STREAM log_changes ON TABLE silver.log_table;
CREATE TASK gold_refresh
WAREHOUSE = compute_xs
SCHEDULE = '5 MINUTE'
WHEN SYSTEM$STREAM_HAS_DATA('log_changes')
AS
MERGE INTO gold.customer_360 tgt
USING (
SELECT * FROM log_changes WHERE metadata$action = 'INSERT'
) src
ON tgt.customer_id = src.customer_id
WHEN MATCHED THEN UPDATE SET ...
WHEN NOT MATCHED THEN INSERT ...;
Why This Wins:
✅ Full cloning support (dev/prod parity restored)
✅ Transparent costs — every execution logged in TASK_HISTORY
✅ Debuggable — stream offset visible, failures isolated
✅ Fabric-friendly — Gold can be exported as Iceberg/Parquet later
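The "transparent" and "debuggable" claims cash out as plain queries. These are standard Snowflake observability calls, shown here against the task and stream names from the example above:

```sql
-- Recent runs of the gold refresh task: state, timing, and errors
SELECT name, state, scheduled_time, completed_time, error_message
FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY(
  TASK_NAME    => 'GOLD_REFRESH',
  RESULT_LIMIT => 20));

-- Is the stream currently holding unconsumed changes?
SELECT SYSTEM$STREAM_HAS_DATA('LOG_CHANGES');
```

This is the 3am difference: with Dynamic Tables we had nothing equivalent to point a pager runbook at.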
🌐 Cross-Platform Reality: Why We Can (and Can't) Pivot
Current State: AWS + Snowflake, Iceberg on S3
✅ Open format preserved: Can read Iceberg from Spark, Trino, Athena
✅ Gold exportable: Stream to Parquet → Fabric mirroring works
⚠️ Fabric constraint: OneLake wants Delta + same Azure region for zero-copy
⚠️ Horizon lock-in: Snowflake-managed Iceberg metadata tied to Horizon Catalog
What "Cloud-Neutral" Actually Means
Not: "Deploy anywhere tomorrow with zero effort"
Actually: "Readable by multiple engines, movable with effort"
If We Had to Migrate to Fabric:
| Layer | Migration Effort | Estimated Timeline |
|---|---|---|
| Bronze → Silver ingestion | Rewrite Glue jobs to Delta | 2-3 weeks |
| Silver → Gold dbt models | Port to Fabric SQL (syntax diffs) | 1-2 months |
| Gold Iceberg tables | Export as Parquet, re-create in Fabric Warehouse | 1-2 weeks |
┌────────────────────────────────────────────────────────────┐
│ CATALOG LANDSCAPE │
│ │
│ Our Setup: AWS us-east-1 │
│ ┌────────────────────────────────────────────────────┐ │
│ │ Snowflake Horizon Catalog (current) │ │
│ │ + AWS Glue Catalog (bronze/silver ingestion) │ │
│ └────────────────────────────────────────────────────┘ │
│ │
│ Future Options: │
│ ┌─────────────────┐ ┌──────────────────────────────┐ │
│ │ Polaris Catalog │ │ Microsoft Fabric OneLake │ │
│ │ (Snowflake SaaS)│ │ (wants Delta; Azure-region │ │
│ │ Maturing fast │ │ only for zero-copy) │ │
│ └─────────────────┘ └──────────────────────────────┘ │
│ │
│ Gold Streams → Parquet export → Fabric compatible ✅ │
└────────────────────────────────────────────────────────────┘
Our Decision:
We prioritized Snowflake ecosystem depth over day-1 multi-cloud portability.
Iceberg gave us an exit path, not a free exit. For teams needing Fabric OneLake zero-copy, starting with Delta on Azure is the pragmatic choice.
💰 Cost Reality Check
Phase 1 (Glue + External Iceberg):
- Glue: compute costs to write and maintain Iceberg tables
- Snowflake query costs: 3x higher due to full scans
Phase 2 (Snowflake-managed Iceberg):
- MAINTAIN ICEBERG TABLE: in warehouse credits
- Query costs: 60% reduction (pruning + caching works correctly)
- Net savings: 35% monthly vs. Phase 1
Takeaway: Optimize for query performance where your users actually spend time, not just ingestion costs.
🔑 Lessons for Your Lakehouse
1. File sizes first
Target 256-512MB Parquet files, or let Snowflake's MAINTAIN ICEBERG TABLE handle it automatically. Small files kill Snowflake performance.
2. Clone-test early
Dynamic Iceberg tables are powerful but not clone-safe for databases/schemas. Test your DevOps workflow before committing to production.
3. Horizon bidirectional is a game-changer
If your team uses both Spark and Snowflake, Horizon gives you the best of both without choosing sides. Write with Spark, read with Snowflake—all on the same Iceberg tables.
4. Streams > automation black boxes
Explicit Streams + Tasks always win in production ops: observable at 3am, debuggable from logs, cloneable for dev/test.
5. Phase your migration
Don't try to migrate all layers at once. Start with Gold (highest query frequency), validate performance, then move Silver.
6. Define ownership boundaries early
Bronze→Silver→Gold isn't just technical layers—it's organizational boundaries. Decide who owns what before the first production table.
When to Use What: Decision Matrix
| Your Situation | Recommendation | Why |
|---|---|---|
| Team strong in Spark, need multi-engine | Glue-managed Iceberg + External tables | Keep expertise where it is |
| Team SQL-first, Snowflake primary engine | Snowflake-managed Iceberg | Let Snowflake handle file lifecycle |
| Must support Fabric OneLake zero-copy | Delta Lake on Azure | Iceberg works but not zero-copy on Fabric |
| Need dev/prod clones + SLA guarantees | Streams + Tasks (avoid Dynamic Tables) | Observable, debuggable |
| Bronze/Silver only (like us) | dbt incremental + Developer Toolkit | Explicit watermark control |
Bottom line: Cloud-neutral means readable across tools, not free migration. Choose the platform that matches your team's strengths, use open formats for portability insurance.
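For the last row of the matrix, a dbt incremental model with an explicit watermark looks roughly like this (model, source, and column names are illustrative, not from our pipeline):

```sql
-- models/silver/vbak.sql
{{ config(materialized='incremental', unique_key='vbeln') }}

select *
from {{ source('bronze', 'vbak_raw') }}
{% if is_incremental() %}
  -- explicit watermark: load only rows newer than what's already in the table
  where _ingested_at > (select max(_ingested_at) from {{ this }})
{% endif %}
```

The watermark predicate is visible in the compiled SQL, which is the same "no black box" property we wanted from Streams + Tasks.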
What's Next for Us
- Investigate Polaris Catalog, or wait for Horizon Catalog to mature, for true multi-vendor Iceberg metadata management
- Evaluate cost/performance of streaming directly from Kafka → Snowflake Iceberg
- Explore Iceberg v3 features for enhanced BCDR and CDC capabilities
🆕 Update: Iceberg v3 Features (March 2026) That Could Change Your Decision
Since our original evaluation, Snowflake released Apache Iceberg v3 support in public preview (March 2026) with capabilities that address several of our pain points and unlock new architectural patterns.
What's New in Iceberg v3
1. Cross-Region Replication for Snowflake-Managed Tables
- What it enables: BCDR failover and replication groups for Iceberg tables across regions
- Why it matters: Previously, our DR strategy required complex Parquet exports. Now, Snowflake-managed Iceberg tables can replicate with full consistency (including row lineage and deletion vectors)
- Architecture impact: We can now deploy active-passive DR without custom tooling
- 📖 Replication Config Docs
2. Catalog-Linked Databases
- What it enables: Connect to remote Iceberg catalogs (AWS Glue, Polaris, etc.) with automatic namespace discovery and read/write support
- Why it matters: Our Phase 1 external table limitations are eliminated—we can now write back to Glue Catalog-managed tables from Snowflake
- Architecture impact: True bidirectional catalog federation; could unblock our "Glue as ingestion, Snowflake as analytics" hybrid
- 📖 CREATE Catalog-Linked Database Docs
Which Features Would Have Changed Our Decision?
Would have stayed with our choice (Iceberg + Horizon):
- ✅ Cross-region replication validates our bet on Snowflake-managed Iceberg (Delta still doesn't have this)
- ✅ Catalog-linked databases eliminate the Phase 1 external table pain without abandoning Glue ingestion
Bottom Line: Iceberg v3 Strengthens the "Safe Bet"
For SQL-first teams on Snowflake, Iceberg v3 eliminates the last major operational friction points we encountered. The combination of Horizon Catalog + v3 features delivers the "cloud-neutral with vendor optimization" balance we were seeking.
What's your biggest Iceberg-on-Snowflake headache? Drop it below 👇