
Thesius Code

Posted on • Originally published at datanest-stores.pages.dev

Spark Performance Masterclass: Delta Lake Optimization Cheatsheet


Quick reference for every Delta Lake performance optimization technique.
Covers OPTIMIZE, VACUUM, table properties, MERGE patterns, data skipping,
partitioning strategies, liquid clustering, and file sizing.


Table of Contents

  1. OPTIMIZE
  2. VACUUM
  3. Z-ORDER
  4. Liquid Clustering
  5. Table Properties
  6. MERGE Optimization
  7. Data Skipping
  8. Partitioning Strategy
  9. File Size Guidelines
  10. Schema Evolution
  11. Time Travel and Restore
  12. Maintenance Schedule

OPTIMIZE

Compacts small files into larger ones for better read performance.

Basic Usage

-- Optimize entire table
OPTIMIZE my_database.my_table;

-- Optimize specific partitions only (much faster)
OPTIMIZE my_database.my_table
WHERE date >= '2025-01-01' AND date < '2025-02-01';

What OPTIMIZE Does

Before OPTIMIZE:
  partition=2025-01-15/
    ├── part-00000.parquet   (2 MiB)
    ├── part-00001.parquet   (1 MiB)
    ├── part-00002.parquet   (500 KiB)
    ├── part-00003.parquet   (3 MiB)
    ├── ...
    └── part-00099.parquet   (800 KiB)
    Total: 100 files, 150 MiB

After OPTIMIZE:
  partition=2025-01-15/
    ├── part-00000.parquet   (128 MiB)
    └── part-00001.parquet   (22 MiB)
    Total: 2 files, 150 MiB
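The before/after file counts above follow from simple arithmetic: total bytes divided into bins of the target file size. A minimal plain-Python sketch of that bin-packing (illustrative only; `compacted_file_sizes` is a hypothetical helper, not a Delta API):

```python
import math

MiB = 1024 ** 2

def compacted_file_sizes(total_bytes: int, target_bytes: int) -> list[int]:
    """Split total_bytes into files of at most target_bytes,
    roughly as OPTIMIZE's bin-packing compaction does."""
    full_files, remainder = divmod(total_bytes, target_bytes)
    sizes = [target_bytes] * full_files
    if remainder:
        sizes.append(remainder)
    return sizes

# 100 small files totalling 150 MiB, compacted at a 128 MiB target:
sizes = compacted_file_sizes(150 * MiB, 128 * MiB)
print([s // MiB for s in sizes])  # → [128, 22]
```

This matches the diagram: 150 MiB at a 128 MiB target yields one full file and one 22 MiB remainder.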

OPTIMIZE with Z-ORDER

-- Optimize and co-locate data by column values
OPTIMIZE my_database.my_table
ZORDER BY (customer_id, product_id);

-- Z-ORDER on specific partitions
OPTIMIZE my_database.my_table
WHERE date = '2025-01-15'
ZORDER BY (customer_id);

OPTIMIZE Metrics

OPTIMIZE returns these metrics:
  numFilesAdded:     2         (new compacted files)
  numFilesRemoved:   100       (old small files)
  filesAdded.min:    22 MiB
  filesAdded.max:    128 MiB
  filesAdded.avg:    75 MiB
  numBatches:        1
  totalConsideredFiles: 100
  totalFilesSkipped: 0

Key Settings

| Setting | Default | Recommended | Notes |
| --- | --- | --- | --- |
| Target file size | 1 GiB (OSS) / varies (Databricks) | 128-256 MiB | Set via table property |
| spark.databricks.delta.optimizeWrite.enabled | false | true | Auto-size files at write |
| spark.databricks.delta.autoCompact.enabled | false | true | Auto-compact after writes |
-- Set target file size per table
ALTER TABLE my_table SET TBLPROPERTIES ('delta.targetFileSize' = '128m');

VACUUM

Removes files no longer referenced by the Delta log. Essential for storage cost control.

Basic Usage

-- Remove files older than default retention (7 days)
VACUUM my_database.my_table;

-- Remove files older than 30 days
VACUUM my_database.my_table RETAIN 720 HOURS;

-- Dry run — see what would be deleted without deleting
VACUUM my_database.my_table DRY RUN;
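`RETAIN` takes hours, so a retention window expressed in days has to be converted (30 days = 720 hours). A small sketch that builds the statement from a day count (`vacuum_statement` is a hypothetical helper for illustration; it assumes the table name is already trusted and does no quoting):

```python
def vacuum_statement(table: str, retention_days: int, dry_run: bool = False) -> str:
    """Build a VACUUM statement with the retention window given in days."""
    hours = retention_days * 24
    stmt = f"VACUUM {table} RETAIN {hours} HOURS"
    if dry_run:
        stmt += " DRY RUN"
    return stmt

print(vacuum_statement("my_database.my_table", 30))
# → VACUUM my_database.my_table RETAIN 720 HOURS
```

The string would then be passed to `spark.sql(...)` as in the examples above.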

Safety Rules

| Rule | Details |
| --- | --- |
| Minimum retention | 7 days by default (controlled by delta.deletedFileRetentionDuration) |
| Never vacuum below retention | Active readers/writers could fail if their referenced files are deleted |
| VACUUM is idempotent | Safe to run multiple times |
| VACUUM only deletes unreferenced files | Never deletes current table data |

Override Minimum Retention (Dangerous!)

-- Only if you're CERTAIN no jobs are reading old versions
SET spark.databricks.delta.retentionDurationCheck.enabled = false;
VACUUM my_table RETAIN 0 HOURS;  -- Deletes everything not in current version

Storage Savings Estimate

# Check table size before/after VACUUM
from delta.tables import DeltaTable

dt = DeltaTable.forName(spark, "my_database.my_table")
detail = dt.detail().collect()[0]
print(f"Files: {detail['numFiles']}")
print(f"Size: {detail['sizeInBytes'] / (1024**3):.2f} GiB")

# Run VACUUM DRY RUN to see what would be cleaned
spark.sql("VACUUM my_database.my_table DRY RUN")

Z-ORDER

Multi-dimensional clustering that co-locates related data in the same files
for faster data skipping.

When to Z-ORDER

| Scenario | Z-ORDER? | Why |
| --- | --- | --- |
| Column frequently in WHERE clauses | Yes | Dramatically reduces files scanned |
| Column used in JOIN keys | Yes | Improves join-side scan performance |
| High-cardinality column (e.g., user_id) | Yes | Great data skipping potential |
| Low-cardinality column (e.g., status) | No | Partitioning is better |
| Column never filtered on | No | Z-ORDER won't help queries that don't filter |

Best Practices

-- Z-ORDER on 1-3 columns (more columns = less effective per column)
OPTIMIZE my_table ZORDER BY (customer_id);                    -- Best for single-column filters
OPTIMIZE my_table ZORDER BY (customer_id, product_id);        -- Good for two-column filters
OPTIMIZE my_table ZORDER BY (customer_id, product_id, date);  -- Diminishing returns on 3rd column

Z-ORDER vs Partitioning

                    Low Cardinality        High Cardinality
                    (< 1000 values)        (> 1000 values)
                    ─────────────────      ─────────────────
Frequently filtered  PARTITION BY           Z-ORDER BY
Rarely filtered      Neither                Neither
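Under the hood, Z-ordering maps each row to a point on a space-filling Z-curve by interleaving the bits of its clustering column values, then lays out rows by that curve so values that are close in every dimension land in the same files, giving each file tight min/max ranges. A toy sketch of two-column bit interleaving (illustrative only; Delta's real implementation range-partitions on these interleaved values rather than sorting exactly):

```python
def z_value(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y into a single Z-curve index."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        z |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return z

# Rows close in BOTH dimensions get close Z-values, so they cluster
# into the same files and produce tight per-file min/max statistics:
points = [(200, 9000), (3, 5), (201, 9001), (3, 6)]
print(sorted(points, key=lambda p: z_value(*p)))
```

This is also why effectiveness drops as columns are added: each extra column dilutes how many bits of any one column influence the ordering.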

Measuring Z-ORDER Effectiveness

# Before Z-ORDER: check files scanned
spark.sql("SELECT * FROM my_table WHERE customer_id = 'C12345'").explain()
# Look at: "number of files read" in FileScan metrics

# After Z-ORDER: same query should read far fewer files
# Typical improvement: 10-100× fewer files scanned

Liquid Clustering

Delta Lake's next-generation clustering (Databricks, Delta Lake 3.0+).
Replaces both partitioning and Z-ORDER with incremental, automatic clustering.

Key Differences from Partitioning + Z-ORDER

| Feature | Partitioning + Z-ORDER | Liquid Clustering |
| --- | --- | --- |
| Clustering changes | Requires full rewrite | Incremental; change cluster keys anytime |
| Maintenance | Manual OPTIMIZE + ZORDER | Automatic with writes |
| Key changes | ALTER TABLE + full rewrite | ALTER TABLE, applied incrementally |
| Small file handling | Separate concern | Built-in compaction |
| Partition evolution | Very expensive | Seamless |

Usage

-- Create table with liquid clustering
CREATE TABLE my_table (
    id BIGINT,
    customer_id STRING,
    date DATE,
    amount DECIMAL(18, 2)
) USING DELTA
CLUSTER BY (customer_id, date);

-- Change cluster keys (no rewrite needed!)
ALTER TABLE my_table CLUSTER BY (date, region);

-- Remove clustering
ALTER TABLE my_table CLUSTER BY NONE;

-- Trigger clustering (runs automatically with optimizeWrite, but can be manual)
OPTIMIZE my_table;

When to Use Liquid Clustering

| Use Case | Recommendation |
| --- | --- |
| New tables on Databricks | Prefer liquid clustering over partitioning |
| Tables with evolving query patterns | Liquid clustering (can change keys) |
| Existing well-partitioned tables | Keep current partitioning unless issues arise |
| Tables with many small updates | Liquid clustering handles compaction |
| OSS Delta (< 3.0) | Not available — use partitioning + Z-ORDER |

Table Properties

Performance-Related Properties

ALTER TABLE my_table SET TBLPROPERTIES (
    -- File sizing
    'delta.targetFileSize' = '128m',                    -- Target file size for OPTIMIZE
    'delta.tuneFileSizesForRewrites' = 'true',          -- Auto-tune for MERGE/UPDATE

    -- Auto-optimization
    'delta.autoOptimize.optimizeWrite' = 'true',        -- Optimize file sizes on write
    'delta.autoOptimize.autoCompact' = 'true',          -- Auto-compact small files

    -- Data skipping
    'delta.dataSkippingNumIndexedCols' = '32',          -- Columns with min/max stats (default 32)
    'delta.dataSkippingStatsColumns' = 'customer_id,date,amount',  -- Explicit stats columns

    -- Retention
    'delta.deletedFileRetentionDuration' = 'interval 7 days',  -- VACUUM retention
    'delta.logRetentionDuration' = 'interval 30 days',          -- Transaction log retention

    -- Schema
    'delta.columnMapping.mode' = 'name',                -- Enable column rename/drop
    'delta.minReaderVersion' = '2',
    'delta.minWriterVersion' = '5'
);

Viewing Current Properties

DESCRIBE DETAIL my_database.my_table;
SHOW TBLPROPERTIES my_database.my_table;

MERGE Optimization

MERGE (upsert) is typically the most expensive Delta operation, because it must scan the target for matches and rewrite every file that contains one. These techniques minimize its cost.

Basic MERGE Pattern

MERGE INTO target AS t
USING source AS s
ON t.id = s.id
WHEN MATCHED AND s.updated_at > t.updated_at THEN
    UPDATE SET *
WHEN NOT MATCHED THEN
    INSERT *;

Optimization Techniques

1. Partition-Aligned MERGE

-- Add partition filter to reduce scan scope
MERGE INTO target AS t
USING source AS s
ON t.id = s.id AND t.date = s.date  -- date is partition column
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;

Without the partition column in the ON clause, MERGE scans ALL files in the
target table. With it, only matching partitions are scanned.

2. Reduce Source Size

# De-duplicate source before MERGE (keep latest per key)
from pyspark.sql.window import Window
from pyspark.sql.functions import row_number, desc

window = Window.partitionBy("id").orderBy(desc("updated_at"))
source_deduped = source.withColumn("rn", row_number().over(window)) \
    .filter("rn = 1").drop("rn")

3. Z-ORDER on MERGE Key

-- Z-ORDER target table on the join key used in MERGE
OPTIMIZE target_table ZORDER BY (id);

This ensures data skipping works effectively during the MERGE scan.

4. Low-Shuffle MERGE

# Enable low-shuffle merge (Databricks)
spark.conf.set("spark.databricks.delta.merge.enableLowShuffle", "true")

Low-shuffle merge rewrites only the files that contain matching rows,
rather than shuffling the entire target table.

5. Tune File Size for MERGE

-- Smaller files = less rewrite per MERGE operation
ALTER TABLE target_table SET TBLPROPERTIES ('delta.tuneFileSizesForRewrites' = 'true');

MERGE Performance Checklist

□ MERGE key includes partition column(s)
□ Source is de-duplicated on MERGE key
□ Target is Z-ORDERed on MERGE key
□ Low-shuffle merge is enabled
□ Source DataFrame is small enough to broadcast (if possible)
□ Only update columns that actually changed (avoid UPDATE SET *)
□ Run OPTIMIZE after large MERGE batches

Data Skipping

Delta Lake stores min/max statistics per file for the first 32 columns by default (configurable via delta.dataSkippingNumIndexedCols). Queries with filters on these columns skip files whose min/max range cannot contain a match.

How It Works

File 1: customer_id min="A001" max="A500"  amount min=10 max=500
File 2: customer_id min="A501" max="B200"  amount min=5 max=1200
File 3: customer_id min="B201" max="C100"  amount min=50 max=800

Query: WHERE customer_id = 'B100' AND amount > 1000
  → Skip File 1 (B100 not in A001-A500)
  → Read File 2 (B100 in A501-B200, 1000 in 5-1200)  ← Might match
  → Skip File 3 (B100 not in B201-C100)

Result: Only 1 of 3 files read (67% skipped)
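The walk-through above can be run as a small simulation: a file can be skipped when the filter values fall outside its min/max ranges. A plain-Python sketch of the per-file stats check (the stats dicts mirror the three files above; `can_skip` is an illustrative helper, not Delta's internal API):

```python
def can_skip(stats: dict, customer_id: str, min_amount: float) -> bool:
    """A file can be skipped if no row in it can possibly match
    WHERE customer_id = <customer_id> AND amount > <min_amount>."""
    if not (stats["cid_min"] <= customer_id <= stats["cid_max"]):
        return True  # customer_id outside this file's range
    if stats["amount_max"] <= min_amount:
        return True  # no amount in this file exceeds the threshold
    return False

files = [
    {"name": "File 1", "cid_min": "A001", "cid_max": "A500", "amount_max": 500},
    {"name": "File 2", "cid_min": "A501", "cid_max": "B200", "amount_max": 1200},
    {"name": "File 3", "cid_min": "B201", "cid_max": "C100", "amount_max": 800},
]

# Query: WHERE customer_id = 'B100' AND amount > 1000
to_read = [f["name"] for f in files if not can_skip(f, "B100", 1000)]
print(to_read)  # → ['File 2']
```

Note the stats only prove a file *might* match; File 2 is still read and filtered row by row, which is why tighter ranges (via Z-ORDER) translate directly into fewer files read.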

Maximizing Data Skipping

| Technique | Impact | Notes |
| --- | --- | --- |
| Z-ORDER on filter columns | High | Co-locates similar values → tighter min/max ranges |
| Partition by low-cardinality columns | High | Eliminates entire partitions |
| Put frequently filtered columns first | Medium | Default: stats on first 32 columns |
| Increase indexed columns | Low-Medium | delta.dataSkippingNumIndexedCols |
| Avoid filtered columns beyond column 32 in wide tables | Medium | Or set explicit stats columns |

Checking Data Skipping Effectiveness

# Enable detailed Spark UI metrics, then run your query
# In SQL tab, check FileScan metrics:
#   "number of files read" vs total files in table
#   "size of files read" vs total table size

# Programmatically:
df = spark.sql("SELECT * FROM my_table WHERE customer_id = 'C12345'")
df.explain()  # Check PartitionFilters and PushedFilters

Partitioning Strategy

Decision Matrix

                        High Write Volume    Low Write Volume
                        ─────────────────    ────────────────
Low Cardinality         PARTITION BY         PARTITION BY
(< 100 values)          (e.g., date, region) (e.g., status)

Medium Cardinality      Z-ORDER or           Z-ORDER
(100-10000 values)      Liquid Clustering

High Cardinality        Z-ORDER or           Z-ORDER
(> 10000 values)        Liquid Clustering

Partitioning Best Practices

-- GOOD: Partition by date (365 values/year, each partition has many rows)
CREATE TABLE events (...) USING DELTA PARTITIONED BY (date);

-- GOOD: Partition by region (< 50 values)
CREATE TABLE sales (...) USING DELTA PARTITIONED BY (region);

-- BAD: Partition by user_id (millions of values → millions of tiny directories)
CREATE TABLE events (...) USING DELTA PARTITIONED BY (user_id);  -- DON'T DO THIS

-- BAD: Partition by timestamp (near-unique → one file per partition)
CREATE TABLE events (...) USING DELTA PARTITIONED BY (event_timestamp);  -- DON'T DO THIS

Partition Size Guidelines

| Metric | Target | Warning |
| --- | --- | --- |
| Files per partition | 1-100 | > 1000 = too many small files |
| Data per partition | 1-100 GiB | < 1 MiB = over-partitioned |
| Total partitions | < 10,000 | > 100,000 = partition explosion |
| File size in partition | 64-256 MiB | < 1 MiB = small file problem |
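These thresholds can be checked mechanically against a table's file layout. A hedged sketch (`partition_warnings` is a hypothetical helper using the thresholds from the guidelines above; in practice the per-partition file listing would come from DESCRIBE DETAIL or a storage listing):

```python
MiB = 1024 ** 2

def partition_warnings(partition_files: dict) -> list:
    """Flag partitions that violate the guideline thresholds.
    partition_files maps partition name -> list of file sizes in bytes."""
    warnings = []
    if len(partition_files) > 100_000:
        warnings.append("partition explosion: > 100,000 partitions")
    for name, sizes in partition_files.items():
        if len(sizes) > 1000:
            warnings.append(f"{name}: too many small files ({len(sizes)})")
        if sum(sizes) < 1 * MiB:
            warnings.append(f"{name}: over-partitioned (< 1 MiB of data)")
        if sizes and min(sizes) < 1 * MiB:
            warnings.append(f"{name}: small file problem (files < 1 MiB)")
    return warnings

layout = {"date=2025-01-15": [128 * MiB, 96 * MiB],
          "date=2025-01-16": [200 * 1024]}  # one 200 KiB file
for w in partition_warnings(layout):
    print(w)
```

Running something like this on a schedule catches over-partitioning before it turns into a metadata problem.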

Changing Partition Strategy

# You CANNOT change partitioning of an existing Delta table in place
# Instead: rewrite the table

# Read existing data
df = spark.read.table("my_database.old_table")

# Write with new partitioning
df.write \
    .format("delta") \
    .partitionBy("date") \
    .mode("overwrite") \
    .option("overwriteSchema", "true") \
    .saveAsTable("my_database.new_table")

File Size Guidelines

Target File Sizes

| Workload Type | Target File Size | Why |
| --- | --- | --- |
| General purpose | 128 MiB | Good balance of read speed and metadata |
| Frequently merged | 32-64 MiB | Smaller files = less rewrite per MERGE |
| Append-only analytical | 256 MiB | Fewer files = faster metadata |
| Streaming sink | 64-128 MiB | Depends on trigger interval and throughput |

Calculating Optimal File Count

... [content trimmed for length — full version in the complete kit]


This is 1 of 6 resources in the DataStack Pro toolkit. Get the complete [Spark Performance Masterclass] with all files, templates, and documentation for $59.

Get the Full Kit →

Or grab the entire DataStack Pro bundle (6 products) for $164 — save 30%.

Get the Complete Bundle →
