Krishna Tangudu

How a Simple Warehouse Resize Saved Us 11% in Daily Credits While Boosting Performance

Our team hit a familiar Snowflake paradox: slower ETL runs arrived at the same time as FinOps alerts about rising credits.

The warehouse in question was WH_ETL_BRONZE_01, a multi-cluster warehouse dedicated to Bronze layer ingestion and merge workloads. What looked like a simple cost problem turned out to be a workload-isolation and concurrency problem.

The Problem

We started with this setup:

ALTER WAREHOUSE WH_ETL_BRONZE_01 SET
  WAREHOUSE_SIZE = 'XSMALL',
  MAX_CLUSTER_COUNT = 10,
  MIN_CLUSTER_COUNT = 1,
  SCALING_POLICY = 'STANDARD',
  AUTO_SUSPEND = 60,
  MAX_CONCURRENCY_LEVEL = 8;  -- default behavior

And we saw both of these at once:

  • Higher queue times (queued_overload_time)
  • Higher daily credits

At that point, we were loading 1,500+ tables through this single warehouse. That meant heavyweight Bronze MERGE workloads and smaller ingestion/utility queries were all competing in the same execution pool.

The critical issue was workload mix. Long-running MERGE statements (30 to 60 minutes) were running alongside tiny queries. When too many heavy MERGE statements landed on the same node/cluster, they competed for resources and all slowed down.

In other words, even with multi-cluster enabled, assignment patterns could create unstable performance if too many heavyweight queries were packed together.
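
To see that packing directly, a rough diagnostic like the following can count long-running MERGE statements that landed on the same cluster within the same hour. The 30-minute threshold and hourly bucketing here are illustrative choices, not values from our runbook:

SELECT
  cluster_number,
  DATE_TRUNC('hour', start_time) AS hour_bucket,
  COUNT(*) AS heavy_merges
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE warehouse_name = 'WH_ETL_BRONZE_01'
  AND query_type = 'MERGE'
  AND total_elapsed_time > 30 * 60 * 1000  -- elapsed time is in milliseconds
GROUP BY 1, 2
HAVING COUNT(*) > 1
ORDER BY heavy_merges DESC;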

We also learned we needed workload separation: large-table loads should run in a dedicated warehouse so they do not contend with smaller table loads and operational queries.

What We Changed

This was not a single before/after discovery pulled from query history. We ran controlled experiments in sequence, while splitting the largest table loads onto a separate warehouse path.

Phase 1: Baseline

ALTER WAREHOUSE WH_ETL_BRONZE_01 SET
  WAREHOUSE_SIZE = 'XSMALL',
  MAX_CLUSTER_COUNT = 10,
  MIN_CLUSTER_COUNT = 1,
  SCALING_POLICY = 'STANDARD',
  AUTO_SUSPEND = 60,
  MAX_CONCURRENCY_LEVEL = 8;

Phase 2: First Experiment

ALTER WAREHOUSE WH_ETL_BRONZE_01 SET
  WAREHOUSE_SIZE = 'XSMALL',
  MAX_CLUSTER_COUNT = 10,
  MIN_CLUSTER_COUNT = 1,
  SCALING_POLICY = 'STANDARD',
  AUTO_SUSPEND = 60,
  MAX_CONCURRENCY_LEVEL = 2;

Goal: force scale-out behavior earlier and reduce the chance that many heavy MERGE jobs share the same node.
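
One way to confirm the scale-out effect is to watch how many clusters actually pick up work each day. This sketch assumes the same ACCOUNT_USAGE access as the measurement queries later in the post:

SELECT
  start_time::DATE AS run_date,
  COUNT(DISTINCT cluster_number) AS clusters_used,
  COUNT(*) AS total_queries
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE warehouse_name = 'WH_ETL_BRONZE_01'
  AND start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())  -- recent window
GROUP BY run_date
ORDER BY run_date;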

At the same time, we moved larger-table processing to a separate warehouse so those jobs would not compete with the remaining 1,500+ table ingestion flow.
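
The split itself looked roughly like this; the warehouse name and sizing below are illustrative placeholders, not our exact production values:

-- Dedicated warehouse for large-table workloads (settings are examples)
CREATE WAREHOUSE IF NOT EXISTS WH_ETL_BRONZE_LARGE
  WAREHOUSE_SIZE = 'MEDIUM'
  MAX_CLUSTER_COUNT = 4
  MIN_CLUSTER_COUNT = 1
  AUTO_SUSPEND = 60
  INITIALLY_SUSPENDED = TRUE;

-- Large-table jobs switch context before running their MERGE
USE WAREHOUSE WH_ETL_BRONZE_LARGE;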

Phase 3: Final Optimization

ALTER WAREHOUSE WH_ETL_BRONZE_01 SET
  WAREHOUSE_SIZE = 'SMALL',
  MAX_CLUSTER_COUNT = 10,
  MIN_CLUSTER_COUNT = 1,
  SCALING_POLICY = 'STANDARD',
  AUTO_SUSPEND = 60,
  MAX_CONCURRENCY_LEVEL = 3;

This became the best balance for our workload profile.

Final state for this ETL warehouse path:

  • SMALL size
  • MAX_CONCURRENCY_LEVEL = 3
  • Large-table workloads isolated to a separate warehouse

Why This Helped

From deeper analysis of the workload behavior:

  • The number of queries and output volume could look similar across runs, yet bytes scanned at the table level could still rise (see the sketch after this list).
  • Heavy MERGE overlap was a key instability driver.
  • On slow runs, many heavy MERGE statements could get assigned to the same node, causing resource contention.
  • On fast runs, fewer heavy MERGE statements shared a node, leaving room for short-running queries.
  • Isolating large-table workloads reduced cross-workload contention in the primary Bronze ETL warehouse.
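
On the bytes-scanned point above: exact table-level attribution needs ACCESS_HISTORY, but even a simple warehouse-level proxy exposes scan growth that query counts alone would hide:

SELECT
  start_time::DATE AS run_date,
  ROUND(SUM(bytes_scanned) / POWER(1024, 3), 2) AS gb_scanned
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE warehouse_name = 'WH_ETL_BRONZE_01'
GROUP BY run_date
ORDER BY run_date;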

Lowering concurrency reduced the tendency to over-pack heavyweight merges. Then moving to SMALL provided enough per-query resources to cut elapsed time and queueing further. Separating large-table loads into a dedicated warehouse stabilized performance for the rest of the pipeline.

Configuration Summary

| Phase | Size | Max Clusters | Max Concurrency | Intent |
| --- | --- | --- | --- | --- |
| Baseline | XSMALL | 10 | 8 | Initial setup |
| Experiment | XSMALL | 10 | 2 | Force earlier scale-out / reduce heavy-query packing |
| Final | SMALL | 10 | 3 | Balance throughput, queueing, and cost |

The two final intentional changes vs baseline were:

  • WAREHOUSE_SIZE: XSMALL -> SMALL
  • MAX_CONCURRENCY_LEVEL: 8 -> 3

How We Measured

We used Snowflake ACCOUNT_USAGE views for both performance and cost, comparing baseline and final optimization windows.

Query Performance (Before/After)

SELECT
  CASE
    WHEN start_time < '2026-05-05' THEN 'XSMALL Period'
    ELSE 'SMALL Period'
  END AS period,
  COUNT(*) AS total_queries,
  ROUND(AVG(total_elapsed_time)/1000, 2) AS avg_elapsed_sec,
  ROUND(AVG(queued_overload_time)/1000, 2) AS avg_queued_sec,
  ROUND(MAX(total_elapsed_time)/1000, 2) AS max_elapsed_sec
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE warehouse_name = 'WH_ETL_BRONZE_01'
  AND start_time >= '2026-04-30'
GROUP BY period
ORDER BY period;

Credit Consumption (Daily)

SELECT
  CASE
    WHEN start_time::DATE < '2026-05-05' THEN 'XSMALL Period'
    ELSE 'SMALL Period'
  END AS period,
  ROUND(SUM(credits_used_compute) / COUNT(DISTINCT start_time::DATE), 2) AS daily_credits
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE warehouse_name = 'WH_ETL_BRONZE_01'
  AND start_time::DATE >= '2026-04-30'
GROUP BY period
ORDER BY period;

Results

Cost

| Period | Size | Daily Credits |
| --- | --- | --- |
| Baseline (5 days) | XSMALL | 15.12/day |
| Optimized (3 days) | SMALL | 13.50/day |

Daily credits dropped by about 11% after upsizing.

Performance

| Metric | XSMALL | SMALL | Improvement |
| --- | --- | --- | --- |
| Avg query time | 26.15 s | 24.13 s | 8% faster |
| Max query time | 3,475 s | 2,850 s | 18% faster |
| Total queue time | 2.20 min | 0.28 min | 87% less queuing |

Note: The intermediate XSMALL plus concurrency 2 phase was used to validate behavior and direction. The published KPI table above compares the stable baseline period against the final tuned period.

The Counterintuitive Lesson

In some workloads, a bigger warehouse can cost less.

1. Faster execution means earlier suspend

Queries completed faster on SMALL, so the warehouse reached AUTO_SUSPEND = 60 sooner. Less runtime plus less idle overhead translated to fewer daily credits.
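
A back-of-envelope check makes this concrete. Using Snowflake's standard per-cluster rates (XSMALL = 1 credit/hour, SMALL = 2 credits/hour) and the daily-credit numbers above, billed cluster time fell by more than half, which is why the doubled hourly rate still came out cheaper:

-- Rough conversion of daily credits into billed cluster-hours
SELECT
  15.12 / 1 AS xsmall_cluster_hours_per_day,  -- ~15.1 billed cluster-hours
  13.50 / 2 AS small_cluster_hours_per_day;   -- ~6.75 billed cluster-hours (~55% less)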

2. Concurrency must match workload shape

At the default MAX_CONCURRENCY_LEVEL of 8, heavy MERGE jobs got packed together on the same cluster. Dropping to 2 improved isolation but scaled out aggressively, risking cluster sprawl. Landing at 3 on SMALL balanced isolation for heavy queries against the cost of extra clusters.

3. Less queueing improves throughput and utilization

With 87% less queue time, work finished in tighter windows. The warehouse did useful work and went to sleep sooner.

Key Takeaways

  1. Do not assume smaller is always cheaper.
  2. Monitor queued_overload_time closely. Persistent queueing often means the warehouse is under-sized or its concurrency limit is mis-tuned for the workload (see the sketch after this list).
  3. Tune MAX_CONCURRENCY_LEVEL with warehouse size and workload type; they should be optimized together.
  4. Isolate heavyweight large-table workflows into a dedicated warehouse to reduce overlap and contention.
  5. Use both QUERY_HISTORY (performance) and WAREHOUSE_METERING_HISTORY (cost) for before/after decisions.
  6. Keep aggressive auto-suspend for bursty workloads. Faster queries plus short suspend windows compound savings.
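
For takeaway 2, a lightweight daily check like this is enough to spot persistent queueing (the 14-day window is an arbitrary choice):

SELECT
  start_time::DATE AS run_date,
  ROUND(SUM(queued_overload_time) / 60000, 2) AS total_queued_min
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE warehouse_name = 'WH_ETL_BRONZE_01'
  AND start_time >= DATEADD('day', -14, CURRENT_TIMESTAMP())
GROUP BY run_date
ORDER BY run_date;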

Closing Thought

Warehouse right-sizing is a performance and FinOps exercise, not just a cost-control exercise. In many real workloads, a slightly larger warehouse with slightly higher concurrency can win on both speed and spend.

Disclosure and Scope

  • Results reflect our specific environment and workload profile; outcomes may vary.

This analysis used Snowflake built-in ACCOUNT_USAGE views only. No third-party monitoring stack required.
