DEV Community

Bill Schneider
Bill Schneider

Posted on

4 1

Measuring AWS Redshift Query Compile Latency

This article originally appeared on my blog

AWS is transparent that Redshift's distributed architecture entails a fixed cost every time a new query is issued. The documentation says the impact "might be especially noticeable when you run one-off (ad hoc) queries."

I went deeper to try to quantify exactly what "noticeable" means.

To isolate the impacts of data cache hits/misses from query compilation, I ran a bunch of queries on empty tables so there is no data to load or cache. Each query was slightly modified to trigger a recompilation, by changing the columns or aggregate functions.

I found that the compile latency scales with the complexity of the query.

  • Simple query: usually between 1-1.5 sec, with an outlier around 3 seconds. Example of a simple query:
select sum(a1) from foo where a2 = 1;
select sum(a2) from foo where a3 = 1;
-- etc.
  • More complex query with more conditions, and group-by: usually around 2-3 seconds. Example of a query in this category:
select a8, a9, sum(a1), sum(a2)
from foo
where foo.a3 > 10 and foo.a4 < foo.a5
group by a8, a9;
  • Even more complex, with joins and group-by: average around 5 seconds, ranging between 3-7 seconds. Example query:
select s, s2, count(a6), sum(a7)
from foo
join bar on bar.a = foo.a6
join baz on baz.b = foo.a7
where foo.a3 = 1 and baz.s2 is not null
group by s, s2;

AWS GenAI LIVE image

Real challenges. Real solutions. Real talk.

From technical discussions to philosophical debates, AWS and AWS Partners examine the impact and evolution of gen AI.

Learn more

Top comments (0)

Billboard image

Create up to 10 Postgres Databases on Neon's free plan.

If you're starting a new project, Neon has got your databases covered. No credit cards. No trials. No getting in your way.

Try Neon for Free →