DEV Community

Discussion on: Cascade of doom: JIT, and how a Postgres update led to 70% failure on a critical national service

Laurence 'GreenReaper' Parry

I've had a similar experience with JIT - for short queries, well-optimized by index, it can impose a cost that outweighs any benefit it provides. It also greatly increases the size of the postgres package (though for most I doubt this is a big concern) - it'd be nice if distro packagers considered splitting its optimization symbols into optional dependency packages, as removing their files both saves space and is a convenient way to disable JIT.
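For anyone who wants the same effect without repackaging: JIT can also be switched off at the cluster, role, or session level with the standard settings. A minimal sketch (`app_user` is a hypothetical role name):

```sql
-- Cluster-wide, persisted in postgresql.auto.conf:
ALTER SYSTEM SET jit = off;
SELECT pg_reload_conf();

-- Per role (app_user is hypothetical) or per session:
ALTER ROLE app_user SET jit = off;
SET jit = off;
```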

A related issue occurred when PostgreSQL 12 changed CTEs ("WITH x ...") to no longer be materialized by default, on the grounds that inlining them gives the optimizer more options. In our case we were using them as an optimization fence (since the community has an aversion to query hints otherwise...), and the change caused several queries that can be very expensive if the planner makes a bad call to run long, causing CPU issues until we added MATERIALIZED.

Of course PostgreSQL 14 is quite new, and in my experience it can take a few revisions to shake out issues with both new and existing functionality (as shown in the 14.1 patch notes) - so as a general rule, it's best to hang back. But I can appreciate wanting to switch to it quickly, especially if it fixes issues you've experienced.

Pouria Hadjibagheri

Not much to add here - I agree with almost everything you said. CTEs, especially recursive ones, tend to be very expensive in Postgres. Needless to say, expensive plans - judged by the planner's cost estimate, regardless of their actual execution time - trigger JIT... and that's part of the problem.
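To make the cost-based trigger concrete: JIT activates on the planner's estimated cost, not on wall-clock time, via a few tunable thresholds. A sketch (the raised value is illustrative, not a recommendation; `big_table` is hypothetical):

```sql
-- Defaults as of PG 12-14: 100000 / 500000 / 500000
SHOW jit_above_cost;           -- compile when estimated cost exceeds this
SHOW jit_inline_above_cost;    -- additionally inline functions
SHOW jit_optimize_above_cost;  -- additionally run expensive optimizations

-- Raise the threshold so only genuinely huge plans are compiled:
SET jit_above_cost = 1000000;

-- EXPLAIN ANALYZE prints a "JIT:" section whenever compilation kicked in:
EXPLAIN ANALYZE SELECT count(*) FROM big_table;
```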

Once you have lots of data (we have ~8 billion records in the DB) and the need to accommodate high-throughput clients, a lot of things change... generic tutorials you find on the internet become a lot less helpful. These situations call for real-world experience, and that's something which is becoming increasingly difficult to find in the market.

Materialised views of course help with CTEs, as they effectively persist the result as a separate, self-contained, indexable table and eliminate the need for a lot of in-memory processing.
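As a sketch of that pattern (the column names and aggregation are hypothetical, not our real schema):

```sql
-- Persist an expensive aggregation once, then index and refresh it,
-- instead of recomputing it inside every query.
CREATE MATERIALIZED VIEW covid19.daily_totals AS
SELECT area_code, date, sum(new_cases) AS cases
FROM covid19.time_series
GROUP BY area_code, date;

-- A unique index both speeds up lookups and permits CONCURRENTLY,
-- which refreshes without blocking readers.
CREATE UNIQUE INDEX ON covid19.daily_totals (area_code, date);
REFRESH MATERIALIZED VIEW CONCURRENTLY covid19.daily_totals;
```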

Re the upgrade; I tend to be of the opinion that we shouldn't upgrade to the latest release of software immediately - not in production, anyway. The issue for us is that we use partitioned tables. Prior to PG14, querying a partitioned table through the main (parent) table via the partition key was virtually impossible because of the latency, so we had to query the partitions directly - for instance:

SELECT -- whatever
FROM covid19.time_series_p2021_11_14_ltla AS ts
  JOIN -- whatever
WHERE 
  -- whatever

PG14 introduced a fix that allows us to query the parent table without the added latency, so the same query now looks like this:

SELECT -- whatever
FROM covid19.time_series AS ts
  JOIN -- whatever
WHERE 
  ts.partition_id = '2021_11_14|ltla'
  -- whatever

Makes life a lot easier, and lets us generate queries and use ORMs with far less friction.

There were other reasons too, but this was the most important one. Also, it's Postgres for crying out loud, these things shouldn't happen :/

Thanks for taking the time to share your experience and insight.

Laurence 'GreenReaper' Parry

Yeah, I can appreciate the frustration there. It's probably frustrating for the developers, too!

My issue was not materialized views, but WITH x AS MATERIALIZED, which is required in PG 12+ to make a non-recursive, side-effect-free CTE act as an optimization fence (such CTEs are now inlined by default).
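For concreteness, the two forms side by side (table and column names are illustrative; the literal user id is a placeholder):

```sql
-- PG 12+ may inline a side-effect-free CTE, folding it into the outer plan:
WITH watched_users AS (
    SELECT watched_user_id FROM watches WHERE user_id = 12345
)
SELECT s.submission_id
FROM submissions s
WHERE s.user_id NOT IN (SELECT watched_user_id FROM watched_users);

-- MATERIALIZED restores the pre-12 behaviour: the CTE is evaluated once,
-- up front, and acts as an optimization fence.
WITH watched_users AS MATERIALIZED (
    SELECT watched_user_id FROM watches WHERE user_id = 12345
)
SELECT s.submission_id
FROM submissions s
WHERE s.user_id NOT IN (SELECT watched_user_id FROM watched_users);
```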

Half a decade ago we had an issue with mailing-list code that generated a list of recommended users: people who had recently posted work liked by the users that the person being mailed watched, but whom they did not themselves watch (if they did, they would've seen the work already).

Many users did not actually watch anyone, so the check was quick, but it fell apart when the planner decided to compute the list of liked work (which could be expensive) before considering whether there were any watched users who might have made those likes.

I hoisted watched users into a CTE, forcing it to quickly find out whether it should not execute the rest (apologies for 'legacy' code):

        $tmp_user_id = $this->sanitize($user_id);
        $tmp_count = $this->sanitize($count);
        $tmp_date = $this->sanitize($date);

-       $arr = $this->runQuery("select u.username from watches w, submission_favorites sf, submissions s, user_statistics t, users u where w.user_id = '$tmp_user_id' and w.watched_user_id = sf.user_id and sf.create_datetime > '$tmp_date' and sf.submission_id = s.submission_id and s.deleted != 't' and s.public = 't' {$this->acl_filter} and s.user_id not in (select watched_user_id from watches where user_id = '".$tmp_user_id."') and s.user_id != '$tmp_user_id' and s.user_id = u.user_id and u.user_id = t.user_id group by t.submission_views_count, u.username order by t.submission_views_count desc, u.username asc limit '$tmp_count';");
+       #The with clause here is necessary to avoid an unduly expensive query plan
+       $arr = $this->runQuery("with watched_users as (select watched_user_id from watches where user_id = '$tmp_user_id') select u.username from watches w, submission_favorites sf, submissions s, user_statistics t, users u where w.user_id = '$tmp_user_id' and w.watched_user_id = sf.user_id and sf.create_datetime > '$tmp_date' and sf.submission_id = s.submission_id and s.deleted != 't' and s.public = 't' {$this->acl_filter} and s.user_id not in (select * from watched_users) and s.user_id != '$tmp_user_id' and s.user_id = u.user_id and u.user_id = t.user_id group by t.submission_views_count, u.username order by t.submission_views_count desc, u.username asc limit '$tmp_count';");

        return $arr;
    }

This fell apart again when we switched to PostgreSQL 12 - though we didn't notice until the weekly mailer, when things got noticeably slower for a long time. Thankfully it was limited to one CPU core, but that's on a quad-core which normally hit an average of three at peak - and it wasn't the only place we'd used that pattern, just one of the most expensive.

Having said that, version upgrades usually bring benefits. But they're definitely a double-edged sword - I'm still not sure about upgrading to 14 because it means we'll lose SSL compression (used to avoid overages due to replication transfers), or have to hack OpenSSL further to ignore it being explicitly disabled. A replacement feature is still stuck in the commitfest process. Perhaps we'll stick with 13 for now - we have another four years to think about migration.

Pouria Hadjibagheri

Gotcha. Don't get me wrong though, PG14 is great, and it's faster... just make sure you turn JIT off if you compile with LLVM. :))

Thanks again for sharing your thoughts. Great to hear about your experience. Some of these edge cases are quite rare, and it's great to have them documented in a consolidated format - even as a comment.

I guess the reason they are edge cases is - at least to some extent - that nowadays most devs rely almost exclusively on ORMs to deal with databases, which means they never really get to learn the true power of an RDBMS.

Once again, thanks Laurence.