This is a blogpost that goes through the process of executing a query in postgres. This is an everyday process, but when I searched for a clear explanation of it, I could not find one that covered the details I wanted, which is why I wrote this post.
I like to go through these essential processes in great detail, because at scale small details matter. And with postgres we all have access to the source code, so we can read what it actually does instead of guessing.
I work for Yugabyte, where we reuse postgres as the SQL layer of our distributed database. We currently use postgres version 11, which is why I tend to look into the source code of that version.
The actual loop that reads a command from a database client is in a function called PostgresMain, in a layer called 'traffic cop' or 'tcop', in the file postgres.c. Inside this function, a postgres server process, known as a 'backend', waits for a command. So far, you might find it cool to understand where in the code your psql session spends its idle time, or not.
However, if you scroll down a little further you will find a switch statement, which chooses how postgres handles the request from its client. As you can see, it switches on the variable 'firstchar', which contains a single character identifying the type of client message. I am going to look into 'Q', commented as 'simple query'.
Let me first take a step back. I am a bit confused by the name 'traffic cop'. I may be missing background, as I am relatively new to postgres, but I don't understand the name. The idea the name gives me is that it performs some kind of network or client traffic management, such as resource management. What the code in this layer actually does is glue together all the different steps that query execution must perform to produce a result. You could argue that it is directing execution, but that's not the first thing I think of when hearing 'traffic cop'.
I did my research, and there is quite some material out there that seems to suggest the traffic cop is a layer you pass through on the way to the next layers. As explained in the previous paragraph, this is not the case. The traffic cop layer is the base layer, which points execution to the next step in another layer, which returns to this layer, gets pointed to the next step, et cetera, constantly returning to this layer until execution is done.
This blogpost looks at the 'simple query protocol'. This protocol sends a query to the postgres backend in a single logical message, which is processed as a whole, after which the response is sent back.
The alternative is the 'extended query protocol', which allows finer control over the individual steps that query processing performs in postgres, but at the cost of using multiple messages, and thus roundtrips.
A prominent reason to look into the simple query protocol is that the 'psql' utility uses this protocol exclusively to access the database (and thus the 'ysqlsh' utility in YugabyteDB). PGJDBC by default uses the extended query protocol, but can optionally use the simple query protocol.
When using the simple query protocol, all work is done in one step, selected by 'Q' in the variable 'firstchar'. This step calls the function exec_simple_query, which performs all the steps needed for execution in one go.
It would be silly to describe every single function. However, I do think you should know the essential steps that simple query execution takes. These are:
exec_simple_query *
ㄴ pg_parse_query *
   ㄴ raw_parser : Parse (syntactic)
ㄴ pg_analyze_and_rewrite *
   ㄴ parse_analyze : Parse (semantic)
   ㄴ pg_rewrite_query *
      ㄴ QueryRewrite : Rewriter
ㄴ pg_plan_queries *
   ㄴ pg_plan_query *
      ㄴ planner : Planner
ㄴ PortalRun *
   ㄴ PortalRunSelect *
      ㄴ ExecutorRun : Executor
You might wonder what these asterisks mean. These indicate which of these functions are in the traffic cop layer. So everything starts in the traffic cop layer, and goes back to the traffic cop layer.
At this point you might wonder whether this blogpost is actually going anywhere: none of this can be changed to make a difference, so why look into it at all?
The only way to sensibly look into performance is to understand how much time is spent executing the work, and what is actually included in that time.
There are a couple of ways in postgres to look into execution latency.
A very common way to look into query performance is the extension pg_stat_statements, which groups SQL statements and records execution statistics such as the number of executions, total time, min time, max time, et cetera. pg_stat_statements is a so-called contrib extension, which means it's an extension (not part of the core database code), but one that is delivered as part of the source code.
The way pg_stat_statements can work inside the database and obtain runtime information is by using hooks, which allow external code in an extension to interact with database processing. Such code can subscribe itself to a hook, which means that the code gets run when the hook is triggered.
The hooks that pg_stat_statements subscribes itself to include ExecutorStart_hook and ExecutorFinish_hook (among others).
It is via these executor hooks that 'total time', 'min time' and the other timings are recorded. Despite the naming, the time recorded is only the time spent between executor start and executor finish. That means that the time in pg_stat_statements is not the full query latency, but only the time spent in the executor running the plan tree.
It is important to note that starting from version 13, pg_stat_statements can optionally record planning time in addition to execution time. However, since we are using version 11, this choice is not available. And regardless of whether planning is included, it still is not the full query latency, although the places most likely to account for significant latency are included.
log_min_duration_statement is a parameter that, when set, logs a SQL statement if its execution latency exceeds the configured value in milliseconds.
The downside is that it records (writes) the statement to the database logfile, so you cannot query it from the database like you can with pg_stat_statements.
The good news is that it logs the entire time spent in exec_simple_query, including all the steps mentioned earlier. This means log_min_duration_statement measures something different from what pg_stat_statements records.
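For example, in postgresql.conf (the 100 ms threshold here is just an example value):

```
# log any statement whose total execution time exceeds 100 ms
log_min_duration_statement = 100

# -1 disables it; 0 logs the duration of every statement
```

The parameter can also be set per session with SET, which is convenient for investigating a single workload without flooding the logfile.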
I have tried to explain some of the internal workings of execution using the simple query protocol, and two of the common methods in current postgres to investigate performance.
It turns out pg_stat_statements only tracks the executor time and not the full query latency, while log_min_duration_statement does track the full time.
You could argue that most of the time is likely spent in the execution of a statement, and I would agree. But you must understand what the figures your tools provide actually mean; otherwise it's easy to come to wrong conclusions.