<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Aliaksei Kirkouski</title>
    <description>The latest articles on DEV Community by Aliaksei Kirkouski (@crushby).</description>
    <link>https://dev.to/crushby</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1012156%2F7d019b48-1c04-428a-9245-e5119c15c9a7.png</url>
      <title>DEV Community: Aliaksei Kirkouski</title>
      <link>https://dev.to/crushby</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/crushby"/>
    <language>en</language>
    <item>
      <title>Strange PostgreSQL query planner behavior in subqueries</title>
      <dc:creator>Aliaksei Kirkouski</dc:creator>
      <pubDate>Thu, 19 Sep 2024 09:03:28 +0000</pubDate>
      <link>https://dev.to/crushby/strange-postgresql-query-planner-behavior-in-subqueries-4llf</link>
      <guid>https://dev.to/crushby/strange-postgresql-query-planner-behavior-in-subqueries-4llf</guid>
      <description>&lt;p&gt;In a &lt;a href="https://dev.to/crushby/postgresql-and-temporary-tables-1ned"&gt;previous article&lt;/a&gt;, I described the problems that arise when working with temporary tables. Then I briefly described why we have to use them so often. In particular, one of the reasons was that the query planner in PostgreSQL doesn't work properly. In this article I will show another quite simple and often used case when the planner goes wrong, which can lead to a significant increase in resource consumption.&lt;/p&gt;

&lt;p&gt;The problem reproduces on the current stable version of PostgreSQL, 16.4, with the default settings. I tried changing various settings, but was unable to obtain the correct plan in the general case, since the problem here lies in the planner's logic rather than in cost estimation. However, anyone can easily reproduce this situation locally and experiment with the settings.&lt;br&gt;
Consider a simple domain model with documents and their lines. Each line carries an amount. The lines are stored in a separate table and reference the document:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TABLE doc (id int PRIMARY KEY);
CREATE TABLE line (id int PRIMARY KEY, docId int, amount numeric);
CREATE INDEX line_doc ON line (docid);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's fill the tables with test data: generate 100,000 documents having 10, 20, 50, and 100 lines in equal proportion:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INSERT INTO doc (id) SELECT generate_series AS id FROM generate_series(1, 100000);
INSERT INTO line (id, docId, amount) (SELECT docId*100 + id AS id, docId, random() AS amount FROM generate_series(1, 10) AS id, generate_series(1, 25000) AS docid);
INSERT INTO line (id, docId, amount) (SELECT docId*100 + id AS id, docId, random() AS amount FROM generate_series(1, 20) AS id, generate_series(25001, 50000) AS docid);
INSERT INTO line (id, docId, amount) (SELECT docId*100 + id AS id, docId, random() AS amount FROM generate_series(1, 50) AS id, generate_series(50001, 75000) AS docid);
INSERT INTO line (id, docId, amount) (SELECT docId*100 + id AS id, docId, random() AS amount FROM generate_series(1, 100) AS id, generate_series(75001, 100000) AS docid);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's run ANALYZE so that PostgreSQL collects the correct statistics for them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ANALYZE doc;
ANALYZE line;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's get a plan for a simple query to get the lines for a particular document:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT id FROM line WHERE docId = 4353;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Index Scan using line_doc on line  (cost=1.73..12.03 rows=70 width=4)
  Index Cond: (docid = 4353)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can immediately see that PostgreSQL is a pessimist: it expects to get 70 records, while on the test data each document has 45 lines on average. This is not critical, and in some cases even useful, so PostgreSQL can be forgiven. In our particular case, however, such pessimism will only make the problem worse.&lt;/p&gt;
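&lt;p&gt;The actual average can be checked directly on the generated test data (a quick sanity check, not part of the original experiment):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Equal shares of 10-, 20-, 50- and 100-line documents average out to (10+20+50+100)/4 = 45
SELECT avg(cnt) FROM (SELECT count(*) AS cnt FROM line GROUP BY docid) AS s;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;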

&lt;p&gt;Next, let's try a query that is simple from the developer's point of view:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT docId, SUM(amount) FROM line WHERE docId IN (3531,6572) GROUP BY 1;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It simply fetches the two documents along with the sum over their lines for each of them. But here is what we see in the plan:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GroupAggregate (cost=1.73..35.03 rows=139 width=36)
  Group Key: docid
  -&amp;gt; Index Scan using line_doc on line (cost=1.73..23.91 rows=139 width=15)
        Index Cond: (docid = ANY ('{3531,6572}'::integer[]))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The query planner thinks it will select 139 rows for these two documents (which is consistent with the statistics of the previous query), but also that grouping by document will still leave 139(!) rows, although there can obviously be at most 2. As a result, the estimate is off from reality by a factor of almost 70.&lt;/p&gt;

&lt;p&gt;By the way, if you run the query with a single document, the estimate is already correct (apparently the planner has a special-case heuristic):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT docId, SUM(amount) FROM line WHERE docId IN (3531) GROUP BY 1;
GroupAggregate (cost=1.73..12.79 rows=1 width=36)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-&amp;gt; Index Scan using line_doc on line (cost=1.73..12.03 rows=70 width=15)
Index Cond: (docid = 3531)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This single-value optimization appeared only in recent versions of PostgreSQL; before version 15 it was not available, and the planner estimated 70 records in this case as well.&lt;/p&gt;

&lt;p&gt;Next, consider the following task: display to the user one page of a document list, consisting of 50 records and showing the total amount for each document. To solve it, let's write the IDs of all documents on the page into a separate temporary table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TEMPORARY TABLE tmp (id int PRIMARY KEY);
INSERT INTO tmp (id) SELECT * FROM generate_series(1, 50);
ANALYZE tmp;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, let's build a query that returns the document amounts, using a subquery to calculate the sum for each document. First, let's take the most obvious approach, a plain subquery:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT tmp.id,
       sq.amount
FROM tmp
LEFT JOIN
  (SELECT docid,
          SUM(amount) AS amount
   FROM line
   GROUP BY 1) sq ON sq.docid = tmp.id
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Unfortunately, PostgreSQL doesn't realize that it needs amounts for only 50 documents, so it calculates them for all documents in the table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hash Right Join (cost=155816.90..163627.72 rows=1616168 width=36)
  Hash Cond: (line.docid = tmp.id)
  -&amp;gt; Finalize HashAggregate (cost=155811.30..159691.74 rows=64674 width=36)
        Group Key: line.docid
        -&amp;gt; Gather (cost=135115.62..151930.86 rows=129348 width=36)
              Workers Planned: 2
              -&amp;gt; Partial HashAggregate (cost=134115.62..137996.06 rows=64674 width=36)
                    Group Key: line.docid
                    -&amp;gt; Parallel Seq Scan on line (cost=0.00..96615.82 rows=1874990 width=15)
  -&amp;gt; Hash (cost=2.60..2.60 rows=50 width=4)
        -&amp;gt; Seq Scan on tmp (cost=0.00..2.60 rows=50 width=4)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is clearly not the optimal plan, so let's help PostgreSQL a bit by adding a JOIN with our temporary table (by the way, the &lt;a href="https://dev.to/crushby/lsfusion-open-source-platform-for-business-applications-1ecm"&gt;lsFusion platform&lt;/a&gt; does this automatically when generating queries):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT tmp.id,
       sq.amount
FROM tmp
LEFT JOIN
  (SELECT docid,
          SUM(amount) AS amount
   FROM line
   JOIN tmp ON tmp.id = docid
   GROUP BY 1) sq ON sq.docid = tmp.id
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This results in a much better plan:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hash Right Join (cost=8.82..744.26 rows=870 width=36)
  Hash Cond: (line.docid = tmp.id)
  -&amp;gt; GroupAggregate (cost=3.22..527.25 rows=3479 width=36)
        Group Key: line.docid
        -&amp;gt; Merge Join (cost=3.22..248.93 rows=3479 width=15)
              Merge Cond: (line.docid = tmp_1.id)
              -&amp;gt; Index Scan using line_doc on line (cost=1.73..453359.63 rows=4499977 width=15)
              -&amp;gt; Index Only Scan using tmp_pkey on tmp tmp_1 (cost=0.56..5.86 rows=50 width=4)
  -&amp;gt; Hash (cost=2.60..2.60 rows=50 width=4)
        -&amp;gt; Seq Scan on tmp (cost=0.00..2.60 rows=50 width=4)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, even though this plan is actually optimal, there is one big problem with the statistics. First, when grouping by document, PostgreSQL assumes there will be as many groups as rows (3479). That is, just as in the examples above, it is wrong by a factor of 70.&lt;/p&gt;

&lt;p&gt;Secondly, after the LEFT JOIN with the temporary table tmp, it does not realize that the subquery's keys are unique. This is quite surprising, given that GROUP BY by definition produces unique values of the grouping columns.&lt;/p&gt;

&lt;p&gt;Obviously, the result cannot contain more records than the tmp table itself, but PostgreSQL expects 870 records, more than 15 times the correct number.&lt;/p&gt;

&lt;p&gt;In this simple case the estimation error is not that critical. However, as the query becomes more complex, the error can lead to completely wrong query plans. For the test, to keep the logic simple, let's add two more LEFT JOINs with the same subquery (as if we needed to calculate sums from other tables as well), and also read an extra field from the source table doc:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT tmp.id,
       sq.amount,
       sq2.amount,
       sq3.amount,
       d1.id
FROM tmp
LEFT JOIN
  (SELECT docid,
          SUM(amount) AS amount
   FROM line
   JOIN tmp ON tmp.id = docid
   GROUP BY 1) sq ON sq.docid = tmp.id

LEFT JOIN
  (SELECT docid,
          SUM(amount) AS amount
   FROM line
   JOIN tmp ON tmp.id = docid
   GROUP BY 1) sq2 ON sq2.docid = tmp.id

LEFT JOIN
  (SELECT docid,
          SUM(amount) AS amount
   FROM line
   JOIN tmp ON tmp.id = docid
   GROUP BY 1) sq3 ON sq3.docid = tmp.id

LEFT JOIN doc d1
    ON tmp.id = d1.id
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We get a plan like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hash Left Join (cost=1824.83..2788.04 rows=263256 width=104)
  Hash Cond: (tmp.id = sq3.docid)
  -&amp;gt; Hash Left Join (cost=914.89..1715.05 rows=15134 width=72)
        Hash Cond: (tmp.id = sq2.docid)
        -&amp;gt; Merge Left Join (cost=4.95..795.76 rows=870 width=40)
              Merge Cond: (tmp.id = line.docid)
              -&amp;gt; Merge Left Join (cost=1.73..15.76 rows=50 width=8)
                    Merge Cond: (tmp.id = d1.id)
                    -&amp;gt; Index Only Scan using tmp_pkey on tmp (cost=0.56..5.86 rows=50 width=4)
                    -&amp;gt; Index Only Scan using doc_pkey on doc d1 (cost=1.17..10028.77 rows=100000 width=4)
              -&amp;gt; GroupAggregate (cost=3.22..527.25 rows=3479 width=36)
                    Group Key: line.docid
                    -&amp;gt; Merge Join (cost=3.22..248.93 rows=3479 width=15)
                          Merge Cond: (line.docid = tmp_1.id)
                          -&amp;gt; Index Scan using line_doc on line (cost=1.73..453359.63 rows=4499977 width=15)
                          -&amp;gt; Index Only Scan using tmp_pkey on tmp tmp_1 (cost=0.56..5.86 rows=50 width=4)
        -&amp;gt; Hash (cost=701.20..701.20 rows=3479 width=36)
              -&amp;gt; Subquery Scan on sq2 (cost=3.22..701.20 rows=3479 width=36)
                    -&amp;gt; GroupAggregate (cost=3.22..527.25 rows=3479 width=36)
                          Group Key: line_1.docid
                          -&amp;gt; Merge Join (cost=3.22..248.93 rows=3479 width=15)
                                Merge Cond: (line_1.docid = tmp_2.id)
                                -&amp;gt; Index Scan using line_doc on line line_1 (cost=1.73..453359.63 rows=4499977 width=15)
                                -&amp;gt; Index Only Scan using tmp_pkey on tmp tmp_2 (cost=0.56..5.86 rows=50 width=4)
  -&amp;gt; Hash (cost=701.20..701.20 rows=3479 width=36)
        -&amp;gt; Subquery Scan on sq3 (cost=3.22..701.20 rows=3479 width=36)
              -&amp;gt; GroupAggregate (cost=3.22..527.25 rows=3479 width=36)
                    Group Key: line_2.docid
                    -&amp;gt; Merge Join (cost=3.22..248.93 rows=3479 width=15)
                          Merge Cond: (line_2.docid = tmp_3.id)
                          -&amp;gt; Index Scan using line_doc on line line_2 (cost=1.73..453359.63 rows=4499977 width=15)
                          -&amp;gt; Index Only Scan using tmp_pkey on tmp tmp_3 (cost=0.56..5.86 rows=50 width=4)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the plan we can see that the error “accumulates”: PostgreSQL expects 263256 records here, although there cannot be more than 50. Nevertheless, the optimizer manages to reorder the JOINs and reads fields from the doc table only for the records in the temporary table (see lines 7-10 of the plan). The problem is that with a large number of JOINs (in particular, more than the join_collapse_limit parameter, which defaults to 8), this optimization may no longer help.&lt;/p&gt;

&lt;p&gt;Let's simulate this situation by simply setting the join_collapse_limit parameter to one and running the same query:&lt;br&gt;
&lt;code&gt;SET join_collapse_limit=1;&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hash Left Join (cost=12873.00..16545.19 rows=263256 width=104)
  Hash Cond: (tmp.id = d1.id)
  -&amp;gt; Hash Left Join (cost=1828.70..2736.54 rows=263256 width=100)
        Hash Cond: (tmp.id = sq3.docid)
        -&amp;gt; Hash Left Join (cost=918.76..1663.55 rows=15134 width=68)
              Hash Cond: (tmp.id = sq2.docid)
              -&amp;gt; Hash Right Join (cost=8.82..744.26 rows=870 width=36)
                    Hash Cond: (line.docid = tmp.id)
                    -&amp;gt; GroupAggregate (cost=3.22..527.25 rows=3479 width=36)
                          Group Key: line.docid
                          -&amp;gt; Merge Join (cost=3.22..248.93 rows=3479 width=15)
                                Merge Cond: (line.docid = tmp_1.id)
                                -&amp;gt; Index Scan using line_doc on line (cost=1.73..453359.63 rows=4499977 width=15)
                                -&amp;gt; Index Only Scan using tmp_pkey on tmp tmp_1 (cost=0.56..5.86 rows=50 width=4)
                    -&amp;gt; Hash (cost=2.60..2.60 rows=50 width=4)
                          -&amp;gt; Seq Scan on tmp (cost=0.00..2.60 rows=50 width=4)
              -&amp;gt; Hash (cost=701.20..701.20 rows=3479 width=36)
                    -&amp;gt; Subquery Scan on sq2 (cost=3.22..701.20 rows=3479 width=36)
                          -&amp;gt; GroupAggregate (cost=3.22..527.25 rows=3479 width=36)
                                Group Key: line_1.docid
                                -&amp;gt; Merge Join (cost=3.22..248.93 rows=3479 width=15)
                                      Merge Cond: (line_1.docid = tmp_2.id)
                                      -&amp;gt; Index Scan using line_doc on line line_1 (cost=1.73..453359.63 rows=4499977 width=15)
                                      -&amp;gt; Index Only Scan using tmp_pkey on tmp tmp_2 (cost=0.56..5.86 rows=50 width=4)
        -&amp;gt; Hash (cost=701.20..701.20 rows=3479 width=36)
              -&amp;gt; Subquery Scan on sq3 (cost=3.22..701.20 rows=3479 width=36)
                    -&amp;gt; GroupAggregate (cost=3.22..527.25 rows=3479 width=36)
                          Group Key: line_2.docid
                          -&amp;gt; Merge Join (cost=3.22..248.93 rows=3479 width=15)
                                Merge Cond: (line_2.docid = tmp_3.id)
                                -&amp;gt; Index Scan using line_doc on line line_2 (cost=1.73..453359.63 rows=4499977 width=15)
                                -&amp;gt; Index Only Scan using tmp_pkey on tmp tmp_3 (cost=0.56..5.86 rows=50 width=4)
  -&amp;gt; Hash (cost=5044.30..5044.30 rows=100000 width=4)
        -&amp;gt; Seq Scan on doc d1 (cost=0.00..5044.30 rows=100000 width=4)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can see that PostgreSQL now executes the JOINs in the order they appear in the query. As a result of the statistics error, the planner decided it would be efficient to seq-scan the entire doc table and then perform a hash join (see the last two lines of the plan). That would indeed be correct if the first part of the query really produced 260 thousand records. In practice, however, there will be 50, and a simple index scan would be much more efficient.&lt;/p&gt;

&lt;p&gt;If we use a regular table instead of the temporary table &lt;em&gt;tmp&lt;/em&gt;, the result does not change. However, if we use the main table with a WHERE clause instead of the temporary table, the planning suddenly becomes correct:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT doc.id,
       sq.amount
FROM doc
LEFT JOIN
  (SELECT docid,
          SUM(amount) AS amount
   FROM line
   JOIN tmp ON tmp.id = docid
   GROUP BY 1) sq ON sq.docid = doc.id
WHERE doc.id &amp;gt;= 1 AND doc.id &amp;lt;= 50;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Merge Left Join (cost=4.39..744.12 rows=52 width=36)
  Merge Cond: (doc.id = line.docid)
  -&amp;gt; Index Only Scan using doc_pkey on doc  (cost=1.17..7.51 rows=52 width=4)
        Index Cond: ((id &amp;gt;= 1) AND (id &amp;lt;= 50))
  -&amp;gt; GroupAggregate (cost=3.22..527.25 rows=3479 width=36)
        Group Key: line.docid
        -&amp;gt; Merge Join (cost=3.22..248.93 rows=3479 width=15)
              Merge Cond: (line.docid = tmp.id)
              -&amp;gt; Index Scan using line_doc on line (cost=1.73..453359.63 rows=4499977 width=15)
              -&amp;gt; Index Only Scan using tmp_pkey on tmp (cost=0.56..5.86 rows=50 width=4)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As we can see, PostgreSQL now expects 52 rows. It still makes the same mistake when estimating the number of rows in the nested subquery, but after the JOIN it no longer assumes the number of records will grow. And all this despite there being no actual connection between the doc table and the nested subquery: the subquery uses only the tables &lt;em&gt;line&lt;/em&gt; and &lt;em&gt;tmp&lt;/em&gt;, which from the point of view of the database schema are in no way related to the doc table.&lt;/p&gt;

&lt;p&gt;Unfortunately, I don't have an explanation for this planner behavior yet. Using the source table with a WHERE clause in the outer query is not ideal either, because with a complex filter PostgreSQL can again misestimate the number of records. With a JOIN against a temporary table, it at least knows exactly how many records the table contains.&lt;/p&gt;

&lt;p&gt;I have only considered the simplest case of grouping in a nested query. In practice, the planning error also reproduces on much more complex subqueries. Fortunately, this problem does not occur very often, since PostgreSQL's JOIN reordering usually saves the day. Still, every now and then we stumble upon such inefficient queries.&lt;/p&gt;

&lt;p&gt;In the lsFusion platform, this problem is easily fixed by &lt;a href="https://docs.lsfusion.org/Materializations/" rel="noopener noreferrer"&gt;materializing&lt;/a&gt; the nested subquery without changing the application logic. But this has downsides: increasing the number of stored fields increases the size of the database and slows down write transactions. Another option is to pre-write the nested queries into temporary tables and then run ANALYZE on them. The platform also does this automatically when the query execution time exceeds a certain threshold.&lt;/p&gt;
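&lt;p&gt;A hand-written sketch of that second workaround, using the tables from the example above, might look like this (the table name doc_amount is mine; the platform generates the equivalent automatically):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Materialize the subquery result and give the planner exact statistics
CREATE TEMPORARY TABLE doc_amount (docid int PRIMARY KEY, amount numeric);
INSERT INTO doc_amount (docid, amount)
  SELECT docid, SUM(amount) FROM line JOIN tmp ON tmp.id = docid GROUP BY 1;
ANALYZE doc_amount;
-- The final query now joins a table with known row counts and unique keys
SELECT tmp.id, da.amount FROM tmp LEFT JOIN doc_amount da ON da.docid = tmp.id;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;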

&lt;p&gt;Recently, a &lt;a href="https://www.postgresql.org/message-id/50fe6779-ee2d-4256-bc64-cd661bc4029a@gmail.com" rel="noopener noreferrer"&gt;patch&lt;/a&gt; has been proposed in PostgreSQL that fixes this problem quite easily. Perhaps, if it is accepted, the problem will be solved.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>database</category>
    </item>
    <item>
      <title>PostgreSQL and Temporary Tables</title>
      <dc:creator>Aliaksei Kirkouski</dc:creator>
      <pubDate>Thu, 12 Oct 2023 10:29:12 +0000</pubDate>
      <link>https://dev.to/crushby/postgresql-and-temporary-tables-1ned</link>
      <guid>https://dev.to/crushby/postgresql-and-temporary-tables-1ned</guid>
      <description>&lt;p&gt;We have been using PostgreSQL as our main database for many years. During this time, it has proved to be a fast and reliable RDBMS. However, there is one problem in PostgreSQL that we have to face quite often. Unfortunately, the implementation of temporary table logic in it has a number of drawbacks that negatively affect system performance.&lt;/p&gt;

&lt;p&gt;Temporary tables are used most actively by platforms in which the developer does not work with the database directly, and the tables and queries are generated by the platform itself. We are developing &lt;a href="https://dev.to/crushby/lsfusion-open-source-platform-for-business-applications-1ecm"&gt;one&lt;/a&gt; such platform.&lt;/p&gt;

&lt;p&gt;In this article I will describe why you have to use temporary tables, what the problem is, and how to improve performance by customizing the operating system and PostgreSQL.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is the problem
&lt;/h2&gt;

&lt;p&gt;Temporary tables in PostgreSQL are implemented in much the same way as regular tables. The developers' choice is understandable: otherwise they would need two separate code paths for regular and temporary tables, which would greatly complicate the DBMS logic and introduce reliability and performance problems. On the other hand, the use cases for temporary tables are quite narrow, and many mechanisms used for regular tables are redundant for them. In particular, temporary tables are guaranteed not to be used by multiple connections at the same time, are not subject to locks, do not require write durability, etc.&lt;/p&gt;

&lt;p&gt;As with regular tables, information about temporary tables is stored in the PostgreSQL system catalogs. In addition, one or more files are created on disk for each table (by default, in the same directory as the files of regular tables).&lt;/p&gt;

&lt;p&gt;One side effect of this implementation of temporary tables in PostgreSQL is transactional support. If the temporary tables were modified within a transaction, and the transaction is then rolled back, the temporary table will also be restored to its pre-transaction state. In most cases, this behavior is not really needed and also creates some overhead.&lt;/p&gt;

&lt;p&gt;The problem is that temporary tables must be purged frequently. Doing this with a full-table &lt;code&gt;DELETE&lt;/code&gt; is bad, because PostgreSQL uses MVCC for temporary tables as well, and deleting records under this model is relatively slow. Therefore, we have to use &lt;code&gt;TRUNCATE&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;TRUNCATE&lt;/code&gt; simply creates a new file on disk and does an &lt;code&gt;UPDATE&lt;/code&gt; of the &lt;em&gt;pg_class&lt;/em&gt; table. This is easily verified by doing a query like this one and seeing what happens on disk after each one:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TEMPORARY TABLE t0 (key0 integer);
SELECT relfilenode FROM pg_class WHERE relname = 't0';
TRUNCATE t0;
SELECT relfilenode FROM pg_class WHERE relname = 't0';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;One file of the kind &lt;em&gt;t0_9782399&lt;/em&gt; will be created on disk first, and then another with the new &lt;em&gt;relfilenode&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Also, after adding records to a temporary table, you have to &lt;code&gt;ANALYZE&lt;/code&gt; it so that PostgreSQL knows the correct statistics of the data in it. &lt;code&gt;ANALYZE&lt;/code&gt;, in turn, also modifies the system tables and accesses the disk (in the &lt;em&gt;visibilitymap_count&lt;/em&gt; function).&lt;/p&gt;

&lt;p&gt;With a large number of &lt;code&gt;TRUNCATE&lt;/code&gt; and &lt;code&gt;ANALYZE&lt;/code&gt;, two problems arise:&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;pg_class&lt;/em&gt; table (and other system catalogs) becomes bloated. If &lt;em&gt;pg_class&lt;/em&gt; is 30 MB after a VACUUM FULL, it can grow to 1 GB within a couple of hours. Since the system catalogs are accessed very often, this growth increases CPU load. On top of that, autovacuum of &lt;em&gt;pg_class&lt;/em&gt; itself is frequently triggered for the same reason.&lt;/p&gt;
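&lt;p&gt;The bloat is easy to observe with standard catalog queries (a monitoring sketch, not part of the original measurements):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Current on-disk size of pg_class
SELECT pg_size_pretty(pg_table_size('pg_class'));
-- Dead tuples accumulated in pg_class since the last vacuum
SELECT n_dead_tup FROM pg_stat_sys_tables WHERE relname = 'pg_class';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;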

&lt;p&gt;New files are constantly created and old ones deleted, which requires file system access. This would be fine, but under heavy disk load with constant buffer churn it starts to slow down users who are performing actions that normally require no disk access at all, since all their data is already in shared buffers. Yet &lt;code&gt;TRUNCATE&lt;/code&gt; of temporary tables "stops" them while they wait for disk access (even though the tables' size does not exceed &lt;em&gt;temp_buffers&lt;/em&gt;, so there is no reason to touch the disk at all).&lt;/p&gt;

&lt;p&gt;As a result, working with temporary tables (specifically in the DDL part) uses a significant part of processor time:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frvzsauvuauqsapmwa9a8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frvzsauvuauqsapmwa9a8.png" alt="CPU load by PostgreSQL processes with selected DDL on temporary tables"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And direct access to the file system takes 13.5% of CPU:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjzl6iy8vfifa8eq9bcta.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjzl6iy8vfifa8eq9bcta.png" alt="CPU load by executing DDL of temporary tables"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In theory, all work with temporary files should happen inside the OS disk cache: at minimum without waiting for I/O, and ideally without touching the disk at all. In practice, however, the disk caches are also actively used by the main database, and even in asynchronous mode writes will still occur from time to time, creating additional IO operations. As a test, we moved all temporary tables to a separate disk and observed quite heavy writes with almost no reads:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frfz095tu0w2f8usxhxvp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frfz095tu0w2f8usxhxvp.png" alt="Disk write rate"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj96va3tse75wn3pui1wg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj96va3tse75wn3pui1wg.png" alt="Disk read rate"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The problem is worsened by the fact that all files that are created for temporary tables are actually in one folder. Their number can reach several hundred thousand at certain times. Unfortunately, not all file systems (e.g. xfs) handle such a large number of files in a single directory well. At the same time, files are not just lying there, but are constantly being created and deleted at a tremendous rate.&lt;/p&gt;

&lt;p&gt;Also, processes spend most of their time in &lt;em&gt;LWLockAttemptLock&lt;/em&gt;, which is called from &lt;em&gt;LockRelationOID&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzj14fum873rukng2ov1b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzj14fum873rukng2ov1b.png" alt="CPU time in LWLockAttemptLock"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In a normal situation, calls to &lt;em&gt;LWLockAttemptLock&lt;/em&gt; are quite fast and do not affect performance much. However, this is a potential bottleneck and can lead to very severe degradation of the whole database server. In particular, we have seen this happen due to virtualization issues, because &lt;em&gt;LWLockAttemptLock&lt;/em&gt; internally uses kernel functions for process synchronization. At one point the number of concurrent PostgreSQL processes in the active state reached 150, of which 100 were blocked on &lt;em&gt;LWLock / LockManager&lt;/em&gt; in &lt;code&gt;CREATE TEMPORARY TABLE&lt;/code&gt; and &lt;code&gt;TRUNCATE&lt;/code&gt; queries.&lt;/p&gt;
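
&lt;p&gt;A quick way to see whether backends are piling up on these locks is to look at the wait events in &lt;em&gt;pg_stat_activity&lt;/em&gt;. This is only a sketch of a diagnostic query; the &lt;em&gt;wait_event_type&lt;/em&gt; and &lt;em&gt;wait_event&lt;/em&gt; columns exist in all recent PostgreSQL versions:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- count active backends by what they are waiting on
SELECT wait_event_type, wait_event, count(*)
FROM pg_stat_activity
WHERE state = 'active'
GROUP BY wait_event_type, wait_event
ORDER BY count(*) DESC;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;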

&lt;h2&gt;
  
  
  Why use them
&lt;/h2&gt;

&lt;p&gt;We are often accused of using temporary tables too much, and told that we could do without them. Of course, we would love to get rid of them, but unfortunately there are at least three situations where it is hard to think of another way.&lt;/p&gt;

&lt;h3&gt;
  
  
  Intermediate calculations
&lt;/h3&gt;

&lt;p&gt;There are situations when you need to perform some complex calculations for a certain subset of objects. In such cases it is often most efficient to first write the object keys into a temporary table and then JOIN against it to calculate the specific values. You could, of course, embed the original subset filters in every query, but that can mean repeating the same calculations, which is less efficient.&lt;/p&gt;
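
&lt;p&gt;As a sketch (the table and column names here are made up for illustration), the pattern looks like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- materialize the keys of the object subset once
CREATE TEMPORARY TABLE doc_keys AS
SELECT id FROM document WHERE posted AND date &amp;gt;= '2024-01-01';
ANALYZE doc_keys;

-- reuse the keys in several heavy queries instead of repeating the filter
SELECT l.doc_id, SUM(l.quantity * l.price)
FROM doc_keys k
JOIN document_line l ON l.doc_id = k.id
GROUP BY l.doc_id;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;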

&lt;h3&gt;
  
  
  Incorrect statistics
&lt;/h3&gt;

&lt;p&gt;PostgreSQL has a relatively simple, yet fast, query planning algorithm. However, it has one major problem: it builds a plan and then sticks to it, even if the plan turns out to be wrong. In the worst case, PostgreSQL expects 1-2 records in an intermediate result when there are in fact thousands of times more. A Nested Loop over such a result blows up the complexity of the whole query, causing the process to hang with high CPU load. Unfortunately, PostgreSQL does not support hints like some other DBMSs do. As a workaround, when the &lt;a href="https://lsfusion.org" rel="noopener noreferrer"&gt;lsFusion&lt;/a&gt; platform sees a query hung past a timeout, it cancels the query and splits it into several queries using temporary tables. Since it is very hard to determine where exactly PostgreSQL made a mistake, the splitting algorithm is heuristic: it first writes some intermediate data (for example, nested subqueries) into temporary tables with separate queries, runs &lt;code&gt;ANALYZE&lt;/code&gt; on them, and then uses those tables in the final query. This way PostgreSQL has correct statistics for the intermediate computations, and the resulting plan is more accurate.&lt;/p&gt;
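
&lt;p&gt;Sketched with hypothetical tables, the split looks like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- step 1: materialize the part the planner tends to misestimate
CREATE TEMPORARY TABLE matched_orders AS
SELECT o.id
FROM orders o
JOIN order_line l ON l.order_id = o.id
WHERE l.sku = 'ABC' AND o.status = 'open';

-- step 2: collect real statistics for the intermediate result
ANALYZE matched_orders;

-- step 3: the final query is now planned with correct row estimates
SELECT c.name, count(*)
FROM matched_orders m
JOIN orders o ON o.id = m.id
JOIN customer c ON c.id = o.customer_id
GROUP BY c.name;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;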

&lt;h3&gt;
  
  
  Storing changes
&lt;/h3&gt;

&lt;p&gt;When a user makes changes in the system, they are not written to the database immediately, but only when the Save button is pressed. Only at that moment does the transaction start and the changes get published to the database, with constraint checking and recalculation of all dependent fields. Until then, all changes are stored in temporary tables. There are several reasons for this.&lt;/p&gt;

&lt;p&gt;First, it reduces the cost of data transfer between the application server and the DBMS. For example, when the user changes some data on a form, it is written to the temporary table, and at the moment of saving an &lt;code&gt;INSERT&lt;/code&gt; or &lt;code&gt;UPDATE&lt;/code&gt; is simply executed from it into the main table.&lt;/p&gt;

&lt;p&gt;Second, it simplifies the calculation logic. For example, if some parameter needs to be computed taking the pending changes into account, you can do &lt;code&gt;SELECT SUM(COALESCE(&amp;lt;main table&amp;gt;.field, &amp;lt;temporary table&amp;gt;.field)) ... FROM &amp;lt;main table&amp;gt; FULL JOIN &amp;lt;temporary table&amp;gt; ON ... GROUP BY ...&lt;/code&gt;&lt;/p&gt;
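
&lt;p&gt;Spelled out with hypothetical table and column names, such a query might look like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- combine stored values with pending changes in one pass
SELECT COALESCE(m.group_id, t.group_id), SUM(COALESCE(m.field, t.field))
FROM main_table m
FULL JOIN temp_changes t ON t.id = m.id
GROUP BY COALESCE(m.group_id, t.group_id);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;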

&lt;p&gt;If you keep the changes on the application server (or on the client in general), you have to transfer all the necessary data there and perform the corresponding calculations there as well. This means both greater memory consumption and duplication of calculation logic between the DBMS and the application server whenever the same values also have to be computed in SQL queries. Besides, all the tables on our forms are built as "dynamic lists": for optimization purposes, only the "visible window" of data is transferred to the client. Consequently, for example, the document lines needed to calculate some value may simply not exist on the client or the application server.&lt;/p&gt;

&lt;p&gt;Of course, this approach has its disadvantages. The main one is the significant complication of clustering and horizontal scaling. Since temporary tables in PostgreSQL are bound to connections, they have to be "assigned" to specific users. Accordingly, switching between servers is possible only with the transfer of temporary tables to a new server. However, in practice, vertical scaling is enough for now. Due to sufficiently optimal queries, our largest clients with several thousand concurrent users have enough resources of one database server with 48 cores (with HT - 96) and 512GB of memory.&lt;/p&gt;

&lt;p&gt;Since the same temporary tables are usually needed repeatedly during operation, we "cache" them instead of constantly creating and deleting them, purging them with the lighter &lt;code&gt;TRUNCATE&lt;/code&gt; command. This reduces the load on the database, because truncating a temporary table requires fewer resources, but it increases the number of files simultaneously present on disk (while a table is cached but not in use).&lt;/p&gt;

&lt;h2&gt;
  
  
  What to do
&lt;/h2&gt;

&lt;p&gt;In Linux, there is one approach that can significantly reduce the disk load from temporary tables: allocating a separate RAM disk for them. It requires no changes in the program code, and the procedure below can be performed on a live database without stopping the DBMS or the application.&lt;/p&gt;

&lt;p&gt;PostgreSQL has the &lt;em&gt;temp_tablespaces&lt;/em&gt; option, which defines the default tablespace in which all temporary tables will be created. If the value is empty, temporary tables are created next to the main database tables.&lt;/p&gt;

&lt;p&gt;To set this up, we first need to create a new tablespace specifically for temporary tables. Before that, we create the directory where the tablespace files will be stored; for example, let it be &lt;em&gt;/mnt/dbtemp&lt;/em&gt;. After creating the directory, set the access rights that PostgreSQL requires:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

mkdir /mnt/dbtemp
chmod 700 /mnt/dbtemp
chown postgres:postgres /mnt/dbtemp


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Next, go into &lt;em&gt;psql&lt;/em&gt; and execute the following command:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

CREATE TABLESPACE temp LOCATION '/mnt/dbtemp';


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;As a result, an empty directory with a name like &lt;em&gt;PG_13_202007201&lt;/em&gt; will be created inside &lt;em&gt;/mnt/dbtemp&lt;/em&gt;. As the tablespace is used, a directory for each database will appear in it, along with the &lt;em&gt;pgsql_tmp&lt;/em&gt; directory. The latter holds files created during the execution of SQL queries whose intermediate results did not fit into &lt;em&gt;work_mem&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Theoretically, you could mount the entire &lt;em&gt;PG_13_202007201&lt;/em&gt; directory in memory. But then the &lt;em&gt;work_mem&lt;/em&gt; parameter would lose its meaning: when it is exceeded, the spill would again go to memory rather than to disk. For this reason, we usually make a RAM disk only for the specific database, not for the whole directory.&lt;/p&gt;

&lt;p&gt;The easiest way to make the directory for the desired database appear is to go into &lt;em&gt;psql&lt;/em&gt; and run:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

CREATE TEMPORARY TABLE mytemptable (key0 int8) TABLESPACE temp;
DROP TABLE mytemptable;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After that, a directory such as &lt;em&gt;936082&lt;/em&gt;, named after the internal identifier of the database, will appear in the tablespace directory. This is the directory we will mount in memory. To do this, add the following line to &lt;em&gt;/etc/fstab&lt;/em&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

tmpfs /mnt/dbtemp/PG_13_202007201/936082 tmpfs rw,nodev,nosuid,noatime,nodiratime,size=1G 0 0


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Then mount the RAM disk explicitly using the command:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

mount /mnt/dbtemp/PG_13_202007201/936082


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It is important to choose the right size for the RAM disk. In the example above it is 1GB, but you can change the value as you see fit. Remember that the size of the RAM disk is only a limit: as long as there is no data on it, the memory is not actually used. On the other hand, too large a limit can lead to running out of memory or the operating system going into swap.&lt;/p&gt;

&lt;p&gt;There is one more thing to keep in mind when working with temporary tables in PostgreSQL. If the RAM disk runs out of space during an &lt;code&gt;INSERT INTO &amp;lt;temporary table&amp;gt;&lt;/code&gt;, the DBMS will throw an error, but the file will remain and the space will not be freed. After that, even &lt;code&gt;TRUNCATE &amp;lt;temporary table&amp;gt;&lt;/code&gt; will throw an error, because this command must first create a new file, which is impossible due to the lack of space. The only thing that can be done in this situation is &lt;code&gt;DROP TABLE &amp;lt;temporary table&amp;gt;&lt;/code&gt;. Incidentally, the lsFusion platform does all of this automatically.&lt;/p&gt;
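
&lt;p&gt;The recovery sequence in that situation therefore looks like this (sketched for a hypothetical table):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TRUNCATE mytemptable;   -- fails: a new file must be created, but there is no space
DROP TABLE mytemptable; -- succeeds and frees the space on the RAM disk
CREATE TEMPORARY TABLE mytemptable (key0 int8); -- recreate it if still needed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;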

&lt;p&gt;If everything was successful, the only thing left to do is to change the &lt;em&gt;temp_tablespaces&lt;/em&gt; option in &lt;em&gt;postgresql.conf&lt;/em&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

temp_tablespaces = 'temp'


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To apply the change, run in psql:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

SELECT pg_reload_conf();


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Right after that, all new temporary tables will be created in memory, and after some time all of them will have moved there.&lt;/p&gt;

&lt;p&gt;As a result, in the perf profile the time spent in ext4 function calls drops to 1.6%, while the tmpfs operations are barely visible:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0uezs4c3o8ipgyoac4ro.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0uezs4c3o8ipgyoac4ro.png" alt="CPU utilization after RAM disk is connected"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>database</category>
      <category>performance</category>
    </item>
    <item>
      <title>Basic PostgreSQL Configuration on Linux</title>
      <dc:creator>Aliaksei Kirkouski</dc:creator>
      <pubDate>Wed, 16 Aug 2023 07:25:20 +0000</pubDate>
      <link>https://dev.to/crushby/basic-postgresql-configuration-on-linux-10c6</link>
      <guid>https://dev.to/crushby/basic-postgresql-configuration-on-linux-10c6</guid>
      <description>&lt;p&gt;From time to time I have to hear some system administrators say that installation, configuration and support of PostgreSQL on Linux is very complicated. That it is much cheaper to buy Windows and Microsoft SQL Server licenses than to hire highly qualified administrators to administer all these open-source systems.&lt;/p&gt;

&lt;p&gt;In each of our business applications that use PostgreSQL as a DBMS, from 500 to 2500 users work concurrently. The applications implement almost all the main processes of retail chains. The size of the databases currently ranges from 2 to 4TB. And all of them work practically with standard PostgreSQL settings on single servers without any clustering. Even in the most heavily loaded servers there is still a significant reserve of resources for further increase in workload without the need for clustering.&lt;/p&gt;

&lt;p&gt;Yes, of course, much depends on the queries sent to the DBMS, and a few bad queries can bring down the whole server. But it is just as easy to bring down Oracle or MSSQL that way.&lt;/p&gt;

&lt;p&gt;In this article, I will fully describe all the PostgreSQL (and a little bit of OS) configurations that we do on our systems. In addition, we specifically try not to change those settings that don't give a visible performance change, so that we don't have to wonder why one environment has a problem and another doesn't.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;p&gt;I will not focus on the installation itself, as there are a million instructions on it on the Internet, but I will briefly describe it to show how easy it is.&lt;/p&gt;

&lt;p&gt;Right now we use Debian 11 as the operating system for most of our customers.&lt;/p&gt;

&lt;p&gt;The only recommendation we try to give to the client's system administrators when installing the operating system is to have a separate disk for the database (preferably even without LVM). This is convenient because then you can easily change the OS if necessary by simply connecting the disk with the database to another virtual machine.&lt;/p&gt;

&lt;p&gt;After the OS is installed and SSH access is obtained, the installation is done as described on the &lt;a href="https://www.postgresql.org/download/linux/debian/" rel="noopener noreferrer"&gt;official PostgreSQL site&lt;/a&gt;. Run the following commands in the console:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" &amp;gt; /etc/apt/sources.list.d/pgdg.list'
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt-get update
sudo apt-get -y install postgresql postgresql-contrib
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These commands add the apt repository to the list, install the PostgreSQL program files, and create the database cluster.&lt;/p&gt;

&lt;p&gt;By default, the database will be installed under the path &lt;code&gt;/var/lib/postgresql/15/main&lt;/code&gt;, and the configuration files will be in the &lt;code&gt;/etc/postgresql/15/main&lt;/code&gt; directory.&lt;/p&gt;

&lt;h2&gt;
  
  
  Configuration
&lt;/h2&gt;

&lt;p&gt;All basic PostgreSQL settings are located in two files: &lt;code&gt;postgresql.conf&lt;/code&gt; and &lt;code&gt;pg_hba.conf&lt;/code&gt;. The first stores the settings of the database itself, the second the access settings. You can change them by editing these files in any text editor.&lt;/p&gt;

&lt;p&gt;After any parameter change, you need to tell the DBMS to reread the configuration. Only a small portion of the parameters require restarting the PostgreSQL service (using &lt;code&gt;systemctl restart postgresql&lt;/code&gt;). Most parameters can be changed on the fly in several ways. I most often use psql for this. To do so, first execute in bash:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;su postgres
psql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And then inside psql run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT pg_reload_conf();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Main settings
&lt;/h2&gt;

&lt;p&gt;The main settings that can significantly affect performance, which should be done first, are the memory settings. By default, PostgreSQL is configured to run on any machine without changing any settings.&lt;/p&gt;

&lt;p&gt;The first thing you should probably do is increase the &lt;code&gt;max_connections&lt;/code&gt; parameter. By default it is 100, which may not be enough if you have many users. However, you should not set it too high either, as each connection carries additional costs. Since our platform creates a dedicated connection for each user, we usually set:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;max_connections = &amp;lt;number of users&amp;gt; * 2&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;PostgreSQL does not work with data on disk directly. When it needs to read or write something, it loads the corresponding pages from disk into a block of memory called shared buffers. This is shared memory used by all connections at the same time. The larger these buffers, the lower the disk load. For fine tuning you can analyze how exactly these buffers are rotated over time, but in practice we usually set them to 30-50% of all available memory on the server:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;shared_buffers = 128GB&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;In addition to this parameter, we usually configure the following ones right away:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;temp_buffers = 32MB&lt;/code&gt;&lt;br&gt;
&lt;code&gt;work_mem = 32MB&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;These parameters specify how much memory each connection may use for internal operations when running queries and working with temporary tables. Since they determine per-connection memory consumption, they are best adjusted empirically at runtime depending on available memory: if memory has been sufficient for the last week, you can increase both parameters, and vice versa.&lt;/p&gt;
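
&lt;p&gt;One useful signal when tuning &lt;code&gt;work_mem&lt;/code&gt; is how often queries spill to temporary files; the standard &lt;em&gt;pg_stat_database&lt;/em&gt; view tracks this per database:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- temp_files / temp_bytes count spills to disk since the last stats reset
SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_written
FROM pg_stat_database
ORDER BY temp_bytes DESC;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;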

&lt;p&gt;There is one technique used automatically by the &lt;a href="https://dev.to/crushby/lsfusion-open-source-platform-for-business-applications-1ecm"&gt;lsFusion&lt;/a&gt; platform to reduce memory consumption. Each connection to PostgreSQL is a separate process in the OS. As queries are executed, these processes are not always quick to give the used memory back to the OS. To deal with this, the platform closes active connections from time to time and reopens them. In this way, the process that consumes a lot of memory is terminated and a new one is created in its place. This significantly reduces the amount of private memory consumed by all user connections.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;maintenance_work_mem = 2GB&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This parameter is too small by default and it is better to increase it to speed up various system operations.&lt;/p&gt;
&lt;h2&gt;
  
  
  Additional settings
&lt;/h2&gt;

&lt;p&gt;The settings described above are already sufficient for PostgreSQL to work well. Next, I will describe the settings we make to improve performance further. None of them gives a significant gain on its own, but they can be useful in certain cases.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;wal_level = minimal&lt;/code&gt;&lt;br&gt;
&lt;code&gt;synchronous_commit = off&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;If we don't plan to configure asynchronous replication, we usually lower &lt;code&gt;wal_level&lt;/code&gt;. We also disable &lt;code&gt;synchronous_commit&lt;/code&gt;, since we are not writing banking systems: if the server goes down (which happens very rarely), it makes no difference whether the user got a successful save message or not, but all transactions run a bit faster.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;checkpoint_timeout = 20min&lt;/code&gt;&lt;br&gt;
&lt;code&gt;max_wal_size = 16GB&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Under heavy load, some of our clients' DBMSs manage to write 1GB of WAL per minute. With &lt;code&gt;max_wal_size&lt;/code&gt; set to 1GB, checkpoints would then occur once a minute, which is not good (especially with &lt;code&gt;full_page_writes&lt;/code&gt; enabled). That is why we usually increase the value so that checkpoints occur about once every 20 minutes, which somewhat reduces the disk load. Yes, recovery after a crash will take longer, but crashes happen very rarely.&lt;/p&gt;
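
&lt;p&gt;Whether checkpoints are being triggered by the timeout or by WAL volume can be checked in &lt;em&gt;pg_stat_bgwriter&lt;/em&gt; (in PostgreSQL 16 and earlier; in version 17 these counters moved to &lt;em&gt;pg_stat_checkpointer&lt;/em&gt;):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- many requested (non-timed) checkpoints suggest max_wal_size is too small
SELECT checkpoints_timed, checkpoints_req
FROM pg_stat_bgwriter;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;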

&lt;p&gt;&lt;code&gt;seq_page_cost = 0.1&lt;/code&gt;&lt;br&gt;
&lt;code&gt;random_page_cost = 0.1&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;cpu_tuple_cost = 0.05&lt;/code&gt;&lt;br&gt;
&lt;code&gt;cpu_index_tuple_cost = 0.05&lt;/code&gt;&lt;br&gt;
&lt;code&gt;cpu_operator_cost = 0.01&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Usually we significantly lower (compared to the default) disk costs and in turn increase the cost of CPU operations. This is done because PostgreSQL was originally configured for slow HDD disks. We always use SSD disks in RAID-arrays, where the cost of reading is much lower, and random write/read is not much different from sequential.&lt;/p&gt;

&lt;p&gt;One caveat: we try to use identical cost settings everywhere so that query execution plans are the same. Otherwise everything may work fast in a test environment, while in production a different, much slower plan is chosen.&lt;/p&gt;

&lt;p&gt;It should be noted here that changes in PostgreSQL parameters do not always lead to the expected result in query plans. We had a situation when a simple increase in the &lt;code&gt;work_mem&lt;/code&gt; parameter resulted in a query running for 2 hours instead of 20 minutes. The execution plan started using hash join with a preliminary seq scan of the entire table, which had to be read from disk. Here lies one of the main problems of query planning in PostgreSQL. Plans do not take into account what data is in shared buffers and what is not. And often it is much more profitable to make a run on the data that are in the cache (even though there are much more of them) than to read a smaller volume from disk.&lt;/p&gt;
&lt;h2&gt;
  
  
  External access
&lt;/h2&gt;

&lt;p&gt;If the application server is not located on the same machine as PostgreSQL, connections from the other server must be allowed. By default, PostgreSQL accepts only local connections for security reasons. To accept connections from outside, set the following parameter in &lt;code&gt;postgresql.conf&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;listen_addresses = '*'&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;After that you will need to restart the PostgreSQL service. &lt;/p&gt;

&lt;p&gt;Next, add to &lt;code&gt;pg_hba.conf&lt;/code&gt; the IP addresses from which connections should be accepted (namely, the address of the application server):&lt;/p&gt;

&lt;p&gt;&lt;code&gt;host all all 192.168.1.22/32 trust&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Instead of &lt;code&gt;trust&lt;/code&gt;, use &lt;code&gt;scram-sha-256&lt;/code&gt; if password authentication is required.&lt;/p&gt;
&lt;h2&gt;
  
  
  Additional Linux settings
&lt;/h2&gt;

&lt;p&gt;In addition to the PostgreSQL settings described earlier, on memory intensive servers we often change a few other settings in Debian itself.&lt;/p&gt;

&lt;p&gt;First, the following parameters are set in &lt;code&gt;/etc/sysctl.conf&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;vm.min_free_kbytes = 4194304&lt;/code&gt;&lt;br&gt;
&lt;code&gt;vm.swappiness = 1&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The first parameter sets the minimum amount of free memory that the OS will try to keep. This helps avoid memory fragmentation and high system time in certain cases. Swappiness should be set to 1, since swapping would be very harmful, while 0 is generally not recommended (although I noticed no difference in behavior between 0 and 1).&lt;/p&gt;

&lt;p&gt;Next, when mounting the disk with the database, specify the &lt;code&gt;noatime,nodiratime&lt;/code&gt; options in &lt;code&gt;/etc/fstab&lt;/code&gt;. It's a small thing, but it can't hurt. For example:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;/dev/sdb /data xfs defaults,noatime,nodiratime 0 0&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;On servers with a lot of memory we also usually configure huge pages. To do this, first disable THP, then add a fixed number of pages corresponding to the size of the shared buffers. In the &lt;code&gt;/etc/sysctl.conf&lt;/code&gt; file add:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;vm.nr_hugepages = 67502 # (&amp;lt;shared_buffers&amp;gt; / 2MB) + 3% - for 128GB shared buffers&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Finally, since we use high-performance SSD disks, we usually turn off the I/O scheduler by enabling noop or none mode. There are many ways to do this, but usually we just configure a service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Unit]
Description=Change scheduler

[Service]
Type=simple
ExecStart=/bin/sh -c "echo 'noop' &amp;gt; /sys/block/sdb/queue/scheduler" 

[Install]
WantedBy=multi-user.target
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Server Resources
&lt;/h2&gt;

&lt;p&gt;In conclusion, I would like to say a few words about the hardware used for PostgreSQL. Although virtualization is usually used, the PostgreSQL virtual machine is the only one on the entire physical server.&lt;/p&gt;

&lt;p&gt;For example, one of our customers uses a server with two Intel Gold processors with 24 cores each (giving 96 virtual cores) and 256GB of memory. The server is directly connected via PCI express to 4 NVME disks of 3TB each, which are built into a software RAID-10 (via LVM) of about 5.8TB. Now the database there occupies about 3TB, with which work about 1000 concurrent users.&lt;/p&gt;

&lt;p&gt;This configuration yields very high speeds for disk operations along with a large number of CPUs. In particular, the CPU utilization graph on this server looks as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4r4vwqrbm8f8jbhm5zhu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4r4vwqrbm8f8jbhm5zhu.png" alt="Zabbix"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At peak times, the read speed on such a server reaches 1.5GByte/second without a significant increase in wait time:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdkepnal3pvs8fkfi26nn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdkepnal3pvs8fkfi26nn.png" alt="dstat"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Such a server has enough performance headroom to handle a 2-3x increase in the number of users before clustering becomes necessary.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>database</category>
    </item>
    <item>
      <title>Conservatism of Software Users</title>
      <dc:creator>Aliaksei Kirkouski</dc:creator>
      <pubDate>Tue, 02 May 2023 07:30:04 +0000</pubDate>
      <link>https://dev.to/crushby/conservatism-of-software-users-2le3</link>
      <guid>https://dev.to/crushby/conservatism-of-software-users-2le3</guid>
      <description>&lt;p&gt;Over the past few years, we’ve migrated our customers to new versions of the platform several times. At the same time, we were constantly making changes to the graphical interface. Naturally, we did everything we could to improve the usability of the program. But each time, we encountered user complaints, which demanded that everything be reverted.&lt;/p&gt;

&lt;p&gt;At first, we took it personally, thinking that perhaps we were doing something wrong. But after a while it became clear that the problem was not about specific changes, but that many users simply never want to change anything. However, progress is not possible without change. Therefore we began to accept it as a natural process and somehow learned to live with this and put up with the dissatisfaction of users.&lt;/p&gt;

&lt;h2&gt;
  
  
  Don’t change anything
&lt;/h2&gt;

&lt;p&gt;The conservatism of software users is not unique to software. In real life, everything happens in exactly the same way. It is said that if any survey about any change included an option “Leave everything as it is”, a large percentage of respondents would choose it. And it is quite understandable why: people get used to the current situation, they know all its problems, and any change leads into the unknown, forcing them to think and decide how to act in the new conditions.&lt;/p&gt;

&lt;p&gt;Once upon a time, when DOS was still prevalent, we had a software product with a text (console) interface. Over several years, users got so used to working with it that their actions became fully automatic and they performed certain processes at incredible speed, using only the keyboard. At some point, new Windows systems with a graphical interface began to appear. When these same users started to use them, I remember immediately getting a lot of complaints that working with a mouse was much slower than with a keyboard, and that the speed of work, and hence efficiency, had decreased many times over. Businesses immediately began to complain that they were losing money and demanded a return to the old software product. However, in the end, no one ever went back. To be honest, until we had our own product with a GUI, we actively used this conservatism in our sales against our competitors.&lt;/p&gt;

&lt;p&gt;Sometimes this conservatism takes some very strange forms. I remember when we were transferring one client from an outdated program to our new system, I personally encountered a user who needed to get the movement of goods through the warehouse. We showed her how it was done in our system, to which she replied that the old system did it much better. We asked her to show us exactly how, to which she began clicking with frantic speed, navigating between different forms, and eventually came up with the right one. Unfortunately, the concept of “better” is quite subjective, so we offered to count the number of clicks. We had two, and she had about 10. But she still insisted, and claimed that our system was very inconvenient.&lt;/p&gt;

&lt;p&gt;It is worth noting that many programmers are also very conservative. You can often hear them ask questions along the lines of “Why do we need another programming language?” or “Why do we need this technology? I can do everything better by hand.” I remember when SQL servers were just gaining popularity, there were skeptics who claimed that their plans were inefficient, and that it would be much more efficient to hand-write the reading of data from tables and indexes stored as files. So you cannot argue that a person’s conservatism is due to less developed analytical thinking.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pros and cons
&lt;/h2&gt;

&lt;p&gt;Yes, it must be acknowledged that in some cases usability degrades in certain places. In an ideal user’s world, all changes would come in the style of “If I had asked people what they wanted, they would have said faster horses.” That is, people want everything to stay the same, with only certain parameters improved. Unfortunately, it doesn’t work that way. Each solution has its advantages and disadvantages. Comparing them is a subjective process, and sometimes you have to sacrifice one thing to improve another. Accordingly, when the program is updated, the user is immediately confronted with what has become worse, but does not immediately see the benefits. This is because they do not yet know the new functionality, or simply take it for granted, not yet understanding what it will give them. For this reason, at the first stage the negatives outweigh the positives, and the person immediately wants to go back to the way things were.&lt;/p&gt;

&lt;p&gt;Since, as mentioned above, the comparison of advantages and disadvantages depends on specific users, the group that weighs the disadvantages more heavily immediately begins to protest. Those who see the advantages first and foremost sit quietly and are not heard. Our clients sometimes have several thousand active users. Accordingly, when even a small percentage (which can be dozens of people) start to complain actively, the IT department or management, who are not themselves actively working with the program, get the impression that the update has really made things worse, although, in fact, the majority may be happy about the change.&lt;/p&gt;

&lt;p&gt;Once, there was a case where after one of these updates, some users began to actively complain about the changes in the mechanism of one basic action. And they complained not directly to us but to their own IT department. Accordingly, the client’s IT department complained to us. Within a month, we made adjustments so that they could go back to the old way of doing things. They turned it on, and everything was back to the way it was. Immediately, another part of the users started to complain actively that they should go back to the new mechanism — it was much more convenient than the old one. After that they started asking us to make it so that the mechanism could be switched for each user separately, which in the end we had to do.&lt;/p&gt;

&lt;h2&gt;
  
  
  It used to be better
&lt;/h2&gt;

&lt;p&gt;Unfortunately, users sometimes lack rational arguments and are guided primarily by emotions. When we switched from version 4 to version 5 of the &lt;a href="https://lsfusion.org"&gt;lsFusion&lt;/a&gt; platform, the way objects are selected from the catalog changed. In fact, it became the same as in most other programs. In the fourth version of the platform, a modal form was always opened for selection:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--GlHLtCAw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/hkehfwe1b5ony5rfm1cw.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GlHLtCAw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/hkehfwe1b5ony5rfm1cw.gif" alt="Version 4" width="800" height="404"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the fifth version of the platform, the user types text, and the system searches for matching entries by substring occurrence, taking word boundaries into account. At the same time, the user can still force a dialog if necessary:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Yn3PIFHF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/p2obdirlnzm1bt8r1wjz.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Yn3PIFHF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/p2obdirlnzm1bt8r1wjz.gif" alt="Version 5" width="800" height="404"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From our point of view, it was obvious that the mechanism had improved in the new version of the platform and would become more convenient for users. However, we encountered some resistance. Their head of IT asked us to return everything to the way it was, because users were complaining en masse that they had difficulty finding some objects in the directories. And it was not for all objects, but only for those with badly entered names, where it was hard to determine which object to select. Then came the most interesting part. We enabled the option that restored the old mechanism. We were told that it was inconvenient this way too, and that it had been better before. We replied that this was exactly how it used to be, and that if they thought otherwise, they should tell us what it had been like before. In the end, we were told that the users didn’t remember what it was like before, but it was definitely better. And only a couple of months had passed.&lt;/p&gt;

&lt;p&gt;In much the same way, some people say that things used to be just fine, but they can’t say what exactly. There’s nothing surprising about that. That’s just how human memory works: it forgets the bad and remembers the good. Therefore, over time, it seems that everything used to be much better.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to do
&lt;/h2&gt;

&lt;p&gt;Over the years we have upgraded various systems to new versions hundreds of times, and each time we’ve encountered the problems described above. As a result, it has all become routine for us. For our clients’ users, however, such significant changes are very rare, and each one is a major stress.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;For you, the day Bison graced your village was the most important day of your life. But for me, it was Tuesday.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Nevertheless, there is one trick to reduce user complaints a bit. It works in real life too, by the way. After you’ve made a change, when the wave of complaints has reached its maximum, you put something back the way it was. Choose something uncomplicated, and be sure to make it optional, so that new users don’t know it was ever different. In the end, the degree of protest goes down, and over time users get used to working in the new way, accepting it as the new normal.&lt;/p&gt;

</description>
      <category>programming</category>
    </item>
    <item>
      <title>Multiple Dispatch in Modern Languages</title>
      <dc:creator>Aliaksei Kirkouski</dc:creator>
      <pubDate>Thu, 16 Mar 2023 14:55:48 +0000</pubDate>
      <link>https://dev.to/crushby/multiple-dispatch-in-modern-languages-2enh</link>
      <guid>https://dev.to/crushby/multiple-dispatch-in-modern-languages-2enh</guid>
      <description>&lt;p&gt;Rock, paper, scissors is a hand game that two or more people can play. Each player chooses one of three shapes: rock, paper, or scissors. The rules are: rock beats scissors, scissors beats paper, and paper beats rock.&lt;/p&gt;

&lt;p&gt;We need to implement this logic using object-oriented programming without using if statements.&lt;/p&gt;

&lt;p&gt;In a fictional language, this could look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;abstract class Shape;
abstract boolean beats (Shape a, Shape b);

class Rock : Shape;
class Scissors : Shape;
class Paper : Shape;

// "+{ }" means implementation of the abstract function
beats(Shape a, Shape b) +{ return false; }

beats (Rock a, Scissors b) +{ return true; }
beats (Scissors a, Paper b) +{ return true; }
beats (Paper a, Rock b) +{ return true; }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A call to the "beats" function for any two Shapes will return the result of the game.&lt;/p&gt;

&lt;p&gt;The advantage of this approach is that you can easily extend the logic with new elements. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class Lizard : Shape;
class Spock : Shape;

beats (Lizard a, Paper b) +{ return true; }
beats (Lizard a, Spock b) +{ return true; }
beats (Spock a, Rock b) +{ return true; }
beats (Spock a, Scissors b) +{ return true; }

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We implemented similar logic in our DSL, but our approach does not fit into general-purpose languages. &lt;/p&gt;

&lt;p&gt;What is the most elegant way to implement the described problem in modern programming languages?&lt;/p&gt;
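&lt;p&gt;As one illustration (my own sketch, not the approach used in our DSL), double dispatch can be emulated in Python with a registry keyed by the runtime types of both arguments, mirroring the abstract function with a default implementation from the fictional example above:&lt;/p&gt;

```python
# A minimal sketch of double dispatch via a (type, type) registry.
class Shape: pass
class Rock(Shape): pass
class Scissors(Shape): pass
class Paper(Shape): pass
class Lizard(Shape): pass
class Spock(Shape): pass

# Winning pairs: (attacker, defender). Any pair not listed loses,
# which plays the role of the default "return false" implementation.
_rules = {
    (Rock, Scissors), (Scissors, Paper), (Paper, Rock),
    (Lizard, Paper), (Lizard, Spock), (Spock, Rock), (Spock, Scissors),
}

def beats(a, b):
    # Dispatch on the runtime types of both arguments.
    return (type(a), type(b)) in _rules

print(beats(Rock(), Scissors()))  # True
```

&lt;p&gt;Languages such as Julia support multiple dispatch natively, so the abstract-function example above maps to them almost directly; in single-dispatch languages, a registry like this is one common workaround.&lt;/p&gt;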

</description>
      <category>programming</category>
      <category>oop</category>
      <category>discuss</category>
    </item>
    <item>
      <title>Visual Programming vs DSL</title>
      <dc:creator>Aliaksei Kirkouski</dc:creator>
      <pubDate>Mon, 13 Feb 2023 11:49:29 +0000</pubDate>
      <link>https://dev.to/crushby/visual-programming-vs-dsl-4n84</link>
      <guid>https://dev.to/crushby/visual-programming-vs-dsl-4n84</guid>
      <description>&lt;p&gt;Recently, low-code and no-code platforms have been gaining popularity. They propose to use visual programming to develop applications. In this approach, developers, who are ordinary business users, instead of writing program code, create an application using the mouse in a graphical interface.&lt;/p&gt;

&lt;p&gt;But what are the advantages of visual programming compared to a DSL (Domain Specific Language)? Of course, it depends on the application area. On the one hand, visual programming is almost never used with classical languages. At the same time, this approach certainly has many advantages when developing a graphical interface. However, even when creating interfaces with the popular React library, we mostly use plain code.&lt;/p&gt;

&lt;p&gt;I see the following advantages in using DSL:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Version control system support. In visual programming it is much less convenient to monitor changes in the logic of the program and resolve conflicts in commits.&lt;/li&gt;
&lt;li&gt;Ability to copy/paste, search and replace by text values, code generation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The advantages of visual programming include the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lower entry threshold. Most users know how to click on buttons, enter values in fields and drag and drop objects on forms.&lt;/li&gt;
&lt;li&gt;Better visualization when creating the user interface.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It should be mentioned that DSL and visual programming are not alternatives to each other. With the right approach, a visual constructor can and should generate DSL. XML is often used as such a language. However, the readability of XML leaves much to be desired.&lt;/p&gt;

&lt;p&gt;One of the main drawbacks of visual programming is the very limited area of application. Yes, for creating simple CRUD applications it works rather well. But when some more complex logic is required, you have to return to program code anyway. This creates a certain semantic gap, when some of the logic is created visually and some is created with classic code.&lt;/p&gt;

&lt;p&gt;Using a DSL, in turn, allows you to create rather complicated applications with much less effort. Writing code in a specialized language can seem a difficult task for users. However, we should not forget that the SQL language was created precisely for regular users, and it is still used not only by programmers. And if a DSL is as high-level as SQL, using normal words instead of special characters, the entry threshold drops considerably (here is an example of one such language).&lt;/p&gt;

&lt;p&gt;Of course, for managing "dynamic" systems, a graphical interface is much better than a textual one. For example, most users do use the graphical interface of operating systems rather than the command console (although in my opinion, the console is more convenient in some places for managing servers). However, programs are more static systems. Typically, changes to a program's operation are made on a scheduled basis, rather rarely, and may entail other significant changes (e.g., data migration). In this case, the importance of reviewing the changes made, which is much easier to do with plain program code, increases considerably.&lt;/p&gt;

&lt;p&gt;One of the distinctive features of visual programming is that the developer immediately sees all the features of the development platform. Because of this, they rarely have to refer to the documentation. On the other hand, it creates visual complexity, thereby "blurring" attention. It is harder for the developer to understand which settings they have changed and which are set by default. In flat files, all changes can easily be seen with the version control system.&lt;/p&gt;

&lt;p&gt;It is likely that in the near future, DSLs will increase in popularity in relation to visual programming, as AI technology develops. We've been trying to teach ChatGPT our brand new language, and the results have been very interesting and promising.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>database</category>
    </item>
    <item>
      <title>lsFusion : Open-Source Platform for Business Applications</title>
      <dc:creator>Aliaksei Kirkouski</dc:creator>
      <pubDate>Tue, 31 Jan 2023 07:19:00 +0000</pubDate>
      <link>https://dev.to/crushby/lsfusion-open-source-platform-for-business-applications-1ecm</link>
      <guid>https://dev.to/crushby/lsfusion-open-source-platform-for-business-applications-1ecm</guid>
      <description>&lt;p&gt;&lt;a href="https://lsfusion.org"&gt;lsFusion&lt;/a&gt; platform is designed for rapid development of business applications. It is distributed under the terms of a Lesser General Public License (LGPLv3). The source code of the platform is available on &lt;a href="https://github.com/lsfusion/platform"&gt;Github&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;lsFusion is best suited for creating complex systems with large numbers of entities and forms, where users need to input and process large amounts of data. However, the platform can also be used to quickly create simple applications instead of spreadsheets when Excel’s functionality is not enough.&lt;/p&gt;

&lt;p&gt;At the same time the use of the platform will not give a great advantage when developing applications aimed at interaction with a large number of “external” users or without the need for any complex calculations. You should also take into account that the web interface is a single page application using JavaScript. Therefore, the lsFusion platform is not well-suited for creating websites, for example.&lt;/p&gt;

&lt;h2&gt;
  
  
  Alternatives
&lt;/h2&gt;

&lt;p&gt;Currently, there are several platforms on the market to solve similar problems. Of those that have been in use for a long time, Access and Visual Foxpro can be singled out. More modern analogues are low-code platforms like Power Apps. However, lsFusion takes a fundamentally different approach to application development. In particular, it uses almost no visual programming, and all logic is set using a high-level language.&lt;/p&gt;

&lt;h2&gt;
  
  
  Paradigm
&lt;/h2&gt;

&lt;p&gt;Most modern platforms use standard relational logic, in which the application structure is defined by entities and the relationships between them. In lsFusion, &lt;a href="https://docs.lsfusion.org/Classes/"&gt;classes&lt;/a&gt; that can inherit from each other are used instead of entities. Fields and relations are defined by &lt;a href="https://docs.lsfusion.org/Properties/"&gt;properties&lt;/a&gt;. Properties are essentially functions that accept class objects as parameters, returning either primitive type values (fields) or other objects (relations). A field in a table with multiple keys from relational logic will be represented as a property with the corresponding objects as parameters.&lt;/p&gt;

&lt;p&gt;All properties fall into two categories:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.lsfusion.org/Data_properties_DATA/"&gt;Primary&lt;/a&gt;. Entered directly by the user, and do not directly depend on any other properties.&lt;br&gt;
Computed. Computed using &lt;a href="https://docs.lsfusion.org/Property_operators_paradigm/"&gt;operators&lt;/a&gt; based on other properties.&lt;br&gt;
A similar scheme is used in spreadsheets. Primary properties correspond to cells that store values. Cells that contain formulas are similar to computed properties.&lt;/p&gt;
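&lt;p&gt;To make the spreadsheet analogy concrete, here is a rough Python sketch (my own illustration with hypothetical names, not lsFusion code) of primary properties as stored values and a computed property as a derived function:&lt;/p&gt;

```python
# Primary "properties": values stored directly, keyed by object.
price = {}     # price(book)
quantity = {}  # quantity(book)

# Computed "property": derived from other properties,
# like a spreadsheet cell containing a formula.
def total(book):
    return price[book] * quantity[book]

price["book1"] = 5.0
quantity["book1"] = 3
print(total("book1"))  # 15.0
```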

&lt;p&gt;The user interface is created using &lt;a href="https://docs.lsfusion.org/Forms/"&gt;forms&lt;/a&gt; to which objects and properties are added, and their relationships specified by expressions.&lt;/p&gt;

&lt;p&gt;With the help of the &lt;a href="https://docs.lsfusion.org/Events"&gt;event&lt;/a&gt; mechanism, it is possible to perform certain &lt;a href="https://docs.lsfusion.org/Actions/"&gt;actions&lt;/a&gt; upon the occurrence of given conditions.&lt;/p&gt;

&lt;p&gt;Thus, when using lsFusion platform, most of the logic is not set in a sequence of commands, but as a specification of the solution to the problem. That is, the expected result is described, but not how to get it. This approach is called declarative, and has several advantages over the classic imperative.&lt;/p&gt;

&lt;h2&gt;
  
  
  Language
&lt;/h2&gt;

&lt;p&gt;Since the platform paradigm differs significantly from existing approaches, a specialized language was created to specify the logic. SQL-like syntax with keywords instead of symbols was chosen to lower the entry threshold into development.&lt;/p&gt;

&lt;p&gt;The application logic is specified in the form of flat text files called &lt;a href="https://docs.lsfusion.org/Modules/"&gt;modules&lt;/a&gt;. These files contain the source code in an internal language. Each module can simultaneously contain declarations of classes and properties as well as forms, events, constraints, designs, and so on. Dependencies can be specified between modules. The source code can only use elements from the modules it depends on (including recursively).&lt;/p&gt;

&lt;p&gt;Here are a few examples:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qIZx-1_m--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vd7lrzu2x3bn4mhbbx6c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qIZx-1_m--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vd7lrzu2x3bn4mhbbx6c.png" alt="Classes and Primary Properties" width="664" height="340"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--C85BxB_L--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/fxthdnn2odkyepox4xob.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--C85BxB_L--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/fxthdnn2odkyepox4xob.png" alt="Computed Properties" width="716" height="108"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Z1y4-1zm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lx5c9m0wn1l3c16xdrfq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Z1y4-1zm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lx5c9m0wn1l3c16xdrfq.png" alt="Actions and Events" width="527" height="298"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--9Pbb1u4M--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/49rmqi0b2jz55rw10790.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9Pbb1u4M--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/49rmqi0b2jz55rw10790.png" alt="Forms" width="454" height="278"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;More different examples of language syntax and use cases can be found &lt;a href="https://docs.lsfusion.org/Examples/"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;lsFusion platform consists of two Java applications. A Java virtual machine must be installed in the operating system to run them.&lt;/p&gt;

&lt;p&gt;The first is an application server that reads the program logic created by the developer from specific files. These files have the extension lsf, and contain the source code written in the internal language of the platform. They describe both the domain logic and the user interface. The project structure is as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--QY8LY7ya--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ud91fex93u36g6kkqt47.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--QY8LY7ya--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ud91fex93u36g6kkqt47.png" alt="MyCompany Project Structure" width="786" height="586"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The second application is Apache Tomcat, with a web server deployed on it that runs the web application. It communicates with the application server through RMI requests using an internal protocol.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to develop
&lt;/h2&gt;

&lt;p&gt;Applications in lsFusion are developed using a special &lt;a href="https://plugins.jetbrains.com/plugin/7601-lsfusion"&gt;plugin&lt;/a&gt; for IntelliJ IDEA Community Edition. The plugin highlights syntax and errors, supports navigation and refactoring of source code.&lt;/p&gt;

&lt;p&gt;The structure of an lsFusion project is similar to that of any Java application. The source code in the internal language acts as regular resource files that do not need to be compiled.&lt;/p&gt;

&lt;p&gt;Because the platform is based on Java and the source code has a flat text structure, you can use Git and Maven, and connect external Java classes and libraries to solve highly specialized problems.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;The relational DBMS PostgreSQL is used as a data storage system.&lt;/p&gt;

&lt;p&gt;When the application server starts, the platform reads the system logic from the required modules and synchronizes the database structure with it. The user interface is automatically generated based on the specified forms.&lt;/p&gt;

&lt;p&gt;Based on the lsFusion code, the platform automatically generates all SQL queries to read and write data without using any ORM patterns. Thus, almost all calculations are done in PostgreSQL instead of on the application server.&lt;/p&gt;

&lt;p&gt;Changes made within a &lt;a href="https://docs.lsfusion.org/Change_sessions/"&gt;session&lt;/a&gt; (e.g. when a user edits data on a form) are written to temporary tables. At the moment of saving session changes (e.g., when Save button is pressed) a transaction is started, events are executed, and changes are written to the database.&lt;/p&gt;

&lt;p&gt;A distinctive feature of the lsFusion platform is that you can set the &lt;a href="https://docs.lsfusion.org/Materializations/"&gt;MATERIALIZED&lt;/a&gt; flag for any computed property. A separate field is created for such a property in the table, and it is automatically recalculated when the data it depends on changes. When the values of dependent properties are calculated, the already calculated value from this field is used. Thus you can easily balance the database load between writing and reading without changing the logic of the system. This feature is similar to a &lt;a href="https://docs.oracle.com/database/121/DWHSG/refresh.htm#DWHSG8361"&gt;Materialized View with Fast Refresh&lt;/a&gt; in Oracle, but without its restrictions.&lt;/p&gt;
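&lt;p&gt;Conceptually (a simplified sketch of my own, not how the platform is actually implemented), a materialized property behaves like a stored value that is eagerly refreshed whenever one of its dependencies changes, so reads are cheap at the cost of extra work on writes:&lt;/p&gt;

```python
class Cell:
    """A primary value; notifies dependents when it changes."""
    def __init__(self, value):
        self._value = value
        self._dependents = []

    def get(self):
        return self._value

    def set(self, value):
        self._value = value
        for dep in self._dependents:
            dep.refresh()

class Materialized:
    """A computed value stored eagerly and refreshed on every dependency change."""
    def __init__(self, fn, *deps):
        self._fn = fn
        self._deps = deps
        for dep in deps:
            dep._dependents.append(self)
        self.refresh()

    def refresh(self):
        # Recalculate and store, so subsequent reads are a simple field access.
        self.value = self._fn(*(dep.get() for dep in self._deps))

price = Cell(5.0)
qty = Cell(3)
total = Materialized(lambda p, q: p * q, price, qty)
print(total.value)  # 15.0
qty.set(4)
print(total.value)  # 20.0
```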

&lt;h2&gt;
  
  
  Who can develop
&lt;/h2&gt;

&lt;p&gt;The complexity of development in lsFusion is slightly higher than in SQL, but much lower than in classical general-purpose programming languages. The developer does not need to manually manage memory, locks, transactions, or the interaction between client and server or between the application server and the database. All of this is done automatically by the platform, based on logic defined in a single, uniform language.&lt;/p&gt;

&lt;p&gt;This high-level approach greatly reduces the entry threshold into application development. As a result, applications can be created by people without any programming experience. One week is enough for a basic study of the lsFusion platform, and in a month a developer can already create rather complicated applications. This is achieved due to the fact that the platform operates with a small number of abstractions, on the basis of which you can set almost any business logic. A developer just needs to learn how to define classes (entities), build properties using a few basic operators, and assemble forms from them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Features
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Object Oriented Programming
&lt;/h3&gt;

&lt;p&gt;The lsFusion platform supports multiple class inheritance and polymorphism over several parameters. For example:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--374yO-io--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lxffo2owaqforhzyihme.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--374yO-io--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lxffo2owaqforhzyihme.png" alt="Polymorphism over several parameters" width="416" height="392"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The declaration of an abstract property and its implementation can reside in different modules. This makes it possible to develop a system with a Low Coupling / High Cohesion architecture.&lt;/p&gt;

&lt;h3&gt;
  
  
  Print forms
&lt;/h3&gt;

&lt;p&gt;Print forms, text documents and spreadsheets are &lt;a href="https://docs.lsfusion.org/Print_view/"&gt;generated&lt;/a&gt; using the JasperReports library. The platform can automatically generate complex structures of reports and subreports, so the developer only needs to adjust the arrangement of elements visually using JasperSoft Studio.&lt;/p&gt;

&lt;h3&gt;
  
  
  API
&lt;/h3&gt;

&lt;p&gt;The platform supports sending and receiving requests via &lt;a href="https://docs.lsfusion.org/How-to_Interaction_via_HTTP_protocol/"&gt;HTTP protocol&lt;/a&gt;. Combined with the built-in mechanism for &lt;a href="https://docs.lsfusion.org/How-to_Working_with_external_formats/"&gt;exporting and importing&lt;/a&gt; JSON/XML files it’s easy to interact with any external systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Extension
&lt;/h3&gt;

&lt;p&gt;If a task cannot be solved by the standard functionality of the platform, you can use the underlying programming languages. In particular, you can create properties whose calculations are implemented in the &lt;a href="https://docs.lsfusion.org/FORMULA_operator/"&gt;PL/pgSQL&lt;/a&gt; language and are then built into SQL queries. &lt;a href="https://docs.lsfusion.org/How-to_INTERNAL/"&gt;Using Java&lt;/a&gt;, you can create highly specialized actions. With JavaScript, you can create custom visual elements for displaying &lt;a href="https://docs.lsfusion.org/How-to_Custom_components_properties/"&gt;properties&lt;/a&gt; and &lt;a href="https://docs.lsfusion.org/How-to_Custom_components_objects/"&gt;objects&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Interface
&lt;/h2&gt;

&lt;p&gt;In any table in a form the user can sort and select rows, hide columns and change their order. Filtered data from any table can be easily exported into spreadsheets with a single button. Navigation through the table rows is implemented as “infinite scrolling”.&lt;/p&gt;

&lt;p&gt;In addition to changing the value of any cell, the user can change values in all records using the group editing mechanism. Pasting into any table from the clipboard is also supported.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--pTQigile--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bdtp2us4dd6t2mzmcfbb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--pTQigile--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bdtp2us4dd6t2mzmcfbb.png" alt="Interface example" width="880" height="426"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Pivoting mode is automatically available in each table of the form, allowing users to group data and build diagrams in any slice on their own.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--uTS1tffx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/51naaz1p5oatle870ehk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--uTS1tffx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/51naaz1p5oatle870ehk.png" alt="Pivoting example" width="786" height="338"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  First steps
&lt;/h2&gt;

&lt;p&gt;First you need to install lsFusion on your local computer. Under Windows, there is an &lt;a href="https://docs.lsfusion.org/Development_auto/"&gt;installer&lt;/a&gt; that will install PostgreSQL, Java, IntelliJ IDEA, and the platform itself on the computer. Then you need to create an empty &lt;a href="https://docs.lsfusion.org/IDE/"&gt;lsFusion project&lt;/a&gt; in IDEA.&lt;/p&gt;

&lt;p&gt;The best way to start studying lsFusion is to use the examples described in the &lt;a href="https://docs.lsfusion.org/How-to/"&gt;How-to section&lt;/a&gt; and two simple examples: &lt;a href="https://docs.lsfusion.org/Score_table/"&gt;Score table&lt;/a&gt; and &lt;a href="https://docs.lsfusion.org/Materials_management/"&gt;Materials Management&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You can then explore the source code for &lt;a href="https://github.com/lsfusion-solutions/mycompany"&gt;MyCompany&lt;/a&gt;, which implements the basic logic of a simple ERP system. This solution is distributed under the Apache-2.0 license, and can be used as a basis for developing your own system.&lt;/p&gt;

&lt;p&gt;You can view the demo version at: &lt;a href="https://demo.lsfusion.org/mycompany"&gt;https://demo.lsfusion.org/mycompany&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Support
&lt;/h2&gt;

&lt;p&gt;The lsFusion platform has been developed for 12 years by a distributed team of programmers from different countries. During this time, a large number of applications based on it were implemented, from simple applications with a few forms to complex ERP systems with thousands of users and processes. The development of the platform is funded by companies creating their own commercial solutions based on it.&lt;/p&gt;

&lt;p&gt;lsFusion is suitable for application development by individual programmers as well as large teams, thanks to its highly declarative nature, OOP support, and compatibility with Git-based workflows. The license permits both in-house application development and commercial products based on the platform.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>lowcode</category>
    </item>
    <item>
      <title>Functional Database Concept</title>
      <dc:creator>Aliaksei Kirkouski</dc:creator>
      <pubDate>Mon, 23 Jan 2023 09:51:55 +0000</pubDate>
      <link>https://dev.to/crushby/functional-database-concept-34mn</link>
      <guid>https://dev.to/crushby/functional-database-concept-34mn</guid>
      <description>&lt;p&gt;The database world has long been taken over by relational DBMSs that use the SQL language. So much so that all alternatives are called NoSQL. They managed to win back a certain place in this market, but relational DBMSs are not going to die, and continue to be actively used for their purposes.&lt;/p&gt;

&lt;p&gt;In this article, I want to describe the concept of a functional database. For better understanding, I will do this by comparing it to the classical relational model. The examples will be tasks from various SQL exercises found on the Internet.&lt;/p&gt;

&lt;h4&gt;
  
  
  Introduction
&lt;/h4&gt;

&lt;p&gt;Relational databases operate with tables and fields. A functional database uses classes and functions instead. A field in a table with N key columns is represented as a function of N parameters. Instead of relations between tables, we use functions that return objects of the class the relation points to. Instead of a JOIN, we use function composition.&lt;/p&gt;
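As a rough sketch of this mapping in plain Python (dictionaries standing in for stored functions; all names here are illustrative, not part of lsFusion):

```python
# A field of a table with N key columns becomes a function of N parameters.
# Plain dicts stand in for stored functions; every name is illustrative.
name = {"sku1": "Apple"}                    # function of one Sku parameter
price = {("sku1", "store1", "sup1"): 1.5}   # function of three parameters
chief = {"emp1": "emp2", "emp2": "emp3"}    # object-valued function on Employee
salary = {"emp1": 100, "emp2": 90, "emp3": 120}

# Instead of a JOIN between employee rows, compose functions:
# salary(chief(e)) reads the immediate chief's salary.
def chief_salary(e):
    return salary[chief[e]]

print(chief_salary("emp1"))  # 90
```

Here `chief_salary` plays the role of the composition salary(chief(e)); no explicit join-key handling is needed.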

&lt;p&gt;Before I go directly to the tasks, I will describe the definition of the domain logic. For the DDL I will use PostgreSQL syntax. For the functional one I will use my own syntax.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tables and fields
&lt;/h2&gt;

&lt;p&gt;A simple Sku object with &lt;em&gt;name&lt;/em&gt; and &lt;em&gt;price&lt;/em&gt; fields:&lt;/p&gt;

&lt;h4&gt;
  
  
  Relational
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TABLE Sku
(
    id bigint NOT NULL,
    name character varying(100),
    price numeric(10,5),
    CONSTRAINT id_pkey PRIMARY KEY (id)
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Functional
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CLASS Sku;
name = DATA STRING[100] (Sku);
price = DATA NUMERIC[10,5] (Sku);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We declare two functions, each taking a single Sku parameter as input and returning a primitive type.&lt;/p&gt;

&lt;p&gt;It is assumed that each object in the functional database has an internal code that is generated automatically and can be accessed when needed.&lt;/p&gt;
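A minimal sketch of such auto-generated internal codes in plain Python (illustrative names; not how the actual implementation assigns them):

```python
import itertools

# Each new object receives the next internal code (a surrogate key).
_codes = itertools.count(1)

class Obj:
    def __init__(self):
        self.code = next(_codes)  # internal code, accessible when needed

class Sku(Obj):
    pass

a, b = Sku(), Sku()
print(a.code, b.code)  # 1 2
```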

&lt;p&gt;Now let's set a price for each product / store / supplier combination. The price can change over time, so we also add a dateTime field to the table. To keep the code short, I will skip the declarations of the tables referenced by the relational version:&lt;/p&gt;

&lt;h4&gt;
  
  
  Relational
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TABLE prices
(
    skuId bigint NOT NULL,
    storeId bigint NOT NULL,
    supplierId bigint NOT NULL,
    dateTime timestamp without time zone,
    price numeric(10,5),
    CONSTRAINT prices_pkey PRIMARY KEY (skuId, storeId, supplierId)
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Functional
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CLASS Sku;
CLASS Store;
CLASS Supplier;
dateTime = DATA DATETIME (Sku, Store, Supplier);
price = DATA NUMERIC[10,5] (Sku, Store, Supplier);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Indexes
&lt;/h2&gt;

&lt;p&gt;Finally, we build an index for all the keys and the date, so we can quickly find the price at a certain time.&lt;/p&gt;

&lt;h4&gt;
  
  
  Relational
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE INDEX prices_date
    ON prices
    (skuId, storeId, supplierId, dateTime)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Functional
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INDEX sku sk, Store st, Supplier sp, dateTime(sk, st, sp);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Tasks
&lt;/h2&gt;

&lt;p&gt;Let's start with relatively simple examples. First, let's declare the domain logic as follows (employees and departments):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbk2wct7je1de33y7al0x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbk2wct7je1de33y7al0x.png" alt="Domain logic" width="253" height="187"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Functional
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CLASS Department;
name = DATA STRING[100] (Department);

CLASS Employee;
department = DATA Department (Employee);
chief = DATA Employee (Employee);
name = DATA STRING[100] (Employee);
salary = DATA NUMERIC[14,2] (Employee);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Task 1.1.
&lt;/h3&gt;

&lt;p&gt;Select a list of employees who are paid more than their immediate chief.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;select a.*
from   employee a
where  a.salary = ( select max(salary) from employee b
                    where  b.department_id = a.department_id )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT name(Employee a) WHERE salary(a) &amp;gt; salary(chief(a));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Task 1.2.
&lt;/h3&gt;

&lt;p&gt;Select a list of employees who have the maximum salary in their department.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;select a.*
from employee a
where a.salary = ( select max(salary) from employee b
                    where b.department_id = a.department_id )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;maxSalary (Department s) = 
    GROUP MAX salary(Employee e) IF department(e) = s;

SELECT name(Employee a) WHERE salary(a) = maxSalary(department(a));

//or "inline"
SELECT name(Employee a) WHERE 
    salary(a) = GROUP MAX salary(Employee e) IF department(e) = department(a);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These two implementations are equivalent. For the first one, you could use CREATE VIEW in a relational database, which would likewise precompute the maximum salary for each department. Going forward, I will use the first form, because it shows the solution more clearly.&lt;/p&gt;
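In plain Python the named-function variant reads almost literally (toy data and illustrative names, not the article's actual schema):

```python
# Toy data standing in for the Employee class and its data functions.
employees = [
    {"name": "Ann", "department": "Sales", "salary": 900},
    {"name": "Bob", "department": "Sales", "salary": 1200},
    {"name": "Cid", "department": "IT",    "salary": 1000},
]

def max_salary(department):
    # GROUP MAX salary(e) IF department(e) = department
    return max(e["salary"] for e in employees if e["department"] == department)

# SELECT name(Employee a) WHERE salary(a) = maxSalary(department(a))
top_paid = [e["name"] for e in employees
            if e["salary"] == max_salary(e["department"])]
print(top_paid)  # ['Bob', 'Cid']
```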

&lt;h3&gt;
  
  
  Task 1.3.
&lt;/h3&gt;

&lt;p&gt;Select a list of IDs of the departments that do not have more than 3 employees.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;select department_id
from employee
group by department_id
having count(*) &amp;lt;= 3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;countEmployees (Department d) = 
    GROUP SUM 1 IF department(Employee e) = d;

SELECT Department d WHERE countEmployees(d) &amp;lt;= 3;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Task 1.4.
&lt;/h3&gt;

&lt;p&gt;Select a list of employees who do not have an assigned chief working in the same department.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;select a.*
from employee a
left join employee b on (b.id = a.chief_id and b.department_id = a.department_id)
where b.id is null
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT name(Employee a) WHERE NOT (department(chief(a)) = department(a));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Task 1.5.
&lt;/h3&gt;

&lt;p&gt;Find a list of department IDs with the maximum total salary of employees.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;with sum_salary as
  ( select department_id, sum(salary) salary
    from employee
    group by department_id )
select department_id
from sum_salary a       
where a.salary = ( select max(salary) from sum_salary )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;salarySum (Department d) = 
    GROUP SUM salary(Employee e) IF department(e) = d;

maxSalarySum () = GROUP MAX salarySum(Department d);

SELECT Department d WHERE salarySum(d) = maxSalarySum();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, let's proceed to more complex tasks. Consider the following domain logic:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjmsbmei49rhnm7hyl93b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjmsbmei49rhnm7hyl93b.png" alt="Domain logic" width="800" height="801"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Functional definition (the part needed for the following example):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CLASS Employee;
lastName = DATA STRING[100] (Employee);

CLASS Product;
id = DATA INTEGER (Product);
name = DATA STRING[100] (Product);

CLASS Order;
date = DATA DATE (Order);
employee = DATA Employee (Order);

CLASS Detail;
order = DATA Order (Detail);
product = DATA Product (Detail);
quantity = DATA NUMERIC[10,5] (Detail);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Task 2.1.
&lt;/h3&gt;

&lt;p&gt;Which employees sold more than 30 pieces of product #1 in 1997?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;select LastName
from Employees as e
where (
  select sum(od.Quantity)
  from [Order Details] as od
  where od.ProductID = 1 and od.OrderID in (
    select o.OrderID
    from Orders as o
    where year(o.OrderDate) = 1997 and e.EmployeeID = o.EmployeeID)
) &amp;gt; 30
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sold (Employee e, INTEGER productId, INTEGER year) = 
    GROUP SUM quantity(Detail d) IF 
        employee(order(d)) = e AND 
        id(product(d)) = productId AND 
        extractYear(date(order(d))) = year;

SELECT lastName(Employee e) WHERE sold(e, 1, 1997) &amp;gt; 30;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Task 2.2.
&lt;/h3&gt;

&lt;p&gt;For each customer, find the two products for which the customer spent the most money in 1997.&lt;/p&gt;

&lt;p&gt;Let's extend the domain logic from the previous example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CLASS Customer;
contactName = DATA STRING[100] (Customer);

customer = DATA Customer (Order);

unitPrice = DATA NUMERIC[14,2] (Detail);
discount = DATA NUMERIC[6,2] (Detail);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Solution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT ContactName, ProductName FROM (
SELECT c.ContactName, p.ProductName
, ROW_NUMBER() OVER (
    PARTITION BY c.ContactName
    ORDER BY SUM(od.Quantity * od.UnitPrice * (1 - od.Discount)) DESC
) AS RatingByAmt
FROM Customers c
JOIN Orders o ON o.CustomerID = c.CustomerID
JOIN [Order Details] od ON od.OrderID = o.OrderID
JOIN Products p ON p.ProductID = od.ProductID
WHERE YEAR(o.OrderDate) = 1997
GROUP BY c.ContactName, p.ProductName
) t
WHERE RatingByAmt &amp;lt; 3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sum (Detail d) = quantity(d) * unitPrice(d) * (1 - discount(d));

bought (Customer c, Product p, INTEGER y) = 
    GROUP SUM sum(Detail d) IF 
        customer(order(d)) = c AND 
        product(d) = p AND 
        extractYear(date(order(d))) = y;

rating (Customer c, Product p, INTEGER y) = 
    PARTITION SUM 1 ORDER DESC bought(c, p, y), p 
              BY c, y;

SELECT contactName(Customer c), name(Product p) WHERE rating(c, p, 1997) &amp;lt; 3;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The PARTITION operator works as follows: within each group defined by the BY expressions (here the customer and the year, but these can be any expressions), it orders the rows by the ORDER expressions and accumulates a running sum of the expression after SUM (here 1), which yields the rank.&lt;/p&gt;
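The same principle can be sketched in plain Python (toy data and illustrative names): group the rows, order each group, and assign the rank as a running count.

```python
from collections import defaultdict

# Toy data: (customer, product) pairs and the amount bought; names illustrative.
bought = {
    ("c1", "pA"): 500, ("c1", "pB"): 300, ("c1", "pC"): 700,
    ("c2", "pA"): 200, ("c2", "pB"): 900,
}

# Group by customer (the BY part of PARTITION).
groups = defaultdict(list)
for (c, p), amt in bought.items():
    groups[c].append((amt, p))

# Inside each group, order by amount descending, then product (the ORDER part),
# and accumulate SUM 1, which yields the rank.
rating = {}
for c, items in groups.items():
    items.sort(key=lambda t: (-t[0], t[1]))
    for rank, (_, p) in enumerate(items, start=1):
        rating[(c, p)] = rank

# Keeping ranks 1 and 2 gives the two top products per customer.
top_two = sorted((c, p) for (c, p), r in rating.items() if r in (1, 2))
print(top_two)  # [('c1', 'pA'), ('c1', 'pC'), ('c2', 'pA'), ('c2', 'pB')]
```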

&lt;h3&gt;
  
  
  Task 2.3.
&lt;/h3&gt;

&lt;p&gt;How many products need to be ordered from suppliers to fulfill the current orders?&lt;/p&gt;

&lt;p&gt;Again we expand the domain logic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CLASS Supplier;
companyName = DATA STRING[100] (Supplier);

supplier = DATA Supplier (Product);
shippedDate = DATA DATE (Order);

unitsInStock = DATA NUMERIC[10,3] (Product);
reorderLevel = DATA NUMERIC[10,3] (Product);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Solution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;select s.CompanyName, p.ProductName, sum(od.Quantity) + p.ReorderLevel - p.UnitsInStock as ToOrder
from Orders o
join [Order Details] od on o.OrderID = od.OrderID
join Products p on od.ProductID = p.ProductID
join Suppliers s on p.SupplierID = s.SupplierID
where o.ShippedDate is null
group by s.CompanyName, p.ProductName, p.UnitsInStock, p.ReorderLevel
having p.UnitsInStock &amp;lt; sum(od.Quantity) + p.ReorderLevel
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;orderedNotShipped (Product p) = 
    GROUP SUM quantity(OrderDetail d) IF product(d) = p;

toOrder (Product p) = orderedNotShipped(p) + reorderLevel(p) - unitsInStock(p);

SELECT companyName(supplier(Product p)), name(p), toOrder(p) WHERE toOrder(p) &amp;gt; 0;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  A task with an asterisk
&lt;/h3&gt;

&lt;p&gt;And one last example. Consider a "social network" domain: people can be friends with each other and like each other. In terms of a functional database, this would look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CLASS Person;
likes = DATA BOOLEAN (Person, Person);
friends = DATA BOOLEAN (Person, Person);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We need to find possible friendship candidates. More formally, we need to find all people A, B, C such that A is friends with B and B is friends with C, A likes C, but A is not friends with C.&lt;br&gt;
In terms of a functional database, the query would look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT Person a, Person b, Person c WHERE 
    likes(a, c) AND NOT friends(a, c) AND 
    friends(a, b) AND friends(b, c);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is assumed that there are far fewer friendships than likes, so they are stored in separate tables. There is also a two-asterisk variant of the task, in which friendship is not symmetric. In a functional database it would look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT Person a, Person b, Person c WHERE 
    likes(a, c) AND NOT friends(a, c) AND 
    (friends(a, b) OR friends(b, a)) AND 
    (friends(b, c) OR friends(c, b));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I won't give the equivalent SQL query here, because it would be much more complicated and take up a lot of space.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;It should be noted that the syntax described above is just one possible implementation of the concept. SQL was taken as the basis, with the goal of keeping the syntax as close to it as possible. Of course, some may not like the keyword names, the letter casing, and so on. What matters here is the concept itself. A similar syntax could just as well be modeled on C++ or Python.&lt;/p&gt;

&lt;p&gt;The described database concept, in my opinion, has the following advantages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Simplicity&lt;/strong&gt;. This is a fairly subjective metric that is not obvious in simple cases. But looking at more complex cases (for example, the tasks with asterisks), it is, in my opinion, much easier to write such queries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Encapsulation&lt;/strong&gt;. In some examples, I declared intermediate functions (e.g. &lt;em&gt;sold&lt;/em&gt;, &lt;em&gt;bought&lt;/em&gt;, etc.), from which subsequent functions were built. This allows you to change the logic of individual functions, when necessary, without touching the functions that depend on them. For example, you can make &lt;em&gt;sold&lt;/em&gt; compute from completely different objects without changing the rest of the logic. Yes, you can do that with CREATE VIEW in an RDBMS, but if all the logic is written that way, it won't be very readable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No semantic gap&lt;/strong&gt;. Such a database operates with functions and classes (instead of tables and fields), exactly as in classical programming (if we consider a method to be a function whose first parameter is the class it belongs to). Accordingly, it should be much easier to integrate with general-purpose programming languages. In addition, this concept makes it possible to implement much more complex features. For example, you can embed operators like:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CONSTRAINT sold(Employee e, 1, 2019) &amp;gt; 100 
           IF name(e) = 'John' 
    MESSAGE 'John sells too many products in 2019';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Inheritance and polymorphism&lt;/strong&gt;. In a functional database, you can introduce multiple inheritance through &lt;code&gt;CLASS ClassP: Class1, Class2&lt;/code&gt; constructs and implement polymorphism with multiple dispatch.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Although this is just a concept, there is already an implementation in Java that translates all the functional logic into relational logic. It also implements user interface logic and much more, making it a whole &lt;a href="https://medium.com/@crushby/introducing-lsfusion-open-source-low-code-platform-cdd052e80ef8" rel="noopener noreferrer"&gt;platform&lt;/a&gt;. It uses an RDBMS (so far only PostgreSQL) as a "virtual machine". In theory, however, it is possible to implement a dedicated database management system that uses a storage structure adapted specifically to functional logic.&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>tooling</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
