Franck Pachot for YugabyteDB

Posted on Feb 7, 2022

Retry logic for optimistic locking and transparent failover 🐘🚀

#yugabytedb #postgres #optimistic #retry

Here is a small demo to show the retry logic that must be implemented to handle transaction conflict or transparent failover errors from a database. The goal is to be able to do some experiments on your laptop, because there are, unfortunately, no magic solutions.

Setup a PostgreSQL database

Start a local PostgreSQL:

docker run --name pg -e POSTGRES_PASSWORD=franck -p 5432:5432 -d postgres

Create a simple 10 rows table:

PGUSER=postgres PGPASSWORD=franck PGHOST=localhost PGPORT=5432  psql -e <<SQL
create table franck(id int primary key, val int);
insert into franck select generate_series(1,10),0;
SQL

create table franck(id int primary key, val int)
CREATE TABLE

insert into franck select generate_series(1,10),0;
INSERT 0 10

Get YBDemo

I use a very simple program to run queries in thread, Download the YBDemo.jar which includes the PostgreSQL and YugabyteDB JDBC drivers and Hikari connection pool:

curl -Ls https://github.com/FranckPachot/ybdemo/releases/download/v0.0.1/YBDemo-0.0.1-SNAPSHOT-jar-with-dependencies.jar > YBDemo.jar
curl -Ls https://jdbc.postgresql.org/download/postgresql-42.3.2.jar > postgresql.jar

I setup the connection parameters in Hikary properties:

cat > hikari.properties <<'INI'
dataSourceClassName=org.postgresql.ds.PGSimpleDataSource
dataSource.url=jdbc:postgresql://localhost:5432/postgres?user=postgres&password=franck
INI

Here is how I run 5 threads that update all rows in my demo table:

for i in {1..5} ; do
 echo "update franck set val=val+1 returning val+1"
done | java -jar YBDemo.jar

This is a very simple example, update franck set val=val+1 on the 10 rows of my table, run from multiple threads, to show transaction conflicts. Of course, if you have this in your application, you have a problem. But the goal of a lab is to analyze simple things in order to get full understanding.

Read committed

The default isolation level in PostgreSQL is Read Committed. You can check quickly that it is the same with my program: java -jar YBDemo.jar <<< "select current_setting('transaction_isolation')"

So, If you have run the simple update franck set val=val+1 returning val+1 above from at least 3 threads you will see some transaction conflict errors:

 Thread-2      5 ms: 17735
 Thread-0      5 ms: 17736
 Thread-1      5 ms: 17737

update franck set val=val+1 returning val+1
Error in thread  Thread-0   1016 ms SQLSTATE(40P01) - retry 0/10
org.postgresql.util.PSQLException: ERROR: deadlock detected
  Detail: Process 180 waits for ShareLock on transaction 20923; blocked by process 182.
Process 182 waits for ShareLock on transaction 20926; blocked by process 180.
  Hint: See server log for query details.
  Where: while rechecking updated tuple (11,5) in relation "franck"
 wait in thread  Thread-0     15 ms after   0 retries

update franck set val=val+1 returning val+1
Error in thread  Thread-1   1005 ms SQLSTATE(40P01) - retry 0/10
org.postgresql.util.PSQLException: ERROR: deadlock detected
  Detail: Process 181 waits for ShareLock on transaction 20989; blocked by process 182.
Process 182 waits for ShareLock on transaction 20988; blocked by process 181.
  Hint: See server log for query details.
  Where: while rechecking updated tuple (9,21) in relation "franck"
 wait in thread  Thread-1     10 ms after   0 retries

I've run this example to make it clear that, even in Read Committed isolation level, with one simple SQL statement, transaction conflict can happen and the application must handle it with a retry logic. Why a retry logic? because you don't want to stop your application and you don't want to ignore the unsuccessful transaction. This is, by the way, the main goal of my YBDemo.java.

So, whatever the isolation level is, SQLSTATE 40P01 (deadlock_detected) must be retried by an highly available application.

Serializable

With Serializable isolation level, you can expect more transaction conflicts because the read/write stability is guaranteed for the whole transaction.

I'm adding this line to set the isolation level for each transaction:

cat >> hikari.properties <<'INI'
connectionInitSql= set default_transaction_isolation=serializable;
INI

and run the same again:

for i in {1..5} ; do
 echo "update franck set val=val+1 returning val+1"
done | java -jar YBDemo.jar

I can see many errors like:

 Thread-3      2 ms: 153703
 Thread-3      2 ms: 153704
 Thread-3      2 ms: 153705
 wait in thread  Thread-0   2567 ms after   8 retries
 Thread-3      2 ms: 153706

update franck set val=val+1 returning val+1
Error in thread  Thread-0      2 ms SQLSTATE(40001) - retry 9/10
org.postgresql.util.PSQLException: ERROR: could not serialize access due to concurrent update
 Thread-3      3 ms: 153707
 Thread-3      2 ms: 153708
 Thread-3      2 ms: 153709
 Thread-3      2 ms: 153710

This one was at the limit because my YBDemo.java fails above 10 retries. This happened quickly:

Error in thread  Thread-2      2 ms SQLSTATE(40001) - retry 0/10
 wait in thread  Thread-2     10 ms after   0 retries
Error in thread  Thread-2      2 ms SQLSTATE(40001) - retry 1/10
 wait in thread  Thread-2     29 ms after   1 retries
Error in thread  Thread-2      2 ms SQLSTATE(40001) - retry 2/10
 wait in thread  Thread-2     41 ms after   2 retries
Error in thread  Thread-2      2 ms SQLSTATE(40001) - retry 3/10
 wait in thread  Thread-2     86 ms after   3 retries
Error in thread  Thread-2      2 ms SQLSTATE(40001) - retry 4/10
 wait in thread  Thread-2    164 ms after   4 retries
Error in thread  Thread-2      1 ms SQLSTATE(40001) - retry 5/10
 wait in thread  Thread-2    328 ms after   5 retries
Error in thread  Thread-2      1 ms SQLSTATE(40001) - retry 6/10
 wait in thread  Thread-2    641 ms after   6 retries
Error in thread  Thread-2      3 ms SQLSTATE(40001) - retry 7/10
 wait in thread  Thread-2   1289 ms after   7 retries
Error in thread  Thread-2      4 ms SQLSTATE(40001) - retry 8/10
 wait in thread  Thread-2   2567 ms after   8 retries
Error in thread  Thread-2      3 ms SQLSTATE(40001) - retry 9/10
 wait in thread  Thread-2   5125 ms after   9 retries
Error in thread  Thread-2      2 ms SQLSTATE(40001) - retry 10/10

I use exponential backoff to retry, you can see how in the code. The last one waited 5 seconds.

YugabyteDB

With a distributed database, optimistic locking is the rule, and retry-able errors can happen for many reasons. I've started a cluster on my laptop with 3 tablet servers using a docker-compose generated by yb-lab

I create the demo table (the docker yb-tserver-0 is redirected on localhost port 5433):

psql -p 5433 -U yugabyte -d yugabyte <<SQL
create table franck(id int primary key, val int);
insert into franck select generate_series(1,10),0;
SQL

and set the connection properties for it:

cat > hikari.properties <<INI
dataSourceClassName=com.yugabyte.ysql.YBClusterAwareDataSource
dataSource.url=jdbc:yugabytedb://localhost:5433/yugabyte?user=yugabyte&load-balance=false
connectionInitSql= set default_transaction_isolation=serializable;
INI

I see some serialization errors, of course, but never reached the maximum number of retries:

 Thread-3     17 ms: 1474
 Thread-3     17 ms: 1475
 Thread-3     17 ms: 1476
 Thread-3     17 ms: 1477
 Thread-3     18 ms: 1478
 Thread-3     17 ms: 1479
 Thread-3     18 ms: 1480
 Thread-3     17 ms: 1481

update franck set val=val+1 returning val+1
Error in thread  Thread-0   6541 ms SQLSTATE(40001) - retry 1/10
com.yugabyte.util.PSQLException: ERROR: Operation failed. Try again.: [Operation failed. Try again. (yb/tablet/running_transaction.cc:456): Transaction aborted: 9ced2fc8-06d2-490c-b400-732e53013a56 (pgsql error 40001)]
 Thread-3     18 ms: 1482
 wait in thread  Thread-0     21 ms after   1 retries
 Thread-3     18 ms: 1483
 Thread-0     20 ms: 1484
 Thread-0     16 ms: 1485
 Thread-0     18 ms: 1486
 Thread-0     20 ms: 1487
 Thread-0     19 ms: 1488
 Thread-0     19 ms: 1489
 Thread-0     18 ms: 1490
 Thread-0     17 ms: 1491

After one hour running I got two threads reaching 9 retries.

So, when using Serializable isolation level, SQLSTATE 40001 (serialization_failure) must be retried and the maximum number of retried (and time to wait between retries) should be correctly tuned for your system.

High Availability

While the previous was running, I've stopped and started one tablet server with docker-compose restart yb-tserver-2. I had a few more waits for a few seconds (because the tablets that had their leader in the stopped node have to get a follower elected as the new leader). But still showing only SQLSTATE 40001.

However, I was connected only to yb-tserver-0 in the previous test. The load is balanced to the cluster, but the connection happens in one place. When the node I'm connected to is down, the application will continue by connecting to the other nodes (thanks to the connection pool and the cluster-aware JDBC driver). The transient errors may also be handled by the application.

 Thread-0     22 ms: 59648^M
 Thread-0     22 ms: 59649^M
 Thread-0     29 ms: 59650^M
53171 [Thread-3] WARN com.zaxxer.hikari.pool.ProxyConnection - HikariPool-1 - Connection com.yugabyte.jdbc.PgConnection@61336f01 marked as broken because of SQLSTATE(08006), ErrorCode(0)^M
53171 [Thread-0] WARN com.zaxxer.hikari.pool.ProxyConnection - HikariPool-1 - Connection com.yugabyte.jdbc.PgConnection@3b373f9c marked as broken because of SQLSTATE(08006), ErrorCode(0)^M
53171 [Thread-2] WARN com.zaxxer.hikari.pool.ProxyConnection - HikariPool-1 - Connection com.yugabyte.jdbc.PgConnection@2abffd27 marked as broken because of SQLSTATE(08006), ErrorCode(0)^M
53171 [Thread-4] WARN com.zaxxer.hikari.pool.ProxyConnection - HikariPool-1 - Connection com.yugabyte.jdbc.PgConnection@39961dd8 marked as broken because of SQLSTATE(08006), ErrorCode(0)^M
53171 [Thread-1] WARN com.zaxxer.hikari.pool.ProxyConnection - HikariPool-1 - Connection com.yugabyte.jdbc.PgConnection@683b2ab4 marked as broken because of SQLSTATE(08006), ErrorCode(0)^M
com.yugabyte.util.PSQLException: An I/O error occurred while sending to the backend.^M
  at com.yugabyte.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:382)^M
  at com.yugabyte.jdbc.PgStatement.executeInternal(PgStatement.java:490)^M
  at com.yugabyte.jdbc.PgStatement.execute(PgStatement.java:408)^M
  at com.yugabyte.jdbc.PgStatement.executeWithFlags(PgStatement.java:329)^M
  at com.yugabyte.jdbc.PgStatement.executeCachedSql(PgStatement.java:315)^M
  at com.yugabyte.jdbc.PgStatement.executeWithFlags(PgStatement.java:291)^M
  at com.yugabyte.jdbc.PgStatement.executeQuery(PgStatement.java:243)^M
  at com.zaxxer.hikari.pool.ProxyStatement.executeQuery(ProxyStatement.java:110)^M
  at com.zaxxer.hikari.pool.HikariProxyStatement.executeQuery(HikariProxyStatement.java)^M
  at YBDemo.run(YBDemo.java:38)^M

The SQLSTATE 08006 (connection_failure) should also be retried. There is no need to wait because, in YugabyteDB, other nodes are immediately available.

I also got some:

yb-lab-yb-demo-write-1  | Error in thread  Thread-1   1589ms SQLSTATE(XX000) - retry 0/5
yb-lab-yb-demo-write-1  | com.yugabyte.util.PSQLException: ERROR: no owned sequence found
yb-lab-yb-demo-write-1  |   Where: Catalog Version Mismatch: A DDL occurred while processing this query. Try again.

when testing different failover situations so I decided to retry 'XX000'.

And the re-connection may also get a SQLTransientConnectionException which should alo be retried, without waiting because the connection pool has already a timeout setting

It is difficult to provide some generic code to handle this. If it was that easy, this would be automated within the database or the driver. And YugabyteDB has already many situation where conflicts are retried automatically. But some situation must be handled by the application:

did you do some non-transactional operations that risk to be done twice (examples: send e-mail, notifications...)
are you sure that you got the connection error before the changes were committed, or do you need to check the state before retry?

In summary

Any SQL database, because it is shared, and guarantees ACID properties to avoid data corruption, may raise transaction conflicts. Some situations can be retried automatically. Their sole consequence is a higher latency but, given the low probability of it, makes it better than pessimistic locking when you need to scale. Some other situations must be retried with care of the application logic, for which the database doesn't know, especially if you don't encapsulate all database actions into stored procedures.

Here is the repo where I have this demo program and docker-compose examples:

FranckPachot / ybdemo

Lab environment for YugabyteDB demos

🐳 The best use of this is from the docker-compose environment set in ./docker/yb-lab on Podman, Docker

or even:

YBDemo

YBDemo is a simple Java program that creates an HikariCP connection pool from the hikari.properties file in the current directory, and takes SQL statements to execute as lines from stdin. There's no multi-line statement: each line is a thread, executing the statement in a loop. The goal is to make it easy to run a demo with concurrent threads by providing the set of queries in a simple way, interactive or though file redirection. Only the first column of the first row is displayed, I use row_to_json() or json_agg() to format a larger result into one value. The thread stops if no row is returned, I use RETURNING to get a row from DML.

The goal is to run it on PostgreSQL compatible databases, especially distributed ones like YugabyteDB…

View on GitHub