Franck Pachot

Posted on Apr 28, 2023 • Edited on Nov 5, 2024

TiDB: Distributed MySQL with Foreign Key (not production ready) and not Serializable

#distributed #sql #database #tidb

In this series testing the simplest SQL schema to evaluate which databases can qualify for Distributed SQL, I was not impressed by the capabilities of many MySQL-compatible ones because referential integrity is a second-class citizen. TiDB is usually qualified as Distributed SQL because it is based on a Spanner architecture. Let's try it.

I'm trying on TiDB cloud and, as there's a Chat2Query artificial intelligence, let's ask create the traditional EMP DEPT tables with autoincrement and referential integrity:

The result is not bad at all:

However, to do the same as I did with all posts in this series, I have run the same as in the previous post:

use sample_data;

CREATE TABLE dept (
  deptno integer NOT NULL,
  dname text,
  loc text,
  description text,
  CONSTRAINT pk_dept PRIMARY KEY (deptno asc)
);

CREATE TABLE emp (
  empno integer NOT NULL auto_increment,
  ename text NOT NULL,
  job text,
  mgr integer,
  hiredate date,
  sal integer,
  comm integer,
  deptno integer NOT NULL,
  email varchar(90),
  other_info json,
  CONSTRAINT pk_emp PRIMARY KEY (empno),
  CONSTRAINT emp_email_uk UNIQUE (email),
  CONSTRAINT fk_deptno FOREIGN KEY (deptno) REFERENCES dept(deptno),
  CONSTRAINT fk_mgr FOREIGN KEY (mgr) REFERENCES emp(empno)
);

INSERT INTO dept (deptno,  dname,        loc, description)
     VALUES    (10,     'ACCOUNTING', 'NEW YORK','preparation of financial statements, maintenance of general ledger, payment of bills, preparation of customer bills, payroll, and more.'),
            (20,     'RESEARCH',   'DALLAS','responsible for preparing the substance of a research report or security recommendation.'),
            (30,     'SALES',      'CHICAGO','division of a business that is responsible for selling products or services'),
            (40,     'OPERATIONS', 'BOSTON','administration of business practices to create the highest level of efficiency possible within an organization');

INSERT INTO emp (empno, ename,    job,        mgr,   hiredate,     sal, comm, deptno, email, other_info)
     VALUES   (7369, 'SMITH',  'CLERK',     7902, '1980-12-17',  800, NULL,   20,'SMITH@acme.com', '{"skills":["accounting"]}'),
            (7499, 'ALLEN',  'SALESMAN',  7698, '1981-02-20', 1600,  300,   30,'ALLEN@acme.com', null),
            (7521, 'WARD',   'SALESMAN',  7698, '1981-02-22', 1250,  500,   30,'WARD@compuserve.com', null),
            (7566, 'JONES',  'MANAGER',   7839, '1981-04-02', 2975, NULL,   20,'JONES@gmail.com', null),
            (7654, 'MARTIN', 'SALESMAN',  7698, '1981-09-28', 1250, 1400,   30,'MARTIN@acme.com', null),
            (7698, 'BLAKE',  'MANAGER',   7839, '1981-05-01', 2850, NULL,   30,'BLAKE@hotmail.com', null),
            (7782, 'CLARK',  'MANAGER',   7839, '1981-06-09', 2450, NULL,   10,'CLARK@acme.com', '{"skills":["C","C++","SQL"]}'),
            (7788, 'SCOTT',  'ANALYST',   7566, '1982-12-09', 3000, NULL,   20,'SCOTT@acme.com', '{"cat":"tiger"}'),
            (7839, 'KING',   'PRESIDENT', NULL, '1981-11-17', 5000, NULL,   10,'KING@aol.com', null),
            (7844, 'TURNER', 'SALESMAN',  7698, '1981-09-08', 1500,    0,   30,'TURNER@acme.com', null),
            (7876, 'ADAMS',  'CLERK',     7788, '1983-01-12', 1100, NULL,   20,'ADAMS@acme.org', null),
            (7900, 'JAMES',  'CLERK',     7698, '1981-12-03',  950, NULL,   30,'JAMES@acme.org', null),
            (7902, 'FORD',   'ANALYST',   7566, '1981-12-03', 3000, NULL,   20,'FORD@acme.com', '{"skills":["SQL","CQL"]}'),
            (7934, 'MILLER', 'CLERK',     7782, '1982-01-23', 1300, NULL,   10,'MILLER@acme.com', null);

This just works. The foreign key is just an accepted syntax for the stable release (v6.5) but is truly enforced in this preview release (v6.6)

Testing Foreign Key

If I delete a department with child rows, an exception is raised:

mysql> delete from dept where deptno=10;
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`test`.`emp`, CONSTRAINT `fk_deptno` FOREIGN KEY (`deptno`) REFERENCES `dept` (`deptno`))
mysql>

This works as expected. And the error message is clear.

Now testing concurrent transactions.
On session 1:

mysql> start transaction;
Query OK, 0 rows affected (0.19 sec)

mysql> insert into emp(deptno, ename) values (40, 'Franck');
Query OK, 1 row affected (0.19 sec)

On session 2:

mysql> delete from dept where deptno=40;

This waits, which is normal
Session 1:

mysql> rollback;
Query OK, 0 rows affected (0.19 sec)

Session 2:

mysql> delete from dept where deptno=40;
Query OK, 1 row affected (10.41 sec)

Ok, this works as expected. Pessimistic locking.

Testing serializable

First session:

mysql> set transaction isolation level serializable;
ERROR 8048 (HY000): The isolation level 'SERIALIZABLE' is not supported. Set tidb_skip_isolation_level_check=1 to skip this error

Ok, time to look at the documentation. Here is the version I'm running on:

mysql> select version();
+-------------------------------+
| version()                     |
+-------------------------------+
| 5.7.25-TiDB-v6.6.0-serverless |
+-------------------------------+
1 row in set (0.19 sec)

There is a good documentation about transactions and isolation levels: https://docs.pingcap.com/tidb/stable/transaction-isolation-levels which explains what is supported and how it is different from other databases, including MySQL. Clearly, TiDB doesn't try to be compatible with MySQL on isolation levels like CockroachDB is not with PostgreSQL. This is different from YugabyteDB which behaves like PostgreSQL for all isolation levels.

I may add more in the future about this if I test it further. For the moment, supporting foreign keys for a MySQL-compatible database is already a good thing when compared to other MySQL-compatible databases. Note that it is currently only in the preview version.

Is it sufficient to qualify for being Distributed SQL? From the common usage of this term, I guess it is because many other Distributed SQL databases have limited isolation levels. Only YugabyteDB can map to all isolation levels defined by the SQL standard.

Other features

I tried Chat2Query to generate test cases for some SQL features:
-- please show an example for the following features: stored procedure, expression index, partial index

The CREATE PROCEDURE doesn't work. TiDB doesn't support stored procedures or user-defined functions.

The documentation has a good list of compatibility with MySQL: https://docs.pingcap.com/tidb/stable/mysql-compatibility

All these SQL features are supported in YugabyteDB (with their PostgreSQL syntax and behavior of course).

Scalability

Another important difference with YugabyteDB is that, in order to have a global ordering of transactions, TiDB allocates a timestamp that is generated by a single "Placement Driver" which is roughly equivalent to the YugabyteDB yb-master. Having a unique source for this timestamp limits the scalability as all transactions must get their timestamp from it. Think about it as a NOCACHE sequence in SQL - that's something we don't recommend. In practice, this means that TiDB is not designed for multi-region geo-distribution.

With YugabyteDB the yb-master is never in the critical path of SQL transactions. Global ordering of transactions use a Hybrid Logical Clock (HLC) as explained in:

Evolving Clock Sync in Distributed Databases | YugabyteDB

Explore different approaches to synchronizing time across nodes in a distributed database and why YugabyteDB uses hybrid logical clocks.

yugabyte.com

Conclusion about Foreign Keys

TiDB is implementing foreign keys (as of version 6.6.0, currently experimental and not intended for production use), which are essential for any SQL database. However, it’s important to consider the scalability implications. TiDB doesn't have shared row locks. The foreign keys lock the parent row in exclusive mode. This behavior is acceptable for aggregate tables, such as Orders and OrderLines, where concurrent sessions are unlikely to add lines to the same order simultaneously. However, it poses challenges for lookup tables, like Products, where only one session can order a product at a time. Additionally, it can be problematic for strong entities, such as Customers and Orders, where you might expect multiple orders from the same customer concurrently.

YugabyteDB implements all PostgreSQL-compatible transaction intents and can lock the parent row in exclusive (like a SELECT FOR UPDATE) or share mode (like a SELECT FOR SHARE). Referential integrity uses the share mode to allow concurrent DML with the same parent value.

DEV Community

TiDB: Distributed MySQL with Foreign Key (not production ready) and not Serializable

Testing Foreign Key

Testing serializable

Other features

Scalability

Evolving Clock Sync in Distributed Databases | YugabyteDB

Conclusion about Foreign Keys

Top comments (0)

Read next

Run a Local 💻 MySQL 🐬 Instance

Designing a Scalable Database System for High-Volume Data with Real-Time Analytics

Designing a Scalable Database System for High-Volume Data with Real-Time Analytics

ClickHouse: The Key to Faster Insights