In this series testing the simplest SQL schema to evaluate which databases can qualify for Distributed SQL, I was not impressed by the capabilities of many MySQL-compatible ones because referential integrity is a second-class citizen. TiDB is usually qualified as Distributed SQL because it is based on a Spanner architecture. Let's try it.
I'm trying on TiDB cloud and, as there's a Chat2Query artificial intelligence, let's ask create the traditional EMP DEPT tables with autoincrement and referential integrity
:
The result is not bad at all:
However, to do the same as I did with all posts in this series, I have run the same as in the previous post:
use sample_data;
CREATE TABLE dept (
deptno integer NOT NULL,
dname text,
loc text,
description text,
CONSTRAINT pk_dept PRIMARY KEY (deptno asc)
);
CREATE TABLE emp (
empno integer NOT NULL auto_increment,
ename text NOT NULL,
job text,
mgr integer,
hiredate date,
sal integer,
comm integer,
deptno integer NOT NULL,
email varchar(90),
other_info json,
CONSTRAINT pk_emp PRIMARY KEY (empno),
CONSTRAINT emp_email_uk UNIQUE (email),
CONSTRAINT fk_deptno FOREIGN KEY (deptno) REFERENCES dept(deptno),
CONSTRAINT fk_mgr FOREIGN KEY (mgr) REFERENCES emp(empno)
);
INSERT INTO dept (deptno, dname, loc, description)
VALUES (10, 'ACCOUNTING', 'NEW YORK','preparation of financial statements, maintenance of general ledger, payment of bills, preparation of customer bills, payroll, and more.'),
(20, 'RESEARCH', 'DALLAS','responsible for preparing the substance of a research report or security recommendation.'),
(30, 'SALES', 'CHICAGO','division of a business that is responsible for selling products or services'),
(40, 'OPERATIONS', 'BOSTON','administration of business practices to create the highest level of efficiency possible within an organization');
INSERT INTO emp (empno, ename, job, mgr, hiredate, sal, comm, deptno, email, other_info)
VALUES (7369, 'SMITH', 'CLERK', 7902, '1980-12-17', 800, NULL, 20,'SMITH@acme.com', '{"skills":["accounting"]}'),
(7499, 'ALLEN', 'SALESMAN', 7698, '1981-02-20', 1600, 300, 30,'ALLEN@acme.com', null),
(7521, 'WARD', 'SALESMAN', 7698, '1981-02-22', 1250, 500, 30,'WARD@compuserve.com', null),
(7566, 'JONES', 'MANAGER', 7839, '1981-04-02', 2975, NULL, 20,'JONES@gmail.com', null),
(7654, 'MARTIN', 'SALESMAN', 7698, '1981-09-28', 1250, 1400, 30,'MARTIN@acme.com', null),
(7698, 'BLAKE', 'MANAGER', 7839, '1981-05-01', 2850, NULL, 30,'BLAKE@hotmail.com', null),
(7782, 'CLARK', 'MANAGER', 7839, '1981-06-09', 2450, NULL, 10,'CLARK@acme.com', '{"skills":["C","C++","SQL"]}'),
(7788, 'SCOTT', 'ANALYST', 7566, '1982-12-09', 3000, NULL, 20,'SCOTT@acme.com', '{"cat":"tiger"}'),
(7839, 'KING', 'PRESIDENT', NULL, '1981-11-17', 5000, NULL, 10,'KING@aol.com', null),
(7844, 'TURNER', 'SALESMAN', 7698, '1981-09-08', 1500, 0, 30,'TURNER@acme.com', null),
(7876, 'ADAMS', 'CLERK', 7788, '1983-01-12', 1100, NULL, 20,'ADAMS@acme.org', null),
(7900, 'JAMES', 'CLERK', 7698, '1981-12-03', 950, NULL, 30,'JAMES@acme.org', null),
(7902, 'FORD', 'ANALYST', 7566, '1981-12-03', 3000, NULL, 20,'FORD@acme.com', '{"skills":["SQL","CQL"]}'),
(7934, 'MILLER', 'CLERK', 7782, '1982-01-23', 1300, NULL, 10,'MILLER@acme.com', null);
This just works. The foreign key is just an accepted syntax for the stable release (v6.5) but is truly enforced in this preview release (v6.6)
Testing Foreign Key
If I delete a department with child rows, an exception is raised:
mysql> delete from dept where deptno=10;
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`test`.`emp`, CONSTRAINT `fk_deptno` FOREIGN KEY (`deptno`) REFERENCES `dept` (`deptno`))
mysql>
This works as expected. And the error message is clear.
Now testing concurrent transactions.
On session 1:
mysql> start transaction;
Query OK, 0 rows affected (0.19 sec)
mysql> insert into emp(deptno, ename) values (40, 'Franck');
Query OK, 1 row affected (0.19 sec)
On session 2:
mysql> delete from dept where deptno=40;
This waits, which is normal
Session 1:
mysql> rollback;
Query OK, 0 rows affected (0.19 sec)
Session 2:
mysql> delete from dept where deptno=40;
Query OK, 1 row affected (10.41 sec)
Ok, this works as expected. Pessimistic locking.
Testing serializable
First session:
mysql> set transaction isolation level serializable;
ERROR 8048 (HY000): The isolation level 'SERIALIZABLE' is not supported. Set tidb_skip_isolation_level_check=1 to skip this error
Ok, time to look at the documentation. Here is the version I'm running on:
mysql> select version();
+-------------------------------+
| version() |
+-------------------------------+
| 5.7.25-TiDB-v6.6.0-serverless |
+-------------------------------+
1 row in set (0.19 sec)
There is a good documentation about transactions and isolation levels: https://docs.pingcap.com/tidb/stable/transaction-isolation-levels which explains what is supported and how it is different from other databases, including MySQL. Clearly, TiDB doesn't try to be compatible with MySQL on isolation levels like CockroachDB is not with PostgreSQL. This is different from YugabyteDB which behaves like PostgreSQL for all isolation levels.
I may add more in the future about this if I test it further. For the moment, supporting foreign keys for a MySQL-compatible database is already a good thing when compared to other MySQL-compatible databases. Note that it is currently only in the preview version.
Is it sufficient to qualify for being Distributed SQL? From the common usage of this term, I guess it is because many other Distributed SQL databases have limited isolation levels. Only YugabyteDB can map to all isolation levels defined by the SQL standard.
Other features
I tried Chat2Query to generate test cases for some SQL features:
-- please show an example for the following features: stored procedure, expression index, partial index
The CREATE PROCEDURE doesn't work. TiDB doesn't support stored procedures or user-defined functions.
The documentation has a good list of compatibility with MySQL: https://docs.pingcap.com/tidb/stable/mysql-compatibility
All these SQL features are supported in YugabyteDB (with their PostgreSQL syntax and behavior of course).
Scalability
Another important difference with YugabyteDB is that, in order to have a global ordering of transactions, TiDB allocates a timestamp that is generated by a single "Placement Driver" which is roughly equivalent to the YugabyteDB yb-master. Having a unique source for this timestamp limits the scalability as all transactions must get their timestamp from it. Think about it as a NOCACHE sequence in SQL - that's something we don't recommend. In practice, this means that TiDB is not designed for multi-region geo-distribution.
With YugabyteDB the yb-master is never in the critical path of SQL transactions. Global ordering of transactions use a Hybrid Logical Clock (HLC) as explained in:
Top comments (0)