<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Viktor Dorotovič</title>
    <description>The latest articles on DEV Community by Viktor Dorotovič (@vdorot).</description>
    <link>https://dev.to/vdorot</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F925013%2F7f234581-5fa9-4da5-a6f8-bc1452302c74.jpg</url>
      <title>DEV Community: Viktor Dorotovič</title>
      <link>https://dev.to/vdorot</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vdorot"/>
    <language>en</language>
    <item>
      <title>Partitioning InnoDB tables by time-based pseudo-sequential UUIDs</title>
      <dc:creator>Viktor Dorotovič</dc:creator>
      <pubDate>Tue, 11 Oct 2022 22:13:24 +0000</pubDate>
      <link>https://dev.to/vdorot/partitioning-innodb-tables-by-time-based-pseudo-sequential-uuids-523m</link>
      <guid>https://dev.to/vdorot/partitioning-innodb-tables-by-time-based-pseudo-sequential-uuids-523m</guid>
      <description>&lt;p&gt;Partitioning has multiple uses – spreading load onto multiple disks, cold storage of older data on cheaper disks, and probably others. Most importantly though, &lt;a href="https://mwidlake.wordpress.com/2009/10/29/partitions-are-not-for-performance/"&gt;partitions are not for performance&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The main use case I’m going to explain is time-based partitioning as a tool for limiting the scope of stored data. This could be required to comply with a data retention policy or simply to save money on disk space.&lt;/p&gt;

&lt;p&gt;A straightforward approach would be periodically issuing DELETE queries to get rid of rows that are older than a certain time period:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;time_of_measurement&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="s1"&gt;'2020-01-01'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This incurs a lot of I/O though. Each row has to be found, then removed from the page, while also updating each index pointing to the row.&lt;/p&gt;

&lt;p&gt;If the table was range-partitioned by a time column, whole chunks of rows could be removed in one go:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt; &lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_2019_11&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p_2019_12&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://mariadb.com/kb/en/partition-maintenance/#partition-maintenance-for-the-time-series-case"&gt;Dropping a partition is much faster than deleting individual rows&lt;/a&gt;. Each partition is essentially a separate table, so when it is dropped, the space occupied by the table can just be marked as unused, instead of updating data structures row-by-row.&lt;/p&gt;

&lt;h2&gt;
  
  
  Case in point
&lt;/h2&gt;

&lt;p&gt;As an example, I’m going to use a toy database for storing information about workshops and participants in IT conferences. There are going to be 2 entity classes, with a many-to-many relationship between them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Workshop&lt;/li&gt;
&lt;li&gt;Participant&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A workshop can be attended by multiple participants, and a participant can attend multiple workshops. For simplicity, the fact that a participant is not able to attend two workshops at the same time will not be checked.&lt;/p&gt;

&lt;p&gt;There are different data retention requirements – workshops will be stored for 5 years for historical reference, while participant records can be deleted after 2 years.&lt;/p&gt;

&lt;p&gt;Apart from listing the individual members of each entity class, there are 2 read operations that the setup will need to support:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Get all participants attending a given workshop&lt;/li&gt;
&lt;li&gt;Get all workshops that a given participant has signed up for&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I ran the first attempt at defining the table schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;workshops&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="n"&gt;AUTO_INCREMENT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;time_scheduled&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;RANGE&lt;/span&gt; &lt;span class="n"&gt;COLUMNS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time_scheduled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_default&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="k"&gt;MAXVALUE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;participants&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;participant_id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="n"&gt;AUTO_INCREMENT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;time_created&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;RANGE&lt;/span&gt; &lt;span class="n"&gt;COLUMNS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time_created&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_default&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="k"&gt;MAXVALUE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;workshop_participants&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;workshop_participant_id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="n"&gt;AUTO_INCREMENT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;participant_id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  
  &lt;span class="n"&gt;time_created&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;RANGE&lt;/span&gt; &lt;span class="n"&gt;COLUMNS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time_created&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_default&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="k"&gt;MAXVALUE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Uh-oh, MySQL is throwing an error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A PRIMARY KEY must include all columns in the table's partitioning function (prefixed columns are not considered).
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apparently, this limitation applies to any unique constraint (index), not just primary keys, as described in &lt;a href="https://dev.mysql.com/doc/refman/8.0/en/partitioning-limitations-partitioning-keys-unique-keys.html"&gt;MySQL’s documentation: Partitioning Keys, Primary Keys, and Unique Keys&lt;/a&gt;. I have not found any resource explaining why this is necessary though. &lt;a href="https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-LIMITATIONS"&gt;PostgreSQL has the same restriction&lt;/a&gt; and explains it as follows: “This limitation exists because the individual indexes making up the constraint can only directly enforce uniqueness within their own partitions; therefore, the partition structure itself must guarantee that there are no duplicates in different partitions.“&lt;/p&gt;

&lt;h2&gt;
  
  
  Working example #1, but not particularly good
&lt;/h2&gt;

&lt;p&gt;To get around this limitation, the primary keys can be defined as compound keys over the autoincremented ID and the &lt;code&gt;time_created&lt;/code&gt; field:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;workshops&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="n"&gt;AUTO_INCREMENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;time_scheduled&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time_scheduled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;RANGE&lt;/span&gt; &lt;span class="n"&gt;COLUMNS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time_scheduled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_future&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="k"&gt;MAXVALUE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;participants&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;participant_id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="n"&gt;AUTO_INCREMENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;time_created&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time_created&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;RANGE&lt;/span&gt; &lt;span class="n"&gt;COLUMNS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time_created&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_future&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="k"&gt;MAXVALUE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;workshop_participants&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;workshop_participant_id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="n"&gt;AUTO_INCREMENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;participant_id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  
  &lt;span class="n"&gt;time_created&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_participant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time_created&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;wp_i_participant_id&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;wp_i_workshop_id&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;RANGE&lt;/span&gt; &lt;span class="n"&gt;COLUMNS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time_created&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_future&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="k"&gt;MAXVALUE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The 2 required read operations will be served by these SELECT queries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Get all participants attending a given workshop&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time_created&lt;/span&gt; 
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;workshop_participants&lt;/span&gt; &lt;span class="n"&gt;wp&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;participants&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;wp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;wp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;;&lt;/span&gt;


&lt;span class="c1"&gt;-- Get all workshops that a given participant has signed up for&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time_scheduled&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;workshop_participants&lt;/span&gt; &lt;span class="n"&gt;wp&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;workshops&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;wp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;wp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are a couple of problems with this approach. Firstly, &lt;code&gt;workshop_id&lt;/code&gt; and &lt;code&gt;participant_id&lt;/code&gt; are being used as if they were unique, which are not actually guaranteed to be – it’s only guaranteed if the autoincrement default value is not overridden.&lt;/p&gt;

&lt;p&gt;Secondly, &lt;a href="https://dev.mysql.com/doc/refman/8.0/en/partitioning-pruning.html"&gt;partition pruning&lt;/a&gt; cannot be used. Partition pruning is an optimizer feature that uses existing information contained in the query to establish a subset of partitions that is guaranteed to contain all the data that’s being looked for. To put it another way, it prunes partitions that cannot contain any matching rows based on the WHERE condition.&lt;/p&gt;

&lt;p&gt;In this case, I can validate this by creating the partitions &lt;code&gt;p_2020_01&lt;/code&gt;, &lt;code&gt;p_2020_02&lt;/code&gt;, adding some sample data and looking at the output of &lt;code&gt;EXPLAIN&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;workshops&lt;/span&gt;
&lt;span class="n"&gt;REORGANIZE&lt;/span&gt; &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_future&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_old&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2020-01-01'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_2020_01&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2020-02-02'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_2020_02&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2020-02-03'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_future&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="k"&gt;MAXVALUE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;workshops&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time_scheduled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'UUIDs - when and how to use them'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2020-01-18 10:11'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Partitioning InnoDB tables'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2020-01-18 11:32'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Denormalizing database schemas for performance'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2020-02-05 14:30'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;participants&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time_created&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; 
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'john Doe'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2020-01-01 15:00'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;workshop_participants&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time_created&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2020-01-06 10:02'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2020-01-06 10:05'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;wp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;workshop_participants&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;wp&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;wp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Output:&lt;/span&gt;
&lt;span class="c1"&gt;-- | workshop_id |&lt;/span&gt;
&lt;span class="c1"&gt;-- | ----------- |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 1           |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 2           |&lt;/span&gt;

&lt;span class="c1"&gt;-- for MySQL &amp;lt; 5.8, use EXPLAIN PARTIIONS&lt;/span&gt;
&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time_scheduled&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;workshops&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Output:&lt;/span&gt;
&lt;span class="c1"&gt;-- | id  | select_type | table | partitions                         | type  | possible_keys | key     | ...&lt;/span&gt;
&lt;span class="c1"&gt;-- | --- | ----------- | ----- | ---------------------------------- | ----- | ------------- | ------- | ...&lt;/span&gt;
&lt;span class="c1"&gt;-- | 1   | SIMPLE      | w     | p_old,p_2020_01,p_2020_02,p_future | range | PRIMARY       | PRIMARY | ...&lt;/span&gt;



&lt;span class="c1"&gt;-- Try it yourself with DB Fiddle: https://www.db-fiddle.com/f/uJbWPwS5dU9XNUCQPBMuNs/0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The EXPLAIN output shows that all of the defined partitions need to be visited when evaluating the SELECt query, even though all the fetched rows are stored in a single partition.&lt;/p&gt;

&lt;p&gt;Note that instead of using a &lt;code&gt;JOIN&lt;/code&gt; or a &lt;code&gt;WHERE IN (SELECT ...)&lt;/code&gt; condition, the queries are separated into 2. The first one fetches a list of IDs from the bridging table, and the second includes a constant &lt;code&gt;IN(…)&lt;/code&gt; clause. It’s not clear to me at the time of writing, whether partition pruning can work with dynamically obtained values during query execution in InnoDB, I’ve not seen it mentioned anywhere. &lt;a href="https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITION-PRUNING:~:text=During%20actual%20execution%20of%20the%20query%20plan."&gt;PostgreSQL supports run-time pruning&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Working example #2, getting closer
&lt;/h2&gt;

&lt;p&gt;The proper solution would be using the whole compound keys as foreign keys in the bridging table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;workshop_participants&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;workshop_participant_id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="n"&gt;AUTO_INCREMENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;workshop_time_scheduled&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;NOt&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;participant_id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  
  &lt;span class="n"&gt;participant_time_created&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;time_created&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_participant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time_created&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;wp_i_workshop&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;workshop_time_scheduled&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;wp_i_participant&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;participant_time_created&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;RANGE&lt;/span&gt; &lt;span class="n"&gt;COLUMNS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time_created&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_future&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="k"&gt;MAXVALUE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;wp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;workshop_time_scheduled&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;workshop_participants&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;wp&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;wp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Output:&lt;/span&gt;
&lt;span class="c1"&gt;-- | workshop_time_scheduled | workshop_id |&lt;/span&gt;
&lt;span class="c1"&gt;-- | ----------------------- | ----------- |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 2020-01-18 10:11:00     | 1           |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 2020-01-18 11:32:00     | 2           |&lt;/span&gt;

&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time_scheduled&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;workshops&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time_scheduled&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2020-01-18 10:11:00'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2020-01-18 11:32:00'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;-- | id  | select_type | table | partitions | type  | possible_keys | key     | key_len | ref | rows | filtered | Extra       |&lt;/span&gt;
&lt;span class="c1"&gt;-- | --- | ----------- | ----- | ---------- | ----- | ------------- | ------- | ------- | --- | ---- | -------- | ----------- |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 1   | SIMPLE      | w     | p_2020_01  | range | PRIMARY       | PRIMARY | 13      |     | 2    | 100      | Using where |&lt;/span&gt;



&lt;span class="c1"&gt;--- Try it in DD Fiddle: https://www.db-fiddle.com/f/vQCCQFcqTj9dndNwPDmqVT/0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that the &lt;code&gt;WHERE ... IN&lt;/code&gt; pairs start with the value for the partitioning column, the query optimizer can remove non-matching partitions from the plan. This is quite unwieldy though and may be especially problematic when combined with an &lt;a href="https://en.wikipedia.org/wiki/Object%E2%80%93relational_mapping"&gt;object-relational mapper&lt;/a&gt; that may not support compound primary keys.&lt;/p&gt;

&lt;h2&gt;
  
  
  A better solution — using UUID7
&lt;/h2&gt;

&lt;p&gt;Apart from other &lt;a href="https://www.rfc-editor.org/rfc/rfc4122.html#section-2"&gt;UUID advantages over AUTO_INCREMENT&lt;/a&gt;, UUID7 is prefixed by the UNIX timestamp, which means that the lexicographic ordering of UUID7 values is the same as the order in which the IDs were generated in time. To get a better picture of the structure of UUID7, take a look at my previous post about UUID performance in indices. The timestamp prefix makes it possible to define discrete ranges of UUID7 values the same way time ranges would be defined. Be aware though, that UUID version 7 is defined as an Internet Draft at the time of writing, and may change.&lt;/p&gt;

&lt;p&gt;When comparing a time to a UUID7, equality comparisons cannot be used, because the random part of UUID7 may be any value. Only less-than or greater-than relations are useful.&lt;/p&gt;

&lt;p&gt;I made a function that can create a UUID7 from a custom datetime object and fills in the rest of the bytes with either 0x00 or 0xff depending on whether the value is going to be used as a lower or upper boundary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;uuid6&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uuid7&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;uuid7_time_boundary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;timestamp_ms&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;uuid_int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timestamp_ms&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="mh"&gt;0xFFFFFFFFFFFF&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;
    &lt;span class="n"&gt;fill_byte&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"00"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;lower&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="s"&gt;"ff"&lt;/span&gt;
    &lt;span class="n"&gt;uuid_int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;uuid_int&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fill_byte&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# using the UUID constructor from uuid6 library, which automatically populates the version and variant fields
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;uuid_int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;a href="https://pypi.org/project/uuid6/2022.6.25/"&gt;uuid6 library in (alpha) version 2022.6.25&lt;/a&gt; implements both uuid6 and uuid7 from the draft. It doesn’t match draft 4 exactly though — it &lt;a href="https://github.com/oittaa/uuid6-python/blob/2022.06.25/src/uuid6/__init__.py#L107"&gt;stores the submillisecond parts of the timestamp&lt;/a&gt; at the beginning of what’s supposed to be pseudorandom bits. The boundary values for comparison are only computed with 1-second precision, so they would work with either implementation.&lt;/p&gt;

&lt;h3&gt;
  
  
  UUID7 &amp;lt; time
&lt;/h3&gt;

&lt;p&gt;To find UUID7-s generated before 2022–04–03 10:30:00, we can just build a boundary UUID from this date and compare UUID objects directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;dt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strptime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"2022-04-03 10:30:00"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"%Y-%m-%d %H:%M:%S"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tzinfo&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;####################################################################################################################
&lt;/span&gt;    &lt;span class="c1"&gt;# Using an upper boundary with &amp;lt;=:
&lt;/span&gt;    &lt;span class="c1"&gt;####################################################################################################################
&lt;/span&gt;    &lt;span class="n"&gt;boundary_uuid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;uuid7_time_boundary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;earlier_uuid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;uuid7_time_boundary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dt&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;milliseconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# compare with earlier uuid
&lt;/span&gt;    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;%s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;== %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;earlier_uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;boundary_uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;earlier_uuid&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;boundary_uuid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="c1"&gt;# 017feef9-743f-7fff-bfff-ffffffffffff
&lt;/span&gt;    &lt;span class="c1"&gt;# &amp;lt;
&lt;/span&gt;    &lt;span class="c1"&gt;# 017feef9-7440-7000-8000-000000000000
&lt;/span&gt;    &lt;span class="c1"&gt;# == True
&lt;/span&gt;
    &lt;span class="n"&gt;later_uuid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'018ea382-283f-7bfc-addb-a6c06bf382da'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# 2024-04-03
&lt;/span&gt;
    &lt;span class="c1"&gt;# compare with later uuid
&lt;/span&gt;    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;%s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;== %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;later_uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;boundary_uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;later_uuid&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;boundary_uuid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="c1"&gt;# 018ea382-283f-7bfc-addb-a6c06bf382da
&lt;/span&gt;    &lt;span class="c1"&gt;# &amp;lt;
&lt;/span&gt;    &lt;span class="c1"&gt;# 017feef9-7440-7000-8000-000000000000
&lt;/span&gt;    &lt;span class="c1"&gt;# == False
&lt;/span&gt;
    &lt;span class="c1"&gt;####################################################################################################################
&lt;/span&gt;    &lt;span class="c1"&gt;# Using an upper boundary with &amp;lt;=:
&lt;/span&gt;    &lt;span class="c1"&gt;####################################################################################################################
&lt;/span&gt;
    &lt;span class="n"&gt;upper_boundary_uuid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;uuid7_time_boundary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dt&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;milliseconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;=&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;%s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;== %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;earlier_uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;upper_boundary_uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;earlier_uuid&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;upper_boundary_uuid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="c1"&gt;# 017feef9-743f-7fff-bfff-ffffffffffff
&lt;/span&gt;    &lt;span class="c1"&gt;# &amp;lt;=
&lt;/span&gt;    &lt;span class="c1"&gt;# 017feef9-743f-7fff-bfff-ffffffffffff
&lt;/span&gt;    &lt;span class="c1"&gt;# == True
&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;=&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;%s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;== %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;later_uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;upper_boundary_uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;later_uuid&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;upper_boundary_uuid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="c1"&gt;# 018ea382-283f-7bfc-addb-a6c06bf382da
&lt;/span&gt;    &lt;span class="c1"&gt;# &amp;lt;=
&lt;/span&gt;    &lt;span class="c1"&gt;# 017feef9-743f-7fff-bfff-ffffffffffff
&lt;/span&gt;    &lt;span class="c1"&gt;# == False
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Partition boundaries
&lt;/h3&gt;

&lt;p&gt;In the world of MySQL, the partitions’ limits can be defined by using hexadecimal constants starting with &lt;code&gt;0x&lt;/code&gt;. This is valid SQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;workshops&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="nb"&gt;BINARY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;-- UUIDs will be stored in compact binary form&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;time_scheduled&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time_scheduled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;RANGE&lt;/span&gt; &lt;span class="n"&gt;COLUMNS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="c1"&gt;-- monthly partitions&lt;/span&gt;
  &lt;span class="c1"&gt;-- for the LESS THAN value we're using the boundary UUID encoded as a hex constant (with dashes removed)&lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_old&lt;/span&gt;     &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016f5e66e80070008000000000000000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c1"&gt;-- &amp;lt; 2020-01-01 00:00:00&lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_2020_01&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016ffe0c0c0070008000000000000000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c1"&gt;-- &amp;lt; 2020-02-01 00:00:00&lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_2020_02&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x01709364780070008000000000000000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c1"&gt;-- &amp;lt; 2020-03-01 00:00:00  &lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_future&lt;/span&gt;  &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="k"&gt;MAXVALUE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;participants&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;participant_id&lt;/span&gt; &lt;span class="nb"&gt;BINARY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;time_created&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time_created&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;RANGE&lt;/span&gt; &lt;span class="n"&gt;COLUMNS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_old&lt;/span&gt;     &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016f5e66e80070008000000000000000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c1"&gt;-- &amp;lt; 2020-01-01 00:00:00&lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_2020_01&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016ffe0c0c0070008000000000000000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c1"&gt;-- &amp;lt; 2020-02-01 00:00:00&lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_2020_02&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x01709364780070008000000000000000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c1"&gt;-- &amp;lt; 2020-03-01 00:00:00  &lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_future&lt;/span&gt;  &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="k"&gt;MAXVALUE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;workshop_participants&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;workshop_participant_id&lt;/span&gt; &lt;span class="nb"&gt;BINARY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="nb"&gt;BINARY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;participant_id&lt;/span&gt; &lt;span class="nb"&gt;BINARY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  
  &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_participant_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;wp_i_workshop_id&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;wp_i_participant_id&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;RANGE&lt;/span&gt; &lt;span class="n"&gt;COLUMNS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_participant_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_old&lt;/span&gt;     &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016f5e66e80070008000000000000000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c1"&gt;-- &amp;lt; 2020-01-01 00:00:00&lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_2020_01&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016ffe0c0c0070008000000000000000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c1"&gt;-- &amp;lt; 2020-02-01 00:00:00&lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_2020_02&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x01709364780070008000000000000000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c1"&gt;-- &amp;lt; 2020-03-01 00:00:00  &lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_future&lt;/span&gt;  &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="k"&gt;MAXVALUE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Splitting the many-to-many bridging table
&lt;/h2&gt;

&lt;p&gt;In its current form, the bridging table is not great, because selecting from workshop_participants either by &lt;code&gt;workshop_id&lt;/code&gt; or by &lt;code&gt;participant_id&lt;/code&gt; would not use partition pruning, because none of these foreign keys are in the partitioning expression. As a solution, the bridging table can be split into two:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;workshop_participants&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="nb"&gt;BINARY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;participant_id&lt;/span&gt; &lt;span class="nb"&gt;BINARY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;time_created&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;-- partitioning condition is satisfied - the partitioning column is part of the primary key&lt;/span&gt;
&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;RANGE&lt;/span&gt; &lt;span class="n"&gt;COLUMNS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="c1"&gt;-- ...  &lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_future&lt;/span&gt;  &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="k"&gt;MAXVALUE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;participant_workshops&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;participant_id&lt;/span&gt; &lt;span class="nb"&gt;BINARY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="nb"&gt;BINARY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;time_created&lt;/span&gt; &lt;span class="nb"&gt;DATETIME&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;-- partitioning condition is satisfied - the partitioning column is part of the primary key&lt;/span&gt;
&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;RANGE&lt;/span&gt; &lt;span class="n"&gt;COLUMNS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="c1"&gt;-- ...  &lt;/span&gt;
  &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="n"&gt;p_future&lt;/span&gt;  &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="k"&gt;LESS&lt;/span&gt; &lt;span class="k"&gt;THAN&lt;/span&gt; &lt;span class="k"&gt;MAXVALUE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This technique also solves the problem of when partitions are going to be purged from the bridging table if the 2 connected tables have different data retention times. If one side is going to be deleted earlier than the other, there are going to be hanging rows in the other table of the split, but that’s bearable. Since actually defined foreign keys are not supported with partitioning the application logic must be written in such a way as to deal with the possibility of missing rows.&lt;/p&gt;

&lt;p&gt;As a side effect, the split tables will also probably take up less disk space than the original combined one, because instead of 3 indexes — 1 clustered and two secondary indexes for the foreign keys, there are now only 2 clustered indexes — 1 for each table. I haven’t checked it though.&lt;/p&gt;

&lt;p&gt;The SELECTs for the required read operations also need to be updated. Selecting all participants attending a given workshop will be accomplished as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;HEX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;workshop_participants&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016fb86c8f807d1d9a8960a15b7eb63e&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Output:&lt;/span&gt;
&lt;span class="c1"&gt;-- | HEX(participant_id)              |&lt;/span&gt;
&lt;span class="c1"&gt;-- | -------------------------------- |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 016D87D5D180713484C5D4E471CCA418 |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 016F619EE18071D085FD3968AA46EF0B |&lt;/span&gt;
&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;HEX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;workshop_participants&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016fb86c8f807d1d9a8960a15b7eb63e&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;-- Explain:&lt;/span&gt;
&lt;span class="c1"&gt;-- | id  | select_type | table                 | partitions | type | possible_keys | key     | key_len | ref   | rows | filtered | Extra                    |&lt;/span&gt;
&lt;span class="c1"&gt;-- | --- | ----------- | --------------------- | ---------- | ---- | ------------- | ------- | ------- | ----- | ---- | -------- | ------------------------ |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 1   | SIMPLE      | workshop_participants | p_2020_01  | ref  | PRIMARY       | PRIMARY | 16      | const | 2    | 100      | Using where; Using index |&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;HEX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time_created&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;participants&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;participant_id&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016D87D5D180713484C5D4E471CCA418&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016F619EE18071D085FD3968AA46EF0B&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Output:&lt;/span&gt;
&lt;span class="c1"&gt;-- | HEX(participant_id)              | name     | time_created        |&lt;/span&gt;
&lt;span class="c1"&gt;-- | -------------------------------- | -------- | ------------------- |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 016D87D5D180713484C5D4E471CCA418 | Jane Doe | 2019-10-01 15:00:00 |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 016F619EE18071D085FD3968AA46EF0B | john Doe | 2020-01-01 15:00:00 |&lt;/span&gt;

&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;HEX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;participant_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time_created&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;participants&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;participant_id&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016D87D5D180713484C5D4E471CCA418&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016F619EE18071D085FD3968AA46EF0B&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- Explain&lt;/span&gt;
&lt;span class="c1"&gt;-- | id  | select_type | table        | partitions      | type  | possible_keys | key     | key_len | ref | rows | filtered | Extra       |&lt;/span&gt;
&lt;span class="c1"&gt;-- | --- | ----------- | ------------ | --------------- | ----- | ------------- | ------- | ------- | --- | ---- | -------- | ----------- |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 1   | SIMPLE      | participants | p_old,p_2020_01 | range | PRIMARY       | PRIMARY | 16      |     | 2    | 100      | Using where |&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To get all workshops that a given participant has signed up for, the other table of the split many-to-many bridge is accessed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;HEX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;participant_workshops&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;participant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016f619ee18071d085fd3968aa46ef0b&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Output:&lt;/span&gt;
&lt;span class="c1"&gt;-- | HEX(workshop_id)                 |&lt;/span&gt;
&lt;span class="c1"&gt;-- | -------------------------------- |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 016FB82267207367A7E6E580C4871E32 |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 016FB86C8F807D1D9A8960A15B7EB63E |&lt;/span&gt;

&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;HEX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;participant_workshops&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;participant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016f619ee18071d085fd3968aa46ef0b&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Explain:&lt;/span&gt;
&lt;span class="c1"&gt;-- | id  | select_type | table                 | partitions | type | possible_keys | key     | key_len | ref   | rows | filtered | Extra                    |&lt;/span&gt;
&lt;span class="c1"&gt;-- | --- | ----------- | --------------------- | ---------- | ---- | ------------- | ------- | ------- | ----- | ---- | -------- | ------------------------ |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 1   | SIMPLE      | participant_workshops | p_2020_01  | ref  | PRIMARY       | PRIMARY | 16      | const | 2    | 100      | Using where; Using index |&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;HEX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time_scheduled&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;workshops&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016FB82267207367A7E6E580C4871E32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016FB86C8F807D1D9A8960A15B7EB63E&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Output:&lt;/span&gt;
&lt;span class="c1"&gt;-- | HEX(workshop_id)                 | name                             | time_scheduled      |&lt;/span&gt;
&lt;span class="c1"&gt;-- | -------------------------------- | -------------------------------- | ------------------- |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 016FB82267207367A7E6E580C4871E32 | UUIDs - when and how to use them | 2020-01-18 10:11:00 |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 016FB86C8F807D1D9A8960A15B7EB63E | Partitioning InnoDB tables       | 2020-01-18 11:32:00 |&lt;/span&gt;

&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;workshop_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time_scheduled&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;workshops&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;workshop_id&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016FB82267207367A7E6E580C4871E32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;x016FB86C8F807D1D9A8960A15B7EB63E&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Explain:&lt;/span&gt;
&lt;span class="c1"&gt;-- | id  | select_type | table     | partitions | type  | possible_keys | key     | key_len | ref | rows | filtered | Extra       |&lt;/span&gt;
&lt;span class="c1"&gt;-- | --- | ----------- | --------- | ---------- | ----- | ------------- | ------- | ------- | --- | ---- | -------- | ----------- |&lt;/span&gt;
&lt;span class="c1"&gt;-- | 1   | SIMPLE      | workshops | p_2020_01  | range | PRIMARY       | PRIMARY | 16      |     | 2    | 100      | Using where |&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All of the above queries take advantage of partition pruning. &lt;/p&gt;

&lt;p&gt;The sample data and all queries are available for playing around with at &lt;a href="https://www.db-fiddle.com/f/xs4a6RCDzSa5vJUCC5L64e/0"&gt;DB Fiddle&lt;/a&gt;, or as a &lt;a href="https://www.toomanyafterthoughts.com/wp-content/uploads/2022/09/uuid_partitioning.zip"&gt;downloadable .sql file&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hiccups
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Listing partitions in INFORMATION_SCHEMA.PARTITIONS
&lt;/h3&gt;

&lt;p&gt;The read-only view can be used to list all partitions of a given table. For range partitions, the boundary value is included in the PARTITION_DESCRIPTION field. This works fine for datetime fields but binary values are garbled. If you need to read back this value, it’s better to manually encode it in the partition’s name.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;PARTITION_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PARTITION_ORDINAL_POSITION&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PARTITION_DESCRIPTION&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;INFORMATION_SCHEMA&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PARTITIONS&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;TABLE_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'workshops'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Output MySQL 5.7:&lt;/span&gt;
&lt;span class="c1"&gt;-- | PARTITION_NAME | PARTITION_ORDINAL_POSITION | PARTITION_DESCRIPTION |&lt;/span&gt;
&lt;span class="c1"&gt;-- | -------------- | -------------------------- | --------------------- |&lt;/span&gt;
&lt;span class="c1"&gt;-- | p_old          | 1                          | 'o^f                 |&lt;/span&gt;
&lt;span class="c1"&gt;-- | p_2020_01      | 2                          | 'o                   |&lt;/span&gt;
&lt;span class="c1"&gt;-- | p_2020_02      | 3                          | 'p                   |&lt;/span&gt;
&lt;span class="c1"&gt;-- | p_future       | 4                          | MAXVALUE              |&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: It’s been fixed in MySQL 8.0:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;PARTITION_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PARTITION_ORDINAL_POSITION&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PARTITION_DESCRIPTION&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;INFORMATION_SCHEMA&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PARTITIONS&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;TABLE_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'workshops'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;-- Output MySQL 8.0:&lt;/span&gt;

&lt;span class="c1"&gt;-- | PARTITION_NAME | PARTITION_ORDINAL_POSITION | PARTITION_DESCRIPTION                      |&lt;/span&gt;
&lt;span class="c1"&gt;-- | -------------- | -------------------------- | ------------------------------------------ |&lt;/span&gt;
&lt;span class="c1"&gt;-- | p_future       | 4                          | MAXVALUE                                   |&lt;/span&gt;
&lt;span class="c1"&gt;-- | p_2020_02      | 3                          | _binary 0x01709364780070008000000000000000 |&lt;/span&gt;
&lt;span class="c1"&gt;-- | p_2020_01      | 2                          | _binary 0x016FFE0C0C0070008000000000000000 |&lt;/span&gt;
&lt;span class="c1"&gt;-- | p_old          | 1                          | _binary 0x016F5E66E80070008000000000000000 |&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ORDER BY partitioning_col LIMIT N
&lt;/h3&gt;

&lt;p&gt;A typical query for this would be “Get me the latest 10 items”. When the table is partitioned by time, it would make sense to use the implicit ordering or partitions and start from the latest partition. If fewer than 10 items were found, then continue onto the next partition, and so on. After 10 items were found, the query can immediately finish. AFAIK this is not implemented in MYSQL, instead, it will access all partitions and do a merge sort. PostgreSQL has an improved query planner and &lt;a href="https://www.2ndquadrant.com/en/blog/postgresql-12-partitioning/#:~:text=Ordered%20partition%20scans"&gt;supports ordered partition scans since version 12&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Other DB Engines
&lt;/h2&gt;

&lt;p&gt;I haven’t tested this on DB engines other than MySQL (with InnoDB storage), but the basic principle should work. It all depends on whether binary sequences can be used as partition boundaries. I’d be curious to know if you’re using a similar approach, as well as what works and what doesn’t.&lt;/p&gt;

</description>
      <category>database</category>
      <category>mysql</category>
      <category>postgres</category>
      <category>architecture</category>
    </item>
    <item>
      <title>UUIDs Are Bad for Database Index Performance, enter UUID7!</title>
      <dc:creator>Viktor Dorotovič</dc:creator>
      <pubDate>Fri, 07 Oct 2022 22:10:14 +0000</pubDate>
      <link>https://dev.to/vdorot/uuids-are-bad-for-database-index-performance-enter-uuid7-4pbj</link>
      <guid>https://dev.to/vdorot/uuids-are-bad-for-database-index-performance-enter-uuid7-4pbj</guid>
      <description>&lt;p&gt;UUIDs, Universal Unique Identifiers, are a specific form of identifier designed to be unique even when generated on multiple machines. Compared to autoincremented sequential identifiers commonly used in relational databases, generating does not require centralized storage of the current state, I.e., the identifiers that have already been allocated. This is useful when a centralized system would pose a performance bottleneck or a single point of failure. UUIDs are designed to be able to support very high allocation rates, up to 10 million per second per machine.&lt;/p&gt;

&lt;p&gt;Despite the fact that some types (eg., UUID4) are not guaranteed to be unique by their method of generation, the chance of generating two conflicting UUIDs is very low. This is also due to the fact that UUIDs are 128 bits long.&lt;/p&gt;

&lt;p&gt;UUIDs were formally defined as an &lt;a href="https://datatracker.ietf.org/doc/rfc4122/" rel="noopener noreferrer"&gt;Internet Draft in 2002&lt;/a&gt;, which was promoted to a Proposed Standard as &lt;a href="https://www.rfc-editor.org/rfc/rfc4122.html" rel="noopener noreferrer"&gt;RFC 4122&lt;/a&gt;. At the time of writing, 17 years later, it still has the status of a proposal.&lt;/p&gt;

&lt;h2&gt;
  
  
  Representation &amp;amp; structure
&lt;/h2&gt;

&lt;p&gt;UUIDs are just 128 bit numbers, or 16-byte long binary sequences, and can be stored as such for efficiency. In API calls, they are commonly transferred in a text format of hexadecimal sequenced separated by dashes in the pattern of 8-4-4-4-12. This mirrors the internal structure as defined by the RFC:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          time_low                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       time_mid                |         time_hi_and_version   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |clk_seq_hi_res |  clk_seq_low  |         node (0-1)            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         node (2-5)                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  UUID versions
&lt;/h2&gt;

&lt;p&gt;Even though the layout looks like a single schema for UUID generation, there are multiple versions of how the individual fields are populated – UUID1 to UUID5. The version is actually encoded in the UUID itself and makes up the most significant 4 bits of the &lt;code&gt;time_hi_and_version&lt;/code&gt; field.&lt;/p&gt;

&lt;h3&gt;
  
  
  UUID1 - time-based
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo177d1mys9mlz5su4fir.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo177d1mys9mlz5su4fir.png" alt="The layout of UUID1 in the text representation"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  UUID2 – DCE security version
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flkquba8donkcepuzkyoj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flkquba8donkcepuzkyoj.png" alt="The layout of UUID2 in the text representation"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The specification of this version is available at &lt;a href="https://pubs.opengroup.org/onlinepubs/9696989899/chap5.htm#tagcjh_08_02_01_01" rel="noopener noreferrer"&gt;https://pubs.opengroup.org/onlinepubs/9696989899/chap5.htm#tagcjh_08_02_01_01&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  UUID3 – name-based, MD5
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5evbpw5xdp0pnslajw7l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5evbpw5xdp0pnslajw7l.png" alt="The layout of UUID3 in text text representation"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  UUID4 – random
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs60l7fczf2ttwpuoyrzc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs60l7fczf2ttwpuoyrzc.png" alt="The layout of UUID4 in the text representation"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  UUID5 – name-based, SHA1
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fulwb49ab0svx5byo81sq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fulwb49ab0svx5byo81sq.png" alt="The layout of UUID5 in the text representation&amp;lt;br&amp;gt;
"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  UUIDs and B-tree indices
&lt;/h2&gt;

&lt;p&gt;As described in the &lt;a href="https://dev.to/vdorot/choose-the-right-primary-key-to-save-a-large-amount-of-disk-io-1o1l"&gt;post comparing random and sequential primary keys&lt;/a&gt;, B-tree indexes are well suited for inserting keys when values are increasing and not so much when the values are randomly distributed. Notice something about UUIDs? &lt;strong&gt;Neither version is designed to generate increasing values&lt;/strong&gt;, where the sort order is similar to how the UUIDs were generated in time. No, not even UUID1. &lt;strong&gt;UUID1 is based on time, but the timestamp is split up into multiple segments and they are in the wrong order&lt;/strong&gt; in the UUID structure. The least significant bits – the bits that have the highest change frequency are located at the start of the UUID stream – as the &lt;code&gt;time_low&lt;/code&gt; field.&lt;/p&gt;

&lt;h2&gt;
  
  
  UUID1 vs UUID4
&lt;/h2&gt;

&lt;p&gt;Commonly used UUID versions are 1 and 4 because they can be generated “out of thin air”, without requiring some custom values (namespace, name).&lt;br&gt;
Considering that none of them is increasing, there should be little difference in insert performance between UUID1 and UUID4.&lt;/p&gt;

&lt;p&gt;To test this hypothesis, I extended the &lt;a href="https://github.com/vdorot/primary_key_io" rel="noopener noreferrer"&gt;test rig&lt;/a&gt; developed for &lt;a href="https://dev.to/vdorot/choose-the-right-primary-key-to-save-a-large-amount-of-disk-io-1o1l"&gt;comparing random and sequential integer IDs&lt;/a&gt;. The script inserts equal-sized chunks of records into a table and tracks the I/O write volume of the DB engine generated by the inserts. The I/O measurement is done by running the DB engine in a Docker container and polling &lt;a href="https://docs.docker.com/engine/api/v1.41/#tag/Container/operation/ContainerStats" rel="noopener noreferrer"&gt;Docker’s stats API&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For this comparison, I’ve used &lt;a href="https://www.toomanyafterthoughts.com/primary-key-random-sequential-performance/#htoc-sqlite" rel="noopener noreferrer"&gt;SQLite in clustered mode&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F73zpisvybsevb89wxlyl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F73zpisvybsevb89wxlyl.png" alt="Written bytes to disk (cumulative), SQLite, clustered index"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Umm, nope. Hypothesis busted. UUID1 performs much better.&lt;/p&gt;

&lt;p&gt;The reason is that while overall, uuid1 isn’t ordered by time, it is increasing in short periods until time_low rolls over and starts from 0 again. The time it takes time_low to roll over is about 7 minutes and 9 seconds. After this period, filled B-tree pages will be revisited again and page splitting will occur.&lt;/p&gt;

&lt;p&gt;Inserting 1M records during the test took about 33 minutes, so this would mean only 4 rollovers. In order to show how it would perform on a longer timescale, I created a modified version of UUID for which time passes 100× faster – &lt;code&gt;uuid1_fastrollover&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmtfwj3lz1yhb6cxf7dlf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmtfwj3lz1yhb6cxf7dlf.png" alt="Written bytes to disk (cumulative), SQLite, clusteed index"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Designing efficient UUIDs
&lt;/h2&gt;

&lt;p&gt;What if a new UUID version could be designed that would take the randomness of UUID4 and combine it with a timestamp prefix? This would make the UUID increase overall, but not locally – due to the random postfix. The random part ensures uniqueness when a high generation rate is necessary and also makes the UUIDs hard to predict – it’s not possible to guess the previous, or next UUID. It’s fairly simple to devise a custom UUID scheme, but fortunately, there is a new Internet-Draft (at the time of writing) defining new pseudo-sequential UUID versions that aim to solve exactly this issue: &lt;a href="https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format-04" rel="noopener noreferrer"&gt;draft-peabody-dispatch-new-uuid-format-04&lt;/a&gt;. The current state and progress can be viewed at &lt;a href="https://datatracker.ietf.org/doc/draft-peabody-dispatch-new-uuid-format/" rel="noopener noreferrer"&gt;IETF Datatracker&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format#section-5.1" rel="noopener noreferrer"&gt;UUID 6&lt;/a&gt; – field-compatible version of UUID1
&lt;/h3&gt;

&lt;p&gt;The original field structure is kept, but the timestamp fields are shuffled around so that the whole timestamp is in the correct order – starting with the most significant bits and ending with the least significant ones. This way, consecutively generated UUIDs are always increasing in value.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbg58jhfi7cmuq5mqnr5z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbg58jhfi7cmuq5mqnr5z.png" alt="The layout of UUID6 in its text representation"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format#section-5.2" rel="noopener noreferrer"&gt;UUID 7&lt;/a&gt; – time-ordered
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flfrnjfsvlwmiw9hq04gz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flfrnjfsvlwmiw9hq04gz.png" alt="The layout of UUID7 in its text representation"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is is the go-to version for use in newly developed systems, where forward compatibility is not necessary. Notice that UUID1 uses a Gregorian epoch timestamp, while UUID7 is defined to use the Unix epoch timestamp.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format#section-5.3" rel="noopener noreferrer"&gt;UUID 8&lt;/a&gt; – custom, vendor-specific
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foyu0o72jijopakvn2fxf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foyu0o72jijopakvn2fxf.png" alt="The layout of UUID8 in its text representation"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I have included this version just for completeness – it is in the draft, but here the ordering will clearly depend on the custom data, so this version is not going to be tested.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing &amp;amp; comparison
&lt;/h2&gt;

&lt;p&gt;Implementation notes – which Python library was used for UUID7 and UUID8?&lt;/p&gt;

&lt;p&gt;I’ve found a library on PyPI that appears to implement UUID7 – uuid7, but upon further inspection of (the only) version 0.1.0, it does not seem to match the structure defined in the draft at the time of writing.&lt;/p&gt;

&lt;p&gt;Confusingly enough, there is a package named &lt;a href="https://pypi.org/project/uuid6/" rel="noopener noreferrer"&gt;uuid6&lt;/a&gt; that implements both UUID6 and UUID7 and seems to follow the draft. I used it for the test.&lt;/p&gt;

&lt;h3&gt;
  
  
  InnoDB – clustered index
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8pvrut1eub60m6dh4zd8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8pvrut1eub60m6dh4zd8.png" alt="UUID versions tested on MariaDB clustered index"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;UUID6 and UUID7 perform equally well, as expected.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3u7wm4yazict2bwonp3b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3u7wm4yazict2bwonp3b.png" alt="total written bytes - UUID1_fastrollover, UUID4, UUID6, UUID7 compared"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  PostgreSQL – non-clustered index
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx54bi79cc9t721fwm5hm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx54bi79cc9t721fwm5hm.png" alt="UUID versions compared on PostgreSQL non-clustered index"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When using Postgres’s non-clustered index, there is a similar overhead for UUID4 over UUID6/7. The I/O is an order of magnitude lower than for InnoDB though.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fep54pvnr77oynthpn9mk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fep54pvnr77oynthpn9mk.png" alt="Total written bytes compared on PostgreSQL - UUID_fastrollover, UUID4, UUID6, UUID7"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;When comparing UUID1 and UUID4 from the original RFC, UUID1 performs much better even though the time fields are in the wrong order, this is because the low bits still take about 7 minutes to roll over, and during this time the UUID value is increasing.&lt;/p&gt;

&lt;p&gt;UUID6 and UUID7 from the draft take this further and keep the ordering of generated values over the full timeline. Even though the versions are not yet standardized, there are existing libraries, at least for Python, that follow the draft. When designing a new database, UUID7 is the go-to option and would perform well in most database engines.&lt;/p&gt;

&lt;p&gt;If you’re working on a system that already generates UUID1 or UUID4 but you can still choose a DB engine that supports non-clustered indexes like PostgreSQL, then it is a viable option. There is a performance hit, but much less pronounced compared to InnoDB’s clustered index. Microsoft SQL Server also &lt;a href="https://learn.microsoft.com/en-us/sql/relational-databases/indexes/clustered-and-nonclustered-indexes-described?view=sql-server-ver16" rel="noopener noreferrer"&gt;supports both index types&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Alternative solutions &amp;amp; related studies&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/oklog/ulid" rel="noopener noreferrer"&gt;Universally Unique Lexicographically Sortable Identifier&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.percona.com/blog/2019/11/22/uuids-are-popular-but-bad-for-performance-lets-discuss/" rel="noopener noreferrer"&gt;UUIDs are Popular, but Bad for Performance — Let’s Discuss&lt;/a&gt; – Percona Database Performance Blog&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.percona.com/blog/2015/04/03/illustrating-primary-key-models-in-innodb-and-their-impact-on-disk-usage/" rel="noopener noreferrer"&gt;Illustrating Primary Key models in InnoDB and their impact on disk usage&lt;/a&gt; – Percona Database Performance&lt;br&gt;
I dig the straightforward graphical view of InnoDB page allocation.&lt;/p&gt;

&lt;p&gt;MySQL 8 can convert UUID1 to binary and shuffle the time fields to the “correct” order:&lt;br&gt;
&lt;a href="https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_uuid-to-bin" rel="noopener noreferrer"&gt;UUID_TO_BIN(string_uuid, swap_flag)&lt;/a&gt;&lt;br&gt;
&lt;a href="https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_bin-to-uuid" rel="noopener noreferrer"&gt;BIN_TO_UUID(binary_uuid, swap_flag)&lt;/a&gt;&lt;br&gt;
&lt;a href="http://mysql.rjweb.org/doc.php/uuid#mysql_8_0" rel="noopener noreferrer"&gt;Rick James: GUID/UUID Performance Breakthrough&lt;/a&gt;&lt;/p&gt;

</description>
      <category>database</category>
      <category>performance</category>
      <category>webdev</category>
      <category>news</category>
    </item>
    <item>
      <title>Choose the right primary key to save a large amount of disk I/O</title>
      <dc:creator>Viktor Dorotovič</dc:creator>
      <pubDate>Wed, 05 Oct 2022 11:22:17 +0000</pubDate>
      <link>https://dev.to/vdorot/choose-the-right-primary-key-to-save-a-large-amount-of-disk-io-1o1l</link>
      <guid>https://dev.to/vdorot/choose-the-right-primary-key-to-save-a-large-amount-of-disk-io-1o1l</guid>
      <description>&lt;p&gt;Imagine you’re working in a large book warehouse and in charge of new arrivals. There’s a separate, digital system for metadata like authors, categories, etc., so the only information you’ll use during storage is the inventory number. Each book is identified by a unique number with many digits and all books must be findable by their number. To make handling quicker, books are packed in boxes, ordered by the inventory number. When looking for a book, the box must be identified first. Therefore, each box is labeled by the lowest inventory number it contains and the first number that’s in the next box.&lt;/p&gt;

&lt;p&gt;Your job in the arrivals room is to pick up books-to-be-stored one by one, assign them a new inventory number in the metadata system, label them by number, and put them in a box as mentioned before. Now, the room is quite small and if you run out of space, you’ll need to move the filled boxes into the basement, which might be two floors down.&lt;/p&gt;

&lt;p&gt;The way new inventory numbers are created is important here, and we’ll consider two of the many methods to do it.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Roll the dice and generate a completely random number
&lt;/h2&gt;

&lt;p&gt;Considering that the inventory number has many digits, the chance of two books getting the same number is minimal, so this method would work fine. After a book is given its number, it needs to be stored in the correct spot in the correct box. However, because the numbers are generated across the whole range, the book likely needs to go into a box that was already removed from the arrivals room. Therefore, the box will need to be fetched from the basement, and the book placed in it. If the box is already full, it will need to be split into two boxes, each of which will now be about half-empty even after placing the new book. The same process is repeated for each subsequent book.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Use a counter and just increase the number by 1 for each book
&lt;/h2&gt;

&lt;p&gt;In this case, boxes can just be filled from start to end, and their order is guaranteed. If a box becomes full, the next books are put into another empty box, until it becomes full as well. The process then repeats, and there’s no need to fetch anything from the basement.&lt;/p&gt;

&lt;p&gt;Which one of these two methods will allow the handling of more books per day with the same size arrivals room and a single worker? In this case, the one with the sequential numbers, obviously.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why are random identifiers even a thing?
&lt;/h2&gt;

&lt;p&gt;Generating random inventory numbers might sound absurd, but there is a security advantage.&lt;/p&gt;

&lt;p&gt;Let’s assume that the only information required for fetching a book is its number, and there are multiple clients storing books. If the numbers are sequential, a client might get crafty and try to fetch a book that doesn’t belong to him, by taking his highest inventory number and just adding 1. If the book was not the last in storage, the misbehaving client is guaranteed to get a book that is not their own. In the random system, a client can still guess inventory numbers, but there’s a low chance of getting a correct hit, because there are large gaps between individual inventory numbers, even when they’re in order (remember that the numbers have many digits). &lt;sup id="fnref1"&gt;1&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Another advantage is that random inventory numbers do not give out information about how many books were processed in a given time period.&lt;/p&gt;

&lt;h2&gt;
  
  
  Indexes in relational databases
&lt;/h2&gt;

&lt;p&gt;The box system as described previously is an analogy for the leaf-level nodes of a &lt;a href="https://en.wikipedia.org/wiki/B-tree" rel="noopener noreferrer"&gt;B-tree&lt;/a&gt;, a data structure commonly used to create lookup indices in relational databases. A B-tree is not limited to one level though. The “boxes” are put into larger boxes with the range also written on them. In this way, not every box needs to be examined when doing a lookup. If there is enough data, the 2nd level can be put into even larger boxes and so on, until there is only a single “root” box that ranges over all of the identifiers in storage.&lt;br&gt;
The problem of inserting records into random places in the leaf nodes still holds though.&lt;/p&gt;

&lt;p&gt;When looking at how storage engines are implemented, there are two ways to organize rows in a table’s data space (tablespace):&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Non-clustered index
&lt;/h3&gt;

&lt;p&gt;The rows in a table using a non-clustered index are written into the tablespace in the order they are received, so there’s no need to access old(er) pages. If it’s necessary to quickly find a row by an identifier, a separate index structure is built that only contains the index values and a pointer to the location of the row in the main data space.&lt;/p&gt;
&lt;h3&gt;
  
  
  2. Clustered index
&lt;/h3&gt;

&lt;p&gt;All the data is held inside the structure of the primary index. If the access pattern is random, when the pages are retrieved from the disk, they will mostly contain actual record data and not just index values. Data from existing records is useless when inserting new records, but it still takes up space in the page cache. This means that a lesser portion of the table’s rows fit in the cache, compared to when only the index values are stored in the B tree. This leads to more cache misses.&lt;/p&gt;

&lt;p&gt;While the random access pattern is not efficient for non-clustered (secondary) indices, it is especially inefficient when the index values only make up a small part of the full row and a clustered index is used.&lt;/p&gt;
&lt;h2&gt;
  
  
  Experiment and measure!
&lt;/h2&gt;

&lt;p&gt;To demonstrate, I’ve written a smallish &lt;a href="https://github.com/vdorot/primary_key_io" rel="noopener noreferrer"&gt;test rig&lt;/a&gt; that supports three database engines: SQLite, MariaDB (MySQL) and PostgreSQL.&lt;/p&gt;

&lt;p&gt;The test script fills a table with the following structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+--------------+--------------+--------------------------+
|    Column    |     Type     |         Comment          |
+--------------+--------------+--------------------------+
| id           | BIGINT       | Primary key              |
| time_created | VARCHAR (64) | Filler data, unimportant |
| data         | TEXT (10240) | Random data              |
+--------------+--------------+--------------------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;To minimize the impact of data generation and other processes running on the test machine, the main metric I’ve chosen to monitor is the bulk of I/O operations. Compared to a time-based metric, such as sustained inserts per second, it is much more stable.&lt;/p&gt;

&lt;p&gt;While disk I/O might not be the bottleneck in all scenarios, computing (CPU) power and network connectivity also matters.&lt;/p&gt;

&lt;p&gt;I’ve opted to start the database instances inside Docker containers, so the I/O can be measured using &lt;a href="https://docs.docker.com/engine/api/v1.41/#tag/Container/operation/ContainerStats" rel="noopener noreferrer"&gt;Docker’s stats API&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  SQLite
&lt;/h2&gt;

&lt;p&gt;The first tests were done using SQLite, because it supports both options. By default, it uses non-clustered indexes; however, a clustered index can be created for the primary by using the &lt;code&gt;WITHOUT ROWID&lt;/code&gt; keyword during table definition.&lt;/p&gt;

&lt;p&gt;For each database engine, the page size was set to 16KB, and the page cache size to 10MB.&lt;/p&gt;

&lt;p&gt;In the case of SQLite this meant setting the following PRAGMAs:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PRAGMA page_size = 16384; PRAGMA cache_size = -10240;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Non-clustered index
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frprea6cafv1ho9pbqw7a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frprea6cafv1ho9pbqw7a.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Clustered index
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5lkg0bzikmqt9694ewgo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5lkg0bzikmqt9694ewgo.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Whoa!&lt;/p&gt;

&lt;p&gt;Both clustered and non-clustered in the same plot for comparison:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foxm0bkakwo3nims4ra10.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foxm0bkakwo3nims4ra10.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Graphing writes per one row inserted
&lt;/h3&gt;

&lt;p&gt;By differentiating the series for write volume, we can calculate the write volume needed for a single inserted row.&lt;/p&gt;

&lt;p&gt;The data needed some smoothing, but the writes-per-one-row seem to grow logarithmically with the size of the table:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frf1qldq3fxmpw4v4fufj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frf1qldq3fxmpw4v4fufj.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The random primary key’s performance (red) seems to match the &lt;a href="https://en.wikipedia.org/wiki/B-tree" rel="noopener noreferrer"&gt;B-tree&lt;/a&gt;‘s expected insert time complexity of O(log n). Still, we can do much better because pages can be kept in the cache during filling and written out only once (yellow).&lt;/p&gt;
&lt;h2&gt;
  
  
  MySQL (InnoDB)
&lt;/h2&gt;

&lt;p&gt;The simpler MyISAM storage engine in MySQL is quite limited and doesn’t support transactions. In the current version, InnoDB is the default storage engine and sadly, &lt;a href="https://dev.mysql.com/doc/refman/5.7/en/innodb-index-types.html" rel="noopener noreferrer"&gt;each table is built around a clustered index&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The following config variables were used to keep the comparisons fair:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;innodb_page_size=16k
innodb_buffer_pool_size=10M
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmzn2f8w870riahnhmbud.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmzn2f8w870riahnhmbud.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8q4gcd3zf220kegulw2q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8q4gcd3zf220kegulw2q.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As we can see, InnoDB performs similarly to SQLite’s clustered index.&lt;/p&gt;
&lt;h2&gt;
  
  
  PostgreSQL
&lt;/h2&gt;

&lt;p&gt;In the case of PostgreSQL, clustered indices are not supported. This can be confusing because a &lt;a href="https://www.postgresql.org/docs/current/sql-cluster.html" rel="noopener noreferrer"&gt;CLUSTER command&lt;/a&gt; can be issued, but it is only a one-time operation and will not affect continuous writes.&lt;/p&gt;

&lt;p&gt;The impact of random IDs should be much lower and should be similar to SQLite’s default non-clustered index.&lt;/p&gt;

&lt;p&gt;As for the setup, PostgreSQL can only change its page size (block size) as a compilation parameter, so I kept the default value of 8KB. I set the page cache (shared_buffers) and WAL size to 10MB by passing the following arguments to the postgres executable:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-c shared_buffers=1280 -c wal_buffers=1280
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyz5qh4rnyw6u6m0fo7a9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyz5qh4rnyw6u6m0fo7a9.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Quite a surprise! It seems that Postgres handles random values better than sequential ones. On a larger scale, though, this relationship switches back to the expected one, where random values “lose:”&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2pdumdpyepjvjztj9ln7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2pdumdpyepjvjztj9ln7.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When taking the differentiated writes (per row) on a larger table, the overhead of random values is visible as well:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frw50ge98gf83ufhpxi2t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frw50ge98gf83ufhpxi2t.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The difference is obviously less significant than on a clustered index, but it is still there.&lt;/p&gt;

&lt;p&gt;If you want to use a random identifier or have an existing back-end application that generates random id-s, and you are free to choose a relational database, PostgreSQL will meet your needs.&lt;/p&gt;
&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;When using a clustered index in a table that is not purely composed of index values, generating completely random values for the primary key across the full set of possible values will lead to many disk I/O operations because the page cache cannot be used effectively.&lt;/p&gt;

&lt;p&gt;Note: A clustered index is still useful when scanning a large number of rows on a range of the primary key &lt;sup id="fnref2"&gt;2&lt;/sup&gt;, but this is unlikely to be needed with completely random identifiers.&lt;/p&gt;
&lt;h2&gt;
  
  
  How much I/O can be saved?
&lt;/h2&gt;

&lt;p&gt;The number of saved I/O operations is not constant and it depends on the size of the index and therefore on the number of records written. The saved I/O can be estimated by taking the integral of the B-tree insert complexity for 1 insert:&lt;/p&gt;

&lt;p&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;∫ln⁡(n) dn=n(ln⁡(n)−1)+constant\int \ln(n) \,dn = n (\ln(n) -1 ) \textcolor{grey}{ + constant} &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mop op-symbol large-op"&gt;∫&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mop"&gt;ln&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathnormal"&gt;n&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;span class="mord mathnormal"&gt;n&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;n&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mop"&gt;ln&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathnormal"&gt;n&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;+&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;co&lt;/span&gt;&lt;span class="mord mathnormal"&gt;n&lt;/span&gt;&lt;span class="mord mathnormal"&gt;s&lt;/span&gt;&lt;span class="mord mathnormal"&gt;t&lt;/span&gt;&lt;span class="mord mathnormal"&gt;an&lt;/span&gt;&lt;span class="mord mathnormal"&gt;t&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;



&lt;p&gt;If we assume that the sequential id’s insert complexity is constant 1, the resulting ratio is as follows:&lt;/p&gt;


&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;saved I/O portion≈1−nn(ln⁡(n)−1)≈1−1ln(n)−1saved \,I/O\, portion \approx 1 - \frac{n}{n(\ln(n)-1)} \approx 1 - \frac{1}{ln(n)-1} &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;s&lt;/span&gt;&lt;span class="mord mathnormal"&gt;a&lt;/span&gt;&lt;span class="mord mathnormal"&gt;v&lt;/span&gt;&lt;span class="mord mathnormal"&gt;e&lt;/span&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;I&lt;/span&gt;&lt;span class="mord"&gt;/&lt;/span&gt;&lt;span class="mord mathnormal"&gt;O&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;p&lt;/span&gt;&lt;span class="mord mathnormal"&gt;or&lt;/span&gt;&lt;span class="mord mathnormal"&gt;t&lt;/span&gt;&lt;span class="mord mathnormal"&gt;i&lt;/span&gt;&lt;span class="mord mathnormal"&gt;o&lt;/span&gt;&lt;span class="mord mathnormal"&gt;n&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;≈&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;n&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mop"&gt;ln&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathnormal"&gt;n&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;n&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;≈&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;l&lt;/span&gt;&lt;span class="mord mathnormal"&gt;n&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathnormal"&gt;n&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;When inserting one million rows into a clustered index, using a sequential primary key would save approximately 92% of I/O volume.&lt;/p&gt;

&lt;h2&gt;
  
  
  UUIDs
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.rfc-editor.org/rfc/rfc4122.html" rel="noopener noreferrer"&gt;Universally Unique Identifiers&lt;/a&gt; (UUIDs) have a couple of advantages over sequential id-s generated by the RDBMS (AUTOINCREMENT, SERIAL), mainly that they can be generated prior to sending the insert query to the DB. Unfortunately, InnoDB with its clustered index is one of the most popular storage engines. Using it with UUIDs, which are random by-design, leads to excess disk I/O, which is non-trivial to avoid.&lt;/p&gt;

&lt;h2&gt;
  
  
  Related Reading
&lt;/h2&gt;

&lt;p&gt;Trudeau, Y. (2019, November 22). UUIDs are Popular, but Bad for Performance — Let’s Discuss. Percona Database Performance Blog. &lt;a href="https://www.percona.com/blog/2019/11/22/uuids-are-popular-but-bad-for-performance-lets-discuss/" rel="noopener noreferrer"&gt;https://www.percona.com/blog/2019/11/22/uuids-are-popular-but-bad-for-performance-lets-discuss/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Coburn, M. (2015, April 2). Illustrating Primary Key models in InnoDB and their impact on disk usage. Percona Database Performance Blog. &lt;a href="https://www.percona.com/blog/2015/04/03/illustrating-primary-key-models-in-innodb-and-their-impact-on-disk-usage/" rel="noopener noreferrer"&gt;https://www.percona.com/blog/2015/04/03/illustrating-primary-key-models-in-innodb-and-their-impact-on-disk-usage/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Vondra, T. (2019, February 28). Sequential UUID Generators. 2ndQuadrant | PostgreSQL. &lt;a href="https://www.2ndquadrant.com/en/blog/sequential-uuid-generators/" rel="noopener noreferrer"&gt;https://www.2ndquadrant.com/en/blog/sequential-uuid-generators/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Nanoglyph. (n.d.). Identity Crisis: Sequence v. UUID as Primary Key. &lt;a href="https://brandur.org/nanoglyphs/026-ids" rel="noopener noreferrer"&gt;https://brandur.org/nanoglyphs/026-ids&lt;/a&gt;&lt;/p&gt;




&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;This is an instance of security through obscurity, but it can be advantageous in terms of performance and ease-of-use (simpler API). ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn2"&gt;
&lt;p&gt;&lt;a href="https://www.guru99.com/clustered-vs-non-clustered-index.html#:~:text=used%20in%20joins.-,Advantages%20of%20Clustered%20Index,-The%20pros/benefits" rel="noopener noreferrer"&gt;https://www.guru99.com/clustered-vs-non-clustered-index.html#:~:text=used%20in%20joins.-,Advantages%20of%20Clustered%20Index,-The%20pros/benefits&lt;/a&gt; ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;

</description>
      <category>database</category>
      <category>performance</category>
      <category>algorithms</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
