<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Umair Abid</title>
    <description>The latest articles on DEV Community by Umair Abid (@umairabid).</description>
    <link>https://dev.to/umairabid</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1169120%2Fb18f07a4-df3c-4c5c-84c2-92018fa6ba34.png</url>
      <title>DEV Community: Umair Abid</title>
      <link>https://dev.to/umairabid</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/umairabid"/>
    <language>en</language>
    <item>
      <title>Traveling time with Postgres Range Columns</title>
      <dc:creator>Umair Abid</dc:creator>
      <pubDate>Mon, 23 Oct 2023 07:54:00 +0000</pubDate>
      <link>https://dev.to/umairabid/traveling-time-with-postgres-range-columns-1c35</link>
      <guid>https://dev.to/umairabid/traveling-time-with-postgres-range-columns-1c35</guid>
      <description>&lt;p&gt;In &lt;a href="https://umairabid.com/blog/temporal-chaos-without-proper-database-structure/"&gt;Challenges of Time-Based Systems Without Proper Database Structures&lt;/a&gt;, we looked into everything that went wrong when we tried to build a temporal system without a compatible foundation. In this article, we will describe how we added that foundation to support temporal use cases. We will start by discussing how we built the foundation using Postgres ranges that could be a potential denominator for any time-based system. The solution might not be general enough but it can provide some good insights for building a foundation for the temporal system.&lt;/p&gt;

&lt;h2&gt;
  
  
  Migrating First Table
&lt;/h2&gt;

&lt;p&gt;We started by migrating the &lt;code&gt;state_taxes&lt;/code&gt; as it contained fewer rows and had fewer dependencies than other tables. The reason for starting with a relatively simple table was to vet the solution with minimum dependencies and then expand to other tables. The first version of the table structure we came up with was as follows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
CREATE TABLE IF NOT EXISTS public.state_taxes
(
    id bigint NOT NULL DEFAULT nextval('state_taxes_id_seq'::regclass),
    state_id integer NOT NULL,
    tax_type character varying COLLATE pg_catalog."default" NOT NULL,
    rate numeric NOT NULL,
    effective_range daterange NOT NULL,
    system_range tsrange NOT NULL,
    CONSTRAINT state_taxes_pkey PRIMARY KEY (id),
    CONSTRAINT prevent_overlapping_state_taxes EXCLUDE USING gist (
        system_range WITH &amp;amp;&amp;amp;,
        state_id WITH =,
        effective_range WITH &amp;amp;&amp;amp;,
        tax_type WITH =
    )
)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding State Taxes Structure
&lt;/h3&gt;

&lt;p&gt;The key and important difference from the previous version is two columns &lt;code&gt;effective_range&lt;/code&gt; and &lt;code&gt;system_range&lt;/code&gt; with the addition of the constraint &lt;code&gt;prevent_overlapping_state_taxes&lt;/code&gt;. Let's go through each of them and see what value they add&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Effective Range Column&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This column unlocks the ability to create timelines by having a rate for a specific start and end date, eliminating the need for &lt;code&gt;year&lt;/code&gt; the column. The clients will add rates only by providing a start date and the backend system will automatically detect the end date for the rate. The benefit of using range columns is that querying becomes easier using powerful &lt;a href="https://www.postgresql.org/docs/9.3/functions-range.html?ref=umairabid.com"&gt;Postgres range functions&lt;/a&gt;. For example, if a client asks for a rate on a specific effective date we can easily find it by searching a row whose effective range overlaps with the provided effective date.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;System Range Column&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;system_range&lt;/code&gt; helps us solve the shoe store problem discussed in the last article. This column stores the validity of data in terms of system time, also in the form of a range with specific start and end dates. When a rate is added, the system will set the current time at the time of change as the start of the validity range. Later if the rate is invalidated, the system will set the end time as the end of the system range when the change was made. This eliminates any need for maintaining &lt;code&gt;deleted_at&lt;/code&gt; columns. The system range actually removes the concept of soft deletes and replaces it with versioning the data with system validity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exclude Constraint&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can think of this constraint as a unique constraint but since ranges are involved and we want to check for overlapping ranges, the exclude constraint was used. Exclude constraint basically doesn't allow two rows to exist that return true for the provided gist condition. This helps us ensure we only get one valid row for one effective date.&lt;/p&gt;

&lt;h3&gt;
  
  
  Adding Timeline Logic to State Taxes
&lt;/h3&gt;

&lt;p&gt;With a solid underlying table structure to support temporal operations next step was to add logic to &lt;code&gt;StateTaxes&lt;/code&gt; model which will ensure the timeline logic of changes as they are added. We defined the following expectations for handling changes&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First Change&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If a rate is added for state tax for the first time for the effective date let's say &lt;code&gt;2023-01-01&lt;/code&gt; we expected the following record in the table&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--T_wYcpvG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/10/Screen-Shot-2023-10-15-at-5.01.03-PM-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--T_wYcpvG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/10/Screen-Shot-2023-10-15-at-5.01.03-PM-1.png" alt="" width="800" height="59"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This row tells us that the rate &lt;code&gt;0.15&lt;/code&gt; is effective from &lt;code&gt;2023-01-01&lt;/code&gt; till the end of time and it is valid from &lt;code&gt;2023-10-16&lt;/code&gt; (the time it was added) to the end of time, for &lt;code&gt;state_id=1&lt;/code&gt; and &lt;code&gt;tax_type=income_tax&lt;/code&gt; (identified unique tax rate). This statement can be understood by a few queries, let's ask the system for a rate effective on &lt;code&gt;2023-05-01&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
SELECT rate 
FROM state_taxes 
  WHERE state_id = 1 AND 
  tax_type = 'income_tax' AND
  effective_range @&amp;gt; '2023-05-01'::date

#=&amp;gt; 0.15

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This seems correct since the rate is effective from &lt;code&gt;2023-01-01&lt;/code&gt; to end of time, let's ask for the rate before this date&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
SELECT rate 
FROM state_taxes 
  WHERE state_id = 1 AND 
  tax_type = 'income_tax' AND
  effective_range @&amp;gt; '2022-12-31'::date

#=&amp;gt; null

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As expected since the date is before the date the first rate is effective, the query returned &lt;code&gt;nil&lt;/code&gt;. Now let's query for any rates valid in the system time before the date &lt;code&gt;2023-10-16&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
SELECT rate 
FROM state_taxes 
  WHERE state_id = 1 AND 
  tax_type = 'income_tax' AND
  system_range @&amp;gt; '2022-10-16'::timestamp

#=&amp;gt; nil

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This returns &lt;code&gt;nil&lt;/code&gt; because as far as the system is concerned no rate existed in the system time for &lt;code&gt;2023-10-16&lt;/code&gt;, this is how it helps in the example of a shoe store by finding rates when transactions occurred in the system.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After First Change&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If the first change is already added the rest of the changes will fall in one or a combination of the following scenarios&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The new change has the same effective date as the effective date (Correction)&lt;/li&gt;
&lt;li&gt;The new change effective date is before the existing change effective date (Past Change)&lt;/li&gt;
&lt;li&gt;The new change effective date is after the existing change effecting date (Future Change)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Adding a correction&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When a new change has the same effective date as an existing change, we need to invalidate the existing change and replace it with a new one. It is called a correction because the new change replaced the old one. If we correct our first change rate from &lt;code&gt;0.15&lt;/code&gt; to &lt;code&gt;0.19&lt;/code&gt; the result will look like something below&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--X4f-ssEW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/10/Screen-Shot-2023-10-19-at-10.19.38-PM.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--X4f-ssEW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/10/Screen-Shot-2023-10-19-at-10.19.38-PM.png" alt="" width="800" height="82"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It shows that we invalidated our first change by adding an end to &lt;code&gt;system_range&lt;/code&gt; of the first change and then added the correction with the new rate. Now if only query valid rates effective on or after &lt;code&gt;2023-01-01&lt;/code&gt; we get &lt;code&gt;0.19&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
SELECT rate 
FROM state_taxes 
  WHERE state_id = 1 AND 
  tax_type = 'income_tax' AND
  lower(effective_range) &amp;gt;= '2023-01-01' AND
  upper(system_range) is null # only valid rates have system_range null

#=&amp;gt; 0.19

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Adding a Past Change&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When a new change is added whose effective date is before the already existing change, then the new change should automatically assume an end date as well. This makes sure that end result is a consistent timeline where effective ranges don't overlap. For example, continuing from before, if we add a change for the effective date &lt;code&gt;2022-12-01&lt;/code&gt; with rate &lt;code&gt;0.14&lt;/code&gt; then execute the query&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
SELECT * 
FROM state_taxes 
  WHERE state_id = 1 AND 
  tax_type = 'income_tax' AND
  upper(system_range) is null
ORDER BY lower(effective_range)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It will return the following result&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3DpsZWNu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/10/Screen-Shot-2023-10-19-at-11.01.50-PM.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3DpsZWNu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/10/Screen-Shot-2023-10-19-at-11.01.50-PM.png" alt="" width="800" height="83"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Adding a Future Change&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When a change is added whose effective date is after the existing change, the existing change needs to have a new end date. So in order to apply the change, we &lt;strong&gt;correct&lt;/strong&gt; the existing change by replacing it with a new end date. Now in our example if we add a rate &lt;code&gt;0.25&lt;/code&gt; with effective date &lt;code&gt;2023-02-01&lt;/code&gt; the query in the previous example will return the following result&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rtJd7fQf--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/10/Screen-Shot-2023-10-19-at-11.28.29-PM.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rtJd7fQf--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/10/Screen-Shot-2023-10-19-at-11.28.29-PM.png" alt="" width="800" height="124"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For reference fetching changes including the invalidated ones results in the below&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--9XS3BvM4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/10/Screen-Shot-2023-10-19-at-11.32.00-PM.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9XS3BvM4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/10/Screen-Shot-2023-10-19-at-11.32.00-PM.png" alt="" width="800" height="146"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can find the implementation for the rails model &lt;a href="https://gist.github.com/umairabid/54ca1f6ab7a32439554551418847ced5?ref=umairabid.com"&gt;here&lt;/a&gt; and migration &lt;a href="https://gist.github.com/umairabid/7fe9619d73e0a17558145b5d4fe6e9fe?ref=umairabid.com"&gt;here&lt;/a&gt; to run examples by yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scaling beyond State Tax Table
&lt;/h2&gt;

&lt;p&gt;After completing the implementation for the state tax table, the next task was to assess how this implementation would work when joining tables and how the same implementation could be applied to other tables. We immediately saw that we needed to modify our approach or rethink our table relations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Problem with Relations
&lt;/h3&gt;

&lt;p&gt;Initially before adding effectivity to &lt;code&gt;state_tax&lt;/code&gt; table, the id was an explicit primary key to identifying a unique tax rate, whereas the composite key &lt;code&gt;(state_id, tax_type)&lt;/code&gt; served as the implicit primary key. However, with the new structure, the id was no longer the key to identify a tax rate hence won't work as a foreign key meant to identify a unique tax, and reason why we had to resort to using the composite key to identify taxes.&lt;/p&gt;

&lt;p&gt;The nature of the issue can be traced to the fact that before the change each row &lt;code&gt;state_tax&lt;/code&gt; was one "tax rate" but after, a row was one "tax rate change". In other words, after changing the structure the table should also have been renamed to &lt;code&gt;state_tax_changes&lt;/code&gt;. To fix the relations we thought about just having a running id in the table to be used as the foreign key in the related tables. Still, the insight that we have fundamentally changed the table prevented us from continuing with the running id hack.&lt;/p&gt;

&lt;h3&gt;
  
  
  Splitting the Tables
&lt;/h3&gt;

&lt;p&gt;To resolve the relations as they were defined currently we decided to not replace tables but rather split tables into the main model and its effective attributes. So effective attributes of &lt;code&gt;state_taxes&lt;/code&gt; were moved to another table &lt;code&gt;state_tax_changes&lt;/code&gt;. The resulting table structures looked something like the ones below&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
CREATE TABLE IF NOT EXISTS public.state_taxes
(
    id bigint NOT NULL DEFAULT nextval('state_taxes_id_seq'::regclass),
    state_id integer NOT NULL,
    tax_type character varying COLLATE pg_catalog."default" NOT NULL,
)



CREATE TABLE IF NOT EXISTS public.state_tax_changes
(
    id bigint NOT NULL DEFAULT nextval('state_taxes_id_seq'::regclass),
    state_tax_id integer NOT NUL
    rate numeric NOT NULL,
    effective_range daterange NOT NULL,
    system_range tsrange NOT NULL,
    CONSTRAINT state_tax_changes_pkey PRIMARY KEY (id),
    CONSTRAINT prevent_overlapping_state_taxes EXCLUDE USING gist (
        state_tax_id WITH &amp;amp;&amp;amp;,
        effective_range WITH &amp;amp;&amp;amp;,
        tax_type WITH =
    )
)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Although from the implementation perspective splitting tables added more complexity due to breaking up existing tables. However, this complexity was only temporary and was expected to subside with the migration of old tables. The benefit of this approach was that it reflected the true nature of our data tables. Previously one state tax had one rate and now one tax had many which was nicely reflected in &lt;code&gt;state_taxes&lt;/code&gt; and &lt;code&gt;state_tax_changes&lt;/code&gt; table.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This project was not easy or smooth by easy means as we had to deal with some issues that were not directly related to not have temporality but as we moved ahead with the system the choice of undertaking a large refactor proved to be correct. It was a great reminder that no matter how good you are design is, if it isn't compatible with business it can't get you very far.&lt;/p&gt;

</description>
      <category>casestudies</category>
      <category>postgres</category>
      <category>systemdesign</category>
      <category>database</category>
    </item>
    <item>
      <title>Challenges of Time-Based Systems Without Proper Database Structures</title>
      <dc:creator>Umair Abid</dc:creator>
      <pubDate>Wed, 06 Sep 2023 05:56:56 +0000</pubDate>
      <link>https://dev.to/umairabid/challenges-of-time-based-systems-without-proper-database-structures-12hh</link>
      <guid>https://dev.to/umairabid/challenges-of-time-based-systems-without-proper-database-structures-12hh</guid>
      <description>&lt;p&gt;When we store information in our database, we normally store it without a time dimension even if it is only valid for a specific period of time. For example, people move around all the time, but most apps ask for your current address and rely on you to change it whenever you move. This works because most applications have no use case to be aware of your address history and only need your current address.&lt;/p&gt;

&lt;p&gt;However, for some systems, the time dimension is omnipresent whenever data is queried or mutated, and implementing them on traditional data models can pose serious challenges. I had a chance to work on a project with similar challenges that provided a good learning experience on how to overcome them. The project makes a good use case of how temporality can help streamline operations. To go into the details while not revealing proprietary information, let's use an example of a tax system.&lt;/p&gt;

&lt;h2&gt;
  
  
  Situation
&lt;/h2&gt;

&lt;p&gt;To understand the challenges, let's start with an overview of the tax system. We first define some use cases for our hypothetical tax system, understand the structure of tables involved in recording tax returns for users, and deep dive into problems due to that structure.&lt;/p&gt;

&lt;h3&gt;
  
  
  Overview of Tax System
&lt;/h3&gt;

&lt;p&gt;The tax system is a single tool for residents of a country to submit their tax returns according to the tax percentages set at the state level. The system is used by two roles: administrators and taxpayers. To avoid confusion, please refrain from comparing this system to a real-world tax system, as it serves only as a reflection of the actual system we worked with. Our hypothetical tax system only supports the following use cases.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--QC2krKv3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/08/Traveling-time-with-Postgres-timestamp-range-columns.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--QC2krKv3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/08/Traveling-time-with-Postgres-timestamp-range-columns.svg" alt="" width="556" height="591"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Taxpayers, when they sign up, enroll themselves in tax types like income tax, capital gains tax, etc. Then each year, the system calculates the amount of tax that is due for that tax year and also allows them to enter the tax they paid throughout the year. For the sake of simplicity, how those two values, i.e., tax paid and tax due, are balanced is not our concern.&lt;/p&gt;

&lt;h3&gt;
  
  
  Structure of Critical Tables
&lt;/h3&gt;

&lt;p&gt;Although the problems spanned multiple tables, they can be generalized using two tables used for storing state taxes and tax returns. The &lt;code&gt;state_taxes&lt;/code&gt; table stores the &lt;code&gt;rate&lt;/code&gt; to calculate the tax due for the taxpayer. For example, if the income tax rate is &lt;code&gt;0.7&lt;/code&gt; and taxpayer income is &lt;code&gt;100$&lt;/code&gt; then the income tax due is &lt;code&gt;100 * 0.07 = $7&lt;/code&gt;. The rate varies by type of tax and state.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--oXnqKm5m--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/08/Traveling-time-with-Postgres-timestamp-range-columns-1.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--oXnqKm5m--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/08/Traveling-time-with-Postgres-timestamp-range-columns-1.svg" alt="" width="531" height="291"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One important thing to point out here is that the system was not designed to handle varying versions of data over time, although we have the column &lt;code&gt;year&lt;/code&gt; in the table &lt;code&gt;state_taxes&lt;/code&gt;. The access patterns assumed one row per tax for a state and the type of tax when the table is joined or read directly. In other words, there is a &lt;code&gt;unique(state_id, type)&lt;/code&gt; constraint on the table. That essentially means you cannot add the same tax for the same type, for different years. To have some audit compatibilities rows were not updated, rather updates were applied by soft deleting the old and creating new rows with updates.&lt;/p&gt;

&lt;p&gt;The other table to consider is &lt;code&gt;tax_returns&lt;/code&gt; responsible for storing the tax returns of a specific taxpayer. The table has one row per tax type for each payer, it stores tax returns within that row in the form of a JSON array.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--iV7VgbXH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/08/Traveling-time-with-Postgres-timestamp-range-columns--1--1.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--iV7VgbXH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/08/Traveling-time-with-Postgres-timestamp-range-columns--1--1.svg" alt="" width="531" height="331"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;returns&lt;/code&gt; column was added as a solution for storing returns for each user while still conforming to having only one row per tax. The &lt;code&gt;deleted_at&lt;/code&gt; key served the same purpose for each JSON object as it did in &lt;code&gt;state_taxes&lt;/code&gt; the table.&lt;/p&gt;

&lt;h3&gt;
  
  
  Problems with the Underlying Structure
&lt;/h3&gt;

&lt;p&gt;The above structure functioned correctly only when data was added in a linear time order. However, a single retroactive update, whether to correct a mistake or add a new record, could introduce data inconsistencies. These inconsistencies sometimes led to data corruption, while in other cases, data loss occurred.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Loss on Updates&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Unlike the &lt;code&gt;returns&lt;/code&gt; column in &lt;code&gt;tax_returns&lt;/code&gt;, the &lt;code&gt;state_taxes&lt;/code&gt; table lacks a JSON column to store tax rates per year, presumably due to the absence of a use case for displaying rates for each tax year. As a result, any rate update, whether for correction or addition, results in the removal of the previous rate. In cases of retroactive updates, the system effectively loses the currently effective rate.&lt;/p&gt;

&lt;p&gt;For example, suppose admin has added rates for tax years 2021 and 2023 (currently effective). They later realized that the rate for 2021 was incorrect and wanted to update it. Now since &lt;code&gt;state_taxes&lt;/code&gt; can only support one row for a tax, adding a correct rate for 2021 will result in a loss of the 2023 rate. Another case is that rates were added correctly for years 2021 and 2023 but they missed adding a rate for 2022, now adding that rate will again overwrite the rate for 2023.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Corruption on Updates&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;tax_due&lt;/code&gt; in the &lt;code&gt;results&lt;/code&gt; column of &lt;code&gt;tax_returns&lt;/code&gt; is a dynamic value calculated based on existing data in the system i.e. &lt;code&gt;income * tax_ratio&lt;/code&gt;. Normally, such a calculated value shouldn't be stored, but due to the data loss issue mentioned earlier, it was necessary to save it to preserve the value using the tax rate effective at the time of calculation. However, this would be more akin to keeping the best possible value rather than the correct value.&lt;/p&gt;

&lt;p&gt;The value stored at the time of adding tax returns remains valid as long as the factors used for its calculation, such as the tax ratio and income, are not updated. If these factors are updated, the field will contain an incorrect value according to the current system data and cannot be verified. In some cases, it might be argued that having no value stored is preferable to having an outdated or unverifiable one&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ineffective auditing capabilities&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The system is frequently used &lt;code&gt;deleted_at&lt;/code&gt; and soft deletes to prevent loss of information for auditing purposes. Since they were system level, not application level construct, they were quite ineffective in providing any help to address the problems we have seen so far, when retroactive changes were made. The best case scenario was using them to figure out if a version of data existed at some point in some system and that is it.&lt;/p&gt;

&lt;p&gt;In temporal systems, auditing capability is required at the application level to facilitate resolving risks. For example, let's say you bought a pair of shoes. After selling you that pair, the shop realized that the price was entered incorrectly in the system and they fixed it. Now, if you go back to return the shoes if they have a proper temporal system, they can quickly find out the effective price of shoes on the date when they were sold to you. Otherwise, there is no way for the system to find out the price on the date when the shoes were sold.&lt;/p&gt;

&lt;h3&gt;
  
  
  Expectations from the Temporal System
&lt;/h3&gt;

&lt;p&gt;What we went through while trying to uncover the problems were basically a consequence of implementing time-based systems without a proper structure to support temporal transactions. This now leads us to define expectations for a temporal system to avoid the problems that we uncovered while also making it easier for users to work with it.&lt;/p&gt;

&lt;h4&gt;
  
  
  Consistent Timelines
&lt;/h4&gt;

&lt;p&gt;As we have observed, when data validity is time-dependent, it results in multiple versions of data corresponding to different points in time. These variations collectively form timelines, and it is essential to maintain their consistency. Overlapping timelines can lead to indeterministic outcomes when attempting to identify a valid record for a specific date. To address this issue, consider the following example using the &lt;code&gt;state_taxes&lt;/code&gt; table, which employs an &lt;a href="https://en.wikipedia.org/wiki/Effective_date?ref=umairabid.com"&gt;effective date range&lt;/a&gt; to denote the validity of tax rates.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--alOVDqia--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/09/Traveling-time-with-Postgres-timestamp-range-columns.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--alOVDqia--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/09/Traveling-time-with-Postgres-timestamp-range-columns.svg" alt="" width="511" height="121"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;[start_time, end_time) is a convention to define ranges with start and end date. Here "[" means range includes start_time and ")" excludes end_time&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now, let's consider the scenario where we need to determine the income tax rate effective on the date 2023-01-15. Upon inspecting the date ranges, we can identify that this date falls within the row with &lt;code&gt;id=1&lt;/code&gt;. In this case, obtaining a single row ensures determinism.&lt;/p&gt;

&lt;p&gt;However, if we attempt to find the rate for any date within February 2023, we would retrieve two rows. Consequently, for this rate, it becomes impossible to ascertain which ratio to apply. The motivation behind enforcing consistent timeframes is precisely to prevent such situations from arising.&lt;/p&gt;

&lt;h4&gt;
  
  
  Consistent Implementation across tables
&lt;/h4&gt;

&lt;p&gt;The implementation of temporal tables can vary from one table to another, and there may be situations where such customization is necessary. However, in most cases, it is not the ideal approach.&lt;/p&gt;

&lt;p&gt;For instance, consider a scenario where you need to join three temporal tables together, and each of these tables has implemented temporality differently. In such cases, fetching data in a single query can be challenging, if not entirely impossible.&lt;/p&gt;

&lt;p&gt;Moreover, while it may still be feasible to write data in such a setup, doing so often means sacrificing the potential for abstraction in both read and write patterns. A consistent implementation approach, on the other hand, enables seamless integration with Object-Relational Mapping (ORM) systems, making working with temporal tables a much more straightforward and efficient process."&lt;/p&gt;

&lt;h4&gt;
  
  
  Prevent the loss of information
&lt;/h4&gt;

&lt;p&gt;One of the fundamental reasons for incorporating a temporal aspect into your data is the preservation of information. In cases where information undergoes retroactive changes, it's crucial that the system retains the data as it existed before the alteration to maintain auditing capabilities.&lt;/p&gt;

&lt;p&gt;In monetary systems, calculations often depend on specific configurations, even if those configurations are initially incorrect. These incorrect configurations are utilized in calculations until corrected. When these configurations are rectified later, with their effective or validity period remaining the same but only the data being updated, the system is still expected to retain the original configurations. They can help with auditing when you need to check what was calculated before at a specific point in time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Solution
&lt;/h2&gt;

&lt;p&gt;As you might have already discerned, while many of these challenges and expectations can be addressed by extending the current design, such as expanding JSON columns to cover other columns and implementing upsert hooks to maintain system consistency, it's evident that straightforward use cases can rapidly escalate the complexity of a system.&lt;/p&gt;

&lt;p&gt;In our forthcoming article, we will delve into a solution that tackles these issues without unnecessarily inflating system complexity.&lt;/p&gt;

</description>
      <category>casestudies</category>
      <category>systemdesign</category>
      <category>databasedesign</category>
      <category>temporalsystems</category>
    </item>
    <item>
      <title>Automation Engine Refactor for Performance and Maintainability</title>
      <dc:creator>Umair Abid</dc:creator>
      <pubDate>Mon, 07 Aug 2023 08:33:10 +0000</pubDate>
      <link>https://dev.to/umairabid/automation-engine-refactor-for-performance-and-maintainability-17lh</link>
      <guid>https://dev.to/umairabid/automation-engine-refactor-for-performance-and-maintainability-17lh</guid>
      <description>&lt;p&gt;Imagine starting your day with your mailbox full of outages due to all database connections being held up for an extensive period. Nobody likes it and our team went on a mission to ensure we never have such a day again, at least for the exact root cause.&lt;/p&gt;

&lt;h2&gt;
  
  
  Situation
&lt;/h2&gt;

&lt;p&gt;The problem originated from the Pipeline Automation Engine of our CRM app. A pipeline consists of a series of stages that a &lt;a href="https://en.wikipedia.org/wiki/Lead_generation?ref=umairabid.com"&gt;lead&lt;/a&gt; go through to either become a sale or be lost. Each stage has associated actions like sending emails or text, in addition to the move action which decides the next stage for the lead. To understand how the flow works, please consider the preliminary database design below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--QYfZUWOB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/08/Automation-Engine-Refactor-for-Performance-and-Maintainability.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--QYfZUWOB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/08/Automation-Engine-Refactor-for-Performance-and-Maintainability.svg" alt="" width="476" height="342"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The right side of the design is relations or tables containing the configuration which dictates how automation will be executed. Whereas the left side helps run the pipeline automation for the specific lead. Here is a brief summary of each table,&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pipeline:&lt;/strong&gt; For example "Google Adwords Campaign", can be one pipeline to convert leads from google adwords campaign to sales.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pipelines Stages:&lt;/strong&gt; Contains stages for each pipeline, for example, "Inquired", "Responded" etc can be the stages that a lead goes through.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pipeline Stage Actions:&lt;/strong&gt; Send an introduction email and then move them to the "responded" stage would be an example of how actions work together, where sending an email and moving them to the stage are separate actions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lead:&lt;/strong&gt; Any internet user who clicked on your ad, landed on your page, and gave their information.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lead Stages:&lt;/strong&gt; Contains all the stages a lead has been or is currently in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lead Stage Actions:&lt;/strong&gt; All the actions which have been performed on lead are recorded by stage in this table. As soon as the lead enters in stage, this table is also populated with actions for that stage. Actions are executed serially.&lt;/p&gt;

&lt;h2&gt;
  
  
  Problem
&lt;/h2&gt;

&lt;p&gt;The beauty of startups is that you build something for one purpose and customers may use it in all different ways except the one it was intended for. This automation feature was built to manage the lead automation coming from landing pages, but one of our customers imported around 16k leads and ran automation on all of them. This caused an instant outage, where the connections were held up by queries coming from the automation system code. When we investigated the code scheduled to run after every five minutes, the problem became very apparent. Below is the simplified version of that code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
for each lead in leads
  lead_stages = get_lead_stages
  for each lead_stage in lead_stages
    last_performed_action = lead_stage.last_performed_action
    sequence_number = last_performed_action.sequence_number || 0
    action_to_perform = lead_stage.pipleline_stage.action_after(sequence_number)

    if action_to_perform
        action_to_perform.perform

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The thing which instantly comes out and explains the problem is that we are querying the full leads table after every five minutes, some non-apparent problems which were adding fuel to the fire were,&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The job had no unique clause or any preventive measures to not schedule the job if the previously scheduled was still running&lt;/li&gt;
&lt;li&gt;No eager loading is being used&lt;/li&gt;
&lt;li&gt;Truly brute force, not making any use of information already stored in the system to determine which leads and actions need to be performed. Hence too many unnecessary computations.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Solution
&lt;/h2&gt;

&lt;p&gt;The brute-force nature of the solution provided an obvious hint for the solution i.e. limit unnecessary computations. Considering the major source of unnecessary computations was scanning the leads table, we could also rephrase the problem to "How do we only fetch the leads which have pending stage actions". Once the problem was stated, the solution was a no-brainer since we can easily filter out the leads for whom all stage actions have been executed.&lt;/p&gt;

&lt;p&gt;Couple the above improvement which significantly reduced the leads every time the job is run with improvement over making the job unique and not scheduling it if the is still in progress, the two quick fixes helped us to resolve the outage, but we had to ask how long until the next outage?&lt;/p&gt;

&lt;h3&gt;
  
  
  Challenges on the Horizon
&lt;/h3&gt;

&lt;p&gt;This was one of the core features where performance was not only expected but needed to be guaranteed under specific SLAs (e.g. the next action should be performed within 2 minutes after performing the previous one). Considering how one customer used the system in a way it was not intended to be used, it was only a matter of time before other customers put the system under identical stress. The system had to be rethought and replanned to at least give the first few hundred customers the best experience while we invested in other parts of the app.&lt;/p&gt;

&lt;p&gt;After a few discussions and meetings, the following problems (in order of their priority) were identified to be fixed,&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;One lead action should not block another lead action. Sending emails or texts can be an expensive operations&lt;/li&gt;
&lt;li&gt;Performing actions can fail due to any number of reasons and the system is missing the retry ability. This especially becomes important due to the rate limits of third-party services. This also aggregates the first problem.&lt;/li&gt;
&lt;li&gt;Importing 16000 leads is fine but adding them all into automation at once is not, especially when multiple accounts do that in a narrow window of time. There should be a limit on how many automation can be scheduled per account.&lt;/li&gt;
&lt;li&gt;The query to fetch leads with pending actions might still return where no further action is required. For example

&lt;ul&gt;
&lt;li&gt;"wait" action can be used to add a buffer between actions until the wait time is over, no action can be performed on lead&lt;/li&gt;
&lt;li&gt;Some actions might be triggered when as a response from the lead, like a reply to an email or text. Until a reply is received or the threshold to receive a reply is not over, no action can be performed on the lead.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Scaling upwards
&lt;/h3&gt;

&lt;p&gt;The first two problems pointed out that our system is missing two key pieces, making the automation loop async and applying throttling on automations by the organization. For rest, we also needed to augment &lt;code&gt;lead_stage_actions&lt;/code&gt; table to store some extra information which would help filter out the leads if they are pending on user action or just need to be scheduled at some time in the future. To work around the problems we added,&lt;/p&gt;

&lt;p&gt;Two columns in the &lt;code&gt;lead_stage_actions&lt;/code&gt; table,&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;perform_at&lt;/code&gt; Action can only be performed after this timestamp&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;status&lt;/code&gt; Hold status for lead actions, only pending actions can be performed &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;and few classes, two fundamental classes were &lt;code&gt;Scheduler&lt;/code&gt; responsible for querying and distributing actions to their appropriate handlers and &lt;code&gt;AutomationActionHandler&lt;/code&gt; which all individual action handlers extend from (e.g. &lt;code&gt;EmailActionHandler&lt;/code&gt;, &lt;code&gt;SmsActionHandler&lt;/code&gt; etc.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--pouKz2Po--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/08/Automation-Engine-Refactor-for-Performance-and-Maintainability---2.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--pouKz2Po--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/08/Automation-Engine-Refactor-for-Performance-and-Maintainability---2.svg" alt="" width="672" height="380"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Knitting Everything Together
&lt;/h3&gt;

&lt;p&gt;Eventually, we replaced the original automation loop with the following flow encapsulating primary automation flow end to end. The dashed lines represent the async/indirect flow where the next step is not executed in the same process. Few highlights of the flow&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The scheduler is lean and only depends on one simple query.&lt;/li&gt;
&lt;li&gt;All action handlers get executed in separate threads independently.&lt;/li&gt;
&lt;li&gt;Rate limiting is applied on the action handlers level, allowing users to add leads in stages but preventing organizations to use more than allocated processing.&lt;/li&gt;
&lt;li&gt;The automation fails gracefully in case of errors.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--mWml4-xK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/08/Automation-Engine-Refactor-for-Performance-and-Maintainability---3.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--mWml4-xK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://umairabid.com/content/images/2023/08/Automation-Engine-Refactor-for-Performance-and-Maintainability---3.svg" alt="" width="689" height="1211"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Aftermath
&lt;/h2&gt;

&lt;p&gt;After the release, we continue monitoring the performance and user activities but nothing major came up, except tweaking limits and small bug fixes here and there. We had some concerns that relying on status to identify pending action may run into concurrency issues but since the application was not supposed to run on a massive scale just yet, we relied on database locks to ensure consistency.&lt;/p&gt;

&lt;h3&gt;
  
  
  Shoutout
&lt;/h3&gt;

&lt;p&gt;Delivering this project was not possible without &lt;a href="https://www.linkedin.com/in/syed-wasif-raza-4a850476?ref=umairabid.com"&gt;Wasif Raza&lt;/a&gt;, we both worked as peers on this project and thoroughly enjoyed it.&lt;/p&gt;

</description>
      <category>casestudies</category>
      <category>systemdesign</category>
      <category>backgroundprocessing</category>
      <category>automations</category>
    </item>
  </channel>
</rss>
