<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ivan Kwong</title>
    <description>The latest articles on DEV Community by Ivan Kwong (@ivankwongtszfung).</description>
    <link>https://dev.to/ivankwongtszfung</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F622344%2F797e9072-ac34-4cb7-a6ba-12745a5b6dd4.png</url>
      <title>DEV Community: Ivan Kwong</title>
      <link>https://dev.to/ivankwongtszfung</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ivankwongtszfung"/>
    <language>en</language>
    <item>
      <title>What did I learn by reviewing merge sort again.</title>
      <dc:creator>Ivan Kwong</dc:creator>
      <pubDate>Thu, 19 Aug 2021 16:09:18 +0000</pubDate>
      <link>https://dev.to/ivankwongtszfung/what-did-i-learn-by-reviewing-merge-sort-again-90</link>
      <guid>https://dev.to/ivankwongtszfung/what-did-i-learn-by-reviewing-merge-sort-again-90</guid>
      <description>&lt;p&gt;After graduated from college, I rarely study in algorithm questions, because I don't have to implement special algorithm most of my time. I did use BFS one or twice when I deal with the workflow problem, but most of the time you don't bother. Also, software engineering is a board topic. Being a good software engineer means more than just using efficient algorithm, and a readable O(n^2) solution is better than a complicated O(nlogn) solution if there are no performance issues.&lt;br&gt;
Therefore, this is very refreshing and eye-opening when I revisit the merge sort algorithm.&lt;/p&gt;

&lt;p&gt;The implementation of merge sort is O(nlogn) because T(n) = 2T(n/2) + C, the C is n, you could think of we do this O(n) operation log(n) times.&lt;/p&gt;

&lt;p&gt;This is something that I oversee when I learn this algorithm, when C is something that is better then O(n^2) this is automatically a better solution than O(n^2).&lt;/p&gt;

&lt;p&gt;Let say we want to count the number of value which is smaller than self like &lt;a href="https://leetcode.com/problems/count-of-smaller-numbers-after-self/"&gt;this question&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Since counting numbers that is larger than self is O(n), then we can use the idea of merge sort to divide and conquer the problem and obtain O(nlogn).&lt;/p&gt;

</description>
      <category>algorithms</category>
      <category>python</category>
    </item>
    <item>
      <title>Safe update operation in PostgreSQL using SQLAlchemy</title>
      <dc:creator>Ivan Kwong</dc:creator>
      <pubDate>Sat, 08 May 2021 09:46:59 +0000</pubDate>
      <link>https://dev.to/ivankwongtszfung/safe-update-operation-in-postgresql-using-sqlalchemy-3ela</link>
      <guid>https://dev.to/ivankwongtszfung/safe-update-operation-in-postgresql-using-sqlalchemy-3ela</guid>
      <description>&lt;p&gt;If you are using the default isolation mode in PostgreSQL. When you try to update your database in your web application, ACID transaction will &lt;strong&gt;not&lt;/strong&gt; safe you from race condition. if you run the code below with more than one process running. (which multi processes are very common for web servers)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sqlalchemy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;create_engine&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sqlalchemy.orm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Session&lt;/span&gt;
&lt;span class="c1"&gt;# setup
&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;create_engine&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;span class="n"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# read from the table Account
&lt;/span&gt;&lt;span class="n"&gt;account&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Account&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# modify the record, account is decimal
&lt;/span&gt;&lt;span class="n"&gt;account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why is this not safe? This is because SELECT will not pose any locks in row level. It blocks only when update happens which it locks the row exclusively (&lt;code&gt;FOR UPDATE/FOR NO KEY UPDATE&lt;/code&gt;). Race condition could happen in this scenario.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Transaction 1 (T1)&lt;/th&gt;
&lt;th&gt;Transaction 2 (T2)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;START&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SELECT amount&lt;/td&gt;
&lt;td&gt;START&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;amount += 100&lt;/td&gt;
&lt;td&gt;SELECT amount&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UPDATE amount&lt;/td&gt;
&lt;td&gt;amount += 100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;COMMIT&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;UPDATE amount&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;COMMIT &lt;code&gt;SAME value as T1&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Since T1 doesn't commit before T2 reads, so T2 will select the same value as T1 did. which eventually commits the same thing again.&lt;/p&gt;

&lt;p&gt;So, how to prevent this situation in default isolation mode. (Read Committed mode)&lt;/p&gt;

&lt;h3&gt;
  
  
  Solution 1: Just don't read
&lt;/h3&gt;

&lt;p&gt;The above snippet is implementing with an anti-pattern called &lt;br&gt;
&lt;code&gt;read-modify-write&lt;/code&gt;. One way to prevent that is to not read and change the value directly with the column value. Given read is not super necessary in this scenario.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# same session setting as above
&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Account&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;filter_by&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;\
     &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s"&gt;"amount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Solution 2: Update Lock
&lt;/h3&gt;

&lt;p&gt;There is some scenarios that you want to do a complicated modification and you just have to read first. In this case, you could use update lock when you read from the database. In SQLAlchemy, there is a method called &lt;code&gt;with_for_update&lt;/code&gt; which locks the row you want to change with a &lt;code&gt;FOR UPDATE&lt;/code&gt; lock. &lt;code&gt;FOR UPDATE&lt;/code&gt; lock is not self compatible, another transactions have to wait until this transaction release the lock.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# same session setting as above
# locks the row that id = 1
&lt;/span&gt;&lt;span class="n"&gt;account&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Account&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;filter_by&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;\
     &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;with_for_update&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;one&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# save and release the lock
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Solution 3: Version tracking (optimistic locking)
&lt;/h3&gt;

&lt;p&gt;If you want to rollback all other processes which run any updates on the same rows. You can add a version column in the table to track the updates on this row. Let say we have version v in some rows, when you wants to update the row, you will search the row along with version v and update the version to v + 1.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;update&lt;/span&gt; &lt;span class="n"&gt;account&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt; &lt;span class="k"&gt;version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="k"&gt;version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="p"&gt;...;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is quite annoying to implement ourselves, Fortunately, most of the ORM library support version tracking. In SQLAlchemy, you can set the version column name in the mapper, so when you update, it will manage the version for you and if some processes work on the same row, it will raise an Exception.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Account&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Base&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;__tablename__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"account"&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
    &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Integer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nullable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;__mapper_args__&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"version_id_col"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;version_tracking&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;account&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Account&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;change&lt;/span&gt;
        &lt;span class="n"&gt;print_account&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;account&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;StaleDataError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"someone has changed the account, plz retry."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# some actions...
&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All the solutions given above, assume we are using the Read Committed mode which is the default isolation model in Postgres. I also assume you are not using any aggregation, when you select the target rows. There are some more ways to prevent the same problem, such as using a stricter isolation mode so that the transactions will act like working serially. However, we will not go through this in this article.&lt;/p&gt;

&lt;h3&gt;
  
  
  For more Info.
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/ivankwongtszfung/proof_of_concept/tree/main/psql_acid_test"&gt;All Source Code&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.kite.com/python/docs/sqlalchemy.sql.expression.GenerativeSelect.with_for_update"&gt;SQLAlchemy for update (kite)&lt;/a&gt;&lt;br&gt;
&lt;a href="https://docs.sqlalchemy.org/en/14/orm/versioning.html"&gt;SQLAlchemy Version Tracking&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.postgresql.org/docs/current/transaction-iso.html"&gt;PSQL Transaction Isolation&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.postgresql.org/docs/current/explicit-locking.html"&gt;PSQL Locking&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>postgres</category>
      <category>database</category>
    </item>
  </channel>
</rss>
