<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Veerpal</title>
    <description>The latest articles on DEV Community by Veerpal (@veerpalb).</description>
    <link>https://dev.to/veerpalb</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F371338%2F9290b24d-6cef-47cf-8283-59b427159bfa.png</url>
      <title>DEV Community: Veerpal</title>
      <link>https://dev.to/veerpalb</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/veerpalb"/>
    <language>en</language>
    <item>
      <title>How to Run a Program or Script Hourly on macOS</title>
      <dc:creator>Veerpal</dc:creator>
      <pubDate>Thu, 18 Jan 2024 12:44:07 +0000</pubDate>
      <link>https://dev.to/veerpalb/how-to-run-a-program-or-script-hourly-on-macos-1opm</link>
      <guid>https://dev.to/veerpalb/how-to-run-a-program-or-script-hourly-on-macos-1opm</guid>
      <description>&lt;p&gt;Do you have a program or bash script that needs to run continuously or on a specific time interval on your Mac? The solution lies in using &lt;code&gt;launchd&lt;/code&gt;, an Apple-recommended approach and an open-source service management framework. &lt;/p&gt;

&lt;h2&gt;
  
  
  What is &lt;code&gt;launchd&lt;/code&gt;?
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;launchd&lt;/code&gt; is an open-source service management framework recommended by Apple. It enables you to "start, stop, and manage various processes, including daemons, applications, and scripts" [1]. For the purposes of this post, we'll concentrate on working with a launch agent: a process that runs on behalf of the logged-in user.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Does &lt;code&gt;launchd&lt;/code&gt; Work?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Generate a Plist File&lt;/strong&gt;: Create a property list (plist) file, which stores preferences in XML format. Use any text editor to define which program or script to run and how often. I'll explain the structure of the file more below. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Save to &lt;code&gt;~/Library/LaunchAgents/&lt;/code&gt;&lt;/strong&gt;: Save the plist file to the &lt;code&gt;~/Library/LaunchAgents/&lt;/code&gt; folder. The system monitors this folder and uses the plist to run your program or script based on the specified time frequency.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;launchctl&lt;/code&gt; for Testing&lt;/strong&gt;: The &lt;code&gt;launchctl&lt;/code&gt; command-line utility helps start, stop, and load your job for testing purposes.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Creating a Plist File
&lt;/h2&gt;

&lt;p&gt;A plist file is a straightforward XML file with key-value entries. Here's an overview of the key entries and their significance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Label (Required Key): The Label key is mandatory, serving as the unique name for your job. It must be distinctive to avoid conflicts with other jobs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Program: The Program key specifies the program or script you want to run. In our example, it points to a script containing the logic you want to execute hourly. If you're using a script, ensure it is executable by your user. You can achieve this with the command &lt;code&gt;chmod +x &amp;lt;path/to/script&amp;gt;&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;StartInterval: The StartInterval key determines how frequently your job should run, specified in seconds. To run a job every hour, set it to 3600 seconds (60 seconds/minute * 60 minutes/hour).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;StandardOutPath, StandardInPath, StandardErrorPath: These keys define the paths for standard output, standard input, and standard error logs, respectively. They are useful for organizing and accessing logs related to your job.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Below is an example plist file for running a script every hour:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?xml version="1.0" encoding="UTF-8"?&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;&amp;lt;!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;plist&lt;/span&gt; &lt;span class="na"&gt;version=&lt;/span&gt;&lt;span class="s"&gt;"1.0"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;dict&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;key&amp;gt;&lt;/span&gt;Label&lt;span class="nt"&gt;&amp;lt;/key&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;string&amp;gt;&lt;/span&gt;local.example.script.start&lt;span class="nt"&gt;&amp;lt;/string&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;key&amp;gt;&lt;/span&gt;Program&lt;span class="nt"&gt;&amp;lt;/key&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;string&amp;gt;&lt;/span&gt;/Users/me/path/to/file/script.sh&lt;span class="nt"&gt;&amp;lt;/string&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;key&amp;gt;&lt;/span&gt;StartInterval&lt;span class="nt"&gt;&amp;lt;/key&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;integer&amp;gt;&lt;/span&gt;3600&lt;span class="nt"&gt;&amp;lt;/integer&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;key&amp;gt;&lt;/span&gt;StandardOutPath&lt;span class="nt"&gt;&amp;lt;/key&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;string&amp;gt;&lt;/span&gt;/Users/me/path/to/logs/log.stdout&lt;span class="nt"&gt;&amp;lt;/string&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;key&amp;gt;&lt;/span&gt;StandardInPath&lt;span class="nt"&gt;&amp;lt;/key&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;string&amp;gt;&lt;/span&gt;/Users/me/path/to/logs/log.stdin&lt;span class="nt"&gt;&amp;lt;/string&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;key&amp;gt;&lt;/span&gt;StandardErrorPath&lt;span class="nt"&gt;&amp;lt;/key&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;string&amp;gt;&lt;/span&gt;/Users/me/path/to/logs/log.stderr&lt;span class="nt"&gt;&amp;lt;/string&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/dict&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/plist&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ensure that your plist file adheres to this structure, and customize the values accordingly to suit your specific requirements.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing Your Launch Agent
&lt;/h2&gt;

&lt;p&gt;Use the following command to load a new job: &lt;code&gt;launchctl load -w ~/Library/LaunchAgents/local.example.script.start.plist&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Once loaded, run your job and check the final status. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;launchctl start local.example.script.start&lt;/code&gt;: Start a specific job.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;launchctl list | grep "local.example"&lt;/code&gt;: Check the status of your job. A status of zero indicates a successful run. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you encounter a non-zero status, you can decipher it using &lt;code&gt;launchctl error my_err_code&lt;/code&gt;, substituting your status code for &lt;code&gt;my_err_code&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using LaunchControl
&lt;/h2&gt;

&lt;p&gt;For a more user-friendly experience and better error messages, consider using third-party software like LaunchControl. It verifies your plist file and helps identify issues. For instance, if your script is not executable, LaunchControl makes this clear in the UI and provides error messages more precise than the output of &lt;code&gt;launchctl error&lt;/code&gt;. You can download LaunchControl for free, and the trial version allows you to verify your plist files.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://launchd.info/"&gt;Launchd.info&lt;/a&gt;: An excellent resource for learning more about configuring and running launch agents.&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>macos</category>
      <category>launchctl</category>
    </item>
    <item>
      <title>Solving Top K Frequent Objects with Count Min Sketch</title>
      <dc:creator>Veerpal</dc:creator>
      <pubDate>Fri, 29 Sep 2023 21:55:24 +0000</pubDate>
      <link>https://dev.to/veerpalb/solving-top-k-frequent-objects-with-count-min-sketch-3io6</link>
      <guid>https://dev.to/veerpalb/solving-top-k-frequent-objects-with-count-min-sketch-3io6</guid>
      <description>&lt;p&gt;A recent system design problem I came across is how to calculate top-K items at a high scale. For instance, determining the top 100 videos on a streaming site.&lt;/p&gt;

&lt;p&gt;In the "leetcode" version of a top K problem, a hash or a heap track the count of an item. However, both hash and heap have a space complexity of &lt;code&gt;O(n)&lt;/code&gt;. For 1 billion videos, that equates to 4GB of data – 8GB if you consider the need to store both video ID and count. Additionally, a heap has a &lt;code&gt;log(n)&lt;/code&gt; insertion time, so as more videos are tracked, updates will slow down. &lt;/p&gt;

&lt;p&gt;There's an alternative for counting a large number of items: the Count-Min Sketch.&lt;/p&gt;

&lt;h3&gt;
  
  
  Enter the Count-Min Sketch
&lt;/h3&gt;

&lt;p&gt;The Count-Min Sketch (CMS) is a probabilistic data structure. It provides approximate counts for large-scale data streams using limited memory.&lt;/p&gt;

&lt;p&gt;A CMS comprises many arrays of a fixed size &lt;code&gt;n&lt;/code&gt;. This &lt;code&gt;n&lt;/code&gt; can be smaller than the total number of items you're tracking. Each array has an associated hash function.&lt;/p&gt;

&lt;p&gt;When you want to increment the count of an item, iterate over each array. For each array, compute the item's hash. The resulting value is an index. To raise the count, you increment the value at that index.&lt;/p&gt;

&lt;p&gt;To find out the final count for an item, repeat the hashing process for each row. Instead of increasing the count, get the current value at the index. The item's count is the minimum value across all rows.&lt;/p&gt;
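&lt;p&gt;The update and query steps can be sketched in Python. This is a simplified illustration with assumed details: the salted-MD5 hashing is just one way to derive per-row hash functions.&lt;/p&gt;

```python
import hashlib


class CountMinSketch:
    """A Count-Min Sketch: `depth` counter arrays, each of fixed size `width`."""

    def __init__(self, depth=3, width=4):
        self.depth = depth
        self.width = width
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One hash function per row, derived by salting the item with the row number
        digest = hashlib.md5(f"{row}:{item}".encode()).hexdigest()
        return int(digest, 16) % self.width

    def increment(self, item):
        # Bump one counter in every row
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += 1

    def estimate(self, item):
        # The true count is at most the smallest of the item's counters
        return min(self.table[row][self._index(item, row)] for row in range(self.depth))
```

&lt;p&gt;After a few increments, &lt;code&gt;estimate&lt;/code&gt; returns at least the true count for each item, and exactly the true count whenever at least one row is collision-free for that item.&lt;/p&gt;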

&lt;p&gt;Let's walk through an example for clarity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Walkthrough
&lt;/h3&gt;

&lt;p&gt;Suppose we have a CMS with three arrays, each of size 4.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;0&lt;/th&gt;
&lt;th&gt;0&lt;/th&gt;
&lt;th&gt;0&lt;/th&gt;
&lt;th&gt;0&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Now, let's say we want to increment the count for videoOne. After hashing the video ID for each row, assume:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;hash_row_1(videoOne) = 0&lt;/li&gt;
&lt;li&gt;hash_row_2(videoOne) = 3&lt;/li&gt;
&lt;li&gt;hash_row_3(videoOne) = 2&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our CMS would then be:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;1&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;0&lt;/th&gt;
&lt;th&gt;0&lt;/th&gt;
&lt;th&gt;0&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Now, suppose we increment the count of another video, videoTwo, which hashes to index 1 in row 1, index 3 in row 2, and index 0 in row 3. The CMS becomes:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;1&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;1&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;0&lt;/th&gt;
&lt;th&gt;0&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Collisions
&lt;/h3&gt;

&lt;p&gt;Note the collision in row 2, where both videos hashed to the same index. That's why when determining the count for an item, we take the minimum value across all arrays. For example, the count for videoOne is the minimum of (1, 2, 1), which is 1. It's improbable for two videos to hash identically across all rows. Hence, even if some rows have collisions, we use the minimum across all rows to determine the count.&lt;/p&gt;

&lt;p&gt;This makes the CMS an "approximation" algorithm. It does not guarantee an accurate count. It may overestimate but will never underestimate the real count.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://redis.com/blog/count-min-sketch-the-art-and-science-of-estimating-stuff/"&gt;This blog&lt;/a&gt; mentions that with a "depth of 10 and a width of 2,000, the probability of not having an error is 99.9%" Increasing the depth of the CMS can further reduce the error rate.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Why Use an Approximation Algorithm?
&lt;/h3&gt;

&lt;p&gt;Why would we use an approximation algorithm that might not return precise results? CMS is memory-efficient since it uses fixed space for estimates. Regardless of how many items we track, its size remains constant. For instance, a 10x4000 CMS uses only 160KB, considerably less than the 4GB required for a heap.&lt;/p&gt;

&lt;p&gt;Additionally, a CMS has constant-time update and lookup, compared to the &lt;code&gt;log(n)&lt;/code&gt; update in a heap. This makes the CMS a faster solution, crucial when dealing with millions or billions of items.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A min heap of size K is still used to track the final &lt;code&gt;K&lt;/code&gt; videos. For each item, update the sketch, estimate the count, and check if the estimate surpasses the heap's minimum. If so, update the heap. The computational cost of updating the min heap remains O(log(k)), and a heap of size 100 is far more manageable in memory than one of size one million.&lt;/p&gt;
&lt;/blockquote&gt;
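&lt;p&gt;As a sketch of that pairing (simplified: an exact &lt;code&gt;Counter&lt;/code&gt; stands in for the sketch's estimates, and the function name is hypothetical):&lt;/p&gt;

```python
import heapq
from collections import Counter


def top_k(stream, k):
    """Track the top-k items of a stream with a size-k min-heap of (count, item)."""
    counts = Counter()          # in the real system: a Count-Min Sketch
    heap, in_heap = [], set()
    for item in stream:
        counts[item] += 1       # cms.increment(item)
        est = counts[item]      # cms.estimate(item)
        if item in in_heap:
            # Refresh the item's stale heap entry with its new estimate
            heap = [(c, v) for c, v in heap if v != item] + [(est, item)]
            heapq.heapify(heap)
        elif len(heap) < k:
            heapq.heappush(heap, (est, item))
            in_heap.add(item)
        elif est > heap[0][0]:
            # The new estimate beats the heap's minimum, so evict it
            _, evicted = heapq.heapreplace(heap, (est, item))
            in_heap.discard(evicted)
            in_heap.add(item)
    return sorted(heap, reverse=True)
```

&lt;p&gt;Feeding the stream &lt;code&gt;["a", "a", "a", "b", "b", "c"]&lt;/code&gt; with k=2 returns the two most frequent items with their counts.&lt;/p&gt;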

&lt;p&gt;The primary trade-off with a CMS is accuracy for space and speed. In most high-scale systems, accuracy still matters, so a CMS can be paired with a more precise solution, such as MapReduce.&lt;/p&gt;

&lt;h3&gt;
  
  
  Using Count-Min Sketch with MapReduce
&lt;/h3&gt;

&lt;p&gt;The overarching strategy is to use the CMS for instant, estimated updates on the top k videos. In the background, run more time-intensive calculations with MapReduce to achieve an accurate top k. Periodically, the Count-Min estimates are refreshed with the precise calculations from MapReduce.&lt;/p&gt;

&lt;h4&gt;
  
  
  Workflow
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Real-time Updates:&lt;/strong&gt; Utilize Count-Min Sketch for immediate top k video estimates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch Processing:&lt;/strong&gt; Periodically employ MapReduce for precise counts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Refinement:&lt;/strong&gt; Refresh the CMS using the exact MapReduce values.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The result is the best of both worlds: immediate insights with gradually improved precision.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Approximation algorithms, like the CMS, are effective for managing vast amounts of data without excessive storage requirements. If accuracy matters, they can be supplemented with slower, exact calculations that provide precise counts at intervals. &lt;/p&gt;

&lt;h3&gt;
  
  
  Code
&lt;/h3&gt;

&lt;p&gt;You can view a simple implementation of a CMS in this &lt;a href="https://gist.github.com/VeerpalBrar/9fb5cb9b0963a1396f4e961f6be69922"&gt;github gist&lt;/a&gt;. &lt;/p&gt;

&lt;h3&gt;
  
  
  Source
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://redis.com/blog/count-min-sketch-the-art-and-science-of-estimating-stuff/"&gt;Count-Min Sketch: The Art and Science of Estimating Stuff&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://towardsdatascience.com/big-data-with-sketchy-structures-part-1-the-count-min-sketch-b73fb3a33e2a#:~:text=Properties%20of%20Count%2DMin%20Sketch&amp;amp;text=We%20increment%20some%20counters%2C%20but,in%20both%20time%20and%20space."&gt;Big Data with Sketchy Structures, Part 1 — the Count-Min Sketch&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>systemdesign</category>
      <category>datastructures</category>
    </item>
    <item>
      <title>Connecting applications in Minikube</title>
      <dc:creator>Veerpal</dc:creator>
      <pubDate>Fri, 09 Jun 2023 21:55:03 +0000</pubDate>
      <link>https://dev.to/veerpalb/connecting-applications-in-minikube-4h70</link>
      <guid>https://dev.to/veerpalb/connecting-applications-in-minikube-4h70</guid>
      <description>&lt;p&gt;Over the past few months, I've been learning about Kubernetes through a side project. As I work with Minikube to run a local cluster with multiple services, I find myself just scratching the surface of Kubernetes. In this blog post, I aim to document my current understanding of the various ways applications in Minikube can connect to each other, the host machine, and the outside world.&lt;/p&gt;

&lt;p&gt;First, here is the setup I'm currently working with: I am using Minikube to create a cluster on my local machine. My cluster runs different services which need to communicate with one another. Some of these services talk to a database running on my host machine, outside of the cluster. Finally, some of these services expose HTTP ports outside the cluster that a "user" can make API requests to. &lt;/p&gt;

&lt;h3&gt;
  
  
  Connecting to the host machine's database
&lt;/h3&gt;

&lt;p&gt;Within a Kubernetes cluster, services are isolated from the external environment. My database resides on my laptop, outside of my cluster, and I needed my services to connect to it. Thankfully, Minikube offers a convenient solution by adding a &lt;code&gt;host.minikube.internal&lt;/code&gt; hostname entry to the &lt;code&gt;/etc/hosts&lt;/code&gt; file. This allows services to access the host's IP address and establish a connection with the database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; &amp;gt; minikube ssh
Last login: Sat Mar 4 00:43:49 2023 from 192.168.49.1
docker@minikube:~$ cat /etc/hosts
127.0.0.1   localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
192.168.49.2    minikube
192.168.65.2    host.minikube.internal
192.168.49.2    control-plane.minikube.internal
docker@minikube:~$
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Accessing Other Services:
&lt;/h3&gt;

&lt;p&gt;With Kubernetes, each pod in a cluster has a unique IP and can connect to other pods without extra network configuration. This extends to services as well which are an abstraction layer around a group of pods.&lt;/p&gt;

&lt;p&gt;To access a service within the cluster, you can use &lt;a href="https://kubernetes.io/docs/tasks/access-application-cluster/access-cluster-services/#manually-constructing-apiserver-proxy-urls"&gt;the service name and port&lt;/a&gt;. For example, if there's an authentication service named auth running on port 5000, other services can connect to it using &lt;code&gt;http://auth:5000&lt;/code&gt;. It's important to note that this URL is not exposed outside the cluster and is limited to inter-cluster communication.&lt;/p&gt;

&lt;h3&gt;
  
  
  Utilizing Ingress for Gateway Applications:
&lt;/h3&gt;

&lt;p&gt;Ingress is a powerful tool for exposing HTTP and HTTPS routes externally in a Kubernetes cluster. By defining routing rules, Ingress allows external requests to be directed to different applications within the cluster. It's worth mentioning that Ingress supports only HTTP and HTTPS, while other protocols and ports require alternative services such as NodePort or LoadBalancer. (I only used ClusterIP services so far and so omit discussion about NodePort or LoadBalancer services from this post.)&lt;/p&gt;

&lt;p&gt;To set up Ingress, you define URLs to be exposed and specify the service within the cluster that each URL should route to. Consider the following example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: gateway-ingress
spec:
  rules:
    - host: my-public-url.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service: 
                name: gateway
                port:
                  number: 8080
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the above configuration, requests made to &lt;code&gt;my-public-url.com&lt;/code&gt; will be automatically routed to the gateway service by Ingress.&lt;/p&gt;

&lt;p&gt;When working with Minikube, running &lt;code&gt;minikube tunnel&lt;/code&gt; is essential to enable external access. Once it is running, accessing &lt;code&gt;my-public-url.com&lt;/code&gt; in a browser will route the request to the cluster running on your computer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion:
&lt;/h3&gt;

&lt;p&gt;In this blog post, I've shared my learnings from working with Minikube and exploring connectivity in Kubernetes. While I've only scratched the surface, I hope this article provides valuable insights. As I continue to learn and delve deeper into Kubernetes, I may update this post with new insights in the future.&lt;/p&gt;

&lt;h3&gt;
  
  
  Resources
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/narasimha1997/communication-between-microservices-in-a-kubernetes-cluster-1n41"&gt;Communication between Microservices in a Kubernetes cluster&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://kubernetes.io/docs/tasks/access-application-cluster/access-cluster-services/"&gt;Access Services Running on Clusters&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://kubernetes.io/docs/tasks/access-application-cluster/ingress-minikube/"&gt;Set up Ingress on Minikube with the NGINX Ingress Controller&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://medium.com/@zhaoyi0113/kubernetes-how-does-service-network-work-in-the-cluster-d235b69ff536"&gt;Kubernetes — How does service network work in the cluster&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://kubebyexample.com/learning-paths/application-development-kubernetes/lesson-3-networking-kubernetes/exposing"&gt;Exposing Applications for Internal Access&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>minikube</category>
      <category>kubernetes</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Using a bash function to push a docker image</title>
      <dc:creator>Veerpal</dc:creator>
      <pubDate>Fri, 09 Jun 2023 21:53:42 +0000</pubDate>
      <link>https://dev.to/veerpalb/using-a-bash-function-to-push-a-docker-image-320e</link>
      <guid>https://dev.to/veerpalb/using-a-bash-function-to-push-a-docker-image-320e</guid>
      <description>&lt;p&gt;I've been learning a bit about docker and found myself repeating the same commands over and over again to push a docker image. I decided to see if I could create an alias for multiple commands in bash. &lt;/p&gt;

&lt;p&gt;A quick google search shows that you can define functions in your &lt;code&gt;.bashrc&lt;/code&gt; to run multiple commands at once. In the end, this is what I came up with.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function docker_push {
  LINE=$(docker build . 2&amp;gt;&amp;amp;1 | grep "writing image sha256")
  IMAGE_SHA=$(echo "$LINE" | awk '{print substr($0,26,10)}')
  docker tag "$IMAGE_SHA" "$1"
  docker push "$1"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then you can run &lt;code&gt;docker_push username/repo:version&lt;/code&gt; to push your docker image. &lt;/p&gt;

&lt;p&gt;If your bash knowledge is as rusty as mine, here is a quick breakdown of how &lt;code&gt;docker_push&lt;/code&gt; works. &lt;/p&gt;

&lt;h2&gt;
  
  
  Redirect docker build output to grep
&lt;/h2&gt;

&lt;p&gt;First off, I knew I wanted to grep the output of &lt;code&gt;docker build .&lt;/code&gt; for the docker image SHA256 code. I tried &lt;code&gt;docker build . | grep "writing image sha256"&lt;/code&gt; but that produced no output. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://forums.docker.com/t/capture-ouput-of-docker-build-into-a-log-file/123178/2"&gt;Then I realized that docker build outputs to stderr, not stdout.&lt;/a&gt; &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 Bash automatically provides &lt;a href="https://catonmat.net/bash-one-liners-explained-part-three#:~:text=When%20bash%20starts%20it%20opens,them%20and%20read%20from%20them."&gt;3 types of file descriptors.&lt;/a&gt; There is stdout (file descriptor 1), stderr (file descriptor 2), and stdin (file descriptor 0). Commands read from stdin and then output to stdout or stderr. &lt;br&gt;
When we use &lt;code&gt;|&lt;/code&gt; in bash, we are piping the stdout of the first command as the stdin of the second command. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Therefore I used &lt;code&gt;2&amp;gt;&amp;amp;1&lt;/code&gt; to redirect the stderr of &lt;code&gt;docker build&lt;/code&gt; command to the stdout file descriptor instead. Then I could use &lt;code&gt;|&lt;/code&gt; to redirect the stdout of &lt;code&gt;docker build .&lt;/code&gt; to the stdin of the grep command. &lt;/p&gt;
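&lt;p&gt;A quick way to see the difference, using a throwaway function that writes only to stderr:&lt;/p&gt;

```shell
err_demo() { echo "oops" 1>&2; }     # writes only to stderr

err_demo | grep -c "oops"            # grep sees nothing and prints 0 (stderr bypasses the pipe)
err_demo 2>&1 | grep -c "oops"       # grep prints 1: stderr was merged into stdout first
```

&lt;p&gt;The same thing happens with &lt;code&gt;docker build&lt;/code&gt;: without &lt;code&gt;2&amp;gt;&amp;amp;1&lt;/code&gt;, its output never reaches grep.&lt;/p&gt;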

&lt;p&gt;This let me grep for the line with the SHA256 code. I save the output of grep into a variable for later use. This is done with &lt;code&gt;MY_VAR=$(COMMAND)&lt;/code&gt; syntax, where the result of &lt;code&gt;COMMAND&lt;/code&gt; is saved to &lt;code&gt;MY_VAR&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;For reference, the value of &lt;code&gt;LINE&lt;/code&gt; is something like &lt;code&gt;#11 writing image sha256:ee19794e19c05bfab071c3e3593379a20ae9b59cf0dd47ac0c39274e0333e6b2 done&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Extracting the SHA256 code from the grep output
&lt;/h2&gt;

&lt;p&gt;Next, I used &lt;code&gt;awk&lt;/code&gt; to get the substring of the &lt;code&gt;LINE&lt;/code&gt; that contains the beginning of the SHA256 code. &lt;/p&gt;

&lt;p&gt;Since I know that &lt;code&gt;LINE&lt;/code&gt; always starts with &lt;code&gt;#11 writing image sha256:&lt;/code&gt;, I decided to get the substring starting at character 26 and take the next 10 characters, which are the start of the SHA256 code. I did this with &lt;code&gt;awk '{print substr($0,26,10)}'&lt;/code&gt; (full credit: &lt;a href="https://stackoverflow.com/questions/24427009/is-there-a-cleaner-way-of-getting-the-last-n-characters-of-every-line"&gt;stackoverflow&lt;/a&gt;).&lt;/p&gt;
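&lt;p&gt;For example, running the extraction on a sample line:&lt;/p&gt;

```shell
LINE='#11 writing image sha256:ee19794e19c05bfab071c3e3593379a20ae9b59cf0dd47ac0c39274e0333e6b2 done'
# Characters 1-25 are "#11 writing image sha256:", so the hash starts at character 26
echo "$LINE" | awk '{print substr($0,26,10)}'   # prints ee19794e19
```

&lt;p&gt;Ten characters of the hash are enough for &lt;code&gt;docker tag&lt;/code&gt; to identify the image.&lt;/p&gt;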

&lt;p&gt;Again, awk reads from &lt;code&gt;stdin&lt;/code&gt;, so I used &lt;code&gt;echo $LINE&lt;/code&gt; to get the value of &lt;code&gt;LINE&lt;/code&gt; and redirected that to the stdin of &lt;code&gt;awk&lt;/code&gt;. I save the result in &lt;code&gt;$IMAGE_SHA&lt;/code&gt;. &lt;/p&gt;

&lt;h2&gt;
  
  
  Getting function arguments
&lt;/h2&gt;

&lt;p&gt;Now that I have the image SHA256 code, I can pass that as an argument to &lt;code&gt;docker tag&lt;/code&gt;. The &lt;code&gt;docker tag&lt;/code&gt; command needs the SHA256 code and the repo tag. Since the repo tag value changes based on which repo you are working with, I decided to pass that in as an argument to &lt;code&gt;docker_push&lt;/code&gt;. Then I can use &lt;code&gt;$1&lt;/code&gt; to reference the first argument passed to my function. &lt;/p&gt;

&lt;p&gt;So if I call &lt;code&gt;docker_push username/repo:version&lt;/code&gt; then the value of &lt;code&gt;$1&lt;/code&gt; is "username/repo:version". &lt;/p&gt;

&lt;h3&gt;
  
  
  Sources:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://stackoverflow.com/questions/7131670/make-a-bash-alias-that-takes-a-parameter"&gt;Make a bash alias that takes a function&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://stackoverflow.com/questions/5955577/automatically-capture-output-of-last-command-into-a-variable-using-bash"&gt;Capture output of a command in a variable&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://unix.stackexchange.com/questions/400038/send-stderr-to-stdout-for-purposes-of-grep"&gt;Redirect stderr to stdout&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://catonmat.net/bash-one-liners-explained-part-three#:~:text=When%20bash%20starts%20it%20opens,them%20and%20read%20from%20them."&gt;Bash One-Liners Explained, Part III: All about redirection&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>docker</category>
      <category>bash</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Dependency Management With Bundler</title>
      <dc:creator>Veerpal</dc:creator>
      <pubDate>Fri, 09 Jun 2023 21:51:00 +0000</pubDate>
      <link>https://dev.to/veerpalb/dependency-management-with-bundler-34hm</link>
      <guid>https://dev.to/veerpalb/dependency-management-with-bundler-34hm</guid>
      <description>&lt;p&gt;The &lt;code&gt;venv&lt;/code&gt; module in python isolates packages of one python project from another project. I remember trying to install flask and running into dependency conflicts until I learned about &lt;code&gt;venv&lt;/code&gt;. Recently, I started wondering why I don't run into the same issues when working with rails. This led me down the rabbit hole of learning about bundler and dependency isolation. &lt;/p&gt;

&lt;h3&gt;
  
  
  What is bundler?
&lt;/h3&gt;

&lt;p&gt;Bundler is a popular ruby gem used to install project dependencies instead of installing each gem individually via &lt;code&gt;gem install&lt;/code&gt;. An application defines a &lt;code&gt;Gemfile&lt;/code&gt; with all the project's gem dependencies. Then &lt;code&gt;bundle install&lt;/code&gt; installs each of the gems, resolving any dependency conflicts along the way. For example, assume the application depends on &lt;code&gt;gem_a&lt;/code&gt; and &lt;code&gt;gem_b&lt;/code&gt;, and requires &lt;code&gt;gem_a&lt;/code&gt; to be version 3 or higher. &lt;code&gt;gem_b&lt;/code&gt; also depends on &lt;code&gt;gem_a&lt;/code&gt;, but requires version 4 or higher. Bundler will install version 4, since that satisfies all dependencies. Bundler then creates a &lt;code&gt;Gemfile.lock&lt;/code&gt; file listing all the installed gems and their versions, which makes it easy for another developer to install the same dependencies on their computer. &lt;/p&gt;
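&lt;p&gt;A hypothetical &lt;code&gt;Gemfile&lt;/code&gt; for that scenario might look like this (gem names illustrative):&lt;/p&gt;

```ruby
# Gemfile
source "https://rubygems.org"

gem "gem_a", ">= 3"  # the application itself needs gem_a version 3 or higher
gem "gem_b"          # gem_b's own gemspec requires gem_a >= 4
```

&lt;p&gt;Running &lt;code&gt;bundle install&lt;/code&gt; against this file would resolve &lt;code&gt;gem_a&lt;/code&gt; to a version satisfying both constraints and record it in &lt;code&gt;Gemfile.lock&lt;/code&gt;.&lt;/p&gt;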

&lt;h3&gt;
  
  
  What is bundle exec?
&lt;/h3&gt;

&lt;p&gt;So how does that offer dependency isolation? Well, it is common to run rails with &lt;code&gt;bundle exec&lt;/code&gt; (i.e. &lt;code&gt;bundle exec rspec &amp;lt;path/to/file&amp;gt;&lt;/code&gt;). Running &lt;code&gt;bundle exec&lt;/code&gt; ensures all the gems specified in the &lt;code&gt;Gemfile&lt;/code&gt; are automatically available to the ruby application via &lt;code&gt;require&lt;/code&gt;. Moreover, it ensures that only those gems are available. So if you have many versions of a gem installed, only the version specified in the &lt;code&gt;Gemfile&lt;/code&gt; will be available to the application. &lt;/p&gt;

&lt;p&gt;For example, &lt;code&gt;require 'json'&lt;/code&gt; will always use the latest version of &lt;code&gt;json&lt;/code&gt; installed on the computer. So if another application installed a higher version of the gem, that version may be loaded instead of the version you intended. &lt;/p&gt;

&lt;p&gt;With &lt;code&gt;bundle exec&lt;/code&gt;, only the versions specified in the &lt;code&gt;Gemfile&lt;/code&gt; will be available, which ensures the correct version is imported by &lt;code&gt;require&lt;/code&gt;. &lt;/p&gt;

&lt;h3&gt;
  
  
  Ruby load path
&lt;/h3&gt;

&lt;p&gt;So, how exactly does &lt;code&gt;bundle exec&lt;/code&gt; ensure the correct version is used by the application? &lt;/p&gt;

&lt;p&gt;RubyGems uses a global variable called &lt;code&gt;$LOAD_PATH&lt;/code&gt;, which stores the paths to the gems installed on a computer. &lt;code&gt;require&lt;/code&gt; uses &lt;code&gt;$LOAD_PATH&lt;/code&gt; to find a gem and load it. By default, &lt;code&gt;$LOAD_PATH&lt;/code&gt; contains the path to the latest version of each gem. &lt;/p&gt;
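&lt;p&gt;You can see for yourself that &lt;code&gt;$LOAD_PATH&lt;/code&gt; is just an ordinary array of directory paths (the &lt;code&gt;/tmp&lt;/code&gt; path below is made up for illustration):&lt;/p&gt;

```ruby
# $LOAD_PATH (alias $:) is a plain Ruby array of directories.
puts $LOAD_PATH.is_a?(Array) # => true

# require walks these directories in order, so prepending a directory
# makes its files win over any later entries.
$LOAD_PATH.unshift("/tmp/my_gems/lib")
puts $LOAD_PATH.first # => /tmp/my_gems/lib
```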

&lt;p&gt;However, &lt;code&gt;bundle exec&lt;/code&gt; &lt;a href="https://github.com/rubygems/rubygems/blob/master/bundler/lib/bundler/runtime.rb#L16"&gt;overrides the &lt;code&gt;$LOAD_PATH&lt;/code&gt;&lt;/a&gt; to contain the paths to the gems in the &lt;code&gt;Gemfile&lt;/code&gt; (at the versions specified there), and only those gems. This ensures that the correct version of each gem is always used, regardless of which other versions may be installed on the computer. &lt;/p&gt;

&lt;h3&gt;
  
  
  Testing this in practice
&lt;/h3&gt;

&lt;p&gt;You can see this in action by running code that requires the JSON gem and then prints the load path. It also converts a hash to JSON.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;require 'json'
pp $LOAD_PATH

print JSON.generate({"key"=&amp;gt;"http://www.example.com/test"}, escape_slash: true)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without Bundler, it loads the latest json gem I have installed (2.6.3). Notice that this version of the json gem escapes the slashes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
 % ruby json_with_escape.rb
["/Users/veerpalbrar/.rvm/gems/ruby-2.7.2/gems/json-2.6.3/lib",
 "/Users/veerpalbrar/.rvm/gems/ruby-2.7.2/extensions/x86_64-darwin-21/2.7.0/json-2.6.3",
 "/Users/veerpalbrar/.rvm/rubies/ruby-2.7.2/lib/ruby/site_ruby/2.7.0",
...]

{"key":"http:\/\/www.example.com\/test"}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When I create a Gemfile specifying version 2.3.1 and run the Ruby file with &lt;code&gt;bundle exec&lt;/code&gt;, you can see that version 2.3.1 is listed in the &lt;code&gt;$LOAD_PATH&lt;/code&gt;. You can also see that this version of the gem doesn't escape the slashes in the URL.&lt;br&gt;
&lt;/p&gt;
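&lt;p&gt;The post doesn't show the Gemfile itself; a minimal one pinning the version would look something like this (a sketch):&lt;/p&gt;

```ruby
# Gemfile (sketch)
source "https://rubygems.org"

gem "json", "2.3.1" # pin the exact version used in the experiment
```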

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; % bundle exec ruby json_with_escape.rb
["/Users/veerpalbrar/.rvm/gems/ruby-2.7.2/gems/bundler-2.3.19/lib",
 "/Users/veerpalbrar/.rvm/gems/ruby-2.7.2/gems/json-2.3.1/lib",
 "/Users/veerpalbrar/.rvm/gems/ruby-2.7.2/extensions/x86_64-darwin-21/2.7.0/json-2.3.1",
 "/Users/veerpalbrar/.rvm/rubies/ruby-2.7.2/lib/ruby/site_ruby/2.7.0",
...]
{"key":"http://www.example.com/test"}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is why your application needs to specify its dependencies. If a gem is updated, you want the application to keep using the older version rather than break existing behaviour. &lt;code&gt;bundler&lt;/code&gt; is one tool you can use for this dependency management. &lt;/p&gt;
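&lt;p&gt;As an aside, you don't have to pin an exact version to get this protection: the pessimistic version operator allows patch releases while blocking breaking upgrades. A quick check with &lt;code&gt;Gem::Requirement&lt;/code&gt;:&lt;/p&gt;

```ruby
require "rubygems"

# "~> 2.3.1" allows any 2.3.x at or above 2.3.1, but not 2.4 or 3.0.
requirement = Gem::Requirement.new("~> 2.3.1")
puts requirement.satisfied_by?(Gem::Version.new("2.3.9")) # => true
puts requirement.satisfied_by?(Gem::Version.new("2.4.0")) # => false
```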

&lt;h3&gt;
  
  
  Sources
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=""&gt;Bundler&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.brianstorti.com/understanding-bundler-setup-process/"&gt;Understanding Bundler Setup Process&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://medium.com/@aayushsharda/why-to-use-load-path-in-ruby-ce971bc1d864#:~:text=%24LOAD_PATH%20is%20used%20for%20the,the%20dependencies%20in%20the%20project"&gt;Why to use $LOAD_PATH in ruby&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://webapps-for-beginners.rubymonstas.org/libraries/load_path.html"&gt;Ruby load_path&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ruby</category>
      <category>bundler</category>
      <category>rails</category>
    </item>
    <item>
      <title>Database updates using a quorum</title>
      <dc:creator>Veerpal</dc:creator>
      <pubDate>Wed, 11 May 2022 19:47:26 +0000</pubDate>
      <link>https://dev.to/veerpalb/database-updates-using-a-quorum-90a</link>
      <guid>https://dev.to/veerpalb/database-updates-using-a-quorum-90a</guid>
      <description>&lt;h3&gt;
  
  
  Problem Statement
&lt;/h3&gt;

&lt;p&gt;In a distributed system, you want many replicas of your database to ensure that data is never lost. The challenge with database replicas is keeping the data consistent across them: if you update the data in one database, all the replicas should also get updated. &lt;/p&gt;

&lt;p&gt;One approach is to update every replica on each write, but this can make your system unreliable. If even one replica is unavailable, the system cannot write to the database, and the replicas fall out of sync. The more database replicas there are, the more likely it is that some replica will be unavailable. &lt;/p&gt;

&lt;p&gt;One solution to the database consistency problem is to use a quorum. &lt;/p&gt;

&lt;h3&gt;
  
  
  What is a quorum?
&lt;/h3&gt;

&lt;p&gt;A quorum is the minimum number of nodes that must perform an operation for it to be considered a success. Usually, the quorum is a majority of the nodes. By not requiring all nodes to accept an operation, we make the system more fault tolerant: reads and writes can continue as long as most of the replicas are available. This is reliable because it's unlikely that many replicas will be unavailable at the same time. &lt;/p&gt;
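&lt;p&gt;A majority quorum is easy to compute, and it has the property (used later in this post) that any two majorities must share at least one node. A sketch:&lt;/p&gt;

```ruby
# Majority quorum size for n replicas: floor(n/2) + 1 nodes.
def quorum_size(n)
  n / 2 + 1 # integer division
end

puts quorum_size(5) # => 3
puts quorum_size(4) # => 3

# Two quorums together always exceed n, so they must overlap somewhere.
puts quorum_size(5) * 2 > 5 # => true
```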

&lt;h3&gt;
  
  
  Example execution of a write operation
&lt;/h3&gt;

&lt;p&gt;Consider the case where we want to update a row in our database. We need a majority of the replicas to agree to the update for it to be considered successful.&lt;/p&gt;

&lt;p&gt;If we have 5 replicas (&lt;code&gt;N1&lt;/code&gt;, &lt;code&gt;N2&lt;/code&gt;, &lt;code&gt;N3&lt;/code&gt;, &lt;code&gt;N4&lt;/code&gt;, &lt;code&gt;N5&lt;/code&gt;), we push the update to all of them. Three replicas need to respond and confirm the update succeeded to form a quorum. For example, if &lt;code&gt;N1&lt;/code&gt;, &lt;code&gt;N3&lt;/code&gt;, and &lt;code&gt;N4&lt;/code&gt; respond to the update request, we have formed a quorum and can tell the client the write was successful without waiting for a response from &lt;code&gt;N2&lt;/code&gt; and &lt;code&gt;N5&lt;/code&gt;. Note that &lt;code&gt;N2&lt;/code&gt; and &lt;code&gt;N5&lt;/code&gt; will still process the update if they are available. &lt;/p&gt;

&lt;p&gt;You can see a simple example of this below. In &lt;code&gt;wait_for_result&lt;/code&gt;, we wait for responses from the different "nodes". Once we have enough responses to form a quorum, we return and consider the write successful. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Aside: I use threads and the &lt;code&gt;sleep&lt;/code&gt; function to represent how nodes take varying amounts of time to respond. I also kill threads early to mimic how some replicas can be unavailable and miss the update.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class Quorum
  attr_reader :nodes

  def initialize(nodes)
    @nodes = nodes
  end

  def write(key, value)
    wait_for_result(:write, key, value, Time.now)
  end

  private

  def quorum_size
    # A majority: floor(n/2) + 1 (e.g. 3 of 5, 3 of 4). Note that
    # (n / 2.0).ceil is only a majority when n is odd.
    @quorum_size ||= @nodes.length / 2 + 1
  end

  def wait_for_result(action, *args)
    responses = []
    tasklist = []

    # Set the threads going
    puts "STARTING #: #{action} #{args}"
    nodes.each do |node|
      task = Thread.new do
        sleep(rand(3)) # mimic the variable response times from the network
        result = node.send(action, *args)
        responses.push(result)
      end
      tasklist &amp;lt;&amp;lt; task
    end

    # Wait for quorum to be formed
    sleep 0.1 while responses.length &amp;lt; quorum_size

    # thread clean up
    tasklist.each { |task|
      task.kill if task.alive?
    }

    puts "FINISHED #: #{action} #{args}"
    responses
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even if some nodes are unavailable, the remaining nodes successfully process the update: a quorum is formed and the operation is considered a success. This can leave the unavailable nodes without the latest data; I'll show how we handle those conflicts later on. &lt;/p&gt;
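&lt;p&gt;The code above calls &lt;code&gt;node.send(action, *args)&lt;/code&gt; but never shows the nodes themselves. A minimal in-memory replica that is compatible with those calls might look like this (the class and method names are my assumptions, not from the original gist):&lt;/p&gt;

```ruby
# A minimal in-memory replica: one value per key, stored with its write time.
class Node
  def initialize
    @store = {}
  end

  def write(key, value, time)
    @store[key] = { value: value, time: time }
  end

  def read(key)
    @store[key] # nil if this replica never saw the key
  end
end

node = Node.new
node.write(:foo, "bar", Time.now)
puts node.read(:foo)[:value] # => bar
```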

&lt;h3&gt;
  
  
  Example execution of a read operation
&lt;/h3&gt;

&lt;p&gt;Just as we form a quorum for the write operation, we need to form a quorum for reading data. If we were to read from only one replica, we would risk returning outdated data whenever that replica is not up to date. &lt;/p&gt;

&lt;p&gt;Instead, we send the read request to all the replicas and wait for enough responses to form a quorum. If all the replicas in the quorum return the same data, we can assume the data is up to date and return it to the client.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class quorum
  attr_reader :nodes

  def initialize(nodes)
    @nodes = nodes
  end

  def read(key)
    results = wait_for_result(:read, key)
    if read_conflicts?(results)
      raise "Conflicting reads"
    end

    puts "No conflicts"
    results.first[:value]
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Conflict Resolution in Reads
&lt;/h4&gt;

&lt;p&gt;Sometimes, the replicas in the quorum may not all have the same data. If one of the replicas was unavailable during a previous update, it will have outdated data. &lt;/p&gt;

&lt;p&gt;That is why we check whether all the replicas return the same result for the read operation. If the results differ, it means that some of the replicas have outdated data. &lt;/p&gt;

&lt;p&gt;In this case, we should return the result of the most recent update. If you look at the code for the write operation, you can see we save a timestamp with each write. We can use the timestamp to determine which replica has the most recent update; that is the result we return to the client. &lt;/p&gt;

&lt;p&gt;Once we resolve a read conflict, we should update all the replicas to ensure they are up to date.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class quorum
  attr_reader :nodes

  def initialize(nodes)
    @nodes = nodes
  end

  def read(key)
    results = wait_for_result(:read, key)
    if read_conflicts?(results)
      puts "Conflicting reads: #{results.map{|r| r ? r[:value] : nil}.uniq}"

      latest_value = latest_value(results)
      wait_for_result(:write, key, latest_value[:value], latest_value[:time])

      return latest_value[:value]
    end

    puts "No conflicts"
    results.first[:value]
  end

  private


  def read_conflicts?(results)
    results.map { |result| result ? result[:value] : nil }.uniq.size &amp;gt; 1
  end

  def latest_value(results)
    results.reduce(nil) do |latest, result|
      if result &amp;amp;&amp;amp; (!latest || result[:time] &amp;gt; latest[:time])
        result
      else
        latest
      end
    end
  end
end

### SAMPLE OUTPUT
STARTING #: write [:foo, "bar", 2022-05-10 15:35:52 -0400]
FINISHED #: write [:foo, "bar", 2022-05-10 15:35:52 -0400]
STARTING #: read [:foo]
FINISHED #: read [:foo]
Conflicting reads: ["bar", nil]
STARTING #: write [:foo, "bar", 2022-05-10 15:35:52 -0400]
FINISHED #: write [:foo, "bar", 2022-05-10 15:35:52 -0400]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Achieving consistency
&lt;/h3&gt;

&lt;p&gt;How can we be certain that one of the read results will be the most recent data? What if all the replicas in the quorum are out of date? Well, remember that a write quorum requires a majority of the replicas, and a read likewise needs responses from a majority. Any two majorities must overlap, so at least one replica in the read quorum was also part of the last write quorum. Thus, we can be certain that at least one replica will return the most recent result.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;In conclusion, when you have many database replicas, you need a system to keep them in sync. Using a quorum is one way to provide consistent results while keeping the system reliable and fault tolerant. &lt;/p&gt;

&lt;h3&gt;
  
  
  Code
&lt;/h3&gt;

&lt;p&gt;View the code from this post on &lt;a href="https://gist.github.com/VeerpalBrar/9481931b396d89767e6b4aeca97715ec"&gt;GitHub&lt;/a&gt;. &lt;/p&gt;

&lt;h3&gt;
  
  
  Sources
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Educative Grokking the System Design Interview course. &lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=uNxl3BFcKSA"&gt;Distributed Systems 5.2: Quorums by Martin Kleppmann&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>quorum</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Fixing N+1 queries when using validates_associated with has_many</title>
      <dc:creator>Veerpal</dc:creator>
      <pubDate>Tue, 21 Dec 2021 16:56:50 +0000</pubDate>
      <link>https://dev.to/veerpalb/fixing-n1-queries-when-using-validatesassociated-with-hasmany-4194</link>
      <guid>https://dev.to/veerpalb/fixing-n1-queries-when-using-validatesassociated-with-hasmany-4194</guid>
<description>&lt;p&gt;In ActiveRecord, when validating an object, &lt;a href="https://apidock.com/rails/ActiveRecord/Validations/ClassMethods/validates_associated"&gt;&lt;code&gt;validates_associated&lt;/code&gt;&lt;/a&gt; validates any associated objects. Assume that an author has many books: every time the author is validated, &lt;code&gt;validates_associated&lt;/code&gt; also validates the author's books.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Author&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;
 &lt;span class="n"&gt;has_many&lt;/span&gt; &lt;span class="ss"&gt;:books&lt;/span&gt;
 &lt;span class="n"&gt;validates_associated&lt;/span&gt; &lt;span class="ss"&gt;:books&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Book&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;
 &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:author&lt;/span&gt;
 &lt;span class="n"&gt;has_one&lt;/span&gt; &lt;span class="ss"&gt;:cover&lt;/span&gt;
 &lt;span class="n"&gt;validates&lt;/span&gt; &lt;span class="ss"&gt;:title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;presence: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;last&lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"New Name"&lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save!&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see the associated object validation in the logs. When saving the author, all the author's books are loaded into memory for validation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;TRANSACTION&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;begin&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt;

&lt;span class="no"&gt;Book&lt;/span&gt; &lt;span class="no"&gt;Load&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="no"&gt;SELECT&lt;/span&gt; &lt;span class="s2"&gt;"books"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;*&lt;/span&gt; &lt;span class="no"&gt;FROM&lt;/span&gt; &lt;span class="s2"&gt;"books"&lt;/span&gt; &lt;span class="no"&gt;WHERE&lt;/span&gt; &lt;span class="s2"&gt;"books"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="s2"&gt;"author_id"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s2"&gt;"author_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="no"&gt;Author&lt;/span&gt; &lt;span class="no"&gt;Update&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="no"&gt;UPDATE&lt;/span&gt; &lt;span class="s2"&gt;"authors"&lt;/span&gt; &lt;span class="no"&gt;SET&lt;/span&gt; &lt;span class="s2"&gt;"name"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="no"&gt;WHERE&lt;/span&gt; &lt;span class="s2"&gt;"authors"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="s2"&gt;"id"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s2"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"New Name"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

&lt;span class="no"&gt;TRANSACTION&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.2&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;commit&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;validates_associated&lt;/code&gt; is a quick way to ensure that ActiveRecord objects that depend on each other don't become invalid when one of them changes. It may seem like a good idea to add it to all your models and always be confident that they are valid.&lt;/p&gt;

&lt;p&gt;However, it's important not to overuse this method. Assume that a book has a cover. Every time the book updates, the cover also needs to be validated to ensure it has the correct title.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Author&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;
 &lt;span class="n"&gt;has_many&lt;/span&gt; &lt;span class="ss"&gt;:books&lt;/span&gt;
 &lt;span class="n"&gt;validates_associated&lt;/span&gt; &lt;span class="ss"&gt;:books&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Book&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;
 &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:author&lt;/span&gt;
 &lt;span class="n"&gt;has_one&lt;/span&gt; &lt;span class="ss"&gt;:cover&lt;/span&gt;
 &lt;span class="n"&gt;validates&lt;/span&gt; &lt;span class="ss"&gt;:title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;presence: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;
 &lt;span class="n"&gt;validates_associated&lt;/span&gt; &lt;span class="ss"&gt;:cover&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Cover&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;
 &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:book&lt;/span&gt;
 &lt;span class="n"&gt;validates_presence_of&lt;/span&gt; &lt;span class="ss"&gt;:book&lt;/span&gt;

 &lt;span class="n"&gt;validate&lt;/span&gt; &lt;span class="ss"&gt;:cover_has_correct_title?&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;last&lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"New Name"&lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save!&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When saving the author model, all the author's books are still loaded into memory for validation. The cover for each book is &lt;em&gt;also&lt;/em&gt; loaded into memory, one query at a time. This results in N+1 queries to fetch the books and covers from the database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;TRANSACTION&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;begin&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt;
&lt;span class="no"&gt;Book&lt;/span&gt; &lt;span class="no"&gt;Load&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="no"&gt;SELECT&lt;/span&gt; &lt;span class="s2"&gt;"books"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;*&lt;/span&gt; &lt;span class="no"&gt;FROM&lt;/span&gt; &lt;span class="s2"&gt;"books"&lt;/span&gt; &lt;span class="no"&gt;WHERE&lt;/span&gt; &lt;span class="s2"&gt;"books"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="s2"&gt;"author_id"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s2"&gt;"author_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

&lt;span class="no"&gt;Cover&lt;/span&gt; &lt;span class="no"&gt;Load&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="no"&gt;SELECT&lt;/span&gt; &lt;span class="s2"&gt;"covers"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;*&lt;/span&gt; &lt;span class="no"&gt;FROM&lt;/span&gt; &lt;span class="s2"&gt;"covers"&lt;/span&gt; &lt;span class="no"&gt;WHERE&lt;/span&gt; &lt;span class="s2"&gt;"covers"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="s2"&gt;"book_id"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="no"&gt;LIMIT&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s2"&gt;"book_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"LIMIT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="no"&gt;Cover&lt;/span&gt; &lt;span class="no"&gt;Load&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="no"&gt;SELECT&lt;/span&gt; &lt;span class="s2"&gt;"covers"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;*&lt;/span&gt; &lt;span class="no"&gt;FROM&lt;/span&gt; &lt;span class="s2"&gt;"covers"&lt;/span&gt; &lt;span class="no"&gt;WHERE&lt;/span&gt; &lt;span class="s2"&gt;"covers"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="s2"&gt;"book_id"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="no"&gt;LIMIT&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s2"&gt;"book_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"LIMIT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="no"&gt;Cover&lt;/span&gt; &lt;span class="no"&gt;Load&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="no"&gt;SELECT&lt;/span&gt; &lt;span class="s2"&gt;"covers"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;*&lt;/span&gt; &lt;span class="no"&gt;FROM&lt;/span&gt; &lt;span class="s2"&gt;"covers"&lt;/span&gt; &lt;span class="no"&gt;WHERE&lt;/span&gt; &lt;span class="s2"&gt;"covers"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="s2"&gt;"book_id"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="no"&gt;LIMIT&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s2"&gt;"book_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"LIMIT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

&lt;span class="no"&gt;Author&lt;/span&gt; &lt;span class="no"&gt;Update&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="no"&gt;UPDATE&lt;/span&gt; &lt;span class="s2"&gt;"authors"&lt;/span&gt; &lt;span class="no"&gt;SET&lt;/span&gt; &lt;span class="s2"&gt;"name"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="no"&gt;WHERE&lt;/span&gt; &lt;span class="s2"&gt;"authors"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="s2"&gt;"id"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s2"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"New Name"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

&lt;span class="no"&gt;TRANSACTION&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;commit&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Book covers need to be validated when a book updates, not when the author's information changes. Adding &lt;code&gt;validates_associated&lt;/code&gt; to a model is a simple change with a real performance cost: multiple extra database calls are now made whenever an author's information changes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Solution 1: Narrow the scope of validation
&lt;/h3&gt;

&lt;p&gt;The first solution to this problem is to narrow the scope of validation. Consider the scenarios where a model can become invalid, then set up the validation to trigger only in those scenarios instead of on every validation check.&lt;/p&gt;

&lt;p&gt;In the author-book-cover example, a cover can only become invalid when the title of its book changes. So the code should validate the cover only when a book's title has changed. This is possible using &lt;code&gt;validates_associated&lt;/code&gt;'s configuration options.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Book&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;
 &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:author&lt;/span&gt;
 &lt;span class="n"&gt;has_one&lt;/span&gt; &lt;span class="ss"&gt;:cover&lt;/span&gt;
 &lt;span class="n"&gt;validates&lt;/span&gt; &lt;span class="ss"&gt;:title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;presence: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;
 &lt;span class="n"&gt;validates_associated&lt;/span&gt; &lt;span class="ss"&gt;:cover&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;if: &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;title_changed?&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this approach, &lt;code&gt;validates_associated&lt;/code&gt; first checks whether the title has changed. If it has, the associated cover is validated. Otherwise, the cover is assumed to still be valid from the last time it was validated.&lt;/p&gt;

&lt;p&gt;Now, if you look at the logs, you can see that the N+1 query does not happen:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;TRANSACTION&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;begin&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt;
&lt;span class="no"&gt;Book&lt;/span&gt; &lt;span class="no"&gt;Load&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="no"&gt;SELECT&lt;/span&gt; &lt;span class="s2"&gt;"books"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;*&lt;/span&gt; &lt;span class="no"&gt;FROM&lt;/span&gt; &lt;span class="s2"&gt;"books"&lt;/span&gt; &lt;span class="no"&gt;WHERE&lt;/span&gt; &lt;span class="s2"&gt;"books"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="s2"&gt;"author_id"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s2"&gt;"author_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

&lt;span class="no"&gt;Author&lt;/span&gt; &lt;span class="no"&gt;Update&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="no"&gt;UPDATE&lt;/span&gt; &lt;span class="s2"&gt;"authors"&lt;/span&gt; &lt;span class="no"&gt;SET&lt;/span&gt; &lt;span class="s2"&gt;"name"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="no"&gt;WHERE&lt;/span&gt; &lt;span class="s2"&gt;"authors"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="s2"&gt;"id"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s2"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"New Name"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

&lt;span class="no"&gt;TRANSACTION&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.7&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;commit&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By being more fine-grained with your validation, you can ensure you do not trigger unnecessary processing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Solution #2
&lt;/h3&gt;

&lt;p&gt;If you need to validate books and covers every time an author is updated, then avoid using &lt;code&gt;validates_associated&lt;/code&gt;. Instead, have the author load both books and covers before running the validation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Author&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;
 &lt;span class="n"&gt;has_many&lt;/span&gt; &lt;span class="ss"&gt;:books&lt;/span&gt;
 &lt;span class="n"&gt;validate&lt;/span&gt; &lt;span class="ss"&gt;:books_are_valid&lt;/span&gt;

 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;books_are_valid&lt;/span&gt;
 &lt;span class="n"&gt;books&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;preload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:cover&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;all?&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="ss"&gt;:valid?&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Book&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;
 &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:author&lt;/span&gt;
 &lt;span class="n"&gt;has_one&lt;/span&gt; &lt;span class="ss"&gt;:cover&lt;/span&gt;
 &lt;span class="n"&gt;validates&lt;/span&gt; &lt;span class="ss"&gt;:title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;presence: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;
 &lt;span class="n"&gt;validates_associated&lt;/span&gt; &lt;span class="ss"&gt;:cover&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;last&lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"New Name"&lt;/span&gt;
&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save!&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;preload&lt;/code&gt; loads all the covers for all the author's books in one database query. This avoids the N+1 query problem caused by loading the covers for each book one at a time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;TRANSACTION&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;begin&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt;

&lt;span class="no"&gt;Book&lt;/span&gt; &lt;span class="no"&gt;Load&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="no"&gt;SELECT&lt;/span&gt; &lt;span class="s2"&gt;"books"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;*&lt;/span&gt; &lt;span class="no"&gt;FROM&lt;/span&gt; &lt;span class="s2"&gt;"books"&lt;/span&gt; &lt;span class="no"&gt;WHERE&lt;/span&gt; &lt;span class="s2"&gt;"books"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="s2"&gt;"author_id"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s2"&gt;"author_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="no"&gt;Cover&lt;/span&gt; &lt;span class="no"&gt;Load&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="no"&gt;SELECT&lt;/span&gt; &lt;span class="s2"&gt;"covers"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;*&lt;/span&gt; &lt;span class="no"&gt;FROM&lt;/span&gt; &lt;span class="s2"&gt;"covers"&lt;/span&gt; &lt;span class="no"&gt;WHERE&lt;/span&gt; &lt;span class="s2"&gt;"covers"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="s2"&gt;"book_id"&lt;/span&gt; &lt;span class="no"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;?,&lt;/span&gt; &lt;span class="p"&gt;?,&lt;/span&gt; &lt;span class="sc"&gt;?)&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s2"&gt;"book_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"book_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"book_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

&lt;span class="no"&gt;Author&lt;/span&gt; &lt;span class="no"&gt;Update&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="no"&gt;UPDATE&lt;/span&gt; &lt;span class="s2"&gt;"authors"&lt;/span&gt; &lt;span class="no"&gt;SET&lt;/span&gt; &lt;span class="s2"&gt;"name"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="no"&gt;WHERE&lt;/span&gt; &lt;span class="s2"&gt;"authors"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="s2"&gt;"id"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s2"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"New Name"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

&lt;span class="no"&gt;TRANSACTION&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="n"&gt;ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;commit&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The downside to this approach is that the Author model is aware of the relationship between books and covers, coupling the three models together. That may be a trade-off you are willing to make to avoid calling the database more than necessary.&lt;/p&gt;

&lt;h3&gt;
  
  
  In conclusion
&lt;/h3&gt;

&lt;p&gt;Rails has a lot of "magic" methods that make it easy to add new functionality. However, they can sometimes come with unintended consequences in practice, such as N+1 queries.&lt;/p&gt;

&lt;p&gt;Extra validation may make you feel safer, but it can slow down your code if you are not careful. Add as much validation as you need and nothing more.&lt;/p&gt;

&lt;h4&gt;
  
  
  Code
&lt;/h4&gt;

&lt;p&gt;If you want to view and run the code mentioned in this blog post, you can find the source code in this &lt;a href="https://gist.github.com/VeerpalBrar/2fc3ec1913cabadcaeaec44c96223a40"&gt;GitHub gist&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ruby</category>
      <category>activerecord</category>
    </item>
    <item>
      <title>Include, Extend, and Prepend in Ruby</title>
      <dc:creator>Veerpal</dc:creator>
      <pubDate>Sat, 27 Nov 2021 00:14:07 +0000</pubDate>
      <link>https://dev.to/veerpalb/include-extend-and-prepend-in-ruby-3hbm</link>
      <guid>https://dev.to/veerpalb/include-extend-and-prepend-in-ruby-3hbm</guid>
      <description>&lt;p&gt;This month, I took the time to go back to basics and try to understand how &lt;code&gt;include&lt;/code&gt;, &lt;code&gt;extend&lt;/code&gt; and &lt;code&gt;prepend&lt;/code&gt; work in ruby.&lt;/p&gt;

&lt;h3&gt;
  
  
  Modules
&lt;/h3&gt;

&lt;p&gt;Ruby uses modules to share behaviour across classes. A module contains all the logic for the desired behaviour. Any class that would like to use the same behaviour can either &lt;code&gt;include&lt;/code&gt; or &lt;code&gt;extend&lt;/code&gt; the module.&lt;/p&gt;

&lt;p&gt;What is the difference between &lt;code&gt;include&lt;/code&gt; and &lt;code&gt;extend&lt;/code&gt;? When a class &lt;code&gt;include&lt;/code&gt;s a module, it adds the module's methods as &lt;em&gt;instance methods&lt;/em&gt; on the class.&lt;/p&gt;

&lt;p&gt;When a class &lt;code&gt;extend&lt;/code&gt;s a module, it adds the module's methods as &lt;em&gt;class methods&lt;/em&gt; on the class.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;A&lt;/span&gt;
 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hello&lt;/span&gt;
 &lt;span class="s2"&gt;"world"&lt;/span&gt;
 &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Foo&lt;/span&gt;
 &lt;span class="kp"&gt;include&lt;/span&gt; &lt;span class="no"&gt;A&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Bar&lt;/span&gt;
 &lt;span class="kp"&gt;extend&lt;/span&gt; &lt;span class="no"&gt;A&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="no"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hello&lt;/span&gt; &lt;span class="c1"&gt;#works&lt;/span&gt;
&lt;span class="no"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hello&lt;/span&gt; &lt;span class="c1"&gt;#error&lt;/span&gt;

&lt;span class="no"&gt;Bar&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hello&lt;/span&gt; &lt;span class="c1"&gt;#error&lt;/span&gt;
&lt;span class="no"&gt;Bar&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hello&lt;/span&gt; &lt;span class="c1"&gt;#works&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If it makes sense for an instance of a class to implement the behaviour, then you would include the module. Then each instance has access to the module methods.&lt;/p&gt;

&lt;p&gt;If the behaviour is not tied to a particular instance, then you can extend the module. Then the methods will be available as class methods.&lt;/p&gt;

&lt;h3&gt;
  
  
  self.included
&lt;/h3&gt;

&lt;p&gt;What if you want some methods to be instance methods and others to be class methods? A common way to implement this is the &lt;code&gt;self.included&lt;/code&gt; callback. Whenever a class includes a module, Ruby runs the module's &lt;code&gt;self.included&lt;/code&gt; callback. Inside &lt;code&gt;self.included&lt;/code&gt;, we can extend the including class with another module.&lt;/p&gt;

&lt;p&gt;To do this, we create a nested module that contains the class methods. The &lt;code&gt;self.included&lt;/code&gt; callback will extend the nested module on every class that includes the main module. The class then has access to the nested module's methods as class methods.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;A&lt;/span&gt;
 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nc"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;included&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;ClassMethods&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;end&lt;/span&gt;

 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hello&lt;/span&gt;
 &lt;span class="s2"&gt;"world"&lt;/span&gt;
 &lt;span class="k"&gt;end&lt;/span&gt;

 &lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;ClassMethods&lt;/span&gt;
 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hi&lt;/span&gt;
 &lt;span class="s2"&gt;"bye"&lt;/span&gt;
 &lt;span class="k"&gt;end&lt;/span&gt;
 &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Foo&lt;/span&gt;
 &lt;span class="kp"&gt;include&lt;/span&gt; &lt;span class="no"&gt;A&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="no"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hello&lt;/span&gt; &lt;span class="c1"&gt;#works&lt;/span&gt;
&lt;span class="no"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hello&lt;/span&gt; &lt;span class="c1"&gt;#error&lt;/span&gt;

&lt;span class="no"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hi&lt;/span&gt; &lt;span class="c1"&gt;#error&lt;/span&gt;
&lt;span class="no"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hi&lt;/span&gt; &lt;span class="c1"&gt;#works&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using &lt;code&gt;self.included&lt;/code&gt; lets us provide both instance and class methods when the module is included.&lt;/p&gt;

&lt;p&gt;Note that this approach only works when the module is included in a class. If we were to &lt;code&gt;extend&lt;/code&gt; the module in this example, then Foo would have &lt;code&gt;hello&lt;/code&gt; as a class method but not &lt;code&gt;hi&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;A&lt;/span&gt;
 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nc"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;included&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;ClassMethods&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="k"&gt;end&lt;/span&gt;

 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hello&lt;/span&gt;
 &lt;span class="s2"&gt;"world"&lt;/span&gt;
 &lt;span class="k"&gt;end&lt;/span&gt;

 &lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;ClassMethods&lt;/span&gt;
 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hi&lt;/span&gt;
 &lt;span class="s2"&gt;"bye"&lt;/span&gt;
 &lt;span class="k"&gt;end&lt;/span&gt;
 &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Foo&lt;/span&gt;
 &lt;span class="kp"&gt;extend&lt;/span&gt; &lt;span class="no"&gt;A&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="no"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hello&lt;/span&gt; &lt;span class="c1"&gt;#error&lt;/span&gt;
&lt;span class="no"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hello&lt;/span&gt; &lt;span class="c1"&gt;#works&lt;/span&gt;

&lt;span class="no"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hi&lt;/span&gt; &lt;span class="c1"&gt;#error&lt;/span&gt;
&lt;span class="no"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hi&lt;/span&gt; &lt;span class="c1"&gt;#error&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Ancestor chain
&lt;/h4&gt;

&lt;p&gt;So what's actually happening when you include or extend a module?&lt;br&gt;
When you include a module, you add it to the ancestor chain of the class.&lt;br&gt;
The ancestor chain is the lookup order Ruby follows when determining if a method is defined on an object. When you call a method, Ruby checks whether the method is defined on the first item in the ancestor chain (the class). If it is not, Ruby checks the next item in the ancestor chain, and so on.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;A&lt;/span&gt;
 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hello&lt;/span&gt;
 &lt;span class="s2"&gt;"world"&lt;/span&gt;
 &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Foo&lt;/span&gt;
 &lt;span class="kp"&gt;include&lt;/span&gt; &lt;span class="no"&gt;A&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="no"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ancestors&lt;/span&gt; &lt;span class="c1"&gt;# [Foo, A, Object, Kernel, BasicObject]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Similarly, if you extend a module, you add the module to the ancestor chain of the singleton class. If you're unfamiliar with singleton classes, I mention them in my post on &lt;a href="https://dev.to/veerpalb/singleton-methods-in-ruby-29"&gt;singleton methods in ruby&lt;/a&gt;. The main idea is that every object has a hidden singleton class which stores methods &lt;strong&gt;implemented only on that object&lt;/strong&gt;. A class object also has a singleton class that stores methods implemented on that class, i.e. class methods.&lt;/p&gt;

&lt;p&gt;When calling a class method, Ruby looks at the singleton class's ancestor chain to see where the class method is defined. Since class methods are defined on the singleton class, extending a module adds it to the &lt;strong&gt;singleton class's&lt;/strong&gt; ancestor chain.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;A&lt;/span&gt;
 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hello&lt;/span&gt;
 &lt;span class="s2"&gt;"world"&lt;/span&gt;
 &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Bar&lt;/span&gt;
 &lt;span class="kp"&gt;extend&lt;/span&gt; &lt;span class="no"&gt;A&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="no"&gt;Bar&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ancestors&lt;/span&gt; &lt;span class="c1"&gt;# [Bar, Object, Kernel, BasicObject]&lt;/span&gt;
&lt;span class="no"&gt;Bar&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;singleton_class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ancestors&lt;/span&gt; &lt;span class="c1"&gt;# [#&amp;lt;Class:Bar&amp;gt;, A, #&amp;lt;Class:Object&amp;gt;, #&amp;lt;Class:BasicObject&amp;gt;, Class, Module, Object, Kernel, BasicObject]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Prepend
&lt;/h3&gt;

&lt;p&gt;Prepend is like &lt;code&gt;include&lt;/code&gt; in its functionality. The only difference is where in the ancestor chain the module is added. With &lt;code&gt;include&lt;/code&gt;, the module is added &lt;strong&gt;after&lt;/strong&gt; the class in the ancestor chain. With &lt;code&gt;prepend&lt;/code&gt;, the module is added &lt;strong&gt;before&lt;/strong&gt; the class in the ancestor chain. This means Ruby checks the module for an instance method before checking whether the method is defined in the class.&lt;/p&gt;
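&lt;p&gt;You can confirm this with &lt;code&gt;ancestors&lt;/code&gt;. A minimal sketch (the module and class names here are just placeholders):&lt;/p&gt;

```ruby
module A
end

class Foo
  prepend A
end

# With prepend, the module sits before the class in the lookup order.
Foo.ancestors # [A, Foo, Object, Kernel, BasicObject]
```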

&lt;p&gt;This is useful if you want to wrap some logic around your methods.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;Module&lt;/span&gt; &lt;span class="no"&gt;A&lt;/span&gt;
 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hello&lt;/span&gt;
 &lt;span class="n"&gt;put&lt;/span&gt; &lt;span class="s2"&gt;"Log hello in module"&lt;/span&gt;
 &lt;span class="k"&gt;super&lt;/span&gt;
 &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Foo&lt;/span&gt;
 &lt;span class="kp"&gt;include&lt;/span&gt; &lt;span class="no"&gt;A&lt;/span&gt;

 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hello&lt;/span&gt;
 &lt;span class="s2"&gt;"World"&lt;/span&gt;
 &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="no"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hello&lt;/span&gt;
&lt;span class="c1"&gt;# log hello from module&lt;/span&gt;
&lt;span class="c1"&gt;# World&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Resources
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://medium.com/@leo_hetsch/ruby-modules-include-vs-prepend-vs-extend-f09837a5b073"&gt;Ruby modules: Include vs Prepend vs Extend&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://stackoverflow.com/questions/17552915/ruby-mixins-extend-and-include"&gt;Ruby mixins: extend and include&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.railstips.org/blog/archives/2009/05/15/include-vs-extend-in-ruby/"&gt;Include vs extend in ruby&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ruby</category>
    </item>
    <item>
      <title>Consistent Hashing (with ruby implementation)</title>
      <dc:creator>Veerpal</dc:creator>
      <pubDate>Tue, 26 Oct 2021 22:28:27 +0000</pubDate>
      <link>https://dev.to/veerpalb/consistent-hashing-with-ruby-implementation-43ii</link>
      <guid>https://dev.to/veerpalb/consistent-hashing-with-ruby-implementation-43ii</guid>
      <description>&lt;h3&gt;
  
  
  Problem
&lt;/h3&gt;

&lt;p&gt;Let's assume you have a web application that's running on multiple servers. To help speed up queries, you add a cache to store data your application accesses often. Before calling the database for a piece of information, you first check if it exists in the cache. As you gain more users, one cache instance becomes too small to provide a significant performance boost. In this case, you add more cache instances so you can cache more information.&lt;/p&gt;

&lt;p&gt;But now, you have to check every cache instance to see if it contains a key. It would be easier if you knew which cache instance has the key beforehand.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hashing
&lt;/h3&gt;

&lt;p&gt;You can use hashing to determine which cache instance to save the key in. Compute &lt;code&gt;hash(key) % N&lt;/code&gt; where &lt;code&gt;hash&lt;/code&gt; is some hashing function and &lt;code&gt;N&lt;/code&gt; is the number of cache instances. This function returns a number between 0 and &lt;code&gt;N - 1&lt;/code&gt;, where each number refers to a cache instance. Thus you can map keys to cache instances. To check if a key exists in the cache, hash the key to get the cache instance and only check if that instance has the key. This strategy enables you to have multiple cache instances while keeping lookup efficient.&lt;/p&gt;

&lt;p&gt;However, what happens if a cache instance crashes? The cache instance will be unavailable, and you will lose the cached data. In future queries, you will need to recache the data in a different cache instance. The only problem is that the value of N in &lt;code&gt;(hash(key) % N)&lt;/code&gt; has changed. Most of your keys will map to a new cache instance. A key that maps to &lt;code&gt;server:A&lt;/code&gt; now maps to &lt;code&gt;server:B&lt;/code&gt; even though only &lt;code&gt;server:C&lt;/code&gt; is unavailable. This increases cache misses across all cache instances even when only one cache instance is unavailable. Ideally, we would only want to remap the keys for the unavailable server.&lt;/p&gt;
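&lt;p&gt;A quick sketch of the problem (the hash function and key names here are arbitrary choices for the demo): when &lt;code&gt;N&lt;/code&gt; drops from 3 to 2, most keys land on a different instance, not just the keys from the lost instance.&lt;/p&gt;

```ruby
require 'digest'

# Map a key to one of n cache instances via hash(key) % n.
def instance_for(key, n)
  Digest::SHA256.digest(key).sum % n
end

keys = (1..100).map { |i| "key-#{i}" }

# Count how many keys change instances when N goes from 3 to 2.
moved = keys.count { |k| instance_for(k, 3) != instance_for(k, 2) }
puts "#{moved} of 100 keys now map to a different cache instance"
```

Roughly two-thirds of keys remap, even though only one instance was lost.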

&lt;h3&gt;
  
  
  Consistent Hashing
&lt;/h3&gt;

&lt;p&gt;Consistent hashing is a strategy to map keys to cache instances but allows cache instances to be added or removed from the list of available instances.&lt;/p&gt;

&lt;p&gt;Consistent hashing works by imagining a circle. Each key and cache instance is assigned a corresponding point on this circle. To determine which cache instance to add a key to, we map the key to the closest cache on the circle going in a clockwise direction.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--W4quuY4e--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://veerpalbrar.github.io/images/blog/consistent-hashing-circle.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--W4quuY4e--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://veerpalbrar.github.io/images/blog/consistent-hashing-circle.png" alt="circle diagram of consistent hashing" width="567" height="518"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Programmatically, consistent hashing is simple to implement. We map each of our cache servers to some integer using a hash function. Here, the hash represents the point on the circle for the cache.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hash_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;hash_to_node&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;

    &lt;span class="nb"&gt;puts&lt;/span&gt; &lt;span class="s2"&gt;"Nodes map to &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="vi"&gt;@hash_to_node&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hash_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="no"&gt;Digest&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;SHA256&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;digest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;360&lt;/span&gt;
 &lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the code above, we keep track of the mapping of hashes to nodes in &lt;code&gt;hash_to_node&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;To determine which cache instance to add a key to, we hash the key, i.e. we find its corresponding point on the circle. Then we find the cache that hashes to a number greater than or equal to the key's hash. This is effectively the cache that is closest to the key's hash going clockwise.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;find_cache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hash_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;puts&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; hashes to &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="nb"&gt;hash&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

    &lt;span class="n"&gt;node_hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;closest_node_hash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hash_to_node&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;node_hash&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="nb"&gt;puts&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; maps to  &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

 &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;closest_node_hash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="vi"&gt;@hash_to_node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bsearch&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="vi"&gt;@hash_to_node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt;
 &lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In &lt;code&gt;closest_node_hash(key)&lt;/code&gt;, we sort the cache instance hashes. Then we do a binary search (&lt;code&gt;bsearch&lt;/code&gt;) to find the first hash with a value greater than or equal to our hashed key.&lt;/p&gt;

&lt;p&gt;If a value is not found, we return the first cache in the list. This emulates a circle since we "wrap" around to the beginning of the list.&lt;/p&gt;

&lt;p&gt;Once we have the hash that is greater than or equal to the key's hash, we get the corresponding cache instance. This is the cache we should add the key to.&lt;/p&gt;

&lt;p&gt;We now have a consistent way to map our keys to cache instances.&lt;/p&gt;
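&lt;p&gt;Putting the pieces together, here is a condensed, standalone version of the snippets above (same method names and the same 360-point circle; the logging is dropped and the constructor is my addition):&lt;/p&gt;

```ruby
require "digest"

# Condensed version of the consistent hashing snippets above.
# Hashes land on a circle of 360 points, as in the earlier code.
class ConsistentHash
  def initialize(nodes)
    @hash_to_node = {}
    nodes.each { |node| @hash_to_node[hash_value(node)] = node }
  end

  # Find the cache responsible for a key: the first node hash at or
  # after the key's point on the circle, wrapping around if needed.
  def find_cache(key)
    @hash_to_node[closest_node_hash(hash_value(key))]
  end

  private

  def hash_value(name)
    Digest::SHA256.digest(name).sum % 360
  end

  def closest_node_hash(key)
    @hash_to_node.keys.sort.bsearch { |server| server >= key } ||
      @hash_to_node.keys.sort.first
  end
end
```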

&lt;h3&gt;
  
  
  Adding and Removing Nodes
&lt;/h3&gt;

&lt;p&gt;Now let's test what happens when you add or remove a cache instance. Let's run this code on a set of keys to see what the mapping looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Nodes map to {213=&amp;gt;"server:A", 154=&amp;gt;"server:B", 331=&amp;gt;"server:C"}

a hashes to 319
a maps to  server:C

b hashes to 65
b maps to  server:B

z hashes to 284
z maps to  server:C

hello hashes to 165
hello maps to  server:A
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, the keys are distributed among the three cache instances.&lt;/p&gt;

&lt;p&gt;Now, let's add a node to our list and run it again.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Nodes map to {213=&amp;gt;"server:A", 154=&amp;gt;"server:B", 331=&amp;gt;"server:C", 301=&amp;gt;"server:B1"}

a hashes to 319
a maps to  server:C

b hashes to 65
b maps to  server:B

z hashes to 284
z maps to  server:B1

hello hashes to 165
hello maps to  server:A
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When we add a server, only a small subset of keys get remapped to the new instance. Thus, only a small subset of keys will experience a cache miss as they get moved to a new cache. This is because the mapping depends on which node is "closest" to the key. When you add a new server, the closest server does not change for most keys. Thus the mapping for most of the keys remains consistent.&lt;/p&gt;

&lt;p&gt;Now, let's remove &lt;code&gt;server:B&lt;/code&gt; from the list and see what happens.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Nodes map to {213=&amp;gt;"server:A", 331=&amp;gt;"server:C", 301=&amp;gt;"server:B1"}

a hashes to 319
a maps to  server:C

b hashes to 65
b maps to  server:A

z hashes to 284
z maps to  server:B1

hello hashes to 165
hello maps to  server:A
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Only keys that mapped to &lt;code&gt;server:B&lt;/code&gt; need to be remapped. All the other keys remain the same as their "closest" server has not changed.&lt;/p&gt;

&lt;p&gt;As you can see, consistent hashing makes scaling our cache instances easier. Cache instances can be added and removed without having to remap all the keys.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;As nodes are added and removed, the distribution of the keys can be uneven between the servers. In this case, we can add "fake" nodes which map to an existing server. For example, we can add another node for server A in the list. This will cause some keys to get remapped to server A and even out the distribution of keys.&lt;/p&gt;
&lt;/blockquote&gt;
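&lt;p&gt;A minimal sketch of those "fake" (virtual) nodes: each physical server is hashed onto the circle several times, and every point maps back to the real server. The replica count and the &lt;code&gt;"server-i"&lt;/code&gt; naming below are illustrative choices, not part of the code above:&lt;/p&gt;

```ruby
require "digest"

# Virtual-node sketch: REPLICAS points on the circle per physical
# server. The replica count and naming scheme are illustrative.
REPLICAS = 3

def hash_value(name)
  Digest::SHA256.digest(name).sum % 360
end

def build_ring(servers)
  ring = {}
  servers.each do |server|
    REPLICAS.times { |i| ring[hash_value("#{server}-#{i}")] = server }
  end
  ring
end

def find_server(ring, key)
  hash = hash_value(key)
  node_hash = ring.keys.sort.bsearch { |h| h >= hash } || ring.keys.sort.first
  ring[node_hash] # every virtual point resolves to a real server
end
```

More points per server means the arcs between neighbouring nodes are smaller and more even, so keys spread out more uniformly.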

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;I used caches in this blog post for a practical application of this hashing strategy. However, consistent hashing can be applied anytime you want to divide a set of keys across multiple nodes, for example in peer-to-peer networks or a load balancer. My favorite part of learning about consistent hashing was seeing how a hash table can be modified to work in a more distributed way.&lt;/p&gt;

&lt;h3&gt;
  
  
  Code
&lt;/h3&gt;

&lt;p&gt;You can find the complete implementation of the &lt;a href="https://gist.github.com/VeerpalBrar/10293df1299d7a897f5305c3c9ecfbef"&gt;consistent hashing code on Github&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Resources
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.geeksforgeeks.org/hashing-in-distributed-systems/"&gt;Hashing in distributed systems&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.toptal.com/big-data/consistent-hashing"&gt;A Guide to Consistent Hashing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Distributed_hash_table"&gt;Wikipedia: Distributed Hash Table&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developpaper.com/hash-algorithm-in-distributed-system/"&gt;Hash algorithm in distributed system&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>Scaling Applications With Message Queues</title>
      <dc:creator>Veerpal</dc:creator>
      <pubDate>Wed, 29 Sep 2021 22:20:44 +0000</pubDate>
      <link>https://dev.to/veerpalb/scaling-applications-with-message-queues-41dg</link>
      <guid>https://dev.to/veerpalb/scaling-applications-with-message-queues-41dg</guid>
      <description>&lt;p&gt;This month I started looking into system design patterns for scaling and application. I started off by learning about message queues: what are they and why are the useful?&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;In a typical web application, a client sends a request to a server that processes it and returns a response. For example, the client may request a list of products. The server would query the database for the list of products and return the list. As the number of requests increases, one server cannot handle all the requests. Some clients will be unable to connect because the server is unavailable. In this case, you can horizontally scale the application: you buy more servers so that you can handle the increased load.&lt;/p&gt;

&lt;p&gt;Now imagine, some requests are computationally expensive. For example, they need to generate a large report that uses a lot of CPU and takes many seconds to run. While generating reports, the server is unable to process other requests from clients.&lt;/p&gt;

&lt;p&gt;One solution could be to buy even more servers to run your application. This can be expensive and wasteful. Say the report generation requests are more likely at the end of a month. Then for most of the month, you will have extra servers you don't need. The additional servers are only required when there is an increased load from generating reports.&lt;/p&gt;

&lt;h3&gt;
  
  
  Synchronous Vs Asynchronous
&lt;/h3&gt;

&lt;p&gt;We can solve this problem by changing how we think about processing requests. Currently, the client sends a request and then waits for a response from the server. The client is stuck waiting for many seconds while the server generates the report. The client needs a response from the server, but that response does not have to be the final report. Instead, the server can send a response that acknowledges the request for the report without returning the report. Then, it can process the request for the report asynchronously in the background. Once the report is generated, the server can send an email to the client to let them know the report is complete.&lt;/p&gt;

&lt;p&gt;By moving to asynchronous computation, we reduce the response time for the client. Instead of waiting for a response from the server, the client can complete other tasks. From a user's perspective, they clicked a button and got a message that the report is being generated. The user can now do other things on the site while the report is generating.&lt;/p&gt;

&lt;p&gt;By making the report generation asynchronous, the server can respond to more requests. Yet, what happens if the server gets a lot of requests to generate a report? It will try to generate all the reports in the background. The server will be doing too much background work and will slow down or run out of memory.&lt;/p&gt;

&lt;h3&gt;
  
  
  Queue
&lt;/h3&gt;

&lt;p&gt;A server should only process one or two reports in the background at a time. If more requests for a report come in, they can be added to a report queue. Once a report is generated, the server can start generating the next report in the queue. This way, all the reports will eventually be generated without overwhelming the server.&lt;/p&gt;

&lt;p&gt;This approach is better, but where is this queue stored? One solution is to store it on the server. However, this could lead to an unequal distribution of report generation requests. A server with a larger queue will take longer to generate reports compared to servers with smaller queues.&lt;/p&gt;

&lt;p&gt;A better solution is to have a shared queue for all the servers. A set of servers can respond to requests and add tasks to the queue. The tasks could be any work we want to offload from the request servers, for example, sending emails or uploading a file to the cloud.&lt;br&gt;
Another set of servers can process background jobs currently in the queue. In this case, the queue would be a persistent data store (database, Redis cache, etc) that all servers can access.&lt;/p&gt;

&lt;p&gt;This idea is known as a task queue (sometimes called a message queue).&lt;/p&gt;
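&lt;p&gt;The idea can be sketched with Ruby's built-in thread-safe &lt;code&gt;Queue&lt;/code&gt; standing in for the shared store (in production this would be Redis, RabbitMQ, etc.; the job names here are made up):&lt;/p&gt;

```ruby
# Task-queue sketch: Ruby's thread-safe Queue stands in for the
# shared persistent store. Job names are made up for illustration.
QUEUE = Queue.new
RESULTS = Queue.new

# Consumer: a worker thread that processes one job at a time.
worker = Thread.new do
  while (job = QUEUE.pop)
    RESULTS.push("processed #{job}")
  end
end

# Producer: the request-handling side just enqueues and moves on.
["report-1", "report-2", "report-3"].each { |job| QUEUE.push(job) }

QUEUE.push(nil) # sentinel telling the worker to stop
worker.join
```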

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.cloudamqp.com%2Fimg%2Fblog%2Fthumb-mq.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.cloudamqp.com%2Fimg%2Fblog%2Fthumb-mq.jpg" alt="Producer sends tasks to a queue which are consumed by a consumer"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Task Queues
&lt;/h3&gt;

&lt;p&gt;Task queues enable multiple systems to communicate with each other. One system acts as a producer and adds tasks to the queue. Another system is a consumer: it processes the tasks in the queue and acts on them. In this case, the server handling requests is the producer which adds tasks to the queue. The servers which process the tasks are the consumers.&lt;/p&gt;

&lt;p&gt;A task queue has many benefits. First, a producer and consumer never have to communicate with each other directly. The producers do not make an API call to the consumer to let it know of an event. Producers only need access to the queue. A producer can add a task to the queue even if none of the consumers are online. Once the consumers are back online, they will start processing the tasks in the queue.&lt;/p&gt;

&lt;p&gt;Furthermore, the producers and consumers can scale independently. As the number of tasks increases, you can add more consumers without increasing the number of producers.&lt;/p&gt;

&lt;p&gt;However, one downside to task queues (and asynchronous processes) is that the order of execution is no longer linear. You can't guarantee the order the tasks run in. If some tasks depend on others completing first, the task queue logic becomes more complex.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;As your application grows, offloading certain tasks to a message queue is a great way to scale your application. This blog post only touches the surface of task queues. Message queue software, such as RabbitMQ, has a lot of built-in functionality for managing message queues. It also lets you implement other patterns with your message queue, such as the publisher-subscriber pattern.&lt;/p&gt;

&lt;h4&gt;
  
  
  Resources
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=W4_aGb_MOls" rel="noopener noreferrer"&gt;What is a Message Queue and When should you use Messaging Queue Systems Like RabbitMQ and Kafka&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://shopify.engineering/high-availability-background-jobs" rel="noopener noreferrer"&gt;High Availability by Offloading Work Into the Background&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://highscalability.com/blog/2008/10/8/strategy-flickr-do-the-essential-work-up-front-and-queue-the.html" rel="noopener noreferrer"&gt;Strategy: Flickr - Do The Essential Work Up-Front And Queue The Rest&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.cloudamqp.com/blog/what-is-message-queuing.html" rel="noopener noreferrer"&gt;What is Message Queueing&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>systemdesign</category>
      <category>architecture</category>
      <category>messagequeues</category>
    </item>
    <item>
      <title>Understanding Rspec Best Practices</title>
      <dc:creator>Veerpal</dc:creator>
      <pubDate>Mon, 30 Aug 2021 22:26:07 +0000</pubDate>
      <link>https://dev.to/veerpalb/understanding-rspec-best-practices-2edm</link>
      <guid>https://dev.to/veerpalb/understanding-rspec-best-practices-2edm</guid>
      <description>&lt;p&gt;This past month, I looked at "best practices" for writing RSpec tests. Sites like &lt;a href="https://www.betterspecs.org/"&gt;betterspecs&lt;/a&gt; and the &lt;a href="https://rspec.rubystyle.guide"&gt;RSpec style guide&lt;/a&gt; offer simple rules to follow. Yet, they do not elaborate on why they suggest the practices they do. Therefore, I decided to spend some time better understanding their recommendations.&lt;/p&gt;

&lt;h3&gt;
  
  
  DRY vs DAMP
&lt;/h3&gt;

&lt;p&gt;Both sites mention &lt;code&gt;DRY&lt;/code&gt; (Don't Repeat Yourself) at some point. DRY is a programming principle that aims to reduce duplication in code. Since you are testing one class in many scenarios, you can expect some duplication in the setup and execution of your tests. If you follow DRY, you would move this duplication into &lt;code&gt;before&lt;/code&gt; and &lt;code&gt;let&lt;/code&gt; blocks.&lt;/p&gt;

&lt;p&gt;However, it can be harder to figure out what is being tested because all of the logic is outside of the actual test. This makes it harder to read the code and understand how a class is expected to work. You should aim to make tests readable and easy to understand, even if you duplicate some bits of code. This is sometimes known as DAMP (Descriptive and Meaningful phrases).&lt;/p&gt;

&lt;p&gt;That said, lots of duplication in tests makes them harder to modify. The RSpec style guide suggests "doing everything directly in your &lt;code&gt;it&lt;/code&gt; blocks even if it is duplication and then refactor your tests after you have them working to be a little more DRY".&lt;/p&gt;

&lt;p&gt;The aim is to strike a balance between &lt;code&gt;DAMP&lt;/code&gt; and &lt;code&gt;DRY&lt;/code&gt; and be okay with some duplication to help increase readability.&lt;/p&gt;

&lt;h3&gt;
  
  
  Using &lt;code&gt;let&lt;/code&gt; vs &lt;code&gt;before&lt;/code&gt; blocks
&lt;/h3&gt;

&lt;p&gt;Both sites suggest instantiating variables using &lt;code&gt;let&lt;/code&gt; statements instead of inside &lt;code&gt;before&lt;/code&gt; blocks. Code within each &lt;code&gt;before(:each)&lt;/code&gt; block runs before every example block. A variable defined in a &lt;code&gt;before&lt;/code&gt; block is created for each example, even if the test does not reference the variable. Creating a lot of database objects in a &lt;code&gt;before(:each)&lt;/code&gt; block will slow down tests. In comparison, &lt;code&gt;let&lt;/code&gt; is lazy-loaded. A &lt;code&gt;let&lt;/code&gt; object is only created after it is referenced in a test. Each test will only create the objects referenced in the test itself. Thus, you avoid creating unnecessary objects in your tests.&lt;/p&gt;
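&lt;p&gt;The lazy behaviour is easy to demonstrate in plain Ruby. The &lt;code&gt;let&lt;/code&gt; below is a toy illustration of the semantics, not RSpec's actual implementation:&lt;/p&gt;

```ruby
# Toy illustration of let's semantics (NOT RSpec's implementation):
# the block runs only when first referenced, then the value is memoized.
def let(&block)
  evaluated = false
  value = nil
  lambda do
    unless evaluated
      value = block.call
      evaluated = true
    end
    value
  end
end

CREATED = []
user = let { CREATED.push(:user); "a user record" }

# Unlike before(:each), nothing has been created yet.
LAZY_BEFORE_REFERENCE = CREATED.empty?

user.call # first reference runs the block
user.call # memoized: the block does not run again
```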

&lt;p&gt;Avoid using &lt;code&gt;before(:all)&lt;/code&gt; to instantiate data that is used across many tests. It can cause data to leak between tests, leading to flaky or false-positive tests. Each example in RSpec runs in a transaction, and all database changes are rolled back at the end of the test so that each example starts with a clean database. Changes made in a &lt;code&gt;before(:all)&lt;/code&gt; block are not part of that transaction. You can clean them up in an &lt;code&gt;after(:all)&lt;/code&gt; block, but if you forget, the data will persist across all tests and could cause other tests to fail. Database changes made in &lt;code&gt;let&lt;/code&gt; blocks or &lt;code&gt;before(:each)&lt;/code&gt; blocks get rolled back at the end of the example by the database transaction.&lt;/p&gt;

&lt;h3&gt;
  
  
  Factories
&lt;/h3&gt;

&lt;p&gt;Both sites advocate for factories over fixtures (&lt;a href="https://github.com/betterspecs/betterspecs/issues/11"&gt;though there is no clear consensus&lt;/a&gt;). With fixtures, test objects are all defined in fixture files with predefined data. Fixtures can be used across tests, but modifying an existing fixture can break tests that depend on it. As a codebase grows, managing fixtures for all the various states of your objects can be difficult. In comparison, factories let you build and configure new objects per test.&lt;/p&gt;

&lt;p&gt;Working with factories can also be overwhelming, especially when you are new to them. I have found a couple of helpful tips that can make working with factories easier:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When defining factory defaults, only provide the attributes required to pass validation. All other functionality should be added via traits. Avoid creating associations that are not required by default. That way you don't create database objects that are not required for each test.&lt;/li&gt;
&lt;li&gt;When using factories in a test, provide only the traits required for the test to pass. It clarifies the properties of the object that are required to make the test pass.&lt;/li&gt;
&lt;li&gt;If your test references a default value of a factory, set the default value during object creation. For example, even if the default name for a user is "Bob", you should create your user with &lt;code&gt;build(:user, name: "Bob")&lt;/code&gt;. This indicates that the name is important for the test and makes it explicit where the value of &lt;code&gt;"Bob"&lt;/code&gt; is coming from.&lt;/li&gt;
&lt;li&gt;If you use FactoryBot, try to build your factory objects instead of creating them. When you use &lt;code&gt;create&lt;/code&gt;, it calls the database to instantiate the object and all its associations. &lt;code&gt;build&lt;/code&gt; will set up the attributes but not save them to the database, though it will still call &lt;code&gt;create&lt;/code&gt; on the associations and run validations on those. Finally, if you use &lt;code&gt;build_stubbed&lt;/code&gt;, the object's associations are stubbed out so the database is not called. So, try to build test objects to avoid hitting the DB and help speed up tests.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Mocking
&lt;/h3&gt;

&lt;p&gt;The RSpec style guide has some guidelines related to mocking objects.&lt;/p&gt;

&lt;p&gt;First, it suggests not stubbing the object you are trying to test. For example, avoid doing &lt;code&gt;allow(object_under_test).to receive(:foo).and_return("bar")&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Tests ensure that your code does what you expect it to. When you stub out parts of the object you are testing, you risk false positive tests. The stubbed code never runs, so even if the test passes, you can't be confident that your code works.&lt;/p&gt;

&lt;p&gt;Sometimes, we want to see what a method returns based on the state of the test object. Thus, we're tempted to stub some of its methods to match the expected state. Instead of stubbing the state of the object, build the object with the desired state using a factory. Likewise, you might want to stub out a method that makes a complicated library call that's hard to test. In that case, either stub out the library call or extract the complicated logic into another class. Then stub out the class in your tests. When you extract the logic into another class, you are now stubbing the collaborator, instead of the object under test.&lt;/p&gt;

&lt;p&gt;Mocking collaborators of the object under test is acceptable. The collaborator has been tested in its own unit tests. You can test that the collaborator is called with the correct arguments but stub the response for faster tests. This way, you rely on the collaborator's interface rather than its implementation.&lt;/p&gt;
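&lt;p&gt;In plain Ruby, the idea looks like this (the class names are made up; in RSpec you would typically use a verifying double and &lt;code&gt;allow&lt;/code&gt; instead of a hand-rolled fake). The object under test receives its collaborator, so a test can swap in a fake without stubbing the object under test itself:&lt;/p&gt;

```ruby
# Hand-rolled sketch of stubbing a collaborator. ReportGenerator and
# FakeMailer are made-up names; RSpec would normally use doubles here.
class ReportGenerator
  def initialize(mailer)
    @mailer = mailer # collaborator is injected, so tests can fake it
  end

  def generate(user)
    report = "report for #{user}"
    @mailer.deliver(report) # the real mailer has its own unit tests
    report
  end
end

# A fake collaborator that records calls instead of sending email.
class FakeMailer
  attr_reader :delivered

  def initialize
    @delivered = []
  end

  def deliver(report)
    @delivered.push(report)
  end
end
```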

&lt;h3&gt;
  
  
  In conclusion
&lt;/h3&gt;

&lt;p&gt;When I started researching best practices, I wanted some tips on writing better tests. In reality, I've realized it's not that clear-cut, and there are many ways of testing an object. I realized that even "best practices" have exceptions. Instead of following rules blindly, it helps to understand the reasoning behind the rules. Then you can confidently know you are using these rules correctly.&lt;/p&gt;

&lt;h4&gt;
  
  
  Resources
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.betterspecs.org"&gt;betterspecs.org&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://rspec.rubystyle.guide"&gt;RSpec style guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://books.thoughtbot.com/assets/testing-rails.pdf"&gt;Testing Rails&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://myronmars.to/n/dev-blog/2012/06/thoughts-on-mocking"&gt;Thoughts on Mocking&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>rspec</category>
      <category>ruby</category>
    </item>
    <item>
      <title>Tips for debugging in ruby</title>
      <dc:creator>Veerpal</dc:creator>
      <pubDate>Wed, 28 Jul 2021 23:22:30 +0000</pubDate>
      <link>https://dev.to/veerpalb/tips-for-debugging-in-ruby-2hbl</link>
      <guid>https://dev.to/veerpalb/tips-for-debugging-in-ruby-2hbl</guid>
      <description>&lt;p&gt;This month I looked into debugging ruby code. While I usually can figure out the source of bugs, I've been thinking about how to debug code more efficiently. When I debug in ruby, I tend to rely on printing variables to the terminal. If the code is more complex, I step through the code with &lt;code&gt;byebug&lt;/code&gt; or &lt;code&gt;binding.pry&lt;/code&gt;. During this past month, I've been learning techniques that let me level up these skills.&lt;/p&gt;

&lt;h3&gt;
  
  
  Navigating code
&lt;/h3&gt;

&lt;p&gt;I learned a few techniques to navigate the codebase faster and level up both printing and &lt;code&gt;byebug&lt;/code&gt; based debugging.&lt;/p&gt;

&lt;h4&gt;
  
  
  Printing methods
&lt;/h4&gt;

&lt;p&gt;One technique that I already used but is still worth mentioning is to use &lt;code&gt;p&lt;/code&gt; instead of &lt;code&gt;puts&lt;/code&gt;. &lt;code&gt;puts&lt;/code&gt; calls &lt;code&gt;to_s&lt;/code&gt; on the object, which by default returns just the object's class and id. You can override the &lt;code&gt;to_s&lt;/code&gt; method to return detailed information about the object. The other option is to use &lt;code&gt;p&lt;/code&gt;, which calls &lt;code&gt;.inspect&lt;/code&gt; on the object. By default, inspect returns a string with the class, object_id, and instance variables. The output of &lt;code&gt;p&lt;/code&gt; can be difficult to parse if an object has many instance variables. In this case, you can use &lt;code&gt;pp&lt;/code&gt;, which stands for pretty print and makes the output easier to read. &lt;code&gt;pp&lt;/code&gt; also helps format hashes and JSON objects.&lt;/p&gt;
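&lt;p&gt;A quick demonstration of the difference (&lt;code&gt;Point&lt;/code&gt; is a made-up example class):&lt;/p&gt;

```ruby
# puts uses to_s (class and object id by default), while p uses
# inspect (which also shows instance variables). Point is made up.
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end
end

point = Point.new(1, 2)

puts point # prints only the class name and object id
p point    # also shows @x=1, @y=2
pp({ name: "Veerpal", posts: 12 }) # pretty-prints hashes
```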

&lt;h4&gt;
  
  
  Raising errors
&lt;/h4&gt;

&lt;p&gt;Sometimes it can be difficult to find the print statements in the server logs. One option is to prepend print statements with strings like "!!!" and search for them on the server output. Another technique is to raise an exception immediately after the print statements. Then you can find the code faster as you know it happens right before the exception. Raising errors is useful if that section of code runs many times. You can use conditional logic to raise an error in the cases you want to investigate.&lt;/p&gt;

&lt;h4&gt;
  
  
  Freezing
&lt;/h4&gt;

&lt;p&gt;If you want to know when an object is modified, you can &lt;code&gt;freeze&lt;/code&gt; it. Then whenever the object is modified, it will raise an exception. Freezing an object is a faster way to figure out which classes are modifying it.&lt;/p&gt;
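&lt;p&gt;For example:&lt;/p&gt;

```ruby
# Once frozen, any attempt to mutate the object raises FrozenError,
# pointing straight at the code doing the modification.
CONFIG = { retries: 3 }.freeze

begin
  CONFIG[:retries] = 5
rescue FrozenError
  MUTATION_CAUGHT = true
end
```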

&lt;h3&gt;
  
  
  Leveraging Ruby
&lt;/h3&gt;

&lt;p&gt;Methods such as &lt;code&gt;inspect&lt;/code&gt; and &lt;code&gt;pp&lt;/code&gt; are useful but don't always appear in beginner ruby tutorials. I've found that learning more about ruby has given me new tools for debugging. Ruby has many built-in methods whose whole purpose is to make it easier for developers to work with the language.&lt;/p&gt;

&lt;h4&gt;
  
  
  Objects
&lt;/h4&gt;

&lt;p&gt;For example, say you have a method that takes an input (&lt;code&gt;input_obj&lt;/code&gt;) but it's not clear what type of input it is. Normally, I would search the codebase for all the locations where this method is invoked and figure out from each calling method what is passed in. A faster way is to run the code and do &lt;code&gt;p input_obj.class.name&lt;/code&gt;. That way, you know the exact class of the input. Everything in ruby is an object and inherits from the &lt;a href="https://ruby-doc.org/core-3.0.2/Object.html"&gt;&lt;code&gt;Object&lt;/code&gt; class&lt;/a&gt;, which has methods such as &lt;code&gt;methods&lt;/code&gt;, &lt;code&gt;instance_variables&lt;/code&gt;, and &lt;code&gt;respond_to?&lt;/code&gt; that you can use to learn more about method inputs. Granted, you can figure out a lot of this information with &lt;code&gt;inspect&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The Object class also mixes in the &lt;a href="https://ruby-doc.org/core-3.0.2/Kernel.html#method-i-caller"&gt;Kernel module, which has a &lt;code&gt;caller&lt;/code&gt;&lt;/a&gt; method. You can use &lt;code&gt;caller&lt;/code&gt; to get the current call stack. &lt;code&gt;caller&lt;/code&gt; is a faster way to figure out who is calling a method than searching through the entire codebase.&lt;/p&gt;
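&lt;p&gt;For example (&lt;code&gt;Invoice&lt;/code&gt; and &lt;code&gt;mystery_method&lt;/code&gt; are made-up names):&lt;/p&gt;

```ruby
# Interrogating an unknown input at runtime instead of grepping the
# codebase. Invoice and mystery_method are made-up names.
class Invoice
  def initialize(total)
    @total = total
  end
end

def mystery_method(input_obj)
  [input_obj.class.name,           # exact class of the input
   input_obj.instance_variables,   # what state it carries
   input_obj.respond_to?(:upcase), # does it act like a String?
   caller.first]                   # who called us
end

INFO = mystery_method(Invoice.new(100))
```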

&lt;h4&gt;
  
  
  Method
&lt;/h4&gt;

&lt;p&gt;In ruby, even &lt;a href="https://ruby-doc.org/core-3.0.2/Method.html#method-i-source_location"&gt;methods&lt;/a&gt; are objects! You can determine where a method is implemented by calling &lt;code&gt;source_location&lt;/code&gt; on the method:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ClassName.instance_method(:method_name).source_location&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Using &lt;code&gt;source_location&lt;/code&gt; is especially useful when the method name is common and is harder to search for in the code. If a method calls &lt;code&gt;super&lt;/code&gt;, you can use &lt;code&gt;super_method&lt;/code&gt; to get the &lt;code&gt;Method&lt;/code&gt; object for the super method:&lt;br&gt;
&lt;code&gt;ClassName.instance_method(:method_name).super_method&lt;/code&gt;&lt;/p&gt;
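&lt;p&gt;For example (&lt;code&gt;Parent&lt;/code&gt; and &lt;code&gt;Child&lt;/code&gt; are made-up classes):&lt;/p&gt;

```ruby
# source_location returns the [file, line] where a Ruby-defined
# method lives; super_method returns the overridden method.
class Parent
  def greet
    "hello from Parent"
  end
end

# Class.new(Parent) builds a subclass of Parent.
Child = Class.new(Parent) do
  def greet
    super + " via Child"
  end
end

LOCATION = Child.instance_method(:greet).source_location       # [file, line]
SUPER_OWNER = Child.instance_method(:greet).super_method.owner # Parent
```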
&lt;h4&gt;
  
  
  Inheritance Hierarchy
&lt;/h4&gt;

&lt;p&gt;Sometimes, the source of bugs is due to objects extending many modules that change their behavior in unexpected ways. You can track when a module is added to an object with &lt;a href="https://ruby-doc.org/core-2.5.0/Module.html#method-i-included"&gt;&lt;code&gt;included&lt;/code&gt;&lt;/a&gt;. You can overwrite &lt;code&gt;included&lt;/code&gt; to print information when a module is included on an object. Use &lt;a href="https://ruby-doc.org/core-2.5.0/Module.html#method-i-method_added"&gt;&lt;code&gt;method_added&lt;/code&gt;&lt;/a&gt; to track when an instance method is added to a module. These methods help track down bugs related to metaprogramming.&lt;/p&gt;
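&lt;p&gt;For example (&lt;code&gt;Auditable&lt;/code&gt; and &lt;code&gt;Account&lt;/code&gt; are made-up names; real debugging code might &lt;code&gt;puts&lt;/code&gt; instead of recording events):&lt;/p&gt;

```ruby
# included fires when the module is mixed into a class;
# method_added fires as each instance method is defined.
EVENTS = []

module Auditable
  def self.included(base)
    EVENTS.push("Auditable included in #{base}")
  end

  def self.method_added(name)
    EVENTS.push("method added: #{name}")
  end

  def audit
    "audited"
  end
end

class Account
  include Auditable
end
```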
&lt;h4&gt;
  
  
  Tracepoint
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://ruby-doc.org/core-2.5.0/TracePoint.html"&gt;&lt;code&gt;Tracepoint&lt;/code&gt;&lt;/a&gt; allows you to trace the call stack for a piece of code. To see all the methods called while a code block run, you could trace the call stack with Tracepoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;trace = TracePoint.new(:call) do |tp|
 p [tp.path, tp.lineno, tp.defined_class, tp.method_id]
end

trace.enable
User.some_method
trace.disable
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After you create a new TracePoint, you must enable it. When enabled, the TracePoint object will log all method calls until the trace is disabled. When you initialize a new TracePoint, it takes a block that executes for each method call. The example above prints the file the method is located in (&lt;code&gt;tp.path&lt;/code&gt;), the line number (&lt;code&gt;tp.lineno&lt;/code&gt;), the class (&lt;code&gt;tp.defined_class&lt;/code&gt;), and the method (&lt;code&gt;tp.method_id&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;The logging from TracePoint is quite verbose, as it will also output the method calls for code in gems. Thus, TracePoint is more useful for getting the general execution path of the code.&lt;/p&gt;

&lt;p&gt;To reduce the output, you can use conditionals to only print in certain cases:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TracePoint.trace(:call) do |tp|
 next unless tp.self.is_a?(User) # only print method calls for Users
 # tracing logic
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That way, you can follow the execution path for a particular object and see how it is used.&lt;/p&gt;

&lt;p&gt;TracePoint code is also less intuitive to write. Rather than memorizing it, I save it in a snippet and copy it whenever I want to use it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reading gem source code
&lt;/h3&gt;

&lt;p&gt;Sometimes, the code I'm interested in exists in a gem instead of the application code. Understanding gem code usually requires reading the gem documentation to figure out how the code works. If you cannot find the information in the docs, you have to read the source code. I could read the code on GitHub, but this can be tedious to navigate and search. Instead, you can do &lt;code&gt;bundle open &amp;lt;gem_name&amp;gt;&lt;/code&gt; to open the code for the gem in a text editor. It will open the version specified in the nearest Gemfile. That way, you can use your IDE to search and navigate the gem code. In your application code, you can use &lt;code&gt;source_location&lt;/code&gt; to find the location of a method defined in a gem! You can also use print statements and &lt;code&gt;byebug&lt;/code&gt; to debug the gem source code if needed. When you finish debugging, use &lt;code&gt;gem pristine &amp;lt;gem_name&amp;gt;&lt;/code&gt; to clean up any changes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Debugging ruby goes beyond the use of print statements to trace code execution. There is a lot of built-in ruby functionality which can help you more effectively debug your code. As I dig deeper into ruby, I now consider how I can leverage what I learn to debug code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sources
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://maximomussini.com/posts/debugging-ruby-libraries/"&gt;Debugging Libraries: Ruby Edition&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://tenderlovemaking.com/2016/02/05/i-am-a-puts-debuggerer.html"&gt;I am a puts debugger&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.schneems.com/2016/01/25/ruby-debugging-magic-cheat-sheet.html"&gt;Ruby debugging magic cheat sheet&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.appsignal.com/2020/04/01/changing-the-approach-to-debugging-in-ruby-with-tracepoint.html"&gt;Changing the Approach to Debugging in Ruby with TracePoint&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.rubyguides.com/2017/01/spy-on-your-ruby-methods/"&gt;How To Spy on Your Ruby Methods&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ruby</category>
      <category>debugging</category>
      <category>tracepoint</category>
    </item>
  </channel>
</rss>
