<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Utkarsh Sharma</title>
    <description>The latest articles on DEV Community by Utkarsh Sharma (@utkarshvit).</description>
    <link>https://dev.to/utkarshvit</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F390504%2Ff0eee927-58af-4a59-a415-854f101521e2.jpeg</url>
      <title>DEV Community: Utkarsh Sharma</title>
      <link>https://dev.to/utkarshvit</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/utkarshvit"/>
    <language>en</language>
    <item>
      <title>Scalable Key-Value Store</title>
      <dc:creator>Utkarsh Sharma</dc:creator>
      <pubDate>Wed, 20 May 2020 10:37:45 +0000</pubDate>
      <link>https://dev.to/utkarshvit/scalable-key-value-store-4dhm</link>
      <guid>https://dev.to/utkarshvit/scalable-key-value-store-4dhm</guid>
      <description>&lt;h1&gt;Scalable Key-Value Store&lt;/h1&gt;

&lt;h3&gt;System Overview&lt;/h3&gt;

&lt;p&gt;This project implements a scalable, multi-node key-value store using &lt;a href="https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf"&gt;Amazon Dynamo's&lt;/a&gt; consistent hashing policy. The client adds a key-value pair and retrieves the value for a key from the system using simple HTTP GET and POST requests, as we will see in the next sections. Similarly, a client can add a node to the system or remove a node from the system without having to worry about key migration or updating routing information.&lt;/p&gt;

&lt;h3&gt;Running in Docker Environment&lt;/h3&gt;

&lt;p&gt;The next subsections cover running the system in a single-host Docker environment. To learn how to run the system on &lt;a href="https://vcl.ncsu.edu/"&gt;VCL&lt;/a&gt; or other cloud services, please refer to the &lt;a href="https://github.com/UtkarshVIT/AdvDistSystems#running-on-cloud-environment"&gt;Running on Cloud Environment&lt;/a&gt; section.&lt;/p&gt;

&lt;h6&gt;Architecture&lt;/h6&gt;

&lt;p&gt;Each node in the system is an Ubuntu 18 image running a Flask application on the Python development server and storing key-value pairs in the node itself using &lt;a href="https://werkzeug.palletsprojects.com/en/0.16.x/contrib/cache/"&gt;SimpleCache&lt;/a&gt;. The nodes communicate with each other via HTTP GET and POST. For hashing, we compute the MD5 hash of the key and apply modulo 10000 to it, so the maximum value on the hash ring is 10000.&lt;/p&gt;
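&lt;p&gt;The hashing scheme above can be sketched in a few lines of Python (an illustrative sketch, not the project's actual code; the function name is hypothetical).&lt;/p&gt;

```python
import hashlib

RING_SIZE = 10000  # maximum position on the hash ring, as described above

def ring_position(key):
    """Map a key to a ring position: MD5 of the key, modulo 10000."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % RING_SIZE
```

&lt;p&gt;Every call with the same key returns the same position in the range 0 to 9999, which is what lets any node compute a key's place on the ring independently.&lt;/p&gt;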

&lt;h6&gt;Docker Compose Setup&lt;/h6&gt;

&lt;p&gt;Our present experimental system is based on the following setup. You can modify the system to your preferences by editing the docker-compose.yml file. The default state has two nodes: Ubuntu containers running a Flask app and communicating with each other via HTTP in a Docker network. A load balancer uses a round-robin algorithm to distribute load between these two nodes. We also have a standby container running the same Flask app, which we will use later for scaling up. The fourth container is a client, which will interact with the system and run the test cases.&lt;/p&gt;

&lt;pre&gt;
        (3000)
      ____n1___  ←--┐       n3 (standby)    |  n1     = 172.23.0.3
     |         |    |                       |  n2     = 172.23.0.4
     |         |    LB ←--- client          |  n3     = 172.23.0.5
     |____n2___|  (load balancer)           |  LB     = 172.23.0.6
        (8000)                              |  client = 172.23.0.7
&lt;/pre&gt;
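&lt;p&gt;Given the node positions in the diagram (n1 at 3000, n2 at 8000), routing a key is a successor lookup on the ring. A minimal Python sketch, assuming keys are assigned clockwise to the first node at or after their position; the helper is illustrative, not the project's code.&lt;/p&gt;

```python
import bisect
import hashlib

RING_SIZE = 10000

def lookup_node(key, ring):
    """Route a key to the first node at or after its ring position,
    wrapping to the lowest position when the key hashes past the last node."""
    positions = sorted(ring)
    h = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % RING_SIZE
    idx = bisect.bisect_left(positions, h) % len(positions)
    return ring[positions[idx]]

# Default two-node ring from the diagram above
default_ring = {3000: "172.23.0.3", 8000: "172.23.0.4"}
```

&lt;p&gt;With this rule, a key hashing to 4200 would be served by the node at 8000, while a key hashing to 9100 wraps around to the node at 3000.&lt;/p&gt;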

&lt;h6&gt;Deployment&lt;/h6&gt;

&lt;ol&gt;
&lt;li&gt;Running the containers&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Run the following command to deploy the system and attach the host session to the logs of the spawned containers.&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;$docker-compose up&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;Attach to the console of the client using the command below to access the Docker network.
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;code&gt;$docker container exec -it advdistsystems_client_1 /bin/bash&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Reconfigure the system&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Run the following command to update the routing information for all the nodes in the system.&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;$sh reconfigure.sh&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;h3&gt;Running on Cloud Environment&lt;/h3&gt;

&lt;p&gt;The next subsections cover running the system in a cloud environment. This code has been deployed and tested on &lt;a href="https://vcl.ncsu.edu/"&gt;VCL&lt;/a&gt; and AWS.&lt;/p&gt;

&lt;h6&gt;Architecture&lt;/h6&gt;

&lt;p&gt;Each node is an EC2 instance of type &lt;a href="https://aws.amazon.com/ec2/instance-types/t2/"&gt;t2.micro&lt;/a&gt; running the Ubuntu 18 image. Each node runs the &lt;a href="https://httpd.apache.org/"&gt;Apache&lt;/a&gt; web server. The application is built on the Flask framework, runs on a Python WSGI app server, and listens on port 80. The node also runs a &lt;a href="https://memcached.org/"&gt;Memcached&lt;/a&gt; server listening on port 11211. The Memcached server is responsible for storing the key-value pairs and for maintaining the state of the hash ring. We store the state of the hash ring in Memcached because Apache can spawn multiple threads of the application, and the state of the hash ring must be consistent across all of them. The Apache web server runs 5 instances of the Flask application. The nodes communicate with each other via HTTP and with the Memcached server via TCP. All nodes sit behind a load balancer, an Ubuntu VM running HAProxy that uses round robin to route requests between servers. For hashing, we compute the MD5 hash of the key and apply modulo 10000 to it, so the maximum value on the hash ring is 10000.&lt;/p&gt;
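&lt;p&gt;Because every Apache worker must see the same ring, the ring state has to live outside any single process. One way the ring could be serialized for a shared store such as Memcached is sketched below (JSON is an assumption for illustration; the post does not specify the project's actual storage format).&lt;/p&gt;

```python
import json

def dump_ring(ring):
    """Serialize the ring (integer position to node address) for a shared cache."""
    return json.dumps({str(pos): addr for pos, addr in ring.items()})

def load_ring(blob):
    """Restore the ring, converting positions back to integers."""
    return {int(pos): addr for pos, addr in json.loads(blob).items()}
```

&lt;p&gt;Any worker that reads the cached blob reconstructs an identical ring, so routing decisions stay consistent across all instances of the app.&lt;/p&gt;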

&lt;h6&gt;Deployment&lt;/h6&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Create a VM for each node you want in the system running the Ubuntu 18 image on VCL or any other cloud provider.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Attach to the terminal of each node, then download the setup.sh file and run it using the command below. Repeat this process for every node deployed in the previous step. This file contains the commands to install all the components of the system.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;code&gt;$sudo wget https://raw.githubusercontent.com/UtkarshVIT/AdvDistSystems/master/setup.sh &amp;amp;&amp;amp; sh setup.sh&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Edit the &lt;code&gt;reconfigure.sh&lt;/code&gt; file to configure the hash table on each node. It is currently configured to deploy a two-node system but can be adapted to a custom scenario. The reconfiguration details are in the file itself. Complete system reconfiguration by running the command below.
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;code&gt;$sh reconfigure.sh&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;ol start="4"&gt;
&lt;li&gt;Create a Layer 7 load balancer using HAProxy as explained &lt;a href="https://upcloud.com/community/tutorials/haproxy-load-balancer-ubuntu/"&gt;here&lt;/a&gt; for VCL, or create one using the AWS load balancer &lt;a href="https://aws.amazon.com/elasticloadbalancing/"&gt;service&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;Running Test Cases&lt;/h3&gt;

&lt;h6&gt;Important Note&lt;/h6&gt;

&lt;p&gt;For the Docker deployment, the test cases are configured for the default setup in docker-compose.yml. For the cloud deployment, the test cases in &lt;code&gt;./tests/pytest.py&lt;/code&gt; are configured for a two-node system with a scale-up test to expand it to three nodes.&lt;/p&gt;

&lt;h6&gt;Steps to Run the Test Cases&lt;/h6&gt;

&lt;ol&gt;
&lt;li&gt;If running the Docker setup, attach to the console of the client using the command below. If running on a cloud service, skip this step, as the ports are public.
&lt;/li&gt;
&lt;/ol&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;code&gt;$docker container exec -it advdistsystems_client_1 /bin/bash&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;Reconfigure the system&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Reconfigure the system to clear the cache and update routing information. The &lt;a href="https://github.com/UtkarshVIT/AdvDistSystems/blob/master/tests/reconfigure.sh"&gt;reconfigure.sh&lt;/a&gt; file in the root directory is already updated with the information of the experimental setup. Update the file only if you are using a custom setup. Run the following command from the client's shell.&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;$sh reconfigure.sh&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Reconfigure test cases.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The IP addresses of the nodes in &lt;code&gt;/tests/pytest.py&lt;/code&gt; are preset for the Docker setup. If running on a cloud deployment, edit them to match your nodes; if using Docker, you can skip this step.&lt;/p&gt;

&lt;ol start="4"&gt;
&lt;li&gt;Execute test cases.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The following command will simulate four scenarios and execute the test cases in &lt;code&gt;tests/pytest.py&lt;/code&gt;. For detailed information on the test cases, see &lt;a href="https://github.com/UtkarshVIT/AdvDistSystems/blob/master/tests/pytest.py"&gt;pytest.py&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;$python pytest.py&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;h3&gt;Common Commands&lt;/h3&gt;

&lt;h6&gt;Important Note&lt;/h6&gt;

&lt;p&gt;Please ensure you have completed Step 1 and Step 2 of &lt;strong&gt;Running Test Cases&lt;/strong&gt; before running any of the commands below.&lt;/p&gt;

&lt;h6&gt;Commands&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;Set an environment variable for the load balancer. For example, in the Docker setup, use this command to set the variable for the load balancer.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;$lb="172.23.0.6:5000"&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adding a key-val pair [sending POST to LB]
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;$curl --data "key=&amp;lt;custom-key&amp;gt;&amp;amp;val=&amp;lt;custom-val&amp;gt;" $lb/route&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fetching the val for a key [sending GET to LB]
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;$curl $lb/route?key=&amp;lt;custom-key&amp;gt;&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adding a node to the system
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;$curl -X POST $lb/add_node/&amp;lt;key-of-new-node&amp;gt;/&amp;lt;ip:port-of-new-node&amp;gt;&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;For example, in the Docker setup, use this command to add a node at key 5000 in the system.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;$curl -X POST $lb/add_node/5000/172.23.0.5:5000&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Removing a node from the system
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;$curl $lb/remove_node/&amp;lt;ip:port-of-target-node&amp;gt;&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;
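&lt;p&gt;As a rough illustration of why adding a node (e.g. at key 5000, as above) only moves a slice of the keys: with consistent hashing, a key changes owner only if its clockwise successor on the ring changes. A minimal Python sketch, assuming clockwise assignment; the helper names are hypothetical, not the project's code.&lt;/p&gt;

```python
import bisect

def successor(positions, h):
    """Position of the first node at or after h, wrapping around the ring."""
    return positions[bisect.bisect_left(positions, h) % len(positions)]

def moves_on_join(new_pos, node_positions, key_hashes):
    """Key hashes whose owning node changes when a node joins at new_pos."""
    old = sorted(node_positions)
    new = sorted(node_positions + [new_pos])
    return [h for h in key_hashes if successor(old, h) != successor(new, h)]
```

&lt;p&gt;In the default two-node ring (3000, 8000), joining at 5000 takes over only the keys hashing to positions after 3000 and up to 5000; everything else stays put.&lt;/p&gt;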

</description>
      <category>showdev</category>
      <category>distributedsystems</category>
      <category>octograd2020</category>
      <category>2020devgrad</category>
    </item>
  </channel>
</rss>
