<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dipankar Medhi</title>
    <description>The latest articles on DEV Community by Dipankar Medhi (@dipankarmedhi).</description>
    <link>https://dev.to/dipankarmedhi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F839089%2F4dac16a5-40ee-4b55-8edf-c54e2e592557.jpg</url>
      <title>DEV Community: Dipankar Medhi</title>
      <link>https://dev.to/dipankarmedhi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dipankarmedhi"/>
    <language>en</language>
    <item>
      <title>Recognize dates from documents using Sliding Window Algorithm &amp; Python OCR.</title>
      <dc:creator>Dipankar Medhi</dc:creator>
      <pubDate>Sat, 07 Jan 2023 17:23:44 +0000</pubDate>
      <link>https://dev.to/dipankarmedhi/recognize-dates-from-documents-using-sliding-window-algorithm-python-ocr-2oae</link>
      <guid>https://dev.to/dipankarmedhi/recognize-dates-from-documents-using-sliding-window-algorithm-python-ocr-2oae</guid>
      <description>&lt;p&gt;Hey there 👋,&lt;/p&gt;

&lt;p&gt;Today, let's solve a text processing problem that asks us to find any date present in text extracted from an image.&lt;/p&gt;

&lt;p&gt;We are using &lt;strong&gt;easyocr&lt;/strong&gt;, a Python OCR library, to find the text in images. Let's move on to the code.&lt;/p&gt;

&lt;h1&gt;
  
  
  Extracting text from images | Setting up easyocr
&lt;/h1&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;We start by creating a &lt;code&gt;data-extraction.py&lt;/code&gt; module.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Create a &lt;strong&gt;DataExtraction&lt;/strong&gt; class and initiate the &lt;strong&gt;easyocr&lt;/strong&gt; model.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
from datetime import datetime
import easyocr
import re


class DataExtraction:
    def __init__(self) -&amp;gt; None:
        # map month abbreviations to two-digit month numbers
        self.months = {
            "JAN": "01",
            "FEB": "02",
            "MAR": "03",
            "APR": "04",
            "MAY": "05",
            "JUN": "06",
            "JUL": "07",
            "AUG": "08",
            "SEP": "09",
            "OCT": "10",
            "NOV": "11",
            "DEC": "12",
        }
        # easyocr reader for English text
        self.reader = easyocr.Reader(["en"])

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Converting date strings to DateTime objects
&lt;/h1&gt;

&lt;p&gt;There can be an unknown number of date formats, and parsing every one of them would take a near-infinite amount of time and work. So in this example, we'll consider only a few well-known forms.&lt;/p&gt;

&lt;p&gt;We'll try to identify &lt;strong&gt;dd mmm yyyy&lt;/strong&gt; date formats in a string.&lt;/p&gt;

&lt;p&gt;For example, if the given date is &lt;strong&gt;15 sd f may 2019&lt;/strong&gt;, then the output should be &lt;strong&gt;15052019&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;We are going to use the &lt;strong&gt;Sliding Window&lt;/strong&gt; technique to detect whether a month name is present between two groups of numerical characters.&lt;/p&gt;

&lt;p&gt;The string can include numbers, letters, and other characters. For example, consider 𝗴𝘀 𝟭𝟱 𝗺𝗮𝗶 𝗺𝗮𝘆 𝟮𝟬𝟭𝟵 𝘀𝗴𝗳 𝘀. The date should be 15th May 2019.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnyun33fllwkpeqlw3pvo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnyun33fllwkpeqlw3pvo.png" width="800" height="412"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;The first step is to implement a sliding window to convert MMM to a number, for example, &lt;em&gt;may&lt;/em&gt; to &lt;em&gt;05&lt;/em&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We create a function that takes in a string and finds if it contains any month from the above dictionary, &lt;strong&gt;months&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def month_to_num(self, s: str) -&amp;gt; str:
        res = ""
        start = 0
        try:
            for end in range(len(s)):
                rightChar = s[end]
                res += rightChar
                if len(res) == 3:
                    if res.upper() in self.months.keys():
                        numeric_date = self.months[res.upper()]
                        return numeric_date
                    start += 1
                    res = res[1:]
        except Exception as e:
            pass

        return ""

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Next, we create a function that takes in a string and gives us the desired format.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def find_date_string(self, s: str) -&amp;gt; list: # s = "𝗴𝘀 𝟭𝟱 𝗺𝗮𝗶 𝗺𝗮𝘆 𝟮𝟬𝟭𝟵 𝘀𝗴𝗳 "
        s1 = " ".join(re.split(r"([a-zA-Z])([0-9]+)", s))
        s2 = " ".join(re.split(r"([0-9]+)([a-zA-Z]+)", s1))
        text = "-" + "-".join(re.split(r"[-;,.\s]\s*", s2)) + "-" # "gs-15-mai-may-2019-sgf"
        dates_type_1 = re.findall(r"-[0-9][0-9]-.*?-[0-9][0-9][0-9][0-9]-", text) # "-15-mai-may-2019"
        date_objects = []
        if len(dates_type_1) &amp;gt; 0:
            date_objs = self.get_date_object(dates_type_1)
            for date_obj in date_objs:
                date_objects.append(date_obj)
        return date_objects

def get_date_object(self, date_type_1_list: list):
    dates = []
    for date_str in date_type_1_list:
        day_str = date_str[1:3]
        month_str = date_str[3:-4]
        year_str = date_str[-5:-1]

        month_number = self.month_to_num(month_str)
        if month_number == "":
            return ""

        result_date_str = f"{day_str}-{month_number}-{year_str}"
        date_object = datetime.strptime(result_date_str, "%d-%m-%Y")
        dates.append(date_object)  

     return dates

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Now we just have to pass the extracted strings into the above functions.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def get_date_from_img(self, img_path: str):
        result = []

        # extract the texts from the img
        text_strings = self.reader.readtext(img_path, detail=0)

        # check every string for dates
        for s in text_strings:
            date_obj_list = self.find_date_string(s)
            if len(date_obj_list) &amp;gt; 0:
                result.append(date_obj_list)
       return result

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;That's it. We have all the DateTime objects present in a document image.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This method can be used on any kind of document, provided the date format matches the defined type. There are many kinds of date formats used throughout the world, and different countries use different formats. Parsing each one of them will require some more effort, but it is definitely achievable.&lt;/p&gt;

&lt;p&gt;Here are some of the other patterns that can be used for different date types.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"""
1. 1 mai/may 2019
2. 1 mai/may 19
3. 12 09 2016
4. 2 09 2016
5. 12 09 16
6. 2 09 16  
"""  
dates_type_2 = re.findall(r"-[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]-", text)
dates_type_3 = re.findall(r"-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]-", text)
dates_type_4 = re.findall(r"-[0-9][0-9]-.*?-[0-9][0-9]-", text)
dates_type_5 = re.findall(r"-[0-9]-.*?-[0-9][0-9]-", text)
dates_type_6 = re.findall(r"-[0-9]-.*?-[0-9][0-9][0-9][0-9]-", text)
dates_type_7 = re.findall(r"-[0-9]-[0-9][0-9]-[0-9][0-9]-", text)
dates_type_8 = re.findall(r"-[0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]-", text)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
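&lt;p&gt;As an alternative to hand-written regexes, the normalized candidates can also be tried against a list of known &lt;code&gt;strptime&lt;/code&gt; formats. This is a rough sketch; the format list is illustrative, not exhaustive:&lt;/p&gt;

```python
from datetime import datetime

# Candidate formats for the "dd-mm-yyyy"-style strings built above.
# Order matters: try the stricter month-name formats before the numeric ones.
KNOWN_FORMATS = ["%d-%b-%Y", "%d-%b-%y", "%d-%m-%y", "%d-%m-%Y"]

def try_parse(date_str: str):
    """Return a datetime for the first matching format, else None."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(date_str, fmt)
        except ValueError:
            continue
    return None

print(try_parse("15-may-2019"))  # matched by %d-%b-%Y
print(try_parse("15-05-19"))     # matched by %d-%m-%y
```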



&lt;p&gt;That's all, folks! See you soon.&lt;/p&gt;

&lt;p&gt;Happy Coding 🤟&lt;/p&gt;

</description>
      <category>python</category>
      <category>computervision</category>
      <category>data</category>
      <category>technology</category>
    </item>
    <item>
      <title>How I built a real-time Machine Learning system with Kafka, Elasticsearch, Kibana, and Docker</title>
      <dc:creator>Dipankar Medhi</dc:creator>
      <pubDate>Sun, 04 Dec 2022 04:00:14 +0000</pubDate>
      <link>https://dev.to/dipankarmedhi/how-i-built-a-real-time-machine-learning-system-with-kafka-elasticsearch-kibana-and-docker-3h50</link>
      <guid>https://dev.to/dipankarmedhi/how-i-built-a-real-time-machine-learning-system-with-kafka-elasticsearch-kibana-and-docker-3h50</guid>
      <description>&lt;p&gt;We will design and build a real-time sentiment analysis and hate detection system.&lt;/p&gt;

&lt;p&gt;This is a project that I made in the &lt;strong&gt;Turn Language into Action, Natural Language Hackathon&lt;/strong&gt; by &lt;a href="http://Expert.ai" rel="noopener noreferrer"&gt;&lt;strong&gt;Expert.ai&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I have always been interested in real-time systems and have always wondered how things work under the hood.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;HOW&lt;/strong&gt;? 🤔&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4cjanu1q324kr13r2uy5.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4cjanu1q324kr13r2uy5.gif" width="480" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So, I found this hackathon to be a perfect opportunity for me to learn and build something new.&lt;/p&gt;

&lt;p&gt;Well then, let's &lt;strong&gt;ROLL!!!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F59slef9zuuo5u7jirvpq.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F59slef9zuuo5u7jirvpq.gif" width="480" height="200"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Project Architecture
&lt;/h1&gt;

&lt;p&gt;This is what the complete pipeline looks like. Don't worry, I will cover everything in detail.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq981sfr7g9zz8s9xco61.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq981sfr7g9zz8s9xco61.png" alt="Project Architecture" width="800" height="368"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But before we move on with the tools and architecture, let me talk about our data sources.&lt;/p&gt;

&lt;p&gt;I have used the Twitter API for real-time tweets, specifically Python's &lt;a href="https://docs.tweepy.org/en/stable/" rel="noopener noreferrer"&gt;tweepy&lt;/a&gt; library for streaming tweets. In addition to that, I have used &lt;a href="https://newsapi.org/" rel="noopener noreferrer"&gt;NewsAPI&lt;/a&gt; for daily news articles.&lt;/p&gt;

&lt;p&gt;I have used &lt;strong&gt;Docker&lt;/strong&gt; to set up all the necessary tools as containers for this project.&lt;/p&gt;

&lt;p&gt;Now let's talk about each component.&lt;/p&gt;

&lt;h1&gt;
  
  
  Apache Kafka
&lt;/h1&gt;

&lt;p&gt;For ingesting the real-time data, I have used Apache Kafka.&lt;/p&gt;

&lt;p&gt;Now, what is &lt;strong&gt;Apache Kafka?&lt;/strong&gt; Well&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Apache Kafka (Kafka) is an open source, distributed streaming platform that enables (among other things) the development of real-time, event-driven applications. (IBM)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Since I have used Python, there is a Python client, &lt;a href="https://github.com/dpkp/kafka-python" rel="noopener noreferrer"&gt;&lt;strong&gt;kafka-python&lt;/strong&gt;&lt;/a&gt;, available that makes working with Kafka relatively easy.&lt;/p&gt;

&lt;p&gt;Using the &lt;strong&gt;KafkaProducer&lt;/strong&gt;, I've sent the messages (Twitter and NewsAPI) to the &lt;strong&gt;KafkaConsumer&lt;/strong&gt; over two Kafka topics: one for the tweets and the other for the news articles.&lt;/p&gt;

&lt;p&gt;The KafkaConsumer then calls the Machine Learning service to classify the sentiment of the news media articles and detect hate in the tweets.&lt;/p&gt;

&lt;h1&gt;
  
  
  Machine Learning service
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="http://Expert.ai" rel="noopener noreferrer"&gt;Expert.ai&lt;/a&gt; turns language into data so teams can make better decisions.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Since I built this project as a part of the &lt;a href="http://Expert.ai" rel="noopener noreferrer"&gt;&lt;strong&gt;Expert.ai&lt;/strong&gt;&lt;/a&gt; hackathon, I have used their API for sentiment analysis/classification and hate detection.&lt;/p&gt;

&lt;p&gt;However, you can always use your own &lt;strong&gt;Tensorflow&lt;/strong&gt; or &lt;strong&gt;PyTorch&lt;/strong&gt; model. Also, &lt;strong&gt;Huggingface&lt;/strong&gt; has some very relevant models for sentiment classifications and they are straightforward to set up. You should check them out!&lt;/p&gt;

&lt;p&gt;I am using the &lt;a href="https://docs.expert.ai/nlapi/latest/guide/sentiment-analysis/" rel="noopener noreferrer"&gt;&lt;strong&gt;Sentiment Analysis&lt;/strong&gt;&lt;/a&gt; and &lt;a href="https://docs.expert.ai/nlapi/latest/guide/detection/hate-speech/" rel="noopener noreferrer"&gt;&lt;strong&gt;Hate speech detection&lt;/strong&gt;&lt;/a&gt; APIs from &lt;a href="http://Expert.ai" rel="noopener noreferrer"&gt;Expert.ai&lt;/a&gt; NL API.&lt;/p&gt;

&lt;h1&gt;
  
  
  Elasticsearch
&lt;/h1&gt;

&lt;p&gt;Okay, we have the classified data. Now what?&lt;/p&gt;

&lt;p&gt;We have to store that data somewhere to use it for further analytics. I have used Elasticsearch to store the data and Kibana to visualize it.&lt;/p&gt;

&lt;p&gt;You might ask, why Kibana?&lt;/p&gt;

&lt;p&gt;Let me introduce you to the &lt;strong&gt;ELK stack&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;ELK is the acronym for three open source projects: Elasticsearch, Logstash, and Kibana. Elasticsearch is a search and analytics engine. Logstash is a server-side data processing pipeline that ingests data from multiple sources simultaneously, transforms it, and then sends it to a stash like Elasticsearch. Kibana lets users visualize data with charts and graphs in Elasticsearch. &lt;a href="http://Elastic.co" rel="noopener noreferrer"&gt;Elastic.co&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Elasticsearch, Logstash and Kibana go hand in hand in most data engineering or data ingestion use cases. But I have omitted Logstash to keep the pipeline simple and focused on its goal.&lt;/p&gt;

&lt;p&gt;That said, you can always add Logstash and scale the pipeline further as needed.&lt;/p&gt;

&lt;p&gt;That is enough about the ELK stack. Let's jump into the Elasticsearch design.&lt;/p&gt;

&lt;h3&gt;
  
  
  Elasticsearch: The Official Distributed Search &amp;amp; Analytics Engine
&lt;/h3&gt;

&lt;p&gt;Like databases, Elasticsearch has &lt;strong&gt;"Indexes"&lt;/strong&gt;. These indexes store data defined with certain mapping types. A mapping is much like a schema in other databases.&lt;/p&gt;

&lt;p&gt;The mapping describes the fields in the JSON documents along with their data type, as well as how they should be indexed in the indexes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyqr9utiq22e1uis9fb88.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyqr9utiq22e1uis9fb88.png" width="800" height="108"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Databases ~ Indexes&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The above image will give you a better idea about Elasticsearch indexes compared to MySQL or PostgreSQL.&lt;/p&gt;
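&lt;p&gt;For example, a mapping for an index of classified tweets might look like this (field names and types are my assumptions for illustration; the real schema may differ):&lt;/p&gt;

```python
# Example Elasticsearch mapping for classified tweets, expressed as the JSON
# body you would pass when creating the index (e.g. with the official Python
# client: es.indices.create(index="tweets", mappings=tweet_mapping)).
tweet_mapping = {
    "properties": {
        "username":   {"type": "keyword"},  # exact-match field
        "text":       {"type": "text"},     # full-text searchable
        "hate":       {"type": "boolean"},  # output of the hate-detection model
        "sentiment":  {"type": "keyword"},  # e.g. "positive" / "negative"
        "created_at": {"type": "date"},
    }
}

print(sorted(tweet_mapping["properties"]))
```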

&lt;h1&gt;
  
  
  Kibana
&lt;/h1&gt;

&lt;p&gt;Done with storing the messages/data in the Elasticsearch indexes? Okay, Great! We can finally use that resultant data to visualize and get more insights about the data.&lt;/p&gt;

&lt;p&gt;We use Kibana for that.&lt;/p&gt;

&lt;h2&gt;
  
  
  Kibana: Explore, Visualize, Discover Data | Elastic
&lt;/h2&gt;


&lt;blockquote&gt;
&lt;p&gt;Kibana is a free and open user interface that lets you visualize your Elasticsearch data and navigate the Elastic Stack.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Kibana Dashboard
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b7vmi7fkldggedoepj5.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b7vmi7fkldggedoepj5.gif" width="800" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is what my final Kibana dashboard looks like. You can check out the code at my GitHub &lt;a href="https://github.com/Dipankar-Medhi/sense_media" rel="noopener noreferrer"&gt;repo&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Feel free to leave a star if you like the project.&lt;/p&gt;

&lt;p&gt;This part covers only the idea or the overview of the project along with the project architecture. I'll soon add the coding section in a separate part, so stay tuned for that.&lt;/p&gt;




&lt;p&gt;That's all, folks. See you soon 👋&lt;/p&gt;

&lt;p&gt;Happy coding.&lt;/p&gt;

</description>
      <category>streaming</category>
      <category>machinelearning</category>
      <category>dataengineering</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Streaming tweets using Twitter V2 API | Tweepy</title>
      <dc:creator>Dipankar Medhi</dc:creator>
      <pubDate>Wed, 17 Aug 2022 04:53:43 +0000</pubDate>
      <link>https://dev.to/dipankarmedhi/streaming-tweets-using-twitter-v2-api-tweepy-58pf</link>
      <guid>https://dev.to/dipankarmedhi/streaming-tweets-using-twitter-v2-api-tweepy-58pf</guid>
      <description>&lt;p&gt;With v2 Twitter API, things have changed when it comes to streaming tweets. Today we're going to see how to use StreamingClient to stream tweets and store them into an SQLite3 database.&lt;/p&gt;

&lt;h2&gt;
  
  
  About Twitter V2 API
&lt;/h2&gt;

&lt;p&gt;For streaming tweets, you will most likely need to apply for an "Elevated" account.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb2ylfydb38psamezaskf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb2ylfydb38psamezaskf.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The application process is fairly simple and easy. Once the application has been submitted, you will receive an "approval" email from the Twitter Dev team.&lt;/p&gt;

&lt;h2&gt;
  
  
  Things to be done on your Twitter developer portal
&lt;/h2&gt;

&lt;p&gt;After you've got your &lt;strong&gt;Elevated&lt;/strong&gt; access, visit the &lt;a href="https://developer.twitter.com/en/portal/dashboard" rel="noopener noreferrer"&gt;Developer portal&lt;/a&gt; to get your projects and apps ready.&lt;/p&gt;

&lt;p&gt;Move to the projects and apps menu, present on the left side of the developer portal, and add an application as required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Click on "Add app"&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Select App environment&lt;/li&gt;
&lt;li&gt;App name&lt;/li&gt;
&lt;li&gt;Keys &amp;amp; Token&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Next, you will get your API keys and tokens along with a bearer token.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 Save them, cause we'll need them to make requests to the Twitter API.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now, let's move on to the next section.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installing tweepy
&lt;/h2&gt;

&lt;p&gt;Installing Tweepy is pretty straightforward 📏.&lt;/p&gt;

&lt;p&gt;The official &lt;a href="https://docs.tweepy.org/en/stable/install.html" rel="noopener noreferrer"&gt;tweepy documentation&lt;/a&gt; has everything we need. Make sure to have a look at it.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Make a Python virtual environment: &lt;code&gt;python -m venv venv&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Install tweepy &lt;code&gt;pip install tweepy&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;See, that's not hard😸.&lt;/p&gt;

&lt;p&gt;Now that we are done with the requirements, we can move to the coding section.&lt;/p&gt;

&lt;h2&gt;
  
  
  Let's write some code
&lt;/h2&gt;

&lt;p&gt;Before that, let's structure our code.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Make a &lt;strong&gt;database&lt;/strong&gt; directory where we'll store the SQLite DB files.&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;main.py&lt;/code&gt; file where all our code goes in, and&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;.env&lt;/code&gt; file that will store all our API keys and tokens.&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 For this project, I have put everything into one file but you can always refactor them into separate modules as per requirements.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Now, We are ready! 🚗&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Store the API keys and tokens in a .env file
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;API_KEY="apikeygoeshere"
API_KEY_SECRET="apikeysecretgoeshere"
ACCESS_TOKEN="accesstokengoeshere"
ACCESS_TOKEN_SECRET="accesstokensecretgoeshere"
BEARER_TOKEN="bearertokengoeshere"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Importing all necessary packages
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from dotenv import load_dotenv
import os
import sqlite3
import tweepy
import time
import argparse

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Loading the API credentials
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;load_dotenv()
api_key = os.getenv("API_KEY")
api_key_secret = os.getenv("API_KEY_SECRET")
access_token = os.getenv("ACCESS_TOKEN")
access_token_secret = os.getenv("ACCESS_TOKEN_SECRET")
bearer_token = os.getenv("BEARER_TOKEN")

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Creating the database
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;conn = sqlite3.connect("./database/tweets.db")
print("DB created!")
cursor = conn.cursor()
cursor.execute("CREATE TABLE IF NOT EXISTS tweets (username TEXT,tweet TEXT)")
print("Table created")

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Creating the Streaming class
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class TweetStreamV2(tweepy.StreamingClient):
    new_tweet = {}

    def on_connect(self):
        print("Connected!")

    def on_includes(self, includes):
        self.new_tweet["username"] = includes["users"][0].username
        print(self.new_tweet)
        # insert tweets in db
        cursor.execute(
            "INSERT INTO tweets VALUES (?,?)",
            (
                self.new_tweet["username"],
                self.new_tweet["tweet"],
            ),
        )
        conn.commit()
        # print(self.new_tweet)
        print("tweet added to db!")
        print("-" * 30)

    def on_tweet(self, tweet):
        # keep only original tweets (skip retweets, replies and quotes)
        if tweet.referenced_tweets is None:
            self.new_tweet["tweet"] = tweet.text
            print(tweet.text)
            time.sleep(0.3)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What does the code say?&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Before moving into details, I request you to please have a look at the &lt;a href="https://docs.tweepy.org/en/stable/streamingclient.html" rel="noopener noreferrer"&gt;StreamingClient&lt;/a&gt; documentation. This will make things more clear.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;on_connect&lt;/code&gt; method prints a "Connected" message, letting us know that we have successfully connected to the Twitter API.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;on_tweet&lt;/code&gt; method receives a tweet and processes it according to the conditions, if there are any, and adds the tweet to the hashmap. &lt;/li&gt;
&lt;li&gt;
&lt;code&gt;on_includes&lt;/code&gt; is responsible for the user details and adds the user data to the hashmap.&lt;/li&gt;
&lt;li&gt;Finally, the data in the hashmap is inserted into the tweets table.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  6. Main function
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def main():
    # get args
    parser = argparse.ArgumentParser()
    parser.add_argument("search_query", help="Twitter search query")
    args = parser.parse_args()
    query = args.search_query

    stream = TweetStreamV2(bearer_token)

    # delete previously added rules, if any
    existing_rules = stream.get_rules().data
    if existing_rules:
        stream.delete_rules([rule.id for rule in existing_rules])
    # add new query
    stream.add_rules(tweepy.StreamRule(query))

    print(stream.get_rules())

    stream.filter(
        tweet_fields=["created_at", "lang"],
        expansions=["author_id"],
        user_fields=["username", "name"],
    )

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What does the code say?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Python script takes an argument, &lt;code&gt;search_query&lt;/code&gt;. &lt;/li&gt;
&lt;li&gt;This argument is added to the stream rules after deleting the previously added rules.&lt;/li&gt;
&lt;li&gt;Rules are basically search queries that go in as input to the stream object. There can be more than one rule, and each rule has a &lt;code&gt;value&lt;/code&gt;, a &lt;code&gt;tag&lt;/code&gt; and an &lt;code&gt;id&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;id&lt;/code&gt; is passed on to the &lt;code&gt;delete_rules&lt;/code&gt; method to delete a rule.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 I suggest you refer to the official &lt;a href="https://docs.tweepy.org/en/stable/streamingclient.html#tweepy.StreamingClient.add_rules" rel="noopener noreferrer"&gt;documentation&lt;/a&gt; for more details on adding and deleting rules.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;Next, we have the filter method. It is responsible for filtering the tweets based on the &lt;code&gt;query&lt;/code&gt; passed and the fields chosen. &lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;All the different fields are:&lt;/p&gt;

&lt;p&gt;expansions (list[str] | str)&lt;br&gt;
media_fields (list[str] | str)&lt;br&gt;
place_fields (list[str] | str)&lt;br&gt;
poll_fields (list[str] | str)&lt;br&gt;
tweet_fields (list[str] | str)&lt;br&gt;
user_fields (list[str] | str)&lt;br&gt;
threaded (bool): whether or not to use a thread to run the stream&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;💡 Refer to the official &lt;a href="https://docs.tweepy.org/en/stable/streamingclient.html#tweepy.StreamingClient.filter" rel="noopener noreferrer"&gt;documentation&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Let's try out our app
&lt;/h3&gt;

&lt;p&gt;To test that everything is working, we pass the &lt;code&gt;Spiderman&lt;/code&gt; argument when running the &lt;code&gt;main.py&lt;/code&gt; file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ python main.py Spiderman

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will create a &lt;code&gt;tweets.db&lt;/code&gt; file inside the database directory.&lt;/p&gt;

&lt;p&gt;And if you view the &lt;code&gt;tweets.db&lt;/code&gt; file, you will find a table with &lt;code&gt;username&lt;/code&gt; and &lt;code&gt;tweet&lt;/code&gt; as its columns respectively.&lt;/p&gt;

&lt;p&gt;| username | tweet |&lt;br&gt;
| some_username | some_tweet_about_spiderman |&lt;/p&gt;
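&lt;p&gt;To double-check what got stored, you can query the table with the &lt;code&gt;sqlite3&lt;/code&gt; module. Here is a small self-contained sketch (it seeds an in-memory database with one sample row; point it at &lt;code&gt;./database/tweets.db&lt;/code&gt; to inspect the real file):&lt;/p&gt;

```python
import sqlite3

# Inspect stored tweets; an in-memory database with one sample row keeps the
# snippet self-contained. Replace ":memory:" with "./database/tweets.db" to
# read the file created by the streaming script.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE IF NOT EXISTS tweets (username TEXT, tweet TEXT)")
cursor.execute("INSERT INTO tweets VALUES (?, ?)", ("someone", "a tweet about Spiderman"))
conn.commit()

rows = cursor.execute("SELECT username, tweet FROM tweets").fetchall()
for username, tweet in rows:
    print(username, "|", tweet)
conn.close()
```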

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This is an example showing how to use the Twitter v2 API with Python, using the Tweepy library, to get live tweets and store them in a database. You can also use &lt;code&gt;csv&lt;/code&gt; or &lt;code&gt;json&lt;/code&gt; files to store tweets.&lt;/p&gt;
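&lt;p&gt;For instance, a minimal sketch of writing tweets to CSV instead of SQLite (a &lt;code&gt;StringIO&lt;/code&gt; stands in for an open file so the snippet is self-contained; the sample rows are illustrative):&lt;/p&gt;

```python
import csv
import io

# Minimal sketch of writing tweets to CSV. With a real file you would use
# open("tweets.csv", "w", newline="") instead of the StringIO buffer.
rows = [
    {"username": "someone", "tweet": "a tweet about Spiderman"},
    {"username": "someone_else", "tweet": "another tweet"},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["username", "tweet"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```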

&lt;p&gt;I will keep adding more blogs to this series.&lt;/p&gt;

&lt;p&gt;🤝Follow for quick updates.&lt;/p&gt;




&lt;p&gt;🌎Explore, 🎓Learn, 👷Build.&lt;/p&gt;

&lt;p&gt;Happy Coding💛&lt;/p&gt;

</description>
      <category>streaming</category>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>How to create an end-to-end Machine Learning pipeline with AMLS (Azure Machine Learning Studio)</title>
      <dc:creator>Dipankar Medhi</dc:creator>
      <pubDate>Mon, 02 May 2022 06:55:36 +0000</pubDate>
      <link>https://dev.to/dipankarmedhi/how-to-create-an-end-to-end-machine-learning-pipeline-with-amls-azure-machine-learning-studio-3mad</link>
      <guid>https://dev.to/dipankarmedhi/how-to-create-an-end-to-end-machine-learning-pipeline-with-amls-azure-machine-learning-studio-3mad</guid>
      <description>&lt;p&gt;Welcome👋!&lt;/p&gt;

&lt;p&gt;Today let us build an end-to-end Machine learning pipeline with Microsoft Azure Machine Learning Studio.&lt;/p&gt;

&lt;p&gt;We are using the adult income dataset.&lt;/p&gt;

&lt;p&gt;For a more detailed tutorial, visit the official Microsoft Azure documentation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-machine-learning-pipelines" rel="noopener noreferrer"&gt;https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-machine-learning-pipelines&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Creating the Workspace
&lt;/h2&gt;

&lt;p&gt;The first step is to create the &lt;a href="https://docs.microsoft.com/en-us/azure/machine-learning/how-to-manage-workspace" rel="noopener noreferrer"&gt;Azure Machine Learning workspace&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Connect to the workspace
&lt;/h2&gt;

&lt;p&gt;Import all the dependencies&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from azureml.core import Workspace, Datastore
from azureml.core import Experiment
from azureml.core import Model
import azureml.core
import pandas as pd
import numpy as np
import joblib
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score
from sklearn.metrics import roc_curve
from sklearn import metrics

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Connecting to the workspace&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ws = Workspace.from_config()
print(ws)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3: Create an Experiment
&lt;/h2&gt;

&lt;p&gt;We are naming our experiment "new-adult-exp".&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Create an Azure ML experiment in your workspace
experiment = Experiment(workspace = ws, name = "new-adult-exp")
run = experiment.start_logging()

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 4: Setting up a datastore
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What's a datastore?
&lt;/h3&gt;

&lt;p&gt;A datastore holds the data that the pipeline steps access. By default, a datastore connected to Azure Blob Storage is registered with the workspace.&lt;/p&gt;

&lt;h3&gt;
  
  
  Azure Storage data services
&lt;/h3&gt;

&lt;p&gt;The Azure Storage platform includes the following data services:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Azure Blobs&lt;/strong&gt; : A massively scalable object store for text and binary data. Also includes support for big data analytics through Data Lake Storage Gen2.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Azure Files&lt;/strong&gt; : Managed file shares for cloud or on-premises deployments.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Azure Queues&lt;/strong&gt; : A messaging store for reliable messaging between application components.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Azure Tables&lt;/strong&gt; : A NoSQL store for schemaless storage of structured data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Azure Disks&lt;/strong&gt; : Block-level storage volumes for Azure VMs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a brief overview of all these data services, I recommend the official documentation. 👇&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.microsoft.com/en-us/azure/storage/common/storage-introduction?toc=/azure/storage/blobs/toc.json" rel="noopener noreferrer"&gt;https://docs.microsoft.com/en-us/azure/storage/common/storage-introduction?toc=/azure/storage/blobs/toc.json&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting up the datastore
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#upload data by using get_default_datastore()
ds = ws.get_default_datastore()
ds.upload(src_dir='./data', target_path='data', overwrite=True, show_progress=True)

print('Done')

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmxrx98e4ysvq1wjmgoix.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmxrx98e4ysvq1wjmgoix.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the Tabular Dataset
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from azureml.core import Dataset

csv_paths = [(ds, 'data/adult.csv')]
tab_ds = Dataset.Tabular.from_delimited_files(path=csv_paths)
tab_ds = tab_ds.register(workspace=ws, name='adult_ds_table',create_new_version=True)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fobi07j4xy4zx3bhmd504.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fobi07j4xy4zx3bhmd504.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: Creating a pipeline folder
&lt;/h2&gt;

&lt;p&gt;Inside the &lt;code&gt;User&lt;/code&gt; folder is the username folder; inside that, we create a new &lt;code&gt;pipeline&lt;/code&gt; folder that will contain all the code files.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2eg21kpsfjzly263b15d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2eg21kpsfjzly263b15d.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 6: Create a Compute Target
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from azureml.core.compute import ComputeTarget, AmlCompute

compute_name = "aml-compute"
vm_size = "STANDARD_NC6"
if compute_name in ws.compute_targets:
    compute_target = ws.compute_targets[compute_name]
    if compute_target and type(compute_target) is AmlCompute:
        print('Found compute target: ' + compute_name)
else:
    print('Creating a new compute target...')
    provisioning_config = AmlCompute.provisioning_configuration(vm_size=vm_size, # STANDARD_NC6 is GPU-enabled
                                                                min_nodes=0,
                                                                max_nodes=4)
    # create the compute target
    compute_target = ComputeTarget.create(
        ws, compute_name, provisioning_config)

    # Can poll for a minimum number of nodes and for a specific timeout.
    # If no min node count is provided it will use the scale settings for the cluster
    compute_target.wait_for_completion(
        show_output=True, min_node_count=None, timeout_in_minutes=20)

    # For a more detailed view of current cluster status, use the 'status' property
    print(compute_target.status.serialize())

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 7: Loading the dataset and training
&lt;/h2&gt;

&lt;p&gt;I am loading the tabular data from the &lt;code&gt;Datasets&lt;/code&gt; under the Assets tab.&lt;/p&gt;

&lt;p&gt;Here, I am using a &lt;strong&gt;Random Forest classifier&lt;/strong&gt; to predict whether the income is at most 50K or above 50K.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Loading the dataset
from azureml.core import Run
from azureml.core import Dataset
from sklearn.ensemble import RandomForestClassifier

dataset = Dataset.get_by_name(ws, 'adult_ds_table', version='latest')

# converting our dataset to pandas dataframe
adult_data = dataset.to_pandas_dataframe()
# dropping the null values
adult_data = adult_data.dropna()

## Performing data preprocessing
df = adult_data.rename(columns={'fnlwgt': 'final-wt'})

# outlier treatment
def remove_outlier_IQR(df, field_name):
    iqr = 1.5 * (np.percentile(df[field_name], 75) -
                 np.percentile(df[field_name], 25))
    df.drop(df[df[field_name] &amp;gt; (
        iqr + np.percentile(df[field_name], 75))].index, inplace=True)
    df.drop(df[df[field_name] &amp;lt; (np.percentile(
        df[field_name], 25) - iqr)].index, inplace=True)
    return df

df2 = remove_outlier_IQR(df,'final-wt')
df_final = remove_outlier_IQR(df2, 'hours-per-week')
df_final.shape

df_final = df_final.replace({'?': 'unknown'})
cat_df = df_final.select_dtypes(exclude=[np.number, np.datetime64])
num_df = df_final.select_dtypes(exclude=[object, np.datetime64])  # builtin object: np.object is deprecated
cat_df = pd.get_dummies(cat_df)
data = pd.concat([cat_df,num_df],axis=1)

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

X1 = data.drop(columns=['income_&amp;lt;=50K', 'income_&amp;gt;50K'])
y1 = data['income_&amp;lt;=50K']

# Scaling the data
scaler = StandardScaler()
scaled_df = scaler.fit_transform(X1)

X1_train, X1_test, y1_train, y1_test = train_test_split(
    scaled_df, y1, test_size=0.3)

# model training
rfm = RandomForestClassifier(random_state=10)
rfm.fit(X1_train, y1_train)
y1_pred = rfm.predict(X1_test)

print(metrics.accuracy_score(y1_test, y1_pred))
run.log('accuracy', float(metrics.accuracy_score(y1_test, y1_pred)))  # builtin float: np.float is removed in newer NumPy
run.log('AUC', float(roc_auc_score(y1_test, y1_pred)))

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 8: Register the model
&lt;/h2&gt;

&lt;p&gt;The next important step is to register the trained model in the workspace for future inference.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Save the trained model
model_file = 'new-adult-income-model.pkl'
joblib.dump(value=rfm, filename=model_file)
run.upload_file(name = 'outputs/' + model_file, path_or_stream = './' + model_file)
# Complete the run
run.complete()
# Register the model
model = run.register_model(model_path='outputs/new-adult-income-model.pkl', model_name='new-adult-income-model',
                   tags={'Training context':'Inline Training'},
                   properties={'AUC': run.get_metrics()['AUC'], 'Accuracy': run.get_metrics()['accuracy']})

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is visible inside the &lt;code&gt;Models&lt;/code&gt; section under the Assets tab.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiedo2tv7qoz2n7h4zv5q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiedo2tv7qoz2n7h4zv5q.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 9: Deploying the model
&lt;/h2&gt;

&lt;p&gt;The next step is to deploy the model.&lt;/p&gt;

&lt;p&gt;Create an &lt;code&gt;InferenceConfig&lt;/code&gt; and an &lt;code&gt;AciWebservice&lt;/code&gt; deployment configuration to deploy the model as a web service and access it through its REST endpoint.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from azureml.core.webservice import AciWebservice
from azureml.core.model import InferenceConfig

import os
path = os.getcwd()
# Configure the scoring environment
script_file = os.path.join(path, "prepare.py")
env_file = os.path.join(path, "adult-income.yml")

inference_config = InferenceConfig(runtime= "python",
                                   entry_script="./prepare.py",
                                   conda_file="./adult-income.yml")
deployment_config = AciWebservice.deploy_configuration(cpu_cores = 1, memory_gb = 1)
service_name = "adult-income-service"
service = Model.deploy(ws, service_name, [model], inference_config, deployment_config, overwrite=True)
service.wait_for_deployment(True)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here are the endpoint details under the &lt;code&gt;Endpoints&lt;/code&gt; section. &lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9uxponyayfmoqrecn2bx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9uxponyayfmoqrecn2bx.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 10: Check by sending a request
&lt;/h2&gt;

&lt;p&gt;We check if our endpoint is working fine by sending a request using the &lt;code&gt;requests&lt;/code&gt; package.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests
import json

endpoint = service.scoring_uri
x_new = X1_test[0:1].tolist()
# Convert the array to a serializable list in a JSON document
input_json = json.dumps({"data": x_new})
# Set the content type
headers = { 'Content-Type':'application/json' }
response = requests.post(endpoint, data = input_json, headers = headers)
pred = json.loads(response.json())
print(pred)


output:
['above_50k']

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This is an example showcasing the workflow of Azure Machine Learning Studio, focusing on the steps necessary to create a machine learning pipeline that utilizes the Datastore for storing the data for training and inferencing.&lt;/p&gt;

&lt;p&gt;I will be updating this article in the future by adding CI/CD steps and container orchestration (such as AKS).&lt;/p&gt;




&lt;p&gt;🌎Explore, 🎓Learn, 👷Build. Happy Coding💛&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Build an awesome CLI using GO</title>
      <dc:creator>Dipankar Medhi</dc:creator>
      <pubDate>Thu, 14 Apr 2022 10:24:28 +0000</pubDate>
      <link>https://dev.to/dipankarmedhi/build-an-awesome-cli-using-go-6o6</link>
      <guid>https://dev.to/dipankarmedhi/build-an-awesome-cli-using-go-6o6</guid>
      <description>&lt;h2&gt;
  
  
  CLI in go
&lt;/h2&gt;

&lt;p&gt;Go is great for building CLI applications. The ecosystem provides two very powerful tools, cobra-cli and Viper, but in this example we are going to use the built-in &lt;code&gt;flag&lt;/code&gt; package and other standard-library tools.&lt;/p&gt;

&lt;p&gt;For more information on CLI using go, visit &lt;a href="https://go.dev/solutions/clis" rel="noopener noreferrer"&gt;go.dev&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating project structure and go module
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;First, create a directory. I have named mine &lt;strong&gt;go-todo-cli&lt;/strong&gt;, but you can choose your own name.&lt;/li&gt;
&lt;li&gt;Inside it, create the nested directories &lt;strong&gt;cmd/todo&lt;/strong&gt;, which will hold the command-line interface code.&lt;/li&gt;
&lt;li&gt;Add &lt;strong&gt;main.go&lt;/strong&gt; and &lt;strong&gt;main_test.go&lt;/strong&gt; files inside &lt;strong&gt;cmd/todo&lt;/strong&gt; directory.&lt;/li&gt;
&lt;li&gt;Add &lt;strong&gt;todo.go&lt;/strong&gt; and &lt;strong&gt;todo_test.go&lt;/strong&gt; files inside the &lt;strong&gt;parent&lt;/strong&gt; directory.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here is a graphical representation of the project folder structure for better understanding.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fglkm47oig1ahqddz79px.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fglkm47oig1ahqddz79px.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then initialize the Go module for the project by using &lt;code&gt;go mod init &amp;lt;your module name&amp;gt;&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;go mod init github.com/dipankar-medhi/TodoCli

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;💡 Keeping the module name the same as the folder name can make things easy.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Coding the todo functions
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Start by declaring the package name inside the todo.go file.&lt;/li&gt;
&lt;li&gt;Import the packages.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package todo

import (
    "encoding/json"
    "errors"
    "fmt"
    "io/ioutil"
    "os"
    "time"
)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Then we create two data structures to be used in our package. The first one is a struct &lt;code&gt;item&lt;/code&gt; and the second one is a list type &lt;code&gt;[]item&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The item struct has the following fields: &lt;em&gt;Task&lt;/em&gt; as string, &lt;em&gt;Done&lt;/em&gt; as bool to mark whether the task is complete, &lt;em&gt;CreatedAt&lt;/em&gt; as time.Time recording when the task was created, and lastly &lt;em&gt;CompletedAt&lt;/em&gt; as time.Time recording when it was completed.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;type item struct {
    Task string
    Done bool
    CreatedAt time.Time
    CompletedAt time.Time
}

type List []item

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;💡 The struct name is lowercase because we do not plan to export it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Functions of our todo CLI application:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add new tasks&lt;/li&gt;
&lt;li&gt;Mark tasks as complete&lt;/li&gt;
&lt;li&gt;Delete tasks from the list of tasks&lt;/li&gt;
&lt;li&gt;Save the list of tasks as JSON&lt;/li&gt;
&lt;li&gt;Get the tasks from the JSON file&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, let's start by defining the Add function.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Add function&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This function will add new tasks to the list []item.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func (l *List) Add(task string) {
    t := item{
        Task: task,
        Done: false,
        CreatedAt: time.Now(),
        CompletedAt: time.Time{},
    }

    *l = append(*l, t)
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Complete function&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This function marks an item/task as complete by setting the Done field inside the item struct to true and CompletedAt to the current time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func (l *List) Complete(i int) error {
    ls := *l
    if i &amp;lt;= 0 || i &amp;gt; len(ls) {
        return fmt.Errorf("item %d does not exist", i)
    }
    ls[i-1].Done = true
    ls[i-1].CompletedAt = time.Now()

    return nil
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Save function&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This function saves the list of tasks in JSON format.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func (l *List) Save(fileName string) error {
    json, err := json.Marshal(l)
    if err != nil {
        return err
    }
    return ioutil.WriteFile(fileName, json, 0644)
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Get function&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This function reads the saved task list from the given file name and decodes the JSON data into a list.&lt;/p&gt;

&lt;p&gt;It will also handle cases when the filename doesn't exist or is an empty file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func (l *List) Get(fileName string) error {
    file, err := ioutil.ReadFile(fileName)
    if err != nil {
        // if the given file does not exist
        if errors.Is(err, os.ErrNotExist) {
            return nil
        }
        return err
    }

    if len(file) == 0 {
        return nil
    }

    return json.Unmarshal(file, l)
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We are done with the to-do functions.&lt;/p&gt;

&lt;p&gt;Now let's write the tests to ensure everything is working correctly as intended.&lt;/p&gt;

&lt;h2&gt;
  
  
  Writing tests for todo functions
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Start by creating a todo_test.go file in the same directory as todo.go.&lt;/li&gt;
&lt;li&gt;Write the package name as todo_test and import the necessary packages.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package todo_test

import (
    "io/ioutil"
    "os"
    "testing"

    todo "github.com/dipankar-medhi/TodoCli"
)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Test for the Add function&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func TestAdd(t *testing.T) {
    l := todo.List{}

    taskName := "New Task"
    l.Add(taskName)

    if l[0].Task != taskName {
        t.Errorf("Expected %q, got %q instead", taskName, l[0].Task)
    }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Test for the Complete function&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func TestComplete(t *testing.T) {
    l := todo.List{}

    taskName := "New Task"
    l.Add(taskName)

    if l[0].Task != taskName {
        t.Errorf("Expected %q, got %q instead", taskName, l[0].Task)
    }

    if l[0].Done {
        t.Errorf("New task should not be completed.")
    }
    l.Complete(1)

    if !l[0].Done {
        t.Errorf("New task should be completed.")
    }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Test for the Save and Get functions&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func TestSaveGet(t *testing.T) {
    // two lists
    l1 := todo.List{}
    l2 := todo.List{}
    taskName := "New Task"
    // saving task into l1 and loading it into l2 -- error if fails
    l1.Add(taskName)
    if l1[0].Task != taskName {
        t.Errorf("Expected %q, got %q instead.", taskName, l1[0].Task)
    }
    tf, err := ioutil.TempFile("", "")
    if err != nil {
        t.Fatalf("Error creating temp file: %s", err)
    }
    defer os.Remove(tf.Name())
    if err := l1.Save(tf.Name()); err != nil {
        t.Fatalf("Error saving list to file: %s", err)
    }
    if err := l2.Get(tf.Name()); err != nil {
        t.Fatalf("Error getting list from file: %s", err)
    }
    if l1[0].Task != l2[0].Task {
        t.Errorf("Task %q should match %q task.", l1[0].Task, l2[0].Task)
    }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now let's test the application.&lt;/p&gt;

&lt;p&gt;Save the file and use the go test tool to execute the tests.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ go test -v
=== RUN TestAdd
--- PASS: TestAdd (0.00s)
=== RUN TestComplete
--- PASS: TestComplete (0.00s)
=== RUN TestDelete
--- PASS: TestDelete (0.00s)
=== RUN TestSaveGet
--- PASS: TestSaveGet (0.00s)
PASS
ok github.com/dipankar-medhi/TodoCli

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is working fine. Let's proceed to the next step.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building the main CLI functionality
&lt;/h2&gt;

&lt;p&gt;We create the &lt;code&gt;main.go&lt;/code&gt; and &lt;code&gt;main_test.go&lt;/code&gt; files inside cmd/todo.&lt;/p&gt;

&lt;p&gt;Let's begin writing the code inside the main.go file.&lt;/p&gt;

&lt;p&gt;We start by importing the packages.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package main

import (
    "flag"
    "fmt"
    "os"

    todo "github.com/dipankar-medhi/TodoCli"
)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a main() function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func main() {

}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inside the main function, we define the command-line flags and the logic they execute.&lt;/p&gt;

&lt;p&gt;Parse the command-line flags.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    task := flag.String("task", "", "Task to be included in the todolist")
    list := flag.Bool("list", false, "List all tasks")
    complete := flag.Int("complete", 0, "Item to be completed")

    flag.Parse()

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;💡 These are pointers, so we have to dereference them with * to access their values.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    l := &amp;amp;todo.List{}

    //calling Get method from todo.go file
    if err := l.Get(todoFileName); err != nil {
        // in cli, stderr output is best practice
        fmt.Fprintln(os.Stderr, err)
        // another good practice is to exit the program with
        // a return code different than 0.
        os.Exit(1)
    }

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we decide what to do based on the arguments provided, using a switch statement.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    switch {
    case *list:
        // list current to do items
        for _, item := range *l {
            if !item.Done {
                fmt.Println(item.Task)
            }
        }
    // to verify if complete flag is set with value more than 0 (default)
    case *complete &amp;gt; 0:
        if err := l.Complete(*complete); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        // save the new list
        if err := l.Save(todoFileName); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
    // verify if task flag is set with different than empty string
    case *task != "":
        l.Add(*task)
        if err := l.Save(todoFileName); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
    default:
        // print an error msg
        fmt.Fprintln(os.Stderr, "Invalid option")
        os.Exit(1)
    }

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Writing tests for the main function
&lt;/h2&gt;

&lt;p&gt;Start by importing packages and defining some variables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package main

import (
    "fmt"
    "os"
    "os/exec"
    "path/filepath"
    "runtime"
    "testing"
)


var (
    binName = "todo"
    fileName = ".todo.json"
)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Test for Main function&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func TestMain(m *testing.M) {
    fmt.Println("Building tool...")
    if runtime.GOOS == "windows" {
        binName += ".exe"
    }
    build := exec.Command("go", "build", "-o", binName)
    if err := build.Run(); err != nil {
        fmt.Fprintf(os.Stderr, "Cannot build tool %s: %s", binName, err)
        os.Exit(1)
    }

    fmt.Println("Running tests....")
    result := m.Run()
    fmt.Println("Cleaning up...")
    os.Remove(binName)
    os.Remove(fileName)
    os.Exit(result)
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Tests for the todo functions&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func TestTodoCLI(t *testing.T) {
    task := "test task number 1"
    dir, err := os.Getwd()
    if err != nil {
        t.Fatal(err)
    }
    cmdPath := filepath.Join(dir, binName)
    t.Run("AddNewTask", func(t *testing.T) {
        cmd := exec.Command(cmdPath, "-task", task)
        if err := cmd.Run(); err != nil {
            t.Fatal(err)
        }
    })

    t.Run("ListTasks", func(t *testing.T) {
        cmd := exec.Command(cmdPath, "-list")
        out, err := cmd.CombinedOutput()
        if err != nil {
            t.Fatal(err)
        }
        expected := task + "\n"

        if expected != string(out) {
            t.Errorf("Expected %q, got %q instead\n", expected, string(out))
        }

    })
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We have written all our tests.&lt;/p&gt;

&lt;p&gt;Now, let's test out the application.&lt;/p&gt;

&lt;p&gt;Run &lt;code&gt;go test -v&lt;/code&gt; inside the cmd/todo directory.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ go test -v
Building tool...
Running tests....
=== RUN TestTodoCLI
=== RUN TestTodoCLI/AddNewTask
=== RUN TestTodoCLI/ListTasks
--- PASS: TestTodoCLI (0.51s)
    --- PASS: TestTodoCLI/AddNewTask (0.47s)
    --- PASS: TestTodoCLI/ListTasks (0.05s) 
PASS
Cleaning up...
ok github.com/dipankar-medhi/TodoCli/cmd/todo 1.337s

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We see that everything is working fine.&lt;/p&gt;

&lt;p&gt;Now it's time to use our application.&lt;/p&gt;

&lt;p&gt;Before listing the items, we should add some tasks. So we add a few items using the -task flag.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ go run main.go -task "Get Vegetables from the market"
$ go run main.go -task "Drop the package"


$ go run main.go -list
"Get Vegetables from the market"
"Drop the package"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's try marking our tasks complete.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ go run main.go -complete 1


$ go run main.go -list
"Drop the package"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This is a simple to-do CLI with limited functionality. By using external packages like cobra-cli, the application can be extended to a great extent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reference&lt;/strong&gt; : "Powerful Command-Line Applications in Go: Build Fast and Maintainable Tools" by Ricardo Gerardi&lt;/p&gt;




&lt;p&gt;🌎Explore, 🎓Learn, 👷‍♂️Build. Happy Coding💛&lt;/p&gt;

</description>
      <category>go</category>
      <category>cli</category>
    </item>
    <item>
      <title>Part1 - Introduction (Clean Architecture by Robert C.Martin)</title>
      <dc:creator>Dipankar Medhi</dc:creator>
      <pubDate>Mon, 04 Apr 2022 11:06:08 +0000</pubDate>
      <link>https://dev.to/dipankarmedhi/part1-introduction-clean-architecture-by-robert-cmartin-7b5</link>
      <guid>https://dev.to/dipankarmedhi/part1-introduction-clean-architecture-by-robert-cmartin-7b5</guid>
      <description>&lt;h2&gt;
  
  
  Why is there a decline in programmers' productivity over time?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Late-night race
&lt;/h3&gt;

&lt;p&gt;Modern developers are often sleep deprived. They work day and night, writing code around the clock to finish their tasks before the deadline.&lt;/p&gt;

&lt;p&gt;But sleep is essential. Sleep deprivation can affect our working potential and lower our performance while writing code. The part of the brain that knows how to write good, clean code is asleep.&lt;/p&gt;

&lt;h3&gt;
  
  
  Overconfidence
&lt;/h3&gt;

&lt;p&gt;Modern developers are overconfident, just like the Rabbit in the "The Rabbit and The Tortoise" story.&lt;/p&gt;

&lt;p&gt;Programmers think they can come back later to clean up their mess, but they won't, because they have to deal with new tasks.&lt;/p&gt;

&lt;p&gt;So to maintain the company's productivity, developers must stop thinking like the Rabbit and be reliable. Developers must take responsibility for their code and try to produce well-defined, clean code in the first iteration, rather than planning to improve it later or start the coding process over entirely. Because the reality is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Their overconfidence will drive the redesign into the same mess as the original project.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Which is more important? Functions or Architecture
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Functions
&lt;/h3&gt;

&lt;p&gt;Companies hire programmers to write code, and managers believe that writing less code to run their machines and saving money is what matters most.&lt;/p&gt;

&lt;p&gt;And so, most programmers believe that fixing bugs and keeping machines running with few lines of code makes them better programmers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Architecture
&lt;/h3&gt;

&lt;p&gt;Another significant value of software is architecture. Software must be easy to change and manipulate. Technology keeps changing, and so do the requirements. And to deal with new requirements, developers must be able to evolve old software architectures into new ones.&lt;/p&gt;

&lt;p&gt;But the process is not as smooth as it sounds. If an architecture prefers one strategy over another, it is tough to make changes and upgrade the system. This is why the first year of development is often much cheaper than later integrations.&lt;/p&gt;

&lt;p&gt;So architectures should be flexible and adjustable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Which one gets more importance?
&lt;/h3&gt;

&lt;p&gt;A program that works fine and provides excellent performance but cannot be changed later on won't be enough in the future when the requirements are different.&lt;/p&gt;

&lt;p&gt;And, if a program does not work as well as the first one, but is flexible and easy to change, then further debugging can make it work and keep it working in the future as the requirements vary.&lt;/p&gt;

&lt;p&gt;Both are important, but architectures ensure longer lasting software and maintain the production costs in the long run by delivering what is important rather than urgent.&lt;/p&gt;




&lt;p&gt;This blog is a part of my knowledge database that I am creating for everything I read/study. It is part1, and there are more parts to be read. And once I finish reading those remaining portions, I will add them to this series.&lt;/p&gt;




&lt;p&gt;🌎Explore, 🎓Learn, 👷‍♂️Build. Happy Coding💛&lt;/p&gt;

</description>
      <category>books</category>
    </item>
    <item>
      <title>Build a Shared Wallet in Solidity</title>
      <dc:creator>Dipankar Medhi</dc:creator>
      <pubDate>Wed, 30 Mar 2022 07:42:35 +0000</pubDate>
      <link>https://dev.to/dipankarmedhi/build-a-shared-wallet-in-solidity-4cmj</link>
      <guid>https://dev.to/dipankarmedhi/build-a-shared-wallet-in-solidity-4cmj</guid>
      <description>&lt;p&gt;Today, we will build a shared wallet in Solidity, which will have functions like withdrawing, adding funds to different users on the wallet. &lt;br&gt;
We will use &lt;a href="https://openzeppelin.com/" rel="noopener noreferrer"&gt;Openzeppelin&lt;/a&gt; for the ownership and other security processes. &lt;/p&gt;
&lt;h2&gt;
  
  
  🚀What is the project all about?
&lt;/h2&gt;

&lt;p&gt;This project aims to create a shared wallet on the blockchain, and there will be an Owner and other users. &lt;/p&gt;

&lt;p&gt;The Owner will have access to all the functions of the wallet. The Owner can add funds and withdraw ethers, while the users added to the wallet will only have access to withdraw funds. No other user can withdraw or add funds to their account.&lt;/p&gt;
&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;You can choose whatever you are familiar with as the design principle will remain the same. But in our case, we are going to use &lt;a href="https://remix-project.org/" rel="noopener noreferrer"&gt;Remix IDE&lt;/a&gt; for running and deploying the smart contracts for Ethereum.&lt;/p&gt;
&lt;h3&gt;
  
  
  What is a Smart Contract?
&lt;/h3&gt;

&lt;p&gt;A Smart Contract is like a digital agreement or deal between two parties. While a normal agreement takes place on paper or official documents, a Smart Contract is executed as code running in a blockchain.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Want to know more about &lt;strong&gt;Smart Contracts&lt;/strong&gt;? Visit &lt;a href="https://ethereum.org/en/developers/docs/smart-contracts/" rel="noopener noreferrer"&gt;Ethereum.org&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  Solidity
&lt;/h3&gt;

&lt;p&gt;We will write all our code in &lt;strong&gt;Solidity&lt;/strong&gt;, so having a basic understanding of its syntax will make things easier to follow. If you are new to Solidity, here are some excellent resources that you might want to check out.&lt;/p&gt;
&lt;h4&gt;
  
  
  📌List of good free solidity resources
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=YJ-D1RMI0T0" rel="noopener noreferrer"&gt;MASTER Solidity for Blockchain Step-By-Step (Full Course)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.defi-academy.com/courses/introduction-to-smart-contracts" rel="noopener noreferrer"&gt;Solidity 101: Introduction to smart contracts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.finxter.com/solidity/" rel="noopener noreferrer"&gt;Solidity Crash Course&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Let's Start Coding
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Step 1: Set up Remix IDE
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Visit &lt;a href="https://remix-project.org/" rel="noopener noreferrer"&gt;remix&lt;/a&gt; and start a new project by clicking on the &lt;code&gt;REMIX IDE&lt;/code&gt; on the top right corner.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj6y8mgqiip32p2b57vft.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj6y8mgqiip32p2b57vft.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;A new screen with a default workspace containing some folders and files on the left and a code editor on the right will appear.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Create a new file &lt;strong&gt;simpleSharedWallet.sol&lt;/strong&gt;. &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Falaub6eq8xqedfh68vl2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Falaub6eq8xqedfh68vl2.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 2: Solidity version and import OpenZeppelin
&lt;/h3&gt;

&lt;p&gt;Specify the Solidity version and import OpenZeppelin into our code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;//SPDX-License-Identifier: MIT
pragma solidity ^0.8.12;

import "https://github.com/OpenZeppelin/openzeppelin-contracts/blob/master/contracts/access/Ownable.sol";
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Version 0.8.12 or above is used.&lt;/li&gt;
&lt;li&gt;Openzeppelin deals with the ownership of the contract and provides an access control mechanism.&lt;/li&gt;
&lt;li&gt;To know more about Openzeppelin visit &lt;a href="https://docs.openzeppelin.com/contracts/2.x/api/ownership" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 3: Funds smart contract
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Create the &lt;code&gt;Funds&lt;/code&gt; smart contract.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;contract Funds is Ownable {

}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Define a map inside the &lt;code&gt;Funds&lt;/code&gt; contract that will hold the &lt;strong&gt;addresses&lt;/strong&gt; and &lt;strong&gt;funds&lt;/strong&gt; of the users.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;contract Funds is Ownable {
    mapping(address =&amp;gt; uint) public funds;
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Create a public function &lt;code&gt;setFund()&lt;/code&gt; to set the funds for the different users.&lt;/li&gt;
&lt;li&gt;The function accepts the parameters address &lt;code&gt;_who&lt;/code&gt; and uint &lt;code&gt;_amount&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;This function is made accessible only to the owner by using the &lt;code&gt;onlyOwner&lt;/code&gt; modifier provided by OpenZeppelin.&lt;/li&gt;
&lt;li&gt;If the requirements are met, the fund is incremented on the map.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function setFund(address _who, uint _amount) public onlyOwner {
        require(funds[_who] &amp;lt;= address(this).balance, "Amount is more than available in the contract");
        require(_amount &amp;lt;= address(this).balance, "Amount is too high");
        funds[_who] += _amount;
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Next, create an &lt;code&gt;allowed()&lt;/code&gt; modifier. So, what is a modifier? Modifiers are used to modify the behaviour of a function. The body of the function is inserted in the place of &lt;code&gt;_;&lt;/code&gt; if all the above-written requirements are met while calling this function. To know more about modifiers, visit &lt;a href="https://www.tutorialspoint.com/solidity/solidity_function_modifiers.htm" rel="noopener noreferrer"&gt;here&lt;/a&gt;.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;modifier allowed(uint _amount) {
        require(msg.sender == owner() || funds[msg.sender] &amp;gt;= _amount, "You are not allowed");
        _;
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Finally, create an internal function &lt;code&gt;reduceFunds()&lt;/code&gt; that accepts address &lt;code&gt;_who&lt;/code&gt; and uint &lt;code&gt;_amount&lt;/code&gt;. This function will decrement the funds from the users in the &lt;code&gt;funds&lt;/code&gt; map, every time the funds are withdrawn from the wallet.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function reduceFund(address _who, uint _amount) internal{
        funds[_who] -= _amount;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: SharedWallet smart contract
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Create the &lt;code&gt;SharedWallet&lt;/code&gt; contract.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;contract SharedWallet is Ownable, Funds {

}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Create a public function &lt;code&gt;getBalance()&lt;/code&gt; that returns &lt;code&gt;uint256&lt;/code&gt;. This function returns the ether balance held by the contract.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function getBalance() public view returns (uint256) {
        return address(this).balance;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Next, create a payable function &lt;code&gt;withdrawMoney()&lt;/code&gt; that accepts an address &lt;code&gt;_to&lt;/code&gt; and a uint &lt;code&gt;_amount&lt;/code&gt; parameter. And for this project, we make it accessible only to the owner and the users added to the &lt;code&gt;funds&lt;/code&gt; map.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function withdrawMoney(address payable _to, uint _amount) payable public allowed(_amount) {
        // the entered amount must be &amp;lt;= the balance in the contract
        require(_amount &amp;lt;= address(this).balance, "Contract doesn't own enough money");
        // transfer funds to address entered
        _to.transfer(_amount);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;We define the &lt;code&gt;pay()&lt;/code&gt; function so that ether can be deposited into the contract.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function pay() public payable {

}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: Compiling
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Select the Solidity compiler option and make sure that the compiler version matches the defined solidity version.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fymy6kb2j43lgmwt4ynau.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fymy6kb2j43lgmwt4ynau.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compile the solidity file.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 6: Deploy and run the transaction
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzfs7qbqnj5p7oya4cxh3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzfs7qbqnj5p7oya4cxh3.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Select the &lt;code&gt;Environment&lt;/code&gt;. For this project, choose JavaScript VM.&lt;/li&gt;
&lt;li&gt;Select an account.&lt;/li&gt;
&lt;li&gt;Make sure the &lt;strong&gt;simpleSharedWallet.sol&lt;/strong&gt; file is selected in the contract option.&lt;/li&gt;
&lt;li&gt;Then deploy.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 7: Testing
&lt;/h3&gt;

&lt;p&gt;For testing the application, I suggest using the other available accounts in the JavaScript VM account list. You can try adding them to the funds map using &lt;code&gt;setFund&lt;/code&gt;. Try withdrawing with the owner account and with the other available accounts. &lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Blockchain technology is very new, and it is still improving. Many people are still not aware of blockchain technology as its use continues to spread. Blockchain technology can potentially bring positive changes to our lives and society, and we, the developers, should continue exploring and promoting its use.&lt;/p&gt;

&lt;p&gt;This is a simple project done as an example to show one of the use cases of blockchain technology. It is nowhere near a production application of blockchain technology, but a tiny glimpse into the world of blockchain. &lt;/p&gt;




&lt;p&gt;🌎Explore, 🎓Learn, 👷‍♂️Build. Happy Coding💛&lt;/p&gt;

</description>
      <category>solidity</category>
      <category>blockchain</category>
      <category>web3</category>
    </item>
    <item>
      <title>Best books on Go Programming Language</title>
      <dc:creator>Dipankar Medhi</dc:creator>
      <pubDate>Wed, 30 Mar 2022 04:13:39 +0000</pubDate>
      <link>https://dev.to/dipankarmedhi/best-books-on-go-programming-language-6e9</link>
      <guid>https://dev.to/dipankarmedhi/best-books-on-go-programming-language-6e9</guid>
      <description>&lt;p&gt;Learning a new topic can be overwhelming, especially if it's a new programming language. Although the concepts remain the same, being able to write and solve problems using a new syntax can be a bit confusing.&lt;/p&gt;

&lt;p&gt;So here I am, trying to help you get a good understanding of the Go programming language by sharing a list of good Go books.&lt;/p&gt;

&lt;p&gt;I have personally read all of these books, and the list is entirely based on my opinion. If you know more good books and want me to add them to the list, feel free to comment down below🤗.&lt;/p&gt;

&lt;p&gt;Note: This list is not ordered. A book appearing higher on the list does not mean it is better.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Learning Go: An Idiomatic Approach to Real-World Go Programming
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;by Jon Bodner (O'Reilly)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0mjh5hx1hv3f66wqcwzs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0mjh5hx1hv3f66wqcwzs.png" alt="book"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;No other publisher is as consistent as O'Reilly when it comes to delivering good, resourceful books.&lt;/p&gt;

&lt;h3&gt;
  
  
  Insights from the book
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;It is easy to follow along and well structured.&lt;/li&gt;
&lt;li&gt;It has code examples that make it easy for the readers to understand what they read.&lt;/li&gt;
&lt;li&gt;The added notes and tips make this book more interesting and engaging.&lt;/li&gt;
&lt;li&gt;It covers everything from setting up the environment to writing tests in Go.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;If you are enrolled in any Go bootcamp, following this book will help you understand the language at a deeper level.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  2. The Go Programming Language
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;by Alan A. A. Donovan and Brian W. Kernighan&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8xxjho3a9f8cf4wybz9g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8xxjho3a9f8cf4wybz9g.png" alt="image.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Go Programming Language is a very well-known and reputed book among Go programmers. &lt;/p&gt;

&lt;h3&gt;
  
  
  Insights from the book
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;If you need something that has &lt;strong&gt;more theory&lt;/strong&gt;, then definitely this book is for you.&lt;/li&gt;
&lt;li&gt;This book considers many &lt;strong&gt;use cases and tries to replicate&lt;/strong&gt; scenarios where a certain Go function can be used in a particular way.&lt;/li&gt;
&lt;li&gt;It has &lt;strong&gt;code examples&lt;/strong&gt; to help you understand what's going on.&lt;/li&gt;
&lt;li&gt;And readers are tested with added &lt;strong&gt;exercises&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;The Go Programming Language with Learning Go can be a great combination for coders and help you learn every bit of Go.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  3. Concurrency in Go: Tools and Techniques for Developers
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;by Katherine Cox-Buday&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl5chnij0yahtl0c88gl1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl5chnij0yahtl0c88gl1.png" alt="image.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you are someone who has a tough time understanding concurrency in Go, then this book is for you.&lt;/p&gt;

&lt;h3&gt;
  
  
  Insights from this book
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;A great introduction to &lt;strong&gt;concurrency&lt;/strong&gt; and a deep dive into understanding what concurrency is.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code examples&lt;/strong&gt; make it very intuitive and help a lot to follow along.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Step-by-step explanation&lt;/strong&gt; of code examples.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  4. Head First Go
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;by Jay McGavren&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0bu37iea2i47s3kgfxf4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0bu37iea2i47s3kgfxf4.png" alt="image.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is an interesting book from O'Reilly. It takes a playful, visual approach, similar to a children's book, which makes it great for beginners.&lt;/p&gt;

&lt;h3&gt;
  
  
  Insights from this book
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Illustrations&lt;/strong&gt; with easy to understand language.&lt;/li&gt;
&lt;li&gt;There are &lt;strong&gt;well-explained comments&lt;/strong&gt; that portray the purpose of a line to its reader.&lt;/li&gt;
&lt;li&gt;This book also has some additional chapters on building &lt;strong&gt;web applications&lt;/strong&gt; using Go.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are many more important and interesting books on Go that I have not covered in this blog. But I will try to read them and add them to this list. &lt;/p&gt;

&lt;p&gt;Every author spends valuable hours writing these awesome books so that we programmers can easily understand and adapt to new technologies. So we must appreciate their hard work and effort by sharing what we learn and helping their work reach a wider audience.  &lt;/p&gt;




&lt;p&gt;🌎Explore, 🎓Learn, 👷‍♂️Build. Happy Coding💛&lt;/p&gt;

</description>
      <category>go</category>
      <category>books</category>
      <category>programming</category>
    </item>
    <item>
      <title>KNN from scratch VS sklearn</title>
      <dc:creator>Dipankar Medhi</dc:creator>
      <pubDate>Sat, 26 Mar 2022 04:14:43 +0000</pubDate>
      <link>https://dev.to/dipankarmedhi/knn-from-scratch-vs-sklearn-13p9</link>
      <guid>https://dev.to/dipankarmedhi/knn-from-scratch-vs-sklearn-13p9</guid>
      <description>&lt;p&gt;Welcome👋,&lt;/p&gt;

&lt;p&gt;In this article, we are going to build our own &lt;strong&gt;KNN algorithm&lt;/strong&gt; from scratch and apply it to a 23-feature dataset using the &lt;strong&gt;Numpy&lt;/strong&gt; and &lt;strong&gt;Pandas&lt;/strong&gt; libraries.&lt;/p&gt;

&lt;p&gt;First, let us get some idea about the KNN or K Nearest Neighbour algorithm.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is the K Nearest Neighbors algorithm?
&lt;/h2&gt;

&lt;p&gt;K Nearest Neighbors is one of the simplest predictive algorithms out there in the supervised machine learning category.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The algorithm works based on two criteria:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The number of neighbours (k) to consider for the prediction.&lt;/li&gt;
&lt;li&gt;The distance of the neighbours from the test data point.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5vk03xm0sumb2r122hl5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5vk03xm0sumb2r122hl5.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Fig: Prediction made by one nearest neighbour (book: Intro to Machine Learning with Python)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The above image showcases the number of neighbours ( &lt;strong&gt;k&lt;/strong&gt; ) that are considered when predicting the value for the test data point.&lt;/p&gt;
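&lt;p&gt;The two criteria above can be sketched as a minimal KNN regressor in plain NumPy: compute the distance from the query point to every training point, pick the k closest, and average their targets. This is a simplified illustration (the function name and toy data here are made up for the example), not the full implementation we build below.&lt;/p&gt;

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    """Predict the target for x_query by averaging its k nearest neighbours."""
    # Euclidean distance from the query point to every training point
    distances = np.sqrt(((X_train - x_query) ** 2).sum(axis=1))
    # Indices of the k training points with the smallest distances
    nearest = np.argsort(distances)[:k]
    # Average the targets of those k neighbours
    return y_train[nearest].mean()

# Toy data: one feature, target is roughly 10x the feature value
X_train = np.array([[1.0], [2.0], [3.0], [10.0]])
y_train = np.array([10.0, 20.0, 30.0, 100.0])

# A query near 2.0 is averaged over the three nearby points, ignoring the outlier at 10.0
print(knn_predict(X_train, y_train, np.array([2.1]), k=3))
```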

&lt;p&gt;Now, let us start coding in our jupyter notebook.&lt;/p&gt;

&lt;h2&gt;
  
  
  Let's Code
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Data Preprocessing
&lt;/h3&gt;

&lt;p&gt;In our case, we are using the &lt;a href="https://github.com/Dipankar-Medhi/k-nearest-neighbors-KNN/blob/main/diamonds.csv" rel="noopener noreferrer"&gt;diamonds dataset&lt;/a&gt; having &lt;strong&gt;10 features&lt;/strong&gt; out of which &lt;strong&gt;3 are categorical&lt;/strong&gt; and the rest &lt;strong&gt;7 are numerical&lt;/strong&gt; features.&lt;/p&gt;

&lt;h3&gt;
  
  
  Removing Outliers
&lt;/h3&gt;

&lt;p&gt;We can use the &lt;strong&gt;boxplot()&lt;/strong&gt; function to produce boxplots and check if there are any outliers present in the dataset.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcoq56nmpnbn4atf2altu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcoq56nmpnbn4atf2altu.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can see that there are some outliers in the dataset.&lt;/p&gt;

&lt;p&gt;So we remove these outliers using the &lt;strong&gt;IQR method&lt;/strong&gt; (or any method of your choice).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# IQR
def remove_outlier_IQR(df, field_name):
    iqr = 1.5 * (np.percentile(df[field_name], 75) -
                 np.percentile(df[field_name], 25))
    df.drop(df[df[field_name] &amp;gt; (
        iqr + np.percentile(df[field_name], 75))].index, inplace=True)
    df.drop(df[df[field_name] &amp;lt; (np.percentile(
        df[field_name], 25) - iqr)].index, inplace=True)
    return df

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Printing the shape of the data frame before and after outlier removal using IQR.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print('Shape of df before IQR:',df.shape)

df2 = remove_outlier_IQR(df, 'carat')
df2 = remove_outlier_IQR(df2, 'depth')
df2 = remove_outlier_IQR(df2, 'price')
df2 = remove_outlier_IQR(df2, 'table')
df2 = remove_outlier_IQR(df2, 'height_mm')
df2 = remove_outlier_IQR(df2, 'length_mm')
df_final = remove_outlier_IQR(df2, 'width_mm')
print('The Shape of df after IQR:',df_final.shape)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;The shape of df before IQR: (53940, 10)&lt;/p&gt;

&lt;p&gt;The shape of df after IQR: (46518, 10)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Again, after removing the outliers, we check the dataset using a boxplot for better visual confirmation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fslyx5581rshq0i6s4r1o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fslyx5581rshq0i6s4r1o.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Boxplots after IQR method&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Encoding the Categorical variables
&lt;/h3&gt;

&lt;p&gt;There are &lt;strong&gt;3 categorical features&lt;/strong&gt; in the dataset. Let us print and see the unique values of each feature.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print('Unique values of cat features:\n')
print('color:', cat_df.color.unique())
print('cut_quality:', cat_df.cut_quality.unique())
print('clarity:', cat_df.clarity.unique())

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flm2oeamypvokl9hodc5w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flm2oeamypvokl9hodc5w.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These are the unique values of the categorical features.&lt;/p&gt;

&lt;p&gt;So, for encoding these features, we use &lt;strong&gt;LabelEncoder&lt;/strong&gt; and &lt;strong&gt;dummy variables&lt;/strong&gt; (or you can also use &lt;strong&gt;OneHotEncoder&lt;/strong&gt;).&lt;/p&gt;

&lt;p&gt;We can use &lt;code&gt;LabelEncoder()&lt;/code&gt; to convert cut_quality to numerical values like 0, 1, 2, … because cut_quality holds ordinal data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Label encoding using the LabelEncoder function from sklearn
from sklearn.preprocessing import LabelEncoder
label_encoder = LabelEncoder()

df_final['cut_quality'] = label_encoder.fit_transform(df_final['cut_quality'])
df_final.head(2)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyhihllrol9gfq4z34nyx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyhihllrol9gfq4z34nyx.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then we use the &lt;code&gt;get_dummies()&lt;/code&gt; function from pandas to create dummy variables for the color and clarity columns.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# using dummy variables for the remaing categories
df_final = pd.get_dummies(df_final,columns=['color','clarity'])
df_final.head()

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsqunams3syz5k1ex42e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsqunams3syz5k1ex42e.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df_final.shape
--&amp;gt; (46518, 23)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Splitting data for training and testing
&lt;/h3&gt;

&lt;p&gt;We split the data for training and testing using the &lt;code&gt;train_test_split()&lt;/code&gt; method from sklearn, with the test size set to 25% of the original dataset.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;data = df_final.copy()
# Using sklearn for scaling and splitting
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

X = data.drop(columns=['price'])
y = data['price']

# Scaling the data
scaler = StandardScaler()
scaled_df = scaler.fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(
    scaled_df, y, test_size=0.25)
print("X train shape: {} and y train shape: {}".format(
    X_train.shape, y_train.shape))
print("X test shape: {} and y test shape: {}".format(X_test.shape, y_test.shape))

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
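&lt;p&gt;Note that the snippet above fits the scaler on the full dataset before splitting, which leaks test-set statistics into training. A common refinement, sketched here on toy data, is to fit the scaler on the training split only and then apply it to both splits:&lt;/p&gt;

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Toy feature matrix standing in for the diamond features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = rng.normal(size=100)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Fit the scaler on the training split only, then transform both splits.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)
print(X_train_s.shape, X_test_s.shape)  # (75, 5) (25, 5)
```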



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbmjphcpyiy0ixcb2dkvk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbmjphcpyiy0ixcb2dkvk.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  KNN from sklearn library Vs KNN built from scratch
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Sklearn KNN model
&lt;/h3&gt;

&lt;p&gt;First, we use the &lt;strong&gt;KNN regressor&lt;/strong&gt; model from sklearn.&lt;/p&gt;

&lt;p&gt;To choose the optimal k value, we loop over k from 1 to 10 and record the RMSE for each.&lt;/p&gt;

&lt;p&gt;In our case, the optimal value is &lt;strong&gt;k = 5&lt;/strong&gt;, so we train the model with it, make predictions, and print the predicted values.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Finding the optimal k value
from sklearn import neighbors
from sklearn.metrics import mean_squared_error
from math import sqrt
import matplotlib.pyplot as plt
rmse_val = []  
for K in range(10):
    K = K+1
    model = neighbors.KNeighborsRegressor(n_neighbors=K)

    model.fit(X_train, y_train)  
    pred = model.predict(X_test)  
    error = sqrt(mean_squared_error(y_test, pred))  
    rmse_val.append(error)  
    print('RMSE value for k = ', K, 'is:', error)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
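&lt;p&gt;Rather than eyeballing the printed RMSE values, the best k can also be read off programmatically with &lt;code&gt;np.argmin&lt;/code&gt;. A small sketch on synthetic data (stand-ins for the scaled diamond features):&lt;/p&gt;

```python
import numpy as np
from sklearn import neighbors
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic regression data standing in for the scaled diamond features.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = X.sum(axis=1) + rng.normal(scale=0.1, size=300)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

rmse_val = []
for K in range(1, 11):
    model = neighbors.KNeighborsRegressor(n_neighbors=K).fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse_val.append(np.sqrt(mean_squared_error(y_test, pred)))

best_k = int(np.argmin(rmse_val)) + 1  # +1 because k starts at 1
print("best k:", best_k)
```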



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm0mco5cg5xzqe19ozl4g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm0mco5cg5xzqe19ozl4g.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Using the optimal k value.
from sklearn import neighbors

model = neighbors.KNeighborsRegressor(n_neighbors=5)

model.fit(X_train, y_train) # fit the model
pred = model.predict(X_test)
pred

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhh923zhl8pyysopnojg7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhh923zhl8pyysopnojg7.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, let's build our own KNN model from scratch using NumPy and pandas.&lt;/p&gt;

&lt;h3&gt;
  
  
  KNN model from scratch
&lt;/h3&gt;

&lt;p&gt;We convert the train and test data into NumPy arrays.&lt;/p&gt;

&lt;p&gt;Then we combine the &lt;strong&gt;X_train&lt;/strong&gt; and &lt;strong&gt;y_train&lt;/strong&gt; into a matrix.&lt;/p&gt;

&lt;p&gt;The matrix will contain the &lt;strong&gt;22 columns&lt;/strong&gt; of &lt;strong&gt;X_train&lt;/strong&gt; plus &lt;strong&gt;1 column&lt;/strong&gt; of &lt;strong&gt;y_train&lt;/strong&gt; at the end (i.e., the last column).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;train = np.array(X_train)
test = np.array(X_test)
y_train = np.array(y_train)
# reshaping the 1-D array into a column vector
y_train = y_train.reshape(-1, 1)
# combining the training dataset and the y_train into a matrix
train_df = np.hstack([train, y_train])
train_df[0:2]

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3vud59hidrm54twlefh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3vud59hidrm54twlefh.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, for each row (data point) of the test dataset, we find the &lt;strong&gt;Euclidean distance&lt;/strong&gt; between that test point and every point of the training data.&lt;/p&gt;

&lt;p&gt;We use a for loop to iterate through every point in the test dataset, compute the distances, and stack them onto the training matrix train_df.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Steps:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;We find the distances between one test point and every point of the train data set.&lt;/li&gt;
&lt;li&gt;We reshape the distances using &lt;code&gt;reshape(-1,1)&lt;/code&gt; to get an array with 1 column and one row per training point.&lt;/li&gt;
&lt;li&gt;Then using &lt;code&gt;np.hstack()&lt;/code&gt; we stack this distance array into the train_df dataset.&lt;/li&gt;
&lt;li&gt;Now we sort this matrix from smallest to largest based on the distance column.&lt;/li&gt;
&lt;li&gt;We then average the y_train values of the first 5 rows to obtain the prediction.&lt;/li&gt;
&lt;li&gt;Repeat the above steps for every test point and predict the values respectively and store these values in an array.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;preds = []
for i in range(len(test)):
    distances = np.sqrt(np.sum((train - test[i])**2, axis = 1))
    distances = distances.reshape(-1,1)
    matrix = np.hstack([train_df, distances])
    sorted_matrix = matrix[matrix[:,-1].argsort()]
    neighbours = [sorted_matrix[i][-2] for i in range(5)]
    pred_value = np.mean(neighbours)
    preds.append(pred_value)
knn_scratch_pred = np.array(preds)
knn_scratch_pred

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1gm0lupgnmadivf1i6z7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1gm0lupgnmadivf1i6z7.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Comparing Sklearn and Our KNN model
&lt;/h3&gt;

&lt;p&gt;To compare the prediction values from sklearn and our scratch implementation, we build a pandas DataFrame pred_df as shown in the code below.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sklearn_pred = pred.reshape(-1,1)
my_knn_pred = knn_scratch_pred.reshape(-1,1)
predicted_values = np.hstack([sklearn_pred,my_knn_pred])
pred_df = pd.DataFrame(predicted_values,columns=['sklearn_preds','my_knn_preds'])
pred_df

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd1fpfi6a3qkcby09bwsx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd1fpfi6a3qkcby09bwsx.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can see that the predicted values of our scratch KNN match those obtained with the sklearn library, which confirms that our intuition and method are correct.&lt;/p&gt;
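&lt;p&gt;The visual check can also be backed by a numeric one. A self-contained sketch on toy data, comparing sklearn's KNN regressor against the same nearest-neighbour averaging done by hand:&lt;/p&gt;

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Tiny toy dataset; the same comparison the article runs on the diamond data.
rng = np.random.default_rng(2)
train = rng.normal(size=(50, 3))
y_tr = rng.normal(size=50)
test = rng.normal(size=(10, 3))

# sklearn predictions with k = 5
sk_pred = KNeighborsRegressor(n_neighbors=5).fit(train, y_tr).predict(test)

# Scratch predictions: mean of the 5 nearest training targets.
scratch_pred = []
for t in test:
    d = np.sqrt(((train - t) ** 2).sum(axis=1))
    nearest = np.argsort(d)[:5]
    scratch_pred.append(y_tr[nearest].mean())
scratch_pred = np.array(scratch_pred)

print(np.allclose(sk_pred, scratch_pred))  # True
```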

&lt;p&gt;For the full code file and the dataset, visit &lt;a href="https://github.com/Dipankar-Medhi/k-nearest-neighbors-KNN" rel="noopener noreferrer"&gt;Github&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;🌎Explore, 🎓Learn, 👷‍♂️Build. Happy Coding💛&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>knn</category>
      <category>ai</category>
    </item>
    <item>
      <title>How Blockchain is changing the Financial industry?</title>
      <dc:creator>Dipankar Medhi</dc:creator>
      <pubDate>Fri, 18 Mar 2022 17:16:46 +0000</pubDate>
      <link>https://dev.to/dipankarmedhi/how-blockchain-is-changing-the-financial-industry-1gj</link>
      <guid>https://dev.to/dipankarmedhi/how-blockchain-is-changing-the-financial-industry-1gj</guid>
      <description>&lt;p&gt;Hi👋, Today, let us go through the impact of Blockchain technology on the current banking system and how it will change (already changing) the digital transaction and user interaction of sales and exchange. We'll consider scenarios before and after the introduction of blockchain technology and understand the impact of blockchain technology in the world of finance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Before Blockchain Technology
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Barter System - before currency
&lt;/h3&gt;

&lt;p&gt;Earlier, people used to exchange goods or services for other goods or services without using any other form of currency. For example, if one needed a bag of sugar, they had to exchange it for something with equal value in the current market.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy1wkzn9yiqsti8iq8gcg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy1wkzn9yiqsti8iq8gcg.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;credit: historyplex.com&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Age of Currency
&lt;/h3&gt;

&lt;p&gt;Then governments introduced currency (coins) that the general public could use to buy goods or services, or earn in exchange for them. Some years later came banks, which promised currency in exchange for gold or objects of equal value. Acting as a trusted third party, the bank stores all of the public's transaction history in a &lt;strong&gt;Ledger&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;What is a Ledger&lt;/strong&gt;? A &lt;em&gt;ledger&lt;/em&gt; is a book containing all the users' transaction history consisting of all their debits and credits along with the specified time of transaction.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For more on the ledger, visit &lt;a href="https://www.freshbooks.com/hub/accounting/what-is-a-ledger" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffyxxljw97cx2vpnyl7gx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffyxxljw97cx2vpnyl7gx.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;However, governments place many regulations on how people use their own money. Banks restrict users from freely spending and investing their hard-earned money in their needs and interests, and they charge various fees for transactions and fund transfers. Moreover, there is a risk of fraud by these third-party bodies, in which case users might never get their money back. Solving these problems within the traditional system is nearly impossible, and that is where blockchain technology comes in.&lt;/p&gt;

&lt;h2&gt;
  
  
  After Blockchain Technology - 🦸‍♂️Blockchain to the rescue
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6kg4ksx3qetyzptqe25v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6kg4ksx3qetyzptqe25v.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;credits: comoganhardinheiro.pt&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;Instead of a central authority keeping every transaction record, blockchain preserves each digital currency transaction in a &lt;strong&gt;Decentralized Ledger&lt;/strong&gt; shared across the network.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In 1991, &lt;strong&gt;Stuart Haber&lt;/strong&gt; and &lt;strong&gt;W. Scott Stornetta&lt;/strong&gt; published the paper &lt;strong&gt;"How to Time-Stamp a Digital Document."&lt;/strong&gt; They showed that if we timestamp a digital document and store it in a repository, we can maintain it as a record and secure it for the future. It is also a proper way to tackle the double-spending problem.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Taking this idea as a reference, &lt;strong&gt;Satoshi Nakamoto&lt;/strong&gt; published a white paper describing the world's first &lt;strong&gt;Peer-to-Peer Electronic Cash System&lt;/strong&gt;, which bypasses the traditional centralized system by introducing a few new concepts such as the peer-to-peer network and hash-based blocks.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  A little information on Peer-To-Peer Network
&lt;/h3&gt;

&lt;p&gt;The central concept is to place a copy of the ledger on every computer around the world in the form of &lt;strong&gt;hashed blocks&lt;/strong&gt;. These blocks are cryptographically linked and replicated on systems across the globe, so changing or tampering with them is practically impossible. This form of network builds &lt;strong&gt;trust&lt;/strong&gt; and &lt;strong&gt;integrity&lt;/strong&gt;, making it a highly effective foundation for finance in the 21st century. These are the ideas Satoshi Nakamoto put forward in the system he named Bitcoin.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;To read the full original white paper of Satoshi Nakamoto on Bitcoin, visit &lt;a href="https://bitcoin.org/bitcoin.pdf" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft5s5zp0lu9pnlk3b2ws7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft5s5zp0lu9pnlk3b2ws7.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;credits: &lt;a href="https://remitano.com/forum/in/post/295-peer-to-peer-networking-how-its-changing-our-lives" rel="noopener noreferrer"&gt;https://remitano.com/forum/in/post/295-peer-to-peer-networking-how-its-changing-our-lives&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  A small brief on Hashed based system
&lt;/h3&gt;

&lt;p&gt;Hashing makes an object, document, or legal paper uniquely identifiable by giving it a &lt;strong&gt;digital fingerprint&lt;/strong&gt; in the form of a hash. Hashes prove ownership to the respective owners without exposing the details of the object to others. &lt;strong&gt;SHA-256&lt;/strong&gt;, the hash function used in Bitcoin, always produces a fixed-length 256-bit (32-byte) output. These hashes are stored in blocks and added to the blockchain, which is distributed over thousands of systems (nodes) worldwide.&lt;/p&gt;
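&lt;p&gt;The fixed-length property is easy to see with Python's standard &lt;code&gt;hashlib&lt;/code&gt; module (a plain illustration of SHA-256, not Bitcoin's exact block-hashing scheme, which applies SHA-256 twice):&lt;/p&gt;

```python
import hashlib

# Inputs of any length hash to the same fixed-size digest.
h1 = hashlib.sha256(b"hello").hexdigest()
h2 = hashlib.sha256(b"hello world, a much longer message than before").hexdigest()
print(len(h1), len(h2))  # 64 64  (64 hex characters = 256 bits)

# Changing a single character changes the digest completely,
# which is what makes tampering with a block detectable.
h3 = hashlib.sha256(b"hello!").hexdigest()
print(h3 == h1)  # False
```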

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmk79kclh3uxg48amh59m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmk79kclh3uxg48amh59m.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Tampering with the information is practically impossible: as soon as someone changes (or tries to change) the data, the hash value changes and no longer matches the hashes stored in the blocks on the other systems. This mismatch makes the block invalid, keeping the information safe and secure on the blockchain.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbzhnv7jq0685o1x5bcqk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbzhnv7jq0685o1x5bcqk.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This blog focuses on the impact of blockchain technology on the finance industry, showcasing the changes and improvements it can bring. It covers how the traditional banking system functions, the problems it creates for users around the world, why the current banking system is vulnerable, and how blockchain technology can solve most of these issues, along with how peer-to-peer networks work and how hashing fits into a blockchain network.&lt;/p&gt;

&lt;p&gt;There are many more important factors to be discussed and many essential aspects to consider when talking about blockchain technology. Although I haven't covered them all in this blog, I will surely make more blogs on the critical factors that make blockchain technology a fantastic innovation in the technology industry.&lt;/p&gt;




&lt;p&gt;🌎Explore, 🎓Learn, 👷‍♂️Build. Happy Learning💛&lt;/p&gt;

</description>
      <category>blockchain</category>
      <category>web3</category>
    </item>
    <item>
      <title>Self Driving Car using Tensorflow</title>
      <dc:creator>Dipankar Medhi</dc:creator>
      <pubDate>Tue, 15 Mar 2022 06:16:59 +0000</pubDate>
      <link>https://dev.to/dipankarmedhi/self-driving-car-using-tensorflow-1poc</link>
      <guid>https://dev.to/dipankarmedhi/self-driving-car-using-tensorflow-1poc</guid>
      <description>&lt;p&gt;Welcome👋, Today I will walk you through a Tensorflow project where we'll build a self-driving car based on Nvidia's Self Driving Car model.&lt;/p&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Unity - Go to &lt;a href="https://unity.com/" rel="noopener noreferrer"&gt;Unity&lt;/a&gt; and download the Unity installer. Choose the right version for your system, then run the installer and follow the steps to install the program.&lt;/li&gt;
&lt;li&gt;Simulator - Visit &lt;a href="https://github.com/udacity/self-driving-car-sim" rel="noopener noreferrer"&gt;github/udacity&lt;/a&gt; and follow the instructions mentioned in the &lt;a href="https://github.com/udacity/self-driving-car-sim/blob/master/README.md" rel="noopener noreferrer"&gt;Readme.md&lt;/a&gt; to download and run the simulator as per the system requirements.&lt;/li&gt;
&lt;li&gt;Anaconda/python env - Create a python environment for the model using &lt;a href="https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html" rel="noopener noreferrer"&gt;conda&lt;/a&gt; or &lt;a href="https://docs.python.org/3/library/venv.html" rel="noopener noreferrer"&gt;python&lt;/a&gt;. &lt;/li&gt;
&lt;li&gt;Tensorflow - Install TensorFlow after creating the conda env. Visit &lt;a href="https://anaconda.org/conda-forge/tensorflow" rel="noopener noreferrer"&gt;here&lt;/a&gt; to learn more. &lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Running the Simulator
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;When we first run the simulator, we will see a screen similar to the one shown below.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8iuq9ramk5s0l22ta8i6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8iuq9ramk5s0l22ta8i6.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Choose the resolution (I suggest &lt;strong&gt;640x480&lt;/strong&gt; ) and graphic quality. &lt;/li&gt;
&lt;li&gt;Then start the simulator by pressing the &lt;strong&gt;Play&lt;/strong&gt; button.&lt;/li&gt;
&lt;li&gt;Next, we'll see a screen with two options, &lt;strong&gt;Training Mode&lt;/strong&gt; and &lt;strong&gt;Autonomous Mode&lt;/strong&gt;. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frd0gduwabh3okbj9ih0n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frd0gduwabh3okbj9ih0n.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Select a track and choose &lt;strong&gt;Training Mode&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Training Mode
&lt;/h4&gt;

&lt;p&gt;This mode records the images produced by the 3 cameras (left, centre, right) mounted on the front of the car. All the captured images are saved to the local disk along with the &lt;strong&gt;steering&lt;/strong&gt;, &lt;strong&gt;throttle&lt;/strong&gt;, &lt;strong&gt;brake&lt;/strong&gt; and &lt;strong&gt;speed&lt;/strong&gt; values in a CSV file named &lt;strong&gt;driving_log.csv&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;For more accurate results, run the car for 8-10 laps.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Goal
&lt;/h3&gt;

&lt;p&gt;The goal of the project is to drive the car autonomously in Autonomous Mode, using a deep neural network trained on the data collected in Training Mode.&lt;/p&gt;

&lt;h3&gt;
  
  
  Let's Start Coding!
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Exploratory Data Analysis
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;We import the data and necessary libraries.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pandas as pd
import numpy as np
import os
import cv2
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Convolution2D, Flatten, Dense, Dropout
from tensorflow.keras.optimizers import Adam

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Load the data and view the head using &lt;code&gt;df.head()&lt;/code&gt;.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;columns = ['center', 'left', 'right', 'steering', 'throttle','brake', 'speed']
df = pd.read_csv(os.path.join('E:\dev\SelfDrivingCar','driving_log.csv'), names = columns)
df.head()

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ik8b7oozjtj5wlqbtt8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ik8b7oozjtj5wlqbtt8.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Plotting the steering values for visual insights.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plt.hist(df.steering)
plt.show()

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faecx5wjsd4ndk6h7crmr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faecx5wjsd4ndk6h7crmr.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We can also check its skewness:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print("Skewness of the steering feature:\n", df['steering'].skew())

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;We are going to use the &lt;strong&gt;steering&lt;/strong&gt; column as the dependent variable. Our goal is to predict the steering values from the images produced by the simulator.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;Checking the image using &lt;strong&gt;OpenCV&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;img = cv2.imread(df['center'][0])
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foaffrfcmnoes37tvffoo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foaffrfcmnoes37tvffoo.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The image is okay, but there are many unnecessary objects like mountains, trees, sky, etc. that we can remove from the image and only keep the road track for training.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Image Preprocessing and Data Augmentation
&lt;/h4&gt;

&lt;p&gt;Before moving to the training process, it is important to remove unwanted data and keep the images simple for training the model. Image preprocessing may also decrease model training time and increase model inference speed.&lt;/p&gt;

&lt;p&gt;Image augmentation creates additional training data from the images already available, helping the model generalize and preventing overfitting.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Define a function &lt;code&gt;image_preprocessing()&lt;/code&gt; that takes an image path as input, crops the image, and converts it to YUV.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def image_preprocessing(path):
    # cropping image
    img = cv2.imread(path)
    cropped_img = img[60:160,:]
    # color conversion from BGR to YUV
    final_img = cv2.cvtColor(cropped_img, cv2.COLOR_BGR2YUV)
    # application of gaussian blur
    final_img = cv2.GaussianBlur(final_img,(3,5),0)
    # resize image
    output = cv2.resize(final_img, (300,80))
    # normalizing image
    output = output/255
    return output

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Create a function &lt;code&gt;data_augmentation()&lt;/code&gt; that accepts the image processing function and outputs augmented images and augmented steering features.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def data_augmentation(img_process):
    images = []
    steerings = []
    # for each row in the dataset
    for row in range(df.shape[0]):
        # for ith column
        for i in range(3):
            # splitting image path and filename
            fileName = mod_name(df.iloc[row, i])
            filePath = './IMG/'+ fileName
            # processing the images
            img = img_process(filePath)
            images.append(img)
            steerings.append(df['steering'][row])

    # image and measurement augmentation
    augmented_images, augmented_steerings = [], []
    for image, steering in zip(images, steerings):
        augmented_images.append(image)
        augmented_steerings.append(steering)

        # horizontally flipping the images
        flipped_img = cv2.flip(image, 1)
        augmented_images.append(flipped_img)
        # changing the sign to match the flipped images
        augmented_steerings.append(-1*steering)

    return augmented_images, augmented_steerings

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;We store the augmented images and augmented steering values in two variables, then print a sample steering value and display the corresponding processed image to check that everything works.&lt;/li&gt;
&lt;li&gt;We use matplotlib to view the images.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;augmented_images, augmented_steerings = data_augmentation(image_preprocessing)
print(augmented_steerings[100])
plt.imshow(augmented_images[100])
plt.show()

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flz9glq8ynv4x27k2e337.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flz9glq8ynv4x27k2e337.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Training and Validation
&lt;/h3&gt;

&lt;p&gt;The next step is to prepare the training and validation dataset.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;First, we store the augmented images and augmented steering values separately in X and y variables.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;X = np.array(augmented_images)
y = np.array(augmented_steerings)

X.shape

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;(7698, 80, 300, 3)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;We then split the dataset for training using the &lt;code&gt;train_test_split&lt;/code&gt; method from the &lt;strong&gt;sklearn&lt;/strong&gt; library.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sklearn.model_selection import train_test_split
xtrain, xval, ytrain, yval = train_test_split(X, y, test_size = 0.2, random_state = 1)
print('Train images:',len(xtrain))
print('Validation images:',len(xval))

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Train images: 6158 Validation images: 1540&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Model Building and Training
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://www.i-programmer.info/news/105-artificial-intelligence/9678-nvidias-neural-network-drives-a-car.html" rel="noopener noreferrer"&gt;model&lt;/a&gt; architecture is based on Nvidia's neural network for self-driving cars.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model = Sequential()
model.add(Convolution2D(24,(5,5),(2,2),input_shape=xtrain[0].shape))

model.add(Convolution2D(36,(5,5),(2,2),activation='elu'))
model.add(Convolution2D(48,(5,5),(2,2),activation='elu'))
# the feature maps are small by this point, so we use the default 1x1 stride instead of 2x2.
model.add(Convolution2D(64,(3,3),activation='elu'))
model.add(Convolution2D(64,(3,3),activation='elu'))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(100,activation='elu'))
model.add(Dense(50,activation='elu'))
model.add(Dense(10,activation='elu'))
model.add(Dense(1))

# accuracy is not meaningful for regression, so we track only the MSE loss
model.compile(Adam(learning_rate=0.0001), loss='mse')
print(model.summary())

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqp98shgqzh3mq01oo878.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqp98shgqzh3mq01oo878.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;We use early stopping to halt training once the validation loss stops improving, which helps prevent overfitting.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Finally, we save the trained model so it can be loaded later to drive the car in the simulator.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
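&lt;p&gt;The post doesn't show this training code, so here is a minimal sketch of early stopping and saving with Keras. The tiny stand-in model and random arrays exist only so the snippet runs on its own; in the article, &lt;code&gt;model&lt;/code&gt;, &lt;code&gt;xtrain&lt;/code&gt;, &lt;code&gt;ytrain&lt;/code&gt;, &lt;code&gt;xval&lt;/code&gt; and &lt;code&gt;yval&lt;/code&gt; already come from the steps above.&lt;/p&gt;

```python
# Sketch (assumed, not code from the post) of early stopping + saving.
# A tiny stand-in model and random data make this runnable standalone.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Flatten, Dense
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam

rng = np.random.default_rng(1)
xtrain = rng.normal(size=(32, 80, 300, 3)).astype('float32')
ytrain = rng.normal(size=32).astype('float32')
xval = rng.normal(size=(8, 80, 300, 3)).astype('float32')
yval = rng.normal(size=8).astype('float32')

model = Sequential([Input(shape=(80, 300, 3)), Flatten(), Dense(1)])
model.compile(Adam(learning_rate=0.0001), loss='mse')

# stop once validation loss hasn't improved for `patience` epochs,
# and roll back to the best weights seen so far
early_stop = EarlyStopping(monitor='val_loss', patience=3,
                           restore_best_weights=True)

history = model.fit(xtrain, ytrain,
                    validation_data=(xval, yval),
                    epochs=20, batch_size=8,
                    callbacks=[early_stop], verbose=0)

model.save('model.h5')  # the saved model can later be loaded for inference
```

The same callback and save call would be passed to the full Nvidia-style model defined earlier.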

&lt;h3&gt;
  
  
  Evaluating Training and Validation loss
&lt;/h3&gt;

&lt;p&gt;Plot the training loss and validation loss using matplotlib.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.legend(['Training', 'Validation'])
plt.title('loss')
plt.xlabel('epoch')
plt.show()

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdxd3jngkbm49eq34o61x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdxd3jngkbm49eq34o61x.png" alt="image.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;The overall model is not an exact replica of the original Nvidia architecture; it is an implementation of the idea behind it, so there is still room for improvement. The model's accuracy and losses can be improved further with proper hyperparameter tuning and data preprocessing. In my case, I collected training images from only 2 to 3 laps of the track, so driving more laps should improve the results. I also converted the images to YUV; converting them to grayscale instead, or keeping only the edges, might yield better results.&lt;/p&gt;




&lt;p&gt;🌎Explore, 🎓Learn, 👷‍♂️Build. Happy Coding💛&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
