<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vignesh C</title>
    <description>The latest articles on DEV Community by Vignesh C (@iamvigneshc).</description>
    <link>https://dev.to/iamvigneshc</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F547668%2F87a28216-ff7f-4600-bbb8-e8d397670414.jpg</url>
      <title>DEV Community: Vignesh C</title>
      <link>https://dev.to/iamvigneshc</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/iamvigneshc"/>
    <language>en</language>
    <item>
      <title>Azure ML - Automobile Price Predictor</title>
      <dc:creator>Vignesh C</dc:creator>
      <pubDate>Tue, 23 Jan 2024 20:27:47 +0000</pubDate>
      <link>https://dev.to/iamvigneshc/azure-ml-automobile-price-predictor-52f2</link>
      <guid>https://dev.to/iamvigneshc/azure-ml-automobile-price-predictor-52f2</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;The objective is to create and train a machine learning model that predicts automobile prices from a set of parameters, using Azure ML and a public data set. We will also go a step further and set up and deploy a web service.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Steps:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;1) Navigate to &lt;a href="https://studio.azureml.net/"&gt;https://studio.azureml.net/&lt;/a&gt; and sign up / sign in.&lt;/p&gt;

&lt;p&gt;2) Add a new experiment using the (+) symbol at the bottom and select a blank experiment.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl7hnb8zccehy0wx16i66.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl7hnb8zccehy0wx16i66.png" alt="Image description" width="800" height="488"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;3) Find the public Automobile price data set and drag it to the mapping area.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg9m7xi8m0ff5fqm1g9hc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg9m7xi8m0ff5fqm1g9hc.png" alt="Image description" width="430" height="570"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click the data set and select Visualization to inspect the data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyje7wtc849833caiy54x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyje7wtc849833caiy54x.png" alt="Image description" width="800" height="432"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;4) Search the experiment items for "Select Columns in Dataset" and drag it to the mapping area. Configure it to include all columns except 'normalized-losses'.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyvzujf3jgrmav2h36lmc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyvzujf3jgrmav2h36lmc.png" alt="Image description" width="800" height="269"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqilsrhqfnimmzzy6598p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqilsrhqfnimmzzy6598p.png" alt="Image description" width="277" height="307"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;5) Connect the two modules as shown below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuz2hq0lu3yx3cxc6e8ac.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuz2hq0lu3yx3cxc6e8ac.png" alt="Image description" width="343" height="226"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;6) Search the experiment items for "Clean Missing Data" and drag it to the mapping area as the next step. Set the cleaning mode to remove the entire row.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F460z2kg70t43pujok53w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F460z2kg70t43pujok53w.png" alt="Image description" width="283" height="81"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;7) Save and run the experiment.&lt;/p&gt;

&lt;p&gt;8) After careful feature engineering, determine the key parameters and select only those columns from the data set. To train the model, we also include the price field.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwuttcy61vmh93yt3mm66.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwuttcy61vmh93yt3mm66.png" alt="Image description" width="591" height="370"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw035usivl6pl26gyfy23.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw035usivl6pl26gyfy23.png" alt="Image description" width="800" height="280"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;9) Now add a step to split the data. 75% of the data will be used for training the model and the remainder for testing it. Add the step below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx323y9qg4skv89ogox1x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx323y9qg4skv89ogox1x.png" alt="Image description" width="253" height="196"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F842nts0zmpnwz7ee9ado.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F842nts0zmpnwz7ee9ado.png" alt="Image description" width="312" height="442"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;10) For price prediction, we will use linear regression to train the model. Add the steps below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ksv8w8qfenstq6fcup7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ksv8w8qfenstq6fcup7.png" alt="Image description" width="507" height="190"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feti3l3kxebph4nezdn1m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feti3l3kxebph4nezdn1m.png" alt="Image description" width="800" height="350"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm2vtwk0sp9o49835ugfn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm2vtwk0sp9o49835ugfn.png" alt="Image description" width="310" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;11) Score the model.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs073i1sfs25po5hasu88.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs073i1sfs25po5hasu88.png" alt="Image description" width="559" height="316"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk3wlzp3phkx5ie82hksj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk3wlzp3phkx5ie82hksj.png" alt="Image description" width="800" height="489"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;12) Evaluate the model and visualize the results.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu0z6pgvarrdo9dd9mck1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu0z6pgvarrdo9dd9mck1.png" alt="Image description" width="800" height="519"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdmym22upwxg14qa1180d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdmym22upwxg14qa1180d.png" alt="Image description" width="700" height="346"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;13) The model is now ready to be deployed. The changes below exclude the actual price field, which is not known when predicting in real time. We will use the Scored Labels instead.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5o6sxqcf5sin2kzg626i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5o6sxqcf5sin2kzg626i.png" alt="Image description" width="800" height="268"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4k10jns03njwi8pjtjib.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4k10jns03njwi8pjtjib.png" alt="Image description" width="800" height="325"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn4ujcn3o9maimocay6hr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn4ujcn3o9maimocay6hr.png" alt="Image description" width="800" height="330"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;14) Now we are ready to set up the web service.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F58vt5rjwmfob4uvrs3b9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F58vt5rjwmfob4uvrs3b9.png" alt="Image description" width="596" height="290"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdq7dfn3w7z594oh28pqw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdq7dfn3w7z594oh28pqw.png" alt="Image description" width="403" height="169"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;15) Connect the web service input to the Score Model step. Run the experiment and deploy the web service.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhxe7h8hdzlp1lk7cus48.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhxe7h8hdzlp1lk7cus48.png" alt="Image description" width="800" height="353"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;16) An API key is generated for calling the web service.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffoxxywfhrxh0jsdfoi0n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffoxxywfhrxh0jsdfoi0n.png" alt="Image description" width="800" height="372"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;17) Test the model with real-time data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjpvtcgzllqaatrqx24wz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjpvtcgzllqaatrqx24wz.png" alt="Image description" width="800" height="479"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>azureml</category>
      <category>automobile</category>
      <category>pricing</category>
      <category>aiml</category>
    </item>
    <item>
      <title>Understanding OAuth 2.0</title>
      <dc:creator>Vignesh C</dc:creator>
      <pubDate>Tue, 15 Aug 2023 20:47:07 +0000</pubDate>
      <link>https://dev.to/iamvigneshc/understanding-oauth-20-467b</link>
      <guid>https://dev.to/iamvigneshc/understanding-oauth-20-467b</guid>
      <description>&lt;h2&gt;
  
  
  What is OAuth?
&lt;/h2&gt;

&lt;p&gt;OAuth (Open Authorization) is an open standard for authorization that allows web apps to request access to third-party systems on behalf of their users without sharing any account credentials.&lt;/p&gt;

&lt;p&gt;OAuth 2.0 is an authorization framework that delegates access and permissions between APIs and applications in a safe and reliable exchange. It was designed to be more compatible with both websites and apps.&lt;/p&gt;

&lt;p&gt;It also allows for a greater variety of tokens, such as short-lived access tokens and long-lived refresh tokens.&lt;/p&gt;
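
&lt;p&gt;As a rough illustration of how short-lived access tokens and long-lived refresh tokens work together, a client can keep both and check whether the access token is about to expire before each call. The class name, field names and the 30-second leeway below are assumptions made for this sketch and are not part of the OAuth specification.&lt;/p&gt;

```python
import time
from dataclasses import dataclass


@dataclass
class TokenSet:
    """Hypothetical container for the tokens a client holds."""
    access_token: str    # short-lived, sent with each API request
    refresh_token: str   # long-lived, used only to obtain a new access token
    expires_at: float    # Unix timestamp at which the access token expires


def needs_refresh(tokens, now=None, leeway_seconds=30.0):
    """Return True when the access token is expired or about to expire."""
    current = time.time() if now is None else now
    return current >= tokens.expires_at - leeway_seconds


tokens = TokenSet("access-abc", "refresh-xyz", expires_at=1000.0)
print(needs_refresh(tokens, now=900.0), needs_refresh(tokens, now=980.0))
```

&lt;p&gt;When the check returns true, the client would exchange the refresh token for a new access token at the authorization server instead of asking the user to sign in again.&lt;/p&gt;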

&lt;h2&gt;
  
  
  &lt;strong&gt;Key Components of OAuth 2.0&lt;/strong&gt;
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Resource Owner - The user who owns the resources&lt;/li&gt;
&lt;li&gt;Client - The application requesting access&lt;/li&gt;
&lt;li&gt;Authorization Server - Issues access tokens&lt;/li&gt;
&lt;li&gt;Resource Server - Hosts protected resources&lt;/li&gt;
&lt;li&gt;Access Token - Grants access with specific scopes&lt;/li&gt;
&lt;li&gt;Redirect URI - Where the user is sent after permission is granted&lt;/li&gt;
&lt;/ul&gt;
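
&lt;p&gt;To show how these components meet in practice, here is a minimal sketch of the first step of the authorization code grant: the client builds the URL that sends the resource owner to the authorization server. The query parameter names come from the OAuth 2.0 specification. The endpoint, client ID, redirect URI and scopes are placeholders invented for this example.&lt;/p&gt;

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scope, state):
    """Build the URL the client redirects the resource owner to."""
    query = urlencode({
        "response_type": "code",       # ask for an authorization code
        "client_id": client_id,        # identifies the client application
        "redirect_uri": redirect_uri,  # where the user returns after consent
        "scope": scope,                # space-separated permissions requested
        "state": state,                # random value echoed back to the client
    })
    return authorize_endpoint + "?" + query


# All values below are hypothetical placeholders.
state = secrets.token_urlsafe(16)
url = build_authorization_url(
    "https://auth.example.com/authorize",
    "my-client-id",
    "https://app.example.com/callback",
    "profile email",
    state,
)
params = parse_qs(urlparse(url).query)
print(params["response_type"], params["redirect_uri"])
```

&lt;p&gt;After the user grants permission, the authorization server redirects to the redirect URI with a code, which the client exchanges for an access token. The client should verify that the returned state matches the one it sent.&lt;/p&gt;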

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8b4gpnitu0rxv4p1xd0u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8b4gpnitu0rxv4p1xd0u.png" alt="Image description" width="800" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu7fmx2aihlo9x7oi7psm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu7fmx2aihlo9x7oi7psm.png" alt="Image description" width="800" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Best Practices
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Choose the Right Grant Type&lt;/li&gt;
&lt;li&gt;Implement Secure Token Management&lt;/li&gt;
&lt;li&gt;Consistent User Consent Mechanisms&lt;/li&gt;
&lt;li&gt;Regular Security Reviews and Updates&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>api</category>
      <category>oauth</category>
      <category>sso</category>
      <category>security</category>
    </item>
    <item>
      <title>Git Cheat Sheet</title>
      <dc:creator>Vignesh C</dc:creator>
      <pubDate>Wed, 17 Feb 2021 17:32:40 +0000</pubDate>
      <link>https://dev.to/iamvigneshc/git-cheat-sheet-25od</link>
      <guid>https://dev.to/iamvigneshc/git-cheat-sheet-25od</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faufe0s0q3b3d7u1v1o26.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faufe0s0q3b3d7u1v1o26.JPG" alt="Git" width="720" height="1017"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Credits: Nina Jaeschke&lt;/p&gt;

</description>
      <category>github</category>
      <category>devops</category>
    </item>
    <item>
      <title>Reporting in Power BI - Quick Steps</title>
      <dc:creator>Vignesh C</dc:creator>
      <pubDate>Mon, 18 Jan 2021 04:12:24 +0000</pubDate>
      <link>https://dev.to/iamvigneshc/reporting-in-power-bi-steps-2ace</link>
      <guid>https://dev.to/iamvigneshc/reporting-in-power-bi-steps-2ace</guid>
      <description>&lt;p&gt;Power BI is a business analytics service by Microsoft. It aims to provide interactive visualizations and business intelligence capabilities with an interface simple enough for end users to create their own reports and dashboards.&lt;/p&gt;

&lt;h1&gt;
  
  
  Power BI Reporting
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Steps:
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Prepare Data
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Importing Data (from data sources)&lt;/li&gt;
&lt;li&gt;Transforming Data (Edit query in Query editor)

&lt;ul&gt;
&lt;li&gt;Rename table&lt;/li&gt;
&lt;li&gt;Append Query (stack the rows of two or more tables)&lt;/li&gt;
&lt;li&gt;Fixing Metadata&lt;/li&gt;
&lt;li&gt;Filter Rows (condition)&lt;/li&gt;
&lt;li&gt;Remove Columns&lt;/li&gt;
&lt;li&gt;Adding Columns&lt;/li&gt;
&lt;li&gt;Merge queries (inner, outer, left, right)&lt;/li&gt;
&lt;li&gt;Split by delimiter&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;
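
&lt;p&gt;The difference between appending and merging queries is easy to mix up, so here is a small sketch in plain Python. It is not Power Query code. The tables, column names and helper functions are made up for illustration, and only the inner and left join kinds are shown.&lt;/p&gt;

```python
def append_queries(*tables):
    """Append: stack the rows of several tables into one longer table."""
    return [dict(row) for table in tables for row in table]


def merge_queries(left, right, key, kind="inner"):
    """Merge: join two tables on a key column ("inner" or "left" only)."""
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)
    merged = []
    for row in left:
        matches = right_by_key.get(row[key], [])
        for match in matches:
            merged.append({**row, **match})   # columns from both tables
        if not matches and kind == "left":
            merged.append(dict(row))          # keep unmatched left rows
    return merged


# Hypothetical sample tables.
sales_q1 = [{"product_id": 1, "amount": 100}, {"product_id": 2, "amount": 50}]
sales_q2 = [{"product_id": 3, "amount": 75}]
products = [{"product_id": 1, "name": "Widget"}, {"product_id": 2, "name": "Gadget"}]

all_sales = append_queries(sales_q1, sales_q2)
inner_rows = merge_queries(all_sales, products, "product_id")
left_rows = merge_queries(all_sales, products, "product_id", kind="left")
print(len(all_sales), len(inner_rows), len(left_rows))
```

&lt;p&gt;Appending makes the table longer (more rows), while merging makes it wider (more columns) and may drop or keep unmatched rows depending on the join kind.&lt;/p&gt;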

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fnopyer2pdhx1sjko5hi5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fnopyer2pdhx1sjko5hi5.png" alt="Alt Text" width="800" height="411"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fbgduhd75exheuokm0w9i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fbgduhd75exheuokm0w9i.png" alt="Alt Text" width="800" height="411"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cleansing Data (Edit query in Query editor)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Frd2is22ri8vg3unpp115.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Frd2is22ri8vg3unpp115.png" alt="Alt Text" width="800" height="411"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Create Data Model, Enhance
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Define table relationships

&lt;ul&gt;
&lt;li&gt;Fix data model for summarization&lt;/li&gt;
&lt;li&gt;Fix query and create relationships&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Configure Model properties&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fth1bsw8pbn7s076mrw06.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fth1bsw8pbn7s076mrw06.png" alt="Alt Text" width="800" height="411"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adding Calculated columns and Measures (DAX - Data Analysis Expressions)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Ffiowfggyngrcbr5m9nr9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Ffiowfggyngrcbr5m9nr9.png" alt="Alt Text" width="800" height="411"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create hierarchies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fiors7u9l6ztrqwjb26ii.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fiors7u9l6ztrqwjb26ii.png" alt="Alt Text" width="800" height="411"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Create Report
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Report elements
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Visualization elements&lt;/li&gt;
&lt;li&gt;Static elements&lt;/li&gt;
&lt;li&gt;Page&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Format
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Chart (Conditional Formatting)&lt;/li&gt;
&lt;li&gt;Page (background)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. Publish Report
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;PBIX file&lt;/li&gt;
&lt;li&gt;Pin to Dashboard&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  5. Visualize Data
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Filters, Slicers&lt;/li&gt;
&lt;li&gt;Highlighting, Drilling&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fv21qfv1i65k6c9aut5ep.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fv21qfv1i65k6c9aut5ep.png" alt="Alt Text" width="800" height="411"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Refresh Data
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Publish dataset&lt;/li&gt;
&lt;li&gt;Refreshing data&lt;/li&gt;
&lt;li&gt;Power BI Gateway set up - Personal&lt;/li&gt;
&lt;li&gt;Power BI Gateway - Enterprise&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>database</category>
      <category>sql</category>
      <category>datascience</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Architecting with Google Kubernetes Engine</title>
      <dc:creator>Vignesh C</dc:creator>
      <pubDate>Mon, 18 Jan 2021 03:42:37 +0000</pubDate>
      <link>https://dev.to/iamvigneshc/architecting-with-google-kubernetes-engine-3kj1</link>
      <guid>https://dev.to/iamvigneshc/architecting-with-google-kubernetes-engine-3kj1</guid>
      <description>&lt;p&gt;Subscribe to my &lt;a href="https://tinyurl.com/y29gn693" rel="noopener noreferrer"&gt;Youtube channel&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kubernetes&lt;/strong&gt; is an open-source platform that helps you orchestrate and manage your container infrastructure on-premises or in the cloud. &lt;/p&gt;

&lt;p&gt;It automates the deployment, scaling, load balancing, logging, monitoring, and other management features of containerized applications. These are the features that are characteristic of typical platform-as-a-service solutions. Kubernetes also facilitates the features of infrastructure-as-a-service, such as allowing a wide range of user preferences and configuration flexibility.&lt;br&gt;
&lt;br&gt;
&lt;strong&gt;Google Kubernetes Engine&lt;/strong&gt; is a managed Kubernetes service on Google infrastructure. GKE helps you to deploy, manage, and scale Kubernetes environments for your containerized applications on Google Cloud. More specifically, GKE is a component of the Google Cloud compute offerings. It makes it easy to bring your Kubernetes workloads into the cloud.&lt;/p&gt;

&lt;p&gt;&lt;br&gt;
&lt;strong&gt;Cloud Run&lt;/strong&gt; is a managed compute platform that enables you to run stateless containers via web requests or Pub/Sub events. Cloud Run is serverless: it abstracts away all infrastructure management so you can focus on developing applications. It is built on Knative, an open-source, Kubernetes-based platform that builds, deploys, and manages modern serverless workloads. Cloud Run gives you the choice of running your containers either fully managed or in your own GKE cluster.&lt;br&gt;
&lt;br&gt;
&lt;strong&gt;Cloud Functions&lt;/strong&gt; is an event-driven, serverless compute service for simple, single-purpose functions that are attached to events. In Cloud Functions, you simply upload your code written in JavaScript, Python, or Go; Google Cloud will automatically deploy appropriate computing capacity to run that code.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fja916bhir7xt98wqrg4f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fja916bhir7xt98wqrg4f.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We will deploy and manage containerized applications on Google Kubernetes Engine (GKE) and other tools on Google Cloud.&lt;/p&gt;

&lt;p&gt;• Deploy solution elements—including infrastructure components like pods, containers, deployments, and services—along with networks and application services&lt;/p&gt;

&lt;p&gt;• Deploy practical solutions, including security and access management, resource management, and resource monitoring&lt;/p&gt;



&lt;h2&gt;
  
  
  Cloud Build
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/IamVigneshC/GCP-Architecting-with-Google-Kubernetes-Engine/tree/main/Cloud%20Build" rel="noopener noreferrer"&gt;https://github.com/IamVigneshC/GCP-Architecting-with-Google-Kubernetes-Engine/tree/main/Cloud%20Build&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Configure Pod Autoscaling and NodePools
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/IamVigneshC/GCP-Architecting-with-Google-Kubernetes-Engine/tree/main/Configure%20Pod%20Autoscaling%20and%20NodePools" rel="noopener noreferrer"&gt;https://github.com/IamVigneshC/GCP-Architecting-with-Google-Kubernetes-Engine/tree/main/Configure%20Pod%20Autoscaling%20and%20NodePools&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating GKE Deployments
&lt;/h2&gt;

&lt;p&gt;• Create deployment manifests, deploy to cluster, and verify Pod rescheduling as nodes are disabled&lt;/p&gt;

&lt;p&gt;• Trigger manual scaling up and down of Pods in deployments&lt;/p&gt;

&lt;p&gt;• Trigger deployment rollout (rolling update to new version) and rollbacks&lt;/p&gt;

&lt;p&gt;• Perform a Canary deployment&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/IamVigneshC/GCP-Architecting-with-Google-Kubernetes-Engine/tree/main/Creating%20GKE%20Deployments" rel="noopener noreferrer"&gt;https://github.com/IamVigneshC/GCP-Architecting-with-Google-Kubernetes-Engine/tree/main/Creating%20GKE%20Deployments&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Deploy GKE
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/IamVigneshC/GCP-Architecting-with-Google-Kubernetes-Engine/tree/main/Deploy%20GKE" rel="noopener noreferrer"&gt;https://github.com/IamVigneshC/GCP-Architecting-with-Google-Kubernetes-Engine/tree/main/Deploy%20GKE&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Ff2a4itfb7d3spklvcxr8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Ff2a4itfb7d3spklvcxr8.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Deploy Jobs
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/IamVigneshC/GCP-Architecting-with-Google-Kubernetes-Engine/tree/main/Deploy%20Jobs" rel="noopener noreferrer"&gt;https://github.com/IamVigneshC/GCP-Architecting-with-Google-Kubernetes-Engine/tree/main/Deploy%20Jobs&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Deploy with Helm
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/IamVigneshC/GCP-Architecting-with-Google-Kubernetes-Engine/tree/main/Deploy%20with%20Helm" rel="noopener noreferrer"&gt;https://github.com/IamVigneshC/GCP-Architecting-with-Google-Kubernetes-Engine/tree/main/Deploy%20with%20Helm&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Networking
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/IamVigneshC/GCP-Architecting-with-Google-Kubernetes-Engine/tree/main/Networking" rel="noopener noreferrer"&gt;https://github.com/IamVigneshC/GCP-Architecting-with-Google-Kubernetes-Engine/tree/main/Networking&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Services and Ingress Resources
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/IamVigneshC/GCP-Architecting-with-Google-Kubernetes-Engine/tree/main/Services%20and%20Ingress%20Resources" rel="noopener noreferrer"&gt;https://github.com/IamVigneshC/GCP-Architecting-with-Google-Kubernetes-Engine/tree/main/Services%20and%20Ingress%20Resources&lt;/a&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>googlecloud</category>
      <category>architecture</category>
      <category>systems</category>
    </item>
    <item>
      <title>Machine Learning journey : Day 4 (Python | Descriptive Statistics | Case Study)</title>
      <dc:creator>Vignesh C</dc:creator>
      <pubDate>Wed, 13 Jan 2021 03:15:19 +0000</pubDate>
      <link>https://dev.to/iamvigneshc/machine-learning-journey-day-4-python-descriptive-statistics-case-study-1dk5</link>
      <guid>https://dev.to/iamvigneshc/machine-learning-journey-day-4-python-descriptive-statistics-case-study-1dk5</guid>
      <description>&lt;p&gt;Subscribe to my &lt;a href="https://tinyurl.com/y29gn693"&gt;YouTube channel&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  CardioGood Fitness Case Study
&lt;/h1&gt;

&lt;p&gt;The market research team at AdRight is assigned the task of identifying the profile of the typical customer for each treadmill product offered by CardioGood Fitness. The team decides to investigate whether there are differences across the product lines with respect to customer characteristics, and collects data on individuals who purchased a treadmill at a CardioGood Fitness retail store during the prior three months. The data are stored in the &lt;a href="https://github.com/IamVigneshC/Machine-Learning-Data-Science/blob/master/Python/CardioGoodFitness-1.csv"&gt;CardioGoodFitness.csv&lt;/a&gt; file.&lt;/p&gt;

&lt;p&gt;The team identifies the following customer variables to study:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;product purchased, TM195, TM498, or TM798;&lt;/li&gt;
&lt;li&gt;gender;&lt;/li&gt;
&lt;li&gt;age, in years;&lt;/li&gt;
&lt;li&gt;education, in years;&lt;/li&gt;
&lt;li&gt;relationship status, single or partnered;&lt;/li&gt;
&lt;li&gt;annual household income;&lt;/li&gt;
&lt;li&gt;average number of times the customer plans to use the treadmill each week;&lt;/li&gt;
&lt;li&gt;average number of miles the customer expects to walk/run each week;&lt;/li&gt;
&lt;li&gt;self-rated fitness on a 1-to-5 scale, where 1 is poor shape and 5 is excellent shape.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Perform descriptive analytics to create a customer profile for each CardioGood Fitness treadmill product line.
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Load the necessary packages
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;

&lt;span class="c1"&gt;# Load the Cardio Dataset
&lt;/span&gt;
&lt;span class="n"&gt;mydata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;CardioGoodFitness-1.csv&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;mydata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;head&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Head:
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fburjsqgs0fz7h1pzgcmb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fburjsqgs0fz7h1pzgcmb.png" alt="Alt Text" width="800" height="228"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Tail:
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F09o7dk0apj2d6y65v076.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F09o7dk0apj2d6y65v076.png" alt="Alt Text" width="800" height="227"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Describe:
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;mydata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;include&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;all&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fzgryc2a05kqndn5vdu1m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fzgryc2a05kqndn5vdu1m.png" alt="Alt Text" width="800" height="326"&gt;&lt;/a&gt;&lt;/p&gt;
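&lt;p&gt;Since the goal is a profile per product line, it can help to summarize by product as well as overall. The following is a minimal sketch on a tiny, made-up sample rather than the real CSV; the column names &lt;code&gt;Product&lt;/code&gt;, &lt;code&gt;Age&lt;/code&gt; and &lt;code&gt;Income&lt;/code&gt; are assumptions meant to mirror the dataset.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical rows for illustration only; not taken from the real dataset.
sample = pd.DataFrame({
    "Product": ["TM195", "TM195", "TM498", "TM498", "TM798", "TM798"],
    "Age": [22, 30, 25, 35, 28, 40],
    "Income": [30000, 40000, 45000, 55000, 80000, 100000],
})

# Mean of each numeric column within each product line.
profile = sample.groupby("Product")[["Age", "Income"]].mean()
print(profile)
```

&lt;p&gt;On the real data, the same call on &lt;code&gt;mydata&lt;/code&gt; should give one row per treadmill model, assuming the columns are named as above.&lt;/p&gt;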

&lt;h2&gt;
  
  
  Info:
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fsna121haifeso60pddcd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fsna121haifeso60pddcd.png" alt="Alt Text" width="800" height="542"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Histogram:
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;
&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="n"&gt;matplotlib&lt;/span&gt; &lt;span class="n"&gt;inline&lt;/span&gt;

&lt;span class="n"&gt;mydata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fysyfkigj7s44vwwerjka.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fysyfkigj7s44vwwerjka.png" alt="Alt Text" width="800" height="338"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fypt3u3y2tpz0w4p13iny.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fypt3u3y2tpz0w4p13iny.png" alt="Alt Text" width="800" height="362"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F4y2xieosjss4p3ccx8h3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F4y2xieosjss4p3ccx8h3.png" alt="Alt Text" width="800" height="352"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Boxplot:
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;seaborn&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;sns&lt;/span&gt;

&lt;span class="n"&gt;sns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;boxplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Gender&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Age&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mydata&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fqgjcg49qlskqto38kvqo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fqgjcg49qlskqto38kvqo.png" alt="Alt Text" width="800" height="458"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Pair Plot
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Flpjl1breld4jdrj6dlsc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Flpjl1breld4jdrj6dlsc.png" alt="Alt Text" width="800" height="420"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fpeanu3vxyw7lzwwacykh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fpeanu3vxyw7lzwwacykh.png" alt="Alt Text" width="800" height="288"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Crosstab:
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fsmxib8r6nmz3aohg76fa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fsmxib8r6nmz3aohg76fa.png" alt="Alt Text" width="800" height="571"&gt;&lt;/a&gt;&lt;/p&gt;
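&lt;p&gt;A cross-tabulation counts how often each pair of categories occurs together. The sketch below uses &lt;code&gt;pd.crosstab&lt;/code&gt; on a small, made-up sample; the &lt;code&gt;Product&lt;/code&gt; column name is an assumption about the dataset (&lt;code&gt;Gender&lt;/code&gt; appears in the boxplot code above).&lt;/p&gt;

```python
import pandas as pd

# Hypothetical rows for illustration only; not taken from the real dataset.
sample = pd.DataFrame({
    "Product": ["TM195", "TM195", "TM498", "TM498", "TM798", "TM798"],
    "Gender": ["Male", "Female", "Female", "Female", "Male", "Male"],
})

# Rows are products, columns are genders, cells are counts.
counts = pd.crosstab(sample["Product"], sample["Gender"])
print(counts)
```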

&lt;h2&gt;
  
  
  Countplot:
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F19gsli8lzzh1exa9c0n9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F19gsli8lzzh1exa9c0n9.png" alt="Alt Text" width="800" height="559"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Pivot table:
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Frek6doqrnwoobj48yeij.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Frek6doqrnwoobj48yeij.png" alt="Alt Text" width="800" height="492"&gt;&lt;/a&gt;&lt;/p&gt;
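&lt;p&gt;A pivot table works like a crosstab but aggregates a numeric column instead of counting rows. Here is a minimal sketch with &lt;code&gt;pd.pivot_table&lt;/code&gt; on made-up values; the &lt;code&gt;Product&lt;/code&gt; and &lt;code&gt;Miles&lt;/code&gt; column names are assumptions about the dataset.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical rows for illustration only; not taken from the real dataset.
sample = pd.DataFrame({
    "Product": ["TM195", "TM195", "TM195", "TM498", "TM498", "TM498"],
    "Gender": ["Male", "Male", "Female", "Male", "Female", "Female"],
    "Miles": [80, 100, 60, 120, 90, 110],
})

# Average expected miles per week for each product and gender combination.
table = pd.pivot_table(
    sample, index="Product", columns="Gender", values="Miles", aggfunc="mean"
)
print(table)
```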

&lt;h2&gt;
  
  
  Correlation with heat map:
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F48gnutsrbz6zz4gzncax.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F48gnutsrbz6zz4gzncax.png" alt="Alt Text" width="800" height="830"&gt;&lt;/a&gt;&lt;/p&gt;
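&lt;p&gt;The heat map is drawn from a correlation matrix. As a rough sketch, the matrix itself can be computed with &lt;code&gt;DataFrame.corr()&lt;/code&gt;; the values below are made up, and the &lt;code&gt;Usage&lt;/code&gt; and &lt;code&gt;Miles&lt;/code&gt; column names are assumptions about the dataset.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical rows for illustration only; not taken from the real dataset.
sample = pd.DataFrame({
    "Usage": [2, 3, 4, 5, 6],
    "Miles": [40, 60, 80, 100, 120],
    "Age": [40, 22, 35, 28, 31],
})

# Pearson correlation between every pair of numeric columns.
corr = sample.corr()
print(corr.round(2))
```

&lt;p&gt;Passing the result to &lt;code&gt;sns.heatmap(corr, annot=True)&lt;/code&gt; renders it as an annotated heat map.&lt;/p&gt;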

&lt;p&gt;Other useful links:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://numpy.org/"&gt;https://numpy.org/&lt;/a&gt;&lt;br&gt;
&lt;a href="https://pandas.pydata.org/"&gt;https://pandas.pydata.org/&lt;/a&gt;&lt;br&gt;
&lt;a href="https://seaborn.pydata.org/"&gt;https://seaborn.pydata.org/&lt;/a&gt;&lt;br&gt;
&lt;a href="https://matplotlib.org/"&gt;https://matplotlib.org/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Self Study: Data Science - Machine Learning journey : Day 3 (R Programming | ROC and AUC Curves)</title>
      <dc:creator>Vignesh C</dc:creator>
      <pubDate>Sun, 10 Jan 2021 23:08:50 +0000</pubDate>
      <link>https://dev.to/iamvigneshc/self-study-data-science-machine-learning-journey-day-3-8op</link>
      <guid>https://dev.to/iamvigneshc/self-study-data-science-machine-learning-journey-day-3-8op</guid>
      <description>&lt;p&gt;Before getting into this topic, you may want to refer to the Day 1 &lt;a href="https://dev.to/iamvigneshc/starting-my-data-science-machine-learning-journey-1j93"&gt;post&lt;/a&gt; for the basics of ROC and AUC, and the Day 2 &lt;a href="https://dev.to/iamvigneshc/self-study-data-science-machine-learning-journey-day-2-2cjl"&gt;post&lt;/a&gt; to install R and RStudio Desktop.&lt;/p&gt;

&lt;p&gt;Subscribe to my YouTube channel: &lt;a href="https://youtu.be/DPjFVNuMHaE"&gt;MyDigitalWorld - Cloud, DevOps, Data Science&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Day 1 video: &lt;a href="https://youtu.be/DPjFVNuMHaE"&gt;https://youtu.be/DPjFVNuMHaE&lt;/a&gt;&lt;br&gt;
Day 2 video: &lt;a href="https://youtu.be/WyFoSSvbNRA"&gt;https://youtu.be/WyFoSSvbNRA&lt;/a&gt;&lt;/p&gt;



&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/qesVU5hhIhw"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;Assume you have a set of 100 candidates preparing for a competitive examination. We are going to draw an ROC curve and compute the AUC in R, relating the number of hours spent preparing for the exam to the result obtained, either pass or fail. We will use the ROC curve to compare different thresholds and find the best one, and the AUC to determine which ML algorithm performed best.&lt;/p&gt;
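&lt;p&gt;To make the idea concrete before turning to R, here is a small, language-neutral sketch in Python of what an ROC curve and its AUC are. The function names &lt;code&gt;roc_points&lt;/code&gt; and &lt;code&gt;auc&lt;/code&gt; and the six scores are made up for illustration, the scores are assumed to be distinct, and this is not how pROC is implemented.&lt;/p&gt;

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    Assumes labels are 0 or 1 and that all scores are distinct.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    # Visit samples from the highest score down; each step lowers the threshold
    # just enough to call one more sample "pass".
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical predicted probabilities and true outcomes (1 = pass).
labels = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.3, 0.4, 0.5, 0.7, 0.9]
area = auc(roc_points(labels, scores))
print(round(area, 3))  # 8/9, about 0.889
```

&lt;p&gt;An AUC of 1 would mean every pass is scored above every fail, while 0.5 is what random guessing would give.&lt;/p&gt;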

&lt;h3&gt;
  
  
  1. Import libraries for ROC and Random forest algorithm
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pROC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;# install with install.packages("pROC")&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;randomForest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;# install with install.packages("randomForest")&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Generate random numbers for 100 samples for number of hours spent in learning
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;set.seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;420&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;# setting a seed ensures you get the same random numbers each time you run the same process&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="n"&gt;num.samples&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="c1"&gt;## generate 100 values from a normal distribution with&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c1"&gt;## mean 50 and standard deviation 12, then sort them&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;elearn&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rnorm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num.samples&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;sd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;12&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Decide whether each sample is a pass or a fail
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="c1"&gt;## rank(elearn) returns 1 for the least time spent, 2 for the second least time, ...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c1"&gt;##              ... and it returns 100 for the highest time spent for exam preparation&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c1"&gt;## So what we do is generate a random number between 0 and 1. Then we see if&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c1"&gt;## that number is less than rank/100. So, for the least sample, rank = 1.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c1"&gt;## This sample will be classified "pass" if we get a random number less than&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c1"&gt;## 1/100. For the second least sample, rank = 2, we get another random&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c1"&gt;## number between 0 and 1 and classify this sample "pass" if that random&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c1"&gt;## number is &amp;lt; 2/100. We repeat that process for all 100 samples&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;pass&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ifelse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;runif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num.samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;elearn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;num.samples&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;yes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;no&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;pass&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;## print out the contents of "pass" to show us which samples were&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="c1"&gt;## classified "pass" with 1, and which samples were classified&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="c1"&gt;## "not pass" with 0.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
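&lt;p&gt;For readers more comfortable with Python, the same labelling idea can be sketched with the standard library. This is an illustrative translation, not part of the R workflow; the variable names are made up, and Python's random stream differs from R's, so the individual labels will not match the R output.&lt;/p&gt;

```python
import random

random.seed(420)  # fixed seed so the sketch is repeatable

num_samples = 100
# Hours spent preparing, sorted ascending, so position i has rank i + 1.
hours = sorted(random.gauss(50, 12) for _ in range(num_samples))

# A sample is labelled "pass" (1) when a uniform draw falls below
# rank / num_samples, so more hours means a higher chance of passing.
passed = [
    1 if (rank / num_samples) > random.random() else 0
    for rank in range(1, num_samples + 1)
]
print(sum(passed), "of", num_samples, "samples labelled pass")
```

&lt;p&gt;The sample with the most hours has a pass probability of 100/100, so it is always labelled 1.&lt;/p&gt;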



&lt;h3&gt;
  
  
  4. Plot the data
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;elearn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Fit a logistic regression to the data
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;glm.fit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;glm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pass&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;elearn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;family&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;binomial&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;elearn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;glm.fit&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;fitted.values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
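&lt;p&gt;To show what the fit is doing under the hood, here is a rough Python sketch of logistic regression on one predictor. R's &lt;code&gt;glm&lt;/code&gt; uses iteratively reweighted least squares; this sketch swaps in plain gradient descent for simplicity. The names &lt;code&gt;fit_logistic&lt;/code&gt; and &lt;code&gt;predict&lt;/code&gt;, the step count, the learning rate and the ten data points are all assumptions made up for illustration.&lt;/p&gt;

```python
import math


def fit_logistic(xs, ys, steps=5000, rate=0.1):
    """Fit p(pass) = sigmoid(b0 + b1 * z) by gradient descent.

    z is the standardized predictor, (x - mean) / sd.
    """
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    zs = [(x - mean) / sd for x in xs]
    b0 = b1 = 0.0
    for _ in range(steps):
        preds = [1 / (1 + math.exp(-(b0 + b1 * z))) for z in zs]
        # Gradient of the average negative log-likelihood.
        g0 = sum(p - y for p, y in zip(preds, ys)) / n
        g1 = sum((p - y) * z for p, y, z in zip(preds, ys, zs)) / n
        b0 -= rate * g0
        b1 -= rate * g1
    return b0, b1, mean, sd


def predict(x, b0, b1, mean, sd):
    """Predicted probability of passing after x hours of preparation."""
    return 1 / (1 + math.exp(-(b0 + b1 * (x - mean) / sd)))


# Hypothetical hours and outcomes (1 = pass); deliberately not perfectly separable.
hours = [20, 30, 35, 40, 45, 50, 55, 60, 70, 80]
passed = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]
b0, b1, mean, sd = fit_logistic(hours, passed)
print(round(predict(20, b0, b1, mean, sd), 2), round(predict(80, b0, b1, mean, sd), 2))
```

&lt;p&gt;The predicted probabilities play the same role as &lt;code&gt;glm.fit$fitted.values&lt;/code&gt;: they are the scores that get thresholded when drawing the ROC curve.&lt;/p&gt;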



&lt;h3&gt;
  
  
  6. Draw ROC and AUC using pROC
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="c1"&gt;## ROC graphs should be square, since the x and y axes&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c1"&gt;## both go from 0 to 1. However, the window in which I draw them isn't square&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c1"&gt;## so extra whitespace is added to pad the sides.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;roc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;glm.fit&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;fitted.values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="c1"&gt;## Now let's configure R so that it prints the graph as a square.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c1"&gt;##&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;par&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pty&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"s"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;## pty sets the aspect ratio of the plot region. Two options:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c1"&gt;##                "s" - creates a square plotting region&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c1"&gt;##                "m" - (the default) creates a maximal plotting region&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;roc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;glm.fit&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;fitted.values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="c1"&gt;## NOTE: By default, roc() uses specificity on the x-axis and the values range&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c1"&gt;## from 1 to 0.  To use 1-specificity (i.e. the &lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c1"&gt;## False Positive Rate) on the x-axis, set "legacy.axes" to TRUE.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;roc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;glm.fit&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;fitted.values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;legacy.axes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="c1"&gt;## If you want to rename the x and y axes...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;roc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;glm.fit&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;fitted.values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;legacy.axes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;percent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;xlab&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"False Positive Percentage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ylab&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"True Positive Percentage"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="c1"&gt;## We can also change the color of the ROC line, and make it wider...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;roc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;glm.fit&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;fitted.values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;legacy.axes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;percent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;xlab&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"False Positive Percentage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ylab&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"True Positive Percentage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#377eb8"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;lwd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="c1"&gt;## If we want to find out the optimal threshold we can store the &lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="c1"&gt;## data used to make the ROC graph in a variable...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;roc.info&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;roc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;glm.fit&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;fitted.values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;legacy.axes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;roc.info&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="c1"&gt;## and then extract just the information that we want from that variable.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;roc.df&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;data.frame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;tpp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;roc.info&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;sensitivities&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;## tpp = true positive percentage&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;fpp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;roc.info&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;specificities&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;## fpp = false positive percentage&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;thresholds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;roc.info&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;thresholds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;roc.df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;## head() will show us the values for the upper right-hand corner&lt;/span&gt;&lt;span class="w"&gt;
             &lt;/span&gt;&lt;span class="c1"&gt;## of the ROC graph, when the threshold is so low &lt;/span&gt;&lt;span class="w"&gt;
             &lt;/span&gt;&lt;span class="c1"&gt;## (negative infinity) that every single sample is called "pass".&lt;/span&gt;&lt;span class="w"&gt;
             &lt;/span&gt;&lt;span class="c1"&gt;## Thus TPP = 100% and FPP = 100%&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="n"&gt;tail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;roc.df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;## tail() will show us the values for the lower left-hand corner&lt;/span&gt;&lt;span class="w"&gt;
             &lt;/span&gt;&lt;span class="c1"&gt;## of the ROC graph, when the threshold is so high (infinity) &lt;/span&gt;&lt;span class="w"&gt;
             &lt;/span&gt;&lt;span class="c1"&gt;## that every single sample is called "not pass". &lt;/span&gt;&lt;span class="w"&gt;
             &lt;/span&gt;&lt;span class="c1"&gt;## Thus, TPP = 0% and FPP = 0%&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="c1"&gt;## now let's look at the thresholds between TPP 60% and 80%...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;roc.df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;roc.df&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;tpp&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;60&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;roc.df&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;tpp&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;80&lt;/span&gt;&lt;span class="p"&gt;,]&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="c1"&gt;## We can calculate the area under the curve...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;roc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;glm.fit&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;fitted.values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;legacy.axes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;percent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;xlab&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"False Positive Percentage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ylab&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"True Positive Percentage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#377eb8"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;lwd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;print.auc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="c1"&gt;## ...and the partial area under the curve.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;roc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;glm.fit&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;fitted.values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;legacy.axes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;percent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;xlab&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"False Positive Percentage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ylab&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"True Positive Percentage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#377eb8"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;lwd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;print.auc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span 
class="n"&gt;print.auc.x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;partial.auc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;c&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;90&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;auc.polygon&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;auc.polygon.col&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"#377eb822"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  7. Fit the data with a random forest
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;rf.model&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;randomForest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;factor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;elearn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="c1"&gt;## ROC for random forest&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;roc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;rf.model&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;votes&lt;/span&gt;&lt;span class="p"&gt;[,&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;legacy.axes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;percent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;xlab&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"False Positive Percentage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ylab&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"True Positive Percentage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#4daf4a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;lwd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;print.auc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span 
class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  8. Layer logistic regression and random forest ROC graphs
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;roc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;glm.fit&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;fitted.values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;legacy.axes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;percent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;xlab&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"False Positive Percentage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ylab&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"True Positive Percentage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#377eb8"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;lwd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;print.auc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="n"&gt;plot.roc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;rf.model&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;votes&lt;/span&gt;&lt;span class="p"&gt;[,&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;percent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"#4daf4a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;lwd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;print.auc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;print.auc.y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;40&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"bottomright"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;c&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Logistic Regression"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Random Forest"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;c&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"#377eb8"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"#4daf4a"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;lwd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The final chart looks like this:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fgc6rfvn3gv0afuwzuvo9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fgc6rfvn3gv0afuwzuvo9.png" alt="Alt Text" width="800" height="605"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Refer to my &lt;a href="https://github.com/IamVigneshC/Machine-Learning-Data-Science/blob/master/R/ROC_AUC.R"&gt;GitHub repository&lt;/a&gt; for the code.&lt;/p&gt;

&lt;p&gt;Credits: StatQuest&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>beginners</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Self Study: Data Science - Machine Learning journey : Day 2 (Statistics | R | Python | Anaconda | Jupyter)</title>
      <dc:creator>Vignesh C</dc:creator>
      <pubDate>Sun, 10 Jan 2021 04:40:47 +0000</pubDate>
      <link>https://dev.to/iamvigneshc/self-study-data-science-machine-learning-journey-day-2-2cjl</link>
      <guid>https://dev.to/iamvigneshc/self-study-data-science-machine-learning-journey-day-2-2cjl</guid>
      <description>&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/WyFoSSvbNRA"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites:
&lt;/h2&gt;

&lt;p&gt;Statistics is generally considered a prerequisite for studying machine learning. We need statistics to transform observations into information and to answer questions about samples of observations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Statistics is needed in Machine Learning for:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fd4g97x4eygy3mss4sz8x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fd4g97x4eygy3mss4sz8x.png" alt="Alt Text" width="800" height="218"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Another prerequisite for data science and machine learning is a programming language, typically R or Python. R is mainly used for statistical analysis and model building, while Python reaches beyond statistics with a wide range of libraries and better integration with other programming languages.&lt;/p&gt;

&lt;h2&gt;
  
  
  Applied Statistics:
&lt;/h2&gt;

&lt;p&gt;Two broad categories in the field of statistics:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Descriptive statistics&lt;/li&gt;
&lt;li&gt;Inferential statistics&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Descriptive statistics is the process of categorizing and describing the information. &lt;/p&gt;

&lt;p&gt;Inferential statistics includes the process of analyzing a sample of data and using it to draw inferences about the population from which it was drawn.&lt;/p&gt;
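
&lt;p&gt;To make the difference concrete, here is a minimal Python sketch using only the standard library. The exam scores are made-up sample data, and the interval uses a rough normal approximation (z = 1.96) that I am assuming for illustration; for a sample this small a t-distribution would be the more careful choice.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import math
import statistics

# Hypothetical sample: exam scores of 10 students drawn from a larger class
sample = [62, 71, 58, 90, 77, 66, 84, 73, 69, 80]

# Descriptive statistics: summarise the sample itself
mean = statistics.mean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# Inferential statistics: use the sample to estimate the population mean.
# Rough 95% interval with the normal approximation (an assumption, see above).
standard_error = stdev / math.sqrt(len(sample))
ci_low = mean - 1.96 * standard_error
ci_high = mean + 1.96 * standard_error

print(f"mean={mean:.1f} median={median:.1f} stdev={stdev:.2f}")
print(f"approx. 95% interval for the population mean: {ci_low:.1f} to {ci_high:.1f}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The first three numbers only describe the ten scores we have. The interval is an inference: a statement about the whole class based on those ten scores.&lt;/p&gt;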

&lt;p&gt;We need to become familiar with all these concepts to continue our machine learning journey effectively. Most of them would have been covered as part of a graduate degree.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fubaqnuwuemf3iux33rth.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fubaqnuwuemf3iux33rth.png" alt="Alt Text" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Install RStudio
&lt;/h2&gt;

&lt;p&gt;Install R and RStudio Desktop for your operating system from &lt;a href="https://rstudio.com/products/rstudio/download/#download"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Sample R code to illustrate AUC and ROC from Day 1:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/IamVigneshC/Machine-Learning-Data-Science/blob/master/R/ROC_AUC.R"&gt;https://github.com/IamVigneshC/Machine-Learning-Data-Science/blob/master/R/ROC_AUC.R&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Install Python
&lt;/h2&gt;

&lt;p&gt;You can install and use Python from the command line or through &lt;a href="https://www.anaconda.com/products/individual"&gt;Anaconda&lt;/a&gt;, which comes with a &lt;a href="https://docs.python.org/3/tutorial/index.html"&gt;tutorial&lt;/a&gt; and references for various libraries.&lt;/p&gt;

&lt;p&gt;Once installed, you can open JupyterLab or Jupyter Notebook and start working in Python.&lt;/p&gt;

&lt;p&gt;Some of my samples to get started:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://anaconda.org/iamvigneshc"&gt;https://anaconda.org/iamvigneshc&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/IamVigneshC/Machine-Learning-Data-Science/tree/master/Python"&gt;https://github.com/IamVigneshC/Machine-Learning-Data-Science/tree/master/Python&lt;/a&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>beginners</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Starting my Data Science - Machine Learning journey : Day 1 (Confusion Matrix | Sensitivity | Specificity | Bias | Variance)</title>
      <dc:creator>Vignesh C</dc:creator>
      <pubDate>Fri, 08 Jan 2021 03:19:06 +0000</pubDate>
      <link>https://dev.to/iamvigneshc/starting-my-data-science-machine-learning-journey-1j93</link>
      <guid>https://dev.to/iamvigneshc/starting-my-data-science-machine-learning-journey-1j93</guid>
      <description>&lt;p&gt;Subscribe to my YouTube channel: &lt;a href="https://youtu.be/DPjFVNuMHaE"&gt;https://youtu.be/DPjFVNuMHaE&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/DPjFVNuMHaE"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h1&gt;
  
  
  Day 1
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fcg8bigxqqiytu24l74wv.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fcg8bigxqqiytu24l74wv.jpg" alt="Alt Text" width="800" height="440"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Types of Machine Learning
&lt;/h2&gt;

&lt;p&gt;Machine Learning Algorithms can be classified into 3 types as follows –&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Supervised Learning&lt;/li&gt;
&lt;li&gt;Unsupervised Learning&lt;/li&gt;
&lt;li&gt;Reinforcement Learning&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Supervised learning&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In Supervised Learning, the dataset on which we train our model is labeled, so there is a clear and distinct mapping between input and output. The model learns this mapping from the example inputs. An example of supervised learning is spam filtering: based on the labeled data, the model learns to determine whether a message is spam or ham. This is an easier form of training.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unsupervised Learning&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In Unsupervised Learning, there is no labeled data. The algorithm identifies patterns within the dataset and learns them, for example by grouping the data into clusters of similar points. It can also be used to visualize high-dimensional data. One example of this type of algorithm is Principal Component Analysis. K-Means Clustering is another, where the data is clustered into groups of similar items.&lt;/p&gt;

&lt;p&gt;The learning process in Unsupervised Learning is solely on the basis of finding patterns in the data. After learning the patterns, the model then makes conclusions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reinforcement Learning&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Reinforcement Learning is an emerging and increasingly popular type of Machine Learning algorithm. It is used in autonomous systems such as self-driving cars and industrial robots. The aim of the algorithm is to reach a goal in a dynamic environment, guided by the rewards that the system provides to it.&lt;/p&gt;

&lt;p&gt;It is most heavily used in programming robots to perform autonomous actions and in building intelligent self-driving cars. Consider the case of robotic navigation: the agent tries sequences of actions in its environment, and its efficiency improves with further experimentation. This is the main principle behind reinforcement learning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting started with the Basics
&lt;/h2&gt;

&lt;p&gt;Machine learning algorithms build a model based on sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to do so. This involves two steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Train the ML methods&lt;/li&gt;
&lt;li&gt;Test the ML methods&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We predict using different ML methods and document the results.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;confusion matrix&lt;/strong&gt; helps to compare different ML methods and decide which performs best. It tabulates the actual values against the predicted values for the testing data, and its size depends on the number of classes being predicted.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fm7118wr6kx9rtmupb0g7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fm7118wr6kx9rtmupb0g7.png" alt="Alt Text" width="800" height="567"&gt;&lt;/a&gt;&lt;/p&gt;
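
&lt;p&gt;As a small illustration, the four cells of a two-class confusion matrix can be counted in a few lines of Python. The labels below are made-up data for the spam example (1 = spam, 0 = ham), and the variable names are my own.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from collections import Counter

# Hypothetical actual and predicted labels for 10 emails (1 = spam, 0 = ham)
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 1, 0, 0, 1, 0, 0, 1]

# Count how often each (actual, predicted) pair occurs
counts = Counter(zip(actual, predicted))
tp = counts[(1, 1)]  # true positives: spam correctly flagged
fn = counts[(1, 0)]  # false negatives: spam that was missed
fp = counts[(0, 1)]  # false positives: ham wrongly flagged
tn = counts[(0, 0)]  # true negatives: ham correctly passed

print("              predicted spam  predicted ham")
print(f"actual spam   {tp:14d}  {fn:13d}")
print(f"actual ham    {fp:14d}  {tn:13d}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The diagonal (true positives and true negatives) holds the correct predictions; everything off the diagonal is a mistake of one kind or the other.&lt;/p&gt;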

&lt;p&gt;&lt;strong&gt;Cross Validation&lt;/strong&gt; is used to decide which machine learning method would be best for our dataset.&lt;/p&gt;
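
&lt;p&gt;One common form is k-fold cross validation: the data is split into k folds, and each fold takes a turn as the testing data while the rest is used for training. The sketch below only shows the splitting step; the function name is hypothetical, and in practice you would shuffle the data first, train each method on every training set and compare the average test scores.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists, one pair per fold."""
    indices = list(range(n_samples))
    for fold in range(k):
        # Every k-th sample, starting at position `fold`, is held out for testing
        test = indices[fold::k]
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


for train, test in k_fold_indices(6, 3):
    print("train:", train, "test:", test)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Every sample ends up in the testing data exactly once, so each method is judged on all of the data without ever being tested on samples it was trained on.&lt;/p&gt;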

&lt;p&gt;&lt;strong&gt;Sensitivity and Specificity&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sensitivity measures the proportion of positives that are correctly identified (i.e. the proportion of those who have some condition (affected) who are correctly identified as having the condition)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F9kc3nykcnv5p0ksi2rcu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F9kc3nykcnv5p0ksi2rcu.png" alt="Alt Text" width="800" height="190"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Specificity measures the proportion of negatives that are correctly identified (i.e. the proportion of those who do not have the condition (unaffected) who are correctly identified as not having the condition)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fewryyg4frcelflijtk5f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fewryyg4frcelflijtk5f.png" alt="Alt Text" width="800" height="185"&gt;&lt;/a&gt;&lt;/p&gt;
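
&lt;p&gt;Both measures follow directly from the confusion matrix counts. Here is a minimal sketch; the counts are made up for illustration and the function names are my own.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def sensitivity(tp, fn):
    """Share of actual positives that were identified: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of actual negatives that were identified: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical counts: 50 affected and 50 unaffected people
tp, fn, fp, tn = 40, 10, 5, 45

print("sensitivity:", sensitivity(tp, fn))  # 40 of 50 affected found
print("specificity:", specificity(tn, fp))  # 45 of 50 unaffected cleared
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With these counts the method finds 80% of the affected people and correctly clears 90% of the unaffected ones.&lt;/p&gt;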

&lt;p&gt;&lt;strong&gt;Bias and Variance&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The inability of a ML method to capture the true relationship is called Bias&lt;/li&gt;
&lt;li&gt;The difference in fits between data sets is called Variance (training vs testing data)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F62kqwvbn2awtnwbdhdbj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F62kqwvbn2awtnwbdhdbj.png" alt="Alt Text" width="660" height="494"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ROC and AUC&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;ROC (Receiver Operating Characteristic) graphs and AUC (the area under the curve) are useful for consolidating the information from a ton of confusion matrices into a single, easy-to-interpret graph.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ROC curve makes it easy to identify the best threshold for making a decision&lt;/li&gt;
&lt;li&gt;AUC helps in deciding which categorization method is better&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fleaxea7yd7gfxhafi226.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fleaxea7yd7gfxhafi226.png" alt="Alt Text" width="800" height="550"&gt;&lt;/a&gt;&lt;/p&gt;
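
&lt;p&gt;To see where the curve comes from, here is a rough Python sketch that builds the ROC points by lowering the threshold one sample at a time and then estimates the AUC with the trapezoidal rule. The labels and scores are made up, the function names are my own, and the sketch assumes that no two samples share the same score.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points,
    sweeping the threshold from the highest score to the lowest."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Rank the samples by predicted score, highest first
    ranked = sorted(zip(scores, labels), reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]  # threshold so high that nothing is called positive
    for _score, label in ranked:
        if label == 1:
            tp += 1  # lowering the threshold picked up a true positive
        else:
            fp += 1  # lowering the threshold picked up a false positive
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC points by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical data: 1 = positive class, scores are predicted probabilities
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
print(points)
print("AUC:", auc(points))  # 8/9, about 0.89, for this data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each point corresponds to one confusion matrix at one threshold, which is why a single ROC graph can stand in for many matrices. An AUC of 1 would mean every positive is ranked above every negative, while 0.5 is what random guessing gives.&lt;/p&gt;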

</description>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>beginners</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>GCP - Streaming IoT Kafka to PubSub</title>
      <dc:creator>Vignesh C</dc:creator>
      <pubDate>Wed, 06 Jan 2021 04:05:33 +0000</pubDate>
      <link>https://dev.to/iamvigneshc/gcp-streaming-iot-kafka-to-pubsub-3o9i</link>
      <guid>https://dev.to/iamvigneshc/gcp-streaming-iot-kafka-to-pubsub-3o9i</guid>
      <description>&lt;p&gt;Subscribe to my YouTube channel : &lt;a href="https://www.youtube.com/channel/UCvJ2JvEAzFi_cU8kPFdppjA"&gt;MyDigitalWorld&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Objectives:
&lt;/h2&gt;

&lt;p&gt;• Launch a Kafka instance and use it to communicate with Pub/Sub&lt;/p&gt;

&lt;p&gt;• Configure a Kafka connector to integrate with Pub/Sub&lt;/p&gt;

&lt;p&gt;• Set up topics and subscriptions for message communication&lt;/p&gt;

&lt;p&gt;• Perform basic testing of both Kafka and Pub/Sub services&lt;/p&gt;

&lt;p&gt;• Connect IoT Core to Pub/Sub&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture:
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F2szhikz58n2yt3a8pv5i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F2szhikz58n2yt3a8pv5i.png" alt="Arch" width="800" height="676"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction:
&lt;/h2&gt;

&lt;p&gt;With the announcement of the Confluent managed Kafka offering on Google Cloud, it has never been easier to use Google Cloud's data tools with Kafka. You can use the Apache Beam KafkaIO connector to go straight into Dataflow, but this may not always be the right solution.&lt;/p&gt;

&lt;p&gt;Whether Kafka is provisioned in the cloud or on premises, you might want to push to a subset of Pub/Sub topics. Why? For the flexibility of having Pub/Sub as your Google Cloud event notifier. Then you could not only choreograph Dataflow jobs, but also use topics to trigger Cloud Functions.&lt;/p&gt;

&lt;p&gt;So how do you exchange messages between Kafka and Pub/Sub? This is where the Pub/Sub Kafka Connector comes in handy.&lt;/p&gt;

&lt;p&gt;Tip: Here we use a virtual machine with a single instance of Kafka. This Kafka instance connects to Pub/Sub and exchanges event messages between the two services.&lt;/p&gt;

&lt;p&gt;In the real world, Kafka would likely be run in a cluster.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F7mzymeirem0m3e45fxqr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F7mzymeirem0m3e45fxqr.png" alt="Arch" width="800" height="496"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Details:
&lt;/h2&gt;

&lt;p&gt;YouTube link: &lt;a href="https://tinyurl.com/y6dd28vr"&gt;https://tinyurl.com/y6dd28vr&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/IamVigneshC/GCP-Streaming-IoT-Kafka-to-PubSub"&gt;https://github.com/IamVigneshC/GCP-Streaming-IoT-Kafka-to-PubSub&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blog: &lt;a href="https://tinyurl.com/y4q9k4cp"&gt;https://tinyurl.com/y4q9k4cp&lt;/a&gt;&lt;/p&gt;

</description>
      <category>googlecloud</category>
      <category>serverless</category>
      <category>pubsub</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Confluent: Developing a Streaming Microservices Application - Kafka</title>
      <dc:creator>Vignesh C</dc:creator>
      <pubDate>Wed, 30 Dec 2020 03:46:06 +0000</pubDate>
      <link>https://dev.to/iamvigneshc/confluent-developing-a-streaming-microservices-application-kafka-3d1e</link>
      <guid>https://dev.to/iamvigneshc/confluent-developing-a-streaming-microservices-application-kafka-3d1e</guid>
      <description>&lt;h2&gt;
  
  
  Background
&lt;/h2&gt;

&lt;p&gt;When we build services using a streaming platform, some will be stateless: simple functions that take an input, perform a business operation, and produce an output. Some will be stateful but read-only, as when views need to be created so we can serve remote queries. Others will need to both read and write state, either entirely inside the Kafka ecosystem or by calling out to other services or databases. Having all of these approaches available makes Kafka’s Streams API a powerful tool for building event-driven services.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.confluent.io/blog/building-a-microservices-ecosystem-with-kafka-streams-and-ksql/"&gt;https://www.confluent.io/blog/building-a-microservices-ecosystem-with-kafka-streams-and-ksql/&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Objectives
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Persist events into Kafka by producing records that represent customer orders&lt;/li&gt;
&lt;li&gt;Write a service that validates customer orders&lt;/li&gt;
&lt;li&gt;Write a service that joins streaming order information with streaming payment information and data from a customer database&lt;/li&gt;
&lt;li&gt;Define a set of criteria to filter records in a stream&lt;/li&gt;
&lt;li&gt;Create a session window to define five-minute windows for processing&lt;/li&gt;
&lt;li&gt;Create a state store for the Inventory Service&lt;/li&gt;
&lt;li&gt;Create one persistent query that enriches the orders stream with customer information using a stream-table join&lt;/li&gt;
&lt;/ul&gt;
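
&lt;p&gt;To give a feel for the session window objective before diving into the steps, the sketch below groups event timestamps into sessions in plain Python. It is not the Kafka Streams API: the function name and event data are hypothetical, and treating five minutes (300 seconds) as the inactivity gap that separates sessions is an assumption about how the objective is meant.&lt;/p&gt;

```python
def session_windows(timestamps, gap_seconds=300):
    """Group event timestamps (in seconds) into sessions split by inactivity."""
    sessions = []
    for ts in sorted(timestamps):
        # A gap longer than `gap_seconds` since the previous event closes the
        # current session and opens a new one.
        if not sessions or ts - sessions[-1][-1] > gap_seconds:
            sessions.append([ts])
        else:
            sessions[-1].append(ts)
    return sessions


# Made-up order events for one customer, as seconds since some start time.
events = [0, 60, 200, 900, 1000, 2000]
print(session_windows(events))
```

&lt;p&gt;Unlike fixed-size windows, a session here has no predetermined length: it keeps growing as long as events continue to arrive within the gap.&lt;/p&gt;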

&lt;p&gt;Check out the steps &lt;a href="https://iamvigneshc.medium.com/confluent-developing-a-streaming-microservices-application-494680ce5e6"&gt;here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Additional References:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.confluent.io/platform/current/streams/developer-guide/dsl-api.html"&gt;https://docs.confluent.io/platform/current/streams/developer-guide/dsl-api.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.confluent.io/blog/distributed-real-time-joins-and-aggregations-on-user-activity-events-using-kafka-streams/"&gt;https://www.confluent.io/blog/distributed-real-time-joins-and-aggregations-on-user-activity-events-using-kafka-streams/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.confluent.io/product/ksql/"&gt;https://www.confluent.io/product/ksql/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/confluentinc"&gt;https://github.com/confluentinc&lt;/a&gt;&lt;/p&gt;

</description>
      <category>kafka</category>
      <category>microservices</category>
      <category>realtime</category>
      <category>streaming</category>
    </item>
    <item>
      <title>GCP - Automating DevOps Workflows with GitLab and Terraform</title>
      <dc:creator>Vignesh C</dc:creator>
      <pubDate>Tue, 29 Dec 2020 00:59:48 +0000</pubDate>
      <link>https://dev.to/iamvigneshc/gcp-automating-devops-workflows-with-gitlab-and-terraform-egi</link>
      <guid>https://dev.to/iamvigneshc/gcp-automating-devops-workflows-with-gitlab-and-terraform-egi</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;GitLab is a complete DevOps platform delivered as a single application. It provides a single UI that lets development and operations teams work concurrently throughout the life cycle from application development to delivery. As applications become more complex, organizations turn to automation, such as CI/CD, in hopes of streamlining processes. To increase reliability and consistency in automation beyond software development, more and more teams are adopting declarative automation, defining their workflows in code and storing them in code repositories. Triggering this automation as code, often and dynamically, to iterate quickly and deliver value into production is becoming increasingly popular. This practice is commonly known as GitOps, where tools such as Terraform and GitLab shine.&lt;/p&gt;

&lt;p&gt;Google Cloud provides scalable cloud infrastructure with managed services for various forms of compute, storage, networking, and more. With extensive open APIs, almost every aspect of it can be automated and managed via code with the proper tools. Organizations looking to adopt cloud-native solutions in an agile way need a better way to manage the inherent complexities of elastically scaling their services, breaking up monolithic applications, and operating securely at speed through cross-functional teams. These needs have been the foundation of the DevOps transformation as we recognize it today.&lt;/p&gt;

&lt;p&gt;The fundamental challenge DevOps attempts to address is the unification of developer and operator workflows. When automating the application development process, Continuous Integration / Continuous Delivery (CI/CD) is crucial for deploying code iteratively, reliably, and securely from source code into production. When automating infrastructure administration and provisioning, Infrastructure as Code (IaC) is an emerging method to quickly and reliably build deployment targets for applications. IaC enables practitioners to deploy infrastructure through declarative means and maintain the desired state and lifecycle of cloud resources.&lt;/p&gt;

&lt;p&gt;GitLab is a single application that provides the functionality DevOps teams need out of the box for a seamless, low-maintenance, just-commit-code software development and delivery experience. GitLab enables teams to plan sprints and projects, organize code in Git repositories, operationalize application code with CI/CD pipelines, secure source code through vulnerability scanning, and much more. GitLab also provides the means to construct flexible automation workflows that complement other tools such as Terraform. Terraform is a tool for building, changing, and versioning infrastructure safely and efficiently. The infrastructure Terraform can manage includes low-level components such as compute instances, storage, and networking, as well as high-level components such as DNS entries and SaaS features. Configured together, GitLab and Terraform give DevOps teams the capability to manage their cloud through IaC, continuously and reliably.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sample Application Overview
&lt;/h2&gt;

&lt;p&gt;We will be using a sample application called Vote-App, which is representative of an N-tier, containerized microservice architecture and will be deployed to a Kubernetes cluster provided by GKE. Below is a high-level overview of the sample application architecture:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F0ze1qbebcj0gl2j3bwhj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F0ze1qbebcj0gl2j3bwhj.png" alt="Alt Text" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vote UI:&lt;/strong&gt; A containerized Python microservice that uses Flask to generate a front-end UI where end users place a 'vote'. This service will be running on GKE.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Results UI:&lt;/strong&gt; A containerized Node.js microservice that generates a front-end UI where end users see voting results. This service will be running on GKE.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Redis:&lt;/strong&gt; A managed Redis service powered by Cloud Memorystore, which will serve as a cached queuing system for votes. This managed service will store aggregated votes inserted by the Vote UI service.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Postgres:&lt;/strong&gt; A managed Postgres service powered by Cloud SQL, which will serve as the central database for all votes cast and accounted for. This managed service will be queried by the Results UI to show results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Worker:&lt;/strong&gt; A containerized .NET Core microservice that reads votes from the queue stored in the Redis cache and enters them into the Postgres database. This service will be running on GKE.&lt;/p&gt;

&lt;h2&gt;
  
  
  DevOps Workflow Overview
&lt;/h2&gt;

&lt;p&gt;You will have access to two GitLab users, Developer and Operator. After the initial GitLab configuration and after importing all relevant projects, you will role-play deploying microservices and managing infrastructure via IaC, automated through CI/CD from code commit to running in production.&lt;/p&gt;

&lt;p&gt;You will start with only a single, pre-provisioned GKE cluster as a deployment target for your microservices. In this example, we will simply be committing to the master branch to deploy into a single staging environment. The high-level workflow is as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The user will commit code to the application code repositories to scan, package and deploy the service.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;This will kick off a series of CI/CD pipelines to&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Execute code security scanning for vulnerabilities&lt;/li&gt;
&lt;li&gt;Package the code into a Docker container&lt;/li&gt;
&lt;li&gt;Trigger a child pipeline to provision dependent infrastructure&lt;/li&gt;
&lt;li&gt;Deploy all necessary Kubernetes resources (pods, services, secrets, etc.) to GKE.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;

&lt;p&gt;The Terraform child CI/CD pipeline will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use Terraform to init, plan, and apply to provision a Memorystore instance and a Cloud SQL Postgres instance on GCP.&lt;/li&gt;
&lt;li&gt;The CI/CD pipeline will also update and deploy a ConfigMap and Secrets resource to Kubernetes that the microservices will reference for appropriate DB endpoints and credentials.&lt;/li&gt;
&lt;li&gt;Lastly, the services will then be deployed to Kubernetes and be accessible through web endpoints by the end user.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;
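
&lt;p&gt;The init, plan, and apply sequence in the Terraform child pipeline can be sketched as an ordered list of CLI invocations. The snippet below is only an illustration in Python: the function name and the "infra" directory are hypothetical, the actual pipeline is defined in the projects' .gitlab-ci.yml files rather than in a script like this, and the -chdir option assumes Terraform 0.14 or later.&lt;/p&gt;

```python
import shlex


def terraform_steps(workdir, plan_file="tfplan"):
    """Return the init, plan, and apply commands in the order they run."""
    base = ["terraform", f"-chdir={workdir}"]
    return [
        base + ["init", "-input=false"],
        base + ["plan", "-input=false", f"-out={plan_file}"],
        base + ["apply", "-input=false", plan_file],
    ]


# "infra" is a hypothetical directory holding the Terraform configuration.
for command in terraform_steps("infra"):
    # A CI job could pass each list to subprocess.run(command, check=True).
    print(shlex.join(command))
```

&lt;p&gt;Saving the plan to a file and applying that exact file means the apply step executes what the plan step showed, which suits a non-interactive pipeline.&lt;/p&gt;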

&lt;p&gt;The workflow will be concerned with a single GitLab subgroup spanning multiple GitLab Projects outlined below. Each project listed below is a git repository with a .gitlab-ci.yml file already defined for an automated pipeline to execute upon code commit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fd01nmq2h5iddfxii2h15.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fd01nmq2h5iddfxii2h15.png" alt="Alt Text" width="800" height="561"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Continue reading &lt;a href="https://iamvigneshc.medium.com/automating-devops-workflows-with-gitlab-and-terraform-6113400fa5c6"&gt;here&lt;/a&gt;...&lt;/p&gt;

&lt;h3&gt;
  
  
  Additional References:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.terraform.io/docs/index.html"&gt;https://www.terraform.io/docs/index.html&lt;/a&gt;&lt;br&gt;
&lt;a href="https://docs.gitlab.com/ee/ci/"&gt;https://docs.gitlab.com/ee/ci/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>devops</category>
      <category>git</category>
      <category>terraform</category>
      <category>kubernetes</category>
    </item>
  </channel>
</rss>
