<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Avinash Gupta</title>
    <description>The latest articles on DEV Community by Avinash Gupta (@tier3guy).</description>
    <link>https://dev.to/tier3guy</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F707820%2F16b0e443-c535-40f2-b3b4-d73fea167d71.jpeg</url>
      <title>DEV Community: Avinash Gupta</title>
      <link>https://dev.to/tier3guy</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tier3guy"/>
    <language>en</language>
    <item>
      <title>Build your own Cryptocurrency !</title>
      <dc:creator>Avinash Gupta</dc:creator>
      <pubDate>Sun, 26 Mar 2023 06:13:18 +0000</pubDate>
      <link>https://dev.to/tier3guy/build-your-own-cryptocurrency--fa2</link>
      <guid>https://dev.to/tier3guy/build-your-own-cryptocurrency--fa2</guid>
      <description>&lt;p&gt;Hi all, this is Avinash once again with a small blog post. Let's start this blog with a quick question. Have you ever thought of building your own crypto. Look dont lie and don't forget to drop your answer in the comment box. I am asking this because, this idea came to my mind not only once but a lot of times. So, in this post lets do the same. &lt;strong&gt;Lets create our own crypto.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To build our crypto we will be using an ERC20 smart contract. For those who don't know the difference between a coin and an ERC20 token, read the upcoming content carefully. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Difference between ERC20 Token and a Coin&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The main difference between a coin and an ERC20 token is that coins are standalone currencies, whereas ERC20 tokens are created and operate on the Ethereum blockchain.&lt;/p&gt;

&lt;p&gt;Coins have their own independent blockchain network and are often designed to serve as a currency or a means of value transfer. Examples of coins include Bitcoin, Litecoin, and Ripple.&lt;/p&gt;

&lt;p&gt;On the other hand, ERC20 tokens are created using the Ethereum blockchain and are subject to the rules and protocols of the Ethereum network. &lt;/p&gt;

&lt;p&gt;If that still sounds abstract, here it is in simple terms: coins are built from scratch and have their own blockchain network, while an ERC20 token is a set of rules, a protocol, that runs on the Ethereum blockchain in the form of a smart contract. There are many ERC20 smart contract implementations out there. A programmer can import one, override it, and build his or her own digital asset. &lt;/p&gt;

&lt;p&gt;And this is exactly what we are going to do in this blog post to build our crypto. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Building our Token&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As I said, there are many ERC20 smart contracts out there. For this post we are going to use OpenZeppelin. You can read more about it &lt;a href="https://www.openzeppelin.com/"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I think I am boring you by now, so let's start the coding part of this blog. If you have understood the concepts above you will have an upper hand over others, but if you didn't, you can still continue. &lt;em&gt;Code speaks more than theory&lt;/em&gt; - that is what I believe. With that said, let's start. &lt;/p&gt;

&lt;p&gt;I am calling my token AviToken. Yeah, you got it right, it comes from my name, Avinash. For the symbol I am using AVT, the same way Ethereum uses ETH. You are free to use your own name and symbol.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// SPDX-License-Identifier: MIT

pragma solidity ^0.8.0;

// This is OpenZeppelin's ERC20 token smart contract.
// This is how we import a contract in Solidity, using the import keyword.
import "https://github.com/OpenZeppelin/openzeppelin-contracts/blob/master/contracts/token/ERC20/ERC20.sol";


// Now we extend this contract and override what we need.
// How? Simple: with the 'is' (inheritance) keyword.
contract AviToken is ERC20 {

    // creating the constructor so that we can pass our initial supply 
    // of the tokens at the time of deployment.

    // calling the ERC20 contract's constructor to set the name and symbol of the token
    constructor(uint256 _initialSupply) ERC20("AviToken", "AVT") {

        // Up to this point everything will compile and work fine.
        // Next we have to transfer the whole initial supply of tokens
        // to the author (deployer) of this contract. Let's do that.

        _mint(msg.sender, _initialSupply);

        // The _mint function does exactly that: it tells the contract to
        // credit the entire initial supply to the author's address
        // (msg.sender ---&amp;gt; the address of the current caller, which at
        // construction time is the deployer)

        // and yes you are done.
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Try running the above code in the &lt;a href="https://remix.ethereum.org/"&gt;Remix IDE&lt;/a&gt;. It will look something like this. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--B4pcqy1Z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/mm40ia2607b0l1sj41eg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--B4pcqy1Z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/mm40ia2607b0l1sj41eg.png" alt="Image description" width="880" height="462"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Deploy it using the Ethereum symbol shown in the sidebar. &lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Point to be noted: we will be giving values in the smallest unit (like wei), i.e. 1 token = 10^18 units, because ERC20 tokens use 18 decimals by default.&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;So, if we want one million tokens as the initial supply, we have to write 1000000 followed by 18 zeros: 1000000000000000000000000.&lt;/p&gt;
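&lt;p&gt;To double-check that arithmetic, the conversion from whole tokens to the 18-decimal base unit can be sketched in plain Python (this is just a sanity check, not part of the contract):&lt;/p&gt;

```python
# ERC20 tokens use 18 decimal places by default, so amounts are
# passed to the contract in the smallest unit (like wei for ETH).
DECIMALS = 18

def to_base_units(tokens: int) -> int:
    # 1 whole token = 10^18 base units
    return tokens * 10 ** DECIMALS

# The one-million-token initial supply from the example above:
print(to_base_units(1_000_000))
```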

&lt;p&gt;Like this, &lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ZCdV_qZa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/90b42auikekyzrsz95qo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ZCdV_qZa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/90b42auikekyzrsz95qo.png" alt="Image description" width="683" height="222"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now the contract is deployed. Congratulations, you have successfully created your first ever cryptocurrency! &lt;/p&gt;

&lt;p&gt;Now you can interact with this contract, that is, transfer tokens from one account to another, using the UI buttons provided by the Remix IDE. &lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Oq-COkvl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ffr3bhgeyfrr1c2jneo6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Oq-COkvl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ffr3bhgeyfrr1c2jneo6.png" alt="Image description" width="880" height="443"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That is it for today. I hope you enjoyed learning with me. Please drop your comments, questions, or anything else you want to share below.&lt;/p&gt;

</description>
      <category>solidity</category>
      <category>blockchain</category>
      <category>erc</category>
      <category>web3</category>
    </item>
    <item>
      <title>Lasso Regression Analysis</title>
      <dc:creator>Avinash Gupta</dc:creator>
      <pubDate>Sun, 30 Oct 2022 18:26:29 +0000</pubDate>
      <link>https://dev.to/tier3guy/lasso-regression-analysis-54g</link>
      <guid>https://dev.to/tier3guy/lasso-regression-analysis-54g</guid>
      <description>&lt;h2&gt;
  
  
  What is Lasso regression:
&lt;/h2&gt;

&lt;p&gt;Lasso Regression is a type of linear regression that uses shrinkage. Shrinkage is where data values are shrunk towards a central point, like the mean. The lasso procedure encourages simple, sparse models (i.e. models with fewer parameters). This particular type of regression is well-suited for models showing high levels of multicollinearity or when you want to automate certain parts of model selection, like variable selection/parameter elimination.&lt;/p&gt;

&lt;p&gt;The acronym “LASSO” stands for Least Absolute Shrinkage and Selection Operator.&lt;/p&gt;

&lt;h2&gt;
  
  
  L1 Regularization:
&lt;/h2&gt;

&lt;p&gt;Lasso Regression performs L1 regularization, which adds a penalty equal to the absolute value of the magnitude of coefficients. This type of regularization can result in sparse models with few coefficients; some coefficients can become exactly zero and be eliminated from the model. Larger penalties result in coefficient values closer to zero, which is ideal for producing simpler models. On the other hand, L2 regularization (e.g. Ridge regression) doesn’t result in the elimination of coefficients or sparse models. This makes the Lasso far easier to interpret than Ridge.&lt;/p&gt;

&lt;h2&gt;
  
  
  Performing the Regression:
&lt;/h2&gt;

&lt;p&gt;Lasso solutions are quadratic programming problems, which are best solved with software (like MATLAB). The goal of the algorithm is to minimize:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--zUL8GBGB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/kgtud58qv60j6yuncck3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--zUL8GBGB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/kgtud58qv60j6yuncck3.png" alt="Lasso Solution" width="364" height="97"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is the same as minimizing the sum of squares subject to the constraint Σ|βj| ≤ s (Σ = summation notation). Some of the βj are shrunk to exactly zero, resulting in a regression model that’s easier to interpret.&lt;/p&gt;
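&lt;p&gt;Written out in standard notation (n observations, p predictors), this constrained form of the lasso estimate is:&lt;/p&gt;

```latex
\hat{\beta} = \arg\min_{\beta}\ \sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\Big)^2
\quad \text{subject to} \quad \sum_{j=1}^{p}\lvert\beta_j\rvert \le s
```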

&lt;p&gt;A tuning parameter, λ, controls the strength of the L1 penalty; λ is essentially the amount of shrinkage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When λ = 0, no parameters are eliminated. The estimate is equal to the one found with linear regression.&lt;/li&gt;
&lt;li&gt;As λ increases, more and more coefficients are set to zero and eliminated (theoretically, when λ = ∞, all coefficients are eliminated).&lt;/li&gt;
&lt;li&gt;As λ increases, bias increases.&lt;/li&gt;
&lt;li&gt;As λ decreases, variance increases.&lt;/li&gt;
&lt;/ul&gt;
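&lt;p&gt;The effect of the penalty strength can be seen directly with scikit-learn's Lasso (scikit-learn calls λ &lt;code&gt;alpha&lt;/code&gt;). This is a small illustrative sketch on synthetic data; the data and alpha values are made up for demonstration:&lt;/p&gt;

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data: 10 predictors, but only the first two truly matter.
rng = np.random.RandomState(0)
X = rng.randn(100, 10)
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.randn(100) * 0.1

# As alpha (the penalty strength) grows, more coefficients are
# driven to exactly zero.
for alpha in [0.01, 0.1, 1.0]:
    coef = Lasso(alpha=alpha).fit(X, y).coef_
    print(f"alpha={alpha}: {np.sum(coef == 0)} of 10 coefficients are exactly zero")
```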

&lt;h2&gt;
  
  
  Code to Explain Lasso Regression Analysis:
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pandas as pd
import numpy as np
import matplotlib.pylab as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LassoLarsCV
from sklearn import preprocessing
from sklearn.metrics import mean_squared_error
pd.options.mode.chained_assignment = None
%matplotlib inline 
RND_STATE = 45123
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Loading Data:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;data = pd.read_csv("data/tree_addhealth.csv")
data.columns = map(str.upper, data.columns)
data.describe()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Output of Loading Data:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0d4jxrIE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/igxu9z4a4xywaknprj61.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0d4jxrIE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/igxu9z4a4xywaknprj61.png" alt="Data set" width="880" height="217"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Processing:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;data_clean = data.dropna()

predvar = data_clean[['BIO_SEX','HISPANIC','WHITE','BLACK','NAMERICAN','ASIAN',
'AGE','ALCEVR1','ALCPROBS1','MAREVER1','COCEVER1','INHEVER1','CIGAVAIL','DEP1',
'ESTEEM1','VIOL1','PASSIST','DEVIANT1','SCHCONN1','EXPEL1','FAMCONCT','PARACTV',
'PARPRES']]

target = data_clean.GPA1

recode = {1:1, 2:0}
data_clean['BIO_SEX'] = data_clean['BIO_SEX'].map(recode)

predictors=predvar.copy()

predictors['BIO_SEX']=preprocessing.scale(predictors['BIO_SEX'].astype('float64'))
predictors['HISPANIC']=preprocessing.scale(predictors['HISPANIC'].astype('float64'))
predictors['WHITE']=preprocessing.scale(predictors['WHITE'].astype('float64'))
predictors['NAMERICAN']=preprocessing.scale(predictors['NAMERICAN'].astype('float64'))
predictors['ASIAN']=preprocessing.scale(predictors['ASIAN'].astype('float64'))
predictors['AGE']=preprocessing.scale(predictors['AGE'].astype('float64'))
predictors['ALCEVR1']=preprocessing.scale(predictors['ALCEVR1'].astype('float64'))
predictors['ALCPROBS1']=preprocessing.scale(predictors['ALCPROBS1'].astype('float64'))
predictors['MAREVER1']=preprocessing.scale(predictors['MAREVER1'].astype('float64'))
predictors['COCEVER1']=preprocessing.scale(predictors['COCEVER1'].astype('float64'))
predictors['INHEVER1']=preprocessing.scale(predictors['INHEVER1'].astype('float64'))
predictors['CIGAVAIL']=preprocessing.scale(predictors['CIGAVAIL'].astype('float64'))
predictors['DEP1']=preprocessing.scale(predictors['DEP1'].astype('float64'))
predictors['ESTEEM1']=preprocessing.scale(predictors['ESTEEM1'].astype('float64'))
predictors['VIOL1']=preprocessing.scale(predictors['VIOL1'].astype('float64'))
predictors['PASSIST']=preprocessing.scale(predictors['PASSIST'].astype('float64'))
predictors['DEVIANT1']=preprocessing.scale(predictors['DEVIANT1'].astype('float64'))
predictors['SCHCONN1']=preprocessing.scale(predictors['SCHCONN1'].astype('float64'))
predictors['EXPEL1']=preprocessing.scale(predictors['EXPEL1'].astype('float64'))
predictors['FAMCONCT']=preprocessing.scale(predictors['FAMCONCT'].astype('float64'))
predictors['PARACTV']=preprocessing.scale(predictors['PARACTV'].astype('float64'))
predictors['PARPRES']=preprocessing.scale(predictors['PARPRES'].astype('float64'))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Making train test split:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pred_train, pred_test, tar_train, tar_test = train_test_split(predictors, target, test_size=.3, random_state=RND_STATE)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Fitting LassoLarsCV
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model = LassoLarsCV(cv=10, precompute=False).fit(pred_train,tar_train)

dict(zip(predictors.columns, model.coef_))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--TnqPCq2r--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/50iom2hmv7j7ncutcl95.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--TnqPCq2r--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/50iom2hmv7j7ncutcl95.png" alt="model" width="562" height="594"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;m_log_alphas = -np.log10(model.alphas_)
ax = plt.gca()
plt.plot(m_log_alphas, model.coef_path_.T)
plt.axvline(-np.log10(model.alpha_), linestyle='--', color='k',
            label='alpha CV')
plt.ylabel('Regression Coefficients')
plt.xlabel('-log(alpha)')
plt.title('Regression Coefficients Progression for Lasso Paths')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Qp_yeEFM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/u0y0vcuur5cagu5ax7jy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Qp_yeEFM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/u0y0vcuur5cagu5ax7jy.png" alt="graph" width="388" height="278"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;m_log_alphascv = -np.log10(model.cv_alphas_)
plt.figure()
plt.plot(m_log_alphascv, model.mse_path_, ':')
plt.plot(m_log_alphascv, model.mse_path_.mean(axis=-1), 'k',
         label='Average across the folds', linewidth=2)
plt.axvline(-np.log10(model.alpha_), linestyle='--', color='k',
            label='alpha CV')
plt.legend()
plt.xlabel('-log(alpha)')
plt.ylabel('Mean squared error')
plt.title('Mean squared error on each fold')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yXXY0VtR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ynmtidazc3olwhqgumi9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yXXY0VtR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ynmtidazc3olwhqgumi9.png" alt="mean squared error" width="395" height="278"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;train_error = mean_squared_error(tar_train, model.predict(pred_train))
test_error = mean_squared_error(tar_test, model.predict(pred_test))
print('training data MSE', train_error)
print('test data MSE', test_error)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;training data MSE 0.473352755862&lt;br&gt;
test data MSE 0.462008061824&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;rsquared_train=model.score(pred_train,tar_train)
rsquared_test=model.score(pred_test,tar_test)
print('training data R-square', rsquared_train)
print('test data R-square', rsquared_test)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;training data R-square 0.208850928224&lt;br&gt;
test data R-square 0.204190123696&lt;/code&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Cluster Analysis using K-means</title>
      <dc:creator>Avinash Gupta</dc:creator>
      <pubDate>Sun, 30 Oct 2022 16:50:07 +0000</pubDate>
      <link>https://dev.to/tier3guy/cluster-analysis-using-k-means-3d1l</link>
      <guid>https://dev.to/tier3guy/cluster-analysis-using-k-means-3d1l</guid>
      <description>&lt;h2&gt;
  
  
  Introduction:
&lt;/h2&gt;

&lt;p&gt;The k-means algorithm partitions an unlabeled multidimensional dataset into a preplanned number of clusters, based on a simple notion of what an optimized cluster looks like.&lt;/p&gt;

&lt;p&gt;Primarily, the concept rests on two ideas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;First, the cluster center is the arithmetic mean (AM) of all the data points assigned to the cluster.&lt;/li&gt;
&lt;li&gt;Second, each point lies closer to its own cluster center than to any other cluster center. These two interpretations are the foundation of the k-means clustering model.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can think of the center as a data point that represents the mean of its cluster; it may not itself be a member of the dataset.&lt;/p&gt;

&lt;p&gt;In simple terms, k-means clustering lets us group the data into several clusters by detecting the distinct categories in an unlabeled dataset on its own, without any training data.&lt;/p&gt;

&lt;p&gt;This is a centroid-based algorithm: each cluster is associated with a centroid, and the objective is to minimize the sum of distances between the data points and their corresponding cluster centroids.&lt;/p&gt;

&lt;p&gt;Specifically, the k-means algorithm performs two tasks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Determines the positions of the K center points, or centroids, by an iterative method.&lt;/li&gt;
&lt;li&gt;Assigns every data point to its nearest centroid; the points closest to a particular centroid form a cluster. As a result, the data points within each cluster are similar to one another and far from the points in other clusters.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Explanation:
&lt;/h2&gt;

&lt;p&gt;K-means is a special case of the Expectation-Maximization (EM) algorithm, a powerful approach that shows up in a variety of contexts in data science. The E-M procedure has two parts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Guess some initial cluster centers.&lt;/li&gt;
&lt;li&gt;Repeat the following two steps until the assignments stop changing:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;E-step:&lt;/strong&gt; Assign each data point to the closest cluster center.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;M-step:&lt;/strong&gt; Set each cluster center to the mean of the points assigned to it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The E-step is the Expectation step: it updates our expectation of which cluster each data point belongs to.&lt;br&gt;
The M-step is the Maximization step: it maximizes the fitness function that defines the locations of the cluster centers; here that maximization amounts to taking the mean of the data points in each cluster.&lt;/p&gt;

&lt;p&gt;Under mild conditions, each iteration of the E-step and M-step is guaranteed to improve the estimate of the clusters’ characteristics.&lt;/p&gt;
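&lt;p&gt;The E-step/M-step loop described above can be written in a few lines of NumPy. This is a minimal, illustrative sketch, not the production algorithm; the scikit-learn KMeans used in the code further below adds smarter initialization and multiple restarts:&lt;/p&gt;

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Minimal k-means: E-step (assign) + M-step (recenter)."""
    rng = np.random.RandomState(seed)
    # 1. Guess: pick k distinct data points as the initial centers.
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        # E-step: assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # M-step: move each center to the mean of its assigned points
        # (keep the old center if a cluster ends up empty).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):  # converged
            break
        centers = new_centers
    return centers, labels
```

For example, on two well-separated blobs of points this recovers one center per blob, with each blob's points sharing a label.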

&lt;p&gt;&lt;strong&gt;&lt;em&gt;K-means uses an iterative procedure to produce its final clustering based on a predefined number of clusters, chosen according to the dataset and represented by the variable K.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For instance, if K is set to 3, the dataset is grouped into 3 clusters; if K is 4, there will be 4 clusters, and so on.&lt;/p&gt;

&lt;p&gt;The fundamental aim is to define k centers, one for each cluster. These centers must be placed carefully, because different placements lead to different outcomes, so it is best to put them as far away from each other as possible.&lt;/p&gt;

&lt;p&gt;Also, &lt;strong&gt;&lt;em&gt;the maximum number of plausible clusters equals the total number of observations in the dataset.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Code:
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from pandas import Series, DataFrame
import pandas as pd
import numpy as np
import matplotlib.pylab as plt
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.cluster import KMeans
from scipy.spatial.distance import cdist
from sklearn.decomposition import PCA
import statsmodels.formula.api as smf
import statsmodels.stats.multicomp as multi 
%matplotlib inline
RND_STATE = 55121
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;code&gt;Loading Data:&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;data = pd.read_csv("data/tree_addhealth.csv")
data.columns = map(str.upper, data.columns)

data_clean = data.dropna()

cluster=data_clean[['ALCEVR1','MAREVER1','ALCPROBS1','DEVIANT1','VIOL1',
'DEP1','ESTEEM1','SCHCONN1','PARACTV', 'PARPRES','FAMCONCT']]

cluster.describe()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;Preprocessing Data:&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;clustervar=cluster.copy()
clustervar['ALCEVR1']=preprocessing.scale(clustervar['ALCEVR1'].astype('float64'))
clustervar['ALCPROBS1']=preprocessing.scale(clustervar['ALCPROBS1'].astype('float64'))
clustervar['MAREVER1']=preprocessing.scale(clustervar['MAREVER1'].astype('float64'))
clustervar['DEP1']=preprocessing.scale(clustervar['DEP1'].astype('float64'))
clustervar['ESTEEM1']=preprocessing.scale(clustervar['ESTEEM1'].astype('float64'))
clustervar['VIOL1']=preprocessing.scale(clustervar['VIOL1'].astype('float64'))
clustervar['DEVIANT1']=preprocessing.scale(clustervar['DEVIANT1'].astype('float64'))
clustervar['FAMCONCT']=preprocessing.scale(clustervar['FAMCONCT'].astype('float64'))
clustervar['SCHCONN1']=preprocessing.scale(clustervar['SCHCONN1'].astype('float64'))
clustervar['PARACTV']=preprocessing.scale(clustervar['PARACTV'].astype('float64'))
clustervar['PARPRES']=preprocessing.scale(clustervar['PARPRES'].astype('float64'))

clus_train, clus_test = train_test_split(clustervar, test_size=0.3, random_state=RND_STATE)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;K-means Analysis for 9 clusters:&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;clusters=range(1,10)
meandist=[]

for k in clusters:
    model=KMeans(n_clusters=k)
    model.fit(clus_train)
    clusassign=model.predict(clus_train)
    meandist.append(sum(np.min(cdist(clus_train, model.cluster_centers_, 'euclidean'), axis=1)) 
    / clus_train.shape[0])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;Relation between cluster and average distance:&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plt.plot(clusters, meandist)
plt.xlabel('Number of clusters')
plt.ylabel('Average distance')
plt.title('Selecting k with the Elbow Method')
plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Plotting Output of Relation:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--SrioWScx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/fzbg08jgqqmbl0svuslo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SrioWScx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/fzbg08jgqqmbl0svuslo.png" alt="cluster vs avg distance" width="389" height="278"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Solution for 3 cluster model:&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model3=KMeans(n_clusters=3)
model3.fit(clus_train)
clusassign=model3.predict(clus_train)

pca_2 = PCA(2)
plot_columns = pca_2.fit_transform(clus_train)
plt.scatter(x=plot_columns[:,0], y=plot_columns[:,1], c=model3.labels_,)
plt.xlabel('Canonical variable 1')
plt.ylabel('Canonical variable 2')
plt.title('Scatterplot of Canonical Variables for 3 Clusters')
plt.show()

clus_train.reset_index(level=0, inplace=True)
cluslist=list(clus_train['index'])
labels=list(model3.labels_)
newlist=dict(zip(cluslist, labels))

newclus=DataFrame.from_dict(newlist, orient='index')
newclus.columns = ['cluster']
newclus.describe()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Plotting Clusters:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ML9niScF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/stk73k059kwcnn7q267u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ML9niScF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/stk73k059kwcnn7q267u.png" alt="clusters graph" width="388" height="278"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;newclus.reset_index(level=0, inplace=True)
merged_train=pd.merge(clus_train, newclus, on='index')
merged_train.head(n=100)
merged_train.cluster.value_counts()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output reset_index:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--aMb8-Yrq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/31eswqt5ozamhgebli5b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--aMb8-Yrq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/31eswqt5ozamhgebli5b.png" alt="reset index" width="337" height="130"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;clustergrp = merged_train.groupby('cluster').mean()
print ("Clustering variable means by cluster")
print(clustergrp)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output of Cluster Variable means:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--IqbbZ5rc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6t8eviwyg8stmerghxfb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--IqbbZ5rc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6t8eviwyg8stmerghxfb.png" alt="cluster variable means" width="854" height="310"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gpa_data=data_clean['GPA1']
gpa_train, gpa_test = train_test_split(gpa_data, test_size=.3, random_state=RND_STATE)
gpa_train1=pd.DataFrame(gpa_train)
gpa_train1.reset_index(level=0, inplace=True)
merged_train_all=pd.merge(gpa_train1, merged_train, on='index')
sub1 = merged_train_all[['GPA1', 'cluster']].dropna()

gpamod = smf.ols(formula='GPA1 ~ C(cluster)', data=sub1).fit()
print (gpamod.summary())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;OLS Regression Results:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--g0KWxIe_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gpxvdscqibetyo3ju0f3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--g0KWxIe_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gpxvdscqibetyo3ju0f3.png" alt="OLS Regression" width="880" height="521"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;
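&lt;p&gt;The ANOVA call above can be tried out on a small synthetic table (hypothetical values, just to show how &lt;code&gt;C()&lt;/code&gt; makes the formula API treat the numeric cluster label as categorical):&lt;/p&gt;

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical toy data: one GPA value and one cluster label per student
df = pd.DataFrame({
    'GPA1':    [2.1, 2.3, 2.2, 2.9, 3.0, 2.8, 3.6, 3.5, 3.7],
    'cluster': [0, 0, 0, 1, 1, 1, 2, 2, 2],
})

# C() tells the formula API to treat the numeric cluster label as categorical
model = smf.ols(formula='GPA1 ~ C(cluster)', data=df).fit()
print(model.f_pvalue)  # overall F-test: do the cluster means differ?
```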

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print ('means for GPA by cluster')
m1= sub1.groupby('cluster').mean()
print (m1)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--89Iseg5b--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vjgzhit5jxmq8okc9qgy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--89Iseg5b--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vjgzhit5jxmq8okc9qgy.png" alt="means_gpa" width="395" height="155"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print ('standard deviations for GPA by cluster')
m2= sub1.groupby('cluster').std()
print (m2)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--FETVBFy4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/46qt9imgqtmp6lnvlj7f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--FETVBFy4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/46qt9imgqtmp6lnvlj7f.png" alt="sd_gpa" width="460" height="160"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mc1 = multi.MultiComparison(sub1['GPA1'], sub1['cluster'])
res1 = mc1.tukeyhsd()
print(res1.summary())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output for Comparison of means:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jOw4C-R_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rvf8gvdow568394ai81r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jOw4C-R_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rvf8gvdow568394ai81r.png" alt="comparison_means" width="667" height="223"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>A Lasso Regression</title>
      <dc:creator>Avinash Gupta</dc:creator>
      <pubDate>Sun, 23 Oct 2022 15:31:47 +0000</pubDate>
      <link>https://dev.to/tier3guy/a-lasso-regression-25p</link>
      <guid>https://dev.to/tier3guy/a-lasso-regression-25p</guid>
      <description>&lt;p&gt;&lt;strong&gt;Lasso Regression&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Lasso regression is a regularization technique, used on top of ordinary regression methods to obtain more accurate predictions. The model uses shrinkage: data values are shrunk towards a central point, such as the mean. The lasso procedure encourages simple, sparse models (i.e. models with fewer parameters). This type of regression is well suited to models showing high levels of multicollinearity, or when you want to automate parts of model selection such as variable selection/parameter elimination.&lt;/p&gt;

&lt;p&gt;The word “LASSO” stands for Least Absolute Shrinkage and Selection Operator. It is a statistical formula for the regularization of data models and feature selection.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Regularization&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Regularization is an important concept used to avoid overfitting, especially when the training and test data differ substantially. It is implemented by adding a “penalty” term to the best fit derived from the training data, which reduces variance on the test data and restricts the influence of the predictor variables on the output variable by compressing their coefficients. With regularization we normally keep the same number of features but reduce the magnitude of the coefficients, using regression techniques that build this penalty into the fit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lasso Regularization Technique&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There are two main regularization techniques, namely Ridge Regression and Lasso Regression. They differ in the way they assign a penalty to the coefficients. In this blog, we will try to understand the Lasso regularization technique in more depth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;L1 Regularization&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If a regression model uses the L1 regularization technique, it is called Lasso Regression; if it uses L2 regularization, it is called Ridge Regression. L1 regularization adds a penalty equal to the absolute value of the magnitude of each coefficient. This type of regularization can produce sparse models with few coefficients: some coefficients become exactly zero and are eliminated from the model, and larger penalties push coefficient values closer to zero (ideal for producing simpler models). L2 regularization, on the other hand, never eliminates coefficients, so it does not produce sparse models. As a result, Lasso Regression is easier to interpret than Ridge.&lt;br&gt;
Mathematical equation of Lasso Regression:&lt;/p&gt;

&lt;p&gt;Residual Sum of Squares + λ * (Sum of the absolute value of the magnitude of coefficients)&lt;/p&gt;

&lt;p&gt;Where,&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;λ denotes the amount of shrinkage.
λ = 0 implies all features are considered and it is equivalent to the linear regression where only the residual sum of squares is considered to build a predictive model
λ = ∞ implies no feature is considered i.e, as λ closes to infinity it eliminates more and more features
The bias increases with an increase in λ
variance increases with a decrease in λ
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
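&lt;p&gt;You can watch this behaviour directly with scikit-learn, where λ is exposed as &lt;code&gt;alpha&lt;/code&gt;. A small sketch on made-up data (the feature counts and coefficients below are assumptions of this toy setup, not the post's dataset):&lt;/p&gt;

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
# Only the first two features carry signal; the other three are pure noise
y = 3.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Count the surviving (non-zero) coefficients as the penalty grows
nonzero = {}
for alpha in (0.01, 1.0, 10.0):
    coef = Lasso(alpha=alpha).fit(X, y).coef_
    nonzero[alpha] = int(np.count_nonzero(coef))
print(nonzero)
```

&lt;p&gt;With a small penalty the true predictors survive; once λ is large enough, every coefficient is driven exactly to zero.&lt;/p&gt;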
&lt;p&gt;Example&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import numpy as np

Creating a New Train and Validation Datasets

from sklearn.model_selection import train_test_split
data_train, data_val = train_test_split(new_data_train, test_size = 0.2, random_state = 2)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Classifying Predictors and Target&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#Classifying Independent and Dependent Features
#_______________________________________________
#Dependent Variable
Y_train = data_train.iloc[:, -1].values
#Independent Variables
X_train = data_train.iloc[:,0 : -1].values
#Independent Variables for Test Set
X_test = data_val.iloc[:,0 : -1].values
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Evaluating The Model With RMLSE&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def score(y_pred, y_true):
error = np.square(np.log10(y_pred +1) - np.log10(y_true +1)).mean() ** 0.5
score = 1 - error
return score
actual_cost = list(data_val['COST'])
actual_cost = np.asarray(actual_cost)

Building the Lasso Regressor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
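&lt;p&gt;A quick sanity check of this metric: a perfect prediction has an RMSLE of 0 and therefore a score of exactly 1, and any deviation lowers the score. (Toy values below, just to exercise the function.)&lt;/p&gt;

```python
import numpy as np

def score(y_pred, y_true):
    # RMSLE-style error (base-10 logs), turned into a "higher is better" score
    error = np.square(np.log10(y_pred + 1) - np.log10(y_true + 1)).mean() ** 0.5
    return 1 - error

y_true = np.array([100.0, 250.0, 40.0])
print(score(y_true, y_true))        # perfect prediction
print(score(y_true * 1.1, y_true))  # 10% over-prediction scores a bit lower
```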



&lt;h1&gt;
  
  
  Lasso Regression
&lt;/h1&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sklearn.linear_model import Lasso
#Initializing the Lasso Regressor with Normalization Factor as True
lasso_reg = Lasso(normalize=True)
#Fitting the Training data to the Lasso regressor
lasso_reg.fit(X_train,Y_train)
#Predicting for X_test
y_pred_lass =lasso_reg.predict(X_test)
#Printing the Score with RMLSE
print("\n\nLasso SCORE : ", score(y_pred_lass, actual_cost))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    0.7335508027883148

    The Lasso Regression attained an accuracy of 73% with the given Dataset.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
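&lt;p&gt;A note for readers on newer scikit-learn versions: the &lt;code&gt;normalize&lt;/code&gt; argument used above was deprecated and later removed from &lt;code&gt;Lasso&lt;/code&gt;. The recommended replacement is to scale the features in a pipeline (not numerically identical to the old flag, and sketched here on made-up data since the post's dataset isn't included):&lt;/p&gt;

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
# Hypothetical features on wildly different scales
X = rng.normal(size=(80, 4)) * [1.0, 10.0, 100.0, 1000.0]
y = X[:, 0] + 0.01 * X[:, 2] + rng.normal(size=80)

# Scaling inside the pipeline replaces the removed normalize=True flag
lasso_reg = make_pipeline(StandardScaler(), Lasso(alpha=0.1))
lasso_reg.fit(X, y)
y_pred = lasso_reg.predict(X)
print(y_pred[:3])
```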



</description>
    </item>
    <item>
      <title>Random Forest</title>
      <dc:creator>Avinash Gupta</dc:creator>
      <pubDate>Sun, 16 Oct 2022 14:04:51 +0000</pubDate>
      <link>https://dev.to/tier3guy/random-forest-e08</link>
      <guid>https://dev.to/tier3guy/random-forest-e08</guid>
      <description>&lt;h1&gt;
  
  
  About Random Forest Classifier:
&lt;/h1&gt;

&lt;p&gt;Random Forest is a classifier that fits several decision trees on various subsets of a given dataset and averages their predictions to improve predictive accuracy. During the implementation of homework #2, I fitted several classifiers, including RandomForestClassifier and ExtraTreesClassifier, to predict the binary response variable TREG1 (whether a person is a smoker or not). All variables in the dataset, such as age, gender, race, alcohol use, and others (see dataset), were used to build the final model. After fitting the model, these factors influenced the response variable with different levels of importance. &lt;br&gt;
&lt;em&gt;The factors, calculated and sorted in descending order of feature importance:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;marever1    0.096374&lt;br&gt;
age         0.083599&lt;br&gt;
DEVIANT1    0.080081&lt;br&gt;
SCHCONN1    0.075221&lt;br&gt;
GPA1        0.074775&lt;br&gt;
DEP1        0.071728&lt;br&gt;
FAMCONCT    0.067389&lt;br&gt;
PARACTV     0.063784&lt;br&gt;
ESTEEM1     0.057945&lt;br&gt;
ALCPROBS1   0.057670&lt;br&gt;
VIOL1       0.048614&lt;br&gt;
ALCEVR1     0.043539&lt;br&gt;
PARPRES     0.039425&lt;br&gt;
WHITE       0.022146&lt;br&gt;
cigavail    0.021671&lt;br&gt;
BLACK       0.018512&lt;br&gt;
BIO_SEX     0.014942&lt;br&gt;
inhever1    0.012832&lt;br&gt;
cocever1    0.012590&lt;br&gt;
PASSIST     0.010221&lt;br&gt;
EXPEL1      0.009777&lt;br&gt;
HISPANIC    0.007991&lt;br&gt;
AMERICAN    0.005332&lt;br&gt;
ASIAN       0.003844&lt;br&gt;
&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Code:
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pandas as pd
import numpy as np
import matplotlib.pylab as plt
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
%matplotlib inline
RND_STATE = 55324

AH_data = pd.read_csv(“data/tree_addhealth.csv”)
data_clean = AH_data.dropna()
data_clean.dtypes

data_clean.describe()

predictors = data_clean[[‘BIO_SEX’, ‘HISPANIC’, ‘WHITE’, ‘BLACK’, ‘NAMERICAN’, ‘ASIAN’, ‘age’,
‘ALCEVR1’, ‘ALCPROBS1’, ‘marever1’, ‘cocever1’, ‘inhever1’, ‘cigavail’, ‘DEP1’, ‘ESTEEM1’,
‘VIOL1’,
‘PASSIST’, ‘DEVIANT1’, ‘SCHCONN1’, ‘GPA1’, ‘EXPEL1’, ‘FAMCONCT’, ‘PARACTV’, ‘PARPRES’]]

targets = data_clean.TREG1

pred_train, pred_test, tar_train, tar_test = train_test_split(predictors, targets, test_size=.4, random_state=RND_STATE)

print(“Predict train shape: “, pred_train.shape)
print(“Predict test shape: “, pred_test.shape)
print(“Target train shape: “, tar_train.shape)
print(“Target test shape: “, tar_test.shape)

classifier = RandomForestClassifier(n_estimators=25, random_state=RND_STATE)
classifier = classifier.fit(pred_train, tar_train)

predictions = classifier.predict(pred_test)

print(“Confusion matrix:”)
print(confusion_matrix(tar_test, predictions))
print()
print(“Accuracy: “, accuracy_score(tar_test, predictions))

important_features = pd.Series(data=classifier.feature_importances_,index=predictors.columns)
important_features.sort_values(ascending=False,inplace=True)

print(important_features)

model = ExtraTreesClassifier(random_state=RND_STATE)
model.fit(pred_train, tar_train)

print(model.feature_importances_)

trees = range(25)
accuracy = np.zeros(25)
for idx in range(len(trees)):
classifier = RandomForestClassifier(n_estimators=idx + 1, random_state=RND_STATE)
classifier = classifier.fit(pred_train, tar_train)
predictions = classifier.predict(pred_test)
accuracy[idx] = accuracy_score(tar_test, predictions)

plt.cla()
plt.plot(trees, accuracy)
plt.show()

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Output:
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;The final model performed well on the test data, showing an accuracy of 83.4%! The results are presented in this plot:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--BRlbXQfZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1665917825252/Ew5UHScyi.png%3Fauto%3Dcompress%2Cformat%26format%3Dwebp" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--BRlbXQfZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1665917825252/Ew5UHScyi.png%3Fauto%3Dcompress%2Cformat%26format%3Dwebp" alt="Image description" width="382" height="252"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As we can see from the plot, even a single tree achieves a good level of accuracy, so the data can be described reasonably well with one tree. On the other hand, it is clear that adding more trees increases the final accuracy a bit, letting the model predict the data more precisely.&lt;/p&gt;
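&lt;p&gt;A side note on the tree-count loop: it refits the whole forest from scratch for every value of &lt;code&gt;n_estimators&lt;/code&gt;. scikit-learn's &lt;code&gt;warm_start=True&lt;/code&gt; lets one model keep its existing trees and only grow the new ones. A sketch on synthetic data (the real AddHealth CSV isn't shipped with the post):&lt;/p&gt;

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the AddHealth data
X, y = make_classification(n_samples=400, n_features=10, random_state=55324)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.4, random_state=55324)

# warm_start=True: each fit() call grows the existing forest instead of rebuilding it
clf = RandomForestClassifier(warm_start=True, n_estimators=1, random_state=55324)
accuracy = []
for n in range(1, 26):
    clf.set_params(n_estimators=n)
    clf.fit(X_tr, y_tr)
    accuracy.append(accuracy_score(y_te, clf.predict(X_te)))
print(len(accuracy), max(accuracy))
```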

</description>
      <category>assignment</category>
      <category>course</category>
    </item>
    <item>
      <title>CSS Pseudo Elements</title>
      <dc:creator>Avinash Gupta</dc:creator>
      <pubDate>Tue, 02 Nov 2021 16:00:58 +0000</pubDate>
      <link>https://dev.to/tier3guy/css-pseudo-elements-3npb</link>
      <guid>https://dev.to/tier3guy/css-pseudo-elements-3npb</guid>
      <description>&lt;p&gt;Hey friend, are you finding CSS pseudo-elements difficult?&lt;br&gt;
No worries, I was also finding it difficult when I was new to this CSS. In this doc I will be explaining you &lt;code&gt;CSS pseudo-elements&lt;/code&gt; in a very easy manner. So lets start,&lt;br&gt;
Suppose I have given you a task to design only the first letter of a word in your &lt;code&gt;HTML&lt;/code&gt; page. How will you do this? One will say that okay I will apply a span tag to that letter, and then I will style it. But my friend its 2021 are you thinking that is it a good practice. At the end it's a single word then why will you apply &lt;code&gt;&amp;lt;span&amp;gt;&lt;/code&gt; tag.&lt;/p&gt;

&lt;p&gt;Right?&lt;/p&gt;

&lt;p&gt;Here comes the role of &lt;code&gt;pseudo-elements&lt;/code&gt;. A pseudo-element is used to style a specific part of an element.&lt;br&gt;
So far you have understood that &lt;strong&gt;a CSS pseudo-element is used to style specified parts of an element&lt;/strong&gt;, but you don't yet know how to do that. &lt;br&gt;
Before jumping into that, let me explain the different types of &lt;em&gt;CSS pseudo-elements&lt;/em&gt; available.&lt;/p&gt;

&lt;p&gt;There are majorly 6 pseudo-elements available and they are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;::before&lt;/li&gt;
&lt;li&gt;::after&lt;/li&gt;
&lt;li&gt;::first-letter&lt;/li&gt;
&lt;li&gt;::first-line&lt;/li&gt;
&lt;li&gt;::marker&lt;/li&gt;
&lt;li&gt;::selection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now, let's discuss them one by one.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;::before&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;p::before{
   content: 'Hello world';
   color: blue;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So, ::before is used to add content before an element. In the example above, &lt;code&gt;Hello world&lt;/code&gt; will be inserted before every paragraph, in blue. You can then customize it however you like.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;::after&lt;/strong&gt;&lt;br&gt;
&lt;code&gt;::after&lt;/code&gt; works just like &lt;code&gt;::before&lt;/code&gt;, but, as the names suggest, before adds content before an element while after embeds it after the element.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;::first-letter&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;p::first-letter{
   color: blue;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is used to customize the first letter of an element, as we discussed in the example above.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;::first-line&lt;/strong&gt;&lt;br&gt;
It works the same way as ::first-letter but, as the name suggests, it is used to customize the first line of an element.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;::marker&lt;/strong&gt;&lt;br&gt;
The marker pseudo-element is used to style the markers (bullets or numbers) of list items.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;::selection&lt;/strong&gt;&lt;br&gt;
Honestly, this is my favourite pseudo-element, as it lets you customize the area the user has selected/highlighted on the page.&lt;br&gt;
For example: by default, when you highlight any text in the browser the background color changes to blue, but with ::selection you can customize it yourself. Isn't that amazing? I think you will agree with me.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
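&lt;p&gt;Since &lt;code&gt;::marker&lt;/code&gt; and &lt;code&gt;::selection&lt;/code&gt; are the two you see least often, here is a small combined example (the selectors are just for illustration):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;li::marker {
   color: red;        /* bullets turn red */
}

p::selection {
   background: gold;  /* highlighted text gets a gold background */
   color: black;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;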

&lt;p&gt;So, this was all about pseudo-elements. I hope you liked it and learned something new. &lt;br&gt;
Thank you!&lt;/p&gt;

</description>
      <category>css</category>
      <category>webdev</category>
      <category>beginners</category>
      <category>100daysofcode</category>
    </item>
    <item>
      <title>CSS Positions</title>
      <dc:creator>Avinash Gupta</dc:creator>
      <pubDate>Sat, 30 Oct 2021 14:22:53 +0000</pubDate>
      <link>https://dev.to/tier3guy/css-positions-2p0l</link>
      <guid>https://dev.to/tier3guy/css-positions-2p0l</guid>
      <description>&lt;p&gt;I am as a beginner faced a lot of difficulty in understanding the &lt;code&gt;Position&lt;/code&gt; property in CSS. Today in this doc I will be explaining Positions in a very friendly and easy manner. So, lets begin.&lt;/p&gt;

&lt;p&gt;What is the &lt;code&gt;position&lt;/code&gt; property?&lt;br&gt;
As the name suggests, position is a CSS property that sets the position of an element.&lt;br&gt;
It has 5 values:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;static&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;relative&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;fixed&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;absolute&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;sticky&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now a question arises: why should we use the position property at all?&lt;/p&gt;

&lt;p&gt;And the answer is that after specifying the position property you can use four more properties, i.e. &lt;code&gt;top&lt;/code&gt;, &lt;code&gt;bottom&lt;/code&gt;, &lt;code&gt;left&lt;/code&gt; and &lt;code&gt;right&lt;/code&gt;, which help you place your elements precisely and enhance your frontend.&lt;/p&gt;

&lt;p&gt;Note: you cannot use these four properties (top, bottom, left, right) without specifying the position property; using them without it has no effect.&lt;/p&gt;

&lt;p&gt;Now we know what the position property is and why we should use it. Let's quickly go through those five values and try to understand them better.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;static&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;position: static;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Now, static is the default position value: it places your element according to the regular flow of the document.&lt;br&gt;
(Note: top, bottom, left, and right won't work with static)&lt;/p&gt;

&lt;p&gt;Now you know why you cannot use top, bottom, left and right without defining the position property: if you haven't defined it, the browser sets it to static (the default), and those properties don't work with static.&lt;/p&gt;

&lt;p&gt;Gist: static is the default behavior of the browser.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;relative&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;position: relative;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;relative positions the element relative to its normal position.&lt;br&gt;
I know that sounds abstract, so let's take an example.&lt;br&gt;
Suppose you created a div and set its width and height. Now,&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;position: relative;
left: 50px;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;after applying these two lines to the CSS of your div, you will find that it has shifted 50px from its left. That is what "relative to the normal position" means. Again, if you apply&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;top: 100px;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;you will see your div shift 100px from its top.&lt;br&gt;
I hope you now have a clear idea.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;fixed&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;position: fixed;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Fixed and relative are quite similar, but the key difference is that relative positions the element relative to its normal position, while fixed positions it relative to the browser window.&lt;br&gt;
E.g.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;position: fixed;
left: 5px;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;after applying this CSS your div will sit 5px from the left of the browser window,&lt;br&gt;
and, importantly, it won't scroll with your webpage; it stays fixed in that position. &lt;/p&gt;

&lt;p&gt;But suppose you want to position your div against the page but at the same time want it to scroll with the page. How do you do that? If you apply position: fixed, the div won't scroll.&lt;/p&gt;

&lt;p&gt;Now, here comes the concept of position: absolute.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;absolute&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;position: absolute;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are two differences between absolute and fixed.&lt;br&gt;
a. First, a fixed element doesn't scroll with the page, but an absolute one does.&lt;br&gt;
b. Second, absolute works with the nearest &lt;strong&gt;positioned&lt;/strong&gt; ancestor. That means fixed positions your element with respect to the browser window, while absolute positions your div relative to the nearest ancestor (parent) element that has its own position set. If no parent element has a position property, the div is placed relative to the document itself, much like fixed, but it still scrolls with the page.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;sticky&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;position: sticky;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;sticky is much like fixed; the difference is that it only kicks in when the element is about to scroll out of the window. The element behaves like relative until it reaches the given offset, then it sticks like fixed.&lt;/p&gt;
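&lt;p&gt;One detail worth knowing: &lt;code&gt;position: sticky&lt;/code&gt; on its own does nothing. You must also give it an offset (top, bottom, left or right), which is the point at which the element stops scrolling and sticks. For example (the class name is made up):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.navbar {
   position: sticky;
   top: 0;   /* sticks to the top of the window once it reaches it */
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;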

&lt;p&gt;So, this was all about the position property. I hope you gained some knowledge from reading this post. Thank you.&lt;/p&gt;

</description>
      <category>css</category>
      <category>positions</category>
      <category>html</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
