<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Omale Happiness Ojone</title>
    <description>The latest articles on DEV Community by Omale Happiness Ojone (@codinghappinessweb).</description>
    <link>https://dev.to/codinghappinessweb</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F561973%2F3bb9568b-fc8f-4fc1-8af3-a9b808cf6a71.jpeg</url>
      <title>DEV Community: Omale Happiness Ojone</title>
      <link>https://dev.to/codinghappinessweb</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/codinghappinessweb"/>
    <language>en</language>
    <item>
      <title>How to predict a model using linear regression</title>
      <dc:creator>Omale Happiness Ojone</dc:creator>
      <pubDate>Tue, 14 Jun 2022 20:11:26 +0000</pubDate>
      <link>https://dev.to/codinghappinessweb/how-to-predict-a-model-using-linear-regression-2hpm</link>
      <guid>https://dev.to/codinghappinessweb/how-to-predict-a-model-using-linear-regression-2hpm</guid>
      <description>&lt;h2&gt;
  
  
  What is Regression?
&lt;/h2&gt;

&lt;p&gt;Regression is a supervised learning method used to determine the relationship between variables. When the output variable is a real or continuous value, you have a regression problem.&lt;/p&gt;

&lt;p&gt;In this lesson, I'll show you how to predict a model using linear regression, and I'll use the fish market &lt;a href="https://www.kaggle.com/aungpyaeap/fish-market"&gt;dataset&lt;/a&gt; as an example. Let's get started. &lt;/p&gt;

&lt;h2&gt;
  
  
  What is Linear Regression?
&lt;/h2&gt;

&lt;p&gt;Linear regression is a straightforward statistical method for modelling the relationship between continuous variables. As the name implies, it assumes a linear relationship between the independent variable (X-axis) and the dependent variable (Y-axis). A linear regression with only one input variable (x) is called simple linear regression; when there are several input variables, it is called multiple linear regression. The fitted model gives a sloped straight line describing the relationship between the variables.&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--8nLls5e5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/i0ovqwzavb0weu6s7yyp.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--8nLls5e5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/i0ovqwzavb0weu6s7yyp.JPG" alt="Image description" width="392" height="290"&gt;&lt;/a&gt;&lt;br&gt;
The dependent variable and independent variables have a linear relationship, as shown in the graph above. When the value of x (the independent variable) rises, so does the value of y (the dependent variable). The best fit straight line is designated by the red line. We aim to plot a line that best predicts the data points based on the given data points.&lt;/p&gt;
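&lt;p&gt;As a quick illustration (with made-up numbers, not the fish dataset), the best-fit line y = b0 + b1*x can be computed with NumPy's least-squares fit:&lt;/p&gt;

```python
import numpy as np

# Simple linear regression: y = b0 + b1 * x, fitted by least squares
# on a tiny synthetic sample (illustrative values only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

b1, b0 = np.polyfit(x, y, deg=1)  # slope first, then intercept
print(round(b1, 2), round(b0, 2))  # 1.99 0.09
```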

&lt;p&gt;Firstly, let's import basic utilities:&lt;/p&gt;

&lt;p&gt;In [1]:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;%matplotlib inline
import numpy as np
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's now read the csv file (keep in mind it should be in the same folder as the notebook!):&lt;/p&gt;

&lt;p&gt;In [2]:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df = pd.read_csv('Fish.csv')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's take a peek at the dataset's first few rows to get a basic understanding of its structure.&lt;/p&gt;

&lt;p&gt;In [3]:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df.head()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--gMKzZEDY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/c8i0dp6rys5im8zq2xi0.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--gMKzZEDY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/c8i0dp6rys5im8zq2xi0.JPG" alt="Image description" width="522" height="189"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now let's look at the dataset more closely to obtain essential statistical indicators such as the mean and standard deviation.&lt;/p&gt;

&lt;p&gt;In [4]:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df.describe()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--v016xRqx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/kvzfhgbjh4bx37hpgv7j.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--v016xRqx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/kvzfhgbjh4bx37hpgv7j.JPG" alt="Image description" width="602" height="260"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;X and y must be reshaped into two-dimensional column arrays, because scikit-learn expects its inputs as 2-D arrays of shape (n_samples, n_features); passing 1-D arrays would raise an error.&lt;/p&gt;

&lt;p&gt;In [5]:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;X = np.array(df['Length1']).reshape(-1, 1) 
y = np.array(df['Length2']).reshape(-1, 1)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
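&lt;p&gt;To see why the reshape matters, here is a minimal sketch with made-up values: a flat array must become a single-feature column before scikit-learn will accept it.&lt;/p&gt;

```python
import numpy as np

# A flat 1-D array of 4 measurements...
lengths = np.array([23.2, 24.0, 23.9, 26.3])
print(lengths.shape)        # (4,)

# ...reshaped into a 2-D column: 4 samples, 1 feature.
X = lengths.reshape(-1, 1)
print(X.shape)              # (4, 1)
```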



&lt;p&gt;Split the dataset into two parts: train and test. This is needed to calculate the accuracy (and many other metrics) of the model. We use the train part during training and the held-out test part during evaluation.&lt;/p&gt;

&lt;p&gt;In [6]:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.33, random_state=101)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Import the model and instantiate it:&lt;/p&gt;

&lt;p&gt;In [7]:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sklearn.linear_model import LinearRegression
linearmodel = LinearRegression()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now let's train the model:&lt;/p&gt;

&lt;p&gt;In [8]:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;linearmodel.fit(X_train, y_train)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's have a look at how the model performs using R2 (the coefficient of determination), a statistical metric that measures how well the model fits the data. In this single-variable case, R2 equals the square of the Pearson correlation coefficient between x and y. It typically ranges from 0.0 (worst fit) to 1.0 (best fit), although on held-out test data it can even be negative if the model performs worse than simply predicting the mean.&lt;/p&gt;
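&lt;p&gt;That equivalence can be checked directly on synthetic data (not the fish dataset): the in-sample R2 of a one-variable linear model matches the squared Pearson correlation.&lt;/p&gt;

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic linear data with noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + rng.normal(0, 1, size=200)

# In-sample R2 of the fitted model...
model = LinearRegression().fit(x.reshape(-1, 1), y)
r2 = model.score(x.reshape(-1, 1), y)

# ...equals the squared Pearson correlation coefficient.
r = np.corrcoef(x, y)[0, 1]
print(round(abs(r2 - r ** 2), 10))  # 0.0
```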

&lt;p&gt;In [9]:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;linearmodel.score(X_test, y_test)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Tzf0PVxJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/y91ai6brp4j0au1jvsas.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Tzf0PVxJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/y91ai6brp4j0au1jvsas.JPG" alt="Image description" width="243" height="38"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It's quite high! This is because the two variables (Length1 and Length2) lie almost exactly on a straight line. Let's compare the predicted values to the test values in the dataset.&lt;/p&gt;

&lt;p&gt;In [10]:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plt.scatter(x_test, y_test)
plt.plot(x_test, linearmodel.predict(x_test), color = 'red')
plt.show()

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Zo-N4cuj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gtukup7h74khexvhxpbx.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Zo-N4cuj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gtukup7h74khexvhxpbx.JPG" alt="Image description" width="445" height="265"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Linear Regression (multiple independent variables): Let's predict weight
&lt;/h2&gt;

&lt;p&gt;Predicting the weight of the fish with linear regression works the same way as before. The only significant difference is that there are now several independent variables. The categorical variable "Species" will be dropped entirely.&lt;/p&gt;

&lt;p&gt;In [11]:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;x = fish.drop(['Weight', 'Species'], axis = 1)
y = fish['Weight']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In [12]:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.3, random_state = 42)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In [13]:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;linearmodel = LinearRegression()
linearmodel.fit(x_train, y_train)
linearmodel.score(x_test, y_test)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--__tTpaq---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ral1ojvjywb78vrelb8b.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--__tTpaq---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ral1ojvjywb78vrelb8b.JPG" alt="Image description" width="251" height="41"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The model fits, but it doesn't perform as well as hoped for this problem and data. The EDA skipped some basics such as feature selection and outlier removal, and the model was deprived of the 'Species' feature, which, as you might imagine, may influence a fish's weight. Dataset size also matters: larger datasets usually lead to more accurate results. In any case, the goal of simple linear regression is to predict the value of a dependent variable from an independent variable; the stronger the linear relationship between them, the more accurate the prediction.&lt;br&gt;
Thanks for reading.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;P.S:&lt;/strong&gt; I'm looking forward to being your friend; let's connect on &lt;a href="https://twitter.com/Coding_Happinex"&gt;Twitter&lt;/a&gt;.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Data Preparation</title>
      <dc:creator>Omale Happiness Ojone</dc:creator>
      <pubDate>Sat, 16 Apr 2022 21:57:00 +0000</pubDate>
      <link>https://dev.to/codinghappinessweb/data-preparation-4khj</link>
      <guid>https://dev.to/codinghappinessweb/data-preparation-4khj</guid>
      <description>&lt;h1&gt;
  
  
  Data Preparation
&lt;/h1&gt;

&lt;p&gt;Data preparation is the transformation of raw data into a form that is more suitable for modeling so that data scientists and analysts can run it through machine learning algorithms to uncover insights or make predictions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why is Data Preparation Important?
&lt;/h2&gt;

&lt;p&gt;Most machine learning algorithms require data to be formatted in a very specific way, so datasets generally require some amount of preparation before they can yield useful insights. Some datasets have values that are missing, invalid, have inaccuracies or other errors, which are difficult for the algorithm to process. &lt;/p&gt;

&lt;p&gt;The algorithm cannot function if data is missing. If the data is incorrect, the algorithm produces less accurate, if not misleading, results. Some datasets simply lack useful business context (for example, poorly defined ID values), necessitating feature enrichment. Good data preparation results in clean, well-curated data, which leads to more practical, accurate model results.&lt;/p&gt;

&lt;h2&gt;
  
  
  Steps in data preparation process
&lt;/h2&gt;

&lt;p&gt;The process of preparing data includes the following:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Data collection:
&lt;/h3&gt;

&lt;p&gt;Relevant data is gathered from operational systems, data warehouses and other data sources. During this step, data professionals and end users gathering data themselves should confirm that the data is a good fit for the objectives of the planned applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Data discovery and profiling.
&lt;/h3&gt;

&lt;p&gt;The next step is to explore the collected data to understand what it contains and what needs to be done to prepare it for the intended use. Data profiling helps identify patterns, anomalies, inconsistencies, missing data, and other attributes and issues in data sets, so problems can be addressed.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Data cleaning.
&lt;/h3&gt;

&lt;p&gt;In this step, the identified data errors are corrected to create complete and accurate data sets that are ready to be processed and analyzed. For example, faulty data is removed or fixed, missing values are filled in, and inconsistent entries are harmonized. Common data cleaning operations include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Using statistics to define normal data and identify outliers.
&lt;/li&gt;
&lt;li&gt;Identifying columns that have the same value or no variance and removing them.
&lt;/li&gt;
&lt;li&gt;Identifying duplicate rows of data and removing them.
&lt;/li&gt;
&lt;li&gt;Marking empty values as missing.
&lt;/li&gt;
&lt;li&gt;Imputing missing values using statistics or a learned model&lt;/li&gt;
&lt;/ul&gt;
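&lt;p&gt;The operations above can be sketched in pandas on a toy DataFrame (column names and values are made up for illustration):&lt;/p&gt;

```python
import numpy as np
import pandas as pd

# Toy frame with a constant column, a duplicate row, and a missing value.
df = pd.DataFrame({
    "id": [1, 2, 2, 4],
    "value": [10.0, np.nan, np.nan, 30.0],
    "constant": [5, 5, 5, 5],
})

# Drop columns with no variance (a single unique value).
df = df.loc[:, df.nunique(dropna=False).gt(1)]

# Drop duplicate rows.
df = df.drop_duplicates()

# Impute missing values with a simple statistic (here, the mean).
df["value"] = df["value"].fillna(df["value"].mean())
print(df)
```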

&lt;h3&gt;
  
  
  4. Data structuring.
&lt;/h3&gt;

&lt;p&gt;At this point, the data needs to be structured, modelled and organized into a unified format that will meet the requirements of the planned use.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Data transformation and enrichment.
&lt;/h3&gt;

&lt;p&gt;In connection with structuring data, it often must be transformed to make it consistent and turn it into usable information. Data enrichment and optimization further enhance data sets as needed to produce the desired business insights.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Data validation and publishing.
&lt;/h3&gt;

&lt;p&gt;To complete the preparation process, automated routines are run against the data to validate its consistency, completeness and accuracy. The prepared data is then stored in a data warehouse or other repository and made available for use.&lt;/p&gt;

&lt;p&gt;A big benefit of instituting an effective data preparation process is that data scientists and other end users can spend less time finding and structuring data and instead focus more on data mining and data analysis. For example, data preparation can be done more quickly, and prepared data can automatically be fed to users for analyses.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this article, we have seen what data preparation is and the process of preparing data. We also saw reasons why data preparation is important. &lt;strong&gt;Thanks for reading&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;P.S:&lt;/strong&gt; I'm looking forward to being your friend; let's connect on &lt;a href="https://twitter.com/Coding_Happinex"&gt;Twitter&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>programming</category>
      <category>datascience</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Data Cleaning</title>
      <dc:creator>Omale Happiness Ojone</dc:creator>
      <pubDate>Fri, 15 Apr 2022 16:14:13 +0000</pubDate>
      <link>https://dev.to/codinghappinessweb/data-cleaning-1k9b</link>
      <guid>https://dev.to/codinghappinessweb/data-cleaning-1k9b</guid>
      <description>&lt;p&gt;Data cleaning refers to the process of “cleaning” data, by identifying errors in the data and then rectifying them.&lt;br&gt;
The main aim of Data Cleaning is to identify and remove errors &amp;amp; duplicate data, in order to create a reliable dataset.&lt;br&gt;
We will use the fish dataset as the basis for this tutorial.&lt;/p&gt;
&lt;h2&gt;
  
  
  Fish Dataset
&lt;/h2&gt;

&lt;p&gt;The “Fish Dataset” is a machine learning dataset.&lt;br&gt;
The task involves predicting the weight of a fish.&lt;br&gt;
You can access the dataset here:&lt;br&gt;
&lt;a href="https://www.kaggle.com/aungpyaeap/fish-market" rel="noopener noreferrer"&gt;https://www.kaggle.com/aungpyaeap/fish-market&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;         from pandas import read_csv
         from numpy import unique
         import pandas as pd
         import seaborn as sns
         import matplotlib.pyplot as plt
         import numpy as np
         fish = pd.read_csv("Fish.csv")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What does the data look like?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6rfziynaczcjjq6bpi58.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6rfziynaczcjjq6bpi58.JPG" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Fill-Out Missing Values
&lt;/h2&gt;

&lt;p&gt;One of the first steps in fixing errors in your dataset is to find incomplete values and fill them in. Much of your data can be grouped into categories, and in most cases it is best to fill in missing values per category, or to create a new category to hold them.&lt;br&gt;
For numerical data, you can impute missing values with the mean or median.&lt;br&gt;
Let's check our dataset:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvzhg5of4xodaop3el1xv.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvzhg5of4xodaop3el1xv.JPG" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see, in this case, we do not have missing values.&lt;/p&gt;
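&lt;p&gt;Had there been gaps, a numeric column could be filled with the mean or median like this (illustrative values, not the actual fish data):&lt;/p&gt;

```python
import numpy as np
import pandas as pd

# Illustrative series with a gap in a numeric column.
weights = pd.Series([242.0, np.nan, 340.0, 363.0])

# Median imputation is robust to outliers; mean is also common.
filled = weights.fillna(weights.median())
print(filled[1])  # 340.0 (median of 242, 340, 363)
```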

&lt;h2&gt;
  
  
  Removing rows with missing values
&lt;/h2&gt;

&lt;p&gt;One of the simplest things to do in data cleansing is to remove or delete rows with missing values. This may not be the ideal step in case of a huge amount of errors in your training data.&lt;br&gt;
If the missing values are considerably less, then removing or deleting missing values can be the right approach. You will have to be very sure that the data you are deleting does not include information that is present in the other rows of the training data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; As you can see, in this case, we do not have missing values. However, this is not always the case.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fixing errors in the Dataset
&lt;/h2&gt;

&lt;p&gt;Ensure there are no typographical errors or inconsistencies in upper and lower case.&lt;br&gt;
Go through your data set, identify such errors, and fix them to make sure your training set is error-free. This will help your machine learning models yield better results.&lt;/p&gt;

&lt;h2&gt;
  
  
  Identify Columns That Contain a Single Value
&lt;/h2&gt;

&lt;p&gt;Columns that have a single observation or value are probably useless for modeling.&lt;br&gt;
These columns are referred to as zero-variance predictors: if we measured the variance (the average squared deviation from the mean), it would be zero, because the predictor displays no variation at all.&lt;br&gt;
You can detect columns with this property using the &lt;strong&gt;nunique() Pandas function&lt;/strong&gt;, which reports the number of unique values in each column.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5vkih3yhwbb25lvtaie1.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5vkih3yhwbb25lvtaie1.JPG" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Delete Columns That Contain a Single Value
&lt;/h2&gt;

&lt;p&gt;Variables or columns that have a single value should probably be removed from your dataset.&lt;br&gt;
From the picture above, we can see that the column &lt;strong&gt;Species&lt;/strong&gt; has a single value.&lt;br&gt;
Columns are relatively easy to remove from a NumPy array or Pandas DataFrame.&lt;br&gt;
One approach is to record all columns that have a single unique value, then delete them from the Pandas DataFrame by calling the drop() function.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu4m3kn4tazdfgbptq8e9.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu4m3kn4tazdfgbptq8e9.JPG" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Identify Rows That Contain Duplicate Data
&lt;/h2&gt;

&lt;p&gt;Rows that contain identical data are probably useless, if not dangerously misleading during model evaluation.&lt;br&gt;
A duplicate row is one where every column value appears, in the same order, in another row.&lt;br&gt;
The Pandas function &lt;strong&gt;duplicated()&lt;/strong&gt; reports whether a given row is duplicated: each row is marked False if it is not a duplicate, or True if it is. By default, the first occurrence of a repeated row is marked False, as we might expect.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpw7vxosrsw7f562i5a97.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpw7vxosrsw7f562i5a97.JPG" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;First, the presence of any duplicate rows is reported; in this case, we can see that there are none (False).&lt;br&gt;
Where duplicates do exist, the Pandas function &lt;strong&gt;drop_duplicates()&lt;/strong&gt; can be used to drop the duplicate rows.&lt;/p&gt;
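&lt;p&gt;Both functions can be sketched on a toy frame with one deliberate duplicate:&lt;/p&gt;

```python
import pandas as pd

# duplicated() flags repeat rows; drop_duplicates() removes them.
df = pd.DataFrame({
    "Weight": [242.0, 242.0, 290.0],
    "Length": [23.2, 23.2, 24.0],
})
print(df.duplicated().any())   # True: row 1 repeats row 0
df = df.drop_duplicates()
print(len(df))                 # 2
```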

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Data cleaning is critical to the success of any machine learning project; for most projects, a large share of the effort (often cited as around 80 percent) is spent on it. We have covered some of the key operations above.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>programming</category>
      <category>python</category>
      <category>datascience</category>
    </item>
    <item>
      <title>How I Deployed my First Machine Learning Model Using Streamlit (Part 2)</title>
      <dc:creator>Omale Happiness Ojone</dc:creator>
      <pubDate>Sat, 17 Jul 2021 20:34:48 +0000</pubDate>
      <link>https://dev.to/codinghappinessweb/how-i-deployed-my-first-machine-learning-model-using-streamlit-part-2-103a</link>
      <guid>https://dev.to/codinghappinessweb/how-i-deployed-my-first-machine-learning-model-using-streamlit-part-2-103a</guid>
      <description>&lt;p&gt;In this article I will be explaining how to deploy your machine learning model online.&lt;/p&gt;

&lt;p&gt;But before then here is the link to the part 1 of this article:&lt;br&gt;
&lt;a href="https://dev.to/codinghappinessweb/how-i-deployed-my-first-machine-learning-model-using-streamlit-part-1-31h9"&gt;https://dev.to/codinghappinessweb/how-i-deployed-my-first-machine-learning-model-using-streamlit-part-1-31h9&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After signing up on Streamlit, you will get an invite from the app, which looks like this.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--U5aoUujw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/sits7xaw5og0s9zrwywh.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--U5aoUujw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/sits7xaw5og0s9zrwywh.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It takes 1–2 days for the request to be accepted; the acceptance invite looks like this.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--GwqEwTvG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/odt4yzhb2i3znce00ekk.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GwqEwTvG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/odt4yzhb2i3znce00ekk.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After signing in, create a GitHub repository and name it, e.g. "Loan Prediction". Add the project files to the repository you created.&lt;/p&gt;

&lt;p&gt;Some of the files you should add to the repository include:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The code for the project&lt;/li&gt;
&lt;li&gt;The python script for the project&lt;/li&gt;
&lt;li&gt;The pickled model file, which is classifier.pkl for this project&lt;/li&gt;
&lt;li&gt;The requirements.txt file&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To generate the requirements.txt file, first install pipreqs by running "pip install pipreqs" at the command prompt. Then open a terminal in the folder containing the Python file for your Streamlit app and run "pipreqs". It will scan the Python files there and create a requirements.txt file for you.&lt;/p&gt;

&lt;p&gt;Next, sign in to Streamlit sharing: create a new app, link your GitHub repository to it, specify the main Python file for the app, and deploy.&lt;/p&gt;

&lt;p&gt;Finally, your app is deployed!&lt;br&gt;
You can now share the link with your friends.&lt;/p&gt;

&lt;h1&gt;
  
  
  End Notes
&lt;/h1&gt;

&lt;p&gt;Congratulations! We have now successfully completed loan prediction model deployment using Streamlit. The deployment is simple, fast, and, most importantly, in Python. I encourage you to first try this particular project, play around with the input values, and check the results. Then you can try out other machine learning projects and deploy them with Streamlit as well.&lt;/p&gt;

&lt;p&gt;Lastly, I would love to hear your feedback and suggestions for this article. If you have any questions related to the article, post them in the comments section below. I will actively look forward to answering them.&lt;/p&gt;

&lt;p&gt;Link to part 1 of the article: &lt;a href="https://dev.to/codinghappinessweb/how-i-deployed-my-first-machine-learning-model-using-streamlit-part-1-31h9"&gt;https://dev.to/codinghappinessweb/how-i-deployed-my-first-machine-learning-model-using-streamlit-part-1-31h9&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can view the app via &lt;a href="https://share.streamlit.io/codinghappiness-web/loanprediction/main/loan_prediction.py/"&gt;Streamlit&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You can access the dataset on &lt;a href="https://github.com/codinghappiness-web/streamlit_app.py/blob/main/train_u6lujuX_CVtuZ9i.csv"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;And my Jupyter notebook on &lt;a href="https://github.com/codinghappiness-web/streamlit_app.py/blob/main/Loan%20Prediction.ipynb"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How I Deployed my First Machine Learning Model Using Streamlit (Part 1)</title>
      <dc:creator>Omale Happiness Ojone</dc:creator>
      <pubDate>Sat, 17 Jul 2021 20:31:39 +0000</pubDate>
      <link>https://dev.to/codinghappinessweb/how-i-deployed-my-first-machine-learning-model-using-streamlit-part-1-31h9</link>
      <guid>https://dev.to/codinghappinessweb/how-i-deployed-my-first-machine-learning-model-using-streamlit-part-1-31h9</guid>
      <description>&lt;p&gt;I believe most of you must have done some form of data science project at some point in your lives, be it a machine learning project, a deep learning project, or even visualizations of your data. And the best part of these projects is to showcase them to others.&lt;/p&gt;

&lt;p&gt;But the question is how will you showcase your work to others? Well, this is where Model Deployment will help you.&lt;/p&gt;

&lt;p&gt;In this article I will be showing you how I was able to deploy my first machine learning model using Streamlit.&lt;/p&gt;

&lt;p&gt;Streamlit is a popular open-source framework used for model deployment by machine learning and data science teams. And the best part is it's free and written purely in Python.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygnksdh285gffuu1j3v9.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygnksdh285gffuu1j3v9.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Preparing Data and Training Model
&lt;/h1&gt;

&lt;p&gt;We will first build a loan prediction model and then deploy it using Streamlit.&lt;/p&gt;

&lt;p&gt;The project that I have picked for this particular article is automating the loan eligibility process.&lt;/p&gt;

&lt;p&gt;The task is to predict whether the loan will be approved or not based on the details provided by customers.&lt;/p&gt;

&lt;p&gt;Based on the details provided by customers, we have to create a model that decides whether or not a loan should be approved, and identify the factors that help us make that prediction.&lt;/p&gt;

&lt;p&gt;As a starting point, here are a couple of factors that I think will be helpful for us with respect to this project:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Amount of loan: The total amount of loan applied for by the customer. My hypothesis here is that the higher the loan amount, the lower the chances of approval, and vice versa.&lt;/li&gt;
&lt;li&gt;Income of applicant: The income of the applicant (customer) can also be a deciding factor. A higher income should lead to a higher probability of loan approval.&lt;/li&gt;
&lt;li&gt;Education of applicant: The educational qualification of the applicant can also be a vital factor in predicting the loan status of a customer. My hypothesis is that the higher the applicant’s educational qualification, the higher the chances of loan approval.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next, we need to collect the data. The dataset of customers and loans is linked at the end of this article.&lt;/p&gt;

&lt;p&gt;We will first import the required libraries and then read the CSV file:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  import pandas as pd
  train = pd.read_csv('train_ctrUa4K.csv') 
  train.head()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr2y3t00dsdaffv3vhdfm.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr2y3t00dsdaffv3vhdfm.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Above are the first five rows from the dataset.&lt;/p&gt;

&lt;p&gt;We know that machine learning models take only numbers as inputs and cannot process strings. So, we have to deal with the categorical features present in the dataset and convert them into numbers:&lt;/p&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; train['Gender']= train['Gender'].map({'Male':0, 'Female':1})
 train['Married']= train['Married'].map({'No':0, 'Yes':1})
 train['Loan_Status']= train['Loan_Status'].map({'N':0, 
 'Y':1})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Above, we have converted the categories present in the Gender, Married, and Loan_Status variables into numbers, simply using the map function of the pandas DataFrame. Next, let’s check if there are any missing values in the dataset:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;     train.isnull().sum()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwh3kerrp0kz1s8pkrhqi.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwh3kerrp0kz1s8pkrhqi.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So, there are missing values in several features, including the Gender, Married, and LoanAmount variables. Next, we will remove all rows that contain any missing values:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;train = train.dropna()
train.isnull().sum()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8nrmx8274629cn13i8hr.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8nrmx8274629cn13i8hr.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now there are no missing values in the dataset. Next, we will separate the dependent (Loan_Status) and the independent variables:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  X = train[['Gender', 'Married', 'ApplicantIncome', 
      'LoanAmount', 'Credit_History']]
  y = train.Loan_Status
  X.shape, y.shape
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzzg5o5xj8wmp3sb4v44s.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzzg5o5xj8wmp3sb4v44s.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We will first split our dataset into a training and validation set, so that we can train the model on the training set and evaluate its performance on the validation set.&lt;/p&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; from sklearn.model_selection import train_test_split
 x_train, x_cv, y_train, y_cv = train_test_split(X,y, 
 test_size = 0.2, random_state = 10)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We have split the data using the train_test_split function from the sklearn library, keeping test_size at 0.2, which means 20 percent of the total dataset will be kept aside for the validation set. Next, we will train a random forest classifier:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;      from sklearn.ensemble import RandomForestClassifier 
      model = RandomForestClassifier(max_depth=4, random_state 
      = 10) 
      model.fit(x_train, y_train)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Now that our model is trained, let’s check its performance on both the training and validation sets:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;      from sklearn.metrics import accuracy_score
      pred_cv = model.predict(x_cv)
      accuracy_score(y_cv,pred_cv)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc6yxtxq93uyywiey0y4y.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc6yxtxq93uyywiey0y4y.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The model is 80% accurate on the validation set. Let’s check the performance on the training set too:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    pred_train = model.predict(x_train)
    accuracy_score(y_train,pred_train)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkt833gsk6zapbftdk9qm.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkt833gsk6zapbftdk9qm.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Performance on the training set is similar to that on the validation set, so the model has generalized well. Finally, we will save this trained model so that it can be used in the future to make predictions on new observations:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;         # saving the model 
         import pickle 
         pickle_out = open("classifier.pkl", mode = "wb") 
         pickle.dump(model, pickle_out) 
         pickle_out.close()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We are saving the model in pickle format and storing it as classifier.pkl. This will store the trained model and we will use this while deploying the model.&lt;/p&gt;
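
&lt;p&gt;As a quick sanity check (a minimal sketch, not part of my original notebook), you can load the pickled file back and confirm the round trip works. The stand-in dictionary below is hypothetical; in the article the pickled object is the trained RandomForestClassifier:&lt;/p&gt;

```python
import pickle

# stand-in for the trained model (in the article, this is the RandomForestClassifier)
model = {"name": "loan_classifier",
         "features": ["Gender", "Married", "ApplicantIncome", "LoanAmount", "Credit_History"]}

# save the object in pickle format, exactly as done for classifier.pkl
with open("classifier.pkl", mode="wb") as pickle_out:
    pickle.dump(model, pickle_out)

# load it back, as the Streamlit app will do at startup
with open("classifier.pkl", mode="rb") as pickle_in:
    restored = pickle.load(pickle_in)

print(restored == model)  # True: the restored object matches what was saved
```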

&lt;p&gt;We will be deploying this loan prediction model using Streamlit, one of the simplest ways of building web apps and deploying machine learning and deep learning models.&lt;/p&gt;

&lt;h1&gt;
  
  
  Model Deployment of the Loan Prediction Model using Streamlit
&lt;/h1&gt;

&lt;p&gt;To create the app, we will start with the basic installation:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; !pip install -q streamlit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Streamlit will be used to make our web app.&lt;/p&gt;

&lt;p&gt;We have to create the python script for our app. Let me show the code first and then I will explain it to you in detail:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;         import pickle
         import streamlit as st

         # loading the trained model
         pickle_in = open('classifier.pkl', 'rb') 
         classifier = pickle.load(pickle_in)

         @st.cache()

         # defining the function which will make the 
         prediction using the data which the user inputs 
         def prediction(Gender, Married, ApplicantIncome, 
             LoanAmount, Credit_History):   

             # Pre-processing user input    
             if Gender == "Male":
                Gender = 0
            else:
                Gender = 1

           if Married == "Unmarried":
              Married = 0
          else:
              Married = 1

          if Credit_History == "Unclear Debts":
             Credit_History = 0
         else:
             Credit_History = 1  

         LoanAmount = LoanAmount / 1000

         # Making predictions 
         prediction = classifier.predict( 
           [[Gender, Married, ApplicantIncome, LoanAmount, 
           Credit_History]])

         if prediction == 0:
            pred = 'Rejected'
         else:
             pred = 'Approved'
         return pred


        #this is the main function in which we define our 
        webpage  
       def main():       
       #front end elements of the web page 
       html_temp = """ 
       &amp;lt;div style ="background-color:yellow;padding:13px"&amp;gt; 
       &amp;lt;h1 style ="color:black;text-align:center;"&amp;gt;Streamlit 
       Loan 
       Prediction ML App&amp;lt;/h1&amp;gt; 
       &amp;lt;/div&amp;gt; 
       """

      #display the front end aspect
      st.markdown(html_temp, unsafe_allow_html = True) 

     #following lines create boxes in which user can enter 
     data 
     required to make prediction 
     Gender = st.selectbox('Gender',("Male","Female"))
     Married = st.selectbox('Marital Status', 
     ("Unmarried","Married")) 
     ApplicantIncome = st.number_input("Applicants monthly 
     income") 
     LoanAmount = st.number_input("Total loan amount")
     Credit_History = st.selectbox('Credit_History',("Unclear 
     Debts","No Unclear Debts"))
     result =""

    #when 'Predict' is clicked, make the prediction and store 
    it 
    if st.button("Predict"): 
       result = prediction(Gender, Married, ApplicantIncome, 
       LoanAmount, Credit_History) 
       st.success('Your loan is {}'.format(result))
       print(LoanAmount)

   if __name__=='__main__': 
        main()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This is the entire python script which will create the app for us. Let me break it down and explain in detail:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz6xo6pl502qs15c7l7pn.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz6xo6pl502qs15c7l7pn.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this part, we save the script as app.py and load the required libraries: pickle to load the trained model and streamlit to build the app. Then we load the trained model and store it in a variable named classifier.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F45qh2ozkjr469v5ajzhx.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F45qh2ozkjr469v5ajzhx.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, we have defined the prediction function. This function takes the data provided by users as input and makes the prediction using the model we loaded earlier. It takes customer details like gender, marital status, income, loan amount, and credit history, pre-processes that input so that it can be fed to the model, and finally makes the prediction using the model loaded as classifier. In the end, it returns whether the loan is approved or not based on the output of the model.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd8kwz1ddhwhf4qrg8qnb.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd8kwz1ddhwhf4qrg8qnb.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And here is the main app. First of all, we are defining the header of the app. It will display “Streamlit Loan Prediction ML App”. To do that, we are using the markdown function from streamlit. Next, we are creating five boxes in the app to take input from the users. These 5 boxes will represent the five features on which our model is trained. &lt;/p&gt;

&lt;p&gt;The first box is for the gender of the user. The user will have two options, Male and Female, and they will have to pick one from them. We are creating a dropdown using the selectbox function of streamlit. Similarly, for Married, we are providing two options, Married and Unmarried and again, the user will pick one from it. Next, we are defining the boxes for Applicant Income and Loan Amount.&lt;/p&gt;

&lt;p&gt;Since both of these variables will be numeric in nature, we are using the number_input function from streamlit. And finally, for the credit history, we are creating a dropdown which will have two categories, Unclear Debts, and No Unclear Debts. &lt;/p&gt;

&lt;p&gt;At the end of the app, there will be a predict button, and after filling in the details, users have to click that button. Once that button is clicked, the prediction function will be called and the resulting Loan Status will be displayed in the app. This completes the web app creation part. And you must have noticed that everything we did is in Python. Isn’t it awesome?&lt;/p&gt;

&lt;p&gt;This part is for running the app on your local machine, not the actual deployment.&lt;br&gt;
I will be explaining the actual deployment in my next article.&lt;/p&gt;

&lt;p&gt;First, run the .py file from its directory in your command prompt:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    streamlit run loan_prediction.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This will generate a link, something like this:&lt;br&gt;
 Local URL: &lt;a href="http://localhost:8501" rel="noopener noreferrer"&gt;http://localhost:8501&lt;/a&gt;&lt;br&gt;
 Network URL: &lt;a href="http://192.168.43.47:8501" rel="noopener noreferrer"&gt;http://192.168.43.47:8501&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Note that the link will vary at your end. You can click on the link which will take you to the web app:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyvda6hcaryb4ssg5ixoa.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyvda6hcaryb4ssg5ixoa.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see, we first have the name displayed at the top. Then we have 5 different boxes that will take input from the user and finally, we have the predict button. Once the user fills in the details and clicks on the Predict button, they will get the status of their loan whether it is approved or rejected.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8p2qichvwvcxgxa1t0a0.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8p2qichvwvcxgxa1t0a0.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And it is as simple as this to build and deploy your machine learning models using Streamlit.&lt;/p&gt;


&lt;p&gt;Link to part 2 of the article: &lt;a href="https://dev.to/codinghappinessweb/how-i-deployed-my-first-machine-learning-model-using-streamlit-part-2-103a"&gt;https://dev.to/codinghappinessweb/how-i-deployed-my-first-machine-learning-model-using-streamlit-part-2-103a&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can view the app via &lt;a href="https://share.streamlit.io/codinghappiness-web/loanprediction/main/loan_prediction.py/" rel="noopener noreferrer"&gt;Streamlit&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can access the dataset on &lt;a href="https://github.com/codinghappiness-web/streamlit_app.py/blob/main/train_u6lujuX_CVtuZ9i.csv" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;And my Jupyter notebook on &lt;a href="https://github.com/codinghappiness-web/streamlit_app.py/blob/main/Loan%20Prediction.ipynb" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>streamlit</category>
      <category>modeldeployment</category>
    </item>
    <item>
      <title>Analysing Dataset Using Naive Bayes Classifier</title>
      <dc:creator>Omale Happiness Ojone</dc:creator>
      <pubDate>Mon, 26 Apr 2021 19:44:09 +0000</pubDate>
      <link>https://dev.to/codinghappinessweb/analysing-dataset-using-naive-bayes-classifier-3d7o</link>
      <guid>https://dev.to/codinghappinessweb/analysing-dataset-using-naive-bayes-classifier-3d7o</guid>
      <description>&lt;p&gt;In this article, we will discuss several things related to Naive Bayes Classifier including:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Introduction to Naive Bayes.&lt;/li&gt;
&lt;li&gt;Naive Bayes with Scikit-Learn.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  1. Introduction to Naive Bayes
&lt;/h2&gt;

&lt;p&gt;The Naive Bayes classifier is a supervised classification algorithm in machine learning. It is based on the Bayes Theorem created by Thomas Bayes, so we must first understand the Bayes Theorem before using the Naive Bayes Classifier.&lt;/p&gt;

&lt;p&gt;The essence of the Bayes theorem is conditional probability: the probability that something will happen, given that something else has already occurred. Using conditional probability, we can find the probability that an event will occur given knowledge of a previous event.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--FD1ch5Fk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ficbozr51lzhxdg1x12a.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--FD1ch5Fk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ficbozr51lzhxdg1x12a.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;P(A|B) = Posterior probability: the probability of A given the value of B.&lt;/li&gt;
&lt;li&gt;P(B|A) = Likelihood: the probability of B given that A is true.&lt;/li&gt;
&lt;li&gt;P(A) = Prior probability: the probability of event A.&lt;/li&gt;
&lt;li&gt;P(B) = Marginal probability: the probability of event B.&lt;/li&gt;
&lt;/ul&gt;
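
&lt;p&gt;The theorem can be checked with a small worked example (the numbers below are made up purely for illustration):&lt;/p&gt;

```python
# Bayes theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_a = 0.01         # prior: probability of event A
p_b_given_a = 0.9  # likelihood: probability of B given A
p_b = 0.05         # marginal: probability of event B

# posterior probability of A given B
p_a_given_b = p_b_given_a * p_a / p_b
print(p_a_given_b)  # approximately 0.18
```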

&lt;p&gt;By using the basis of the Bayes theorem, the Naive Bayes Classifier formula can be written as follows :&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--pHRy5gdg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bzd65norqb2xhsy711a7.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--pHRy5gdg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bzd65norqb2xhsy711a7.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;P(y | x1, … , xj) = Posterior probability: the probability that the data belongs to class y given its features x1 through xj.&lt;/li&gt;
&lt;li&gt;P(x1, … , xj | y) = Likelihood of the feature values given that the class is y.&lt;/li&gt;
&lt;li&gt;P(y) = Prior probability.&lt;/li&gt;
&lt;li&gt;P(x1, … , xj) = Marginal probability.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because the marginal probability is the same for every class in the Naive Bayes calculation, we can ignore it. The Naive Bayes classifier then assigns a data point to the class with the greatest posterior probability.&lt;/p&gt;
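
&lt;p&gt;With the marginal dropped, classification reduces to comparing prior × likelihood products and picking the class with the largest value. A minimal sketch with made-up numbers (the class labels and probabilities below are purely illustrative):&lt;/p&gt;

```python
# unnormalized posterior per class: P(y) * P(x1|y) * ... * P(xj|y)
# (the marginal P(x1, ..., xj) is the same for every class, so it is dropped)
priors = {"y1": 0.6, "y2": 0.4}
likelihoods = {"y1": [0.2, 0.5], "y2": [0.7, 0.3]}  # per-feature likelihoods, assumed independent

scores = {}
for label in priors:
    score = priors[label]
    for lk in likelihoods[label]:
        score *= lk  # naive independence: multiply feature likelihoods
    scores[label] = score

# assign the point to the class with the greatest (unnormalized) posterior
prediction = max(scores, key=scores.get)
print(prediction)  # "y2", since 0.4*0.7*0.3 beats 0.6*0.2*0.5
```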

&lt;h2&gt;
  
  
  2. Naive Bayes with Scikit-Learn
&lt;/h2&gt;

&lt;p&gt;Now that we know how to calculate the Naive Bayes Classifier algorithm manually, we can easily use Scikit-learn, one of the Python libraries used for implementing machine learning. Here I am using the Gaussian Naive Bayes Classifier, and the dataset I use is glass classification, which you can download via the link at the end of this article.&lt;/p&gt;

&lt;p&gt;The steps in solving the Classification Problem using Naive Bayes Classifier are as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Load the library&lt;/li&gt;
&lt;li&gt;Load the dataset&lt;/li&gt;
&lt;li&gt;Visualize the data&lt;/li&gt;
&lt;li&gt;Handling missing values&lt;/li&gt;
&lt;li&gt;Exploratory Data Analysis (EDA)&lt;/li&gt;
&lt;li&gt;Modelling&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;1. Load several Python libraries that will be used to work on this case:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; import pandas as pd
 import numpy as np
 import matplotlib.pyplot as plt
 import seaborn as sns
 from sklearn.naive_bayes import GaussianNB
 from sklearn.metrics import accuracy_score
 from sklearn.model_selection import train_test_split
 import warnings
 warnings.filterwarnings('ignore')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;2. Load the dataset that will be used in working on this case. The dataset used is a glass dataset:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  glass=pd.read_csv("glass.csv")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;3. Look at some general information from the data to find out its characteristics in general:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  #Top five of our data
  glass.head()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--KVSeLe6N--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/k9yyvimk3nwrsyyi77ia.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--KVSeLe6N--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/k9yyvimk3nwrsyyi77ia.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  #Last five of our data
  glass.tail() 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--9FqiHuRv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wz5s2q2ykh0l0ca2ooj3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9FqiHuRv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wz5s2q2ykh0l0ca2ooj3.png" alt="image"&gt;&lt;/a&gt;&lt;br&gt;
:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  #Viewing the number of rows (214) and number of columns / 
  features (10)
  glass.shape
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--lWiY2r38--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zfdgkrrcrjcadb1odiao.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--lWiY2r38--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zfdgkrrcrjcadb1odiao.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;4. Handle missing values in the data, if there are any; if not, we can proceed to the next stage:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  #Data is clean and can continue to the Explorary Data 
  Analysis stage
  glass.isnull().sum()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--9e0v91_9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/b6flmxrxyqkho56ukp88.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9e0v91_9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/b6flmxrxyqkho56ukp88.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;5. Exploratory Data Analysis to find out more about the characteristics of the data:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; #Univariate analysis Type (Target features).
 sns.countplot(df['Type'], color='red')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--_U4xx4XO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/mjhrxx2vl59yf424o6iy.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_U4xx4XO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/mjhrxx2vl59yf424o6iy.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;6. Model our data with Gaussian Naive Bayes from Scikit-Learn:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#Create a Naive Bayes object
nb = GaussianNB()
#Create variable x and y.
x = glass.drop(columns=['Type'])
y = glass['Type']
#Split data into training and testing data 
x_train, x_test, y_train, y_test = train_test_split(x, y, 
test_size=0.2, random_state=4)
#Training the model
nb.fit(x_train, y_train)
#Predict testing set
y_pred = nb.predict(x_test)
#Check performance of model
print(accuracy_score(y_test, y_pred))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--FHSdeZmg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6dreuqqfibdst7ebugd3.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--FHSdeZmg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6dreuqqfibdst7ebugd3.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From the accuracy score, we can see that the value is 48%, which in my opinion still needs to be improved.&lt;br&gt;
From my analysis, the accuracy of the Naive Bayes model is so low because of imbalanced data. So one of the ways I will use to improve the accuracy of my model is data balancing.&lt;/p&gt;
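
&lt;p&gt;One simple way to balance the data (a sketch only, not code from my notebook) is random oversampling: duplicate rows of the minority classes until every class has as many rows as the largest one. The toy rows below are hypothetical stand-ins for the glass dataset:&lt;/p&gt;

```python
import random

random.seed(0)

# toy labelled rows standing in for the glass dataset: (features, Type)
rows = [([1.5, 13.0], 1)] * 6 + [([1.6, 12.0], 2)] * 2  # class 1 dominates class 2

# group rows by class label
by_class = {}
for row in rows:
    by_class.setdefault(row[1], []).append(row)

# oversample each minority class up to the size of the largest class
target = max(len(group) for group in by_class.values())
balanced = []
for group in by_class.values():
    balanced.extend(group)
    balanced.extend(random.choices(group, k=target - len(group)))

counts = {label: sum(1 for r in balanced if r[1] == label) for label in by_class}
print(counts)  # {1: 6, 2: 6} -- every class now has 6 rows
```

A model would then be trained on the balanced rows instead of the raw ones; note that oversampling should only ever be applied to the training split, never the test split.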

&lt;p&gt;Link to the dataset: &lt;a href="https://www.kaggle.com/uciml/glass/download"&gt;https://www.kaggle.com/uciml/glass/download&lt;/a&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>supervisedlearning</category>
      <category>algorithms</category>
    </item>
    <item>
      <title>Analysing Dataset Using KNN</title>
      <dc:creator>Omale Happiness Ojone</dc:creator>
      <pubDate>Sat, 27 Mar 2021 08:20:00 +0000</pubDate>
      <link>https://dev.to/codinghappinessweb/analysing-dataset-using-knn-276n</link>
      <guid>https://dev.to/codinghappinessweb/analysing-dataset-using-knn-276n</guid>
      <description>&lt;p&gt;In this article, I will explain a classification model in detail which is a major type of supervised machine learning. The model we will work on is called a KNN classifier as the title says.&lt;br&gt;
The KNN classifier is a very popular and well-known supervised machine learning technique. This article will explain the KNN classifier with a simple but complete project.&lt;/p&gt;

&lt;h1&gt;
  
  
  What is a supervised learning model?
&lt;/h1&gt;

&lt;p&gt;I will explain it in detail. &lt;br&gt;
But here is what Wikipedia has to say:&lt;br&gt;
Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. It infers a function from labelled training data consisting of a set of training examples.&lt;br&gt;
Supervised learning models take input features (X) and output (y) to train a model. The goal of the model is to learn a function that can take the input features and calculate the output.&lt;br&gt;
I will show a practical example with a real dataset.&lt;/p&gt;

&lt;h1&gt;
  
  
  KNN Classifier
&lt;/h1&gt;

&lt;p&gt;The KNN classifier is an example of a memory-based machine learning model.&lt;br&gt;
That means this model memorizes the labelled training examples and uses them to classify objects it hasn’t seen before.&lt;br&gt;
The k in KNN classifier is the number of training examples it will retrieve in order to predict a new test example.&lt;/p&gt;

&lt;p&gt;KNN classifier works in three steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;When it is given a new instance or example to classify, it retrieves the training examples it memorized before and finds the k closest examples.&lt;/li&gt;
&lt;li&gt;Then the classifier looks up the labels of those k closest examples.&lt;/li&gt;
&lt;li&gt;Finally, the model combines those labels to make a prediction. Usually, it predicts the majority label.&lt;/li&gt;
&lt;/ol&gt;
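&lt;p&gt;The three steps above can be sketched in a few lines of plain Python (a toy illustration of the idea, not scikit-learn's implementation; all names and data are made up):&lt;/p&gt;

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, x, k=3):
    # Step 1: retrieve the k closest memorized examples (Euclidean distance).
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    # Step 2: look up the labels of those k examples.
    labels = [train_y[i] for i in nearest]
    # Step 3: combine the labels by majority vote.
    return Counter(labels).most_common(1)[0][0]

train_X = [[0, 0], [0, 1], [5, 5], [6, 5]]
train_y = ['a', 'a', 'b', 'b']
print(knn_predict(train_X, train_y, [5, 6]))  # → b
```

&lt;p&gt;The query point [5, 6] sits near the two 'b' examples, so two of its three nearest neighbours vote 'b'.&lt;/p&gt;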

&lt;h1&gt;
  
  
  Data Preparation
&lt;/h1&gt;

&lt;p&gt;Before we start, I encourage you to check if you have the following resources available in your computer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Numpy Library&lt;/li&gt;
&lt;li&gt;Pandas Library&lt;/li&gt;
&lt;li&gt;Matplotlib Library&lt;/li&gt;
&lt;li&gt;Scikit-Learn Library&lt;/li&gt;
&lt;li&gt;Jupyter Notebook environment.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Download the dataset. I provided the link at the bottom of the page. Run every line of code yourself if you are reading this to learn.&lt;br&gt;
First, import the necessary libraries:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  %matplotlib notebook
  import numpy as np
  import matplotlib.pyplot as plt
  import pandas as pd
  from sklearn.model_selection import train_test_split
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For this tutorial, I will use the Titanic dataset from Kaggle.&lt;/p&gt;

&lt;p&gt;Here is how I can import the dataset in the notebook using pandas.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;      titanic = pd.read_csv('titanic_data.csv')
      titanic.head()
      #titaninc.head() gives the first five rows of the 
      dataset.  
      #we will print first five rows only to examine the 
      dataset.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo77k495cz36ksvtdslkb.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo77k495cz36ksvtdslkb.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Look at the second column. It contains the information on whether the person survived or not.&lt;br&gt;
0 means the person did not survive and 1 means the person survived.&lt;br&gt;
For this tutorial, our goal will be to predict the ‘Survived’ feature.&lt;br&gt;
This dataset is very simple. Just from intuition, we can see that there are columns that are unlikely to be important for predicting the ‘Survived’ feature.&lt;br&gt;
For example, ‘PassengerId’, ‘Name’, ‘Ticket’, and ‘Cabin’ do not seem useful for predicting whether a passenger survived.&lt;br&gt;
I will make a new DataFrame with a few key features and name the new DataFrame titanic1.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;         titanic1 = titanic[['Pclass', 'Sex', 'Fare', 
         'Survived']]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The ‘Sex’ column has string values, and that needs to change, because the model works with numbers, not words. I will replace ‘male’ with 0 and ‘female’ with 1.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;         titanic1['Sex'] = titanic1.Sex.replace({'male':0, 
         'female':1})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This is what the DataFrame titanic1 looks like:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5krtax2cxxst5g1c6uiz.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5krtax2cxxst5g1c6uiz.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Our goal is to predict the ‘Survived’ parameter based on the other information in the titanic1 DataFrame. So, the output variable or label (y) is ‘Survived’. The input features (X) are ‘Pclass’, ‘Sex’, and ‘Fare’.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    X = titanic1[['Pclass', 'Sex', 'Fare']]
    y = titanic1['Survived']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h1&gt;
  
  
  KNN Classifier Model
&lt;/h1&gt;

&lt;p&gt;To start with, we need to split the dataset into two sets: &lt;br&gt;
a training set and a test set.&lt;br&gt;
We will use the training set to train the model, where the model will memorize both the input features and the output variable.&lt;br&gt;
Then we will use the test set to see if the model can predict whether a passenger survived using ‘Pclass’, ‘Sex’, and ‘Fare’.&lt;br&gt;
The method ‘train_test_split’ will help split the data. By default, this function uses 75% of the data for the training set and 25% for the test set. If you want, you can change that by specifying ‘train_size’ and ‘test_size’.&lt;br&gt;
If you set train_size to 0.8, the split will be 80% training data and 20% test data. But for me the default 75% is good, so I am not using the train_size or test_size parameters.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;           X_train, X_test, y_train, y_test = 
           train_test_split(X, y, random_state=0)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
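&lt;p&gt;As a quick sanity check on the default 75/25 split, here is a toy example (the eight rows of data are made up):&lt;/p&gt;

```python
from sklearn.model_selection import train_test_split

# Eight toy rows: by default 75% (6 rows) go to training, 25% (2 rows) to test.
X = [[i] for i in range(8)]
y = [0, 1, 0, 1, 0, 1, 0, 1]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
print(len(X_train), len(X_test))  # → 6 2
```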

&lt;p&gt;Remember to use the same value for ‘random_state’. That way, every time you do this split, it will take the same data for the training set and test set.&lt;br&gt;
I chose random_state as 0. You can choose a number of your choice.&lt;br&gt;
Python’s scikit-learn library already has a KNN classifier model.&lt;br&gt;
I will import that.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;         from sklearn.neighbors import KNeighborsClassifier
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Save this classifier in a variable.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        knn = KNeighborsClassifier(n_neighbors = 5)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, n_neighbors is 5.&lt;/p&gt;

&lt;p&gt;That means when we ask our trained model to predict the survival chance of a new instance, it will take the 5 closest training examples.&lt;br&gt;
Based on the labels of those 5 training examples, the model will predict the label of the new instance.&lt;br&gt;
Now, I will fit the training data to the model so that the model can memorize it.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;              knn.fit(X_train, y_train)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;You may think that, since it memorized the training data, it can predict the labels of 100% of the training features correctly. But that’s not certain. Why?&lt;br&gt;
Whenever we give it an input and ask it to predict the label, it takes a vote among the 5 closest neighbors, even if it has the exact same feature memorized.&lt;br&gt;
Let’s see how much accuracy it can give us on the training data.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;           knn.score(X_train, y_train)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The training data accuracy I got is 0.83 or 83%.&lt;/p&gt;

&lt;p&gt;Remember, we have a test dataset that our model has never seen. Now check how accurately it can predict the labels of the test dataset.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;knn.score(X_test, y_test)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The accuracy came out to be 0.78 or 78%.&lt;/p&gt;

&lt;h1&gt;
  
  
  Congrats! You developed a KNN classifier!
&lt;/h1&gt;

&lt;p&gt;Notice that the training set accuracy is a bit higher than the test set accuracy. That’s a sign of overfitting.&lt;/p&gt;

&lt;h1&gt;
  
  
  What is Overfitting?
&lt;/h1&gt;

&lt;p&gt;In a single sentence: when a model performs noticeably better on the training set than on the test set, we call it overfitting. The model has fit the training data too closely and does not generalize well to unseen data.&lt;/p&gt;

&lt;h1&gt;
  
  
  Prediction
&lt;/h1&gt;

&lt;p&gt;If you want to see the predicted output for the test dataset, here is how to do that:&lt;/p&gt;

&lt;p&gt;Input:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        y_pred = knn.predict(X_test)
        y_pred
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Output:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi5l3jcqe2ua7ggwd2u6b.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi5l3jcqe2ua7ggwd2u6b.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Or you can just input one single example and find the label.&lt;br&gt;
I want to see whether a person traveling in ‘Pclass’ 3, whose ‘Sex’ is female (that is, 1), and who paid a ‘Fare’ of 25 could survive as per our model.&lt;br&gt;
Input:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;      knn.predict([[3, 1, 25]])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Remember to use two brackets, because it requires a 2D array.&lt;br&gt;
Output:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;     array([0], dtype=int64)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The output is zero. That means as per our trained model the person could not survive.&lt;/p&gt;

&lt;p&gt;Please feel free to try with more different inputs like this one!&lt;/p&gt;

&lt;h1&gt;
  
  
  If You Want to See Some Further Analysis of KNN Classifier
&lt;/h1&gt;

&lt;p&gt;The KNN classifier is highly sensitive to the choice of ‘k’ or n_neighbors. In the example above I used n_neighbors = 5.&lt;br&gt;
For different n_neighbors, the classifier will perform differently.&lt;br&gt;
Let’s check how it performs on the training dataset and test dataset for different n_neighbors values. I chose 1 to 20.&lt;br&gt;
Now, we will calculate the training set accuracy and the test set accuracy for each n_neighbors value from 1 to 20.&lt;/p&gt;

&lt;p&gt;Input:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;         training_accuracy  = []  
         test_accuracy = []
         for i in range(1, 21):
             knn = KNeighborsClassifier(n_neighbors = i)
             knn.fit(X_train, y_train)
             training_accuracy.append(knn.score(X_train, 
             y_train))
             test_accuracy.append(knn.score(X_test, y_test))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After running this code snippet, I got the training and test accuracy for different n_neighbors.&lt;br&gt;
Now, let’s plot the training and test set accuracy against n_neighbors in the same plot.&lt;br&gt;
Input:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          plt.figure()
          plt.plot(range(1, 21), training_accuracy, 
          label='Training Accuarcy')
          plt.plot(range(1, 21), test_accuracy, label='Testing 
          Accuarcy')
          plt.title('Training Accuracy vs Test Accuracy')
          plt.xlabel('n_neighbors')
          plt.ylabel('Accuracy')
          plt.ylim([0.7, 0.9])
          plt.legend(loc='best')
          plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Output:&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fspqhts9rq3ht263f5twj.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fspqhts9rq3ht263f5twj.JPG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Analyze the graph above.&lt;br&gt;
In the beginning, when n_neighbors was 1, 2, or 3, training accuracy was a lot higher than test accuracy, so the model was overfitting heavily.&lt;br&gt;
After that, training and test accuracy became closer. That is the sweet spot. We want that.&lt;br&gt;
But when n_neighbors went even higher, both training and test set accuracy went down. We do not need that.&lt;br&gt;
From the graph above, the best n_neighbors for this particular dataset and model should be 6 or 7.&lt;/p&gt;
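&lt;p&gt;Picking the best k from the accuracy lists can also be done programmatically. A sketch, using made-up test accuracies in place of the loop's real output (index 0 corresponds to k = 1):&lt;/p&gt;

```python
# Made-up test accuracies standing in for the loop's output; index 0 is k = 1.
test_accuracy = [0.72, 0.74, 0.76, 0.78, 0.79, 0.81, 0.81, 0.79, 0.78]

# max over indices returns the first index with the highest accuracy.
best_k = max(range(len(test_accuracy)), key=lambda i: test_accuracy[i]) + 1
print(best_k)  # → 6
```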

&lt;h1&gt;
  
  
  That is a good classifier!
&lt;/h1&gt;

&lt;p&gt;Look at the graph above! When n_neighbors is about 7, both training and testing accuracy were above 80%.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;This article’s purpose was to show a KNN classifier with a project. If you are a machine learning beginner this should help you learn some key concepts of machine learning and the workflow. There are so many different machine learning models out there. But this is the typical workflow of a supervised machine learning model.&lt;/p&gt;

&lt;p&gt;Here is the titanic dataset I used in the article:&lt;br&gt;
&lt;a href="https://www.kaggle.com/biswajee/titanic-dataset" rel="noopener noreferrer"&gt;https://www.kaggle.com/biswajee/titanic-dataset&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>WORKING WITH DATETIME FUNCTION IN PYTHON</title>
      <dc:creator>Omale Happiness Ojone</dc:creator>
      <pubDate>Mon, 25 Jan 2021 08:10:03 +0000</pubDate>
      <link>https://dev.to/codinghappinessweb/working-with-datetime-function-in-python-j3k</link>
      <guid>https://dev.to/codinghappinessweb/working-with-datetime-function-in-python-j3k</guid>
      <description>&lt;h1&gt;
  
  
  Dates for Python
&lt;/h1&gt;

&lt;p&gt;A date in Python is not a data type of its own, but to work with dates as date objects, we can import a module called datetime.&lt;br&gt;
The datetime module is part of Python’s standard library, so there is nothing to install; you simply import it.&lt;br&gt;
So, in this example I will show you how to import the datetime module and display the current date.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;       :
       import datetime
       x = datetime.datetime.now()
       print(x)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h1&gt;
  
  
  Date Output
&lt;/h1&gt;

&lt;p&gt;When we execute the code from the example above the result will be &lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   :
   2021-01-20 14:01:32.454684
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Year, month, day, hour, minute, second, and microseconds are included in the date. The datetime module has several methods for returning information about the date object.&lt;/p&gt;

&lt;p&gt;Here are a few examples; you will learn more about them later in this chapter.&lt;br&gt;
Example:&lt;br&gt;
Return the year and the name of the weekday.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          :
          import datetime
          x = datetime.datetime.now()
          print(x.year)
          print(x.strftime("%A"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h1&gt;
  
  
  Date Output
&lt;/h1&gt;

&lt;p&gt;When we execute the code from the example above the result will be&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; :
 2021
 Wednesday
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h1&gt;
  
  
  Creating Date Objects
&lt;/h1&gt;

&lt;p&gt;You can use the datetime() class constructor of the datetime module to create a date. For creating a date, the datetime() class &lt;br&gt;
takes three parameters: year, month, day.&lt;br&gt;
Example:&lt;br&gt;
Create a date object.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;           :
           import datetime
           x = datetime.datetime(2021, 1, 20)
           print(x)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h1&gt;
  
  
  Date Output
&lt;/h1&gt;

&lt;p&gt;When we execute the code from the example above the result will be&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          :
          2021-01-20 00:00:00
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The datetime() class also takes time and timezone parameters (hour, minute, second, microsecond, tzinfo), but they are optional and default to 0 (None for tzinfo).&lt;/p&gt;
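&lt;p&gt;For example, passing some of the optional time parameters (the values here are arbitrary):&lt;/p&gt;

```python
import datetime

# Year, month, day plus the optional hour, minute, and second.
x = datetime.datetime(2021, 1, 20, 14, 30, 15)
print(x)  # → 2021-01-20 14:30:15
```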

&lt;h1&gt;
  
  
  The strftime() Method
&lt;/h1&gt;

&lt;p&gt;The datetime object has a method for formatting date objects into readable strings. The method is called strftime() and takes one format parameter that defines the format of the returned string.&lt;br&gt;
Example:&lt;br&gt;
Display the name of the month.&lt;/p&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  :&lt;br&gt;
  import datetime&lt;br&gt;
  x = datetime.datetime(2021, 1, 1)&lt;br&gt;
  print(x.strftime("%B"))&lt;br&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h1&gt;
  
  
  Date Output
&lt;/h1&gt;

&lt;p&gt;When we execute the code from the example above the result will be&lt;/p&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  :&lt;br&gt;
 January&lt;br&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h1&gt;
  
  
  A reference of all the legal format codes:
&lt;/h1&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Directive&lt;/th&gt;&lt;th&gt;Description&lt;/th&gt;&lt;th&gt;Example&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;%a&lt;/td&gt;&lt;td&gt;Weekday, short version&lt;/td&gt;&lt;td&gt;Wed&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%A&lt;/td&gt;&lt;td&gt;Weekday, full version&lt;/td&gt;&lt;td&gt;Wednesday&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%w&lt;/td&gt;&lt;td&gt;Weekday as a number 0-6, 0 is Sunday&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%d&lt;/td&gt;&lt;td&gt;Day of month 01-31&lt;/td&gt;&lt;td&gt;31&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%b&lt;/td&gt;&lt;td&gt;Month name, short version&lt;/td&gt;&lt;td&gt;Dec&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%B&lt;/td&gt;&lt;td&gt;Month name, full version&lt;/td&gt;&lt;td&gt;December&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%m&lt;/td&gt;&lt;td&gt;Month as a number 01-12&lt;/td&gt;&lt;td&gt;12&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%y&lt;/td&gt;&lt;td&gt;Year, short version, without century&lt;/td&gt;&lt;td&gt;18&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%Y&lt;/td&gt;&lt;td&gt;Year, full version&lt;/td&gt;&lt;td&gt;2018&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%H&lt;/td&gt;&lt;td&gt;Hour 00-23&lt;/td&gt;&lt;td&gt;17&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%I&lt;/td&gt;&lt;td&gt;Hour 00-12&lt;/td&gt;&lt;td&gt;05&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%p&lt;/td&gt;&lt;td&gt;AM/PM&lt;/td&gt;&lt;td&gt;PM&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%M&lt;/td&gt;&lt;td&gt;Minute 00-59&lt;/td&gt;&lt;td&gt;41&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%S&lt;/td&gt;&lt;td&gt;Second 00-59&lt;/td&gt;&lt;td&gt;08&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%f&lt;/td&gt;&lt;td&gt;Microsecond 000000-999999&lt;/td&gt;&lt;td&gt;548513&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%z&lt;/td&gt;&lt;td&gt;UTC offset&lt;/td&gt;&lt;td&gt;+0100&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%Z&lt;/td&gt;&lt;td&gt;Timezone&lt;/td&gt;&lt;td&gt;CST&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%j&lt;/td&gt;&lt;td&gt;Day number of year 001-366&lt;/td&gt;&lt;td&gt;365&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%U&lt;/td&gt;&lt;td&gt;Week number of year, Sunday as the first day of week, 00-53&lt;/td&gt;&lt;td&gt;52&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%W&lt;/td&gt;&lt;td&gt;Week number of year, Monday as the first day of week, 00-53&lt;/td&gt;&lt;td&gt;52&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%c&lt;/td&gt;&lt;td&gt;Local version of date and time&lt;/td&gt;&lt;td&gt;Mon Dec 31 17:41:00 2018&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%x&lt;/td&gt;&lt;td&gt;Local version of date&lt;/td&gt;&lt;td&gt;12/31/18&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%X&lt;/td&gt;&lt;td&gt;Local version of time&lt;/td&gt;&lt;td&gt;17:41:00&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%%&lt;/td&gt;&lt;td&gt;A % character&lt;/td&gt;&lt;td&gt;%&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%G&lt;/td&gt;&lt;td&gt;ISO 8601 year&lt;/td&gt;&lt;td&gt;2018&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%u&lt;/td&gt;&lt;td&gt;ISO 8601 weekday (1-7)&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;%V&lt;/td&gt;&lt;td&gt;ISO 8601 week number (01-53)&lt;/td&gt;&lt;td&gt;01&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
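&lt;p&gt;Several directives can be combined in one format string; for example, using the same 2018 date as the reference above:&lt;/p&gt;

```python
import datetime

x = datetime.datetime(2018, 12, 31, 17, 41)
# Combine weekday, day, month, year, hour, and minute in one format string.
print(x.strftime("%A, %d %B %Y %H:%M"))  # → Monday, 31 December 2018 17:41
```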

</description>
    </item>
    <item>
      <title>TIPS FOR BEGINNERS IN PYTHON</title>
      <dc:creator>Omale Happiness Ojone</dc:creator>
      <pubDate>Mon, 18 Jan 2021 09:06:28 +0000</pubDate>
      <link>https://dev.to/codinghappinessweb/tips-for-beginners-in-python-ipf</link>
      <guid>https://dev.to/codinghappinessweb/tips-for-beginners-in-python-ipf</guid>
      <description>&lt;h1&gt;
  
  
  WHAT IS PYTHON?
&lt;/h1&gt;

&lt;p&gt;Python is an open-source, interpreted, object-oriented, high-level programming language with dynamic semantics, with applications in numerous areas, including web programming, scientific computing, and artificial intelligence. &lt;/p&gt;

&lt;h1&gt;
  
  
  CHARACTERISTICS OF PYTHON
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;  It has a large standard library&lt;/li&gt;
&lt;li&gt;  It is used in Databases&lt;/li&gt;
&lt;li&gt;  It is used for web scraping&lt;/li&gt;
&lt;li&gt;  Python can be used to develop games&lt;/li&gt;
&lt;li&gt;  It is used for machine learning&lt;/li&gt;
&lt;li&gt;  It is used for Data analytics&lt;/li&gt;
&lt;li&gt;  It is used for web framework &lt;/li&gt;
&lt;li&gt;  It is used for Graphical User Interface&lt;/li&gt;
&lt;li&gt;  It is used for networking and documentation e. t. c.&lt;/li&gt;
&lt;li&gt;  It is simple and powerful&lt;/li&gt;
&lt;li&gt;  Above all it is easy and fun to learn&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  BRIEF HISTORY OF PYTHON
&lt;/h1&gt;

&lt;p&gt;Python was created by Guido van Rossum. Funny enough, Python was not named after the snake; it was named after the British comedy group Monty Python.&lt;br&gt;
The first release of Python was in 1991, version 0.9.0.&lt;br&gt;
In 2000, Python 2.0 was released.&lt;br&gt;
Python 3 was released in December 2008. Although Python 2 and 3 are similar, they have subtle differences; the most noticeable is the print statement.&lt;br&gt;
For example, print "Hello World" works in Python 2, but in Python 3 it raises an error because the argument must be in parentheses: print("Hello World").&lt;/p&gt;

&lt;h1&gt;
  
  
  USEFUL RESOURCES TO LEARN PYTHON
&lt;/h1&gt;

&lt;p&gt;If you decide to learn Python in 2021, then here are some of the useful Python books, courses, and tutorials to start your journey in the beautiful world of Python.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Complete Python MasterClass&lt;/li&gt;
&lt;li&gt;The Python Bible — Everything You Need to Program in Python&lt;/li&gt;
&lt;li&gt;Python Fundamentals by Pluralsight&lt;/li&gt;
&lt;li&gt;10 Free Python Programming EBooks and PDF&lt;/li&gt;
&lt;li&gt;Traversy Media on YouTube: Brad's beginner Python course explains things in a way you understand clearly, and it is practical, so you will see how the code actually works.&lt;/li&gt;
&lt;li&gt;Sololearn: Sololearn is a great app for learning how to code. Its Python course is really amazing and covers all you need to know about Python as a beginner; there are questions and answers too.&lt;/li&gt;
&lt;li&gt;Udemy: a good place to learn too; there are both paid and free Python courses on the Udemy platform.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  REASONS WHY YOU SHOULD LEARN PYTHON
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Simplicity - This is the single biggest reason for beginners to learn Python. When you first start with programming and coding, you don’t want a language with tough syntax and weird rules. Python is both readable and simple. It is also easier to set up; you don’t need to deal with classpath problems like Java or compiler issues like C++. Just install Python and you are done. While installing, it will also ask you to add Python to PATH, which means you can run Python from anywhere on your machine.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Multipurpose - One of the things I like about Python is its Swiss Army knife nature. It’s not tied to just one thing, unlike R, which is good for data science and machine learning but nowhere when it comes to web development. Learning Python means you can do many things.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Jobs and Growth - Python is growing really fast, and it makes a lot of sense to learn a growing major programming language if you are just starting your programming career.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Salary - Python developers are among the highest-paid developers, particularly in data science, machine learning, and web development. On average, they are very well paid, ranging from 70,000 USD to 150,000 USD depending on experience, location, and domain.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Huge Community - You need a community to learn a new technology, and friends are your biggest asset when learning a programming language. You often get stuck on one issue or another, and that is when you need a helping hand. Thanks to Google, you can find the solution to your Python-related problem in minutes. Communities like StackOverflow also bring many Python experts together to help newcomers.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  TIPS TO GETTING STARTED WITH PYTHON
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Dedication&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Consistency&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Make friends with experts&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Teach&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Build projects&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Contribute to open source&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Rest: very important&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Closing Notes:&lt;br&gt;
Thanks, you made it to the end of the article. Good luck with your Python journey! It’s certainly a great decision and will pay off a lot in the near future.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How to send emails using python</title>
      <dc:creator>Omale Happiness Ojone</dc:creator>
      <pubDate>Mon, 18 Jan 2021 08:38:44 +0000</pubDate>
      <link>https://dev.to/codinghappinessweb/how-to-send-emails-using-python-3e63</link>
      <guid>https://dev.to/codinghappinessweb/how-to-send-emails-using-python-3e63</guid>
      <description>&lt;p&gt;What are emails?&lt;br&gt;
Emails are messages distributed by electronic means from one computer user to another. There can be many more recipient as well via network.&lt;/p&gt;

&lt;p&gt;Methods we can use to send an email:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;We can use Python web automation with Selenium.&lt;/li&gt;
&lt;li&gt;We can use Python’s SMTP library.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For this article I will explain how to use the SMTP library.&lt;br&gt;
SMTP stands for Simple Mail Transfer Protocol. The smtplib module defines an SMTP client session object that can be used to send an email to any other internet machine with an SMTP or ESMTP listener daemon.&lt;/p&gt;

&lt;h1&gt;
  
  
  Here's the full code
&lt;/h1&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import smtplib
server = smtplib.SMTP_SSL('smtp.gmail.com', 465)
server.login('example@gmail.com', 'password')
server.sendmail('example@gmail.com', 'contact1@gmail.com', 'Hi happiness, how are you?')
server.quit()
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Quickly what you should note:&lt;br&gt;
"&lt;a href="mailto:example@gmail.com"&gt;example@gmail.com&lt;/a&gt;"-your email address should be there&lt;br&gt;
"password"-the password to your email address&lt;br&gt;
"&lt;a href="mailto:contact1@gmail.com"&gt;contact1@gmail.com&lt;/a&gt;"-the receiver's email address.&lt;br&gt;
then you go ahead with the body of the message.&lt;/p&gt;
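&lt;p&gt;A slightly fuller sketch builds the message with the standard-library email.message module, so you can set a subject line (the addresses and subject are placeholders; sending still uses smtplib as above):&lt;/p&gt;

```python
from email.message import EmailMessage

# Build the message object; the addresses here are placeholders.
msg = EmailMessage()
msg['From'] = 'example@gmail.com'
msg['To'] = 'contact1@gmail.com'
msg['Subject'] = 'Greetings'
msg.set_content('Hi happiness, how are you?')

print(msg['Subject'])  # → Greetings

# To actually send it:
# import smtplib
# with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
#     server.login('example@gmail.com', 'password')
#     server.send_message(msg)
```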

&lt;p&gt;So here's the output&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--SgxDPJ-K--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1610347510124/y6QB8bxJZ.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SgxDPJ-K--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1610347510124/y6QB8bxJZ.jpeg" alt="gmail ans.JPG" width="468" height="258"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally, you have to enable "less secure app access" in your Google account in order to send the message.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
