<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Aya El Sherif</title>
    <description>The latest articles on DEV Community by Aya El Sherif (@ayaelsherif).</description>
    <link>https://dev.to/ayaelsherif</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F764563%2Fc493f395-7b91-4993-85c6-105f9fd857b1.jpg</url>
      <title>DEV Community: Aya El Sherif</title>
      <link>https://dev.to/ayaelsherif</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ayaelsherif"/>
    <language>en</language>
    <item>
      <title>Why companies need to have a Data &amp; AI team as soon as possible?</title>
      <dc:creator>Aya El Sherif</dc:creator>
      <pubDate>Fri, 20 Feb 2026 20:05:16 +0000</pubDate>
      <link>https://dev.to/ayaelsherif/why-companies-need-to-have-data-ai-team-as-soon-as-possible-1ep5</link>
      <guid>https://dev.to/ayaelsherif/why-companies-need-to-have-data-ai-team-as-soon-as-possible-1ep5</guid>
      <description>&lt;p&gt;My name is Aya and I am here to share my insights about why companies might need to consider having their Data &amp;amp; AI department as soon as possible!&lt;/p&gt;

&lt;p&gt;In the past, business decisions were based on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Experience&lt;/li&gt;
&lt;li&gt;Intuition&lt;/li&gt;
&lt;li&gt;Historical reports&lt;/li&gt;
&lt;li&gt;People's personal visions and biases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Today, Artificial Intelligence is completely changing the rules, not only in tech but also in major business decisions and our daily lives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;From intuition → data-driven decisions&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of guessing, AI systems analyze millions of data points in seconds to deliver accurate insights that support better decisions.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
An e-commerce company can predict which products will sell fastest and be in demand before it happens.&lt;br&gt;
Here we move from static reports → &lt;em&gt;real-time&lt;/em&gt; decisions.&lt;br&gt;
Companies no longer need to wait until the end of the month to review performance and then make a decision about each product separately.&lt;/p&gt;

&lt;p&gt;With AI, you can get:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Immediate problem detection&lt;/li&gt;
&lt;li&gt;Immediate risk response&lt;/li&gt;
&lt;li&gt;Continuous performance optimization&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Example:&lt;br&gt;
Fraud detection systems flag suspicious transactions the moment they occur. Have you ever wondered how that can happen?&lt;/p&gt;
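&lt;p&gt;The post doesn't describe how such a system works internally. As a toy illustration only (real fraud systems use far richer features and learned models), one simple idea is to flag any transaction that deviates sharply from a customer's historical spending:&lt;/p&gt;

```python
# Toy anomaly flagging: mark transactions far outside historical spending.
# This is only a sketch of the "flag it the moment it occurs" idea,
# not how production fraud-detection systems actually work.

def flag_suspicious(history, new_amounts, threshold=3.0):
    """Flag amounts more than `threshold` std devs from the historical mean."""
    n = len(history)
    mean = sum(history) / n
    variance = sum((x - mean) ** 2 for x in history) / n
    std = variance ** 0.5
    return [abs(a - mean) > threshold * std for a in new_amounts]

history = [25.0, 30.0, 28.0, 35.0, 27.0, 31.0]   # typical purchases
incoming = [29.0, 900.0]                          # one normal, one suspicious
print(flag_suspicious(history, incoming))         # → [False, True]
```

&lt;p&gt;Because the check is a constant-time computation on each incoming transaction, it can run the moment the transaction occurs.&lt;/p&gt;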

&lt;p&gt;&lt;strong&gt;From analyzing the past → predicting the future&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of only understanding what happened, companies can also anticipate what will happen.&lt;br&gt;
With AI, you can:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Predict customer behavior&lt;/li&gt;
&lt;li&gt;Forecast churn&lt;/li&gt;
&lt;li&gt;Avoid operational failures&lt;/li&gt;
&lt;li&gt;Assess financial risks&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;From management reports → intelligent decision support&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Rather than reading lengthy reports, executives and decision makers can get:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Smart recommendations&lt;/li&gt;
&lt;li&gt;Proactive alerts&lt;/li&gt;
&lt;li&gt;“What-if” scenarios with all possibilities and probabilities.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;What does this mean for businesses and business owners?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Organizations that adopt AI-driven decision making gain:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Faster decisions&lt;/li&gt;
&lt;li&gt;Calculated, reduced risk&lt;/li&gt;
&lt;li&gt;Increased profitability and revenue&lt;/li&gt;
&lt;li&gt;A real competitive advantage&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;The truth&lt;/strong&gt;&lt;br&gt;
AI does not replace decision-makers; it makes their decisions smarter and much faster.&lt;/p&gt;

&lt;p&gt;If you lead a company today,&lt;br&gt;
Are your decisions driven by intuition or by data?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>discuss</category>
      <category>digitaltransformation</category>
      <category>businesstalk</category>
    </item>
    <item>
      <title>Intro to Machine Learning</title>
      <dc:creator>Aya El Sherif</dc:creator>
      <pubDate>Tue, 21 Dec 2021 01:54:18 +0000</pubDate>
      <link>https://dev.to/ayaelsherif/intro-to-machine-learning-fk9</link>
      <guid>https://dev.to/ayaelsherif/intro-to-machine-learning-fk9</guid>
      <description>&lt;p&gt;&lt;strong&gt;Welcome to my third blog!&lt;/strong&gt;&lt;br&gt;
In this blog, I am revising basic concepts from the Kaggle course (Intro to Machine Learning), and we'll build our very first model. It's totally basic, so it doesn't require any prior experience in this topic. Let's start!&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Lesson 1 :&lt;/strong&gt;  How do models work?&lt;/p&gt;

&lt;p&gt;Let's walk through the example there:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your cousin has made millions of dollars speculating on real estate. He's offered to become business partners with you because of your interest in data science. He'll supply the &lt;u&gt;money&lt;/u&gt;, and you'll supply &lt;u&gt;models &lt;/u&gt;that predict how much various houses are worth.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When asked how he predicted house prices, he said "intuition". But more questioning reveals that he has identified price patterns from houses he has seen in the past, and he uses those patterns to make predictions for new houses he is considering.&lt;/p&gt;

&lt;p&gt;Machine learning works the same way. We'll start with a model called the &lt;strong&gt;Decision Tree&lt;/strong&gt;. There are fancier models that give more accurate predictions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fshkdge6qxtngk33rdyxu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fshkdge6qxtngk33rdyxu.png" alt="Decision tree" width="475" height="382"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What is that?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The DT (decision tree) divides houses into only &lt;u&gt;two categories&lt;/u&gt;.&lt;/li&gt;
&lt;li&gt;We use data to decide how to break the houses into two groups&lt;/li&gt;
&lt;li&gt;Then again to determine the predicted price in each group. This step of capturing patterns from data is called &lt;u&gt;fitting &lt;/u&gt;or &lt;u&gt;training &lt;/u&gt;the model.&lt;/li&gt;
&lt;li&gt;After the model has been fit, you can apply it to new data to predict prices of additional homes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Improving the Decision Tree&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffvhw858jq7p7oz3ofrxp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffvhw858jq7p7oz3ofrxp.png" alt="DT" width="800" height="318"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, it may pop into your mind that DT 1 definitely makes more sense: the more bedrooms, the higher the price, right?&lt;br&gt;
Well, that's not totally true.&lt;br&gt;
There are extra features (e.g. lot size, crime rate and so on).&lt;br&gt;
This leads us to a &lt;strong&gt;deeper tree&lt;/strong&gt; that covers more features affecting the predicted price; these are the extra "splits". &lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flcpmvfdnjfjv5que03m0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flcpmvfdnjfjv5que03m0.png" alt="Deep DT" width="709" height="449"&gt;&lt;/a&gt;&lt;br&gt;
A &lt;strong&gt;leaf&lt;/strong&gt; is where we have our predicted price.&lt;/p&gt;

&lt;p&gt;The splits and values at the leaves will be determined by the data, so we need to check out the data we'll be working with.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lesson 2 :&lt;/strong&gt; Basic Data Exploration (Examine your data)&lt;br&gt;
To build any ML model, we need to be familiar with and fully understand our data. One of the best-known libraries for doing so is "Pandas".&lt;br&gt;
What's pandas?&lt;br&gt;
&lt;strong&gt;Pandas&lt;/strong&gt; is the primary tool used for exploring and manipulating data.&lt;br&gt;
&lt;em&gt;Pandas =&amp;gt; pd&lt;/em&gt;&lt;br&gt;
Let's import it :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pandas as pd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The most important part of Pandas is the "DataFrame".&lt;br&gt;
A &lt;strong&gt;DataFrame&lt;/strong&gt; holds the type of data you might think of as a table. This is similar to a sheet in &lt;u&gt;Excel&lt;/u&gt;, or a table in a &lt;u&gt;SQL &lt;/u&gt;database.&lt;/p&gt;

&lt;p&gt;Pandas has powerful methods for most things we'll want to do with this type of data.&lt;br&gt;
Let's do some code!&lt;br&gt;
Check this dataset : &lt;a href="https://www.kaggle.com/dansbecker/melbourne-housing-snapshot" rel="noopener noreferrer"&gt;Homes in Melbourne, Australia&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As usual, Check my &lt;a href="https://colab.research.google.com/drive/1wjJylgRKrS8oz_yqrnv4TPf3x5R7hVWA?usp=sharing" rel="noopener noreferrer"&gt;code&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Interpreting Data Description&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The results show &lt;strong&gt;8&lt;/strong&gt; numbers for each column in our original dataset. The first number is the &lt;u&gt;count&lt;/u&gt;, which shows how many rows have non-missing values.&lt;/p&gt;

&lt;p&gt;Missing values arise for many reasons. For example, the size of the 2nd bedroom wouldn't be collected when surveying a 1 bedroom house. We'll come back to the topic of missing data.&lt;/p&gt;

&lt;p&gt;The second value is the &lt;u&gt;mean&lt;/u&gt;, which is the average.&lt;br&gt;
The third value is &lt;u&gt; std &lt;/u&gt;(standard deviation) which measures how numerically spread out the values are.&lt;/p&gt;

&lt;p&gt;To interpret the &lt;strong&gt;min, 25%, 50%, 75% and max values&lt;/strong&gt;, imagine sorting each column from lowest to highest value.&lt;br&gt;
The first (smallest) value is the min.&lt;br&gt;
If you go a quarter of the way through the list, you'll find a number that is bigger than 25% of the values and smaller than 75% of the values; that is the &lt;strong&gt;25%&lt;/strong&gt; value (pronounced "25th percentile").&lt;br&gt;
The 50th and 75th percentiles are defined analogously and the max is the largest number.&lt;/p&gt;
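&lt;p&gt;These eight summary numbers are easy to verify by hand on a tiny column. A minimal sketch (the values below are made up purely for illustration):&lt;/p&gt;

```python
import pandas as pd

# A tiny DataFrame whose describe() output we can verify by hand.
df = pd.DataFrame({"Rooms": [1, 2, 3, 4]})
desc = df["Rooms"].describe()

print(desc["count"])  # 4 non-missing rows
print(desc["mean"])   # (1 + 2 + 3 + 4) / 4 = 2.5
print(desc["min"], desc["25%"], desc["50%"], desc["75%"], desc["max"])
```

&lt;p&gt;Sorting [1, 2, 3, 4] and walking a quarter of the way through gives the 25th percentile of 1.75 (pandas interpolates between 1 and 2), the median 2.5, and the 75th percentile 3.25.&lt;/p&gt;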

&lt;p&gt;&lt;strong&gt;Lesson 3 :&lt;/strong&gt;  Your First Machine Learning Model&lt;/p&gt;

&lt;p&gt;In this lesson, we'll apply what is explained above to build a model. Let's go!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Selecting Data for Modeling&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We have many variables here, so we'll pick a few of them using our intuition (for now).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;To choose variables/columns, we'll need to see a list of all columns in the dataset.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;=&amp;gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;melbourne_data.columns
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output =&amp;gt; Index(['Suburb', 'Address', 'Rooms', 'Type', 'Price', 'Method', 'SellerG','Date', 'Distance', 'Postcode', 'Bedroom2', 'Bathroom', 'Car','Landsize', 'BuildingArea', 'YearBuilt', 'CouncilArea', 'Lattitude','Longtitude', 'Regionname', 'Propertycount'],dtype='object')&lt;/p&gt;

&lt;p&gt;We have some missing values&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We will take the simplest option for now and &lt;u&gt;drop&lt;/u&gt; houses with missing values from our data. (dropna, as we can consider for now that "na" means "not available".)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;melbourne_data = melbourne_data.dropna(axis=0)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, we'll select pieces from our data&lt;/p&gt;

&lt;p&gt;&lt;u&gt;Two approaches to be followed : &lt;/u&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Dot notation, which we use to select the "prediction target"&lt;/li&gt;
&lt;li&gt;Selecting with a column list, which we use to select the "features"&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Selecting The Prediction Target&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can pull out a variable with dot notation "."&lt;br&gt;
This single column is stored in a Series, which is like a DataFrame with only a single column of data.&lt;/p&gt;

&lt;p&gt;We'll use the dot notation to select the column we want to predict, which is called the &lt;strong&gt;prediction target&lt;/strong&gt;.&lt;br&gt;
We'll call the prediction target "y".&lt;br&gt;
So we need to save the house prices in the Melbourne data :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;y = melbourne_data.Price
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;strong&gt;Choosing "Features"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The columns or "features." In our case, those would be used to determine the &lt;strong&gt;home price&lt;/strong&gt;. Sometimes, we will use all columns except the target one as features. Other times it'd be better with fewer features.&lt;/p&gt;

&lt;p&gt;For now, we'll build a model with only a few features. Later on we'll see how to iterate and compare models built with different features.&lt;/p&gt;

&lt;p&gt;We select multiple features by providing a list of column names.&lt;/p&gt;

&lt;p&gt;&lt;u&gt;Here is an example:&lt;/u&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;melbourne_features = ['Rooms', 'Bathroom', 'Landsize', 'Lattitude', 'Longtitude']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We'll call our data "X"&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;X = melbourne_data[melbourne_features]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's look at the data in more depth :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;X.describe()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output =&amp;gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;             Rooms     Bathroom     Landsize    Lattitude   Longtitude
count  6196.000000  6196.000000  6196.000000  6196.000000  6196.000000
mean      2.931407     1.576340   471.006940   -37.807904   144.990201
std       0.971079     0.711362   897.449881     0.075850     0.099165
min       1.000000     1.000000     0.000000   -38.164920   144.542370
25%       2.000000     1.000000   152.000000   -37.855438   144.926198
50%       3.000000     1.000000   373.000000   -37.802250   144.995800
75%       4.000000     2.000000   628.000000   -37.758200   145.052700
max       8.000000     8.000000 37000.000000   -37.457090   145.526350
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;And&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;X.head()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output =&amp;gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   Rooms  Bathroom  Landsize  Lattitude  Longtitude
1      2       1.0     156.0   -37.8079    144.9934
2      3       2.0     134.0   -37.8093    144.9944
4      4       1.0     120.0   -37.8072    144.9941
6      3       2.0     245.0   -37.8024    144.9993
7      2       1.0     256.0   -37.8060    144.9954
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Building Our Model&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We will use the &lt;strong&gt;scikit-learn&lt;/strong&gt; library to create our model.&lt;br&gt;
(sklearn) is the most popular library for modeling the types of data typically stored in DataFrames.&lt;/p&gt;

&lt;p&gt;&lt;u&gt;The steps to building and using a model are:&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Define:&lt;/strong&gt; What type of model will it be? A decision tree? Some other type of model? Some other parameters of the model type are specified too.&lt;br&gt;
&lt;strong&gt;Fit:&lt;/strong&gt; Capture patterns from provided data. This is the heart of modeling.&lt;br&gt;
&lt;strong&gt;Predict:&lt;/strong&gt; Just what it sounds like.&lt;br&gt;
&lt;strong&gt;Evaluate:&lt;/strong&gt; Determine how accurate the model's predictions are.&lt;/p&gt;

&lt;p&gt;Here is an example of defining a decision tree model with scikit-learn and fitting it with the features and target variable.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sklearn.tree import DecisionTreeRegressor

# Define model. Specify a number for random_state to ensure the same results each run
melbourne_model = DecisionTreeRegressor(random_state=1)

# Fit model
melbourne_model.fit(X, y)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output =&amp;gt;&lt;br&gt;
DecisionTreeRegressor(random_state=1)&lt;/p&gt;

&lt;p&gt;Many machine learning models allow some randomness in model training.&lt;br&gt;
Specifying a number for random_state ensures you get the same results in each run.&lt;br&gt;
We can use any number; model quality won't depend meaningfully on which value we choose.&lt;/p&gt;

&lt;p&gt;We now have a &lt;strong&gt;fitted model&lt;/strong&gt; that we can use to &lt;strong&gt;make predictions&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print("Making predictions for the following 5 houses:")
print(X.head())
print("The predictions are")
print(melbourne_model.predict(X.head()))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output =&amp;gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Making predictions for the following 5 houses:
   Rooms  Bathroom  Landsize  Lattitude  Longtitude
1      2       1.0     156.0   -37.8079    144.9934
2      3       2.0     134.0   -37.8093    144.9944
4      4       1.0     120.0   -37.8072    144.9941
6      3       2.0     245.0   -37.8024    144.9993
7      2       1.0     256.0   -37.8060    144.9954
The predictions are
[1035000. 1465000. 1600000. 1876000. 1636000.]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;p&gt;Check my final code from &lt;a href="https://colab.research.google.com/drive/1cT2Sc8zw3Fiwz6oQ4Srk70kYneTe4VV6?usp=sharing" rel="noopener noreferrer"&gt;Here&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;That's all for today. We covered half of the course, and we'll continue in the upcoming blog!&lt;br&gt;
I hope you learned something and now know how to build a basic model.&lt;/p&gt;




&lt;p&gt;&lt;u&gt;Resources and docs : &lt;/u&gt;&lt;/p&gt;

&lt;p&gt;1. &lt;a href="https://www.kaggle.com/learn/intro-to-machine-learning" rel="noopener noreferrer"&gt;Kaggle Course&lt;/a&gt;&lt;br&gt;
2. &lt;a href="https://www.w3schools.com/python/pandas/pandas_intro.asp" rel="noopener noreferrer"&gt;W3schools&lt;/a&gt;&lt;br&gt;
3. &lt;a href="https://pandas.pydata.org/" rel="noopener noreferrer"&gt;Pandas documentation&lt;/a&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>python</category>
      <category>beginners</category>
      <category>discuss</category>
    </item>
    <item>
      <title>Foundations of probability (1)</title>
      <dc:creator>Aya El Sherif</dc:creator>
      <pubDate>Thu, 16 Dec 2021 11:29:37 +0000</pubDate>
      <link>https://dev.to/ayaelsherif/foundations-of-probability-1-598a</link>
      <guid>https://dev.to/ayaelsherif/foundations-of-probability-1-598a</guid>
      <description>&lt;p&gt;As explained before in our previous blog that probabilities is essential to explore more in the data science field So, Let's start our today's journey!&lt;/p&gt;




&lt;p&gt;You may wonder what probability is and what its role in data science is.&lt;br&gt;
Probability is the foundation of many models and methods in data science; we can't really build a good model without knowing its basic concepts.&lt;/p&gt;


&lt;h2&gt;
  
  
  Flipping a coin
&lt;/h2&gt;

&lt;p&gt;Do you remember "heads and tails" from primary school? We'll travel back to this basic example.&lt;br&gt;
As usual, we'll pair the theory with a piece of Python code.&lt;/p&gt;

&lt;p&gt;&lt;u&gt;&lt;strong&gt;Bernoulli trial&lt;/strong&gt;&lt;/u&gt;&lt;br&gt;
The possible outcomes here are binary, and can be modeled as (Yes/No), (On/Off), (Heads/Tails), (Success/Failure) and so on.&lt;br&gt;
In our case it's success (heads) or failure (tails).&lt;br&gt;
Each outcome is called an &lt;strong&gt;Event&lt;/strong&gt;.&lt;br&gt;
For a fair coin flip, we have a 50% chance of getting heads and a 50% chance of getting tails on each flip.&lt;/p&gt;

&lt;p&gt;Let's simulate the coin flips. We'll be using the "bernoulli" object from the Python library "scipy.stats".&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#Generate rvs for random variates using arg. p for success prob. and size for no. of coin flips.
from scipy.stats import bernoulli
bernoulli.rvs(p=0.5, size=1)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output =&amp;gt;&lt;br&gt;
array([0]) is the output the first time, which means failure, or tails.&lt;br&gt;
array([1]) is the output if you run it again, which means success, or heads. &lt;/p&gt;


&lt;h2&gt;
  
  
  Flipping multiple coins
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Change the size of flips&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bernoulli.rvs (p=0.5, size=10)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output =&amp;gt; array([0, 1, 1, 0, 1, 0, 1, 0, 1, 0])&lt;br&gt;
So, how many heads are there? Let's explore this together!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sum(bernoulli.rvs(p=0.5, size=10))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output =&amp;gt; 5. This means 5 heads and 5 tails.&lt;br&gt;
Let's rerun it and see what happens.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sum((bernoulli.rvs(p=0.5, size=10)))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output =&amp;gt; 2. This means we have 2 heads and 8 tails.&lt;/p&gt;
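&lt;p&gt;This run-to-run variation shrinks as the number of flips grows. A quick sketch, with random_state fixed so the result is reproducible:&lt;/p&gt;

```python
from scipy.stats import bernoulli

# With only 10 flips, the fraction of heads bounces around from run to run;
# with 100,000 flips it settles very close to the true p = 0.5.
few = bernoulli.rvs(p=0.5, size=10, random_state=42)
many = bernoulli.rvs(p=0.5, size=100_000, random_state=42)

print(sum(few) / len(few))     # a noisy estimate of p
print(sum(many) / len(many))   # very close to 0.5
```

&lt;p&gt;This is the law of large numbers in action: the more trials we average, the closer the sample fraction gets to the true probability.&lt;/p&gt;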

&lt;p&gt;&lt;u&gt;&lt;strong&gt;Using binomial distribution for independent Bernoulli trials&lt;/strong&gt;&lt;/u&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;  n =&amp;gt; No. of the coin flips&lt;/li&gt;
&lt;li&gt;  p =&amp;gt; Probability of success&lt;/li&gt;
&lt;li&gt;  size =&amp;gt; No. of draws of the same experiment&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let's simulate the coin flips again. We'll be using the "binom" object from the Python library "scipy.stats".&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#**Binomial r.v.**
from scipy.stats import binom
binom.rvs (n=10 , p = 0.5 , size = 1)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output =&amp;gt; array([7]). This means we got 7 heads out of 10 flips.&lt;br&gt;
Let's now try drawing 10 times.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;binom.rvs(n=10, p=0.5 , size=10) 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output =&amp;gt; array([6, 5, 6, 6, 7, 6, 4, 6, 5, 6]). In this particular sample, "6" is the count that appears most often; over many draws, the most likely count of heads for a fair coin is 5 (n × p).&lt;/p&gt;
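&lt;p&gt;We can also compute exact probabilities with binom.pmf(). For a fair coin and n=10, a count of 5 heads is the single most likely outcome, so the many 6s above are just sampling noise:&lt;/p&gt;

```python
from scipy.stats import binom

# Exact probability of getting exactly k heads in 10 fair flips.
p5 = binom.pmf(k=5, n=10, p=0.5)
p6 = binom.pmf(k=6, n=10, p=0.5)

print(p5)  # 0.24609375  (= C(10,5) / 2**10)
print(p6)  # 0.205078125 (= C(10,6) / 2**10)
```

&lt;p&gt;rvs() samples from the distribution, while pmf() gives the exact probability of each outcome.&lt;/p&gt;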

&lt;p&gt;&lt;strong&gt;Biased coin draws&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;binom.rvs(n=10, p=0.3 , size=10)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output =&amp;gt; array([2, 5, 3, 4, 2, 4, 1, 4, 2, 5]). Changing the probability of getting heads to 0.3 leads to noticeably lower counts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Random number generator seed&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A seed lets us reproduce the outcome of a random experiment.&lt;/li&gt;
&lt;li&gt;If you run the same command with the same random seed, you will always get the same result. In Python, we set a seed so the generator produces the same outcomes in each experiment; then we can check whether the results are what we expected. There are two ways to configure the generator: using the (random_state) parameter of the rvs function &lt;strong&gt;or&lt;/strong&gt; using (np.random.seed).
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from scipy.stats import binom
binom.rvs(n=10, p=0.5, size=1, random_state=42)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Or&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from scipy.stats import binom
import numpy as np
np.random.seed(42)
binom.rvs(n=10, p=0.5, size=1)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output =&amp;gt; array([4])&lt;/p&gt;




&lt;p&gt;Today's material is done, but let's do some practice now!&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;u&gt;&lt;strong&gt;Practice&lt;/strong&gt;&lt;/u&gt;
&lt;/h2&gt;

&lt;p&gt;This exercise requires the bernoulli object from the scipy.stats library to simulate the two possible outcomes from a coin flip, 1 ("heads") or 0 ("tails"), and the numpy library (loaded as np) to set the random generator seed.&lt;/p&gt;

&lt;p&gt;We'll use the bernoulli.rvs() function to simulate coin flips using the size argument.&lt;/p&gt;

&lt;p&gt;We will set the random seed so you can reproduce the results for the random experiment in each exercise.&lt;/p&gt;

&lt;p&gt;From each experiment, you will get the values of each coin flip. You can add the coin flips to get the number of heads after flipping 10 coins using the sum() function.&lt;/p&gt;

&lt;p&gt;Steps:&lt;/p&gt;

&lt;p&gt;Import bernoulli from scipy.stats and set the seed with np.random.seed(). Simulate 10 flips, each with a 35% chance of heads.&lt;br&gt;
Use bernoulli.rvs() and sum() to get the number of heads after 10 coin flips with 35% chance of getting heads.&lt;br&gt;
Using bernoulli.rvs() and sum(), try to get the number of heads after 5 flips with a 50% chance of getting heads.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Import numpy
import numpy as np
# Import the bernoulli object from scipy.stats
from scipy.stats import bernoulli

# Set the random seed to reproduce the results
np.random.seed(42)

# Simulate 10 coin flips, each with a 35% chance of getting heads
coin_flips = bernoulli.rvs(p=0.35, size=10)
print(coin_flips)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output =&amp;gt; [0 1 1 0 0 0 0 1 0 1]&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#Using bernoulli.rvs() and sum(), try to get the number of heads after 5 flips with a 50% chance of getting heads.
five_coin_flips = bernoulli.rvs(p=0.5, size=5)
coin_flips_sum = sum(five_coin_flips)
print(coin_flips_sum)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output =&amp;gt; 2&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Using binom to flip even more coins&lt;/strong&gt;&lt;br&gt;
Previously, you simulated 10 coin flips with a 35% chance of getting heads using bernoulli.rvs().&lt;/p&gt;

&lt;p&gt;This exercise loads the binom object from scipy.stats so you can use binom.rvs() to simulate 20 trials of 10 coin flips with a 35% chance of getting heads on each coin flip.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#Defining binom
# Set the random seed to reproduce the results
np.random.seed(42)

# Simulate 20 trials of 10 coin flips 
draws = binom.rvs(n=10, p=0.35, size=20)
print(draws)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output =&amp;gt; [3 6 4 4 2 2 1 5 4 4 1 6 5 2 2 2 3 4 3 3]&lt;/p&gt;




&lt;p&gt;I hope you got some basic knowledge and refreshed your mind with this blog, and I'll see you in the next learning journey, where we will learn about probability distributions and more!&lt;br&gt;
You can check the &lt;a href="https://colab.research.google.com/drive/1tUGizsBiF9xNgzeCR0-8pmmBhcCD1f9P?usp=sharing" rel="noopener noreferrer"&gt;code&lt;/a&gt;&lt;br&gt;
Resource : &lt;a href="https://campus.datacamp.com/courses/foundations-of-probability-in-python/" rel="noopener noreferrer"&gt;https://campus.datacamp.com/courses/foundations-of-probability-in-python/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>beginners</category>
      <category>programming</category>
      <category>python</category>
    </item>
    <item>
      <title>"Hello Neural Network!"</title>
      <dc:creator>Aya El Sherif</dc:creator>
      <pubDate>Fri, 10 Dec 2021 13:49:50 +0000</pubDate>
      <link>https://dev.to/ayaelsherif/hello-neural-network-264</link>
      <guid>https://dev.to/ayaelsherif/hello-neural-network-264</guid>
      <description>&lt;p&gt;Machine learning is about a computer learning the patterns that distinguish things.&lt;/p&gt;




&lt;p&gt;&lt;u&gt;Let's start with a very simple question : &lt;br&gt;
&lt;/u&gt;&lt;br&gt;
X = -1 , 0 , 1 , 2 , 3 , 4&lt;/p&gt;

&lt;p&gt;Y = -3 , -1 , 1 , 3 , 5 , 7&lt;/p&gt;

&lt;p&gt;What is the formula that maps X to Y?&lt;/p&gt;

&lt;p&gt;⇒ 2X-1&lt;br&gt;
i.e. 2(-1)-1 = -3&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Neural Network is a set of functions that can learn patterns.&lt;/strong&gt;
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Dense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;units&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input_shape&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;The above code is written using Python, TensorFlow (TF) and a TF API called "Keras".&lt;/li&gt;
&lt;li&gt;"&lt;strong&gt;Keras&lt;/strong&gt;" makes it easy to define Neural networks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"Dense"&lt;/strong&gt; defines a layer of connected neurons. ⇒ 1 dense ⇒ 1 layer ⇒ 1 unit ⇒ 1 neuron&lt;/li&gt;
&lt;li&gt;Successive layers are defined in sequence as "&lt;strong&gt;Sequential"&lt;/strong&gt; ⇒ 1 neuron.&lt;/li&gt;
&lt;li&gt;Shape of what's input to NN in 1st layer ⇒ 1 value.&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Important functions:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Optimizers&lt;/li&gt;
&lt;li&gt;Loss&lt;/li&gt;
&lt;/ul&gt;



&lt;p&gt;&lt;u&gt;Very simply:&lt;br&gt;
&lt;/u&gt;&lt;br&gt;
As mentioned before, the neural network initially has no idea what the relation between X and Y is. So,&lt;/p&gt;

&lt;p&gt;it guesses a formula, for example Y = 10X - 10, and then uses the data it knows (the sets of Xs and Ys) to measure how good or bad that guess was. The &lt;strong&gt;LOSS&lt;/strong&gt; function measures this error and hands the result to the &lt;strong&gt;OPTIMIZER&lt;/strong&gt;, which uses it to figure out the next, better guess.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each guess should be better than the previous one.&lt;/li&gt;
&lt;li&gt;As the guesses get better and better, accuracy approaches 100%. (&lt;strong&gt;Convergence&lt;/strong&gt;)&lt;/li&gt;
&lt;/ul&gt;
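&lt;p&gt;The guess =&amp;gt; measure =&amp;gt; improve loop above can be sketched in plain NumPy as gradient descent on a single neuron. This is only a simplified illustration of the idea, not what Keras does internally; the names w, b, and lr are ours:&lt;/p&gt;

```python
import numpy as np

# The data from the example: Y = 2X - 1
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0])

w, b = 10.0, -10.0  # a deliberately bad first guess: Y = 10X - 10
lr = 0.01           # learning rate (step size for each correction)

for epoch in range(2000):
    pred = w * xs + b            # make a guess
    error = pred - ys
    loss = np.mean(error ** 2)   # LOSS: mean squared error of the guess
    # OPTIMIZER step: nudge w and b against the gradient of the loss
    w -= lr * np.mean(2 * error * xs)
    b -= lr * np.mean(2 * error)

print(round(w, 2), round(b, 2))  # ends up very close to 2 and -1
```

&lt;p&gt;Each pass through the loop plays the role of one epoch: the loss shrinks and the guess drifts toward Y = 2X - 1.&lt;/p&gt;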
&lt;h2&gt;
  
  
  Convergence
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A machine learning model reaches convergence when it achieves a state during training in which loss settles to within an error range around the final value ⇒ A model converges when additional training will not improve the model.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxs4vdygt6vcgxmiojsg8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxs4vdygt6vcgxmiojsg8.png" alt=" " width="486" height="406"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Loss
&lt;/h2&gt;

&lt;p&gt;A loss/cost function scores a prediction by how much it varies from the true value. This tells us how well our model is performing.&lt;br&gt;
Unlike accuracy, loss is not a percentage: it is a summation of the errors made for each sample in the training or validation set. Loss is used during training to find the "best" parameter values for the model (e.g. the weights in a neural network); the goal of training is to minimize this value.&lt;/p&gt;


&lt;p&gt;E.g. ⇒ Mean squared error.&lt;/p&gt;
&lt;h2&gt;
  
  
  mean_squared_error():
&lt;/h2&gt;

&lt;p&gt;Computes the mean squared error between labels and predictions.&lt;/p&gt;
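&lt;p&gt;As a quick sketch, the same computation can be done by hand in NumPy (illustrative only; in Keras you simply pass this loss by its name):&lt;/p&gt;

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between labels and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# One wrong prediction out of three, off by 1  =>  (0 + 0 + 1) / 3
print(mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))  # → 0.3333333333333333
```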
&lt;h1&gt;
  
  
  Optimizer
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;SGD (Stochastic Gradient Descent).
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model.compile(optimizer='sgd', loss='mean_squared_error')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Now, Let's get back to our example and our sets (X &amp;amp; Y):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
Ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We used the &lt;em&gt;NumPy&lt;/em&gt; Python library (imported as np) for data representation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model.fit(Xs, Ys, epochs=100)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As discussed above, epochs=100 means the training loop runs 100 times: make a guess =&amp;gt; measure how good or bad the guess is with the LOSS function =&amp;gt; use the OPTIMIZER and the data to make a better guess, and repeat.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print(model.predict(np.array([10.0])))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you run the whole &lt;a href="https://colab.research.google.com/drive/1wfjQBFWmhgxTlCc7h8DBiaCKrxNGxxfn?usp=sharing" rel="noopener noreferrer"&gt;code&lt;/a&gt;,&lt;br&gt;
you'll notice that the prediction is [[17.862192]] and not 19 as expected. That's because in neural networks we deal in "probability": the model has learned a relationship that is very close to, but not exactly, Y = 2X - 1.&lt;br&gt;
Wait for more blogs explaining the glorious role of probability in the art of data! &lt;br&gt;
&lt;u&gt;Resources worth exploring further:&lt;br&gt;
&lt;/u&gt; &lt;br&gt;
&lt;a href="https://www.coursera.org/learn/introduction-tensorflow/home/welcome" rel="noopener noreferrer"&gt;The main reference&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.youtube.com/watch?v=aircAruvnKk" rel="noopener noreferrer"&gt;Also, this video explains neural networks very well!&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
