<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Artik Blue</title>
    <description>The latest articles on DEV Community by Artik Blue (@artikblue).</description>
    <link>https://dev.to/artikblue</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F360602%2Ff91ec9af-627a-44c4-9fa6-4b9ad414e465.png</url>
      <title>DEV Community: Artik Blue</title>
      <link>https://dev.to/artikblue</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/artikblue"/>
    <language>en</language>
    <item>
      <title>Intro to normal distributions</title>
      <dc:creator>Artik Blue</dc:creator>
      <pubDate>Tue, 28 Apr 2020 15:12:22 +0000</pubDate>
      <link>https://dev.to/artikblue/intro-to-normal-distributions-1c68</link>
      <guid>https://dev.to/artikblue/intro-to-normal-distributions-1c68</guid>
      <description>&lt;p&gt;Today I'm going to drop you a few relevant notes on normal distributions, so grab a coffee and follow me.&lt;/p&gt;

&lt;h4&gt;
  
  
  What is a distribution?
&lt;/h4&gt;

&lt;p&gt;A distribution, or probability distribution, is a mathematical function that gives the probabilities of occurrence of the various possible outcomes of an experiment. Probability distributions are used to define different types of random variables in order to make decisions based on these models.&lt;/p&gt;

&lt;p&gt;In other terms it is a function that shows the possible values for a variable and how often they occur.&lt;/p&gt;

&lt;p&gt;Distributions can be either discrete or continuous.&lt;/p&gt;

&lt;p&gt;If we are on a university campus and study the heights of the people there, we will probably see that most of the students fall within the same height range, around the mean. We'll also see a few students below that range and a few above it. If we plot a histogram of that data, what we'll see is the distribution of the students' heights!&lt;/p&gt;

&lt;p&gt;We'll see something like this (height measured in cm):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rAnvLutd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/21db31amv2739vpysvnq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rAnvLutd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/21db31amv2739vpysvnq.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That's the distribution, and it tells us what we already saw: most of the students are located in the same height range. Other heights exist too, but at lower frequency, so it is less probable to find students in those ranges.&lt;/p&gt;

&lt;p&gt;...and by the way that is also a normal distribution!&lt;/p&gt;
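&lt;p&gt;If you want to play with this yourself, here is a quick sketch that simulates some heights and bins them with numpy. The mean of 170 cm and stdev of 10 cm are made-up values for illustration, not real campus data:&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated heights in cm; mean 170 and stdev 10 are invented for the demo
heights = rng.normal(loc=170, scale=10, size=1000)

# Count how many heights fall into each 5 cm interval between 140 and 200
counts, edges = np.histogram(heights, bins=np.arange(140, 205, 5))

# A rough text histogram: one '#' per 5 students in the bin
for lo, hi, c in zip(edges[:-1], edges[1:], counts):
    print(f"{int(lo)}-{int(hi)} cm: {'#' * int(c // 5)}")
```

&lt;p&gt;The tallest bars end up around 165-175 cm, right where the mean is.&lt;/p&gt;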

&lt;h4&gt;
  
  
  Normal distributions
&lt;/h4&gt;

&lt;p&gt;In plain words, normal distributions, or bell-shaped distributions, are those that look similar to the one just presented. In these distributions most of the values fall close to the mean:&lt;/p&gt;

&lt;p&gt;approximately 68% of the values fall within one standard deviation of the mean (we already presented the standard deviation in our first posts) and 95% fall within 2 stdevs, as you can see here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--K6IPuJDp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/rq3eq6g63ai07bah08iy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--K6IPuJDp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/rq3eq6g63ai07bah08iy.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;
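&lt;p&gt;We can check those percentages empirically by sampling from a normal distribution; a minimal numpy sketch (the sample size and seed are arbitrary):&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(loc=0, scale=1, size=100_000)

mean, std = data.mean(), data.std()

# Fraction of the points within 1 and within 2 standard deviations of the mean
within_1 = float(np.mean(np.abs(data - mean) <= 1 * std))
within_2 = float(np.mean(np.abs(data - mean) <= 2 * std))

print(round(within_1, 3))  # close to 0.68
print(round(within_2, 3))  # close to 0.95
```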

&lt;p&gt;The first key point is that, if your data follows a normal distribution, we only need the mean and the stdev to compute the whole curve, or even better, to get the actual probability density at a particular value.&lt;/p&gt;

&lt;p&gt;We can use the following formula:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yIJfVSvZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ozn1ii1ni7y5ke1dfox6.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yIJfVSvZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ozn1ii1ni7y5ke1dfox6.jpg" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;
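&lt;p&gt;The formula above, the normal probability density function, is easy to code by hand; the mean and the stdev are its only parameters:&lt;/p&gt;

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and stdev sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2 * math.pi))
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return coeff * math.exp(exponent)

# Density of a standard normal at its mean: 1 / sqrt(2*pi), about 0.3989
print(normal_pdf(0, mu=0, sigma=1))
```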

&lt;h4&gt;
  
  
  Other distributions
&lt;/h4&gt;

&lt;p&gt;Other distributions exist as well, such as the Poisson distribution, which looks somewhat like a normal distribution shifted to the left; the Skellam distribution, which looks like a normal one but with a sharper peak at the centre; and of course uniform distributions, which present the same frequency all the way across.&lt;/p&gt;

&lt;p&gt;Most of the time the exact shape of the data's distribution won't matter that much in our experiments, because of the central limit theorem.&lt;/p&gt;

&lt;h4&gt;
  
  
  Central limit theorem and normal distributions
&lt;/h4&gt;

&lt;p&gt;In the study of probability theory, the central limit theorem (CLT) states that the distribution of sample means approximates a normal distribution (also known as a “bell curve”), as the sample size becomes larger, assuming that all samples are identical in size, and regardless of the population distribution shape.&lt;/p&gt;

&lt;p&gt;Said another way, CLT is a statistical theory stating that given a sufficiently large sample size from a population with a finite level of variance, the mean of all samples from the same population will be approximately equal to the mean of the population. Furthermore, all the samples will follow an approximate normal distribution pattern, with all variances being approximately equal to the variance of the population, divided by each sample's size.&lt;/p&gt;

&lt;p&gt;That comes in very handy when it comes to statistical analysis of very large populations, as we can grab smaller groups of random samples and infer knowledge about the whole population!&lt;/p&gt;
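&lt;p&gt;A small simulation makes the CLT tangible: we draw many samples from a uniform population (definitely not bell-shaped) and look at the sample means, which cluster around the population mean with variance close to the population variance divided by the sample size. The sizes and seed here are arbitrary:&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(7)

# Population: uniform on [0, 1), with mean 0.5 and variance 1/12
sample_size = 50
num_samples = 10_000

samples = rng.random((num_samples, sample_size))
sample_means = samples.mean(axis=1)

print(sample_means.mean())  # close to 0.5, the population mean
print(sample_means.var())   # close to (1/12) / 50, population variance over n
```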

&lt;h4&gt;
  
  
  Some resources and references:
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://towardsdatascience.com/understanding-the-normal-distribution-with-python-e70bb855b027"&gt;https://towardsdatascience.com/understanding-the-normal-distribution-with-python-e70bb855b027&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="http://sphweb.bumc.bu.edu/otlt/MPH-Modules/PH717-QuantCore/PH717-Module6-RandomError/PH717-Module6-RandomError5.html"&gt;http://sphweb.bumc.bu.edu/otlt/MPH-Modules/PH717-QuantCore/PH717-Module6-RandomError/PH717-Module6-RandomError5.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.investopedia.com/terms/c/central_limit_theorem.asp"&gt;https://www.investopedia.com/terms/c/central_limit_theorem.asp&lt;/a&gt;&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Covid19 tests and Thomas Bayes</title>
      <dc:creator>Artik Blue</dc:creator>
      <pubDate>Wed, 22 Apr 2020 16:53:41 +0000</pubDate>
      <link>https://dev.to/artikblue/covid19-tests-and-thomas-bayes-3i95</link>
      <guid>https://dev.to/artikblue/covid19-tests-and-thomas-bayes-3i95</guid>
      <description>&lt;h3&gt;
  
  
  Bayes theorem
&lt;/h3&gt;

&lt;p&gt;Created by the reverend Thomas Bayes, who devised this famous rule to infer the existence of God... needless to say, the existence of God is still in the realm of faith, but his famous rule has become the very foundation of probabilistic inference.&lt;/p&gt;

&lt;p&gt;Bayes' theorem can be written as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--06e9rdV4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/2kpvps09llukmmjhf86b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--06e9rdV4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/2kpvps09llukmmjhf86b.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Do you remember the covid detection example we had in the last tutorial? &lt;/p&gt;

&lt;p&gt;We want to know the probability that a person has covid given that he or she tested positive.&lt;/p&gt;

&lt;p&gt;We have the average probability that a random individual has covid or not:&lt;/p&gt;

&lt;p&gt;p(covid) = 0.01&lt;br&gt;
p(¬covid) = 0.99&lt;/p&gt;

&lt;p&gt;Then we have this test we just bought that goes:&lt;/p&gt;

&lt;p&gt;It outputs POSITIVE 90% of the time if you have covid.&lt;/p&gt;

&lt;p&gt;But also&lt;/p&gt;

&lt;p&gt;It marks NEGATIVE 90% of the time if you don't have it.&lt;/p&gt;

&lt;p&gt;So we can frame our problem the following way:&lt;/p&gt;

&lt;p&gt;P(C | POS) = P(C) * P(POS|C)&lt;br&gt;
P(¬C | POS) = P(¬C) * P(POS|¬C)&lt;/p&gt;

&lt;p&gt;Here P(C | POS) will be the probability of having covid given a positive test result, and P(POS | C) the probability of a positive test result given that the patient actually has covid. It is important to keep those two concepts apart, as it is easy to get confused about what each one represents.&lt;/p&gt;

&lt;p&gt;So let's compute them on our case.&lt;/p&gt;

&lt;p&gt;P(C | POS) = P(C) * P(POS|C) =  0.01 * 0.9 = 0.009&lt;/p&gt;

&lt;p&gt;And&lt;/p&gt;

&lt;p&gt;P(¬C | POS) = P(¬C) * P(POS|¬C) = 0.99 * 0.1 = 0.099&lt;/p&gt;

&lt;p&gt;But as you might have noticed, those probabilities don't add up to one, even though they should be complementary: given a positive test, either we have covid or we don't, with no other options, so they should sum to one. The reason they don't is that what we computed so far are joint probabilities over randomly selected individuals, who may or may not have the virus; we still need to "focus" on our target, the positive tests.&lt;/p&gt;

&lt;p&gt;If we want to relativize our context to those two options, we need to normalize them, so first we add those values together:&lt;/p&gt;

&lt;p&gt;0.009 + 0.099 = 0.108, which is the total probability of getting a positive test (whether the person has the virus or not) after selecting a random person in town.&lt;/p&gt;

&lt;p&gt;So finally we can obtain what's called the posterior probability by normalizing our probabilities, that is, dividing each of them by their sum.&lt;/p&gt;

&lt;p&gt;P(C|POS) = 0.009 / 0.108 = 0.0833&lt;/p&gt;

&lt;p&gt;P(¬C|POS) = 0.099 / 0.108 = 0.9166&lt;/p&gt;

&lt;p&gt;So those are the probabilities. Pretty surprising, right? Even given a positive test result, the chance of actually having covid is only about 8%, because so few people have the virus to begin with. We can see it better if we go in the other direction and evaluate for negative results.&lt;/p&gt;

&lt;p&gt;P(C | NEG) = P(C) * P(NEG|C) = 0.01 * 0.1 = 0.001&lt;br&gt;
P(¬C | NEG) = P(¬C) * P(NEG|¬C) = 0.99 * 0.9 = 0.891&lt;/p&gt;

&lt;p&gt;And the normalizer is 0.001 + 0.891 = 0.892&lt;/p&gt;

&lt;p&gt;Finally:&lt;/p&gt;

&lt;p&gt;P(C | NEG) = 0.0011&lt;br&gt;
P(¬C | NEG) = 0.9989&lt;/p&gt;

&lt;p&gt;So if we get a negative on our test, we can be over 99% sure that we are covid free. With that, you can easily see that these tests are great for ruling people out, for doing the initial triage and detecting negatives; but when it comes to positives, additional checks should be made to reach a verdict.&lt;/p&gt;
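&lt;p&gt;The whole procedure, multiply the prior by the likelihood and then normalize, fits in a tiny function. This is just a sketch of the steps above:&lt;/p&gt;

```python
def posterior(prior_c, p_result_given_c, p_result_given_not_c):
    """Posterior P(covid | test result) via Bayes' rule with normalization."""
    joint_c = prior_c * p_result_given_c                 # P(C) * P(result|C)
    joint_not_c = (1 - prior_c) * p_result_given_not_c   # P(~C) * P(result|~C)
    normalizer = joint_c + joint_not_c                   # total P(result)
    return joint_c / normalizer

# Positive test: P(POS|C) = 0.9, P(POS|~C) = 0.1
print(posterior(0.01, 0.9, 0.1))  # about 0.0833

# Negative test: P(NEG|C) = 0.1, P(NEG|~C) = 0.9
print(posterior(0.01, 0.1, 0.9))  # about 0.0011
```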

</description>
      <category>datascience</category>
      <category>beginners</category>
    </item>
    <item>
      <title>A small lesson on conditional probability</title>
      <dc:creator>Artik Blue</dc:creator>
      <pubDate>Fri, 17 Apr 2020 19:51:10 +0000</pubDate>
      <link>https://dev.to/artikblue/a-small-lesson-on-conditional-probability-40e7</link>
      <guid>https://dev.to/artikblue/a-small-lesson-on-conditional-probability-40e7</guid>
      <description>&lt;p&gt;By continuing to study the many aspects of probability we have to look at the conditional probability, that answers the fundamental question: What is the probability of having X phenomenon if Y phenomenon already happened?&lt;/p&gt;

&lt;p&gt;When we are just flipping coins, the result of the second throw won't depend on the result of the first one. But, for example, if we pick a random person, there is some chance that he or she will be a night person or a day person; and depending on whether the person prefers the night or the day, it will be more or less probable that he or she likes to go out running early in the morning. Correlated events like these are one of the fundamentals of conditional probability.&lt;/p&gt;

&lt;h4&gt;
  
  
  The covid19 test
&lt;/h4&gt;

&lt;p&gt;Imagine that we are in charge of a hospital and we have this novel test that can detect covid19 in a matter of minutes. The test works as follows:&lt;/p&gt;

&lt;p&gt;If the person has covid, it will detect it 90% of the time.&lt;br&gt;
If the person does not have it, it will correctly report that 80% of the time.&lt;/p&gt;

&lt;p&gt;And from those you can easily calculate the complementary probabilities. We can also write it in other terms:&lt;/p&gt;

&lt;p&gt;P(POSITIVE | COVID) = 0.9&lt;br&gt;
P(NEGATIVE | COVID) = 0.1&lt;/p&gt;

&lt;p&gt;P(POSITIVE | ¬COVID) = 0.2&lt;br&gt;
P(NEGATIVE | ¬COVID) = 0.8&lt;/p&gt;

&lt;p&gt;And from our statistical data we assume that if we pick a random person in our town, only 1 out of 10 has covid, so:&lt;/p&gt;

&lt;p&gt;P(COVID) = 0.1&lt;br&gt;
P(¬COVID) = 0.9&lt;/p&gt;

&lt;p&gt;So using that we can generate a table and calculate some probabilities:&lt;/p&gt;

&lt;p&gt;If we want to know the probability that a random person picked on the street has covid and is detected as positive, we can just multiply the general probability of having covid by the probability of a positive test given that he or she has covid.&lt;/p&gt;

&lt;p&gt;We can even build a table.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;COVID&lt;/th&gt;
&lt;th&gt;TEST&lt;/th&gt;
&lt;th&gt;P()&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;td&gt;POS&lt;/td&gt;
&lt;td&gt;0.09&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;td&gt;NEG&lt;/td&gt;
&lt;td&gt;0.01&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;N&lt;/td&gt;
&lt;td&gt;POS&lt;/td&gt;
&lt;td&gt;0.18&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;N&lt;/td&gt;
&lt;td&gt;NEG&lt;/td&gt;
&lt;td&gt;0.72&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Note that all of those add up to one, and we can even calculate probabilities such as: what is the probability that a random person picked on the street will test positive, no matter whether he or she has the virus or not?&lt;/p&gt;

&lt;p&gt;We just need to add the probability of testing POS while having the virus and testing POS while not having it, that is 0.09 + 0.18 = 0.27. Very intuitive!&lt;/p&gt;
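&lt;p&gt;The table above is nothing more than products of a prior and a conditional probability, so we can generate it with a few lines of Python; a minimal sketch:&lt;/p&gt;

```python
p_covid = 0.1
p_test_given_covid = {"POS": 0.9, "NEG": 0.1}    # test result given covid
p_test_given_not = {"POS": 0.2, "NEG": 0.8}      # test result given no covid

table = {}
for result in ("POS", "NEG"):
    table[("Y", result)] = p_covid * p_test_given_covid[result]
    table[("N", result)] = (1 - p_covid) * p_test_given_not[result]

print(table)                                      # the four joint probabilities
print(sum(table.values()))                        # all rows add up to 1
print(table[("Y", "POS")] + table[("N", "POS")])  # P(POS), about 0.27
```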

&lt;h3&gt;
  
  
  Moar coins
&lt;/h3&gt;

&lt;p&gt;Yes... more of them. If we have two identical coins and we flip one and then the other, the probability of the second throw being a tail won't depend on the first throw at all. But let's say one of the coins is loaded, so the probability of getting a head after throwing it is 0.9, while the other is just a regular coin. We put them both in a box and ask someone to pick one of them and throw it; we assume that 50% of the time the person will pick one coin and 50% of the time the other. The probabilities change here, because first the person has to pick a coin, and DEPENDING on the coin he or she picks, the probability of getting a head will be different!&lt;/p&gt;

&lt;p&gt;Again, we can get the probability of getting a tail after picking the regular coin by doing 0.5 (picking that coin) * 0.5 (getting a tail), and from that we can build the table like this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;PICK COIN&lt;/th&gt;
&lt;th&gt;FLIP COIN&lt;/th&gt;
&lt;th&gt;P()&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;0.25&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;0.25&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;0.45&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;0.05&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The methodology is the same. You can think about more complex setups, such as starting with the coin-picking game and then doing multiple throws, or using fake coins that have a head on both sides; the logic is the same.&lt;/p&gt;
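&lt;p&gt;The pick-then-flip table can be rebuilt the same way, multiplying along each branch; the sketch below also adds up the overall probability of getting a tail:&lt;/p&gt;

```python
p_pick = {1: 0.5, 2: 0.5}    # either coin is picked with equal chance
p_heads = {1: 0.5, 2: 0.9}   # coin 2 is the loaded one

table = {}
for coin in (1, 2):
    table[(coin, "H")] = p_pick[coin] * p_heads[coin]
    table[(coin, "T")] = p_pick[coin] * (1 - p_heads[coin])

# Joint probability of each (coin, flip) pair: 0.25, 0.25, 0.45, 0.05
print(table)

# Overall probability of getting a tail, whichever coin was picked
p_tail = table[(1, "T")] + table[(2, "T")]
print(p_tail)  # about 0.3
```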

&lt;p&gt;See you next time with some more probability and Bayes' theorem :)&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>beginners</category>
    </item>
    <item>
      <title>More about probability: Binomial distributions</title>
      <dc:creator>Artik Blue</dc:creator>
      <pubDate>Tue, 14 Apr 2020 12:42:18 +0000</pubDate>
      <link>https://dev.to/artikblue/more-about-probability-binomial-distributions-1a0c</link>
      <guid>https://dev.to/artikblue/more-about-probability-binomial-distributions-1a0c</guid>
      <description>&lt;p&gt;Let's go back to our previous coin throwing example! &lt;/p&gt;

&lt;p&gt;Let's suppose that we flip a regular coin two times. How many outcomes give us two heads or two tails?&lt;/p&gt;

&lt;p&gt;Exactly two: HH and TT. Nice and easy.&lt;/p&gt;

&lt;p&gt;Now let's flip it five times. How many outcomes have the same number of heads as tails? Weeell... it will be zero, as five is an odd number. You see, sometimes we can know the result easily without having to do a lot of math.&lt;/p&gt;

&lt;p&gt;Going back to the same example, let's now ask for exactly 1 head and 4 tails: in how many ways can we achieve that? There are five scenarios where this is possible. We can imagine five slots, each slot being one throw; as we are looking for 1 head only, the head can occupy only one position at a time, so it can move through every slot but takes one slot per scenario. But if we now want to know about scenarios with 2 heads, it gets more complicated.&lt;/p&gt;

&lt;p&gt;You can think about it this way: we have 5 slots and the heads occupy 2 slots at a time, so the first head has 5 slots to choose from, and once it is placed, the second one has 4 slots left, which gives 5 * 4.&lt;/p&gt;

&lt;p&gt;But... although those heads can move to different positions, they are not distinguishable entities; they are identical events, so we cannot treat them as head1 and head2 with each permutation counting as a different scenario. We'll have to divide by two to get rid of those duplicate permutations.&lt;/p&gt;

&lt;p&gt;The calculation will be&lt;/p&gt;

&lt;p&gt;5*4/2 = 10&lt;/p&gt;

&lt;p&gt;If instead of two heads we want three of them, the calculation is similar &lt;/p&gt;

&lt;p&gt;(5*4*3) / (3*2*1)&lt;/p&gt;

&lt;p&gt;Again we have five slots for the first head, four for the second and three for the third, but we need to discard the permutation scenarios: the three heads can be ordered among their slots in 3 * 2 * 1 ways (the first head could be any of the three, the second any of the remaining two, and so on), so we divide by that.&lt;/p&gt;

&lt;p&gt;As you may have noticed, there is a generalization behind this process. We keep dividing by numbers that are the result of a sequence of multiplications from the number itself all the way down to one; that is called the factorial, and it looks like this:&lt;/p&gt;

&lt;p&gt;n! = n*(n-1)*(n-2)...*1&lt;/p&gt;

&lt;p&gt;So something like&lt;/p&gt;

&lt;p&gt;(5*4*3) / (3*2*1)&lt;/p&gt;

&lt;p&gt;Can also be represented as:&lt;/p&gt;

&lt;p&gt;(5*4*3) / 3!&lt;/p&gt;

&lt;p&gt;Understanding this, we can define the formula for counting the number of ways a particular outcome can happen, the binomial coefficient:&lt;/p&gt;

&lt;p&gt;n! / (k! * (n-k)!)&lt;/p&gt;

&lt;p&gt;Where n is the number of times a phenomenon repeats (ex: 10 coin flips) and k is the times a result has to repeat (ex: 5 heads) assuming that we only have two options (heads or tails) for the phenomenon.&lt;/p&gt;

&lt;p&gt;Let's apply the rule to calculate real probabilities without having to draw those tedious truth tables:&lt;/p&gt;

&lt;p&gt;We flip a coin 5 times and we want to get EXACTLY one heads. What's the probability?&lt;/p&gt;

&lt;p&gt;The total number of outcomes we can expect from this phenomenon is 2^5 = 32 (32 different combinations of results).&lt;/p&gt;

&lt;p&gt;Then, about the specific situations we are interested in:&lt;/p&gt;

&lt;p&gt;n = 5&lt;br&gt;
k = 1&lt;/p&gt;

&lt;p&gt;5! / (4! * 1!) = 5&lt;/p&gt;

&lt;p&gt;So 5/32 = 0.15625&lt;/p&gt;

&lt;p&gt;That is our probability&lt;/p&gt;
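&lt;p&gt;Python's standard library ships this counting formula as math.comb, so we can double check the result:&lt;/p&gt;

```python
import math

n, k = 5, 1              # 5 flips, exactly 1 head

ways = math.comb(n, k)   # n! / (k! * (n-k)!) = 5 scenarios
total = 2 ** n           # 32 equally likely outcomes

print(ways / total)      # 0.15625
```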

&lt;p&gt;But what if the coin is loaded and P(heads) is 0.8? Let's compute the probability of getting exactly one tails, that is, four heads.&lt;/p&gt;

&lt;p&gt;Again, we can start by calculating the number of scenarios when this will happen:&lt;/p&gt;

&lt;p&gt;5! / 4! * 1! = 5; &lt;/p&gt;

&lt;p&gt;And in each of those scenarios we will get heads four times and tails once, so the probability of a single scenario will be:&lt;/p&gt;

&lt;p&gt;0.8 * 0.8 * 0.8 * 0.8 * 0.2 (the probability of heads four times, times the probability of one tails). In other terms:&lt;/p&gt;

&lt;p&gt;0.8^4 * 0.2^1 = 0.08192&lt;/p&gt;

&lt;p&gt;Multiplying by the 5 scenarios gives 5 * 0.08192 = 0.4096, so about 0.41.&lt;/p&gt;
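&lt;p&gt;The same computation can be written as a general function: the binomial probability is the number of scenarios times p^k times (1-p)^(n-k). A short sketch:&lt;/p&gt;

```python
import math

def binomial_pmf(n, k, p):
    """Probability of exactly k successes in n trials with success prob p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Loaded coin with P(heads) = 0.8: exactly four heads (one tail) in five flips
print(binomial_pmf(5, 4, 0.8))  # 5 * 0.8**4 * 0.2, about 0.41

# Sanity check with the fair coin: exactly one head in five flips
print(binomial_pmf(5, 1, 0.5))  # 5/32 = 0.15625
```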

&lt;p&gt;Now you know about the binomial distribution... use this knowledge with responsibility...&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Introduction to probability</title>
      <dc:creator>Artik Blue</dc:creator>
      <pubDate>Tue, 14 Apr 2020 11:18:30 +0000</pubDate>
      <link>https://dev.to/artikblue/introduction-to-probability-2n8m</link>
      <guid>https://dev.to/artikblue/introduction-to-probability-2n8m</guid>
      <description>&lt;p&gt;Statistics and probability are very related! Probability makes predictions about future events based on models and statistics studies data (imagine, events that already happened and produced results) to gather information about them and perhaps maybe build models.&lt;/p&gt;

&lt;p&gt;Imagine we have a coin and we flip it to see if we get heads or tails. By flipping the coin we are generating data, data that can be studied later on to gain more knowledge about the phenomenon.&lt;/p&gt;

&lt;p&gt;So we flip it once and we get heads. At that point our data will be DATA={Heads}. We flip it again and... heads another time, so DATA={Heads, Heads}; we go for a third time and get tails, so finally DATA={Heads, Heads, Tails}. If we stop the experiment right there, we might think that the coin returns Tails one time in every three throws and Heads the rest of the time; in other terms, Probability(Tails) = 1/3. And what about the probability of Heads? As we got Heads 2 times in 3 throws, it will be 2/3. If we add Probability(Tails) + Probability(Heads) we get 1, so another way of getting the probability of, say, Heads is to compute 1 - Probability(Tails); that's called the complementary probability!&lt;/p&gt;

&lt;p&gt;And that may seem true but, what if we repeat this experiment millions of times and keep track of every result?&lt;/p&gt;

&lt;p&gt;Well, in that case we will probably get something like P(tails) = 1/2. Think about it this way: the coin has two sides and we can only get heads or tails, one result per throw. If the coin is perfectly balanced, it is common sense that either result is equally likely, right? So our experimental data should demonstrate that.&lt;/p&gt;

&lt;p&gt;From now on we will assume that our coin is perfectly balanced: a probability of 1/2 for each result.&lt;/p&gt;

&lt;p&gt;What if we want to know the probability of getting Heads two times in a row, by throwing the coin two consecutive times?&lt;/p&gt;

&lt;p&gt;Now that we know a little bit about probability, let's imagine all possible situations and try to use the rule we just presented:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;FLIP 1&lt;/th&gt;
&lt;th&gt;FLIP 2&lt;/th&gt;
&lt;th&gt;PROBABILITY&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;0.25&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;0.25&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;0.25&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;0.25&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;We can derive that the probability of getting two heads will be 0.25 by knowing that the probabilities of all the possible outcomes of this phenomenon (throwing a coin 2 consecutive times) need to add up to 1. We have four possible situations, each of them equally probable, thus 0.25 is the probability of each one. Another way of getting the same number is simply to multiply 0.5 (the probability of getting heads on the first flip) by 0.5 (the probability of getting heads on the second).&lt;/p&gt;

&lt;p&gt;What about the probability of getting two heads when the coin is loaded and we know that we get heads with a probability of 0.6 (and thus tails with a probability of... 0.4)? If we follow the rule we just learnt, that is easy: we multiply 0.6 by itself and we get it!&lt;/p&gt;

&lt;p&gt;The table will now look like this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;FLIP 1&lt;/th&gt;
&lt;th&gt;FLIP 2&lt;/th&gt;
&lt;th&gt;PROBABILITY&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;0.36&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;0.24&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;0.24&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;0.16&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
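&lt;p&gt;We can generate the same table by enumerating the four outcomes with itertools.product; a short sketch using the loaded probabilities from the example:&lt;/p&gt;

```python
from itertools import product

p = {"H": 0.6, "T": 0.4}  # the loaded coin from the example

table = {}
for flip1, flip2 in product("HT", repeat=2):
    # Independent flips: multiply the probability of each result
    table[(flip1, flip2)] = p[flip1] * p[flip2]

for outcome, prob in table.items():
    print(outcome, round(prob, 2))
# ('H', 'H') 0.36 / ('H', 'T') 0.24 / ('T', 'H') 0.24 / ('T', 'T') 0.16
```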

&lt;p&gt;We may think that we know all about coin throws and probability right now, but there are several other questions to be asked. For example... what if we want to know the probability of getting exactly one head?&lt;/p&gt;

&lt;p&gt;Sticking with our loaded coin, let's look at the table one more time:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;FLIP 1&lt;/th&gt;
&lt;th&gt;FLIP 2&lt;/th&gt;
&lt;th&gt;PROBABILITY&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;0.36&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;0.24&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;0.24&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;0.16&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;There are two possible situations in which we get exactly one head, (H,T) and (T,H); in the others we get either more than one head or no heads at all.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;FLIP 1&lt;/th&gt;
&lt;th&gt;FLIP 2&lt;/th&gt;
&lt;th&gt;PROBABILITY&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;0.24&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T&lt;/td&gt;
&lt;td&gt;H&lt;/td&gt;
&lt;td&gt;0.24&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;We can calculate the probability of getting exactly one head by adding those two probabilities together. So the total probability will be 0.48.&lt;/p&gt;

&lt;p&gt;In situations such as the one we just reviewed, the product is associated with AND and the addition with OR. Our last example could be written like this:&lt;/p&gt;

&lt;p&gt;The probability of getting exactly one Head is:&lt;/p&gt;

&lt;p&gt;The probability of getting Heads AND Tails OR The probability of getting Tails AND Heads.&lt;/p&gt;

&lt;p&gt;Voilà!&lt;/p&gt;

&lt;p&gt;For the extra mile, let's now imagine a standard die. The probability of getting any particular result is 1/6. What's the probability of getting an even number? As each result has a probability of 1/6, the probability of an even number will be 3/6, or 1/2, as it can be represented as:&lt;/p&gt;

&lt;p&gt;The probability of getting a two OR getting a four OR getting a six.&lt;/p&gt;

&lt;p&gt;And what about getting the same number twice when throwing the die two times?&lt;/p&gt;

&lt;p&gt;Well, we just need to think about the situations in which we get the same number twice, calculate their individual probabilities and add them all together :)&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;THROW 1&lt;/th&gt;
&lt;th&gt;THROW 2&lt;/th&gt;
&lt;th&gt;PROBABILITY&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1/6 * 1/6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1/6 * 1/6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;1/6 * 1/6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;1/6 * 1/6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;1/6 * 1/6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;1/6 * 1/6&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;And... that adds up to 1/6 again!&lt;/p&gt;
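&lt;p&gt;Enumerating all 36 pairs confirms it; a short sketch using exact fractions:&lt;/p&gt;

```python
from fractions import Fraction
from itertools import product

sides = range(1, 7)
p = Fraction(1, 6)  # probability of each face

# Add the probability of every pair where both throws show the same number
same = sum(p * p for a, b in product(sides, repeat=2) if a == b)

print(same)  # 1/6
```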

</description>
      <category>datascience</category>
<category>beginners</category>
    </item>
    <item>
      <title>Measures of spread in data</title>
      <dc:creator>Artik Blue</dc:creator>
      <pubDate>Wed, 08 Apr 2020 12:03:55 +0000</pubDate>
      <link>https://dev.to/artikblue/measures-of-spread-in-data-1ab5</link>
      <guid>https://dev.to/artikblue/measures-of-spread-in-data-1ab5</guid>
      <description>&lt;p&gt;If we have some data and we want to give a first sight on it, we can initially think about the properties we saw on the previous post, such as the mean, the median of the mode. But sometimes those are not enough and we want to look at the bigger picture of the data, we may want to see how that data is distributed.&lt;/p&gt;

&lt;p&gt;For example, let us think about a university class. We are the professor and we have a bunch of students, and we want to evaluate how the whole class is doing, so we start by extracting measures of center such as the mean. We discover that the mean of our class is something like 5.8 (we are using a 0-10 scale). But our class may be very large, and with only the mean we don't know whether we have a bunch of students who are not even passing plus some very smart ones getting straight 10s, or a whole lot of average students getting marks close to 6 and 7. Of course we can use other measures here to get more information, such as the median or the mode, but if you think about it, since we have decimal results the mode will return something weird (we could group the marks into categories like zero, fail, barely pass, good... but that might be considered cheating).&lt;/p&gt;

&lt;p&gt;One thing we can do in this case is start by plotting a histogram of our students' data (to make it clearer I adjusted all of the marks to integers instead of decimals):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hcEmwyIg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/41p78p8z9s8zgmvaeg1n.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hcEmwyIg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/41p78p8z9s8zgmvaeg1n.PNG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That histogram can be generated with python using pyplot in the following way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;

&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span 
class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="c1"&gt;# take your tame and change the bins param to see what it does!
&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bins&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;What we see in the graph is that most of the students get marks close to 6 and few of them perform very well or very badly, even though the class shows a slight tendency to perform well rather than badly. &lt;/p&gt;

&lt;p&gt;In fact we can contrast that with our already known measures of central tendency, such as the mean or the median!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span 
class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"mean:"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;5.8&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"median:"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;median&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;If we recall the histogram we just plotted, we now see that the median coincides with its center. It is also easy to see that the value associated with the highest bar is the mode, and, because we have more points close to ten than points close to zero, the mean is closer to six rather than, say, four.&lt;/p&gt;

&lt;p&gt;Another way to analyze the spread of our data is with a boxplot.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--PzVAkJrc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/vj6bu6t8rpsyp18tabij.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--PzVAkJrc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/vj6bu6t8rpsyp18tabij.PNG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That can be generated with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;

&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span 
class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;

&lt;span class="c1"&gt;#plt.hist(x, bins = len(x))
&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;boxplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;In this case, what the boxplot shows is that most of the data is located within the box; to be more specific, the box contains the data that falls within the interquartile range, and the line marks the median.&lt;/p&gt;

&lt;p&gt;But wait, interquartile range? What is a quartile?&lt;/p&gt;

&lt;p&gt;For a better explanation, let's calculate the quartiles with the scipy stats package!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;scipy.stats.mstats&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;mquantiles&lt;/span&gt;
&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span 
class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mquantiles&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;4.&lt;/span&gt;  &lt;span class="mf"&gt;6.&lt;/span&gt;  &lt;span class="mf"&gt;7.8&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;So 4 is the first quartile (Q1), 6 is Q2 and 7.8 is Q3.&lt;/p&gt;

&lt;p&gt;That means that 25% of the students scored 4 or less, 50% of them scored 6 or less and 75% scored 7.8 or less (and of course 100% of them scored 10 or less). &lt;/p&gt;

&lt;p&gt;The interquartile range, or IQR, is calculated by subtracting the first quartile from the third; in this case it is 3.8. It tells us how far apart the first and third quartiles are, so it indicates how spread out the middle 50% of our dataset is.&lt;/p&gt;
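&lt;p&gt;As a quick sketch (assuming numpy is available), the quartiles and the IQR can also be computed directly with np.percentile. Note that np.percentile uses a slightly different quantile estimation rule than mquantiles, so the numbers can differ a little from the ones above:&lt;/p&gt;

```python
import numpy as np

# same marks dataset used throughout the post
x = [1]*1 + [2]*2 + [3]*3 + [4]*4 + [5]*5 + [6]*6 + [7]*5 + [8]*4 + [9]*3 + [10]*2

q1, q3 = np.percentile(x, [25, 75])  # first and third quartiles
iqr = q3 - q1                        # spread of the middle 50% of the marks
print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
```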

&lt;p&gt;Another feature, a bit less relevant in this particular case, is the range: the difference between the maximum and minimum values, which here is 10 - 1 = 9, as our marks go from 1 to 10.&lt;/p&gt;

&lt;p&gt;Sometimes we may work with multiple datasets and want to make many automatic decisions, so plotting a histogram or a boxplot for each one can be very tedious. In that case three numbers such as the quartiles may be useful, but they are still awkward to deal with, as they are not a single number! &lt;/p&gt;

&lt;p&gt;To deal with that situation we have a couple of measures that come in very handy: the variance and the standard deviation. They tell us how far the values are from the mean.&lt;/p&gt;

&lt;p&gt;To better understand it, let's do it with python!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span 
class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Variance:"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;5.26530612244898&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Standard deviation:"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;2.294625486315573&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;So those measures tell us how far the average student's mark is from the mean. The standard deviation is just the square root of the variance; that's because the variance is measured in squared units (we square the deviations to avoid negative results cancelling out).&lt;/p&gt;
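&lt;p&gt;A minimal check of that relationship, assuming numpy is available: squaring the standard deviation recovers the variance.&lt;/p&gt;

```python
import numpy as np

x = [1]*1 + [2]*2 + [3]*3 + [4]*4 + [5]*5 + [6]*6 + [7]*5 + [8]*4 + [9]*3 + [10]*2

variance = np.var(x)  # mean of the squared distances from the mean
std = np.std(x)       # square root of the variance, back in mark units

print(variance, std, std**2)  # std squared equals the variance
```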

&lt;p&gt;As I want to keep this practical, I encourage you to read more about the theory on your own here: &lt;a href="https://www150.statcan.gc.ca/n1/edu/power-pouvoir/ch12/5214891-eng.htm"&gt;https://www150.statcan.gc.ca/n1/edu/power-pouvoir/ch12/5214891-eng.htm&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If your std is low, most of the data is grouped together near the mean, so the mean will be a strong statistic for understanding the data; if the std is very high, our data is widely spread out. &lt;/p&gt;
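&lt;p&gt;A small illustration of that idea, with two made-up mark lists that share the same mean but spread very differently (assuming numpy is available):&lt;/p&gt;

```python
import numpy as np

tight = [5, 6, 6, 6, 7]     # marks clustered near the mean
spread = [0, 2, 6, 10, 12]  # same mean, far more dispersion

print(np.mean(tight), np.std(tight))    # low std: the mean describes the data well
print(np.mean(spread), np.std(spread))  # high std: the mean alone is misleading
```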

&lt;p&gt;Let us now look at the histogram one more time:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hcEmwyIg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/41p78p8z9s8zgmvaeg1n.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hcEmwyIg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/41p78p8z9s8zgmvaeg1n.PNG" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If we look closer at it... don't we see a pattern? It is like a wave! Well yes, in this case it is clear that our data follows a pattern, one of the most important ones: the normal distribution (well, a quasi-normal distribution in this case, as it is not 100% symmetric). Read more about it here! &lt;a href="https://statisticsbyjim.com/basics/normal-distribution/"&gt;https://statisticsbyjim.com/basics/normal-distribution/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The normal distribution has many interesting properties: it is symmetric, the mean, the median and the mode are all equal (remember we only have a quasi-normal distribution here), and one of my favorite properties is that in a normal distribution about 68% of the data falls within +/- 1 standard deviation of the mean.&lt;/p&gt;
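&lt;p&gt;We can check that 68% property empirically by drawing a large synthetic sample of normally distributed "marks" (the mean and std below are just illustrative values, assuming numpy is available):&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
# synthetic normally distributed "marks", mean and std chosen to resemble our class
sample = rng.normal(loc=5.8, scale=2.3, size=100_000)

mean, std = sample.mean(), sample.std()
# fraction of the sample within one standard deviation of the mean
within_one_std = np.mean(np.abs(sample - mean) <= std)
print(within_one_std)  # roughly 0.68
```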

&lt;p&gt;So if we know that our data follows this distribution, and we know both the mean and the std, we can infer a lot of things very easily. &lt;/p&gt;

&lt;p&gt;Other shapes exist as well, such as left- and right-skewed distributions. For example, the athletic condition of an individual over a lifetime: it starts relatively high and decreases rapidly as the person ages. Another example of a skewed distribution could be income: when you are young it is really low, but it may increase year after year as you grow older!&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>beginners</category>
    </item>
    <item>
      <title>What is data?</title>
      <dc:creator>Artik Blue</dc:creator>
      <pubDate>Sun, 05 Apr 2020 18:52:35 +0000</pubDate>
      <link>https://dev.to/artikblue/a-1mnn</link>
      <guid>https://dev.to/artikblue/a-1mnn</guid>
      <description>&lt;p&gt;I would like to start this site as a small onboarding journey where I'll be keeping my notes in a format that makes them useful for others.&lt;/p&gt;</description>

&lt;p&gt;So this first post series will be all about data analysis: we will start with the very fundamentals of statistics and move on to more advanced topics such as supervised and unsupervised learning, as well as deep learning.&lt;/p&gt;

&lt;h3&gt;
  
  
  Intro to descriptive statistics
&lt;/h3&gt;

&lt;p&gt;Data is literally everything we can keep track of and measure: it could be a list of the cars you see on the street, your school records, the cost of your favorite coffee recorded for every day of the year, or all the usernames and emails in the whole Netflix database, among others. Everything we can measure, that is data.&lt;/p&gt;

&lt;p&gt;And data by itself is useless; if we record data, it is because we want to do something with it. A complex example may be a hedge fund investor who always has an eye on the stock market, accessing huge amounts of data: names and stock prices of many companies, currency exchange rates, and lots of previous buy/sell operations made by other investors. Using that data, a hedge fund analyst may evaluate the state of the market related to their interests and perhaps try to make some predictions. That last example is probably what comes to mind when we think about a data analyst or someone who works with large amounts of data, but the fact is that we use data analysis on a daily basis. For example, when we go out to buy something, we may quickly scan through all of the items in the store that meet the conditions we need and then select the one with the lowest price. We may do that without thinking, but deep down inside, we are applying data analysis techniques!&lt;/p&gt;

&lt;p&gt;In general when we talk about data we talk about qualitative and quantitative data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Qualitative data?&lt;/strong&gt; We won't do any numerical operation with this kind of data. Think about data such as a car model, a coffee brand or a company name. It represents a quality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quantitative data?&lt;/strong&gt; For example, the number of cars. This kind of data can be continuous or discrete. Continuous data can take any value within a range, including negative or decimal values; think of heights or temperatures. Discrete data can only take separate, countable values, such as the number of cars, which can be 2 or 3 but never 2.5.&lt;/p&gt;

&lt;h3&gt;
  
  
  Measures of central tendency
&lt;/h3&gt;

&lt;p&gt;And we can perform all kinds of operations on data; some of them may make zero sense, but some others are pretty well known and widely used! Everyone starts with the measures of central tendency; by using them we can get a pretty general overview of what kind of data we have.&lt;/p&gt;

&lt;p&gt;As we are willing to do some data science in the end, the best we can do is start getting used to the tools, so we'll offer examples of our calculations using Python and some popular data science/math libraries such as numpy and scipy.&lt;/p&gt;

&lt;h4&gt;
  
  
  MEAN
&lt;/h4&gt;

&lt;p&gt;The mean is useful for obtaining a general overview of the data. It is obtained by adding all the values of the list (we can call that list a dataset for now) and then dividing by the total length of the list. It is useful when we need a result that takes the whole set into account; for example, the mean is used when your teachers calculate your final marks, as every exam is important. Perhaps some exams are more important than others, but we'll dig deeper into that later in this series.&lt;/p&gt;

&lt;p&gt;The main problem with the mean, though, is that if we have a list of numbers such as 1, 2, 3, 2, 3, 2, 99999, the mean will be close to 14287, and that number does not represent our list, which is mostly made up of numbers between 0 and 5. We use other measures such as the median when we have problems like this one. By the way, that large number we just saw is called an outlier! And we must be aware of them.&lt;/p&gt;
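&lt;p&gt;We can see the outlier effect directly on that same list (a minimal sketch, assuming numpy is available): the mean gets dragged far away while the median stays with the bulk of the values.&lt;/p&gt;

```python
import numpy as np

data = [1, 2, 3, 2, 3, 2, 99999]   # one extreme outlier

print("mean:  ", np.mean(data))    # dragged up to ~14287 by the outlier
print("median:", np.median(data))  # stays at 2.0, a value that represents the list
```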

&lt;p&gt;The following example will calculate the mean of a python array by using numpy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="n"&gt;arr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;34&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; 
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"mean of arr : "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; 

&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;12.8&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4&gt;
  
  
  MEDIAN
&lt;/h4&gt;

&lt;p&gt;We can obtain it by ordering the values from smallest to greatest; the median is then the value right in the centre. If we cannot extract a single centre (if we have an even number of items), the median is calculated as the mean of the two central values.&lt;/p&gt;

&lt;p&gt;The median works particularly well if we have a dataset that contains values that are way different from the others!&lt;/p&gt;

&lt;p&gt;And the median can be calculated as follows using numpy&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="n"&gt;arr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;999&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; 
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"mean of arr : "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;median&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; 

&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;7.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4&gt;
  
  
  MODE
&lt;/h4&gt;

&lt;p&gt;The mode is the most common value of the dataset, the value that occurs most often. For example, we may be searching for a new apartment to rent. The mean apartment price of an area may be a good reference, but we can also be interested in knowing the mode, since a lot of apartments have prices at round figures (e.g. it's easy to find 1500USD, 1800USD, 5000USD... but rarer to find 1443USD or 1221USD).&lt;/p&gt;

&lt;p&gt;In this case we see that numpy cannot calculate the mode as easily, but scipy can, so we'll use that.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;scipy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;

&lt;span class="n"&gt;arr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;34&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; 
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"mean of arr : "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; 

&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
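&lt;p&gt;If you'd rather avoid the scipy dependency, the mode can also be computed with Python's built-in statistics module (a small sketch; multimode requires Python 3.8+):&lt;/p&gt;

```python
from statistics import mode, multimode

arr = [20, 2, 2, 7, 1, 34]

# mode() returns the single most common value
print("mode of arr :", mode(arr))       # 2
# multimode() returns all values tied for most common (Python 3.8+)
print("multimode   :", multimode(arr))  # [2]
```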



&lt;h4&gt;
  
  
  MIN
&lt;/h4&gt;

&lt;p&gt;It is simply the smallest value along the selected dimension, e.g. the lowest price in a list of items.&lt;/p&gt;

&lt;p&gt;The MIN can be calculated using numpy as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="n"&gt;arr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;34&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; 
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"mean of arr : "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; 

&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4&gt;
  
  
  MAX
&lt;/h4&gt;

&lt;p&gt;It is the largest value along the selected dimension, e.g. the price of the most expensive house.&lt;/p&gt;

&lt;p&gt;The MAX can be calculated using numpy as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="n"&gt;arr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;34&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; 
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"mean of arr : "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; 

&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
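&lt;p&gt;np.min and np.max return the extreme values themselves; if you want to know which element holds that value (e.g. which listing is the cheapest), np.argmin and np.argmax return its index instead. A small sketch:&lt;/p&gt;

```python
import numpy as np

arr = [20, 2, 7, 1, 34]

# argmin/argmax return the *index* of the extreme value, not the value itself
print("index of min :", np.argmin(arr))  # 3, since arr[3] == 1
print("index of max :", np.argmax(arr))  # 4, since arr[4] == 34
```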



&lt;p&gt;In the next parts we'll look at measures of spread, and we'll start to see some charts!&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
