Hello Scientists
Recently I was applying for data science positions. In many interviews and technical exams, you will be asked about math basics that are important for any data scientist. So today let's study math! We will cover what a set is, then cardinality, intersection, and union, and then we will talk about the confusion matrix.
Let's get started:
What is a Set
A set is a collection of things; it is made up of elements.
A set can be a group of similar or different things. For example, you can have a set of books or a set of random things in a box.
Let's look at some symbols that are commonly used with sets:
{} -> The representation of a set (In Python, braces with items inside create a set, but empty braces {} create a dict)
∈ -> Membership/ Element of a set
∉ -> Not an element of a set / non-membership
∅ -> Empty set (Phi)
∪ -> Union
∩ -> Intersection
Now let's take examples to understand the set and the symbols:
Suppose the students in a math course are Manar, Noor, Raneem, and Noura.
Then, to represent that set, we write:
math_students = {Manar, Noor, Raneem, Noura}
Set representation in Python :
math_students = {'Manar', 'Noor', 'Raneem', 'Noura'}
Note that a set in Python can hold items of different types, such as:
my_set = {'A',4,True,5.78}
Set items in Python are unordered, cannot be changed in place (although you can add items to or remove items from the set itself), and do not allow duplicate values.
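To see these properties in action, here is a minimal sketch. The names `names`, `list_rejected`, and the added item 'B' are made up for illustration; `my_set` is the same set as above.

```python
# Duplicates collapse: a set keeps only one copy of each value.
names = {'Manar', 'Noor', 'Manar'}
print(len(names))  # 2

# Mixed types are allowed, as in my_set above.
my_set = {'A', 4, True, 5.78}

# Individual items cannot be edited in place, but the set itself
# can grow or shrink.
my_set.add('B')
my_set.discard(4)
print('B' in my_set, 4 in my_set)  # True False

# Items must be hashable, so a list cannot be a set item.
list_rejected = False
try:
    bad = {[1, 2]}
except TypeError:
    list_rejected = True
print(list_rejected)  # True
```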
How many students applied for the math course?
The answer is 4. Mathematically, we call this the "cardinality", which just means the size of the set. So the cardinality of math_students is 4, and this is how we represent it:
|math_students| = 4
To find the cardinality in Python we write:
cardinality = len(math_students)
Note that the name cardinality here is just an example of a variable name
Now what is the relationship between Noura and math_students?
Noura is a member of math_students.
We represent it using ∈, like this:
Noura ∈ math_students
To check for membership in Python:
membership = "Noura" in math_students
Note that the name membership here is just an example of a variable name
What is the relationship between Dario and math_students ?
Dario is not a member of the math students set. That's how we represent it:
Dario ∉ math_students
To check that there is no membership relationship in Python:
no_membership = "Dario" not in math_students
Note that the name no_membership here is just an example of a variable name
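Putting the last three ideas together, here is a small sketch that reuses the same variable names as above and prints the results.

```python
math_students = {'Manar', 'Noor', 'Raneem', 'Noura'}

cardinality = len(math_students)               # |math_students|
membership = 'Noura' in math_students          # Noura ∈ math_students
no_membership = 'Dario' not in math_students   # Dario ∉ math_students

print(cardinality, membership, no_membership)  # 4 True True
```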
Now we have another set, failed_students, for the students who failed the exam. Fortunately, no student has failed, so our new set is empty. To represent the empty set we write:
failed_students = ∅
In Python, the empty set is written set() (empty braces {} would create a dict instead):
failed_students = set()
And its cardinality equals zero:
print(len(failed_students))
output : 0
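One detail worth checking yourself: in Python, empty braces create a dict, and the empty set is written set(). A quick sketch (the name `empty_braces` is just for illustration):

```python
empty_braces = {}          # this is a dict, not a set
failed_students = set()    # this is the empty set

print(type(empty_braces).__name__)     # dict
print(type(failed_students).__name__)  # set
print(len(failed_students))            # 0
```

Both have length 0, so len() alone will not tell you which one you have.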
Now we have reached the fun part: intersection and union.
We will start with intersection:
Back to our math course example. Let's assume that we have another course, Python, and the students who applied for it are Manar, Noor, Dario, Raneem, and Aseel. The math roster has also changed slightly (Tala has taken Noor's place).
So now we have:
math_students = {Manar, Tala, Raneem, Noura}
Python_students = {Manar, Noor, Dario, Raneem, Aseel}
If you check the students' names (assuming that the same name means the same person), you will find that Manar and Raneem are attending both the math and Python classes.
So we say the intersection of math_students and Python_students is Manar and Raneem. To represent it mathematically:
math_students ∩ Python_students = {Manar, Raneem}
Note that the result of the intersection is a set
In Python, we represent the intersection using the & operator:
math_students & Python_students
output: {'Manar','Raneem'}
Let's imagine that we have another course, Geography, and the students are Mark, Kitty, Tala, and Keven:
geography_students = {Mark, Kitty, Tala, Keven}
What is the intersection of geography_students and Python_students?
Well, the answer is: there is none. The two sets have no students in common.
And how did we say we represent an empty set? By using Phi, ∅. So:
geography_students ∩ Python_students = ∅
And in Python:
geography_students & Python_students
output: set()
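Here is a runnable sketch of both intersection examples. The names are quoted as strings, the geography variable is spelled geography_students here, and `both` and `no_overlap` are illustrative names. The `isdisjoint` method is a built-in way to ask "is the intersection empty?".

```python
math_students = {'Manar', 'Tala', 'Raneem', 'Noura'}
Python_students = {'Manar', 'Noor', 'Dario', 'Raneem', 'Aseel'}
geography_students = {'Mark', 'Kitty', 'Tala', 'Keven'}

# Students in both the math and Python courses.
both = math_students & Python_students
print(both == {'Manar', 'Raneem'})  # True

# The .intersection() method gives the same result as &.
print(math_students.intersection(Python_students) == both)  # True

# No student takes both Geography and Python.
no_overlap = geography_students & Python_students
print(no_overlap == set())  # True
print(geography_students.isdisjoint(Python_students))  # True
```

Comparing with == avoids relying on print order, since sets are unordered.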
Now the Union
We have :
math_students = {Manar, Tala, Raneem, Noura}
Python_students = {Manar, Noor, Dario, Raneem, Aseel}
The union is the set of all members of all the sets,
which here means all the members of math_students together with all the members of Python_students.
To represent the union, we use ∪.
So, let's try to put them together:
math_students ∪ Python_students = {Manar, Tala, Raneem, Noura, Manar, Noor, Dario, Raneem, Aseel}
OK, that looks good, but notice that Manar and Raneem appear twice. Why? Because Manar and Raneem are attending both the math and Python courses. In other words, Manar and Raneem are the intersection of math_students and Python_students. A set cannot contain duplicates, so let's remove one Manar and one Raneem:
math_students ∪ Python_students = {Manar, Tala, Raneem, Noura, Noor, Dario, Aseel}
Now our union is correct. What we just did is described by the Inclusion-Exclusion formula.
The mathematical representation of the formula is:
|A ∪ B| = |A| + |B| - |A ∩ B|
And that's what we did: we added the sizes of Python_students and math_students, then subtracted the size of the intersection to remove the duplicates (5 + 4 - 2 = 7).
To represent union in Python:
Python_students | math_students
output: {'Manar', 'Tala', 'Raneem', 'Noura', 'Noor', 'Dario', 'Aseel'}
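As a quick check, this sketch builds the union and confirms that the Inclusion-Exclusion formula holds for our two sets. The names `union`, `left_side`, and `right_side` are just for illustration.

```python
math_students = {'Manar', 'Tala', 'Raneem', 'Noura'}
Python_students = {'Manar', 'Noor', 'Dario', 'Raneem', 'Aseel'}

union = Python_students | math_students

# |A ∪ B| = |A| + |B| - |A ∩ B|
left_side = len(union)
right_side = (len(math_students) + len(Python_students)
              - len(math_students & Python_students))

print(left_side, right_side)  # 7 7
```

Python removes the duplicates for you, so there is no need to drop Manar and Raneem by hand.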
And that's it for the symbols 😀
Bonus Info 🥳: What Is a Venn Diagram?
Venn Diagram:
An illustration that uses circles to show the relationships among a finite number of groups of things.
For Example:
A= {1,3,5,7} B= {2,3,4,5,6} C= {9,15}
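To read off what the diagram for these three sets would show, here is a small sketch using the same operators as before. The variable names `overlap_ab` and `everything_ab` are made up for this example.

```python
A = {1, 3, 5, 7}
B = {2, 3, 4, 5, 6}
C = {9, 15}

# The circles for A and B overlap on their shared elements.
overlap_ab = A & B
print(overlap_ab == {3, 5})  # True

# Everything covered by the A and B circles together.
everything_ab = A | B
print(everything_ab == {1, 2, 3, 4, 5, 6, 7})  # True

# C shares nothing with A or B, so its circle stands apart.
print(C.isdisjoint(A), C.isdisjoint(B))  # True True
```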
Finally, we have reached the end. Hope you enjoyed it!
Top comments (4)
{} is not a set in Python, it is a dict (write type({}) == dict). The empty set in Python is set().
Sorry if that was confusing.
Nice work!
Thank you Matt 😄