DEV Community

ML_Concepts

Mean, Median, and Mode, Now with Python..!

Introduction

Note: Use this link to check out our original article on measures of central tendency.

Central tendency is a statistical concept that describes the central or typical value of a dataset. In other words, it provides a single value that represents the center or middle of a dataset. There are three main measures of central tendency: mean, median, and mode. Each of these measures provides a different perspective on the center of a dataset, and they are often used in combination to gain a better understanding of the data.

Mean

The mean, also known as the average, is calculated by summing all of the values in a dataset and dividing by the number of values. In Python, you can calculate the mean of a list of numbers using the mean() function from the statistics module. For example, the following code calculates the mean of a list of numbers.

from statistics import mean

# Sum of the values (15) divided by the number of values (5)
numbers = [1, 2, 3, 4, 5]
print(mean(numbers))  # prints 3
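To make the definition concrete, here is a minimal sketch that computes the mean by hand and compares it with statistics.mean(). The helper name manual_mean is made up for this illustration and is not part of any library.

```python
from statistics import mean


def manual_mean(values):
    """Hypothetical helper: sum the values and divide by how many there are."""
    if not values:
        raise ValueError("the mean needs at least one value")
    return sum(values) / len(values)


numbers = [1, 2, 3, 4, 5]
print(manual_mean(numbers))  # 3.0
print(mean(numbers))         # 3
```

Both calls agree on the value. The only visible difference is that the hand-written version always returns a float, because it uses true division.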

Median

The median is the middle value of a dataset when it is ordered from least to greatest. If the dataset has an odd number of values, the median is the middle value. If the dataset has an even number of values, the median is the average of the two middle values. In Python, you can calculate the median of a list of numbers using the median() function from the statistics module. For example, the following code calculates the median of a list of numbers.

from statistics import median

# Five values (an odd count), so the median is the third value in sorted order
numbers = [1, 2, 3, 4, 5]
print(median(numbers))  # prints 3
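The odd and even cases described above can be sketched by hand as follows. The helper name manual_median and the sample lists are assumptions made for this illustration; in practice statistics.median() already handles both cases.

```python
from statistics import median


def manual_median(values):
    """Hypothetical helper: middle value, or average of the two middle values."""
    ordered = sorted(values)  # the median is defined on ordered data
    n = len(ordered)
    if n == 0:
        raise ValueError("the median needs at least one value")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists
        return ordered[mid]
    # Even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_numbers = [5, 1, 4, 2, 3]
even_numbers = [1, 2, 3, 4]
print(manual_median(odd_numbers))   # 3
print(manual_median(even_numbers))  # 2.5
print(median(even_numbers))         # 2.5
```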

Mode

The mode is the value that appears most frequently in a dataset. A dataset can have one mode, multiple modes, or no mode at all. In Python, you can calculate the mode of a list of numbers using the mode() function from the statistics module. For example, the following code calculates the mode of a list of numbers.

from statistics import mode

# 2 appears twice; every other value appears once
numbers = [1, 2, 3, 4, 5, 2]
print(mode(numbers))  # prints 2
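When a dataset has more than one mode, mode() reports only one of them. A short sketch of the difference, assuming Python 3.8 or newer (where multimode() exists and mode() returns the first mode it encounters instead of raising StatisticsError):

```python
from statistics import mode, multimode

# Sample data chosen for illustration: 2 and 3 both appear twice
numbers = [1, 2, 2, 3, 3, 4]

print(multimode(numbers))  # [2, 3]
print(mode(numbers))       # 2

# With no data at all, multimode() returns an empty list
print(multimode([]))       # []
```

If you need to know whether the most frequent value is unique, checking the length of the list returned by multimode() is one simple option.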

Keep in mind that the mode is rarely useful for raw continuous data, where almost every value is unique. It makes the most sense for discrete or categorical data, where values repeat.

Of the three measures, only the mode applies to data that is not numerical. The mode() function accepts such data directly, and you can also use the Counter class from Python's built-in collections module to count the occurrences of each element and then pick the most frequent one.
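As a minimal sketch of the Counter approach, the following code finds the most frequent value in a list of strings. The colors list is made-up sample data.

```python
from collections import Counter

# Made-up categorical data for illustration
colors = ["red", "blue", "red", "green", "blue", "red"]

counts = Counter(colors)

# most_common(1) returns a list with one (value, count) pair
most_common_value, frequency = counts.most_common(1)[0]
print(most_common_value, frequency)  # red 3

# Collect every value tied for the highest count, in case of several modes
highest = max(counts.values())
modes = [value for value, count in counts.items() if count == highest]
print(modes)  # ['red']
```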

It's important to note that no single measure of central tendency suits every dataset. For example, if a dataset contains extreme outliers, the mean will be pulled toward them and may not accurately represent the center of the dataset. In such cases, the median may be a more appropriate measure of central tendency. Additionally, if a dataset has multiple modes, it may be difficult to determine which mode is the most important or relevant.
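The effect of an outlier is easy to see on a small example. The salary figures below are hypothetical values (in thousands) invented for this sketch.

```python
from statistics import mean, median

# Hypothetical salaries in thousands, invented for illustration
salaries = [30, 32, 35, 38, 40]
with_outlier = salaries + [500]  # add one extreme value

print(mean(salaries), median(salaries))          # 35 35
print(mean(with_outlier), median(with_outlier))  # 112.5 36.5
```

A single extreme value more than triples the mean, while the median barely moves.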

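Another important concept related to central tendency is skewness, which refers to the asymmetry of a dataset's distribution. In a symmetric distribution with a single peak, the mean, median, and mode are all equal. As a rule of thumb, a dataset is positively (right) skewed when the mean is greater than the median, with the mode typically the smallest of the three. It is negatively (left) skewed when the mean is less than the median, with the mode typically the largest of the three. These orderings are useful heuristics rather than strict rules, and they can fail for some distributions.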
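The link between skew and the ordering of the three measures can be checked on small samples. This is a rough sketch only: the helper skew_direction and both sample lists are made up for illustration, and comparing the mean with the median is a heuristic, not a formal skewness statistic.

```python
from statistics import mean, median, mode


def skew_direction(values):
    """Hypothetical helper: guess the skew direction from mean vs. median."""
    m, med = mean(values), median(values)
    if m > med:
        return "positive"
    if m < med:
        return "negative"
    return "none detected"


# Invented samples: a long tail to the right, and its mirror image
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
left_skewed = [16 - x for x in right_skewed]

for data in (right_skewed, left_skewed):
    print(mode(data), median(data), mean(data), skew_direction(data))
# 2 3.0 4.6 positive
# 14 13.0 11.4 negative
```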

In conclusion, central tendency measures provide a single value that represents the center or middle of a dataset. Mean, median, and mode are the most commonly used measures of central tendency. The mean is the average of a dataset, the median is the middle value, and the mode is the value that appears most frequently.

Summary

In this article, I tried to explain the measures of central tendency in simple terms. If you have any questions about the post, please put them in the comment section, and I will do my best to answer them.
