As a data scientist, one is expected to collect, organize, summarize, analyze, and draw inferences from data. This is where statistical processes come in handy.
What is Central Tendency?
Central tendency refers to a statistical measure that represents the typical value or central point of a dataset. A measure of central tendency lets one summarize a whole dataset with a single representative value. The three main measures of central tendency are the mean, the median, and the mode.
- Mean (μ) - This is the most commonly used measure. It is the arithmetic average: the sum of all values divided by the number of values. Although the mean is a good representative of the data, it is sensitive to outliers, especially when working with a small sample size.
- Median - The median is the middle value in a dataset when all the items are arranged in ascending or descending order. With an even number of items, it is the average of the two middle values. While the median is easy to compute and is not distorted by skewed data or outliers, it does not use all the information available and is less convenient for further mathematical calculations.
- Mode - The mode is the most common value within a dataset. It can be calculated easily and is the only measure that can be used with data on a nominal scale. However, a dataset may have several modes or none, and the mode is rarely used in further statistical analysis since it is not algebraically defined.
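The three measures can be sketched with Python's built-in `statistics` module. The salary figures and the `summarize` helper below are made up for illustration; the point is how a single outlier moves the mean far more than the median or the mode.

```python
from statistics import mean, median, mode


def summarize(values):
    """Return the three measures of central tendency for a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
    }


# Hypothetical sample: five salaries in thousands (illustrative values only)
salaries = [30, 32, 32, 35, 41]
print(summarize(salaries))
# {'mean': 34, 'median': 32, 'mode': 32}

# The same sample with one extreme value added
with_outlier = salaries + [200]
print(summarize(with_outlier))
# mean jumps to about 61.67, median moves only to 33.5, mode stays 32
```

Note that `statistics.mode` returns the first mode it encounters when several values are tied; `statistics.multimode` returns all of them.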
Which Measure is Best to Use?
If working with nominal data, one cannot calculate the mean or the median, so the mode is the only option. With ordinal data, the values can be ranked, so the median or the mode can be used, but the mean is still not meaningful.
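As a rough illustration of categorical data, the sketch below takes the mode of a nominal variable and the median of an ordinal one. The eye colors, survey responses, and the `levels` ordering are hypothetical. For the ordinal case, it assumes the categories can be mapped to ranks and uses `statistics.median_low` so that the result is always one of the actual categories.

```python
from statistics import mode, median_low

# Nominal data: categories have no order, so only the mode is meaningful
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue"]
most_common_color = mode(eye_colors)
print(most_common_color)  # brown

# Ordinal data: categories have an order but no meaningful distances,
# so we rank them and take the median rank (assumed ordering below)
levels = ["low", "medium", "high"]
responses = ["low", "high", "medium", "medium", "high", "low", "medium"]
ranks = [levels.index(r) for r in responses]
median_response = levels[median_low(ranks)]
print(median_response)  # medium
```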
Assuming you have quantitative data, it is best to use either the mean or the median. However, if the data is skewed or has outliers, one should opt for the median.
In every other circumstance, one can use the mean, especially since it uses every observation and produces the smallest sum of squared deviations from the data.
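One way to read the claim about errors is that the mean minimizes the sum of squared deviations, while the median minimizes the sum of absolute deviations. The short check below uses made-up data and two hypothetical helper functions; it is a sketch of that property, not a proof.

```python
from statistics import mean, median


def sum_squared_error(values, center):
    """Sum of squared deviations of the values from a chosen center."""
    return sum((v - center) ** 2 for v in values)


def sum_absolute_error(values, center):
    """Sum of absolute deviations of the values from a chosen center."""
    return sum(abs(v - center) for v in values)


# Hypothetical right-skewed sample (illustrative values only)
data = [2, 3, 3, 5, 12]
m, med = mean(data), median(data)  # 5 and 3

print(sum_squared_error(data, m), sum_squared_error(data, med))    # 66 86
print(sum_absolute_error(data, m), sum_absolute_error(data, med))  # 14 12
```

The mean wins on squared error and the median wins on absolute error, which is why the median is the safer choice when a few extreme values would otherwise dominate.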