Rutik Bhoyar


Describing Dataset through...

Descriptive statistics are used to describe data in a simple way.
A raw dataset is difficult to describe directly; descriptive statistics summarize it through:

  • Measures of Central Tendency (Summary Statistics): Mean, Median, Mode
  • Measures of Spread: Range, Quartiles, Percentiles, Absolute Deviation, Variance, and Standard Deviation
  • Measure of Symmetry: Skewness
  • Measure of Peakedness: Kurtosis

Measure of Central Tendency: The goal of a measure of central tendency is to come up with a single number that best describes a distribution of scores.
Choosing one of the three measures of central tendency over the others depends on two factors:
1. Scale of Measurement: The measure should make sense given the nature of the scores.
2. Shape of the Frequency Distribution: The measure should accurately summarize the distribution.
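
To see why the shape of the distribution matters, here is a small sketch in Python. The two lists are made-up scores (not from any real dataset), and the only difference between them is one extreme value.

```python
from statistics import mean, median

symmetric = [4, 5, 6, 7, 8]   # hypothetical, roughly symmetric scores
skewed = [4, 5, 6, 7, 80]     # same scores, but with one extreme value

# With a symmetric shape, the mean and the median agree.
print(mean(symmetric), median(symmetric))

# With a skewed shape, the mean is pulled toward the extreme value,
# while the median stays in the middle of the scores.
print(mean(skewed), median(skewed))
```

In the skewed list the median is arguably the better one-number summary, since only one score is anywhere near the mean.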

MEAN: The arithmetic mean is the average score or value, computed by adding together all the scores and dividing by the number of scores. It uses information from every single score.
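
As a quick sketch, the calculation can be done by hand or with Python's standard `statistics` module. The `scores` list is just an illustrative set of numbers.

```python
from statistics import mean

scores = [9, 6, 3, 5, 8]  # illustrative scores

# Add all the scores together, then divide by the number of scores.
manual_mean = sum(scores) / len(scores)

# The standard library should give the same result.
library_mean = mean(scores)

print(manual_mean, library_mean)
```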
Two related terms are the Trimmed Mean and the Geometric Mean.
Trimmed Mean is obtained by deleting a percentage of the smallest and the largest values from a data set and then computing the mean of the remaining values.
Geometric Mean is calculated by finding the nth root of the product of n values.
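
Both can be sketched in a few lines of Python. The helper `trimmed_mean` below is a hypothetical function written for this example, and it assumes the same proportion is dropped from each end of the sorted data. `geometric_mean` comes from the standard `statistics` module (Python 3.8+). The data is made up.

```python
from statistics import geometric_mean, mean

def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the values from each end (an assumption)."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)      # how many values to drop per end
    remaining = ordered[k:len(ordered) - k]
    return mean(remaining)

data = [2, 4, 5, 6, 7, 8, 9, 10, 11, 100]  # hypothetical data; 100 is an outlier

print(mean(data))                # plain mean, pulled upward by 100
print(trimmed_mean(data, 0.1))   # drops 2 and 100 before averaging

values = [1, 3, 9]
print(geometric_mean(values))    # cube root of 1 * 3 * 9, approximately 3
```

Note how the trimmed mean is much less affected by the single outlier than the plain mean.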

MEDIAN: There are three methods for computing the median, depending on the distribution of scores.

1. If you have an odd number of scores, sort them and pick the middle value as the median.
For example, 9 6 3 5 8 sorted is 3 5 6 8 9: Here the median=6.

2. If you have an even number of scores, sort them and take the average of the two middle scores.
For example, 9 6 3 5 8 9 sorted is 3 5 6 8 9 9 --> (6+8)/2 --> 7: Here the median=7.

3. If several scores share the same value in the middle of the distribution, use the formula for percentiles.
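
The first two methods can be sketched in Python as follows. `manual_median` is a hypothetical helper written for this example; the key step is sorting the scores before looking for the middle. The standard library's `statistics.median` is used as a cross-check. The third method (the percentile formula) is not covered by this sketch.

```python
from statistics import median

def manual_median(scores):
    ordered = sorted(scores)   # the median is defined on ordered scores
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]    # odd count: the single middle value
    # even count: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_scores = [9, 6, 3, 5, 8]
even_scores = [9, 6, 3, 5, 8, 9]

print(manual_median(odd_scores), median(odd_scores))    # 6 6
print(manual_median(even_scores), median(even_scores))  # 7.0 7.0
```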

MODE: The mode is the most frequently occurring value. If the data is categorical (measured on a nominal scale), then only the mode can be calculated.
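
Here is a small sketch with made-up categorical data. It uses `mode` and `multimode` from Python's standard `statistics` module (`multimode` needs Python 3.8+), which work on labels as well as numbers.

```python
from statistics import mode, multimode

colors = ["red", "blue", "red", "green", "blue", "red"]  # hypothetical nominal data
print(mode(colors))       # the most frequent category: red

pets = ["cat", "dog", "cat", "dog", "bird"]               # two categories tie here
print(multimode(pets))    # every category tied for most frequent: ['cat', 'dog']
```

A mean or median of `colors` would not make sense, because the labels have no numeric value or order.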

In the next post, I'll be writing about dispersion measures.👨
