DEV Community

Jeffa Jeffa

Skewness and Kurtosis

When analyzing data in statistics or data science, it is not enough to look only at measures of central tendency (mean, median, mode) and variability (variance, standard deviation). To understand the shape of a dataset’s distribution, we also use skewness and kurtosis. These two measures describe how the data deviates from a normal (bell-shaped) distribution.

Skewness

Skewness measures the asymmetry of a distribution. A normal distribution has a skewness of 0 because it is symmetric around its mean; the same holds for any other symmetric distribution.

Types of Skewness:

  • Positive Skew (Right-skewed):
    The right tail is longer than the left.
    Most data values are concentrated on the left, but a few very large values pull the mean to the right.

  • Negative Skew (Left-skewed):
    The left tail is longer than the right.
    Most data values are concentrated on the right, but a few very small values pull the mean to the left.

  • Zero Skew:
    The distribution is symmetric.
    For a symmetric distribution with a single peak, Mean = Median = Mode.
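
To make the three cases concrete, here is a minimal sketch that computes skewness from the second and third central moments (m3 / m2^1.5) using only the Python standard library. The `skewness` helper and the three small datasets are made up for illustration; library implementations may apply small-sample corrections, so their values can differ slightly.

```python
import statistics

def skewness(values):
    """Moment-based skewness: third central moment / (second central moment ** 1.5)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical datasets, chosen only to illustrate each case
right_skewed = [1, 2, 2, 3, 3, 3, 4, 10]    # one large value stretches the right tail
left_skewed = [-x for x in right_skewed]    # mirror image: long left tail
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right-skewed", right_skewed),
                   ("left-skewed", left_skewed),
                   ("symmetric", symmetric)]:
    print(f"{name}: skewness={skewness(data):.3f}, "
          f"mean={statistics.mean(data)}, median={statistics.median(data)}")
```

In the right-skewed sample the mean lands above the median, and in the mirrored sample it lands below, which matches the "pull" described in the list above.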

Kurtosis

Kurtosis measures the “tailedness” of a distribution: how much of its weight sits in the tails, and therefore how prone it is to extreme outliers, compared to a normal distribution.

Types of Kurtosis:

  • Leptokurtic: Heavy tails and a sharp peak. More extreme outliers than a normal distribution (excess kurtosis above 0).
  • Platykurtic: Light tails and a flatter peak. Fewer outliers than a normal distribution (excess kurtosis below 0).
  • Mesokurtic: Tails and peak similar to the normal bell-shaped curve (kurtosis of 3, that is, excess kurtosis of 0).
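
The three types can be told apart numerically with excess kurtosis, which is the fourth central moment divided by the squared variance, minus 3 (the kurtosis of a normal distribution). The sketch below is a plain standard-library illustration; the `excess_kurtosis` helper, the sample data, and the random seed are assumptions made for the example, not part of any particular library.

```python
import random

def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

# Hypothetical datasets, chosen only to illustrate each type
heavy_tailed = [-10, -1, -1, 0, 0, 0, 0, 1, 1, 10]  # two extreme values in the tails
flat = list(range(1, 11))                           # evenly spread, no tails to speak of

random.seed(0)  # fixed seed so the run is repeatable
normal_like = [random.gauss(0, 1) for _ in range(100_000)]

print("heavy-tailed (leptokurtic):", round(excess_kurtosis(heavy_tailed), 3))  # positive
print("flat (platykurtic):", round(excess_kurtosis(flat), 3))                  # negative
print("normal sample (mesokurtic):", round(excess_kurtosis(normal_like), 3))   # close to 0
```

With a finite random sample the mesokurtic value will not be exactly 0, only close to it.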

Importance of Skewness and Kurtosis in Data Science

  1. Data Analysis: They help reveal whether data meets the normality assumptions that many statistical methods rely on.
  2. Risk Management: In finance, skewness and kurtosis help in understanding market risk. Strongly skewed or leptokurtic returns signal a higher chance of extreme outcomes than a normal distribution would predict.
  3. Decision-Making: They help analysts avoid misleading conclusions that come from looking at mean and standard deviation alone.
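
As a rough illustration of the last point, the sketch below rescales two made-up datasets so that both have mean 0 and standard deviation 1, then compares their shape. The helper names `standardize` and `shape` and the data are hypothetical, chosen only for this example.

```python
import statistics

def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(x - mean) / sd for x in values]

def shape(values):
    """Return (skewness, excess kurtosis) computed from central moments."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3

# Two hypothetical datasets forced to share the same mean and standard deviation
evenly_spread = standardize([1, 2, 3, 4, 5, 6, 7, 8, 9])
lopsided = standardize([1, 1, 1, 2, 2, 3, 4, 6, 16])

for name, data in [("evenly spread", evenly_spread), ("lopsided", lopsided)]:
    skew, excess = shape(data)
    print(f"{name}: mean={statistics.fmean(data):.2f}, sd={statistics.pstdev(data):.2f}, "
          f"skewness={skew:.2f}, excess kurtosis={excess:.2f}")
```

Both datasets report the same mean and standard deviation, yet only the skewness and kurtosis values show that the second one is dominated by a single large observation.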

Conclusion

Skewness tells us about the symmetry of the data. Kurtosis tells us about outliers and tail heaviness. Together, they give a fuller picture of a distribution than averages and variability alone, helping statisticians, data scientists, and decision-makers draw more accurate insights.