Understanding Skewness and Kurtosis in Data Distribution
In statistics, skewness and kurtosis are two measures that describe the shape of a probability distribution. Skewness captures how asymmetric the distribution is around its mean, and kurtosis captures how heavy its tails are. Together they complement the mean and variance, which describe only location and spread.
1. Skewness
Skewness refers to the asymmetry of the distribution of a dataset. A perfectly symmetric distribution has a skewness of zero. If the distribution is skewed to the left (negative skew), the tail is longer on the left side. If it is skewed to the right (positive skew), the tail is longer on the right side.
Formula for Skewness (the bias-adjusted sample estimate, also known as the adjusted Fisher-Pearson coefficient):
$$
\text{Skewness} = \frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^3
$$
Where:
- $ n $ is the sample size (at least 3, so the denominator is nonzero)
- $ x_i $ is the $ i $-th data point
- $ \bar{x} $ is the sample mean
- $ s $ is the sample standard deviation (computed with the $ n - 1 $ denominator)
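
As a quick sanity check, the formula can be worked by hand on a small made-up sample, $ x = (1, 2, 3, 4, 10) $ (illustrative values, not taken from any real dataset). Here $ n = 5 $, $ \bar{x} = 4 $, $ s^2 = 50/4 = 12.5 $, and the cubed deviations $ (x_i - \bar{x})^3 $ sum to $ 180 $, so:

$$
\text{Skewness} = \frac{5}{4 \cdot 3} \cdot \frac{180}{12.5^{3/2}} = \frac{6}{\sqrt{12.5}} \approx 1.70
$$

The positive result reflects the long right tail created by the single large value, 10.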
 
Interpretation:
- Skewness ≈ 0: Approximately symmetric distribution (zero skewness alone does not guarantee symmetry)
- Skewness > 0: Right-skewed (positive skew)
- Skewness < 0: Left-skewed (negative skew)
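
The skewness formula and the sign rules above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function name `sample_skewness` and the three small samples are assumptions made up for the example.

```python
import math


def sample_skewness(data):
    """Bias-adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s)^3)."""
    n = len(data)
    if n < 3:
        raise ValueError("sample skewness needs at least 3 data points")
    mean = sum(data) / n
    # Sample standard deviation, using the n - 1 denominator.
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    cubed_z_sum = sum(((x - mean) / s) ** 3 for x in data)
    return n / ((n - 1) * (n - 2)) * cubed_z_sum


# Hypothetical samples chosen only to show the three cases.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 3, 4, 10]
left_tailed = [-x for x in right_tailed]  # mirror image of right_tailed

print(sample_skewness(symmetric))     # approximately 0
print(sample_skewness(right_tailed))  # positive
print(sample_skewness(left_tailed))   # negative, same magnitude as above
```

If SciPy or pandas is available, `scipy.stats.skew(data, bias=False)` and `pandas.Series(data).skew()` should agree with this function, since both apply the same bias adjustment.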
 
2. Kurtosis
Kurtosis measures the "tailedness" of a distribution. A normal distribution has a kurtosis of 3, so many references and software packages report excess kurtosis (kurtosis minus 3), which is 0 for a normal distribution. Distributions with higher kurtosis have heavier tails and produce more outliers, while those with lower kurtosis have lighter tails.
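Formula for Kurtosis (this is the bias-corrected sample excess kurtosis, which is 0 for a normal distribution; add 3 to compare it against the benchmark of 3):
$$
\text{Excess kurtosis} = \frac{n(n+1)}{(n-1)(n-2)(n-3)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^4 - \frac{3(n-1)^2}{(n-2)(n-3)}
$$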
Where:
- $ n $ is the sample size (at least 4, so the denominators are nonzero)
- $ x_i $ is the $ i $-th data point
- $ \bar{x} $ is the sample mean
- $ s $ is the sample standard deviation (computed with the $ n - 1 $ denominator)
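
A short worked example may help here. It uses the standard bias-corrected excess kurtosis, whose subtracted term is $ \frac{3(n-1)^2}{(n-2)(n-3)} $, on a made-up sample $ x = (1, 2, 3, 4, 5) $ (illustrative values only). Here $ n = 5 $, $ \bar{x} = 3 $, $ s^2 = 2.5 $, and $ \sum (x_i - \bar{x})^4 = 34 $, so the sum of fourth powers of the standardized values is $ 34 / 2.5^2 = 5.44 $:

$$
\text{Excess kurtosis} = \frac{5 \cdot 6}{4 \cdot 3 \cdot 2} \cdot 5.44 - \frac{3 \cdot 4^2}{3 \cdot 2} = 6.8 - 8 = -1.2
$$

The negative value (about 1.8 on the scale where a normal distribution scores 3) fits evenly spaced data with no extreme values.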
 
Interpretation:
- Kurtosis = 3 (excess kurtosis = 0): Tails comparable to a normal distribution (mesokurtic)
- Kurtosis > 3 (excess kurtosis > 0): Heavy-tailed distribution (leptokurtic)
- Kurtosis < 3 (excess kurtosis < 0): Light-tailed distribution (platykurtic)
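
The same idea can be sketched in plain Python for kurtosis. This minimal example assumes the standard bias-corrected sample excess kurtosis, with the subtracted term $ \frac{3(n-1)^2}{(n-2)(n-3)} $, so a normal distribution scores about 0 rather than 3. The function name `sample_excess_kurtosis` and both samples are hypothetical, chosen only for illustration.

```python
import math


def sample_excess_kurtosis(data):
    """Bias-corrected sample excess kurtosis (normal distribution is about 0)."""
    n = len(data)
    if n < 4:
        raise ValueError("sample kurtosis needs at least 4 data points")
    mean = sum(data) / n
    # Sample standard deviation, using the n - 1 denominator.
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    if s == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    fourth_z_sum = sum(((x - mean) / s) ** 4 for x in data)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth_z_sum - correction


# Hypothetical samples: evenly spaced values versus values with one outlier.
flat = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]

excess_flat = sample_excess_kurtosis(flat)
excess_outlier = sample_excess_kurtosis(with_outlier)

print(excess_flat)         # negative: lighter tails than a normal distribution
print(excess_outlier)      # positive: the outlier makes the tail heavy
print(excess_flat + 3)     # same result on the "normal = 3" scale
```

If SciPy or pandas is available, `scipy.stats.kurtosis(data, fisher=True, bias=False)` and `pandas.Series(data).kurt()` should return the same excess values.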
 
Summary
| Measure | Description | Sample formula |
|---|---|---|
| Skewness | Asymmetry of the distribution | $ \frac{n}{(n-1)(n-2)} \sum \left( \frac{x_i - \bar{x}}{s} \right)^3 $ |
| Kurtosis | Tailedness of the distribution (formula gives excess kurtosis, which is 0 for a normal distribution) | $ \frac{n(n+1)}{(n-1)(n-2)(n-3)} \sum \left( \frac{x_i - \bar{x}}{s} \right)^4 - \frac{3(n-1)^2}{(n-2)(n-3)} $ |
Understanding skewness and kurtosis helps you judge whether methods that assume a normal distribution suit your data, which matters when analyzing data and making decisions in fields such as finance, economics, and the social sciences.
    