1. What Are Measures of Position?
Measures of position describe where a particular data value stands relative to the rest of the dataset.
They help answer questions like:
Is this value high, low, or typical?
What proportion of data lies below (or above) a given value?
How extreme is a data point?
Unlike measures of central tendency (mean, median) or dispersion (variance, standard deviation), measures of position focus on relative standing.
2. Why Measures of Position Matter in Data Science
In data science, measures of position are crucial for:
Outlier detection (e.g., IQR method)
Feature scaling and normalization
Risk assessment (finance, insurance)
Model evaluation (percentile-based metrics)
Fair comparisons across populations
Example:
A test score of 85 means very different things depending on whether it is in the 60th percentile or the 95th percentile.
3. Types of Measures of Position
Main categories:
Percentiles
Quartiles
Deciles
Z-scores (Standard Scores)
Ranks
Each provides a different lens on relative position.
4. Percentiles
Definition
The p-th percentile is the value below which p% of the data falls.
Example:
90th percentile = value below which 90% of observations lie.
Properties
Percentiles range from 0 to 100
Not evenly spaced in value; the spacing depends on the data distribution
How to Compute Percentiles
Given ordered data of size n:
Position of $P_p = \frac{p}{100}(n + 1)$
If the position is not an integer, interpolate linearly between the two neighbouring ordered values.
Example:
Data:
10, 20, 30, 40, 50
Find the 60th percentile.
Position of $P_{60} = \frac{60}{100}(5 + 1) = 3.6$
Between 3rd and 4th values:
$30 + 0.6(40 - 30) = 36$
So, P₆₀ = 36
Interpretation
60% of the data is ≤ 36
40% is ≥ 36
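The position rule and interpolation step above can be sketched in Python. This is a minimal illustration, assuming the (n + 1) convention used in this section; `percentile_n_plus_1` is a hypothetical helper name, and other textbooks and libraries use slightly different conventions that can give different answers on small samples.

```python
def percentile_n_plus_1(values, p):
    """Return the p-th percentile using the position rule p/100 * (n + 1)."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("values must not be empty")

    # 1-based position in the sorted data (may be fractional).
    pos = p * (n + 1) / 100

    # Positions before the first or at/after the last value are clamped.
    if pos <= 1:
        return data[0]
    if pos >= n:
        return data[-1]

    lower = int(pos)       # 1-based index of the value just below the position
    frac = pos - lower     # fractional part, used as the interpolation weight
    below = data[lower - 1]
    above = data[lower]
    return below + frac * (above - below)


# Worked example from the text: position 3.6, between 30 and 40.
p60 = percentile_n_plus_1([10, 20, 30, 40, 50], 60)
print(round(p60, 6))
```

The result agrees with the hand calculation (36) up to floating-point rounding.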
5. Quartiles
Quartiles divide data into four equal parts.
| Quartile | Meaning |
| --- | --- |
| Q₁ | 25th percentile |
| Q₂ | 50th percentile (Median) |
| Q₃ | 75th percentile |
Interquartile Range (IQR)
$\mathrm{IQR} = Q_3 - Q_1$
Why important?
Measures spread of the middle 50%
Robust to outliers
Used heavily in box plots and anomaly detection
Outlier Detection (IQR Rule)
$\text{Lower bound} = Q_1 - 1.5 \times \mathrm{IQR}$
$\text{Upper bound} = Q_3 + 1.5 \times \mathrm{IQR}$
Values outside these bounds are considered outliers.
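As a rough sketch of the IQR rule, the snippet below uses `statistics.quantiles` from the Python standard library with `method="exclusive"`, which follows the same (n + 1) position rule as the percentile section. The helper name `iqr_outliers`, the multiplier parameter `k`, and the sample data are assumptions made for illustration.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return (lower_bound, upper_bound, outliers) under the IQR rule."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    q1, _median, q3 = quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers


# Made-up sample with one obviously extreme value.
data = [10, 12, 13, 14, 15, 16, 18, 95]
lower, upper, outliers = iqr_outliers(data)
print(lower, upper, outliers)  # 4.375 25.375 [95]
```

Here Q₁ = 12.25 and Q₃ = 17.5, so the IQR is 5.25 and only 95 falls outside the bounds.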
6. Deciles
Deciles split data into 10 equal parts.
| Decile | Percentile |
| --- | --- |
| D₁ | 10th |
| D₅ | 50th (Median) |
| D₉ | 90th |
Usage
Income distribution analysis
Population studies
Risk stratification
Example:
Top 10% income earners = above the 9th decile
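A small sketch of the decile idea, again assuming `statistics.quantiles` with the (n + 1) "exclusive" convention. The income values are toy numbers (1 to 99 in arbitrary units), not real data.

```python
from statistics import quantiles

# Toy data: 99 made-up incomes, 1, 2, ..., 99 in arbitrary units.
incomes = list(range(1, 100))

# n=10 returns the nine cut points D1..D9.
deciles = quantiles(incomes, n=10, method="exclusive")
d9 = deciles[8]  # 9th decile, i.e. the 90th percentile

# "Top earners" in the sense used above: strictly above the 9th decile.
top_earners = [x for x in incomes if x > d9]

print(deciles)
print(d9, len(top_earners))
```

With this toy data the cut points land on 10, 20, ..., 90, and nine of the 99 values sit above D₉, which is roughly the top 10%.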
7. Z-Scores (Standard Scores)
Definition
A Z-score measures how many standard deviations a value is from the mean.
$Z = \frac{x - \mu}{\sigma}$
Where:
x = observation
μ = mean
σ = standard deviation
Interpretation
| Z-score | Meaning |
| --- | --- |
| 0 | Exactly at mean |
| +1 | 1 SD above mean |
| -2 | 2 SD below mean |
Why Z-Scores Are Powerful
Standardize different scales
Enable comparison across datasets
Fundamental in machine learning pre-processing
Basis of normal distribution probabilities
Example
Mean = 70
SD = 10
Score = 85
$Z = \frac{85 - 70}{10} = 1.5$
Interpretation:
The score is 1.5 standard deviations above the mean
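The same calculation can be written as a short Python sketch. The helper names `z_score` and `standardize` are hypothetical, and `standardize` assumes the population standard deviation (`statistics.pstdev`); using the sample standard deviation instead is an equally common choice.

```python
from statistics import mean, pstdev


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    if sigma == 0:
        raise ValueError("sigma must be non-zero")
    return (x - mu) / sigma


def standardize(values):
    """Convert every value to its Z-score within the given data."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [z_score(v, mu, sigma) for v in values]


# Worked example from the text: mean 70, SD 10, score 85.
z = z_score(85, 70, 10)
print(z)  # 1.5

# Standardizing a whole (made-up) feature column.
print(standardize([60, 70, 80]))
```

After standardizing, the values have mean 0 and standard deviation 1, which is what makes features on different scales comparable.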
8. Relationship Between Z-Scores and Percentiles
In a normal distribution:
| Z | Percentile |
| --- | --- |
| 0 | 50% |
| 1 | ~84% |
| 2 | ~97.7% |
| -1 | ~16% |
This connection is vital in:
Hypothesis testing
Probability estimation
Statistical modelling
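To make the Z-to-percentile link concrete, here is a minimal sketch using `statistics.NormalDist` from the Python standard library. It assumes the data really are normally distributed; the function names `z_to_percentile` and `percentile_to_z` are chosen for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def z_to_percentile(z):
    """Percentage of a normal population lying below the given Z-score."""
    return 100 * standard_normal.cdf(z)


def percentile_to_z(pct):
    """Z-score below which pct percent of a normal population lies (0 < pct < 100)."""
    return standard_normal.inv_cdf(pct / 100)


for z in (0, 1, 2, -1):
    print(z, round(z_to_percentile(z), 1))

print(round(percentile_to_z(97.5), 2))
```

Under this assumption Z = 2 comes out near 97.7%, while the familiar 97.5% cut-off corresponds to Z ≈ 1.96.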
9. Ranks
Definition
Rank assigns an ordinal position to each observation.
Example:
Highest score → Rank 1
Next → Rank 2
Types of Ranking
Dense ranking (1,2,2,3)
Competition ranking (1,2,2,4)
Fractional ranking (1, 2.5, 2.5, 4): tied values share the average of the positions they occupy
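The three tie-handling schemes can be compared side by side with a small pure-Python sketch. `rank_scores` is a hypothetical helper that ranks from highest to lowest, matching the example above; the scores are made up.

```python
def rank_scores(scores, method="competition"):
    """Rank scores so that the highest score gets rank 1."""
    ordered = sorted(scores, reverse=True)
    distinct = sorted(set(scores), reverse=True)

    ranks = {}
    for value in distinct:
        first = ordered.index(value) + 1   # 1-based position of the first tied value
        count = ordered.count(value)       # how many values share this score
        if method == "dense":
            # Consecutive ranks with no gaps after ties.
            ranks[value] = distinct.index(value) + 1
        elif method == "competition":
            # Ties share the first position; later ranks skip ahead.
            ranks[value] = first
        elif method == "fractional":
            # Ties share the average of the positions they occupy.
            ranks[value] = first + (count - 1) / 2
        else:
            raise ValueError(f"unknown method: {method}")

    # Return ranks in the original order of the input.
    return [ranks[s] for s in scores]


scores = [90, 85, 85, 70]
print(rank_scores(scores, "dense"))        # [1, 2, 2, 3]
print(rank_scores(scores, "competition"))  # [1, 2, 2, 4]
print(rank_scores(scores, "fractional"))   # [1.0, 2.5, 2.5, 4.0]
```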
Limitations
Ignores magnitude differences
Not suitable for distance-based models
10. Measures of Position vs Measures of Central Tendency

| Aspect | Central Tendency | Position |
| --- | --- | --- |
| Focus | Typical value | Relative standing |
| Examples | Mean, Median | Percentiles, Z-scores |
| Outliers | Sensitive (mean) | Often robust |
| Use in ML | Baseline | Feature scaling, anomaly detection |
11. Real-World Data Science Applications
- Machine Learning
Feature normalization using Z-scores
Quantile transformation
- Finance
Value-at-Risk (VaR), a percentile-based measure
Risk classification using deciles
- Healthcare
Growth percentiles (BMI-for-age)
Lab result interpretation
- Education
Standardized test scores
Admission cut-offs
12. Summary Table

| Measure | Purpose | Robust to Outliers |
| --- | --- | --- |
| Percentile | Relative position | Yes |
| Quartile | Spread & position | Yes |
| Decile | Distribution segmentation | Yes |
| Z-score | Standardized distance | No |
| Rank | Order comparison | Yes |
13. Key Takeaways
Measures of position explain where a value lies, not just what it is.
Percentiles and quartiles are distribution-free.
Z-scores can be computed for any distribution and allow comparisons across scales, but converting them to percentiles assumes normality.
In data science, they are foundational for scaling, anomaly detection, and interpretation.