Rahul Kumar

Performance Metrics For Regression Task

1. Mean Squared Error (MSE)

Explanation

Mean Squared Error (MSE) measures the average squared difference between estimated and actual values. It gives higher weight to larger errors due to squaring.

Mathematical Formula

$MSE = (1/n) * ∑[(y_i - ŷ_i)^2]$

Where:

  • $n$ is the number of data points.
  • $y_i$ is the actual value.
  • $ŷ_i$ is the predicted value.

When to Use

  • When large errors should be penalized more heavily
  • General regression model evaluation
  • When errors are roughly normally distributed

Python Implementation

from sklearn.metrics import mean_squared_error

def calculate_mse(y_true, y_pred):
    return mean_squared_error(y_true, y_pred)

Pros

  • Simple to understand
  • Mathematically convenient
  • Differentiable
  • Symmetric error measurement

Cons

  • Sensitive to outliers
  • Squared values make direct interpretation harder
  • Reported in squared units of the target variable

Fun Facts

  • Commonly used in machine learning optimization
  • Fundamental in least squares regression
  • Minimized by linear regression

2. Root Mean Squared Error (RMSE)

Explanation

RMSE is the square root of MSE, bringing the metric back to the original data scale.

Mathematical Formula

$RMSE = √[(1/n) * ∑[(y_i - ŷ_i)^2]]$

When to Use

  • Need error in original units
  • Want to penalize large errors
  • Regression model comparison

Python Implementation

from sklearn.metrics import mean_squared_error
import numpy as np

def calculate_rmse(y_true, y_pred):
    return np.sqrt(mean_squared_error(y_true, y_pred))

Pros

  • Interpretable in original units
  • Sensitive to large errors
  • Symmetric error measurement

Cons

  • Sensitive to outliers
  • Still amplifies large errors, since they are squared before the root is taken
  • Less intuitive for non-technical audiences

Fun Facts

  • Equals the standard deviation of the residuals when predictions are unbiased
  • Often preferred over MSE for reporting

3. Mean Absolute Error (MAE)

Explanation

MAE calculates the average absolute difference between predicted and actual values.

Mathematical Formula

$MAE = (1/n) * ∑|y_i - ŷ_i|$

When to Use

  • Less sensitive to outliers
  • Linear error measurement
  • Robust regression evaluation

Python Implementation

from sklearn.metrics import mean_absolute_error

def calculate_mae(y_true, y_pred):
    return mean_absolute_error(y_true, y_pred)

Pros

  • Less sensitive to outliers
  • Interpretable, in the same units as the target
  • Linear error measurement

Cons

  • Doesn't distinguish overestimation from underestimation
  • Less mathematically convenient for optimization (not differentiable at zero)
  • Penalizes large errors less heavily than MSE

Fun Facts

  • Also known as L1 loss
  • More robust for non-Gaussian error distributions
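
To make the outlier claims above concrete, here is a small illustrative comparison of MSE and MAE; the sample values are made up purely for demonstration.

import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_clean = np.array([2.8, 5.2, 6.9, 9.1])     # small errors everywhere
y_outlier = np.array([2.8, 5.2, 6.9, 19.0])  # one large miss

# MSE grows roughly with the square of the large miss; MAE grows linearly
print(mean_squared_error(y_true, y_clean), mean_absolute_error(y_true, y_clean))
print(mean_squared_error(y_true, y_outlier), mean_absolute_error(y_true, y_outlier))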

4. Mean Absolute Percentage Error (MAPE)

Explanation

MAPE represents the average percentage difference between predicted and actual values.

Mathematical Formula

$MAPE = (1/n) * ∑|(y_i - ŷ_i) / y_i| * 100$

When to Use

  • Percentage-based error comparison
  • Similar scale data
  • Forecasting and time series

Python Implementation

import numpy as np

def calculate_mape(y_true, y_pred):
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

Pros

  • Scale-independent
  • Easy percentage interpretation
  • Comparable across different scales

Cons

  • Undefined when true value is zero
  • Dominated by errors on small actual values
  • Asymmetric: over-forecasts can contribute unbounded percentage error while under-forecasts are capped at 100%, so it favors models that predict low

Fun Facts

  • Commonly used in financial forecasting
  • Can be problematic with small true values
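
As a quick illustration of the small-value problem (the numbers are made up), using the calculate_mape defined above. Recent scikit-learn versions (0.24+) also ship a built-in mean_absolute_percentage_error, which returns a fraction rather than a percentage.

import numpy as np
from sklearn.metrics import mean_absolute_percentage_error

# identical absolute error of 1.0, very different MAPE
y_true_large = np.array([100.0, 200.0, 300.0])
y_true_small = np.array([0.5, 1.0, 1.5])

print(calculate_mape(y_true_large, y_true_large + 1.0))  # about 0.6%
print(calculate_mape(y_true_small, y_true_small + 1.0))  # about 122%

# sklearn's built-in returns a fraction (not multiplied by 100)
print(mean_absolute_percentage_error(y_true_small, y_true_small + 1.0))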

5. R-squared (Coefficient of Determination)

Explanation

R-squared represents the proportion of variance in the dependent variable predictable from independent variables.

Mathematical Formula

$R² = 1 - (SS_res / SS_tot)$

Where:

  • $SS_res$ is the sum of squares of residuals
  • $SS_tot$ is the total sum of squares

When to Use

  • Model goodness-of-fit assessment
  • Linear regression evaluation
  • Comparing model predictive power

Python Implementation

from sklearn.metrics import r2_score

def calculate_r2(y_true, y_pred):
    return r2_score(y_true, y_pred)

Pros

  • Easy to interpret (usually falls between 0 and 1)
  • Indicates model explanatory power
  • Normalized measure

Cons

  • Says nothing about the absolute size of prediction errors
  • Can be misleading with non-linear relationships
  • Never decreases when more predictors are added, even uninformative ones

Fun Facts

  • Typically between 0 and 1, but can be negative when a model performs worse than simply predicting the mean
  • Not always best for model selection
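
A quick sanity check on that range (sample numbers are purely illustrative): a model that predicts worse than the mean of the targets yields a negative score.

import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_good = np.array([1.1, 1.9, 3.2, 3.9])  # close to the truth
y_bad = np.array([4.0, 3.0, 2.0, 1.0])   # systematically wrong direction

print(r2_score(y_true, y_good))  # close to 1
print(r2_score(y_true, y_bad))   # negative: worse than predicting the mean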

6. Adjusted R-squared

Explanation

Adjusted R-squared penalizes adding unnecessary predictors to the model.

Mathematical Formula

$Adjusted R² = 1 - [(1 - R²) * (n - 1) / (n - p - 1)]$

Where:

  • $n$ is the number of data points
  • $p$ is the number of predictors

When to Use

  • Complex models with multiple predictors
  • Preventing overfitting
  • Model complexity comparison

Python Implementation

import numpy as np
from sklearn.linear_model import LinearRegression

def calculate_adjusted_r2(X, y):
    model = LinearRegression()
    model.fit(X, y)
    r2 = model.score(X, y)  # ordinary R² on the training data
    n, p = X.shape          # n samples, p predictors
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return adj_r2

Pros

  • Prevents overfitting
  • Accounts for model complexity
  • More reliable for complex models

Cons

  • Still inherits the limitations of R²
  • Assumes a linear relationship
  • Not suitable for non-linear models

Fun Facts

  • Used in feature selection
  • Developed to improve R-squared limitations
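
A small illustrative check with random data (so the exact numbers will vary): plain R² can only go up as columns are added, while adjusted R² discounts columns that carry no information.

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 80
X = rng.normal(size=(n, 2))
y = X[:, 0] - X[:, 1] + rng.normal(scale=1.5, size=n)
X_noise = np.hstack([X, rng.normal(size=(n, 15))])  # add 15 pure-noise predictors

for features in (X, X_noise):
    r2 = LinearRegression().fit(features, y).score(features, y)
    n_samples, p = features.shape
    adj_r2 = 1 - (1 - r2) * (n_samples - 1) / (n_samples - p - 1)
    # plain R² rises with the extra columns; adjusted R² does not reward them
    print(p, round(r2, 3), round(adj_r2, 3))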

7. Root Mean Squared Logarithmic Error (RMSLE)

Explanation

RMSLE is the root mean squared error computed on log-transformed values (via log(x + 1)), so relative differences matter more than absolute ones and large errors on large targets are damped.

Mathematical Formula

$RMSLE = √[(1/n) * ∑[(log(ŷ_i + 1) - log(y_i + 1))^2]]$

When to Use

  • Percentage errors matter more than absolute errors
  • Skewed data
  • Predicting exponential growth

Python Implementation

import numpy as np

def calculate_rmsle(y_true, y_pred):
    return np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true))**2))

Pros

  • Less sensitive to outliers
  • Handles exponential growth
  • Logarithmic scale benefits

Cons

  • Harder to interpret than RMSE
  • Only defined for values greater than -1 (in practice, non-negative targets)
  • Penalizes under-prediction more heavily than over-prediction

Fun Facts

  • Popular in Kaggle competitions
  • Useful for price and volume predictions
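
A small illustration of that asymmetry (made-up numbers), using the calculate_rmsle defined above: the same absolute miss costs more when it is an under-prediction.

import numpy as np

y_true = np.array([100.0, 100.0, 100.0])
over = y_true + 50   # over-predict by 50
under = y_true - 50  # under-predict by 50

print(calculate_rmsle(y_true, over))   # smaller penalty
print(calculate_rmsle(y_true, under))  # larger penalty for the same absolute miss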

8. Explained Variance Score

Explanation

Measures the proportion of variance explained by the model compared to total variance.

Mathematical Formula

$Explained Variance = 1 - (Var(y - ŷ) / Var(y))$

When to Use

  • Model performance assessment
  • Variance explanation
  • Prediction quality evaluation

Python Implementation

from sklearn.metrics import explained_variance_score

def calculate_explained_variance(y_true, y_pred):
    return explained_variance_score(y_true, y_pred)

Pros

  • Provides variance explanation
  • Sensitive to prediction errors
  • Normalized score

Cons

  • Similar to R-squared
  • Assumes linear relationships
  • Limited interpretability

Fun Facts

  • Used in signal processing
  • Indicates model's explanatory power
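
One practical difference from R², illustrated with made-up numbers: the explained variance score ignores a constant bias in the predictions, while R² penalizes it.

import numpy as np
from sklearn.metrics import explained_variance_score, r2_score

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = y_true + 10.0  # perfect shape, constant offset of +10

print(explained_variance_score(y_true, y_pred))  # 1.0: the offset is ignored
print(r2_score(y_true, y_pred))                  # strongly negative: the offset counts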

9. Mean Squared Logarithmic Error (MSLE)

Explanation

Mean Squared Logarithmic Error (MSLE) is a loss function that applies logarithmic scaling to reduce the impact of large errors while preserving relative differences.

Mathematical Formula

$MSLE = (1/n) * ∑[(log(y_i + 1) - log(ŷ_i + 1))^2]$

When to Use

  • When dealing with data with exponential growth
  • Useful for metrics where relative errors are more important than absolute errors
  • Recommended for scenarios with wide range of target values
  • Particularly effective for financial, economic, or scientific data with exponential characteristics

Python Implementation

import numpy as np

def mean_squared_log_error(y_true, y_pred):
    return np.mean(np.square(np.log1p(y_true) - np.log1p(y_pred)))

Pros

  • Reduces impact of large outliers
  • Handles wide range of scales effectively
  • Emphasizes relative prediction accuracy

Cons

  • Not suitable for negative predictions
  • Can be sensitive to small changes in log-scaled values
  • Less interpretable compared to MSE

Fun Facts

  • Equivalent to computing MSE on log1p-transformed targets and predictions
  • Often used in competitions like Kaggle for certain prediction tasks
  • Closely related to log transformation in statistical modeling
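
scikit-learn ships this metric as sklearn.metrics.mean_squared_log_error; here is a quick check (with illustrative values) that the hand-rolled version above agrees with it. The import is aliased so it does not shadow the function defined above.

import numpy as np
from sklearn.metrics import mean_squared_log_error as sk_msle

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

ours = np.mean(np.square(np.log1p(y_true) - np.log1p(y_pred)))
print(ours, sk_msle(y_true, y_pred))  # the two values should match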

10. Log-Cosh Loss

Explanation

Log-Cosh Loss is a smooth loss function that behaves like MSE for small errors and like MAE for large ones, combining differentiability with robustness to outliers.

Mathematical Formula

$Log-Cosh Loss = (1/n) * ∑[log(cosh(y_i - ŷ_i))]$

When to Use

  • When you want a smooth loss function
  • Suitable for regression problems with potential outliers
  • Provides a balance between MSE and MAE

Python Implementation

import numpy as np

def log_cosh_loss(y_true, y_pred):
    # note: np.cosh overflows for very large errors; a numerically stable
    # variant uses |x| + log1p(exp(-2|x|)) - log(2) in place of log(cosh(x))
    return np.mean(np.log(np.cosh(y_pred - y_true)))

Pros

  • Smooth and differentiable
  • Less sensitive to outliers compared to MSE
  • Computationally efficient

Cons

  • Can be less interpretable
  • Performance depends on specific dataset characteristics
  • Might not capture all error nuances

Fun Facts

  • Mathematically similar to Huber loss
  • Provides a good compromise between MAE and MSE
  • Commonly used in deep learning optimization
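
A quick numeric check of the two regimes (values chosen purely for illustration): for small errors log(cosh(x)) is close to x²/2, and for large errors it approaches |x| - log 2.

import numpy as np

for x in (0.1, 0.5, 5.0, 20.0):
    print(x, np.log(np.cosh(x)), 0.5 * x**2, abs(x) - np.log(2))
# small x: log(cosh(x)) tracks the quadratic term, like MSE
# large x: it tracks |x| - log(2), i.e. roughly linear like MAE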

11. Quantile Loss

Explanation

Quantile Loss allows prediction of specific quantiles of the target variable, providing more flexible regression modeling.

Mathematical Formula

$Quantile Loss = (1/n) * ∑[max(q * (y_i - ŷ_i), (q - 1) * (y_i - ŷ_i))]$

Where:

  • $q$ is the quantile

When to Use

  • Predicting specific percentiles of target variable
  • Asymmetric prediction scenarios
  • Risk assessment and financial modeling
  • Capturing uncertainty in predictions

Python Implementation

import numpy as np

def quantile_loss(y_true, y_pred, quantile=0.5):
    errors = y_true - y_pred
    return np.mean(np.maximum(quantile * errors, (quantile - 1) * errors))

Pros

  • Allows flexible probabilistic predictions
  • Can model different parts of the distribution
  • Useful for risk-aware modeling

Cons

  • More complex to interpret
  • Computationally more expensive
  • Requires careful quantile selection

Fun Facts

  • Used in financial risk modeling
  • Enables prediction of confidence intervals
  • Powerful technique in machine learning uncertainty estimation
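
To make the asymmetry concrete (illustrative numbers), using the quantile_loss defined above: at quantile 0.9, under-predictions cost far more than over-predictions of the same size, which pushes a model trained on this loss toward the 90th percentile.

import numpy as np

y_true = np.array([10.0, 10.0, 10.0])
under = y_true - 2  # predictions 2 below the truth
over = y_true + 2   # predictions 2 above the truth

print(quantile_loss(y_true, under, quantile=0.9))  # 0.9 * 2 = 1.8
print(quantile_loss(y_true, over, quantile=0.9))   # 0.1 * 2 = 0.2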

12. Relative Absolute Error (RAE)

Explanation

Relative Absolute Error (RAE) measures total absolute prediction error relative to that of a naive baseline that always predicts the mean of the actual values.

Mathematical Formula

$RAE = ∑|y_i - ŷ_i| / ∑|y_i - mean(y)|$

When to Use

  • Comparing different model performances
  • Normalizing error across different scales
  • Assessing relative prediction quality

Python Implementation

import numpy as np

def relative_absolute_error(y_true, y_pred):
    numerator = np.sum(np.abs(y_true - y_pred))
    denominator = np.sum(np.abs(y_true - np.mean(y_true)))
    return numerator / denominator

Pros

  • Scale-independent metric
  • Easy to interpret
  • Provides relative performance assessment

Cons

  • Sensitive to extreme values
  • Can be misleading with small datasets
  • Doesn't provide absolute error magnitude

Fun Facts

  • Part of the family of relative error metrics
  • Used in academic and research model evaluations
  • Helps compare models across different domains

13. Relative Squared Error (RSE)

Explanation

Relative Squared Error (RSE) compares the model's squared prediction errors to those of a naive baseline that always predicts the mean.

Mathematical Formula

$RSE = ∑[(y_i - ŷ_i)^2] / ∑[(y_i - mean(y))^2]$

When to Use

  • Comparing model performances
  • Normalizing error across different datasets
  • Providing relative error assessment

Python Implementation

import numpy as np

def relative_squared_error(y_true, y_pred):
    numerator = np.sum((y_true - y_pred)**2)
    denominator = np.sum((y_true - np.mean(y_true))**2)
    return numerator / denominator

Pros

  • Provides normalized error metric
  • Penalizes larger errors more
  • Useful for model comparison

Cons

  • Squares errors, amplifying outlier impact
  • Less interpretable than absolute metrics
  • Can be misleading with small datasets

Fun Facts

  • Equal to 1 - R², so minimizing RSE maximizes R-squared
  • Common in statistical modeling
  • Helps assess model improvement over baseline
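
A quick check of that relationship (illustrative values), using the relative_squared_error defined above:

import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

print(relative_squared_error(y_true, y_pred))  # RSE
print(1 - r2_score(y_true, y_pred))            # same value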

14. Symmetric Mean Absolute Percentage Error (SMAPE)

Explanation

SMAPE normalizes the absolute error by the average magnitude of the actual and predicted values, which keeps the percentage bounded (0-200%) and treats the two series more symmetrically than MAPE.

Mathematical Formula

$SMAPE = (1/n) * ∑[|y_i - ŷ_i| / ((|y_i| + |ŷ_i|) / 2)] * 100$

When to Use

  • Time series forecasting
  • Comparing models with different scales
  • Handling both positive and negative predictions

Python Implementation

import numpy as np

def symmetric_mean_absolute_percentage_error(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred) / ((np.abs(y_true) + np.abs(y_pred)) / 2)) * 100

Pros

  • Symmetric handling of errors
  • Percentage-based, easy to interpret
  • Works with both positive and negative values

Cons

  • Can be unstable with values close to zero
  • Sensitive to small absolute differences
  • Might not be suitable for all datasets

Fun Facts

  • Recommended by many forecasting competitions
  • More robust than traditional MAPE
  • Widely used in demand forecasting

15. Mean Bias Deviation (MBD)

Explanation

Mean Bias Deviation measures the average signed error of the predictions, indicating systematic over- or under-estimation.

Mathematical Formula

$MBD = (1/n) * ∑(ŷ_i - y_i)$

When to Use

  • Checking systematic model biases
  • Quality control in predictive models
  • Understanding model prediction tendencies

Python Implementation

import numpy as np

def mean_bias_deviation(y_true, y_pred):
    return np.mean(y_pred - y_true)

Pros

  • Simple to calculate
  • Directly shows model bias direction
  • Helps identify systematic errors

Cons

  • Positive and negative errors can cancel out
  • Less informative for complex models
  • Doesn't capture error magnitude

Fun Facts

  • Important in scientific and engineering modeling
  • Helps improve model calibration
  • Commonly used in climate and environmental prediction
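
A small illustration of the cancellation problem (made-up numbers), using the mean_bias_deviation defined above: large positive and negative errors can average out to an apparent bias of zero.

import numpy as np

y_true = np.array([10.0, 20.0, 30.0, 40.0])
y_pred = np.array([15.0, 15.0, 35.0, 35.0])  # errors of +5, -5, +5, -5

print(mean_bias_deviation(y_true, y_pred))  # 0.0, no apparent bias
print(np.mean(np.abs(y_pred - y_true)))     # 5.0, but MAE shows the real error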

16. Mean Directional Accuracy (MDA)

Explanation

Mean Directional Accuracy measures the model's ability to predict the correct direction of change.

Mathematical Formula

$MDA = (1/(n-1)) * ∑[sign(y_i - y_{i-1}) == sign(ŷ_i - ŷ_{i-1})]$

When to Use

  • Time series and sequential predictions
  • Financial market forecasting
  • Trend prediction models

Python Implementation

import numpy as np

def mean_directional_accuracy(y_true, y_pred):
    direction_true = np.diff(y_true)
    direction_pred = np.diff(y_pred)
    return np.mean(np.sign(direction_true) == np.sign(direction_pred))

Pros

  • Focuses on directional prediction
  • Useful for trend-based forecasting
  • Simple to understand and implement

Cons

  • Ignores magnitude of predictions
  • Less informative for non-sequential data
  • Can be misleading with noisy data

Fun Facts

  • Critical in financial and economic modeling
  • Used in technical analysis of stock markets
  • Complements other accuracy metrics
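
A tiny illustrative series (made-up values), using the mean_directional_accuracy defined above: the predictions get the magnitudes badly wrong but the direction right in two of the three steps.

import numpy as np

y_true = np.array([100.0, 102.0, 101.0, 105.0])  # up, down, up
y_pred = np.array([100.0, 110.0, 115.0, 140.0])  # up, up, up

print(mean_directional_accuracy(y_true, y_pred))  # 2 of 3 directions correct, about 0.67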

17. Huber Loss

Explanation

Huber loss combines MSE and MAE: it is quadratic for small errors and linear beyond a threshold δ, making it less sensitive to outliers than MSE.

Mathematical Formula

$Huber Loss = (1/n) * ∑[0.5 * (y_i - ŷ_i)^2 if |y_i - ŷ_i| <= δ else δ * (|y_i - ŷ_i| - 0.5 * δ)]$

Where:

  • $δ$ is a threshold parameter

When to Use

  • Robust regression
  • Outlier-prone datasets
  • Machine learning optimization

Python Implementation

import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    error = y_true - y_pred
    is_small_error = np.abs(error) <= delta
    squared_loss = 0.5 * error**2
    linear_loss = delta * np.abs(error) - 0.5 * delta**2
    return np.mean(np.where(is_small_error, squared_loss, linear_loss))

Pros

  • Robust to outliers
  • Combines quadratic and linear loss
  • Smooth transition

Cons

  • Requires tuning the δ hyperparameter
  • More complex implementation than MSE or MAE
  • Slightly more expensive to compute than MSE

Fun Facts

  • Named after Peter Huber
  • Used in robust statistics
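
A small illustrative comparison (made-up numbers), using the huber_loss defined above: with one large outlier, MSE is dominated by it while the Huber loss lets it count only linearly beyond δ.

import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 2.1, 2.9, 14.0])  # one 10-unit miss

print(mean_squared_error(y_true, y_pred))     # about 25, dominated by the outlier
print(huber_loss(y_true, y_pred, delta=1.0))  # about 2.4, the outlier counts linearly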
