DEV Community

Hussain Abdulafeez

Assumption of Homoscedasticity: A Guide to Verifying the Assumption of Constant Variance of Residuals

The assumption of homoscedasticity, also known as constant variance of residuals, is fundamental in regression analysis. It requires that the variance of the residuals remains constant across all levels of the independent variables. In simpler terms, the spread of the residuals should be uniform, with no systematic pattern across the range of predicted values.

Below are the steps I took to validate homoscedasticity:

1. Residual Plot
2. Plotting Standardized Residuals
3. Breusch-Pagan Test
4. White Test
5. Goldfeld-Quandt Test
6. Residual Plots with Additional Variables

Let me first walk you through how I prepared my data before arriving at the residual plot.

Importing the necessary libraries

A quick view of the data here:

Adding constant term for the intercept and fitting the model

Plotting the residuals

1. Residual Plot:
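Begin by plotting residuals against fitted values (predicted values) from the regression model. Inspect the plot for any discernible patterns or trends, checking in particular whether the spread of residuals stays constant across different levels of fitted values. A funnel or fan shape, where the spread widens or narrows as the fitted values increase, is the classic sign of heteroscedasticity.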

2. Plotting Standardized Residuals:
Compute standardized residuals by dividing each residual by an estimate of its standard deviation. This puts the residuals on a common scale, where values beyond roughly ±2 stand out. Plot standardized residuals against fitted values and observe whether the spread remains consistent across different levels of fitted values.

3. Breusch-Pagan Test:
Conduct the Breusch-Pagan test to formally assess homoscedasticity in the regression model. The test regresses the squared residuals on the predictors and checks whether the predictors explain any of their variation; the null hypothesis is that the variance of the residuals is constant. A significant p-value (< 0.05) indicates heteroscedasticity, suggesting a violation of the homoscedasticity assumption.

4. White Test:
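Alternatively, use the White test, which is more general than the Breusch-Pagan test: its auxiliary regression also includes the squares and cross-products of the predictors, so it can detect nonlinear forms of heteroscedasticity. It is a test of constant variance only and does not address correlated residuals, which need a separate check. A significant p-value indicates heteroscedasticity.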

5. Goldfeld-Quandt Test:
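If applicable, consider the Goldfeld-Quandt test, especially for datasets with a natural ordering such as time series. This test splits the observations, ordered by time or by a chosen predictor, into two subsets and compares their residual variances with an F-test. A non-significant result means the test found no difference in variance between the two subsets; it can still miss heteroscedasticity that does not follow the chosen ordering.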

6. Residual Plots with Additional Variables:
Plot residuals against individual predictor variables to identify specific predictors contributing to heteroscedasticity. Patterns in these plots may indicate heteroscedasticity related to certain predictors.

Interpretation of Diagnostic Tests:

Breusch-Pagan Test:

LM Statistic: This value (230.27) is the Lagrange Multiplier (LM) statistic calculated by the Breusch-Pagan test.
LM-Test p-value: The p-value (1.15e-48) associated with the LM statistic tests the null hypothesis that the variance of the residuals is constant (homoscedasticity).
Interpretation: With an extremely small p-value (much less than the conventional threshold of 0.05), we reject the null hypothesis of homoscedasticity. This indicates strong evidence of heteroscedasticity in the regression model.

White Test:

LM Statistic: This value (668.11) represents the LM statistic computed by the White test.
LM-Test p-value: The p-value (2.25e-134) associated with the White test assesses whether the residuals exhibit constant variance.
Interpretation: Similar to the Breusch-Pagan test, the very small p-value strongly rejects the null hypothesis of homoscedasticity. It provides further evidence of heteroscedasticity in the regression model.

Goldfeld-Quandt Test:

F Statistic: The F-statistic (1.002) calculated by the Goldfeld-Quandt test compares the variance of residuals between two subsets of data.
F-Test p-value: The p-value (0.476) associated with the F-statistic tests the null hypothesis that the variances of residuals are equal between the subsets.
Interpretation: With a non-significant p-value (greater than 0.05), we do not reject the null hypothesis. This suggests that there is no evidence of heteroscedasticity in the context of this specific test.

Summary:
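Breusch-Pagan and White tests both indicate heteroscedasticity due to their small p-values. Conversely, the Goldfeld-Quandt test does not reject homoscedasticity, given its non-significant p-value. These results are not necessarily contradictory: Goldfeld-Quandt only compares the residual variance of two subsets under one particular ordering of the data, so it can miss variance that changes in other ways. Taken together, the evidence points to heteroscedasticity in this model.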

Thank you for reviewing this validation process. If you have any questions or would like to discuss further, please feel free to do so in the comments.
