DEV Community

Jorge L, Morla


Predicting Country Population Using Linear Regression with Java and Apache Commons

Linear regression analysis is a statistical process used to predict the value of a variable based on the values of other variables. We refer to the variable we want to predict as the dependent variable and the variables used for the prediction as the independent variables.
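In this article there is a single independent variable (the year), so the model reduces to a straight line. As a sketch in standard least-squares notation (the symbols are assumed notation, not something defined in this post), the fitted line and its coefficients can be written as:

```latex
\hat{y} = \beta_0 + \beta_1 x, \qquad
\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
\beta_0 = \bar{y} - \beta_1 \bar{x}
```

Here $x_i$ is a year, $y_i$ is the population recorded for that year, $\bar{x}$ and $\bar{y}$ are their means, $\beta_1$ is the slope (population change per year) and $\beta_0$ is the intercept.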

For our demonstration, we are going to use Mexico's population history to train our model.

public class MexicoPopulationLineChart {

    // Each row is {year, population}
    private final double[][] mexicoPopulation = new double[][]{
            {1955, 31452141},
            {1960, 36268055},
            {1965, 42737991},
            {1970, 50289306},
            {1975, 58691882},
            {1980, 67705186},
            {1985, 74872006},
            {1990, 81720428},
            {1995, 89969572},
            {2000, 97873442},
            {2005, 105442402},
            {2010, 112532401},
            {2015, 120149897},
            {2020, 125998302},
    };
}


We visualize the data using a line chart, which gives this result.

[Image: line chart of Mexico's population, 1955 to 2020]

Now we will try to predict the population for the coming decades. For this we are going to use the SimpleRegression class from Apache Commons Math.

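// Imports assume Apache Commons Math 3.x
import java.text.NumberFormat;
import org.apache.commons.math3.stat.regression.SimpleRegression;

// Fit a least-squares line to the {year, population} rows
SimpleRegression regression = new SimpleRegression();
regression.addData(mexicoPopulation);

// Print whole numbers with grouping separators
NumberFormat formatter = NumberFormat.getNumberInstance();
formatter.setMaximumFractionDigits(0);

// Print the model's value for every fifth year from 1955 through 2050
for (var year = 1955; year <= 2050; year += 5) {
    System.out.println(year + " - " + formatter.format(regression.predict(year)));
}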

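This gives us the following output. The rows for 1955 through 2020 are the model's fitted values for the years in the training data; the rows from 2025 onward are the predictions.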

1955 - 29,184,914
1960 - 36,735,619
1965 - 44,286,325
1970 - 51,837,031
1975 - 59,387,737
1980 - 66,938,442
1985 - 74,489,148
1990 - 82,039,854
1995 - 89,590,559
2000 - 97,141,265
2005 - 104,691,971
2010 - 112,242,676
2015 - 119,793,382
2020 - 127,344,088
2025 - 134,894,794
2030 - 142,445,499
2035 - 149,996,205
2040 - 157,546,911
2045 - 165,097,616
2050 - 172,648,322
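
SimpleRegression estimates an ordinary least-squares line with one independent variable. To make that less of a black box, here is a minimal sketch that reproduces the same kind of fit with the textbook closed-form formulas and only the standard library. The names `LeastSquaresSketch`, `fit` and `predict` are hypothetical, chosen for illustration; the data rows are the ones from this article.

```java
public class LeastSquaresSketch {

    // Same {year, population} rows as in the article
    static final double[][] MEXICO_POPULATION = {
            {1955, 31452141}, {1960, 36268055}, {1965, 42737991},
            {1970, 50289306}, {1975, 58691882}, {1980, 67705186},
            {1985, 74872006}, {1990, 81720428}, {1995, 89969572},
            {2000, 97873442}, {2005, 105442402}, {2010, 112532401},
            {2015, 120149897}, {2020, 125998302},
    };

    /** Returns {intercept, slope} of the least-squares line through the rows. */
    static double[] fit(double[][] rows) {
        // Means of x (year) and y (population)
        double meanX = 0, meanY = 0;
        for (double[] row : rows) {
            meanX += row[0];
            meanY += row[1];
        }
        meanX /= rows.length;
        meanY /= rows.length;

        // slope = sum((x - meanX) * (y - meanY)) / sum((x - meanX)^2)
        double sxy = 0, sxx = 0;
        for (double[] row : rows) {
            double dx = row[0] - meanX;
            sxy += dx * (row[1] - meanY);
            sxx += dx * dx;
        }
        double slope = sxy / sxx;

        // The fitted line passes through the point (meanX, meanY)
        double intercept = meanY - slope * meanX;
        return new double[]{intercept, slope};
    }

    /** Evaluates the fitted line at the given year. */
    static double predict(double[] line, double year) {
        return line[0] + line[1] * year;
    }

    public static void main(String[] args) {
        double[] line = fit(MEXICO_POPULATION);
        System.out.println("2025 - " + Math.round(predict(line, 2025)));
        System.out.println("2050 - " + Math.round(predict(line, 2050)));
    }
}
```

Rounded to whole people, the two printed values agree with the 2025 and 2050 rows of the output above. The slope works out to roughly 1.51 million people per year, which is why every five-year step in the output grows by the same amount of about 7.55 million: a straight line cannot capture the slowdown visible in the most recent historical figures.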

Finally, let's incorporate the predictions into our chart so that we can visually represent the data.

[Image: line chart of Mexico's population with the predicted values through 2050]
