Vladimir Semenov

Posted on • Originally published at Medium on

Analyse yourself or how could Python help to achieve your goal?

Because I have a technical mind, I believe that “everything around us is numbers” © Numb3rs.

In this case, I decided to analyse my marathon preparation and try to estimate my finish time for the full marathon before the race happens, using the data I track in Strava.

Strava Runner Profile | Vladimir S.

To achieve that, I decided to write a Python script to calculate the estimate. Python is a programming language that is well suited to mathematical calculations and is easy to learn.

Linear prediction

First of all, I used basic logic: a very rough estimate can be calculated with just a linear function. No sooner said than done.

The logic is pretty simple. Let’s assume that I hold a constant pace for the whole race, the same as in one of my previous runs. In this case, the formula is simply the pace multiplied by the desired distance.

Predicted time = Desired distance * Pace
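A minimal sketch of this calculation, assuming the pace is given as a "MM:SS" string per kilometre. The function names and the 06:00 example pace are hypothetical and may differ from the actual script.

```python
MARATHON_KM = 42.195  # standard marathon distance in kilometres


def pace_to_seconds(pace: str) -> int:
    """Convert a pace string such as '05:30' (min:sec per km) to seconds per km."""
    minutes, seconds = pace.split(":")
    return int(minutes) * 60 + int(seconds)


def linear_prediction(distance_km: float, pace: str) -> float:
    """Predicted time in seconds, assuming a constant pace over the whole distance."""
    return distance_km * pace_to_seconds(pace)


# Hypothetical example: a 06:00 /km pace held for the whole marathon.
print(linear_prediction(MARATHON_KM, "06:00"))
```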

Exploring my running statistics, I found that Strava calculates not only a pace but also a GAP, which is Grade Adjusted Pace. Taking this into account, the linear formula with Pace and GAP gives us a rough estimate of the fastest and probably the slowest time, assuming that the actual race is flatter than my usual runs and a bit faster paced as well.
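As a rough sketch, the two boundary values can be produced by running the same linear formula once with the Pace and once with the GAP. The helper names and the sample paces below are hypothetical, not values from my Strava data.

```python
from typing import Tuple

MARATHON_KM = 42.195  # standard marathon distance in kilometres


def pace_to_seconds(pace: str) -> int:
    """Convert 'MM:SS' per km to seconds per km."""
    minutes, seconds = pace.split(":")
    return int(minutes) * 60 + int(seconds)


def linear_bounds(pace: str, gap: str,
                  distance_km: float = MARATHON_KM) -> Tuple[float, float]:
    """Return (fastest, slowest) linear estimates in seconds from Pace and GAP."""
    estimates = [distance_km * pace_to_seconds(p) for p in (pace, gap)]
    return min(estimates), max(estimates)


# Hypothetical run: 06:10 /km actual pace, 05:50 /km grade adjusted pace.
fastest, slowest = linear_bounds("06:10", "05:50")
print(fastest, slowest)
```

Taking the minimum and maximum avoids assuming which of the two is faster: on a downhill run the GAP can be slower than the actual pace.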

Well, using a linear formula gives us boundary values for the estimation. Not bad, but still not good enough. When I tried it with some of my first preparation runs, the two values were far apart: between 3.5 and 4.5 hours.

I expected more precise values, so I started to explore other possible formulas for time prediction. After some time, I found a better one: the Pete Riegel formula.

Pete Riegel formula prediction

In a 1977 article for Runner’s World Magazine, Riegel proposed a simple formula for comparing relative performances at different distances. The formula is most commonly quoted as:

Predicted time = T1 * (D2 / D1)^C

  • T1 is the time achieved for D1
  • D1 is the distance over which the initial time is achieved
  • D2 is the distance for which the time is to be predicted
  • C is the pace degradation coefficient, from 1.06 to 1.10
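The formula translates directly into a few lines of Python. This is only a sketch: the function name and the sample 10 km run are hypothetical, and times are kept in seconds for convenience.

```python
MARATHON_KM = 42.195  # standard marathon distance in kilometres


def riegel_prediction(t1_seconds: float, d1_km: float, d2_km: float,
                      c: float = 1.06) -> float:
    """Riegel estimate: T1 * (D2 / D1) ** C, with times in seconds."""
    if d1_km <= 0 or d2_km <= 0:
        raise ValueError("distances must be positive")
    return t1_seconds * (d2_km / d1_km) ** c


# Hypothetical run: 10 km in 50 minutes, evaluated at both ends of the range for C.
t1 = 50 * 60
optimistic = riegel_prediction(t1, 10.0, MARATHON_KM, c=1.06)
pessimistic = riegel_prediction(t1, 10.0, MARATHON_KM, c=1.10)
print(optimistic, pessimistic)
```

With C equal to 1 the formula collapses to the linear prediction, so the coefficient is what models slowing down over a longer distance.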

This formula gives more precise values for the estimated time. However, it still produces two boundary values: a degradation coefficient of 1.06 for the fastest time and 1.10 for the slowest one.

Exploring my running statistics again, I found that Strava also provides elevation information. I assumed that the elevation of the Rotorua Marathon course might help me to calculate a more precise pace degradation coefficient for the race.

To achieve that, I wrote code to calculate a grade from elevation and distance, and code to calculate the coefficient from that grade. I assumed that a 0% grade represents the lowest value of the coefficient and a 3% grade the highest one.

As a result, I got a coefficient of around 1.077, which represents a low-to-medium difficulty for the Rotorua Marathon race.
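A sketch of how such a mapping could look. Linear interpolation between the two ends is my assumption for this example, since only the 0% and 3% end points are fixed above, and the elevation figure used here is hypothetical rather than the real Rotorua course data.

```python
MIN_COEFFICIENT = 1.06    # assumed for a 0% grade
MAX_COEFFICIENT = 1.10    # assumed for a 3% grade and anything steeper
MAX_GRADE_PERCENT = 3.0


def grade_percent(elevation_gain_m: float, distance_km: float) -> float:
    """Average grade in percent: metres climbed per metre run, times 100."""
    return elevation_gain_m / (distance_km * 1000) * 100


def coefficient_for_grade(grade: float) -> float:
    """Linearly interpolate the degradation coefficient (an assumption)."""
    clamped = min(max(grade, 0.0), MAX_GRADE_PERCENT)
    span = MAX_COEFFICIENT - MIN_COEFFICIENT
    return MIN_COEFFICIENT + span * clamped / MAX_GRADE_PERCENT


# Hypothetical course: 500 m of elevation gain over a full marathon.
grade = grade_percent(500, 42.195)
print(round(grade, 3), round(coefficient_for_grade(grade), 3))
```

Clamping keeps the coefficient inside the 1.06 to 1.10 range even for a net-downhill or very steep course.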

In a nutshell, with a combination of the Pace, GAP and degradation coefficients, I now have estimates with different confidence levels. I created a simple web page (using Google Charts) with a chart that visualises the script's estimation results. The chart is available on the interactive page linked below.
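To give an idea of what goes into the chart for each run, here is a rough sketch that collects all the estimates for one run and formats them as HH:MM:SS. The labels, function names and the sample run are hypothetical and may not match the actual script.

```python
from typing import Dict

MARATHON_KM = 42.195  # standard marathon distance in kilometres


def format_hms(total_seconds: float) -> str:
    """Format a duration in seconds as HH:MM:SS."""
    hours, remainder = divmod(int(round(total_seconds)), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def estimates_for_run(distance_km: float, time_s: float, gap_s_per_km: float,
                      race_coefficient: float) -> Dict[str, float]:
    """All marathon estimates (in seconds) derived from a single run."""
    ratio = MARATHON_KM / distance_km
    return {
        "Linear Pace": time_s * ratio,  # same as pace * marathon distance
        "Linear GAP": MARATHON_KM * gap_s_per_km,
        "Riegel 1.06": time_s * ratio ** 1.06,
        "Riegel race": time_s * ratio ** race_coefficient,
        "Riegel 1.10": time_s * ratio ** 1.10,
    }


# Hypothetical run: 10 km in 55 minutes with a GAP of 05:20 /km (320 s).
for label, seconds in estimates_for_run(10.0, 55 * 60, 320, 1.077).items():
    print(f"{label:12s} {format_hms(seconds)}")

print(format_hms(14008))  # → 03:53:28
```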


Interactive web page (https://thesun2003.github.io/marathon-prediction/)

Well, if you check the chart, you can see a trend towards faster times. I used data from my first 25 preparation runs.

Let us take a closer look. In the beginning, the fastest predicted time is the Linear GAP time of 03:53:28, which is sub 4 hours, yay! However, all the other predictions are over 4 hours, and the slowest, the Riegel prediction with the highest coefficient of 1.10, is 04:49:24, ooh. The two main lines, I believe, show a time between 04:19:11 and 04:32:47. This is still satisfactory but far from what I expect from my actual marathon race.

The 25th run shows a faster time, from 03:33:47 to 03:49:27, which is a great prediction for me. However, this prediction could be more or less accurate simply because that run was on a treadmill. The run was almost flat and fast-paced.

In the meantime, if you look closely at the 16th run, you can see that the fastest time is 03:22:32 and the slowest is 04:04:40. That was a 40-minute morning run on the street at a really fast pace of 04:56.

All in all, I believe I managed to find some fun in the marathon preparation as well as create a helpful tool to research my performance data. Moreover, I showed that it might be interesting to treat yourself as a source of data for analysis.

At the end of this retrospective session, I concluded that my pace is trending in the right direction, and I expect to achieve my second goal: running a sub-4-hour marathon.

You can find the source code on my GitHub at the link below.

thesun2003/marathon-prediction

The next retrospective session is scheduled to be at the end of the project, which means after I run a marathon. I believe it will be interesting and fun. See you then!

If this article was helpful or interesting, please hit the clap button and feel free to share it. I’ll be sure to deliver more articles in the weeks to come.
