DEV Community

Kyle Pena
Kyle Pena

Posted on • Edited on

Derivation of Welford's Algorithm

This will be somewhat out of context if you're coming here first. It's really a footnote in a much longer series of blog posts on summarizing data distributions in computing environments where storage is at a premium.

In that blog post, I wanted to explain how to derive Welford's Algorithm for the recurrence relation for the second central moment, and found the explanations I could find a little lacking (at least for me).

I found this helpful post to be a great starting point, but the algebra part of the post skipped over so many steps that I couldn't follow it.

So I worked it out in greater detail. I'm hoping this is helpful to other mere mortals who, like myself, couldn't quite connect the dots.

Notation:

  • y represents a new observation.
  • μ' is the updated mean (the mean after incorporating y).
  • M2M_2 is the second central moment - well, not quite. Technically you'd have to divide by N to get the central moment. But we'll call it the central moment here.
  • N is the number of observations including y.

First, some work on the mean update formula:

μ=μ+yμN\mu^\prime = \mu + \frac{y - \mu}{N}
Nμ=Nμ+yμN\mu^\prime = N\mu + y - \mu
(N1)μ+μ=(N1)μ+y(N-1)\mu^\prime + \mu^\prime = (N-1)\mu + y
(N1)μ(N1)μ=yμ(N-1)\mu^\prime - (N-1)\mu = y - \mu^\prime
(N1)μ(N1)μ=μy(N-1)\mu - (N-1)\mu^\prime = \mu^\prime - y

Now, the main derivation:

M2M2=1N(yiμ)21N1(yiμ)2M_2^\prime - M_2 = \sum_1^N{(y_i - \mu^\prime)^2} - \sum_1^{N-1}{(y_i - \mu)^2}
=(yμ)2+1N1(yiμ)2(yiμ)2= (y - \mu^\prime)^2 + \sum_1^{N-1}{(y_i - \mu^\prime)^2 - (y_i - \mu)^2}
=(yμ)2+1N1(yi22yiμ+μ2)(yi22yiμ+μ2)= (y - \mu^\prime)^2 + \sum_1^{N-1}{(y_i^2 - 2y_i\mu^\prime +\mu^{\prime 2}) - (y_i^2 - 2y_i\mu + \mu^2)}
=(yμ)2+1N12yiμ+2yiμ+(μ2μ2))= (y - \mu^\prime)^2 + \sum_1^{N-1}{-2y_i\mu^\prime + 2y_i\mu + (\mu^{\prime 2} - \mu^2))}
=(yμ)2+1N12yi(μμ)+(μμ)(μ+μ)= (y - \mu^\prime)^2 + \sum_1^{N-1}{-2y_i(\mu^\prime - \mu) + (\mu^\prime - \mu)(\mu^\prime + \mu)}
=(yμ)2+1N1(μμ)(2yi+μ+μ)= (y - \mu^\prime)^2 + \sum_1^{N-1}{(\mu^\prime - \mu)(-2y_i + \mu^\prime + \mu)}
=(yμ)2+(μμ)1N1(2yiμμ)= (y - \mu^\prime)^2 + (\mu - \mu^\prime) \sum_1^{N-1}{(2y_i - \mu^\prime - \mu)}
=(yμ)2+(μμ)[2(N1)μ(N1)μ(N1)μ]= (y - \mu^\prime)^2 + (\mu - \mu^\prime) [ 2(N-1)\mu - (N-1)\mu^\prime - (N-1)\mu ]
=(yμ)2+(μμ)[(N1)μ(N1)μ]= (y - \mu^\prime)^2 + (\mu - \mu^\prime) [ (N-1)\mu - (N-1)\mu^\prime ]

Substitute the mean update term we worked out earlier.

=(yμ)2+(μμ)(μy)= (y - \mu^\prime)^2 + (\mu - \mu^\prime) (\mu^\prime - y)
=(yμ)2+(μμ)(yμ)= (y - \mu^\prime)^2 + (\mu^\prime - \mu) (y - \mu^\prime)
=(yμ)(yμ+μμ)= (y - \mu^\prime)(y - \mu^\prime + \mu^\prime - \mu)
=(yμ)(yμ)= (y - \mu^\prime)(y - \mu)

This equals the difference M2M2M_2^\prime - M_2 , and so the complete recurrence relation for the second central moments is:

M2=M2+(yμ)(yμ)M_2^\prime = M_2 + (y - \mu^\prime)(y - \mu)

And thus the recurrence relation for the corrected sample variance based on the second central moment is:

σ2=1N1[M2+(yμ)(yμ)]\sigma^{\prime 2} = \frac{1}{N-1} [ M_2 + (y - \mu^\prime)(y - \mu) ]

AWS Security LIVE!

Tune in for AWS Security LIVE!

Join AWS Security LIVE! for expert insights and actionable tips to protect your organization and keep security teams prepared.

Learn More

Top comments (0)

Image of Docusign

🛠️ Bring your solution into Docusign. Reach over 1.6M customers.

Docusign is now extensible. Overcome challenges with disconnected products and inaccessible data by bringing your solutions into Docusign and publishing to 1.6M customers in the App Center.

Learn more