DEV Community

Fabio Manganiello

Correlation without tears

Assessing the degree of correlation between two numeric series is a notoriously challenging task in every scientific discipline, as well as a crucial part of all scientific research (ever wondered whether there's a correlation between the usage of Internet Explorer and the murder rate?).

Different coefficients have been proposed over the years for different series with different properties (most notably, Pearson's, Spearman's and Kendall's coefficients), and the quest for a "universal" correlation coefficient has often been unproductive.

It took me a while to digest the math behind the new paper from Sourav Chatterjee (https://arxiv.org/abs/1909.10140), but once I implemented the proposed coefficient in Python I was surprised by how well it performed on arbitrary numeric series (not necessarily monotonic) compared to most of the coefficients out there. It's also very easy to calculate compared to other coefficients.
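
To give an idea of how little code is involved, here is a minimal sketch of the coefficient using NumPy. It is not necessarily the same code as in my Gist: the function name `xi_correlation` is just an illustrative choice, and it uses the tie-aware form of the formula from the paper, with the assumption that ties in x are broken by input order rather than at random as the paper prescribes. Also keep in mind that the coefficient is not symmetric: it measures how much y is a function of x, not the other way around.

```python
import numpy as np


def xi_correlation(x, y):
    """Sketch of Chatterjee's xi coefficient of y as a function of x.

    Assumption: ties in x are broken by input order (stable sort),
    whereas the paper breaks them uniformly at random.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    # Rearrange the y values so that the corresponding x values are increasing
    y_by_x = y[np.argsort(x, kind="stable")]

    y_asc = np.sort(y)
    # r[i] = number of samples j with y[j] <= y_by_x[i]
    r = np.searchsorted(y_asc, y_by_x, side="right")
    # l[i] = number of samples j with y[j] >= y_by_x[i]
    l = n - np.searchsorted(y_asc, y_by_x, side="left")

    denominator = 2.0 * np.sum(l * (n - l))
    if denominator == 0:
        raise ValueError("y is constant, the coefficient is undefined")

    # With no ties in y this reduces to 1 - 3 * sum|r[i+1] - r[i]| / (n^2 - 1)
    return 1.0 - n * np.sum(np.abs(np.diff(r))) / denominator


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = np.linspace(-1, 1, 1001)

    # A non-monotonic relationship (a parabola) with increasing noise
    for sigma in (0.0, 0.1, 0.5, 5.0):
        y = x ** 2 + rng.normal(scale=sigma, size=x.size)
        pearson = np.corrcoef(x, y)[0, 1]
        print(f"sigma={sigma}: xi={xi_correlation(x, y):.3f}, pearson={pearson:.3f}")
```

On the noise-free parabola Pearson's coefficient is close to zero, because the relationship is not linear, while the new coefficient is close to 1. As the noise grows, the new coefficient drops towards zero.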

So I've put together a Gist notebook that shows how the new coefficient works on some simple data with increasing levels of noise.

Feel free to reuse the code if you need it!
