DEV Community

Cover image for Data Science, the Good, the Bad, and the... Future
Aaron Harris for Kite

Posted on • Updated on • Originally published at kite.com

Data Science, the Good, the Bad, and the... Future

Positives: astrophysics, biology, and sports

Part of a new article by Kirit Thadaka on the Kite Blog

Data science made a huge positive impact on the way technology influences our lives. Some of these impacts have been nice and some have been otherwise. looks at Facebook But, technology can’t inherently be good or bad, technology is… technology. It’s the way we use it that has good or bad outcomes.

We recently had a breakthrough in astrophysics with the first ever picture of a black hole. This helps physicists confirm more than a century of purely theoretical work around black holes and the theory of relativity.

To capture this image, scientists used a telescope as big as the earth (Event Horizon Telescope or EHT) by combining data from an array of eight ground-based radio telescopes and making sense of it all to construct an image. Analyzing data and then visualizing that data – sounds like some data science right here.

A cool side note on this point: a standard Python library of functions for EHT Imaging was developed by Andrew Chael from Harvard to simulate and manipulate VLBI (Very-long-baseline interferometry) data helping the process of creating the black hole image.

Olivier Elemento at Cornell uses Big Data Analytics to help identity mutations in genomes that result in tumor cells spreading so that they can be killed earlier – this is a huge positive impact data science has on human life. You can read more about his incredible research here.

Python is used by researchers in his lab while testing statistical and machine learning models. Keras, NumPy, Scipy, and Scikit-learn are some top notch Python libraries for this.

If you’re a fan of the English Premier League, you’ll appreciate the example of Leicester City winning the title in the 2015-2016 season.

At the start of the season, bookmakers had the likelihood Leicester City winning the EPL at 10 times less than the odds of finding the Loch Ness monster. For a more detailed attempt at describing the significance of this story, read this.

Everyone wanted to know how Leicester was able to do this, and it turns out that data science played a big part! Thanks to their investment into analytics and technology, the club was able to measure players’ fitness levels and body condition while they were training to help prevent injuries, all while assessing best tactics to use in a game based on the players’ energy levels.

All training sessions had plans backed by real data about the players, and as a result Leicester City suffered the least amount of player injuries of all clubs that season.

Many top teams use data analytics to help with player performance, scouting talent, and understanding how to plan for certain opponents.

Here’s an example of Python being used to help with some football analysis. I certainly wish Chelsea F.C. would use some of these techniques to improve their woeful form and make my life as a fan better. You don’t need analytics to see that Kante is in the wrong position, and Jorginho shouldn’t be in that team and… Okay I’m digressing – back to the topic now!

Now that we’ve covered some of the amazing things data science has uncovered, I’m going to touch on some of the negatives as well – it’s important to critically think about technology and how it impacts us.

The amount that technology impacts our lives will undeniably increase with time, and we shouldn’t limit our understanding without being aware of the positive and negative implications it can have.

Some of the concerns I have around this ecosystem are data privacy (I’m sure we all have many examples that come to mind), biases in predictions and classifications, and the impact of personalization and advertising on society.

Negatives: gender bias and more

This paper published in NIPS talks about how to counter gender biases in word embeddings used frequently in data science.

For those who aren’t familiar with the term, word embeddings are a clever way of representing words so that neural networks and other computer algorithms can process them.

The data used to create Word2Vec (a model for word embeddings created by Google) has resulted in gender biases that show close relations between “men” and words like “computer scientist”, “architect”, “captain”, etc. while showing “women” to be closely related to “homemaker”, “nanny”, “nurse”, etc.

Here’s the Python code used by the researchers who published this paper. Python’s ease of use makes it a good choice for quickly going from idea to implementation...

Check out the code and examples, and rest of the negatives on our blog!

Top comments (0)