DEV Community

Discussion on: Exploratory Data Analysis; Much Time & Effort?

Helen Anderson

At my workplace, we have many, many data sources. Some are cleaned up nicely with metadata, others are kind of cleaned up but with no way of knowing what means what, and others are completely raw and granular and need a lot of work.

I take one day a week to do a deep dive into a dataset I'm not familiar with and get to know it really well. That way I know what I'm in for the next time I'm asked for something in relation to that table, bucket or file.

I've heard of teams doing EDA Saturdays once a month, where each person digs into a dataset and then does a show and tell to share the knowledge.

It takes time, but it is so worth it. If you don't know what the data is doing, there is no way you can gain any insight from it.

Matt Curcio

Interesting! I can see where a separate day (hopefully not my weekend ;)) would be useful.

But what statistical tests do you find yourself drawn to? Correlations, normality, or even boxplots?
Thanks,

Matt Curcio

Hi Helen,
I am curious. What other Data Science blogs or sites do you follow?

Helen Anderson

Hey Matt

I'm not a Data Scientist, so I don't follow anything specific to that area.

My favourite blogs at the moment are:

Data36 - data36.com/ for tutorials and hands-on learning.

Simple Analytical - simpleanalytical.com/ for commentary on being in the data world and the ups and downs of being an analyst.

Mode - mode.com/blog/ - content for analysts by analysts.

Soft Skills Engineering - softskills.audio/ - my favourite podcast advice show about non-technical topics.

Matt Curcio

Great, thank you very much.

I noticed that your profile says B.I. How different do you consider B.I. to be from D.S.?

Helen Anderson • Edited

In my workplace, BI works on reports and models that show what has happened in the past, plus a little bit of forecasting.

The Data Scientists work on AI/ML/NLP, sometimes using our models and sometimes using granular raw data.