Discussion


In my opinion:

  • Building strong opinions about certain models
  • Not exploring data enough
  • Forcing the data to "cry", i.e. forcibly obtaining the results you want from the data
  • Relying too much on some libraries
  • Thinking that every problem is a "data-science" problem
 

Forcing the data to "cry", i.e. forcibly obtaining the results you want from the data

This one is so tough. I get it all the time: "Well, can't you look again? Try this very forceful twist on it." It can be a tough fight.

If there's no correlation, don't make one, and don't let leadership pressure you into believing there is one.

 

"If you torture the data long enough it will confess"
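To make the "confession" concrete, here is a minimal sketch of what repeated re-slicing does. It assumes purely random data, so there is no real relationship to find; `best_spurious_correlation` and its parameters are made-up names for illustration only. The more random features you try against the same target, the stronger the best correlation you can "find" by chance.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def best_spurious_correlation(n_rows=30, n_features=200, seed=0):
    """Try n_features random columns against a random target and
    return the strongest correlation found. Everything here is noise."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_rows)]
        r = pearson(feature, target)
        if abs(r) > abs(best):
            best = r
    return best


single = best_spurious_correlation(n_features=1)
many = best_spurious_correlation(n_features=200)
print(f"1 feature tried:    r = {single:+.2f}")
print(f"200 features tried: best r = {many:+.2f}")
```

Both runs share the same seed, so the 200-feature run includes the single feature from the first run plus 199 more attempts. Its best result can only look as strong or stronger, even though nothing in the data is real.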

 

Not exploring data enough

Getting familiar with the data takes time. Lean on the business experts as much as possible. Cut datasets down to a size that the tools those experts already know can digest.
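One way to get a dataset down to spreadsheet size is a uniform random sample. Below is a rough sketch using only the standard library; `sample_csv` and the row count `k=500` are assumptions for illustration, not a recommendation. It uses reservoir sampling, so the full file never has to fit in memory.

```python
import csv
import io
import random


def sample_csv(src, dst, k, seed=0):
    """Copy the header plus a uniform random sample of up to k data rows
    from the open text file src to dst. Returns the number of rows kept."""
    rng = random.Random(seed)
    reader = csv.reader(src)
    header = next(reader)
    reservoir = []
    for i, row in enumerate(reader):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Replace an existing row with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = row
    writer = csv.writer(dst)
    writer.writerow(header)
    writer.writerows(reservoir)
    return len(reservoir)


# Demo on an in-memory "file" of 10,000 rows; real use would pass open() handles.
src = io.StringIO("id,value\n" + "".join(f"{i},{i * i}\n" for i in range(10_000)))
dst = io.StringIO()
kept = sample_csv(src, dst, k=500)
print(kept, "rows kept")
```

A random sample is only one option. For some questions a filtered slice (one region, one month) will be easier for a domain expert to sanity-check than random rows.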

 

Not using git 😭

Almost every day I see files like my-amazing-report-final1a-b-24.xlsx.

If you are not using git, at least do yourself a favor and set up some versioning rules to follow every day. Always increment the version, and never use words like "final" or "new" (they will be outdated as fast as you write them).

I highly recommend git, but I understand that it's not for everyone. Just make it obvious to everyone which version is the latest.
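If git is off the table, the "always increment" rule can at least be automated. Here is a small sketch; the `stem-vNNN.ext` naming scheme and the `next_version` helper are assumptions made up for this example, so adapt them to whatever convention your team agrees on.

```python
import re

# Assumed naming convention: <stem>-v<number><extension>, e.g. sales-report-v002.xlsx
VERSION_RE = re.compile(r"^(?P<stem>.+)-v(?P<num>\d+)(?P<ext>\.[^.]+)$")


def next_version(existing, stem, ext):
    """Return the next file name in the series stem-vNNN.ext.

    Names that do not follow the convention (e.g. "...-final.xlsx") are ignored.
    """
    highest = 0
    for name in existing:
        m = VERSION_RE.match(name)
        if m and m["stem"] == stem and m["ext"] == ext:
            highest = max(highest, int(m["num"]))
    # Zero-pad so that alphabetical order matches version order.
    return f"{stem}-v{highest + 1:03d}{ext}"


files = ["sales-report-v001.xlsx", "sales-report-v002.xlsx", "sales-report-final.xlsx"]
print(next_version(files, "sales-report", ".xlsx"))  # sales-report-v003.xlsx
```

The zero-padding is the part that matters: sort the folder by name and the latest version is always at the bottom.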

 

Spending too much time working on a feature that was not needed.

In other words, not understanding the business.

 

I would say this is more of a general developer thing than something specific to data science.
But other than that, I completely agree!