WanjohiChristopher

Data Science Workflow

DATA SCIENCE WORKFLOW: CRISP-DM (Cross-Industry Standard Process for Data Mining)

We use the CRISP-DM methodology, as shown below:
[Figure: CRISP-DM workflow diagram]
We will go through the whole workflow in detail.
Let's get started.
"AI is the new electricity." ~ Andrew Ng

1. Business Understanding

This stage requires domain expertise: as a data scientist, you are expected to understand the
problem at hand thoroughly.
At this point a data scientist should not work alone. He or she needs to interact with different
teams in the company so that decisions are made in the right order.
The objectives the project is meant to achieve are set out here.
The type of problem, i.e. the kind of machine learning task, is also determined here.

2. Data Understanding

In this stage, a data scientist needs to understand the data well by checking the variable
descriptions and working out what they mean with regard to the data provided and the problem
definition.
The following steps help in understanding the data better.
We will take examples using the Python programming language.

- Importing the required libraries to work on the data, e.g. import pandas as pd
- Reading the data using pandas. It may come in different formats, e.g. .csv, .xlsx, .txt, .tab,
  and is read with e.g. pd.read_csv() or pd.read_excel()
- Checking the number of rows and columns, missing values, duplicates, the number of unique
  values (nunique), information about the data, data types, and a preview of the data.
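
The steps above can be sketched in a few lines of pandas. This is only a minimal illustration,
not a fixed recipe: the sample data, its column names, and the summarize() helper are all made up
for this example, and the CSV is read from an in-memory string so the snippet runs without a file
on disk. With a real project you would pass a file path to pd.read_csv() (or use pd.read_excel()
for .xlsx files) instead.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a real file such as "data.csv".
CSV_TEXT = """customer_id,age,city,spend
1,34,Nairobi,120.5
2,,Mombasa,80.0
3,45,Nairobi,95.0
3,45,Nairobi,95.0
4,29,Kisumu,
"""


def summarize(df):
    """Collect the basic data-understanding checks into one dictionary."""
    return {
        "n_rows": df.shape[0],                              # number of rows
        "n_columns": df.shape[1],                           # number of columns
        "missing_per_column": df.isnull().sum().to_dict(),  # missing values
        "n_duplicate_rows": int(df.duplicated().sum()),     # fully duplicated rows
        "unique_per_column": df.nunique().to_dict(),        # unique values (NaN excluded)
        "dtypes": df.dtypes.astype(str).to_dict(),          # data type of each column
    }


# pd.read_csv accepts a file path or a file-like object.
df = pd.read_csv(io.StringIO(CSV_TEXT))
summary = summarize(df)

print(df.head())  # preview of the first rows
df.info()         # column names, non-null counts and dtypes
print(summary)
```

Collecting the checks in one dictionary is just a convenience for this sketch. In a notebook it
is equally common to call df.shape, df.isnull().sum(), df.duplicated().sum(), df.nunique() and
df.dtypes one at a time and read the output directly.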