DEV Community

Raman Bansal

Get Organised as a Data Scientist: Tips & Strategies

INTRODUCTION

Beginner and intermediate data scientists often get lost in their notebooks, hunting for the cells and results of analyses they have already done. As a result, they waste much of their time figuring out their own code and trying to remember their results.

The following steps can help you solve these problems and save a lot of time.

1. Create two folders for a project

For each project, create two folders: one for raw data and one for processed data. The raw data folder contains the original files given to you. The processed data folder contains the files you produced by preprocessing the raw data with a pipeline or any other method.

This makes the data easier to work with, and you don't need to preprocess it again and again in your notebooks.
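
The two-folder layout can be sketched in a few lines of Python. This is only an illustration: the folder names `data/raw` and `data/processed`, the helper names `ensure_data_dirs` and `preprocess_once`, and the whitespace-stripping "cleaning" step are all assumptions made for the example, not part of any fixed convention. Replace the cleaning step with your own pipeline.

```python
import csv
from pathlib import Path


def ensure_data_dirs(project_root):
    """Create data/raw and data/processed under project_root and return both paths."""
    root = Path(project_root)
    raw_dir = root / "data" / "raw"              # assumed folder name
    processed_dir = root / "data" / "processed"  # assumed folder name
    raw_dir.mkdir(parents=True, exist_ok=True)
    processed_dir.mkdir(parents=True, exist_ok=True)
    return raw_dir, processed_dir


def preprocess_once(raw_file, processed_file):
    """Write a cleaned copy of raw_file, unless processed_file already exists."""
    raw_file, processed_file = Path(raw_file), Path(processed_file)
    if processed_file.exists():
        # Reuse the earlier result instead of preprocessing again.
        return processed_file
    with raw_file.open(newline="") as src, processed_file.open("w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            # Placeholder cleaning step: strip whitespace and drop empty rows.
            cleaned = [cell.strip() for cell in row]
            if any(cleaned):
                writer.writerow(cleaned)
    return processed_file
```

A notebook can then call `preprocess_once` at the top and read from the processed folder. The raw file is never modified, and the cleaning work runs only the first time.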

2. Create separate notebooks for different tasks

For large projects, working within a single notebook is impractical. Split the work into two or more notebooks. Otherwise, you can run into problems: the kernel may stop working properly, cells may run slowly, and you lose precious time.

To avoid this, create two or more separate notebooks, and name each one after the task it is meant to do.

3. Use Markdown

For data scientists, explaining their code and the results of their analysis is very important. So, use Markdown to explain your code and note down your results.

You can use Markdown when

  • You need to give an introduction
  • You want to write additional info about the notebook
  • You want to write the results of an analysis
  • You want to describe your notebook structure.

4. Use comments

Data scientists often need to explain their code to others. So, use comments in a cell to explain its different parts.

For example, suppose a cell defines a function that predicts an output using a machine learning model. In that case, you can briefly describe the function's input and output. Finally, remember not to use comments unnecessarily.
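
As a rough sketch of that advice, the cell below documents a prediction function's input and output and adds one comment only where the code is not self-explanatory. The names `MeanModel` and `predict_prices` are hypothetical. `MeanModel` is a toy stand-in for whatever trained model you actually use, assumed only to expose a `predict(rows)` method.

```python
class MeanModel:
    """Toy stand-in for a trained model: always predicts the mean of the training targets."""

    def fit(self, targets):
        self.mean_ = sum(targets) / len(targets)
        return self

    def predict(self, rows):
        return [self.mean_ for _ in rows]


def predict_prices(model, rows):
    """Predict one price per input row.

    Input:  model - any object with a predict(rows) method
            rows  - list of feature lists, one per house
    Output: list of floats, in the same order as rows
    """
    # Return early on an empty batch so the model is never called with no data.
    if not rows:
        return []
    return model.predict(rows)
```

The docstring carries the input and output description, so the function body needs only one comment. A line such as `return model.predict(rows)` explains itself and is better left uncommented.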