DEV Community

vivian mukhongo (Avivo)

INSIGHTS FROM PROJECT 1 OF THE DATA SCIENCE BOOTCAMP

I am so thrilled to dive into this five-week bootcamp by Lux Tech Academy.

WEEK 1 PROJECT 1 PART 1

We were given a weather dataset and asked to write Python code to answer questions about it.

STEPS USED

  1. In Jupyter Notebook, I imported the pandas and numpy libraries.
  2. I loaded the CSV file and started cleaning the data.
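
As a rough sketch of step 2, assuming a small made-up CSV (the real file name and its columns are not shown in this post, so `Date/Time`, `Temp_C` and `Weather` are only placeholders), loading the data with pandas could look like this. Only pandas is needed for this small example:

```python
import io

import pandas as pd

# Hypothetical stand-in for the weather CSV; the real file is not shown in the post.
csv_text = """Date/Time,Temp_C,Weather
2012-01-01 00:00,-1.8,Fog
2012-01-01 01:00,-1.5,Fog
2012-01-01 02:00,,Snow
"""

# pd.read_csv accepts a file path or a file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

# The empty Temp_C field in the last row is read in as a missing value (NaN).
print(df)
```

With a real file, the `io.StringIO(...)` argument would be replaced by the path to the CSV.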

**DATA CLEANING**

I started by checking the structure of the dataframe:
1. df.head() - shows the first rows
2. df.tail() - shows the last rows
3. df.shape - gives the number of rows and columns (it is an attribute, so it takes no parentheses)
4. df.info() - gives general information about the dataset
5. df.dtypes - shows the data type of each column
6. df.isna() - flags the missing values in the dataset
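
The checks above can be sketched on a tiny made-up dataframe (the column names and values are assumptions, not the bootcamp data). Note that `shape` is read without parentheses, and that chaining `.sum()` after `isna()` turns the True/False table into a count of missing values per column:

```python
import pandas as pd

# Hypothetical sample data; None in a numeric column becomes NaN.
df = pd.DataFrame({
    "Temp_C": [-1.8, -1.5, None, 2.3],
    "Weather": ["Fog", "Fog", "Snow", "Clear"],
})

print(df.head(2))   # first two rows
print(df.tail(2))   # last two rows
print(df.shape)     # (rows, columns); an attribute, not a method
df.info()           # prints column names, non-null counts and dtypes
print(df.dtypes)    # data type of each column

# isna() returns a dataframe of booleans; sum() counts the True values per column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```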
I WENT AHEAD AND ANSWERED THE QUESTIONS ASKED USING THE FUNCTIONS BELOW
len() - returns the size or length of a data structure.
I used this function widely in my code, especially where a question asked for the number of records with a certain value.
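
A small sketch of counting records with `len()`, using invented values for a `Weather` column (an assumption for illustration only):

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"Weather": ["Fog", "Clear", "Fog", "Snow", "Clear", "Clear"]})

# Number of rows in the whole dataframe.
total_records = len(df)

# Number of rows where the weather is exactly "Clear".
clear_records = len(df[df["Weather"] == "Clear"])

# Number of distinct weather conditions.
distinct_conditions = len(df["Weather"].unique())

print(total_records, clear_records, distinct_conditions)
```

The same counts could also be obtained with `df["Weather"].value_counts()`, which lists every value and its frequency in one step.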

print() - outputs a specified message. I used this function to make the output more readable.
df.rename() - renames the labels, columns or index of a dataframe.
I renamed the column 'Weather' to 'Weather_Conditions' using the syntax below
`df.rename(columns={'Weather': 'Weather_Conditions'}, inplace=True)`
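
A runnable sketch of the rename on made-up data (the values and the `Temp_C` column are assumptions). The column names must be quoted strings, and the `columns` and `inplace` arguments are separated by a comma. Here the result is assigned back instead of using `inplace=True`; both approaches rename the column:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"Temp_C": [-1.8, 2.3], "Weather": ["Fog", "Clear"]})

# rename() returns a new dataframe, so the result is assigned back to df.
df = df.rename(columns={"Weather": "Weather_Conditions"})

print(list(df.columns))
```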
mean() - calculates the average value of the numeric columns in a dataframe.
groupby() - used when you need to perform an operation on groups of your data defined by the unique values in one or more columns.
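
To illustrate how `mean()` and `groupby()` work together, here is a minimal sketch on invented numbers (the columns `Weather` and `Temp_C` are placeholders): first the average of one column, then the average of that column within each weather condition.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "Weather": ["Fog", "Fog", "Clear", "Clear"],
    "Temp_C": [1.0, 3.0, 10.0, 20.0],
})

# Average temperature over all rows.
overall_mean = df["Temp_C"].mean()

# Average temperature per weather condition: one group per unique Weather value.
mean_by_weather = df.groupby("Weather")["Temp_C"].mean()

print(overall_mean)
print(mean_by_weather)
```

Selecting the column before calling `mean()` keeps the calculation to the numeric data of interest.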

From the above analysis, I understood that in data analysis, understanding and using the right tools and techniques is crucial for extracting meaningful insights from datasets.

PART TWO: SQL CODE

I used Microsoft SQL Server Management Studio,
where I learnt how to connect to the database on the local host and the steps for importing a CSV file into the database for analysis.
Here, I explored the use of the SELECT ... FROM statement and the WHERE clause in SQL.
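
To try a SELECT ... FROM ... WHERE query without a SQL Server installation, one option is Python's built-in `sqlite3` module. This is only a stand-in for the Management Studio workflow, and the table name `weather` and its columns are assumptions made for the sketch:

```python
import sqlite3

# An in-memory SQLite database, used here instead of SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (date_time TEXT, temp_c REAL, weather TEXT)")

# Hypothetical rows, standing in for the imported CSV data.
conn.executemany(
    "INSERT INTO weather VALUES (?, ?, ?)",
    [
        ("2012-01-01 00:00", -1.8, "Fog"),
        ("2012-01-01 01:00", -1.5, "Fog"),
        ("2012-01-01 02:00", 2.3, "Clear"),
    ],
)

# SELECT picks the columns, FROM names the table, WHERE filters the rows.
rows = conn.execute(
    "SELECT date_time, temp_c FROM weather WHERE weather = ? ORDER BY date_time",
    ("Fog",),
).fetchall()
conn.close()

for row in rows:
    print(row)
```

The `?` placeholder passes the filter value separately from the query text, which avoids building SQL strings by hand.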

Though I am a beginner in the field, I look forward to learning more and using a wide range of Python and SQL applications in the industry!
I am so excited to be in the data industry!

I am open to corrections.