<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dennis Kariuki</title>
    <description>The latest articles on DEV Community by Dennis Kariuki (@karis254).</description>
    <link>https://dev.to/karis254</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1881742%2F61d9aadb-660f-419c-867d-b6482380d8aa.png</url>
      <title>DEV Community: Dennis Kariuki</title>
      <link>https://dev.to/karis254</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/karis254"/>
    <language>en</language>
    <item>
      <title>Weather Dataset- Data Analysis - Personal Project</title>
      <dc:creator>Dennis Kariuki</dc:creator>
      <pubDate>Sun, 04 Aug 2024 15:43:59 +0000</pubDate>
      <link>https://dev.to/karis254/weather-dataset-data-analysis-personal-project-16a9</link>
      <guid>https://dev.to/karis254/weather-dataset-data-analysis-personal-project-16a9</guid>
      <description>&lt;h2&gt;
  
  
  Data analysis
&lt;/h2&gt;

&lt;p&gt;Data analysis, as I understand it, is about making sense of the data you have. Data is often called the new oil, and like oil it has to be refined: cleaned, organized, and turned into something meaningful that you can draw insights from.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Am I Here?
&lt;/h2&gt;

&lt;p&gt;This is my first post on Dev.to, covering my first personal project.&lt;br&gt;
I am trying to hone my data science and analysis skills and learn new ones.&lt;/p&gt;
&lt;h2&gt;
  
  
  Let's Begin
&lt;/h2&gt;

&lt;p&gt;The dataset I will be using is &lt;a href="https://www.kaggle.com/datasets/ayushmi77al/weather-data-set-for-beginners" rel="noopener noreferrer"&gt;Weather Data for beginners from Kaggle&lt;/a&gt;, and I will try to answer the following questions:&lt;/p&gt;

&lt;p&gt;Week 1 Project:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Find all records where the weather was exactly clear.&lt;/li&gt;
&lt;li&gt;Find the number of times the wind speed was exactly 4 km/hr.&lt;/li&gt;
&lt;li&gt;Check if there are any NULL values present in the dataset.&lt;/li&gt;
&lt;li&gt;Rename the column "Weather" to "Weather_Condition".&lt;/li&gt;
&lt;li&gt;What is the mean visibility of the dataset?&lt;/li&gt;
&lt;li&gt;Find the number of records where the wind speed is greater than 24 km/hr and visibility is equal to 25 km.&lt;/li&gt;
&lt;li&gt;What is the mean value of each column for each weather condition?&lt;/li&gt;
&lt;li&gt;Find all instances where the weather is clear and the relative humidity is greater than 50, or visibility is above 40.&lt;/li&gt;
&lt;li&gt;Find the number of weather conditions that include snow.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Part 2: Move this CSV into a database of your choice and use SQL to answer 4 of the questions above.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Libraries you need for this Project
&lt;/h2&gt;

&lt;p&gt;These are the libraries I imported (only pandas is used in the snippets that follow):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import os
import numpy as np
import pandas as pd
import csv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, load the downloaded and extracted dataset, making sure the path points to where you saved the file. This is the path I used and how I read the dataset:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#Loading Data from the CSV file
read_data = pd.read_csv(r'C:\Users\DELL\Desktop\Portfolio Data Analyst\Lux Academy\Week 1 Project - Weather Dataset for Beginners\1. Weather Data.csv')

#Reading Data from the CSV File
read_data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Question 1 - Find all records where the weather was exactly clear
&lt;/h2&gt;

&lt;p&gt;To answer this question, we look at the Weather column and use value_counts() to see which weather types appear in the dataset and how often each one occurs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;read_data['Weather'].value_counts()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The weather was exactly Clear 1326 times.&lt;/p&gt;
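
&lt;p&gt;value_counts() gives the count, but the question asks for the records themselves. Here is a minimal sketch of that filter, assuming the column is named Weather and the label is spelled exactly 'Clear'. The small DataFrame is made-up sample data, not the Kaggle file:&lt;/p&gt;

```python
import pandas as pd

# Made-up sample rows standing in for the real dataset (assumption).
sample = pd.DataFrame({
    "Weather": ["Clear", "Mainly Clear", "Fog", "Clear"],
    "Temp_C": [1.5, 2.0, -1.0, 3.2],
})

# Keep only rows whose label is exactly 'Clear'; 'Mainly Clear' does not match.
clear_rows = sample[sample["Weather"] == "Clear"]

print(clear_rows)
print(len(clear_rows))  # number of exactly-clear records in the sample
```

&lt;p&gt;On the real data the same filter would be applied to read_data instead of sample.&lt;/p&gt;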

&lt;h2&gt;
  
  
  Question 2 - Find the number of times the wind speed was exactly 4 km/hr.
&lt;/h2&gt;

&lt;p&gt;To answer this question, we filter the rows on the Wind Speed_km/h column and keep those where the value is exactly 4.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;read_data[read_data['Wind Speed_km/h'] == 4] #Shows all the data that has a 4km/h wind speed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To get how many records have that speed, we can use count()&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;read_data[read_data['Wind Speed_km/h'] == 4].count() 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using count(), we can see that 474 records have a wind speed of 4 km/h.&lt;/p&gt;
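
&lt;p&gt;count() reports one number per column. When a single total is enough, summing the boolean mask is a common alternative. This is a sketch on made-up sample data; only the column name is taken from the post:&lt;/p&gt;

```python
import pandas as pd

# Made-up sample data (assumption); only the column name follows the post.
sample = pd.DataFrame({"Wind Speed_km/h": [4, 7, 4, 15, 4]})

mask = sample["Wind Speed_km/h"] == 4  # True where the speed is exactly 4
matches = int(mask.sum())              # True counts as 1, so the sum is the row count

print(matches)
```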

&lt;h2&gt;
  
  
  Question 3 - Check if there are any NULL values present in the dataset
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;read_data.isnull().sum()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every column sums to zero, so the dataset doesn't have any null values.&lt;/p&gt;

&lt;h2&gt;
  
  
  Question 4 - Rename the column "Weather" to "Weather_Condition".
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;read_data.columns

read_data.rename(columns={'Weather':'Weather Condition'})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Question 5 - What is the mean visibility of the dataset?
&lt;/h2&gt;

&lt;p&gt;There are two ways to do this. The first method calls mean() on the column directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;read_data['Visibility_km'].mean()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The second method reads the mean row for Visibility_km from the describe() summary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;read_data.describe()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
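
&lt;p&gt;To check that both methods agree, the mean can be pulled out of the describe() table by label. This is a sketch on made-up sample values, not the real dataset:&lt;/p&gt;

```python
import pandas as pd

# Made-up visibility values (assumption); the column name follows the post.
sample = pd.DataFrame({"Visibility_km": [8.0, 25.0, 48.3, 24.1]})

direct = sample["Visibility_km"].mean()

# describe() returns a table indexed by statistic name, so the mean
# for one column can be looked up with .loc[row, column].
from_describe = sample.describe().loc["mean", "Visibility_km"]

print(direct, from_describe)
```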



&lt;h2&gt;
  
  
  Question 6 - Find the number of records where the wind speed is greater than 24 km/hr and visibility is equal to 25 km
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;read_data[(read_data['Wind Speed_km/h']&amp;gt;24) &amp;amp; (read_data['Visibility_km']==25)]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
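
&lt;p&gt;The filter above displays the matching rows, while the question asks for how many there are. One way to get that number is to wrap the result in len(). The sketch below uses made-up sample data and applies the two conditions one after the other, which selects the same rows as combining them in a single mask:&lt;/p&gt;

```python
import pandas as pd

# Made-up sample rows (assumption); column names follow the post.
sample = pd.DataFrame({
    "Wind Speed_km/h": [30, 26, 10, 35],
    "Visibility_km": [25.0, 48.3, 25.0, 25.0],
})

# First condition: wind speed above 24 km/h.
windy = sample[sample["Wind Speed_km/h"] > 24]

# Second condition, applied to the already-filtered rows: visibility of 25 km.
windy_and_25 = windy[windy["Visibility_km"] == 25]

print(len(windy_and_25))  # number of records meeting both conditions
```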



&lt;h2&gt;
  
  
  Question 7 - What is the mean value of each column for each weather condition?
&lt;/h2&gt;

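&lt;p&gt;For this question, describe() is not enough, because it gives the mean of each column over the whole dataset. To get a mean per weather condition, group the rows by the Weather column first and then average the numeric columns.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;read_data.groupby('Weather').mean(numeric_only=True)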
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Question 8 - Find all instances where the weather is clear and the relative humidity is greater than 50, or visibility is above 40.
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;read_data[(read_data['Weather'] =='Clear') &amp;amp; (read_data['Rel Hum_%']&amp;gt;50) &amp;amp;(read_data['Visibility_km']&amp;gt;40)]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Question 9 - Find the number of weather conditions that include snow.
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;read_data[read_data['Weather'] =='Snow']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
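
&lt;p&gt;Part 2 of the project asks for four of these questions to be answered in SQL. Here is a minimal sketch using SQLite through Python's built-in sqlite3 module. The table name weather_data, the simplified column names, and the sample rows are all assumptions made for illustration:&lt;/p&gt;

```python
import sqlite3

import pandas as pd

# Made-up sample rows (assumption); column names are simplified so they
# can be used in SQL without quoting.
sample = pd.DataFrame({
    "weather": ["Clear", "Snow", "Clear", "Fog"],
    "wind_speed_kmh": [4, 30, 4, 26],
    "visibility_km": [25.0, 25.0, 48.3, 25.0],
})

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
sample.to_sql("weather_data", conn, index=False)

# Question 1: how many records are exactly Clear
clear_count = conn.execute(
    "SELECT COUNT(*) FROM weather_data WHERE weather = 'Clear'"
).fetchone()[0]

# Question 2: how many records have a wind speed of exactly 4 km/h
wind4_count = conn.execute(
    "SELECT COUNT(*) FROM weather_data WHERE wind_speed_kmh = 4"
).fetchone()[0]

# Question 5: mean visibility
mean_visibility = conn.execute(
    "SELECT AVG(visibility_km) FROM weather_data"
).fetchone()[0]

# Question 6: wind speed above 24 km/h and visibility equal to 25 km
windy_25_count = conn.execute(
    "SELECT COUNT(*) FROM weather_data "
    "WHERE wind_speed_kmh > 24 AND visibility_km = 25"
).fetchone()[0]

conn.close()
print(clear_count, wind4_count, mean_visibility, windy_25_count)
```

&lt;p&gt;With the real file, the DataFrame loaded by read_csv would be written to the table instead of sample, after renaming its columns to SQL-friendly names.&lt;/p&gt;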



&lt;p&gt;And that's how I handled this task.&lt;br&gt;
Cheers to all.&lt;/p&gt;

</description>
      <category>database</category>
      <category>beginners</category>
      <category>learning</category>
      <category>datascience</category>
    </item>
  </channel>
</rss>
