<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Lameck Odhiambo</title>
    <description>The latest articles on DEV Community by Lameck Odhiambo (@lameck_oluoch).</description>
    <link>https://dev.to/lameck_oluoch</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1861451%2F339c50f0-4555-4197-8486-6e78150ae015.png</url>
      <title>DEV Community: Lameck Odhiambo</title>
      <link>https://dev.to/lameck_oluoch</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/lameck_oluoch"/>
    <language>en</language>
    <item>
      <title>Handling - No module named 'googleapiclient' in JupyterNotebook using Python</title>
      <dc:creator>Lameck Odhiambo</dc:creator>
      <pubDate>Thu, 12 Sep 2024 05:03:01 +0000</pubDate>
      <link>https://dev.to/lameck_oluoch/handling-no-module-named-googleapiclient-in-jupyternotebook-using-python-4lei</link>
      <guid>https://dev.to/lameck_oluoch/handling-no-module-named-googleapiclient-in-jupyternotebook-using-python-4lei</guid>
      <description>&lt;p&gt;Handling this error is straightforward. After a long struggle, I found that you can install the package that provides &lt;code&gt;googleapiclient&lt;/code&gt; from within the Jupyter notebook that contains the project you're working on.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to the topmost cell and run the command below
&lt;code&gt;pip install --upgrade google-api-python-client&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Let it run until the download and installation finish&lt;/li&gt;
&lt;li&gt;Then import the module using the code below
&lt;code&gt;from googleapiclient.discovery import build&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;
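
&lt;p&gt;If the import still fails, it can help to check whether the notebook's kernel can see the package at all. The following is a minimal sketch using only the standard library; the helper name &lt;code&gt;is_installed&lt;/code&gt; is made up for illustration.&lt;/p&gt;

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by the current kernel."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("googleapiclient"):
    print("googleapiclient is available; the import should work.")
else:
    print("googleapiclient not found; run the pip install cell, then retry.")
```

&lt;p&gt;If the check reports that the module is missing right after a successful install, restarting the kernel and running the cells again is usually worth trying.&lt;/p&gt;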

</description>
      <category>youtubeapi</category>
      <category>data</category>
      <category>python</category>
    </item>
    <item>
      <title>Understanding Your Data: The Essentials of Exploratory Data Analysis</title>
      <dc:creator>Lameck Odhiambo</dc:creator>
      <pubDate>Sun, 11 Aug 2024 18:19:21 +0000</pubDate>
      <link>https://dev.to/lameck_oluoch/understanding-your-data-the-essentials-of-explanatory-data-analysis-nb</link>
      <guid>https://dev.to/lameck_oluoch/understanding-your-data-the-essentials-of-explanatory-data-analysis-nb</guid>
      <description>&lt;p&gt;Exploratory Data Analysis (EDA) is a data analytics process that aims to understand the data in depth and learn its characteristics, often using visual means. This gives you a better feel for the data and helps you find useful patterns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Types of Exploratory Data Analysis&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Univariate Analysis&lt;/strong&gt;
Focuses on analyzing a single variable at a time. It helps you understand the variable’s distribution, central tendency, and spread.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;Techniques&lt;/em&gt; &lt;br&gt;
• Descriptive statistics (mean, median, mode, variance, standard deviation)&lt;br&gt;
• Visualizations (histograms, box plots, bar charts, pie charts)&lt;/p&gt;
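
&lt;p&gt;As a rough illustration of the descriptive statistics listed above, here is a minimal pandas sketch. The &lt;code&gt;ages&lt;/code&gt; series is hypothetical sample data, not part of any real dataset.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical sample data; replace with a column from your own DataFrame.
ages = pd.Series([23, 25, 25, 31, 38, 41, 25, 52], name="age")

summary = {
    "mean": ages.mean(),
    "median": ages.median(),
    "mode": ages.mode().tolist(),  # mode() can return several values
    "variance": ages.var(),        # sample variance (ddof=1)
    "std": ages.std(),             # sample standard deviation (ddof=1)
}

for name, value in summary.items():
    print(name, value)
```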

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Bivariate Analysis&lt;/strong&gt;
Examines the relationship between two variables. It helps you understand how one variable affects or is associated with another.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;Techniques&lt;/em&gt;&lt;br&gt;
• Correlation coefficient&lt;br&gt;
• Visualizations (scatter plots, line plots, etc.)&lt;/p&gt;
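
&lt;p&gt;To make the correlation coefficient concrete, here is a small sketch that computes the Pearson correlation between two columns with pandas. The column names and values are invented for the example.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical data: hours studied and exam score.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5],
    "score": [52, 58, 61, 70, 74],
})

# Pearson correlation coefficient between the two columns (ranges from -1 to 1).
r = df["hours"].corr(df["score"])
print(round(r, 3))
```

&lt;p&gt;A value close to 1 suggests a strong positive linear association, a value close to -1 a strong negative one, and a value near 0 little linear association.&lt;/p&gt;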

&lt;p&gt;&lt;strong&gt;Steps involved in Exploratory Data Analysis&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;1. Understand the Data&lt;/strong&gt;&lt;br&gt;
Familiarize yourself with the dataset, understand the domain, and identify the objectives of the analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Data Collection&lt;/strong&gt;&lt;br&gt;
Collect the required data from various sources such as databases, web scraping or APIs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Data Cleaning&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Handle missing values&lt;/strong&gt;: impute or remove missing data. To count the missing values in each column:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df.isnull().sum()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To drop the rows that contain missing values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df_cleaned = df.dropna()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
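
&lt;p&gt;Dropping rows is only one option; imputing is the other. The sketch below shows one common approach, assuming a numeric column filled with its median and a categorical column filled with its most frequent value. The data and column names are hypothetical.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical data with gaps; column names are made up for illustration.
df = pd.DataFrame({
    "age": [25.0, None, 31.0, 40.0],
    "city": ["Nairobi", "Kisumu", None, "Nairobi"],
})

df_imputed = df.copy()
# Numeric column: fill gaps with the median, which is robust to outliers.
df_imputed["age"] = df_imputed["age"].fillna(df_imputed["age"].median())
# Categorical column: fill gaps with the most frequent value.
df_imputed["city"] = df_imputed["city"].fillna(df_imputed["city"].mode()[0])

print(df_imputed.isnull().sum())
```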



&lt;p&gt;&lt;strong&gt;Remove duplicates&lt;/strong&gt;: Ensure there are no duplicate records.&lt;br&gt;
To count duplicate rows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df.duplicated().sum()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To remove them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df_cleaned = df.drop_duplicates()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;4. Data Transformations&lt;/strong&gt;&lt;br&gt;
Normalize or standardize the data if necessary.&lt;br&gt;
Create new features through feature engineering.&lt;br&gt;
Aggregate or disaggregate data based on analysis needs.&lt;/p&gt;
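
&lt;p&gt;A minimal sketch of these transformations, assuming min-max normalization, z-score standardization, and a derived body mass index column as the engineered feature. The DataFrame and its column names are hypothetical.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical data; replace with your own numeric columns.
df = pd.DataFrame({"height_cm": [150.0, 160.0, 170.0, 180.0],
                   "weight_kg": [50.0, 60.0, 70.0, 80.0]})

col = df["height_cm"]
# Min-max normalization: rescale values to the range 0 to 1.
df["height_norm"] = (col - col.min()) / (col.max() - col.min())
# Standardization (z-score): mean 0 and standard deviation 1.
df["height_std"] = (col - col.mean()) / col.std()
# Feature engineering: derive body mass index from the two columns.
df["bmi"] = df["weight_kg"] / (df["height_cm"] / 100) ** 2

print(df.round(2))
```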

&lt;p&gt;&lt;strong&gt;5. Data Integration&lt;/strong&gt;&lt;br&gt;
Integrate data from various sources to create a complete data set.&lt;/p&gt;
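
&lt;p&gt;One way to integrate two sources in pandas is a join on a shared key. This is only a sketch: the tables, the &lt;code&gt;customer_id&lt;/code&gt; key, and the choice of a left join are assumptions made for illustration.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical tables from two sources that share a customer_id key.
customers = pd.DataFrame({"customer_id": [1, 2, 3],
                          "name": ["Amina", "Brian", "Caro"]})
orders = pd.DataFrame({"customer_id": [1, 1, 3],
                       "amount": [250, 100, 75]})

# A left join keeps every customer, even those without orders.
combined = customers.merge(orders, on="customer_id", how="left")
print(combined)
```

&lt;p&gt;Customers with no matching order get a missing value in &lt;code&gt;amount&lt;/code&gt;, so integration often feeds back into the data cleaning step.&lt;/p&gt;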

&lt;p&gt;&lt;strong&gt;6. Data Exploration&lt;/strong&gt;&lt;br&gt;
Perform univariate and bivariate analysis using histograms, box plots, line plots, etc.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7. Data Visualization&lt;/strong&gt;&lt;br&gt;
Visualize data distributions and relationships using visual tools such as bar charts, line charts, scatter plots, heat maps, and box plots.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;8. Descriptive Statistics&lt;/strong&gt;&lt;br&gt;
Calculate measures of central tendency (mean, median, mode) and dispersion (range, variance, standard deviation).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df.describe()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;9. Identify Patterns and Outliers&lt;/strong&gt;&lt;br&gt;
Detect patterns, trends, and outliers in the data using visualizations and statistical methods.&lt;br&gt;
For example, using a box plot:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import matplotlib.pyplot as plt

# Drop missing values first; NaNs can leave the box plot empty.
plt.boxplot(df['column_name'].dropna())
plt.show()

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
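
&lt;p&gt;Outliers can also be flagged numerically. The sketch below applies the 1.5 × IQR rule, which is the same rule matplotlib's box plot uses by default to place its whiskers. The sample values are hypothetical.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical values; 95 is deliberately far from the rest.
values = pd.Series([10, 12, 11, 13, 12, 95, 11, 10])

q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# between() is inclusive on both ends; ~ inverts the mask to select outliers.
outliers = values[~values.between(lower, upper)]
print(outliers.tolist())
```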



&lt;p&gt;&lt;strong&gt;10. Documentation and Reporting&lt;/strong&gt;&lt;br&gt;
Document the EDA process, findings, and insights in a clear, structured way.&lt;br&gt;
Create reports and presentations to convey results to stakeholders.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exploratory Data Analysis Tools&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With the following tools for exploratory data analysis, data scientists can gain deeper insights and prepare data for advanced analytics and modelling.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Python Libraries&lt;/em&gt;&lt;br&gt;
• Pandas: Provides data structures and functions needed to manipulate structured data seamlessly. Used for summary statistics.&lt;br&gt;
• Matplotlib: A plotting library that produces static, animated, and interactive visualizations.&lt;br&gt;
• Seaborn: Built on Matplotlib, it provides a high-level interface for drawing attractive statistical graphics.&lt;br&gt;
• SciPy: Builds on NumPy and provides many higher-level scientific algorithms.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;R Libraries&lt;/em&gt;&lt;br&gt;
• ggplot2: A framework for creating graphics using principles of the grammar of graphics.&lt;br&gt;
• dplyr: A set of tools for data manipulation, offering consistent verbs for common data manipulation tasks.&lt;br&gt;
• tidyr: Provides functions that help you organize data in a tidy format.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
    </item>
    <item>
      <title>The Ultimate Guide to Data Analytics: Techniques and Tools</title>
      <dc:creator>Lameck Odhiambo</dc:creator>
      <pubDate>Sun, 04 Aug 2024 17:34:23 +0000</pubDate>
      <link>https://dev.to/lameck_oluoch/the-ultimate-guide-to-data-analytics-techniques-and-tools-2oa9</link>
      <guid>https://dev.to/lameck_oluoch/the-ultimate-guide-to-data-analytics-techniques-and-tools-2oa9</guid>
      <description>&lt;p&gt;&lt;strong&gt;DATA ANALYSIS&lt;/strong&gt;&lt;br&gt;
Data is powerful, and organizations around the world understand the value of data analytics in driving growth and profitability.&lt;br&gt;
Data analytics involves using techniques and tools to identify patterns and trends, which in turn generate actionable insights that support informed decision-making. Its primary objective is to address specific questions or challenges that are relevant to the organization in order to drive better business outcomes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Benefits of data analytics&lt;/strong&gt;&lt;br&gt;
Recognizing patterns and trends&lt;br&gt;
Understanding how data is compiled&lt;br&gt;
Future-proofing your career&lt;br&gt;
Improving productivity&lt;br&gt;
Mitigating risks&lt;br&gt;
Identifying and leveraging sources for a competitive advantage&lt;br&gt;
Helping stakeholders make better decisions&lt;/p&gt;

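&lt;p&gt;&lt;strong&gt;Types of data analytics&lt;/strong&gt;&lt;br&gt;
Looking at the past&lt;br&gt;
Descriptive: what happened&lt;br&gt;
Diagnostic: why it happened&lt;/p&gt;

&lt;p&gt;Looking to the future&lt;br&gt;
Predictive: what is likely to happen&lt;br&gt;
Prescriptive: what to do about it&lt;/p&gt;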

&lt;p&gt;&lt;strong&gt;Tools used for data analytics&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Microsoft Excel&lt;/strong&gt;&lt;br&gt;
Excel is a widely used, user-friendly spreadsheet program that features calculation and graphing functions. It’s ideal for non-technical users who need to perform basic data analysis and create charts and reports.&lt;/p&gt;

&lt;p&gt;Pros&lt;br&gt;
No coding is required&lt;br&gt;
User-friendly interface&lt;/p&gt;

&lt;p&gt;Cons&lt;br&gt;
Runs slowly on complex data analysis&lt;br&gt;
Less automation compared to specialized tools&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tableau&lt;/strong&gt;&lt;br&gt;
One of the best-known commercial data analysis tools, Tableau is famous for its interactive dashboards and data exploration capabilities. Data teams can create visually appealing, interactive data representations through its easy-to-use interface.&lt;/p&gt;

&lt;p&gt;Pros&lt;br&gt;
Intuitive drag-and-drop interface&lt;br&gt;
Interactive and dynamic data visualization&lt;br&gt;
Backed by Salesforce&lt;/p&gt;

&lt;p&gt;Cons&lt;br&gt;
More expensive than competing tools&lt;br&gt;
Steep learning curve for advanced features&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
