<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: samuel Omondi</title>
    <description>The latest articles on DEV Community by samuel Omondi (@samuel_omondi_4c8df8999c2).</description>
    <link>https://dev.to/samuel_omondi_4c8df8999c2</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3095756%2F5a0197b2-6e31-4612-bbf2-8950ee5999ee.jpg</url>
      <title>DEV Community: samuel Omondi</title>
      <link>https://dev.to/samuel_omondi_4c8df8999c2</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/samuel_omondi_4c8df8999c2"/>
    <language>en</language>
    <item>
      <title>6 Essential Steps of Data Cleaning Every Data Analyst Should Know in MS Excel</title>
      <dc:creator>samuel Omondi</dc:creator>
      <pubDate>Sun, 27 Apr 2025 19:35:28 +0000</pubDate>
      <link>https://dev.to/samuel_omondi_4c8df8999c2/6-essential-steps-of-data-cleaning-every-data-analyst-should-know-in-ms-excel-8m3</link>
      <guid>https://dev.to/samuel_omondi_4c8df8999c2/6-essential-steps-of-data-cleaning-every-data-analyst-should-know-in-ms-excel-8m3</guid>
      <description>&lt;p&gt;Data cleaning is a fundamental step in data analysis, ensuring datasets are accurate, consistent, and ready for analysis. &lt;/p&gt;

&lt;p&gt;Microsoft Excel provides powerful tools to clean data efficiently. &lt;/p&gt;

&lt;p&gt;In this guide we will explore six critical steps for effective data cleaning in Excel, complete with formulas, functions, and best practices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Explore the Dataset&lt;/strong&gt;&lt;br&gt;
Before cleaning, understand your dataset’s structure, sources, and potential issues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Here is how we will go about it;&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;1.Identify Data Sources:&lt;br&gt;
Check if data is imported from CSV, SQL, or manual entry.&lt;/p&gt;

&lt;p&gt;Use Data → Get Data → From File/DB for structured imports.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8m8yrbjh5vw0k9i4v2dy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8m8yrbjh5vw0k9i4v2dy.png" alt="importing data" width="582" height="444"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2.Understand Data Types:&lt;br&gt;
Check your data columns and ensure the content format matches.&lt;br&gt;
Columns with numbers should be consistent, likewise to columns with texts and dates formats.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3gy8f3u25qfq7xfau440.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3gy8f3u25qfq7xfau440.png" alt="Uncleaned data" width="560" height="175"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Use these formulars to achieve consistency in data types.&lt;br&gt;
Text vs. Numbers: Use ISTEXT(), ISNUMBER().&lt;br&gt;
Dates: Verify with ISDATE() (custom VBA may be needed).&lt;/p&gt;

&lt;p&gt;3.Spot Data Quality Issues:&lt;br&gt;
Filters: Sort columns to spot inconsistencies.&lt;/p&gt;

&lt;p&gt;use Data → Sort &amp;amp; Filter&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxe53i450eatymz5o4nhg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxe53i450eatymz5o4nhg.png" alt="Sorting and Filtering" width="660" height="280"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Conditional Formatting: Highlight blanks, errors, or outliers.&lt;/p&gt;

&lt;p&gt;use → Home → styles → conditional formatting&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faafzjvxic7yeq16etmdt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faafzjvxic7yeq16etmdt.png" alt="conditional formating" width="593" height="293"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To remove blanks, shortcut ctr + G  → special → blanks&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsctwwtf2m9el1d7q20x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsctwwtf2m9el1d7q20x.png" alt="Removing blanks" width="541" height="432"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Data Validation: Set rules (e.g., age must be 18+).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5ssgskpgdiaz3smp3h8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5ssgskpgdiaz3smp3h8.png" alt="Data Validation" width="653" height="505"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Handle Missing Data&lt;/strong&gt;&lt;br&gt;
Missing values can distort analysis giving inaccurate or misleading insights. This can be costly, and to avoid this, you can decide whether to delete or impute them.&lt;/p&gt;

&lt;p&gt;Below are other ways of dealing with missing data&lt;/p&gt;

&lt;p&gt;Start by finding the  Missing Data:&lt;/p&gt;

&lt;p&gt;=COUNTBLANK(A2:A100) → Counts empty cells.&lt;br&gt;
=IF(ISBLANK(A2), "Missing", "Present") → Tags missing values.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Remove Duplicates&lt;/strong&gt;&lt;br&gt;
Duplicate records like blanks can bias results&lt;br&gt;
Make sure to eliminate them while keeping unique entries.&lt;/p&gt;

&lt;p&gt;Find Duplicates:&lt;/p&gt;

&lt;p&gt;Data → data tools → duplicates&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2t5cyg6ja63yu1mupmbp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2t5cyg6ja63yu1mupmbp.png" alt="Duplicates in excel" width="455" height="343"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After running the "find duplicates" command we can use conditional formatting to highlight the duplicates or remove the duplicates&lt;/p&gt;

&lt;p&gt;Conditional Formatting → Highlight Duplicates&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4: Handle Formatting Issues&lt;/strong&gt;&lt;br&gt;
Inconsistent formatting leads to errors&lt;/p&gt;

&lt;p&gt;Here is how you can standardize text, dates, and units.&lt;/p&gt;

&lt;p&gt;1.Fix Text Cases:&lt;br&gt;
=UPPER(A2) (ALL CAPS)&lt;br&gt;
=PROPER(A2) (Title Case)&lt;br&gt;
=TRIM(A2) (Remove spaces)&lt;/p&gt;

&lt;p&gt;2.Standardize Dates:&lt;br&gt;
=DATEVALUE(TEXT(A2, "mm/dd/yyyy")) → Converts text to date.&lt;/p&gt;

&lt;p&gt;3.Convert Units:&lt;br&gt;
Miles to KM: =A2*1.60934&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 5: Manage Outliers&lt;/strong&gt;&lt;br&gt;
Outliers can skew statistics, and it is important to detect and handle them appropriately.&lt;/p&gt;

&lt;p&gt;The best action to take is remove outliers by filtering and deleting extreme values.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 6: Validate Data&lt;/strong&gt;&lt;br&gt;
Ensure cleaned data matches source integrity and business rules.&lt;/p&gt;

&lt;p&gt;Below is how you can go about;&lt;/p&gt;

&lt;p&gt;1.Cross-Check with Original Data:&lt;br&gt;
Use VLOOKUP() to compare cleaned vs. raw data.&lt;/p&gt;

&lt;p&gt;2.Test Business Rules:&lt;br&gt;
=IF(AND(B2&amp;gt;=18, B2&amp;lt;=65), "Valid", "Invalid")&lt;/p&gt;

&lt;p&gt;3.Check Consistency:&lt;br&gt;
Age vs. Birth Year: =YEAR(TODAY())-B2&lt;/p&gt;

</description>
      <category>tutorial</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
