Musungu (Ruth) Ambogo

Introduction to MS Excel for Data Analytics

Introduction

Excel is a simple yet powerful tool used by data analysts to perform various tasks such as data cleaning and preparation, data analysis, and data visualization. This article will help beginners understand how Excel can be used for data analytics and build a strong foundation for further learning.

What is Data Analytics?

Data analytics is the process of collecting, cleaning, and analyzing data to extract insights to support better decision-making, leading to more effective strategies and actions.
The four types of data analytics are:

  • Descriptive Analytics: answers the question "What has happened?" by summarizing historical data using reports, charts, and dashboards
  • Diagnostic Analytics: answers the question "Why did it happen?" by examining past data to identify patterns, trends, and relationships
  • Predictive Analytics: answers the question "What is likely to happen?" by using historical data to build models that forecast future outcomes
  • Prescriptive Analytics: answers the question "What should be done?" by recommending actions to take to achieve a desired outcome

Overview of Excel Interface

Microsoft Excel is a spreadsheet application used to organize, analyze, and visualize data. Its interface consists of several key components:

  • Workbook: an Excel file containing one or more worksheets
  • Worksheet: an individual sheet where data is entered and analyzed
  • Columns: vertical sections labeled with letters (A, B, ...) used to organize data
  • Rows: horizontal sections labeled with numbers (1, 2, ...) where records are stored
  • Cell: the intersection of a row and a column (A1, B1, ...)
  • Range: a group of two or more selected cells, e.g., A1:A10 is the range of cells from A1 to A10 in column A
  • Ribbon and key tabs: a toolbar at the top of Excel that contains commands grouped into tabs such as Home, Insert, Formulas, and Data
  • Formula: an equation that performs calculations on values. Formulas must start with = and can use arithmetic operators and functions
  • Filter: a tool used to show or hide specific rows based on specified criteria
  • Sort: a tool used to arrange data in either ascending or descending order

[Figure: overview of the Excel interface showing key components]

Preparing data in Excel

Before analyzing any dataset, it is important to prepare it properly. This includes importing, cleaning, and formatting the data so it’s ready for analysis.

Importing data

Excel allows you to import data from multiple sources:

  • CSV files
  • Text files
  • Databases

Data cleaning and preparation

Proper cleaning ensures accuracy in your analysis. To perform data cleaning, you should:

  • Format data types: data can be stored as text, numbers, dates, or currency. Ensure that each column is formatted correctly so that calculations and analysis work properly. How to do it:
    • Select the column.
    • Right-click → Format Cells → Choose the appropriate type (Number, Date, Text, Currency).

  • Remove duplicates: use a column that should be unique in the dataset (like Customer ID or Email) to check for duplicates and remove them if found. How to do it:
    • Select the column or the entire dataset.
    • Go to Data → Remove Duplicates.
    • Choose the column(s) to check for duplicates and click OK.

  • Handle missing values: blank cells or missing data can distort calculations and analysis. You can:
    • Fill missing values with averages, zeros, or default values
    • Delete rows with critical missing data
    • Leave them blank if appropriate
    How to do it:
    • Use a formula such as =IF(ISBLANK(B2), 0, B2) in a helper column
    • Or manually fill/delete blank cells
  • Data validation: restricts the values that can be entered in a cell. How to do it:
    • Select the cells.
    • Go to Data → Data Validation and, under Allow, choose List.
    • Enter the allowed values in the Source box, separated by commas (e.g., furniture, office supplies, technology).

Example:
In a “Category” column, only the listed categories can be entered, preventing typos like “offce”.

  • Text-to-Columns: a tool used to split text in one column into multiple columns based on a delimiter. How to do it:
    • Select the column with text.
    • Go to Data → Text to Columns → Delimited.
    • Choose the delimiter and finish the wizard.

Example:
A “Full Name” column contains "Mary Jane". Using Text to Columns with a space delimiter splits it into "Mary" (first name) and "Jane" (last name).

  • Find & Replace: locates specific text or values within a worksheet and replaces them with other text or values. How to do it:
    • Press Ctrl + H.
    • Enter the value to find and the value to replace it with.
    • Click Replace All.

Example:
Replace all "NYC" entries with "New York City" to ensure consistency in a city column.

  • Conditional Formatting: highlights the cells in a range that meet a condition, making patterns, trends, or outliers visible. How to do it:
    • Select the range.
    • Go to Home → Conditional Formatting → Highlight Cells Rules.
    • Choose a condition (e.g., salary greater than 10,000) and select a highlight color.
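
Some of the cleaning steps above can also be done with formulas in helper columns. The lines below are a minimal sketch under an assumed layout that is not from the original dataset: customer IDs in A2:A100 and sales amounts in B2:B100. Adjust the ranges to your own data.

```excel
=IF(ISBLANK(B2), AVERAGE($B$2:$B$100), B2)
=IF(COUNTIF($A$2:$A$100, A2)>1, "Duplicate", "Unique")
=COUNTBLANK(B2:B100)
```

The first formula fills a blank amount with the column average (AVERAGE ignores empty cells), the second flags IDs that appear more than once so you can review them before using Remove Duplicates, and the third counts how many amounts are missing. The first two are meant to be copied down a helper column, which is why the ranges use $ to stay fixed.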

Excel functions for data analysis

Excel functions are built-in formulas that help you summarize and analyze data and automate tasks.
Common categories:

  1. Aggregate functions: used to summarize data
    • SUM(): finds the sum of values in a range. Example: =SUM(A1:A6) adds the numbers in cells A1 to A6
    • AVERAGE(): calculates the mean. Example: =AVERAGE(A1:A6) returns the mean of the numbers in cells A1 to A6
    • MIN(): finds the smallest number in a range
    • MAX(): finds the largest number in a range
    • COUNT(): counts the number of cells that contain numeric values
    • COUNTIF(): counts cells that meet one condition
    • COUNTA(): counts cells that are not empty (text or numbers)
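
To make the remaining aggregate functions concrete, here is a short sketch. It assumes a hypothetical sheet with product categories in C2:C11 and order amounts in D2:D11; neither range comes from the original article.

```excel
=MIN(D2:D11)
=MAX(D2:D11)
=COUNT(D2:D11)
=COUNTA(C2:C11)
=COUNTIF(C2:C11, "Furniture")
=COUNTIF(D2:D11, ">100")
```

In order, these return the smallest amount, the largest amount, how many amounts are numeric, how many category cells are filled in, how many rows belong to the Furniture category, and how many amounts exceed 100. Note that COUNTIF takes its condition as text in quotes, including comparison operators such as ">100".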

2. Logical functions: used to automate decision-making

  • IF(): performs a logical test and returns one value for TRUE and another value for FALSE

  • AND(): Returns TRUE if all conditions are met.
  • OR(): Returns TRUE if at least one condition is met.

Creating Categories Using AND & OR

Logical functions are commonly used to create categories in data analytics.
Example: Employee Performance Categorization
=IF(AND(D2>=70,E2>=5),"High Performer","Needs Improvement")


Meaning:
An employee is a high performer only if both conditions are met.

Example Using OR
=IF(OR(D2>=7,E2>=8),"Eligible","Not Eligible")

Meaning:
An employee is eligible if any one condition is met.
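
When more than two categories are needed, IF functions can be nested, and AND and OR can be combined in a single test. The sketch below reuses columns D and E from the examples above and adds a hypothetical column F holding "Yes" or "No"; the thresholds and labels are illustrative assumptions only.

```excel
=IF(D2>=70, "High", IF(D2>=50, "Medium", "Low"))
=IF(AND(D2>=70, OR(E2>=5, F2="Yes")), "Eligible", "Not Eligible")
```

The first formula checks the highest threshold first, so a value of 70 or more never reaches the second test. The second formula requires D2 to be at least 70 and, in addition, at least one of the two conditions inside OR to be true.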

  3. Text functions: used to clean, format, and combine text data in Excel.

    • TRIM(): removes leading and trailing spaces and reduces repeated spaces between words to a single space.
    • CONCAT(): joins text from different cells into one string.
    • UPPER(), LOWER(), PROPER(): standardize text casing.
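
Text functions are often chained together. As a minimal sketch, assume a hypothetical sheet with first names in column A and last names in column B, typed with inconsistent spacing and casing:

```excel
=TRIM(A2)
=PROPER(TRIM(A2))
=CONCAT(PROPER(TRIM(A2)), " ", PROPER(TRIM(B2)))
```

If A2 contains "  mary " and B2 contains "JANE", the third formula returns "Mary Jane". CONCAT is available in Excel 2019 and later; in older versions the & operator or CONCATENATE does the same job.
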
  4. Date and time functions

    • TODAY(): returns the current date.
    • DATEDIF(): calculates the difference between two dates in days, months, or years.
    • YEAR(), MONTH(), DAY(): extract the year, month, or day from a date.
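
A short sketch of the date functions, assuming a hypothetical hire date stored in cell A2:

```excel
=YEAR(A2)
=MONTH(A2)
=DATEDIF(A2, TODAY(), "Y")
=DATEDIF(A2, TODAY(), "M")
```

The first two formulas extract the year and month of the hire date. The DATEDIF formulas return the number of complete years and complete months between the hire date and today; the third argument selects the unit ("Y", "M", or "D"). DATEDIF expects the earlier date first and returns an error if the start date is later than the end date. It also does not appear in Excel's formula autocomplete, so it has to be typed in full.
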
  5. Lookup functions
    Lookup functions are used to find and retrieve related data from another table based on a matching value.

    • VLOOKUP(): searches for a value in the first column of a table and returns a corresponding value from another column in the same row.
      Syntax: =VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])
      Where:
      • lookup_value: the value you want to search for
      • table_array: the table range containing the data
      • col_index_num: the column number (counted from the left of table_array) to return a value from
      • range_lookup:
        • FALSE = exact match (recommended for analytics)
        • TRUE = approximate match (the default if omitted; requires the first column to be sorted in ascending order)

Example: Get customer ID using row ID

[Figure: VLOOKUP example that returns the customer_id for a given row ID]

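  • XLOOKUP(): searches one range for a value and returns the matching item from another range, so it can look left or right. It is the modern replacement for VLOOKUP and HLOOKUP, available in Excel 2021 and Microsoft 365.
  • HLOOKUP(): searches for a value across the first row of a table and returns a value from a specified row in the same column
  • INDEX() + MATCH(): MATCH finds the position of a value in a range, and INDEX returns the value at that position in another range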
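
The three lookup approaches can be compared side by side. The sketch below assumes a hypothetical table with row IDs in A2:A100, customer IDs in B2:B100, and customer names in C2:C100, with the row ID to search for typed into E2. These ranges are illustrative and not taken from the original dataset.

```excel
=VLOOKUP(E2, $A$2:$C$100, 2, FALSE)
=XLOOKUP(E2, $A$2:$A$100, $B$2:$B$100, "Not found")
=INDEX($B$2:$B$100, MATCH(E2, $A$2:$A$100, 0))
```

All three return the customer ID for the row ID in E2. VLOOKUP counts columns from the left of the table (2 means column B), XLOOKUP takes the search range and the return range separately and lets you choose the text shown when nothing matches, and in INDEX + MATCH the final 0 in MATCH requests an exact match. When the value is missing, VLOOKUP and INDEX + MATCH return #N/A.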

Data Analysis using Pivot tables

A pivot table is a tool that summarizes a dataset in a compact table.
With pivot tables, you can group data by category, perform calculations (sum, average, count), and easily spot trends.
How to create a pivot table:

  • Select your dataset.
  • Go to Insert → PivotTable.
  • Choose where to place the pivot table (new worksheet recommended).
  • Drag fields into Rows, Columns, Values, and Filters areas.

Example of a marketing dataset

Pivot table showing total amount spent by age group
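
A pivot table total can be cross-checked with a conditional formula on the source data. This is a minimal sketch that assumes a hypothetical layout, with age groups in B2:B100 and amounts spent in F2:F100, and a hypothetical group label "Adult"; substitute the ranges and labels used in your own dataset.

```excel
=SUMIF($B$2:$B$100, "Adult", $F$2:$F$100)
=AVERAGEIF($B$2:$B$100, "Adult", $F$2:$F$100)
=COUNTIF($B$2:$B$100, "Adult")
```

These return the total amount, the average amount, and the number of rows for one age group. If the pivot table is built on the same range, its Sum, Average, and Count for that group should match.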

Charts for visualizing pivot table summaries

Charts make pivot table summaries easy to understand at a glance.
Excel offers many chart types:
1. Bar and column charts: bar charts compare values across categories using horizontal bars, while column charts use vertical bars

2. Pie chart: shows each category's share of a whole, usually as a percentage.
In this example, the pie chart shows the distribution of the amount spent across different age groups.

3. Line chart: shows trends over time

4. Combo chart: combines two chart types in one, e.g., a line chart and a column chart

5. Area chart: shows how values change over time, with the filled area emphasizing magnitude; a stacked area chart shows how parts add up to a cumulative total

6. Treemap: displays hierarchical data as nested rectangles

How to create a chart from a pivot table:

  • Select the pivot table.
  • Go to Insert → Recommended Charts (or choose a specific chart type).
  • Customize chart titles, labels, and colors as needed.

Building Dashboards

Dashboards bring multiple pivot tables and charts together to provide an at-a-glance view of key metrics.
Key elements of a dashboard:

  1. Cards: display key metrics or KPIs (e.g., total sales, average age).
  2. Slicers: filter data dynamically in pivot tables and charts. They make dashboards interactive, allowing users to select categories like Region, Product, or Department. Tip: Use a categorical column when creating a slicer (e.g., Product Category, Region). How to add a slicer:
    • Click on the pivot table.
    • Go to PivotTable Analyze → Insert Slicer.
    • Select the field(s) to filter and click OK.
  3. Chart visuals: visuals generated from pivot tables

Once you’ve created your pivot tables, pivot charts, and calculated metrics, you can build a dashboard by following these steps:

Step-by-step process

  1. Create a New Worksheet
    Add a new worksheet and rename it “Dashboard”.

  2. Format the Background
    Select all cells and fill the background with a color (e.g., light gray) to give a professional look.

  3. Add a Header
    Insert a text box at the top of the sheet for the dashboard title.
    Format the font, size, and color to make it stand out.

  4. Add Cards or Metrics
    • Insert additional text boxes to display key metrics or KPIs.
    • Copy and paste metric values from pivot tables, or calculate them directly from the original dataset if needed.
    • Format each card to match the theme of your dashboard.

  5. Add Pivot Charts
    • Copy and paste pivot charts from the pivot tables onto the dashboard sheet.
    • Arrange charts according to the layout you want (e.g., KPIs at the top, charts below).
    • Format the charts (colors, fonts, legends) to match the overall theme.

  6. Add Slicers for Interactivity
    • Insert slicers to allow users to filter data dynamically (e.g., by Product, Region, or Department).
    • Connect slicers to the pivot tables and charts (right-click the slicer → Report Connections) so the dashboard updates automatically when a filter is applied.
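
For the cards, calculating metrics with formulas keeps them up to date when the source data changes, which pasted values do not. The sketch below assumes the raw data sits on a hypothetical sheet named Data, with customer IDs in A2:A100 and amounts in F2:F100.

```excel
=SUM(Data!F2:F100)
=AVERAGE(Data!F2:F100)
=COUNTA(Data!A2:A100)
=TEXT(SUM(Data!F2:F100), "#,##0")
```

These give the total amount, the average amount, the number of records, and the total formatted with thousands separators for display. To show one of these results in a card, select the text box, click in the formula bar, type = followed by the cell that holds the formula, and press Enter; the text box then displays that cell's current value.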

Conclusion

Excel is a powerful tool for data analytics, allowing beginners to clean data, use pivot tables, create charts, and build interactive dashboards. By mastering these skills, you can extract insights, visualize trends, and make informed decisions. Practice on sample datasets to reinforce learning and gain confidence in using Excel for real-world analytics.

You are on the right track. Keep learning and keep growing.
