Suppose you have a spreadsheet full of sales data, just rows and columns. You do not need SQL or Python to start analyzing it. In this article, we will use Microsoft Excel to perform data analysis step by step.
What is Data Analysis?
Data analysis is the process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
In simple terms, data analysis helps us answer questions like:
Which product sells the most?
What is the average rating?
Are discounts increasing sales?
Why use Excel for Data Analysis?
For years, Excel has remained a powerhouse for data analysis. Even in an era of popular tools like Python, Power BI, and R, Excel is still one of the most essential tools for data analysts.
One of the greatest benefits of Excel is that you do not need prior coding experience, and you can learn the basics quickly. It is also one of the primary tools used for data entry during data collection.
Whether you are cleaning data, analyzing trends, or presenting insights, Excel offers flexibility, ease of use, and a wide range of built-in functions that are hard to match in day-to-day business scenarios.
Understanding Excel Data (Rows, Columns, Cells)
Excel is made of rows and columns, which form a grid-like structure in the spreadsheet. Each cell in the grid is identified by a unique combination of a column letter and a row number.
In the image below, the column and row headers are displayed. The intersection of each column and row forms a unique cell.
Columns are identified by letters and run vertically, while rows are identified by numbers and run horizontally. In the example in the image above, the first column is A and the first row is 1, so the first cell is identified as A1.
Preparing Data for Analysis
Now that we are familiar with how Excel looks, it is time to get a taste of what it does and the powerful features it provides under the hood.
Understanding the basics of data analysis in Excel is the first step towards making informed decisions. In the next sections, we'll start working with real data and explore how Excel helps us clean, analyze, and summarize information efficiently.
Understanding Your Dataset
In Excel, data is usually represented in the form of a dataset. This is a regular Excel spreadsheet containing rows and columns, with contents that depend on the field the data comes from.
The first job of any data analyst when presented with data is to study it and understand what type of data they have. It can come from sales, health records, telecommunications, human resources, and so on.
Below is an example of a simple HR dataset stored in Excel.
This is a simple Excel dataset with columns showing employee details and rows filled with individual employee data.
Once you are familiar with the dataset, the next step is to perform some basic data cleaning.
Basic Data Cleaning in Excel
Data cleaning is an essential step of data analysis. Before we start deriving insights or making sense of our data, we need to make sure we have clean and consistent data to work with.
There are a few things we need to check before proceeding with analysis.
Identifying missing values
Missing values are common in any dataset. They can be caused by many factors during data collection, such as omission, forgetfulness, or unclear data requirements.
Our first responsibility is to handle the missing values so that they do not distort the analysis.
There are several ways to deal with missing values:
- Remove rows with missing values. This works only if removing those rows will not have a major effect on the dataset.
- Fill them with N/A, NULL, or UNKNOWN. This suits situations where the data is non-numerical and cannot be computed, e.g. personal details like phone number or city. Here we can use Find and Replace:
  - Select the column with missing values.
  - Press Ctrl + H, or on the Home ribbon, at the far right, click Find & Select. Check the image below.
  - The Find and Replace dialog will appear, as in the image below.
  - In the Find what field, type the value you want to replace. In the Replace with field, type the new value.
- Fill the missing values with the mean or median. For numerical columns, we can calculate the mean (average) or median of the whole column and use it to fill the missing values in that column.
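For readers curious how the "fill with the mean or median" idea looks outside Excel, here is a minimal Python sketch. The `salaries` column and its values are made up for illustration, and `None` stands in for an empty cell.

```python
from statistics import mean, median

# Hypothetical numeric column; None stands in for an empty cell.
salaries = [4200, None, 3900, 5100, None, 4600]

# Compute the statistic from the values that are actually present.
present = [value for value in salaries if value is not None]
fill_value = mean(present)        # use median(present) if outliers are a concern
median_value = median(present)

# Replace each missing entry with the chosen statistic.
filled = [fill_value if value is None else value for value in salaries]
print(filled)
```

The median is often the safer choice when a few very large or very small values would pull the mean away from a typical value.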
Removing Duplicates
In many instances, we will find duplicates in our datasets, such as employees with the same employee_id or products with the same product_id. Duplicates lead to incorrect analysis.
Removing duplicates leaves us with cleaner data, which in turn gives correct outputs for calculations like count, mean, median, and sum.
In Excel, duplicates can be removed by selecting the dataset, going to the Data tab, and clicking Remove Duplicates.
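The logic behind removing duplicates can be sketched in a few lines of Python. The rows and names below are invented for illustration, and the sketch assumes that `employee_id` alone decides whether two rows are duplicates and that the first occurrence is the one to keep.

```python
# Hypothetical employee rows; employee_id 102 appears twice.
rows = [
    {"employee_id": 101, "name": "Amina"},
    {"employee_id": 102, "name": "Brian"},
    {"employee_id": 102, "name": "Brian"},
    {"employee_id": 103, "name": "Chen"},
]

seen_ids = set()
unique_rows = []
for row in rows:
    key = row["employee_id"]
    if key not in seen_ids:       # keep only the first row for each id
        seen_ids.add(key)
        unique_rows.append(row)

print(len(rows), "rows before,", len(unique_rows), "rows after")
```

Counting the rows before and after is a quick sanity check that only the expected number of duplicates was dropped.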
Formatting Columns
It is crucial that each column is formatted with the data type it represents.
We select an individual column by clicking its header at the top of the column, where it is labelled with a letter.
On the Home ribbon, go to the Number segment and choose the correct data type.
Click on the small arrow at the bottom right of the segment to open the data type selection table shown below.
You can now select the data type that matches your column.
The benefit of this is that it makes calculations and analysis easier and more accurate. We can use the data in every column correctly according to its data type.
Renaming Columns
Column names are like street names that guide us as we maneuver through the dataset. They help us know where to find specific data and what it represents. This means having the correct names for our columns is very important in our data analysis process.
To rename a column, we double-click the cell with the column name and type in an appropriate name.
At this point, our data is clean, consistent, and ready for analysis. In the next section, we will begin exploring the data using basic Excel tools such as sorting, filtering, and simple formulas.
Sorting and Filtering Data
Now that we have clean data with no missing values or duplicates, we are ready to dive a little deeper into the world of data analysis.
We begin this by sorting our data.
Sorting in Excel is the process of arranging data in a specific order (alphabetically, numerically, or by date), either ascending or descending.
There are different ways we can sort data in Excel. The first way is by clicking at the top of a column header to select it.
- Go to the Home tab on the ribbon and, on the far right under the Editing group, click Sort & Filter.
Click on the dropdown arrow to get more options on how to sort as illustrated in the image below.
In the Sort & Filter options, we have two major options to choose from:
- Sort A to Z / Oldest to Newest: this is ascending order
- Sort Z to A / Newest to Oldest: this is descending order
NOTE:
We can follow the above procedure when we want to sort one or a few columns in our dataset.
The second way to sort your data is by formatting the dataset as a table:
- Click on any cell in the dataset.
- Go to the Home ribbon.
- Under the Styles segment, click Format as Table.
When you click it, you will see multiple table styles to choose from. Check the image below for reference.
Once the data is formatted as a table, a dropdown button appears at the right edge of every column header. The button gives us options for sorting that column, as depicted in the image below.
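As a rough illustration of what ascending and descending sorts do to the rows, here is a small Python sketch. The product names and ratings are invented for the example.

```python
# Hypothetical product rows to sort.
products = [
    {"product": "Kettle", "rating": 4.5},
    {"product": "Toaster", "rating": 3.8},
    {"product": "Blender", "rating": 4.9},
]

# Ascending by rating (smallest first), like an A to Z sort on that column.
by_rating_asc = sorted(products, key=lambda row: row["rating"])

# Descending by rating (largest first), like a Z to A sort on that column.
by_rating_desc = sorted(products, key=lambda row: row["rating"], reverse=True)

# Alphabetical by product name.
by_name = sorted(products, key=lambda row: row["product"])

print([row["product"] for row in by_rating_asc])
print([row["product"] for row in by_rating_desc])
print([row["product"] for row in by_name])
```

Each row moves as a whole, so a product's rating always stays next to its name. The same should hold in a spreadsheet sort: whole rows are reordered, not just the one column.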
You may have noticed that the Sort & Filter options are grouped together in Excel, both in the first example and in the second. This makes it easy to also filter the dataset when needed.
Filtering in Excel is the process of displaying only the rows of data that meet specific conditions, while temporarily hiding the rest of the data.
In simple terms, filtering helps us focus on the information we are interested in without deleting or permanently removing any data.
For example:
- Showing only products with a rating above 4
- Viewing employees from a specific department
- Displaying sales made in a particular month
When a filter is applied, Excel does not remove any data. Instead, it hides the rows that do not match the selected criteria. This allows us to analyze specific parts of a dataset more efficiently.
Filtering is especially useful when working with large datasets where manually searching through rows would be time-consuming.
In the image below, we see how to find the filter function, just as we did with sort.
We can also filter by formatting the dataset as a table first. This way, we use the arrows in each column header to apply the preferred filter, as shown in the image below.
We can uncheck the boxes of the items we want to filter out so that only the rows we want to view remain.
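The idea that a filter hides rows without deleting them can be sketched in Python as follows. The rows are hypothetical, and the condition mirrors the "rating above 4" example.

```python
# Hypothetical product rows.
products = [
    {"product": "Kettle", "rating": 4.5},
    {"product": "Toaster", "rating": 3.8},
    {"product": "Blender", "rating": 4.9},
]

# Build a filtered view: only rows with a rating above 4.
high_rated = [row for row in products if row["rating"] > 4]

print([row["product"] for row in high_rated])
print(len(products))  # the original data is untouched
```

The filtered list is a separate view. The original `products` list still holds every row, just as clearing a filter in Excel brings the hidden rows back.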
With sorting and filtering, we can now explore our dataset from different perspectives and uncover meaningful insights more efficiently.
Basic Excel Formulas
As we have seen, Excel is not only a spreadsheet used to store data in rows and columns, but it is also a very powerful tool when it comes to manipulating data.
One of the best parts of Excel is its ability to perform computations on your data. This is possible because of the wide range of built-in formulas available in Excel. We are going to look at a few basic formulas such as SUM, AVERAGE, and COUNT.
SUM
To write any formula in Excel, we must start with an equals sign =.
This tells Excel that we want to perform a calculation in the selected cell.
SUM refers to the addition of two or more numbers.
For example: 2 + 2 = 4
In Excel, we can either type the values manually or use cell references.
- In a new cell, type:
=5 + 5
- Press Enter. This will return the result 10.
We can also use the SUM() function.
=SUM(5+5)
=SUM(5,5)
Both formulas return 10.
Using Cell References
This becomes more useful when working with data inside a spreadsheet.
=SUM(B2,F2)
This adds the values in cells B2 and F2.
Summing a Range of Cells
We can also calculate the total of a range of values across columns, rows, or both.
=SUM(B2:F4)
This calculates the total of all values within that rectangular range.
If you want to sum values vertically in a single column:
=SUM(B2:B6)
This adds all the values from B2 to B6.
With that we can now confidently use the SUM() function.
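To see what the three kinds of SUM add up, here is a small Python sketch. The grid values are invented, and the grid is assumed to cover cells B2 to F4 (three rows, five columns).

```python
# Hypothetical values for cells B2..F4: three rows, five columns (B to F).
grid = [
    [10, 20, 30, 40, 50],  # row 2: B2..F2
    [1, 2, 3, 4, 5],       # row 3: B3..F3
    [5, 5, 5, 5, 5],       # row 4: B4..F4
]

# Two individual cells, like =SUM(B2,F2).
two_cells = grid[0][0] + grid[0][4]

# A whole rectangular range, like =SUM(B2:F4).
whole_range = sum(sum(row) for row in grid)

# A single column read top to bottom, like =SUM(B2:B4).
column_b = sum(row[0] for row in grid)

print(two_cells, whole_range, column_b)
```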
AVERAGE() Function
AVERAGE is the sum of a group of numbers divided by the total count of those numbers. It represents the central or typical value in a given dataset.
In Excel, the AVERAGE() function calculates the mean of selected cells automatically.
Just like with the SUM() function, in the AVERAGE() function we can either type the values manually inside the formula or use cell references.
Example:
=AVERAGE(66,72,80,62,60)
This will return 68 as the average.
Using Cell Referencing
=AVERAGE(B2:B6)
This calculates the average of the values in cells B2 to B6.
The formula works both vertically (columns) and horizontally (rows).
Example:
=AVERAGE(B3:F3)
This would calculate the average of values across a row.
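The definition of the average (sum divided by count) can be checked by hand with a short Python sketch that reuses the five numbers from the example above.

```python
from statistics import mean

# The same five values used in the =AVERAGE(66,72,80,62,60) example.
scores = [66, 72, 80, 62, 60]

# Average by definition: total divided by how many values there are.
by_definition = sum(scores) / len(scores)

# The standard library gives the same result.
by_library = mean(scores)

print(by_definition, by_library)
```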
COUNT() Function
COUNT is used to count the number of cells that contain numeric values in a dataset.
In Excel, the COUNT() function counts only numbers. It ignores text and empty cells.
Example:
=COUNT(66,60,72,80,78)
This will return 5 because there are five numeric values.
Using Cell Referencing
Example:
=COUNT(B2:F2)
This counts how many cells in the range B2 to F2 contain numbers.
Important:
If a cell contains text, it will not be counted by COUNT().
(If you want to count all non-empty cells, you would use COUNTA() instead.)
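The difference between counting numbers only and counting every non-empty cell can be sketched in Python. The cell values below are hypothetical, with `None` standing in for an empty cell.

```python
# Hypothetical row of cells: three numbers, one text value, one empty cell.
cells = [66, "N/A", 72, None, 80]

# Count only numeric cells, similar in spirit to COUNT().
numeric_count = sum(1 for cell in cells if isinstance(cell, (int, float)))

# Count every non-empty cell, similar in spirit to COUNTA().
non_empty_count = sum(1 for cell in cells if cell is not None)

print(numeric_count, non_empty_count)
```

A gap between the two counts is a useful signal that a numeric column contains text entries that may need cleaning.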
MAX() Function
MAX returns the largest value in a selected range of numbers.
=MAX(66,60,72,80,78)
This will return 80 because it is the highest value.
Using Cell Referencing
=MAX(B2:F2)
This returns the largest number in cells B2 to F2.
MIN() Function
MIN returns the smallest value in a selected range of numbers.
=MIN(66,60,72,80,78)
This will return 60 because it is the smallest value.
Using Cell Referencing
=MIN(B2:F2)
This returns the smallest number in cells B2 to F2.
MEDIAN() Function
MEDIAN returns the middle value in a dataset when the numbers are arranged in order.
If there is:
- An odd number of values → it returns the middle number.
- An even number of values → it returns the average of the two middle numbers.
Example:
=MEDIAN(66,60,72,80,78)
This returns 72, the middle value once the numbers are sorted (60, 66, 72, 78, 80).
Using Cell Referencing
Example:
=MEDIAN(B2:F2)
This calculates the median of values in cells B2 to F2.
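The MAX, MIN, and MEDIAN results above can be reproduced with a short Python sketch using the same five numbers. The four-value list is an extra, made-up case to show the even-count rule.

```python
from statistics import median

# The same values used in the MAX, MIN, and MEDIAN examples.
values = [66, 60, 72, 80, 78]

largest = max(values)
smallest = min(values)
middle = median(values)            # odd count: the middle of the sorted values

# Even count: the median is the average of the two middle values.
even_values = [60, 66, 72, 78]
even_middle = median(even_values)  # (66 + 72) / 2

print(largest, smallest, middle, even_middle)
```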
Conclusion
Microsoft Excel remains one of the most accessible and powerful tools for data analysis. Without writing a single line of code, we can clean data, organize it, explore trends, and perform meaningful calculations using built-in functions.
In this article, we learned:
- What data analysis is
- Why Excel is still relevant in modern data workflows
- How to understand datasets in rows and columns
- Basic data cleaning techniques
- How to sort and filter data for better exploration
- Essential formulas such as SUM, AVERAGE, COUNT, MAX, MIN, and MEDIAN
These are foundational skills for anyone beginning their journey into data analysis.
As you grow more comfortable with Excel, you can explore more advanced tools such as Pivot Tables, conditional formatting, charts, and eventually transition into tools like SQL, Power BI, or Python.
Every data journey starts somewhere, and Excel is one of the best places to begin.
If you are learning data analysis, I will be sharing more beginner-friendly guides on Excel and on transitioning into tools like SQL and Python. Stay tuned.


























