DEV Community

Cover image for ANALYSTS TURNING MESSY DATA INTO ACTIONABLE INSIGHT
Shaban Ibrahim
Shaban Ibrahim

Posted on

ANALYSTS TURNING MESSY DATA INTO ACTIONABLE INSIGHT

Introduction

Have you ever come across a fancy, attractive and well-packaged dashboard in your company, media, internet or even some presentation? They might have been sales dashboards, Attrition dashboards, customer churn dashboard or even financial reporting dashboards. At first, you are curious to understand what the dashboard is trying to communicate to its users, and then you become more curious about how the dashboard was created and who created it.
As a junior data analyst or an enthusiast in the world of data, you are obviously curious and anxious to understand what the life of an analyst is, and their daily routine. Well, at least 75% of the work of a data analyst is to clean messy data.

Messy Data

It is normal and very okay for a data analyst to first deal with messy data before continuing with other analysis tasks. Thismight be messy in that it is full of inconsistencies, duplicates, incomplete or even spread across different sources that are not compatible. For example, you are dealing with sales figures from different departments, regions using different naming conventions, dates stored as texts, currencies not standardised, and a missing key field, such as ID. As an analyst, you need to address the messy data first for the accuracy of your analysis and to avoid wrong and misleading insights.

What to do as an analyst

As the data analyst that you are, you need to start by understanding the context of the data and the data itself. Understand the sources, the users and the question you are to answer as an analyst. Understanding some of these concepts will guide you in the process of cleaning your data for correct insights and accurate analysis. Note that not all inconsistencies should bother you; what matters most are those that will inform your final analysis and make it accurate.

PowerQuery

Now you might be asking yourself how data cleaning is done and how cumbersome the process is. One thing as an Analyst, you must have come across is PowerQuery. As a data analyst, understanding Power Query should be mandatory because that is where most of the data analyst's time is spent.
PowerQuery is so powerful in that it is used for:

1- Cleaning and Transforming of Data
It is a very powerful tool that is used for cleaning, transforming anf combining data in a structured and repeatable way.
It is at the PowerQuery where tasks such as removal of duplicates, fixing data types, handling missing values and standardisation of text fields are done. Here, you can also merge data from different sources.
On the same PowerQuery tasks, such as adding, removing columns and adding new columns to a data set are done. And the good thing about this is that every activity carried out within the PowerQuery is recorded for reference and also to easily track where a mistake was made during the cleaning and transformation stage.

2- Data Modelling
After the data has been cleaned, the data set now proceeds to the next stage of analysis, a very crucial phase for that matter, data modelling. Data Modelling in Power BI is where relationships are defined, especially when working with more than one data table. The data tables are organised in a way that clearly shows how the business operates. Here is where the analyst chooses between a star schema and a snowflake schema, but for simplicity and efficiency, most analysts prefer the former to the latter.
As an analyst, keep in mind that a well-defined data model is going to make analysis easier, faster and more reliable. So the analyst should be keen on the relationship direction, cardinality and granularity to ensure that the model answers the business question accurately.

3-Using DAX
DAX (Data Analysis Expression). It is like a formula language used in Power BI and Power Pivot to create calculations on your data, kind of like Excel formulas, but on steroids. It is used by analysts to turn structured data into meaningful metrics.

DAX is very useful when calculating measures such as total sales, growth rates, rolling averages, periodical performance comparison e.t.c
It also comes in handy when creating tables and columns in your PowerQuery

DAX for Calculated Measures

Total Sales = SUM(Sales[Amount])

Total Sales LY =
CALCULATE(
    [Total Sales],
    SAMEPERIODLASTYEAR(DimDate[Date])
)
Enter fullscreen mode Exit fullscreen mode
DAX for Calculated Column

Profit Status =
IF(
    Sales[Profit] > 0,
    "Profit",
    "Loss"
)
Enter fullscreen mode Exit fullscreen mode
DAX for Calculated Table

DimDate =
ADDCOLUMNS(
    CALENDAR (DATE(2018,1,1), DATE(2026,12,31)),
    "Year", YEAR([Date]),
    "Month", FORMAT([Date], "MMMM"),
    "Month No", MONTH([Date]),
    "YearMonth", FORMAT([Date], "YYYY-MM")
)
Enter fullscreen mode Exit fullscreen mode

DAX can be very powerful if the analyst understands the dynamism that exists between the row context and filter context in order to achieve accuracy and efficiency. DAX is more of a translation of business needs into calculations that update dynamically and consistently without breaking across the entire report.

4- Visualisation
After cleaning the data, modelling and using DAX to make sense out of your data, as an analyst, you take the next step of your analysis journey, which is visualisation. Here is where you design your dashboard into a communication tool that answers specific questions. The analyst is guided by the business need and the question he intends to answer using the dashboard.

A well-built and effective dashboard highlights Key Performance Indicators (KPI's), trends over time, and makes it easy to identify outliers. Here, the analyst should understand the type of visualisation to be used at a particular time to answer a specific question, for instance, line charts for trends, bar charts for comparisons, and tables for details.

5- Generating Insights
Analyst goes through the entire process, goes beyond dashboards to explain what the data is saying, answers the questions, identifies risks and opportunities, and makes recommendations.

Integrating dashboards into an analyst's daily workflows, team meetings, performance reviews and operational check-ins makes them more of an accountability tool rather than just a reporting tool.

Conclusion

Through careful data cleaning and transformation, modelling, relevant and accurate DAX, and intentional dashboard designs analyst transforms chaos into clarity, turning messy data into an insightful, well-organised dashboard. This helps inform the present and shape the future.
When done purposefully, the users see more answers and direction rather than seeing messy data, complex DAX formulas and technical models.

That is how PowerQuery give the data analysts the powers he enjoys.

Top comments (0)