<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: John Kyalo</title>
    <description>The latest articles on DEV Community by John Kyalo (@johnkyalo).</description>
    <link>https://dev.to/johnkyalo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1169784%2F76e547f9-cd15-42ea-bd5c-69c5eaf9d401.png</url>
      <title>DEV Community: John Kyalo</title>
      <link>https://dev.to/johnkyalo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/johnkyalo"/>
    <language>en</language>
    <item>
      <title>Time Intelligence Functions (DAX)</title>
      <dc:creator>John Kyalo</dc:creator>
      <pubDate>Wed, 09 Jul 2025 03:28:50 +0000</pubDate>
      <link>https://dev.to/johnkyalo/time-intelligence-functions-dax-5h6d</link>
      <guid>https://dev.to/johnkyalo/time-intelligence-functions-dax-5h6d</guid>
      <description>&lt;p&gt;Long time no Data chat around here.&lt;/p&gt;

&lt;p&gt;Let's talk about 3 crucial time intelligence functions in Power BI, namely DATEADD, DATESINPERIOD and DATESBETWEEN.&lt;/p&gt;

&lt;p&gt;Basically, they all shift or define a date range, especially for creating time-based comparisons like:&lt;br&gt;
     same period last year&lt;br&gt;
     trailing 12 months&lt;br&gt;
     custom ranges&lt;br&gt;
They differ in how they define the range and in the parameters you pass.&lt;/p&gt;

&lt;p&gt;Let's dive in:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DATEADD&lt;/strong&gt;&lt;br&gt;
Shifts the dates I have forward or backward in time by a specified number of intervals.&lt;br&gt;
I think of it as shifting the dates I have by a fixed number of units.&lt;/p&gt;

&lt;p&gt;It is best for YoY, QoQ and MoM calculations.&lt;/p&gt;
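
&lt;p&gt;As a quick sketch (the [Total Sales] measure and the 'Calendar' date table here are assumed names), a same-period-last-year measure could look like:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sales LY =&lt;br&gt;
CALCULATE(&lt;br&gt;
    [Total Sales],&lt;br&gt;
    DATEADD('Calendar'[Date], -1, YEAR)&lt;br&gt;
)&lt;/strong&gt;&lt;/p&gt;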

&lt;p&gt;&lt;strong&gt;DATESINPERIOD&lt;/strong&gt;&lt;br&gt;
Returns a continuous date range (before or after a reference date), going back or forward by a fixed number of intervals.&lt;br&gt;
I think of it as, "Give me a window of dates around a specific date."&lt;/p&gt;

&lt;p&gt;It is best for trailing periods (e.g. a rolling 12 months).&lt;/p&gt;
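
&lt;p&gt;As a sketch (again assuming a [Total Sales] measure and a 'Calendar' date table), a rolling 12 months measure could be:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sales Rolling 12M =&lt;br&gt;
CALCULATE(&lt;br&gt;
    [Total Sales],&lt;br&gt;
    DATESINPERIOD('Calendar'[Date], MAX('Calendar'[Date]), -12, MONTH)&lt;br&gt;
)&lt;/strong&gt;&lt;/p&gt;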

&lt;p&gt;&lt;strong&gt;DATESBETWEEN&lt;/strong&gt;&lt;br&gt;
Returns all dates between a start and an end date.&lt;br&gt;
I think of it as manually specifying the start and end of the range.&lt;/p&gt;

&lt;p&gt;It is best for exact range filtering:&lt;br&gt;
  YTD, QTD, MTD, or sales from Jan to March&lt;/p&gt;
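
&lt;p&gt;A minimal sketch (the dates and names are purely illustrative) for a fixed Jan to March window:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sales Jan to Mar =&lt;br&gt;
CALCULATE(&lt;br&gt;
    [Total Sales],&lt;br&gt;
    DATESBETWEEN('Calendar'[Date], DATE(2024, 1, 1), DATE(2024, 3, 31))&lt;br&gt;
)&lt;/strong&gt;&lt;/p&gt;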

&lt;p&gt;In summary, DATEADD is a shift while DATESINPERIOD &amp;amp; DATESBETWEEN are range selectors.&lt;/p&gt;

&lt;p&gt;A key thing to note:&lt;br&gt;
DATEADD needs a complete, gap-free date table, as it is not gap-tolerant. Issues like this are why we also choose to have a dedicated Dim Calendar table.&lt;/p&gt;

&lt;p&gt;That's it for today and see you on the next one. &lt;/p&gt;

&lt;p&gt;&lt;em&gt;Anything &amp;amp; Everything Data&lt;/em&gt;&lt;/p&gt;

</description>
      <category>powerplatform</category>
      <category>data</category>
      <category>dax</category>
    </item>
    <item>
      <title>SWITCH SELECTION IN POWER BI DAX</title>
      <dc:creator>John Kyalo</dc:creator>
      <pubDate>Tue, 04 Feb 2025 03:54:57 +0000</pubDate>
      <link>https://dev.to/johnkyalo/switch-selection-in-power-bi-dax-5ai3</link>
      <guid>https://dev.to/johnkyalo/switch-selection-in-power-bi-dax-5ai3</guid>
      <description>&lt;p&gt;After a sort of an hiatus, okay not really a planned one, I am here to talk about the switch selection in Power BI Dax.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Here is the scenario...&lt;/strong&gt;&lt;br&gt;
You have calculated many measures, and you want to understand the %Achieved for each.&lt;br&gt;
The traditional approach would be to first create a target measure for every measure you have, then subsequently create a matching %Achieved measure for each one, where %Achieved = DIVIDE([achieved],[target])&lt;/p&gt;

&lt;p&gt;Hold on, that sounds like a long route, and it is the traditional approach to providing solutions. What if I told you there is a dynamic approach that can be done in a few steps?&lt;/p&gt;

&lt;p&gt;Here are the steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Get a disconnected table with all indicators that you are tracking.&lt;/li&gt;
&lt;li&gt;Establish a relationship between the table above and your targets table, probably using the Indicator_id.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Now proceed to create just a single targets measure by declaring a variable for the selected indicator. See the example below:&lt;br&gt;
 &lt;strong&gt;VAR SelInd = SELECTEDVALUE(disc[Indicator_id])&lt;br&gt;
 RETURN&lt;br&gt;
     CALCULATE(&lt;br&gt;
          SUM(Targets[Target_value]),&lt;br&gt;
          Targets[Indicator_id] = SelInd&lt;br&gt;
     )&lt;/strong&gt;&lt;br&gt;
The measure above basically looks for the selected indicator and retrieves the target for that particular indicator only (a dynamic approach).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Now our many measures also need to be wired into switch logic that matches the Indicator_id in the disconnected table.&lt;br&gt;
This way you only have one measure that contains the achieved values for all the indicators. It is this measure that we use to get our %Achieved.&lt;br&gt;
 &lt;em&gt;Basically a switch based on the selected value in the disc table&lt;/em&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Now that you have your two measures, achieved and target, go ahead and calculate the %Achieved:&lt;br&gt;
%Achieved = DIVIDE([Achieved], [Targets])&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
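
&lt;p&gt;For step 4, a sketch of the switch logic could be as below (the indicator measures and the Indicator_id values are assumed for illustration):&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Achieved =&lt;br&gt;
SWITCH(&lt;br&gt;
    SELECTEDVALUE(disc[Indicator_id]),&lt;br&gt;
    1, [Measure One],&lt;br&gt;
    2, [Measure Two],&lt;br&gt;
    BLANK()&lt;br&gt;
)&lt;/strong&gt;&lt;/p&gt;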

&lt;p&gt;&lt;em&gt;Voila, use the above in your visuals.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;key:&lt;/strong&gt;&lt;br&gt;
The relationship between the disconnected table and the Targets table allows you to dynamically fetch the target values for the selected indicator.&lt;/p&gt;

&lt;p&gt;The switch statement in the achieved measure dynamically calculates the achieved value based on the selected indicator.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Remember, &lt;br&gt;
    Anything and Everything Data&lt;/em&gt;&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>powerplatform</category>
      <category>powerapps</category>
    </item>
    <item>
      <title>Calendar in DAX, Power BI</title>
      <dc:creator>John Kyalo</dc:creator>
      <pubDate>Tue, 12 Nov 2024 03:28:38 +0000</pubDate>
      <link>https://dev.to/johnkyalo/calendar-in-dax-power-bi-50l0</link>
      <guid>https://dev.to/johnkyalo/calendar-in-dax-power-bi-50l0</guid>
      <description>&lt;p&gt;As a Power BI developer, more than often you'll likely need to track trends over time. It goes without saying that at some point you will be engaging with a calendar.&lt;/p&gt;

&lt;p&gt;In DAX, you can easily create a calendar table with all the features of a calendar you need, be it quarters in a year or month numbers.&lt;/p&gt;

&lt;p&gt;We will explore creating a calendar table:&lt;/p&gt;

&lt;p&gt;The first step is specifying the range of dates you want your calendar to contain, and you can hold that range in variables. You will need a calendar that includes all the dates present in your data.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;dimCalendar = &lt;br&gt;
VAR Mindate = MIN(Expenditure[Date])&lt;br&gt;
VAR Maxdate = TODAY()&lt;br&gt;
RETURN&lt;br&gt;
ADDCOLUMNS(&lt;br&gt;
CALENDAR(Mindate, Maxdate),&lt;br&gt;
"YEAR", YEAR([Date]),&lt;br&gt;
"MONTH", MONTH([Date]),&lt;br&gt;
"DAY", DAY([Date]),&lt;br&gt;
"MONTHNAME", FORMAT([Date], "MMM"),&lt;br&gt;
"QUARTER", "Q" &amp;amp; QUARTER([Date])&lt;br&gt;
)&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The above is a simple calendar that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Defines the dates you need in your calendar table as variables.&lt;/li&gt;
&lt;li&gt;Uses ADDCOLUMNS on a CALENDAR that spans the Mindate and the Maxdate.&lt;/li&gt;
&lt;li&gt;Specifies the columns as needed. Whatever is in quotation marks is the name of the column, followed by the expression that returns its value.&lt;/li&gt;
&lt;li&gt;The FORMAT in the MONTHNAME column ensures we return our months as three-letter abbreviations like 'Jan' and 'Feb'.&lt;/li&gt;
&lt;li&gt;For the QUARTER column, the additional "Q" ensures quarter names begin with the prefix Q. This way they make much more sense.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There you go, try out this simple calendar and use it in your reports.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Anything and Everything Data!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>powerplatform</category>
      <category>data</category>
      <category>analyst</category>
      <category>datascience</category>
    </item>
    <item>
      <title>ALWAYS A DATA NERD</title>
      <dc:creator>John Kyalo</dc:creator>
      <pubDate>Tue, 05 Nov 2024 20:18:59 +0000</pubDate>
      <link>https://dev.to/johnkyalo/always-a-data-nerd-184h</link>
      <guid>https://dev.to/johnkyalo/always-a-data-nerd-184h</guid>
      <description>&lt;p&gt;Been a long way since I connected with you all.&lt;br&gt;
Actually, the last time we did it was over some SQL...Yea, SQL is and has been great, coming in handy for data nerds dealing with end-to-end projects...&lt;/p&gt;

&lt;p&gt;Ever since, I have dived into the exciting &lt;strong&gt;Power BI&lt;/strong&gt;, the best visualization tool in my opinion so far, and it has been nothing short of a learning experience...&lt;br&gt;
After a month of learning, and another month of using it on a daily basis, I have done quite a lot with this amazing tool.&lt;/p&gt;

&lt;p&gt;Over the next few articles, I will share with you some of the concepts that I have come to appreciate in Power BI, helping to bring out the best analysis of data. Think DAX, data modelling, great visuals and so much other stuff...&lt;br&gt;
Hang in there and let's get into this ride together! &lt;/p&gt;

&lt;p&gt;Remember, &lt;/p&gt;

&lt;p&gt;&lt;em&gt;Anything and Everything Data&lt;/em&gt;&lt;/p&gt;

</description>
      <category>powerfuldevs</category>
      <category>data</category>
      <category>analytics</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Window Functions in SQL</title>
      <dc:creator>John Kyalo</dc:creator>
      <pubDate>Sun, 21 Jul 2024 16:09:36 +0000</pubDate>
      <link>https://dev.to/johnkyalo/window-functions-in-sql-47id</link>
      <guid>https://dev.to/johnkyalo/window-functions-in-sql-47id</guid>
      <description>&lt;p&gt;Just another time nerds, to break down Window functions to you in the simplest form ever.&lt;br&gt;
Simply window functions allow you to perform calculations across a set of rows(a window) while still retaining access to individual rows.&lt;br&gt;
You can use these functions to perform running calculations.&lt;br&gt;
They include:&lt;br&gt;
RANK(), ROW_NUMBER(), DENSE_RANK(), LEAD(), LAG() and even other more aggregate functions.&lt;/p&gt;

&lt;p&gt;For the ranking functions:&lt;br&gt;
ROW_NUMBER(): numbers all rows sequentially&lt;br&gt;
RANK(): uses the same numeric value for rows that tie,&lt;br&gt;
        then skips the next value&lt;br&gt;
DENSE_RANK(): uses the same numeric value for rows that tie,&lt;br&gt;
              but does not leave any gap in the values.&lt;/p&gt;

&lt;p&gt;LEAD() allows access to rows after the current row, while LAG() allows access to rows before the current row.&lt;/p&gt;

&lt;p&gt;Syntax:&lt;br&gt;
&lt;strong&gt;RANK() OVER(PARTITION BY... ORDER BY... )&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;PARTITION BY divides the result set into partitions.&lt;br&gt;
The ranking gets applied to each partition separately.&lt;/p&gt;
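
&lt;p&gt;A minimal sketch (the employees table and its columns are assumed):&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;SELECT emp_id, dpt_id, salary,&lt;br&gt;
RANK() OVER(PARTITION BY dpt_id ORDER BY salary DESC) AS salary_rank&lt;br&gt;
FROM employees;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Each department gets its own ranking that restarts at 1, while every individual row remains visible.&lt;/p&gt;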

&lt;p&gt;&lt;em&gt;This will help you get started with window functions&lt;/em&gt;&lt;br&gt;
&lt;strong&gt;Anything and Everything Data&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>sql</category>
      <category>datascience</category>
      <category>database</category>
      <category>data</category>
    </item>
    <item>
      <title>Common Table Expressions (CTEs) in SQL</title>
      <dc:creator>John Kyalo</dc:creator>
      <pubDate>Sun, 05 May 2024 19:35:23 +0000</pubDate>
      <link>https://dev.to/johnkyalo/common-table-expressions-ctes-in-sql-5ba1</link>
      <guid>https://dev.to/johnkyalo/common-table-expressions-ctes-in-sql-5ba1</guid>
      <description>&lt;p&gt;Let me take you through an advanced yet easy concept to grasp in SQL.&lt;br&gt;
You have probably already dealt with sub-queries in SQL. If so, then this is no difference.&lt;br&gt;
A CTE is basically a named temporary result set used within a larger SQL statement.&lt;/p&gt;

&lt;p&gt;Similar to a subquery also known as a nested query, CTEs are useful for breaking down complex queries into more manageable parts to improve code readability.&lt;br&gt;
Think of it as a better way to organize longer queries.&lt;/p&gt;

&lt;p&gt;Having known that, let's go through a CTE example.&lt;br&gt;
First things first, the syntax for a CTE statement is:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;WITH cte_xxxx&lt;br&gt;
AS (temporary query)&lt;br&gt;
followed by the main query&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A point to note: you should always run the two together because, as the name suggests, a temporary result set is not saved anywhere.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;WITH cte_employees&lt;br&gt;
AS (&lt;br&gt;
SELECT emp_id, first_name, last_name, dpt_id, dpt_name&lt;br&gt;
FROM employees)&lt;br&gt;
SELECT * FROM cte_employees&lt;br&gt;
WHERE dpt_id = 2;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The main query selects data from our CTE, allowing easy retrieval of information specifically related to department 2.&lt;/p&gt;

&lt;p&gt;Always treat a CTE query like any other query...Go ahead and perform joins and aggregate functions in a CTE.&lt;br&gt;
In the event of multiple CTEs, include them all in the same WITH statement, separated by commas.&lt;/p&gt;
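
&lt;p&gt;A sketch of multiple CTEs in one WITH statement (the tables and columns are assumed):&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;WITH cte_employees AS (&lt;br&gt;
SELECT emp_id, dpt_id FROM employees),&lt;br&gt;
cte_departments AS (&lt;br&gt;
SELECT dpt_id, dpt_name FROM departments)&lt;br&gt;
SELECT e.emp_id, d.dpt_name&lt;br&gt;
FROM cte_employees e&lt;br&gt;
JOIN cte_departments d ON e.dpt_id = d.dpt_id;&lt;/p&gt;
&lt;/blockquote&gt;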

&lt;p&gt;Happy querying SQL nerds&lt;/p&gt;

</description>
      <category>database</category>
      <category>datascience</category>
      <category>sql</category>
      <category>analytics</category>
    </item>
    <item>
      <title>JOINS IN SQL</title>
      <dc:creator>John Kyalo</dc:creator>
      <pubDate>Sun, 28 Apr 2024 19:08:20 +0000</pubDate>
      <link>https://dev.to/johnkyalo/joins-in-sql-4en9</link>
      <guid>https://dev.to/johnkyalo/joins-in-sql-4en9</guid>
      <description>&lt;p&gt;&lt;em&gt;Being a SQL nerd you've probably heard of joins.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;SQL JOINS combine rows from different tables on related columns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;INNER JOIN:&lt;/strong&gt;&lt;br&gt;
This being the most common type of join, an inner join combines rows from two tables based on a related column between them: basically, a field that exists in both tables.&lt;br&gt;
Sometimes you can just write it as JOIN.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LEFT JOIN:&lt;/strong&gt;&lt;br&gt;
Retrieves all records from the left table with the corresponding matches from the right table. If there is no match, NULL values are returned for the columns from the right table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RIGHT JOIN:&lt;/strong&gt;&lt;br&gt;
Look at this as the opposite of a left join. It retrieves all records from the right table with the corresponding matches from the left table. Again, if no match is found, NULL values are returned.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;FULL OUTER JOIN:&lt;/strong&gt;&lt;br&gt;
This type of join returns all records from both tables regardless of a match or not.&lt;br&gt;
For cases where there is no match, NULL values as well are returned for the columns without a corresponding row.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CROSS JOIN:&lt;/strong&gt;&lt;br&gt;
I usually think of it as a special type of JOIN...no condition imposed (no ON clause).&lt;br&gt;
It returns the Cartesian product of two tables, combining each row from the first table with each row from the other table.&lt;/p&gt;
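
&lt;p&gt;As a quick sketch (the employees and departments tables are assumed), the joins above all share one shape:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;SELECT e.first_name, d.dpt_name&lt;br&gt;
FROM employees e&lt;br&gt;
LEFT JOIN departments d ON e.dpt_id = d.dpt_id;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Swap LEFT JOIN for INNER JOIN, RIGHT JOIN, FULL OUTER JOIN or CROSS JOIN (dropping the ON clause for CROSS JOIN) to get the behaviours described above.&lt;/p&gt;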

&lt;p&gt;While on JOINS, there are two operators that come up just as often:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;UNION VS UNION ALL&lt;/em&gt;&lt;br&gt;
Both are used to combine results from two or more queries into a single result set.&lt;br&gt;
UNION automatically removes the duplicates that arise between the queries.&lt;br&gt;
The idea is to generate a unique result set.&lt;br&gt;
Unlike UNION, UNION ALL does not remove duplicates. It simply combines all results, including duplicates if any.&lt;br&gt;
This means UNION ALL is faster than UNION, as it does not need to check for and eliminate duplicates.&lt;/p&gt;
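
&lt;p&gt;A quick sketch (the tables and the shared city column are assumed):&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;SELECT city FROM customers&lt;br&gt;
UNION&lt;br&gt;
SELECT city FROM suppliers;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Replace UNION with UNION ALL to keep the duplicate cities.&lt;/p&gt;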

&lt;p&gt;&lt;em&gt;Happy querying!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>database</category>
      <category>sql</category>
      <category>data</category>
    </item>
    <item>
      <title>GETTING STARTED WITH THE SQL JOURNEY</title>
      <dc:creator>John Kyalo</dc:creator>
      <pubDate>Sat, 20 Jan 2024 20:28:19 +0000</pubDate>
      <link>https://dev.to/johnkyalo/getting-started-with-the-sql-journey-3l8n</link>
      <guid>https://dev.to/johnkyalo/getting-started-with-the-sql-journey-3l8n</guid>
      <description>&lt;p&gt;Structured query language is it, and basically, it is the language for communicating with Relational databases such as MySQL and SQLSERVER.&lt;br&gt;
With its ease syntax it gets no minute into understanding the fundamentals. In the past 2 weeks I have had a look or rather a deep dive into the introduction of SQL and quite a lot has been helpful in the learning.&lt;br&gt;
I installed MySQL and chose Popsicle as the text editor to write the queries...It is actually a cool tool to use. I went from getting to &lt;strong&gt;CREATE&lt;/strong&gt; a database to actually querying every bit of its data. Of course, &lt;strong&gt;tables&lt;/strong&gt; are involved, structured into &lt;strong&gt;columns&lt;/strong&gt; and &lt;strong&gt;rows&lt;/strong&gt;.&lt;br&gt;
Constraints in SQL, got to see how you can specify rules for the data that goes in my tables. Some of the common constraints include:&lt;br&gt;
&lt;strong&gt;PRIMARY KEY&lt;/strong&gt; - unique value that identifies each row in a table&lt;br&gt;
&lt;strong&gt;FOREIGN KEY&lt;/strong&gt; - basically a primary key in another table. You refer to it by the term REFERENCE.&lt;br&gt;
&lt;strong&gt;NOT NULL&lt;/strong&gt; - can't leave an empty value for your columns.&lt;br&gt;
&lt;strong&gt;UNIQUE&lt;/strong&gt; - a different value that doesn't match any other in that particular column.&lt;br&gt;
&lt;strong&gt;DEFAULT&lt;/strong&gt; - set a default value for a column if no value specified.&lt;/p&gt;
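
&lt;p&gt;Pulling those constraints together, a sketch table definition (the names are purely illustrative) could be:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;CREATE TABLE employees (&lt;br&gt;
emp_id INT PRIMARY KEY,&lt;br&gt;
email VARCHAR(100) UNIQUE,&lt;br&gt;
first_name VARCHAR(50) NOT NULL,&lt;br&gt;
status VARCHAR(20) DEFAULT 'active',&lt;br&gt;
dpt_id INT,&lt;br&gt;
FOREIGN KEY (dpt_id) REFERENCES departments(dpt_id)&lt;br&gt;
);&lt;/p&gt;
&lt;/blockquote&gt;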

&lt;p&gt;Moving on, I got to interact with a couple of SQL queries. These range from:&lt;br&gt;
&lt;strong&gt;INSERT INTO&lt;/strong&gt; - feeds data into tables.&lt;br&gt;
&lt;strong&gt;SELECT FROM&lt;/strong&gt; - retrieves data from tables.&lt;br&gt;
&lt;strong&gt;UPDATE&lt;/strong&gt; - modifies data in a table.&lt;br&gt;
&lt;strong&gt;DROP&lt;/strong&gt; - removes an entire table or database.&lt;br&gt;
&lt;strong&gt;DELETE FROM&lt;/strong&gt; - removes rows from a table.&lt;br&gt;
and so forth. &lt;br&gt;
A lot of SQL, from the word go, I have realized is all about the &lt;strong&gt;keywords&lt;/strong&gt;: understand your tables, or rather your data, then fetch it using the appropriate keyword.&lt;br&gt;
I also got to use most of the keywords, which include:&lt;br&gt;
&lt;strong&gt;WHERE&lt;/strong&gt;: filters rows by a condition&lt;br&gt;
&lt;strong&gt;ORDER BY&lt;/strong&gt;: does sorting, whether in ascending or descending order.&lt;br&gt;
&lt;strong&gt;GROUP BY&lt;/strong&gt;: groups rows with the same values into summary rows.&lt;br&gt;
&lt;strong&gt;AS&lt;/strong&gt;: for aliases&lt;br&gt;
&lt;strong&gt;DISTINCT&lt;/strong&gt;: all different values, what actually exists&lt;/p&gt;

&lt;p&gt;Functions too, especially aggregate functions such as &lt;strong&gt;COUNT()&lt;/strong&gt;, &lt;strong&gt;SUM()&lt;/strong&gt;, &lt;strong&gt;AVG()&lt;/strong&gt;, &lt;strong&gt;MAX()&lt;/strong&gt; and &lt;strong&gt;MIN()&lt;/strong&gt;.&lt;/p&gt;
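
&lt;p&gt;Combining those keywords and aggregate functions, a sketch query (the employees table and its columns are assumed) might read:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;SELECT dpt_id, COUNT(*) AS staff_count, AVG(salary) AS avg_salary&lt;br&gt;
FROM employees&lt;br&gt;
WHERE status = 'active'&lt;br&gt;
GROUP BY dpt_id&lt;br&gt;
ORDER BY avg_salary DESC;&lt;/p&gt;
&lt;/blockquote&gt;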

&lt;p&gt;Moving to JOINS, UNIONS and NESTED queries, SQL gets even more interesting for a fact.&lt;br&gt;&lt;br&gt;
Getting to all of this, and even more which I left out, was really appealing. Pulling whatever information you require from modelled data is actually satisfying.&lt;/p&gt;

&lt;p&gt;I now keep on getting my hands dirty and uncovering more of SQL. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Datanerd&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;em&gt;&lt;strong&gt;DATA ANALYST&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>luxtech</category>
      <category>dsea</category>
      <category>datascience</category>
      <category>data</category>
    </item>
    <item>
      <title>The Complete Guide to Time Series Models</title>
      <dc:creator>John Kyalo</dc:creator>
      <pubDate>Sun, 22 Oct 2023 15:08:08 +0000</pubDate>
      <link>https://dev.to/johnkyalo/the-complete-guide-to-time-series-models-o6g</link>
      <guid>https://dev.to/johnkyalo/the-complete-guide-to-time-series-models-o6g</guid>
      <description>&lt;p&gt;Time is the most important factor which ensures success in a business. It is difficult to keep up with the pace of time but through methods of prediction and forecasting such as time series models this is achievable.&lt;br&gt;
It involves working on time-based data to derive hidden insights to make informed decision making. The goal of a time series model is usually to make a forecast for the future.&lt;br&gt;
To be able to come up with a time series model, certain aspects of a time series are considered such as: &lt;br&gt;
Is it &lt;strong&gt;stationery&lt;/strong&gt;?&lt;br&gt;
Is there a &lt;strong&gt;seasonality&lt;/strong&gt;?&lt;br&gt;
Is the target variable &lt;strong&gt;autocorrelated&lt;/strong&gt;?&lt;/p&gt;

&lt;p&gt;Moving on, some of the ways to build a model include:&lt;br&gt;
&lt;strong&gt;Moving average&lt;/strong&gt; - a model that states the next observation is simply the mean of all the past observations.&lt;br&gt;
&lt;strong&gt;Exponential smoothing&lt;/strong&gt; - uses similar logic to the moving average, but exponentially less importance is assigned to each older observation.&lt;br&gt;
&lt;strong&gt;Double exponential smoothing&lt;/strong&gt; - used when there is a trend in the time series.&lt;br&gt;
&lt;strong&gt;Triple exponential smoothing&lt;/strong&gt; - extends double exponential smoothing by adding a seasonal smoothing factor. Of course, this is useful if you notice seasonality in your time series.&lt;br&gt;
&lt;strong&gt;Seasonal autoregressive integrated moving average (SARIMA)&lt;/strong&gt;&lt;br&gt;
SARIMA is a combination of simpler models that together create a complex model able to represent a time series exhibiting non-stationary properties and seasonality.&lt;br&gt;
It involves:&lt;br&gt;
the autoregression model AR(p), basically a regression of the time series onto itself,&lt;br&gt;
the moving average model MA(q),&lt;br&gt;
the order of integration,&lt;br&gt;
and finally seasonality S, where S is the season's length.&lt;br&gt;
All of this combines to give SARIMA.&lt;/p&gt;

&lt;p&gt;To put time series models to use, they are applied in:&lt;br&gt;
1) Determining patterns.&lt;br&gt;
2) Detecting anomalies.&lt;br&gt;
3) Forecasting future trends.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Exploratory Data Analysis using Data Visualization Techniques</title>
      <dc:creator>John Kyalo</dc:creator>
      <pubDate>Sun, 08 Oct 2023 08:29:28 +0000</pubDate>
      <link>https://dev.to/johnkyalo/exploratory-data-analysis-using-data-visualization-techniques-2ok5</link>
      <guid>https://dev.to/johnkyalo/exploratory-data-analysis-using-data-visualization-techniques-2ok5</guid>
      <description>&lt;p&gt;Exploratory Data Analysis involves initial investigation and examination of datasets to summarize the main characteristics often with the help of graphical representations. &lt;br&gt;
Basically, EDA helps you get a feel of the data you are working with. This is by identifying the structure and patterns involved. EDA helps generate hypotheses about relationships and trends in data which guide further analysis.&lt;br&gt;
EDA with data visualization involves creating various plots and charts, such as &lt;strong&gt;histograms&lt;/strong&gt;, &lt;strong&gt;box plots&lt;/strong&gt;, &lt;strong&gt;scatter plots&lt;/strong&gt;, &lt;strong&gt;bar charts&lt;/strong&gt; and &lt;strong&gt;heatmaps&lt;/strong&gt;, to visualize the distribution and relationship within the data. With this, you are able to uncover styles, pick out relationships, and gain insights. In case of anomalies, you also get to identify them.&lt;br&gt;
Mostly &lt;strong&gt;Matplotlib&lt;/strong&gt; and &lt;strong&gt;Seaborn&lt;/strong&gt; are &lt;strong&gt;Python&lt;/strong&gt; libraries used for visualization. There exist various sorts of EDA strategies hired depending on the nature of records and desires of evaluation. This includes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Univariate analysis&lt;/strong&gt; - focuses on analyzing individual variables in the data set. It involves visualizing a single variable at a time to understand its distribution. 
Examples include:
&lt;strong&gt;Histogram&lt;/strong&gt; - displays the distribution of a single numerical variable. Useful for understanding the data's central tendency.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bivariate analysis&lt;/strong&gt; - from the prefix bi, meaning two, you explore two variables by finding their correlations, associations and dependencies.
Examples include:
&lt;strong&gt;Scatter plot&lt;/strong&gt; - which explores the relationship between two variables.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multivariate analysis&lt;/strong&gt; - extends bivariate analysis to encompass more variables. It aims to understand the complex interactions and dependencies among the many variables in a data set.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;More different plots which are also considered as techniques include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Box plots (Box and Whisker)&lt;/strong&gt; which provide a visual summary of the distribution of data. Good for identifying outliers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Line plots&lt;/strong&gt; used for time-series data. They show how a variable changes over time, identifying trends.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Bar charts&lt;/strong&gt; mainly used for comparison of categorical data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Heatmaps&lt;/strong&gt; which visualize the correlation matrix of numerical variables. They use color intensity to represent the strength of correlations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Pie charts&lt;/strong&gt; which show the composition of a categorical variable.&lt;br&gt;
Many others include: &lt;strong&gt;pair plots&lt;/strong&gt;, &lt;strong&gt;violin plots&lt;/strong&gt;, &lt;strong&gt;density plots&lt;/strong&gt;, &lt;strong&gt;word clouds&lt;/strong&gt; ...&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All these visualization techniques can be applied in &lt;strong&gt;Python&lt;/strong&gt; as well as in other visualization tools such as &lt;strong&gt;Excel&lt;/strong&gt;, &lt;strong&gt;Power BI&lt;/strong&gt; and &lt;strong&gt;Tableau&lt;/strong&gt;. The choice of tool depends on the user's interest and what they would like to achieve. &lt;/p&gt;

&lt;p&gt;Therefore, EDA through visualization is a key step in the data analysis process that helps leverage insights into data understanding, resulting in appropriate business decision making.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;It's Data Allday Everyday&lt;/em&gt;&lt;/p&gt;

</description>
      <category>lux</category>
      <category>dsea</category>
      <category>datascience</category>
      <category>data</category>
    </item>
    <item>
      <title>Data Science for Beginners:2023-2024 Complete Roadmap</title>
      <dc:creator>John Kyalo</dc:creator>
      <pubDate>Tue, 26 Sep 2023 11:33:13 +0000</pubDate>
      <link>https://dev.to/johnkyalo/data-science-for-beginners2023-2024-complete-roadmap-1993</link>
      <guid>https://dev.to/johnkyalo/data-science-for-beginners2023-2024-complete-roadmap-1993</guid>
      <description>&lt;p&gt;Data Science is a rapidly growing field that combines statistics, mathematics, computer science and domain knowledge to extract insights from data. As a data scientist, skills are purposeful to solve complex problems and make better decisions in a variety of Industries.&lt;/p&gt;

&lt;p&gt;To learn, it is very key to start with the basics. This includes learning about the different types of data, how to collect and clean data, and how to use programming languages to analyze data.&lt;br&gt;
To know how to handle data, it's advisable to begin with what data analysts use a lot, i.e. &lt;strong&gt;understanding Excel&lt;/strong&gt;, &lt;strong&gt;spreadsheets&lt;/strong&gt; and &lt;strong&gt;SQL&lt;/strong&gt;, the Structured Query Language for managing and manipulating data stored in databases.&lt;/p&gt;

&lt;p&gt;The essential skills that you then build on include learning a programming language, in this case &lt;strong&gt;Python&lt;/strong&gt; or &lt;strong&gt;R&lt;/strong&gt;. Python is a general-purpose language well known for its simplicity, whereas R is a statistical language.&lt;/p&gt;

&lt;p&gt;From there, a visualization tool would help. Besides working with Python or R for data manipulation, adding tools such as &lt;strong&gt;Tableau&lt;/strong&gt; and &lt;strong&gt;Power BI&lt;/strong&gt; is essential. These tools allow you to create interactive and visually appealing charts, graphs and dashboards to communicate your data insights.&lt;/p&gt;

&lt;p&gt;Deeper into data science, after getting comfortable with the essentials, is when you dive into statistics, as this is crucial for data mining. Here you get to learn about machine learning algorithms and how to build predictive models. Basically, machine learning allows computers to learn from data without being explicitly programmed.&lt;/p&gt;

&lt;p&gt;Advanced topics of a data Science career would include:&lt;br&gt;
&lt;strong&gt;Deep Learning&lt;/strong&gt;, &lt;strong&gt;Big data&lt;/strong&gt; and &lt;strong&gt;Natural Language Processing&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Gaining hands-on experience with data science projects is the way to get to learn about this exciting field&lt;/em&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;IT'S DATA ALLDAY, EVERYDAY&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>lux</category>
      <category>python</category>
      <category>datascience</category>
      <category>luxacademy</category>
    </item>
  </channel>
</rss>
