Jedidah Ondiso

Posted on Mar 2

JOINS AND WINDOWS FUNCTIONS IN SQL

#sql #database #beginners #tutorial

STRUCTURED QUERY LANGUAGE

SQL (Structured Query Language) is the standard programming language designed for managing, manipulating, and retrieving data stored in relational databases. Developed in the 1970s, it is used to interact with database systems to perform tasks such as updating records, deleting data, creating new tables, and managing user permissions.

JOINS FUNCTION
JOIN is a clause used to combine rows from two or more tables in a relational database, based on a related column (or join key) between them. This is a fundamental operation in SQL that allows data to be retrieved from multiple, logically related tables as a single, unified result set.

TYPES OF JOINS
1. Inner Join
Inner join statement joins two tables based on a common column and selects rows that have matching values in these columns.

-- join Customers and Orders tables with their matching fields customer_id

SELECT Customers.customer_id, Orders.item
FROM Customers
INNER JOIN Orders
ON Customers.customer_id = Orders.customer_id;

2. Right Join
Right join statement joins two tables based on a common column. It selects records that have matching values in these columns and the remaining rows from the right table.

SELECT Customers.customer_id, Customers.first_name, Orders.item
FROM Customers
RIGHT JOIN Orders
ON Customers.customer_id = Orders.customer_id;

3. Left Join
Left join statement combines two tables based on a common column. It then selects records having matching values in these columns and the remaining rows from the left table.

SELECT Customers.customer_id, Customers.first_name, Orders.item
FROM Customers
LEFT JOIN Orders
ON Customers.customer_id = Orders.customer_id;

4. Full Outer Join
Full outer join statement joins two tables based on a common column. It selects records that have matching values in these columns and the remaining rows from both of the tables.

SELECT Customers.customer_id, Customers.first_name, Orders.item
FROM Customers
FULL OUTER JOIN Orders
ON Customers.customer_id = Orders.customer_id;

5. Cross Join
Cross join statement return the Cartesian product of rows from the tables in the join. This is the result of combining each row from one table to each row of the second table.

SELECT * 
FROM Customers
CROSS JOIN Orders;

The query combines each row of the Customers table with each row of the Orders table.

Joins summary in a diagramn

WINDOWS FUNCTION

A window function in SQL is a type of function that performs a calculation across a specific set of rows (the 'window' in question), defined by an OVER() clause.

Window functions use values from one or multiple rows to return a value for each row, which makes them different from traditional aggregate functions, which return a single value for multiple rows.

SELECT column_1, column_2, column_3, function()
OVER (PARTITION BY partition_expression ORDER BY order_expression) as output_column_name
FROM table_name;

In this syntax:

The SELECT clause defines the columns you want to select from the table_name table.
Function() is the window function you want to use.
The OVER clause defines the partitioning and ordering of rows in the window.
The PARTITION BY clause divides rows into partitions based on the specified partition_expression; if the partition_expression is not specified, the result set will be treated as a single partition.
The ORDER BY clause uses the specified order_expression to define the order in which rows will be processed within each partition; if the order_expression is not specified, rows will be processed in an undefined order.

Finally, output_column_name is the name of your output column.
These are the key SQL window function components. One more thing worth mentioning is that window functions are applied after the processing of WHERE, GROUP BY, and HAVING clauses. This means you can use the output of your window functions in subsequent clauses of your queries.

The OVER() clause
Is essentially the core of window functions. It determines the partitioning and ordering of a rowset before the associated window function is applied. The OVER() clause can be applied with functions to compute aggregated values such as moving averages, running totals, cumulative aggregates, or top N per group results.

The PARTITION BY clause
Is used to partition the rows of a table into groups. This comes in handy when dealing with large datasets that need to be split into smaller parts, which are easier to manage. PARTITION BY is always used inside the OVER() clause; if it is omitted, the entire table is treated as a single partition.

The ORDER BY clause
It determines the order of rows within a partition; if it is omitted, the order is undefined. For instance, when it comes to ranking functions, ORDER BY specifies the order in which ranks are assigned to rows.

Frame specification
In the same OVER() clause, you can specify the upper and lower bounds of a window frame using one of the two subclauses, ROWS or RANGE. The basic syntax for both of these subclauses is essentially the same:

ROWS BETWEEN lower_bound AND upper_bound
RANGE BETWEEN lower_bound AND upper_bound

And in some cases, they might even return the same result. However, there's an important difference.

In the ROWS subclause, the frame is defined by beginning and ending row positions. Offsets are differences in row numbers from the current row number.

As opposed to that, in the RANGE subclause, the frame is defined by a value range. Offsets are differences in row values from the current row value.

Types of SQL window functions
Window functions in SQL Server are divided into three main types: aggregate, ranking, and value functions. Let's have a brief overview of each.

Aggregate window functions

AVG() returns the average of the values in a group, ignoring null values.
MAX() returns the maximum value in the expression.
MIN() returns the minimum value in the expression.
SUM() returns the sum of all the values, or only the DISTINCT values, in the expression.
COUNT() returns the number of items found in a group.
STDEV() returns the statistical standard deviation of all values in the specified expression. -** STDEVP() **returns the statistical standard deviation for the population for all values in the specified expression.
VAR() returns the statistical variance of all values in the specified expression; it may be followed by the OVER clause.
VARP() returns the statistical variance for the population for all values in the specified expression.

Ranking window functions

-ROW_NUMBER() assigns a unique sequential integer to rows within a partition of a result set.

RANK() assigns a unique rank to each row within a partition with gaps in the ranking sequence when there are ties.
DENSE_RANK() assigns a unique rank to each row within a partition without gaps in the ranking sequence when there are ties.
PERCENT_RANK() calculates the relative rank of a row within a group of rows.
NTILE() distributes rows in an ordered partition into a specified number of approximately equal groups.

Value window functions

LAG() retrieves values from rows that precede the current row in the result set.
LEAD() retrieves values from rows that follow the current row in the result set.
FIRST_VALUE() returns the first value in an ordered set of values within a partition.
**LAST_VALUE() **returns the last value in an ordered set of values within a partition.
NTH_VALUE() returns the value of the nth row in the ordered set of values.
CUME_DIST() returns the cumulative distribution of a value in a group of values.

DEV Community

JOINS AND WINDOWS FUNCTIONS IN SQL

Top comments (0)