<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Matt Rueedlinger</title>
    <description>The latest articles on DEV Community by Matt Rueedlinger (@rueedlinger).</description>
    <link>https://dev.to/rueedlinger</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F119663%2F9df01180-6c7d-4255-84ea-41e1b34fc1f9.jpg</url>
      <title>DEV Community: Matt Rueedlinger</title>
      <link>https://dev.to/rueedlinger</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rueedlinger"/>
    <language>en</language>
    <item>
      <title>Databricks SQL Essentials - Array Data Type</title>
      <dc:creator>Matt Rueedlinger</dc:creator>
      <pubDate>Fri, 20 Feb 2026 08:00:00 +0000</pubDate>
      <link>https://dev.to/rueedlinger/databricks-sql-essentials-array-data-type-4k8l</link>
      <guid>https://dev.to/rueedlinger/databricks-sql-essentials-array-data-type-4k8l</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;🔖 &lt;em&gt;This post is part of my series &lt;a href="https://rueedlinger.ch/series/databricks-sql-essentials/" rel="noopener noreferrer"&gt;Databricks SQL Essentials&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Why Working with Array Types
&lt;/h2&gt;

&lt;p&gt;In this post, I want to focus on &lt;strong&gt;array types in Databricks SQL&lt;/strong&gt;—what they are, why they matter, and how you can use them effectively. Arrays are powerful because they allow you to store &lt;strong&gt;multiple values in a single column&lt;/strong&gt;, which can be incredibly useful when working with semi-structured data like JSON, logs, or event streams.  &lt;/p&gt;

&lt;p&gt;We will look at two common scenarios:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;From Row to Array&lt;/strong&gt; – combining multiple rows into a single array for easier aggregation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;From Array to Row&lt;/strong&gt; – exploding an array into separate rows to analyze individual elements.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These techniques help you move smoothly between structured and semi-structured data for more flexible analysis.&lt;/p&gt;

&lt;p&gt;In this blog post, we will use the &lt;strong&gt;&lt;a href="https://marketplace.databricks.com/details/ed6cf259-81e7-4758-94c5-b444f8a5275a/Databricks_Wanderbricks-Dataset-DAIS-2025" rel="noopener noreferrer"&gt;WanderBricks dataset&lt;/a&gt;&lt;/strong&gt; in Databricks and show how to handle both cases: converting from &lt;strong&gt;rows to arrays&lt;/strong&gt; and from &lt;strong&gt;arrays to rows&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Array vs Set
&lt;/h2&gt;

&lt;p&gt;First, let’s start with a bit of theory about arrays. In this context, we will use the terms &lt;strong&gt;array&lt;/strong&gt; and &lt;strong&gt;list&lt;/strong&gt; interchangeably. While some programming languages distinguish between these two concepts, in &lt;strong&gt;Databricks SQL&lt;/strong&gt; they are represented by a single data type.&lt;/p&gt;

&lt;p&gt;An array (&lt;code&gt;ARRAY &amp;lt; elementType &amp;gt;&lt;/code&gt;) in Databricks SQL is a data type that holds a collection of elements of another &lt;a href="https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-datatypes" rel="noopener noreferrer"&gt;supported data type&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Arrays let you store multiple values in a single column, making them perfect for semi-structured data. An array is an ordered collection of values stored in a single column. Arrays can hold any data type, including numbers, strings, or even other arrays.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Arrays can contain duplicates. &lt;code&gt;SELECT array(10, 20, 10) --&amp;gt; [10,20,10]&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Elements are indexed starting at &lt;code&gt;0&lt;/code&gt;, so you can access individual items. &lt;code&gt;SELECT array(10, 20, 30)[0] AS first_element --&amp;gt; 10&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Useful for storing repeating or nested data in a compact way.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A &lt;strong&gt;set&lt;/strong&gt; is a collection of unique values stored in a single column. Unlike arrays, sets &lt;strong&gt;cannot contain duplicates&lt;/strong&gt;, and the elements have &lt;strong&gt;no guaranteed order&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
Sets are useful for &lt;strong&gt;deduplicating data&lt;/strong&gt; while keeping all distinct values together.  &lt;/p&gt;

&lt;p&gt;In Databricks SQL, there is no native &lt;code&gt;SET&lt;/code&gt; data type, but you can achieve similar behavior using &lt;strong&gt;arrays with deduplication operations&lt;/strong&gt; (e.g., &lt;code&gt;array_distinct, collect_set&lt;/code&gt;) or other SQL functions to work with unique elements.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SELECT array_distinct(array(10, 20, 10)) --&amp;gt; [10,20]&lt;/code&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Feature&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Array&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Set&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;em&gt;Definition&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;An ordered list of values.&lt;/td&gt;
&lt;td&gt;A collection of &lt;strong&gt;unique&lt;/strong&gt; values.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;em&gt;Databricks SQL&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;Fully supported with &lt;code&gt;ARRAY&lt;/code&gt; type.&lt;/td&gt;
&lt;td&gt;No native &lt;code&gt;SET&lt;/code&gt; type, but can use &lt;code&gt;array_distinct&lt;/code&gt; or &lt;code&gt;collect_set&lt;/code&gt; to emulate uniqueness.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;em&gt;Use Case&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;When you need multiple values in a column, including duplicates.&lt;/td&gt;
&lt;td&gt;When you need &lt;strong&gt;unique values&lt;/strong&gt; only.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;You might wonder what the difference is between &lt;code&gt;array_distinct()&lt;/code&gt; and &lt;code&gt;collect_set()&lt;/code&gt;. The &lt;code&gt;collect_set()&lt;/code&gt; function is an aggregate function that collects unique values from multiple rows into an array, while &lt;code&gt;array_distinct()&lt;/code&gt; is a non-aggregate function that removes duplicates from an existing array. In short:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Need to collect unique values from multiple rows → use &lt;code&gt;collect_set()&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Already have an array and want to remove duplicates → use &lt;code&gt;array_distinct()&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  From Row to Array
&lt;/h2&gt;

&lt;p&gt;Sometimes, you might want to combine multiple rows into a single array. This is especially useful when you want to aggregate related data points for analysis. In Databricks SQL, you can use the following functions to achieve this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;collect_list()&lt;/code&gt;&lt;/strong&gt; – collects multiple rows into a &lt;strong&gt;list (array) of values&lt;/strong&gt;, including duplicates.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;collect_set()&lt;/code&gt;&lt;/strong&gt; – collects multiple rows into a &lt;strong&gt;list (array) of unique values&lt;/strong&gt;, removing duplicates.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As you can see, both are aggregate functions, they differ in how they handle duplicates. &lt;code&gt;collect_list()&lt;/code&gt; will include all values, while &lt;code&gt;collect_set()&lt;/code&gt; will only include unique values.&lt;/p&gt;
&lt;h3&gt;
  
  
  Example: Simple Array
&lt;/h3&gt;

&lt;p&gt;In this example, we create a simple array to summarize user bookings. Here we want to combine their data and summarize it per user:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We join the tables on a common key (user_id) so that each booking is matched with its corresponding user.&lt;/li&gt;
&lt;li&gt;After joining, there are usually multiple rows per user. So we need to aggregate by user_id to get one row per user. Then we can use &lt;code&gt;collect_list()&lt;/code&gt; or &lt;code&gt;collect_set()&lt;/code&gt; to gather all the booking statuses into an array for each user.&lt;/li&gt;
&lt;li&gt;With &lt;code&gt;sort_array()&lt;/code&gt; we can sort the array of booking statuses for better readability, and &lt;code&gt;size()&lt;/code&gt; to count the number of distinct and total statuses.&lt;/li&gt;
&lt;li&gt;This results in a single row per user, with all their booking details stored in one column as an array.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt;
  &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;sort_array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collect_set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collect_set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;status_count_distinct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collect_list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;status_count&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wanderbricks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wanderbricks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bookings&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
  &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;
&lt;span class="k"&gt;group&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; 
&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;status_count&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The output looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl257h4h76ijtbtffn394.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl257h4h76ijtbtffn394.png" alt="Example output with collect_list() and group by" width="800" height="164"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Example: Array of Structs
&lt;/h3&gt;

&lt;p&gt;Sometimes, you want one row per user that contains all their bookings as structured data. Here we us ethe &lt;code&gt;STRUCT&lt;/code&gt; data type &lt;br&gt;
which is a data type that groups multiple related fields into a single column. &lt;br&gt;
By using &lt;code&gt;collect_list()&lt;/code&gt; together with &lt;code&gt;struct()&lt;/code&gt;, you can create an array where each element holds detailed booking information:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each element in the bookings array is a struct containing user_id, booking_id, total_amount, status, and created_at.&lt;/li&gt;
&lt;li&gt;Instead of having one row per booking, you now have one row per user, with all their bookings neatly packed into a single column.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach makes it easy to analyze or export nested booking data while keeping the data organized and compact.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt;
  &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;collect_list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;struct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;booking_id&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;booking_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;
  &lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;bookings&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wanderbricks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wanderbricks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bookings&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
  &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;
&lt;span class="k"&gt;group&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This produces the following output:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7yhjofv9l2vjidmki3ii.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7yhjofv9l2vjidmki3ii.png" alt="Example output with collect_list()" width="800" height="165"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  From Array to Row
&lt;/h2&gt;

&lt;p&gt;There might be situations where you may need to expand an array into multiple rows to analyze each element individually. Databricks SQL provides the &lt;code&gt;explode()&lt;/code&gt; function for this purpose.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;explode()&lt;/code&gt; function splits an array into separate rows. In the example below, we use the table &lt;code&gt;samples.wanderbricks.customer_support_logs&lt;/code&gt;. The &lt;code&gt;message&lt;/code&gt; column contains a list of structs with the following schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"I'm writing to express my outrage and disappointment...."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sender"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sentiment"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"angry"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2025-04-08T17:23:13"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Example: Expanding Nested Arrays with &lt;code&gt;explode()&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This query demonstrates how to use the &lt;code&gt;explode()&lt;/code&gt; function to transform nested array data into individual rows. By applying &lt;code&gt;LATERAL VIEW explode(messages)&lt;/code&gt;, the query expands each element in the messages array into its own row. This allows direct access to the nested fields inside each message struct. &lt;/p&gt;

&lt;p&gt;This approach is useful when working with semi-structured or nested data, enabling easier filtering, aggregation, and analysis at the individual message level.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;cl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ticket_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sender&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;sender&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sentiment&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;sentiment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;timestamp&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wanderbricks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_support_logs&lt;/span&gt; &lt;span class="n"&gt;cl&lt;/span&gt;
&lt;span class="k"&gt;LATERAL&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;explode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will give us the following output:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa1hgw3cpjsajdik150on.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa1hgw3cpjsajdik150on.png" alt="Example output with explode()" width="800" height="153"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Example: Safely Extracting Fields from a Struct Using JSON Functions
&lt;/h3&gt;

&lt;p&gt;There may be cases where an attribute does not exist in a struct, or the struct has a deeply nested hierarchy. In such situations, one alternative approach is to safely extract fields by converting the struct to JSON using &lt;code&gt;to_json()&lt;/code&gt;, and then retrieving the value with &lt;code&gt;get_json_object()&lt;/code&gt; using JSON path notation (&lt;code&gt;get_json_object(to_json(m), '$.field')&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;This method allows you to reference attributes that may not exist without causing the query to fail.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;cl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ticket_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;get_json_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'$.message'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;get_json_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'$.sender'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;sender&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;get_json_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'$.sentiment'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;sentiment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;get_json_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'$.timestamp'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;get_json_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'$.foo'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;foo&lt;/span&gt; &lt;span class="c1"&gt;-- does not exist in struct&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wanderbricks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_support_logs&lt;/span&gt; &lt;span class="n"&gt;cl&lt;/span&gt;
&lt;span class="k"&gt;LATERAL&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;explode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This approach returns NULL for attributes that are missing instead of raising a schema error.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Arrays in Databricks SQL are a powerful tool for handling multiple values within a single column. You can &lt;strong&gt;aggregate rows into arrays&lt;/strong&gt; for easier summarization or &lt;strong&gt;explode arrays into rows&lt;/strong&gt; to analyze each element individually, making your queries more flexible and concise.  &lt;/p&gt;

&lt;p&gt;By leveraging arrays effectively, you can simplify complex data transformations and gain deeper insights, making your SQL workflows faster, cleaner, and more efficient.&lt;/p&gt;

</description>
      <category>database</category>
      <category>dataengineering</category>
      <category>sql</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Databricks SQL Essentials - CTE</title>
      <dc:creator>Matt Rueedlinger</dc:creator>
      <pubDate>Fri, 06 Feb 2026 08:00:00 +0000</pubDate>
      <link>https://dev.to/rueedlinger/databricks-sql-essentials-cte-3jjo</link>
      <guid>https://dev.to/rueedlinger/databricks-sql-essentials-cte-3jjo</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;🔖 &lt;em&gt;This post is part of my series &lt;a href="https://rueedlinger.ch/series/databricks-sql-essentials/" rel="noopener noreferrer"&gt;Databricks SQL Essentials&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Why use CTEs
&lt;/h2&gt;

&lt;p&gt;In this post, I want to focus on &lt;strong&gt;CTEs&lt;/strong&gt;, which can significantly simplify SQL queries by making complex logic &lt;br&gt;
easier to reason about and maintain. A &lt;strong&gt;CTE (Common Table Expression)&lt;/strong&gt; is a temporary, named result set defined &lt;br&gt;
using the &lt;code&gt;WITH&lt;/code&gt; clause. It exists only for the duration of a single query and provides a clean way to organize multi-step transformations.&lt;/p&gt;

&lt;p&gt;CTEs are especially well suited for a &lt;strong&gt;divide-and-conquer&lt;/strong&gt; approach to SQL.&lt;/p&gt;

&lt;p&gt;In this context, &lt;strong&gt;divide and conquer&lt;/strong&gt; means:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Breaking a complex query into smaller, logical steps
&lt;/li&gt;
&lt;li&gt;Solving each step independently
&lt;/li&gt;
&lt;li&gt;Combining the results in a final query
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each step is expressed as its own CTE, resulting in clearer, more readable, and easier-to-debug SQL.&lt;/p&gt;

&lt;p&gt;Regarding querying data, &lt;strong&gt;CTEs&lt;/strong&gt; can improve your SQL queries in two main ways:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CTE definition&lt;/strong&gt; — define a named, reusable result set at the beginning of your query using the &lt;code&gt;WITH&lt;/code&gt; clause.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Subquery replacement&lt;/strong&gt; — use a CTE instead of an inline subquery to improve readability and clarity.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next, we will explore each type of CTE in detail and show how to use them effectively.&lt;/p&gt;

&lt;p&gt;For more details and the official syntax, see the &lt;a href="https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-qry-select-cte" rel="noopener noreferrer"&gt;Databricks SQL CTE documentation&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  CTE Definition at the Beginning of a Query
&lt;/h2&gt;

&lt;p&gt;A Common Table Expression (CTE) defines a named, reusable result set at the start of a query using the &lt;code&gt;WITH&lt;/code&gt; clause.&lt;br&gt;&lt;br&gt;
This approach is ideal for organizing multi-step logic and breaking complex queries into manageable parts.&lt;/p&gt;

&lt;p&gt;This example demonstrates how &lt;strong&gt;CTEs simplify data processing&lt;/strong&gt; with the &lt;a href="https://marketplace.databricks.com/details/ed6cf259-81e7-4758-94c5-b444f8a5275a/Databricks_Wanderbricks-Dataset-DAIS-2025" rel="noopener noreferrer"&gt;Wanderbricks dataset&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;countries&lt;/code&gt;&lt;/strong&gt; — selects country, continent, and country code information.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;users&lt;/code&gt;&lt;/strong&gt; — selects user details including their country.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;user_countries&lt;/code&gt;&lt;/strong&gt; — combines users with country info, filling any missing continent or country code with &lt;code&gt;"UNKNOWN"&lt;/code&gt;.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The final query &lt;strong&gt;aggregates total booking amounts by country, continent, and booking status&lt;/strong&gt;, making it easy to analyze bookings across different regions while handling missing data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- CTE for countries info&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;countries&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;continent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country_code&lt;/span&gt; 
    &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wanderbricks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;countries&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt; 
&lt;span class="c1"&gt;-- CTE for users info&lt;/span&gt;
&lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt; 
    &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wanderbricks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt; 
&lt;span class="c1"&gt;-- Combine users with their country info&lt;/span&gt;
&lt;span class="n"&gt;user_countries&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;select&lt;/span&gt; 
        &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;continent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"UNKOWN"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;continent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;-- fill missing continent&lt;/span&gt;
        &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country_code&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"UNKOWN"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;country_code&lt;/span&gt; &lt;span class="c1"&gt;-- fill missing country code&lt;/span&gt;
    &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt; 
    &lt;span class="k"&gt;left&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;countries&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;-- Final aggregation&lt;/span&gt;
&lt;span class="k"&gt;select&lt;/span&gt; 
    &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;continent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;total_amount&lt;/span&gt; &lt;span class="c1"&gt;-- sum of bookings per group&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;user_countries&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt; 
&lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wanderbricks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bookings&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
  &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; 
&lt;span class="k"&gt;group&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;all&lt;/span&gt; &lt;span class="c1"&gt;-- group by country, continent, and status&lt;/span&gt;
&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will give us the follwoing output.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1u1kkbhyvjrcn4jec9ag.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1u1kkbhyvjrcn4jec9ag.png" alt="Example CTE output" width="800" height="375"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Subquery Replacement with a CTE
&lt;/h2&gt;

&lt;p&gt;Use a CTE instead of an inline subquery to improve readability. By giving the subquery a name, the purpose of the logic becomes clearer, and the overall query is easier to understand and maintain.&lt;/p&gt;

&lt;p&gt;We can rewrite the previous query by defining the CTE in place of the subquery:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Aggregate total booking amounts per country, continent, and booking status&lt;/span&gt;
&lt;span class="c1"&gt;-- using the prepared user-country subquery&lt;/span&gt;
&lt;span class="k"&gt;select&lt;/span&gt; 
    &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;continent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;total_amount&lt;/span&gt; &lt;span class="c1"&gt;-- sum of bookings per group&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="c1"&gt;-- subquery with CTE&lt;/span&gt;
  &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;countries&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;continent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country_code&lt;/span&gt; 
      &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wanderbricks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;countries&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
  &lt;span class="p"&gt;),&lt;/span&gt; 
  &lt;span class="c1"&gt;-- CTE for users info&lt;/span&gt;
  &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt; 
      &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wanderbricks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="c1"&gt;-- Combine users with their country info&lt;/span&gt;
  &lt;span class="k"&gt;select&lt;/span&gt; 
    &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;continent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"UNKOWN"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;continent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;-- fill missing continent&lt;/span&gt;
    &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country_code&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"UNKOWN"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;country_code&lt;/span&gt; &lt;span class="c1"&gt;-- fill missing country code&lt;/span&gt;
  &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt; 
  &lt;span class="k"&gt;left&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;countries&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt; 
&lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wanderbricks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bookings&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
  &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; 
&lt;span class="k"&gt;group&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;all&lt;/span&gt; &lt;span class="c1"&gt;-- group by country, continent, and status&lt;/span&gt;
&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Using &lt;strong&gt;CTEs&lt;/strong&gt; in Databricks SQL is a powerful way to break complex queries into clear, manageable steps. They help you structure your logic, improve readability, and make debugging much easier.  &lt;/p&gt;

&lt;p&gt;Overall, CTEs support a &lt;strong&gt;divide-and-conquer&lt;/strong&gt; approach, making your SQL both cleaner and more maintainable.&lt;/p&gt;

&lt;p&gt;From my experience, learning to use CTEs transformed the way I write SQL. What used to be long, nested queries now feels organized and much easier to follow. &lt;/p&gt;

&lt;p&gt;My advice: don’t hesitate to break your queries into smaller, named steps—it not only helps you understand your logic but also makes it easier for others to read and maintain your code.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>dataengineering</category>
      <category>sql</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Deploy Hugo Websites to SFTP with GitHub Actions</title>
      <dc:creator>Matt Rueedlinger</dc:creator>
      <pubDate>Fri, 30 Jan 2026 22:00:00 +0000</pubDate>
      <link>https://dev.to/rueedlinger/deploy-hugo-websites-to-sftp-with-github-actions-45c2</link>
      <guid>https://dev.to/rueedlinger/deploy-hugo-websites-to-sftp-with-github-actions-45c2</guid>
      <description>&lt;p&gt;In my &lt;a href="https://rueedlinger.ch/posts/2020/hugo_and_github" rel="noopener noreferrer"&gt;2020 post&lt;/a&gt;, I revitalized my blog and set up an automated pipeline to publish blog posts using &lt;a href="https://gohugo.io/" rel="noopener noreferrer"&gt;Hugo&lt;/a&gt; and GitHub Actions.&lt;/p&gt;

&lt;p&gt;Hugo is a fast, open-source static site generator written in Go. I store all my site content on GitHub, where it is automatically built by GitHub Actions and then published to my website via SFTP.&lt;/p&gt;

&lt;p&gt;In this post, I will give an update on my publishing pipeline and describes how to use a Hugo theme with Git submodules. Previously, I created my own private Hugo theme, but this was quite a challenge to update and maintain. So I decided to switch to the &lt;a href="https://github.com/adityatelange/hugo-PaperMod" rel="noopener noreferrer"&gt;PaperMod&lt;/a&gt; theme, which looks very nice and clean.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before you start, make sure you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Git installed&lt;/li&gt;
&lt;li&gt;Hugo installed (see &lt;a href="https://gohugo.io/getting-started/installing/" rel="noopener noreferrer"&gt;Hugo installation guide&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Steps
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Create a New GitHub Repository
&lt;/h3&gt;

&lt;p&gt;Create a new project folder and navigate into it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;MyFreshWebsite
&lt;span class="nb"&gt;cd &lt;/span&gt;MyFreshWebsite/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Initialize a Git repository and create a README file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"# MyFreshWebsite"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; README.md
git init
git add README.md
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"Initial commit"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set the main branch, add your remote repository, and push:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git branch &lt;span class="nt"&gt;-M&lt;/span&gt; main
git remote add origin https://github.com/rueedlinger/MyFreshWebsite.git
git push &lt;span class="nt"&gt;-u&lt;/span&gt; origin main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Create a New Hugo Site
&lt;/h3&gt;

&lt;p&gt;Run the following command to create a new Hugo site using YAML configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;MyFreshWebsite/
hugo new site &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--format&lt;/span&gt; yaml &lt;span class="nt"&gt;--force&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add &lt;code&gt;/public&lt;/code&gt; to &lt;code&gt;.gitignore&lt;/code&gt; so that when we build our website locally, these files are ignored by Git.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;.gitignore&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Hugo default output directory
/public
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Install and Activate the PaperMod Theme
&lt;/h3&gt;

&lt;p&gt;This will add the PaperMod theme as a &lt;a href="https://git-scm.com/book/en/v2/Git-Tools-Submodules" rel="noopener noreferrer"&gt;submodule&lt;/a&gt; in &lt;code&gt;themes/PaperMod&lt;/code&gt;. See the &lt;a href="https://github.com/adityatelange/hugo-PaperMod/wiki/Installation" rel="noopener noreferrer"&gt;PaperMod Installation Guide&lt;/a&gt; for details.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git submodule add https://github.com/adityatelange/hugo-PaperMod.git themes/PaperMod
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command should add the file &lt;code&gt;.gitmodules&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;.gitmodules&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[submodule "themes/PaperMod"]
    path = themes/PaperMod
    url = https://github.com/adityatelange/hugo-PaperMod
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, let's update the submodule. First, change to the root directory of your project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git submodule &lt;span class="nb"&gt;sync&lt;/span&gt;               &lt;span class="c"&gt;# Updates Git configuration&lt;/span&gt;
git submodule update &lt;span class="nt"&gt;--remote&lt;/span&gt;    &lt;span class="c"&gt;# Pulls the latest commit from the submodule&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As last step we have to acivate the theme in the &lt;code&gt;hugo.yml&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;theme&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PaperMod"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As the final step, let’s test our setup by starting Hugo.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hugo server &lt;span class="nt"&gt;-D&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should now be able to access the site at &lt;code&gt;http://localhost:1313&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Create the GitHub Action
&lt;/h3&gt;

&lt;p&gt;Our workflow looks as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;actions/checkout&lt;/code&gt;  - The first step checks out the Hugo site from the GitHub repository and also checks out the submodule where our theme is located.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;lowply/build-hugo&lt;/code&gt;  - The next step builds the Hugo site publish it to the &lt;code&gt;public&lt;/code&gt; directory. &lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Dylan700/sftp-upload-action&lt;/code&gt;  - In this step, we use SFTP to upload the contents of the &lt;code&gt;public&lt;/code&gt; directory to the root directory (&lt;code&gt;./&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now that we have our workflow defined, it’s time to create the GitHub Action that will automate our process. For every push to the &lt;code&gt;main&lt;/code&gt; branch, the site will be built and published via SFTP to the web server. The credentials, URL and PORT are stored as GitHub secrets. For more details see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.github.com/en/actions" rel="noopener noreferrer"&gt;GitHub Actions documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.github.com/en/actions/how-tos/write-workflows/choose-what-workflows-do/use-secrets" rel="noopener noreferrer"&gt;Using secrets in GitHub Actions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our final workflow file &lt;code&gt;.github/workflows/publish.yml&lt;/code&gt; looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build and Publish Hugo Site&lt;/span&gt; 
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; 
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;main&lt;/span&gt;
&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Publish Hugo Site&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Checkout Source (GIT)&lt;/span&gt;      
      &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v6.0.1&lt;/span&gt;
      &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;submodules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Hugo Build&lt;/span&gt;     
      &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;lowply/build-hugo@v0.154.5&lt;/span&gt;
      &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Hugo parameters like --buildDrafts, --baseURL, etc.&lt;/span&gt;
        &lt;span class="c1"&gt;# see https://gohugo.io/getting-started/usage/&lt;/span&gt;
        &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;--minify&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;List files for debugging&lt;/span&gt;
      &lt;span class="c1"&gt;# For debugging list files from current directory to console&lt;/span&gt;
      &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ls&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;SFTP Upload&lt;/span&gt;      
      &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Dylan700/sftp-upload-action@v1.2.3&lt;/span&gt;        
      &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;server&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.FTP_SERVER }}&lt;/span&gt;
        &lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.FTP_USER }}&lt;/span&gt;
        &lt;span class="na"&gt;password&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.FTP_PASSWORD }}&lt;/span&gt;
        &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.FTP_PORT }}&lt;/span&gt;
        &lt;span class="na"&gt;dry-run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
        &lt;span class="na"&gt;delete&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;uploads&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;./public/ =&amp;gt; ./&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt;&lt;br&gt;
It's also important that you enable &lt;code&gt;submodules: true&lt;/code&gt; in the &lt;code&gt;checkout&lt;/code&gt; action.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Running the Project Locally and Deploy
&lt;/h2&gt;

&lt;p&gt;To make local development a bit more convenient, I added a simple &lt;code&gt;Makefile&lt;/code&gt;. It wraps the most common Hugo commands so you don’t have to remember them or type them out every time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight make"&gt;&lt;code&gt;&lt;span class="c"&gt;# Makefile for Hugo
&lt;/span&gt;
&lt;span class="nl"&gt;.PHONY&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;default clean serve&lt;/span&gt;

&lt;span class="c"&gt;# Default target: clean and serve
&lt;/span&gt;&lt;span class="nl"&gt;default&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;clean serve&lt;/span&gt;

&lt;span class="c"&gt;# Clean the project directory (delete public folder)
&lt;/span&gt;&lt;span class="nl"&gt;clean&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Cleaning public/ directory..."&lt;/span&gt;
    &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; public

&lt;span class="c"&gt;# Start Hugo server
# -D, --buildDrafts 
# -F, --buildFuture  
# -O, --openBrowser 
&lt;/span&gt;&lt;span class="nl"&gt;serve&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Starting Hugo server..."&lt;/span&gt;
    hugo server &lt;span class="nt"&gt;-D&lt;/span&gt; &lt;span class="nt"&gt;-F&lt;/span&gt; &lt;span class="nt"&gt;-O&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To start the project locally, just run it, and then you should be able to access the page at &lt;a href="http://localhost:1313/" rel="noopener noreferrer"&gt;http://localhost:1313/&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;make
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As the last step, you should commit everything and push the changes.  This should now trigger the whole workflow.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git add &lt;span class="nb"&gt;.&lt;/span&gt;
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"Ready to deploy"&lt;/span&gt;
git push
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I enjoyed experimenting with GitHub Actions while creating this post, and I hope you find some value in it as well.&lt;/p&gt;

</description>
      <category>automation</category>
      <category>cicd</category>
      <category>github</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Databricks SQL Essentials - GROUP BY ALL</title>
      <dc:creator>Matt Rueedlinger</dc:creator>
      <pubDate>Fri, 23 Jan 2026 16:00:00 +0000</pubDate>
      <link>https://dev.to/rueedlinger/databricks-sql-essentials-group-by-all-3l11</link>
      <guid>https://dev.to/rueedlinger/databricks-sql-essentials-group-by-all-3l11</guid>
      <description>&lt;p&gt;In this post, I want to focus on &lt;strong&gt;&lt;code&gt;GROUP BY ALL&lt;/code&gt;&lt;/strong&gt; , which can simplify queries significantly, especially when you’re experimenting and constantly adapting your analyses. This can be extremely useful when you want to quickly aggregate data without manually listing all the columns, especially in exploratory queries where the SELECT list changes frequently.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Supported in:&lt;/strong&gt; Databricks Runtime 12.2 LTS and above&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation:&lt;/strong&gt; &lt;a href="https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-qry-select-groupby" rel="noopener noreferrer"&gt;Databricks SQL Reference&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here is an example using the Databricks sample dataset &lt;strong&gt;bakehouse&lt;/strong&gt;. We want to sum all transactions by continent, country, state, year, month, and day.&lt;/p&gt;

&lt;p&gt;Without &lt;code&gt;GROUP BY ALL&lt;/code&gt;, we need to list all the columns explicitly in the &lt;code&gt;GROUP BY&lt;/code&gt; clause:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; 
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;continent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  
  &lt;span class="nb"&gt;year&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;dateTime&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;dateTime&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;dateTime&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;totalPrice&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;sumTotalPrice&lt;/span&gt;
  &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bakehouse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sales_customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt; 
  &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bakehouse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sales_transactions&lt;/span&gt; &lt;span class="n"&gt;tx&lt;/span&gt; 
    &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customerID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customerID&lt;/span&gt; 
  &lt;span class="k"&gt;group&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;all&lt;/span&gt;
  &lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="nb"&gt;year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As expected this will give us the follwoing output.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7k4kz845ujdpdgknw05u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7k4kz845ujdpdgknw05u.png" alt="Example output" width="800" height="486"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>databricks</category>
      <category>sql</category>
    </item>
    <item>
      <title>An Overview About the Different Kafka Connect Plugins</title>
      <dc:creator>Matt Rueedlinger</dc:creator>
      <pubDate>Thu, 18 Feb 2021 22:20:00 +0000</pubDate>
      <link>https://dev.to/rueedlinger/an-overview-about-the-different-kafka-connect-plugins-4oc8</link>
      <guid>https://dev.to/rueedlinger/an-overview-about-the-different-kafka-connect-plugins-4oc8</guid>
      <description>&lt;p&gt;Kafka Connect is a great framework for connecting Kafka with external systems. In the best case you can use Connect right away. But in some special cases you might have to write your own plugins to add missing functionality to the framework. In this blog post I give a short overview about the different plugin types which can be used to add new functionality to Connect.&lt;/p&gt;

&lt;p&gt;In the last part of this blog post I give you a short introduction to my GitHub project &lt;em&gt;&lt;a href="https://github.com/rueedlinger/kafka-connect-quickstart"&gt;Ready, Steady, Connect - A Kafka Connect Quickstart&lt;/a&gt;&lt;/em&gt;. This project contains example Java code you can use to extend Connect with your own plugins.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The name of the project came from the blog post &lt;a href="https://medium.com/swlh/ready-steady-connect-help-your-organization-to-appreciate-kafka-8ed6cfcfd6d8"&gt;(Ready, Steady, Connect. Help Your Organization to Appreciate Kafka)&lt;/a&gt;I wrote about the experience we had with Connect.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Plugin Types
&lt;/h2&gt;

&lt;p&gt;There are two main plugin categories that can be used to add new functionality to Kafka Connect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Connect Plugins&lt;/strong&gt; are part of the Connect API and can be used to extend the functionality of Connect.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kafka Client Plugins&lt;/strong&gt; are part of the Kafka Client (Consumer / Producer API). Kafka Connect is build on top of the &lt;em&gt;Kafka Consumer / Producer API&lt;/em&gt;, so we have the possibility to write plugins which are part of these “lower API’s” and use them with Connect.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--KnecoYJP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://rueedlinger.ch/posts/2021/kafka_connect_plugins/plugin_types.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--KnecoYJP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://rueedlinger.ch/posts/2021/kafka_connect_plugins/plugin_types.png" alt="plugin types"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Connect Plugins (Connect API)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Plugin&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;API&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Sink Connector&lt;/td&gt;
&lt;td&gt;A &lt;code&gt;SinkConnector&lt;/code&gt; can load data from Kafka and store it into an external system (eg. database). It’s quite easy to write your own sink connector or take an existing open source version and modify it to your needs.&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://kafka.apache.org/27/javadoc/org/apache/kafka/connect/sink/SinkConnector.html"&gt;SinkConnector&lt;/a&gt;, &lt;a href="https://kafka.apache.org/27/javadoc/org/apache/kafka/connect/sink/SinkTask.html"&gt;SinkTask&lt;/a&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Source Connector&lt;/td&gt;
&lt;td&gt;A &lt;code&gt;SourceConnector&lt;/code&gt; can load data from an external system and store it into Kafka. A source connector is a bit more complicated to write than a sink connector. But with some inspiration from other open source connectors this should not be to hard.&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://kafka.apache.org/27/javadoc/org/apache/kafka/connect/source/SourceConnector.html"&gt;SourceConnector&lt;/a&gt;, &lt;a href="https://kafka.apache.org/27/javadoc/org/apache/kafka/connect/source/SourceTask.html"&gt;SourceTask&lt;/a&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Single Message Transforms (SMTs)&lt;/td&gt;
&lt;td&gt;With a &lt;code&gt;Transformation&lt;/code&gt; (SMT) you can transform Kafka messages when they are processed by a connector. For example you could write a SMT which appends a UUID to every message that passes trough.&lt;/td&gt;
&lt;td&gt;&lt;a href="https://kafka.apache.org/27/javadoc/org/apache/kafka/connect/transforms/Transformation.html"&gt;Transformation&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Predicates&lt;/td&gt;
&lt;td&gt;A SMT can be configured with a &lt;code&gt;Predicate&lt;/code&gt; (&lt;a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-585%3A+Filter+and+Conditional+SMTs"&gt;KIP-585&lt;/a&gt;). The SMT is only applied when the condition of the predicate was true.&lt;/td&gt;
&lt;td&gt;&lt;a href="https://kafka.apache.org/27/javadoc/org/apache/kafka/connect/transforms/predicates/Predicate.html"&gt;Predicate&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Config Providers&lt;/td&gt;
&lt;td&gt;A &lt;code&gt;ConfigProvider&lt;/code&gt; loads configuration values from external resources. These configuration values can then be referenced in the connector configuration. You could write a &lt;code&gt;ConfigProvider&lt;/code&gt; which loads configuration values from a database, rest endpoint or from environment variables.&lt;/td&gt;
&lt;td&gt;&lt;a href="https://kafka.apache.org/27/javadoc/org/apache/kafka/common/config/provider/ConfigProvider.html"&gt;ConfigProvider&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rest Extensions&lt;/td&gt;
&lt;td&gt;With a &lt;code&gt;RestExtension&lt;/code&gt; (&lt;a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-285%3A+Connect+Rest+Extension+Plugin"&gt;KIP-285&lt;/a&gt;) you can extend the existing Kafka Connect Rest API. You could write an authorization filter or liveness/readiness endpoints for k8s.&lt;/td&gt;
&lt;td&gt;&lt;a href="https://kafka.apache.org/27/javadoc/org/apache/kafka/connect/rest/ConnectRestExtension.html"&gt;ConnectRestExtension&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Converter&lt;/td&gt;
&lt;td&gt;The &lt;code&gt;Converter&lt;/code&gt; provides support for translating between Kafka Connect’s runtime data format and the raw payload of the Kafka messages (JSON, Avro, …).&lt;/td&gt;
&lt;td&gt;&lt;a href="https://kafka.apache.org/27/javadoc/org/apache/kafka/connect/storage/Converter.html"&gt;Converter&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Kafka Client Plugins (Kafka Producer / Consumer API)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Plugin&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;API&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Kafka Consumer Interceptor&lt;/td&gt;
&lt;td&gt;The &lt;code&gt;ConsumerInterceptor&lt;/code&gt; (&lt;a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-42%3A+Add+Producer+and+Consumer+Interceptors"&gt;KIP-42&lt;/a&gt;) can be used to intercept Kafka messages before they are processed by the consumer.&lt;/td&gt;
&lt;td&gt;&lt;a href="https://kafka.apache.org/27/javadoc/org/apache/kafka/clients/consumer/ConsumerInterceptor.html"&gt;ConsumerInterceptor&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kafka Producer Interceptor&lt;/td&gt;
&lt;td&gt;The &lt;code&gt;ProducerInterceptor&lt;/code&gt; (&lt;a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-42%3A+Add+Producer+and+Consumer+Interceptors"&gt;KIP-42&lt;/a&gt;) is a neat way to intercept Kafka messages before they are published to Kafka.&lt;/td&gt;
&lt;td&gt;&lt;a href="https://kafka.apache.org/27/javadoc/org/apache/kafka/clients/producer/ProducerInterceptor.html"&gt;ProducerInterceptor&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kafka Metrics Reporter&lt;/td&gt;
&lt;td&gt;The &lt;code&gt;MetricsReporter&lt;/code&gt; can be used to listen to Kafka client metrics and process them.&lt;/td&gt;
&lt;td&gt;&lt;a href="https://kafka.apache.org/27/javadoc/org/apache/kafka/common/metrics/MetricsReporter.html"&gt;MetricsReporter&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Create Your Own Connect Plugins
&lt;/h2&gt;

&lt;p&gt;The Docker image and the source code for all plugin examples can be found in the &lt;em&gt;&lt;a href="https://github.com/rueedlinger/kafka-connect-quickstart"&gt;Ready, Steady, Connect - A Kafka Connect Quickstart (rueedlinger/kafka-connect-quickstart)&lt;/a&gt;&lt;/em&gt; repository.&lt;/p&gt;

&lt;p&gt;The first step is to clone the project.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/rueedlinger/kafka-connect-quickstart
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The main components of the project are the &lt;strong&gt;Docker Image&lt;/strong&gt; , &lt;strong&gt;Java source code&lt;/strong&gt; and the &lt;strong&gt;Docker Compose&lt;/strong&gt; file.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The custom Connect &lt;strong&gt;Docker image&lt;/strong&gt; has two parts.

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;builder part&lt;/strong&gt; to build the example plugins from the Java source code.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;main part&lt;/strong&gt; to run the Kafka Connect container with all the Kafka Connect plugins.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Java source code&lt;/strong&gt; contains all the Kafka Connect plugin examples (connectors, transforms, etc.).&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Docker Compose&lt;/strong&gt; file can be used to run the whole infrastructure (Kafka broker, zookeeper, etc).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The next step is to build all the plugins (Java) and start the containers with Docker.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker-compose up --build
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When all containers are started you can access the following services:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Kafka Connect Rest API&lt;/strong&gt; =&amp;gt; &lt;a href="http://localhost:8083/"&gt;http://localhost:8083/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kafdrop&lt;/strong&gt; from &lt;a href="https://github.com/obsidiandynamics/kafdrop"&gt;Obsidian Dynamics (GitHub)&lt;/a&gt; =&amp;gt; &lt;a href="http://localhost:8082/"&gt;http://localhost:8082/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema Registry&lt;/strong&gt; from &lt;a href="https://github.com/confluentinc/schema-registry"&gt;Confluent (GitHub)&lt;/a&gt; =&amp;gt; &lt;a href="http://localhost:8081/"&gt;http://localhost:8081/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kafka UI&lt;/strong&gt; from &lt;a href="https://github.com/provectus/kafka-ui"&gt;Provectus (GitHub)&lt;/a&gt; =&amp;gt; &lt;a href="http://localhost:8080/"&gt;http://localhost:8080/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kafka Connect UI&lt;/strong&gt; from &lt;a href="https://github.com/lensesio/kafka-connect-ui"&gt;Lenses.io (GitHub)&lt;/a&gt; =&amp;gt; &lt;a href="http://localhost:8000/"&gt;http://localhost:8000/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Happy Coding
&lt;/h2&gt;

&lt;p&gt;Now that everything is up and running. You can start to play around with Kafka Connect. I hope the &lt;em&gt;kafka-connect-quickstart&lt;/em&gt; project is useful and gives you an easy start into the world of Kafka Connect plugins.&lt;/p&gt;

</description>
      <category>apachekafka</category>
      <category>kafkaconnect</category>
    </item>
    <item>
      <title>Apache Kafka Connect Usage Patterns</title>
      <dc:creator>Matt Rueedlinger</dc:creator>
      <pubDate>Sat, 30 Jan 2021 10:00:00 +0000</pubDate>
      <link>https://dev.to/rueedlinger/apache-kafka-connect-usage-patterns-4bnp</link>
      <guid>https://dev.to/rueedlinger/apache-kafka-connect-usage-patterns-4bnp</guid>
      <description>&lt;p&gt;&lt;em&gt;Kafka Connect&lt;/em&gt; is a tool for streaming data between Apache Kafka and other systems like Oracle, DB2, JMS, Elasticsearch, MongoDB, etc. Teams can configure connectors that move large collections of data in and out of Kafka. As Kafka Connect user you don’t have to write any piece of software when there is an existing connector implementation for your system. Depending on your load profile you can run multiple Connect workers which build an Connect cluster.&lt;/p&gt;

&lt;p&gt;I had recently an interesting discussion how teams can or should use Apache Kafka Connect. We came up with two usage patterns for Apache Kafka Connect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Shared infrastructure&lt;/strong&gt; - &lt;em&gt;All teams share the same Kafka Connect cluster&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;“Microservice” or shared-nothing architecture&lt;/strong&gt; - &lt;em&gt;Every team has their own Kafka Connect cluster&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--OVzn_5AL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://rueedlinger.ch/posts/2021/kafka_connect_deployment/patterns.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--OVzn_5AL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://rueedlinger.ch/posts/2021/kafka_connect_deployment/patterns.png" alt="usage patterns"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt; : I assume that you will run Apache Kafka Connect in distributed mode. This provides scalability and automatic fault tolerance for Kafka Connect.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Shared Infrastructure Usage Pattern
&lt;/h2&gt;

&lt;p&gt;In this usage pattern the Kafka Connect cluster is shared between multiple teams and the platform team is responsible to run the cluster. This means that the resources (memory, logs, configurations, etc.) and runtime (JAR’s) are shared between different teams.&lt;/p&gt;

&lt;p&gt;When you use the &lt;em&gt;shared infrastructure&lt;/em&gt; usage pattern you have to consider the following topics:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Responsibilities:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Who gets notified when a connector is failing?&lt;/li&gt;
&lt;li&gt;Who is responsible in fixing connector failures?&lt;/li&gt;
&lt;li&gt;How to distinguish between infrastructure problems (memory, connectivity, etc.) and connector problems (schema / data mismatch, configuration errors, etc.)?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Boundaries / Isolation:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How do you enforce authentication, authorization and role-based access control?&lt;/li&gt;
&lt;li&gt;How to ensure that teams can only deploy or modify their own connectors?&lt;/li&gt;
&lt;li&gt;How to secure the access to sensitive configuration settings like credentials?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Coordination:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How to do you coordinate patches or rollouts of new versions with all the teams?&lt;/li&gt;
&lt;li&gt;Can a team stop the rollout when there are some breaking changes?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  “Microservice” or Shared-nothing Architecture Usage Pattern
&lt;/h2&gt;

&lt;p&gt;In this usage pattern the platform team provides the right tools for the teams to to deploy and run a Kafka Connect cluster. Here we have clear boundaries between the teams and clear responsibilities.&lt;/p&gt;

&lt;p&gt;With the &lt;em&gt;microservice&lt;/em&gt; usage pattern you have to to consider the following topics:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Operational overhead:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Can you live with the operational overhead when every team runs their own Kafka Connect cluster?&lt;/li&gt;
&lt;li&gt;Should you organize cluster also by domain or functionality?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Skill / Tools:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Does your teams have the right skills to run and operate Kafka Connect?&lt;/li&gt;
&lt;li&gt;What are the right tools to facilitate the daily life with Kafka Connect?&lt;/li&gt;
&lt;li&gt;How to automate rollouts and updates?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Shared Infrastructure&lt;/em&gt; has the advantages that the team does not have to care how to operate and run Kafka Connect. The biggest issue is that all teams share the same runtime and resources. This increases the complexity regarding security and responsibilities between the teams.&lt;/p&gt;

&lt;p&gt;With the &lt;em&gt;microservice&lt;/em&gt; usage patterns it’s clear who is responsible and to blame when a error occurs (&lt;em&gt;You build it, you run it!&lt;/em&gt;). The main concerns are that your team needs the right skills to run Kafka Connect and the operational overhead when every team runs their own Kafka Connect cluster.&lt;/p&gt;

&lt;p&gt;We started with the &lt;em&gt;shared infrastructure&lt;/em&gt; usage pattern and ended up with the &lt;em&gt;microservice&lt;/em&gt; usage pattern. You should not underestimate the effort in provide the right tools and teach teams how they can run and operate Kafka Connect by themself.&lt;/p&gt;

</description>
      <category>apachekafka</category>
      <category>kafkaconnect</category>
    </item>
    <item>
      <title>Revitalizing my Blog with Hugo and GitHub Actions (aka a New Hope)</title>
      <dc:creator>Matt Rueedlinger</dc:creator>
      <pubDate>Fri, 25 Sep 2020 20:00:00 +0000</pubDate>
      <link>https://dev.to/rueedlinger/revitalizing-my-blog-with-hugo-and-github-actions-aka-a-new-hope-1jm7</link>
      <guid>https://dev.to/rueedlinger/revitalizing-my-blog-with-hugo-and-github-actions-aka-a-new-hope-1jm7</guid>
      <description>&lt;p&gt;This is my third attempt to revitalize my blog since 2008. I never migrated my old posts. Some of the posts were referenced externally which I did not expect. Some of them were also posted as links in Stack Overflow answers. I realized that when I got some messages from Twitter why my post disappeared. Luckily this is not a problem anymore, because all of the topics are outdated and nobody cares anymore. 😌&lt;/p&gt;

&lt;p&gt;First I started with WordPress. This was quite easy and straightforward. One big problem was that I did not care about the updates. Yes, I have self-hosted WordPress. In hindsight it was a really bad decision. After some “security issue”, I took my blog down and started from scratch with Jekyll. I really fell in love with the concept of creating your pages in markdown and generate the whole site with a site generator. But this attempt did not take off as well. Mostly because I was lazy and had no good ideas to write about. So after a while I also shutdown this blog and created a simple static site as some kind of a virtual “business card”.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hugo - A New Hope
&lt;/h2&gt;

&lt;p&gt;Some weeks ago I started playing around with &lt;a href="https://gohugo.io/"&gt;Hugo&lt;/a&gt;. Hugo is a popular open-source static site generators. What started as a small project to replace my homepage ended up in a longer session than expected, because it was quite fun to rebuild my site with Hugo and start again with a Blog.&lt;/p&gt;

&lt;p&gt;The existing &lt;a href="https://themes.gohugo.io/"&gt;Hugo themes&lt;/a&gt; are fine and look very good, but when you want to create your unique look and feel you have to create your own Hugo theme. There are good tutorials out there. I would recommend you to take the time and write your own theme. I think it’s the best way to learn Hugo and get some understanding how things work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Which CSS Framework?
&lt;/h2&gt;

&lt;p&gt;To build my own theme was a good change to get my rusty JavaScript and CSS skills updated. In the last web projects I used &lt;a href="https://getbootstrap.com/"&gt;Bootstrap&lt;/a&gt; or &lt;a href="https://material.io/"&gt;Material Design&lt;/a&gt;, but this time I wanted to use a CSS framework with a smaller footprint.&lt;/p&gt;

&lt;p&gt;So I decided to start with &lt;a href="http://getskeleton.com/"&gt;Skeleton&lt;/a&gt;. The downside with Skeleton is that the &lt;a href="https://github.com/dhg/Skeleton"&gt;GitHub project&lt;/a&gt; is not active anymore. The last release was in December 2014 and there are open PR’s and issues. But besides that it was quiet easy to integrate and customize. Skeleton worked quite well, but the point that the project is not active anymore made me switch to &lt;a href="https://milligram.io/"&gt;Milligram&lt;/a&gt;. Milligram also provides a minimal setup of styles for a fast and clean starting point. So it was easy to adapt my design and switch to Milligram.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hugo Build Pipeline with GitHub Actions
&lt;/h2&gt;

&lt;p&gt;Next we need a build pipeline to publish new content. The easiest way is to host your site as a &lt;a href="https://pages.github.com/"&gt;GitHub Page&lt;/a&gt; and look for GitHub Action which will publish your site. There are plenty of &lt;a href="https://github.com/marketplace?type=actions"&gt;GitHub Actions&lt;/a&gt; in the Marketplace which help you to set up a workflow with GitHub Pages and Hugo.&lt;/p&gt;

&lt;p&gt;I have an external provider which hosts my domain and emails. So this was no option for my. For that reason I have setup the following GitHub Actions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/actions/checkout"&gt;actions/checkout&lt;/a&gt;&lt;/strong&gt; - The first step will check out my Hugo site from the GitHub repo.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/lowply/build-hugo"&gt;lowply/build-hugo&lt;/a&gt;&lt;/strong&gt; - The next step will build the Hugo site in the directory &lt;code&gt;public&lt;/code&gt;. You can specifyadditional &lt;a href="https://gohugo.io/getting-started/usage/"&gt;parameters for the Hugo CLI&lt;/a&gt; like &lt;code&gt;--minify&lt;/code&gt;, &lt;code&gt;--baseURL&lt;/code&gt;, etc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/SamKirkland/FTP-Deploy-Action"&gt;SamKirkland/FTP-Deploy-Action&lt;/a&gt;&lt;/strong&gt; - The last step will upload the generated site form the directory &lt;code&gt;public&lt;/code&gt; to the SFTP server.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; You also have to add the file &lt;code&gt;.git-ftp-include&lt;/code&gt; with the following content in your GitHub repo.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;!public/
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Because only tracked files will be published by default. This will guarantee that the generated Hugo site form the directory &lt;code&gt;public&lt;/code&gt; will be uploaded.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you have issues with self-signed certificate and SFTP you can add the flag &lt;code&gt;insecure&lt;/code&gt;, which will not verify the server certificate.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Example Workflow
&lt;/h2&gt;

&lt;p&gt;Here you can see the complete GitHub workflow in detail. This workflow will be triggered when changes are merged into the master branch. The SFTP password and username should be stored as GitHub secrets. It’s also a good idea to store the FTP server and port in a GitHub secrets as well when you have a public GitHub repository.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;name: Build and Publish Hugo Site (MASTER)
on:
  push:
    branches: 
      - master
jobs:
  build:
    name: Publish Hugo Site (MASTER)
    runs-on: ubuntu-latest
    steps:
    - name: Checkout Source (GIT)
      uses: actions/checkout@v2
    - name: Hugo Build
      uses: lowply/build-hugo@v0.75.1
      with:
        # Hugo parameters like --buildDrafts, --baseURL, etc.
        # see https://gohugo.io/getting-started/usage/
        args: --minify
    - name: List files for debugging
      # For debugging list files from current directory to console
      run: ls
    - name: Upload Generated Site (SFTP)
      uses: SamKirkland/FTP-Deploy-Action@3.1.1
      with:
        # eg. replace with secret ${{ secrets.FTP_URI }}/page
        ftp-server: sftp://foo.bar:22/page
        ftp-username: ${{ secrets.FTP_USERNAME }}
        ftp-password: ${{ secrets.FTP_PASSWORD }}
        local-dir: public
        # ignore self-signed certificates
        git-ftp-args: --insecure
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Source: &lt;a href="https://gist.github.com/rueedlinger/c6aa02a41b39d6f1bc6c56bbe86ce5e1"&gt;https://gist.github.com/rueedlinger/c6aa02a41b39d6f1bc6c56bbe86ce5e1&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;My key takeaways are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When you want to learn Hugo take the time and try to create your own theme. It’s fun and youget a better understanding how Hugo works.&lt;/li&gt;
&lt;li&gt;Building your own theme gives you the maximum of flexibility how your site should look.&lt;/li&gt;
&lt;li&gt;Get some inspiration by other themes and look at the source code. Thisis a good way to learn how others have solved some of the problems you might ran into.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I hope this post was useful for you in some way. It was definitely fun for my to build a Hugo theme and play around with GitHub Actions.&lt;/p&gt;

</description>
      <category>hugo</category>
      <category>github</category>
      <category>actions</category>
      <category>blog</category>
    </item>
    <item>
      <title>How Happy is your Team or your Teammates?</title>
      <dc:creator>Matt Rueedlinger</dc:creator>
      <pubDate>Wed, 05 Dec 2018 18:31:48 +0000</pubDate>
      <link>https://dev.to/rueedlinger/how-happy-is-your-team-or-your-teammates-3pgj</link>
      <guid>https://dev.to/rueedlinger/how-happy-is-your-team-or-your-teammates-3pgj</guid>
      <description>&lt;h1&gt;
  
  
  Happiness Index as Conversation Starter in Retrospectives
&lt;/h1&gt;

&lt;p&gt;I know there are plenty of resources, games and ideas how to start or structure a retrospective. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.funretrospectives.com/" rel="noopener noreferrer"&gt;http://www.funretrospectives.com/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://retromat.org/" rel="noopener noreferrer"&gt;https://retromat.org/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://tastycupcakes.org/" rel="noopener noreferrer"&gt;http://tastycupcakes.org/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But I personally like to start with a &lt;strong&gt;Happiness Index&lt;/strong&gt; as an ice breaker activity. It gives you an overview what level of happiness the team currently has. &lt;/p&gt;

&lt;h2&gt;
  
  
  What is the Happiness Index?
&lt;/h2&gt;

&lt;p&gt;Perhaps you may have heard of the &lt;a href="http://worldhappiness.report/" rel="noopener noreferrer"&gt;World Happiness Report&lt;/a&gt;. The World Happiness Report is an annual survey of national happiness, which ranks countries by their happiness levels.&lt;/p&gt;

&lt;p&gt;Basically it’s the same idea, but in this case every team member ranks how happy he was the last few days.  As you can see this is very personal and must not necessarily correlate with the outcome of the last week, iteration, sprint, or whatever.&lt;/p&gt;

&lt;h2&gt;
  
  
  Collect the Data and Comment the Outliers
&lt;/h2&gt;

&lt;p&gt;I usually start with the question &lt;strong&gt;“How happy are you?”&lt;/strong&gt;. Every team member then has to rate the last days and put it on a sticky note. We use sticky notes to avoid a cognitive bias when evaluating the last few days. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fxlzrk0evw37j4s4fp4y8.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fxlzrk0evw37j4s4fp4y8.JPG" alt=" Happiness Index" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I collect the sticky notes and write the rating on the flip chart. I use the rating form 6 to 1. Where 6 is the highest rating and 1 the lowest possible one. This leads to the following scale: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  6 (excellent), &lt;/li&gt;
&lt;li&gt;  5 (good), &lt;/li&gt;
&lt;li&gt;  4 (sufficient), &lt;/li&gt;
&lt;li&gt;  3 (bad), &lt;/li&gt;
&lt;li&gt;  2 (very bad) and &lt;/li&gt;
&lt;li&gt;  1 (catastrophe).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I encourage the team to share with the others their motivations for their personal rating. My goal is that everybody has an understanding what let to that specific rating. &lt;/p&gt;

&lt;p&gt;The interesting cases that should be mentioned are the outliers. For example these could be success stories, lessons learned or stories of struggles with some obstacles. I take some notes and put them on the flip chart, so that we are able to reconstruct what let to that specific rating.&lt;/p&gt;

&lt;p&gt;As team we decide if we should continue and discus some ideas how to improve the happiness of team. This could be the case when the overall rating is very low or someone is not satisfied with the current situation and wants to find a solution together with the team. To get started with the discussion I usually ask &lt;strong&gt;“What could we do as a team to get a better rating?”&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;If there are no urgent topics to address we continue with a common retrospective format like 4L, 3S, etc. (see &lt;a href="http://www.funretrospectives.com/" rel="noopener noreferrer"&gt;http://www.funretrospectives.com/&lt;/a&gt;)&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;I like the Happiness Index because it is a good conversation starter, easy to execute and a handy tool to track how happy the team is over a specific period of time. &lt;/p&gt;

&lt;p&gt;But the main point for me is that it gives everybody in the team the change to address personal issues or share success stories.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F0pizzryg7lkrjc1a6wut.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F0pizzryg7lkrjc1a6wut.png" alt=" Happiness Index Trend" width="783" height="365"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>agile</category>
      <category>retrospective</category>
      <category>culture</category>
      <category>leadership</category>
    </item>
  </channel>
</rss>
