<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rose1845</title>
    <description>The latest articles on DEV Community by Rose1845 (@rose1845).</description>
    <link>https://dev.to/rose1845</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F676917%2F8f9d26e9-9198-4877-9b33-4fef33b661c6.png</url>
      <title>DEV Community: Rose1845</title>
      <link>https://dev.to/rose1845</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rose1845"/>
    <language>en</language>
    <item>
      <title>ETL vs ELT: Which One Should You Use and Why?</title>
      <dc:creator>Rose1845</dc:creator>
      <pubDate>Thu, 09 Apr 2026 07:20:59 +0000</pubDate>
      <link>https://dev.to/rose1845/etl-vs-elt-which-one-should-you-use-and-why-3435</link>
      <guid>https://dev.to/rose1845/etl-vs-elt-which-one-should-you-use-and-why-3435</guid>
      <description>&lt;h2&gt;
  
  
  What’s the Difference Between ETL and ELT?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is ETL?
&lt;/h3&gt;

&lt;p&gt;Extract, transform, and load (ETL) is the process of combining data from multiple sources into a large, central repository called a data warehouse. ETL uses a set of business rules to clean and organize raw data and prepare it for storage, data analytics, and machine learning (ML). &lt;/p&gt;

&lt;h3&gt;
  
  
  What is ELT?
&lt;/h3&gt;

&lt;p&gt;Extract, load, and transform (ELT) is an extension of extract, transform, and load (ETL) that reverses the order of operations. You can load data directly into the target system before processing it. The intermediate staging area is not required because the target data warehouse has data mapping capabilities within it. ELT has become more popular with the adoption of cloud infrastructure, which gives target databases the processing power they need for transformations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ETL process&lt;/strong&gt;&lt;br&gt;
ETL has three steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You extract raw data from various sources&lt;/li&gt;
&lt;li&gt;You use a secondary processing server to transform that data&lt;/li&gt;
&lt;li&gt;You load that data into a target database&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The transformation stage ensures compliance with the target database’s structural requirements. You only move the data once it is transformed and ready.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9n16y3zfr4cimd50ocq8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9n16y3zfr4cimd50ocq8.png" alt=" " width="800" height="295"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ELT process&lt;/strong&gt;&lt;br&gt;
These are the three steps of ELT:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You extract raw data from various sources&lt;/li&gt;
&lt;li&gt;You load it in its natural state into a data warehouse or data lake&lt;/li&gt;
&lt;li&gt;You transform it as needed while in the target system&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With ELT, all data cleansing, transformation, and enrichment occur within the data warehouse. You can interact with and transform the raw data as many times as needed.&lt;/p&gt;
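&lt;p&gt;The same data flow, reordered as ELT, can be sketched with sqlite3 again standing in for the warehouse (rows and names invented): the raw strings land first, and the casting and cleanup happen later inside the target using its own SQL engine.&lt;/p&gt;

```python
import sqlite3

# Load: in ELT the raw data lands in the warehouse first, untransformed.
warehouse = sqlite3.connect(":memory:")  # stand-in for a cloud warehouse
warehouse.execute("CREATE TABLE raw_orders (order_id TEXT, amount TEXT, country TEXT)")
warehouse.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [("1", "19.99", "ke"), ("2", "5.50", "US")],
)

# Transform: later, inside the warehouse, and as many times as needed,
# since the raw data is still there.
warehouse.execute("""
    CREATE TABLE orders AS
    SELECT CAST(order_id AS INTEGER) AS order_id,
           CAST(amount AS REAL)      AS amount,
           UPPER(country)            AS country
    FROM raw_orders
""")
```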

&lt;h3&gt;
  
  
  Differences
&lt;/h3&gt;

&lt;p&gt;Extract, load, and transform (ELT) improves on extract, transform, and load (ETL) in several ways.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Transform and load location&lt;/strong&gt;&lt;br&gt;
Transformation and loading occur in different locations (for example, a database or an API) and use distinct processes. The ETL process transforms data on a secondary processing server.&lt;/p&gt;

&lt;p&gt;In contrast, the ELT process loads raw data directly into the target data warehouse. Once there, you can transform the data whenever you need it. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data compatibility&lt;/strong&gt;&lt;br&gt;
ETL is best suited for structured data that you can represent in tables with rows and columns. It transforms one set of structured data into another structured format and then loads it.&lt;/p&gt;

&lt;p&gt;In contrast, ELT handles all types of data, including unstructured data like images or documents that you can’t store in tabular format. With ELT, the process loads the various data formats into the target data warehouse. From there, you can transform it further into the format you require.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Speed&lt;/strong&gt;&lt;br&gt;
ELT is typically faster than ETL. ETL adds an extra step before the data is loaded into the target, and that step is difficult to scale and slows the system down as the data size increases.&lt;/p&gt;

&lt;p&gt;In contrast, ELT loads data directly into the destination system and transforms it in parallel. It uses the processing power and parallelization that cloud data warehouses offer to deliver real-time or near-real-time data transformation for analytics. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Costs&lt;/strong&gt;&lt;br&gt;
The ETL process requires analytics involvement from the start. Analysts must plan the reports they want to generate and define data structures and formatting. The time required for setup increases, which adds to costs. Additional server infrastructure for transformations may also cost more.&lt;/p&gt;

&lt;p&gt;ELT has fewer systems than ETL, as all transformations occur within the target data warehouse. With fewer systems, there is less to maintain, leading to a simpler data stack and lower setup costs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security&lt;/strong&gt;&lt;br&gt;
When you work with personal data, you must comply with data privacy regulations. Companies must protect personally identifiable information (PII) from unauthorized access.&lt;/p&gt;

&lt;p&gt;In ETL, developers have to build custom solutions, like masking PII to monitor and protect data.&lt;/p&gt;
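&lt;p&gt;As a minimal sketch of that kind of custom masking (the field names and masking rules here are hypothetical, not from any particular ETL tool), a pipeline step might mask PII before the data leaves staging:&lt;/p&gt;

```python
import re

def mask_email(email: str) -> str:
    """Keep the first character and the domain; mask the rest of the local part."""
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain

def mask_record(record: dict) -> dict:
    """Return a copy of the record with PII fields masked."""
    masked = dict(record)
    masked["email"] = mask_email(record["email"])
    # Redact anything that looks like a phone number in free-text fields.
    masked["notes"] = re.sub(r"\+?\d[\d\s-]{7,}\d", "[REDACTED]", record["notes"])
    return masked

row = {"email": "jane.doe@example.com", "notes": "Call +254 712 345 678 re: order"}
clean = mask_record(row)
```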

&lt;p&gt;On the other hand, ELT solutions provide many security features—like granular access control and multifactor authentication—directly within the data warehouse. You can invest more time in analytics and less time in meeting data regulation requirements.&lt;/p&gt;

&lt;h3&gt;
  
  
  When to use ETL vs. ELT
&lt;/h3&gt;

&lt;p&gt;Extract, load, and transform (ELT) is the standard choice for modern analytics. However, you might consider extract, transform, and load (ETL) in the following scenarios.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Legacy databases&lt;/strong&gt;&lt;br&gt;
It is sometimes more beneficial to use ETL to integrate with legacy databases or third-party data sources with predetermined data formats. You only have to transform and load it once into your system. Once transformed, you can use it more efficiently for all future analytics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Experimentation&lt;/strong&gt;&lt;br&gt;
In large organizations, data engineers conduct experiments—things like discovering hidden data sources for analytics and trying out new ideas to answer business queries. ETL is useful in data experiments to understand the database and its usefulness in a particular scenario.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Complex analytics&lt;/strong&gt;&lt;br&gt;
ETL and ELT may both be used together for complex analytics that use multiple data formats from varied sources. Data scientists may set up ETL pipelines from some of the sources and use ELT with the rest. This improves analytics efficiency and increases application performance in some cases.&lt;/p&gt;

&lt;p&gt;For example, here are some common use cases for ETL at the edge:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You want to receive data from different protocols and convert it into standard data formats for use in cloud workloads&lt;/li&gt;
&lt;li&gt;You want to filter high-frequency data, perform averaging functions on large datasets, and then load averaged or filtered values at a reduced rate&lt;/li&gt;
&lt;li&gt;You want to calculate values from disparate data sources on the local device and send filtered values to the cloud backend&lt;/li&gt;
&lt;li&gt;You want to cleanse, deduplicate, or fill missing time series data elements&lt;/li&gt;
&lt;/ul&gt;
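&lt;p&gt;The averaging use case, for instance, might look like this minimal sketch; the readings and window size are made up:&lt;/p&gt;

```python
def downsample(readings: list, window: int) -> list:
    """Average consecutive windows of high-frequency readings so only
    reduced-rate values need to be sent to the cloud backend."""
    return [
        sum(readings[i:i + window]) / len(readings[i:i + window])
        for i in range(0, len(readings), window)
    ]

raw = [20.1, 20.3, 20.2, 21.0, 25.9, 20.8]  # e.g. temperature readings at 1 Hz
reduced = downsample(raw, window=3)  # one averaged value per 3 readings
```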

&lt;h3&gt;
  
  
  Tools used in both approaches
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;AWS Glue: A serverless data integration service for event-driven ETL and no-code ETL jobs.&lt;/li&gt;
&lt;li&gt;Fivetran: An automated, cloud-based platform recognized for ELT, which also supports ETL and integrates with dbt.&lt;/li&gt;
&lt;li&gt;Airbyte: An open-source, flexible platform providing pre-built connectors for both approaches.&lt;/li&gt;
&lt;li&gt;Azure Data Factory: A cloud-based, serverless service designed for managing, moving, and transforming data.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dataengineering</category>
      <category>etl</category>
      <category>datapipeline</category>
    </item>
    <item>
      <title>Connect Power BI to a SQL Database</title>
      <dc:creator>Rose1845</dc:creator>
      <pubDate>Sun, 15 Mar 2026 16:45:41 +0000</pubDate>
      <link>https://dev.to/rose1845/connect-power-bi-to-a-sql-database-1g53</link>
      <guid>https://dev.to/rose1845/connect-power-bi-to-a-sql-database-1g53</guid>
<description>&lt;p&gt;Power BI - Data Visualization Tool - the process of turning raw data into visuals and charts to make it easy for humans to understand the data.&lt;br&gt;
Basically, it is a tool created by Microsoft to turn raw data into interactive insights.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>powerfuldevs</category>
      <category>dataengineering</category>
      <category>database</category>
    </item>
    <item>
      <title>Install Docker in Ubuntu v22.04</title>
      <dc:creator>Rose1845</dc:creator>
      <pubDate>Mon, 09 Mar 2026 16:04:11 +0000</pubDate>
      <link>https://dev.to/rose1845/install-docker-in-ubuntu-v2204-1mg1</link>
      <guid>https://dev.to/rose1845/install-docker-in-ubuntu-v2204-1mg1</guid>
      <description>&lt;p&gt;Type this command below in your terminal &lt;/p&gt;

&lt;p&gt;&lt;code&gt;sudo apt install docker.io docker-compose-v2&lt;br&gt;
&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Then, after the installation finishes:&lt;/p&gt;

&lt;p&gt;Add your user to the docker group&lt;br&gt;
&lt;code&gt;sudo usermod -aG docker $USER&lt;br&gt;
&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Then activate the new group membership without logging out&lt;br&gt;
&lt;code&gt;newgrp docker&lt;br&gt;
&lt;/code&gt;&lt;/p&gt;

</description>
      <category>cli</category>
      <category>docker</category>
      <category>linux</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>SQL Joins and Window Functions</title>
      <dc:creator>Rose1845</dc:creator>
      <pubDate>Mon, 02 Mar 2026 14:13:02 +0000</pubDate>
      <link>https://dev.to/rose1845/sql-joins-and-window-functions-4e7g</link>
      <guid>https://dev.to/rose1845/sql-joins-and-window-functions-4e7g</guid>
      <description>&lt;p&gt;Why SQL join?&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Recombine data - combine the tables into one big result &amp;gt; big picture&lt;/li&gt;
&lt;li&gt;Data enrichment - when you want to pull in extra data from another table&lt;/li&gt;
&lt;li&gt;Check existence - filter rows in one table based on whether matching data exists in another table &amp;gt; filtering&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;When you want to combine two tables, let's say table A and table B, there are two ways to do it:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Columns&lt;/strong&gt; - here, we combine the columns from the tables using SQL JOINs. We have 4 common types of joins:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LEFT JOIN&lt;/li&gt;
&lt;li&gt;RIGHT JOIN&lt;/li&gt;
&lt;li&gt;INNER JOIN&lt;/li&gt;
&lt;li&gt;FULL OUTER JOIN&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Rows&lt;/strong&gt; - when you combine 2 or more tables through rows, we use the SET operators: UNION, UNION ALL, INTERSECT, EXCEPT. Here, the number of columns and their data types must match.&lt;/p&gt;
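&lt;p&gt;The row-wise combination can be sketched with Python's sqlite3. The two tables below are invented, but note that they have the same column count and data types, as required:&lt;/p&gt;

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE online_sales (product TEXT);
    CREATE TABLE store_sales  (product TEXT);
    INSERT INTO online_sales VALUES ('pen'), ('book');
    INSERT INTO store_sales  VALUES ('book'), ('lamp');
""")

def q(sql):
    """Run a single-column query and return the values as a list."""
    return [row[0] for row in db.execute(sql).fetchall()]

union_rows     = q("SELECT product FROM online_sales UNION     SELECT product FROM store_sales")  # distinct rows
union_all_rows = q("SELECT product FROM online_sales UNION ALL SELECT product FROM store_sales")  # keeps duplicates
intersect_rows = q("SELECT product FROM online_sales INTERSECT SELECT product FROM store_sales")  # rows in both
except_rows    = q("SELECT product FROM online_sales EXCEPT    SELECT product FROM store_sales")  # only in the first
```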

&lt;p&gt;Today, we are going to talk about the 4 common types of JOINs.&lt;/p&gt;

&lt;p&gt;LEFT JOIN - returns all rows from the left table, together with matching rows from the right table; where there is no match, the right table's columns are NULL&lt;br&gt;
RIGHT JOIN - returns all rows from the right table, together with matching rows from the left table&lt;br&gt;
INNER JOIN - returns only the rows that match in both tables&lt;br&gt;
FULL OUTER JOIN - returns all rows from both tables, filling in NULLs on either side where there is no match&lt;/p&gt;
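&lt;p&gt;A quick way to see these behaviors is sqlite3 with two tiny invented tables. (FULL OUTER JOIN needs SQLite 3.39 or newer, so only INNER and LEFT are shown here.)&lt;/p&gt;

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders    (customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Amina'), (2, 'Brian'), (3, 'Carol');
    INSERT INTO orders    VALUES (1, 100.0), (1, 40.0), (3, 25.0);
""")

# INNER JOIN: only customers with at least one matching order.
inner = db.execute("""
    SELECT c.name, o.total FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
""").fetchall()

# LEFT JOIN: every customer; order columns are NULL (None) when there is no match.
left = db.execute("""
    SELECT c.name, o.total FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
""").fetchall()
```

&lt;p&gt;Brian has no orders, so he is absent from the INNER JOIN result but appears in the LEFT JOIN result with a NULL total.&lt;/p&gt;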

</description>
      <category>sql</category>
      <category>sqlserver</category>
      <category>postgres</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>Schemas and Data Modelling in Power BI</title>
      <dc:creator>Rose1845</dc:creator>
      <pubDate>Sun, 15 Feb 2026 10:10:46 +0000</pubDate>
      <link>https://dev.to/rose1845/schemas-and-data-modelling-in-power-bi-4458</link>
      <guid>https://dev.to/rose1845/schemas-and-data-modelling-in-power-bi-4458</guid>
      <description>&lt;h2&gt;
  
  
  What is a Schema?
&lt;/h2&gt;

&lt;p&gt;Schema refers to the logical structure of a database or data model that defines how tables are organized and related. In Power BI, schemas are used to optimize data storage, retrieval, and reporting.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Schema Matters in Power BI?
&lt;/h2&gt;

&lt;p&gt;A schema in Power BI is crucial because it defines how data is structured, stored, and connected in a data model. A well-designed schema:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enhances Performance — Optimized schemas improve query speed and report loading time.&lt;/li&gt;
&lt;li&gt;Ensures Data Accuracy — Proper relationships prevent incorrect aggregations or duplications.&lt;/li&gt;
&lt;li&gt;Simplifies Data Analysis — A clear schema makes it easier to create reports and dashboards.&lt;/li&gt;
&lt;li&gt;Improves Scalability — A structured schema allows for easy expansion as data grows.&lt;/li&gt;
&lt;li&gt;Optimizes DAX Calculations — Efficient schemas lead to better DAX performance and calculations.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Types of Schema
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Star schema
Star schema is a mature modeling approach widely adopted by relational data warehouses. It requires modelers to classify their model tables as either dimension or fact.
It is a widely used data modeling approach in Power BI for optimizing performance and simplifying relationships. It consists of:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;Fact Table&lt;/em&gt; (Central Table) — Stores transactional data (e.g., sales, revenue, quantity).&lt;br&gt;
It stores observations or events, and can be sales orders, stock balances, exchange rates, temperatures, and more. A fact table contains dimension key columns that relate to dimension tables, and numeric measure columns. The dimension key columns determine the dimensionality of a fact table, while the dimension key values determine the granularity of a fact table. &lt;br&gt;
&lt;em&gt;For example, consider a fact table designed to store sale targets that has two dimension key columns Date and ProductKey. It's easy to understand that the table has two dimensions. The granularity, however, can't be determined without considering the dimension key values. In this example, consider that the values stored in the Date column are the first day of each month. In this case, the granularity is at month-product level.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Dimension Tables&lt;/em&gt; (Surrounding Tables) — &lt;br&gt;
Describe business entities—the things you model. Entities can include products, people, places, and concepts including time itself. The most consistent table you'll find in a star schema is a date dimension table. A dimension table contains a key column (or columns) that acts as a unique identifier, and other columns. Other columns support filtering and grouping your data.&lt;br&gt;
Contain descriptive attributes (e.g., Date, Product, Customer).&lt;br&gt;
Fewer Joins — Uses one-to-many relationships, reducing complexity and improving query speed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft1d8hcpifp8xoic0vbab.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft1d8hcpifp8xoic0vbab.png" alt=" " width="800" height="543"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Normalization vs. denormalization
&lt;/h2&gt;

&lt;p&gt;To understand some star schema concepts described in this article, it's important to know two terms: normalization and denormalization.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Normalization&lt;/em&gt; is the term used to describe data that's stored in a way that reduces repetitious data.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcdul8lsznmp1oi4l6mtq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcdul8lsznmp1oi4l6mtq.png" alt=" " width="800" height="411"&gt;&lt;/a&gt;&lt;br&gt;
If, however, the sales table stores product details beyond the key, it's considered denormalized. In the following image, notice that the ProductKey and other product-related columns record the product.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F27ecn9cp8d5yiyw1sjp6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F27ecn9cp8d5yiyw1sjp6.png" alt=" " width="800" height="486"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Measures&lt;/em&gt;&lt;br&gt;
In star schema design, a measure is a fact table column that stores values to be summarized. In a Power BI semantic model, a measure has a different—but similar—definition. A model supports both explicit and implicit measures.&lt;br&gt;
&lt;em&gt;Explicit measures&lt;/em&gt; are expressly created and they're based on a formula written in Data Analysis Expressions (DAX) that achieves summarization. Measure expressions often use DAX aggregation functions like SUM, MIN, MAX, AVERAGE, and others to produce a scalar value result at query time (values are never stored in the model). Measure expressions can range from simple column aggregations to more sophisticated formulas that override filter context and/or relationship propagation. For more information, read about DAX Basics in Power BI Desktop.&lt;br&gt;
&lt;em&gt;Implicit measures&lt;/em&gt; are columns that can be summarized by a report visual or Q&amp;amp;A. They offer a convenience for you as a model developer, as in many instances you don't need to create (explicit) measures. For example, the Adventure Works reseller sales Sales Amount column can be summarized in numerous ways (sum, count, average, median, min, max, and others), without the need to create a measure for each possible aggregation type.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fakfc5rtuby6bbld0tm4z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fakfc5rtuby6bbld0tm4z.png" alt=" " width="800" height="419"&gt;&lt;/a&gt;&lt;br&gt;
A &lt;em&gt;surrogate key&lt;/em&gt; is a unique identifier that you add to a table to support star schema modeling. By definition, it's not defined or stored in the source data. Commonly, surrogate keys are added to relational data warehouse dimension tables to provide a unique identifier for each dimension table row.&lt;/p&gt;

&lt;p&gt;Power BI semantic model relationships are based on a single unique column in one table, which propagates filters to a single column in a different table. When a dimension table in your semantic model doesn't include a single unique column, you must add a unique identifier to become the "one" side of a relationship. In Power BI Desktop, you can achieve this requirement by adding a Power Query index column.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyx8wyulyc5rnlllwtszr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyx8wyulyc5rnlllwtszr.png" alt=" " width="800" height="386"&gt;&lt;/a&gt;&lt;br&gt;
Advantages of Star Schema&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Optimized for Performance — Fewer joins mean faster queries and better report speed.&lt;/li&gt;
&lt;li&gt;Simplifies DAX Calculations — Flat structure makes it easier to create measures and aggregations.&lt;/li&gt;
&lt;li&gt;Enhances Data Visualization — Works seamlessly with Power BI’s data model and relationships.&lt;/li&gt;
&lt;li&gt;Reduces Complexity — Easier to design, manage, and scale compared to Snowflake or Galaxy schemas.&lt;/li&gt;
&lt;/ul&gt;

&lt;ol start="2"&gt;
&lt;li&gt;Snowflake schema
A snowflake schema is a set of normalized tables for a single business entity. For example, Adventure Works classifies products by category and subcategory. Products are assigned to subcategories, and subcategories are in turn assigned to categories. In the Adventure Works relational data warehouse, the product dimension is normalized and stored in three related tables: DimProductCategory, DimProductSubcategory, and DimProduct.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg2u69krbgp3d6oa04ayy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg2u69krbgp3d6oa04ayy.png" alt=" " width="756" height="548"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvpnrjetzr0js1dw30i6r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvpnrjetzr0js1dw30i6r.png" alt=" " width="756" height="621"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;In Power BI Desktop, you can choose to mimic a snowflake dimension design (perhaps because your source data does) or combine the source tables to form a single, denormalized model table. Generally, the benefits of a single model table outweigh the benefits of multiple model tables. The most optimal decision can depend on the volumes of data and the usability requirements for the model.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Power BI loads more tables, which is less efficient from storage and performance perspectives. These tables must include columns to support model relationships, and it can result in a larger model size.&lt;br&gt;
Longer relationship filter propagation chains need to be traversed, which might be less efficient than filters applied to a single table.&lt;br&gt;
The Data pane presents more model tables to report authors, which can result in a less intuitive experience, especially when snowflake dimension tables contain only one or two columns.&lt;br&gt;
It's not possible to create a hierarchy that comprises columns from more than one table.&lt;br&gt;
When you choose to integrate into a single model table, you can also define a hierarchy that encompasses the highest and lowest grain of the dimension. Possibly, the storage of redundant denormalized data can result in increased model storage size, particularly for large dimension tables.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6zesdr4frb2wsffjfdeb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6zesdr4frb2wsffjfdeb.png" alt=" " width="756" height="668"&gt;&lt;/a&gt;&lt;br&gt;
Advantages of Snowflake Schema&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Less Data Redundancy — Normalized tables reduce duplication.&lt;/li&gt;
&lt;li&gt;Better Data Integrity — Structured data ensures consistency.&lt;/li&gt;
&lt;li&gt;Efficient for Large Datasets — Optimized for big data storage.&lt;/li&gt;
&lt;li&gt;Easier Maintenance — Updates are more manageable.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Model relationships
&lt;/h2&gt;

&lt;p&gt;A model relationship propagates filters applied on the column of one model table to a different model table. Filters will propagate so long as there's a relationship path to follow, which can involve propagation to multiple tables.&lt;/p&gt;

&lt;p&gt;Relationship paths are deterministic, meaning that filters are always propagated in the same way and without random variation. Relationships can, however, be disabled, or have filter context modified by model calculations that use particular Data Analysis Expressions (DAX) functions&lt;/p&gt;

&lt;h2&gt;
  
  
  Data types of columns
&lt;/h2&gt;

&lt;p&gt;The data type for both the "from" and "to" column of the relationship should be the same. Working with relationships defined on DateTime columns might not behave as expected. The engine that stores Power BI data only uses DateTime data types; Date, Time, and Date/Time/Timezone data types are Power BI formatting constructs implemented on top. Any model-dependent objects will still appear as DateTime in the engine (such as relationships, groups, and so on). As such, if a user selects Date from the Modeling tab for such columns, they still don't register as being the same date, because the time portion of the data is still being considered by the engine.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cardinality
&lt;/h2&gt;

&lt;p&gt;Each model relationship is defined by a cardinality type. There are four cardinality type options, representing the data characteristics of the "from" and "to" related columns. The "one" side means the column contains unique values; the "many" side means the column can contain duplicate values.&lt;br&gt;
&lt;em&gt;If a data refresh operation attempts to load duplicate values into a "one" side column, the entire data refresh will fail.&lt;/em&gt;&lt;br&gt;
The four options, together with their shorthand notations, are described in the following list:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;One-to-many (1:*)&lt;/li&gt;
&lt;li&gt;Many-to-one (*:1)&lt;/li&gt;
&lt;li&gt;One-to-one (1:1)&lt;/li&gt;
&lt;li&gt;Many-to-many (*:*)&lt;/li&gt;
&lt;/ol&gt;
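&lt;p&gt;The note above about refreshes failing comes down to a uniqueness rule: the "one" side key column must contain no duplicate values. A sketch of that check, with hypothetical product keys:&lt;/p&gt;

```python
from collections import Counter

def one_side_violations(keys):
    """Return key values that appear more than once; a non-empty result
    means the column cannot serve as the "one" side of a relationship."""
    return [k for k, n in Counter(keys).items() if n > 1]

product_keys = [101, 102, 103, 102]  # dimension table key column (invented)
duplicates = one_side_violations(product_keys)  # a refresh with this data would fail
```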

&lt;p&gt;&lt;em&gt;One-to-many (and many-to-one) cardinality&lt;/em&gt;&lt;br&gt;
The one-to-many and many-to-one cardinality options are essentially the same, and they're also the most common cardinality types.&lt;/p&gt;

&lt;p&gt;When you configure a one-to-many or many-to-one relationship, choose the one that matches the order in which you related the columns. Consider how you would configure the relationship from the Product table to the Sales table by using the ProductID column found in each table. The cardinality type would be one-to-many, as the ProductID column in the Product table contains unique values. If you related the tables in the reverse direction, Sales to Product, then the cardinality would be many-to-one.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;One-to-one cardinality&lt;/em&gt;&lt;br&gt;
A one-to-one relationship means both columns contain unique values. This cardinality type isn't common, and it likely represents a suboptimal model design because of the storage of redundant data.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Many-to-many cardinality&lt;/em&gt;&lt;br&gt;
A many-to-many relationship means both columns can contain duplicate values. This cardinality type is infrequently used. It's typically useful when designing complex model requirements. You can use it to relate many-to-many facts or to relate higher grain facts. For example, when sales target facts are stored at product category level and the product dimension table is stored at product level.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Cross filter direction&lt;/em&gt;&lt;br&gt;
Single cross filter direction means "single direction", and Both means "both directions". A relationship that filters in both directions is commonly described as bi-directional.&lt;/p&gt;

&lt;p&gt;For one-to-many relationships, the cross filter direction is always from the "one" side, and optionally from the "many" side (bi-directional). For one-to-one relationships, the cross filter direction is always from both tables. Lastly, for many-to-many relationships, cross filter direction can be from either one of the tables, or from both tables. Notice that when the cardinality type includes a "one" side, that filters will always propagate from that side.&lt;/p&gt;

</description>
      <category>bigdata</category>
      <category>database</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>Linux for Data Engineers: A Beginner-Friendly Guide</title>
      <dc:creator>Rose1845</dc:creator>
      <pubDate>Sun, 25 Jan 2026 17:15:06 +0000</pubDate>
      <link>https://dev.to/rose1845/linux-for-data-engineers-a-beginner-friendly-guide-4bgp</link>
      <guid>https://dev.to/rose1845/linux-for-data-engineers-a-beginner-friendly-guide-4bgp</guid>
<description>&lt;p&gt;If you’re getting into data engineering, Linux is not optional: it’s a core skill.&lt;br&gt;
Most data systems in the real world run on Linux, and knowing your way around the terminal makes your work faster, cleaner, and more powerful.&lt;/p&gt;

&lt;p&gt;This article explains why Linux matters for data engineers, introduces essential Linux commands, and shows how to create and edit files using Vi and Nano, all in plain language.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why Linux Is Important for Data Engineers
&lt;/h2&gt;

&lt;p&gt;As a data engineer, you will work with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data pipelines (ETL / ELT)&lt;/li&gt;
&lt;li&gt;Servers and cloud machines (AWS, GCP, Azure)&lt;/li&gt;
&lt;li&gt;Databases (Postgres, MySQL)&lt;/li&gt;
&lt;li&gt;Big data tools (Spark, Kafka, Airflow)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Almost all of these run on Linux servers.&lt;/p&gt;

&lt;p&gt;Linux helps you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Work directly on production servers&lt;/li&gt;
&lt;li&gt;Automate tasks using scripts&lt;/li&gt;
&lt;li&gt;Debug issues quickly&lt;/li&gt;
&lt;li&gt;Handle large files efficiently&lt;/li&gt;
&lt;li&gt;Understand how data flows at the system level&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you can use Linux confidently, you immediately stand out as “production-ready”.&lt;/p&gt;
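&lt;p&gt;As a quick taste, here is how a few everyday commands help with large files (the file names below are just illustrative):&lt;/p&gt;

```shell
wc -l big_dataset.csv          # count rows without opening the file
head -n 5 big_dataset.csv      # peek at the first five rows
grep -c "ERROR" pipeline.log   # count error lines in a log
```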
&lt;h2&gt;
  
  
  Understanding the Linux Terminal
&lt;/h2&gt;

&lt;p&gt;The terminal is just a way to talk to your computer using commands instead of clicking buttons. For example, &lt;code&gt;ls&lt;/code&gt; shows what files are in the current directory.&lt;/p&gt;

&lt;h2&gt;
  
  
  Essential Linux Commands for Data Engineers
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;pwd&lt;/code&gt; – Where am I?&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pwd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Output: &lt;code&gt;/home/rose&lt;/code&gt;. This shows your current directory.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ls&lt;/code&gt; – List files&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ls
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Output: &lt;code&gt;data  scripts  README.md&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Common options:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ls -l   # detailed view
ls -a   # include hidden files
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;code&gt;cd&lt;/code&gt; – Change directory&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cd dev   # move into the dev folder
cd ..    # go back one level
cd ~     # go to your home directory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;code&gt;mkdir&lt;/code&gt; – Create folders&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mkdir dataengineering
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This is very common when organizing ETL jobs.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;touch&lt;/code&gt; – Create files&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;touch extract_data.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Creates an empty file, perfect for scripts.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;cat&lt;/code&gt; – View file content&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat README.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For large files, page through them with &lt;code&gt;less&lt;/code&gt; instead of &lt;code&gt;cat&lt;/code&gt;. Inside the pager:&lt;br&gt;
&lt;code&gt;q&lt;/code&gt; → quit&lt;br&gt;
&lt;code&gt;/error&lt;/code&gt; → search for “error”&lt;br&gt;
This is extremely useful for debugging pipelines.&lt;/p&gt;

&lt;h2&gt;
  
  
  Editing Files with Nano
&lt;/h2&gt;

&lt;p&gt;Nano is simple and safe for beginners. Open a file with Nano:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nano extract_data.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Write:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print("Extracting data...")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Nano shortcuts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CTRL + O → Save
Enter → Confirm
CTRL + X → Exit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nano shows these shortcuts at the bottom of the screen.&lt;/p&gt;

&lt;h2&gt;
  
  
  Editing Files with Vi
&lt;/h2&gt;

&lt;p&gt;Vi (or Vim) is everywhere on Linux servers. Open a file:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;vi transform.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Vi has modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Normal mode – navigation&lt;/li&gt;
&lt;li&gt;Insert mode – typing&lt;/li&gt;
&lt;li&gt;Command mode – saving &amp;amp; quitting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To start typing, press &lt;code&gt;i&lt;/code&gt;, then type:&lt;/p&gt;

&lt;p&gt;SELECT * FROM users;&lt;/p&gt;

&lt;p&gt;Save and exit: press &lt;code&gt;ESC&lt;/code&gt;, then type &lt;code&gt;:wq&lt;/code&gt; and press Enter.&lt;br&gt;
Exit without saving: &lt;code&gt;:q!&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Practical Example: Creating a Data Script
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mkdir etl
cd etl
touch extract.sh
nano extract.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inside the file:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#!/bin/bash
echo "Starting data extraction..."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Make it executable:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;chmod +x extract.sh&lt;br&gt;
&lt;/code&gt;&lt;br&gt;
Run it:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;./extract.sh&lt;br&gt;
&lt;/code&gt;&lt;br&gt;
Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Starting data extraction...

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Permissions
&lt;/h2&gt;

&lt;p&gt;Linux controls who can read, write, or execute files.&lt;br&gt;
Check permissions:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ls -l&lt;br&gt;
&lt;/code&gt;&lt;br&gt;
Example:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;-rwxr-xr-- extract.sh&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Meaning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Owner can read/write/execute&lt;/li&gt;
&lt;li&gt;Group can read/execute&lt;/li&gt;
&lt;li&gt;Others can read&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This matters a lot on shared servers.&lt;/p&gt;
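&lt;p&gt;A small sketch of changing permissions with &lt;code&gt;chmod&lt;/code&gt; (the file name here is illustrative):&lt;/p&gt;

```shell
touch report.sh        # new file, not yet executable
chmod 754 report.sh    # owner: rwx, group: r-x, others: r--
ls -l report.sh        # the first column now reads -rwxr-xr--
```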

&lt;h2&gt;
  
  
  Where You’ll Use These Skills as a Data Engineer
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;SSH into cloud servers&lt;/li&gt;
&lt;li&gt;Edit Airflow DAGs&lt;/li&gt;
&lt;li&gt;Inspect Spark logs&lt;/li&gt;
&lt;li&gt;Manage cron jobs&lt;/li&gt;
&lt;li&gt;Automate daily pipelines&lt;/li&gt;
&lt;li&gt;Debug production failures&lt;/li&gt;
&lt;/ul&gt;
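&lt;p&gt;For instance, a daily pipeline is often just a cron entry (the paths here are hypothetical; edit your own crontab with &lt;code&gt;crontab -e&lt;/code&gt;):&lt;/p&gt;

```
# minute hour day month weekday  command
0 2 * * * /home/rose/etl/extract.sh &gt;&gt; /home/rose/etl/extract.log 2&gt;&amp;1
```

This runs &lt;code&gt;extract.sh&lt;/code&gt; every day at 02:00 and appends its output to a log file.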

&lt;p&gt;Linux is the operating system of data infrastructure.&lt;/p&gt;

</description>
      <category>linux</category>
      <category>dataengineering</category>
      <category>data</category>
      <category>programming</category>
    </item>
    <item>
      <title>How to Set Up GPG Keys for an Existing GitHub Account (Step-by-Step)</title>
      <dc:creator>Rose1845</dc:creator>
      <pubDate>Sun, 18 Jan 2026 11:40:35 +0000</pubDate>
      <link>https://dev.to/rose1845/how-to-set-up-gpg-keys-for-an-existing-github-account-step-by-step-2fj7</link>
      <guid>https://dev.to/rose1845/how-to-set-up-gpg-keys-for-an-existing-github-account-step-by-step-2fj7</guid>
      <description>&lt;p&gt;When working with Git and GitHub, you may notice a “Verified” badge on some commits. This badge means the commit was cryptographically signed, proving it truly came from the author and wasn’t tampered with.&lt;br&gt;
In this article, you’ll learn how to set up GPG keys for an existing GitHub account and start signing your commits.&lt;/p&gt;
&lt;h2&gt;
  
  
  What Is a GPG Key and Why Does It Matter?
&lt;/h2&gt;

&lt;p&gt;GPG (GNU Privacy Guard) is a tool used to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Digitally sign commits and tags&lt;/li&gt;
&lt;li&gt;Prove authorship and integrity&lt;/li&gt;
&lt;li&gt;Improve security and trust in collaborative projects&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Benefits of signing commits:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Your commits show as Verified on GitHub&lt;/li&gt;
&lt;li&gt;Protects against commit spoofing&lt;/li&gt;
&lt;li&gt;Builds credibility as a developer&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before you begin, make sure you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A GitHub account&lt;/li&gt;
&lt;li&gt;Git installed&lt;/li&gt;
&lt;li&gt;GPG installed on your system&lt;/li&gt;
&lt;li&gt;Terminal access&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Step 1: Check If GPG Is Installed
&lt;/h2&gt;

&lt;p&gt;Run this command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gpg &lt;span class="nt"&gt;--version&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If GPG is not installed:&lt;br&gt;
Ubuntu / Debian&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;gnupg

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;macOS (Homebrew)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;gnupg

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Windows&lt;/p&gt;

&lt;p&gt;Install Gpg4win from the official site.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Generate a New GPG Key
&lt;/h2&gt;

&lt;p&gt;Run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gpg &lt;span class="nt"&gt;--full-generate-key&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When prompted:&lt;br&gt;
Key type: RSA and RSA&lt;/p&gt;

&lt;p&gt;Key size: 4096&lt;/p&gt;

&lt;p&gt;Expiration: Choose what works for you (e.g., 1y or 0 for no expiry)&lt;/p&gt;

&lt;p&gt;Name &amp;amp; Email:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Use the same email address as your GitHub account&lt;br&gt;
Passphrase: Use a strong one (don’t forget it)&lt;br&gt;
After completion, your GPG key is created &lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  Step 3: List Your GPG Keys and Copy the Key ID
&lt;/h2&gt;

&lt;p&gt;Run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gpg &lt;span class="nt"&gt;--list-secret-keys&lt;/span&gt; &lt;span class="nt"&gt;--keyid-format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;long

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/home/nyaugenya/.gnupg/pubring.kbx
----------------------------------
sec   rsa3072/CBC3C9CAC3450592 2025-12-17 [SC] [expires: 2027-12-17]
      DD88627124BA164FD7D531C8CBC3C9CAC3450592
uid                 [ultimate] nyaugenya (go!!!) &amp;lt;test@gmail.com&amp;gt;
ssb   rsa3072/4DB25F105F5D7F76 2025-12-17 [E] [expires: 2027-12-17]


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Copy the long key ID shown after the algorithm on the &lt;code&gt;sec&lt;/code&gt; line (here &lt;code&gt;CBC3C9CAC3450592&lt;/code&gt; after &lt;code&gt;rsa3072/&lt;/code&gt;), or use the full fingerprint on the line below it. Both work with Git.&lt;br&gt;
Example: DD88627124BA164FD7D531C8CBC3C9CAC3450592&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 4: Export the GPG Public Key
&lt;/h2&gt;

&lt;p&gt;Run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gpg &lt;span class="nt"&gt;--armor&lt;/span&gt; &lt;span class="nt"&gt;--export&lt;/span&gt; DD88627124BA164FD7D531C8CBC3C9CAC3450592

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Copy everything, including:&lt;br&gt;
-----BEGIN PGP PUBLIC KEY BLOCK-----&lt;br&gt;
...&lt;br&gt;
-----END PGP PUBLIC KEY BLOCK-----&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 5: Add the GPG Key to GitHub
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Go to GitHub → Settings&lt;/li&gt;
&lt;li&gt;Click SSH and GPG keys&lt;/li&gt;
&lt;li&gt;Under GPG keys, click New GPG key&lt;/li&gt;
&lt;li&gt;Paste the copied key&lt;/li&gt;
&lt;li&gt;Click Add GPG key&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;GitHub now knows your signing key.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 6: Tell Git to Use Your GPG Key
&lt;/h2&gt;

&lt;p&gt;Configure Git with your key ID:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git config &lt;span class="nt"&gt;--global&lt;/span&gt; user.signingkey DD88627124BA164FD7D531C8CBC3C9CAC3450592
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enable commit signing by default:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git config &lt;span class="nt"&gt;--global&lt;/span&gt; commit.gpgsign &lt;span class="nb"&gt;true&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Make sure your Git email matches GitHub:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git config &lt;span class="nt"&gt;--global&lt;/span&gt; user.email &lt;span class="s2"&gt;"test@gmail.com"&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tell Git to automatically GPG-sign all tags you create:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git config &lt;span class="nt"&gt;--global&lt;/span&gt; tag.gpgSign &lt;span class="nb"&gt;true&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 7: (Linux) Fix “GPG Failed to Sign the Data” Error
&lt;/h2&gt;

&lt;p&gt;If you see this error, run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;GPG_TTY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;tty&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To make it permanent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'export GPG_TTY=$(tty)'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.bashrc

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then reload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;source&lt;/span&gt; ~/.bashrc

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 8: Make a Signed Commit
&lt;/h2&gt;

&lt;p&gt;Create a commit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"My first signed commit"&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or explicitly sign:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git commit &lt;span class="nt"&gt;-S&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"Signed commit"&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Push your changes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git push

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
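&lt;p&gt;To double-check locally that your latest commit carries a signature, standard Git can show it:&lt;/p&gt;

```shell
git log --show-signature -1   # prints GPG signature details for the most recent commit
```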



</description>
      <category>git</category>
      <category>github</category>
      <category>dataengineering</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>Git for Beginners: What It Is, Why It Matters, and How to Use It with GitHub</title>
      <dc:creator>Rose1845</dc:creator>
      <pubDate>Sat, 17 Jan 2026 20:32:59 +0000</pubDate>
      <link>https://dev.to/rose1845/git-for-beginners-what-it-is-why-it-matters-and-how-to-use-it-with-github-5h6b</link>
      <guid>https://dev.to/rose1845/git-for-beginners-what-it-is-why-it-matters-and-how-to-use-it-with-github-5h6b</guid>
      <description>&lt;p&gt;All of this revolves around Git and version control.&lt;br&gt;
This article will walk you through:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;What Git is and why version control is important&lt;/li&gt;
&lt;li&gt;How to push code to GitHub&lt;/li&gt;
&lt;li&gt;How to pull code from GitHub&lt;/li&gt;
&lt;li&gt;How to track changes using Git&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What Is Git?
&lt;/h2&gt;

&lt;p&gt;Git is a version control system. In simple terms, Git helps you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keep track of changes in your code&lt;/li&gt;
&lt;li&gt;Go back to previous versions if something breaks&lt;/li&gt;
&lt;li&gt;Work with other developers without overwriting each other’s work&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why Is Version Control Important?
&lt;/h2&gt;

&lt;p&gt;Without version control, it’s easy to lose work, overwrite a teammate’s changes, or drown in copies like &lt;code&gt;final_v2_REAL.py&lt;/code&gt;. Version control solves these problems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benefits of Git &amp;amp; Version Control
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;History tracking – See who changed what and when&lt;/li&gt;
&lt;li&gt;Backup – Your code is safely stored remotely&lt;/li&gt;
&lt;li&gt;Collaboration – Multiple people can work on the same project&lt;/li&gt;
&lt;li&gt;Undo mistakes – Easily revert to a previous version&lt;/li&gt;
&lt;li&gt;Branching – Work on new features without breaking the main code&lt;/li&gt;
&lt;/ul&gt;
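&lt;p&gt;As a quick taste of branching, these are standard Git commands (the branch name is made up):&lt;/p&gt;

```shell
git checkout -b new-feature   # create and switch to a new branch
# ...edit files and commit as usual...
git checkout main             # switch back; main is untouched
```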

&lt;h2&gt;
  
  
  What Is GitHub?
&lt;/h2&gt;

&lt;p&gt;GitHub is a platform that hosts Git repositories online.&lt;br&gt;
Git is the tool (installed on your computer).&lt;br&gt;
GitHub is the service that stores your Git projects on the internet.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installing Git
&lt;/h2&gt;

&lt;p&gt;Before using Git, install it:&lt;/p&gt;

&lt;p&gt;Windows / macOS / Linux:&lt;br&gt;
(&lt;a href="https://git-scm.com/install/" rel="noopener noreferrer"&gt;https://git-scm.com/install/&lt;/a&gt;)&lt;br&gt;
Verify installation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git --version

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Basic Git Setup (One-Time)
&lt;/h2&gt;

&lt;p&gt;Tell Git who you are:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git config --global user.name "Rose1845"                     # replace with your GitHub username
git config --global user.email "odhiamborose466@gmail.com"   # replace with your own email
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;
  
  
  How to Push Code to GitHub
&lt;/h2&gt;

&lt;p&gt;Step 1: Create a Repository on GitHub&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com" rel="noopener noreferrer"&gt;Go to https://github.com&lt;br&gt;
&lt;/a&gt;&lt;br&gt;
Click New Repository&lt;/p&gt;

&lt;p&gt;Give it a name&lt;/p&gt;

&lt;p&gt;Click Create repository&lt;/p&gt;

&lt;p&gt;Step 2: Initialize Git Locally&lt;/p&gt;

&lt;p&gt;Inside your project folder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git init

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a hidden .git folder that Git uses to track changes.&lt;/p&gt;
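&lt;p&gt;You can see the hidden folder yourself (the project name below is just an example):&lt;/p&gt;

```shell
git init demo
ls -a demo   # .git appears alongside your files
```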

&lt;p&gt;Step 3: Track Files&lt;/p&gt;

&lt;p&gt;Check file status:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git status

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add files to Git:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git add .

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Step 4: Commit Changes&lt;/p&gt;

&lt;p&gt;A commit is a snapshot of your code at a specific point in time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git commit -m "Initial commit"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Step 5: Connect to GitHub&lt;br&gt;
Copy the repository URL from GitHub, then run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git remote add origin https://github.com/username/repository-name.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Step 6: Push to GitHub&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git branch -M main

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git push -u origin main

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your code is now on GitHub!&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Pull Code from GitHub
&lt;/h2&gt;

&lt;p&gt;Pulling means downloading the latest changes from GitHub.&lt;/p&gt;

&lt;p&gt;Clone a Repository (First Time):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/username/repository-name.git

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a local copy on your machine.&lt;/p&gt;

&lt;p&gt;Pull Latest Changes&lt;br&gt;
If you already cloned the repo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git pull origin main(name of your branch in this case it's main)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fetches new changes&lt;/li&gt;
&lt;li&gt;Merges them into your local code&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How to Track Changes Using Git
&lt;/h2&gt;

&lt;p&gt;Git gives you powerful tools to see what’s happening in your project.&lt;/p&gt;

&lt;p&gt;Check file status:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git status

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Modified files&lt;/li&gt;
&lt;li&gt;Staged files&lt;/li&gt;
&lt;li&gt;Untracked files&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;See Changes in a File&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git diff
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Shows what changed before committing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;View Commit History&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This displays:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Commit IDs&lt;/li&gt;
&lt;li&gt;Authors&lt;/li&gt;
&lt;li&gt;Dates&lt;/li&gt;
&lt;li&gt;Messages&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Short Commit History&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git log --oneline

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Great for a quick overview.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  A Typical Git Workflow
&lt;/h2&gt;

&lt;p&gt;Most projects follow this cycle.&lt;/p&gt;

&lt;p&gt;Make changes to code, then check status:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git status

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Add changes&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git add .

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Commit changes&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git commit -m "Describe what you changed"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Push to GitHub&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git push

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>softwareengineering</category>
      <category>dataengineering</category>
      <category>git</category>
      <category>github</category>
    </item>
  </channel>
</rss>
