<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Massy</title>
    <description>The latest articles on DEV Community by Massy (@alumassy).</description>
    <link>https://dev.to/alumassy</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1055087%2F76f8c9b3-d5ee-44cd-85da-fa2b7d5a3998.jpg</url>
      <title>DEV Community: Massy</title>
      <link>https://dev.to/alumassy</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alumassy"/>
    <language>en</language>
    <item>
      <title>The evolution of Data Engineering and the role of ELT tools</title>
      <dc:creator>Massy</dc:creator>
      <pubDate>Mon, 12 May 2025 10:10:32 +0000</pubDate>
      <link>https://dev.to/alumassy/the-evolution-of-data-engineering-and-the-role-of-elt-tools-1al2</link>
      <guid>https://dev.to/alumassy/the-evolution-of-data-engineering-and-the-role-of-elt-tools-1al2</guid>
      <description>&lt;p&gt;Data engineering has progressed rapidly in the past 3 decades. The warp speed changes in the field have created a significant knowledge gap for existing data Engineers, people interested in moving into a career in data engineering, data Scientists, machine learning engineers, BI &amp;amp; analytics teams, software &amp;amp; infrastructure teams as well as executives who want to better understand how data engineering fits into their companies.&lt;/p&gt;

&lt;p&gt;In the data engineering space, a good deal of ceremony occurs around data movement and processing to effectively support downstream use cases such as data science (AI/ML), business intelligence, and operational analytics in production. It therefore comes as no surprise that data movement and processing practices and tools are at the forefront of the data engineering evolution.&lt;/p&gt;

&lt;p&gt;This article begins by explaining the long-established data movement and processing pattern known as ETL and the shift to the newer pattern known as ELT. It then covers the ELT process in detail and its benefits. Subsequent sections highlight the top ELT tools, describe the Airbyte approach to ELT, and feature crucial data engineering predictions that might redefine the way we work with, process, and harness data throughout 2025 and beyond.&lt;/p&gt;

&lt;h2&gt;
  
  
  The evolution of Data Engineering
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Birth of Data Engineering
&lt;/h3&gt;

&lt;p&gt;Data engineering &lt;strong&gt;as a practice&lt;/strong&gt; has existed in some form since companies started doing things with data—such as predictive analytics, descriptive analytics, and reports. Before it came into sharp focus &lt;strong&gt;as a distinct field&lt;/strong&gt; alongside the rise of data science in the 2010s, the practice had been branded in a whole host of different ways, including as Database Administration, Data Analysis, Business Intelligence Engineering, Database Development, and more.&lt;/p&gt;

&lt;p&gt;The birth of Data Engineering can arguably be traced back to data warehousing, originating as early as the 1970s, with the business data warehouse taking shape in the 1980s, and Bill Inmon officially coining the term "data warehouse" in 1989. The advent of database technology in this period saw enterprises employ online transactional processing (OLTP) systems, which offered efficient methods for storing, querying, and updating transactional and operational data, typically managed by relational database management systems (RDBMS). OLTP systems were designed for application-oriented data collection and maintaining the most current state of the enterprise, optimised for multiple, concurrent, and fast reads and writes, ensuring ACID properties (atomicity, consistency, isolation, durability).&lt;/p&gt;

&lt;p&gt;With the ability to manage data logistics, the next logical step for enterprises was to leverage this data for insights and profitability. This led to the emergence of online analytical processing (OLAP) systems around the mid-1990s, which became the cornerstone of business intelligence and decision support. OLAP systems utilise a data warehouse (DW) or enterprise data warehouse (EDW) specifically designed for analytical processes, maintaining historical and cumulative data and performing heavy operations like user-defined functions, aggregates, and complex joins for business analysis.&lt;/p&gt;

&lt;h3&gt;
  
  
  Emergence of ETL
&lt;/h3&gt;

&lt;p&gt;To move data from OLTP systems (the current state) to OLAP systems (the historical data), ETL (Extract, Transform, Load) processes emerged. In its original form, still relevant today, ETL involves identifying and extracting relevant data from various sources, transforming it for cleansing and customisation, and finally loading it into a data warehouse. This often involves a Data Staging Area (DSA) where transformations take place before loading into fact and dimension tables. Early ETL processes faced challenges such as schema mapping, data cleansing and quality, complex transformations, and a lack of standardisation.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Shift to ELT
&lt;/h3&gt;

&lt;p&gt;A significant shift in data engineering occurred with the advent of the modern data stack, catalysed by the release of Amazon Redshift, a &lt;strong&gt;cloud-native&lt;/strong&gt; massively parallel processing (MPP) / OLAP database, in October 2012. While ETL served as the primary method for data processing for decades, the evolution of cloud technologies led to the rise of Extract, Load, Transform (ELT) as a modern alternative. Traditionally, data was transformed before loading into the data warehouse because the warehouse was often too slow and constrained to handle heavy processing itself. Business intelligence (BI) tools also performed local data processing to circumvent warehouse bottlenecks, and data processing was centrally governed to avoid overwhelming the warehouse.&lt;/p&gt;

&lt;p&gt;The cloud, a significant 21st-century innovation, revolutionised how data is extracted, loaded, and transformed. The cloud flips the on-premises model by offering rented hardware and managed services, allowing for dynamic scaling of resources. This scalability and the pay-as-you-go model of cloud data warehouses have made them accessible even to smaller companies.&lt;/p&gt;

&lt;p&gt;In the ELT data warehouse architecture, data is moved more or less directly from production systems into a staging area within the data warehouse in its raw form. Transformations are then handled directly within the data warehouse, leveraging the massive computational power of cloud data warehouses and processing tools. This data is processed in batches, and the transformed output is written into tables and views for analytics. ELT is also popular today in streaming arrangements, where events are streamed and subsequently transformed within the data warehouse.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the ELT Process
&lt;/h2&gt;

&lt;p&gt;The ELT process comprises three main stages: Extract, Load, and Transform.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extract: This involves retrieving data from various source systems, which can be pull-based or push-based and may require reading metadata and schema changes. Data is extracted from sources like relational databases, CRM systems, cloud applications, or APIs – essentially the data intended for eventual analytics use. Accessing these diverse sources can be simplified by managed data connector platforms and frameworks like Airbyte, reducing the need for custom development. These tools automate pipeline creation and management, extracting data and loading it into data warehouses via user interfaces.
&lt;/li&gt;
&lt;li&gt;Load: Once extracted, data is loaded into a target data platform (data warehouse, data lake). Unlike ETL, ELT loads raw data immediately after extraction, making data available for analysis much faster. This step is efficient as it requires no upfront transformation. This process is often referred to as data ingestion – the movement of data from a source to a destination. Data integration, in contrast, combines data from disparate sources into a new dataset. Ingestion processes can be batch, micro-batch, or real-time.
&lt;/li&gt;
&lt;li&gt;Transform: In the final step, the raw data loaded into the data warehouse is transformed for analytical use. This often involves light transformations like casting data types, standardising time zones, and renaming fields, as well as heavy transformations that incorporate business logic, create materialisations, and join data. Data quality checks (QA) are also crucial during this stage.&lt;/li&gt;
&lt;/ul&gt;
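&lt;p&gt;The three stages above can be sketched end-to-end. The following is a minimal, illustrative Python example — not any specific tool's implementation — that uses an in-memory SQLite database as a stand-in for a cloud warehouse; the source records, table names, and transformation are hypothetical:&lt;/p&gt;

```python
import sqlite3

# Extract: hypothetical raw records pulled from a source system (e.g. an API).
extracted = [
    {"order_id": 1, "customer": "acme", "amount_cents": 1250},
    {"order_id": 2, "customer": "acme", "amount_cents": 800},
    {"order_id": 3, "customer": "globex", "amount_cents": 4300},
]

warehouse = sqlite3.connect(":memory:")  # stand-in for the cloud warehouse

# Load: land the raw data as-is, with no upfront transformation.
warehouse.execute("CREATE TABLE raw_orders (order_id INT, customer TEXT, amount_cents INT)")
warehouse.executemany(
    "INSERT INTO raw_orders VALUES (:order_id, :customer, :amount_cents)", extracted
)

# Transform: apply business logic inside the warehouse itself.
warehouse.execute(
    "CREATE TABLE customer_revenue AS "
    "SELECT customer, SUM(amount_cents) / 100.0 AS revenue "
    "FROM raw_orders GROUP BY customer"
)

for row in warehouse.execute("SELECT customer, revenue FROM customer_revenue ORDER BY customer"):
    print(row)
```

&lt;p&gt;Note how the transformation is expressed as SQL run against the destination, which is the defining trait of ELT: the raw table remains available for re-processing at any time.&lt;/p&gt;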

&lt;h2&gt;
  
  
  Benefits of ELT
&lt;/h2&gt;

&lt;p&gt;ELT offers several advantages, particularly in cloud-based environments where scalability, flexibility, and performance are paramount. Key benefits include:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scalability:&lt;/strong&gt; ELT leverages the vast computational power of modern cloud data warehouses, enabling effortless scaling of data pipelines to handle growing data volumes without performance bottlenecks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Faster data availability:&lt;/strong&gt; By loading raw data immediately, ELT makes data available for analysis much more quickly than ETL, which is crucial for organisations needing near-real-time insights.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost efficiency:&lt;/strong&gt; ELT reduces the need for expensive on-premise ETL tools and infrastructure by offloading processing to the cloud and utilising pay-as-you-go resources.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Flexibility:&lt;/strong&gt; Because raw data is readily available in the warehouse, analysts can apply transformations iteratively and adapt to changing business requirements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Simplified pipelines:&lt;/strong&gt; ELT eliminates the need for upfront data transformation before loading, reducing complexity and improving overall pipeline management.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Adoption of software development best practices:&lt;/strong&gt; Performing transformations last in the pipeline allows for code-based, version-controlled transformations, enabling easy recreation of historical transformations, code-based tests, CI/CD workflows, and documentation of data models like typical software code.&lt;/p&gt;
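&lt;p&gt;As a small illustration of that last point, transformation logic that lives in code can be unit-tested like any other software. The sketch below is purely illustrative — the transformation and field names are invented, not taken from any particular tool:&lt;/p&gt;

```python
def standardise_order(raw):
    """Light transformation: cast types and rename fields (illustrative only)."""
    return {
        "order_id": int(raw["id"]),
        "customer": raw["customer_name"].strip().lower(),
        "revenue": round(float(raw["amount"]), 2),
    }

# A code-based test that can run in CI on every change to the model.
def test_standardise_order():
    raw = {"id": "42", "customer_name": "  Acme Corp ", "amount": "19.999"}
    expected = {"order_id": 42, "customer": "acme corp", "revenue": 20.0}
    assert standardise_order(raw) == expected

test_standardise_order()
print("ok")
```

&lt;p&gt;Because the transformation is plain, version-controlled code, the same test suite guards every historical and future revision of the model.&lt;/p&gt;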

&lt;h2&gt;
  
  
  Top ELT Tools
&lt;/h2&gt;

&lt;p&gt;The modern data stack, which facilitates the ELT workflow, comprises various tools that have become reasonably consistent over time. These can be broadly categorised as:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ingestion:&lt;/strong&gt; Tools like Airbyte, Fivetran and Stitch simplify the extraction and loading of data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Warehousing/Lakehouse Platforms:&lt;/strong&gt; Cloud data warehouses such as BigQuery, Databricks, Redshift and Snowflake serve as the primary storage and transformation environment. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Transformation:&lt;/strong&gt; dbt (data build tool) has emerged as a popular tool specifically for the transformation step within the data warehouse.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BI&lt;/strong&gt;: Tools like Looker, Mode, Periscope, Chartio, Metabase, and Redash are used for data visualisation and analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workflow Orchestration Tools:&lt;/strong&gt; While not strictly ELT tools, orchestrators like Apache Airflow are essential for scheduling and managing the entire ELT pipeline.&lt;/p&gt;

&lt;p&gt;Some tools focus primarily on data integration (the EL part of ELT), while others, like dbt, focus on transformation (the T part). Some tools can also perform both ETL and ELT. Cloud vendors also have proprietary services for storage and databases, often bundled to work well together within their ecosystem. Examples include AWS Glue with Redshift, Databricks Workflows, Microsoft Fabric Data Factory, and BigQuery's Data Transfer Service and integration with Dataform.&lt;/p&gt;

&lt;h2&gt;
  
  
  Airbyte and the ELT Workflow
&lt;/h2&gt;

&lt;p&gt;Airbyte is an open-source data integration platform designed to consolidate data from various sources into data warehouses, lakes, and databases. It plays the “EL” role in the ELT workflow and is available in both self-managed and cloud versions. Airbyte simplifies self-serve data extraction from numerous API (550+), database, and file sources, offering predictable data loading into more than 25 destinations while managing typing and deduplication.&lt;/p&gt;

&lt;p&gt;Airbyte enables users to build connectors using a no-code builder for HTTP APIs or a low-code CDK for REST APIs, significantly reducing development effort. Its unified platform ensures reliability across all data synchronisations, allowing control over schema propagation and flexible sync frequencies. &lt;/p&gt;

&lt;p&gt;Airbyte also provides transformation capabilities as a critical part of the ELT process, allowing users to convert raw data into a more usable format after it has been loaded. This includes basic normalisation to convert JSON blobs into structured tables. Users can also implement custom transformations using SQL or integrate with dbt Cloud for more complex transformations.&lt;/p&gt;
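&lt;p&gt;To make that normalisation step concrete, here is a minimal sketch — not Airbyte's actual implementation — of flattening raw JSON blobs into a structured, typed table. SQLite stands in for the destination, and the record fields are hypothetical:&lt;/p&gt;

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")

# Raw zone: records land as opaque JSON blobs, exactly as extracted.
db.execute("CREATE TABLE _raw_users (data TEXT)")
blobs = [
    json.dumps({"id": 1, "name": "Ada", "email": "ada@example.com"}),
    json.dumps({"id": 2, "name": "Grace", "email": "grace@example.com"}),
]
db.executemany("INSERT INTO _raw_users VALUES (?)", [(b,) for b in blobs])

# Normalisation: parse each blob and write typed, named columns.
db.execute("CREATE TABLE users (id INT, name TEXT, email TEXT)")
rows = db.execute("SELECT data FROM _raw_users").fetchall()
for (blob,) in rows:
    rec = json.loads(blob)
    db.execute("INSERT INTO users VALUES (?, ?, ?)", (rec["id"], rec["name"], rec["email"]))

print(db.execute("SELECT id, name FROM users ORDER BY id").fetchall())
```

&lt;p&gt;Keeping the raw table alongside the normalised one means the structured output can always be rebuilt if the normalisation logic changes.&lt;/p&gt;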

&lt;p&gt;As highlighted above, Airbyte strongly favours ELT over ETL.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Future of Data Engineering and ELT
&lt;/h2&gt;

&lt;p&gt;While nobody can predict the future, we have a good perspective on the past, the present, and current trends. Below are observations of ongoing developments and some speculation about the future.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;As more organisations shift towards cloud-based infrastructure, the modern data stack is, and will continue to be, the default choice of data architecture, and ELT will continue to play a crucial role in data integration processes.
&lt;/li&gt;
&lt;li&gt;ELT tools will continue to mature, extending their coverage to more use cases to become more reliable foundational technologies, sparking the next wave of innovation in the modern data stack.
&lt;/li&gt;
&lt;li&gt;The ELT workflow and the specific tools are changing and evolving rapidly, but the core aim will remain the same: to reduce complexity and increase modularisation. Plug-and-play modular tools with easy-to-understand pricing and implementation are the way of the future.
&lt;/li&gt;
&lt;li&gt;Batch transformations are overwhelmingly popular, but given the growing popularity of stream-processing solutions and the general increase in the amount of streaming data, the popularity of streaming transformations is expected to continue growing, perhaps entirely replacing batch processing in certain domains soon.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this blog post, we've explored the early days of data engineering, when the Extract, Transform, Load (ETL) data processing framework was popular, and the adoption of the ELT framework, driven mainly by cloud technology. We also looked at what the ELT framework consists of in detail, the benefits of ELT, and the top ELT tools. Lastly, we covered how Airbyte fits into the ELT process and future data engineering predictions.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://www.cs.uoi.gr/~pvassil/publications/TALKS/2023_03_dolap_tota/23DOLAP_TestOfTimeAward_CEUR-CR.pdf" rel="noopener noreferrer"&gt;The History, Present, and Future of ETL Technology&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.getdbt.com/blog/future-of-the-modern-data-stack" rel="noopener noreferrer"&gt;The Modern Data Stack: Past, Present, and Future | dbt Labs&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.getdbt.com/blog/extract-load-transform" rel="noopener noreferrer"&gt;Understanding ELT: Extract, Load, Transform | dbt Labs&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; &lt;a href="https://docs.airbyte.com/using-airbyte/core-concepts/" rel="noopener noreferrer"&gt;Core Concepts | Airbyte Docs&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.oreilly.com/library/view/fundamentals-of-data/9781098108298/" rel="noopener noreferrer"&gt;Fundamentals of Data Engineering by Joe Reis &amp;amp; Matt Housley&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.getdbt.com/blog/2025-data-predictions" rel="noopener noreferrer"&gt;A decade of data evolution and 2025 predictions | dbt Labs&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.restack.io/docs/airbyte-knowledge-airbyte-transformations-guide" rel="noopener noreferrer"&gt;Airbyte transformations guide — Restack&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.getdbt.com/blog/how-ai-will-disrupt-data-engineering" rel="noopener noreferrer"&gt;How AI will disrupt data engineering as we know it | dbt Labs&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.google.com/spreadsheets/d/1QKrtBpg6PliPMpcndpmkZpDVIz_o6_Y-LWTTvQ6CfHA/edit?gid=0#gid=0" rel="noopener noreferrer"&gt;ETL/ELT Benchmark_Public&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.researchgate.net/publication/375861478_Evolving_Paradigms_of_Data_Engineering_in_the_Modern_Era_Challenges_Innovations_and_Strategies" rel="noopener noreferrer"&gt;Evolving Paradigms of Data Engineering in the Modern Era: Challenges, Innovations, and Strategies&lt;/a&gt; &lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>dataengineering</category>
      <category>airbyte</category>
      <category>elt</category>
      <category>etl</category>
    </item>
    <item>
      <title>How to access the ERD (Entity-Relationship Diagram) of your database schema in MySQL Workbench</title>
      <dc:creator>Massy</dc:creator>
      <pubDate>Fri, 19 May 2023 04:12:21 +0000</pubDate>
      <link>https://dev.to/alumassy/how-to-access-the-erd-entity-relationship-diagram-of-your-database-schema-in-mysql-workbench-5813</link>
      <guid>https://dev.to/alumassy/how-to-access-the-erd-entity-relationship-diagram-of-your-database-schema-in-mysql-workbench-5813</guid>
      <description>&lt;p&gt;As a data analysis student, you may know what an Entity Relationship Diagram is and how it helps you quickly get familiar with a database schema and its main properties like tables, table relationships as well as table columns (and their respective data types)&lt;/p&gt;

&lt;p&gt;But because you've always been practising and answering the SQL questions using an online interactive SQL instance, you have only been able to view an ERD because it has always been provided to you as an image. This means you've never had to generate a physical entity relationship diagram in a Database Management System like MySQL.&lt;/p&gt;

&lt;p&gt;Yet, in the real workplace setting, or when you start answering SQL questions using a DBMS, you need to know how you can view or access the ERD of the database schema you intend to query from.&lt;/p&gt;

&lt;h2&gt;
  
  
  Steps to view the ERD (Entity-Relationship Diagram) of your database schema in MySQL Workbench
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Open MySQL Workbench and open your database connection.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In the top navigation bar, click on “Database” to expand the list of options.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft8xdsx8biggrmq6g5iyh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft8xdsx8biggrmq6g5iyh.png" alt=" " width="639" height="572"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Select “Reverse Engineer” from the database context menu.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc4h1mqfbqu1ga59lio91.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc4h1mqfbqu1ga59lio91.png" alt=" " width="649" height="606"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In the “Reverse Engineer” dialogue box, click “Next” to set parameters for connecting to the Database Management System(DBMS).&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1i4wfu28g7u21lk2o066.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1i4wfu28g7u21lk2o066.png" alt=" " width="720" height="464"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Enter your password so that MySQL Workbench can connect to your DBMS, then click “Next”.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F49hgnltzhd0frhli6c1q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F49hgnltzhd0frhli6c1q.png" alt=" " width="720" height="471"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Select the database schema for which you want to generate an ERD, then click “Next” again.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5tt9aq8fjrtg8w6o1670.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5tt9aq8fjrtg8w6o1670.png" alt=" " width="720" height="559"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Enter your password again so that MySQL Workbench can retrieve and reverse engineer the schema objects for the schema you chose.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5lg7xbfjwp5q48atzz1s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5lg7xbfjwp5q48atzz1s.png" alt=" " width="720" height="562"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click “Execute” to start the reverse engineering process. This may take a few moments depending on the size of your database.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click “Finish” to see the ERD of your database schema displayed in the main window of MySQL Workbench.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In the ERD tab, your ERD may look like this by default:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft1l792m1vyesvu5byo2z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft1l792m1vyesvu5byo2z.png" alt=" " width="720" height="398"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The default ERD above may require scrolling back and forth to make sense of its contents. But you can avoid this by customising the ERD layout by dragging the tables and resizing them so that you can see all of them at a glance without the need to scroll down.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frkoo9rwqqau6yc1r8xyo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frkoo9rwqqau6yc1r8xyo.png" alt=" " width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And that’s a wrap!&lt;/p&gt;

</description>
      <category>mysql</category>
      <category>database</category>
      <category>dataanalysis</category>
      <category>sql</category>
    </item>
    <item>
      <title>Cohort Retention Analysis from A-Z in Tableau</title>
      <dc:creator>Massy</dc:creator>
      <pubDate>Sat, 13 May 2023 04:07:48 +0000</pubDate>
      <link>https://dev.to/alumassy/cohort-retention-analysis-from-a-z-in-tableau-4l0a</link>
      <guid>https://dev.to/alumassy/cohort-retention-analysis-from-a-z-in-tableau-4l0a</guid>
      <description>&lt;p&gt;I recently learnt how to carry out cohort retention analysis in Tableau.&lt;/p&gt;

&lt;p&gt;But most of the articles I came across used Tableau merely as a visualisation tool for the cohort retention results: they would, for example, calculate in SQL and then visualise in Tableau, or calculate in Excel and then visualise in Tableau.&lt;/p&gt;

&lt;p&gt;I found there’s a more efficient way to do all this: calculating the cohort retention rate and visualising the results all in one place — Tableau. Isn’t that awesome?&lt;/p&gt;

&lt;p&gt;In this article, you’ll learn :&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What cohort retention analysis is&lt;/li&gt;
&lt;li&gt;Why it’s important&lt;/li&gt;
&lt;li&gt;What data is required for cohort retention analysis &amp;amp; if it’s not available, how to calculate it (derived columns/calculated field)&lt;/li&gt;
&lt;li&gt;How to create the cohort retention table&lt;/li&gt;
&lt;li&gt;How to interpret the cohort retention rate&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;You should have a &lt;a href="https://id.tableau.com/register?clientId=wcS7HwY98qdfgBREHT7Xoln7ipc75U0a" rel="noopener noreferrer"&gt;Tableau public account&lt;/a&gt; to be able to create and publicly share your visualisation after you’re done.&lt;/li&gt;
&lt;li&gt;You should know the &lt;a href="https://www.tableau.com/blog/getting-ready-publish-your-first-data-visualization" rel="noopener noreferrer"&gt;basics&lt;/a&gt; of Tableau. In this article, I assume you’re already familiar with the Tableau environment. I’ll focus on teaching you how to use Tableau for cohort analysis specifically.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Understand cohort retention analysis
&lt;/h2&gt;

&lt;p&gt;Cohort analysis is used by businesses to understand the behaviour, patterns and trends of their customers so that they can subsequently tailor their products and services to the identified cohorts.&lt;/p&gt;

&lt;p&gt;You might ask yourself what a cohort is. A cohort is simply a group of people, in this case customers, who share common characteristics such as time or size. Cohort analysis, therefore, is the analysis of several different cohorts to gain a better understanding of their behaviour, patterns, and trends.&lt;/p&gt;

&lt;p&gt;There are different types of cohorts to analyze. They include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Time-based cohorts&lt;/li&gt;
&lt;li&gt;Segment-based cohorts&lt;/li&gt;
&lt;li&gt;Size-based cohort&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this case, you are going to create time-based cohorts. Specifically, the cohort analysis you're going to do is retention-based: you will look at the quarter in which a group of customers made their first purchase and then track the percentage of them that made purchases in subsequent quarters.&lt;/p&gt;
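&lt;p&gt;Before building this in Tableau, it can help to see the underlying logic in plain code. The sketch below is illustrative only, with made-up orders rather than Superstore data: it assigns each customer a cohort from their first purchase quarter and computes what share of one cohort returned in a later quarter:&lt;/p&gt;

```python
from datetime import date

# Hypothetical orders: (customer_id, order_date)
orders = [
    ("c1", date(2014, 1, 15)), ("c2", date(2014, 2, 3)),
    ("c1", date(2014, 4, 20)), ("c3", date(2014, 5, 1)),
    ("c2", date(2014, 7, 9)),
]

def quarter(d):
    # Map a date to its (year, quarter) pair.
    return (d.year, (d.month - 1) // 3 + 1)

# Cohort = quarter of each customer's first (earliest) purchase.
first_purchase = {}
for cust, d in orders:
    first_purchase[cust] = min(first_purchase.get(cust, d), d)
cohort = {cust: quarter(d) for cust, d in first_purchase.items()}

# Retention: share of the 2014-Q1 cohort that purchased again in 2014-Q2.
q1_cohort = [c for c in cohort if cohort[c] == (2014, 1)]
active_q2 = {cust for cust, d in orders if quarter(d) == (2014, 2)}
retained = [c for c in q1_cohort if c in active_q2]
rate = 100 * len(retained) / len(q1_cohort)
print(f"2014-Q1 cohort: {len(q1_cohort)} customers, Q2 retention {rate:.0f}%")
```

&lt;p&gt;Tableau's FIXED level-of-detail expression, which you will meet shortly, plays the same role as the &lt;code&gt;first_purchase&lt;/code&gt; step here: pinning each customer to the date of their earliest order.&lt;/p&gt;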

&lt;h2&gt;
  
  
  Get Data
&lt;/h2&gt;

&lt;p&gt;The dataset that you are going to be using is the famous Superstore dataset. You can access it by downloading it from &lt;a href="https://www.kaggle.com/datasets/vivek468/superstore-dataset-final" rel="noopener noreferrer"&gt;Kaggle&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Open Tableau Public and connect to the Superstore data.&lt;/p&gt;

&lt;p&gt;Go ahead and analyse the rows and columns to know what kind of data about the superstore was documented.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fddhkzvs1ltfhr7yqtrik.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fddhkzvs1ltfhr7yqtrik.png" alt=" " width="720" height="303"&gt;&lt;/a&gt;&lt;br&gt;
Proceed to the Sheet 1 tab. This is going to be your workspace and where you’re going to subsequently create your visualisation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhc7owqgvqdsuwd7hd37u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhc7owqgvqdsuwd7hd37u.png" alt=" " width="720" height="433"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Data points of interest
&lt;/h2&gt;

&lt;p&gt;To carry out cohort analysis, you need the following data points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A unique identifier. In this case, your unique identifier is the Customer ID.&lt;/li&gt;
&lt;li&gt;First purchase date. This refers to the date the customer made their first purchase from the business and became a customer. This initial date is going to come in handy in creating a cohort group.&lt;/li&gt;
&lt;li&gt;Revenue data&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Create calculated fields
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Customers’ First purchase date(quarter)
&lt;/h3&gt;

&lt;p&gt;As mentioned earlier, the first purchase date is used in assigning a cohort to each customer.&lt;/p&gt;

&lt;p&gt;Since the first purchase date field is not readily available in the dataset, you are going to derive it yourself. This is known as a calculated field: a field that derives its values from calculations on other fields that are already present in the dataset.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flyt5y27i6dn5d4qq9qp5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flyt5y27i6dn5d4qq9qp5.png" alt=" " width="406" height="580"&gt;&lt;/a&gt;&lt;br&gt;
Through the ‘first purchase’ calculated field, you’re going to create quarter-and-year cohorts, which will be assigned to customers depending on the date they made their first purchase.&lt;/p&gt;

&lt;p&gt;Earlier in this article, I defined a cohort as a group of people that share common characteristics like time. In this case, a cohort is going to be a group of customers that made their first purchase in the same quarter and same year.&lt;/p&gt;

&lt;p&gt;If you explored the Superstore dataset at the start, you might have realised that it spans four years (2014–2017). If you created month cohorts, the cohort table would become very big and hard to analyse at a glance, so I deemed it more practical to form cohorts based on the quarter and year in which a customer made their first purchase. That said, you can create time-based cohorts on other time parameters like day, week or month, and the method would work just the same.&lt;/p&gt;

&lt;p&gt;The calculation to establish the quarter in which a customer made their first purchase is as below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DATE({ FIXED [Customer ID] : MIN(DATETRUNC('quarter', [Order Date])) })
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4wa2sv44t4mncpfg2r0a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4wa2sv44t4mncpfg2r0a.png" alt=" " width="720" height="369"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Customers per first quarter
&lt;/h3&gt;

&lt;p&gt;This calculated field establishes the number of unique customers that made their first purchase in each quarter.&lt;/p&gt;

&lt;p&gt;This calculation builds on the first calculated field that you've just completed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{ FIXED [Customers First Purchase Quarter] : COUNTD([Customer ID]) }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Retention Rate
&lt;/h3&gt;

&lt;p&gt;The retention rate is the share of a cohort that is still purchasing in a given quarter: the count of distinct customers active in that quarter divided by the total number of customers in the cohort.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;COUNTD([Customer ID])/SUM([Customers per First Quarter])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Assemble the cohort table
&lt;/h2&gt;

&lt;p&gt;This is about assembling the calculated fields you’ve created to form a cohort retention table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Click on the &lt;code&gt;Customers First Purchase&lt;/code&gt; field and drag it to the rows.&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/KeX1TrLM5tw"&gt;
&lt;/iframe&gt;
&lt;br&gt;
&lt;strong&gt;Step 2:&lt;/strong&gt; On the rows, click on the drop-down, then switch the specifications from year to quarter and from continuous to discrete.&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/sE7Mrv7Pc3c"&gt;
&lt;/iframe&gt;
&lt;br&gt;
&lt;strong&gt;Step 3:&lt;/strong&gt; Click on the &lt;code&gt;Order Date&lt;/code&gt; field and drag it to the columns. Click the drop-down in the columns to switch the specifications from year to quarter and from continuous to discrete.&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/xhqHdHADhx4"&gt;
&lt;/iframe&gt;
&lt;br&gt;
&lt;strong&gt;Step 4:&lt;/strong&gt; Drag the &lt;code&gt;Customers per first quarter&lt;/code&gt; field you created from the measures to the dimensions, and then drag it to the rows.&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/Mjx5RP4oXeQ"&gt;
&lt;/iframe&gt;
&lt;br&gt;
&lt;strong&gt;Step 5:&lt;/strong&gt; Click the &lt;code&gt;Retention Rate&lt;/code&gt; field you previously created and drag it to the text tile. Click the drop-down to format the number as a percentage with one decimal place.&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/ba6Orsc5qMY"&gt;
&lt;/iframe&gt;
&lt;br&gt;
&lt;strong&gt;Step 6:&lt;/strong&gt; Drag the &lt;code&gt;Retention Rate&lt;/code&gt; field from the text tile to the colour tile. Then drag the &lt;code&gt;Customer ID&lt;/code&gt; field to the tooltip tile. On the &lt;code&gt;Customer ID&lt;/code&gt; tooltip, click the drop-down to change from attribute to a measure of count distinct. Finally, make the values visible by clicking T on the toolbar.&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/sXfLa8paAwU"&gt;
&lt;/iframe&gt;
&lt;br&gt;
&lt;strong&gt;&lt;em&gt;Bonus tip:&lt;/em&gt;&lt;/strong&gt; You can further customize the look of the table by going over to the &lt;a href="https://coolors.co/" rel="noopener noreferrer"&gt;coolors&lt;/a&gt; website to pick out some unique colours that appeal to you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Interpret the retention rate
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Going down the view
&lt;/h3&gt;

&lt;p&gt;Going down the view of the cohort table, the first and second columns show&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;different year and quarter groups (cohorts)&lt;/li&gt;
&lt;li&gt;the number of customers that made their first purchase in each of those periods.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Going across the view
&lt;/h3&gt;

&lt;p&gt;Going across the view (from the third column onwards), you see the percentage of customers that continued to make purchases at the superstore 0 through 15 quarters after making their first purchase.&lt;/p&gt;

&lt;p&gt;For example, 160 customers made their first purchase in 2014 Q2. Of those 160 customers, 24.4% and 36.3% of them came back to make purchases in 2014 Q3 and 2014 Q4 respectively. And so forth…&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs9h9ix5ydk3qs29es8cu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs9h9ix5ydk3qs29es8cu.png" alt=" " width="720" height="262"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Creating calculated fields is unavoidable when carrying out cohort analysis in Tableau, and with calculated fields comes the need to use functions. If you are fairly new to Tableau functions, there may be some knowledge gaps for you to fill. I recommend you check out this Tableau article on &lt;a href="https://help.tableau.com/current/pro/desktop/en-us/functions.htm" rel="noopener noreferrer"&gt;functions&lt;/a&gt; to gain a deeper understanding of the subject.&lt;/p&gt;

</description>
      <category>dataanalysis</category>
      <category>tableau</category>
      <category>businessintelligence</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Problem solving with SQL: Case Study #1 — Danny’s Diner</title>
      <dc:creator>Massy</dc:creator>
      <pubDate>Thu, 30 Mar 2023 00:16:01 +0000</pubDate>
      <link>https://dev.to/alumassy/problem-solving-with-sql-case-study-1-dannys-diner-3oob</link>
      <guid>https://dev.to/alumassy/problem-solving-with-sql-case-study-1-dannys-diner-3oob</guid>
      <description>&lt;p&gt;Thank you Danny Ma for the excellent case study! You can find it &lt;a href="https://8weeksqlchallenge.com/case-study-1/" rel="noopener noreferrer"&gt;here&lt;/a&gt; and try it yourself. While at it, you should give Danny Ma a follow on &lt;a href="https://www.linkedin.com/in/datawithdanny/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; and support his posts if you aren’t already doing so!&lt;/p&gt;

&lt;p&gt;I’ve posted the solution to this case study as a raw SQL script file on &lt;a href="https://github.com/alumassy/Data-Analysis-Projects/blob/main/Case%20Study%20%231%20-%20Danny's%20Diner.sql" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; too.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Danny seriously loves Japanese food so at the beginning of 2021, he decides to embark upon a risky venture and opens up a cute little restaurant that sells his 3 favourite foods: sushi, curry and ramen.&lt;/p&gt;

&lt;p&gt;Danny’s Diner needs your assistance to help the restaurant stay afloat — the restaurant has captured some very basic data from its few months of operation but has no idea how to use its data to help them run the business.&lt;/p&gt;

&lt;h2&gt;
  
  
  Problem Statement
&lt;/h2&gt;

&lt;p&gt;Danny wants to use the data to answer a few simple questions about his customers, especially about their&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;visiting patterns,&lt;/li&gt;
&lt;li&gt;how much money they’ve spent, and&lt;/li&gt;
&lt;li&gt;which menu items are their favourite.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Having this deeper connection with his customers will help him deliver a better and more personalised experience for his loyal customers.&lt;/p&gt;

&lt;p&gt;He plans on using these insights to help him decide whether he should expand the existing customer loyalty program — additionally, he needs help to generate some basic datasets so his team can easily inspect the data without needing to use SQL.&lt;/p&gt;

&lt;p&gt;The data set contains the following 3 tables; you may refer to the relationship diagram below to understand how they are connected.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sales&lt;/code&gt;&lt;br&gt;
&lt;code&gt;members&lt;/code&gt;&lt;br&gt;
&lt;code&gt;menu&lt;/code&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Relational model
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpd0lhsz473taf5xwls5p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpd0lhsz473taf5xwls5p.png" alt=" " width="595" height="343"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Case Study Questions
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;What is the total amount each customer spent at the restaurant?&lt;/li&gt;
&lt;li&gt;How many days has each customer visited the restaurant?&lt;/li&gt;
&lt;li&gt;What was the first item from the menu purchased by each customer?&lt;/li&gt;
&lt;li&gt;What is the most purchased item on the menu and how many times was it purchased by all customers?&lt;/li&gt;
&lt;li&gt;Which item was the most popular for each customer?&lt;/li&gt;
&lt;li&gt;Which item was purchased first by the customer after they became a member?&lt;/li&gt;
&lt;li&gt;Which item was purchased just before the customer became a member?&lt;/li&gt;
&lt;li&gt;What are the total items and amount spent for each member before they became a member?&lt;/li&gt;
&lt;li&gt;If each $1 spent equates to 10 points and sushi has a 2x points multiplier — how many points would each customer have?&lt;/li&gt;
&lt;li&gt;In the first week after a customer joins the program (including their join date) they earn 2x points on all items, not just sushi — how many points do customers A and B have at the end of January?&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;
  
  
  Solution
&lt;/h2&gt;

&lt;p&gt;I used &lt;strong&gt;MySQL Workbench&lt;/strong&gt; and these are the particular functions I employed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Aggregate functions — SUM, COUNT&lt;/li&gt;
&lt;li&gt;Joins — Inner join, left join&lt;/li&gt;
&lt;li&gt;Common table expressions (CTEs)&lt;/li&gt;
&lt;li&gt;Window function&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Before attempting the questions, I used the Entity Relationship Diagram as a guide to determine the logical structure of the database. I then created a schema and tables, and inserted the respective values into those tables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE SCHEMA dannys_diner;
USE dannys_diner;

CREATE TABLE menu (
  product_id INT NOT NULL,
  product_name VARCHAR(5),
  price INT,
  PRIMARY KEY (product_id)
);

INSERT INTO menu
  (product_id, product_name, price)
VALUES
  ('1', 'sushi', '10'),
  ('2', 'curry', '15'),
  ('3', 'ramen', '12');

CREATE TABLE members (
  customer_id VARCHAR(1) NOT NULL,
  join_date DATE,
  PRIMARY KEY (customer_id)
);

INSERT INTO members
  (customer_id, join_date)
VALUES
  ('A', '2021-01-07'),
  ('B', '2021-01-09');

CREATE TABLE sales (
  customer_id VARCHAR(1) NOT NULL,
  order_date DATE,
  product_id INTEGER NOT NULL
);

INSERT INTO sales
  (customer_id, order_date, product_id)
VALUES
  ('A', '2021-01-01', '1'),
  ('A', '2021-01-01', '2'),
  ('A', '2021-01-07', '2'),
  ('A', '2021-01-10', '3'),
  ('A', '2021-01-11', '3'),
  ('A', '2021-01-11', '3'),
  ('B', '2021-01-01', '2'),
  ('B', '2021-01-02', '2'),
  ('B', '2021-01-04', '1'),
  ('B', '2021-01-11', '1'),
  ('B', '2021-01-16', '3'),
  ('B', '2021-02-01', '3'),
  ('C', '2021-01-01', '3'),
  ('C', '2021-01-01', '3'),
  ('C', '2021-01-07', '3');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhwag3g9xb1zhq7vpsxpm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhwag3g9xb1zhq7vpsxpm.png" alt=" " width="197" height="149"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Questions deep dive
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Q1. What is the total amount each customer spent at the restaurant?&lt;/strong&gt;&lt;br&gt;
I used the SUM aggregate function with a GROUP BY clause to find the total amount spent by each customer. I added a JOIN because customer_id is from the sales table and price is from the menu table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT customer_id , SUM(price) amount_spent
FROM sales
LEFT JOIN menu 
  ON sales.product_id = menu.product_id
GROUP BY customer_id;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd0zvayywk1akhrph5wud.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd0zvayywk1akhrph5wud.png" alt=" " width="195" height="101"&gt;&lt;/a&gt;&lt;br&gt;
Answer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Customer A spent $76.&lt;/li&gt;
&lt;li&gt;Customer B spent $74.&lt;/li&gt;
&lt;li&gt;Customer C spent $36.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Q2. How many days has each customer visited the restaurant?&lt;/strong&gt;&lt;br&gt;
I wrapped the &lt;code&gt;COUNT&lt;/code&gt; function around the &lt;code&gt;DISTINCT&lt;/code&gt; keyword to find the number of distinct days on which each customer visited the restaurant.&lt;/p&gt;

&lt;p&gt;If I did not use &lt;code&gt;DISTINCT&lt;/code&gt; on &lt;code&gt;order_date&lt;/code&gt;, days could be double-counted. For example, if customer A visited the restaurant twice on '2021-01-07', that would have counted as 2 days instead of 1.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT customer_id, COUNT(DISTINCT(order_date)) no_of_visits
FROM sales
GROUP BY customer_id;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F55e241igfiynvb4kr117.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F55e241igfiynvb4kr117.png" alt=" " width="192" height="105"&gt;&lt;/a&gt;&lt;br&gt;
Answer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Customer A visited 4 times.&lt;/li&gt;
&lt;li&gt;Customer B visited 6 times.&lt;/li&gt;
&lt;li&gt;Customer C visited 2 times.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Q3. What was the first item from the menu purchased by each customer?&lt;/strong&gt;&lt;br&gt;
I first ran a query to find the earliest order_date, then used that date ('2021-01-01') to filter for only the purchases made on it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT customer_id , product_name, order_date
FROM sales
LEFT JOIN menu 
  ON sales.product_id = menu.product_id
WHERE order_date = '2021-01-01' 
GROUP BY customer_id;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnv1dx2gugaiiz0uhj4vn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnv1dx2gugaiiz0uhj4vn.png" alt=" " width="299" height="113"&gt;&lt;/a&gt;&lt;br&gt;
Answer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Customer A’s first order was sushi.&lt;/li&gt;
&lt;li&gt;Customer B’s first order was curry.&lt;/li&gt;
&lt;li&gt;Customer C’s first order was ramen.&lt;/li&gt;
&lt;/ul&gt;
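&lt;p&gt;Hard-coding '2021-01-01' works for this data set, but a more general sketch ranks each customer’s orders by date with a window function (assuming MySQL 8.0+):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Rank each customer's orders by date, then keep the earliest
WITH ranked AS (
  SELECT s.customer_id, menu.product_name, s.order_date,
    DENSE_RANK() OVER (PARTITION BY s.customer_id ORDER BY s.order_date) rnk
  FROM sales s
  LEFT JOIN menu
    ON s.product_id = menu.product_id
)
SELECT DISTINCT customer_id, product_name, order_date
FROM ranked
WHERE rnk = 1;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;DENSE_RANK keeps every item bought on a customer’s earliest date (for example, Customer A’s sushi and curry), and DISTINCT collapses repeat purchases of the same item on that date.&lt;/p&gt;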

&lt;p&gt;&lt;strong&gt;Q4. What is the most purchased item on the menu and how many times was it purchased by all customers?&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT product_name, COUNT(product_name) times_purchased
FROM sales
LEFT JOIN menu 
  ON sales.product_id = menu.product_id
GROUP BY product_name
ORDER BY times_purchased DESC
LIMIT 1;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqoswx28rtnsizwx42gtj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqoswx28rtnsizwx42gtj.png" alt=" " width="262" height="77"&gt;&lt;/a&gt;&lt;br&gt;
Answer:&lt;br&gt;
The most purchased item on the menu is ramen, which was purchased 8 times by all customers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q5. Which item was the most popular for each customer?&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT customer_id, product_name, COUNT(product_name) times_purchased
FROM sales
LEFT JOIN menu 
  ON sales.product_id = menu.product_id
GROUP BY customer_id, product_name
ORDER BY times_purchased DESC;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8plkfg9ghmarips6amhj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8plkfg9ghmarips6amhj.png" alt=" " width="300" height="169"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Customers A and C’s favourite item was ramen.&lt;/li&gt;
&lt;li&gt;Customer B equally enjoyed all items on the menu.&lt;/li&gt;
&lt;/ul&gt;
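&lt;p&gt;The query above returns every customer/item count and leaves you to read off the favourites. A sketch of a variant that isolates only the top item(s) per customer with a window function (assuming MySQL 8.0+):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Rank each customer's items by purchase count, then keep rank 1
WITH purchase_counts AS (
  SELECT s.customer_id, menu.product_name, COUNT(*) times_purchased
  FROM sales s
  LEFT JOIN menu
    ON s.product_id = menu.product_id
  GROUP BY s.customer_id, menu.product_name
)
SELECT customer_id, product_name, times_purchased
FROM (
  SELECT pc.*,
    RANK() OVER (PARTITION BY customer_id ORDER BY times_purchased DESC) rnk
  FROM purchase_counts pc
) ranked
WHERE rnk = 1;
-- ties, like Customer B's, all receive rank 1, so each favourite is kept
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;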

&lt;p&gt;&lt;strong&gt;Q6. Which item was purchased first by the customer after they became a member?&lt;/strong&gt;&lt;br&gt;
Only two customers were members. I ran independent queries to find out the first item they purchased.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Customer A
SELECT customer_id, order_date, product_name 
FROM sales
LEFT JOIN menu 
  ON sales.product_id = menu.product_id
WHERE customer_id = 'A' AND order_date &amp;gt; '2021-01-07' -- date after membership
ORDER BY order_date
LIMIT 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyryq3hgp4bhjj9udv0el.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyryq3hgp4bhjj9udv0el.png" alt=" " width="283" height="68"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Customer B
SELECT customer_id, order_date, product_name 
FROM sales
LEFT JOIN menu 
  ON sales.product_id = menu.product_id
WHERE customer_id = 'B' AND order_date &amp;gt; '2021-01-09' -- date after membership
ORDER BY order_date
LIMIT 1;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft58az4ozj4rwxaqr5qhm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft58az4ozj4rwxaqr5qhm.png" alt=" " width="274" height="73"&gt;&lt;/a&gt;&lt;br&gt;
Answer:&lt;br&gt;
After Customer A became a member, his/her first order was ramen, whereas it was sushi for Customer B.&lt;/p&gt;
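&lt;p&gt;The two member queries can also be combined into one by joining the members table and ranking each member’s orders after their own join date, a sketch assuming MySQL 8.0+:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- First order per member strictly after their join date
WITH post_join AS (
  SELECT s.customer_id, s.order_date, menu.product_name,
    ROW_NUMBER() OVER (PARTITION BY s.customer_id ORDER BY s.order_date) rn
  FROM sales s
  INNER JOIN members m
    ON s.customer_id = m.customer_id
  LEFT JOIN menu
    ON s.product_id = menu.product_id
  WHERE s.order_date &amp;gt; m.join_date
)
SELECT customer_id, order_date, product_name
FROM post_join
WHERE rn = 1;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;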

&lt;p&gt;&lt;strong&gt;Q7. Which item was purchased just before the customer became a member?&lt;/strong&gt;&lt;br&gt;
I took the same approach here: because only two customers were members, I ran independent queries to find the item each purchased just before becoming a member.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Customer A
SELECT customer_id, order_date, product_name 
FROM sales
LEFT JOIN menu 
  ON sales.product_id = menu.product_id
WHERE customer_id = 'A' AND order_date &amp;lt; '2021-01-07' -- dates before membership
ORDER BY order_date DESC
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdk8od5zree3av4kbs8wr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdk8od5zree3av4kbs8wr.png" alt=" " width="292" height="90"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Customer B?
SELECT customer_id, order_date, product_name 
FROM sales
LEFT JOIN menu 
  ON sales.product_id = menu.product_id
WHERE customer_id = 'B' AND order_date &amp;lt; '2021-01-09' -- get dates before membership
ORDER BY order_date DESC -- to capture closest date before membership
LIMIT 1;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffajmurs4euhy4manxm9t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffajmurs4euhy4manxm9t.png" alt=" " width="291" height="78"&gt;&lt;/a&gt;&lt;br&gt;
Answer:&lt;br&gt;
Customer A’s order before he/she became a member was sushi and curry whereas Customer B’s order was sushi.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q8. What are the total items and amount spent for each member before they became a member?&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Customer A
SELECT customer_id, order_date, COUNT(product_name) total_items, SUM(price) amount_spent
FROM sales
LEFT JOIN menu 
  ON sales.product_id = menu.product_id
WHERE customer_id = 'A' AND order_date &amp;lt; '2021-01-07' -- get dates before membership
GROUP BY customer_id
ORDER BY order_date;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyq5lvrnao5njvvsjjrsr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyq5lvrnao5njvvsjjrsr.png" alt=" " width="345" height="67"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Customer B
SELECT customer_id, order_date, COUNT(product_name) total_items, SUM(price) amount_spent
FROM sales
LEFT JOIN menu 
  ON sales.product_id = menu.product_id
WHERE customer_id = 'B' AND order_date &amp;lt; '2021-01-09' -- dates before membership
GROUP BY customer_id
ORDER BY order_date;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7vbqv7w58s8vuk855lwm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7vbqv7w58s8vuk855lwm.png" alt=" " width="353" height="48"&gt;&lt;/a&gt;&lt;br&gt;
Answer: Before becoming members,&lt;br&gt;
Customer A spent $25 on 2 items.&lt;br&gt;
Customer B spent $40 on 3 items.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q9. If each $1 spent equates to 10 points and sushi has a 2x points multiplier — how many points would each customer have?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let’s break down the question.&lt;/p&gt;

&lt;p&gt;Each $1 spent = 10 points.&lt;br&gt;
But sushi (product_id 1) gets 2x points, meaning each $1 spent on sushi = 20 points.&lt;br&gt;
So, we use CASE WHEN to create the conditional logic:&lt;/p&gt;

&lt;p&gt;If the product is sushi, multiply every $1 of its price by 20 points.&lt;br&gt;
For all other products, multiply every $1 of its price by 10 points.&lt;br&gt;
The result below shows the new column, total_points.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT customer_id,
SUM(CASE
    WHEN product_name = 'sushi' THEN 20 * price
    ELSE 10 * price
END) total_points
FROM sales
LEFT JOIN menu 
  ON sales.product_id = menu.product_id
GROUP BY customer_id;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi3fe2t5jd9ix1yen17vp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi3fe2t5jd9ix1yen17vp.png" alt=" " width="192" height="107"&gt;&lt;/a&gt;&lt;br&gt;
Answer:&lt;br&gt;
Total points for Customer A, B and C are 860, 940 and 360 respectively.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q10. If the first week after a customer joins the program (including their &lt;code&gt;join date&lt;/code&gt;) they earn 2x points on all items, not just sushi — how many points do customers A and B have at the end of January?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The build up to my final query&lt;/em&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Found out each customer’s offer validity window (which ends 6 days after &lt;code&gt;join_date&lt;/code&gt;, inclusive of &lt;code&gt;join_date&lt;/code&gt;) and noted the last day of January 2021 ('2021-01-31'). I made the result of this query a CTE because I was going to query further from it in the following CASE WHEN statement.&lt;/li&gt;
&lt;li&gt;Used CASE WHEN to allocate points by dates and &lt;code&gt;product_name&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Filtered by the first day of February to get only &lt;code&gt;points&lt;/code&gt; that apply to January.&lt;/li&gt;
&lt;li&gt;Wrapped the CASE WHEN statement in the SUM function to add up the points for each customer. At this point I dropped all the columns originally present in my CTE except &lt;code&gt;customer_id&lt;/code&gt;, because retaining them would not display meaningful group values while grouping by &lt;code&gt;customer_id&lt;/code&gt; alone; I had only kept them earlier to check that my query results were right.
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WITH cte_OfferValidity AS 
    (SELECT s.customer_id, m.join_date, s.order_date,
        DATE_ADD(m.join_date, INTERVAL 6 DAY) firstweek_ends, menu.product_name, menu.price
    FROM sales s
    LEFT JOIN members m
      ON s.customer_id = m.customer_id
    LEFT JOIN menu
        ON s.product_id = menu.product_id)
SELECT customer_id,
    SUM(CASE
            WHEN order_date BETWEEN join_date AND firstweek_ends THEN 20 * price 
            WHEN (order_date NOT BETWEEN join_date AND firstweek_ends) AND product_name = 'sushi' THEN 20 * price
            ELSE 10 * price
        END) points
FROM cte_OfferValidity
WHERE order_date &amp;lt; '2021-02-01' -- filter jan points only
GROUP BY customer_id;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp1g43aihwe3ir6gphxfo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp1g43aihwe3ir6gphxfo.png" alt=" " width="164" height="104"&gt;&lt;/a&gt;&lt;br&gt;
Answer:&lt;br&gt;
Customer A and Customer B have 1370 points and 820 points respectively by the end of January 2021.&lt;/p&gt;
&lt;h2&gt;
  
  
  Bonus Questions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Join All The Things&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Recreate the table with&lt;/strong&gt;: &lt;code&gt;customer_id&lt;/code&gt;, &lt;code&gt;order_date&lt;/code&gt;, &lt;code&gt;product_name&lt;/code&gt;, &lt;code&gt;price&lt;/code&gt;, &lt;code&gt;member&lt;/code&gt; (Y/N)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT s.customer_id, order_date, menu.product_name, menu.price, 
CASE
  WHEN s.order_date &amp;gt;= m.join_date THEN 'Y'
  ELSE 'N'
END AS member
FROM sales s
LEFT JOIN menu 
  ON s.product_id = menu.product_id
LEFT JOIN members m
  ON s.customer_id = m.customer_id;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyxqoiysihbvw53o89aor.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyxqoiysihbvw53o89aor.png" alt=" " width="371" height="314"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rank All The Things&lt;/strong&gt;&lt;br&gt;
Danny also requires further information about the ranking of customer products, but he purposely does not need the ranking for non-member purchases so he expects null ranking values for the records when customers are not yet part of the loyalty program.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WITH cte AS
  (SELECT s.customer_id, order_date, menu.product_name, menu.price, 
    CASE
      WHEN s.order_date &amp;gt;= m.join_date THEN 'Y'
      ELSE 'N'
    END AS member
  FROM sales s
  LEFT JOIN menu 
    ON s.product_id = menu.product_id
  LEFT JOIN members m
    ON s.customer_id = m.customer_id)
SELECT *, 
  CASE
    WHEN member = 'N' THEN NULL 
    ELSE RANK() OVER w
  END AS ranking
FROM cte
WINDOW w AS (PARTITION BY customer_id, member ORDER BY order_date);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
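&lt;p&gt;As a side note, window functions and the &lt;code&gt;WINDOW&lt;/code&gt; clause are only available from MySQL 8.0 onward. The named window can equally be written inline, which keeps the definition next to its single use; this is just a sketch assuming the same &lt;code&gt;cte&lt;/code&gt; as above:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT *,
  CASE
    WHEN member = 'N' THEN NULL
    ELSE RANK() OVER (PARTITION BY customer_id, member ORDER BY order_date)
  END AS ranking
FROM cte;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;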



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7yf2ufoomlqf7pjhjw1b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7yf2ufoomlqf7pjhjw1b.png" alt=" " width="461" height="319"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary of insights
&lt;/h2&gt;

&lt;p&gt;From the analysis, I discovered a few interesting insights that would certainly be useful for Danny.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Customer B is the most frequent visitor with 6 visits in Jan 2021.&lt;/li&gt;
&lt;li&gt;Danny’s Diner’s most popular item is ramen.&lt;/li&gt;
&lt;li&gt;Customers A and C love ramen, whereas Customer B seems to enjoy sushi, curry and ramen equally.&lt;/li&gt;
&lt;li&gt;Customer A was the first member of Danny’s Diner, and their first order after joining was curry.&lt;/li&gt;
&lt;li&gt;Before they became members, Customer A and Customer B spent $25 and $40 respectively.&lt;/li&gt;
&lt;li&gt;Throughout Jan 2021, Customer A, Customer B and Customer C earned 860, 940 and 360 points respectively.&lt;/li&gt;
&lt;li&gt;Assuming that members earn 2x points on all items (not just sushi) during their first week of membership, Customer A has 1370 points and Customer B has 820 points by the end of Jan 2021.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s a wrap!&lt;/p&gt;




&lt;p&gt;Feel free to share your opinion about my analysis in the comments. Suggestions on how to optimize my SQL code for performance are also welcome.&lt;/p&gt;

&lt;p&gt;Happy querying folks 👋&lt;/p&gt;

</description>
      <category>sql</category>
      <category>mysql</category>
      <category>database</category>
      <category>8weeksqlchallenge</category>
    </item>
  </channel>
</rss>
