<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Yaw Opoku Mensah Baffoe</title>
    <description>The latest articles on DEV Community by Yaw Opoku Mensah Baffoe (@yaw_baffoe).</description>
    <link>https://dev.to/yaw_baffoe</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3888866%2F53329c44-b4f6-487f-b4cd-1e9dd9746d25.png</url>
      <title>DEV Community: Yaw Opoku Mensah Baffoe</title>
      <link>https://dev.to/yaw_baffoe</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/yaw_baffoe"/>
    <language>en</language>
    <item>
      <title>Building Dashboard: A Beginner's Case Study with E‑commerce Data</title>
      <dc:creator>Yaw Opoku Mensah Baffoe</dc:creator>
      <pubDate>Mon, 20 Apr 2026 14:35:30 +0000</pubDate>
      <link>https://dev.to/yaw_baffoe/building-dashboard-a-beginners-case-study-with-e-commerce-data-dg8</link>
      <guid>https://dev.to/yaw_baffoe/building-dashboard-a-beginners-case-study-with-e-commerce-data-dg8</guid>
      <description>&lt;p&gt;I wanted to learn and grow as a data analyst by analysing a well known multi tabled dataset, to grow my portfolio and prove to myself and others my skills as a data analyst. I learned a lot along the way, here’s the step by step process I used, the tools involved, and the surprising insight I found.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting the data and defining the goal
&lt;/h2&gt;

&lt;p&gt;The O-list dataset from Kaggle is one of the first big datasets I've ever seen or used, but I never went in depth like I would for this. I decided to use it because it had lots of data with multiple tables, it was far beyond the scope of skills that I thought clients would expect from me. I picked overarching questions that I thought would push me to develop as a data analyst.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which product categories should Olist prioritize? &lt;/li&gt;
&lt;li&gt;When should marketing be scheduled?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Cleaning and Organizing the Data with SQL
&lt;/h2&gt;

&lt;p&gt;I used PostgreSQL to clean and organise the dataset. I made tables and uploaded the correct schema into the server and I used pgAdmin to query my database. Throughout the project I used smaller scale queries and excel to validate my SQL logic. &lt;/p&gt;

&lt;h2&gt;
  
  
  Calculating key metrics
&lt;/h2&gt;

&lt;p&gt;I joined various tables depending on what question I was looking for. For example, when I wanted to know the top performing products, I joined the order_items, orders and products tables, and used aggregate functions when needed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;oi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_revenue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_trns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_category_name_english&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;oi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;freight_value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;order_items&lt;/span&gt; &lt;span class="n"&gt;oi&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;oi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;category_name_translation&lt;/span&gt; &lt;span class="n"&gt;n_trns&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_category_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;n_trns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_category_name&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;n_trns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_category_name_english&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;total_revenue&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;I calculated other key statistics such as the most popular hours of business, and the order funnel, showing where the business would lose customers in the ordering process.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="s1"&gt;'1. Placed'&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;stage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_purchase_timestamp&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="k"&gt;UNION&lt;/span&gt; &lt;span class="k"&gt;ALL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="s1"&gt;'2. Approved'&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;stage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_approved_at&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="k"&gt;UNION&lt;/span&gt; &lt;span class="k"&gt;ALL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="s1"&gt;'3. Shipped'&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;stage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_delivered_carrier_date&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="k"&gt;UNION&lt;/span&gt; &lt;span class="k"&gt;ALL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="s1"&gt;'4. Delivered'&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;stage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'delivered'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;
  
  
  Building the dashboard
&lt;/h2&gt;

&lt;p&gt;The dashboard is organized into two pages: a high-level overview, and a deep dive into product performance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fldft398xhwncv1tzc1p2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fldft398xhwncv1tzc1p2.png" alt=" " width="800" height="486"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj4um1zwxh8wegjtiy7dz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj4um1zwxh8wegjtiy7dz.png" alt=" " width="800" height="535"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the top KPI cards show key metrics, like total revenue, and the week in which the most orders were made. The page also contains key graphs like the order funnel and the hour order count statistic.&lt;/p&gt;

&lt;p&gt;The next page showcases the statistics the items are generating for the business. I researched and made an equation for theoretical profit based on the profit that businesses in the same sector are likely to make, making it 25% of the revenue. I also showcase the top performing items and the percentage of revenue each item brings in for the business. I also showcase the overall growth of the business overtime.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Key Insight &amp;amp; Why It Matters
&lt;/h2&gt;

&lt;p&gt;The first page shows that only roughly 6% of customers return for repeat purchases, this was easily my most surprising find. Why? Because a healthy business will retain roughly 30% of customers, higher end-businesses will want to retain closer to 40%, but despite the massive growth and revenue, the business is failing to bring customers back onto their website or into their shops. This would showcase a massive retention opportunity for the business.&lt;/p&gt;
&lt;h2&gt;
  
  
  Conclusion and takeaways
&lt;/h2&gt;

&lt;p&gt;This project taught me that data analysis isn't just about writing SQL queries or building charts. It's about asking the right questions and translating numbers into business decisions.&lt;/p&gt;

&lt;p&gt;The O-list dataset gave me a sandbox to practice every step of the analyst workflow. Cleaning messy data, joining multiple tables, validating results, designing a dashboard, and most importantly, finding an insight that actually matters.&lt;/p&gt;

&lt;p&gt;That insight only 6% of customers return for a second purchase is not just a number. It's a signal. It tells the business that their marketing budget might be focused on acquiring new customers when the real opportunity is keeping the ones they already have.&lt;/p&gt;

&lt;p&gt;For me, this project was proof. Proof that I can take a raw dataset, turn it into something useful, and deliver a clear recommendation. That's what I want to do for e‑commerce brands: help them see what's hidden in their data so they can grow smarter.&lt;/p&gt;

&lt;p&gt;If you're an e‑commerce owner struggling to understand why customers don't come back, or a fellow data analyst just starting out, I'd love to connect. You can find my dashboard here and my portfolio on GitHub.&lt;/p&gt;

&lt;p&gt;Thanks for reading. Let me know what you think.&lt;/p&gt;


&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
    &lt;div class="c-embed__content"&gt;
        &lt;div class="c-embed__cover"&gt;
          &lt;a href="https://datastudio.google.com/reporting/62dc59b6-4dfb-4a01-848e-e4ab55bcc328" class="c-link align-middle" rel="noopener noreferrer"&gt;
            &lt;img alt="" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/http%3A%2F%2Fdatastudio.google.com%2Freporting%2F62dc59b6-4dfb-4a01-848e-e4ab55bcc328%2Fthumbnail%3Fsz%3Dfull%26feature%3Dopengraph" height="240" class="m-0" width="320"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="c-embed__body"&gt;
        &lt;h2 class="fs-xl lh-tight"&gt;
          &lt;a href="https://datastudio.google.com/reporting/62dc59b6-4dfb-4a01-848e-e4ab55bcc328" rel="noopener noreferrer" class="c-link"&gt;
            Olist Sales Analysis
          &lt;/a&gt;
        &lt;/h2&gt;
        &lt;div class="color-secondary fs-s flex items-center"&gt;
          datastudio.google.com
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;



</description>
      <category>analytics</category>
      <category>beginners</category>
      <category>data</category>
      <category>learning</category>
    </item>
  </channel>
</rss>
