DEV Community

Cover image for Cowrywise SQL Assessment
Abdulqudus Oladega
Abdulqudus Oladega

Posted on • Edited on

Cowrywise SQL Assessment

INTRODUCTION

Cowrywise is a fintech company in Nigeria. It is set up to make saving, investing, and growing wealth attractive to everyone at little to no cost. It seeks to retain active customers and to develop a reactivation strategy for inactive and dormant customers.

Business Problem Statement

  1. The business wants to identify customers who have both a savings and an investment plan (cross-selling opportunity).
  2. The finance team wants to analyze how often customers transact to segment them (e.g., frequent vs. occasional users).
  3. The ops team wants to flag accounts with no inflow transactions for over one year.
  4. Marketing wants to estimate CLV based on account tenure and transaction volume (simplified model).

Business Goal

The business wants to re-engage inactive and dormant customers

Tool Used For Analysis

SQL (Target dialect: MySQL 8+)

Dataset Description

The dataset used for this problem was a database dump provided by the Cowrywise recruitment team

Dataset Dictionary

  1. users_customuser: customer demographic and contact information
  2. savings_savingsaccount: records of deposit transactions
  3. plans_plan: records of plans created by customers
  4. withdrawals_withdrawal: records of withdrawal transactions

Data Exploration:

  1. users_customuser: 1867 unique users
  2. savings_savingsaccount: 163736 transactions
  3. plans_plan: 9641 plans
  4. withdrawals_withdrawal: 1308 transactions
  5. Only 872 users funded either their savings or investment plan (select COUNT(DISTINCT owner_id) from adashi_staging.savings_savingsaccount;)

Notes & Assumptions

  • owner_id is a foreign key to the ID primary key in the users table
  • plan_id is a foreign key to the ID primary key in the plans table
  • savings_plan : is_regular_savings = 1
  • investment_plan: is_a_fund = 1
  • confirmed_amount is the field for value of inflow
  • amount_withdrawn is the field for value of withdrawal
  • all amount fields are in kobo (100 kobo = 1 Naira)
  • The two plan types (savings and investment) are mutually exclusive in the data, so a plan is never both.
  • Customer name can be blank

ERR Diagram

ERR Diagram

Data Validation

I conducted a data exploration to identify data quality issues and ensure consistency across the database. In this analysis, data cleaning was not performed, assuming the database would be used as is, since there is no reference table to verify correctness. However, I will note a few issues I discovered from exploring it below:

  • Null & Empty values across tables
  • Unwanted special characters
  • Errors in spelling
  • Missing date data in date columns

A preview of the plans_plan table

A preview of the users_customuser table

TASK 1: High-Value Customers with Multiple Products

Activity: Write a query to find customers with at least one funded savings plan AND one funded investment plan, sorted by total deposits.

*Approach: * Two CTEs are aggregated: funded_savings (plans where is_regular_savings = 1 and have at least one inflow) and funded_investments (is_a_fund = 1 and have at least one inflow). Each counts distinct plans with COUNT(DISTINCT p.id), so a plan with many inflow rows is counted once, and sums its confirmed inflows.
Inner-joining both CTEs to the users table enforces the "holds BOTH products" rule: a customer appears only if they are present in both aggregates. total_deposits is the combined inflow across both plan types, converted to Naira.

-- Funded savings plans per customer and total deposits
WITH funded_savings AS (
    SELECT 
        p.owner_id,
        COUNT(DISTINCT p.id) AS savings_count,
        SUM(sa.confirmed_amount) AS savings_deposits
    FROM 
        plans_plan AS p
    JOIN 
        (SELECT plan_id, confirmed_amount FROM savings_savingsaccount WHERE transaction_status IN (
            'success', 'monnify_success', 'successful'
        )
        AND confirmed_amount > 0) AS sa -- only transactions with any of the three values are genuine inflows (success, monnify_success, successful)
    ON sa.plan_id = p.id
    WHERE 
        p.is_regular_savings = 1
    GROUP BY 
        p.owner_id
),

-- Funded investment plans per customer and total deposits
funded_investments AS (
    SELECT 
        p.owner_id,
        COUNT(DISTINCT p.id) AS investment_count,
        SUM(inv.confirmed_amount) AS investment_deposits
    FROM 
        plans_plan AS p
    JOIN 
        (SELECT plan_id, confirmed_amount FROM savings_savingsaccount WHERE transaction_status IN (
            'success', 'monnify_success', 'successful'
        )
        AND confirmed_amount > 0) AS inv -- only transactions with any of the three values are genuine inflows (success, monnify_success, successful)
    ON inv.plan_id = p.id
    WHERE 
        p.is_a_fund = 1
    GROUP BY 
        p.owner_id
)

-- Aggregating users with both savings and investment plans 
SELECT 
    u.id AS owner_id,
    COALESCE(NULLIF(TRIM(u.name), ''), CONCAT(TRIM(first_name), ' ', TRIM(last_name))) AS name,
    fs.savings_count,
    fi.investment_count,
    ROUND((fs.savings_deposits + fi.investment_deposits) / 100, 2) AS total_deposits -- Conversion of total deposits from Kobo to Naira
FROM 
    users_customuser u
JOIN 
    funded_savings fs ON u.id = fs.owner_id
JOIN 
    funded_investments fi ON u.id = fi.owner_id
ORDER BY 
    total_deposits DESC;
Enter fullscreen mode Exit fullscreen mode

Result: 179 users have at least one funded savings plan and one funded investment plan.

High-Value Customers with Multiple Products

TASK 2: Transaction Frequency Analysis

Activity: Calculate the average number of transactions per customer per month and categorize them:

  • "High Frequency" (β‰₯10 transactions/month)
  • "Medium Frequency" (3-9 transactions/month)
  • "Low Frequency" (≀2 transactions/month)

Approach: A CTE counts each customer's confirmed inflows per distinct
calendar month (monthly_txns). A second CTE averages those monthly counts, which equals total transactions divided by the number of active months (avg_txn_per_customer). A third applies the CASE banding, and the final query reports the customer count and the average monthly rate for each band (categorized_customers).

-- Monthly transactions per customer
WITH monthly_txns AS (
    SELECT 
        owner_id, 
        -- COUNT(DISTINCT DATE_FORMAT(transaction_date, '%Y-%m')) AS active_month,
        DATE_FORMAT(transaction_date, '%Y-%m') AS txn_month, 
        COUNT(*) AS txn_count
    FROM savings_savingsaccount

    WHERE transaction_status IN (
            'success', 'monnify_success', 'successful'
        ) OR confirmed_amount > 0
    GROUP BY owner_id, txn_month
),

-- Average monthly transaction per customer
avg_txn_per_customer AS (
    SELECT 
        owner_id,
        SUM(txn_count)/COUNT(txn_month) AS avg_monthly_txn
    FROM monthly_txns
    GROUP BY owner_id
),

-- Categorization of customers based on their average transaction per month
categorized_customers AS (
    SELECT
        owner_id,
        avg_monthly_txn,
        CASE
            WHEN avg_monthly_txn >= 10 THEN 'High Frequency'
            WHEN avg_monthly_txn >= 3 THEN 'Medium Frequency'
            ELSE 'Low Frequency'
        END AS frequency_category
    FROM avg_txn_per_customer
)

SELECT 
    frequency_category,
    COUNT(owner_id) AS customer_count,
    ROUND(AVG(avg_monthly_txn), 1) AS avg_transactions_per_month
FROM categorized_customers
GROUP BY frequency_category
ORDER BY FIELD(frequency_category, 'High Frequency', 'Medium Frequency', 'Low Frequency');

Enter fullscreen mode Exit fullscreen mode

Result: Out of 872 active users, 12.8% are frequent savers, 18.6% are medium-frequency savers, and 68.6% are occasional savers.

Transaction Frequency Analysis

TASK 3: Account Inactivity Alert

Activity: Find all active accounts (savings or investments) with no transactions in the last 1 year (365 days).

Approach: The last_inflow CTE identifies each plan's most recent confirmed deposit. The main then LEFT JOINs the last_inflow CTE, so the scope filter (is_regular_savings = 1 OR is_a_fund = 1) and the active filter (is_deleted = 0 AND is_archived = 0) apply to the entire population exactly once. The inactivity_days metric tracks active accounts (those that have been funded at least once) whose most recent deposit was more than 365 days ago, as well as accounts that haven't been funded since creation.

SET @ref_date = CURRENT_DATE;

-- Last confirmed deposit per plan
WITH last_inflow AS (
    SELECT
        plan_id,
        MAX(DATE(transaction_date)) AS last_txn_date
    FROM savings_savingsaccount
        WHERE
        confirmed_amount > 0  AND transaction_status IN (
            'success', 'monnify_success', 'successful'
        )
    GROUP BY plan_id
)

SELECT
    p.id       AS plan_id,
    p.owner_id AS owner_id,
    CASE
        WHEN p.is_regular_savings = 1 THEN 'Savings'
        WHEN p.is_a_fund          = 1 THEN 'Investment'
    END        AS type,
    COALESCE(li.last_txn_date, DATE(p.created_on)) AS last_transaction_date,
    -- Days since last deposit; if no deposit, default to days since account creation
    DATEDIFF(@ref_date, COALESCE(li.last_txn_date, p.created_on)) AS inactivity_days
FROM plans_plan AS p
LEFT JOIN last_inflow AS li
       ON li.plan_id = p.id
WHERE (p.is_regular_savings = 1 OR p.is_a_fund = 1) 
-- Filter all active accounts, but the most recent one was over a year (365 Days) ago
  AND p.is_deleted  = 0
  AND p.is_archived = 0                       -- active accounts only
  AND DATEDIFF(@ref_date, COALESCE(li.last_txn_date, p.created_on)) > 365
ORDER BY inactivity_days DESC;
Enter fullscreen mode Exit fullscreen mode

Result: 2065 plans have been inactive for over 1 year

TASK 4: Customer Lifetime Value (CLV) Estimation

Activity: For each customer, assuming the profit_per_transaction is 0.1% of the transaction value, calculate:

  • Account tenure (months since signup)
  • Total transactions
  • Estimated CLV (Assume: CLV = (total_transactions / tenure) * 12 * avg_profit_per_transaction)
  • Order by estimated CLV from highest to lowest

Approach: The tenure CTE calculates the number of full months from signup to the reference date using TIMESTAMPDIFF(MONTH, ...). The transactions CTE returns total_txn (a transaction count) and SUM(confirmed_amount)/100 as the total transaction value in Naira.
CLV is computed as (total_value / tenure) * 12 * 0.1%.

SET @ref_date = CURRENT_DATE; 
SET @profit_rate = 0.001;

WITH tenure AS (
    SELECT 
        id AS owner_id, 
        COALESCE(nullif(trim(name), ''), concat(TRIM(first_name), ' ', TRIM(last_name))) AS name,
        DATE(date_joined) AS date_joined,
        TIMESTAMPDIFF(MONTH, date_joined, @ref_date) AS tenure_months 
    FROM users_customuser
    GROUP BY id
),
transactions AS (
    SELECT 
        owner_id,
        COUNT(id) AS total_txn, -- count of transactions
        SUM(confirmed_amount) / 100 AS total_txn_value -- total transaction value in Naira
    FROM savings_savingsaccount 
    WHERE 
    confirmed_amount > 0  AND transaction_status IN (
            'success', 'monnify_success', 'successful'
        )
    GROUP BY owner_id
)

-- Final CLV Calculation with division-safe logic
SELECT 
    t.owner_id AS customer_id, 
    t.name,
    t.tenure_months, 
    COALESCE(txn.total_txn, 0) AS total_transactions,
   CASE 
        WHEN t.tenure_months > 0 AND txn.total_txn > 0 THEN ROUND(((txn.total_txn_value / t.tenure_months) * 12 * @profit_rate ), 2)
        ELSE 0
    END AS estimated_clv
FROM tenure t 
LEFT JOIN transactions txn ON t.owner_id = txn.owner_id
ORDER BY estimated_clv DESC;
Enter fullscreen mode Exit fullscreen mode

Result: Chima Ataman is Cowrywise's most valuable customer, with a CLV score of ₦168446.74. They have been a customer for 35 months and have completed 1244 transactions.

Challenges and how I resolved them

  1. Identifying successful inflows. There were 27 different transfer statuses (SELECT DISTINCT transaction_status from savings_savingsaccount). Successful inflow transaction statuses were only confirmed when the transaction_status column was one of success, monnify_success, or successful.

  2. Data hygiene throughout. Blank names use a fallback; divide-by-zero and negative tenure are guarded; and distinct plan counting avoids inflation due to transaction volume.

All SQL scripts are available here: GitHub

Kindly leave a comment. Cheers!

Top comments (0)