INTRODUCTION
Cowrywise is a fintech company in Nigeria. It is set up to make saving, investing, and growing wealth attractive to everyone at little to no cost. It seeks to retain active customers and to develop a reactivation strategy for inactive and dormant customers.
Business Problem Statement
- The business wants to identify customers who have both a savings and an investment plan (cross-selling opportunity).
- The finance team wants to analyze how often customers transact to segment them (e.g., frequent vs. occasional users).
- The ops team wants to flag accounts with no inflow transactions for over one year.
- Marketing wants to estimate CLV based on account tenure and transaction volume (simplified model).
Business Goal
The business wants to re-engage inactive and dormant customers
Tool Used For Analysis
SQL (Target dialect: MySQL 8+)
Dataset Description
The dataset used for this problem was a database dump provided by the Cowrywise recruitment team
Dataset Dictionary
- users_customuser: customer demographic and contact information
- savings_savingsaccount: records of deposit transactions
- plans_plan: records of plans created by customers
- withdrawals_withdrawal: records of withdrawal transactions
Data Exploration:
- users_customuser: 1867 unique users
- savings_savingsaccount: 163736 transactions
- plans_plan: 9641 plans
- withdrawals_withdrawal: 1308 transactions
- Only 872 users funded either their savings or investment plan (
select COUNT(DISTINCT owner_id) from adashi_staging.savings_savingsaccount;)
Notes & Assumptions
- owner_id is a foreign key to the ID primary key in the users table
- plan_id is a foreign key to the ID primary key in the plans table
- savings_plan : is_regular_savings = 1
- investment_plan: is_a_fund = 1
- confirmed_amount is the field for value of inflow
- amount_withdrawn is the field for value of withdrawal
- all amount fields are in kobo (100 kobo = 1 Naira)
- The two plan types (savings and investment) are mutually exclusive in the data, so a plan is never both.
- Customer name can be blank
ERR Diagram
Data Validation
I conducted a data exploration to identify data quality issues and ensure consistency across the database. In this analysis, data cleaning was not performed, assuming the database would be used as is, since there is no reference table to verify correctness. However, I will note a few issues I discovered from exploring it below:
- Null & Empty values across tables
- Unwanted special characters
- Errors in spelling
- Missing date data in date columns
TASK 1: High-Value Customers with Multiple Products
Activity: Write a query to find customers with at least one funded savings plan AND one funded investment plan, sorted by total deposits.
*Approach: * Two CTEs are aggregated: funded_savings (plans where is_regular_savings = 1 and have at least one inflow) and funded_investments (is_a_fund = 1 and have at least one inflow). Each counts distinct plans with COUNT(DISTINCT p.id), so a plan with many inflow rows is counted once, and sums its confirmed inflows.
Inner-joining both CTEs to the users table enforces the "holds BOTH products" rule: a customer appears only if they are present in both aggregates. total_deposits is the combined inflow across both plan types, converted to Naira.
-- Funded savings plans per customer and total deposits
WITH funded_savings AS (
SELECT
p.owner_id,
COUNT(DISTINCT p.id) AS savings_count,
SUM(sa.confirmed_amount) AS savings_deposits
FROM
plans_plan AS p
JOIN
(SELECT plan_id, confirmed_amount FROM savings_savingsaccount WHERE transaction_status IN (
'success', 'monnify_success', 'successful'
)
AND confirmed_amount > 0) AS sa -- only transactions with any of the three values are genuine inflows (success, monnify_success, successful)
ON sa.plan_id = p.id
WHERE
p.is_regular_savings = 1
GROUP BY
p.owner_id
),
-- Funded investment plans per customer and total deposits
funded_investments AS (
SELECT
p.owner_id,
COUNT(DISTINCT p.id) AS investment_count,
SUM(inv.confirmed_amount) AS investment_deposits
FROM
plans_plan AS p
JOIN
(SELECT plan_id, confirmed_amount FROM savings_savingsaccount WHERE transaction_status IN (
'success', 'monnify_success', 'successful'
)
AND confirmed_amount > 0) AS inv -- only transactions with any of the three values are genuine inflows (success, monnify_success, successful)
ON inv.plan_id = p.id
WHERE
p.is_a_fund = 1
GROUP BY
p.owner_id
)
-- Aggregating users with both savings and investment plans
SELECT
u.id AS owner_id,
COALESCE(NULLIF(TRIM(u.name), ''), CONCAT(TRIM(first_name), ' ', TRIM(last_name))) AS name,
fs.savings_count,
fi.investment_count,
ROUND((fs.savings_deposits + fi.investment_deposits) / 100, 2) AS total_deposits -- Conversion of total deposits from Kobo to Naira
FROM
users_customuser u
JOIN
funded_savings fs ON u.id = fs.owner_id
JOIN
funded_investments fi ON u.id = fi.owner_id
ORDER BY
total_deposits DESC;
Result: 179 users have at least one funded savings plan and one funded investment plan.
TASK 2: Transaction Frequency Analysis
Activity: Calculate the average number of transactions per customer per month and categorize them:
- "High Frequency" (β₯10 transactions/month)
- "Medium Frequency" (3-9 transactions/month)
- "Low Frequency" (β€2 transactions/month)
Approach: A CTE counts each customer's confirmed inflows per distinct
calendar month (monthly_txns). A second CTE averages those monthly counts, which equals total transactions divided by the number of active months (avg_txn_per_customer). A third applies the CASE banding, and the final query reports the customer count and the average monthly rate for each band (categorized_customers).
-- Monthly transactions per customer
WITH monthly_txns AS (
SELECT
owner_id,
-- COUNT(DISTINCT DATE_FORMAT(transaction_date, '%Y-%m')) AS active_month,
DATE_FORMAT(transaction_date, '%Y-%m') AS txn_month,
COUNT(*) AS txn_count
FROM savings_savingsaccount
WHERE transaction_status IN (
'success', 'monnify_success', 'successful'
) OR confirmed_amount > 0
GROUP BY owner_id, txn_month
),
-- Average monthly transaction per customer
avg_txn_per_customer AS (
SELECT
owner_id,
SUM(txn_count)/COUNT(txn_month) AS avg_monthly_txn
FROM monthly_txns
GROUP BY owner_id
),
-- Categorization of customers based on their average transaction per month
categorized_customers AS (
SELECT
owner_id,
avg_monthly_txn,
CASE
WHEN avg_monthly_txn >= 10 THEN 'High Frequency'
WHEN avg_monthly_txn >= 3 THEN 'Medium Frequency'
ELSE 'Low Frequency'
END AS frequency_category
FROM avg_txn_per_customer
)
SELECT
frequency_category,
COUNT(owner_id) AS customer_count,
ROUND(AVG(avg_monthly_txn), 1) AS avg_transactions_per_month
FROM categorized_customers
GROUP BY frequency_category
ORDER BY FIELD(frequency_category, 'High Frequency', 'Medium Frequency', 'Low Frequency');
Result: Out of 872 active users, 12.8% are frequent savers, 18.6% are medium-frequency savers, and 68.6% are occasional savers.
TASK 3: Account Inactivity Alert
Activity: Find all active accounts (savings or investments) with no transactions in the last 1 year (365 days).
Approach: The last_inflow CTE identifies each plan's most recent confirmed deposit. The main then LEFT JOINs the last_inflow CTE, so the scope filter (is_regular_savings = 1 OR is_a_fund = 1) and the active filter (is_deleted = 0 AND is_archived = 0) apply to the entire population exactly once. The inactivity_days metric tracks active accounts (those that have been funded at least once) whose most recent deposit was more than 365 days ago, as well as accounts that haven't been funded since creation.
SET @ref_date = CURRENT_DATE;
-- Last confirmed deposit per plan
WITH last_inflow AS (
SELECT
plan_id,
MAX(DATE(transaction_date)) AS last_txn_date
FROM savings_savingsaccount
WHERE
confirmed_amount > 0 AND transaction_status IN (
'success', 'monnify_success', 'successful'
)
GROUP BY plan_id
)
SELECT
p.id AS plan_id,
p.owner_id AS owner_id,
CASE
WHEN p.is_regular_savings = 1 THEN 'Savings'
WHEN p.is_a_fund = 1 THEN 'Investment'
END AS type,
COALESCE(li.last_txn_date, DATE(p.created_on)) AS last_transaction_date,
-- Days since last deposit; if no deposit, default to days since account creation
DATEDIFF(@ref_date, COALESCE(li.last_txn_date, p.created_on)) AS inactivity_days
FROM plans_plan AS p
LEFT JOIN last_inflow AS li
ON li.plan_id = p.id
WHERE (p.is_regular_savings = 1 OR p.is_a_fund = 1)
-- Filter all active accounts, but the most recent one was over a year (365 Days) ago
AND p.is_deleted = 0
AND p.is_archived = 0 -- active accounts only
AND DATEDIFF(@ref_date, COALESCE(li.last_txn_date, p.created_on)) > 365
ORDER BY inactivity_days DESC;
Result: 2065 plans have been inactive for over 1 year
TASK 4: Customer Lifetime Value (CLV) Estimation
Activity: For each customer, assuming the profit_per_transaction is 0.1% of the transaction value, calculate:
- Account tenure (months since signup)
- Total transactions
- Estimated CLV (Assume: CLV = (total_transactions / tenure) * 12 * avg_profit_per_transaction)
- Order by estimated CLV from highest to lowest
Approach: The tenure CTE calculates the number of full months from signup to the reference date using TIMESTAMPDIFF(MONTH, ...). The transactions CTE returns total_txn (a transaction count) and SUM(confirmed_amount)/100 as the total transaction value in Naira.
CLV is computed as (total_value / tenure) * 12 * 0.1%.
SET @ref_date = CURRENT_DATE;
SET @profit_rate = 0.001;
WITH tenure AS (
SELECT
id AS owner_id,
COALESCE(nullif(trim(name), ''), concat(TRIM(first_name), ' ', TRIM(last_name))) AS name,
DATE(date_joined) AS date_joined,
TIMESTAMPDIFF(MONTH, date_joined, @ref_date) AS tenure_months
FROM users_customuser
GROUP BY id
),
transactions AS (
SELECT
owner_id,
COUNT(id) AS total_txn, -- count of transactions
SUM(confirmed_amount) / 100 AS total_txn_value -- total transaction value in Naira
FROM savings_savingsaccount
WHERE
confirmed_amount > 0 AND transaction_status IN (
'success', 'monnify_success', 'successful'
)
GROUP BY owner_id
)
-- Final CLV Calculation with division-safe logic
SELECT
t.owner_id AS customer_id,
t.name,
t.tenure_months,
COALESCE(txn.total_txn, 0) AS total_transactions,
CASE
WHEN t.tenure_months > 0 AND txn.total_txn > 0 THEN ROUND(((txn.total_txn_value / t.tenure_months) * 12 * @profit_rate ), 2)
ELSE 0
END AS estimated_clv
FROM tenure t
LEFT JOIN transactions txn ON t.owner_id = txn.owner_id
ORDER BY estimated_clv DESC;
Result: Chima Ataman is Cowrywise's most valuable customer, with a CLV score of β¦168446.74. They have been a customer for 35 months and have completed 1244 transactions.
Challenges and how I resolved them
Identifying successful inflows. There were 27 different transfer statuses (
SELECT DISTINCT transaction_status from savings_savingsaccount). Successful inflow transaction statuses were only confirmed when the transaction_status column was one of success, monnify_success, or successful.Data hygiene throughout. Blank names use a fallback; divide-by-zero and negative tenure are guarded; and distinct plan counting avoids inflation due to transaction volume.
All SQL scripts are available here: GitHub
Kindly leave a comment. Cheers!







Top comments (0)