Introduction.
- In today's competitive telecom market,customer retention is everything. Churn the rate at which customers stop using a service can make or break profitability.
- I will utilize SQL to explore a dataset of telecom customers,analyze churn patterns and uncover key insights that can help businesses take action.
Overview.
- The dataset is from a telecom company databel and i used it to investigate customer churn.
๐พ The data
The data consists of four parts:
- Customer status: the status and reason why customers churned
- Demographics: demographic data on each customer
- Contract information: information on the type of contract
- Subscription types & charges: numerical data on how the customer uses his subscription
Project Overview.
- The project was done entirely using PostgreSQL,focusing on data cleaning, transformation, and exploratory queries to answer specific business questions, including:
1๏ธโฃ What is the average churn rate?
2๏ธโฃ What are the top reasons for churn?
3๏ธโฃ Do data usage and unlimited plans affect churn?
4๏ธโฃ Are there demographic differences in churn (gender, age)?
5๏ธโฃ How do contract types and payment methods influence customer retention?
๐งน Step 1: Data Cleaning and Preparation.
- The first step was to prepare the raw dataset. Many column names were inconsistent they contained spaces,uppercase letters) I renamed them using snake_case for easier querying.
ALTER TABLE databel RENAME COLUMN "Customer ID" TO customer_id;
ALTER TABLE databel RENAME COLUMN "Churn Label" TO churn_label;
ALTER TABLE databel RENAME COLUMN "Account Length (in months)" TO account_length_months;
- I repeated this for all the columns, ensuring the dataset was clean and consistent.
- I handled missing values in churn_reason by replacing blanks with 'not defined'
UPDATE databel
SET churn_reason ='not defined'
WHERE churn_reason = '';
๐ Step 2: Calculating the Average Churn Rate
- To understand the performance , I computed the average churn rate
SELECT
AVG(CASE WHEN churn_label='Yes' THEN 1 ELSE 0 END)::NUMERIC(10,2) AS avg_churn
FROM databel;
- it shows the percentage of customers who left the company - a key health indicator for any telecom.
๐ Step 3: Identifying Key Churn Reasons
- To reveal the top churn drivers such as poor customer service,high data charges and lack of network reliability. I used this formula
SELECT churn_reason, COUNT(churn_reason) AS reasons_for_count
FROM databel
GROUP BY churn_reason
ORDER BY reasons_for_count DESC
LIMIT 5;
๐ถ Step 4: Churn by Data Usage.
- Finding: Customers with low data usage (โค 3 GB) showed a higher churn rate โ possibly due to underutilization or perceived lack of value.
SELECT usage_category,
AVG(CASE WHEN churn_label='Yes' THEN 1 ELSE 0 END)::NUMERIC(10,2) AS churned_proportion
FROM (
SELECT *,
CASE
WHEN avg_montly_gb_download <= 3 THEN 'low usage (<=3)'
ELSE 'high usage (>3)'
END AS usage_category
FROM databel
) sub
GROUP BY usage_category;
๐ Step 5: The Impact of Unlimited Data Plans.
SELECT usage_category,
AVG(CASE WHEN churn_label='Yes' THEN 1 ELSE 0 END)::NUMERIC(10,2) AS churned_proportion
FROM (
SELECT *,
CASE
WHEN avg_montly_gb_download <= 3 THEN 'low usage (<=3)'
ELSE 'high usage (>3)'
END AS usage_category
FROM databel
WHERE unlimited_data_plan = 'Yes'
) sub
GROUP BY usage_category;
- Customers on unlimited data plans had significantly lower churn ,showing the retention power of flexible data offerings.
๐ฅ Step 6: Demographic Analysis
- By Gender.
SELECT gender,
AVG(CASE WHEN churn_label='Yes' THEN 1 ELSE 0 END)::NUMERIC(10,2) AS churned_proportion_per_gender
FROM databel
GROUP BY gender;
- By Age Category.
SELECT age_category,
AVG(CASE WHEN churn_label='Yes' THEN 1 ELSE 0 END)::NUMERIC(10,2) AS avg_churn
FROM (
SELECT *,
CASE
WHEN age <= 30 THEN 'Youths'
WHEN age BETWEEN 31 AND 50 THEN 'Adults'
ELSE 'Seniors'
END AS age_category
FROM databel
) sub
GROUP BY age_category;
- Younger customers (under 30) tended to churn more frequently.
๐ณ Step 7: Contract and Payment Insights.
- By Contract Type
SELECT contract_type,
AVG(CASE WHEN churn_label='Yes' THEN 1 ELSE 0 END)::NUMERIC(10,2) AS churned_proportion
FROM databel
GROUP BY contract_type
ORDER BY churned_proportion DESC;
- By Payment Method
SELECT payment_method,
AVG(CASE WHEN churn_label='Yes' THEN 1 ELSE 0 END)::NUMERIC(10,2) AS avg_churn
FROM databel
GROUP BY payment_method
ORDER BY avg_churn DESC;
- Contracts with month to month plans and manual payment methods showed higher churn,indicating that automation and long -term contracts help retention.
๐บ๏ธ Step 8: Geographic Insights
SELECT state,
AVG(CASE WHEN churn_label='Yes' THEN 1 ELSE 0 END)::NUMERIC(10,2) AS avg_churn_location
FROM databel
GROUP BY state
ORDER BY avg_churn_location DESC
LIMIT 2;
Key Takeaways
- Low data users churn more -Possibly due to underutilization.
- Unlimited Plans reduce churn - highlighting their customer loyalty effect.
- Young Customers are more likely to leave - engagement programs could help
- Contract type and payment methos are strong churn predictors
- Location -based strategies could address regional retention challenges.
Recommendations to Reduce Churn:
- Competitive Analysis: Investigate competitor offers and devices to develop more attractive packages.
- Customer Support Training: Enhance support staff training to improve customer interactions.
Top comments (0)