Introduction.
- In today's competitive telecom market,customer retention is everything. Churn the rate at which customers stop using a service can make or break profitability.
- I will utilize SQL to explore a dataset of telecom customers,analyze churn patterns and uncover key insights that can help businesses take action.
Overview.
- The dataset is from a telecom company databel and i used it to investigate customer churn.
πΎ The data
The data consists of four parts:
- Customer status: the status and reason why customers churned
- Demographics: demographic data on each customer
- Contract information: information on the type of contract
- Subscription types & charges: numerical data on how the customer uses his subscription
Project Overview.
- The project was done entirely using PostgreSQL,focusing on data cleaning, transformation, and exploratory queries to answer specific business questions, including:
1οΈβ£ What is the average churn rate?
2οΈβ£ What are the top reasons for churn?
3οΈβ£ Do data usage and unlimited plans affect churn?
4οΈβ£ Are there demographic differences in churn (gender, age)?
5οΈβ£ How do contract types and payment methods influence customer retention?
π§Ή Step 1: Data Cleaning and Preparation.
- The first step was to prepare the raw dataset. Many column names were inconsistent they contained spaces,uppercase letters) I renamed them using snake_case for easier querying.
ALTER TABLE databel RENAME COLUMN "Customer ID" TO customer_id;
ALTER TABLE databel RENAME COLUMN "Churn Label" TO churn_label;
ALTER TABLE databel RENAME COLUMN "Account Length (in months)" TO account_length_months;
- I repeated this for all the columns, ensuring the dataset was clean and consistent.
- I handled missing values in churn_reason by replacing blanks with 'not defined'
UPDATE databel
SET churn_reason ='not defined'
WHERE churn_reason = '';
π Step 2: Calculating the Average Churn Rate
- To understand the performance , I computed the average churn rate
SELECT
AVG(CASE WHEN churn_label='Yes' THEN 1 ELSE 0 END)::NUMERIC(10,2) AS avg_churn
FROM databel;
- it shows the percentage of customers who left the company - a key health indicator for any telecom.
π Step 3: Identifying Key Churn Reasons
- To reveal the top churn drivers such as poor customer service,high data charges and lack of network reliability. I used this formula
SELECT churn_reason, COUNT(churn_reason) AS reasons_for_count
FROM databel
GROUP BY churn_reason
ORDER BY reasons_for_count DESC
LIMIT 5;
πΆ Step 4: Churn by Data Usage.
- Finding: Customers with low data usage (β€ 3 GB) showed a higher churn rate β possibly due to underutilization or perceived lack of value.
SELECT usage_category,
AVG(CASE WHEN churn_label='Yes' THEN 1 ELSE 0 END)::NUMERIC(10,2) AS churned_proportion
FROM (
SELECT *,
CASE
WHEN avg_montly_gb_download <= 3 THEN 'low usage (<=3)'
ELSE 'high usage (>3)'
END AS usage_category
FROM databel
) sub
GROUP BY usage_category;
π Step 5: The Impact of Unlimited Data Plans.
SELECT usage_category,
AVG(CASE WHEN churn_label='Yes' THEN 1 ELSE 0 END)::NUMERIC(10,2) AS churned_proportion
FROM (
SELECT *,
CASE
WHEN avg_montly_gb_download <= 3 THEN 'low usage (<=3)'
ELSE 'high usage (>3)'
END AS usage_category
FROM databel
WHERE unlimited_data_plan = 'Yes'
) sub
GROUP BY usage_category;
- Customers on unlimited data plans had significantly lower churn ,showing the retention power of flexible data offerings.
π₯ Step 6: Demographic Analysis
- By Gender.
SELECT gender,
AVG(CASE WHEN churn_label='Yes' THEN 1 ELSE 0 END)::NUMERIC(10,2) AS churned_proportion_per_gender
FROM databel
GROUP BY gender;
- By Age Category.
SELECT age_category,
AVG(CASE WHEN churn_label='Yes' THEN 1 ELSE 0 END)::NUMERIC(10,2) AS avg_churn
FROM (
SELECT *,
CASE
WHEN age <= 30 THEN 'Youths'
WHEN age BETWEEN 31 AND 50 THEN 'Adults'
ELSE 'Seniors'
END AS age_category
FROM databel
) sub
GROUP BY age_category;
- Younger customers (under 30) tended to churn more frequently.
π³ Step 7: Contract and Payment Insights.
- By Contract Type
SELECT contract_type,
AVG(CASE WHEN churn_label='Yes' THEN 1 ELSE 0 END)::NUMERIC(10,2) AS churned_proportion
FROM databel
GROUP BY contract_type
ORDER BY churned_proportion DESC;
- By Payment Method
SELECT payment_method,
AVG(CASE WHEN churn_label='Yes' THEN 1 ELSE 0 END)::NUMERIC(10,2) AS avg_churn
FROM databel
GROUP BY payment_method
ORDER BY avg_churn DESC;
- Contracts with month to month plans and manual payment methods showed higher churn,indicating that automation and long -term contracts help retention.
πΊοΈ Step 8: Geographic Insights
SELECT state,
AVG(CASE WHEN churn_label='Yes' THEN 1 ELSE 0 END)::NUMERIC(10,2) AS avg_churn_location
FROM databel
GROUP BY state
ORDER BY avg_churn_location DESC
LIMIT 2;
Key Takeaways
- Low data users churn more -Possibly due to underutilization.
- Unlimited Plans reduce churn - highlighting their customer loyalty effect.
- Young Customers are more likely to leave - engagement programs could help
- Contract type and payment methos are strong churn predictors
- Location -based strategies could address regional retention challenges.
Recommendations to Reduce Churn:
- Competitive Analysis: Investigate competitor offers and devices to develop more attractive packages.
- Customer Support Training: Enhance support staff training to improve customer interactions.
Top comments (0)