Oussama Belhadi

Predicting Customer Churn with TensorFlow – A Beginner-Friendly Guide

Introduction

Customer churn is when customers leave a company. Predicting churn helps businesses retain valuable customers and increase revenue.

In this tutorial, I’ll show you how to use TensorFlow, pandas, and scikit-learn to build a neural network that predicts churn based on a real dataset.

You can find a working, ready-to-run example in my GitHub repository.

No heavy theory — just step-by-step coding, explanations, and visuals.

The dataset is a .csv file of customer records, which we will use to train the model. It is also available in the GitHub repo.

Step 1: Setting Up the Environment

We need these libraries:

pip install pandas numpy scikit-learn tensorflow matplotlib
  • pandas → for data manipulation
  • numpy → for numeric computations
  • scikit-learn → preprocessing, scaling, train/test splitting
  • tensorflow → building neural networks
  • matplotlib → plotting results

Step 2: Load and Inspect the Dataset

Load the dataset with pandas:

import pandas as pd

df = pd.read_csv("customer_churn.csv")
df.head()


Tip: ⚠️ Always check your column names. Spaces or extra characters can break code later:

df.columns = df.columns.str.strip().str.replace(" ", "_")
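To see what that one-liner does, here is a small sketch on made-up column names (the names are illustrative assumptions, not taken from the real file):

```python
import pandas as pd

# Hypothetical messy headers, as they might come out of a spreadsheet export
messy = pd.DataFrame(columns=[" Customer ID", "Total Charges ", "Churn"])

# Strip surrounding whitespace, then replace inner spaces with underscores
messy.columns = messy.columns.str.strip().str.replace(" ", "_")

cleaned_names = list(messy.columns)
print(cleaned_names)
```

After this step every column can be referenced with a predictable name such as df['Total_Charges'].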

Step 3: Clean the Data

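A numeric column such as Total_Charges can be read as text if some cells are blank or contain stray characters. Convert it to numbers; errors='coerce' turns any value that cannot be parsed into NaN: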

df['Total_Charges'] = pd.to_numeric(df['Total_Charges'], errors='coerce')
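A minimal sketch of how the conversion behaves, using made-up values (one of them is a blank string where a number should be):

```python
import pandas as pd

# Hypothetical raw values: one entry is a blank string instead of a number
raw = pd.Series(["29.85", " ", "1889.5"])

# errors='coerce' replaces anything that cannot be parsed with NaN
converted = pd.to_numeric(raw, errors="coerce")

n_missing = int(converted.isna().sum())
print(converted.dtype, n_missing)
```

The NaN values produced here are exactly what the dropna() call in the next snippet removes.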

Drop missing rows and irrelevant columns:

df = df.dropna()
df.drop('Customer_ID', axis=1, inplace=True)

Step 4: Encode Categorical Variables

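Neural networks need numeric inputs, so convert each text category to a number:

from sklearn.preprocessing import LabelEncoder

# Map the target to 0/1 explicitly
df['Churn'] = df['Churn'].map({'Yes': 1, 'No': 0})

# Encode every remaining text column, keeping one fitted encoder per column
# so the same mapping can be reused on new data
cat_cols = df.select_dtypes(include='object').columns
encoders = {}
for col in cat_cols:
    encoders[col] = LabelEncoder()
    df[col] = encoders[col].fit_transform(df[col])

LabelEncoder assigns integers in sorted order of the category names. Example: Female → 0, Male → 1. The other categorical columns are encoded the same way.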
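A quick way to check which integer a category received is to inspect a fitted encoder. A minimal sketch with made-up values (the list below is an assumption, not the real column):

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical column values, for illustration only
genders = ["Male", "Female", "Female", "Male"]

encoder = LabelEncoder()
codes = encoder.fit_transform(genders)

# classes_ holds the categories in sorted order; position = assigned integer
mapping = {str(label): int(code) for code, label in enumerate(encoder.classes_)}
print(mapping)

# A fitted encoder can translate raw values for new rows, and back again
encoded_new = int(encoder.transform(["Female"])[0])
decoded = str(encoder.inverse_transform([1])[0])
```

Looking up the mapping this way is safer than guessing the integers by hand when you build a new customer row later.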

Step 5: Split Features and Target

Separate the input features (X) from the target (y):

X = df.drop('Churn', axis=1)
y = df['Churn']

Before scaling, each feature (column) has its own mean and standard deviation. Neural networks learn better when features are roughly in the same range.

Mean: the average value of the feature
Standard deviation: a measure of how spread out the values are

The formula for the mean (μ) of a feature with N values x_1, ..., x_N is:

$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$$

The formula for the standard deviation (σ) is:

$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}$$

StandardScaler subtracts the mean and divides by the standard deviation, so each value x becomes:

$$z = \frac{x - \mu}{\sigma}$$

After scaling, each feature has mean ~0 and std ~1.

[Figure: normal distribution example]
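The scaling arithmetic can be reproduced by hand on a handful of numbers. The values below are made up; StandardScaler applies the same calculation column by column, using the population standard deviation (dividing by N):

```python
from statistics import fmean, pstdev

# Hypothetical monthly charges for five customers
values = [20.0, 35.0, 50.0, 65.0, 80.0]

mu = fmean(values)        # mean
sigma = pstdev(values)    # population standard deviation (divides by N)

# Standardize: subtract the mean, divide by the standard deviation
scaled = [(x - mu) / sigma for x in values]

print(round(mu, 2), round(sigma, 2))
print([round(z, 2) for z in scaled])
```

The scaled list is centered on 0 with a spread of 1, whatever units the original values were in.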

Split into train/test sets before scaling, so the scaler's mean and standard deviation come from the training data only and nothing leaks from the test set:

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Scale the features (important for neural networks):

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # learn mean and std from the training set
X_test = scaler.transform(X_test)        # reuse the same statistics on the test set

Step 6: Build and Train the Neural Network

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

model = Sequential([
    Input(shape=(X_train.shape[1],)),   # one input per feature column
    Dense(32, activation='relu'),       # first hidden layer
    Dense(16, activation='relu'),       # second hidden layer
    Dense(1, activation='sigmoid')      # output: churn probability
])
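To get a feel for the size of this network: a Dense layer with n inputs and m neurons has n × m weights plus m biases. A rough sketch, assuming 19 input features (the number of columns in the prediction example in Step 8; your dataset may differ):

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

n_features = 19  # assumption: matches the 19 columns used in Step 8
layer_sizes = [32, 16, 1]

counts = []
prev = n_features
for units in layer_sizes:
    counts.append(dense_params(prev, units))
    prev = units

total_params = sum(counts)
print(counts, total_params)
```

Calling model.summary() on the real model reports the per-layer parameter counts, which you can compare against this calculation.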

Why these layers and activations?

[Figure: neural network layers]

  • Dense(32) and Dense(16) → number of neurons in each hidden layer. Experiment to see what works best.
  • ReLU activation → introduces non-linearity, helps the network learn complex patterns.
  • Sigmoid in output → outputs a probability between 0 and 1, perfect for binary classification.
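
The two activation functions are simple enough to write out in plain Python. This is only an illustration of the math, not how Keras implements them internally:

```python
import math

def relu(x: float) -> float:
    """Pass positive values through, clamp negatives to zero."""
    return max(0.0, x)

def sigmoid(x: float) -> float:
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

for x in (-2.0, 0.0, 3.0):
    print(x, relu(x), round(sigmoid(x), 3))
```

Because sigmoid always lands between 0 and 1, the output of the last layer can be read directly as a churn probability.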

Optimizer: Adam

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)
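The binary_crossentropy loss passed to compile can be computed by hand for a couple of predictions. A minimal sketch with made-up labels and probabilities; the eps clipping is a common safeguard against log(0) and is an assumption here, not a statement about Keras internals:

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Made-up labels and predicted probabilities
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(round(confident_right, 3), round(confident_wrong, 3))
```

The loss is small when the predicted probability agrees with the label and grows quickly when the model is confidently wrong, which is the signal the optimizer uses to adjust the weights.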

Why Adam?

  • Adaptive optimizer: adjusts the step size of each weight automatically
  • Combines advantages of Momentum and RMSProp
  • Works well out-of-the-box for most problems

The other two compile settings:

  • Loss function: binary_crossentropy → suitable for predicting 0/1 outcomes.
  • Metric: accuracy → how often the model predicts correctly.

Training

history = model.fit(
    X_train, y_train,
    validation_data=(X_test, y_test),
    epochs=20,
    batch_size=32
)

Epochs = 20 → the model passes over the full training set 20 times.

Batch size = 32 → the weights are updated once per batch of 32 samples.
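
Together, these two settings determine how many weight updates happen during training. A small worked example, assuming a hypothetical training set of 5,625 rows (your row count will differ):

```python
import math

n_train = 5625      # hypothetical number of training rows
batch_size = 32
epochs = 20

# One weight update per batch; the last batch of an epoch may be smaller
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)
```

The steps-per-epoch figure is the number shown on the progress bar that Keras prints for each epoch.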

Step 7: Evaluate and Visualize

import matplotlib.pyplot as plt

plt.plot(history.history['accuracy'], label='Train')
plt.plot(history.history['val_accuracy'], label='Validation')
plt.title('Accuracy over Epochs')
plt.legend()
plt.show()

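Output example:

[Figure: output chart example]

Compare the training and validation curves. If training accuracy keeps rising while validation accuracy stalls or drops, the model is overfitting. If both stay low, it is underfitting.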
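
The same comparison can be done numerically. The sketch below uses made-up accuracy values in the same shape as history.history; the helper name accuracy_gaps is hypothetical and the numbers are not results from this model:

```python
def accuracy_gaps(history: dict) -> list:
    """Train accuracy minus validation accuracy, one value per epoch."""
    return [t - v for t, v in zip(history["accuracy"], history["val_accuracy"])]

# Made-up numbers in the same shape as history.history
example_history = {
    "accuracy":     [0.74, 0.79, 0.82, 0.85, 0.88],
    "val_accuracy": [0.75, 0.78, 0.80, 0.80, 0.79],
}

gaps = accuracy_gaps(example_history)

# First epoch (1-based) that reaches the highest validation accuracy
val = example_history["val_accuracy"]
best_epoch = max(range(len(val)), key=lambda i: val[i]) + 1

print([round(g, 2) for g in gaps])
print("best validation epoch:", best_epoch)
```

A gap that keeps widening after the best validation epoch is a sign that further training mostly memorizes the training set.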

Step 8: Predict Churn for a New Customer

import pandas as pd

new_customer = pd.DataFrame([{
    'Gender': 0, 'Senior_Citizen': 0, 'Partner': 1, 'Dependents': 0,
    'tenure': 12, 'Phone_Service': 1, 'Multiple_Lines': 0, 'Internet_Service': 0,
    'Online_Security': 2, 'Online_Backup': 0, 'Device_Protection': 1,
    'Tech_Support': 0, 'Streaming_TV': 0, 'Streaming_Movies': 1, 'Contract': 0,
    'Paperless_Billing': 1, 'Payment_Method': 2, 'Monthly_Charges': 50.0, 'Total_Charges': 500.0
}])

new_customer_scaled = scaler.transform(new_customer)
churn_prob = model.predict(new_customer_scaled)[0][0]
churn_label = int(churn_prob > 0.5)

print(f"Churn Probability: {churn_prob:.2f}")
print(f"Churn Prediction: {churn_label} ({'Yes' if churn_label==1 else 'No'})")
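The scaler and the model both rely on the columns of a new row being in the same order as during training. A minimal sketch of one way to enforce that, using a shortened, hypothetical feature list (in the tutorial this would be X.columns):

```python
import pandas as pd

# Hypothetical training column order (in the tutorial this would be X.columns)
feature_order = ["tenure", "Monthly_Charges", "Total_Charges"]

# A new row whose keys were typed in a different order
new_row = pd.DataFrame([{"Total_Charges": 500.0, "tenure": 12, "Monthly_Charges": 50.0}])

# Selecting by the training column list reorders the columns
# and raises a KeyError if any expected column is missing
aligned = new_row[feature_order]

print(list(aligned.columns))
```

Running the new row through this kind of selection before scaler.transform catches missing or misspelled columns early.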

Conclusion

You now have a complete pipeline to:

  • Clean and preprocess data
  • Train a neural network in TensorFlow
  • Evaluate model performance
  • Predict churn for new customers

This workflow is reusable for other tabular datasets and binary classification problems.
