Hey folks! 👋
I recently built a customer churn prediction system that not only predicts who will leave, but also explains why in plain English using Amazon Bedrock.
In this tutorial, I'll walk you through building the entire pipeline from scratch.
What we achieved:
- ✅ 84.2% AUC on validation data
- ✅ Real-time predictions via a SageMaker endpoint
- ✅ Natural language explanations powered by Claude (Bedrock)
Let's dive in!
## 🎯 What We're Building
An end-to-end ML pipeline that:
- Ingests customer data into S3
- Trains a churn prediction model with SageMaker XGBoost
- Deploys a real-time inference endpoint
- Explains predictions using Amazon Bedrock (Claude)
- Exposes everything via API Gateway + Lambda
Prerequisites: an AWS account and basic Python knowledge.
## 🏗️ Architecture Overview
The pipeline consists of 5 tiers:
| Tier | Services | Purpose |
|---|---|---|
| Data Ingestion | S3 | Store raw customer data |
| ML Training | SageMaker Training | Train XGBoost model |
| Model Storage | S3 | Store model artifacts |
| Inference & AI | SageMaker Endpoint, Bedrock | Real-time predictions + NL explanations |
| API Layer | API Gateway, Lambda | Expose REST API |
## Step 1: Set Up S3 and Upload Data
First, create an S3 bucket and upload the dataset.
```bash
# Set bucket name with your account ID
export BUCKET_NAME=churn-prediction-$(aws sts get-caller-identity --query Account --output text)

# Create the bucket
aws s3 mb s3://$BUCKET_NAME

# Upload your data
aws s3 cp WA_Fn-UseC_-Telco-Customer-Churn.csv s3://$BUCKET_NAME/raw/
```
📥 Dataset: Download the Telco Customer Churn dataset from Kaggle.
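If you'd rather stay in Python, here's a minimal boto3 sketch of the same two commands (the bucket name and file path are assumed to match the CLI above):

```python
import boto3

# Derive the same bucket name the CLI commands use
ACCOUNT_ID = boto3.client('sts').get_caller_identity()['Account']
BUCKET = f'churn-prediction-{ACCOUNT_ID}'

s3 = boto3.client('s3')
region = boto3.session.Session().region_name

# us-east-1 rejects an explicit LocationConstraint; other regions require it
if region == 'us-east-1':
    s3.create_bucket(Bucket=BUCKET)
else:
    s3.create_bucket(
        Bucket=BUCKET,
        CreateBucketConfiguration={'LocationConstraint': region}
    )

# Upload the raw dataset under the raw/ prefix
s3.upload_file('WA_Fn-UseC_-Telco-Customer-Churn.csv', BUCKET,
               'raw/WA_Fn-UseC_-Telco-Customer-Churn.csv')
```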
## Step 2: Create the SageMaker IAM Role
In the AWS Console:
- Go to IAM → Roles → Create role
- Select SageMaker - Execution
- Add policies: `AmazonSageMakerFullAccess` + `AmazonS3FullAccess`
- Name it `SageMakerChurnRole`
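If you prefer to script this step, here's a boto3 sketch of the same role (the trust policy lets SageMaker assume the role; the two managed policies match the ones listed above):

```python
import json
import boto3

iam = boto3.client('iam')

# Trust policy: allow the SageMaker service to assume this role
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "sagemaker.amazonaws.com"},
        "Action": "sts:AssumeRole"
    }]
}

role = iam.create_role(
    RoleName='SageMakerChurnRole',
    AssumeRolePolicyDocument=json.dumps(trust_policy)
)

# Attach the same two managed policies as in the Console flow
for policy in ('AmazonSageMakerFullAccess', 'AmazonS3FullAccess'):
    iam.attach_role_policy(
        RoleName='SageMakerChurnRole',
        PolicyArn=f'arn:aws:iam::aws:policy/{policy}'
    )

print(role['Role']['Arn'])  # use this as ROLE_ARN in Step 3
```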
## Step 3: Train the Model
Create `train_churn.py`:
```python
import os

import boto3
import pandas as pd
import sagemaker
from sagemaker.inputs import TrainingInput
from sklearn.model_selection import train_test_split

# Config
BUCKET = os.environ['BUCKET_NAME']
ROLE = os.environ['ROLE_ARN']
PREFIX = 'churn-prediction'

session = sagemaker.Session()
region = session.boto_region_name

# Load and prepare data
df = pd.read_csv('WA_Fn-UseC_-Telco-Customer-Churn.csv')
df['TotalCharges'] = pd.to_numeric(df['TotalCharges'], errors='coerce').fillna(0)
df['Churn'] = (df['Churn'] == 'Yes').astype(int)

# Encode categorical columns as integer codes
cat_cols = ['gender', 'Partner', 'Dependents', 'PhoneService', 'MultipleLines',
            'InternetService', 'OnlineSecurity', 'OnlineBackup', 'DeviceProtection',
            'TechSupport', 'StreamingTV', 'StreamingMovies', 'Contract',
            'PaperlessBilling', 'PaymentMethod']
for col in cat_cols:
    df[col] = df[col].astype('category').cat.codes

# Features (order matters: the endpoint expects this exact column order)
feature_cols = ['SeniorCitizen', 'tenure', 'MonthlyCharges', 'TotalCharges'] + cat_cols
X = df[feature_cols]
y = df['Churn']

# Split and save (SageMaker XGBoost expects the label first, no header)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
train_df = pd.concat([y_train.reset_index(drop=True), X_train.reset_index(drop=True)], axis=1)
test_df = pd.concat([y_test.reset_index(drop=True), X_test.reset_index(drop=True)], axis=1)
train_df.to_csv('train.csv', index=False, header=False)
test_df.to_csv('test.csv', index=False, header=False)

# Upload to S3
s3 = boto3.client('s3')
s3.upload_file('train.csv', BUCKET, f'{PREFIX}/train/train.csv')
s3.upload_file('test.csv', BUCKET, f'{PREFIX}/test/test.csv')

# Train XGBoost with the built-in container
container = sagemaker.image_uris.retrieve('xgboost', region, '1.7-1')
xgb = sagemaker.estimator.Estimator(
    image_uri=container,
    role=ROLE,
    instance_count=1,
    instance_type='ml.m5.xlarge',
    output_path=f's3://{BUCKET}/{PREFIX}/output',
    sagemaker_session=session
)
xgb.set_hyperparameters(
    objective='binary:logistic',
    num_round=100,
    max_depth=5,
    eta=0.2,
    eval_metric='auc'
)
xgb.fit({
    'train': TrainingInput(f's3://{BUCKET}/{PREFIX}/train', content_type='csv'),
    'validation': TrainingInput(f's3://{BUCKET}/{PREFIX}/test', content_type='csv')
})
print('✅ Training complete!')
print(f'Model artifact: {xgb.model_data}')

# Deploy a real-time endpoint
print('Deploying endpoint (3-5 min)...')
predictor = xgb.deploy(
    initial_instance_count=1,
    instance_type='ml.t2.medium',
    endpoint_name='churn-prediction-endpoint',
    serializer=sagemaker.serializers.CSVSerializer()
)
print(f'✅ Endpoint deployed: {predictor.endpoint_name}')

# Quick sanity check with the first test row (features only, no label)
sample = ','.join(str(v) for v in X_test.iloc[0].values)
prob = float(predictor.predict(sample))
print(f'Test prediction: {prob:.1%} churn probability')
```
Run it:
```bash
export BUCKET_NAME=churn-prediction-YOUR_ACCOUNT_ID
export ROLE_ARN=arn:aws:iam::YOUR_ACCOUNT_ID:role/SageMakerChurnRole
python3 train_churn.py
```
Training output:
```
2026-01-01 00:24:27 Uploading - Uploading generated training model
2026-01-01 00:24:27 Completed - Training job completed
Training seconds: 103
Billable seconds: 103
✅ Training complete!
Model artifact: s3://churn-prediction-905418352184/churn-prediction/output/sagemaker-xgboost-2026-01-01-00-22-03-339/output/model.tar.gz
Deploying endpoint (3-5 min)...
INFO:sagemaker:Creating model with name: sagemaker-xgboost-2026-01-01-00-24-53-959
INFO:sagemaker:Creating endpoint-config with name churn-prediction-endpoint
INFO:sagemaker:Creating endpoint with name churn-prediction-endpoint
---------------!
✅ Endpoint deployed: churn-prediction-endpoint
Test prediction: 0.4% churn probability
```
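Before adding the API layer, it's worth sanity-checking the endpoint directly with the low-level `sagemaker-runtime` client. The 19-value CSV row below follows the exact feature order from `train_churn.py` (SeniorCitizen, tenure, MonthlyCharges, TotalCharges, then the 15 encoded categoricals):

```python
import boto3

runtime = boto3.client('sagemaker-runtime')

# One feature row in the training column order (no label)
row = "0,24,65.5,1500.0,1,0,1,2,0,0,1,1,0,0,1,0,2,1,1"

response = runtime.invoke_endpoint(
    EndpointName='churn-prediction-endpoint',
    ContentType='text/csv',
    Body=row
)
prob = float(response['Body'].read())
print(f"Churn probability: {prob:.1%}")
```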
## Step 4: Create Lambda with Bedrock Integration
Create a Lambda function `ChurnPredictionAPI` with this code:
```python
import json
import os

import boto3

sagemaker_runtime = boto3.client('sagemaker-runtime')
bedrock = boto3.client('bedrock-runtime')

ENDPOINT_NAME = os.environ.get('SAGEMAKER_ENDPOINT', 'churn-prediction-endpoint')


def lambda_handler(event, context):
    # API Gateway wraps the payload in a JSON string under 'body'
    body = json.loads(event['body']) if isinstance(event.get('body'), str) else event

    # Get the churn probability from the SageMaker endpoint
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType='text/csv',
        Body=body['features']
    )
    churn_prob = float(response['Body'].read().decode())

    # Generate a plain-English explanation with Bedrock (Claude)
    prompt = f"""A customer has {churn_prob:.1%} churn probability.
Customer: Tenure {body.get('tenure', 'N/A')} months, ${body.get('monthly_charges', 'N/A')}/month, {body.get('contract', 'N/A')} contract.
In 2 sentences, explain the risk and suggest one retention action."""

    bedrock_response = bedrock.invoke_model(
        modelId='anthropic.claude-3-haiku-20240307-v1:0',
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 100,
            "messages": [{"role": "user", "content": prompt}]
        })
    )
    explanation = json.loads(bedrock_response['body'].read())['content'][0]['text']

    risk = "High" if churn_prob > 0.7 else "Medium" if churn_prob > 0.4 else "Low"

    return {
        'statusCode': 200,
        'headers': {'Content-Type': 'application/json'},
        'body': json.dumps({
            'churn_probability': f"{churn_prob:.1%}",
            'risk_level': risk,
            'explanation': explanation
        })
    }
```
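You can smoke-test the handler locally before deploying. This sketch assumes the code above is saved as `lambda_function.py` and that your local AWS credentials can reach both the endpoint and Bedrock:

```python
# local_test.py — invoke lambda_handler directly with a sample event
from lambda_function import lambda_handler

event = {
    "features": "0,24,65.5,1500.0,1,0,1,2,0,0,1,1,0,0,1,0,2,1,1",
    "tenure": 24,
    "monthly_charges": 65.5,
    "contract": "Month-to-month"
}

# No 'body' key, so the handler treats the event itself as the payload
print(lambda_handler(event, None))
```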
Lambda configuration:
- Runtime: Python 3.11
- Timeout: 30 seconds
- Role: `LambdaChurnRole` (with SageMaker + Bedrock permissions)
- Environment variable: `SAGEMAKER_ENDPOINT=churn-prediction-endpoint`
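The two permissions `LambdaChurnRole` needs beyond basic Lambda execution are `sagemaker:InvokeEndpoint` and `bedrock:InvokeModel`. Here's a boto3 sketch that attaches them as an inline policy (the role is assumed to already exist with the standard Lambda trust policy; tighten `Resource` to specific ARNs in production):

```python
import json
import boto3

iam = boto3.client('iam')

# Minimal permissions for the handler: call the endpoint and call Bedrock
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "sagemaker:InvokeEndpoint", "Resource": "*"},
        {"Effect": "Allow", "Action": "bedrock:InvokeModel", "Resource": "*"}
    ]
}

iam.put_role_policy(
    RoleName='LambdaChurnRole',
    PolicyName='ChurnInferenceAccess',
    PolicyDocument=json.dumps(policy)
)
```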
## Step 5: Create API Gateway
- Create an HTTP API in API Gateway
- Add a Lambda integration → `ChurnPredictionAPI`
- Create a POST route: `/predict`
- Deploy and get your invoke URL
## 🧪 Test the API
```bash
curl -X POST "https://YOUR_API_URL/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "features": "0,24,65.5,1500.0,1,0,1,2,0,0,1,1,0,0,1,0,2,1,1",
    "tenure": 24,
    "monthly_charges": 65.5,
    "contract": "Month-to-month"
  }'
```
Response:
```json
{
  "churn_probability": "0.6%",
  "risk_level": "Low",
  "explanation": "The customer's high churn probability of 0.6% and the month-to-month contract indicate a significant risk of losing the customer. To mitigate this risk, a retention action could be to offer the customer a longer-term contract with a discounted monthly rate or additional benefits, which may help increase their loyalty and reduce the likelihood of churn."
}
```
## 🧹 Cleanup
Don't forget to delete resources to avoid charges:
```bash
# Delete the SageMaker endpoint (the most expensive resource!)
aws sagemaker delete-endpoint --endpoint-name churn-prediction-endpoint
aws sagemaker delete-endpoint-config --endpoint-config-name churn-prediction-endpoint

# Delete the Lambda function
aws lambda delete-function --function-name ChurnPredictionAPI

# Delete the S3 bucket
aws s3 rb s3://$BUCKET_NAME --force
```
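Or, if you're scripting teardown, a boto3 equivalent of the same cleanup (resource names assumed to match the ones used throughout this post):

```python
import boto3

# Endpoint first: it bills by the hour
sm = boto3.client('sagemaker')
sm.delete_endpoint(EndpointName='churn-prediction-endpoint')
sm.delete_endpoint_config(EndpointConfigName='churn-prediction-endpoint')

# Lambda function
boto3.client('lambda').delete_function(FunctionName='ChurnPredictionAPI')

# Empty the bucket, then delete it (equivalent of `aws s3 rb --force`)
account = boto3.client('sts').get_caller_identity()['Account']
bucket = boto3.resource('s3').Bucket(f'churn-prediction-{account}')
bucket.objects.all().delete()
bucket.delete()
```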
## 💡 Key Lessons Learned
- SageMaker XGBoost is production-ready: we hit 84.2% AUC with minimal tuning.
- Bedrock adds real business value: turning raw predictions into actionable, plain-English insights makes the model useful to non-technical stakeholders.
- IAM permissions are tricky: if the CLI gives explicit-deny errors, create the roles via the Console.
- Cost awareness matters: always delete endpoints when not in use (~$0.05/hour adds up!)
Thanks for reading! If this helped you, follow me for more AWS + Data Engineering content.
Questions? Leave a comment below!