Diabetes Detection on AWS — Step-by-Step Complete Guide
A practical, step-by-step walkthrough for building a working Diabetes Prediction web app on AWS. Setup is done manually through the AWS Console, with code files provided for EC2 deployment.
TL;DR (One-line)
Train a scikit-learn model locally → upload to S3 via console → create SNS topic & DynamoDB table via console → host Flask API on EC2 that loads model from S3, predicts & stores results → expose via API Gateway → host frontend with AWS Amplify.
GitHub reference: https://github.com/naman-0804/Diabetes_Prediction_onAWS
Architecture Overview
- Local Training — Train model locally with scikit-learn
- S3 — Store model file (modelaws.joblib)
- EC2 + Flask — Backend API server
- DynamoDB — Store prediction results
- SNS — Email notifications to users
- API Gateway — Secure, managed public endpoint
- Amplify — Frontend hosting
Prerequisites
- AWS account with permissions
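- Python 3.9+ locally (required by the pinned pandas 2.1.1), with the same scikit-learn version as requirements.txt so the saved model loads cleanly on EC2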
- WinSCP for file transfer to EC2: https://winscp.net/eng/download.php
- PuTTY for SSH access
- Basic knowledge of Flask and AWS console
Manual Setup Steps
Step 1 — Train Model Locally
Create train.py:
# train.py - Train diabetes prediction model
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
import joblib
# Load dataset (download from Kaggle: Pima Indian Diabetes Dataset)
df = pd.read_csv('diabetes.csv')
X = df.drop('Outcome', axis=1)
y = df['Outcome']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
# Save model
joblib.dump(model, 'modelaws.joblib')
print("Model saved as modelaws.joblib")
# Test accuracy
accuracy = model.score(X_test, y_test)
print(f"Model accuracy: {accuracy:.2f}")
Run locally:
python train.py
This creates the modelaws.joblib file.
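train.py reports accuracy only. Because roughly a third of the rows in the Pima dataset are positive, accuracy can look acceptable while the model still misses many positive cases. The sketch below computes precision and recall next to accuracy using only the standard library; the function name and the sample labels are illustrative, not real results.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision and recall for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    y_true = [1, 0, 1, 1, 0, 0, 0, 1]
    y_pred = [1, 0, 0, 1, 0, 1, 0, 0]
    print(binary_metrics(y_true, y_pred))
```

Inside train.py this could be called as binary_metrics(y_test.tolist(), model.predict(X_test).tolist()).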
Step 2 — Manual AWS Console Setup
2.1 Create S3 Bucket
- Go to AWS Console → S3
- Click "Create bucket"
- Choose unique bucket name (e.g., your-diabetes-model-bucket)
- Select your region (e.g., us-east-1)
- Keep default settings, click "Create bucket"
- Upload modelaws.joblib to this bucket
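The console rejects bucket names that break S3's naming rules, so it can save a round trip to check a candidate name first. The sketch below covers the main rules only (3 to 63 characters; lowercase letters, digits, dots and hyphens; starts and ends with a letter or digit; no adjacent dots; not shaped like an IP address). It cannot tell whether the name is already taken, and the function name is an assumption for illustration.

```python
import re

# 3-63 chars, lowercase letters/digits/dots/hyphens, alphanumeric at both ends.
_BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")
# Names formatted like an IPv4 address are not allowed.
_IP_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def is_valid_bucket_name(name: str) -> bool:
    """Check a candidate name against the main S3 bucket naming rules."""
    if not _BUCKET_RE.match(name):
        return False
    if ".." in name:
        return False
    if _IP_RE.match(name):
        return False
    return True


if __name__ == "__main__":
    for candidate in ["your-diabetes-model-bucket", "My_Bucket", "ab"]:
        print(candidate, is_valid_bucket_name(candidate))
```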
2.2 Create SNS Topic
- Go to AWS Console → SNS
- Click "Create topic"
- Choose "Standard" type
- Name: diabetes-predictions
- Click "Create topic"
- Copy the Topic ARN (you'll need this)
- Click "Create subscription"
- Protocol: Email
- Endpoint: your email address
- Check your email and confirm subscription
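One thing to keep in mind: a message published to a topic is delivered to every confirmed subscriber. Since backend.py subscribes each user's email to this same topic, every user would receive every prediction result. One possible way around this is a subscription filter policy matched against a message attribute. The sketch below only builds the keyword arguments for boto3's sns.subscribe and sns.publish calls, so it runs without AWS access; the attribute name "recipient" and the helper names are assumptions, not part of the original project.

```python
import json


def build_subscribe_kwargs(topic_arn: str, email: str) -> dict:
    """Arguments for sns_client.subscribe(**kwargs) with a per-user filter."""
    return {
        "TopicArn": topic_arn,
        "Protocol": "email",
        "Endpoint": email,
        # Only messages whose "recipient" attribute equals this email are delivered.
        "Attributes": {"FilterPolicy": json.dumps({"recipient": [email]})},
    }


def build_publish_kwargs(topic_arn: str, email: str, message: str) -> dict:
    """Arguments for sns_client.publish(**kwargs) addressed to one subscriber."""
    return {
        "TopicArn": topic_arn,
        "Subject": "Diabetes Prediction Result",
        "Message": message,
        "MessageAttributes": {
            "recipient": {"DataType": "String", "StringValue": email}
        },
    }


if __name__ == "__main__":
    arn = "arn:aws:sns:us-east-1:YOUR-ACCOUNT-ID:diabetes-predictions"
    print(build_subscribe_kwargs(arn, "test@example.com"))
    print(build_publish_kwargs(arn, "test@example.com", "Result: 0"))
```

The subscription created manually in the console above has no filter policy, so it would still receive every message.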
2.3 Create DynamoDB Table
- Go to AWS Console → DynamoDB
- Click "Create table"
- Table name: DiabetesPredictions
- Partition key: email (String)
- Sort key: timestamp (String)
- Use default settings
- Click "Create table"
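With email as the partition key and timestamp as the sort key, each prediction becomes its own item as long as the timestamps differ. A detail that matters when writing items: boto3's DynamoDB resource API does not accept Python float values and expects Decimal instead. The sketch below builds an item in that shape; the helper name to_item is hypothetical.

```python
from datetime import datetime, timezone
from decimal import Decimal


def to_item(email, features, predicted_label):
    """Build a DynamoDB-ready item keyed by email (partition) and timestamp (sort)."""
    item = {
        "email": email,
        # ISO 8601 strings sort chronologically, which suits a String sort key.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "PredictedOutcome": int(predicted_label),
    }
    for name, value in features.items():
        # Convert floats via str() so 25.5 becomes Decimal("25.5") exactly.
        item[name] = Decimal(str(value)) if isinstance(value, float) else value
    return item


if __name__ == "__main__":
    print(to_item("test@example.com", {"BMI": 25.5, "Age": 33}, 0))
```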
2.4 Create IAM Role for EC2
- Go to AWS Console → IAM
- Click "Roles" → "Create role"
- Select "AWS service" → "EC2"
- Click "Next"
- Create custom policy with this JSON:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::your-diabetes-model-bucket/*"]
    },
    {
      "Effect": "Allow",
      "Action": ["sns:Publish", "sns:Subscribe", "sns:ListSubscriptionsByTopic"],
      "Resource": ["arn:aws:sns:us-east-1:YOUR-ACCOUNT-ID:diabetes-predictions"]
    },
    {
      "Effect": "Allow",
      "Action": ["dynamodb:PutItem", "dynamodb:UpdateItem", "dynamodb:GetItem"],
      "Resource": ["arn:aws:dynamodb:us-east-1:YOUR-ACCOUNT-ID:table/DiabetesPredictions"]
    }
  ]
}
- Name the role: diabetes-ec2-role
- Click "Create role"
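The policy JSON has three placeholders to replace by hand (bucket name, region, account ID), which is easy to get wrong. As a convenience, the sketch below generates the same policy from parameters so it can be pasted into the console. The function name is an assumption, and "123456789012" is a placeholder account ID.

```python
import json


def build_ec2_policy(bucket, region, account_id,
                     topic_name="diabetes-predictions",
                     table_name="DiabetesPredictions"):
    """Return the EC2 role policy as a dict, mirroring the JSON in this step."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
            {
                "Effect": "Allow",
                "Action": ["sns:Publish", "sns:Subscribe",
                           "sns:ListSubscriptionsByTopic"],
                "Resource": [f"arn:aws:sns:{region}:{account_id}:{topic_name}"],
            },
            {
                "Effect": "Allow",
                "Action": ["dynamodb:PutItem", "dynamodb:UpdateItem",
                           "dynamodb:GetItem"],
                "Resource": [
                    f"arn:aws:dynamodb:{region}:{account_id}:table/{table_name}"
                ],
            },
        ],
    }


if __name__ == "__main__":
    policy = build_ec2_policy("your-diabetes-model-bucket", "us-east-1",
                              "123456789012")
    print(json.dumps(policy, indent=2))
```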
Step 3 — Launch EC2 Instance
- Go to AWS Console → EC2
- Click "Launch instance"
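- Choose "Amazon Linux 2023 AMI" (Amazon Linux 2 ships Python 3.7, which is too old for the pandas and scikit-learn versions pinned in requirements.txt)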
- Instance type: t2.micro (free tier)
- Important: In "Advanced details" → "IAM instance profile" → select diabetes-ec2-role
- Security Group settings:
  - SSH (22) from your IP
  - HTTP (80) from anywhere
  - Custom TCP (5000) from anywhere (for testing)
- Create/use existing key pair
- Launch instance
Step 4 — Code Files to Upload
Create these files on your local machine:
backend.py
import boto3
import pandas as pd
import joblib
from flask import Flask, request, jsonify
from flask_cors import CORS
from datetime import datetime, timezone
from decimal import Decimal

app = Flask(__name__)
CORS(app)

# Configuration - UPDATE THESE VALUES
BUCKET_NAME = 'your-diabetes-model-bucket'  # Your S3 bucket name
MODEL_KEY = 'modelaws.joblib'  # Your model file name in S3
SNS_TOPIC_ARN = 'arn:aws:sns:us-east-1:YOUR-ACCOUNT-ID:diabetes-predictions'  # Your SNS topic ARN
DYNAMODB_TABLE_NAME = 'DiabetesPredictions'  # Your DynamoDB table name
AWS_REGION = 'us-east-1'  # Your AWS region

# Initialize AWS clients (credentials come from the EC2 instance role)
s3 = boto3.client('s3', region_name=AWS_REGION)
sns_client = boto3.client('sns', region_name=AWS_REGION)
dynamodb = boto3.resource('dynamodb', region_name=AWS_REGION)
table = dynamodb.Table(DYNAMODB_TABLE_NAME)

# Download and load the model when the server starts
model = None
download_path = '/tmp/modelaws.joblib'
try:
    s3.download_file(BUCKET_NAME, MODEL_KEY, download_path)
    model = joblib.load(download_path)
    print("Model loaded successfully from S3")
except Exception as e:
    print(f"Error loading model: {str(e)}")


@app.route('/', methods=['GET'])
def health_check():
    return jsonify({'status': 'API is live'})


@app.route('/predict', methods=['POST'])
def predict():
    # Compare against None explicitly; truthiness of an estimator is not a load check
    if model is None:
        return jsonify({'error': 'Model could not be loaded'}), 500
    try:
        # Get JSON data from the request
        data = request.get_json()
        if not data:
            return jsonify({'error': 'Invalid or missing JSON data'}), 400
        email = data.get('email')
        if not email:
            return jsonify({'error': 'Email is required'}), 400

        # Extract input data with defaults (missing fields silently become 0)
        pregnancies = int(data.get('Pregnancies', 0))
        glucose = int(data.get('Glucose', 0))
        blood_pressure = int(data.get('BloodPressure', 0))
        skin_thickness = int(data.get('SkinThickness', 0))
        insulin = int(data.get('Insulin', 0))
        bmi = float(data.get('BMI', 0.0))
        pedigree = float(data.get('DiabetesPedigreeFunction', 0.0))
        age = int(data.get('Age', 0))

        # Check if email is already subscribed to SNS
        try:
            response = sns_client.list_subscriptions_by_topic(TopicArn=SNS_TOPIC_ARN)
            subscriptions = response.get('Subscriptions', [])
            if not any(sub['Endpoint'] == email for sub in subscriptions):
                sns_client.subscribe(
                    TopicArn=SNS_TOPIC_ARN,
                    Protocol='email',
                    Endpoint=email
                )
        except Exception as e:
            print(f"Error subscribing email: {e}")

        # Prepare input data for prediction
        input_data = [[pregnancies, glucose, blood_pressure, skin_thickness,
                       insulin, bmi, pedigree, age]]
        input_df = pd.DataFrame(input_data, columns=[
            'Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness',
            'Insulin', 'BMI', 'DiabetesPedigreeFunction', 'Age'
        ])

        # Make prediction
        prediction = model.predict(input_df)
        predicted_label = int(prediction[0])

        # Send SNS notification (goes to every confirmed subscriber of the topic)
        message = f"Diabetes Prediction Result: {predicted_label}"
        try:
            sns_client.publish(
                TopicArn=SNS_TOPIC_ARN,
                Message=message,
                Subject='Diabetes Prediction Result'
            )
        except Exception as e:
            print(f"Error sending SNS: {e}")

        # Store result in DynamoDB
        # boto3's resource API rejects Python floats, so BMI and pedigree are stored as Decimal
        result_data = {
            'email': email,
            'timestamp': datetime.now(timezone.utc).isoformat(),
            'Pregnancies': pregnancies,
            'Glucose': glucose,
            'BloodPressure': blood_pressure,
            'SkinThickness': skin_thickness,
            'Insulin': insulin,
            'BMI': Decimal(str(bmi)),
            'DiabetesPedigreeFunction': Decimal(str(pedigree)),
            'Age': age,
            'PredictedOutcome': predicted_label
        }
        try:
            table.put_item(Item=result_data)
        except Exception as e:
            print(f"Error storing in DynamoDB: {e}")

        return jsonify({'Predicted Label': predicted_label})
    except Exception as e:
        print(f"Error in prediction: {str(e)}")
        return jsonify({'error': str(e)}), 400


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=False)
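backend.py fills any missing field with 0, so an incomplete request still produces a prediction. A stricter alternative is to validate the payload before predicting. The sketch below is one way to do that; the ranges for DiabetesPedigreeFunction and Age are taken from the limits in the index.html form, while the helper name and the error wording are assumptions.

```python
import math

# field name -> (type, minimum, maximum or None for no upper bound)
FIELD_RULES = {
    "Pregnancies": (int, 0, None),
    "Glucose": (int, 0, None),
    "BloodPressure": (int, 0, None),
    "SkinThickness": (int, 0, None),
    "Insulin": (int, 0, None),
    "BMI": (float, 0, None),
    "DiabetesPedigreeFunction": (float, 0, 2.5),
    "Age": (int, 1, 120),
}


def validate_payload(data):
    """Return (cleaned_values, errors); errors is empty when the payload is valid."""
    cleaned, errors = {}, []
    for name, (cast, lo, hi) in FIELD_RULES.items():
        if name not in data:
            errors.append(f"{name} is required")
            continue
        try:
            value = cast(data[name])
        except (TypeError, ValueError):
            errors.append(f"{name} must be a number")
            continue
        if isinstance(value, float) and not math.isfinite(value):
            errors.append(f"{name} must be a finite number")
            continue
        if value < lo or (hi is not None and value > hi):
            errors.append(f"{name} is out of range")
            continue
        cleaned[name] = value
    return cleaned, errors


if __name__ == "__main__":
    print(validate_payload({"Pregnancies": 2, "Age": 0}))
```

In predict(), the result could be used as cleaned, errors = validate_payload(data), returning a 400 response with the error list when it is not empty.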
requirements.txt
flask==2.3.3
flask-cors==4.0.0
boto3==1.28.57
pandas==2.1.1
scikit-learn==1.3.0
joblib==1.3.2
gunicorn==21.2.0
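A model saved with one scikit-learn version may warn or fail when loaded with another, so it helps to confirm that the local training environment matches these pins. The sketch below parses the pinned versions and compares them with what is installed, using only the standard library; both function names are assumptions.

```python
from importlib import metadata


def parse_pins(text):
    """Parse 'name==version' lines from requirements.txt content into a dict."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {package: (pinned, installed_or_None)} for packages that differ."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    sample = "flask==2.3.3\nscikit-learn==1.3.0\njoblib==1.3.2\n"
    print(find_mismatches(parse_pins(sample)))
```

To check the real file, pass the contents of requirements.txt to parse_pins instead of the sample string.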
index.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Diabetes Prediction</title>
<style>
* {
margin: 0;
padding: 0;
box-sizing: border-box;
}
body {
font-family: 'Arial', sans-serif;
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
min-height: 100vh;
display: flex;
justify-content: center;
align-items: center;
}
.container {
background: white;
padding: 2rem;
border-radius: 15px;
box-shadow: 0 20px 40px rgba(0,0,0,0.1);
width: 100%;
max-width: 500px;
}
h2 {
text-align: center;
color: #333;
margin-bottom: 2rem;
font-size: 2rem;
}
.input-container {
margin-bottom: 1rem;
}
label {
display: block;
margin-bottom: 0.5rem;
color: #555;
font-weight: bold;
}
input {
width: 100%;
padding: 12px;
border: 2px solid #ddd;
border-radius: 8px;
font-size: 16px;
transition: border-color 0.3s;
}
input:focus {
outline: none;
border-color: #667eea;
}
button {
width: 100%;
padding: 15px;
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
border: none;
border-radius: 8px;
font-size: 18px;
font-weight: bold;
cursor: pointer;
transition: transform 0.2s;
margin-top: 1rem;
}
button:hover {
transform: translateY(-2px);
}
#result {
margin-top: 2rem;
padding: 1rem;
border-radius: 8px;
text-align: center;
font-weight: bold;
min-height: 50px;
display: flex;
align-items: center;
justify-content: center;
}
.success {
background-color: #d4edda;
color: #155724;
border: 1px solid #c3e6cb;
}
.error {
background-color: #f8d7da;
color: #721c24;
border: 1px solid #f5c6cb;
}
</style>
</head>
<body>
<div class="container">
<h2>🩺 Diabetes Prediction</h2>
<form id="predictionForm">
<div class="input-container">
<label for="email">📧 Email:</label>
<input type="email" id="email" placeholder="Enter your email" required>
</div>
<div class="input-container">
<label for="Pregnancies">🤱 Pregnancies:</label>
<input type="number" id="Pregnancies" placeholder="Enter number of pregnancies" required min="0">
</div>
<div class="input-container">
<label for="Glucose">🍬 Glucose Level:</label>
<input type="number" id="Glucose" placeholder="Enter glucose level (mg/dL)" required min="0">
</div>
<div class="input-container">
<label for="BloodPressure">💓 Blood Pressure:</label>
<input type="number" id="BloodPressure" placeholder="Enter blood pressure (mmHg)" required min="0">
</div>
<div class="input-container">
<label for="SkinThickness">📏 Skin Thickness:</label>
<input type="number" id="SkinThickness" placeholder="Enter skin thickness (mm)" required min="0">
</div>
<div class="input-container">
<label for="Insulin">💉 Insulin Level:</label>
<input type="number" id="Insulin" placeholder="Enter insulin level (μU/mL)" required min="0">
</div>
<div class="input-container">
<label for="BMI">⚖️ BMI:</label>
<input type="number" step="0.1" id="BMI" placeholder="Enter BMI (kg/m²)" required min="0">
</div>
<div class="input-container">
<label for="DiabetesPedigreeFunction">🧬 Diabetes Pedigree Function:</label>
<input type="number" step="0.01" id="DiabetesPedigreeFunction" placeholder="Enter DPF (0.0 - 2.5)" required min="0" max="2.5">
</div>
<div class="input-container">
<label for="Age">🎂 Age:</label>
<input type="number" id="Age" placeholder="Enter age in years" required min="1" max="120">
</div>
<button type="button" onclick="submitData()">🔮 Get Prediction</button>
</form>
<div id="result"></div>
</div>
<script>
function submitData() {
const resultElement = document.getElementById("result");
resultElement.innerText = "🔄 Processing your prediction...";
resultElement.className = "";
const data = {
email: document.getElementById("email").value,
Pregnancies: Number(document.getElementById("Pregnancies").value),
Glucose: Number(document.getElementById("Glucose").value),
BloodPressure: Number(document.getElementById("BloodPressure").value),
SkinThickness: Number(document.getElementById("SkinThickness").value),
Insulin: Number(document.getElementById("Insulin").value),
BMI: parseFloat(document.getElementById("BMI").value),
DiabetesPedigreeFunction: parseFloat(document.getElementById("DiabetesPedigreeFunction").value),
Age: Number(document.getElementById("Age").value),
};
// Update this URL to your API Gateway endpoint or EC2 public IP
const API_URL = 'http://YOUR-EC2-PUBLIC-IP:5000/predict';
fetch(API_URL, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(data)
})
.then(response => {
if (!response.ok) {
throw new Error('Network response was not ok: ' + response.statusText);
}
return response.json();
})
.then(result => {
const prediction = result['Predicted Label'];
const message = prediction === 1
? '⚠️ Higher risk detected. Please consult a healthcare professional.'
: '✅ Lower risk detected. Continue maintaining a healthy lifestyle.';
resultElement.innerHTML = `
<strong>Prediction Result: ${prediction}</strong><br>
${message}<br>
📧 A detailed report has been sent to your email.
`;
resultElement.classList.add(prediction === 1 ? "error" : "success");
})
.catch(error => {
console.error('Error:', error);
resultElement.innerText = '❌ Error: ' + error.message;
resultElement.classList.add("error");
});
}
</script>
</body>
</html>
Step 5 — Deploy to EC2
5.1 Connect to EC2
- Use PuTTY to connect to your EC2 instance
- Use the .ppk key file and EC2 public IP
5.2 Prepare EC2 Environment
# Update system
sudo yum update -y
# Install Python 3 and pip
sudo yum install -y python3 python3-pip
# Create app directory
mkdir ~/app
cd ~/app
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Update pip
pip install --upgrade pip
5.3 Upload Files using WinSCP
- Open WinSCP
- Connect to your EC2 instance using the .ppk key
- Upload these files to /home/ec2-user/app/:
  - backend.py
  - requirements.txt
  - index.html
5.4 Install Dependencies and Run
# Navigate to app directory
cd ~/app
source venv/bin/activate
# Install requirements
pip install -r requirements.txt
# Update backend.py with your actual values:
# - BUCKET_NAME
# - SNS_TOPIC_ARN (from Step 2.2)
# - AWS region and account ID
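# Run the application (Flask's built-in server, suitable for testing)
python backend.py
# For a longer-running deployment, gunicorn is already listed in requirements.txt:
# gunicorn --bind 0.0.0.0:5000 backend:app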
Step 6 — Create API Gateway (Optional)
- Go to AWS Console → API Gateway
- Create REST API
- Create resource /predict
- Create POST method
- Integration type: HTTP
- Endpoint URL: http://your-ec2-public-ip:5000/predict
- Enable CORS
- Deploy API
- Update index.html with the API Gateway URL
Step 7 — Test Your Application
Test Backend Directly:
curl -X POST http://your-ec2-public-ip:5000/predict \
-H "Content-Type: application/json" \
-d '{
"email": "test@example.com",
"Pregnancies": 2,
"Glucose": 120,
"BloodPressure": 70,
"SkinThickness": 20,
"Insulin": 79,
"BMI": 25.5,
"DiabetesPedigreeFunction": 0.5,
"Age": 33
}'
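On Windows, where the curl quoting above can be awkward, the same request can be sent from Python with the standard library. This is a sketch of an equivalent call: the helper names are assumptions, and the host below is a placeholder to replace with your EC2 public IP or API Gateway URL.

```python
import json
import urllib.request


def build_predict_request(base_url, payload):
    """Build a POST request for the /predict endpoint without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    payload = {
        "email": "test@example.com", "Pregnancies": 2, "Glucose": 120,
        "BloodPressure": 70, "SkinThickness": 20, "Insulin": 79,
        "BMI": 25.5, "DiabetesPedigreeFunction": 0.5, "Age": 33,
    }
    req = build_predict_request("http://your-ec2-public-ip:5000", payload)
    print(req.get_method(), req.full_url)
    # Call send(req) once the placeholder host has been replaced.
```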
Test Frontend:
- Open index.html in browser
- Fill out the form
- Click "Get Prediction"
- Check email for SNS notification
- Check DynamoDB table for stored result
Step 8 — Deploy Frontend with Amplify
- Create GitHub repository
- Upload index.html to repository
- Go to AWS Console → Amplify
- Connect to GitHub
- Deploy your repository
Troubleshooting
Common Issues:
- Model not loading from S3
  - Check bucket name and file name in backend.py
  - Verify IAM role has S3 read permissions
- SNS emails not received
  - Check spam folder
  - Confirm email subscription in SNS console
- DynamoDB permission errors
  - Verify IAM role has DynamoDB write permissions
  - Check table name in backend.py
- CORS errors
  - Enable CORS in API Gateway
  - Check flask-cors is installed
- Connection refused errors
  - Check EC2 security group allows port 5000
  - Verify application is running on correct port
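For connection problems, a quick way to separate "security group blocks the port" from "the app is not running" is to test whether the port accepts TCP connections at all, first from your own machine and then from the EC2 instance itself against 127.0.0.1. A minimal sketch, with a hypothetical helper name:

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False
```

If port_is_open("127.0.0.1", 5000) is True on the instance but the same check against the public IP fails from your machine, the security group or network path is the likely cause rather than the Flask app.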
Files Summary
You need these 3 main files:
- backend.py - Flask API server
- requirements.txt - Python dependencies
- index.html - Frontend form
Transfer to EC2 using WinSCP, install dependencies, update configuration values, and run!
Final Notes
- Always use IAM roles instead of hardcoded AWS keys
- Consider using Elastic IP for stable EC2 endpoint
- Enable CloudWatch logging for monitoring
- Use HTTPS in production with SSL certificate
- Implement input validation and rate limiting for production use
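As an illustration of the rate limiting note, the sketch below is a small in-memory sliding-window limiter keyed by client (for example by IP address or email). It is an assumption-laden starting point, not a production design: counters live in one process only, so several gunicorn workers would each keep their own, and keys are never evicted.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per key within the last window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)

    def allow(self, key):
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60)
    print([limiter.allow("203.0.113.7") for _ in range(3)])
```

In a Flask handler, limiter.allow(request.remote_addr) could gate the /predict route and return HTTP 429 when it is False.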
This guide covers a complete manual setup through the AWS Console with minimal CLI usage. As written, the deployment is suited to testing and demos; work through the notes above before treating it as production-ready.