Mate Technologies

Posted on Jan 29

Build a Titanic Survival Predictor GUI with Python 🚢🧠

#tutorial #python #scikitlearn #ttkbootstrap

Learn how to build a machine learning GUI that predicts Titanic passenger survival from CSV files. This guide walks you through the project step by step.

GitHub Repo: Titanic Survival Predictor GUI

1️⃣ Project Overview

Our app does the following:

Load CSV files with Titanic passenger data.

Auto-detect column names for flexible datasets.

Train a RandomForestClassifier from train.csv, or from the first uploaded CSV if no training file exists.

Predict survival for each passenger.

Export results to CSV.

Provide a user-friendly Tkinter GUI with optional drag & drop support.

2️⃣ Setup Environment

We’ll need these Python packages:

pip install pandas numpy scikit-learn ttkbootstrap

Optional for drag & drop:

pip install tkinterdnd2

3️⃣ Import Required Modules

Start by importing the necessary libraries:

import os, sys, threading
import pandas as pd
import numpy as np
import tkinter as tk
from tkinter import filedialog, messagebox, ttk
import ttkbootstrap as tb
from ttkbootstrap.constants import *

from sklearn.ensemble import RandomForestClassifier

💡 Explanation:

pandas & numpy → for data handling

tkinter → for GUI elements

ttkbootstrap → modern themed GUI

RandomForestClassifier → our ML model

4️⃣ Handle Optional Drag & Drop

try:
    from tkinterdnd2 import TkinterDnD, DND_FILES
    DND_ENABLED = True
except ImportError:
    DND_ENABLED = False
    print("Drag & Drop requires tkinterdnd2: pip install tkinterdnd2")

💡 Explanation:

If tkinterdnd2 is installed, drag & drop for CSV files is enabled.

Otherwise, users can still browse files manually.
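Once drag & drop is enabled, the drop event delivers the file paths as a single string. Here is a minimal sketch of how that string could be split into CSV paths. It assumes the payload is space-separated, with paths that contain spaces wrapped in curly braces. The helper name parse_dropped_paths is hypothetical and not taken from the repo.

```python
import re

def parse_dropped_paths(data):
    """Split a drop payload into paths, keeping only .csv files.

    Assumes a space-separated list in which paths containing spaces
    are wrapped in curly braces, e.g. '{C:/my data/a.csv} C:/b.csv'.
    """
    # Each match is either a {braced path} or a run of non-space characters.
    matches = re.findall(r'\{([^}]*)\}|(\S+)', data)
    paths = [braced or bare for braced, bare in matches]
    return [p for p in paths if p.lower().endswith('.csv')]

dropped = parse_dropped_paths('{C:/my data/train.csv} C:/test.csv C:/notes.txt')
print(dropped)  # ['C:/my data/train.csv', 'C:/test.csv']
```

Inside a running Tk app, root.tk.splitlist(event.data) is another way to split the same string. The pure-Python version is easier to test without opening a window.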

5️⃣ Preprocessing CSV Data

We need a function to normalize columns and fill missing values:

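def preprocess(df):
    """Return a copy of df with standard column names and no missing feature values."""
    df = df.copy()
    # Accepted spellings for each standard column name.
    column_map = {
        'PassengerId': ['PassengerId','passengerid','Passenger ID','pid'],
        'Name': ['Name','FullName','full_name'],
        'Pclass': ['Pclass','Class'],
        'Sex': ['Sex','Gender'],
        'Age': ['Age','age'],
        'SibSp': ['SibSp','Siblings/Spouses'],
        'Parch': ['Parch','Parents/Children'],
        'Fare': ['Fare','fare']
    }
    numeric_cols = ['Pclass','Age','SibSp','Parch','Fare']

    for key, options in column_map.items():
        # Copy the first matching alias into the standard column.
        for opt in options:
            if opt in df.columns:
                df[key] = df[opt]
                break
        # Fall back to a neutral default when no alias is present.
        if key not in df.columns:
            df[key] = 0 if key in numeric_cols else ""

    # Encode Sex as 0 (male) / 1 (female), ignoring case; unknown values become 0.
    sex = df['Sex'].astype(str).str.strip().str.lower()
    df['Sex'] = sex.map({'male':0,'female':1}).fillna(0).astype(int)

    # Coerce features to numbers so stray text becomes NaN, then fill the gaps.
    for col in numeric_cols:
        df[col] = pd.to_numeric(df[col], errors='coerce')
    for col in ['Age','Fare']:
        median = df[col].median()
        df[col] = df[col].fillna(0 if pd.isna(median) else median)
    for col in ['SibSp','Parch','Pclass']:
        df[col] = df[col].fillna(0)

    return df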

💡 Explanation:

Maps different column names to a standard format.

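Converts Sex to numeric (male → 0, female → 1; missing or unrecognized values become 0).

Fills missing Age and Fare with the column median, and missing SibSp, Parch, and Pclass with 0.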
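The alias lists in preprocess match exact spellings, so a header like GENDER or " age " would be treated as missing. Here is a small sketch of one way to make the lookup case-insensitive and tolerant of stray spaces. The helper resolve_columns is hypothetical and works on a plain list of column names, so it can be tried without pandas.

```python
def resolve_columns(columns, column_map):
    """Map each standard name to the matching column actually present, or None."""
    # Index the real columns by a normalized form: lowercase, no surrounding spaces.
    normalized = {str(c).strip().lower(): c for c in columns}
    resolved = {}
    for standard, aliases in column_map.items():
        resolved[standard] = next(
            (normalized[a.lower()] for a in aliases if a.lower() in normalized),
            None,
        )
    return resolved

column_map = {
    'Sex': ['Sex', 'Gender'],
    'Age': ['Age'],
    'Fare': ['Fare'],
}
print(resolve_columns(['GENDER', ' age ', 'Ticket'], column_map))
# {'Sex': 'GENDER', 'Age': ' age ', 'Fare': None}
```

With a real dataframe you could pass df.columns as the first argument and copy each resolved column into its standard name.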

6️⃣ Load or Train the Model

def load_or_train_model():
    """Return a RandomForestClassifier, fitted only if train.csv provides labels."""
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    train_file = "train.csv"
    if not os.path.exists(train_file):
        print("No train.csv found. First uploaded CSV will be used for training.")
        return model  # unfitted: must be trained before predicting

    df = pd.read_csv(train_file)
    if df.empty or 'Survived' not in df.columns:
        # Without labels there is nothing to learn from, so skip training.
        print("train.csv has no usable 'Survived' labels. Model left untrained.")
        return model
    print("Loaded train.csv for model training.")

    df = preprocess(df)
    X = df[['Pclass','Sex','Age','SibSp','Parch','Fare']]
    model.fit(X, df['Survived'])
    return model

💡 Explanation:

Loads train.csv if available.

Otherwise returns an untrained model, and the first uploaded CSV is used for training.

Uses RandomForestClassifier for predictions.
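The "train on the first upload" step is not shown in the snippet above, so here is one possible sketch of it. The wrapper TrainOnFirstUse and the stand-in estimator MajorityClass are hypothetical names used only for illustration. In the app, the wrapped estimator would be the RandomForestClassifier, and y would be the Survived column of the uploaded file.

```python
class TrainOnFirstUse:
    """Wrap an estimator with fit/predict and train it on the first labelled data."""

    def __init__(self, estimator):
        self.estimator = estimator
        self.is_fitted = False

    def predict(self, X, y=None):
        # Train once, using the labels from the first file that has them.
        if not self.is_fitted:
            if y is None:
                raise ValueError("Model is untrained and this file has no labels.")
            self.estimator.fit(X, y)
            self.is_fitted = True
        return self.estimator.predict(X)


class MajorityClass:
    """Stand-in estimator for the demo: always predicts the most common label."""

    def fit(self, X, y):
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


model = TrainOnFirstUse(MajorityClass())
first = model.predict([[3, 0], [1, 1], [2, 0]], y=[0, 1, 0])  # trains, then predicts
later = model.predict([[1, 1]])                                # reuses the trained model
print(first, later)  # [0, 0, 0] [0]
```

Keep in mind that predictions made on the same rows used for training will look better than they really are, so the first file's results are best treated as a sanity check.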

7️⃣ Build the GUI

root = tb.Window(themename="darkly")
root.title("Titanic Survival Predictor v3.1")
root.minsize(1000, 600)

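Add File Selection Widgets

def browse_files():
    # Let the user pick one or more CSV files and show them in the entry.
    paths = filedialog.askopenfilenames(filetypes=[("CSV files", "*.csv")])
    if paths:
        path_input.delete(0, tk.END)
        path_input.insert(0, "; ".join(paths))

path_input = tb.Entry(root, width=80)
path_input.pack(side=tk.LEFT, fill=tk.X, expand=True)
path_input.insert(0, "Drag & drop CSV files here…")

browse_btn = tb.Button(root, text="📂 Browse", bootstyle="info", command=browse_files)
browse_btn.pack(side=tk.LEFT)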

💡 Explanation:

Entry → shows selected files

Button → lets users browse files manually

8️⃣ Predict and Display Results

def predict(df, model):
    df_pre = preprocess(df)
    X = df_pre[['Pclass','Sex','Age','SibSp','Parch','Fare']]
    preds = model.predict(X)
    df_pre['Survived'] = preds
    return df_pre

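This function predicts survival and stores the result in a Survived column on a preprocessed copy of the dataframe (in the standard Titanic data, 0 = did not survive and 1 = survived). The model must already be trained, and any existing Survived column is overwritten.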

9️⃣ Export Results

def export_results(df):
    path = filedialog.asksaveasfilename(defaultextension=".csv")
    if path:
        df.to_csv(path, index=False)
        messagebox.showinfo("Export", f"Saved {len(df)} rows to {path}")

💡 Explanation:

Exports predictions to CSV

Users can choose the save location

1️⃣0️⃣ Full Integration

Finally, integrate all components into a TitanicApp class, managing:

File upload & drag & drop

Model training

Predictions

Export & progress bars
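Predictions on large files can freeze the window if they run on the main thread. That is likely why threading is imported at the top. Below is a rough sketch of one common pattern: a worker thread does the work and posts progress messages to a queue. The function run_in_background and the message format are assumptions for illustration, not the repo's actual code.

```python
import queue
import threading

def run_in_background(items, work, updates):
    """Process items on a worker thread, posting progress messages to a queue."""
    def worker():
        total = len(items)
        for done, item in enumerate(items, start=1):
            try:
                updates.put(("result", work(item)))
            except Exception as exc:  # report the failure instead of losing it
                updates.put(("error", f"{item}: {exc}"))
            updates.put(("progress", 100 * done // total))
        updates.put(("finished", None))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread

# Demo with a stand-in task; the app would pass CSV paths and a predict function.
updates = queue.Queue()
thread = run_in_background(["a.csv", "b.csv"], lambda path: path.upper(), updates)
thread.join()

messages = []
while not updates.empty():
    messages.append(updates.get())
print(messages)
```

In the GUI, you would not call thread.join(). Instead, the main thread could check the queue every so often with root.after(...) and update the progress bar there, since Tkinter widgets should only be touched from the main thread.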

Check the full code and clone the project here:
https://github.com/rogers-cyber/python-tiny-tools/tree/main/Titanic-survival-prediction-GUI

This tutorial breaks the project down for beginners, explaining each piece step by step.