Fraud detection is one of those problems that looks simple on the surface: classify transactions as "fraud" or "not fraud". But once you look at real data, it becomes a completely different challenge.
In this project, I built FraudShield, an end-to-end machine learning system to detect fraudulent credit card transactions using both supervised and unsupervised approaches, along with a live dashboard.
## The Problem
The dataset I used contains over 284,000 transactions, but only:

**0.17% are fraud**

This creates a highly imbalanced dataset, where a model can achieve over 99% accuracy just by predicting everything as "not fraud".
So the real question becomes:
How do we detect fraud when it's so rare?
## Dataset Overview
The dataset contains real-world credit card transactions made by European cardholders, anonymised using PCA transformation to protect sensitive information. It includes 284,807 transactions, of which only 492 are fraudulent (~0.17%), making it a highly imbalanced classification problem.
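As a quick sanity check, the fraud rate follows directly from those counts:

```python
# Class counts from the dataset description
total_transactions = 284_807
fraud_transactions = 492

fraud_rate = fraud_transactions / total_transactions * 100
print(f"Fraud rate: {fraud_rate:.2f}%")  # prints "Fraud rate: 0.17%"
```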
### What are V1–V28?
These are PCA-transformed features.
In simple terms:
- The original features are hidden
- Data is transformed into mathematical components
- We can't interpret them directly

This makes the problem harder: models must learn patterns without human-readable features.
## Exploratory Data Analysis (EDA)
Some key observations:
- The dataset is extremely imbalanced
- Most transactions are low value
- Fraud doesn't follow obvious patterns
- Features are weakly correlated due to PCA transformation
One important realization early on:
Accuracy is NOT a useful metric here
## Why Accuracy is Misleading
If a model predicts:

```text
All transactions = Normal
```

It gets:

**99.8% accuracy**

...but detects zero fraud.
So instead, I focused on:
- Precision
- Recall
- F1 Score
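To make the accuracy trap concrete, here is a small sketch using scikit-learn's metric functions (toy data mirroring the imbalance, not the project's actual evaluation code):

```python
from sklearn.metrics import accuracy_score, f1_score, recall_score

# Toy labels: 2 fraud cases out of 1,000 transactions
y_true = [0] * 998 + [1] * 2
y_pred = [0] * 1000  # a "model" that labels everything as normal

print(accuracy_score(y_true, y_pred))             # 0.998 -- looks great
print(recall_score(y_true, y_pred))               # 0.0   -- catches no fraud
print(f1_score(y_true, y_pred, zero_division=0))  # 0.0
```

Recall and F1 immediately expose what accuracy hides: the model never catches a single fraud case.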
## Model 1: XGBoost (Supervised Learning)
I trained an XGBoost classifier, which is well-suited for tabular data and imbalanced problems.
Key setup:
- scale_pos_weight to handle imbalance
- Stratified train/test split
- Feature scaling
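A minimal sketch of that setup on synthetic stand-in data (the shapes and variable names here are illustrative, not the project's actual code); the computed weight is what would be passed to `XGBClassifier(scale_pos_weight=...)`:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-in: 10,000 transactions, 30 features, ~0.2% fraud
X = rng.normal(size=(10_000, 30))
y = (rng.random(10_000) < 0.002).astype(int)

# scale_pos_weight: ratio of negative to positive examples, used by
# XGBoost to upweight the rare fraud class during training
scale_pos_weight = (y == 0).sum() / max((y == 1).sum(), 1)

# Stratified split keeps the tiny fraud fraction the same in train and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Feature scaling: fit on the training set only to avoid leakage
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print(f"scale_pos_weight = {scale_pos_weight:.0f}")
```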
Results:
- Precision: 0.71
- Recall: 0.87
- F1 Score: 0.78
Insight:
The model successfully detects 87% of fraud cases, which is critical in real-world systems.
## Model 2: Isolation Forest (Unsupervised)
To compare approaches, I also used Isolation Forest, an anomaly detection model.
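A minimal sketch of how Isolation Forest flags anomalies, on synthetic data rather than the real transactions (the shapes and `contamination` value are illustrative):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Mostly "normal" points, plus 5 clearly shifted outliers at the end
X_normal = rng.normal(0, 1, size=(1_000, 5))
X_outliers = rng.normal(6, 1, size=(5, 5))
X = np.vstack([X_normal, X_outliers])

# contamination = the fraction of points the model should flag as
# anomalous; it plays a role similar to an expected fraud rate
iso = IsolationForest(contamination=0.01, random_state=0).fit(X)
pred = iso.predict(X)  # +1 = normal, -1 = anomaly

print((pred == -1).sum(), "points flagged as anomalies")
```

The model never sees labels; it only isolates points that look unusual, which is why subtle fraud that resembles normal behaviour slips through.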
Results:
- Precision: 0.29
- Recall: 0.30
- F1 Score: 0.30
Insight:
Unsupervised models struggle to detect subtle fraud patterns without labelled data.
## Model Comparison
| Model | Precision | Recall | F1 |
|---|---|---|---|
| XGBoost | 0.71 | 0.87 | 0.78 |
| Isolation Forest | 0.29 | 0.30 | 0.30 |
Key takeaway:
Supervised learning significantly outperforms unsupervised anomaly detection when labelled data is available.
## Explainability with SHAP
To understand how the model makes decisions, I used SHAP (SHapley Additive exPlanations).
This helps answer:
- Which features influence predictions?
- Why was a transaction classified as fraud?
This adds transparency and trust to the system.
## Deployment: Streamlit Dashboard
To make the system usable, I built a Streamlit dashboard.
Features:
- Input transaction data
- Predict fraud probability
- Display risk level
- Show model metrics
## Live Demo & Code
- GitHub: https://github.com/mahira-code/fraudshield-ml
- Live Demo: https://fraudshield-ml-mahira.streamlit.app/
## What I Learned
This project taught me a lot about real-world machine learning:
- Handling imbalanced datasets
- Choosing the right evaluation metrics
- Comparing supervised vs unsupervised models
- Using SHAP for explainability
- Building and deploying end-to-end ML systems
## What's Next
- Hyperparameter tuning
- Model monitoring (drift detection)
- API deployment (FastAPI)
- MLOps integration
## About Me
I'm Mahira Banu, a Data Scientist and AI Engineer focused on building practical, real-world AI systems.
- Portfolio: https://mahirabanu.website
- GitHub: https://github.com/mahira-code
- LinkedIn: https://www.linkedin.com/in/mahira-banu
## Final Thoughts
Fraud detection isn't just about building a model; it's about understanding the data, handling imbalance, and making reliable decisions in high-risk scenarios.
If you're working on similar problems, I'd love to hear your thoughts.