description: "Learn how TrustShield AI combines machine learning, URL intelligence, and real-time threat monitoring to detect sophisticated phishing attacks with 95-98% accuracy."
published: true
cover_image: https://github.com/karthikeya1498/PFSD-BLOG/blob/main/assets/hero-shield.jpg?raw=true
tags: ['python', 'machinelearning', 'cybersecurity', 'flask', 'mongodb', 'phishing']
canonical_url: https://pfsd-blog.vercel.app/
TrustShield AI: A Multi-Layer Phishing Detection Framework Using Machine Learning
TrustShield AI is a multi-layered, AI-driven phishing detection framework designed to identify and mitigate sophisticated email-based attacks in real time. Built on a three-tier architecture comprising a frontend dashboard, a Flask-based asynchronous backend, and a MongoDB persistence layer, the system fuses six independent intelligence signals to achieve detection accuracy of approximately 95-98%.
Key Features
| Feature | Specification |
|---|---|
| Detection latency | < 200 ms |
| Detection accuracy | ≈ 95-98% |
| Real-time processing | Asynchronous Flask backend |
| Living retraining | Continuous model adaptation |
| Chrome Extension | Manifest V3 integration |
| SOC Dashboard | Real-time monitoring interface |
System Architecture
TrustShield AI is structured into three logical tiers. This separation allows each tier to scale, fail, and be replaced independently of the others.
Three-Tier Design
- Frontend Dashboard - Web-based SOC interface for security analysts
- Backend - Flask (Python) with asyncio for asynchronous processing
- Database - MongoDB for persistence and real-time analytics
Technology Stack
graph TB
A[Frontend Dashboard] --> B[Flask Backend]
B --> C[MongoDB Database]
D[Chrome Extension] --> B
E[ML Models] --> B
F[URL Intelligence] --> B
G[Rule Engine] --> B
H[LLM Analysis] --> B
| Layer | Technology | Purpose |
|---|---|---|
| Frontend | HTML5, CSS3, JavaScript (Vanilla) | Dashboard UI, real-time updates |
| Backend | Flask (Python) · asyncio | API server, async processing |
| Database | MongoDB (PyMongo) | Data persistence, analytics |
| ML Library | scikit-learn · pandas · numpy | Model training and inference |
| Models | LogReg · RF · GBM · Linear SVM | Classification algorithms |
| LLM Assist | Ollama (phi model, local) | Semantic analysis |
| Extension | Chrome MV3 | Browser integration |
Detection Engine
The detection engine is the analytic core of TrustShield AI. Each incoming email is normalized, vectorized and dispatched to a non-blocking executor that runs ML inference alongside five rule-driven intelligence modules.
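The executor hand-off described above can be sketched as follows. This is a minimal illustration under stated assumptions, not TrustShield's actual code: `blocking_predict` is a hypothetical stand-in for a CPU-bound scikit-learn inference call, and the keyword list and pool size are invented for the example.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def blocking_predict(text: str) -> float:
    """Hypothetical stand-in for a blocking model call; returns a score in [0, 1]."""
    suspicious = ("verify", "password", "urgent")  # assumed toy vocabulary
    hits = sum(word in text.lower() for word in suspicious)
    return min(1.0, hits / len(suspicious))


async def predict_async(text: str, pool: ThreadPoolExecutor) -> float:
    """Run the blocking call in a worker thread so the event loop stays free."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(pool, blocking_predict, text)


async def main() -> float:
    with ThreadPoolExecutor(max_workers=4) as pool:  # pool size is an assumption
        return await predict_async("URGENT: verify your password", pool)


score = asyncio.run(main())
print(score)
```

Offloading the model call this way keeps the event loop free for the I/O-bound layers, such as URL lookups, while inference runs in a worker thread.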
Aggressive Fusion Strategy
TrustShield uses a strategy referred to internally as aggressive fusion. Every layer returns a numeric score in the range [0, 1], where higher values indicate greater phishing likelihood.
# Aggressive Fusion Algorithm
final_score = (
    ml_prediction * 0.35 +         # Machine Learning
    url_intelligence * 0.25 +      # URL Analysis
    rule_heuristics * 0.20 +       # Rule Engine
    emotional_analysis * 0.10 +    # Emotion Detection
    behavioral_anomalies * 0.07 +  # Behavior Analysis
    llm_semantic * 0.03            # LLM Understanding
)
# Higher scores mean greater phishing likelihood, so flag at or above the threshold
verdict = "phishing" if final_score >= 0.4 else "legitimate"
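A short worked example may make the weighting concrete. The weights and the 0.4 cutoff come from the snippet above; the helper names (`fuse`, `verdict_for`) and the sample layer scores are hypothetical. Because higher scores indicate greater phishing likelihood, the sketch treats a score at or above the cutoff as phishing.

```python
WEIGHTS = {
    "ml": 0.35,
    "url": 0.25,
    "rules": 0.20,
    "emotion": 0.10,
    "behavior": 0.07,
    "llm": 0.03,
}
PHISHING_THRESHOLD = 0.4


def fuse(scores: dict) -> float:
    """Weighted sum of per-layer scores; each score must lie in [0, 1]."""
    for name in WEIGHTS:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"{name} score out of range: {scores[name]}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def verdict_for(final_score: float) -> str:
    """Higher scores are more phishing-like, so flag at or above the cutoff."""
    return "phishing" if final_score >= PHISHING_THRESHOLD else "legitimate"


# Invented sample: strong ML and URL signals, weak everything else
example = {"ml": 0.9, "url": 0.8, "rules": 0.5,
           "emotion": 0.2, "behavior": 0.1, "llm": 0.0}
final = fuse(example)  # 0.315 + 0.2 + 0.1 + 0.02 + 0.007 + 0.0, about 0.642
print(round(final, 3), verdict_for(final))
```

The six weights sum to 1.0, so the fused score stays in [0, 1] whenever every layer score does.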
SOC Dashboard
The Security Operations Centre (SOC) dashboard is a web-based interface that allows security analysts to monitor the live behaviour of TrustShield AI.
Dashboard Features
The dashboard surfaces four primary views:
- Live activity feed - Every scan and every verdict, streamed in real time
- Risk levels and trends - Hourly and daily phishing pressure, segmented per tenant
- Model information - Active model version, accuracy, calibration and drift indicators
- Alerts and notifications - High-risk verdicts, drift alarms and pipeline failures
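As an illustration of the trends view, hourly phishing pressure could be derived from stored verdicts roughly as follows. This is a sketch only: it assumes each record carries a `label` and a timezone-aware `timestamp`, and the function name is hypothetical. The production dashboard may well aggregate inside MongoDB instead.

```python
from collections import Counter
from datetime import datetime, timezone


def hourly_phishing_counts(records):
    """Count phishing verdicts per hour bucket (timestamps truncated to the hour)."""
    counts = Counter()
    for rec in records:
        if rec["label"] == "phishing":
            bucket = rec["timestamp"].replace(minute=0, second=0, microsecond=0)
            counts[bucket] += 1
    return dict(sorted(counts.items()))


# Invented sample data for the sketch
sample = [
    {"label": "phishing", "timestamp": datetime(2026, 4, 26, 10, 5, tzinfo=timezone.utc)},
    {"label": "phishing", "timestamp": datetime(2026, 4, 26, 10, 55, tzinfo=timezone.utc)},
    {"label": "legitimate", "timestamp": datetime(2026, 4, 26, 10, 30, tzinfo=timezone.utc)},
    {"label": "phishing", "timestamp": datetime(2026, 4, 26, 11, 0, tzinfo=timezone.utc)},
]
for hour, count in hourly_phishing_counts(sample).items():
    print(hour.isoformat(), count)
```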
Real-time Threat Monitoring
The real-time threat monitoring interface displays live phishing detection results, risk scores, and automated threat intelligence feeds from the TrustShield AI system.
Chrome Extension Integration
TrustShield AI integrates with email clients through a Chrome Manifest V3 browser extension.
Extension Workflow
sequenceDiagram
participant U as User
participant E as Extension
participant A as API
participant D as Database
U->>E: Opens email
E->>E: Extract content & URLs
E->>A: Send to /analyze endpoint
A->>A: Process through detection layers
A->>E: Return verdict & score
E->>U: Display risk indicator
A->>D: Store results for retraining
Extension Process
- Reads the email content and extracts URLs from the active DOM
- Sends the payload to the Flask /analyze endpoint with a rotating API key
- Displays the risk score, classification and triggered rules to the user
- Mirrors the verdict to the SOC dashboard via the same logging spine
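The first two steps above, extracting URLs and assembling the request body, can be sketched as follows. The real extension would do this in JavaScript; Python is used here only to match the rest of the post. The payload field names (`content`, `urls`, `source`) and the URL pattern are assumptions, and the rotating API key is left out.

```python
import json
import re

# Deliberately simple pattern (an assumption); real-world URL extraction is messier
URL_PATTERN = re.compile(r"https?://[^\s<>()]+")


def extract_urls(email_text: str) -> list:
    """Return unique URLs in order of first appearance, minus trailing punctuation."""
    found = (match.rstrip(".,;:!?") for match in URL_PATTERN.findall(email_text))
    return list(dict.fromkeys(found))


def build_payload(email_text: str) -> str:
    """Serialize a request body for the /analyze endpoint (shape is hypothetical)."""
    return json.dumps({
        "content": email_text,
        "urls": extract_urls(email_text),
        "source": "extension",
    })


text = "Reset at http://evil.example/login now, or visit https://ok.example."
print(build_payload(text))
```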
Living Retraining Dataset
A central design principle of TrustShield AI is that the model must learn from the traffic it sees. The system does not rely solely on static phishing corpora.
Dataset Schema
| Field | Type | Description |
|---|---|---|
| email_id | ObjectId | Unique identifier |
| timestamp | ISODate | Time of analysis (UTC) |
| content | Text | Email body content |
| urls | Array | Extracted URLs |
| label | Enum | phishing or legitimate |
| confidence_score | Float [0,1] | Model probability |
| risk_level | Enum | low · medium · high · critical |
| source | Enum | dashboard or extension |
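One way to enforce this schema before a record is written is a small validating dataclass. The sketch below is an assumption about how such validation might look, not the project's actual model. The class name `ScanRecord` is hypothetical, and `email_id` is left out because generating an ObjectId needs the `bson` package that ships with PyMongo.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List

LABELS = {"phishing", "legitimate"}
RISK_LEVELS = {"low", "medium", "high", "critical"}
SOURCES = {"dashboard", "extension"}


@dataclass
class ScanRecord:
    content: str
    urls: List[str]
    label: str
    confidence_score: float
    risk_level: str
    source: str
    # Time of analysis in UTC, filled in when the record is created
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self):
        if self.label not in LABELS:
            raise ValueError(f"invalid label: {self.label}")
        if self.risk_level not in RISK_LEVELS:
            raise ValueError(f"invalid risk_level: {self.risk_level}")
        if self.source not in SOURCES:
            raise ValueError(f"invalid source: {self.source}")
        if not 0.0 <= self.confidence_score <= 1.0:
            raise ValueError("confidence_score must lie in [0, 1]")


record = ScanRecord(
    content="Your account is locked",
    urls=["http://evil.example/login"],
    label="phishing",
    confidence_score=0.91,
    risk_level="critical",
    source="extension",
)
document = asdict(record)  # plain dict, ready to hand to a database driver
print(sorted(document))
```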
Deployment & Performance
Core Engine Implementation
import asyncio
from typing import Dict, List


async def analyze_email(email_content: str, urls: List[str]) -> Dict:
    """Run all detection layers concurrently and fuse their scores."""
    # Each analyzer is assumed to expose an async method returning a score in [0, 1]
    tasks = [
        ml_predictor.predict(email_content),
        url_analyzer.check_urls(urls),
        rule_engine.evaluate(email_content),
        emotion_analyzer.analyze(email_content),
        behavior_detector.analyze(email_content),
        llm_analyzer.analyze(email_content),
    ]
    # gather() returns results in the same order as the tasks list
    (ml_score, url_score, rule_score,
     emotion_score, behavior_score, llm_score) = await asyncio.gather(*tasks)

    # Aggressive fusion with configurable weights
    final_score = (
        ml_score * 0.35 +
        url_score * 0.25 +
        rule_score * 0.20 +
        emotion_score * 0.10 +
        behavior_score * 0.07 +
        llm_score * 0.03
    )

    return {
        # Higher scores mean greater phishing likelihood
        'verdict': 'phishing' if final_score >= 0.4 else 'legitimate',
        'confidence': final_score,
        'risk_level': calculate_risk_level(final_score),
        'layer_scores': {
            'ml': ml_score,
            'url': url_score,
            'rules': rule_score,
            'emotion': emotion_score,
            'behavior': behavior_score,
            'llm': llm_score,
        },
    }
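The engine calls `calculate_risk_level`, which is not shown in this post. A minimal sketch of what such a helper might look like follows. The four level names come from the dataset schema, but the cut points (0.4, 0.6, 0.8) are purely illustrative assumptions, with the lowest aligned to the 0.4 fusion cutoff.

```python
def calculate_risk_level(final_score: float) -> str:
    """Map a fused score in [0, 1] to a risk band (thresholds are assumed)."""
    if not 0.0 <= final_score <= 1.0:
        raise ValueError("final_score must lie in [0, 1]")
    if final_score >= 0.8:
        return "critical"
    if final_score >= 0.6:
        return "high"
    if final_score >= 0.4:
        return "medium"
    return "low"


for sample_score in (0.1, 0.4, 0.642, 0.95):
    print(sample_score, calculate_risk_level(sample_score))
```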
Performance Metrics
| Metric | Value |
|---|---|
| Latency | < 200 ms per email |
| Throughput | 1000+ emails/minute |
| Accuracy | 95-98% |
| False Positive Rate | < 2% |
| Coverage | 100% of inbound emails |
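The latency and throughput figures can be related with Little's law (average items in flight = arrival rate × time in system). The sketch below treats 200 ms as the time each email spends in the pipeline, which is an assumption: the table gives it as an upper bound, so the result is an upper estimate for the stated throughput.

```python
def required_concurrency(emails_per_minute: float, latency_seconds: float) -> float:
    """Little's law: average in-flight requests = arrival rate * time in system."""
    arrival_rate = emails_per_minute / 60.0  # emails per second
    return arrival_rate * latency_seconds


in_flight = required_concurrency(1000, 0.2)
print(round(in_flight, 2))  # about 3.33 emails being analyzed at any moment
```

On these numbers, a handful of concurrent workers would sustain the stated throughput. Bursts above the average rate would need more headroom.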
Future Enhancements
Planned Features
- Edge inference - Execution of the model inside the extension itself
- Autonomous remediation - Automatic quarantine and sender disposal
- Multi-tenant support - Isolated environments for different organizations
- Advanced LLM integration - Fine-tuned models for specific phishing patterns
- Mobile app - Native applications for iOS and Android
Research Directions
- Zero-day phishing detection - Using unsupervised learning for novel attack patterns
- Cross-platform integration - Support for Outlook, Gmail, and other email clients
- Blockchain integration - Immutable audit trails for compliance
- Federated learning - Collaborative model training across organizations
Getting Started
Prerequisites
- Python 3.8+
- MongoDB 4.4+
- Node.js 16+
- Chrome Browser (for extension)
Installation
# Clone the repository
git clone https://github.com/karthikeya1498/PFSD-BLOG.git
cd PFSD-BLOG
# Install backend dependencies
pip install -r requirements.txt
# Install frontend dependencies
npm install
# Start MongoDB
mongod
# Run the Flask backend
python app.py
# Run the frontend
npm run dev
Configuration
- Set up your MongoDB connection string in config.py
- Configure your Ollama instance for LLM integration
- Load the pre-trained ML models from models/
- Install the Chrome extension from extension/
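The repository's actual config.py is not shown here, so the following is only a sketch of what such a file could contain. The environment variable names and the `trustshield` database name are hypothetical. The ports are the MongoDB and Ollama defaults, and `phi` and `models/` come from this post.

```python
import os
from typing import Mapping


def load_config(env: Mapping[str, str] = os.environ) -> dict:
    """Read settings from the environment, falling back to local defaults."""
    return {
        # Hypothetical variable names; 27017 is MongoDB's default port
        "MONGO_URI": env.get("TRUSTSHIELD_MONGO_URI",
                             "mongodb://localhost:27017/trustshield"),
        # 11434 is the default port of a local Ollama server
        "OLLAMA_URL": env.get("TRUSTSHIELD_OLLAMA_URL", "http://localhost:11434"),
        "OLLAMA_MODEL": env.get("TRUSTSHIELD_OLLAMA_MODEL", "phi"),
        "MODEL_DIR": env.get("TRUSTSHIELD_MODEL_DIR", "models/"),
    }


config = load_config()
print(config["OLLAMA_MODEL"])
```

Reading the connection string from the environment keeps credentials out of version control.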
Contributing
We welcome contributions to TrustShield AI!
- Source Code: GitHub Repository
- Live Demo: TrustShield AI Blog
- Issues: GitHub Issues
TrustShield AI · 2026
Written by TrustShield AI Team
This blog was last edited on 26 April 2026 by the TrustShield AI Team. Text is available under the open documentation license; the source code is published on github.com/Tejus468/pfsd_project.



