Karthik Vankayalapati

TrustShield AI: Multi-Layer Phishing Detection Framework Using Machine Learning

description: "Learn how TrustShield AI combines machine learning, URL intelligence, and real-time threat monitoring to detect sophisticated phishing attacks with 95-98% accuracy."
published: true
cover_image: https://github.com/karthikeya1498/PFSD-BLOG/blob/main/assets/hero-shield.jpg?raw=true
tags: ['python', 'machinelearning', 'cybersecurity', 'flask', 'mongodb', 'phishing']

canonical_url: https://pfsd-blog.vercel.app/

TrustShield AI: A Multi-Layer Phishing Detection Framework Using Machine Learning


TrustShield AI is a multi-layered, AI-driven phishing detection framework designed to identify and mitigate sophisticated email-based attacks in real time. Built on a three-tier architecture comprising a frontend dashboard, a Flask-based asynchronous backend, and a MongoDB persistence layer, the system fuses six independent intelligence signals to achieve detection accuracy of approximately 95-98%.


Key Features

| Feature | Specification |
| --- | --- |
| Detection latency | < 200 ms |
| Detection accuracy | ≈ 95-98% |
| Real-time processing | Asynchronous Flask backend |
| Living retraining | Continuous model adaptation |
| Chrome Extension | Manifest V3 integration |
| SOC Dashboard | Real-time monitoring interface |

System Architecture

TrustShield AI is structured into three logical tiers. This separation allows each tier to scale, fail and be replaced independently of the others.

Three-Tier Design

  1. Frontend Dashboard - Web-based SOC interface for security analysts
  2. Backend - Flask (Python) with asyncio for asynchronous processing
  3. Database - MongoDB for persistence and real-time analytics

Technology Stack

graph TB
    A[Frontend Dashboard] --> B[Flask Backend]
    B --> C[MongoDB Database]
    D[Chrome Extension] --> B
    E[ML Models] --> B
    F[URL Intelligence] --> B
    G[Rule Engine] --> B
    H[LLM Analysis] --> B
| Layer | Technology | Purpose |
| --- | --- | --- |
| Frontend | HTML5, CSS3, JavaScript (Vanilla) | Dashboard UI, real-time updates |
| Backend | Flask (Python) · asyncio | API server, async processing |
| Database | MongoDB (PyMongo) | Data persistence, analytics |
| ML Library | scikit-learn · pandas · numpy | Model training and inference |
| Models | LogReg · RF · GBM · Linear SVM | Classification algorithms |
| LLM Assist | Ollama (phi model, local) | Semantic analysis |
| Extension | Chrome MV3 | Browser integration |

Detection Engine

The detection engine is the analytic core of TrustShield AI. Each incoming email is normalized, vectorized and dispatched to a non-blocking executor that runs ML inference alongside five further intelligence modules: URL analysis, rule heuristics, emotion detection, behavior analysis and LLM semantic analysis.
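
As a rough illustration of the "normalize, then dispatch to a non-blocking executor" step, the sketch below offloads a blocking scoring function to a thread pool with asyncio's `run_in_executor`. The names `normalize`, `blocking_inference` and `score_email` are hypothetical, and the keyword counter is only a stand-in for a real scikit-learn model; the project's actual pipeline may be organised differently.

```python
import asyncio
import re
from concurrent.futures import ThreadPoolExecutor


def normalize(raw: str) -> str:
    # Lowercase and collapse whitespace so every layer sees the same text.
    return re.sub(r"\s+", " ", raw).strip().lower()


def blocking_inference(text: str) -> float:
    # Hypothetical stand-in for a blocking model call (for example predict_proba).
    suspicious = ("verify", "urgent", "password")
    hits = sum(word in text for word in suspicious)
    return min(1.0, hits / len(suspicious))


async def score_email(raw: str, pool: ThreadPoolExecutor) -> float:
    loop = asyncio.get_running_loop()
    text = normalize(raw)
    # run_in_executor keeps the event loop free while the worker thread runs.
    return await loop.run_in_executor(pool, blocking_inference, text)


def demo() -> float:
    with ThreadPoolExecutor(max_workers=2) as pool:
        return asyncio.run(score_email("URGENT:  verify your   password", pool))
```

Running CPU-bound inference in a worker thread means a slow model call does not stall the other requests that share the event loop.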

Aggressive Fusion Strategy

TrustShield uses a strategy referred to internally as aggressive fusion. Every layer returns a numeric score in the range [0, 1], where higher values indicate greater phishing likelihood.

# Aggressive Fusion Algorithm
final_score = (
    ml_prediction * 0.35 +      # Machine Learning
    url_intelligence * 0.25 +  # URL Analysis  
    rule_heuristics * 0.20 +   # Rule Engine
    emotional_analysis * 0.10 + # Emotion Detection
    behavioral_anomalies * 0.07 + # Behavior Analysis
    llm_semantic * 0.03         # LLM Understanding
)

# Higher scores mean higher phishing likelihood, so scores at or above the cutoff are flagged
verdict = "phishing" if final_score >= 0.4 else "legitimate"
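
To make the arithmetic concrete, here is a minimal, self-contained sketch of the weighted sum using the six weights listed above. The names `WEIGHTS`, `fuse` and `verdict` are hypothetical, and the sketch assumes that a score at or above the 0.4 cutoff is flagged as phishing, in line with the statement that higher scores indicate greater phishing likelihood.

```python
from typing import Dict

# Weights taken from the fusion formula; they sum to 1.0.
WEIGHTS: Dict[str, float] = {
    "ml": 0.35,
    "url": 0.25,
    "rules": 0.20,
    "emotion": 0.10,
    "behavior": 0.07,
    "llm": 0.03,
}
THRESHOLD = 0.4  # assumed direction: at or above the cutoff means phishing


def fuse(scores: Dict[str, float]) -> float:
    # Reject scores outside [0, 1] so one bad layer cannot skew the result.
    for name in WEIGHTS:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"{name} score out of range: {scores[name]}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def verdict(final_score: float) -> str:
    return "phishing" if final_score >= THRESHOLD else "legitimate"


example = {"ml": 0.9, "url": 0.8, "rules": 0.5,
           "emotion": 0.3, "behavior": 0.2, "llm": 0.1}
final = fuse(example)  # 0.315 + 0.200 + 0.100 + 0.030 + 0.014 + 0.003
```

With these weights, an ML score of 1.0 contributes only 0.35 on its own, so at least one other layer has to raise some suspicion before the 0.4 cutoff is reached.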

SOC Dashboard

The Security Operations Centre (SOC) dashboard is a web-based interface that allows security analysts to monitor the live behaviour of TrustShield AI.


Dashboard Features

The dashboard surfaces four primary views:

  • Live activity feed - Every scan and every verdict, streamed in real time
  • Risk levels and trends - Hourly and daily phishing pressure, segmented per tenant
  • Model information - Active model version, accuracy, calibration and drift indicators
  • Alerts and notifications - High-risk verdicts, drift alarms and pipeline failures

Real-time Threat Monitoring


The real-time threat monitoring interface displays live phishing detection results, risk scores, and automated threat intelligence feeds from the TrustShield AI system.


Chrome Extension Integration

TrustShield AI integrates with email clients through a Chrome Manifest V3 browser extension.

Extension Workflow

sequenceDiagram
    participant U as User
    participant E as Extension
    participant A as API
    participant D as Database

    U->>E: Opens email
    E->>E: Extract content & URLs
    E->>A: Send to /analyze endpoint
    A->>A: Process through detection layers
    A->>E: Return verdict & score
    E->>U: Display risk indicator
    A->>D: Store results for retraining

Extension Process

  1. Reads the email content and extracts URLs from the active DOM
  2. Sends the payload to the Flask /analyze endpoint with a rotating API key
  3. Displays the risk score, classification and triggered rules to the user
  4. Mirrors the verdict to the SOC dashboard via the same logging spine
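
The first two steps can be sketched as follows. The real extension would do this in JavaScript inside the browser; the version below is written in Python only to stay consistent with the rest of the examples. The regular expression, the helper names and the JSON field names (`content`, `urls`, `source`, borrowed from the dataset schema) are assumptions, and API key handling is left out.

```python
import json
import re
from typing import Dict, List

# Simple pattern for http(s) links; real-world URL extraction is usually stricter.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(body: str) -> List[str]:
    # De-duplicate while preserving first-seen order.
    seen: Dict[str, None] = {}
    for match in URL_PATTERN.findall(body):
        seen.setdefault(match.rstrip(".,;)"), None)
    return list(seen)


def build_payload(body: str) -> str:
    # Hypothetical request body for the /analyze endpoint.
    return json.dumps({
        "content": body,
        "urls": extract_urls(body),
        "source": "extension",
    })
```

Extracting URLs on the client keeps the request small, but the backend should still treat the list as untrusted input and re-validate it.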

Living Retraining Dataset

A central design principle of TrustShield AI is that the model must learn from the traffic it sees. The system does not rely solely on static phishing corpora.

Dataset Schema

| Field | Type | Description |
| --- | --- | --- |
| email_id | ObjectId | Unique identifier |
| timestamp | ISODate | Time of analysis (UTC) |
| content | Text | Email body content |
| urls | Array | Extracted URLs |
| label | Enum | phishing or legitimate |
| confidence_score | Float [0,1] | Model probability |
| risk_level | Enum | low · medium · high · critical |
| source | Enum | dashboard or extension |
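
One way to keep stored records consistent with this schema is to validate them before they reach MongoDB. The sketch below is an assumption about how that could look, not the project's actual code: `ScanRecord` is a hypothetical dataclass, and the identifier field is omitted on the assumption that the database assigns it on insert.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List

LABELS = {"phishing", "legitimate"}
RISK_LEVELS = {"low", "medium", "high", "critical"}
SOURCES = {"dashboard", "extension"}


@dataclass
class ScanRecord:
    content: str
    urls: List[str]
    label: str
    confidence_score: float
    risk_level: str
    source: str
    # Time of analysis, stored in UTC as the schema requires.
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self) -> None:
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label}")
        if self.risk_level not in RISK_LEVELS:
            raise ValueError(f"unknown risk level: {self.risk_level}")
        if self.source not in SOURCES:
            raise ValueError(f"unknown source: {self.source}")
        if not 0.0 <= self.confidence_score <= 1.0:
            raise ValueError("confidence_score must lie in [0, 1]")


record = ScanRecord(
    content="Please verify your account",
    urls=["http://a.example/login"],
    label="phishing",
    confidence_score=0.91,
    risk_level="critical",
    source="extension",
)
document = asdict(record)  # plain dict, suitable for a PyMongo insert_one call
```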



Deployment & Performance

Core Engine Implementation

import asyncio
from typing import Dict, List

async def analyze_email(email_content: str, urls: List[str]) -> Dict:
    """Parallel processing of all detection layers"""

    # Execute all detection layers concurrently
    tasks = [
        ml_predictor.predict(email_content),
        url_analyzer.check_urls(urls),
        rule_engine.evaluate(email_content),
        emotion_analyzer.analyze(email_content),
        behavior_detector.analyze(email_content),
        llm_analyzer.analyze(email_content)
    ]

    ml_score, url_score, rule_score, emotion_score, behavior_score, llm_score = await asyncio.gather(*tasks)

    # Aggressive fusion with configurable weights
    final_score = (
        ml_score * 0.35 +
        url_score * 0.25 +
        rule_score * 0.20 +
        emotion_score * 0.10 +
        behavior_score * 0.07 +
        llm_score * 0.03
    )

    return {
        # Higher scores mean higher phishing likelihood, so flag at or above the cutoff
        'verdict': 'phishing' if final_score >= 0.4 else 'legitimate',
        'confidence': final_score,
        'risk_level': calculate_risk_level(final_score),
        'layer_scores': {
            'ml': ml_score,
            'url': url_score,
            'rules': rule_score,
            'emotion': emotion_score,
            'behavior': behavior_score,
            'llm': llm_score
        }
    }
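
The listing calls `calculate_risk_level` without defining it. A minimal sketch of such a helper is shown below, mapping the fused score onto the four risk levels from the dataset schema. Only the 0.4 boundary comes from the article; the 0.6 and 0.8 cut points are assumptions chosen for illustration.

```python
def calculate_risk_level(final_score: float) -> str:
    """Map a fused score in [0, 1] to low, medium, high or critical."""
    if not 0.0 <= final_score <= 1.0:
        raise ValueError(f"score out of range: {final_score}")
    if final_score >= 0.8:  # assumed cut point
        return "critical"
    if final_score >= 0.6:  # assumed cut point
        return "high"
    if final_score >= 0.4:  # matches the verdict cutoff used in the article
        return "medium"
    return "low"
```

Aligning the lowest non-trivial band with the verdict cutoff means every email classified as phishing is reported as at least medium risk.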

Performance Metrics

| Metric | Value |
| --- | --- |
| Latency | < 200 ms per email |
| Throughput | 1000+ emails/minute |
| Accuracy | 95-98% |
| False Positive Rate | < 2% |
| Coverage | 100% of inbound emails |
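
The latency and throughput figures can be related with Little's law (requests in flight = arrival rate × time per request). The sketch below is a back-of-the-envelope check, not a benchmark; `required_concurrency` is a hypothetical helper.

```python
import math


def required_concurrency(emails_per_minute: float, latency_seconds: float) -> int:
    # Little's law: in-flight requests = arrival rate x time spent per request.
    arrival_rate = emails_per_minute / 60.0
    return math.ceil(arrival_rate * latency_seconds)


workers = required_concurrency(1000, 0.2)
```

At the worst-case latency of 200 ms, a single sequential worker would top out at 300 emails per minute, so sustaining 1000 per minute implies roughly four analyses in flight at once.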

Future Enhancements

Planned Features

  1. Edge inference - Execution of the model inside the extension itself
  2. Autonomous remediation - Automatic quarantine and sender disposal
  3. Multi-tenant support - Isolated environments for different organizations
  4. Advanced LLM integration - Fine-tuned models for specific phishing patterns
  5. Mobile app - Native applications for iOS and Android

Research Directions

  • Zero-day phishing detection - Using unsupervised learning for novel attack patterns
  • Cross-platform integration - Support for Outlook, Gmail, and other email clients
  • Blockchain integration - Immutable audit trails for compliance
  • Federated learning - Collaborative model training across organizations

Getting Started

Prerequisites

  • Python 3.8+
  • MongoDB 4.4+
  • Node.js 16+
  • Chrome Browser (for extension)

Installation

# Clone the repository
git clone https://github.com/karthikeya1498/PFSD-BLOG.git
cd PFSD-BLOG

# Install backend dependencies
pip install -r requirements.txt

# Install frontend dependencies
npm install

# Start MongoDB
mongod

# Run the Flask backend
python app.py

# Run the frontend
npm run dev

Configuration

  1. Set up your MongoDB connection string in config.py
  2. Configure your Ollama instance for LLM integration
  3. Load the pre-trained ML models from models/
  4. Install the Chrome extension from extension/
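
The repository's config.py is not shown in this post, so the following is only a sketch of how such a module might gather these settings. The environment variable names and the `Settings` class are hypothetical. The defaults are the standard local addresses for MongoDB and Ollama, plus the phi model and models/ directory mentioned above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    mongo_uri: str
    ollama_url: str
    ollama_model: str
    model_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    # Environment variable names are assumptions; adjust them to the real config.py.
    return Settings(
        mongo_uri=env.get("MONGO_URI", "mongodb://localhost:27017"),
        ollama_url=env.get("OLLAMA_URL", "http://localhost:11434"),
        ollama_model=env.get("OLLAMA_MODEL", "phi"),
        model_dir=env.get("MODEL_DIR", "models/"),
    )
```

Reading values from the environment keeps credentials such as the MongoDB connection string out of version control.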

Contributing

We welcome contributions to TrustShield AI!


TrustShield AI · 2026

Written by TrustShield AI Team

This blog was last edited on 26 April 2026, by TrustShield AI Team. Text is available under the open documentation license; the source code is published on github.com/Tejus468/pfsd_project.
