DEV Community

Mohammad Waseem

Detecting Phishing Patterns in Legacy JavaScript Codebases: A Security Researcher’s Approach


Legacy web applications often harbor security vulnerabilities, with phishing attacks being a persistent threat. As security researchers, our goal is to develop reliable methods to detect phishing patterns embedded within older JavaScript code, which often lacks modern security practices.

The Challenge with Legacy Code

Many legacy systems inline JavaScript directly within HTML, relying on convoluted, unstructured scripts. This makes pattern detection complex due to obfuscated code, dynamic script generation, and inconsistent coding styles. Traditional static analysis tools may struggle to parse such code effectively.

Our Approach: Pattern-Based Detection

To tackle this, we focus on identifying specific phishing indicators within JavaScript snippets. These indicators include suspicious URL manipulations, deceptive DOM modifications, and obfuscated string patterns.

Step 1: Identifying Common Phishing Signatures

Phishing scripts often perform the following actions:

  • Dynamic creation of DOM elements like <iframe>, <script>, or <a> tags that mimic legitimate sources.
  • Use of eval() and Function() for obfuscation.
  • URL string manipulations with base64 or hex encoding.
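The first indicator in the list above, dynamic element creation, can be approximated with a regular expression over the script text. The following is a minimal sketch, not a complete detector: the function name findDynamicElementCreation and the DYNAMIC_TAGS list are assumptions introduced here for illustration, and the pattern only covers direct document.createElement calls with a string literal argument.

```javascript
// Hypothetical helper (not from the original article): reports which of the
// watched tag names a script creates through document.createElement.
const DYNAMIC_TAGS = ['iframe', 'script', 'a'];

function findDynamicElementCreation(scriptContent) {
  const found = new Set();
  // Matches document.createElement('iframe') with single or double quotes.
  const createPattern = /document\.createElement\(\s*['"]([a-z]+)['"]\s*\)/gi;
  for (const match of scriptContent.matchAll(createPattern)) {
    const tag = match[1].toLowerCase();
    if (DYNAMIC_TAGS.includes(tag)) {
      found.add(tag);
    }
  }
  // Return a sorted array so results are stable across runs.
  return [...found].sort();
}
```

A non-empty result is only a prompt for manual review, since many legitimate scripts also create these elements at runtime.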

Step 2: Pattern Recognition in JavaScript

Here's an example detection snippet focusing on identifying suspicious URL manipulations:

// Detect base64-encoded URLs in scripts.
// 'maliciousdomain.com' is an illustrative domain; substitute your own blocklist.
function containsSuspiciousURLs(scriptContent) {
  // Candidate strings: runs of 20 or more base64 characters
  const base64Pattern = /[A-Za-z0-9+/=]{20,}/g;
  const matches = scriptContent.match(base64Pattern) || [];
  for (const match of matches) {
    try {
      // atob throws if the candidate is not valid base64
      const decoded = atob(match);
      if (decoded.startsWith('http') && decoded.includes('maliciousdomain.com')) {
        return true;
      }
    } catch (e) {
      // Not valid base64; move on to the next candidate
      continue;
    }
  }
  return false;
}

This function scans script content for runs of 20 or more base64 characters and returns true when one of them decodes to a URL that contains the example domain maliciousdomain.com.
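Step 1 also lists hex encoding as a URL manipulation technique. A comparable check for hex-escaped strings might look like the sketch below. The function name findHexEncodedURLs and the minimum run length of 8 escapes are assumptions chosen for illustration; the code only handles the \xNN escape form as it appears in source text.

```javascript
// Hypothetical helper (not from the original article): decodes runs of \xNN
// escapes found in script source and returns the ones that look like URLs.
function findHexEncodedURLs(scriptContent) {
  const urls = [];
  // A run of at least 8 consecutive \xNN escapes, as written in the source.
  const hexRunPattern = /(?:\\x[0-9a-fA-F]{2}){8,}/g;
  for (const run of scriptContent.match(hexRunPattern) || []) {
    // Turn each \xNN escape into the character it represents.
    const decoded = run.replace(/\\x([0-9a-fA-F]{2})/g, (_, hex) =>
      String.fromCharCode(parseInt(hex, 16))
    );
    if (/^https?:\/\//i.test(decoded)) {
      urls.push(decoded);
    }
  }
  return urls;
}
```

Returning the decoded URLs rather than a boolean lets the caller compare them against whatever blocklist or allowlist the pipeline already maintains.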

Step 3: Analyzing Script Content for Obfuscation

Obfuscation is common in phishing scripts. To detect this, we analyze the use of eval() and Function():

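// Detect eval() or Function() usage
function containsObfuscatedCode(scriptContent) {
  // \b avoids matching identifiers that merely end in "eval";
  // \s* tolerates whitespace before the parenthesis.
  // The Function pattern covers both Function(...) and new Function(...).
  const patterns = [/\beval\s*\(/, /\bFunction\s*\(/];
  return patterns.some(pattern => pattern.test(scriptContent));
}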

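Scripts that use these functions might be hiding malicious URLs or logic. Legitimate legacy code also calls eval(), so treat a match as a signal for review rather than proof.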
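When triaging a large legacy script, knowing where a dynamic execution call occurs is often more useful than a yes/no answer. The sketch below reports line numbers for each hit. The function name locateDynamicExecution and the shape of the returned objects are assumptions made for this example, and it records at most one finding per line.

```javascript
// Hypothetical helper (not from the original article): lists the lines on
// which eval(...) or Function(...) appears, for manual review.
function locateDynamicExecution(scriptContent) {
  // Non-global pattern, so exec() carries no state between lines.
  const callPattern = /\b(eval|Function)\s*\(/;
  const findings = [];
  scriptContent.split(/\r?\n/).forEach((line, index) => {
    const match = callPattern.exec(line);
    if (match) {
      // Line numbers are 1-based to match what editors display.
      findings.push({ line: index + 1, call: match[1] });
    }
  });
  return findings;
}
```

Because the match is purely textual, calls that appear inside comments or string literals are reported as well.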

Integrating Detection in Legacy Systems

For real-world applications, embed these detection functions into your legacy code analysis pipeline. Use static code analysis tools or custom parsers to extract inline scripts from HTML and pass them through these detectors.

// Example: analyzing inline scripts within HTML
const htmlContent = `...`; // load from legacy page
// Capture group 1 holds the script body; external scripts (src=...) have an empty body
const scriptPattern = /<script[^>]*>([\s\S]*?)<\/script>/gi;
for (const [, scriptInnerContent] of htmlContent.matchAll(scriptPattern)) {
  if (containsSuspiciousURLs(scriptInnerContent) || containsObfuscatedCode(scriptInnerContent)) {
    console.log('Potential phishing script detected');
  }
}
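Legacy pages also place JavaScript in inline event handler attributes such as onclick, which extracting <script> tags alone will miss. One possible way to collect those handler bodies so they can be passed through the same detectors is sketched below. The function name extractInlineHandlers is an assumption, and the pattern only handles quoted attribute values.

```javascript
// Hypothetical helper (not from the original article): collects the code
// inside quoted on* attributes, e.g. onclick="..." or onload='...'.
function extractInlineHandlers(htmlContent) {
  const handlers = [];
  // Group 1 captures double-quoted values, group 2 single-quoted values.
  const handlerPattern = /\son[a-z]+\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  for (const match of htmlContent.matchAll(handlerPattern)) {
    handlers.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return handlers;
}
```

Like the script extraction above, this is a regex approximation of HTML parsing, so unusual markup may require a real parser.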

Final Thoughts

By applying pattern recognition techniques, specifically targeting URL manipulations and obfuscation, security researchers can identify malicious scripts embedded in legacy JavaScript codebases. Continual updating of patterns and leveraging machine learning models trained on known phishing scripts can enhance detection accuracy over time.

Proactively identifying these threats helps mitigate phishing risks in environments where code revision might be limited. Combining static analysis methods with runtime behavior monitoring offers a comprehensive approach to securing legacy systems against evolving phishing tactics.

