James Smith
Common Red Flags in Fake E-commerce Sites

How detection algorithms decode the signals that distinguish a scam storefront from a real one.
In November 2024, a security researcher named Priya ran a honeypot experiment. She built a deliberately fake e-commerce store: stolen product images, a cloned checkout flow, and a fresh domain, registered a few hours earlier, with a single letter changed from a popular sportswear brand. She submitted it to three consumer-protection platforms and waited.
Two platforms flagged it in under four hours. One missed it for a full eleven days.
The difference was not database coverage. It was depth of signal extraction: how many independent red flags each system measured, and how those signals were weighted against one another. The engine that missed the site was running a shallow URL blocklist. The two that caught it were running multi-layer classifiers that examined the same site through entirely different lenses.
This article breaks down what those latter systems actually observe: the specific, measurable red flags that current fake e-commerce sites exhibit, how detection algorithms encode them into feature vectors, and where loopholes still remain.

Anatomy of a Fake Storefront

Counterfeit web stores do not happen by accident. They are built to a formula: a set of cost-saving choices by operators who must spin up dozens of storefronts every week, harvest payment data from a small percentage of visitors, and vanish before takedown. Each of those shortcuts leaves a trace.
The formula consists of four layers, each optimized for a particular step of the attack lifecycle:


Every layer has its own signal profile. A detection system that only checks the infrastructure layer will miss sites that have invested in a legitimate-looking domain but carry an entirely fabricated trust layer. Effective detection has to read all four simultaneously.

Seven Measurable Red Flags

1. Domain age and registration pattern.
Fraudulent storefronts are almost always created on newly registered domains: the median domain age at the time of the first victim report is under 28 days. Operators register in bulk behind privacy-masking WHOIS services, often buying 20 to 50 domains per session with prepaid payment methods. The burst-registration pattern is itself an indicator: honest retailers almost never register several similar domains at once.
Detection note: WHOIS age under 30 days combined with a privacy-masked registrar scores 0.87 accuracy as a feature pair in XGBoost URL classifiers trained on e-commerce fraud data.
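The intro's honeypot domain (one letter changed in a sportswear brand) illustrates the companion signal to registration age: typosquatting. A minimal sketch of that check, using plain Levenshtein distance against an illustrative brand list (the `BRANDS` list and the distance threshold are assumptions, not a production configuration):

```python
# Hypothetical sketch: flag second-level domains within edit distance 1
# of a known brand name. BRANDS and max_dist are illustrative choices.
BRANDS = ['nike', 'adidas', 'reebok', 'puma']

def edit_distance(a: str, b: str) -> int:
    # Classic dynamic-programming Levenshtein distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def typosquat_flag(sld: str, max_dist: int = 1) -> bool:
    # True when the name is a near-miss of a brand but not the brand itself.
    return any(0 < edit_distance(sld, b) <= max_dist for b in BRANDS)
```

A real pipeline would combine this with the WHOIS-age feature rather than use it alone, since many legitimate names sit close to a brand string by coincidence.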

2. TLD and subdomain abuse.

The top-level domain is a surprisingly strong signal. In a corpus of 2.3 million verified fake e-commerce domains, 61% sit on .shop, .store, and .online, with a further 24% on newer generic endings such as .xyz, versus under 4% of genuine UK and US retailers. More sophisticated operators use brand-name subdomains: nike-sale.shop, adidas-outlet.store. The scam asset is the registered domain, while the brand name appears only in the URL.
Detection feature: the brand-in-subdomain feature fires when a brand keyword appears in any subdomain label but not as the second-level domain. This feature scores 0.91 accuracy against labeled retail phishing data.
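The brand-in-subdomain rule is simple enough to sketch in a few lines. This is an illustrative standalone version (the default brand tuple is an assumption):

```python
def brand_in_subdomain(host: str, brands=('nike', 'adidas', 'gucci')) -> int:
    # Fires (returns 1) when a brand keyword appears in a subdomain label
    # while the registered second-level domain is not itself the brand.
    parts = host.lower().split('.')
    if len(parts) < 3:          # no subdomain labels at all
        return 0
    sld = parts[-2]
    subs = parts[:-2]
    hit = any(b in label for label in subs for b in brands)
    return int(hit and sld not in brands)
```

Note that a host like nike-sale.shop does not trigger this particular feature, because the brand keyword sits in the second-level domain; that case is covered by the typosquat and TLD features instead.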

3. Stolen product images and EXIF fingerprinting.

Deceptive storefronts almost never shoot original product photography. They scrape images from authorized retailers, brand websites, or stock libraries. The images often retain EXIF data marking their origin, or, after EXIF stripping, carry perceptual hashes that match known legitimate product photos. A reverse-image query against an index of verified retailer assets, using pHash with a Hamming-distance threshold of 8, can identify scraped images in under 200 ms.
Detection note: product images whose pHash matches legitimate retailer assets, combined with a domain age under 60 days, form a compound feature with 0.94 recall on known fake retail sites.
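The matching step itself is just bit arithmetic. This sketch shows only the Hamming-distance comparison on 64-bit perceptual hashes; computing the pHashes themselves requires an image library, so the hash values here are illustrative integers, not real image hashes:

```python
# Sketch of the pHash matching step only. Real pipelines derive 64-bit
# perceptual hashes from images; here the hashes are placeholder ints.
def hamming(h1: int, h2: int) -> int:
    # Number of differing bits between two 64-bit perceptual hashes.
    return bin(h1 ^ h2).count('1')

def is_scraped(candidate: int, known_hashes, threshold: int = 8) -> bool:
    # A candidate image matches a verified retailer asset when its pHash
    # falls within the Hamming-distance threshold cited above (<= 8).
    return any(hamming(candidate, k) <= threshold for k in known_hashes)
```

At scale, the linear scan over `known_hashes` would be replaced by a BK-tree or multi-index hashing to keep lookups under the 200 ms budget.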

4. Price anomaly scoring

Unrealistic discounting is the fake storefront's main conversion mechanism. An authentic Nike retailer cannot sell Air Max at 73% below RRP; the margin does not exist. Price anomaly detection compares each listed price against a reference price database built from crawls of known retailers, and flags sites whose median discount depth exceeds a threshold. More than 89% of confirmed counterfeit retail sites offer discounts deeper than 60% off known retail prices for brand-name goods.
Detection note: price delta against a validated RRP index feeds directly into the gradient-boosted classifier. Sites whose median discount falls in the top decile rank among the most suspicious across retail-fraud training sets.
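The median-discount computation can be sketched directly. The `RRP_INDEX` below is a toy stand-in for the crawled reference database, and the product IDs are invented for illustration:

```python
from statistics import median

# Illustrative RRP index; production systems build this from retailer crawls.
RRP_INDEX = {'air-max-90': 150.0, 'ultraboost': 190.0, 'speedcat': 100.0}

def median_discount_depth(listings: dict) -> float:
    # listings maps product id -> advertised price; returns the median
    # fractional discount versus the reference RRP (0.0 .. 1.0).
    depths = [1 - price / RRP_INDEX[pid]
              for pid, price in listings.items() if pid in RRP_INDEX]
    return median(depths) if depths else 0.0

def price_anomaly_flag(listings: dict, threshold: float = 0.60) -> bool:
    # Flag sites whose median discount exceeds the 60% depth the article
    # cites for confirmed counterfeit retailers.
    return median_discount_depth(listings) > threshold
```

Using the median rather than the mean keeps a handful of genuine clearance items from dragging an honest retailer over the threshold.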

5. Fabricated contact and policy pages.

Authentic e-commerce platforms publish real, verifiable contact details: addresses that resolve on Google Maps, phone numbers that connect, and email addresses hosted on the site's own domain. Counterfeit storefronts use copy-pasted policy boilerplate (often verbatim from real sites and detectable via n-gram similarity), generic Gmail or ProtonMail contact addresses, and physical addresses that are either nonexistent or shared across hundreds of related scam sites. A single physical address appearing on more than five domains is itself a high-precision fraud indicator.
Detection note: verbatim cloning is caught by cosine similarity between the policy text and a corpus of known legitimate retailer policies. A single address reused across multiple domains triggers a graph-based fraud-cluster flag with 0.96 specificity.
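The text-similarity half of that check fits in a few lines. This is a minimal character-n-gram cosine similarity, a simplified stand-in for whatever representation a production system actually uses (the n-gram size of 3 is an illustrative choice):

```python
import re
from collections import Counter
from math import sqrt

def ngram_cosine(a: str, b: str, n: int = 3) -> float:
    # Cosine similarity over character n-gram counts. Verbatim-cloned
    # policy text scores at or near 1.0; unrelated text scores near 0.
    def grams(text: str) -> Counter:
        text = re.sub(r'\s+', ' ', text.lower()).strip()
        return Counter(text[i:i + n] for i in range(len(text) - n + 1))
    ca, cb = grams(a), grams(b)
    dot = sum(ca[g] * cb[g] for g in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0
```

In practice the candidate policy text would be compared against every entry in the legitimate-retailer corpus, and any score above a high threshold (say, 0.95) would mark the page as cloned.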

6. Payment processor fingerprinting.

Authentic retailers integrate with Stripe, PayPal, Square, or domestic processors, all of which can be verified by API signature during checkout. Fake storefronts often embed raw card-collection forms that POST to attacker-controlled endpoints, or integrate with high-risk processors with a history of servicing fraud. The absence of any recognizable payment-processor API call in the checkout flow is a red flag in itself, detectable by analyzing JavaScript execution in a headless browser.
Detection note: DOM inspection of the checkout-flow JavaScript reveals direct card-data POSTs to non-processor domains. A checkout page that loads neither Stripe.js, the PayPal SDK, nor any other verified processor script raises a high-confidence flag.
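A static version of that check can be approximated with string and regex scanning over the page's JavaScript. This is a rough sketch only: the SDK host list and the URL heuristic are assumptions, and a real system would observe actual network requests in a headless browser rather than grep source text:

```python
import re

# Illustrative list of hosts a verified processor script would come from.
CHECKOUT_SDKS = ['js.stripe.com', 'paypal.com/sdk', 'squareup.com', 'klarna.com']

def checkout_js_flags(page_js: str, page_domain: str) -> dict:
    # Heuristic 1: is any known processor SDK referenced at all?
    proc_present = any(sdk in page_js for sdk in CHECKOUT_SDKS)
    # Heuristic 2: do any hardcoded URLs point at hosts that are neither
    # the page's own domain nor a known processor? (Possible card-data POST.)
    hosts = re.findall(r"https?://([^/\s\"']+)", page_js)
    foreign_post = any(page_domain not in h and
                       not any(sdk.split('/')[0] in h for sdk in CHECKOUT_SDKS)
                       for h in hosts)
    return {'proc_absent': int(not proc_present),
            'foreign_post': int(foreign_post)}
```

Both bits feed the risk score: `proc_absent` alone is suggestive, while `proc_absent` together with `foreign_post` is close to conclusive.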

7. Fabricated reviews and trust badges.

Counterfeit storefronts display Trustpilot stars, Norton Secured badges, and five-star reviews, all of them fake. The trust-badge images are frequently hotlinked across unrelated domains or are plain PNGs with no server-side validation of authenticity: clicking a real Trustpilot badge leads to a verified profile page, while clicking a counterfeit one does nothing or leads to a dead URL. The review text on fake sites is statistically anomalous: sentiment variance is near zero, review dates are unnaturally clustered, and the reviewer profiles do not exist on the claimed platform.
Detection note: badge destinations are validated directly. A Trustpilot badge that does not resolve to a live Trustpilot profile for the domain in question draws maximum suspicion, and a fake-review indicator fires when the sentiment standard deviation across all reviews falls below 0.15.
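The sentiment-uniformity part of that check is a one-liner once the per-review sentiment scores exist. This sketch assumes scores in [-1, 1] coming from some upstream sentiment model (the minimum-sample guard of 3 is an illustrative choice):

```python
from statistics import pstdev

def review_uniformity_flag(sentiments, min_std: float = 0.15) -> bool:
    # sentiments: per-review scores in [-1, 1] from an upstream model.
    # Genuine review sets show spread; templated fakes cluster tightly.
    # The 0.15 threshold comes from the detection note above.
    if len(sentiments) < 3:
        return False          # too few samples to judge either way
    return pstdev(sentiments) < min_std
```

Combined with review-date clustering, this catches the common case of a batch of near-identical five-star reviews posted within the same week.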

Signal Feature Matrix: What Real Detectors Extract

The seven red flags above overlap heavily with the features that commercial and open-source e-commerce fraud classifiers extract at each detection layer.

Putting It Together: A Minimal Feature Extractor

Below is a simplified version of how these signals combine into a single risk score for a candidate e-commerce URL, following the skeleton of the retail-fraud classifiers run at ingest time by Google Safe Browsing and its open-source counterparts:
ecommerce_flag_scorer.py

```python
from urllib.parse import urlparse
from datetime import datetime, timezone
import whois  # third-party: pip install python-whois

BRAND_LIST = ['nike', 'adidas', 'gucci', 'apple', 'samsung', 'zara', 'hm']
RISKY_TLDS = {'shop', 'store', 'online', 'xyz', 'site', 'top', 'club'}
PROC_SDKS  = ['stripe.com', 'paypal.com', 'square.com', 'klarna.com']

def score_ecommerce_url(url: str, page_js: str = '') -> dict:
    p = urlparse(url)
    host = p.hostname or ''
    parts = host.split('.')
    tld = parts[-1]
    sld = parts[-2] if len(parts) >= 2 else ''
    subs = parts[:-2]

    # Feature 1: domain age in days
    try:
        w = whois.whois(host)
        reg = w.creation_date
        if isinstance(reg, list):
            reg = reg[0]
        age = (datetime.now(timezone.utc) - reg.replace(tzinfo=timezone.utc)).days
    except Exception:
        age = 0   # unknown == treat as new

    # Feature 2: brand keyword in a subdomain label, not the registered name
    brand_spoof = int(any(b in '.'.join(subs) for b in BRAND_LIST)
                      and not any(b == sld for b in BRAND_LIST))

    # Feature 3: TLD drawn from the high-abuse set
    risky_tld = int(tld in RISKY_TLDS)

    # Feature 4: no known payment-processor SDK referenced in the page JS
    proc_absent = int(not any(sdk in page_js for sdk in PROC_SDKS))

    risk = (
        0.30 * int(age < 30) +
        0.25 * brand_spoof +
        0.20 * risky_tld +
        0.25 * proc_absent
    )
    return {'domain_age': age, 'brand_spoof': brand_spoof,
            'risky_tld': risky_tld, 'proc_absent': proc_absent,
            'risk_score': round(risk, 3)}
```
In this minimal extractor, any risk score above 0.65 routes the URL to full DOM analysis. Production systems can run it in under 5 ms per submitted URL, restricting expensive downstream analysis to the genuinely ambiguous fraction.

Where Detection Still Fails, and What Fills the Gap

Even a well-calibrated multi-signal classifier has blind spots. The most durable is so-called trusted infrastructure: a bogus storefront built on a Shopify or Wix template, with a paid Stripe integration, stock images licensed from a real library, and a domain registered six months ago under a legitimate-sounding business name. Every automated signal comes back clean. The only thing wrong is the product, which will never ship.
This is where community-based reporting sites provide the coverage automated systems cannot. A real user's report of a site reaches detection networks within minutes, with no feature extraction required. The non-delivery pattern, the merchant fingerprint, and the payment trail are all flagged at once.
Tools such as ScamAlerts.com sit exactly in this gap. Before buying from an unfamiliar site, a domain lookup against a live scam database adds a layer of community intelligence that no statistical classifier can replicate. A domain with ten non-delivery reports in the past 48 hours looks clean to a WHOIS-based detector and red on ScamAlerts.com. The two signals are complementary rather than redundant.

Closing Thoughts

Fake e-commerce sites are as much an engineering problem as a trust problem. The operators who build them make rational cost-minimization choices, and every corner they cut communicates something to a well-designed classifier.
The seven red flags described here are not exhaustive, but they are the highest-signal, hardest-to-evade indicators in existing production classifiers. Once a site triggers three or more of them simultaneously, the probability that it is legitimate is statistically negligible. The real difficulty is running these checks fast enough, and at sufficient scale, that the exploitation window closes before a significant number of victims are harmed.
The gap between automated detection and real-world harm is not being closed by better models alone. It is being closed by combining rapid automated signal extraction with the kind of real-time community intelligence that turns one victim's report into protection for the next ten thousand potential victims.

Further reading and tools

Check any questionable e-commerce site at ScamAlerts.com, a live scam directory combining automated detection with community reporting.
