James Smith

How Fraudsters Exploit Social Engineering Online

A technical analysis of the psychology, automation, and detection mechanics behind online manipulation attacks.

In September 2023, a security engineer at a large US-based technology company received a Slack message from someone claiming to be a coworker in the IT department. The message named a real internal system, used the right internal vocabulary, and arrived at 4:47 PM on a Friday, when attention is lowest and the urge to wrap things up before the weekend is strongest.
The message asked the engineer to approve a routine MFA reset for a locked-out colleague. The engineer approved it. Within forty minutes, the attacker had moved through three internal systems and exfiltrated source code from a private repository.
The attacker had not broken any cryptography. They had not exploited a software vulnerability. They had simply understood human psychology well enough to model it and automate it at scale. This is social engineering in its modern form, and it is no longer the craft of a lone con artist. It is an engineering discipline with repeatable methods, measurable conversion rates, and automated delivery infrastructure.

The Technical Stack Behind Modern Social Engineering Attacks

What makes contemporary online social engineering categorically different from its predecessors is the degree of automation and personalization now operationally available. A campaign that once took a skilled fraud team weeks of harvesting and preparation can now be launched in under an hour by a single operator using commodity tools.
The stack has four layers, each playing a distinct role in the attack pipeline:

Layer 1: OSINT Harvest

The foundation is open-source intelligence collection. Before a single message is sent, automated tools scrape LinkedIn profiles for job titles and reporting structures; public Git repositories for technology-stack details; company press releases for recent events that can serve as context anchors; and social media for personal details that lend credibility to later messages.
Tools such as Maltego and custom LinkedIn scrapers can assemble a target dossier within minutes: known colleagues, recent projects, internal system names gleaned from job postings, and communication patterns inferred from the timing of public posts.
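The output of this layer is essentially a structured dossier merged from independent scrapers. A minimal sketch, with hypothetical field names standing in for whatever a real scraper would return:

```python
from dataclasses import dataclass, field

@dataclass
class TargetDossier:
    """Aggregated OSINT profile for one target (fields are illustrative)."""
    target_id: str
    job_title: str = ""
    colleagues: list[str] = field(default_factory=list)       # LinkedIn graph
    internal_systems: list[str] = field(default_factory=list) # job postings
    recent_projects: list[str] = field(default_factory=list)  # press releases
    active_hours: list[int] = field(default_factory=list)     # public post times

def merge_sources(target_id: str, *sources: dict) -> TargetDossier:
    """Fold partial results from independent scrapers into one dossier."""
    dossier = TargetDossier(target_id=target_id)
    for src in sources:
        dossier.job_title = src.get("job_title", dossier.job_title)
        dossier.colleagues += src.get("colleagues", [])
        dossier.internal_systems += src.get("internal_systems", [])
        dossier.recent_projects += src.get("recent_projects", [])
        dossier.active_hours += src.get("active_hours", [])
    return dossier
```

The point is not the data structure itself but how cheap the merge step is: each scraper runs independently, and the dossier is complete the moment the slowest one returns.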

Layer 2: Persona Synthesis

LLMs have reinvented the persona-construction layer. In 2021, creating a convincing fake colleague took careful manual craft. By 2024, a GPT-4-class model fine-tuned on samples of a target's communication style can generate messages indistinguishable from genuine internal communication, at scale. The synthetic persona covers more than message content: attacks are timed to land at contextually plausible moments, like the 4:47 PM Friday message above.
Automated persona generators now model conversational turn-taking, response-time distributions, and vocabulary frequencies from scraped data, producing a communication fingerprint close enough to the impersonated person to survive casual scrutiny.
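A vocabulary-frequency fingerprint of the kind described above can be approximated in a few lines. This is a deliberately simplified sketch, assuming unigram frequencies only; a real system would also model timing and turn-taking:

```python
from collections import Counter

def vocab_fingerprint(messages: list[str]) -> dict[str, float]:
    """Relative unigram frequencies across a corpus of messages."""
    counts = Counter(word for msg in messages for word in msg.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def fingerprint_overlap(a: dict[str, float], b: dict[str, float]) -> float:
    """Shared probability mass between two fingerprints (1.0 = identical)."""
    return sum(min(a.get(w, 0.0), b.get(w, 0.0)) for w in set(a) | set(b))
```

An attacker tunes generation until the synthetic fingerprint's overlap with the scraped one approaches 1.0; a defender can run the same comparison in reverse to spot a mimic that is too close to the template but wrong elsewhere.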

Six Cognitive Biases Social Engineering Attacks Exploit

Social engineering is, at its core, applied cognitive science. Each technique maps onto one or more well-documented cognitive biases. Understanding this mapping is what lets both human defenders and detection systems anticipate attack patterns before they land.


These biases are not independent. The most effective attacks stack several at once: an impersonated authority figure (authority bias) presents an urgent matter (urgency/fear), references a real colleague who has supposedly already been briefed (social proof), and imposes a fixed deadline (scarcity). Each layer compounds the others.
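The same mapping cuts both ways for defenders. A toy heuristic can score how many bias triggers a message stacks; the marker lists below are illustrative, not a vetted lexicon, and a production system would use a trained classifier instead:

```python
# Illustrative keyword markers per bias; placeholders, not a real lexicon.
BIAS_MARKERS = {
    "urgency":      {"immediately", "asap", "right now", "deadline", "expires"},
    "authority":    {"ceo", "director", "it department", "compliance"},
    "social_proof": {"everyone", "already", "rest of the team"},
    "scarcity":     {"last chance", "only today", "limited"},
}

def bias_stack(message: str) -> list[str]:
    """Return the biases whose markers appear in the message text."""
    text = message.lower()
    return [bias for bias, markers in BIAS_MARKERS.items()
            if any(m in text for m in markers)]
```

A message that triggers three or more biases at once is itself a strong signal, regardless of what it actually asks for.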

How AI Industrialized Personalization

The shift from artisanal to industrial social engineering is best explained by conversion-rate economics. A hand-crafted spear-phishing email draws a response from roughly 3-5% of recipients. An AI-generated message that incorporates OSINT context, matches the target's known communication style, and lands at the optimal time achieves 14-26% engagement in reported red-team exercises.
In pseudocode, the pipeline that produces this outcome looks like:

social_engineering_pipeline.py

```python
def build_attack_message(target_id: str) -> AttackPayload:
    # Phase 1: gather target context
    profile = osint_scraper.build_profile(target_id)
    colleagues = linkedin_graph.get_first_degree(target_id)
    style_model = llm.fine_tune(
        base_model='gpt-4',
        samples=profile.public_messages,
        task='style_transfer'
    )

    # Phase 2: select trigger stack
    biases = bias_selector.pick_optimal(
        role=profile.job_title,
        platform=SLACK,
        time_of_day=optimal_send_time(profile)
    )

    # Phase 3: synthesise message
    msg = style_model.generate(
        persona=random.choice(colleagues),
        trigger_stack=biases,
        payload=CREDENTIAL_HARVEST_URL,
        context_anchor=profile.recent_projects[0]
    )

    return AttackPayload(message=msg, send_time=optimal_send_time(profile))
```

What makes this pipeline alarming is that it scales horizontally: the marginal cost of adding another target is effectively zero. A single operator running this infrastructure can run thousands of personalized attacks in parallel, each as contextually convincing as one a skilled social engineer might have composed by hand.

Detection Mechanics: How Systems Fight Back

Detecting social engineering is an adversarial classification problem over a very different feature space than URL-based phishing detection. The signals are behavioral and semantic rather than structural.

Message content: semantic analysis

NLP classifiers trained on known social engineering corpora extract features that human readers process only implicitly: urgency density (the ratio of urgency-coded tokens to total message length), authority claims (named-entity recognition flagging impersonation of organizational roles), and action specificity (requests naming concrete actions such as entering credentials or transferring funds score higher than vague ones).
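A minimal version of the urgency-density feature looks like the sketch below. The token list is illustrative; a real classifier would learn its lexicon from labeled corpora rather than hard-code it:

```python
# Illustrative urgency lexicon; a production system learns this from data.
URGENCY_TOKENS = {"urgent", "immediately", "asap", "now", "today",
                  "expires", "deadline", "locked", "suspended"}

def urgency_density(message: str) -> float:
    """Ratio of urgency-coded tokens to total tokens in the message."""
    tokens = message.lower().split()
    if not tokens:
        return 0.0
    hits = sum(t.strip(".,!?") in URGENCY_TOKENS for t in tokens)
    return hits / len(tokens)
```

A short message packed with time pressure ("Approve now, account expires today") scores far above the flagging thresholds typically cited, while routine chatter scores near zero.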

Behavioral anomaly detection

Enterprise communication security systems model baseline communication patterns between users. A message from a known colleague that diverges sharply from their historical pattern, measured as cosine similarity against a rolling TF-IDF profile, triggers a review flag. The model does not need to know the content is malicious, only that the style is anomalous.
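The style check reduces to a cosine similarity between the incoming message and the sender's rolling term profile. A simplified sketch using raw term counts rather than full TF-IDF weighting:

```python
import math
from collections import Counter

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def style_delta(history: list[str], new_message: str) -> float:
    """1 - similarity to the historical profile; higher = more anomalous."""
    profile = Counter(w for msg in history for w in msg.lower().split())
    incoming = Counter(new_message.lower().split())
    return 1.0 - cosine_similarity(profile, incoming)
```

A message whose delta jumps well above the sender's usual range is queued for review even when every individual word in it is innocuous.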

Timing pattern analysis

The attack pipeline's send-time optimization is itself a detectable signal. Attack messages cluster in high-susceptibility windows: late Friday afternoons, the first half hour of Monday mornings, and periods around company-wide announcements. A message with both an anomalous send time and an anomalous urgency profile scores higher in anomaly classifiers before content analysis even runs.
Key detection features
• urgency_density: time-pressure tokens / message length (flag at ≥ 0.15)
• authority_entity_mismatch: sender domain vs. claimed organizational identity
• style_cosine_delta: deviation from the sender's historical TF-IDF style profile
• action_specificity_score: concreteness of the requested action (credential / payment / transfer)
• send_time_anomaly: KL divergence between the sender's historical and current send-time distributions
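The send-time feature is a KL divergence between the sender's historical hour-of-day distribution and the distribution of their recent messages. A minimal sketch with add-one smoothing, which is needed because KL divergence is undefined on zero-probability bins:

```python
import math
from collections import Counter

def hour_distribution(hours: list[int], smoothing: float = 1.0) -> list[float]:
    """Smoothed probability distribution over the 24 hours of the day."""
    counts = Counter(hours)
    total = len(hours) + 24 * smoothing
    return [(counts.get(h, 0) + smoothing) / total for h in range(24)]

def kl_divergence(p: list[float], q: list[float]) -> float:
    """D_KL(p || q): expected surprise of q when samples come from p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def send_time_anomaly(historical: list[int], recent: list[int]) -> float:
    """Higher score = recent send times diverge more from the baseline."""
    return kl_divergence(hour_distribution(recent),
                         hour_distribution(historical))
```

A burst of 2 AM messages from an account that has only ever posted during business hours produces a sharply higher score than a few messages inside the usual window.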

The Gap Automated Detection Cannot Close

Enterprise detection systems, including CASB platforms, email security gateways, and communication anomaly detectors, are calibrated primarily for internal corporate environments. They rely on pre-existing data: communication history, known user pairs, and internal directory hierarchies.
Their coverage of the consumer attack surface is minimal. A social engineering attack delivered as a fake investment site, a WhatsApp message from a cloned identity, a bogus job listing on a legitimate board, or a romance-scam account on a dating app sits entirely outside the detection perimeter of enterprise security tooling.
This is the gap community-based verification sites fill. When a social engineering campaign spins up a fake website, a spoofed sender domain, or a scam phone number, consumer reports form a real-time signal that propagates to databases such as ScamAlerts, which aggregates both automatically detected signals and community reports to provide coverage where enterprise tooling falls short.
A target who checks a suspicious message's domain on ScamAlerts.com before acting adds a layer of real-time community intelligence that no internal security control can offer, especially since social engineering campaigns often use newly registered infrastructure with no blocklist history but, potentially, existing victim reports.

The Architecture of Deception and Its Limit

Social engineering is a systems problem. The attacker runs an optimization loop: test message variants, measure conversion rates, refine the bias stack, refine the persona model, repeat. It is gradient descent on human psychology.
Defense is a systems problem too. It means combining automated anomaly detection at the communication layer, semantic classification of message content, timing-pattern analysis, and community verification of the infrastructure scammers stand up. No single layer suffices; sophisticated attackers have already modeled and bypassed each one individually.
What the attacker cannot easily model is the combination, especially when it includes the unpredictability of a well-informed human who knows how the attack pipeline works, hesitates before acting on urgency, and independently verifies any request involving credentials or money.
The Slack message that fooled the security engineer succeeded not because the engineer was uninformed, but because the attack was timed, contextualized, and framed so that verification did not feel necessary. Once the mechanics are understood, the need to verify becomes intuitive. And when verification is a reflex rather than an exception, the conversion rates that make social engineering economically viable collapse.

Tools and further reading

• ScamAlerts.com: a live scam database combining automated signals with community reports; check any questionable link, domain, or contact here before acting.
• MITRE ATT&CK: social engineering technique catalogue (TA0001, T1566 series).
• Cialdini, R. (1984). Influence: The Psychology of Persuasion. Foundational cognitive-bias taxonomy.
• SANS Social Engineering Prevention Guide: enterprise detection configuration guidance.
