Passwords verify what you know. Physical biometrics verify what you are. Behavioral biometrics authenticate something that is harder to steal and impossible to borrow: how you move through a system. This article covers what the signal layer looks like, how the models work, and where the boundaries lie.
In 2021, a major UK retail bank flagged a transaction that looked entirely legitimate on every static credential dimension. The session came from a known, authenticated device. The IP address matched the account's location history. The password was correct. The two-factor authentication code was correct. The account belonged to a 67-year-old retired accountant in Leeds.
None of that was what triggered the flag. The typing pattern was.
The account holder typed with two fingers, a behavioral signature measured consistently across dozens of past sessions. The session that triggered the flag was conducted with ten-finger touch-typing at 94 words per minute. The credentials were legitimate. The person using them was not the account holder. The transaction was a £14,000 wire transfer initiated through authorized push payment fraud, a scam in which the account holder had been socially engineered into handing credentials to a fraudster, who was now using them in a concurrent session. The credential verification stack had no layer capable of catching the behavioral mismatch.
This is the essence of the behavioral biometrics value proposition in scam detection: not that it replaces credential authentication, but that it offers a continuous, session-level identity signal that static credentials cannot provide. For engineers building fraud prevention infrastructure, understanding how that signal works, technically and operationally, is increasingly important.
The Signal Layer: What Behavioral Biometrics Measures
Behavioral biometrics refers to the category of authentication and anomaly detection methods that operate on how a user interacts with a device or interface, rather than on what they know or what physical traits they have. The signals come in several forms, each capturing a different dimension of interaction behavior, with different noise characteristics and discriminative power.
Keystroke Dynamics
Keystroke dynamics has two main feature classes: dwell time (how long a key is held down) and flight time (the interval between releasing one key and pressing the next). Measured at millisecond resolution across a typing sequence, these features produce a time series that is highly individual and consistent across a given user's sessions. Digraph and trigraph timing distributions, the latencies between specific key pairs, are the most discriminative, because they reflect the neuromuscular programming of habituated typing rather than deliberate choice.
The core statistical modeling problem in keystroke dynamics is intra-user variability. Typing speed and rhythm shift with fatigue, emotional state, keyboard hardware, and environment. A robust model must distinguish legitimate within-user variance from anomalous deviation that indicates a different user, a non-trivial classification problem that current approaches address with Gaussian mixture models, recurrent neural networks over temporal sequences, and one-class classifiers trained on the target user's enrollment data.
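To make the dwell/flight distinction concrete, here is a minimal sketch of feature extraction and a simple per-user anomaly check. The event format, the enrolled profile statistics, and the z-score threshold are all illustrative assumptions, not production values.

```python
# Sketch: keystroke dwell/flight extraction plus a naive per-user
# anomaly score. Event format and thresholds are illustrative.
from statistics import mean

def extract_features(events):
    """events: list of (key, press_ms, release_ms) in session order.
    Returns dwell times and flight times (release -> next press)."""
    dwells = [release - press for _, press, release in events]
    flights = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]
    return dwells, flights

def anomaly_score(sample, profile_mean, profile_std):
    """Mean absolute z-score of session timings against the enrolled profile."""
    return mean(abs(x - profile_mean) / profile_std for x in sample)

# Hypothetical enrolled two-finger typist: slow, variable dwells (~180 ms).
profile_mean, profile_std = 180.0, 25.0

# Current session: fast touch-typist dwells (~80 ms).
session = [("t", 0, 82), ("h", 120, 198), ("e", 240, 322)]
dwells, _ = extract_features(session)
score = anomaly_score(dwells, profile_mean, profile_std)
print(round(score, 2))  # 3.97 -- well above a typical z-score cutoff of ~2
```

A production system would score digraph-specific latencies rather than a single pooled dwell distribution, but the shape of the comparison is the same.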
Mouse Movement Dynamics
Mouse movement patterns encode behavioral cues at several levels of abstraction. At the raw signal level: cursor velocity profiles, acceleration and deceleration curves, angular deviation from straight-line trajectories, and micro-tremor features of hand movement. At the interaction level: click pressure distribution, double-click timing, scrolling behavior, and the spatial relationship between the cursor's resting position and the next interaction target.
Mouse dynamics is particularly useful for bot detection as well as identity verification. Automated form-filling software, credential-stuffing scripts, and browser automation frameworks generate movement and click patterns with statistical properties distinctly unlike human pointer movement: unnaturally linear trajectories, instantaneous velocity changes, mathematically regular click intervals, and the absence of the sub-pixel jitter characteristic of organic hand movement.
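Two of these indicators are easy to sketch: path linearity (straight-line distance divided by distance actually traveled) and the regularity of click intervals. The sample paths and cutoff values below are invented for illustration.

```python
# Sketch: two simple bot indicators from pointer data -- path linearity
# and click-interval regularity. Thresholds are illustrative only.
import math

def path_linearity(points):
    """~1.0 for perfectly straight (bot-like) paths, lower for human jitter."""
    traveled = sum(math.dist(points[i], points[i + 1])
                   for i in range(len(points) - 1))
    direct = math.dist(points[0], points[-1])
    return direct / traveled if traveled else 1.0

def interval_cv(timestamps_ms):
    """Coefficient of variation of click intervals; near 0 means
    mathematically regular timing, typical of scripted clicks."""
    gaps = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    m = sum(gaps) / len(gaps)
    var = sum((g - m) ** 2 for g in gaps) / len(gaps)
    return math.sqrt(var) / m

bot_path = [(0, 0), (50, 50), (100, 100), (150, 150)]    # perfectly linear
human_path = [(0, 0), (48, 61), (105, 94), (150, 150)]   # curved, jittery

lin_bot = path_linearity(bot_path)
lin_human = path_linearity(human_path)
cv = interval_cv([0, 200, 400, 600])                     # scripted clicks
print(lin_bot > 0.99, lin_human > 0.99, cv < 0.05)       # True False True
```

Real detectors combine many such statistics; a single metric is trivially spoofable, but each added dimension raises the cost of simulating human motion.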
Touch and Gesture Dynamics
On mobile devices the signal space expands considerably. Touch contact area, finger pressure distribution, swipe velocity and curvature, scroll inertia, and tap timing patterns can all be measured through standard device APIs. The device's motion sensors, the accelerometer and gyroscope, are passive channels that encode how the user grips and moves the device during interaction. Combining touch dynamics with device orientation yields a high-dimensional behavioral signature that is both highly distinctive and continuously present, requiring no explicit action or enrollment step from the user.
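A minimal sketch of how such a signature vector might be assembled from touch events. The field names and the (mean, standard deviation) summary per channel are assumptions; real systems use far richer feature sets.

```python
# Sketch: assembling a touch-dynamics feature vector from events a mobile
# SDK might collect. Field names and the feature set are assumptions.
from statistics import mean, pstdev

def touch_features(taps):
    """taps: list of dicts with 'area', 'pressure', 'duration_ms'.
    Summarizes each channel as (mean, std), concatenated into one vector."""
    vec = []
    for field in ("area", "pressure", "duration_ms"):
        values = [t[field] for t in taps]
        vec += [mean(values), pstdev(values)]
    return vec

taps = [
    {"area": 11.2, "pressure": 0.42, "duration_ms": 96},
    {"area": 10.8, "pressure": 0.45, "duration_ms": 102},
    {"area": 11.5, "pressure": 0.40, "duration_ms": 88},
]
vec = touch_features(taps)
print(len(vec))  # 6: (mean, std) per channel
```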
Model Architecture: From Raw Signal to Anomaly Score
The pipeline from raw behavioral event stream to actionable fraud signal involves several processing stages, each with its own architectural tradeoffs. A representative production pipeline works as follows:
- Event stream collection: Browser-side JavaScript or a mobile SDK gathers raw interaction events (keydown/keyup, mousemove, and touch events with their properties), buffers them client-side, and transmits them periodically to the analysis server. Sampling rate is a design parameter: higher rates improve discriminative resolution but increase bandwidth and processing overhead.
- Feature extraction: Raw data is converted into feature vectors that summarize the behavioral patterns statistically: mean and variance of dwell times for specific key pairs, velocity distribution parameters for mouse movement segments, pressure profile statistics for touch events. Feature engineering is critical here: the feature space must be behaviorally meaningful and robust to hardware-induced variance.
- Profile construction and maintenance: User behavioral profiles are built from enrollment data and updated over time across sessions via exponential moving averages or online learning algorithms. Profile maintenance must accommodate legitimate behavioral drift (a user recovering from a hand injury, switching device types, or undergoing stress-related changes) without treating long-term authentic change as an anomaly.
- Anomaly scoring: Current-session feature vectors are compared against the stored profile using distance metrics, Mahalanobis distance for multivariate normal profiles or neural-network similarity scoring for deep representation approaches. The output is a continuous anomaly score rather than a binary match/no-match decision, feeding a risk-stratified response system.
- Risk-stratified response: Scores crossing threshold levels trigger responses calibrated to the fraud risk and false-positive cost at each tier: invisible monitoring (low score), step-up authentication (medium score), and session termination with manual-review flagging (high score).
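The scoring and response stages above can be sketched as follows. This uses a Mahalanobis distance simplified to a diagonal covariance (per-feature variances); the feature semantics, profile values, and thresholds are illustrative assumptions.

```python
# Sketch: diagonal-covariance Mahalanobis scoring against a stored
# profile, mapped to a tiered response. All numbers are placeholders.
import math

def mahalanobis_diag(x, mu, var):
    """Mahalanobis distance assuming a diagonal covariance matrix."""
    return math.sqrt(sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mu, var)))

def respond(score, low=2.0, high=4.0):
    """Risk-stratified response: monitor, step-up, or terminate."""
    if score < low:
        return "monitor"
    if score < high:
        return "step_up_auth"
    return "terminate_and_review"

profile_mean = [150.0, 0.45, 320.0]   # e.g. dwell ms, pressure, velocity
profile_var = [400.0, 0.01, 2500.0]

session = [148.0, 0.47, 310.0]        # close to the enrolled profile
attacker = [80.0, 0.20, 520.0]        # far from the enrolled profile

score_session = mahalanobis_diag(session, profile_mean, profile_var)
score_attacker = mahalanobis_diag(attacker, profile_mean, profile_var)
print(respond(score_session), respond(score_attacker))
# monitor terminate_and_review
```

The continuous score is the key design point: thresholds can be tuned per tier against measured false-positive costs instead of forcing a single match/no-match cutoff.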
Mapping to Scam-Specific Attack Patterns
Behavioral biometrics has different value profiles across different scam attack types. Understanding which attack patterns it handles well and which it does not is essential for deciding where it fits in a layered fraud prevention stack.
- Authorized push payment fraud: The scenario from the opening, where a fraudster uses authorized credentials in a parallel session, is where behavioral biometrics delivers its clearest value. A behavioral-signature mismatch between the enrolled account holder and an attacker using stolen credentials is a high-confidence fraud signal. This attack class bypasses static credential verification by design; behavioral verification is one of the few mechanisms that can catch it post-authentication.
- Credential stuffing and account takeover: Automated credential-stuffing attacks use bot frameworks to test credential lists at scale. Automation tools exhibit statistically distinct pointer and keystroke dynamics relative to human behavior and can be detected with high confidence. Combined with velocity analysis and device fingerprinting, behavioral biometric bot detection provides a layer that rate limiting alone cannot.
- Session hijacking: When a stolen authenticated session token is reused by a different client, the behavioral profile of the subsequent session will not match the authenticated user's signature, especially when the attacker is not driving the session through automation. Continuous session-level behavioral monitoring detects this attack class; point-in-time authentication cannot.
- Coached social engineering: When a legitimate user is being coached in real time by a scammer (told what to type, where to click, what access to grant), their behavioral pattern often becomes anomalous: atypical hesitation patterns, unusual navigation paths, and dwell time distributions inconsistent with their established profile. The behavioral layer can surface stress and coercion indicators that no credential system can detect.
Adversarial Robustness: Can Behavioral Biometrics Be Gamed?
Any identity system deployed at scale becomes an adversarial target. The right question for behavioral biometrics is not the yes/no of whether it can be bypassed, but whether it can be bypassed operationally, at the scale and cost structure of fraud campaigns.
Theoretical circumvention methods exist. An attacker with prolonged behavioral surveillance of a target user, via malware such as a keylogger capturing not just credentials but the full timing stream of events, could construct a behavioral replay attack that simulates the enrolled signature. Adversarial machine learning research has shown that behavioral biometric models can be fooled by carefully crafted input in white-box settings with full model access.
In practice, operational constraints sharply limit these attacks. Full behavioral replay requires recording the complete high-resolution event stream of prior sessions, a surveillance operation far more involved than stealing a credential. The captured behavioral profile is also context-dependent: a profile recorded during low-stress email activity may not transfer to a high-stakes financial transaction, where the user's behavioral pattern legitimately differs. And because the model updates continuously with new sessions, a captured profile ages quickly.
Behavioral biometrics holds up not because it is theoretically invulnerable, but because the cost and complexity of circumventing it are high enough to be economically prohibitive for most fraud schemes, which depend on scale and low per-target cost.
Community Intelligence and Integration Architecture
Behavioral biometrics should be deployed as one signal in a multi-signal fraud prevention architecture, not as a standalone system. Its discriminative value is greatest when its output is combined with contextual signals: transaction value, geographic anomaly, device fingerprint mismatch, velocity patterns, and threat intelligence obtained through channels behavioral analysis cannot reach.
This is where community-sourced intelligence systems add architectural value that complements behavioral detection. A behavioral biometric system can detect that the operator of the current session is not the account holder. It cannot tell you whether the site the user visited before this session was a phishing site that harvested their credentials. Those upstream threat indicators, which scam campaign is live, which credential-harvesting infrastructure is in use, which fraudulent sites are currently targeting specific demographics, come from community-reported incident data aggregated by platforms such as Scam Alerts, where users targeted by an attack vector report it in near real time.
The complementarity is architectural: behavioral biometrics verifies identity at the session level in a way static credentials cannot, and community threat intelligence supplies attack-vector context that behavioral signals cannot infer. Fusing both sources yields a risk score more accurate than either source alone.
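One simple fusion strategy is a weighted combination of the behavioral anomaly score with binary threat-intelligence signals. The signal names, weights, and scale below are illustrative assumptions; production systems typically learn these weights from labeled fraud outcomes.

```python
# Sketch: fusing a session behavioral anomaly score with community-sourced
# threat intelligence. Weights and signal names are illustrative.
def fused_risk(behavioral_anomaly, referrer_reported, device_mismatch,
               w_behavior=0.5, w_intel=0.3, w_device=0.2):
    """behavioral_anomaly in [0, 1]; the other signals are booleans
    (e.g. 'referrer appears in community phishing reports')."""
    return (w_behavior * behavioral_anomaly
            + w_intel * float(referrer_reported)
            + w_device * float(device_mismatch))

# Anomalous typing alone: suspicious but ambiguous.
low_risk = fused_risk(0.8, referrer_reported=False, device_mismatch=False)
# Same anomaly plus a referrer flagged as phishing and a new device.
high_risk = fused_risk(0.8, referrer_reported=True, device_mismatch=True)
print(round(low_risk, 2), round(high_risk, 2))  # 0.4 0.9
```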
Privacy Architecture: A First-Class Design Constraint
Any production treatment of behavioral biometrics must address privacy architecture explicitly. Behavioral profiles are personal data, subject to the GDPR and analogous frameworks, with specific implications for storage, processing consent, and data subject rights. Continuous, passive collection of interaction behavior without explicit user awareness raises informed-consent questions that vary significantly by jurisdiction.
Production systems typically address this through a combination of on-device processing (keeping raw event data on the client and transmitting only derived feature vectors), differential privacy techniques for profile storage, defined retention limits tied to session or account lifecycle, and explicit disclosure in privacy policies. The on-device processing model has both privacy and latency benefits: edge feature extraction minimizes sensitive data transmission and cuts round-trip latency in the anomaly scoring pipeline for low-complexity classifiers.
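A minimal sketch of the on-device pattern: raw timings are summarized locally and Laplace noise is added to the summary before upload, one simple way to approximate the differential-privacy step. The epsilon, sensitivity, and feature choice are assumptions for illustration.

```python
# Sketch: on-device summarization with Laplace noise before upload.
# Raw events never leave the client; epsilon/sensitivity are assumptions.
import math
import random

def private_features(raw_dwells_ms, epsilon=1.0, sensitivity=5.0):
    """Summarize raw timings on-device, then perturb the summary with
    Laplace(0, sensitivity/epsilon) noise before transmission."""
    summary = sum(raw_dwells_ms) / len(raw_dwells_ms)
    u = random.random() - 0.5
    noise = -(sensitivity / epsilon) * math.copysign(1, u) * math.log(1 - 2 * abs(u))
    return summary + noise

random.seed(0)                  # deterministic for the demo only
raw = [150, 162, 148, 155]      # raw dwell times, kept on the device
result = private_features(raw)  # the only value that is uploaded
print(round(result, 1))
```

The tradeoff is the usual one: more noise (smaller epsilon) means stronger privacy but a coarser behavioral signal, so the noise budget has to be set against the discriminative resolution the anomaly model needs.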
Systems that treat privacy architecture as a first-class requirement are more likely to be deployable across regulatory jurisdictions and more acceptable to the users whose behavioral data is processed, a precondition for the enrollment coverage that makes the detection signal meaningful at scale.
What the Leeds Case Proves
The Leeds fraud case resolved cleanly. The £14,000 transfer was blocked on the behavioral anomaly flag before any money moved, the session was terminated, and the account holder was notified through an out-of-band channel. The fraudster had defeated every static layer of the authentication stack. The layer they could not defeat, the one that encoded how a particular human had moved through thousands of previous sessions, was the one that mattered.
Combined with real-time threat intelligence from community-sourced platforms such as Scam Alerts, which exposes the upstream attack vectors that credential theft builds on, behavioral biometrics addresses the fundamental weakness of all credential-based systems: they authenticate what a user possesses, not who is using it. Credentials can be stolen. Behavior does not transfer.
The scammer arrived with the right key. The lock knew it was the wrong hand.