Bank of Scotland was fined £160K for a Cyrillic transliteration failure. Here's the technical breakdown.

#webdev #fintech #api #security

In January 2026, the UK's Office of Financial Sanctions
Implementation (OFSI) fined Bank of Scotland £160,000.
24 payments went through to a designated Russian individual.
Root cause: the screening tool couldn't match Cyrillic
transliteration variants.

This wasn't negligence. It was a technical failure that
most sanctions screening tools still have today.

Why Cyrillic matching fails

There are multiple competing standards for Cyrillic → Latin
transliteration: BGN/PCGN (used by US/UK governments), ISO 9,
GOST, ICAO, and dozens of informal spellings.

A single name like "Шварц" legitimately appears as:

Shvarts
Shvartz
Schwarz
Shvarc
Svarc

Every one of them is defensible, depending on which standard
or informal convention produced it. Most screening tools pick
one. If the watchlist entry uses BGN/PCGN and the customer's
passport uses ICAO, the two spellings can differ and the tool
misses the match. That miss cost Bank of Scotland £160K.
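
A minimal sketch of that failure mode in Python. The two letter tables are deliberately tiny (they cover only the letters in "Шварц") and simplified: BGN_PCGN uses the BGN/PCGN values for these letters, and ISO9_ASCII is ISO 9 with the diacritic on "š" folded to plain "s", which is an assumption about how a system might reduce "Švarc" to ASCII. All function names are hypothetical.

```python
# Simplified, partial letter tables: only the letters needed for "Шварц".
BGN_PCGN = {"ш": "sh", "в": "v", "а": "a", "р": "r", "ц": "ts"}
ISO9_ASCII = {"ш": "s", "в": "v", "а": "a", "р": "r", "ц": "c"}  # ISO 9 "š" folded to "s"

SCHEMES = {"bgn_pcgn": BGN_PCGN, "iso9_ascii": ISO9_ASCII}


def transliterate(name: str, table: dict[str, str]) -> str:
    """Map each Cyrillic letter through one table; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in name.lower())


def variants(name: str) -> set[str]:
    """Latin renderings of a Cyrillic name under every scheme we know about."""
    return {transliterate(name, table) for table in SCHEMES.values()}


def single_scheme_match(cyrillic: str, latin: str) -> bool:
    """What a one-standard screener does: compare against BGN/PCGN only."""
    return transliterate(cyrillic, BGN_PCGN) == latin.lower()


def multi_scheme_match(cyrillic: str, latin: str) -> bool:
    """Compare against every scheme's rendering instead of just one."""
    return latin.lower() in variants(cyrillic)


print(sorted(variants("Шварц")))               # ['shvarts', 'svarc']
print(single_scheme_match("Шварц", "Svarc"))   # False: the miss
print(multi_scheme_match("Шварц", "Svarc"))    # True
```

Exact comparison against generated variants still won't catch informal spellings like "Shvartz" or "Schwarz". Those need either more tables or a fuzzy step on top.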

The patronymic problem

Russian names have three parts: given name, patronymic,
and surname.

"Ivan," "Ivanov," and "Ivanovich" are different name parts,
and they usually point to different people:

Ivan → given name
Ivanov → surname ("of Ivan")
Ivanovich → patronymic ("son of Ivan")

A naive fuzzy matcher sees 70%+ character overlap and scores
them as near-matches. This floods compliance queues with
false positives while simultaneously missing real hits.
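
Here is a rough Python sketch of the problem and one possible countermeasure. It uses difflib.SequenceMatcher from the standard library as a stand-in for "a naive fuzzy matcher". The suffix lists, the name_part classifier, and the 0.5 penalty are all hypothetical and far from complete (real romanized names need many more rules).

```python
from difflib import SequenceMatcher

# Hypothetical, incomplete suffix lists for romanized Russian names.
PATRONYMIC_SUFFIXES = ("ovich", "evich", "ovna", "evna")
SURNAME_SUFFIXES = ("ov", "ev", "ova", "eva")


def name_part(token: str) -> str:
    """Guess which part of a Russian name a token is, from its suffix alone."""
    t = token.lower()
    if t.endswith(PATRONYMIC_SUFFIXES):   # check the longer suffixes first
        return "patronymic"
    if t.endswith(SURNAME_SUFFIXES):
        return "surname"
    return "given"


def naive_score(a: str, b: str) -> float:
    """Plain character-overlap similarity, blind to name structure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def penalized_score(a: str, b: str, penalty: float = 0.5) -> float:
    """Scale the score down when the two tokens look like different name parts."""
    score = naive_score(a, b)
    if name_part(a) != name_part(b):
        score *= penalty
    return score


for a, b in [("Ivan", "Ivanov"), ("Ivanov", "Ivanovich"), ("Ivan", "Ivanovich")]:
    print(a, b, round(naive_score(a, b), 2), round(penalized_score(a, b), 2))
# Ivan Ivanov 0.8 0.4
# Ivanov Ivanovich 0.8 0.4
# Ivan Ivanovich 0.62 0.31
```

The penalty only fires when the guessed parts differ, so "Ivanov" against "Ivanov" still scores 1.0.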

The "Mohammed problem"

Arabic has 12+ formal romanization systems: ALA-LC, ISO 233,
UNGEGN, BGN/PCGN, DIN 31635...

A single Arabic name can produce 300+ valid Latin spellings.
"Mohammed," "Muhammad," "Mohamed," "Mehmet," "Muhamad":
the same underlying name, rendered through different systems
and languages.
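
To make the variation concrete, here is a crude consonant-skeleton key in Python. This is not Beider-Morse and not any romanization standard, just an illustrative grouping trick: drop the vowels, collapse doubled consonants, and see which spellings land on the same key. The skeleton_key name is made up for this sketch.

```python
from itertools import groupby

VOWELS = set("aeiou")


def skeleton_key(name: str) -> str:
    """Drop vowels, then collapse runs of the same consonant into one."""
    consonants = [ch for ch in name.lower() if ch.isalpha() and ch not in VOWELS]
    return "".join(ch for ch, _ in groupby(consonants))


for spelling in ["Mohammed", "Muhammad", "Mohamed", "Muhamad", "Mehmet"]:
    print(spelling, "->", skeleton_key(spelling))
# Mohammed -> mhmd
# Muhammad -> mhmd
# Mohamed -> mhmd
# Muhamad -> mhmd
# Mehmet -> mhmt
```

Four spellings collapse to one key, but "Mehmet" still lands elsewhere. A key this coarse will also merge names that have nothing to do with each other, so it is only useful for grouping candidates before a stricter comparison.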

The Beider-Morse algorithm — arguably the most sophisticated
phonetic matching system ever built — explicitly removed
Arabic support. The maintainers cited "severe performance
issues related to excessively complicated phonetics."

If the best phonetic algorithm gives up on Arabic, what are
most commercial tools doing?

Answer: Jaro-Winkler with a threshold. Which is why false
positive rates on Arabic names run above 90% in most systems.
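
For reference, this is roughly what "Jaro-Winkler with a threshold" amounts to. The code below is a plain-Python sketch of the standard algorithm. The 0.8 cutoff and the is_hit helper are invented for illustration and not taken from any vendor.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity: matching characters within a window, minus transpositions."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Walk the matched characters of both strings in order and count mismatches.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro score boosted for a shared prefix of up to four characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * prefix_scale * (1 - j)


def is_hit(a: str, b: str, threshold: float = 0.8) -> bool:
    """Hypothetical screening rule: one fixed cutoff for every name."""
    return jaro_winkler(a.lower(), b.lower()) >= threshold


print(round(jaro_winkler("martha", "marhta"), 3))      # 0.961, the textbook example
print(round(jaro_winkler("mohammed", "muhammad"), 3))  # 0.85
print(is_hit("Mohammed", "Muhammad"))                  # True
```

The score knows nothing about which romanization system produced a spelling. It only counts shared characters and a shared prefix, so any single cutoff has to trade missed variants against noise from unrelated names that happen to look alike.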

The substring trap

"Computing" contains the substring "p-u-t-i-n."

Without whole-word boundary enforcement, your screening
system flags tech companies. This sounds absurd — but it
happens in production systems every day.

We caught this when testing our own engine. A query for
a software company returned a high-confidence sanctions
match because a substring of the company name overlapped
with a sanctioned individual's name.

The fix: whole-word tokenization. Only match on complete
tokens, never on substrings.
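
A small Python sketch of the difference. "Acme Computing Ltd" is a made-up company name, and both helper functions are hypothetical. The point is only the contrast between substring containment and whole-token comparison.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased whole-word tokens (runs of word characters)."""
    return set(re.findall(r"\w+", text.lower()))


def substring_hit(query: str, watchlist_name: str) -> bool:
    """The trap: plain substring containment."""
    return watchlist_name.lower() in query.lower()


def token_hit(query: str, watchlist_name: str) -> bool:
    """Every token of the watchlist name must appear as a whole token in the query."""
    return tokens(watchlist_name) <= tokens(query)


print(substring_hit("Acme Computing Ltd", "Putin"))  # True: false positive
print(token_hit("Acme Computing Ltd", "Putin"))      # False
print(token_hit("Vladimir Putin", "Putin"))          # True
```

Exact token comparison is too strict on its own (it would miss "Poutine" or "Putina"), so in practice the fuzzy comparison would run per token rather than across the raw string.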

What the benchmark gap looks like

No commercial sanctions screening vendor publishes accuracy
benchmarks. Not Refinitiv, not ComplyAdvantage, not
sanctions.io.

OpenSanctions, the best open-source system, publishes
its numbers: 91.3% F1, 99% recall, 84.5% precision.

The Federal Reserve published a sanctions screening paper
in September 2025. Best result using GPT-4o: 98.95% F1 —
tested on Latin-script organization names only.

Nobody is publishing results on Arabic transliteration,
Cyrillic variants, or patronymic edge cases. Exactly the
cases that generate real fines.

What we built

We built Verifex (verifex.dev) to address this directly.
The matching engine combines:

Soft TF-IDF + Monge-Elkan: the academic gold standard for string matching (Cohen, Ravikumar, Fienberg 2003)
IDF corpus weighting: "Mohammed" and "Kim" are statistically common, so a match on them should count for less than a match on a rare token like "Qadhafi"
Double Metaphone phonetic blocking: applied across multiple transliteration standards simultaneously
9 penalty layers, including patronymic derivatives, substring boundaries, entity-type mismatches, and mixed-script detection
LLM cascade: for ambiguous matches in the 40-95% confidence range
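
To illustrate just the IDF weighting idea from the list above, here is a minimal Python sketch. It is not the Verifex implementation and not full Soft TF-IDF: it only compares exact tokens, using a smoothed IDF and a weighted Jaccard overlap. The five-name corpus and every function name are made up for the example.

```python
import math
from collections import Counter


def build_idf(corpus: list[str]) -> tuple[dict[str, float], float]:
    """Smoothed IDF per token, plus a default weight for tokens never seen."""
    n = len(corpus)
    doc_freq: Counter[str] = Counter()
    for name in corpus:
        doc_freq.update(set(name.lower().split()))
    idf = {tok: math.log((1 + n) / (1 + df)) + 1.0 for tok, df in doc_freq.items()}
    unseen = math.log(1 + n) + 1.0  # treated as rarer than anything in the corpus
    return idf, unseen


def weighted_overlap(a: str, b: str, idf: dict[str, float], unseen: float) -> float:
    """IDF-weighted Jaccard: shared token weight over total token weight."""
    ta, tb = set(a.lower().split()), set(b.lower().split())

    def weight(toks: set[str]) -> float:
        return sum(idf.get(t, unseen) for t in toks)

    total = weight(ta | tb)
    return weight(ta & tb) / total if total else 0.0


# Toy corpus: "mohammed" is common, "qadhafi" is rare.
corpus = ["mohammed ali", "mohammed hassan", "mohammed khan", "kim lee", "muammar qadhafi"]
idf, unseen = build_idf(corpus)

common = weighted_overlap("mohammed ali", "mohammed khan", idf, unseen)
rare = weighted_overlap("muammar qadhafi", "saif qadhafi", idf, unseen)
print(round(common, 2), round(rare, 2))  # the shared rare token scores higher
```

Both pairs share exactly one token out of three, so an unweighted overlap would score them the same. The weighting is what separates them, and the gap widens as the corpus grows and common names dominate it.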

Result: 100% F1 on an independent 145-case benchmark —
including Arabic transliteration, Cyrillic variants, phonetic
matching, and adversarial substring inputs.

The full benchmark is public: verifex.dev/benchmark

Anyone can run it against any provider.

Bank of Scotland's fine was preventable. The technology
to handle Cyrillic transliteration exists — it's just not
in most commercial tools. If you're building or evaluating
a sanctions screening solution, the benchmark cases at
verifex.dev/benchmark show exactly where most tools fail.