Fraud systems are hard to test well.
The problem is not that engineers cannot write tests; it is that the data needs to tell a story. A transaction is rarely suspicious because of a single field. It is usually suspicious because of a pattern across several signals:
- a new account with too many beneficiaries
- many transactions in a short window
- failed logins followed by a device change
- KYC failures mixed with country mismatches
- transaction amounts far above normal behavior
Using production customer data for this is risky and often unacceptable. Basic mock data is usually too flat: each row is generated independently, so the cross-field patterns above never show up.
So I built fintech-fraud-sim, a TypeScript CLI that generates synthetic fintech users and transactions with configurable fraud patterns.
NPM package: https://www.npmjs.com/package/fintech-fraud-sim
Quick Start
npx fintech-fraud-sim generate --users 1000 --fraud-rate 0.08
That command generates:
users.csv
transactions.csv
users.json
transactions.json
summary.json
You can also choose a format:
npx fintech-fraud-sim generate --users 5000 --fraud-rate 0.12 --format csv
Or send output to a folder:
npx fintech-fraud-sim generate --users 2000 --fraud-rate 0.05 --format json --out ./data
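Once the files are in ./data, a test or script can read them back. The sketch below is one way to do that in TypeScript. It assumes each JSON file holds a top-level array of records, which is my assumption here rather than a documented guarantee; the helper names parseRecords and loadRecords are hypothetical.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

type DataRecord = Record<string, unknown>;

// Assumption: the file content is a top-level JSON array of objects.
function parseRecords(raw: string, label: string): DataRecord[] {
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed)) {
    throw new Error(`${label} does not contain a top-level array`);
  }
  return parsed as DataRecord[];
}

// Read one generated file from the output folder.
function loadRecords(dir: string, fileName: string): DataRecord[] {
  const raw = fs.readFileSync(path.join(dir, fileName), "utf8");
  return parseRecords(raw, fileName);
}

// Only read when the output exists, so the sketch is safe to run anywhere.
const dataDir = "./data";
if (fs.existsSync(path.join(dataDir, "users.json"))) {
  const users = loadRecords(dataDir, "users.json");
  console.log(`Loaded ${users.length} users`);
}
```

Failing fast on an unexpected shape keeps a broken fixture from silently producing empty test runs.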
Why I Built It
Fraud detection products need test data in many places:
- dashboards
- QA fixtures
- rules engines
- risk scoring services
- transaction monitoring demos
- analytics pipelines
- prototype fraud models
But realistic fraud test data is not just random rows. It needs consistent signals across users and transactions.
For example, an account takeover scenario should not only mark a transaction as suspicious. The user should also show supporting signals such as failed logins, device changes, and country mismatch.
That is the idea behind this CLI.
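To make "supporting signals" concrete, here is a minimal sketch of a check a test could run against a generated user. The field names come from the generated user schema; the field types, the thresholds, and the "at least two signals" rule are illustrative assumptions, not the package's actual logic.

```typescript
// Field names match the generated users; types are assumptions.
interface UserSignals {
  user_id: string;
  ip_country: string;
  declared_country: string;
  device_count: number;
  failed_login_attempts_24h: number;
  fraud_pattern: string | null;
}

// Collect the supporting signals present on a user record.
// Thresholds (3 failed logins, 2 devices) are illustrative only.
function takeoverSignals(user: UserSignals): string[] {
  const signals: string[] = [];
  if (user.failed_login_attempts_24h >= 3) signals.push("failed_logins");
  if (user.device_count >= 2) signals.push("device_change");
  if (user.ip_country !== user.declared_country) signals.push("country_mismatch");
  return signals;
}

// A labeled takeover counts as consistent when at least two signals back it up.
function isConsistentTakeover(user: UserSignals): boolean {
  return (
    user.fraud_pattern === "account_takeover" &&
    takeoverSignals(user).length >= 2
  );
}

const exampleUser: UserSignals = {
  user_id: "u_001",
  ip_country: "GB",
  declared_country: "NG",
  device_count: 3,
  failed_login_attempts_24h: 5,
  fraud_pattern: "account_takeover",
};

console.log(takeoverSignals(exampleUser), isConsistentTakeover(exampleUser));
```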
What the Data Looks Like
Generated users include fields like:
user_id
country
account_age_days
kyc_status
failed_kyc_attempts
device_count
ip_country
declared_country
failed_login_attempts_24h
beneficiary_count_24h
chargeback_count
is_fraud
fraud_pattern
risk_label
reason_codes
Generated transactions include:
transaction_id
user_id
timestamp
amount
currency
channel
beneficiary_id
beneficiary_country
device_id
ip_country
status
is_suspicious
fraud_pattern
reason_codes
The package does not generate real names, emails, phone numbers, BVNs, NINs, or bank account numbers.
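Because both files share user_id, they can be joined in a few lines. The sketch below types a subset of the fields listed above and rolls transactions up per user. The TypeScript types are my assumptions about how the values are encoded, and summarizeActivity is a hypothetical helper, not part of the package.

```typescript
// Subset of the documented fields. The types are assumptions.
interface GeneratedUser {
  user_id: string;
  is_fraud: boolean;
  fraud_pattern: string | null;
}

interface GeneratedTransaction {
  transaction_id: string;
  user_id: string;
  amount: number;
  is_suspicious: boolean;
}

interface UserActivity {
  user: GeneratedUser;
  transactionCount: number;
  suspiciousCount: number;
  totalAmount: number;
}

// Roll transactions up to their owning user via the shared user_id.
function summarizeActivity(
  users: GeneratedUser[],
  transactions: GeneratedTransaction[],
): UserActivity[] {
  const byUser = new Map<string, UserActivity>();
  users.forEach((user) => {
    byUser.set(user.user_id, {
      user,
      transactionCount: 0,
      suspiciousCount: 0,
      totalAmount: 0,
    });
  });
  transactions.forEach((tx) => {
    const activity = byUser.get(tx.user_id);
    if (!activity) return; // skip transactions with no matching user
    activity.transactionCount += 1;
    if (tx.is_suspicious) activity.suspiciousCount += 1;
    activity.totalAmount += tx.amount;
  });
  return Array.from(byUser.values());
}

const sampleUsers: GeneratedUser[] = [
  { user_id: "u_1", is_fraud: true, fraud_pattern: "velocity_abuse" },
  { user_id: "u_2", is_fraud: false, fraud_pattern: null },
];
const sampleTransactions: GeneratedTransaction[] = [
  { transaction_id: "t_1", user_id: "u_1", amount: 120, is_suspicious: true },
  { transaction_id: "t_2", user_id: "u_1", amount: 80, is_suspicious: false },
  { transaction_id: "t_3", user_id: "u_2", amount: 50, is_suspicious: false },
];

console.log(summarizeActivity(sampleUsers, sampleTransactions));
```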
Supported Fraud Patterns
The CLI currently supports:
| Pattern | Description |
|---|---|
| mule_account | New account, high beneficiary count, rapid funds movement |
| account_takeover | Device change, country mismatch, failed login spike |
| velocity_abuse | Many transactions in a short period |
| kyc_abuse | Multiple failed KYC attempts and inconsistent country data |
| chargeback_risk | Prior chargebacks and high-value transactions |
| transaction_spike | Amount far above user baseline |
| cross_border_anomaly | IP or beneficiary country mismatch |
| beneficiary_burst | Many new beneficiaries within 24 hours |
You can select specific patterns:
npx fintech-fraud-sim generate \
--users 1000 \
--fraud-rate 0.08 \
--patterns mule,account_takeover,velocity_abuse
mule is accepted as an alias for mule_account.
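If you wrap the CLI in your own scripts, it can help to validate a pattern list before passing it along. The sketch below shows one way a comma-separated value could be normalized. It is not the package's parser: parsePatterns is a hypothetical helper, only the mule alias is documented, and rejecting unknown names with an error is my assumption.

```typescript
// The eight pattern names listed in the table above.
const KNOWN_PATTERNS = [
  "mule_account",
  "account_takeover",
  "velocity_abuse",
  "kyc_abuse",
  "chargeback_risk",
  "transaction_spike",
  "cross_border_anomaly",
  "beneficiary_burst",
] as const;

type FraudPattern = (typeof KNOWN_PATTERNS)[number];

// Only "mule" is documented as an alias.
const ALIASES = new Map<string, FraudPattern>([["mule", "mule_account"]]);

function isKnownPattern(name: string): name is FraudPattern {
  return (KNOWN_PATTERNS as readonly string[]).includes(name);
}

// Split, trim, resolve aliases, drop duplicates, and reject unknown names.
function parsePatterns(input: string): FraudPattern[] {
  const result: FraudPattern[] = [];
  for (const raw of input.split(",")) {
    const name = raw.trim();
    if (name === "") continue;
    const resolved = ALIASES.get(name) ?? name;
    if (!isKnownPattern(resolved)) {
      throw new Error(`Unknown fraud pattern: ${name}`);
    }
    if (!result.includes(resolved)) result.push(resolved);
  }
  return result;
}

console.log(parsePatterns("mule,account_takeover,velocity_abuse"));
```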
Deterministic Generation
For tests and demos, deterministic output is important, so the generate command accepts a --seed value:
npx fintech-fraud-sim generate --users 1000 --fraud-rate 0.08 --seed demo
The same seed produces the same dataset, which makes the CLI useful in CI and repeatable test suites.
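For readers curious how a string seed can drive repeatable output, here is a minimal sketch of the general technique: hash the seed to a number, then feed it to a small deterministic generator. It uses an FNV-1a-style hash and the mulberry32 PRNG as stand-ins. I am not claiming this is what fintech-fraud-sim does internally, and sampleAmounts is a hypothetical helper.

```typescript
// Hash a seed string into a 32-bit unsigned integer (FNV-1a style).
function hashSeed(seed: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < seed.length; i++) {
    h ^= seed.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// mulberry32: a small deterministic PRNG returning floats in [0, 1).
function mulberry32(initialState: number): () => number {
  let a = initialState;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Draw `count` amounts between 0 and 1000 from a seeded generator.
function sampleAmounts(seed: string, count: number): number[] {
  const rand = mulberry32(hashSeed(seed));
  return Array.from({ length: count }, () => Math.round(rand() * 100000) / 100);
}

const firstRun = sampleAmounts("demo", 5);
const secondRun = sampleAmounts("demo", 5);

// Two runs with the same seed produce identical sequences.
console.log(JSON.stringify(firstRun) === JSON.stringify(secondRun)); // true
```

The key design point is that every random choice flows from one seeded generator instead of Math.random(), which cannot be seeded.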
Testing
The project includes tests for:
- generating users and transactions
- fraud rate handling
- seeded deterministic output
- zero fraud rate behavior
- fraud pattern logic
- CSV and JSON writers
- CLI execution
Run the suite with:
npm test
The test suite uses Node's built-in test runner with tsx, so TypeScript tests can run directly.
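As an illustration of what a fraud-rate check can look like with Node's built-in runner, here is a small sketch. It is not taken from the project's test suite: observedFraudRate and makeFixture are hypothetical helpers, the fixture is hand-built instead of coming from the generator, and the 0.01 tolerance is an arbitrary choice.

```typescript
import { test } from "node:test";
import { strict as assert } from "node:assert";

interface LabeledUser {
  user_id: string;
  is_fraud: boolean;
}

// Share of users labeled as fraud; an empty dataset counts as 0.
function observedFraudRate(users: LabeledUser[]): number {
  if (users.length === 0) return 0;
  return users.filter((u) => u.is_fraud).length / users.length;
}

// Hypothetical fixture: the first `fraudCount` users are labeled fraud.
function makeFixture(total: number, fraudCount: number): LabeledUser[] {
  return Array.from({ length: total }, (_, i) => ({
    user_id: `u_${i}`,
    is_fraud: i < fraudCount,
  }));
}

test("observed fraud rate stays close to the requested rate", () => {
  const users = makeFixture(1000, 80);
  assert.ok(Math.abs(observedFraudRate(users) - 0.08) < 0.01);
});

test("zero fraud rate yields no fraud users", () => {
  const users = makeFixture(1000, 0);
  assert.equal(observedFraudRate(users), 0);
});
```

In a real suite, the fixture would be replaced by the generator's output for a fixed seed.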
Example Use Cases
You can use the generated data to:
- demo fraud dashboards
- test transaction monitoring rules
- validate CSV import workflows
- build realistic QA fixtures
- prototype risk scoring logic
- test data pipelines without exposing customer records
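As an example of the second use case, the sketch below implements a simple velocity rule that could be exercised against generated transactions. It assumes timestamp is an ISO 8601 string that Date.parse can read; the rule itself, the velocityFlags name, and the window and count thresholds are my own illustrative choices.

```typescript
// Assumption: timestamp is an ISO 8601 string.
interface TimedTransaction {
  user_id: string;
  timestamp: string;
}

// Flag users with more than `maxCount` transactions inside any window of `windowMs`.
function velocityFlags(
  transactions: TimedTransaction[],
  windowMs: number,
  maxCount: number,
): Set<string> {
  const timesByUser = new Map<string, number[]>();
  transactions.forEach((tx) => {
    const times = timesByUser.get(tx.user_id) ?? [];
    times.push(Date.parse(tx.timestamp));
    timesByUser.set(tx.user_id, times);
  });

  const flagged = new Set<string>();
  timesByUser.forEach((times, userId) => {
    times.sort((a, b) => a - b);
    let start = 0;
    // Sliding window over the sorted timestamps.
    for (let end = 0; end < times.length; end++) {
      while (times[end] - times[start] > windowMs) start++;
      if (end - start + 1 > maxCount) {
        flagged.add(userId);
        break;
      }
    }
  });
  return flagged;
}

const sampleTx: TimedTransaction[] = [
  { user_id: "u_fast", timestamp: "2024-01-01T10:00:00Z" },
  { user_id: "u_fast", timestamp: "2024-01-01T10:01:00Z" },
  { user_id: "u_fast", timestamp: "2024-01-01T10:02:00Z" },
  { user_id: "u_fast", timestamp: "2024-01-01T10:03:00Z" },
  { user_id: "u_slow", timestamp: "2024-01-01T10:00:00Z" },
  { user_id: "u_slow", timestamp: "2024-01-01T12:00:00Z" },
];

// More than 3 transactions within 10 minutes triggers the rule.
console.log(Array.from(velocityFlags(sampleTx, 10 * 60 * 1000, 3)));
```

Comparing the flagged set with the users whose fraud_pattern is velocity_abuse gives a quick read on how well a rule lines up with the labels.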
Safety Disclaimer
fintech-fraud-sim is for testing, education, and fraud-model prototyping only. It generates synthetic data and should not be treated as real customer data or used as a production fraud decisioning system.
Final Thoughts
Good fraud testing needs more than random transactions. It needs plausible behavior.
This CLI is my attempt to make that easier for fintech builders, QA teams, and engineers working on risk systems.