When we set out to build a risk screening tool for cross-border founders, the obvious path was to throw an LLM at it. Feed in the founder's situation, get back a risk assessment. Ship it.
We went the opposite direction: a pure client-side, deterministic, rule-based scoring engine. No API calls. No ML models. No LLM. Same input always produces the same output.
Here's why, and how the engine works.
Why deterministic > intelligent
Our product maps structural risk across four dimensions for solo founders operating businesses across borders. The core principle: determinism in analysis, intelligence in interaction.
For a risk screening tool, this means:
- A founder in India with a Delaware LLC and Stripe-only payments should get the exact same score every time
- No temperature variance, no model drift, no "it depends on how the LLM is feeling today"
- The scoring logic is auditable — you can trace every point back to a specific rule
- It runs entirely client-side — no server costs, no latency, no API rate limits
When the output affects how someone understands their legal and financial exposure, hallucination isn't a minor inconvenience. It's a liability.
The architecture
The engine scores 7 inputs across 4 dimensions (we call them META: Money, Entity, Tax, Accountability). Each dimension gets a score from 0-5, totaling 0-20.
```typescript
export interface RiskCheckAnswers {
  residenceCountry: string;
  citizenships: string[];
  entityCountry: string; // "none" if no entity
  incomeCountries: string[];
  annualRevenue: 'under-25k' | '25k-50k' | '50k-100k' | '100k-250k' | '250k-plus';
  paymentMethods: string[];
  daysAbroad: 'none' | 'under-30' | '30-90' | '90-183' | '183-plus';
}

export interface RiskCheckResult {
  money: DimensionScore;
  entity: DimensionScore;
  tax: DimensionScore;
  accountability: DimensionScore;
  totalScore: number;
  riskLevel: 'low' | 'moderate' | 'high' | 'critical';
}
```
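One type not shown above is `DimensionScore`. Inferred from how the scorers use it in this post, a minimal sketch looks like this (the exact source may differ; the full `RiskFlag` interface appears later):

```typescript
// Sketch of DimensionScore, inferred from usage in the dimension scorers.
interface RiskFlag {
  id: string;
  dimension: 'M' | 'E' | 'T' | 'A';
  severity: 'info' | 'warning' | 'critical';
  title: string;
  description: string;
  articleSlug?: string;
}

interface DimensionScore {
  score: number;     // 0-5 after capping
  flags: RiskFlag[]; // one flag per rule that fired
}

// Example value: one rule fired, contributing one point.
const example: DimensionScore = {
  score: 1,
  flags: [{
    id: 'm-single-rail',
    dimension: 'M',
    severity: 'warning',
    title: 'Single Payment Rail Dependency',
    description: 'All revenue flows through one payment method.',
  }],
};

console.log(example.score); // 1
```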
The main scoring function delegates to four dimension-specific scorers:
```typescript
export function scoreRiskCheck(answers: RiskCheckAnswers): RiskCheckResult {
  const money = scoreMoney(answers);
  const entity = scoreEntity(answers);
  const tax = scoreTax(answers);
  const accountability = scoreAccountability(answers);

  const totalScore = money.score + entity.score + tax.score + accountability.score;

  let riskLevel: RiskCheckResult['riskLevel'];
  if (totalScore <= 4) riskLevel = 'low';
  else if (totalScore <= 8) riskLevel = 'moderate';
  else if (totalScore <= 12) riskLevel = 'high';
  else riskLevel = 'critical';

  return { money, entity, tax, accountability, totalScore, riskLevel };
}
```
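The band boundaries are worth a quick sanity check. Here is the same threshold logic pulled out into a standalone helper (`riskLevelFor` is a hypothetical name, not from the codebase):

```typescript
type RiskLevel = 'low' | 'moderate' | 'high' | 'critical';

// Standalone version of the threshold mapping used in scoreRiskCheck.
function riskLevelFor(totalScore: number): RiskLevel {
  if (totalScore <= 4) return 'low';
  if (totalScore <= 8) return 'moderate';
  if (totalScore <= 12) return 'high';
  return 'critical';
}

// Boundary checks: 4 is still low, 5 tips into moderate, and so on.
console.log(riskLevelFor(4));  // low
console.log(riskLevelFor(5));  // moderate
console.log(riskLevelFor(12)); // high
console.log(riskLevelFor(13)); // critical
```

Because the function is pure and total over its input, these boundaries are trivial to pin down in unit tests.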
How scoring works: pattern matching, not prediction
Each dimension scorer follows the same pattern: check for structural conditions, accumulate points, cap at 5.
Here's the Money dimension as an example:
```typescript
function scoreMoney(a: RiskCheckAnswers): DimensionScore {
  let score = 0;
  const flags: RiskFlag[] = [];

  // Single payment rail — if Stripe freezes, 100% of income stops
  if (a.paymentMethods.length === 1) {
    score += 1;
    flags.push({
      id: 'm-single-rail',
      dimension: 'M',
      severity: 'warning',
      title: 'Single Payment Rail Dependency',
      description: 'All revenue flows through one payment method...',
    });
  }

  // No entity but earning income — no liability boundary
  if (a.entityCountry === 'none' && a.incomeCountries.length > 0) {
    score += 2;
    flags.push({
      id: 'm-no-entity',
      dimension: 'M',
      severity: 'critical',
      title: 'No Entity Boundary for Income',
      description: 'Income flows directly to you as an individual...',
    });
  }

  return { score: Math.min(score, 5), flags };
}
```
Key design decisions:
- Each rule is independent — no cascading dependencies between rules
- Flags carry context — every score point comes with an explanation and severity level
- Scores are capped — `Math.min(score, 5)` prevents any single dimension from dominating
- No weights — each rule contributes a fixed number of points. Weights add complexity without adding clarity when you have <20 rules.
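The accumulate-then-cap pattern is easy to verify in isolation. A minimal sketch with illustrative rules (these are not the real rule set, just stand-ins with the same shape):

```typescript
// Illustrative rules only — the real rules live inside the dimension scorers.
type Rule = { points: number; fires: boolean };

function capScore(rules: Rule[], cap = 5): number {
  const raw = rules
    .filter((r) => r.fires)
    .reduce((sum, r) => sum + r.points, 0);
  return Math.min(raw, cap); // one dimension can never exceed the cap
}

// Three independent rules firing would total 6, but the cap holds it at 5.
const capped = capScore([
  { points: 2, fires: true },
  { points: 2, fires: true },
  { points: 2, fires: true },
]);
console.log(capped); // 5
```

Each rule contributes points independently, and only the final sum is clamped, so adding or removing a rule never changes how the others behave.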
The flag system
Every score point generates a RiskFlag with structured metadata:
```typescript
export interface RiskFlag {
  id: string;                       // Unique identifier
  dimension: 'M' | 'E' | 'T' | 'A'; // Which META dimension
  severity: 'info' | 'warning' | 'critical';
  title: string;
  description: string;
  articleSlug?: string;             // Links to educational content
}
```
The articleSlug is key to the UX: every risk flag links to an in-depth article explaining that specific structural pattern. The scoring engine doesn't give advice — it surfaces structural conditions and points to educational content.
This is an intentional boundary: the engine observes and describes, it never recommends. "You have a single payment rail" is a structural observation. "You should add a second payment processor" is advice. We only do the first.
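Concretely, a fired flag might look like this. The slug value and the `/articles/` URL pattern here are illustrative, not taken from the codebase (the interface is repeated so the snippet stands alone):

```typescript
interface RiskFlag {
  id: string;
  dimension: 'M' | 'E' | 'T' | 'A';
  severity: 'info' | 'warning' | 'critical';
  title: string;
  description: string;
  articleSlug?: string;
}

// Hypothetical slug — shows how a flag links out to educational content.
const flag: RiskFlag = {
  id: 'm-single-rail',
  dimension: 'M',
  severity: 'warning',
  title: 'Single Payment Rail Dependency',
  description: 'All revenue flows through one payment method.',
  articleSlug: 'single-payment-rail-dependency',
};

// The UI resolves the slug to an article URL; the path here is assumed.
const articleUrl = flag.articleSlug ? `/articles/${flag.articleSlug}` : null;
console.log(articleUrl); // /articles/single-payment-rail-dependency
```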
Cross-border pattern detection
The most interesting scoring rules involve cross-jurisdictional patterns that most founders don't think about:
Entity-residence mismatch:
```typescript
// Entity in a different country than residence
if (a.entityCountry !== 'none' && a.entityCountry !== a.residenceCountry) {
  score += 1;
  flags.push({
    id: 'e-entity-residence-mismatch',
    dimension: 'E',
    severity: 'warning',
    title: 'Entity-Residence Mismatch',
    description: 'Your entity is registered in a different country than where you live...',
  });
}
```
A founder in Portugal with a Delaware LLC triggers this. It's not "wrong" — it's the most common setup for non-resident founders. But it creates questions about effective management, permanent establishment risk, and which jurisdiction's tax rules apply. The flag surfaces this so the founder knows it's a structural characteristic worth understanding.
The 183-day rule trap:
```typescript
if (a.daysAbroad === '183-plus') {
  score += 1;
  flags.push({
    id: 't-extended-abroad',
    dimension: 'T',
    severity: 'warning',
    title: 'Extended Time Abroad',
    description: 'Spending 183+ days outside your residence country may trigger tax residency questions...',
  });
}
```
Digital nomads often assume that 183 days abroad means tax-free. In reality, the 183-day count is just one of many factors in tax residency determination, and different countries count those days differently.
What we'd change
After running this in production for several months:
Weighted scoring would help at scale. With ~15 rules, fixed points work fine. If we expand to 50+ rules, some should matter more than others.
Country-specific rule sets. Currently, rules are universal. A US citizen abroad triggers FATCA/FBAR rules that don't apply to anyone else. Country-specific scoring branches would be more precise.
The 7-question limit is both a feature and a constraint. It keeps the tool fast (2 minutes), but it means we can't detect some patterns that require more granular input.
The tech stack
- Next.js 16 with App Router (React 19)
- TypeScript strict mode — the type system catches scoring logic errors at compile time
- Pure client-side execution — the scoring engine imports no server dependencies
- Vitest for unit testing — each scoring rule has test cases for boundary conditions
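Determinism itself is one of the things the test suite pins down: scoring the same answers twice must produce identical results. A simplified sketch of that property test, with a stub scorer standing in for the real `scoreRiskCheck` (the real suite imports the actual scorers and runs under Vitest):

```typescript
// Stub scorer with the same shape as the real one — simplified rules.
interface Answers { paymentMethods: string[]; entityCountry: string }

function scoreStub(a: Answers): { score: number } {
  let score = 0;
  if (a.paymentMethods.length === 1) score += 1;
  if (a.entityCountry === 'none') score += 2;
  return { score };
}

const answers: Answers = { paymentMethods: ['stripe'], entityCountry: 'none' };

// Pure function, no randomness: two runs must serialize identically.
const first = JSON.stringify(scoreStub(answers));
const second = JSON.stringify(scoreStub(answers));
console.log(first === second); // true
console.log(first); // {"score":3}
```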
The full risk check is live at globalsolo.global/tools/risk-check. It takes about 2 minutes and generates a shareable scorecard.
When to use rules vs. ML vs. LLM
Our approach works because:
- The domain has clear, codifiable rules (tax thresholds, jurisdictional requirements)
- The input space is bounded (7 structured questions, not free text)
- Reproducibility matters more than nuance — a founder checking their risk twice should get the same answer
- The audience is making real decisions based on the output
If your domain has fuzzy inputs, requires natural language understanding, or benefits from creative interpretation, ML/LLM is the right tool. If your domain has clear rules and your users need to trust the output, consider starting with deterministic scoring and adding intelligence at the interaction layer.
We use LLMs elsewhere in our stack — for generating narrative sections of paid diagnostic reports, where the scoring is still deterministic but the explanation benefits from natural language. The principle: determinism in analysis, intelligence in interaction.
Built by Jett Fu at Global Solo — structural risk visibility for cross-border founders.