Toward Robust Legal Text Formalization into Defeasible Deontic Logic using LLMs is a rule-based non-monotonic formalism for representing legal norms and automating its evaluation. It combines defeasible logic — which models rules that hold by default but can be overridden by exceptions — with deontic logic, which provides the vocabulary of obligations, permissions, and prohibitions. Together, these make it well-suited to insurance law, where coverage obligations are established by grants, narrowed by exclusions, and partially restored by exceptions to those exclusions. This three-layer structure maps precisely onto DDL's core mechanism: defeasible rules ordered by a superiority relation, so that an exclusion defeats a grant, and an exception defeats the exclusion in turn.
The system described here uses DDL as the semantic backbone for automated coverage determination. A preprocessing pipeline converts structured policy clauses into typed, prioritised DDL rules. A forward pass then applies those rules to claim facts to produce a coverage decision together with a full, auditable reasoning trace. Both stages rely on prompting rather than training, making the approach directly deployable on any sufficiently capable language model.
What "governs" means — and why it is not the same as "relevant"
A common mistake in building these systems is equating governing with semantic similarity: the water exclusion is relevant to any water-related claim, so it governs. That is the wrong test.
In Defeasible Deontic Logic a clause governs a claim if and only if every one of its antecedent conditions is satisfied by the claim facts. This is applicability — a logical check, not a similarity score. A fire exclusion is semantically relevant to any property loss — it is an exclusion about physical perils — but it does not govern a water damage claim because "damage caused by fire" is simply false given the facts. A retrieval system based on embeddings would surface it; the applicability test correctly excludes it.
That one distinction — governs = all conditions satisfied, not semantically close — is the whole reason the preprocessing pipeline exists. Its job is to make each clause's conditions explicit enough to test.
Pre-Processing Insurance Conditions
┌──────────────────────────────────────┐
│ Section text │
│ with resolved definitions │
└──────────────────┬───────────────────┘
│
[LLM]
│
▼
┌──────────────────────────────────────┐
│ Classify + extract │
│ List[ClauseExtraction] │
└──────────────────┬───────────────────┘
│
[rule-based]
│
▼
┌──────────────────────────────────────┐
│ Assign priority per clause │
│ exception > exclusion > grant │
└──────────────────┬───────────────────┘
│
▼
┌──────────────────────────────────────┐
│ List[ProcessedClause] │
│ type · antecedents · priority │
└──────────────────────────────────────┘
There are essentially two steps per section and one rule-based step.
Step 1 — Classify and extract (one LLM call per section)
The model receives a full policy section and returns all clauses it contains, each labelled with its DDL type and governing conditions.
You are formalizing all clauses in an insurance policy section.
Section path: {section_path}
Section text: {section_text}
Definitions used in this section: {resolved_definitions}
For each distinct normative unit (clause) in this section:
1. Assign a clause_id (e.g. "s3_c01", incrementing per clause)
2. Classify the clause type:
- grant: establishes the insurer's obligation to cover a loss
- exclusion: removes coverage for specific circumstances
- exception: restores coverage that an exclusion removed
- definition: fixes the meaning of a term
- condition: a duty on the policyholder or insurer
3. List the antecedents — the conditions that must ALL hold for this
clause to apply. Express each as a plain English statement,
e.g. "the damage was caused by water".
4. State the conclusion: what follows when all antecedents hold.
Return a JSON array of clause extractions.
class ClauseExtraction(BaseModel):
clause_id: str
clause_type: Literal["grant", "exclusion", "exception", "definition", "condition"]
antecedents: List[str] # e.g. ["damage was caused by water"]
conclusion: str # e.g. "the insurer is not obliged to pay"
# The LLM returns:
List[ClauseExtraction]
Step 2 — Assign priority (rule-based, no LLM)
PRIORITY = {"exception": 3, "exclusion": 2, "grant": 1, "definition": 0, "condition": 0}
Stored clause unit (full index entry)
class ProcessedClause(BaseModel):
clause_id: str
clause_type: Literal["grant", "exclusion", "exception", "definition", "condition"]
antecedents: List[str]
conclusion: str
priority: int # derived from PRIORITY lookup
Forward Pass — Deciding a Claim
┌──────────────────┐ ┌────────────────┐
│ Claim facts │ │ Clause index │
└────────┬─────────┘ └───────┬────────┘
│ │
└─────────────┬─────────────┘
│
[LLM]
│
▼
┌──────────────────────────────────────────────┐
│ Applicability check │
│ do facts satisfy all antecedents? │
│ governs ≠ similar │
└─────────────────────┬────────────────────────┘
│
[applicable only]
│
▼
┌──────────────────────────────────────────────┐
│ Contest │
│ applicable clauses that conflict │
└─────────────────────┬────────────────────────┘
│
[superiority]
│
▼
┌──────────────────────────────────────────────┐
│ Priority resolution │
│ superior clause overrides the rest │
└─────────────────────┬────────────────────────┘
│
▼
┌──────────────────────────────────────────────┐
│ Decision trace │
│ decision and which rule prevailed │
└──────────────────────────────────────────────┘
The applicability prompt (one call per candidate clause) makes the governing test explicit:
Claim facts:
{claim_facts}
Clause antecedents:
{antecedents}
For each antecedent, decide whether the claim facts satisfy it.
A clause governs this claim only when ALL antecedents are satisfied.
Return JSON: for each antecedent, {antecedent, satisfied: bool, reason}.
Then: governs: bool.
Decision trace — burst pipe example
Claim: "A pipe inside the kitchen wall burst suddenly. Water flooded and damaged the kitchen floor."
Step 1 — Applicability check
| Clause | Type | Priority | Antecedents | All satisfied? | Governs? |
|---|---|---|---|---|---|
| GRANT-01 | grant | 1 | damage to the home | ✓ | YES |
| EXCL-03 | exclusion | 2 | damage caused by water | ✓ | YES |
| EXCL-07 | exclusion | 2 | damage caused by fire | ✗ | NO |
| EXCPT-03a | exception | 3 | water from sudden internal pipe burst | ✓ | YES |
EXCL-07 is semantically relevant — it is an exclusion about a physical peril, the same category as EXCL-03. A naive retrieval system would surface it. The applicability check correctly excludes it: "damage caused by fire" is not satisfied by the facts. This is the governs/relevant distinction in action.
Step 2 — Contest (governing clauses with conflicting conclusions)
GRANT-01 → covered · EXCL-03 → not covered · EXCPT-03a → covered
Step 3 — Priority resolution
| Step | Rule | Outcome |
|---|---|---|
| 1 | priority(EXCL-03 = 2) > priority(GRANT-01 = 1) | EXCL-03 defeats GRANT-01 |
| 2 | priority(EXCPT-03a = 3) > priority(EXCL-03 = 2) | EXCPT-03a defeats EXCL-03 |
| 3 | Nothing defeats EXCPT-03a | EXCPT-03a wins → COVERED ✓ |
The trace is: GRANT-01 → overridden by EXCL-03 → overridden by EXCPT-03a → covered.
The core insight is that the applicability prompt is not asking whether a clause is about the same topic as the claim. It is asking whether all the conditions listed in the clause's antecedents are true given the specific claim facts. That shift — from topic similarity to condition satisfaction — is what Defeasible Deontic Logic formalises, and what separates a system that correctly identifies governing clauses from one that merely retrieves thematically related ones.
Top comments (0)