<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: yutaro</title>
    <description>The latest articles on DEV Community by yutaro (@yutaro_41c2deef88001afd50).</description>
    <link>https://dev.to/yutaro_41c2deef88001afd50</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3839939%2Fe7552c84-b002-43a9-bead-08021a4a7378.png</url>
      <title>DEV Community: yutaro</title>
      <link>https://dev.to/yutaro_41c2deef88001afd50</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/yutaro_41c2deef88001afd50"/>
    <language>en</language>
    <item>
      <title>FaultRay: Why We Formalized Cascade Failure Propagation as a Labeled Transition System</title>
      <dc:creator>yutaro</dc:creator>
      <pubDate>Fri, 10 Apr 2026 08:51:26 +0000</pubDate>
      <link>https://dev.to/yutaro_41c2deef88001afd50/faultray-why-we-formalized-cascade-failure-propagation-as-a-labeled-transition-system-1mh3</link>
      <guid>https://dev.to/yutaro_41c2deef88001afd50/faultray-why-we-formalized-cascade-failure-propagation-as-a-labeled-transition-system-1mh3</guid>
      <description>&lt;h2&gt;
  
  
  The gap that motivated this project
&lt;/h2&gt;

&lt;p&gt;Production fault injection tools — Gremlin, Steadybit, AWS FIS — are powerful, and the chaos engineering discipline they represent has genuinely matured over the past decade. But every tool in that class shares a structural constraint: it operates on running systems.&lt;/p&gt;

&lt;p&gt;That constraint is fine for many organizations. It is not fine for regulated industries operating under mandates like the EU Digital Operational Resilience Act (DORA), where touching production with fault injection commands introduces risk that regulators may not accept. And it is not fine for the more fundamental question that fault injection &lt;em&gt;cannot&lt;/em&gt; answer: what is the highest availability your architecture is mathematically capable of reaching, given its dependency structure and external SLA commitments?&lt;/p&gt;

&lt;p&gt;Classical reliability methods — Fault Tree Analysis and Reliability Block Diagrams — do answer availability ceiling questions analytically. But they operate on static trees under a component independence assumption that does not hold for cloud infrastructure. When a shared underlay network fails, your database, your cache, and your application tier all degrade simultaneously. They are not independent. A classical RBD will overestimate availability in exactly those cases.&lt;/p&gt;
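&lt;p&gt;A toy calculation makes the overestimation concrete. With invented numbers, two replicas on a shared underlay look far better under the independence assumption than they can actually be:&lt;/p&gt;

```python
# Toy numbers, purely illustrative: two replicas, each 99.5% available,
# sharing one underlay network that is itself 99.9% available.
A_replica = 0.995
A_underlay = 0.999

# Classical RBD parallel model, assuming the replicas fail independently:
A_rbd = 1 - (1 - A_replica) ** 2

# Correlated view: an underlay outage takes both replicas down together,
# so the pair can never be more available than the underlay itself.
A_correlated = A_underlay * A_rbd

print(round(A_rbd, 6), round(A_correlated, 6))
```

&lt;p&gt;The independent model reports roughly four and a half nines; the shared dependency caps the pair just below the underlay's three nines.&lt;/p&gt;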

&lt;p&gt;FaultRay is a research prototype that tries to address both gaps: no production touch, and an explicit model of correlated failure propagation. This post describes the two core technical contributions and where the work stands today.&lt;/p&gt;




&lt;h2&gt;
  
  
  Core contribution 1: Cascade propagation as a Labeled Transition System
&lt;/h2&gt;

&lt;p&gt;The cascade engine in FaultRay is formalized as a &lt;strong&gt;Cascade Propagation Semantics (CPS)&lt;/strong&gt;, a Labeled Transition System (LTS) over a dependency graph.&lt;/p&gt;

&lt;p&gt;The CPS state is a 4-tuple &lt;code&gt;S = (H, L, T, V)&lt;/code&gt; where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;H: Component → HealthStatus&lt;/code&gt; — health map (each component is UP, DEGRADED, OVERLOADED, or DOWN)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;L: Component → float&lt;/code&gt; — accumulated latency map in milliseconds&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;T: float&lt;/code&gt; — elapsed simulation time in seconds&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;V: set[Component]&lt;/code&gt; — visited set, monotonically growing&lt;/li&gt;
&lt;/ul&gt;
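&lt;p&gt;As a minimal sketch (field names here are illustrative; the actual implementation lives in &lt;code&gt;src/faultray/simulator/cascade.py&lt;/code&gt;), the 4-tuple maps naturally onto a dataclass:&lt;/p&gt;

```python
from dataclasses import dataclass
from enum import Enum

class HealthStatus(Enum):
    UP = "up"
    DEGRADED = "degraded"
    OVERLOADED = "overloaded"
    DOWN = "down"

@dataclass
class CascadeState:
    """Illustrative CPS state S = (H, L, T, V)."""
    health: dict        # H: component id to HealthStatus
    latency_ms: dict    # L: component id to accumulated latency (ms)
    elapsed_s: float    # T: elapsed simulation time (s)
    visited: set        # V: visited set, only ever grows

state = CascadeState(
    health={"db": HealthStatus.UP, "api": HealthStatus.UP},
    latency_ms={"db": 0.0, "api": 0.0},
    elapsed_s=0.0,
    visited=set(),
)
print(state.health["db"].value)
```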

&lt;p&gt;The key properties we prove for this system are (see &lt;code&gt;src/faultray/simulator/cascade.py&lt;/code&gt; for the implementation and &lt;code&gt;docs/patent/cascade-formal-spec.md&lt;/code&gt; for the derivations):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Monotonicity&lt;/strong&gt; — health can only worsen during a simulation run. Once a component is marked DOWN, it cannot recover to UP within the same simulation. This prevents oscillation and makes simulation results stable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Causality&lt;/strong&gt; — a component transitions to a degraded state only if a dependency has already transitioned. There are no spontaneous failures from unaffected upstream nodes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Circuit Breaker Correctness&lt;/strong&gt; — when a circuit breaker is tripped on an edge, cascade propagation halts at that edge. The LTS formulation makes it possible to prove this is actually the case rather than just asserting it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Termination&lt;/strong&gt; — for acyclic dependency graphs, CPS terminates in O(|V| + |E|) time. For graphs with cycles (which do appear in real infrastructure — think mutual health-check dependencies), a depth limit &lt;code&gt;D_max = 20&lt;/code&gt; guarantees termination.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The implementation uses BFS traversal with three simulation modes corresponding to different transition subsets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;simulate_fault&lt;/code&gt; — Rules 1–5: fault injection followed by recursive propagation&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;simulate_latency_cascade&lt;/code&gt; — Rules 1, 6–7: latency BFS with circuit breaker halts&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;simulate_traffic_spike&lt;/code&gt; — Rule 1 applied per-component for capacity threshold checks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why does formalizing this as an LTS matter in practice? Because it turns "the cascade engine behaves correctly" from an informal claim into something you can reason about systematically. The O(|V| + |E|) complexity bound is not a benchmark result — it follows from the BFS structure and the monotonicity guarantee. The termination proof holds for the cyclic case not because we tested it on enough graphs but because the depth bound is structurally enforced.&lt;/p&gt;
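&lt;p&gt;A heavily simplified sketch, not the FaultRay implementation, shows how BFS plus a monotone health order and a depth bound yield those guarantees structurally:&lt;/p&gt;

```python
from collections import deque

# Severity ordering: larger means worse. Monotonicity is enforced by
# only ever moving a component to a worse state, never back.
SEVERITY = {"UP": 0, "DEGRADED": 1, "DOWN": 2}
D_MAX = 20  # structural depth bound that guarantees termination on cycles

def propagate(dependents, health, root):
    """BFS cascade. dependents[x] lists components that depend on x."""
    health[root] = "DOWN"
    queue = deque([(root, 0)])
    visited = {root}  # V grows monotonically
    while queue:
        node, depth = queue.popleft()
        if depth == D_MAX:
            continue  # depth bound halts cyclic propagation
        for dep in dependents.get(node, []):
            # Causality: dep transitions only because node already did.
            # Monotonicity: UP may worsen to DEGRADED, never the reverse.
            if health[dep] == "UP":
                health[dep] = "DEGRADED"
            if dep not in visited:
                visited.add(dep)
                queue.append((dep, depth + 1))
    return health

graph = {"db": ["api"], "api": ["web"]}
print(propagate(graph, {"db": "UP", "api": "UP", "web": "UP"}, "db"))
```

&lt;p&gt;Each component enters the queue at most once, which is where the O(|V| + |E|) bound for acyclic graphs comes from.&lt;/p&gt;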




&lt;h2&gt;
  
  
  Core contribution 2: N-layer min-composition availability model
&lt;/h2&gt;

&lt;p&gt;The second contribution is an availability ceiling model that explicitly decomposes a system's maximum achievable availability across five distinct constraint layers.&lt;/p&gt;

&lt;p&gt;The five layers are:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;What it captures&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;L1 Software&lt;/td&gt;
&lt;td&gt;Deployment downtime, human error rate, configuration drift&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L2 Hardware&lt;/td&gt;
&lt;td&gt;MTBF, MTTR, redundancy factor, failover promotion time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L3 Theoretical&lt;/td&gt;
&lt;td&gt;Irreducible physical noise: packet loss, GC pauses, kernel scheduling jitter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L4 Operational&lt;/td&gt;
&lt;td&gt;Incident response time, on-call team size, detection latency&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L5 External SLA&lt;/td&gt;
&lt;td&gt;Product of all external dependency SLAs (cloud providers, third-party APIs)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The composition operator is &lt;code&gt;min&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;A_effective&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;A_L1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_L2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_L3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_L4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;A_L5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This departs from the independence assumption of classical Reliability Block Diagrams, where you would multiply availabilities across components. The &lt;code&gt;min&lt;/code&gt; operator captures a different claim: &lt;em&gt;the most constrained layer determines the ceiling&lt;/em&gt;. If your external SLA chain caps you at 99.9% (three nines), it does not matter that your software and hardware layers could theoretically support 99.99%. The system cannot exceed its external dependency constraint.&lt;/p&gt;

&lt;p&gt;The L2 hardware layer uses a standard parallel reliability model: for a component with &lt;code&gt;replicas&lt;/code&gt; instances, the tier availability is &lt;code&gt;A_tier = 1 - (1 - A_single)^replicas&lt;/code&gt;, where &lt;code&gt;A_single = MTBF / (MTBF + MTTR)&lt;/code&gt;. This is classical. What the model adds is the explicit failover penalty — the fraction of uptime lost during replica promotion — and the structural separation of the five layers so the binding constraint is visible rather than hidden inside a single number.&lt;/p&gt;
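&lt;p&gt;The L2 arithmetic can be written down directly. The sketch below applies a simplified multiplicative failover penalty, which is an assumption for illustration; FaultRay's exact treatment may differ:&lt;/p&gt;

```python
def tier_availability(mtbf_h, mttr_h, replicas, failover_penalty=0.0):
    """Parallel-redundancy availability for one tier.

    failover_penalty is the fraction of uptime lost to replica
    promotion (simplified placeholder, not FaultRay's exact formula).
    """
    a_single = mtbf_h / (mtbf_h + mttr_h)        # A_single = MTBF / (MTBF + MTTR)
    a_tier = 1 - (1 - a_single) ** replicas      # A_tier = 1 - (1 - A_single)^replicas
    return a_tier * (1 - failover_penalty)

# Toy numbers: MTBF 10,000 h, MTTR 4 h.
print(tier_availability(10_000, 4, 1))
print(tier_availability(10_000, 4, 3))
```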




&lt;h2&gt;
  
  
  What the tool looks like in practice
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;faultray
faultray demo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Building demo infrastructure...
╭────────────────────────────────────────────────────╮
│ Metric           │ Value                           │
│ Components       │ 9                               │
│ Dependencies     │ 12                              │
│ Resilience Score │ 50.0/100                        │
╰────────────────────────────────────────────────────╯

Running chaos simulation...

╭──────────── FaultRay Chaos Simulation Report ──────────╮
│ Resilience Score: 50/100                                │
│ Scenarios tested: 255                                   │
│ Critical: 21  Warning: 84  Passed: 150                  │
╰─────────────────────────────────────────────────────────╯

  Generate HTML report: faultray simulate --html report.html
  Generate DORA evidence: faultray dora evidence infra.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To run the five-layer availability model on a topology:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;faultray availability &lt;span class="nt"&gt;--model&lt;/span&gt; infra.json &lt;span class="nt"&gt;--layers&lt;/span&gt; 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To run the cascade engine directly on a YAML infrastructure model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;faultray simulate &lt;span class="nt"&gt;--model&lt;/span&gt; infra.yaml &lt;span class="nt"&gt;--cascade-depth&lt;/span&gt; 5 &lt;span class="nt"&gt;--html&lt;/span&gt; report.html
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tool accepts infrastructure defined in YAML (manual) or JSON exported from Terraform (&lt;code&gt;faultray tf-check plan.json&lt;/code&gt;). The dependency graph is a directed acyclic graph by default; the engine handles cyclic cases via the depth bound described above.&lt;/p&gt;




&lt;h2&gt;
  
  
  Honest assessment of the backtest
&lt;/h2&gt;

&lt;p&gt;We ran the cascade engine against 18 well-documented cloud incidents spanning 2017–2024 (AWS S3 2017, Meta BGP 2021, Cloudflare 2022, CrowdStrike 2024, and others from the public postmortem record). The results show F1 = 1.000, precision = 1.000, recall = 1.000 on affected-component identification across all 18 incidents.&lt;/p&gt;

&lt;p&gt;We want to be explicit about what those numbers mean and do not mean.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What they mean:&lt;/strong&gt; Given a topology that matches the incident's documented architecture and a fault injection at the documented root-cause component, the cascade engine correctly identifies which components the postmortem reported as affected. This validates that the LTS propagation rules are consistent with real-world cascade behavior on these known incidents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What they do not mean:&lt;/strong&gt; This is a post-hoc reproduction, not a prospective prediction. We built the topologies knowing what failed. F1 = 1.000 on 18 known incidents does not imply the engine will predict future incidents correctly on topologies it has never seen. Prospective validation — building topologies for incidents that occurred &lt;em&gt;after&lt;/em&gt; the paper was written and measuring prediction accuracy without ground-truth fitting — is the work that needs to happen before any predictive claim can be made.&lt;/p&gt;

&lt;p&gt;The downtime MAE of ~3,159 minutes across the 18 incidents reflects a known deficiency in the current model: the cascade engine propagates structural failure correctly but does not model recovery dynamics. Actual downtime depends on incident response procedures, team capacity, and external vendor resolution timelines that the simulation does not capture. The calibration recommendations in &lt;code&gt;docs/backtest-results.md&lt;/code&gt; include a &lt;code&gt;downtime_bias_correction&lt;/code&gt; factor of 3,159.53 minutes, which is a signal that the downtime estimation module needs a richer operational model.&lt;/p&gt;

&lt;p&gt;Severity accuracy averages 0.819. Severity is harder to match than affected-component sets because it depends on load and traffic patterns at the time of the incident, which the static topology model does not capture.&lt;/p&gt;




&lt;h2&gt;
  
  
  What this is not
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;FaultRay is not a compliance tool.&lt;/strong&gt; Its outputs are not certified audit evidence. The DORA research dashboard is a prototype mapping of FaultRay's simulation outputs to DORA's five pillars — it is illustrative, not certifiable. Do not submit FaultRay output as audit evidence without independent legal and technical review.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;FaultRay does not predict future incidents.&lt;/strong&gt; The formal properties of the LTS — termination, monotonicity, causality — are properties of the simulation engine, not of your production system. The simulation shows you what &lt;em&gt;would&lt;/em&gt; happen given the assumptions encoded in your topology model. If your model is wrong, the simulation output is wrong.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;FaultRay is not a replacement for operational chaos engineering.&lt;/strong&gt; Gremlin, Steadybit, and AWS FIS test your actual system under actual load with actual failure signals propagating through actual monitoring. FaultRay tests a model of your system. The two approaches answer different questions and are complementary rather than competitive.&lt;/p&gt;




&lt;h2&gt;
  
  
  Concurrent work
&lt;/h2&gt;

&lt;p&gt;Krasnovsky (arXiv:2506.11176, to appear at ICSE-NIER 2026) presents concurrent complementary work on in-memory graph simulation for chaos engineering, using Monte Carlo fail-stop simulation over service-dependency graphs auto-discovered from Jaeger distributed traces. The core overlap is positioning — both tools simulate in-memory rather than injecting real faults. The approaches diverge technically: Krasnovsky uses Monte Carlo methods without formal proofs or multi-layer decomposition; FaultRay uses an LTS with formal termination and complexity guarantees plus the N-layer min-composition model. We treat this as concurrent independent validation that the in-memory simulation direction is worth pursuing, not as prior art that invalidates either contribution.&lt;/p&gt;




&lt;h2&gt;
  
  
  Status and roadmap
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PyPI&lt;/strong&gt;: &lt;code&gt;pip install faultray&lt;/code&gt; (v11.1.0, Apache 2.0)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/mattyopon/faultray" rel="noopener noreferrer"&gt;mattyopon/faultray&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zenodo DOI&lt;/strong&gt;: &lt;a href="https://doi.org/10.5281/zenodo.19139911" rel="noopener noreferrer"&gt;10.5281/zenodo.19139911&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;USPTO provisional patent&lt;/strong&gt;: Application No. 64/010,200, filed 2026-03-19 (non-provisional deadline 2027-03-19)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ISSRE 2026 Fast Abstract&lt;/strong&gt;: submission planned for the 37th IEEE International Symposium on Software Reliability Engineering (Fast Abstracts track)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The paper rewrite currently in progress (v12) is stripping the AI agent failure taxonomy sections — that contribution was pre-dated by MAST (arXiv:2503.3657, NeurIPS 2025) and multiple concurrent papers — and focusing on strengthening the formal cascade engine proof and the N-layer model justification. The prospective validation experiment (building topologies for post-v11 incidents and measuring unseen-topology precision/recall) is the next concrete empirical step.&lt;/p&gt;

&lt;p&gt;If you are working on infrastructure resilience simulation, formal methods for distributed systems, or chaos engineering tooling, the repository is open and pull requests are welcome. Issues with real incident topologies that the cascade engine handles incorrectly are especially useful.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;FaultRay is a research prototype. It is NOT validated for DORA, FISC, or any regulatory audit. Do not rely on FaultRay outputs for compliance decisions without independent legal and technical review. Apache License 2.0.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>chaosengineering</category>
      <category>reliability</category>
      <category>python</category>
      <category>opensource</category>
    </item>
    <item>
      <title>How We Simulate 2,000+ Infrastructure Failures Without Touching Production</title>
      <dc:creator>yutaro</dc:creator>
      <pubDate>Mon, 06 Apr 2026 12:50:24 +0000</pubDate>
      <link>https://dev.to/yutaro_41c2deef88001afd50/how-we-simulate-2000-infrastructure-failures-without-touching-production-2kap</link>
      <guid>https://dev.to/yutaro_41c2deef88001afd50/how-we-simulate-2000-infrastructure-failures-without-touching-production-2kap</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs2g2k0vl6ahm8mquhcyc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs2g2k0vl6ahm8mquhcyc.png" alt="FaultRay Dashboard" width="800" height="536"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It is 2am. Your pager fires. A &lt;code&gt;terraform apply&lt;/code&gt; that "just changed a timeout" has taken down the payment service, the order queue, and half the API layer. The plan output looked clean. The PR had two approvals. And yet here you are, staring at a cascade failure that nobody predicted.&lt;/p&gt;

&lt;p&gt;This is the scenario that led me to build FaultRay.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem with breaking things to test things
&lt;/h2&gt;

&lt;p&gt;The standard chaos engineering playbook, pioneered by Netflix's Chaos Monkey in 2011 and continued by tools like Gremlin, Steadybit, and AWS FIS, follows a simple premise: inject real faults into real systems, observe what breaks, fix it.&lt;/p&gt;

&lt;p&gt;This works, but it has structural limitations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;It requires a production-like environment.&lt;/strong&gt; Staging is always out of sync. The failure you test in staging may not match what happens in prod.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It tests scenarios you think of.&lt;/strong&gt; You write the experiments. You choose what to break. The failures you did not imagine are the ones that page you.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It cannot answer the ceiling question.&lt;/strong&gt; No amount of fault injection will tell you that your architecture physically cannot reach 99.99% uptime, because your external SLA chain caps you at 99.9%.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Regulated industries cannot use it.&lt;/strong&gt; Banks, healthcare systems, and government agencies are not going to randomly kill production processes to see what happens.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  A different approach: simulate, don't break
&lt;/h2&gt;

&lt;p&gt;FaultRay takes a fundamentally different path. Instead of injecting faults into running systems, it builds a dependency graph of your infrastructure and simulates over 2,000 failure scenarios entirely in memory. Nothing is deployed. Nothing is touched. You get a resilience score, a list of single points of failure, and a map of every cascade path — in seconds.&lt;/p&gt;

&lt;p&gt;The most common integration point is the Terraform pipeline. After &lt;code&gt;terraform plan&lt;/code&gt;, you export the plan as JSON and run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;terraform plan &lt;span class="nt"&gt;-out&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;plan.out
terraform show &lt;span class="nt"&gt;-json&lt;/span&gt; plan.out &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; plan.json
faultray tf-check plan.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;╭──────────── FaultRay Terraform Guard ────────────╮
│                                                   │
│  Score Before: 72/100                             │
│  Score After:  45/100  (-27 points)               │
│                                                   │
│  NEW RISKS:                                       │
│  - Database is now a single point of failure      │
│  - Cache has no replication (data loss risk)      │
│                                                   │
│  Recommendation: HIGH RISK - Review Required      │
│                                                   │
╰───────────────────────────────────────────────────╯
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;FaultRay models what your infrastructure looks like &lt;em&gt;before&lt;/em&gt; and &lt;em&gt;after&lt;/em&gt; the planned change, runs the full simulation against both states, and shows you the delta. Not "this is risky" but "this specific change drops your score by 27 points and introduces a new SPOF."&lt;/p&gt;

&lt;h3&gt;
  
  
  CI/CD integration in 2 lines
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/terraform.yml&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Check Terraform Plan&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;pip install faultray&lt;/span&gt;
    &lt;span class="s"&gt;faultray tf-check plan.json --fail-on-regression --min-score 60&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;--fail-on-regression&lt;/code&gt; fails the job if the resilience score drops at all. &lt;code&gt;--min-score 60&lt;/code&gt; fails if the resulting score is below your threshold. The job blocks the merge. The 2am page never happens.&lt;/p&gt;

&lt;h2&gt;
  
  
  The math behind the score
&lt;/h2&gt;

&lt;p&gt;This is the part that might interest you if you have read this far. FaultRay is not a heuristic engine. It is built on formal methods with proven properties.&lt;/p&gt;

&lt;h3&gt;
  
  
  5-Layer Availability Limit Model
&lt;/h3&gt;

&lt;p&gt;Most teams set SLO targets (99.99%, four nines) without knowing whether their architecture can physically reach them. FaultRay computes five independent availability ceilings:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Layer 1: Software Limit     → Deployment downtime, human error, config drift
Layer 2: Hardware Limit     → Component MTBF, MTTR, redundancy, failover time
Layer 3: Theoretical Limit  → Irreducible physical noise (packet loss, GC, jitter)
Layer 4: Operational Limit  → Incident response time, team size, on-call coverage
Layer 5: External SLA Chain → Product of all third-party dependency SLAs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your system's availability ceiling is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A_system = min(L1, L2, L3, L4, L5)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If Layer 5 says your external SLA chain caps you at 99.9% (three nines), it does not matter that your hardware can do five nines. The bottleneck is the weakest layer. FaultRay surfaces this before you spend months over-engineering the wrong layer.&lt;/p&gt;
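&lt;p&gt;With invented layer values, the composition and the bottleneck report are a few lines:&lt;/p&gt;

```python
import math

def nines(a):
    """Convert an availability figure to a 'number of nines'."""
    return -math.log10(1 - a)

# Invented layer ceilings, purely for illustration.
layers = {
    "L1_software":    0.9995,
    "L2_hardware":    0.99999,
    "L3_theoretical": 0.99995,
    "L4_operational": 0.9998,
    # L5: product of external dependency SLAs (cloud plus two APIs).
    "L5_external":    0.9995 * 0.999 * 0.9999,
}

A_system = min(layers.values())
bottleneck = min(layers, key=layers.get)
print(bottleneck, round(A_system, 5), round(nines(A_system), 2))
```

&lt;p&gt;Here the external SLA chain is the binding constraint, so redundancy work on L2 would buy nothing until L5 improves.&lt;/p&gt;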

&lt;h3&gt;
  
  
  LTS-based cascade engine
&lt;/h3&gt;

&lt;p&gt;The cascade simulator implements a Labeled Transition System (LTS) formalized as a 4-tuple &lt;code&gt;S = (H, L, T, V)&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;H&lt;/code&gt;: health map (component to status)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;L&lt;/code&gt;: accumulated latency map&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;T&lt;/code&gt;: elapsed time&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;V&lt;/code&gt;: visited set (monotonically growing)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The system has four proven properties:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Monotonicity&lt;/strong&gt; — health can only worsen during a simulation run&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Causality&lt;/strong&gt; — a component fails only if a dependency has failed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Circuit breaker correctness&lt;/strong&gt; — a tripped circuit breaker stops cascade at that edge&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Termination&lt;/strong&gt; — the engine terminates in O(|V| + |E|) for acyclic graphs; a depth limit of 20 guarantees termination for cyclic graphs&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These properties mean the simulation is deterministic and complete. It will find every reachable failure state, and it will always halt. The full formal specification is in the &lt;a href="https://doi.org/10.5281/zenodo.19139911" rel="noopener noreferrer"&gt;paper&lt;/a&gt;.&lt;/p&gt;
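&lt;p&gt;Completeness in this sense can be pictured as exhaustive enumeration of fault scenarios over the component set. This sketch is not the actual scenario generator, just the shape of the idea:&lt;/p&gt;

```python
from itertools import combinations

def all_fault_scenarios(components, max_faults=2):
    """Enumerate every subset of components with up to max_faults failures."""
    scenarios = []
    for k in range(1, max_faults + 1):
        scenarios.extend(combinations(components, k))
    return scenarios

comps = ["lb", "api", "db", "cache", "queue"]
scenarios = all_fault_scenarios(comps, max_faults=2)
print(len(scenarios))  # 5 single faults + 10 pairs = 15
```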

&lt;h3&gt;
  
  
  AI agent hallucination model
&lt;/h3&gt;

&lt;p&gt;FaultRay v11 introduced failure modeling for AI agent systems. The core model computes hallucination probability as a function of three variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;H(a, D, I)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Where &lt;code&gt;a&lt;/code&gt; is the agent, &lt;code&gt;D&lt;/code&gt; is the set of data sources, and &lt;code&gt;I&lt;/code&gt; is the infrastructure state. When a data source goes DOWN, the agent's hallucination probability increases proportionally to its dependency weight on that source:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;If source d is HEALTHY:    h_d = h0
If source d is DOWN:       h_d = h0 + (1 - h0) * w(d)
If source d is DEGRADED:   h_d = h0 + (1 - h0) * w(d) * delta
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This captures a failure mode that traditional chaos tools cannot model: your LLM endpoint stays up, your agent keeps responding, but its answers become unreliable because the grounding data it depends on is gone. The agent does not throw an error. It hallucinates. FaultRay quantifies the probability and traces the cascade through multi-agent chains.&lt;/p&gt;
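&lt;p&gt;The update rule transcribes directly into code. The &lt;code&gt;max&lt;/code&gt; aggregation across multiple sources below is an assumption for illustration; the paper's exact combination rule may differ:&lt;/p&gt;

```python
def hallucination_prob(h0, sources, weights, delta=0.5):
    """Per-source hallucination update, combined across sources.

    sources maps a source id to one of HEALTHY, DOWN, DEGRADED.
    weights maps a source id to the agent's dependency weight w(d).
    The max aggregation is an assumed combination rule, not the paper's.
    """
    h = h0
    for d, status in sources.items():
        w = weights[d]
        if status == "DOWN":
            h_d = h0 + (1 - h0) * w
        elif status == "DEGRADED":
            h_d = h0 + (1 - h0) * w * delta
        else:  # HEALTHY: baseline rate only
            h_d = h0
        h = max(h, h_d)
    return h

# Baseline 2% hallucination rate, one knowledge base at weight 0.8, DOWN.
print(hallucination_prob(0.02, {"kb": "DOWN"}, {"kb": 0.8}))
```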

&lt;h2&gt;
  
  
  Validation: 18 real-world incidents
&lt;/h2&gt;

&lt;p&gt;I backtested FaultRay against 18 documented public cloud incidents (AWS, GCP, Azure outages with known root causes and blast radii). The engine was given the pre-incident topology, told which component failed, and asked to predict which downstream services would be affected.&lt;/p&gt;

&lt;p&gt;Results: &lt;strong&gt;F1 = 1.000&lt;/strong&gt; across all 18 incidents.&lt;/p&gt;

&lt;p&gt;I should be honest about what this means and what it does not. The topologies were constructed post-hoc from incident reports. I knew the architecture because the post-mortems described it. This validates that the cascade engine correctly propagates failures through a known graph. It does not validate topology discovery from real Terraform state, which is a harder and less controlled problem. The backtest methodology and all 18 incidents are documented in the paper.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;faultray
faultray demo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The demo runs a simulation against a sample infrastructure (load balancer, app servers, database, cache, queue) and outputs a full resilience report. Add &lt;code&gt;--web&lt;/code&gt; for an interactive D3.js dependency graph in your browser.&lt;/p&gt;

&lt;p&gt;To analyze your own infrastructure, define it in YAML:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;components&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;load_balancer&lt;/span&gt;
    &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;app_server&lt;/span&gt;
    &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;postgres&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;database&lt;/span&gt;
    &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;  &lt;span class="c1"&gt;# FaultRay will flag this&lt;/span&gt;

&lt;span class="na"&gt;dependencies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx&lt;/span&gt;
    &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;requires&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api&lt;/span&gt;
    &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;postgres&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;requires&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
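&lt;p&gt;As a rough illustration of one check the simulation performs (illustrative code, not FaultRay internals), a component that others depend on and that has no redundant replica is a single point of failure:&lt;/p&gt;

```python
# Illustrative SPOF check mirroring the YAML above; not FaultRay's engine.
config = {
    "components": [
        {"id": "nginx", "type": "load_balancer", "replicas": 2},
        {"id": "api", "type": "app_server", "replicas": 3},
        {"id": "postgres", "type": "database", "replicas": 1},
    ],
    "dependencies": [
        {"source": "nginx", "target": "api", "type": "requires"},
        {"source": "api", "target": "postgres", "type": "requires"},
    ],
}

# Everything that appears as a "requires" target is load-bearing.
required = {d["target"] for d in config["dependencies"] if d["type"] == "requires"}

# Load-bearing plus a single replica means no redundancy: a SPOF.
spofs = [
    c["id"]
    for c in config["components"]
    if c["replicas"] == 1 and c["id"] in required
]

print(spofs)  # prints ['postgres'], the replicas: 1 component flagged above
```

&lt;p&gt;The real engine does far more than count replicas, but this is the shape of the structural analysis: it falls out of the dependency graph alone, with nothing deployed.&lt;/p&gt;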





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;faultray load infra.yaml
faultray simulate &lt;span class="nt"&gt;--html&lt;/span&gt; report.html
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or import directly from Terraform state with &lt;code&gt;faultray tf-import&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The numbers
&lt;/h2&gt;

&lt;p&gt;This is a solo project, but I did not cut corners on quality:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;32,000+ tests&lt;/strong&gt;, all passing&lt;/li&gt;
&lt;li&gt;CI runs lint, type check, unit, E2E, security, performance, and mutation testing on every push&lt;/li&gt;
&lt;li&gt;USPTO provisional patent filed (US 64/010,200)&lt;/li&gt;
&lt;li&gt;Peer-reviewed paper on Zenodo (DOI: 10.5281/zenodo.19139911)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Live demo (browser):&lt;/strong&gt; &lt;a href="https://faultray.com/demo" rel="noopener noreferrer"&gt;faultray.com/demo&lt;/a&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;faultray
faultray demo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/mattyopon/faultray" rel="noopener noreferrer"&gt;github.com/mattyopon/faultray&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live Demo:&lt;/strong&gt; &lt;a href="https://faultray.com/demo" rel="noopener noreferrer"&gt;faultray.com/demo&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Paper (DOI):&lt;/strong&gt; &lt;a href="https://doi.org/10.5281/zenodo.19139911" rel="noopener noreferrer"&gt;doi.org/10.5281/zenodo.19139911&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PyPI:&lt;/strong&gt; &lt;a href="https://pypi.org/project/faultray/" rel="noopener noreferrer"&gt;pypi.org/project/faultray&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;FaultRay is licensed under BSL 1.1, converting to Apache 2.0 in 2030. Contributions and feedback are welcome.&lt;/p&gt;

</description>
      <category>chaosengineering</category>
      <category>devops</category>
      <category>python</category>
      <category>terraform</category>
    </item>
    <item>
      <title>How We Simulate 2,000+ Infrastructure Failures Without Touching Production</title>
      <dc:creator>yutaro</dc:creator>
      <pubDate>Mon, 23 Mar 2026 10:56:23 +0000</pubDate>
      <link>https://dev.to/yutaro_41c2deef88001afd50/how-we-simulate-2000-infrastructure-failures-without-touching-production-21c3</link>
      <guid>https://dev.to/yutaro_41c2deef88001afd50/how-we-simulate-2000-infrastructure-failures-without-touching-production-21c3</guid>
      <description>&lt;p&gt;description: FaultRay scores your infrastructure&lt;br&gt;
     resilience before terraform apply — catching cascade&lt;br&gt;
     risks, SPOFs, and availability ceiling violations in&lt;br&gt;
     seconds.&lt;br&gt;
     tags: chaosengineering, devops, python, terraform&lt;br&gt;
     cover_image:&lt;br&gt;
     ---&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; It is 2am. Your pager fires. A `terraform apply` that
 "just changed a timeout" has taken down the payment
 service, the order queue, and half the API layer. The
 plan output looked clean. The PR had two approvals. And
  yet here you are, staring at a cascade failure that
 nobody predicted.

 This is the scenario that led us to build FaultRay.

 ## The problem with breaking things to test things

 The standard chaos engineering playbook, pioneered by
 Netflix's Chaos Monkey in 2011 and continued by tools
 like Gremlin, Steadybit, and AWS FIS, follows a simple
 premise: inject real faults into real systems, observe
 what breaks, fix it.

 This works, but it has structural limitations:

 - **It requires a production-like environment.**
 Staging is always out of sync. The failure you test in
 staging may not match what happens in prod.
 - **It tests scenarios you think of.** You write the
 experiments. You choose what to break. The failures you
  did not imagine are the ones that page you.
 - **It cannot answer the ceiling question.** No amount
 of fault injection will tell you that your architecture
  physically cannot reach 99.99% uptime, because your
 external SLA chain caps you at 99.9%.
 - **Regulated industries cannot use it.** Banks,
 healthcare systems, and government agencies are not
 going to randomly kill production processes to see what
  happens.

 ## A different approach: simulate, don't break

 FaultRay takes a fundamentally different path. Instead
 of injecting faults into running systems, it builds a
 dependency graph of your infrastructure and simulates
 over 2,000 failure scenarios entirely in memory.
 Nothing is deployed. Nothing is touched. You get a
 resilience score, a list of single points of failure,
 and a map of every cascade path — in seconds.

 The most common integration point is the Terraform
 pipeline. After `terraform plan`, you export the plan
 as JSON and run:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; ```bash
 terraform plan -out=plan.out
 terraform show -json plan.out &amp;gt; plan.json
 faultray tf-check plan.json
 ```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; ```
 ╭──────────── FaultRay Terraform Guard ────────────╮
 │                                                   │
 │  Score Before: 72/100                             │
 │  Score After:  45/100  (-27 points)               │
 │                                                   │
 │  NEW RISKS:                                       │
 │  - Database is now a single point of failure      │
 │  - Cache has no replication (data loss risk)      │
 │                                                   │
 │  Recommendation: HIGH RISK - Review Required      │
 │                                                   │
 ╰───────────────────────────────────────────────────╯
 ```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; FaultRay models what your infrastructure looks like
 *before* and *after* the planned change, runs the full
 simulation against both states, and shows you the
 delta. Not "this is risky" but "this specific change
 drops your score by 27 points and introduces a new
 SPOF."

 ### CI/CD integration in 2 lines
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; ```yaml
 # .github/workflows/terraform.yml
 - name: Check Terraform Plan
   run: |
     pip install faultray
     faultray tf-check plan.json --fail-on-regression
 --min-score 60
 ```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; `--fail-on-regression` fails the job if the resilience
 score drops at all. `--min-score 60` fails if the
 resulting score is below your threshold. The job blocks
  the merge. The 2am page never happens.

 ## The math behind the score

 This is the part that might interest you if you have
 read this far. FaultRay is not a heuristic engine. It
 is built on formal methods with proven properties.

 ### 5-Layer Availability Limit Model

 Most teams set SLO targets (99.99%, four nines) without
  knowing whether their architecture can physically
 reach them. FaultRay computes five independent
 availability ceilings:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; ```
 Layer 1: Software Limit     → Deployment downtime,
 human error, config drift
 Layer 2: Hardware Limit     → Component MTBF, MTTR,
 redundancy, failover time
 Layer 3: Theoretical Limit  → Irreducible physical
 noise (packet loss, GC, jitter)
 Layer 4: Operational Limit  → Incident response time,
 team size, on-call coverage
 Layer 5: External SLA Chain → Product of all
 third-party dependency SLAs
 ```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; Your system's availability ceiling is:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; ```
 A_system = min(L1, L2, L3, L4, L5)
 ```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
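&lt;p&gt;To make the bottleneck concrete, here is a small worked calculation (the layer values are made-up numbers, not FaultRay output). Layer 5 is itself the product of the external SLA chain, since serial dependencies multiply:&lt;/p&gt;

```python
# Illustrative ceiling calculation; all layer values are hypothetical.
import math

def sla_chain(*slas: float) -> float:
    """Layer 5: availabilities of serial external dependencies multiply."""
    return math.prod(slas)

layers = {
    "L1_software":    0.9999,
    "L2_hardware":    0.99999,
    "L3_theoretical": 0.99995,
    "L4_operational": 0.9995,
    "L5_sla_chain":   sla_chain(0.9995, 0.9999, 0.9995),  # three vendors
}

# The ceiling is the weakest layer, nothing more.
a_system = min(layers.values())
bottleneck = min(layers, key=layers.get)
downtime_min = (1 - a_system) * 365 * 24 * 60

print(f"ceiling = {a_system:.5f} ({bottleneck})")
print(f"worst-case downtime ≈ {downtime_min:.0f} min/year")
```

&lt;p&gt;With these numbers the three vendor SLAs alone pull the ceiling below 99.9%, regardless of how good the hardware layer is.&lt;/p&gt;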



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; If Layer 5 says your external SLA chain caps you at
 99.9% (three nines), it does not matter that your
 hardware can do five nines. The bottleneck is the
 weakest layer. FaultRay surfaces this before you spend
 months over-engineering the wrong layer.

 ### LTS-based cascade engine

 The cascade simulator implements a Labeled Transition
 System (LTS) formalized as a 4-tuple `S = (H, L, T,
 V)`:

 - `H`: health map (component to status)
 - `L`: accumulated latency map
 - `T`: elapsed time
 - `V`: visited set (monotonically growing)

 The system has four proven properties:

 1. **Monotonicity** — health can only worsen during a
 simulation run
 2. **Causality** — a component fails only if a
 dependency has failed
 3. **Circuit breaker correctness** — a tripped circuit
 breaker stops cascade at that edge
 4. **Termination** — the engine terminates in O(|V| +
 |E|) for acyclic graphs; a depth limit of 20 guarantees
  termination for cyclic graphs

 These properties mean the simulation is deterministic
 and complete. It will find every reachable failure
 state, and it will always halt. The full formal
 specification is in the
 [paper](https://doi.org/10.5281/zenodo.19139911).

 ### AI agent hallucination model

 FaultRay v11 introduced failure modeling for AI agent
 systems. The core model computes hallucination
 probability as a function of three variables:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; ```
 H(a, D, I)
 ```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; Where `a` is the agent, `D` is the set of data sources,
  and `I` is the infrastructure state. When a data
 source goes DOWN, the agent's hallucination probability
  increases proportionally to its dependency weight on
 that source:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; ```
 If source d is HEALTHY:    h_d = h0
 If source d is DOWN:       h_d = h0 + (1 - h0) * w(d)
 If source d is DEGRADED:   h_d = h0 + (1 - h0) * w(d) *
  delta
 ```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
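&lt;p&gt;Plugging illustrative numbers into that rule shows the shape of the effect (the baseline &lt;code&gt;h0&lt;/code&gt;, the weight, and &lt;code&gt;delta&lt;/code&gt; here are made-up values, not FaultRay defaults):&lt;/p&gt;

```python
# Illustrative evaluation of the update rule above; parameters are hypothetical.
def hallucination_prob(h0: float, w: float, status: str, delta: float = 0.5) -> float:
    """h_d for a single data source d with dependency weight w."""
    if status == "HEALTHY":
        return h0
    if status == "DOWN":
        return h0 + (1 - h0) * w
    if status == "DEGRADED":
        return h0 + (1 - h0) * w * delta
    raise ValueError(f"unknown status: {status}")

h0 = 0.02          # baseline hallucination probability
w_vector_db = 0.8  # the agent leans heavily on this grounding source

print(hallucination_prob(h0, w_vector_db, "HEALTHY"))   # 0.02
print(hallucination_prob(h0, w_vector_db, "DOWN"))      # ≈ 0.804
print(hallucination_prob(h0, w_vector_db, "DEGRADED"))  # ≈ 0.412
```

&lt;p&gt;A 2% baseline jumps past 80% when a heavily weighted grounding source disappears, even though the agent itself never reports an error.&lt;/p&gt;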



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; This captures a failure mode that traditional chaos
 tools cannot model: your LLM endpoint stays up, your
 agent keeps responding, but its answers become
 unreliable because the grounding data it depends on is
 gone. The agent does not throw an error. It
 hallucinates. FaultRay quantifies the probability and
 traces the cascade through multi-agent chains.

 ## Validation: 18 real-world incidents

 We backtested FaultRay against 18 documented public
 cloud incidents (AWS, GCP, Azure outages with known
 root causes and blast radii). The engine was given the
 pre-incident topology, told which component failed, and
  asked to predict which downstream services would be
 affected.

 Results: **F1 = 1.000** across all 18 incidents.

 We should be honest about what this means and what it
 does not. The topologies were constructed post-hoc from
  incident reports. We knew the architecture because the
  post-mortems described it. This validates that the
 cascade engine correctly propagates failures through a
 known graph. It does not validate topology discovery
 from real Terraform state, which is a harder and less
 controlled problem. The backtest methodology and all 18
  incidents are documented in the paper.

 ## Try it
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; ```bash
 pip install faultray
 faultray demo
 ```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; The demo runs a simulation against a sample
 infrastructure (load balancer, app servers, database,
 cache, queue) and outputs a full resilience report. Add
  `--web` for an interactive D3.js dependency graph in
 your browser.

 To analyze your own infrastructure, define it in YAML:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; ```yaml
 components:
   - id: nginx
     type: load_balancer
     replicas: 2
   - id: api
     type: app_server
     replicas: 3
   - id: postgres
     type: database
     replicas: 1  # FaultRay will flag this

 dependencies:
   - source: nginx
     target: api
     type: requires
   - source: api
     target: postgres
     type: requires
 ```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
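&lt;p&gt;To see how a cascade walks this graph, here is a minimal sketch of the propagation idea (illustrative code, not FaultRay's engine): fail &lt;code&gt;postgres&lt;/code&gt; and everything that transitively requires it goes down, health only ever worsens, and each node is visited at most once, so the walk terminates in O(|V| + |E|):&lt;/p&gt;

```python
# Illustrative cascade walk over the YAML above; not FaultRay internals.
from collections import deque

# "source requires target": if the target is DOWN, the source fails too.
requires = {"nginx": ["api"], "api": ["postgres"]}

# Invert to "target -> components that depend on it".
dependents: dict[str, list[str]] = {}
for src, targets in requires.items():
    for tgt in targets:
        dependents.setdefault(tgt, []).append(src)

def cascade(seed: str) -> dict[str, str]:
    """Propagate one failure; health is monotone and each node visits once."""
    health = {c: "UP" for c in ["nginx", "api", "postgres"]}
    health[seed] = "DOWN"
    visited, queue = {seed}, deque([seed])
    while queue:  # bounded by |V| + |E| on this acyclic graph
        failed = queue.popleft()
        for dep in dependents.get(failed, []):
            if dep not in visited:
                health[dep] = "DOWN"  # causality: fails because a dependency failed
                visited.add(dep)
                queue.append(dep)
    return health

print(cascade("postgres"))  # all three components end up DOWN
print(cascade("nginx"))     # only nginx is DOWN; nothing depends on it
```

&lt;p&gt;The real engine additionally tracks latency accumulation, circuit breakers, and a depth limit for cyclic graphs, but the propagation skeleton is the same.&lt;/p&gt;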





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; ```bash
 faultray load infra.yaml
 faultray simulate --html report.html
 ```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; Or import directly from Terraform state with `faultray
 tf-import`.

 ## Links

 - **GitHub:** [github.com/mattyopon/faultray](https://g
 ithub.com/mattyopon/faultray)
 - **Paper (DOI):** [doi.org/10.5281/zenodo.19139911](ht
 tps://doi.org/10.5281/zenodo.19139911)
 - **PyPI:** [pypi.org/project/faultray](https://pypi.or
 g/project/faultray/)

 FaultRay is licensed under BSL 1.1, converting to
 Apache 2.0 in 2030. Contributions and feedback are
 welcome.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>architecture</category>
      <category>devops</category>
      <category>terraform</category>
      <category>testing</category>
    </item>
  </channel>
</rss>
