Nikita

Posted on May 19

I Asked an AI to Check My Payment App for DORA Compliance. Here's What It Found.

#fintech #compliances #ai

Six months into building a payment orchestration layer, I was confident about the compliance posture. I had read the DORA documentation. I had looked at the EBA technical standards. I had a rough sense of what the obligations were. Then I ran an actual gap analysis and got back a list of seventeen control failures, and the confidence evaporated fairly quickly.

Most of the failures were not in places where I had been careless. They were in decisions I had made deliberately in week two, considered reasonable at the time, and never gone back to look at. That is a different kind of problem.

This post is a record of what the review found, what I changed, and one thing I decided not to change. It is also, if I am honest, a note to myself about how easy it is to conflate "I understand the regulation" with "my system satisfies the regulation."

What DORA actually requires from developers

The Digital Operational Resilience Act has applied across the EU financial sector since 17 January 2025. It covers financial entities and their critical ICT third-party providers. If you are building payment infrastructure for a regulated firm, or building the regulated firm yourself, DORA's Chapter II requirements are your problem as much as they are the compliance team's.

The thing that catches developers out is that DORA is primarily an evidence obligation. Understanding the requirements is the easy part. The harder part is building systems that produce the right evidence, continuously, and can reconstruct what happened when something goes wrong. That distinction took me a while to internalise.

Four areas come up most often in engineering decisions:

ICT risk management — documented controls for identifying, protecting against, detecting, responding to, and recovering from ICT-related incidents. A policy document does not satisfy this. The requirement is a demonstrable, tested system.

Incident classification and reporting — major ICT incidents must be reported to the relevant competent authority within specific timeframes. Your logging and alerting infrastructure determines whether you can meet those windows.

Digital operational resilience testing — threat-led penetration testing (TLPT) for significant firms, and basic resilience testing across the board. If your test environments differ materially from production, your resilience testing may fail to demonstrate operational resilience under real-world conditions.

Third-party risk management — every critical ICT third-party provider needs a documented register, contractual clauses covering audit rights and exit strategies, and ongoing monitoring. That includes your cloud provider, your payment processor, and your monitoring SaaS.

The application

The system sits in the middle of payments. It takes in a payment instruction, sends it to the right processor, tries again if something fails, checks that the outcome matches, and keeps a record of everything that happened. Node.js, PostgreSQL, Redis for queuing, deployed on AWS. Third-party integrations: Stripe, a fraud scoring API, and Datadog for observability.

Fairly standard setup. Nothing is exotic.

I ran the review using AI Governor, which maps your described architecture against regulatory control requirements and surfaces gaps with specific references to the relevant DORA articles. You describe your system (infrastructure, integrations, data flows, incident response setup), and it returns a structured gap analysis tied to specific obligations.

I went in expecting it to flag the two or three things I already knew were incomplete. It flagged those. It also found eleven things I genuinely had not considered problems.

What came back

1. Logging retention was below supervisory expectations

My Datadog retention was set to 15 days. That was the configuration I had carried forward from development, and I had never changed it because it had never caused a visible problem.

DORA's incident reporting and auditability requirements, combined with EBA technical standards and broader financial-sector recordkeeping obligations, mean that fifteen-day retention periods are unlikely to satisfy supervisory expectations for incident reconstruction and regulatory review. When an incident occurs, regulators expect you to reconstruct exactly what happened. A two-week log window makes that impossible for anything that surfaces slowly — a fraud pattern, a data integrity issue, a control that degraded gradually over weeks.

Fix: Moved critical transaction logs and incident records to S3 with a five-year retention policy. Datadog stays for live monitoring; S3 becomes the compliance archive. Added a daily reconciliation job that confirms log delivery to S3 and alerts if the previous day's volume drops below threshold.

Simple fix. I had just never made it.
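A minimal sketch of the volume check inside a daily reconciliation job like the one described above. The function name, the baseline window, and the 50% ratio are assumptions for illustration. Counting the objects actually delivered to S3 is left out, since that depends on how the archive is laid out.

```javascript
// Hypothetical helper: decides whether yesterday's archived log volume looks healthy.
// `baselineCounts` holds the number of log records archived on previous days.
function checkLogDelivery(yesterdayCount, baselineCounts, minRatio = 0.5) {
  if (baselineCounts.length === 0) {
    // No history yet: the only thing we can flag is a completely empty day.
    return { ok: yesterdayCount > 0, threshold: 1 };
  }
  const mean =
    baselineCounts.reduce((sum, n) => sum + n, 0) / baselineCounts.length;
  // Alert when yesterday falls below a fraction of the recent average.
  const threshold = Math.ceil(mean * minRatio);
  return { ok: yesterdayCount >= threshold, threshold };
}
```

When `ok` is false the job would raise an alert. A ratio against a rolling average is only one possible rule, and a fixed floor may suit a low-traffic system better.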

2. No documented RTO or RPO — anywhere

I had a mental model of acceptable downtime: an hour is fine, four hours is a bad day, longer than that is a serious problem. That model had never been written down, never been tested, and appeared nowhere in the codebase, runbooks, or infrastructure configuration.

DORA Articles 11 and 12 require that financial entities set recovery time objectives and recovery point objectives for ICT systems supporting critical functions, and that those objectives directly inform the backup and failover configuration. A mental model does not satisfy that requirement.

Short version: if you cannot point to a written RTO, you do not have one for regulatory purposes.

Fix: Documented RTO of 4 hours and RPO of 1 hour for the payment orchestration layer. That immediately surfaced a problem: the RDS automated backup schedule was set daily, which does not meet a 1-hour RPO. Moved to continuous backup with point-in-time recovery. Added a quarterly failover drill to the engineering calendar, because untested failover procedures have a way of not working the way you think they will.
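The arithmetic behind that finding is small enough to write down. The sketch below assumes that worst-case data loss equals the longest gap between recoverable points, and that a failover drill records when the outage began and when service was restored. All names are hypothetical.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Worst-case data loss is the longest gap between two recoverable points,
// so the backup interval must not exceed the documented RPO.
function meetsRpo(backupIntervalMs, rpoMs) {
  return backupIntervalMs <= rpoMs;
}

// Compares the recovery time measured in a failover drill with the documented RTO.
function drillPassed(outageStart, serviceRestored, rtoMs) {
  return serviceRestored.getTime() - outageStart.getTime() <= rtoMs;
}

// A daily backup against a 1-hour RPO fails the check.
const dailyBackupOk = meetsRpo(24 * HOUR_MS, 1 * HOUR_MS); // false
```

Keeping the RTO and RPO as named constants next to the infrastructure configuration gives a reviewer one place to confirm that the written objective and the deployed setting agree.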

3. Third-party register did not exist

I had a spreadsheet of integrations. It had the API endpoint, the credentials reference, and a notes column I had stopped updating around month three.

What it lacked was the data classification of information shared with each provider, the contractual clauses covering audit rights, the exit strategy if a provider became unavailable, or any assessment of whether the provider qualified as a critical ICT provider within our operational risk assessment under DORA Article 28.

Stripe processes every payment. It is operationally critical to this application. The register entry for Stripe had none of the required fields.

Fix: Built a proper third-party ICT register in Notion: provider name, services provided, data classification, criticality assessment, contract reference, audit rights clause status, exit strategy summary, last review date. Unglamorous work. It took half a day. It had been the missing piece for eight months, and I kept not doing it because it felt like paperwork rather than engineering. That distinction turned out to be a false one.
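If the register is ever exported from Notion as structured data, a completeness check is straightforward. This is a sketch under assumptions: the key names below are one possible spelling of the columns listed above, and the 365-day review cadence is a placeholder, not a regulatory figure.

```javascript
// Keys mirror the register columns; the exact spelling is an assumption.
const REQUIRED_FIELDS = [
  "providerName",
  "servicesProvided",
  "dataClassification",
  "criticality",
  "contractReference",
  "auditRightsStatus",
  "exitStrategy",
  "lastReviewDate",
];

// Returns the required fields that are absent or blank on a register entry.
function missingRegisterFields(entry) {
  return REQUIRED_FIELDS.filter((field) => {
    const value = entry[field];
    return value === undefined || value === null || String(value).trim() === "";
  });
}

// True when the last review is missing, unparseable, or older than maxAgeDays.
function reviewOverdue(entry, now, maxAgeDays = 365) {
  const last = new Date(entry.lastReviewDate);
  if (Number.isNaN(last.getTime())) return true;
  return now.getTime() - last.getTime() > maxAgeDays * 24 * 60 * 60 * 1000;
}
```

Run against an entry that holds only a provider name and a service description, `missingRegisterFields` returns the other six fields.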

4. Retry logic had no circuit breaker

This was the finding that surprised me most, because the retry logic had been a deliberate architectural decision.

My payment routing layer retried failed processor calls with exponential backoff. I had considered it a solid resilience pattern. What the review flagged was that under DORA's resilience testing requirements, retry behaviour without a circuit breaker creates a failure mode that can cascade into a broader outage — exactly what DORA Article 25 resilience testing is designed to surface.

A missing circuit breaker does not show up in a code review. It shows up in a resilience test report, at which point the gap is on record. The fix here is less about the regulation and more about the fact that the regulation forced me to look at something I should have looked at anyway.

Fix: Implemented the circuit breaker pattern using opossum (Node.js). Thresholds at 50% error rate over a 10-second window, 30-second reset. Added a Datadog monitor on circuit state changes. Wrote a runbook entry covering the manual override procedure, because whoever is on call at 2am deserves something useful in front of them.
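To make the pattern concrete, here is a dependency-free sketch of a circuit breaker with the same thresholds. It is a stand-in for illustration, not opossum's implementation. In opossum the corresponding options are `errorThresholdPercentage`, `rollingCountTimeout`, and `resetTimeout`. The injectable `now` clock is an assumption added to make the behaviour testable.

```javascript
// Minimal illustrative breaker: closed -> open -> halfOpen -> closed.
class CircuitBreaker {
  constructor({
    errorThresholdPercentage = 50, // trip at a 50% error rate
    windowMs = 10000, // measured over a rolling 10-second window
    resetMs = 30000, // allow a trial call 30 seconds after opening
    now = Date.now,
  } = {}) {
    Object.assign(this, { errorThresholdPercentage, windowMs, resetMs, now });
    this.state = "closed";
    this.openedAt = 0;
    this.calls = []; // { at, ok } entries inside the rolling window
  }

  // Records one call outcome and updates the breaker state.
  // Note: with no minimum volume, a single failure in an empty window trips it.
  record(ok) {
    const t = this.now();
    this.calls.push({ at: t, ok });
    this.calls = this.calls.filter((c) => t - c.at <= this.windowMs);
    const failures = this.calls.filter((c) => !c.ok).length;
    const errorRate = (failures / this.calls.length) * 100;
    if (!ok && (this.state === "halfOpen" || errorRate >= this.errorThresholdPercentage)) {
      this.state = "open";
      this.openedAt = t;
    } else if (ok && this.state === "halfOpen") {
      this.state = "closed";
      this.calls = [];
    }
  }

  // Runs `action` through the breaker, failing fast while the circuit is open.
  async fire(action) {
    if (this.state === "open") {
      if (this.now() - this.openedAt < this.resetMs) {
        throw new Error("circuit open");
      }
      this.state = "halfOpen"; // let one trial call through
    }
    try {
      const result = await action();
      this.record(true);
      return result;
    } catch (err) {
      this.record(false);
      throw err;
    }
  }
}
```

Wrapping the processor call in `fire` means the retry loop stops hitting a failing processor once the circuit opens, which is the cascade the review flagged. State changes are also the natural place to emit the metric that a monitor watches.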

5. Incident classification had no severity matrix

When something broke, I knew it was bad. I did not have a documented severity classification framework to determine, within the DORA-mandated timeframe, whether the incident crossed the threshold for regulatory reporting.

The EBA's regulatory technical standards on incident classification use criteria that include the number of clients affected, geographic spread, duration, economic impact, and reputational impact. None of those five appeared anywhere in my incident response process.

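The reporting clock starts from the moment you classify an incident as major. It does not start when you fix it. The initial notification is due within 4 hours of that classification, and in any case no later than 24 hours after you first became aware of the incident. If your classification process takes two days, you have already missed the initial notification window regardless of how quickly the underlying issue was resolved.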
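The effect of slow classification is easy to see as a calculation. The sketch below assumes an initial notification window of 4 hours from classification, capped at 24 hours from awareness. Both values are parameters, because the applicable figures should be checked against the current reporting standards rather than taken from this example.

```javascript
const HOUR = 60 * 60 * 1000;

// Returns the earlier of the two limits: time since classification as major,
// and time since the incident was first detected.
function initialNotificationDeadline(
  awareAt,
  classifiedAt,
  { fromClassificationMs = 4 * HOUR, fromAwarenessMs = 24 * HOUR } = {}
) {
  const byClassification = classifiedAt.getTime() + fromClassificationMs;
  const byAwareness = awareAt.getTime() + fromAwarenessMs;
  return new Date(Math.min(byClassification, byAwareness));
}

// Detected at midnight on 1 March, classified two days later.
const deadline = initialNotificationDeadline(
  new Date("2025-03-01T00:00:00Z"),
  new Date("2025-03-03T00:00:00Z")
);
// The deadline falls before the classification time, so it has already passed.
const alreadyMissed = deadline.getTime() < new Date("2025-03-03T00:00:00Z").getTime();
```

Under these assumptions the deadline is 2 March at 00:00 UTC, a full day before the classification decision was made.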

Fix: Added a severity matrix to the runbook covering the five EBA classification criteria with specific thresholds for this application. Added an incident log template that captures the classification decision and timestamp at the point of triage. It took longer to agree on the thresholds than to write the document.
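A severity matrix of this kind can also live in code, so that triage produces a timestamped decision automatically. Everything specific in this sketch is hypothetical: the threshold values, the field names, and the rule that two breached criteria make an incident major. The real thresholds come from the technical standards and your own risk assessment.

```javascript
// Illustrative thresholds only; not regulatory values.
const THRESHOLDS = {
  clientsAffected: 1000,
  countriesAffected: 2,
  durationMinutes: 120,
  economicImpactEur: 100000,
};

// Evaluates an incident against the matrix and records when the decision was made.
function classifyIncident(incident, classifiedAt = new Date(), minBreaches = 2) {
  const breached = [];
  if (incident.clientsAffected >= THRESHOLDS.clientsAffected) breached.push("clients");
  if (incident.countriesAffected >= THRESHOLDS.countriesAffected) breached.push("geography");
  if (incident.durationMinutes >= THRESHOLDS.durationMinutes) breached.push("duration");
  if (incident.economicImpactEur >= THRESHOLDS.economicImpactEur) breached.push("economic");
  if (incident.reputationalImpact === true) breached.push("reputation");
  return {
    major: breached.length >= minBreaches, // assumed rule, for illustration
    breached,
    classifiedAt: classifiedAt.toISOString(),
  };
}
```

Storing the returned object in the incident log captures both the classification and the moment it was made, which is the timestamp the reporting clock depends on.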

The one thing I did not fix

The review flagged our use of Stripe as a critical third party without a contractual audit rights clause. Stripe's standard terms do not include the individual audit rights that DORA Article 30 technically requires.

This is a widely recognised challenge across the sector, particularly with hyperscale and large platform providers that do not offer bespoke contractual terms to individual customers. The practical resolution is to document the gap, reference Stripe's SOC 2 Type II and ISO 27001 certifications as compensating controls, and record it in the third-party register as an accepted risk pending any industry-level resolution.

I documented that. Any compliance auditor who has worked in payments for more than six months would understand why attempting to renegotiate Stripe's standard terms is not a practical option.

What I took away from this

Seventeen gaps. Most of them were not in the code. They were in configuration that had never been updated, documentation that had never been written, and operational assumptions that had never been tested or written down. Log retention is a configuration value. The third-party register is a spreadsheet. The RTO is a sentence in a document.

The review did not tell me my system was insecure or broken. It told me my system could not demonstrate that it was doing what I said it was doing. That is the gap DORA is actually trying to close, between "we have controls" and "here is the evidence that our controls ran on Tuesday at 14:32."

Run the review before you go live. The fixes are almost never as dramatic as you fear. Finding them after your first audit is a different experience entirely. If you are also building AI into your payment flows, the EU AI Act's 2 August 2026 deadline for high-risk system obligations is closer than most teams have planned for.

What does your third-party ICT register currently look like?

The DORA review in this post was conducted using AI Governor. All architectural details are representative of common payment application patterns.