DEV Community: Sakura Sky

Identity Is the Bank Now

Andrew Stevens — Wed, 15 Jul 2026 09:56:08 +0000

A customer who has banked with the same institution for years is known to it in a number of different ways:

To the onboarding system that ran the identity checks, the customer is a passport, an address, and a date on which verification was completed.
To the fraud engine, the same customer is a pattern of devices, locations, and spending that it scores in real time.
To the anti-money-laundering system, a risk rating and a stack of alerts, most of them cleared long ago.
To the app and the contact centre, a name, a photograph, and a service history.

Four example systems, four versions of one person, and they do not agree with one another. When that customer moves house, the bank often hears about it three times, because three of the four systems each have no idea that the other two have already been updated. The customer notices the seams. So, in a different way, do the people who defraud banks for a living.

Every retail bank has some version of this customer, and the fragmentation is not the failing of one institution. Fraud, financial crime, KYC, and customer experience grew up as separate disciplines, with separate teams, separate budgets, and separate systems, and for a long time that separation held up fine. It holds up much less well now, because the four have collapsed into a single engineering question, and banking identity is the name for the answer.

Does the bank hold one reliable, current, shared understanding of who its customer is, and can every function that needs it read from and write to that understanding in close to real time? The wider series framed identity as the point where trust stops being a claim and becomes an engineering output (see Trust Is an Engineering Output).

Inside a bank, identity is turning into the thing the whole institution runs on.

The seamed customer

Look first at what the seams cost, because they are expensive on both sides at once. For the customer, they show up as friction. A repeat verification for a transaction the fraud engine did not recognise, a wait while a legitimate payment is held, a request to re-confirm details the bank already holds, an address updated in one channel while another channel keeps posting to the old one. None of this is catastrophic. All of it, accumulated across millions of customers, is the texture of a bank that feels harder to deal with than it should.

The other side is where it gets serious.

The same seams are an attack surface, and financial crime lives in the gaps between the four systems. A synthetic identity, assembled from real and fabricated fragments, can pass onboarding because the KYC system checks documents rather than coherence, and then behave well enough that the fraud engine, which never saw the onboarding anomaly, has no reason to look twice. An account takeover exploits the fact that the fraud engine sees a device change the AML system will never hear about, while the AML system sees a transaction pattern the fraud engine treats as somebody else's problem. No single system holds the whole picture of the customer, so no single system can see the whole picture of the attack. The criminal's real advantage is not sophistication. It is that the bank's view of the customer is split four ways and the criminal's view of that same customer is whole.

How identity got fragmented

The fragmentation was built one sensible decision at a time. The KYC system arrived to satisfy onboarding obligations, and those obligations are real: the global standard requires a bank to identify and verify the customer, identify beneficial owners, understand the relationship, and monitor it on an ongoing basis (FATF, 2012). The AML monitoring system was bought separately, often from a different vendor, and keyed on accounts and transactions rather than on people. The fraud engine came in for real-time scoring, keyed on sessions and devices. The customer relationship system was built for service, keyed on a contact record. Each was the right tool for its job, procured by the team that owned that job, on its own timeline.

What none of them shared was a canonical idea of the customer. Each held its own identifier for the same human being, and nothing tied those identifiers together into one durable customer identity. So the bank ended up able to answer four narrow questions well and the one broad question, who is this customer and what do we currently know about them, not at all. This is the state most KYC architecture is in: strong at the point check, silent on the whole.

The regulatory current now runs the other way on both sides of the Atlantic, which is worth noticing. In the United States, FinCEN's customer due diligence rule already requires banks to identify and verify the beneficial owners behind their legal-entity customers and to keep that understanding current, and a February 2026 order recast the refresh obligation on a risk basis rather than as a mechanical repeat at every new account (FinCEN, 2016; FinCEN, 2026). In the European Union, the 2024 anti-money-laundering reform moves the bloc onto a single rulebook and stands up a central authority to supervise it, with directly applicable rules on customer due diligence and beneficial ownership designed to make that data consistent across institutions rather than bespoke to each (European Parliament and Council, 2024). Different regimes, one direction of travel, toward shared and coherent identity. A bank whose own AML data cannot be reconciled across its four internal systems is starting that journey a long way back.

What a unified identity layer looks like

The fix is a unified identity layer, and it is worth being precise about what that means, because it is not another copy of the data in a warehouse. It is one canonical, current, authoritative record of each customer, built by resolving the fragments the four systems already hold into a single entity. Entity resolution, the work of deciding that this passport, that device pattern, and that contact record all refer to the same real person, is the hard technical core of it, and it is never perfectly clean, which is exactly why it has to be engineered rather than assumed.
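To make the entity-resolution step concrete, here is a minimal sketch that groups fragments from the four systems whenever they share an exact attribute value. The fragment shapes, identifiers, and match keys are all hypothetical, and exact matching is a deliberate simplification: real resolution has to cope with fuzzy, conflicting, and stale attributes, which is where most of the engineering effort goes.

```python
from collections import defaultdict

# Hypothetical fragments: each source system holds its own identifier plus a
# few attributes that may overlap with another system's view of the same person.
FRAGMENTS = [
    {"system": "kyc", "id": "K-1001", "passport": "P123", "email": "a.lee@example.com"},
    {"system": "fraud", "id": "F-77", "device": "dev-9", "email": "a.lee@example.com"},
    {"system": "aml", "id": "A-5", "account": "ACC-1", "passport": "P123"},
    {"system": "crm", "id": "C-42", "email": "b.khan@example.com"},
]

# Assumed match keys; a real layer would weight and validate these.
MATCH_KEYS = ("passport", "email", "device", "account")


def resolve(fragments, match_keys=MATCH_KEYS):
    """Group fragments that share any exact value on a match key (union-find)."""
    parent = list(range(len(fragments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps trees shallow
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    first_seen = {}  # (key, value) -> index of the first fragment carrying it
    for idx, frag in enumerate(fragments):
        for key in match_keys:
            value = frag.get(key)
            if value is None:
                continue
            if (key, value) in first_seen:
                union(idx, first_seen[(key, value)])
            else:
                first_seen[(key, value)] = idx

    groups = defaultdict(list)
    for idx, frag in enumerate(fragments):
        groups[find(idx)].append((frag["system"], frag["id"]))
    return sorted(groups.values())


if __name__ == "__main__":
    for entity in resolve(FRAGMENTS):
        print(entity)
```

In this toy data the KYC, fraud, and AML fragments resolve to one entity through a shared passport and email, while the CRM record stays separate. Transitive linking is the point of using union-find: two fragments need not share an attribute directly if a third connects them.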

The layer holds the durable answer to who the customer is, and links out to the signals each function produces: the KYC status and its evidence, the AML risk rating and its alerts, the live fraud signals, the service history. The point is not that one team now owns everything. The point is that any function can see the whole customer without having to own all of the customer. The fraud engine can factor in that onboarding flagged something odd. The AML system can see that the fraud engine just watched the customer's device and location change. Onboarding can stop asking for what the bank already holds. The identity layer becomes the shared, low-latency, authoritative view that every other system reads from and contributes to, and building that view out of messy source data is a serious piece of identity engineering rather than a reporting exercise.

The security architecture that supports it

Here is the part that a bank has to take seriously before it builds any of this, because getting it wrong turns an asset into a liability. A unified identity layer concentrates the most sensitive data in the entire institution into one place. That is precisely what makes it valuable, and precisely what makes it the highest-value target the bank owns. Unify identity without hardening it and the result is a better-organised honeypot.

So the unification and the protection are the same project, and banking security has to be designed into the layer from the first day rather than added once it works. That means access that is fine-grained and purpose-limited, so the fraud engine reads the fields it needs for fraud prevention and nothing more, and the contact centre sees a service view that does not expose the full financial-crime picture. It means strong identity for the workloads and the people reaching the layer, not just for the customers described in it. It means tokenising the most sensitive attributes, so that a breach of one consumer does not spill raw identity data. And it means a complete, tamper-evident record of every read and write, both because a supervisor will eventually ask who saw what, and because the bank itself needs to know. Get this right and the concentration of identity is a strength. Get it wrong and it is the worst single point of failure a bank could design. This seam between unifying data and defending it is the ground Sakura's Security practice works on with a bank's identity teams.
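Two of those controls, purpose-limited access and tokenised attributes, can be sketched in a few lines. Everything named here is an assumption for illustration: the purposes, the field names, and the use of a keyed HMAC as a stand-in for a proper token vault. Workload identity and the tamper-evident access log are out of scope for this sketch.

```python
import hashlib
import hmac

# Hypothetical purpose-to-field policy; real purposes and field names would
# come from the bank's own data governance, not from this sketch.
POLICY = {
    "fraud_prevention": {"customer_id", "device_ids", "kyc_status", "passport_number"},
    "customer_service": {"customer_id", "name", "service_history"},
}
SENSITIVE = {"passport_number"}


def tokenise(value: str, key: bytes) -> str:
    """Replace a raw value with a keyed, non-reversible token (HMAC-SHA256)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def view_for(record: dict, purpose: str, key: bytes) -> dict:
    """Return only the fields the purpose allows, with sensitive ones tokenised."""
    allowed = POLICY.get(purpose)
    if allowed is None:
        raise PermissionError(f"unknown purpose: {purpose}")
    return {
        field: tokenise(value, key) if field in SENSITIVE else value
        for field, value in record.items()
        if field in allowed
    }


if __name__ == "__main__":
    record = {
        "customer_id": "CUST-1",
        "name": "A. Lee",
        "passport_number": "P123",
        "device_ids": ["dev-9"],
        "kyc_status": "verified",
        "aml_risk_rating": "medium",
        "service_history": ["card replaced"],
    }
    print(view_for(record, "customer_service", b"demo-key"))
    print(view_for(record, "fraud_prevention", b"demo-key"))
```

The service view never contains the financial-crime fields, and the fraud view sees a token rather than the raw passport number, so a consumer can still match on the attribute without holding it.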

What changes once it exists

Come back to that customer, and follow what a working identity layer changes for them and for the bank at the same time. The address update lands once and every channel reflects it. The legitimate payment clears because the fraud engine can see the customer is exactly who the rest of the bank already knows them to be. Onboarding to a new product takes minutes because the bank reuses what it verified years ago instead of starting over. The friction that made the bank feel hard to deal with mostly disappears, and it disappears for the same reason the bank gets safer.

Because the criminal loses the seams. Synthetic identities are harder to sustain when onboarding, fraud, and financial-crime signals resolve to one view that has to stay coherent over time. Account takeover is harder when the device change the fraud engine sees and the transaction pattern the AML system sees are looking at the same customer record rather than two strangers. Fraud prevention and financial-crime detection stop being separate contests the bank fights with half the picture each.

And the bank gains the property the whole series has been circling. It can answer, for any customer, what it knows and how it knows it, on demand and with the evidence attached. That is trust as an output of the architecture rather than an assertion in a policy, and it turns out to run on a clean identity foundation. A bank that builds one is not a slow bank: Xapo Bank reached production on a tightly governed stack in weeks rather than quarters, because the foundations were engineered once and everything else inherited them.

The unified identity layer that all four functions read from and write to has to be built out of the fragmented systems a bank already runs, and constructing that resolved, current, authoritative view of the customer is the work Sakura's Data & AI practice does inside financial institutions.

References

European Parliament and Council, 2024. Regulation (EU) 2024/1624 of the European Parliament and of the Council of 31 May 2024 on the prevention of the use of the financial system for the purposes of money laundering or terrorist financing (Anti-Money Laundering Regulation). Official Journal of the European Union, L, 19 June. Available at: https://eur-lex.europa.eu/eli/reg/2024/1624/oj [Accessed 10 July 2026].

Financial Action Task Force, 2012. International Standards on Combating Money Laundering and the Financing of Terrorism and Proliferation: The FATF Recommendations (updated). Financial Action Task Force, Paris. Available at: https://www.fatf-gafi.org/en/publications/Fatfrecommendations/Fatf-recommendations.html [Accessed 10 July 2026].

Financial Crimes Enforcement Network (FinCEN), 2016. Customer Due Diligence Requirements for Financial Institutions, Final Rule. 31 CFR Parts 1010, 1020, 1023, 1024 and 1026. US Department of the Treasury. Available at: https://www.federalregister.gov/documents/2016/05/11/2016-10567/customer-due-diligence-requirements-for-financial-institutions [Accessed 10 July 2026].

Financial Crimes Enforcement Network (FinCEN), 2026. Order Granting Exceptive Relief from the Beneficial Ownership Requirements for Legal Entity Customers. US Department of the Treasury, 13 February. Available at: https://www.fincen.gov/system/files/2026-02/FinCEN-Order-CCDExceptiveRelief.pdf [Accessed 10 July 2026].

Migrating a Live Real-Time Communications Platform from AWS to Google Cloud

Andrew Stevens — Tue, 14 Jul 2026 11:15:08 +0000

11Sight operates a real-time voice and video engagement platform, the communications backbone behind its AI agents for automotive and hospitality businesses. Calls are the product. An infrastructure migration that takes the platform offline for a weekend was never an option. Sakura Sky's Cloud practice moved the platform from AWS to Google Cloud with a phased hybrid strategy that kept it serving live traffic throughout, and held user-facing downtime at the final cutover to under five minutes.

The Challenge

11Sight's production environment had grown up on EC2 and RDS: a monolithic web application on VMs, a Jitsi-based conferencing stack with video bridges and recorders, and a PostgreSQL database holding the customer data that every call depends on. Three constraints shaped the engagement:

Live traffic, all the time. A real-time communications platform has no quiet maintenance window long enough for a big-bang cutover, so the migration had to run while customers kept making calls.
Prove everything before production. Cutover timing, VPN latency, and application behaviour on GKE all had to be validated against production-grade infrastructure before any customer traffic depended on them, which made a full rehearsal environment a first-class deliverable of the migration plan.
Modernization over relocation. The goal was to land on a cloud-native footing, with Kubernetes where it earned its keep and managed services for state, rather than reproduce the VM-centric estate on new hardware.

The Engineering

We designed a compute-first, three-phase hybrid migration that decoupled the application move from the database move, so each could be validated independently. Four engineering decisions carried the project:

Landing zone before workloads. The first deliverable was a Google Cloud foundation built from Sakura Sky's Enclave Terraform blueprint: organization structure, IAM groups, centralized logging and monitoring, a Shared VPC, and a Cloud HA VPN linking the AWS and GCP networks. Every subsequent resource was defined in Terraform, in repositories created inside 11Sight's environment from day one.
Rehearse the whole migration in staging first. We provisioned a complete staging environment (GKE for the containerized web application, Cloud SQL for PostgreSQL 16, Memorystore with the Valkey engine, Compute Engine instances behind autoscaling groups for the conferencing workloads) and used it to rehearse the entire migration. That rehearsal validated that VPN latency between the GCP application tier and the AWS database was within production thresholds, and produced a measured downtime estimate of 45 to 90 minutes for the final cutover.
Compute first, data second. In production, Phase 1 shifted live traffic from the AWS VMs to the GKE load balancer via DNS while all reads and writes continued against AWS RDS over the VPN. Phase 2 enabled logical replication on RDS and ran a continuous Database Migration Service job into Cloud SQL, keeping the two databases in near-real-time sync.
A rehearsed cold cutover. Phase 3 was a stop-and-go cutover inside the planned window: drain traffic at the GKE ingress, quiesce the source database, promote the Cloud SQL replica once DMS reported no lag, repoint GKE configuration and secrets, and redeploy. A war room voice bridge kept migration, DevOps, development, and QA leads on one channel, and traffic reopened only after internal health checks passed against the new database.
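The ordering of the Phase 3 steps, and in particular the rule that the replica is promoted only once replication reports no lag, can be sketched as a small gate function. This is not the runbook used on the project, which the write-up does not show: every callable below is a hypothetical stand-in for whatever tooling actually drains traffic, reads replication lag, or promotes the replica.

```python
import time
from typing import Callable


def cutover(
    drain_traffic: Callable[[], None],
    quiesce_source: Callable[[], None],
    replication_lag_seconds: Callable[[], float],
    promote_replica: Callable[[], None],
    repoint_and_redeploy: Callable[[], None],
    health_check: Callable[[], bool],
    reopen_traffic: Callable[[], None],
    poll_interval: float = 1.0,
    max_polls: int = 600,
) -> bool:
    """Run the cutover steps in order; return True only if traffic reopened."""
    drain_traffic()
    quiesce_source()

    # Promotion gate: wait until the replica has caught up completely.
    for _ in range(max_polls):
        if replication_lag_seconds() <= 0.0:
            break
        time.sleep(poll_interval)
    else:
        return False  # lag never reached zero, so the replica is not promoted

    promote_replica()
    repoint_and_redeploy()

    # Traffic stays closed unless health checks pass against the new database.
    if not health_check():
        return False
    reopen_traffic()
    return True
```

Passing the steps in as callables keeps the sequence itself testable in a staging rehearsal with fakes, separate from the cloud tooling behind each step.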

The Results

Under 5 minutes of downtime. Against a planned 45 to 90 minute maintenance window, actual user-facing downtime at the production cutover was less than five minutes.
Zero data loss. Continuous DMS replication and the no-lag promotion gate meant the cutover moved the database without losing a single write.
A cloud-native production platform. The web application now runs containerized on GKE, state lives in managed services, and the conferencing fleet scales behind autoscaling groups instead of hand-tended VMs.
A proving ground that outlives the project. The rehearsal environment was built to outlast the migration: a permanent, Terraform-defined staging environment where future releases, upgrades, and scaling decisions get validated before they reach customers.
Everything as code, owned by the client. All infrastructure is defined in Terraform in 11Sight-owned repositories, so there was nothing to hand over at close-out that 11Sight did not already control.

This is the type of work our Accelerate solution delivers on, with foundations laid by Enclave: production capability built inside the client's environment, jointly with their team. Contact us to scope a similar migration.

Regulatory Evidence at Machine Speed

Andrew Stevens — Mon, 13 Jul 2026 09:38:21 +0000

The request arrived on a Tuesday and gave the compliance team twenty-four hours. The regulator wanted to know why the bank's transaction monitoring system had cleared a particular payment eleven months earlier: which rules fired and which did not, what customer risk score applied at that moment, who reviewed the resulting alert, and what that reviewer actually saw. The system that made the decision was still running, unchanged, and working correctly. The evidence about that one decision was somewhere else entirely. It sat in a queue, behind a request to rebuild a dataset, behind a restore from backup, behind a data engineer who had other work booked. The bank was not being accused of anything. It was being asked to show its working, and it had twenty-four hours to discover whether it could.

Banks have always been asked for evidence. What has changed is the nature of the request. The old rhythm was periodic, predictable, and aggregate: a return filed on schedule, a report at quarter end, a sample pulled for an inspection booked weeks in advance. The new rhythm is specific, unscheduled, and granular: this decision, this customer, this moment, and show me now. Most banks built their evidence systems for the first rhythm and are now being asked to serve the second. This is the banking form of a tension that runs through most regulated industries, which the wider series has set out as the gap between evidence and speed (see Evidence Versus Speed). Inside a bank it takes a particular shape, and it has a particular fix.

The twenty-four-hour request

What the compliance team actually had to do, in those twenty-four hours, is the most revealing part of the story. The decision they needed to explain was not recorded anywhere as a decision. It had to be reassembled from three systems that had never been designed to be read together.

The monitoring engine's logs held which rules had evaluated the payment, but those logs rotated after ninety days, so the relevant window had to be restored from backup. The customer risk score was worse. It was recomputed nightly and overwritten each time, so the score that actually applied eleven months ago no longer existed anywhere; it had to be rebuilt by rerunning the scoring logic against archived inputs and hoping the logic had not changed in the meantime. The analyst's review sat in a case management tool with its own retention policy and no link back to the payment except a reference number typed in by hand.

The team got there, barely, and the answer was correct. The payment had been cleared for good reasons and the bank had done nothing wrong. The cost was two engineers for the better part of a week and an uncomfortable realisation in the room afterwards: nobody was confident they could do it again. The decision itself had been sound. What the bank could not do was demonstrate it on demand, and that is a different kind of risk from getting the decision wrong. Under MAS Notice 626, records must be retained and be retrievable in a usable form (Monetary Authority of Singapore, 2022). Retention that exists in principle but cannot be produced when the regulator asks is not really retention at all.

What regulators used to ask for

The reason banks are in this position is that their evidence systems were built, quite rationally, for the requests that used to arrive. Regulatory reporting was a calendar activity. Returns were filed monthly, quarterly, or annually. Inspections were scheduled. The unit of evidence was the aggregate: a capital position, an exposure summary, a count of alerts raised and cleared. Banks built reporting factories to serve that model, with data warehouses, reconciliation processes, sign-off workflows, and a small industry of controls wrapped around the production of the report.

Even the regulation that pushed hardest on data quality assumed this shape. The Basel Committee's principles for effective risk data aggregation and risk reporting, published in 2013, told banks to be able to aggregate risk data accurately and to trace it, and they remain the reference point for banking data lineage (Basel Committee on Banking Supervision, 2013). But the output they had in mind was still a report, produced on a cycle, for a supervisor who would read it later.

Under that model, assembling evidence retrospectively was a perfectly sensible strategy, because the ask was predictable and the deadline was known. Banking compliance became organised around producing documents on a schedule. Evidence was a product of the reporting cycle rather than a property of the transaction, and for a long time nobody had reason to notice the difference.

What they ask for now

The difference is now impossible to miss, because supervisors have stopped confining themselves to the aggregate. They ask about individual decisions, at short notice, and they expect the bank to trace one end to end.

The pressure is visible in the supervisors' own output. More than a decade after the Basel principles, the European Central Bank found it necessary to issue a guide pressing banks on effective risk data aggregation and reporting, precisely because so many still cannot demonstrate complete, end-to-end lineage across their data estate (European Central Bank, 2024). The Digital Operational Resilience Act, in application since January 2025, requires financial entities to maintain registers of their ICT arrangements and to evidence their operational resilience continuously rather than annually (European Parliament and Council, 2022). And the five-year retrievability standard in MAS Notice 626 is not satisfied by a backup tape that takes a week to read.

Put together, these describe a single shift. The deliverable has become the reconstructable chain behind any single decision the bank made, available on request, at something close to the speed the bank operates. The report still gets filed, but it is no longer the thing the supervisor is really testing. Real-time compliance is a slightly misleading phrase, because nobody expects the answer instantly. What is expected is that producing the answer is a query rather than an excavation. A financial services audit is becoming a series of specific questions with specific answers, and regulatory reporting, while it continues, is no longer the whole of the job.

Engineering evidence into the flow

Meeting that expectation is an engineering problem, and it has an engineering answer. The evidence has to be emitted by the system as it runs.

Look again at what those twenty-four hours actually required. Engineers excavated rotated logs from a backup and hoped the restore was complete. They reran scoring logic against archived inputs and hoped the logic had not drifted in eleven months. They tied a case file to a payment through a reference number somebody had typed in by hand. Every step was archaeology. Every step introduced a guess. What the bank finally handed the regulator was a well-argued account of the decision, assembled under deadline by people reconstructing their own system from its debris.

Engineer the evidence into the flow and the same request lands very differently. At the moment the decision is made, the system writes it down. It records the inputs that fed the rule, the version of the rule and the model that evaluated them, the score they produced, the identity of anyone who touched the alert, and the time it happened, all bound together by a single identifier. It preserves the risk score as it stood that day instead of overwriting it at midnight. It hash-links the record, so any later alteration shows. It makes the record addressable, so one payment resolves to one chain. One approach excavates. The other retrieves.
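A minimal sketch of such a record, assuming an in-memory append-only log and SHA-256 hash-linking. The field names and the `DecisionLog` class are hypothetical; a production version would sit on durable, access-controlled storage rather than a Python list.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder previous-hash for the first record


def _digest(body: dict) -> str:
    """Hash a record body using a canonical JSON encoding."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DecisionLog:
    """Append-only log in which each record commits to the one before it."""

    def __init__(self):
        self._records = []

    def append(self, decision_id, inputs, rule_version, model_version,
               score, reviewer, decided_at):
        body = {
            "decision_id": decision_id,      # the single binding identifier
            "inputs": inputs,                # what fed the rule
            "rule_version": rule_version,
            "model_version": model_version,
            "score": score,                  # the score as it stood at decision time
            "reviewer": reviewer,
            "decided_at": decided_at,        # ISO 8601 timestamp supplied by the caller
            "prev_hash": self._records[-1]["hash"] if self._records else GENESIS,
        }
        record = dict(body, hash=_digest(body))
        self._records.append(record)
        return record

    def chain_for(self, decision_id):
        """Make the record addressable: one identifier resolves to its entries."""
        return [r for r in self._records if r["decision_id"] == decision_id]

    def verify(self) -> bool:
        """Recompute every hash and link; any later alteration returns False."""
        prev = GENESIS
        for record in self._records:
            body = {k: v for k, v in record.items() if k != "hash"}
            if record["prev_hash"] != prev or _digest(body) != record["hash"]:
                return False
            prev = record["hash"]
        return True
```

Because the score is written into the record at decision time, the nightly recomputation can keep overwriting the live value without destroying the one that applied eleven months ago, and `verify()` gives a cheap check that nothing in the chain has been edited since.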

This is what engineered compliance means in a bank. The evidence becomes a by-product of operating the system, produced continuously whether or not anyone asks for it. It costs something to build. It costs considerably less than paying for reconstruction every time and never being certain the reconstruction will hold.

None of that capability lives in the monitoring engine. It lives one layer down, in the data foundation that carries lineage and preserved history underneath every transaction the bank processes, and that layer is precisely what Sakura's Data & AI practice builds inside regulated institutions. A bank standing on that foundation can move quickly: Xapo Bank reached production on a tightly governed stack in weeks rather than quarters, because the controls were engineered into the platform instead of being negotiated afresh for every release.

The audit cycle that disappears

The payoff goes well beyond surviving the next twenty-four-hour request. The audit cycle stops being an event in the bank's calendar at all.

When evidence is a running output, the request that consumed two engineers for a week becomes a query answered in minutes by someone in compliance who does not need to open a ticket. Audit-ready banking means the bank answers without a change freeze, without pulling engineers off delivery, and without the background dread that the answer might not be reproducible. Everything else the bank is trying to build keeps moving while the question is answered.

The banks that bolt evidence on afterwards pay for it twice. They pay once in reconstruction, and again in the drag on everything else, because a system whose evidence cannot be produced on demand cannot safely be changed quickly. Every release carries an unpriced risk of breaking a chain nobody can currently see. There is a further dividend, too: a bank that can prove exactly what its transaction monitoring did, and why, can afford to tune it more aggressively, because it can demonstrate the effect of the change rather than argue about it. Evidence engineered into the flow is what lets a bank keep moving while it answers.

The work does not end when the architecture is built. The chain has to hold through every release, every model change, and every new rule, and someone has to be able to prove it still does. Running that discipline inside a bank is what Sakura's GRC service is for.

References

Basel Committee on Banking Supervision, 2013. Principles for effective risk data aggregation and risk reporting. Bank for International Settlements, Basel. Available at: https://www.bis.org/publ/bcbs239.htm [Accessed 10 July 2026].

European Central Bank, 2024. Guide on effective risk data aggregation and risk reporting. ECB Banking Supervision, Frankfurt. Available at: https://www.bankingsupervision.europa.eu/ecb/pub/pdf/ssm.supervisory_guides240503_riskreporting.en.pdf [Accessed 10 July 2026].

European Parliament and Council, 2022. Regulation (EU) 2022/2554 of the European Parliament and of the Council of 14 December 2022 on digital operational resilience for the financial sector (Digital Operational Resilience Act). Official Journal of the European Union, L 333, 27 December, pp. 1-79. Available at: https://eur-lex.europa.eu/eli/reg/2022/2554/oj [Accessed 10 July 2026].

Monetary Authority of Singapore, 2022. MAS Notice 626: Notice to Banks on Prevention of Money Laundering and Countering the Financing of Terrorism. Monetary Authority of Singapore. Available at: https://www.mas.gov.sg/regulation/notices/notice-626 [Accessed 10 July 2026].

Your Ontology Is the Asset. Stop Renting It Back.

Andrew Stevens — Fri, 10 Jul 2026 13:12:11 +0000

Enterprise AI post-mortems have changed character. The older ones read like RAND's catalogue of failure causes: misunderstood problems and data that could not carry the weight, feeding a failure rate the researchers put above 80 percent (Ryseff, De Bruhl and Newberry, 2024). The ones crossing my desk in 2026 name a different trio, over and over. The agent behaved differently today than it did yesterday, same inputs. The model asserted a purchase order, a clause, a valuation model that does not exist. And when the auditor asked why the system did what it did, nobody could produce an answer that survived scrutiny.

Determinism failures.
Hallucinations.
Compliance gaps.

Three symptoms, one defect. A model invents entities when nothing machine-readable tells it what exists. Runs diverge when nothing pins down which actions are permitted, under what conditions, against which objects. Compliance evidence cannot be produced when authority and auditability were never encoded anywhere a system could enforce them.

The older definitional fights still contribute: nobody could agree on what a "customer" was, or whether a work order and a maintenance event were the same thing. A real factor, but no longer the largest. That ambiguity is the human-speed version of the disease that now presents as invented valuation models and unreproducible runs. Each failure is a failure of meaning, and the artifact that fixes all of them has an old name from knowledge engineering: an ontology, an explicit specification of a conceptualisation (Gruber, 1993). Strip the philosophy and it is three registers. What things exist in our business. How they connect. What actions are allowed against them, by whom, with what trace. One register grounds the model, one constrains the behaviour, one produces the evidence.

Everyone rediscovers this eventually. The interesting question in 2026 is not whether you need one. It is who ends up owning it.

The market has already priced this

You do not have to take my word for the value of the thing. Watch where the money went.

Palantir spent roughly two decades as a bespoke integration house before posting its first GAAP-profitable full year in 2023 (Palantir Technologies, 2024). The product that emerged from those years is not a database and not a model. Palantir's own documentation describes the Foundry Ontology as the layer that sits on top of datasets and models and binds them to their real-world counterparts, with object types, link types, and action types as the primitives (Palantir Technologies, 2026). Entities, relationships, permitted actions. The three registers, productised.

Founder-facing commentary has caught up. A recent essay in a series on durable moats holds Palantir up as the template, casting the ontology, accumulated through years of on-site engineering, as the defensible asset that AI-era startups should now race to build inside their own customers (FounderCoHo, 2026). As advice to vendors, I think it is broadly correct, and that is exactly why it should stop enterprise architects mid-coffee.

Read the same claim from the other side of the table. The moat is not the vendor's algorithms. The moat is a machine-readable description of your operations, extracted from your subject-matter experts, refined against your edge cases, on your payroll's time. When a platform's defensibility is defined by how much of your meaning it has captured, the polite word for that is partnership. The accurate word is leverage.

Workflow lock-in is dying, semantic lock-in is replacing it

For twenty years, SaaS retention rested on two things: workflow habit and migration pain. Both are collapsing at once. Coding agents have gutted the cost of the unglamorous work that used to make replatforming a two-year programme: the schema translation, the connector rewrites, the dual-running reconciliation that nobody budgets for and everybody pays for. And data portability has hardened from courtesy into law. Regimes such as the EU Data Act oblige providers to remove obstacles to switching and to hand over exportable data in a structured, machine-readable format (European Parliament and Council, 2023).

So the tables come with you. Here is what does not, unless you engineered for it: the object model that says a rebuild and an overhaul are different events with different warranty consequences. The link that ties a specific invoice line to a specific contract rate table. The action definition that says who may write off a variance and above what threshold a second approval kicks in. The validation logic encoding ten years of hard-won exceptions. None of that is "your data" in the export-a-parquet-file sense. It is your meaning, and in most platforms it is expressed in proprietary configuration that has no life outside the walls.

That is the substitution under way in the AI era. Vendors can no longer hold your rows hostage, so the compounding asset moves up a layer. Every workshop where your maintenance planners explain the difference between a failure and a defect to a vendor's deployed engineers, every bespoke pipeline that encodes a rule your own wiki never captured, deepens a semantic dependency that a data export will never discharge. You can leave with your tables and still be unable to leave, because the map of what the tables mean stays behind.

The stakes compound from here. That same map is what grounds your agents, pins their behaviour, and generates your compliance evidence. Surrender it and you have not just made switching expensive. You have made someone else the landlord of your determinism, your accuracy, and your auditability.

Notice what actually trapped you in that scenario. Not the platform. The location of the definitions. Meaning that exists only as someone else's configuration is meaning you have already surrendered. Which points directly at the counter-move: put the definitions somewhere you control, in a form any platform can consume.

What owning it actually looks like

This is not an argument for refusing the platforms. Ontology-driven platforms are productive precisely because the pattern works, and I will not pretend otherwise. The argument is about what you bring to them. Author the ontology yourself. Version it, govern it, and let platforms compile it. Never accept it as a by-product of a deployment, recoverable only by re-running the workshops.

Concretely: the master copy of your semantic layer lives in a repository you control, in a neutral, boring format, reviewed like code. It does not need to be clever. It needs to be explicit. For a model risk domain in a bank, the skeleton fits on one screen.

entities:
  FinancialModel:
    keys: [model_id]
    properties: [purpose, tier, owner_desk, last_validated]   # tier: 1 | 2 | 3 by materiality
  ValidationReview:
    keys: [review_id]
    properties: [outcome, review_date, open_findings]
  ModelChange:
    keys: [change_id]
    properties: [category, submitted_date, status]   # category: recalibration | methodology | data_source
  MarketDataFeed:
    keys: [feed_id]
    properties: [provider, asset_class, snapshot_frequency]

relationships:
  - FinancialModel owned_by TradingDesk
  - ValidationReview assesses FinancialModel
  - ModelChange applied_to FinancialModel
  - FinancialModel consumes MarketDataFeed

actions:
  approve_model_change:
    target: ModelChange
    guard: latest_validation_outcome == approved && open_findings == 0
    authority: model_risk_officer
  grant_production_use:
    target: FinancialModel
    guard: months_since(last_validated) <= 12
    authority: head_of_model_risk
    audit: mandatory

Nothing in that fragment is sophisticated. That is the point. A head of model validation can read it and object to it, which is the property that matters most. From a file like this you can generate platform configuration, warehouse semantic models, API contracts, agent tool definitions, and access policies. When you change vendors, the file comes with you and the regeneration is mechanical. The workshops do not have to be re-run, because their output was never trapped in a console.
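To make "the regeneration is mechanical" concrete, here is a minimal sketch of one such generator, assuming the file has already been parsed into a plain dictionary. The `compile_tools` function and the shape of the emitted tool definitions are hypothetical, chosen for illustration; a real target platform would dictate its own format.

```python
# Hypothetical parsed form of part of the ontology file (inlined so the
# sketch is self-contained; in practice it would be loaded from the repo).
ONTOLOGY = {
    "entities": {
        "FinancialModel": {"keys": ["model_id"]},
        "ModelChange": {"keys": ["change_id"]},
    },
    "actions": {
        "approve_model_change": {
            "target": "ModelChange",
            "authority": "model_risk_officer",
        },
        "grant_production_use": {
            "target": "FinancialModel",
            "authority": "head_of_model_risk",
        },
    },
}


def compile_tools(ontology):
    """Emit one tool definition per action, keyed by the target entity's keys."""
    tools = []
    for name, spec in sorted(ontology["actions"].items()):
        # A KeyError here means an action points at an entity the file
        # never declared, which is worth failing on at build time.
        entity = ontology["entities"][spec["target"]]
        tools.append(
            {
                "name": name,
                "description": f"{name} on {spec['target']} "
                f"(requires {spec['authority']})",
                "parameters": {key: "string" for key in entity["keys"]},
            }
        )
    return tools
```

The same loop, pointed at a different output template, would produce access policies or API contracts; the file stays the single source and each generator is disposable.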

Now run the three failure modes against it. An agent whose tools are compiled from this file cannot hallucinate a fifth entity type; it can approve a model change that exists or fail loudly, because the entity register bounds what it may talk about. Its behaviour stops drifting between runs, because the guard on approve_model_change is a versioned expression, not a paragraph of prompt the model interprets differently on Tuesday; the judgment lives in the file, and the model's job shrinks to invoking it. And when the auditor arrives, authority: head_of_model_risk and audit: mandatory are not aspirations in a policy document. They are enforced properties of the system, with the trace to prove it. Grounding, determinism, evidence: one artifact, three failure modes retired.
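A minimal sketch of that enforcement, assuming the entity register and action guards have been compiled into code. The `invoke` function, the context keys, and the in-memory audit list are illustrative assumptions, not any platform's API.

```python
from dataclasses import dataclass
from typing import Callable

# Entity register taken from the fragment above.
ENTITY_TYPES = {"FinancialModel", "ValidationReview", "ModelChange", "MarketDataFeed"}


@dataclass(frozen=True)
class Action:
    target: str                    # entity type the action applies to
    guard: Callable[[dict], bool]  # versioned expression, evaluated in code
    authority: str                 # role allowed to invoke the action
    audit: bool = False


ACTIONS = {
    "approve_model_change": Action(
        target="ModelChange",
        guard=lambda ctx: ctx["latest_validation_outcome"] == "approved"
        and ctx["open_findings"] == 0,
        authority="model_risk_officer",
    ),
    "grant_production_use": Action(
        target="FinancialModel",
        guard=lambda ctx: ctx["months_since_last_validated"] <= 12,
        authority="head_of_model_risk",
        audit=True,
    ),
}

AUDIT_LOG = []  # stand-in for a real evidence store


def invoke(action_name, entity_type, role, ctx):
    """Allow an action only if the entity, authority, and guard all check out."""
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"unknown entity type: {entity_type!r}")
    if action_name not in ACTIONS:
        raise ValueError(f"unknown action: {action_name!r}")
    action = ACTIONS[action_name]
    if action.target != entity_type:
        raise ValueError(f"{action_name} does not apply to {entity_type}")
    allowed = role == action.authority and action.guard(ctx)
    if action.audit:
        # Denied attempts are recorded too, so the trace shows what was refused.
        AUDIT_LOG.append({"action": action_name, "role": role, "allowed": allowed})
    return allowed
```

The agent's part reduces to choosing an action name and supplying context. Everything that decides the outcome is an expression under version control.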

Three disciplines follow from treating the ontology this way. First, the neutral form is written before the platform is configured, not reverse-engineered afterwards; if a definition exists only inside a vendor tool, it does not exist. Second, changes go through review by the people who own the business concept, not just the people who own the pipeline, because an ontology nobody disputes is an ontology nobody read. Third, exportability of semantics becomes a procurement gate: ask the vendor to demonstrate, before signature, how object types, links, action definitions, and their guards leave the platform in a documented format. The demonstration is more informative than the answer.

The important bit

Enterprises do not drift into renting their ontology out of stupidity. They drift into it because extracting a real ontology is slow, contested, and politically expensive. It means adjudicating between the plant's definition of downtime and finance's definition of downtime in a meeting neither side enjoys, with a decision at the end that one of them will resent. A vendor's deployed engineers solve a problem your org chart cannot: they are paid outsiders with the standing to sit in that argument until it resolves, and no career at stake in the outcome. Look past the software and that is a large part of what the invoice buys. Arbitration, packaged as engineering.

Buy it if you need it. It is often worth every dollar. But contract for the output, not just the relief. The resolved definitions, the entity registers, the action guards, the exception logic: name them as deliverables, in the neutral form, landing in your repository as they are produced. The vendor keeps the generalisable patterns, as vendors always will. You keep the specific asset your people's time created. That is the difference between hiring a facilitator and donating your institutional knowledge to someone else's balance sheet.

The founder-side essays are right about the destination. Durable advantage now accumulates in meaning, not storage, and whoever holds the authoritative description of an operation holds pricing power over everyone who needs it. Those essays simply address the party that hopes to hold it. I am addressing the other party. Your organisation has spent decades learning how it actually works, and most of that knowledge still sits in the heads of people who are very good at their jobs and very close to retirement. Turning it into explicit, versioned, portable form is the most valuable data engineering your team can do this decade. It is the same artifact, remember, that retires the three failures filling your post-mortems: it grounds what your agents may assert, fixes what they may do, and proves both to anyone who asks. Where that artifact lives when the work is done decides who banks the value.

Build the map. Let the platforms render it. Never let them own it.

References

European Parliament and Council (2023) Regulation (EU) 2023/2854 of the European Parliament and of the Council of 13 December 2023 on harmonised rules on fair access to and use of data (Data Act). Official Journal of the European Union. Available at: https://eur-lex.europa.eu/eli/reg/2023/2854/oj (Accessed: 10 July 2026).

FounderCoHo (2026) How ontology became a moat: Palantir's FDE model, demystified. Substack, 8 July. Available at: https://foundercoho.substack.com/p/how-ontology-became-a-moat-palantirs (Accessed: 10 July 2026).

Gruber, T.R. (1993) 'A translation approach to portable ontology specifications', Knowledge Acquisition, 5(2), pp. 199-220.

Palantir Technologies (2024) Palantir reports its fifth consecutive quarter of GAAP profitability; fourth quarter GAAP EPS of $0.04. Business Wire, 5 February. Available at: https://www.businesswire.com/news/home/20240203047330/en/ (Accessed: 10 July 2026).

Palantir Technologies (2026) Ontology overview, Foundry documentation. Available at: https://www.palantir.com/docs/foundry/ontology/overview (Accessed: 10 July 2026).

Ryseff, J., De Bruhl, B.F. and Newberry, S.J. (2024) The root causes of failure for artificial intelligence projects and how they can succeed: avoiding the anti-patterns of AI. RAND Corporation, RR-A2680-1. Available at: https://www.rand.org/pubs/research_reports/RRA2680-1.html (Accessed: 10 July 2026).

Trust Is an Engineering Output

Andrew Stevens — Fri, 10 Jul 2026 09:21:55 +0000

A players' union arrived at the athlete-data platform to investigate how its members' data was being handled. The platform had nothing to hide: its access controls were sound, its data accurate, its sharing rules strictly enforced, and its deletion process reliable. By every internal measure, it ran well.

None of that mattered, because the customer had not come to be told the system was correct. It had come to be shown. Trust turned out to be the one thing the engineering team could not produce on demand.

The union meeting

What follows is a composite, but anyone who has sat on the engineering side of a meeting like it will recognise the pattern.

The union's representative opened with a simple question. The platform held biometric and performance data on every player he represented, and he wanted to know who could see it. The engineering lead had a good answer: access was role-based and enforced in the system rather than written in a policy and hoped for. Medical staff saw medical data, coaching staff saw performance data, and nobody reached anything they were not entitled to. He said he believed her, and then he asked her to show him. For one named player, on one date, who had accessed his data, and why.

That was where the good answers ran out.

The records existed, but they lived across several systems, and assembling them into a single account would take the team a day or two. He moved on. Could she prove the player's data had never been shared with a betting partner, or with a club he was about to be transferred to? It would not have been, she said, because that sharing was not permitted. Not permitted, he pointed out, is a policy and not a proof, and he was asking whether she could demonstrate it had not happened. She would have to go and check.

The last question was about deletion. The player had asked for last season's data to be removed, and it had been, in the primary database. What about the backups, the analytics copies, and the model that had already been trained on it? Those were separate systems. She would have to come back to him.

The representative was not hostile, and he was careful to say he was not accusing anyone of anything. His point was narrower and harder to answer. If the platform could not show him what it had done, then from where he sat, the inability to prove it was indistinguishable from the thing he had come worried about. That was the part that had to change.

The engineering lead was almost certainly telling the truth throughout. The controls existed, the sharing had not happened, and the deletion had been actioned. What she could not do, in the room, with the person who had standing to ask, was produce a verifiable account of any of it. She was not being tested on whether the platform was well run. She was being tested on whether it could demonstrate its integrity on demand, to an outsider. The gap between being correct and being able to prove it is the whole of the problem, and it is where the trust the platform assumed it had quietly disappeared.

Correct but unprovable was good enough for a long time, because nobody with the standing to demand proof was likely to show up. That assumption has failed, and the meeting is what its failure looks like in a room. Closing that gap is the engineering work that the previous five posts in this series each described one face of.

What trust used to mean operationally

For most of the history of enterprise technology, trust was a relationship backed by a document. An organisation was trusted because it had the certifications on the wall, a signed data-processing agreement in the drawer, and an audit sign-off from last year. Trust was a static artefact. It was produced occasionally, by people, for a named audience, describing a state of affairs at a single point in time. A SOC report or an ISO certificate said, in effect, this was true when we looked. Everyone quietly agreed to treat the snapshot as though it still described the present.

That worked because of two assumptions. The first was that the party asking would accept a description in place of the underlying facts. The second was that checking the description against reality was expensive enough that almost nobody did. Trust could rest on reputation and a binder because verification was rare and costly. Those are the same conditions that let evidence be assembled after the fact rather than emitted continuously (see Part 3).

Both assumptions are gone. Individuals now hold enforceable rights to know exactly how their personal data is processed, and to have it corrected or erased, exercisable by them or by their representatives (European Parliament and Council, 2016). A union, a regulator, or a customer no longer has to accept "that is not permitted" as an answer. They can ask to see, and increasingly they have the legal standing to insist. The binder still exists. It is simply no longer sufficient. The question is no longer whether you can describe your controls. It is whether you can prove they held for a specific case.

What it has to mean now

Trust must now be a property the system produces on demand. It has to be verifiable by whoever has standing to ask, without them taking the organisation's word for it. That is the difference between two moves. One is saying, here is our access policy. The other is handing over an intact, tamper-evident record of every access to a player's data, in seconds. That second move is what data trust means once the people asking can insist on proof. It is an output of the system, not a claim about it.
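One common way to make an access record tamper-evident is to hash-link its entries, so that altering any past entry breaks every hash after it. The sketch below is a minimal illustration under that assumption; the event fields and function names are hypothetical, and it uses only Python's standard `hashlib` and `json` modules.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(prev_hash, event):
    """Hash the previous entry's hash together with this event's content."""
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


def append_event(log, event):
    """Append an access event, linking it to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify(log):
    """Recompute the chain; any edited, removed, or reordered entry fails."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


def accesses_for(log, subject, date):
    """Answer 'who accessed this player's data on this date, and why' as a query."""
    return [
        e["event"]
        for e in log
        if e["event"]["subject"] == subject and e["event"]["date"] == date
    ]
```

A chain like this only proves integrity to an outsider if the latest hash is also held somewhere the platform cannot quietly rewrite, which is a deployment decision this sketch leaves open.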

This is what it means to treat trust as an engineering output. Audit-ready systems do not scramble to assemble a defence when the union or the regulator arrives. They answer the question as a query, because the evidence is a running product of how they operate. System trust becomes a matter of whether the architecture can render an account of itself. Transparent systems are simply the ones that can do so without a two-day reconstruction project.

The standards are moving the same way. Demonstrable, certifiable governance of AI systems now has its own management-system standard, ISO/IEC 42001, built around continuous, evidenced control rather than a point-in-time attestation (ISO/IEC, 2023). The demand arrives from every direction at once. Regulators and auditors ask for proof. So do customers, boards, commercial partners, unions, and governments, each with its own standing and its own version of the same two words: show me. The direction is unmistakable. We are moving from asserting trust to producing it.

The engineering disciplines that produce it

Trust is not a single control that can be bought and switched on. It is the sum of the five tensions this series has worked through, each engineered correctly, which is why trust architecture is really an aggregate of the other five rather than a discipline of its own.

Sovereignty engineered into placement lets an organisation show exactly where a given data point lives, and under whose jurisdiction (see Part 1). Load handled so that peak is the steady state means the system holds precisely when scrutiny arrives, which is usually its worst day (see Part 2). Evidence emitted as a runtime property means the record exists before anyone asks for it (see Part 3). Telemetry governed as a data product means the operational truth is captured, owned, and reachable, rather than trapped in the system that produced it (see Part 4). Coherence across clouds means the account is single and complete, not three partial logs no one can reconcile (see Part 5).

Put together, those are the components of engineered trust: portable identity, immutable and hash-linked evidence, first-class lineage, policy enforced as code, and a control plane that spans the estate. None of them is exotic on its own. The difficulty, and the discipline, is engineering them together, so that identity, evidence, lineage, policy, and control reinforce one another instead of sitting in five separate systems that have to be reconciled by hand the moment someone asks.

That is evidence engineering and accountability engineering in the same motion, and it is why a durable governance architecture cannot be bolted on at the end. It is assembled from the data foundation up, which is the work Sakura's Data & AI practice exists to do, and it is secured at the identity and control layer, which is where Sakura's Security practice works. The platform in the meeting did not lack good intentions or competent engineering. It lacked these disciplines wired together, so that an account of itself was always one query away.

How this goes deeper inside each industry

While this series has argued the general case, a set of industry deep-dives is coming next, each taking the argument inside one of the sectors where Sakura Sky works and studying how trust is engineered under its particular pressures:

The financial services set will examine producing regulatory evidence at machine speed, the sovereign-cloud question inside a bank, and identity as the thing the bank now runs on.
The media set will look at launch-day scale as a permanent condition, protecting intellectual property in a generative world, and audience telemetry as a strategic asset.
The government set will treat sovereign cloud as an architecture rather than a procurement, alongside API governance done properly and legacy modernisation that meets the audit bar.
The pharma and healthcare set will cover trial data that survives the inspector, the lab notebook as a data product, and patient data across boundaries.
The QSR and retail set will work through the loyalty backend as a data-engineering problem, thousands of sites on one pipeline, and multi-cloud retail without the mess.

Each brings this same argument down to the ground of a single industry: trust is not claimed, it is built.

Across every one of those industries, the cross-industry trust question reduces to the same thing. Can the organisation produce, on demand, a verifiable account of what its systems did? Turning that capability from an aspiration into a standing output is what Sakura's GRC service is built to deliver.

References

European Parliament and Council, 2016. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation). Official Journal of the European Union, L 119, 4 May, pp. 1-88. Available at: https://eur-lex.europa.eu/eli/reg/2016/679/oj [Accessed 10 July 2026].

ISO/IEC, 2023. ISO/IEC 42001:2023 Information technology, Artificial intelligence, Management system. International Organization for Standardization and International Electrotechnical Commission. Available at: https://www.iso.org/standard/81230.html [Accessed 10 July 2026].

Multi-Cloud Versus Consolidation

Andrew Stevens — Thu, 09 Jul 2026 08:29:51 +0000

A government agency committed to multi-cloud for resilience. The logic was sound in the business case: spread critical services across more than one provider so that no single outage, and no single commercial relationship, could take the whole thing down. Two years later the agency was running three clouds, each with its own team, its own pipelines, and its own way of doing identity, and none of the three could fail over to either of the others because they had never been built to talk. The resilience was theoretical, the cost was real and recurring, and the commitment was politically unbreakable because it had been announced as strategy.

Nobody could say the multi-cloud plan had failed. Nobody could say what it was actually delivering either.

This is the shape the multi-cloud-versus-consolidation debate takes in practice, and both sides of it have been corrupted by procurement narratives. One camp sells multi-cloud as resilience and leverage; the other sells consolidation as simplicity and savings. Meanwhile, the ground truth is that most enterprises already run several clouds whether they meant to or not, using an average of well over two public providers and, in most cases, private infrastructure alongside them in a hybrid cloud estate (Flexera, 2025). The cloud consolidation instinct is not wrong, but it usually arrives too late to be the real question, because by the time anyone drafts a consolidation slide the cloud strategy has already been settled in a hundred small delivery decisions that nobody went back to revisit.

The real question is how to engineer coherence across what an organisation already has.

This post lays out five things multi-cloud actually demands once it is treated as an architectural fact rather than a strategic choice, and the earlier argument that placement belongs in the data and identity layers rather than at the procurement desk runs straight through all five (see Part 1).

The three-cloud government

Look more closely at how the agency ended up where it did, because the mechanism is general. The resilience mandate was real and reasonable. What turned it into three islands was that it was discharged as a procurement exercise rather than an architectural one. Each cloud was stood up as its own programme, with its own delivery team, its own landing zone, its own identity model, and its own security posture, because that was the fastest way for each team to ship. Standardising inside each cloud was achievable and each estate, taken alone, was competently run. The problem lived entirely in the gaps between them.

Resilience was the first casualty.

You cannot fail a workload onto a platform it has never been integrated with, and three environments that share no identity, no data plane, and no common control cannot actually take over from one another. The failover existed on a diagram and nowhere else. Cost was the second. Three platforms meant roughly three times the engineering surface, triplicated tooling, and three teams solving the same problems in parallel, none of which the original business case had priced. And because the commitment was public, none of it could be unwound; the only way out was forward, through coherence.

That is the useful reframing. Multi-cloud stopped being a decision the moment it became a fact on the ground, and the same is true for most enterprises reading this. Coherence is engineered across clouds, not chosen between them. Government does have working models for this, where shared standards and a common governance layer let independently built systems interoperate, as Singapore's whole-of-government approach to platform and API governance shows (Singapore Government API Governance case study). The rest of this post is about what that coherence actually requires.

The five things multi-cloud actually demands

Coherence is not a single feature you buy. It is five distinct properties an architecture has to hold at once, and the organisations winning at multi-cloud have engineered all five rather than any one of them in isolation.

1. A control plane that spans clouds

The first demand is the one everything else hangs off: a control plane that spans clouds.

A control plane is the layer that makes decisions about the estate, who is allowed to do what, where a workload may run, which policy applies, and whether an action is permitted, as distinct from the data plane that does the actual work. In a single cloud, the provider gives you one for free. Across three, you either build one that spans them or you have none, and three well-run estates with no common control are still three islands.

This is the move that separates coherent multi-cloud architecture from expensive sprawl. Standardising inside each cloud, making each landing zone tidy, feels like progress but does not close the gaps, because the gaps are between the clouds, not inside them. A spanning control plane authenticates, authorises, places, and governs uniformly regardless of which provider executes the workload: a policy engine, an identity broker, a placement layer, and an evidence aggregator that sit above the providers rather than inside any one of them. That is where cloud governance actually comes to live in a multi-provider estate, and the four demands that follow are the things this layer has to carry. Building that layer, attested and provider-neutral, is the core of what Sakura's Cloud practice does. The first thing that control plane has to carry, before anything else, is identity.
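As a rough illustration of "uniformly regardless of which provider executes the workload", the sketch below shows a policy decision that records the provider but never consults it. The rule format, role names, and region labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    role: str        # identity of the calling workload
    action: str      # what it wants to do
    data_class: str  # classification of the resource it wants to touch
    region: str      # where the workload would run
    provider: str    # recorded for evidence, deliberately unused in the decision


# (role, action, data_class, regions where the workload may run)
RULES = [
    ("payments-service", "read", "citizen-record", {"eu-west", "eu-central"}),
    ("reporting-job", "read", "aggregate-stats", {"eu-west", "eu-central", "us-east"}),
]


def decide(req):
    """Return True if a rule permits the request; deny by default."""
    for role, action, data_class, regions in RULES:
        if (req.role, req.action, req.data_class) == (role, action, data_class):
            return req.region in regions
    return False
```

Because `provider` plays no part in `decide`, the same request gets the same answer on any cloud, and placement is a matter of which regions the rule allows.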

2. Identity that does not fragment

The second demand is that identity must not fragment along cloud boundaries.

Left to defaults, each provider's native access management becomes its own identity island: a workload in one cloud and a workload in another authenticate by different mechanisms, hold different credentials, and cannot verify each other without falling back to shared secrets passed across the gap. Every one of those secrets is a liability, and the security posture forks the instant the second cloud appears.

The resolution is portable workload identity, a way of giving every workload, service, and agent an identity that is the same regardless of which cloud it runs in. The open standard here is SPIFFE, implemented by SPIRE, a graduated project of the Cloud Native Computing Foundation whose federation model exists precisely to establish trust across organisational and cloud boundaries (CNCF, 2022). With cross-cloud identity in place, a call from a workload in one provider to a workload in another is authenticated the same way as a local one, short-lived credentials replace the long-lived secrets that used to bridge the gap, and the security model stops forking per environment. Uniform identity is also what makes the next demand safe, because once workloads can reach each other securely across clouds, there is far less reason to copy data into all of them.
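A SPIFFE ID is a URI of the form spiffe://trust-domain/path, which is what lets one naming scheme cover workloads in every cloud. The sketch below only parses such an ID and checks its trust domain against an allow-list; the domain names are invented, and a real deployment would also validate the X.509 or JWT credential that carries the ID, which SPIRE handles and this sketch does not attempt.

```python
from urllib.parse import urlsplit

# Hypothetical trust domains the agency has chosen to federate.
FEDERATED_TRUST_DOMAINS = {"cloud-a.agency.example", "cloud-b.agency.example"}


def parse_spiffe_id(spiffe_id):
    """Split a SPIFFE ID into (trust_domain, path), rejecting malformed input."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    if parts.query or parts.fragment:
        raise ValueError("SPIFFE IDs carry no query or fragment")
    return parts.netloc, parts.path


def is_trusted_peer(spiffe_id):
    """True if the caller's trust domain is one we federate with."""
    try:
        trust_domain, _ = parse_spiffe_id(spiffe_id)
    except ValueError:
        return False
    return trust_domain in FEDERATED_TRUST_DOMAINS
```

The check is identical whether the caller sits in the same cloud or another one, which is the property that stops the security model forking per environment.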

3. Data products that do not duplicate

The third demand is that data has to be exposed across clouds as products, not copied into each of them.

The default failure mode is duplication: a dataset that a workload in another cloud needs gets replicated into that cloud, and then into the next, until the same data exists in three places, drifting apart and each carrying its own governance and its own copy of the risk. The cross-cloud data problem is not that data cannot move; it is that duplicating it multiplies cost, staleness, and exposure at once, with egress charges the visible tax and governance drift the hidden one. A dataset copied nightly into two other clouds is three datasets to secure and three to keep current, plus two recurring egress bills, and by the second week the three copies have already begun to disagree.

The alternative is to treat data as a product with a clear owner, contract, and access path, reachable across the cloud boundary rather than replicated across it, which is the same discipline the previous post argued operational telemetry now needs. A well-defined data product exposed through the control plane lets a consumer in one cloud use data that lives in another without a copy landing locally. Duplication becomes the exception you justify, not the default you inherit. That discipline also happens to be where a large share of multi-cloud cost quietly hides, which is the fourth demand.
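A minimal sketch of what "owner, contract, and access path" might look like as a catalogue entry, with consumers resolved to the product's home endpoint instead of a local copy. The field names, the catalogue, and the endpoint URL are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    name: str
    owner: str                    # team accountable for the data
    home_cloud: str               # the one place the data lives
    endpoint: str                 # access path consumers are given
    schema_version: str           # the contract consumers code against
    allowed_consumers: frozenset  # workload identities permitted to read


CATALOG = {
    "benefit-claims": DataProduct(
        name="benefit-claims",
        owner="claims-platform-team",
        home_cloud="cloud-a",
        endpoint="https://data.cloud-a.agency.example/products/benefit-claims",
        schema_version="2.1",
        allowed_consumers=frozenset({"fraud-scoring", "reporting-job"}),
    ),
}


def resolve(product_name, consumer):
    """Return the access path for a permitted consumer; no copy is created."""
    product = CATALOG[product_name]
    if consumer not in product.allowed_consumers:
        raise PermissionError(f"{consumer} may not read {product_name}")
    return product.endpoint
```

A consumer in another cloud receives the same endpoint as a local one, so there is still one dataset to secure and keep current.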

4. Cost discipline that survives autonomy

The fourth demand is cost discipline that survives team autonomy.

Multi-cloud multiplies spend by construction, and organisations already find single-cloud spend hard to control: a large majority report struggling to manage their cloud costs even before a second and third provider enter the picture (Flexera, 2025). Give three autonomous teams three providers with three billing models and cost visibility fragments completely, which is how multi-cloud cost overruns become invisible until the invoice arrives.

The answer is not to remove the autonomy that lets teams move quickly; it is to make cost a property the control plane enforces rather than a report someone assembles after the fact. Three measures turn the practice of cloud financial management, now widely called FinOps, from a monthly reconciliation into a live control: consistent tagging and allocation across providers, budget guardrails expressed as policy that can halt a runaway workload before the month closes, and showback that makes each team see its own spend. Autonomy without visibility is just untracked spend, and multi-cloud makes the gap between the two expensive fast. The last property the control plane has to carry closes the loop the agency opened, which is proof.
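A budget guardrail expressed as policy can be sketched in a few lines, assuming billing records from every provider have already been normalised to a cost and a set of tags. The record shape, the thresholds, and the rule that unbudgeted spend is halted are assumptions made for illustration.

```python
from collections import defaultdict


def spend_by_team(records):
    """Total cost per team across all providers, using the 'team' tag."""
    totals = defaultdict(float)
    for record in records:
        team = record["tags"].get("team", "untagged")
        totals[team] += record["cost"]
    return dict(totals)


def guardrail_actions(records, budgets, warn_ratio=0.8, halt_ratio=1.0):
    """Map each team to 'ok', 'warn', or 'halt' based on month-to-date spend."""
    actions = {}
    for team, spent in spend_by_team(records).items():
        budget = budgets.get(team)
        if budget is None:
            # Spend with no owner or no budget is the case to stop first.
            actions[team] = "halt"
        elif spent >= budget * halt_ratio:
            actions[team] = "halt"
        elif spent >= budget * warn_ratio:
            actions[team] = "warn"
        else:
            actions[team] = "ok"
    return actions
```

Run on every billing refresh instead of at month end, a check like this is what turns the report into a control. The per-team totals double as the showback figures.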

5. Operational evidence that holds up across providers

The fifth demand is operational evidence that holds up across providers.

When a workload fails, or an auditor asks how a decision was reached, an organisation needs one coherent account of what happened, not three partial logs in three formats that no one can reconcile under pressure. Multi-cloud makes this harder because every provider emits its own telemetry, its own audit trail, and its own idea of an event, and left alone they never add up to a single chain.

This is where the argument in the third post of this series, that evidence has to be an engineered property rather than a retrospective reconstruction, meets multi-cloud head on (see Part 3). Coherent evidence across clouds means normalising those disparate signals into one verifiable record at the control-plane level, so a question about any workload resolves against a single account regardless of which provider ran it. Without it, the resilience and compliance claims that justified going multi-cloud in the first place cannot actually be demonstrated across the estate, which returns the agency to exactly where it started: a strategy nobody could fault and nobody could prove.
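Normalising disparate signals can be sketched as a set of per-provider adapters feeding one timeline. The two raw event shapes below are invented for illustration and do not match any real provider's audit format; only the pattern is the point.

```python
from datetime import datetime, timezone


def from_provider_a(raw):
    """Adapter for a hypothetical provider that emits ISO 8601 timestamps."""
    return {
        "ts": datetime.fromisoformat(raw["eventTime"]),
        "actor": raw["user"],
        "action": raw["op"],
        "workload": raw["target"],
        "provider": "provider_a",
    }


def from_provider_b(raw):
    """Adapter for a hypothetical provider that emits epoch milliseconds."""
    return {
        "ts": datetime.fromtimestamp(raw["epoch_ms"] / 1000, tz=timezone.utc),
        "actor": raw["principal"],
        "action": raw["verb"],
        "workload": raw["resource"],
        "provider": "provider_b",
    }


def unified_timeline(events_a, events_b, workload):
    """One time-ordered account of a workload, whichever provider ran it."""
    merged = [from_provider_a(e) for e in events_a]
    merged += [from_provider_b(e) for e in events_b]
    relevant = [e for e in merged if e["workload"] == workload]
    return sorted(relevant, key=lambda e: e["ts"])
```

Both adapters produce timezone-aware timestamps, which is what makes the merged events comparable. A question about a workload then resolves against one list, not three logs.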

The organisations that win at multi-cloud stop trying to standardise inside each provider and instead engineer these five properties as one coherent layer across all of them, which is the coherence Sakura's Managed Services team builds and runs once the architecture is in place.

References

CNCF, 2022. SPIFFE and SPIRE Projects Graduate from Cloud Native Computing Foundation Incubator. Cloud Native Computing Foundation. Available at: https://www.cncf.io/announcements/2022/09/20/spiffe-and-spire-projects-graduate-from-cloud-native-computing-foundation-incubator/ [Accessed 9 July 2026].

Flexera, 2025. Flexera 2025 State of the Cloud Report. Flexera. Available at: https://info.flexera.com/cm-report-state-of-the-cloud [Accessed 9 July 2026].

Telemetry Is Becoming the Business

Andrew Stevens — Wed, 08 Jul 2026 10:38:29 +0000

It started, the way these things usually do, with a maintenance problem. A quick-service restaurant chain with a few thousand sites began pulling data off the equipment in its kitchens: fridge and freezer temperatures, fryer and grill status, oven cycles, drive-thru timers, and the order and payment stream from every till, all reporting continuously from each restaurant. The first goal was narrow and sensible, to stop losing stock to a fridge that failed overnight and to stop losing lunch service to a fryer that went down at noon. It worked, so the same feeds were turned toward speed of service, which found real money in the drive-thru, and then toward labour scheduling and demand forecasting, which found more. Somewhere in that sequence the data stopped being a maintenance tool and turned into something else. By the time anyone gave it a name, the telemetry coming off the estate was arguably the most strategically valuable asset the company owned, and the finance team first learned this from a single line in a board paper. Nobody had set out to build a strategic asset. They had built a monitoring feed, and it had grown into one while no one was managing it as such.

That progression, from by-product to primary asset, is now playing out across quick-service restaurants, retail, media, financial services, and increasingly healthcare. The signal a business throws off while doing its actual work, its operational telemetry, is becoming one of the things the business is most valuable for knowing. The trouble is that almost nobody's data infrastructure was built for that. It was built for telemetry as exhaust: cheap to store, quick to sample, safe to discard. This post follows the restaurant chain's estate through each stage of the shift, because the architecture that has to change is easiest to see through the thing that changed it.

The telemetry moment

Go back to the moment the data changed jobs. For the first years of the programme, the restaurant telemetry lived where operational signals always live, in the hands of the people who keep the sites running. It fed equipment-monitoring portals and a facilities dashboard, it was watched by store operations and maintenance teams, and it answered exactly one kind of question, which was whether a given restaurant was operating normally right now. That was the job, and the setup did it well.

The moment arrived when a different kind of question needed the same data. Someone at the centre wanted to lower cost per order and lift speed of service across the whole estate, which meant understanding how equipment health, prep times, drive-thru flow, and staffing interacted across thousands of restaurants over months, not one site over one shift. The data to answer that existed. It had been flowing for years. But it was trapped in systems built to alarm on a single failing appliance, not to be queried across time or joined with rosters, weather, promotions, and sales. The question was strategic and the data was operational, and the gap between them was pure architecture. Answering it took a one-off extraction that nobody owned: store operations had no remit to serve analytics, the analytics team had never been granted access to the equipment systems, and the data had to be lifted out by hand, cleaned, and stitched to context that lived in four other places. The answer, when it eventually came, was good enough to prove the point and slow enough to prove the problem.

This is the general shape of the telemetry moment, and it recurs in every business that runs on instrumented operations. It is the first time an operational feed is asked a question its owners never anticipated, and the data turns out to be present but unusable in the form it was kept. The signal was always valuable. It was simply filed under maintenance.

How telemetry got demoted

To understand why the architecture is wrong, look at how operational telemetry came to be treated as exhaust in the first place, because the demotion was deliberate and, at the time, correct. For years the entire purpose of this data was to answer one question in the present tense: is this site behaving normally. Everything about how it was handled followed from that. It was captured at high frequency, kept for a short window, downsampled or discarded once the window passed, and stored in store controllers and equipment-vendor portals tuned for live monitoring rather than historical analysis.

It was also walled off. The equipment feeds lived on the operational side of each restaurant, owned by facilities and store operations, deliberately kept apart from the corporate systems for sound reliability and payment-security reasons. This was industrial IoT in a commercial-kitchen setting, connected refrigeration, cooking equipment, and point-of-sale hardware, and it was governed as site infrastructure, not as information. The data engineering teams who build analytical platforms mostly never saw it, and had no reason to expect to.

None of that was a mistake. It was the right design for the question being asked. Storing every reading forever, at full fidelity, in a warehouse an analyst could reach, would have been waste when the only consumer was a technician checking whether a freezer was holding temperature. The demotion of telemetry to a disposable by-product was a rational response to its narrow use. It stopped being rational the moment the use widened, and the architecture did not notice the moment had come.

What it is now

What the chain discovered is that the same feed, unchanged at the sensor, had become several different things at once. It was training data for the models that forecast demand and optimise labour. It was the evidence base for menu, pricing, and new-site selection. It fed supply-chain replenishment and gave the food-safety team continuous proof that the cold chain had held, rather than a clipboard checked twice a day. The restaurants were doing the same job they always had. The telemetry had become the raw material of data-driven operations across the business.

The precise word for what it had become is a data product: a dataset deliberately built to be consumed by people and systems beyond the team that produced it, with the discoverability, quality, and reliability that implies. The idea that data should be treated as a product with real consumers, rather than as a by-product of the system that emits it, is one of the load-bearing principles of the data mesh approach (Dehghani, 2022), and operational telemetry is where it now bites hardest. The shift is from operational data as a means to an operational end, to operational data as an asset with its own standing, its own consumers, and, in a growing number of cases, its own external market. Some chains now benchmark across franchisees or sell insight derived from their estates, and the years of history behind those numbers are a moat a competitor cannot replicate quickly. The exhaust became inventory.

The data architecture this actually requires

Once telemetry is a strategic asset, the infrastructure built for exhaust starts failing in specific, predictable ways, and the fixes define the telemetry architecture the asset actually needs. The first is retention and fidelity. A system that downsamples readings to death because storage was once precious destroys exactly the history a model needs. The asset requires purpose-built time-series data storage that keeps long, high-resolution history and supports both real-time analytics on the live stream and retrospective analysis across years. The elastic capacity for that, without overbuilding a private estate for a load that spikes at every lunch rush and subsides, is one of the reasons this work tends to land on cloud foundations, which is a large part of what Sakura's Cloud practice builds underneath it.
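A small sketch makes the fidelity point concrete. It assumes a hypothetical freezer reporting once a minute and an illustrative alert threshold of -12 °C; neither figure comes from the chain in this story. Averaging the hour into a single reading keeps storage cheap and erases the ten-minute excursion a cold-chain model would want to learn from.

```python
from statistics import mean


def downsample_mean(readings, bucket_size):
    """Collapse consecutive readings into one average per bucket."""
    return [
        mean(readings[i:i + bucket_size])
        for i in range(0, len(readings), bucket_size)
    ]


# Hypothetical freezer temperatures in °C, one reading per minute for an hour.
# Ten minutes in the middle sit at -5 °C, for example a door left open.
raw = [-18.0] * 25 + [-5.0] * 10 + [-18.0] * 25

ALERT_THRESHOLD_C = -12.0  # assumed limit, for illustration only

hourly = downsample_mean(raw, bucket_size=60)

# Count readings warmer than the threshold at each resolution.
raw_breaches = sum(1 for t in raw if t > ALERT_THRESHOLD_C)
hourly_breaches = sum(1 for t in hourly if t > ALERT_THRESHOLD_C)

print(raw_breaches, hourly_breaches)
```

At full resolution the excursion is visible as ten readings above the threshold. In the hourly average it no longer exists, and no later analysis can recover it.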

The second is movement and contract. The data has to flow off tens of thousands of devices across the estate into an IoT data platform that can absorb millions of events without dropping them, and it has to arrive with a schema, an owner, and a quality guarantee. Treating each telemetry stream as a data product means giving it a contract and an owner responsible for it, so the analysts and models downstream can find it and trust it rather than reverse-engineering it every time. It also means separating the two clocks the asset runs on: a streaming path that carries the live signal for real-time analytics and alerting within seconds, and a batch path that lands the full-fidelity history the models and long-range analysis depend on. Systems built only for monitoring tend to have the first and not the second, which is why the historical asset, when someone finally reaches for it, so often turns out to be full of holes. This is where telemetry crosses back into the ground the earlier posts in this series covered, because a data product without governed provenance is not one you can build decisions on (see Part 3).
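One way to picture a contract and the two clocks together is the sketch below. The contract fields, the owner name, and the in-memory structures standing in for the streaming and batch paths are all assumptions made for illustration, not a description of any particular platform.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StreamContract:
    """A hypothetical contract: who owns the stream and what each event must carry."""
    name: str
    owner: str
    required_fields: dict  # field name -> expected type (or tuple of types)

    def violations(self, event: dict) -> list:
        problems = []
        for field_name, expected in self.required_fields.items():
            if field_name not in event:
                problems.append(f"missing field: {field_name}")
            elif not isinstance(event[field_name], expected):
                problems.append(f"wrong type for field: {field_name}")
        return problems


@dataclass
class TelemetryRouter:
    contract: StreamContract
    live_state: dict = field(default_factory=dict)  # streaming path: latest reading per device
    history: list = field(default_factory=list)     # batch path: every reading, full fidelity
    quarantine: list = field(default_factory=list)  # events that broke the contract

    def ingest(self, event: dict) -> bool:
        problems = self.contract.violations(event)
        if problems:
            self.quarantine.append((event, problems))
            return False
        self.live_state[event["device_id"]] = event  # overwritten on every reading
        self.history.append(event)                   # never overwritten
        return True


contract = StreamContract(
    name="freezer-temperature",
    owner="store-operations-data",  # hypothetical owning team
    required_fields={"device_id": str, "ts": str, "temp_c": (int, float)},
)
router = TelemetryRouter(contract)
router.ingest({"device_id": "fz-01", "ts": "2026-07-08T12:00:00Z", "temp_c": -18.2})
router.ingest({"device_id": "fz-01", "ts": "2026-07-08T12:01:00Z", "temp_c": -17.9})
router.ingest({"device_id": "fz-02", "ts": "2026-07-08T12:01:00Z"})  # no temp_c
```

The live view holds only the most recent reading for each device, which is all a monitoring system needs. The history keeps both readings, and the event that broke the contract is set aside with its reason rather than silently dropped.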

The third is security, and it is the one most often underestimated. The instant operational telemetry leaves the store network to feed corporate analytics, the separation that protected the restaurant environment, where connected kitchen equipment sits alongside payment terminals, is breached, and the attack surface widens in both directions. Piping equipment and till data into a lake without rebuilding that boundary is how an analytics convenience becomes a path into store systems. The discipline for doing it properly is well established in the operational-technology security standards (ISA/IEC, n.d.), and getting the segmentation, identity, and monitoring right is exactly the kind of work Sakura's Security practice does at the seam between operations and IT.

Who is doing it well

The organisations getting this right look similar across very different industries. The leading quick-service and retail chains treat their store and equipment telemetry as products with owners and service levels rather than as monitoring feeds, and keep the long, high-fidelity history that demand forecasting and cold-chain assurance need. Grocers do the same with shelf, refrigeration, and supply-chain signals. The pattern extends well beyond the shop floor: media businesses treat playback and engagement telemetry as a governed audience asset, and financial services firms treat platform and transaction telemetry the same way. The common thread is not the industry or the tooling. It is that the telemetry was given an owner, kept at fidelity, secured across the operations-to-IT boundary, and made discoverable across the business.

The tell of doing it badly is just as consistent. The data still lives only where it was produced, every strategic question that needs it becomes a fresh extraction project, and the value everyone can see in the signal never quite becomes value anyone can use. In each case where it goes right, the change was organisational as much as technical: someone was made accountable for the telemetry as a product, with a budget and a service level, instead of leaving it as a shared cost that nobody owned and everybody assumed. The chain in this story eventually got there too, but only after the board paper, which is a more expensive way to find out than deciding it in advance.

The organisations pulling ahead engineer the signal their operations throw off with intent, giving it an owner, a contract, and a deliberate place in the architecture, which is the work Sakura's Data & AI practice does when it turns raw operational telemetry into data products the whole business can rely on.

References

Dehghani, Z., 2022. Data Mesh: Delivering Data-Driven Value at Scale. Sebastopol, CA: O'Reilly Media. Available at: https://www.oreilly.com/library/view/data-mesh/9781492092384/ [Accessed 8 July 2026].

ISA/IEC, n.d. ISA/IEC 62443 Series of Standards: Security for Industrial Automation and Control Systems. International Society of Automation and International Electrotechnical Commission. Available at: https://www.isa.org/standards-and-publications/isa-standards/isa-iec-62443-series-of-standards [Accessed 8 July 2026].

Evidence Versus Speed

Andrew Stevens — Tue, 07 Jul 2026 13:55:50 +0000

The question that stopped the room was not a difficult one. Halfway through a routine inspection, the regulator pointed to a single figure in a phase three submission and asked, reasonably enough, where it had come from. Which instrument had measured it, what had happened to it on the way into the dossier, whose sign-off sat behind each step. Everyone around the table knew the number was right, and knew the system that produced it had done its job. What they could not do, with the inspector waiting, was pull that figure back through every hop and show, on the spot, that someone else could arrive at it the same way. The company had the answer, but producing the proof was another matter entirely.

This is a lineage gap, and it is not only a pharma problem. It is the visible symptom of a long-standing assumption that has finally stopped being true: that speed and evidence are separable, that an organisation can move fast now and assemble the regulatory evidence later when someone asks. That assumption held for a long time, through generations of audit regimes, because someone rarely asked, and when they did, reconstructing the proof by hand was tedious but possible. Both halves of that are now false.

This post works backwards from the symptom, through how evidence used to be produced and what regulators are now asking for, to the architecture that closes the gap and the reason closing it makes an organisation faster rather than slower. Earlier posts in this series argued that sovereignty and load are engineering questions before they are anything else (see Part 1). Evidence is the same kind of question.

The lineage gap

Start with what the inspector was actually asking for. Data lineage, sometimes called data provenance, is the traceable history of a data point: the origin it came from, every transformation applied to it, and the identity and authority behind each step. The pharma company had all of the underlying events somewhere. The instrument wrote to a log. The transformation ran in a pipeline that emitted its own log. The approval sat in a workflow tool. What it did not have was those events linked to each other, fixed against later alteration, and addressable by the one thing the inspector cared about, the data point itself.

That is the anatomy of the gap. The system was built to produce outputs, and it recorded its outputs well. It was not built to record the provenance of those outputs as a connected, verifiable object. Reconstruction was therefore a human task: an analyst stitching timestamps across three systems, inferring the links, and producing a narrative that was plausible rather than proven. Plausible is no longer the bar.

The same gap appears wherever a consequential decision has to be defended after the fact. A bank asked to show why a specific credit decision was made can usually show the decision but not the full lineage of the inputs that drove it. A government department asked who accessed a citizen record, under what authority, and whether that authority was still valid at the moment of access, often finds the access logged but the authority unlinked. In every case the events exist and the connective tissue does not. The gap is not a missing log. It is a missing architecture for turning logs into evidence.

What evidence used to look like

Work backwards, and the reason the gap exists is historical. For most of the compliance era, evidence meant artefacts assembled for a point in time. A team knew an audit or an inspection was coming, and it spent the weeks before assembling the binder: printed logs, screenshots of configuration, signed PDFs, a spreadsheet of controls mapped to a framework, a sample of records pulled and annotated. The evidence was manufactured for the occasion and then set aside until the next one.

This worked for two reasons, and both have expired. The first was cadence. Audits were periodic, so producing evidence a few times a year was survivable, even if each round consumed a quarter of somebody's life. The second was the cost of verification. Checking that the binder matched reality was expensive and slow, so verifiers sampled lightly and trusted the description. The controlling assumption underneath both was that the document was the artefact and reality was something you checked rarely.

Regulated industries did build genuine data-integrity discipline in this period. The FDA's electronic-records rules established that records had to be attributable, legible, contemporaneous, original, and accurate, backed by secure, computer-generated, time-stamped audit trails (U.S. Food and Drug Administration, n.d.). That was real and it mattered. But it was still framed around records as things you keep and produce, not around evidence as a property the running system emits continuously. The binder got thicker and better governed. It remained a binder.

What regulators are actually asking for now

The bar has moved, and it has moved in the same direction across every regulated sector at once: from "can you describe your controls" to "can you produce verifiable evidence, for this specific item, on demand." In clinical research, the revised good clinical practice guideline reached final adoption in January 2025, went live across the major regulators in mid-2025, and gains its second annex during 2026. It expands the old integrity principles to ALCOA++, adding complete, consistent, enduring, available, and traceable, and reframes data governance as a lifecycle property rather than a filing obligation (ICH, 2025). The expectation is no longer a well-kept archive. It is the ability to trace a data point through its whole life at inspection speed.

The audit profession is rewriting its own foundations around the same shift, and the pace has picked up through 2026. Having issued a catalog of the specific issues that automated tools and machine-generated evidence create for its standards (IAASB, 2025), the international standard-setter is now redrafting the core audit-evidence standard itself: the first full draft of the revised ISA 500 was considered in March 2026 and is moving into public consultation, explicitly to clarify how evidence obtained through automated tools should be treated (IAASB, 2026). In parallel, the internal-control guidance issued in February 2026 for organisations running generative systems in financially material processes is blunt that set-and-forget assurance is inadequate for probabilistic models, and that any output affecting a material figure must be supported by appropriate, traceable evidence (COSO, 2026).

Regulation of AI systems codifies the same expectation directly. The EU AI Act requires high-risk systems to keep automatic logs over their lifetime, obliges deployers to retain those logs for at least six months, and requires providers to maintain post-market monitoring, which is to say it requires evidence to be a running output of the system rather than a retrospective reconstruction (European Parliament and Council, 2024, Articles 12, 26 and 72). The 2026 Digital Omnibus package rescheduled when the high-risk obligations begin to bite, but not their shape: the logging and monitoring duties are intact, and the direction is settled (European Commission, 2026). Read together, these are not separate compliance projects. They are one signal. The reconstructable, machine-verifiable chain is now the deliverable, and the binder is a record of a world that no longer sets the terms.

The architecture of audit-readiness

Move forward from the diagnosis and the resolution is an audit-ready architecture, not a procedural fix. This compliance architecture becomes achievable when evidence is engineered into two layers of the system rather than assembled on top of it.

The first is the data layer. Lineage has to be captured at write time as first-class metadata, not reconstructed at audit time from scattered logs. Every dataset and every transformation records, as it runs, the inputs it consumed, the version of the code that touched them, the identity that authorised the step, and the time it happened, all linked by shared identifiers. When that is in place, the inspector's question stops being an archaeology project and becomes a query: give me the lineage of this data point, and the system returns it. Building the data layer so that provenance travels with the data, rather than being inferred afterwards, is the core of the work Sakura's Data & AI practice does under a regulated architecture.
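As a rough illustration of lineage captured at write time, the sketch below records each step as it runs and answers the inspector's question with a lookup. The class names, identifiers, and the in-memory store are hypothetical; a real data layer would persist these records alongside the data itself.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LineageRecord:
    output_id: str      # the data point this step produced
    input_ids: tuple    # the data points it consumed
    code_version: str   # version of the code that ran
    authorised_by: str  # identity behind the step
    recorded_at: str    # when the step happened (UTC, ISO 8601)


class LineageStore:
    def __init__(self):
        self._by_output = {}

    def record(self, output_id, input_ids, code_version, authorised_by):
        """Called by the pipeline as the step runs, not reconstructed later."""
        rec = LineageRecord(
            output_id, tuple(input_ids), code_version, authorised_by,
            datetime.now(timezone.utc).isoformat(),
        )
        self._by_output[output_id] = rec
        return rec

    def lineage(self, data_point_id):
        """Walk back from a data point to its origins, nearest step first."""
        trail, pending, seen = [], [data_point_id], set()
        while pending:
            current = pending.pop(0)
            if current in seen:
                continue
            seen.add(current)
            rec = self._by_output.get(current)
            if rec is None:
                continue  # nothing recorded upstream of this identifier
            trail.append(rec)
            pending.extend(rec.input_ids)
        return trail


store = LineageStore()
store.record("raw:assay-17", [], "instrument-fw-2.3", "instrument:hplc-04")
store.record("clean:assay-17", ["raw:assay-17"], "pipeline@a1b2c3", "svc:etl")
store.record("dossier:table-4.2", ["clean:assay-17"], "report@d4e5f6", "user:qa-lead")

for step in store.lineage("dossier:table-4.2"):
    print(step.output_id, "<-", step.input_ids, step.code_version, step.authorised_by)
```

The query returns the chain from the dossier figure back to the instrument, with the code version and identity attached to each hop, because each step wrote its own record when it ran.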

The second is the execution layer. Every consequential action, a tool call, a policy decision, an access to a protected record, emits an immutable, tamper-evident record as an ordinary byproduct of running. Hash-linking those records means any later alteration is detectable, which is the difference between a log you keep and evidence someone else can trust. Policy decisions are themselves recorded as decision objects: what was requested, what rule applied, what was permitted or denied, and why. An evidence pipeline then makes all of it addressable, so a request for a given control, data point, or period resolves against a real chain rather than a narrative. The defining property is simple to state and demanding to build: evidence is produced as a side effect of operating the system, whether or not anyone ever asks for it. That property, engineered once, is what turns audit readiness from a recurring scramble into a standing capability. It is engineered compliance in the literal sense, and it is the outcome Sakura's GRC service is organised around.
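Hash-linking is simple enough to sketch in a few lines. The example below is a minimal illustration, assuming JSON-serialisable decision objects with made-up field names; it is not a description of any specific product. Each record's hash covers the previous record's hash, so changing an earlier record invalidates every link after it.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash, payload):
    # Canonical JSON so the same payload always hashes to the same value.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(chain, payload):
    """Append a record whose hash depends on the record before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"prev": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)}
    chain.append(entry)
    return entry


def first_broken_link(chain):
    """Return the index of the first record that fails verification, or None."""
    prev_hash = GENESIS
    for index, entry in enumerate(chain):
        expected = _digest(prev_hash, entry["payload"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return index
        prev_hash = entry["hash"]
    return None


chain = []
# Hypothetical decision objects: what was requested, which rule applied, the outcome.
append_record(chain, {"request": "read record 881", "rule": "role:clinician", "decision": "permit"})
append_record(chain, {"request": "export record 881", "rule": "no-bulk-export", "decision": "deny"})
append_record(chain, {"request": "read record 902", "rule": "role:clinician", "decision": "permit"})

print(first_broken_link(chain))         # chain verifies
chain[1]["payload"]["decision"] = "permit"  # alter history after the fact
print(first_broken_link(chain))         # verification now fails at the altered record
```

A chain like this is tamper-evident rather than tamper-proof: someone able to rewrite every record could recompute every hash, which is why the most recent hash is usually anchored somewhere the system's operators cannot quietly change.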

The speed dividend

The counterintuitive part is that this makes regulated organisations faster, not slower. When evidence is emitted continuously, the audit cycle stops being a project and becomes a machine-speed audit: a query against a live chain rather than a quarter of reconstruction. There is no change freeze while the binder is assembled, no team pulled off delivery to reconstruct history, and regulated workloads carry their own proof as they run. The proof travels with the work, so a release does not have to pause to prove itself.

The organisations that bolt evidence on afterwards pay for it twice. They pay once in the reconstruction, the analyst-weeks spent stitching logs into a defensible story. They pay again in the drag on everything else while that reconstruction is under way, because a system whose evidence cannot be produced on demand cannot safely change quickly. Every deployment carries the unpriced risk that it breaks a chain nobody can currently see. Evidence-first operations remove that risk by making the chain explicit and continuous, which is precisely what lets a regulated business move at close to the speed of an unregulated one without surrendering its right to operate.

This is the resolution of the tension the post opened with. Evidence and speed look opposed only while evidence is a thing you produce on request. Engineer it into the data and execution layers and the opposition dissolves, because the same architecture that makes the system fast to change is the one that makes it ready to prove. The pharma company at the start did not have a speed problem or an evidence problem. It had an architecture that treated the two as separate, and paid for both.

The organisations turning a regulator's request into a query rather than a quarter are the ones engineering evidence into the layer beneath the strategy, and building that capability as a running property of the platform is what Sakura's Praxis compliance solution exists to do.

References

COSO, 2026. Achieving Effective Internal Control Over Generative AI. Committee of Sponsoring Organizations of the Treadway Commission. Available at: https://www.coso.org/ [Accessed 7 July 2026].

European Commission, 2026. Timeline for the implementation of the EU AI Act. AI Act Service Desk, European Commission. Available at: https://ai-act-service-desk.ec.europa.eu/en/ai-act/timeline/timeline-implementation-eu-ai-act [Accessed 7 July 2026].

European Parliament and Council, 2024. Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence (Artificial Intelligence Act). Official Journal of the European Union, L 2024/1689, 12 July. Available at: https://eur-lex.europa.eu/eli/reg/2024/1689/oj [Accessed 7 July 2026].

ICH, 2025. ICH E6(R3) Guideline for Good Clinical Practice, Step 4. International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use. Available at: https://www.ich.org/page/efficacy-guidelines [Accessed 7 July 2026].

International Auditing and Assurance Standards Board (IAASB), 2025. Technology Position: Catalog of Issues and Possible Actions. International Federation of Accountants. Available at: https://ifacweb.blob.core.windows.net/publicfiles/2025-11/IAASB-Technology-Catalog-of-Issues-Proposed-Actions.pdf [Accessed 7 July 2026].

International Auditing and Assurance Standards Board (IAASB), 2026. Proposed International Standard on Auditing 500 (Revised), Audit Evidence. International Federation of Accountants. Available at: https://www.iaasb.org/consultations-projects/isa-500-series [Accessed 7 July 2026].

U.S. Food and Drug Administration, n.d. 21 CFR Part 11: Electronic Records; Electronic Signatures. Code of Federal Regulations, Title 21, Chapter I, Subchapter A, Part 11. Available at: https://www.ecfr.gov/current/title-21/chapter-I/subchapter-A/part-11 [Accessed 7 July 2026].

HTTP QUERY Finally Settles the Argument. Your Infrastructure Gets the Next One.

Andrew Stevens — Mon, 06 Jul 2026 18:24:23 +0000

There is a meeting that happens in every engineering org, roughly once a quarter, forever. Someone is designing a search endpoint. The filter object is large and structured: date ranges, nested facets, a list of tenant IDs longer than any sane URL. And so the question arrives, on cue, like a season finale you have already seen fifteen times.

"Do we send the filter as a body on GET, or do we just make it a POST?"

Then the room splits. One faction wants the body on GET, because a search is a read, and reads should be safe, idempotent, and cacheable. The other faction points out, correctly, that a body on GET has no defined semantics (Fielding et al., 2022), that some proxies quietly drop it, and that you are building on sand. So the room retreats to POST, everybody feels slightly dirty about it, and the search endpoint ships as a POST that is secretly a read. The caching story is now your problem forever.

I have been in that meeting more times than I can count. As of June 2026, it is over. RFC 10008 defines the HTTP QUERY method (Reschke et al., 2026), and it settles the argument cleanly. I want to be honest about what it settles, because the useful part is not the part everyone is quoting.

What the method actually is

QUERY is a request method that carries content in the request body, like POST, but is explicitly defined as safe and idempotent, like GET. The abstract puts it plainly: a QUERY asks the target to process the enclosed content in a safe and idempotent manner and return the result, and unlike POST it can be automatically repeated or restarted without concern for partial state changes (Reschke et al., 2026).

That single sentence is the whole point. Safe means the request is not expected to change server state. Idempotent means a client, a proxy, or a flaky mobile connection can retry it without booking a second charge or creating a duplicate. POST gives you neither guarantee, which is why every retry layer treats POST with suspicion. GET gives you both, but will not carry your 4 KB filter object in a way anyone has agreed on.
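Here is roughly what that means inside a retry layer. This is a sketch, not any particular library's behaviour: the function name and the flaky sender are made up, and the only policy it encodes is that idempotent methods may be retried automatically and everything else is sent once.

```python
# Methods RFC 9110 defines as safe, plus QUERY as described above.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE", "QUERY"})
# Idempotent methods are the safe ones plus PUT and DELETE.
IDEMPOTENT_METHODS = SAFE_METHODS | {"PUT", "DELETE"}


def send_with_retry(method, send, max_attempts=3):
    """Call send(); retry on ConnectionError only when the method is idempotent."""
    attempts = max_attempts if method.upper() in IDEMPOTENT_METHODS else 1
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts, or never allowed to retry


def flaky_sender(failures):
    """Build a hypothetical sender that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def send():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise ConnectionError("connection reset")
        return "200 OK"

    return send, state


send, state = flaky_sender(failures=2)
print(send_with_retry("QUERY", send), state["calls"])
```

The same flaky connection with POST surfaces the first failure to the caller, because the retry layer cannot know whether the first attempt already changed something.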

Here is the search endpoint everyone actually builds today.

POST /feed HTTP/1.1
Host: example.org
Content-Type: application/json

{ "since": "2026-01-01", "tenants": [11, 42, 108], "facets": ["status", "region"] }

And here is the same request as QUERY, following the shape of the example in the RFC itself (Reschke et al., 2026).

QUERY /feed HTTP/1.1
Host: example.org
Content-Type: application/json

{ "since": "2026-01-01", "tenants": [11, 42, 108], "facets": ["status", "region"] }

The body is identical. The difference is that the second one tells the truth about what it does.

The properties, and why they end the argument

Lining these up in a table is the right instinct. Here is the version I would put on the whiteboard, reconciled against the comparison table the RFC ships in its own text.

Property	GET	QUERY	POST
Safe	Yes	Yes	No
Idempotent	Yes	Yes	Not reliably
Cacheable	Yes	Yes, with an asterisk	Only for a later GET or HEAD
Request body	No defined semantics	Expected and defined	Allowed

Two cells are worth a second look, because they are where the honesty lives.

The body row is the reason the argument existed at all. GET does not forbid a body, it simply assigns it "no defined semantics," which is standards language for "you are on your own and do not come crying to us." QUERY defines the body as the substance of the request. That is the gap POST used to fill by accident, now filled on purpose.

The caching row is the one people are quoting too confidently, so let me spoil the party.

The asterisk on cacheable

QUERY responses are cacheable. That is real, and it is the headline feature. But a cache that stores a QUERY response has to key on the request body, not just the method and URL, because the body is the query. Most of the caching infrastructure deployed in the world today does not do that. It was built for GET, where the URL is the whole identity of the request.
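To make "key on the request body" concrete, here is a minimal sketch of a cache-key function. It is my own illustration, not code from the RFC or from any CDN: the function name is invented, and re-serialising JSON with sorted keys is an assumption about what counts as the same query for this endpoint.

```python
import hashlib
import json


def cache_key(method, url, body=b"", content_type=""):
    """Build a cache key; for QUERY the key also covers the request body."""
    method = method.upper()
    if method != "QUERY":
        return f"{method} {url}"  # URL-only identity, as GET caches assume
    media_type = content_type.split(";")[0].strip().lower()
    normalised = body
    if media_type == "application/json":
        # Assumption: key order and whitespace are not significant here,
        # so two spellings of the same filter should share one cache entry.
        normalised = json.dumps(
            json.loads(body), sort_keys=True, separators=(",", ":")
        ).encode("utf-8")
    digest = hashlib.sha256(media_type.encode("utf-8") + b"\n" + normalised).hexdigest()
    return f"{method} {url} {digest}"


a = cache_key("QUERY", "/feed", b'{"since": "2026-01-01", "tenants": [11, 42]}', "application/json")
b = cache_key("QUERY", "/feed", b'{"tenants": [11, 42], "since": "2026-01-01"}', "application/json")
c = cache_key("QUERY", "/feed", b'{"since": "2026-01-01", "tenants": [11]}', "application/json")
print(a == b, a == c)
```

Without the normalisation step the two equivalent filters would occupy separate cache entries, which is correct but wasteful. Without the body in the key at all, every filter would share one entry, which is simply wrong.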

The RFC has an answer, and it is worth knowing because it changes how you design the endpoint. A server can hand back a Content-Location pointing at a URI that represents the result, so a client can follow up with an ordinary, boringly cacheable GET (Reschke et al., 2026). There is also a new Accept-Query response header so a resource can advertise that it speaks QUERY and say which query formats it accepts.

Accept-Query: application/json, application/sql

So "cacheable: yes" is true, and it is also a small project. The method gives you the semantics for free. The caching benefit you have to actually plumb through your stack. Hold that thought, because it is the whole second half of this post.
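The Content-Location handoff is easier to reason about with something to look at. The sketch below is a toy, assuming an in-memory resource with a hypothetical /feed/results/ path scheme and a filter that only understands tenant IDs. It shows the shape of the exchange: QUERY computes the result and names it, and a plain GET on that name returns the same thing.

```python
import hashlib
import json


class FeedResource:
    """Toy resource: answers QUERY and exposes each result at its own URI."""

    def __init__(self, records):
        self._records = records
        self._results = {}  # result path -> stored result

    def handle_query(self, body: bytes):
        filters = json.loads(body)
        result = [r for r in self._records if r["tenant"] in filters["tenants"]]
        # Derive a stable path from the canonical form of the filter (assumed scheme).
        canonical = json.dumps(filters, sort_keys=True, separators=(",", ":"))
        path = "/feed/results/" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
        self._results[path] = result
        headers = {"Content-Type": "application/json", "Content-Location": path}
        return 200, headers, result

    def handle_get(self, path: str):
        if path not in self._results:
            return 404, {}, None
        # An ordinary GET response, cacheable by any URL-keyed cache.
        headers = {"Content-Type": "application/json", "Cache-Control": "max-age=60"}
        return 200, headers, self._results[path]


feed = FeedResource([
    {"id": 1, "tenant": 11},
    {"id": 2, "tenant": 42},
    {"id": 3, "tenant": 7},
])
status, headers, result = feed.handle_query(b'{"tenants": [11, 42]}')
follow_up = feed.handle_get(headers["Content-Location"])
print(status, headers["Content-Location"], follow_up[0])
```

The design cost is visible even in the toy: the server now has to decide how long a stored result lives and what happens when the underlying data changes, which are exactly the questions the Cache-Control value here waves away.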

We had this in 2008 and flinched

The comedic framing going around is that HTTP waited decades to give us a method it always needed. That is nearly right, and the accurate version is funnier. We did not wait. We had it.

WebDAV shipped a SEARCH method in 2008 for exactly this problem: a safe request with a body describing what to look for (Reschke et al., 2008). The early drafts of this very specification were also called SEARCH before the working group settled on QUERY, partly because the WebDAV lineage made people uneasy (Reschke et al., 2026). So the honest history is not that HTTP starved us for eighteen years. It is that HTTP offered us a perfectly good fork in 2008, the fork said "WebDAV" on the handle, we recoiled, and we spent the years since arguing in meeting rooms instead. QUERY is SEARCH with a better haircut and a cleaner spec, and this time nobody has to say the word WebDAV out loud.

The argument does not end, it moves down a layer

Here is where I stop agreeing with the celebration and start reading the room.

A method is only as real as the software that speaks it. QUERY the specification is settled. QUERY the deployed reality is a green field with three fence posts in it. On the day I am writing this, your browser fetch call, your HTTP client library, your API gateway, your load balancer, your WAF, and your CDN mostly do one of two things with QUERY: reject it, or pass it through as an opaque method they do not understand and therefore do not cache. The moment your endpoint needs to work for a caller you do not control, you are shipping a POST fallback anyway, and you are back in a version of the old meeting.

So the argument I have been having for years is genuinely settled at the semantic layer, and it reopens instantly one layer down. It stops being "which method is correct" and becomes "does our edge actually cache a QUERY, and what did the gateway team say when we asked." For a shop that lives in the CDN and API-gateway layer, that is not a footnote. That is the interesting part. It is worth noting that one of the RFC's three authors works at Cloudflare (Reschke et al., 2026), which tells you where the people who wrote it expect the hard problems to land.

What I would actually do on Monday

Use QUERY where you own both ends. Internal service to internal service, your own client to your own API, anywhere you control the caller and the infrastructure between them, adopt it now. You get safe, idempotent, retryable reads with a real body and no guilt.

At the public edge, treat it as progressive enhancement. Advertise support with Accept-Query, accept QUERY from callers who send it, and keep a POST path for everyone whose stack has not caught up, which today is most of them. If you want the caching win, design for the Content-Location handoff deliberately rather than assuming your existing cache will do the right thing, because it will not until someone teaches it to.
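On the client side, progressive enhancement can be as small as one decision. The sketch below assumes the client has already seen response headers from the resource (from an earlier response, say) and that Accept-Query arrives in the simple comma-separated form shown earlier. The function name is mine, and the parsing is deliberately naive rather than a full header parser.

```python
def choose_method(response_headers, content_type="application/json"):
    """Pick QUERY when the resource advertises our query format, else fall back to POST."""
    # Header names are case-insensitive on the wire, so normalise them first.
    lowered = {name.lower(): value for name, value in response_headers.items()}
    advertised = lowered.get("accept-query", "")
    accepted = {
        item.split(";")[0].strip().strip('"').lower()
        for item in advertised.split(",")
        if item.strip()
    }
    return "QUERY" if content_type.lower() in accepted else "POST"


print(choose_method({"Accept-Query": "application/json, application/sql"}))
print(choose_method({"Content-Type": "text/html"}))
```

A resource that advertises JSON gets a QUERY. Anything else, including a stack that has never heard of the header, gets the POST it already understands.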

And the next time that meeting starts, you get to say the words I have wanted to say for years: the correct method exists, it is called QUERY, and the only open question is whether our infrastructure has heard of it yet. That is a much better argument to be having. This is, as it happens, the exact seam where Sakura spends its days, wiring HTTP semantics through the CDN and gateway layer so the correct answer on paper becomes the fast answer in production.

The old argument is settled. Long live the new argument.

References

Fielding, R., Nottingham, M. and Reschke, J. (2022) RFC 9110: HTTP Semantics. Internet Engineering Task Force. Available at: https://www.rfc-editor.org/rfc/rfc9110 (Accessed: 6 July 2026).

Jawad, A. (2026) HTTP finally gets a method it's needed for decades: meet QUERY. Medium. Available at: https://medium.com/@anmjawad007/http-finally-gets-a-method-its-needed-for-decades-meet-query-bb1b77def9a3 (Accessed: 6 July 2026).

Reschke, J., Reddy, S., Davis, J. and Babich, A. (2008) RFC 5323: Web Distributed Authoring and Versioning (WebDAV) SEARCH. Internet Engineering Task Force. Available at: https://www.rfc-editor.org/rfc/rfc5323 (Accessed: 6 July 2026).

Reschke, J., Snell, J.M. and Bishop, M. (2026) RFC 10008: The HTTP QUERY Method. Internet Engineering Task Force. Available at: https://www.rfc-editor.org/rfc/rfc10008 (Accessed: 6 July 2026).

Compliance You Can Prove

Andrew Stevens — Sun, 05 Jul 2026 19:04:40 +0000

Most compliance programmes produce documents. A policy says data is encrypted. A control narrative says access is restricted. A spreadsheet says the annual review happened. When a supervisory authority asks whether you actually meet an obligation, you hand over the documents and hope the description still matches the system.

The trouble is that systems change faster than documents. Code ships weekly. Configurations drift. A model gets retrained. The PDF that described your controls in January quietly stops being true by March, and nobody notices until an auditor, a breach, or a regulator's letter forces the question. At that point the work is archaeology: reconstructing, months later, what the system was actually doing on the day that mattered.

Praxis takes a different position. Whether your systems meet a given obligation is for the regulator to decide, but the substrate they will examine is evidence, and evidence is something you can engineer. Instead of describing controls in prose, Praxis wires them into the systems themselves and produces a continuous, verifiable record of what those systems did. The output is not a document a regulator has to trust. It is proof a regulator can check.

The gap between saying and showing

Four regulations are converging on the same demand, and it is not more paperwork. It is demonstrable, technical evidence.

GDPR Article 32 requires "appropriate technical and organisational measures" and, under Article 32(1)(d), "a process for regularly testing, assessing and evaluating the effectiveness" of those measures (European Parliament and Council, 2016). The EU AI Act makes accuracy, robustness, and cybersecurity a binding requirement for high-risk AI systems under Article 15, backed by the technical documentation required under Article 11 and verified through the conformity assessment procedure in Article 43 (European Parliament and Council, 2024). The EU Data Act creates obligations around user authorisation, data lineage, and access constraints for connected products and B2B data flows, principally the right to access product data under Article 4 and the right to share it with third parties on fair, reasonable, and non-discriminatory terms under Article 5 (European Parliament and Council, 2023b). MiCA imposes custody segregation under Article 70(1), market-conduct surveillance under Article 92(1), and transparency obligations under Article 66 on crypto-asset firms (European Parliament and Council, 2023a).

Read together, a pattern emerges. Regulators are no longer satisfied with a description of your intentions. They want to see the control operating, and they want a record that has not been edited after the fact. That is a fundamentally different deliverable from a control narrative, and it is an engineering deliverable, not a legal one.

Engineered controls, continuous evidence

Praxis is the productised core of our Governance, Risk and Compliance service, and it rests on a simple idea: design a control once, evidence it continuously, and map that single piece of evidence to every obligation it answers.

At the centre of every engagement is a domain agent we build and operate on your behalf. It reads the live regulatory text alongside your own systems, policies, and code. It surfaces gaps as they appear rather than at quarter-end, keeps evidence current as your platform changes, and assembles the artefact a supervisory authority will accept when the request comes. It does not replace your compliance officer or your counsel. It removes the parts of the job a machine should be doing: tracking which service logs which fields against which lawful basis, watching for the next authoritative opinion, generating the pack, so your specialists keep the judgement calls that only humans should make.

Underneath the agent sits the part that makes the evidence trustworthy: Sentinel.

Where Sentinel comes in

Praxis answers the compliance question. Sentinel produces the evidence that lets it answer honestly.

Sentinel is our runtime security framework, the enforcement point that sits on the perimeter of your AI and data systems. Every governed action passes through it: a request is checked against policy, allowed or denied, and then recorded. That record is not an ordinary application log. It is written to a hash-chained, cryptographically signed evidence ledger, where each entry is bound to the one before it, so the sequence cannot be silently altered or back-dated. Periodically the ledger is checkpointed and those checkpoints are anchored to an external transparency log, the same class of tamper-evidence technology used to secure the world's software supply chains.

The effect is that "we enforced this policy" stops being a claim in a document and becomes a fact you can verify. If a single entry were changed after the fact, the chain would break and the anchor would no longer match.
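The chaining idea is easier to see in a few lines of code. The sketch below is not Sentinel's ledger format, which is not specified here. It is a generic hash chain with hypothetical field names (`prev`, `payload`, `hash`, `sig`), and it uses an HMAC as a stand-in for a real signature. A production ledger would more plausibly use asymmetric signatures so a verifier never holds the signing key, and the external anchoring step is left out entirely.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    # Canonical JSON so the same payload always produces the same hash.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def _sign(entry_hash: str, key: bytes) -> str:
    # HMAC is a symmetric stand-in for a real digital signature.
    return hmac.new(key, entry_hash.encode("utf-8"), hashlib.sha256).hexdigest()


def append_entry(ledger: list, payload: dict, key: bytes) -> dict:
    """Bind a new entry to the hash of the entry before it, then sign it."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    entry_hash = _digest(prev_hash, payload)
    entry = {"prev": prev_hash, "payload": payload,
             "hash": entry_hash, "sig": _sign(entry_hash, key)}
    ledger.append(entry)
    return entry


def verify_chain(ledger: list, key: bytes) -> bool:
    """Recompute every hash and signature from the start of the chain."""
    prev_hash = GENESIS
    for entry in ledger:
        expected_hash = _digest(prev_hash, entry["payload"])
        if entry["prev"] != prev_hash or entry["hash"] != expected_hash:
            return False
        if not hmac.compare_digest(entry["sig"], _sign(expected_hash, key)):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    key = b"demo-key-not-for-production"
    ledger = []
    append_entry(ledger, {"agent": "reader", "action": "read", "decision": "allow"}, key)
    append_entry(ledger, {"agent": "unknown", "action": "transfer", "decision": "deny"}, key)
    print(verify_chain(ledger, key))            # chain is intact
    ledger[1]["payload"]["decision"] = "allow"  # edit an entry after the fact
    print(verify_chain(ledger, key))            # recomputed hash no longer matches
```

Because each hash covers the previous entry's hash, changing any entry invalidates it and everything recorded after it, which is the property the paragraph above relies on.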

Praxis then consumes that evidence across a deliberate boundary. It reads only the ledger, the standardised, portable evidence format, and never reaches into Sentinel's internals. It maps each recorded action to the regulatory provisions it demonstrates, runs its analysis deterministically so the same inputs always produce the same output, and generates a regulator-ready evidence pack: the human-readable narrative plus a signed record whose every claim traces back to a specific, tamper-evident ledger entry.

Because the evidence format is open, an auditor does not have to take our word for any of it. They can verify the chain themselves, independently, using a public reference tool. That is the difference between an evidence pack that persuades and one that merely asserts.

Where Praxis fits

The pattern is easiest to see in the situations our clients actually bring us.

Agentic AI reaching into personal data (GDPR Article 32): An enterprise rolls out AI agents that read and update customer records. Every governed action the agent takes is checked and recorded at runtime, and Praxis maps those records to the confidentiality, integrity, and effectiveness-testing duties of Article 32 (European Parliament and Council, 2016). When the data protection authority asks how you ensure ongoing security and regularly test that it works, the evidence is already assembled rather than reconstructed.
A high-risk AI system heading for conformity assessment (AI Act Article 15): A lender runs an AI model in its credit decisions. Praxis operationalises the accuracy, robustness, and cybersecurity obligations the Act imposes under Article 15, documenting the adversarial testing the model is subjected to, enforcing policy at runtime, and producing the Article 11 technical documentation that supports the Article 43 conformity assessment your notified body will expect (European Parliament and Council, 2024).
A connected product that has to share data (EU Data Act): A device manufacturer must give users access to their own product data under Article 4 and, at the user's direction, make it available to authorised third parties under Article 5, with a defensible record of where the data went (European Parliament and Council, 2023b). Praxis engineers the authorisation capture, the cryptographic lineage, and the access-purpose enforcement in code, so "the user authorised it and the data went only where permitted" is something you can show, not just assert.
A crypto-asset firm under supervision (MiCA): A crypto-asset service provider must demonstrate custody segregation to ESMA and national competent authorities under Article 70(1), maintain market-abuse surveillance under Article 92(1), and meet the client transparency obligations of Article 66 (European Parliament and Council, 2023a). Praxis builds the substrate: provable segregation, real-time surveillance, and transparency artefacts, scaled to whichever title of MiCA applies to you.
Several deadlines at once: A scale-up faces GDPR today, the AI Act as it deploys models, and a MiCA obligation on the horizon. Instead of three separate annual scrambles, Praxis maps the controls those regimes share, keeps the evidence live, and turns each audit into a query against a current record rather than a months-long reconstruction.

Build once, defend everywhere

The commercial argument for engineered compliance is not only that it is more honest. It is that it is dramatically cheaper to maintain across multiple frameworks.

Most enterprises buy compliance one regulation at a time, which produces duplicated controls, duplicated evidence, and duplicated audit cost. But a single encryption-at-rest configuration designed to meet GDPR Article 32 also speaks to the AI Act's Article 15 cybersecurity obligation, to the Data Act's access constraints under Articles 4 and 5, and, for a crypto-asset firm, to MiCA's Article 70 safekeeping requirement. The control is the same. Only the framing changes.

Praxis is built for that reuse. It maps every implemented control to every obligation it answers, so you can show a single piece of evidence discharging duties across four regulations at once. Add a new framework later and most of the substrate already covers it. Build once, evidence once, defend everywhere.
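As a data structure, that reuse is a one-to-many mapping from an implemented control to the provisions it speaks to, plus the inverse view a regulator-facing report needs. The sketch below is illustrative only: `CONTROL_MAP` and `obligations_by_regulation` are hypothetical names, and the single entry simply restates the encryption-at-rest example from the text rather than describing Praxis's actual schema.

```python
from collections import defaultdict

# Hypothetical catalogue: one implemented control, every provision it answers.
CONTROL_MAP = {
    "encryption-at-rest": [
        ("GDPR", "Article 32"),
        ("AI Act", "Article 15"),
        ("Data Act", "Articles 4 and 5"),
        ("MiCA", "Article 70"),
    ],
}


def obligations_by_regulation(control_map: dict) -> dict:
    """Invert the catalogue: for each regulation, list (provision, control) pairs."""
    index = defaultdict(list)
    for control, obligations in control_map.items():
        for regulation, provision in obligations:
            index[regulation].append((provision, control))
    return dict(index)


if __name__ == "__main__":
    for regulation, pairs in obligations_by_regulation(CONTROL_MAP).items():
        for provision, control in pairs:
            print(f"{regulation} {provision} <- {control}")
```

Adding a framework later then means adding tuples to existing controls, not building a second catalogue.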

What continuous actually means

A point-in-time audit certifies what was true on the day of the audit. Everything after that is hope.

Praxis treats evidence as a stream instead of a snapshot. In continuous engagements, verification runs against your live systems as they change: every deployment, every policy change, every model update triggers re-attestation, and drift surfaces the moment it appears rather than at the next annual review. Control posture becomes a property of the running system, not a photograph taken from outside it.
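One simple way to picture drift surfacing is to fingerprint each control's live configuration and compare it with the fingerprint recorded at the last attestation. The sketch below assumes exactly that and nothing more: `fingerprint`, `detect_drift`, and the example control names are hypothetical, and a real pipeline would be triggered by deployment or policy-change events, not called by hand.

```python
import hashlib
import json


def fingerprint(state: dict) -> str:
    """Stable hash of a control's configuration (canonical JSON, then SHA-256)."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_drift(attested: dict, live: dict) -> list:
    """Return controls whose live state no longer matches what was attested.

    `attested` maps control name -> fingerprint recorded at the last attestation.
    `live` maps control name -> current configuration.
    """
    drifted = {name for name, state in live.items()
               if attested.get(name) != fingerprint(state)}
    # A control that was attested but has disappeared also counts as drift.
    drifted.update(name for name in attested if name not in live)
    return sorted(drifted)


if __name__ == "__main__":
    baseline = {
        "storage-encryption": {"algorithm": "AES-256", "enabled": True},
        "access-policy": {"mfa_required": True},
    }
    attested = {name: fingerprint(state) for name, state in baseline.items()}

    live = {
        "storage-encryption": {"algorithm": "AES-256", "enabled": True},
        "access-policy": {"mfa_required": False},  # changed since attestation
    }
    print(detect_drift(attested, live))
```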

That is also where the legal-engineering handoff usually breaks, and where Praxis is designed to sit. Lawyers write requirements engineers cannot implement; engineers build controls lawyers cannot defend. Our team works in that gap, translating in both directions without losing fidelity, so your legal team gets implementations they can stand behind in front of a regulator, and your engineers get requirements they can actually build.

Seeing it end to end

None of this is theoretical. In a single run, Sentinel stands up the governed gateway and makes two decisions: it allows a legitimate read, and it denies an unknown agent attempting a financial action. Both are written to the signed, hash-chained ledger, because a refusal is evidence exactly as much as an approval, and Sentinel verifies the chain and signatures before handing the record on.

Praxis then consumes that ledger alone. It pins the regulatory text to a precise, reproducible version of GDPR Article 32 (European Parliament and Council, 2016), maps the recorded activity to the provisions it demonstrates, and generates the evidence pack: a narrative and a signed record whose every claim traces back to a specific ledger entry.

And then it does the thing a document never will. The same run reports, plainly, that two of the eight requirements Praxis tracks under Article 32 are currently covered by evidence and six are not, ranked by severity so remediation has an order. Praxis does not rubber-stamp. It shows you what you can prove today and, just as clearly, what you cannot yet prove. For a compliance leader, that is the difference between a tool that flatters you and one you can actually run your programme on.
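The shape of that report is simple enough to sketch. Everything in the example is placeholder data: the requirement identifiers, their severities, and the `coverage_report` function are hypothetical, chosen only so the two-covered, six-open split matches the run described above. They are not the requirements Praxis actually tracks.

```python
SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2}


def coverage_report(requirements: list, evidenced_ids: set) -> dict:
    """Split requirements into covered and open, with open gaps ranked by severity."""
    covered = [r["id"] for r in requirements if r["id"] in evidenced_ids]
    gaps = sorted(
        (r for r in requirements if r["id"] not in evidenced_ids),
        key=lambda r: (SEVERITY_RANK[r["severity"]], r["id"]),
    )
    return {"covered": covered, "gaps": [(r["id"], r["severity"]) for r in gaps]}


if __name__ == "__main__":
    # Placeholder requirement list; identifiers and severities are invented.
    requirements = [
        {"id": "R1", "severity": "high"},
        {"id": "R2", "severity": "medium"},
        {"id": "R3", "severity": "low"},
        {"id": "R4", "severity": "high"},
        {"id": "R5", "severity": "medium"},
        {"id": "R6", "severity": "high"},
        {"id": "R7", "severity": "low"},
        {"id": "R8", "severity": "medium"},
    ]
    report = coverage_report(requirements, evidenced_ids={"R1", "R5"})
    print(f"{len(report['covered'])} covered, {len(report['gaps'])} open")
    for req_id, severity in report["gaps"]:
        print(req_id, severity)
```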

Runtime enforcement on one side, regulator-facing proof on the other, joined by evidence that cannot be quietly rewritten, and an honest, current picture of where the gaps are. Compliance stops being a story you tell once a year and becomes a property of the system you can demonstrate on any given day.

Praxis is the engineered-compliance core of Sakura Sky's GRC practice, spanning GDPR, the EU AI Act, the EU Data Act, and MiCA. To scope where it fits your systems, book a roadmap engagement.

Praxis and Sentinel, named and described above, are Sakura Sky products developed and sold by Sakura Sky.

References

European Parliament and Council (2016) Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation). Official Journal of the European Union. https://eur-lex.europa.eu/eli/reg/2016/679/oj [Accessed 5 July 2026].

European Parliament and Council (2023a) Regulation (EU) 2023/1114 of the European Parliament and of the Council of 31 May 2023 on markets in crypto-assets (MiCA). Official Journal of the European Union. https://eur-lex.europa.eu/eli/reg/2023/1114/oj/eng [Accessed 5 July 2026].

European Parliament and Council (2023b) Regulation (EU) 2023/2854 of the European Parliament and of the Council of 13 December 2023 on harmonised rules on fair access to and use of data (Data Act). Official Journal of the European Union. https://eur-lex.europa.eu/eli/reg/2023/2854/oj [Accessed 5 July 2026].

European Parliament and Council (2024) Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence (Artificial Intelligence Act). Official Journal of the European Union. https://eur-lex.europa.eu/eli/reg/2024/1689/oj/eng [Accessed 5 July 2026].

Peak Load Is the Steady State

Andrew Stevens — Fri, 03 Jul 2026 09:11:59 +0000

The product drop had been planned for months. The direct-to-consumer subscription business had run three separate load tests, provisioned extra capacity for the launch window, and staffed a war room across two time zones. The drop itself went cleanly. Two hours in, an unrelated video from a creator with a large following mentioned the product without warning, and the sign-up flow collapsed under a rush of new members for twenty-eight minutes. Customers were told the site was busy and to try again later. Some did. Most did not. The refund exposure was manageable. The customer acquisition exposure was not.

What went wrong is not the interesting question. The system was under-provisioned for a specific traffic shape it had not seen before, and the team fixed it. The interesting question is what happened seven weeks later. A weather event redirected a wave of app traffic in an entirely different sector, at midnight on a Tuesday, without any warning. That system held, because a small group of engineers had spent those seven weeks quietly rebuilding assumptions about when peak load happens and what it looks like. The lesson from the product drop was not "provision more capacity for product drops." The lesson was that the mental model of peak load as a scheduled event had stopped being useful.

This is another post in our series on the engineering layer underneath enterprise strategy. The previous post (Sovereignty Versus Efficiency) argued that sovereignty has become an architectural property that procurement cannot solve on its own. This post makes an analogous argument about load. Across banking, media, retail, travel, restaurant chains, and sport, the architectures built to survive named events are increasingly the wrong architectures for the traffic these businesses now routinely encounter. The discipline required has moved closer to what telecommunications engineers have always done, while the cost models have not caught up.

What peak load used to mean

For most of the last two decades, peak load was an event. Retail had Black Friday, Cyber Monday, and the Christmas run. Media had launch day and the season finale. Travel had school holidays, bank holidays, and the summer window. Restaurant chains had big fixtures and Friday nights. Sport had grand finals and cup weekends. All of these came with dates on the calendar. Traffic curves were forecastable within a reasonable margin, and the discipline of preparation was well understood.

The architecture that resulted from this world was event-driven in the fullest sense. Systems were designed with a nominal steady state that carried the ordinary business, and a set of playbooks for named events. The playbooks involved pre-provisioning capacity, staffing war rooms, running rehearsals, and communicating readiness to the executive team. Failure modes were understood. If the promotion drove more traffic than forecast, the team scaled harder in the moment or accepted degraded response times until it passed. If a component fell over, the runbook covered it. On Monday, the extra capacity was released and the team went back to steady state.

The economics of this world were also well understood. Capacity was a variable cost with known peaks. The finance team could budget for the season the way they could budget for a marketing campaign. The engineering team could plan hires around known intensity windows. The board could see the operational risk on a calendar. Peak load, in other words, was a project. And projects have edges.

Why the steady state is now permanent

Several forces have collapsed the edges of the event. None of them are new. Together they are decisive.

The first is the velocity of social platforms. A single video from a creator can drive more traffic in an hour than a planned marketing campaign delivered over a week. There is no way to forecast this reliably, and no way to prevent it. A retail brand can wake up on a Tuesday to a fifteen-times traffic surge because a product went viral. A subscription business can find its onboarding pipeline overwhelmed in an afternoon because an influencer made a video. A streaming platform can see peak concurrency exceed launch-day peak because a scene became a meme.

The second is direct-to-consumer economics. Businesses that used to sit behind wholesalers, aggregators, or franchise networks now absorb variance directly. There is no distribution layer to smooth demand. When a brand ran through retailers, the store network absorbed the customer-facing intensity. When it runs on its own commerce platform and app, the intensity is bounded by the backend, and by the backend alone.

The third is the compounding effect of always-on services. Loyalty accrual, real-time personalisation, live sports betting, live streaming chat, in-app messaging, real-time inventory, and mobile ordering all share a property. They are running all the time, and the load they produce is a function of engagement rather than opening hours. A sport that started as a broadcast product now runs a season-long, twenty-four-hour engagement platform. A retailer that opened at nine and closed at nine now sells continuously and personalises continuously.

The fourth is the shift in the data layer. The pipelines that used to run overnight now run continuously. Inventory pipelines that ran hourly batches now stream against point-of-sale events because the personalisation and forecasting layers downstream cannot tolerate stale data. Fraud pipelines that ran on a fifteen-minute cadence now score every transaction inline because the fraud rings that used to work slowly have automated. Audience pipelines that ran nightly now emit events at the rate of playback, because the ad-tech layer that consumes them prices on freshness. The pipeline used to be a job that finished. It is now a service that has to be up. And the peak load that used to arrive in the nightly ETL window now arrives at every minute of the day, with a spike shape driven by the same viral, direct-to-consumer, always-on forces reshaping the customer-facing layer above.

The fifth is the shape of the failure. When peak load was an event, the failure was contained: the promotion failed, the launch went badly, the game weekend was rough. When peak load is a steady state, the failure is customer trust. A service that is unreliable at random intervals across the year is a service customers will disengage from. A streaming platform that stutters unpredictably is a streaming platform subscribers will cancel. A personalisation engine that shows yesterday's recommendations is a personalisation engine that stops earning its keep. Reliability has moved from being an operational property to being a commercial one.

Put these together, and peak-as-normal is not a possibility to plan for. It is the reality most businesses in these sectors are already operating in, whether or not the architectural assumptions have caught up.

The architectures that hold up

The architectures that survive this environment share a set of properties that used to be niche and are becoming ordinary. None of them are new inventions, and all of them require a different design posture from the event-driven era.

The first is capacity as a continuous discipline rather than a project. Forecasting is done not against a calendar of named events but against a rolling demand model that treats the next surge as an unknown-when rather than an unknown-if. Reserve capacity is not held for a specific date; it is held as an ongoing property of the architecture, with automatic scaling that assumes warm-up latency will otherwise cost the business real money. The best teams treat capacity forecasting the way network engineers treat traffic engineering: as an operational function that runs continuously (Beyer et al., 2016). The same discipline now applies inside the data platform. Streaming pipelines are sized against a rolling load model, not a nightly window, and the reserve capacity of the pipeline is a first-class engineering concern, not an afterthought of the ingestion team.
The second is graceful degradation as a first-class design property. When a component is under pressure, the system does not fall over. It sheds work in a controlled order that protects the customer-visible parts of the experience. Loyalty accrual can queue for thirty seconds; a purchase acknowledgement cannot. A live stream can drop bitrate before it stops; an authentication service cannot start rejecting requests. A batch reconciliation can slip an hour; a real-time fraud score cannot. These distinctions are not made by the engineering team on the day. They are designed in months earlier, with explicit prioritisation of what degrades first and what does not degrade at all. If the design does not encode these choices, the system makes them randomly. Randomness is what customers experience as broken.
The third is fault isolation, often described as cell-based architecture. Instead of one large system that shares infrastructure across all customers, tenants, or regions, the architecture is composed of independent cells that fail independently. A cell can be lost without the others noticing. The customer whose session was on the failing cell sees an error and can retry into a healthy one. Fault isolation is the difference between a bad hour for a subset of customers and a bad day for everyone. It is also the difference between a controllable incident and a headline.
The fourth is observability that is continuous rather than incident-driven. In an event-driven world, dashboards were watched during launches and largely ignored otherwise. In a peak-as-normal world, the signal has to be running all the time, because the surges arrive without notice and the only way to catch them early is to have the telemetry in front of the on-call engineer already. This is a different quality of investment. It is not about spending more on tools. It is about designing observability as an integral part of the service, with the same care as the service itself. Pipeline lag, dead-letter volume, and freshness against source are treated as first-class service-level indicators alongside API latency and error rate.
The fifth is intelligent load shedding. When a surge exceeds even the elastic capacity, the system needs to make a judgement about what to preserve. Random back-pressure is the worst outcome. Preserving paying customers over exploratory browsers, or preserving in-flight transactions over new ones, or preserving core commerce over recommendations, are all defensible choices. Whichever the business makes, the choice has to be made in advance and encoded, because the peak is not a moment to hold a design conversation.
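The degradation and load-shedding properties above both come down to a priority order that is decided in advance and encoded. A minimal sketch of that encoding follows. The request classes, tiers, and utilisation thresholds are all hypothetical, and what "shed" means for a given class (queue it, serve a cached response, or reject it) is deliberately left out.

```python
# Hypothetical tiers: 0 is never shed, higher numbers are shed earlier.
SHED_TIER = {
    "purchase": 0,
    "authentication": 0,
    "browse": 1,
    "recommendations": 2,
    "loyalty_accrual": 3,
}

# Hypothetical utilisation level (0.0 to 1.0) at which each tier starts shedding.
SHED_THRESHOLD = {3: 0.70, 2: 0.80, 1: 0.90}


def admit(request_class: str, utilisation: float) -> bool:
    """Decide whether to serve a request at the current utilisation.

    Unknown request classes fall into the tier that is shed first, so nothing
    becomes protected by accident.
    """
    tier = SHED_TIER.get(request_class, max(SHED_THRESHOLD))
    threshold = SHED_THRESHOLD.get(tier)
    if threshold is None:
        return True  # tier 0: protected at any load
    return utilisation < threshold


if __name__ == "__main__":
    for load in (0.50, 0.75, 0.85, 0.95):
        served = [name for name in SHED_TIER if admit(name, load)]
        print(f"utilisation {load:.2f}: serving {served}")
```

The point of putting the table in code is that the decision has already been taken when the surge arrives, so the system sheds in a known order and not at random.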

These properties do not sit on top of an application. They are engineered into the layer beneath the strategy, and the businesses that get this right treat them as core operational architecture rather than resilience add-ons. It is the kind of continuous discipline that shows up in engagements like Sakura's work with Craveable Brands, where a multi-brand, multi-site backend has to hold under exactly the unpredictable surges described here.

What this costs to do right

The uncomfortable truth is that peak-as-normal costs more than event-driven peak used to cost. The infrastructure spend is higher, because reserve capacity is a continuous line rather than a periodic one. The talent spend is higher, because the engineering discipline required is a rarer skill set than most enterprises had staffed for. The design work is more expensive, because features have to be reasoned about in terms of blast radius, degradation behaviour, and cell placement, not just functionality.

The finance conversation is harder as well. In the event-driven world, the CFO could see a specific commercial reason for a spike in infrastructure cost. In the peak-as-normal world, the argument is that the cost of being unreliable is higher than the cost of being resilient, and the evidence sits in customer trust metrics that are harder to attribute to any specific investment. That conversation is now happening in every executive team in every industry where peak load has permanently changed shape, and the businesses that resolve it in favour of resilience are the ones whose competitors cannot explain how they hold up.

The trade-off is not really optional. The businesses that decline to engineer for the steady state are the ones whose incidents show up on the front page. The ones that accept the cost quietly become the ones that hold. The question for the executive team is no longer whether to make the investment. It is whether to make it before the first incident that decides it for them.

This is where the engineering underneath earns its investment. Sakura's cloud engineering practice, the data and AI practice, and the managed services practice that runs alongside them treat these disciplines as day-to-day operating work rather than incident response. If peak-as-normal is the operating condition your business is already living with, the conversation worth having is what your architecture assumes about it, and whether those assumptions are still current.

References

Beyer, B., Jones, C., Petoff, J. and Murphy, N.R., 2016. Site Reliability Engineering: How Google Runs Production Systems. Sebastopol, CA: O'Reilly Media. Available at: https://sre.google/sre-book/table-of-contents/ [Accessed 3 July 2026].

AI Safety Assurance Is Still Goodwill, Not Evidence

Andrew Stevens — Thu, 02 Jul 2026 13:28:25 +0000

On 1 July 2026, the Independent International Scientific Panel on AI released its first Preliminary Report: a scientific assessment of AI capabilities, opportunities and risks, produced by forty scientists from every UN region and co-chaired by Yoshua Bengio and Maria Ressa (Independent International Scientific Panel on Artificial Intelligence, 2026). It landed a week before the inaugural Global Dialogue on AI Governance in Geneva, where the findings go to Member States on 6 and 7 July. UN Secretary-General António Guterres put the point plainly at the launch: "The science is here. We can no longer say we did not know. What we do with it is now up to all of us" (UN News, 2026).

The report's framing, echoed in the Panel's own announcement, is that AI's benefits are not automatic and its harms are not inevitable, and that the difference comes down to the institutions and policies built around the technology (United Nations Office for Digital and Emerging Technologies, 2026). That is a reasonable, careful sentence. It is also the kind of sentence that is easy to nod along to and forget by lunchtime. The line worth sitting with is further down the page.

The line that matters more than the headline

Buried in the Panel's discussion of AI measurement is this: "Without standardized, rigorous, independent third-party assessment, similar to what exists for the pharmaceutical and aeronautical industries, assurance of safety largely depends on developer goodwill" (Independent International Scientific Panel on Artificial Intelligence, 2026, p. 15).

That is a UN scientific panel saying, in a report going to every Member State next week, that the industry building the most consequential technology of the decade currently asks the world to trust its own homework. Frontier developers set their own risk thresholds, design their own safety evaluations, and choose what to disclose. Governments mostly receive whatever testing data the developer decides to share (Independent International Scientific Panel on Artificial Intelligence, 2026, p. 15).

The Panel is equally blunt about the state of governance built on top of that foundation. Over forty distinct AI governance instruments already exist across corporate, national and international layers, but they are "fragmented, concentrated at the corporate level" and "neither systematic nor comprehensive" (Independent International Scientific Panel on Artificial Intelligence, 2026, p. 43). Some measure nothing beyond inputs like investment and headcount. Without a way to check whether any of them actually change outcomes, the report warns, governance risks becoming symbolic.

None of this is new to anyone who has tried to produce audit-ready evidence for an AI system rather than a policy document about one. What is new is a UN scientific body saying it out loud, on the record, ahead of a governance summit.

Agentic AI is why this stopped being theoretical

Self-attestation was already a weak foundation for chatbots. For agentic systems, the Panel argues, it stops being adequate at all.

Three findings compound each other. First, oversight has not been operationalized as something you can actually measure: "Human oversight is not yet operationalized as a measurable requirement, with concrete expectations for intervention, reversibility and accountability" (Independent International Scientific Panel on Artificial Intelligence, 2026, p. 43). Second, the report is explicit that testing the model is not the same as testing the system a model runs inside: "the unit of evaluation must be the deployed system including model, tools, environment and users, not the model alone" (Independent International Scientific Panel on Artificial Intelligence, 2026, p. 42). Third, and most uncomfortable, evaluation itself may not be trustworthy: leading systems show "evaluation awareness," and the Panel notes that models "could be instructed by humans or autonomously choose to temporarily reduce their test performance on dangerous capability assessments" (Independent International Scientific Panel on Artificial Intelligence, 2026, p. 15).

Put those together and the picture is stark. An agent can be granted real-world tool access and act with reduced human oversight, while the only evidence anyone has that it behaved safely is a self-reported test result from a system that may know it is being tested. That is not a compliance gap. It is a verification gap, and it sits directly beneath every claim about "responsible AI deployment" made by a vendor whose evaluation methodology nobody outside the company has audited.

What evidence-based assurance actually looks like

The Panel's pharma and aviation comparison is worth taking literally rather than as a rhetorical flourish. Neither industry asks the public to trust a manufacturer's internal safety claims. Both require reproducible evidence, generated by a process the manufacturer does not fully control, that a party who was not in the room can independently verify.

Applied to agentic AI, that means moving the evidence away from the developer's word and onto the runtime itself: a record of what an agent was permitted to do, what it actually did, and whether oversight controls were honoured or bypassed, generated in a way that cannot be quietly edited after the fact. That is the specific problem GATE, an open framework for agentic AI evidence that I have been developing independently of any Sakura Sky product, sets out to address: attested evidence trails rooted in hardware isolation rather than in a vendor's internal test report. The GATE Conformance Runner covers the mechanics of how that evidence gets produced and mapped against specific regulatory obligations.
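To make "cannot be quietly edited after the fact" concrete, here is a minimal sketch of one common building block: a hash-chained log, in which each record commits to the hash of the record before it. It is an illustration only, not GATE's implementation or API. The names `EvidenceLog`, `append`, `verify` and every record field are hypothetical, and hash chaining by itself provides tamper evidence, not the hardware-rooted attestation described above.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # assumed placeholder for the first record's predecessor hash


def _digest(body: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) keeps the hash reproducible.
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


@dataclass
class EvidenceLog:
    """Hypothetical append-only record of what an agent was allowed to do and did."""

    records: list = field(default_factory=list)

    def append(self, permitted: list, action: str, oversight_honoured: bool) -> dict:
        prev_hash = self.records[-1]["hash"] if self.records else GENESIS
        body = {
            "seq": len(self.records),
            "permitted": sorted(permitted),          # what the agent was permitted to do
            "action": action,                        # what it actually did
            "within_permission": action in permitted,
            "oversight_honoured": oversight_honoured,
            "prev_hash": prev_hash,                  # commitment to the previous record
        }
        record = dict(body, hash=_digest(body))
        self.records.append(record)
        return record


def verify(records: list) -> bool:
    """Recompute every hash and link; False if any record was altered or reordered."""
    prev_hash = GENESIS
    for i, record in enumerate(records):
        body = {k: v for k, v in record.items() if k != "hash"}
        if body.get("seq") != i or body.get("prev_hash") != prev_hash:
            return False
        if _digest(body) != record.get("hash"):
            return False
        prev_hash = record["hash"]
    return True


log = EvidenceLog()
log.append(["read_file", "send_email"], "read_file", oversight_honoured=True)
log.append(["read_file", "send_email"], "delete_file", oversight_honoured=False)

# Simulate a quiet after-the-fact edit to the second record.
tampered = [dict(r) for r in log.records]
tampered[1]["oversight_honoured"] = True
print(verify(log.records), verify(tampered))
```

A chain like this only shows that records were not changed relative to its latest hash. Someone who controls the whole log could still rebuild it from scratch or drop the most recent records, which is why the latest hash has to be anchored somewhere the operator does not control. That anchoring is the role independent verification and hardware isolation would play.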

For organisations that need this working today rather than as a research question, that is also the shape of the gap Sakura Sky's GRC practice is built to close: evidence-based assurance delivered as an outcome, not a framework left for a compliance team to implement from a whitepaper.

Geneva is the test

The Global Dialogue on AI Governance meets in Geneva on 6 and 7 July, and this report is what Member States will be handed on arrival. The Panel has done the harder-than-expected part: naming the evidence dilemma precisely enough that "we did not know" no longer holds. Whether Geneva produces the standardized, independent, pharma-and-aviation-grade assessment regime the Panel is describing, or simply adds a forty-first fragmented instrument to the pile, is the question the next twelve months will answer.

Disclosure: Sakura Sky offers GRC advisory services, including a Praxis compliance offering, referenced above. GATE is Andrew Stevens' independent open framework and is not a Sakura Sky product.

References

Independent International Scientific Panel on Artificial Intelligence, 2026. Preliminary Report of the Independent International Scientific Panel on AI: Evidence-based assessment of opportunities, risks and impacts of artificial intelligence. United Nations. Available at: https://www.un.org/independent-international-scientific-panel-ai/sites/default/files/2026-07/en_Preliminary%20Report_.pdf [Accessed 2 July 2026].

UN News, 2026. 'The science is here': UN chief welcomes first global AI assessment. United Nations. Available at: https://news.un.org/en/story/2026/07/1167853 [Accessed 2 July 2026].

United Nations Office for Digital and Emerging Technologies, 2026. Everyone has an opinion on AI. Now there's an agreed set of facts [LinkedIn post]. Available at: https://www.linkedin.com/company/unodet/posts [Accessed 2 July 2026].