<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Audacia</title>
    <description>The latest articles on DEV Community by Audacia (@audaciatechnology).</description>
    <link>https://dev.to/audaciatechnology</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F846740%2Fa625b018-125e-4136-8dde-4ffe735c084b.png</url>
      <title>DEV Community: Audacia</title>
      <link>https://dev.to/audaciatechnology</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/audaciatechnology"/>
    <language>en</language>
    <item>
      <title>A look at Microsoft Fabric</title>
      <dc:creator>Audacia</dc:creator>
      <pubDate>Mon, 30 Mar 2026 09:00:00 +0000</pubDate>
      <link>https://dev.to/audaciatechnology/a-look-at-microsoft-fabric-obk</link>
      <guid>https://dev.to/audaciatechnology/a-look-at-microsoft-fabric-obk</guid>
      <description>&lt;p&gt;Running data functions at large organisations might look like one set of tools for ingestion, another for storage, something else for transformation, a separate analytics layer, and a BI platform bolted on top. Each being the right choice at the time, however, collectively, they've become a problem.&lt;/p&gt;

&lt;p&gt;This often results in a setup where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data engineers spend their time preparing and packaging data to hand off to analytics teams&lt;/li&gt;
&lt;li&gt;Analysts build reports in tools that sit outside the engineering environment, often working from copies or extracts rather than a single source of truth&lt;/li&gt;
&lt;li&gt;Data scientists operate in yet another silo, pulling data into notebooks and models that live separately from everything else&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Handoffs can lead to delays, and each disconnected tool adds governance complexity and contributes to a significant cumulative cost - in time, money, and organisational friction.&lt;/p&gt;

&lt;p&gt;The industry has been moving towards platform consolidation for years, and the major cloud providers have all made progress in this direction. Microsoft's entry with Fabric represents an attempt to bring the entire data lifecycle, from raw ingestion to executive dashboard, into a single, unified environment.&lt;/p&gt;

&lt;p&gt;For those evaluating where Fabric fits, the challenge is understanding what Fabric actually changes, who benefits most, and whether the strategic shift is worth pursuing.&lt;/p&gt;

&lt;h2&gt;Understanding the spectrum&lt;/h2&gt;

&lt;p&gt;Before assessing any platform, it helps to step back and consider where your organisation sits with regard to data structuring. Different businesses need different levels of sophistication, and understanding those differing requirements helps clarify where Fabric adds value and where simpler solutions might still serve you well.&lt;/p&gt;

&lt;p&gt;At the most straightforward level, simple databases serve a clear and important purpose. Relational databases like SQL Server or PostgreSQL handle structured data storage and retrieval for individual applications effectively. If your needs are transactional, such as powering a web application, managing customer records, or supporting a single product, a well-designed database does the job without unnecessary complexity. Many teams start here, and for contained use cases, there's no reason to move beyond it.&lt;/p&gt;

&lt;p&gt;As organisations grow and the demand for cross-functional reporting increases, data warehouses become the natural next step. Platforms like Azure Synapse Analytics, Snowflake, or Google BigQuery are designed to aggregate data from multiple sources into a structured, optimised environment built for analytical queries. This is the traditional backbone of enterprise business intelligence where data is extracted from operational systems, transformed into consistent schemas, and made available for reporting and analysis. For organisations that need reliable, governed analytics across departments, a data warehouse remains a solid foundation.&lt;/p&gt;

&lt;p&gt;The challenge arises when the warehouse alone is no longer enough. Modern data demands often include unstructured data, real-time streaming, machine learning workloads, and self-service analytics - none of which a traditional warehouse handles natively. This is where organisations start layering in additional tools such as a lakehouse for unstructured data, a Spark environment for data science, a separate streaming platform for real-time use cases, and a BI tool on top. Each addition solves a problem, but each also introduces another integration point, another security model to manage, and another team boundary to navigate.&lt;/p&gt;

&lt;p&gt;Unified platforms like Microsoft Fabric sit at the far end of this spectrum. Rather than asking organisations to assemble their own stack from best-of-breed components, Fabric brings storage, engineering, warehousing, data science, real-time analytics, and business intelligence together in a single environment. For those operating at scale, with multiple data teams and increasingly complex requirements, the cost of maintaining a fragmented stack can become harder to justify.&lt;/p&gt;

&lt;p&gt;Understanding where your organisation sits on this spectrum matters because the value of Fabric depends heavily on context. An organisation running a handful of straightforward reporting use cases may find a warehouse and Power BI perfectly sufficient. An organisation juggling data engineering, science, streaming, and BI workloads across five different platforms will feel the consolidation benefits immediately.&lt;/p&gt;

&lt;h2&gt;Microsoft Fabric: The umbrella explained&lt;/h2&gt;

&lt;p&gt;Fabric can be easy to misunderstand if you approach it as simply another Microsoft product release. In reality, it's an umbrella platform - a unified SaaS offering that brings together multiple previously separate data services under a common foundation.&lt;/p&gt;

&lt;p&gt;At the base of everything sits OneLake, Fabric's unified data layer. OneLake acts as a single storage foundation for your entire organisation's data, regardless of whether that data is structured, semi-structured, or unstructured. Every service within Fabric reads from and writes to OneLake, which means there's one copy of the data, one set of access controls, and one lineage trail. This is the architectural decision that makes the rest of the consolidation possible. A shared data layer means the services built on top of it genuinely share a foundation, rather than simply being co-located.&lt;/p&gt;

&lt;p&gt;Built on top of that foundation, Fabric consolidates several core services that organisations have traditionally sourced and managed independently.&lt;/p&gt;

&lt;p&gt;Data Factory handles data integration and orchestration. If you're currently running ETL or ELT pipelines to move data between systems, Data Factory provides that capability natively within Fabric. It connects to a wide range of source systems and allows you to build, schedule, and monitor data movement and transformation workflows without reaching for a separate integration tool.&lt;/p&gt;

&lt;p&gt;Data Engineering provides a Spark-based environment for large-scale data processing. Data engineers can work with notebooks and Spark jobs directly within the Fabric environment, processing large volumes of data without needing a standalone Spark cluster or a separate Databricks workspace. The data they process lives in OneLake, immediately accessible to every other service.&lt;/p&gt;

&lt;p&gt;Data Warehousing delivers a T-SQL-based analytical data warehouse. For organisations with teams skilled in SQL, this provides a familiar interface for building and querying structured analytical models without the need to provision and manage separate warehouse infrastructure.&lt;/p&gt;

&lt;p&gt;Data Science supports machine learning and advanced analytics workloads. Data scientists can build, train, and deploy models within the same environment where the data engineering and warehousing work happens. This reduces the friction that typically exists when models need to move between teams or when data needs to be extracted into separate science environments.&lt;/p&gt;

&lt;p&gt;Real-Time Analytics addresses streaming and event-driven data. For organisations working with IoT data, application telemetry, or any use case that requires near-instant insight from data as it arrives, this service provides real-time ingestion and querying capabilities natively within the platform.&lt;/p&gt;

&lt;p&gt;Power BI, already the dominant enterprise BI tool in many organisations, is integrated directly into Fabric rather than sitting alongside it as a separate product. Reports and dashboards connect directly to data in OneLake, with no need to extract, export, or duplicate data into a separate BI layer.&lt;/p&gt;

&lt;p&gt;Data Activator adds an automation layer, allowing organisations to set up alerts and trigger actions based on data conditions. Rather than building custom monitoring solutions, teams can define rules that respond automatically when data meets certain thresholds or patterns.&lt;/p&gt;
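&lt;p&gt;As a rough illustration of that rule pattern - and emphatically not Fabric's actual API - the idea can be sketched in a few lines of plain Python, where each rule pairs a name with a condition evaluated against incoming records:&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Callable

# Illustrative sketch only: Data Activator's real rules are configured in
# Fabric itself, not written in Python. Names and fields here are hypothetical.

@dataclass
class AlertRule:
    name: str
    condition: Callable[[dict], bool]  # predicate over one incoming record

def fire_alerts(rules: list[AlertRule], record: dict) -> list[str]:
    """Return the names of every rule whose condition the record meets."""
    return [rule.name for rule in rules if rule.condition(record)]

rules = [
    AlertRule("stock-out", lambda r: r["status"] == "out_of_stock"),
    AlertRule("vip-order", lambda r: r["customer_tier"] == "vip"),
]

print(fire_alerts(rules, {"status": "out_of_stock", "customer_tier": "standard"}))
```

&lt;p&gt;The same shape - declarative conditions decoupled from the actions they trigger - is what lets teams add monitoring rules without building a bespoke monitoring service.&lt;/p&gt;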

&lt;p&gt;Comparable tools for each of these capabilities exist elsewhere in the market. Where Fabric distinguishes itself is in the shared foundation. These services share OneLake, share a security model, share a governance framework, and share a licensing structure. They were built on a common platform rather than bundled together as separate tools.&lt;/p&gt;

&lt;h2&gt;Data consolidation&lt;/h2&gt;

&lt;p&gt;For data teams, the most compelling argument for Fabric is often operational rather than technical. The way most large organisations currently work with data involves a series of handoffs between teams that can create unnecessary friction.&lt;/p&gt;

&lt;p&gt;Consider a typical workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A data engineering team builds and maintains pipelines that ingest raw data from source systems, transform it, and load it into a warehouse or lakehouse. Once the data is structured and validated, it's made available, often through a separate access layer or export process, to an analytics team.&lt;/li&gt;
&lt;li&gt;The analytics team then builds reports and dashboards in a BI tool like Power BI, Tableau, or Looker.&lt;/li&gt;
&lt;li&gt;If a data science team is involved, they'll often pull data into yet another environment to build models, the outputs of which may then need to be fed back into the warehouse for the analytics team to report on.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This also creates opportunities for data to drift out of sync, for definitions to diverge, and for governance to become fragmented. Each handoff is also a potential point of failure, introducing latency and requiring coordination between teams who may be using different tools, different interfaces, and different mental models of the same data.&lt;/p&gt;

&lt;p&gt;Fabric's consolidation directly addresses this. When Power BI sits within the same platform as the data engineering and warehousing layers, the gap between "data is ready" and "report is built" shrinks dramatically. An analyst building a Power BI report in Fabric is working directly with data in OneLake, the same data the engineering team just processed, governed by the same access controls, with the same lineage. There's no export, no separate connection to configure, and no waiting for data to appear in a different system.&lt;/p&gt;

&lt;p&gt;Similarly, when data scientists work within the same environment, they can access the data they need directly rather than extracting it into a standalone notebook server or requesting access through a separate process. They work on the same platform, with the same data, subject to the same governance. The output of their models can be written back to OneLake and immediately consumed by BI reports or downstream applications.&lt;/p&gt;

&lt;p&gt;Organisational roles and specialisms remain important in this model. Data engineering, analytics, and data science are distinct disciplines with distinct skills, and Fabric doesn't change that; it does, however, reduce the friction between them. Teams still specialise, but they collaborate on a shared platform rather than passing work between the silos of disconnected tools.&lt;/p&gt;

&lt;p&gt;The governance implications are equally significant. In a fragmented stack, security and access controls need to be configured and maintained separately across each tool; data lineage is difficult to track end-to-end when data passes through multiple systems; and compliance reporting requires pulling information from multiple audit logs. However, in Fabric, a single security model covers the entire lifecycle. Access controls set at the OneLake level apply consistently whether the data is being accessed by an engineer in a Spark notebook, an analyst in Power BI, or a scientist in a machine learning experiment.&lt;/p&gt;

&lt;p&gt;For organisations operating in regulated industries such as financial services, healthcare or the public sector, this unified governance model creates a significant reduction in compliance risk and audit complexity.&lt;/p&gt;

&lt;h2&gt;Considerations&lt;/h2&gt;

&lt;p&gt;Understanding what Fabric offers is a useful starting point. The harder work is deciding whether and how to adopt it. There are several strategic dimensions worth considering:&lt;/p&gt;

&lt;h3&gt;Market positioning&lt;/h3&gt;

&lt;p&gt;Fabric exists within a competitive landscape. Databricks offers a strong lakehouse platform with deep data science capabilities, while Snowflake provides a mature, cloud-agnostic data warehousing experience, and AWS has its own suite of data services. Each has genuine strengths, and the right choice depends on the specific context. Fabric's distinctive advantage lies in its breadth and native integration with the Microsoft ecosystem. If your organisation already runs on Azure, uses Microsoft 365, and has Power BI embedded across business teams, Fabric offers a consolidation path that leverages existing investments and skills. Organisations whose stacks are primarily built on AWS or GCP will need to weigh that integration benefit against the switching costs involved.&lt;/p&gt;

&lt;h3&gt;Migration reality&lt;/h3&gt;

&lt;p&gt;No large organisation is going to rip and replace its entire data infrastructure overnight, and Fabric doesn't require that. A more realistic approach is phased adoption - identifying workloads where consolidation delivers the most immediate value and starting there. Power BI teams that currently connect to external data sources are a logical first candidate; data engineering teams managing complex pipeline orchestration across multiple tools are another. Starting with high-friction, high-visibility workloads can help to build internal confidence and demonstrate value before committing to broader migration.&lt;/p&gt;

&lt;h3&gt;Skills and team readiness&lt;/h3&gt;

&lt;p&gt;Fabric lowers certain barriers: analysts can do more without engineering support, and the shared environment reduces the need for manual handoffs. At the same time, adopting any new platform requires an investment in learning. Teams will need to understand OneLake's storage model, the nuances of each service within Fabric, and how governance works across the unified environment. Planning for this upskilling alongside the technical migration is essential.&lt;/p&gt;

&lt;h3&gt;Governance and compliance&lt;/h3&gt;

&lt;p&gt;For organisations in regulated sectors, Fabric's unified security and lineage model is a significant draw. Having a single place to manage access controls, audit data movement, and trace lineage from source to report simplifies compliance in a way that fragmented stacks struggle to match.&lt;/p&gt;

&lt;h3&gt;Platform maturity&lt;/h3&gt;

&lt;p&gt;Fabric is still evolving. Some components are more mature than others and Microsoft continues to ship updates and new capabilities at pace. Early adopters should be prepared for a platform that is moving quickly, with all the opportunity and occasional rough edges that brings. Evaluating Fabric today means accepting that some features may still be maturing while recognising that Microsoft's investment and trajectory suggest significant development ahead.&lt;/p&gt;

&lt;h2&gt;Summary&lt;/h2&gt;

&lt;p&gt;The fragmented data stack served its purpose for a long time. It allowed organisations to adopt best-of-breed tools for each stage of the data lifecycle and build capabilities incrementally. But the operational and strategic costs of maintaining that fragmentation are growing, and the expectations placed on data teams - to deliver faster, govern better, and do more with less - are only increasing.&lt;/p&gt;

&lt;p&gt;Microsoft Fabric represents a credible path towards consolidation. By bringing the full data lifecycle under one roof, sharing a common data layer, and unifying governance across every workload, it addresses many of the friction points that data teams deal with daily.&lt;/p&gt;

&lt;p&gt;Whether Fabric is the right move for an organisation depends on its current stack, its teams' capabilities, and its overall strategic direction. For data leaders already embedded in the Microsoft ecosystem and feeling the strain of a fragmented infrastructure, it can be a good option to evaluate.&lt;/p&gt;

&lt;h2&gt;Author&lt;/h2&gt;

&lt;p&gt;Chris is a Lead Data Scientist with a background in astrophysics and over 4 years’ experience in providing data strategy insights using computational models and machine learning methodology. Chris has worked with a number of organisations across industries to successfully deliver AI projects, from PoC development and use case validation through to model training and maintenance.&lt;/p&gt;

</description>
      <category>microsoft</category>
      <category>microsoftfabric</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Testing AI: How to Effectively Evaluate LLMs</title>
      <dc:creator>Audacia</dc:creator>
      <pubDate>Mon, 23 Mar 2026 10:00:00 +0000</pubDate>
      <link>https://dev.to/audaciatechnology/testing-ai-how-to-effectively-evaluate-llms-4603</link>
      <guid>https://dev.to/audaciatechnology/testing-ai-how-to-effectively-evaluate-llms-4603</guid>
      <description>&lt;p&gt;Traditional software testing rests on a basic assumption that given the same input, the system produces the same output. A test case defines expected behaviour, and a test passes or fails based on whether the output matches. This assumption – deterministic behaviour with verifiable correctness – is the foundation on which decades of quality assurance practices have been built.&lt;/p&gt;

&lt;p&gt;However, this can break down with large language models. An LLM may produce a different response to the same prompt on successive runs. Its outputs are sensitive to context, prompt phrasing, temperature settings and the interaction between retrieved documents and parametric knowledge. It can produce responses that are fluent, confident and completely wrong - a failure mode that traditional testing has no framework for detecting. And unlike a conventional software bug, which typically manifests consistently and can be reproduced, AI system failures are often probabilistic, context-dependent and difficult to predict.&lt;/p&gt;

&lt;p&gt;For engineering leaders, this creates a new problem. Organisations are deploying LLM-powered features at pace, such as customer-facing chatbots, internal knowledge assistants, AI-augmented search, automated document processing, coding assistants and increasingly autonomous agentic workflows. However, the testing and evaluation practices for these systems are struggling to keep up.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.capgemini.com/insights/research-library/world-quality-report-2025-26/" rel="noopener noreferrer"&gt;The World Quality Report 2025&lt;/a&gt;, surveying over 2,000 senior executives across 22 countries, found that hallucination and reliability concerns are now among the top barriers to generative AI adoption in quality engineering, cited by 60% of respondents - a challenge that barely registered two years ago.&lt;/p&gt;

&lt;p&gt;This article looks at what testing looks like for AI systems, why it is fundamentally different from traditional software testing, and how organisations can build the evaluation capability required to deploy LLMs responsibly.&lt;/p&gt;

&lt;h2&gt;Why Traditional Testing Fails for AI Systems&lt;/h2&gt;

&lt;p&gt;The differences between testing traditional software and testing AI systems are not differences of degree but of kind.&lt;/p&gt;

&lt;p&gt;In conventional software, correctness is binary. A function either returns the right value or it does not. Test cases can enumerate expected input-output pairs, and 100% pass rates are achievable and expected. The system under test is deterministic - run the same test twice, get the same result. And when a test fails, the failure is reproducible, allowing engineers to diagnose and fix the root cause.&lt;/p&gt;

&lt;p&gt;Few of these properties hold for LLM-powered systems. There is no single "correct" response to most natural language queries. A question about company policy might have multiple valid phrasings, levels of detail and degrees of nuance. The system is non-deterministic by design (temperature and sampling parameters introduce controlled randomness). And failures, such as hallucinations, reasoning errors, safety violations and biased outputs, may occur intermittently, triggered by specific combinations of context, phrasing and retrieved information that are difficult to anticipate or reproduce.&lt;/p&gt;

&lt;p&gt;This means testing AI systems is an evaluation discipline rather than a verification discipline. Instead of asking "does this pass or fail?", organisations must ask "how well does this system perform across a range of scenarios, and is the distribution of performance acceptable for our use case?" This requires statistical thinking, domain-specific quality criteria and continuous evaluation rather than one-off test suites.&lt;/p&gt;
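&lt;p&gt;That statistical framing can be made concrete with a small sketch. Assuming some automated metric has already scored repeated runs of the same prompt (the scores and thresholds below are hypothetical; a real run would call the live system several times), acceptance becomes a question about the distribution rather than a single pass/fail:&lt;/p&gt;

```python
import statistics

# Hypothetical acceptance gate over repeated runs of one prompt.
# `scores` stands in for the output of any automated quality metric (0.0-1.0).

def acceptable(scores: list[float], min_mean: float, max_stdev: float) -> bool:
    """Accept when average quality is high enough AND variance is low enough."""
    return statistics.mean(scores) >= min_mean and max_stdev >= statistics.stdev(scores)

# Scores from eight runs of the same prompt (illustrative data).
scores = [0.92, 0.88, 0.95, 0.90, 0.85, 0.93, 0.89, 0.91]
print(acceptable(scores, min_mean=0.85, max_stdev=0.05))  # True
```

&lt;p&gt;Gating on both the mean and the spread captures the point: a system that averages well but swings wildly between runs may still be unacceptable for production.&lt;/p&gt;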

&lt;h2&gt;The Hallucination Problem: Scale and Consequences&lt;/h2&gt;

&lt;p&gt;Hallucination - where an LLM generates content that is fluent and confident but factually incorrect or unsupported by source material - is the most visible failure mode and the one that most concerns enterprise adopters.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/vectara/hallucination-leaderboard" rel="noopener noreferrer"&gt;Vectara's Hallucination Leaderboard&lt;/a&gt;, which benchmarks LLMs for factual consistency in summarisation tasks, found that even frontier reasoning models, including GPT-5, Claude Sonnet 4.5, Grok-4, and DeepSeek-R1, all exhibited hallucination rates exceeding 10% on their updated, more challenging benchmark. The recently released Gemini-3-pro demonstrated a 13.6% hallucination rate and did not make the top-25 list.&lt;/p&gt;

&lt;p&gt;These are the best available systems, evaluated on a straightforward summarisation task, not adversarial conditions or edge cases.&lt;/p&gt;

&lt;p&gt;The academic community is also grappling with how to define and categorise hallucinations consistently. &lt;a href="https://aclanthology.org/2025.acl-long.1176/" rel="noopener noreferrer"&gt;The HalluLens benchmark&lt;/a&gt;, presented at ACL 2025, identified a fundamental challenge: existing benchmarks often conflate hallucination with factuality, despite these being distinct problems requiring different evaluation approaches. HalluLens proposes a taxonomy distinguishing between extrinsic hallucinations (where generated content deviates from or contradicts source material the model had access to) and intrinsic hallucinations (where the model contradicts its own earlier outputs). This distinction matters for enterprise applications because the mitigation strategies differ, with extrinsic hallucination being a retrieval and grounding problem, while intrinsic hallucination is a consistency and reasoning problem.&lt;/p&gt;

&lt;p&gt;The real-world consequences of inadequate hallucination testing are already visible and increasingly costly.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Air Canada lost a legal case after its chatbot fabricated a bereavement discount policy that did not exist – the airline was held liable for the AI's invention.&lt;/li&gt;
&lt;li&gt;New York City's public-facing chatbot provided illegal advice to business owners about regulatory requirements.&lt;/li&gt;
&lt;li&gt;And a GPTZero analysis of over 4,000 papers accepted at NeurIPS 2025 found that dozens contained fabricated AI-generated citations – invented authors, titles and journals that passed peer review undetected.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://responsibleailabs.ai/knowledge-hub/articles/llm-evaluation-benchmarks-2025" rel="noopener noreferrer"&gt;These incidents&lt;/a&gt; share a common root cause in systems being deployed without adequate evaluation of their failure modes under realistic conditions.&lt;/p&gt;

&lt;h2&gt;What LLM Evaluation Looks Like&lt;/h2&gt;

&lt;p&gt;Practitioners are converging on a multi-dimensional evaluation approach that moves well beyond traditional pass/fail testing. The emerging consensus spans at least seven dimensions: accuracy, safety, bias, hallucination, robustness, latency and security. Each requires different evaluation methods, and the relative importance of each dimension varies by use case – a customer service chatbot has different critical dimensions than a code generation tool or a medical information system.&lt;/p&gt;

&lt;h3&gt;Benchmark suites&lt;/h3&gt;

&lt;p&gt;Benchmark suites are the most familiar evaluation approach, adapted from academic AI research. Standardised benchmarks test model capabilities across reasoning, knowledge, coding and other dimensions. However, generic benchmarks have significant limitations for enterprise use. Many models now saturate standard benchmarks like MMLU (exceeding 90% accuracy), which has driven the development of harder alternatives. More fundamentally, a model's score on a general benchmark tells you little about how it will perform on your specific domain, data and use cases. Organisations deploying LLMs need domain-specific evaluation datasets that reflect the actual questions their users ask, the documents their RAG systems retrieve, and the edge cases their particular deployment will encounter.&lt;/p&gt;

&lt;h3&gt;LLM-as-judge approaches&lt;/h3&gt;

&lt;p&gt;LLM-as-judge approaches use one language model to evaluate the outputs of another. This approach is both practical and scalable, allowing automated evaluation of thousands of responses without human reviewers, with tools like DeepEval and RAGAS making this accessible. But the approach does have an inherent risk. If both the generating model and the evaluating model are prone to hallucination, they may reinforce each other's errors, creating what researchers describe as a "hallucination echo chamber." Effective LLM-as-judge implementations mitigate this through multi-model consensus (using several different models as judges), structured evaluation rubrics that constrain the judge's assessment to specific, verifiable dimensions, and periodic calibration against human judgement.&lt;/p&gt;
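&lt;p&gt;The consensus mechanism itself is simple to sketch. In a real pipeline each verdict would come from a different judge model scoring the same response against the same rubric; here the verdicts are supplied directly, and the labels and quorum value are hypothetical:&lt;/p&gt;

```python
from collections import Counter

# Hypothetical multi-model consensus: several judge models each return a
# verdict label for the same response; disagreement escalates to a human.

def consensus(verdicts: list[str], quorum: int) -> str:
    """Majority verdict if at least `quorum` judges agree, else escalate."""
    label, count = Counter(verdicts).most_common(1)[0]
    return label if count >= quorum else "needs-human-review"

print(consensus(["faithful", "faithful", "hallucinated"], quorum=2))  # faithful
print(consensus(["faithful", "hallucinated", "refused"], quorum=2))   # needs-human-review
```

&lt;p&gt;The escalation path is the important design choice: when judges disagree, the response is exactly the kind of ambiguous case human calibration is for.&lt;/p&gt;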

&lt;h3&gt;Red-teaming and adversarial testing&lt;/h3&gt;

&lt;p&gt;Red-teaming and adversarial testing deliberately probe the system for failure modes. This includes testing for prompt injection (where adversarial inputs manipulate the model's behaviour), safety violations (where the model produces harmful or inappropriate content), and edge cases where the model's confidence exceeds its accuracy. Red-teaming is particularly important for customer-facing AI systems, where an adversarial user may deliberately attempt to exploit the system. &lt;a href="https://artificialintelligenceact.eu/" rel="noopener noreferrer"&gt;The EU AI Act&lt;/a&gt; explicitly requires adversarial testing for general-purpose AI models, making this a compliance requirement rather than a best practice.&lt;/p&gt;

&lt;h3&gt;Human evaluation&lt;/h3&gt;

&lt;p&gt;Human evaluation remains essential for high-stakes use cases. Automated metrics cannot fully capture whether a response is genuinely helpful, appropriately nuanced, or safe in context. Human evaluation is expensive and slow, which makes it impractical for comprehensive testing, but it serves a critical role in calibrating automated evaluation systems and validating performance on the most important and sensitive scenarios.&lt;/p&gt;

&lt;h3&gt;Continuous evaluation in production&lt;/h3&gt;

&lt;p&gt;Continuous evaluation in production closes the loop. Unlike traditional software where testing occurs before deployment, AI systems require ongoing monitoring because their performance depends on inputs that cannot be fully anticipated. This includes tracking hallucination rates on real user queries, monitoring for distribution shift (where the types of questions users ask diverge from what the system was evaluated on), and collecting user feedback to identify failure patterns that pre-deployment testing missed.&lt;/p&gt;
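&lt;p&gt;One small building block of that monitoring - tracking the hallucination rate over a sliding window of recent, labelled responses - can be sketched as follows. How each response gets labelled (an automated checker, or sampled human review) is assumed and out of scope here:&lt;/p&gt;

```python
from collections import deque

# Sketch of a sliding-window hallucination-rate monitor. The boolean
# labels are assumed to come from an upstream checker or human sample.

class RollingRate:
    def __init__(self, window: int):
        self.events = deque(maxlen=window)  # oldest labels drop off automatically

    def record(self, hallucinated: bool) -> None:
        self.events.append(hallucinated)

    def rate(self) -> float:
        return sum(self.events) / len(self.events) if self.events else 0.0

monitor = RollingRate(window=100)
for flag in [False] * 95 + [True] * 5:   # 5 hallucinations in the last 100 responses
    monitor.record(flag)
print(monitor.rate())  # 0.05
```

&lt;p&gt;In practice the rate would feed an alert threshold, and a sustained rise would be the signal to investigate distribution shift or a regressed prompt.&lt;/p&gt;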

&lt;h2&gt;Testing RAG Systems: Where Retrieval Meets Generation&lt;/h2&gt;

&lt;p&gt;Retrieval-augmented generation (RAG), where an LLM's responses are grounded in documents retrieved from an organisational knowledge base, is the most common enterprise LLM deployment pattern. It is also where testing becomes particularly nuanced, because failures can originate in the retrieval step, the generation step or the interaction between the two.&lt;/p&gt;

&lt;p&gt;A RAG system can fail in several distinct ways. The retrieval component may return irrelevant documents, missing the information needed to answer the query. It may return relevant documents but rank them poorly, burying the critical information below less relevant content. The generation component may ignore the retrieved context and rely on its parametric knowledge instead, producing a plausible but ungrounded answer. Or it may hallucinate details that are not present in any of the retrieved documents, fabricating specifics while appearing to cite its sources.&lt;/p&gt;

&lt;p&gt;Testing RAG systems therefore requires evaluating each component independently and the system as a whole. Retrieval quality can be measured through precision (what proportion of retrieved documents are relevant?) and recall (what proportion of relevant documents are retrieved?). Generation quality requires checking faithfulness (does the response accurately reflect the retrieved content?), relevance (does the response actually answer the question?) and completeness (does it include all pertinent information from the retrieved documents?).&lt;/p&gt;
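&lt;p&gt;The retrieval metrics are straightforward to compute once a ground-truth set of relevant document IDs exists per query. A minimal sketch (the document IDs are hypothetical):&lt;/p&gt;

```python
def precision_recall(retrieved: list[str], relevant: set[str]) -> tuple[float, float]:
    """Precision: share of retrieved docs that are relevant.
    Recall: share of relevant docs that were retrieved."""
    hits = sum(1 for doc in retrieved if doc in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# One query: the system retrieved four documents; three were truly relevant,
# of which it found two.
p, r = precision_recall(["doc1", "doc2", "doc3", "doc7"], {"doc1", "doc3", "doc9"})
print(round(p, 3), round(r, 3))  # 0.5 0.667
```

&lt;p&gt;Averaging these per-query scores across a curated query set gives the retrieval half of the evaluation; faithfulness and relevance checks on the generated answers supply the other half.&lt;/p&gt;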

&lt;p&gt;The challenge is that these evaluations require ground-truth datasets specific to the organisation's knowledge base and user queries. Off-the-shelf benchmarks do not test whether your RAG system correctly answers questions about your company's policies, products or processes. Building these evaluation datasets - curating representative questions, establishing correct answers, and maintaining them as the knowledge base evolves - is one of the most labour-intensive but essential aspects of AI testing. Enterprise research has found that content quality and organisation within the knowledge base itself often has a larger impact on RAG performance than the choice of model or retrieval architecture, which means testing must extend to the data layer, not just the AI components.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing Agentic AI: The Next Frontier
&lt;/h2&gt;

&lt;p&gt;The testing challenge compounds further as organisations move from simple question-answering systems to agentic AI – systems that can plan multi-step tasks, use tools and take actions in the real world. An agentic workflow might involve an AI system that receives a customer request, retrieves relevant information from multiple sources, reasons about the best course of action and executes a series of steps (updating a database, sending a communication, triggering a workflow) with minimal human intervention.&lt;/p&gt;

&lt;p&gt;Testing agentic systems requires evaluating not just the quality of individual outputs but the correctness of entire decision chains. Does the agent correctly decompose a complex task into appropriate sub-tasks? Does it select the right tools for each step? Does it handle errors and unexpected conditions gracefully? Does it know when to escalate to a human rather than proceeding autonomously?&lt;/p&gt;

&lt;p&gt;These questions go beyond hallucination testing into territory that more closely resembles integration testing and end-to-end workflow validation, with the added complexity that the system's behaviour is non-deterministic and its decision-making is opaque.&lt;/p&gt;

&lt;p&gt;The real-world consequences of inadequate agentic AI testing have already surfaced: in one widely reported incident, an autonomous AI coding agent deleted a company's primary database during a self-directed "cleanup" operation, violating a direct instruction prohibiting modifications. The root cause was not a hallucination but a reasoning failure, where the agent decided that a database cleanup was appropriate despite an explicit code freeze instruction, and no separation existed between test and production environments.&lt;/p&gt;

&lt;p&gt;For engineering leaders, agentic AI testing demands a combination of traditional integration testing principles (test the workflow end-to-end, validate boundary conditions, verify error handling) with AI-specific evaluation (assess the quality of the agent's reasoning, its compliance with guardrails and its behaviour under adversarial or unexpected conditions). Sandbox environments with realistic but non-production data become essential, as does the ability to replay and analyse the agent's decision chain after the fact.&lt;/p&gt;
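&lt;p&gt;One piece of that replay-and-analyse capability can be sketched as a guardrail check over a recorded decision chain. The trace format, tool names and rules below are illustrative assumptions:&lt;/p&gt;

```python
# Destructive tools that must never run while a code freeze is in force
FORBIDDEN_DURING_FREEZE = {"drop_table", "delete_records"}

def check_trace(trace, code_freeze=True):
    """Replay a recorded agent trace and collect guardrail violations."""
    violations = []
    for step in trace:
        if code_freeze and step["tool"] in FORBIDDEN_DURING_FREEZE:
            violations.append(f"step {step['id']}: {step['tool']} during code freeze")
        if step.get("env") == "production" and not step.get("human_approved"):
            violations.append(f"step {step['id']}: unapproved production action")
    return violations

trace = [
    {"id": 1, "tool": "read_schema", "env": "sandbox"},
    {"id": 2, "tool": "drop_table", "env": "production", "human_approved": False},
]
issues = check_trace(trace)  # step 2 violates both rules
```

&lt;p&gt;Run against every recorded trace in the sandbox, checks like these turn the agent's decision chain into something testable after the fact rather than only observable in the moment.&lt;/p&gt;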

&lt;h2&gt;
  
  
  The Regulatory Dimension
&lt;/h2&gt;

&lt;p&gt;The regulatory environment is adding both urgency and specificity to AI testing requirements.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://artificialintelligenceact.eu/" rel="noopener noreferrer"&gt;The EU AI Act&lt;/a&gt;, now entering enforcement, establishes graduated testing obligations based on risk classification. High-risk AI systems, which include those used in employment, credit decisions, education and critical infrastructure, require comprehensive testing for accuracy, robustness, cybersecurity and non-discrimination before deployment, with ongoing monitoring obligations thereafter.&lt;/p&gt;

&lt;p&gt;General-purpose AI models face model evaluation requirements including adversarial testing. Organisations deploying LLM-powered features must be able to demonstrate that they have tested their systems against these criteria – a compliance requirement that many have not yet begun to address.&lt;/p&gt;

&lt;p&gt;The UK's approach differs in structure but converges in its implications. Rather than prescriptive legislation, UK regulators are applying existing regulatory frameworks, through the FCA, ICO, CMA and sector-specific regulators, to AI systems within their remit. The ICO's guidance on AI and data protection, for instance, requires organisations to demonstrate that AI systems processing personal data are accurate, fair and transparent. The practical effect is similar in that organisations must be able to evidence that they have evaluated their AI systems' behaviour against relevant quality and safety criteria.&lt;/p&gt;

&lt;p&gt;The EU Cyber Resilience Act adds another layer for AI-powered software products, requiring that products be developed according to secure-by-design principles, free from known exploitable vulnerabilities and supported by ongoing security updates. For AI systems that interact with external inputs (user queries, retrieved documents, API calls), this implies testing for adversarial inputs, prompt injection and data leakage – categories that traditional security testing does not cover.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building AI Testing Capability
&lt;/h2&gt;

&lt;p&gt;Perhaps the most practical challenge facing engineering leaders is where AI testing capability should sit organisationally and what skills it requires.&lt;/p&gt;

&lt;p&gt;AI evaluation requires a blend of competencies. It demands an understanding of ML evaluation methodology: benchmark design, statistical analysis of non-deterministic outputs and evaluation metric selection. It requires domain expertise to define what "correct" means for specific use cases – a question that is ultimately a business judgement rather than a technical one. It requires prompt engineering capability to design effective evaluation prompts and adversarial test cases. And it requires the infrastructure skills to build and run evaluation pipelines at scale, integrate monitoring into production systems, and maintain evaluation datasets as the system and its usage evolve.&lt;/p&gt;
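&lt;p&gt;Because outputs are non-deterministic, evaluation pipelines typically report pass rates over repeated runs rather than single pass/fail results. A minimal sketch, with a toy stand-in for the model call:&lt;/p&gt;

```python
def pass_rate(generate, prompt, passes, n=20):
    """Run a non-deterministic system n times and report the fraction of
    outputs that satisfy the pass criterion."""
    return sum(passes(generate(prompt)) for _ in range(n)) / n

# Toy stand-in for an LLM call; a real pipeline would call the model API
import random
def fake_model(prompt):
    return "25 days" if random.random() < 0.9 else "unsure"

rate = pass_rate(fake_model, "How much annual leave?", lambda out: "25" in out)
# A CI gate would then enforce a threshold, e.g. require rate >= 0.95
```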

&lt;p&gt;Some organisations are embedding this capability within existing QA teams, extending their remit to encompass AI evaluation alongside traditional testing. Others are building dedicated AI quality or AI evaluation functions, sometimes within ML engineering teams, sometimes as standalone roles. Neither approach has emerged as clearly superior. The right answer depends on the organisation's AI maturity, the scale and criticality of its AI deployments, and whether the dominant challenge is evaluation methodology (which favours ML expertise) or integration with existing quality processes (which favours QA expertise).&lt;/p&gt;

&lt;p&gt;What is clear is that there is a skills gap. The World Quality Report found that 50% of organisations lack AI/ML expertise, unchanged from the prior year, and that generative AI has emerged as the single most in-demand skill for quality engineers (63%), ahead of core quality engineering fundamentals (60%).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.practitest.com/state-of-testing" rel="noopener noreferrer"&gt;PractiTest's State of Testing&lt;/a&gt; 2026 data reinforces this from the practitioner perspective. Testing professionals who actively use AI tools are significantly less anxious about their future and earn a measurable salary premium, suggesting that the market is already pricing in AI evaluation capability.&lt;/p&gt;

&lt;h2&gt;
  
  
  From Optional to Essential
&lt;/h2&gt;

&lt;p&gt;The window during which AI testing could be treated as an emerging discipline is closing. Organisations are deploying LLM-powered systems into production, customers and employees are interacting with them daily, and the failure modes are documented and increasingly expensive.&lt;/p&gt;

&lt;p&gt;The hallucination rates are quantified, with even frontier models exceeding 10% on rigorous benchmarks. The regulatory requirements are specific, with the EU AI Act mandating testing that most organisations cannot yet perform. And the deployment patterns are growing more complex, with RAG systems compounding retrieval and generation failures, while agentic workflows are introducing autonomous decision-making with real-world consequences.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.veracode.com/resources/analyst-reports/2025-genai-code-security-report" rel="noopener noreferrer"&gt;The Veracode research&lt;/a&gt; on AI-generated code security showed the same pattern – newer, larger models do not produce more secure code, highlighting that these are not problems that will be solved with the next model release. Instead, teams need sustained investment in testing capability, evaluation infrastructure and the organisational capacity to assess and manage the risks inherent in deploying probabilistic systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Author
&lt;/h2&gt;

&lt;p&gt;Richard Brown is the Technical Director at Audacia, where he is responsible for steering the technical direction of the company and maintaining standards across development and testing.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>testing</category>
      <category>llm</category>
      <category>rag</category>
    </item>
    <item>
      <title>Why AI Governance is Key to Scaling AI</title>
      <dc:creator>Audacia</dc:creator>
      <pubDate>Mon, 16 Mar 2026 10:00:00 +0000</pubDate>
      <link>https://dev.to/audaciatechnology/why-ai-governance-is-key-to-scaling-ai-5aka</link>
      <guid>https://dev.to/audaciatechnology/why-ai-governance-is-key-to-scaling-ai-5aka</guid>
<description>&lt;p&gt;Governance is the aspect of AI that most reliably triggers resistance from delivery teams. The perception, which is often well-founded in experience, is that governance means delays, committees, paperwork and risk processes that block delivery.&lt;/p&gt;

&lt;p&gt;This perception is understandable: many organisations have governance frameworks that are poorly suited to the iterative, experimental nature of AI development. But the absence of governance does not eliminate risk; it means that risks are uncovered in production, where the consequences are most severe and the cost of remediation is highest.&lt;/p&gt;

&lt;p&gt;The organisations that are scaling AI successfully have resolved this tension - not by choosing between speed and governance, but by fundamentally rethinking what governance means in the context of AI. They have made it proportionate, embedded and automated, with the evidence showing that this approach can help to accelerate delivery, not slow it down.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cost of Ungoverned AI
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai" rel="noopener noreferrer"&gt;McKinsey's 2025 State of AI survey&lt;/a&gt; found that 51% of organisations report at least one negative AI-related incident in the past 12 months. The most commonly cited incidents involved inaccuracy, followed by compliance failures, reputational damage, privacy breaches and unauthorised actions by AI systems.&lt;/p&gt;

&lt;p&gt;These are risks affecting the majority of organisations deploying AI at any meaningful scale, and they are growing. The average organisation is now actively managing around four types of AI risk, up from approximately two in 2022, with inaccuracy, cybersecurity, privacy and regulatory risk most frequently addressed. Explainability – the ability to understand and explain why an AI system produced a particular output – stands out as a risk that many organisations experience but fewer have robust controls for.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.deloitte.com/us/en/what-we-do/capabilities/applied-artificial-intelligence/content/state-of-ai-in-the-enterprise.html" rel="noopener noreferrer"&gt;Deloitte's 2026 State of AI in the Enterprise report&lt;/a&gt; adds a governance dimension specific to the emerging agentic AI frontier, with only one in five companies having a mature governance model for autonomous AI agents. As AI systems move from answering questions to taking independent action, the governance gap becomes a genuine operational risk.&lt;/p&gt;

&lt;p&gt;The business case for governance rests on the organisational trust required to scale AI beyond pilots. Without governance, boards hesitate to approve production deployment, business stakeholders question the reliability of AI outputs, regulators ask questions that cannot be answered, and AI initiatives that might otherwise create value remain confined to sandboxes because teams lack the confidence to release them.&lt;/p&gt;

&lt;h2&gt;
  
  
  The EU AI Act: A New Regulatory Baseline
&lt;/h2&gt;

&lt;p&gt;The most significant regulatory development for enterprise AI is the EU AI Act – the first comprehensive AI legislation globally. Its phased implementation timeline is now well underway and directly affects any organisation operating in or serving EU markets.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai" rel="noopener noreferrer"&gt;The Act&lt;/a&gt; entered into force on 1 August 2024. Within this, prohibited AI practices – including social scoring and certain forms of biometric categorisation – have been banned since February 2025. Obligations for general-purpose AI (GPAI) models, including transparency and documentation requirements, became applicable in August 2025. The penalty regime is now active, with fines of up to €35 million or 7% of global turnover for prohibited practices, and up to €15 million or 3% for other infringements.&lt;/p&gt;

&lt;p&gt;The most consequential &lt;a href="https://trilateralresearch.com/responsible-ai/eu-ai-act-implementation-timeline-mapping-your-models-to-the-new-risk-tiers" rel="noopener noreferrer"&gt;deadline&lt;/a&gt; for enterprises is August 2026, when the comprehensive compliance framework for high-risk AI systems takes effect. This covers AI used in areas including biometrics, critical infrastructure, education, employment, essential services, law enforcement and border management. Organisations deploying AI in these domains will need to demonstrate risk management systems, data governance measures, technical documentation, human oversight mechanisms and conformity assessments.&lt;/p&gt;

&lt;p&gt;For UK organisations, the Act has extraterritorial reach - if the output of an AI system is used within the EU, the obligations apply regardless of where the provider is based. Any UK enterprise with EU customers, operations or supply chain connections should therefore understand and plan for compliance.&lt;/p&gt;

&lt;h2&gt;
  
  
  The UK Approach: Principles-Based but Tightening
&lt;/h2&gt;

&lt;p&gt;The UK has deliberately chosen a different path from the EU's prescriptive legislation. As of early 2026, the UK has not adopted a single cross-economy AI law. Instead, it relies on existing sector regulators to apply current frameworks to AI within their domains – a principles-based, outcomes-focused approach.&lt;/p&gt;

&lt;p&gt;In financial services – the UK sector furthest advanced in AI adoption – this approach is well-articulated. &lt;a href="https://www.fca.org.uk/firms/innovation/ai-approach" rel="noopener noreferrer"&gt;The FCA confirmed&lt;/a&gt; in December 2025 that it will not introduce AI-specific rules, citing the technology's rapid evolution. Instead, it relies on existing frameworks including the Consumer Duty, Senior Managers and Certification Regime (SM&amp;amp;CR), and operational resilience requirements. The FCA's position is that these technology-agnostic frameworks already cover the key risks associated with AI deployment – accountability, transparency, consumer protection and resilience.&lt;/p&gt;

&lt;p&gt;The Bank of England and FCA's third &lt;a href="https://www.bclplaw.com/en-US/events-insights-news/ai-regulation-in-financial-services-turning-principles-into-practice.html" rel="noopener noreferrer"&gt;survey&lt;/a&gt; of AI in UK financial services, published in November 2024, found that 75% of firms are already using AI, with a further 10% planning to adopt within three years. Foundation models account for 17% of use cases, though most deployments remain low materiality. Lloyds' 2025 Financial Institutions Sentiment Survey reported that 59% of institutions now see measurable productivity gains from AI, up from 32% a year earlier.&lt;/p&gt;

&lt;p&gt;But "principles-based" does not mean "relaxed." The FCA's Chief Data Officer has noted that advances in AI may require modified approaches to firm risk management and governance, and that regulation will need to adapt. The Treasury Committee published a report on AI in financial services in January 2026, examining both opportunities and risks. And the UK government appointed two AI Champions for financial services – signalling that regulatory attention is intensifying.&lt;/p&gt;

&lt;p&gt;For organisations outside financial services, the landscape is less codified but no less important. The ICO's existing guidance on automated decision-making under UK GDPR applies to any AI system that processes personal data. Sector-specific regulators in healthcare (MHRA, CQC), energy (Ofgem), and other domains are developing their own positions. And the UK government's AI Opportunities Action Plan, published in early 2025, signals a direction of travel toward greater expectations around safety, transparency and accountability – even without prescriptive legislation.&lt;/p&gt;

&lt;p&gt;The practical implication for UK enterprises is that the absence of an AI-specific law does not mean the absence of regulatory obligation. Existing frameworks already create accountability for AI outcomes, and the direction of travel – both domestically and through the extraterritorial reach of the EU AI Act – is clearly toward greater scrutiny.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three Principles for AI Governance
&lt;/h2&gt;

&lt;p&gt;The organisations succeeding with AI governance share three design principles that distinguish their approach from the heavyweight, process-oriented governance models that have historically frustrated delivery teams.&lt;/p&gt;

&lt;h3&gt;
  
  
  Proportionate governance
&lt;/h3&gt;

&lt;p&gt;Proportionate governance calibrates the level of oversight to the level of risk. Not every AI application carries the same risk profile. A model that recommends internal knowledge articles requires a fundamentally different governance posture than a model that makes credit decisions or informs clinical diagnoses.&lt;/p&gt;

&lt;p&gt;A practical risk-tiering framework – typically three or four tiers – allows low-risk use cases to move quickly with lightweight review, while high-risk applications receive the scrutiny they demand. The key dimensions for tiering include: the impact on individuals if the model produces an incorrect output, the regulatory sensitivity of the domain, the degree of human oversight in the workflow and the nature of the data being processed (particularly personal or sensitive data). This approach avoids the bottleneck of treating every AI initiative as though it were mission-critical.&lt;/p&gt;
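&lt;p&gt;The four dimensions can be reduced to a simple scoring heuristic. The weights and tier boundaries below are illustrative assumptions, not a standard:&lt;/p&gt;

```python
def risk_tier(individual_impact, regulated_domain, human_in_loop, personal_data):
    """Toy risk-tiering heuristic over the four dimensions in the text."""
    score = {"low": 0, "medium": 1, "high": 3}[individual_impact]
    score += 3 if regulated_domain else 0
    score += 0 if human_in_loop else 2
    score += 2 if personal_data else 0
    if score >= 6:
        return "tier-1: full review"
    if score >= 3:
        return "tier-2: standard review"
    return "tier-3: lightweight review"

# An internal article recommender vs. an autonomous credit-decision model
low = risk_tier("low", False, True, False)    # tier-3: lightweight review
high = risk_tier("high", True, False, True)   # tier-1: full review
```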

&lt;h3&gt;
  
  
  Embedded governance
&lt;/h3&gt;

&lt;p&gt;Embedded governance builds compliance checks into the development process rather than imposing them as a gate at the end. This includes bias testing as part of model evaluation, data privacy assessments as part of pipeline design, explainability requirements as part of model selection and risk assessment as part of use case approval.&lt;/p&gt;

&lt;p&gt;When governance is embedded, it does not create a bottleneck at deployment. Instead, it prevents the far more costly rework that comes from discovering compliance issues after a model has been built, tested and handed to the operations team. The shift is from governance as a stage gate to governance as a continuous practice – present throughout the development lifecycle, not concentrated at a single approval point.&lt;/p&gt;

&lt;h3&gt;
  
  
  Automated governance
&lt;/h3&gt;

&lt;p&gt;Automated governance leverages tooling to enforce standards without human bottlenecks. Automated checks for data quality thresholds, model performance metrics, bias indicators and audit logging can be built into CI/CD pipelines, ensuring that governance is consistently applied without requiring manual review for every model update or retraining cycle.&lt;/p&gt;
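&lt;p&gt;Such a pipeline check can be as simple as asserting thresholds over a candidate model's reported metrics. The metric names and threshold values here are assumptions for illustration:&lt;/p&gt;

```python
THRESHOLDS = {
    "accuracy": 0.90,                # minimum model performance
    "data_completeness": 0.99,       # data quality floor
    "demographic_parity_gap": 0.05,  # maximum allowed bias indicator
}

def governance_gate(metrics):
    """Return the list of governance checks a candidate model fails."""
    failures = []
    if metrics["accuracy"] < THRESHOLDS["accuracy"]:
        failures.append("accuracy below threshold")
    if metrics["data_completeness"] < THRESHOLDS["data_completeness"]:
        failures.append("data quality below threshold")
    if metrics["demographic_parity_gap"] > THRESHOLDS["demographic_parity_gap"]:
        failures.append("bias indicator above threshold")
    return failures

failures = governance_gate(
    {"accuracy": 0.93, "data_completeness": 0.995, "demographic_parity_gap": 0.08}
)
# A CI step would fail the build (non-zero exit) whenever failures is non-empty
```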

&lt;p&gt;&lt;a href="https://www.cisco.com/c/m/en_us/solutions/ai/readiness-index.html" rel="noopener noreferrer"&gt;Cisco's AI Readiness Index&lt;/a&gt; found that 97% of the most AI-ready organisations ("Pacesetters") deploy AI at the scale and speed necessary to realise value, compared to just 41% overall – and that 84% of these Pacesetters have comprehensive change management plans, versus 35% of all companies. This highlights that governance and speed are not in tension for the most advanced organisations; they can in fact be mutually reinforcing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building the Governance Framework: Components
&lt;/h2&gt;

&lt;p&gt;For organisations looking to establish or strengthen their AI governance, several components form the foundation.&lt;/p&gt;

&lt;h3&gt;
  
  
  An AI risk register and use case inventory
&lt;/h3&gt;

&lt;p&gt;Before governance can be applied proportionately, the organisation needs visibility into what AI is being used, where and at what risk level. This sounds quite simple, but many organisations – particularly those where AI adoption has been bottom-up and decentralised – lack a comprehensive view of their AI estate. The inventory should capture each use case, its risk tier, its data sources, its intended users and its current lifecycle stage.&lt;/p&gt;
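&lt;p&gt;In its simplest form, the inventory is a structured record per use case. The field names below mirror the text but the schema itself is an illustrative assumption:&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class AIUseCase:
    """One entry in the AI use case inventory."""
    name: str
    risk_tier: int        # e.g. 1 (high risk) to 3 (low risk)
    data_sources: list
    intended_users: str
    lifecycle_stage: str  # e.g. "pilot", "production", "retired"

inventory = [
    AIUseCase("Support ticket triage", 2, ["crm"], "support team", "production"),
]
```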

&lt;h3&gt;
  
  
  Clear roles and accountability
&lt;/h3&gt;

&lt;p&gt;Governance requires named individuals accountable for AI risk. In the UK financial services context, the SM&amp;amp;CR already provides this structure – the Senior Manager responsible for AI outcomes is personally accountable. Outside regulated sectors, the principle still applies: someone senior must own AI governance, with authority to approve, escalate or halt deployments based on risk assessment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Model documentation standards
&lt;/h3&gt;

&lt;p&gt;Each AI model in production should be accompanied by documentation covering its purpose, training data, performance metrics, known limitations, bias assessments and monitoring arrangements. This documentation serves multiple purposes – it enables effective oversight, supports regulatory compliance, facilitates knowledge transfer when team members change and provides the audit trail that boards and regulators increasingly expect.&lt;/p&gt;

&lt;h3&gt;
  
  
  Monitoring and incident management
&lt;/h3&gt;

&lt;p&gt;Governance does not end at deployment. Production AI systems require ongoing monitoring for model drift (degradation in performance as real-world data diverges from training data), data quality issues, emerging biases and unexpected behaviours. A clear incident management process – defining how AI-related issues are detected, escalated, investigated and remediated – is essential, particularly given how many organisations have already experienced at least one negative AI incident.&lt;/p&gt;
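&lt;p&gt;One widely used drift indicator is the Population Stability Index (PSI), which compares a feature's binned distribution at training time with its distribution in production. A minimal sketch, with hypothetical bin values:&lt;/p&gt;

```python
import math

def psi(expected, actual):
    """Population Stability Index between two binned distributions;
    each argument is a list of bin proportions summing to 1."""
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, 1e-6), max(a, 1e-6)  # guard against log(0)
        total += (a - e) * math.log(a / e)
    return total

baseline = [0.25, 0.25, 0.25, 0.25]  # distribution at training time
current = [0.10, 0.20, 0.30, 0.40]   # distribution observed in production
drift = psi(baseline, current)
# A common rule of thumb treats PSI above roughly 0.2 as a trigger for investigation
```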

&lt;h3&gt;
  
  
  Regular review and adaptation
&lt;/h3&gt;

&lt;p&gt;The governance framework itself should evolve. The regulatory landscape is changing rapidly – the EU AI Act's high-risk obligations take effect in August 2026, UK regulatory expectations continue to sharpen, and the technology itself is advancing at pace. A governance framework designed for today's AI capabilities will need updating as agentic systems, multimodal models and new deployment patterns continue to evolve.&lt;/p&gt;

&lt;h2&gt;
  
  
  Governance as Competitive Advantage
&lt;/h2&gt;

&lt;p&gt;It is tempting to view governance as a cost centre – an overhead imposed by regulators and risk committees that adds little to the value AI delivers.&lt;/p&gt;

&lt;p&gt;In practice, governance is what gives the board confidence to approve production deployment, and what allows AI to be used in customer-facing and decision-critical contexts rather than being confined to internal experimentation. It also prevents a compliance scramble when regulatory expectations tighten.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.bcg.com/publications/2024/wheres-value-in-ai" rel="noopener noreferrer"&gt;BCG's research&lt;/a&gt; found that AI leaders follow a 10-20-70 resource allocation: 10% to algorithms, 20% to technology and data and 70% to people and processes – the category that includes governance, change management and organisational readiness. The organisations investing most heavily in governance are the same ones generating the most value from AI.&lt;/p&gt;

&lt;p&gt;The lack of governance is one of the main reasons that AI projects stall: it erodes trust, increases compliance risk and drives rework, ultimately keeping promising AI initiatives confined to sandboxes. But when governance is built in from the start – proportionate to risk, embedded in the development lifecycle and automated where possible – it becomes the element that leads to production success.&lt;/p&gt;

&lt;h2&gt;
  
  
  Author
&lt;/h2&gt;

&lt;p&gt;Chris is a Lead Data Scientist with a background in astrophysics and over four years’ experience providing data strategy insights using computational models and machine learning methodology. Chris has worked with a number of organisations across industries to successfully deliver AI projects, from PoC development and use case validation through to model training and maintenance.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>aigovernance</category>
    </item>
    <item>
      <title>Managing Hidden Waterfalls in Legacy Modernisation Projects</title>
      <dc:creator>Audacia</dc:creator>
      <pubDate>Mon, 09 Mar 2026 08:30:00 +0000</pubDate>
      <link>https://dev.to/audaciatechnology/managing-hidden-waterfalls-in-legacy-modernisation-projects-fi3</link>
      <guid>https://dev.to/audaciatechnology/managing-hidden-waterfalls-in-legacy-modernisation-projects-fi3</guid>
      <description>&lt;h2&gt;
  
  
&lt;strong&gt;Why agile delivery fails in legacy-heavy environments without structural preparation.&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Agile remains the dominant model for modern software delivery for good reasons. Iterative development, fast feedback loops and the ability to adapt to new information are essential in complex, evolving systems. However, when agile is introduced into legacy-heavy organisations without accounting for institutional constraints, its effectiveness can diminish over time. What begins as an agile programme can often shift imperceptibly into a sequential delivery model beneath the surface.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.ciodive.com/news/waterfall-regress-agile-momentum-forrester/623135/" rel="noopener noreferrer"&gt;In 2019&lt;/a&gt;, 29% of organisations reported using waterfall delivery models. By 2022, that figure had risen to 43%. Not because teams chose to abandon agile, but because the environments they were delivering into quietly forced the shift.&lt;/p&gt;

&lt;p&gt;Teams start with discovery and prototyping, iterate rapidly and validate assumptions early. But as delivery progresses, unaddressed constraints begin to emerge, such as undocumented legacy behaviours, regulatory edge cases or operational workarounds that were never captured as formal requirements. At this point, the legacy system reasserts itself as a source of truth.&lt;/p&gt;

&lt;p&gt;Agile ceremonies may continue, but the programme becomes more reactive. The goal can subtly shift from solving user problems to reproducing historical behaviour. What remains is a hybrid model: agile in appearance, yet waterfall in substance - the hidden waterfall.&lt;/p&gt;

&lt;h2&gt;
  
  
  Legacy Replacement as High-Risk
&lt;/h2&gt;

&lt;p&gt;The data on legacy modernisation is consistent. These programmes fail more often, and more visibly, than greenfield initiatives.&lt;/p&gt;

&lt;p&gt;A review of ERP project outcomes by &lt;a href="https://kpcteam.com/kpposts/unveiling-the-erp-conundrum-why-55-75-of-erp-projects-fail" rel="noopener noreferrer"&gt;KPC Team&lt;/a&gt; places failure or severe underperformance rates between 55% and 75%, depending on scope and definition, while &lt;a href="https://erp.today/most-digital-transformations-fail-but-comprehensive-testing-processes-can-help-succeed/" rel="noopener noreferrer"&gt;ERP Today&lt;/a&gt; highlights testing, data quality and scope volatility as common points of failure.&lt;/p&gt;

&lt;p&gt;Data migration projects carry even higher risk. According to &lt;a href="https://www.oracle.com/a/ocom/docs/middleware/data-integration/data-migration-wp.pdf" rel="noopener noreferrer"&gt;Oracle&lt;/a&gt;, over 80% of data migration initiatives overrun, underdeliver or fail entirely, most often due to undocumented dependencies and inadequate validation. Factors such as schema drift, semantic inconsistencies and legacy entanglement are reported as persistent blockers to successful transformation.&lt;/p&gt;

&lt;p&gt;Most digital transformation efforts in large organisations are not pure greenfield builds. They are legacy replacement or coexistence programmes - subject to all the structural, technical and operational complexity that entails.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Legacy Becomes the Specification
&lt;/h2&gt;

&lt;p&gt;A common misstep in legacy modernisation is the assumption that existing systems simply encode outdated implementations of known requirements. In practice, legacy systems carry decades of organisational memory, much of it undocumented.&lt;/p&gt;

&lt;p&gt;Research in requirements engineering reveals several persistent patterns in legacy systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Business logic is embedded in code rather than documentation&lt;/li&gt;
&lt;li&gt;Exceptions are handled through hidden branches or procedural workarounds&lt;/li&gt;
&lt;li&gt;User behaviours evolve around system constraints, becoming de facto requirements&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When delivery teams attempt to define new system requirements without examining these embedded behaviours, they quickly encounter gaps. At that point, the legacy platform is no longer a background dependency - it becomes the only available reference model.&lt;/p&gt;

&lt;p&gt;Studies published in &lt;a href="https://thesai.org/Downloads/Volume7No5/Paper_10-Identify_and_Manage_the_Software_Requirements_Volatility.pdf" rel="noopener noreferrer"&gt;IJACSA&lt;/a&gt; and &lt;a href="https://link.springer.com/chapter/10.1007/978-3-319-33515-5_10" rel="noopener noreferrer"&gt;Springer&lt;/a&gt; show that late discovery of implicit requirements is a leading cause of rework. In legacy replacement programmes, these “requirements” were never made explicit because they were never formally captured.&lt;/p&gt;

&lt;p&gt;This is a structural outcome of relying on systems that evolve without parallel investment in shared knowledge.&lt;/p&gt;

&lt;h2&gt;
  
  
  Water-Scrum-Fall
&lt;/h2&gt;

&lt;p&gt;In 2011, Forrester introduced the term &lt;a href="https://www.verheulconsultants.nl/water-scrum-fall_Forrester.pdf" rel="noopener noreferrer"&gt;“Water-Scrum-Fall”&lt;/a&gt; to describe hybrid delivery models in which agile practices are embedded between upfront planning and downstream release governance. More than a decade later, this pattern persists, and if anything, it has increased.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.ciodive.com/news/waterfall-regress-agile-momentum-forrester/623135/" rel="noopener noreferrer"&gt;CIO Dive&lt;/a&gt; reported in 2022 that 43% of organisations still use waterfall models, up from 29% in 2019, with compliance, assurance and funding structures cited as the main reasons. &lt;a href="https://www.knowledgehut.com/blog/agile/state-of-agile" rel="noopener noreferrer"&gt;KnowledgeHut’s&lt;/a&gt; 2025 State of Agile found that agile adoption is now stagnating or reversing in many enterprise environments, with hybrid models becoming the norm.&lt;/p&gt;

&lt;p&gt;These regressions are rarely ideological. Most organisations want to be agile. But delivery becomes sequential by structure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Waterfall Reappears by Default
&lt;/h2&gt;

&lt;p&gt;Several factors can pull agile programmes toward waterfall behaviours:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Funding cycles require fixed scope and budget commitments before discovery&lt;/li&gt;
&lt;li&gt;Governance models rely on stage gates, rather than continuous assurance&lt;/li&gt;
&lt;li&gt;Supplier contracts prioritise output completion over outcome delivery&lt;/li&gt;
&lt;li&gt;Compliance processes are serial in nature, with formal sign-offs and audit trails&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the public sector, the &lt;a href="https://www.nao.org.uk/reports/digital-transformation-in-government/" rel="noopener noreferrer"&gt;National Audit Office&lt;/a&gt; has repeatedly highlighted how legacy estates, inflexible procurement and capacity gaps create barriers to agile working. The &lt;a href="https://www.gov.uk/government/publications/state-of-digital-government-review" rel="noopener noreferrer"&gt;State of Digital Government Review 2025&lt;/a&gt; confirms that many central government services still rely on systems more than two decades old, with modernisation constrained by high operational risk and fragile dependencies.&lt;/p&gt;

&lt;p&gt;In this environment, teams may adopt agile practices within their sprint cycles, but the programme remains governed by linear constraints, creating hidden waterfalls.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recognising Hidden Waterfalls Before They Set In
&lt;/h2&gt;

&lt;p&gt;Hidden waterfalls rarely announce themselves. They emerge gradually, often masked by functioning agile rituals. But several indicators can signal that a programme has shifted from iterative delivery to sequential progression:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sprint goals are increasingly defined by legacy parity rather than user outcomes. Backlog items begin to reference "the old system does X" as the primary acceptance criterion, rather than solving a validated user need.&lt;/li&gt;
&lt;li&gt;Discovery stops but requirements keep growing. The team completed a discovery phase early in the programme, but new requirements continue to surface from legacy behaviours that were never formally captured. Each one is treated as an exception rather than evidence of a structural gap.&lt;/li&gt;
&lt;li&gt;Release planning compresses into a single milestone. Despite iterative development, the programme converges on a single go-live date with limited rollback options, often driven by contract, funding or political commitments rather than technical readiness.&lt;/li&gt;
&lt;li&gt;Testing becomes regression-dominant. The majority of test effort shifts toward proving that the new system reproduces existing behaviour, rather than validating that it meets redefined needs.&lt;/li&gt;
&lt;li&gt;Stakeholder confidence depends on sign-off, not evidence. Progress is measured by stage-gate approvals and documentation completeness rather than working software, user feedback or operational metrics.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of these indicators are necessarily failures in their own right. However, when several appear together, they suggest the programme has structurally reverted to sequential delivery, regardless of the methodology it reports.&lt;/p&gt;

&lt;h2&gt;
  
  
  Modernisation Approaches
&lt;/h2&gt;

&lt;p&gt;Projects that aim to replace legacy systems in a single release, often called “big bang” delivery, assume considerable risk. These programmes concentrate delivery dependencies, limit rollback options, and make data migration a single-point failure.&lt;/p&gt;

&lt;p&gt;Incremental modernisation strategies can offer a more resilient alternative. Patterns such as parallel run, feature toggles, coexistence architectures and the strangler fig pattern allow systems to be evolved rather than replaced outright.&lt;/p&gt;

&lt;p&gt;In one survey, &lt;a href="https://www.bomberbot.com/software-development/what-is-the-strangler-fig-pattern-and-how-it-helps-manage-legacy-code/" rel="noopener noreferrer"&gt;79% of developers&lt;/a&gt; said the strangler pattern reduced project risk, primarily because it isolates change and supports rollback. Incremental delivery also aligns better with governance and assurance frameworks - supporting progressive certification, staged user validation and controlled data migration.&lt;/p&gt;

&lt;p&gt;In regulated environments, these approaches can reduce disruption and support operational continuity. They also provide decision-makers with clearer evidence of progress and outcomes at each stage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Balancing Ambition with Legacy Constraints
&lt;/h2&gt;

&lt;p&gt;Preparing for hidden waterfalls is not an argument for replicating legacy systems. It is a call to interrogate them more rigorously, and to distinguish between what must be retained and what can be rethought.&lt;/p&gt;

&lt;p&gt;This involves:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identifying which behaviours are regulatory, contractual or operationally essential&lt;/li&gt;
&lt;li&gt;Separating business-critical rules from historical conveniences&lt;/li&gt;
&lt;li&gt;Defining the minimum viable increment that preserves service capability while allowing change&lt;/li&gt;
&lt;li&gt;Designing systems that support evolution rather than frozen replication&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These assessments cannot be completed through workshops and documentation alone. Instead, they require early and direct engagement with the legacy systems themselves: their data models, codebases, interface behaviours and operational roles.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Role of AI in Surfacing Constraints
&lt;/h2&gt;

&lt;p&gt;AI-assisted tooling offers practical support in navigating legacy complexity. When applied responsibly, these tools can help teams:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Analyse code to extract business rules and logic paths&lt;/li&gt;
&lt;li&gt;Identify unused or redundant code segments&lt;/li&gt;
&lt;li&gt;Map dependency chains and integration points&lt;/li&gt;
&lt;li&gt;Generate automated tests to capture existing system behaviours&lt;/li&gt;
&lt;/ul&gt;
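&lt;p&gt;To make the last point concrete, the sketch below shows a characterisation (“golden master”) test in Python. The legacy discount rule is entirely hypothetical - the point is that each test case records what the system currently does, not what anyone believes it should do, so the behaviour survives reimplementation:&lt;/p&gt;

```python
# Characterisation ("golden master") tests pin down what a legacy
# routine actually does before it is reimplemented. This function is
# a hypothetical stand-in for undocumented legacy logic.

def legacy_discount(order_total: float, customer_years: int) -> float:
    """Hypothetical legacy rule: 5% off orders over 100, plus 1% per
    year of customer tenure (tenure capped at 10%), total capped at 15%."""
    discount = 0.05 if order_total > 100 else 0.0
    discount += min(customer_years * 0.01, 0.10)
    return round(order_total * (1 - min(discount, 0.15)), 2)

def test_characterisation():
    # Each case records current behaviour, not desired behaviour.
    cases = {
        (50.0, 0): 50.0,     # below threshold, no tenure
        (150.0, 0): 142.5,   # 5% threshold discount only
        (150.0, 3): 138.0,   # 5% threshold + 3% tenure
        (200.0, 20): 170.0,  # capped at 15% overall
    }
    for (total, years), expected in cases.items():
        assert legacy_discount(total, years) == expected

test_characterisation()
```

&lt;p&gt;A suite like this, generated or assisted by AI tooling, becomes the safety net for the replacement system: any divergence from captured behaviour is flagged immediately, and each divergence can then be consciously accepted or rejected.&lt;/p&gt;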

&lt;p&gt;In environments where documentation is sparse and institutional memory has faded, these tools can reduce the time and effort needed to understand legacy systems. &lt;a href="https://www.oracle.com/a/ocom/docs/middleware/data-integration/data-migration-wp.pdf" rel="noopener noreferrer"&gt;Oracle’s whitepaper&lt;/a&gt; notes that poor understanding of legacy code is a major cause of data migration failure, an area where AI-driven code analysis can make a measurable difference.&lt;/p&gt;

&lt;p&gt;However, it is important to view AI as an enabler, not a decision-maker. Tools can help surface logic and dependencies, but they can struggle to decide which behaviours remain relevant or valuable. That task requires domain knowledge, user insight and human judgement.&lt;/p&gt;

&lt;h2&gt;
  
  
  Preparing for Hidden Waterfalls
&lt;/h2&gt;

&lt;p&gt;Effective preparation involves a combination of technical, governance and delivery decisions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Map structural constraints early, particularly data, regulatory and legacy integrations&lt;/li&gt;
&lt;li&gt;Treat legacy systems as evidence, not default specifications&lt;/li&gt;
&lt;li&gt;Select modernisation approaches that allow co-existence and rollback&lt;/li&gt;
&lt;li&gt;Align governance and assurance models to tolerate incremental delivery&lt;/li&gt;
&lt;li&gt;Use AI tools to reduce manual analysis effort and highlight legacy dependencies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The objective is not necessarily to eliminate all waterfall elements but to make them visible and manageable. Programmes that fail to do this often discover late in delivery that they are operating under assumptions that no longer hold, or that were never articulated to begin with.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Step Is Important
&lt;/h2&gt;

&lt;p&gt;Hidden waterfalls are a predictable outcome of unaddressed structural constraints that agile methods alone are not enough to resolve.&lt;/p&gt;

&lt;p&gt;Acknowledging this reality early allows teams to structure programmes that are responsive, transparent and recoverable. It enables more realistic delivery planning, supports operational continuity and improves trust between teams and stakeholders.&lt;/p&gt;

&lt;p&gt;This step becomes particularly important when delivery timelines are fixed, data quality is uneven or regulatory scrutiny is high.&lt;/p&gt;

&lt;h2&gt;
  
  
  Author
&lt;/h2&gt;

&lt;p&gt;Matt Cross is a Lead Business Analyst at Audacia. Matt has a background in leading requirements workshops, defining acceptance criteria for requirements and supporting stakeholders throughout the project lifecycle – on both consultancy and development projects across engineering, data, AI and cloud.&lt;/p&gt;

</description>
      <category>legacyit</category>
      <category>software</category>
      <category>agile</category>
    </item>
    <item>
      <title>Serverless Architectures: Designing for Scale, Simplicity and Resilience</title>
      <dc:creator>Audacia</dc:creator>
      <pubDate>Mon, 19 Jan 2026 08:30:00 +0000</pubDate>
      <link>https://dev.to/audaciatechnology/serverless-architectures-designing-for-scale-simplicity-and-resilience-2c62</link>
      <guid>https://dev.to/audaciatechnology/serverless-architectures-designing-for-scale-simplicity-and-resilience-2c62</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wd76e5ao3a9nxkt80nn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wd76e5ao3a9nxkt80nn.png" alt="Blog cover image of cloud technology"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Modern applications face a constant tension between competing architectural demands: systems must scale efficiently, remain highly available, perform well under load and be maintainable without excessive operational overhead.&lt;/p&gt;

&lt;p&gt;This blog, adapted from a Tech Talk by Principal Software Engineer, Luke Mitchell, explores cases where serverless architectures can address these requirements by shifting infrastructure management to cloud providers, allowing development teams to focus on building features rather than managing servers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scaling Strategies: Horizontal vs Vertical
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8n8ri75izhk570qhtu6h.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8n8ri75izhk570qhtu6h.jpg" alt="A diagram showing horizontal scaling and vertical scaling"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Understanding scaling approaches provides the foundation for appreciating serverless benefits. Vertical scaling involves adding resources to a single instance - more CPU, RAM or storage. The downside of this approach is that it creates a single point of failure: when that machine goes down, the entire service becomes unavailable.&lt;/p&gt;

&lt;p&gt;Horizontal scaling takes a different approach, adding more instances of the same machine. This design provides built-in fault tolerance because multiple machines handle requests simultaneously: if one machine fails, the others continue serving traffic. This redundancy makes horizontal scaling more resilient than vertical scaling, though it introduces more complexity in orchestration and load distribution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Case Study 1: From Monolithic Functions to Distributed Processing in Azure
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh2oajswp0jpetxdixci2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh2oajswp0jpetxdixci2.jpg" alt="Diagram showing a single Azure Function grabbing files from an SFTP server, and writing results to a Snowflake database"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The overhaul of this file processing system demonstrates how serverless patterns transform architecture. The initial implementation, as shown above, used a single Azure Function that continuously ran, grabbing files from an SFTP server, processing each entry sequentially and writing results to a Snowflake database. This design had several limitations: it scaled only vertically, created a single point of failure and left no clear recovery path when errors occurred mid-process.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyusap2s71z1tze0grzvf.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyusap2s71z1tze0grzvf.jpg" alt="Azure function with concurrent processing diagram "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To address these limitations, the architecture was refactored as shown above, splitting responsibilities across multiple components. The initial function now simply reads the file and splits each entry into individual messages on a storage queue. As messages arrive, a second function automatically scales out to process them in parallel. This distribution transforms sequential processing into concurrent execution, dramatically improving throughput.&lt;/p&gt;

&lt;p&gt;As well as speed, the switch to a serverless architecture also improves fault tolerance. If a function instance fails mid-processing, the message automatically returns to the queue for retry. Messages that consistently fail move to a poison queue for manual investigation, preventing problematic entries from blocking the entire pipeline. Unlike the original architecture, there is no need to track processing state within files or implement complex restart logic, because the queue handles these concerns automatically.&lt;/p&gt;
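&lt;p&gt;The pattern itself is independent of any particular SDK. The sketch below models it with an in-memory queue rather than real Azure storage queue bindings, and a single worker loop stands in for the many parallel function instances: one function splits the file into per-entry messages, a handler processes them, failed messages return to the queue for retry, and persistent failures are quarantined in a poison queue (the limit of 5 mirrors the default dequeue count for queue-triggered Azure Functions):&lt;/p&gt;

```python
from collections import deque

MAX_DEQUEUE_COUNT = 5  # default maxDequeueCount for Azure queue triggers

def split_file(contents: str, queue: deque) -> None:
    """First function: read the file, enqueue one message per entry."""
    for line in contents.strip().splitlines():
        queue.append({"body": line, "dequeue_count": 0})

def process_queue(queue: deque, poison: list, handler) -> list:
    """Second function: process messages; failures return to the queue
    until the dequeue count is exceeded, then move to the poison queue."""
    results = []
    while queue:
        msg = queue.popleft()
        msg["dequeue_count"] += 1
        try:
            results.append(handler(msg["body"]))
        except Exception:
            if msg["dequeue_count"] >= MAX_DEQUEUE_COUNT:
                poison.append(msg)  # quarantined for manual investigation
            else:
                queue.append(msg)   # automatic retry

    return results

# One malformed entry is retried, then quarantined, without blocking
# the rest of the pipeline.
queue, poison = deque(), []
split_file("1\n2\noops\n3", queue)
results = process_queue(queue, poison, handler=int)
# results == [1, 2, 3]; poison holds the "oops" message
```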

&lt;h2&gt;
  
  
  Case Study 2: Serving Static Content at Scale in AWS
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe9437acb4oxqvplmnpp7.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe9437acb4oxqvplmnpp7.jpg" alt="Web server architecture in AWS cloud"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Traditional web server architecture requires substantial infrastructure, as can be seen in the diagram above. Firstly, requests made by users are distributed by an application load balancer across EC2 instances deployed in multiple availability zones. Next, auto-scaling groups monitor traffic and adjust instance counts accordingly, adding capacity during peaks and removing it during lulls to control costs. Each virtual machine incurs a cost whether actively serving requests or sitting idle.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhz690i9ut3cnomu8n46k.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhz690i9ut3cnomu8n46k.jpg" alt="Serverless alternative to a web server with CloudFront and S3"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The serverless alternative simplifies this substantially. Static files reside in an S3 bucket, with CloudFront serving as the access point. CloudFront operates as a content distribution network with edge locations worldwide. When users request content, they receive it from the nearest edge location rather than travelling back to the origin region. This geographic distribution reduces latency significantly for global audiences.&lt;/p&gt;

&lt;p&gt;This serverless approach has benefits for performance, maintainability and scalability:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;S3 stores files across multiple availability zones by default; if one zone becomes unavailable, requests route to copies in other zones without manual intervention.&lt;/li&gt;
&lt;li&gt;CloudFront caches content at edge locations, reducing origin server load and improving response times.&lt;/li&gt;
&lt;li&gt;The entire stack scales to handle traffic spikes without configuration changes or capacity planning.&lt;/li&gt;
&lt;/ul&gt;
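&lt;p&gt;One practical detail when hosting a site this way: S3 serves each object with the Content-Type it was given at upload time, so deploy scripts typically guess the type per file before uploading. A minimal sketch - the boto3 upload call itself is left as a comment, since it needs AWS credentials and a real bucket:&lt;/p&gt;

```python
import mimetypes

def content_type_for(path: str) -> str:
    """Guess the Content-Type header for a static file before upload;
    S3 serves each object with whatever type was set at upload time."""
    guessed, _ = mimetypes.guess_type(path)
    return guessed or "application/octet-stream"

# Inside a deploy script, each file would then be pushed with its type:
# s3.put_object(Bucket=bucket, Key=key, Body=data,
#               ContentType=content_type_for(key))

print(content_type_for("index.html"))  # text/html
print(content_type_for("styles.css"))  # text/css
```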

&lt;h2&gt;
  
  
  Case Study 3: API Infrastructure Without Servers in AWS
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Figgspb46eqr97gg2421t.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Figgspb46eqr97gg2421t.jpg" alt="API server architecture in AWS cloud"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;API servers typically follow similar patterns to web servers: virtual machines behind load balancers, deployed across availability zones for resilience. This infrastructure requires ongoing maintenance - operating system patches, image updates, capacity planning and monitoring.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flvra9y2oac1c3w7xrkwd.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flvra9y2oac1c3w7xrkwd.jpg" alt="Serverless alternative to API server with API Gateway, with various integrations - Lambda functions, DynamoDB, SQS, SNS"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;API Gateway provides a serverless alternative, acting as a unified entry point for API traffic. It integrates directly with numerous AWS services: Lambda functions for compute, DynamoDB for database access, SQS for message queuing and SNS for publish-subscribe patterns. This integration flexibility enables varied architectural patterns without managing underlying infrastructure, and makes getting started far simpler than provisioning virtual machines.&lt;/p&gt;

&lt;p&gt;The publish-subscribe model through SNS is particularly powerful. A single message can fan out to multiple subscribers - perhaps a Lambda function sending notifications to Slack while work is simultaneously queued for asynchronous processing. This pattern enables event-driven architectures where services respond to events without tight coupling between components.&lt;/p&gt;
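&lt;p&gt;The fan-out behaviour can be sketched in a few lines. This is an in-memory stand-in for an SNS topic, not the AWS SDK, and the two subscribers are hypothetical stand-ins for a Slack-notifying Lambda and an SQS work queue:&lt;/p&gt;

```python
class Topic:
    """Minimal stand-in for an SNS topic: one publish fans out to
    every subscriber, with no coupling between the subscribers."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, fn):
        self.subscribers.append(fn)

    def publish(self, message):
        for fn in self.subscribers:
            fn(message)  # SNS delivers to each endpoint independently

# Hypothetical subscribers: a notification sender and a work queue.
notifications, work_queue = [], []
orders = Topic()
orders.subscribe(lambda m: notifications.append(f"order {m['id']} placed"))
orders.subscribe(lambda m: work_queue.append(m))

# A single publish reaches both - neither knows the other exists.
orders.publish({"id": 42, "total": 9.99})
# notifications == ["order 42 placed"]; work_queue holds the message
```

&lt;p&gt;The design benefit is the same as in SNS itself: adding a third consumer means adding a subscription, not changing the publisher.&lt;/p&gt;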

&lt;p&gt;This approach also brings its own benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multi-availability zone deployment happens by default.&lt;/li&gt;
&lt;li&gt;The platform automatically handles failover and scaling without explicit configuration.&lt;/li&gt;
&lt;li&gt;Updates don't require creating new machine images or coordinating rolling deployments across instances.&lt;/li&gt;
&lt;li&gt;The pay-per-use model means costs align directly with actual usage rather than provisioned capacity.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Trade-offs and Considerations
&lt;/h2&gt;

&lt;p&gt;Serverless architectures introduce their own considerations. For example, cold starts, the latency incurred when a function first initialises, can affect user experience. This can be mitigated, at an additional cost, through provisioned concurrency (keeping a specified number of Lambda instances always running) to keep functions warm. This trade-off matters most for latency-sensitive applications where milliseconds count.&lt;/p&gt;
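&lt;p&gt;Provisioned concurrency is not the only lever. A common complementary pattern is to place expensive initialisation (SDK clients, connection pools, model loading) at module scope, so it runs once per execution environment and only the cold start pays for it. A minimal sketch, with a counter standing in for the real initialisation cost:&lt;/p&gt;

```python
INIT_COUNT = 0

def _expensive_init():
    """Stand-in for loading SDK clients, secrets or configuration."""
    global INIT_COUNT
    INIT_COUNT += 1
    return {"db": "connected"}

# Module scope: runs once when the execution environment is created,
# i.e. on the cold start only.
RESOURCES = _expensive_init()

def handler(event, context=None):
    # Warm invocations reuse RESOURCES instead of re-initialising.
    return {"db": RESOURCES["db"], "n": event["n"]}

for n in range(3):  # three invocations in the same environment
    handler({"n": n})
# INIT_COUNT == 1: only the cold start paid the initialisation cost
```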

&lt;p&gt;Cost efficiency also depends on scale. Serverless platforms charge per request, making them economical for variable workloads. At extremely high sustained volumes, dedicated infrastructure may become more cost-effective, though this typically occurs only at the scale of major internet services.&lt;/p&gt;

&lt;p&gt;Some use cases still favour traditional servers, such as long-running processes that don't map cleanly to function execution models. Server-side rendering, for example, requires a server to generate HTML dynamically, which S3 and CloudFront alone cannot provide. Static site generation or pre-rendering can address some of these scenarios, but pure static hosting has SEO limitations without additional tooling.&lt;/p&gt;

&lt;p&gt;A learning curve exists for both approaches. Running servers requires expertise in load balancers, auto-scaling groups and virtual machine maintenance; serverless architectures require different knowledge - message queues, function composition and event-driven design. Teams should evaluate their existing skills and strategic direction when choosing between them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Serverless architectures deliver meaningful advantages in scalability, performance, fault tolerance and maintainability. By abstracting infrastructure management, they enable teams to focus on application logic rather than operational concerns. While not universal solutions, they provide compelling benefits for most modern applications, particularly those with variable traffic patterns or limited operations resources. The examples demonstrate that serverless patterns often simplify rather than complicate architecture, delivering better results with less overhead.&lt;/p&gt;

&lt;h2&gt;
  
  
  Watch the Tech Talk
&lt;/h2&gt;

&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/wNm8h3MkKUg"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

</description>
      <category>serverless</category>
      <category>cloud</category>
      <category>aws</category>
      <category>azure</category>
    </item>
    <item>
      <title>Putting the CD Back into CI/CD: A Guide to Continuous Deployment</title>
      <dc:creator>Audacia</dc:creator>
      <pubDate>Mon, 12 Jan 2026 08:30:00 +0000</pubDate>
      <link>https://dev.to/audaciatechnology/putting-the-cd-back-into-cicd-a-guide-to-continuous-deployment-174o</link>
      <guid>https://dev.to/audaciatechnology/putting-the-cd-back-into-cicd-a-guide-to-continuous-deployment-174o</guid>
      <description>&lt;p&gt;``&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiq8q4b0gkwprpc7t5dyp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiq8q4b0gkwprpc7t5dyp.png" alt="Cover image of a developer on stairs- representing steps towards continuous development"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Many organisations talk about CI/CD, but the reality is that most have achieved continuous integration (CI) without continuous deployment (CD). Embracing both CI and CD represents a fundamental shift in how software reaches production and how teams approach risk, quality and delivery.&lt;/p&gt;

&lt;p&gt;This blog, adapted from a Tech Talk by Principal Software Engineers Luke Mitchell and Akeel Ahmed, explores two distinct pathways to achieving true continuous deployment: trunk-based development with ephemeral environments, and Git Flow with structured release management. Both approaches can deliver frequent, reliable releases, but they require different technical infrastructure and different levels of cultural and organisational readiness.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Current State of Continuous Deployment
&lt;/h2&gt;

&lt;p&gt;Continuous integration has become standard practice through automated testing, code reviews and build pipelines. However, the journey from merged code to production often remains batched and infrequent.&lt;/p&gt;

&lt;p&gt;The reasons are varied: legacy approval processes inherited from waterfall methodologies, lack of confidence in automated testing, concerns about deployment risk or simply the complexity of managing multiple environments. Yet the benefits of genuine continuous deployment – faster feedback loops, reduced integration risk and the ability to respond rapidly to business needs – make it worth pursuing.&lt;/p&gt;
&lt;h2&gt;
  
  
  Trunk-Based Development
&lt;/h2&gt;

&lt;p&gt;At the core of trunk-based development is a single principle: the main branch should always be in a deployable state.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5b5saaxvtusqpj10h26a.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5b5saaxvtusqpj10h26a.jpg" alt="Diagram showing the branches in trunk-based development"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  The Fundamentals
&lt;/h3&gt;

&lt;p&gt;Unlike Git Flow's multiple long-lived branches, trunk-based development maintains a single main branch. Developers work on short-lived feature branches, typically lasting hours or days rather than weeks, before merging back to main. Each merge triggers an automated pipeline that can deploy directly to production.&lt;/p&gt;

&lt;p&gt;This approach demands discipline. Small, focused commits become essential. Code reviews must happen synchronously – within 10 to 15 minutes of raising a pull request. The entire team must prioritise getting code through the pipeline over starting new work.  &lt;/p&gt;
&lt;h3&gt;
  
  
  Ephemeral Environments
&lt;/h3&gt;

&lt;p&gt;One of the most powerful enablers of trunk-based development is the use of ephemeral environments. Rather than maintaining static QA and staging environments where multiple developers' changes intermingle, each feature branch spawns its own temporary environment.&lt;/p&gt;

&lt;p&gt;When a developer pushes their branch, the pipeline automatically provisions cloud infrastructure and deploys their changes to an isolated environment. This provides several advantages:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Isolation of changes:&lt;/strong&gt; Bugs discovered during testing are definitively linked to the change under test.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Parallel development:&lt;/strong&gt; Developers and testers can work simultaneously without interference, removing bottlenecks from the development and QA processes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost efficiency:&lt;/strong&gt; Environments are taken down automatically after the code merges to main, ensuring resources are only consumed when needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production parity:&lt;/strong&gt; Each ephemeral environment can mirror production configuration, reducing environment-specific issues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The workflow is streamlined:&lt;/strong&gt; Development happens on the feature branch, code review occurs when the developer opens a pull request, QA testing happens in the ephemeral environment once the PR is approved, and upon successful testing, the code merges to main and the ephemeral environment is deleted.  &lt;/p&gt;
&lt;h3&gt;
  
  
  Release Strategy for Trunk-based Development
&lt;/h3&gt;

&lt;p&gt;Merging to main doesn't necessarily mean immediate production deployment, though it could. Many teams create release candidate branches automatically upon merge to main. These branches can then be deployed to UAT or production based on business requirements.&lt;/p&gt;

&lt;p&gt;Teams take different approaches to deployment timing. Some deploy every merge to production immediately – true continuous deployment. Others batch a few tickets together, deploying the most recent release candidate branch that contains all the desired changes. The key principle remains constant: all code in main is production-ready, and the organisation decides when to deploy based on business needs, not technical readiness.&lt;/p&gt;

&lt;p&gt;To track what's currently live, many teams maintain a production branch, merging their release candidate branches into it after deployment. This provides a valuable snapshot of the live environment, simplifying rollbacks and hotfixes by maintaining a known good state to return to. Teams requiring additional safeguards sometimes create rollback candidate branches automatically before each production deployment, though this adds complexity that not all teams need.&lt;/p&gt;

&lt;p&gt;Feature flags provide an additional layer of deployment control that works with both trunk-based development and Git Flow. They're particularly valuable in trunk-based development, where code deploys to production frequently, by controlling feature visibility independently of code deployment.&lt;/p&gt;
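&lt;p&gt;At its simplest, a feature flag is a runtime check that gates unfinished code, letting it merge to main and deploy while staying invisible to users. A minimal sketch - the flag name and in-memory store here are hypothetical, and real systems would back this with a configuration service or a product such as LaunchDarkly:&lt;/p&gt;

```python
# In-memory flag store; production systems would read from a
# configuration service so flags can flip without a deployment.
FLAGS = {"new_checkout": False}

def is_enabled(flag: str) -> bool:
    return FLAGS.get(flag, False)

def checkout(basket: list) -> str:
    # Both code paths live in main; the flag decides which one runs.
    if is_enabled("new_checkout"):
        return f"new flow: {len(basket)} items"
    return f"legacy flow: {len(basket)} items"

print(checkout(["a", "b"]))   # legacy flow: 2 items
FLAGS["new_checkout"] = True  # release decoupled from deployment
print(checkout(["a", "b"]))   # new flow: 2 items
```

&lt;p&gt;The key property is that flipping the flag is a release decision, not a deployment, so rolling a feature back no longer requires rolling code back.&lt;/p&gt;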
&lt;h3&gt;
  
  
  The Benefits
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Risk mitigation:&lt;/strong&gt; pinpointing bugs becomes easier in smaller, recent releases. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Early client/user feedback:&lt;/strong&gt; a client’s vision can change or become clearer when presented with something concrete – it’s best to know as early as possible.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reactive to change:&lt;/strong&gt; small releases reduce the amount of time and difficulty it takes to get feedback and implement changes.
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  The Cultural Requirements
&lt;/h3&gt;

&lt;p&gt;Trunk-based development requires significant cultural change. It demands trust that developers will maintain quality, that automated tests are comprehensive and that the team will respond quickly to production issues.&lt;/p&gt;

&lt;p&gt;It also requires scaling back bureaucratic approval processes. Change Advisory Boards can be antithetical to continuous deployment if every change is scrutinised. The governance must shift from manual approval gates to automated quality gates and rapid response capabilities.&lt;/p&gt;

&lt;p&gt;Full team ownership becomes paramount. From junior developers to tech leads, everyone shares responsibility for production stability. This shared accountability, combined with the practice of deploying small changes frequently, reduces risk compared to large, infrequent releases.  &lt;/p&gt;
&lt;h2&gt;
  
  
  Git Flow
&lt;/h2&gt;

&lt;p&gt;Not every organisation can adopt ephemeral environments immediately. Infrastructure constraints, compliance requirements or existing tooling may necessitate static environments. Git Flow provides a structured approach to continuous deployment within these constraints.  &lt;/p&gt;
&lt;h3&gt;
  
  
  The Git Flow Model
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwvf07oqxl6votkptern3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwvf07oqxl6votkptern3.jpg" alt="Diagram showing the branches in GitFlow"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;Git Flow employs multiple long-lived branches with specific purposes:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Main reflects production and is updated only with tested, stable releases.&lt;/li&gt;
&lt;li&gt;Release branches are cut from develop to deploy to production.&lt;/li&gt;
&lt;li&gt;Develop serves as the integration branch for ongoing development. &lt;/li&gt;
&lt;li&gt;Feature branches are created from develop for new functionality. &lt;/li&gt;
&lt;li&gt;Bugfix branches are short-lived branches created from develop or release to fix defects, and are merged back into their source branch once resolved.&lt;/li&gt;
&lt;li&gt;Hotfix branches are created from main for urgent production fixes, and merged back into main and develop.&lt;/li&gt;
&lt;/ul&gt;
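&lt;p&gt;The branch rules above can be expressed as a small lookup table - a hypothetical sketch for sanity-checking merges, not part of Git itself:&lt;/p&gt;

```python
# Hypothetical encoding of the Git Flow branch rules: where each
# branch type is created from, and where a finished branch merges back.
BRANCH_RULES = {
    "feature": {"from": "develop", "into": ["develop"]},
    "release": {"from": "develop", "into": ["main", "develop"]},
    "bugfix":  {"from": "develop or release", "into": ["source branch"]},
    "hotfix":  {"from": "main", "into": ["main", "develop"]},
}

def merge_targets(branch_type: str) -> list[str]:
    """Branches a finished branch of this type should merge into."""
    return BRANCH_RULES[branch_type]["into"]
```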

&lt;p&gt;This structure provides clear separation between development, testing and production code. Whilst feature branches can be short-lived with good continuous integration practices, the methodology naturally supports more structured release cycles.  &lt;/p&gt;
&lt;h3&gt;
  
  
  Release Planning and Management
&lt;/h3&gt;

&lt;p&gt;Success with Git Flow depends heavily on release planning and management. Rather than ad-hoc deployments, teams batch related user stories into planned releases. This upfront planning – tagging stories with release identifiers early in the sprint – provides predictability for stakeholders whilst still enabling frequent releases.&lt;/p&gt;

&lt;p&gt;The workflow operates in distinct phases:  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Development phase:&lt;/strong&gt; Developers merge feature branches to develop, which automatically deploys to a shared QA environment for testing. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Release preparation:&lt;/strong&gt; When all features for a release are complete and QA-tested, a release branch is created from develop. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UAT phase:&lt;/strong&gt; The release branch is deployed to UAT for stakeholder testing. Crucially, no new features are added during this phase – only bug fixes and refinements. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production deployment:&lt;/strong&gt; After successful UAT, the release branch deploys to production and merges back to main, providing a live reflection of production in the codebase. &lt;/p&gt;
&lt;h3&gt;
  
  
  Managing Hotfixes
&lt;/h3&gt;

&lt;p&gt;Git Flow excels at handling production issues whilst development continues. Hotfix branches are created from main, tested independently, and deployed to production without disrupting the develop branch or ongoing releases.&lt;/p&gt;

&lt;p&gt;A practical versioning approach helps manage this: if release 1.0 is in production and a bug is discovered, create hotfix branch 1.1, deploy it to production, then merge it back to both main and develop to keep everything aligned.  &lt;/p&gt;
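&lt;p&gt;That versioning convention can be sketched as a tiny helper (the major.minor scheme and branch-naming here are assumptions of this example, not a Git Flow requirement):&lt;/p&gt;

```python
def next_hotfix_version(version: str) -> str:
    """Given a production release like '1.0', return the next
    hotfix version by bumping the minor component ('1.1')."""
    major, minor = version.split(".")
    return f"{major}.{int(minor) + 1}"

def hotfix_branch_name(version: str) -> str:
    # e.g. production runs release 1.0 -> create branch 'hotfix/1.1'
    return f"hotfix/{next_hotfix_version(version)}"
```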
&lt;h3&gt;
  
  
  The Advantages of Structure
&lt;/h3&gt;

&lt;p&gt;Git Flow's structure provides several benefits for teams and stakeholders: &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stability:&lt;/strong&gt; The main branch always reflects production, reducing confusion about what code is live. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Visibility:&lt;/strong&gt; Clear branching structure makes it easy to understand what features are in which release. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Control:&lt;/strong&gt; Product owners and project managers have explicit control over what gets released and when. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Transparency:&lt;/strong&gt; Every merge, tag and deployment is logged, providing an audit trail for accountability. &lt;/p&gt;

&lt;p&gt;This structure particularly benefits larger teams where multiple developers work on the same codebase simultaneously. The isolation between branches provides clearer separation of concerns and reduces the risk of unstable code reaching production.  &lt;/p&gt;
&lt;h2&gt;
  
  
  Choosing Your Approach
&lt;/h2&gt;

&lt;p&gt;In a simple analogy, trunk-based development can be imagined as multiple passengers in different taxis heading to the same destination, whereas Git Flow involves a group of passengers on a bus – going through each checkpoint together. The decision between the two strategies depends on infrastructure capabilities and business requirements.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiir7vjnurcnpoureb3d4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiir7vjnurcnpoureb3d4.jpg" alt="Comparison of trunk-based development and gitflow depicted through cars and buses"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Trunk-Based Development Fits When: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have cloud infrastructure that supports ephemeral environments &lt;/li&gt;
&lt;li&gt;Your team is comfortable with high deployment frequency &lt;/li&gt;
&lt;li&gt;Automated testing provides high confidence &lt;/li&gt;
&lt;li&gt;There's organisational trust in the development team &lt;/li&gt;
&lt;li&gt;Small, incremental releases align with business needs &lt;/li&gt;
&lt;li&gt;You want to minimise the feedback loop between development and production &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Git Flow Fits When: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have static environments that can't easily be replicated &lt;/li&gt;
&lt;li&gt;Releases need stakeholder approval or coordination &lt;/li&gt;
&lt;li&gt;Compliance requires structured release documentation &lt;/li&gt;
&lt;li&gt;Larger teams benefit from clear branch isolation &lt;/li&gt;
&lt;li&gt;Business prefers predictable, planned release schedules &lt;/li&gt;
&lt;li&gt;You're transitioning from traditional release processes &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Neither approach is inherently superior. Both can achieve continuous deployment if implemented well. The key is matching the approach to your context and executing it with discipline.  &lt;/p&gt;
&lt;h2&gt;
  
  
  Making It Work: Practices
&lt;/h2&gt;

&lt;p&gt;Regardless of which strategy you choose, a few practices are essential for successful continuous deployment: &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Automated Testing as a Foundation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Quality gates must be automated and comprehensive. Unit tests, integration tests and UI tests should run automatically on every commit. These tests become your confidence in deployment – they must be reliable and fast.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Synchronous Code Reviews&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Code reviews can't be allowed to become bottlenecks. Establishing the expectation that pull requests receive attention within 15 minutes keeps code flowing. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Communication and Collaboration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Continuous deployment requires continuous communication. Development teams and testers must collaborate closely, using tools like Slack, Teams or Azure DevOps to stay coordinated. Early feedback loops with product owners and clients help ensure that frequent releases deliver what stakeholders want.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monitoring and Observability&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When deploying frequently, you must know immediately if something goes wrong. Comprehensive monitoring, alerting and logging become essential. The ability to quickly diagnose and resolve production issues provides the confidence to deploy often.  &lt;/p&gt;
&lt;h2&gt;
  
  
  Next Steps
&lt;/h2&gt;

&lt;p&gt;Whether through trunk-based development's simplicity or Git Flow's structured approach – start small. Each increase in deployment frequency teaches lessons about improvements that can be made to testing, automation, monitoring or process.&lt;/p&gt;

&lt;p&gt;Moving from continuous integration to genuine continuous deployment represents a significant evolution in development maturity. It requires technical investment in automation and infrastructure, cultural change in how teams approach quality and risk, and organisational trust in development practices.&lt;/p&gt;
&lt;h2&gt;
  
  
  Watch the Tech Talk
&lt;/h2&gt;

&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/_6tlPpoKYjs"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

</description>
      <category>devops</category>
      <category>cicd</category>
      <category>git</category>
      <category>software</category>
    </item>
    <item>
      <title>The Building Blocks of AI Governance: Policies, Principles &amp; People</title>
      <dc:creator>Audacia</dc:creator>
      <pubDate>Mon, 15 Dec 2025 09:27:13 +0000</pubDate>
      <link>https://dev.to/audaciatechnology/the-building-blocks-of-ai-governance-policies-principles-people-1ej8</link>
      <guid>https://dev.to/audaciatechnology/the-building-blocks-of-ai-governance-policies-principles-people-1ej8</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ab21ufxycu1fe7euw6y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ab21ufxycu1fe7euw6y.png" alt=" "&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;This blog post has been adapted from &lt;a href="https://audacia.co.uk/events/implementing-ai-governance?utm_campaign=2025-Tech-Talks-May-AI-Governance&amp;amp;utm_source=DEV.to&amp;amp;utm_medium=Blog-post&amp;amp;utm_content=DEV.to-Governance" rel="noopener noreferrer"&gt;this Tech Talk&lt;/a&gt; by Chris Bentley, Lead Data Scientist at Audacia. Find the full video at the end of the article.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Navigating AI’s Expanding Landscape
&lt;/h2&gt;

&lt;p&gt;Increasingly, AI is being woven into the fabric of modern engineering - whether it's enterprise models like ChatGPT, off-the-shelf cloud tools or bespoke machine learning pipelines.&lt;/p&gt;

&lt;p&gt;However, with every new capability comes new risk. As AI capabilities grow, so does the chance of unintended consequences: discrimination, security vulnerabilities or even loss of control over powerful systems. The solution is to ensure governance is considered at every part of the pipeline; it should be a robust, evolving framework grounded in clear principles, backed by thoughtful policies and shaped by the right people - enabling them to take ownership and drive responsible outcomes.&lt;/p&gt;

&lt;p&gt;This article sets out a practical foundation for technology leaders looking to implement or update AI governance.&lt;/p&gt;
&lt;h2&gt;
  
  
  What We Mean by AI (And Why It Matters for Governance)
&lt;/h2&gt;

&lt;p&gt;Before discussing governance, it helps to define what we mean by "AI."&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fszi3h5zwfavl12zr67v0.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fszi3h5zwfavl12zr67v0.jpg" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rule-based systems: The earliest AI was entirely programmatic - explicit rules codified by humans to mimic decision-making in well-understood domains.&lt;/li&gt;
&lt;li&gt;Machine learning: A huge leap forward. Algorithms learn patterns from data to make predictions or decisions without explicitly coded rules for every scenario.&lt;/li&gt;
&lt;li&gt;Deep learning: A subset of machine learning that uses multi-layered neural networks to capture complex patterns in vast datasets.&lt;/li&gt;
&lt;li&gt;Generative AI: At the innermost core sits generative AI: deep learning models trained on massive datasets to produce new content - text, code, images or audio.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most of the governance debate today is triggered by generative AI. However, in practice, governance concerns apply across AI in all its forms - not just generative AI. Whether you're building a tailored fraud detection model or experimenting with ChatGPT prompts, the same foundational risks around ethics, security and control still apply.&lt;/p&gt;
&lt;h2&gt;
  
  
  A Working Definition of AI Governance
&lt;/h2&gt;

&lt;p&gt;At its heart, AI governance is a framework of guidelines, processes and practices to ensure AI systems are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ethical: respecting human values and avoiding harm&lt;/li&gt;
&lt;li&gt;Safe: robust, reliable, aligned with your organisation's attitude to risk&lt;/li&gt;
&lt;li&gt;Transparent: open to inspection, traceable and explainable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And crucially, governance should span the full lifecycle - from initial scoping and development to deployment, monitoring and daily use.&lt;/p&gt;

&lt;p&gt;In a simple analogy, AI governance is like the rules of the road. Your business context sets the landscape, your developers are the drivers, and AI is the vehicle. Governance provides the signposts, traffic lights and certifications to ensure you reach your destination (the use case) safely - without crashing the system or harming bystanders along the way.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why Governance Matters More Than Ever
&lt;/h2&gt;

&lt;p&gt;Ignoring AI governance comes with very real consequences:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Zillow: Their machine learning system for home buying was trained on outdated market data. Without ongoing governance to detect drift or continuously fine tune the model with new data, the model consistently overbid, racking up losses of over $500 million and forcing layoffs and program shutdowns. (&lt;a href="https://insideainews.com/2021/12/13/the-500mm-debacle-at-zillow-offers-what-went-wrong-with-the-ai-models/" rel="noopener noreferrer"&gt;InsideAI News&lt;/a&gt;, 2021)&lt;/li&gt;
&lt;li&gt;Samsung: Engineers pasted proprietary code into ChatGPT to debug problems, unaware of the implications. The result was uncontrolled exposure of intellectual property, forcing an emergency ban on AI use. (&lt;a href="https://www.forbes.com/sites/siladityaray/2023/05/02/samsung-bans-chatgpt-and-other-chatbots-for-employees-after-sensitive-code-leak/" rel="noopener noreferrer"&gt;Forbes&lt;/a&gt;, 2023)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Additionally, a UK poll revealed one in five companies experienced data leaks due to ungoverned GenAI use. Meanwhile, 92% of Fortune 500 firms already use ChatGPT - sometimes through informal &lt;a href="https://audacia.co.uk/blog/preventing-shadow-ai?utm_campaign=2025-Thought-leadership-Shadow-AI-July&amp;amp;utm_source=DEV.to&amp;amp;utm_medium=Blog-post&amp;amp;utm_content=DEV.to-governance" rel="noopener noreferrer"&gt;"shadow AI"&lt;/a&gt;, where employees independently adopt tools without IT or legal sign-off. (&lt;a href="https://www.reuters.com/technology/openais-altman-pitches-chatgpt-enterprise-large-firms-including-some-microsoft-2024-04-12/" rel="noopener noreferrer"&gt;Reuters&lt;/a&gt;, 2024)&lt;/p&gt;

&lt;p&gt;Governance isn't there to slow teams down. It's your best route to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Build trust and adoption, internally and with customers&lt;/li&gt;
&lt;li&gt;Mitigate operational, legal and financial risk&lt;/li&gt;
&lt;li&gt;Ensure your AI systems are auditable, reproducible, and scalable&lt;/li&gt;
&lt;li&gt;Shorten time to production through clear, standardised practices&lt;/li&gt;
&lt;li&gt;Attract top technical talent who care about ethical, forward-thinking engineering.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Core Principles: The Ethical Backbone
&lt;/h2&gt;

&lt;p&gt;The starting point for any governance framework is a set of core principles. These are underlying high-level ethical and operational guidelines - your non-negotiables for how AI gets built and used.&lt;/p&gt;

&lt;p&gt;Here are six example principles split into two core areas:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Oversight &amp;amp; Integrity&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Accountability: Define clear roles and ownership for AI usage. Know who's responsible for model outcomes and empower leaders to take corrective action.&lt;/li&gt;
&lt;li&gt;Ethics: Align AI with moral and societal values. Actively guard against bias or discriminatory outcomes.&lt;/li&gt;
&lt;li&gt;Transparency: Make your AI systems understandable. This allows effective auditing and builds technical literacy across teams.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;User Rights &amp;amp; Protection&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Security: Guard against unauthorised access and misuse. Protect systems from compromise.&lt;/li&gt;
&lt;li&gt;Privacy: Safeguard personal and sensitive data. Stay compliant with GDPR and evolving global standards.&lt;/li&gt;
&lt;li&gt;Control: Give users and your organisation the means to override or restrict AI outputs to stay aligned with human judgement.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These principles are deliberately broad; they form the basis of many governance policies, which are then narrowed with specific organisational context.&lt;/p&gt;
&lt;h2&gt;
  
  
  How Governance Changes Shape Up the Stack
&lt;/h2&gt;

&lt;p&gt;Governance doesn't look the same at every level. Consider a typical AI project lifecycle with your users/project managers at the centre, and your engineers/data scientists embedded in the process:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnlkzgwp6d1u1ewefhajc.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnlkzgwp6d1u1ewefhajc.jpg" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can approach this lifecycle at three different levels:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Built systems: Built from scratch by data science teams. Here governance focuses on model development standards, data selection (to mitigate bias or toxicity) and hands-on monitoring.&lt;/li&gt;
&lt;li&gt;Cloud services: Plug-and-play frameworks where your team provides data and tweaks. You're responsible for due diligence on service choice, feeding clean data and verifying outputs comply with standards.&lt;/li&gt;
&lt;li&gt;AI products: Tools like Copilot or ChatGPT. Governance shifts to vetting vendors, understanding their transparency commitments and educating employees on approved use.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Governance doesn't diminish as we move up the stack; it simply changes shape.&lt;/p&gt;
&lt;h2&gt;
  
  
  Turning Principles into Policy
&lt;/h2&gt;

&lt;p&gt;So how do you move from abstract principles to something actionable?&lt;/p&gt;

&lt;p&gt;A practical first step is an overarching AI governance policy document, tailored to your existing organisation. This might include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A formal statement of your principles&lt;/li&gt;
&lt;li&gt;Tables of roles &amp;amp; responsibilities&lt;/li&gt;
&lt;li&gt;Clear implementation guides with examples&lt;/li&gt;
&lt;li&gt;Checklists for assessing risks and impacts before adoption&lt;/li&gt;
&lt;li&gt;Standards for monitoring, auditing and escalation paths&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Good policy documents are clear, accessible and easy to update. Avoid over-complication that restricts workflows and innovation - it massively reduces the likelihood of widespread adoption.&lt;/p&gt;
&lt;h2&gt;
  
  
  Mitigating the Biggest Risks
&lt;/h2&gt;

&lt;p&gt;Build policies that actively spot and reduce risks. Examples include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Shadow AI exposure: Keep a registry of approved (and banned) tools. Provide sanctioned alternatives - e.g. enterprise-grade ChatGPT - to steer teams away from unsafe workarounds.&lt;/li&gt;
&lt;li&gt;Model drift &amp;amp; stale data: As Zillow discovered, failing to monitor changing data can be ruinous. Bake regular model reviews into your policy.&lt;/li&gt;
&lt;li&gt;Sensitive inputs: Guardrails (both technical and policy-based) to stop developers pasting IP into consumer SaaS tools.&lt;/li&gt;
&lt;li&gt;Privacy leaks: Ensure privacy reviews are a standard step in your deployment process.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Consider introducing lightweight artifacts like audit checklists or readiness questionnaires. You may want to build these into your governance policy document or introduce them as separate tools to keep them more dynamic. These integrate governance without adding heavy process that stifles engineering momentum.&lt;/p&gt;
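&lt;p&gt;As one hypothetical shape for such an artifact, a readiness questionnaire can be a short list of gates that must all pass before adoption (the questions here are illustrative):&lt;/p&gt;

```python
# Hypothetical AI-readiness checklist: each question is answered
# True/False and a tool must pass every gate before adoption.
CHECKLIST = [
    "Is the tool on the approved registry?",
    "Has a privacy review been completed?",
    "Is there a named owner for its outputs?",
    "Is a model/data review scheduled?",
]

def ready_to_adopt(answers: dict[str, bool]) -> bool:
    """Adoption is approved only if every checklist item is answered True."""
    return all(answers.get(q, False) for q in CHECKLIST)
```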
&lt;h2&gt;
  
  
  People: The True Drivers of Governance
&lt;/h2&gt;

&lt;p&gt;No policy lives in a vacuum. Successful AI governance comes down to people.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dedicated roles: Many organisations appoint an AI governance lead - often a data scientist or architect with a passion for responsible AI. They bridge exec strategy and daily developer practice.&lt;/li&gt;
&lt;li&gt;Defined responsibilities: Make sure everyone knows how governance relates to their role, and who to go to with questions.&lt;/li&gt;
&lt;li&gt;Periodic training: Keep teams current on new models, new regulations, and what policies mean for their work.&lt;/li&gt;
&lt;li&gt;Open culture: Foster spaces to raise concerns, suggest improvements, and discuss AI ethics. This not only improves adoption - it makes your governance framework stronger and more relevant.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Staying Dynamic in a Fast-Moving World
&lt;/h2&gt;

&lt;p&gt;AI isn't standing still. It took ChatGPT two months to hit 100 million users - the fastest ever for a consumer application. (&lt;a href="https://www.theguardian.com/technology/2023/feb/02/chatgpt-100-million-users-open-ai-fastest-growing-app" rel="noopener noreferrer"&gt;The Guardian&lt;/a&gt;, 2023) Meanwhile, model sizes are growing by orders of magnitude, costs to train are plummeting and multi-agent systems are pushing us closer to artificial general intelligence.&lt;/p&gt;

&lt;p&gt;In practice, this means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Review your policies often: use strict version control, iterate quickly.&lt;/li&gt;
&lt;li&gt;Monitor your deployed systems: stay alert for unanticipated changes, especially if using enterprise models updated outside your control.&lt;/li&gt;
&lt;li&gt;Keep learning: from regulatory frameworks (like the UK's pro-innovation principles or ISO 42001) to evolving global standards, staying informed is essential.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Where to Start (Wherever You Are)
&lt;/h2&gt;

&lt;p&gt;Not every organisation is at the same stage. If you're only just exploring AI:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Audit your workflows. Where could AI realistically help? Is shadow AI already creeping in?&lt;/li&gt;
&lt;li&gt;Use this to shape your first minimal governance guardrails.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're sporadically using AI via contractors or pilot projects:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identify who might lead your governance efforts. Do you have data specialists who can step up? What lessons from past projects can be formalised into your first policies? &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're mature in AI use but light on governance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Start documenting your implicit standards. Turn them into explicit, auditable principles and policies. Avoid slowing innovation but build the right checks so your systems stay robust and ethical. &lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  In Closing
&lt;/h2&gt;

&lt;p&gt;AI governance boils down to three pillars: policies, principles and people. Get them right and you unlock AI’s potential in a way that’s safe and aligned with your values.&lt;/p&gt;
&lt;h2&gt;
  
  
  Watch the full Tech Talk:
&lt;/h2&gt;

&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/hKFGFuecy-g"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

</description>
      <category>ai</category>
      <category>software</category>
    </item>
    <item>
      <title>Building a Tech Radar: A Practical Guide for Technology Leaders</title>
      <dc:creator>Audacia</dc:creator>
      <pubDate>Mon, 01 Dec 2025 11:21:39 +0000</pubDate>
      <link>https://dev.to/audaciatechnology/building-a-tech-radar-a-practical-guide-for-technology-leaders-3o6p</link>
      <guid>https://dev.to/audaciatechnology/building-a-tech-radar-a-practical-guide-for-technology-leaders-3o6p</guid>
      <description>&lt;p&gt;&lt;em&gt;This blog post has been adapted from &lt;a href="https://audacia.co.uk/events/building-a-tech-radar-what-why-and-how" rel="noopener noreferrer"&gt;this&lt;/a&gt; Tech Talk by Richard Brown, Technical Director at Audacia.Find the full video at the end of the article.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Technology leaders regularly make decisions that shape the technical direction of organisations. Each choice regarding frameworks, languages or tools influences how systems are built and maintained. The challenge is keeping those decisions aligned across organisations.&lt;/p&gt;

&lt;p&gt;One of the most effective ways to achieve that alignment is by using a Tech Radar. &lt;/p&gt;

&lt;h2&gt;
  
  
  What is a Tech Radar?
&lt;/h2&gt;

&lt;p&gt;At its simplest, a Tech Radar is a visual representation of technology choices within your organisation. It shows where each tool, framework or platform sits in terms of maturity and adoption. &lt;/p&gt;

&lt;p&gt;A Tech Radar makes it easy to answer key questions: what should we be adopting? What's being trialled? What’s emerging? What’s being phased out? &lt;/p&gt;

&lt;p&gt;The concept originated at Thoughtworks, but every organisation can adapt it to fit their own priorities. &lt;/p&gt;

&lt;h2&gt;
  
  
  Why Tech Radars are Important
&lt;/h2&gt;

&lt;p&gt;Without an explicit method for tracking technology choices, teams can drift, and tools can be selected based on convenience or habit rather than strategy. Over time, this results in inconsistency and risk. &lt;/p&gt;

&lt;p&gt;A Tech Radar forces deliberate conversations about technology. It provides a shared view of what’s recommended, what’s under evaluation, and what’s no longer safe to use. &lt;/p&gt;

&lt;p&gt;For technology leaders, the benefits are clear: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Knowledge sharing&lt;/strong&gt; 
In large organisations or consultancies, there is a danger that teams become siloed. A radar combats this by making choices visible. It becomes a central reference point for engineers starting new projects or exploring unfamiliar tools. 
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Risk management&lt;/strong&gt; 
Technologies age, licensing models change, maintainers move on. When a framework is deprecated or a tool becomes risky, the radar is the single source of truth that makes this clear. 
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Future direction&lt;/strong&gt; 
By looking at what’s being trialled or assessed, you can see where the organisation is heading. It informs hiring, training and investment decisions. 
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Anatomy of a Tech Radar
&lt;/h2&gt;

&lt;p&gt;A radar has two main elements: &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Segments (or quadrants):&lt;/strong&gt; Categories of technology. These can be as traditional as ‘Languages &amp;amp; Frameworks’ or as precise as ‘Customer Experience Platforms’. &lt;br&gt;
&lt;strong&gt;Rings:&lt;/strong&gt; Levels of maturity. Thoughtworks use: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hold: Do not use (either immature or on the way out). &lt;/li&gt;
&lt;li&gt;Assess: Watch closely. &lt;/li&gt;
&lt;li&gt;Trial: Test in a controlled way. &lt;/li&gt;
&lt;li&gt;Adopt: Recommended for general use. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At Audacia, we adapted this structure to reflect the way our projects are structured and the services we deliver. &lt;/p&gt;

&lt;p&gt;For example, our Cloud &amp;amp; DevOps quadrant spans everything from CI/CD pipelines to infrastructure-as-code tools. Data &amp;amp; AI merges data engineering with machine learning, because these disciplines are often inseparable in practice. Additionally, we renamed ‘Hold’ to ‘Avoid’. If a tool has been deprecated or poses security risks, its position in the ‘Avoid’ ring makes this explicit. New teams no longer waste time rediscovering the same issues. &lt;/p&gt;

&lt;p&gt;This tailoring is important. Choose quadrants and rings that reflect how you work, rather than adopting someone else’s model. For some organisations, that might mean categories around business domains rather than tools. Copying another company’s structure rarely gives good results. &lt;/p&gt;
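&lt;p&gt;As a rough sketch (names and placements are illustrative), a radar entry - a ‘blip’ - is just a technology with a quadrant and a ring, which makes questions like ‘what should we adopt?’ a simple filter:&lt;/p&gt;

```python
# Illustrative blip model: each technology sits in one quadrant and
# one ring. The entries below are hypothetical examples.
RADAR = [
    {"name": "Terraform", "quadrant": "Cloud & DevOps", "ring": "Adopt"},
    {"name": "LangChain", "quadrant": "Data & AI", "ring": "Trial"},
    {"name": "AngularJS", "quadrant": "Languages & Frameworks", "ring": "Avoid"},
]

def blips_in_ring(ring: str) -> list[str]:
    """Names of all technologies currently placed in a given ring."""
    return [b["name"] for b in RADAR if b["ring"] == ring]
```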
&lt;h2&gt;
  
  
  How To Manage a Tech Radar
&lt;/h2&gt;

&lt;p&gt;A Tech Radar is not a one-off exercise. It must be living and breathing. &lt;/p&gt;

&lt;p&gt;At Audacia, we’ve built a lightweight internal web application to host our Tech Radar. It’s updated continuously, with a clear owner for each quadrant. Every blip includes context: where it’s used, why it’s recommended or avoided and who to talk to. Because the radar is visible, it keeps everyone informed and turns technical direction from implicit knowledge into a documented, shared resource. &lt;/p&gt;

&lt;p&gt;We also make it easy for engineers to suggest updates through a feedback mechanism. This lowers the barrier to contribution, ensuring the radar reflects the collective expertise of the organisation. People on the ground often see trends earlier than leadership. &lt;/p&gt;

&lt;p&gt;Ensure that your Tech Radar is reviewed regularly – a stale radar is worse than no radar. Additionally, make changes visible to everyone. When a significant shift occurs, such as a major library deprecation, communicate it internally beyond the radar itself. &lt;/p&gt;
&lt;h2&gt;
  
  
  Common Questions
&lt;/h2&gt;
&lt;h4&gt;
  
  
  How long should you keep a deprecated technology on the radar?
&lt;/h4&gt;

&lt;p&gt;It depends on how widely it was used. If it was central to past projects, leave it visible in the ‘Avoid’ ring for longer so that future teams know not to use it. &lt;/p&gt;
&lt;h4&gt;
  
  
  Should cost factor into a decision?
&lt;/h4&gt;

&lt;p&gt;A Tech Radar provides visibility on what is and is not recommended, which could be for several reasons: technical, cost, licensing or others. Therefore, if licensing costs make a tool unviable, that should be reflected in its position. &lt;/p&gt;
&lt;h4&gt;
  
  
  How can we build one?
&lt;/h4&gt;

&lt;p&gt;You can start with a simple shared document. Over time, invest in a web-based version that supports filtering, links and feedback. Several open-source templates exist, but building your own allows the flexibility to match your structure. &lt;/p&gt;

&lt;p&gt;A Tech Radar is a conversation starter, a knowledge-sharing tool and a governance mechanism. The radar is also a way of making risk visible, such as licensing changes, security vulnerabilities or shifts in community support. It keeps technology choices deliberate and visible, ensuring that decisions made today continue to serve the organisation. &lt;/p&gt;

&lt;p&gt;For technology leaders, it is one of the simplest, most effective ways to set direction, manage risk and explain choices. &lt;/p&gt;

&lt;p&gt;If you don’t already have one, now is the time to start.&lt;/p&gt;
&lt;h2&gt;
  
  
  Watch the full Tech Talk:
&lt;/h2&gt;

&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/TBL4vxUIdqo"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

</description>
      <category>leadership</category>
    </item>
    <item>
      <title>Brick by Brick: How to Define the Right System Requirements</title>
      <dc:creator>Audacia</dc:creator>
      <pubDate>Mon, 17 Nov 2025 08:46:09 +0000</pubDate>
      <link>https://dev.to/audaciatechnology/brick-by-brick-how-to-define-the-right-system-requirements-514a</link>
      <guid>https://dev.to/audaciatechnology/brick-by-brick-how-to-define-the-right-system-requirements-514a</guid>
      <description>&lt;p&gt;&lt;em&gt;This blog post has been adapted from &lt;a href="https://audacia.co.uk/events/defining-requirements-for-software-projects" rel="noopener noreferrer"&gt;this&lt;/a&gt; Tech Talk by Matt Cross, Lead Business Analyst at Audacia. Find the full video at the end of the article. &lt;/em&gt; &lt;/p&gt;

&lt;p&gt;Successful software projects are rarely the result of chance. They emerge from a disciplined approach to understanding the problem, structuring requirements and aligning teams. These same principles – clarity of vision, scope control and collaboration – are as essential to replacing a medieval Lego castle as they are to delivering a complex software system. &lt;/p&gt;

&lt;p&gt;This article explores three core challenges in defining effective system requirements: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Stacking Wisely – Managing Scope and Priorities &lt;/li&gt;
&lt;li&gt;The Picture on the Box – Visualising Requirements &lt;/li&gt;
&lt;li&gt;All Hands on Bricks – Engaging Stakeholders&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  1. Stacking Wisely: Managing Scope and Priorities
&lt;/h2&gt;

&lt;p&gt;Every project begins with a list of ideas, but the reality of time, cost and risk forces a tough question: what belongs in the first release? Delivering everything at once may sound appealing, but it often leads to instability, wasted effort and missed opportunities for learning from real users. &lt;/p&gt;

&lt;h3&gt;
  
  
  Start with the problem
&lt;/h3&gt;

&lt;p&gt;Before anything else, ensure decision-makers agree on what problem is being solved. A simple framework of three questions is effective: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Who is experiencing the problem? &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;What is the problem? &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Why does it matter? &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In a castle-building analogy, merchants face a market constrained by poor foundations – limiting growth and trade. In software terms, a system too unstable for additional features prevents teams from adapting to market demand. A clearly articulated problem provides a lens through which features can be assessed. &lt;/p&gt;

&lt;h3&gt;
  
  
  Avoid the trap of over-simplification with clear principles
&lt;/h3&gt;

&lt;p&gt;A narrow focus on the immediate problem can produce a quick win but risks long-term frustration, and features that do not consider scalability, flexibility or future needs will soon require rebuilding. For example, a market expansion may succeed, but if entry gates remain too small or foundational choices too rigid, growth will be limited. &lt;/p&gt;

&lt;p&gt;Defining a short set of principles alongside your problem statement helps safeguard against short-sighted decisions by providing context on what’s important when evaluating features. These principles should influence how requirements are prioritised and where trade-offs are acceptable. They may include long-term architectural goals, user experience priorities or operational resilience. For instance, if future phases will introduce significant load or new user groups, early design choices – like implementing role-based access or building with modularity in mind – can ensure the system is prepared to accommodate tomorrow’s needs, even during MVP scoping. You can read our eight over-arching Engineering Principles here.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Simplify prioritisation
&lt;/h3&gt;

&lt;p&gt;Initial prioritisation benefits from a binary ‘in or out’ filter to quickly define what makes the MVP. Once that shortlist is clear, apply a MoSCoW analysis (Must, Should, Could, Won’t) to balance value and effort. This two-step approach reduces ambiguity, prevents scope creep and clarifies which items belong to future phases. &lt;/p&gt;
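&lt;p&gt;The two-step approach can be sketched in a few lines – the feature records and category labels below are invented purely for illustration:&lt;/p&gt;

```python
from collections import defaultdict

# Step 0: a candidate feature list; shape and names are assumptions.
features = [
    {"name": "User login", "mvp": True, "moscow": "Must"},
    {"name": "Search", "mvp": True, "moscow": "Should"},
    {"name": "Dark mode", "mvp": False, "moscow": "Could"},
]

# Step 1: the binary "in or out" filter that defines the MVP shortlist.
shortlist = [f for f in features if f["mvp"]]

# Step 2: MoSCoW analysis applied only to the shortlist.
plan = defaultdict(list)
for f in shortlist:
    plan[f["moscow"]].append(f["name"])
```

Anything filtered out at step 1 never reaches the MoSCoW discussion, which is what keeps the second conversation focused.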

&lt;h2&gt;
  
  
  2. The Picture on the Box: Visualising Requirements
&lt;/h2&gt;

&lt;p&gt;Software is intangible, and unlike a Lego set, there is no picture on the box to show you what it should look like. Without clear visualisation, requirements can be misinterpreted, dependencies overlooked and effort wasted. The ‘picture on the box’ challenge is to make these intangible elements visible, so we can determine which bricks are needed and where they should be stacked. Techniques such as user story mapping and wireframes can help to achieve this.   &lt;/p&gt;

&lt;h3&gt;
  
  
  User Story Mapping
&lt;/h3&gt;

&lt;p&gt;User story mapping is a practical and collaborative way to present a backlog as a visual, flowing representation of a product. Its purpose is to present the epics, features and user stories in the backlog as a chart – almost like a process diagram – showing both the big picture and the details of what is being built. &lt;/p&gt;

&lt;p&gt;Think of it as a layered map: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The top row outlines high-level epics that represent the largest goals &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Under each epic, features group related functionality into logical processes&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Finally, user stories are the individual steps required to complete a task &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This tree-like structure should be laid out to trace a start-to-finish path through the application from the user’s perspective. This exercise makes it easier to separate essential functionality from outdated processes that can be discarded. &lt;/p&gt;

&lt;p&gt;The process is inherently collaborative. Working through the journey step-by-step exposes gaps, validates assumptions and challenges unnecessary complexity. Story maps can also support later tasks, such as prioritisation, dependency mapping and planning iterative releases.&lt;/p&gt;
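&lt;p&gt;The layered map above is essentially a small tree, which can be sketched as nested data – the epic, features and stories below are invented for illustration only:&lt;/p&gt;

```python
# Epics at the top, features beneath them, user stories as individual steps.
story_map = {
    "Buy a product": {                  # epic
        "Browse catalogue": [           # feature
            "View product list",        # user stories
            "Filter by category",
        ],
        "Checkout": [
            "Add to basket",
            "Pay by card",
        ],
    },
}

def walk(story_map):
    """Trace a start-to-finish path through the map, story by story."""
    for epic, features in story_map.items():
        for feature, stories in features.items():
            for story in stories:
                yield (epic, feature, story)

path = list(walk(story_map))
```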

&lt;h3&gt;
  
  
  Wireframes
&lt;/h3&gt;

&lt;p&gt;When interface and user experience are central to a feature, simple wireframes add clarity. Even rough sketches highlight layout, workflows and user paths, making discussions concrete. The introduction of Generative AI tools and design software such as Figma means wireframes have become more time-effective and accessible – rather than resorting to rough paint drawings or expensive UX design.  &lt;/p&gt;

&lt;p&gt;These lightweight diagrams can help stakeholders quickly critique, refine and align on functionality before development begins. For example, a wireframe of a market interface might expose usability issues, such as layouts unsuited to mobile devices, or prompt the addition of features like search and filters. Teams can incorporate these visual guides into user stories, enhancing shared understanding among developers, testers and stakeholders. &lt;/p&gt;

&lt;h2&gt;
  
  
  3. All Hands on Bricks: Engaging Stakeholders
&lt;/h2&gt;

&lt;p&gt;Building a system requires the efforts of many hands. Different groups bring different experiences, priorities and assumptions. Without careful management, these differences can derail a project. &lt;/p&gt;

&lt;h3&gt;
  
  
  Recognising Stakeholder Motivations
&lt;/h3&gt;

&lt;p&gt;Not all stakeholders see change in the same light: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Some value the legacy system and fear losing familiar tools. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Others are eager for innovation and push for ambitious features. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Some may be sceptical, based on past experiences. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Others may simply lack understanding of what is possible. &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Recognising these underlying motivations helps tailor communication and build trust. &lt;/p&gt;

&lt;h3&gt;
  
  
  Fostering Collaboration
&lt;/h3&gt;

&lt;p&gt;Early workshops set the tone. Use structured icebreakers and open-ended questions to encourage contributions. For example, ask participants to share what excites them about the project and what they see as possible risks. Ensure quieter voices are heard by directly inviting input, so no critical knowledge is overlooked. &lt;/p&gt;

&lt;p&gt;Stakeholder engagement is more than extracting requirements – it is about building ownership. People who contribute to the design process are more likely to support and champion the final product. &lt;/p&gt;

&lt;h3&gt;
  
  
  Clear and Consistent Communication
&lt;/h3&gt;

&lt;p&gt;Projects benefit from a central knowledge base that includes: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Problem statements and project principles &lt;/li&gt;
&lt;li&gt;Glossaries of terms and roles &lt;/li&gt;
&lt;li&gt;Guidance on processes like testing and agile practices &lt;/li&gt;
&lt;li&gt;Tools and dashboards for visibility on progress &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Provide introductory sessions, especially when stakeholders are unfamiliar with agile delivery. Record these sessions and keep materials accessible for new team members joining mid-project. &lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Defining the right system requirements is a structured, collaborative process. This process involves stacking wisely, creating the ‘visual on the box’ through techniques such as wireframes and story maps and engaging stakeholders through thoughtful communication, consistent knowledge sharing and structured workshops.  &lt;/p&gt;

&lt;p&gt;When approached this way, requirements gathering can transform from standard documentation to a system foundation, built brick by brick, that is resilient, user-focused and scalable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Watch the full Tech Talk:
&lt;/h2&gt;

&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/tylHUgUeCSA"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

</description>
      <category>software</category>
    </item>
    <item>
      <title>Delivering Greenfield Projects: Getting the Foundations Right</title>
      <dc:creator>Audacia</dc:creator>
      <pubDate>Mon, 03 Nov 2025 09:02:10 +0000</pubDate>
      <link>https://dev.to/audaciatechnology/delivering-greenfield-projects-getting-the-foundations-right-432e</link>
      <guid>https://dev.to/audaciatechnology/delivering-greenfield-projects-getting-the-foundations-right-432e</guid>
      <description>&lt;p&gt;How to get the first line of code - and everything that follows - right.&lt;/p&gt;

&lt;p&gt;Without legacy constraints, greenfield projects allow teams to bake in modern practices, cloud-native architectures and a developer-first culture from day one. Done well, those early choices compound into faster releases, easier scaling and happier teams for years. Done poorly, they can create tomorrow’s technical-debt problems. This article discusses how development teams can lay solid foundations when starting from scratch.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Automate early.&lt;/strong&gt; A CI/CD pipeline, test automation and Infrastructure as Code (IaC) from sprint 0 lock in speed and quality for the life of the product.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Design for cloud and APIs.&lt;/strong&gt; Cloud-first, API-first and security-by-design principles align with the UK Government’s Technology Code of Practice and give future teams flexibility at scale.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Balance YAGNI with extensibility.&lt;/strong&gt; Build “just enough” architecture - simple, loosely coupled services that are easy to extend later, not an over-engineered fortress.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Invest in observability, documentation and culture.&lt;/strong&gt; Practices like structured logging, Architecture Decision Records (ADRs), documentation and peer code review are far cheaper to embed before the codebase grows.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Blank Slates - Opportunities &amp;amp; Challenges:
&lt;/h2&gt;

&lt;p&gt;Greenfield projects are often described by engineers as a “nirvana” – no legacy code, no entrenched constraints, the freedom to choose the best technologies.&lt;/p&gt;

&lt;p&gt;But they also carry risk: without legacy constraints, teams might under-invest in necessary structure or, conversely, over-engineer because everything is possible.&lt;/p&gt;

&lt;p&gt;For IT leaders, a greenfield initiative (be it a new digital product, a spin-off system, or a major rewrite separated from legacy) is an opportunity to set an example for how software should be done. It’s the chance to incorporate lessons learned from past projects and evolving industry practices.&lt;/p&gt;

&lt;p&gt;One key guiding principle is “first build the right thing, then build it right” – meaning even with a perfect technical foundation, you must ensure the product meets user needs. Greenfield teams should still follow agile, user-centric development to validate they’re building a valuable product.&lt;/p&gt;

&lt;p&gt;But our focus here is on building it right - the engineering practices and infrastructure that form the foundation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Foundations:
&lt;/h2&gt;

&lt;h4&gt;
  
  
  CI/CD Pipelines:
&lt;/h4&gt;

&lt;p&gt;Automated pipelines build the code, run tests and, if tests pass, deploy to a test environment – and, with continuous delivery, on to production. By doing this from the start, every code commit goes through a consistent, repeatable process, and developers become accustomed to fast feedback. It also enforces good habits: if a certain practice (like running unit tests) is mandated by the pipeline, it will become part of the team’s routine.&lt;/p&gt;

&lt;p&gt;The pipeline can essentially be the embodiment of your process – add static analysis, security scans, etc., early on, so the team gets immediate feedback and quality is baked in.&lt;/p&gt;

&lt;p&gt;Many UK startups attribute their rapid scale-up to having CI/CD from the get-go. For example, when &lt;a href="https://monzo.com/blog/2022/05/16/how-we-deploy-to-production-over-100-times-a-day" rel="noopener noreferrer"&gt;Monzo&lt;/a&gt; began, they invested heavily in automation and tooling, which allowed them to deploy small changes frequently, catching issues early and scaling their operations without a hitch as user numbers grew.&lt;/p&gt;

&lt;h4&gt;
  
  
  Architecture and Design Principles:
&lt;/h4&gt;

&lt;p&gt;Choose an architecture that will support future needs without over-complicating, for instance, favouring a microservice or modular monolith architecture.&lt;/p&gt;

&lt;p&gt;Starting with microservices can allow independent teams to work in parallel and deploy independently, fuelling rapid feature development. The key is loose coupling – design components or services with clear responsibilities and interfaces, so that the system can evolve.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.computerweekly.com/feature/Gartner-Modernising-legacy-applications-for-cloud-native-success#:~:text=An%20assessment%20of%20your%20application,different%20portions%20of%20your%20code" rel="noopener noreferrer"&gt;Gartner suggests&lt;/a&gt; that reducing coupling and complexity at design time increases future changeability. Apply known design principles (SOLID, high cohesion, etc.) to avoid complexity as features grow.&lt;/p&gt;

&lt;h4&gt;
  
  
  Cloud-Native &amp;amp; Infrastructure Automation:
&lt;/h4&gt;

&lt;p&gt;The UK government &lt;a href="https://www.gov.uk/guidance/the-technology-code-of-practice" rel="noopener noreferrer"&gt;Technology Code of Practice&lt;/a&gt; states technology projects should “use cloud first”. Essentially all greenfield projects should be cloud-first. Cloud provides on-demand resources, managed services and scalability that a new project can leverage instead of reinventing.&lt;/p&gt;

&lt;p&gt;Use Infrastructure as Code (IaC) – tools like Terraform or AWS CloudFormation – to script your environments. This ensures you can replicate environments, do disaster recovery easily, and treat infrastructure setup as part of your codebase.&lt;/p&gt;

&lt;p&gt;Additionally, consider using platform services to accelerate development (databases, messaging, authentication services). A greenfield project can save time by not building commodity components itself. For example, why build your own identity service if you can use Azure Entra ID or Auth0? This frees your team to focus on unique business logic.&lt;/p&gt;

&lt;h4&gt;
  
  
  DevSecOps:
&lt;/h4&gt;

&lt;p&gt;Security and compliance must be foundational, not an afterthought. Incorporate security design (threat modeling, secure defaults) and compliance requirements early. For instance, if building a healthcare app in the UK, ensure from day one that the data model and hosting comply with NHS Digital standards for patient data.&lt;/p&gt;

&lt;p&gt;Implement security controls (encryption, secure secret storage, logging) as part of the initial build. Use automated static code analysis and dependency vulnerability scanning in your CI pipeline. It’s easier to build a secure product from scratch than to retrofit one later.&lt;/p&gt;

&lt;p&gt;The Technology Code of Practice emphasises “Make things secure” and “Make privacy integral” as key points. Teams can focus on setting up monitoring and alerting with security in mind. For instance, define what constitutes suspicious behaviour in your system and plan how you’d detect it.&lt;/p&gt;

&lt;h4&gt;
  
  
  Open and Accessible Development:
&lt;/h4&gt;

&lt;p&gt;Adopting an open-first approach can have many benefits. Using open-source components (with due diligence) accelerates development. Hosting your code in repositories where collaboration is easy (GitHub, GitLab) and possibly open-sourcing parts of it can attract community contributions, and the added visibility can make onboarding new team members easier.&lt;/p&gt;

&lt;p&gt;It’s an important opportunity to also consider accessibility from the start (especially important for public-facing services) – follow WCAG guidelines from the outset so you aren’t scrambling to fix accessibility later (for example, use semantic HTML, proper ARIA tags, etc., in web apps).&lt;/p&gt;

&lt;p&gt;Building accessible and inclusive technology is not only a legal requirement for some (e.g. public sector must meet certain accessibility standards), but also expands your potential user base.&lt;/p&gt;

&lt;h4&gt;
  
  
  Team and Process Foundations:
&lt;/h4&gt;

&lt;p&gt;Greenfield doesn’t mean “no process”; rather, it means the chance to implement lightweight agile processes that fit the team and purpose.&lt;/p&gt;

&lt;p&gt;Define how the team will collaborate, for example a Scrum or Kanban approach. Set up a backlog with user stories, define a Definition of Done (including testing, documentation, etc.), and use an agile project tool (Jira, Trello, Azure Boards) from the start to track work, ensuring transparency.&lt;/p&gt;

&lt;p&gt;Encourage practices like pair programming or peer code reviews from early on – these habits catch defects early and spread knowledge. Also, instill an engineering culture that aligns with your values. For example, if innovation is key, ensure people have time for spikes/proof-of-concepts; if reliability is crucial, emphasise TDD (Test-Driven Development) from the outset.&lt;/p&gt;

&lt;p&gt;A strong positive culture set in a small initial team can scale with the product. For example, &lt;a href="https://monzo.com/blog/we-have-updated-our-engineering-principles" rel="noopener noreferrer"&gt;Monzo’s engineering principles&lt;/a&gt;, such as “make changes small and often” and “leave things better than you found them”, were set early and helped maintain quality and speed even as the engineering team scaled 60% in a year.&lt;/p&gt;

&lt;h4&gt;
  
  
  Building for Scale (but not over-building):
&lt;/h4&gt;

&lt;p&gt;One risk in greenfield projects is over-engineering by trying to anticipate every future need. It’s important to strike a balance. Design an architecture that can scale out if needed, but don’t implement features or complexity you don’t need yet.&lt;/p&gt;

&lt;p&gt;A useful concept is YAGNI (You Ain’t Gonna Need It) from agile: defer work on future hypothetical requirements until they are more certain. For example, you might foresee that the system could need a more complex sharding mechanism when it has millions of users, but if you’re at prototype stage with 100 users, don’t implement sharding now; just design the data access in a way that adding sharding later is possible (e.g. via an abstraction layer).&lt;/p&gt;
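&lt;p&gt;Deferring sharding behind an abstraction layer can be sketched like this – the interface and class names are illustrative, not a recommendation of a specific design:&lt;/p&gt;

```python
from abc import ABC, abstractmethod

# Callers depend only on the interface, so a sharded implementation can be
# swapped in later without rewriting them.
class UserStore(ABC):
    @abstractmethod
    def get(self, user_id: int) -> dict: ...

class SingleStore(UserStore):
    """The simple, single-database implementation that's sufficient today."""
    def __init__(self):
        self._rows = {}
    def save(self, user_id: int, row: dict):
        self._rows[user_id] = row
    def get(self, user_id: int) -> dict:
        return self._rows[user_id]

# Later, a ShardedStore(UserStore) could route on hash(user_id) % n_shards;
# nothing that depends on UserStore would need to change.
store: UserStore = SingleStore()
store.save(42, {"name": "Ada"})
```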

&lt;p&gt;Another useful concept is building for extensibility, not general scalability, because trying to accommodate every potential future use makes systems complex and harder to maintain. Instead, build something that works for the known requirements, but with clean separation of concerns and with the ability to extend.&lt;/p&gt;

&lt;p&gt;For instance, if building a payment processing service, you might design it to handle credit cards initially but make sure the way you implement doesn’t hardcode specifics that would prevent adding PayPal later – perhaps use a strategy pattern for payment methods. You wouldn’t, however, implement PayPal support from day one if it’s not needed – you’d just ensure you won’t have to rewrite everything when adding it.&lt;/p&gt;
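&lt;p&gt;A minimal sketch of that strategy pattern – class and method names are assumptions for the example – might look like this: cards are the only method implemented today, but adding PayPal later means adding a class, not rewriting the service:&lt;/p&gt;

```python
from abc import ABC, abstractmethod

# Each payment method is a strategy behind a common interface.
class PaymentMethod(ABC):
    @abstractmethod
    def charge(self, amount_pence: int) -> str: ...

class CardPayment(PaymentMethod):
    def charge(self, amount_pence: int) -> str:
        return f"charged {amount_pence}p to card"

class PaymentService:
    def __init__(self, method: PaymentMethod):
        self.method = method  # strategies swap without touching callers
    def pay(self, amount_pence: int) -> str:
        return self.method.charge(amount_pence)

service = PaymentService(CardPayment())
```

Adding PayPal support would be a new `PaymentMethod` subclass plus whatever wiring selects it; `PaymentService` itself stays untouched.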

&lt;h4&gt;
  
  
  Setting Up Environment &amp;amp; Tools:
&lt;/h4&gt;

&lt;p&gt;On a practical level, ensure developers on a greenfield project have a frictionless environment.&lt;/p&gt;

&lt;p&gt;This might include using containerisation (Docker) so that setting up the development environment is quick and matches production as much as possible. Many teams create a one-click onboarding script so a new dev can get the whole system running locally or in a personal dev environment in the cloud within minutes. Quick onboarding is a sign of good foundations.&lt;/p&gt;

&lt;p&gt;As well as this, teams can implement source control best practices (feature branching or trunk-based dev, code reviews on pull requests, etc.) from day one. These practices are easier to establish when products are new, rather than fix later.&lt;/p&gt;

&lt;h4&gt;
  
  
  Monitoring &amp;amp; Observability from the Start:
&lt;/h4&gt;

&lt;p&gt;An often overlooked foundation is observability.&lt;/p&gt;

&lt;p&gt;Instrument the new application with logging, metrics and tracing early on. If you only add monitoring after going live, you might find you lack crucial insight into the system’s behaviour.&lt;/p&gt;

&lt;p&gt;Instead, incorporate libraries for structured logging and alerting, choose a metrics collection system (such as cloud-native ones like AWS CloudWatch or Azure Monitor if on those clouds), and possibly integrate a tracing system (like OpenTelemetry) if you have a microservices architecture.&lt;/p&gt;

&lt;p&gt;This “observability by design” means when you flip the switch to production, you can see what’s happening and catch performance issues or errors proactively.&lt;/p&gt;
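&lt;p&gt;As a small illustration of structured logging with Python’s standard library – the field names and logger name are assumptions – events can be emitted as JSON with named fields, so they can be parsed and alerted on rather than grepped:&lt;/p&gt;

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as a JSON object with named fields."""
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "event": getattr(record, "event", None),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("orders")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# The "event" field arrives via `extra` and becomes a record attribute.
logger.info("order placed", extra={"event": "order.placed"})
```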

&lt;h2&gt;
  
  
  Takeaways:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Codify Best Practices:&lt;/strong&gt; Document and implement engineering best practices for the project early. For example, establish a rule that all new code must have unit tests and must be peer-reviewed. Put this in a Wiki for the project, helping to ensure consistency and making information easily accessible.&lt;/li&gt;
&lt;/ul&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Observe Performance:&lt;/strong&gt; Even if you don’t need to handle millions of users on day one, set up the ability to do performance testing. Create baseline performance tests (throughput, response time) for key operations and include them in your pipeline (even if they only run nightly). This way, you catch any egregious performance issues early. It also provides a baseline to compare as features are added.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Keep Documentation Up-to-date:&lt;/strong&gt; Start maintaining minimal but useful documentation. For example, a README for how to run the system, an Architecture Decision Record (ADR) log where you record key decisions (e.g. “Decision 1: We chose PostgreSQL over MySQL because...”). These ADRs help future team members understand why things were done. Also, document interfaces and APIs as you create them – possibly by adopting an API-first approach (write API spec first, then code). This avoids the scenario where 2 years in, nobody remembers why X was done or how Y works. Tools like Swagger and OpenAPI can be useful here in providing up to date API documentation automatically.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Prototype to Validate Architecture:&lt;/strong&gt; If you’re trying a novel architecture or new technology in the greenfield, do a quick prototype of the riskiest part. For instance, if you plan event-driven microservices, prototype a couple of services and the messaging between them to ensure it behaves as expected. This can surface integration challenges or learning curves early, allowing you to adjust the plan while risks are low.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In short, a greenfield build is a once-only chance to future-proof your product. By baking in cloud-native infrastructure, automated CI/CD, observability, solid security and a culture of small, high-quality changes from sprint 0, teams can help to avoid tomorrow’s technical-debt problems and drive improved developer velocity.&lt;/p&gt;

&lt;p&gt;Rhys Smith, Principal Software Engineer &lt;/p&gt;

</description>
      <category>greenfield</category>
      <category>software</category>
      <category>cloud</category>
      <category>devops</category>
    </item>
    <item>
      <title>Data Products: Build vs Buy</title>
      <dc:creator>Audacia</dc:creator>
      <pubDate>Mon, 20 Oct 2025 08:22:49 +0000</pubDate>
      <link>https://dev.to/audaciatechnology/data-products-build-vs-buy-1c39</link>
      <guid>https://dev.to/audaciatechnology/data-products-build-vs-buy-1c39</guid>
      <description>&lt;h2&gt;
  
  
  What is a “Data Product”?
&lt;/h2&gt;

&lt;p&gt;In recent years, the term “data product” has emerged in data strategy circles, especially with the rise of data mesh architecture. A data product is essentially a curated dataset or data service that is treated as a product – meaning it’s designed to be easily consumed, has a clear purpose, and is managed through a lifecycle (with owners, versioning, improvements, etc.). &lt;/p&gt;

&lt;p&gt;Data products can be operational, e.g. feeding real-time processes, or analytical, e.g. feeding human analysis or models. For example, an operational data product might be an API providing customer credit scores to be used in loan applications in real-time, whereas an analytical data product could be a cleaned and enriched customer 360 dataset that analysts use to generate marketing insights. &lt;/p&gt;

&lt;p&gt;Crucially, data products are owned by cross-functional teams (not just IT) and serve a defined customer, such as an internal user or an application. This approach marks a shift from seeing data as a by-product of applications to seeing data as a first-class product in its own right. In practice, treating data as a product means things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;documenting what the data contains (metadata)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;ensuring its quality and freshness&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;providing it via convenient interfaces (SQL, API, etc.),&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;and iterating on it based on user feedback.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Build vs Buy – Developing data products:
&lt;/h2&gt;

&lt;p&gt;Organisations often face the question of whether to build their own data products in-house or to leverage third-party data products (or vendor solutions). Building a data product internally means your team defines the data set, gathers and transforms the data, and provides it to consumers. This gives maximum control and provides a custom fit to your needs.&lt;/p&gt;

&lt;p&gt;For example, a UK retailer might build an internal data product of “store footfall and sales forecast” combining CCTV counters, point-of-sale data, and weather data – something unique to their context. On the other hand, buying a data product could mean subscribing to an external data service or purchasing a packaged dataset. For instance, many enterprises subscribe to data products like credit bureaus (for credit scores), market data feeds (for finance), or analytics platforms that come with pre-built data models.&lt;/p&gt;

&lt;p&gt;For large enterprises, “buy” might also refer to using packaged analytics solutions that include data – e.g., a Customer 360 platform that provides a model of customer data out-of-the-box. The trade-off often comes down to core competencies and differentiation: if the data product represents proprietary insight or competitive advantage, building in-house makes sense. If it’s a commodity or a common need (like address validation data, or benchmark industry data), buying can save time. Many organisations do a mix: build the internal unique combinations, but enrich with bought data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Governance and lifecycle of data products:
&lt;/h2&gt;

&lt;p&gt;Whether built or bought, data products require governance akin to software products.&lt;/p&gt;

&lt;p&gt;This means assigning ownership, typically a data product owner role, similar to a product manager, often someone in the business who understands both the data and user needs. It also implies lifecycle management, from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;initial design (where requirements of the “users” of the data are gathered), to&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;development (data engineering to create the pipelines), to&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;deployment (publishing the data product for consumption), and&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;continuous improvement (adding new attributes, improving quality, etc.).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example, an analytical data product “Customer Segmentation Data” might start with basic demographic attributes and later incorporate social media data as the product evolves. &lt;/p&gt;

&lt;p&gt;Governance also covers access control (who can use the data product), compliance checks (does it contain personal data and if so, is that handled properly?), and ensuring consistency if multiple data products overlap.&lt;/p&gt;

&lt;p&gt;In a data mesh approach, each domain, such as Marketing, Finance, or Supply Chain, might produce its own data products, but there needs to be federated governance to ensure, for instance, that the definition of “customer” is consistent or that data products interoperate.&lt;/p&gt;

&lt;p&gt;One approach is the use of data product catalogues – essentially an organised inventory of all data products with meaningful descriptions, so users can discover them and trust them.&lt;/p&gt;

&lt;p&gt;Instead of a technical data catalogue that might overwhelm users, a data product catalogue lists products like “Sales Dashboard Dataset – updated daily, owner: Analytics Team, quality SLA 99% complete” and so on, making it clear what is available. This approach has been observed in organisations adopting data mesh, where they present data in a marketplace-style portal internally.&lt;/p&gt;

&lt;h2&gt;
  
  
  Operational vs Analytical data products:
&lt;/h2&gt;

&lt;p&gt;To clarify the difference, consider a large UK bank. An operational data product could be something like “Fraud scores API” – a real-time service that gives a fraud risk score for a transaction. It’s a data product because it’s based on data and models, packaged behind an API, and has an owner – the fraud analytics team – who ensures it’s working efficiently.&lt;/p&gt;

&lt;p&gt;An analytical data product example is “Monthly Customer Profitability Dataset” – a compiled dataset that finance and marketing analysts download or query to do their analysis. It might not be real-time but it’s produced with each month’s data, with known definitions and quality checks, and it’s serving the analytical community. &lt;/p&gt;

&lt;p&gt;Both types need reliability, but operational data products often need higher uptime and responsiveness (SLAs on latency), whereas analytical data products emphasise correctness and richness of context, with good documentation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Examples in practice:
&lt;/h2&gt;

&lt;p&gt;For example, a global consumer goods company could implement a data product approach for its sales and marketing data. Instead of each region doing its own data extraction and report building, they could create a standardised data product such as “Global Sales Snapshot” – a data table, updated daily, containing key metrics by region, channel, and product.&lt;/p&gt;

&lt;p&gt;They would “productise” it by assigning a product owner from the central analytics team, automating the pipeline, and setting up a help channel for users. Users then no longer have to wrangle data themselves – they have a ready “product” to consume. This is reflective of a wider trend: a well-governed data product can greatly increase data re-use and efficiency, reducing duplicative work.&lt;/p&gt;

&lt;p&gt;On the “buy” side, consider regulated sectors: many UK insurance companies buy data products such as vehicle telematics data or flood risk data to integrate into underwriting. They treat these external datasets as part of their data product ecosystem – for instance, an underwriting data product that merges internal claims history with an external flood risk score per postcode. The interplay of build vs buy is evident here: they build the integration and custom dataset, but buy the specialist data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Data products and vendor solutions:
&lt;/h2&gt;

&lt;p&gt;Some vendors market “data products” or pre-built analytics solutions. For example, a vendor might offer a Customer Analytics data model that an enterprise can adopt rather than designing their own from scratch. &lt;/p&gt;

&lt;p&gt;Large enterprises often evaluate these to accelerate their analytics projects. The key is to ensure alignment with internal definitions and to avoid vendor lock-in on a critical asset. In some cases, buying a data product like a curated dataset (e.g. a market share database for your industry from a research firm) is a better choice, since building it yourself is impractical. In other cases, if it’s your proprietary operational data, you likely need to build or at least heavily customise the data product internally.&lt;/p&gt;

&lt;h2&gt;
  
  
  Governance best practices for data products:
&lt;/h2&gt;

&lt;p&gt;Each data product should have clearly defined SLAs/SLOs (service level agreements/objectives) that cover factors such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;data latency (e.g. data will be no more than 24 hours old),&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;quality metrics (e.g. 98% of records have complete values on critical fields), and&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;support procedures (who to contact if something looks wrong). &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
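&lt;p&gt;These service levels lend themselves to automated checks. As a minimal sketch in Python – the field names, thresholds, and record structure here are illustrative assumptions, not any particular platform’s API:&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone

def check_sla(records, critical_fields, max_age_hours=24, min_completeness=0.98):
    """Check a data product extract against two simple SLOs:
    freshness (age of the newest record) and completeness of critical fields."""
    newest = max(r["loaded_at"] for r in records)
    fresh = datetime.now(timezone.utc) - newest <= timedelta(hours=max_age_hours)

    # Share of records with every critical field populated.
    complete = sum(
        all(r.get(f) not in (None, "") for f in critical_fields) for r in records
    ) / len(records)

    return {"fresh": fresh, "completeness": complete,
            "meets_sla": fresh and complete >= min_completeness}
```

&lt;p&gt;A check like this could run after each pipeline load and trigger the “who to contact if something looks wrong” procedure when it fails.&lt;/p&gt;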

&lt;p&gt;Many organisations incorporate data products into their Data Governance Councils, meaning that any new data product proposed is reviewed for compliance and value, and its performance is periodically reviewed.&lt;/p&gt;

&lt;p&gt;Data products also tie closely to data ownership culture: rather than IT owning all data, the business domain that knows the data best owns the product. For example, HR owns the “Employee Master Data Product”, Finance owns “Financial Actuals Data Product”, etc., with IT providing the tooling and platform support.&lt;/p&gt;

&lt;h2&gt;
  
  
  Build vs Buy decision factors:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Time to value: Buying an external data product or pre-built solution can be faster, but may not fit all needs; building takes longer but can be more precisely tailored.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Uniqueness: If the data or logic is a source of competitive advantage (e.g. a unique algorithm using your data), build it. If it’s generic (everyone uses it, like compliance data), consider buy.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Cost and maintenance: Building in-house means ongoing maintenance costs, whereas bought products externalise some of that, e.g. subscription fees.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Integration: An internal build can integrate better with your existing architecture. An external product might come with integration adapters but could introduce silos if not careful.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Expertise: Do you have the skills? If not, buying or partnering might be better to ensure quality. Conversely, building can grow internal expertise in important data domains.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Overall, effective data product strategies often involve starting small – identifying a high-value dataset, productising it, demonstrating success, and then scaling the approach to more domains.&lt;/p&gt;

&lt;p&gt;The cultural change (people thinking in terms of products and “customers” of data) is as important as the technical change. By having data products, organisations prevent the scenario of each analyst or project doing redundant data wrangling. It fosters a “one source of truth” mentality for each important data domain.&lt;/p&gt;

&lt;p&gt;Data products can make data easier to use and trust, by packaging it with the user in mind. And with greater ownership in place, quality and reliability tend to improve (because domain teams ensure their data product is up to scratch).&lt;/p&gt;

&lt;p&gt;Adam Brookes, Head of Consulting &lt;/p&gt;

</description>
      <category>data</category>
    </item>
    <item>
      <title>When You Don’t Need AI - Just Maths &amp; Statistics</title>
      <dc:creator>Audacia</dc:creator>
      <pubDate>Tue, 14 Oct 2025 14:31:32 +0000</pubDate>
      <link>https://dev.to/audaciatechnology/when-you-dont-need-ai-just-maths-statistics-43dn</link>
      <guid>https://dev.to/audaciatechnology/when-you-dont-need-ai-just-maths-statistics-43dn</guid>
      <description>&lt;p&gt;In the rush towards &lt;a href="https://audacia.co.uk/guide-to-ai-and-machine-learning" rel="noopener noreferrer"&gt;AI and machine learning&lt;/a&gt;, it’s easy to forget that many business problems can be solved – often more transparently and robustly – with traditional mathematical and statistical techniques. &lt;/p&gt;

&lt;p&gt;Organisations, particularly those with mature analytics teams, often find that “simpler is better” for a range of use cases. This article highlights examples where statistical models or mathematical techniques can provide appropriate solutions in place of complex AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  Examples:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Time-series forecasting: &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Many organisations need to forecast things like sales, demand, or budgets. Classical statistical models (ARIMA, exponential smoothing, Holt-Winters) often perform as well as or better than machine learning models on these tasks when data is limited or seasonal patterns are strong. &lt;/p&gt;

&lt;p&gt;For example, in retail, a simple seasonal ARIMA model can predict weekly store sales, which can be a simple and fast alternative to implementing an AI model, as well as being easier to convey to stakeholders and to update regularly. &lt;/p&gt;

&lt;p&gt;In this instance, complex ML (like an LSTM neural network) might need far more data and could potentially still struggle with holiday effects that a human can manually adjust in a simpler model.&lt;/p&gt;
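&lt;p&gt;To illustrate how little machinery a workable baseline needs, here is a seasonal-naive forecast with a crude trend adjustment in plain Python (a sketch, not any specific library’s method):&lt;/p&gt;

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Forecast each future period as the value from the same point in the
    last full season, shifted by the average season-over-season change."""
    if len(history) < 2 * season_length:
        raise ValueError("need at least two full seasons of history")
    last = history[-season_length:]
    prev = history[-2 * season_length:-season_length]
    # Average change between the last two seasons acts as a simple trend term.
    trend = sum(l - p for l, p in zip(last, prev)) / season_length
    return [last[h % season_length] + trend * (h // season_length + 1)
            for h in range(horizon)]
```

&lt;p&gt;A store manager can read this directly: “same week last season, plus the season-on-season trend” – exactly the transparency that makes simpler models easy to convey and adjust.&lt;/p&gt;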

&lt;ul&gt;
&lt;li&gt;Fraud detection: &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While AI (like deep learning) is used in fraud detection, a lot of fraud rules in banking and insurance are essentially mathematical thresholds and if-else logic derived from statistical analysis. &lt;/p&gt;

&lt;p&gt;For example, a UK bank might use a logistic regression (a statistical model) to weigh factors for credit card fraud – this might catch 90% of fraud cases with a straightforward formula. &lt;/p&gt;

&lt;p&gt;More complex ML might only marginally improve that, and could introduce false positives that are harder to debug. &lt;a href="https://www.latitudemedia.com/news/the-state-of-utility-ai-adoption-aggressive-incrementalism/#:~:text=For%20instance%2C%20Dominion%E2%80%99s%20synchrophasers%20collect,%E2%80%9D" rel="noopener noreferrer"&gt;One utility company executive noted&lt;/a&gt; regarding anomaly detection on the grid: “You don’t need AI to get the information you need… It’s basic signal processing, control theory, statistics, nothing really crazy.” The point is that in some engineering contexts, well-established statistical techniques (like control charts or spectral analysis) can detect anomalies in sensor data effectively, without ML.&lt;/p&gt;
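&lt;p&gt;A logistic scoring rule of this kind is just a weighted sum passed through a sigmoid. A minimal sketch – the features, weights, and threshold below are invented for illustration; a real model would fit them to labelled fraud data:&lt;/p&gt;

```python
import math

# Hypothetical coefficients, as might be fitted by a logistic regression.
WEIGHTS = {"amount_zscore": 1.8, "foreign_merchant": 1.2, "night_time": 0.6}
INTERCEPT = -4.0

def fraud_score(transaction):
    """Return a fraud probability in [0, 1] from a linear score."""
    z = INTERCEPT + sum(WEIGHTS[k] * transaction.get(k, 0.0) for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def flag(transaction, threshold=0.5):
    return fraud_score(transaction) >= threshold
```

&lt;p&gt;Every flag can be explained by pointing at the weighted factors – the debuggability a black-box alternative gives up.&lt;/p&gt;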

&lt;ul&gt;
&lt;li&gt;Inventory and supply chain optimisation: &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These often rely on operations research (linear programming, optimisation techniques) and statistical demand distributions. &lt;/p&gt;

&lt;p&gt;For example, a manufacturing organisation might improve its supply chain by using a linear programming model to optimise production schedules and inventory – essentially just mathematical equations. Attempts to use ML to dynamically “learn” the best schedule can be less effective than an OR model that is grounded in known constraints and costs. Similarly, inventory decisions often use formulas derived from statistical safety stock theory (such as demand variability multiplied by a service factor). These are not AI, but they work and are interpretable to planners.&lt;/p&gt;
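&lt;p&gt;The safety stock formula mentioned above is short enough to write out. A sketch, assuming the classic textbook form (the service factor z comes from the normal distribution for the target service level, e.g. roughly 1.65 for 95%):&lt;/p&gt;

```python
import math

def safety_stock(demand_std_per_period, lead_time_periods, service_factor):
    """Safety stock = z * sigma_d * sqrt(L): demand variability scaled by
    lead time, multiplied by the service-level factor z."""
    return service_factor * demand_std_per_period * math.sqrt(lead_time_periods)

def reorder_point(mean_demand_per_period, lead_time_periods, ss):
    """Reorder when stock falls to expected lead-time demand plus safety stock."""
    return mean_demand_per_period * lead_time_periods + ss
```

&lt;p&gt;A planner can sanity-check each term by eye, which is much of why these formulas remain in use.&lt;/p&gt;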

&lt;ul&gt;
&lt;li&gt;Customer segmentation and marketing: &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Often a simple RFM (Recency, Frequency, Monetary) analysis – a statistical scoring of customers – can segment customers for targeting just as well as a complex clustering algorithm. &lt;/p&gt;

&lt;p&gt;For example, in retail, organisations might look to use advanced clustering (k-means, etc.) on their customer base, but find that a few well-chosen features and thresholds can give segments that marketing managers understand and can act on (“high spend, lapsed 6 months” segment, etc.). Sometimes too much algorithmic complexity yields segments that are hard to label or understand, which hurts adoption by the business.&lt;/p&gt;
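&lt;p&gt;An RFM segmentation can be a few lines of thresholding. A minimal sketch with invented cut-offs (a real scheme would derive thresholds from quantiles of the customer base):&lt;/p&gt;

```python
def rfm_segment(recency_days, frequency, monetary):
    """Score each dimension 1-3 against fixed thresholds, then label the segment."""
    r = 3 if recency_days <= 30 else 2 if recency_days <= 180 else 1
    f = 3 if frequency >= 10 else 2 if frequency >= 3 else 1
    m = 3 if monetary >= 1000 else 2 if monetary >= 200 else 1
    if r == 3 and f >= 2 and m >= 2:
        return "loyal high-value"
    if r == 1 and m == 3:
        return "high spend, lapsed"
    return "standard"
```

&lt;p&gt;Each segment label maps directly to a rule a marketing manager can read, which is the adoption advantage over opaque clusters.&lt;/p&gt;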

&lt;ul&gt;
&lt;li&gt;Quality control: &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Basic statistical process control (SPC) charts, which date back decades, are still fundamental in factories to detect when a process is out of control. &lt;/p&gt;

&lt;p&gt;They rely on simple statistical rules (e.g., 3-sigma limits). While AI-based computer vision might inspect products for defects (advanced use case), the overall monitoring of process variation still heavily uses statistics.&lt;/p&gt;
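&lt;p&gt;The 3-sigma rule is easy to state as code. A sketch using the mean and standard deviation of an in-control baseline sample (real SPC charts typically estimate limits from rational subgroups):&lt;/p&gt;

```python
import statistics

def control_limits(baseline, n_sigma=3):
    """Lower/upper control limits from an in-control baseline sample."""
    mean = statistics.fmean(baseline)
    sd = statistics.stdev(baseline)
    return mean - n_sigma * sd, mean + n_sigma * sd

def out_of_control(measurements, baseline):
    """Indices of measurements falling outside the control limits."""
    lo, hi = control_limits(baseline)
    return [i for i, x in enumerate(measurements) if x < lo or x > hi]
```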

&lt;h2&gt;
  
  
  Why simpler models often suffice or excel:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Data volume &amp;amp; quality&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Many enterprise problems lack big data. A machine learning model often needs lots of data to outperform simpler models. If you only have, say, 3 years of monthly data (36 points) to forecast something, a deep learning model will likely struggle to beat a tuned exponential smoothing model. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Transparency and trust&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Linear regression or statistical models provide coefficients and clear relationships that stakeholders trust. In contrast, a black-box AI model might be met with scepticism by regulators or executives. &lt;/p&gt;

&lt;p&gt;For example, financial services firms often prefer “explainable” logistic regression models for credit risk due to regulatory expectations, even if a black-box AI could result in a slightly better prediction. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cost and speed &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Developing, testing, and deploying a complex AI solution can be resource-intensive. If a simpler analytic can achieve the business objective, it can be cost-effective and faster to implement. &lt;/p&gt;

&lt;p&gt;One might not need a full data science team to maintain a multiple regression model, whereas a neural network might require one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example 1 - Retail:&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;A supermarket chain was considering machine learning to forecast product demand in each store. After trials, the data science team found that a relatively basic method (seasonal decomposition and linear regression with events like holidays) predicted demand as accurately as a gradient boosted trees model, with the added advantage that store managers understood the factors (they could see “last year’s sales + trend + holiday uplift” etc.). They chose to implement the simpler model company-wide, and reserved AI efforts for other areas like optimising personalised offers. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example 2 - Energy:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An energy utility company implemented an AI system for predictive maintenance on turbines, but found that it was flagging too many false positives. They went back to a physics-based statistical model that utilised vibration sensor thresholds determined by engineers; while maybe slightly less “sensitive,” it produced alerts that field engineers trusted (because it correlated with known failure modes). The AI system was then repurposed to learn from the statistical model outputs, effectively working as a supplement rather than the primary driver.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example 3 - Finance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Banks often layer approaches, using business rules and simple models as the first line (fast, interpretable), and then a secondary AI model for the cases that slip through or for additional scoring. For example, one bank’s fraud workflow first applies a number of rules (like “Transaction far from home and high amount” triggers red flag) – those rules alone catch a majority of fraud. &lt;/p&gt;
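&lt;p&gt;That first line of rules can literally be a list of predicates. A minimal sketch (the rules themselves are invented examples):&lt;/p&gt;

```python
# Each rule is a (name, predicate) pair over a transaction dict.
RULES = [
    ("far from home, high amount",
     lambda t: t["distance_km"] > 500 and t["amount"] > 1000),
    ("rapid repeat transactions",
     lambda t: t["txns_last_hour"] >= 5),
]

def first_line_flags(transaction):
    """Return the names of every rule the transaction trips; an empty list
    means it passes through to the secondary (e.g. ML) scoring stage."""
    return [name for name, rule in RULES if rule(transaction)]
```

&lt;p&gt;Because each flag carries the name of the rule that fired, investigators can see immediately why a transaction was stopped.&lt;/p&gt;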

&lt;p&gt;The key takeaway is not to overlook the power of basic analytics. Approach incrementally: use basic methods first, prove value, then gradually layer on more complexity.&lt;/p&gt;

&lt;p&gt;Teams can start by getting the fundamentals right with maths and statistics, getting people used to data-driven decision making with interpretable methods, and then consider adding AI complexity where they see it adding value.&lt;/p&gt;

&lt;h2&gt;
  
  
  Knowing when you don’t need AI:
&lt;/h2&gt;

&lt;p&gt;Not every problem requires AI. Some questions to consider: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Can a set of straightforward rules or a formula solve this problem to an acceptable level? &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Do we fully understand the domain (if yes, a model based on that understanding may suffice; AI is more useful when patterns are too complex to articulate)? &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Is the additional accuracy from an AI model worth the loss of interpretability or increased maintenance? &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Often, the marginal gain is debatable. For example, in marketing, a simple uplift model might identify target customers for a campaign with 80% accuracy. A complex ML model might push that to 82%, but if it’s costly and people don’t trust it, the simpler approach might yield better overall results (because it gets implemented properly and acted upon).&lt;/p&gt;

&lt;h2&gt;
  
  
  Statistics as the backbone of AI:
&lt;/h2&gt;

&lt;p&gt;It’s also worth noting that AI/ML is fundamentally built on statistical principles. A neural network is effectively doing sophisticated (non-linear) statistics. &lt;/p&gt;

&lt;p&gt;Many solutions branded as AI might be solvable with simpler statistical models or even basic algebra. &lt;a href="https://medium.com/@julian.burns50/80-of-ai-projects-fail-and-yours-probably-will-too-but-that-is-ok-a90752795089" rel="noopener noreferrer"&gt;In some cases&lt;/a&gt;, organisations have realised that they could achieve the same outcomes with if-else logic or linear regression that they initially attempted with AI.&lt;/p&gt;

&lt;p&gt;This isn’t to dismiss AI – there are certainly problems where AI is necessary (image recognition, natural language processing, very high-dimensional patterns etc.). But in some cases, enterprise data is structured and aggregated, which can make it suitable for simpler methods.&lt;/p&gt;

&lt;p&gt;For example, in supply chain optimisation, linear programming (LP) and mixed-integer optimisation are tried-and-true techniques. Many scheduling, routing, and allocation problems are solved with these (or heuristic algorithms) rather than ML. There is a trend of “reinforcement learning” being applied to some operations problems, but these approaches can sometimes struggle to beat well-tuned OR algorithms, especially when constraints are hard (e.g. production capacities, shift schedules – which OR handles efficiently).&lt;/p&gt;

&lt;h2&gt;
  
  
  The role of domain knowledge:
&lt;/h2&gt;

&lt;p&gt;Domain knowledge also plays a role. Often, an experienced analyst or engineer can craft a simple model leveraging deep domain knowledge that outperforms a generic ML that doesn’t incorporate that knowledge. For example, an actuary might incorporate known mortality tables and trends to forecast insurance claims – a machine learning model starting from scratch would have to “rediscover” those well-known patterns with lots of data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion:
&lt;/h2&gt;

&lt;p&gt;When looking to solve these problems - start small. By doing so, organisations can build fundamental analytical skills and understanding. Simpler solutions can also be easier to deploy within existing data infrastructure and often easier to integrate into decision processes (people trust what they understand). This doesn’t mean avoiding AI – it means applying AI where it truly adds value that simpler analytics cannot.&lt;/p&gt;

&lt;p&gt;Richard Brown, Technical Director&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>statistics</category>
    </item>
  </channel>
</rss>
