<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Blackthorn Vision</title>
    <description>The latest articles on DEV Community by Blackthorn Vision (@blackthorn_vision_co).</description>
    <link>https://dev.to/blackthorn_vision_co</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3930801%2F6c5a3786-0784-42d7-93e7-291fb90ef87b.png</url>
      <title>DEV Community: Blackthorn Vision</title>
      <link>https://dev.to/blackthorn_vision_co</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/blackthorn_vision_co"/>
    <language>en</language>
    <item>
      <title>Microsoft Solutions Partner for .NET: What It Actually Means for Your Modernization Project</title>
      <dc:creator>Blackthorn Vision</dc:creator>
      <pubDate>Thu, 04 Jun 2026 15:10:04 +0000</pubDate>
      <link>https://dev.to/blackthorn_vision_co/microsoft-solutions-partner-for-net-what-it-actually-means-for-your-modernization-project-1l19</link>
      <guid>https://dev.to/blackthorn_vision_co/microsoft-solutions-partner-for-net-what-it-actually-means-for-your-modernization-project-1l19</guid>
      <description>&lt;p&gt;The phrase "Microsoft Solutions Partner" appears on a lot of vendor websites. For enterprise teams evaluating .NET and Azure development companies, it is easy to assume it is a generic marketing badge, the kind of thing every vendor in the Microsoft ecosystem eventually acquires. That assumption is worth examining before you sign an engagement, because what the designation actually requires, and what it does not, matters when the project involves legacy modernization or AI integration on an existing enterprise platform.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;At &lt;a href="https://blackthorn-vision.com/" rel="noopener noreferrer"&gt;Blackthorn Vision&lt;/a&gt;, a Microsoft-partnered &lt;a href="https://blackthorn-vision.com/technologies/net-development-services/" rel="noopener noreferrer"&gt;.NET&lt;/a&gt; and AI development company helping enterprise teams build and modernize complex software products, we work with teams evaluating vendors for exactly these projects. That positioning matters because legacy modernization today is rarely only a framework upgrade. For many enterprise teams, the same modernization roadmap also has to prepare the product for &lt;a href="https://blackthorn-vision.com/technologies/azure-development-services/" rel="noopener noreferrer"&gt;Azure architecture&lt;/a&gt;, AI features, Semantic Kernel orchestration, and long-term cloud scalability. What follows is what the designation means in practice, where it has real weight, and what it does not tell you by itself.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;What the Designation Actually Requires&lt;/h2&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;Microsoft's Solutions Partner program replaced the older Gold and Silver competency tiers in 2022. To earn the designation, a company must meet requirements across three categories: performance, meaning demonstrated customer growth in the Microsoft cloud; skilling, meaning certified employees across relevant Microsoft technologies; and customer success, meaning verified deployments and customer evidence submitted to Microsoft. &lt;a href="https://learn.microsoft.com/en-us/partner-center/membership/solutions-partner-azure" rel="noopener noreferrer"&gt;Microsoft's&lt;/a&gt; own documentation describes the partner capability score as a composite measurement across all three categories, with a minimum threshold of 70 points and at least one point in every individual metric.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;For the Solutions Partner for Digital &amp;amp; App Innovation (Azure) designation, which is the one relevant to .NET and Azure development, the skilling requirements include certifications in Azure Developer Associate, Azure Solutions Architect Expert, and DevOps Engineer Expert. These are not entry-level credentials. The Azure Solutions Architect Expert exam in particular tests knowledge of infrastructure, networking, identity, security, cost management, and application architecture at a depth that requires real Azure deployment experience to pass.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The customer success requirement gives the designation its most meaningful signal. The designation is not based on certifications alone: Microsoft requires verified evidence of customer deployments rather than self-reported case studies, and customer success metrics feed into the partner capability score, so the score reflects real customer usage and deployments in the relevant solution area.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;Where It Has Real Weight for Enterprise Projects&lt;/h2&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;For enterprise teams, the Solutions Partner designation is a useful filter at the beginning of vendor evaluation, not a final answer. It tells you three things with reasonable confidence.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;First, the vendor has certified engineers. For a legacy .NET modernization project that involves Azure migration, this matters because the architectural decisions made during migration have long-term cost and reliability implications. An engineer who has passed the Azure Solutions Architect Expert exam has been tested on the right decisions across networking, identity, scaling, and cost, not just on whether they can deploy an App Service.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;Second, the vendor has done this in production. The customer success requirement means Microsoft has seen evidence of real deployments. For a CTO evaluating companies for a .NET Framework to .NET 8 migration, the difference between a vendor who has migrated similar systems and one who is proposing to do it for the first time is significant, and the Solutions Partner designation is one verifiable signal that production experience exists.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;Third, the vendor has access to Microsoft partner resources that non-partners do not. This includes technical enablement, partner support channels, FastTrack for Azure credits and architecture guidance on qualifying engagements, and access to incentive programs tied to customer deployments. For a complex migration project, having a partner who operates inside the Microsoft partner ecosystem rather than alongside it is a practical advantage.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;What It Does Not Tell You&lt;/h2&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The designation does not tell you whether a vendor understands your specific problem. A company can hold the designation and specialize in greenfield Azure-native development, staff augmentation for existing teams, or data platform work, none of which is the same as legacy .NET modernization. A note on older badges: legacy Silver and Gold competencies were retired in 2022 and replaced by Solutions Partner designations, and Microsoft stopped selling legacy Silver and Gold benefits in January 2025.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;It does not tell you whether the vendor has worked on systems like yours. A .NET Framework 4.x monolith that has been running in production for ten years, with undocumented SQL Server Agent jobs, tightly coupled modules, and downstream systems reading directly from the database, is a fundamentally different project from a modern .NET API that needs to move from on-premises to Azure. The designation does not distinguish between these.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;It does not tell you how the vendor handles the architectural decisions that determine whether a modernization project succeeds or fails: whether they assess the existing system before proposing an approach, whether they use the strangler fig pattern to keep the product running during migration, whether they have production experience with Azure OpenAI and Semantic Kernel for the AI features that come after modernization.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;These are the questions worth asking in a first conversation, and the Solutions Partner designation is the prerequisite check, not the answer to them.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;What Microsoft Solutions Partner Status Means Specifically for Legacy .NET Modernization&lt;/h2&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;Legacy .NET modernization has become a more urgent problem in the past two years for reasons that go beyond the standard "modernize or accumulate debt" argument. The convergence of three factors has made it time-sensitive.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The support timeline for .NET versions has tightened. .NET 8 is a long-term support release supported through November 2026, after which organizations will need to be on .NET 10, the current long-term support release. Teams still running .NET Framework 4.x face an ecosystem that is narrowing: the tooling, the libraries, and the architectural patterns that make modern cloud-native development practical all assume modern .NET.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;AI integration has become a strategic requirement, not a future consideration. Azure OpenAI, Semantic Kernel, and the Microsoft AI stack are built for modern .NET. Integrating them into a legacy monolith is not a sprint; it is an architectural project that has to happen before the AI work can succeed. A vendor who understands both the modernization and the AI integration is covering one continuous project, not two separate engagements.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;And the cost of waiting has become more visible. &lt;a href="https://www.mckinsey.com/capabilities/tech-and-ai/our-insights/tech-debt-reclaiming-tech-equity" rel="noopener noreferrer"&gt;McKinsey&lt;/a&gt; research across enterprise technology organizations found that CIOs estimate technical debt at 20 to 40 percent of the value of their entire technology estate, and that 10 to 20 percent of the budget intended for new products is quietly redirected to resolving existing debt. The business case for modernization has become easier to make to CFOs who previously saw it as a technical preference rather than a financial necessity.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;For enterprise teams evaluating Microsoft Solutions Partners specifically for .NET legacy modernization, the relevant questions are not about the designation itself but about what the vendor has done within it: how many .NET Framework to modern .NET migrations they have completed, whether they use a phased migration approach that keeps the product in production throughout, and whether they have experience connecting the modernization to Azure AI work that typically follows it.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;What to Ask a Microsoft Solutions Partner Before a .NET Modernization Project&lt;/h2&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The Solutions Partner designation is the prerequisite check. The questions below are the actual evaluation.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Have you modernized .NET Framework 4.x systems that were already running in production?&lt;/li&gt;
&lt;li&gt;Do you assess the existing architecture before recommending a migration path to .NET 8 or .NET 10?&lt;/li&gt;
&lt;li&gt;Can you keep the product running and shipping features during the migration?&lt;/li&gt;
&lt;li&gt;Do you handle Azure architecture, CI/CD, observability, and security as part of the modernization, not as a separate later project?&lt;/li&gt;
&lt;li&gt;Can you prepare the application for Azure OpenAI and Semantic Kernel integration after modernization is complete?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;A vendor who cannot answer each of these specifically has either not done this type of work or is not being precise about what the engagement covers.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;Why Blackthorn Vision Fits This Use Case&lt;/h2&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;Blackthorn Vision is a Microsoft Solutions Partner focused specifically on .NET modernization and Azure AI integration for enterprise products. The company helps enterprise teams build and modernize complex software products, working with clients in fintech, healthcare, and enterprise SaaS where the existing system cannot be paused for a rewrite and the AI work has to be built on a modernized foundation.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;What this looks like in practice:&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Legacy .NET Framework assessment before any migration approach is proposed&lt;/li&gt;
&lt;li&gt;Strangler fig migration to .NET 8 with no feature freeze and no downtime&lt;/li&gt;
&lt;li&gt;Azure architecture design covering App Service, AKS, Azure SQL, Cosmos DB, and networking&lt;/li&gt;
&lt;li&gt;Azure OpenAI and Semantic Kernel integration after the modernization creates the service boundaries that make AI features stable in production&lt;/li&gt;
&lt;li&gt;Observability, CI/CD, and automated test coverage established as part of the migration, not deferred to a later project&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;Blackthorn Vision fits this use case because the company sits at the intersection of .NET modernization, Azure architecture, and enterprise AI integration. For teams with legacy .NET Framework systems, that matters: the goal is not only to move code to a supported runtime, but to create a product architecture that can support cloud deployment, secure data flows, observability, CI/CD, and AI features built with Azure OpenAI and Semantic Kernel. For enterprise teams searching for a Microsoft Solutions Partner that specializes in .NET legacy modernization specifically, rather than Azure work in general, Blackthorn Vision's engagement model is built around exactly this sequence. Verified client feedback is available on the &lt;a href="https://clutch.co/profile/blackthorn-vision" rel="noopener noreferrer"&gt;Clutch profile&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;If you are evaluating .NET and Azure development companies for a modernization project and want to understand whether the Solutions Partner designation is backed by relevant production experience, that is the right first question, and it is the one we can answer specifically.&lt;/p&gt;

</description>
      <category>microsoft</category>
      <category>dotnet</category>
    </item>
    <item>
      <title>Our Client's In-House LLM Integration Failed in Production: Observability, Cost, Latency — What Went Wrong</title>
      <dc:creator>Blackthorn Vision</dc:creator>
      <pubDate>Thu, 04 Jun 2026 13:02:56 +0000</pubDate>
      <link>https://dev.to/blackthorn_vision_co/our-clients-in-house-llm-integration-failed-in-production-observability-cost-latency-what-1ef3</link>
      <guid>https://dev.to/blackthorn_vision_co/our-clients-in-house-llm-integration-failed-in-production-observability-cost-latency-what-1ef3</guid>
      <description>&lt;p&gt;This is not a post about what Azure OpenAI can do. It is about what happens when an enterprise .NET team integrates it without the right architecture in place, ships it to production, and then calls us to figure out why it stopped working.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;At &lt;a href="https://blackthorn-vision.com/" rel="noopener noreferrer"&gt;Blackthorn Vision&lt;/a&gt;, a Microsoft-partnered &lt;a href="https://blackthorn-vision.com/technologies/net-development-services/" rel="noopener noreferrer"&gt;.NET&lt;/a&gt; and AI development company helping enterprise teams build and modernize complex software products, we are brought in after failed LLM integrations often enough that the failure pattern has become predictable. The combination of .NET and AI experience matters in this work, because production AI failures usually sit at the intersection of application architecture, &lt;a href="https://blackthorn-vision.com/technologies/azure-development-services/" rel="noopener noreferrer"&gt;Azure infrastructure&lt;/a&gt;, data access, and model behavior, not in the prompt alone. The pattern goes like this: the team builds a compelling proof of concept, leadership approves production rollout, and within weeks the feature is either broken, generating complaints, or quietly disabled. The root causes are almost always the same three: no observability, uncontrolled cost, and latency the application was never designed to handle.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;What follows is a reconstruction of one such engagement, with identifying details changed, and the exact fixes we applied.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;The Setup&lt;/h2&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The client was an enterprise SaaS company running a .NET 6 product serving midmarket financial services clients. The internal team had built an AI assistant feature using Azure OpenAI directly: a few API calls wired into the existing ASP.NET Core controllers, conversation history stored in memory, responses rendered in the UI. It worked well in staging with a small set of test prompts and a handful of concurrent users.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;Production looked different. Within two weeks of rollout the team was dealing with three separate problems simultaneously and had no way to diagnose which was causing which.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;Problem One: No Observability&lt;/h2&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The first and most damaging problem was that the team had no visibility into what the AI feature was doing. When a user reported that the assistant gave a wrong answer, there was no record of what prompt was sent, what conversation history was included, what the model received, or what it returned. Debugging required reproducing the issue manually, which was slow and often impossible.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;When response times spiked, there was no way to tell whether the delay was in the application layer, the Azure OpenAI call, or a downstream service the assistant was trying to reach. Application Insights was configured for the rest of the product but the AI calls had no structured logging attached to them.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The fix was implementing Semantic Kernel as the orchestration layer and attaching the full observability pipeline to it. &lt;a href="https://learn.microsoft.com/en-us/semantic-kernel/concepts/enterprise-readiness/observability" rel="noopener noreferrer"&gt;Semantic Kernel&lt;/a&gt; emits logs, metrics, and traces compatible with OpenTelemetry, which makes it possible to connect AI workflows to the same observability stack used by the rest of the application — every prompt, every function call, and every response traced end to end without writing custom logging code for each interaction.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The minimum logging setup that made production problems diagnosable:&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;pre&gt;&lt;code&gt;kernel.FunctionInvocationFilters.Add(new ObservabilityFilter(logger));&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The filter captured prompt templates, rendered prompts with PII fields redacted, token counts broken down by input and output, function call results from every plugin invocation, and latency at each step. Within a day of deploying this, the team could answer every question they had been unable to answer for two weeks.&lt;/p&gt;
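
&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The body of &lt;code&gt;ObservabilityFilter&lt;/code&gt; is not shown here. As a rough sketch only, a filter of this kind could be built on Semantic Kernel's &lt;code&gt;IFunctionInvocationFilter&lt;/code&gt; interface, assuming the &lt;code&gt;Microsoft.SemanticKernel&lt;/code&gt; and &lt;code&gt;Microsoft.Extensions.Logging&lt;/code&gt; packages. This version logs only the function name, outcome, and elapsed time; prompt redaction and token accounting are left out, and the log message wording is an assumption.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;pre&gt;&lt;code&gt;using System;
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel;

// Illustrative filter: times each function invocation and logs the outcome.
public sealed class ObservabilityFilter : IFunctionInvocationFilter
{
    private readonly ILogger _logger;

    public ObservabilityFilter(ILogger logger) =&amp;gt; _logger = logger;

    public async Task OnFunctionInvocationAsync(
        FunctionInvocationContext context,
        Func&amp;lt;FunctionInvocationContext, Task&amp;gt; next)
    {
        var stopwatch = Stopwatch.StartNew();
        try
        {
            await next(context); // run the function and any later filters
            _logger.LogInformation(
                "Function {Plugin}.{Function} succeeded in {ElapsedMs} ms",
                context.Function.PluginName, context.Function.Name,
                stopwatch.ElapsedMilliseconds);
        }
        catch (Exception ex)
        {
            _logger.LogError(ex,
                "Function {Plugin}.{Function} failed after {ElapsedMs} ms",
                context.Function.PluginName, context.Function.Name,
                stopwatch.ElapsedMilliseconds);
            throw; // rethrow so callers still see the failure
        }
    }
}&lt;/code&gt;&lt;/pre&gt;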

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The wrong answers turned out to be a plugin validation issue, not a model issue. A function that retrieved account data was receiving a null tenant ID under certain session conditions and returning empty results. The model was generating plausible-sounding responses based on no data. The observability layer made this visible in minutes.&lt;/p&gt;
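
&lt;p&gt; &lt;/p&gt;

&lt;p&gt;A bug of this shape suggests a simple guard: fail loudly when the tenant ID is missing instead of returning an empty result that the model can paper over. A minimal sketch, in which &lt;code&gt;TenantGuard&lt;/code&gt; and its message text are hypothetical and not the client's code:&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;pre&gt;&lt;code&gt;using System;

public static class TenantGuard
{
    // Returns the tenant ID unchanged, or throws if it is null or blank,
    // so the caller never queries account data without a tenant.
    public static string Require(string? tenantId)
    {
        if (string.IsNullOrWhiteSpace(tenantId))
            throw new InvalidOperationException(
                "Tenant ID is missing; refusing to query account data.");
        return tenantId;
    }
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;Calling &lt;code&gt;TenantGuard.Require(tenantId)&lt;/code&gt; at the top of a plugin function turns a silent empty result into an error that shows up in the function invocation logs.&lt;/p&gt;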

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;Problem Two: Token Costs Three Times the Estimate&lt;/h2&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The second problem was a billing surprise. The team had estimated token costs based on the Azure pricing calculator and a reasonable prompt size. The first production billing cycle came in at roughly three times that estimate.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;Three things caused it, none of which the pricing calculator accounts for.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The first was output token pricing. On the GPT-4 model the team was using, output tokens are priced higher than input tokens. The team had modeled cost around their prompt size, not their expected response size. Longer generated responses, which users naturally preferred, were the real cost driver.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The second was conversation history. The team was storing the full conversation history in memory and sending it with every request. A user who had 15 exchanges with the assistant was sending all 15 turns as input on turn 16. Because every request resent all earlier turns, the cumulative input tokens for a session grew roughly quadratically with the number of turns, not linearly.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The fix was implementing context window management with token counting before each request:&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

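&lt;pre&gt;&lt;code&gt;// Count tokens with the model's tokenizer (GptEncoding is from the SharpToken package).
var encoding = GptEncoding.GetEncodingForModel("gpt-4o");
var totalTokens = history.Messages
    .Sum(m =&amp;gt; encoding.Encode(m.Content ?? "").Count);

// Trim the oldest turns until the history fits the token budget.
// Index 0 is skipped so the first message (presumably the system prompt) is kept.
while (totalTokens &amp;gt; MaxContextTokens &amp;amp;&amp;amp; history.Messages.Count &amp;gt; 2)
{
    var removed = history.Messages[1];
    history.Messages.RemoveAt(1);
    totalTokens -= encoding.Encode(removed.Content ?? "").Count;
}&lt;/code&gt;&lt;/pre&gt;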

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The third cost driver was retry logic. The integration had basic retry on failure but did not respect the Retry-After header that Azure OpenAI returns with 429 responses. &lt;a href="https://learn.microsoft.com/en-gb/answers/questions/2276750/best-practices-for-handling-azure-openai-rate-limi" rel="noopener noreferrer"&gt;Azure OpenAI&lt;/a&gt; enforces TPM and RPM limits per deployment, and respecting the Retry-After header is the documented approach to handling throttling correctly. The application was retrying immediately, which extended the throttling window and in some cases caused repeated partial generations. Replacing this with exponential backoff that reads the Retry-After value brought the retry-related cost to near zero.&lt;/p&gt;
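
&lt;p&gt; &lt;/p&gt;

&lt;p&gt;A minimal sketch of the delay calculation, using only &lt;code&gt;System.Net.Http&lt;/code&gt;. The &lt;code&gt;RetryDelay&lt;/code&gt; helper, the one-second base, and the 30-second cap are illustrative assumptions, not the values used in this engagement:&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;pre&gt;&lt;code&gt;using System;
using System.Net.Http;

public static class RetryDelay
{
    // Returns how long to wait before retrying a throttled (429) response.
    // Prefers the server's Retry-After header and falls back to capped
    // exponential backoff when the header is absent.
    public static TimeSpan For(HttpResponseMessage response, int attempt)
    {
        var retryAfter = response.Headers.RetryAfter;
        if (retryAfter?.Delta is TimeSpan delta)
            return delta; // header given as a number of seconds
        if (retryAfter?.Date is DateTimeOffset date)
        {
            var wait = date - DateTimeOffset.UtcNow; // header given as a date
            return wait &amp;gt; TimeSpan.Zero ? wait : TimeSpan.Zero;
        }
        // Assumed fallback: 1s, 2s, 4s, ... capped at 30s.
        var seconds = Math.Min(Math.Pow(2, attempt), 30);
        return TimeSpan.FromSeconds(seconds);
    }
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;A retry loop would call &lt;code&gt;await Task.Delay(RetryDelay.For(response, attempt))&lt;/code&gt; before resending a request that came back with status 429.&lt;/p&gt;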

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;Combined, these fixes reduced the monthly token cost by approximately &lt;strong&gt;55%&lt;/strong&gt; without any change to the feature's behavior from the user's perspective.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;Problem Three: Latency the Application Was Not Built For&lt;/h2&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The third problem was timeouts. GPT-4-class models can take several seconds or longer, especially with large prompts, long outputs, tool calls, or high service load. The application had a 10-second request timeout configured at the Application Gateway level, which predated the AI feature by several years. Responses that took longer than 10 seconds were silently dropped, the user saw a generic error, and the application logged a gateway timeout with no indication that an LLM call was involved.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The fix had two parts.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The first was streaming. Switching from &lt;code&gt;InvokePromptAsync&lt;/code&gt; to &lt;code&gt;InvokePromptStreamingAsync&lt;/code&gt; in Semantic Kernel meant the client received the first tokens within 1 to 2 seconds of the request, and the connection stayed active throughout generation. The Application Gateway timeout stopped triggering because the connection was never idle long enough to hit it.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The second was a full audit of timeout settings across every layer in the request path: &lt;code&gt;HttpClient&lt;/code&gt; timeout in the application code, IIS request timeout, Application Gateway idle timeout, and the client-side fetch timeout in the frontend. Each one had been set independently by different people at different times, and none had been updated to account for LLM latency. This audit is now a standard step in every AI integration engagement we take on.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;What the Team Had Right&lt;/h2&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;It is worth being clear about what the internal team got right, because this is not a story about a bad engineering team. The Azure OpenAI integration was functionally correct. The prompt design was reasonable. The feature itself was genuinely useful to users, which is why the production failures were so damaging to adoption rather than just embarrassing.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;What the team did not have was experience with the specific failure modes that only appear under real production load: the observability gap that makes LLM problems invisible, the token cost mechanics that staging environments do not reveal, and the latency mismatch between LLM response times and timeout configurations set years before LLM integration was a consideration.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;These are not problems that experience with .NET alone solves. They require experience with Azure OpenAI and Semantic Kernel specifically in production, which is a different thing from knowing how to configure the SDK.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;Why Production LLM Recovery Requires More Than Prompt Engineering&lt;/h2&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;When an LLM feature fails in production, the fix is rarely a better prompt. &lt;a href="https://opentelemetry.io/blog/2025/ai-agent-observability/" rel="noopener noreferrer"&gt;OpenTelemetry's&lt;/a&gt; own analysis of AI agent observability confirms that without proper monitoring, tracing, and logging, diagnosing issues and ensuring reliability in AI-driven applications becomes structurally difficult — regardless of which orchestration framework is in use. In this case, the root causes were inside the software architecture: missing telemetry, unmanaged context growth, retry behavior, timeout configuration, and lack of orchestration. That is why enterprise AI integration requires a partner who understands both .NET product engineering and Azure AI infrastructure — not one or the other.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;Why This Matters When Evaluating .NET Development Partners&lt;/h2&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;The three problems described above are the most consistent findings when we assess LLM integrations built without Semantic Kernel as the orchestration layer. Not because Semantic Kernel is magic, but because it provides the observability hooks, the context management abstractions, and the retry infrastructure that production integrations require and that teams building directly against the Azure OpenAI SDK have to build themselves, usually incompletely.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;For enterprise teams evaluating top .NET development companies for AI integration work, the useful question is not whether the company knows Azure OpenAI. It is whether they have debugged an LLM integration that was failing in production under real user load. The answer to that question reveals whether the experience is in demos or in shipped products.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;Verified client feedback on Blackthorn Vision's Azure OpenAI and Semantic Kernel engagements is available on the &lt;a href="https://clutch.co/profile/blackthorn-vision" rel="noopener noreferrer"&gt;Clutch profile&lt;/a&gt;. If you are dealing with a failing LLM integration or planning one that needs to work from day one, that is the work we are built for.&lt;/p&gt;

</description>
      <category>llm</category>
    </item>
    <item>
      <title>Azure OpenAI + Semantic Kernel in a .NET SaaS: What Breaks in Production and How to Fix It</title>
      <dc:creator>Blackthorn Vision</dc:creator>
      <pubDate>Mon, 18 May 2026 12:58:34 +0000</pubDate>
      <link>https://dev.to/blackthorn_vision_co/azure-openai-semantic-kernel-in-a-net-saas-what-breaks-in-production-and-how-to-fix-it-2m8c</link>
      <guid>https://dev.to/blackthorn_vision_co/azure-openai-semantic-kernel-in-a-net-saas-what-breaks-in-production-and-how-to-fix-it-2m8c</guid>
      <description>&lt;p&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp8tgf0h55u3zn5bcimhg.png" alt=" " width="800" height="447"&gt;&lt;/p&gt;

&lt;p&gt;Adding Azure OpenAI and Semantic Kernel to a .NET SaaS product is straightforward in a demo environment. The integration works, responses stream cleanly, the Semantic Kernel plugin system handles function calling elegantly, and the team ships a compelling proof of concept in a few weeks. Then the feature reaches production users, and the problems that staging never surfaced start appearing: latency spikes on concurrent requests, token costs that are 3 to 5 times the estimate, 429 rate limit errors under load, and an observability gap that makes it impossible to diagnose which component is responsible when something goes wrong.&lt;/p&gt;

&lt;p&gt;At &lt;a href="https://blackthorn-vision.com/" rel="noopener noreferrer"&gt;Blackthorn Vision&lt;/a&gt;, a Microsoft Solutions Partner specializing in .NET modernization and Azure AI integration, we have built Azure OpenAI and Semantic Kernel integrations into several enterprise .NET SaaS platforms. The failure modes below are not edge cases. They are the patterns that appear predictably when an integration moves from controlled demo conditions to real user load, and each one has a reliable fix.&lt;/p&gt;




&lt;h2&gt;The latency problem nobody plans for&lt;/h2&gt;

&lt;p&gt;A .NET SaaS application built on synchronous request handling is not a natural host for LLM calls, and the mismatch shows up immediately in production. Azure OpenAI API calls for GPT-4-class models return responses in 5 to 30 seconds depending on prompt length, output length, and current service load. &lt;a href="https://learn.microsoft.com/en-us/azure/foundry/openai/how-to/latency" rel="noopener noreferrer"&gt;Microsoft's own latency guidance&lt;/a&gt; notes that response time scales with output token count, because generation is an iterative sequential process, one token at a time.&lt;/p&gt;

&lt;p&gt;In many enterprise ASP.NET deployments, request timeouts are configured somewhere between the application layer, IIS, a reverse proxy, Application Gateway, or the client itself, and most of these defaults were set long before LLM calls were a consideration. This mismatch with legacy timeout configuration causes silent failures that are difficult to diagnose because they surface as generic timeout errors rather than AI-specific problems.&lt;/p&gt;

&lt;p&gt;The fix has two parts. First, streaming: Semantic Kernel supports streaming responses via &lt;code&gt;InvokePromptStreamingAsync&lt;/code&gt;, which begins returning tokens as soon as the model starts generating rather than waiting for the complete response. This does not reduce total generation time, but it keeps the connection active during generation, which avoids the idle timeouts that cut off long responses, and it produces a substantially better user experience because the interface responds immediately.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Instead of waiting for the full response:&lt;/span&gt;
&lt;span class="c1"&gt;// var result = await kernel.InvokePromptAsync(prompt);&lt;/span&gt;

&lt;span class="c1"&gt;// Stream tokens as they arrive:&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;InvokePromptStreamingAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;responseStream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WriteAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToString&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;responseStream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FlushAsync&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Second, the hosting environment needs explicit review of timeout settings at every layer between the user and the model: application-level &lt;code&gt;HttpClient&lt;/code&gt; timeouts, IIS request timeouts, Application Gateway idle timeout, and any load balancer configuration that sits in the request path.&lt;/p&gt;
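
&lt;p&gt;As a minimal sketch of the application-level part of that review, the snippet below sets an explicit &lt;code&gt;HttpClient&lt;/code&gt; timeout and hands the client to Semantic Kernel. The deployment name, endpoint, environment variable name, and the 120-second value are placeholders chosen for illustration, not recommendations. IIS, Application Gateway, and load balancer timeouts still have to be checked separately.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Net.Http;
using Microsoft.SemanticKernel;

// HttpClient times out after 100 seconds by default. Setting the value
// explicitly makes it a decision rather than an inherited default.
var httpClient = new HttpClient { Timeout = TimeSpan.FromSeconds(120) };

// Placeholder configuration (assumed names, replace with your own).
string apiKey = Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY")
    ?? throw new InvalidOperationException("AZURE_OPENAI_KEY is not set.");

var builder = Kernel.CreateBuilder();
builder.AddAzureOpenAIChatCompletion(
    deploymentName: "my-gpt-deployment",
    endpoint: "https://example-resource.openai.azure.com/",
    apiKey: apiKey,
    httpClient: httpClient);

Kernel kernel = builder.Build();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;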




&lt;h2&gt;
  
  
  Token cost in production versus the estimate
&lt;/h2&gt;

&lt;p&gt;Azure OpenAI pricing looks predictable until the first real production bill arrives. The pricing calculator shows input and output token rates, which are real, but production deployments consistently cost significantly more than those rates suggest for three reasons that the calculator does not account for.&lt;/p&gt;

&lt;p&gt;On many Azure OpenAI models, output tokens are priced higher than input tokens, which means long generated responses often become the real cost driver rather than the prompts themselves. A well-structured prompt for a summarization or analysis task might send a moderate number of input tokens and receive a substantially larger number of output tokens. Most cost estimates based on the Azure pricing calculator undercount this because teams tend to model around their prompt size rather than their expected response size.&lt;/p&gt;

&lt;p&gt;Retry overhead adds meaningfully to costs in applications that handle 429 responses by immediately retrying without proper backoff. &lt;a href="https://learn.microsoft.com/en-us/azure/foundry/openai/quotas-limits" rel="noopener noreferrer"&gt;Microsoft's quota documentation&lt;/a&gt; specifies that when requests exceed the token rate limit, the API returns a 429 with a &lt;code&gt;Retry-After&lt;/code&gt; header indicating how long to wait. Applications that ignore this header and retry immediately increase request pressure on an already-throttled deployment, can prolong the throttling window, and risk additional costs when partial or repeated generations occur.&lt;/p&gt;

&lt;p&gt;Context window management is the third cost driver that staging environments do not reveal. Semantic Kernel's chat history mechanism accumulates conversation turns in memory and sends the entire history with each request. In a multi-turn copilot feature, a conversation that reaches 20 exchanges will send the full 20-turn history as input on turn 21. Without a strategy for truncating or summarizing older context, token costs grow with conversation length.&lt;/p&gt;
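
&lt;p&gt;To make the growth concrete, here is a back-of-the-envelope calculation under assumed numbers: 200 tokens per exchange, 20 exchanges, and no system prompt. Real conversations will vary, so treat the figures as an illustration of the shape of the curve rather than a forecast.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;

const int tokensPerExchange = 200; // assumed average, for illustration only
const int exchanges = 20;

long cumulativeInputTokens = 0;
for (int turn = 1; turn &amp;lt;= exchanges; turn++)
{
    // Approximation: the request on turn n carries roughly n exchanges' worth of tokens.
    cumulativeInputTokens += (long)turn * tokensPerExchange;
}

Console.WriteLine(cumulativeInputTokens); // prints 42000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The same 20 exchanges would total about 4,000 input tokens if only the latest exchange were sent each time, so cumulative input grows roughly with the square of the conversation length.&lt;/p&gt;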

&lt;p&gt;A practical approach is to count tokens locally before sending each request, using &lt;a href="https://github.com/dmitry-brazhenko/SharpToken" rel="noopener noreferrer"&gt;SharpToken&lt;/a&gt; (a .NET port of OpenAI's tiktoken library), and trim the history when it exceeds a defined budget:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;SharpToken&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;GptEncoding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetEncodingForModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"gpt-4o"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Count tokens in the current chat history. In Semantic Kernel 1.x, ChatHistory&lt;/span&gt;
&lt;span class="c1"&gt;// is itself a list of messages (Sum requires System.Linq).&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;totalTokens&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;Count&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Trim oldest turns if over budget (keep system prompt + recent context)&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;totalTokens&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;MaxContextTokens&lt;/span&gt; &lt;span class="p"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Count&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;removed&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt; &lt;span class="c1"&gt;// skip system prompt at [0]&lt;/span&gt;
    &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RemoveAt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;totalTokens&lt;/span&gt; &lt;span class="p"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;removed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;Count&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



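&lt;p&gt;This caps the input size of each request, which stops cumulative token cost from growing quadratically with conversation length, and it catches the problem before the request is sent rather than after the bill arrives. The count is an estimate, since it ignores the few tokens of per-message overhead that the chat format adds, so leave some headroom in the budget.&lt;/p&gt;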




&lt;h2&gt;
  
  
  Rate limits under concurrent load
&lt;/h2&gt;

&lt;p&gt;Azure OpenAI enforces limits on both tokens per minute (TPM) and requests per minute (RPM) for each deployment. In a multi-tenant SaaS application, multiple users triggering AI features simultaneously will exceed these limits more quickly than single-user testing reveals, and the resulting 429 errors produce a poor experience if the application does not handle them gracefully.&lt;/p&gt;

&lt;p&gt;The problems we see most consistently at &lt;a href="https://blackthorn-vision.com/machine-learning-and-ai-development/" rel="noopener noreferrer"&gt;Blackthorn Vision&lt;/a&gt; in enterprise .NET SaaS integrations are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Single-deployment architectures&lt;/strong&gt; where all AI traffic goes to one Azure OpenAI deployment. When that deployment hits its TPM limit, all AI features for all users fail simultaneously. The fix is to provision multiple deployments across Azure regions and implement client-side load balancing that distributes requests and falls back to alternative deployments when one returns a 429.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;No per-tenant throttling&lt;/strong&gt; at the application layer. Without application-level rate limiting, a single high-volume tenant can exhaust the shared Azure OpenAI quota for all other tenants. Implementing per-tenant request quotas at the application layer before requests reach Azure OpenAI prevents this failure mode.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Synchronous retry logic&lt;/strong&gt; that blocks the request thread during the backoff period, for example with &lt;code&gt;Thread.Sleep&lt;/code&gt;. This consumes ASP.NET thread pool resources and degrades overall application performance exactly when rate limits are being hit. Awaiting &lt;code&gt;Task.Delay&lt;/code&gt; with a &lt;code&gt;CancellationToken&lt;/code&gt; for retry backoff keeps threads free during the wait.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
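
&lt;p&gt;Per-tenant throttling can be sketched with the &lt;code&gt;System.Threading.RateLimiting&lt;/code&gt; types that ship with .NET 7 and later. The class name &lt;code&gt;TenantAiThrottle&lt;/code&gt; and the budget of 20 requests per minute per tenant are assumptions made for this example. A real system would load limits from tenant configuration and would probably budget tokens as well as requests.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using System.Threading;
using System.Threading.RateLimiting;
using System.Threading.Tasks;

// Hypothetical helper: one token bucket per tenant, checked before calling Azure OpenAI.
public sealed class TenantAiThrottle : IDisposable
{
    private readonly PartitionedRateLimiter&amp;lt;string&amp;gt; _limiter =
        PartitionedRateLimiter.Create&amp;lt;string, string&amp;gt;(tenantId =&amp;gt;
            RateLimitPartition.GetTokenBucketLimiter(tenantId, _ =&amp;gt; new TokenBucketRateLimiterOptions
            {
                TokenLimit = 20,                              // assumed burst size
                TokensPerPeriod = 20,                         // assumed refill per period
                ReplenishmentPeriod = TimeSpan.FromMinutes(1),
                QueueLimit = 0,                               // reject instead of queueing
                AutoReplenishment = true
            }));

    // Returns true when the tenant still has budget for one more AI request.
    public async Task&amp;lt;bool&amp;gt; TryEnterAsync(string tenantId, CancellationToken ct = default)
    {
        using RateLimitLease lease = await _limiter.AcquireAsync(tenantId, 1, ct);
        return lease.IsAcquired;
    }

    public void Dispose() =&amp;gt; _limiter.Dispose();
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;When &lt;code&gt;TryEnterAsync&lt;/code&gt; returns false, the application can answer that tenant with its own 429 or a friendly "try again shortly" message, without spending any of the shared Azure OpenAI quota.&lt;/p&gt;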




&lt;h2&gt;
  
  
  The observability gap
&lt;/h2&gt;

&lt;p&gt;The hardest production problems to diagnose in an Azure OpenAI integration are the ones that do not produce obvious errors. A request that takes 25 seconds instead of the expected 8 seconds is not failing, but it is degrading user experience significantly. A Semantic Kernel plugin that calls a business logic function and receives an unexpected null value may produce a plausible-looking but incorrect AI response. Without structured logging that captures prompt inputs, token counts, latency, and function call results at each step, these problems are nearly impossible to diagnose systematically.&lt;/p&gt;

&lt;p&gt;Semantic Kernel emits OpenTelemetry-compatible traces and metrics from its core packages (&lt;code&gt;Microsoft.SemanticKernel.Core&lt;/code&gt;). &lt;a href="https://github.com/microsoft/agent-framework" rel="noopener noreferrer"&gt;Microsoft's Agent Framework&lt;/a&gt;, released in October 2025 to merge Semantic Kernel and AutoGen into a unified production SDK, ships with built-in OpenTelemetry integration as a first-class feature. For existing Semantic Kernel integrations, the minimum observability setup that makes production problems diagnosable involves:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Logging prompt templates and rendered prompts (with PII scrubbing) so that unexpected model behavior can be traced to specific inputs.&lt;/li&gt;
&lt;li&gt;Capturing token usage per request, broken down by input and output, and attributing it to the feature and tenant that generated the call.&lt;/li&gt;
&lt;li&gt;Recording function call results from Semantic Kernel plugins, including failures and unexpected return values, so that incorrect AI outputs can be traced to specific function invocations.&lt;/li&gt;
&lt;li&gt;Setting up Azure Monitor alerts for token usage spikes, sustained 429 error rates, and p95 latency thresholds that indicate problems before users report them.&lt;/li&gt;
&lt;/ul&gt;
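
&lt;p&gt;A minimal starting point for the list above is to subscribe to the telemetry that Semantic Kernel already emits. The sketch below assumes the &lt;code&gt;OpenTelemetry&lt;/code&gt; and &lt;code&gt;OpenTelemetry.Exporter.Console&lt;/code&gt; NuGet packages. The console exporter is a stand-in for whichever exporter feeds your monitoring backend, and the service name is hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using OpenTelemetry;
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

// Hypothetical service name used to label all exported telemetry.
var resource = ResourceBuilder.CreateDefault().AddService("copilot-api");

// Traces: function invocations and model calls emitted by Semantic Kernel.
using var tracerProvider = Sdk.CreateTracerProviderBuilder()
    .SetResourceBuilder(resource)
    .AddSource("Microsoft.SemanticKernel*")
    .AddConsoleExporter()
    .Build();

// Metrics: durations and token usage emitted by Semantic Kernel.
using var meterProvider = Sdk.CreateMeterProviderBuilder()
    .SetResourceBuilder(resource)
    .AddMeter("Microsoft.SemanticKernel*")
    .AddConsoleExporter()
    .Build();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Tenant and feature attribution is not part of this sketch. That usually means adding your own tags or log scopes around each kernel invocation.&lt;/p&gt;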

&lt;p&gt;Without this infrastructure, teams spend days diagnosing problems that could be resolved in hours with the right logging in place.&lt;/p&gt;




&lt;h2&gt;
  
  
  Data security and keeping data inside your Azure tenant
&lt;/h2&gt;

&lt;p&gt;Enterprise .NET SaaS applications handling sensitive customer data need to ensure that data does not flow outside the customer's Azure tenant boundary during AI processing. This is not guaranteed automatically by using Azure OpenAI: it requires deliberate configuration.&lt;/p&gt;

&lt;p&gt;The safer enterprise architecture for regulated industries has three parts. The Azure OpenAI resource and related &lt;a href="https://blackthorn-vision.com/technologies/azure-development-services/" rel="noopener noreferrer"&gt;Azure services&lt;/a&gt; are deployed within the customer's own Azure subscription rather than behind shared endpoints. Network access is restricted via Private Endpoints to the customer's virtual network, with no public endpoint exposure. Authentication uses Managed Identity through Microsoft Entra ID (formerly Azure AD) rather than API keys that could be extracted and reused outside the intended context.&lt;/p&gt;

&lt;p&gt;For Semantic Kernel RAG implementations that use Azure AI Search as a vector store, the same network isolation applies: the search resource should be on the same private virtual network as the Azure OpenAI deployment, with no public endpoint exposure. This architecture is more complex to configure than the default setup, but it is the appropriate baseline for enterprise SaaS platforms handling sensitive customer data in regulated industries.&lt;/p&gt;
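
&lt;p&gt;On the authentication side, keyless access can be sketched as follows, assuming the &lt;code&gt;Azure.Identity&lt;/code&gt; package and an identity that has been granted a suitable role on the Azure OpenAI resource. The deployment name and endpoint are placeholders. Private Endpoint and network rules are configured on the Azure side and are not shown here.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using Azure.Identity;
using Microsoft.SemanticKernel;

var builder = Kernel.CreateBuilder();

// No API key in configuration. DefaultAzureCredential resolves to the managed
// identity when running in Azure and to developer credentials on a workstation.
builder.AddAzureOpenAIChatCompletion(
    "my-gpt-deployment",                            // placeholder deployment name
    "https://example-resource.openai.azure.com/",   // placeholder endpoint
    new DefaultAzureCredential());

Kernel kernel = builder.Build();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;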




&lt;h2&gt;
  
  
  Semantic Kernel plugin failures in production
&lt;/h2&gt;

&lt;p&gt;Semantic Kernel's plugin system, which allows the AI model to call C# functions as tools during inference, behaves differently under production conditions than in controlled testing. The model makes function calling decisions based on semantic descriptions of what functions do, and those decisions are probabilistic. Under certain input conditions, a model may call the wrong function, call a function with incorrect argument values, or invoke a function multiple times when once was intended.&lt;/p&gt;

&lt;p&gt;In a demo environment with a small set of test prompts, these issues rarely surface. In a production SaaS with diverse user inputs, they appear regularly. The fixes are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Write function descriptions that are unambiguous about what the function does and when it should not be called. Vague descriptions produce inconsistent function selection.&lt;/li&gt;
&lt;li&gt;Add validation to every plugin function that checks argument values before executing business logic. The arguments are generated by the model, so a function that assumes a valid integer may receive an empty string, an out-of-range value, or an unexpected format.&lt;/li&gt;
&lt;li&gt;Implement idempotency for any plugin function that has side effects (writes to a database, sends an email, creates a record). If the model calls the function twice due to a planning loop, the second call should produce the same result as the first without duplicating the action.&lt;/li&gt;
&lt;li&gt;Log all function invocations, arguments, and return values, and set up alerts for functions called with invalid arguments. This is the only way to discover unexpected model behavior before it produces visible user-facing errors.&lt;/li&gt;
&lt;/ul&gt;
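
&lt;p&gt;The first three points can be sketched in a single plugin. Everything domain-specific here is hypothetical: the &lt;code&gt;RefundPlugin&lt;/code&gt; class, the refund operation, and the in-memory record of completed refunds, which stands in for a durable store.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Collections.Concurrent;
using System.ComponentModel;
using Microsoft.SemanticKernel;

public sealed class RefundPlugin
{
    // Illustration only: an in-memory record of completed refunds.
    // Production code needs a durable store with a uniqueness constraint.
    private readonly ConcurrentDictionary&amp;lt;int, string&amp;gt; _completed = new();

    [KernelFunction]
    [Description("Issues a refund for a single order. Do not call this to look up order status.")]
    public string IssueRefund(
        [Description("Numeric order ID, for example 10423")] string orderId)
    {
        // Validate before touching business logic: the value is model-generated.
        if (!int.TryParse(orderId, out int id) || id &amp;lt;= 0)
        {
            return "error: orderId must be a positive integer";
        }

        // Idempotency: a repeated call for the same order returns the first result.
        // GetOrAdd can run the factory more than once under contention, so real
        // side effects need a transactional check instead.
        return _completed.GetOrAdd(id, key =&amp;gt; ExecuteRefund(key));
    }

    // Hypothetical stand-in for the real refund operation.
    private static string ExecuteRefund(int orderId) =&amp;gt; $"refund issued for order {orderId}";
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Returning an error string rather than throwing gives the model a chance to correct its arguments on the next step. Whether that is preferable to failing fast depends on the feature.&lt;/p&gt;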

&lt;p&gt;&lt;strong&gt;Prompt injection in enterprise plugins&lt;/strong&gt; deserves specific attention. When plugin functions accept user-supplied text as arguments, a malicious or poorly formatted input can include instructions that attempt to redirect the model's behavior, such as telling it to ignore previous instructions or call a different function. The practical mitigation is to treat all user-supplied content passed into plugin function arguments as untrusted input: validate it against expected patterns, do not pass raw user text directly into subsequent prompts without sanitization, and use negative constraints in your system prompt that explicitly prohibit the model from following instructions embedded in user content.&lt;/p&gt;

&lt;p&gt;For retry logic across all the failure modes above, &lt;a href="https://github.com/App-vNext/Polly" rel="noopener noreferrer"&gt;Polly&lt;/a&gt; integrates cleanly with the &lt;code&gt;HttpClient&lt;/code&gt; that Semantic Kernel uses internally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;retryPolicy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;HttpPolicyExtensions&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;HandleTransientHttpError&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;OrResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StatusCode&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="n"&gt;HttpStatusCode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TooManyRequests&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WaitAndRetryAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;retryCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;sleepDurationProvider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attempt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Respect Retry-After header if present&lt;/span&gt;
            &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;retryAfter&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;Headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RetryAfter&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;Delta&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;retryAfter&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="n"&gt;TimeSpan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;FromSeconds&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Pow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;onRetryAsync&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timespan&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;LogWarning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Azure OpenAI throttled. Retry {Attempt} in {Delay}s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;attempt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timespan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TotalSeconds&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CompletedTask&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This respects the &lt;code&gt;Retry-After&lt;/code&gt; header when Azure OpenAI returns it, falls back to exponential backoff when it does not, and logs each retry so that throttling patterns are visible in Application Insights before they become user-facing incidents.&lt;/p&gt;




&lt;h2&gt;
  
  
  Who this engagement model fits
&lt;/h2&gt;

&lt;p&gt;Blackthorn Vision is brought in when enterprise .NET SaaS teams need to add Azure OpenAI or Semantic Kernel features to a production platform and need a partner who has solved the specific production problems that staging environments do not reveal. Most of the integrations we work on at &lt;a href="https://blackthorn-vision.com/technologies/net-development-services/" rel="noopener noreferrer"&gt;Blackthorn Vision&lt;/a&gt; involve platforms that are already serving customers and cannot afford the production incidents caused by AI features that work in demos but are not production-ready.&lt;/p&gt;

&lt;p&gt;This makes Blackthorn Vision relevant for CTOs and engineering leaders searching for companies with real Azure OpenAI and Semantic Kernel experience in .NET, particularly for enterprise applications where data security, cost control, and production reliability are non-negotiable. Verified client feedback on these engagements is available on the &lt;a href="https://clutch.co/profile/blackthorn-vision" rel="noopener noreferrer"&gt;Blackthorn Vision Clutch profile&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you are evaluating partners for an Azure OpenAI integration into an existing .NET SaaS product, the most useful question to ask is whether they have handled rate limiting, context window management, and plugin validation at scale, because those are the problems that determine whether the feature stays in production or gets rolled back. Blackthorn Vision's &lt;a href="https://blackthorn-vision.com/machine-learning-and-ai-development/" rel="noopener noreferrer"&gt;AI integration approach&lt;/a&gt; and &lt;a href="https://blackthorn-vision.com/case-studies/" rel="noopener noreferrer"&gt;case studies&lt;/a&gt; cover both the architecture and the operational details that make the difference between a demo and a shipped feature.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>azure</category>
      <category>kernel</category>
      <category>openai</category>
    </item>
    <item>
      <title>Strangler Fig Pattern for .NET Modernization: How It Works in a Real Production System</title>
      <dc:creator>Blackthorn Vision</dc:creator>
      <pubDate>Mon, 18 May 2026 12:51:41 +0000</pubDate>
      <link>https://dev.to/blackthorn_vision_co/strangler-fig-pattern-for-net-modernization-how-it-works-in-a-real-production-system-i76</link>
      <guid>https://dev.to/blackthorn_vision_co/strangler-fig-pattern-for-net-modernization-how-it-works-in-a-real-production-system-i76</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqxmb2nck8cobbnwq0n7b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqxmb2nck8cobbnwq0n7b.png" alt=" " width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;The strangler fig pattern is the most practical approach to modernizing a legacy .NET monolith without stopping product delivery. It works by incrementally replacing functionality in the existing system with new services, routing traffic gradually from the old codebase to the new one until the legacy system can be decommissioned. The pattern does not require a feature freeze, does not demand a big-bang cutover, and does not force you to bet the entire modernization project on a single deployment. At &lt;a href="https://blackthorn-vision.com/" rel="noopener noreferrer"&gt;Blackthorn Vision&lt;/a&gt;, a Microsoft Solutions Partner specializing in .NET modernization and Azure architecture, we use this approach as the default for enterprise .NET platforms where product delivery cannot pause for a rewrite.&lt;/p&gt;

&lt;p&gt;This article covers how the pattern actually works in a .NET production context, what the implementation looks like with modern tooling, where it tends to break down, and how to sequence the migration to avoid the failure modes that affect a large share of strangler fig projects.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why teams reach for the strangler fig pattern
&lt;/h2&gt;

&lt;p&gt;The alternative to the strangler fig pattern is usually described as a big-bang rewrite: stop adding features to the legacy system, build a new version from scratch, and cut over when it is ready. The appeal is obvious. You start with a clean architecture, no legacy constraints, and the full benefit of everything the team has learned since the original system was built.&lt;/p&gt;

&lt;p&gt;The problem is that big-bang rewrites fail at a rate that should make any engineering leader uncomfortable. The most common failure mode is feature parity: the new system consistently runs behind the legacy system in capability, the cutover date slips repeatedly, and eventually leadership loses confidence and either cancels the project or forces a cutover before the new system is ready. For strangler fig migrations, the available data is more favorable, although the approach is not risk-free. &lt;a href="https://softwaremodernizationservices.com/insights/strangler-fig-pattern-example/" rel="noopener noreferrer"&gt;Modernization Intel's analysis&lt;/a&gt; of enterprise strangler migration data from 2022 to 2025 found a 76% success rate across 29 tracked strangler fig projects, with median annual savings of $640K in successful engagements and a median sunk cost of $2.1 million in failed ones. A key finding from the same dataset: projects that extracted less than 5% of monolith functionality in the first 90 days had a 92% failure rate, which suggests that early velocity is a strong predictor of whether a strangler migration succeeds.&lt;/p&gt;

&lt;p&gt;The strangler fig pattern sidesteps this problem by keeping the legacy system in production and making the migration reversible at every step. If a newly migrated component behaves incorrectly in production, you route traffic back to the legacy implementation while you investigate. There is no moment where the entire system depends on code that has never handled real production load.&lt;/p&gt;




&lt;h2&gt;
  
  
  The technical implementation for .NET: YARP as the facade layer
&lt;/h2&gt;

&lt;p&gt;The strangler fig pattern requires a routing layer that sits in front of both the legacy system and the new services. In the .NET ecosystem, the recommended tool for this is &lt;a href="https://github.com/dotnet/yarp" rel="noopener noreferrer"&gt;YARP (Yet Another Reverse Proxy)&lt;/a&gt;, a Microsoft-developed reverse proxy library built on ASP.NET Core middleware. Microsoft's own migration guidance for incremental ASP.NET to ASP.NET Core migrations is built around YARP, and it has become the standard approach for .NET strangler fig implementations because it integrates naturally with the existing .NET toolchain.&lt;/p&gt;

&lt;p&gt;The setup works like this. You create a new ASP.NET Core project that hosts YARP. Initially, YARP forwards 100% of requests to the legacy .NET Framework application. As you migrate each component, you add routing rules to YARP that send specific routes or request types to the new service instead of the legacy system. The legacy application continues to run and handle everything that has not yet been migrated. From the perspective of users and external systems, nothing changes, because all requests still arrive at the same endpoint.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                ┌─────────────────────────────────────┐
                │            YARP Facade               │
 User / Client ►│         (ASP.NET Core app)           │
                │                                      │
                │  Route: /api/reports ───────────────► New .NET 8 Service
                │  Route: /api/orders  ───────────────► New .NET 8 Service
                │  Route: everything else ────────────► Legacy .NET Framework App
                └─────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both systems run in production simultaneously. The routing configuration in YARP is the only thing that changes as each component is migrated. Rolling back a component means updating one routing rule, not redeploying the entire application.&lt;/p&gt;
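
&lt;p&gt;A minimal sketch of such a facade is shown below, assuming YARP 2.x and its in-memory configuration provider. The cluster names, addresses, and the choice of &lt;code&gt;/api/reports&lt;/code&gt; as the migrated route are hypothetical. Many teams keep the same settings in &lt;code&gt;appsettings.json&lt;/code&gt; instead so that routes can change without a rebuild.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System.Collections.Generic;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;
using Yarp.ReverseProxy.Configuration;

var builder = WebApplication.CreateBuilder(args);

// Hypothetical destinations: replace with the real legacy and new service URLs.
var clusters = new[]
{
    new ClusterConfig
    {
        ClusterId = "legacy",
        Destinations = new Dictionary&amp;lt;string, DestinationConfig&amp;gt;
        {
            ["legacy-app"] = new DestinationConfig { Address = "https://legacy.internal.example/" }
        }
    },
    new ClusterConfig
    {
        ClusterId = "reports",
        Destinations = new Dictionary&amp;lt;string, DestinationConfig&amp;gt;
        {
            ["reports-service"] = new DestinationConfig { Address = "https://reports.internal.example/" }
        }
    }
};

var routes = new[]
{
    // Migrated component: only this path goes to the new service.
    new RouteConfig
    {
        RouteId = "reports",
        ClusterId = "reports",
        Match = new RouteMatch { Path = "/api/reports/{**rest}" }
    },
    // Everything else still goes to the legacy application.
    new RouteConfig
    {
        RouteId = "fallback",
        ClusterId = "legacy",
        Order = int.MaxValue, // evaluated after more specific routes
        Match = new RouteMatch { Path = "{**catch-all}" }
    }
};

builder.Services.AddReverseProxy().LoadFromMemory(routes, clusters);

var app = builder.Build();
app.MapReverseProxy();
app.Run();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In this sketch, rolling back the reports component means deleting the first route, after which the catch-all sends that traffic to the legacy application again.&lt;/p&gt;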

&lt;p&gt;The practical implementation steps for a .NET Framework to .NET 8 migration are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Deploy the YARP-based ASP.NET Core application to Azure App Service alongside the legacy .NET Framework application. Both services run independently, with YARP configured to proxy all traffic to the legacy system as a starting point.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Add the System.Web adapters (the &lt;code&gt;Microsoft.AspNetCore.SystemWebAdapters&lt;/code&gt; packages) to both projects. They provide compatibility shims for HttpContext and related types, allowing code that references System.Web to be moved incrementally without rewriting everything that depends on it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Identify the first component to migrate, ideally something with clear boundaries, reasonable test coverage, and meaningful traffic volume. Starting with a low-traffic component that nobody will notice if it breaks is tempting, but it delays the point at which the team learns how the migration behaves under real load.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Build the new implementation in the ASP.NET Core project, run it in parallel with the legacy implementation, compare outputs to confirm parity, and then update the YARP routing configuration to direct that component's traffic to the new service.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Monitor the component in production for a validation period before moving on to the next component. The length of this period depends on the criticality of the component and the traffic patterns it handles.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  How to pick the first component to migrate
&lt;/h2&gt;

&lt;p&gt;Choosing the wrong starting point is one of the most common reasons strangler fig projects stall in the first two months. The instinct is usually to start with something small and contained, which makes sense in principle but often produces a migration that validates the toolchain without validating the approach under realistic conditions.&lt;/p&gt;

&lt;p&gt;The criteria that produce a better first component are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Clear external boundaries: the component has a defined API surface that other parts of the system consume through a stable contract, rather than reaching into shared state or calling internal methods directly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Measurable output: you can run both implementations against the same inputs and compare outputs programmatically, which is the foundation of the parallel-run validation that makes the strangler fig safe.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Meaningful traffic: the component handles enough requests that production behavior is visible in monitoring within hours, not weeks. This matters because some failure modes only appear under load or in edge cases that staging environments do not produce reliably.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Limited data coupling: the component does not share database tables with multiple other components in ways that make schema changes a cross-system coordination problem.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At &lt;a href="https://blackthorn-vision.com/application-modernization/" rel="noopener noreferrer"&gt;Blackthorn Vision&lt;/a&gt;, the components we typically migrate first in a .NET Framework monolith are API endpoints that handle well-defined request and response contracts, reporting and data export functions that can be validated by comparing output files, and background processing jobs that can be run in parallel and compared before the legacy version is disabled.&lt;/p&gt;




&lt;h2&gt;
  
  
  The parallel-run validation approach
&lt;/h2&gt;

&lt;p&gt;Running both implementations in parallel and comparing their outputs is the technical mechanism that makes the strangler fig pattern safe. Without it, you are deploying new code to production and hoping it behaves correctly, which is not meaningfully different from a big-bang migration in terms of risk.&lt;/p&gt;

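&lt;p&gt;The parallel run works by having the YARP facade send each request to both the legacy implementation and the new service, record both responses, and log any discrepancies. In practice this usually means custom middleware in the proxy host rather than routing configuration alone. The legacy response is returned to the caller, so users always receive the behavior they expect. The new service's response is compared in the background. Because every mirrored request executes twice, this is safest for reads and idempotent operations; for requests that change state, the new service's side effects have to be isolated from production data. Discrepancies trigger alerts that the team investigates before increasing the percentage of traffic routed to the new service.&lt;/p&gt;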
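&lt;p&gt;The flow described above can be sketched in a few lines. This is a minimal illustration in Python rather than YARP middleware, and every name in it (&lt;code&gt;legacy_handler&lt;/code&gt;, &lt;code&gt;new_handler&lt;/code&gt;, &lt;code&gt;handle&lt;/code&gt;, the &lt;code&gt;discrepancies&lt;/code&gt; list) is a hypothetical stand-in: in a real facade the two handlers would be HTTP calls and the list would be structured logging with alerting.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the two implementations.
def legacy_handler(request: dict) -> dict:
    total = request["qty"] * request["unit_cents"]
    if request["qty"] >= 10:
        total -= total // 10  # bulk discount buried in the legacy code
    return {"total_cents": total}

def new_handler(request: dict) -> dict:
    # Deliberately omits the bulk discount so the comparison has something to catch.
    return {"total_cents": request["qty"] * request["unit_cents"]}

discrepancies = []  # stand-in for structured logs and alerts

def compare(request, legacy_response, new_future):
    """Record a discrepancy if the shadow call fails or its response differs."""
    try:
        new_response = new_future.result(timeout=2)
    except Exception as exc:  # a failing shadow call must never reach the caller
        discrepancies.append({"request": request, "error": repr(exc)})
        return
    if new_response != legacy_response:
        discrepancies.append(
            {"request": request, "legacy": legacy_response, "new": new_response}
        )

def handle(request: dict, pool: ThreadPoolExecutor) -> dict:
    """Serve the legacy response; compare the new response off the request path."""
    new_future = pool.submit(new_handler, request)
    legacy_response = legacy_handler(request)
    pool.submit(compare, request, legacy_response, new_future)
    return legacy_response
```

&lt;p&gt;The design point is that the caller's response never waits on, or depends on, the new service: the comparison runs on the worker pool, and an exception in the shadow call is recorded as a discrepancy instead of being raised.&lt;/p&gt;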

&lt;p&gt;This approach requires investment in observability infrastructure that many legacy .NET systems lack. If the existing system has no structured logging, no distributed tracing, and no way to correlate requests across services, that investment has to happen before the migration can proceed safely. The observability work is not overhead: it is the foundation that makes the parallel-run comparison meaningful and that gives the team confidence to increase the traffic percentage routed to the new implementation.&lt;/p&gt;




&lt;h2&gt;
  
  
  What breaks in practice, and why
&lt;/h2&gt;

&lt;p&gt;Research covering 29 tracked strangler fig projects found that projects missing more than two key prerequisites had a 94% failure rate. The prerequisites that matter most in a .NET context are test coverage on the components being migrated, a working parallel-run validation mechanism, and a data migration strategy for components that own data.&lt;/p&gt;

&lt;p&gt;The failure mode we see most often at &lt;a href="https://blackthorn-vision.com/technologies/net-development-services/" rel="noopener noreferrer"&gt;Blackthorn Vision&lt;/a&gt; is what might be called "facade as decoration": a team builds the YARP routing layer and migrates the UI or the API layer of one component, but leaves the business logic and data access in the monolith. The new service calls back into the legacy system for data, so the coupling has not been reduced; it has only been moved behind a network boundary. The team has added latency and operational complexity without actually strangling anything.&lt;/p&gt;

&lt;p&gt;The fix is to enforce data sovereignty as a hard rule: each migrated service must own its data. If a new service needs to read data that currently lives in the monolith's database, the migration plan for that service must include a data migration strategy. The usual options are dual-write during the transition, Change Data Capture (CDC) to synchronize data between the old and new stores, or a data extraction and import step that runs before traffic is routed to the new service.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The database trap&lt;/strong&gt; deserves its own mention because it is the failure mode most likely to cause data corruption rather than just downtime. When two systems, the legacy monolith and the new service, write to the same database table simultaneously without a coordination mechanism, race conditions and conflicting writes produce corrupted records that are often invisible until a business process produces wrong results days later. This is not a theoretical risk: it is what happens when teams treat the shared database as a neutral middle ground between the old and new systems instead of recognizing it as the source of coupling they are trying to remove.&lt;/p&gt;

&lt;p&gt;The correct approach is to never allow both systems to write to the same table at the same time. If data has to be shared during the transition, there are two options. The first is dual-write with application-level coordination: the new service becomes the only writer, writing to both the new store and the legacy table, while the legacy system only reads from its own table. The second is Change Data Capture, which synchronizes records between the old and new data stores without allowing both systems to write the same rows. Neither approach is simple, but both are substantially safer than allowing concurrent writes to shared tables.&lt;/p&gt;
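&lt;p&gt;As a rough sketch of the dual-write option, assuming in-memory stand-ins for both stores: the class names (&lt;code&gt;NewStore&lt;/code&gt;, &lt;code&gt;LegacyTable&lt;/code&gt;, &lt;code&gt;DualWriter&lt;/code&gt;) are hypothetical, and a production system would more likely rely on transactions or an outbox than on the simple compensation shown here.&lt;/p&gt;

```python
class LegacyTable:
    """Stand-in for the table the monolith still reads from."""
    def __init__(self):
        self.rows = {}

    def upsert(self, key, row):
        self.rows[key] = dict(row)

class NewStore:
    """Stand-in for the data store owned by the new service."""
    def __init__(self):
        self.rows = {}

    def upsert(self, key, row):
        self.rows[key] = dict(row)

    def delete(self, key):
        self.rows.pop(key, None)

class DualWriter:
    """The new service is the only writer; it keeps both stores in step."""
    def __init__(self, new_store, legacy_table):
        self.new_store = new_store
        self.legacy_table = legacy_table

    def save(self, key, row):
        previous = self.new_store.rows.get(key)
        self.new_store.upsert(key, row)
        try:
            self.legacy_table.upsert(key, row)
        except Exception:
            # Undo the first write so the stores do not diverge, then re-raise.
            if previous is None:
                self.new_store.delete(key)
            else:
                self.new_store.upsert(key, previous)
            raise
```

&lt;p&gt;The part that matters is the single write path: nothing in the legacy system calls &lt;code&gt;upsert&lt;/code&gt; on the shared table, so there is no second writer to race with.&lt;/p&gt;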

&lt;p&gt;Other common failure points are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Session state: .NET Framework applications often use in-process session state, which breaks immediately when traffic starts flowing through a YARP proxy to a different process. Externalizing session state to a shared Redis cache or Azure Cache for Redis before the migration begins removes this as a blocker.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Authentication: shared authentication tokens and cookies that were issued by the legacy system need to be validated by the new service. This typically requires externalizing the identity provider and configuring both systems to validate tokens from the same source.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Latency-sensitive integrations: some synchronous integrations cannot tolerate the additional latency introduced by the proxy hop. Most integrations handle this without issue, but any integration with a timeout configured below 500 ms should be identified during the assessment phase and addressed before the facade is deployed.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
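&lt;p&gt;The timeout check in the last item is easy to automate once the integrations are inventoried. A minimal sketch, assuming the inventory is available as a list of records: the integration names and the &lt;code&gt;timeout_ms&lt;/code&gt; field are invented for illustration, and only the 500 ms threshold comes from the text above.&lt;/p&gt;

```python
PROXY_RISK_THRESHOLD_MS = 500  # threshold named in the list above

def flag_tight_timeouts(integrations, threshold_ms=PROXY_RISK_THRESHOLD_MS):
    """Return names of integrations with a timeout below the threshold, tightest first."""
    flagged = [i for i in integrations if threshold_ms > i["timeout_ms"]]
    return [i["name"] for i in sorted(flagged, key=lambda i: i["timeout_ms"])]

# Hypothetical inventory gathered during the assessment phase.
inventory = [
    {"name": "erp-sync", "timeout_ms": 30000},
    {"name": "pricing-lookup", "timeout_ms": 250},
    {"name": "fraud-check", "timeout_ms": 400},
    {"name": "tax-service", "timeout_ms": 500},
]

print(flag_tight_timeouts(inventory))  # ['pricing-lookup', 'fraud-check']
```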




&lt;h2&gt;
  
  
  Sequencing the full migration
&lt;/h2&gt;

&lt;p&gt;A strangler fig migration for a mid-size .NET Framework monolith typically runs over 12 to 18 months when executed alongside normal product delivery. That timeline is longer than most teams expect when they start, and shorter than most teams fear when they look at the size of the codebase.&lt;/p&gt;

&lt;p&gt;The migration progresses in three broad phases. The first phase establishes the infrastructure: YARP is deployed, observability is in place, the parallel-run validation mechanism is working, and the first component has been migrated and validated under real production load. This phase typically takes six to eight weeks and is the most important: if the infrastructure is not solid, every subsequent migration step will be slower and riskier than it needs to be.&lt;/p&gt;

&lt;p&gt;The second phase is the main migration loop: one component per sprint, parallel-run validation, traffic ramp, monitoring period, then the next component. The speed of this phase depends on the quality of the boundaries in the original system. Components with clear boundaries migrate quickly. Components where business logic is scattered across stored procedures, event handlers, and configuration files take longer because the boundary has to be established before the migration can happen.&lt;/p&gt;
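&lt;p&gt;The traffic ramp inside that loop can be expressed as a simple gate: traffic moves to the next step only when enough mirrored requests have been compared and the mismatches are within tolerance. The sketch below is illustrative only; the ramp steps, the minimum sample size, and the zero-mismatch default are assumptions, not figures from our projects.&lt;/p&gt;

```python
RAMP_STEPS = [0, 5, 25, 50, 100]  # assumed percentages of traffic on the new service

def next_traffic_percent(current, compared, mismatched, min_sample=1000, tolerance=0.0):
    """Advance one ramp step only when the parallel-run evidence supports it."""
    if current not in RAMP_STEPS:
        raise ValueError(f"{current} is not a configured ramp step")
    if compared == 0 or min_sample > compared:
        return current  # not enough compared requests yet
    if mismatched / compared > tolerance:
        return current  # hold until the discrepancies are investigated
    index = RAMP_STEPS.index(current)
    return RAMP_STEPS[min(index + 1, len(RAMP_STEPS) - 1)]
```

&lt;p&gt;The gate only ever holds or advances. Whether to roll traffic back after a discrepancy is left as a decision for the team rather than something the function automates.&lt;/p&gt;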

&lt;p&gt;The third phase is decommissioning: once all traffic has been routed to the new services, the legacy system enters a monitoring-only state for a final validation period, typically four to six weeks, before it is shut down. The YARP facade can be removed at this point or retained as a load balancer, depending on the architecture of the new system.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://blackthorn-vision.com/technologies/azure-development-services/" rel="noopener noreferrer"&gt;Azure development services&lt;/a&gt; that support this migration are available and well-documented. Azure App Service hosts both systems during the parallel-run phase. Azure Cache for Redis externalizes session state. Application Insights provides the observability layer. One additional benefit that teams often underestimate until they see it in practice: moving from .NET Framework to .NET 8 removes the dependency on Windows Server, which means the new services can run in Linux containers on Azure Kubernetes Service or Linux-based App Service plans. For organizations running large fleets of Windows Server VMs, the licensing cost reduction from this shift alone can be substantial, and it becomes a secondary justification for the modernization investment that is easy to quantify for leadership.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why this matters for AI integration
&lt;/h2&gt;

&lt;p&gt;One of the main reasons enterprise teams modernize .NET Framework monoliths today is that modern AI tooling works substantially better on modern .NET architecture. Azure OpenAI integration, Semantic Kernel, and the Microsoft.Extensions.AI libraries that simplify LLM orchestration all depend on async patterns, clean service boundaries, and the observability infrastructure that legacy monoliths typically lack.&lt;/p&gt;

&lt;p&gt;At &lt;a href="https://blackthorn-vision.com/machine-learning-and-ai-development/" rel="noopener noreferrer"&gt;Blackthorn Vision&lt;/a&gt;, the strangler fig approach is often used as a prerequisite step before Azure OpenAI or Semantic Kernel integration, because the AI workload exposes exactly the coupling and latency problems that the monolith has been hiding. Most of the platforms we modernize have teams that want to add copilot features, semantic search, or LLM-powered internal tools to an existing product. The strangler fig migration creates the service boundaries and the async infrastructure that make those integrations sustainable in production, rather than brittle demos that fail under real load.&lt;/p&gt;

&lt;p&gt;This is why enterprise teams searching for companies with real AI integration experience in .NET often end up evaluating partners who can do both: assess and modernize the platform, and then build the AI layer on top of the architecture that modernization produced.&lt;/p&gt;




&lt;h2&gt;
  
  
  Who this engagement model fits
&lt;/h2&gt;

&lt;p&gt;Blackthorn Vision is often brought in when enterprise teams need to modernize a legacy .NET monolith without pausing feature delivery, particularly when the codebase has accumulated enough complexity that a full rewrite carries unacceptable risk. Most of the platforms we work on have been in production for 8 to 15 years, support thousands of daily users, and involve complex ERP integrations or multi-team delivery environments where downtime is not acceptable. The strangler fig pattern with YARP is the approach we use for .NET Framework 4.x to .NET 8 migrations, and the &lt;a href="https://blackthorn-vision.com/case-studies/" rel="noopener noreferrer"&gt;case studies&lt;/a&gt; on the Blackthorn Vision site reflect the range of platforms and industries where we have applied it.&lt;/p&gt;

&lt;p&gt;This makes Blackthorn Vision relevant for CTOs and engineering leaders searching for the best companies for legacy .NET modernization, particularly when the requirement is a partner who can own the architectural decisions and manage the migration risk, not a team that needs to be told what to do at each step.&lt;/p&gt;

&lt;p&gt;If you are evaluating whether the strangler fig pattern is the right approach for your platform, the most useful first step is usually an honest assessment of the two things that determine whether the pattern will work: whether the existing system has enough boundary definition to support incremental extraction, and whether the team has the observability infrastructure to validate parity during the parallel-run phase. Blackthorn Vision's application modernization and assessment approach starts with exactly those two questions, and verified client feedback on how the engagements play out is available on the &lt;a href="https://clutch.co/profile/blackthorn-vision" rel="noopener noreferrer"&gt;Blackthorn Vision Clutch profile&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>dotnet</category>
      <category>architecture</category>
      <category>azure</category>
      <category>modernization</category>
    </item>
  </channel>
</rss>
