<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dee.Bee</title>
    <description>The latest articles on DEV Community by Dee.Bee (@dee_bee).</description>
    <link>https://dev.to/dee_bee</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F467756%2Fd78a2faf-76e9-4be3-bbba-fe2285b27b0a.png</url>
      <title>DEV Community: Dee.Bee</title>
      <link>https://dev.to/dee_bee</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dee_bee"/>
    <language>en</language>
    <item>
      <title>Right Model, Right Time: Why Model Routing Is Becoming Core to GenAI Platforms</title>
      <dc:creator>Dee.Bee</dc:creator>
      <pubDate>Thu, 14 May 2026 09:52:18 +0000</pubDate>
      <link>https://dev.to/dee_bee/why-a-single-ai-model-is-no-longer-enough-2bce</link>
      <guid>https://dev.to/dee_bee/why-a-single-ai-model-is-no-longer-enough-2bce</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/dee_bee/right-model-right-time-why-model-routing-is-becoming-core-to-genai-platforms-9oo" class="crayons-story__hidden-navigation-link"&gt;Right Model, Right Time: Why Model Routing Is Becoming Core to GenAI Platforms&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/dee_bee" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F467756%2Fd78a2faf-76e9-4be3-bbba-fe2285b27b0a.png" alt="dee_bee profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/dee_bee" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Dee.Bee
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Dee.Bee
                
              
              &lt;div id="story-author-preview-content-3641308" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/dee_bee" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F467756%2Fd78a2faf-76e9-4be3-bbba-fe2285b27b0a.png" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Dee.Bee&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/dee_bee/right-model-right-time-why-model-routing-is-becoming-core-to-genai-platforms-9oo" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;May 14&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/dee_bee/right-model-right-time-why-model-routing-is-becoming-core-to-genai-platforms-9oo" id="article-link-3641308"&gt;
          Right Model, Right Time: Why Model Routing Is Becoming Core to GenAI Platforms
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/llm"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;llm&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/architecture"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;architecture&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/microsoft"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;microsoft&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
            &lt;a href="https://dev.to/dee_bee/right-model-right-time-why-model-routing-is-becoming-core-to-genai-platforms-9oo#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              2&lt;span class="hidden s:inline"&gt; comments&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            3 min read
          &lt;/small&gt;
            
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
      <category>ai</category>
      <category>azure</category>
      <category>openrouter</category>
      <category>modelrouter</category>
    </item>
    <item>
      <title>Right Model, Right Time: Why Model Routing Is Becoming Core to GenAI Platforms</title>
      <dc:creator>Dee.Bee</dc:creator>
      <pubDate>Thu, 14 May 2026 09:48:30 +0000</pubDate>
      <link>https://dev.to/dee_bee/right-model-right-time-why-model-routing-is-becoming-core-to-genai-platforms-9oo</link>
      <guid>https://dev.to/dee_bee/right-model-right-time-why-model-routing-is-becoming-core-to-genai-platforms-9oo</guid>
      <description>&lt;h2&gt;
  
  
  Why a single AI model is no longer enough
&lt;/h2&gt;

&lt;p&gt;If you’re building AI-powered applications today, you’ve probably faced this problem already:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Some prompts are trivial, but they still hit an expensive model
&lt;/li&gt;
&lt;li&gt;Others are complex and fail badly when routed to a cheaper one
&lt;/li&gt;
&lt;li&gt;Latency, cost, and quality constantly pull in different directions
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Using one model for every prompt is increasingly inefficient, especially as new models with very different strengths are released every few weeks.&lt;/p&gt;

&lt;p&gt;This is where &lt;strong&gt;model routing&lt;/strong&gt; comes in.&lt;/p&gt;




&lt;h2&gt;
  
  
  The hospital triage analogy
&lt;/h2&gt;

&lt;p&gt;Consider a large hospital where patients arrive all day with very different problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A sore throat
&lt;/li&gt;
&lt;li&gt;Chest pain
&lt;/li&gt;
&lt;li&gt;Sudden vision issues like floaters
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Patients do not reliably know where to go. Some go straight to a consultant, some underplay their symptoms, and others bounce between departments, losing valuable time in the process.&lt;/p&gt;

&lt;p&gt;To handle this situation, hospitals rely on a &lt;strong&gt;triage lead&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The triage lead does not treat patients; they simply make sure each patient is directed to the right department, based on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Complexity of the case&lt;/strong&gt; – &lt;em&gt;simple symptom vs. unclear combination of issues&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Urgency / latency of the case&lt;/strong&gt; – &lt;em&gt;how quickly the case needs attention&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost and resource use&lt;/strong&gt; – &lt;em&gt;does this really require top‑tier expertise now?&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Expected performance / accuracy&lt;/strong&gt; – &lt;em&gt;does this require a specialist or advanced diagnosis?&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What is a model router?
&lt;/h2&gt;

&lt;p&gt;A model router plays the role of that triage lead. It routes each prompt, in real time, to the most suitable model from a pool of available models, rather than sending every query to a single model.&lt;/p&gt;

&lt;p&gt;Simple prompts can be handled by smaller, faster, and cheaper models, while more complex reasoning can be routed to more capable models automatically.&lt;/p&gt;
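&lt;p&gt;As a rough illustration of the idea, the sketch below routes a prompt using a crude complexity heuristic. The model names, the keyword list, and the 50-word threshold are all assumptions made up for this example; production routers typically rely on a trained model rather than hand-written rules.&lt;/p&gt;

```python
# Hypothetical model tiers, ordered from cheapest to most capable.
MODEL_TIERS = ["small-fast-model", "mid-tier-model", "large-reasoning-model"]

# Illustrative keywords that hint at multi-step reasoning.
# This list is an assumption for the example, not a real router's rules.
REASONING_HINTS = ("prove", "step by step", "analyse", "debug", "trade-off")


def complexity_score(prompt):
    """Rough score: one point per full 50 words plus one per reasoning hint found."""
    text = prompt.lower()
    length_points = len(text.split()) // 50
    hint_points = sum(1 for hint in REASONING_HINTS if hint in text)
    return length_points + hint_points


def route(prompt):
    """Pick a tier by clamping the complexity score to the available tiers."""
    tier = min(complexity_score(prompt), len(MODEL_TIERS) - 1)
    return MODEL_TIERS[tier]
```

&lt;p&gt;With these assumptions, a short factual question stays on the cheapest tier, while a prompt asking to debug something step by step is sent to the most capable one.&lt;/p&gt;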




&lt;h2&gt;
  
  
  How does a model router work?
&lt;/h2&gt;

&lt;p&gt;The obvious next question is: &lt;strong&gt;how does a model router actually work?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;New models are released rapidly, and benchmark datasets already exist to compare them. No single model is likely to perform best across every benchmark, especially when benchmarks measure very different capabilities.&lt;/p&gt;

&lt;p&gt;As an AI developer or architect, the real interest is knowing &lt;strong&gt;which model performs best for a specific task or use case&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;One approach is to train the router on benchmark datasets so that it learns the relationship between prompt types and model strengths, as explored in this paper:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://doi.org/10.48550/arXiv.2309.15789" rel="noopener noreferrer"&gt;Large Language Model Routing with Benchmark Datasets&lt;/a&gt;&lt;/p&gt;
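&lt;p&gt;To make the idea concrete, here is a deliberately simplified sketch that is far cruder than the approach in the linked paper: it computes per-category accuracy from benchmark records and routes each category to the model that scored best. The records, category names, and model names are invented placeholders, not real benchmark results.&lt;/p&gt;

```python
from collections import defaultdict

# Made-up benchmark records for illustration only:
# (task category, model name, answered correctly).
BENCHMARK_RECORDS = [
    ("math", "model-a", True), ("math", "model-a", True),
    ("math", "model-b", False), ("math", "model-b", True),
    ("summarisation", "model-a", False), ("summarisation", "model-a", True),
    ("summarisation", "model-b", True), ("summarisation", "model-b", True),
]


def fit_router(records):
    """Learn the best model per category from per-category benchmark accuracy."""
    outcomes = defaultdict(list)
    for category, model, correct in records:
        outcomes[(category, model)].append(1.0 if correct else 0.0)

    accuracy = defaultdict(dict)
    for (category, model), results in outcomes.items():
        accuracy[category][model] = sum(results) / len(results)

    # For each category, keep the model with the highest observed accuracy.
    return {cat: max(scores, key=scores.get) for cat, scores in accuracy.items()}


ROUTER = fit_router(BENCHMARK_RECORDS)
```

&lt;p&gt;A real router would also need to infer the category, or a richer representation, from the prompt itself; this sketch assumes the category is already known.&lt;/p&gt;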




&lt;h2&gt;
  
  
  Model routing in cloud platforms
&lt;/h2&gt;

&lt;p&gt;Cloud providers already offer out-of-the-box model routing capabilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Azure AI Foundry – &lt;a href="https://learn.microsoft.com/en-us/azure/foundry/openai/concepts/model-router" rel="noopener noreferrer"&gt;Model Router&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Amazon Bedrock – &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-routing.html" rel="noopener noreferrer"&gt;Intelligent Prompt Routing&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;According to Microsoft documentation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;We train the router on a large, diverse dataset spanning hundreds of thousands of examples across many domains. These include question answering, code generation, mathematical reasoning. Summarization, conversations, and agentic workflows are also covered. We continuously expand the training data to keep pace with new models and capabilities.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://github.com/MicrosoftDocs/azure-ai-docs/blob/main/articles/foundry/openai/concepts/model-router-how-it-works.md#model-overview" rel="noopener noreferrer"&gt;Model Router – How it works&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Open ecosystem support
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://openrouter.ai" rel="noopener noreferrer"&gt;OpenRouter&lt;/a&gt; also provides similar capabilities. It allows AI engineers to optimise model usage based on specific needs such as cost, quality, or speed, while automatically maintaining fallback strategies.&lt;/p&gt;
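&lt;p&gt;The fallback idea can be sketched generically as follows. This is not OpenRouter's API: &lt;code&gt;complete_with_fallback&lt;/code&gt;, &lt;code&gt;fake_call&lt;/code&gt;, and the model names are hypothetical stand-ins that only show the pattern of trying models in order until one succeeds.&lt;/p&gt;

```python
def complete_with_fallback(prompt, models, call_model):
    """Try each model in order and return the first successful (model, reply) pair."""
    errors = []
    for model in models:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def fake_call(model, prompt):
    """Stand-in for a real API call; the primary model is simulated as unavailable."""
    if model == "primary-model":
        raise TimeoutError("simulated timeout")
    return f"[{model}] reply to: {prompt}"
```

&lt;p&gt;Calling &lt;code&gt;complete_with_fallback("hi", ["primary-model", "backup-model"], fake_call)&lt;/code&gt; skips the failing primary model and returns the backup model's reply.&lt;/p&gt;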




&lt;h2&gt;
  
  
  Benefits of using a model router
&lt;/h2&gt;

&lt;p&gt;A model router inspects each incoming prompt and routes it to the most suitable model. Smaller, less expensive models are used whenever they are sufficient for the task.&lt;/p&gt;

&lt;p&gt;This leads to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lower inference costs
&lt;/li&gt;
&lt;li&gt;Reduced latency
&lt;/li&gt;
&lt;li&gt;More efficient and sustainable compute usage
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The following Microsoft community blog shares the cost benefits achieved using model routing in Azure AI Foundry:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://techcommunity.microsoft.com/blog/azuredevcommunityblog/optimising-ai-costs-with-microsoft-foundry-model-router/4494776" rel="noopener noreferrer"&gt;Optimising AI costs with Microsoft Foundry Model Router&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Measurable cost savings across all modes:&lt;br&gt;&lt;br&gt;
4.5% in Balanced, 4.7% in Cost, and 14.2% in Quality mode.  &lt;/p&gt;

&lt;p&gt;Quality mode saved the most by routing simple prompts to faster, cheaper models while still directing complex requests to more capable models.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What’s next?
&lt;/h2&gt;

&lt;p&gt;In the next article, I’ll walk step by step through setting up a model router in &lt;strong&gt;Microsoft Foundry&lt;/strong&gt;, covering model selection strategies, routing behaviour, and practical considerations for production systems.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>architecture</category>
      <category>microsoft</category>
    </item>
    <item>
      <title>Will ESB gradually die... in the era of microservices?</title>
      <dc:creator>Dee.Bee</dc:creator>
      <pubDate>Mon, 08 Apr 2024 10:36:13 +0000</pubDate>
      <link>https://dev.to/dee_bee/will-esb-gradually-diein-era-of-microservices-567i</link>
      <guid>https://dev.to/dee_bee/will-esb-gradually-diein-era-of-microservices-567i</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1hp6z4xpq278bzgbi0r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1hp6z4xpq278bzgbi0r.png" alt=" " width="800" height="711"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Traditional integration aimed to solve the problem of data exchange between isolated applications. An Enterprise Service Bus (ESB) was a common tool used in this approach. It served as a centralized hub, allowing different applications to access and exchange data. However, this approach had its limitations, such as complexity in managing the ESB, lack of real-time data exchange, and difficulties in integrating with modern, cloud-based applications. This led to the evolution of new integration approaches like API-led connectivity and microservices architecture. These newer methods offer more flexibility, scalability, and efficiency in integrating diverse applications.&lt;/p&gt;

&lt;p&gt;In the modern era, applications are moving away from monolithic architectures towards more flexible and scalable microservices architectures. These architectures break down application functionality into small, independent services, eliminating the need for a centralized data transfer point like an ESB. Instead, services communicate with each other in a decentralized manner, allowing for elastic scalability - the ability to easily add or remove service functionality as needed.&lt;/p&gt;

&lt;p&gt;This architectural shift aligns well with agile development practices. Agile development breaks down the application development process into short, iterative sprints, each focused on delivering a complete set of functionality for a specific set of tasks. This approach naturally complements a microservices architecture, as microservices are small, discrete, and have clearly defined functions and service boundaries. This combination allows for rapid, iterative development and deployment, and facilitates continuous integration and continuous delivery (CI/CD) practices.&lt;/p&gt;

&lt;p&gt;In conclusion, the shift towards microservices and agile development is a response to the need for more flexible, scalable, and efficient application development and deployment in today’s fast-paced digital world.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Traditional Integration:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Imagine a world where different software applications are like isolated islands. Each island has its own data, rules, and way of doing things.&lt;/li&gt;
&lt;li&gt;Traditional integration aimed to connect these islands. It was like building bridges or tunnels between them so they could share information.&lt;/li&gt;
&lt;li&gt;One common approach was using an Enterprise Service Bus (ESB). Think of the ESB as a central hub where data from different applications could meet and chat.&lt;/li&gt;
&lt;li&gt;However, this often led to big, complex applications with tightly woven connections. It was like gluing puzzle pieces together to make a giant picture.&lt;/li&gt;
&lt;li&gt;Integration was seen as part of the infrastructure—the behind-the-scenes plumbing that made everything work.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Modern Applications and Microservices:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fast forward to today. We’re moving away from those giant puzzle pieces.&lt;/li&gt;
&lt;li&gt;Instead of building monolithic skyscrapers, we’re creating smaller, independent buildings called microservices.&lt;/li&gt;
&lt;li&gt;Each microservice has a specific job, like a tiny superhero with a unique power. They don’t need a central hub; they can talk directly to each other.&lt;/li&gt;
&lt;li&gt;Imagine a city where these microservices are scattered around. They communicate flexibly, like neighbours borrowing sugar from each other.&lt;/li&gt;
&lt;li&gt;Agile development fits right in. It’s like building one room at a time, adding features step by step. Each sprint is a mini construction project.&lt;/li&gt;
&lt;li&gt;Microservices play well with this approach because they’re small, focused, and have clear boundaries. It’s like having separate rooms for cooking, sleeping, and playing.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>integration</category>
      <category>microservices</category>
      <category>api</category>
      <category>esb</category>
    </item>
  </channel>
</rss>
