<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Andrew Dugan</title>
    <description>The latest articles on DEV Community by Andrew Dugan (@andrew_d).</description>
    <link>https://dev.to/andrew_d</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3802433%2F9f67dbc3-15c2-4e32-b5d2-ced4340bacf3.png</url>
      <title>DEV Community: Andrew Dugan</title>
      <link>https://dev.to/andrew_d</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/andrew_d"/>
    <language>en</language>
    <item>
      <title>Tutorial: This AI Now Tells You if a Meeting Could Be an Email</title>
      <dc:creator>Andrew Dugan</dc:creator>
      <pubDate>Thu, 21 May 2026 16:00:00 +0000</pubDate>
      <link>https://dev.to/digitalocean/tutorial-this-ai-now-tells-you-if-a-meeting-could-be-an-email-2m3f</link>
      <guid>https://dev.to/digitalocean/tutorial-this-ai-now-tells-you-if-a-meeting-could-be-an-email-2m3f</guid>
      <description>&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;DigitalOcean's Inference Router semantically routes prompts to the most appropriate model based on custom instructions. The setup process is 'point-and-click', with no hardcoded "if/else" logic required.&lt;/li&gt;
&lt;li&gt;The router is built directly into the inference pipeline. Users can make inference requests normally, and the router automatically handles the workflow.&lt;/li&gt;
&lt;li&gt;In our workflow, it determines the nature of the task and routes the request to a cheaper, faster model to write an email or a larger, more advanced model to write a meeting agenda. This architecture can scale beyond meetings and can be used for support tickets, code reviews, legal documents, and more.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Think back to the last time you received a calendar invite with no agenda, 12 attendees, and a title that says "Quick Sync". We've all held or attended meetings that "could have been an email". What if a gentle nudge were built straight into your workflow, one that only leads you into a meeting when the task requires it? Instead of defaulting to a meeting, you describe the task that needs to be addressed, and you immediately get either an email ready to send or a meeting agenda ready to attach to your calendar invite. Emails and meeting agendas also require different levels of depth and consideration, so they are best written by different &lt;a href="https://www.digitalocean.com/resources/articles/large-language-models" rel="noopener noreferrer"&gt;LLMs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We've built exactly this using DigitalOcean's new &lt;a href="https://docs.digitalocean.com/products/inference/how-to/use-inference-router/" rel="noopener noreferrer"&gt;Inference Router&lt;/a&gt;, a policy-driven routing layer that matches each incoming prompt to the right model based on task complexity, with no hardcoded "if/else" logic required. In this tutorial, we cover the "Could have been an email" router that we built with this feature, how it works, and how to build your own custom router with DigitalOcean's tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the router works with DigitalOcean
&lt;/h2&gt;

&lt;p&gt;Traditional LLM (large language model) inference involves sending a request to a single model and getting a response, so the quality of the response depends entirely on the model you picked. An LLM router is a layer between you and a group of models. It takes your request, identifies the best model for it, and has that specific model handle it. Routers can be customized to choose models based on speed, price, specific task, or any other optimization you are looking for. This lets teams set up a single endpoint for a wide range of needs while getting the best possible price and speed for each request.&lt;/p&gt;

&lt;p&gt;In our case, we built a router with two tasks. The first task is &lt;code&gt;write_email&lt;/code&gt;. It is backed by a cheap, fast model (&lt;a href="https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct" rel="noopener noreferrer"&gt;Llama 3.3 Instruct 70B&lt;/a&gt;) for writing a simple email. The second task is &lt;code&gt;write_meeting_agenda&lt;/code&gt;. It is backed by a frontier model (Anthropic &lt;a href="https://www.anthropic.com/news/claude-opus-4-7" rel="noopener noreferrer"&gt;Claude Opus 4.7&lt;/a&gt;) to create a detailed meeting plan for decisions that genuinely require talking to each other. In the request, you describe what you need done: the topic, the stakeholders, and any agenda items. The router reads that description, matches it against the task definitions, and routes it to whichever model fits. If the request lands on the &lt;code&gt;write_email&lt;/code&gt; task, the app delivers a verdict of "this could be an email" and generates a ready-to-send email draft. If it lands on &lt;code&gt;write_meeting_agenda&lt;/code&gt;, the app confirms the meeting is warranted and produces a structured agenda with talking points and action items. The routing decision itself is the verdict. No additional classification logic is needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1. Build the router
&lt;/h2&gt;

&lt;p&gt;The first step to building a router is to log in to your DigitalOcean cloud account, or create an account if you don't have one already. Navigate to the &lt;a href="https://cloud.digitalocean.com/model-studio/router/" rel="noopener noreferrer"&gt;router page&lt;/a&gt; and select "Create Router".&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdoimages.nyc3.cdn.digitaloceanspaces.com%2F010AI-ML%2F2025%2FAndrew%2F19_Meeting_or_Email%2F1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdoimages.nyc3.cdn.digitaloceanspaces.com%2F010AI-ML%2F2025%2FAndrew%2F19_Meeting_or_Email%2F1.png" title="Create a Router in the DigitalOcean Control Panel" alt="The DigitalOcean Create a Router page showing the name and description fields" width="800" height="456"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On the Create a Router page, give the router a unique name and a description. That description is not just metadata. It serves as a routing prompt, giving the router overall context so it can identify the most appropriate task for each incoming request. From there you define the tasks that make up the router's logic. Each task combines a name, a description, and a model pool with a selection policy. You can either add pre-configured tasks that DigitalOcean has already benchmarked and optimized, or define fully custom tasks that specify exactly which models to use and how to rank them, whether by cost efficiency, speed (&lt;a href="https://www.digitalocean.com/blog/llm-inference-benchmarking" rel="noopener noreferrer"&gt;Time To First Token&lt;/a&gt;), or a manual ranking you control.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdocs.digitalocean.com%2Fscreenshots%2Finference%2Fadd-custom-task.fc4a50918dd6700b40e7f2bdca0e20b358a2f6b7322dc31921cbe8d3f448a21b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdocs.digitalocean.com%2Fscreenshots%2Finference%2Fadd-custom-task.fc4a50918dd6700b40e7f2bdca0e20b358a2f6b7322dc31921cbe8d3f448a21b.png" title="Add a custom task to the router" alt="The Add Custom Task dialog showing task name, description, and model pool fields" width="800" height="459"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once your tasks are in place, the last piece is specifying fallback models. Fallback models catch any request that does not cleanly match one of your configured tasks, and they are tried in the priority order you set. This gives the router a safety net so that even if the incoming prompt is ambiguous or outside the scope of your named tasks, a response is still generated rather than failing silently. For our email/meeting router, that means a borderline "is this a meeting or an email?" input never goes unanswered.&lt;/p&gt;

&lt;p&gt;If you prefer automation over the control panel, you can also create the router with a single POST request to &lt;code&gt;https://api.digitalocean.com/v2/gen-ai/models/routers&lt;/code&gt;, passing in the same name, task definitions, selection policies, and fallback models as a JSON body. This approach is also useful for version-controlling your router alongside your application code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2. Build the app
&lt;/h2&gt;

&lt;p&gt;With the router created, integrating it into an application is straightforward because the router is a drop-in replacement for any direct model call. You use the same Chat Completions endpoint (&lt;code&gt;https://inference.do-ai.run/v1/chat/completions&lt;/code&gt;) and the same request shape, but instead of naming a specific model you prefix your router's name with &lt;code&gt;router:&lt;/code&gt; in the &lt;code&gt;model&lt;/code&gt; field. For this app, the field would look like &lt;code&gt;"model": "router:meeting-or-email"&lt;/code&gt;. Authentication works the same way. You generate a Model Access Key from the DigitalOcean Control Panel, export it as &lt;code&gt;MODEL_ACCESS_KEY&lt;/code&gt;, and pass it as a Bearer token in your request header. The user's meeting description, agenda, and attendee list become the message content, and the router takes it from there.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# meeting_or_email.py&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;meeting_or_email&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://inference.do-ai.run/v1/chat/completions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="c1"&gt;# Read the Model Access Key exported as MODEL_ACCESS_KEY&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;MODEL_ACCESS_KEY&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;router:meeting-or-email&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a workplace productivity assistant that evaluates whether a task &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;requires a live meeting or can be handled asynchronously via email. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;If the request involves a straightforward update, announcement, or single-topic &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;communication with no real-time decision-making needed, write a concise, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;professional email draft and state that this could have been an email. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;If the request requires discussion, real-time collaboration, debate, or &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;coordination among multiple stakeholders with competing priorities, produce &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;a structured meeting agenda with talking points and action items, and confirm &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;that a meeting is warranted. Always begin your response by clearly stating &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your verdict: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;This could be an email.&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; or &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;This warrants a meeting.&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;
                &lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Surface auth or routing errors instead of a KeyError below&lt;/span&gt;
    &lt;span class="n"&gt;response_body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Model: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response_body&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Message: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response_body&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;choices&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;model&lt;/code&gt; field in the response body tells you exactly which model the router selected for that request. Requests the router judged as routine land on the cheaper, faster model, while requests it judged as genuinely complex land on the frontier model. The &lt;code&gt;x-model-router-selected-route&lt;/code&gt; response header tells you which task was matched, for example &lt;code&gt;write_email&lt;/code&gt; vs &lt;code&gt;write_meeting_agenda&lt;/code&gt;, or &lt;code&gt;fallback&lt;/code&gt; if none of the tasks matched. The app does not need any if/else logic to decide what kind of meeting it is. It reads the header the router already populated and maps it to a verdict message for the user.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;meeting_or_email&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I need to plan a large event with multiple stakeholders that will all be involved.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Model: anthropic-claude-opus-4.7
Message: This warrants a meeting.

Coordinating a large event with multiple stakeholders involves competing priorities, real-time negotiation of responsibilities, and collaborative decision-making that simply cannot be handled efficiently via email threads. Below is a structured agenda to make the meeting productive.

---

## Event Planning Kickoff Meeting
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As the output above shows, a large coordination task is routed to Claude Opus 4.7. A smaller task that only warrants an email, shown below, is routed to Llama 3.3.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;meeting_or_email&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I have some metrics I want to share with my team.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Model: llama3.3-70b-instruct
Message: This could be an email. 

Here's a draft email you could send to your team:

Subject: Update on Key Metrics

Dear Team,
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
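
&lt;p&gt;The header-to-verdict mapping described above can be sketched in a few lines. This is a minimal illustration, not the app's actual code: the helper name &lt;code&gt;verdict_from_headers&lt;/code&gt; and the wording of the messages are assumptions, while the header name and the task names come from the router setup in this tutorial.&lt;/p&gt;

```python
# Hypothetical mapping from the matched task name to a user-facing verdict.
# The task names and the "fallback" value are the ones described in this tutorial.
VERDICTS = {
    "write_email": "This could be an email.",
    "write_meeting_agenda": "This warrants a meeting.",
}

ROUTE_HEADER = "x-model-router-selected-route"


def verdict_from_headers(headers):
    """Return a verdict string for the task named in the route header."""
    # Normalize header names because HTTP header names are case-insensitive.
    normalized = {name.lower(): value for name, value in headers.items()}
    route = normalized.get(ROUTE_HEADER, "fallback")
    # Unknown or fallback routes get a neutral message instead of a verdict.
    return VERDICTS.get(route, "No task matched, so a fallback model answered.")


print(verdict_from_headers({"X-Model-Router-Selected-Route": "write_email"}))
```

&lt;p&gt;Inside &lt;code&gt;meeting_or_email&lt;/code&gt;, this would be called as &lt;code&gt;verdict_from_headers(response.headers)&lt;/code&gt;. A lookup table keeps the application free of branching logic: adding a third task to the router only means adding one more entry.&lt;/p&gt;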



&lt;h2&gt;
  
  
  Step 3. Deploy to DigitalOcean App Platform
&lt;/h2&gt;

&lt;p&gt;Before deploying your own router, it is worth spending a few minutes in the Inference Router playground to validate that the router is routing the way you expect. From the &lt;code&gt;My Routers&lt;/code&gt; tab, click the menu next to your router and select a model to compare it against. The Playground opens in a split view where you can type a meeting description and see both the router's response and the comparison model's response side by side. Each result shows the cost difference, end-to-end latency, the specific model the router selected, and the task that was matched for that query. This is a useful check to confirm that your task descriptions are correctly discriminating between routine syncs and complex-coordination requests before any real traffic hits the router.&lt;/p&gt;

&lt;p&gt;Once deployed, the Analyze tab gives you a live view of how the router is performing in production. You can see aggregate metrics across all your routers or drill into a specific one, including total requests, total token usage, model match rate, and fallback rate. Model match rate is the percentage of requests matched to a configured task, and fallback rate is the percentage that fell through to the fallback models instead. For accuracy evaluation, the Router Evaluation tool in the Playground tab lets you upload a labeled dataset and run an LLM-as-a-Judge evaluation that scores responses on completeness, correctness, token usage, and latency. Together these two views give you what you need to iterate on your task descriptions and model pools after launch as you accumulate real meeting data.&lt;/p&gt;
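
&lt;p&gt;If you also log the &lt;code&gt;x-model-router-selected-route&lt;/code&gt; header for each request in your own application, you can approximate the same two rates locally. The sketch below assumes a plain list of logged route names and a hypothetical helper called &lt;code&gt;routing_rates&lt;/code&gt;. It follows the definitions given above and is not necessarily how the Analyze tab computes its numbers.&lt;/p&gt;

```python
from collections import Counter


def routing_rates(routes):
    """Return (match_rate, fallback_rate) as percentages for logged route names."""
    if not routes:
        return 0.0, 0.0
    counts = Counter(routes)
    total = len(routes)
    # Requests whose route header was "fallback" matched no configured task.
    fallback = counts.get("fallback", 0)
    fallback_rate = 100.0 * fallback / total
    match_rate = 100.0 * (total - fallback) / total
    return match_rate, fallback_rate


# Hypothetical log of route header values collected by the application.
logged = ["write_email", "write_email", "write_meeting_agenda", "fallback"]
match_rate, fallback_rate = routing_rates(logged)
print(f"match rate: {match_rate:.1f}%, fallback rate: {fallback_rate:.1f}%")
```

&lt;p&gt;A rising fallback rate is a hint that the task descriptions no longer cover the requests users are actually sending.&lt;/p&gt;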

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The meeting app we built is a thin wrapper around a genuinely powerful idea. You do not have to choose which model handles a request. You only have to describe the conditions under which each model makes sense and let the router enforce those conditions at runtime. The router does not just save money on tokens. It changes how you think about designing for complexity. Instead of building one prompt that works adequately for everything, you build narrow, well-described task buckets and let semantic matching handle the dispatch.&lt;/p&gt;

&lt;p&gt;The broader lesson here extends well beyond meetings and emails. The same pattern applies anywhere you have a mix of requests hitting a single endpoint. This could include a customer support queue where most tickets are simple FAQs but a few require nuanced reasoning, a code review pipeline where style fixes and architecture feedback warrant very different models, or a legal document classifier where boilerplate and novel clauses should not cost the same to process. Once you have written a router description and a pair of task definitions, you have infrastructure that scales horizontally without adding branching logic to your application code. DigitalOcean's &lt;a href="https://www.digitalocean.com/" rel="noopener noreferrer"&gt;platform&lt;/a&gt; keeps that infrastructure on one bill and one security model, which removes the operational overhead that typically discourages teams from adopting multi-model strategies in the first place.&lt;/p&gt;

&lt;h2&gt;
  
  
  Related links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.digitalocean.com/products/inference/how-to/use-inference-router/" rel="noopener noreferrer"&gt;How to Use Inference Router&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.digitalocean.com/community/tutorials/how-to-build-parallel-agentic-workflows-with-python" rel="noopener noreferrer"&gt;How to Build Parallel Agentic Workflows with Python&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.digitalocean.com/community/tutorials/mistral-7b-fine-tuning" rel="noopener noreferrer"&gt;Fine-Tune Mistral-7B with LoRA: A Quickstart Guide&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>tutorial</category>
      <category>agentskills</category>
      <category>inference</category>
    </item>
    <item>
      <title>How I Used Nemotron 3 to Help Me Find the Perfect Dishrack</title>
      <dc:creator>Andrew Dugan</dc:creator>
      <pubDate>Thu, 09 Apr 2026 18:00:00 +0000</pubDate>
      <link>https://dev.to/digitalocean/how-did-nemotron-3-help-me-find-the-perfect-dish-rack-479c</link>
      <guid>https://dev.to/digitalocean/how-did-nemotron-3-help-me-find-the-perfect-dish-rack-479c</guid>
      <description>&lt;p&gt;After recently moving into a new apartment, I realized how much time I was spending searching online for household items ranging from storage solutions, to pots and pans, to the furniture thing that sits at the end of the bed. It occurred to me that this seems like the perfect task for an LLM. So I built an app that does just that. &lt;/p&gt;

&lt;p&gt;The Nemofinder sorts through dozens of product descriptions to find one that matches your exact needs. This tutorial describes how the application works. &lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Nemotron 3 Nano's efficient Mixture-of-Experts architecture enables cost-effective product filtering at scale, comparing product descriptions against specific requirements while maintaining high accuracy.&lt;/li&gt;
&lt;li&gt;The Nemofinder integrates third-party search APIs to gather product listings and leverages Nemotron 3 Nano to intelligently match products based on detailed user requirements, reviews, and pricing.&lt;/li&gt;
&lt;li&gt;The application is fully customizable and open source, allowing you to adapt it for any product search use case and integrate it with different search APIs based on your needs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why Nemotron 3 Nano?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16" rel="noopener noreferrer"&gt;Nemotron 3 Nano&lt;/a&gt; is specifically optimized for cost efficiency in targeted agentic tasks without sacrificing accuracy. This makes it an ideal choice for filtering through dozens of product descriptions and checking whether each one matches specific product requirements. Unlike larger models that may be overkill for focused tasks, Nano delivers strong performance while remaining significantly more efficient. It is also open source, giving you complete control over your personal product queries and output data. &lt;/p&gt;

&lt;p&gt;Under the hood, Nemotron 3 Nano uses a hybrid &lt;a href="https://arxiv.org/html/2503.07137v1" rel="noopener noreferrer"&gt;Mixture-of-Experts&lt;/a&gt; (MoE) architecture combined with &lt;a href="https://arxiv.org/abs/2405.21060" rel="noopener noreferrer"&gt;Mamba-2 state-space models&lt;/a&gt;, which dramatically reduces computational overhead compared to traditional transformer architectures. Even though the model has 30 billion parameters, only 3.5 billion are active per token during inference. This architectural efficiency translates to faster response times and lower computational costs, making it practical to deploy on smaller GPU instances. Additionally, you can optionally disable Nemotron's reasoning capabilities through a simple configuration flag if you need even faster inference for straightforward product matching tasks, though this may slightly reduce accuracy. Refer to the &lt;a href="https://www.digitalocean.com/community/tutorials/nemotron-3-models-run-gpu-droplet" rel="noopener noreferrer"&gt;deployment guide&lt;/a&gt; to deploy an instance on a DigitalOcean Droplet. &lt;/p&gt;

&lt;h2&gt;
  
  
  How the Nemofinder Works
&lt;/h2&gt;

&lt;p&gt;First, the application takes the keyword you would like to search along with a detailed text description of your specific requirements for that item. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdoimages.nyc3.cdn.digitaloceanspaces.com%2F010AI-ML%2F2025%2FAndrew%2F13_Nemofinder%2FProduct%2520Requirements.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdoimages.nyc3.cdn.digitaloceanspaces.com%2F010AI-ML%2F2025%2FAndrew%2F13_Nemofinder%2FProduct%2520Requirements.png" title="Product requirements form for Nemofinder" alt="Product requirements for Nemotron Nemofinder" width="800" height="110"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It then uses a search API (application programming interface) to look for items using the keyword. The search API can be store-specific, a generic shopping API, or a custom combination that calls multiple APIs. It needs to be able to take a keyword and return a list of products with their descriptions, and ideally reviews, as a response. &lt;/p&gt;

&lt;p&gt;The application then goes through each of the product descriptions, prices, reviews, comments, etc., and has Nemotron 3 Nano compare each description to your product requirements. After sorting through and finding matches, it returns the matches to the user. In this case, it found the perfect dish rack to match the requirements in my description. &lt;/p&gt;
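
&lt;p&gt;The matching loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Nemofinder's actual code: the function names, the product fields, and the YES/NO prompt format are all hypothetical, and &lt;code&gt;ask_model&lt;/code&gt; stands in for whatever call sends a prompt to your Nemotron 3 Nano deployment.&lt;/p&gt;

```python
def build_prompt(requirements, product):
    """Combine the user's requirements with one product listing."""
    return (
        f"Requirements:\n{requirements}\n\n"
        f"Product title: {product['title']}\n"
        f"Price: {product.get('price', 'unknown')}\n"
        f"Description: {product['description']}\n\n"
        "Does this product meet every requirement? Answer YES or NO."
    )


def find_matches(products, requirements, ask_model):
    """Return the products for which the model's answer starts with YES."""
    matches = []
    for product in products:
        answer = ask_model(build_prompt(requirements, product))
        if answer.strip().upper().startswith("YES"):
            matches.append(product)
    return matches


def fake_model(prompt):
    # Stand-in for a real model call so the sketch runs offline.
    # It only inspects the text after "Description:" in the prompt.
    description = prompt.split("Description:")[1].lower()
    return "YES" if "stainless" in description else "NO"


# Hypothetical search results and requirements for demonstration.
products = [
    {"title": "Compact steel rack", "price": 24.99,
     "description": "Stainless steel, fits small counters."},
    {"title": "Painted wire rack", "price": 12.50,
     "description": "Coated wire frame, large footprint."},
]
requirements = "Must be rust-proof and fit a small counter."

for match in find_matches(products, requirements, fake_model):
    print(match["title"])
```

&lt;p&gt;Passing the model call in as a function keeps the loop independent of any particular search API or deployment, which fits the app's goal of being easy to adapt.&lt;/p&gt;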

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdoimages.nyc3.cdn.digitaloceanspaces.com%2F010AI-ML%2F2025%2FAndrew%2F13_Nemofinder%2FDish%2520rack.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdoimages.nyc3.cdn.digitaloceanspaces.com%2F010AI-ML%2F2025%2FAndrew%2F13_Nemofinder%2FDish%2520rack.png" title="Nemofinder results showing matching dish rack" alt="The perfect dish rack from the Nemotron Nemofinder" width="800" height="772"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Improving and Implementing the Nemofinder
&lt;/h2&gt;

&lt;p&gt;The Nemofinder is open source and available on &lt;a href="https://github.com/adugan-do/nemofinder" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;. To run it yourself, first add a &lt;a href="https://serpapi.com/" rel="noopener noreferrer"&gt;SerpAPI&lt;/a&gt; key or change the API to one that you have access to. Then &lt;a href="https://www.digitalocean.com/community/tutorials/nemotron-3-models-run-gpu-droplet" rel="noopener noreferrer"&gt;set up a DigitalOcean GPU Droplet&lt;/a&gt; with Nemotron 3, and update the Nemotron 3 calls to use your deployment's IP address. Feel free to clone, change, and use the application as you'd like.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Can this application buy the product?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No. Purchasing functionality could be added, but I wouldn't trust it. The problem this application solves is the time spent looking for the ideal product, and automating purchases without human verification introduces unnecessary risk.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can it search on all platforms, like Amazon?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Only if you have an API for that particular platform. With the right API, you can search through anything. Amazon does offer a Product Advertising API, though access can be limited. For most e-commerce platforms, you'll need to check their developer documentation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I use a different LLM instead of Nemotron 3 Nano?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes, you can adapt the application to use other models. However, Nemotron 3 Nano is recommended for its efficiency and cost-effectiveness on product filtering tasks. Larger models like Claude or GPT may work but could result in higher token costs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do I handle price variations across different products?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If the search API returns price data, the application passes it to Nemotron 3 Nano alongside the product description. You can modify the prompts to set price thresholds or have the model factor pricing into the matching criteria based on your budget.&lt;/p&gt;
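
&lt;p&gt;Besides adjusting the prompt, one option is to apply a hard price cap in code before any product reaches the model, which also saves tokens. The sketch below is one possible approach under stated assumptions: the &lt;code&gt;price&lt;/code&gt; field name, the price string format, and both helper names are hypothetical, so adapt them to whatever your search API returns.&lt;/p&gt;

```python
import re


def parse_price(raw):
    """Extract a number from values such as '$24.99' or 30; None if absent."""
    if isinstance(raw, (int, float)):
        return float(raw)
    match = re.search(r"\d+(?:\.\d+)?", str(raw).replace(",", ""))
    return float(match.group()) if match else None


def within_budget(products, max_price):
    """Keep only the products whose parsed price does not exceed max_price."""
    kept = []
    for product in products:
        price = parse_price(product.get("price"))
        # Products without a readable price are dropped in this sketch.
        if price is not None and max_price >= price:
            kept.append(product)
    return kept


candidates = [
    {"title": "Rack A", "price": "$24.99"},
    {"title": "Rack B", "price": "$1,299.00"},
    {"title": "Rack C"},
]
print([product["title"] for product in within_budget(candidates, 50)])
```

&lt;p&gt;Dropping products with no readable price is a design choice. If you would rather have the model judge those listings, keep them and let the prompt handle the missing value.&lt;/p&gt;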

&lt;p&gt;&lt;strong&gt;Is my product search history private?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It depends on how you deploy it. Running the application locally keeps everything on your machine. If you deploy it on a remote server, be mindful of which APIs you're using and review their privacy policies. Consider using a dedicated API account and limiting what data is logged. &lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The Nemofinder demonstrates how Nemotron 3 Nano can efficiently handle targeted product discovery tasks without the overhead of larger language models. By combining a search API with Nemotron's reasoning capabilities, you can quickly find products that match your exact specifications across many product listings and their reviews. Whether you're searching for household items, specialized equipment, or niche products, the application adapts to your needs through customizable prompts and API integrations.&lt;/p&gt;

&lt;p&gt;The beauty of the Nemofinder is its flexibility. You can extend it to search across multiple e-commerce platforms, add additional filtering criteria, or integrate it into a larger workflow. As shown in the related Daily Digest tutorial, these kinds of specialized tools can be combined to create comprehensive AI-driven solutions. If you want to explore further or build your own product search application, the source code is available on GitHub, and the setup process is straightforward with the right API keys and a Nemotron 3 Nano deployment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Related Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.digitalocean.com/community/tutorials/nemotron-3-models-run-gpu-droplet" rel="noopener noreferrer"&gt;Nemotron 3 on DigitalOcean&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.digitalocean.com/community/tutorials/how-to-build-parallel-agentic-workflows-with-python" rel="noopener noreferrer"&gt;How to Build Parallel Agentic Workflows with Python&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.digitalocean.com/community/tutorials/run-gpt-oss-vllm-amd-gpu-droplet-rocm" rel="noopener noreferrer"&gt;Run gpt-oss 120B on vLLM with an AMD Instinct MI300X GPU Droplet&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>nemotron</category>
      <category>python</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
