<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mounir Mouawad</title>
    <description>The latest articles on DEV Community by Mounir Mouawad (@momo_portia).</description>
    <link>https://dev.to/momo_portia</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2961282%2Fe8b717d5-dbf0-4f14-a0f4-628cc4c34687.jpeg</url>
      <title>DEV Community: Mounir Mouawad</title>
      <link>https://dev.to/momo_portia</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/momo_portia"/>
    <language>en</language>
    <item>
      <title>5 tools we wish were on the Awesome AI Tools list</title>
      <dc:creator>Mounir Mouawad</dc:creator>
      <pubDate>Fri, 15 Aug 2025 12:15:07 +0000</pubDate>
      <link>https://dev.to/portia-ai/5-tools-we-wish-were-on-the-awesome-ai-tools-list-576m</link>
      <guid>https://dev.to/portia-ai/5-tools-we-wish-were-on-the-awesome-ai-tools-list-576m</guid>
      <description>&lt;p&gt;We’re big fans of the Awesome AI Tools &lt;a href="https://github.com/mahseema/awesome-ai-tools?tab=readme-ov-file" rel="noopener noreferrer"&gt;list&lt;/a&gt; and we all use it to discover new AI tools over at &lt;a href="https://github.com/portiaAI/portia-sdk-python" rel="noopener noreferrer"&gt;Portia AI&lt;/a&gt;. My latest and favourite find is &lt;a href="http://getmerlin.in" rel="noopener noreferrer"&gt;Merlin&lt;/a&gt;, a Chrome extension that lets me ask “how to” questions in any app rather than flipping over to ChatGPT or Claude.&lt;/p&gt;

&lt;p&gt;Here are five tools we use a lot and wish were on the Awesome AI Tools list:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://textual.textualize.io/" rel="noopener noreferrer"&gt;Textual&lt;/a&gt; – We love spicing up our terminal interface for using the Portia SDK and even non-technical customers love it when I run demos from the terminal now. We all have our favourite terminal flavour of it – I made mine with Atari retro vibes 🕹️holler if you’re using Portia and want the code for it!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://mistral.ai/solutions/document-ai" rel="noopener noreferrer"&gt;Mistral OCR&lt;/a&gt; – We think it’s the best balance of cost, speed and performance for OCR on the market right now. We also admittedly have soft spot for our neighbours across the English Channel over in La France 🥐.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://visily.ai/" rel="noopener noreferrer"&gt;Visily&lt;/a&gt; – Figma for non-designers, it’s my go-to when brainstorming early UX mocks with front-end engineers and UX designers. I especially love the ability to turn any screenshot into a wireframe because I can bring inspirations to life with some tweaks super quickly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/souzatharsis/podcastfy" rel="noopener noreferrer"&gt;Podcastfy&lt;/a&gt; – I can’t say for sure why they skipped the “i” in their name but we love that they are an open source and equally powerful alternative to NotebookLM. One of our engineers built a bite-sized AI news podcast that I listen to during my commute daily. You can recreate it &lt;a href="https://github.com/portiaAI/portia-agent-examples/tree/main/ai-research-agent" rel="noopener noreferrer"&gt;here&lt;/a&gt; using Portia SDK or get the daily podcast on our &lt;a href="https://discord.gg/DvAJz9ffaR" rel="noopener noreferrer"&gt;Discord server&lt;/a&gt;’s &lt;strong&gt;#ai-news&lt;/strong&gt; channel.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://openrouter.ai/" rel="noopener noreferrer"&gt;OpenRouter&lt;/a&gt; - We love OpenRouter because it allows you to easily try out new models and load balance between models. We actually got an open source &lt;a href="https://github.com/portiaAI/portia-sdk-python/pull/640" rel="noopener noreferrer"&gt;contribution&lt;/a&gt; for this one recently, so we should be supporting it ❤️&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>programming</category>
      <category>ai</category>
    </item>
    <item>
      <title>Code vs LLM in a simple planning poker agent example</title>
      <dc:creator>Mounir Mouawad</dc:creator>
      <pubDate>Wed, 09 Jul 2025 17:00:33 +0000</pubDate>
      <link>https://dev.to/portia-ai/code-vs-llm-in-a-simple-planning-poker-agent-example-5dg0</link>
      <guid>https://dev.to/portia-ai/code-vs-llm-in-a-simple-planning-poker-agent-example-5dg0</guid>
      <description>&lt;p&gt;If you're building AI agents, chances are you've often had to consider how much logic you want to handle through the LLM versus through traditional code. I wanted to share my experience with it this morning as a conversation starter and get your thoughts!&lt;/p&gt;

&lt;h2&gt;
  
  
  What I wanted the agent to do
&lt;/h2&gt;

&lt;p&gt;I normally spend a ton of time gathering feedback from our users. In a previous life I would put those insights into tickets in Linear and spend a ton of mental cycles trying to size the return on effort to inform our prioritisation. In this bold new world of AI, I figured I would instead write up a &lt;a href="https://en.wikipedia.org/wiki/Planning_poker" rel="noopener noreferrer"&gt;planning poker&lt;/a&gt; agent to help me do t-shirt sizing of some of those tickets in Linear. Built on the Portia SDK, the agent would:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Fetch relevant Linear tickets using the remote MCP server for Linear, which is one of &lt;a href="https://www.portialabs.ai/tools" rel="noopener noreferrer"&gt;1000s of tools&lt;/a&gt; we have with built-in auth.&lt;/li&gt;
&lt;li&gt;Simulate sizing estimates from multiple developer personas and get to a consensus for each ticket's effort sizing. Here I wanted to create a ticket estimator tool using a subclass of our LLM tool that would return estimates as &lt;a href="https://docs.portialabs.ai/inputs-outputs#llm-tool-outputs" rel="noopener noreferrer"&gt;structured outputs&lt;/a&gt;. The tool would take a &lt;code&gt;context.md&lt;/code&gt; file where I keep a summary of the architecture and core abstractions that make up the Portia SDK so it can help the LLM with effort sizing.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;As it turns out, I had asked one of our devs (we'll call him Ethan) to do this and forgotten! So we both wrote this thing up at the same time except...I relied quite heavily on the LLM to handle the task while he relied way more heavily on code. Let's unpack how our approaches compared.&lt;/p&gt;

&lt;h2&gt;
  
  
  How each of us built it
&lt;/h2&gt;

&lt;p&gt;Full code in our agent examples repo &lt;a href="https://github.com/portiaAI/portia-agent-examples/tree/main/planning-poker" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🧠 &lt;strong&gt;LLM-heavy&lt;/strong&gt;: I relied on a robust prompt and the &lt;a href="https://docs.portialabs.ai/generate-plan" rel="noopener noreferrer"&gt;Portia planning agent&lt;/a&gt; to figure out the entire set of steps that need to be taken, that is, fetch and filter tickets from Linear, then get estimates for ticket sizes from each developer persona and average them out. Essentially I relied on the LLM itself to 1) index and aggregate the sizing estimates by Linear ticket id and persona, and 2) figure out how many tool call iterations (a.k.a. "unrolling") to make to handle all ticket id and persona combinations. Here's the code snippet where the magic happens:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Get tickets from Linear and estimate the size of the tickets
&lt;/span&gt;&lt;span class="n"&gt;project&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Async Portia&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Get the tickets i&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;m working on from Linear with a limit of 3 on the tool call. Then filter specifically for those regarding the &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; project.
    For each combination of the tickets above and the following personas, estimate the size of the ticket.
    &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;personas&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;

    Return the estimates in a list of PlanningPokerEstimate objects, with estimate sizes averaged across the personas for each ticket.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="n"&gt;estimates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;portia&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;structured_output_schema&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PlanningPokerEstimateList&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;outputs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;final_output&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;estimates&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;🧑🏻‍💻 &lt;strong&gt;Code-heavy&lt;/strong&gt;: Ethan on the other hand figured that we don't really need to rely on the LLM, neither for planning nor for indexing / aggregating / iterating on estimates. Instead he used Portia's declarative &lt;code&gt;PlanBuilder&lt;/code&gt; &lt;a href="https://docs.portialabs.ai/generate-plan#build-a-plan-manually" rel="noopener noreferrer"&gt;interface&lt;/a&gt; to enumerate the steps and tool calls needed. He fetched the tickets using a first Portia plan run into &lt;code&gt;LinearTicket&lt;/code&gt; objects using &lt;a href="https://docs.portialabs.ai/inputs-outputs#plan-structured-outputs" rel="noopener noreferrer"&gt;structured outputs&lt;/a&gt;. To generate sizing estimates, he then iterated with conventional code over each developer persona and over each ticket element in the list returned from the previous plan run. Each iteration called the ticket estimator tool in a single step Portia plan run. Here's a code snippet containing both the ticket fetching plan run and the ticket sizing iterations:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Fetch Linear tickets
&lt;/span&gt;&lt;span class="n"&gt;project&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Async SDK&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Get the tickets i&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;m working on from linear regarding the &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; project&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;plan&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PlanBuilder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;structured_output_schema&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;LinearTicketList&lt;/span&gt;
&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; and only call the tool with a limit of 3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;portia:mcp:mcp.linear.app:list_my_issues&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Filter the tickets to only include specifically the ones related to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;plan_run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;portia&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run_plan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tickets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plan_run&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;outputs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;final_output&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tickets&lt;/span&gt;

&lt;span class="c1"&gt;# Iterate over tickets and persona to generate estimates
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;ticket&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tickets&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;estimates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;estimate_plan&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PlanBuilder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;estimate the size of the ticket&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;structured_output_schema&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PlanningPokerEstimate&lt;/span&gt;
    &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Estimate the size of the ticket: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ticket_estimator_tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;persona&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;personas&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;persona&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;
        &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;tool_context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;estimate_tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;
        &lt;span class="n"&gt;portia&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;with_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;estimate_tool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;overwrite&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;estimate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;portia&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run_plan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;estimate_plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;estimate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;PlanRunState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COMPLETE&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;estimates&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;estimate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;outputs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;final_output&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
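
&lt;p&gt;The code-heavy snippet above stops at collecting one estimate per persona, so here is a minimal sketch of how those estimates could be collapsed into a consensus size per ticket with plain Python. The size scale, the &lt;code&gt;PersonaEstimate&lt;/code&gt; class and the ticket ids are all assumptions made up for illustration; the actual &lt;code&gt;PlanningPokerEstimate&lt;/code&gt; schema lives in the examples repo and may well look different.&lt;/p&gt;

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical size scale; the post only says "t-shirt sizing".
SIZES = ["XS", "S", "M", "L", "XL"]


@dataclass
class PersonaEstimate:
    """Illustrative stand-in for one persona's estimate of one ticket."""
    ticket_id: str
    persona: str
    size: str


def consensus_by_ticket(estimates):
    """Average the sizes per ticket and map the mean back to a label."""
    scores = defaultdict(list)
    for est in estimates:
        scores[est.ticket_id].append(SIZES.index(est.size))
    consensus = {}
    for ticket_id, values in scores.items():
        mean = sum(values) / len(values)
        # int(mean + 0.5) rounds halves up, unlike Python's built-in round().
        consensus[ticket_id] = SIZES[int(mean + 0.5)]
    return consensus


# Made-up sample data, not real Linear tickets.
estimates = [
    PersonaEstimate("POR-1", "backend", "S"),
    PersonaEstimate("POR-1", "frontend", "M"),
    PersonaEstimate("POR-1", "qa", "L"),
    PersonaEstimate("POR-2", "backend", "XL"),
    PersonaEstimate("POR-2", "frontend", "L"),
]
result = consensus_by_ticket(estimates)
print(result)
```

&lt;p&gt;Keeping this step in code makes the aggregation deterministic, which is very much in the spirit of the code-heavy approach.&lt;/p&gt;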



&lt;h2&gt;
  
  
  What we learned
&lt;/h2&gt;

&lt;p&gt;Let's compare both approaches side by side and draw some conclusions. I hooked up LangSmith to Portia for &lt;a href="https://docs.portialabs.ai/agent-observability" rel="noopener noreferrer"&gt;observability&lt;/a&gt; so I could obtain the metrics shown below.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;LLM-heavy&lt;/th&gt;
&lt;th&gt;Code-heavy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Effort&lt;/td&gt;
&lt;td&gt;Lowest&lt;/td&gt;
&lt;td&gt;Highest&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Total tokens&lt;/td&gt;
&lt;td&gt;70k&lt;/td&gt;
&lt;td&gt;30k&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost&lt;/td&gt;
&lt;td&gt;$0.12&lt;/td&gt;
&lt;td&gt;$0.06&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Latency [P99]&lt;/td&gt;
&lt;td&gt;28.95s&lt;/td&gt;
&lt;td&gt;9.70s&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
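
&lt;p&gt;To put the table in relative terms, here is a quick back-of-the-envelope calculation of how much more the LLM-heavy run consumed on each metric. The numbers are copied from the table; the dictionary keys are just labels chosen for this sketch.&lt;/p&gt;

```python
# (LLM-heavy, code-heavy) values taken from the table above.
metrics = {
    "total_tokens": (70_000, 30_000),
    "cost_usd": (0.12, 0.06),
    "latency_p99_s": (28.95, 9.70),
}

# Ratio of LLM-heavy to code-heavy, rounded to two decimals.
ratios = {name: round(llm / code, 2) for name, (llm, code) in metrics.items()}

for name, ratio in ratios.items():
    print(f"{name}: LLM-heavy is {ratio}x code-heavy")
```

&lt;p&gt;That works out to roughly 2.3x the tokens, 2x the cost and about 3x the latency for a single run of each approach.&lt;/p&gt;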

&lt;p&gt;So what conclusions can we draw from this exercise?&lt;br&gt;
💡 &lt;strong&gt;Reliability:&lt;/strong&gt; You can trust your Portia agents to figure out the right sequence of steps and to unroll (iterate on) the tool calls correctly, which definitely simplifies development, kinda like a form of vibe coding. But much like vibe coding, it does take a bit of 'LLM-whispering' (a.k.a. prompt engineering) and the right underlying model. For plan runs with heavy iteration expectations in particular, you will need robust eval sets in place to keep tabs on reliability, lest you aim for a Mona Lisa and end up with a Picasso.&lt;br&gt;
👣 &lt;strong&gt;Traceability:&lt;/strong&gt; Relying on the LLM to handle planning and execution to the extent I did does make tracing particularly easy. One single &lt;code&gt;PlanRunState&lt;/code&gt; &lt;a href="https://docs.portialabs.ai/store-retrieve-plan-runs" rel="noopener noreferrer"&gt;instance&lt;/a&gt; in the Portia dashboard showed me the entirety of the work done by the underlying subagents. This also makes revisiting the output of the plan run easier of course. Ethan on the other hand ended up with numerous plan runs, which makes auditing and / or debugging harder.&lt;br&gt;
💸 &lt;strong&gt;Cost:&lt;/strong&gt; As you'd expect, the LLM-heavy method is slower and costlier. Both approaches presumably process the same underlying content (same number of tickets and estimations), but passing along a growing context window across all execution agents during the plan run adds overhead that the code-heavy method avoids. You're also opening yourself up to the stochasticity of LLMs when code could do the trick.&lt;/p&gt;

&lt;h2&gt;
  
  
  A parting thought
&lt;/h2&gt;

&lt;p&gt;One aspect I don't consider in the comparison above is autonomy. Because the task is neatly scoped in this example (planning poker agent = fetch and filter tickets + estimate per persona + summarise consensus), you can make the argument that at production scale one should restrict LLM usage to the tasks that traditional code can't handle as easily (e.g. natural language processing). BUT where inputs from the environment change or the scope of the task is fluid, the LLM-heavy approach truly thrives. I'll try to tease that out more clearly in a subsequent post.&lt;br&gt;
👉🏼 &lt;strong&gt;If you're interested please shout in the comments down below!&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  About Portia
&lt;/h2&gt;

&lt;p&gt;Portia AI is an open-source framework for building predictable, stateful, authenticated agentic workflows.&lt;/p&gt;

&lt;p&gt;We allow developers to have as much or as little oversight as they’d like over their multi-agent deployments and we are obsessively focused on production readiness.&lt;/p&gt;

&lt;p&gt;We invite you to play around with our &lt;a href="https://github.com/portiaAI/portia-sdk-python" rel="noopener noreferrer"&gt;SDK&lt;/a&gt;, break things, and tell us how you're getting on in &lt;a href="https://discord.gg/DvAJz9ffaR" rel="noopener noreferrer"&gt;Discord&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>ai</category>
      <category>llm</category>
      <category>mcp</category>
    </item>
  </channel>
</rss>
