<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sanskriti</title>
    <description>The latest articles on DEV Community by Sanskriti (@sansbuilds).</description>
    <link>https://dev.to/sansbuilds</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3424470%2F6491848a-8bb9-4657-bc69-1f09e5107a88.png</url>
      <title>DEV Community: Sanskriti</title>
      <link>https://dev.to/sansbuilds</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sansbuilds"/>
    <language>en</language>
    <item>
      <title>I Baked a Football Cake and It Taught Me About Building AI Agents</title>
      <dc:creator>Sanskriti</dc:creator>
      <pubDate>Sun, 30 Nov 2025 02:33:06 +0000</pubDate>
      <link>https://dev.to/sansbuilds/i-baked-a-football-cake-and-it-taught-me-about-building-ai-agents-3ih8</link>
      <guid>https://dev.to/sansbuilds/i-baked-a-football-cake-and-it-taught-me-about-building-ai-agents-3ih8</guid>
      <description>&lt;p&gt;I recently baked a football cake and it helped me realize AI agents work just like layered desserts. Here’s how flavors, molds and icing maps to agentic design.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftffxb65ip20bdw6cvieb.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftffxb65ip20bdw6cvieb.jpeg" alt="Football mold red velvet and vanilla birthday cake" width="502" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The code example below breaks user goals into actionable steps and executes them using either custom tools or LLM reasoning. A regex extracts the numbered steps from the LLM’s plan output; each step is then executed by matching keywords like "search" or "compute" to the appropriate tool, falling back to LLM reasoning when no tool matches.&lt;/p&gt;
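
&lt;p&gt;A minimal standalone sketch of that parsing idea (the sample plan text is invented for illustration; the full agent code follows later in the post):&lt;/p&gt;

```python
import re

# Hypothetical plan text shaped like typical numbered LLM output.
plan = """Here is the plan:
1. Search for quarterback efficiency data
2) Compute the average passer rating
Step 3: Summarize the findings"""

# Same pattern as the article's parser: capture the step number and the text after it.
STEP_REGEX = re.compile(
    r"(?:^|\s)(?:\*\*)?(?:Step\s*)?(\d+)[\.\):\- ]+(.*)", re.IGNORECASE
)

steps = []
for line in plan.splitlines():
    match = STEP_REGEX.search(line.strip())
    if match:
        steps.append(match.group(2).strip())

print(steps)
```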

&lt;p&gt;Just like a custom cake has layers of flavor, structure, and decoration, an AI agent has its own stack.&lt;/p&gt;

&lt;p&gt;It uses the Llama 3 model served through Ollama and defines two simple custom tools:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;search_tool() - simulates a search engine returning mock results.&lt;/li&gt;
&lt;li&gt;compute_tool() - simulates a computation task returning a placeholder result.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Implementing the agent architecture:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Base model&lt;/strong&gt;: The sponge layer. I used the Llama 3 model served locally through Ollama; it handles basic reasoning via the LLM.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# -------------------------------------------------------
# Ollama LLM Wrapper
# -------------------------------------------------------
class OllamaLLM:
    def __init__(self, model="llama3"):
        self.model = model

    def __call__(self, prompt: str) -&amp;gt; str:
        """Send a prompt to a local Ollama instance."""
        resp = requests.post(
            "http://localhost:11434/api/generate",
            json={"model": self.model, "prompt": prompt, "stream": False}
        )
        text = json.loads(resp.text).get("response", "")
        return text

# -------------------------------------------------------
# Base Agent 
# -------------------------------------------------------
class AgentCore:
    def __init__(self, llm):
        self.llm = llm

    def reason(self, prompt):
        return self.llm(prompt)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tool execution&lt;/strong&gt;: The icing and decor are the search and compute tools.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# -------------------------------------------------------
# Local Tools
# -------------------------------------------------------
def search_tool(query: str) -&amp;gt; dict:
    return {
        "tool": "search",
        "query": query,
        "results": [
            {"title": "Top NFL QBs 2024", "eff": 98.1},
            {"title": "Quarterback Rankings", "eff": 95.6},
        ],
    }


def compute_tool(task: str) -&amp;gt; dict:
    return {
        "tool": "compute",
        "task": task,
        "result": 42,  # we pretend the tool computed something important
    }

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Prompt logic&lt;/strong&gt;: Like flavor choices, the prompt defines the agent’s behavior. This layer adds step parsing and tool execution.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# -------------------------------------------------------
# Agent prompt and structured tool execution
# -------------------------------------------------------
class StructuredAgent(AgentCore):

    def parse_steps(self, plan: str):
        """Extract step lines starting with numbers."""
        lines = plan.split("\n")
        steps = []
        for line in lines:
            match = STEP_REGEX.search(line.strip())
            if match:
                cleaned = match.group(2).strip()
                steps.append(cleaned)
        return steps

    def execute_step(self, step: str):
        step_lower = step.lower()

        if "search" in step_lower:
            return search_tool(step)

        if "calculate" in step_lower or "compute" in step_lower:
            return compute_tool(step)

        # fallback: let the model reason
        return self.reason(step)

    def run(self, goal: str):
        PLAN_PROMPT = f"""You are a task decomposition engine.
Your ONLY job is to break the user's goal into a small set of concrete, functional steps.
Your outputs MUST stay within the domain of the user’s goal.  
If the goal references football, metrics, or sports, remain in that domain only.

RULES:
- Only return steps directly needed to complete the user’s goal.
- Do NOT invent topics, examples, reviews, or unrelated domains.
- Do NOT expand into full explanations.
- No marketing language.
- No creative writing.
- No assumptions beyond the user's exact goal.
- No extra commentary.

FORMAT:
1. &amp;lt;short step&amp;gt;
2. &amp;lt;short step&amp;gt;
3. &amp;lt;short step&amp;gt;

User goal: "{goal}"
"""
        plan = self.llm(PLAN_PROMPT)
        steps = self.parse_steps(plan)

        outputs = []
        for step in steps:
            outputs.append({
                "step": step,
                "output": self.execute_step(step)
            })
        return outputs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
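
&lt;p&gt;The keyword routing in execute_step is easy to sanity-check offline. Here is a quick sketch with mock tools and a stubbed reasoner standing in for the LLM (both invented for illustration):&lt;/p&gt;

```python
def search_tool(query):
    # Mock search tool, as in the article.
    return {"tool": "search", "query": query}

def compute_tool(task):
    # Mock compute tool, as in the article.
    return {"tool": "compute", "task": task}

def stub_reason(step):
    # Stands in for the LLM fallback; a real agent would call Ollama here.
    return {"tool": "llm", "step": step}

def execute_step(step):
    step_lower = step.lower()
    if "search" in step_lower:
        return search_tool(step)
    if "calculate" in step_lower or "compute" in step_lower:
        return compute_tool(step)
    return stub_reason(step)

print(execute_step("Search for QB stats")["tool"])         # search
print(execute_step("Compute the average rating")["tool"])  # compute
print(execute_step("Summarize the findings")["tool"])      # llm
```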



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;User-facing output&lt;/strong&gt;: The final taste. This layer formats the results into user-facing responses.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# -------------------------------------------------------
# User Facing Agent (Formatted Output Layer)
# -------------------------------------------------------
class FinalAgent(StructuredAgent):
    def respond(self, goal: str):
        results = self.run(goal)

        formatted = "\n".join(
            f"- **{r['step']}** → {r['output']}"
            for r in results
        )

        return (
            f"## Result for goal: *{goal}*\n\n"
            f"{formatted}\n"
        )

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Takeaway
&lt;/h3&gt;

&lt;p&gt;Whether you're baking or writing code, structure matters. Think in layers. And if you ever need a sweet analogy to explain AI agents, try cake. Got a dev-inspired dessert metaphor? Drop it in the comments and let’s make tech tasty.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tutorial
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;System Requirements:&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Python version: 3.11.6 &lt;br&gt;
Ollama: Install and run Ollama locally to serve the Llama 3 model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests
import json
import re

# parser that detects all common LLM step styles, including:
# 1. Do X
# 1) Do X
# Step 1: Do X
# **Step 1:** Do X
# - Step 1: Do X
# ### Step 1
# Step One:

STEP_REGEX = re.compile(
    r"(?:^|\s)(?:\*\*)?(?:Step\s*)?(\d+)[\.\):\- ]+(.*)", re.IGNORECASE
)



# -------------------------------------------------------
# Ollama LLM Wrapper
# -------------------------------------------------------
class OllamaLLM:
    def __init__(self, model="llama3"):
        self.model = model

    def __call__(self, prompt: str) -&amp;gt; str:
        """Send a prompt to a local Ollama instance."""
        resp = requests.post(
            "http://localhost:11434/api/generate",
            json={"model": self.model, "prompt": prompt, "stream": False}
        )
        text = json.loads(resp.text).get("response", "")
        return text

# -------------------------------------------------------
# Base Agent 
# -------------------------------------------------------
class AgentCore:
    def __init__(self, llm):
        self.llm = llm

    def reason(self, prompt):
        return self.llm(prompt)

# -------------------------------------------------------
# Local Tools
# -------------------------------------------------------
def search_tool(query: str) -&amp;gt; dict:
    return {
        "tool": "search",
        "query": query,
        "results": [
            {"title": "Top NFL QBs 2024", "eff": 98.1},
            {"title": "Quarterback Rankings", "eff": 95.6},
        ],
    }


def compute_tool(task: str) -&amp;gt; dict:
    return {
        "tool": "compute",
        "task": task,
        "result": 42,  # we pretend the tool computed something important
    }



# -------------------------------------------------------
# Agent prompt and structured tool execution
# -------------------------------------------------------
class StructuredAgent(AgentCore):

    def parse_steps(self, plan: str):
        """Extract step lines starting with numbers."""
        lines = plan.split("\n")
        steps = []
        for line in lines:
            match = STEP_REGEX.search(line.strip())
            if match:
                cleaned = match.group(2).strip()
                steps.append(cleaned)
        return steps

    def execute_step(self, step: str):
        step_lower = step.lower()

        if "search" in step_lower:
            return search_tool(step)

        if "calculate" in step_lower or "compute" in step_lower:
            return compute_tool(step)

        # fallback: let the model reason
        return self.reason(step)

    def run(self, goal: str):
        PLAN_PROMPT = f"""You are a task decomposition engine.
Your ONLY job is to break the user's goal into a small set of concrete, functional steps.
Your outputs MUST stay within the domain of the user’s goal.  
If the goal references football, metrics, or sports, remain in that domain only.

RULES:
- Only return steps directly needed to complete the user’s goal.
- Do NOT invent topics, examples, reviews, or unrelated domains.
- Do NOT expand into full explanations.
- No marketing language.
- No creative writing.
- No assumptions beyond the user's exact goal.
- No extra commentary.

FORMAT:
1. &amp;lt;short step&amp;gt;
2. &amp;lt;short step&amp;gt;
3. &amp;lt;short step&amp;gt;

User goal: "{goal}"
"""
        plan = self.llm(PLAN_PROMPT)
        steps = self.parse_steps(plan)

        outputs = []
        for step in steps:
            outputs.append({
                "step": step,
                "output": self.execute_step(step)
            })
        return outputs


# -------------------------------------------------------
# User Facing Agent (Formatted Output Layer)
# -------------------------------------------------------
class FinalAgent(StructuredAgent):
    def respond(self, goal: str):
        results = self.run(goal)

        formatted = "\n".join(
            f"- **{r['step']}** → {r['output']}"
            for r in results
        )

        return (
            f"## Result for goal: *{goal}*\n\n"
            f"{formatted}\n"
        )


# -------------------------------------------------------
# Test Cases
# -------------------------------------------------------
if __name__ == "__main__":
    agent = FinalAgent(llm=OllamaLLM("llama3"))

    tests = [
        "Compare NFL quarterback efficiency metrics and summarize insights.",
        "Search for top training drills for youth football players.",
        "Compute a simple metric and explain how you'd structure the process.",
    ]

    for i, t in enumerate(tests, 1):
        print("=" * 70)
        print(f"TEST {i}: {t}")
        print(agent.respond(t))
        print()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Sample Output
&lt;/h3&gt;

&lt;p&gt;======================================================================&lt;br&gt;
TEST 1: Compare NFL quarterback efficiency metrics and summarize insights.&lt;/p&gt;

&lt;h2&gt;
  
  
  Result for goal: &lt;em&gt;Compare NFL quarterback efficiency metrics and summarize insights.&lt;/em&gt;
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Gather data on NFL quarterback statistics&lt;/strong&gt; → Here are some key statistics for NFL quarterbacks, gathered from various sources including Pro-Football-Reference.com, ESPN, and NFL.com:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Passing Statistics&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Career Passing Yards:&lt;/strong&gt;
    + Tom Brady: 73,517 yards (most in NFL history)
    + Drew Brees: 72,503 yards
    + Peyton Manning: 71,940 yards&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Career Touchdowns:&lt;/strong&gt; 
    + Tom Brady: 624 touchdowns
    + Drew Brees: 571 touchdowns
    + Aaron Rodgers: 462 touchdowns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Interceptions:&lt;/strong&gt; 
    + Brett Favre: 336 interceptions (most in NFL history)
    + Eli Manning: 244 interceptions
    + Philip Rivers: 234 interceptions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Completion Percentage:&lt;/strong&gt; 
    + Aaron Rodgers: 65.5% completion percentage (highest in NFL history)
    + Drew Brees: 64.7%
    + Tom Brady: 63.4%&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Rushing Statistics&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Career Rushing Yards:&lt;/strong&gt; 
    + Cam Newton: 5,442 yards
    + Russell Wilson: 3,911 yards
    + Michael Vick: 3,844 yards&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Career Rushing Touchdowns:&lt;/strong&gt; 
    + Cam Newton: 85 rushing touchdowns (most among QBs)
    + Russell Wilson: 44 rushing touchdowns
    + Michael Vick: 36 rushing touchdowns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Other Statistics&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Win-Loss Record:&lt;/strong&gt; 
    + Tom Brady: 230-72 regular season record (best among QBs)
    + Drew Brees: 208-115 regular season record
    + Peyton Manning: 208-141 regular season record&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Playoff Wins:&lt;/strong&gt; 
    + Tom Brady: 32 playoff wins (most in NFL history)
    + Joe Montana: 23 playoff wins
    + Terry Bradshaw: 20 playoff wins&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note: These statistics are accurate as of the end of the 2020 NFL season and may change over time.&lt;/p&gt;

&lt;p&gt;I hope this helps! Let me know if you have any specific questions or if there's anything else I can help with.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Identify relevant efficiency metrics (e.g., passer rating, yards per attempt)&lt;/strong&gt; → Here are some common efficiency metrics used to evaluate quarterbacks:&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Passer Rating&lt;/strong&gt;: A quarterback's cumulative performance based on completion percentage, yards per attempt, touchdowns, and interceptions.
    * Formula: (Completions - Attempts + Touchdowns * 5 + Interceptions * -2) / Attempted Passes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Yards Per Attempt (YPA)&lt;/strong&gt;: Measures a quarterback's average yardage gained per pass attempt.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Completion Percentage&lt;/strong&gt;: The percentage of passes completed out of total attempts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Touchdown-to-Interception Ratio&lt;/strong&gt;: Evaluates a quarterback's ability to score touchdowns compared to throwing interceptions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Red Zone Efficiency&lt;/strong&gt;: Measures a quarterback's success in scoring touchdowns within the opponent's 20-yard line (e.g., red zone).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Third Down Conversion Percentage&lt;/strong&gt;: Assesses a quarterback's ability to convert third-down plays into first downs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fourth Quarter Points Per Game&lt;/strong&gt;: Evaluates a quarterback's performance in critical, late-game situations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Adjusted Net Yards Per Attempt (ANY/A)&lt;/strong&gt;: A more advanced metric that adjusts for opponent strength and incorporates additional factors like sacks and fumbles.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Other efficiency metrics used to evaluate quarterbacks include:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Quarterback Wins&lt;/strong&gt;: A simple measure of a quarterback's contribution to their team's win-loss record.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Passer Rating Average per Game&lt;/strong&gt;: A variation of the traditional passer rating formula, adjusted for games played.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sack Rate&lt;/strong&gt;: Measures a quarterback's frequency of being sacked relative to their total pass attempts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fumble Rate&lt;/strong&gt;: Evaluates a quarterback's tendency to fumble the ball relative to their total pass attempts.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Keep in mind that each metric has its strengths and weaknesses, and no single efficiency metric can fully capture a quarterback's performance.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Calculate averages and rankings for each quarterback&lt;/strong&gt; → {'tool': 'compute', 'task': 'Calculate averages and rankings for each quarterback', 'result': 42}&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>beginners</category>
      <category>python</category>
    </item>
    <item>
      <title>How AI Thinks It Thinks: ChatGPT, Copilot and Gemini Explain Themselves</title>
      <dc:creator>Sanskriti</dc:creator>
      <pubDate>Tue, 02 Sep 2025 04:26:06 +0000</pubDate>
      <link>https://dev.to/sansbuilds/how-ai-thinks-it-thinks-chatgpt-copilot-and-gemini-explain-themselves-1f3k</link>
      <guid>https://dev.to/sansbuilds/how-ai-thinks-it-thinks-chatgpt-copilot-and-gemini-explain-themselves-1f3k</guid>
      <description>&lt;p&gt;When building LangChain agents, understanding how your LLM "reasons" isn’t just a technical curiosity, it’s critical for debugging. They operate as probabilistic engines trained on vast datasets, generating outputs that mimic reasoning, planning, and even self awareness. Whether they’re glorified autocomplete systems or emergent cognitive frameworks, they’re shaping how we build, debug, and deploy intelligent workflows. To truly understand what these models can and can’t do, I went straight to the source. I prompted ChatGPT, Microsoft Copilot, and Google Gemini to explain their own architecture, reasoning process, and limitations. The best way to learn how they think is to ask them to think about themselves. Here’s what they said.&lt;/p&gt;

&lt;h3&gt;
  
  
  Questions Asked
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;How do you work?&lt;/li&gt;
&lt;li&gt;Can you explain more technically?&lt;/li&gt;
&lt;li&gt;What’s the actual math behind it?&lt;/li&gt;
&lt;li&gt;How do you create your own output with reason if all you are doing is token prediction?&lt;/li&gt;
&lt;li&gt;Can you explain it step by step like I’m 5?&lt;/li&gt;
&lt;li&gt;What are your limitations today?&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Response Summary
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Question&lt;/th&gt;
&lt;th&gt;ChatGPT&lt;/th&gt;
&lt;th&gt;Copilot&lt;/th&gt;
&lt;th&gt;Gemini&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1. How do you work?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Describes token prediction, training on large datasets, and emergent reasoning.&lt;/td&gt;
&lt;td&gt;Explains transformer architecture and modular orchestration.&lt;/td&gt;
&lt;td&gt;Covers transformer basics and training data.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2. Can you explain more technically?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Introduces embeddings, attention, and sampling with equations.&lt;/td&gt;
&lt;td&gt;Details inference pipeline, memory, and tool integration.&lt;/td&gt;
&lt;td&gt;Describes self-attention and model layers.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3. What’s the actual math behind it?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Shares softmax and sampling logic.&lt;/td&gt;
&lt;td&gt;Explains conditional probability and token scoring.&lt;/td&gt;
&lt;td&gt;Mentions attention weights and vector math.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;4. How do you create your own output with reason if all you are doing is token prediction?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Frames reasoning as emergent behavior from token selection.&lt;/td&gt;
&lt;td&gt;Simulates reasoning and planning via intermediate steps and tool calls.&lt;/td&gt;
&lt;td&gt;Says reasoning emerges from scale and training.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;5. Can you explain it step by step like I’m 5?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Uses analogies like smart autocomplete and storytelling.&lt;/td&gt;
&lt;td&gt;Describes step-by-step planning like a checklist.&lt;/td&gt;
&lt;td&gt;Uses library and storytelling metaphors.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;6. What are your limitations today?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Hallucinations, context limits, bias, no memory.&lt;/td&gt;
&lt;td&gt;No consciousness, no intent, sandboxed tools.&lt;/td&gt;
&lt;td&gt;Hallucinations, bias, static knowledge, no real reasoning.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Verdict: Who Explained It Best?
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Clarity&lt;/th&gt;
&lt;th&gt;Depth&lt;/th&gt;
&lt;th&gt;Conciseness&lt;/th&gt;
&lt;th&gt;Dev Utility&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ChatGPT&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Copilot&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini&lt;/td&gt;
&lt;td&gt;Y&lt;/td&gt;
&lt;td&gt;Maybe&lt;/td&gt;
&lt;td&gt;Maybe&lt;/td&gt;
&lt;td&gt;Maybe&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Legend: Y = explained in detail; Maybe = partially explained.&lt;/p&gt;

&lt;h3&gt;
  
  
  Takeaway for Agent Builders
&lt;/h3&gt;

&lt;p&gt;If you're designing agents that need to simulate reasoning, learn, and make decisions, Copilot’s modular thinking and planning metaphors are especially useful. ChatGPT is great for understanding the math and mechanics. Gemini is fine but less actionable.&lt;/p&gt;

&lt;p&gt;Try it yourself and let’s discuss in the comments.&lt;/p&gt;

&lt;p&gt;Versions used - &lt;br&gt;
Gemini 2.5 Flash&lt;br&gt;
Copilot Quick Response Mode&lt;br&gt;
ChatGPT GPT-5&lt;/p&gt;

</description>
      <category>agentai</category>
      <category>aiengineer</category>
      <category>development</category>
      <category>ai</category>
    </item>
    <item>
      <title>LangChain + Ollama in the Wild: Hard Learned Lessons on Building a Custom LLM Agent</title>
      <dc:creator>Sanskriti</dc:creator>
      <pubDate>Wed, 20 Aug 2025 05:32:18 +0000</pubDate>
      <link>https://dev.to/sansbuilds/langchain-ollama-in-the-wild-hard-learned-lessons-on-building-a-custom-llm-agent-3i58</link>
      <guid>https://dev.to/sansbuilds/langchain-ollama-in-the-wild-hard-learned-lessons-on-building-a-custom-llm-agent-3i58</guid>
      <description>&lt;p&gt;I stumbled upon Ollama, an open-source application that makes it easy to run large language models locally with minimal setup. Integrating LangChain with Ollama is straightforward: you can wire up a model with just a few lines of code.&lt;/p&gt;

&lt;p&gt;Turning that integration into shippable code, something reliable enough to move beyond a demo, requires more care. Through trial and error, I ran into several pitfalls that made the difference between a quick prototype and a stable system.&lt;/p&gt;

&lt;p&gt;For this article, I’ll use a simple scheduling agent as the example to walk through four key lessons.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdaux8drgprcfguivvnyk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdaux8drgprcfguivvnyk.png" alt="meta-ollama-llama3" width="800" height="366"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;src-&lt;a href="https://ollama.com/public/blog/meta-ollama-llama3.png" rel="noopener noreferrer"&gt;https://ollama.com/public/blog/meta-ollama-llama3.png&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  1. Skipping Explicit Schemas
&lt;/h3&gt;

&lt;p&gt;If you let the model “be concise,” it will drift into natural language instead of structured outputs. The fix is to define JSON schemas directly in the system prompt.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SYSTEM_PROMPT = """
You are a calendaring assistant. Actions:

1. create_meeting
   { "person": string, "datetime": string, "reason": string }

2. reschedule_meeting
   { "person": string, "new_datetime": string }

3. cancel_meeting
   { "person": string }

4. escalate_issue
   { "reason": string }

Output only valid JSON:
{ "action": "&amp;lt;name&amp;gt;", "input": {...} }
"""
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
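
&lt;p&gt;A lightweight shape check (an illustration, not part of the original setup) can catch schema drift before a malformed payload reaches your dispatch logic:&lt;/p&gt;

```python
# Required string fields per action, mirroring the schemas in the system prompt.
SCHEMAS = {
    "create_meeting": {"person", "datetime", "reason"},
    "reschedule_meeting": {"person", "new_datetime"},
    "cancel_meeting": {"person"},
    "escalate_issue": {"reason"},
}

def validate_command(cmd: dict) -> bool:
    action = cmd.get("action")
    params = cmd.get("input", {})
    if action not in SCHEMAS:
        return False
    # Every declared field must be present and be a string.
    return all(isinstance(params.get(field), str) for field in SCHEMAS[action])

ok = validate_command({"action": "cancel_meeting", "input": {"person": "Alex"}})
bad = validate_command({"action": "cancel_meeting", "input": {}})
print(ok, bad)  # True False
```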



&lt;h3&gt;
  
  
  2. Post Processing Is Not Optional
&lt;/h3&gt;

&lt;p&gt;Even with a schema, the model sometimes returns almost-JSON: stray commas, comments, or extra text. That’s why you should always sanitize before parsing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import re, json

def clean_json(payload: str) -&amp;gt; str:
    no_comments = re.sub(r'//.*', '', payload)
    return re.sub(r',(\s*[}\]])', r'\1', no_comments).strip()

def parse_payload(raw: str):
    payload = clean_json(raw)
    return json.loads(payload)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
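
&lt;p&gt;For example, an almost-JSON payload with a comment and trailing commas round-trips cleanly (the helpers are repeated here so the snippet runs on its own):&lt;/p&gt;

```python
import re, json

def clean_json(payload: str) -> str:
    # Strip JS-style comments, then trailing commas before } or ].
    no_comments = re.sub(r'//.*', '', payload)
    return re.sub(r',(\s*[}\]])', r'\1', no_comments).strip()

# Invented example of the kind of "almost JSON" a model can emit.
raw = '''{
  "action": "create_meeting",  // model added a comment
  "input": {"person": "Alex", "datetime": "1pm",},
}'''

cmd = json.loads(clean_json(raw))
print(cmd["action"])  # create_meeting
```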



&lt;h3&gt;
  
  
  3. No Timeouts/Retries
&lt;/h3&gt;

&lt;p&gt;Network hiccups or model stalls will block your system if you don’t enforce limits. Ollama doesn’t provide retries or timeouts out of the box, so you need to add them at the call site.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from langchain_ollama import ChatOllama
from langchain.schema import SystemMessage, HumanMessage

llm = ChatOllama(model="llama3", temperature=0)

def call_with_retry(messages, retries=2):
    for attempt in range(retries):
        try:
            return llm.invoke(input=messages, timeout=10)  # enforce timeout
        except Exception:
            if attempt == retries - 1:
                raise
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
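
&lt;p&gt;The retry pattern is easiest to sanity-check without a model at all. Here a flaky stand-in callable (invented for illustration) fails once, then succeeds:&lt;/p&gt;

```python
def call_with_retry(fn, retries=2):
    # Generic version of the helper above: retry any callable.
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise

calls = {"n": 0}

def flaky():
    # Simulates a model stall on the first attempt only.
    calls["n"] += 1
    if calls["n"] == 1:
        raise TimeoutError("simulated stall")
    return "ok"

result = call_with_retry(flaky)
print(result, calls["n"])  # ok 2
```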



&lt;h3&gt;
  
  
  4. Ignoring Drifts
&lt;/h3&gt;

&lt;p&gt;As prompts or models change, outputs can silently drift. A schema that worked last week may suddenly fail. Adding lightweight regression checks helps you catch this early.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def test_golden_case():
    messages = [SystemMessage(content=SYSTEM_PROMPT),
                HumanMessage(content="Book a lunch with Alex at 1pm")]
    ai_msg = call_with_retry(messages)
    cmd = parse_payload(ai_msg.content)
    assert "action" in cmd and "input" in cmd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Putting It Together
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import json, re
from langchain_ollama import ChatOllama
from langchain.schema import SystemMessage, HumanMessage, AIMessage

# 1) Define your agent meeting functions as JSON in the system prompt
SYSTEM_PROMPT = """
You are a calendaring assistant. Actions:

1. create_meeting
   { "person": string, "datetime": string, "reason": string }

2. reschedule_meeting
   { "person": string, "new_datetime": string }

3. cancel_meeting
   { "person": string }

4. escalate_issue
   { "reason": string }

Output only valid JSON:
{ "action": "&amp;lt;name&amp;gt;", "input": {...} }
"""

# 2) Init the Ollama model
llm = ChatOllama(model="llama3", temperature=0)

# 3) AI message as cleaned up JSON
def parse_payload(raw: str):
    payload = clean_json(raw)
    return json.loads(payload)

# 4) Remove any JS-style comments and trailing commas in objects/arrays
def clean_json(payload: str) -&amp;gt; str:
    no_comments = re.sub(r'//.*', '', payload)
    return re.sub(r',(\s*[}\]])', r'\1', no_comments).strip()

# 5) Invoke llm model with retry limit
def call_with_retry(messages, retries=2):
    for attempt in range(retries):
        try:
            return llm.invoke(input=messages)
        except Exception:
            if attempt == retries - 1:
                raise

# 6) Execute the custom agent and return the result
def run_agent(user_request: str):
    messages = [SystemMessage(content=SYSTEM_PROMPT), HumanMessage(content=user_request)]
    ai_msg: AIMessage = call_with_retry(messages)
    cmd = parse_payload(ai_msg.content)

    action, params = cmd["action"], cmd["input"]

    if action == "create_meeting":
        return {"result": f"Created meeting with {params['person']} on {params['datetime']}"}

    elif action == "reschedule_meeting":
        return {"result": f"Rescheduled {params['person']} to {params['new_datetime']}"}

    elif action == "cancel_meeting":
        return {"result": f"Cancelled meeting with {params['person']}"}

    elif action == "escalate_issue":
        return {"result": f"Escalated due to: {params['reason']}"}

    else:
        return {"error": f"Unknown action: {action}"}

# 7) Optional - regression check
def test_golden_case():
    messages = [SystemMessage(content=SYSTEM_PROMPT),
                HumanMessage(content="Reschedule a lunch with Alex at 1pm")]
    ai_msg = call_with_retry(messages)
    cmd = parse_payload(ai_msg.content)
    print(cmd["action"])
    print(cmd["input"])
    assert "action" in cmd and "input" in cmd

# Application start
if __name__ == "__main__":
    query = "Book a call with Mr. Russell for next Thursday at 3 PST for a quick lunch"
    print(run_agent(query))
    # Optional validation step
    #test_golden_case()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;u&gt;Output:&lt;/u&gt;&lt;/strong&gt; {'result': 'Created meeting with Mr. Russell on 2023-03-16T15:00:00-08:00'}&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;LangChain + Ollama is fast to set up, but brittle if you skip guardrails. These small investments turn fragile code into a service you can trust.&lt;/p&gt;

&lt;h2&gt;
  
  
  QQ???
&lt;/h2&gt;

&lt;p&gt;The updated Ollama library now supports &lt;a href="https://ollama.com/blog/functions-as-tools" rel="noopener noreferrer"&gt;tools&lt;/a&gt;, so would you wrap the dispatch logic into a proper toolset (using LangChain’s Tool abstraction), or keep it closer to plain Python for control?&lt;/p&gt;

</description>
      <category>langchain</category>
      <category>ollama</category>
      <category>ai</category>
      <category>development</category>
    </item>
  </channel>
</rss>
