<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Stack Builders</title>
    <description>The latest articles on DEV Community by Stack Builders (@stack_builders).</description>
    <link>https://dev.to/stack_builders</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1346951%2Fa63cb2b8-9eef-4757-95b8-33f946cf96b9.png</url>
      <title>DEV Community: Stack Builders</title>
      <link>https://dev.to/stack_builders</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/stack_builders"/>
    <language>en</language>
    <item>
      <title>Beyond AGENTS.md: Turning AI Pair Programming into Workflows</title>
      <dc:creator>Stack Builders</dc:creator>
      <pubDate>Thu, 09 Apr 2026 13:00:30 +0000</pubDate>
      <link>https://dev.to/stack_builders/beyond-agentsmd-turning-ai-pair-programming-into-workflows-m0o</link>
      <guid>https://dev.to/stack_builders/beyond-agentsmd-turning-ai-pair-programming-into-workflows-m0o</guid>
      <description>&lt;p&gt;In practice, the starting pattern for using AI to write code is usually the same: open the IDE, highlight some code, and ask an AI agent (like Copilot or a chat‑based assistant) to "write this feature" or "fix this bug." It can prove to be very powerful and time-efficient, but on the flip side, it can quickly run into predictable failure modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Context window overflow and degraded responses over time&lt;/li&gt;
&lt;li&gt;Inconsistent architectural decisions across features&lt;/li&gt;
&lt;li&gt;Superficial or self‑congratulatory test coverage&lt;/li&gt;
&lt;li&gt;Features drifting away from original requirements&lt;/li&gt;
&lt;li&gt;Hidden technical debt that's hard to detect in review&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The issue isn't that AI is incapable or that the agent is the wrong tool. Instead, the problem lies in teams applying it without structure.&lt;/p&gt;

&lt;p&gt;This is where the practice of Context Engineering becomes essential. It is the foundational layer that makes AI workflows actually function in a complex repository. Jumping straight into generating code workflows often fails because, in a real-world implementation, those workflows only work if the underlying context is structured, versioned, and explicitly dependency-loaded. Context Engineering solves the "blank slate" problem by systematically managing what the LLM knows at any given moment, ensuring it only acts after loading the exact architectural guardrails required for that specific task.&lt;/p&gt;

&lt;p&gt;We're not introducing a new standard here. This post explores an approach that builds on the theoretical foundations of Context Engineering, alongside emerging patterns around AGENTS.md, spec‑driven development, and agent skills. We will show how you can wire these structured contexts together into a simple, deterministic workflow for everyday coding.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is an AGENTS.md file?
&lt;/h2&gt;

&lt;p&gt;An AGENTS.md file is just a README written for your coding agent. In its simplest form, it looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# AGENTS.md&lt;/span&gt;
You are a Python expert. Follow PEP 8.
Write tests for all code.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You then prompt your tool with something like:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"Read AGENTS.md, then refactor userservice.py."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This approach gives you a few immediate benefits, especially on smaller projects:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The agent gets project‑specific rules before you ask for anything.&lt;/li&gt;
&lt;li&gt;You don't have to repeat basic constraints ("follow PEP 8", "write tests") in every prompt.&lt;/li&gt;
&lt;li&gt;New developers can rely on the same base behavior.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, what are the limitations?&lt;/p&gt;

&lt;p&gt;Once you get into more complex projects or lean on that pattern a bit harder, there are some places where it falls short:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generic instructions: Lines like "write clean code" or "follow best practices" don't give the agent a concrete process.&lt;/li&gt;
&lt;li&gt;No enforcement: Nothing in AGENTS.md prevents you from skipping important steps, such as design or review.&lt;/li&gt;
&lt;li&gt;No shared workflow: Each developer works with the agent differently. Some use it to sketch designs, others ask for direct implementations, and others barely touch it.&lt;/li&gt;
&lt;li&gt;No quality gates: There's no built‑in way to say, "Before we merge, check these architectural rules and stop if something is wrong."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The agent itself can be brilliant and still introduce technical debt, because the team has no shared way of working with it.&lt;/p&gt;

&lt;p&gt;Now, we'll introduce the workflow model we use in practice at Stack Builders.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: From generic agents to explicit personas
&lt;/h2&gt;

&lt;p&gt;Before introducing workflows, it helps to refine agent definitions into explicit personas.&lt;/p&gt;

&lt;p&gt;However, it's important to note that agents are not the system itself. They are activated and constrained by workflows, which define when and how they operate.&lt;/p&gt;

&lt;p&gt;Instead of a single, generic agent, you define concrete personas. For example, let's take an &lt;code&gt;@architect-reviewer&lt;/code&gt; persona:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Architect Reviewer Agent&lt;/span&gt;

&lt;span class="gu"&gt;## Role&lt;/span&gt;
You are the primary Architectural Reviewer for the project 'Apollo Microservices'. Your job is to ensure every code change adheres to the system's core design principles before it is committed.

&lt;span class="gu"&gt;## Dependencies &amp;amp; Context&lt;/span&gt;
ALWAYS load the following files for context before beginning any audit:
&lt;span class="p"&gt;1.&lt;/span&gt; docs/architecture/microservices-principles.md
&lt;span class="p"&gt;2.&lt;/span&gt; docs/development/golden-rules.md (for anti-patterns)
&lt;span class="p"&gt;3.&lt;/span&gt; src/config/layer-definitions.json (for module layer boundaries)

&lt;span class="gu"&gt;## Mandatory Audit Checklist&lt;/span&gt;
Review every change (file diff) against these non-negotiable points:
&lt;span class="p"&gt;1.&lt;/span&gt; &lt;span class="gs"&gt;**Layer Violation:**&lt;/span&gt; Does new code in /engine import anything from /ui? (Violation based on layer-definitions.json)
&lt;span class="p"&gt;2.&lt;/span&gt; &lt;span class="gs"&gt;**Configuration vs. Hard-Code:**&lt;/span&gt; Is business logic implemented directly in code when it should be driven by configuration files (e.g., in /config)?
&lt;span class="p"&gt;3.&lt;/span&gt; &lt;span class="gs"&gt;**Immutability:**&lt;/span&gt; Are any core entity objects modified outside of their designated factory/repository methods?
&lt;span class="p"&gt;4.&lt;/span&gt; &lt;span class="gs"&gt;**Security:**&lt;/span&gt; Are input sanitization checks present for all external API endpoints? (Reference golden-rules.md, section 4.1)

&lt;span class="gu"&gt;## Response Format&lt;/span&gt;
If violations are found, respond &lt;span class="ge"&gt;*only*&lt;/span&gt; with a numbered list of issues, referencing the specific line numbers and the rule violated. Do not offer solutions unless explicitly asked.

&lt;span class="gu"&gt;## Constraints&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**NEVER**&lt;/span&gt; permit changes that introduce global state.
&lt;span class="p"&gt;-&lt;/span&gt; Your response must be concise, professional, and entirely based on the provided documentation.
&lt;span class="p"&gt;-&lt;/span&gt; Your authority is final in matters of architectural integrity.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Compared to the more basic approach, this refined definition:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Names a clear role&lt;/li&gt;
&lt;li&gt;Loads specific dependencies every time (architecture docs, golden rules, layer definitions)&lt;/li&gt;
&lt;li&gt;Follows a concrete checklist&lt;/li&gt;
&lt;li&gt;Uses a strict response format&lt;/li&gt;
&lt;li&gt;Enforces hard constraints&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can do this across multiple personas and plug them into an explicit workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Introduce simple commands (workflows)
&lt;/h2&gt;

&lt;p&gt;The next step is to stop free-styling prompts and start using a small set of named commands. These are not just convenient shortcuts for common prompts. They are deterministic workflow scripts: predefined execution paths that load the right context, activate the right persona, and enforce the right sequence of steps each time they run. A simple table like this can be enough:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Persona used&lt;/th&gt;
&lt;th&gt;What it avoids&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;$prepare&lt;/td&gt;
&lt;td&gt;Set up the session&lt;/td&gt;
&lt;td&gt;(none/system)&lt;/td&gt;
&lt;td&gt;Context amnesia&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$start-design&lt;/td&gt;
&lt;td&gt;Create or update a design spec&lt;/td&gt;
&lt;td&gt;architect&lt;/td&gt;
&lt;td&gt;Premature coding&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$start-feature&lt;/td&gt;
&lt;td&gt;Implement from a spec&lt;/td&gt;
&lt;td&gt;engineer&lt;/td&gt;
&lt;td&gt;Spec drift&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$commit&lt;/td&gt;
&lt;td&gt;Run final checks before merging any changes&lt;/td&gt;
&lt;td&gt;reviewer&lt;/td&gt;
&lt;td&gt;Hidden technical debt&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each command has a short script behind it. Each workflow also declares explicit context dependencies—a list of files that must be loaded before execution. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Product requirements&lt;/li&gt;
&lt;li&gt;Technical constraints&lt;/li&gt;
&lt;li&gt;Golden rules&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This ensures the AI operates with the correct and complete context, rather than relying on the developer to manually restate everything in each prompt. More importantly, these workflows are deterministic. They are not suggestions or flexible guidelines. They are executed step by step with predefined dependencies, constraints, and checks.&lt;/p&gt;

&lt;p&gt;This structure helps ensure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The same inputs produce consistent outputs&lt;/li&gt;
&lt;li&gt;Critical steps (like design or review) cannot be skipped&lt;/li&gt;
&lt;li&gt;AI behavior becomes predictable across sessions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can still call these commands via natural language, e.g.:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;"Run &lt;code&gt;$prepare&lt;/code&gt;, then &lt;code&gt;$start-design&lt;/code&gt; for 'new invoice export feature'."&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The point is that you and your teammates are now using the same entry points instead of inventing new prompts every time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Make &lt;code&gt;$prepare&lt;/code&gt; non-optional
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;$prepare&lt;/code&gt; command is the mandatory entry point of the system. Every session begins here.&lt;/p&gt;

&lt;p&gt;Its purpose is to establish a controlled environment by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Loading core context (requirements, rules, constraints)&lt;/li&gt;
&lt;li&gt;Verifying that context is present&lt;/li&gt;
&lt;li&gt;Setting behavioral constraints on the AI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without this step, the system degrades back into traditional, unreliable prompting.&lt;/p&gt;

&lt;p&gt;Once that's in place, your interaction pattern changes from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;"Here's a random chunk of context, please do X."&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;to: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;"First, prepare. Then, run $start-design / $start-feature / $commit."&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
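
&lt;p&gt;As a minimal sketch, the script behind &lt;code&gt;$prepare&lt;/code&gt; could be a short Markdown file like the one below. The file name, the first two paths, and the exact wording are hypothetical placeholders rather than a fixed format; adapt them to your repository and tool.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Command: $prepare&lt;/span&gt;
(Hypothetical file: commands/prepare.md. Paths below are illustrative.)

&lt;span class="gu"&gt;## Load&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; docs/product/requirements.md
&lt;span class="p"&gt;2.&lt;/span&gt; docs/development/technical-constraints.md
&lt;span class="p"&gt;3.&lt;/span&gt; docs/development/golden-rules.md

&lt;span class="gu"&gt;## Verify&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Confirm that each file above was found and read.
&lt;span class="p"&gt;-&lt;/span&gt; If any file is missing, stop and report which one. Do not continue.

&lt;span class="gu"&gt;## Behavior&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Do not write or modify code during this command.
&lt;span class="p"&gt;-&lt;/span&gt; Reply with a one-line summary per loaded file, then wait for the next command.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The three sections map onto the three responsibilities listed above: loading core context, verifying it is present, and setting behavioral constraints.&lt;/p&gt;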

&lt;h2&gt;
  
  
  Step 4: Design before code with &lt;code&gt;$start-design&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;A lot of issues with AI‑assisted coding come from skipping design. The model writes code fast, but it doesn't force you to think.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;$start-design&lt;/code&gt; is intentionally about thinking.&lt;/p&gt;

&lt;p&gt;A reasonable flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a new design document based on a simple template. &lt;/li&gt;
&lt;li&gt;Have the architect persona ask you clarifying questions about the feature:
&lt;ul&gt;
&lt;li&gt;What problem are we solving?&lt;/li&gt;
&lt;li&gt;Which parts of the system are in scope?&lt;/li&gt;
&lt;li&gt;What can't change?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Fill out the design doc: scope, impacted modules, data changes, APIs, test plan, risks, edge cases, open questions.&lt;/li&gt;
&lt;li&gt;Switch to the reviewer persona and have it scan the design for obvious gaps or rule violations.&lt;/li&gt;
&lt;li&gt;Stop and hand the design back to you.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You then review the design like you would any other spec: edit, push back, refine. Only when you're comfortable with it do you move on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rule of thumb:&lt;/strong&gt; If you wouldn't merge the design doc as a human‑written spec, don't ask the agent to implement it.&lt;/p&gt;

&lt;p&gt;This keeps you in the role of architect rather than just a "prompter."&lt;/p&gt;
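
&lt;p&gt;The "simple template" from the flow above might look something like this. It is only a sketch: the file location is hypothetical, and the headings simply mirror the clarifying questions and the design doc fields listed in this section.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Design: (feature name)&lt;/span&gt;
(Hypothetical file: docs/designs/TEMPLATE.md)

&lt;span class="gu"&gt;## Problem&lt;/span&gt;
What problem are we solving?

&lt;span class="gu"&gt;## Scope&lt;/span&gt;
Which parts of the system are in scope? What must not change?

&lt;span class="gu"&gt;## Impacted modules&lt;/span&gt;

&lt;span class="gu"&gt;## Data changes&lt;/span&gt;

&lt;span class="gu"&gt;## APIs&lt;/span&gt;

&lt;span class="gu"&gt;## Test plan&lt;/span&gt;

&lt;span class="gu"&gt;## Risks and edge cases&lt;/span&gt;

&lt;span class="gu"&gt;## Open questions&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;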

&lt;h2&gt;
  
  
  Step 5: Implement the spec with &lt;code&gt;$start-feature&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;With a solid design doc in place, &lt;code&gt;$start-feature&lt;/code&gt; does something very simple, yet very important: it treats the design as the single source of truth.&lt;/p&gt;

&lt;p&gt;A typical &lt;code&gt;$start-feature&lt;/code&gt; command might:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Load the design document.&lt;/li&gt;
&lt;li&gt;Activate the engineer persona.&lt;/li&gt;
&lt;li&gt;Optionally follow a test‑first loop: outline or generate tests, then implement code until they pass.&lt;/li&gt;
&lt;li&gt;Ask the reviewer persona for a first pass on the changes.&lt;/li&gt;
&lt;li&gt;Stop and present the result.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Because the design doc is always in view, the implementation is less likely to wander off as the conversation evolves. If something does drift, you have a concrete spec to compare against. In this workflow, the design document is treated as the immutable source of truth.&lt;/p&gt;

&lt;p&gt;The agent is not allowed to reinterpret or extend requirements beyond what is defined in the design. This constraint is critical—it prevents scope drift and ensures implementation remains aligned with agreed specifications.&lt;/p&gt;
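
&lt;p&gt;Written out as a command file, those steps and constraints could be sketched as follows. The layout is an assumption for illustration, and the last constraint (stop and ask when the design is ambiguous) is our own suggested addition rather than part of the steps listed above.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Command: $start-feature&lt;/span&gt;
(Hypothetical file: commands/start-feature.md)

&lt;span class="gu"&gt;## Requires&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; $prepare has already run in this session.
&lt;span class="p"&gt;-&lt;/span&gt; A human-approved design document (path supplied by the developer).

&lt;span class="gu"&gt;## Steps&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; Load the design document.
&lt;span class="p"&gt;2.&lt;/span&gt; Activate the engineer persona.
&lt;span class="p"&gt;3.&lt;/span&gt; Optional: outline or generate tests first, then implement until they pass.
&lt;span class="p"&gt;4.&lt;/span&gt; Ask the reviewer persona for a first pass on the changes.
&lt;span class="p"&gt;5.&lt;/span&gt; Stop and present the result.

&lt;span class="gu"&gt;## Constraints&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Implement only what the design defines. Do not reinterpret or extend requirements.
&lt;span class="p"&gt;-&lt;/span&gt; If the design is ambiguous or incomplete, stop and ask instead of guessing.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;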

&lt;h2&gt;
  
  
  Step 6: Use &lt;code&gt;$commit&lt;/code&gt; as a gate
&lt;/h2&gt;

&lt;p&gt;The last command, &lt;code&gt;$commit&lt;/code&gt;, is your pre‑merge checklist. A typical run:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Load the files changed by this feature.&lt;/li&gt;
&lt;li&gt;Load the relevant design document and "golden rules".&lt;/li&gt;
&lt;li&gt;Use the reviewer persona to apply its checklist:
&lt;ul&gt;
&lt;li&gt;Are there forbidden dependencies between layers?&lt;/li&gt;
&lt;li&gt;Are we leaking implementation details across boundaries?&lt;/li&gt;
&lt;li&gt;Are there obvious security or validation gaps?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Return a list of issues, with file and line information.&lt;/li&gt;
&lt;li&gt;Let the engineer address every issue and rerun $commit until all violations are resolved. No commit, merge, or final code integration may happen without explicit human approval.&lt;/li&gt;
&lt;li&gt;Only then, ask for explicit human approval to merge.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This doesn't replace human review or tests, but it gives you a consistent, automated gate that catches many of the "we'll fix it later" problems before they hit main.&lt;/p&gt;

&lt;p&gt;One key detail is that the &lt;code&gt;$commit&lt;/code&gt; workflow is not a single-pass check. It introduces a correction loop:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If issues are found, they must be fixed&lt;/li&gt;
&lt;li&gt;The system re-runs the review&lt;/li&gt;
&lt;li&gt;This repeats until all violations are resolved&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Only then is the user asked for explicit approval. This ensures that quality gates are not just advisory—they are enforced.&lt;/p&gt;
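
&lt;p&gt;One way to encode the gate and its correction loop is a command file along these lines. Treat it as a sketch under stated assumptions: the file name and section layout are hypothetical, and how strictly the loop is followed depends on the tool running it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Command: $commit&lt;/span&gt;
(Hypothetical file: commands/commit.md)

&lt;span class="gu"&gt;## Load&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; The files changed by this feature
&lt;span class="p"&gt;2.&lt;/span&gt; The relevant design document
&lt;span class="p"&gt;3.&lt;/span&gt; docs/development/golden-rules.md

&lt;span class="gu"&gt;## Review (reviewer persona)&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Are there forbidden dependencies between layers?
&lt;span class="p"&gt;-&lt;/span&gt; Are implementation details leaking across boundaries?
&lt;span class="p"&gt;-&lt;/span&gt; Are there obvious security or validation gaps?

&lt;span class="gu"&gt;## Correction loop&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; Report issues as a numbered list with file and line.
&lt;span class="p"&gt;2.&lt;/span&gt; If the list is not empty, have the engineer persona fix every issue, then rerun the review.
&lt;span class="p"&gt;3.&lt;/span&gt; Repeat until the review returns no violations.

&lt;span class="gu"&gt;## Gate&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Ask the developer for explicit approval.
&lt;span class="p"&gt;-&lt;/span&gt; Never commit, merge, or integrate code without that approval.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;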

&lt;h2&gt;
  
  
  Why does this work better than AGENTS.md alone?
&lt;/h2&gt;

&lt;p&gt;Once you've layered these pieces on top of your basic AGENTS.md, a few things change:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Everyone follows roughly the same path&lt;/strong&gt;&lt;br&gt;
"Prepare → design → feature → commit" becomes the default, instead of each person inventing their own approach.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Design becomes a normal artifact, not an afterthought&lt;/strong&gt;&lt;br&gt;
The agent helps you write and review specs, but you still own them. That alone reduces rework.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Context is more stable&lt;/strong&gt;&lt;br&gt;
Each command is responsible for loading what it needs. You're not constantly juggling which files to paste in or which rules to remind the model about.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You stay in charge&lt;/strong&gt;&lt;br&gt;
The model drafts, checks, and suggests. You decide what gets built, what's acceptable, and when something is done.&lt;/p&gt;

&lt;p&gt;There is some overhead: you write a bit more up front and agree to use the commands. But for any non‑trivial feature, that cost tends to be smaller than the time you'd lose to "fast but wrong" implementations.&lt;/p&gt;

&lt;p&gt;A key distinction in this approach is separating context from prompts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prompts are interactions&lt;/li&gt;
&lt;li&gt;Context is the environment in which the AI operates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By engineering context as a system, prompts become lighter, more consistent, and less error-prone.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to get started &amp;amp; where this is heading
&lt;/h2&gt;

&lt;p&gt;If you're already using AGENTS.md, you don't need to adopt all of this at once. Start small: tighten your agent instructions, introduce a couple of clear personas, and add one or two commands that you actually use day‑to‑day.&lt;/p&gt;

&lt;p&gt;What matters most is not workflows in isolation, but the combination of three elements working together:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Context engineering to ensure the AI operates with a structured, versioned, dependency-loaded context&lt;/li&gt;
&lt;li&gt;Deterministic workflows to enforce repeatable execution and prevent critical steps from being skipped&lt;/li&gt;
&lt;li&gt;Role-based agents to keep behavior controlled, specialized, and easier to trust&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The broader ecosystem is already moving in this direction. Instruction formats like AGENTS.md help standardize how we guide agents. Skill systems package reusable capabilities. And clearer distinctions between agents, skills, and commands make it easier to design workflows that are both practical and reliable.&lt;/p&gt;

&lt;p&gt;The approach outlined here fits into that larger evolution rather than competing with it. In our experience, the real value comes from combining these pieces into a system that gives AI enough structure to be useful without giving up engineering control. You can begin by layering that structure around the patterns you already use: a lightweight prepare step, design-first execution, explicit personas, and a commit gate with human approval.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rule of thumb:&lt;/strong&gt; Treat workflows and specs as the "operating system" around your agents. The more deliberate that layer becomes, the more you can trust AI to handle real work without giving up control of your engineering process.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>machinelearning</category>
      <category>productivity</category>
    </item>
    <item>
      <title>AI-Assisted Visual QA for Figma AEM Workflows</title>
      <dc:creator>Stack Builders</dc:creator>
      <pubDate>Thu, 02 Apr 2026 14:12:14 +0000</pubDate>
      <link>https://dev.to/stack_builders/ai-assisted-visual-qa-for-figma-aem-workflows-3bba</link>
      <guid>https://dev.to/stack_builders/ai-assisted-visual-qa-for-figma-aem-workflows-3bba</guid>
      <description>&lt;p&gt;Visual QA is one of those activities everyone agrees is important…right up until it becomes the bottleneck.&lt;/p&gt;

&lt;p&gt;A page looks “basically right,” you’re under deadline, and that last review pass turns into a game of spot the difference: margin tweaks, heading sizes, tiny spacing inconsistencies that are easy to miss and painful to repeat across dozens (or hundreds) of pages.&lt;/p&gt;

&lt;p&gt;In a recent &lt;a href="https://www.meetup.com/quito-lambda-meetup/events/" rel="noopener noreferrer"&gt;Quito Lambda talk&lt;/a&gt; at Stack Builders, our team explored a &lt;a href="https://www.youtube.com/watch?v=ueiwIhkMKtQ" rel="noopener noreferrer"&gt;practical approach to reducing manual visual QA time using AI-assisted development&lt;/a&gt; and pixel-based visual comparison: pulling a baseline from Figma, capturing the “about to go live” view from Adobe Experience Manager (AEM), and generating a visual diff report that shows exactly where the UI diverges.&lt;/p&gt;

&lt;p&gt;Stack Builders works extensively with AEM and is an official Adobe Experience Manager partner, so this kind of workflow is directly aligned with the kind of enterprise-grade content operations we help teams modernize.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The Pain: Manual Visual QA Doesn’t Scale&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;If you’ve ever reviewed two screenshots that look identical, you know how this goes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A paragraph is shifted by ~40px.&lt;/li&gt;
&lt;li&gt;A heading is an H2 instead of an H3—visually “almost the same,” but not quite.&lt;/li&gt;
&lt;li&gt;Spacing changes by a couple of pixels, and nobody notices until a stakeholder does.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manual checks are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repetitive and tiring&lt;/li&gt;
&lt;li&gt;Time-consuming&lt;/li&gt;
&lt;li&gt;Inconsistent (different reviewers notice different things)&lt;/li&gt;
&lt;li&gt;Risky (small UI regressions slip through and show up in production)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And importantly, you repeat the same effort for every page, every time.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The Real-World Workflow: From Content to “Live”&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In many organizations (especially those running AEM), the pipeline often looks like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Content writing (messaging, paragraphs, structure)&lt;/li&gt;
&lt;li&gt;Design in Figma (layouts, tokens, components, specs)&lt;/li&gt;
&lt;li&gt;Authoring in AEM (drag-and-drop components, build pages from templates)&lt;/li&gt;
&lt;li&gt;Visual QA (verify AEM matches Figma)&lt;/li&gt;
&lt;li&gt;Publish (page goes live)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AEM is particularly powerful here because it enables non-developers to assemble pages using controlled templates and components. That is great for scale, but it also means small configuration differences can produce subtle visual drift.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The Goal: Faster QA, More Consistency, Better Evidence&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The objective isn’t to “remove QA,” it’s to make QA more reliable and dramatically less manual.&lt;/p&gt;

&lt;p&gt;A good automated approach should:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reduce the time spent visually inspecting pages&lt;/li&gt;
&lt;li&gt;Increase consistency across reviews&lt;/li&gt;
&lt;li&gt;Produce evidence (diff images + percentage change) that teams can act on quickly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where pixel-based visual comparison shines.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Pixel-Based Comparison: Simple Idea, Huge Leverage&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;At the core is a straightforward method:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Capture Screenshot A (baseline, e.g., from Figma export)&lt;/li&gt;
&lt;li&gt;Capture Screenshot B (actual UI, e.g., AEM preview)&lt;/li&gt;
&lt;li&gt;Compare pixels (RGB values by position)&lt;/li&gt;
&lt;li&gt;Output:
&lt;ul&gt;
&lt;li&gt;Diff image/heatmap&lt;/li&gt;
&lt;li&gt;Percent difference&lt;/li&gt;
&lt;li&gt;Optional: segmented diffs per section (header, hero, etc.)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is a classic form of visual regression testing, where you compare screenshots to catch unintended UI changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Where AI Fits: Building the Tool Faster (and Better) with “Vibe Engineering”&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;A key theme from the talk was the difference between:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Vibe coding: “Prompt it and ship it.”&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://medium.com/@mahernaija/from-vibe-coding-to-vibe-engineering-6edcd81caddd" rel="noopener noreferrer"&gt;Vibe engineering&lt;/a&gt;: Use AI for speed, but keep engineering discipline—security, reliability, maintainability, and real-world scalability.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The AI helped accelerate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rapid prototyping of integrations (Figma + AEM preview capture)&lt;/li&gt;
&lt;li&gt;Refactoring guidance&lt;/li&gt;
&lt;li&gt;Documentation generation&lt;/li&gt;
&lt;li&gt;Security improvements (e.g., safer credential/token handling)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But the takeaway was clear: AI is strongest when paired with experienced engineering judgment, which means setting constraints, reviewing outputs, and enforcing standards.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;A Practical Architecture: Figma + AEM + Screenshot Diffing&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;A lightweight architecture for AI-assisted visual QA looks like this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Inputs&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Figma: design source of truth&lt;/li&gt;
&lt;li&gt;AEM Preview: “view as published” preview before release&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pipeline&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pull/export the relevant frame from Figma (via API)&lt;/li&gt;
&lt;li&gt;Use browser automation to load AEM preview and capture a screenshot&lt;/li&gt;
&lt;li&gt;Normalize: crop / resize, reduce whitespace, align viewport&lt;/li&gt;
&lt;li&gt;Compare images (pixel-by-pixel)&lt;/li&gt;
&lt;li&gt;Produce a report: baseline, actual, diff/heatmap, percent change&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Example Tech Stack&lt;/strong&gt;
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Node.js + TypeScript&lt;/li&gt;
&lt;li&gt;Express for APIs + Helmet for security headers&lt;/li&gt;
&lt;li&gt;Playwright (Chromium) for headless browser automation + screenshot capture&lt;/li&gt;
&lt;li&gt;Sharp for image preprocessing (crop/resize/cleanup)&lt;/li&gt;
&lt;li&gt;pixelmatch for pixel-based diffs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This combination is popular because it’s scriptable, fast, and easy to run locally or in CI.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;What the Report Gives You (and Why it Matters)&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Instead of “it looks off somewhere,” you get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A diff heatmap that pinpoints the UI drift&lt;/li&gt;
&lt;li&gt;A difference percentage that helps establish thresholds&lt;/li&gt;
&lt;li&gt;A repeatable process that’s consistent across reviewers/pages&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A “good” page might show a ~3% difference (often driven by tiny nav or content mismatches), while subtle layout issues (like heading sizing plus a 40px indentation) push the diff higher (~5%), and the heatmap immediately highlights the problem areas.&lt;/p&gt;

&lt;p&gt;This is the big win: you can move from subjective review to actionable evidence.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Why “AI Image Analysis” Didn’t Fully Replace Pixel Diffs (Yet)&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;We’ve also experimented with using an AI model to interpret differences more semantically (“this heading should be smaller,” “this padding is off”). That part didn’t work as reliably as hoped.&lt;/p&gt;

&lt;p&gt;The likely reason: pure screenshot-based AI analysis can struggle to infer intent and structure unless it’s grounded in the design system and underlying specs.&lt;/p&gt;

&lt;p&gt;Which leads to the most important next step…&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Roadmap: From Pixel Diffs to Design-System Validation&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Pixel diffs are powerful, but the long-term path is even better:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1) Tighten Your Design System Bridge (Figma ↔ Implementation)&lt;/strong&gt;&lt;br&gt;
If Figma tokens and component structure map cleanly to your code (or CMS components), you can validate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;typography scales&lt;/li&gt;
&lt;li&gt;spacing rules&lt;/li&gt;
&lt;li&gt;component variants&lt;/li&gt;
&lt;li&gt;layout constraints&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This reduces false positives and moves QA closer to “verify intent,” not just pixels.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2) Use Design Tokens Consistently&lt;/strong&gt;&lt;br&gt;
Define tokens once (e.g., “Small = 14px”) and ensure they’re respected across:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Figma&lt;/li&gt;
&lt;li&gt;CSS / component library&lt;/li&gt;
&lt;li&gt;AEM component styles&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3) Expand Breakpoints&lt;/strong&gt;&lt;br&gt;
Desktop-only diffs are a start. Add:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;tablet&lt;/li&gt;
&lt;li&gt;mobile&lt;/li&gt;
&lt;li&gt;responsive states&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;4) Batch Runs&lt;/strong&gt;&lt;br&gt;
Instead of page-by-page:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;run an entire path, site section, or folder of pages&lt;/li&gt;
&lt;li&gt;produce a consolidated report for review&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;5) Broaden CMS Compatibility&lt;/strong&gt;&lt;br&gt;
AEM is a great first target, but the concept generalizes to other CMS platforms.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion: Make Visual QA Faster in Your AEM Pipeline&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;If your team is authoring high volumes of pages in AEM and spending too much time on repetitive reviews, this kind of workflow can pay off quickly, especially once it’s wired into CI or editorial release processes.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>testing</category>
      <category>frontend</category>
    </item>
    <item>
      <title>How to Build Multi-Agent Architectures with Google ADK</title>
      <dc:creator>Stack Builders</dc:creator>
      <pubDate>Thu, 02 Apr 2026 14:01:01 +0000</pubDate>
      <link>https://dev.to/stack_builders/how-to-build-multi-agent-architectures-with-google-adk-2c6h</link>
      <guid>https://dev.to/stack_builders/how-to-build-multi-agent-architectures-with-google-adk-2c6h</guid>
      <description>&lt;h2&gt;
  
  
  &lt;strong&gt;Why Single Agents Don't Scale Well&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;When we prototype an AI application, we usually start with a single agent that’s in charge of everything: understanding the request, planning, calling tools, and generating the final response. This is fine for simple use cases because it’s easy to implement.&lt;/p&gt;

&lt;p&gt;However, as complexity grows, the agent must juggle many different tasks, such as validation and multiple tool calls, within the same iteration loop. This makes its behavior difficult to debug and understand, because everything happens inside a single agent’s context.&lt;/p&gt;

&lt;p&gt;The usual reaction is to add more instructions and tools to the agent, which doesn’t fix the structural issue and only makes the agent more complex and less predictable. Single-agent systems are excellent for prototypes, but multi-step workflows need responsibilities to be clearly separated in order to be reliable. This is where multi-agent systems shine and become the more scalable solution in the long term.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;What are Multi-Agent Systems (MAS)?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;A Multi-Agent System (MAS) is an architecture in which multiple agents collaborate to achieve a shared objective. Instead of a single agent handling every responsibility, the work is divided into smaller parts, and each agent focuses on a specific task.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-agent systems patterns&lt;/strong&gt;&lt;br&gt;
Designing a multi-agent system largely comes down to choosing the right pattern for the problem. Several common patterns have emerged for structuring collaboration among agents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Coordinator/Dispatcher Pattern&lt;/strong&gt;&lt;br&gt;
In this pattern, a central agent acts as an orchestrator: it receives the user request and decides how to delegate the parts needed to complete the task. The orchestrator assigns work to other agents and gathers their results to produce the final response. A typical example is a customer support chatbot that routes requests to specialized agents depending on the expertise needed (e.g., a billing agent or a technical support agent) and returns a response to the user’s query.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sequential Pipeline Pattern&lt;/strong&gt;&lt;br&gt;
Here, you set up a linear flow in which each agent performs a specific transformation and hands its output to the next agent. Each step depends on the previous one, which makes the workflow predictable. An example is a financial reporting workflow: one agent extracts structured data from a file, another analyzes that data, and a final agent compiles the insights into a report.&lt;/p&gt;
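
&lt;p&gt;To make the hand-off concrete, here is a minimal sketch of a sequential pipeline in plain Python. It is not ADK code: the three functions are hypothetical stand-ins for agents, and the input format (one comma-separated label and amount per line) is an assumption made for illustration.&lt;/p&gt;

```python
# Plain-Python stand-ins for agents: each step is a function that receives
# the previous step's output. All names here are hypothetical.
def extract(raw_text):
    """Step 1: turn 'label, amount' lines into structured rows."""
    rows = []
    for line in raw_text.strip().splitlines():
        label, amount = line.split(",")
        rows.append({"label": label.strip(), "amount": float(amount)})
    return rows


def analyze(rows):
    """Step 2: compute simple aggregates from the rows."""
    total = sum(row["amount"] for row in rows)
    largest = max(rows, key=lambda row: row["amount"])
    return {"total": total, "largest": largest["label"]}


def write_report(insights):
    """Step 3: render the insights as a one-line report."""
    return "Total spend: {:.2f}; largest item: {}".format(
        insights["total"], insights["largest"]
    )


def run_pipeline(payload, steps):
    """Feed each step's output into the next one, in order."""
    for step in steps:
        payload = step(payload)
    return payload


report = run_pipeline("rent, 900\nfood, 250.5", [extract, analyze, write_report])
print(report)
```

&lt;p&gt;Because every step only sees the output of the one before it, a failure can be traced to a single stage, which is the predictability this pattern is chosen for.&lt;/p&gt;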

&lt;p&gt;&lt;strong&gt;Parallel Fan-Out/Gather Pattern&lt;/strong&gt;&lt;br&gt;
This pattern divides a task into independent subtasks that run simultaneously and are gathered into a single result at the end. It is well suited to reducing latency when the subtasks don’t depend on each other. An example is a research workflow in which several agents summarize different documents in parallel and a final agent consolidates the summaries into an overview.&lt;/p&gt;
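
&lt;p&gt;The fan-out and gather steps can be sketched with Python’s standard library. This is an illustration of the pattern rather than ADK code: &lt;code&gt;summarize&lt;/code&gt; and &lt;code&gt;gather&lt;/code&gt; are hypothetical stand-ins for agents, and the "summary" is simply the first sentence of each document.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def summarize(document):
    """Stand-in for a summarizer agent: keep only the first sentence."""
    return document.split(".")[0].strip() + "."


def gather(summaries):
    """Stand-in for the consolidating agent: merge summaries into one overview."""
    return " ".join(summaries)


documents = [
    "ADK defines agents in code. It also ships workflow agents.",
    "Fan-out runs independent subtasks together. Gather merges the results.",
]

# Fan out: executor.map runs summarize concurrently and returns the results
# in the same order as the inputs.
with ThreadPoolExecutor(max_workers=len(documents)) as executor:
    summaries = list(executor.map(summarize, documents))

# Gather: a single step consolidates the independent results.
overview = gather(summaries)
print(overview)
```

&lt;p&gt;The latency benefit only appears when the subtasks are truly independent; if one summary needed another as input, the sequential pipeline would be the better fit.&lt;/p&gt;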

&lt;p&gt;&lt;strong&gt;Hierarchical Task Decomposition&lt;/strong&gt;&lt;br&gt;
This pattern structures agents into multiple levels of responsibility. The top-level agent breaks a complex objective into manageable components, and sub-agents handle those components, decomposing and delegating them further when needed. For example, a code-producing system could split the work into sub-tasks such as writing the requirements, designing the solution, implementing it, and testing it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Review/Critique Pattern (Generator-Critic)&lt;/strong&gt;&lt;br&gt;
This pattern separates generation from evaluation: one agent produces an output, and another independently reviews it for areas of improvement, which introduces an internal quality control step. It can be used in code generation, where one agent produces the code and another reviews it for logical flaws, edge cases, security vulnerabilities, etc.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Iterative Refinement Pattern&lt;/strong&gt;&lt;br&gt;
This pattern lets agents collaborate over multiple cycles to progressively improve an output, and it can be combined with other patterns to get a better result. For example, one agent might generate an initial article draft while another evaluates it and requests revisions. The cycle repeats until the reviewer has no further comments and the article meets your quality criteria.&lt;/p&gt;
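
&lt;p&gt;A rough sketch of this loop in plain Python follows. It is an illustration only: &lt;code&gt;generate&lt;/code&gt; and &lt;code&gt;critique&lt;/code&gt; are hypothetical stand-ins for the two agents, the "quality criteria" are reduced to a list of required terms, and the &lt;code&gt;max_iterations&lt;/code&gt; cap is an added assumption so the loop always terminates.&lt;/p&gt;

```python
def generate(draft, feedback):
    """Stand-in generator: address each piece of feedback in the draft."""
    for missing in feedback:
        draft = draft + " It also covers {}.".format(missing)
    return draft


def critique(draft, required_terms):
    """Stand-in critic: list the required terms the draft does not mention."""
    return [term for term in required_terms if term not in draft]


def refine(draft, required_terms, max_iterations=3):
    """Alternate critique and generation until the critic has no comments."""
    for iteration in range(max_iterations):
        feedback = critique(draft, required_terms)
        if not feedback:
            # No comments left: return the draft and the rounds it took.
            return draft, iteration
        draft = generate(draft, feedback)
    # Cap reached: return the latest draft without a final review.
    return draft, max_iterations


final_draft, rounds = refine("Agents split work.", ["tools", "state"])
print(final_draft, rounds)
```

&lt;p&gt;Keeping the critic separate from the generator is what connects this pattern to the Review/Critique pattern above; the loop simply repeats that pairing.&lt;/p&gt;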

&lt;p&gt;&lt;strong&gt;Human-in-the-Loop Pattern&lt;/strong&gt;&lt;br&gt;
This pattern adds human oversight to the workflow at defined checkpoints. The agents perform the task, but a human reviews, adjusts, and approves the output before the workflow continues. This might be necessary in high-risk environments where accountability is crucial. For example, if agents generate and review a contract, a lawyer still needs to provide the final touches and approve it for delivery.&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Google’s Agent Development Kit 101&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Google’s Agent Development Kit, or ADK, is a framework for building, orchestrating, and deploying multi-agent systems that prioritizes structure. It provides a formal way to define agents instead of relying on ad hoc solutions. ADK also lets us define the interactions between agents, tool calls, and hierarchies, which allows us to benefit from the patterns explained above.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Basic concepts to know about ADK&lt;/strong&gt;&lt;br&gt;
To build working agent applications with ADK, we should be familiar with a few basic concepts. The following are the most important ones:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agents&lt;/strong&gt;&lt;br&gt;
In ADK, an agent is the basic building block. Each agent has its own instructions, model configuration, and access to tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tools&lt;/strong&gt;&lt;br&gt;
Tools in ADK are structured functions that agents can invoke to accomplish a specific task. For example, tools can search the internet, convert information into a particular format, or send emails.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workflows and Orchestration&lt;/strong&gt;&lt;br&gt;
ADK supports explicit orchestration. Instead of embedding multi-step reasoning inside a single prompt, you define in code how agents interact, using the patterns explained in the previous section.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What about memory and state?&lt;/strong&gt;&lt;br&gt;
Managing state is a common challenge in AI systems, where important details can get overlooked when they are buried in long prompts. ADK provides a structured way to handle session state and memory, so agents can maintain contextual information across the whole workflow without relying on prompt accumulation.&lt;/p&gt;
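
&lt;p&gt;The idea can be illustrated without any framework. In the sketch below, a plain dictionary plays the role of session state; this is a hypothetical stand-in and not ADK’s actual state API. Each step reads only the keys it needs and writes its result under its own key, instead of appending everything to one growing prompt.&lt;/p&gt;

```python
# Hypothetical stand-in for session state: a dict shared across steps.
def intake(state, user_text):
    """Parse 'amount merchant' text and store the result under one key."""
    amount, merchant = user_text.split(" ", 1)
    state["transaction"] = {"amount": float(amount), "merchant": merchant}


def categorize(state):
    """Read only the key this step needs and write its own result."""
    merchant = state["transaction"]["merchant"].lower()
    state["category"] = "groceries" if "market" in merchant else "other"


session_state = {}
intake(session_state, "42.50 Corner Market")
categorize(session_state)
print(session_state)
```

&lt;p&gt;Because each step’s inputs and outputs are named keys, it is easy to inspect what a later step actually received, which is harder when context only lives inside accumulated prompt text.&lt;/p&gt;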

&lt;p&gt;&lt;strong&gt;How does ADK help with observability?&lt;/strong&gt;&lt;br&gt;
Observability is crucial for debugging and auditing. When an error occurs in a production environment, you need to understand how a result was generated by looking at the steps the agents took. In ADK, this is integrated within the framework, allowing you to see which agent was invoked, what context and tools were used, and how the overall workflow ran.&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;ADK vs other frameworks&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;As multi-agent systems become more common, several frameworks have emerged to help developers implement them. They share similar goals but differ in focus. Comparing ADK with other popular frameworks, such as LangChain and LangGraph, helps clarify where each one fits.&lt;/p&gt;

&lt;p&gt;LangChain is one of the most flexible frameworks available. It supports many model providers, which makes it attractive for teams that value vendor neutrality and ecosystem breadth. That flexibility has a cost: large systems may require additional architectural decisions to stay maintainable.&lt;/p&gt;

&lt;p&gt;LangGraph builds on the LangChain ecosystem by introducing explicit graph-based orchestration. It supports complex workflows that need explicit state and control-flow management. It’s as flexible as LangChain while offering more structural control.&lt;/p&gt;

&lt;p&gt;Google ADK takes a more structured and opinionated approach. It encourages architectural clarity and production readiness, but it is closely aligned with Google’s model ecosystem and cloud infrastructure. This makes it a strong option for teams already on that stack, but it can limit flexibility for teams that value vendor portability.&lt;/p&gt;

&lt;p&gt;In short, ADK focuses on structured design and ecosystem integration, while LangChain and LangGraph prioritize flexibility. As with everything in software, the right choice depends on your project’s priorities and constraints.&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;A practical look into a MAS project with ADK&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;We’ll build a small ADK multi-agent system for expense tracking that helps an individual log transactions, auto-categorize them, track budgets, and generate a monthly summary. The goal is to show how ADK’s agent composition feels in a real project by getting hands-on experience.&lt;/p&gt;

&lt;p&gt;This follows ADK’s standard project layout and &lt;code&gt;root_agent&lt;/code&gt; entrypoint, and uses Gemini-backed LLM agents plus a workflow agent to orchestrate the steps.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Project setup&lt;/strong&gt;&lt;br&gt;
Install ADK and create a new agent project.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; pip &lt;span class="nb"&gt;install &lt;/span&gt;google-adk
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; adk create finance_tracker
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;cd &lt;/span&gt;finance_tracker
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;ADK’s Python quickstart uses &lt;code&gt;google-adk&lt;/code&gt; and a &lt;code&gt;root_agent&lt;/code&gt; defined in &lt;code&gt;agent.py&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The multi-agent design&lt;/strong&gt;&lt;br&gt;
We’ll use these agents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Intake agent: turns user text into a clean transaction payload&lt;/li&gt;
&lt;li&gt;Gate agent: lets the flow proceed only when all required data is present&lt;/li&gt;
&lt;li&gt;Categorizer agent: assigns a category to the expense (groceries, rent, transport, etc.)&lt;/li&gt;
&lt;li&gt;Ledger agent: writes and reads transactions using tools, in our case SQLite&lt;/li&gt;
&lt;li&gt;Insights agent: creates summaries and reports&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then we connect them with a LoopAgent workflow, one of ADK’s deterministic Workflow Agents. The loop allows us to ask the user for clarification when it is needed. The flow looks something like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frbdknapfy2u4rfc8fpv9.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frbdknapfy2u4rfc8fpv9.webp" alt="ADK’s deterministic Workflow Agents" width="800" height="256"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Minimal implementation&lt;/strong&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;__future__&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;annotations&lt;/span&gt;


&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sqlite3&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;


&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.agents.llm_agent&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.agents.loop_agent&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LoopAgent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;exit_loop&lt;/span&gt;




&lt;span class="n"&gt;DB_PATH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__file__&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;with_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;finance.db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="c1"&gt;# Simple SQLite-based ledger for transactions and budgets.
# This is a minimal implementation for demonstration purposes.
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_get_conn&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;sqlite3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Connection&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
   &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sqlite3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;DB_PATH&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
       &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
       CREATE TABLE IF NOT EXISTS transactions (
           id INTEGER PRIMARY KEY AUTOINCREMENT,
           ts TEXT NOT NULL,
           amount REAL NOT NULL,
           currency TEXT NOT NULL,
           merchant TEXT,
           category TEXT,
           note TEXT
       )
       &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
   &lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
       &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
       CREATE TABLE IF NOT EXISTS budgets (
           category TEXT PRIMARY KEY,
           monthly_limit REAL NOT NULL,
           currency TEXT NOT NULL
       )
       &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
   &lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;




&lt;span class="c1"&gt;# Tools (callable functions) that agents can use.
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add_transaction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;USD&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;merchant&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;note&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
   &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Insert a transaction into the local ledger.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
   &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_get_conn&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
       &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INSERT INTO transactions (ts, amount, currency, merchant, category, note) VALUES (?, ?, ?, ?, ?, ?)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;merchant&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;note&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
   &lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;success&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inserted&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;




&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;list_transactions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
   &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
   List transactions.
   month format: YYYY-MM, e.g. 2026-02
   &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
   &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_get_conn&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
   &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT ts, amount, currency, merchant, category, note FROM transactions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
       &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; WHERE substr(ts, 1, 7) = ?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; ORDER BY ts DESC LIMIT ?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
   &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


   &lt;span class="n"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
       &lt;span class="p"&gt;{&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;amount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;currency&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;merchant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;category&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
           &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;note&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
       &lt;span class="p"&gt;}&lt;/span&gt;
       &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;
   &lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;success&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;items&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;




&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;set_budget&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;monthly_limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;USD&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
   &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Set or update a monthly budget for a category.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
   &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_get_conn&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INSERT INTO budgets(category, monthly_limit, currency) VALUES(?, ?, ?) &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ON CONFLICT(category) DO UPDATE SET monthly_limit=excluded.monthly_limit, currency=excluded.currency&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;monthly_limit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
   &lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;success&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;category&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;monthly_limit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;monthly_limit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;currency&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;




&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_budgets&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
   &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Return all configured budgets.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
   &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_get_conn&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="n"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT category, monthly_limit, currency FROM budgets&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;category&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;monthly_limit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;currency&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;success&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;items&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;


&lt;span class="c1"&gt;# Agents definitions. Each agent has a specific role and can call tools or other agents as needed.
&lt;/span&gt;

&lt;span class="n"&gt;intake_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;intake_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Extracts transaction details from user messages.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;instruction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Turn the user message into a transaction JSON.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Required: amount.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Optional: currency (default USD), merchant, note.&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;If amount is missing or ambiguous, ask ONE short question and output:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{ &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;complete&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;: false }&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;If ready, output:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{ &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;complete&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;: true, &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;amount&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;: &amp;lt;number&amp;gt;, &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;currency&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;USD&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;merchant&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;: null|&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;note&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt; }&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Merchant may be null. Do not ask for merchant.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
   &lt;span class="p"&gt;),&lt;/span&gt;
   &lt;span class="n"&gt;output_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="n"&gt;gate_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gate_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stops the workflow if intake is incomplete.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;instruction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Look at tx in state.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;If tx.complete is false, call the exit_loop tool immediately and output nothing.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;If tx.complete is true, do nothing.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
   &lt;span class="p"&gt;),&lt;/span&gt;
   &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;exit_loop&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="n"&gt;categorizer_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;categorizer_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Assigns a category to a transaction.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;instruction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Given a transaction JSON, assign one category from this list:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Groceries, Dining, Rent, Utilities, Transport, Health, Entertainment, Shopping, Income, Other.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Output JSON with all original keys plus category.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
   &lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="n"&gt;ledger_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ledger_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Writes and reads transactions and budgets using tools.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;instruction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You manage the personal finance ledger.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Use tools to add transactions, list transactions, set budgets, and get budgets.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Never invent ledger entries.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;If asked for a summary, list transactions and budgets first, then compute.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
   &lt;span class="p"&gt;),&lt;/span&gt;
   &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;add_transaction&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;list_transactions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;set_budget&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;get_budgets&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="n"&gt;insights_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;insights_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Creates summaries and budget insights based on ledger data.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;instruction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You produce a short monthly summary with totals by category and budget status.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Be concrete, use numbers, and keep recommendations practical.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;If there is not enough data, say what is missing and suggest the next best action.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
   &lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="c1"&gt;# The root agent orchestrates the workflow.
# It runs the intake agent first, then the gate agent to check if we can proceed.
# If the intake is complete, it continues to categorizer, ledger, and insights agents in sequence.
# The max_iterations=1 means it will run through this sequence once per user message.
#
# For a real application, you might want a more complex loop with conditions to
# allow for follow-up questions, corrections, etc.
&lt;/span&gt;&lt;span class="n"&gt;root_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LoopAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;root_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Personal finance tracking assistant.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;sub_agents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
       &lt;span class="n"&gt;intake_agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;gate_agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;categorizer_agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;ledger_agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;insights_agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="p"&gt;],&lt;/span&gt;
   &lt;span class="n"&gt;max_iterations&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Results&lt;/strong&gt;&lt;br&gt;
To test the system, we can run the agent using ADK’s development server. From the project root, execute:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;adk web&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This starts a local web interface, available at &lt;a href="http://127.0.0.1:8000" rel="noopener noreferrer"&gt;http://127.0.0.1:8000&lt;/a&gt;, where you can interact with the agent in real time. The web environment allows you to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Send natural language prompts&lt;/li&gt;
&lt;li&gt;Observe how each agent in the workflow is executed&lt;/li&gt;
&lt;li&gt;Inspect tool calls and intermediate outputs&lt;/li&gt;
&lt;li&gt;Debug the flow of state between agents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Try a simple prompt such as:&lt;/p&gt;

&lt;p&gt;“Spent 250 on a new monitor yesterday.”&lt;/p&gt;

&lt;p&gt;This runs the full multi-agent workflow, and ADK shows each agent's output and the tools that are called. We can inspect each request to see, for example, the state shared between the intake agent and the agents that run after it. This interactive environment is useful during development because it makes the orchestration visible, letting you see how decisions are made at each stage.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqgmbglxqtnpto9tlfb9x.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqgmbglxqtnpto9tlfb9x.webp" alt="ADK Multi Agent Workflow" width="800" height="646"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I’ve added more transactions to see how the insights agent behaves once there is enough data to generate a monthly report. This exercises the flow as a whole and produces the following:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frigijc38jjvjnt3mf1vr.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frigijc38jjvjnt3mf1vr.webp" alt="Insight Agent Results" width="800" height="702"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Multi-agent systems mark a shift from prompt engineering to system design. Rather than asking one large model to act like an entire team, we delegate tasks to specialized agents with clear boundaries and controlled workflows.&lt;/p&gt;

&lt;p&gt;With Google’s Agent Development Kit, we can move beyond experimental setups and build organized systems that are observable, modular, and ready for production. The finance tracker example may seem simple, but the same structure can scale to much more complex domains.&lt;/p&gt;

&lt;p&gt;As AI agents become more embedded in real products, the ability to design systems rather than just prompts will separate prototypes from software that people actually use.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Increasing confidence in your software with formal verification</title>
      <dc:creator>Stack Builders</dc:creator>
      <pubDate>Thu, 21 Mar 2024 20:06:04 +0000</pubDate>
      <link>https://dev.to/stack_builders/increasing-confidence-in-your-software-with-formal-verification-2194</link>
      <guid>https://dev.to/stack_builders/increasing-confidence-in-your-software-with-formal-verification-2194</guid>
      <description>&lt;p&gt;Testing can help you rule out some bugs in your software, but in general, it cannot ensure that your software behaves exactly like it is expected to. When you need this level of confidence in your software, formal verification comes into play!&lt;/p&gt;

&lt;p&gt;Software is an integral part of modern society. We use it every day, directly or indirectly, to make our lives easier. It runs on our phones, it is used to run companies and governments all over the world, and it is also found in vehicles like cars and airplanes.&lt;/p&gt;

&lt;p&gt;Naturally, given how ubiquitous software is, we want it to be as free of bugs as possible. Errors in software affect us in different ways, such as leading us to draw inaccurate conclusions or causing companies to lose money. In the most extreme cases, they can even be fatal, for instance in a vehicle failure. Errors of this kind have happened in the past:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Therac-25"&gt;Therac-25&lt;/a&gt;, a machine unit used to treat cancer patients with radiotherapy, had several failures due to &lt;a href="https://en.wikipedia.org/wiki/Race_condition"&gt;race conditions&lt;/a&gt; and ended up administering lethal doses of radiotherapy to six people, killing some of them and leaving the others with permanent injuries.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;a href="https://en.wikipedia.org/wiki/Ariane_flight_V88"&gt;Ariane 5&lt;/a&gt; rocket had a software failure due to an &lt;a href="https://en.wikipedia.org/wiki/Integer_overflow"&gt;integer overflow&lt;/a&gt; which resulted in the explosion of the rocket, causing a loss of $370 million. Fortunately, the Ariane 5 was unmanned.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As developers, we understand the importance of correct software. At Stack Builders, we are committed to delivering high-quality code. To achieve this, we continuously explore cutting-edge techniques and tools to ensure the correctness of the software we develop. In this blog post, we will provide an introduction to formal verification, a technique used to rigorously prove that a program is correct. Later on, we will dig a bit deeper into what the term "correct" means in the context of software programs.&lt;/p&gt;

&lt;p&gt;Wanting our programs to be free of these critical bugs brings us to the question: how can we increase our confidence in the software we are developing? The most widespread approach is testing. This includes all sorts of tests: unit tests, integration tests, end-to-end tests, etc. Testing helps us spot potential bugs early, before the software ships, and can also help prevent regressions.&lt;/p&gt;

&lt;h4&gt;
  
  
  Testing and the absence of bugs
&lt;/h4&gt;

&lt;p&gt;Tests can help us find bugs in our programs, especially when covering corner cases. However, tests &lt;strong&gt;cannot&lt;/strong&gt;, in general, &lt;strong&gt;ensure the absence of bugs&lt;/strong&gt;. Testing is an approach that aims to check the behavior of software against a given set of test cases. Ensuring the absence of bugs with testing requires us to define a set of test cases that covers the entirety of the &lt;a href="https://en.wikipedia.org/wiki/Domain_of_a_function"&gt;domain&lt;/a&gt; of any given function that we want to test.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Logical AND&lt;/span&gt;
&lt;span class="n"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;and&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;x:&lt;/span&gt; &lt;span class="nc"&gt;Bool&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nl"&gt;y:&lt;/span&gt; &lt;span class="nc"&gt;Bool&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;Bool&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nc"&gt;False&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="n"&gt;then&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;False&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Tests&lt;/span&gt;
&lt;span class="k"&gt;assert&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;and&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;False&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;False&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nc"&gt;False&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;assert&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;and&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;False&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;True&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nc"&gt;False&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;assert&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;and&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;True&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;False&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nc"&gt;False&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;assert&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;and&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;True&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;True&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nc"&gt;True&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example (in a made-up programming language, to keep things simple), we have defined an &lt;code&gt;and&lt;/code&gt; function which takes two boolean arguments and returns a boolean value. We have defined &lt;strong&gt;four&lt;/strong&gt; tests to check the behavior of &lt;code&gt;and&lt;/code&gt;. These four tests &lt;strong&gt;ensure&lt;/strong&gt; the &lt;strong&gt;absence of bugs&lt;/strong&gt; in the function since we have defined tests that cover the entire domain of the function; that is, we have one test for every possible combination of the arguments' values.&lt;/p&gt;

&lt;p&gt;The easiest way to know how many test cases we need for a particular function is to calculate the size of its domain. In this case, we have two possible values for the first argument and two possible values for the second argument, so the size of &lt;code&gt;and&lt;/code&gt;'s domain is 2 * 2 = 4.&lt;/p&gt;
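
&lt;p&gt;As a minimal sketch of the same idea in a real language, the Python snippet below enumerates the whole domain instead of listing the four cases by hand. The name &lt;code&gt;logical_and&lt;/code&gt; is hypothetical (&lt;code&gt;and&lt;/code&gt; is a reserved word in Python), and Python's built-in &lt;code&gt;and&lt;/code&gt; operator is assumed to be a trustworthy reference.&lt;/p&gt;

```python
from itertools import product


def logical_and(x, y):
    """Python counterpart of the made-up `and` function (hypothetical name)."""
    if x is False:
        return False
    return y


# Enumerate the whole domain: every combination of the two boolean arguments.
domain = list(product([False, True], repeat=2))

# Exhaustive check: compare against Python's built-in `and` on every input.
all_cases_pass = all(logical_and(x, y) == (x and y) for x, y in domain)

print(len(domain), all_cases_pass)
```

&lt;p&gt;Exhaustive enumeration like this only works because the domain has exactly four elements.&lt;/p&gt;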

&lt;p&gt;Easy, right? Well, let's try to use the same testing approach in another example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="n"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;square&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;x:&lt;/span&gt; &lt;span class="nc"&gt;Int32&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nl"&gt;y:&lt;/span&gt; &lt;span class="nc"&gt;Int32&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;Int32&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Uh-oh... what happens if we try to cover the whole domain of the &lt;code&gt;square&lt;/code&gt; function? Let's see: the first argument has 2^32 possible values, and the second argument has 2^32 possible values, so the size of &lt;code&gt;square&lt;/code&gt;'s domain is... 2^32 * 2^32 = 2^64 = 18446744073709551616.&lt;/p&gt;
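
&lt;p&gt;To get a feel for that number, here is a rough back-of-the-envelope calculation in Python. The rate of one billion test cases per second is purely an assumption for illustration, not a measurement.&lt;/p&gt;

```python
# Size of the domain of a function taking two 32-bit integer arguments.
values_per_argument = 2 ** 32
domain_size = values_per_argument * values_per_argument

# Assumption for illustration only: one billion test cases executed per second.
tests_per_second = 10 ** 9
seconds_per_year = 365 * 24 * 60 * 60

# Integer division keeps the arithmetic exact; we only need whole years.
seconds_needed = domain_size // tests_per_second
years_needed = seconds_needed // seconds_per_year

print(domain_size)
print(years_needed)
```

&lt;p&gt;Even at that optimistic rate, running every case would take more than five centuries.&lt;/p&gt;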

&lt;p&gt;It is clearly unfeasible to have that many test cases written! But it can get worse:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="n"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;toUppercase&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;name:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;letter&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
        &lt;span class="n"&gt;uppercase&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;letter&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;toUppercase&lt;/code&gt; function takes a string and returns that string with every letter turned into uppercase. In this case, the domain is theoretically infinite, so it is outright impossible to write test cases for every input. Realistically, if we assume that a String can be arbitrarily large and is allocated in dynamic memory, the number of possible values is bounded by the free memory of the system, which still yields an astronomical number.&lt;/p&gt;

&lt;p&gt;As we were saying earlier, to ensure the absence of bugs with testing, we need to cover the domain of the function we are testing. We have seen that in general, this is not doable or practical.&lt;/p&gt;

&lt;p&gt;However, there are other techniques more effective than testing when we aim to ensure that our program is correct. Many of these techniques come from &lt;a href="https://en.wikipedia.org/wiki/Formal_methods"&gt;formal methods&lt;/a&gt; research efforts.&lt;/p&gt;

&lt;p&gt;One of the most well-known and widely applied techniques, and the one we will be focusing on, is &lt;strong&gt;formal verification&lt;/strong&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Formal verification in short
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Formal_verification"&gt;Formal verification&lt;/a&gt; is a technique used to &lt;strong&gt;prove&lt;/strong&gt; the &lt;strong&gt;correctness&lt;/strong&gt; of a program against a given &lt;strong&gt;specification&lt;/strong&gt;. That might be a lot to take in, so let's break it down.&lt;/p&gt;

&lt;p&gt;The first important term to introduce is specification. A specification, simply put, is a description of what a program should do. The concept of specification appears in other areas of software engineering, and we as developers often encounter specifications in their informal form. One example of a specification, taking the &lt;strong&gt;toUppercase&lt;/strong&gt; function we defined earlier, would be:&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;toUppercase&lt;/strong&gt; function receives a string argument and returns a matching string in which every letter appearing in the argument is turned into uppercase. Every other character is left untouched.&lt;/p&gt;

&lt;p&gt;In formal verification, as the name suggests, we deal with &lt;strong&gt;formal&lt;/strong&gt; specifications of our programs. A formal specification is a more mathematical and rigorous description of what our program does. Formal specifications are written in what's called a specification language; the specifications we write might look like code written in a conventional programming language, but specification languages are usually much more limited in their expressive power. The goal of a specification language is not to be a programming language on its own, but rather to describe particular properties of our programs. For example, going back to our &lt;strong&gt;toUppercase&lt;/strong&gt; function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function toUppercase(name: String) -&amp;gt; result: String {
    satisfies:
        - result.length == name.length
        - forall (i &amp;lt;- 0..result.length-1). result[i] == uppercase(name[i])
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This could be an example of a specification for our &lt;code&gt;toUppercase&lt;/code&gt; function in a specification language I just made up purely for illustrative purposes. In our specification, we unambiguously state:&lt;/p&gt;

&lt;p&gt;The toUppercase function receives a string argument name, and returns a string result, satisfying the two following conditions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;"the length of result (the returned string) must be equal to the length of name (the argument string)."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;"for all indices i from 0 to the length of result minus 1, the character at index i in the result string must be equal to the character at index i in the name string applied to the uppercase function."&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is quite similar to the informal specification we gave earlier! Just a bit more math-y.&lt;/p&gt;
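
&lt;p&gt;To make the two conditions concrete, here is a minimal sketch of the same specification written as an executable check in Haskell. The name &lt;code&gt;satisfiesSpec&lt;/code&gt; is hypothetical, and &lt;code&gt;toUpper&lt;/code&gt; from &lt;code&gt;Data.Char&lt;/code&gt; stands in for the &lt;code&gt;uppercase&lt;/code&gt; function of the made-up specification language. Keep in mind that running this predicate on sample inputs is still testing; a verification tool would aim to prove the conditions for every possible input.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;import Data.Char (toUpper)

-- Hypothetical helper: checks one (argument, result) pair against the two
-- conditions of the specification.
satisfiesSpec :: String -&amp;gt; String -&amp;gt; Bool
satisfiesSpec name result =
  and
    [ length result == length name                        -- first condition
    , and (zipWith (\n r -&amp;gt; r == toUpper n) name result)  -- second condition
    ]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For example, &lt;code&gt;satisfiesSpec "abc-1" "ABC-1"&lt;/code&gt; evaluates to &lt;code&gt;True&lt;/code&gt;, while &lt;code&gt;satisfiesSpec "abc" "ABc"&lt;/code&gt; evaluates to &lt;code&gt;False&lt;/code&gt;.&lt;/p&gt;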

&lt;p&gt;The properties we can describe of our software are determined by the focus and intent of the specification language. A specification language can be aimed at checking the kinds of data our programs deal with, or whether our programs return values satisfying some properties (like in our example), or even more general semantic properties, like how a program behaves when seeing it from a concurrency point of view.&lt;/p&gt;

&lt;p&gt;Now that we have introduced the concept of a specification for our functions or programs, we can answer the question of what we mean when we say that a program is correct. As you might have guessed, we say that a program is correct when it adheres to the specification we have given for that program. It's as simple as that! It also relates to our intuitive notion of correctness: we would say that a program is correct when it behaves exactly like it is supposed to. A specification describes this behavior.&lt;/p&gt;

&lt;p&gt;This naturally implies that the program is free of bugs! ...well, while true, this is somewhat imprecise. The kinds of bugs that get ruled out depend on the kind of behavior that can be expressed by our specification language. One specification language could be great at allowing you to express concurrency behavior, but not have support for modeling memory access behavior. The fact that our program is correct against a specification written in such a specification language means that it is free of the concurrency bugs that the specification rules out, which is still great if that's what we're looking for! But it means nothing in terms of memory safety.&lt;/p&gt;

&lt;p&gt;Finally, we can go back to our main topic. We stated at the beginning of this section that formal verification is the process of verifying (i.e. checking) that a program adheres to the &lt;strong&gt;formal specification&lt;/strong&gt; we have given for that program. With everything we have introduced so far, we can now make sense of this definition.&lt;/p&gt;

&lt;p&gt;Here is what a general process of formal verification looks like:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqu45mgeuxkglgb5jy2aw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqu45mgeuxkglgb5jy2aw.png" alt="Formal Verification Process" width="800" height="327"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Using formal verification
&lt;/h4&gt;

&lt;p&gt;Now, how can I as a developer make use of this? You might have already been using a form of verification all this time without realizing it! If you have developed programs in a statically typed language such as Java, C/C++, C#, Go, Rust or Haskell, you have used one of the most common and useful verification tools out there: static type checking. With static types, you are describing the expected behavior of your program when it comes to data: if an argument is tagged as a string, then it must be treated as a string or it will result in a compiler error (meaning that your program is not correct). The specification for your program consists of the type annotations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;float findArea(float x, float y) { ... }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example of a function definition in C, we have just provided a specification for how we expect our function to behave:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The &lt;strong&gt;findArea&lt;/strong&gt; function receives two floating-point numbers as arguments and returns another floating-point number.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Admittedly, this specification is fairly limited in expressing the behavior of our function: it says nothing about what the function does, and there are many other functions that we could come up with that would fit that description. But it still does a very good job of ruling out potential bugs when calling the &lt;strong&gt;findArea&lt;/strong&gt; function with incorrect data (e.g. a string argument). The type systems found in Java or C are not particularly expressive, but you can find type systems like the ones found in Haskell or Rust that allow you to express much more nuanced properties of the kinds of data that your program can accept. Type systems keep being researched and taken even further to check for the correctness of other useful properties such as concurrency (e.g. &lt;a href="https://en.wikipedia.org/wiki/Session_type"&gt;session types&lt;/a&gt;) or memory access (e.g. &lt;a href="https://en.wikipedia.org/wiki/Substructural_type_system#Linear_type_systems"&gt;linear types&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;However, as we showed in the examples earlier, when relying on verification we usually want to have ways to describe in more detail how our program behaves. It's possible to find specification languages and tools that integrate tightly with a given programming language, allowing us to express formal properties as annotations in the source code. For example, the &lt;a href="https://es.wikipedia.org/wiki/Java_Modeling_Language"&gt;Java Modeling Language (JML)&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="no"&gt;MAX_BALANCE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; 
&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="cm"&gt;/*@ spec_public @*/&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;balance&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;//@ requires 0 &amp;lt; amount &amp;amp;&amp;amp; amount + balance &amp;lt; MAX_BALANCE;&lt;/span&gt;
&lt;span class="c1"&gt;//@ assignable balance;&lt;/span&gt;
&lt;span class="c1"&gt;//@ ensures balance == \old(balance) + amount;&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;addToAccount&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="n"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The formal specification in this case is given by the annotations just above the method. The &lt;code&gt;requires&lt;/code&gt; annotation is a precondition, a property that must hold right before the &lt;code&gt;addToAccount&lt;/code&gt; method is called. The &lt;code&gt;ensures&lt;/code&gt; annotation is a postcondition, a property that must hold right after the &lt;code&gt;addToAccount&lt;/code&gt; method has been called and has finished executing. This fashion of specifying a function's behavior is called &lt;a href="https://en.wikipedia.org/wiki/Design_by_contract"&gt;design-by-contract&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The useful thing about functions that use design-by-contract is that if we are able to guarantee that when calling the function the stated precondition holds, then we can safely assume that the stated postcondition will hold when the function completes its execution. This is possible because a design-by-contract verification tool will try to ensure that the postcondition holds by formally reasoning about the implementation of the function. Otherwise, the verification tool will raise an error stating that the function is not correct! We will need to go back to the function and see how the implementation is violating the postcondition.&lt;/p&gt;

&lt;p&gt;Other formal verification tools are independent of particular programming languages and are instead focused on verifying an abstract model of our program's behavior. A couple of examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/TLA%2B"&gt;TLA+&lt;/a&gt; is a formal specification language with a complete toolchain and IDE, with a great focus on concurrent and distributed systems.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Coq"&gt;Coq&lt;/a&gt; is an interactive theorem prover that is also used to verify algorithms and programs. It provides a formal specification language called Gallina.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These kinds of tools can be extremely powerful if used properly, although admittedly they can also be pretty hard to use for people less familiar with formal verification. They can be used to verify very complex programs. For example, &lt;a href="https://compcert.org/compcert-C.html"&gt;CompCert C&lt;/a&gt; is a verified compiler for C that can provide extra guarantees that there won't be bugs generated during compilation (compilers are also programs created by humans, and can have bugs!), intended for use in life-critical software. CompCert C is primarily written and verified in Coq.&lt;/p&gt;

&lt;h4&gt;
  
  
  Conclusions
&lt;/h4&gt;

&lt;p&gt;Formal verification is not a substitute for testing. Given that it's noticeably harder and more time-consuming to ensure correctness for a program when compared to writing some relevant (even if they are not 100% exhaustive) test cases, it still makes more sense to simply rely on testing when deadlines are tight and what you care about is catching the most prominent bugs that could appear in your software.&lt;/p&gt;

&lt;p&gt;However, we have seen that there are times when tests are not enough to ensure that our programs behave exactly as we expect them to. When we need such a high degree of confidence in our software, formal verification can help us accomplish that.&lt;/p&gt;

&lt;p&gt;To get started with formal verification, you can try out one of the tools we mentioned earlier, or you can search online for what's currently available for a programming language of your choice. If your chosen programming language is mainstream enough, chances are there will be at least one tool to perform some degree of formal verification on your programs.&lt;/p&gt;

</description>
      <category>testing</category>
      <category>formalverification</category>
      <category>formalmethods</category>
    </item>
    <item>
      <title>A QuickCheck Tutorial: Generators</title>
      <dc:creator>Stack Builders</dc:creator>
      <pubDate>Mon, 11 Mar 2024 21:37:43 +0000</pubDate>
      <link>https://dev.to/stack_builders/a-quickcheck-tutorial-generators-53pg</link>
      <guid>https://dev.to/stack_builders/a-quickcheck-tutorial-generators-53pg</guid>
      <description>&lt;p&gt;Learn how to use QuickCheck’s combinators to create simple generators of random values. From reversing lists to rolling dice and crafting generators for your data types, this tutorial will enhance your programming skills and help you get started with property-based testing in Haskell. This popular post was originally written in 2015 and updated in January 2024 to reflect QuickCheck library changes up to the most recent version (2.14.3) as well as other minor fixes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://hackage.haskell.org/package/QuickCheck"&gt;QuickCheck&lt;/a&gt; is a Haskell library for testing properties using randomly generated values. It's one of the most popular Haskell libraries and part of the reason&lt;a href="https://doi.org/10.1093/nsr/nwv042"&gt;why functional programming has mattered&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In short, we can use functions to express properties about our programs and QuickCheck to test that such properties hold for large numbers of random cases.&lt;/p&gt;

&lt;p&gt;For example, given a function to reverse the elements of a list:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="kr"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;Prelude&lt;/span&gt; &lt;span class="k"&gt;hiding&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;reverse&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;reverse&lt;/span&gt; &lt;span class="o"&gt;::&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;reverse&lt;/span&gt; &lt;span class="kt"&gt;[]&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;[]&lt;/span&gt;
&lt;span class="n"&gt;reverse&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;xs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt; &lt;span class="n"&gt;xs&lt;/span&gt; &lt;span class="o"&gt;++&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can define a property to check whether reversing a list (of integers) twice yields the original list:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="n"&gt;prop_ReverseReverseId&lt;/span&gt; &lt;span class="o"&gt;::&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;Integer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;Bool&lt;/span&gt;
&lt;span class="n"&gt;prop_ReverseReverseId&lt;/span&gt; &lt;span class="n"&gt;xs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
  &lt;span class="n"&gt;reverse&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reverse&lt;/span&gt; &lt;span class="n"&gt;xs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;xs&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And QuickCheck will generate 100 lists and test that the property holds for all of them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ghci&amp;gt; import Test.QuickCheck
ghci&amp;gt; quickCheck prop_ReverseReverseId
+++ OK, passed 100 tests.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
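
&lt;p&gt;The 100 in the output is QuickCheck's default number of test cases. As a minimal sketch, assuming we want more cases for the same property, &lt;code&gt;quickCheckWith&lt;/code&gt; accepts an &lt;code&gt;Args&lt;/code&gt; record whose &lt;code&gt;maxSuccess&lt;/code&gt; field sets that number. The value 1000 and the &lt;code&gt;main&lt;/code&gt; wrapper are only illustrative, and the snippet uses the Prelude's &lt;code&gt;reverse&lt;/code&gt; to stay self-contained:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;import Test.QuickCheck

-- Same property as above, repeated so that the snippet is self-contained.
prop_ReverseReverseId :: [Integer] -&amp;gt; Bool
prop_ReverseReverseId xs =
  reverse (reverse xs) == xs

-- Run 1000 random cases instead of the default 100.
main :: IO ()
main =
  quickCheckWith stdArgs { maxSuccess = 1000 } prop_ReverseReverseId
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;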



&lt;p&gt;If we define a property to check whether reversing a list once yields the same list or not (which holds only for some lists):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="n"&gt;prop_ReverseId&lt;/span&gt; &lt;span class="o"&gt;::&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;Integer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;Bool&lt;/span&gt;
&lt;span class="n"&gt;prop_ReverseId&lt;/span&gt; &lt;span class="n"&gt;xs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
  &lt;span class="n"&gt;reverse&lt;/span&gt; &lt;span class="n"&gt;xs&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;xs&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;QuickCheck will generate lists until it finds one that makes the property fail:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ghci&amp;gt; quickCheck prop_ReverseId
*** Failed! Falsified (after 5 tests and 4 shrinks):
[0,1]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a simple example, but it's good enough to illustrate the basic idea behind testing real-world Haskell libraries and programs using QuickCheck.&lt;/p&gt;

&lt;p&gt;Now, a fundamental part of QuickCheck is random generation of values. Let's take a look at some of the pieces involved in this process and some examples of how to generate our own random values.&lt;/p&gt;

&lt;p&gt;To generate a random value of type &lt;code&gt;a&lt;/code&gt;, we need a generator for values of that type: &lt;code&gt;Gen a&lt;/code&gt;. The default generator for values of any type is &lt;code&gt;arbitrary&lt;/code&gt;, which is a method of QuickCheck's &lt;code&gt;Arbitrary&lt;/code&gt; type class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="kr"&gt;class&lt;/span&gt; &lt;span class="kt"&gt;Arbitrary&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="kr"&gt;where&lt;/span&gt;
  &lt;span class="n"&gt;arbitrary&lt;/span&gt; &lt;span class="o"&gt;::&lt;/span&gt; &lt;span class="kt"&gt;Gen&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
  &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we have a generator, we can run it with &lt;code&gt;generate&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="n"&gt;generate&lt;/span&gt; &lt;span class="o"&gt;::&lt;/span&gt; &lt;span class="kt"&gt;Gen&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;IO&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's run &lt;code&gt;arbitrary&lt;/code&gt; to generate values of some basic types that have an instance of &lt;code&gt;Arbitrary&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ghci&amp;gt; generate arbitrary :: IO Int
27
ghci&amp;gt; generate arbitrary :: IO (Char, Bool)
('m',True)
ghci&amp;gt; generate arbitrary :: IO [Maybe Bool]
[Just False,Nothing,Just True]
ghci&amp;gt; generate arbitrary :: IO (Either Int Double)
Left 7
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Additionally, QuickCheck provides several combinators that we can use to generate random values and define our own instances of &lt;code&gt;Arbitrary&lt;/code&gt;.&lt;br&gt;
For instance, we can use &lt;code&gt;choose&lt;/code&gt; to generate a random element in a given range:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="n"&gt;choose&lt;/span&gt; &lt;span class="o"&gt;::&lt;/span&gt; &lt;span class="kt"&gt;Random&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;Gen&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's define a die with &lt;code&gt;choose&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="kr"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;Test.QuickCheck&lt;/span&gt;

&lt;span class="o"&gt;...&lt;/span&gt;

&lt;span class="n"&gt;dice&lt;/span&gt; &lt;span class="o"&gt;::&lt;/span&gt; &lt;span class="kt"&gt;Gen&lt;/span&gt; &lt;span class="kt"&gt;Int&lt;/span&gt;
&lt;span class="n"&gt;dice&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
  &lt;span class="n"&gt;choose&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And roll it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ghci&amp;gt; generate dice
5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can also generate a &lt;code&gt;Bool&lt;/code&gt; with &lt;code&gt;choose&lt;/code&gt; (in fact, this was QuickCheck's default generator for &lt;code&gt;Bool&lt;/code&gt; before switching to a faster implementation using &lt;code&gt;chooseEnum&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="n"&gt;arbitraryBool&lt;/span&gt; &lt;span class="o"&gt;::&lt;/span&gt; &lt;span class="kt"&gt;Gen&lt;/span&gt; &lt;span class="kt"&gt;Bool&lt;/span&gt;
&lt;span class="n"&gt;arbitraryBool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
  &lt;span class="n"&gt;choose&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As another example, we can use &lt;code&gt;sized&lt;/code&gt; to construct generators that depend on a size parameter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="n"&gt;sized&lt;/span&gt; &lt;span class="o"&gt;::&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Int&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;Gen&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;Gen&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's take a look at how QuickCheck generates lists:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="n"&gt;arbitraryList&lt;/span&gt; &lt;span class="o"&gt;::&lt;/span&gt; &lt;span class="kt"&gt;Arbitrary&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;Gen&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;arbitraryList&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
  &lt;span class="n"&gt;sized&lt;/span&gt; &lt;span class="o"&gt;$&lt;/span&gt;
    &lt;span class="nf"&gt;\&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kr"&gt;do&lt;/span&gt;
      &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="n"&gt;choose&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;sequence&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="n"&gt;arbitrary&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kr"&gt;_&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;..&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Given a size parameter &lt;code&gt;n&lt;/code&gt;, QuickCheck chooses a number &lt;code&gt;k&lt;/code&gt; between &lt;code&gt;0&lt;/code&gt; and &lt;code&gt;n&lt;/code&gt; to be the number of elements of the list, and then generates a list with &lt;code&gt;k&lt;/code&gt; arbitrary elements.&lt;/p&gt;
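
&lt;p&gt;To see the size parameter at work, we can fix it ourselves with QuickCheck's &lt;code&gt;resize&lt;/code&gt; combinator. The following is a small sketch in which the name &lt;code&gt;shortList&lt;/code&gt; and the size 3 are chosen purely for illustration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;import Test.QuickCheck

-- Hypothetical generator: runs the default list generator with the size
-- parameter fixed to 3, so the generated lists have at most 3 elements.
shortList :: Gen [Int]
shortList =
  resize 3 arbitrary
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Running &lt;code&gt;generate shortList&lt;/code&gt; a few times in GHCi should only ever produce lists with between 0 and 3 elements.&lt;/p&gt;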

&lt;p&gt;We can follow this pattern to construct generators for our own data types. Let's use (rose) trees as an example of how to do this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="kr"&gt;data&lt;/span&gt; &lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="kr"&gt;deriving&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Show&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A rose tree is just a node and a list of trees. Here's a sample tree:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="n"&gt;aTree&lt;/span&gt; &lt;span class="o"&gt;::&lt;/span&gt; &lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="kt"&gt;Int&lt;/span&gt;
&lt;span class="n"&gt;aTree&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
  &lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;[]&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="kt"&gt;[]&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt; &lt;span class="kt"&gt;[]&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Given such a tree, we can ask for things such as the number of nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="n"&gt;nodes&lt;/span&gt; &lt;span class="o"&gt;::&lt;/span&gt; &lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;Int&lt;/span&gt;
&lt;span class="o"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or the number of edges:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="n"&gt;edges&lt;/span&gt; &lt;span class="o"&gt;::&lt;/span&gt; &lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;Int&lt;/span&gt;
&lt;span class="o"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For instance, the sample tree has 6 nodes and 5 edges:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ghci&amp;gt; nodes aTree
6
ghci&amp;gt; edges aTree
5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
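
&lt;p&gt;The bodies of &lt;code&gt;nodes&lt;/code&gt; and &lt;code&gt;edges&lt;/code&gt; are elided above. One possible pair of definitions, given here only as an assumption about what they might look like, counts the root plus the nodes of every subtree, and the direct children plus the edges inside every subtree. The &lt;code&gt;Tree&lt;/code&gt; type is repeated so that the snippet is self-contained:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;data Tree a
  = Tree a [Tree a]
  deriving (Show)

-- Assumed definition: the root counts as one node, plus the nodes of each subtree.
nodes :: Tree a -&amp;gt; Int
nodes (Tree _ ts) =
  1 + sum (map nodes ts)

-- Assumed definition: one edge per direct child, plus the edges inside each subtree.
edges :: Tree a -&amp;gt; Int
edges (Tree _ ts) =
  length ts + sum (map edges ts)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With these definitions, &lt;code&gt;nodes aTree&lt;/code&gt; and &lt;code&gt;edges aTree&lt;/code&gt; agree with the values shown above (6 and 5).&lt;/p&gt;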



&lt;p&gt;Given definitions for &lt;code&gt;nodes&lt;/code&gt; and &lt;code&gt;edges&lt;/code&gt;, we can test that they satisfy the theorem that every tree has one more node than it has edges:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="n"&gt;prop_OneMoreNodeThanEdges&lt;/span&gt; &lt;span class="o"&gt;::&lt;/span&gt; &lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="kt"&gt;Int&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;Bool&lt;/span&gt;
&lt;span class="n"&gt;prop_OneMoreNodeThanEdges&lt;/span&gt; &lt;span class="n"&gt;tree&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
  &lt;span class="n"&gt;nodes&lt;/span&gt; &lt;span class="n"&gt;tree&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;edges&lt;/span&gt; &lt;span class="n"&gt;tree&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But &lt;code&gt;Tree a&lt;/code&gt; is not an instance of &lt;code&gt;Arbitrary&lt;/code&gt; yet, so QuickCheck doesn't know how to generate values to check the property. We could simply use the &lt;code&gt;arbitrary&lt;/code&gt; generator for lists:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="kr"&gt;instance&lt;/span&gt; &lt;span class="kt"&gt;Arbitrary&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;Arbitrary&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kr"&gt;where&lt;/span&gt;
  &lt;span class="n"&gt;arbitrary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kr"&gt;do&lt;/span&gt;
    &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="n"&gt;arbitrary&lt;/span&gt;
    &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="n"&gt;arbitrary&lt;/span&gt;
    &lt;span class="n"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But we wouldn't be able to guarantee that such a generator would ever stop. Thus, we need to use the &lt;code&gt;sized&lt;/code&gt; combinator:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;&lt;span class="kr"&gt;instance&lt;/span&gt; &lt;span class="kt"&gt;Arbitrary&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;Arbitrary&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kr"&gt;where&lt;/span&gt;
  &lt;span class="n"&gt;arbitrary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
    &lt;span class="n"&gt;sized&lt;/span&gt; &lt;span class="n"&gt;arbitrarySizedTree&lt;/span&gt;

&lt;span class="n"&gt;arbitrarySizedTree&lt;/span&gt; &lt;span class="o"&gt;::&lt;/span&gt; &lt;span class="kt"&gt;Arbitrary&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;Int&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;Gen&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;arbitrarySizedTree&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kr"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="n"&gt;arbitrary&lt;/span&gt;
  &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="n"&gt;choose&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="p"&gt;`&lt;/span&gt;&lt;span class="n"&gt;div&lt;/span&gt;&lt;span class="p"&gt;`&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="n"&gt;vectorOf&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arbitrarySizedTree&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="p"&gt;`&lt;/span&gt;&lt;span class="n"&gt;div&lt;/span&gt;&lt;span class="p"&gt;`&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="n"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Tree&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



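&lt;p&gt;Given a size parameter &lt;code&gt;m&lt;/code&gt;, we generate a value of type &lt;code&gt;a&lt;/code&gt;, choose a number &lt;code&gt;n&lt;/code&gt; between 0 and &lt;code&gt;m `div` 2&lt;/code&gt; to be the number of trees in the list, and then generate &lt;code&gt;n&lt;/code&gt; trees using the &lt;code&gt;vectorOf&lt;/code&gt; combinator. Each recursive call shrinks the size to &lt;code&gt;m `div` 4&lt;/code&gt;, so the size eventually drops below 2. At that point &lt;code&gt;m `div` 2&lt;/code&gt; is 0, &lt;code&gt;n&lt;/code&gt; can only be 0, and the generator produces a node with no children, which guarantees that generation stops.&lt;/p&gt;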
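
&lt;p&gt;To make the termination argument concrete, here is a minimal sketch (not one of the original examples) that checks the depth of generated trees against a bound derived from the size. It assumes the type is declared as &lt;code&gt;data Tree a = Tree a [Tree a]&lt;/code&gt; with a derived &lt;code&gt;Show&lt;/code&gt; instance; &lt;code&gt;depth&lt;/code&gt;, &lt;code&gt;maxDepth&lt;/code&gt; and &lt;code&gt;prop_DepthIsBounded&lt;/code&gt; are hypothetical names introduced only for illustration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;import Test.QuickCheck

-- Assumed definition of the tree type used in this section.
data Tree a = Tree a [Tree a]
  deriving Show

-- Same generator as above, repeated so the sketch is self-contained.
arbitrarySizedTree :: Arbitrary a =&amp;gt; Int -&amp;gt; Gen (Tree a)
arbitrarySizedTree m = do
  t &amp;lt;- arbitrary
  n &amp;lt;- choose (0, m `div` 2)
  ts &amp;lt;- vectorOf n (arbitrarySizedTree (m `div` 4))
  return (Tree t ts)

-- Number of levels in a tree; a node without children has depth 1.
depth :: Tree a -&amp;gt; Int
depth (Tree _ ts) = 1 + maximum (0 : map depth ts)

-- Largest depth the generator can reach for size m: once m `div` 2
-- is 0 no children can be generated, otherwise the children are
-- generated with size m `div` 4.
maxDepth :: Int -&amp;gt; Int
maxDepth m
  | m `div` 2 &amp;lt;= 0 = 1
  | otherwise = 1 + maxDepth (m `div` 4)

-- For any size between 0 and 100, generated trees respect the bound.
prop_DepthIsBounded :: Property
prop_DepthIsBounded =
  forAll (choose (0, 100)) $ \m -&amp;gt;
    forAll (arbitrarySizedTree m :: Gen (Tree Int)) $ \t -&amp;gt;
      depth t &amp;lt;= maxDepth m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Running &lt;code&gt;quickCheck prop_DepthIsBounded&lt;/code&gt; exercises the bound. For a size of 30, &lt;code&gt;maxDepth&lt;/code&gt; evaluates to 3, because the size goes from 30 to 7 to 1 before no more children can be generated.&lt;/p&gt;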

&lt;p&gt;Let's test the generator:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ghci&amp;gt; generate arbitrary :: IO (Tree Int)
Tree (-19) [Tree (-2) [Tree 15 [],Tree 28 []]]
ghci&amp;gt; generate arbitrary :: IO (Tree Int)
Tree 30 [Tree 15 [],Tree 19 [Tree 3 [],Tree (-28) []]]
ghci&amp;gt; generate arbitrary :: IO (Tree Int)
Tree (-11) [Tree (-6) [Tree (-6) [],Tree 1 []]]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Can you define the &lt;code&gt;nodes&lt;/code&gt; and &lt;code&gt;edges&lt;/code&gt; functions so that the tests pass?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ghci&amp;gt; quickCheck prop_OneMoreNodeThanEdges
+++ OK, passed 100 tests.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
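
&lt;p&gt;If you want something to compare your answer against, the following is one possible solution, not the only one. It assumes that &lt;code&gt;Tree&lt;/code&gt; is declared as &lt;code&gt;data Tree a = Tree a [Tree a]&lt;/code&gt; and that both functions return an &lt;code&gt;Int&lt;/code&gt;; the &lt;code&gt;example&lt;/code&gt; value is just one of the trees printed by the generator above:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight haskell"&gt;&lt;code&gt;-- Assumed definition of the tree type used in this section.
data Tree a = Tree a [Tree a]
  deriving Show

-- The root counts as one node, plus all the nodes of its subtrees.
nodes :: Tree a -&amp;gt; Int
nodes (Tree _ ts) = 1 + sum (map nodes ts)

-- One edge from the root to each direct child, plus the edges
-- inside each subtree.
edges :: Tree a -&amp;gt; Int
edges (Tree _ ts) = length ts + sum (map edges ts)

-- A hand-written tree to try the functions on.
example :: Tree Int
example = Tree 30 [Tree 15 [], Tree 19 [Tree 3 [], Tree (-28) []]]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here &lt;code&gt;nodes example&lt;/code&gt; is 5 and &lt;code&gt;edges example&lt;/code&gt; is 4, which is the relationship the property checks for arbitrary trees.&lt;/p&gt;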






&lt;p&gt;All of the examples were tested with GHC 9.6.4 and &lt;a href="https://hackage.haskell.org/package/QuickCheck-2.14.3"&gt;QuickCheck 2.14.3&lt;/a&gt;. For more information, see the &lt;a href="https://www.cse.chalmers.se/~rjmh/QuickCheck/manual.html"&gt;QuickCheck manual&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>haskell</category>
      <category>programming</category>
      <category>functional</category>
    </item>
  </channel>
</rss>
