<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Leslie McFarlin</title>
    <description>The latest articles on DEV Community by Leslie McFarlin (@liv_mc_d4d89327f).</description>
    <link>https://dev.to/liv_mc_d4d89327f</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2558086%2Ff04d904c-9c4f-4436-a0df-fbacfb1637a0.jpg</url>
      <title>DEV Community: Leslie McFarlin</title>
      <link>https://dev.to/liv_mc_d4d89327f</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/liv_mc_d4d89327f"/>
    <language>en</language>
    <item>
      <title>Building a Task Flow Builder</title>
      <dc:creator>Leslie McFarlin</dc:creator>
      <pubDate>Thu, 04 Dec 2025 12:46:49 +0000</pubDate>
      <link>https://dev.to/liv_mc_d4d89327f/building-a-task-flow-builder-2ml4</link>
      <guid>https://dev.to/liv_mc_d4d89327f/building-a-task-flow-builder-2ml4</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/googlekagglechallenge"&gt;Google AI Agents Writing Challenge&lt;/a&gt;: Learning Reflections&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Before the AI Agents Intensive, I thought of LLMs as finely tuned probabilistic models: systems that surface the next most likely tokens given an input and a context window. In other words, they were powerful conditional-probability machines that could align very well with a prompt, but fundamentally in a &lt;em&gt;single pass&lt;/em&gt; and within a single message.&lt;/p&gt;

&lt;p&gt;This course reframed that for me. Agents are &lt;em&gt;orchestrators&lt;/em&gt; that use LLMs as just one component in a larger system: combining model reasoning with tools, memory, and state to actually do work over time.&lt;/p&gt;

&lt;p&gt;I came into the course as a quantitative UX researcher with some design experience, watching UX teams adjust (and sometimes struggle) as AI-infused tools are pushed into their stacks: Lovable, Builder.io, Figma, and beyond. In that context, one of the most debated and critical artifacts is still the same: the task flow. Teams argue about what really happens, how many paths exist, and what to document. I wanted to build something that actually helps at that point of friction: taking messy verbal descriptions and turning them into a clear, validated flow that everyone can see and react to.&lt;/p&gt;

&lt;h2&gt;Concepts That Resonated the Most&lt;/h2&gt;

&lt;p&gt;The concept that stuck with me most was &lt;strong&gt;tools + stateful sessions&lt;/strong&gt; as the core building blocks of agentic systems.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Tools&lt;br&gt;
The course's emphasis on moving logic into tools and keeping the LLM relatively "thin" resonated with my background. A tool has a clear contract, can be tested independently, and is auditable. That shaped how I wrote my tool functions: instead of asking Gemini to generate a diagram in one shot, I let it decide when to call each tool and with what structured input (see the sketch after this list).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Session and memory&lt;br&gt;
The sessions-and-memory material made it clear that good agents carry context across multiple turns. For my UX Task Flow Builder, that's what lets the agent ask a clarifying question and then meaningfully update the flow when I answer.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Observability&lt;br&gt;
Logging and tracing turned this project from "a cool thing I tried" into something I could seriously reason about and debug. When sessions or app names were misaligned, those logs were the only reason I could track down what went wrong.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
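
&lt;p&gt;To make that concrete, here is a minimal sketch of the kind of tool contract I mean. The function name, the validation rule, and the return shape are illustrative assumptions rather than my exact project code; the point is a typed input, a structured dict output, and a log line that feeds observability:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import logging

def validate_mermaid(diagram_source: str) -&gt; dict:
    """Check that a Mermaid definition looks like a flowchart.

    Hypothetical tool contract: typed input in, plain dict out, so the
    agent and a unit test can consume the result the same way.
    """
    logging.info("validate_mermaid called (%d chars)", len(diagram_source))
    if not diagram_source.strip().startswith("flowchart"):
        return {"status": "failed", "reason": "not a flowchart definition"}
    return {"status": "ok", "reason": None}
&lt;/code&gt;&lt;/pre&gt;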

&lt;p&gt;The idea of &lt;strong&gt;multi-step, tool-using flows&lt;/strong&gt; where the agent can decide, "I need to ask more, then call a tool, then show results" also felt very close to how UX work actually happens.&lt;/p&gt;
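
&lt;p&gt;Here is a hedged sketch of how that decision policy can be written down with ADK. The agent name, model id, and instruction wording are placeholders of mine, but the &lt;code&gt;Agent(...)&lt;/code&gt; shape follows ADK's quickstart pattern:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from google.adk.agents import Agent

# Hypothetical wiring; validate_mermaid is the tool sketched earlier.
flow_builder = Agent(
    name="ux_task_flow_builder",
    model="gemini-2.0-flash",  # placeholder model id
    instruction=(
        "You turn verbal UX task-flow descriptions into Mermaid diagrams. "
        "If a step, branch, or error path is ambiguous, ask one clarifying "
        "question first. Once the flow is clear, call the diagram tool, "
        "then show the validated result."
    ),
    tools=[validate_mermaid],
)
&lt;/code&gt;&lt;/pre&gt;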

&lt;h2&gt;How My Understanding of AI Agents Evolved&lt;/h2&gt;

&lt;p&gt;My mental model shifted from "an LLM that responds to prompts" to "a system that coordinates the LLM, tools, and state to achieve a goal."&lt;/p&gt;

&lt;p&gt;Some of the biggest shifts were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;From single-pass to dialog&lt;br&gt;
I used to think mostly in terms of optimizing a single, dense prompt. Now I think in turns: the agent can say, "I can build that flow, but I need to know &lt;em&gt;X&lt;/em&gt; first." In UX, that's how collaboration works; I just hadn't applied that lens to an AI system before.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;From model-centric to architecture-centric&lt;br&gt;
Instead of starting with "Which model should I use?", I started by asking what was needed to accomplish the end goal:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What tools do I need?&lt;/li&gt;
&lt;li&gt;How do I represent a task flow so it can be reviewed?&lt;/li&gt;
&lt;li&gt;When should the agent call a tool vs. ask a question?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Thinking about an extensible system&lt;br&gt;
The multi-agent and evaluation sections of the course made me see my project as a starting point rather than an end point. There's room to add a more robust reviewer agent or a Streamlit UI in a way that feels natural instead of bolted on.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;The key shift for me: LLMs are still probabilistic sequence models, but agents let us embed them in workflows where they can take action and make refinements instead of answering once and stopping.&lt;/p&gt;

&lt;h2&gt;What I Built and What I Learned from It&lt;/h2&gt;

&lt;p&gt;My capstone project, the UX Task Flow Builder Agent, is an LLM-powered system that turns natural-language descriptions of UX flows into valid Mermaid diagrams and produces both &lt;code&gt;.mmd&lt;/code&gt; and PNG outputs. Building it taught me two things in particular:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Defensive design is your friend&lt;br&gt;
I ran into cases where the agent sent unexpected output types into one of the tools. Instead of letting the tool crash, I updated it to accept that output gracefully while returning a failed-validation message and an error diagram (see the sketch after this list). This mirrors a UX principle: an informative error beats a blank screen.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Agent plumbing matters&lt;br&gt;
Misaligned &lt;code&gt;APP_NAME&lt;/code&gt; values, sessions, and runners caused more than one headache. Fixing them forced me to really understand how ADK wires agents, apps, and sessions together (sketched below), and that's knowledge I can reuse on more complex projects.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
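
&lt;p&gt;Here is a minimal sketch of the defensive-design point, with hypothetical names: the tool coerces whatever the agent sends into a usable list, and on failure it returns a valid "error diagram" instead of raising, so the user still sees something informative:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;ERROR_DIAGRAM = (
    "flowchart TD\n"
    "    err[Diagram could not be generated] --&gt; msg[See validation message]"
)

def build_diagram(flow_steps) -&gt; dict:
    """Turn step descriptions into a Mermaid flowchart, failing softly."""
    # Defensive: the agent occasionally sends one string instead of a list.
    if isinstance(flow_steps, str):
        flow_steps = [s.strip() for s in flow_steps.splitlines() if s.strip()]
    if not isinstance(flow_steps, list) or not flow_steps:
        # Informative failure instead of a crash or a blank screen.
        return {"status": "failed",
                "message": "Expected step descriptions, got nothing usable.",
                "diagram": ERROR_DIAGRAM}
    lines = ["flowchart TD"]
    for i, step in enumerate(flow_steps):
        lines.append(f"    n{i}[{step}]")
        if i:
            lines.append(f"    n{i - 1} --&gt; n{i}")
    return {"status": "ok", "message": None, "diagram": "\n".join(lines)}
&lt;/code&gt;&lt;/pre&gt;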

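&lt;p&gt;And the plumbing lesson, as a sketch: this assumes ADK's in-memory session service and the &lt;code&gt;flow_builder&lt;/code&gt; agent from the earlier sketch, with &lt;code&gt;create_session&lt;/code&gt; awaited as in recent ADK releases. The fix that mattered was making one &lt;code&gt;APP_NAME&lt;/code&gt; constant flow through both the session and the runner:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import asyncio

from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService

APP_NAME = "ux_task_flow_builder"  # one constant, reused everywhere
USER_ID, SESSION_ID = "demo_user", "session-001"

async def main():
    session_service = InMemorySessionService()
    # The session and the Runner must agree on app_name;
    # a mismatch is exactly the misalignment that bit me.
    await session_service.create_session(
        app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID)
    runner = Runner(agent=flow_builder, app_name=APP_NAME,
                    session_service=session_service)
    return runner

runner = asyncio.run(main())
&lt;/code&gt;&lt;/pre&gt;
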
&lt;h2&gt;Closing Thoughts&lt;/h2&gt;

&lt;p&gt;As someone who has been in task-flow debates, seeing the agent ask questions like, "Should the user be allowed to retry after failed login?" and then generate a clean diagram felt practical. It directly addresses a coordination + documentation problem that some UX teams run into, and that may become more common as AI tools get bolted into existing design stacks.&lt;/p&gt;

&lt;p&gt;Overall, this intensive shifted me from thinking about prompting a probabilistic model to designing agentic UX tools. It gave me both the conceptual toolkit and a concrete project I can evolve. Most importantly, it gave me something I can plausibly bring into real-world UX workflows where AI tools are no longer optional.&lt;/p&gt;

</description>
      <category>googleaichallenge</category>
      <category>ai</category>
      <category>agents</category>
      <category>devchallenge</category>
    </item>
  </channel>
</rss>
