Pixelle.AI

MultiAgent Architecture Practice: Building ComfyUI-Copilot V2.0 with 3k GitHub Stars

ComfyUI Background

ComfyUI is the “LEGO blocks” of the Stable Diffusion ecosystem. It transforms AI image generation into a visual programming game. Unlike WebUI’s “point-and-shoot camera” approach, every function here is a modular node that can be assembled: drag a “prompt” node to connect to a “sampler,” then attach “ControlNet” to adjust poses… It’s like building a circuit board to construct your personalized image generation pipeline.

Developers use it to debug new animation architectures, digital artists rely on it to precisely control hair texture and lighting, and AIGC researchers can observe the magical changes in latent space in real-time. All computations run on your local GPU (requires at least 8GB VRAM), supporting free loading of community-developed plugins and models. When you watch parameters flow like electricity between nodes and finally converge into stunning images, you’ll suddenly understand why people say: “Once you get used to ComfyUI, you can never go back to WebUI.”

But, fellow ComfyUI alchemists, do you often feel that:

  • Error messages look like Morse code? If you can understand them, you win; if not… just restart!

  • Node connections are more complex than a Spring Festival railway map? One wrong line and three hours of work are wasted!

  • Parameter tuning relies entirely on mysticism? Add one more zero after the decimal point, and your GPU goes on strike!

  • The workflow.json you finally got your hands on? After opening it, you instantly understand what “the ideal is plump, but the reality is skinny” means…


Don’t panic! ComfyUI-Copilot V2.0 is here! It lets you do AI image generation development through natural-language conversation, guiding ComfyUI users through every step of building an image generation workflow. Beginners can get started in 5 minutes, and experts can double their efficiency!

What ComfyUI-Copilot Can Do
From “blank canvas” to “generating a high-quality image that meets user requirements,” we break down the user’s real journey in ComfyUI into four main steps and provide actionable intelligent support at each step.

  • Conversation-First Development: Describe your intent in natural language, and the system translates it into executable workflows and operational suggestions.

  • Workflow Rewriting: Directly describe what you’re unsatisfied with and new requirements. Copilot automatically restructures the workflow, replaces/adds key nodes, and recommends optimal parameter ranges based on your environment.

  • Measurable Improvements: Use GenLab to transform “mystical parameter tuning” into visual comparisons and reproducible experiments.


ComfyUI-Copilot V2.0 is built on a Multi-Agent collaboration framework and integrates with local services and ComfyUI’s official ecosystem through MCP.

Star History


Fine-grained Controllable Generation
As a ComfyUI UI plugin, Copilot emphasizes “visible results, controllable processes, and reproducible engineering” in the “high quality” and “controllable generation” dimensions. Relying on multi-agent and structured tool chains, Copilot implements “controllability” at three layers: intent→structure→parameters, and provides comparison and reproduction capabilities through GenLab.

High Quality and Controllability: Practical Use Cases

  • Character and Style Consistency
    Uses “workflow rewriting + model/LoRA recommendation + parameter mapping” to keep the same character’s facial features and makeup consistent across multiple shots, with a consistent style; supports a one-click “lock composition/character, replace only the style” strategy.

  • Controllable Composition/Pose/Lighting (Structure Control)
    Integrates common control nodes (pose/edge/depth, etc.) and provides “intelligent node completion + downstream subgraph recommendation,” enabling controlled composition while maintaining detail quality; supports combinations of soft and hard constraints such as “lock the layout, leave texture details open.”

  • Production-level Detail Quality (Detail Fidelity)
    For e-commerce/material scenarios, provides a “high-resolution reconstruction pipeline + artifact-removal parameter templates” to reduce edge aliasing and texture stretching; supports multi-solution A/B comparison (subjective scoring plus objective metrics such as SSIM/LPIPS; see the sketch after this list) and one-click rollback to the best snapshot.

  • Reproducibility and Process Traceability
    Full-process snapshots (workflow structure, model versions, key parameters, seeds) and experiment records ensure reproducible and auditable results in team collaboration; automatic failure rollback and parameter range protection reduce crash probability caused by “mystical parameter tuning.”

  • Reuse of all end-to-end models: an e-commerce showcase
    Say you want to change the model in a Taobao women’s clothing photo to a Black model and add marketing copy:
    Nano Banana can keep the clothing unchanged, but its text rendering is poor.
    Qwen Image can add the marketing copy, but struggles to preserve the main subject.
    ComfyUI can combine the strengths of all these end-to-end models with image-control components such as ControlNet to generate more controllable production images.
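
To make the objective half of that A/B comparison concrete, here is a minimal sketch (not GenLab's actual implementation) of computing SSIM and LPIPS between a reference image and a candidate output. It assumes the scikit-image, lpips, torch, and Pillow packages; the file names are placeholders.

# Minimal sketch: objective A/B metrics between a reference image and a candidate.
# Assumes `pip install scikit-image lpips torch pillow`; file names are placeholders.
import numpy as np
import torch
import lpips
from PIL import Image
from skimage.metrics import structural_similarity as ssim

def load_rgb(path: str) -> np.ndarray:
    return np.asarray(Image.open(path).convert("RGB"))

reference = load_rgb("reference.png")    # e.g. the previous best snapshot's output
candidate = load_rgb("candidate_a.png")  # output of the workflow variant under test

# SSIM: higher means more structurally similar (1.0 = identical)
ssim_score = ssim(reference, candidate, channel_axis=2)

# LPIPS: lower means perceptually closer; the model expects tensors scaled to [-1, 1]
to_tensor = lambda im: torch.from_numpy(im).permute(2, 0, 1).float().unsqueeze(0) / 127.5 - 1.0
lpips_model = lpips.LPIPS(net="alex")
lpips_score = lpips_model(to_tensor(reference), to_tensor(candidate)).item()

print(f"SSIM={ssim_score:.3f}  LPIPS={lpips_score:.3f}")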

Differentiated Advantages Compared to Foundation Model Solutions (vs Nano Banana / Qwen Image Edit)


In summary: Copilot’s differentiation lies in “engineering-structured controllability supporting high-quality generation and team-level reproduction.” Foundation model solutions lean more toward “single editing/quick image generation,” while Copilot is more suitable for “controllable, comparable, and reusable” production scenarios. This also provides a verifiable closed-loop foundation for the subsequent DebugAgent.

Architecture and Design Philosophy

Canvas Plugin Form Selection:

  • Ecosystem Traffic Acquisition Considerations: We decided to adopt the ComfyUI plugin form rather than independent application development. This choice not only directly reaches ComfyUI’s core user base but also fully utilizes ComfyUI’s ecosystem traffic, reducing user acquisition costs.

  • Plugin Implementation Methods in ComfyUI Ecosystem: There are mainly two common plugin implementation methods in the ComfyUI ecosystem:

Custom Nodes: Extend functionality through custom nodes, suitable for technical users. This method offers high flexibility but requires users to have a certain technical background.
UI Plugins: Provide visual interfaces, more suitable for novice users. This method can significantly lower the usage threshold and improve interaction experience.

  • Target User Group Analysis: After in-depth analysis of the target user group, we found that most users prefer simple and easy-to-use tools. Therefore, we ultimately chose to develop UI Plugins to maximize the reduction of usage barriers while improving interaction experience.

Core Architecture Design:

  • V1.0 Limitations: ComfyUI-Copilot V1.0 only implemented frontend interaction capabilities on the plugin side, with all logic placed in the backend. As product functionality expanded, this architecture gradually exposed the following problems:
    Unable to perceive the local ComfyUI environment.
    Difficult to directly call local models (like Llama3).
    Unable to efficiently manage local services (such as node installation, model downloading).

  • V2.0 Improvements: To solve these problems, V2.0 evolved into the following architecture:

Plugin Side: Places an MCP-Client in both Canvas and Copilot-Local, supporting calls to remote tools (SSE) and local services (stdio).
Remote Calls: Workflow generators (SD/Flux/API Node), node recommendations, model recommendations, intelligent node completion, image evaluation, etc.
Local Services: Workflow execution, one-click node installation, one-click model installation, parameter modification/local parameter mapping, etc.
Server Side: Exposes an MCP-Server providing SSE tools for workflow generation, node recommendations, model recommendations, and other functions.
Local Model Support: Some users can directly call locally installed open-source models (like Llama3) to meet personalized needs.
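
As a hedged illustration of the local-model path, the sketch below assumes the local runtime (for example Ollama serving a Llama3 model) exposes an OpenAI-compatible endpoint and simply points the Agents SDK at it; the URL and model tag are placeholders.

# Hedged sketch: route the Agents SDK to a locally hosted open-source model.
# Assumes an OpenAI-compatible endpoint (e.g. Ollama); URL and model tag are placeholders.
from openai import AsyncOpenAI
from agents import Agent, set_default_openai_api, set_default_openai_client, set_tracing_disabled

local_client = AsyncOpenAI(base_url="http://localhost:11434/v1", api_key="local")
set_default_openai_client(local_client)
set_default_openai_api("chat_completions")  # local runtimes typically expose chat completions only
set_tracing_disabled(True)                  # no OpenAI key available for uploading traces

local_agent = Agent(
    name="Local Copilot",
    instructions="Answer ComfyUI questions using the locally installed model.",
    model="llama3",  # model tag as registered in the local runtime
)
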
Copilot-Local | MultiAgent

  • Problem Background: Relying on a single Agent with Tools to implement complex functions (such as workflow modification and debugging) would force that Agent to mount too many Tools, leading to performance degradation and maintenance difficulties.

  • Solution: Adopt a layered architecture, breaking complex scenarios into multiple subtasks handled by different Agents, with each Agent binding only the Tools it needs (a condensed sketch follows below).
    MasterAgent: Responsible for overall coordination and decision-making; interacts with users and collaborates with RewriteAgent. Ensures seamless task handoff by passing context information through the Handoff mechanism.
    DebugAgent: Responsible for workflow debugging; identifies the error type and calls LinkAgent, ParameterAgent, and WorkflowBugfixAgent for repairs.
    RewriteAgent: Responsible for workflow rewriting; calls the RAG system based on user requirements, recalls relevant experience and node information, and generates an optimized workflow.
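
A condensed sketch of this layering, using the same Agents SDK shown later in this article; the instructions below are illustrative placeholders, and each sub-Agent's real tools appear in the DebugAgent and RewriteAgent sections.

# Condensed sketch of the layered structure; instructions are illustrative placeholders.
from agents import Agent

rewrite_agent = Agent(
    name="Rewrite Agent",
    instructions="Rewrite the workflow from the user's intent using recalled experience and node info.",
)
debug_agent = Agent(
    name="Debug Agent",
    instructions="Run the workflow, classify the error, and dispatch it to a repair specialist.",
)
master_agent = Agent(
    name="MasterAgent",
    instructions="Talk to the user, then hand off rewrite or debug tasks to the matching sub-agent.",
    handoffs=[rewrite_agent, debug_agent],  # context is carried across the handoff
)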

Copilot-Remote | RAG

  • Server Capabilities: MCP-Server provides SSE Tools, supporting MCP-Client calls from the plugin side. Core functions include workflow recall and generation, node recommendations, and model recommendations.

Technology Stack Selection:

  • OpenAI Agents (Python):
  1. Native Multi-Agent Collaboration Support: Perfectly matches the design requirement of MasterAgent coordinating multiple sub-Agents.
  2. Standardized Tool Registration/Discovery Mechanism: Facilitates layered management of different Agents’ dedicated tool sets while supporting convenient MCP configuration.
  3. Built-in Context Management: Supports persistence of complex workflow debugging sessions and seamless context transfer through the Handoff mechanism.

MasterAgent
MCP Client & MCP Server
In ComfyUI-Copilot V2.0, MasterAgent mounts 4 core tools over MCP (Model Context Protocol) using SSE (Server-Sent Events), achieving efficient interaction with backend services. Here is the specific implementation code:

# MCPServerSse and Agent come from the OpenAI Agents SDK; BACKEND_BASE_URL and
# rewrite_agent are defined elsewhere in the plugin.
from agents import Agent
from agents.mcp import MCPServerSse

async with MCPServerSse(
    params={
        "url": BACKEND_BASE_URL + "/mcp-server/mcp",
        "timeout": 300.0,
    },
    cache_tools_list=True,
    client_session_timeout_seconds=300.0,
) as server:
    triage_agent = Agent(
        name="ComfyUI-Copilot",
        instructions="...",
        mcp_servers=[server],
        handoffs=[rewrite_agent],
    )


Technical Details Supplement:

  • MCP-SSE Role: SSE is a lightweight real-time communication protocol suitable for unidirectional data push scenarios. Here, it’s used for efficient communication between MasterAgent and backend services, ensuring real-time updates of tool lists and task status synchronization.

  • triage_agent Design: As the core component of MasterAgent, triage_agent is responsible for coordinating task distribution and passing specific tasks (such as workflow rewriting) to rewrite_agent through the handoffs mechanism.

To be compatible with the existing FastAPI system while supporting both MCP and traditional API calls, we adopted the following architectural design:

from typing import Any, Dict

from fastapi import FastAPI
from fastmcp import FastMCP

mcp = FastMCP(name="ComfyUI Copilot MCP", instructions="Tools for ComfyUI workflow, node, and model management")
mcp_app = mcp.http_app('/mcp', transport="sse")
app = FastAPI(**app_args, lifespan=mcp_app.lifespan)  # app_args: existing FastAPI settings
# Mount the MCP application first, then add other middleware, so middleware does not interfere with MCP's ASGI handling
app.mount("/mcp-server", mcp_app)

@mcp.tool()
async def recall_workflow() -> Dict[str, Any]:
    return {}

  • FastMCP Role: A lightweight framework for integrating the MCP protocol into FastAPI, supporting a hybrid mode of SSE and traditional HTTP requests.

  • Tool Registration: Through the @mcp.tool() decorator, functions can be registered as MCP tools for MasterAgent to call. For example, the recall_workflow tool is used to recall workflow data.

RAG & Benchmark System
To precisely match user requirements from massive information, we designed a complete RAG (Retrieval-Augmented Generation) system supplemented with Benchmark evaluation mechanisms.


Offline processing handles heavy computational tasks, with all complex time-consuming operations pre-completed and stored in the database. Online services maintain lightweight design to ensure quick response and improve user experience.

Offline processing is mainly divided into four steps:

  1. Information Collection: Crawl ComfyUI-related data through crawler systems.
  2. Data Cleaning: Classify and process different types of data, focusing on structuring documents, image-text content, and multilingual content. For example, split node documents by title sections; since LoRA information is mostly stored in images, integrate image-text information through multimodal models; and unify translation of multilingual content.
  3. Information Structuring: Convert node documents into structured data; extract LoRA base models as filtering tags; build node association knowledge graphs.
  4. Vectorization Processing: Generate vector data through embedding models and store it in the database along with the structured data.
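
As an illustration of step 4 (the article does not specify the embedding model or database, so the names below are assumptions), vectorization can be as simple as embedding each cleaned, structured record and storing the vector next to its filterable metadata:

# Illustrative sketch of the vectorization step; the embedding model and storage
# backend are assumptions, and `node_docs` stands in for the cleaned, structured records.
from openai import OpenAI

client = OpenAI()

node_docs = [
    {"node_class": "KSampler",
     "doc": "Samples a latent given a model, positive/negative conditioning, steps, cfg ...",
     "tags": ["sampling"], "lang": "en"},
]

def embed(text: str) -> list[float]:
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return resp.data[0].embedding

records = [{**doc, "vector": embed(doc["doc"])} for doc in node_docs]  # metadata stays filterable
# `records` would then be written to the vector database used by the online service.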

To achieve workflow completion functionality similar to Cursor code completion, the system recommends multiple downstream subgraphs after users select nodes. By deconstructing complex workflows into several simplified subgraph connections, it helps large models better understand workflow structures. Graph algorithms are used to extract frequently occurring subgraph patterns and solidify them into reusable common components.
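
This pattern-mining step can be sketched as follows, as a simplification that only counts upstream→downstream node-type pairs in ComfyUI's API-format workflow JSON; real subgraph mining would consider larger patterns.

# Simplified sketch: count (upstream class -> downstream class) edges across a
# workflow corpus to find frequently recurring downstream patterns.
from collections import Counter

def edge_patterns(workflow: dict) -> list[tuple[str, str]]:
    """workflow: ComfyUI API-format dict {node_id: {"class_type": ..., "inputs": {...}}}."""
    patterns = []
    for node in workflow.values():
        for value in node.get("inputs", {}).values():
            if isinstance(value, list) and len(value) == 2:   # [source_node_id, output_index]
                src = workflow.get(str(value[0]), {}).get("class_type")
                if src:
                    patterns.append((src, node["class_type"]))
    return patterns

def frequent_patterns(workflows: list[dict], min_support: int = 10) -> list[tuple[str, str]]:
    counts = Counter(p for wf in workflows for p in edge_patterns(wf))
    return [p for p, c in counts.most_common() if c >= min_support]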

The online process fully utilizes structured data generated by offline processing:

  • User input is often incomplete or ambiguous, so the Agent first performs semantic rewriting and completion, then filters out irrelevant items through metadata.

  • A dual-path recall strategy balances recall breadth and result precision: vector recall covers semantically similar cases, keyword matching ensures result relevance, and the two are fused with weighted ranking (see the sketch below).

  • Recall results go through relevance evaluation, and some scenarios introduce business metrics (such as GitHub star weighting in node recall) for the final ranking.
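
A minimal, self-contained sketch of the weighted fusion step; the weights and example scores are placeholders, not the production formula.

# Minimal sketch of dual-path recall fusion; weights and example scores are placeholders.
def fuse_recall(vec_hits: list[tuple[str, float]],
                kw_hits: list[tuple[str, float]],
                w_vec: float = 0.7, w_kw: float = 0.3,
                top_k: int = 10) -> list[str]:
    """Weighted fusion of vector-similarity and keyword-match scores."""
    scores: dict[str, float] = {}
    for doc_id, s in vec_hits:
        scores[doc_id] = scores.get(doc_id, 0.0) + w_vec * s
    for doc_id, s in kw_hits:
        scores[doc_id] = scores.get(doc_id, 0.0) + w_kw * s
    # A business re-ranking hook (e.g. boosting nodes with more GitHub stars) would go here.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k]

# A document hit by both paths ("ControlNetApply") rises to the top.
print(fuse_recall([("ControlNetApply", 0.92), ("KSampler", 0.80)],
                  [("ControlNetApply", 0.75), ("LoraLoader", 0.60)]))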

DebugAgent

Design Philosophy
We referenced Cursor’s debug process: after running a script and collecting the error output, it attempts a rewrite, then runs the test script again against the rewritten code. Through multiple rounds of error feedback and modification, it eventually resolves the errors completely. The core of this process is error capture, intelligent analysis, iterative repair, and a verification closed loop.

  1. Error Capture Phase: Automatically run the current workflow through workflow validation tools, capturing structured error logs. The key to this step is ensuring the completeness and parseability of error information, providing the foundation for subsequent analysis.

  2. Intelligent Analysis Phase: The error classifier routes errors to the corresponding domain Agents (connection/parameter/structure). For example:
    Connection errors (connection_error) are handled by Link Agent.
    Parameter exceptions (value_not_in_list) are handled by Parameter Agent.
    Structure problems (node_compatibility) are handled by Workflow Bugfix Agent.

  3. Iterative Repair Phase: Each Agent uses dedicated tools for repair, automatically saving a workflow snapshot after each modification. For example:
    Link Agent is responsible for fixing missing connections.
    Parameter Agent is responsible for adjusting parameter values or suggesting model downloads.
    Workflow Bugfix Agent is responsible for handling node compatibility issues or removing invalid nodes.

  4. Verification Closed Loop: Automatically trigger re-verification after each repair, forming a debugging closed loop (maximum 6 iterations). If verification passes, output success; otherwise, continue analysis and repair. A minimal sketch of this loop follows below.
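
The loop shape can be sketched as follows; the validator and fixer here are trivial stand-ins so the snippet runs on its own, whereas in Copilot they are the tools and sub-Agents shown in the next section.

# Self-contained sketch of the verification closed loop (max 6 iterations).
# `validate` and `fix` are trivial stand-ins for the real validation tool and routed agents.
MAX_ITERATIONS = 6

def validate(workflow: dict) -> list[str]:
    """Stand-in validator: report node ids missing a class_type."""
    return [node_id for node_id, node in workflow.items() if "class_type" not in node]

def fix(workflow: dict, bad_node_id: str) -> dict:
    """Stand-in repair: patch the offending node (a real run routes to Link/Parameter/Bugfix agents)."""
    patched = dict(workflow)
    patched[bad_node_id] = {**patched[bad_node_id], "class_type": "KSampler"}
    return patched

def debug_loop(workflow: dict) -> tuple[str, dict]:
    for attempt in range(MAX_ITERATIONS):
        errors = validate(workflow)          # 1. error capture
        if not errors:
            return "success", workflow       # 4. re-verification passed
        workflow = fix(workflow, errors[0])  # 2-3. classify, route, and repair
        # a snapshot of `workflow` would be saved here after every modification
    return "failed", workflow

print(debug_loop({"3": {"inputs": {}}}))     # -> ('success', {'3': {...}})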

MultiAgent Structure

from agents import Agent  # run_workflow and the other tools are defined elsewhere in the plugin

debug_agent = Agent(
    name="Debug Agent",
    instructions="You determine which agent to use based on the user's homework question",
    tools=[run_workflow, analyze_error_type, save_current_workflow],
)

parameter_agent = Agent(
    name="Parameter Agent",
    handoff_description="",
    tools=[find_matching_parameter_value, get_model_files, 
        suggest_model_download, update_workflow_parameter, get_current_workflow],
    handoffs=[debug_agent],
)

link_agent = Agent(
    name="Link Agent",
    handoff_description="",
    tools=[analyze_missing_connections, apply_connection_fixes, get_current_workflow, get_node_info],
    handoffs=[debug_agent],
)

workflow_bugfix_default_agent = Agent(
    name="Workflow Bugfix Agent",
    handoff_description="",
    tools=[get_current_workflow, get_node_info, update_workflow],
    handoffs=[debug_agent],
)

debug_agent.handoffs = [link_agent, workflow_bugfix_default_agent, parameter_agent]


Each Agent focuses on specific tasks and achieves efficient collaboration through context routing protocols. Here are the key design points:

  1. Context Routing Protocol:
    Connection errors (connection_error) ➔ Link Agent
    Parameter exceptions (value_not_in_list) ➔ Parameter Agent
    Structure problems (node_compatibility) ➔ Workflow Bugfix Agent

  2. Tool Reuse Strategy:
    Basic tools (like get_current_workflow) are shared across Agents, avoiding duplicate implementation.
    Dedicated tools (like apply_connection_fixes) are mounted on demand, keeping each Agent focused.
    Tool outputs are automatically cached, avoiding repeated computation.

  3. Shared Context Control:
    Each Agent automatically inherits the previous context during processing.
    Tool outputs are automatically merged into the global context (such as Link Agent’s connection repair records).
    Structured data is returned to the frontend through tools, reducing the LLM’s burden.

  4. Unified Result Handling:
    Processing results are returned to the frontend as structured data through tools, and all tool returns in the entire MultiAgent system can be uniformly processed through Events. This is the key to letting the LLM focus on its core job: the MultiAgent system can return arbitrarily complex JSON data through Tools without putting any pressure on the LLM.

import json

from agents import Runner
from openai.types.responses import ResponseTextDeltaEvent

current_text = ""
last_yielded_length = 0

result = Runner.run_streamed(
    debug_agent,
    input=messages,
)
async for event in result.stream_events():
    if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
        # Stream text deltas for real-time response
        delta_text = event.data.delta
        if delta_text:
            current_text += delta_text
            # Only yield text updates during streaming
            if len(current_text) > last_yielded_length:
                last_yielded_length = len(current_text)
                yield (current_text, None)
    elif event.type == "run_item_stream_event":
        if event.item.type == "tool_call_output_item":
            # Tool returns structured JSON that goes straight to the frontend
            output = str(event.item.output)
            tool_output_json = json.loads(output)
            yield (current_text, tool_output_json)


Implementing this MultiAgent structure was quick, but we soon ran into many problems. MultiAgent systems are very difficult to debug, and the OpenAI Agents framework has many technical details that need handling and that appear as black boxes to us. The subsequent work was the most time-consuming part: making the MultiAgent system more stable, controllable, and intelligent.


Smart Approach: Context Control, Only Give Necessary Information
Let each LLM in the MultiAgent system focus on its specific task. Without any configuration, a handoff passes the complete context by default.

  • input_filter: When an Agent handoff occurs, the complete context is passed by default, but you can limit the amount of information handed over through input_filter, for example removing all tool call history so that the new Agent only sees the necessary information.

from agents import Agent, handoff
from agents.extensions import handoff_filters

agent = Agent(name="FAQ agent")

handoff_obj = handoff(
    agent=agent,
    input_filter=handoff_filters.remove_all_tools, 
)

  • Input data (input_type): In some cases you want the LLM to provide certain data when it performs a handoff, and only that data, without passing the lengthy complete context. For example, when handing off to an “escalation agent,” you might want it to provide a reason so the reason can be recorded.

from pydantic import BaseModel

from agents import Agent, handoff, RunContextWrapper

class EscalationData(BaseModel):
    reason: str

async def on_handoff(ctx: RunContextWrapper[None], input_data: EscalationData):
    print(f"Escalation agent called with reason: {input_data.reason}")

agent = Agent(name="Escalation agent")

handoff_obj = handoff(
    agent=agent,
    on_handoff=on_handoff,
    input_type=EscalationData,
)


Smart Approach: “Artificial” Intelligence, Enhancing Determinism

  • As the joke goes, there is as much “intelligence” as there is “artificial” effort. Separate deterministic work from work that must be decided by the LLM: implement the deterministic part in code, minimize the difficulty and length of the LLM’s output, and let the LLM make only the decisions.

  • For example, Param Agent resets abnormal parameters to valid values. If you instead asked the LLM to return the entire huge workflow with only a few parameter fields changed, you would add unnecessary burden on the LLM and easily trigger hallucinations.

  • Similarly, in workflow connection scenarios, having the LLM do everything puts it in a poor position: it takes a huge workflow as input, connects a few links, and then has to output one or more huge workflows. Below is an example using Link Agent:

# Tool enumerates all nodes and corresponding parameters that need connections
@function_tool
async def analyze_missing_connections() -> str:
    """
    Analyze missing connections in the workflow, enumerating all possible connection options and required new nodes.

    Return format description:
    - missing_connections: Detailed list of missing connections, including node ID, input name, required data type, etc. (only includes required inputs)
    - possible_connections: Connection options that existing nodes can provide
    - universal_inputs: Universal input ports that can accept any output type
    - optional_unconnected_inputs: List of unconnected optional inputs, including node ID, input name, configuration info, etc.
    - required_new_nodes: List of new node types that need to be created
    - connection_summary: Statistical summary of the connection analysis (including statistics on optional inputs)
    """
    try:
        # Get session_id from the context, then fetch the workflow from the db by session_id
        session_id = get_session_id()
        workflow_data = get_workflow_data(session_id)
        if not workflow_data:
            return json.dumps({"error": "No workflow data found for this session"})
        # Get current node information
        object_info = await get_object_info()
        # ... enumerate missing/possible connections against object_info and return them as a JSON string
    except Exception as e:
        return json.dumps({"error": str(e)})

------------------------------------------------------------------------

# Guide LLM to return nodes to add and lines to connect
# Tool modifies workflow and provides feedback to frontend through Event
@function_tool
def apply_connection_fixes(fixes_json: str) -> str:
    """Batch apply connection fixes, fixes_json should be JSON string containing repair instructions"""
    ...
    return  json.dumps([{
            "type": "workflow_update",
            "data": {
                "workflow_data": workflow_data,
                "changes": {
                    "applied_fixes": applied_fixes,
                    "failed_fixes": failed_fixes
                }
            }
        }])


Smart Tool Management Tips: Making AI No Longer Overwhelmed
When the system is filled with 30+ functional tools, it is like asking a novice chef to operate ten stoves at once: it is very easy to get overwhelmed.

  • Ambiguous Task and Role Boundaries: Responsibilities among Agents were unclear. In V1, multiple sub-Agents mounted numerous general-purpose tools, leading to overlapping responsibilities and decision confusion. How to allocate roles so that each agent’s expertise is fully used, and how to decompose tasks across agents, is key to improving collaboration efficiency.

We summarized two management secrets while developing ComfyUI-Copilot V2.0.

First Trick: Tool Classification and Packaging

  1. Naming is Important
    Each tool is like a spice bottle in the kitchen - names should be immediately understandable (like “Parameter Validator”, “Connection Assistant”)
    Input and output should be standardized, like spice bottle openings being uniform (mandatory type annotation and fixed return formats)

  2. Fixed Combinations Save Time and Effort
    Discover golden partnerships: like “wash vegetables→cut vegetables” fixed processes, package them as ready-made meals
    Reference Link Agent’s connection repair functionality, solidify common operation processes
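
A hypothetical example of these two conventions (the tool names and the fixed response envelope are ours, not the plugin's real tool set): every tool carries type annotations and returns the same JSON shape, and a frequently paired two-step operation is packaged as a single tool.

# Hypothetical sketch of the naming/typing convention and a packaged "golden partnership".
import json
from agents import function_tool

def envelope(tool: str, data: dict) -> str:
    """Uniform 'bottle opening': every tool returns the same JSON envelope."""
    return json.dumps({"tool": tool, "ok": True, "data": data})

@function_tool
def parameter_validator(node_id: str, name: str, value: float) -> str:
    """Check a single parameter against an allowed range."""
    return envelope("parameter_validator", {"node_id": node_id, "name": name, "valid": 0.0 <= value <= 1.0})

@function_tool
def fetch_and_normalize_node(node_class: str) -> str:
    """Packaged combo ('wash vegetables, then cut vegetables'): fetch node info and normalize it in one call."""
    return envelope("fetch_and_normalize_node", {"node_class": node_class, "normalized": True})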

Second Trick: AI Team Division Method

  1. Three-layer Management Architecture
    Grassroots employees (L1): Specialize in 3-5 tools, like Param Agent only handles parameter issues
    Team leaders (L2): Coordinate multiple tools, like Link Agent responsible for entire workflow connections
    General manager (L3): Overall coordination, dispatching different teams based on situations

  2. Smart Scheduling Secrets
    Give each tool “feature tags” (like “parameter processing”, “graphic connection”)
    When problems come, first match tags, like automatic package recognition at express sorting stations
    Switch handlers after 3 consecutive failures, avoid stubbornness
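
A toy sketch of the tag-based dispatch plus the switch-after-three-failures rule; the tag table and counts are illustrative.

# Toy sketch: feature-tag dispatch with escalation after repeated failures.
AGENT_TAGS = {
    "parameter processing": "Parameter Agent",
    "graphic connection": "Link Agent",
    "structure repair": "Workflow Bugfix Agent",
}
MAX_FAILURES = 3

def dispatch(error_tag: str, failure_counts: dict[str, int]) -> str:
    """Pick a handler by tag; hand the case upward once a handler has failed 3 times."""
    handler = AGENT_TAGS.get(error_tag, "MasterAgent")
    if failure_counts.get(handler, 0) >= MAX_FAILURES:
        return "MasterAgent"   # stop being stubborn: escalate
    return handler

counts = {"Link Agent": 3}
print(dispatch("graphic connection", counts))    # -> MasterAgent (escalated)
print(dispatch("parameter processing", counts))  # -> Parameter Agent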

Tracing Assistant - LangSmith
In MultiAgent scenarios, debugging becomes very painful: errors surface from deep inside the stack, making troubleshooting extremely difficult. In this situation you need to integrate tracing to assist with troubleshooting and debugging. The OpenAI Agents docs suggest Langfuse, but in our experience LangSmith is currently still the most usable option. Integrating LangSmith works as follows:

import os

os.environ['LANGCHAIN_TRACING_V2'] = "true"
os.environ['LANGCHAIN_API_KEY'] = "xxx"

import asyncio
from agents import Agent, Runner, set_trace_processors, set_tracing_disabled, set_default_openai_api
from langsmith.wrappers import OpenAIAgentsTracingProcessor

set_tracing_disabled(False)
set_default_openai_api("chat_completions")

async def main():
    agent = Agent(
        name="Captain Obvious",
        instructions="You are Captain Obvious...",
        model="gpt-4.1-2025-04-14-GlobalStandard",
    )
    question = "hello"
    result = await Runner.run(agent, question)
    print(result.final_output)

if __name__ == "__main__":
    set_trace_processors([OpenAIAgentsTracingProcessor()])
    asyncio.run(main())


RewriteAgent


Smart Approach: From Prompt Engineering to Context Engineering
Context Engineering is a systematic discipline focused on designing, building, and maintaining a dynamic system responsible for intelligently assembling optimal context combinations for Agents at each step of task execution, ensuring tasks can be completed reliably and efficiently.

Long contexts bring cost and coordination pressure, more easily exposing four types of context failures: pollution, interference, confusion, and conflict. They often couple with each other and directly damage reasoning stability and cross-agent transfer. Context Engineering can avoid these risks through intelligent management and compression of context, only injecting high-value conclusions and information into context (a considerable portion of tokens have no value for analysis).



Data Flow Example:

  1. Agent A collects raw data through its tool set.
  2. The cleaned, structured data is stored in the Context.
  3. A Context fingerprint is carried along during task handoff.
  4. Agent B reads the Context data directly.
  5. The LLM only handles the core decision logic.

# Context-driven architecture example
import asyncio
from dataclasses import dataclass

from agents import Agent, RunContextWrapper, Runner, function_tool

@dataclass
class UserInfo:
    name: str
    uid: int

@function_tool
async def fetch_user_age(wrapper: RunContextWrapper[UserInfo]) -> str:
    """Fetch the age of the user. Call this function to get user's age information."""
    return f"The user {wrapper.context.name} is 47 years old"

async def main():
    user_info = UserInfo(name="John", uid=123)

    # Agent B reads the shared context through its tool
    agent_b = Agent[UserInfo](
        name="B Agent",
        tools=[fetch_user_age],
    )

    # Agent A only decides to hand off; the context object travels with the run
    agent_a = Agent[UserInfo](
        name="Assistant",
        handoffs=[agent_b],
    )

    result = await Runner.run(
        starting_agent=agent_a,
        input="What is the age of the user?",
        context=user_info,
    )

    print(result.final_output)
    # The user John is 47 years old.

if __name__ == "__main__":
    asyncio.run(main())


Smart Approach: “Five Fingers That Never Touch Water” (Let the LLM Do as Little as Possible)
Any Agent design that mounts multiple tools puts more requirements and context on the LLM. For particularly complex tasks where model capability is the bottleneck, we should go back to basics: have the LLM do nothing else, complete all information collection in advance, store it in the context as structured data, and then let the LLM focus on the complex task with all of the prepared context.

# RewriteAgent completes all information collection and stores it in context
import json

from agents import function_tool

@function_tool
async def get_node_info(node_class: str) -> str:
    """Node metadata collector"""
    try:
        object_info = await get_object_info()
        if node_class in object_info:
            node_data = json.dumps(object_info[node_class], ensure_ascii=False)
            # Context persistent storage
            get_rewrite_context().node_infos[node_class] = node_data
            return node_data
    except Exception as e:
        return json.dumps({"error": f"Metadata acquisition failed: {str(e)}"})

# Context-driven workflow generator
def build_llm_context(rewrite_context) -> str:
    """Structured context builder"""
    return f"""
## Core Elements
* Business Intent: {rewrite_context.rewrite_intent}
* Current State: {rewrite_context.current_workflow}
* Environment Data: {json.dumps(rewrite_context.node_infos, ensure_ascii=False)}
* Domain Knowledge: {rewrite_context.rewrite_expert or 'Basic Rules'}
"""

# Simplified LLM interaction interface
def generate_workflow(context_str: str) -> dict:
    """Context-driven workflow generation"""
    return client.chat.completions.create(
        model=WORKFLOW_MODEL_NAME,
        messages=[{
            "role": "system",
            "content": "You are a workflow generation expert, please generate solutions based on the following structured context:"
        }, {
            "role": "user",
            "content": context_str
        }],
        response_format=RewriteResponse  # Strong type response constraints
    )


Come Try ComfyUI-Copilot!
How to download and install?
GitHub 👉 https://aidc-ai.github.io/ComfyUI-Copilot?utm_source=devto&utm_medium=blog&utm_campaign=launch&utm_content=article
Discord 👉 https://discord.gg/zNkR5xaT
