This is a submission for the Agent.ai Challenge: Full-Stack Agent and Productivity-Pro Agent.
What I Built
I have developed a Mindmap Generator using an agent-based approach.
This agent will collaborate with you to create a mindmap from your ideas.
https://agent.ai/agent/mindmap
A mindmap is a well-known visual tool that organizes ideas around a central concept, using branches to illustrate relationships and subtopics. It is useful for brainstorming, planning, and summarizing information, making complex ideas easier to understand and remember.
Mindmapping was originally an interactive process: people build the map through trial and error using paper and colored pens. At the same time, it is also valued purely as a visualization technique. This agent supports both aspects.
Interactive Concept Building
The user can progressively develop the diagram while conversing with the AI.
Fully Automated Visualization
The user can have the AI automatically generate a mindmap to visualize their ideas.
Demo
https://agent.ai/agent/mindmap
Please open the URL above and try using the Mindmap Generator.
Here is a video that demonstrates how it works.
- Enter your idea.
- Choose "Full automation" or "Interactive."
- The agent presents the first result, which you can revise through conversation.
In Interactive mode, the initial result is very minimal. In Full automation mode, a nearly complete result is displayed.
Once you are satisfied with your mindmap, the conversation can end at any time. If needed, press the "Open SVG" button to save it as an SVG image.
The images downloaded from the Mindmap Generator have a wide range of uses. Since the image format is SVG, you can easily edit them later using any online or offline SVG editor. You can modify the text or move items as well.
How it works
The mindmap needs to stimulate the viewer's imagination. In particular, the concept placed at the center must be evocative. For this reason, the agent performs the following generation tasks sequentially to create the mindmap:
Generation Tasks
- Task 1: Generate Concept Image
  - Generate an image and an `<img>` tag representing the central concept.
- Task 2: Generate Concept Tree
  - Generate text in a tree structure representing ideas related to the concept.
- Task 3: Generate Visual Map
  - Using the concept image and the tree-structured text, generate an SVG image.
- Subtask: Transform Format
  - A subtask generates the Base64-formatted image data required by Task 3 from the `<img>` tag produced in Task 1.
The agent also conducts the following dialogue tasks with the user:
Dialogue Tasks
- Task a: Get Concept
  - Obtain the central concept from the user.
- Task b: Get Mode
  - Obtain the user's selection for the execution mode ("Interactive" or "Full Automation").
- Task c: Revise Interaction
  - Incorporate the user's instructions for the output SVG image and redo Generation Tasks 2 and 3.
The process flow is illustrated in the following diagram.
It may seem logical to place Task b immediately after Task a, but because the Generation Tasks take a significant amount of time, Task b's question is asked midway through them so the user is not kept waiting too long.
Here is a video that explains the workflow of the Mindmap Generator within Agent Builder.
Technical challenges
Task 1: Generate Concept Image
The concept at the center of a mindmap should ideally be circular, as the branches can extend in any direction without bias.
However, general image generation AI creates rectangular images. By using SVG's clip-path, it is easy to crop the original image into a circular shape. This SVG approach significantly enhances the visual expressive power of AI agents.
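The circular crop can be sketched as follows. This is an illustrative TypeScript snippet, not the agent's actual code; the function name and sizing are assumptions.

```typescript
// Sketch: wrap a rectangular raster image in an SVG circle using clip-path,
// so mindmap branches can radiate evenly around the central concept.
function circularImageSvg(href: string, size: number): string {
  const r = size / 2;
  return [
    `<svg xmlns="http://www.w3.org/2000/svg" width="${size}" height="${size}">`,
    `  <clipPath id="circle-clip"><circle cx="${r}" cy="${r}" r="${r}"/></clipPath>`,
    // preserveAspectRatio="xMidYMid slice" center-crops the rectangle
    // before the circular clip is applied.
    `  <image href="${href}" width="${size}" height="${size}"`,
    `         preserveAspectRatio="xMidYMid slice" clip-path="url(#circle-clip)"/>`,
    `</svg>`,
  ].join("\n");
}
```

Because the clip is declarative, the underlying rectangular image is preserved and can still be recovered in an SVG editor later.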
The generated images are likely to be more helpful for free thinking if they are abstract rather than concrete. The agent therefore generates images suitable for mindmaps using a prompt such as the following:
For "{{user_input}}", draw one picture that meets all of the following conditions: (1) The lines should look as if they were drawn with a thick brush. (2) There should be only one element in the picture. (3) The painting should be very simple. (4) The picture must be painted in one pastel color. (5) Paint as a composition. (6) The painting must express Zen.
Task 2: Generate Concept Tree
Generating tree-structured text with AI is not difficult, as AI has knowledge of mindmaps and an understanding of tree structures based on simple rules. Thus, the required prompt will be as follows.
```
Create a mindmap about "{{user_input}}" with the tree structure written as follows:
- Write each item on one line.
- Use the number of leading spaces to indicate the depth of the tree structure.
- The highest level should have 1 leading space.
- The top level should always be "{{user_input}}" and not "top."
Respond with the mind map starting from the top level on the first line, without adding unnecessary words.
Follow the [Example] below:
[Example]
 top
  idea1
   idea1_1
    idea1_1_1
    idea1_1_2
   idea1_2
  idea2
   idea2_1
   idea2_2
```
If an example of a tree structure is provided at the end of the prompt, failures will likely decrease.
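As a sketch of how such indentation-based output can be consumed downstream, the following TypeScript parser turns the tree text into nested nodes. The names are illustrative and not taken from the text2mindmap repository.

```typescript
// Sketch: parse the indentation-based tree text produced by Task 2.
// The number of leading spaces on each line encodes its depth.
interface TreeNode {
  label: string;
  depth: number;
  children: TreeNode[];
}

function parseTree(text: string): TreeNode {
  const lines = text.split("\n").filter((l) => l.trim() !== "");
  const root: TreeNode = { label: "", depth: 0, children: [] };
  const stack: TreeNode[] = [root];
  for (const line of lines) {
    const depth = line.length - line.trimStart().length; // leading spaces = depth
    const node: TreeNode = { label: line.trim(), depth, children: [] };
    // Pop until the top of the stack is this node's parent.
    while (stack.length > 1 && stack[stack.length - 1].depth >= depth) stack.pop();
    stack[stack.length - 1].children.push(node);
    stack.push(node);
  }
  return root.children[0]; // the top-level concept
}
```

A stack-based parse like this also tolerates the model occasionally skipping a depth level, which is one reason the indentation format is robust in practice.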
Task 3: Generate Visual Map
SVG diagram generation is still less common compared to the generation of text, pictures, or videos. For this reason, I needed to create my own API to convert text data into a mindmap.
Libraries for generating mindmaps from tree-structured text, such as Mermaid and Pintora, are well-known. However, Mermaid does not run on plain Node.js because it needs a browser DOM, and Pintora, which uses native code, is difficult to run as a serverless function. Therefore, I had to write a Web API from scratch to generate SVG diagrams of mindmaps from text.
At first, I expected this to be challenging, but it turned out to be easier than I thought. What I want to share with you today is that a serverless function for generating SVG diagrams with a specific purpose does not require a lot of code or external libraries.
Here, I am sharing the text2mindmap Web API I created:
https://github.com/sosuisen/text2mindmap
It consists of a 360-line layout algorithm and a 130-line data class written in TypeScript, which is not particularly large. Additionally, it contains only a few SVG tags, and since Copilot is adept at handling SVG tags and attributes, this part should not trouble you much.
The key point of the layout algorithm is to divide the second-level nodes into two halves and place them on the left and right sides. In the example above, there are seven nodes at the second level, with the first four expanding to the right and the remaining three to the left. While paper-based mindmaps are more radial and allow for free text orientation, this approach offers a more computer-friendly representation based on horizontally written text.
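The left/right split described above can be sketched in a few lines. This is an illustrative TypeScript fragment under assumed names, not the repository's actual layout code.

```typescript
// Sketch: divide the second-level branches so the first half expands to
// the right of the center and the remainder to the left.
function splitBranches<T>(branches: T[]): { right: T[]; left: T[] } {
  const mid = Math.ceil(branches.length / 2); // 7 branches -> 4 right, 3 left
  return { right: branches.slice(0, mid), left: branches.slice(mid) };
}
```

Each half can then be laid out independently as a horizontal tree growing away from the center, which keeps all labels in normal left-to-right reading order.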
You could create a serverless function that generates custom SVG diagrams like the one above in less than a week. This time, I used Vercel because the code was slightly too long to deploy using the Lambda deployment feature provided by Agent.ai. I developed it with VSCode and TypeScript and deployed it as a free serverless function on Vercel.
If you publish one of these diagram-generation functions, everyone on agent platforms like Agent.ai will be able to use it. This collection of small, user-contributed feature extensions will enhance the visual expression capabilities of agents in the future.
Agent.ai Experience
Flow control and output history are excellent
This agent allows users to gradually develop the output diagram by providing improvement instructions. The implementation of this iterative process was seamlessly achieved using the flow control provided by Agent.ai.
Additionally, since Agent.ai retains past outputs as logs, users can easily refer to the history of modifications made to the diagram. This feature of Agent.ai is highly practical.
Serverless functions have limitless capabilities
Agents need to combine the results of multiple tasks, and data format gaps inevitably arise in the process. While Agent.ai comes with several predefined data transformation methods, developers can extend its capabilities for tasks that cannot be handled by the built-in methods.
With Agent.ai, developers can perform any type of data transformation by writing a simple serverless function. These serverless functions are automatically deployed to AWS Lambda and become callable.
The function I needed this time takes the `<img src="https://xxx.yyy/zzz.jpg">` tag returned by DALL-E 3 as input and converts the image referenced by the `src` attribute into a Base64-encoded string that can be embedded into the SVG. I achieved this by writing a roughly 50-line program that reads the URL from the `src` attribute, fetches the image, and Base64-encodes it, and then deploying it with Agent.ai.
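A minimal sketch of such a transformation is shown below. Function names are assumptions for illustration; this is not the actual Lambda source.

```typescript
// Sketch: pull the URL out of an <img> tag's src attribute.
function extractSrc(imgTag: string): string {
  const m = imgTag.match(/src="([^"]+)"/);
  if (!m) throw new Error("no src attribute found");
  return m[1];
}

// Sketch: fetch the referenced image and return a data: URI that can be
// embedded directly in an SVG <image> element. Requires Node 18+ for fetch.
async function imgTagToBase64(imgTag: string): Promise<string> {
  const res = await fetch(extractSrc(imgTag));
  if (!res.ok) throw new Error(`fetch failed: ${res.status}`);
  const buf = Buffer.from(await res.arrayBuffer());
  const mime = res.headers.get("content-type") ?? "image/jpeg";
  return `data:${mime};base64,${buf.toString("base64")}`;
}
```

Embedding the image as a data: URI keeps the final SVG self-contained, so it still renders after the temporary DALL-E 3 URL expires.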
Data format gaps occur frequently, and serverless functions that can handle any transformation process are a powerful solution. In agent programming, one of the greatest challengesโand sources of satisfactionโis coming up with ideas for these kinds of transformations.
The serverless function I created is publicly available here:
https://github.com/sosuisen/imgtag2base64-lambda
SVG support is challenging
SVG is not natively supported in Agent.ai, making the output and download of SVGs somewhat challenging.
While I succeeded in displaying the SVG by embedding its markup in HTML, an SVG embedded as a tag cannot be downloaded as an image. To enable users to download the SVG, I included `<form>` and `<button>` tags in Agent.ai's output, allowing the text2mindmap API to be called again in a new tab.
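A sketch of the kind of HTML the agent can emit for this follows. The endpoint URL and field name are illustrative assumptions, not the actual API contract.

```typescript
// Sketch: build a form that re-submits the tree text to the text2mindmap
// API in a new tab, where the browser renders the SVG response and the
// user can save it. Endpoint and field names are hypothetical.
function openSvgButton(apiUrl: string, treeText: string): string {
  const escaped = treeText
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;");
  return (
    `<form action="${apiUrl}" method="post" target="_blank">` +
    `<input type="hidden" name="text" value="${escaped}">` +
    `<button type="submit">Open SVG</button></form>`
  );
}
```

Because the form opens in a new tab, the conversation in Agent.ai stays intact while the SVG is regenerated for download.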
Conclusion
In this article, I shared extensive information about handling SVGs with Agent.ai. I look forward to seeing many visual agents emerge in the future.