Most of the softwares under the hood are data structures and algorithms. These are the essential tools(skills) for software craftsmen. If you are a developer, you must have come across a task where you had to use data structures and apply an algorithm to solve the problem.
TL;DR;
This article explains a use case on how trees are useful for organizing hierarchical data and processing them using a recursive algorithm.
The source code is available at https://github.com/premchalmeti/clish_xml_parser
I recently started working with a virtualization products company that operates in enterprise data services. A virtual machine contains storage disks attached to it and these VMs collectively called storage clusters.
The clusters had an admin CLI(Command Line Interface) using CLISH(pronounce: see-lish) framework. My job was to create a safe CLI for users to execute mostly read-only commands.
What is CLISH??
- A CLISH(Command Line Interface SHell) is an open-source framework used to create CLI on *nix platforms.
- In CLISH, adding a new menu in the console is as simple as writing an XML(
clock.xml
) file. You just have to define all your commands(clock set, clock timezone) in an XML file at a single location. - Now, The CLISH reads these XMLs and generate an instance of CLI like this,
Problem 💭
Let's explore the problem by taking an example,
In admin CLI, a network menu has all the commands like network show, network device add, etc.
<?xml version="1.0" encoding="UTF-8"?> | |
<CLISH_MODULE xmlns="http://clish.sourceforge.net/XMLSchema" | |
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" | |
xsi:schemaLocation="http://clish.sourceforge.net/XMLSchema | |
http://clish.sourceforge.net/XMLSchema/clish.xsd"> | |
<!--=======================================================--> | |
<PTYPE name="STRING" | |
pattern="[^\-]+" | |
help="String"/> | |
<PTYPE name="IP_ADDR" | |
pattern="(((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))" | |
help="IP address AAA.BBB.CCC.DDD where each part is in the range 0-255"/> | |
<!--=======================================================--> | |
<COMMAND name="network show" | |
help="Show currently added devices"> | |
<ACTION>network show</ACTION> | |
</COMMAND> | |
<!--=======================================================--> | |
<COMMAND name="network device" | |
help="Device related commands"></COMMAND> | |
<!--=======================================================--> | |
<COMMAND name="network device add"> | |
<PARAM name="device_name" ptype="STRING" help="Name of the device"></PARAM> | |
<PARAM name="device_ip" ptype="IP_ADDR" help="IP of the device"></PARAM> | |
<ACTION>network device add ${device_name} ${ip_addr}</ACTION> | |
</COMMAND> | |
<!--=======================================================--> | |
<COMMAND name="network device list"> | |
<ACTION>network device list</ACTION> | |
</COMMAND> | |
<!--=======================================================--> | |
<COMMAND name="network device configure"> | |
</COMMAND> | |
<!--=======================================================--> | |
<COMMAND name="network device configure rename"> | |
<PARAM name="old_name" ptype="STRING" help="Current name of the device"></PARAM> | |
<PARAM name="new_name" ptype="STRING" help="New name of the device"></PARAM> | |
<ACTION>network device rename ${old_name} ${new_name}</ACTION> | |
</COMMAND> | |
<!--=======================================================--> | |
<COMMAND name="network device configure ip"> | |
<PARAM name="device_name" ptype="STRING" help="Name of the device"></PARAM> | |
<PARAM name="device_ip" ptype="IP_ADDR" help="IP of the device"></PARAM> | |
<ACTION>network device configure ip ${device_name} ${ip_addr}</ACTION> | |
</COMMAND> | |
</CLISH_MODULE> |
In user CLI, a network module (user_network.xml
) has only safe commands in it.
A safe command doesn't alter the state (configuration to be precise) of the cluster.
- network device add: not safe (add a new device hence alters the state of the clusters)
- network show: safe
This safe command's definition is provided in a JSON schema file.
{ | |
"version": "2.0.0", | |
"created_on": "2021-06-28T17:29:37.714428", | |
"module": { | |
"name": "network", | |
"source_file": "network.xml", | |
"commands": [ | |
{ | |
"cmd": "network show", | |
"meta": { | |
"safe": true | |
} | |
}, | |
{ | |
"cmd": "network device add", | |
"meta": { | |
"safe": false | |
} | |
}, | |
{ | |
"cmd": "network device list", | |
"meta": { | |
"safe": true | |
} | |
}, | |
{ | |
"cmd": "network device configure rename", | |
"meta": { | |
"safe": false | |
} | |
}, | |
{ | |
"cmd": "network device configure ip", | |
"meta": { | |
"safe": false | |
} | |
} | |
] | |
} | |
} |
Now, the first step in the process of creating a CLI was to write a parser that does the following things.
- Read the JSON schema file
- Loads the corresponding source XML file (source_file key in the JSON)
- Parse the XML and remove the unsafe commands (The algorithm is explained later)
- Output the safe XML (
user_network.xml
) to be used for user CLI
However, we can not directly use the schema file for parsing. If you notice in the above XML file the CLI has a particular format to group related commands under a parent command.
- network device groups network device add, network device delete, etc
- network device configure groups network device configure rename, network device configure ip, etc
In this case, if all child commands are removed (are unsafe) then the parent command has to be removed and so on till the root node.
Iterating a list of commands without any relation among them is difficult to operate on.
Solution 💡
This stringent parsing requires the commands to be organized in a hierarchy then recursively process the nodes from bottom to up. (The image gives more idea).
Trees are best data structures for storing data in hierarchical order
A simple variant of general trees fits the bill. A general tree contains zero or more child nodes.
We can tokenize each command and prepare a common general tree definition from the current schema file. We will refer to this tree as CmdTree. A general tree for the admin_network.xml
constructed using the schema file looks below
Each node in the above tree has a cmd token (network, show, device) and has some more metadata used while parsing.
Finally, we can now derive this recursive algorithm for parsing the XML.
Algorithm 📝
0. START | |
1. Traverse the CmdTree node from the root CALL process_node(CmdTree.root) | |
2. START process_node: | |
If the CmdNode is not a leaf node then | |
2.1. Process all the child nodes of the current node first. CALL process_node(node) | |
2.2. After processing all child CmdNode. Set current node visibility | |
If all the childs are removed then | |
node.visible = False | |
Else | |
node.visible = True | |
End if | |
Else | |
CALL action_node(node) | |
End If | |
END process_node | |
3. START action_node: | |
if the node is not visible then | |
Remove the node from the XML | |
END action_node | |
4. END |
The final XML will look like this.
<?xml version="1.0" encoding="UTF-8"?> | |
<CLISH_MODULE xmlns="http://clish.sourceforge.net/XMLSchema" | |
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" | |
xsi:schemaLocation="http://clish.sourceforge.net/XMLSchema | |
http://clish.sourceforge.net/XMLSchema/clish.xsd"> | |
<COMMAND name="network show" | |
help="Show currently added devices"> | |
<ACTION>network show</ACTION> | |
</COMMAND> | |
<COMMAND name="network device" | |
help="Device related commands"> | |
</COMMAND> | |
<COMMAND name="network device list"> | |
<ACTION>network device list</ACTION> | |
</COMMAND> | |
</CLISH_MODULE> |
Since this XML only contains safe commands after parsing. These XMLs can now be used for creating a safe CLI for the users.
The source code is available at https://github.com/premchalmeti/clish_xml_parser
Top comments (1)
Nicely Explained!!