Most of the softwares under the hood are data structures and algorithms. These are the essential tools(skills) for software craftsmen. If you are a developer, you must have come across a task where you had to use data structures and apply an algorithm to solve the problem.
TL;DR;
This article explains a use case on how trees are useful for organizing hierarchical data and processing them using a recursive algorithm.
The source code is available at https://github.com/premchalmeti/clish_xml_parser
I recently started working with a virtualization products company that operates in enterprise data services. A virtual machine contains storage disks attached to it and these VMs collectively called storage clusters.
The clusters had an admin CLI(Command Line Interface) using CLISH(pronounce: see-lish) framework. My job was to create a safe CLI for users to execute mostly read-only commands.
What isΒ CLISH??
- A CLISH(Command Line Interface SHell) is an open-source framework used to create CLI on *nix platforms.
- In CLISH, adding a new menu in the console is as simple as writing an XML(
clock.xml
) file. You just have to define all your commands(clock set, clock timezone) in an XML file at a single location. - Now, The CLISH reads these XMLs and generate an instance of CLI like this,
Problem π
Let's explore the problem by taking an example,
In admin CLI, a network menu has all the commands like network show, network device add, etc.
In user CLI, a network module (user_network.xml
) has only safe commands in it.
A safe command doesn't alter the state (configuration to be precise) of the cluster.
- network device add: not safe (add a new device hence alters the state of the clusters)
- network show: safe
This safe command's definition is provided in a JSON schema file.
Now, the first step in the process of creating a CLI was to write a parser that does the following things.
- Read the JSON schema file
- Loads the corresponding source XML file (source_file key in the JSON)
- Parse the XML and remove the unsafe commands (The algorithm is explained later)
- Output the safe XML (
user_network.xml
) to be used for user CLI
However, we can not directly use the schema file for parsing. If you notice in the above XML file the CLI has a particular format to group related commands under a parent command.
- network device groups network device add, network device delete, etc
- network device configure groups network device configure rename, network device configure ip, etc
In this case, if all child commands are removed (are unsafe) then the parent command has to be removed and so on till the root node.
Iterating a list of commands without any relation among them is difficult to operate on.
Solution π‘
This stringent parsing requires the commands to be organized in a hierarchy then recursively process the nodes from bottom to up. (The image gives more idea).
Trees are best data structures for storing data in hierarchical order
A simple variant of general trees fits the bill. A general tree contains zero or more child nodes.Β
We can tokenize each command and prepare a common general tree definition from the current schema file. We will refer to this tree as CmdTree. A general tree for the admin_network.xml
constructed using the schema file looks below
Each node in the above tree has a cmd token (network, show, device) and has some more metadata used while parsing.
Finally, we can now derive this recursive algorithm for parsing the XML.
Algorithm π
The final XML will look like this.
Since this XML only contains safe commands after parsing. These XMLs can now be used for creating a safe CLI for the users.
The source code is available at https://github.com/premchalmeti/clish_xml_parser
Top comments (1)
Nicely Explained!!