david wyatt

Posted on Dec 2, 2024

Power Automate - Handling XML

#powerautomate #powerplatform #xml #rpa

Power Automate is built on JSON (JavaScript Object Notation), every input and output will be in a JSON format, and that's great because its the best 😎. Its a universal language we can use to talk to any other system though their API's, but its not the only language. In fact the original Dynamics (which Dataverse and the Power Platform is built on) wasn't JSON, but XML (its why you can still do XML quires on Dataverse tables). XML is still widely used, so this blog is all about how you can handle it in your flow.

What is XML
NameSpaces
Xpath
Updating XML

1. What is XML

XML stands for 'Extensible Markup Language', and is a fully flexible markup language. Where HTML is also a markup language, it is not fully flexible as it has to follow the Hypertext agreed tags (you can think of the same way as a jacuzzi/hot tub, as HTML is a type of XML, but XML is not a type of HTML - wow random example there).

So XML allows you to use any tag (<tag></tag>) which is great for building custom data structures to send data between systems.

Pros/Cons

Why is XML still used, well it does have it's benefits, in particular:

Easy to read
XML is super easy to read, with the closing tag compared to a generic }, you can read nested XML.

It's everywhere
Although maybe 'dated' it isn't going away, there are API's that will not change, and there is the 800 pound elephant in the room, HTML. As I said HTML is XML, so if you want to handle HTML, you need to handle XML.

But there are also a couple of negatives to XML, especially compared to JSON, which is why Power Automate, and most new API's use it.

It's Verbose
Compared to JSON, there are more words/characters in the payload, which makes it slower to type and require more data:

<name>David Wyatt</name>

json

"name":{David Wyatt}

Open Structure
JSON does not allow duplicate keys in the same object, but XML does, which can cause problems when combining XML documents from different XML applications:

<developer>
   <name>David Wyatt</name>
   <tech>power platform</tech>
</developer>

<developer>
   <dev name>George Tzani</dev name>
   <role>Snr Dev</role>
   <location>UK</location>
</developer>

To rectify above XML uses namespaces, which is great but again adds complexity.

2. Namespaces

Namespaces groups tags so that they can be identified/filtered. They are generally a url to documentation for the XML, just like a JSON will have a schema url.

You can see namespaces with the xmlns attributes in a tag, everything nested in the tag is in the namespace.

<?xml version="1.0" encoding="UTF-8"?>
<library
    xmlns="http://example.com/library">
    <book>
        <title>XML Basics</title>
        <author>John Doe</author>
    </book>
    <book>
        <title>Advanced XML</title>
        <author>Jane Smith</author>
    </book>
</library>

You can have multiple namespaces inside the same tag, but then any tags linked to the namespace require a unique prefix linked to the namespace.

<?xml version="1.0" encoding="UTF-8"?>
<catalog
    xmlns:bk="http://example.com/books"
    xmlns:auth="http://example.com/authors">
    <bk:book>
        <bk:title>XML Basics</bk:title>
        <auth:author>John Doe</auth:author>
    </bk:book>
    <bk:book>
        <bk:title>Advanced XML</bk:title>
        <auth:author>Jane Smith</auth:author>
    </bk:book>
</catalog>

3. Xpath

Xpath is the way to query/navigate the XML tree. If you think of each tag as a folder on your desktop, and you click through nested folders you see a path in the explorer window, xpath is this shortcut, but with some extra bells and whistles.

So if I wanted all the books in the below I would use /library/book

<?xml version="1.0" encoding="UTF-8"?>
<library>
    <book>
        <title>XML Basics</title>
        <author>John Doe</author>
    </book>
    <book>
        <title>Advanced XML</title>
        <author>Jane Smith</author>
    </book>
</library>

That would return an array, so if I wanted the first I could use /library/book[0]. Also I can grab all books in any tag with a kind of wildcard of //book.

The structure of our xpath looks like this:

Expression	Description
tag	Selects all tags/nodes with the name "tagName"
/	Selects from the root tag
//	Selects tags in the document from the current tag that match the selection no matter where they are
.	Selects the current tag
..	Selects the parent of the current tag
@	Selects attributes

Xpath can also sum and other math expressions from arrays, its incredibly powerful and I could go on all day, but there are already great resources available like https://www.w3schools.com/xml/xpath_intro.asp.

There is one call out I want to make here, and that the above doesn't work the same with namespaces.

If the XML had prefixes then they can be added and it works, but if its a just a namespace then the xpath now looks something like:

/library/*[local-name()='book']

If only one namespace in the entire XML, or if a specific namespace:

/library/*[local-name()='book' and namespace-uri()='http://example.com/library']

Power Automate has an xpath function, but its not the only way to handle XML, in fact there are 3 ways.

4. Updating XML

String Manipulation
JSON Manipulation
XML Manipulation

In the below we are going to use this XML, and we want to remove the location tag (although these techniques will also work with updating/adding).

<?xml version="1.0" encoding="UTF-8"?>
<library
    xmlns="http://example.com/library">
    <location>UK</location>
    <book>
        <title>XML Basics</title>
        <author>John Doe</author>
    </book>
    <book>
        <title>Advanced XML</title>
        <author>Jane Smith</author>
    </book>
</library>

String Manipulation

I will let you in on a secret, although the most 'bodgy' way to do it, this is kind of my favourite 😎

In this approach we are going to treat the XML as text and edit it. The best way to do that in Power Automate is the replace() function, but that doesn't allow regex's. So instead we are going to use the split() function approach. We are going to split the xml into 3, up to the location, the location, and after location, then put 1 and 3 back together.

The first expression splits the before and after location, creating our first array of 2 items.

split(triggerBody()['text'],'<location>')

The second splits the second item into 2, and then concats the first item from the first array with the second item from the second array.

concat(
    outputs('Split_xml')[0]
,
    split(outputs('Split_xml')[1],'<location/>')[1]
)

If we wanted to edit/add new tag, we would add our new tag between the above items.

JSON Manipulation

Power Automate uses JSON as its api language, its less verbose and the new standard. So in this approach we are going to convert to JSON, remove the tag, convert back to XML.

The first expression converts the XML to a JSON, but as the content is a string, not XML, we convert it to XML first.

json(xml(triggerBody()['text']))

The second expression uses the removeProperty() function to remove the location tag. Though as its a nested tag (inside the library tag) we have to update the library property with the location property removed.

xml(setProperty(
    outputs('Convert_to_JSON')
,
    'library'
,
    removeProperty(
        outputs('Convert_to_JSON')?['library']
    ,
        'location'
    )
))

Finally we convert it back to xml using the xml() function.

Here the edit/add tag is super easy, we just change the removeProperty() to setProperty() / addProperty().

XML Manipulation

This is the way we probably should do it, but for me its a little to clunky. We cant edit/add/remove properties like we can with a JSON, so we are going to have to do string manipulation too.

First we are going to use the xpath() function to get our location tag, as with JSON conversion we also need to convert it to xml first. We use the below expression to get the tag:

xpath(
    xml(triggerBody()['text']))
,
    '//*[local-name()="location"]'
)[0]

As it returns an array of all location tags we have to grab the first with the array position [0].

Call out, if you didn't select a specific item in the array, and went for the whole array, it would return a collection of base64 items (like files, showing $content-type and $content keys), so you would need to base64ToString() convert the $content.

One small problem, its also appends a attribute to the tag, xmlns="http://example.com/library", which normally wouldn't matter, but it will in this case (see later). Why does it do that, because of the namespace. As said before, if you didn't have the namespace then the below expression would work:

 xpath(xml(triggerBody()['text'])),'//location')

If you wanted to target with a namespaces you would add it to the xpath, so ours would look like this:

/*[local-name()="location" and namespace-uri()="http://example.com/library"]/*[local-name()="location"]

The only reason I didn't include the and namespace is that it isn't necessary, as there is only one namespace in the XML.

So why is this a problem, well when you use local-name() it automatically adds the namespace as a. attribute to our xpath() output (to help track it in the future). Again David why is this a problem, well because of the next step, we plan to use the xpath() output to do a find and replace (our way if creating a kind of regex), and that wont work because now it doesn't match the tag in the XML.

The namespace shouldn't be added without the namespace in the xpath(), but Microsoft decided to add it anyway. Finally getting to the point now, it means that we need to remove the namespace parameter from our xpath() output, and you guessed it, the best way is to convert it to a string and replace it with ''.

replace(
    string(
        xpath(
            xml(triggerBody()['text'])
        ,
            '//*[local-name()="location"]'
        )[0]
    )
,
    ' xmlns="http://example.com/library"'
,
    ''
)

Callout, see that leading whitespace of xmkns, that's there on purpose, do not remove it

Now we have out tag that is an exact match, our second expression uses a replace() expression to replace it with ''.

replace(
    triggerBody()['text'])
,
    outputs('Get_Protection')
,
    ''
)

If you want to add a tag you could concat it to the xpath() output, and if you want to edit it you could just edit/rebuild the xpath() output.

All 3 ways to handle XML work, but in most cases I would say JSON convert is the best, with XML for any particularly awkward schemas, but I still love a good string split() 😎.

The flow can be found in a solution named XML here.