DEV Community

Yawar Amin

Constructing XML output with dream-html

For some time now, I have been maintaining an OCaml library called dream-html. This library is primarily intended to render correctly-constructed HTML, SVG, and MathML. Recently, I added the ability to render well-formed XML markup, which has slightly different rules than HTML. For example, in HTML, if you want to write an empty div tag, you write <div></div>. According to the rules of XML, you could also write <div/>, i.e. a self-closing tag; HTML5, however, does not have the concept of self-closing tags!

So by having the library take care of these subtle but crucial details, you can just concentrate on writing code that generates the markup. Of course, this has many other advantages too, but in this post I will just look at XML.

It turns out that we often need to serialize some data into XML format, for storage or communication purposes. There are a few packages in the OCaml ecosystem which handle XML; however, I think dream-html actually does it surprisingly well now. Let's take a look.

But first, a small clarification about the dream-html package itself. Recently I split it up into two packages:

  1. pure-html has all the functionality needed to write valid HTML and XML
  2. dream-html has all of the above, plus some integration with the Dream web framework for ease of use.

As you might imagine, the reason for the split was to allow using the HTML/XML functionality of the package without having to pull in the entire Dream dependency cone, which is quite large, especially if your project already pulls in a different dependency cone of its own. pure-html itself depends only on the uri package, which helps construct correct URI strings.

To start using it, just install: opam install pure-html

And add to your dune file: (libraries pure-html)

Now, let's look at an example of how you can use it to construct XML. Suppose you have the following type:

type person = {
  name : string;
  email : string;
}

And you need to serialize it to XML like this:

<person name="Bob" email="bob@example.com"/>

Let's write a serializer using the pure-html package:

open Pure_html

let person_xml =
  let person = std_tag "person"
  and name = string_attr "name"
  and email = string_attr "email" in
  fun { name = n; email = e } -> person [name "%s" n; email "%s" e] []

Let's test it out:

$ utop -require pure-html
# open Pure_html;;
# let pp = pp_xml ~header:true;;
val pp : Format.formatter -> node -> unit = <fun>
# #install_printer pp;;
# type person = {
  name : string;
  email : string;
};;
type person = { name : string; email : string; }
# let person_xml =
  let person = std_tag "person"
  and name = string_attr "name"
  and email = string_attr "email" in
  fun { name = n; email = e } -> person [name "%s" n; email "%s" e] [];;
val person_xml : person -> node = <fun>
# person_xml { name = "Bob"; email = "bob@example.com" };;
- : node =
<?xml version="1.0" encoding="UTF-8"?>
<person
name="Bob"
email="bob@example.com" />

OK cool, so our person record is serialized with its fields as attributes. But what if we need to serialize it with the fields as child elements instead, like this:

<person>
  <name>Bob</name>
  <email>bob@example.com</email>
</person>

After all, this is a common way of formatting records in XML. Let's write the serializer in this style:

let person_xml =
  let person = std_tag "person"
  and name = std_tag "name"
  and email = std_tag "email" in
  fun { name = n; email = e } ->
    person [] [
      name [] [txt "%s" n];
      email [] [txt "%s" e];
    ]

Let's try it out:

# let person_xml =
  let person = std_tag "person"
  and name = std_tag "name"
  and email = std_tag "email" in
  fun { name = n; email = e } ->
    person [] [
      name [] [txt "%s" n];
      email [] [txt "%s" e];
    ];;
val person_xml : person -> node = <fun>
# person_xml { name = "Bob"; email = "bob@example.com" };;
- : node =
<?xml version="1.0" encoding="UTF-8"?>
<person><name>Bob</name><email>bob@example.com</email></person>

Looks good! Let's examine the functions from the pure-html package used here to achieve this.

std_tag

This function lets us define a custom tag: let person = std_tag "person". Note that it's trivial to add a namespace: let person = std_tag "my:person".
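
Because a tag defined with std_tag is just an ordinary function from attributes and children to a node, it composes with the rest of OCaml. As a rough sketch (the people wrapper element and the people_xml helper are names I made up for illustration, not part of pure-html), here is how a list of records could be serialized under one parent element:

```ocaml
open Pure_html

type person = { name : string; email : string }

(* Custom tags, each defined once and reused for every record. *)
let people_tag = std_tag "people"
let person_tag = std_tag "person"
let name_tag = std_tag "name"
let email_tag = std_tag "email"

(* Same child-element layout as the serializer shown earlier. *)
let person_xml { name; email } =
  person_tag [] [ name_tag [] [ txt "%s" name ]; email_tag [] [ txt "%s" email ] ]

(* Hypothetical helper: wrap every serialized person in a <people> element. *)
let people_xml persons = people_tag [] (List.map person_xml persons)
```

Calling people_xml on a list of person records gives a single node that can be handed to the same printer as before.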

string_attr

This allows us to define a custom attribute which takes a string payload: let name = string_attr "name". Again, easy to add a namespace: let name = string_attr "my:name".

There are other attribute definition functions which allow int payloads and so on. See the package documentation for details.
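
For instance, an int payload might look like the following sketch. It assumes the int attribute constructor is called int_attr and takes the attribute name, then the int value; please check the package documentation for the exact name and signature. The item type and its fields are made up for this example.

```ocaml
open Pure_html

(* Hypothetical record, used only for this illustration. *)
type item = { sku : string; quantity : int }

let item_xml =
  let item = std_tag "item"
  and sku = string_attr "sku"
  (* Assumption: int_attr "quantity" yields a function from int to attr. *)
  and quantity = int_attr "quantity" in
  fun { sku = s; quantity = q } -> item [ sku "%s" s; quantity q ] []
```

Note that the int attribute takes the value directly, with no format string, unlike the string attribute.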

pp_xml

This allows us to define a printer which renders XML correctly according to its syntactic rules:

let pp = pp_xml ~header:true

The optional header argument controls whether the XML header (the <?xml version="1.0" encoding="UTF-8"?> line in the output above) is printed before the markup. In many serialization cases, we want it.

There's also a similar function which, instead of defining a printer, just converts the constructed node into a string directly: to_xml.
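
A minimal sketch of to_xml in use, for when you want the markup as a string (say, to write to a file or send over the network) rather than printed through a formatter. The greeting element is made up for the example, and I call to_xml with just the node; whether it also accepts the header argument is something to confirm in the package documentation.

```ocaml
open Pure_html

(* A small node to render; "greeting" is an arbitrary custom tag. *)
let greeting = std_tag "greeting" [] [ txt "%s" "hello" ]

(* to_xml converts the node straight into a string. *)
let xml_string : string = to_xml greeting

let () = print_endline xml_string
```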

Conclusion

With these basic functions, it's possible to precisely control how the serialized XML looks. Note that dream-html and pure-html support only serialization of data into XML format, not deserialization, i.e. parsing XML. For that, there are other packages!

Top comments (1)

Jayesh Bhoot

Just wrote an atom feed generator using your library. Great DX!