YAML is a great file format, used for writing configuration files and, more and more, as a foundation for Domain-Specific Languages (DSLs), such as script files for GitHub workflows, Docker or Kubernetes.
How to make YAML dynamic?
We have all, however, suffered from a frustration: YAML is fundamentally a static file format. How can we make it dynamic?
It starts with the fact that some string, like http://dev.company.local, appears several times across the files.
server:
  url: http://dev.company.local
  ...
documentation:
  url: http://dev.company.local/docs
  ...
And the day that address changes, what do we do?
- Do a search/replace in the config files? Nah... this is brittle and things will get worse before they get better.
- Define constant values with anchors (&), and then use aliases (*) to refer to them in the rest of the YAML file. This is a really neat feature, which gives YAML an advantage over other formats: now you can put your constants at the top of the file. Anchors don't need to be strings or numbers (foo or 5); they can also be maps or sequences ({foo: 5, bar: 6} or [foo, bar, baz]). However, it doesn't help in the many cases where the change is more complicated than swapping one snippet for another. A "small" change of parameter (from dev to test, and from test to live) leads to significant changes in the YAML tree.
- All right, then solve it with a little bit of templating, using Jinja (Python) or the Go native engine. Solved! ...Really? Everything is fine for a while, and then the workflow crashes because some corner case broke the YAML syntax. And we start fiddling with spaces and escaping and newlines... and after 5 or 10 abort-debug-rerun cycles, the delivery is running late.
- OK, what can we do now? Use a preprocessing language guaranteed to produce valid syntax, like Jsonnet for JSON? That will work, for sure. However, you now have a friction problem: it's a new language with its own syntax and file format, which requires its own toolchain (Jsonnet is not JSON, so editor add-ons, parsers, interpreters, libraries, etc.), and you will need to drive the team through the learning curve of that language, with its own syntax and semantics. Will you implement a serious project... or will you have a coffee break at that point, and keep doing things the same old way?
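To make the anchor option above concrete, here is a small sketch in plain standard YAML. The key names (base_url, defaults, options) are chosen only for illustration. It shows what aliases do well, and where they stop helping with the repeated address.

```yaml
# Standard YAML anchors (&) and aliases (*); all key names are illustrative.
base_url: &base_url http://dev.company.local

defaults: &defaults        # an anchor can also mark a whole map
  timeout: 30
  retries: 3

server:
  url: *base_url           # resolves to http://dev.company.local
  options: *defaults       # resolves to the whole map above

documentation:
  # An alias stands for a complete node, so it cannot be spliced into a
  # longer string: the /docs URL still has to repeat the address.
  url: http://dev.company.local/docs
```

The documentation URL is exactly the kind of case where anchors alone fall short: the address still appears twice, so a change still means editing it in more than one place.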
Enter YAMLpp
Faced with that problem, I decided that it needed a straightforward solution, which required basically only one utility (a pre-processor) to convert files into static YAML, and would change nothing else anywhere.
Enter YAMLpp (YAML Pre-Processor). It is a language that makes YAML dynamic, but it is itself YAML. So you can keep using your current editor, with its color highlighting, syntax checking, and so on.
Let's start with a Hello World example:
.context:
  name: "World"
message: "Hello {{ name }}!"
I don't think there is a lot to explain here. .context is called a construct, because it 'constructs' (builds) a YAML sub-tree or does something useful; in this case it defines a variable name and leaves no trace.
All constructs start with a . (a dot). In YAMLpp, .context is known as a keyword.
Then message contains a string that will be interpolated using the name variable.
Here is the result:
message: "Hello World!"
As you see, the node .context disappeared. That's a rule for all YAMLpp constructs: they disappear and never make it to the target file.
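The same two ingredients can be applied to the repeated-address problem from the start of the article. This is only a sketch: it assumes interpolation works in any string value the way it does in the Hello World example, and base_url is a variable name made up for the illustration.

```yaml
# Sketch only: uses just the .context construct and {{ }} interpolation
# shown in the Hello World example; base_url is a hypothetical name.
.context:
  base_url: http://dev.company.local

server:
  url: "{{ base_url }}"

documentation:
  url: "{{ base_url }}/docs"
```

If that assumption holds, the expanded file keeps the server and documentation nodes with the address filled in, and no .context node. The day the address changes, only the line under .context needs editing.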
YAMLpp is a metaprogramming tool
It goes much further than that.
It is a macro language written in YAML, and its target language is also YAML.
.context:
  platform: dev
server:
  .if:
    .cond: "(platform, {{ prod }})"
    .then:
      url: prod.machine.local
      ...
    .else:
      ...
      url: dev.machine.local
The result is thus:
server:
  url: dev.machine.local
  ...
The .if construct is expanded: .if, .cond, .then and .else keywords will disappear, and only the intended final YAML nodes will remain.
For anyone who has had anything to do with LISP or one of its offspring languages: you know that a language that generates itself through macros is a huge deal, because it opens the door to metaprogramming. In essence, you are able to create your own Domain-Specific Language (DSL), for configuration or scripting.
For the others, for whom this might sound a little too abstract: what you write in YAMLpp is still YAML, but it allows you to manipulate your target YAML tree in any way you please. You are now able to express high-level concepts in your config file or script in a way that is meaningful to you and your team (expressive, shorter, non-repetitive), which will be expanded into a form (YAML) that the machine will be able to test and interpret correctly, and that everyone else will understand, because it's part of the standard specification.
So now, YAML can be either dynamic (when it contains YAMLpp keywords) or static (without them). But regardless, it is still valid YAML.
Where do I go from there?
A working prototype was developed in Python and it already has its test suite (with pytest).
To install it:
pip install yamlpp-lang
You just need one command:
yamlpp input.yaml -o output.yaml
For more information, see the GitHub repo and the documentation on ReadTheDocs.
You can also look at a webpage containing two realistic use cases, one for Kubernetes and one for Docker.
A new version, 0.2.4, is available on PyPI.
With this new version, the possibilities of exporting pieces of the YAML file are extended, with JSON and TOML files, as well as arguments to customize the export:
The result would be: