Paula Santamaría


Introduction to YAML

The first time I came across YAML was around a year ago, when I used it to write OpenAPI definitions to document a RESTful API using Swagger API Documentation and, to be honest, I really hated it.

Being a JSON "fan", the YAML syntax felt weird and unnatural to me, so for a while, I didn't pay any attention to it.

This changed a few months ago, when I started to get into CI/CD, since both Azure and GitLab pipelines require a YAML file to set up. So I finally decided to properly learn about YAML, and after doing some reading I found the ideas behind it fascinating.

In this article I'll cover the basics of YAML, including its main goals, basic syntax and some of its more complex features.

Introduction

YAML is a data-serialization language often used for configuration files, such as OpenAPI specifications or CI/CD pipelines.

Fun fact! 🤓

According to the YAML 1.0 specification document (2001-05-26), the acronym "YAML" stands for "Yet Another Markup Language", but it was later changed to the recursive acronym "YAML Ain't Markup Language" in the 2002-04-07 specification.

As stated in the latest spec, YAML is designed to be friendly to people working with data and achieves "unique cleanness" by minimizing the use of structural characters, allowing the data to appear in a natural and meaningful way.

The latest spec also states that YAML 1.2 was brought into compliance with JSON as an official subset, meaning that virtually any valid JSON document can also be parsed as YAML.

YAML makes data structures easy to inspect by using indentation-based scoping (similar to Python).
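
To illustrate the JSON-subset point, here is a minimal sketch (the keys and values are made up for this example). It is a JSON object pasted verbatim, which a YAML 1.2 parser should accept as a mapping containing a sequence:

```yaml
# Plain JSON, used unchanged as a YAML document (hypothetical data).
{
    "title": "Introduction to YAML",
    "tags": ["yaml", "json"],
    "published": false
}
```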

Another fun fact! 🤓

DEV.to articles use YAML to define custom variables like title, description, tags, etc.

Basic Syntax

YAML documents are basically a collection of key-value pairs where the value can be as simple as a string or as complex as a tree.
Here are a few notes about YAML syntax:

  • Indentation is used to denote structure. Tabs are not allowed for indentation, and the exact number of spaces doesn't matter as long as a child node is indented more than its parent and sibling nodes share the same indentation.
  • UTF-8, UTF-16 and UTF-32 encodings are allowed.
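
As a rough sketch of the indentation rule (the keys below are hypothetical and only serve the illustration), children can use any number of spaces as long as siblings line up:

```yaml
# Hypothetical configuration, for illustration only.
server:
  host: localhost   # indented 2 spaces: a child of "server"
  ports:            # a sibling of "host", so it uses the same indentation
        - 8080      # any deeper indentation works for the children of "ports"
        - 8443      # but siblings must line up with each other
```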

Strings

# Strings don't require quotes:
title: Introduction to YAML

# But you can still use them:
title-w-quotes: 'Introduction to YAML'

# Multiline strings start with |
execute: |
    npm ci
    npm run build
    npm test

The above code will translate to JSON as:

{
    "title": "Introduction to YAML",
    "title-w-quotes": "Introduction to YAML",
    "execute": "npm ci\nnpm run build\nnpm test\n"
}
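
Quotes do become necessary when plain text would otherwise be read as YAML syntax or as another type. The following is a small sketch with made-up keys, not an exhaustive list of cases:

```yaml
# Hypothetical keys, for illustration only.
note: "key: value pairs"        # contains ": ", which would otherwise start a mapping
hashtag: "#yaml"                # unquoted, the # would start a comment
version: "1.10"                 # quoted so it stays a string instead of the float 1.1
escaped: "line one\nline two"   # double quotes process escapes such as \n
literal: 'C:\new\folder'        # single quotes keep backslashes as-is
apostrophe: 'it''s fine'        # a single quote is doubled inside single quotes
```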

Numbers

# Integers:
age: 29

# Float:
price: 15.99

# Scientific notation:
population: 2.89e+6

The above code will translate to JSON as:

{
    "age": 29,
    "price": 15.99,
    "population": 2890000
}

Boolean

# Boolean values can be written in different ways:
published: false
published: False
published: FALSE

Each of the above lines, taken on its own (a key can't be repeated within the same mapping), will translate to JSON as:

{
    "published": false
}
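
One caveat, sketched under the assumption that your parser still follows YAML 1.1 rules (PyYAML does, for example): words such as yes, no, on and off are also read as booleans there, so quoting is the safe way to keep them as strings. The keys below are hypothetical:

```yaml
accept-terms: yes      # true under YAML 1.1 rules, the string "yes" under the YAML 1.2 core schema
country: NO            # false under YAML 1.1 rules, even if a country code was intended
country-quoted: "NO"   # always the string "NO"
```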

Null values

# Null can be represented by simply not setting a value:
null-value: 

# Or more explicitly:
null-value: null
null-value: NULL
null-value: Null

Each of the above lines, taken on its own, will translate to JSON as:

{
    "null-value": null
}
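
Two related details are worth a quick sketch (the keys are invented for the example): the tilde is one more spelling of null, and an empty or quoted string is not null.

```yaml
middle-name: ~     # the tilde is another way to write null
nickname: ""       # an empty string, which is not the same as null
alias: "null"      # quoted, so this is the four-letter string "null"
```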

Dates & timestamps

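ISO 8601-formatted dates and timestamps can be used, like so (many parsers convert them to date objects, although this comes from the YAML 1.1 timestamp type, so behavior can vary between parsers):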

date: 2002-12-14
canonical: 2001-12-15T02:59:43.1Z
iso8601: 2001-12-14t21:59:43.10-05:00
spaced: 2001-12-14 21:59:43.10 -5
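
If you'd rather keep a date as plain text, quoting it should prevent the conversion. A minimal sketch with hypothetical keys:

```yaml
released: 2002-12-14          # may be converted to a date or timestamp by the parser
released-text: "2002-12-14"   # quoted: stays a plain string
```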

Sequences

Sequences allow us to define lists in YAML:

# A list of numbers using hyphens:
numbers:
    - one
    - two
    - three

# The inline version:
numbers: [ one, two, three ]

Both of the above sequences will parse to JSON as:

{
    "numbers": [
        "one",
        "two",
        "three"
    ]
}

Nested values

We can use all of the above types to create an object with nested values, like so:

# Nineteen eighty four novel data.
nineteen-eighty-four:
    author: George Orwell
    published-at: 1949-06-08
    page-count: 328
    description: |
        A Novel, often published as 1984, is a dystopian novel by English novelist George Orwell.
        It was published in June 1949 by Secker & Warburg as Orwell's ninth and final book.

Which will translate to JSON as:

{
    "nineteen-eighty-four": {
        "author": "George Orwell",
        "published-at": "1949-06-08T00:00:00.000Z",
        "page-count": 328,
        "description": "A Novel, often published as 1984, is a dystopian novel by English novelist George Orwell.\nIt was published in June 1949 by Secker & Warburg as Orwell's ninth and final book.\n"
    }
}

List of objects

Combining sequences and nested values, we can create a list of objects.

# Let's list books:
- nineteen-eighty-four:
    author: George Orwell
    published-at: 1949-06-08
    page-count: 328
    description: |
        A Novel, often published as 1984, is a dystopian novel by English novelist George Orwell.

- the-hobbit:
    author: J. R. R. Tolkien
    published-at: 1937-09-21
    page-count: 310
    description: | 
        The Hobbit, or There and Back Again is a children's fantasy novel by English author J. R. R. Tolkien.

Distinctive Features

The following are some more complex features that caught my attention and that also differentiate YAML from JSON.

Comments

As you've probably already noticed in my prior examples, YAML allows comments starting with #.

# This is a really useful comment.

Reusability with Node Anchors

Node anchors mark a node for future reference, which allows us to reuse the node. To mark a node we use the & character, and to reference it we use *.

In the following example we'll define a list of books and reuse the author data, so we only have to define it once:

# The author data:
author: &gOrwell 
    name: George
    last-name: Orwell

# Some books:
books: 
    - 1984:
        author: *gOrwell 
    - animal-farm:
        author: *gOrwell

The above code will look like this once parsed to JSON:

{
    "author": {
        "name": "George",
        "last-name": "Orwell"
    },
    "books": [
        {
            "1984": {
                "author": {
                    "name": "George",
                    "last-name": "Orwell"
                }
            }
        },
        {
            "animal-farm": {
                "author": {
                    "name": "George",
                    "last-name": "Orwell"
                }
            }
        }
    ]
}
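
Anchors aren't limited to mappings. As a small sketch with invented data, a scalar or a whole sequence can be anchored and reused the same way, as long as the anchor appears before the alias that refers to it:

```yaml
# Hypothetical data, for illustration only.
genres: &dystopian-genres
    - dystopian
    - political fiction
language: &lang English

books:
    - title: Nineteen Eighty-Four
      genres: *dystopian-genres   # expands to the two-item list above
      language: *lang             # expands to the string "English"
```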

Explicit data types with tags

As we've seen in previous examples, YAML autodetects the type of our values, but it's possible to specify which type we want.
We specify the type by including it before the value preceded by !!.

Here are some examples:

# The following value should be an int, no matter what:
should-be-int: !!int 3.2

# Parse any value to string:
should-be-string: !!str 30.25

# I need the next value to be boolean:
should-be-boolean: !!bool yes

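Results vary by parser (some reject !!int 3.2 because 3.2 is not a valid integer literal, and some YAML 1.2 parsers don't accept yes as a boolean), but with a lenient parser this will translate to JSON as: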

{
    "should-be-int": 3,
    "should-be-string": "30.25",
    "should-be-boolean": true
}
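
A common practical use for tags is keeping number-like values as text. This sketch (with made-up keys) shows that !!str and plain quoting should give the same result:

```yaml
zip-code: !!str 02134    # stays the string "02134" instead of being read as a number
zip-quoted: "02134"      # quoting has the same effect
```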

Conclusion

Reading and writing about YAML, and experimenting with it was super interesting.

What I like: I especially loved reading about the goals of YAML in relation to code cleanness and readability, and how it achieves them. I also feel better about properly learning the syntax at last 😅.

What I don't like: I don't like that I need a parser (which means installing a new dependency) to use YAML with the main technologies I work with (Node.js and .NET Core).

However, I will now consider YAML, especially if I need something that JSON can't cover, like reusability, explicit types or comments. I'm sure that working with pipelines will be easier now too.

Also, I'd strongly recommend reading the Introduction of the YAML 1.2 specification (3rd edition) to learn more about YAML's goals, origins and relationship with other languages.

What are you using YAML for? 💬

Are you using YAML? For what? What are your thoughts about it?

Top comments (38)

orenovadia

Thanks for all the tips and tricks. I like using YAML for configurations.

PS: YAML has some default casting one should be aware of:

In [3]: import yaml

In [4]: yaml.safe_load('yes')  # 'Yes' and 'No' become boolean
Out[4]: True

In [5]: yaml.safe_load('1_000_000')
Out[5]: 1000000

Paula Santamaría • Edited

Thank you! According to the YAML 1.2 specification document, 'yes' and 'no' are no longer interpreted as booleans.

We have removed unique implicit typing rules and have updated these rules to align them with JSON's productions. In this version of YAML, boolean values may be serialized as “true” or “false”;

You can use !!bool to parse them, though.

Darren Burns

Thanks for this! I just got started with GitHub Actions a couple of days ago, and was making a lot of assumptions on what the YAML was representing -- the translations to JSON you've done here are really helpful :)

Paula Santamaría

Thanks Darren, I'm glad I could help! 🙂

Jacksoft CS

I used YAML file to configure Cluster group in the Pipeline during my internship this past Summer. It was a bit challenging since this was the first time using it but definitely easy to work with. Thanks for sharing this great article.

Paula Santamaría

Nice! Was it your first time working with it? And did its readability make it easier to pick up?

Karam • Edited

I use YAML for my default config files with tools such as ESLint and Stylelint. I find the syntax a lot more intuitive than JSON and I am less likely to make mistakes with it.

Thank you for making this tutorial, Paula.

Paula Santamaría

Nice! Do you mind sharing a bit more about how you use it? Like which language and do you use a parser?
I'm really curious because I love the syntax, but I don't like the idea of including extra dependencies just for that.

Karam

I currently use yaml only when a node js package offers built-in support for it. I stick with json, otherwise. I don't know much about parsers as I've only used babel before and it doesn't support yaml, as far as I know.

Olimpio

I use YAML so often on Jekyll.. and I didn't know I could write multiline strings just by adding this |. Amazing. Thanks.

Paula Santamaría

Something I forgot to include about strings is that you can also write multiline strings that you don't want to be interpreted as multiline. For example:

single-line-string: > 
    This
    should
    be
    one
    line

And this is how it'll look in JSON:

{
    "single-line-string": "This should be one line\n"
}

When using the > character instead of |, each line break inside the text is folded into a single space.
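
Both | and > also accept a "chomping" indicator that controls the final newline. Here is a minimal sketch with hypothetical keys, with the expected string values listed in the leading comments:

```yaml
# clip ("|"): keeps line breaks and a single final newline        -> "first\nsecond\n"
# strip ("|-"): keeps line breaks, drops the final newline         -> "first\nsecond"
# folded strip (">-"): joins lines with spaces, no final newline   -> "first second"
clip: |
    first
    second
strip: |-
    first
    second
folded-strip: >-
    first
    second
```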

Paula Santamaría

I'm glad! Thanks for reading :D

Gabriel Santos

Hi, Paula. Great article!
I have one question, in JSON we can easily create a list of objects without “naming” them, like this:
[{"a": "b"}, {"x": "y"}]
How do we do that in YAML? For example, a list of Authors (I don't wanna have [{"authors": {authorObj1}}, {"authors": {authorObj2}}])

Paula Santamaría • Edited

Great question!
You can achieve that by entering the hyphen first and then the properties in a new line, like so:

- 
    name: George
    last-name: Orwell
- 
    name: Stephen
    last-name: King

Which will translate to JSON as:

[
    {
        "name": "George",
        "last-name": "Orwell"
    },
    {
        "name": "Stephen",
        "last-name": "King"
    }
]
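
A more compact way to write the same list, shown here as a sketch with the same data, is to put the first property on the same line as the hyphen and align the remaining properties with it:

```yaml
- name: George
  last-name: Orwell
- name: Stephen
  last-name: King
```

This should produce the same JSON array as the version above.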

Also here's a nice online tool I've been using to try the YAML syntax and see its JSON counterpart.

Abir

I used YAML for my uni assignment where we had to build a ci/cd pipeline using Travis CI for a spring boot application. Travis uses a YAML file for configuration and I found it very easy and intuitive to work with.

Paula Santamaría • Edited

Nice! Every CI pipeline config I've seen so far uses YAML. I believe we'll be seeing more of it in the near future.

Miguel

Thanks for this helpful article. I'm using it in K8s to set up the values files and sort of understood the basics of YAML - but this was legit lightbulb. I don't know why I didn't consider that it can be translated to JSON! Thanks again

Antonio R. Rodriguez Gramajo

Thanks for all the tips and tricks!!
I discovered Yaml accidentally, I am learning star and I met him, and also with the link to your article and this beautiful community, this is for me a day of pleasant discoveries.

TuWang

I ran into YAML when working with AWS CloudFormation. It beat me up like a drum 😒

Paula Santamaría

Believe me, I know the feeling!😂
One of the things that made me change my mind after reading about it for a bit was that the whole concept behind YAML made me remember how I felt after going from XML to JSON, which was basically "wow, so much less code, I can actually read this!".