Advanced YAML Syntax Cheatsheet

Ryan Thelin
Just a tech writer and coder striving to get better every day.
Originally published at educative.io ・7 min read

YAML (YAML Ain’t Markup Language) is a data serialization language used to write key-value configuration files and app APIs. As of version 1.2 it is a superset of JSON, and it relies on line breaks and indentation rather than brackets to improve readability.

While not a daily use technology, it's an important foundation for many modern technologies like Ansible and is, therefore, a prerequisite for higher-paying developer roles.

Today, we'll help you perfect your YAML knowledge so you can impress your next interviewer.

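Here’s what we’ll cover today:

  • YAML Validator and Parser
  • YAML Anchors and Aliases
  • YAML Schemas
  • YAML Escape Sequences
  • YAML Separators and Directives
  • YAML Timestamp
  • Wrapping up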

Become a certified YAML professional

Stand out from other applicants with a YAML certification that you can earn in less than an hour.

Introduction to YAML

YAML Validator and Parser

While YAML is more readable than XML, it's easy to miss formatting mistakes such as an inconsistent number of spaces across a multi-line section or a forgotten newline marker. These can cause sweeping errors in the YAML parser, which converts YAML text into data structures a program can use.

Parsing errors are hard to find by hand, so many developers choose to use a web validator or linter. These tools review your YAML document and highlight any potential errors to ensure you have valid YAML before putting it into use.

A popular validator is YAMLLint, a free open-source tool available in your browser or on GitHub. YAMLLint is convenient because it provides error highlighting, autoformatting for copy/pasted YAML, and auto-correct options.
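As a small illustration of the kind of mistake a validator catches, here is a sketch using hypothetical keys (server, host, port). The broken variant is commented out so the snippet itself stays valid; most parsers reject it because the extra space makes the second key look like a continuation of the previous value.

```yaml
# Valid: both keys under "server" share the same two-space indent.
server:
  host: localhost
  port: 8080

# A common mistake (commented out so this file still parses):
# server:
#   host: localhost
#    port: 8080   # one extra space of indentation, typically a parse error
```

Pasting the uncommented broken variant into a validator should point you to the offending line.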

YAML Anchors and Aliases

Anchors and Aliases are YAML constructs that let you reduce repeated syntax and extend existing data nodes. You place an Anchor (&) on an entity to mark a section, which can span multiple lines. You can then use an Alias (*) later in the document to reference that anchored section. Anchors and Aliases are very helpful for larger projects because they cut the visual clutter caused by repeated lines.

The Alias essentially acts as a "see above" command, which makes the program pause standard traversal, return to the anchor point, then resume standard traversal after the Anchored portion is finished. If you're familiar with Object-Oriented Programming designs, you'll feel right at home with Anchors.

Below, the build-test Anchor begins on line 3 and is called by Aliases on lines 13 and 15.

definitions: 
  steps:
    - step: &build-test
        name: Build and test
        script:
          - mvn package
        artifacts:
          - target/**

pipelines:
  branches:
    develop:
      - step: *build-test
    master:
      - step: *build-test

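As a rough sketch of what a loader sees once the two Aliases in the example above are resolved, the pipelines section is equivalent to the expanded form below. This is written out by hand for illustration and is not the output of any specific tool.

```yaml
pipelines:
  branches:
    develop:
      - step:
          name: Build and test
          script:
            - mvn package
          artifacts:
            - target/**
    master:
      - step:
          name: Build and test
          script:
            - mvn package
          artifacts:
            - target/**
```

The anchored version states the step once, so a later change to the build script only has to be made in one place.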

Overrides and Extensions

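You can also customize an anchored mapping at the point of use by writing <<: (the merge key) before the Alias and listing your changes below it. A key with the same name as one in the anchored mapping overrides it, and a key with a new name is added to the result. The merge key comes from YAML 1.1 and is supported by many, but not all, parsers.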

definitions: 
  steps:
    - step: &build-test
        name: Build and test
        script:
          - mvn package
        artifacts:
          - target/**


pipelines:
  branches:
    develop:
      - step: *build-test
    master:
      - step: 
          <<: *build-test
          name: Testing on Master #override
          ongoing: false #extension
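To make the override and the extension concrete, here is a hand-written sketch of what the master step in the example above is equivalent to after the merge, assuming a parser that supports the merge key. Key order is not significant in a YAML mapping.

```yaml
master:
  - step:
      name: Testing on Master   # overridden: replaces "Build and test"
      script:                   # inherited unchanged from the anchor
        - mvn package
      artifacts:                # inherited unchanged from the anchor
        - target/**
      ongoing: false            # extension: a key the anchor did not have
```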

YAML Schemas

Schemas are rule sets in YAML 1.2 that tell the loader/parser which data type each part of your YAML file resolves to. They're essentially a collection of matching rules that map tags, whether written explicitly or inferred from the form of a value, to native data types. For example, the YAML Core schema states that both !!str my string and my string are equivalent and should be loaded as the same string.
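A minimal sketch of explicit versus implicit tags, assuming a loader that uses the Core schema. The key names are hypothetical and chosen only to label each case.

```yaml
implicit_string: my string        # the schema resolves this to a string
explicit_string: !!str my string  # same value with the tag written out
implicit_int: 42                  # the Core schema resolves this to an integer
forced_string: !!str 42           # the explicit tag keeps it a string
```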

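There is a default schema for general-purpose YAML, and many special-use schemas that are each suited to certain niche scenarios. You can even create your own custom schemas or download other users' schemas from the JSON Schema Store (those are JSON Schema files for validating a document's structure, a different use of the word than the tag-resolution schemas described here).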

Custom schemas are helpful when configuration files include custom objects or if you want to create language-specific object serializations.

---
# Ruby Psych
dice: !ruby/Object:Dice [3, 6]

---
# perl YAML::XS, YAML.pm, YAML::Syck (Dump and Load)
dice: !!perl/array:Dice [3, 6]

---
# perl YAML.pm, YAML::Syck (Load)
dice: !perl/array:Dice [3, 6]

---
# Pyyaml
dice: !!python/object/new:__main__.Dice
  - !!python/tuple [3, 6]

The three default YAML schemas are:

  • Failsafe Schema: A minimalist schema that only supports String (!!str), Map (!!map), and Sequence (!!seq) tags. Failsafe is guaranteed to work with any YAML document due to its simplicity, but it does not support any complex tags.
  • JSON Schema: A foundational schema designed so that equivalent YAML and JSON files parse to the same end result. It is widely used and is the starting point for most other schemas. It supports all Failsafe tags, as well as Boolean (!!bool), Null (!!null), Integer (!!int), and Float (!!float).
  • Core Schema: The default YAML schema, which extends the JSON schema by widening the patterns that plain values can match. It is essentially a less strict version of the JSON schema. For example, the JSON schema recognizes Booleans that match the pattern "true | false". By comparison, the Core schema's Boolean pattern is "true | True | TRUE | false | False | FALSE".
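The Boolean patterns quoted above can be sketched as follows. The key names are hypothetical, and how a given parser treats values that fall outside its schema (as an error or as a string) varies by implementation.

```yaml
lowercase: true       # Boolean under both the JSON and Core schemas
capitalized: True     # Boolean under Core; outside the JSON schema's "true | false" pattern
uppercase: FALSE      # Boolean under Core only
quoted: "true"        # quoted, so it stays a string under every schema
tagged: !!bool true   # explicit tag instead of implicit resolution
```

Under the Failsafe schema, every untagged plain value here would simply be a string.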

Keep the learning going.

Brush up on your expert YAML skills and earn a certificate at the same time. Educative's text-based courses are easy to skim and feature live coding environments, meaning you can better your skills in half the time.

Introduction to YAML

YAML Escape Sequences

YAML's readable syntax is possible because it favors indentation and whitespace over inline syntactic markers like square brackets or curly braces. However, this can get tricky when a quoted scalar needs to contain special characters.

YAML offers Escape Sequences to let you indicate whether a character should be interpreted as a special character or as part of the scalar. For example, inside a double-quoted scalar you can write a single space as \x20, so "\x20" is a string whose value is one space.

Escape Sequences come in 3 types:

  • Entity Escapes, such as space (&#x20;), colon (&#58;), and ampersand (&amp;). These are HTML/XML character references rather than part of YAML itself: a YAML parser passes them through as literal text, so they only work when the application consuming the value decodes them.
  • Unicode Escapes, which let you add a space (\u0020), single quote (\u0027), or double quote (\u0022) by its Unicode code point. YAML processes these only inside double-quoted scalars.
  • Quoted Escapes, which let you include quotation marks inside a quoted scalar. Double quotes can appear unescaped within single quotes ('"quote"'), must be escaped with a backslash within another set of double quotes ("abc \"quote\" cba"), and single quotes nested within single quotes are written twice ('it''s a ''quote''').

Note: A full list of YAML Escape Sequences can be found in the official YAML Specification.
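The following sketch shows how escapes behave in the two quoting styles according to the YAML specification: backslash escapes are processed only in double-quoted scalars, and a doubled single quote is the only escape in single-quoted scalars. The key names are hypothetical.

```yaml
hex_space: "\x20"                    # one space, written as a hex escape
unicode_space: "\u0020"              # the same space as a Unicode escape
tabbed: "col1\tcol2"                 # \t becomes a tab character
nested_double: "abc \"quote\" cba"   # escaped double quotes inside double quotes
double_in_single: '"quote"'          # double quotes need no escape inside single quotes
nested_single: 'it''s quoted'        # '' yields one single quote
literal_backslash: 'C:\temp'         # single quotes keep the backslash as-is
entity: "&amp;"                      # stays the literal text &amp; after parsing
```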

YAML Separators and Directives

You may have noticed that many YAML documents begin with ---. This marker tells the parser where a document starts, which allows several separate documents to be sent in a single stream or file.

Another, more advanced separator is ..., which marks the end of a document. The end-of-document separator is typically followed either by a beginning-of-document separator (---), by a set of directives, or by nothing at all at the end of the stream. Directives are settings marked by % that come before a document and are followed by ---.

This means ... and --- are needed to mark the boundaries of the directive space. The two directives currently defined are %YAML, which lets you set the YAML version for the coming document, and %TAG, which lets you create custom shorthands for tags you'll use in the document.

Directives are only useful in niche circumstances, like using an old YAML document that's too large to update. Usually, you will not need directives or a document separator.

doc 1
...

%TAG !foo! !foo-types/

---

doc 2
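For a slightly fuller picture, here is a sketch of a two-document stream that uses both directives. The !e! handle, the tag:example.com,2000:app/ prefix, and the key names are placeholders for illustration; a loader would still need to know how to construct the custom widget tag.

```yaml
# Directives for the first document
%YAML 1.2
---
first: document
...
# Directives for the second document
%TAG !e! tag:example.com,2000:app/
---
# !e!widget expands to tag:example.com,2000:app/widget
second: !e!widget
  size: 3
```

Directives apply only to the document that immediately follows them, which is why the %TAG shorthand is declared right before the document that uses it.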

YAML Timestamp

Timestamp is a helpful data type that lets you store a date and time as a single value rather than as a collection of separate numbers. It's marked by the !!timestamp data tag and supports various levels of precision, from a simple date (yyyy-mm-dd) down to a fraction of a second, as in 2001-12-15T2:59:43.10.
You can also separate the date and time with a space to make the timestamp more readable, like 2001-12-15 2:59:43.10.

All timestamps are recorded as Coordinated Universal Time (UTC) unless another timezone is specified at the end of the timestamps. You define the timezone by including how many hours it is ahead or behind UTC.

For example, I could set a timestamp as Pacific Standard Time (PST) with a -8 at the end.

2001-12-15 2:59:43.10 -8

YAML's timestamp is similar to the date-time types used in popular programming languages like JavaScript or Python, except for how it handles timezones. Those types often default to the timezone of the host machine.

That default can lead to problems when data moves between machines in different timezones, because each value has to be converted to UTC and then to the destination timezone every time the date is read, adding overhead.
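The timestamp forms discussed in this section can be sketched as follows, using values in the style of the YAML 1.1 timestamp type. The key names are hypothetical, and whether a plain value is loaded as a timestamp or left as a string depends on the parser and schema in use.

```yaml
date_only: 2001-12-14                             # date with no time part
canonical: 2001-12-15T02:59:43.1Z                 # explicit UTC with the Z suffix
with_offset: 2001-12-14t21:59:43.10-05:00         # ISO 8601 style with a UTC offset
spaced: 2001-12-14 21:59:43.10 -5                 # space-separated, offset in hours
no_zone: 2001-12-15 2:59:43.10                    # no timezone, so assumed to be UTC
tagged_pst: !!timestamp 2001-12-15 2:59:43.10 -8  # explicit tag, 8 hours behind UTC
as_string: "2001-12-15"                           # quoted, so it stays a string
```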

Wrapping up

Good job tackling this advanced YAML syntax. While YAML might seem like a niche skill, it's a prerequisite for many jobs across the market. However, most coding interviews will not include YAML. The best way to demonstrate your YAML knowledge is to use it in a portfolio project or be able to show past experience with the technology.

To help you wow recruiters with your YAML skills, Educative has created Introduction to YAML. This course covers beginner to expert level syntax and techniques in a condensed, hands-on format. By the end of the course, you'll be able to use YAML with confidence and will have your own YAML certification to put on your resume.

Happy learning!
