More and more, we web developers need to learn about different areas to become better professionals and less dependent on others for simple things.
If you start a career as a front-end (FE) developer, you will see a lot of .yml files, like .travis.yml (for Travis CI), .gitlab-ci.yml (for GitLab CI), etc. But, let's be honest, what the hell?
Why would people use this kind of file? What's the benefit of it? How does this thing work?
So, the goal of this article is to introduce you to the YAML structure and give you more confidence to read, understand, and change a file like this when you need to.
After all, we tend to feel very uncomfortable and frustrated when we need to change something and can't even understand what it is.
But first, What is YAML?
According to the official website, YAML is:
"YAML (a recursive acronym for "YAML Ain't Markup Language") is a human-friendly data serialization standard for all programming languages."
It is heavily used to write configuration files, which explains A LOT, right?
People were tired of having a bunch of configs nobody could understand, until someone said:
What if we could somehow write our configuration like a "cake recipe"? I mean bare minimum text, very straightforward?
Boom, in May 2001 YAML was created.
YAML vs JSON
Surprisingly (or not really), YAML is a superset of our well-known buddy JSON.
"A superset is a programming language that contains all the features of a given language and has been expanded or enhanced to include other features as well." - Source
If I had to put it in perspective, I would say:
In the FE world, YAML would be TypeScript and JSON would be JavaScript.
To better understand how this is even possible, let's look at this example:
{
  "compilerOptions": {
    "module": "system",
    "noImplicitAny": true,
    "removeComments": true,
    "preserveConstEnums": true,
    "outFile": "../../built/local/tsc.js",
    "sourceMap": false,
    "types": ["node", "lodash", "express"]
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "**/*.spec.ts"]
}
This is a tsconfig.json example. It's very easy to read and we can easily identify what's what, but it has some limitations:
- Can't create variables;
- Can't use external variables (e.g. environment variables);
- Can't override values.
In the JS world, if we can create a .json configuration file, we can almost always create a .js one as well (like .eslintrc.json or .eslintrc.js), which allows us to mitigate the cons mentioned above.
But if you're using another programming language, JS files aren't an option. This is where YAML starts to shine.
If we had to rewrite the tsconfig.json in YAML syntax and get exactly the same result, it would look like this:
compilerOptions:
  module: system
  noImplicitAny: true
  removeComments: true
  preserveConstEnums: true
  outFile: '../../built/local/tsc.js'
  sourceMap: false
  types:
    - node
    - lodash
    - express
include:
  - src/**/*
exclude:
  - node_modules
  - '**/*.spec.ts'
Note that this is only an example. You cannot write your tsconfig in YAML!
I hope you're starting to get the idea behind these files.
Concepts, Types, and Syntax
Now, let's dive a bit deeper into the concepts of the language.
Indentation
In YAML, indentation does matter. It uses whitespace indentation to nest information. By whitespace, keep in mind that tabs are not allowed; only spaces are.
If you're like me and use tab for everything, install a plugin in your IDE that replaces tabs with spaces (like EditorConfig). That way, when you hit tab, it's automatically replaced by spaces and you don't even need to use your space bar! ;)
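To see what nesting by indentation means in practice, here is a minimal sketch. The keys (server, other_server, host, port) are made up for illustration and don't belong to any real tool:

```yaml
# Two spaces of indentation: host and port are nested inside server
server:
  host: localhost
  port: 8080

# No indentation: host and port become top-level keys next to other_server,
# and other_server itself is left with an empty (null) value
other_server:
host: localhost
port: 8080
```

The second half is still valid YAML, which is exactly why a lost indentation is easy to miss: the parser doesn't complain, it just builds a different structure.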
Root
Since indentation matters here, if there's no space before the first declaration YAML will understand that's the root (level 0) of your file:
person:
  age: 20
This is like the outermost { curly bracket in JSON:
{
  "person": {
    "age": 20
  }
}
Key/Value
Like JSON/JS, YAML also uses the key/value syntax, and you can write it in various ways:
key: value
key_one: value one
key one: value # This works but it's weird
'my key': somekey
Comments
To write a comment, you just have to use # followed by your message.
# I'm a comment
person: # I'm also a comment
  age: 20
This is handy to document a decision or leave a note. Unfortunately, we can't do this with JSON.
Lists
There are 2 ways to write lists:
JSON way: array of strings
Remember that YAML is a superset of JSON? We can use its syntax:
people: ['Anne', 'John', 'Max']
Hyphen syntax
The most common (and probably recommended) way:
people:
  - Anne
  - John
  - Max
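List items don't have to be plain strings. As a small sketch (the name and age values here are invented), each hyphen can also start an object with its own keys:

```yaml
people:
  # Each hyphen starts a new item; the keys under it belong to that item
  - name: Anne
    age: 30
  - name: John
    age: 25
```

In JSON terms this is an array of objects, the same shape you'd get from [{ "name": "Anne", "age": 30 }, { "name": "John", "age": 25 }].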
Strings
There're a few ways to declare a string in Yaml:
company: Google # Single words, no quotes
full_name: John Foo Bar Doe # Full sentence, no quotes
name: 'John' # Using single quotes
surname: "Christian Meyer" # Using double quotes
In JSON, on the other hand, there is only one way, which is double quotes:
{
  "company": "Google",
  "full_name": "John Foo Bar Doe",
  "name": "John",
  "surname": "Christian Meyer"
}
As a suggestion, prefer quotes whenever the value contains a character that has a meaning in YAML, like :, #, @, etc.
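Here is a short sketch of values where quotes change how the text is read. All the keys are made up for the example:

```yaml
title: "YAML: a quick intro"   # an unquoted ": " inside the value is a syntax error
channel: "#general"            # an unquoted leading # would start a comment
handle: "@raul"                # an unquoted value cannot start with @ (reserved character)
answer: "yes"                  # unquoted yes/no are read as booleans by YAML 1.1 parsers
snake_case_value: hello_world  # underscores are fine without quotes
```

When in doubt, quoting a string never hurts; it only guarantees the value stays a string.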
Numbers
As in most programming languages, we have 2 types of numbers, integer and float:
year: 2019 # Integer
nodeVersion: 10.8 # Float
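Version numbers are where this bites in practice. A small sketch follows; the keys nextVersion, pinnedVersion, and fullVersion are made up for illustration:

```yaml
year: 2019              # integer
nodeVersion: 10.8       # float
nextVersion: 10.10      # still a float, so it is read as 10.1
pinnedVersion: "10.10"  # quoted, so it stays the string "10.10"
fullVersion: 13.0.0     # not a valid number, so it is read as a string
```

If a value is really a label rather than something you do math with, quoting it avoids the surprise on the third line.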
Node Anchors (variables-ish)
An anchor is a mechanism to create a group of data (an object) that can be injected into or extended by other objects.
Let's imagine you need to create a configuration for your CI. It'll have both production and staging environments. As you can imagine, they share almost the same base settings.
In the JSON world, we would have to duplicate these configs:
{
  "production": {
    "node_version": "13.0.0",
    "os": "ubuntu",
    "package_manager": "yarn",
    "run": ["yarn install", "NODE_ENV=${ENVIRONMENT} yarn build"],
    "env": {
      "ENVIRONMENT": "production"
    }
  },
  "staging": {
    "node_version": "13.0.0",
    "os": "ubuntu",
    "package_manager": "yarn",
    "run": ["yarn install", "NODE_ENV=${ENVIRONMENT} yarn build"],
    "env": {
      "ENVIRONMENT": "staging"
    }
  }
}
Copying and pasting is annoying, especially when you have to change something in every place that information is used.
Anchors came to solve that problem. We can:
- First, create our anchor:
# I name it as "base-config" but it can be whatever
# &base will be the "variable name" you'll use in the injection
base-config: &base
  node_version: 13.0.0
  os: ubuntu
  package_manager: yarn
  run:
    - yarn install
    - NODE_ENV=${ENVIRONMENT} yarn build
- Then, inject the anchor at the level where we want these values to appear:
base-config: &base
  node_version: 13.0.0
  os: ubuntu
  package_manager: yarn
  run:
    - yarn install
    - NODE_ENV=${ENVIRONMENT} yarn build
production:
  # I'm injecting all "base" attributes and values inside production
  <<: *base
  env:
    - ENVIRONMENT: production
staging:
  # I'm injecting all "base" attributes and values inside staging
  <<: *base
  env:
    - ENVIRONMENT: staging
Looks simpler, right? And also easier to maintain.
If you paste this code into an online YAML to JSON converter, you'll get almost the same structure as the earlier JSON example. The only differences are that base-config shows up as its own key and env becomes a list (because of the hyphen):
{
  "base-config": {
    "node_version": "13.0.0",
    "os": "ubuntu",
    "package_manager": "yarn",
    "run": ["yarn install", "NODE_ENV=${ENVIRONMENT} yarn build"]
  },
  "production": {
    "node_version": "13.0.0",
    "os": "ubuntu",
    "package_manager": "yarn",
    "run": ["yarn install", "NODE_ENV=${ENVIRONMENT} yarn build"],
    "env": [
      {
        "ENVIRONMENT": "production"
      }
    ]
  },
  "staging": {
    "node_version": "13.0.0",
    "os": "ubuntu",
    "package_manager": "yarn",
    "run": ["yarn install", "NODE_ENV=${ENVIRONMENT} yarn build"],
    "env": [
      {
        "ENVIRONMENT": "staging"
      }
    ]
  }
}
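Anchors also cover the "override values" limitation mentioned at the start. The sketch below assumes your parser supports the << merge key (most common ones do, but check the one your platform uses); the legacy environment is made up for the example:

```yaml
base-config: &base
  node_version: 13.0.0
  os: ubuntu
  package_manager: yarn

legacy:
  <<: *base             # pull in every key from base-config
  node_version: 10.8.0  # a key declared here wins over the merged one
```

After conversion, legacy keeps os and package_manager from the base but ends up with node_version "10.8.0".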
JSON syntax (yes, JSON)
As explained before, a superset of a language is the base language PLUS some extra features, which means we can write a YAML file the JSON way:
{
  "details": {
    "company": {
      "name": "Google",
      "year": 2019,
      "active": true
    },
    "employees": [
      "Anne",
      "John",
      "Max"
    ]
  }
}
Doubting? Copy this code and paste it into an online YAML to JSON converter.
If you convert this YAML to JSON, you'll have the same structure:
{
  "details": {
    "company": {
      "name": "Google",
      "year": 2019,
      "active": true
    },
    "employees": ["Anne", "John", "Max"]
  }
}
Shell/Bash environment
As I mentioned at the beginning of this article, .yml files are very commonly used as config files for many things, but especially for CI/CD environments.
For those, you have to describe how the machine or Docker container should work, what should be installed, run, etc.
Commonly, all those environments are Linux, which means you also have access to the environment itself.
On GitLab CI, for instance, you can specify at a global level the environment variables you want available for the whole process:
variables:
  NODE_IMAGE: node:10
stages:
  - build
test:
  image: $NODE_IMAGE
  stage: build
Note that the $ syntax for using variables doesn't come from YAML but from shell/bash.
GitLab CI takes everything you defined in variables and creates shell variables from it.
Platforms usually also inject other values like commit ref, branch name, build time, author, and secret keys defined outside the configuration:
variables:
  NODE_IMAGE: node:10
stages:
  - build
test:
  image: $NODE_IMAGE
  stage: build
  artifacts:
    name: $CI_COMMIT_REF_NAME
In the example above, we're using $CI_COMMIT_REF_NAME, an external environment variable that the GitLab CI platform makes available. It holds "the branch or tag name for which the project is built".
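To make the shell connection more concrete, here is a sketch of a job that prints both kinds of variables. It relies on GitLab CI's script keyword; the job name print-info is made up, and you should check the platform docs for what your setup actually requires:

```yaml
variables:
  NODE_IMAGE: node:10

print-info:
  image: $NODE_IMAGE
  script:
    # By the time these commands run, both values are plain shell variables
    - echo "Using image $NODE_IMAGE"
    - echo "Building $CI_COMMIT_REF_NAME"
```

Notice the echo messages avoid a colon followed by a space. In an unquoted list item, YAML would read that as a key/value separator, which ties back to the quoting advice in the Strings section.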
Conclusion
I hope you now understand a bit more about YAML and at least feel comfortable reading and writing your own files.
Keep in mind that what you have access to, and the limitations you face, are determined by the platform you're using. Travis defines a different configuration than GitLab CI or CircleCI, for example.
Always check the documentation of the platform you're working on to see what can and can't be done! :)
References
- YAML website (you can find parsers for all languages there);
- Learn X in Y Minutes: YAML: a whole guide/introduction to everything YAML can do for you;
- JSON ↔ YAML online converter: useful to visualize what generates what and build a better understanding;
- YAML on Wikipedia.
Top comments (9)
I remember when I first heard about YAML
It was some library in Python which you can configure yaml config files to create invoices from PDF files
I never understood why do we have yaml in the first place, I was in the mood "why do it when everything is json?"
Glad you reminded me of yaml and made clear to me what it does. Its great and does remind me of python in some weird way
Great work :))
I've used YAML a lot but had no idea about Node Anchors... good to know! Thanks for the article
That's nice, right? First time I saw it also blow my mind
I rather write XML with Notepad. YAML is one of the worst text formats I have encountered. The semantics are fragile, unclear, and quite often arcane. The specification is huge. It becomes unparseable by humans when it gets larger than a single page.
I open this post thinking "everybody should just use JSON" and you change my mind! Great post
I am a Front end dev but thus far not tried using yaml and this is the very 1st time I read about yaml. You made it so easy to understand.
Thanks a lot
Awesome into to YAML, Raul! Thanks so much!
You can also use yamlonline.com/ for the yaml validator as well as yaml converter to json,csv,xml,base64 also for beautify and minify YAML.
shameless promotion for my own PHP library for YAML :
github.com/dallgoot/yaml
all the features of current YAML version are supported anchors among them.