Preamble
So what I realized is this argument is basically the same as this post.
The Uncanny Va...
Something that is great in YAML is that you can add comments. Also, you need to follow indentation rules (which is great for keeping other people from messing up the file). All these things make YAML a great way to serialize configurations.
JSON is not friendly or easy to follow and this can drive you to errors...
Is YAML perfect? No. Is YAML a great option when human intervention is needed? Yes.
I'd like to hear your counterargument to the issues I raised, because I'm arguing it isn't good when human intervention is needed.
JSON isn't friendly, which is why I put "readable" in quotes. The trouble is you need something with clear rules, because when the human writes it wrong, they'll need to read it through the eyes of a parser, and JSON is simple for that.
Well, you can also write a YAML schema. You can also write something wrong in JSON, right? And what is going to correct you is the schema. So, you can have the best of both worlds.
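To make the schema point concrete, here is a minimal sketch in Python of checking a parsed config against a schema. The `SCHEMA` format and the `validate` helper are hypothetical and made up for illustration; this is not JSON Schema or any real YAML schema language. JSON is used as the input only because Python parses it out of the box; the same check would apply to whatever a YAML parser returns.

```python
import json

# Hypothetical schema format: key -> expected Python type.
# Real schema languages are far richer; this only shows the idea.
SCHEMA = {"name": str, "replicas": int, "debug": bool}


def validate(config, schema):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for key, expected in schema.items():
        if key not in config:
            problems.append(f"missing key: {key}")
        elif type(config[key]) is not expected:
            # Exact type check, so that True is not accepted as an int.
            actual = type(config[key]).__name__
            problems.append(f"{key}: expected {expected.__name__}, got {actual}")
    for key in config:
        if key not in schema:
            problems.append(f"unknown key: {key}")
    return problems


good = json.loads('{"name": "web", "replicas": 3, "debug": false}')
bad = json.loads('{"name": "web", "replicas": "3", "debgu": false}')

print(validate(good, SCHEMA))
print(validate(bad, SCHEMA))
```

The second config parses without complaint, and only the schema check reports the quoted number and the misspelled key.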
The rise of yaml is the rise of golang. The tools we know yaml from - the dockers, the kubernetes, the so on and so forths - are largely written in golang.
In go world, yaml is trivial to parse. By trivial, I mean: no trouble whatsoever. Go's struct tags, along with marshal/unmarshal, made config parsing a non-problem. In a world where parsing concerns are non-existent on the code side, we naturally err towards DX on the human side.
This doesn't necessarily hold true in Node, Python, or other languages, but - again - yaml's rise is go's rise.
Edit: I processed this little bit
This hints more at established languages like C and Go. Ampersand is the traditional syntax for a pointer. Generally speaking, the people who write lang specs aren't going to give too much weight to what powershell does.
Edit: some additions.
XML is trivial to parse, and by that I mean someone wrote a lexer and parser so you don't have to.
The majority of my argument was about the human need to understand the intricacies of the language, but your rebuttal focuses only on the machine aspect. Why?
We walk a fine line between person and machine. More often than not, the machines win.
If anything, I played the middle ground - the "humans and machines can get along now" side of things.
Really, though, it's purely objective:
When go dies, yaml will die.
Maybe?
Go is not YAML's only friend. It happens that Python syntax shares several similarities with YAML, and using YAML is very natural to a Python programmer. Go may die, but YAML will survive as long as Python does ;-).
YAML is indeed irregular and hard to consistently parse.
It also suffers from the same problems most configuration languages have: in particular, it's repetitive and non-programmable. This often leads to writing config generators, which in turn introduce generation failures.
I think Dhall gets all of this right. Its syntax is regular. It's programmable, but not Turing-complete. It can't error out or loop infinitely, which gives it all the reliability of a static config file without the repetitiveness.
dhall-lang.org/
Any parser implementing the specification will consistently parse. What kind of inconsistencies are you talking about?
This link summarises some of them:
github.com/cblp/yaml-sucks
The problem is the language specification is poorly defined and full of irregularity and corner cases leading to inconsistent parsers.
Much of the problem comes from YAML allowing unquoted strings, which leaves nothing to clearly distinguish them from other data types. This puts parsers in the awkward situation of needing to decide, through complicated rules, whether a token is a string or something else.
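A rough sketch of what that decision looks like, in Python. The function name `resolve_plain_scalar` and the patterns are assumptions for illustration, written in the spirit of YAML 1.1 implicit typing. The real rules cover many more cases (octal, sexagesimal numbers, timestamps and so on), and real parsers differ in which ones they apply.

```python
import re

# Simplified stand-ins for the implicit typing rules. Assumption: matching is
# case-insensitive here, which is looser than the actual specification.
BOOL_TRUE = {"yes", "true", "on"}
BOOL_FALSE = {"no", "false", "off"}
NULLS = {"", "~", "null"}
INT_RE = re.compile(r"[-+]?[0-9]+")
FLOAT_RE = re.compile(r"[-+]?[0-9]*\.[0-9]+([eE][-+]?[0-9]+)?")


def resolve_plain_scalar(token):
    """Guess the type of an unquoted token, the way a parser has to."""
    lowered = token.lower()
    if lowered in NULLS:
        return None
    if lowered in BOOL_TRUE:
        return True
    if lowered in BOOL_FALSE:
        return False
    if INT_RE.fullmatch(token):
        return int(token)
    if FLOAT_RE.fullmatch(token):
        return float(token)
    # Only when every other rule fails is the token treated as a string.
    return token


for token in ["on", "NO", "1.10", "3", "hello", "null"]:
    print(repr(token), "->", repr(resolve_plain_scalar(token)))
```

Under these rules a key named `on` becomes a boolean, the country code `NO` becomes false, and the version string `1.10` becomes the float 1.1. Quoting the value sidesteps all of it, because a quoted scalar never goes through this guessing step.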
That reminds me of Lua. It isn't Turing complete, but if it has functions I am curious how serialization works with that.
As a write only config I'd recommend Lua.
I've never used Lua, but I get the impression the goals are a bit different, in that Lua aims to be a programming language for embedding into an application and providing a scripting interface for it. As such, I think it's probably a bit heavyweight for a simple configuration language. I expect that using it as a configuration language introduces the same problems that config generation introduces, in that it can crash or hang, which are usually not desirable properties for a configuration language. But, being an embedded language, it does at least mean your application can load it directly instead of config generation needing to be an intermediate step.
This table summarises what I think are some desirable traits in a configuration language, and how well various languages support them:
Agreed. Lua has many problems as pure configuration. But it does have ancestry as a configuration language.
I guess I just think there could have been some work on perfecting what it was doing, like a limited subset, LSTN (Lua Simple Table Notation).
BTW, I hate Lua as a language.
I would encourage you to avoid telling people what to do. Perhaps say "I don't like YAML, here is why", but trying to put a stop to it just because it's not for you seems a bit egotistical. I personally am not a fan of .NET, but never would I say "Don't use it!". New developers may come to your article and take what you're saying as truth, and it could hurt them in the long run. It's best to provide examples of why you don't like something, and let people decide for themselves.
Exactly. If he doesn't like it, he shouldn't use it or anything that requires it. Don't tell others what to do. JSON, for example, has its own quirks and is super hard for a human to read, so using it for human-edited config is just crazy.
If YAML is being human typed, it seems like mistakes (like the & for powershell) would be fairly obvious thanks to syntax highlighting. Many languages have built in keywords/symbols, and users can avoid using keywords as variable names because they are highlighted as soon as they are typed. I use this method especially when using a language I'm not familiar with.
The same argument could be made for PowerShell itself. If I am using cd and the path includes a & (or one of several other special characters) quotes are needed, just like YAML. Maybe you would also criticize PowerShell for this as well, I'm not sure.
If you're ever concerned about ambiguity, just always use quotes. I would argue that English is the root of ambiguity, and whenever a language is made Grandma-readable it necessarily must introduce ambiguity.
It seems a little unfair to say to stop using YAML, but not offer a clear alternative.
I don't think YAML is without fault, but many issues, like machine editability (while preserving comments), were not brought up.
I'm not sure what you mean about whitespace only lacking meaning inside of lists. Could you explain that further?
I'm also not sure what you mean about the JSON reference conversation. Could you explain that a bit as well?
But that is exactly what I hate. Now I must arbitrarily give advice to always quote or learn the many ways yaml will bite you.
Lists can either be indented, or reside at the same level as their parent.
People use "$ref": by convention in JSON in order to express references.
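A minimal sketch of that convention in Python, under stated assumptions: references look like `{"$ref": "#/path/to/node"}` and point inside the same document. The `lookup` and `resolve_refs` helpers and the sample document are hypothetical. The sketch skips list indices, escape sequences, external files, and cycle detection, all of which a real resolver would need.

```python
import json


def lookup(root, pointer):
    """Follow a '#/a/b' style pointer from the document root."""
    node = root
    for part in pointer.lstrip("#").strip("/").split("/"):
        node = node[part]
    return node


def resolve_refs(node, root):
    """Return a copy of node with every {"$ref": ...} object replaced."""
    if isinstance(node, dict):
        if set(node) == {"$ref"}:
            return resolve_refs(lookup(root, node["$ref"]), root)
        return {key: resolve_refs(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_refs(value, root) for value in node]
    return node


doc = json.loads("""
{
  "defaults": {"db": {"host": "localhost", "port": 5432}},
  "staging": {"db": {"$ref": "#/defaults/db"}},
  "prod": {"db": {"$ref": "#/defaults/db"}}
}
""")

resolved = resolve_refs(doc, doc)
print(json.dumps(resolved["staging"], sort_keys=True))
```

The point of the comparison is that the reference is ordinary data here, an object with one string key, so the parser stays simple and the application decides whether and when to resolve it.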
I did not make an alternative recommendation because there are so many out there. Yes, having to learn 20 configuration languages instead of YAML probably isn't worth it.
You mentioned machine editability (preserving comments). I don't see this usage. In fact, I don't think I've seen a case where the machine writes YAML, only humans.
Do you hate this about PowerShell and Bash as well?
That's because it doesn't work well. References are processed on load/import and that breaks the export, comments are destroyed, the style (quoted or unquoted) is ignored, etc. PyYaml is the only one with debatable support for comment preservation and it has plenty of problems. This is one of the weaknesses of YAML, and why it is a problematic replacement for many setup config files like the package.json.
Thanks for the clarifications, they make sense now.
However, I'm now confused why you would title this "Stop Using YAML" if you still think YAML is better than learning 20 different configuration languages.
YAML advertises itself as human friendly, then has the same complexity as Bash... Bash and PowerShell don't do that.
It is interesting that you list problems with yaml others are saying it does well... Hmmm...
Maybe another comment has mentioned this, but you might like StrictYAML. It is a subset of YAML designed specifically to weed out all ambiguity. It might even be the solution you want to suggest as an alternative to YAML.
I can understand where you're coming from with Bash and PowerShell. Learning by mistakes was painful and very confusing in Bash, and I honestly think shell languages need a complete redesign. Funnily enough, I actually made an experimental shell to address this by having it accept all arguments in the form of YAML to reduce ambiguity.
I love YAML, but I don't want to pretend that it doesn't have serious flaws. Which is why I mention the problems that it has. I really want it to completely replace JSON, but it is going to need improvements before that is realistic.
For me personally, if it is humans writing YAML with syntax highlighting enabled, I think YAML is the gold standard for config files.
Hi, I am working on the StackStorm open source project to automate some DevOps-y things, and the workflows are built in YAML. It is human readable but brittle. I have been wondering if it's the best way to define workflows, which may be more dynamic and could be better expressed differently, such as with bitwise operators.
So I am interested in alternatives without throwing out the baby with the bathwater, and I'm curious about StrictYAML.
But since a workflow ideally has programmability and graph attributes, what other alts are there?
I don't have vast experience with different alternatives. I think one of the big challenges often faced is you want something readily available in multiple languages.
I really like Lua. Not as a programming language, but the lightweight embedded part. They kept the syntax light, but didn't go overboard like YAML.
Agreed, whitespace sensitivity reminds me of the old-school Makefile. It drives me crazy. I have to use an online YAML to JSON conversion tool every time to verify correctness. Such a bad idea. And worse, the unnecessary flexibility just gives you plenty of surprises. For example: if you have a key "on", it will surprisingly be converted to true. I have wasted so much time troubleshooting these kinds of weird problems.
Like any language, it will have its idiosyncrasies. My recommendation would be to learn them and internalize them; then they won't distract you as much, and they'll stop being a pain point.
That said, "the right tool for the job" rule suggests that calling out to a shell script might be the right thing for anything more complex than a simple command.
Significant whitespace and ampersands-for-macros don't a bad language make.
I feel like you didn't internalize my arguments. Calling out to a shell script for more complex things is exactly what I was doing.
Needing to internalize all of YAML in order to use it was kind of my point. YAML looks easy and human friendly, but rather than learning a few things which are common across almost any language (strings use quotes, thus quotes need escaping, and thus the escape character needs escaping), you need to know a very special language. So why not instead internalize a simple, but annoying, language?
I think you are being harsh with YAML. There are tools/libraries that generate perfectly valid YAML and linters (and formatters) in every popular text editor to help you understand what you are writing and correct you if you are wrong. If you forget to quote a & character, this is the equivalent of a typo and you make them in every (well or not well) defined language. Read the manual, learn the rules and you will be fine.
Isn't that exactly my point? Did you look into my reference to the uncanny valley?
I didn't get that point from your post. Which language have you used without reading the rules first? If you don't, you will make mistakes sooner or later. The fact that you don't like some of these rules is not a valid reason to go on a crusade for people to stop using it. Personally, I think human readability is a totally legit reason for some people, even if it doesn't tick any other box. It's the same reason why some languages enforce indentation and others don't, some use parentheses and colons and others don't, and people choose one or the other. It's called flavor and, in my mind, the more flavors the better.
I'll check the reference, thanks!
YAML is good for simple use, like Windows INI. It is good for data collection from many sources. It is good for simple append data into file.
But I prefer XML for complex hierarchical data structures. XML Schema is good, but schema-free, well-formed XML is good enough. Elements, attributes, entities, comments, processing instructions, charsets, tools, ... all you need is XML.
Agree 100%. YAML should be avoided. Its invention was well-intended (and even welcome at the time). A good reference is:
noyaml.com/
What is needed is programmable configuration, in a programming language with modern tooling. Alternatives to YAML include Dhall, Cue, Jsonnet, and even the language you are using if it comes with something like load/eval (e.g. Scheme, Python, JavaScript). Using a general purpose programming language is discouraged as a configuration language because of side effects. What would be the best of both worlds, is to have the most fancy-pants language with a total subset that is for programmable configuration. Idris 2 could be ideal with a few strategic enhancements.
YAML is precisely defined: go to yaml.org and RTFM.
(BTW, XML is not "based" on HTML. It is a subset of SGML.)
Reread what I wrote.
If you know the spec, you should never be "surprised".
And, I don't know how it could be "hard to parse". There are parser libraries for every language and they work perfectly well.
Maybe the sole fault of YAML is to look so simple that people think they don't need to read the spec. Experienced IT'ers do not fall in this trap.
Don't you think that the problem is not the tool but the way it's used?
Yes, I'm calling out two uses people should reconsider. I'm giving an explanation of the issues so that others don't make the same mistake.
Now I'd argue that YAML provides too much and draws people to use it inappropriately. It makes me wonder what an appropriate usage is.
Well, any usage that solves your current problems is an appropriate one. But once you bump into problems, it could mean one of two things: the tool is being used in the wrong way, or the tool itself is wrong. Mostly, it's the former. IMO, at that point it is better to learn the tool more deeply rather than switching to some other tool. It would be great to see some examples that drive you mad. That could clarify things; in my experience I've never had YAML files so big that they were hard to parse. And from my experience working with Rails, even very big files (translations, usually) don't cause any problems or irritation. Yes, sometimes it's hard to follow the indentation. But with JSON it's hard to follow the braces and brackets, and the indentation as well, if it's for humans. So I would say JSON is even worse.
How about TOML?
Have not really used it, other than for GitLab config.
Reading its spec I'd say it probably isn't as problematic. Though it may have different issues since the same data structure can be defined multiple ways and is invalid.
I love yml
Sounds like the real problem is you're using powershell.