YAML is everywhere in modern CI/CD pipelines — GitHub Actions, GitLab CI, Kubernetes manifests, Helm charts, and infrastructure configs.
But the moment you pass that YAML into APIs, SDKs, or validation tools, JSON suddenly becomes mandatory.
On paper, converting YAML to JSON sounds trivial.
In practice, it’s one of the most common sources of silent CI failures.
Let’s break down why this happens, what usually goes wrong, and how to handle YAML → JSON safely in CI pipelines.
Why CI Pipelines Often Need JSON
Many CI tools accept YAML as input but convert or forward the configuration as JSON internally:
APIs expect strict JSON
Schema validators work on JSON
Helm renders YAML → JSON
Custom build steps serialize configs to JSON
So even if your repo uses YAML, JSON is almost always involved downstream.
The Hidden Problem: YAML Is More Permissive Than JSON
YAML is designed for humans.
JSON is designed for machines.
That difference causes subtle but dangerous issues.
1️⃣ Duplicate Keys (The Silent Killer)
The YAML spec requires mapping keys to be unique, but many parsers accept duplicates without raising an error:
env:
  NODE_ENV: production
  NODE_ENV: staging
Most YAML parsers will silently overwrite the first value.
After conversion, JSON ends up with:
{
  "env": {
    "NODE_ENV": "staging"
  }
}
Your pipeline passes.
Your app behaves incorrectly.
No error is reported.
This bug is easy to miss because nothing in the pipeline flags it.
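One way to surface the problem is to make the parser refuse duplicates. The following is a minimal sketch, assuming PyYAML is the parser in use; `UniqueKeyLoader` is a hypothetical name, not a class that PyYAML ships. It only checks scalar keys and skips the `<<` merge key so that merged defaults are not reported as duplicates.

```python
import yaml  # PyYAML

MERGE_TAG = "tag:yaml.org,2002:merge"


class UniqueKeyLoader(yaml.SafeLoader):
    """SafeLoader variant (hypothetical) that rejects duplicate mapping keys."""

    def construct_mapping(self, node, deep=False):
        seen = set()
        for key_node, _ in node.value:
            # Skip merge keys (<<) and non-scalar keys; only plain keys are checked.
            if key_node.tag == MERGE_TAG or not isinstance(key_node, yaml.ScalarNode):
                continue
            key = self.construct_object(key_node, deep=deep)
            if key in seen:
                raise yaml.constructor.ConstructorError(
                    None, None, f"duplicate key {key!r}", key_node.start_mark
                )
            seen.add(key)
        # Hand over to the normal SafeLoader logic once the keys are known to be unique.
        return super().construct_mapping(node, deep=deep)


doc = "env:\n  NODE_ENV: production\n  NODE_ENV: staging\n"
try:
    yaml.load(doc, Loader=UniqueKeyLoader)
except yaml.constructor.ConstructorError as exc:
    print(f"rejected: {exc.problem}")
```

Run as a CI step, a loader like this turns the silent overwrite into a failed job.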
2️⃣ Indentation Errors That “Parse” but Break Logic
YAML indentation defines structure.
This looks valid:
steps:
- name: build run: echo "Building"
But depending on the parser, this may serialize incorrectly or fail schema validation after conversion.
CI tools often don’t validate YAML deeply before passing it along.
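A structural check on the parsed data can catch this before the config is forwarded. The sketch below assumes each step must be a mapping with string `name` and `run` fields, which is inferred from the snippet above rather than from any particular CI tool's schema; `check_steps` is a hypothetical helper.

```python
def check_steps(config):
    """Return a list of problems found under config["steps"] (empty if none)."""
    steps = config.get("steps")
    if not isinstance(steps, list):
        return ["'steps' must be a list"]

    problems = []
    for i, step in enumerate(steps):
        if not isinstance(step, dict):
            problems.append(f"steps[{i}] is {type(step).__name__}, expected a mapping")
            continue
        # Assumed required fields; adjust to the schema your tool actually uses.
        for key in ("name", "run"):
            if not isinstance(step.get(key), str):
                problems.append(f"steps[{i}].{key} is missing or not a string")
    return problems


# A step whose "run" line was swallowed by bad indentation is reported.
print(check_steps({"steps": [{"name": "build"}]}))
```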
3️⃣ Anchors & Aliases Don’t Translate Cleanly
YAML supports reuse:
defaults: &defaults
  retries: 3
  timeout: 60
job:
  <<: *defaults
  timeout: 30
After conversion, tools handle this differently:
some inline the merged values
some drop the anchors entirely
some fail schema validation
JSON has no concept of anchors, so the reuse cannot survive conversion as-is.
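As an illustration of the inlining behavior, here is a small sketch using PyYAML, which resolves the `<<` merge key while loading. Other parsers may behave differently, so treat this as one data point rather than a rule.

```python
import json
import yaml  # PyYAML

DOC = """\
defaults: &defaults
  retries: 3
  timeout: 60
job:
  <<: *defaults
  timeout: 30
"""

data = yaml.safe_load(DOC)

# The merge key is resolved at load time: "job" gets its own copy of the
# defaults, and the local "timeout" overrides the merged one.
print(json.dumps(data["job"]))  # {"retries": 3, "timeout": 30}

# The anchored "defaults" block is still emitted as a regular top-level key.
print("defaults" in data)  # True
```

Note that the `defaults` block ends up in the JSON too, which can trip schemas that reject unknown top-level keys.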
4️⃣ Data Types Change Without Warning
YAML guesses types:
enabled: yes
version: 01
Depending on the parser, this may convert to:
{
  "enabled": true,
  "version": 1
}
If your API expects strings, this breaks compatibility.
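The effect is easy to reproduce. The sketch below assumes PyYAML, which applies YAML 1.1 typing rules; parsers that follow YAML 1.2 may keep `yes` as a string. Quoting the values is the simplest way to pin them as strings.

```python
import json
import yaml  # PyYAML, which follows YAML 1.1 typing rules

unquoted = yaml.safe_load("enabled: yes\nversion: 01\n")
quoted = yaml.safe_load('enabled: "yes"\nversion: "01"\n')

print(json.dumps(unquoted))  # {"enabled": true, "version": 1}
print(json.dumps(quoted))    # {"enabled": "yes", "version": "01"}
```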
Common CI Conversion Methods (and Their Limits)
yq
yq -o=json config.yaml
✅ Fast
❌ Does not detect duplicate keys by default
❌ Depends heavily on yq version
Python (PyYAML)
import json
import yaml

with open("config.yaml") as f:  # close the file handle when done
    data = yaml.safe_load(f)
print(json.dumps(data))
✅ Flexible
❌ safe_load still allows duplicate keys
❌ Requires custom validation logic
Helm toJson
{{ toJson .Values.config }}
✅ Works inside Helm
❌ Debugging is painful
❌ Errors surface late in the pipeline
Best Practice: Validate Before Conversion
Before converting YAML to JSON in CI:
Validate indentation
Detect duplicate keys
Confirm data types
Inspect final JSON output
This should happen before the config reaches your API or deployment step.
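The "confirm data types" step can be sketched as a small check over the parsed config. Everything here is illustrative: `find_type_mismatches` is a hypothetical helper, and the dotted paths and expected types are assumptions standing in for whatever your API actually requires.

```python
def find_type_mismatches(data, expected):
    """Compare values at dotted paths against the expected Python types."""
    mismatches = []
    for path, want in expected.items():
        value = data
        for part in path.split("."):
            if not isinstance(value, dict) or part not in value:
                mismatches.append(f"{path}: missing")
                break
            value = value[part]
        else:
            # bool is a subclass of int in Python, so compare exact types.
            if type(value) is not want:
                mismatches.append(
                    f"{path}: expected {want.__name__}, got {type(value).__name__}"
                )
    return mismatches


config = {"enabled": True, "version": 1, "env": {"NODE_ENV": "staging"}}
expected = {"enabled": str, "version": str, "env.NODE_ENV": str}

for problem in find_type_mismatches(config, expected):
    print(problem)
```

In a pipeline, a non-empty result would be a reason to fail the job before the deployment step runs.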
A Practical Debugging Tip (Saved Me Many Times)
When a CI pipeline fails after conversion, I first:
Paste the YAML into a strict YAML → JSON converter
Inspect the JSON output
Look for:
missing fields
overwritten keys
unexpected booleans or numbers
I often use a browser-based YAML to JSON converter like
👉 https://jsonviewertool.com/yaml-to-json
It runs fully client-side and helps quickly spot structural issues before re-running the pipeline.
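The same inspection can be scripted when a known-good JSON output is available to compare against. This is a rough sketch using only the standard library; `leaf_types` and `compare_outputs` are hypothetical helpers, and the sample data is made up for illustration.

```python
def leaf_types(data, prefix=""):
    """Map each leaf's path (e.g. "env.PORT", "steps[0].name") to its type name."""
    out = {}
    if isinstance(data, dict):
        for key, value in data.items():
            out.update(leaf_types(value, f"{prefix}.{key}" if prefix else str(key)))
    elif isinstance(data, list):
        for i, value in enumerate(data):
            out.update(leaf_types(value, f"{prefix}[{i}]"))
    else:
        out[prefix] = type(data).__name__
    return out


def compare_outputs(known_good, converted):
    """Return (missing paths, paths whose type changed) as sorted lists."""
    before, after = leaf_types(known_good), leaf_types(converted)
    missing = sorted(before.keys() - after.keys())
    retyped = sorted(p for p in before.keys() & after.keys() if before[p] != after[p])
    return missing, retyped


known_good = {"env": {"NODE_ENV": "production", "PORT": "8080"}, "enabled": "yes"}
converted = {"env": {"NODE_ENV": "staging"}, "enabled": True}

missing, retyped = compare_outputs(known_good, converted)
print("missing:", missing)
print("type changed:", retyped)
```

This flags missing fields and retyped values; a changed value of the same type, such as an overwritten key, still needs a value-level diff.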
When Should You Avoid Conversion Entirely?
If possible:
Keep YAML all the way through (Helm → Kubernetes)
Or define configs natively in JSON when APIs are involved
Conversion should be intentional, not accidental.
Final Thoughts
YAML → JSON conversion isn’t hard — it’s deceptively dangerous.
Most CI failures caused by it:
don’t throw errors
pass validation
break production behavior later
Treat conversion as a validation step, not a formatting step.
Your CI pipelines — and future self — will thank you.