JSON vs YAML: when to use each and conversion gotchas
JSON dominates APIs and YAML dominates config — they have complementary strengths. This article walks through the differences and the traps when converting between them.
Same data, two formats
JSON:
{
  "name": "Alice",
  "age": 30,
  "skills": ["JavaScript", "Python"],
  "address": {
    "city": "Tokyo",
    "zip": "100-0001"
  }
}
YAML:
name: Alice
age: 30
skills:
  - JavaScript
  - Python
address:
  city: Tokyo
  zip: '100-0001'
YAML is easier for humans (indentation, few quotes). JSON is easier for machines (rigid grammar).
Key differences
| Property | JSON | YAML |
|---|---|---|
| Comments | None | # supported |
| Trailing comma | Disallowed | n/a |
| String quotes | Required | Often optional |
| Newlines in strings | Escaped as \n | Kept literally in block scalars |
| Anchors / refs | None | & and * |
| Multi-doc | One per file | --- separators |
| Type inference | Strict | Loose (Norway problem) |
JSON strengths
Simple and unambiguous:
- Types — string, number, boolean, null, array, object.
- No comments.
- No trailing commas.
- Fast to parse.
Well suited to API responses. Standardized as ECMA-404 and RFC 8259.
YAML strengths
Human-friendly:
- Comments allowed.
- String quoting often optional.
- Indentation expresses hierarchy.
- Multiple documents per file (--- separators).
- Anchors / references for DRY.
Used for config files (CI/CD, Kubernetes, Docker Compose).
YAML 1.1 vs 1.2
YAML 1.1 has infamous gotchas:
country: NO
isOpen: yes
disabled: off
- NO → boolean false (the user meant the country code for Norway).
- yes → boolean true.
- off → boolean false.
This is the “Norway problem”. Fixed in YAML 1.2 but many tools (Docker Compose, GitHub Actions) still use 1.1 semantics.
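One way to catch such values before they reach a YAML 1.1 parser is to check each string against the 1.1 boolean literals. The sketch below is a minimal illustration in Python. The names `YAML11_BOOL_LITERALS`, `needs_quoting`, and `quote_if_needed` are hypothetical helpers, not part of any library, and the literal list follows the YAML 1.1 bool type, which some parsers implement only in part.

```python
# Literals that the YAML 1.1 bool type resolves to true/false when unquoted.
# Some parsers implement only a subset (for example, skipping the
# single-letter forms), so treat this list as the conservative superset.
YAML11_BOOL_LITERALS = frozenset(
    "y Y yes Yes YES n N no No NO "
    "true True TRUE false False FALSE "
    "on On ON off Off OFF".split()
)


def needs_quoting(value: str) -> bool:
    """Return True if an unquoted scalar could be read as a boolean."""
    return value in YAML11_BOOL_LITERALS


def quote_if_needed(value: str) -> str:
    """Wrap boolean-shaped strings in single quotes; leave others alone."""
    return f"'{value}'" if needs_quoting(value) else value


for code in ["NO", "SE", "yes", "Tokyo"]:
    print(f"country: {quote_if_needed(code)}")
```

A check like this fits naturally into a config generator or a pre-commit lint step, where the strings are still known to be strings.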
Safe practice — quote boolean-like strings:
country: 'NO'
isOpen: 'yes'
Comments
YAML supports them, JSON doesn’t:
# Dev environment config
host: localhost
port: 3000 # default port
Converting YAML → JSON drops comments. Round-tripping back can’t recover them.
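The loss is easy to reproduce. The following sketch assumes the third-party PyYAML package is installed (`import yaml`) and uses it together with the standard `json` module; other YAML libraries may behave differently, and some offer comment-preserving modes.

```python
import json

import yaml  # PyYAML, a third-party package

source = """\
# Dev environment config
host: localhost
port: 3000 # default port
"""

# Parsing keeps only the data; comments are discarded by the loader.
data = yaml.safe_load(source)
as_json = json.dumps(data)
print(as_json)

# Converting back produces valid YAML, but the comments are gone for good.
round_tripped = yaml.safe_dump(json.loads(as_json))
print(round_tripped)
```

If comments matter, treat the YAML file as the source of truth and regenerate the JSON from it, never the other way around.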
Dates
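YAML 1.1 defines a native timestamp type, and many parsers still resolve unquoted dates to it. The YAML 1.2 core schema does not include it, so a strict 1.2 parser reads the same values as strings:
created: 2024-01-15
modified: 2024-01-15T10:30:00Z
Where timestamps are resolved, these parse as date objects. JSON has no native date type, so they have to be strings:
{
  "created": "2024-01-15",
  "modified": "2024-01-15T10:30:00Z"
}
How YAML→JSON converters handle dates is implementation-dependent. Most emit strings, but not necessarily in the original form (a date-only value may come back with a time component).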
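When a parser does hand back date objects, the JSON encoder needs an explicit policy for them. A minimal Python sketch using only the standard library: the `parsed` dictionary is a hand-written stand-in for parser output, and `encode_dates` is a hypothetical helper name.

```python
import json
from datetime import date, datetime, timezone

# Stand-in for what a YAML parser that resolves timestamps might return.
parsed = {
    "created": date(2024, 1, 15),
    "modified": datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc),
}


def encode_dates(obj):
    """json.dumps fallback: turn date/datetime values into ISO 8601 strings."""
    if isinstance(obj, (date, datetime)):
        return obj.isoformat()
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


as_json = json.dumps(parsed, default=encode_dates, indent=2)
print(as_json)
```

Note that `isoformat()` writes the UTC offset as `+00:00` rather than `Z`, so even a faithful conversion may not reproduce the original text byte for byte.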
Numeric precision
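YAML parsers resolve unquoted digit strings as numbers where they can, and YAML 1.1 is especially eager:
phone: 09012345678   # number? string?
postal: 0123456      # octal?
hex: 0xFF            # hex
Depending on the parser and YAML version, these may be read numerically: YAML 1.1 treats a leading 0 followed by octal digits as octal, while the YAML 1.2 core schema reads the digits as decimal. Either way the leading zero is lost, and very large values can lose precision. Always quote phone numbers and postal codes:
phone: '09012345678'
postal: '0123456'
Key order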
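JSON: the spec doesn’t guarantee key order, but most implementations preserve insertion order. YAML: mappings are also unordered by spec, and whether order survives is parser-dependent.
For order-sensitive maps, neither format is safe by default. Use a sequence, or YAML’s !!omap tag where the parser supports it.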
Duplicate keys
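YAML requires mapping keys to be unique. JSON is looser: RFC 8259 only says names should be unique. In practice enforcement varies:
{
  "key": "first",
  "key": "second"
}
key: first
key: second
- JSON: depends on the parser (“last wins”, error, warning).
- YAML 1.2: should error, though some parsers silently keep the last value.
Formatters sometimes silently dedupe.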
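If silent deduplication is a concern, the check can be made explicit on the JSON side. The sketch below uses the `object_pairs_hook` parameter of Python's standard `json.loads`; `reject_duplicates` is a hypothetical helper name, and raising `ValueError` is one possible policy among several.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that raises instead of letting the last key win."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


text = '{"key": "first", "key": "second"}'

# Default behaviour of Python's json module: the last value wins silently.
print(json.loads(text))

# Strict behaviour: the hook sees every pair, including repeats.
try:
    json.loads(text, object_pairs_hook=reject_duplicates)
except ValueError as exc:
    print(exc)
```

The hook is called for every object in the document, so nested duplicates are caught as well.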
Anchors and references
YAML-only:
defaults: &defaults
  timeout: 30
  retries: 3
prod:
  <<: *defaults
  host: prod.example.com
dev:
  <<: *defaults
  host: dev.example.com
&defaults declares an anchor, *defaults references it, and <<: merges the referenced mapping into the current one. Useful for DRY config.
JSON has nothing equivalent.
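When such a file is converted to JSON, the anchors are expanded and the shared values are simply repeated. A minimal Python sketch of that expansion, with the dictionaries written by hand rather than produced by a YAML parser (the variable names are illustrative):

```python
import json

# The mapping behind the &defaults anchor.
defaults = {"timeout": 30, "retries": 3}

# What `<<: *defaults` plus a host key amounts to: copy, then add or override.
config = {
    "defaults": defaults,
    "prod": {**defaults, "host": "prod.example.com"},
    "dev": {**defaults, "host": "dev.example.com"},
}

# JSON has no way to express the sharing, so the values are repeated.
print(json.dumps(config, indent=2))
```

Converting the resulting JSON back to YAML gives three independent copies of the defaults, so the DRY structure does not survive a round trip.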
File size
For the same data:
- JSON — bloated by quotes and commas.
- YAML — heavy indentation can outweigh that.
APIs use gzip, so raw size differences are usually moot.
Streaming
JSON — JSONL (JSON Lines) gives one document per line:
{"id": 1, "name": "a"}
{"id": 2, "name": "b"}
YAML: a multi-document stream (--- separators) can be read one document at a time, though support is parser-dependent.
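Reading JSONL needs nothing beyond a line loop and a JSON parser. A minimal Python sketch using the standard library: `read_jsonl` is a hypothetical helper name, and skipping blank lines is a leniency chosen for this example.

```python
import io
import json
from typing import Iterator, TextIO


def read_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# io.StringIO stands in for an open file or a network stream.
source = io.StringIO('{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n')
for record in read_jsonl(source):
    print(record["id"], record["name"])
```

Because the function is a generator, only one record is held in memory at a time, which is the point of the format.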
Security
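YAML tags like !!python/object can make a parser construct arbitrary objects, so loading untrusted YAML with a permissive loader is dangerous:
!!python/object/apply:os.system
- 'rm -rf /'
In PyYAML, yaml.load() with an unsafe loader (the default in old releases) will execute this; use yaml.safe_load() for anything untrusted. Ruby’s Psych had the same problem with its default load before version 4.
JSON has no such issue: it has no tag mechanism, so parsing yields only plain data.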
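The difference is easy to see with a harmless payload. This sketch assumes the third-party PyYAML package is installed; the command in the payload is swapped for a harmless `echo`, and `safe_load` never runs it in any case.

```python
import yaml  # PyYAML, a third-party package

# Same tag as the payload above, with a harmless command substituted.
untrusted = "!!python/object/apply:os.system\n- 'echo pwned'\n"

try:
    yaml.safe_load(untrusted)
except yaml.YAMLError as exc:
    # safe_load only knows the standard YAML tags, so the python/* tag is rejected.
    print(f"rejected: {type(exc).__name__}")

# Plain data loads as usual.
print(yaml.safe_load("host: localhost\nport: 3000\n"))
```

Catching `yaml.YAMLError` covers both malformed input and rejected tags, which is usually what an input-validation layer wants.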
Choosing between them
| Use case | Pick | Reason |
|---|---|---|
| API request/response | JSON | speed, interop |
| Logs | JSON | parse speed |
| Config files | YAML | readability, comments |
| CI/CD pipelines | YAML | de facto standard |
| Machine-to-machine | JSON | strict spec |
| Human-edited | YAML | comfortable to write |
“Config in YAML, data in JSON” is the rule of thumb.
TOML — a third option
TOML (Tom’s Obvious Minimal Language) is rising:
name = "Alice"
age = 30
[address]
city = "Tokyo"
zip = "100-0001"
- INI-file flavor.
- Less ambiguous than YAML.
- Cargo (Rust), Hugo, pyproject.toml use it.
Increasingly common where you want config-grade ergonomics without YAML’s traps.
Summary
- JSON for APIs and machine I/O.
- YAML for configs, with comments and human authoring.
- YAML 1.1 Norway problem — quote boolean-shaped strings.
- YAML → JSON loses comments.
- Don’t yaml.load() untrusted input.
For round-tripping JSON ↔ YAML, the JSON-to-YAML tool on this site handles both directions.