JSON vs YAML: when to use each and conversion gotchas

JSON dominates APIs and YAML dominates config — they have complementary strengths. This article walks through the differences and the traps when converting between them.

Same data, two formats

{
	"name": "Alice",
	"age": 30,
	"skills": ["JavaScript", "Python"],
	"address": {
		"city": "Tokyo",
		"zip": "100-0001"
	}
}

name: Alice
age: 30
skills:
  - JavaScript
  - Python
address:
  city: Tokyo
  zip: '100-0001'

YAML is easier for humans to read and write (indentation, few quotes). JSON is easier for machines to parse (rigid grammar).

Key differences

Property         JSON           YAML
Comments         None           # supported
Trailing comma   Disallowed     n/a
String quotes    Required       Often optional
Newlines         \n             Preserved
Anchors / refs   None           & and *
Multi-doc        One per file   --- separators
Type inference   Strict         Loose (Norway problem)

JSON strengths

Simple and unambiguous:

  • Types — string, number, boolean, null, array, object.
  • No comments.
  • No trailing commas.
  • Fast to parse.

Well suited to API responses. Standardized as ECMA-404 and RFC 8259.

YAML strengths

Human-friendly:

  • Comments allowed.
  • String quoting often optional.
  • Indentation expresses hierarchy.
  • Multiple documents per file (--- separators).
  • Anchors / references for DRY.

Used for config files (CI/CD, Kubernetes, Docker Compose).

YAML 1.1 vs 1.2

YAML 1.1 has infamous gotchas:

country: NO
isOpen: yes
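disabled: off

  • NO → boolean false (the author meant the country code for Norway).
  • yes → boolean true.
  • off → boolean false.

This is the “Norway problem”. YAML 1.2 narrowed booleans to true and false, but many widely used parsers still apply 1.1 rules, so tools built on them (Docker Compose and GitHub Actions are often-cited examples) can still be affected.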

Safe practice — quote boolean-like strings:

country: 'NO'
isOpen: 'yes'
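
When YAML is generated by a script, the quoting can be automated. The sketch below is illustrative only: needs_quotes and emit_scalar are hypothetical helper names, and the word list is the set of boolean spellings from the YAML 1.1 bool type. It covers only the boolean case, not numbers or dates.

```python
# Spellings that the YAML 1.1 bool type may resolve to true/false when unquoted.
YAML11_BOOL_WORDS = {
    "y", "Y", "yes", "Yes", "YES",
    "n", "N", "no", "No", "NO",
    "true", "True", "TRUE",
    "false", "False", "FALSE",
    "on", "On", "ON",
    "off", "Off", "OFF",
}


def needs_quotes(value: str) -> bool:
    """Return True if the plain scalar is one of the YAML 1.1 boolean spellings."""
    return value in YAML11_BOOL_WORDS


def emit_scalar(value: str) -> str:
    """Single-quote boolean-shaped strings; leave everything else untouched."""
    return f"'{value}'" if needs_quotes(value) else value
```

With this, emit_scalar("NO") produces 'NO' (quoted), while emit_scalar("Tokyo") is returned unchanged.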

Comments

YAML supports them, JSON doesn’t:

# Dev environment config
host: localhost
port: 3000 # default port

Converting YAML → JSON drops comments. Round-tripping back can’t recover them.

Dates

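YAML 1.1 defines a timestamp type (the YAML 1.2 core schema dropped it, though many parsers still recognize it):

created: 2024-01-15
modified: 2024-01-15T10:30:00Z

Parsers that support the type return native date objects. JSON has no date type, so the same values are plain strings: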

{
	"created": "2024-01-15",
	"modified": "2024-01-15T10:30:00Z"
}

How YAML→JSON converters handle dates is implementation-dependent. Most emit them as strings, but the exact format (date only, timezone, fractional seconds) can differ from the original text.
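
On the JSON side, the consumer has to turn those strings back into dates. A minimal sketch using only the Python standard library, with the field names from the example above; parse_iso is a hypothetical helper, and it assumes the values are ISO 8601 strings.

```python
import json
from datetime import date, datetime

doc = json.loads('{"created": "2024-01-15", "modified": "2024-01-15T10:30:00Z"}')


def parse_iso(value: str):
    """Parse an ISO 8601 date or datetime string (hypothetical helper)."""
    if "T" in value:
        # Python before 3.11 rejects a trailing "Z", so spell out the UTC offset.
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    return date.fromisoformat(value)


created = parse_iso(doc["created"])    # a datetime.date
modified = parse_iso(doc["modified"])  # a timezone-aware datetime.datetime
```

Until parse_iso runs, doc["created"] is just a str; nothing in the JSON itself marks it as a date.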

Numeric precision

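YAML parsers can be over-eager about numbers:

phone: 09012345678 # number? string?
postal: 0123456 # octal?
hex: 0xFF # hex

Whether each value becomes a number depends on the parser and spec version. Under YAML 1.1, 0123456 is octal (42798); under the YAML 1.2 core schema it is decimal 123456. Either way the leading zero is lost, and very large values can lose precision in languages that store numbers as floats. Always quote phone numbers and postal codes: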

phone: '09012345678'
postal: '0123456'
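
The arithmetic behind the postal-code example can be checked with plain Python. This is only an illustration of the two possible numeric readings, not a YAML parser.

```python
raw_postal = "0123456"

as_octal = int(raw_postal, 8)  # octal reading (leading 0 as a base marker): 42798
as_decimal = int(raw_postal)   # decimal reading: 123456, leading zero dropped
as_hex = int("0xFF", 16)       # hexadecimal reading: 255

# Formatting the number again does not bring the original text back.
round_tripped = str(as_decimal)
```

Neither numeric reading can be turned back into the original "0123456", which is why quoting is the only safe option.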

Key order

Neither format guarantees key order. JSON objects and YAML mappings are both defined as unordered; most parsers preserve document order, but that is an implementation detail.

For order-sensitive data, use a list of entries in either format instead of relying on key order.
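
A short sketch of how order behaves in one implementation, Python's json module, and of making order explicit when it matters. The key names are made up for the example.

```python
import json

text = '{"zeta": 1, "alpha": 2, "mid": 3}'

# Python's json module keeps keys in document order.
parsed = json.loads(text)
keys_in_document_order = list(parsed)

# A serializer option such as sort_keys changes the order on the way out.
resorted = json.dumps(parsed, sort_keys=True)

# When order is part of the data, encode it explicitly as a list of pairs.
ordered_pairs = json.dumps([["zeta", 1], ["alpha", 2], ["mid", 3]])
```

A list survives any parser or formatter unchanged; key order survives only as long as every tool in the chain happens to preserve it.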

Duplicate keys

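YAML requires mapping keys to be unique. JSON is looser: RFC 8259 only says names within an object should be unique. In practice, enforcement varies in both:

{
	"key": "first",
	"key": "second"
}

key: first
key: second

  • JSON: depends on the parser (“last wins”, error, or warning).
  • YAML 1.2: a conforming parser should report an error, though some parsers silently keep the last value.

Formatters and converters sometimes drop duplicates silently, so their output can hide the problem.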
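
Python's json module is a “last wins” parser by default, but it can be made strict. A minimal sketch; reject_duplicates is a hypothetical name, passed through the standard object_pairs_hook parameter of json.loads.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that raises instead of silently keeping the last value."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


text = '{"key": "first", "key": "second"}'

lenient = json.loads(text)  # default behaviour: the last value wins

try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
```

Running a check like this before conversion catches duplicates while both values are still visible.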

Anchors and references

YAML-only:

defaults: &defaults
  timeout: 30
  retries: 3

prod:
  <<: *defaults
  host: prod.example.com

dev:
  <<: *defaults
  host: dev.example.com

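&defaults declares an anchor, *defaults references it, and <<: merges the referenced mapping into the current one. Useful for DRY config. The << merge key comes from YAML 1.1 and is not supported by every parser.

JSON has no equivalent, so a converter has to expand each reference into a full copy.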
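
To see what the config above looks like once the anchors are gone, here is a sketch that rebuilds it with plain Python dicts and serializes it to JSON. It mimics the merge by hand rather than parsing YAML.

```python
import json

defaults = {"timeout": 30, "retries": 3}

# Equivalent of "<<: *defaults" plus a host key: copy the defaults, then add to them.
config = {
    "defaults": defaults,
    "prod": {**defaults, "host": "prod.example.com"},
    "dev": {**defaults, "host": "dev.example.com"},
}

as_json = json.dumps(config, indent=2)
```

In as_json the timeout and retries values appear three times. The link between them is lost, so editing one copy no longer updates the others.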

File size

For the same data:

  • JSON — bloated by quotes and commas.
  • YAML — heavy indentation can outweigh that.

APIs usually compress responses (for example with gzip), so raw size differences are usually moot.
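
A rough way to check this is to compress two renderings of the same data. The sketch below uses compact and indented JSON as stand-ins for a "tight" and a "whitespace-heavy" format, since the Python standard library has no YAML writer. The records are synthetic and the numbers will vary with real data.

```python
import gzip
import json

# Synthetic records, purely for illustration.
records = [{"id": i, "name": f"user{i}", "active": i % 2 == 0} for i in range(200)]

compact = json.dumps(records, separators=(",", ":")).encode("utf-8")
pretty = json.dumps(records, indent=2).encode("utf-8")

raw_gap = len(pretty) - len(compact)
gzip_gap = len(gzip.compress(pretty)) - len(gzip.compress(compact))

print(f"raw: {len(compact)} vs {len(pretty)} bytes (gap {raw_gap})")
print(f"gzip gap: {gzip_gap} bytes")
```

Repeated whitespace and punctuation compress well, so the gap after gzip should be much smaller than the raw gap.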

Streaming

JSON — JSONL (JSON Lines) gives one document per line:

{"id": 1, "name": "a"}
{"id": 2, "name": "b"}

YAML: multi-document files (--- separators) can be streamed one document at a time, if the parser supports it.
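
Reading JSONL needs nothing beyond a line loop. A minimal sketch; read_jsonl is a hypothetical helper, and io.StringIO stands in for an open file or network stream.

```python
import io
import json
from typing import Iterator, TextIO


def read_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# io.StringIO stands in for an open file or network stream.
sample = io.StringIO('{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n')
records = list(read_jsonl(sample))
```

Because each line is a complete document, a consumer can process records as they arrive and never holds the whole file in memory.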

Security

YAML tags like !!python/object can deserialize arbitrary objects — loading untrusted YAML is dangerous:

!!python/object/apply:os.system
- 'rm -rf /'

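In PyYAML, yaml.load() with an unsafe loader will construct such objects; for untrusted input use yaml.safe_load(). (PyYAML 6 requires an explicit Loader argument to yaml.load().) Ruby’s Psych had the same issue until Psych 4 made load safe by default.

JSON has no such issue: it only defines strings, numbers, booleans, null, arrays, and objects, so parsing never instantiates arbitrary types.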
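
For Python, the safe path can be wrapped in one place. A minimal sketch, assuming the third-party PyYAML package is installed; load_untrusted_yaml is a hypothetical wrapper name.

```python
def load_untrusted_yaml(text: str):
    """Parse YAML from an untrusted source with PyYAML's restricted loader."""
    import yaml  # third-party: PyYAML, assumed to be installed

    # safe_load builds only plain data types and raises a yaml.YAMLError
    # subclass for tags such as !!python/object/apply.
    return yaml.safe_load(text)
```

Ordinary config such as "host: localhost" loads into a dict as usual. A document carrying a !!python/object/apply tag raises an error and nothing is executed.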

Choosing between them

Use case               Pick   Reason
API request/response   JSON   speed, interop
Logs                   JSON   parse speed
Config files           YAML   readability, comments
CI/CD pipelines        YAML   de facto standard
Machine-to-machine     JSON   strict spec
Human-edited           YAML   comfortable to write

“Config in YAML, data in JSON” is the rule of thumb.

TOML — a third option

TOML (Tom’s Obvious Minimal Language) is gaining ground:

name = "Alice"
age = 30

[address]
city = "Tokyo"
zip = "100-0001"

  • INI-file flavor.
  • Less ambiguous than YAML.
  • Cargo (Rust), Hugo, pyproject.toml use it.

Increasingly common where you want config-grade ergonomics without YAML’s traps.

Summary

  • JSON for APIs and machine I/O.
  • YAML for configs, with comments and human authoring.
  • YAML 1.1 Norway problem — quote boolean-shaped strings.
  • YAML → JSON loses comments.
  • Don’t yaml.load() untrusted input.

For round-tripping JSON ↔ YAML, the JSON-to-YAML tool on this site handles both directions.