Choosing a URL-safe encoding: Base64url vs Base32 vs Base58 vs hex
The “carry bytes as text” problem shows up everywhere: JWTs, TOTP secrets, cryptocurrency addresses, hash displays. Each domain settled on a different encoding, and the choice is rarely arbitrary. This article lines up Base64url, Base32, Base58, and hex side by side, with the use cases and the gotchas that bite when you switch between them.
At a glance
| Format | Alphabet | Chars | Padding | Output length for N bytes (unpadded) | Case |
|---|---|---|---|---|---|
| Base64url | A-Z a-z 0-9 - _ | 64 | = (omittable) | ⌈4N/3⌉ | sensitive |
| Base32 (RFC 4648) | A-Z 2-7 | 32 | = | ⌈8N/5⌉ | insensitive |
| Base58 (Bitcoin) | 1-9 A-H J-N P-Z a-k m-z | 58 | none | ~1.37 N (variable) | sensitive |
| hex | 0-9 a-f | 16 | none | 2N (fixed) | insensitive |
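
To make the length column concrete, here is a small sketch that evaluates those formulas for a given byte count. The function name `encodedLengths` and its field names are made up for illustration. The Base58 figure is an upper bound, since the real length depends on the byte values.

```javascript
// Encoded length in characters for n input bytes (illustrative helper).
function encodedLengths(n) {
  return {
    base64urlUnpadded: Math.ceil((4 * n) / 3),
    base64urlPadded: 4 * Math.ceil(n / 3),
    base32Unpadded: Math.ceil((8 * n) / 5),
    base32Padded: 8 * Math.ceil(n / 5),
    // Base58 length varies with the data; this is the upper bound
    // ceil(n * log(256) / log(58)), roughly 1.37 characters per byte.
    base58Max: Math.ceil((n * Math.log(256)) / Math.log(58)),
    hex: 2 * n,
  };
}

console.log(encodedLengths(32));
// { base64urlUnpadded: 43, base64urlPadded: 44, base32Unpadded: 52,
//   base32Padded: 56, base58Max: 44, hex: 64 }
```

For a 32-byte value such as a SHA-256 digest, Base64url and Base58 land within a character of each other, which is why length alone rarely decides between them.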
Don’t pick by output length alone. The right axis is who reads the result (a parser, a human, a phone keypad, another system).
Base64url: tokens that go in URLs and HTTP headers
The de-facto standard for binary-in-text on the web — JWTs, OAuth access tokens, WebPush subscriptions.
- Replaces `+` with `-` and `/` with `_` so the output goes straight into a URL.
- Padding `=` is optional (RFC 4648 §3.2). JWT convention is to omit it.
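
As a quick illustration of both points, the sketch below encodes the same bytes with the standard and the URL-safe alphabet. It assumes a Node.js version whose `Buffer` supports the `'base64url'` encoding. The sample bytes are chosen only because they produce `+` and `/` in standard Base64.

```javascript
// Bytes picked so that the standard alphabet emits '+' and '/'.
const bytes = Buffer.from([0xfb, 0xff, 0xfe]);

const standard = bytes.toString('base64');    // "+//+"
const urlSafe = bytes.toString('base64url');  // "-__-"

// A single byte needs padding in standard Base64; Node omits it for base64url.
const paddedStandard = Buffer.from([0xfb]).toString('base64');   // "+w=="
const unpaddedUrl = Buffer.from([0xfb]).toString('base64url');   // "-w"

// Decoding the unpadded form gives the original byte back.
const roundTrip = Buffer.from(unpaddedUrl, 'base64url');         // <Buffer fb>

console.log(standard, urlSafe, paddedStandard, unpaddedUrl, roundTrip);
```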
Pitfall 1: Library disagreement on padding
Implementations vary:
- Node.js `Buffer.from(str, 'base64url')` accepts unpadded input.
- Python `base64.urlsafe_b64decode` requires padding (raises if missing).
- Java `java.util.Base64.getUrlDecoder()` accepts unpadded input.
A JWT generated in one language failing to verify in another is almost always this. Pre-pad before passing to a strict decoder:

    function padBase64url(s) {
      // Append "=" until the length is a multiple of 4.
      return s + '='.repeat((4 - (s.length % 4)) % 4);
    }

Pitfall 2: Confusing it with regular Base64
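Standard Base64 uses +, /, and =, and all three already mean something inside a URL: form-style query parsers decode + as a space, / is the path delimiter, and = separates a key from its value in a query string. Never put plain Base64 in a URL unescaped.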
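
If you are handed standard Base64 and need it in a URL, converting the alphabet is a pair of string replacements. The helper names below are illustrative, not from any library. The last lines show what a query-string parser does to an unconverted value.

```javascript
// Standard Base64 -> Base64url: swap the two symbols and drop padding.
function base64ToBase64url(b64) {
  return b64.replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

// Base64url -> standard Base64: swap back and restore padding.
function base64urlToBase64(b64url) {
  const s = b64url.replace(/-/g, '+').replace(/_/g, '/');
  return s + '='.repeat((4 - (s.length % 4)) % 4);
}

// What goes wrong without conversion: '+' is read back as a space.
const mangled = new URLSearchParams('t=+w==').get('t');    // " w=="
const intact = new URLSearchParams('t=' + base64ToBase64url('+w==')).get('t'); // "-w"

console.log(JSON.stringify(mangled), intact);
```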
Base32: humans reading aloud, case-insensitive contexts
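TOTP (RFC 6238) shared secrets are conventionally exchanged as Base32. The convention comes from the Google Authenticator key URI format rather than from the RFC itself. You’ve seen the JBSWY3DPEHPK3PXP style.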
- 32-character alphabet, no lowercase — no case to remember.
- Avoids visually confusable `0`/`1`/`8`/`9` (RFC 4648 uses `2-7`; Crockford uses a different set).
- Plays well with reading aloud over the phone, writing on paper, and OCR or QR-code scanning.
Pitfall: RFC 4648 Base32 vs Crockford Base32
| | RFC 4648 | Crockford |
|---|---|---|
| Alphabet | A-Z 2-7 | 0-9 A-Z minus I, L, O, U |
| Case | insensitive | insensitive (normalized on input; I and L read as 1, O as 0) |
| Checksum | none | optional check symbol (mod 37; adds *, ~, $, =, U) |
| Padding | = | none |
ULID uses Crockford Base32 — different from TOTP’s RFC 4648. Same word, different specs. Always confirm which one your library implements.
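
One practical difference is that Crockford input is meant to be normalized before decoding. The sketch below shows that step under the rules described in Crockford's specification (uppercase, hyphens ignored, `O` read as `0`, `I` and `L` read as `1`). The helper names are hypothetical, and check symbols are deliberately not handled.

```javascript
// Normalize human-typed Crockford Base32 before decoding.
function normalizeCrockford(input) {
  return input
    .toUpperCase()
    .replace(/-/g, '')       // hyphens are allowed as visual separators
    .replace(/O/g, '0')      // letter O is read as zero
    .replace(/[IL]/g, '1');  // letters I and L are read as one
}

// True if the normalized text uses only the 32 Crockford data symbols.
function isCrockford(text) {
  return /^[0-9A-HJKMNP-TV-Z]*$/.test(normalizeCrockford(text));
}

console.log(normalizeCrockford('01arz3nd-ektsv4rr-ffq69g5fav'));
// 01ARZ3NDEKTSV4RRFFQ69G5FAV
console.log(isCrockford('JBSWY3DPEHPK3PXP')); // false: contains U-free RFC 4648 text? see note
```

The last call returns `true` or `false` depending only on the symbols present, so a string can pass this check and still have been produced by an RFC 4648 encoder. The alphabet check catches typos, not spec mix-ups.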
Base58: preserving leading zeros, dense URLs
Used in legacy Bitcoin addresses, IPFS CIDv0 identifiers, and many short-URL ID schemes.
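- Deliberately omits visually confusable characters (`0`, `O`, `I`, `l`).
- No padding. Because 58 is not a power of two, the input is converted as one big integer, so the output length depends on the byte values and not only on the input length.
- Leading `0x00` bytes are preserved as leading `1`s in the output, one `1` per zero byte.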
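
The two properties above, big-integer conversion and leading-zero preservation, are easiest to see in code. This is a minimal sketch of a Base58 encoder with the Bitcoin alphabet using `BigInt`. The function name is illustrative, and the approach favors clarity over speed.

```javascript
const BASE58_ALPHABET =
  '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz';

// Encode a Buffer/Uint8Array/array of bytes as Base58 (Bitcoin alphabet).
function base58Encode(bytes) {
  // Count leading zero bytes; each becomes a literal '1'.
  let zeros = 0;
  while (zeros < bytes.length && bytes[zeros] === 0) zeros++;

  // Interpret the bytes as one big-endian integer.
  let value = 0n;
  for (const byte of bytes) value = (value << 8n) | BigInt(byte);

  // Repeated division by 58 yields digits, least significant first.
  let digits = '';
  while (value > 0n) {
    digits = BASE58_ALPHABET[Number(value % 58n)] + digits;
    value /= 58n;
  }
  return '1'.repeat(zeros) + digits;
}

console.log(base58Encode(Buffer.from('0000287fb4cd', 'hex'))); // 11233QC4
console.log(base58Encode([58]));  // 21
console.log(base58Encode([0, 0])); // 11
```

The first example has two zero bytes in front, which show up as the two leading `1`s. The remaining four bytes become six characters.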
Pitfall: Base58 vs Base58Check
Legacy Bitcoin addresses are not plain Base58. They are Base58Check, a variant that appends a 4-byte checksum (the first 4 bytes of a double SHA-256 over version and payload) before encoding:

    [version (1B)] [payload (20B)] [checksum (4B)] → Base58 → address

Plain Base58 decoding does not validate the checksum. When handling addresses, make sure the function you call is the checksum-verifying one (something like base58check_decode), not just base58_decode.
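
To show where the checksum check sits, here is a sketch of Base58Check encode and decode on top of Node's `crypto` module. All function names (`b58encode`, `b58decode`, `base58CheckEncode`, `base58CheckDecode`) are hypothetical, and the code assumes the common layout of one version byte followed by the payload.

```javascript
const { createHash } = require('crypto');

const B58 = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz';
const sha256 = (buf) => createHash('sha256').update(buf).digest();
// Checksum: first 4 bytes of SHA-256(SHA-256(data)).
const checksum = (buf) => sha256(sha256(buf)).subarray(0, 4);

function b58encode(bytes) {
  let zeros = 0;
  while (zeros < bytes.length && bytes[zeros] === 0) zeros++;
  let value = 0n;
  for (const byte of bytes) value = (value << 8n) | BigInt(byte);
  let digits = '';
  for (; value > 0n; value /= 58n) digits = B58[Number(value % 58n)] + digits;
  return '1'.repeat(zeros) + digits;
}

function b58decode(text) {
  let zeros = 0;
  while (zeros < text.length && text[zeros] === '1') zeros++;
  let value = 0n;
  for (const ch of text) {
    const digit = B58.indexOf(ch);
    if (digit < 0) throw new Error(`invalid Base58 character: ${ch}`);
    value = value * 58n + BigInt(digit);
  }
  const bytes = [];
  for (; value > 0n; value >>= 8n) bytes.unshift(Number(value & 0xffn));
  return Buffer.concat([Buffer.alloc(zeros), Buffer.from(bytes)]);
}

function base58CheckEncode(version, payload) {
  const body = Buffer.concat([Buffer.from([version]), payload]);
  return b58encode(Buffer.concat([body, checksum(body)]));
}

// Decodes AND verifies; plain b58decode would accept a corrupted string.
function base58CheckDecode(text) {
  const raw = b58decode(text);
  if (raw.length < 5) throw new Error('too short for Base58Check');
  const body = raw.subarray(0, -4);
  if (!checksum(body).equals(raw.subarray(-4))) throw new Error('bad checksum');
  return { version: body[0], payload: body.subarray(1) };
}

const address = base58CheckEncode(0x00, Buffer.alloc(20, 7));
console.log(address, base58CheckDecode(address).payload.length); // payload length is 20
```

A single mistyped character makes `base58CheckDecode` throw, while `b58decode` alone would return 25 bytes of quietly wrong data.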
Pitfall: implementation cost
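Base58 encode/decode requires big-integer division, so it is slower and more complex to implement than Base64 or hex, and naive implementations are quadratic in the input length. The friendlier output, shorter than hex or Base32 but slightly longer than Base64url, is paid for in CPU.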
hex: human comparison and universal compatibility
The default for hash display and binary debugging dumps. For example, the MD5 digest of “hello” is 5d41402abc4b2a76b9719d911017c592.
- 16-character alphabet, so output is 2× the input size, longer than any of the other three formats.
- Supported by the standard library of practically every language.
- Easy for humans to eyeball-compare prefixes (“yep, that matches up to `5d41`”).
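
In Node.js the hex round trip needs nothing beyond `Buffer` and `crypto`. The sketch below reproduces the digest quoted above and converts it back to bytes. It assumes MD5 is available in your Node build, which is the usual case outside FIPS-restricted environments.

```javascript
const { createHash } = require('crypto');

// Hash "hello" and render the 16-byte digest as 32 hex characters.
const digestHex = createHash('md5').update('hello').digest('hex');
console.log(digestHex); // 5d41402abc4b2a76b9719d911017c592

// Hex back to raw bytes, then to hex again.
const digestBytes = Buffer.from(digestHex, 'hex');
console.log(digestBytes.length);          // 16
console.log(digestBytes.toString('hex')); // same string as digestHex
```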
Pitfall: case convention
Spec-wise hex is case-insensitive, but conventions matter:
- SHA / MD5 / HMAC hex outputs are typically lowercase (OpenSSL, Python `hashlib`).
- Ethereum addresses use mixed case as a checksum (EIP-55).
Before lowercasing for comparison, check whether the case carries information.
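
When the case is known to carry no information, as with plain hash digests, one option is to compare the decoded bytes instead of lowercasing strings. This is a sketch with a hypothetical helper name. It validates the input first because `Buffer.from(s, 'hex')` stops silently at the first invalid character, and it is not suitable for formats like EIP-55 where case matters.

```javascript
const { timingSafeEqual } = require('crypto');

// Compare two hex digests by value, ignoring letter case.
function hexDigestsEqual(a, b) {
  const isHex = (s) => /^(?:[0-9a-fA-F]{2})+$/.test(s);
  if (!isHex(a) || !isHex(b)) throw new TypeError('not a hex string');
  const x = Buffer.from(a, 'hex');
  const y = Buffer.from(b, 'hex');
  // timingSafeEqual requires equal lengths, so check that first.
  return x.length === y.length && timingSafeEqual(x, y);
}

console.log(hexDigestsEqual('5D41402A', '5d41402a')); // true
console.log(hexDigestsEqual('5d41402a', '5d41402b')); // false
```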
Decision flow
For a new use case, run through these in order:
1. Goes directly into a URL or HTTP header? Yes → Base64url (shortest of the four) or Base58 (slightly longer, but alphanumeric only and with crypto pedigree). No → continue.
2. Read aloud or transcribed by a human? Yes → Base32 (RFC 4648 or Crockford depending on library/spec). No → continue.
3. Compatibility with arbitrary systems is paramount? Yes → hex. No → continue.
4. Minimize length above all? Yes → Base64url. No → pick by reader: Base32 (human) or hex (machine).
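
The same flow can be written down as a small function, which makes the ordering of the questions explicit. Everything here (the function name, the option names, the returned labels) is illustrative. For the URL case it returns Base64url and leaves the Base58 alternative to the caller.

```javascript
// Walk the four questions in order and return the first match.
function chooseEncoding({
  inUrlOrHeader = false,
  humanTranscribed = false,
  maxCompatibility = false,
  minimizeLength = false,
  reader = 'machine', // 'human' or 'machine'; only used as the final tiebreak
} = {}) {
  if (inUrlOrHeader) return 'base64url'; // Base58 is the alternative here
  if (humanTranscribed) return 'base32';
  if (maxCompatibility) return 'hex';
  if (minimizeLength) return 'base64url';
  return reader === 'human' ? 'base32' : 'hex';
}

console.log(chooseEncoding({ inUrlOrHeader: true }));     // base64url
console.log(chooseEncoding({ humanTranscribed: true }));  // base32
console.log(chooseEncoding());                            // hex
```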
Summary
- Web tokens → Base64url (watch for padding-strict decoders downstream).
- TOTP, DNS, voice transcription → Base32 (verify which spec: 4648 / Crockford / Base32hex).
- Crypto addresses, short IDs → Base58 (don’t conflate plain Base58 with Base58Check).
- Hashes, low-level dumps → hex (check whether case is significant in your spec).
All four are “bytes as text,” but they target different readers. To experiment with concrete byte sequences, the Base64 tool covers the standard and url-safe variants of the most common case.