Two flavors of URL encoding: form-urlencoded vs RFC 3986
“URL encoding” is, in fact, two related but distinct specs. encodeURIComponent and application/x-www-form-urlencoded use slightly different rules, and conflating them causes the “why is my space sometimes + and sometimes %20” mystery. This article lays out the differences.
Track A: RFC 3986 (the URI spec)
RFC 3986 defines the URI grammar. It lists a set of reserved characters, and everything else that needs encoding goes through percent-encoding (%XX).
Reserved characters
gen-delims: : / ? # [ ] @
sub-delims: ! $ & ' ( ) * + , ; =
Unreserved characters (no need to encode)
A-Z a-z 0-9 - _ . ~
In RFC 3986, a space encodes to %20. + is a reserved character and is not used as a substitute for space.
JavaScript’s encodeURIComponent() follows this spec almost exactly; the one exception is that it leaves the sub-delims ! ' ( ) * unencoded:
encodeURIComponent('hello world'); // → "hello%20world"
encodeURIComponent('a+b'); // → "a%2Bb" (+ also encoded)
encodeURIComponent('日本語'); // → "%E6%97%A5%E6%9C%AC%E8%AA%9E"
Track B: application/x-www-form-urlencoded (HTML form submission)
HTML form submission has its own format, defined today in the WHATWG URL Standard and inherited from early HTML form handling (the RFC 1738 era), with a few differences from RFC 3986:
- Space encodes to + (not %20).
- Newlines normalize to %0D%0A (CRLF).
- Other reserved characters are still percent-encoded.
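JavaScript’s URLSearchParams serializes in this format (the CRLF normalization happens during form submission, not inside URLSearchParams itself):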
const p = new URLSearchParams();
p.append('q', 'hello world');
p.toString(); // → "q=hello+world"
Where the difference hits: spaces and +
Practically, the divergence is mostly about spaces and +.
| Input | RFC 3986 | form-urlencoded |
|---|---|---|
| Space | %20 | + |
| + | %2B | %2B |
| = | %3D | %3D |
| & | %26 | %26 |
| Newline (\n) | %0A | %0D%0A |
| 日本語 (UTF-8) | %E6%97%A5%E6%9C%AC%E8%AA%9E | %E6%97%A5%E6%9C%AC%E8%AA%9E |
A URL query string and a form-submission body look the same at a glance, but using the wrong encoder produces subtle bugs.
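The rows above can be reproduced with a small side-by-side sketch. `compareEncodings` is a hypothetical helper name introduced only for this illustration; it relies on nothing beyond the built-in `encodeURIComponent` and `URLSearchParams`.

```javascript
// Encode one value the RFC 3986 way (encodeURIComponent) and the
// form-urlencoded way (URLSearchParams). compareEncodings is a
// hypothetical helper used only for this illustration.
function compareEncodings(value) {
  const percent = encodeURIComponent(value);
  // Serializing a single pair yields "v=<encoded>"; drop the "v=" prefix.
  const form = new URLSearchParams({ v: value }).toString().slice(2);
  return { percent, form };
}

console.log(compareEncodings('hello world')); // percent: "hello%20world", form: "hello+world"
console.log(compareEncodings('a+b'));         // percent: "a%2Bb",         form: "a%2Bb"
console.log(compareEncodings('k=v&x'));       // percent: "k%3Dv%26x",     form: "k%3Dv%26x"
```

Only the space differs across these three inputs, which matches the table.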
“Should query string spaces be + or %20?”
Historically:
- RFC 1738 era (older URL spec, alongside early HTML form encoding): + for space in query strings.
- RFC 3986 (current URI spec): %20 for space in query strings.
- HTML form submission: +, kept for compatibility.
Modern servers accept both. Server-side query parsers typically decode both + and %20 to a space, which is why nobody usually notices.
For encoding, pick the right tool for the audience:
- URL path components: encodeURIComponent (handle / separately, since it’s reserved).
- Query strings: either works; URLSearchParams if you want behavior identical to forms.
- Form bodies: URLSearchParams.toString() or FormData.
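As a rough sketch of how the first two choices combine, the helper below encodes path segments one at a time and serializes the query separately. `buildUrl` and its parameters are hypothetical names for this example, and the origin is assumed to have no trailing slash.

```javascript
// buildUrl is a hypothetical helper: path segments go through
// encodeURIComponent, the query goes through URLSearchParams.
function buildUrl(origin, segments, params) {
  // Encode segments individually so a "/" inside a segment becomes %2F,
  // while the "/" separators added here stay literal.
  const path = segments.map((s) => encodeURIComponent(s)).join('/');
  const query = new URLSearchParams(params).toString();
  return query ? `${origin}/${path}?${query}` : `${origin}/${path}`;
}

const url = buildUrl('https://example.com', ['docs', 'a/b c'], {
  q: 'hello world',
  tag: 'c++',
});
console.log(url);
// → "https://example.com/docs/a%2Fb%20c?q=hello+world&tag=c%2B%2B"
```

The space shows up as %20 in the path and as + in the query, because each part went through a different encoder.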
Decoders also come in two flavors
decodeURIComponent (RFC 3986 style)
Decodes only %XX; leaves + alone.
decodeURIComponent('hello+world'); // → "hello+world"
decodeURIComponent('hello%20world'); // → "hello world"
URLSearchParams (form-urlencoded style)
Decodes both + to space and %XX.
new URLSearchParams('q=hello+world').get('q'); // → "hello world"
If you take location.search.slice(1) and feed it directly to decodeURIComponent, query strings produced by form submission keep their + characters as literal pluses. For query parsing, prefer URLSearchParams.
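To make the difference concrete, here is a minimal sketch that parses the same query string both ways. The sample string and the helper names `parseNaive` and `parseForm` are assumptions made for this example; the hand-rolled parser is deliberately simplistic and does not handle values that contain a raw `=`.

```javascript
// A query string as a form submission might produce it (assumed sample input).
const search = '?q=hello+world&lang=C%2B%2B';

// Naive approach: split by hand and decode with decodeURIComponent.
function parseNaive(qs) {
  const out = {};
  for (const pair of qs.replace(/^\?/, '').split('&')) {
    const [k, v = ''] = pair.split('=');
    out[decodeURIComponent(k)] = decodeURIComponent(v);
  }
  return out;
}

// Form-aware approach: URLSearchParams treats "+" as a space.
function parseForm(qs) {
  return Object.fromEntries(new URLSearchParams(qs));
}

console.log(parseNaive(search)); // q: "hello+world", lang: "C++"
console.log(parseForm(search));  // q: "hello world", lang: "C++"
```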
Common implementation traps
1. URLs that intentionally contain +
Email aliases like user+tag@example.com survive in a URL only if you encode the + properly:
?email=user+tag@example.com
↓ URLSearchParams decode
email = "user tag@example.com" (+ became a space)
The sending side has to encode the + as %2B:
?email=user%2Btag@example.com
↓ URLSearchParams decode
email = "user+tag@example.com" ✓
encodeURIComponent will turn + into %2B, so encoding query values through it is the safe default.
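The round trip can be checked with a short sketch. The variable names are illustrative only; the last line shows that URLSearchParams also escapes the + when it serializes a value.

```javascript
const email = 'user+tag@example.com';

// Unencoded: a form-style decoder reads the "+" back as a space.
const broken = `?email=${email}`;
const lost = new URLSearchParams(broken).get('email');
// lost === "user tag@example.com"

// Encoded with encodeURIComponent: "+" becomes %2B and survives.
const safe = `?email=${encodeURIComponent(email)}`;
// safe === "?email=user%2Btag%40example.com"
const kept = new URLSearchParams(safe).get('email');
// kept === "user+tag@example.com"

// URLSearchParams escapes "+" as well when it builds the string.
const built = new URLSearchParams({ email }).toString();
// built === "email=user%2Btag%40example.com"
```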
2. Hash fragments
Anything after # is client-side only and is never sent to the server. The encoding rules follow RFC 3986, but browsers have normalized hash content inconsistently: depending on the browser, reading non-ASCII characters from location.hash can return either the decoded value or the raw percent-encoded one.
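One defensive pattern, sketched here under assumptions, is to extract the fragment from the full URL string and decode it yourself, falling back to the raw text when decoding fails. `readFragment` is a hypothetical helper, and whether the full URL string is more consistent than location.hash in your target browsers is something to verify rather than take from this sketch.

```javascript
// readFragment is a hypothetical helper. It takes a full URL string
// (for example location.href in a browser) instead of location.hash.
function readFragment(href) {
  const i = href.indexOf('#');
  if (i === -1) return '';
  const raw = href.slice(i + 1);
  try {
    return decodeURIComponent(raw);
  } catch {
    // Malformed escapes (such as a stray "%") throw URIError; keep the raw text.
    return raw;
  }
}

console.log(readFragment('https://example.com/page#%E6%97%A5%E6%9C%AC%E8%AA%9E')); // "日本語"
console.log(readFragment('https://example.com/page#100%'));                        // "100%"
console.log(readFragment('https://example.com/page'));                             // ""
```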
3. Non-UTF-8 encodings
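Some legacy URLs are percent-encoded from Shift_JIS or EUC-JP bytes rather than UTF-8. encodeURIComponent always uses UTF-8, and decodeURIComponent throws a URIError on byte sequences that are not valid UTF-8, so legacy URLs need explicit handling. In practice, today, almost everything is UTF-8.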
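A minimal sketch of such explicit handling: turn the %XX escapes into raw bytes first, then decode the bytes with a named encoding. `percentDecodeToBytes` and `decodeLegacy` are hypothetical helpers, and the sketch assumes the runtime's TextDecoder supports the legacy label (it throws a RangeError otherwise).

```javascript
// percentDecodeToBytes is a hypothetical helper: it converts %XX escapes to
// raw bytes without assuming a character encoding. Characters outside the
// escapes are assumed to be ASCII, as in an already-encoded URL.
function percentDecodeToBytes(str) {
  const bytes = [];
  for (let i = 0; i < str.length; i++) {
    const hex = str.slice(i + 1, i + 3);
    if (str[i] === '%' && /^[0-9a-fA-F]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 2; // skip the two hex digits just consumed
    } else {
      bytes.push(str.charCodeAt(i) & 0xff);
    }
  }
  return Uint8Array.from(bytes);
}

// Decode the bytes with an explicit encoding label such as "shift_jis".
function decodeLegacy(str, label) {
  return new TextDecoder(label).decode(percentDecodeToBytes(str));
}

// "%93%FA%96%7B" is 日本 encoded as Shift_JIS bytes.
let threw = false;
try {
  decodeURIComponent('%93%FA%96%7B');
} catch (e) {
  threw = e instanceof URIError; // true: these bytes are not valid UTF-8
}
// decodeLegacy('%93%FA%96%7B', 'shift_jis') yields "日本" where the label is supported.
```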
Picking the right tool
Quick reference:
| Need | Use |
|---|---|
| Encode a path/query/fragment value into a URL | encodeURIComponent |
| Build a query string | URLSearchParams |
| Parse a query string | URLSearchParams |
| Decode a path or fragment value | decodeURIComponent |
When in doubt, “putting something into a URL” → encodeURIComponent.
To compare what a string looks like under both encodings, the URL encoder on this site shows percent-encoding and form-urlencoded output side by side. Useful for spotting the kinds of mismatches discussed above.