Two flavors of URL encoding: form-urlencoded vs RFC 3986

“URL encoding” is, in fact, two related but distinct specs. encodeURIComponent and application/x-www-form-urlencoded use slightly different rules, and conflating them causes the “why is my space sometimes + and sometimes %20” mystery. This article lays out the differences.

Track A: RFC 3986 (the URI spec)

RFC 3986 defines the URI grammar. It lists a set of reserved characters, and everything else that needs encoding goes through percent-encoding (%XX).

Reserved characters

gen-delims:  : / ? # [ ] @
sub-delims:  ! $ & ' ( ) * + , ; =

Unreserved characters (no need to encode)

A-Z  a-z  0-9  -  _  .  ~

In RFC 3986, a space encodes to %20. + is a reserved character and is not used as a substitute for space.

JavaScript’s encodeURIComponent() follows this spec almost exactly; the one exception is that it leaves the sub-delims ! ' ( ) * unencoded:

encodeURIComponent('hello world'); // → "hello%20world"
encodeURIComponent('a+b'); // → "a%2Bb"  (+ also encoded)
encodeURIComponent('日本語'); // → "%E6%97%A5%E6%9C%AC%E8%AA%9E"
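A minimal sketch of the one place encodeURIComponent departs from the RFC: it passes the reserved sub-delims ! ' ( ) * through untouched. If a consumer needs those encoded too, a thin wrapper is enough. The name encodeRFC3986 is hypothetical, not a built-in.

```javascript
// Hypothetical helper: encodeURIComponent plus the five sub-delims it skips.
function encodeRFC3986(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (ch) => '%' + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeURIComponent("it's (ok)!*")); // → "it's%20(ok)!*"
console.log(encodeRFC3986("it's (ok)!*"));      // → "it%27s%20%28ok%29%21%2A"
```

For most query values the built-in is sufficient; the stricter variant mainly matters when the receiving side treats those characters specially.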

Track B: application/x-www-form-urlencoded (HTML form submission)

HTML form submission has its own format, derived from the WHATWG/HTML spec and historically RFC 1738, with a few differences from RFC 3986:

  1. Space encodes to + (not %20).
  2. Newlines normalize to %0D%0A (CRLF).
  3. Other reserved characters are still percent-encoded.

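JavaScript’s URLSearchParams serializes in this format (rules 1 and 3; the newline normalization in rule 2 is applied by the browser when it submits a form, not by URLSearchParams):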

const p = new URLSearchParams();
p.append('q', 'hello world');
p.toString(); // → "q=hello+world"
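The three rules above can also be sketched by hand on top of encodeURIComponent, which makes each difference explicit. formUrlencode is a hypothetical helper, not a platform API. It assumes UTF-8 and assumes the only characters left unencoded are the ASCII alphanumerics plus * - . _ (which matches what URLSearchParams emits). The CRLF step mirrors what browsers do at form submission; URLSearchParams itself does not normalize newlines.

```javascript
// Hypothetical helper: form-urlencode a single value by hand.
function formUrlencode(value) {
  // Rule 2: normalize every newline variant to CRLF.
  const normalized = value.replace(/\r\n|\r|\n/g, '\r\n');
  return encodeURIComponent(normalized)
    // Rule 3: also encode the characters encodeURIComponent leaves alone
    // but the form serializer does not (! ' ( ) ~).
    .replace(/[!'()~]/g, (ch) => '%' + ch.charCodeAt(0).toString(16).toUpperCase())
    // Rule 1: space becomes + instead of %20.
    .replace(/%20/g, '+');
}

console.log(formUrlencode('hello world'));  // → "hello+world"
console.log(formUrlencode('a+b'));          // → "a%2Bb"
console.log(formUrlencode('line1\nline2')); // → "line1%0D%0Aline2"
```

In real code, URLSearchParams is the better choice; the sketch is only meant to show where the two formats diverge.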

Where the difference hits: spaces and +

Practically, the divergence is mostly about spaces and +.

Input          RFC 3986     form-urlencoded
Space          %20          +
+              %2B          %2B
=              %3D          %3D
&              %26          %26
Newline (\n)   %0A          %0D%0A
あ (UTF-8)     %E3%81%82    %E3%81%82

A URL query string and a form-submission body look the same at a glance, but using the wrong encoder produces subtle bugs.

“Should query string spaces be + or %20?”

Historically:

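  1. RFC 1738 era (older URL spec): + for space in query strings, a convention that comes from HTML form encoding (HTML 2.0, RFC 1866) rather than from RFC 1738 itself.
  2. RFC 3986 (current URI spec): %20 for space in query strings.
  3. HTML form submission: +, kept for compatibility.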

Modern servers accept both. Form-style query decoders restore both + and %20 to a space, which is why nobody usually notices.

For encoding, pick the right tool for the audience:

  • URL path components: encodeURIComponent, applied to each segment separately, because it encodes / as %2F.
  • Query strings: either works; URLSearchParams if you want behavior identical to forms.
  • Form bodies: URLSearchParams.toString() or FormData.
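The list above can be sketched as one small function that encodes each path segment on its own and hands the query to URLSearchParams. buildUrl, the origin, and the sample values are all hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: build a URL from raw path segments and raw query values.
function buildUrl(origin, segments, params) {
  // Encode each segment separately so a literal "/" inside one becomes %2F
  // instead of being mistaken for a path separator.
  const path = segments.map((segment) => encodeURIComponent(segment)).join('/');
  // URLSearchParams applies form-urlencoded rules (space becomes +).
  const query = new URLSearchParams(params).toString();
  return origin + '/' + path + (query ? '?' + query : '');
}

console.log(buildUrl('https://example.com', ['files', 'a/b c.txt'], { q: 'hello world' }));
// → "https://example.com/files/a%2Fb%20c.txt?q=hello+world"
```

Note that the same space comes out as %20 in the path and + in the query, which is exactly the split described in this article.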

Decoders also come in two flavors

decodeURIComponent (RFC 3986 style)

Decodes only %XX; leaves + alone.

decodeURIComponent('hello+world'); // → "hello+world"
decodeURIComponent('hello%20world'); // → "hello world"

URLSearchParams (form-urlencoded style)

Decodes both + to space and %XX.

new URLSearchParams('q=hello+world').get('q'); // → "hello world"

If you take location.search.slice(1) and feed it directly to decodeURIComponent, query strings produced by form submission keep their + characters as literal pluses. For query parsing, prefer URLSearchParams.
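To make that concrete, here is a small comparison of hand-rolled parsing against URLSearchParams. The search string is a made-up value standing in for location.search.

```javascript
// Hypothetical stand-in for location.search after a form submission.
const search = '?q=hello+world&tag=c%2B%2B';

// Naive approach: split by hand and decode each part with decodeURIComponent.
const naive = Object.fromEntries(
  search
    .slice(1)
    .split('&')
    .map((pair) => pair.split('=').map((part) => decodeURIComponent(part)))
);
console.log(naive.q);   // → "hello+world"  (the + was never turned into a space)
console.log(naive.tag); // → "c++"

// Form-aware approach: URLSearchParams accepts the leading "?" as-is.
const parsed = new URLSearchParams(search);
console.log(parsed.get('q'));   // → "hello world"
console.log(parsed.get('tag')); // → "c++"
```

Both approaches agree on %2B; they only disagree on the bare +, which is where the bug hides.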

Common implementation traps

1. URLs that intentionally contain +

Email aliases like user+tag@example.com survive in a URL only if you encode the + properly:

?email=user+tag@example.com
↓ URLSearchParams decode
email = "user tag@example.com"  (+ became a space)

Send-side has to encode the + as %2B:

?email=user%2Btag@example.com
↓ URLSearchParams decode
email = "user+tag@example.com"  ✓

encodeURIComponent will turn + into %2B, so encoding query values through it is the safe default.
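A short round-trip sketch of this trap, using the same example address. Either encoder on the sending side protects the +; only the unencoded concatenation loses it.

```javascript
const email = 'user+tag@example.com';

// Send side, option 1: encode the value by hand.
const manual = '?email=' + encodeURIComponent(email);
console.log(manual); // → "?email=user%2Btag%40example.com"

// Send side, option 2: let URLSearchParams do the encoding.
const viaParams = '?' + new URLSearchParams({ email }).toString();
console.log(viaParams); // → "?email=user%2Btag%40example.com"

// Receive side: the encoded form survives the round trip.
console.log(new URLSearchParams(manual).get('email')); // → "user+tag@example.com"

// The trap: concatenating the raw value turns the + into a space on decode.
console.log(new URLSearchParams('?email=' + email).get('email')); // → "user tag@example.com"
```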

2. Hash fragments

Anything after # is client-side only and is never sent to the server. The encoding rules follow RFC 3986, but browsers normalize hash content inconsistently: depending on the browser, location.hash can return non-ASCII characters either percent-encoded or already decoded.
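One defensive way to read a hash value, sketched under the assumption that the value is either percent-encoded UTF-8 or already decoded text. readHashValue is a hypothetical helper; in a browser it would be called with location.hash.

```javascript
// Hypothetical helper: return the hash content as decoded text.
function readHashValue(hash) {
  const raw = hash.startsWith('#') ? hash.slice(1) : hash;
  try {
    // Works for percent-encoded input and is a no-op for text without "%".
    return decodeURIComponent(raw);
  } catch (err) {
    // A stray "%" (malformed sequence) throws URIError; fall back to the raw text.
    return raw;
  }
}

console.log(readHashValue('#%E6%97%A5%E6%9C%AC%E8%AA%9E')); // → "日本語"
console.log(readHashValue('#日本語'));                       // → "日本語"
console.log(readHashValue('#100%'));                        // → "100%"
```

The caveat is double decoding: if a browser has already decoded the hash and the text legitimately contains something like %41, this helper will decode it again.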

3. Non-UTF-8 encodings

Historically there are URLs encoded as Shift_JIS or EUC-JP. encodeURIComponent always uses UTF-8, so legacy URLs need explicit handling. In practice, today, almost everything is UTF-8.
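A rough sketch of what "explicit handling" can look like: percent-decode to raw bytes first, then pick the character encoding with TextDecoder. Both function names are hypothetical, and the sketch assumes a runtime whose TextDecoder supports legacy labels such as shift_jis and euc-jp (browsers do; Node.js needs a build with full ICU).

```javascript
// Hypothetical helper: turn a percent-encoded ASCII string into raw bytes.
function percentDecodeToBytes(input) {
  const bytes = [];
  for (let i = 0; i < input.length; i++) {
    const hex = input.slice(i + 1, i + 3);
    if (input[i] === '%' && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 2; // skip the two hex digits just consumed
    } else {
      // Assumption: everything outside %XX is plain ASCII.
      bytes.push(input.charCodeAt(i) & 0x7f);
    }
  }
  return Uint8Array.from(bytes);
}

// Hypothetical helper: decode a percent-encoded string in a given legacy encoding.
function decodeLegacy(input, encoding) {
  return new TextDecoder(encoding).decode(percentDecodeToBytes(input));
}

console.log(decodeLegacy('%82%A0', 'shift_jis')); // → "あ"
console.log(decodeLegacy('%A4%A2', 'euc-jp'));    // → "あ"
```

decodeURIComponent('%82%A0') throws a URIError, because those bytes are not valid UTF-8; that error is often the first sign of a legacy-encoded URL.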

Picking the right tool

Quick reference:

Need                                            Use
Encode a path/query/fragment value into a URL   encodeURIComponent
Build a query string                            URLSearchParams
Parse a query string                            URLSearchParams
Decode a path or fragment value                 decodeURIComponent

When in doubt, “putting something into a URL” → encodeURIComponent.

URL encoding has two flavors — RFC 3986 percent encoding and form-urlencoded — with subtle differences (most visibly the space: %20 vs +) that cause browser/API mismatches. The URL encoder on this site shows both formats in parallel for visual comparison. When implementation confusion strikes, the fastest sanity check is to call encodeURIComponent and look at what it actually returns before committing to one form.