Your data is never sent to a server or stored anywhere. All processing happens in your browser.

Unicode Normalizer (NFC / NFD / NFKC / NFKD)

How to Use


Type or paste text containing combining marks, fullwidth ASCII, ligatures, or compatibility characters. The tool shows the text under all four Unicode normalization forms side by side, with code points and UTF-8 byte counts for each. This is useful for debugging filename equality issues, search indexing, and database collation.
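
For readers who want the same numbers in their own code, the following is a minimal sketch built on `String.prototype.normalize()` and `TextEncoder`. The helper name `normalizeAll` and the shape of its return value are assumptions made for illustration, not the tool's actual source.

```javascript
// Hypothetical helper: report one string under all four normalization forms.
function normalizeAll(text) {
  const encoder = new TextEncoder(); // TextEncoder always encodes to UTF-8
  const result = {};
  for (const form of ["NFC", "NFD", "NFKC", "NFKD"]) {
    const normalized = text.normalize(form);
    result[form] = {
      text: normalized,
      // Spreading a string iterates by code point, not by UTF-16 code unit.
      codePoints: [...normalized].map(
        (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
      ),
      utf8Bytes: encoder.encode(normalized).length,
    };
  }
  return result;
}

const report = normalizeAll("\u00E9"); // "é" as one precomposed code point
console.log(report.NFC.codePoints, report.NFC.utf8Bytes); // [ 'U+00E9' ] 2
console.log(report.NFD.codePoints, report.NFD.utf8Bytes); // [ 'U+0065', 'U+0301' ] 3
```

Counting code points rather than `text.length` matters for characters outside the Basic Multilingual Plane, which occupy two UTF-16 code units each.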

The four forms


Unicode defines four normalization forms (UAX #15). They differ along two axes: canonical vs compatibility decomposition, and whether to compose afterwards.

  • NFC (Canonical Composition): decompose then re-compose canonically. The default for most text storage and comparison. "が" stays "が" (one code point).
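  • NFD (Canonical Decomposition): decompose into base + combining marks. "が" (U+304C) becomes "か" (U+304B) + combining voiced sound mark (U+3099), two code points that render the same. On macOS, HFS+ stores filenames in a variant of NFD; APFS keeps names as written but treats differently normalized spellings as the same name.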
  • NFKC (Compatibility Composition): like NFC but also folds compatibility variants — fullwidth A becomes ASCII A, ㈱ becomes (株). Use for search and identifier comparison.
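  • NFKD (Compatibility Decomposition): the most aggressive. It applies compatibility folding and leaves the result decomposed. Useful for stripping diacritics or implementing accent-insensitive search. It does not fold case, so lowercase separately for case-insensitive matching.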
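
The differences between the forms are easiest to see at the code point level. The sketch below uses escape sequences so the input is unambiguous; `codePoints` is a hypothetical helper written for this example.

```javascript
// Format a string as space-separated hex code points, e.g. "304B 3099".
function codePoints(text) {
  return [...text]
    .map((ch) => ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"))
    .join(" ");
}

const ga = "\u304C";         // "が", one precomposed code point
const fullwidthA = "\uFF21"; // fullwidth "Ａ"
const stock = "\u3231";      // "㈱"

console.log(codePoints(ga.normalize("NFC")));          // 304C
console.log(codePoints(ga.normalize("NFD")));          // 304B 3099
console.log(codePoints(fullwidthA.normalize("NFC")));  // FF21 (canonical forms leave it alone)
console.log(codePoints(fullwidthA.normalize("NFKC"))); // 0041
console.log(stock.normalize("NFKC"));                  // (株)
console.log(codePoints(stock.normalize("NFKD")));      // 0028 682A 0029
```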

When to use which


  • Storing user-supplied text: NFC (usually the most compact representation, widest compatibility).
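  • Indexing for search: NFKC (so fullwidth "Ｃａｆé", "Café" with a precomposed é, and "Café" written as e + combining acute all collapse). Normalization does not remove accents or change case, so matching "Café" with "Cafe" and "cafe" additionally needs accent stripping and case folding.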
  • Comparing filenames across OSes: normalize both sides to NFC before comparing.
  • Stripping accents: NFKD then remove combining marks (`\p{M}` regex).
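
The recipes above can be sketched as small functions. The names `sameName`, `stripAccents`, and `searchKey` are hypothetical, and `searchKey` is one possible design rather than a standard: since normalization alone neither removes accents nor changes case, it adds both steps explicitly.

```javascript
// Compare names across systems: put both sides in NFC first.
function sameName(a, b) {
  return a.normalize("NFC") === b.normalize("NFC");
}

// Strip accents: decompose with NFKD, then delete combining marks.
// The \p{M} property escape only works with the `u` flag.
function stripAccents(text) {
  return text.normalize("NFKD").replace(/\p{M}/gu, "");
}

// Assumed search key: compatibility folding, accent stripping, lowercasing.
function searchKey(text) {
  return stripAccents(text).toLowerCase().normalize("NFKC");
}

console.log(sameName("caf\u00E9.txt", "cafe\u0301.txt")); // true
console.log(stripAccents("Caf\u00E9"));                   // Cafe
console.log(searchKey("\uFF23\uFF41\uFF46\u00E9"));       // cafe (fullwidth input)
```

Removing every `\p{M}` character also drops marks that carry meaning in some scripts (for example, "が" becomes "か"), so this approach suits Latin-script search keys better than general text cleanup.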

Privacy & Security


All normalization happens in your browser via the standard String.prototype.normalize() API. No text is ever sent to a server.