Question 1

What is Unicode normalization?

Accepted Answer

Unicode normalization is the process of converting text into a standard form so that equivalent character sequences are represented by the same sequence of code points. The Unicode Standard (in Unicode Standard Annex #15) defines four normalization forms: NFC, NFD, NFKC, and NFKD. Normalizing text makes comparisons, searches, and storage consistent regardless of which code point sequence was originally entered.
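As a rough illustration of "equivalent sequences end up identical", the sketch below applies all four forms to three different ways of writing 'Å'. The helper name `normalizeAll` and the sample strings are assumptions made for this example; only `String.prototype.normalize()` is a built-in.

```javascript
// The four normalization forms accepted by String.prototype.normalize().
const FORMS = ["NFC", "NFD", "NFKC", "NFKD"];

// Hypothetical helper: returns an object mapping each form name to the
// normalized version of `text`.
function normalizeAll(text) {
  const result = {};
  for (const form of FORMS) {
    result[form] = text.normalize(form);
  }
  return result;
}

// Three canonically equivalent spellings of the same letter:
// ANGSTROM SIGN, LATIN CAPITAL LETTER A WITH RING ABOVE,
// and 'A' followed by COMBINING RING ABOVE.
const angstromVariants = ["\u212B", "\u00C5", "A\u030A"];

for (const variant of angstromVariants) {
  const forms = normalizeAll(variant);
  // Every variant yields the same NFC string and the same NFD string.
  console.log(forms.NFC === "\u00C5", forms.NFD === "A\u030A");
}
```

Whichever spelling goes in, each form produces one agreed-upon output, which is what makes later comparisons reliable.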

Question 2

What is the difference between NFC and NFD?

Accepted Answer

NFC (Normalization Form C) first decomposes text canonically and then recomposes base characters and combining marks into single precomposed characters wherever one exists. For example, 'e' (U+0065) followed by the combining acute accent (U+0301) becomes 'é' (U+00E9). NFD (Normalization Form D, canonical decomposition) does the opposite: it decomposes precomposed characters into a base character plus combining marks, arranged in a canonical order. The two forms are canonically equivalent, meaning they represent the same text.
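The composition and decomposition described above can be observed by listing code points before and after normalizing. This is a minimal sketch; `toCodePoints` is a hypothetical helper written for this example, not part of any library.

```javascript
// Hypothetical helper: lists the code points of a string as "U+XXXX" labels.
function toCodePoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

// 'e' followed by COMBINING ACUTE ACCENT (two code points).
const decomposedE = "e\u0301";

// NFC composes the pair into the single precomposed character.
const composedE = decomposedE.normalize("NFC");
console.log(toCodePoints(composedE)); // ["U+00E9"]

// NFD splits it back into base character plus combining mark.
console.log(toCodePoints(composedE.normalize("NFD"))); // ["U+0065", "U+0301"]
```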

Question 3

When should I use NFKC or NFKD instead of NFC/NFD?

Accepted Answer

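Use NFKC or NFKD when you want compatibility decomposition, which replaces compatibility characters (presentation variants such as ligatures, fullwidth forms, and superscripts) with their plain equivalents. For example, the ligature 'ﬁ' (U+FB01) becomes 'fi' in NFKC/NFKD, and fullwidth Latin letters become their ASCII equivalents. This is useful for search indexing, username validation, and security checks. Because the mapping discards formatting distinctions and cannot be reversed, it is usually safer to apply it to a comparison key than to the text you store. NFC/NFD preserve compatibility characters as-is.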
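A short sketch of the difference, assuming a hypothetical `compatibilityKey` helper that builds a key for search or username comparison rather than modifying the stored text:

```javascript
// Hypothetical helper: builds a comparison key using compatibility
// normalization. The original text is left untouched by the caller.
function compatibilityKey(text) {
  return text.normalize("NFKC");
}

const ligatureWord = "\uFB01le";          // 'ﬁ' ligature followed by "le"
const fullwidthWord = "\uFF21\uFF22\uFF23"; // fullwidth "ABC"

console.log(compatibilityKey(ligatureWord));  // "file"
console.log(compatibilityKey(fullwidthWord)); // "ABC"

// Canonical normalization keeps the compatibility characters as they are.
console.log(ligatureWord.normalize("NFC") === ligatureWord); // true
```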

Question 4

Which normalization form should I use for my application?

Accepted Answer

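NFC is the most commonly recommended form and is what the W3C recommends for content on the web. Use NFKC for search and comparison tasks where you want to treat compatibility variants as identical. On macOS, the older HFS+ file system stores filenames in a decomposed form close to NFD, while APFS keeps names in whatever form they were given instead of normalizing them. Many database systems likewise store strings exactly as provided, so normalize before writing if you need consistent values. The choice depends on your specific use case.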

Question 5

Is my data safe?

Accepted Answer

Yes. All normalization is performed entirely in your browser using JavaScript's built-in String.prototype.normalize() method. No text, characters, or any other data is ever sent to any server. You can verify this by checking the Network tab in your browser's developer tools.

Question 6

Does normalization change the visual appearance of text?

Accepted Answer

NFC and NFD normally do not change the visual appearance; they only change the underlying code point representation, although a font with weak combining-mark support may render the decomposed form slightly differently. NFKC and NFKD may change the appearance because they replace compatibility characters: for example, superscript digits become regular digits, and ligatures are split into individual letters.
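One way to check in advance whether compatibility normalization would alter a string is to compare its NFC and NFKC results. The function name `changesUnderCompatibility` is an assumption made for this sketch.

```javascript
// Hypothetical helper: returns true when NFKC would alter the text beyond
// what canonical normalization (NFC) already does, which is when the
// visible result may differ.
function changesUnderCompatibility(text) {
  return text.normalize("NFC") !== text.normalize("NFKC");
}

console.log(changesUnderCompatibility("x\u00B2"));  // true: SUPERSCRIPT TWO becomes "2"
console.log(changesUnderCompatibility("caf\u00E9")); // false: only canonical characters
console.log("x\u00B2".normalize("NFKC"));            // "x2"
```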

Question 7

How does normalization affect string comparison?

Accepted Answer

Without normalization, two visually identical strings may not be equal in a byte-by-byte comparison. For example, 'é' (U+00E9) and 'é' (U+0065 + U+0301) look the same but have different byte sequences. Normalizing both strings to the same form before comparison ensures consistent results. This is critical for databases, search engines, and authentication systems.
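The comparison problem and its fix can be sketched as follows. `equalsNormalized` is a hypothetical helper; the choice of NFC as the default form is an assumption for this example, and any form works as long as both sides use the same one.

```javascript
// Hypothetical helper: compares two strings after normalizing both to the
// same form (NFC by default).
function equalsNormalized(a, b, form = "NFC") {
  return a.normalize(form) === b.normalize(form);
}

const precomposed = "\u00E9"; // single code point U+00E9
const combining = "e\u0301";  // U+0065 followed by U+0301

console.log(precomposed === combining);                // false: different code points
console.log(equalsNormalized(precomposed, combining)); // true: same text after NFC
```

Applying the same normalization both when a value is stored and when it is looked up avoids repeating this work at every comparison.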

Unicode Normalizer
