Question 1

NFC vs NFD — Canonical Composition vs Decomposition

Accepted Answer

## NFC vs NFD: The Two Canonical Forms

NFC and NFD are the two canonical normalization forms. They produce canonically equivalent text — meaning the text represents the same abstract characters — but they differ in how those characters are stored.
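As a minimal sketch of that point, the snippet below uses Python's standard `unicodedata` module to show that a precomposed and a decomposed "é" are different strings in storage yet normalize to the same sequence. The variable names are illustrative only.

```python
import unicodedata

composed = "\u00e9"      # é as a single precomposed code point
decomposed = "e\u0301"   # e followed by U+0301 COMBINING ACUTE ACCENT

# The raw strings differ in both content and length.
print(composed == decomposed)           # False
print(len(composed), len(decomposed))   # 1 2

# Each normalization form maps both spellings to one sequence.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)
print(nfc == composed)     # True
print(nfd == decomposed)   # True
```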

### NFD: Canonical Decomposition

NFD breaks precomposed characters into their base character plus combining marks:

| Input | NFD Result | Code Points |
|-------|-----------|-------------|
| é (U+00E9) | é | U+0065 + U+0301 |
| ñ (U+00F1) | ñ | U+006E + U+0303 |

Question 2

When is this useful?

Accepted Answer

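Knowing the difference is essential for web developers working with internationalized text. The W3C recommends NFC for HTML content, while file names on macOS are often stored decomposed (the older HFS+ file system uses a variant of NFD). The same visible string can therefore arrive as two different code point sequences, and normalizing consistently prevents bugs in file handling, form submission, and database storage of accented text.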
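One way to guard against this kind of mismatch is to normalize both sides before comparing. The sketch below assumes a hypothetical helper named `same_text` and made-up file names; it is not taken from any particular framework.

```python
import unicodedata

def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after bringing both to the same normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# Hypothetical scenario: a file name submitted through a web form (NFC)
# versus the same name read back from a file system that stores it decomposed.
from_form = "caf\u00e9.txt"    # precomposed é
from_disk = "cafe\u0301.txt"   # e + combining acute accent

print(from_form == from_disk)            # False
print(same_text(from_form, from_disk))   # True
```

Which form to normalize to matters less than applying the same one everywhere; NFC is used as the default here only because it matches the W3C recommendation mentioned above.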

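NFD (decomposition):

| Input | NFD Result | Code Points |
|-------|-----------|-------------|
| `é` (U+00E9) | `é` | U+0065 + U+0301 |
| `ñ` (U+00F1) | `ñ` | U+006E + U+0303 |
| `ü` (U+00FC) | `ü` | U+0075 + U+0308 |

NFC (composition):

| Input | NFC Result | Code Points |
|-------|-----------|-------------|
| `e` + `́` | `é` | U+00E9 |
| `n` + `̃` | `ñ` | U+00F1 |
| `u` + `̈` | `ü` | U+00FC |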
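The rows above can be reproduced with a short round trip through both forms. This is a sketch using Python's standard `unicodedata` module; `code_points` is a hypothetical helper written just for this illustration.

```python
import unicodedata

def code_points(s: str) -> str:
    """Format a string as 'U+XXXX + U+XXXX ...' for inspection."""
    return " + ".join(f"U+{ord(ch):04X}" for ch in s)

for ch in ["\u00e9", "\u00f1", "\u00fc"]:          # é, ñ, ü (precomposed)
    nfd = unicodedata.normalize("NFD", ch)        # base letter + combining mark
    nfc = unicodedata.normalize("NFC", nfd)       # recomposed single code point
    print(code_points(ch), "->", code_points(nfd), "->", code_points(nfc))
```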
