NFKC vs NFKD — Compatibility Composition vs Decomposition

Learn how NFKC and NFKD differ from NFC/NFD by applying compatibility decomposition. Understand when ligatures, fullwidth characters, and special symbols are transformed.

Detailed Explanation

NFKC vs NFKD: Compatibility Normalization

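NFKC and NFKD add compatibility decomposition on top of canonical normalization. This means they replace compatibility characters with their plain equivalents. A compatibility character represents the same underlying character as a standard one but differs in appearance or formatting, as with ligatures, fullwidth forms, and superscripts.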

What Compatibility Decomposition Does

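Input                          NFKC/NFKD Result       Description
ﬁ (U+FB01, fi ligature)        fi                     Ligature split into letters
Ａ (U+FF21, fullwidth A)       A                      Fullwidth to ASCII
½ (U+00BD, vulgar fraction)    1⁄2 (1, U+2044, 2)     Fraction decomposed
Ω (U+2126, Ohm sign)           Ω (U+03A9)             Symbol to Greek letter (a canonical mapping, so NFC/NFD apply it too)
Ⅰ (U+2160, Roman numeral I)    I                      Numeral to letter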
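
The rows above can be checked with Python's standard `unicodedata` module. This is a minimal sketch: the `EXPECTED` mapping and the `compat_forms` helper are illustrative names, not part of any library.

```python
import unicodedata

# Input character -> result the table lists for both NFKC and NFKD.
EXPECTED = {
    "\ufb01": "fi",        # LATIN SMALL LIGATURE FI
    "\uff21": "A",         # FULLWIDTH LATIN CAPITAL LETTER A
    "\u00bd": "1\u20442",  # VULGAR FRACTION ONE HALF -> 1, FRACTION SLASH, 2
    "\u2126": "\u03a9",    # OHM SIGN -> GREEK CAPITAL LETTER OMEGA
    "\u2160": "I",         # ROMAN NUMERAL ONE
}


def compat_forms(text: str) -> tuple[str, str]:
    """Return the (NFKC, NFKD) forms of text."""
    return (
        unicodedata.normalize("NFKC", text),
        unicodedata.normalize("NFKD", text),
    )


for char, expected in EXPECTED.items():
    nfkc, nfkd = compat_forms(char)
    print(f"U+{ord(char):04X} {unicodedata.name(char)}: "
          f"NFKC={nfkc!r} NFKD={nfkd!r} expected={expected!r}")
```

For these single characters NFKC and NFKD give the same result, because nothing in the output can be recomposed. The two forms only diverge when the text also contains accents or other combining marks.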

NFKC vs NFKD

The difference between NFKC and NFKD mirrors the NFC/NFD distinction:

  • NFKD: Compatibility decomposition only (longer output)
  • NFKC: Compatibility decomposition followed by canonical composition (shorter output)

For example, with ﬁé (the ﬁ ligature U+FB01 followed by precomposed é U+00E9):

  • NFKD: f + i + e + ́ (U+0301 combining acute accent), 4 code points
  • NFKC: f + i + é (U+00E9), 3 code points
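
The same example can be reproduced in Python with `unicodedata.normalize`. The variable names here are only for illustration.

```python
import unicodedata

# fi ligature (U+FB01) followed by precomposed e-acute (U+00E9)
text = "\ufb01\u00e9"

nfkd = unicodedata.normalize("NFKD", text)
nfkc = unicodedata.normalize("NFKC", text)

# List the code points of each form to make the difference visible.
print([f"U+{ord(c):04X}" for c in nfkd])  # f, i, e, combining acute
print([f"U+{ord(c):04X}" for c in nfkc])  # f, i, precomposed e-acute
print(len(text), len(nfkd), len(nfkc))
```

Both forms split the ligature. Only NFKC then recombines `e` and the combining accent into a single code point.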

Important Warning

Compatibility normalization is lossy: it discards formatting distinctions that may be meaningful, and the original text cannot be recovered afterwards. The ﬁ ligature and the letters "fi" are semantically different in some contexts (e.g., typography), and a superscript such as the ² in x² becomes a plain 2. Only use NFKC/NFKD when you intentionally want to discard these distinctions.
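
A short sketch of what "lossy" means in practice, assuming Python's `unicodedata` module. The `nfkc` helper is just a local shorthand.

```python
import unicodedata


def nfkc(text: str) -> str:
    """Shorthand for NFKC normalization."""
    return unicodedata.normalize("NFKC", text)


squared = "x\u00b2"  # x followed by SUPERSCRIPT TWO
plain = "x2"         # x followed by DIGIT TWO

# After NFKC the two inputs are indistinguishable, so the superscript
# cannot be restored from the normalized text.
collapsed = nfkc(squared) == nfkc(plain)

# Canonical normalization (NFC) leaves the superscript alone.
kept_by_nfc = unicodedata.normalize("NFC", squared) == squared

print(collapsed, kept_by_nfc)
```

If the formatting matters, one option is to store the original string and keep the NFKC form only as a derived key for matching.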

Use Case

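Compatibility normalization is critical for search engines, username validation, and security systems that need to treat compatibility variants of the same character identically. It does not unify look-alikes from different scripts, such as Latin a and Cyrillic а; that requires separate confusable detection. Python applies NFKC to identifiers when parsing source code (PEP 3131), and Unicode defines the NFKC_Casefold mapping for caseless identifier matching. In the PRECIS framework (RFC 8264), the Nickname profile (RFC 8266) uses NFKC, while the username and password profiles (RFC 8265) use NFC together with a separate width-mapping step.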
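
As a rough illustration of the matching use case, the sketch below builds a comparison key from NFKC plus case folding. The `comparison_key` function is a hypothetical helper, not an implementation of Unicode's NFKC_Casefold or of any PRECIS profile, both of which include additional rules.

```python
import unicodedata


def comparison_key(name: str) -> str:
    """Build a key for matching names, not for display.

    Normalize to NFKC, case-fold, then normalize again because case
    folding can produce sequences that are no longer normalized.
    """
    folded = unicodedata.normalize("NFKC", name).casefold()
    return unicodedata.normalize("NFKC", folded)


# Fullwidth A (U+FF21) and the fi ligature (U+FB01) collapse to ASCII.
print(comparison_key("\uff21dmin") == comparison_key("admin"))
print(comparison_key("\ufb01nn") == comparison_key("Finn"))

# Cyrillic a (U+0430) has no compatibility mapping, so it stays distinct.
print(comparison_key("\u0430dmin") == comparison_key("admin"))
```

Comparing keys rather than rewriting the stored name keeps the user's original spelling available for display while still catching compatibility variants at signup or lookup time.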
