Zero-Width Characters in Unicode

Discover invisible zero-width Unicode characters including ZWSP, ZWNJ, ZWJ, and Word Joiner — their code points, purposes, and how to detect them in text.

Detailed Explanation

Zero-Width Characters

Zero-width characters are Unicode code points that have no visible rendering but affect text processing, line breaking, and shaping. They are invisible to the naked eye, making them a common source of bugs and security concerns.

Common Zero-Width Characters

Code Point   Name                        Abbreviation   Purpose
U+200B       ZERO WIDTH SPACE            ZWSP           Optional line break opportunity
U+200C       ZERO WIDTH NON-JOINER       ZWNJ           Prevents joining (cursive connection or ligature formation)
U+200D       ZERO WIDTH JOINER           ZWJ            Requests joining of adjacent characters (e.g. emoji sequences)
U+2060       WORD JOINER                 WJ             Prevents line break (replaces U+FEFF in its deprecated ZWNBSP role)
U+FEFF       ZERO WIDTH NO-BREAK SPACE   BOM            Byte order mark at the start of a file or stream; legacy ZWNBSP elsewhere
U+200E       LEFT-TO-RIGHT MARK          LRM            Invisible strongly LTR character; influences bidirectional ordering
U+200F       RIGHT-TO-LEFT MARK          RLM            Invisible strongly RTL character; influences bidirectional ordering

UTF-8 Encoding

Every character in the table above occupies 3 bytes in UTF-8. For example:

  • U+200B → E2 80 8B
  • U+200C → E2 80 8C
  • U+200D → E2 80 8D
  • U+2060 → E2 81 A0
  • U+FEFF → EF BB BF
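
The byte sequences listed above can be reproduced with the standard TextEncoder API. The following is a minimal sketch; the helper name utf8Hex is hypothetical and only serves the illustration.

```javascript
// TextEncoder is a standard global in modern browsers and Node.js.
const encoder = new TextEncoder();

// Hypothetical helper: return the UTF-8 bytes of a string as uppercase hex.
function utf8Hex(text) {
  return Array.from(encoder.encode(text), (byte) =>
    byte.toString(16).toUpperCase().padStart(2, "0")
  ).join(" ");
}

for (const char of ["\u200B", "\u200C", "\u200D", "\u2060", "\uFEFF"]) {
  const label =
    "U+" + char.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
  console.log(`${label} → ${utf8Hex(char)}`);
}
```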

Why They're Problematic

  1. String comparison: Two strings that look identical may differ by a hidden zero-width character
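  2. Data validation: User input made up only of invisible characters appears blank yet passes non-empty and minimum-length checks, and hidden characters inside a word can slip past keyword filters
  3. Security: Zero-width characters can be used for steganography (hiding messages in text) or for spoofing attacks, where two different strings look identical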
  4. Search/indexing: Hidden characters affect search results and database lookups
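
A short sketch of the first two problems, using made-up sample strings:

```javascript
const visible = "admin";
const hidden = "ad\u200Bmin"; // contains a ZERO WIDTH SPACE

// The two strings render the same but are not equal.
console.log(visible === hidden); // false
console.log(visible.length, hidden.length); // 5 6

// A string made only of zero-width spaces is not empty,
// and String.prototype.trim() does not remove U+200B.
const blank = "\u200B\u200B\u200B";
console.log(blank.length); // 3
console.log(blank.trim().length); // 3
```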

Detection with the Unicode Inspector

The Unicode Inspector displays zero-width characters with their code point label (e.g. U+200B) instead of rendering nothing, making them immediately visible. The category column shows each character's Unicode general category (Format, Cf, for the characters listed above), and the byte count reveals the 3 bytes each invisible character adds in UTF-8.
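
The same idea can be approximated in code. This is a rough sketch and not the Unicode Inspector's actual implementation; the names ZERO_WIDTH_NAMES and findZeroWidth are hypothetical.

```javascript
// Hypothetical lookup table covering the characters discussed in this article.
const ZERO_WIDTH_NAMES = new Map([
  [0x200b, "ZWSP"],
  [0x200c, "ZWNJ"],
  [0x200d, "ZWJ"],
  [0x200e, "LRM"],
  [0x200f, "RLM"],
  [0x2060, "WJ"],
  [0xfeff, "BOM"],
]);

// Report every zero-width character with its position (in UTF-16 code units).
function findZeroWidth(text) {
  const hits = [];
  let index = 0;
  for (const char of text) {
    const codePoint = char.codePointAt(0);
    if (ZERO_WIDTH_NAMES.has(codePoint)) {
      hits.push({
        index,
        codePoint: "U+" + codePoint.toString(16).toUpperCase().padStart(4, "0"),
        name: ZERO_WIDTH_NAMES.get(codePoint),
      });
    }
    index += char.length; // astral characters take two code units
  }
  return hits;
}

console.log(findZeroWidth("ad\u200Bmin\u200D"));
```

An empty result means none of the listed characters is present; other invisible code points are outside the scope of this table.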

Practical Tips

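  • Use String.prototype.normalize() to unify composed and decomposed forms before comparison; normalization does not remove zero-width characters, so strip them separately
  • Strip zero-width characters with a regex: /[\u200B-\u200D\uFEFF\u2060]/g (this also removes ZWJ and ZWNJ, which breaks emoji sequences and can change text in scripts that rely on them, such as Persian)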
  • Check for unexpected zero-width characters when debugging string comparison failures
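
The first two tips can be combined as follows. This is a minimal sketch that assumes stripping is acceptable for the data at hand; the function names stripZeroWidth and looselyEqual are hypothetical.

```javascript
// The regex from the tip above.
const ZERO_WIDTH_PATTERN = /[\u200B-\u200D\uFEFF\u2060]/g;

// Remove the zero-width characters matched by the pattern.
function stripZeroWidth(text) {
  return text.replace(ZERO_WIDTH_PATTERN, "");
}

// Compare two strings after stripping zero-width characters
// and normalizing both sides to NFC.
function looselyEqual(a, b) {
  return (
    stripZeroWidth(a).normalize("NFC") === stripZeroWidth(b).normalize("NFC")
  );
}

console.log(stripZeroWidth("ad\u200Bmin") === "admin"); // true
console.log("ad\u200Bmin".normalize("NFC") === "admin"); // false
console.log(looselyEqual("caf\u00E9", "cafe\u0301\u200B")); // true
```

Stripping suits identifiers such as usernames or lookup keys. For free-form text it may be safer to detect and report the characters than to remove them, since ZWJ and ZWNJ can carry meaning.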

Use Case

Use this when debugging invisible character issues in user-submitted text, detecting potential steganography or string manipulation attacks, cleaning data imports that contain hidden characters, or understanding why visually identical strings fail equality checks.
