Complete Guide to Unicode Whitespace Characters

A comprehensive reference of all Unicode whitespace and invisible characters. Learn their Unicode code points, purposes, and common sources.

Detailed Explanation

Unicode Whitespace Character Reference

Unicode defines many more whitespace and invisible characters than the basic space and newline. Understanding them is essential for robust text processing.

Standard Whitespace Characters

Character  Code point  Name                  Marker
Space      U+0020      Space                 ·
Tab        U+0009      Character Tabulation
LF         U+000A      Line Feed
CR         U+000D      Carriage Return
NBSP       U+00A0      No-Break Space        °

Zero-Width Characters

Character  Code point  Name                                         Marker
ZWS        U+200B      Zero Width Space                             [ZWS]
ZWJ        U+200D      Zero Width Joiner                            [ZWJ]
ZWNJ       U+200C      Zero Width Non-Joiner                        [ZWNJ]
SHY        U+00AD      Soft Hyphen                                  [SHY]
BOM        U+FEFF      Byte Order Mark / Zero Width No-Break Space  [BOM]
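
As a minimal sketch of how the markers in the two tables above could be applied, the Python below maps each character that has a listed marker to that marker. The names `MARKERS` and `show_markers` are hypothetical and chosen for illustration; this is not the Visualizer's actual implementation. Tab, LF, and CR pass through unchanged because no marker is listed for them above.

```python
# Hypothetical marker table, copied from the two tables above.
MARKERS = {
    "\u0020": "·",       # Space
    "\u00A0": "°",       # No-Break Space
    "\u200B": "[ZWS]",   # Zero Width Space
    "\u200D": "[ZWJ]",   # Zero Width Joiner
    "\u200C": "[ZWNJ]",  # Zero Width Non-Joiner
    "\u00AD": "[SHY]",   # Soft Hyphen
    "\uFEFF": "[BOM]",   # Byte Order Mark
}

def show_markers(text: str) -> str:
    """Replace each mapped invisible character with its visible marker."""
    return "".join(MARKERS.get(ch, ch) for ch in text)

sample = "\ufeffprice:\u00a010\u200b USD"
print(show_markers(sample))  # [BOM]price:°10[ZWS]·USD
```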

Other Unicode Spaces (Not in Visualizer)

These are less common but worth knowing about:

Code point  Name                       Width
U+2000      En Quad                    Half an em (canonically equivalent to U+2002 En Space)
U+2001      Em Quad                    Full em (canonically equivalent to U+2003 Em Space)
U+2002      En Space                   Half an em
U+2003      Em Space                   Full em
U+2004      Three-Per-Em Space         1/3 em
U+2005      Four-Per-Em Space          1/4 em
U+2006      Six-Per-Em Space           1/6 em
U+2007      Figure Space               Width of a digit
U+2008      Punctuation Space          Width of a period
U+2009      Thin Space                 1/5 em
U+200A      Hair Space                 Thinner than a thin space
U+202F      Narrow No-Break Space      Narrow NBSP
U+205F      Medium Mathematical Space  4/18 em
U+3000      Ideographic Space          Full-width CJK space
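
One way to handle these in code is to map them all to a plain U+0020 space. The sketch below uses only the Python standard library; `OTHER_SPACES`, `TO_ASCII_SPACE`, and `normalize_spaces` are hypothetical names. It assumes that losing the width and no-break distinctions is acceptable for the text being processed.

```python
import unicodedata

# Code points from the table above: U+2000 to U+200A, plus three others.
OTHER_SPACES = [chr(cp) for cp in range(0x2000, 0x200B)] + ["\u202F", "\u205F", "\u3000"]

# Translation table that maps each of them to a plain U+0020 space.
TO_ASCII_SPACE = {ord(ch): " " for ch in OTHER_SPACES}

def normalize_spaces(text: str) -> str:
    """Replace the typographic spaces listed above with U+0020."""
    return text.translate(TO_ASCII_SPACE)

for ch in OTHER_SPACES:
    # Each entry belongs to general category Zs (space separator).
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))

print(normalize_spaces("10\u2009000\u3000km"))  # 10 000 km
```

An explicit translation table is used here rather than NFKC normalization, because NFKC would also rewrite unrelated characters such as ligatures and full-width letters.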

How Characters Get Mixed In

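  1. Copy from web: HTML entities such as &nbsp; and &thinsp; become real Unicode characters in the rendered text, and they survive copy and paste
  2. Copy from documents: Word processors insert typographic spaces such as no-break spaces
  3. Multi-language input: Different input methods produce different spaces; CJK input methods, for example, can produce U+3000 Ideographic Space
  4. API responses: External data may contain unexpected Unicode
  5. Database migration: Character encoding conversion can introduce artifacts such as a stray BOM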
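
Three of these sources can be reproduced with the Python standard library. This is an illustrative sketch; the sample strings and byte sequences are made up for the demonstration.

```python
import html

# 1. Copy from web: HTML entities decode to real Unicode characters.
decoded = html.unescape("A&nbsp;B&ensp;C&thinsp;D&shy;E")
print([f"U+{ord(c):04X}" for c in decoded if not c.isascii()])
# ['U+00A0', 'U+2002', 'U+2009', 'U+00AD']

# 2. Encoding conversion: UTF-8 bytes that start with a BOM keep U+FEFF as the
#    first character when decoded as plain "utf-8"; "utf-8-sig" strips it.
raw = b"\xef\xbb\xbfid,name"
print(repr(raw.decode("utf-8")))      # '\ufeffid,name'
print(repr(raw.decode("utf-8-sig")))  # 'id,name'

# 3. Encoding mismatch: the UTF-8 bytes of NBSP read as Latin-1 become two characters.
mojibake = "a\u00a0b".encode("utf-8").decode("latin-1")
print(repr(mojibake))  # 'aÂ\xa0b'
```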

Using the Whitespace Visualizer

The tool detects the commonly problematic characters listed in the first two tables. Paste any suspicious text to see which invisible characters are present, where each one occurs, and how many of each there are. Use the Clean feature to remove specific types selectively.
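
The same kind of report (positions, counts, selective removal) can be approximated in a few lines of Python. This is a rough sketch, not the tool's code: `TRACKED`, `scan`, and `clean` are hypothetical names, and the labels are taken from the first two tables.

```python
from collections import Counter

# Hypothetical labels for the characters in the first two tables.
TRACKED = {
    "\u0020": "Space", "\u0009": "Tab", "\u000A": "LF", "\u000D": "CR",
    "\u00A0": "NBSP", "\u200B": "ZWS", "\u200D": "ZWJ", "\u200C": "ZWNJ",
    "\u00AD": "SHY", "\uFEFF": "BOM",
}

def scan(text: str) -> list[tuple[int, str]]:
    """Return (index, label) for every tracked character in text."""
    return [(i, TRACKED[ch]) for i, ch in enumerate(text) if ch in TRACKED]

def clean(text: str, remove: set[str]) -> str:
    """Drop tracked characters whose label is in `remove`; keep everything else."""
    return "".join(ch for ch in text if TRACKED.get(ch) not in remove)

sample = "\ufeffid\u200b = 1\u00a0"
hits = scan(sample)
print(hits)  # [(0, 'BOM'), (3, 'ZWS'), (4, 'Space'), (6, 'Space'), (8, 'NBSP')]
print(Counter(label for _, label in hits))   # per-type counts
print(repr(clean(sample, {"BOM", "ZWS"})))   # 'id = 1\xa0'
```

Indices here are Python string indices (code points), which may differ from byte offsets or from positions reported by other tools.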

Use Case

A developer building a text processing library needs to handle all types of Unicode whitespace correctly. They use the Whitespace Visualizer as a reference and testing tool to verify that their regex patterns correctly identify each whitespace character type.
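
A check of this kind might look like the sketch below, which tests Python's built-in `\s` class against the three tables in this guide. The list names and the `matched` helper are hypothetical. In Python 3, `\s` in a `str` pattern matches the standard and typographic spaces but none of the zero-width characters, so those need an explicit character class.

```python
import re

STANDARD = ["\u0020", "\u0009", "\u000A", "\u000D", "\u00A0"]
ZERO_WIDTH = ["\u200B", "\u200D", "\u200C", "\u00AD", "\uFEFF"]
OTHER = [chr(cp) for cp in range(0x2000, 0x200B)] + ["\u202F", "\u205F", "\u3000"]

WS = re.compile(r"\s")  # Unicode-aware for str patterns in Python 3

def matched(chars):
    """Return the code points in `chars` that the \\s class matches."""
    return [f"U+{ord(c):04X}" for c in chars if WS.fullmatch(c)]

print(len(matched(STANDARD)), len(matched(OTHER)))  # 5 14
print(matched(ZERO_WIDTH))                          # []

# An explicit character class covers the zero-width set as well.
INVISIBLE = re.compile(r"[\s\u200B\u200C\u200D\u00AD\uFEFF]")
print(all(INVISIBLE.fullmatch(c) for c in STANDARD + ZERO_WIDTH + OTHER))  # True
```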
