ASCII Control Characters in Unicode

Understand the ASCII control characters (U+0000 to U+001F, plus U+007F), including NULL, TAB, LINE FEED, and CARRIAGE RETURN: their code points, UTF-8 encoding, and roles in text processing.

Detailed Explanation

ASCII Control Characters

The first 32 Unicode code points (U+0000 to U+001F) plus U+007F (DELETE) are control characters inherited from the ASCII standard. These characters are non-printable: they have no visible glyph of their own and instead control how text is laid out and processed by terminals, printers, and software.

The Most Common Control Characters

| Code Point | Name                       | Common Use                                 |
|------------|----------------------------|--------------------------------------------|
| U+0000     | NULL (NUL)                 | String terminator in C/C++                 |
| U+0009     | CHARACTER TABULATION (TAB) | Horizontal tab in text                     |
| U+000A     | LINE FEED (LF)             | Newline on Unix/macOS                      |
| U+000D     | CARRIAGE RETURN (CR)       | Newline component on Windows (CR+LF)       |
| U+001B     | ESCAPE (ESC)               | Start of ANSI escape sequences             |
| U+007F     | DELETE (DEL)               | Delete character                           |

UTF-8 Encoding

All ASCII control characters occupy a single byte in UTF-8, with values 0x00 through 0x1F and 0x7F. Each one is therefore byte-for-byte identical to its original ASCII encoding, which follows from a core design goal of UTF-8: backward compatibility with ASCII.
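The one-byte claim is easy to check. The following is a minimal Python sketch using only the standard library; the names CONTROL_CODE_POINTS and utf8_matches_ascii are illustrative and do not come from any tool described here.

```python
# Code points discussed in this section: U+0000..U+001F plus U+007F.
CONTROL_CODE_POINTS = list(range(0x00, 0x20)) + [0x7F]


def utf8_matches_ascii(code_point: int) -> bool:
    """Return True if the UTF-8 and ASCII encodings are the same single byte."""
    ch = chr(code_point)
    utf8_bytes = ch.encode("utf-8")
    ascii_bytes = ch.encode("ascii")
    # Both encodings should be exactly one byte whose value is the code point.
    return utf8_bytes == ascii_bytes == bytes([code_point])


all_match = all(utf8_matches_ascii(cp) for cp in CONTROL_CODE_POINTS)
print(all_match)
```

If the claim holds, every code point in the list encodes to a single byte equal to its own value, and the script prints True.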

Why They Matter

Control characters frequently appear in data processing pipelines. A stray NULL byte can truncate strings in C programs. Mixed line endings (LF vs. CR+LF) cause issues when sharing files between operating systems. The ESCAPE character initiates terminal color codes and cursor movement sequences. Understanding these characters is the first step to debugging text encoding issues.
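As a rough illustration of how these problems can be detected, the sketch below counts NULL bytes, ESCAPE bytes, and each style of line ending in a byte string. The function name summarize_control_bytes and the sample data are hypothetical, and the approach assumes the input is ASCII or UTF-8, where these byte values cannot occur inside a multi-byte sequence.

```python
def summarize_control_bytes(data: bytes) -> dict:
    """Count NUL and ESC bytes and classify line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    # Every CR+LF pair contains one LF and one CR, so subtract the pairs
    # to get the bare (unpaired) occurrences.
    lf_only = data.count(b"\n") - crlf
    cr_only = data.count(b"\r") - crlf
    return {
        "nul": data.count(b"\x00"),
        "esc": data.count(b"\x1b"),
        "crlf": crlf,
        "lf": lf_only,
        "cr": cr_only,
    }


# Hypothetical sample: one Windows line ending, two Unix line endings,
# a stray NUL byte, and an ANSI color-reset sequence.
sample = b"a,b\r\nc,\x00d\ne\x1b[0m\n"
summary = summarize_control_bytes(sample)
print(summary)
```

A file with nonzero counts for both "crlf" and "lf" has mixed line endings, and a nonzero "nul" count in a text file usually deserves a closer look.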

Identifying Hidden Characters

When you paste text into the Unicode Inspector, control characters are displayed with their code point label (e.g. U+000A) rather than an invisible glyph, making them easy to spot. The category column shows "Control" for all characters in this range.
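A similar effect can be approximated in a script. The sketch below is not the Unicode Inspector's implementation; it is a minimal Python example that replaces each control character with a visible code point label. The function name reveal_controls and the angle-bracket label format are assumptions made for illustration. It relies on unicodedata.category, which returns "Cc" for control characters; note that "Cc" also covers U+0080 to U+009F, which lie outside the ASCII range discussed here.

```python
import unicodedata


def reveal_controls(text: str) -> str:
    """Replace each control character (category Cc) with a <U+XXXX> label."""
    parts = []
    for ch in text:
        if unicodedata.category(ch) == "Cc":
            # Format the code point as at least four uppercase hex digits.
            parts.append(f"<U+{ord(ch):04X}>")
        else:
            parts.append(ch)
    return "".join(parts)


print(reveal_controls("a\tb\r\n"))  # a<U+0009>b<U+000D><U+000A>
```

Printable text passes through unchanged, so the output can be compared directly against the original to locate hidden characters.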

Use Case

Use this when debugging data files that contain unexpected control characters — for example, finding hidden NULL bytes in a CSV export, identifying mixed line endings (LF vs CR+LF) in cross-platform scripts, or detecting stray escape sequences in log files.
