Detect and Remove Byte Order Mark (BOM)

Find the invisible BOM character (U+FEFF) at the start of text files. Remove it to prevent issues with shell scripts, JSON, PHP, and HTTP responses.

Detailed Explanation

The Byte Order Mark (BOM)

The Byte Order Mark (BOM, U+FEFF) is a Unicode character originally designed to indicate the byte order (endianness) of a text stream in the UTF-16 and UTF-32 encodings. In UTF-8 it is encoded as the three bytes EF BB BF and serves no purpose, since UTF-8 has a fixed byte order, but many Windows tools still add it.
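
To make the byte-level picture concrete, here is a minimal Python sketch that inspects the first bytes of a buffer and reports which BOM, if any, it starts with. The function name detect_bom and the returned labels are illustrative choices, not part of any tool described here; the signatures come from Python's standard codecs module.

```python
import codecs

# Check longer signatures first: the UTF-32 LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for signature, name in _BOMS:
        if data.startswith(signature):
            return name
    return None
```

A buffer that starts with FF FE followed by two zero bytes is reported as UTF-32 LE, which is ambiguous in principle (it could be UTF-16 LE text beginning with a NUL character); the sketch assumes the UTF-32 reading.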

Where BOMs Come From

  • Notepad (Windows): Prior to Windows 10 version 1903, Notepad saved UTF-8 files with a BOM by default.
  • Excel CSV export: Excel often adds a BOM when exporting to UTF-8 CSV.
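  • PowerShell: In Windows PowerShell 5.1, Out-File -Encoding UTF8 writes UTF-8 with a BOM (its default encoding, UTF-16LE, also starts with one). PowerShell 6 and later default to UTF-8 without a BOM.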
  • Visual Studio: Some VS configurations add BOM to new files.
  • Text editors: Some editors add BOM when "saving as UTF-8".

How the Visualizer Shows BOM

The BOM appears as an orange [BOM] marker, typically at the very beginning of the text. The statistics panel shows the count (usually 0 or 1, but concatenated files may have multiple).
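
The count shown in the statistics panel can be approximated in a few lines of Python. This is a sketch of the idea only, not the Visualizer's actual implementation; bom_positions is a hypothetical helper name.

```python
BOM = "\ufeff"


def bom_positions(text: str) -> list:
    """Return the character index of every U+FEFF in the text."""
    return [index for index, char in enumerate(text) if char == BOM]


# Two BOM-prefixed CSV fragments joined together, as after a naive file concatenation.
merged = "\ufeffa,b\n" + "\ufeffc,d\n"
positions = bom_positions(merged)
```

Any index other than 0 points to a BOM in the middle of the text, which is the signature of concatenated files.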

Problems Caused by BOM

Context        Problem
Shell scripts  #!/bin/bash becomes \uFEFF#!/bin/bash, so the shebang is not recognized, causing "command not found"
JSON           RFC 8259 says generators must not add a BOM; parsers may ignore one, but strict parsers reject the file
PHP            A BOM before <?php is sent as output before the headers, breaking sessions and redirects ("headers already sent")
HTTP           A BOM in the response body can break JSON APIs and XML parsing
CSV            The first column header includes the invisible BOM, causing lookup failures
YAML           A BOM can confuse YAML parsers or appear in string values
Concatenation  Concatenating files that each start with a BOM leaves BOMs in the middle of the result
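
The JSON and CSV rows above are easy to reproduce with Python's standard library. The sample payloads below are made up for illustration. Decoding with "utf-8" keeps U+FEFF as the first character, while the "utf-8-sig" codec drops a leading BOM if one is present.

```python
import csv
import io
import json

json_bytes = b'\xef\xbb\xbf{"name": "demo"}'
csv_bytes = b"\xef\xbb\xbfid,name\n1,Ada\n"

# JSON: Python's json module refuses text that starts with U+FEFF.
try:
    json.loads(json_bytes.decode("utf-8"))
    json_rejected = False
except json.JSONDecodeError:
    json_rejected = True

# Decoding with utf-8-sig removes the BOM, so parsing succeeds.
parsed = json.loads(json_bytes.decode("utf-8-sig"))

# CSV: the BOM becomes part of the first header name.
dirty_rows = list(csv.DictReader(io.StringIO(csv_bytes.decode("utf-8"))))
clean_rows = list(csv.DictReader(io.StringIO(csv_bytes.decode("utf-8-sig"))))

first_header_dirty = list(dirty_rows[0].keys())[0]  # "\ufeffid", not "id"
first_header_clean = list(clean_rows[0].keys())[0]  # "id"
```

Other parsers behave differently; some silently skip the BOM, which is why the failure often shows up only after a file moves between tools.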

Removing BOM

  1. Paste your file content into the Whitespace Visualizer.
  2. Look for [BOM] at the very start of the visualization.
  3. In the Clean section, enable BOM and click Clean.
  4. The BOM is stripped and the remaining text is unchanged.
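
For files on disk, the same cleanup can be scripted. The following is a minimal Python sketch, assuming the file is UTF-8 and small enough to read into memory; strip_utf8_bom is a hypothetical helper, not a function of the Whitespace Visualizer.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path) -> bool:
    """Remove a leading UTF-8 BOM in place. Return True if the file was changed."""
    target = Path(path)
    data = target.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False  # nothing to do, leave the file untouched
    # Drop only the three signature bytes; the rest is written back unchanged.
    target.write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

Working on bytes rather than decoded text avoids re-encoding the file, so line endings and every other character stay exactly as they were.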

Prevention

Configure your editor to save UTF-8 without BOM:

  • VS Code: "files.encoding": "utf8" (default, no BOM)
  • Notepad++: Encoding > UTF-8 (not "UTF-8 BOM")
  • Vim: :set nobomb
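
Editor settings only help on machines where they are applied, so a repository-wide check can catch files that slip through. Below is a hedged Python sketch of such a check; the function name files_with_bom and the default list of file patterns are assumptions chosen for illustration and should be adapted to the project.

```python
import codecs
from pathlib import Path

# Assumed set of file types where a BOM is known to cause trouble.
DEFAULT_PATTERNS = ("*.sh", "*.json", "*.php", "*.yml", "*.yaml", "*.csv")


def files_with_bom(root, patterns=DEFAULT_PATTERNS):
    """Return a sorted list of files under root that start with a UTF-8 BOM."""
    hits = []
    for pattern in patterns:
        for path in Path(root).rglob(pattern):
            if not path.is_file():
                continue
            with path.open("rb") as handle:
                # Only the first three bytes are needed to spot the signature.
                if handle.read(3) == codecs.BOM_UTF8:
                    hits.append(path)
    return sorted(hits)
```

Run from a pre-commit hook or CI job, a non-empty result can fail the build before a BOM reaches production.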

Use Case

A PHP developer's session management suddenly breaks after a colleague edits a config file on Windows. PHP reports that headers were already sent by the time session_start() runs. The Whitespace Visualizer reveals a BOM at the beginning of the PHP file, added by Notepad. Removing the BOM fixes the session issue.
