Convert Data with Unicode and Special Characters

Handle Unicode characters, emoji, accented letters, and other non-ASCII content when converting between TSV and CSV formats.

Detailed Explanation

Unicode and Special Characters

Modern data frequently contains non-ASCII characters such as accented letters, CJK characters, emoji, and mathematical symbols. The TSV/CSV converter passes this content through unchanged because it operates on JavaScript strings, which represent Unicode text natively.

Example: International Data (TSV)

Name	City	Notes
Jean-Pierre Dupré	Paris	Café owner ☕
田中太郎	東京	デベロッパー 💻
María García	México City	Estudiante 🌟
Михаил Иванов	Москва	Инженер

Generated CSV Output

Name,City,Notes
Jean-Pierre Dupré,Paris,Café owner ☕
田中太郎,東京,デベロッパー 💻
María García,México City,Estudiante 🌟
Михаил Иванов,Москва,Инженер
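
The conversion above can be sketched in a few lines of JavaScript. This is only an illustration, not the converter's actual source: the helper name tsvToCsvSimple is hypothetical, and the sketch assumes unquoted TSV input whose fields contain no commas, quotes, or line breaks (true for the sample rows).

```javascript
// Hypothetical helper, not the tool's real implementation.
// Assumes fields contain no commas, quotes, or line breaks,
// so no quoting step is needed in this simplified sketch.
function tsvToCsvSimple(tsv) {
  return tsv
    .split(/\r?\n/)                          // one row per line
    .map((row) => row.split("\t").join(",")) // swap tab for comma
    .join("\n");
}

const sampleTsv = [
  "Name\tCity\tNotes",
  "Jean-Pierre Dupré\tParis\tCafé owner ☕",
  "田中太郎\t東京\tデベロッパー 💻",
].join("\n");

const sampleCsv = tsvToCsvSimple(sampleTsv);
console.log(sampleCsv);
```

Because the split and join operate only on the tab character, accented letters, CJK text, and emoji are copied through untouched.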

Character Encoding

The converter processes text as JavaScript strings, which use UTF-16 internally. This means:

  • All Unicode code points are supported, including supplementary plane characters (emoji, rare CJK characters), which UTF-16 stores as surrogate pairs
  • No mojibake: characters are not corrupted during conversion
  • BOM handling: if your input starts with a UTF-8 BOM (byte order mark, which appears in a JavaScript string as U+FEFF), it is preserved
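
The points above can be checked directly in a JavaScript console. The snippet below is a small illustration using standard string methods; the helper startsWithBom is a hypothetical name introduced here, not part of the converter.

```javascript
const laptop = "💻"; // U+1F4BB, a supplementary plane character

console.log(laptop.length);                      // 2 (UTF-16 code units, one surrogate pair)
console.log([...laptop].length);                 // 1 (code points)
console.log(laptop.codePointAt(0).toString(16)); // "1f4bb"

// Splitting on a tab never cuts a surrogate pair in half,
// because the tab is a single code unit of its own.
const fields = "デベロッパー 💻\t東京".split("\t");

// Hypothetical helper: a decoded BOM shows up as U+FEFF at index 0.
function startsWithBom(text) {
  return text.charCodeAt(0) === 0xfeff;
}

console.log(startsWithBom("\uFEFFName\tCity")); // true
console.log(startsWithBom("Name\tCity"));       // false
```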

Special Characters That Trigger Quoting

Only the following characters trigger quoting in the output:

  • The target delimiter (comma or tab)
  • The quote character (double or single quote)
  • Newline characters (\n or \r)

Non-ASCII characters such as é, ñ, ü, å, CJK characters, and emoji do not trigger quoting because they are not syntactically significant in CSV/TSV.
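
A quoting rule of this kind might look like the following sketch. The function names needsQuoting and quoteField are hypothetical, and escaping an embedded quote by doubling it is the common CSV convention (as in RFC 4180), assumed here rather than confirmed for this tool.

```javascript
// Hypothetical helpers illustrating the rule described above.
function needsQuoting(field, delimiter, quote = '"') {
  return (
    field.includes(delimiter) || // target delimiter
    field.includes(quote) ||     // quote character
    field.includes("\n") ||      // line breaks
    field.includes("\r")
  );
}

function quoteField(field, delimiter, quote = '"') {
  if (!needsQuoting(field, delimiter, quote)) return field;
  // Assumed convention: double any embedded quote, then wrap the field.
  return quote + field.split(quote).join(quote + quote) + quote;
}

console.log(quoteField("Café owner ☕", ","));  // Café owner ☕
console.log(quoteField("Paris, France", ","));  // "Paris, France"
console.log(quoteField('say "hi"', ","));       // "say ""hi"""
```

Only the three syntactically significant character classes are tested, so a field made up entirely of emoji or CJK text is emitted without quotes.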

Downloaded File Encoding

When you use the Download button, the file is saved as UTF-8. Most modern applications (Excel 2016+, Google Sheets, LibreOffice) handle UTF-8 CSV files correctly.
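
As an illustration of what saving as UTF-8 involves, the sketch below encodes a string to UTF-8 bytes with the standard TextEncoder API. The helper encodeUtf8 and its withBom option are hypothetical and are not taken from the tool; the option is included only because some spreadsheet applications are reported to detect UTF-8 more reliably when a BOM is present.

```javascript
// Hypothetical helper: encode text as UTF-8, optionally prefixed with a BOM.
function encodeUtf8(text, { withBom = false } = {}) {
  const body = new TextEncoder().encode(text); // TextEncoder always produces UTF-8
  if (!withBom) return body;
  const out = new Uint8Array(body.length + 3);
  out.set([0xef, 0xbb, 0xbf], 0); // UTF-8 byte order mark
  out.set(body, 3);
  return out;
}

console.log(encodeUtf8("é").length);  // 2 bytes
console.log(encodeUtf8("💻").length); // 4 bytes
console.log(Array.from(encodeUtf8("a", { withBom: true }))); // [239, 187, 191, 97]
```

In a browser, bytes like these could be wrapped in a Blob with the type "text/csv;charset=utf-8" to offer them as a download.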

Use Case

Converting international customer data, multilingual content databases, or any dataset containing non-ASCII characters between TSV and CSV while preserving all Unicode characters correctly.
