Handling Excel-Generated CSV Files
Deal with CSV quirks from Microsoft Excel exports: BOM characters, locale-specific delimiters, date serial numbers, formula artifacts, and encoding issues.
Detailed Explanation
Excel CSV Export Quirks
Microsoft Excel is the world's most common source of CSV files, and it introduces several non-standard behaviors that trip up parsers.
Excel-specific issues
1. UTF-8 BOM (Byte Order Mark)
When you "Save As > CSV UTF-8" in Excel, the file starts with three invisible bytes (EF BB BF). These bytes cause the first column header to be read as \uFEFFname instead of name:
// Before: "\uFEFFname,age,city"
const clean = rawText.replace(/^\uFEFF/, "");
// After: "name,age,city"
2. Locale-dependent delimiters
Excel uses the system's list separator as the CSV delimiter. In the US/UK this is a comma, but in Germany, France, and many other countries it is a semicolon. A file saved as "CSV" in German Excel uses ; not ,.
3. Date formatting
Excel may export dates as:
- Formatted strings:
"1/15/2024"(depends on system locale) - Serial numbers:
45306(if the cell was formatted as General before saving) - Mixed: some dates as strings, others as numbers in the same column
4. Formula artifacts
Cells containing formulas are exported as their calculated values, but sometimes artifacts remain:
="001234"— Excel uses a leading equals-quote pattern to force text interpretation#REF!,#N/A,#DIV/0!— error values appear as literal strings- Leading apostrophes may appear for text-forced numbers
5. Newlines in cells
Excel allows cell values to contain Alt+Enter line breaks. These are preserved in CSV as literal newlines within quoted fields:
name,address
Alice,"123 Main St
Apt 4B
Springfield, IL"
Robustness checklist for Excel CSV
- Strip UTF-8 BOM
- Auto-detect delimiter (comma vs semicolon)
- Handle doubled quotes within quoted fields
- Support multiline quoted fields
- Detect and convert Excel serial dates
- Trim formula prefixes (
=", trailing") - Map Excel error values (
#N/A,#REF!) to null
Testing your parser
Always test with CSV files saved from Excel on different operating systems and locales. A CSV from Excel on macOS may differ from one saved on Windows, and both differ from LibreOffice exports.
Use Case
Building a CSV import wizard for a SaaS application where 80% of uploaded files come from Excel, and the importer must handle all common Excel export quirks without user intervention.