Handling Missing and Empty Values in CSV
Learn how to handle empty fields, missing values, and null representations in CSV-to-JSON conversion. Covers different null strategies and default values.
Detailed Explanation
Empty and Missing Values in CSV
Real-world CSV files frequently contain missing data. Understanding how empty fields map to JSON values is essential for data integrity.
Types of missing data in CSV
id,name,email,phone,notes
1,Alice,alice@example.com,555-0100,Regular customer
2,Bob,,555-0200,
3,Carol,carol@example.com,,
4,,,555-0400,"Has no name on file"
In this CSV:
- Row 2:
emailis empty - Row 2:
notesis empty - Row 3:
phoneandnotesare empty - Row 4:
nameandemailare empty
JSON output options
Strategy 1: Empty strings (default)
{ "id": "2", "name": "Bob", "email": "", "phone": "555-0200", "notes": "" }
Strategy 2: Null values
{ "id": "2", "name": "Bob", "email": null, "phone": "555-0200", "notes": null }
Strategy 3: Omit empty fields
{ "id": "2", "name": "Bob", "phone": "555-0200" }
Choosing the right strategy
| Strategy | Best for | Drawback |
|---|---|---|
| Empty strings | Schema-strict APIs that require all fields | Cannot distinguish "empty" from "absent" |
| Null values | Databases with nullable columns | Some JSON parsers or APIs reject null |
| Omit fields | Document databases (MongoDB) | Inconsistent object shapes |
Special empty value representations
Some CSV files use sentinel values instead of truly empty cells:
name,age,registered
Alice,30,2024-01-15
Bob,N/A,NULL
Carol,,none
Values like N/A, NULL, none, -, and #N/A (Excel) may all represent "no data." A robust converter should let users configure which strings to treat as null:
const nullValues = new Set(["", "N/A", "NULL", "null", "none", "-", "#N/A"]);
const parsed = rawValue && !nullValues.has(rawValue) ? rawValue : null;
Trailing commas
A common source of phantom empty fields is trailing commas at the end of lines. The line Alice,30, has three fields where the third is empty. Some editors add trailing commas inconsistently, so always handle the possibility of extra empty trailing fields.
Use Case
Cleaning up a survey results CSV where respondents skipped optional questions, and the data must be loaded into a database that distinguishes between null (not answered) and empty string (answered but left blank).