Converting Large CSV Files: Performance Tips
Handle large CSV-to-JSON conversion efficiently in the browser. Covers streaming parsing, chunked processing, memory management, and web worker strategies.
Detailed Explanation
Performance with Large CSV Files
When CSV files grow beyond a few megabytes, naive parsing strategies can freeze the browser tab or crash due to memory exhaustion. Here are proven techniques for handling large datasets.
Size thresholds
| File size | Strategy | Notes |
|---|---|---|
| < 1 MB | Parse all at once | No special handling needed |
| 1-10 MB | Chunked processing | Split into batches, yield to UI between chunks |
| 10-100 MB | Streaming + Web Worker | Parse in a worker thread, stream results |
| > 100 MB | Server-side or specialized tool | Browser may not be suitable |
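As a rough sketch, the thresholds in the table can be encoded in a small helper that picks a strategy from a file's byte size (the function name `pickStrategy` and the returned labels are illustrative, not from any library):

```javascript
// Map a file size in bytes to one of the strategies from the table above.
// Thresholds mirror the table: 1 MB, 10 MB, 100 MB.
const MB = 1024 * 1024;

function pickStrategy(sizeInBytes) {
  if (sizeInBytes < 1 * MB) return "parse-all-at-once";
  if (sizeInBytes < 10 * MB) return "chunked";
  if (sizeInBytes < 100 * MB) return "worker-streaming";
  return "server-side";
}
```

In the browser this would typically be called as `pickStrategy(file.size)`, where `file` comes from an `<input type="file">` element.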
Chunked processing
Instead of parsing the entire file at once, process it in chunks of 1,000-5,000 rows, yielding control back to the browser between chunks:
```javascript
async function parseChunked(csv, chunkSize = 2000) {
  // Note: naive comma splitting; does not handle quoted fields or embedded commas.
  const lines = csv.split("\n");
  const headers = lines[0].split(",").map(h => h.trim());
  const results = [];
  for (let i = 1; i < lines.length; i += chunkSize) {
    const chunk = lines.slice(i, i + chunkSize);
    for (const line of chunk) {
      if (!line.trim()) continue; // skip blank lines, e.g. a trailing newline
      const values = line.split(",");
      const obj = {};
      headers.forEach((h, idx) => { obj[h] = values[idx]?.trim() ?? ""; });
      results.push(obj);
    }
    // Yield to the event loop so the UI remains responsive
    await new Promise(resolve => setTimeout(resolve, 0));
  }
  return results;
}
```
Web Worker approach
For files over 10 MB, move parsing to a Web Worker so the main thread stays responsive:
- Send the file's ArrayBuffer to the worker via postMessage.
- The worker decodes and parses the CSV.
- The worker sends back results in chunks via postMessage.
- The main thread assembles the final JSON output.
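The steps above can be sketched as follows. This is a minimal illustration, not a complete implementation: the worker file name `csv-worker.js`, the message shapes (`{ type: "chunk" }`, `{ type: "done" }`), and the helper names are all assumptions, and the parser is naive (no quoted-field handling):

```javascript
// Shared parsing helper (naive: no quoted fields or embedded commas).
function parseCsvToObjects(text) {
  const lines = text.split("\n").filter(l => l.trim() !== "");
  const headers = lines[0].split(",").map(h => h.trim());
  return lines.slice(1).map(line => {
    const values = line.split(",");
    const obj = {};
    headers.forEach((h, idx) => { obj[h] = values[idx]?.trim() ?? ""; });
    return obj;
  });
}

// Worker side (this branch only runs inside a Web Worker, where `self` is
// the worker global scope and `window` is undefined).
if (typeof self !== "undefined" && typeof window === "undefined") {
  self.onmessage = (e) => {
    const text = new TextDecoder().decode(e.data); // decode the ArrayBuffer
    const rows = parseCsvToObjects(text);
    const CHUNK = 5000;
    for (let i = 0; i < rows.length; i += CHUNK) {
      self.postMessage({ type: "chunk", rows: rows.slice(i, i + CHUNK) });
    }
    self.postMessage({ type: "done" });
  };
}

// Main-thread side: transfer the file's bytes to the worker and collect
// chunks as they arrive. (Defined but not called here; needs a browser.)
function startWorkerParse(file, onDone) {
  const worker = new Worker("csv-worker.js"); // hypothetical worker file name
  const results = [];
  worker.onmessage = (e) => {
    if (e.data.type === "chunk") results.push(...e.data.rows);
    else if (e.data.type === "done") { worker.terminate(); onDone(results); }
  };
  // Pass the buffer as a transferable so it is moved, not copied.
  file.arrayBuffer().then(buf => worker.postMessage(buf, [buf]));
}
```

Transferring the ArrayBuffer (the second argument to postMessage) avoids duplicating the file's bytes across threads, which matters at these sizes.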
Memory optimization
- Avoid storing the raw CSV string and parsed JSON simultaneously. Parse line by line, discarding each raw line after conversion.
- Use streaming output. Instead of building a single giant JSON array, output JSON Lines (one JSON object per line) which can be consumed incrementally.
- Limit preview rows. Show only the first 100 rows in the UI while processing the full file in the background.
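The streaming-output idea can be sketched as a generator that yields one JSON Lines record per row, so no giant array or string is ever materialized (the function name is illustrative):

```javascript
// Convert CSV text to JSON Lines one row at a time.
// Each yielded string is a complete, independently parseable JSON object,
// so a consumer can write it out and discard it immediately.
function* csvToJsonLines(csvText) {
  const lines = csvText.split("\n");
  const headers = lines[0].split(",").map(h => h.trim());
  for (let i = 1; i < lines.length; i++) {
    if (!lines[i].trim()) continue; // skip blank/trailing lines
    const values = lines[i].split(",");
    const obj = {};
    headers.forEach((h, idx) => { obj[h] = values[idx]?.trim() ?? ""; });
    yield JSON.stringify(obj);
  }
}
```

A consumer can append each yielded line to a download stream or Blob part as it arrives, instead of holding the whole output in memory.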
Browser-specific limits
Modern browsers typically allow 1-4 GB of heap memory per tab. A 50 MB CSV file can expand to 200-500 MB of JSON objects in memory due to JavaScript object overhead. Monitor memory usage with performance.memory (Chrome) and warn users when approaching limits.
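A sketch of such a check, using the non-standard Chrome-only performance.memory API and returning null where it is unavailable (the function names and the 80% threshold are illustrative choices, not browser requirements):

```javascript
// Heap usage as a fraction of the allowed limit, or null when the
// non-standard performance.memory API (Chrome) is unavailable.
function heapUsageFraction() {
  const mem = typeof performance !== "undefined" ? performance.memory : undefined;
  if (!mem || !mem.jsHeapSizeLimit) return null;
  return mem.usedJSHeapSize / mem.jsHeapSizeLimit;
}

// Example policy: warn the user above 80% of the heap limit.
function shouldWarnUser(threshold = 0.8) {
  const frac = heapUsageFraction();
  return frac !== null && frac > threshold;
}
```

Calling `shouldWarnUser()` between processing chunks lets the app pause and prompt the user before the tab runs out of memory; in non-Chrome browsers the check simply reports nothing.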
Use Case
Building a data analysis dashboard where analysts upload multi-million-row CSV log files directly in the browser and need immediate feedback without page freezes.