Optimizing Bulk JSON to SQL INSERT Performance
Learn performance optimization techniques for bulk inserting JSON data into SQL databases. Covers batch sizing, transaction wrapping, COPY commands, index management, and parallelism.
Detailed Explanation
Bulk Insert Performance Optimization
When converting large JSON datasets (thousands to millions of records) into SQL, naive row-by-row INSERT is prohibitively slow. Several techniques can improve import speed by 10x to 100x.
Batch INSERT Sizing
Instead of one INSERT per row, batch rows into multi-value statements:
-- Slow: one INSERT per row
INSERT INTO events (id, type) VALUES (1, 'click');
INSERT INTO events (id, type) VALUES (2, 'view');
INSERT INTO events (id, type) VALUES (3, 'click');
-- Fast: batched INSERT
INSERT INTO events (id, type) VALUES
(1, 'click'),
(2, 'view'),
(3, 'click');
Optimal batch size varies by database but is typically 500-5,000 rows per statement; very large batches risk hitting server limits such as MySQL's max_allowed_packet.
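As a sketch, the JSON-to-batched-INSERT step might look like this in Python (the function and parameter names here are illustrative, not part of any particular converter):

```python
import json

def sql_literal(value):
    """Render a Python value as a SQL literal; single quotes are doubled."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"

def batched_inserts(json_text, table, columns, batch_size=1000):
    """Yield one multi-value INSERT statement per batch_size rows
    of a JSON array of objects."""
    rows = json.loads(json_text)
    for start in range(0, len(rows), batch_size):
        values = ",\n".join(
            "(" + ", ".join(sql_literal(r[c]) for c in columns) + ")"
            for r in rows[start:start + batch_size]
        )
        yield f"INSERT INTO {table} ({', '.join(columns)}) VALUES\n{values};"
```

Tuning batch_size trades statement size against round trips; the 500-5,000 range above is a reasonable starting point.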
Transaction Wrapping
Wrap all inserts in a single transaction to avoid per-statement commit overhead:
BEGIN;
INSERT INTO events (id, type) VALUES (1, 'click'), (2, 'view'), ...;
INSERT INTO events (id, type) VALUES (5001, 'click'), (5002, 'view'), ...;
COMMIT;
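The effect of a single enclosing transaction is easy to demonstrate with Python's built-in sqlite3 module, used here as a stand-in for a server database (the same pattern applies with PostgreSQL and MySQL drivers, which likewise defer the commit until told otherwise):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, type TEXT)")

rows = [(i, "click" if i % 2 else "view") for i in range(1, 10001)]

# One transaction for the entire load: executemany batches the statements,
# and nothing is committed until the with-block exits (BEGIN ... COMMIT,
# or ROLLBACK if an exception is raised).
with conn:
    conn.executemany("INSERT INTO events (id, type) VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Committing once per load instead of once per statement removes the per-commit fsync overhead that dominates naive imports.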
COPY Command (PostgreSQL)
For maximum speed, PostgreSQL's COPY command streams rows from a file or STDIN in CSV, text, or binary format (shown here as it would be entered in psql):
COPY events (id, type, timestamp) FROM STDIN WITH (FORMAT csv);
1,click,2024-06-15
2,view,2024-06-15
\.
The converter can generate COPY-compatible output from JSON arrays.
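A minimal sketch of that conversion, assuming a JSON array of objects (the function name is illustrative; with psycopg2, the resulting buffer could be fed to cursor.copy_expert):

```python
import csv
import io
import json

def json_to_copy_csv(json_text, columns):
    """Turn a JSON array of objects into CSV text suitable for
    COPY tbl (cols) FROM STDIN WITH (FORMAT csv).
    Missing keys become empty fields, which COPY's csv format
    reads as NULL by default."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in json.loads(json_text):
        writer.writerow([row.get(c) for c in columns])
    return buf.getvalue()
```

Letting the csv module handle quoting avoids the escaping bugs that hand-rolled string concatenation invites.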
LOAD DATA INFILE (MySQL)
MySQL's equivalent reads a CSV file from disk (subject to the server's secure_file_priv setting):
LOAD DATA INFILE '/tmp/events.csv'
INTO TABLE events
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
Index and Constraint Management
Dropping secondary indexes before the load (and rebuilding them afterward) and temporarily disabling triggers and constraint checks can dramatically improve speed. Note that in PostgreSQL, DISABLE TRIGGER ALL also suspends foreign-key enforcement and requires elevated privileges:
-- Before bulk insert
ALTER TABLE events DISABLE TRIGGER ALL; -- PostgreSQL
SET FOREIGN_KEY_CHECKS = 0; -- MySQL
-- ... bulk inserts ...
-- After bulk insert
ALTER TABLE events ENABLE TRIGGER ALL; -- PostgreSQL
SET FOREIGN_KEY_CHECKS = 1; -- MySQL
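The drop-and-rebuild pattern for indexes can be sketched with sqlite3 (absolute timings aside, the shape is the same on any database: one index build over the finished table is cheaper than thousands of incremental index updates):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, type TEXT)")
conn.execute("CREATE INDEX idx_events_type ON events (type)")

rows = [(i, "click") for i in range(1, 50001)]

# Drop the secondary index, bulk insert inside one transaction,
# then rebuild the index once at the end.
conn.execute("DROP INDEX idx_events_type")
with conn:
    conn.executemany("INSERT INTO events (id, type) VALUES (?, ?)", rows)
conn.execute("CREATE INDEX idx_events_type ON events (type)")

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
indexes = [r[1] for r in conn.execute("PRAGMA index_list(events)")]
```

Remember to re-validate constraints after re-enabling them; disabled checks mean the load itself enforces nothing.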
Performance Comparison
| Method | ~Rows/second (PostgreSQL) |
|---|---|
| Individual INSERTs | 100-500 |
| Batched INSERTs (1000 rows) | 10,000-50,000 |
| COPY command | 100,000-500,000 |
Use Case
When loading a 500MB JSON export from an analytics service into a PostgreSQL data warehouse for reporting, bulk insert optimization reduces import time from hours to minutes. This is critical for nightly ETL jobs and data pipeline reliability.
Related Topics
Convert a JSON Array to Multi-Row INSERT Statements (Basic INSERT)
Converting JSON Arrays into Multiple SQL Rows (Batch Operations)
Converting CSV-Structured JSON to SQL INSERT Statements (Real-World Scenarios)
Generating SQL UPSERT (INSERT ON CONFLICT) from JSON (Advanced Patterns)
Converting REST API JSON Responses to SQL INSERT Statements (Real-World Scenarios)