Choosing the Right Batch Size for Your Database
A guide to selecting the optimal batch size for MySQL, PostgreSQL, SQLite, and SQL Server based on row width (column count and data size) and database configuration.
Detailed Explanation
Batch Size Selection Guide
The right batch size depends on three factors: your database engine, the width of each row (number of columns and string lengths), and your database configuration.
Decision Matrix
| Scenario | Recommended Batch Size |
|---|---|
| SQLite, wide tables (15+ columns) | 25-50 |
| SQLite, narrow tables (< 10 columns) | 100 |
| MySQL, default config | 100-500 |
| MySQL, increased max_allowed_packet | 500-1000 |
| PostgreSQL, any config | 500-1000 |
| SQL Server | Up to 1000 rows per INSERT ... VALUES statement (hard limit) |
Calculating the Limit
For SQLite, the constraint is SQLITE_MAX_VARIABLE_NUMBER (999 by default in SQLite versions before 3.32.0; 32766 in later versions):
max_rows = floor(999 / number_of_columns)
For a table with 12 columns: floor(999 / 12) = 83 rows maximum.
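The SQLite calculation can be sketched as a small helper; the function name is an assumption, and the default of 999 bound variables only applies to older SQLite builds (check your build's actual SQLITE_MAX_VARIABLE_NUMBER):

```python
def max_rows_per_batch(num_columns: int, max_vars: int = 999) -> int:
    """Largest batch whose total bound parameters stay within the limit.

    Each row consumes num_columns placeholders, so floor division
    gives the maximum number of rows per multi-row INSERT.
    """
    return max_vars // num_columns


print(max_rows_per_batch(12))  # floor(999 / 12) -> 83
print(max_rows_per_batch(9))   # a narrow table fits more rows per batch
```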
For MySQL, the constraint is max_allowed_packet (4MB by default in MySQL 5.7 and earlier; 64MB in MySQL 8.0):
estimated_packet_size = rows * avg_row_bytes
A row with 8 columns averaging 50 bytes each = 400 bytes/row. At a 4MB limit: floor(4,194,304 / 400) = 10,485 rows.
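The MySQL estimate follows the same pattern. This is a rough sketch, not an exact packet-size formula: avg_row_bytes is a measurement you must supply, the 4MB default only applies to older MySQL versions, and the real packet also carries SQL text and protocol overhead, so leave headroom:

```python
def max_rows_for_packet(avg_row_bytes: int,
                        max_allowed_packet: int = 4 * 1024 * 1024) -> int:
    """Rough upper bound on rows per batch for a given packet limit."""
    return max_allowed_packet // avg_row_bytes


# 8 columns averaging 50 bytes each = 400 bytes/row
print(max_rows_for_packet(400))  # floor(4,194,304 / 400) -> 10485
```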
Practical Tips
- Start conservative: Begin with 100, increase if no errors
- Monitor errors: "Packet too large" or "too many SQL variables" means reduce batch size
- Test in staging: Run the exact SQL against a staging database first
- Consider transactions: Wrap batches in BEGIN/COMMIT for consistency
- Network matters: Larger batches save more network round-trips for remote databases
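The tips above (a conservative computed batch size plus per-batch BEGIN/COMMIT) can be combined in one routine. A minimal sketch using Python's sqlite3; the helper name and table are hypothetical, and 999 is only the old default variable limit:

```python
import sqlite3
from itertools import islice

SQLITE_MAX_VARS = 999  # default in SQLite < 3.32.0; newer builds allow more


def batched_insert(conn, insert_sql, rows, num_columns):
    """Insert rows in batches sized to stay within the variable limit,
    wrapping each batch in its own transaction."""
    batch_size = SQLITE_MAX_VARS // num_columns
    it = iter(rows)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            break
        with conn:  # sqlite3 context manager issues BEGIN/COMMIT per batch
            conn.executemany(insert_sql, chunk)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT)")
rows = [(i, f"user{i}", f"user{i}@example.com") for i in range(500)]
batched_insert(conn, "INSERT INTO users VALUES (?, ?, ?)", rows, num_columns=3)
```

With 3 columns the batch size works out to 333 rows, so the 500-row load runs as two transactions instead of 500 single-row statements.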
Use Case
Your DBA reports that bulk imports occasionally fail with max_allowed_packet errors in MySQL. By applying the batch size calculation, you can choose a batch size that stays within the configured packet size.