Base64 Size Increase (~33%)

Understand why Base64 encoding increases data size by approximately 33%. Learn the math behind the overhead, its impact, and strategies to minimize it.

Concept

Detailed Explanation

Base64-encoded output is always larger than its (non-empty) input. The overhead is approximately 33%, and this section explains exactly why and what you can do about it.

The math:

Base64 encodes 3 bytes of input into 4 characters of output. Each input byte is 8 bits, and each Base64 character represents 6 bits:

  • 3 bytes input = 24 bits
  • 24 bits / 6 bits per character = 4 Base64 characters
  • 4 output characters / 3 input bytes = 1.333... ratio

So the encoded output is 4/3 (approximately 1.333) times the size of the input, a 33.33% increase.
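The regrouping of 24 bits into four 6-bit values can be sketched in a few lines of Python. This is an illustration only, assuming the standard Base64 alphabet; the helper name encode_group is hypothetical, and real code should call the standard library's base64 module instead.

```python
import base64

# Standard Base64 alphabet: index 0-63 maps to one output character.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly 3 input bytes (24 bits) as 4 Base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly 3 bytes")
    bits = int.from_bytes(three_bytes, "big")  # one 24-bit integer
    # Slice the 24 bits into four 6-bit values, most significant first.
    return "".join(ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0))


print(encode_group(b"Man"))  # TWFu
# Cross-check against the standard library encoder.
assert encode_group(b"Man") == base64.b64encode(b"Man").decode("ascii")
```

Three bytes in, four characters out: the 4/3 ratio falls directly out of the 8-bit versus 6-bit grouping.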

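With padding: If the input length is not a multiple of 3, the final group is padded with one or two "=" characters so that the output length is always a multiple of 4. One leftover byte produces 2 characters plus "==", and two leftover bytes produce 3 characters plus "=". The formula for the exact encoded length (with padding) is: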

encodedLength = 4 * ceil(inputLength / 3)
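As a quick check, the formula can be compared against an actual encoder. The sketch below uses Python's standard base64 module; the function name encoded_length is an assumption made for this example, and integer arithmetic stands in for ceil().

```python
import base64


def encoded_length(input_length: int) -> int:
    """Padded Base64 length in characters for input_length bytes."""
    # (n + 2) // 3 is ceil(n / 3) in pure integer arithmetic.
    return 4 * ((input_length + 2) // 3)


for n in (0, 1, 2, 3, 4, 1024, 100_000):
    actual = len(base64.b64encode(bytes(n)))
    assert actual == encoded_length(n)
    padding = (3 - n % 3) % 3  # number of "=" characters appended
    print(f"{n:>7} bytes -> {actual:>7} chars ({padding} padding)")
```

For small inputs the padding dominates: a 1-byte input still produces 4 characters, so the 33% figure only holds once the input is more than a few bytes long.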

With line wrapping (MIME): MIME-formatted Base64 inserts a CRLF (2 bytes) after every 76 characters, so each full line occupies 78 bytes. The combined ratio is 4/3 * 78/76 (approximately 1.368), which raises the overhead to about 37%.
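The line-wrapping cost can be estimated the same way. This sketch assumes every line, including the last, ends with CRLF; some encoders omit the final line break, which changes the total by 2 bytes. The function names are hypothetical.

```python
def base64_length(n: int) -> int:
    """Padded Base64 length for n input bytes."""
    return 4 * ((n + 2) // 3)


def mime_base64_length(n: int, line_length: int = 76) -> int:
    """Base64 length with a CRLF after every line (assumed: last line too)."""
    encoded = base64_length(n)
    lines = (encoded + line_length - 1) // line_length  # ceil(encoded / 76)
    return encoded + 2 * lines


n = 1_000_000
total = mime_base64_length(n)
print(f"{n} bytes -> {total} bytes, overhead {(total - n) / n:.2%}")
# prints: 1000000 bytes -> 1368424 bytes, overhead 36.84%
```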

Practical impact:

Original Size    Base64 Size    Overhead
1 KB             1.33 KB        +0.33 KB
100 KB           133 KB         +33 KB
1 MB             1.33 MB        +0.33 MB
10 MB            13.3 MB        +3.3 MB

For small assets (icons, thumbnails), the overhead is negligible and often offset by eliminating an HTTP request. For large files, the overhead becomes a real cost: a 10 MB video becomes 13.3 MB, increasing bandwidth costs and transfer times.

Mitigating the overhead:

  1. Compression before encoding: Gzip or Brotli-compress the data before Base64 encoding. The compressed-then-encoded result is often much smaller than the Base64 of the raw data.

  2. Compression after encoding: When Base64 data is served over HTTP, the server's gzip/Brotli compression partially compensates. Base64 output uses only 64 distinct characters, so each 8-bit character carries just 6 bits of information; entropy coding recovers most of that slack, typically reducing the effective overhead to under 10%.

  3. Use binary protocols: If size is critical, use protocols that support binary data natively (Protocol Buffers, MessagePack, WebSocket binary frames) instead of text-based formats requiring Base64.

  4. Threshold-based inlining: Only Base64-encode assets below a size threshold (typically 4-10 KB). Larger assets should remain as separate files with proper caching.
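Strategies 1 and 4 are easy to sketch with Python's standard library. Everything named here is an assumption for illustration: the 8 KB threshold is simply one value inside the 4-10 KB range above, the helper names are hypothetical, and the receiver must know to Base64-decode and then gunzip the payload.

```python
import base64
import gzip

INLINE_THRESHOLD = 8 * 1024  # assumed cutoff; tune for your own assets


def encode_compressed(data: bytes) -> str:
    """Gzip first, then Base64, so the 4/3 factor applies to fewer bytes."""
    return base64.b64encode(gzip.compress(data)).decode("ascii")


def decode_compressed(text: str) -> bytes:
    """Reverse of encode_compressed: Base64-decode, then gunzip."""
    return gzip.decompress(base64.b64decode(text))


def should_inline(data: bytes, threshold: int = INLINE_THRESHOLD) -> bool:
    """Inline only assets strictly below the size threshold."""
    return len(data) < threshold


# A repetitive JSON-like payload stands in for compressible data.
payload = b'{"sku": 12345, "status": "in_stock"}\n' * 2000
plain = base64.b64encode(payload).decode("ascii")
packed = encode_compressed(payload)

assert decode_compressed(packed) == payload
print(f"raw: {len(payload)}  base64: {len(plain)}  gzip+base64: {len(packed)}")
print(f"inline this payload? {should_inline(payload)}")
```

The gain depends entirely on how compressible the data is. Already-compressed formats such as JPEG or MP4 shrink very little, so for those the first strategy brings almost no benefit.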

Common mistake: Comparing Base64 file sizes without considering transport compression. A 100 KB image embedded as Base64 in an HTML file served with Brotli may transfer nearly the same number of bytes as the image served separately, because the compressor removes most of the encoding overhead.
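One way to avoid this mistake is to measure the transferred size rather than the encoded size. The sketch below is a rough experiment, not a benchmark: random bytes stand in for an already-compressed image, and gzip from the standard library stands in for whatever compression the server applies.

```python
import base64
import gzip
import os


def overhead(size: int, original: int) -> float:
    """Relative size increase compared with the original byte count."""
    return (size - original) / original


raw = os.urandom(100 * 1024)          # stand-in for a 100 KB compressed image
encoded = base64.b64encode(raw)       # what gets embedded in the document
transferred = gzip.compress(encoded)  # what travels over the wire

print(f"Base64 overhead:        {overhead(len(encoded), len(raw)):.1%}")
print(f"After gzip on the wire: {overhead(len(transferred), len(raw)):.1%}")
```

Exact figures vary with the data, the compressor, and its settings, so it is worth running this kind of check against real assets before drawing conclusions.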

Use Case

Calculating the bandwidth cost of embedding product images as Base64 in an API response versus serving them from a CDN with separate URLs.
