Gzip Compression Basics — How It Works

Learn how gzip compression works, what the DEFLATE algorithm does, and why gzip is the most widely used compression method on the web.


Detailed Explanation

How Gzip Compression Works

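Gzip is the most widely deployed compression format on the web. Browsers advertise support with the Accept-Encoding: gzip request header, and servers mark compressed responses with the Content-Encoding: gzip response header. Every major browser and web server supports both.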
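The header exchange can be sketched in a few lines of Python. This is a minimal illustration, not a production negotiator: the function name encode_body is hypothetical, and the parsing deliberately ignores q-values such as "gzip;q=0".

```python
import gzip

def encode_body(body, accept_encoding):
    """Gzip the body only if the client lists gzip in Accept-Encoding.

    Hypothetical helper for illustration; q-values are ignored.
    """
    offered = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    # Tell caches that the response depends on the request's Accept-Encoding.
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in offered:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return body, headers

html = b"<p>hello gzip</p>" * 200
payload, headers = encode_body(html, "gzip, deflate, br")
print(headers, len(html), "->", len(payload))
```

A client that sends no gzip token gets the body back unchanged and no Content-Encoding header.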

The DEFLATE Algorithm

At its core, gzip uses the DEFLATE algorithm, which combines two techniques:

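  1. LZ77 (Lempel-Ziv 77): Finds repeated sequences of bytes in the input and replaces each repeat with a back-reference, a (length, distance) pair that points at an earlier copy within the preceding 32 KB of data. For example, if the string "background-color" appears 50 times in a CSS file, the first occurrence is stored literally and later occurrences become short references to an earlier copy, provided one lies within that window.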

  2. Huffman Coding: Assigns shorter bit patterns to more frequently occurring symbols. The symbols are the literal bytes and back-references left after the LZ77 step, so common characters like e, t, and space get shorter codes, while rare characters get longer ones.
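Both techniques can be illustrated with a toy sketch in Python. This is not the real DEFLATE encoder: the matcher is a slow brute-force greedy search, the Huffman step only computes code lengths, and the function names are made up for this example. The window size and match-length limits mirror DEFLATE's (32 KB, 3 to 258 bytes).

```python
import heapq
from collections import Counter

def lz77_tokens(data, window=32 * 1024, min_len=3, max_len=258):
    """Greedy toy LZ77: emit literal bytes or (length, distance) pairs."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Brute-force search of the window for the longest earlier match.
        for j in range(max(0, i - window), i):
            length = 0
            while (length < max_len and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_len:
            tokens.append((best_len, best_dist))
            i += best_len
        else:
            tokens.append(data[i])  # literal byte
            i += 1
    return tokens

def lz77_expand(tokens):
    """Rebuild the original bytes from literals and back-references."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, int):
            out.append(tok)
        else:
            length, dist = tok
            for _ in range(length):  # byte by byte, so overlapping copies work
                out.append(out[-dist])
    return bytes(out)

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tiebreak, {symbol: depth so far}).
    heap = [(w, n, {s: 0}) for n, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

css = b"a{background-color:red}b{background-color:blue}"
tokens = lz77_tokens(css)
print(tokens)

lengths = huffman_code_lengths(b"the tree then the street")
for byte, bits in sorted(lengths.items(), key=lambda kv: kv[1]):
    print(repr(chr(byte)), bits)
```

In the token list, the second rule's "{background-color:" shows up as a single (length, distance) pair instead of 18 literal bytes, and in the Huffman output the frequent letters e and t get codes no longer than the rare n and s.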

Gzip vs Raw DEFLATE

The gzip format wraps DEFLATE output with:

Component         Size      Purpose
Magic number      2 bytes   \x1f\x8b identifier
Header            8+ bytes  Compression method, timestamp, OS
Compressed data   Variable  DEFLATE output
CRC-32 checksum   4 bytes   Integrity verification
Original size     4 bytes   Uncompressed length (mod 2³²)

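With no optional fields, this adds 18 bytes of overhead: a 10-byte header (including the magic number) plus an 8-byte trailer. Optional fields such as the original filename add a few more bytes. Either way, the overhead is negligible for any file larger than a few hundred bytes.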
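The wrapper can be inspected directly with Python's standard library. The sketch below assumes the stream carries no optional header fields, which is the case for gzip.compress on in-memory bytes, so the header is exactly 10 bytes. The variable names are chosen for this example.

```python
import gzip
import struct
import zlib

data = b"hello, gzip! " * 100
blob = gzip.compress(data)

# Fixed header: magic (2), method (1), flags (1), mtime (4),
# extra flags (1), OS (1). Multi-byte fields are little-endian.
magic, method, flags, mtime, xfl, os_id = struct.unpack("<2sBBIBB", blob[:10])

# Trailer: CRC-32 of the uncompressed data, then its length mod 2**32.
crc32, isize = struct.unpack("<II", blob[-8:])

# Everything in between is the raw DEFLATE stream (valid only when flags == 0).
raw_deflate = blob[10:-8]
restored = zlib.decompress(raw_deflate, wbits=-15)  # negative wbits = raw DEFLATE

print("magic:", magic, "method:", method, "flags:", flags)
print("crc ok:", crc32 == zlib.crc32(data), "size ok:", isize == len(data) % 2**32)
print("wrapper overhead:", len(blob) - len(raw_deflate), "bytes")
```

Method 8 is the identifier for DEFLATE, the only compression method in common use with gzip.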

Compression Levels

Gzip supports levels 1–9:

  • Level 1: Fastest compression, lowest ratio (~60% reduction for text)
  • Level 6: Default balance of speed and ratio (~70% reduction)
  • Level 9: Slowest compression, highest ratio (~72% reduction)

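The difference between level 6 and level 9 is typically only 1–3% more compression, but level 9 can take 2–4x longer. Level 6 is the default for the gzip command-line tool and the zlib library. Web server defaults vary (nginx, for example, defaults to level 1), so check your server's configuration.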
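The trade-off is easy to measure on your own data. The following sketch times Python's gzip.compress at levels 1, 6, and 9. The sample text is synthetic and the helper name compare_levels is hypothetical, so treat the numbers it prints as illustrative. Real assets will give different ratios and timings.

```python
import gzip
import random
import time

def compare_levels(data, levels=(1, 6, 9)):
    """Return {level: (compressed size in bytes, seconds taken)}."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        out = gzip.compress(data, compresslevel=level)
        results[level] = (len(out), time.perf_counter() - start)
    return results

# Synthetic, repetitive text standing in for a real CSS/HTML/JS asset.
rng = random.Random(0)
words = ["gzip", "deflate", "header", "browser", "server", "cache", "color", "margin"]
sample = " ".join(rng.choice(words) for _ in range(50_000)).encode()

for level, (size, seconds) in compare_levels(sample).items():
    reduction = 100 * (1 - size / len(sample))
    print(f"level {level}: {size} bytes ({reduction:.1f}% smaller), {seconds * 1000:.1f} ms")
```

To test a real file, replace sample with the file's bytes, for example the result of open(path, "rb").read().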

Use Case

Gzip fundamentals are essential knowledge for any web developer. Understanding them helps you make informed decisions about asset delivery, server configuration, and performance optimization.
