ZIP File Signature — Magic Bytes

Learn the ZIP file magic bytes (PK header) and how to identify ZIP archives in hex dumps. Covers local file headers, central directory, and related formats.

File Signatures

Hex

50 4B 03 04

ASCII

PK..

Detailed Explanation

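ZIP files normally begin with the signature 50 4B 03 04, which is the ASCII text "PK" followed by the bytes 03 04. "PK" stands for Phil Katz, the creator of the ZIP format and the PKZIP compression utility. This four-byte signature marks the beginning of a local file header, which is the first structure in most ZIP archives. ZIP readers actually locate an archive from the end-of-central-directory record near the end of the file, so other data, such as a self-extractor stub, can precede the first local file header.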

The PK signature family:

The ZIP format uses several different "PK" signatures to mark different internal structures:

Hex          ASCII  Structure
50 4B 03 04  PK..   Local file header (start of a file entry)
50 4B 01 02  PK..   Central directory file header
50 4B 05 06  PK..   End of central directory record
50 4B 07 08  PK..   Data descriptor
50 4B 06 06  PK..   ZIP64 end of central directory
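
The table above maps naturally onto a lookup keyed by the raw bytes. The following is a minimal sketch; the names `PK_SIGNATURES` and `describe_signature` are illustrative and not part of any library.

```python
# Signatures from the table above, keyed by their raw bytes.
PK_SIGNATURES = {
    b"PK\x03\x04": "Local file header",
    b"PK\x01\x02": "Central directory file header",
    b"PK\x05\x06": "End of central directory record",
    b"PK\x07\x08": "Data descriptor",
    b"PK\x06\x06": "ZIP64 end of central directory",
}


def describe_signature(data: bytes) -> str:
    """Return the structure name for the first four bytes, or 'unknown'."""
    return PK_SIGNATURES.get(bytes(data[:4]), "unknown")


if __name__ == "__main__":
    print(describe_signature(bytes.fromhex("504B0304")))
    print(describe_signature(b"\x89PNG\r\n\x1a\n"))
```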

Local file header structure:

After the 4-byte signature, the local file header contains the following little-endian fields (byte offsets are counted from the start of the header):

  • Bytes 4-5: Version needed to extract
  • Bytes 6-7: General purpose bit flags
  • Bytes 8-9: Compression method (08 00 = Deflate, 00 00 = Stored)
  • Bytes 10-13: Last modification time (bytes 10-11) and date (bytes 12-13), both in MS-DOS format
  • Bytes 14-17: CRC-32 checksum
  • Bytes 18-21: Compressed size
  • Bytes 22-25: Uncompressed size
  • Bytes 26-27: Filename length
  • Bytes 28-29: Extra field length
  • Variable: Filename (IBM code page 437 by default, UTF-8 when general purpose flag bit 11 is set), followed by the extra field
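
The field layout above can be decoded with Python's `struct` module. This is a sketch under the assumption that the buffer starts exactly at a local file header; `parse_local_header` and the dictionary keys are hypothetical names chosen for illustration.

```python
import struct

# Little-endian layout of the 30-byte fixed part of a local file header:
# signature, version, flags, method, time, date, CRC-32,
# compressed size, uncompressed size, filename length, extra field length.
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")


def parse_local_header(data: bytes) -> dict:
    """Decode the fixed fields and the filename of a local file header."""
    if len(data) < LOCAL_HEADER.size:
        raise ValueError("need at least 30 bytes")
    (sig, version, flags, method, mod_time, mod_date,
     crc32, comp_size, uncomp_size, name_len, extra_len) = LOCAL_HEADER.unpack_from(data)
    if sig != b"PK\x03\x04":
        raise ValueError("not a local file header")
    raw_name = data[LOCAL_HEADER.size:LOCAL_HEADER.size + name_len]
    # Flag bit 11 marks a UTF-8 filename; otherwise assume code page 437.
    encoding = "utf-8" if flags & 0x0800 else "cp437"
    return {
        "version_needed": version,
        "flags": flags,
        "method": method,  # 0 = Stored, 8 = Deflate
        "crc32": crc32,
        "compressed_size": comp_size,
        "uncompressed_size": uncomp_size,
        "filename": raw_name.decode(encoding, errors="replace"),
        "extra_length": extra_len,
    }


if __name__ == "__main__":
    # Hand-built sample header for a Deflate entry named "a.txt".
    sample = LOCAL_HEADER.pack(b"PK\x03\x04", 20, 0, 8, 0, 0, 0, 5, 10, 5, 0) + b"a.txt"
    print(parse_local_header(sample))
```

Note that when flag bit 3 is set, the CRC-32 and size fields in the local header may be zero and the real values follow the file data in a data descriptor, so a parser like this one should not rely on them blindly.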

Formats that use ZIP internally:

Many modern file formats are actually ZIP archives with a different extension:

  • .docx, .xlsx, .pptx — Microsoft Office Open XML documents
  • .odt, .ods, .odp — OpenDocument format files
  • .jar — Java archive files
  • .apk — Android application packages
  • .epub — Electronic publication books

All of these normally start with the 50 4B 03 04 signature, so the signature alone cannot tell them apart. You can rename any of them to .zip and open them with a standard ZIP utility.
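
Because the leading bytes are the same for every format in this list, telling them apart means looking at the entries inside the archive. The sketch below is a heuristic only: it assumes the usual marker entries (a `mimetype` entry in OpenDocument and EPUB files, `[Content_Types].xml` in Office Open XML, `AndroidManifest.xml` in APKs, `META-INF/MANIFEST.MF` in JARs), and `guess_zip_flavor` is a hypothetical helper name.

```python
import io
import zipfile


def guess_zip_flavor(source) -> str:
    """Guess the container format of a ZIP-based file from its entry names."""
    with zipfile.ZipFile(source) as zf:
        names = set(zf.namelist())
        # OpenDocument and EPUB store their media type in a "mimetype" entry.
        if "mimetype" in names:
            return zf.read("mimetype").decode("ascii", "replace").strip()
        if "[Content_Types].xml" in names:
            for prefix, label in (("word/", "docx"), ("xl/", "xlsx"), ("ppt/", "pptx")):
                if any(n.startswith(prefix) for n in names):
                    return label
            return "Office Open XML (unknown kind)"
        if "AndroidManifest.xml" in names:
            return "apk"
        if "META-INF/MANIFEST.MF" in names:
            return "jar"
        return "zip"


if __name__ == "__main__":
    # Build a tiny EPUB-like archive in memory to exercise the function.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip")
    buf.seek(0)
    print(guess_zip_flavor(buf))
```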

Empty and spanned archives:

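An empty ZIP archive consists of only the end-of-central-directory record, so it starts with 50 4B 05 06. Spanned or split (multi-part) archives may start with the spanning signature 50 4B 07 08, the same bytes used for the data descriptor, followed immediately by the first local file header. When such an archive ends up needing only one segment, the first four bytes may instead be the temporary spanning marker 50 4B 30 30 ("PK00"). Tools should check for these variants when validating ZIP files.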
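
A validator that only accepts 50 4B 03 04 will reject the empty and spanned variants described above. Here is a minimal sketch of a leading-bytes check that accepts them; the names `ZIP_LEADING_SIGNATURES` and `classify_zip_start` are illustrative, and a match on four bytes does not prove that the rest of the file is a valid archive.

```python
import io
import zipfile

# Leading four bytes that a ZIP file may legitimately start with.
ZIP_LEADING_SIGNATURES = {
    b"PK\x03\x04": "regular archive (local file header)",
    b"PK\x05\x06": "empty archive (end of central directory record)",
    b"PK\x07\x08": "spanned archive",
}


def classify_zip_start(data: bytes):
    """Return a description of the leading signature, or None if unrecognized."""
    return ZIP_LEADING_SIGNATURES.get(bytes(data[:4]))


if __name__ == "__main__":
    # An archive written with no entries contains only the end record.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w"):
        pass
    print(buf.getvalue()[:4].hex(" "), "->", classify_zip_start(buf.getvalue()))
```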

Detection in hex editors:

When analyzing unknown binary data, searching for 50 4B 03 04 is one of the most common operations. Because so many formats are ZIP-based, this signature appears frequently in forensic investigations, malware analysis, and data recovery scenarios.
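
The same search can be scripted when a hex editor is not practical. The sketch below scans an in-memory buffer for every occurrence of the signature; `find_local_headers` is a hypothetical helper, and each hit is only a candidate, since the four bytes can also occur by chance inside compressed data.

```python
def find_local_headers(data: bytes, signature: bytes = b"PK\x03\x04") -> list:
    """Return the offsets of every occurrence of the signature in data."""
    offsets = []
    pos = data.find(signature)
    while pos != -1:
        offsets.append(pos)
        # Resume one byte later so that no occurrence is skipped.
        pos = data.find(signature, pos + 1)
    return offsets


if __name__ == "__main__":
    blob = b"\x00" * 16 + b"PK\x03\x04" + b"\xff" * 8 + b"PK\x03\x04"
    for offset in find_local_headers(blob):
        print(f"candidate local file header at offset 0x{offset:08X}")
```

For large disk images, reading the file in chunks (with a three-byte overlap between chunks so that a signature split across a boundary is not missed) avoids loading everything into memory.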

Use Case

ZIP signature detection is critical in file upload security (checking if uploaded files match their claimed type), in forensic data carving from disk images, and in build pipelines that validate artifact integrity.
