MD5 Hash Algorithm

Understand the MD5 hash algorithm, its 128-bit output, known vulnerabilities, and why it is still used for non-security checksums despite being cryptographically broken.


Detailed Explanation

MD5 (Message-Digest Algorithm 5) was designed by Ronald Rivest in 1991 as an improvement over MD4. It produces a 128-bit (16-byte) hash digest, usually displayed as a 32-character hexadecimal string. Despite being one of the most widely recognized hash functions, MD5 is now considered cryptographically broken and unsuitable for security-sensitive applications.
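As a quick illustration of the digest size, the following minimal sketch uses Python's standard hashlib module to hash the three-byte input "abc" (one of the test vectors in RFC 1321). The variable names are illustrative only.

```python
import hashlib

# Hash the RFC 1321 test input "abc".
md5_object = hashlib.md5(b"abc")

raw_digest = md5_object.digest()      # 16 raw bytes
hex_digest = md5_object.hexdigest()   # the same value as hexadecimal text

print(len(raw_digest) * 8)  # 128 (bits)
print(len(hex_digest))      # 32 (hex characters)
print(hex_digest)           # 900150983cd24fb0d6963f7d28e17f72
```

Each byte of the digest maps to two hexadecimal characters, which is why a 16-byte digest is displayed as 32 characters.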

How MD5 works:

MD5 processes input in 512-bit blocks. The message is first padded: a single 1 bit is appended, followed by zero bits until the length is congruent to 448 mod 512, and then the original message length in bits is appended as a 64-bit little-endian value. Each block then passes through four rounds of 16 operations each (64 operations total). Each round uses a different nonlinear function (F, G, H, I) combined with modular addition (mod 2^32) and left rotations, applied to four 32-bit state variables (A, B, C, D).
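The steps above can be sketched in pure Python. This is an educational sketch following the structure described in RFC 1321, not a production implementation; the function names (md5_pad, rotate_left, md5_sketch) are made up for this example, and real code should call hashlib.md5 instead.

```python
import math
import struct

# Left-rotation amounts: one group of four per round, repeated four times.
SHIFTS = ([7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 +
          [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4)
# Additive constants: floor(2^32 * |sin(i + 1)|) for i = 0..63.
K = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]
MASK = 0xFFFFFFFF


def md5_pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes up to 448 mod 512 bits, then the 64-bit length."""
    bit_length = (len(message) * 8) & 0xFFFFFFFFFFFFFFFF
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    return padded + struct.pack("<Q", bit_length)  # little-endian length


def rotate_left(value: int, amount: int) -> int:
    value &= MASK
    return ((value << amount) | (value >> (32 - amount))) & MASK


def md5_sketch(message: bytes) -> str:
    # Initial values of the four 32-bit state variables A, B, C, D.
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    padded = md5_pad(message)
    for offset in range(0, len(padded), 64):        # one 512-bit block
        words = struct.unpack("<16I", padded[offset:offset + 64])
        a, b, c, d = a0, b0, c0, d0
        for i in range(64):
            if i < 16:                              # round 1: function F
                f, g = (b & c) | (~b & d), i
            elif i < 32:                            # round 2: function G
                f, g = (d & b) | (~d & c), (5 * i + 1) % 16
            elif i < 48:                            # round 3: function H
                f, g = b ^ c ^ d, (3 * i + 5) % 16
            else:                                   # round 4: function I
                f, g = c ^ (b | ~d), (7 * i) % 16
            f = (f + a + K[i] + words[g]) & MASK
            a, d, c = d, c, b
            b = (b + rotate_left(f, SHIFTS[i])) & MASK
        # Add this block's result into the running state (mod 2^32).
        a0, b0 = (a0 + a) & MASK, (b0 + b) & MASK
        c0, d0 = (c0 + c) & MASK, (d0 + d) & MASK
    return struct.pack("<4I", a0, b0, c0, d0).hex()
```

A simple sanity check is to compare md5_sketch(data) with hashlib.md5(data).hexdigest() for inputs of various lengths, especially around the 55, 56, and 64 byte boundaries where the padding changes the block count.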

Why MD5 is broken:

In 2004, Xiaoyun Wang and co-authors, including Hongbo Yu, demonstrated practical collision attacks against MD5, generating two different inputs that produce the same hash. In 2008, researchers used an MD5 collision to create a rogue CA certificate, proving the attack had real-world consequences. In 2012, the Flame malware used an MD5 collision to forge a Microsoft code-signing certificate, which let it pose as a legitimate Windows Update. Today, generating a basic (identical-prefix) MD5 collision takes seconds on commodity hardware.

Where MD5 is still used:

Despite its cryptographic weaknesses, MD5 remains common for non-security tasks: verifying file downloads for accidental corruption, generating cache keys, deduplicating data, and as a fast fingerprinting mechanism. Many legacy systems and protocols still reference MD5 checksums. Linux package managers historically used MD5 (though most have migrated to SHA-256). The key distinction is that MD5 is fine for detecting accidental changes but cannot protect against deliberate tampering.
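A download check for accidental corruption might look like the sketch below. The helper names (file_md5, matches_checksum) and the chunk size are assumptions chosen for illustration; the file is read in chunks so that large downloads do not need to fit in memory.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, reading it in fixed-size chunks."""
    md5_object = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5_object.update(chunk)
    return md5_object.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's MD5 with a published checksum (case-insensitive)."""
    return file_md5(path) == expected_hex.strip().lower()
```

A match only indicates that the file was not accidentally corrupted in transit. It says nothing about tampering, since an attacker who can replace the file can usually replace the published checksum as well.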

Practical guidance:

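Never use MD5 for digital signatures, certificate verification, password hashing, or any context where an adversary could exploit collisions. For new projects, prefer SHA-256 or SHA-3 for general-purpose hashing. For password storage, a fast hash of any kind is the wrong tool; use a dedicated password-hashing function such as bcrypt, scrypt, or Argon2.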
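In Python, moving from MD5 to SHA-256 or SHA-3 is mostly a matter of calling a different hashlib constructor. The sketch below uses hypothetical helper names (sha256_hex, sha3_256_hex, verify_sha256) and compares digests with hmac.compare_digest, which is one reasonable way to avoid leaking timing information during comparison.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as 64 hex characters."""
    return hashlib.sha256(data).hexdigest()


def sha3_256_hex(data: bytes) -> str:
    """Return the SHA3-256 digest of data as 64 hex characters."""
    return hashlib.sha3_256(data).hexdigest()


def verify_sha256(data: bytes, expected_hex: str) -> bool:
    """Check data against an expected SHA-256 digest in constant time."""
    return hmac.compare_digest(sha256_hex(data), expected_hex.strip().lower())
```

Both functions return 256-bit digests, twice the length of an MD5 digest, so any storage column or protocol field sized for 32 hex characters would need to grow to 64.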

Use Case

MD5 is commonly used for quick file integrity checks, generating cache keys in web applications, and verifying that a download was not accidentally corrupted during transfer.
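For the cache-key case, a minimal sketch might hash a canonical form of the request so that equivalent requests map to the same key. The cache_key function, the "page:" prefix, and the parameter format are all hypothetical choices for this example.

```python
import hashlib


def cache_key(url: str, params: dict) -> str:
    """Build a fixed-length cache key from a URL and its query parameters."""
    # Sort parameters so that insertion order does not change the key.
    canonical = url + "?" + "&".join(f"{name}={params[name]}" for name in sorted(params))
    return "page:" + hashlib.md5(canonical.encode("utf-8")).hexdigest()


key_one = cache_key("https://example.com/search", {"q": "md5", "page": 2})
key_two = cache_key("https://example.com/search", {"page": 2, "q": "md5"})
print(key_one == key_two)  # True
```

This is acceptable only when the inputs are not attacker-controlled in a way that matters; if a collision between two keys could serve one user's cached content to another, a collision-resistant hash such as SHA-256 is the safer choice.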
