SHA-1 in Git

How Git uses SHA-1 to identify commits, trees, and blobs. Learn about Git's content-addressable storage, the security implications, and the ongoing migration to SHA-256.

Detailed Explanation

Git uses SHA-1 as the foundation of its content-addressable storage system. Every object in a Git repository (commits, trees, blobs, and tags) is identified by the SHA-1 hash of its contents. This design gives Git its integrity guarantees and enables distributed collaboration without a central server.

How Git computes object hashes:

Git prepends a header to each object before hashing: "<type> <size>\0<content>", where <size> is the length of the content in bytes, written in decimal. For example, a blob containing "hello" is hashed as "blob 5\0hello". The resulting SHA-1 hash (40 hex characters) becomes the object's identifier. Because a blob holds only file content (names and modes live in tree objects), the same content always produces the same blob hash, enabling deduplication across the entire repository and across different repositories.
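The header-then-hash rule described above can be sketched in a few lines of Python. The function names git_object_id and git_blob_id are hypothetical helpers chosen for illustration, not part of Git; the sketch should match what git hash-object reports for the same bytes.

```python
import hashlib


def git_object_id(obj_type: str, content: bytes) -> str:
    """Return the SHA-1 object ID Git would assign to the given content.

    Hypothetical helper: builds the "<type> <size>\\0" header, prepends it
    to the content, and hashes the result.
    """
    # The size is the content length in bytes, written as a decimal string.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def git_blob_id(content: bytes) -> str:
    """Convenience wrapper for blob objects (file contents)."""
    return git_object_id("blob", content)


if __name__ == "__main__":
    # Hashes the bytes "blob 5\0hello", as in the example above.
    print(git_blob_id(b"hello"))
    # Identical content always yields an identical ID.
    print(git_blob_id(b"hello") == git_blob_id(b"hello"))
```

Note that a trailing newline is part of the content: a file saved by most editors as "hello" followed by a newline is six bytes and gets a different ID from the five-byte example.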

Content-addressable storage:

Git stores loose objects in .git/objects/, using the first two hex characters of the hash as a directory name and the remaining 38 as the filename: .git/objects/aa/bbcc.... Because the path is derived from the hash, any object can be retrieved directly by its identifier. Commits reference tree objects by hash, trees reference blobs and subtrees by hash, and each commit references its parent commit(s) by hash. This creates an immutable directed acyclic graph (DAG): altering any object changes its hash, which in turn changes the hash of every tree and commit that references it, and of every commit descended from those.
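As a rough illustration of both ideas, the sketch below derives the storage path from an object ID and shows a content change rippling from a blob up to the tree that references it. All function names are hypothetical, and the tree is simplified to a single regular-file entry (mode 100644); real trees hold many sorted entries.

```python
import hashlib
from pathlib import PurePosixPath


def object_id(obj_type: str, content: bytes) -> str:
    """SHA-1 of "<type> <size>\\0<content>" (hypothetical helper)."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id_hex: str) -> PurePosixPath:
    """Path of a loose object: first 2 hex chars are the directory."""
    return PurePosixPath(".git/objects") / object_id_hex[:2] / object_id_hex[2:]


def single_file_tree(name: str, blob_id_hex: str) -> bytes:
    """Body of a tree with one regular-file entry.

    Each entry is "<mode> <name>\\0" followed by the referenced object's
    hash as 20 raw bytes (not hex).
    """
    return b"100644 " + name.encode("utf-8") + b"\x00" + bytes.fromhex(blob_id_hex)


def tree_id_for(name: str, file_content: bytes) -> str:
    """ID of a one-entry tree pointing at a blob with the given content."""
    blob = object_id("blob", file_content)
    return object_id("tree", single_file_tree(name, blob))


if __name__ == "__main__":
    blob = object_id("blob", b"hello\n")
    print(loose_object_path(blob))
    # Changing the file changes the blob ID, and therefore the tree ID.
    print(tree_id_for("greeting.txt", b"hello\n"))
    print(tree_id_for("greeting.txt", b"hello!\n"))
```

The same propagation continues one level up: a commit embeds its tree's ID and its parents' IDs, so a changed tree produces a changed commit, and so on through the history.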

Security implications:

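SHA-1's known collision attacks make it feasible, at considerable cost, for an attacker to craft two different inputs with the same SHA-1 hash. The SHAttered researchers demonstrated this with two PDF files in 2017. Because Git hashes a header along with the content, those two PDFs do not share a Git blob hash, but the same technique could be aimed at Git objects directly. In response, Git adopted a collision-detecting SHA-1 implementation that rejects inputs showing the signature of known attack techniques. Chosen-prefix collisions, a more flexible attack published in 2020, weakened SHA-1 further, and detection of this kind is a mitigation against known techniques rather than a guarantee against future ones. For most development workflows, the risk is low because exploitation requires both creating a collision and getting the colliding object into a repository, which generally means the attacker needs write access or a maintainer willing to accept their contribution.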

The SHA-256 migration:

The Git project selected SHA-256 as SHA-1's successor in 2018, and experimental support shipped in Git 2.29 (2020). A SHA-256 repository is created with git init --object-format=sha256, which records the choice in the extensions.objectFormat configuration. SHA-256 repositories are functional, but support across the ecosystem (GitHub, GitLab, other hosting services) is still catching up. The migration plan includes a compatibility mode in which SHA-1 and SHA-256 object names can coexist, linked by an object name translation table.
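To make the difference concrete, here is a minimal sketch that computes an object ID under either format. It assumes, in line with Git's transition design, that SHA-256 repositories keep the same "<type> <size>\0" header and only swap the hash function; the function name and the object_format parameter are hypothetical, chosen to echo the extensions.objectFormat values.

```python
import hashlib


def git_object_id(obj_type: str, content: bytes, object_format: str = "sha1") -> str:
    """Object ID under the given format ("sha1" or "sha256").

    Hypothetical helper: only the hash function differs between formats;
    the header layout is assumed to be identical.
    """
    if object_format not in ("sha1", "sha256"):
        raise ValueError(f"unsupported object format: {object_format}")
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.new(object_format, header + content).hexdigest()


if __name__ == "__main__":
    sha1_id = git_object_id("blob", b"hello", "sha1")
    sha256_id = git_object_id("blob", b"hello", "sha256")
    print(len(sha1_id), sha1_id)      # 40 hex characters
    print(len(sha256_id), sha256_id)  # 64 hex characters
```

The two IDs for the same content are unrelated, which is why a mixed-format setup needs a translation table rather than a simple conversion.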

Practical impact:

For most developers, Git's SHA-1 usage works fine for day-to-day operations. The collision risk is primarily theoretical in typical development scenarios. However, for high-security applications (signed commits, supply chain security), the SHA-256 migration provides important additional assurance.

Use Case

Git relies on SHA-1 to uniquely identify every commit, file, and directory in a repository, forming the basis of its distributed version control and integrity guarantees.
