Colliding with the SHA prefix of Linux's initial Git commit

December 30, 2024

Or, how to break all the tools that parse the “Fixes:” tag

Kees Cook

There was a recent discussion about how Linux's “Fixes” tag, which traditionally uses the 12-character commit SHA prefix, has an ever-increasing chance of collisions. There are already 11-character collisions, and Geert wanted to raise the minimum short ID to 16 characters. This was met with push-back for various reasons. One aspect that bothered me was that some people were still treating this as a theoretical “maybe in the future” problem. To settle that, I generated a 12-character prefix collision against the start of Git history, commit 1da177e4c3f4 (“Linux-2.6.12-rc2”), which shows up in “Fixes” tags very often:

$ git log --no-merges --oneline --grep 'Fixes: 1da177e4c3f4' | wc -l
590
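For a rough sense of why 12 characters is no longer comfortable, the usual birthday approximation puts the expected number of colliding pairs among n objects abbreviated to k hex characters at about n(n-1)/(2·16^k). The sketch below only evaluates that formula. The helper name and the object count of 10 million are illustrative assumptions, not measurements of the Linux tree.

```shell
# Birthday approximation: expected number of colliding pairs among
# $1 objects when IDs are abbreviated to $2 hex characters.
# expected_collisions is a hypothetical helper, not part of Git.
expected_collisions() {
    awk -v n="$1" -v k="$2" \
        'BEGIN { printf "%.2g\n", n * (n - 1) / (2 * 16 ^ k) }'
}

# Assumed, round object count (not measured from any real repository).
for k in 11 12 16; do
    printf '%2d chars: %s\n' "$k" "$(expected_collisions 10000000 "$k")"
done
```

Under that assumed count, the estimate is a few expected pairs at 11 characters, a fraction of one at 12, and a negligible number at 16. Since the estimate grows with the square of the object count, the 12-character margin shrinks as the repository grows.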

Tools like linux-next's “Fixes tag checker”, the Linux CNA's commit parser, and my own CVE lifetime analysis scripts analyze the “Fixes” tag programmatically, and none of them had any support for collisions (not even for the shorter collisions that already exist).
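One way a parser could cope is to treat the quoted subject in the tag as the tie-breaker whenever the prefix alone is ambiguous. The following is a minimal sketch of that idea, not how any of the tools above actually work. It assumes the conventional `Fixes: <prefix> ("<subject>")` form with straight quotes, and it assumes the candidate commits are supplied on standard input as `<full sha> <subject>` lines. All three function names are hypothetical.

```shell
# Extract the abbreviated SHA from a line shaped like:
#   Fixes: <hex prefix> ("<subject>")
fixes_sha() {
    printf '%s\n' "$1" |
        sed -n 's/^Fixes: \([0-9a-f]\{4,40\}\) ("\(.*\)")$/\1/p'
}

# Extract the quoted subject from the same kind of line.
fixes_subject() {
    printf '%s\n' "$1" |
        sed -n 's/^Fixes: \([0-9a-f]\{4,40\}\) ("\(.*\)")$/\2/p'
}

# Read "<full sha> <subject>" candidates on stdin and print the SHA whose
# subject equals $1. Fails unless exactly one candidate matches.
pick_by_subject() {
    WANT="$1" awk '
        { sha = $1; sub(/^[^ ]+ /, "") }
        ($0 "") == (ENVIRON["WANT"] "") { n++; hit = sha }
        END { if (n == 1) print hit; else exit 1 }'
}
```

A caller would first try the prefix on its own and fall back to `pick_by_subject` only when more than one commit matches. If the subjects also collide, the function fails rather than guessing.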

So, in an effort to fix these tools, I broke them with commit 1da177e4c3f4 (“docs: git SHA prefixes are for humans”):

$ git show 1da177e4c3f4
error: short object ID 1da177e4c3f4 is ambiguous
hint: The candidates are:
hint:   1da177e4c3f41 commit 2005-04-16 - Linux-2.6.12-rc2
hint:   1da177e4c3f47 commit 2024-12-14 - docs: git SHA prefixes are for humans
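For scripts, scraping that error text is fragile. A hedged alternative is `git rev-parse --disambiguate=<prefix>`, which prints every object whose name begins with the prefix. The sketch below filters that list down to commits and prints each one with its subject. The two helper names are hypothetical, and the functions assume they run inside a Git repository.

```shell
# Print "<full sha> <subject>" for every commit whose ID starts with $1.
# list_commit_candidates is a hypothetical helper name.
list_commit_candidates() {
    git rev-parse --disambiguate="$1" | while read -r sha; do
        # The prefix can also match trees, blobs, and tags; keep commits only.
        if [ "$(git cat-file -t "$sha")" = commit ]; then
            git log -1 --format='%H %s' "$sha"
        fi
    done
}

# A prefix is safe to resolve on its own only when this prints 1.
count_commit_candidates() {
    echo $(( $(list_commit_candidates "$1" | wc -l) ))
}
```

In a tree containing both commits above, `count_commit_candidates 1da177e4c3f4` would be expected to report two candidates, which tells the caller to use a longer prefix or another tie-breaker.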

This is not yet in the upstream Linux tree, for fear of breaking countless other tools out in the wild. But it can serve as a test commit for those who want to get this fixed ahead of any future collisions (or ahead of this commit actually landing).

Many thanks to the lucky-commit project, which grinds trailing whitespace in the commit message in an attempt to find collisions. Generating the 12-character prefix collision took about 6 hours on my OpenCL-enabled RTX 3080 GPU.
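As a back-of-the-envelope check on that figure: matching one specific 12-character prefix takes 16^12 = 2^48 attempts on average, since each whitespace variation yields an effectively random hash. The arithmetic below assumes the run needed roughly that average number of attempts, which is only an assumption, as a real search can finish earlier or later. It also assumes a shell with 64-bit arithmetic.

```shell
# Average attempts to hit one specific 12-hex-character (48-bit) prefix.
attempts=$(( 1 << 48 ))
# "About 6 hours", per the text above.
seconds=$(( 6 * 3600 ))
# Implied rate, assuming the search needed roughly the average attempt count.
echo "$(( attempts / seconds )) hashes per second"
```

That works out to roughly 13 billion hashes per second. Each additional hex character multiplies the expected work by 16.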

For any questions, comments, etc., see this thread.