How do I figure out the original git revision of a GitHub zip I downloaded?
This is a problem I ran into today: Awhile back I downloaded a copy of a project hosted on GitHub, using the “Download Zip” button on the project page.
A few months later I came back, and I needed to know exactly which git revision it was I downloaded. Given this zip file, how does one “go backward” and figure out what revision it was? (The GitHub zips contain just a folder containing a single revision; there is no .git directory in it, so command line git can’t do anything.) I got some help on Twitter (thanks Robert!) and eventually figured out how— but I couldn’t find an explanation anywhere Google had picked up, so here’s a quick summary for future generations.
Method 1: Zip comment
This one’s pretty simple: There’s a thing in the headers of zip files called the “zip comment”, and apparently GitHub stores the original revision hash in there. You can get this out using the command line zip tool and the “unzip -z” flag:
If you’re using Windows, the makers of the WinZip tool claim they display the comment automatically when you open the zip in WinZip.
Method 2: Dates
Maybe you’ve lost the original zip, and you just have the folder? We still have one piece of useful information: the “modified” dates. If you look, all the files in the unzipped repo will have their created/modified dates set to the same date and time:
In fact, this is the date and time corresponding to the exact timestamp on the original Git commit. So we can just go look through the git logs until we find a revision with that exact timestamp:
By the way: If you download a zip from BitBucket, they helpfully actually put the exact revision hash into the name of the downloaded zip, so you’re unlikely to run into this problem. Just so you know though BitBucket git repositories do work with both the “zip comment” and the timestamp rules above. BitBucket mercurial repositories don’t have the “comment” in the zip file, but they do contain an invisible file named .hg_archival at the root level of the unzipped directory which contains the same information and then some.