Chapter 6. Git Grandmastery

This pretentiously named page is my dumping ground for uncategorized Git tricks.

Source Releases

For my projects, Git tracks exactly the files I'd like to archive and release to users. To create a tarball of the source code, I run:

$ git archive --format=tar --prefix=proj-1.2.3/ HEAD

Changelog Generation

It's good practice to keep a changelog, and some projects even require it. If you've been committing frequently, which you should, generate a Changelog by typing:

$ git log > ChangeLog

Git Over SSH, HTTP

Suppose you have ssh access to a web server, but Git is not installed. Though less efficient than its native protocol, Git can communicate over HTTP.

Download, compile and install Git in your account, and create a repository in your web directory:

$ GIT_DIR=proj.git git init

In the "proj.git" directory, run:

$ git --bare update-server-info
$ chmod a+x hooks/post-update

From your computer, push via ssh:

$ git push web.server:/path/to/proj.git master

and others get your project via:

$ git clone http://web.server/proj.git

Git Over Anything

Want to synchronize repositories without servers, or even a network connection? Need to improvise during an emergency? We've seen git fast-export and git fast-import can convert repositories to a single file and back. We could shuttle such files back and forth to transport git repositories over any medium, but a more efficient tool is git bundle.

The sender creates a bundle:

$ git bundle create somefile HEAD

then transports the bundle, somefile, to the other party somehow: email, thumb drive, floppy disk, an xxd printout and an OCR machine, reading bits over the phone, smoke signals, etc. The receiver retrieves commits from the bundle by typing:

$ git pull somefile

The receiver can even do this from an empty repository. Despite its size, somefile contains the entire original git repository.

In larger projects, eliminate waste by bundling only changes the other repository lacks:

$ git bundle create somefile HEAD ^COMMON_SHA1

If done frequently, one could easily forget which commit was the last to be sent. The help page suggests using tags to solve this. Namely, after you send a bundle, type:

$ git tag -f lastbundle HEAD

and create new refresher bundles with:

$ git bundle create newbundle HEAD ^lastbundle

Commit What Changed

Telling Git when you've added, deleted and renamed files is troublesome for certain projects. Instead, you can type:

$ git add .
$ git add -u

Git will look at the files in the current directory and work out the details by itself. Instead of the second add command, run git commit -a if you also intend to commit at this time.

You can perform the above in a single pass with:

$ git ls-files -d -m -o -z | xargs -0 git update-index --add --remove

The -z and -0 options prevent ill side-effects from filenames containing strange characters. Note this command adds ignored files. You may want to use the -x or -X option.

My Commit Is Too Big!

Have you neglected to commit for too long? Been coding furiously and forgotten about source control until now? Made a series of unrelated changes, because that's your style?

No worries. Run:

$ git add -p

For each edit you made, Git will show you the hunk of code that was changed, and ask if it should be part of the next commit. Answer with "y" or "n". You have other options, such as postponing the decision; type "?" to learn more.

Once you're satisfied, type

$ git commit

to commit precisely the changes you selected (the staged changes). Make sure you omit the -a option, otherwise Git will commit all the edits.

What if you've edited many files in many places? Reviewing each change one by one becomes frustratingly mind-numbing. In this case, use git add -i, whose interface is less straightforward, but more flexible. With a few keystrokes, you can stage or unstage several files at a time, or review and select changes in particular files only. Alternatively, run git commit --interactive which automatically runs commits after you're done.

Don't Lose Your HEAD

The HEAD tag is like a cursor that normally points at the latest commit, advancing with each new commit. Some Git commands let you move it. For example:

$ git reset HEAD~3

will move the HEAD three commits backwards in time. Thus all Git commands now act as if you hadn't made those last three commits, while your files remain in the present. See the git reset man page for some applications.

But how can you go back to the future? The past commits know nothing of the future.

If you have the SHA1 of the original HEAD then:

$ git reset SHA1

But suppose you never took it down? Don't worry, for commands like these, Git saves the original HEAD as a tag called ORIG_HEAD, and you can return safe and sound with:

$ git reset ORIG_HEAD

HEAD-hunting

Perhaps ORIG_HEAD isn't enough. Perhaps you've just realized you made a monumental mistake and you need to go back to an ancient commit in a long-forgotten branch.

By default, Git keeps a commit for at least two weeks, even if you ordered Git to destroy the branch containing it. The trouble is finding the appropriate hash. You could look at all the hash values in .git/objects and use trial and error to find the one you want. But there's a much easier way.

Git records every hash of a commit it computes in .git/logs. The subdirectory refs contains the history of all activity on all branches, while the file HEAD shows every hash value it has ever taken. The latter can be used to find hashes of commits on branches that have been accidentally lopped off.

The reflog command provides a friendly interface to these log files. Try

$ git reflog

Instead of cutting and pasting hashes from the reflog, try:

$ git checkout "@{10 minutes ago}"

Or checkout the 5th-last visited commit via:

$ git checkout "@{5}"

See the “Specifying Revisions” section of git help rev-parse for more.

You may wish to configure a longer grace period for doomed commits. For example:

$ git config gc.pruneexpire "30 days"

means a deleted commit will only be permanently lost once 30 days have passed and git gc is run.

You may also wish to disable automatic invocations of git gc:

$ git conifg gc.auto 0

in which case commits will only be deleted when you run git gc manually.

Building On Git

In true UNIX fashion, Git's design allows it to be easily used as a low-level component of other programs. There are GUI interfaces, web interfaces, alternative command-line interfaces, and perhaps soon you will have a script or two of your own that calls Git.

One easy trick is to use built-in git aliases to shorten your most frequently used commands:

$ git config --global alias.co checkout
$ git config --global --get-regexp alias  # display current aliases
alias.co checkout
$ git co foo                              # same as 'git checkout foo'

Another is to print the current branch in the prompt, or window title. Invoking

$ git symbolic-ref HEAD

shows the current branch name. In practice, you most likely want to remove the "refs/heads/" and ignore errors:

$ git symbolic-ref HEAD 2> /dev/null | cut -b 12-

See the Git homepage for more examples.

Daring Stunts

Recent versions of Git make it difficult for the user to accidentally destroy data. This is perhaps the most compelling reason to upgrade. Nonetheless, there are times you truly want to destroy data. We show how to override the safeguards for common commands. Only use them if you know what you are doing.

Checkout: If you have uncommitted changes, a plain checkout fails. To destroy your changes, and checkout a given commit anyway, use the force flag:

$ git checkout -f COMMIT

On the other hand, if you specify particular paths for checkout, then there are no safety checks. The supplied paths are quietly overwritten. Take care if you use checkout in this manner.

Reset: Reset also fails in the presence of uncommitted changes. To force it through, run:

$ git reset --hard [COMMIT]

Branch: Deleting branches fails if this causes changes to be lost. To force a deletion, type:

$ git branch -D BRANCH  # instead of -d

Similarly, attempting to overwrite a branch via a move fails if data loss would ensue. To force a branch move, type:

$ git branch -M [SOURCE] TARGET  # instead of -m

Unlike checkout and reset, these two commands defer data destruction. The changes are still stored in the .git subdirectory, and can be retrieved by recovering the appropriate hash from .git/logs (see "HEAD-hunting" above). By default, they will be kept for at least two weeks.

Clean: Some git commands refuse to proceed because they're worried about clobbering untracked files. If you're certain that all untracked files and directories are expendable, then delete them mercilessly with:

$ git clean -f -d

Next time, that pesky command will work!