Thursday, September 06, 2007

Git, Binary Files and Cherry Picking Patches

Steve Dekorte has some things he dislikes about git. This post describes how I work around these issues in my own git repositories.

Git has a heuristic for detecting binary files. You can force other file types to be treated as binary by adding a .gitattributes file to your repository. This file contains a list of glob patterns, each followed by the attributes to apply to files matching that pattern. Because .gitattributes is committed to the repository, all cloned repositories will pick it up as well.

For example, if you want all *.foo files to be treated as binary files you can have this line in .gitattributes:
*.foo -crlf -diff -merge
With this line, files with a .foo extension will not have carriage return/line feed translation done, won't be diffed, and will always produce a conflict when merged, leaving your version of the file untouched.
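One way to confirm that the pattern applies is to ask git which attributes it assigns to a path. The sketch below does that in a throwaway repository; the file name sample.foo is only an illustration and does not need to exist. Newer git releases also provide a built-in `binary` macro attribute (equivalent to `-diff -merge -text`), which may be worth considering instead of spelling the attributes out.

```shell
set -eu

# Throwaway repository so the check does not touch a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# The attribute line from the post.
printf '*.foo -crlf -diff -merge\n' > .gitattributes

# Ask git how it treats a hypothetical path matching the pattern.
attrs=$(git check-attr crlf diff merge -- sample.foo)
printf '%s\n' "$attrs"
```

Each of the three attributes should be reported as unset for sample.foo, which is how git records the leading minus sign.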

Now when you pull from another repository that has changes to a .foo file, the diffstat line for that file will look something like:
 |  Bin 32 -> 36 bytes
Note that it is shown as a binary file. If you pull from another repository with changes to a .foo file that you have also changed, you'll get:
CONFLICT (content): Merge conflict in
The file will be untouched and you can manually make it the correct version, either by leaving it as it is or by copying a new file over it. Then you need to commit the merge conflict fix (even if you left the file untouched):
git commit -a -m "Fix merge conflict in"
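The conflict workflow above can be tried end to end in a scratch repository. This is a minimal sketch, not a recipe for a real project: the file name data.foo, the branch name theirs and the commit messages are all made up for the demonstration, and a second local branch stands in for the other repository.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"          # hypothetical identity
git config user.email "user@example.com"

# Mark *.foo as binary and commit a first version of a .foo file.
printf '*.foo -crlf -diff -merge\n' > .gitattributes
printf 'version 1\n' > data.foo
git add .gitattributes data.foo
git commit -q -m "Add data.foo"
main=$(git symbolic-ref --short HEAD)        # master or main, depending on setup

# The "other side" changes the file on its own branch.
git checkout -q -b theirs
printf 'their version\n' > data.foo
git commit -q -a -m "Their change to data.foo"

# We change the same file on our branch.
git checkout -q "$main"
printf 'our version\n' > data.foo
git commit -q -a -m "Our change to data.foo"

# The merge stops with a conflict instead of trying to combine the contents.
merge_output=$(git merge theirs 2>&1) && merge_status=clean || merge_status=conflict
printf '%s\n' "$merge_output"

# Our copy of the file is left as it was.
content_after_merge=$(cat data.foo)

# Keep our version (or copy a new file over data.foo first), then commit.
git commit -q -a -m "Fix merge conflict in data.foo"
```

To take the other side's file instead, something like `git checkout theirs -- data.foo` before the final commit should copy their version over yours.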
The cherry picking of patches works differently to Darcs. There are a couple of ways of handling this, but I use 'git cherry-pick'. If you have a number of contributors with their own repositories that you regularly pull from, you can set up remote tracking branches:
git remote add john http://...
git remote add mary http://...
Now when you want John and Mary's most recent patches you can fetch them:
git fetch john
git fetch mary
This does not make any changes to your local branches. It gets their changes and stores them in separate remote tracking branches, such as john/master. If you want to see what John has changed compared to your master branch:
git log -p master..john/master
From there you can decide to pull in all John's commits:
git merge john/master
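The fetch, inspect, merge sequence can be rehearsed locally before pointing it at a real contributor. In this sketch a second local repository plays the part of John's; the directory names, identities and commit messages are assumptions made for the demonstration, and the branch name is read from the repository rather than assumed to be master.

```shell
set -eu

work=$(mktemp -d)

# Stand-in for John's repository.
git init -q "$work/john"
cd "$work/john"
git config user.name "John"                  # hypothetical identity
git config user.email "john@example.com"
printf 'base\n' > README
git add README
git commit -q -m "Initial commit"
branch=$(git symbolic-ref --short HEAD)

# My repository starts from the same history.
git clone -q "$work/john" "$work/mine"

# John then publishes one more commit.
printf 'change from John\n' >> README
git commit -q -a -m "John's change"

cd "$work/mine"
git config user.name "Me"
git config user.email "me@example.com"
git remote add john "$work/john"

head_before_fetch=$(git rev-parse HEAD)
git fetch -q john
head_after_fetch=$(git rev-parse HEAD)       # fetch leaves my branch alone

# What does John have that I do not?
git --no-pager log -p "$branch..john/$branch"
incoming=$(git rev-list --count "$branch..john/$branch")

# Take everything John has.
git merge -q "john/$branch"
```

Because my branch has no commits of its own here, the merge is a simple fast-forward; with local commits git would create a merge commit instead.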
If you want one commit, but not the commits it was built on, this is where 'cherry-pick' is used.

Given a commit id, 'cherry-pick' will take the patch for that commit and apply it to your current branch. It's used like:
git cherry-pick abcdefgh
This creates a commit with a different commit id than the original, but containing the same change. It needs a different id because a commit id covers the commit's parents as well as its contents, and the cherry-picked commit doesn't have the same parent as the original.
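The following sketch picks one commit out of a contributor's history while skipping the commit before it, then checks that the new commit has a different id but carries the same patch. The repositories, file names and identities are invented for the example, and `git patch-id` is used here only as a convenient way to compare the two patches.

```shell
set -eu

work=$(mktemp -d)

# Stand-in for John's repository.
git init -q "$work/john"
cd "$work/john"
git config user.name "John"                  # hypothetical identity
git config user.email "john@example.com"
printf 'base\n' > README
git add README
git commit -q -m "Initial commit"

# My clone is taken before John's new work.
git clone -q "$work/john" "$work/mine"

# John makes two independent commits; I only want the second one.
printf 'feature\n' > feature.txt
git add feature.txt
git commit -q -m "Add feature"
printf 'fix\n' > fix.txt
git add fix.txt
git commit -q -m "Add fix"
fix_id=$(git rev-parse HEAD)

cd "$work/mine"
git config user.name "Me"
git config user.email "me@example.com"
git remote add john "$work/john"
git fetch -q john

# Apply just the fix to my current branch.
git cherry-pick "$fix_id" >/dev/null
picked_id=$(git rev-parse HEAD)

# Compare the patches carried by the original and the cherry-picked commit.
orig_patch=$(git show "$fix_id" | git patch-id --stable | cut -d' ' -f1)
picked_patch=$(git show "$picked_id" | git patch-id --stable | cut -d' ' -f1)
```

After this runs, fix.txt is present in my working tree while feature.txt is not, and the two patch ids match even though the commit ids differ.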

If you later decide you want all of John's commits and do a merge that includes the commit you cherry-picked, you might expect conflicts. Git handles this case fine and does an automatic merge, noticing that the changes are the same. So it effectively gives you the same functionality as Darcs' selective patch pulling, but without as nice a user interface.
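The cherry-pick-then-merge case can be checked with a small experiment. This is a sketch under the same assumptions as before: a local repository stands in for John's, and all names and messages are hypothetical. The cherry-picked commit edits README, so both sides of the later merge contain the identical change.

```shell
set -eu

work=$(mktemp -d)

git init -q "$work/john"
cd "$work/john"
git config user.name "John"                  # hypothetical identity
git config user.email "john@example.com"
printf 'base\n' > README
git add README
git commit -q -m "Initial commit"
branch=$(git symbolic-ref --short HEAD)

git clone -q "$work/john" "$work/mine"

# John adds a feature, then fixes README in a separate commit.
printf 'feature\n' > feature.txt
git add feature.txt
git commit -q -m "Add feature"
printf 'fixed\n' > README
git commit -q -a -m "Fix README"
fix_id=$(git rev-parse HEAD)

cd "$work/mine"
git config user.name "Me"
git config user.email "me@example.com"
git remote add john "$work/john"
git fetch -q john

# First take only the README fix...
git cherry-pick "$fix_id" >/dev/null

# ...then later merge everything, including the commit already picked.
merge_output=$(git merge --no-edit "john/$branch" 2>&1) && merge_status=clean || merge_status=conflict
printf '%s\n' "$merge_output"
```

In this setup the merge should complete without asking for any manual resolution, leaving both feature.txt and the fixed README in place.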



topfunky said...

That's just what I needed to be able to store graphic files that happen to be in XML format (i.e. Adobe Illustrator files). Marking them as binary works a lot better and doesn't introduce meaningless merges on their contents.


5:53 AM  
sergo said...


Could this

*.foo -crlf -diff -merge

be done automatically, every time I create new repository?

1:34 AM  
