My git workflow

From AndrewMoore

When I work on Koha, we use git, and I submit my patches to a mailing list where the community reviews them before the release manager eventually picks them up and applies them to the main repository. That means it pays for me to organize my patches in a way that other people can understand. Each patch should nearly stand alone: all of the work for one logical chunk belongs in one patch, and each patch should contain only changes for that chunk, even if one feature requires a handful of patches. Unfortunately, that's not how I work. For any given feature, I'll work a little on several different aspects of it, making some good choices and some bad choices as I go, refactoring and changing my mind all the time. So I need a workflow that does two things. First, it has to let me program in a natural way. Second, it has to let me generate patches that lay out my changes in a logical fashion. This is how I do that.

Typically, when I'm first thinking about how to implement a large feature, I know that there are several main areas of change. For instance, there may be a database table to add, a class of objects to build on top of that, and a user interface into that. These can each be thought of as separate chunks of work, and submitting them in different patches makes it easier to understand.

When I start actually writing code, I'll write on one of those pieces for a little while, and then switch to another piece to work for a while. Between each, I'll "git commit". I leave a commit message that indicates what piece of the feature I've been changing. In this way, I'll stack up a bunch of commits that each deals with one aspect of the feature. Any given aspect may be touched by several commits, though. Furthermore, they're all interleaved with each other. I use git to reorganize them into one patch for each logical chunk of work.

Formulating Patches Based on Logical Chunks of Work

I use two methods to reorganize my patches. I reorder them and I squash them together. I do this in order to make one large patch for each logical piece of my new feature. If I run "git rebase --interactive origin" at this point, I see something like this:

pick abcd111 starting Dog class to manage dogs
pick abcd222 adding Dog database table
pick abcd333 added some features to the dog class
pick abcd444 added more fields to the Dog table
pick abcd555 fixed some constraints on the Dog table

This shows all of the commits I have made to this branch that aren't on the origin branch.

I want to first reorder my patches to put all of the related ones together. For instance, all of the database changes should be together and all of the patches dealing with the new Dog class should be together.

You can move these lines around to reorder your patches, but you don't want to reorder any patches within any one chunk. For instance, I would make the previous look like:

pick abcd111 starting Dog class to manage dogs
pick abcd333 added some features to the dog class
pick abcd222 adding Dog database table
pick abcd444 added more fields to the Dog table
pick abcd555 fixed some constraints on the Dog table

Now, the Dog object patches come first (maintaining their order), and the database changes come second (in the same order they were in). At this point, I exit my editor to make sure that git can deal with this reordering.
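The reordering above can be tried safely on a throwaway repository. The following is a minimal sketch, not part of my real setup: the tag "base" stands in for "origin", the file names Dog.pm and dog.sql and the add_commit helper are made up for illustration, and GIT_SEQUENCE_EDITOR replaces the hand editing with a small script only so that the example runs unattended.

```shell
# Sketch only: reorder interleaved commits in a scratch repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"
git commit -q --allow-empty -m "upstream base"
git tag base                                  # hypothetical stand-in for "origin"

# add_commit FILE MESSAGE: append MESSAGE to FILE and commit it.
add_commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}
add_commit Dog.pm  "starting Dog class to manage dogs"
add_commit dog.sql "adding Dog database table"
add_commit Dog.pm  "added some features to the dog class"
add_commit dog.sql "added more fields to the Dog table"
add_commit dog.sql "fixed some constraints on the Dog table"

# Stand-in for moving lines by hand. Git passes the todo file as $1;
# the script puts the class commits first and the table commits second,
# keeping the original order inside each group.
editor=$(mktemp)
cat > "$editor" <<'EOF'
{ grep '^pick .*class' "$1"; grep '^pick .*table' "$1"; } > "$1.new"
mv "$1.new" "$1"
EOF

GIT_SEQUENCE_EDITOR="sh $editor" git rebase -q --interactive base

# List the rebased commits, oldest first.
git log --reverse --format='%s' base..HEAD
```

Because the two groups touch different files in this sketch, git can replay them in the new order without conflicts. Real commits that mix aspects are what make this step fail.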

If it fails, I'll usually "git rebase --abort" and split apart my patches to make sure that each one only deals with one aspect of the new feature. That can be done with the process described in the "splitting commits" section of the git-rebase man page. When the reordering finally succeeds, I combine my patches together to get one patch for each logical chunk.
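For reference, the splitting recipe from the git-rebase man page boils down to marking a commit as "edit", undoing it with "git reset HEAD^" while keeping its changes, and committing the pieces separately. Here is a hedged sketch on a scratch repository. The tag "base" (standing in for "origin"), the file names, and the scripted sed edit of the todo list are assumptions made for the example.

```shell
# Sketch only: split one mixed commit into two.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"
git commit -q --allow-empty -m "upstream base"
git tag base                                  # hypothetical stand-in for "origin"

# One commit that mixes two aspects of the feature.
echo "package Dog;" > Dog.pm
echo "CREATE TABLE dog (id INT);" > dog.sql
git add Dog.pm dog.sql
git commit -q -m "Dog class and Dog table together"

# Change the first todo line from "pick" to "edit" so the rebase
# stops right after applying that commit.
GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/edit/'" git rebase -q --interactive base

# Undo the commit but keep its changes in the working tree,
# then commit each aspect on its own.
git reset -q HEAD^
git add Dog.pm
git commit -q -m "starting Dog class to manage dogs"
git add dog.sql
git commit -q -m "adding Dog database table"

# Nothing is left to replay, so this finishes the rebase.
git rebase --continue
git log --reverse --format='%s' base..HEAD
```

When one file contains changes for several aspects, "git add -p" can stage individual hunks instead of whole files.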

I use "git rebase --interactive origin" again to combine my patches together. I change them to look like this:

pick abcd111 starting Dog class to manage dogs
squash abcd333 added some features to the dog class
pick abcd222 adding Dog database table
squash abcd444 added more fields to the Dog table
squash abcd555 fixed some constraints on the Dog table

This means that each "squash" line will be combined into the patch above it. When you exit your editor, git will give you a chance to edit your commit messages. This is a good time to use reasonable commit messages. It helps these main patches that I'm building now stand out from minor additions that I make later and will squash into these. I mention the bug number, give each one a sequence number, and use a good comment in the first line of the commit message. This helps me keep them organized going forward.
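The squash step can be rehearsed the same way. This sketch assumes a scratch repository whose commits are already grouped. The tag "base" stands in for "origin", and the file names and add_commit helper are invented. GIT_EDITOR=true simply accepts the combined message that git proposes; in real use this is where I would write a proper message.

```shell
# Sketch only: squash grouped commits into one commit per chunk.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"
git commit -q --allow-empty -m "upstream base"
git tag base                                  # hypothetical stand-in for "origin"

# add_commit FILE MESSAGE: append MESSAGE to FILE and commit it.
add_commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}
add_commit Dog.pm  "starting Dog class to manage dogs"
add_commit Dog.pm  "added some features to the dog class"
add_commit dog.sql "adding Dog database table"
add_commit dog.sql "added more fields to the Dog table"
add_commit dog.sql "fixed some constraints on the Dog table"

# Turn todo lines 2, 4 and 5 into "squash" so each folds into the
# commit above it. "true" as the message editor keeps git's default
# combined message unchanged.
GIT_EDITOR=true \
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/squash/' -e '4,5s/^pick/squash/'" \
    git rebase -q --interactive base

# Two commits remain, one per logical chunk.
git log --reverse --format='%s' base..HEAD

# The full message of the last commit still carries the squashed
# messages, which is why it needs editing before submission.
git log -1 --format='%B'
```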

To show you how it looks afterwards, I can run a "git rebase --interactive origin" and see something like:

pick abcd111 starting Dog class to manage dogs
pick abcd222 adding Dog database table

Reorganizing Commits as I Work

Once I get a few patches in git, one for each logical chunk of the feature I'm working on, I continue to commit early and commit often. For my commit messages, I usually don't write much since they'll be squashed into earlier commits. I'll typically just note to which logical chunk each commit belongs. I'll use something like "squashme: user interface" or "squashme: database update". Then, after I get a few commits in, I'll reorganize the commits to combine them with the previous ones.

The way I reorganize my commits is typically a two-step process. First, I'll reorder the commits. Then I'll squash the later, minor commits into the earlier commits that I'll eventually submit.

To reorder my commits, I run "git rebase --interactive origin". It might look something like this:

pick 94e1abf Bug 1234: database update: adding tables to hold dogs and dog types
pick 61d0dd6 Bug 1234: adding new dog class
pick eac8675 Bug 1234: adding user interface to edit dog objects
pick 0fc5e50 Bug 1234: displaying dog attributes on legacy animal pages
pick d12122c Bug 1234: adding cronjobs to manage dogs
pick 111abcd squashme: database updates
pick 222abcd squashme: display dog color on animal page
pick 333abcd squashme: database updates

The first five are the commits that I eventually intend to submit. The last three commits are the new ones that I'll fold into the previous five. I do this by moving the two database lines to just below the database update patch and the animal page line to just below the main commit dealing with the animal page. Then I exit my editor, and git rebases this set of commits to make sure that they can work in this order.

If git fails, I'll "git rebase --abort" and figure out why. Either I've included unrelated stuff in some of these minor commits and they need to be split up, or I've made a mistake while reordering the patches.
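To see what a failed reordering and the abort look like, here is a small sketch on a scratch repository. The tag "base" (standing in for "origin"), the file name animal.txt, and the sed expression that swaps the first two todo lines are assumptions for the example. The second commit depends on the first, so swapping them should make the rebase stop with a conflict.

```shell
# Sketch only: a reordering that fails, followed by "git rebase --abort".
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"
git commit -q --allow-empty -m "upstream base"
git tag base                                  # hypothetical stand-in for "origin"

echo "dog attributes" > animal.txt
git add animal.txt
git commit -q -m "Bug 1234: displaying dog attributes on legacy animal pages"
echo "dog attributes, with color" > animal.txt
git commit -q -a -m "squashme: display dog color on animal page"

before=$(git rev-parse HEAD)

# Swap the first two todo lines: hold line 1, then append it after line 2.
# Replaying the second commit first cannot work, because the file it
# modifies does not exist yet.
if ! GIT_SEQUENCE_EDITOR="sed -i.bak -e '1{h;d;}' -e '2G'" \
        git rebase -q --interactive base; then
    echo "rebase stopped, aborting"
    git rebase --abort
fi

# After the abort, the branch points at the same commit as before.
[ "$(git rev-parse HEAD)" = "$before" ] && echo "history restored"
```

Nothing is lost by aborting: the branch returns to the state it had before the rebase started.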

When that step has worked OK, I'll squash the minor commits. A git rebase --interactive origin at this point looks like:

pick 94e1abf Bug 1234: database update: adding tables to hold dogs and dog types
pick 111abcd squashme: database updates
pick 333abcd squashme: database updates
pick 61d0dd6 Bug 1234: adding new dog class
pick eac8675 Bug 1234: adding user interface to edit dog objects
pick 0fc5e50 Bug 1234: displaying dog attributes on legacy animal pages
pick 222abcd squashme: display dog color on animal page
pick d12122c Bug 1234: adding cronjobs to manage dogs

I then run through each of the "squashme" lines and change "pick" to "squash". This tells git to fold those commits into the previous ones, leaving me with 5 commits that each implement a logical portion of my feature.
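Because every minor commit carries the "squashme:" prefix, changing "pick" to "squash" is mechanical enough to script. The sketch below is one possible way to do it rather than a description of my own habit: the tag "base" stands in for "origin", the file names and the add_commit helper are invented, and the history is assumed to be reordered already.

```shell
# Sketch only: squash every "squashme:" commit into the commit above it.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"
git commit -q --allow-empty -m "upstream base"
git tag base                                  # hypothetical stand-in for "origin"

# add_commit FILE MESSAGE: append MESSAGE to FILE and commit it.
add_commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}
add_commit dog.sql   "Bug 1234: database update: adding tables to hold dogs and dog types"
add_commit dog.sql   "squashme: database updates"
add_commit dog.sql   "squashme: database updates"
add_commit Dog.pm    "Bug 1234: adding new dog class"
add_commit animal.tt "Bug 1234: displaying dog attributes on legacy animal pages"
add_commit animal.tt "squashme: display dog color on animal page"

# On todo lines that mention " squashme:", replace the leading "pick"
# with "squash". GIT_EDITOR=true accepts the combined messages as they are.
GIT_EDITOR=true \
GIT_SEQUENCE_EDITOR="sed -i.bak '/ squashme:/s/^pick /squash /'" \
    git rebase -q --interactive base

# Only the main commits remain.
git log --reverse --format='%s' base..HEAD
```

The combined commit messages still contain the "squashme:" notes after this, so they need a cleanup pass before the patches go out.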

I run through this process of reorganizing my commits several times a day as I complete several small chunks of work and need to step back to think about what's next.

Preparing to Submit My Group of Patches

Once I believe that my work is done and I have completely implemented my new feature or enhancement, I'll run through my commits to prepare them to submit. A quick git rebase --interactive origin will let me see my 5 main commits. I'll turn each "pick" into an "edit" in order to edit all of the commit messages. I have to fix a few things in each of the messages:

  • I make sure each commit message contains a note to documentation writers if necessary.
  • I read each commit message and make sure it makes logical sense. Sometimes, I leave poorly worded things in there, or artifacts of the many commits that I have squashed together show up.

After each commit message is edited, I can use this process to send my patches in to the patches list:

  1. git fetch # make sure that I have the latest HEAD
  2. git rebase origin # make sure that my patches are formed against that latest HEAD
  3. git format-patch --numbered -M origin # make my patches, number them, and detect renames so that a moved file is not shown as a delete plus an add
  4. git send-email *patch # send them in!
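The first three steps can be rehearsed end to end without a network. In this sketch a local repository plays the role of the main repository, so that "origin" in the clone resolves the way it would in a real checkout. The directory names, file names, and placeholder identity are assumptions, and the send-email step is left commented out because it needs a configured mail setup.

```shell
# Sketch only: fetch, rebase and format patches against a local stand-in upstream.
set -e
top=$(mktemp -d)
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

# A local repository standing in for the project's main repository.
git init -q "$top/upstream"
git -C "$top/upstream" commit -q --allow-empty -m "upstream base"

# The working clone; "origin" now refers to the stand-in upstream.
git clone -q "$top/upstream" "$top/work"
cd "$top/work"
echo "CREATE TABLE dog (id INT);" > dog.sql
git add dog.sql
git commit -q -m "Bug 1234: database update: adding tables to hold dogs and dog types"
echo "package Dog;" > Dog.pm
git add Dog.pm
git commit -q -m "Bug 1234: adding new dog class"

git fetch -q                # make sure that I have the latest HEAD
git rebase -q origin        # replay my commits on top of that HEAD
# Write one numbered patch file per commit into $top/patches.
git format-patch --numbered -M -o "$top/patches" origin

# git send-email "$top"/patches/*.patch   # not run in this sketch
```

Writing the patches to a separate directory with -o keeps them out of the working tree, which makes it harder to commit them by accident.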


