Tuesday, December 30

Notes About Git + SVN + Google Code

I wanted to do some hacking over vacation, so I decided to pitch in and implement a feature request for Guice. Here are some of the things I learned about using Git with Google Code’s Subversion repositories:

Recommended Google Code Git Initialization
(run this in an empty directory that you would like to be a repository)

git svn clone -s --prefix=svn/ --rewrite-root=https://guice.googlecode.com/svn http://guice.googlecode.com/svn .

Note: This will pull the entire SVN repository back to its beginning. The -r argument can be used to limit the pull.
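As a rough sketch of the -r variant: the starting revision 700 below is a made-up number, not a meaningful Guice revision, and the script only prints the command instead of running it so that it works without network access. Everything else mirrors the clone line above.

```shell
#!/bin/sh
# Hypothetical starting revision; choose one that suits your project.
start_rev=700
svn_url=http://guice.googlecode.com/svn
rewrite_root=https://guice.googlecode.com/svn

# Build the command as a string and print it rather than executing it,
# so this sketch can be tried offline.
cmd="git svn clone -s --prefix=svn/ --rewrite-root=${rewrite_root} -r ${start_rev}:HEAD ${svn_url} ."
echo "$cmd"
```

Copy the printed line into an empty directory to run the clone. History before the chosen revision will not be present in the resulting repository.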

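The --rewrite-root option records the HTTPS URL in the commit metadata even though the clone itself goes over HTTP. It is there so that you can remove it later if you get membership status on the project and want to commit.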

Additional Notes
  • “Develop with Git on a Google Code Project” is a pretty good intro article that covers the very basics of using Git and connecting with Google Code.
  • “An introduction to git-svn for Subversion/SVK users and deserters” seems to be the most comprehensive description of setup and use cases of Git’s SVN support.
  • “Git SVN Workflow” is also a nice and friendly read.
  • Fink’s git-svn 1.6 package seems to be broken, at least for Mac OS X 10.5. It bus errors on launch. I found a fink-users thread that comments on this but provides no solution. I grabbed it from MacPorts instead, and that worked with no problem.
  • Passing the --prefix=svn/ argument to git svn init makes it easier to differentiate between server-side branches and local branches.
  • In theory, you cannot change the SVN repository URL later. In practice, it may be possible. This is particularly important for Google Code because if you check out the repository with HTTP, you cannot then commit, since committing requires HTTPS for authentication. Compounding this problem, you cannot (as far as I can tell) check out with HTTPS unless you are a member of the project. In this light, it may make sense to check out new projects using --rewrite-root (as mentioned on the GitSvnSwitch page) pointing at HTTPS. I ended up just re-creating a new repository and using git am to move a commit between repos, as described here.
  • Use the --username argument to git svn fetch when checking out with HTTPS. You’ll be asked to type in a password (which is the random characters from your Google Code profile). Git will then save this information somewhere (I’m not sure where), so it means that you won’t have to memorize the Google Code password, or even re-specify the --username if your Gmail username differs from your username on your computer. 
  • Remember that git svn dcommit will commit to the repo once per Git commit. This will likely annoy other people on the project, so it’s best to either use git rebase -i and squash everything, or do a git reset svn/trunk and then make a single commit of the index (don’t forget to re-add new files).
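The git reset approach from the last bullet can be tried out in a throwaway repository. This is only a sketch: there is no real Subversion server involved, the svn/trunk ref is faked with git update-ref as a stand-in for what git svn would maintain, and the file names and commit messages are invented for the demo.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > file.txt
git add file.txt
git commit -q -m "state of svn trunk"
# Stand-in for the remote-tracking ref that git svn would normally maintain.
git update-ref refs/remotes/svn/trunk HEAD

# Three small local commits, one of which adds a new file.
echo one >> file.txt
git commit -q -am "wip 1"
echo two > new.txt
git add new.txt
git commit -q -m "wip 2"
echo three >> file.txt
git commit -q -am "wip 3"

# Mixed reset: the branch and index move back to svn/trunk,
# but the working tree keeps all of the changes.
git reset -q svn/trunk
# new.txt is untracked again after the reset, so re-add everything.
git add -A
git commit -q -m "Implement feature as one commit"

ahead=$(git rev-list --count svn/trunk..HEAD)
echo "commits ahead of svn/trunk: $ahead"
```

In a real git svn checkout, git svn dcommit at this point would send a single revision to the server instead of three.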

Sunday, December 21

Installing Graphviz on Mac OS X

I’m playing with Graphviz on the Mac. Unfortunately, the latest official download, 2.20.3, gives the following error message when run from the command line:

dyld: lazy symbol binding failed: Symbol not found: _pixman_image_create_bits
  Referenced from: /usr/local/lib/graphviz/libgvplugin_pango.5.dylib
  Expected in: flat namespace

dyld: Symbol not found: _pixman_image_create_bits
  Referenced from: /usr/local/lib/graphviz/libgvplugin_pango.5.dylib
  Expected in: flat namespace

Trace/BPT trap

The solution is to go to the downloads directory and download the next-most-recent release, 2.20.2.

Wednesday, December 17

Flaw: Constructor does Real Work

Flaw: Constructor does Real Work: "Fundamentally, “Work in the Constructor” amounts to doing anything that makes instantiating your object difficult or introducing test-double objects difficult."

Misko has a great article about making sure that your constructors permit your class to be reasonably testable. Read the whole thing; his detailed examples are very informative.

Monday, December 15

topgit Means Never Having to Wait for Reviews

I’ve been using git for a little while now at work. Some rather clever Googlers have rigged up a tool that syncs a local repository with Perforce, so I can do local development and version control with git and then check the final CL in to the depot.

Over the weekend I started using topgit (README), a git wrapper that takes most of the work out of managing dependent “topic branches.” The gist is that you mark branches as depending on one another. Then, when you modify a branch, you can use the tg update command to propagate those changes to any branches that depend on it.

I’ve been doing a lot of large changes and refactorings recently, so what topgit really means for me is never having to wait for code reviews to keep working.

The basic workflow goes like this:

“refactor” is the name of my new branch, “blogger” is the trunk
$ tg create refactor blogger
hack hack hack
mv mv mv
add add add
commit commit commit


now mail out the CL for review
$ gitwrapper mail -m reviewer

In a pure Perforce environment, I couldn’t continue working on this code at all until the review came back and I could submit. With vanilla git, I can of course git checkout -b new-branch and keep going, and this is fine most of the time.

Where it starts to get sticky is when my reviewer has comments. No big deal, though. I can just git checkout back to the refactor branch, make those changes, and commit them.

Of course, I now need to update my new-branch so that I’m working from the latest intermediate state. git merge refactor handles that fairly well (sometimes I get pedantic and use git rebase) and I’m back to coding the new feature.
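Here is a minimal sketch of that manual, vanilla-git version of the workflow, run in a throwaway repository. The branch names match the ones used in this post; the file names and commit messages are made up, and there is no Perforce or code review tool involved.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > app.txt
git add app.txt
git commit -q -m "trunk"

# The change that goes out for review.
git checkout -q -b refactor
echo refactored > refactor.txt
git add refactor.txt
git commit -q -m "refactoring out for review"

# Keep working on top of it while the review is pending.
git checkout -q -b new-branch
echo feature > feature.txt
git add feature.txt
git commit -q -m "start new feature"

# The reviewer has comments: fix them on the refactor branch.
git checkout -q refactor
echo "review fix" >> refactor.txt
git commit -q -am "fixes from review"

# Bring new-branch up to date with the reviewed state.
git checkout -q new-branch
git merge -q -m "merge review fixes" refactor
```

After the merge, new-branch contains both the review fixes and the in-progress feature work. This merge is the step that has to be repeated by hand for every branch built on top of refactor.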

topgit supports exactly this pattern of development by automating the “update the new-branch” step. If I created the new-branch branch with tg create, then I could run tg update to merge in the latest versions of all of new-branch’s dependencies.

To continue, we have:

git checkout refactor
tweak tweak tweak
git commit -am "fixes from review"
git checkout new-branch
tg update
code code code

“All” is the operative word in my description of tg update above. With topgit, a branch can have several dependencies, forming a DAG, and tg update will recursively update each one of them. This means that I can have several independent changes, with each out for review in parallel, but continue development on a new change that relies on all of them.

topgit works by keeping a separate reference branch for each topgit-managed branch. When you run tg update, it merges the dependencies into this branch, then merges that into the normal branch. This gives the added benefit that you can easily git rebase against the reference branch to clean up your commit graph.