A Guide to GIT using spatial analogies

Some developers find Git takes a little getting used to, claiming that it is conceptually convoluted compared to other distributed version control systems. I used to number myself amongst them.

Happily, I’ve found that a couple of simple spatial analogies have made me proficient and fluent in using Git’s command-line interface.

One of the things that tripped me up as a novice user was the way Git handles branches. Unlike those of more primitive version control systems, Git repositories are not linear: they already support branching, and are thus best visualised as trees in their own right. Branches thus become trees of trees. To visualise this, it’s simplest to think of the state of your repository as a point in a high-dimensional ‘code-space’, in which branches are represented as n-dimensional membranes, mapping the spatial loci of successive commits onto the projected manifold of each cloned repository.

Branches as n-branes

The authors of the git manuals clearly had this in mind. Taken directly from the git manual:

In simplified form, git object storage is “just” a DAG of objects, with a handful of different types of entries from <commit> to the index, optionally modifying index and working tree to match. The <commit> defaults to HEAD in all forms.

If <branch> is specified, git rebase will perform an automatic git checkout <branch> before doing anything else. Otherwise it remains on the current branch. All changes made by commits in the current branch but that are not in <upstream> are saved to a temporary area. This is the same set of commits that would be shown by git log <upstream>..HEAD (or git log HEAD, if --root is specified). The current branch is reset to <upstream>, or <newbase> if the --onto option was supplied. This has the exact same effect as git reset --hard <upstream> (or <newbase>). ORIG_HEAD is set to point at the tip of the commits that were previously saved into the temporary area, then reapplied to the current branch, one by one, in order. Note that any commits in HEAD which introduce the same textual changes as a commit in HEAD..<upstream> are omitted (i.e., a patch already accepted upstream with a different commit message or timestamp will be skipped).
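For anyone who prefers fewer dimensions, the steps quoted above can be sketched as a toy model in Python. This is not how Git is implemented: the Commit class, the ancestors and rebase functions, and the idea of comparing patches as plain strings are all assumptions made up for illustration, and the model only handles linear, single-parent history.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Commit:
    """Hypothetical stand-in for a commit (not Git's real object format)."""
    name: str                          # plays the role of a commit hash
    patch: str                         # plays the role of the textual change
    parent: Optional["Commit"] = None  # single parent only: linear history


def ancestors(tip: Optional[Commit]) -> List[Commit]:
    """Return every commit reachable from tip, oldest first."""
    chain = []
    while tip is not None:
        chain.append(tip)
        tip = tip.parent
    return chain[::-1]


def rebase(head: Commit, upstream: Commit) -> Commit:
    """Replay the commits in upstream..head on top of upstream."""
    upstream_names = {c.name for c in ancestors(upstream)}
    head_names = {c.name for c in ancestors(head)}

    # The "temporary area": commits on the current branch that are not
    # in upstream, i.e. what `git log <upstream>..HEAD` would list.
    saved = [c for c in ancestors(head) if c.name not in upstream_names]

    # Textual changes already present in HEAD..<upstream>.
    upstream_patches = {
        c.patch for c in ancestors(upstream) if c.name not in head_names
    }

    # Start from upstream, like `git reset --hard <upstream>`.
    new_tip = upstream
    for commit in saved:
        if commit.patch in upstream_patches:
            continue  # same change already accepted upstream: skip it
        # Reapply one by one, in order; each replayed commit is a new object.
        new_tip = Commit(commit.name + "'", commit.patch, new_tip)
    return new_tip


# Toy history: upstream is A-B-C, the current branch is A-D-E,
# and D introduces the same textual change as B.
a = Commit("A", "add readme")
c = Commit("C", "add license", Commit("B", "fix typo", a))
e = Commit("E", "add feature", Commit("D", "fix typo", a))

print([commit.name for commit in ancestors(rebase(e, c))])
```

In this toy history the rebased branch ends up as A, B, C, E': D is dropped because its change already exists upstream as B, and E is replayed as a new commit on top of C.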

Update: There is, of course, a fabulously insightful commentary on reddit.

Update: Thanks folks. You’ve made my usual one or two hundred daily visitors look kinda paltry:

[Image: spike in daily traffic graph]

24 thoughts on “A Guide to GIT using spatial analogies”

  1. Only a geek would think to simplify something by explaining it with “isomorphic contours in source-code phase space”. I think perhaps it would help if you could explain the explanation!

  2. Excellent article. You have convinced me–GIT is far too bizarre and baroque to be useful. We’ll stick with hg and svn, thanks.

  3. Unfortunately, the words ‘isomorphic’, ‘manifold’ and ‘phase space’ have precise meanings which don’t map well here.

  4. I thought the explanation was rather plain to understand, if not a bit too simplified. You could have, for example, shown how branes tangential to the isomorphic contours are a special edge case that can arise surprisingly often.

  5. Git can get a little confusing. If spatial analogies aren’t your thing, perhaps postmodern literary critical theory? Imagine the state of the version control repository as being like the (naturally, entirely impossible) *objective* conception of the author’s intended narrative. Each distributed copy can then have several branches, which are deconstructions of the work, according to several subjective and culturally arbitrary perspectives…

  6. I’d like to try to provide an explanation of that spatial analogy…

    Perhaps it’s simplest to think of a point in a high-dimensional ‘code-space’, in which branches are represented as n-dimensional membranes, mapping the spatial loci of successive commits onto the projected manifold of each cloned repository, as the state of a git repository.

    ;)

  7. @Steve — +1 to that!

    A much simpler view IMO is to consider the Git repositories as narratives, or meta-narratives, each committer being seen as an author of one or more. This is made particularly clear in the git man pages, where a predominant concept is the distinction between closing and opening — in a sense, it promotes the use of structural deappropriation to challenge the linear view of “revisions”. As Torvalds says, “[l]inearity is fundamentally impossible”, so the repository — or at least, a branch — is interpolated into a realism that includes the “current” revision (tip or head, as you prefer) as a reality. To put it in political terms, Mercurial deconstructs Marxist socialism, while Git analyses capitalist construction. Bzr, of course, is a whole different kettle of fish :-)

    For me at least the underlying problem that all this is meant to solve is the object-orientiationism that leads to the meaninglessness of version control history. Ultimately it leads to a dialectic, and eventually the collapse of the realism intrinsic to (say) Mercurial, although in a more submaterial sense. If object-orientiationism holds, we have to choose between realism and the dialectic paradigm of expression. But the characteristic of Git is not desublimation, but postdesublimation.

    Of course, all this is just a long way of explaining that in Subversion the committer is dead, whereas in Git he is simply resting.

    Hope that helps!

  8. Thank you for your article, it made it simpler to understand why git-rebase has been causing me trouble.

    When I rebased, I think my repository might have collapsed into a Calabi-Yau space, but I’m not 100% sure which one. I have narrowed down the possible families of Calabi-Yau spaces but I’m concerned that the number of potential topologies is still infinite.

    Can you fix it for me? I have a project due tomorrow and my prof will fail me.

  9. Wow! I have often tried to put these geometrical intuitions into words. It’s thrilling to find that someone has finally done it for me.

    I think the bit I was missing was how the closure of the set of possible git repositories (under infinite development) extends this countable set to a continuum that is actually much easier to reason about.

    In fact the use of intersecting isoclines in configuration space is nothing short of brilliant. There are publishable results here.

  10. Some codebases tend to code-mass approaching a code-analogous Chandrasekhar limit, past which the entire codebase will collapse into a code singularity, in which any further addition of code will result in the loss of all useful information in the added code.

  11. Pingback: I hate git – The Scribbler of the Rueful Countenance

  12. Pingback: tap tap tap ~ Tools for effective iPhone app development

  13. Pingback: Lost in the Triangles » Blog Archive » Mercurial/Kiln experience so far

  14. Pingback: Git… enough! – The Scribbler of the Rueful Countenance

  15. Thank you for this simple explanation! LMAO!

    To visualise this, it’s simplest to think of the state of your repository as a point in a high-dimensional ‘code-space’, in which branches are represented as n-dimensional membranes, mapping the spatial loci of successive commits onto the projected manifold of each cloned repository.
