Picturing Git: Conceptions and Misconceptions

Picturing Git: Conceptions and Misconceptions (biteinteractive.com)

145 points by nimeshneema 4 years ago | 99 comments

I read most of this long article, and I found it useful, but:

It's unsurprising that people's mental model of git is incorrect. Git is not something people study at a conceptual level, it's something they learn recipes for in order to work on some project. Recipes like "how do I save all this work I just did" and "oh shit, everything is hosed, please give me a magic spell I can paste into my terminal to fix it".

I don't really blame people, since git itself does nothing to teach you how it works. Git is the definition of something you have to deal with in order to do something more important to you. Some people want to dig deep and understand how the system works: it's nice to sit near that person and ask them for help sometimes.

Saying "you should really understand more about git" is like saying "you should really study the tax code, it's important and it affects you whether you like it or not." True, but deeply irrelevant!

Certhas 4 years ago | |

I think it's the other way around. The fact that git does not provide a clean analogous way to intuitively interact with it just demonstrates that the git interface is horribly broken.

This is not essential complexity, it's just bad design that stuck.

Take a look at https://gitless.com/

If you just look at a summary of the commands, you will have an accurate mental model of what's going on:

    gl init - create an empty repo or create one from an existing remote repo
    gl status - show status of the repo
    gl track - start tracking changes to files
    gl untrack - stop tracking changes to files
    gl diff - show changes to files
    gl commit - record changes in the local repo
    gl checkout - checkout committed versions of files
    gl history - show commit history
    gl branch - list, create, edit or delete branches
    gl switch - switch branches
    gl tag - list, create, or delete tags
    gl merge - merge the divergent changes of one branch onto another
    gl fuse - fuse the divergent changes of one branch onto another
    gl resolve - mark files with conflicts as resolved
    gl publish - publish commits upstream
    gl remote - list, create, edit or delete remotes

To me this clearly demonstrates that the problem isn't that people aren't learning git, it's that git is bad to learn. Stash + Index + Working Tree isn't the right abstraction to present to people. Just say there is a working tree, and tracked and untracked files and snapshots. Done. Branches aren't particular commits but particular working trees on top of particular commits.

Working on a feature and want to look at the main branch, but not ready to commit the changes yet? Well just switch to the main branch, then switch back and pick up where you started. No need to know about an additional data structure called the stash.

Unfortunately this did not pick up enough steam. And because a lot of tools expose concepts from git's broken interface, you have to learn the git interface anyway...

zauguin 4 years ago | | |

Having used `gitless` a while ago as my main interface, I strongly disagree. Having a distinction between my working tree and things I'm actually considering committing is a luxury you only really start to miss when it's gone. IMO gitless makes it way too easy to commit too much. Also, its "feature" of keeping uncommitted changes local to the branch is just weird. If I want to make a branch-specific change, I create a commit. This has the big advantage that it actually forces the user to add a message saying what the change is about, so if something else comes up I know what was going on when coming back to it later. It's not like this has to be a formal commit message; after all, the commit can be dropped again later. Otherwise you end up being surprised by old experiments when switching to branches you haven't used in a while. If I just switch branches, then the most likely reason is that I want to move the changes.
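The distinction described here (working tree versus what is staged for the next commit) can be sketched with a toy model. This is only an illustration, not how Git is implemented; the `ToyRepo` class and its method names are made up for the example.

```python
# Toy model (not Git's real implementation): three places a file's content can live.
class ToyRepo:
    def __init__(self):
        self.working = {}   # path -> content on disk
        self.index = {}     # path -> content staged for the next commit
        self.commits = []   # list of snapshots, oldest first

    def write(self, path, content):
        self.working[path] = content

    def add(self, *paths):
        # Like `git add <path>`: copy the working version into the index.
        for p in paths:
            self.index[p] = self.working[p]

    def commit(self, message, all_tracked=False):
        # Like `git commit -a`: first restage every path that is already tracked.
        if all_tracked:
            self.add(*[p for p in self.index if p in self.working])
        self.commits.append({"message": message, "files": dict(self.index)})


repo = ToyRepo()
repo.write("app.py", "v1")
repo.write("notes.txt", "scratch")
repo.add("app.py")
repo.commit("add app")                        # notes.txt stays out of the snapshot

repo.write("app.py", "v2")
repo.write("notes.txt", "more scratch")
repo.commit("update app", all_tracked=True)   # picks up app.py, still skips notes.txt
```

In this sketch the untracked `notes.txt` never reaches a snapshot unless it is added explicitly, which is the "luxury" being argued for.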

antman 4 years ago | | |

    gl merge - merge the divergent changes of one branch onto another
    gl fuse - fuse the divergent changes of one branch onto another

Good while it lasted though

tharkun__ 4 years ago | | |

In the same vein as my sibling, whom I agree with, but without repeating what he said: I regularly commit just specific files. I actually teach every GUI I use that comes with git integration NOT to auto-add and such nuisances. I use the command line, and in probably 90% of cases a git commit -a is what I do. Another 5% is git add on the entire directory tree I am in, and the other 5% are specifically picking what to commit. I'm all for UIs doing auto-add and the commit -a equivalent by default. But do not take that ability away from me!

The list you provide sounded great until it came to gl switch. Why is there one specific operation for a branch that is NOT done via gl branch?

I don't understand what fuse is supposed to do from this at all. No idea whatsoever. Merge I get and anyone who has worked with any other versioning tool does conceptually.

Rebase most people seem to have a problem with but the abstract concept really isn't that hard. Just like cherry pick isn't really hard but somehow people have trouble with it. Though conceptually it really isn't hard either.

What really helped me the most with git was the realization that it's just a tree of commits with a bunch of labels. Labels have different types so to speak, like branch or tag, remote branches being special in a way etc. And obviously various commands can interact with these labels. Like a fetch updates the remote labels and moves them around on my local copy.
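A minimal sketch of the "commits plus labels" picture, assuming a toy in-memory repository. The `fetch` function and the commit ids are hypothetical; only the `refs/heads/...` and `refs/remotes/origin/...` naming follows Git's real ref layout.

```python
# Toy model: commits form a graph; refs are just names pointing at commit ids.
commits = {            # commit id -> list of parent ids
    "c1": [],
    "c2": ["c1"],
}
refs = {
    "refs/heads/main": "c2",            # local branch label
    "refs/remotes/origin/main": "c2",   # remote-tracking label
}


def fetch(remote_commits, remote_refs, remote_name="origin"):
    """Copy new commits locally and move only the remote-tracking labels."""
    for cid, parents in remote_commits.items():
        commits.setdefault(cid, parents)
    prefix = "refs/heads/"
    for name, cid in remote_refs.items():
        branch = name[len(prefix):]
        refs["refs/remotes/" + remote_name + "/" + branch] = cid


# Someone else pushed c3 to the remote's main branch.
fetch({"c1": [], "c2": ["c1"], "c3": ["c2"]}, {"refs/heads/main": "c3"})
```

After the call, the remote-tracking label points at `c3` while the local `main` label has not moved; bringing the local branch forward is a separate step (a merge or rebase in real Git).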

lmm 4 years ago | | |

I hate the inconsistency that stash brings, but gitless is useless since it destroys the primary use case for stashing: I start making changes and then realise I'm working on the wrong branch. Git's solution to the problem is awful, but it's better than nothing.
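The disagreement in this subthread comes down to what happens to uncommitted edits when you switch branches. Here is a toy comparison of the two policies as they are described above. It is neither real Git nor real Gitless code, and the `Workspace` class is invented for the illustration.

```python
# Toy comparison of two "switch branch" policies.
class Workspace:
    def __init__(self, per_branch_changes):
        self.per_branch_changes = per_branch_changes  # True: Gitless-like policy
        self.branch = "main"
        self.uncommitted = {}   # path -> edited content, not yet committed
        self.parked = {}        # branch -> uncommitted edits left behind there

    def edit(self, path, content):
        self.uncommitted[path] = content

    def switch(self, target):
        if self.per_branch_changes:
            # Edits stay with the branch they were made on.
            self.parked[self.branch] = self.uncommitted
            self.uncommitted = self.parked.pop(target, {})
        # Otherwise the edits simply travel along (Git does this when they
        # do not conflict with the target branch).
        self.branch = target


git_like = Workspace(per_branch_changes=False)
git_like.edit("fix.py", "patch")
git_like.switch("feature")       # started on the wrong branch: the edit comes along

gitless_like = Workspace(per_branch_changes=True)
gitless_like.edit("fix.py", "patch")
gitless_like.switch("feature")   # the edit is left behind on main
```

Under the first policy, moving work that was started on the wrong branch needs no extra step when nothing conflicts; under the second, the same edit has to be moved deliberately.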

paulddraper 4 years ago | | |

> just demonstrates that the git interface is horribly broken

This is HN criticism #94238 on the terrible git CLI.

Okay, sure.

Would you kindly post your superior git CLI? Or at least the outline of it?

---

Snark aside, Git's popularity is not an accident. Bitbucket supported Mercurial too.

cerved 4 years ago | | |

personal opinion, if you're a software engineer that can't be bothered to learn git I'm not sure that I respect you as a professional

tomxor 4 years ago | |

> I don't really blame people, since git itself does nothing to teach you how it works. Git is the definition of something you have to deal with in order to do something more important to you. Some people want to dig deep and understand how the system works: it's nice to sit near that person and ask them for help sometimes.

The official git handbook, freely available on the official git-scm site is not terribly long, and explains the internals on a conceptual level quite well.

I think the problem is that most people learning git land on some WordPress site of someone trying to flog a condensed and uninsightful shortcut to getting started with git for ad clicks, which only involves a series of commands without explaining the effects of those commands. This, combined with people's expectation that an SCM should take no thought whatsoever, causes most people who use git on a day-to-day basis to not really understand it at all.

Git needs to be introduced as a powerful data structure, kind of like how SQL is not a DB. Imagine someone explaining SQL without ever referring to the DB tables, rows and fields... Only talking about git commits is like only talking about the result of a single query. You must understand the data structure to easily use the interface; otherwise the interface will be very confusing or you will be limited to "recipes"... After that you are just learning new variations on how to manipulate and navigate that structure (yes, the graph), and from this perspective people's complaints about the historical inconsistencies we have to put up with in git porcelain are moot.

MathMonkeyMan 4 years ago | | |

I was going to write a blog post conveying my mental model of what git is (having had one too many conversations along the lines of "no, git is not a ledger of diffs").

So, I started reading through <https://git-scm.com/book/en/v2/Git-Internals-Git-Objects> again to make sure I didn't have anything wrong.

But now there's no point in writing a blog post. Maybe I'll write one that just links to <https://git-scm.com/book/en/v2/Git-Internals-Git-Objects>.

It even has nice diagrams, which I think are essential for this kind of thing.
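To make the "not a ledger of diffs" point concrete, here is a small sketch assuming Git's default SHA-1 object format. `blob_id` follows the blob hashing scheme described in the linked chapter; `snapshot` is a simplified stand-in for a tree object, not the real tree encoding.

```python
import hashlib


def blob_id(content: bytes) -> str:
    # A blob id is the SHA-1 of a small header plus the raw file bytes.
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def snapshot(files: dict) -> dict:
    # Simplified stand-in for a tree object: path -> blob id for every file.
    return {path: blob_id(data) for path, data in files.items()}


v1 = snapshot({"README": b"hello\n", "main.py": b"print(1)\n"})
v2 = snapshot({"README": b"hello\n", "main.py": b"print(2)\n"})

# Each commit names a full snapshot; a diff is computed on demand by comparing two.
changed = sorted(p for p in v2 if v1.get(p) != v2[p])
```

Unchanged files get the same id in both snapshots, so they are stored once. Packfiles do delta-compress objects on disk, but that is a storage detail underneath the snapshot model.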

dreamcompiler 4 years ago | |

The tax code is a completely inscrutable mess but git's internal model is one of the most simple and elegant structures in modern computer science. It's just covered over with utterly stupid commands and terminology that obscures the beauty of the underlying architecture.

I used to despise git because it was so hard to learn. Then as an exercise I started writing my own code to read and write its underlying files and it finally dawned on me how simple the whole thing was.
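For a sense of how little code that exercise takes, here is a hedged sketch that writes and reads a loose object in the zlib-compressed layout Git uses by default (SHA-1 format). The function names are made up, and it works in a scratch directory rather than a real repository.

```python
import hashlib
import os
import tempfile
import zlib


def write_object(objects_dir: str, kind: str, body: bytes) -> str:
    # Loose object layout: zlib("<kind> <size>\0<body>") at objects/<2 hex>/<38 hex>.
    raw = kind.encode() + b" " + str(len(body)).encode() + b"\0" + body
    oid = hashlib.sha1(raw).hexdigest()
    folder = os.path.join(objects_dir, oid[:2])
    os.makedirs(folder, exist_ok=True)
    with open(os.path.join(folder, oid[2:]), "wb") as fh:
        fh.write(zlib.compress(raw))
    return oid


def read_object(objects_dir: str, oid: str):
    with open(os.path.join(objects_dir, oid[:2], oid[2:]), "rb") as fh:
        raw = zlib.decompress(fh.read())
    header, body = raw.split(b"\0", 1)
    kind, size = header.decode().split(" ")
    assert int(size) == len(body)   # the header records the body length
    return kind, body


objects_dir = os.path.join(tempfile.mkdtemp(), "objects")  # scratch dir, not a real repo
oid = write_object(objects_dir, "blob", b"hello\n")
kind, body = read_object(objects_dir, oid)
```

Commits, trees and tags use the same envelope with a different `kind` and body, which is a large part of why the model feels small once you have seen it.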

Git's a very unusual piece of software; it's mind-bogglingly useful, the basic data structures and algorithms are perfectly matched to its job, and it has a UI that's a train wreck.

ilikepi 4 years ago | |

In a literal sense, sure, git _the tool_ doesn't do much, though I think this is slowly improving as it evolves. For example, there is an experimental `git switch` command[1] under development to provide a simpler interface for changing branches. For me, the biggest leap in developing my own mental model was reading Scott Chacon's book Pro Git, and that is now available online for free on the official Git website[2].

[1]: http://git-scm.com/docs/git-switch

[2]: http://git-scm.com/book/en/v2

cerved 4 years ago | |

idk man, if you're a software engineer I think the onus is on you. There are plenty of great and free resources, like the pro git book. Every month there's a thread where a bunch of people come in and bemoan how git is complicated blah blah. Every month lots of people point out that git is much easier to use if you just bother to conceptually learn about its internals.

it's like coming into a forum for accountants where people bitch about having to learn tax code. please...

cryptonector 4 years ago | |

I use this when I want to teach someone Git: https://gist.github.com/nicowilliams/a6e5c9131767364ce2f4b39...

tempodox 4 years ago | | |

I find that a good introduction.

“All operations on a repository involve adding commits and/or manipulating the name resolution table.”

It may be simplified, but that statement alone, taken in context, is worth its weight in gold.

OneEyedRobot 4 years ago | |

>Some people want to dig deep and understand how the system works

I'd say that's definitely the case but also a problem.

Sophisticated users mixed with people who just want to do a few simple things is a bad combination. I seem to remember that ClearCase had the same issues.

_aleph2c_ 4 years ago |

Git isn't going away, so we might as well master it.

If you would like to know more about how to manipulate the git graph, take this excellent (and free) training:

https://learngitbranching.js.org/

To slowly level up, you can watch video demonstrations from Dan's git school. Dan provides 48, 30 minute training videos:

https://www.youtube.com/watch?v=OZEGnam2M9s&list=PLu-nSsOS6F...

mynameismon 4 years ago |

Related: A video[0] from The Missing Semester, a course by CSAIL MIT, which covers Git in one of its lectures. Personally, I think the entire series is a must-watch, but if time is limited, the first 20-odd minutes give an absolutely fantastic introduction to Git.

[0]: https://missing.csail.mit.edu/2020/version-control/

darekkay 4 years ago |

That's the blog post I always wanted to write. So many people spend little to no time actually learning Git, because "it's just a tool to help you do your 'real' work" (= coding). Or because "Git is too difficult/confusing/broken". Or because "I don't need anything except commit/push/pull". I find those arguments somewhat true, but I still feel that people are missing out when they don't properly learn a tool they use daily. In the best case, it makes them less efficient. In the worst case, they get into "unsolvable" issues and/or pollute the Git history with useless commits, making blames more troublesome for the whole team.

jimbob45 4 years ago | |

It is just a tool to help you do your real work and famously gets in the way. You use SVN and it covers 99% of use cases much more simply than Git manages.

sagonar 4 years ago | | |

If svn covers 99% of your use cases, then you need more experience with distributed version control systems.

Being able to commit locally, examine changes, work with them and then push is something you might not need or require if you think about version control in terms of a system like SVN.

But if you have learned Git or Mercurial or some other distributed system you would never go back to svn.

deckard1 4 years ago | | |

that's just laughable considering SVN needs a central repo to work, and I can just do "git init ." to create a git repo in any directory at any time I want, and it is entirely self-contained.

Once an SVN user discovers the magic of a staging area, stashes, or "git add -p" I don't know how they could claim SVN does anything better. All I remember from those days was how slow everything in SVN was. It felt like every command was backed by some horrible O(n^2) operation or really slow network connection.

git isn't hard. FFS, we shouldn't keep seeing these posts hitting HN every week. iptables? That's tough. DNS? No thanks. Managing package.json and keeping an app up-to-date? Git is nothing in comparison to the real challenges I face every day.

8372049 4 years ago | | |

When you write things like that, you come across as trolling. Their point wasn't git vs. another tool, their point was the importance of knowing intricate details about your tools in certain circumstances.

If SVN is wonderful for you: Great! But that's not really relevant to the issue of using git effectively.

BossingAround 4 years ago | | |

I've never used SVN, but I assume that because git uniformly won as the VCS tool, it suits most people better than SVN..?

wokwokwok 4 years ago |

hm. vexing.

I feel like this is mostly accurate, to my knowledge, but reading this:

> I do not claim that this way of looking at Git represents absolute “facts” in any hard and fast or literal sense. But I contend that if you conceive of Git in the way that I’m going to suggest, if you substitute these conceptions of Git for any misconceptions you might have now, you’ll be a much happier and more fluid Git user.

…vexes me.

“Think of git like a bowl of peanuts and marshmallows” and other pointless, wrong metaphors about how git works are a dime a dozen.

Yet, here is someone who is clearly quite familiar with git, and they go to pains to point out they are simplifying and may not be correct in their explanations.

It's good to be humble, but ffs, git is too frigging complicated if the best you can get is a “probably wrong simplified mental model of how it works so you can be a bit more productive with it”.

I don't care;

- a simple meaningless metaphor that lets you be more productive? OK.

- an accurate description of how things actually work? OK.

…but pick one.

What I do not want is a possibly wrong complicated explanation of how git maybe works.

jldugger 4 years ago |

> Picturing Git: Conceptions and Misconceptions

Based on the title, I was expecting a more in-depth study of user misconceptions about git, similar to the famous CogSci paper "Two Theories of Home Heat Control." Except with like, diagrams.

And now I want someone to make that happen.

avip 4 years ago | |

While you wait, you can read the excellent hg init https://hginit.github.io/ for inspiration.

rendall 4 years ago |

I avoid using words like "simple" when writing about technical topics. "It is simple!" is not inclusive language and, love git or hate git, it is not inherently simple. It takes effort to understand, and experience to avoid its pitfalls.

bironran 4 years ago |

Ugh. So many concepts. So many things to remember. Why? Git is simple. SIMPLE. But only, IMO, if you go bottom-up and not top-down. There are only 6 critical concepts in Git and each is simple enough to be described in a single sentence.

1. Commits are immutable blobs that have zero or more parents (only a root commit has none). Graphs, not trees. Anyone who uses trees for git commits misses the whole point and makes their (and their collaborators') lives complicated.

2. Tags are (mostly, best practice) immutable pointers to commits. Tags are "this is this thing FOREVER*."

3. Branches are named, mutable (by design) pointers to commits. Branches are "this is this thing FOR NOW. Later it'll be something else."

4. HEAD is a special "branch" that moves around automatically.

5. Origin is the local snapshot of the remote. Origin is "what did it look like when I last looked."

6. (fundamental but not critical) Remote is the current remote state (queried by RPC).

7. Index (aka stage) is where you put changes you want to make into commits. (this is somewhat simplified). Index is "My current and immediate plan. Scrub as needed."

That's (mostly, for non advanced use cases) it. Everything else are commands to query or manipulate the various state. Every action (until it becomes instinctual knowledge) should follow the same recipe: 1. Figure out the current state (current commit graph, relevant branches). 2. Figure out the target state (desired commit graph, new branches positions). 3. Mutate using ANY command you want.

I think that's the issue really. Inexperienced devs / people who don't understand git look at commands as "this is how to do a thing". No. In Git there isn't "how to do the thing". It's exactly like writing code - so many ways to achieve the goal, just choose your own. It might be efficient and elegant, or bumbling and ugly, but it'll get there.
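The "current state, target state, any command" recipe can be sketched with a toy graph. The two functions below only loosely mirror `git reset --hard <commit>` and a fast-forward `git merge <branch>`; the names and the three-commit history are assumptions made for the example.

```python
# Toy model: an immutable commit graph plus mutable branch pointers.
commits = {"a": (), "b": ("a",), "c": ("b",)}   # commit id -> parent ids


def is_ancestor(old, new):
    # Walk parent links from `new` looking for `old`.
    stack = [new]
    while stack:
        cid = stack.pop()
        if cid == old:
            return True
        stack.extend(commits[cid])
    return False


def reset_hard(branches, head, target):
    # Point the current branch straight at the target commit.
    moved = dict(branches)
    moved[head] = target
    return moved


def merge_ff(branches, head, other):
    # Fast-forward: only allowed when the current tip is an ancestor of the other tip.
    assert is_ancestor(branches[head], branches[other]), "not a fast-forward"
    moved = dict(branches)
    moved[head] = branches[other]
    return moved


start = {"main": "c", "topic": "a"}             # current state
via_reset = reset_hard(start, "topic", "c")     # one route to the target state
via_merge = merge_ff(start, "topic", "main")    # a different route, same result
```

Both routes leave `topic` pointing at `c` without touching the commits themselves, which is the sense in which the choice of command is secondary to knowing the start and end states.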

thewebcount 4 years ago |

> The problem with how people use Git, I’m suggesting, is that their analogical or metaphorical conception of Git doesn’t work — it doesn’t fit the way Git actually behaves — if, indeed, the conception exists at all.

No, the problem is not with "how people use Git". The problem is with git. We've known for years how to make clear, concise interfaces that help people understand what's going to happen. Git does not have a clear, concise interface. That is its biggest problem and will continue to be until it is changed to have a clear, concise interface.

squaresmile 4 years ago | |

Fwiw, I'm not sure if it's recent or not, but the git CLI has pretty good suggestions of what commands to run next. git status gives a decent amount of info. It also suggests the newer commands like switch or restore.

jatone 4 years ago | |

then feel free to build that tool on top of git. nothing is stopping you, given the existence of such tools today.

cerved 4 years ago | |

git has a very clear, concise and stable interface if you understand how git works. it's designed this way, intentionally. people should stop complaining about it and either learn how to use it, switch to another tool or just write their own interface

recursive 4 years ago | | |

It's not possible to switch to another tool unless you only ever work on code that you wrote, and never need to collaborate with anyone else.

toiletaccount 4 years ago |

Complaining about git is like complaining about any other unix tool, if you don't read the docs you're in for trouble. Sometimes a lot of it.