Introducing Git protocol version 2

Introducing Git protocol version 2 (opensource.googleblog.com)

547 points by robmaceachern 8 years ago | 163 comments

The current (and pretty much only, ever, despite Linus having been the creator) maintainer of git is a google employee [1], in case anyone else was wondering.

[1] https://en.m.wikipedia.org/wiki/Junio_Hamano

chrisseldo 8 years ago | |

>"Linus Torvalds said in 2012 that one of his own biggest successes was recognizing how good a developer Hamano was on Git, and trusting him to maintain it."

kumarharsh 8 years ago | | |

I came across this email from Linus announcing the handover: https://lwn.net/Articles/145123/

It's interesting how the first ever git project itself was looking for a new maintainer almost as soon as it was created.

vga805 8 years ago | | |

wow! what an accolade

AdmiralAsshat 8 years ago | |

Thanks, that really helps.

As an open-source advocate, my first thought was, "Why the hell is Google releasing a version of a protocol that Linus Torvalds wrote?"

Without that context, it would be like Google throwing up an announcement, "Introducing Google's Linux Kernel 5.0!"

skywhopper 8 years ago | | |

Yeah, that was my reaction, and it made me sad that Google has so eroded my trust over the decades that I was turned off at seeing an announcement implying they are deeply involved in core open source tools. I mean, who else but companies swimming in cash can truly deeply support this stuff, and for the most part, the people working on these tools really do care about the open source community. But Google's reputation is so tarnished that my gut reaction is at odds with my rational one, and that's a sad thing to realize.

fiatjaf 8 years ago | | |

That may seem odd, but it could happen in an open-source world: multiple parties releasing different versions of the same piece of software and calling it the same.

Ceezy 8 years ago | | |

Same here. I was like, "If they are about to drop 3 versions at the same time like Angular, I need to use SVN ASAP".

Retroity 8 years ago | |

Non-Mobile link for those on desktop: https://en.wikipedia.org/wiki/Junio_Hamano

Cynddl 8 years ago | | |

The mobile version of Wikipedia works perfectly fine in a browser. I personally prefer it for readability.

biohax2015 8 years ago | |

Fun fact: if you google “git blame” it returns his wikipedia entry.

ojosilva 8 years ago | |

Alright, I was wondering why this was published on the Google Opensource website. I had no idea. Yet, the Git project itself has not been published under their umbrella.

https://opensource.google.com/projects/list/featured

willnorris 8 years ago | | |

We currently only list projects that are or were primarily developed by Google. We decided to include projects that started at Google and were since donated to foundations, such as Kubernetes.

But we aren't yet including projects where we are just heavy contributors, but they're not "Google projects". That includes Linux, git, LLVM, and a host of others. We do want to recognize them in our project directory, but want to make sure that they are distinguished from Google projects so that we're not implying something that is inaccurate.

avar 8 years ago | | |

One of the places it's hosted at is https://kernel.googlesource.com/pub/scm/git/git

See the list of URLs at https://public-inbox.org/git/xmqqindt6g1r.fsf@gitster.mtv.co...

simias 8 years ago |

Let that be a reminder to all the coders out there: if you ever design a protocol or file format to communicate between machines always remember to add a version field or some other way to allow for updates and revisions later without breaking everything. Having a way to specify extensions in a backward-compatible way is nice too.
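
To make that advice concrete, here is a minimal Python sketch of a message header that carries a version field. Everything in it is hypothetical (the `DEMO` magic, the field layout, the set of supported versions); it only illustrates the idea of rejecting unknown versions cleanly instead of misparsing them, and is not how Git frames anything.

```python
import struct

# Hypothetical layout: 4-byte magic, 1-byte version, 2-byte payload length,
# all in network byte order. None of this comes from Git.
MAGIC = b"DEMO"
HEADER = struct.Struct("!4sBH")
SUPPORTED_VERSIONS = {1, 2}


def encode(version: int, payload: bytes) -> bytes:
    """Prefix the payload with a header that states the format version."""
    return HEADER.pack(MAGIC, version, len(payload)) + payload


def decode(message: bytes) -> tuple[int, bytes]:
    """Parse a message, refusing versions this reader does not understand."""
    magic, version, length = HEADER.unpack_from(message)
    if magic != MAGIC:
        raise ValueError("not a DEMO message")
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported version {version}")
    body = message[HEADER.size:HEADER.size + length]
    if len(body) != length:
        raise ValueError("truncated payload")
    return version, body
```

A reader built this way can keep accepting version 1 messages while a later revision adds version 2, and an old reader fails loudly on anything newer.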

avar 8 years ago |

The specification of the v2 protocol is here: https://github.com/git/git/blob/master/Documentation/technic...

One of the more exciting things is that it can now be extended to arbitrary new over-the-wire commands. So e.g. "git grep" could be made to execute over the network if that's more efficient in some cases.

This will also allow for making things that now use side-transports part of the protocol itself if it made sense. E.g. the custom commands LFS and git-annex implement, and even more advanced things like shipping things like the new commit graph already generated from the server to the client.

xorcist 8 years ago | |

If you are to link to a git repo, don't link to some unofficial mirror. That would just confuse search engines.

The specification of the v2 protocol is here: https://git.kernel.org/pub/scm/git/git.git/tree/Documentatio...

(There are a couple of repos listed as official mirrors, such as the googlesource.com one, but the one you linked to isn't one of them.)

avar 8 years ago | | |

The repository I linked to is official. See https://public-inbox.org/git/xmqqindt6g1r.fsf@gitster.mtv.co...

What list are you referring to? If it doesn't list the one on GitHub it needs to be fixed.

Someone1234 8 years ago |

Too bad they didn't make Git LFS part of Version 2[0]. Most vendors[1] support LFS already but because it isn't required, some still lack it and its support cannot be assumed.

[0] https://git-lfs.github.com/

[1] https://github.com/git-lfs/git-lfs/wiki/Implementations

lsiebert 8 years ago |

I'm a very (very) minor contributor to git.

If you are at all interested in hacking on Git, it's not that difficult. Knowing C and portable shell scripting for writing tests are the big things.

One sticking point: you need to submit patches to the mailing list; you can't just open a GitHub pull request.

See https://github.com/git/git/blob/master/Documentation/Submitt...

I still see github pull requests rather frequently, even though they have never been allowed. All discussion AND patches go through the mailing list, much like the linux kernel.

pm215 8 years ago | |

It's unfortunate that github doesn't let a project disable the on-website UI for pull request submission; as it is it's easy for somebody to end up wasting their time trying to submit a change that way. (QEMU has that issue too.)

ImJasonH 8 years ago | | |

Totally agree! I made nopullrequests.com to help solve this.

newscracker 8 years ago |

Sometime in the far future, someone will write an interesting story about how a double null byte came into existence in the git request protocol, and it will be amusing and interesting to look back. As the saying goes, hindsight is always 20/20. I'm glad that they found ways to maintain backward compatibility, at only a minor cost to understanding things.

Confiks 8 years ago |

It's quite a comedy that this feature has not been implemented for at least 6 years, solely because the raw git:// protocol's parameter handling was severely broken, and feature detection by disconnecting and retrying [1] was ultimately deemed far too dirty.

[1] https://public-inbox.org/git/CAJo=hJtZ_8H6+kXPpZcRCbJi3LPuuF...

cpburns2009 8 years ago |

Wait, why was this posted by Google? I thought Git was made by Linus Torvalds.

hk__2 8 years ago | |

Git was created by Linus Torvalds, but out of the 50k+ commits on the repo, only 250 or so are from him, with only 14 in the past 6 years. [1]

[1] https://github.com/git/git/graphs/contributors?from=2012-03-...

sdesol 8 years ago | | |

Here is a more detailed analysis, which shows all contributors:

https://public.gitsense.com/insight/github?r=git/git#b%3Dgit...

These are contributions by Linus:

https://public.gitsense.com/insight/github?r=git/git#b%3Dgit...

and as you can see, his contributions really tapered off after 2010, while contributions from Hamano remained steady from 2008 to the present, as shown below:

https://public.gitsense.com/insight/github?r=git/git#b%3Dgit....

Semaphor 8 years ago | |

> Linus Torvalds said in 2012 that one of his own biggest successes was recognizing how good a developer Hamano was on Git, and trusting him to maintain it.

tgummerer 8 years ago | |

It's because a Google employee implemented protocol v2, and wrote a post about it.

euyyn 8 years ago | |

Opensource though.

rwmj 8 years ago |

Is there a git protocol variant that allows the client to avoid downloading objects that it already has stored locally in another repository or cache?

For example: I have the Linux kernel already cloned in some directory. I clone a second repo which has the Linux kernel as a submodule. Can I clone the second repo straightforwardly without having to download Linux a second time? (Well yes, but only by manual intervention before doing the git submodule update - it'd be nice if objects could be shared in a cache across all repos somehow).

Boulth 8 years ago | |

Have you seen alternates? https://stackoverflow.com/questions/36123655/what-is-the-git...

nothrabannosir 8 years ago | |

You could literally link the two object directories?

I just tried this and it seems to work:

  git clone git://github.com/git/git
  mkdir git2
  cd git2
  git init
  cd .git/
  rm -rf objects
  ln -s ../../git/.git/objects objects # share the first clone's object store
  cd ../
  git remote add origin git://github.com/git/git
  git fetch # returned without downloading anything
  git checkout master
  ls # etc.

If you seriously want to use this, you'll probably want to hard link the contents instead. But iirc git clone from local disk already does that for you?

In short: clone your local copy and take it from there?

falsedan 8 years ago | | |

You can also use alternates:

  echo ../../../git/.git/objects >> git2/.git/objects/info/alternates

or use the original as a reference:

  git clone --reference git git://github.com/git/git git2

This sets up the alternates for you.

wereHamster 8 years ago | |

There's a git command for that: https://git-scm.com/docs/git-worktree

Bjartr 8 years ago | |

Maybe this project could work for you?

https://github.com/jonasmalacofilho/git-cache-http-server

buckminster 8 years ago |

AIUI, the git ssh protocol is just the git protocol tunnelled through ssh. So why do they need different mechanisms for signalling V2?

_wmd 8 years ago | |

Deploying Git over SSH entails locking the precise command line executable by the public key you use to authenticate. Locking SSH SendEnv down is mandatory too, otherwise thousands of people would have shell access to GitHub.com!

This isn't even theoretical: there was an environment-related bug not 5 years ago involving Git. At least BitBucket was impacted; I think GitHub was patched before it was announced.

simias 8 years ago | | |

I don't think that answers the parent's question, if the update was in the git protocol itself (encapsulated in the SSH session) then you wouldn't have to change anything at the SSH level.

As you point out selectively allowing a new environment variable could open a can of worms for shared hosts like github if they mess up their implementation.

xyzzyz 8 years ago | |

Because if you tunnel through ssh, you can signal v2 using ssh mechanism of setting environment variables. If you don't tunnel, you don't have this option. This is clearly described in the article.

deathanatos 8 years ago | | |

I think what the person you're replying to is asking is why not, in the case of ssh, use the signaling in the git protocol, since it will be there anyways. That is, if you don't tunnel, you must signal w/ the git protocol. If you do tunnel, why use a different mechanism, since the signal in the git protocol must be there?

I think that this is because the SSH protocol isn't just encapsulating the Git protocol directly (the initial assumption of ssh "just" encapsulating the git protocol is not fully correct), and one of the parts that differs is this particular part. On the git protocol side, we need to select a "service":

> a single packet-line which includes the requested service (git-upload-pack for fetches and git-receive-pack for pushes)

which in SSH would be done not by transmitting that packet-line but by instructing SSH to run that particular executable.

> This is clearly described in the article.

It really isn't, IMO; if you don't have precise knowledge of the protocols involved, I don't think anything in the article particularly spells this out.
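
As a rough summary of the article's point, here is a small Python sketch of where a client puts the version request for each transport: an extra parameter in the first request for git://, the GIT_PROTOCOL environment variable for ssh:// and file://, and a Git-Protocol header for http(s)://. The function name and the returned dictionary shape are made up for illustration; only the variable name, the header name, and the `version=2` value come from the article.

```python
def version_signal(transport: str, version: int = 2) -> dict:
    """Describe where a client would place its protocol version request."""
    value = f"version={version}"
    if transport == "git":
        # No environment or headers exist here, so the value travels as an
        # extra parameter after the double NUL in the initial request.
        return {"extra_parameters": [value]}
    if transport in ("ssh", "file"):
        # The service is started as a program, so an environment variable
        # works; over ssh the server must also be set up to accept it.
        return {"env": {"GIT_PROTOCOL": value}}
    if transport in ("http", "https"):
        return {"headers": {"Git-Protocol": value}}
    raise ValueError(f"unknown transport {transport!r}")
```

This is why ssh does not simply reuse the git:// mechanism: over ssh there is no initial request line naming the service, because the service is chosen by the command that ssh is asked to run.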

buckminster 8 years ago | | |

Yes, but once you've updated the git protocol, ssh support comes for free. Having one mechanism is simpler than having two. And as your sibling notes, setting env vars from ssh has disadvantages. So why bother?

Boulth 8 years ago |

> Server-side filtering of references

I wonder if this will be somehow exposed by git daemon. It could be used for easy per ref access controls.

For example Git Switch [0] that uses Macaroons had to clone the repository to implement per ref ACL.

[0]: https://github.com/rescrv/gitswitch
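
To make the filtering concrete, here is a rough Python sketch of a v2 `ls-refs` request that asks only for branches, based on my reading of the protocol-v2 document. The helper names are hypothetical, and a real client would normally also send capability lines before the delimiter; this only shows the framing (length-prefixed pkt-lines, a `0001` delimiter, a `0000` flush).

```python
FLUSH_PKT = b"0000"  # ends the request
DELIM_PKT = b"0001"  # separates the command section from its arguments


def pkt_line(text: str) -> bytes:
    """Frame one line: 4 hex digits of total length, then the data."""
    data = text.encode() + b"\n"
    return b"%04x" % (len(data) + 4) + data


def ls_refs_request(prefixes: list[str]) -> bytes:
    """Build an ls-refs request limited to refs under the given prefixes."""
    parts = [pkt_line("command=ls-refs"), DELIM_PKT]
    parts += [pkt_line(f"ref-prefix {prefix}") for prefix in prefixes]
    parts.append(FLUSH_PKT)
    return b"".join(parts)
```

Calling `ls_refs_request(["refs/heads/"])` yields the bytes a client would send to list branches only. Note that the prefix is chosen by the client, so by itself this saves bandwidth rather than enforcing access control; a per-ref ACL would still need the server to decide what each user may see.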

ksec 8 years ago |

I thought google uses hg, have they switched over to git as well?

seabrookmx 8 years ago | |

For all the "big" Google projects they use a proprietary system called piper.

I think all their open-source stuff (Angular, GoLang, Android) uses git (and sometimes Gerrit).

Although given Google's scale, I'm sure there's some teams/projects that use Mercurial.

ngoldbaum 8 years ago | | |

In fact, developers are allowed to use whichever VCS tool they want on their local machine (or on the online coding in the cloud CitC environment). Some opt to use hg. The canonical repo is in piper though, so the hg commits or git commits get converted before they land.

kardianos 8 years ago | | |

Gerrit is a review server that uses git. In fact, Gerrit now stores the majority of the information it uses in git itself.

So for Google external projects, they use git.

> Although given Google's scale, I'm sure there's some teams/projects that use Mercurial.

I doubt it. Their tooling is probably pretty specific, and now that code.google.com has shut down, they probably don't have any review servers that support it.

pjmlp 8 years ago | | |

Go started on Mercurial and then eventually moved into Git.

hartator 8 years ago |

Is Git a Google project now?

jkaplowitz 8 years ago | |

No, but many of the core contributors are employed by Google and spend time on it as part of their day job (with Google's knowledge and permission). This post straddles both the open source part of their jobs and the "Git deployment at Google" part.

s2g 8 years ago | |

BRB switching to Mercurial

Ericson2314 8 years ago |

This is disgusting. So little foresight in the past... At least the outcome is quite useful.

s2g 8 years ago |

oh neat, and it's on a google blog.

That's great. Another subtle reminder that this ad company has way too much control.

GauntletWizard 8 years ago |

Interesting that they took to the Google blog to announce this; is there a corresponding LKML post?

jkaplowitz 8 years ago | |

Why LKML? Despite Git's origins from and use by the Linux project, it isn't especially tied to it now.

LKML would presumably be the place for Linux to announce when they adopt this.

The Google open source blog is among the several credible options for this post, since Google employs much of the core Git team, and this post discusses their experience deploying Git protocol v2 at Google.

As noted in the blog text, it's not in a released version of Git yet, just Git master branch. So maybe it'll appear on a dedicated Git announcement list, if any, once that happens.

joatmon-snoo 8 years ago | | |

Junio posts on the list when there's a new release, e.g. https://public-inbox.org/git/xmqqwoxw6kkk.fsf@gitster-ct.c.g...

It seems that https://groups.google.com/forum/#!forum/git-packagers is the closest thing to a formal announcement list that there is.

Analemma_ 8 years ago | |

> support for v2 was recently merged to Git's master branch and is expected to be part of Git 2.18

Not yet, but presumably there will be a post like this: https://lkml.org/lkml/2018/4/2/425 when it is released. It is strange that the Google Blog is the first place to announce it though.

tgummerer 8 years ago | |

As mentioned in another comment, protocol v2 was implemented by a Google employee, and they decided to write a blog post about it. This is not an official git announcement.

u801e 8 years ago | |

I found a mention of it in a "what's cooking" post on the git mailing list (Message-ID <xmqqvabm6csb.fsf@gitster-ct.c.googlers.com>). But I can't find a direct link on gmane.com right now.

gpvos 8 years ago |

Git didn't have a proper version number or extensibility field in its protocol? That's quite a bit of hubris.

zeroxfe 8 years ago | |

Or, more likely, an oversight.

gpvos 8 years ago | | |

Hmm, I haven't designed very many data formats or wire protocols, and I won't claim I got it right any of those times, but I included some kind of extension possibility every time.

Unfortunately due to a bug introduced in 2006 we aren't able to place any extra arguments (separated by NULs) other than the host because otherwise the parsing of those arguments would enter an infinite loop.
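
For anyone who wants to see what that looks like on the wire, here is a minimal Python sketch of the initial git:// request as I understand it from Git's pack-protocol documentation: one pkt-line (4 hex digits of length, counting the length field itself) carrying the command, the path, a NUL, the host parameter, and a NUL. Extra parameters such as `version=2` go after a second NUL, each terminated by its own NUL, which is where the "double null byte" comes from. The function names and the example host and path are placeholders.

```python
def pkt_line(payload: bytes) -> bytes:
    """Prefix the payload with its total length as 4 hex digits."""
    return b"%04x" % (len(payload) + 4) + payload


def git_proto_request(command: str, path: str, host: str,
                      extra_parameters: tuple[str, ...] = ()) -> bytes:
    """Build the first request a client sends over the git:// transport."""
    payload = command.encode() + b" " + path.encode() + b"\0"
    payload += b"host=" + host.encode() + b"\0"
    if extra_parameters:
        # The second NUL marks the start of the extra parameters; older
        # servers stop reading at the host parameter and never see them.
        payload += b"\0"
        for parameter in extra_parameters:
            payload += parameter.encode() + b"\0"
    return pkt_line(payload)
```

With no extra parameters this produces the classic request, for example `0033git-upload-pack /project.git\0host=myserver.com\0`; adding `("version=2",)` appends `\0version=2\0` and bumps the length prefix to `003e`.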