Grit: Rewriting Git in Rust with agents (blog.gitbutler.com)

83 points by cbrewster 7 hours ago | 124 comments

Philpax 4 hours ago |

> In looking at the code that the LLMs have produced for the project, especially given the pretty massive and widespread architectural changes needed to make the implementation libified and memory safe, we decided that the codebase is not a derivative work that would require carrying forward the GPL license and have decided to release the code under the MIT instead.

Hmm. That's going to be interesting.

nextaccountic 4 hours ago | |

they would be just wrong. I hope someone with standing sues

gpm 3 hours ago | | |

I don't think it's that clear cut. The functional parts probably aren't copyrightable, only the stylistic ones. It's going to be a mix of courts applying laws in new ways that haven't been tried before and fact-specific questions about what actually persisted through the LLM if it goes to court.

I'd be fascinated to see what happens if it does. Both in the analyses that we'd get of what the LLM did to the codebase and on the legal decisions on what the copyrightable creative elements in code actually are.

If I was the author though... there would be no way that I would be volunteering to be a test case like this. Also seems just rude for no reason.

joshka 2 hours ago | | |

I suspect that the issue is more likely that the LLM code doesn't have an author and hence some parts of it can't be licensed; it's less likely that it's infringing on git's copyright, for various reasons. (I am not a lawyer, but I do read copyright law for funsies.)

jhayward 13 minutes ago | |

I'm not a copyright lawyer, but it seems pretty clear to me you can't wash a license using an LLM.

[US jurisdiction]: Anything in the result written by the LLM cannot be copyrighted by anyone.

Anything in the result written by a human can be, and if it was all emitted by the LLM then that portion originally written by a human carries its own copyright.

As a work of an LLM, the entirety presumably cannot be copyrighted at all. Portions written by humans presumably carry their original copyright.

hankbond 1 hour ago |

> It's like giving wishes as a genie. You gotta be super explicit with the ground rules.

I have used the genie analogy before. It used to feel more like a Golem, but now with the whole Fable sabotage mode https://jonready.com/blog/posts/claude-fable5-is-allowed-to-... it certainly feels more Genie-like.

Previously I described it as "Models give you what you ask for, not what you want". Now with Fable they don't even give you what you want, so idk.

usernametaken29 1 hour ago |

I’m all for memory safety and such but honestly what’s the use case for this? Showing off agentic development? In 10+ years git has never failed on a memory overflow or else. Sometimes software is “good as is” and I’m pretty confident git classifies as such. I’ve also never really hit the limitations of git, even with teams of 20+ developers and lots of binary artefacts. You got to really stretch git limitations, in which case you might need to move away from git, and a rust rewrite will not help in any way whatsoever. So again … why?

WD-42 1 hour ago | |

License washing

schacon 47 minutes ago | |

I addressed this in the post, but Git has no linkable library and never has. If you want to do even something small, you need to fork/exec a process and communicate with it via stdin/out. Or completely reimplement it and all of the edge cases - for example, reading even one object can be either loose (easy) or in a packfile (much more difficult). Reading a reference (what SHA does a branch point to) can be in a loose file, a packfile, or a reftable. etc.
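To make the "easy" cases above concrete, here is a minimal Python sketch of reading a loose object and resolving a ref from either a loose file or `packed-refs`. It assumes a SHA-1 repository and deliberately skips packfiles and reftable, which are the hard parts. The function names (`object_id`, `read_loose_object`, `resolve_ref`) are made up for illustration and are not part of Git, Grit, or any library.

```python
import hashlib
import zlib
from pathlib import Path
from typing import Optional, Tuple


def object_id(kind: str, body: bytes) -> str:
    """SHA-1 object ID: hash of b"<type> <size>\\0" followed by the body."""
    return hashlib.sha1(f"{kind} {len(body)}\0".encode() + body).hexdigest()


def read_loose_object(git_dir: Path, oid: str) -> Tuple[str, bytes]:
    """Read a loose object stored at objects/<first 2 hex>/<remaining hex>."""
    path = git_dir / "objects" / oid[:2] / oid[2:]
    raw = zlib.decompress(path.read_bytes())   # loose objects are zlib-deflated
    header, _, body = raw.partition(b"\0")
    kind, size = header.decode().split(" ")
    if int(size) != len(body):
        raise ValueError(f"size mismatch in object {oid}")
    return kind, body


def resolve_ref(git_dir: Path, name: str) -> Optional[str]:
    """Resolve a ref: loose file first, then packed-refs (no reftable)."""
    loose = git_dir / name
    if loose.is_file():
        value = loose.read_text().strip()
        if value.startswith("ref: "):          # symbolic ref such as HEAD
            return resolve_ref(git_dir, value[5:])
        return value
    packed = git_dir / "packed-refs"
    if packed.is_file():
        for line in packed.read_text().splitlines():
            # Skip the header comment and "^<oid>" peeled-tag lines.
            if not line or line[0] in "#^":
                continue
            oid, _, ref = line.partition(" ")
            if ref == name:
                return oid
    return None
```

Something like `resolve_ref(Path(".git"), "HEAD")` followed by `read_loose_object(Path(".git"), oid)` works only while the object has not been packed, which is the gap a full implementation has to close.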

There is no way anyone would ever use this for its CLI - it will almost certainly always be slower and worse in every way, even if I get it stable (which it's currently not). You can use libgit2 (a project I also helped kickstart), or Gitoxide (a project GitButler also currently helps drive) - they are faster and better in nearly every way, but they are not feature complete.

This isn't for the person using Git. This is for someone trying to build a tool that wants to use parts of Git, which is different.

eminence32 33 minutes ago | | |

But libgit2 exists, right? It may not have 100% feature parity with git, but that's a linkable library that gives you a lot of functionality when working with git repos.

well_ackshually 24 minutes ago | |

How else could they launder the git license and set themselves up for a bait and switch later down the line?

absintini 1 hour ago | |

Soon all the crustaceans will realize that C is better because AI can find all the vulnerabilities anyway.

Rust is some ugly poo.

galangalalgol 1 hour ago | | |

When we go a full year without an LPE in the Linux kernel I'll start considering it...

jsLavaGoat 15 minutes ago |

I did something similar and called it gitredoxide since I started with gitoxide.

heyts 4 hours ago |

I'd be really interested in the opposite, just for the sake of experimentation since that's what these projects mostly are. They all seem to be rewrites for the sake of "performance", because the cost is now lower bc of AI. I'd be interested to see something like a port of Quake III in Python or Kubernetes in Perl, even Rails in Python would be goofy and really fun to see

jamesfinlayson 4 hours ago | |

> Quake III in Python

Probably doable - I remember most of Natural Selection 2 was Lua and it's more than a decade old at this point.

jordand 2 hours ago | | |

For Natural Selection 2, it was mainly the gameplay logic that was Lua, all running on their bespoke C++ game engine called Spark. But yeah, modern Python and Lua can be pushed to high performance.

Link: https://unknownworlds.com/en/news/spark-engine-questions-and...

MBCook 2 hours ago | |

> They all seem to be rewrites for the sake of "performance".

And yet this performs dramatically worse.

A slower, untested, incomplete git implementation, all for the low low price of $10-$15,000.

And don’t forget it wasted a bunch of human time in the process.

Someone mentioned elsewhere that a group is already working on a Rust port. How much could they have accomplished with this much money and time in software development resources?

Ok. AI can seemingly port stuff if you don’t test it thoroughly. I think that’s already been proven. At this point I’m seeing less and less value from these kinds of things. I’m sure it was fun for the author, but how does it help other people?

Ferret7446 1 hour ago | | |

It's not for performance, it's for Rust.

If the first stereotype of Rust programmers is announcing that a project is in Rust before any other desirable software property (e.g. stable, performant, etc), the second stereotype is that Rust programmers love rewriting stuff in Rust, just for the sake of Rust.

(The 2.a. corollary is that they love rewriting GPL projects specifically and downgrading them to MIT/Apache)

blastonico 1 hour ago | | |

But... it's memory safe. Not that git has any important memory issue, but now people with skill issues in C can contribute to it without breaking stuff.

squidsoup 3 hours ago |

I guess software licenses are meaningless now since anyone can decide their llm clone is not derivative.

Yokohiii 2 hours ago | |

Currently some act like it is fine to translate a project and change the license.

Recently Casey Muratori said in an adjacent context that the Microsoft AI push may be related to the fact that they have a long-standing and elaborate codebase. A large historic software company could have advantages in training models. They could provide extra value with their IP.

Now their IP is potentially in their models and accessible to anyone. If they actually train models on their IP, anyone could implement their APIs and slap a GPL license on it.

At that point, things will get very interesting.

trumpdong 1 hour ago | |

They were already quite meaningless since nearly every FOSS copyright owner doesn't sue violators.

ianm218 4 hours ago |

I have been working on the same problem in other areas. My ultimate goal is to rewrite nginx in Rust, passing as many of the upstream tests as possible while leveraging the strongest aspects of Rust's ecosystem - i.e. rustls (modern memory safe OpenSSL), Tokio (async runtime), h2 (HTTP/2 impl) rather than implementing from scratch like the upstream. I started with Lua, then ported over Valkey, and am now working on nginx. The reason was that I wanted to learn the ins and outs before taking on the most complex portion.

[1]. https://github.com/ianm199/lua-rs/tree/main Lua

[2]. https://github.com/ianm199/valdr Valkey/ Redis

[3]. https://github.com/ianm199/nginx-rs-port nginx

Happy to answer any questions on the approach! When I started a few weeks ago the harnesses on their own were not good enough to get very far without a "meta harness" of sorts but that is changing largely with Claude Workloads and Mythos. A lot of the work is developing some custom tooling to move these along faster.

jabwd 3 hours ago | |

Yeah I got one, why? You aren't learning anything, you are just copying code from other codebases and smashing it together to make some nginx-rust thingie... for what actual goal?

ianm218 3 hours ago | | |

Well the biggest goal was to be useful. Nginx serves ~20% of the web, and memory-unsafe languages might just become intractable for critical web-exposed infra if the rate of critical CVEs in them rises faster than they can be patched, so a drop-in replacement would be a big deal in that world.

But in terms of learning I'm learning relatively little about how to type Rust into an editor but a lot about how to set up agentic loops that can autonomously get tests to pass and improve performance.

For example, if you just tell a frontier model (gpt5.5 or Claude Code 4.8) to make some portion of the tests pass, they will take forever and just bang their heads against it. I developed a framework to mimic a lot of these tests in nginx... but in minimal, non-blocking ways so you can run many in parallel with short feedback loops.
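As a rough illustration of "many in parallel with short feedback loops", here is a minimal Python sketch of a runner that executes independent test commands concurrently, each with a hard timeout, and reports pass/fail per case. This is not the framework described above; `run_case` and `run_suite` are hypothetical names, and the commands in `cases` are whatever your tests happen to be.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def run_case(cmd: List[str], timeout: float = 5.0) -> Tuple[bool, str]:
    """Run one test command; return (passed, detail)."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # A hung test is reported as a failure instead of stalling the loop.
        return False, "timeout"
    return proc.returncode == 0, proc.stderr.strip()


def run_suite(cases: Dict[str, List[str]], workers: int = 8) -> Dict[str, Tuple[bool, str]]:
    """Run named test commands in parallel; return {name: (passed, detail)}."""
    # Threads are enough here: each worker just waits on a child process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(run_case, cmd) for name, cmd in cases.items()}
        return {name: future.result() for name, future in futures.items()}
```

The per-case timeout is the design choice that matters: it bounds how long an agent waits for feedback even when one test hangs.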

Similar for performance - how to make tons of performance benchmarks and expose maximum telemetry for agents to go and analyze the hot paths etc.

jauntywundrkind 3 hours ago | | |

One very strong draw I feel, that's mentioned in this article: Rust's portability, its ability to be compiled to wasm & run very well anywhere.

Aperocky 3 hours ago |

> A pretty fun experiment and I think we can shape this into something truly useful to the whole community.

Agree with first half of this sentence, we should all have fun with experiments.

> It was never based on a linkable and reentrant library, but instead on a "Unix" philosophy of chaining together simpler commands, which means that it's difficult to use it in long running processes without fork/exec overhead for everything.

Ahhh now we have philosophical disagreement in the only place in the entire article that says "why". Unix is a feature, it's arguably more important in current time: https://aperocky.com/blog/post.html?slug=unix-philosophy-age...

Levitating 3 hours ago | |

You cut that citation conveniently short.

Aperocky 3 hours ago | | |

Added it in full. It still squarely falls under "this is for fun/are you seriously doing this for this purpose" territory for me.

git operates on the filesystem level, the unix behavior is just getting buried. You cannot rewrite git into a linkable library and decide it's now not unix. Its entire behavior is unix, which is why it's awesome.

MBCook 2 hours ago | |

Isn’t git already just an interface over libgit? How is that different?

tredre3 2 hours ago | | |

Git is famously not built around a (reusable) library, hence why we have things like libgit2 (unrelated to git) and why any porcelain on top of git has to resort to calling the binary and parsing its text output.
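For anyone who hasn't had to do this, "calling the binary and parsing its text output" looks roughly like the Python sketch below. It shells out to `git status --porcelain=v1 -z` and splits the NUL-separated entries. The helper names `parse_status_z` and `git_status` are made up for this example; the flags are real Git flags, and the sketch assumes a `git` binary on `PATH`.

```python
import subprocess
from typing import List, Optional, Tuple

Entry = Tuple[str, str, Optional[str]]  # (XY status, path, original path)


def parse_status_z(output: str) -> List[Entry]:
    """Parse NUL-separated `git status --porcelain=v1 -z` output."""
    fields = output.split("\0")
    entries: List[Entry] = []
    i = 0
    while i < len(fields) and fields[i]:
        # Each entry is two status characters, a space, then the path.
        status, path = fields[i][:2], fields[i][3:]
        i += 1
        if "R" in status or "C" in status:
            # With -z, a rename/copy is followed by the original path.
            entries.append((status, path, fields[i]))
            i += 1
        else:
            entries.append((status, path, None))
    return entries


def git_status(repo: str) -> List[Entry]:
    """Fork/exec git in `repo` and parse what it prints."""
    result = subprocess.run(
        ["git", "-C", repo, "status", "--porcelain=v1", "-z"],
        capture_output=True, text=True, check=True,
    )
    return parse_status_z(result.stdout)
```

Every tool built this way pays a process spawn per call and depends on whichever Git version happens to be installed, which is the overhead a linkable library is meant to remove.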

schacon 33 minutes ago | | |

libgit.a isn't reentrant. It will call `die()` on many errors. If you link to it in a long running binary, it will kill your process on error.
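A small Python sketch of the difference, under loud assumptions: `RepoError`, `lookup_library_style`, and `lookup_die_style` are invented names, and the child process only simulates a `die()` by printing a `fatal:` message and exiting with status 128. A library-style call hands the error back to the caller, while die-style code can only be survived by putting it behind a process boundary.

```python
import subprocess
import sys
from typing import Dict, Tuple


class RepoError(Exception):
    """Recoverable error, as a library-style API would report it."""


def lookup_library_style(refs: Dict[str, str], name: str) -> str:
    # Library style: report the failure and let the caller decide what to do.
    if name not in refs:
        raise RepoError(f"unknown ref: {name}")
    return refs[name]


def lookup_die_style(name: str) -> Tuple[int, str]:
    # die() style: the code exits the whole process on error, so a
    # long-running host has to run it in a child process to survive.
    child = (
        "import sys; "
        "print('fatal: unknown ref ' + sys.argv[1], file=sys.stderr); "
        "sys.exit(128)"
    )
    proc = subprocess.run(
        [sys.executable, "-c", child, name], capture_output=True, text=True
    )
    return proc.returncode, proc.stderr.strip()
```

The host process is still alive after `lookup_die_style` returns, but only because it paid for a separate process.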

Libgit2 is meant to address this and I was heavily involved in the development of that project 15 years ago. It's great but it's not feature complete and its development is also completely separate from git development, so it's out of sync and constantly struggling to keep up.

dabedee 3 hours ago |

You're asking people to trust you and hand their codebase/IP to your tool while showing them exactly how you treat other people's code/licenses by "deciding" to not carry forward the GPL license.

stefanha 2 hours ago |

> The full build of all Git functionality in Rust is currently around 27M, but since a large part of it is a library, it could clearly be easily split up into domains of functionality - subcrates that do specific things.

I downloaded v0.3.99 for Linux x86_64 and stripped the binary. It ends up at 31 MB. The .text section is 25 MB.

I'm surprised by the large size. On my system /usr/bin/git is 4.7 MB, although git is split up into multiple programs. I'm not comparing apples to apples, but this is weird.

If anyone digs into the binary size, please share what you find.

schacon 39 minutes ago | |

I would also be interested.

I haven't dug into this at all yet, nor have I tried to optimize the size (or really, anything else).

However, the library part will be less than half of this - a lot of code is spent on the CLI specific stuff and would not be part of the library, which is mostly what I care about for the purposes of this project. The CLI part is just to try to prove the point that it actually does what Git does. The library part is what might be useful in that nothing else exists that does all of the things that it does (provide a reentrant linkable library that is feature complete with Git).

fg137 4 hours ago |

Does anyone plan to use this?

Similarly, is there any momentum left for Cloudflare's EmDash? I can barely find any discussion after April.

gpm 3 hours ago | |

It'd seem weird to plan to use this until the readme stops saying

> it has been nearly entirely written by agents and has not been used for realsies. It's probably currently unusably slow or completely broken in ways that are not exercised in the test suite.

Right now it's someone else's experiment that is still in the "might or might not pan out" stage.

There are a bunch of projects using the similar (not vibe coded, less fully featured) gitoxide project - there is demand for git-as-a-library.

schacon 29 minutes ago | | |

I would not use this except to help us test it if interested. I'm announcing it because it's interesting and a milestone in the breadth of test coverage it can pass. It almost certainly cheated on a bunch of those tests and is not feature complete yet.

The author of gitoxide is also working on GitButler (who worked on this project) and we're pushing both projects forward and actively using and developing Gitoxide as well. This is simply a different and hopefully complementary approach to the same problem.

linsomniac 2 hours ago | |

I was immediately excited about this wrapped in Python because the current Python git bindings are kind of obtuse, but they do work so I guess I can't complain.

MBCook 2 hours ago | | |

But why switch to this?

Why not just make better Python bindings to libgit?

Yokohiii 3 hours ago | |

WordPress is/was successful because it's braindead and has a solid userbase. I'm not here to flame WP, but it's a quality to target a specific group of consumers.

It's an organic success, hard to replicate. If at all, CF can only make people migrate with massive effort. Marketing effort, selling lots of snake oil in the process. WP won't just hop on the hot new thing, WP is the definition of the opposite. It works for them. Why change.

Git is the same on the other side. It requires maintenance and improvements, surgical and correct. No git maintainer has time to learn a gigantic new codebase and they will stick with what works for them. For git users there are no advantages. So similarly it would require a long time effort to push the project, building trust that it is somehow better, probably requiring Linus to say "it's great".

dabedee 3 hours ago |

This is coming from a cofounder at GitHub, someone who probably knows precisely what the GPL is for. Whatever the legal merits, building on a GPLv2 project's complete test suite and relicensing under MIT is not acting in good faith toward the original authors. I really find it disgusting and it makes me want to avoid gitbutler entirely.

joshka 2 hours ago | |

I think you're saying that you don't believe in the freedoms to use the GPL licensed test suite for certain purposes which are explicitly allowed by the GPL.

You don't get to choose a license and then add extra terms to it when you don't feel like it's up to scratch. That's something explicitly not allowed by the GPL license.

MBCook 2 hours ago | | |

Where does the GPL say you have the freedom to relicense code or derivatives under MIT by fiat?

Isn’t having to stay under the GPL a very big part of the GPL license?

aeon_ai 19 minutes ago |

I continue to be surprised by the lack of understanding around copyright law when it comes to AI.

anonova 3 hours ago |

Grit was the name of a _Ruby_ implementation of git way back when: https://github.com/mojombo/grit/. I believe it's actually what GitHub was built on then.

schacon 26 minutes ago | |

I started the project as Gust, but felt like Grit was such a better name. I asked Tom if I could boot the name back up again because I always liked it and he said it was fine.

Also, I worked on the Ruby Grit pretty extensively during the early days of GitHub, so hopefully I earned the right to carry on the mantle. :)

mojombo 2 hours ago | |

I created and named the Grit library that used to power GitHub. Scott Chacon (fellow GitHub cofounder, now CEO of GitButler) specifically asked my permission to re-use Grit as the name of this project, which I gladly granted. R is for Ruby. R is for Rust! Grit is dead. Long live Grit!

Yokohiii 2 hours ago | |

Okay name is taken. Let's rename it to Grift.

imoverclocked 4 hours ago |

What’s the long term strategy for this code base? Does the author expect community code contribution or just bug reports or maybe just test contributions?

schacon 23 minutes ago | |

I'm happy to take contributions if you want to throw some tokens at it. Bug reports would be amazing, since I haven't tested it for real very much (enough to know you can do basics).

I want to get it to the point where we can replace fork/exec'ing to an unknown Git binary or having said binary be an external dependency for GitButler. The networking stuff (push/fetch) is currently an external dep for both GitButler and Jujutsu (and pretty much every other Git-based tool in the world). I'm pretty sure I can get the project good enough at these networking ops (including all the hairy credential stuff) to be able to not need those fork/exec calls.

fg137 4 hours ago | |

In 6 months, seeing no adoption, move the repo to maintenance mode. Archive in 12 months.

Yokohiii 3 hours ago | |

He will probably be super happy if you star the project.

0x000xca0xfe 2 hours ago | |

The agents did all the work but _somebody_ has to test it for real on their own data to find the edge cases overlooked by AI. That's what users are for nowadays.

boredatoms 3 hours ago |

There's already a git-in-Rust project that is making good progress

https://github.com/gitoxidelabs/gitoxide

schacon 22 minutes ago | |

Gitoxide is also developed primarily by Byron, who also is part of the GitButler team. We're pushing both projects forward.

jauntywundrkind 3 hours ago | |

Gitoxide is mentioned in this write up, yes,

> Currently both Gitoxide and libgit2's networking functionality is either partial, slow or non-existant. Both GitButler and Jujutsu rely on forking out to Git in order to push or pull data. A big reason for this is the incredibly complicated credential logic involved, but all of this is (theoretically) currently covered in Grit.

ewy1 3 hours ago |

pretty dystopian to ask a robot to recreate your favorite software just so you can relicense it for your business venture

schacon 18 minutes ago | |

We're choosing a license that is usable by the entire community. Our goal is a linkable library, which makes GPL impossible. If we had chosen to go with LGPL or GPL with linking exception (like libgit2), it would have the same issue of changing the license, so we went with whatever was the most permissive so everyone could use it for anything if they wish. This has nothing to do with business - I hope I can get the project to the point where Jujutsu or whomever can use whatever is valuable here for whatever they want.

We clearly learned from how Git does operations and emulated it in order to function interoperably, the same way that Gitoxide and libgit2 have, and released it under a license that would be the most valuable for people wanting to use a linkable library, the same way that Gitoxide and libgit2 have.

gdgghhhhh 4 hours ago |

So, they "decided" it's not a derivative and thus can be licensed under MIT instead of GPL....

madeofpalk 4 hours ago | |

Yeah, that's usually how contracts work.

You decide whether you have followed it or not. The other party will decide if they agree. If in dispute, you go to a judge and they decide also.

subygan 4 hours ago | |

a lot of things are just "decided" really.

it's just in this case it's the author. we'll have to wait and see who decides to challenge it

tonymet 4 hours ago |

they still haven't explained why I should bother. Is it faster, easier, more efficient, more capable, more scalable on large codebases, supports better workflows?

In fact, I would rather it stay C for 15 more years.

schacon 11 minutes ago | |

I'm assuming you didn't read the article, since I'm pretty sure I covered all of this, but I'm happy to respond.

Don't bother.

It's probably not for you. It's slower, more obtuse, more bloated, less capable, exponentially less scalable at any size. Canonical Git is better in every way, except being a linkable library.

Even in the arena of linkable libraries that can do Git stuff, Gitoxide (Rust) and libgit2 (C, with Rust bindings via the git2 crate) are both better, they're just not feature complete. That is the only point of this project.

rvz 4 hours ago |

> the result is Grit, a from-scratch, library-based, memory-safe, idiomatic Rust reimplentation of Git that passes over 99% of the entire Git test suite.

Why not 100%?

> 41,715 / 42,001 tests passing (99.3%)

So it is not the entire suite then, but somehow that was worth burning ~$8,000 worth of tokens?

gpm 3 hours ago | |

> Why not 100%?

From the article

> It's not actually passing every single test, though that is on purpose. I did mark some parts of the testing suite as "skipped" because I don't think it's worth recreating them in a library like this - email related stuff, i18n, perforce/svn importers, some of the midx/bitmap stuff - things of that nature. However, for everything that I'm sure is relevant to nearly anyone reading this, the Grit library/CLI can now fully pass the Git test suite.

insanitybit 4 hours ago | |

So 0.7% of tests fail, therefore it was 100% a waste of time?
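For reference, the percentages being thrown around follow directly from the counts quoted upthread (41,715 of 42,001). A quick Python check, with `pass_rate` as a throwaway helper name:

```python
def pass_rate(passed: int, total: int) -> float:
    """Percentage of tests passing."""
    return 100 * passed / total


passed, total = 41_715, 42_001
rate = pass_rate(passed, total)
# Tests that are failing or marked as skipped, and their share of the suite.
print(total - passed, f"{rate:.1f}%", f"{100 - rate:.2f}%")
```

That is 286 tests not passing, or a little under 0.7% of the suite.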

fg137 4 hours ago | | |

I think we are talking about ROI in terms of solving real world problems and making real impact, not the fact that a tool has been ported from language X to language Y.

rvz 4 hours ago | | |

Given the author already admitted that the implementation is slow anyway, you are better off using gitoxide instead, which has support for Windows whereas Grit does not.

Zopieux 4 hours ago | | |

Regardless, what's the point?

djha-skin 1 hour ago |

In the age of AI, writing things that used to take years can now be done in months or weeks if you have deep enough pockets for it.

Reimplementation is a particularly juicy target because it's easy to test. Imagine someone writing a better browser than Chrome from scratch in just a year.

Because of this, moats around businesses due to difficulty of implementation are effectively gone.