More on Version Control (bramcohen.com)
For instance Pijul might very well be a lot better than git / jj. I wouldn't know, I haven't bothered trying it because all the projects I need to work in use git. But since jj has great git compatibility, I actually have been able to adopt it because of its git backend.
A new VCS that doesn't have git compatibility at its core is going to have a really hard time overcoming network effects.
When I'm rebasing my own work and editing history, that's exactly what I'm looking to accomplish, though.
I'm sure there are other Git forges that would support a similar workflow, with a "Squash and Merge" button or equivalent, but my team hasn't felt any need to migrate away from GitHub so I've never yet investigated that in detail.
Only downside I've found to this workflow is that it would make it harder to migrate to a different Git forge in the future: unless you're very careful with the migration, the PR numbers are likely to be different (perhaps resetting at 1, even) and the other forge won't end up with the commits that are on GitHub's copy of the repo but no longer on any active branch (we also use the "auto-delete branches when you hit the merge button" option). But it would still be possible for a migration tool to handle this correctly: look at all PRs on GitHub, grab the commits from them, and migrate them to Merge Requests on the new forge.
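A minimal sketch of the "grab the commits from them" step, assuming the remote is GitHub, which publishes each PR's head commit under `refs/pull/<number>/head`. The function name and the local `pr/` ref layout are hypothetical choices for illustration. Recreating the PRs as Merge Requests on the new forge is not shown, since that depends on the target forge.

```python
def fetch_pr_heads_cmd(remote="origin"):
    """Build a git command that fetches every PR head from a GitHub remote.

    The local ref namespace (refs/remotes/<remote>/pr/<number>) is an
    arbitrary choice made for this sketch.
    """
    refspec = "+refs/pull/*/head:refs/remotes/{}/pr/*".format(remote)
    return ["git", "fetch", remote, refspec]
```

Running the returned command (for example with `subprocess.run(cmd, check=True)`) inside a clone keeps those commits reachable locally even after the PR branches were auto-deleted.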
First, how could you make this deal with copies and renames? It seems to me like the pure version of this would require a weave of your whole repository.
Second, how different is this from something like jujutsu? As in, of course it's different, your primary data structure is a weave. But jj keeps all of the old commits around for any change (and prevents git from garbage collecting them by maintaining refs from the op log). So in theory, you could replay the entire history of a file at a particular commit by tracing back through the evolog. That, plus the exact diff algorithm, seems like enough to recreate the weave at any point in time (well, at any commit), and so you could think of this whole thing as a caching layer on top of what jj already provides. I'm not saying you would want to implement it like that, but conceptually I don't see the difference and so there might be useful "in between" implementations to consider.
In fact, you could even specify different diff algorithms for different commits if you really wanted to. Which would be a bit of a mess, because you'd have to store that and a weave would be a function of those diff algorithms and when they were used, but it would at least be possible. (Cohen's system could do this too, at the cost of tracking lots of stuff that currently it doesn't need or want to track.) I'm skeptical that this would be useful except in a very limited sense (e.g., you could switch diff algorithms and have all new commits use the new one, without needing to rebuild your entire repository). It breaks distributed scenarios -- everyone has to agree on which diff to use for each commit. It's just something that falls out of having the complete history available.
I'm cheating with jj a bit here, since normally you won't be pushing the evolog to remotes so in practice you probably don't have the complete history. In practice, when pushing to a remote you might want to materialize a weave or a weave-like "compiled history" and push that too/instead, just like in Cohen's model, if you really wanted to do this. And that would come with limitations on the diff used for history-less usage, since the weave has to assume a specific deterministic diff.
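To make the "replay the history plus a fixed diff gives you the weave" idea concrete, here is a rough sketch for a single file. It is not Cohen's implementation and it does not use any jj API: the weave is modelled as a plain list of `(line, versions)` pairs, version ids like `"v1"` are made up, and Python's `difflib.SequenceMatcher` stands in for "the exact diff algorithm". Copies and renames are not handled.

```python
from difflib import SequenceMatcher


def add_version(weave, parent, version, new_lines):
    """Fold one more version of a file into the weave, in place.

    weave is a list of (line_text, versions) pairs, where versions is the
    set of version ids in which that line is present.
    """
    # Weave positions of the lines that are present in the parent version.
    live = [i for i, (_, versions) in enumerate(weave) if parent in versions]
    old_lines = [weave[i][0] for i in live]
    matcher = SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)

    insertions = []  # (weave position, new entries), in increasing position
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged lines are also present in the new version.
            for i in range(i1, i2):
                weave[live[i]][1].add(version)
        elif tag in ("insert", "replace"):
            # New lines go just before the parent line they displace,
            # or at the end of the weave if there is no such line.
            position = live[i1] if i1 < len(live) else len(weave)
            entries = [(line, {version}) for line in new_lines[j1:j2]]
            insertions.append((position, entries))
        # "delete": the old lines stay in the weave, just not in this version.

    # Apply from the back so earlier positions stay valid.
    for position, entries in reversed(insertions):
        weave[position:position] = entries
    return weave


def extract(weave, version):
    """Read one version back out of the weave."""
    return [line for line, versions in weave if version in versions]
```

Feeding versions in history order (each with its parent) rebuilds the weave, and `extract` recovers any version. Because the result depends on which matching blocks the diff picks, two people only get the same weave if they use the same deterministic diff, which is the constraint mentioned above.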
Funny, I would probably swap the first and the last adjective in that sentence.
Can you say more about this? What exactly is this trick you’re talking about? What are the benefits?
(That seems to be an archive of the old revctrl.org pages from a while back; most likely Bram Cohen has a blog somewhere explaining it in his own words - probably about 2003, at a guess)
But someone may need to explain it to me.
Of course centralized VCS are less popular. You need to set up a server first, then wrangle with the server every time you create a new project -> fewer projects -> fewer users.
I very much prefer keeping histories by default (both my personal workflows and the tools I build default to that) but squash is a valuable tool.
How so? When I bisect I want to get down to a small diff; landing on a stretch of several commits (because some didn't build) is still better than landing on a big squashed commit that includes all those changes and more. The absolute worst case when you keep the original history is the same as the default case when you squash.
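The claim can be illustrated with a toy model, assuming that "broken" means "does not build" and that such commits are simply skipped. The commit names and the `suspects` helper are hypothetical, and the model does not cover commits that build but misbehave for unrelated reasons.

```python
def suspects(history, first_bad):
    """Return the commits a bisect cannot tell apart.

    history: list of (commit_id, builds) pairs, oldest first.
    first_bad: index of the commit that introduced the regression.
    Commits that do not build are skipped; the tip is assumed testable.
    """
    testable = [i for i, (_, builds) in enumerate(history) if builds]
    last_good = max((i for i in testable if i < first_bad), default=-1)
    first_known_bad = min(i for i in testable if i >= first_bad)
    return [commit for commit, _ in history[last_good + 1:first_known_bad + 1]]


# Hypothetical PR of five commits on top of a good base; c3 introduces the
# bug, and c2 and c3 do not build.
original = [("base", True), ("c1", True), ("c2", False),
            ("c3", False), ("c4", True), ("c5", True)]
# The same PR squashed into a single commit.
squashed = [("base", True), ("c1..c5", True)]

print(suspects(original, 3))  # ['c2', 'c3', 'c4']
print(suspects(squashed, 1))  # ['c1..c5']
```

With the original history the search narrows to three of the five commits; if c1 through c4 all failed to build, it would narrow to all five, which is the same span the squashed commit covers.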
> The absolute worst case when you keep the original history is the same as the default case when you squash.
No, now you have a bunch of worthless broken commits that you need to evaluate and skip because they’re not the problem you’re looking for.
Now let's also talk about renames...
If you actually do like to deliver the correct number of commits then it's frustrating to work with people who don't care. In that case I would suggest making the squash optional but you could also try selling your team on doing smaller commits. In my experience you either "get it" or you don't, though. I've never successfully got someone to understand small commits.
Or don't have a choice. Our department-wide rules were almost to require that for all repos, I had to push hard just to make it "strongly suggested" instead.
Git will happily let you merge branches and preserve the history there. GP seems to like that history being in PRs only on GitHub instead. I don't get why; that just seems worse to me.
Do you restrict yourself to 1 non-broken commit per PR? I don't, and nor does anyone I've worked with. If there were even 2 non-broken commits in the PR, then bisecting with the original history lands you on a diff half the size that bisecting with squashed history would, which is a significant win. (If you didn't care about that sort of thing you wouldn't be bisecting at all).
> No, now you have a bunch of worthless broken commits that you need to evaluate and skip because they’re not the problem you’re looking for.
What are you "evaluating"? If you want to ignore the individual commits and just look at the overall diff that's easy. If you want to ignore the individual messages and just look at the PR-time message that's easy too. Better to have the extra details and not need them than need them and not have them.
No. To the extent that I can however I do restrict myself to only non-broken commits.
> If there were even 2 non-broken commits in the PR, then bisecting with the original history lands you on a diff half the size that bisecting with squashed history would, which is a significant win
It is not a significant win when the bisecting session keeps landing me in your broken commits that I have to waste time evaluating and skipping.
And splitting out fixups doesn’t save anything (let alone “half the size”), most commonly those fixups are just modifying content the previous commits were touching already, so you’re increasing the total diff size you have to evaluate.
> What are you "evaluating"?
Whether the commit is the one that caused the issue I’m bisecting for.
> If you want to ignore the individual commits and just look at the overall diff that's easy. If you want to ignore the individual messages and just look at the PR-time message that's easy too.
Neither of these is true. git bisect (run) lands me on a commit, it’s broken, and now I need to look at whether the commit is broken in a way that is relevant to what I’m seeking.
> Better to have the extra details and not need them than need them and not have them.
Garbage is “extra details” only in the hoarder sense.
Skipping a commit that doesn't build is trivial (especially if you're automating your bisects).
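For the automated case, `git bisect run` treats exit code 125 from the test script as "this commit cannot be tested, skip it", exit code 0 as good, and other codes from 1 to 127 as bad. A small sketch of such a script follows; the function name is made up, and `make` / `make test` are placeholders for whatever the project actually uses.

```python
import subprocess
import sys

SKIP = 125  # exit code that `git bisect run` interprets as "skip this commit"


def bisect_exit_code(build_cmd, test_cmd):
    """Map build and test results onto the exit codes git bisect run expects."""
    if subprocess.run(build_cmd).returncode != 0:
        return SKIP  # does not build: neither good nor bad
    if subprocess.run(test_cmd).returncode == 0:
        return 0  # builds and passes: good
    return 1  # builds but fails: bad


# Placeholder usage as the last line of a script handed to `git bisect run`:
# sys.exit(bisect_exit_code(["make"], ["make", "test"]))
```

This only skips commits that fail to build; a commit that builds but is broken in some unrelated way would still be reported as bad unless the test command checks for the specific regression.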
> And splitting out fixups doesn’t save anything (let alone “half the size”), most commonly those fixups are just modifying content the previous commits were touching already, so you’re increasing the total diff size you have to evaluate.
If you feel the need to rebase to squash one-liner fixups into the commits they fix then that's a more subtle tradeoff and there are reasonable arguments. But squashing your whole PR for the sake of that is massive overkill, and the costs outweigh the benefits.
A broken commit usually compiles; if you don’t even compile before committing you should go back to school.
> If you feel the need to rebase to squash one-liner fixups into the commits they fix then that's a more subtle tradeoff and there are reasonable arguments. But squashing your whole PR for the sake of that
It would really have helped if you’d stated upfront that you can’t read.