GitFlow considered harmful (endoflineblog.com)

464 points by edofic 11 years ago | 335 comments

sytse 11 years ago

GitLab CEO here. I agree that GitFlow is needlessly complex and that there should be one main branch. The author advises to merge in feature branches by rebasing them on master. I think that it is harmful to rewrite history. You will lose cherry-picks, references in issues and testing results (CI) of those commits if you give them a new identifier.

The power of git is the ability to work in parallel without getting in each other's way. No longer having a linear history is an acceptable consequence. In reality the history was never linear to begin with. It is better to have a messy but realistic history if you want to trace back what happened and who did and tested what at what time. I prefer my code to be clean and my history to be correct.

I prefer to start with GitHub flow (merging feature branches without rebasing), and I described how to deal with different environments and release branches in GitLab Flow: https://about.gitlab.com/2014/09/29/gitlab-flow/
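
A rough Python sketch of why a rebased commit ends up with a new identifier. It imitates, in simplified form, the idea that a commit ID is a hash over the commit's contents including its parent ID. The `commit_id` helper and its field layout are illustrative assumptions, not git's exact object format.

```python
import hashlib

def commit_id(parent: str, tree: str, message: str) -> str:
    """Simplified stand-in for a git commit hash (hypothetical helper)."""
    body = f"tree {tree}\nparent {parent}\n\n{message}\n".encode()
    header = f"commit {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()

old_base = commit_id("", "tree-a", "initial commit")
new_base = commit_id(old_base, "tree-b", "someone else's work on master")

# Same change and same message, but a different parent after the rebase.
# A real rebase usually changes the tree as well; it is held fixed here
# to isolate the effect of the parent.
original = commit_id(old_base, "tree-c", "Add feature")
rebased = commit_id(new_base, "tree-c", "Add feature")

print(original != rebased)
```

Anything that recorded the original ID (an issue reference, a CI result) points at the old commit, not at the rebased one.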

mikeash 11 years ago

The obsession of git users with rewriting history has always puzzled me. I like that the feature exists, because it is very occasionally useful, but it's one of those things you should almost never use.

The whole point of history is to have a record of what happened. If you're going around and changing it, then you no longer have a record of what happened, but a record of what you kind of wish had actually happened.

How are you going to find out when a bug was introduced, or see the context in which a particular bit of code was written, when you may have erased what actually happened and replaced it with a whitewashed version? What is the point of having commits in your repository which represent a state that the code was never actually in?

It always feels to me like people just being image-conscious. Some programmers really want to come across as careful, conscientious, thoughtful programmers, but can't actually accomplish it, so instead they do the usual mess, try to clean it up, then go back and make it look like the code was always clean. It doesn't actually help anything, it just makes them look better. The stuff about nonlinear history being harder to read is just rationalization.

phs2501 11 years ago

The point of rebasing for clarity, IMHO, is to take what might be a large, unorganized commit or commits (i.e. the result of a few hours' good hacking) and turn it into a coherent story of how that feature is implemented. This means splitting it into commits (which change one thing), giving them good commit messages (describing the one thing and its effects), and putting them in the right order.

Rather than hiding bugs, usually I wind up finding bugs when doing this because teasing apart the different concerns that were developed in parallel in the hacking session (while keeping your codebase compiling/tests running at every step) tends to expose codependence issues that you wouldn't find when everything's there at once.

It's basically a one-person code review. And when you're done you have a coherent story (in commits) which is perfectly suited for other people to review, rather than just a big diff (or smaller messy diffs).

It also lets me commit whenever I want to during development, even if the build is broken. This is useful for finding bugs during development, as you'll have more recorded states to, e.g., find the last working state when you screw something up. And in-development commits can be more notes to myself about the current state of development rather than well-reasoned prose about the features contained.

I realize not everyone agrees with it, but I hope I've described some good reasons why I think modifying history (suitably constrained by the don't-do-it-once-you've-given-your-branch-to-the-public rule) is a good thing, not something to be shunned.

ams6110 11 years ago

Disclosure up front, I don't really use git myself. I have tried it and found it to be too confusing. I liked svn and these days use hg. I also tend to work on mostly solo and small projects.

However, in my observation, the person ultimately responsible for the code spends far more time cleaning up history and recovering from developer mistakes on projects using git than with any other revision control system I can recall, and that goes back to CVS and Visual SourceSafe, also including svn and hg.

I know a lot of people use git and love it so I'm prepared to accept that they're all smarter than I am. But IMHO, the version control system should be incidental to my work. It should not demand any significant fraction of my brainpower: that should be devoted to the code I'm working on. If I have to stop and THINK about the VCS every time I use it, or if it gives me some obscure "PC LOAD LETTER" type of response (which seems to happen to me when I use git) then it is a net negative. If I need to have a flowchart on my wall or keep some concept of a digraph in the front of my thinking or use a cheat sheet to work with the VCS, then it's just one more thing that gets in my way.

I think git probably has a place on very large codebases, with very distributed developers. For the typical case of a few developers who all work in the same office, I think in most cases it's overkill and people would be more productive using something simpler.

Joky 11 years ago

There is a difference between rewriting a "published" history and rewriting history in your local repo. I rely heavily on the ability to rewrite history before pushing. I hate seeing people push a series of commits (in a single push, I mean) where the first two introduce a big mess and the subsequent ones are attempts to fix up the mess.

jkyle 11 years ago

Say I create a feature branch, this is what a day's work might look like.

    839a882 Fix bad code formatting [James Kyle]
    6583660 Updated plugin paths for publish env [James Kyle]
    847b8f3 First stab at a mobile friendly style. [James Kyle]
    a70d3f7 Added new articles, updated a couple. [James Kyle]
    b743ec3 format changes on article [James Kyle]
    68231e7 Some udpates, added an article [James Kyle]
    2a92c5e Added plugins to publish conf. [James Kyle]
    6dec1e1 Added share_post plugin support. [James Kyle]
    070bbd0 Added pep8, pylint, and nose w/ xunit article [James Kyle]
    eb8dbcc Corrected spelling mistake [James Kyle]
    0b89761 Minor article update [James Kyle]
    677f635 Added TLS Docker Remote API article [James Kyle]
    d8e94fd Fixed more bad code formatting in nose [James Kyle]
    f06dc2d Syntax error for code in nose. [James Kyle]
    606ac2b Removed stupid refactor for testing code. [James Kyle]

This might be a very short one. If the work goes on for a couple of days, could be dozens of commits like this.

In the end, it'd be a veritable puzzle what I was trying to send upstream. Also, the merger has to trawl through multiple commits and history. It's plain annoying.

So you rebase and send them something like this:

    947d3e7 Implemented mobile friendly style. [James Kyle]

And if they want more, they can see the full log with a bullet list:

    947d3e7 Implemented mobile friendly style.

    - Added plugins x, y, 
    - Implemented nose tests to account for new feature

Rebasing is about taking a discombobulated, stream-of-thought workflow and condensing it into a single commit with an accurate, descriptive log entry.

Makes everyone's life easier.

edit

It's also very nice to take out frustration-generated commits like "fuck fuck fuck fuck fuck!!!" before committing upstream to your company's public repository. ;)

jrochkind1 11 years ago

I tend to agree. One exception I think is rebase on a feature branch. If you rebase a feature branch onto master before merging it into master, I think you can get a cleaner history while achieving the linear history the OP wants -- and in this isolated case, I think you aren't losing any useful context by making it seem the feature commits were all done right before merge into master.

Maybe. I'm not actually sure, to be honest, what's a good idea with git history, this included. Feedback welcome.

wylee 11 years ago

> The obsession of git users...

That seems overly broad. It seems to me that most people who use git agree that public history shouldn't be rewritten, especially on master.

> The whole point of history is to have a record of what happened.

On the other hand, a bunch of "Derp" or "Whoops" type commits aren't very useful. It's definitely beneficial to clean that sort of stuff up by rewriting local history before pushing.

swsieber 11 years ago

I usually use rewriting history to package the code I've worked on into logical commits that should be stable at each point when applied in order. That way, it's very easy to reverse a commit or cherry-pick specific functionality into the production branch early (i.e. the main branch isn't the stable one).

Would I like to get away from that and do it from the get-go? Oh yes, it'd be great. But I'm not there yet and so re-writing history is nice. And doing so forces me to think about the code I've written and where the boundaries of the changes I've made are. Granted, I haven't done it on very long lived feature branches (or big ones) - that may be where most of the penalties are manifest.

saidajigumi 11 years ago

> It always feels to me like people just being image-conscious.

Every author is "image-conscious" because they want to present their thoughts clearly to the world. That's where your rather substantial misconceptions about the application and utility of rebasing come from. This isn't about rewriting published history, which is rightly and nearly universally considered A Bad Idea(tm) in the git world. The recommendations around rebasing are essentially identical to authors editing their text before publication. Note "before". Before {an article, some code} is published, edit, rewrite, clean up all you want. After it's published, an explicit annotation is the best practice. For an author, perhaps an "Updated" note in an article or a printing number in a book. For a developer, add a new commit recording the change.

For my part, I use rebasing extensively and lightly before I publish code. By "extensively" I mean, I just don't hesitate to edit for clarity. This is the same as I'd do in authoring a post or email. By "lightly", I mean that I don't waste time doing radical history surgery but I regularly do things like squash a commit into an earlier logical parent commit. E.g. I started a refactor, then a little while later found some more instances of the same change. Often, this is just amending the HEAD commit, but occasionally I need to go back a short ways on my working branch.

This also fluidly extends to use of git's index and the stash for separating out logical commits from what's in the working copy. A typical example:

    git add <files for a logical change>
    git stash -k    # put everything not added into the stash
    # run tests
    git commit
    git stash pop

Once you're used to the above workflow, an understanding of git's commit amending and rebasing tools extends this authoring capability into recent history. This is wonderful because it takes pressure off of committing, meaning that git history becomes a powerful first-class, editable history/undo stack.

nickbauman 11 years ago

Remember Git was born in Linux. And in Linux, a commit is a political statement. Your need to be succinct (your commit must stand along with 2000 commits per day) and emphasize the "obvious" brilliance of what you're doing to overcome noise overrides the need for recording all the thought processes along the way.

In most organizations, we don't have anywhere near that number of participants and we don't want charismatic developers, we want something that works right now and we're confident that changing it is not merely a possible outcome but very very likely.

sytse 11 years ago

I totally agree with you, I don't get it either. My only explanation is that as a programmer you are trained to write clean and understandable code. I also try to apply this to my commit messages (with varying results). But rewriting history to make everything look clean and simple is the wrong thing to do. The messier your history is the more likely you'll need to retrace your steps (and CI results) at some point. It is mostly people coming from SVN and only running CI on the trunk branch that favor the rewriting approach. It might be hard to let go.

jordigh 11 years ago

> The obsession of git users with rewriting history has always puzzled me.

Editing draft commits is fine. Editing public commits is less fine. The problem is that git has no way to distinguish draft and public commits except by social convention.

Mercurial Evolve actually enforces the separation between draft and public commits, and can also allow setting up servers where people can collaboratively edit draft commits.

My talk about it:

https://www.youtube.com/watch?v=4OlDm3akbqg

INTPnerd 11 years ago

Everything about git is about managing a useful history. Otherwise it would be a history of every keystroke or at least every file write. Instead you write some code until you feel you have enough to make a useful commit (you will have to come up with your own idea of what represents a useful commit), commit all those changes together as a single commit (thereby losing history), and come up with a useful description of all those changes. Managing an already created commit is just a further extension of this idea. You can use what you learned from your experience of coding, testing, and committing to change your commit history to be even more useful. Of course things can go wrong if you are changing the history of a branch that others have cloned or branched off of.

stream_fusion 11 years ago

I can make 20 or 30 commits during some code changes in a morning's worth of coding. This allows me to easily trace back to any point, or cross-reference changes across many local branches, etc.

At the end, it might all be squashed down into a single bug-fix commit for the devel branch.

The commit granularity that's desirable and effective for an individual is very different to the history you want in the main feature branches.

tomphoolery 11 years ago

> The obsession of git users with rewriting history has always puzzled me. I like that the feature exists, because it is very occasionally useful, but it's one of those things you should almost never use.

I disagree, and it's actually impossible not to use it. Rebase rewrites history. If you have a long-running feature branch you need to merge back into master, you have to rebase it against the current master. There's really no other choice.

> The whole point of history is to have a record of what happened.

Define "what happened" in this context...are we talking about what the feature's changes end up looking like, or the entire linear history of the work on this feature starting from the point at which the programmer experimented with a bunch of dead-ends before finding the right path?

Personally, I feel like an extremely detailed history of my personal problem-solving adventure on every complex ticket is irrelevant. At the end of the day, the code reviewer just wants to know what changed. When I review code, I prefer to look at a massive diff of everything that's been done, not read commit-by-commit. I'd rather see exactly what I'm going to pull in when I merge it into master.

I would also disagree with you here that the whole point of source control is to maintain a history of what happened, and argue that the point of source control is communicating changes between developers on a team. The fact that it backs up your code and keeps a history of what changed are merely secondary features to the central value of providing a way of communicating changes to a codebase between developers. I think Git is the best version control system for doing this, because it allows you to rewrite history. That said, rewriting history is very dangerous and if you use it incorrectly (like never ever rewriting history on a branch other people have to pull from), you're

> If you're going around and changing it, then you no longer have a record of what happened, but a record of what you kind of wish had actually happened.

If you're using Git, this is a complete falsehood if you are the person who made the commits. The reflog provides a reference to every single change made to your repository, so you can just reset back to the point before you rebased and voila, like magic everything is back to the way it was. This isn't a "hack", that's what reflog is for. It's a giant undo list for your local clone of the repo.

So in essence, history is never destroyed. It's just hidden from view. You can always go back in Git unless you actually `rm -rf .git/`.

> Some programmers really want to come across as careful, conscientious, thoughtful programmers, but can't actually accomplish it, so instead they do the usual mess, try to clean it up, then go back and make it look like the code was always clean.

You might be correct in some cases, but I think for the majority of the time you are confusing explicitness with vanity. Programmers want other people on their team to know what they did, or at least the intention of their code, and having commit messages that "tell a story" and make sense is vital for doing that.

ChrisMeek 11 years ago

I'm starting to think that part of the problem we are facing at the moment is that feature branching itself can be exceptionally harmful.

I see statements like "The power of git is the ability to work in parallel without getting in each other's way" and get really worried about what people are trying to achieve. I want my team's code to be continuously integrated so that problems are identified early, not at some arbitrary point down the line when two features are finished but conflict with each other when both are merged. We seem to be reversing all the good work the continuous integration movement gave us; constant integration makes integration issues smaller and easier to fix.

I personally prefer to use toggling and techniques like branch by abstraction to enable a single mainline approach. Martin Fowler has a very good article on it here http://martinfowler.com/bliki/FeatureBranch.html
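
A minimal sketch of the toggling idea in Python, assuming a hypothetical `FEATURES` table and a made-up `checkout_total` function. Unfinished work is merged to the mainline but stays switched off until the toggle flips, so integration still happens continuously.

```python
# Hypothetical toggle table; in practice this might come from configuration.
FEATURES = {"new_checkout": False}

def checkout_total(prices, features=FEATURES):
    """Return the order total; the new code path is dark unless toggled on."""
    if features.get("new_checkout", False):
        return sum(prices) + 5   # new, still-in-progress path (adds shipping)
    return sum(prices)           # existing behaviour

print(checkout_total([10, 20]))
print(checkout_total([10, 20], {"new_checkout": True}))
```

Both paths live in the same branch and are exercised by the same CI run, which is the property a long-lived feature branch gives up.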

platz 11 years ago

Agree, avoid large merges at all costs. The longer a feature branch exists the more needless cost it incurs when trying to re-integrate it with the merge.

jes5199 11 years ago

Those are good techniques and I advocate them strongly.

Even so, I see value in very short-lived branches for code review. Ideally a branch exists for about four hours before it is merged to master.

sytse 11 years ago

The idea is that you split features up in smaller parts so that you merge not more than a handful of days of work at a time.

masklinn 11 years ago

A big annoyance is UI:

* most git UIs don't provide for branch filtering or --left-only (which hides "accessory"/"temporary" merged branches unless explicitly required)

* developers won't necessarily care for correct merge order, breaking "left-only" providing a mainline view

The end result is, especially for largeish projects, merge-based workflows lead to completely unreadable logs.

sytse 11 years ago

I'm not sure I completely understand. But it would be great if you can send small merge requests to GitLab to try to improve these things.

DougWebb 11 years ago

If you're using CI testing results and tying them to particular commits, you end up with the same problem whether you merge or rebase.

If you test a commit and it passes, and then merge that commit into master, the merge may have changed the code that the commit modified, or something that the commit's code depended on. The green flag you had on the commit is no longer valid because the commit is in a new context now and may not pass.

If you rebase the commit onto master, you're explicitly changing the context of the commit. Yes, you get a different SHA and you're not linked to the original CI result anymore, but that CI result wasn't valid anymore anyway. This is exactly the same situation that the merge put you into, but without the false assurance of the green flag on the original test result.

As many others have noted, rebasing is only recommended on private branches to prepare them for sharing on a public branch. If you're running CI it's probably only on the public branches, so rebasing wouldn't affect that. But if you're running CI on your private branch too, then you're going to want to run it after rebasing onto the public branch and before merging into the public branch. That gives you assurance that your code works on the public branch before you share it. Again, if you're using a merge-based workflow you'd have to do the same testing regardless of your earlier test results.

notduncansmith 11 years ago

My intuition of how CI should work is that it tests what the new master would look like were that commit [rebased|merged] into the current master, for that exact reason. Is that not how CI systems tend to work? What you've described above does seem awfully silly.

jamesmiller5 11 years ago

This is exactly how I set up our team's workflow. Private commits are pushed into Gerrit and run against the CI suite.

  1. If reviews + CI tests go well we fast forward merge onto master.
  2. If the commit's parent isn't the latest commit on master, it is automatically rebased and the CI suite is kicked off again.
  3. Upon successful fast forward merge into master, all in-flight reviews are automatically rebased on master's new head and CI's kicked off again.
  4. Any open commit can become the top of master without worry it will break the build.

For our team of ~10 this works exceptionally well, with master not having been broken by our code in the last ~6 months. (edit: formatting)
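
The steps above can be sketched as a small Python model. It is not Gerrit's API: `try_land`, the change dicts, and the `ci` callable are all hypothetical names, and "rebasing" is reduced to updating the recorded base commit.

```python
def try_land(master, change, ci_passes):
    """Fast-forward `change` onto `master` only if it is green on the tip.

    `master` is a list of commit names, oldest first. `change` records the
    commit it was last tested on ("base") and that result ("green").
    """
    head = master[-1]
    if change["base"] != head:
        change["base"] = head                                   # rebase onto the tip
        change["green"] = ci_passes(master + [change["name"]])  # re-run CI
    if change["green"]:
        master.append(change["name"])                           # fast-forward
    return change["green"]

# Hypothetical CI: changes "a" and "b" each work alone but not together.
ci = lambda history: not ({"a", "b"} <= set(history))

master = ["m1"]
a = {"name": "a", "base": "m1", "green": True}
b = {"name": "b", "base": "m1", "green": True}

landed_a = try_land(master, a, ci)
landed_b = try_land(master, b, ci)
print(master, landed_a, landed_b)
```

In this model the second change was green against the old tip, but it is retested after the automatic rebase and kept out of master when that run fails.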

brento 11 years ago

There is nothing wrong with rebasing a feature branch imho. Feature branches should be considered ephemeral. But it probably depends on your team and project size.

sytse 11 years ago

My personal opinion is that it breaks history and CI tests for all the feature branches. But at GitLab we encountered customers that insisted on having a linear history after migrating from SVN. Therefore there is a function in the UI of GitLab EE to rebase a merge request when accepting the merge request. See http://doc.gitlab.com/ee/workflow/rebase_before_merge.html

solutionyogi 11 years ago

> It is better to have a messy but realistic history if you want to trace back what happened and who did and tested what at what time. I prefer my code to be clean and my history to be correct.

^^^ Couldn't agree more.

However, I don't know why people want to avoid rebasing feature branches. Rebasing feature branches means that you only have to resolve the conflict once and have a clean history for your release branch. Granted, it works well in my team where only a single developer owns a given feature branch.

aidos 11 years ago

I don't think that's true.

If you have a feature branch with a number of changes in the same place, rebasing on to a branch that also changes in the same spot means you need to fix all of the related commits during the rebase. If it's a big feature that could end up being a huge task.

I may be wrong about that as I'm no git guru.

radicalbyte 11 years ago

Sytse, do you mind if I ask a slightly OT question?

How does GitLab store the code-review data? Is it stored in the (or a) git repo? Is the feature compatible with rebasing feature branches before merge?

Also, pricing: I only just noticed that your pricing was per YEAR, not per MONTH. Most bootstrap-pricing-page software is priced monthly and the user/year text is lowlighted. This has to be costing you sales.

sytse 11 years ago

You're very welcome.

When you accept a merge request the title, description and a link to the merge request are stored as the commit message. For example see https://gitlab.com/gitlab-org/gitlab-ce/commit/6c0db42951d65... This allows you to see any other things that were discussed. Hopefully any line comments were resolved with a commit (thus documenting them) or were based on a misunderstanding.

Thanks for the pricing tip, we'll fix it https://gitlab.com/gitlab-com/www-gitlab-com/issues/348

Mithaldu 11 years ago

Looking at the recent history I can see how you'd come to like it. You seem to mostly be doing merges or documentation changes, which probably means you don't have to do a lot of history spelunking to fix bugs caused months ago.

Are you sure your developers feel the same as you do? Are you sure they're willing to be open enough to you about their misgivings?

sytse 11 years ago

I think the majority of people on our team don't like rebasing because it makes spelunking harder in some cases. But there are certainly people that preferred having some commits rebased so it was easier to revert them (reverting a merge is possible but harder). Although I'm not an active developer myself I think my dislike of rebasing everything is shared.

mrbobdobbs 11 years ago

At our offices we use Pull Requests and Rebase as a combined work flow to get mostly linear history. Before issuing a PR we get the latest master, rebase our branch onto that so that all commits in our branch are approximately at the same time stamp and then issue a PR. This creates a nicely linear history for the most part.

The only evil is a willingness to force push the updated history over our branches before the PR goes up. But no one shares branches usually. Or the collaborators on a branch are few and they agree when to rewrite history.

dasil003 11 years ago

> In reality the history was never linear to begin with. It is better to have a messy but realistic history if you want to trace back what happened and who did and tested what at what time. I prefer my code to be clean and my history to be correct.

I disagree very very strongly with this. I wrote about this years ago at http://www.darwinweb.net/articles/the-case-for-git-rebase, but it's due for an update to clarify my thinking.

Basically my beef is the idea that never rebasing is "true" history, and rebasing gives you "corrupt" or "whitewashed" history. In fact, the only thing you have weeks, months, years after pushing code is a logical history. It's not as if git automatically records every keystroke or every thought in someone's head—that would be an overwhelming amount of information and difficult to utilize anyway—instead it's all based on human decisions of what to commit and when. Rebasing doesn't "destroy" history, it's just a decision of where a commit goes that is distinct from the time it was first written, but in fact you lose almost no information—the original date is still there, and from that you can infer more or less where it originally branched from.

"But," you say, "surely complete retention of history is preferable to almost-complete retention?". Well, sure, all else being equal I would agree with that. But here's the crux of the issue: merging also loses information. What happens when you merge two conflicting commits is that bugs are swallowed up in merge commits rather than being pinned down to one changeset. This is true whether it is a literal merge conflict that git detects, or a silent logic error that no one discovers until the program blows up. With two branches merging that are logically incompatible, whose responsibility is it to fix their branch? Well, whoever merges last of course, and where does that fix happen under a never-rebase policy? In that single monstrous merge commit that can not be reasoned about or bisected.

But if you always rebase before merging to master, then the second integrator has to go back and correct each commit in the context of the new branch. In essence, they have to fix their work for the current state of the world, as if they had written it today. In this way each tweak is made where it is visible and bisectable instead of squashed into an intractable merge commit.

I get that there is some inconvenience around rebasing coordination and tooling integration (although GitHub PRs handle it pretty well), but the idea that the unadulterated original history has significant value is a straw man. If the branch as written was incompatible at the point it got merged, there is no value in retaining the history of that branch in an incompatible state because you won't be able to use it anyway. In extreme cases you might decide the entire branch is useless and just pitch it away entirely, and certainly no one is arguing to save history that doesn't make it onto master right?
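
The "bisectable" argument can be made concrete with a short Python sketch of the binary search that `git bisect` automates. The commit names and the `is_bad` test are made up for illustration, and the history is assumed to be linear with a single point where things break.

```python
def first_bad(commits, is_bad):
    """Binary search for the first bad commit in a linear history.

    Assumes commits[0] is good, commits[-1] is bad, and that once a commit
    is bad every later one is bad too. `is_bad` is a hypothetical test.
    """
    lo, hi = 0, len(commits) - 1      # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]

history = ["base", "refactor", "rename-api", "fix-callers", "new-report"]
broken_since = history.index("rename-api")
culprit = first_bad(history, lambda c: history.index(c) >= broken_since)
print(culprit)
```

When every fix sits in its own small commit, the search ends on a change that can be read in one sitting. When the fix-up happened inside a large merge commit, the same search can only point at that merge.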

ryanthejuggler 11 years ago

> ...the history of a project managed using GitFlow for some time invariably starts to resemble a giant ball of spaghetti. Try to find out how the project progressed from something like this...

It's simple. Read backwards down the `develop` branch and read off the `feature/whatever` branches. Just because the graph isn't "pretty" doesn't mean it's useless.

In general, I'm starting to dislike "XXX considered harmful" articles. It seems to me like you can spout any opinion under a title of that format and generate lingering doubt, even if the article itself doesn't hold water. Not to generalize, of course--not all "XXX considered harmful" articles are harmful. They generally make at least some good points. I just think the title format feels kind of clickbaity at this point.

That said, kudos to the author for suggesting an alternative rather than just enumerating the shortcomings of GitFlow.

jaimebuelta 11 years ago

The approach discussed in the article seems to take into account only one possibility: you deploy master in prod, and it's always considered correct. That works for small projects, but in my experience, when you have a bunch of people (let's say 20) pushing code to a repo, you need several levels of "correctness":

- branches: Work in progress.

- develop: Code ready to share with others. It can break the build (merge conflicts, etc) and it won't be the end of the world.

- master: This shouldn't be broken. It needs to point to a commit that has already been proven not to break the build and to pass all the required tests.

As always, you need to find a balance with these things and adapt to the peculiarities of your code base and team. I really see them as suggestions...

chaitanya 11 years ago

I have never understood why people hate merge commits so much. Their advantages are not insignificant: you know when a feature got merged in master, it's much easier to revert a feature if you have a merge commit for it, much easier to generate a change log with merge commits, and you have none of the problems that pushing "cleaned up" histories will have: https://www.mail-archive.com/dri-devel@lists.sourceforge.net...

The main disadvantage, as the article rightly points out, is that it makes it much harder to read the history. But that's easily solved with a simple switch: --no-merges. It works with git-log, gitk, tig, and probably others too. Use --no-merges, and get a nice looking linear history without those pesky merge commits.
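
What `--no-merges` does to a log can be modelled in a few lines of Python. The history below is invented for illustration, and the filter simply drops every commit that has more than one parent.

```python
# Hypothetical history, newest first: (id, parent ids, subject).
log = [
    ("e5", ["d4", "c3"], "Merge branch 'feature/mobile'"),
    ("d4", ["b2"], "Fix typo in README"),
    ("c3", ["b2"], "Implement mobile friendly style"),
    ("b2", ["a1"], "Add plugin support"),
    ("a1", [], "Initial commit"),
]

def no_merges(entries):
    """Keep only commits with at most one parent, i.e. hide merge commits."""
    return [entry for entry in entries if len(entry[1]) <= 1]

for commit, _parents, subject in no_merges(log):
    print(commit, subject)
```

The listing reads as a flat sequence, but the commits from both sides of the merge are still interleaved in it; only the merge commits themselves are hidden.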

ninjakeyboard 11 years ago

You don't need to implement ALL of gitflow - I see it as scalable.

Master should always be latest production code, development branch contains all code pre-release. That's the core.

The other branches let you scale gitflow - if you need to track upcoming release bugfixes etc, you can use a release branch. A team of maybe 6 or 7 would likely start to need a release branch. Feature branches at this point are best left local on the developer's repository. They rebase to fixup commits, and then merge those into develop when they're ready.

If you get into bigger teams - like maybe 6 agile teams working on different larger features, then you can introduce feature branches for the teams to use on sprints to keep the work separate.

The issue with gitflow is the lack of continuous integration, so I personally like to get teams to work only on a develop branch during sprints and use feature toggles to commit work to the develop branch without breaking anything.

As I see it, gitflow and CI are at odds and that's my biggest gripe with integrating lots of feature branching for teams - everyone has to integrate at the end of the day.

So I believe the model can and should be scaled back as far as possible, using only develop and master and release as primary workflow branches, introducing the others when the need arises - doing it just because it says so in the doc isn't the right approach.

LukeB_UK 11 years ago |

Anyone else fed up of articles using "considered harmful" in the title? Especially when it's just that the author doesn't like that thing.

mayoff 11 years ago | |

http://meyerweb.com/eric/comment/chech.html

isaacremuant 11 years ago | |

I am. I resent such articles unless they come from a very clear eminence who has public and verifiable evidence to support his case.

This seems more of a: "This tool is popular but it doesn't work for me so it's bad".

In fact, as you say, he dislikes the tool (from the get go):

> I remember reading the original GitFlow article back when it first came out. I was deeply unimpressed - I thought it was a weird, over-engineered solution to a non-existent problem. I couldn't see a single benefit of using such a heavy approach. I quickly dismissed the article and continued to use Git the way I always did (I'll describe that way later in the article). Now, after having some hands-on experience with GitFlow, and based on my observations of others using (or, should I say more precisely, trying to use) it, that initial, intuitive dislike has grown into a well-founded, experienced distaste.

Throwing in my two cents: there's no perfect methodology, and teams that communicate and adhere to a set of standards will probably find a good way to work productively with git. They can always be helped with scripts like the gitflow plugin or some other helper if they think the possibility of human error is big.

I also have anecdotal experience of working both with and without it, and I'm fine with either, although I do appreciate git flow in any project that starts getting releases, supports bug fixes and hot fixes, and has been living long enough to incorporate orthogonal features at the same time.

cleaver 11 years ago | |

At this point, I just assume that "X considered harmful" contains an element of satire. That's not always the case, but I have no problem saying "Satirical 'Considered Harmful' Articles Considered Not Harmful".

pskocik 11 years ago | |

Yup.

solutionyogi 11 years ago |

So let me get this straight.

He is suggesting to use 90% of what GitFlow suggests (feature/hotfix/release branches) but doesn't like the suggestion of non-fast-forward merge and master/Dev and that makes GitFlow harmful? I don't think I agree.

I think having the Dev branch is useful. Consider this actual scenario at my current workplace.

1. We have 4 developers. Nature of the project is such that we can all work independently on different features/changes.

2. We have Production/QA/Dev environment.

3. When we are working on our individual features, we do the work in individual branches and merge into the Dev branch (which is continuously deployed). This lets us know of potential code conflicts between developers in advance.

4. When a particular feature is 'developer tested', he/she merges it into a rolling release branch (Release-1.1, Release-1.2 etc) and this is continuously deployed to QA environment. Business user does their testing in QA environment and provides sign off.

5. We deploy the artifacts of the signed off release branch to Production and then merge it into master and tag it.

Without the development branch, the only place to find out code conflicts will be in the release branch. I and others on my team personally prefer the early feedback we can get thanks to the development branch.

Advantages of an explicit merge commit:

1. Creating the merge commit makes it trivial to revert your merge. [Yes, I know it is possible to revert the merge but it's not exactly a one step process.]

2. Being able to visually see that set of commits belongs to a feature branch. This is more important to me (and my team) than a 'linear history' that the author loves.
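
To illustrate the first advantage, here is a hedged sketch of reverting a whole feature through its merge commit. The repository, branch and file names are invented for the example; `git revert -m 1` tells Git to treat the first parent (master before the merge) as the mainline.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git config commit.gpgsign false

# Hypothetical helper: one new file per commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature
commit feature-a
commit feature-b
git checkout -q master
git merge -q --no-ff -m "Merge branch 'feature'" feature

# One command undoes every commit the merge brought in.
git revert --no-edit -m 1 HEAD >/dev/null

remaining=$(ls)
echo "files left on master: $remaining"
```

One caveat worth knowing: after such a revert, merging the same branch again will not bring the changes back unless the revert itself is reverted first.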

We have diverged from GitFlow in only one way: we create all feature/release/bugfix branches from 'master' and not 'develop'.

Now, don't get me wrong, GitFlow is not simple, but it's not as complicated as the author seems to suggest. I think the author would have been better served by an article title like 'What I don't like in GitFlow'.

perlgeek 11 years ago |

A reason that we are switching from full git flow to a reduced model (basically one master branch + feature branches, occasional hotfix branches) is that git flow isn't compatible with continuous integration and continuous delivery.

The idea of CI is that you integrate all commits, so you must integrate the develop branch - build the software, run the tests, deploy it to a production-like environment, test it there too.

So naturally, most testing happens in that environment; and then you make a production release starting from the master branch, and then install that -- and it's not the same binary that you tested before.

Sure, you could have two test/staging environments, but I don't think I could get anybody to test their features in both environments. That's just not practical.

nijiko 11 years ago |

Do what is easiest for you.

1. Merging vs Rebasing

Open source projects should stick with Merging over cherry-picking and rebasing especially if you want others to contribute. Unless you feel fine doing all of the rebasing and cherry-picking for them. Otherwise, good luck gathering a large enough pool of people to contribute. Simplicity always wins here.

2. GitFlow vs X

Once again do what is good for your company and the people around you. If you have a lot of developers having multiple branches is actually /beneficial/ as Master is considered ALWAYS working. Develop branch contains things that are ready to ship, and only things that are READY TO SHIP. So if your feature isn't ready yet, it can't go to develop, and it won't hit master. Your features are done in other branches.

3. Rewriting history

Never do this. Seriously, it will come to bite you in the ass.

4. Have fun.

Arguing is for people who don't get shit done.

jacobparker 11 years ago |

> I thought it was a weird, over-engineered solution to a non-existent problem.

To be fair, it's a cookie-cutter approach that resonates with people unfamiliar with git but not ready/willing to invest the time to understand it deeply. That is understandable; a lot of people come from other systems and just need to get going right away, and git's poor reputation for command-line consistency etc. is well-earned.

(To be clear, I am not a fan of git flow.)

If anyone is interested in truly understanding git, start here: http://ftp.newartisans.com/pub/git.from.bottom.up.pdf

matthiasv 11 years ago | |

His point is that this cookie-cutter approach is more complex than the development model he presents (which is in fact pretty common among open source software). You don't need to understand Git deeply to realize that master is stable and merges in finished and cleaned up features.

jacobparker 11 years ago | | |

I should have been more clear - I agree that it's more complicated. My point was that regardless, it seems to resonate. I believe it's because the complications look useful at a glance and layering a rigid model on top of git frees you from having to consider its full scope of possible operation.

(I believe that git flow is definitely better than "everyone does things their way", and that's one competing "rule-book" for a team new to git.)

I'm pretty confident that understanding the tool better will help you to judge how to use it more effectively. The best way to understand git is to understand its data-model.

omouse 11 years ago |

The only thing GitFlow had going for it is that it has a clearly written article about it with pictures that explain how it works. That's it; the freedom of git and being able to define what works for you is too much for people and they think they need to turn development back into Subversion-style or desktop-release style.

ryanthejuggler 11 years ago | |

I agree with you, with one addition in GitFlow's favor--it standardizes how your team works. When you have multiple team members collaborating on a project, a poor standard is better than none at all.

gouggoug 11 years ago |

GitFlow is also in my opinion a bad flow as it does end up with a merge commit spaghetti over time.

Merge commits are great. They are here to group a list of commits into a logical set. This logical set could represent one "feature", but not necessarily. It is up to you to decide whether commits A B C D should or shouldn't be grouped by a merge commit. Merge commits also make regression searches (i.e. git bisect) a lot faster. And to top it off, they will make your history extremely readable, provided you merge correctly... and that is where git rebase and git merge --no-ff come into play.

At my company, every developer must rebase their topical branch on top of the master branch before merging. Once the topical branch is rebased, the merge is done with a --no-ff. With this extremely easy flow, you end up with a linear history, made of a master branch going straight up and only every once in a while a merge commit.

Our commit history looks like this:

  *-------------*---------*---------*----------*----*------->
   \-----------/          \---------/           \--/

Following the simple rule "commit, commit, commit..., rebase, merge --no-ff" avoided the merge spaghetti a lot of people complain about. Although, I have to admit our repository is small (6583 commits to date).
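
A minimal sketch of the "commit, commit, commit..., rebase, merge --no-ff" rule on a throwaway repository. Branch names, commit names and the `commit` helper are assumptions made for the example, not the commenter's actual setup.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git config commit.gpgsign false

# Hypothetical helper: one new file per commit, so the rebase never conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b topic
commit topic-1
commit topic-2
git checkout -q master
commit master-moved-on                     # master advances while topic is in progress

# Step 1: replay the topic branch on top of the current master.
git checkout -q topic
git rebase -q master

# Step 2: merge with an explicit merge commit even though a fast-forward is possible.
git checkout -q master
git merge -q --no-ff -m "Merge branch 'topic'" topic

first_parent_subject=$(git log -1 --format=%s HEAD^1)
if [ "$(git merge-base HEAD^1 HEAD^2)" = "$(git rev-parse HEAD^1)" ]; then
  rebased=yes
else
  rebased=no
fi
git log --graph --oneline
```

Because the topic branch starts exactly at the previous tip of master, the graph shows a single short side loop per merge, which matches the shape drawn above.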

This works even when multiple devs work on the same branch: they must get in touch on a regular basis, rebase the branch they are working on and force push it. Rewriting history of topical branches is only bad if it is not agreed on. As long as it is done in a controlled manner nothing's wrong with it.

Another rule we follow is to always "git pull --rebase" (or set it up once with "git config branch.autosetuprebase always").
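
For anyone wanting to set this up, a small sketch of the two relevant settings, applied to a throwaway repository. `pull.rebase true` makes a plain `git pull` rebase instead of merge; `branch.autosetuprebase` accepts `never`, `local`, `remote` or `always` and only affects tracking branches created after it is set.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q

# Per-repository settings; add --global to apply them everywhere.
git config pull.rebase true
git config branch.autosetuprebase always

pull_mode=$(git config --get pull.rebase)
branch_mode=$(git config --get branch.autosetuprebase)
echo "pull.rebase=$pull_mode branch.autosetuprebase=$branch_mode"
```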

Our approach might not, however, scale for larger teams or open source projects.

nsfyn55 11 years ago |

I have had this ideological debate about "fast forwarding" more times than I can count. I agree with the author that "no-ff" is silly. I've been working on professional software teams for over a decade. When I encountered fast forwarding/rebasing it was absolutely a breath of fresh air. I've been using git now for 5 years and I have not encountered a single instance where using either of these tools has presented any sort of problem. I can't remember a week where having a concise, readable history hasn't proven its value. I also can't remember a single time I've said "Man I'm glad I had all these merge commits around they really saved my proverbial bacon"

From what I can tell no-ff exists to satisfy the aesthetic preference of your local team pedant. It gives them something to do between harping on whether your behavior is in the correct "domain", deciding if a list comprehension is truly "pythonic", and spending that extra month perfecting the event sourcing engine to revolutionize the "contact us" section of your site.

Mithaldu 11 years ago |

I adore every person who advocates a mostly linear history and is able to elucidate that efficiently and elegantly. :D

ionforce 11 years ago | |

Me too. I'm on team linear history and I still haven't gotten off my bum to make an article/presentation about it.

But I'm with you, brother!

nsfyn55 11 years ago | | |

orbitur 11 years ago |

I can distill my thoughts down to this:

The time it takes to carefully rebase a branch onto another, and to compress commits for a feature into one, is still much longer than the time it takes for my eyes to pass over so-called "empty" merge commits.

If I want to look at when a feature entered a branch, I can look at its merge commit. And the feature branches are there to show how a feature was built; bugs could be the result of a design decision that happened in one of the midway commits.

I looked at OP's example pic in the blog, and I read all of his words, but I wasn't sold. His picture looks like a normal git history to me. It requires almost no effort to find what I'm looking for.

And that's not even touching his rage against the idea of a canonical release branch (master). But that's for another day.

rattray 11 years ago |

I actually thought GitHub Flow was more commonly used, as it's lighter-weight:

[1] http://scottchacon.com/2011/08/31/github-flow.html

[2] https://guides.github.com/introduction/flow/

talles 11 years ago |

I second the feeling of GitFlow being over-engineering: http://blog.talles.me/that-git-flow.html

CrystalGamma 11 years ago | |

little typo: vMAJOR.MINOR.PATH => vMAJOR.MINOR.PATCH

And I must say I agree with all you say in that post.

talles 11 years ago | | |

A downside of always typing, never copying and pasting.

Thanks!

marvel_boy 11 years ago | |

Absolutely agree, too much complexity.

juandazapata 11 years ago |

It's not confusing, maybe it's not suited for your particular case, as any other tool. There's no magic tool/process/etc that does it all for everybody.

GitFlow has been working great for us. A team of 15 developers, working with feature branches, we have our CircleCI configured to automatically deploy the "develop" branch to our "QA environment", and our "master" branch to "production" environment.

The "hotfix" and "release" branches have proven useful to us too; we just need to have effective communication within our team, so everybody rebases their feature branch after a merge into our main branches.

menssen 11 years ago |

I have come to actually like the two permanent branches approach. I know that for any repository that follows this model that:

* "master" is the current stable release

* "develop" is the current "mostly stable" development version

The first time you clone a repository this is an extremely helpful convention to quickly get your head around the state of things.

If you're doing it right (and don't use --no-ff, which I agree is unreasonable), I can't think of a scenario where this causes extra merge commits. Merges to master should always be fast forward merges.

jives 11 years ago | |

We follow this model. I like thinking of "develop" as an integration branch, and "master" as an always-deployable gold master. And yep, develop -> master merges are always fast-forward merges.
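
A hedged sketch of how the "always fast-forward" rule can be enforced rather than just hoped for: `git merge --ff-only` moves master when it can and refuses when the branches have diverged. The repository and commit names are invented for the example.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git config commit.gpgsign false

# Hypothetical helper: one new file per commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b develop
commit integrated-feature

# develop is strictly ahead of master, so this only moves the master label.
git checkout -q master
git merge -q --ff-only develop
if [ "$(git rev-parse master)" = "$(git rev-parse develop)" ]; then
  fast_forwarded=yes
else
  fast_forwarded=no
fi

# Let the two branches diverge and try again: the merge is refused.
commit direct-commit-on-master
git checkout -q develop
commit more-develop-work
git checkout -q master
if git merge --ff-only develop >/dev/null 2>&1; then refused=no; else refused=yes; fi
echo "fast-forwarded: $fast_forwarded, diverged merge refused: $refused"
```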

mcv 11 years ago |

Advising rebasing over explicit merges is dangerous and foolish. Rebasing does have its place, but you really need to know what you're doing.

Also, I don't see his point about that messy history. I can see exactly what happened in that history (though the branch names could have been more informative). With multiple people working on the same project, feature branches will save your sanity when you need to do a release, and one feature turns out to not be ready.

jedbrown 11 years ago |

Git-flow indeed has serious problems. This author is proposing a particular subset of gitworkflows(7).

https://www.kernel.org/pub/software/scm/git/docs/gitworkflow...

I like subsetting gitworkflows(7) because you can incrementally add process when the tangible benefits (like increased reliability and experimental access for eager users) outweigh their process cost (which depends on team experience). I wrote about these issues here:

http://mail-archive.com/search?l=mid&q=87zjx4x417.fsf@mcs.an...

This diagram represents a workflow that uses 'maint', 'master', and 'next' branches.

http://59A2.org/files/simplified-gitworkflows7.svg

btreecat 11 years ago |

To me (a mercurial user mostly) this is kind of like a "no duh" article having never read the original "gitflow."

I think that is because I am used to using hg's branches, bookmarks, and tags for different use cases.

If I want to mark a revision as a particular release number (which is something we don't really do here but I can see the value) then I would use hg tag. Tags are permanent.

If I want to mark a revision as "production" and then have some automated process take over based on the updated info, I would use hg bookmark. Bookmarks are the closest equivalent to git's branches. Bookmarks can be updated to a new revision or removed.

If I wanted to work on a parallel branch of development for an experimental feature or if I am attempting to upgrade some dependencies, I can use hg branch. This creates a named branch in the code base which is permanent. This branch can eventually be either closed or merged back into the main.

nycticorax 11 years ago |

It seems like one of the ironies of git-flow is that it would actually work better if you used it with Mercurial rather than Git, because Mercurial stores what branch you were on when you made a commit. This means that a tool could automatically look at the Mercurial commit tree and figure out which swimlane each commit belongs in, and use this information to draw a commit history tree that wasn't such a mess.

I apologize for the self-promotion, but this answer on Stack Overflow (and the question) talks about this difference between Git and Mercurial, and includes links to articles that explain it better than I could:

http://stackoverflow.com/a/26784550/1013442

mpdehaan2 11 years ago |

Posted a link to this prior blog of mine in article comments also - http://michaeldehaan.net/post/116465000577/understanding-whe... - but I'm a huge fan of rebase and topic branches.

One main branch is great, and when working with a large number of contributors I really like a clean history, which makes things much easier to review.

It's kind of a shame something got branded with a slick name like "GitFlow", when "doing it the way you ought to be doing it" doesn't have a slick name :)

rdsubhas 11 years ago |

A single eternal master works for a Continuously Deployed app/site.

Not for any other project where maintenance releases are a norm. This includes strict API compatibility projects, semantically versioned frameworks/plugins/libraries, many forms of desktop/offline apps, some android apps, most enterprise apps, etc - more or less where developers don't have the liberty to thrust the latest master on their users.

I'm not against CD, and not a big fan of Git Flow either. But different things have their own uses. I'm really liking GitHub Flow and GitLab Flow though!

habitue 11 years ago | |

Right, when you need to maintain (and patch) old versions of a piece of software, having eternal release branches is necessary. The fixes on those old versions often don't ever want to be merged back to master because the code is very different in more recent versions.

Touche 11 years ago |

> All other branches (feature, release, hotfix, and whatever else you need) are temporary and only used as a convenience to share code with other developers and as a backup measure. They are always removed once the changes present on them land on master.

From an open source developer's perspective I need more "eternal" branches because I need to plan future releases. Putting everything into master makes the decision for me (if I have a breaking change I have to bump a major version even if maybe I want to delay doing that).

songshu 11 years ago |

I wish GitFlow had not called that branch "master", and had called it "released" or "production" instead. It's really useful to have a branch which you know always exactly represents the code running in production. You can keep an IDE pointed at somewhere and update when you need to without worrying about tags or whatever. This is the one part of it I've tried to sell to colleagues, which would have been easier if it had a better name.

pskocik 11 years ago |

I think he makes a valid point about how it's not necessary to have both develop and master if you use tags. On the other hand, I think the `--no-ff` merges is what git-flow got right. The separation of features into their own branches is useful. It's basically about grouping related commits together. You can always render the history in a way that looks prettier and even if you can't--the history doesn't need to look pretty, the final product does.

ionforce 11 years ago | |

I disagree with you that the history doesn't need to look pretty.

Have you not found the ability to investigate/audit bugs hindered by non-linear histories?

scottious 11 years ago |

I don't see why there has to be "this is harmful" and "this is a better way".

I've used all kinds of branching models... I've used just a master branch and you commit directly to master. I've used full git-flow.

I think the branching model you use is dependent on the people and the project. But really no matter which model I've used it seemed to me to be fine... And if it wasn't fine, we extended it to meet our requirements.

Anderkent 11 years ago |

Most of the merges in his first pictures aren't even fast-forwardable, so his complaint about no-ff seems.. weird?

You should still rebase your feature branch on top of whatever you're merging into whenever you can, even if you're using git-flow. That's just common sense. When you do, your history looks almost the same as in his 'pretty graph', there's just one more 'link' back to the previous feature merge.

The advantages of this additional context are important. Firstly, you can get a compressed view of only the features that were merged (without detailed commits) with something like `git log --first-parent`. I guess the only way to do that in OPs approach is `git log | grep 'SPA-'`? Rather... unreliable.
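
A small sketch of that compressed view, on a throwaway repository with two invented feature branches. Only `git log --first-parent` is taken from the comment; everything else is scaffolding for the example.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git config commit.gpgsign false

# Hypothetical helper: one new file per commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
for f in login search; do
  git checkout -q -b "feature-$f" master
  commit "$f-step-1"
  commit "$f-step-2"
  git checkout -q master
  git merge -q --no-ff -m "Merge feature-$f" "feature-$f"
done

# Follow only the first parent of each commit: one line per merged feature.
summary=$(git log --first-parent --format=%s)
summary_lines=$(printf '%s\n' "$summary" | wc -l | tr -d ' ')
total=$(git log --format=%s | wc -l | tr -d ' ')
printf '%s\n' "$summary"
echo "first-parent view: $summary_lines commits, full history: $total commits"
```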

Using no-ff also means you don't have to do the silly thing of putting your issue name / branch name in every commit title. Titles are pretty short already, having to allocate ~10% of it to tracking the name of the branch is just wasteful. With no-ff it's obvious which feature the commit is for (the branch name in the merge). If your tool fails to present that in a reasonable fashion, that's disappointing, but the data includes this context and that's the most important thing.

As to the master/develop split, yeah I could be convinced it's unnecessary. Still, I think it's convenient to have a clear separation of 'this code is in production', 'this code is in development'. If you just make a release branch then merge it into develop, you have to know the exact tag before being able to find the latest release. 'master' being the alias for 'latest release' is fine.

tokenizerrr 11 years ago |

I like to be able to (temporarily) revert an entire feature branch, which merge commits help with. Is there a way to easily do this without them?

dankohn1 11 years ago |

We use a more extreme version of rebasing feature branches before merging into master: we squash the features into a single commit when merging to master. The reason is that we don't care about the (sometimes hundreds of) commits that made up a feature. What matters is that it works as designed and passes tests. If we merge a feature to master and then need to revert it, we will revert the whole feature.

This also allows us to keep merging master into feature branches, (where there is only a single commit that might need to be manually merged) instead of rebasing feature branches on master (in which case it can be necessary to manually merge multiple intermediate commits).

What cleared up git merge --squash for me was a comment showing that:

  git checkout master
  git merge --squash feature

is the equivalent of doing:

  git checkout feature
  git diff master > feature.patch
  git checkout master
  patch -p1 < feature.patch
  git add .
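
To see the end result, here is a hedged sketch of a squash merge on a throwaway repository (names invented for the example). Note that `git merge --squash` only stages the combined change, so an explicit `git commit` is still needed; also, the patch recipe above should only match it exactly when the feature branch already contains everything on master.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git config commit.gpgsign false

# Hypothetical helper: one new file per commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature
commit step-1
commit step-2
commit step-3

git checkout -q master
git merge -q --squash feature >/dev/null   # stages the combined change, no commit yet
git commit -q -m "Add feature (squashed)"

on_master=$(git rev-list --count master)
parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "commits on master: $on_master, parents of the squashed commit: $parents"
```

The three feature commits stay reachable only from the feature branch; master records a single ordinary commit with one parent.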

Thrymr 11 years ago | |

That loses a lot of information, though. If you want to bisect to find a particular bug, tracking it down to the merge is a good start, but I'd rather have the actual commit (from the maybe hundreds) that went into the merge. Sure, you can revert the feature, but what if you want to fix the bug?

Git history has a lot of commits. That's OK.

dankohn1 11 years ago | | |

Yes, the downside of a much more streamlined commit log is that we do throw out information.

radicalbyte 11 years ago |

I see GitFlow as a pragmatic workflow customized to cloud-based software. Master is auto-deployed, and Dev acts as insurance.

We're currently having lots of success with this:

* Always work in a feature branch.

* Pull master + rebase feature branch when done.

* Merge to master with --no-ff --edit and include a summary.

Rebasing feature branches keeps them readable and avoids continuous merges. Disabling fast-forward keeps the log for /master abstracted to feature-level, but the details are available in the graph.

Major releases are branched, minors (bugfixes) are tagged. Bugfixes are made in master and cherry-picked into the release where possible.

Currently our CI build only works on /master, but in the coming month it'll build all feature branches which have been pushed to the main repository.

This is very similar to how Perforce streams work, but it's distributed. If you really hate distributed version control and love GUIs then I can recommend Perforce.

jwr 11 years ago |

That's the first time I heard of GitFlow. People seriously do software development this way? I find that hard to believe.

What the author describes is fairly close to what I've been using in a number of companies now for the last 9 years or so.

Whether to rebase is a personal preference. I tell developers to always rebase local work before committing. Unobserved changes might as well not exist (if a tree falls in the forest and no one is there to hear it falling, does it make a sound?), so if you haven't pushed your work, rebase it. No one cares when you did the work.

As for feature branches, it depends. If the history is clean and there aren't too many at one time, we might merge without rebasing. But I still prefer to clean up the commit history and rebase. I don't understand the obsession with "true history". History is written by victors, in this case — resulting work/code.

JulianMorrison 11 years ago |

The whole point of hotfixes is that they are relative to old code and that they alter what is considered the current version of that old release. Which is important when (as is typical in business) you have customers who are on specific releases and either haven't paid for the new hotness, or haven't integrated to it and don't yet want it.

So absolute minimum you need one persistent branch per old release, if you ever hotfixed it and still have it deployed in the field. GitFlow falls over here, because it only has one master. But at least it does recognize the fact that repairing released code is different from pushing the unreleased state of the art forward.

Finster 11 years ago |

I can count on zero hands the number of times I've needed to solve a problem by navigating the branch tree.

I've lost count of the number of times that two eternal branches and feature branches with pull requests (+ code review) has saved major flaws from getting to production.

The develop branch is perfect for automatically deploying our bleeding edge to our test server.

Although, if we move to a more continuous deployment approach, we may transition away from two eternal branches. But when GitFlow was first written about, continuous deployment really wasn't the trend that it is now.

Methods will continue to evolve...

uzero 11 years ago |

Writing considered harmful posts is considered harmful - I hate these so much.

leni536 11 years ago |

I never used gitflow so I could be wrong but the main problem with logging seems to be this:

Gitflow thinks about branches as lanes. Git branches are actually labels. What's the difference? In the gitflow model every commit belongs (implicitly) to a branch (or a lane). Git branches don't work that way. One could actually implement "lane" as an additional commit metadata and tweak git-log (and other git utilities) to always show lanes in straight lines in the graph.

5outh 11 years ago |

On the main project I'm working on, the reason we have develop/master is mainly for hotfixes.

We deploy once a week, but if we need to get something out the door quickly, we make a hotfix branch off of master, then merge it into both develop and master. This way, if we find something that needs to be fixed before the next release, but don't want to push half-done updates, we can seamlessly do it.
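
A minimal sketch of that hotfix path, assuming a throwaway repository where develop carries unfinished work. Branch and file names are hypothetical.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git config commit.gpgsign false

# Hypothetical helper: one new file per commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit release-1
git checkout -q -b develop
commit half-done-feature                   # not ready to ship

# Branch the hotfix from master, not from develop.
git checkout -q -b hotfix master
commit urgent-fix

# Merge it into both long-lived branches, then drop the temporary branch.
git checkout -q master
git merge -q --no-ff -m "Merge hotfix into master" hotfix
git checkout -q develop
git merge -q --no-ff -m "Merge hotfix into develop" hotfix
git branch -q -d hotfix

if git cat-file -e master:half-done-feature.txt 2>/dev/null; then leaked=yes; else leaked=no; fi
if git cat-file -e develop:urgent-fix.txt 2>/dev/null; then fixed=yes; else fixed=no; fi
echo "unfinished work leaked to master: $leaked, fix present on develop: $fixed"
```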

Bahamut 11 years ago |

One thing that bothers me about GitFlow is that it mangles history with merges. Sometimes it becomes tricky to debug issues when history was created with GitFlow.

I would rather branch off of master, bring changes in via git am or rebasing when ready, then tag a release when it is ready to be released. If there is something wrong with master, the tagged releases serve as easy points to branch off of.

pacquiao882 11 years ago | |

I think it depends on the project's structure and team discipline. I tend to prefer straight lines in the history where a single feature or a group of similar functions are linear.

t_fatus 11 years ago |

I think it greatly depends on the size of your team: if you're alone you have one branch, if you're two you may have 3 branches, each for one of you + master, if you're four you start to use feature-based branches..

It might be fun to compute the number of branches needed as a function of the number of devs in your team.

mml 11 years ago |

Ever since I started getting involved in SCCM stuff, I've been astounded at how much breath and emotion people are eager to waste defending their choice, or technique or strategy or whathaveyou.

SCCM system discussions should be banned on HN, as pointless and heated as vi vs. EMACS discussions.

erikb 11 years ago |

It's not really harmful just because it's too much overhead for small projects. If you have a huge project I'd assume that it's much harder to read history anyway, and then a more complex pattern of development is reasonable.

dicroce 11 years ago |

Even though I could frequently commit on feature branches, I usually don't. Hence, when I merge feature branches I don't have crazy messy histories that I feel it necessary to rewrite.... Works for me.

jguegant 11 years ago |

What about using merging when it comes to the feature branches but rebase when pulling (git pull --rebase)? Is it that harmful to rewrite the history for your local changes?

darylteo 11 years ago |

Curious here: has anyone tried using GitLab + forks to replace development branches? Would it be needlessly overcomplicated?

sytse 11 years ago | |

You can do it but at GitLab we advise against it if you can avoid it. Many things become harder, for example it is more work to link merge requests to issues and you can't push a commit to help a person without them giving you access first.

lgp171188 11 years ago | | |

If you are using the Integration-Manager workflow (which GitLab doesn't support as well as Bitbucket or GitHub), all the members of a team have read access to all the repositories and forks in the team namespace. That means the owner of a developer branch fork can always read the repo of another contributor and pull the changes.

dimino 11 years ago |

I personally find git flow to be a wonderfully elegant and simple way of handling a project in git. Not everything is perfect, but I consider git flow to be much like PEP8. It's almost always a good idea to do it the git flow way, unless you have a very specific and documented exception, in which case do that instead.

To me what matters more is the consistency.

Also, the attitude and tone of this article straight up stinks.

- Checkout master.

- Start an interactive rebase of master onto the last commit before the series of commits you wish to remove.

- Mark all the commits you don't care about as "skip".

- Let the rebase run and resolve conflicts on the way, the same as you'd do with your current work flow.
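
The steps above can be sketched non-interactively on a throwaway repository. This is an illustration under assumptions: commit names are invented, and instead of typing in an editor the todo list is edited by `sed` through `GIT_SEQUENCE_EDITOR`. Deleting a line from the todo list removes that commit (recent Git versions also accept a `drop` action for the same purpose).

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git config commit.gpgsign false

# Hypothetical helper: one new file per commit, so removing one never conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
commit keep-1
commit unwanted-experiment
commit keep-2

# Rebase the last three commits onto "base"; the scripted editor deletes the
# todo line of the commit we no longer want.
GIT_SEQUENCE_EDITOR="sed -i.bak '/unwanted-experiment/d'" git rebase -q -i HEAD~3

subjects=$(git log --format=%s)
count=$(printf '%s\n' "$subjects" | wc -l | tr -d ' ')
if printf '%s\n' "$subjects" | grep -q unwanted; then dropped=no; else dropped=yes; fi
echo "commits left: $count, unwanted commit dropped: $dropped"
```

Since this rewrites every commit after the removed one, doing it on a shared master means a force push and coordination with everyone who has already pulled.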