EffVer: Version your code by the effort required to upgrade

EffVer: Version your code by the effort required to upgrade (jacobtomlinson.dev)

98 points by hack_ml 2 years ago | 46 comments

samatman 2 years ago

The only issue with SemVer is that it's a social contract. There's an available solution to this: make it a technical contract instead.

Most languages these days have a built-in test suite. They can define "no breaking changes" so that it actually means something. Have a set of tests called API. During a major release cycle, you can add tests, but you can't change the tests you have, and the tests have to keep passing. The package registry can run those tests, and if any fail, you don't get to post a minor version release with that code.

This goes from an underdefined "our API will have no breaking changes" to "this is the guaranteed behavior of the API, and cannot change until the major version number is bumped". If a downstream user of the package sees some behavior they want added to the API contract, they can write a test and submit it as a PR, and that test can go into the next release if the maintainers agree that it's a stable behavior which they don't intend to change.

When you move from e.g. 1.0 to 2.0, the tests which now fail are moved to "1.0 API", but they're never removed. No test which is ever in an "API" testset can ever be removed; the package manager enforces this. Provide some mechanism so users of the package can annotate API tests in packages they use as a part of their own test suite, so that when they upgrade, those tests failing is an immediate message about what no longer works. If you only rely on behavior which is in common from 1.0 to 2.0, it should be safe to upgrade.

No more taking people's word for it when they say "no breaking changes", no more bikeshedding about what is or isn't a breaking change, just... tests. End of.
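
The registry-side rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names (`fingerprint`, `check_release`) and the idea of passing API tests as a name-to-source mapping are hypothetical, not part of any real package registry. The sketch covers only the "tests may be added but not changed or removed" rule; a real registry would also have to run the tests.

```python
import hashlib


def fingerprint(api_tests: dict[str, str]) -> dict[str, str]:
    """Map each API test name to a hash of its source text (hypothetical helper)."""
    return {
        name: hashlib.sha256(source.encode()).hexdigest()
        for name, source in api_tests.items()
    }


def major(version: str) -> int:
    """Extract the major component of a MAJOR.MINOR.PATCH string."""
    return int(version.split(".")[0])


def check_release(old_version, old_tests, new_version, new_tests) -> list[str]:
    """Return a list of violations; an empty list means the release is acceptable."""
    if major(new_version) > major(old_version):
        # A major bump is allowed to change the contract. In the proposal the
        # old tests would be archived under the old major version, not deleted.
        return []
    old_fp, new_fp = fingerprint(old_tests), fingerprint(new_tests)
    problems = []
    for name, digest in old_fp.items():
        if name not in new_fp:
            problems.append(f"removed API test: {name}")
        elif new_fp[name] != digest:
            problems.append(f"modified API test: {name}")
    return problems


old = {"test_add": "assert add(1, 2) == 3"}
new = {**old, "test_sub": "assert sub(3, 2) == 1"}  # adding a test is fine
print(check_release("1.2.0", old, "1.3.0", new))    # no violations
print(check_release("1.2.0", old, "1.3.0", {}))     # removal is rejected
```

Hashing the test source is the simplest way to detect edits, but it also flags harmless reformatting, so a real implementation would probably need something more forgiving.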

jchanimal 2 years ago

I describe this idea of an executable feature spec in my roadmap blog post from earlier this year. I agree it’s a great way to think about it.

https://fireproof.storage/posts/roadmap-to-1.0/

We’d define 1.0 in exactly the way you describe, where we can add tests for 1.1 but not remove them without triggering 2.0

abathur 2 years ago

I fiddled around a little with the idea of test-driven versioning a while back. Maybe you'd find it interesting. https://github.com/abathur/tdver

I did draft a git-based implementation (https://github.com/abathur/tdverpy), but it just obviously can't be as compelling as one that was part of a language's native tooling/ecosystem could be.

samatman 2 years ago

This is quite similar to what I have in mind, yes. Great minds think alike!

I do think that having an API subset of tests is better than basing the system on all tests. Packages should have as many tests as possible. I frequently write tests which I know will break when I do further work on the code, so that I notice when it happens, and because if it happens accidentally it's probably a bug. I wouldn't want a versioning system to have the side effect of making people reluctant to write a test because it would commit them to the results. I envision tests migrating from the rest of the suite to the API set over time.

I do like that your system completely specifies the meaning of minor and patch numbers, and wonder if there's a way to tweak my proposal so that it does so as well.

bbkane 2 years ago

It's a good idea, but I think it still relies on people taking the effort to be responsible, which just doesn't seem to work long term...

samatman 2 years ago

That's always a risk! One of the strengths of the proposal is that if a maintainer slacks on defining a solid API testset, users can submit the tests they think belong. At that point the responsibility is baked in: once a test is added, you either keep it green or bump the version, enforced by the registry.

If a maintainer staunchly refuses to define an API, that's useful information, the kind you can't get with standard SemVer, where the only mechanism is trusting strangers to do the right thing. Which, to be fair, works ok, some of the time.

dmurray 2 years ago

I don't know why this got downvoted; it's at the very least an interesting proposal. Would love to hear a critique arguing that it's a terrible idea.

Existing package managers could even implement it in a completely backwards-compatible way: if you as a package maintainer don't care for it, you simply never add "API tests".

chacham15 2 years ago

It's an idealistic view which will almost certainly fail. Test suites, as much as we like to hope that they reflect real usage, mostly don't. A simple example: if function a gets changed from O(n) to O(n^2) but otherwise behaves identically, most test suites will still pass, but if a user calls that function in their own inner loop, they can go from O(n^2) to O(n^3), which can definitely break a lot of things (for example, a transaction that holds a lock for too long and so gets aborted). Being able to catch the above is a high bar for a test suite, and I'm fairly confident most test suites are way below it.
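
A small sketch of the situation described above, with made-up function names: two implementations that return identical results but differ in complexity, and a behavioral "API test" that cannot tell them apart.

```python
def has_duplicates_v1(items):
    """Linear time: one pass, remembering what has been seen."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


def has_duplicates_v2(items):
    """Quadratic time: compares every pair, but returns the same answers."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def api_test(fn):
    """A purely behavioral test: it checks results, not running time."""
    assert fn([1, 2, 3]) is False
    assert fn([1, 2, 1]) is True
    assert fn([]) is False


# Both versions satisfy the same contract, so the test suite stays green
# even though a caller looping over large inputs would see a slowdown.
api_test(has_duplicates_v1)
api_test(has_duplicates_v2)
```

Catching this kind of regression would need a performance assertion of some sort, which is much harder to make stable across machines than a result check.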

falserum 2 years ago

I would not call it terrible, but I got a “silver bullet” vibe, which it is definitely not.

1. For a library there is the API and there are implementation details. What if a test depends on an implementation detail?

2. What if a test has an indisputable bug?

3. Does test refactoring require a major release now?

4. Realistically, the test suite will have some execution paths not covered.

I like the idea of running the same tests over multiple versions to observe changes. But I disagree that it would automate semver. (Maybe in a very limited subset of cases.)

P.S. Not an actual downvoter, but if I had downvoted, these would have been the reasons.

zamadatix 2 years ago

EffVer ignores that different users will experience different amounts of pain, so it doesn't solve the complaint it has about SemVer. If 99% of your users need to do nothing but 1% of your users are going to need significant effort to migrate (say, retiring a schema a couple of versions old that most users never even used) then macro/meso/micro all fail to communicate the expected amount of pain. Similarly, if you take the attitude that every minor patched bug could have users, then micro isn't communicating anything different than it would have in semver anyway.

If you want to communicate impact it might make more sense to add on to semver in some way with two axes, "amount of effort" and "likelihood it impacts you", as say "-b7" or something. That said, start trying to include so much information in the version string and eventually you'll just end up with a compressed version of the release notes and not a version number.

gtirloni 2 years ago

There is no replacement for reading changelogs or release notes.

Maybe if people did that for their dependencies, we wouldn't have certain software stacks with thousands of them for a simple helloworld-ish backend.

I'm of the opinion that SemVer or any other version arrangement is not to be trusted blindly. When I see a minor version upgrade, it gives me some hope I can upgrade without much trouble but I've been burned too many times to go in blind like that.

codetrotter 2 years ago

> That said, start trying to include so much information in the version string and eventually you'll just end up with a compressed version of the release notes and not a version number.

I hear ya. So what we should be doing is making a 4096-dimensional vector based on an embedding created from our release notes. And use that as the release version :D

medstrom 2 years ago

What is "based on an embedding"?

thrwwycbr 2 years ago

I have a solution for that.

SocialVer:

- upvote or downvote major release changes

- emojis to communicate the level of upgrade pain

- emojis to communicate the level of disaster after upgrading

dbrueck 2 years ago

A long time ago I gave up on trying to convey much meaning in version numbers, and have used YYYYMMDDBB (BB = build number for that day, starting at 0) for well over a decade, and I love it.

There are many 'pros' to this approach: it's stupidly simple and tools can autogenerate it easily, it's trivially sortable, it tells you how long ago the release happened, but above all it intentionally conveys nothing about your perception of the magnitude of changes and therefore is never misleading. The real meaning is conveyed via release notes: high level changes (with emphasis on any breaking changes) followed by a detailed changelog.
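
A minimal sketch of how a YYYYMMDDBB string could be autogenerated. The function name `date_version` and the two-digit limit check are assumptions for illustration, not the commenter's actual tooling.

```python
from datetime import date


def date_version(day: date, build: int) -> str:
    """Format a release as YYYYMMDDBB, where BB is that day's build number."""
    if not 0 <= build <= 99:
        raise ValueError("build number must fit in two digits")
    return f"{day:%Y%m%d}{build:02d}"


versions = [
    date_version(date(2024, 3, 5), 1),
    date_version(date(2023, 12, 31), 0),
    date_version(date(2024, 3, 5), 0),
]

# Because every field is fixed-width and ordered from most to least
# significant, plain string sorting is also chronological sorting.
print(sorted(versions))
```

The fixed width is what makes the scheme trivially sortable; it is also why the sketch rejects a build number above 99 instead of letting the string grow.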

I understand the desire to convey more meaning in the version number itself, but every alternative approach I've tried always falls apart in some way and/or becomes more trouble than it's worth, especially when it's a marketing person who wants version numbers to get bigger faster or a "humble" team member who is anxious to call this the 1.0 release.

Stuff like SemVer seems like a good idea initially, but even with a rigorous test suite there are cases where a bug fix or new feature isn't quite as backwards compatible as intended, so trust in the version number only goes so far. Or it tends to give undue emphasis, e.g. in this release you are pushing out several backwards compatible bug fixes and you are finally pulling a feature you deprecated a long time ago. You have good evidence that nobody has used this feature in years, and for all intents and purposes this is a very small patch release, but you instead have to bump the major version, implying that it's a big release.

Something like EffVer is an interesting approach, but when it ends up being inaccurate for you (i.e. when a supposedly painless upgrade is anything but), then all it has done is pour salt on the wound.

rvdginste 2 years ago

I still consider semver better. When it is used correctly, the version number gives a clear indication of what kind of changes to expect when upgrading. Obviously this is done to the best of knowledge of the author and might not always be 100% correct.

Either way, the amount of work to do for an upgrade depends on which parts of the product you are using and whether those parts have any changes in the new version. For this reason, most projects also have a changelog which gives you more detailed information about the upgrade. When preparing for an upgrade it is advised to read the changelog.

GuB-42 2 years ago

Isn't it essentially generalized semver?

The more breaking a change is, the more effort is required to take it into account. Semver only applies to APIs. Effver could apply to UIs too, but for APIs, it would be similar, just not as well defined (because it is more general).

krainboltgreene 2 years ago

Any versioning mechanic that allows for a `0.X.Y` has ultimately failed its users. There are libraries on almost every package manager that have millions of downloads, thousands of production users, but still pretend they are `0.X.Y` as if that means anything. I mean just think about this sentence:

    zero version still denotes a codebase under development

A human wrote that and said "Yeah, this makes sense to me." All code is under development until it's not.

CuriousCosmic 2 years ago

I don't think that's really necessarily fair.

A major version of zero means pre-release code. i.e. a codebase under active development (with the implicit assumption that there will likely be major breaking changes).

A major version of zero just means "I am not committing to a stable API until 1.0" which is a completely fair stance. I'm not going to write code that's very clearly unstable and in active churn and try to pretend it's stable. I'm also not going to keep around a legacy API at that point yet.

Compare that to a standard bump in major version (i.e. 1.0 to 2.0). In this case there is an expectation of a migration path and in all likelihood a versioned legacy API that'll stick around so that users can slowly migrate across the breaking changes between API versions.

Frankly I'm not going to commit to doing that for 0.X.Y/indev projects.

krainboltgreene 2 years ago

> i.e. a codebase under active development (with the implicit assumption that there will likely be major breaking changes).

You have just described all actively written software as "major zero". This is why it's a silly concept.

medstrom 2 years ago

All my software so far is 0.X.Y. I'm thinking about just dropping the 0.

Though that'd communicate something totally different from EffVer... Bumps in my X are not "macro effort".

bdjsiqoocwk 2 years ago

I didn't understand the objections to semver. Can someone give me a specific example of semver failing? I can't think of anything other than the package publisher choosing the increment wrong, and in that case it's not the versioning mechanism's failure, but the publisher's.

cryptonector 2 years ago

At first glance, I love the thought.

In reality projects/vendors often make versioning decisions for marketing reasons. If you add a ton of killer features with no backwards incompatibility and trivial upgrade path, you might still bump the major version number even though normally that would denote radical backwards-incompatible change.

The need for marketing versioning will not go away, so maybe what we need is an upgrade quantifier modifier to the version number.

E.g., 8.0.0 can be a major functionality release, and 8.0.0-ez can be a major functionality release that has an easy upgrade path while 8.0.0-hd can be a major release that has a difficult (hd == headache) upgrade path.
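
A rough sketch of how such a suffixed version string might be parsed. The `TaggedVersion` type and `parse_tagged` function are hypothetical names, and only the two suffixes from the comment (`ez`, `hd`) are accepted.

```python
import re
from typing import NamedTuple, Optional


class TaggedVersion(NamedTuple):
    major: int
    minor: int
    patch: int
    effort: Optional[str]  # "ez", "hd", or None when no suffix is given


# Three numeric fields, optionally followed by an upgrade-effort suffix.
_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-(ez|hd))?$")


def parse_tagged(text: str) -> TaggedVersion:
    """Split a string like '8.0.0-hd' into its numbers and effort tag."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a tagged version: {text!r}")
    major, minor, patch, effort = match.groups()
    return TaggedVersion(int(major), int(minor), int(patch), effort)


print(parse_tagged("8.0.0-ez"))
print(parse_tagged("8.0.0-hd").effort)
print(parse_tagged("8.0.0").effort)
```

One caveat with this spelling: under SemVer 2.0.0 a hyphenated suffix is read as a pre-release identifier, so standard tooling would treat `8.0.0-ez` as preceding `8.0.0`. A real scheme would likely need a different separator or tool support.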

Karellen 2 years ago

* Not related to or affiliated with The EFF https://www.eff.org/

(I realise that TLAs have a limited namespace and are bound to have multiple meanings in many contexts, but The EFF is quite a prominent and well-established use in the computing/software arena.)

buro9 2 years ago

Isn't the effort relative to what you're currently running?

If the micro version you're running is 100 versions behind, is it still expected to be micro effort?

medstrom 2 years ago

People aren't stupid. Yes, if you skip 100 micro versions, it may not be so micro, but people can do arithmetic themselves.
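
The arithmetic in question could look something like this sketch, which reports both the largest kind of bump between two versions and how many releases the upgrade crosses. It assumes plain numeric macro.meso.micro strings; the helper names and the `released` list are made up for illustration.

```python
def parse(version: str) -> tuple[int, int, int]:
    """Split 'macro.meso.micro' into a tuple of integers."""
    macro, meso, micro = (int(part) for part in version.split("."))
    return macro, meso, micro


def bump_kind(current: str, target: str) -> str:
    """Name the most significant component that differs between two versions."""
    labels = ("macro", "meso", "micro")
    for label, old, new in zip(labels, parse(current), parse(target)):
        if old != new:
            return label
    return "none"


def summarize_upgrade(current: str, target: str, released: list[str]):
    """Return (largest bump kind, number of releases crossed by the upgrade)."""
    low, high = parse(current), parse(target)
    crossed = [v for v in released if low < parse(v) <= high]
    return bump_kind(current, target), len(crossed)


released = [f"1.2.{n}" for n in range(101)]  # 1.2.0 through 1.2.100
print(summarize_upgrade("1.2.0", "1.2.100", released))
```

The version number alone only says "micro"; the count of crossed releases is the extra piece of information a reader has to bring themselves.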

hervem 2 years ago

I read "effort" and "versioning" and got some PTSD from Scrum, remembering how our team (maxing out at 6 people) never succeeded in agreeing on an effort scale.

Am4TIfIsER0ppos 2 years ago

How much bigger should python version 3 have been then?