Packagers don't know best (vagabond.github.io)
Unbundling upstream libraries from downstream projects flattens the change-flow network, reducing the time it takes for things to get fixed and for the fixes to propagate. For example, say that project P uses library L and bundles a slightly modified L in its release. Whenever L’s developers fix or improve or security-patch L, P’s users don’t get the new code. They have to wait for P’s developers to get around to pulling the new code from L, applying their own modifications, and re-releasing P.
Packagers say that’s crazy. They ask: Why does P need a modified L? Is it to add fixes or new features? If so, let’s get them into L proper so that L proper will not only meet P’s needs but also provide those fixes and new features to everyone else. Is it because P’s version of L is no longer L but in name? Then let’s stop calling it L and confusing everybody. Fold the no-longer-L into P or release it as a fork of L called M that can have a life of its own.
The point is that keeping L out of P makes two things happen: (1) It ensures that when L’s developers improve L, all users, including P’s downstream users, get those improvements right away. (2) It ensures that when P’s developers improve L, those improvements flow upstream to L quickly and reach all of L’s users, too.
More improvements, to more people, faster. That's the idea.
I'm not saying I disagree with you, just trying to point out a spot you might have overlooked.
But just from a logical standpoint, couldn't you apply the following equally to the Riak guy who is complaining: "arbitrarily changing parts of a software package without even trying to understand the consequences of those changes is madness"
It absolutely is madness. If the Riak guys want to use leveldb in a way Google won't support, they should rally with the package managers and get Google to stop being "pretend open source." (Hint, Google: just releasing the source doesn't work if you ignore all bug reports and patches from outside.)
I suspect the real issue here is too much "Not Invented Here" syndrome by all parties involved.
It enumerates minimum versions of shared libraries, as well as explicit versions of static libraries.
This is part of the reason for the plethora of Linux distributions. Some deployments can afford the rapid pace (and consequent instability) of the short term Ubuntu releases or Fedora. Other deployments really do require the longer term stability of the more methodical Ubuntu LTS releases or CentOS / RHEL.
Any improvement requires change, but not all changes are an improvement.
The improvement I'm talking about occurs upstream of the distributions, even though it is caused by the distributions' packaging policies.
Libraries are upstream from projects, and projects are upstream from distributions. If the distributions discourage projects from bundling libraries, this policy will encourage project developers to talk to the upstream library developers to get desired changes into the libraries, rather than go the customize-and-bundle route. This improved coordination and patch-flow benefits the users of the libraries and the users of the projects, regardless of whether those users rely on any particular distribution to get the software. Users are, as always, still free to pick whatever distribution best suits their preferences, or no distribution at all. Still, they benefit from the distributions' debundling policy.
Authors of well packaged software know when they need fine grained library features, and include static versions of that library. Authors of well packaged software also pay attention to the distribution of commonly used libraries, and make careful decisions when using system-provided shared versions of those libraries.
The author of the article is complaining when someone downstream overrides those decisions. If you're asking how many clusters one of the primary developers of Riak has run, you may not be reading closely enough.
In that time frame I'm stuck between not shipping a new version of P (often unacceptable as I've got users to answer to) or shipping my own slightly modified version of L.
Saving space is just a nice side effect, so why not have that too?
The DLL hell problem doesn't exist in a GNU-based system because we have sonames. Windows and Mac OS X don't have those; instead, the software libraries there can't coordinate with each other harmoniously, so each program has to have all of its libraries packaged with its own set of bugs while making a hostile and rude gesture to the rest of the programs in the OS.
I care about having a system with hundreds or thousands of packages installed on it that all work consistently.
Linux is not OS X, and packages are not .dmg files; I want your package using the system version of libfoo, not your own fork of libfoo. If you have awesome changes to libfoo, then you should either get them into upstream libfoo, or go all the way and actually fork libfoo into libfooier upstream to allow packaging it separately.
Linux and OS X both have the same underlying options for static or shared libraries. There is a large amount of "enterprise" software that is distributed just like a .dmg file.
There is plenty of middle ground between having everything dynamically linked and everything statically linked. The author of the article believes that packagers should trust developers to make good decisions. (Granted, there are plenty of bad developers and abandonware galore. Packagers are justified in stepping in to make new decisions here.)
Very few people understand the importance, and benefits, when they have never seen anything other than .msi[1], .dmg, or worse, .zip.
[1]: Let's just pretend for a minute that Windows software only comes as plain MSI, and that all the various .exe's which splatter stuff all over the disk simply don't exist.
That package is a dummy package that depends on erlang-base and the rest of the base erlang platform. You would have to force dpkg to ignore dependencies in order to install erlang without erlang-base. I would love to hear how that happened.
Splitting things up into multiple packages makes distributions easier to manage. One person can take the lead on package-dev while another person can take the lead on package-doc. Splitting things up into multiple smaller packages also makes distributing fixes a lot easier. With a one line fix to one include would you rather send out the entire erlang environment or just the small package that needed the fix?
And yes splitting things up to save storage requirements is most useful for resource constrained devices, not new servers/laptops. But it means that a user who is comfortable with Debian or Fedora on the server/desktop can use their same trusty OS on their next project when the device places serious restrictions on system overhead.
Nix is a purely functional package manager. This means
that it can ensure that an upgrade to one package cannot
break others, that you can always roll back to previous
version, that multiple versions of a package can coexist
on the same system, and much more.
So you can all have your own versions of lager or whatever, and still have everything managed sort of nicely. Doesn't solve the "include the docs or not" problem, though. And I'm not sure if it does anything for tmoertel's patching concerns.
It solves the security issue, the space issue, and also the "I need special patches for my version of this lib in my unique application" issue.
They also include the meta packages [1] so you don't have to install every package individually if disk space is not an issue.
[1] https://fedoraproject.org/wiki/Features/TeXLive#Benefit_to_F...
Also, I'd say, if your software needs lots of modified dependencies, you're not communicating with those projects properly.
If every single project were to fork every one of their dependencies, the result would be a maintenance nightmare.
It's certainly nice to be able to take an existing library an app depends on, patch it to fix a security hole, and drop that in. But that isn't what's happening in this context...
In both cases, I've basically written "lazy python bindings" for something in C++ (lazy because I only support the features I want in pythonland). Neither of the C++ projects is on github or anything, they're just hosted out there somewhere else (one on SVN, and one only available as archives, I think.)
In the archive case, and since the codebase is small, I just included the whole codebase in my git repo, and added a few small cpp, pyx and py files around it. This library already has a fork, and has the most stars (like, 3) of all my github repos - embedding all the required code and statically linking (indeed, compiling) it as part of my `setup.py` works great, and is easy for 3rd party users too.
In the SVN case, the main project is huge, like a few hundred MB of source (and they use some crazy code generation, so that's not even the half of it.) It also comes with its own very very basic python driver. So, my approach is to give people two or three small patches, build instructions (the project is a nightmare to build correctly,) and then my python code just installs on its own and talks to the project as a normal python library. This version is useless - it's permanently out of date, I can't even get the build instructions I wrote 3 months ago to work when I'm trying to set it up for someone else, and the whole thing is a massive nightmare. If I'd forked it and provided the huge source tree myself, that would be reduced - but that project is also under active development and it'd be great to actually use their latest, least buggy version!
Each of these decisions was made the way it was for real, sensible reasons - I'd hate for a package manager to have to contend with the mess of the second project, and yet apparently that's the way they'd prefer to go with both!
Good job no one needs to use any of my code, really.
OS administrators want a maintainable, supportable system that minimises the number of security vulnerabilities they're exposed to and packages software in a consistent fashion. They also want deterministic, repeatable results across systems when performing installations or updates.
Likewise, keeping various components from loading multiple copies of the same libraries in memory saves memory, which helps the overall performance of the system.
Also, statements like this aren't particularly helpful and are factually inaccurate:
So package maintainers, I know you have your particular
package manager’s bible codified in 1992 by some grand
old hacker beard, and that’s cool. However, that was
twenty years ago, software has changed, hardware has
changed and maybe it is time to think about these choices
again. At least grant us, the developers of the software,
the benefit of the doubt. We know how our software works
and how it should be packaged. Honest.
Some packaging systems are actually fairly new (< 10 years old), and the rules determined for packaging software with that system have actually been determined in the last five years, not twenty years ago as the author claims. Nor are the people working on them grand, old, bearded hackers.
OS designers are tasked with providing administrators and the users of the administrated systems with an integrated stack of components tailored and optimised for that OS platform. So developers, by definition, are generally not the ones that know how to best package their software for a given platform.
As for documentation not being installed by default? Many people would be surprised at how many administrators care a great deal about not having to install the documentation, header files, or unused locale support on their systems.
Every software project has its own view of how its software should be packaged, and while many OS vendors try to respect that, consistency is key to supportability and satisfaction for administrators.
So, in summary:
* preventing shipping duplicate versions of dependencies can significantly reduce:
- maintenance costs (packaging isn't free)
- support provision costs (think technical support)
- potential exposure to security vulnerabilities
- disk space usage (which does actually matter on high multi-tenancy systems)
- downtime (less to download and install during updates means system is up and running faster)
- potential memory usage (important for multi-tenancy environments or virtualised systems)
* administrators expect software to be packaged consistently regardless of the component being packaged
* some distributors make packaging choices due to lack of functionality in their packaging system (e.g. -dev and -doc packaging splits)
* administrators actually really care about having unused components on their systems, whether that's header files, documentation, or locales
* in high multi-tenancy environments (think virtualisation), 100MB of documentation doesn't sound like much, until you realise that 10 tenants mean 10 copies of docs which is a wasted gigabyte; then consider thousands of virtualised hosts on the same system and now it's suddenly a bit more important
* stability and compatibility guarantees may require certain choices that developers may not agree with
* supportability requirements may cause differences in build choices developers do not agree with (e.g. compiling with -fno-omit-frame-pointer to guarantee useful core files at minor cost in perf. for 32-bit)
I'd like to see the author post a more reasoned blog entry with specific technical concerns that are actually addressable.
From a package maintainer's perspective, especially in the case of Debian, they must ensure that packages are stable and secure. It's their job to make sure security updates are released. In the case of FreeSWITCH, there's no distinction between the main source and its dependencies. Package maintainers might as well not bother with including software like FreeSWITCH in their repos or risk the integrity of their system.
System implementers are mostly ambivalent about these issues until their distro's FreeSWITCH package includes broken dependencies or until their FreeSWITCH installation has a security exploit due to a library that can't be patched independently.
I love FreeSWITCH but I'm sorry to say that it's poorly architected. However, I'm a system implementer, so I don't care.
Meanwhile, Ubuntu's split between main and universe/multiverse is a pretty good compromise. I wouldn't be disappointed if Ubuntu jettisoned universe and multiverse, the better to focus on having a solid main repository, and let a thousand small, focused repositories pick up the slack. As long as all of those repositories leave the packages in main alone, as EPEL does with Red Hat-based systems.
It isn't very smart that a user must be root to install a GUI.
The thing is, Erlang assumes that things work from said releases, and find the newest available applications in their library path. This makes sense because it is entirely possible for an Erlang application that was upgraded without ever being shut down to want to roll back to older versions.
When this happens, this application has a path with all the libraries and dependencies it ever needed and can rollback to an older one (without shutting down), or start fresh from the newest one automatically.
Other metadata may be added by each release as required.
The thing is that experienced Erlang developers who write and ship products in Erlang will know this and try to build releases and packages that respect this. Then package managers will (often) undo it to fit whatever pattern they have in mind. They did it, for example, with Ubuntu, removing one of the test frameworks that is part of the standard library and putting it in a different package.
Users who tried the language for the first time couldn't run things that depended on the standard library because it was separated in many different packages.
I fail to understand how, exactly, packagers' choice to not use the release is erlang's (or riak's) fault.
I am happy to give up a little extra disk space in exchange for having predictable executables that work in the configuration they were built and tested for.
Clearly we have differing requirements. I've found the Debian user experience so much nicer than the one you get on OS X or Windows: you install a package with apt, it pulls in all dependencies, and it Just Works. I consider "self-contained" a bug and a warning sign that makes me start looking for a ten-foot pole; "self-contained" is another way of saying "inconsistent" and "not well integrated".
Development libraries, command line utilities, interpreters (bread-n-butter development or unixy stuff, or in other words, the things the average user doesn't see) are usually a lot easier to get and keep up to date when you have a package manager.
Arguably, this is a good trade-off because users rarely uninstall software. But it means that if you ever become uncertain about the configuration state of a Mac, you're probably going to have to reinstall from scratch.
When you bundle the libraries yourself, you only have to target the libraries you included. When you let the package manager do the magic for you, you have to target every version of the libraries, ever.
No, no, no. This is not how you fix critical / security issues in a well maintained system. You either backport a single patch that fixes the problem without changing any signatures, or, if you support very old, incompatible software, you reimplement the fix yourself. Then the release is not a new library. It's the old one + fix.
This is what the proper package maintenance is about. No functions should ever be "suddenly gone".
Also if you say in your installation requirements "this software requires libfoo >= 1.2.3, < 2.3.4", no sane package maintainer will disagree. Your application may be patched in the packaging process to work with a different supplied version, but most likely it will just get what's needed.
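A minimal sketch of how a range like the one above could be checked, assuming plain dotted numeric versions. `parse_version` and `satisfies` are hypothetical helper names for illustration, not any package manager's real API, and real version schemes (epochs, suffixes, distro revisions) are more involved.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '1.2.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed, minimum, upper_bound):
    """Return True if minimum <= installed < upper_bound (hypothetical helper)."""
    return parse_version(minimum) <= parse_version(installed) < parse_version(upper_bound)


# The range from the comment above: libfoo >= 1.2.3, < 2.3.4
for candidate in ["1.2.2", "1.2.3", "1.10.0", "2.3.4"]:
    print(candidate, satisfies(candidate, "1.2.3", "2.3.4"))
```

Comparing tuples of integers rather than strings is what keeps 1.10.0 inside the range; a plain string comparison would sort it before 1.2.3.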
[1] https://developer.apple.com/library/mac/documentation/develo...
Oh, and it's the thing that totally stops me from buying a Surface, since I did "dir /s" on one of the devices at the store.
This, a hundred times. The OP wants to bundle modified versions of other people's open-source software as part of their own without feeding the changes upstream properly, and that's just not the right way to do things. Distributions' rules discouraging bundled packages are there because even worse things happen if everyone does that. Sometimes the dependent package has to put off packaging a new release for a particular distro until their dependencies are satisfied, but then it's time to put on big-girl panties and move on. Managing dependencies and reducing version sensitivity are part of a developer's job.
I agree completely, but I'd like to take your idea further in a direction you likely didn't intend.
The fundamental observation of distributed version control systems, in my opinion, is: Every commit is essentially a fork.
When you combine these two ideas: 1) fork->rename and 2) change==fork, with the 3) identities & values from FP/Clojure/etc, you realize that version numbers are complete folly.
Coincidentally, I just wrote about this with respect to SemVer: http://www.brandonbloom.name/blog/2013/06/19/semver/
In short, if you have awesomelib and make an incompatible version, you can call it awesomelib2. Or you could call it veryawesomelib or whatever else you want. If you give up on the silly idea of being able to compare version numbers, then versioning and naming become equivalent.
If more developers cared about versioning their software appropriately based on incompatible changes or stability guarantees, it would significantly reduce the costs of maintaining OS software distributions and providing integrated software stacks to users.
That's not what he said. He said that packagers frequently break his software for users by incorrectly breaking it up into the wrong pieces and then including a version of that piece that doesn't work. It's especially bad in the case of Erlang applications, as he enumerates, and it's caused by packagers not taking the time to understand the consequences of where they split the software into packages, all in the name of having only one version of lib-erl-foo installed on your system.
If the developer did, then they need to reconsider how difficult they're making the lives of their customers by forcing the potential for additional vulnerability exposures on the system.
There's a non-zero cost involved in packaging.
I am the administrator of the machines I use. I am also the user of the machines I use. I care far, far more about my experience as a user than I do as an administrator. The less administrating I have to do the better. As the administrator of my array of personal computers what I want is for everything to work, and to stay working, and for new things never, ever, under any circumstances to break old things.
Or....
You can recognize that the author needed those patches to that library and figure out some way to include them.
There are plenty of reasons to ignore patches from outside that are completely valid. Google gets to decide the direction of their fork of leveldb. If a patch doesn't fit that direction they are under no obligation to accept it.
It's not madness for Riak to want a divergent version of a package. Nor is it madness for the package maintainer not to want to take that package in the direction that Riak wants to. This is why we have forks in the first place, and it's perfectly fine.
In short: no, it doesn't equally apply to the Riak guy. The packager is responsible for cutting boundaries in the proper place; if they don't want to do the work of investigating that, then they shouldn't package it.
They don't have any control over Google, by "rallying" or otherwise.
Consider the recent case of the WebKit/blink split. Here you have two sets of some of the world's smartest engineers, who cannot agree about how to render a webpage! And this is a well-defined problem. There are actual standards about how to render webpages! In theory, everybody agrees about what's going on here, and yet it's fork time.
As for what Library X does, there are no standards, and people don't necessarily agree about what they are building. And let's be honest here, you probably do not have WebKit-caliber developers hacking on Library X. So the chance that you can arrive at consensus for Library X is much lower than for WebKit/blink.
Meanwhile, instead of dicking around with the will-they-wont-they-merge-upstream committee, you can just ship software that works in practice to people that want to use it. If you are a software developer, and you have the choice between writing software and arguing about it, it is usually a good bet on average to write software.
http://s3.amazonaws.com/downloads.basho.com/riak/1.3/1.3.2/u...
It integrates well with the host system (init script, "riak" user/group, data in /var/lib/riak, logs in /var/log/riak, etc.) and bundles dependencies in /usr/lib/riak.
In my experience, bundling dependencies is often the only practical way to install a complex app. Take Sentry as another example:
https://github.com/getsentry/sentry
The current version (5.4.5) depends on 37 Python packages:
BeautifulSoup==3.2.1
Django==1.4.5
Pygments==1.6
South==0.7.6
amqp==1.0.11
anyjson==0.3.3
billiard==2.7.3.28
celery==3.0.19
cssutils==0.9.10
distribute==0.6.31
django-celery==3.0.17
django-crispy-forms==1.2.8
django-indexer==0.3.0
django-paging==0.2.5
django-picklefield==0.3.0
django-social-auth==0.7.23
django-social-auth-trello==1.0.3
django-static-compiler==0.3.3
django-templatetag-sugar==0.1
gunicorn==0.17.4
httpagentparser==1.2.2
httplib2==0.8
kombu==2.5.10
logan==0.5.6
nydus==0.10.6
oauth2==1.5.211
pynliner==0.4.0
python-dateutil==1.5
python-openid==2.2.5
pytz==2013b
raven==3.3.11
redis==2.7.6
sentry==5.4.5
setproctitle==1.1.7
simplejson==3.1.3
six==1.3.0
wsgiref==0.1.2
Ruby or Node apps have dependency trees of similar or greater size. It takes an enormous amount of effort to roll all of these as individual packages (yes, I've done it) and it's a colossal waste of time once you realize you can have a working package in under a minute with virtualenv, pip, and fpm:
$ virtualenv --distribute /opt/sentry
$ /opt/sentry/bin/pip install sentry
$ fpm -n sentry -v 5.4.5 -s dir -t deb /opt/sentry
Edit: Also, the package managers have an awful habit of installing dependencies that are not actually dependencies. Drives me up a wall.
(I spent the first half of my career in the ISP business.) If I see something as an infrastructure cost that scales slowly but surely, I want a standard distro package wherever I can get one. Sometimes you're in an environment where you've got extremely specific vendor-or-client-imposed requirements, and the best you can hope to do is standardize your configuration / deployment process.
I've had environments where I cared more about CPAN or PyPI than yum or apt (or RHN or roll your own). If I have N00 servers doing the same thing or N00 servers doing a variety of things, the answer shifts.
The only real way to avoid that is for a project to include a bunch of compliance tests that validate that an underlying library correctly performs the operations the project needs it to do. But this is actually a lot of work, so in reality it will almost never be done. Which leads us back to the discussion about the wisdom of packagers changing libs without understanding the effect on downstream projects.
Encoding intelligence (beyond, perhaps, simple sequence) in version numbers for software is fundamentally folly.
Encoding intelligence about compatibility in version numbers of APIs is only folly to the extent that "proper engineering discipline has not been applied when managing the stability and/or backwards-compatibility of shared interfaces."
Confusing what makes sense with software and what makes sense with APIs is as problematic as any other confusion of interface with implementation.
The purpose of libraries is generally to provide an API to one or more consumers.
I'm talking about versioning as applied to the library, as a representative of a set of interfaces provided.
Not as some sort of runtime detection mechanism.
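As a sketch of what versioning the interface (rather than the implementation) can look like, assume a SemVer-like convention, which is an assumption here and not something the thread agrees on: the major number changes only on incompatible API changes, and the minor number only on backward-compatible additions. `api_compatible` is a hypothetical helper.

```python
def api_compatible(provided, required):
    """Check a provided (major, minor) API version against a required one.

    Assumed convention: the same major number means no breaking changes,
    and a higher minor number only adds to the interface.
    """
    provided_major, provided_minor = provided
    required_major, required_minor = required
    return provided_major == required_major and provided_minor >= required_minor


print(api_compatible((2, 5), (2, 3)))  # same major, newer minor
print(api_compatible((3, 0), (2, 3)))  # major bump
```

The check is only as trustworthy as the discipline behind the numbers, which is exactly the point under dispute in the surrounding comments.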
But they don't. And you're not going to be able to make them. And even if you did, people would disagree about what constitutes compatibility, stability, and engineering discipline. One man's "breaking change" is another man's "that was an implementation detail". It's not possible to get this right, since first you need to define "right". That's why versioning is folly.
If, on the other hand, those thousands of programs all bundled zlib, the user won't be safe until hundreds of maintainers wake up and do (repeatedly the same) patching. Or even worse, if there isn't even a package management system, as some apparently want, the user has to also go fetch the fixed programs from thousands of upstreams. Oh, and the user also has to know about the vulnerability. Not gonna happen!
As we can see, the classical model reduces work duplication, reduces patching times and manpower need, and certainly takes a big responsibility off the user's shoulders.
This is the argument that comes up every time. Perhaps it matters to people who are running complex servers hosting an array of services. In my life, it never comes up. I am either using a personal machine or managing a server which is responsible for one single service.
The idea that upgrading one library could affect the behavior of hundreds of programs is terrifying. How do I know they all still work? I don't. I have to go test them all. What this means is that I never update any libraries at all, on a linux machine, because I can't know in advance what the upgrade might break.
My concern is not about keeping everything up to date; it is about keeping everything working. If there is a bug in one program then I want to update that program and only that program and no other programs at all. Then I can evaluate the behavior of the new program. If it is worse than the old behavior, I can hopefully go back to the old version. If it is better, then I can keep it. Nothing else should change.
This is exactly what I get on Mac OS X, and it's what I get when I build apps with statically linked libraries on Linux: stuff works until I break it, and then I know what I broke, so I can fix it.
When I let package managers update things for me, my system becomes an unknowable chaos of changing behavior. Instead, I simply never update anything until I am ready to pave the machine and start from scratch. I install everything I might want to use, then I disable updates and leave it alone until I am ready to start over.
If you could explain how your philosophy would deal with, for example, nginx and Apache both depending on libssl, which itself depends on libcrypto, which depends on libz and libc (both of which are also separate independent dependencies of Apache and nginx) then maybe we could discuss it better.
Oh, and in theory I should be able to swap libssl for libgnutls arbitrarily. How do we handle that?
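To make the nginx/Apache example concrete, here is a rough sketch that models only the dependency edges named in that comment and asks which packages are touched when one library is fixed. The graph and the helper names (`transitive_deps`, `affected_by`) are illustrative assumptions; real dependency metadata is richer.

```python
# Dependency edges taken from the example above; real packages have more.
DEPENDS_ON = {
    "nginx": ["libssl", "libz", "libc"],
    "apache": ["libssl", "libz", "libc"],
    "libssl": ["libcrypto"],
    "libcrypto": ["libz", "libc"],
    "libz": [],
    "libc": [],
}


def transitive_deps(package, graph):
    """Return every package reachable from `package` through dependency edges."""
    seen = set()
    stack = list(graph[package])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen


def affected_by(library, graph):
    """Return the packages that directly or indirectly depend on `library`."""
    return sorted(name for name in graph if library in transitive_deps(name, graph))


print(affected_by("libz", DEPENDS_ON))
```

With a shared libz, one updated package covers everything in that list; if each entry bundled its own copy, each would need its own rebuild and release.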
Separation of concerns is a value of good software projects. But there are practical realities that the author of the article enumerates specifically.
If there is a tight coupling between his application and a handful of upstream libraries, packagers are far more likely to break his application by distributing the latest version of that shared library. Other applications that aren't as tightly coupled can handle that upgrade. Since it is tightly coupled, he's going to be highly attuned to the upgrade needs for his specific statically compiled version.
Application uninstalls are as trivial as dragging the application to the trash bin. No, this will not eliminate the application's data from ~/Library, etc, but 98% of the time you don't want that anyway. If you know what you're doing, it's usually a quick `rm -rf ~/Library/...` and you're done. Some poorly behaved apps stick stuff in other places or otherwise muck with your system, but now with the app store, that's no longer an issue.
And, if you're absolutely anal about deleting every single trace of an app, there are tools that automate the process. For example: http://www.appzapper.com/ -- But really, it's probably a waste of your time unless you had a badly behaved app go rogue. In my many years of Mac ownership, I've installed and uninstalled hundreds of apps and the only time I ever had to bang my head against the wall was when I used to use MacPorts and a Postgres install went haywire because of the same sort of packaging nonsense that the article is talking about.
When uninstalling an application, you usually do want to remove all of the application's components. How often do you say, "You know, I'd like to uninstall 25% of this application, even though the remaining 75% will just be dead weight without it"?
Here "properly engineering and documenting" means pushing upstream changes to officially support your use case, and documenting it so other people know why your use case is important.
I stand corrected.
> In my life, it never comes up. I am either using a personal machine or managing a server which is responsible for one single service.
I just went back through my own update log a couple of months: You don't use anything that uses libxml [1], ffmpeg (audio/video Swiss army knife) [2], poppler (popular PDF library) [3] or OpenSSL [4]? Do you pay attention to the security notices of every program you have that bundled one of those? Are you sure that upstream is paying attention to the security notices of those libraries?
> The idea that upgrading one library could affect the behavior of hundreds of programs is terrifying. How do I know they all still work? I don't. I have to go test them all. What this means is that I never update any libraries at all, on a linux machine, because I can't know in advance what the upgrade might break.
For non-rolling distributions (for example Debian stable, Ubuntu, Fedora, RedHat) packages don't have their behavior changed throughout the lifetime of a release <6>. Security and bug fixes are backported to whatever version is in place for the lifespan of the distribution's release. This is part of the point of stable releases, and something that's routinely forgotten by those who hound package maintainers for newer versions of things!
As an example, let's consider [1]. Ubuntu 13.04 uses libxml2 version 2.9.0 plus some Debian/Ubuntu patches. They released version 2.9.0+dfsg1-4ubuntu4.1 with the following changelog [5]:
* SECURITY UPDATE: multiple use after free issues
- debian/patches/CVE-2013-1969.patch: properly reset pointers in
HTMLparser.c, parser.c.
- CVE-2013-1969
Only the security fix is applied, and after upgrade you know that libxml2 will continue as it has done for the lifetime of your distro release, except it's no longer vulnerable to CVE-2013-1969. Moreover, this applies to every program using libxml (i.e. a lot) with one update. Again: There's no other change in behavior for libxml!
> This is exactly what I get on Mac OS X, and it's what I get when I build apps with statically linked libraries on Linux: stuff works until I break it, and then I know what I broke, so I can fix it.
Stuff seems to work until you break it. Then a security hole is found, and it turns out things aren't really working very well at all. To me it seems like a lot of work to keep track of such things by yourself!
[1] http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-1969
[2] http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-2496
[3] http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-1790
[4] http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-0169
[5] https://launchpad.net/ubuntu/+source/libxml2/2.9.0+dfsg1-4ub...
<6> Firefox and a few other packages, mostly web browsers that make it hard to provide security backports, are notable exceptions.
With Windows, it seems like every little thing needs some complex installer procedure that splats files all over the system. Every program needs dozens of DLLs, and the only way to get rid of it is to run an uninstaller that hopefully remembers everything it created, and even more hopefully doesn't break anything else on your machine.
With Mac OS X, there's generally no installation process at all. You download the app, and you put it where you want it to go, and then you run it, and that's it. Nothing goes anywhere and you don't need any special process to manage it.
The experience I've had on Ubuntu is sort of midway between these. There's a complex installation process, and everything has to deal with it, and shit gets plastered all over your machine and there's no way you can keep track of it all, but at least things generally mostly work most of the time.
But really: why manage complexity when you can do away with it?
You're speaking of apt/packaging like it's a bad thing, when it's awesome. Want to download a new OSX or Win app? Open your browser, hunt it down, in the case of Win, figure out if you can trust the site, download it, open the downloaded item and do the install dance, and then have it stay sessile on your system, never updated... unless it has its own phone-home system - and now you have more crap on your system.
Want to install a package with package management? It takes literally seconds plus download time. Package management systems are all about doing away with managing complexity.
And what happens in the case of a package not available through apt? ("Sorcery! All applications must be packaged! All shared libraries must be packaged individually!")
And there went your afternoon, installing some just-slightly-uncommon piece of software.
I can understand not deleting things out of ~/Documents, but a lot of stuff that goes in the Library folders is not what users think of as data that should outlive the application.
However, I think that there are basically three categories of application data:
1) Documents -- These should never be deleted and are not invisible.
2) Settings & other small data not worth deleting, probably nice to keep around in case you ever re-install. Most stuff.
3) Large semi-temporary files, like samples and other downloaded add-ons that are optional parts of the application.
I think OSX handles 1 & 2 well, but you're right, it needs a way to handle #3 too. However, I think that #2 is a much better default than #3.
If the dependency graph G = (V, E) has a vertex for every software project and an edge x -> y iff downstream project y depends on upstream project x, then the change-flow network is the graph C = (V, F), where there is an edge x -> y in F iff there is a downstream path between x and y in G and also y requires an update and re-release when x changes (e.g., because it bundles a copy of x in its releases).
So if there is a change to project x, for it to flow to all affected dependents, you must update all downstream neighbors of x in the change-flow network C.
For example, consider the following dependency graph, in which library L is used by downstream library L2, and L2 by project P:
L -> L2 -> P
If none of the projects bundle their upstream dependencies in their own releases, then the corresponding change-flow network has no edges, and updating any project requires only re-releasing its own package to satisfy all dependencies:
L
L2
P
But if L2 bundles a copy of L, and P bundles a copy of L2, then the corresponding network looks like this:
L -> L2
L -> P
L2 -> P
P
A change to L requires re-releasing not only L but also L2 and P. A change to L2 requires re-releasing L2 and also P.
Does that make more sense now?
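If it helps, here is the same idea as a rough Python sketch. The function name `change_flow` and the `bundled` argument are made up for illustration, and it assumes one particular reading of the definition above: y has to be re-released when x changes if you can get from x to y by following only "bundles a copy of" edges.

```python
from collections import defaultdict


def change_flow(deps, bundled):
    """Return the edge set F of the change-flow network C = (V, F).

    deps:    set of (upstream, downstream) edges of the dependency graph G.
    bundled: subset of deps in which the downstream project ships its own
             copy of the upstream one (assumption: bundling is the only
             reason a change forces a re-release).
    """
    # Only bundling edges propagate a forced re-release, so walk those alone.
    out = defaultdict(set)
    for up, down in bundled:
        if (up, down) not in deps:
            raise ValueError(f"{down} bundles {up} but does not depend on it")
        out[up].add(down)

    flow = set()
    for start in list(out):
        # Depth-first walk: everything reachable from start through
        # bundling edges must be re-released when start changes.
        stack = list(out[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            flow.add((start, node))
            stack.extend(out.get(node, ()))
    return flow


deps = {("L", "L2"), ("L2", "P")}
print(sorted(change_flow(deps, set())))  # nobody bundles: no edges
print(sorted(change_flow(deps, deps)))   # everybody bundles
```

With no bundling the result is empty; with L2 bundling L and P bundling L2 you get the three edges listed above.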
If P statically links to its own version of L2, then L2 is just a part of P. The fact that there may be a dynamically linked version of L2 elsewhere on the system is irrelevant.
Consider:
L -> L2 -> P
L -> L2 -> Q
L -> L2 -> R
If the authors of L2 release a new version that P and Q are happy with, but creates an extremely subtle segfault condition in R, then what?
The packager could just wait to release the upgrade to L2 until all downstream packages have compatible releases.
The packager could backport a subset of the L2 patches that is still compatible with R (Red Hat does this a lot).
The packager could silently curse the author of R for not statically linking the necessary frozen-in-time version of L2 and thus bypassing this problem entirely.
No, it's highly relevant because when a security fix lands for L2, it takes longer to propagate to users if projects like P bundle their own versions of L2 as part of their releases. In that case, users must wait for the project developers to work the already-released L2 fixes into their own bundled versions of L2 and then release new versions of the projects before any downstream users get the fix. But if P and other projects use the same version of L2 that everybody else does, everybody gets the fix right away.
> If the authors of L2 release a new version that P and Q are happy with, but creates an extremely subtle segfault condition in R, then what? ...
> The packager could silently curse the author of R for not statically linking the necessary frozen-in-time version of L2 and thus bypassing this problem entirely.
More likely, the packager would patch L2 to fix the problem with R and then talk to the upstream L2 developers to get the patch included in L2 proper. This way, R's users get the fix right away and the problem gets eliminated at its source, in L2, rather than papered over in R's private copy of L2.
As I wrote in my original post, one of the big benefits of the "no bundling" policy is to make sure that patches flow upstream to where they belong instead of piling up in downstream repos where they do good for only one dependent project instead of all dependent projects.
If you have never in your entire life run `apt-get upgrade` and spent the next four hours wishing you hadn't, then you are quite fortunate. Regardless of the size of your shop, if you're outsourcing patch approval blindly to any distribution, you probably aren't doing anything particularly interesting.