Debian must ship reproducible packages

Debian must ship reproducible packages (lists.debian.org)

372 points by robalni 54 days ago | 169 comments

uecker 53 days ago |

This is a huge achievement for Debian and the free software world.

It took a while, though, until this was understood. In 2007, when I pointed out on debian-devel that this was needed, I was still told what a huge waste of time it would be. And indeed it took a huge amount of work by many people to get there, but it is well worth it.

PunchyHamster 53 days ago | |

There was no bug or attack on Debian since 2007 that reproducible packages would prevent.

"Well worth it" is not correct. And it just ups the contribution barrier to Debian higher. I have already heard a lot of people complaining that contributing to Debian is hard, and while in the past I defended it with "they need all the checks and balances to make sure packages play nicely with each other", this is just a step that makes it harder for no reason and little benefit.

savolai 53 days ago | | |

"If you are wondering why we are doing this at all, then hopefully the Reproducible Builds website will explain why this is useful."

https://reproducible-builds.org/

Could you perhaps respond to the argumentation here?

MomsAVoxell 53 days ago | | |

Reproducible builds are useful not only for responding to ‘attacks’, a subject you seem to be bikeshedding, but for other reasons too.

Anyone having to maintain a code base or a distributed fleet of devices will gain from this decision, immensely, as their operational periods come and go.

Reproducible builds are about longevity as much as they are about security.

Please don’t make bold claims about ‘no reason and little benefit’ while demonstrating ignorance of this hard fact: reproducible builds should have been the norm, in computing, from the get-go.

azkalam 53 days ago | | |

Reproducible builds reduce the need for trusted parties.

Have many organizations produce the binaries independently and post the artifacts.

Once n of m parties agree on the artifact hash, take that as the trusted build.

If every party reaches a different hash then we cannot build consensus.
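
A minimal sketch of the n-of-m idea above, assuming each builder publishes one digest string for the artifact. The builder names, the digests, and the `trusted_hash` helper are hypothetical and only illustrate the counting step.

```python
from collections import Counter
from typing import Optional


def trusted_hash(reported: dict[str, str], n: int) -> Optional[str]:
    """Return the artifact hash that at least n builders agree on, else None."""
    if not reported:
        return None
    # Count how many builders reported each digest and take the most common one.
    digest, votes = Counter(reported.values()).most_common(1)[0]
    return digest if votes >= n else None


# Hypothetical builders and digests, for illustration only.
reports = {
    "builder-a": "sha256:1111",
    "builder-b": "sha256:1111",
    "builder-c": "sha256:2222",
}
print(trusted_hash(reports, n=2))  # sha256:1111
print(trusted_hash(reports, n=3))  # None
```

Choosing n greater than m/2 avoids the case where two different hashes both reach the threshold.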

benregenspan 53 days ago | | |

Is the "Jia Tan" XZ Utils compromise not a good example? That relied on code snuck into a release that was not in source.

(It was caught before being promoted into a stable Debian release, yes, but this sort of relied on a happy accident, too close for comfort)

eptcyka 53 days ago | | |

It makes shipping backdoors a whole lot harder, yes.

aborsy 53 days ago | | |

There was perhaps no detected bug or attack. There have most likely been bugs or attacks that reproducible builds would have prevented.

deknos 53 days ago | | |

"mimimimi".

Those people do not care about quality in open source at all. For long-lived software this is very important.

Of course, all those JavaScript and Kubernetes packages, which will be irrelevant again in a few years, might complain, but let them complain.

ckastner 53 days ago | | |

> There was no bug or attack on Debian since 2007 that reproducible packages would prevent.

I'm reading this as a suggestion that the reproducible builds effort was an ineffective deterrent.

However, note that your observation could also be explained by the opposite: the reproducible builds effort was an effective deterrent, so nobody bothered with attempts.

> And it just ups the contribution barrier to Debian higher

Until yesterday, the package just got flagged in the tracker, and you could either ignore it, or fix it yourself, or the kind people behind the reproducible builds effort supplied a patch themselves.

Now, you can no longer ignore it. But fixes are often trivial. Use a (stable) timestamp provided by the build, seed RNGs with some constant (instead of, e.g., the time), etc. These are best practices anyway.
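
A rough sketch of those two fixes, assuming the build exports the `SOURCE_DATE_EPOCH` environment variable described on reproducible-builds.org. The helper names and the seed value 0 are arbitrary choices for illustration.

```python
import os
import random
import time


def build_timestamp() -> int:
    """Prefer SOURCE_DATE_EPOCH (set by the build) over the wall clock."""
    value = os.environ.get("SOURCE_DATE_EPOCH")
    return int(value) if value is not None else int(time.time())


def build_stamp_line() -> str:
    # Format in UTC so the result does not depend on the builder's time zone.
    return time.strftime("Built: %Y-%m-%d %H:%M:%S UTC", time.gmtime(build_timestamp()))


def shuffled_inputs(names: list[str]) -> list[str]:
    # Fixed seed (arbitrary constant) instead of seeding from the current time.
    rng = random.Random(0)
    result = sorted(names)  # sort first so the input order does not matter
    rng.shuffle(result)
    return result
```

With `SOURCE_DATE_EPOCH` set, `build_stamp_line()` returns the same string on every rebuild, and `shuffled_inputs()` returns the same order for the same set of names.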

perlgeek 53 days ago |

https://wiki.debian.org/ReproducibleBuilds has some more info; some of it is outdated, but it also has a chart showing how many packages are built in the CI, and how many of those are reproducible builds.

(Orange = FTBR = "failed to build reproducibly")

I'm not good at reading numbers from charts, but I'd guess it's a few percent (4-5ish?).

bpavuk 53 days ago | |

all I get is this:

> Forbidden

> <p>You are not allowed to access this!</p>

(yes, with HTML tags on display) :)

EDIT: I also found a "I Challenge Thee" page in history. did I just get blocked by antibot measures? why???

unleaded 53 days ago | | |

Do you have JavaScript disabled? They put one of those anti-scraper things on it.

TacticalCoder 53 days ago |

What people really don't understand about reproducible builds is that they're not a guarantee that there's no backdoor.

They're a guarantee that if there's a backdoor, it's reproducible 100% of the time.

This is a godsend for white hats fighting the good fight.

And, as a side note, it's strongarming vs the bad guys: "Would be too bad if we could reproduce your shiny exploit 100% of the time wouldn't it!?".

Note that we should go further (though it's a bit orthogonal to reproducible builds): the final binary/package should be built only after discarding all files not necessary for the final build (like all test cases and all test assets). The build should literally happen in an environment that has gotten rid of those (after, of course, verifying in another environment that all test cases succeed). If I'm not mistaken, getting rid of test assets would have stopped Jia Tan's XZ backdoor attempt dead in its tracks, because IIRC part of the backdoor was binary data hidden in an asset used only by test cases.
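
A minimal sketch of building from a tree with test material removed, assuming tests live in directories named `tests`, `test`, or `testdata` (a hypothetical convention; real projects vary). The `stage_build_tree` helper is illustrative, not part of any Debian tooling.

```python
import shutil
import tempfile
from pathlib import Path


def stage_build_tree(source: Path, staging: Path) -> Path:
    """Copy source to staging, leaving out anything named like a test directory."""
    shutil.copytree(
        source,
        staging,
        ignore=shutil.ignore_patterns("tests", "test", "testdata"),
    )
    return staging


# Small demonstration on a throwaway tree.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    (src / "tests").mkdir(parents=True)
    (src / "main.c").write_text("int main(void) { return 0; }\n")
    (src / "tests" / "fixture.bin").write_bytes(b"\x00\x01")
    staged = stage_build_tree(src, Path(tmp) / "staging")
    staged_files = sorted(p.name for p in staged.rglob("*"))
print(staged_files)  # ['main.c']
```

The actual build would then run only inside the staging directory, after the tests have passed elsewhere.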

P.S.: as a bonus, they also make it possible to detect bit flips (I'm not saying there aren't other ways to detect bit flips; what I'm saying is that if you have deterministic builds anyway and something doesn't reproduce correctly due to a flipped bit, it's going to be noticed).
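
To illustrate the bit-flip point, here is a small sketch that compares two build outputs by SHA-256 and reports where they first differ. The byte strings stand in for real build artifacts, and the helper names are hypothetical.

```python
import hashlib
from typing import Optional


def digest(data: bytes) -> str:
    """SHA-256 of a build output, as hex."""
    return hashlib.sha256(data).hexdigest()


def first_difference(a: bytes, b: bytes) -> Optional[int]:
    """Offset of the first differing byte, or None if the inputs are identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One input is a prefix of the other: they differ where the shorter one ends.
    return None if len(a) == len(b) else min(len(a), len(b))


# Stand-in "build outputs"; the second has one bit flipped in byte 3.
build_1 = bytes([0x10, 0x20, 0x30, 0x40])
build_2 = bytes([0x10, 0x20, 0x30, 0x41])
print(digest(build_1) == digest(build_2))  # False
print(first_difference(build_1, build_2))  # 3
```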

jaypatelani 54 days ago |

Good thing. NetBSD has had fully reproducible builds since 2017. https://blog.netbsd.org/tnf/entry/netbsd_fully_reproducible_...

Zopieux 53 days ago |

A great milestone, congrats Debian on taking a stance and holding high standards for yourself, especially in the current era.

jgneff 53 days ago |

I'm so happy to see this change. I got involved with reproducible builds in 2021 after reading in horror about the SolarWinds attack. [1]

I think Magnus Ihse Bursie said it best while working on reproducible builds of OpenJDK: "If you were to ask me, the fact that compilers and build tools ever started to produce non-deterministic output has been a bug from day one." [2]

[1] https://www.linux.com/news/preventing-supply-chain-attacks-l...

[2] https://github.com/openjdk/jdk/pull/9152#issue-1270543997

micw 53 days ago |

I wonder why this is a thing nowadays. I use yocto for embedded devices and it was almost a no-brainer to implement reproducible builds. I can also easily enable Debian package management, so everything is already available.

MomsAVoxell 53 days ago | |

What do you mean why is it a thing nowadays?

Reproducible builds are an essential method in industrial computing - Debian isn’t at the forefront of this, it is merely adopting industry wide techniques also applied to other operating systems in use in long-term and safety-related applications.

Certainly, a lot of the hard work of the Yocto and Debian developers is already in your hands.

What is interesting is that this is now being applied in a more forward-focused policy by the Debian developers, that it will now be the norm rather than an option…

dezgeg 53 days ago | |

Did you actively verify that your builds were bit-reproducible?

tofflos 53 days ago |

amd64 forky

reproduced: 97.02% good: 17586 bad: 511 fail: 30 unknown: 0

This, statistics for other architectures, and the reasons for unreproducibility can be found at https://reproduce.debian.net.
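
As a quick check of that percentage, assuming it is the share of "good" packages among all four categories (which matches the quoted figure):

```python
good, bad, fail, unknown = 17586, 511, 30, 0
total = good + bad + fail + unknown
print(total)  # 18127
print(f"{100 * good / total:.2f}%")  # 97.02%
```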

suprjami 53 days ago |

I am always surprised Debian are leading this and not the commercial vendors. You'd think big organisations paying for RHEL and Ubuntu would be beating down the door for verifiable binaries.

tremon 53 days ago | |

If a competitor can prove that their packages are bit-for-bit identical to what a big organization is shipping, that allows the competitor to benefit from the security assurances of the big org. This is great for software freedom, not so great for wannabe monopolists.

jorams 53 days ago | |

Reproducible builds exist to reduce the need for trust, while commercial vendors are in the business of selling trust.

rurban 52 days ago |

So these are broken on amd64. Debian arm64/forky rebuilderd stats https://reproduce.debian.net/arm64/stats/forky/

Most failed to reproduce because of NT_GNU_BUILD_ID; the others differ in some other bits, mostly timestamps or hashes, I assume.

casey2 53 days ago |

This fights against "opensource-washing" which is the practice of large companies claiming to release open source code, but the compile takes so long (as well as being overly-convoluted) that most people and many distros can't afford to maintain the package.

It feels like AI and traditional software are converging in complexity.

pixel_popping 53 days ago |

Forbidden

You don't have permission to access this resource. Apache Server at lists.debian.org Port 443

ameliaquining 53 days ago | |

I can see it just fine; maybe an overzealous firewall thinks you're a bot? At any rate, the Wayback Machine has it: https://web.archive.org/web/20260510074120/https://lists.deb...

baranul 53 days ago | | |

Unfortunately, many of these "protections" don't know what is a bot or a human. Many clueless websites are often just blocking huge swaths of legitimate readers and customers.

pixel_popping 53 days ago | | |

Why would you block access to a static page, even for bots? What's the point? I'm not a bot; this is a very typical non-privacy setup (Firefox, Linux, VPN) for personal use.

It does work with my privacy/scraping setup (residential proxy, spoofed fingerprints, Qubes and so on), great job Debian.

inglor_cz 53 days ago |

Has anyone fought Microsoft Visual Studio successfully to produce reproducible builds of C++ programs? From what I have heard, it is one of the worst contexts to do it.

Dwedit 53 days ago | |

It's that RICH header that you need to exclude. I just tested my copy of MSVC 2019, and `/emittoolversioninfo:no` will exclude the RICH header from the binary. Supposedly also works in MSVC 2022.

The build timestamps in the PE header and export table are also a problem.
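
A minimal sketch for inspecting the PE header timestamp so that two builds can be compared. It assumes the standard PE layout: a 32-bit offset to the PE signature at 0x3C, with the COFF `TimeDateStamp` field 8 bytes after the signature. `pe_timestamp` is a hypothetical helper name, and the export table timestamp is not handled here.

```python
import struct


def pe_timestamp(data: bytes) -> int:
    """Return the COFF TimeDateStamp field of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew: little-endian 32-bit offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # COFF header: Machine (2 bytes), NumberOfSections (2), TimeDateStamp (4).
    (stamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    return stamp
```

Running it on the outputs of two builds from the same source shows whether this field is one of the things that differs.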

azkalam 53 days ago | |

Probably the easiest way is to use Bazel to leverage the effort that has gone in there.

einpoklum 53 days ago | |

Well, you can't build MSVS yourself, reproducibly or otherwise, so this is a less appealing endeavor I would think.

kkyktkrkekk 53 days ago |

"Optimize the code for 5 seconds", as many compilers, including VC++ on Windows, did, was probably one of the dumbest things ever invented. It meant that binaries became more optimized when built on faster computers.

shevy-java 54 days ago |

A small step for debian,

giant leap for mankind.

stingraycharles 54 days ago | |

As someone who recently spent a lot of time on making a large C++ program entirely reproducible on 4 different OS’es, one cannot understate just how many tiny details matter here.

gjvc 53 days ago | | |

"overstate"

rurban 53 days ago |

... and most of this work is done by other distros and maintainers. Starting with binutils

amelius 53 days ago |

That's cool but I'm honestly a bit disappointed in how apt refused to embrace/support both the container and AI/GPU aspects of computing. Are we going to see some changes there?

yjftsjthsd-h 53 days ago | |

Those seem like unrelated things? I can imagine ways for apt to integrate with containers, but what would it possibly do for AI or GPU other than delivering packages like it already does?

Arrowmaster 53 days ago | |

What exactly are you talking about? Those don't seem related.

Hendrikto 53 days ago |

Why the fuck does that site break the back button? DO NOT do that.

em-bee 53 days ago | |

since there is no other way to reach you please allow me to use this off topic message to let you know that there is a response to your comments on the gnupg discussion from two weeks ago.

einpoklum 53 days ago |

Debian must ship packages without the hard dependence on systemd.

blueflow 54 days ago |

zero improvement on end-user experience. does not solve supply chain issues, debian package will reproducibly contain the malware from upstream.

charcircuit 53 days ago |

So much time has been wasted on reproducible builds that could have been better spent on securing more important parts of Debian. Practically minor differences, like a build timestamp being different, are not an issue.

Hendrikto 53 days ago | |

It allows verifying that the binaries actually match the source, which is extremely valuable.

charcircuit 53 days ago | | |

Bit for bit matching is not required for that.

farfatched 53 days ago | |

Yes, making sure build timestamps are reproducible isn't a security win.

What is a win is that two independent parties can run the same build, and get the same binaries.

This is important because it removes trust from builders: anyone can verify their output.

It just so happens that unimportant things like build versions impede that.

charcircuit 53 days ago | | |

Anyone can verify the actual code in the binary matches even if some bytes within the binary file itself are different. The verification routine doesn't have to be a basic bit for bit equality test.
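
One way to picture the non-bit-for-bit comparison described here is to compare two files while skipping byte ranges that are declared to vary. This is only a sketch: the `equal_outside` helper and the example ranges are hypothetical, and a verifier built this way has to trust whoever supplies the list of ignored ranges.

```python
def equal_outside(a: bytes, b: bytes, ignored: list[tuple[int, int]]) -> bool:
    """Compare a and b, skipping the half-open byte ranges listed in `ignored`."""
    if len(a) != len(b):
        return False
    masked_a, masked_b = bytearray(a), bytearray(b)
    for start, end in ignored:
        # Zero out the ignored range in both copies so it cannot affect equality.
        masked_a[start:end] = bytes(len(masked_a[start:end]))
        masked_b[start:end] = bytes(len(masked_b[start:end]))
    return masked_a == masked_b


# Two stand-in binaries that differ only in bytes 4..7 (say, an embedded timestamp).
first = b"AAAA1111BBBB"
second = b"AAAA2222BBBB"
print(equal_outside(first, second, [(4, 8)]))  # True
print(equal_outside(first, second, []))        # False
```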

deknos 53 days ago | |

you are free to provide patches instead of bitching.

charcircuit 53 days ago | | |

And Debian is able to offer me a few million dollars yearly to help fix their security situation.

kkfx 54 days ago |

Debian, like any other legacy distro, must become declarative, because the '80s model of manual deployment and the absurd pain of D/I and Preseed must end.

kakwa_ 53 days ago | |

In the end, Nix is just a thin veneer on this stuff.

Given how much quick & dirty sed patching and how many exec commands I've seen in the few Nix packages/modules I've read, I would not exactly bet my life on it being completely idempotent & reproducible.

kkfx 53 days ago | | |

It's the best option after IllumOS (OpenSolaris) IPS integrated with ZFS. It is far less powerful because it does not impose ZFS (only well supported for root, swap, encryption, etc.), so ZFS is not integrated into the package system and bootloader management (BEs, Boot Environments).

It's not reproducible bit for bit, since it fetches the current version of everything, but it's still easy enough to reproduce, stable enough, and complete enough, while classic distros need a fresh install every major release or else face issues and keep a system in an unknown state for a long time until it blows up.

farfatched 53 days ago | |

I've been 100% on NixOS for many years, but it's Debian that really drove this project.

They're still a pragmatic choice for many use cases.

suprjami 53 days ago | |

bootcrew have bootc Containerfiles for Debian, Ubuntu, Arch, and openSUSE:

https://github.com/bootcrew/mono