Clang now makes binaries an original Pi B+ can't run

Clang now makes binaries an original Pi B+ can't run (rachelbythebay.com)

375 points by FPGAhacker 2 years ago | 137 comments

stephen_g 2 years ago |

Wow, looking at the history of the ARM generation the original versions of the Raspberry Pi use, it’s hard to believe it’s so old! When the Raspberry Pi B+ was released (2014), the ARM core it used was already 11 years old (using the ARM1176 core from 2003). So it’s not unbelievable that you might need to start supplying an arch flag to produce compatible code when building on a different platform (like the newer Raspberry Pi the article says they first built on).

As others have said, it does seem like a misconfiguration (perhaps in the defaults shipped by their distribution) that the correct arch is not picked by default when building on the Raspberry Pi B+ itself.

yjftsjthsd-h 2 years ago | |

> When the Raspberry Pi B+ was released (2014), the ARM core it used was already 11 years old (using the ARM1176 core from 2003).

IIRC the original Pi used leftover chips from a TV box, which is the kind of product that IME never ships more compute than they have to, for price reasons.

RantyDave 2 years ago | | |

Phones, not TVs, but that's pretty much the idea. Even better, the ARM core was tacked on as a sort of "dammit, I suppose we'll have to run applications" kinda thing and isn't even necessarily initialised during boot.

Raspberry Pis actually boot on a really fringe processor called a VideoCore. Arguably the GPU bootstraps the CPU, which makes my brain hurt.

einr 2 years ago | | |

TV boxes IME usually ship with less compute than they have to [in order to provide reasonable UX] ;)

wmf 2 years ago | | |

> never ships more compute than they have to

ARM keeps releasing newer slow cores that support the latest instructions; for example the Cortex-A5 was available and the RPi 1 really should have used that.

vkaku 2 years ago | |

It was meant to be a low price computer.

kazinator 2 years ago | |

She's not building on the B+, though.

Quote:

I started trying to take binaries from my "build host" (a much faster Pi 4B) to run them on this original beast. It throws an illegal instruction.

This is like building something with the latest MSVC on Windows 11 and trying to run the .EXE on an old PC running Windows XP. :)

I suspect the entire Pi distro she's running on the Pi 4B itself won't run on the B+, since all of it is probably compiled the same way, possibly down to the kernel.

kelnos 2 years ago | | |

But at the end, she puts together a new SD card for the B+, boots it, and tries to compile an empty program on the B+ itself. "It can compile something it can't even run", she says.
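For anyone wanting to check for this failure mode in a script: on Linux, a process that hits an unsupported instruction is killed by SIGILL, and Python's `subprocess` reports that as a negative return code. A minimal sketch follows; the helper name `died_of_illegal_instruction` is made up for illustration and is not from the article.

```python
import signal
import subprocess


def died_of_illegal_instruction(argv):
    """Run argv and report whether the process was killed by SIGILL.

    On POSIX, subprocess sets returncode to -N when the child was
    terminated by signal N.
    """
    result = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == -signal.SIGILL
```

Running this against a freshly built binary on the B+ itself would separate "the compiler targeted the wrong instruction set" from ordinary crashes, which show up as other signals or positive exit codes.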

mid-kid 2 years ago | | |

She was building on the B+ in the later example of the blog.

schemescape 2 years ago |

I didn't see it addressed here or in the article: this is a bug, right?

Edit: oddly, after searching LLVM bugs, I found a bug that sounds pretty much exactly like this issue... but it's from 2012 and is closed (although the final couple of comments make it sound like maybe it wasn't actually fixed--note: I only skimmed the comments and I probably misunderstood):

https://github.com/llvm/llvm-project/issues/13989

Edit again: I forgot about the comment at the end of the article that clarifies that explicitly passing the target results in a working program. In that case, it sounds like some sort of configuration bug--I would assume (but am not certain) that the default target would be the current processor, at least on Unix. That bug I linked was probably about producing incorrect code even when the target was set correctly which, thankfully, isn't happening today.

Arnavion 2 years ago | |

Yes, your bug is about the compiler emitting armv7 instructions despite being told to target armv6. Rachel fixed her problem by telling the compiler to target armv6. So I assume your bug is indeed already fixed and not related to Rachel's problem.

krick 2 years ago | |

Obviously it is a bug, but apparently the author didn't bother to report it, opting to write a blog post with a somewhat clickbait title and ending with "so weird" instead.

wmf 2 years ago | |

Yeah, this is not the behavior people expect.

NikkiA 2 years ago | | |

I would expect an arm64 machine to not build an arm32-compatible binary by default. It's the same as running clang on an x86-64 host and expecting it to produce 386-compatible binaries without a -march=i386 somewhere.

The weird clang install on a fresh B+ install is more puzzling, unless there's some user error somewhere.

rschu1ze 2 years ago |

The database I work on (ClickHouse) tries hard to stay compatible with really old hardware. The standard ARM binaries require Armv8.2 from 2016 (available in Raspberry Pi 2 >=2) and x86 binaries run on hardware from around 2010 (SSE4.2 + pclmul* instructions for fast CRC). We also build (but don't test using CI) binaries for Armv8.0 and SSE2-only systems. A quick install script downloads and unpacks the right binary for the target host.

I find it generally hard to strike a good balance between backwards compatibility and usage of modern CPU features in newer AArch64 generations (https://en.wikipedia.org/wiki/AArch64). We found that there are surprisingly many institutions on a shoestring budget (universities in emerging countries) or hobbyists that can't afford to upgrade their hardware.

On a technical note, what I found quite cumbersome is that the cpu flags in /proc/cpuinfo don't always correspond with the flags passed as -march= to the compiler, e.g. "lrcpc" vs "rcpc". To make all of this work, one really needs to maintain two sets of flags.
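A rough sketch of what maintaining "two sets of flags" can look like: a small translation table from kernel feature names to compiler extension names. The table below is an illustrative assumption, not ClickHouse's actual list, and it is far from complete; the function names are made up for this example.

```python
# Illustrative mapping from AArch64 /proc/cpuinfo feature names to the
# extension names accepted after "+" in -march=. This table is an
# assumption for the sketch and deliberately incomplete.
CPUINFO_TO_MARCH = {
    "lrcpc": "rcpc",
    "atomics": "lse",
    "asimddp": "dotprod",
    "crc32": "crc",
}


def cpuinfo_features(cpuinfo_text):
    """Return the set of names on the first 'Features' line, if any."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


def march_flag(base, features):
    """Build a -march= flag from a base arch and kernel feature names.

    Features without an entry in the table are silently skipped.
    """
    extensions = sorted(
        CPUINFO_TO_MARCH[name] for name in features if name in CPUINFO_TO_MARCH
    )
    return "-march=" + "+".join([base] + extensions)
```

Given the text of /proc/cpuinfo, `march_flag("armv8.2-a", cpuinfo_features(text))` yields a flag string using the compiler's spelling of each feature.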

WirelessGigabit 2 years ago | |

I think in those cases it actually serves all your customers to do multiple builds so that they can pick and choose the one that matches their architecture the closest.

opello 2 years ago |

Seems like the problem is likely a configuration target change in the clang-13 package that's current for bookworm.

Specifically because under bullseye (and clang-11) the default target is armv6k-unknown-linux-gnueabihf while under bookworm (and clang-13) the default target is arm-unknown-linux-gnueabihf.

Or maybe the default changed for the given build configuration on the LLVM side?
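To compare default targets across releases, the `Target:` line printed by `clang --version` can be pulled out and inspected. A minimal sketch, assuming that output format; the helper names are hypothetical, and the check only looks at whether the triple's first component names ARMv6 explicitly.

```python
def default_target(version_output):
    """Extract the triple from the 'Target:' line of `clang --version` output."""
    for line in version_output.splitlines():
        if line.startswith("Target:"):
            return line.split(":", 1)[1].strip()
    return None


def triple_arch(triple):
    """Return the architecture component (the part before the first '-')."""
    return triple.split("-", 1)[0]


def names_armv6(triple):
    """True if the triple's architecture component explicitly names ARMv6."""
    return triple_arch(triple).startswith("armv6")
```

With the two triples quoted above, `names_armv6` is true for the bullseye default and false for the bookworm one, where a bare `arm` leaves the instruction set baseline up to the compiler.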

opello 2 years ago | |

I really wish I understood the Debian change management process better. I guess I don't even really know if Raspbian is actually maintained by Debian.

But, when comparing [1] to [2], the rules file has a nice test that says "if DEB_HOST_ARCH is armhf, set the LLVM_HOST_TRIPLE to armv6k..." which seems to confirm a build configuration change.

[1] http://raspbian.raspberrypi.org/raspbian/pool/main/l/llvm-to...

[2] http://raspbian.raspberrypi.org/raspbian/pool/main/l/llvm-to...

orra 2 years ago | | |

To answer your incidental question, Raspbian is maintained by Raspberry Pi folks, not Debian.

JonChesterfield 2 years ago |

I doubt this is a deliberate change. Picking up information from the environment - a sibling mentions /etc/env.d/gcc - seems fairly likely. I'd guess the default triple is something like arm-unknown-linux unless clang finds or is told something more specific to use, and the mechanism by which it gets told to use something more specific has fallen over.

This might mean there are no arm v6 buildbots running, or it might mean there are ones running but the implicit configuration is still working on them.

LLVM is a really good cross compiler. Build for any target from any target, no trouble. Clang is less compelling - if it's built with the target, and you manage to tell it what target to build for, it'll probably do the right thing (as in this post - it guessed wrong, but given more information, did the right thing). Then the runtime library story is worse again - you've built for armv4 or whatever, but now you need to find a libc etc for it, and you might need to tell the compiler where those libraries and headers are, and for that part I'm still unclear on the details.

rcarmo 2 years ago | |

Most distros and compilers effectively dropped ARMv6 a couple of years back - I had similar trouble building binaries for my old Synology NAS.

anthk 2 years ago | | |

Alpine Linux might still support it I think.

hulitu 2 years ago | |

> Picking up information from the environment - a sibling mentions /etc/env.d/gcc - seems fairly likely.

Why would Clang do this?

elteto 2 years ago | | |

Clang already replicates a bunch of flags, macros, and behaviors from gcc. The objective is to be a drop-in replacement, and make the developer experience much nicer when migrating. There are some rough corners, of course, but overall it’s actually very nice.

JonChesterfield 2 years ago | | |

If clang didn't try to do the right thing based on the context it finds itself in, people would have to specify a lot more compiler flags to tell it what to do. Target triple, where libc is, where libstdc++ or libc++ is, what linker to use, what flags to pass the linker and so forth. This is much more annoying than `clang foo.c`.
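For comparison, here is a sketch of the explicit form, assembled as an argument list. `--target=`, `--sysroot=` and `-fuse-ld=` are real clang driver options; the function name and any paths passed in are placeholders for illustration.

```python
def clang_command(source, output, target=None, sysroot=None, linker=None):
    """Assemble a clang invocation, adding only the options that were given.

    With no optional arguments, this is the plain `clang foo.c -o foo`
    form that relies entirely on the compiler's defaults.
    """
    cmd = ["clang"]
    if target:
        cmd.append(f"--target={target}")    # target triple to generate code for
    if sysroot:
        cmd.append(f"--sysroot={sysroot}")  # where headers and libraries live
    if linker:
        cmd.append(f"-fuse-ld={linker}")    # which linker the driver should run
    cmd += [source, "-o", output]
    return cmd
```

For example, `clang_command("foo.c", "foo", target="armv6k-unknown-linux-gnueabihf")` spells out the triple that opello reports as the older default, rather than trusting whatever the installed clang was configured with.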

matja 2 years ago |

clang/clang++ read from /etc/env.d/gcc to get the target flags/profile. It's up to the OS to maintain them and make sure they're correct, and it looks like that didn't happen for this OS.

My Gentoo ARM SBC based on an even more ancient armv4 arch has been chugging along just fine with the latest gcc/clang updates:

    grep CTARGET /etc/env.d/gcc -r
    /etc/env.d/gcc/armv4tl-softfloat-linux-gnueabi-11.3.0:CTARGET="armv4tl-softfloat-linux-gnueabi"

Arnavion 2 years ago | |

/etc/env.d is a Gentoo-specific directory to define default env vars for user sessions. It's not a feature of clang to read that directory, so it's not correct to assume other distros would have it. It's just that Gentoo's compiler setup reads the CTARGET env var to select the target, and Gentoo uses /etc/env.d to set it.

matja 2 years ago | | |

Is that why other distros break? :)

contingencies 2 years ago | |

Gentoo always works .. it just takes longer :)

cbmuser 2 years ago |

The article doesn’t mention whether Debian or Raspbian was installed. And, in the case of Debian, whether the armel or armhf port is being used.

Without that information, it’s pretty pointless to make claims about the instruction set LLVM compiles to because that’s a matter of what native target LLVM has been configured for.

FWIW, in Debian, llvm-toolchain-snapshot still supports armel, which uses ARMv5T as the baseline (there is currently an unrelated bug in LLVM’s OpenMP library, though, which prevents a successful build).

jchw 2 years ago | |

What's weird is that the Clang binary is clearly compiled for an instruction set that is compatible with the Pi B+, but it doesn't target an instruction set that is compatible with the Pi B+. This is genuinely weird, since that's not meant to be a cross-compiler; in theory, the host and the target should be the same.

Presumably the image is Raspbian. I don't see a reason why not to assume that.

guipsp 2 years ago | | |

One key thing is missing from your comment (which explains the 'weirdness') - clang, and other llvm-family tools are cross-compatible by default. There is no separate cross compilation binary. This is just a configuration bug.

teddyh 2 years ago |

It would probably be helpful to know the output of the command “dpkg-architecture” and the contents of the file “/etc/os-release”. Otherwise it will be hard to make any useful comments.
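Collecting the second of those is easy to script, since /etc/os-release is a list of shell-style `KEY=value` assignments. A minimal parsing sketch, with a made-up helper name; it handles the common quoted and unquoted forms but is not a full implementation of the format.

```python
import shlex


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict.

    Blank lines, comments, and lines without '=' are ignored; shlex
    strips the optional shell-style quoting around values.
    """
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        parts = shlex.split(raw)
        info[key] = parts[0] if parts else ""
    return info
```

Reading the file with `open("/etc/os-release").read()` and looking at the `ID` and `VERSION_CODENAME` keys would answer the Debian-versus-Raspbian and bullseye-versus-bookworm questions raised elsewhere in this thread.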

cbmuser 2 years ago | |

Exactly. The article omits information that is fundamental to being able to fix this problem.

fsniper 2 years ago |

Title is unfortunately sensational. This is a default target change. Turns out clang can still build binaries for the Pi B+. You just need to be explicit about the architecture. So perhaps a small title change that makes it clearer that this is only a default setting change?

nottorp 2 years ago | |

Doesn’t seem so sensational when it can’t build binaries for the target machine… on the target machine itself…

fsniper 2 years ago | | |

The title suggests that binaries built with clang can't be run on this device, which would mean it can't build for this architecture at all. But the post elaborates that it's possible to build for this architecture; it just incorrectly targets another one. It would not matter if it's on the same architecture, or cross compiling. The capability is there. It requires you to be explicit about which architecture you are targeting.

A default for targeting is incorrect, and/or an architecture identification is buggy. But binaries built for Pi B+ - when using correct targeting arguments - can be run on Pi B+.

Now if the title uses wording that suggests a functionality is not there anymore, when in reality the defaults or identification are incorrect, wouldn't that mean it is hunting for sensation?

cbmuser 2 years ago | | |

It’s sensational because it’s wrong. LLVM still supports even ARMv5T which is the baseline of Debian’s armel port.

dagmx 2 years ago | | |

It can, it just doesn’t by default. Which is what the person you’re replying to is saying.

sovietmudkipz 2 years ago |

Oh cool so this is kinda how one might debug why a program isn’t running on arm. I have a Unity Linux build that I can’t get running inside a container. Unity mono is trying to make a system call that isn’t available, even after passing in the amd64 flag to docker when running the container.

I haven’t debugged it because I found a work around (enable development mode, change build settings so mono isn’t used). I should return to it at some point, just to learn more.

vkaku 2 years ago |

clang has been broken for a while in the last few versions. Many issues were left unfixed and development moved to 17.0.0 when they should have fixed those as point releases for 16.0.x instead (patches were available and not integrated).

In this particular case though, the end processor/native detection seems to be failing and clang feature detection gets armv7l as native (or could just be the default generation option). Looks like a good bug to report, if only we get the good clang folks who will take the time to land a fix.

I have been playing around with zig. My current focus will be on not using broken compiler backends for a while.

frizlab 2 years ago |

Title is misleading

eimrine 2 years ago | |

Because "as a default" statement is missing?

jenadine 2 years ago | | |

Yes.

It now sounds like it is completely broken. But you can just fix it with a flag. And the change of default was probably an unintentional bug.

blahgeek 2 years ago | |

Yes. It should be “clang does not correctly detect host architecture in raspberry pi B+”

usr1106 2 years ago | |

Aka clickbait. Although the bug and the workaround are useful to know for everyone working with that machine.

johnklos 2 years ago |

You'll see a whole bandwagon of people saying things like, "supporting old hardware is BAD! It takes time and money that nobody has!", as though someone needs to be hired to sit around and do nothing but pore over code and constantly rewrite code for old hardware.

There's plenty of evidence to the contrary, but since when has evidence mattered when it comes to defending the right of big business / big distro to do whatever they want? ;)

Really, this is just laziness and sloppiness on the Linux distro makers' part. Any amount of testing would catch this. Thanks, Rachel!

1vuio0pswjnm7 2 years ago |

"I guess nobody still runs these old things anywhere?"

I have one running a BSD UNIX-like OS as I type this comment.

auselen 2 years ago |

Confused article? You make a host/native build instead of cross and expect it to work on some other machine?

daviddoran 2 years ago | |

No. Half way through the article she specifically starts doing everything on the B+ (the old RPI with the issue).

auselen 2 years ago | | |

Thanks for that. I didn’t notice she switched to the B+ later.