But really, I don't understand why a sensitive security-related library would implicitly use an unsafe function like setenv().
This is an oversimplification. Windows has essentially the exact same API and it works just fine in multithreaded contexts.
The issue here is unix allows the underlying pointer to be accessed, bypassing any possible thread-safe APIs.
Ref: https://man7.org/linux/man-pages/man3/setenv.3.html
... it clearly says: "MT-Unsafe"
Also, there is a whole section about get/set env thread safety here (under "Other safety remarks -> env"):
Because I use structured concurrency, I can make it so every thread has its own environment stack. To add a variable, I duplicate the current environment, add the new variable, and push the new environment on the stack.
Then I can use code blocks to delimit where that stack should be popped. [1]
This is all perfectly safe, no `unsafe` required, and can even extend to other things like the current working directory. [2]
IMO, Rust got this wrong 10 years ago when Leakpocalypse broke. [3]
[1]: https://git.yzena.com/Yzena/Yc/src/branch/master/tests/yao/e...
[2]: https://gavinhoward.com/2024/09/rewriting-rust-a-response/#g...
[3]: https://gavinhoward.com/2024/05/what-rust-got-wrong-on-forma...
If you have 1) C FFI interop in Yao, there's still a chance you might have two C libraries cause a crash without your code even being involved.
It seems like the only reliable way to fix this is to change these functions so that they exclusively acquire a mutex.
And remember that the exec* family of calls has a version with an envp argument, which is what should be used if a child process is to be started with a different environment — build a completely new structure, don't touch the existing one. Same for posix_spawn.
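A minimal sketch of that "build a completely new structure" approach, assuming a hypothetical helper named env_with (not part of libc or POSIX). It copies only the pointer array, so the caller's environment is never modified:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a new NULL-terminated array holding every
 * entry of `base` plus `extra` ("NAME=value"). An existing entry with the
 * same NAME is replaced. Only the array is allocated; the strings are
 * shared with the caller, who releases the array with free(). */
char **env_with(char *const base[], char *extra)
{
    assert(base != NULL && extra != NULL);
    const char *eq = strchr(extra, '=');
    assert(eq != NULL);
    size_t name_len = (size_t)(eq - extra) + 1;   /* include the '=' */

    size_t n = 0;
    while (base[n] != NULL)
        n++;

    char **out = malloc((n + 2) * sizeof *out);   /* +1 extra, +1 NULL */
    if (out == NULL)
        return NULL;

    size_t j = 0;
    for (size_t i = 0; i < n; i++) {
        /* Skip any entry that starts with the same "NAME=" prefix. */
        if (strncmp(base[i], extra, name_len) != 0)
            out[j++] = base[i];
    }
    out[j++] = extra;
    out[j] = NULL;
    return out;
}
```

The result can be passed as the envp argument of execve() or posix_spawn(). Note that if `base` is environ itself, reading it is still only safe while no other thread is calling setenv().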
And, lastly, compatibility with ancient systems strikes again: the environment is also accessible through this:
extern char **environ;
Which is, of course, best described as bullshit.
So setenv's existence makes getenv inherently unsafe unless you can ensure the entire application is at a safe point to use them.
stdenvlock(); // imaginary function added to ISO C or POSIX
char *home = getenv("HOME");
char *home_copy = home ? strdup(home) : NULL; // getenv returns NULL if HOME is unset
stdenvunlock(); // only here can we unlock
// home pointer is now indeterminate
Other solutions:
1. Put the above sequence into a function, and don't expose the mutex. Thread-safe code must use:
char *home = dupenv("HOME"); // imaginary function; caller responsible for freeing.
2. Provide environment lookup into a buffer:
getenvbuf("HOME", mybuf, sizeof mybuf); // returns some value that helps to resize the buffer
All functions that retain pointers out of the classic getenv remain unsafe.
A mutex can be provided to those applications that want to manipulate the environ array directly, or use getenv and setenv, or any combinations of these.
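Here is roughly what option 1 could look like in application code today, as a sketch only. dupenv and setenv_locked are hypothetical names, and the lock is a private application mutex, not anything libc knows about:

```c
#define _POSIX_C_SOURCE 200809L   /* for strdup() and setenv() */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Private lock shared by the two wrappers below. */
static pthread_mutex_t env_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical dupenv(): copy the value while holding the lock, so the
 * caller never keeps a pointer into the environment. Returns NULL if the
 * variable is unset or allocation fails; the caller frees the result. */
char *dupenv(const char *name)
{
    assert(name != NULL);
    pthread_mutex_lock(&env_lock);
    const char *value = getenv(name);
    char *copy = value ? strdup(value) : NULL;
    pthread_mutex_unlock(&env_lock);
    return copy;
}

/* Hypothetical setenv wrapper taking the same lock. */
int setenv_locked(const char *name, const char *value)
{
    assert(name != NULL && value != NULL);
    pthread_mutex_lock(&env_lock);
    int rc = setenv(name, value, 1);
    pthread_mutex_unlock(&env_lock);
    return rc;
}
```

This only protects callers that go through the wrappers. Any library that calls getenv() or setenv() directly, or reads environ, bypasses the lock, which is the reason the comment above wants it in ISO C or POSIX rather than in each program.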
The main problem is all the code out there using getenv.
If your program wants to use the environment as an out-of-band global var for cross thread communication, you can make your own mutex.
A mutex can ensure thread safety but risks deadlocks if not used carefully and will hurt performance...
Nice to see that the author of the library has a sensible take. Unfortunately the ecosystem does not: https://github.com/seanmonstar/reqwest/blob/master/Cargo.tom...
This also shows up in web frameworks where Vue has the v-html directive and React has dangerouslySetInnerHTML. Vue definitely has it better.
https://doc.rust-lang.org/stable/std/env/fn.set_var.html
There is a patch for glibc which makes `getenv` safe in more cases where the environment is modified, but C still allows direct access to environ, so it can't be completely safe in the face of modification: https://github.com/bminor/glibc/commit/7a61e7f557a97ab597d6f...
I'm also not convinced by Musl's maintainer that it can't be fixed within Musl considering glibc is making changes to make this a non-issue.
extern char **environ;
As long as environ is publicly accessible, there's no guarantee that setenv and getenv will be used at all, since they're not necessary.
If you're willing to get rid of environ, it's pretty trivial to make setenv and getenv thread safe. If not, then it's impossible, although one could still argue that making setenv and getenv thread safe is at least an improvement, even if it's not a complete solution (aka don't let the perfect be the enemy of the good).
Exactly my point. Over time *environ would disappear, at least from the major software projects that everyone uses (assuming it's even in use in them in the first place).
[1] https://github.com/bminor/glibc/commit/7a61e7f557a97ab597d6f...
Most of the rest of the problem here seems to be the development environment. They're testing on a remote machine in an Amazon data center and using Docker. This rig fails to report that a process has crashed. Then they don't have enough debug symbol info inside their container to get a backtrace. If they'd gotten a clean backtrace reported on the first failure, this would have been obvious.
Why is anyone using "setenv" anyway?
There's no reason setenv should have been called here. The `openssl-probe` library could simply return the paths to the system cert files and callers could plug those directly into the OpenSSL config.
Oversights all around and hopefully this continues to improve.
Because it’s there and it looks like a good idea until it takes one of your fingers.
I always thought this was kinda foolish: your configuration method is a flat-namespace basket of stringly-typed values. The perils of getenv()/setenv()/environ are also, I think, a great argument against using env vars for configuration.
Sure, there aren't always great, well-supported options out there. I prefer using a configuration file (you can have templated config and a system that fills in different values for e.g. dev/stage/prod), and I'll usually use YAML, despite its faults and gotchas. There are probably better configuration file formats, but IMO YAML is still significantly better than using env vars.
These kinds of detailed troubleshooting reports are the closest thing you can get to having to do it yourself. Thanks to the authors. It's easy to say "don't use X duh" until a dependency relies on it, and how were you supposed to know?
> We don’t have the necessary files outside of the container, and our containers are quite minimal and don’t allow us to easily install gdb.
Have people lost the ability to build and debug their code locally, without clouds and containers?
The reason is that cloud is where all the money is because cloud is DRM. Put software there and you can charge a subscription and nobody can evade it and you have perfect lock in forever. People usually can’t even get their data out. You can also do all kinds of realtime analytics conveniently to optimize your product.
Computing architecture is downstream of the business model. Mainframe died originally because there was no Internet and PCs were cheaper, but vendors also lost a lot of their lock in power. Now they have a way to bring a model that is much more profitable back. No more pesky freedom for users, who to be fair if given such freedom will often just refuse to pay, making quality software a non-viable business.
Tangent I know.
there are faults to the cloud but it solves real problems users have.
they should have handled crashes better - a problem they seem to recognize but not the issue here so not covered.
No, of course not, but it didn't crash on our machines!
I hate envvars. It’s “the Linux way”. I avoid them like the plague. A++ strong recommend.
libc is terrible. The world needs to move on.
if (__environ == NULL || name[0] == '\0')
  return NULL;
import numpy
setproctitle() worked before numpy import but not after because it couldn't find the memory address of **environ.
I'm hazy on the details but it led me to a somethingenv call (possibly getenv or setenv) in numpy initialization and it turned out that function changed the address of **environ and that was why setproctitle couldn't find it.
And:
> This function is safe to call in a single-threaded program.
> This function is also always safe to call on Windows, in single-threaded and multi-threaded programs.
> In multi-threaded programs on other operating systems, the only safe option is to not use set_var or remove_var at all.
It doesn't seem it would take much to do it efficiently, even retaining the poor getenv() pointer-returning API (which could point to a thread local buffer). The coordination between getenv and setenv could be very lightweight - spinlock vs mutex.
There's also no real backwards compatible way of fixing setenv(). getenv() returns a pointer that can be read at any time, and then there's the *environment parameter that can also be used to read env variables.
IMO the entire API should be deprecated for a thread safe one, but until someone comes with a standard setenv() alternative that's implemented by the libc runtimes, we'll be stuck with the shitty POSIX API, and every year we will read blog posts about get/setenv() crashing processes.
The setenv( ) function need not be thread-safe. A function that is not required to be thread-safe is not required to be reentrant.
https://www.open-std.org/jtc1/sc22/open/n4217.pdf, page 1860 :')
Is "the standard says it doesn't NEED to be thread safe" the argument that the Linux libc maintainers are using for not enhancing it to be thread safe, or is it based on some technical or backwards compatibility issues in doing so?
A similar bug related to setlocale was found in 2007 and fixed in 2014. That bug did not take getenv/setenv into account. https://sourceware.org/bugzilla/show_bug.cgi?id=5443
> POSIX.1 does not require setenv() or unsetenv() to be reentrant.
A non-reentrant function cannot be thread safe.
In general (for POSIX, libc and many other libraries): if the docs do not explicitly say "this function is thread safe", it is not.
"It is ridiculous that this has been a known problem for so long. It has wasted thousands of hours of people's time, either debugging the problems, or debating what to do about it. We know how to fix the problem." https://www.evanjones.ca/setenv-is-not-thread-safe.html
Actually, a non-reentrant function can be thread-safe. A common example of such a function in libc is malloc().
You run into a problem if you keep using a string returned by getenv after calling another environment function: including possibly getenv itself!
However, it's easy to just strdup the result of getenv; that defends against the issue in a single-threaded program.
1. If a process crashes and dumps, be sure to look at the system log of the cause (e.g. SIGSEGV, OOM, invalid instruction, etc.)
2. Be certain you’re looking at the right core dumps — I believe UID 1000 just means posix UserID (which is unrelated to a PID), though I don’t use containers.
3. Stay focused on the right level of abstraction — memory model details are great to know, but irrelevant here.
4. Variables do not correlate 1:1 with registers, except in C calling conventions. The assumption about x20 and a local variable is incorrect, unfortunately.
5. getenv() and setenv() do not work as implied in the post. When a process starts via execve(), the OS/libc constructs a new snapshot of the environment, which cannot be modified by an ancestral process. It’s a snapshot in time, unless updated by the process itself. When a process fork()s, the child gets a new copy of the parent’s environment; updates do not propagate.
getenv() is thread safe and reentrant. You don’t use an environment to pass shared data — setenv() is generally used when constructing the environment for a child process before a fork(). See man environment.
6. FWIW, ‘char** env’ is a null-terminated array of pointers, so dumping memory from *env (or env[0]) is only valid until you hit the first NULL. The size of the array is not stored in the array.
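To illustrate point 6: since the array carries no length field, the only way to learn its size is to walk it to the terminating NULL. A small sketch, with env_count being a name made up for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Count the entries of a NULL-terminated environment array. The length
 * is not stored anywhere, so it has to be recomputed by scanning. */
size_t env_count(char *const env[])
{
    assert(env != NULL);
    size_t n = 0;
    while (env[n] != NULL)
        n++;
    return n;
}
```

If another thread overwrites or frees the array during this scan, the loop reads garbage or runs past the end, which is the kind of failure the article describes.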
I hope this helps! And apologies if this is redundant — I read so many comments; mostly variations of “the problem with getenv is x”, but gave up before reading all of the (currently) 168 comments.
I should see if the env_logger crate has a better solution.
keep[s] older versions around and adopt[s] an exponential resizing policy. This results in an amortized constant space leak per active environment variable, but there already is such a leak for the variable itself (and that is even length-dependent, and includes no-longer used values).
There have got to be pathological uses out there where this will cause unbounded memory growth in well-formed (according to the API) programs, no?
Interesting to see this _introduce_ a ‘bug’ (unbounded memory growth) for these programs that follow the API in order to ‘fix’ programs that don’t (by using the API in multiple threads). Pragmatism over dogma I guess. Leaves me feeling a bit sketched out though.
The only way to coordinate locking would be to do so in libc itself.
synchronization methods impose various complexity and performance penalties, and single threaded applications which don't need that would pay those penalties and get no benefit.
Unix was designed around a lightweight ethos that allowed simple combining of functions by the user on the command line. See "worse is better", but tl;dr that way of doing things proved better, and that's why you find yourself confronting what it doesn't do.
How do you figure? The problem isn't the implementation, it's the API. setenv(), unsetenv(), putenv(), and especially environ, are inherently unsafe in a multithreaded program. Even getenv_r() can't really save you, since another thread may be calling setenv() while the (old) value of an env var is being copied into the provided buffer. Sure, a getenv_r() fixes the case where you get something back from getenv(), and then another thread calls setenv() and makes that memory invalid, but there's no way to protect the other calls breaking the API.
There are ways to mitigate some of the issues, like having libc hold a mutex when inside getenv()/setenv()/putenv()/unsetenv(), but there's still no way for libc to guarantee that something returned by getenv() remains valid long enough for the calling code to use it (which, right, can be fixed by getenv_r(), which could also be protected by that mutex). But there's no good way to make direct access to environ safe. I suppose you could make environ a thread-local, but then different threads' views of the environment could become out of sync, permanently (and you could get different results between calling getenv_r() and examining environ directly).
Back-compat here is just really hard to do. Even adding a mutex to protect those functions could change the semantics enough to break existing programs. (Arguably they're already broken in that case, but still...)
> How do you figure?
From https://illumos.org/man/3C/putenv:
> The putenv() function can be safely called from multithreaded programs
Won't that depend on the libc implementation? For example, maybe setenv writes to another buffer, then swaps pointers atomically; wouldn't that work?
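For what it's worth, the write-then-swap idea can be sketched for a single value with C11 atomics. This is an illustration under assumptions, not how any particular libc does it; publish_value and read_value are hypothetical names. The catch is visible in the code: the old string can never be freed, because a reader may still hold a pointer to it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* The currently published value. Readers load it; writers replace it. */
static _Atomic(char *) current_value = NULL;

/* Writer: build the new string privately, then publish it in one atomic
 * store. The previous string is deliberately leaked, since some reader
 * may still be using it. Returns 0 on success, -1 on allocation failure. */
int publish_value(const char *value)
{
    assert(value != NULL);
    size_t len = strlen(value) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, value, len);
    atomic_store_explicit(&current_value, copy, memory_order_release);
    return 0;
}

/* Reader: the returned pointer stays valid forever, at the cost of the
 * leak above. Returns NULL if nothing has been published yet. */
const char *read_value(void)
{
    return atomic_load_explicit(&current_value, memory_order_acquire);
}
```

So it would work for callers that go through getenv(), at the price of a leak per update. It does nothing for code that walks environ directly, and two concurrent writers to a real environment array would still need a lock between them.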
If there were a language feature that let me mark apps such that during any process env vars are not writable and are readable only once (together, in a batch, not once per var), I'd use it everywhere.
But yes, a flat namespace, with string values, shared as a free-for-all with who knows what libraries and modules you're loading… that's not a good idea even if it didn't have safety issues in setenv().
Being on the JVM, which actually treats the environment as immutable and which probably inspired a lot of the 12 factor app movement (with companies like Soundcloud being big Scala and Java users and pushing this), I've never experienced any issues with the environment changing on me or causing any threading issues. The environment is effectively immutable and there's nothing in my processes that sneakily circumvents that (via some native calls into libc). So, complete non issue on the JVM.
Even if somebody manages to modify the environment, the immutable copy stays the same. That copy gets created on JVM startup and is immutable. Anything using normal Java APIs to interact with the environment will never see the modification. I'm sure people might have tried to work around that but it's not a widespread practice. Because, again, why would you even want to do that?
The problem with configuration files is that their parsing is process specific. That's why Linux/Unix is such a mess. Every single tool seems to have its own conventions and mechanisms for configuration. There are no standards for this.
Other of course than the Docker ecosystem. You can do whatever you want inside the container but effectively your only interface to the outside world is either messily mounting some volume and doing whatever convoluted way of configuration your app requires; or just using environment variables. Most modern software is docker ready/friendly in the sense that you can fully control their behavior via the environment. It's perfectly adequate for most things that people run via docker these days. Which of course is pretty much anything.
And of course with Docker compose or kubernetes (which I'm not necessarily a fan of) you get yaml files defining lists of environment variables that define how your process starts. So you more or less get what you are asking for. I'm not a big YAML fan but it works well enough. Too much potential for syntax issues really ruining your day IMHO. But it's not like the alternatives are free of issues.
I personally use 12 factor app style, but once it's entered the app I validate the env variables and data and then store them. It's totally fine after that.
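A sketch of that "read once, validate, store" pattern in C. The variable names APP_DB_URL and APP_PORT and the struct layout are invented for the example; the point is that getenv() is only called once, at startup, before any threads exist:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Validated configuration; after startup the program reads only this. */
struct app_config {
    char db_url[256];
    long port;
};

/* Validate raw strings and copy them into cfg.
 * Returns 0 on success, -1 if a value is missing or invalid. */
int parse_config(const char *url, const char *port, struct app_config *cfg)
{
    assert(cfg != NULL);
    if (url == NULL || port == NULL)
        return -1;
    if (strlen(url) >= sizeof cfg->db_url)
        return -1;

    char *end = NULL;
    errno = 0;
    long p = strtol(port, &end, 10);
    if (errno != 0 || end == port || *end != '\0' || p < 1 || p > 65535)
        return -1;

    strcpy(cfg->db_url, url);   /* length was checked above */
    cfg->port = p;
    return 0;
}

/* Call once at the top of main(), before starting any threads. */
int load_config(struct app_config *cfg)
{
    return parse_config(getenv("APP_DB_URL"), getenv("APP_PORT"), cfg);
}
```

Because the values are copied into the struct, nothing later in the program holds a pointer into the environment.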
Throw away your CPU and RAM then.
And even if those abstractions can't be 100% effective, we'd go a long way to achieving the desirable results of getting rid of it, if we just develop the mindset of avoiding it if at all possible, excepting for very rare instances where it's needed as a last resort.
Go ahead and write lots of mutable global statics. But when your program crashes randomly and you need my help to debug and it is, once again, a global mutable then you have to perform a walk of shame.
the problem is not linux, not mutable global state or resources and not libc.
the problem is not getting time at work to do things properly. like spotting this in GDB before the issue hit, because your boss gave you time to tirelessly debug and reverse your code and anything it touches....
there is too much money in halfbaked code. sad but true.
If both Rust and C have independent standard libraries loaded into the same process, each would have an independent set of environment variables. So setting a variable from Rust wouldn't make it visible to the C code, which would break the article's usecase of configuring OpenSSL.
The only real solution is to have the operating system provide a thread-safe way of managing environment variables. Windows does so; but in Linux that's the job of libc, which refuses to provide thread-safety.
*BSD allows it too, or at least did as of 2022.
What is unusual about Linux is that it guarantees a syscall ABI, meaning that if you follow it, you can make a system call "portably" across "any" version of Linux.
Is that true? It's just a process global string -> string map, that can be pre-loaded with values before the process starts, with a copy of the current state being passed to any sub-process. This could be trivially implemented with batch processing/supervisory programs.
Yes, to read them. If Rust wishes to modify them, it can modify its own, already-copied structure. I'd do that in pretty much any language.
https://github.com/sunfishcode/eyra
Oh look:
> Why use Eyra? It fixes Rust's set_var unsoundness issue. The environment-variable implementation leaks memory internally (it is optional, but enabled by default), so setenv etc. are thread-safe.
Re: fork(), I just meant to be thorough in explaining the environment is copied, not shared by processes. Setenv() only affects the process from which it’s called.
The array size bit in the article: The value 0x220 looks suspiciously close to the size of the old environment in 64-bit words (0x220 / 8 = 68), and this value was written over the terminating NULL of the environment block…
HTH!
It does not help, because you do not appear to have understood the article (or even read it all that closely).
Some of these bullet points feel a lot like the kind of junk output one sees from the various (popular, but flawed) AI summary tools...
I could be wrong though. Could you be specific? I don’t want to misinform anyone…
It's worth noting here that you can also build your binaries and keep debug symbols separately.
You don't need to ship them with the binary (although it will make many scenarios a bit simpler if you do, since you'll always have the right ones available).
Some info that might help: https://www.tweag.io/blog/2023-11-23-debug-fission/ https://undo.io/resources/gdb-watchpoint/reduce-binary-size-...
You might want to look into debuginfod.
3. There's nothing wrong with the level of abstraction here. If you have a crash that occurs on ARM but not on amd64, the differences in how those architectures operate is a very reasonable initial assumption.
4. The value in x20 is the same value in the local variable in question. Even though there may not be a general one-to-one mapping between variables and registers, at this particular instant in time that variable does correspond to this register.
5 is irrelevant, as the article isn't discussing forking. It's discussing the (somewhat questionable) practice of a program using getenv/setenv as mutable state.
For 6, the article doesn't say that env stores its own array length. It says that setenv called something like free() on the old env array, and free() overwrote env with the length of the memory allocation (which is a quite reasonable way for malloc to do bookkeeping).
POSIX predates adoption of threads in the UNIX world.
Well, that's a lot of caveats. As I said, it would take years to complete. And it looks like it's well on its way but not near complete.
For some reason lots of programmers will behave like the comment section on an accident video. "I would notice that earlier", "I'd avoid that", "I can react faster".
These things are perhaps more commonly known to be bad, and the dangers are perhaps more obvious.
There will always be people who use things in the wrong way too, which doesn't make the thing bad, but how it's used.
There are buildings in my country with nets around them because people keep jumping off them (suicidal). The buildings are safe. The nets are not a solution, they just shift the problem and don't tackle the root cause.
There are many car crashes with fatal victims. Sure, car manufacturers try to make cars safer, but there are no hordes of people hating on cars calling for them to be abolished in favor of safer technology because people rely on them heavily.
Same for libc. People try to improve its safety, and try to advise and write about its dangers. Just because bugs exist and unsafe conditions can occur doesn't mean something should be dropped altogether... a lot of the world relies heavily on libc, safe and unsafe uses of it even.
What's more is that libc and linux etc. are open-source. If someone knows a sound solution to these issues which does not break the entire world, they are free to submit pull requests....
simply stating something is 'rubbish' and needs to be put down is an unproductive and shortsighted sentiment.
> They did, it's called core. But it assumes no operating system at all, and environment variables require an operating system.
I think there's some confusion here. The C standard library is an abstraction layer that exists to implement standard behavior on hardware. It's entirely unrelated to the existence of an OS. Things like "/proc/$PID/environ" have nothing to do with C.
There are many standard libraries, for embedded, that implement these things, like getenv, on bare metal [1].
Standard C libraries exist to implement functionality. The standard does not define how to implement the functionality. That's the whole point of C: it's an abstraction that has very few requirements.
The implementation of environment variables doesn't require an OS. If they made this "core", they could trivially implement the concept.
[1] https://en.wikipedia.org/wiki/Newlib [2] getenv: https://sourceware.org/newlib/libc.html
To put it bluntly, newlib is an antisocial libc. It provides bare compileability of programs by implementing C and POSIX facilities atop a small set of system calls. However, in practice, it requires basically nothing to actually work. If you look at what it requires [1], you can see that virtually all of the system calls are allowed to do nothing but return an error. The only function that is actually shown to do something is sbrk which is a simple bump allocator, and even then it's only strongly recommended to work so that malloc also works since a lot of ordinary C programs use malloc. This says to me "get code to compile at all costs" with no concern for a wider environment (since there may be no "wider environment" in the first place).
More charitably, we can view newlib as a set of compatibility shims bridging hosted and freestanding C. This has a place, of course; there are C libraries that assume a hosted implementation but don't really need (all of) a hosted implementation.
This doesn't really apply to nostd Rust, and creating a set of "environment variables" that interoperate with nothing, just because you can, is kind of pointless when there's no O/S and no FFI involved. I explained more about why (IMO) core::env/alloc::env shouldn't exist in the other comment.
All that having been said, newlib does seem to sit in a position somewhere between core+alloc and full std in terms of Rust (std also includes networking). Maybe there is a need for FFI/C compatibility without networking? I can't say for sure, but I haven't needed it.
Context: The setenv function is not thread-safe even in Rust
Question: Why doesn't Rust implement a standard library without C?
Answer: It does, but core lacks std::env, because env vars are part of an O/S
Question: Is an O/S really necessary for env vars?
Answer: Not conceptually, but without an O/S, env vars don't work as expected
I also like the sibling comment that pointed out env vars are social as much as technical. The key element is interoperability. And we haven't even discussed Windows, which has different functions and conventions for environment variables.
Now, let me address what you just said. First of all, on embedded, a freestanding C implementation is not even required to provide getenv at all. Second, while getenv is in standard C and required for hosted implementations, setenv is not. And the whole thread is really about setenv. Once we pull in setenv, we're talking not just about standard C but about POSIX, which is a specification for operating systems. I assume for the sake of fruitful discussion, we both accept that a variable put into the environment with setenv should be retrievable thereafter with getenv. This moreover should apply even if it's Rust that calls setenv and C that calls getenv and vice-versa.
So, however Rust implements environment variables should be consistent with how C implements environment variables, and since C provides the foundation for system calls and FFI for most other major languages, adhering to this convention allows interoperability across very many languages. This convention is defined by libc (the implementation of the C standard and POSIX interfaces) and thus interoperability is based on libc compatibility. So either Rust implements its own libc, which C programs would have to be (re-)compiled to use, or else it uses an existing implementation of libc, inheriting all of its quirks. Indeed, Rust targets specify the libc (or equivalent) they're using, such as -gnu, -musl, -darwin, -mingw, -msvc, etc. Linking with libraries built for a different libc on an otherwise identical platform (-gnu vs. -musl on Linux, -mingw vs -msvc on Windows) generally doesn't work and even when it appears to work leads to strange issues later. So you can't just write your own getenv and expect it to work with some other implementation of setenv.
To connect back with my other comments, there is no core::env because core assumes no libc at all. The nostd flavor of Rust (where core is available but not std) is basically equivalent to freestanding C and like freestanding C there is no interoperability guarantee, not even with freestanding C on the same hardware (indeed, the whole concept of "freestanding" is that there are no conventions to adhere to in the first place). So, std::env::set_var has the exact same problems as C setenv because it's the same thing under the hood. This cannot be addressed without fixing libc itself. Moreover, when libc is not involved, then there is no env to support to begin with.
Finally, to round out addressing what you said, core::env could exist, but probably shouldn't, for two reasons. First, it would be misleading. As I've already laid out, it would not interoperate with anything since there's nothing there to interoperate with. It would just be a global string->string map exclusive to that program, which the programmer could just as well create on his own. Second, because presumably you want it to be something other than empty, it would require some kind of global allocator, which core also assumes doesn't exist. So it would have to be something like alloc::env instead, and once you've pulled in alloc, you can just use one of the collection types (though, notably, HashMap isn't in alloc yet [1]).
I clarified why I mentioned fork().
I tried to explain the difference between registers and variables.
I’m not trying to show off or bring anyone down… I just like to help people. I’m old (my first Linux kernel commit was in 2004). And I could be wrong — please LMK if I made a factual error (I’d appreciate it, honestly).
All good?
The article specifically mentions that the authors consulted the disassembly to see what was in x20. I know it is a general purpose register. They know it is a general purpose register. This knowledge is completely irrelevant: they read the code, they matched it against the actual source, they can confirm that at the time of crash x20 contains what they said it contains. The compiler optimizations have already run. They can't change anything anymore. That you mentioned this shows that you do not follow the actual order of events here.
envp, similarly, is in the process of being operated on in the crashing code. The authors grabbed its size from some random context at the time of the crash. The fact that it is not actually stored in the array itself is completely irrelevant to the fact that its numeric value was present in the crash dump. Obviously, some code that operated on it had computed the value and stashed it, which is a completely natural and expected thing for this code to do.
Finally, nobody cares about setenv across processes. The article didn't talk about this. It's completely irrelevant to mention this, and in fact there is another comment further down (which you may not have read, I'm ok with that) that also has the same confusion and it belies a poor grasp of what the actual problem is.
You can see that I am forced to do significantly more work than you to respond to what specifically is the problem here. It looks like you are pattern matching on specific words and then regurgitating your knowledge on it, whether it is relevant or not. When it's not, it's essentially just spam; when it is you fail to actually take into account the content that is actually being discussed. When I'm talking about how I almost got run over by a driver on their phone you are not welcome to step in and start talking about how a lot of hit-and-runs involve drunk drivers. I wasn't talking about a hit-and-run, and I just told you the person was on their phone. Somehow you completely missed that and kept talking about what you wanted to mention, like if you gave the gist of the conversation to someone else and asked them for their response on it and then pasted that here without checking to see if it was relevant or not. Don't do that.
It's hard to find good details here, but here's a mailing list thread from 2019 mentioning libc usage: https://groups.google.com/g/golang-nuts/c/uX8eUeyuuAY/m/Cfhl...
> On Solaris (and Windows), and more recently in macOS as well we link with libc (or equivalent).
> Go used to do raw system calls on macOS, and binaries were occasionally broken by kernel updates. Now Go uses libc on macOS.
Ideally all libraries which use environment variables should have APIs allowing you to override the env variables without calling setenv(), but that isn't always the case.
No, the problem is that libraries try to do this at all. Libraries should just have those APIs you mention, and not touch env vars, period. If you, the library user, really want to use env vars for those settings, you can getenv() them yourself and pass them to the library's APIs.
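A minimal sketch of what that split could look like, assuming a made-up library "libfoo". Every name here (`foo_config`, `foo_init`, `foo_config_from_env`, `FOO_VERBOSE`, `FOO_CACHE_DIR`) is hypothetical; the point is only that the library takes explicit settings and the application decides whether env vars are consulted at all.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* Hypothetical library "libfoo": configuration arrives through an
 * explicit struct, never through getenv()/setenv(). */
struct foo_config {
    int verbose;      /* 0 = quiet, 1 = verbose */
    char *cache_dir;  /* owned copy, or NULL for the library default */
};

struct foo_ctx {
    struct foo_config cfg;
};

/* Library side: just store what the caller passed in. */
void foo_init(struct foo_ctx *ctx, const struct foo_config *cfg)
{
    ctx->cfg = *cfg;
}

/* Application side: the program, not the library, reads the environment.
 * Meant to be called once during startup, before any threads exist.
 * Values are copied so nothing points into the environment afterwards. */
struct foo_config foo_config_from_env(void)
{
    struct foo_config cfg = { 0, NULL };
    const char *v = getenv("FOO_VERBOSE");
    const char *d = getenv("FOO_CACHE_DIR");

    if (v != NULL && strcmp(v, "1") == 0)
        cfg.verbose = 1;
    if (d != NULL)
        cfg.cache_dir = strdup(d);
    return cfg;
}
```

A program that wants different settings for one call just builds a different `foo_config`; nobody needs setenv() for that.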
Obviously we can't change history; there are libraries that do this anyway. But we should encourage library authors to (in the future) pretend that env vars don't exist.
I do think you'd be hard-pressed to find a situation where a program calling setenv() to configure a library actually makes sense. It's a pretty strong sign that someone made a bad decision. People will, however, make mistakes in API design.
I agree with you that it would be much better if, when libA needs to set behavior Foo in libB, it called libB:setBehavior (Foo) rather than setenv ("LibBehavior", "Foo").
But let's not throw the baby out with the bathwater.
Just like a library wouldn’t try to use argv directly, it shouldn’t use envp either (even if done via getenv/setenv)
dlopen()
which passes on your environment. If you want to load libpam-kerberos and pass DEBUG=verbose
you will need to setenv() your own environment.
The Rust stdlib is already using synchronization on the versions of these functions that are exposed from the Rust stdlib. That's why those functions were allowed to be marked as safe in the first place.
The problem is that people are calling C code from Rust (which already requires an unsafe annotation), and then that C code is doing silly thread-unsafe shenanigans for regrettable historical reasons.
It's beyond Rust's power to fix without cooperation from the underlying C code, which happens to be provided by the OS, which is just being compliant with Posix. Rust can only do so much when the platform itself is hell-bent on sabotaging you.
It certainly would be nice if the C library had fewer built-in footguns. And if we could write programs in other languages without ever depending on it (which wouldn’t be much use when you’re relying on a C library anyway, but it still would be nice).
Yes, it will take a long time, and some users will complain it doesn't work on their PDP-11, but the problem will never be solved if there's no migration path to a safe solution.
There's no way to use it safely in a multi threaded application that may use setenv (unless you add your own synchronisation, and ensure everything uses it, even third party libraries).
I am also quite technical, thanks.
This is the iCloud model and it works. Imagine a more open version with competing storage providers.
This, however, would hand control back to the user, which would be bad for the software industry with its addiction to lock in and recurring revenue.
I'm not saying you are wrong, but there is a lot of nuance here.
There are exceptions, like large AI models and huge databases like web search, though in the case of AI models I can run pretty decent ones locally already, but on an admittedly expensive laptop. If the rate at which models grow is not as fast or faster than the rate at which computers grow, mainstream PCs or even phones will catch up eventually.
I've actually wondered if that might be a major factor that swings the pendulum back... if you can run an AI that has memorized the entire Internet locally, that makes all kinds of things possible in local compute.
Installing apps could be easy, even automatic on demand. That's kind of what the web does. Imagine the web with better caching of program objects, maybe a runtime built around WASM, and an iCloud-type data model, and you can visualize personal computing for today. The kludgy idea of installers that vomit files all over the system is already legacy.
But it would still break SaaS lock-in, so this isn't where the money goes. Our software paradigms wrap themselves around whatever works as a business model.
modern computing is mostly just the most malignant, worst possible re-interpretation of plan 9 anyways
Leaking would be good enough for many use cases, but it would break long-running users of setenv (mainly those with libraries abusing env vars, as in TFA), and doesn't even solve how they interact with putenv and environ. This whole API is just cursed.
Libc could of course get better APIs, like GetEnvironmentVariable on Windows, but that won't fix all existing code.
If current platforms are safely making a copy of getenv before allowing their scheduler to interrupt, then yes I'd be ok with your solution.
If DOM nodes during the next render differ from what react-dom expects (i.e. the DOM nodes from the previous render), then react-dom may throw a DOMException. Mutating innerHTML via a ref may violate React's invariants, and the library correctly throws an error when programmers, browser extensions, etc. mutate the DOM such that a node's parent unexpectedly changes.
There are workarounds[1] to mutate DOM nodes managed by React and avoid DOMExceptions, but I haven't worked on a codebase where anything like this was necessary.
[1] https://github.com/facebook/react/issues/11538#issuecomment-...
innerHTML is useful when there is a trusted HTML source, which is becoming more popular with stuff like HTMX and FastHTML.
So a non-reentrant function is a function that may not be invoked again between a previous invocation and returning from that invocation.
When a function may be invoked from different threads, then it is certain that sometimes it will be invoked by a thread before returning from a previous invocation from a different thread.
Therefore any function that may be invoked from different threads must be reentrant. Otherwise the behavior of the program is unpredictable. Reentrant functions may be required even in single-thread programs, when they may be invoked recursively, or they may be invoked by signal handlers.
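To make the distinction concrete, here is a small sketch with two made-up functions (`label_static` and `label_r` are illustrative names, not from any library). The first keeps its result in a static buffer, the second writes into storage the caller owns.

```c
#include <stddef.h>
#include <stdio.h>

/* Non-reentrant: every call shares one static buffer, so a second call
 * (from another thread, a signal handler, or recursion) overwrites the
 * result of the first. */
const char *label_static(int id)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "item-%d", id);
    return buf;
}

/* Caller-provided storage: two calls with different buffers cannot
 * clobber each other's result. */
char *label_r(int id, char *buf, size_t len)
{
    snprintf(buf, len, "item-%d", id);
    return buf;
}
```

With `label_static`, the pointer from an earlier call silently starts showing the later call's text, which is the same shape of problem as a getenv() result changing underneath you.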
An implementation of "malloc" may be reentrant or it may be non-reentrant.
Old "malloc" implementations were usually non-reentrant because they used global variables for managing the heap. Such "malloc" functions could not be used in multi-threaded programs.
Modern "malloc" implementations are reentrant, either by using only thread-local storage or by using shared global variables to which some method for concurrent access is implemented, e.g. with mutual exclusion.
A reentrant function is thread-safe, but a thread-safe function may or may not be reentrant.
For instance, if a function uses mutual exclusion (say, pthread_mutex_lock() and friends) to ensure thread-safety it won't be reentrant, because if the function is invoked via a signal handler it may deadlock. Which is why many common libc functions like malloc and stdio are not required to be async-signal-safe in POSIX, whereas they are required to be thread-safe.
Therefore I do not think that anyone has bothered to implement a signal-safe malloc, as this is likely to be complicated.
Allocating memory in a signal handler makes no sense in a well designed program, so not being allowed to use malloc and related functions is not a problem.
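The usual way to keep a handler free of malloc (and of anything else that isn't async-signal-safe) is to have it set a flag and do the real work later in normal code. A minimal sketch, with hypothetical names (`install_reload_handler`, `take_reload_request`):

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* The only state the handler touches: a flag of a type that can be
 * written atomically with respect to signal delivery. */
static volatile sig_atomic_t reload_requested = 0;

static void on_signal(int signo)
{
    (void)signo;
    reload_requested = 1;   /* no malloc, no stdio, no locks */
}

/* Install the handler for the given signal; returns 0 on success. */
int install_reload_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Called from the main loop: returns 1 once per noticed request, and
 * that is where allocation and other heavy work belongs. */
int take_reload_request(void)
{
    if (!reload_requested)
        return 0;
    reload_requested = 0;
    return 1;
}
```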
That's part of the design of the OS. How the OS implements this is primitive, and so it leaves it up to every language to handle. The blog mentions the issue is with getenv, setenv, and realloc, all system calls. To me, that sounds like bad OS design is causing issues downstream with languages, leaving it up to individual programmers to deal with the fallout.
None of these 3 functions is a system call. open(), mmap(), sbrk(), poll(), etc. are system calls. What you're referring to is C library API, which as Go has shown (both to its benefit and its detriment) is optional on almost all operating systems (a major exception being OpenBSD.)
If you really want to lose some sanity I would recommend reading the man page for getauxval(), and then look up how that works on the machine level when the process is started. Especially on some of the older architectures. (No liability accepted for any grey hair induced by this.)
But ... there's a difference between being able to do direct syscalls via asm, and them being portable across kernel versions, which is what this subthread was about.
Granted, most people want version portability, but still on a technical level, it's not the same thing.
The closest libc can get to MT safety is to never deallocate an environment string or an environ array. Solaris does this--if you continually add new variables with setenv it just leaks environ array memory, or if you continually overwrite a key it just leaks the old value. (IIRC, glibc is halfway there.) But even then it still requires the application to abstain from doing crazy stuff, like modifying the strings you get back from getenv. NetBSD tried adding safer interfaces, like getenv_r, but it's ultimately insufficient to meaningfully address the problem.
The right answer for safe, portable programs is to not mutate the environment once you go multi-threaded, or even better just treat process environment as immutable once you enter your main loop or otherwise finish with initial process setup. glibc could (and maybe should) fully adopt the Solaris solution (currently, IIRC, glibc leaks env strings but not environ arrays), but if applications are using the environment variable table as a global, shared, mutable key-value store, then leaking memory probably isn't what they want, either. Either way, the best solution is to stop treating it as mutable.
https://learn.microsoft.com/en-us/windows/win32/api/winbase/...
https://learn.microsoft.com/en-us/windows/win32/api/winbase/...
The thing is, the OP people weren't doing that at all, it was some irresponsible library maintainers. If your code does that, you have to include something like the "surgeon general's warning" everywhere: "CAREFUL: USING THIS LIBRARY MAY CAUSE TERMINAL CRASHES".
History: V7 research UNIX had "getenv()", but not "setenv()".[1] BSD Unix 4.x had "getenv()" and "setenv()".[2] Google's "AI Overview" says "The setenv() and unsetenv() functions were included in Version 7 of AT&T UNIX.", but that does not seem to be correct.
This misfeature seems to be what was once called a "Berkeleyism", a Berkeley mod to UNIX.
"setenv()" predates UNIX/Linux getting threads.
[1] http://web.cuzuco.com/~cuzuco/v7/v7vol1.pdf
[2] https://archive.org/details/44bsdprogrammers0000ucbe/page/n3...
Worst case memory usage (all threads get all vars) is that you end up having a separate copy of the environment per thread, but it seems this is the best that can be done given the awful API.
POSIX doesn't require mmap() to be async-signal safe, but on Linux it de facto is.
Of course, doing every memory allocation at the kernel level is going to be very slow and resource intensive. But if it isn't on a hot code path...
And the library's use of setenv is clearly a bug as setenv is documented to be not threadsafe in the C standard library. So that would take care of that problem.
Using a copy by default may have worked if it was designed as such before Rust 1.0, but Rust took the decision to expose the real environment and changing this now would be more disruptive than marking mutations as unsafe.
From the syscall interface point of view ... you pass the initial env of a process when you exec(), and the kernel copies that to (userland) memory of the new process. The fact "default initialisation" can copy from the environment of the exec()'ing parent, or the fact that the kernel can "read" a process' env (see /proc/<PID>/environ) doesn't change this; the kernel needn't be "accommodating" all the possible and impossible ways how a user application may want to interact with that state there, if you mess-too-much with it, you get garbage. Sooo ... the portability wart is setenv(), because as far as the system is concerned... your "initial" env is passed to you when exec() is called, and any modification thereafter is your concern, your problem, but foremost, your choice. And choices come with taking responsibility for the ones you make.
Under the hood the pointer is initialized by the loader, in a special place in executable memory. Most of the time, the loader gets the initial environment variable list by looking at argv* (try reading past the end of the null separator, you'll find the initial environment variables).
It would be possible for a language to hack it such that on load they initialize their own env var set without using libc and be able to safely set/get those env vars without going through libc, and to inherit them when spawning child processes by reading the special location instead of the standard location initialized by your platforms' loader/updated by libc. But how useful is a language with FFI that's fundamentally broken since callees can't set environment variables? (probably very useful, since software that relies on this is questionably designed in the first place)
If you wanted to make a bullet proof solution, you would specify the location of an envp mutex in the loaders' format and make it libc's (or any language runtime) problem to acquire that mutex.
* there are platforms where this isn't true
Of course the elephant in the room remains: I need my data where I am.
As far as local data: my laptop has terabytes, my phone over a hundred gigabytes. I have fiber at home and have seen speeds approaching a gigabit on 5G.
It’s not that often that people sit down at entirely unfamiliar machines they’ve never used, log in, and try to do data intensive work. In that case I suppose an iCloud model of compute would be downloading a lot.
I don't think POSIX fixes this: it doesn't specify that the environ array is protected against concurrent access.
If two threads call getenv right around the same time, one of them could invalidate the environ array just as the other one has started to traverse it.
If you want to be safe, copy the environment to a different data structure on program startup. Then have all your threads refer to that data structure.
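A rough sketch of that startup copy, assuming it runs before any other thread exists. `env_snapshot`, `env_snapshot_take` and `env_snapshot_get` are made-up names. After the copy is taken, threads only read the snapshot, and nothing in it points into libc's environ.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

extern char **environ;

struct env_snapshot {
    char **entries;  /* NULL-terminated array of "NAME=value" copies */
    size_t count;
};

/* Deep-copy the current environment. Returns 0 on success, -1 on
 * allocation failure. Call once, early, while single-threaded. */
int env_snapshot_take(struct env_snapshot *snap)
{
    size_t n = 0;

    snap->entries = NULL;
    snap->count = 0;
    while (environ != NULL && environ[n] != NULL)
        n++;
    snap->entries = calloc(n + 1, sizeof *snap->entries);
    if (snap->entries == NULL)
        return -1;
    for (size_t i = 0; i < n; i++) {
        snap->entries[i] = strdup(environ[i]);
        if (snap->entries[i] == NULL) {
            while (i > 0)
                free(snap->entries[--i]);
            free(snap->entries);
            snap->entries = NULL;
            return -1;
        }
    }
    snap->count = n;
    return 0;
}

/* Read-only lookup in the snapshot; safe to call from many threads as
 * long as nobody modifies the snapshot. Returns NULL if not present. */
const char *env_snapshot_get(const struct env_snapshot *snap, const char *name)
{
    size_t len = strlen(name);

    for (size_t i = 0; i < snap->count; i++) {
        const char *e = snap->entries[i];
        if (strncmp(e, name, len) == 0 && e[len] == '=')
            return e + len + 1;
    }
    return NULL;
}
```

This does nothing for third-party code that calls getenv() directly; it only protects the code that goes through the snapshot.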
As well as absolute paths, it’s ok to work with descriptor-relative paths using openat() and friends.
If one thread is using relative paths, and another is doing a chdir-based traversal (as using the nftw function, for instance), that first thread's accesses are messed up.
This is why POSIX now has various -at functions; they provide stable relative access.
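For illustration, a small sketch of descriptor-relative access with openat(). The helper names (`open_dir`, `write_relative`, `read_relative`) are made up. Once a thread holds the directory descriptor, another thread's chdir() no longer affects where its relative names resolve.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a directory to use as a stable base for relative paths. */
int open_dir(const char *path)
{
    return open(path, O_RDONLY | O_DIRECTORY);
}

/* Create or truncate `name` relative to dirfd and write len bytes.
 * Returns the number of bytes written by a single write(), or -1. */
ssize_t write_relative(int dirfd, const char *name, const void *data, size_t len)
{
    int fd = openat(dirfd, name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    ssize_t n;

    if (fd < 0)
        return -1;
    n = write(fd, data, len);
    close(fd);
    return n;
}

/* Read up to len-1 bytes of `name` relative to dirfd and NUL-terminate.
 * Returns the number of bytes read, or -1. */
ssize_t read_relative(int dirfd, const char *name, char *buf, size_t len)
{
    int fd = openat(dirfd, name, O_RDONLY);
    ssize_t n;

    if (fd < 0)
        return -1;
    n = read(fd, buf, len - 1);
    close(fd);
    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```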
The former allows you to design a coherent system. A lot of design questions which are annoying (“how do I access config data consistently”, etc.) become very clear.
It also makes C more productive. If global vars and static locals are unbanned, features like closures become less important.
Getenv() could keep several copies of the value around: one internal copy protected by a mutex, that it never returns, and one copy per thread that it stores in thread local storage. When you call getenv(), it locks the mutex, checks if the current thread's value exists, populates it from the internal copy if not, and returns it. It will also install a new setenv-specific signal handler on this thread and store info about this thread having a copy.
Setenv() will then take the same mutex as getenv(), check if the internal copy is different from the new value; if it is, it will modify the internal copy, modify the local thread's copy if that has one, and then signal each other thread in the process that has a copy in TLS. The setenv signal handler will modify the local copy that thread holds.
It's gonna be slow for a large multi-threaded program, but since setenv() used to corrupt memory for such programs, they probably don't care. And for single-threaded programs, or even for programs that don't access getenv()/setenv() on multiple threads, there should be no extra overhead other than the mutex and the bookkeeping.
The only issues that would remain are programs which send the pointer they get from getenv() to other threads without ensuring locking access, and programs which rely on modifying the pointer from getenv() directly as a way to set an env var, and expect this to be visible across threads. Those are just hopelessly broken and can't use the same API - but aren't more broken than they are today.
Of course, in addition to this complex work to make the old API (mostly) thread safe, it should also offer a new API that simply returns a copy every time, doesn't promise to show modifications to your copy when setenv() gets called (you need to call getenv() again), and puts the onus on you to free that copy explicitly.
Returning a copy isn't great (memory allocation!), the API should probably be something like:
int getenv(const char *varName, char *buf, size_t bufSize, size_t *varSize);
Where the caller manages the buffer and getenv writes into it (so it can e.g. be stack or statically allocated), the third argument is the size of the caller-managed buffer, then the last variable is an "out parameter" that returns the "true" length of the environment variable. Then afterwards, you can check if `*varSize > bufSize`, and if so, you need to make your buffer larger. The return value is an error code.
Doing it like this, you can easily implement the "return a malloced copy" if you want to, but it also gives you the option to avoid allocation entirely. This is important for e.g. embedded or real-time applications, or anything that just likes to avoid `malloc()/free()`.
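A sketch of that interface shape, under a few assumptions of mine: it is named `getenv_buf` so it doesn't clash with the real getenv(), `*var_size` counts the terminating NUL so the `*var_size > buf_size` check works as described, and the error codes (ENOENT, ERANGE) are my choice. It sits on top of plain getenv(), so by itself it fixes nothing about thread safety; it only shows the calling convention.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Copy the value of `name` into the caller's buffer.
 * On return *var_size holds the bytes required, including the NUL
 * (0 if the variable is not set).
 * Returns 0 on success, ENOENT if unset, ERANGE if buf is too small. */
int getenv_buf(const char *name, char *buf, size_t buf_size, size_t *var_size)
{
    const char *val = getenv(name);
    size_t need;

    if (val == NULL) {
        *var_size = 0;
        return ENOENT;
    }
    need = strlen(val) + 1;
    *var_size = need;
    if (need > buf_size)
        return ERANGE;
    memcpy(buf, val, need);
    return 0;
}
```

A caller that wants the malloc'd-copy behaviour can call it once with a zero-sized buffer to learn the size, allocate, and call again.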
Your particular solution doesn't work because people expect `getenv` to be async-signal-safe, which means you shouldn't be allocating memory.
Hmm ... doing an incref-like operation during `getenv` for a previously `setenv`ed variable that hasn't yet been accessed in this thread would be fine ... clear those refs during calls we know indicate knowledge refreshes ...
It's equally nasty. POSIX requires that the argument to `putenv()' not be copied, so it's not very different from assigning to `environ' directly.
"easy": protect the page containing environ and handle the mutation from the signal handler.
/s of course.
It looks useful.
And that's before you even get to the `extern char **environ` global.
One could "dream of" a func that tells libc "acquire/drop this mutex of mine around get/set/putenv calls" but that'd simply move the problem - because the nifty "frameworks" would do that (independently of each other, we're sovereign and entitled frameworks around here) and race each other's state nonetheless.
Malicious software exists, does that mean we should remove all threading primitives from the standard?
If your sensitive logs end up in the webserver root because one thread used chdir to temporarily change the working directory it's on the application writer.
Or to put it another way, the filesystem as a whole being shared mutable state does not make the current working directory being shared mutable state between threads any less of an issue.
No library does that documentation, so you can't use libraries on POSIX systems if writing multithreaded code. Or you do and hope for the best. So everyone just hopes for the best.
Caveats: POSIX.1 does not require setenv() or unsetenv() to be reentrant.
...
Interface: setenv(), unsetenv()
Attribute: Thread safety
Value: MT-Unsafe const:env
Libraries that are thread-safe DO provide that documentation. One assumes that libraries that don't provide that documentation are not thread-safe.
GnuTLS docs: The GnuTLS library is thread safe by design, meaning that objects of the library such as TLS sessions, can be safely divided across threads as long as a single thread accesses a single object.
Btw, you can _also_ substitute libc's setenv/getenv/putenv with your own (locking) implementations, courtesy preload and all the funky features of ELF symbol resolution. Actually easy. But impossible if you link against static code using it (go ... away). Hmm. easy ? impossible ? damn this grey world. Gimme some color.
If anything calls the C getenv function, like a third party library, things are maybe not fine.
30 years after these decisions were made, most sensible people do single threaded GUIs anyway (that is, all calls to the windowing API come from a single thread, and all redraws occur synchronously with respect to that thread; this does not block the use of threads functioning as workers on behalf of the GUI, but they are not allowed to make windowing API calls themselves).
Consequently, the overhead present in the win32 API is basically just dead-weight, there to make sure that "things are safe by default".
There's a design lesson here for everyone, though precisely what it is will likely still be argued about.
"If you detached a thread in your application using a non-Cocoa API, such as the POSIX or Multiprocessing Services APIs, this method could still return NO."
Also, I've never heard of this behavior despite years developing for macOS (admittedly tangentially). I don't see how that could work given that threads can come and go during the life of the application.
How much overhead is it though? IIRC uncontended mutexes are practically free, especially when they're only being used from a single thread.
Our industry is way too eager to make things unsafe for the sake of marginal performance differences that are irrelevant for most use cases, IMO.
You could wrap setenv in a mutex, but that's not good enough. It can still be called from different processes, which means you'd need to do a more expensive and complex syncing system to make it safe.
That balloons out to other env related methods needing to honor the synchronization primitive in order for there to be a semblance of safety.
However, you still end up in a scenario where you can call
setenv
getenv
and that would be incorrect because between the set and the get, even with mutexes properly in place and coordinated amongst different applications, you have a race condition where your set can be overwritten by another application's set before your get can run. Now, instead of actually making these functions safe you've buried the fact that external processes (or your own threads) can mess with env state.
The solution is to stop using env as some sort of global variable and instead treat it as a constant when the application starts. Using setenv should be mostly discouraged because of these issues.
It's a different story for languages/environments that are supposed to be safe by default and where you have language features that ensure safety (actors, optionals etc) but not for something like libc which has a standard it has to conform to and like 100 years of history.
Making global state, especially state that has no reason to be modified or even read very often like the env, thread safe is a trivial issue, well studied and understood. Could an intern do it? Probably not. Could literally any maintainer of a standard C library? Easily.
This is much more of a culture problem preventing such obvious flaws from being recognized as such.
Side-note: your set-then-get example is a theoretical problem in search of a use case. Why would you ever want to concurrently set an env var and expect to be guaranteed to read that same value? And even if this is a real thing that applications really use, exposing a new function to sync anything on the env mutex is, again, trivial. So, if you really needed that, you could do
lockenv
setenv
getenv
unlockenv
And problem solved.
This needs to be fixed inside libc, but there's no way to do so completely without breaking backward-compatibility.
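For what it's worth, the lockenv/setenv/getenv/unlockenv sequence proposed above can be approximated in application code today. This is only a sketch: `lockenv`, `unlockenv` and `set_and_get` are hypothetical functions, not libc APIs, and the lock only protects callers that go through it. Anything that calls getenv()/setenv() directly (including libc internals and third-party libraries) bypasses it, which is why a real fix has to live in libc.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* One process-wide lock that all cooperating code agrees to take
 * around environment access. */
static pthread_mutex_t env_mutex = PTHREAD_MUTEX_INITIALIZER;

void lockenv(void)   { pthread_mutex_lock(&env_mutex); }
void unlockenv(void) { pthread_mutex_unlock(&env_mutex); }

/* Set a variable and read it back as one transaction. Returns a
 * malloc'd copy of the value (caller frees), or NULL on failure. The
 * copy is made while the lock is held, so the returned string never
 * points into the environment. */
char *set_and_get(const char *name, const char *value)
{
    char *copy = NULL;

    lockenv();
    if (setenv(name, value, 1) == 0) {
        const char *cur = getenv(name);
        if (cur != NULL)
            copy = strdup(cur);
    }
    unlockenv();
    return copy;
}
```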
[ EDIT: not quite sure how to think about this ... if I create NSThreads to act as worker threads that do not make cocoa calls, I still have to deal with new overhead in any cocoa call stacks. That's not ideal, but again, it's a "middle-way" approach, and like every other approach has its own pros and cons. ]
What could work is per thread env. changes - but that's not likely to happen
I pointed out that, in addition to libc setenv/getenv using a mutex internally, they could also expose new functions to allow transactional access for anyone that really needs it - though I suspect that is a vanishingly small minority.
Note that Java, and the JVM, doesn't allow changing environment variables. It was the right choice, even if painful at times.
I am fairly certain that somewhere inside the polyhedron that satisfies those constraints, is a large subset that could be statically analyzed and proven sound. But I'm less certain if Rust could express it cleanly.
struct BeforeEnvFreeze(());
struct AfterEnvFreeze(());

impl BeforeEnvFreeze {
    pub fn new() -> Self { /* singleton check using a static AtomicBool or something */ Self(()) }
    pub fn freeze(self) -> AfterEnvFreeze { AfterEnvFreeze(()) }
    pub fn set_env(&self, ...) { ... }
}

impl AfterEnvFreeze {
    pub fn spawn_thread(&self, ...) { ... }
}

fn main() {
    let a = BeforeEnvFreeze::new();
    a.set_env(...);
    a.set_env(...);
    //b.spawn_thread(...); // not available
    let b = a.freeze(); // consumes `a`
    b.spawn_thread(...);
    //a.set_env(...); // not available
}
Exercises left to the reader:
• Banning access to the relevant bits of Rust's stdlib, libc, etc. as a means of escaping this "safe" abstraction
• Conning your lead developer into accepting your handwave
• Setting up the appropriate VCS alerts so you have a chance to NAK "helpful" "utility" pull requests that undermine your "protections"
And of course, this all remains a hackaround for POSIX design flaws - your engineering time might be better spent ensuring or enforcing your libc is "fixed" via intentional memory leaks per e.g. https://github.com/bminor/glibc/commit/7a61e7f557a97ab597d6f... , which may ≈fix more than your Rust programs.
I'd go further and say env should always be read-only and libraries should never even read env vars.
I mean, based on this issue I would say the only safe time is "at the start of the program, before any new threads may have been created".
But again, as others have said, there's no good reason I'm aware of to set environment variables in your own process, and when you spawn a new process you can give it its own environment with any changes you want.
When using C++ I wanted programs to have a function that was called before main() and set up things that got sealed afterwards, like parsing command-line-arguments, the environment variables, loading runtime libraries, and maybe look at the local directory, but I'm not sure if it'll be a useful and meaningful distinction unless you restructure way too many things.
I remember that on the Fuchsia kernel programs needed to drop capabilities at some point, but the shift needed might be a hard sell given things already "work fine".
Not sure why it would be considered painful. IMO, setenv is thread-unsafe by definition, so using it to modify your own variables never makes sense unless you are running a single-threaded application.
Java does support running child processes with a designated env space (ProcessBuilder.environment is a modifiable map, copied from the current process), so inability to modify its own doesn't matter.
Personally I have never needed to change env variables. I consider them the same as the command line parameters.
Another reason why Java isn't the greatest language to create CLI tools with.
Linux and macOS both support per-thread working directory, although sadly through incompatible APIs.
Also, AFAIK, the Linux API can't restore the link between the process CWD and thread CWD once broken – you can change your thread's CWD back to the process CWD, but that thread won't pick up any future changes to the process CWD. By contrast, macOS has an API call to restore that link.
But I think it was actually possible to hack around up until Java 17.
Those are trivial things in around 100 lines of code and have been available since System.getenv() came back (it used to be deprecated and non-functional prior to Java 1.5, i.e. 2004).
Environmental variables are not a replacement for your config. It’s not a place to store your variables.
Even if the env var API is fully concurrent, it is not convention to write code that expects an env var to change. There isn’t even a mechanism for it. You’d have to write something to poll for changes and that should feel wrong.
The most common use I see for this is people setting an env in the current process before forking off a separate process; presumably because they don't realize that you can pass a new environment to new processes.
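For anyone who hasn't seen it, passing a fresh environment to a child looks roughly like this with posix_spawn() (execve() takes the same kind of envp array). `run_with_env` is a hypothetical helper name; the parent's own environment is never modified.

```c
#define _POSIX_C_SOURCE 200809L
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Run `path` with the given argv and a caller-built environment.
 * envp is a NULL-terminated array of "NAME=value" strings and fully
 * replaces what the child sees; nothing is inherited implicitly.
 * Returns the child's exit status, or -1 on failure. */
int run_with_env(const char *path, char *const argv[], char *const envp[])
{
    pid_t pid;
    int status;

    if (posix_spawn(&pid, path, NULL, NULL, argv, envp) != 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

To change just one variable, build the array from a copy of the current environment plus the override, then hand that to the child.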
I wonder what bugs you'd find if you injected a library to override setenv() with a crash or error message into various programs. Might be a way to track down these kinds of random irreproducible bugs.
I'd be happy to have you copy the immutable read-only environment vector of strings into your space and then treat that as the source of such things.
I think it would be interesting to build all the packages with a stdlib that dumps core on any call to setenv() or unsetenv(). That would give one an idea of the scope of the problem.
This holds for a lot of programs, but what if you're writing a shell?
Of course, this means you won't see any changes to env vars from libraries you may use that call setenv(), but you also shouldn't need, or want, that in a shell.
I still think having a proper synchronous thread safe setenv()/getenv() in libc is the better choice.
"The getenv function returns a pointer to a string associated with the matched list member. The string pointed to shall not be modified by the program, but can be overwritten by a subsequent call to the getenv function."
In a single threaded virtual machine, you can immediately duplicate the string returned by getenv and stop using it, right there.
Under threads, getenv is not required to be safe.
I think that with some care, it may be; an environment implementation could guarantee that a non-mutating operation like getenv doesn't invalidate any previously returned strings.
I think POSIX does that. It allows getenv to reallocate the environ array, but not the strings themselves:
"Applications can change the entire environment in a single operation by assigning the environ variable to point to an array of character pointers to the new environment strings. After assigning a new value to environ, applications should not rely on the new environment strings remaining part of the environment, as a call to getenv(), secure_getenv(), [XSI] [Option Start] putenv(), [Option End] setenv(), unsetenv(), or any function that is dependent on an environment variable may, on noticing that environ has changed, copy the environment strings to a new array and assign environ to point to it."
environ is documented together with the exec family of functions; that's where this is found.
So whereas there are things not to like about environ, it can be the basis for thread safety of getenv in an application that doesn't mutate the environment.
Mutating argv is actually quite popular, or at least it used to be.
With good reason. Files are surprisingly hard: https://danluu.com/deconstruct-files/
Note the standard:
https://pubs.opengroup.org/onlinepubs/009604499/functions/se...
> The setenv() function need not be reentrant. A function that is not required to be reentrant is not required to be thread-safe.
With the increased use of PIE, thunks for both security and due to ARM + the difference between glibc and musl, plus busybox and you have a huge mess.
I would encourage you to play around with ghidra, just to see what return oriented programming and ARM limits does.
Compilers have been good at hiding those changes from us, but the non-reentrant nature will cause you issues even without threads.
Hint, these thunks can get inserted in the MI lowering stage or in the linker.
But setenv() is owned by POSIX, with only getenv() being deferred to cppr.
Perhaps someone could submit a proposal on how to make it reentrant to the Open Group. But it wasn't really intended for maintaining mutable state so it may be a hard sell.
For configuration files, the write-fsync-move strategy works fine. Generally you don't need fsync, since most people don't use the file system settings that allow data writes to be reordered with the metadata rename.
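A rough sketch of the write-fsync-move sequence on a POSIX system. The function name `write_config_atomically` and the "<path>.tmp" temporary name are assumptions made for this example; a more careful version would also retry on EINTR and fsync the containing directory, both of which are left out here.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Replace `path` with `data` so that readers see either the old or the new
 * file, never a partial one. Returns 0 on success, -1 on error.
 * The name and the ".tmp" suffix are choices made for this sketch. */
int write_config_atomically(const char *path, const char *data)
{
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    /* Step 1: write the new contents to a temporary file. */
    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(data), off = 0;
    while (off < len) {
        ssize_t w = write(fd, data + off, len - off);
        if (w < 0) { close(fd); unlink(tmp); return -1; }
        off += (size_t)w;
    }

    /* Step 2: push the data to stable storage before it becomes visible. */
    if (fsync(fd) != 0) { close(fd); unlink(tmp); return -1; }
    if (close(fd) != 0) { unlink(tmp); return -1; }

    /* Step 3: rename() swaps the new file into place atomically. */
    if (rename(tmp, path) != 0) { unlink(tmp); return -1; }
    return 0;
}
```

Dropping the fsync step gives the cheaper variant mentioned above, which relies on the file system ordering data writes before the rename.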
Modifying _your own_ environment _at runtime_ is not. The corresponding functions - setenv/getenv - and state - envp/environ - have in the UNIX standards "always" (since threads have existed, really) been marked non-MT. "Way back when", people were happy to accept that stated restrictions on use aren't bugs. Today, a general sense of overentitlement makes (some) people say "but since whatever-trickery can remove this restriction... you're wrong and I'm entitled to my bugfix". I agree the damage is done, though.
It is fine because it is usually done during the initialization phase, before starting any other thread. setenv() can be used here too, though I prefer to avoid doing that in any case. I also prefer not to touch argv, but since that's how GNU getopt() works, I just go with it.
Once the program is running and has started its threads, I consider setenv() a big no-no. The Rust documentation agrees with me: "In multi-threaded programs on other operating systems, the only safe option is to not use set_var or remove_var at all." Note: here, "other operating systems" means "not Windows".
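One way to follow that rule, sketched under assumptions: touch the environment exactly once during initialization, copy what you need into a private structure, and have threads read only that snapshot. Every name here (`APP_LOG_LEVEL`, `struct app_config`, `init_config_from_env`, `app_config_get`) is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical application settings captured from the environment. */
struct app_config {
    char log_level[16];
};

static struct app_config g_config;

/* Call once from main(), before any thread is created.
 * Returns 0 on success, -1 if setenv() fails. */
int init_config_from_env(void)
{
    /* overwrite = 0: supply a default only if the variable is unset. */
    if (setenv("APP_LOG_LEVEL", "info", 0) != 0)
        return -1;

    /* Copy the value out so later code never holds a pointer into environ. */
    const char *level = getenv("APP_LOG_LEVEL");
    snprintf(g_config.log_level, sizeof g_config.log_level, "%s",
             level ? level : "info");
    return 0;
}

/* After initialization, threads read this snapshot instead of the environment. */
const struct app_config *app_config_get(void)
{
    return &g_config;
}
```

This does nothing about libraries that call getenv or setenv on their own; it only keeps your own code out of the race.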
¹ or technically whatever your ELF entry point is, _start in crt0 or your poison of choice.
> ¹ or technically whatever your ELF entry point is, _start in crt0 or your poison of choice.
Once you include the footnote, at least on Linux/macOS (not sure about Windows), you could take the same perspective with regard to envp and the auxiliary array. It's libc that decided to store a pointer to these before calling your `main`, not the ABI. At the time of the ELF entry point these are all effectively stack local variables.
The API is ugly, and since it needs CAP_SYS_RESOURCE many programs can't use it... but systemd does: https://github.com/systemd/systemd/blob/2635b5dc4a96157c2575...
This shouldn't cause the kind of race conditions we are talking about here, since it isn't changing a single arg, it is changing the whole argv all at once. However, the fact that PR_SET_MM_ARG_START/PR_SET_MM_ARG_END are two separate prctl syscalls potentially introduces a different race condition. If Linux would only provide a prctl to set both at once, that would fix that. The reason it was done this way is that the API was originally designed for checkpoint-restore, in which case the process will be effectively suspended while these calls are made.
Agreed, 150%. My comment had more to do with rejecting files than it did with embracing environ as a suitable alternative.
SQLite is likely the most trouble-free option at the moment.
With that being said, it would be nice to see Android's sys/system_properties.h ported to GNU/Linux proper and, from there, other Unixen.
> I would encourage you to play around with ghidra, just to see what return oriented programming and ARM limits does.
Having worked professionally in reverse engineering at DoD, I can assure you that this is something I'm intimately familiar with.
Files and environ are bad.
(Ed.: the man page should say "you are required to take a shower after writing code that uses setenv(), both to get off the dirt, but also to give you time to think about what you are doing" :D)
Thing is, the (history of the) UNIX APIs - call'em "libc" if you like - is littered with the undead corpses of horrible ideas. Who thought that having global file write offsets was great? Append-only writes? Global working directories? The ability to write the password db via putpwent()? Modifying your own envp or argv? Why have a horribly-scaling hack like fcntl-based file locking even in the standard?
"Today", were one to start from scratch, the userspace API of even unix-ish operating systems would be done much differently. After all, systems designers and implementors are intelligent people and learn, and there's 50y+ of history to learn from. But the warts are there, and sometimes one has to "program around" them.
It is kind of ironic how so many stick with UNIX and C ideas as religious ideals of ultimate OS and systems programming design, while the authors moved on to creating Plan 9 and Inferno, Alef and Limbo.
And I still don't understand why processes "modifying their own envp or argv" are met with such revulsion in this comment thread, except for the "I dislike that on ideological grounds" reason. Now, the ability to modify envp and/or argv of other processes while those are running, yes, that's a horrible idea. But modifying your own internal process state?
Oh, and fcntl file locks are horrible for historical reasons: basically, when POSIX (or its predecessor?) was trying to decide on a portable interface, the representative of one of the vendors cobbled together this API and its implementation in a week or two, and then showed up to the meeting with it. To his surprise, instead of arguing, everyone else basically said "eh, looks fine", and that was it; we now have broken "why on earth does close()/fork()/exec() interact with locks like that" behaviour.
In Java you’d have the static initializers run before the main method starts. And in some languages that spreads to the imports which is usually where you get into these chicken and egg problems.
One of the solutions here is make the entry point small, and make 100% of bootstrapping explicit.
Which is to say: move everything into the main method.
I’ve seen that work. On the last project it got a little big, and I went in to straighten out some bits and reduce it. But at the end anyone could read for themselves the initialization sequence, without needing any esoteric knowledge.
You can add a `premain` function that calls `main` and set it as the entry point, or you can implement the pre-start logic in main and call the main loop later.
This is how any sane program is written anyway: set up environment -> continue with business logic
Maybe it's possible, but if I need to review every library (and hope they don't break my assumptions later), I think I've lost at building this separation in a practical way.
This (obviously?) isn't "110%" perfect as the order of the constructor calls for several such objects may not be well-defined, and were they to create threads (who am I to suggest being reasonable ...) you end up with chicken-egg situations again.
There was one place and only one place where we violated that, and it was in code I worked on. It was a low level module used everywhere else for bootstrapping, and so we collectively decided to do something sneaky in order to avoid making the entire code base async.
And while I find that most of the time people can handle making one special case for a rule, it was a complicated system and even “we” screwed it up occasionally for a good long while.
The problem was we needed to make a consul call at startup and the library didn’t have a synchronous way to make that call. So all bootstrapping code had to call a function and await it, before loading other things that used that module. At the end we had about a dozen entry points (services, dev and diagnostic tools). And I always got blamed because nobody seemed to remember we decided this together.
I hate singletons. And I ended up with one of only two in the whole project, and that hatred still wasn’t enough to prevent hitting the classical problems with singletons.
They might do it for testing purposes.
If you're only reading environment variables you have no problem, though. It's only if you try to change them that it causes issues.
For setting, "only set environment variables in the Bash script that starts your program" might be a good rule.
(the argument that the horses have long bolted with respect to "just do the right thing ok?!" here holds some water. I'm of the generation though where people on the internet could still tell each other they were wrong, and I assert that here; you're wrong if you believe a non-threadsafe unix interface is a bug. No matter what kind of restrictions around its use that means. You're still wrong if you assume the existence of such restrictions is a bug)
I ended up implying some extra support when all I meant was “one could”.