Dynamic Scoping in C++ (blog.dokucode.de)
What value does a forked thread get? The value at the dynamic scope of the parent at the point of thread creation.
What happens if a delimited continuation is invoked by a different thread compared to the one that created the continuation? If a parameterize call was made within the continuation's delimited extent, then it moves with the continuation. If not, it'll be in the executing thread. In either case the answer is consistent: The value within the dynamic extent of the continuation is used.
What happens to other threads if one overrides a parameter within its own dynamic scope? Nothing, threads don't have a dynamic scope relationship between them after thread creation.
FWIW the Racket parameters are inspired by:
"Processes vs. User-Level Threads in SCSH" by Martin Gasbichler and Michael Sperber
https://www.researchgate.net/publication/2546137_Processes_v...
In this implementation, the threads will corrupt the data structure and result in undefined behavior.
This isn't a latte; this is just an espresso with steamed milk added to it. :)
What you describe basically is how dynamic scoping is mechanically implemented under the hood.
It's certainly how that's commonly emulated but that can leak out e.g. CPython uses threadlocals for decimal contexts, but if you set a localcontext in a coroutine / generator and suspend that, the information leaks out.
I assume the same happens with gevent unless you `patch_thread()`, and even then that assumes `decimal` always deref's threadlocals from the python-level module rather than statically resolve them.
I don't know what you think dynamic scoping is? Because 'global variable with a stack of values' is what it is.
You really are confusing the implementation of dynamic scoping with what dynamic scoping is: the entire point of having that term at all is to describe how the variables are scoped not how they are set. If you have to type the name of the variable into the global scope, then obviously it isn't dynamically scoped.
Thread local storage does not make it absolutely re-entrant.
We could move the True Scotsman goalposts even farther out and say that we appreciate the syntax being fine, but your approach doesn't work with interrupt handlers.
I mentioned thread-local storage just because if you are going to develop this you should take that into consideration, as that's a common thing that will burn a lot of people; it was an unrelated code quality point for something you should do if you are going to do this kind of global variable stack thing. You could though use it to build something that was actually scoped by having a generic global dictionary and then keeping the names inside of the functions; at least then you are providing the core base noun of "dynamic scoping".
(And of course, as someone who has spent all of their time programming in C++ coroutines for over a year now, I am well aware that the thread local storage isn't sufficient to make this trick work correctly in every case.)
With coroutines, implementing dynamic scope becomes a lot more interesting, because switching to different coroutines requires switching which dynamic bindings are active.
The correct implementation is somewhat subtle and not immediately obvious if you haven't thought about it a lot. http://okmij.org/ftp/papers/DDBinding.pdf lays it out formally, but in the end the correct implementation is for each coroutine to have its own stack of dynamic bindings, and when you resume a coroutine in some context, you extend the bindings in that context with the coroutine's set of bindings while the coroutine is running, and remove those bindings again when the coroutine is done running. This preserves the intuitive behavior that one expects from dynamic scope - see the paper for more justification.
Others have got this wrong too, so you're in good company. Python, for example, added contextvars with https://www.python.org/dev/peps/pep-0567/, which have semantics which are usually identical to dynamic scope. But they chose an excessively-simple implementation, so the behavior diverges from proper dynamic scope when using coroutines in unusual ways, or using generators at all: https://www.python.org/dev/peps/pep-0568/
Sure, it allows for neat tricks, I suppose. It mostly allows impossible to diagnose error conditions since what happened actually depends on anything that may have happened before, invisibly.
I find it particularly amusing since fighting off global state has been a worthy goal of languages, libraries and frameworks. Without it, you can say goodbye to reproducible behavior and multi-threading.
(If dynamic scoping is thread-local, you still have the issue that anything can affect anything else, so nothing can be assumed to be reentrant anymore.)
Practically, dynamic scoping is more confusing than context objects.
int main() {
    int x = 2;
    fn();
}
Does fn access or change x? You need to inspect the body of fn to know.
I would call dynamic scoping a poor form of coupling. Instead of bundling your coupling wires in a neat little set of in/out arguments and a return value (the format of which only needs the function’s declaration, not its definition), you are instead reaching out of and into the function’s body, like sprawling tendrils, as your function has free pickings of your variables.
It also strangely couples the names together. The outer function and the inner function may see the variable in completely different lights, yet dynamic scoping requires the outer use the name prescribed by the inner.
Optimization would be hard without WPO. You’d essentially need to keep a run-time “scope” object for every function. The author’s proposed design for dynamic scoping in C++ means you don’t need it for every function, but that design has its own issues: how would you optimize it? It would be a puzzling challenge.
Dynamic<int> foo; // define at global scope
{
    DynamicBind<int> foo; // re-bind dynamically
}
It used thread-local storage and all. The global constructor for the Dynamic<> template class would allocate the thread specific key. The DynamicBind<> template class did the saving, location altering, and restoring.
Ergo, dynamically-scoped variables are shadowable side-channels that can influence the behavior of a function
And I thought that is exactly why not to use dynamic scoping. It makes every function impure by default.
Regarding threads: It is correct that the current version of the template has a problem with multi-threaded programs. Since adding 'thread_local' to the global variable is sufficient to solve the problem, I did not mention this in the original post. I have now updated the blog post accordingly. Furthermore, I added a (run-time) check that ensures that you use DynamicScope<T> only with thread_local.
Regarding Lambdas: I don't think there is a problem here. Dynamically scoped variables promise to return the value that is most recently bound in the current execution context. As the resolution is done on dereferencing, this is the exact behavior that DynamicScope<T> provides. This means that a lambda does not (lexically) capture the value of the dynamically-scoped variable at definition time; it sees the value bound at the execution time of the lambda.
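To make the lambda point concrete, here is a minimal sketch. `Dynamic`, `DynamicBind`, `verbosity`, `lambda_demo` and `usage` are hypothetical names chosen for illustration; this is not the blog post's actual DynamicScope<T> code. The only assumption is that the binding is resolved when the variable is dereferenced, and that the variable is declared thread_local.

```cpp
#include <array>
#include <cassert>
#include <functional>

template <typename T> class DynamicBind;

// Hypothetical stand-in for a dynamically scoped variable: holds the current binding.
template <typename T>
class Dynamic {
public:
    explicit Dynamic(T initial) : value_(initial) {}
    // The binding is looked up here, at dereference time.
    const T& operator*() const { return value_; }
private:
    friend class DynamicBind<T>;
    T value_;
};

// RAII guard: installs a temporary binding and restores the old one on scope exit.
template <typename T>
class DynamicBind {
public:
    DynamicBind(Dynamic<T>& var, T temporary) : var_(var), saved_(var.value_) {
        var_.value_ = temporary;
    }
    ~DynamicBind() { var_.value_ = saved_; }
    DynamicBind(const DynamicBind&) = delete;
    DynamicBind& operator=(const DynamicBind&) = delete;
private:
    Dynamic<T>& var_;
    T saved_;
};

// thread_local gives every thread its own independent binding.
thread_local Dynamic<int> verbosity{0};

// Returns what the lambda observes before, inside, and after a rebinding.
std::array<int, 3> lambda_demo() {
    std::function<int()> read = [] { return *verbosity; };  // captures nothing
    std::array<int, 3> seen{};
    seen[0] = read();
    {
        DynamicBind<int> bind(verbosity, 3);
        seen[1] = read();  // same lambda, now sees the inner binding
    }
    seen[2] = read();
    return seen;
}

void usage() {
    const std::array<int, 3> seen = lambda_demo();
    assert(seen[0] == 0 && seen[1] == 3 && seen[2] == 0);
}
```

The lambda is defined before the rebinding yet observes it, because nothing is captured at definition time.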
Is there a simple real-world example that would explain when dynamic scoping would be better than some kind of access protocol to a shared value?
It's the same difference as using environment variables vs command line arguments. Imagine all programs having to pass TERM, DISPLAY, HOME, etc. as arguments in case some descendant process wants to use it and the user override. Like passing TERM, DISPLAY to git in case you want to override them for the configured pager or editor.
In other words, the issue is that when you have project A using project B using project C, project A has to manually carry around the context of B, and C in case the user wants to override them.
Interestingly this is exactly what I have been wishing was standard practice for a few years now! Otherwise you end up picking up implicit configuration from the environment that you didn't intend at all.
> In other words, the issue is that when you have project A using project B using project C, project A has to manually carry around the context of B, and C in case the user wants to override them.
Yes please! That makes the most sense to me. I admit my years of experience with Haskell may have coloured this opinion.
You would normally use dynamic scoping for certain global parameters that apply to a lot of operations.
Unix shell scripts provide something similar to dynamic scoping with environment variables: If you write a shell script that sets LD_LIBRARY_PATH or TMPDIR, then all programs invoked from that shell script will inherit the values. And if your shell script calls another shell script, then that shell script can again set environment variables, and those are visible until it returns.
I would say that environment variables have been a great success story, and folks aren't too confused.
One pattern that most languages don't support encapsulating is this: Say you have a() which calls b() which calls c() which calls d(). d() needs to get some data from a(). The typical way to handle that is by passing that data as parameters through b() and c(), but that couples those middle-level functions to a() and d(). Any time you change the data a() needs to get to d(), you have to touch b() and c() too.
You could wrap the data in some opaque "context" parameter and pass that through b() and c(). That's an OK solution and is pretty common. But a() still has to opt in to that pattern, which means b() and c() are still coupled to the choice to use any encapsulation at all.
Dynamic scoping is a solution to this. a() can bind a value to a dynamic variable and d() can access it without it having to pass through b() and c() at all. It essentially gives you a side channel for parameters.
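A rough C++ emulation of that side channel, as a sketch only: `current_locale` and `BindLocale` are made-up names, and a thread_local global plus an RAII guard stands in for a real dynamic variable. Note that b() and c() never mention the value.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical dynamically scoped value, emulated with a thread_local global.
thread_local std::string current_locale = "en";

// RAII guard: binds a new value for the duration of the enclosing scope.
class BindLocale {
public:
    explicit BindLocale(std::string value) : saved_(current_locale) {
        current_locale = std::move(value);
    }
    ~BindLocale() { current_locale = std::move(saved_); }
    BindLocale(const BindLocale&) = delete;
    BindLocale& operator=(const BindLocale&) = delete;
private:
    std::string saved_;
};

// d() reads the binding; b() and c() know nothing about it.
std::string d() { return "locale=" + current_locale; }
std::string c() { return d(); }
std::string b() { return c(); }

// a() binds the value for everything it calls, then the guard restores it.
std::string a() {
    BindLocale bind("de");
    return b();
}

void usage() {
    assert(a() == "locale=de");  // d() saw a()'s binding
    assert(d() == "locale=en");  // outside a(), the default is back
}
```

Changing what a() hands to d() touches only those two functions, which is the decoupling described above.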
A more concrete example is trees of UI components. Pretty often you have some big top level UI component that has a lot of application-level business state. Down in the leaves, you have UI components specific to that application that need that state and render it. But in between those you have a bunch of generic UI components like list views, frames, tabs views, radio button groups, that have nothing to do with your app and just visually arrange the UI.
You really don't want to make a new frame widget class every time you need to pass a bit of business data through it into the thing inside the frame. So instead, what a lot of UI frameworks do is support dependency injection. A widget at the root of the tree can provide an object of some type, which makes it implicitly available to all child widgets (transitively) of that widget. Children far down in the tree can request the object without the widgets along the way having to pass it along explicitly.
Dependency injection is essentially a re-invention of dynamic scoping.
Safely intercepting global IO.
The average language is never going to thread IO explicitly, so if your callee has not added explicit hook points then all you can do is try to swap out the relevant subsystem. But your average standard IO is usually not even thread-local, and when it is, that doesn't help when the language has sub-thread stack swapping (e.g. Python's generators will just suspend the relevant section of the stack entirely, so if you've updated a threadlocal in a coroutine it is not rolled back on suspension). Plus you still need to remember to properly clean up your threadlocals as they won't self-revert.
With dynamic variables you can just rebind the variable. Only stack frames following yours will see the update, other stacks will be unmolested and none the wiser regardless of your shenanigans.
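In C++ this can only be approximated, and only if the callee cooperates. The sketch below assumes library code writes through a hypothetical `current_out` pointer instead of using std::cout directly; `BindOutput`, `library_call` and `capture_library_output` are likewise invented for the example. Being thread_local, it also inherits the coroutine caveat mentioned above.

```cpp
#include <cassert>
#include <iostream>
#include <ostream>
#include <sstream>
#include <string>

// Assumption: all output of the library goes through this per-thread pointer.
thread_local std::ostream* current_out = &std::cout;

// RAII guard: redirects output for the enclosing scope and restores it afterwards.
class BindOutput {
public:
    explicit BindOutput(std::ostream& target) : saved_(current_out) {
        current_out = &target;
    }
    ~BindOutput() { current_out = saved_; }
    BindOutput(const BindOutput&) = delete;
    BindOutput& operator=(const BindOutput&) = delete;
private:
    std::ostream* saved_;
};

// Stand-in for code we do not control beyond the shared output variable.
void library_call() { *current_out << "progress: 50%\n"; }

// Captures whatever the callee prints while the rebinding is active.
std::string capture_library_output() {
    std::ostringstream buffer;
    {
        BindOutput bind(buffer);
        library_call();
    }
    return buffer.str();
}

void usage() {
    assert(capture_library_output() == "progress: 50%\n");
    assert(current_out == &std::cout);  // the guard reverted the binding
}
```

The guard's destructor takes care of the "won't self-revert" problem, but only for ordinary call-and-return control flow.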
Here's one case where I wished I had it. I was writing a tool (in Python) for some scientific task. Parse a file, do some calculations, call out to some library to do more calculations. The whole thing was pretty complex, but cleanly architectured. But then the requirements changed. I had to do something different in the innermost function, depending on the configuration. This was probably 6 layers of functions deep.
Now I had two options: add another parameter to every function to carry my configuration variable, or put everything in a class and use a member field. I couldn't use global variables, since I was doing many of these calculations concurrently. And I didn't want to add new parameters, since it clutters the code, and it mixes different levels of abstraction. Most of the intermediate functions don't care about what's going on at the lowest level. Yes it changes their output, so from strictly "functional" best practices I should string along a parameter. But it felt wrong anyway. So what I did was cram everything in a class and call it a day.
With dynamic scoping, I could have put the configuration in a dynamically scoped variable.
Ideally, there would be a way to specify that a function takes dynamic scope. Then tooling would understand that all the intermediate functions have a controlled amount of impurity. In pseudocode:
MyResult myCalculation(float mass, float energy) (dynamic string extratext) {
    // do the calculation and add extra text to the result
}
// way up the call stack:
using dynamic extratext = "Preliminary, do not publish" {
    calculateAllTheThings();
}
I know this goes against the current trends (make functions pure if possible, avoid mutable state, think a certain way about data flow...). But in practice, those trends sometimes work fine, and sometimes produce convoluted code. In some cases, I find it easier to produce code that looks clean and functional from a domain logic POV, and add stuff that is orthogonal to it (logging, presentation, ...) via a different mechanism.
What do you think the call stack is?
The only difference between what you're thinking of and what I'm thinking of is you're row-based and I'm columnar-based. Why is that such an important difference?
That would work with proper threadlocals. However with dynamic variables you'd also get this for different sub-thread stacks. Not all languages can expose that but languages with generators do e.g. in Python if you `yield` then the stack is suspended, this means a threadlocal you've changed would not be rolled back, but a dynamic variable (which Python does not have) would.
> besides the thread issue you mentioned
Sort of like how you can do object-oriented programming in C by making a struct of function pointers to create your own v-tables.
You mean that when you call e.g. `tmux` you want to be forced to call it as something like `tmux -e HOME=... -e TERM=... -e DISPLAY=... -e USER=... ...`, and likewise for basically all other programs?
> I admit my years of experience with Haskell may have coloured this opinion.
Reader in Haskell helps prevent having to do that similar to environment variables or dynamic scoping, and is really, really common to use it as such instead of having to pass arguments around, so...
Reader in Haskell addresses some of the same concerns, but it differs in important ways. Reader is statically determined, Reader is clearly scoped to particular parts of your program (you can see whether a given function's behavior might depend on a value from a Reader in a way that's not visible with dynamic scope or environment variables), and Reader forces you to have set everything you might access. You could work around that last by making it a `Reader (Map String Dynamic)` or something, but that's not common in Haskell.
I make no claim, here, that these are obviously the right decisions - but they seem to provide enough differentiation that someone could rationally prefer one or the other.
> Reader in Haskell helps prevent having to do that similar to environment variables or dynamic scoping
I would be happy with an interface like Reader in Haskell but I don't see that it's much like dynamic scope. The subject has been explored a few times in this discussion.
To me it seems like dynamic scoping is very similar, or at least related, to dependency injection, which seems to be one of those things that has really polarized developers' opinions. Some people love it because the code looks really neat and it is really easy to add new pieces. Others hate it because it is really hard to understand what is happening under the hood.
I would say that dynamic scoping is a bit worse (in my opinion), in that it is like dependency injection whose bindings are expected to be modified one or more times along the way, so it becomes almost impossible to figure out why a value is what it is when something goes wrong.
If you're in CL however, you can just rebind standard-output and you'll capture what anyone downstack sends there.
It will work if there is a spaghetti stack. So that is to say, each thread extends the dynamic environment with a newly allocated frame that points upward to the parent environment.
Each thread needs a thread-specific pointer to the top of its own dynamic environment chain.
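A small sketch of that arrangement, with all names (`Frame`, `Env`, `env_top`, `Bind`, `lookup`, `spawn_inheriting`, `demo`) invented for illustration. Frames are immutable and shared via std::shared_ptr, so a child thread can safely keep pointing at its parent's chain; the child starts from the parent's environment as it was at thread creation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <thread>
#include <utility>

// One binding in the environment chain; points upward to the enclosing frame.
struct Frame {
    std::string name;
    int value;
    std::shared_ptr<const Frame> parent;
};
using Env = std::shared_ptr<const Frame>;

// Thread-specific pointer to the top of this thread's chain.
thread_local Env env_top;

// Walks from the innermost frame toward the root.
int lookup(const std::string& name, int fallback) {
    for (const Frame* f = env_top.get(); f != nullptr; f = f->parent.get()) {
        if (f->name == name) return f->value;
    }
    return fallback;
}

// RAII guard: pushes a frame on construction, pops it on destruction.
class Bind {
public:
    Bind(std::string name, int value) : saved_(env_top) {
        env_top = std::make_shared<Frame>(Frame{std::move(name), value, saved_});
    }
    ~Bind() { env_top = saved_; }
    Bind(const Bind&) = delete;
    Bind& operator=(const Bind&) = delete;
private:
    Env saved_;
};

// Starts a thread whose chain begins at the creator's current top frame.
template <typename Fn>
std::thread spawn_inheriting(Fn fn) {
    Env parent_env = env_top;  // snapshot at thread creation
    return std::thread([parent_env, fn]() mutable {
        env_top = parent_env;
        fn();
    });
}

struct Result { int child_sees; int child_after_rebind; int parent_sees; };

Result demo() {
    Result r{};
    Bind outer("depth", 1);
    std::thread t = spawn_inheriting([&r] {
        r.child_sees = lookup("depth", 0);          // inherited from the parent
        Bind inner("depth", 2);                     // extends only the child's chain
        r.child_after_rebind = lookup("depth", 0);
    });
    t.join();
    r.parent_sees = lookup("depth", 0);             // unaffected by the child
    return r;
}

void usage() {
    const Result r = demo();
    assert(r.child_sees == 1 && r.child_after_rebind == 2 && r.parent_sees == 1);
}
```

Because frames are never mutated after creation, the only per-thread mutable state is the top pointer itself.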
'Dynamic binding' is an abstract concept. The abstract concept is a global variable with a stack of values. How you implement it, and whether you share the call stack instead of using a separate stack, is up to you. It's still the same dynamic scoping.
The difference with statically-scoped global variables is that you can't set them to different things in different threads, and you have to be careful to set them back to their original value after you have called the code you wanted.
In other words, it's the same difference as using environment variables vs configuration files for executables.
I'm having trouble making sense of your first paragraph. Part of the point of local variables is that they're local. It doesn't matter whether the function you're calling uses them or not. If it uses something with the same name, it's their own copy, and you don't care whether they do or don't. If they want yours, you pass it in. What am I missing?
Because global variables aren't used the same between the two types of languages. In lexically-scoped languages, it's bad practice overall to use global variables for anything except constants. In dynamically-scoped languages, they're used as an environment just like how environment variables are used among processes.
You asked why not use "normal" (statically-scoped) global variables, and I replied on the difference. That doesn't mean that I support using them like that in such languages.
Imagine environment variables didn't exist. A `su` command that modified the home directory in /etc/passwd for the duration of its subprocess, and changed it back when the subprocess dies, would also seem pretty ugly to me. Indeed, if a language lacks dynamic scoping, or if environment variables didn't exist, the proper practice would be to pass the whole environment explicitly as arguments. That's what's typically done in statically-scoped languages, but it has the caveats I mentioned in this other comment:
https://news.ycombinator.com/item?id=24545180
> I'm having trouble making sense of your first paragraph.
Here's an example using Elisp:
; Turning off static-scoping
(setq lexical-binding nil)
;; Bad practice
(defun foo ()
  bar)
(foo)
;=> Debugger entered--Lisp error: (void-variable bar)
(let ((bar 3))
  (foo))
;=> 3
;; Good practice
(defvar bar 2)
(foo)
;=> 2
(let ((bar 3))
  (foo))
;=> 3
Dynamically-scoped variables (more precisely: variables with global scope and dynamic extent) are one such wonderful thing. In languages which offer those, one might use one in a function to hold state which isn't related to the primary task of that function, e.g. a stream reference to send debug information to. Sure, one could pass that to the function as one of its arguments, but doing so eventually makes the interface unwieldy, and carrying such state from function to function becomes a hassle. Alternatively, one could keep such state as a class variable, but if it isn't directly related to that class, or to any single class, that wouldn't be an easily comprehensible design either. Keeping such state in a plain global variable (with indefinite extent) restricts it to being a single value for all consumers.
There's always more than one way to solve a technical problem, but dynamically scoped variables are sometimes the easiest, most straight-forward and most flexible way to do so.
And sometimes you need to set up a derived context for one function call, but continue to use the previous one in other calls:
func Foo(ctx context.Context, ...) {
    Bar1(ctx, ...)
    derived := makeDerivedContext(ctx)
    Bar2(derived, ...)
    Bar3(ctx, ...)
    // ...
}
It's not hard, just extremely tedious, especially if the context is mostly used in the leaf functions and most of the rest of your code simply passes it around.
That's where dynamic scoping comes in.
> Is there any way in which doing so via "dynamic scoping" makes it any less of a horrible idea?
In both types of languages it's a horrible thing to set a global variable, because the modification is visible to all code that runs later. If you want to know what value a global variable would have in a function, you'd need to be aware of all code that executed before.
In lexically-scoped languages, you can only read or set a variable. If you forbid setting it, you can only read it. In dynamically-scoped languages you can not only read or set a variable, you can also shadow it. Shadowing is not as bad as setting, because the "modification" is scoped to a call. Because of that, it doesn't undergo changes across functions that don't have a caller/callee relationship (e.g. cousin functions in a call-graph), and I believe that's where the real devil in modifying global variables lies, in reading a variable that could have been modified by functions that don't have a clear relationship with the current one.
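The set-versus-shadow distinction can be illustrated with a small C++ emulation. Everything here (`precision`, `ShadowPrecision`, `report`, `cousin_sets`, `cousin_shadows`) is a hypothetical name, and a thread_local global with an RAII guard is only an approximation of a real dynamic variable.

```cpp
#include <cassert>

// Hypothetical global setting; thread_local so each thread has its own value.
thread_local int precision = 2;

// Shadowing: the new value lives only as long as this guard is in scope.
class ShadowPrecision {
public:
    explicit ShadowPrecision(int value) : saved_(precision) { precision = value; }
    ~ShadowPrecision() { precision = saved_; }
    ShadowPrecision(const ShadowPrecision&) = delete;
    ShadowPrecision& operator=(const ShadowPrecision&) = delete;
private:
    int saved_;
};

// Some unrelated reader of the setting.
int report() { return precision; }

// Shadows the setting for its own callees only.
int cousin_shadows() {
    ShadowPrecision shadow(10);
    return report();
}

// Plain assignment: the change outlives the call.
void cousin_sets() { precision = 10; }

void usage() {
    assert(cousin_shadows() == 10);  // visible to the callee
    assert(report() == 2);           // invisible to a later "cousin" call
    cousin_sets();
    assert(report() == 10);          // the plain set leaked to later code
    precision = 2;                   // manual cleanup that shadowing makes unnecessary
}
```

Only the plain assignment makes report() depend on which unrelated code happened to run earlier.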
Having said that, shadowing has its place. It certainly can be misused like any other feature, but I'm just answering why it's less of a horrible idea, as you put it. It does have a benefit with no equal, though. Not even Haskell's Reader monad, because of its statically typed nature, offers exactly the same ability to modify the behavior of any callee-descendant function in ways unforeseen by the author of the function you're calling.
It occurs to me that this is also similar to the ability in OOP to inherit from a class or module and override one of its methods, only that the change, instead of being scoped to the callee-descendants, is scoped to the other methods (and their callers) that use the overridden method.
Reader is like a middle-ground between dynamic-scoping and explicit passing of a context argument around as you would in a statically-scoped language.
>Reader is like a middle-ground between dynamic-scoping and explicit passing of a context argument around
I just wanted to point out that the-thing-which-Reader-is, is sometimes called "implicit parameters", and there are papers written about it.