The Problem with Implicit Scoping in CoffeeScript(lucumr.pocoo.org) |
"""
Sorry, folks, but I'm afraid I disagree completely with this line of reasoning -- let me explain why:
Making assignment and declaration two different "things" is a huge mistake. It leads to the unexpected global problem in JavaScript, makes your code more verbose, is a huge source of confusion for beginners who don't understand well what the difference is, and is completely unnecessary in a language. As an existence proof, Ruby gets along just fine without it.
However, if you're not used to having a language without declarations, it seems scary, for the reasons outlined above: "what if someone uses my variable at the top of the file?". In reality, it's not a problem. Only the local variables in the current file can possibly be in scope, and well-factored code has very few variables in the top-level scope -- and they're all things like namespaces and class names, nothing that risks a clash.
And if they do clash, shadowing the variable is the wrong answer. It completely prevents you from making use of the original value for the remainder of the current scope. Shadowing doesn't fit well in languages with closures-by-default ... if you've closed over that variable, then you should always be able to refer to it.
The real solution to this is to keep your top-level scopes clean, and be aware of what's in your lexical scope. If you're creating a variable that's actually a different thing, you should give it a different name.
Closing as a wontfix, but this conversation is good to have on the record.
"""
Missing that Ruby stops scoping variables at a method and uses separate lexical scoping rules for constants thereby avoiding this issue mostly.
class Foo
  def bar
    "bar"
  end
  def baz
    bar = "foo"
    puts bar
  end
  def bang
    puts bar
  end
end
foo = Foo.new
foo.baz # => "foo"
foo.bang # => "bar"
Unless you happen to embrace JavaScript's functional side and write lots of top level helper functions (in a closure of course after which you export)
shadowing the variable is the wrong answer. It completely prevents you from making use of the original value for the remainder of the current scope.
This smells of static enforcement - strange for a language expounding JavaScript's dynamic nature and an odd departure from "it's just JavaScript".
Shadowing doesn't fit well in languages with closures-by-default
Not sure what evidence this is based on given the heap of great languages with closures-by-default that give programmers more control over scope without introducing goofy constructs like special assignment operators or global/nonlocal keywords.
CoffeeScript breaks the one real form of encapsulation (which includes the power of naming) that JavaScript has - function locals.
In addition, to repeat myself elsewhere in this thread, the goals here are conceptual simplification and readability, not giving the programmer more control over scope. The final result is that hopefully:
someVariable
... more code here ...
someVariable
... more code here ...
someVariable
... more code here ...
someVariable
... in the above code, you can know that "someVariable" always refers to the same thing. With "var", the above code could allow "someVariable" to refer to three different things, each for slightly different sections of the above chunk of code.
If you really want three different values, use three different names. In all cases, it will read better than shadowing would have.
That's totally false.
In languages like Scheme, O'Caml, etc you are never prevented from using the original value.
The point is that lexical, aka static scope is all about lexical pieces of code that you fully control, and all their properties are statically apparent, by looking at a single piece of code.
Scheme:
(DEFINE FOO 1)
(LET ((FOO 2)) ... FOO IS SHADOWED HERE ...)
Inside the LET, FOO is shadowed (FOO is "your FOO") and that's lexically, statically apparent by looking at the piece of code.
If you don't want it shadowed, and use the global FOO, you just use another variable name. You cannot be prevented from using the original global FOO, because you choose the local variable names you use in a piece of code.
Unless you're using a system in which non-hygienic macros are present and they expand into it...
Javascript seems to have the model "everything you do is global, unless you know better".
I'm not a big Javascript hater, but this is one very sore point.
Although the following examples could be made unambiguous with parentheses, they demonstrate how a trivially overlooked ending delimiter further complicates the language. Not only is the intent of the CoffeeScript code unclear in the examples below, but a slight variation in the CoffeeScript produces radically different output. The differences are so small that someone could easily introduce them accidentally while editing. Anonymous function passing and function calling in JavaScript require no additional wrappers or edits, while in CoffeeScript you must add clarity for the special case.
http://img542.imageshack.us/img542/7379/coffeescripttojavasc...
How arrogant! You'd think he'd step back for a second and consider the suggestion, but it sounds like he's on autopilot.
Pretend like you're a beginner, learning this stuff for the first time. If everywhere you see a variable "A", within a certain lexical scope, it means the same thing ... that's much simpler to understand than if "A" means three different things at three different places, because you happened to shadow it twice.
Either way, I think it is what it is and the benefits of CS very much outweigh the cons. Thanks for the feedback
I didn't want to change the default semantics, but I wanted to have a way for the programmer to be safe if they wanted to, so I created the `using` keyword for function declarations.
You explicitly declare what you intend to overwrite in the lexical scope, including overwriting nothing at all with `using nil`.
http://www.rubyist.net/~matz/slides/rc2003/mgp00010.html
Source: https://github.com/jashkenas/coffee-script/issues/712#issuec...
https://github.com/jashkenas/coffee-script/issues/712#issuec...
https://github.com/jashkenas/coffee-script/issues/712#issuec...
It's open source. Why not fork it and get some like minded coders to change it with you?
If you need global variables, it's sensible to just adopt a simple naming convention, like prepending g_ (or whatever pleases you) to all your variables. I already did that with plain JS and it's well worth the "effort".
Here is an example of what I'm talking about:
if isAir cx, cy, cz + 1 then addPlane('near', block)
Should be:
if isAir cx, cy, cz + 1 then addPlane 'near', block
Personally, I use them everywhere because I like having the stronger visual clue that this is a method I'm calling. I think making them optional in CS was a bad idea.
if isAir(cx, cy, cz + 1) then addPlane('near', block)
imho, so much more readable.
Once you understand the reach of CoffeeScript's top-level variables, it is easy to write bug-free code. Since you know that top-level variables have wide scope, you simply need to be judicious about putting variables at top-level scope. If a variable is not needed at top level scope, don't put it there.
Obviously, plenty of folks managed to write bug-free code in Python before "nonlocal" was invented. I'm not saying it's a bad idea, but you can avoid bugs without it.
((a = 5, b = 6, log = x -> console.log x) -> log a + b)()
I disagree. The simple solution to this is to write tests.
That's a step backwards. Code error checking should be done as early as possible. In order of earliness:
* Typing in the code. (Ideal: it is clear from the syntax that the code performs X instead of Y.)
* Compiling. (Strong type checking ensures you cannot return 5.3e7 or null from GetHostName.)
* Running the code at all. (Code contracts and assertions trigger if GetHostName returns "".)
* Automated unit tests. (Check that DB.GetHostName() returns the same string given to DB.Connect().)
* Automated integration tests. (Check that the DB module can connect to and retrieve useful data from a dummy database.)
* QA ("Hey Joe, the system hangs when I give "¤;\@" as my username and press the connect button rapidly for a few seconds.")
* Customer ("Hi the system has a problem, please fix.")
The further up, the faster, more accurately and with less "noise" the error can be discovered.
top_level_variable = null
f = ->
  top_level_variable = "hello"
f()
console.log top_level_variable # prints hello
See: https://github.com/jashkenas/coffee-script/issues/712 https://github.com/jashkenas/coffee-script/issues/238
Whether he's right is another question. But if you don't like his decision you can use the Coco (https://github.com/satyr/coco) fork which fixes it by introducing := for nonlocal assignment.
It shouldn't take a lot of thought to see why this is a somewhat user-hostile, passive-aggressive approach for a language. If something is ill-advised, the language should actively steer you away from it, not dissuade you with subtle bugs a few thousand lines of code down the road.
I think this scheme would be more workable if sigils or similar conventions of some kind were mandatory for top-level symbols. Then it would be much harder to accidentally wander into this problem. The language he seems to be drawing inspiration from, Ruby, does do this.
I really don't understand why this isn't being fixed: doesn't global by default break encapsulation? I'm probably missing something, but this is the main reason I haven't tried coffeescript yet.
how_many_times_functions_have_been_called = 0
f1 = ->
  how_many_times_functions_have_been_called += 1 # refers to top-level scope
  console.log x # undefined
  x = 1
  console.log x # 1
f2 = ->
  f1() # refers to f1 at top-level scope
  how_many_times_functions_have_been_called += 1 # refers to top-level scope
  console.log x # undefined
  x = 2
  console.log x # 2
f1() # you can call f1, it's at top-level scope
f2() # you can call f2, it's at top-level scope
console.log how_many_times_functions_have_been_called # 3, refers to top-level scope
console.log x? # false, x does not exist at top-level scope
There are lots of people throwing in their $0.02 on how the language should work without having joined the mailing list or seen any discussions on the thought process behind its features.
I'm not saying the suggestion made isn't reasonable, but I can understand glib replies like this from the author that don't make too much effort to explain his stance more than 140 characters.
There are dozens of lengthy conversations about this in the CoffeeScript issues, if you'd like to take a deeper look.
Here's how he introduced CoffeeScript to HN almost exactly two years from today: http://news.ycombinator.com/item?id=1014080
Poppycock. It doesn't get simpler than
a. newly bound variables are new
b. any variable which wasn't bound in the present scope must be bound in an enclosing scope.
If your language makes it much more complicated than First Order Logic (http://cnx.org/content/m12081/latest/), you're doing it wrong.
Of course, it's kind of socially acceptable to get wrong because a lot of language designers didn't think it through and used the same operator for both binding and reassigning a variable (i.e. '=').
foo = ->
  bar = "woot!"
  console.log bar
This compiles to:
var foo;
foo = function() {
  var bar;
  bar = "woot!";
  return console.log(bar);
};
bar is locally scoped to foo(). Now, 2 weeks later and 200 lines earlier, you come along and define:
bar = ->
  alert "Holy crap cheese is awesome!"
Which compiles to:
var bar, foo;
bar = function() {
  return alert("Holy crap cheese is awesome!");
};
foo = function() {
  bar = "woot!";
  return console.log(bar);
};
Now, all of a sudden, the "bar" reference in foo isn't scoped to foo() anymore, it's scoped globally, and once you invoke foo(), it'll replace the function bar with a string, potentially breaking your app. It's an ease-of-maintenance issue.
This isn't consistent behavior, though. If you define your top-level bar() function after foo, like so:
foo = ->
  bar = "woot!"
  console.log bar
bar = ->
  alert "Holy crap cheese is awesome!"
Then you get "correct" scoping (and the outer bar is shadowed):
var bar, foo;
foo = function() {
  var bar;
  bar = "woot!";
  return console.log(bar);
};
bar = function() {
  return alert("Holy crap cheese is awesome!");
};
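For anyone who wants to poke at the order dependence without a CoffeeScript compiler handy, here is a rough plain-JavaScript sketch of the two compiled shapes shown above. The wrapper names makeClobbering and makeShadowing are made up for illustration, and alert is swapped for a plain return value so it runs outside a browser.

```javascript
// Shape 1: the outer `bar` already exists when `foo` is compiled, so no
// inner `var bar` is emitted and the assignment reaches the outer binding.
function makeClobbering() {
  var bar, foo;
  bar = function () { return "outer bar"; };
  foo = function () {
    bar = "woot!";            // assigns to the enclosing bar
    return bar;
  };
  foo();
  return typeof bar;          // the function has been replaced by a string
}

// Shape 2: `foo` comes first, so it gets its own `var bar`.
function makeShadowing() {
  var bar, foo;
  foo = function () {
    var bar;                  // local declaration shadows the outer bar
    bar = "woot!";
    return bar;
  };
  bar = function () { return "outer bar"; };
  foo();
  return typeof bar;          // the outer function is untouched
}

console.log(makeClobbering()); // "string"
console.log(makeShadowing());  // "function"
```

The only difference between the two is the single `var bar;` line inside foo, which is exactly the line the compiler adds or omits depending on what it has already seen.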
On one hand, it could be argued that this is a "name things better" problem, but on the other, I have to agree that it'd be nice to be able to explicitly scope things when needed. Given that the behaviors are divergent based on what order the variables appear in, I'd say it's confusing enough that a way to explicitly say "hey, I know what I'm doing, I want to shadow any outer variables and declare local scope here" would be useful.
bar = ->
  alert "Holy crap cheese is awesome!"
foo = ->
  for bar of bars
    console.log bar
  return
↓
var bar, foo;
bar = function() {
  return alert("Holy crap cheese is awesome!");
};
foo = function() {
  var bar;
  for (bar in bars) {
    console.log(bar);
  }
};
Is there any way to make the scope explicit in coffeescript?
You are going against the consensus established by ALGOL and Scheme (and used by most languages in the functional camp since then) here. That's your prerogative of course, but personally I'd be wary of design choices that go against established wisdom.
With "var", the above code could allow "someVariable" to refer to three different things, each for slightly different sections of the above chunk of code.
Yes, but that's not an issue, as it is easy to check (by looking at the code section in isolation) which of the three bindings an identifier refers to - which is, for me, the definition of lexical, static scope.
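A small JavaScript sketch of that point, assuming nothing beyond plain `var` and function scope (the function names first, second and third are invented for the example): each use of someVariable can be resolved by reading only the function it sits in.

```javascript
var someVariable = "outer";

function first() {
  var someVariable = "first";   // new binding, visible only inside first()
  return someVariable;
}

function second() {
  var someVariable = "second";
  function inner() {
    return someVariable;        // resolves to second()'s binding
  }
  return inner();
}

function third() {
  return someVariable;          // no local declaration, so this is the outer one
}

var seen = [first(), second(), third(), someVariable];
console.log(seen); // [ 'first', 'second', 'outer', 'outer' ]
```

Whether that reads better or worse than three different names is the disagreement in this thread; the sketch only shows that the outer binding is never modified by the shadowing functions.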
doSomething(->
  # stuff happens
), onerror
In coffeescript you can do any of the following, or a few other variants:
doSomething ->
  stuff
, onerror

doSomething(->
  stuff
, onerror
)

doSomething (->
  stuff
), onerror

doSomething(
  ->
    stuff
  onerror
)

doSomething (-> stuff), onerror
The solution/workaround is "use descriptive names and avoid polluting scopes", but the reality of software development is that you're eventually going to cross those wires, and it's going to make you crazy until you figure out what happened.
This behaviour seems quite unreasonable to me, but I haven't been able to find explanations about it, other than it's expected behaviour.
CoffeeScript will automatically declare all variables in the nearest lexical scope it can find. The top-level scope in CoffeeScript isn't global -- it's the top of the file. You don't have to know anything about what values may or may not exist in global scope at any given moment ... all you have to know is what variables are visible in your function's enclosing scopes, just within the file you're working in.
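To make "the top of the file" concrete, here is a hedged JavaScript sketch of roughly what a compiled file looks like when the default function wrapper is in place. The names fileExports, counter and increment are hypothetical, and the real compiler output differs in detail.

```javascript
// Stand-in for one compiled file: everything lives inside a function
// wrapper, so `counter` is a local of the wrapper and not a true global.
var fileExports = (function () {
  var counter, increment;
  counter = 0;
  increment = function () {
    counter += 1;             // reaches the file-level binding, by design
    return counter;
  };
  return { increment: increment };
})();

fileExports.increment();
var afterTwo = fileExports.increment();
var leaked = typeof counter !== "undefined";

console.log(afterTwo); // 2
console.log(leaked);   // false
```

So a nested function can reassign anything declared at the top of its own file, but nothing it does by plain assignment escapes that wrapper.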
Thanks a lot for clarifying that, it doesn't look that bad this way.
Python has the inverse behavior of CoffeeScript. So that was never an issue.
CoffeeScript's scoping forces you to always keep track of whatever is enclosing the current scope ALL THE WAY TO THE TOP. This is way too much when your function doesn't need to access outer variables (which should be the minority of the cases).
So, problem is, you either make all your functions have non-free variables (but Jashkenas seems to dislike functions shadowing outer variables too, which is just... overtly weird), or you keep track of all variables above the current scope.
The former is not too unreasonable, until you remember it makes no sense with closures :3
Obviously it depends on programming style. In Coffeescript you don't have something akin to 'self' and so assigning to a non-local happens a lot.
class Foo
  bar: -> 'bar'
  baz: ->
    bar = "foo"
    console.log bar
  bang: ->
    console.log @bar()
foo = new Foo()
foo.baz() # => "foo"
foo.bang() # => "bar"
Here's a better example:
Ruby:
def bar
  "I called bar!"
end
def foo
  puts bar
  bar = "I manually assigned bar"
  return bar
end
puts foo()
puts bar()
# =>
I called bar!
I manually assigned bar
I called bar!
Coffeescript:
bar = ->
  "I called bar!"
foo = ->
  console.log bar()
  bar = "I manually assigned bar!"
  return bar
console.log foo()
console.log bar()
# =>
I called bar!
I manually assigned bar!
TypeError: string is not a function
Here is a slightly less contrived example:
log = (data) ->
  console.log data
enhance_logging = ->
  i = 0
  log = (data) ->
    i += 1
    console.log i, data
log "no line numbers"
enhance_logging()
log "line one"
log "line two"
Here is the output:
> coffee foo.coffee
no line numbers
1 'line one'
2 'line two'
Like everything in languages (or APIs), there's a tradeoff here. By making scoping automatic, and (hopefully) forbidding shadowing, you can make the language conceptually simpler. Think of it as making variables be "referentially transparent" in terms of their lexical scope. Everywhere you see "A" within a given lexical scope -- you know that "A" always refers to the same thing. In a language with "var" and with shadowing, "A" could mean many different things within any given lexical scope, and you have to hunt for the nearest declaration to tell which one it is.
On the downside, you have what Armin describes: If you happen to try to use the same name for two different things within the same lexical scope, it won't work.
Since it's always the case that you are able to choose a more descriptive name for your variable, and gain clearer code by it, I think it's very much a tradeoff worth making.
But perhaps I'm still not getting at the answer you're looking for here...
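For comparison, the logging example from earlier in the thread can be mirrored in plain JavaScript, where the presence or absence of `var` decides which behavior you get. This is only a sketch: runLoggingDemo and its declareLocally flag are invented for illustration, and output is collected in an array instead of going to console.log so the two runs can be compared.

```javascript
function runLoggingDemo(declareLocally) {
  var lines = [];                       // collected output
  var log = function (data) { lines.push(String(data)); };

  var enhanceLogging = declareLocally
    ? function () {
        var i = 0;
        var log = function (data) {     // explicit local: outer log untouched
          i += 1;
          lines.push(i + " " + data);
        };
        return log;
      }
    : function () {
        var i = 0;
        log = function (data) {         // no declaration: replaces the outer log
          i += 1;
          lines.push(i + " " + data);
        };
        return log;
      };

  log("no line numbers");
  enhanceLogging();
  log("line one");
  log("line two");
  return lines;
}

console.log(runLoggingDemo(false)); // [ 'no line numbers', '1 line one', '2 line two' ]
console.log(runLoggingDemo(true));  // [ 'no line numbers', 'line one', 'line two' ]
```

The first call reproduces the behavior described in the blog post; the second is what a programmer who intended a private helper would have expected.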
To give an example, many CoffeeScript programmers do follow simple naming conventions to call out top-level variables, such as CamelCase for classes or ALL_CAPS for constants. When you follow these conventions, it's pretty easy to avoid naming collisions.
In certain cases, though, you want a top-level variable to be lowercase, perhaps for stylistic reasons. If your files are relatively small, it's pretty easy to check for naming collisions when you introduce top-level variables after the fact, so a developer might decide that the risk is acceptable, especially if there is good test coverage.
Another kind of user hostility is to optimize for safety at all costs. I don't think any scoping mechanism totally eliminates the possibility of bugs, but some schemes do err on the side of safety over convenience. There's nothing wrong with trading off convenience for safety, but, on the other hand, you can make judgment calls that convenience and/or simplicity of the scoping model outweigh the risk of naming collisions.
Obviously, I like CoffeeScript, so I think Jeremy's made the correct tradeoffs. All languages work a little different--JavaScript, CoffeeScript, Python2, Python3, and Ruby all have different rules--and none of them are perfect in all situations. In all of the languages, though, it's reasonably straightforward to write correct code once you adopt general good practices--be careful with your names, and understand the language's approach to scoping.
Not to mention that loop counters are no different than other `var`iables in JS.
Why? Hackernews and reddit exist and everybody is free to send me a mail or contact me on twitter. This way I do not have to moderate any comments or deal with spam.
> I also wish that he could engage the broader CS community in a proper forum before dissing the language and/or creating a fork of the language.
And do what? Duplicating an issue that is already there? Commenting on a dead issue? The author has expressed his unwillingness to deal with this issue so why should I reopen the issue there?
> He's overreacting.
How am I? I wrote a very short blog post about why I think the scoping is bad and how it caused me problems. Many people asked me on Twitter why I think the scoping does not work as well as it should and since I only have 140 characters to explain stuff there I wrote it to my blog. How else should I communicate that?
> CS does have a mailing list, but most of the action happens via github issues.
There is an issue about this topic from a year ago which was closed and the author does not want this to be changed. I am okay with that, a language needs leadership. That does not mean however that other people should not know about this issue when they design the next programming language.
I wish that you would engage the CS community on this topic without the sole agenda of getting this fixed. It's true that the issue was put to rest a while ago, but the decision wasn't made in a vacuum--there was consensus involved. I think it's a little unfair to say that Jeremy "has expressed his unwillingness to deal with this issue"; that makes it sound like he was dismissing you or dismissing further debate on this, when in fact he just told you what had already been decided.
When I say you're overreacting on this issue, it's just my opinion, so don't get too worked up about it. You had a bug. "Log" means two things. IMHO you don't need to fork coffeescript; that would be a gross overreaction, but YMMV.
You are not doing anything wrong by taking the time to write up your opinion in a blog--if I implied that in any way, it was not out of malice; it was just imprecise writing on my part.
I'm going to keep using Coffeescript, I love it. Maybe this kind of scoping isn't the best decision for modularity and teamwork, but on reflection at least the rules are simple and consistent enough for a single programmer to keep in mind.
In addition, nothing in CoffeeScript is set in stone -- because every script compiled with every version of CoffeeScript is compatible with every other version, we're much more comfortable making changes to the language than we otherwise would be. If you can make the case that this change is a good idea, we'll definitely make it. So feel free to comment on the old issue or open a new one if you wish.
This attitude concerns me. I can't just say "oh, that's CoffeeScript 1.0 stuff, just trash it, the JS still works". I still have to update the CoffeeScript to upgrade the version. There's no less risk in breaking backwards compat with CoffeeScript than any other language.
> How else should I communicate that?
Who says you should?
Surprised at that reaction, he's had a problem and he's written an excellent blog post about it.
Personally, I would say that a truly talented programmer is simply someone who is very capable in mathematics and can produce a working and extendable program in a reasonable amount of time, that does something new and useful. This criterion alone leaves a ton of people out, you know!
Might have to do with Python itself as well: because it's function-scoped and it tends to avoid higher-order functions (in part due to the limitations of its anonymous functions), there are far fewer occasions to write to lexical closures than in Scheme, Smalltalk or Ruby.
That is not the JavaScript model at all.
Python went for a more complicated scheme of having every variable in a function be a new binding when first assigned, and then it's just mutation (i.e. 'local'), unless explicitly declared to be nonlocal, but it still makes sense with their... penchant for mutable state.
And CoffeeScript instead went over to PHP to have some of the glue it was eating.
If only it were that simple. It also has "you create thing with `var` and sometimes with `=`", plus it does not have block scope:
if (true)
{
  var x = 1;
}
// x == 1 here.
That's not what I'm criticizing. Lexical shadowing is not what I'm advocating. I think your choice is fine in so far as it goes. It has its logic.
What I am criticizing is how it can fail. The top scope is different, quantitatively and qualitatively, from almost all nested scopes. It's much larger, and spread lexically over a larger area. If you have a team of developers, it will be modified concurrently. No one developer necessarily knows the full set of symbols defined in the top scope while they are writing an individual procedure.
And thus the problem: a developer thinks they've chosen a "different (better) name" for some variable, but in fact they've chosen one that a different developer also thought was a "different (better) name", only one of them is in a lexically enclosing scope. This problem isn't likely to occur on the level of nested procedures or nested blocks, because the definitions would be visually close. But it's much more likely to happen when one of the symbols is defined in the top scope. Here, the definition could be many hundreds or thousands of lines away. It may even be in a separate commit, waiting to be merged, such that there's no way for either developer to know without closely reviewing every change.
And this is the criticism: the failure mode for this inadvertent reuse of a variable name is subtle bugs, as what one developer thought was a global symbol turns out to be modified and acquire strange values through unexpected codeflow, almost like the VM was corrupted and memory was behaving unreliably.
The qualitative difference of the top scope in situations like this is the reason why I suggested sigils or somesuch to disambiguate those scenarios. Perhaps top-level symbols can't be reassigned from nested scopes unless you use '$' as a prefix to their name; a visual shorthand that you are definitely not creating a new local symbol.
The reason I summarized your argument in the way I did is because your argument against this failure mode seems to be "don't create top scopes with lots of symbols". That's a fine argument (or rather, exhortation), but it isn't a realistic one. If the language is problematic with lots of symbols in the top scope, it should be unpleasant to use with lots of symbols in the top scope. And the unpleasantness shouldn't come from subtle bugs (the passive aggressiveness I mentioned); it should come from awkward and ugly sigils, or some other intrinsic way of discouraging those styles.
You should definitely go down this track. It would be a massive achievement.
If you want to have a constructive conversation, you could try dialing back the snark, and addressing my arguments directly.
I think the big problem with Coffeescript's behavior is that it can introduce some damn subtle bugs that can be really hard to track down if you don't know what you're looking for, because you're not able to explicitly specify scope semantics. It's even worse if you're polluting higher-scope variables of the same type, because it becomes even less obvious where the error comes from.
Coffeescript more or less shares Ruby's scoping rules, but there's a cultural difference between the Ruby and Javascript communities that makes it a little less workable in Javascript. Specifically, Ruby's "everything is an object", aggressive use of namespacing, and the general idiom that only constants go into the global namespace tends to limit scope issues that could arise from mix-ins.
Coffeescript does attempt to mimic this by providing class semantics and wrapping everything in anonymous functions to limit scope leak, but there's still a lot of temptation to just create a bunch of top-level functions, and that leads to situations like the one described in the blog post.
Just clarifying; are you actually suggesting this, or was that sarcasm?
browser.coffee:CoffeeScript = require './coffee-script'
coffee-script.coffee:{Lexer,RESERVED} = require './lexer'
coffee-script.coffee: {Module} = require 'module'
command.coffee:{EventEmitter} = require 'events'
grammar.coffee:{Parser} = require 'jison'
lexer.coffee:{Rewriter, INVERSES} = require './rewriter'
nodes.coffee:{Scope} = require './scope'
nodes.coffee:{RESERVED} = require './lexer'
repl.coffee:{Script} = require 'vm'
The common ground between math and programming is the requirement of a very good capacity to manipulate abstract concepts in a defined frame of known validity. But it pretty much stops here. In mathematics, you define all your abstract concepts and frame of validity; in programming, you are given a (very shaky and detailed) frame of validity on which you build up abstract concepts.
The ones who are good at approaching the discipline through the study of details, building up stuff that will work on top of it, will make the developers. The ones who need a strong and well-defined frame for their work, because it gives a much more powerful ground and makes it possible to reach very high levels of abstraction, will feel more comfortable in mathematics.
No wonder those who can combine both of those approaches can yield stunning results.
Sounds interesting, tell me more.
Personally, I would say that a truly talented programmer is simply someone who is very capable in mathematics and can produce a working and extendable program in a reasonable amount of time, that does something new and useful.
I would also say a criterion is that he really understands abstraction and how to get more from less. I.e., think of how John Carmack creates abstraction which is just right for the problem (and in C at that). Or think about the metalinguistic paradigm in programming (or OOP used right, for that matter). Or how jQuery (and these days Coffeescript) makes client-side web code clean and accessible to everyone, without the previous Javascript hacks and DOM-spaghetti. An average programmer just plods along and hacks out a solution in a linear manner, a great programmer will traverse levels of abstraction to not only solve the problem but also shine a light at it from a superior perspective.
This is also why I don't do conferences, I don't want to be known for going around promoting things (like crock), sure I'll blog about features added once in a while but other than that I want the projects to speak for themselves. Eventually if I can gain enough knowledge then sure being well known is neat since you can leverage it to hopefully expose better projects, but there's no end to what you can learn in this industry.