Diminishing returns of static typing (blog.merovius.de)
A type system doesn't only describe the behavior of the program you write. It also informs you of how to write a program that does what you want. That's why functional programming pairs so well with static typing, and in my opinion why typed functional languages are gaining more traction than lisp.
How many ways are there to do something in lisp? Pose a feature request to 10 lispers and they'll come back with 11 macros. God knows how those macros compose together. On the other hand, once you have a good abstraction in ML or Haskell it's probably adhering to some simple, composable idea which can be reused again and again. In lisp, it's not so easy.
A static type system that's typing an inexpressive programming construct is kind of a pain because it just gets in the way of whatever simple thing you're trying to do. A powerful programming construct without a type system is difficult to compose because the user will have to understand its dynamics with no help from the compiler and no logical framework in which to reason about the construct.
So, a static type system should be molded to fit the power of what it's typing.
The fact that every Go programmer I talk to has something to say about their company's boilerplate factory for getting around the lack of generics tells me something. This is only a matter of taste to a point. In mathematics there is a vast space of abstract concepts that could be studied, but very few are. That's because there's some difficult-to-grasp idea of what is good, natural mathematics. The same is true in programming: there is a panoply of programming constructs that could be devised, but only some of them are worth investigating. Furthermore, for every programming construct you can think of there's only going to be a relatively small set of natural type systems for it in the whole space of possible type systems.
Generics are a natural type system for interfaces. The idea that interfaces can be abstracted over certain constituents is powerful even if your compiler doesn't support it. If it doesn't, it just means that you have to write your own automated tools for working with generics. It's not pretty.
The catch there, as is often the case, is hidden in the word "good". Working with text data in Haskell is almost as painful as working with text data in C++, and for much the same reason: the original abstraction is far from ideal for most practical purposes, but became the least common denominator. Everyone and his brother has written a better string abstraction or more powerful regex library or whatever since then, but they're all different.
Consequently, even with the power of generics or typeclasses, you still often see developers just converting to and from the primitive default representation for interoperability. Static typing will at least stop you from screwing that up, which certainly is an advantage over dynamically-typed languages in some situations. However, it apparently hasn't made it any easier for the developer community as a whole to migrate to a better abstraction as the default.
In short, we often don't know what will turn out to be a good abstraction until we've gained a lot of experience, and in the face of changing requirements on most projects, we probably never can know from the start because what works as a useful abstraction might change over time. So while types are useful for checking whatever abstractions we have at any given time, until we've also got techniques for migrating from one to another much more smoothly and on much larger scales than anything I've yet encountered, I think we shouldn't oversell the benefits, particularly in terms of composability.
What a ridiculous stereotype. The Clojure community typically maintains that macros are a last resort, for things that genuinely justify them. You really shouldn't spread hyperbole like this.
There are languages that enforce termination. They only accept programs that can be shown to terminate through syntactic reasoning (e.g., when processing lists, you only recurse on the tail), or where you can prove termination by other means.
Coq is like this, as is Isabelle, as is F*, as are others. They also provide different kinds of escape hatches if you really want non-terminating things, like processing infinite streams.
This "we can never be sure of anything, because the halting problem" meme is getting boring. Yes, you cannot write the Collatz function in Coq. No, that is not a limitation in the real world.
There are two ways to see type systems. In the first, you construct terms along with their types; this is called Church style. In the second, the terms exist before their types and you use types to describe their behavior; this is called Curry style. Take System F in particular. In Church style the terms of System F come with their types. In Curry style we see System F types as a way to describe the behavior of untyped lambda terms.
I used to think Church style was more important but lately I've been more partial to Curry style. Programs exist before you type them, type systems tell you how they behave. They also tell you how to construct programs but this is subordinate to the more fundamental descriptive capacity.
How about, say, a video game? That's something where we reasonably _want_ it to not terminate, because we're primarily interested in its side effects.
Many. That's the whole point -- to let you choose "the way to do something" that applies the best to your circumstances (development time, performance, allowable complexity, etc.)
So you are limited by your own mind and skills -- not by the language.
To me, it's right tool for the right job. I have no problem spinning up a static language for performance and outsourcing the scripting to a dynamic language like Python for the best of both worlds in terms of speed, and rapid development.
That's not really true, just a belief. Let me give you an example to start understanding these things: the exact same program written in a very high-level and very expressive language like Perl, instead of Go, is going to have at least 3 times less code, and since defect rates per line of code are comparable, you would end up with at least 3 times fewer bugs. Suddenly the reliability argument for static typing doesn't make any sense. That's because in PL research there is a huge gap in understanding of how programmers actually think.
And I'm not sure you should expect the number of bugs per line to remain constant across languages. Extra lines required because you have to do your indexing by hand as you're iterating over a list certainly increase the chances of an error, but the extra '}' required to end the block in some languages increases line count with very little chance of causing an error.
I used to be in favour of dynamic typing, but I lean more and more towards static typing, as in OCaml.
Given a million line codebase written in Perl vs a three million line codebase written in Go, which do you think most engineers would prefer?
Alternatively, you just write a function to iterate your game world update N times, where you choose N large enough to ensure that your game can run until the heat death of the universe. That's not a nonterminating function, but it's long-running enough for all practical purposes.
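The bounded-loop trick described above can be sketched in a few lines. This is a minimal Python illustration, not code from any game engine: `run_bounded`, `update`, and `max_ticks` are hypothetical names, and Python itself does not check termination the way Coq does. The point is only that the loop is a plain counted iteration.

```python
from typing import Callable, TypeVar

State = TypeVar("State")


def run_bounded(update: Callable[[State], State], state: State, max_ticks: int) -> State:
    """Apply `update` exactly `max_ticks` times.

    Because this is a counted loop, it terminates whenever each call to
    `update` terminates, no matter how large `max_ticks` is.
    """
    for _ in range(max_ticks):
        state = update(state)
    return state


# A toy "world" that is just a tick counter (an assumption for the demo).
final_state = run_bounded(lambda tick: tick + 1, 0, 1_000)
```

In a real setting `max_ticks` would be chosen absurdly large (say 2**64), so the bound is never reached in practice while the function remains total on paper.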
I agree with this. But I think that usually we are not really talking about all programs. We are talking about useful programs, and those are usually terminating. In theory, not always, see first-order theorem provers; but in practice, we always call a prover with a time limit because nontermination isn't useful.
> Programs exist before you type them, type systems tell you how they behave. They also tell you how to construct programs but this is subordinate to the more fundamental descriptive capacity.
That's an interesting point, and I agree in many respects. I think when programming in a dynamically typed language, I approach things in one style, and in statically typed languages in another style. But specifically for termination, I don't think so. I never want to write a nonterminating program; the termination property for my programs exists (as a requirement) before the program does.
I love everything about Swift except the compile times and occasionally inscrutable compile error messages.
I love the interactivity of Javascript, but despise the lack of types, it's like I'm sketching out the idea for a program instead of directly defining what it is. And the lack of types burns me occasionally.
So the trade off is: static typing gives you more compile-time certainty, but at a cost of spending more time developing your code. Dynamic typing gets you to a working product or prototype typically much much faster, but with added run-time debugging.
Each has its benefits and costs.
In my experience, there is no doubt that dynamically typed languages are faster-to-production than statically-typed. This doesn't mean that I don't admire static typing, though, because most developers appreciate some degree of purity in their work.
https://github.com/fthomas/refined
not only for the static checking,
    scala> val i: Int Refined Positive = -5
    <console>:22: error: Predicate failed: (-5 > 0).
    val i: Int Refined Positive = -5
but the expressive descriptions of a domain model.
All static typing means is that type information exists at compile time. All dynamic typing means is that type information exists at runtime. You generally need _at least_ one of the two, and the benefits each gives you are partially hobbled by the drawbacks of the other, so most dynamic languages choose not to have static typing. I also feel that dynamic languages don't really lean into dynamic typing benefits, though, which is why this becomes more "static versus no static".
One example of leaning in: J allows for some absolutely crazy array transformations. I don't really see how it could be easily statically-typed without losing almost all of its benefits.
The key is balance. Pure static creates a lot of extra up-front cruft in exchange for long-term safety. Pure dynamic creates a much faster path to features at the expense of a lot of long-term confusion.
The reason we have this conversation is web applications, where everything travels over the wire as a string, is consumed by the web server as a string, is converted by whatever language the server is written in into something it can use, is validated 9 times out of 10 to make sure it reflects what we need, and is then stuffed into a database.
In the case that you're using a SQL database, a huge number of people are enforcing types at the database layer and the validation layer. Since so much is "consume and store" followed by "read and return" the types at that server layer end up creating a ton of extra work that in many cases shows little to no benefit.
At the point that you're doing more in the server layer, suddenly it becomes a lot more useful. At the point you're working on desktop, mobile, embedded, console, computational and graphics...static is going to provide more value.
At the point you're working on web in front of a database, the value is much more questionable.
This is really one of the reasons I'm such a huge Elixir fan because IMO it strikes that perfect balance where I live...on the server in front of a database. You get static basic types with automatic checking via dialyzer and you can make it stricter as necessary.
In such a case, the line between these two type environments narrows.
> In such a case, the line between these two type environments narrows.
Not really. Static types still offer you total proofs of the properties you encode as types, not just experimental results of tests.
It is not a proof-like system, but outside of dependent typing, static typing does not catch value-related bugs, whereas Clojure.spec can. In a static type system, how easy would it be to exactly specify and guarantee that a function's second parameter is of a higher value than its first, or that a function's output is an integer between 5 and 50, etc? Clojure.spec is just predicate functions composed together to define the flow of data in a program, and those compositions can be used in a variety of ways.
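The idea of specs as composed predicate functions can be sketched without Clojure. What follows is a minimal Python illustration and not the clojure.spec API: `all_of`, `between`, `checked`, and `gap` are hypothetical names made up for this example, and every check runs at call time rather than at compile time.

```python
import functools
from typing import Any, Callable

Predicate = Callable[[Any], bool]


def all_of(*preds: Predicate) -> Predicate:
    """Compose predicates; evaluation stops at the first one that fails."""
    return lambda value: all(pred(value) for pred in preds)


def is_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def between(low: int, high: int) -> Predicate:
    return lambda value: low <= value <= high


def checked(args_spec: Predicate, ret_spec: Predicate):
    """Wrap a function so its argument tuple and result are checked on every call."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args):
            if not args_spec(args):
                raise ValueError(f"{fn.__name__}: arguments {args!r} violate spec")
            result = fn(*args)
            if not ret_spec(result):
                raise ValueError(f"{fn.__name__}: result {result!r} violates spec")
            return result
        return wrapper
    return decorate


# "Second parameter is higher than the first" and "output is an integer in 5..50".
ascending_ints = all_of(
    lambda args: len(args) == 2 and all(is_int(a) for a in args),
    lambda args: args[0] < args[1],
)
int_5_to_50 = all_of(is_int, between(5, 50))


@checked(ascending_ints, int_5_to_50)
def gap(low: int, high: int) -> int:
    return high - low
```

Calling `gap(10, 20)` passes both checks, while `gap(20, 10)` or `gap(1, 2)` raises `ValueError` at run time. That is the trade-off under discussion: the constraints are easy to state, but a violation is found only when the offending call actually happens.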
> "Go reaps probably upwards of 90% of the benefits you can get from static typing"
That 90% number is totally made up as well. I don't see evidence that the author actually worked with Haskell, Idris, or Agda, these being the three static languages mentioned. The article is basically hyperbole.
If I am to pull numbers out of my ass, I would say that Go reaps only 10% of the benefits you get with static typing. This is an educated guess, because:
1. it gives you no way to turn a type name into a value (i.e. what you get with type classes or implicit parameters), therefore many abstractions are out of reach
2. no generics means you can't abstract over higher order functions without dropping all notions of type safety
3. goes without saying that it has no higher kinded types, meaning that expressing abstractions over M[_] containers is impossible even with code generation
So there are many abstractions that Go cannot express because you lose all type safety, therefore developers simply don't express those abstractions, resorting to copy/pasting and writing the same freaking for-loop over and over again.
This is a perfect example of the Blub paradox btw. The author cannot imagine the abstractions that are impossible in Go, therefore he reaches the conclusion that the instances in which Go code succumbs to interface{} usage are acceptable.
> "It requires more upfront investment in thinking about the correct types."
This is in general a myth. In dynamic languages you still think about the shape of the data all the time, except that you can't write it down, you don't have a compiler to check it for you, you don't have an IDE to help you, so you have to load it in your head and keep it there, which is a real PITA.
Of course, in OOP languages with manifest typing (e.g. Java, C#) you don't get full type inference, which does make you think about type names. But those are lesser languages, just like Go, and if you want to see what a static type system can do, then the minimum should be Haskell or OCaml.
> "It increases compile times and thus the change-compile-test-repeat cycle."
This is true, but irrelevant.
With a good static language you don't need to test that often. With a good static type system you get certain guarantees, increasing your confidence in the process.
With a dynamic language you really, really need to run your code often, because remember, the shape of the data and the APIs are all in your head, there's no compiler to help, so you need to validate that what you have in your head is valid, for each new line of code.
In other words this is an unfair comparison. With a good static language you really don't need to run the code that often.
> "It makes for a steeper learning curve."
The actual learning is in fact the same, the curve might be steeper, but that's only because with dynamic languages people end up being superficial about the way they work, leading to more defects and effort.
In the long run with a dynamic language you have to learn best practices, patterns, etc. things that you don't necessarily need with a static type system because you don't have the same potential for shooting yourself in the foot.
> "And more often than we like to admit, the error messages a compiler will give us will decline in usefulness as the power of a type system increases."
This is absolutely false, the more static guarantees a type system provides, the more compile time errors you get, and a compile time error will happen where the mistake is actually made, whereas a runtime error can happen far away, like a freaking butterfly effect, sometimes in production instead of crashing your build. So whenever you have the choice, always choose compile-time errors.
Wouldn't it be great if we can use the computer to figure out what the types should be by a runtime evaluation of the code and save precious human time for things only humans can do?
I don't have to think or decorate my speech with types of noun, verb, pronoun, adjective etc. when I speak, but I'm still able to communicate very effectively, because your brain is automatically adding the correct type information based on context that helps you understand what I'm saying, even with words that have multiple types. Granted, natural language is different than programming language but there was once a trend to try and make programming languages more like human language, not less so.
How is that? I'm not seeing the increased utility.
> It's better to trade a fast and quick runtime type error....
What if the runtime type error crashes your app in production and loses your company money? What if it's something that slipped through your end-to-end integration testing because certain unlikely conditions never got covered, but they happened in production?
> ... than a lengthy compile-time type checking process,...
There are several modern compilers which are quite fast: D, OCaml, Java.
> ... because less code needs to be evaluated at run-time to expose the type error.
With static type checking, no code needs to be evaluated at runtime to expose a type error. Does dynamic typechecking offer a reduction over that?
> Wouldn't it be great if we can use the computer to figure out what the types should be by a runtime evaluation of the code and save precious human time for things only humans can do?
Wouldn't it be great if the computer would figure out the types at compile time and save us from having to manually input them? Well, the computer can do that, thanks to type inference. Several popular languages offer full, powerful type inference.
Software failures are failures of understanding, and of imagination.
The problem is that programmers are having a hard time keeping up with their own creations.
dynamic typing simply doesn't scale.
I would not consider a language to be modern unless it has Type Providers; I consider this to be such an essential feature. I believe Idris and F# are the only languages that have them. People are trying to push TypeScript to add them; who knows if it will happen.
Many are saying that if you have a dynamic language you just need to be disciplined and write many tests. With good static typed languages like F# you can't even write tests on certain business logic since the way you write your code you make "impossible states impossible", see https://www.youtube.com/watch?v=IcgmSRJHu_8
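The "make impossible states impossible" idea can be sketched as follows. This is a minimal illustration in Python rather than F#, with hypothetical names (`Loading`, `Loaded`, `Failed`, `describe`). Python does not enforce the annotations at run time, so the sketch assumes a static checker such as mypy is run over the code.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Loading:
    pass


@dataclass(frozen=True)
class Loaded:
    payload: str


@dataclass(frozen=True)
class Failed:
    reason: str


# One value is exactly one of the three cases. A state such as
# "loading, yet also carrying a payload and an error" cannot be constructed.
RequestState = Union[Loading, Loaded, Failed]


def describe(state: RequestState) -> str:
    if isinstance(state, Loading):
        return "loading"
    if isinstance(state, Loaded):
        return f"loaded: {state.payload}"
    return f"failed: {state.reason}"
```

Compare this with a single record holding `is_loading`, `payload`, and `error` fields: that shape admits nonsensical combinations, and those combinations are what the extra tests would otherwise have to cover.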
1. performance dominates (like 80:20)
2. tooling
3. doc (becomes crucial on large projects)
4. correctness
Formal correctness doesn't really matter. Anecdotally (since that's really all we have), I find in practice, very few bugs are caught by the type-checker. Further, code is usually not typed as accurately as the language allows, i.e. the degree of type-checking is a function of the code; the language only provides a maximum. In a sense, every value has a type, even if it's not formally specified or even considered by the programmer, in the same sense that every program has a formal specification, even if it's not formally specified.
Upfront design is the price. Which is difficult to pay when the requirements are changing and/or not yet known.
By adding types (and in the extreme, dependent types), you're allowing the compiler to prove more things about the code (to check correctness or generate more optimal code). If you actually need to prove more things, then it's better to leave that to a compiler rather than a human.
Of course, if you're writing e.g. a web scraping script, you don't need these guarantees and then you don't have to care about types. But the better the engineering you want, the more static typing will help, and there are no diminishing returns.
It makes the higher level types seem more transcendental than they are, and also seems to put actual validation on a second rate level. End of the day if an argument is the right scalar or interface you'll get the same result on runtime whether you hinted it -- for one's quality of life improvements -- or checked it with some boilerplate validation. Worst case scenario people will forgo encoding known stricter constraints after generally hinting the expected type.
The state of the art is not up to proving every desirable property of every program that we would like to build. But that has nothing much to do with computability. And some extremely impressive things have been done, like the seL4 separation kernel, which has static proofs of, among other things, confidentiality, integrity, and timeliness, and a proof that its binary code is a correct translation of its source.
OK, let's put that to the test. Here is a particular program:
    let x = 6
    let y = 3
    while true:
        if y>x then halt
        if is_prime(y) and is_prime(x-y) then
            x = x + 2
            y = 3
        else
            y = y + 2
        endif
Can you tell me if it halts or not?
> The state of the art is not up to proving every desirable property of every program that we would like to build.
Isn't that exactly the same as what I said?
> But that has nothing much to do with computability.
What does it have to do with then?
> some extremely impressive things have been done
Yes, in some very particular cases. But note that even a proof of correctness is not a guarantee that the code is bug-free.
What’s old is new again, though one can hardly imagine cat-v touting the merits of Java.
> upfront investment in thinking about the correct types
being a cost. Surely you have to do this whether the compiler will check your work or not, and if you just don't do the thinking you'll end up with bugs? Isn't this a benefit?
Just ONE study, so don't take too much heed. That said, apparently:
* Strongly typed, statically compiled, functional, managed-memory languages are the least buggy
* Perl is INVERSELY correlated with bugs. Interestingly, Python is positively correlated with bugs. There goes the theory about how Python code looks like running pseudo-code... Snake (python's, to be more precise) oil?
* Interestingly, unmanaged-memory languages (C/C++) have a high association with bugs across the board, rather than just memory bugs.
* Erlang and Go are more prone to concurrency bugs than Javascript ¯\_(ツ)_/¯. Lesson: if you ain't gonna do something well, just ban it.
All in all, interesting paper.
(I'm not talking about systems which just infer types automatically).
Can someone explain this?
Now if we are to accept all of this, that opens up a different question: If we are indeed searching for that sweet spot, how do we explain the vast differences in strength of type systems that we use in practice? The answer of course is simple (and I'm sure many of you have already typed it up in an angry response). The curves I drew above are completely made up. Given how hard it is to do empirical research in this space and to actually quantify the measures I used here, it stands to reason that their shape is very much up for interpretation.
The line charts are there to illustrate the point, not as proof. Like not all arguments are axiomatic proofs, not all charts are plotting data.
>That 90% number is totally made up as well of course.
Yeah, we got that from reading TFA already.
>This is a perfect example of the Blub paradox btw. The author cannot imagine the abstractions that are impossible in Go, therefore he reaches the conclusion that the instances in which Go code succumbs to interface{} usage are acceptable
The author actually not only can imagine them, but plots them (e.g. how for some languages/use cases the sweet spot will be 100% type help from the language), and explains why he thinks that interface{} can be acceptable in some cases.
>This is true, but irrelevant.
Irrelevant for you maybe. For others (and for prototyping/early exploratory use cases in general) the quick feedback cycle beats the guarantees from Haskell like types. See Bret Victor.
>In the long run with a dynamic language you have to learn best practices, patterns, etc. things that you don't necessarily need with a static type system because you don't have the same potential for shooting yourself in the foot.
I'd say that most people's experiences with typed languages like C++, Java, C# etc run counter to that. Most that I've seen anyway. Same, or even more so, for Haskell -- there are literally tons of stuff to learn, to the point it throws people off.
Can you provide an example?
> In a static type system, how easily would it be to exactly specify and guarantee that a function's second parameter is of a higher value than its first, or that a function's output is an integer between 5 and 50, etc?
Scala:
    def foo(param1: Int, param2: Int): Int = {
      require(param2 > param1, "Param2 must > param1")
      param2 - param1 ensuring { result =>
        result >= 5 && result <= 50
      }
    }
If you write this program in a statically checked language:
    let x : int = 6
    let y : int = 3
    while true:
        if y>x then break
        if is_prime(y) and is_prime(x-y) then
            x = x + 2
            y = 3
        else
            y = y + 2
        endif
    x = "foobar"
I can tell you that it will not type check. And for the same reason, if you write the same program in a language that can express termination and claim that it terminates, the program will not type check until you have supplied a proof of (edit: the negation of!) Goldbach's conjecture in a form that the type system understands.
Replace the word "arbitrary" with "some" and I'll agree with you. There are some things a static type system will tell you. Some of those things are even useful things to know. But there are some things a static type system will not tell you, and cannot tell you, and some of those things are useful things to know too.
Furthermore, the way static type systems are used in practice, they don't just tell you things. They will actually refuse to let you run the program unless it conforms to some preconceived notion of correctness that is built in to the type system. Personally, that's the part that rubs me the wrong way. It is sometimes useful to me to run a program even if I know that it has certain kinds of errors in it.
> it will not type check
I'm pretty sure it would. Why do you think it would not?
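For reference, the loop being argued over can be made concrete. This is a minimal Python sketch, assuming a trial-division `is_prime` and a hypothetical `limit` parameter that the pseudocode in the thread does not have. The bound exists only so that the sketch always returns.

```python
from typing import Optional


def is_prime(n: int) -> bool:
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def first_counterexample(limit: int) -> Optional[int]:
    """Mirror the thread's loop, but give up once x exceeds `limit`.

    Returns the first even x >= 6 that is not the sum of two odd primes,
    or None if no such x exists up to `limit`.
    """
    x, y = 6, 3
    while x <= limit:
        if y > x:
            return x  # the unbounded program would halt here
        if is_prime(y) and is_prime(x - y):
            x, y = x + 2, 3  # x has a decomposition; move to the next even number
        else:
            y += 2  # try the next odd candidate
    return None  # bound reached without finding a counterexample
```

With the bound removed this is the loop from the thread, and it halts exactly when some even number from 6 upward is not the sum of two odd primes. No amount of running the bounded version settles that question.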
Languages like Coq require you to prove a function halts before it will compile. Yes, for an arbitrary function it can be arbitrarily difficult or impossible to prove termination. In most cases though, termination proofs aren't that complex (e.g. "it halts because the collection gets smaller each recursive call").
Besides, your argument basically sounds like "because you can't prove all functions halt it's a waste of time proving any functions halt". See the seL4 OS for an impressive example of what formal proofs can do.
In general, when writing programs, we ought to develop at least a very informal argument for why they have the properties we want them to have. To the extent that they are correct, these informal arguments could be formalized. It's possible to imagine that with future technology, formalizing these arguments with the assistance of powerful tooling will actually be easier than reasoning about them informally, in the same way that you often find running and inspecting your lisp program easier than reasoning about it without assistance. As far as I know, neither computability theory nor any other theoretical obstacle rules this out; it is just (perhaps far) beyond the state of the art.
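The "collection gets smaller each recursive call" argument can be spelled out in ordinary code. This is a minimal Python sketch with a hypothetical function `total`. Python does not verify the argument the way Coq does, so the decreasing measure is only asserted at run time here.

```python
from typing import List


def total(xs: List[int]) -> int:
    """Sum a list by recursing on its tail.

    Termination argument: len(xs) is a non-negative integer that strictly
    decreases on every recursive call, so the recursion must bottom out.
    """
    if not xs:
        return 0
    rest = xs[1:]
    assert len(rest) < len(xs)  # the measure a proof assistant would check statically
    return xs[0] + total(rest)
```

A checker for structural recursion accepts this shape more or less automatically, because the recursive call is made on a strict sub-part of the input. The hard cases are the ones where no such obviously shrinking measure exists.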
Perhaps you have mistaken me for an absolutist advocate of static typing or formal methods, which I guess is reasonable in the context of the thread. I'm not at all: I've experienced plenty of joy and pain (and bugs) in both static and dynamic languages, and have had more experience and success with advanced testing methods than with formal ones. At this moment, I'm writing a testing tool in a dynamic language! I just wanted to clear up a technical misconception, because I have seen fields held back before by widely misunderstood impossibility results.
Happy lisping!
I think we actually agree here. Static typing can be useful. I just personally find the manner in which it is usually deployed to be unnecessarily annoying.
Well, that's incredibly stupid. That means you can't write, for example, a web server in Coq unless you intentionally introduce undesirable behavior to satisfy the compiler.
> because you can't prove all functions halt it's a waste of time proving any functions halt
No. That's obviously a straw man. Can you please consider the possibility that I might not be a complete idiot?
My argument is: because the halting problem is undecidable, there are an infinite number of properties of programs that are also undecidable. So there are only two possibilities:
1. None of the infinite undecidable properties of programs are things we will ever care about or
2. There are properties of interest that cannot be decided by static typing
Which of those is the case is an empirical question but I submit that #2 is much more likely to be the case. Therefore, static typing cannot obviate the need to be prepared for your program to exhibit unexpected behavior at run time except in the most trivial cases.
There are ways around it (e.g. proving progress is always going to be made instead of termination) and there's a web server in Coq: http://coq-blog.clarus.me/pluto-a-first-concurrent-web-serve...
> 2. There are properties of interest that cannot be decided by static typing
>
> Which of those is the case is an empirical question but I submit that #2 is much more likely to be the case. Therefore, static typing cannot obviate the need to be prepared for your program to exhibit unexpected behavior at run time except in the most trivial cases.
Again, look at the seL4 project. It verifies the correctness of an entire OS, showing that formal verification is powerful, practical and useful. Google for all the algorithms that have been formally verified with Coq, Isabelle and other proof assistants.
Why do you think common properties of interest wouldn't be provable? Do you think mathematicians have this issue (there's not a lot of difference when you have expressive enough types)? You yourself must have an intuition about why the properties would be true, so you should be able to write a formal proof of that, although it can be very challenging currently.
Because proving all common properties of interest is tantamount to proving all interesting mathematical theorems.
> Do you think mathematicians have this issue
Yes, obviously. If they didn't they wouldn't have jobs.
> it can be very challenging currently
Yes indeed, and that is exactly my point. Humans just keep finding new and more complicated things to care about. Math doesn't converge.
- Quality (How many bugs)
- Dev time (How fast to develop)
- Maintainability (how easy to maintain and adapt for years, by others than the authors)
The argument is often that there is no formal evidence for static typing one way or the other. Proponents of dynamic typing often argue that Quality is not demonstrably worse, while dev time is shorter. Few of these formal studies however look at software in the longer perspective (10-20 years). They look at simple defect rates and development hours.
So too much focus is spent on the first two (which might not even be two separate items as the quality is certainly related to development speed and time to ship). But in my experience those two factors aren't even important compared to the third. For any code base that isn't a throwaway like a one-off script, say one with 10 or 20 years of maintenance ahead of it, the ability to maintain/change/refactor/adapt the code far outweighs the other factors. My own experience says it's much (much) easier to make quick and large scale refactorings in static code bases than dynamic ones. I doubt there will ever be any formal evidence of this, because you can't make good experiments with those time frames.
I think one of our problems is that people have downgraded the importance of this. Much code nowadays (rightly or wrongly) is considered "disposable" - people think that the likelihood of any given piece of code they are writing surviving more than a few years is negligible. It is a natural assumption when you see the deluge of new technologies, hype cycles, etc. It is further reinforced by the fact that people's empirical experience is that a huge amount of their software work is abandoned, rewritten, outdated, obsoleted, etc.
I think these views are horribly mistaken, because at a deeper level even if 90% of code gets abandoned, the quality of the 10% that survives is still going to determine your maintenance cost. And half the reason we keep throwing code away is because it was created without consciousness of maintainability - it is so easy to say that the last person's code was garbage, so we are going to rewrite it because that is faster than understanding and then fixing the bugs in what they wrote.
I observe this in myself: my favorite language to code in is Groovy - a dynamic, scripting language with all kinds of fancy tricks. But my favorite language to decode is Java. Because it is so simple, boring, there is almost nothing clever it can do. Every type declared, exception thrown, etc. is completely visible in front of me.
Well, most code is written by relatively inexperienced developers, who have not had to retire a system or support a legacy one, and don't know what should be sought out & what should be avoided when designing a system. Thus, they make decisions with limited information to solve the problem at hand, and only later find out the implications of those decisions when someone wants to (say) deploy it as a dockerized service on k8s.
It's one thing to read The Mythical Man Month, and another to write a replacement system that stops providing business value after 30 months and needs to be rewritten to support the current needs.
> it is so easy to say that the last person's code was garbage, so we are going to rewrite it because that is faster than understanding and then fixing the bugs in what they wrote
There's no black and white answer here: sometimes the code is so convoluted (or in the wrong language) that it has to be rewritten; sometimes the design of the system strongly resists changes in behaviour & so much of it needs to be made more flexible that an incremental improvement would cost about the same as a full rewrite.
Well, this has nothing to do with static vs dynamic typing. You can write unmaintainable code in static languages very easily. In startups, developers often overlook maintainability, I completely agree, but that's because everyone knows that the code you are writing today might not be needed 2 years down the line; you are mostly iterating to find PMF.
one of my favorite things about groovy is that it's easy to start strongly typing things as your code shapes up, because it allows for totally dynamic types, but it also allows for strong static typing. haven't really had the chance to use groovy since 2012, though.
I think younger devs think this. Once you get a decade or more experience, you grow wiser and realise that code never dies, and especially the code you wish would die is particularly tenacious. And this is pure speculation, but I would wager that the number of lines of legacy code that is kept alive with maintenance is much greater than the number of lines of code that gets abandoned/rewritten/obsoleted.
What both type annotations and DbC are is self-enforcing documentation (of an interface) that doesn't go out of sync with the actual code. But for that, you don't necessarily need static type checks. Now, type checking of type annotations that happens exclusively at runtime is an option that hasn't been explored much (after all, if you already have type annotations, why not let the compiler make use of them?), but an option that has sometimes been used successfully is having a mixture of static and dynamic type checks. You can often greatly simplify a type system by delaying (some) type checking until runtime (examples: for covariance or to have simpler generics).
For example, if you add a case to a variant or sum type, or change the parameter or return type of some function, in a static type system, the compiler can tell you all the locations you need to change. In a runtime system, you have to find them yourself, or wait till you see an error at runtime.
Now, this is still better than the alternative of having the error propagate until it crashes 10 functions down, but the compiler finding all the places that need to be changed is something I've found to be really useful, especially in early development when there's a lot of refactoring happening. Presumably this is useful in later stages as well, when the system is large enough that you can't expect to find all the uses of a function or type manually.
I personally find dynamic languages allow for easy replaceability, as there are fewer explicit references to types. However, this is highly dependent on the system being somewhat modular, I suppose.
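To make the "add a case to a sum type" scenario concrete, here is a minimal Python sketch. The `Shape` union, the two dataclasses and the `assert_never` helper are all hypothetical names made up for illustration; the assumption is that a static checker such as mypy is run over the code, since Python itself only notices the problem at runtime.

```python
import math
from dataclasses import dataclass
from typing import NoReturn, Union


@dataclass(frozen=True)
class Circle:
    radius: float


@dataclass(frozen=True)
class Rectangle:
    width: float
    height: float


# Hypothetical sum type; adding a third member here is the refactoring in question.
Shape = Union[Circle, Rectangle]


def assert_never(value: NoReturn) -> NoReturn:
    # A static checker only accepts a call to this if no case can reach it.
    raise AssertionError(f"unhandled case: {value!r}")


def area(shape: Shape) -> float:
    if isinstance(shape, Circle):
        return math.pi * shape.radius ** 2
    if isinstance(shape, Rectangle):
        return shape.width * shape.height
    # If Shape gains a new member, a checker flags this line in every such
    # function; without a checker the error only shows up here at runtime.
    assert_never(shape)
```

With a checker in the loop, extending `Shape` points you at every function that needs a new branch; without one, you find them when the `AssertionError` fires.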
My second big love in languages was Python. It's also the language in which I wrote my first major software product. It was this product that taught me to hate Python. Not because it was hard to create, or because quality was low. In fact, I was VERY FAST to produce 1.0. Took about a week. But after that, I had to work with other developers. That's where everything went to hell.
Then I got a new job a few months later where almost 100% of my time was spent doing maintenance on aging codebases written in Java, a language I never worked with before then. I won't say I fell in love with Java, but, I did fall in love with the ease of inspecting "the world" in each project. As soon as I had it setup in my IDE properly, it was so ridiculously easy to explore how everything related, and then to make refactoring changes? So much easier than it ever was in Python.
Now, at that point I didn't directly make the connection with the type system, but, in retrospect, I know that all of the value I derived from working in Java vs. Python came from having a descriptive, static type system. And frankly, I never once felt slowed down by the need to specify my types up front. In fact, the opposite is true. It taught me to put more thought into my data structures and vastly improved the quality of my software design before I even started writing logic.
Sadly now I'm moving into the data science/data engineering field and everything is Python and I don't know. I don't want to go back to this nightmare. It's like I spent the last decade in first class establishments with the best tools and now I'm going to have to work in the mud with sticks and shovels. I am interested in the field in terms of the capabilities it enables, and I have no problems working in Scala or whatever decently typed language is around, but, the reality is the lion's share of people in this field are doing everything in Python or R and I hate them both.
I figure I have two choices: help advance the capabilities of "better" platforms, or pursue some other direction in my career. It's too hard to know how much better life can be, then go back.
Now there are (optional) type annotations and mypy [0]. I've been using them in my latest projects and I found them useful/helpful.
Multi-decadal longitudinal studies are not too uncommon in medicine, epidemiology and psychology. Why there is no will to conduct, or fund, this kind of research in computer science, I am not sure.
People live for ~80 years... doing a 2-4 decade study isn't out of the realm of possibility.
Computers on the other hand... While there are a few mainframes that live to be 10 years old - the vast majority of the internet, programming languages, apps, etc... Hell, even the iPhone just hit 10 years old.
How can you have a 20 year study when the majority of "code" is less than 10 years old?
The previous application I worked on is over a decade old (and it shows). The current application I'm working on is about 8 years old.
Neither application shows any sign of being replaced. Which would be insane, as they both have roughly a decade of laws & regulations and business lessons embedded in them. Despite the state of especially the older application, I don't see how rewriting the entire application would fix anything.
At best parts would be rewritten. And the parts I'm thinking about wouldn't be rewritten because of technical reasons, but because of the way they work. The prime example is a part that only 1 person, a business user, understands.
Ah the HN perception bubble.
Good code lasts longer than that. Bad code gets replaced.
About the best you can hope for is a new "epoch" that forces a rewrite. In the MS world we went from classic VB and VC++ to .NET; a lot of companies went through rewrites to keep up with that, and some of that software is now nearing 20 years old. There have been a few other epoch-like changes: terminal -> GUI, C++ -> Java, desktop -> web. Except for maybe the last one, it's been quite a while since a new epoch has begun.
Parasolid (written in a C dialect) was a rewrite of Romulus (written in Fortran) and that goes back to 1974. And that was a rewrite of Build that originated from Ian Braid's PhD thesis. [2]
I know people who are still working on the same Parasolid code after 30 years. Some of them
Disclaimer: Parasolid dev 1989-1995
[1] https://en.wikipedia.org/wiki/Parasolid
[2] http://solidmodeling.org/awards/bezier-award/i-braid-a-graye...
The CSS/JS on the frontend rarely lasts more than a few years, usually changed due to design trends (flat, responsive, mobile-first etc).
Most business critical software like SAP for example is also based on decade old codebases.
It's probably a good reference project, when talking about maintenance (nightmares).
Some of our internal infrastructure systems are 10-20 years old - some could definitely do with a complete rewrite, but in the meantime, they're mission critical systems.
As for our products - some of them have even longer timeframes than 20 years.
Another little thing (from my own point of view) is that many applications don't live in a vacuum: they use JSON Schema or WSDL, and databases with types and constraints. So what the language does not "type", the context does.
I know "fun" is highly subjective but still important nonetheless.
What is interesting is using the type system to specify invariants about data structures and functions at the type level before they are implemented. This has two effects:
The developer is encouraged to think of the invariants before trying to prove that their implementation satisfies them. This approach to software development asks the programmer to consider side-effects, error cases, and data transformations before committing to writing an implementation. Writing the implementation proves the invariant if the program type checks.
(Of course Haskell's type system in its lowest-common denominator form is simply typed but with extensions it can be made to be dependently typed).
The second interesting property is that, given a sufficiently expressive type system (which means Haskell with a plethora of extensions... or just Idris/Lean/Agda), it is possible to encode invariants about complex data structures at the type level. I'm not talking about enforcing homogeneous lists of record types. I'm talking about ensuring that Red-Black Trees are properly balanced. This gets much more interesting when embedding DSLs into such a programming language that compile down to more "unsafe" languages.
For my money, I work in a primarily dynamic language and I already have a set of practices that usually prevent relatively simple type mismatches so I very rarely see bugs slip into production that involve type mismatches that would be caught by a Go-level type system, and just that level of type information would add a lot of overhead to my code.
But if I were already using types, a more expressive system could probably catch a lot of invariance issues. So I feel like the sweet spot graph is more bimodal for me: the initial cost of switching to a basic static type system wouldn't buy me a lot in terms of effort-to-caught-bugs-ratio, but there's a kind of longer term payout that might make it worth it as the type system becomes more expressive.
"I don't see the benefit of typed languages if I keep writing code as if it was PHP/JavaScript/Go" ... OF COURSE YOU DON'T!
This is missing most of the benefits, because the main benefits of a better type system aren't realized by writing the same code; they are realized by writing code that leverages the new possibilities.
Another benefit of static typing is that it applies to other people's code and libraries, not only your own.
Being able to look at the signatures and being certain about what some function _can't_ do is a benefit that untyped languages lack.
I think the failure of "optional" typing in Clojure is a very educational example in this regard.
The failure of newer languages to retrofit nullability information onto Java is another one.
I am sorry, but I don't really see how you stating more benefits of static typing really counters either of them.
I recommend reading the article again. But this time, try not to read it as defending a specific language (I only mentioned my blub language so that it's a more specific and extensive reference in the cases where I use it - if you are not using my blub language, you should really just ignore everything I write about it specifically) and more as trying to talk on a meta-level about how we discuss these things. Because your comment is an excellent example of how not to do it and the kind of argument that prompted me to write this up in the first place.
Camp A: Languages with mediocre static typing facilities, for example:
-- C (weakly typed)
-- C++ (weakly typed in parts, plus over-complicated type features)
-- TypeScript (the runtime is weakly typed, because it's JavaScript all the way down)
Camp B: Languages with mediocre dynamic typing facilities, for example:
-- JavaScript (weakly typed)
-- PHP 4/5 (weakly typed)
-- Python and Ruby (no powerful macro system to help you keep complexity well under control or take full advantage of dynamism)
Neither camp is the best example of static or dynamic typing. A good comparison would be between:
Camp C: Languages with very good static typing facilities, for example:
-- Haskell
-- ML
-- F#
Camp D: Languages with very good dynamic typing facilities, for example:
-- Common Lisp
-- Clojure
-- Scheme/Racket
-- Julia
-- Smalltalk
I think that as long as you stay in camp (A) or (B), you'll not be entirely satisfied, and you will get criticism from the other camp.
While, yes, top-quality dynamic code will have documentation and test cases to make up for this deficiency, it's often still not good enough for me to get my answer without spelunking the source or StackOverflow.
I feel like I learned this the hard way over the years after having to deal with my own code. Without types, I spend nearly twice as long to familiarize myself with whatever atrocity I committed.
Can you give an example (or several) of this?
The main use case of generics, making collections and datastructures convenient and readable, is more than enough to justify the feature in my view, since virtually all code deals with various kinds of "collections" almost all of the time. It's a very good place to spend a language's "complexity budget".
I wrote an appreciable amount of Go recently, with advice and reviews from several experienced Go users, and the experience pretty much cemented this view for me. An awful lot of energy was wasted memorizing various tricks and conventions to make do with loops, slices and maps where in other languages you'd just call a generic method. Simple concurrency patterns like a worker pool or a parallel map required many lines of error-prone channel boilerplate.
I feel the same way going from languages with HKTs back to Java/C#...
Not sure why you think they're not as useful, it sounds like you're making the same argument as OP but just moving the bar one notch over...
Subjectively, I use ordinary generics all the time, but see the need for HKTs only occasionally. It's entirely possible I'm not experienced enough to see most of their possible use cases, but then I'd wager most programmers aren't.
Neither Python nor Java programmers have to do that.
If Go is annoying with how little power it provides, that's fair, but other type systems can be just as annoying then, because when given the ability to, type astronauts will blast off into space, purely as a matter of honor or instinct.
Besides, code generation isn't all that bad. Java programmers will eventually find some kind of code generation in their build setup (serialization/schema tools).
In a hypothetical world where the designers never added the specific containers they did, you'd get a whole lot more value out of generics for containers. But it turns out, the designers used what seems on the surface like a kludge to get most of the benefits, while saving most of the cost. It's a perfect embodiment of the kinds of tradeoffs I'm talking about.
Architecture astronauting can be prevented with best practices and code review, not with language limitations. It’s a fool’s errand to try; code generation allows you to get all the complexity of generics and more.
I have come to see that type systems, like many pieces of computer science, can be viewed either as a math/research problem (in which generally more types = better) or as an engineering challenge, in which you're more concerned with understanding and balancing tradeoffs (bugs / velocity / ease of use / etc., as described in the post). These two mindsets are at odds and generally talk past each other because they don't fundamentally agree on which values are more important (like the great startups vs NASA example at the end).
Though I am not a type theorist (I only dabble in compilers and language design), I have noted that many people conflate static typing and dynamic typing with other additional ideas.
Static typing has certain benefits but also has certain disadvantages, dynamic typing has certain benefits but also has certain disadvantages.
What I find interesting is that few people fall into the soft typing arena, using static typing where applicable and advantageous and using dynamic typing where applicable and advantageous.
Static typing has a tendency in many languages to explode the amount of code required to get anything done, dynamic typing has a tendency to produce somewhat brittle code that will only be discovered at runtime. The implementation of static typing in many languages requires extensive type annotation which can be problematic.
But what is forgotten by most is that static typing is a dynamic runtime typing situation for the compiler, even when the compiler is written in a statically typed language.
Instead of falling into either camp, we need to develop languages that give us the best of both worlds. Many of the features people here have raised as being a part of the static typing framework have been rightly pointed out as being part of the language editors being used and are not specifically part of the static typing regime.
Many years ago a similar discussion was held on Lambda-the-Ultimate, and the sensible heads came to the conclusion that soft typing was the best goal to head for. Yet, in the intervening years, when watching language design aficionados at work, they head towards full static typing or full dynamic typing and rarely head in the direction of soft typing (taking advantage of both worlds).
So, the upshot: this discussion will continue to repeat itself for the foreseeable future and there will continue to NOT be a meeting of minds over the subject.
Start-ups decide not to write MVPs in languages like Haskell or Idris not because those languages aren't "rapid" enough, but because it's too difficult to find programmers experienced in those languages on the labor market. It's already difficult enough to find competent programmers - no founder wants to make their hiring woes even more difficult.
You write "Why then is it, that we don't all code in Idris, Agda or a similarly strict language?... The answer, of course, is that static typing has a cost and that there is no free lunch."
I take it that you wrote "of course" here through assuming that there must be some objective reason for the choice, and that it depends solely on strictness, but languages don't differ only in their strictness, so choices may be made objectively on the basis of their other differences, and we also know that choices are sometimes made on subjective or extrinsic grounds, such as familiarity. I don't know what proportion of professional programmers are familiar enough with Idris or Agda to be able to judge the value proposition of their strictness, but I would guess that it is rather small.
Now, to look at the sentences I elided in the above quote: "Sure, the graph above is suggestively drawn to taper off, but it's still monotonically increasing. You'd think that this implies more is better." As the graph is speculative, it cannot really be presented as evidence for the proposition you are making. I could just as well speculate that static program checking does not do much for program reliability until you are checking almost every aspect of program behavior, and that simple syntactical type checking is of limited value. That would be consistent with the fact that there is little empirical evidence for the benefit of this sort of checking, and explain why most people aren't motivated to take a close look at Idris or Agda. In this equally-speculative view of things, current language choices don't necessarily represent a global optimization, but might be due to a valley of much more work for little benefit between the status quo and the world of extensive-but-expensive static checking.
I've been thinking about the trajectory of C++ language development recently and the emphasis has definitely been on making generics more and more powerful. You watch CppCon talks and see all this super expressive template spaghetti and see that while it's definitely a better way to write code - the syntax is just horrifying and hard to "get over"
Just like when "auto" took off and people started thinking about having "const by default" - I'm starting to think that generic by default is the way to go. The composability of generic code is incredibly powerful and needs to be more accessible
However the other end of the spectrum: dynamic code leaves a lot of performance on the table and leads to runtime errors
Especially when it comes to GUI programming, I really don't care if a BlueButton.Click() got called instead of RedButton.Click().
So yeah, static typing doesn't buy you much, but in some languages it's at least cheap.
I think this is key. The benefit of static types isn't that they provide safety, it's that they provide _low-cost_ safety. For a large class of problems, types are cheaper than tests are. For other classes, tests are cheaper than types. The main downside of nonstatic languages is that you have to use tests for everything, even that class where types are a better choice.
Or, type can be specified when setting the variable:
[String]$myString = "Hello World!"
This would generate a type error:
[Int]$myString = "Hello World!"
Often, typed and untyped variables will sit together:
[Int]$EmployeeID,[String]$FullName,$Address = $Input -split ","
[xml]$someXmlDocument = Get-Content "path\to\file.xml"
And you get a deserialized version of the XML text.
Also the fact that you can use types when declaring function arguments, removing the need to manually test if an object of the desired type was passed.
Powershell definitely strikes a good balance on type safety for a scripting language.
If no, then what is the use of the typesystem?
If yes, isn't that cumbersome, since I suppose most library functions have typed arguments?
Now this codebase was written with a high degree of quality (it's pretty good but not perfect), but the lack of compile-time (and of course runtime) checks has caused waste.
The second phase of my project to convert all promises to RX Observables :)
Pity you! I fear such tasks.
As mentioned, take advantage of async/await. Also, make sure you wrap everything in modules and access from outside through module exports.
And I disagree with the barrier to entry argument. Static typing, by enabling rich tooling, helps a beginner (like it helped me) a lot more by giving live feedback on your code, telling you immediately where you have a problem and why, telling you through a drop down what other options are available from there, etc. Basically makes the language way more self-discoverable than having to RTFM to figure out what you can do on a class.
That both forms of languages are popular shows that there are benefits in overall productivity to each; they are just different benefits.
I think the key is not to confuse both approaches and leverage the strengths of each to the max.
Disclaimer: Python user scarred by email header RFC violations
I think Go, with its lack of algebraic types, is more of the first, helping the compiler, so I wouldn’t use it as a good example of static typing.
Haskell, OCaml and Rust would make excellent case studies, but we have nothing to compare against.
So IMHO the best way to compare static typing vs dynamic typing is by comparing TypeScript against JS. And in my experience the difference when writing code is huge. It completely eliminates the code-try-fix cycle during development.
This is a basic intuition behind all good practices, including CI, QA, etc.
Types allow one to discover program defects (even generalized ones, in some programming languages) in (almost) the shortest possible amount of time.
Types also allow one to constrain effects of various kinds (again, use a good language for this), and that constraint can make code simpler, safer and, in the end, more performant.
Have you familiarized yourself with Haskell?
Now, whether you consider that's actually helping solve the refactoring, or actually introducing new bugs, well - that's another issue :)
say you've got a function that takes a list of numbers, and some bounds, and gives you back a number from the list that is within the bounds (and maybe meets other criteria, whatever). your contract for the function could require not only that the list be comprised of numbers, and the bounds are numeric, but also that the lower bound is <= the upper bound, and that the return value was actually present in the input list.
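A rough sketch of that contract in Python, using plain checks and assertions rather than any particular contracts library. The function name `pick_within`, the "first match wins" rule and the choice to return `None` when nothing qualifies are assumptions made for the example, not part of the description above.

```python
from typing import Optional, Sequence


def pick_within(numbers: Sequence[float], lower: float, upper: float) -> Optional[float]:
    """Return the first number in `numbers` within [lower, upper], or None."""
    # Preconditions: what the caller must guarantee.
    if not all(isinstance(n, (int, float)) for n in numbers):
        raise TypeError("numbers must contain only numeric values")
    if not isinstance(lower, (int, float)) or not isinstance(upper, (int, float)):
        raise TypeError("bounds must be numeric")
    if lower > upper:
        raise ValueError("lower bound must be <= upper bound")

    result = next((n for n in numbers if lower <= n <= upper), None)

    # Postconditions: what the function guarantees in return.
    assert result is None or result in numbers
    assert result is None or lower <= result <= upper
    return result
```

The type checks are the part a static type system would cover; the ordering of the bounds and the "result came from the input" guarantee are the parts that need either contracts or a much more expressive type system.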
The worst part is that versions of the API for other languages are fine. It's just the Python one they decided to go all "clever junior developer" on.
We've been here before with the C preprocessor. There's nothing wrong with having a preprocessor, but in C it is necessary to use the preprocessor and that causes a lot of problems, like making it especially difficult to write tools.
It is the number one thing that makes C++ templates unusable: semantics defined by means of code generation.
I have a large code base. I want to replace a fundamental data structure to support more operations/invariants/performance guarantees. I change the type at the roots of the code base. My instance of ghcid notifies me of the first type error. I fix it. This repeats until the program compiles again. I run the tests. All the tests pass.
This is insane in Python/C/Ruby. I've had to do it in C and Python. In Haskell I do it with impunity.
The type system doesn't just check what my program does, it is the compass, map, and hiking gear that gets me through the wilderness.
Personally, I find it pretty workable in Python with a big codebase (but you have to respect the rules like having a good test suite -- you change the time from compiling to running your test suite -- which you should have anyways)...
I find that the current Python codebase I'm working on (which is 15 years old and has around 15k modules -- started back in Python 2.4, currently in Python 2.7 and going to 3.5) is pretty good to work in -- there's a whole class of refactoring that you do in typed languages to satisfy the compiler that you don't even need to care about in the Python world (I also work on a reasonably big Java codebase and refactorings are needed much more because of that).
I must say I'm usually happier in the Python realm, although we do use some patterns for types, such as the adapter protocol, quite a bit (so, you ask for a type and expect to get it regardless of needing a compiler to tell you it's correct, and I remember very few cases of errors related to having a wrong type -- I definitely had many more null pointer exceptions in Java than typing errors in Python) and we do have some utilities to specify an interface and check that another class implements all the needed methods (so, you can do light type checking in Python without a full static checking language and I don't really miss a compiler for doing that type checking...).
I do think about things such as immutability, etc, but feel like the 'we're all responsible grown ups' stance of Python is easier to work with... i.e.: if I prefix something with '_' you should not access it, the compiler doesn't have to scream that it's private and I don't need to clutter my code -- you do need good/responsible developers though (I see that as an advantage).
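For readers wondering what an "interface check" utility of the kind mentioned above might look like, here is a minimal sketch. It is not the commenter's actual tooling; `missing_methods`, `implements` and `FileSource` are hypothetical names, and the check only looks at method names, not signatures.

```python
from typing import Callable, Iterable, List


def missing_methods(cls: type, required: Iterable[str]) -> List[str]:
    """Return the names in `required` that `cls` does not provide as callables."""
    return [name for name in required if not callable(getattr(cls, name, None))]


def implements(*required: str) -> Callable[[type], type]:
    """Class decorator that raises when the class is defined if a method is missing."""
    def check(cls: type) -> type:
        missing = missing_methods(cls, required)
        if missing:
            raise TypeError(f"{cls.__name__} is missing: {', '.join(missing)}")
        return cls
    return check


@implements("read", "close")
class FileSource:
    def read(self) -> str:
        return "data"

    def close(self) -> None:
        pass
```

Because the decorator runs when the module is imported, a forgotten method shows up as soon as the test suite loads the module, which is about as early as a dynamic language can report it.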
It's true that static type-checking proves the absence of an entire class of errors. But it doesn't prove that the code does the correct thing; it could be well-typed but completely wrong. On the other hand, tests prove that the code does the correct thing in certain cases. ...Of course, it's up to the developers to actually write a good test suite.
The faster we can all accept that there are pros and cons to both, the faster we can come up with a solution that takes advantage of the best of both worlds. That's the whole point of this OP.
I, personally, have always wondered about ways to dial in to the sweet-spot over time as a project matures. At the start of a project, shipping new features faster is often more important. But if the project survives, maintenance (by new developers) and backward compatibility become more and more of a priority.
It's only insane if you don't have test cases with good coverage, in which case you are very-very-very screwed, statically typed or not.
I really wish more languages took this to the logical conclusion and implemented first-class contract support. It seems work on contracts stopped with Eiffel (although I've heard that Clojure spec is _kinda_ getting there).
Exactly. The author of the article implicitly equates "statically verified code" with "bug-free code". But that's not correct. It's quite possible (and even, dare I say it, fairly common) to have code that expresses, in perfectly type-correct fashion, an algorithm that doesn't do what the user actually wants it to do. Static typing doesn't catch that.
For example I can prove my string reverse works in Idris (https://www.stackbuilders.com/news/reverse-reverse-theorem-p...). Or I could prove that my function squares all elements in a list. Etc.
Now a big part of the problem is expressing with sufficient accuracy what the properties of the algorithm you want to prove are. For example for string reverse I may want to show more than that `reverse (reverse s) = s`. Since after all if reverse does nothing that would still be true. I would probably want to express that the first and last chars swap when I just call reverse xs.
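The gap between those two specifications is easy to show with a small Python sketch. Note that this checks the properties on a handful of sample strings rather than proving them as Idris would; the function names and the sample list are made up for the example.

```python
from typing import Callable, Sequence


def reverse(s: str) -> str:
    return s[::-1]


def broken_reverse(s: str) -> str:
    # Deliberately wrong: returns its input unchanged.
    return s


def is_involution(f: Callable[[str], str], samples: Sequence[str]) -> bool:
    """Weak property: applying f twice gives back the input."""
    return all(f(f(s)) == s for s in samples)


def mirrors_positions(f: Callable[[str], str], samples: Sequence[str]) -> bool:
    """Stronger property: character i of the output is character n-1-i of the input."""
    for s in samples:
        out = f(s)
        if len(out) != len(s):
            return False
        if any(out[i] != s[len(s) - 1 - i] for i in range(len(s))):
            return False
    return True


SAMPLES = ["", "a", "ab", "hello", "types"]
```

Both `reverse` and `broken_reverse` satisfy `is_involution` on these samples, but only `reverse` satisfies `mirrors_positions`, which is the sense in which the first property underspecifies the function.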
Not at all. First, the statements as put here are discrete (boolean even) while I present both "statically verified code" and "bug-freedom" as living on a continuum. Secondly, I don't equate them. If anything, I assume a monotonic, positive relationship between them (strictly speaking not even that. I make pretty clear that the curves could also have whatever shape. But I yield that I am very suggestive in this because I do strongly believe it to be the case). In fact, one of the main points of the argument is that the two are not equal - otherwise, the blue curves I drew would all be straight lines from (0,0) to (1,1). And lastly, none of this is done implicitly. I mention all of this pretty explicitly :)
It's also possible to grab a knife with your hand on the blade edge and cut yourself, but that doesn't diminish the safety value of knife handles.
E.g., my experience is that poor library design can sometimes be exacerbated in statically typed languages if the type logic is poor and doesn't match the problem domain. Dynamic languages sometimes inadvertently "correct" for this by smoothing over these sorts of issues.
I prefer static languages (or at least optionally typed ones) but there can be big downsides of the sort you're mentioning, that are exacerbated by third-party libraries.
Care to share those practices? I also primarily work (this year, at least) in dynamically typed languages.
Simple type-level errors come up the most when you have types that are easily conflated. That tends to happen when you have functions that accept more than one type of thing or output more than one type of thing - avoid that. Avoid polymorphism and OOP patterns that set up a complicated type hierarchy or override methods - you don't want any instance where you end up with something that looks a lot like one type of thing but isn't. Type hierarchies can often be factored out into behaviors provided by modules that supply functions that operate on plain data structures. For variables and parameters, stick to really simple types to whatever degree it's possible, e.g. language primitives, plain old data-container objects and "maybe" types (e.g. things that could be a primitive or null) when absolutely necessary (and check them whenever you might have one). Use union types extremely sparingly. Assignment/creation bottlenecks are useful: try to have only one source for objects of a certain type that always constructs them the same way (so you don't end up with missing fields).
A lot of programmers coming from a language with a stronger type system (especially when transitioning from OOP languages to functional or hybrid languages) tend to be nervous about writing functions without a guarantee about what kind of inputs they'll see, so they try to compensate for the lack of type safety by building functions that can cope with whatever is thrown at them. The idea is that this makes the function more robust but ironically, this tends to make bugs a lot harder to track down. In my experience it's better to write functions with specific expectations about their input that fast-fail if those aren't met, instead of trying to recover in some way - garbage-in-exceptions-out is better than garbage-in-garbage-out. If you send the wrong kind of thing to a function, you want it to throw an error then and there, and you'll likely catch it the first time you test that code.
A lot of the idea of this kind of advice is to shift the work that would be done by the compiler's type system to the very first pass of testing - if your program is basically a bunch of functions that only take one sort of thing in each argument slot, only emit one sort of thing as a result and fail fast when those expectations are violated, you'll typically see runtime type errors the first time those functions get executed, which is a lot like seeing them at compile time.
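As a rough illustration of the "garbage-in-exceptions-out" style described above, here is a minimal Python sketch. The function `total_cents` and its rules are hypothetical, chosen only to show a function that accepts exactly one kind of input and fails immediately on anything else.

```python
def total_cents(prices):
    """Sum a list of integer prices in cents; reject anything else immediately."""
    if not isinstance(prices, list):
        raise TypeError(f"expected a list of int, got {type(prices).__name__}")
    for p in prices:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(p, int) or isinstance(p, bool):
            raise TypeError(f"expected an int price, got {p!r}")
    return sum(prices)
```

Passing `["100"]` raises a `TypeError` on the first call instead of producing a wrong total somewhere downstream, so the mistake tends to show up the first time the code path is exercised.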
It really isn't as much about languages as it is about the people who use them. The key ability is to prove things about programs. Powerful type systems, especially those that have type inference, merely relieve the programmer from some of the most boring parts of the job. Sometimes.
Can you give an example of this overhead?
The point is to explore a comparative difference in value, and that is realized through mastery of the tool, not merely living in a world where it exists.
The inverse is also true; you don't really get the benefits of dynamic typing until you start doing things differently to take advantage of that difference. If you still code like you're in a static language, you'll miss the benefits of a dynamic one.
What exactly does it mean to have "good dynamic typing facilities"?
To quote Peter Norvig on the difference between Python and Lisp (though you could apply it to most other mainstream dynamic languages vs. Lisp):
> Python is more dynamic, does less error-checking. In Python you won't get any warnings for undefined functions or fields, or wrong number of arguments passed to a function, or most anything else at load time; you have to wait until run time. The commercial Lisp implementations will flag many of these as warnings; simpler implementations like clisp do not. The one place where Python is demonstrably more dangerous is when you do self.feild = 0 when you meant to type self.field = 0; the former will dynamically create a new field. The equivalent in Lisp, (setf (feild self) 0) will give you an error. On the other hand, accessing an undefined field will give you an error in both languages.
Common Lisp has a (somewhat) sound, standardized language definition, and competing compiler/JIT implementations that are much faster than anything that could ever possibly come from the Python camp because the latter is actually too dynamic and ill-defined ("Python is what CPython does") and making Python run fast while ensuring 100% compatibility with its existing ecosystem, without putting further restraints into the language, is akin to a mirage.
> What exactly does it mean to have "good dynamic typing facilities"?
The ability to change the structure of your program at runtime will be at the top of the list for me. You can't do that with Ruby/Python.
Picking Common Lisp as an example:
(NOTE: Some of the features are also present in good statically typed languages as well, so what I advocate is to use good, well-featured languages, not really static vs dynamic.)
(NOTE 2: I'm sorry for being such a fanboy, but that thing is addictive like a hard drug...)
0. Code is a first class citizen, and it can be handled just as well as any other type of data. See "macros" below.
1. The system is truly dynamic: Functions can be redefined while the code is running. Objects can change class to a newer version (if you want to), while the code is running.
2. The runtime is very strong with regards to types. It will not allow any type mismatch at all.
3. The error handling system is exemplary: Not only designed to "catch" errors, but also to apply a potential correction and try running the function again. This is known as "condition and restarts", and sadly is not present in many programming languages.
4. The object oriented system (CLOS) allows multiple dispatch. This sometimes allows producing very short, easy to understand code, without having to resort to workarounds. The circle-ellipse problem is solved really easily here. (Note: You can argue that CLOS is in truth a statically typed system, and this is partly true -- the method's class(es) need to be specified statically, but the rest of arguments can be dynamic.)
5. The macro system reduces boilerplate code to exactly zero. And also allows you to reduce the length of your code, or have very explicit (clear to read) code at the high-level. This brings down the level of complexity of your code, and thus makes it easier to manage. It also reduces the need for conventional refactoring, since macros can do much more powerful, automatic, transformations to the existing code.
6. The type system is extensive -- i am not forced to convert 10/3 to 3.333333, because 10/3 can stay as 10/3 (fractional data type). A function that in some cases should return a complex number, will then return the complex number, if that should be the answer, rather than causing an error or (worse) truncating the result to a real. Arbitrary-length numbers are supported, so numbers usually do not overflow or get truncated. (Factorial of 100) / (factorial of 99) gives me the correct result (100), instead of overflowing or losing precision (and thus giving a wrong result).
So you feel safe, because the system will assign your data the data type that suits it the best, and afterwards will not try to attempt any further conversion.
7. The type system is very flexible. For example, i can (optionally) specify that a function's input type shall be an integer between 1 and 10, and the runtime will enforce this restriction.
8. There is an extensive, solid namespace system, so functions and classes are located precisely and names don't conflict with other packages. Symbols can be exported explicitly. This makes managing big codebases much easier, because the frontier between local (private) code versus "code that gets used outside", can be made explicit and enforced.
9. Namespaces for functions and variables (and keywords, and classes) are separate, so they don't clash. Naming things is thus easier; this makes programming a bit more comfortable and code easier to read.
10. Documentation is built into the system - function and class documentation is part of the language standard.
11. Development is interactive. The runtime and compiler are a "living" thing in constant interaction with the user. This allows, for example, for the compiler to immediately tell you where the definition of function "x" is, or which code is using that function. Errors are very explicit and descriptive. Functions are compiled immediately after definition, and they can also be disassembled (to machine language) with just a command.
12. Closures are available. And functions are first-class citizens.
13. Recursion can be used without too many worries -- most implementations allow tail call optimizations.
14. The language can be extended easily; the reader can also be extended if necessary, so new syntaxes can be introduced if you like. Then, they need to be explicitly enabled, of course.
15. There is a clear distinction between "read time", "compile time" and "run time", and you can write code that executes on any of those three times, as you need.
16. Function signatures can be expressed in many ways, including named parameters (which Python also has and IMO is a great way to avoid bugs caused by wrong parameters or wrong passing order.)
Maybe it's simply hard to have both a good type system and a friendly learning curve?
To be honest, i love Common Lisp, it might be the most powerful programming language out there, but it's not easy to learn at all. In part because, being a truly multi-paradigm language, you should better make sure you are well versed in most programming paradigms first, otherwise you won't leverage the full power of Lisp. Not to mention the paradigm of meta-programming and DSLs, something that is usually new to programmers foreign to Lisp.
However, languages like Clojure and Smalltalk can be rather easy to learn, and they are fairly powerful.
Smalltalk was designed to be taught to kids!!
Ruby has some of the best meta-programming facilities out there. Yes you can't manipulate syntax in the same way as lisp, but the fact that all methods are message passing and first class blocks make tons of very powerful meta programming possible. Basic features that look otherwise first class are based on Ruby's meta programming facilities like `attr_reader` and friends. Funnily enough, the meta programming facilities of Ruby are precisely what turns a lot of people off. The wtfs per minute of using something like ActiveRecord is super high for people with only passing familiarity because there's so much that's defined through Ruby's meta programming facilities.
Smalltalk has worlds better dynamic and metaprogramming.
That said, ruby does have a lot of power, but it's not of the same order as self/smalltalk etc.
Ruby has meta-programming facilities, but they pale compared to the easiness of doing meta-programming in Common Lisp. In Ruby, meta-programming is an advanced topic (see for example the implementation of the RoR ActiveRecord). In Lisp, meta-programming is your everyday bread&butter, and one of the first things a beginner learns. Because it isn't too different from regular programming!!
[The same comment applies, mostly, also to Clojure, Racket, Scheme, and the other Lisps]
You ain't gonna find any sane way to combine macros with a powerful type system in a way that doesn't make a 140+ IQ a requirement for any programmer touching the code using these features in a real world project...
Problem with programming language design is that the ideal/Nirvana solutions lie at the edge, or beyond, the limits of human intellect. If you want something that can be learnt and understood with reasonable effort (like in not making "5+ years experience" a requirement for even basic productivity on an advanced codebase), you're going to have to compromise heavily! The most obvious ways to compromise are throwing away unlimited abstraction freedom (aka "macros"), or type systems.
Sorry to break it to ya, but we're merely humans, and not that smart...
I'm very well versed in Python (i've delivered two financial software systems done in Python, written entirely by yours truly). However its features and facilities pale in comparison to the languages i listed in camp "D".
Then there's the whole tooling aspect of trying to mix type systems. It's different lifestyles. Dynamic programmers aren't going to start compiling their code to run it, static programmers aren't going to switch to a language with weaker tooling around the IDE-ish features, which are mostly built on the type system.
My conclusion is this: New languages should all be statically typed, because we shouldn't need new languages at all. We should be fine. The reason we need new languages at all, is because the trifecta of C++/Java/C# basically encompassed the entire statically typed world, but they're all infected with this fully overblown OOP obsession, and the null pointer bug--which newer languages have fixed, through more static typing. Basically we need to replace those languages with similar ones and then just stop making languages for a few decades, until whatever we're doing now looks as dumb as OOP and null pointers. In the long run, Go/Swift/Kotlin/Rust will take over the statically typed world and it's going to be great.
A simple example of this is a list. Now in statically typed languages, lists are homogeneous (this includes type unions). In dynamically typed languages, lists can be heterogeneous, essentially anything can be added at runtime.
In soft typing, we can indicate that a list is homogeneous and the compiler will ensure that this is true or we can specify no type checking (as such) and this will be done at runtime.
Contrived, yes, but I regularly use other aggregates (tables and sets) which I do not want to be homogeneous.
One of the aspects that I like about functional languages is the polymorphism available, but in all that I have come across, there is no way to make a tree or list heterogeneous without declaring union types before hand.
My problem with C#, C++, Java, and their ilk, is that code is multiplied with their generics.
How the IDE and compiler and type systems interact is a design function and is not inherent to any type system.
One of the reasons I don't use specific main stream languages such as C#, C++ or JAVA is that they don't provide the specific programming features that I desire.
I have looked at Go, Swift and Rust and I am not at all impressed by the "relative stupidities" within those languages. For other programmers, what they consider to be "relative stupidities" is entirely up to their experience and outlook.
For four days, I spent debugging a python production script because in one place I had typo'd ".recived=true" on an object and just couldn't understand why my state machine wouldn't work.
And very quickly, the whole team became fans of __slots__ in Python.
I still write 90% of my useful code in python, but that one week of debugging was exhausting & basically wouldn't have even compiled in a statically declared language. Even in python, the error is at runtime, after I got the __slots__ in place.
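For readers who haven't hit this, a minimal Python sketch of the difference follows. The class names are made up and the typo mirrors the one in the story: without `__slots__` the misspelled assignment silently creates a new attribute, while with `__slots__` it raises `AttributeError` at runtime.

```python
class LooseMessage:
    """No __slots__: any attribute name can be assigned."""

    def __init__(self):
        self.received = False


class StrictMessage:
    """With __slots__: only the listed attribute names may exist on instances."""

    __slots__ = ("received",)

    def __init__(self):
        self.received = False


def mark_received_with_typo(msg):
    """Simulates the typo: 'recived' instead of 'received'."""
    msg.recived = True


loose = LooseMessage()
mark_received_with_typo(loose)  # silently creates a new 'recived' attribute

strict = StrictMessage()
try:
    mark_received_with_typo(strict)
    typo_caught = False
except AttributeError:
    typo_caught = True  # 'recived' is not in __slots__
```

As the comment says, this is still a runtime error rather than a compile-time one: it only fires when the misspelled line actually executes.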
Not just, "who the hell uses this", but "where the hell is this defined" as well.
You would think by looking at the code that the creators had a 30 word vocabulary, because 80% of the code uses the same six nouns and four verbs to pass data around and what you use those for depends on the context of who is calling it.
Oh, but the entire thing is written using promises, so most of your function calls have no context. It's hell, and I'm starting to worry that Node has dug itself a reputation hole it will never get out of.
I think the best thing I've found for this personally is coccigrep, which works but I've only used it a couple of times. I'd like something I'd reach for about as often as I reach for grep. (Also I think these days you'd want it to be based on clang or something.)
One thing that does seem to be true is that the requirement to name types means that a textual grep is way more reliable than it is in C. If I want to find all places where a Python class is used in a large codebase, I might as well give up.
People were complaining about static typing for the dumbest reasons (it's too 'wordy'). It started as a backlash against old-school enterprise Java development (which was fair, EJB2 world sucked) but then it went completely off-the-rails. Typing in Java could be better, sure, but even with its quirks it's way better than the nothing you get with dynamic languages. There are a class of bugs you just never need to worry about when the compiler does some compile-time checks for you ... like worrying that you passed the wrong type into a function, or the wrong number of arguments.
Thank God people are coming to their senses.
In my experience, the cost of static typing feels roughly constant per line of code, while the benefits of static typing feel roughly O(N log N) in lines of code or O(N) in number of cross-type interactions. These are just wild guesses based on gut feelings, but they feel about right. The constant factors are affected by individual coder preference, experience, and ability, but also specifics of the type systems involved and the strength of type inference in the tools being used.
In any case, I think often times dynamic typing proponents and static typing proponents have vastly different notions of what a large code base is, or at least the size of code bases they typically use.
One problem is that many code bases start out where the advantages of static typing aren't readily apparent, but re-factoring into a statically typed language is often not realistic, even/especially when a project starts groaning under its own weight.
I'd love to see more mainstream use of gradual typing/optional typing/hybrid typing languages, especially something like a statically typed compiled subset of a dynamically typed interpreted language, where people could re-factor portions of their quick-and-dirty prototypes into nice statically typed libraries.
I disagree; that Java was the standard example of statically typed languages is what convinced me for a long time that I didn't want anything to do with it. Having to pollute my code with all that crap, deal with a lot of dumb restrictions and still have NPEs left me with a sour taste.
Only after I discovered Haskell (and more recently Idris), did I realize that static typing can actually be worthwhile.
There are no senses to come to. Static and dynamic typing each have their own benefits, and there are genuine tradeoffs to choosing one over the other. That we are even having this debate in 2017 shows that the world of typing is not a "solved problem" and there are still good reasons to use one over the other for various reasons.
There is a different problem with the more powerful and useful type systems in more modern statically typed languages though: learning curve. Haskell is dysfunctionally hard to learn and other languages do a little bit better but there's still friction in the learning curve that gets in the way of widespread adoption in projects that want to be able to hire rapidly.
This is the #1 reason I leaned away from static typing. Typescript and Swift have changed my mind. I now see typing as a helpful tool that can solve a lot of problems.
People always say this and it baffles me. Bugs like that should be caught immediately by your test cases. You shouldn't rely on the compiler to catch them for you.
In general though, the more you can formally reason about the program, the more you can automate program transformations (refactoring). Programmers in dynamic languages will argue that the amount of code is far less than the equivalent code in a mainstream statically-typed language, so while the cost of refactoring may be higher per unit, the number of units is less, so the overall cost is the same (or less).
I believe in using the best tool for the job...some use cases would benefit more from static typing, while others would benefit more from using a dynamic language. One of the most important factors is the team and its engineers' backgrounds, preferences, styles, etc.
The realization that with types, smart enough compiler can implement interfaces for me, was amazing.
Unfortunately, I am afraid that it will take a while for these techniques to go mainstream. A.f.a.i.k. the most a mainstream language can do is to fill in the method stubs for you.
There are no diminishing returns. Defining types is easy and enhances code readability.
I don't think they enhance code readability - I think they make it worse. I never look at the type when I am reading code, it just gets in the way.
Large Python code bases are really hard to understand and work in (here).
Of course there are cases where dynamic languages do worse, but it balances out I think.
Unless you think original means Bill Opdyke's thesis work, where the tooling was written for C++ and written in CLOS.
Yes, you can find all references to a method in Smalltalk -- but those references are not separated-out from all the references to other methods that happen to have the same method name but are defined on a different class.
With type information for the receiver and method arguments, we can find just the references we're looking for.
I always thought dynamic typing is a feature for situations where the code needs an extreme amount of flexibility to adapt to a wide variety of data; even at the expense of performance.
Not necessarily. The phrase "dynamic language" actually bundles together a lot of related but distinct ideas: some of those add performance overhead, others don't. "Dynamic" features which are most notable for performance overhead are tagged unions and dynamic dispatch.
Imagine a dynamically typed language which provides types like int, bool, list, functions, etc. and we're free to assign (and reassign) any of those values to any variable we like:
x = 5 # An int
y = true # A bool
y = 3 # An int
z = square(x + y) # An int
It's very common to store such values as a tagged union: a "union" meaning that the data could represent an int, or a bool, or whatever, and "tagged" meaning that there's some extra data which tells us which one it is. For example, we might store `5` as a pair of machine words: `1` to indicate that it's an int, and `5` to represent the data. The boolean `true` might be the pair `2` to indicate bool and `1` for the data. And so on.
This tagging adds overhead. We can reduce it in some cases, e.g. using "tagged pointers" where we use some of the bits in a word for the tag and some for the data. But it's still overhead.
However, there's nothing fundamental about using tagged unions. We could just as easily use 'unboxed' values, i.e. store the int `5` as the machine word `5`; the bool `true` as the machine word `1`; etc. There is no overhead, no metadata, etc. Whilst this is a perfectly reasonable implementation strategy, it means that we cannot tell what type a value is intended to have: e.g. if a value is stored as the machine word `1`, we have no way of knowing if that's the boolean `true`, or the integer `1`, or the character `SOH`, or whatever. This would probably lead to very buggy programs, since we have no type checker to enforce correct usage, and (since there's literally no difference between data of different types) we can't even do runtime assertions like `assert(isInt(x))`.
Likewise, many dynamic languages use dynamic dispatch to choose which functions to call based on the type of data we have. In the above example, we might have `x + y` running an integer addition function when `x` and `y` are integers, or a string concatenation function when `x` and `y` are strings, and so on. That's what Python and Javascript both do. Yet, again, there is no fundamental reason to do this! We can have a dynamic language which uses static dispatch everywhere: consider that in PHP, `+` is only used for numerical addition, whilst `.` is only used for string concatenation. Dynamic dispatch adds overhead, since we need to chase pointers, etc. whilst static dispatch doesn't. This is simply a question of programmer convenience: do we want every function to have a distinct name, and write them out in full each time?
Note that both of these features: tagged unions and dynamic dispatch, can be used in static languages too. It's just that, historically, they tend to be default (and hence unavoidable) in dynamic languages, and something we must explicitly create in static languages. Hence when we compare static languages to dynamic ones, we tend to compare statements like `x + y` in Python to `x + y` in C, which are actually quite different semantically. A fairer comparison would be to compare `x + y` in Python with `callWith(lookUpSymbolFromType("+", lookUpType(x)), x, y)` in C (I've made up those function names, but the point is that the same functionality is there if we want it, but it will be just as slow as if we'd used Python)
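To make the two ideas above concrete, here is a toy Python sketch that models tagged values as `(tag, payload)` pairs and dispatches `+` by looking up the tags in a table. The tag numbers for int and bool follow the example in the comment; the string tag, the `box` helper and the `ADD_IMPLS` table are invented for illustration, and this is not how any real interpreter is implemented.

```python
INT_TAG, BOOL_TAG, STR_TAG = 1, 2, 3  # STR_TAG is an assumed extra tag


def box(value):
    """Pair a raw payload with a tag saying how to interpret it."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return (BOOL_TAG, int(value))
    if isinstance(value, int):
        return (INT_TAG, value)
    if isinstance(value, str):
        return (STR_TAG, value)
    raise TypeError(f"cannot box {value!r}")


# Dispatch table: which implementation of "+" to run for which pair of tags.
ADD_IMPLS = {
    (INT_TAG, INT_TAG): lambda a, b: (INT_TAG, a + b),
    (STR_TAG, STR_TAG): lambda a, b: (STR_TAG, a + b),
}


def dynamic_add(x, y):
    """Look up the implementation of "+" from the tags, then call it."""
    (tag_x, payload_x), (tag_y, payload_y) = x, y
    try:
        impl = ADD_IMPLS[(tag_x, tag_y)]
    except KeyError:
        raise TypeError(f"no '+' for tags {tag_x} and {tag_y}") from None
    return impl(payload_x, payload_y)
```

Note that `box(True)` and `box(1)` carry the same payload `1` and differ only in the tag, which is exactly the information an unboxed representation would lose. The table lookup in `dynamic_add` is the extra work that a statically dispatched `+` avoids.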
This isn't to say that static typing isn't good at this. Just, even with that, there is a lot of effort that goes into making the rich tooling. A lot of very smart and capable folks work hard to make Visual Studio.
vscode seems to figure out the types in javascript without any static typing.
>But it is great for refactoring
Searching for strings isn't that much worse. Also, when it comes to web development, you cross into the client-side and suddenly you can't refactor. So you can only refactor the server-side and end up with a mismatch.
>finding all references to a function or a property or navigating through the code at design time
You can do that without static typing in many cases as well.
>Basically all the features visual studio excels at for .net languages.
When I was working in c# on the server and javascript on the client, I really hated having to go back into c#.
>telling you through a drop down what other options are available from there
vscode seems to be able to figure this out most of the time as well.
I think static typing is necessary when you need performance because all of the fast languages are statically typed.
I think a lot of these things are about organisational complexity and making sure new and average programmers don't screw up the software. It is about large companies trying to manage their organisation, it isn't about the complexity of the code itself.
There are a ridiculous amount of tech companies that have used dynamic languages to go from nothing to the biggest companies in the world and only switched to static languages well and truly after that occurred.
It is actually using TypeScript's engine for this. It uses TypeScript's type definitions where it is available (e.g. most popular libraries and built-ins), and some inference rules where it has to work with plain JavaScript.
While this does give you decent auto-complete in a lot of cases (and still getting better), it's not quite as good as using TypeScript directly.
> Searching for strings isn't that much worse.
I'm busy with a large refactoring project, and TypeScript has been amazing for this. After restructuring code, you can just keep on fixing things until there's no more red. This is not just about reducing bugs - it removes almost any thinking/mental overhead required for the refactoring process.
> Also, when it comes to web development, you cross into the client-side and suddenly you can't refactor. So you can only refactor the server-side and end up with a mismatch.
TypeScript also works very well for client-side. Of course, if you change the API between the client and server, that's a different story.
There are limits to type inference. And if you're going to rely on type inference to prevent runtime bugs you might as well double-down on static typing since you're giving up on some of the 'features' (insanity) of dynamic languages anyway.
Dart 2.0 now mandates strict typing but will allow you syntactic shortcuts as long as the compiler can infer the type (if it can't, you get a compile-time error) - that's a great compromise. I wish more people would love Dart. It's such a great, well-designed, language.
>Searching for strings isn't that much worse.
There are very real limits with what you can do with 'strings'. And yes, it is that much worse.
>When I was working in c# on the server and javascript on the client, I really hated having to go back into c#.
I do not understand that view. You are an alien to me. C# is a beautiful language that fixes a lot of syntactic problems in Java. It is much more pleasurable to write C# code than JS code (outside of dinky 50 line programs).
>I think static typing is necessary when you need performance because all of the fast languages are statically typed.
That's not the only reason but it is one of them. JIT and AOT compilers can do more with strongly typed code.
>There are a ridiculous amount of tech companies that have used dynamic languages to go from nothing to the biggest companies in the world and only switched to static languages well and truly after that occurred.
Sure. PHP (pre-5) and JavaScript made a ton of money for a ton of people. Both languages were integral in the Web revolution. Doesn't change the fact that PHP was a terrible language and JavaScript is still a terrible language.
The unknowable question is, would they have hit that sweet spot anyways by engineering their product more rigorously, and had less pain later? Or would it have impeded them in the exploratory phase of writing and rewriting their code until they hit that spot?
Doesn't it do this by treating Javascript as a statically-typed language (Typescript) and using type inference?
This is a biased sample though. You're saying 'ridiculous number', but the truth is most startups (using static or dynamic languages) fail and we don't actually know if their language choice had much impact on their success or failure.
As you might've noticed, I have been intentionally light on the details and only talked pretty abstractly about the specific scales and factors involved.
I use Cursive https://cursive-ide.com/ for working with Clojure, and it can do safe refactoring for symbols by doing static analysis of the source. It can show all usages of a symbol, rename it, do automatic imports, and so on.
Another piece of tooling that's not available in any statically typed languages at the moment is REPL integration with the editor seen here http://vvvvalvalval.github.io/posts/what-makes-a-good-repl.h...
I find that the REPL driven workflow found in Lisps is simply unmatched. When you have tight integration between the editor and the application runtime, you can run any code you write within the context of the application immediately. This means that you never have to keep a lot of context in your head when you're working with the application. You always know what the code is doing because you can always run and inspect it.
Having the runtime available during development gives you feedback much faster than the compile/run cycle. I write a function, and I can run it immediately within the context of my application. I can see exactly what it's doing and why.
The main cost of static typing is that it restricts the ways you can express yourself. You're limited to the set of statements that can be verified by the type checker. This is necessarily a subset of all valid statements you could make in a dynamic language.
Finally, dynamic languages use different approaches to provide specification that have different trade offs from static typing. For example, Clojure has Spec that's used to provide runtime contracts. Just like static typing, Spec provides a specification for what the function should be doing, and it can be used to help guide the solution as seen here https://www.anthony-galea.com/blog/post/hello-parking-garage...
Spec also allows trivially specifying properties that are either difficult or impossible to encode using most type systems. Consider the sort function as an example. The constraints I care about are the following: I want to know that the elements are in their sorted order, and that the same elements that were passed in as arguments are returned as the result.
Typing it to demonstrate semantic correctness is impossible using most type systems. However, I can trivially do a runtime verification for it using Spec:
;; assumes: (require '[clojure.spec.alpha :as s] '[clojure.set :refer [difference]])
(s/def ::sortable (s/coll-of number?))
(s/def ::sorted #(or (empty? %) (apply <= %)))
(s/fdef mysort
  :args (s/cat :s ::sortable)
  :ret ::sorted
  :fn (fn [{:keys [args ret]}]
        (and (= (count ret)
                (-> args :s count))
             (empty?
              (difference
               (-> args :s set)
               (set ret))))))
The above code ensures that the function is doing exactly what was intended and provides me with a useful specification. Just like types I can use Spec to derive the solution, but unlike types I don't have to fight with it when I'm still not sure what the shape of the solution is going to be.
I can encode all those invariants as QuickCheck properties and have them automatically tested against random inputs on every test run. It's still all runtime verification, but with random inputs I actually have more confidence of hitting a corner case than with just asserting during regular program runs or hand written example tests.
Also, with enough heavy lifting you can actually encode all of that in the types in a dependently typed language like Idris [1]. And while a machine checked proof of your sorting algorithm is nice, it might be hitting the diminishing returns point the article mentions over just using property tests.
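The sort invariants discussed above (output is ordered, output holds the same elements as the input) can be sketched as a randomized property check. This is a minimal illustration in Python using only the standard library, not QuickCheck or Spec; `my_sort`, `is_sorted` and `check_sort_properties` are hypothetical names made up for the example.

```python
import random
from collections import Counter


def my_sort(xs):
    """Hypothetical function under test; here it simply wraps the built-in sort."""
    return sorted(xs)


def is_sorted(xs):
    """True if every element is <= its successor."""
    return all(a <= b for a, b in zip(xs, xs[1:]))


def check_sort_properties(sort_fn, trials=500, seed=0):
    """Run sort_fn on random integer lists and check both invariants.

    Returns (True, None) if every trial passes, otherwise
    (False, failing_input) for the first input that breaks a property.
    """
    rng = random.Random(seed)  # seeded so that failures are reproducible
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        result = sort_fn(list(xs))
        # Property 1: the result is in non-decreasing order.
        if not is_sorted(result):
            return False, xs
        # Property 2: the result is a permutation of the input (multiset equality).
        if Counter(result) != Counter(xs):
            return False, xs
    return True, None


ok, counterexample = check_sort_properties(my_sort)
print(ok, counterexample)
```

Comparing multisets with `Counter` is slightly stricter than a check based on element count plus set difference, which would accept `[1, 2, 2]` as the result for the input `[1, 1, 2]`.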
Which is not the argument made by anyone. Indeed, I explicitly acknowledge that there is a certain fraction of use-cases not covered by the builtins. So there isn't really any disagreement about this.
The question is how large this fraction is, how much it would benefit and how inconvenient/costly the existing workarounds are. Like all engineering questions, these are impossible to talk about when dealing in absolutes. And once you actually talk about these questions quantitatively, I achieved the goal I had with the post - to change the debate into a quantitative one explicitly acknowledging the tradeoffs involved.
> Architecture astronauting can be prevented with best practices and code review, not with language limitations.
I work at a company which has probably one of the highest standards in regards to code review in the industry. As such, I disagree with you that it is effective in addressing this.
> It’s a fool’s errand to try, code generation allows you to get all the complexity and more of generics.
If that's the case, where do the complaints come from about the lack of generics? It seems that Go really has generics then, in your opinion?
Of course, that's a strawman and a misrepresentation of your argument. But what makes this a strawman, the difference between the existing workarounds and actual generics, is just as effective an argument for your side as it is one for my side. Because codegen is made so inconvenient, people bias heavily towards using the builtins, away from custom data structures, if they can at all get away with it. Thus greatly reducing the overall complexity of the codebase.
So it would seem to me, that this argument is logically flawed. Either codegen is a poor replacement, thus leading to people using less generic code, thus there is an effective reduction in complexity. Or codegen has the same effect on complexity, which would mean it is used just as much, meaning it can't be that bad a workaround.
I don’t agree that using piles of built in objects makes the code easier to understand. If I want a Tree<Node, Node, Value>, how is using lists of lists and integer pairs making my code easier to reason about? Or using code generation to make reams of classes that create a Tree for everything I want, and anyone who uses my functions? How is encouraging either of those things a positive?
The author makes it clear that the analysis is not perfectly rigorous. There is a very wide landscape between perfectly rigorous and completely useless.
Do you think the article fails to hint at any of the fundamental dynamics of how type systems affect software development? How so?
For one example, I don't think it's a given that the green line (velocity vs % type-checked) should have a negative slope. Maybe in some cases, for some projects or some people, but certainly not universally. At least part of it would have been positive on almost all projects that I've worked on, and I'm not doing rocket science.
Then, the combined chart just looks at the amount of bug-free output, completely ignoring the amount of bug-ridden output. That latter part doesn't just get discarded, it needs fixing, and bugs that were only discovered in production are expensive to discover, debug and fix.
This is in addition to pretty much every other top level comment in this thread, a lot of which bring up important points that are unaccounted for even conceptually in the charts.
Disagree here, actually! Javascript (Typescript) and Python (mypy) are both seeing pretty big benefits from adding gradual typing.
This makes it much more likely to be used but it's fundamentally the same set of ideas.
A really cool idea I'm playing with at the moment is using fuzzing/static analysis based generators to feed spec/test.check.
I think it will help get past the, imo, biggest issue with generators in that they can miss exceptional cases in the code.
E.g. if (x == "jack and Jill") { exceptional case } is unlikely to be triggered with standard generators but "easy" for static analysis tools to solve.
> Also, with enough heavy lifting you can actually encode all of that in the types in a dependently typed language like Idris [1]
In theory. In practice it is multiple orders of magnitude harder to prove properties in Idris than it is to spec them using property based testing.
Somebody has to be able to read that specification and understand that it's correct in a semantic sense. Ultimately, the specification itself becomes a full blown program that the type checker executes. So, now you run a program to try and verify aspects of your original program, but how do you verify that the specification itself is correct?
At some point a human has to be able to read the code and decide that it matches the intent. This step can't be automated, and I certainly don't think the Idris example improves things. I'd argue that it's far easier to tell that this version is correct:
fun insertionSort(arr, int n)
{
    var i, key, j;
    for (i = 1; i < n; i++)
    {
        key = arr[i];
        j = i-1;
        while (j >= 0 && arr[j] > key)
        {
            arr[j+1] = arr[j];
            j = j-1;
        }
        arr[j+1] = key;
    }
}
At one of my clients they had one mainframe developer left that knew their systems. She had already tried to retire, but they got her to agree to stay on for 5 years in return for bags filled with money. That meant they had 5 years to rewrite on a platform they could actually hire people for. 5 years to replace a system with decades of history.
Groovy's still great for scripting on the JVM though, for stuff like those 10-liner build scripts for Gradle, glue code, and mock testing. Just don't use Groovy for building systems -- use a language based on static typing from the ground up, like Java, Scala, or Kotlin.
You keep saying this repeatedly but it just isn't true:
https://github.com/grails/grails-core/blob/master/grails-cor...
https://github.com/groovy/groovy-core/blob/master/src/main/g...
A large portion of advanced Haskell type system features seem to be about emulating things you could do with side-effects. I guess I prefer Rust's approach to managing side-effects, or even just Scala's implied convention of: use 'var' very sparingly, and mostly locally. Yes, some guarantees get traded away, but so much simplicity is gained.
I'm not very experienced with Haskell, but I've written a fair bit of Scala and I've utterly failed to see the value in scalaz and similar libraries, despite trying them a few times. They always seem to add lots of complexity without a tangible benefit.
Coming at it from another angle, I just don't see many cases where I feel I have to repeat myself due to a shortcoming of, say, Java's or C#'s type system. If I could add one feature to either, it'd actually be support for variadic type parameters.
Both Java and C# tend to rely heavily on frameworks such as Spring to work around issues with the expressivity of the languages. This causes problems when one needs two frameworks (they don't in general compose). In Haskell, HKTs allow one to write polymorphic programs that are parametric with respect to certain behaviours and dependencies, no dependency injection framework needed.
Please don't judge Haskell using Scala and scalaz.
And I think an important take-away should be, that this perception is entirely subjective and colored by both of our experiences, preferences and the kinds of problems we work on :)
Some code sticks around because it's great at what it does. Some code sticks around because it works if you don't touch it and is impossible to delete due to various kinds of dependencies.
Most POSIX APIs, for instance, are confusing, obtuse and unnecessarily imperative but still good enough in spite of being 40 odd years old. There's way too much code that implements or calls them to justify making significant changes at this point.
Especially in SOA, it can be cheaper to replace a poorly written service than trying to rewrite all of it over time.
Null pointers are a type error. The fact that several nominally "statically" typed languages don't differentiate between nullable and non-nullable types is a significant source of failure in their type systems. Using a modern language that properly identifies nullable values as a distinct type from non-nullable ones goes a long way towards eliminating a whole host of problems. It will be interesting to see what things look like in 10 years or so once Rust has had time to really displace a significant portion of the C and C++ code in the wild, and hopefully Kotlin has killed off Java (and if we're really lucky Typescript has done the same more or less with Javascript).
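To make the nullable versus non-nullable distinction concrete, here is a small sketch using Python type hints. The functions `find_user`, `shout` and `greet` are hypothetical, and the annotations are only enforced if a static checker such as mypy is run over the code; Python itself ignores them at runtime.

```python
from typing import Dict, Optional


def find_user(users: Dict[str, str], key: str) -> Optional[str]:
    """Look up a display name; the return type says the result may be None."""
    return users.get(key)


def shout(name: str) -> str:
    """Requires a real string; None is not part of the declared type."""
    return name.upper() + "!"


def greet(users: Dict[str, str], key: str) -> str:
    name = find_user(users, key)
    # Calling shout(name) right here would be flagged by a checker such as
    # mypy, because name is Optional[str] and shout expects str.
    if name is None:
        return "nobody home"
    # After the None check the checker narrows name to str, so this is accepted.
    return shout(name)


print(greet({"a": "ann"}, "a"))
print(greet({}, "missing"))
```

The point of the sketch is only that the "might be absent" case is part of the type, so the check cannot be forgotten silently once a type checker is in the loop.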
I can say from my experience it is definitely possible to maintain large codebases in Python. The type errors of the superficial variety that the OP refers to were usually caught before they made it to production (and were rare besides if you were experienced enough to avoid them). It requires discipline to maintain tests and write code in a way that avoids errors.
I've been learning OCaml and Haskell for a couple of years along with formal methods using TLA+ and Lean. I used to think type theory was the accounting of maths. I still think that's at least partly true but the power it brings you as a programmer is considerable.
I find working with Haskell or OCaml to be much more productive. Instead of stepping through a debugger or following tracebacks (a descriptive error) I get prescriptive errors as I make changes to a Haskell codebase. The propositions in the type system form a much better specification than unit tests alone.
I still like Python and C for many reasons and will continue using them where appropriate. However I think Haskell/OCaml offer quite enough power that everyone should at least consider what they bring to the table.
So prove it yourself. Proving things about programs that rely on dynamism is invariably much harder than proving things about programs that don't.
You'd have thunk I didn't have to make them then. But I did, judging from literally every argument I had about this.
Dynamic languages can always lean on runtime features, but that's also their peril. Late binding deprives you of leveraging the tools in favor of "trust me".
In both cases you can get a maintenance nightmare, of course. The point as I see it is to move things toward the runtime when the error case is not troublesome, and towards the compiler when automating in more safeguards would help.
But I've never had the luxury of choosing a tech stack for its purity of design. So the feature is wonderful in my day to day regardless.
I'd love to build 3D experiences in a language like lisp or scheme. It'd be great fun to learn but I don't currently have the luxury of the time it would take to ramp.
And I certainly don't have the political capital to convince my entire dev team to change.
There's a good reason people generate code instead (in statically and dynamically typed languages!): performance.
However, in a statically-typed language, you must satisfy the type checker for everything, which adds development time. In reality, there might be a small percentage of functions in your code base for which errors (either compile-time or run-time) would likely crop up, yet you must pay that cost for 100% of them.
So that's really where the debate comes from.
Dynamically typed languages can get around this problem by generative testing (in Clojure's case) which allows very fine-tuned aspects of your system's requirements to be automatically tested before run-time without writing tests, which offers some of the same confidence as a compiler.
Performance is a concern for some projects - you wouldn't write an OS, database kernel, or mainstream game engine in a dynamic language.
How is that not a valid concern in the dynamic vs static typing argument? The parent comment has a legitimate point.
Python is strongly typed, but dynamic. But slow. JavaScript and PHP are weakly typed and dynamic as they will coerce types in strange ways during operations and comparisons.
Lua is dynamically and strongly typed like Python, but LuaJIT can sometimes produce code on par with or even slightly faster than native code - because it's really JIT compiling the hot path to native code with some guards and offramps to interpreted code for special/unexpected cases.
But there are limits to those techniques and it's doubtful that dynamic languages will ever perform at the same level as static languages because the compiler simply has more information and doesn't have to be as pessimistic or insert as many runtime guards.
http://hackage.haskell.org/package/hotswap
Or this:
If these things are so good why does no one use them? EVERYONE using Erlang is using the same hotswapping facility. This sort of dynamism is just fighting against the language in an environment like Haskell.
Erlang has dialyzer, which is great but it's based on optimistic typing. Hot-reloading aside, what would be the issue to creating such language? Maybe some issue with the process pids, which are quite dynamic?
Usually I use normal keywords for throwaway or glue code, but anything important (my actual domain entities) will be namespaced, allowing (relatively) pain free refactoring.
Cursive has a few good refactoring tools/shortcuts, but I would also like to see extract/inline/move for functions.
It's not a huge problem, just something I miss from the static languages.
https://github.com/Microsoft/TypeScript/wiki/JavaScript-Lang...
> Visual Studio 2017 provides a powerful JavaScript editing experience right out of the box. Powered by a TypeScript based language service, Visual Studio delivers richer IntelliSense, support for modern JavaScript features, and improved productivity features such as Go to Definition, refactoring, and more.
It works off Typescript type annotations for packages and the Typescript compiler's type inference. VS Code's Javascript editing does not seem to be evidence for tooling around dynamic languages so much as it's evidence that if you have a lot of money and Anders Hejlsberg you can write a compiler for a statically typed language with type inference that looks like a specific dynamic language.
Are there not definitions written by someone or a tool that vscode looks at?
That is what I saw using some npm packages with typescript: http://definitelytyped.org/
I found this extremely weird and (or hence) interesting at the same time. Where can I read more about this?
http://www.gigamonkeys.com/book/beyond-exception-handling-co...
they're (hand wave hand wave) basically a matter of stuffing the current continuation inside of the exception, whenever you throw an exception, and then making use of that to provide more options whenever the exception is caught.
Nobody ever talks about that little horror when they bring up how Smalltalk could do refactoring JUST FINE without static types. It wasn't just fine, turns out.
In Smalltalk the parameter names are part of the function name to reduce the likelihood of name clashes. You still have the same problem when you have two functions with the same name and parameters.
Getting a language to the point where it's workable for serious projects is a lot of work. Rust is getting there with the backing of Mozilla, Haskell has made some decent strides too (but still has a way to go, and I think has some pretty fundamental flaws entirely apart from the type system). It'll be probably another decade at least before we see any dependently-typed languages getting a serious foothold, but I do think they're going to become a lot more common eventually.
IMO, machine assistance is useful to the extent it relieves us humans from work. In particular, types are useful to the extent they can be inferred. Beyond that, you still need to prove the correctness of your programs on your own, so there is no point to the ceremony of writing down those proofs in a machine-checkable format.
It's also self-evidently wrong. Agda was first released in 1999, ten years before Go. If you use a wallclock interpretation of "maturity", Agda is twice as old as Go and Idris is roughly as old as Go. Both are used significantly less (by several orders of magnitude), though. Despite them having a far stronger type-system.
If, on the other hand, you are using a "developers' time" interpretation of maturity, you are making a circular argument, i.e. "Agda is seeing less use, because it has been used less", as resources invested in a language ecosystem tend to be strongly correlated with its usage.
Essentially the types being used are so complex the type checker cannot automate the decision about two types being compatible so you have to write maths proofs to help. The types used in mainstream languages are simple enough that the type checker never needs help like that.
The cost of formal verification right now is immense but the benefit is close to bug free code. Mainstream strongly, statically typed languages require nowhere near the same amount of effort and give clear benefits over dynamic types.
No, they are not.
1. If the code doesn't conform to the contract, it will fail on the contract boundary, with a well-defined error. If this is useless, then `assert` is also useless, which it is, of course, not.
2. With a sufficiently well-designed language and sufficiently smart compiler, you can move some contract checks to compile-time. See Racket.
3. If your language supports both static and dynamic typing, the contracts are a dual of static types, which lets you interface the static and dynamic parts of the code seamlessly and automatically (in both directions). Again, see Racket.
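Point 1 above, a contract violation failing at the boundary with a well-defined error, can be sketched as follows. This is a hand-rolled illustration in Python, not Racket's contract system; the `contract` decorator, `ContractError` and `sort_numbers` are hypothetical names invented for the example.

```python
import functools


class ContractError(AssertionError):
    """Raised at the function boundary when a contract is violated."""


def contract(pre=None, post=None):
    """Attach an optional precondition and postcondition to a function.

    pre receives the call arguments; post receives the result followed by
    the call arguments. Both should return True when the contract holds.
    """
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise ContractError(
                    f"{fn.__name__}: precondition failed for args={args!r}")
            result = fn(*args, **kwargs)
            if post is not None and not post(result, *args, **kwargs):
                raise ContractError(
                    f"{fn.__name__}: postcondition failed, result={result!r}")
            return result
        return wrapper
    return decorate


@contract(
    pre=lambda xs: all(isinstance(x, (int, float)) for x in xs),
    post=lambda result, xs: len(result) == len(xs),
)
def sort_numbers(xs):
    return sorted(xs)


print(sort_numbers([3, 1, 2]))
try:
    sort_numbers(["a", 1])
except ContractError as err:
    print("rejected at the boundary:", err)
```

The error names the function and the offending arguments at the call boundary, rather than surfacing later as an unrelated failure deep inside the callee.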
Meta: I wonder, why it's mostly static-typing proponents who aggressively evangelize, insult the other side, are 110% sure they're right even though there is no scientific evidence and so on. Could it be the bondage&discipline approach of static typing just appeals to people with a certain mindset, who are statistically more probable to engage in such behaviors, no matter the subject?
1. With pencil and paper, you don't need to wait for a smart compiler - you can get started proving things about your programs today!
2. I can totally see what's coming next: “Being wrong is dual to being right, so being wrong is another possibility worth exploring”. Right?
> Meta: (slander)
No comment.
The point of having macros is that they allow you to solve problems that cannot be solved elegantly in any other way. But 95% of programmers don't need to solve such problems and can do very well without using macros.
Thanks, but.... NO THANKS! It's basically what makes languages like Scala or C++ horrible - a false sense of "you only need to know this subset of the language" and then you see that in real life: (1) nobody agrees on what that subset is and (2) you are going to have to hack your way through the most advanced frameworks and libraries (written by folk way smarter than you) and you are going to need to do it under unreasonable time pressure!
If a feature exists in the language you will be forced to understand it and become proficient at using it, whether you like it or not. Otherwise you're a "play pen programmer", only comfortable in his little patch of expertise.
I'm personally an "Expert Generalist" and like to be confident I can hack my way through anything this shitty life throws at me ;) This is kind of why I'm starting to love forcefully minimalistic, abstraction-wise-rigid, and intentionally "retarded" languages like Go nowadays :) (But yeah, when dynamic is the way to go, I'd prefer a Lisp with macros any time - one extreme or another, never the middle way, I'm not smart enough for it.)
That only works when you work by yourself (or in a small team to whom you can dictate the language subset), and without any third party code.
> But 95% of programmers don't need to solve such problems and can do very well without using macros.
Languages that have great macro systems use them to bootstrap themselves. So when you use the standard, documented features, you're using macros.
E.g. if you're writing in Lisp and your file begins with (defun ..., you've just used a macro.
Is it really? What about templates(as in C++ templates), macros, CSS and HTML? These are two examples of metaprogramming and two DSLs respectively.
> However, languages like Clojure and Smalltalk can be rather easy to learn, and they are fairly powerful.
It's not like I don't believe you, but if this is true, then where's the popularity? Why isn't it there? I'm asking because I genuinely don't know.
EDIT: punctuation.
"Lisp macros" go far, far beyond "C macros" ("preprocessor macros"), and indeed go far beyond what you can do with C++ templates. You should take a look, but basically, explained in a few words:
In Lisp, code is data. Code is a first-class citizen. The functions and constructs that are there to manipulate data, also manipulate source code with the same easiness. So your code can manipulate code very, very easily. Writing code that creates code "on the fly" -be it at compile time or at runtime- is not only possible, it is also very easy to do, and it is 95% similar to writing regular code.
Thus, Lisp is sometimes described as "the programmable programming language."
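For readers who have not used Lisp, a rough analogue of "code that manipulates code" can be sketched in Python with the standard `ast` module. It is only an approximation: in Lisp the code is already an ordinary list, while here the source must first be parsed into a tree. `SwapAddForMul` and `rewrite_and_eval` are hypothetical names for this example.

```python
import ast


class SwapAddForMul(ast.NodeTransformer):
    """Rewrite every `a + b` in an expression tree into `a * b`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested sub-expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


def rewrite_and_eval(source):
    """Parse an expression, transform its syntax tree, then evaluate it."""
    tree = ast.parse(source, mode="eval")
    tree = SwapAddForMul().visit(tree)
    ast.fix_missing_locations(tree)
    return eval(compile(tree, "<rewritten>", "eval"))


print(rewrite_and_eval("2 + 3"))      # evaluated as 2 * 3
print(rewrite_and_eval("2 + 3 * 4"))  # evaluated as 2 * (3 * 4)
```

The extra ceremony (parse, visit, fix locations, compile) is the part that a Lisp macro avoids, since the macro receives the code as plain data and returns plain data.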
>It's not like I don't believe you, but if this is true, then where's the popularity?
"Programming is pop-culture" -- Alan Kay.
The reason a programming language gets highly popular is not always related to its quality. There are also other reasons. Consider Javascript for example. Before the ES6 specification, it was plainly a horrible programming language, full of pitfalls and missing features. You couldn't even be sure of the scope of the variable you just declared!! But it became popular, simply because it was the only programming language usable on all web browsers.
C, for example, was never a great programming language. But it ran efficiently on any hardware, so it started as a (very good) alternative to assembler. And then got more traction.
Then object-oriented programming got popular, because it allowed you to do nice stuff (on the Smalltalk language, where it was very well implemented). So somebody said: ok, i want C with object orientation, and C++ was invented, which wasn't a very good object oriented language, but since C was popular, and OOP was the next big thing, it got wildly popular.
C and C++ languages require you to manually manage memory, unlike in Smalltalk or Lisp, where the memory was automatically managed. So somebody at Sun said "ok, let's make a language with syntax similar to C++, but with automatic memory management", and Java was born, and thus, due to the small learning curve, and a LOT of marketing, went wildly popular, although many of the problems of C++ were present, plus it introduced limitations of its own. (I, as a student, loved Java when i learnt it, after having to use C++. How naive i was!!)
And the story goes on and on.
So it's more about riding the wave of popularity, rather than using the best tool for the job. It has also something to do with the triumph of UNIX over other operating systems. Otherwise, Smalltalk [what the groundbreaking Xerox machines used] and Lisp [what the groundbreaking Lisp Machines, and also the Xerox machines used] would be way more popular.
It also has something to do with speed -- Lisp (in the 60s) used to be a very slow language. Smalltalk (in the 70s and early 80s) used to run very slow as well. They also required a huge amount of memory. Nowadays they are not really memory hungry, and they can run very fast.
Some problems are much easier to express in Prolog, or Haskell, than Java or C++ or javascript; but they aren't popular languages. Popularity sometimes is harmful...
I think these timing benchmarks come after the jit warmup procedure, so the presumption is that the compilation cost is amortized over lots of runs in an HPC-type setting:
We're still doing this, for example the last production spec of the Java language finally incorporated a mechanism to pass functions as input parameter on a method. And the next version of Java (9) will attempt to have some interactivity, with a kind of REPL. This, coupled with the powerful facilities of good Java IDEs, will give Java developers of 2017 the level of Interactivity and easiness of development that Smalltalk and Lisp users have enjoyed since the late 70s. Sad but true.
Julia -an interesting language, by the way- borrows multiple dispatch from the Common Lisp Object System (CLOS), among other features. CLOS itself was a further evolution of the OOP brought to the table by Smalltalk, invented by a true genius: Alan Kay.
Rust is basically a "fixed C++", that is, a more usable, less annoying C++.
So it's difficult to say there are truly new things in programming languages. But not everything is limited to Smalltalk and Lisp -- Prolog, ML (and OCaml, F# and Haskell) do bring new concepts to the table, and are worth checking out.
There's a reason the expression "fighting the borrow checker" was coined.
You're confusing your own biases and preferences for facts.
Also they are backed by Clang and muuuuuuuuuuuch faster than in Visual Studio + whatever paid extension
A very nice feature it has and that I didn't see elsewhere is the optional case-preserving renaming, e.g. renaming Foo to Bar will also change foo to bar and FoO to BaR.
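How such a rename might work can be sketched in a few lines of Python. This is a guess at the behaviour described, not the tool's actual implementation; `match_case` and `rename_preserving_case` are hypothetical names, and the sketch assumes the old name is non-empty and that case is copied position by position.

```python
import re


def match_case(template, replacement):
    """Copy the upper/lower pattern of template onto replacement.

    If replacement is longer than template, the case of the template's
    last character is reused for the remaining characters.
    """
    out = []
    for i, ch in enumerate(replacement):
        src = template[i] if i < len(template) else template[-1]
        out.append(ch.upper() if src.isupper() else ch.lower())
    return "".join(out)


def rename_preserving_case(text, old, new):
    """Replace whole-word occurrences of old, in any casing, with new."""
    pattern = re.compile(r"\b" + re.escape(old) + r"\b", re.IGNORECASE)
    return pattern.sub(lambda m: match_case(m.group(0), new), text)


print(rename_preserving_case("Foo foo FoO food", "Foo", "Bar"))
```

The word-boundary anchors keep the rename from touching longer identifiers such as `food`; a real refactoring tool would work on resolved symbols rather than on raw text.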
1. The language specification is very big. This is true, it is a very big specification. On the other hand, this is mostly caused because the language spec also includes the spec for its own "standard library", unlike what happens in C or Java, for example, where the Std. lib is specified elsewhere. CL's "standard library" is very big, because there are many, many features.
The other reason the spec is so big, is that this is a language with a lot of features - you can do high level programming, low level, complex OOP, design-your-own OOP, bitwise manipulation, arbitrary precision arithmetic, disassemble functions to machine language, redefine classes at runtime, etc etc etc.
Probably the extreme of the features is that there is a sql-like mini-programming language built in just for doing loops (!), "the LOOP macro". On the other hand, you can choose not to use it. And if you use it, it can help you write highly readable and concise code. More info:
http://cl-cookbook.sourceforge.net/loop.html
2. The "cruft"; Common Lisp is basically the unification ("common") of at least two main Lisp dialects that were in use during the 70s. So there are some parts (mind you, just some) in which some naming or function parameter orders could have been more consistent; for example here everything is consistent:
;; access a property list by property
(getf plist property)
;; access an array by index
(aref array index)
;; access an object's slot
(slot-value object slot-name)
... but here the consistency is broken:
;; gethash: obtain the element from a hash table, by key
(gethash key hash-table)
There are also sometimes things that seem to be redundant, like for example "setq", where "setf" can do everything you can do with "setq" (and more); or for example "defparameter", and "defvar" where in theory "setf" might be enough. But there are differences, and knowing such differences helps to write more readable and better code. And it's really nitpicking, for these are easy to overcome.
3. Because of the above, CL is often criticized because of being a language "designed by committee". But, unlike other "committee-designed languages", this one was designed by selecting, from older Lisps, features that were already proven to be a Good Thing, and incorporating them into CL without too many changes. So you can also consider it to be "a curated collection of good features from older Lisps..."
4. Scheme, the other main "Lisp dialect", has a much, much smaller and simpler spec, so it's easier to learn. But on the other hand this also means that many features are just absent, and will need to be implemented by the programmer (or by external libs), without any standardization. On the other hand, due to the extensive standardization, usually Common Lisp code is highly portable between implementations, and often code will run in various CL implementations, straight away, with zero change.
Historically, Scheme was more popular inside the academic community while Common Lisp was more popular with production systems (i.e. science, space, simulation, CAD/CAM, etc.) Thus, there used to be an animosity between Schemers and Lispers, although jumping from one language to other is rather easy...
JavaScript is up to 885 pages. https://www.ecma-international.org/publications/files/ECMA-S...
The C++ 17 draft: 1623 pages. https://www.ecma-international.org/publications/files/ECMA-S...
C, which "is not a big language, and is not well served by a big book", according to Kernighan and Ritchie's 1988 introduction in the K&R2, is up to 683 pages in C11. Almost triple the size of C90's 230 pages.
How about something non-language? USB 3.2 spec (just released Sep 22): 100+ megabyte .zip file download. Up from 2.0's 73.
1. This is a serious criticism, but it has nothing to do with soundness.
2. There is absolutely nothing wrong with a language being designed by a committee, so long as the committee's members are all competent.
3. Back to 0.
Perhaps I should have written "... nontrivially statically typed."
And there are all sorts of optimizations like this that simply aren't available to dynamically typed languages. Tracing JIT can only take you so far.
Yes, in the standard library, an immutable object can hide mutable state in e.g. a 'Mutex', and effects to the external system aren't wired through anything like monads or unique objects.
I see those as compromises Rust makes in the name of pragmatism and being a system'ey language. I don't necessarily like all of them, but I find the general uniqueness typing based approach interesting.
See articles comparing Clean and Haskell for an interesting historical perspective, including how both approaches could be used to model side-effects in a purely functional language. Haskell "won", possibly because it was seen as more generic and composable. I always felt Clean's approach had merit too, so I was really glad to see Rust bring the idea, or a closely related idea, to prominence.
Passing a unique world object around is effectively the same as composing with the IO monad, and borrowing 'f(&mut world)' is basically equivalent to 'let world = f(world)'.
Maybe someone will one day write a standard library in that style.
You can pretty much use any language in that scenario. But the chickens come to roost around day 30+ or so.
Meanwhile real-world non-trivial projects tend to be large projects.
Nice to be prepared for that instead of gambling on "surely we'll extract out smaller projects that fall on the right abstraction boundaries in the face of unknown future requirements."
How big are the "real-world codebases" you're talking about, and how many programmers are working on the code? Once you hit 5-10 million lines of code and/or thousands of developers, static typing really helps manage complexity.
Well that's provably false, because there exist properties that you can type check that literally can't be verified via unit tests, even in principle. For instance, race and deadlock freedom.
Are you kidding? Python programming's my day job but Haskell is an order of magnitude easier to refactor.
C# has a very weak version of this with the var keyword. Languages like Crystal take it much further by tracing the flow of data through the entire program. It generally works quite well, though there are a few edge cases that require explicit type annotations.
As for auto-completion, some languages feature designs that make it easy to offer auto completion even without type information. For instance, Elixir doesn’t have methods. You only have functions defined on modules, and it’s trivially easy to know what functions are defined on a module.
So it’s possible, but there are some limitations.
> C# .... Crystal
Both are fully statically typed languages with type inference. Those are unrelated to the argument the parent comment is making.
That would imply there's no type inference possible with dynamic languages - either at "compile" time or run time.
(Wouldn't it also imply that static typing and strong typing are synonymous?)
I can and have built a totally untyped language within a fully statically typed language - nostrademons' Scheme-in-haskell exercise is a lot of fun.
You just have to define a Universal type and then back out of all that nasty compile-time nonsense. Everything is Univ and Univ is everything.
fn foo<T: Positioned>(x: &T) -> Point {
    x.position()
}
Note that this doesn't work in languages that use templates for generics, like C++, where templates work more like compile-time duck typing.
For OO languages and methods it does get a little guesswork-y, and tools often offer a wider range of guesses than is correct for completion. But you can show the class along with the offered completion, so it's not too bad.
I struggle to think of any benefits of dynamic typing on a reasonably sized code-base (e.g. 50k+ LOC).
There's always going to be an agility multiplier so long as VC is flowing and getting an mvp and a high head count is worth more than any kind of sustainable product.
I expect any programmer to be able to transfer their skill set to a new programming language - especially when we're talking about mainstream languages with a GC.
>There's always going to be an agility multiplier so long as VC is flowing and getting an mvp and a high head count is worth more than any kind of sustainable product.
What kind of agility multiplier? Maybe it makes a difference during a hackathon where you have 24 hours to put something together or maybe if you're putting together shell scripts ... but taking something to MVP takes weeks or months - mainstream dynamic languages simply don't have any edge in development speed in those instances.
Being able to define and initialize types at runtime offers more flexibility and it's quicker to develop in.
I'm frankly surprised nobody brings it up more often but the prototypical example of "static typing done right" - Haskell - is talked about nearly constantly but when I look for actual software I might use that's written using it.... there's so little it's almost embarrassing. One obscure window manager, one obscure VCS, facebook's spam filter and some tool for converting markup.
Given a sprinkling of asserts, a decent linter and a high level of test coverage I don't see much of a benefit to adding static typing.
Quicker if you're writing shell scripts or small single-purpose applications. Not quicker if you're adding to a codebase of any significant size.
And yet, it's done all the time with great results.
I just don't understand what is so horrible and inexpressive about a static initialization block.
The only possible purpose I see to Spring is if for some reason you really need to be able to change how your dependencies are injected at runtime. (90% of Spring apologists point to this, and 99% of them never use it in practice.) Even then, I don't see how a Spring XML config file (which I have seen run to 4000+ lines, to my horror) is better than just reading some settings out of a properties file to pick an implementation in your static initializer.
Java's static initialiser blocks are too dangerous whenever one has threads.
But I think I get your general point: things like 'Control.Concurrent.Async' ('async'/'await') and 'Control.Monad.Coroutine' ('yield') are libraries that implement some very generic type classes: 'Functor', 'Applicative', 'Monad'. This then lets you use features that are generic over those type classes ('do' syntax, 'fmap', ...).
It's been many years since I had a proper look at Haskell. Maybe it just takes more practice than I had back then to fully "get it". But I still don't see those abstractions being that useful in everyday programming. They seem to have huge potential for hard to follow code as you need to mentally unpack and remember more layers of abstraction, and the gain is not clear to me. Even the features that have trickled down to C# are not _that_ crucial I feel. The way mainstream languages pick the most useful use cases of those abstractions seems pretty OK to me.
(Also, macros and compiler plugins are another interesting avenue towards very powerful abstractions, with a different set of problems.)
As for Spring and dependency injection, I don't follow how HKTs would help there. Could you give an example? Aren't DI frameworks mostly about looking things up with reflection magic to automate, and arguably just obfuscate, the task of wiring things up in 'main'?
That's the beauty of abstraction without side effects, you don't need to unpack anything. If you know what the inputs are and the outputs are, you don't need to know how it works or what type classes are even used to transform certain things.
People use sequence all the time in Scala, not realizing it's only able to be implemented with HKTs of Applicative and Traverse. FYI, sequence flips a list of Futures to a Future of List, or a vector of Trys to a Try of Vector, etc.
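To make the "flip" concrete, here's a minimal Python sketch for a single container only: optional values, modeled as None. The name sequence_optional is made up for this example and is not Scalaz's API. Note that without HKTs you'd have to write a separate version like this for Future, Try, and so on, which is the point being made above.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def sequence_optional(items: List[Optional[T]]) -> Optional[List[T]]:
    """Flip a list of optionals into an optional list.

    Returns None as soon as any element is None, otherwise the list of values.
    (Hypothetical helper, for illustration only.)
    """
    result: List[T] = []
    for item in items:
        if item is None:
            return None
        result.append(item)
    return result
```

So sequence_optional([1, 2, 3]) gives back the list, while a single None anywhere makes the whole result None.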
I don't buy your point about not needing to unpack side-effectless code, however. There are _always_ reasons to dig into code, be it bugs, surprising edge cases, poor documentation, insufficient performance, or even just curiosity. And those high-level abstractions tend to be visible in module interfaces too. I remember some Haskell libraries being very hard to figure out how to use if you didn't know your category theory :)
What I meant by "polymorphic programs" as an alternate to DI, is something like this:
doStuff :: HasLogger m => Input -> m Output
The effectful function "doStuff" above is polymorphic with respect to which logging implementation is used, it could even be one that uses IO. All made possible with HKTs.
Your DI example seems to be an example of my earlier point about "emulating things you could do with side-effects". No HKTs are needed when you just pass an impure side-effectful Logger object. Or, as discussed in another subthread, you could do side-effect management with Rust-style uniqueness typing, which results in a less elegant but arguably easier to use type system. It's debatable, but it seems people struggle less with the borrow checker than with advanced Haskell.
Also the compiler usually tells you what is it exactly that you screwed up this time and how to get out of this mess, which cannot be said about C++.
If fighting the borrow checker is annoying, that's because you don't get memory safety otherwise.
The vast majority of vulnerabilities in the wild are created because of sloppy usage of C / C++, which is basically unavoidable in absence of expensive static analyzers that become as annoying as Rust, while not being as good.
...but there's plenty of sales charlatans out there who will say "I intend to get the manager of that IT dept to make its developers use my tool for the job, and I don't care if my tool is the right one or the wrong one."
Yes. PHP developers.
:)
Automated refactoring was invented in Smalltalk, claiming it's a benefit only static typing provides is to not know history.
History: In Bill Opdyke's thesis work, the tooling was written for C++ and written in CLOS.
I agree, the side-effectful choices are either a global, some DI container, or just passing it down.
For loggers, I think global lookup from some (pluggable) logging library is justified because logging is probably the most ubiquitous cross-cutting concern ever. For pretty much everything else, I think passing as a parameter is actually the best option. It's explicit and simple, and you don't even need to explicitly pass it around _that_ much if you store it in a field of a class that plays the role of a module. Most uses of the dependency will be in non-static methods, lambdas, or inner classes.
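A rough sketch of the "store it in a field of a class that plays the role of a module" approach, in Python. The Clock, FixedClock and SessionManager names are hypothetical and only there for illustration; the dependency is passed once to the constructor and then used from ordinary methods.

```python
from typing import Protocol


class Clock(Protocol):
    """The dependency's interface (hypothetical, for illustration)."""

    def now(self) -> float: ...


class FixedClock:
    """A stand-in implementation that always reports the same time."""

    def __init__(self, current: float) -> None:
        self._current = current

    def now(self) -> float:
        return self._current


class SessionManager:
    """Plays the role of a module: the dependency is passed in once and kept in a field."""

    def __init__(self, clock: Clock, ttl_seconds: float) -> None:
        self._clock = clock
        self._ttl = ttl_seconds

    def is_expired(self, started_at: float) -> bool:
        # Uses the injected clock; no global lookup or container involved.
        return self._clock.now() - started_at > self._ttl
```

To find out which clock a given SessionManager uses, you just look at where it was constructed.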
I dislike Reader because it's similar to a DI container (or a global) in that it's more work to figure out, for a given call site, what the last value written to it was. With parameters, you just climb the call chain.
I guess the side effecting version is just to use some global registry to look up the logging implementation to use. But such code does not compose.
This allows you to implement patterns like duck typing which are idiomatic in languages like JavaScript, but impossible to implement in Java.
In fact, the only cases I've ever swapped even a HashMap has been to trove, which implements data structures for builtin types rather than reference types. In which case using interfaces everywhere doesn't help.
What crap? Types?
>deal with a lot of dumb restrictions
Like what?
Types, especially those that could easily be inferred and that provide no value to the programmer. Writing types down can be useful - Python programmers do it too. Having to tell the compiler every little thing is crap.
Like what?
Type erasure. Having to treat primitives and arrays differently from other types. No first class functions or classes. I won't go on, the arguments are easy to find.
I am surprised though that you were completely ignorant of the benefits of static typing outside of Java. I would think that simply out of curiosity you would do a language survey just to get an idea of what else is out there.
(There are also all sorts of optimizations that can't be done statically, so I guess there's that.)
I didn't say so. An essential feature of type systems is being able to determine the types of (many) values, statically or dynamically.
> You can not deference a floating point register as an address, for instance.
So you'd agree that architectures that don't make a distinction between integer and floating point registers are untyped? But sure, call this a type system if you must. It's just a very very weak one, so weak as to be almost entirely useless. (And the reason floating point registers are often separate from integer registers is not to provide this kind of "type safety", it's due to history and architecture.)
Weak and strong aren't meaningful terms. A machine ISA might have an inexpressive type system and/or an unsound type system (because it conflates addresses and integers).
> And the reason floating point registers are often separate from integer registers is not to provide this kind of "type safety", it's due to history and architecture.
No, the reason is performance. And we get performance by making statically known distinctions between datatypes, which is what the original poster asked about.
Intermingling the integer and floating point circuitry so they access the same register file would never improve performance over keeping them separate. You'd need longer wires to place them both near the same register files to minimize signal latency, and the added signal delay alone ensures lower performance.
> No, it's for architectural reasons.
You win, I guess?
C's built-in arrays are super weak, so you need some library to do proper resizable arrays. Since C doesn't have generics, such a library will use void * as the type for putting values into the array and getting them back out again. You'll be casting at every point of use, and nothing will check to make sure you got the cast right, other than running the code and crashing.
There are other options though like macros and code generation. Code gen in particular can give you more options than generics without sacrificing any type safety.
http://attractivechaos.github.io/klib/#Khash%3A%20generic%20...
This approach isn't just more strongly typed than using void * for everything, it's also typically more efficient. For an array, you can put larger structs directly in the array rather than being forced to use a pointer. For a hash table, the same applies, plus you can avoid expensive indirect calls to compute hashes. (There are alternative non-macro approaches that trade off that overhead for other types of overhead, but you can't do as well as with a specialized container.)
I guess you could ask, at that point why not just use C++? And a lot of people do, and the people left writing new C programs are often traditionalists who don't want to switch to new approaches. And to be fair, there are disadvantages to macro-based containers, like increasing build time. But I still think there's room for them to see more adoption.
Static typing versus dynamic typing is fairly binary: if your types are checked at compile time they're probably static, while if they're checked at run time they are probably dynamic. Haskell, C are statically typed, Python, JavaScript are dynamically typed.
Strong/weak typing is more of a spectrum. A strong type system can check many properties of programs and accommodate many patterns as types. A weak type system, on the other hand, can't check many properties of programs, and has to be bypassed to accommodate common patterns. JavaScript has probably the weakest type system, because it checks almost nothing ("hi" + 42 returns "hi42" even though this is nonsensical, {}.foo returns undefined rather than throwing a type error). C is fairly weakly typed because you can add disparate types (int* + int returns int* even if you intended to add two integers) and the type system has to be bypassed with void* to do anything sizeable. Python, ironically, is slightly stronger, in that applying operators to objects of types with no defined relationship throws exceptions ("hi" + 42 errors). A spectrum from weakest to strongest might look something like: JavaScript, C, Go, Ruby, Python, Java, C#, OCaml, Haskell.
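The Python half of the "hi" + 42 comparison is easy to check. A small sketch (try_add is a made-up helper that just reports which exception, if any, the addition raises):

```python
def try_add(left, right):
    """Return left + right, or the name of the exception Python raises."""
    try:
        return left + right
    except TypeError as exc:
        return type(exc).__name__
```

Calling try_add("hi", 42) reports a TypeError instead of silently producing "hi42"; you have to convert explicitly, e.g. "hi" + str(42).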
My personal experience is that the difference between static and dynamic types isn't very important to my development process or code quality. I have to run and unit test my code to verify it, so the checks happen regardless of whether they happen at compile time or run time. But the difference between strong and weak typing is huge. Strong types catch more bugs, but perhaps more importantly, they catch those bugs where they occur. A type error when adding "hi" + 42 is far more useful for debugging than a mysterious unit test failure on a completely different function where it's returning "Hi42username" instead of "Hi Username" because you added the wrong variable. A segfault 30 lines later is harder to debug than an error when trying to add an int to the value at an int*.
Except when it's not. Just five days ago I was debugging an error in my Erlang port driver that was caused by me passing receiver (ErlDrvTerm, an int in disguise) in the place where I wanted number of iterations. The funnier thing was that the declaration of the function had the arguments in correct order (and that's what guided me), but definition had them swapped. The compiler did not catch that bug, because, well, both are ints, so apparently the declaration and definition match, don't they?
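For what it's worth, the usual answer to "both are ints" is to give the two ints different names at the type level. A sketch of the idea in Python, assuming a static checker such as mypy is run over the code (Receiver, Iterations and send_repeatedly are hypothetical names, not the Erlang driver API):

```python
from typing import NewType

# Two distinct static types that are both plain ints at run time.
Receiver = NewType("Receiver", int)
Iterations = NewType("Iterations", int)


def send_repeatedly(receiver: Receiver, iterations: Iterations) -> str:
    return f"send to {receiver} x{iterations}"


ok = send_repeatedly(Receiver(7), Iterations(3))

# A static checker is expected to reject the swapped call below, because an
# Iterations is not a Receiver. Python itself would run it without complaint,
# since NewType does no checking at run time.
# send_repeatedly(Iterations(3), Receiver(7))
```

This only helps if the checker is actually run; at run time the values are ordinary ints.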
This article basically demonstrates the GP's point, though. It proves that `reverse` is self-inverse, but there are lots and lots of functions that are self-inverse (for example, `x -> x` is self-inverse, as is the function that swaps any odd-index element with the one following it).
The claim was about the difficulty of encoding actual correctness into your type system. That this article doesn't actually encode the correctness of reverse seems like pretty good evidence for that difficulty.
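A quick Python sketch of why the self-inverse property alone doesn't pin down reverse. All names here are made up for illustration, and this checks the property on a handful of samples rather than proving it:

```python
def is_involution(f, samples):
    """Check f(f(s)) == s on the given samples (a spot check, not a proof)."""
    return all(f(f(s)) == s for s in samples)


def identity(s):
    return s


def swap_pairs(s):
    """Swap each element with the one following it, pair by pair."""
    chars = list(s)
    for i in range(0, len(chars) - 1, 2):
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def reverse(s):
    return s[::-1]


samples = ["", "a", "ab", "abc", "hello"]
```

All three functions pass is_involution on these samples, yet only one of them reverses anything, so the specification needs more than that one property.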
It's important to understand that type systems can encode correctness to the level you can specify it. So the program is therefore bug-free to the accuracy of your requirements on it.
Most people do not work with type systems which can do this and are unfamiliar with formal verification. The author presents directly (and argues throughout) that there is a correlation, not that bug-free code and static analysis are the same thing.
>Now a big part of the problem is expressing with sufficient accuracy what the properties of the algorithm you want to prove are. For example for string reverse I may want to show more than that `reverse (reverse s) = s`. Since after all if reverse does nothing that would still be true. I would probably want to express that the first and last chars swap when I just call reverse xs.
This is no different from writing tests in a dynamic language.
Types and tests are not equivalent. This is a prevalent myth among dynamic typing enthusiasts. There is no unit test that can ensure, eg. race and deadlock freedom, but there are type systems that can do so. There are many such properties, and tests can't help you there.
Types verify stronger properties than tests will ever be able to, full stop. You don't always need the stronger properties of types, except when you do.
However, a type proof can show that reverse reverses all possible strings.
It is possible to test that a function on 16 bit integers returns the correct value for all inputs. Doing so would be a proof by exhaustion.
Type based proofs let us prove things using other methods than exhaustion, which is the only possible way to prove things with tests. That is an important property.
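As an example of proof by exhaustion over 16-bit integers, here's a Python sketch. The function under test is a bit-twiddling population count (popcount16, a name chosen for this example), checked against a simple reference for every one of the 65536 inputs:

```python
def popcount16(x: int) -> int:
    """Count set bits in a 16-bit value using the usual pairwise-sum trick."""
    x = x - ((x >> 1) & 0x5555)             # counts per 2-bit group
    x = (x & 0x3333) + ((x >> 2) & 0x3333)  # counts per 4-bit group
    x = (x + (x >> 4)) & 0x0F0F             # counts per byte
    return (x + (x >> 8)) & 0x1F            # total, at most 16


def check_all_16_bit_inputs() -> bool:
    """Compare against a straightforward reference for every 16-bit input."""
    return all(popcount16(x) == bin(x).count("1") for x in range(1 << 16))
```

This works because the domain is tiny. For strings, or even 64-bit integers, the same approach is out of reach, which is where a proof by other means comes in.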
You can have bugs in type definitions and you can have bugs in tests as well and that'll be a problem until computers can read our minds. Type definitions are superior at checking what you specified though because the checking is exhaustive. Perfect is the enemy of good and all that.
> This is no different from writing tests in a dynamic language.
It's not. In that example, the type system will verify the stated property is true for all values of "s". Tests only check some specific examples whereas using types in this way tests all possible input and output pairs. It's like comparing a maths proof to checking an equation holds for a few examples you tried.
Given the caveats you mention, is there such thing as proving anything?
The second sense is the scientific "gather enough supporting evidence that you are reasonably sure". In this sense, you can prove a lot of things.
According to the principle of separation of concerns, this isn't the programmer's problem.
> You can prove a program does what the types say it should do but that is not what "correctness" means.
Of course, the ultimate arbiter of what “correctness” means is the program specification.
In Common Lisp, a dynamic language, I can also get this answered instantly. I just press a key combination on a method call and I jump to the definition.
So this isn't exclusive to statically typed languages.
Is that really a dynamic typing problem or a language that allows you to create instance members anywhere? It seems like that is a flaw in the declaration model of the language and not a static / dynamic issue.
$ txr
This is the TXR Lisp interactive listener of TXR 185.
Quit with :quit or Ctrl-D on empty line. Ctrl-X ? for cheatsheet.
1> (set a.b 3)
** warning: (expr-1:1) qref: symbol b isn't the name of a struct slot
** warning: (expr-1:1) unbound variable a
** (expr-1:1) unbound variable a
** during evaluation of form (slotset a 'b 3)
** ... an expansion of (set a.b 3)
** which is located at expr-1:1
Both warnings are static. If we put that into a function body and put that function into a file, and then load the file, we get the warnings. The diagnostics after the warnings are then from evaluation.
Those are nothing; TXR Lisp will get better diagnostics over time. I'm just starting the background work for a compiler.
There is dynamic and then there is crap dynamic.
Don't confuse the two.
There is crap static too. Shall we use C as the strawman examples of static? Hey look, two argument function called with three arguments; and there's a buffer overrun ...
I still don't think so. For example, plenty of silicon is spent on branch predictors simply because addresses and integers aren't distinguished, in general, thus permitting more expressive but costlier code.
Execution would be much faster if integers and addresses were forced to be distinct. Static typing pretty much always improves performance.
Persuading you on the general demerits of Java wasn't really my intention; my point was that if you want to get more people on the static typing train, Java is a bad ambassador and might work against you.
I am surprised though that you were completely ignorant of the benefits of static typing outside of Java. I would think that simply out of curiosity you would do a language survey just to get an idea of what else is out there.
Well, back then (early 2000s) I was a self-taught teenager with a poorer grasp of English (I'm not a native speaker), so while my curiosity did lead me to discover a few languages (JavaScript, PHP, Python, Lua and C), anything remotely approaching "academic" - Haskell, OCaml, Oberon, etc - was out of bounds for me.
Since C was the only other statically-typed language I knew, and I also knew it was much older and designed for much slower machines, I assumed static typing was mostly for efficiency, like manual memory management.
Maybe good tools are able to perform some static analysis and rule out some of the methods with the same name but impossible types, but the language doesn't rule out situations where the best the tool will be able to do is show you all of the function definitions with the same name as the (dynamically dispatched) function call site you're looking at.
Let's say you have seven different type hierarchies having dynamically dispatched functions named "run", with 5 definitions in each hierarchy, for a total of 35 functions named "run". In a statically typed language, if the code compiles, it's possible to narrow down the type for a given call site to either one of the hierarchies or one definition, meaning you have to look at either 1 or 5 definitions. In a dynamically typed language, there are situations where legal code results in the tool having to throw up its hands and show you all 35 definitions.
The flip side is that if you really have a spot where you'll need to dispatch to any of the different hierarchies, then in a strong statically typed language, you'll need to either create an algebraic sum type covering all 7 hierarchies, or you'll need something like a typecase / typeswitch statement to enumerate out your possibilities.
And even in a statically-typed language there will be cases where the tooling can only determine fairly generic things statically.
I don't see anyone advocating for abandoning static typing over that occasional limitation. Yet I do see people proposing similarly-infrequent issues as cause to abandon dynamic typing.
And that's really all I care about, I'm not about to start writing production code in Lisp.
Well, usually. It gets a little confused when you start using dependency injection containers, or dynamic requires, or anything like that.
How can it be a niche language, if it's an ANSI standard, has more than 8 or 10 fully featured, standard-conformant implementations, runs on most CPU types, and has been proven to work in production systems for spaceship guidance, worldwide airfare reservation and credit card transaction verification?
You use Js and Python because you choose to use it, but it's not the only choice. Not only Common Lisp, you could also be working in Clojure with many benefits.
The 99% stat is quite likely a big exaggeration.
Refactoring is far, far harder to do reliably, and more of a strain, as so much more responsibility falls on the programmer.
And the other way around: Statically typed languages without module systems, such as C, exhibit this problem too.
I mean, we can argue the semantics of what, exactly "static type checker" means, but...
A "static language" occurs when we have a model of program execution that involves erasing all of the type info before run-time. Or most of it. (Some static languages support OOP, and so stuff some minimal type info into objects for dispatch.)
Note how above, my expression executes anyway; the checks produce only warnings. The warning for the lack of a binding for the a variable is confirmed upon execution; the non-existence of the slot isn't since evaluation doesn't get that far.
If we retain the type info, we have dynamic typing. There is no limit to how much checking we can do in a dynamic setting. The checking can be incomplete, and it can be placed in an advisory role: we are informed about interesting facts that we can act on if we want, yet we can run the program anyway as-is.
Excellent. My point exactly. I have no fear of using a static or dynamic language, as long as it is a good implementation of a statically (or dynamically) typed language.
nothing stops you from using static typing with python3
The typing spec provides for supplementary "stub files" that can be provided for third-party dependencies without their own annotations. The typeshed project provides these for a fair and quickly increasing number of common dependencies, and is plugged into pycharm and mypy by default: https://github.com/python/typeshed
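For anyone who hasn't seen it, annotated Python 3 looks roughly like this. It's a minimal sketch; total_length is a made-up function, and the checking itself is done by an external tool such as mypy, not by the interpreter:

```python
from typing import List, get_type_hints


def total_length(words: List[str]) -> int:
    """Sum of the lengths of all words."""
    return sum(len(w) for w in words)


# The annotations are ordinary metadata that tools (and your own code) can read.
hints = get_type_hints(total_length)
```

A checker is expected to flag a call like total_length(42) before the program runs; without a checker, the same call only fails at run time.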
Can we stop pretending that adding type annotations to a 'dynamic' language solves the "good static typing" problem? It's just silly.
(About as silly as pretending that static types solve all problems.)
Yes, IME it's still quicker.
OTOH, if you're using "codebase of significant size" as code for "big ball of mud", static typing certainly helps, but integration tests are the real lifesaver.
No. I mean when the codebase is of a certain size, new features require some thought and planning. Features may span multiple modules. They may require partial or full rewrites or the refactoring of any number of sub-components to support the new behaviour. This means that you proceed carefully because you may not want to introduce regression bugs. This costs time. At that point, you're not limited by your typing speed as you may be putting in net 10 lines of code a day. There is just no benefit to dynamic typing at that point. Worse for dynamic languages, this is where improved tooling and static type constraints start paying extreme dividends.
Yes. And all equally true for statically typed languages.
Refactoring without tests is easier in a statically typed language, but still stupid. If you assume a decent body of tests, the benefit dissipates quickly.
If you prototype faster, as you usually will in a dynamically typed language, your design mistakes cost less. Since in large software systems I tend to find that the two biggest sources of bugs are a) errors in specification and b) poorly designed APIs, not obscure edge case bugs, quicker prototyping helps in large projects too.
And, if your design is solid, you can still achieve similar benefits to static typing by "locking down" your boundary code with asserts so that future development that interacts with that boundary code will fail quickly if it is interacted with in the wrong way.
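A sketch of what "locking down" a boundary function with asserts might look like in Python. The function apply_discount and its rules are hypothetical; also note that asserts are skipped when Python runs with -O, so this assumes they stay enabled:

```python
def apply_discount(price_cents, percent):
    """Boundary function: fail fast on the wrong kind of input."""
    # bool is a subclass of int in Python, so rule it out explicitly.
    assert isinstance(price_cents, int) and not isinstance(price_cents, bool), \
        "price_cents must be an int"
    assert price_cents >= 0, "price_cents must be non-negative"
    assert isinstance(percent, int) and 0 <= percent <= 100, \
        "percent must be an int between 0 and 100"
    return price_cents * (100 - percent) // 100
```

A caller that passes "1000" instead of 1000 gets an AssertionError at the boundary rather than a confusing failure somewhere downstream.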
>At that point, you're not limited by your typing speed as you may be putting in net 10 lines of code a day. There is just no benefit to dynamic typing at that point.
This is fallacious. You're never limited by your typing speed in any language at any point. The cost of "extra typing" is cognitive, not a finger speed limitation.
The benefit of dynamic typing is more flexibility in the code you write (especially useful for writing frameworks and such) and quicker turnaround when prototyping because you do not have to prespecify as much up front.
It's like spending ten years learning to speak Russian and then criticizing anyone who says that learning Russian is difficult.
Puzzling out scalaz code is difficult and requires an enormous investment in hours and practice, investment that a lot of people prefer to put into different learnings.
It's incredibly multipurpose, more so than even Spring or Guava or LINQ, and these are things that developers regularly have to invest serious time in.
The argument is just that FP libraries (like Scalaz) have a bigger payoff in the investment.
At Verizon Labs we have 20+ microservices that I have touched/looked at. Some use Akka, some use Play, some use Jetty, some use Http4s but everyone makes use of Scalaz somehow.
It depends on the people, not everybody has the inclination to dive so deep into hard core FP and they will be more productive using a different approach.
Don't make the mistake of thinking you've found the only software silver bullet that exists and that people who don't use it "don't get it", which is another attitude I've seen a lot of hardcore FP advocates embrace.
I am not sure it's a meaningful distinction. There are no triangles in the "real" world -- if you look close under a microscope, there will be more than three sides -- but geometry proves to be of great practical value just the same.
Incidentally, what's your source for "Ralph Johnson… was the creator of the first Smalltalk refactoring browser" ?
We still don't know if you simply confabulated your other claim.
When $problem happens in $language_I_dislike, it's a clear sign that the language itself is inherently broken.
/t/tmp.1q8r9dZAtX > cat test.c
int main() {
char *test = "test";
int i = 10;
return test == i;
}
/t/tmp.1q8r9dZAtX > cc test.c
test.c: In function ‘main’:
test.c:4:14: warning: comparison between pointer and integer
return test == i;
            ^~
$ python -c '"test" == 10'
$
Good "dynamics": All the Lisp-family languages. Smalltalk. Julia. Lua. Tcl.
>Refactoring without tests is easier in a statically typed language, but still stupid. If you assume a decent body of tests, the benefit dissipates quickly.
We're not arguing whether dynamic language+extensive integration/unit test is better than static typing and no tests.
>And, if your design is solid ...
Yes, if you have a very solid architecture, strict coding guidelines, extensive integration and unit test coverage, experienced developers (etc. etc. etc.) you will mitigate a lot of problems with dynamic typing. So if you do everything right, avoid the pitfalls, you can have something solid. A similar argument is made to me when I assert JavaScript is a terrible language. I don't disagree with either but it doesn't prove anything.
>The cost of "extra typing" is cognitive, not a finger speed limitation.
Exactly.
I am, because that's the way I'd work in any language, because I'm not a hack.
>Yes, if you have a very solid architecture, strict coding guidelines, extensive integration and unit test coverage, experienced developers (etc. etc. etc.) you will mitigate a lot of problems with dynamic typing.
Eliminate.
>A similar argument is made to me when I assert JavaScript is a terrible language.
No, javascript is different. The weird and fucked up implicit type conversions render even a high level of testing insufficient to achieve a high level of confidence in the code. There's way too many edge case behaviors where it should be throwing exceptions and it does something weird instead. C suffers from this problem too despite being statically typed.
I worked on a small Python project some years ago and we had a type error in production despite having tests.
We traced it back to a call to a third party library. It was supposed to return a list of results, and all of the test cases around it worked and always got a list back. In production however we encountered an error because if there was only one value to return, the library would not return a list of one element, as we expected, but a scalar value. So the rest of the code was expecting a list and when it encountered a scalar it blew up.
You can blame it on us for having insufficient test cases, or not coding defensively enough, or not reading the source code of the library we used, or the author of the library for bad design, but ultimately, this bug would not have been possible in a statically typed language.
So just saying "have test cases" is not good enough. Your test cases may not be exhaustive, but a good static type system and type checker is.
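A rough Python sketch of the failure described above. The function names are hypothetical stand-ins, not the actual library involved. `lookup_untyped` returns a bare value when there is exactly one result, so a caller that was only ever tested with several results breaks on one. With the return type written out as a `Union`, a static checker such as mypy should reject passing the result straight to `len`, which forces the caller to normalize first.

```python
from typing import List, Union


def lookup_untyped(keys):
    """Hypothetical stand-in for the third-party call: returns a bare int
    when there is exactly one result, and a list of ints otherwise."""
    results = [len(k) for k in keys]
    return results[0] if len(results) == 1 else results


def count_results(keys) -> int:
    # The caller assumes it always gets a list back.
    # This raises TypeError at runtime when only one key is passed.
    return len(lookup_untyped(keys))


def lookup_annotated(keys: List[str]) -> Union[int, List[int]]:
    """Same behaviour, but the awkward return type is now visible to a checker."""
    return lookup_untyped(keys)


def normalize(result: Union[int, List[int]]) -> List[int]:
    # Narrow the union so the rest of the code can rely on a list.
    return result if isinstance(result, list) else [result]


def count_results_safe(keys: List[str]) -> int:
    return len(normalize(lookup_annotated(keys)))
```

Tests that only pass two or more keys never exercise the scalar branch, which is how this kind of bug reaches production.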
That's the whole point. As is often said, "type checking keeps you honest".
I stumbled time and time again upon badly designed libraries... With a static type system, the pain is obvious and felt the first time you try to build your code.
With dynamic types, the pain might not be felt at all until some crazy bit of code is invoked, sometimes at the most unfortunate of times.
That's the one. That's a damned stupid decision.
By adding sanity checking asserts in a dynamically typed language you can achieve more or less the same result.
Yes. We call that 'typing'.
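For reference, the "sanity checking asserts" approach might look something like this in Python. This is only a sketch, and `checked_results` is a made-up helper name. It does the check a type checker would do, but at runtime, and only on the code paths that actually execute.

```python
def checked_results(value):
    """Hypothetical boundary check: fail fast if a dependency hands back
    something other than a list of ints."""
    assert isinstance(value, list), f"expected list, got {type(value).__name__}"
    assert all(isinstance(v, int) for v in value), "expected a list of ints"
    return value
```

Note that Python strips `assert` statements when run with `-O`, so checks like this disappear in optimized mode unless they are written as explicit `if ... raise` statements.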
All too often I see "Java bad, just look at how many lines you need to write to get 'hello world'". Who cares? The IDE generates that for you. And if you don't understand static void main as a beginner, again, who cares? Just put it in and ignore it until you need to understand it. I also read that dynamic languages are so productive, but nobody mentions that you then need to write tests to detect what the compiler would catch for free. Yes, good code needs tests, but the compiler is IMO a damn good built-in test suite for catching many classes of errors.
What's wrong with relying on the compiler? Bugs like that will be caught immediately by a good static type system, so you don't have to rely on your test cases to catch them for you.
Relying on a test suite to verify basic properties that could be enforced automatically through a type system means you're only checking some cases instead of all cases. It also means you're cluttering your test suite with boilerplate about language mechanics instead of writing high value tests that verify the real operational behaviour of your system.
There are pros and cons to static vs. dynamic typing in general, but in this particular respect, static typing is strictly more powerful, less verbose and more efficient.
For example, these days it's quite possible to ask GHC to defer type errors to runtime. Does that mean that the GHC dialect of Haskell is dynamically typed? This is basically a command line switch away, btw.
Retention of type information does not "dynamic typing" make. As a trivial example, consider C++ RTTI.
You really have just reinvented static (type) checking and a good runtime. There's no shame in that, but let's not pretend that these are opposing forces.
I still think you're just arguing semantics.
EDIT: Incidentally, the statically typed crowd can even go the "other way", namely from runtime -> compile time. For example, it's quite possible to derive a static proof/type from a runtime value in e.g. Idris by pattern matching as long as you're meticulous about building up the proof.
None of those things are true.
It is not. A language feature is either amenable to static analysis (not necessarily type checking) or it is not.
> Consider dynamic class loading in Java, dlopen / reinterpret_cast in C++ etc.
This actually emphasizes my point. Features that are not amenable to static analysis are problematic for tooling.