app : Vect n a -> Vect m a -> Vect (n + m) a
That is pretty amazing if you ask me. I look forward to the day when we all use languages which save programmers from themselves.The whole line is the type declaration for a function (app) that takes a Vector of length m, and a Vector of length n, and produces a Vector of length (m+n). Which is to say, the compiler knows at compile time what length the resulting Vector will be, and will fail to compile if the function you've written doesn't - provably - always meet that requirement.
From a Haskeller's perspective it's amazing because container types are famous loopholes for code that produces runtime errors. For instance, if you call the "head" function (returns the first element) on an empty list it will halt the program at run time with an error, despite the fact that it passed the type check (because an empty list and a full one have the same type).
As someone who learned Haskell and subsequently have been writing a lot of Python, I keep a mental tally of how many of my bugs (some of which took ages to track down) would have been caught immediately by a type system like Haskell or Idris'. I'd say it's well over half.
In Haskell those kind of errors are drastically reduced, but a few still slip through. Languages like Idris can catch even more of them, in a clever way, which is awesome.
Contrast that with a language with method signatures that can only return types like "Vector" or "List", without any information that is dependent on the incoming parameters.
So then, the logic is checked - if the method returns a vector that is anything other than the size of the two incoming parameter vector sizes added together, it won't compile.
In other words, even more behavior that would normally be a runtime bug is checked at compile time, which is a good thing if you are working in a domain where correctness is important and want to avoid runtime bugs.
In reading Haskell (and apparently Idris) types, a name that starts with a lower case letter is an unbound type variable - sort of like a template parameter in C++. The `a` in each of the Vect's must be the same type but it can be any type (picked at the call site, for any given call).
You can have number systems modulo n, which are only compatible with other numbers modulo the same n because they are distinct types.
It's just another level of abstraction that pretty much nothing else has.
app : Vect n a -> Vect m a -> Vect (n + m) a
is a type signature for a function, `app`, which should append a vector (`Vect`) of length `n` to a vector of length `m`.What's cool about Idris and other dependently-typed languages is that the length of the resulting vector, `n + m`, can be tracked in the type. This can prevent a whole slew of errors.
That's the Haskell type notation (Idris and Haskell are similar). Here's an example:
plus5 :: Int -> Int
plus5 n = n + 5
The first line is the type declaration of the `plus5` function. You can read it as "plus5 takes an Int as argument and returns another Int". The second line is the function declaration. The equivalent in Python would be: def plus5(n):
return n + 5
The syntax is the same when functions have more arguments: add :: Int -> Int -> Int
add a b = a + b
This means that add takes two integers as arguments and returns another integer. This syntax might look strange (you could be asking yourself "How do I distinguish between the argument and return types?") but that's because in Haskell functions are curried[1].Haskell (and Idris as well) also has parametric polymorphism, so you can define functions which operate on generic types. For example, the identity function:
id :: a -> a
id x = x
You can read this as "`id` is a function that takes an argument of type a and returns another argument of type a (so the return type and argument type must be the same)". This means the id function will work for any type of argument (String, Float, Int, etc.). Note that of course this is not dynamic typing, Haskell and Idris are statically typed and therefore all the types are checked at compile-time.Now, let's move on to Idris. One of the reasons there's a lot of excitement over Idris is because it has dependent types, i.e., you can declare types that depend on values and these are checked at compile-time.
The example GP gave was this:
app : Vect n a -> Vect m a -> Vect (n + m) a
`app` here stands for append. This is the function that appends two vectors (lists of static length) together. In Idris, `Vect x y` is a vector of size x and type y, so Vect 5 Int
is a vector of 5 integers. So, essentially, `app` is a function that takes two vectors as arguments, `Vect n a` and `Vect m a` (this means they can be of different sizes but their content must be of the same type). It returns a `Vect (n + m) a`.Think about it for a minute. That is amazing! We can be sure that, at compile-time, just by looking at the function signature, our append function, given two vectors of size n and m, will _always_ return a vector of size (n + m).
If we accidentally implement a function that does adhere to the type specification (for example, it appends the first argument to itself, returning a Vect of size (n + n) instead of (n + m)), the code will not compile and the compiler will tell us where is the bug.
[1]: http://learnyouahaskell.com/higher-order-functions#curried-f...
template<typename T, std::size_t X, std::size_t Y>
std::array<T, X + Y> app(std::array<T, X>, std::array<T, Y>);There's also a huge difference between untyped templates and strongly typed generics, but in this particular case the difference is somewhat subtle.
My understanding was that concepts began as an attempt to fix this "problem".
To understand the complexity of the task, try proving that insertion sort is really a sort algorithm. You write more code, and writing it will take more time than you would do in a 'normal' typed language.
I'd highly recommend reading Chapter 1, Section 1.3 from Type-Driven Development. The first chapter is free.
https://www.manning.com/books/type-driven-development-with-i...
I'll see if I can contrive an example. Supposing you have a list L1 of vectors of length 5. You also have a vector V1 with user input, with length x (unknown at compile time). You then make a new list L2 by taking each vector in L1 and append V1 to it. L2 is now a list of vectors of length (5 + x). Even though x is unknown at compile time, the compiler still knows that all vectors in L2 are of the same length. It can make restrictions based on this fact.
It seemed strange to me when I heard this concept, I thought the compiler would need to consider infinite possibilities, but apparently similar things are possible even in Haskell. For instance, you can define a list as a recursive type that holds a value of a Null type (not just a null value of the list type) or another List. Apparently it can still reason about this.
However, these are things I've heard. Hopefully someone with more experience can chime in and let me know if I'm right here.
Github: https://github.com/idris-hackers
It's pretty exciting!
[1]: https://ghc.haskell.org/trac/ghc/wiki/DependentHaskell/Phase...
What's more, it seems to me that Haskell syntax is the lingua franca when discussing anything related to data types and functional programming these days. Those ->'s are everywhere, it's useful to know what they stand for.
Just skim a few chapters of learnyouahaskell.com; you won't regret it.
For example, I got into Ruby via Rails, because Rails lets you quickly prototype simple web apps. So I could go from "I wish I had an app that does X" to actually building it, deploying it and sharing it with others. What would a similar "learning flow" look like in Haskell? (doesn't have to be web-based)
Put another way, when I come across a problem, how do I recognize it as the type of problem that is best solved using Haskell, vs. an imperative language?
func train_model(X: [n d], y: [n 1])
So many lines of code are spent verifying that the sizes of two function arguments are compatible. function *{T, M, N, R}(a::Mat{M, N, T}, b::Mat{N, R, T}) function train_model(X::Array{Float64,d}, y::Array{Float64,1})It would be fantastic if I could casually throw in a proof while debugging a hard problem.
More importantly can we possibly implement church numerals in Haskell? /rhetorical
Yes.
I wonder how Idris is going to be affected when Dependent types come to Haskell as well (announced at last ICFP)
Not to mention, there is hope of giving a semantics to Idris since it is based on a fairly routine variant of type theory, whereas I don't think there is any hope at all of understanding "which" type theory the Haskell folks shall have implemented.
If you go all the way with Agda, you have significantly constrained recursion to the point that the language isn't turing-complete anymore, but in return you get termination checking and other nice things. As it turns out, most of the code we write doesn't need unbounded recursion. Seriously, can you think of the last time you wrote something like that?
And I think that'll be a hard pill for people to swallow, much like how it was really hard to sell memory safety to C/C++ guys back in the day.
It looks like it was released in 2012 on hackage but I don't know how far back 'Foldable' was envisioned.
[^1]: Ivor the Proof Engine is here, although apparently not actively updated any more: https://github.com/edwinb/Ivor
http://docs.idris-lang.org/en/latest/faq/faq.html
but could not infer why a dragon from Ivor the engine had been chosen. That the author had previously written a proof engine makes all the sense now. :)
:(
in my opinion, one of the worst ideas to plague many new languages
brackets are really, seriously, honestly a better visual cue for grouping
For example, what's the editing experience like compared to e.g. OCaml with emacs/Merlin? Is there anyone in a position to compare 1ML's approach to unifying type/value languages vs using dependent types?
Personally, I find 1ML's approach more pragmatic, or, at least, easier to digest for most programmers - including Haskell programmers. Dependent types are more powerful, but they aren't free from complications. For instance, unifying the type and value languages willy-nilly can cause trouble if you want your value language to be Turing-complete, because now type-level computation can diverge too! In a language based on System F-omega, type-level computation is guaranteed not to diverge, because its type level is the simply typed lambda calculus. Some use cases might warrant providing more powerful type-level computation facilities (e.g. calculating the shapes of multidimensional matrices), but it isn't clear to me that the full power of a Martin-Loef-style dependent type theory is needed.
These are functional languages, and there isn't a lot of code to chunk together with brackets. Functions are often just one line, and it's a bit cumbersome to make functions that require multiple lines. Do you believe brackets add to the readability of a one line function?
int func()
{
while dosomething()
{
dosomething()
dosomething()
doanotherthing()
}
dosomething()
}
def func():
while dosomething():
dosomething()
dosomething()
doanotherthing()
dosomething()A machine can look at the braces and re-indent the code, so that you don't have to look at the braces. All while you rest assured that the meaning didn't change:
I just popped it into Vim, selected all and hit =:
int func()
{
while dosomething()
{
dosomething()
dosomething()
doanotherthing()
}
dosomething()
}
Some semicolons are expected so things are a little off.There is no objective best answer here. I vastly prefer indentation based grouping, but I respect that you don't.
A big positive of using indentation is that it enforces a visual indication of scope vs. languages where culturally it is normal to find code with lots of giant one-liners and braces on one line.
But overall, you just get used to it and don't notice it. Everyone who yells about languages that use indentation is bikeshedding in the worst way.
I came from Python to Erlang / Scheme / Haskell and at this point I would answer your question,
> What kind of things can I build with Haskell?
With: Everything.
We use Haskell in production at Plum for our REST APIs, job schedulers, web applications, AWS service interfaces, a static site compiler, DB modeling, command line utilities, etc...
We also use it for two CLI utilities that are cross-compiled for the ARM9 on our IoT product.
I consider Haskell to be superior to any of the dynamically typed languages when writing production-level code, it's cleaner, safer, easier to maintain, easier to refactor, and much more fun IMHO.
[EDIT] I neglected the other part of your question, "What is the learning flow like?"
Definitely a bit rougher than Python or Ruby, I will not lie, but don't be discouraged. It simply means you need to do a bit more studying up-front first before you can tinker without being caught at every turn by the straight-jacket.
I would first go through Learn You a Haskell because it is pretty accessible and introduces the language basics well enough. Then study the type system. You must learn Haskell's type system and terminology before you can understand more advanced code.
This may be nitpicking but certain classes of programs cannot be realistically built with Haskell. Anytime you need to tightly control latency (soft realtime) won't really work since you have both a garbage collector and lazy evaluation. Memory constrained systems are pretty tough as well since you don't really get insight into allocation/deallocation, which also makes structuring your data into a particular memory layout tricky compared to C(++) for instance.
Not to say that Haskell isn't awesome. It should probably be used for more systems. It just can't be the "hammer" to make every problem into a nail.
That said, it's perfectly possible to write a DSL that handles scheduling, uses GHC's type machinery to track memory use and execution time, and have GHC generate a program that will generate C code that meets hard real-time guarantees. In fact, someone wrote it, it's available on hackage (https://hackage.haskell.org/package/atom) and my understanding is that it's used in production for control software in the auto industry.
It was slow at first, and there was a lot of "now what's this arrow doing here?", but I would go back to the books when I had questions and eventually things started making sense. That won't work for everyone, but maybe it will work for you too?
The folks in #haskell are generally pretty helpful, and delightfully easy to troll.
[2]: http://www.stephendiehl.com/llvm/ [1]: http://yannesposito.com/Scratch/en/blog/Haskell-the-Hard-Way...
There's a reasonable perspective that says the type of problem that's best solved in a language is a problem for which there's good library support. For that, I'd probably recommend scanning https://github.com/Gabriel439/post-rfc/blob/master/sotu.md
But in my experience, where Haskell really shines is sketching out operations on some type that I know I'm going to get wrong - substantially or subtly - my first many tries. When I go to fix it, the compiler helps me find everything I need to change in tandem, and feel ahead to find inconsistencies in my model before I get there in the code. One example of this kind of problem is compilers - where we parse into an AST, transform the AST, and then produce other things from the AST. As development goes on, the AST evolves, and you have a lot of help knowing what needs to change to match and what you can ignore. I've recently been doing this for SQL (though with the goal of analyzing the queries, not producing an executable).
https://pbrisbin.com/posts/developing_web_applications_with_...
You might want to try using stack instead of cabal though to ensure you avoid any dependency issues. I think cabal will work fine these days, but I have been using stack/stackage for some time so can't guarantee it.
Language with a type system but without generics have this above downside. Usually you'd just give up making a type for each of these, and simply use the Stack type.
Language without Idris-like dependent types suffer the same downside: instead of creating Stack<String, n>, you're forced to create Stack<String, One>, Stack<String, Two>, Stack<String, Three>, where One/Two/Three are unique types you created to conceptually represent a number. If you don't even have generics, you'd need to create StackStringOne, StackStringTwo, ...
Dependently typed languages encode numbers (among others) as unique types, to simplify. Imagine this:
put: Stack<String, Three> -> String -> Stack<String, Four>
This isn't too hard to imagine is it? No runtime checks needed. Once you get the output you can carry it around, along with its type, to places that accept Stack<String, Four>.So, asking "how come you don't have to generate these checks at runtime?" is similar to a programmer who's never seen generics asking "how come you can parametrize? Don't you have to generate all the StackString and StackInteger, etc.". It just works. Hope that analogy was clear =).
Dependent types in general blur the line between values and types, as you've seen above. Think of the usual values you use, and imagine having a new type to represent each one. Dependent types let you do this in a sane and logical way.
Languages that have generics can either erase the generic argument at runtime (like Java or Scala do) or generate a different concrete type for each instantiation (like C++ or Nim do). In either case, you usually know at compile time which concrete instances of the generic parameter are going to be used where.
That is, even in Java you may have a single class Stack<A>, but at the final use site this will eventually turn into some concrete type, such as Stack<String> (it may remain generic in intermediate steps). In other words, you are not going to use Stack with a generic parameter that you did not anticipate. You cannot read the runtime type of some instance, call it X, and then start writing functions that work on Stack<X>.
Now, let me turn to dependent types. A lot of the techniques that Idris promotes can be simulated in other languages, such as Scala, C++ or Nim. But in each case that I know of, you are required to know at compile time which concrete values you are going to use as generic parameters. That is, in C++ you can write functions that operate on types like Vector<A, N>, where N is the length, but you have to eventually specialize N to a concrete number before compiling.
The comment above states that Idris does not have this restriction, but it is not clear to me how that might work. I concede that by using an erasing technique (as opposed to specialization) one is not required to materialize such functions for all N - this is what one does for instance in Scala.
But the problem is: what do you do at the usage site? Say you are reading two lists off two JSON files, and you want to concatenate them using the above function. Wouldn't you need to provide at compile time a concrete value for N and M? (and consequently cast the unsized vectors to sized ones?).
If this is the case, then there is not much difference from letting the compiler materialize a function for concrete values of M and N that are used (because in either case, you have to encode knowledge about concrete values of M and N at compile time).
Maybe everything would be more concrete if I saw an Idris program that reads two vectors from external files at runtime, and then concatenates them, whatever their size is.
(This case of append-n-to-m-vector is a bit weird since it basically just amounts to calculating the length of the output, but my previous statement applies a bit more generally.)
[1] So one plausible representation for Vec n a would be a size + array of pointers to a's. I believe they're actually using linked list for Vec at the moment, though.
I would be curious to see an Idris program that reads two vectors out of some file (without knowing the size a priori) and concatenates them
The only thing I miss about braces is being able to jump to the end of a block easily. In python and similar that's a bit harder to do with an editor.
Having explicit delimiters (I don't care if it's {}, begin...end, or whatever) provide several benefits that I have often used.
1. (by far the most important) They trivialize the recovery of indentation that has been trashed by a bad/misconfigured/unfamiliar editor or coworker who inserted "\t" characters into the file. Just run indent-region or send the file through indent(1).
2. Moving code is easier. Just kill/yank it to where you want it and let indent-region or whatever re-indent as needed. Without delimiters I have to either re-indent manually or use an editor with a feature that changes a region's indent level. Even with such a feature, I still have to tell the editor "move this two levels deeper". With delimiters, I can leave that work to the editor.
3. The movement advantages you mentioned.
As for the extra work of having to close blocks (including the infamous LISP ")))))" motif), that's what features like electric-pair-mode is for.
Even better, delimiters can convey more information. Color cues have proven to be very effective at conveying important information; syntax highlighting is very popular, even when showing code outside an editor (such as inside a <pre> tag). I find extending these color cues to also show nesting[1] to be almost as useful as basic syntax highlighting. (I'm using rainbow-delimiters[2] in that screenshot).
That's also true if the intent is ambiguous (the braces say one thing and the indentation something else).
Hey, remove the braces and the intent is clear now!
Vect n a -> Vect m a -> Vect (n + m) a
What's that around n + m, and how eager are you to replace that with multi-line indentation?Meaning that without the braces the code itself would be wrong, in C. Not that C itself is wrong.
No, Idris doesn't actually care what "n" and "m" are (concretely)... it just knows that if you happen to have a (Vec n a) and (Vec m a) lying arond, then you'll get a (Vec (n+m) a) out.
> For usual generic types, erasure and code generation are equivalent in what they can do, because at some point you will specialize to concrete types.
I'm not sure I fully understand what you mean, but I think this is slightly wrong... You can think of the fully erased (Vec n a) representation as (n, void-star) in C-style or Array<Object> in JVM style. Note that "a" doesn't actually appear. So the type is actually never needed at runtime -- all functions which have (Vec n a) as a parameter will just assume that it has been (statically) proven[1] that the pointed-to-array has the correct size. Because of parametricity, functions will not be allowed to actually access any 'a' values directly. So it all works out.
Here's another comment of mine that goes a bit deeper into the interplay between static proof and runtime checks: https://news.ycombinator.com/item?id=10302863
Key being: You either need to prove statically or you need to prove statically that you've checked at runtime.
HTH.
[1] This is valid because call sites must provide such a proof (statically).
You cannot apply directly any of the size-aware functions, because it will not typecheck. I only see two avenues:
Either you assert statically that the vector has some concrete size, in which case the size is known statically and the code generation will work as well;
Or you use some other size-unaware function, in which case it is false that Idris can use runtime values as type parameters.
If I understand correctly, you are saying that there is a third possibility, but I am missing it
Yes, you can in fact do that! Let's assume you're building the list you're reading from a file as if it were a linked list. That's the point of dependent types -- the types can (and often do) depend on run-time values.
EDIT: I suppose I should expand on this a bit. The idea is that the "read-Vec-from-file" function can look like this:
readVecFromFile :: File -> (n, Vec n Int)
(let's say it's reading Ints)So the readVecFromFile function will return you a run-time value "n" (the number of items) and a vector which has that exact length. When using readVecFromFile, you'd do something like
(n, v) <- readVecFromFile
and you'd be absolutely free to call any function which expects a vector of any length. (If necessary you could also supply the actual length, but that's just a detail.)The implementation of readVecFromFile is left as an exercise, but it can be done. One idea is to start with the empty vector (length 0) and an "n" of 0 -- and then to just accumulate into it, all the while keeping a "proof" that the length of the vector is way the runtime value says it is. (Using structural induction, basically, to build a proof that the runtime value is the same as the length of the vector.)