Static, Ahead of Time Compiled Julia (juliacomputing.com)
1. Insufficient testing & coverage. Code coverage is now at 84% of base Julia, from somewhere around 50% at the time he wrote this post. While you can always have more tests (and that is happening), I certainly don't think that this is a major complaint at this point.
2. Package issues. Julia now has package precompilation so package loading is pretty fast. The package manager itself was rewritten to use libgit2, which has made it much faster, especially on Windows where shelling out is painfully slow.
3. Travis uptime. This is much better. There was a specific mystery issue going on when Dan wrote that post. That issue has been fixed. We also do Windows CI on AppVeyor these days.
4. Documentation of Julia internals. Given the quite comprehensive developer docs that now exist, it's hard to consider this unaddressed:
http://julia.readthedocs.org/en/latest/devdocs/julia/
So the legitimate issues raised in that blog post are fixed.
This is a really passive-aggressive weaselly phrasing. I’d recommend reconsidering this type of tone in public discussion responses.
Instead of suggesting that the other complaints were invalid or illegitimate, you could just not mention them at all, or at least use nicer language in brushing them aside. E.g. “... the main actionable complaints...” or “the main technical complaints ...”
* * *
> [...] I certainly don't think that this is a major complaint at this point. [...] it's hard to consider this unaddressed [...] fixed
After reading the original post and your responses, I think the responses come across as pretty smug and dismissive.
Third-party readers would probably be more optimistic if you just left it at “we’ve made a lot of improvement since then and we’re still working on it” or similar.
I am part of a team that now has over 60,000 lines of Julia code in our computer algebra package(s), and the intersection between our experience as users and the points danluu made is almost nil.
If one looks at the ranking of the language on Tiobe, it's quite obviously being used a lot for a language that hasn't even reached 1.0 yet.
I think the issues one is likely to have with Julia as an evolving language have completely moved on from those made in danluu's article. In fact, it might be useful for someone to write a "constructive criticism" blog article about the state of affairs today.
My personal opinion is that I would wait until 1.0 (which will arrive in less than 2 years' time) if I were a Fortune 500 company, unless you really need to be ahead of the game and are prepared to contribute to the development of the actual language itself. But for just about anything else, if you need the features Julia provides, it's probably vastly superior to the alternatives today.
Our experience is you will occasionally have to adjust some of your code to handle changes to the language prior to 1.0, and we've had to do that a few times so far (at most a few hours' work per 0.x point release, even with our large, complex code base). And this mainly applies if you are really pushing Julia hard and exploring interesting corners of the language. Other than this, it's more than stable enough for serious work.
The language allows super-readable code while remaining quite fast. The sort of research I'm doing involves interacting a bit with data then doing a bunch of simulations, and Julia excels at this. The expressiveness in how Julia handles anonymous functions, optional arguments, tuple construction/deconstruction, etc. allows for really concise configurations of how my little simulations run.
My only pain points:
- DataFrames are not as fast as the rest of the language.
- I'd like an infix function composition operator. (Possibly this exists, but I can't find it in the documentation).
9.5/10, absolutely recommend.
Do you mean like the pipe symbol?
rand() |> println
Or are you referring to something else?
Bugs are quite easy to come across and updating software feels more like a dice roll than a normal upgrade. Parallel computing in particular has been in a pre-alpha state for ages (which may be more of a documentation issue than an implementation one). Packages were previously very slow to load, though with pre-compilation this was partially fixed (~1 order of magnitude difference). I don't write robust software in julia, so I don't know how the error handling side has been evolving. The API inconsistencies have been getting fixed, but this typically results in broken packages until the compatibility package (Compat.jl) includes some workaround. The core code is difficult to get through and architectural-level documentation was absent last I checked.
Even with these development flaws it's still by and large an enjoyable experience to use. It solves some hard problems and it makes my own work (mainly DSP/ML) move a lot faster. I would recommend it, but as their versioning scheme indicates, there's no 1.0 release yet.
I'd rather try to fix a bug in some undocumented codebase than wait 6 months for a new black box.
Going through the post in order:
The stable releases still have some bugs as you would expect in a young language, but 0.4 is now well below my tolerance level. For a rough idea, I now use julia daily and encounter a bug perhaps once every one or two weeks. In 0.4 I haven't encountered any bug which was a real show stopper and couldn't easily be worked around.
Testing has gotten a whole heap nicer with a decent test framework in Base (accessible in 0.4 via BaseTestNext.jl). Testing and package manager integrate in a simple but effective way which really makes the friction for writing a suite of tests for new packages very low, much lower than other languages I've used. I can't speak for actual coverage in Base, but I know it's now actually being measured and work has gone into the coverage tools.
I'm going to skip over the complaints about error handling, because others have already responded to this, for example StefanKarpinski's post to the julia-users list https://groups.google.com/d/msg/julia-users/GyH8nhExY9I/0Bzn...
Consistent benchmarking is currently being addressed, with great work going on at BenchmarkTrackers.jl, and a proper setup with dedicated benchmarking hardware for the language itself. I don't have the depth of knowledge to comment on Dan's other complaints regarding skewed benchmarking.
Regarding contributing, my experience is that contributions to Base and the runtime by unknowns (myself, say) are generally met with the fair skepticism and good taste that all good maintainers should display. Sometimes I feel the core devs could do more to encourage new contributors, and the environment can feel slightly hostile when suggesting new features. I'm not sure how to entirely avoid this, when a core job of a good maintainer is to say "no" to a lot of poorly considered requests! Much of the code in Base is still commented in a minimalistic fashion, if at all. In contrast my experience in contributing to packages has been almost entirely positive, with a lot of excitement and energy leading to some really great code and interactions.
With precompilation, slow package load times have really been improved to the extent that they're no longer a major hassle, but there's still room for improvement here.
The real sting in the tail of this blog post is the paragraph about nastiness in the community. There were a couple of unfortunately worded (though not unambiguously malicious) mails on the julia-users list following Dan's post, but the discussion was largely constructive and helpful. I've no idea about the "private and semi-private communications" and I can only hope things were patched up there.
Overall I've found the julia experience almost entirely positive. It's a joy to work with for numerical and statistical problems, and we're moving forward at work to get our first major pieces of julia code into production.
I'm the co-creator that Dan was talking about. He wrote a bunch of less-than-charitable comments on the aforementioned semi-private forum – not specifically to me, but where he surely knew I would read them – to which I responded with:
https://gist.github.com/StefanKarpinski/c72219ff8ce261172b11
You can judge for yourself whether I was nasty or dishonest. Things were, unfortunately, not patched up. Dan posted a number of responses, deleted all of them before I could read them, then left the conversation permanently.
What I personally also find worrisome is the perception (at least for me) that Julia is confined to scientific computing whereas I find it should really be a general purpose language.
Now this turns up a terrible dilemma: do I try to use Julia or Rust for writing embedded controllers? Rust has macros and direct memory control, but the article mentions. :)
I also think the more of us that recommend Julia for general compute, the more likely it'll get used that way.
It is intentional. Julia could be the new Fortran like Rust could be the new C++.
With 1-based array indexing? Very unlikely.
I was not aware that this was a Julia neologism. It seems like such an appropriate term for discussing how to make code make the most out of JIT-compilation.
really? There are quite a few things I don't like about julia, but using 1-indexing and "end" makes implementing algorithms much clearer IMHO.
The main pain points for me in julia are the module/pkg system and that the runtime is not just batteries but more like 10 generators included i.e. it could be way more minimal. But I get that the goal is to have a powerful scientific computing language and not to build a multipurpose language that emphasizes modular construction of code units and production ready package and build management utilities.
All in all, when judged by how well julia achieves its self-stated goals, I think it is excellent.
Guido explains his choice best:
https://plus.google.com/115212051037621986145/posts/YTUxbXYZ...
Congratulations to all involved in pushing the actual state of dynamic languages.
It's great! Looks like a very interesting contract.
Python uses it in the last couple of versions.
http://blogs.perl.org/users/ovid/2010/08/what-to-know-before...
You probably mean "static typing" versus "dynamic typing" which I wrote a bit about in the context of Julia here:
http://stackoverflow.com/questions/28078089/is-julia-dynamic...
Basically, I think "type stability" hasn't really been a thing in the past because in dynamic languages, people have traditionally not cared about ensuring that return types are predictable based on argument types, and in static languages, a program is incorrect if that's not the case. As people care more and more about being able to statically predict the behavior of programs in dynamic languages, the concept of type-stability in dynamic languages becomes increasingly important.
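To make "type stability" concrete, here is a rough sketch in Python rather than Julia. The function names are made up for illustration, and Python itself does nothing with this property; the point is only to show what "return type predictable from argument types" means.

```python
def unstable_halve(n: int):
    # The return type depends on the *value* of n, not just its type:
    # even inputs give an int, odd inputs give a float.
    return n // 2 if n % 2 == 0 else n / 2


def stable_halve(n: int) -> float:
    # Always returns a float for an int argument, so the result type
    # can be predicted from the argument type alone.
    return n / 2


# Collect the set of result types observed for a few int inputs.
unstable_types = {type(unstable_halve(n)).__name__ for n in range(4)}
stable_types = {type(stable_halve(n)).__name__ for n in range(4)}

print(unstable_types)  # contains both 'int' and 'float'
print(stable_types)    # contains only 'float'
```

A compiler that specializes on argument types can generate tight code for the second function, while the first forces it to handle two possible result types.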
"Programming Languages: Application and Interpretation" Shriram Krishnamurthi, 2003.
However, like you, I noted that some of the complaints had not been accepted.
So if the reference implementation is a bare-bones interpreter, even though there are JIT and AOT compilers to choose from, they will say language X is interpreted.
Which in Python's case means many ignore the existence of PyPy, given that the language designers don't want to change the nature of CPython.
I had a look at Spark, but its linear algebra packages seemed too limited (I guess abstraction comes at a cost). I can see that Spark would be nice if it does what you need out of the box.
Heard good things about Scala, is it straightforward to get a process on a remote machine to execute code?
Did you look at MLlib and/or just using Breeze directly? There's a bit of awkwardness in the initial set up of the cluster (mainly just having LAPACK installed on all nodes, see https://spark.apache.org/docs/1.1.0/mllib-guide.html ). Spark itself is essentially just sugar to let you write a map/reduce in natural scala style and have it distributed across a cluster - it'll only work if you can factor your algorithm in a way that fits into that paradigm. (I've heard arguments that it's possible to do that with any distributable algorithm if you're clever enough, but I'm not sure I believe them).
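For anyone unsure what "factoring an algorithm into map/reduce" looks like, here is a minimal sketch in plain Python. It does not use Spark or any Spark API; the partitions list is a stand-in for data living on different nodes, and the helper names are hypothetical.

```python
from functools import reduce

data = [2.0, 4.0, 6.0, 8.0]
# Stand-in for data split across cluster nodes (assumption for this sketch).
partitions = [data[:2], data[2:]]


def map_partition(part):
    # Map step: each partition is summarized independently as (sum, count).
    return (sum(part), len(part))


def combine(x, y):
    # Reduce step: associative merge of two partial summaries.
    return (x[0] + y[0], x[1] + y[1])


total, count = reduce(combine, map(map_partition, partitions))
mean = total / count
print(mean)
```

The mean cannot be reduced directly (a mean of means is wrong for unequal partitions), so the algorithm has to be restated in terms of partial sums and counts. That restating is the "factoring" step.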
> Heard good things about Scala, is it straightforward to get a process on a remote machine to execute code?
Honestly, no. I love the language but Spark is very much what I think of (perhaps unfairly) as typical scientific software. Spark clusters are finicky - they're cobbled together from a few unrelated projects (especially for cases where you need LAPACK as well), and it shows, especially when it comes to updating them. There are a few organizations like Cloudera (I think there was an open-source effort under the Apache umbrella somewhere too) that try to provide a working package, and various efforts with Puppet/Chef/etc. to automate the process of putting a cluster together, and it's certainly a lot better than it was even a few years ago, but a cluster still needs at least a little bit of dedicated sysadmin time (or, at a bare minimum, a programmer with a bit of *nix admin experience who's willing to get their hands dirty - that was me at times) to keep it running reliably.
If you're part of an institution that already maintains a Spark cluster - or maintains an ordinary Hadoop cluster and you're friendly enough with the sysadmins to suggest they install it - it's wonderful. If you're having to do it all from scratch I won't lie, it's going to involve a lot of fiddling and may well not be worth it for your problem.
I have yet to come across any other linear algebra library for any other high level language that provides the depth of integration available in the Julia base library. Want all eigenvalues of a symmetric tridiagonal 10x10 matrix between 1.0 and 12.0? Simply call T=SymTridiagonal(randn(10), randn(9)); eigvals(T, 1.0, 12.0). Or if you want to work closer to LAPACK, simply call LAPACK.stein!. I don't see a wrapper in Breeze or SciPy for this function. Want an LU factorization on a matrix of high precision floats? lufact(big(randn(5,4))). And so on.
Julia may not have everything users want, but its base library really tries to make matrix computations easy and accessible.
Curious, are you all still coding the internals of the compiler in femtolisp, is most of it written in Julia indirectly relying on that, or no LISP now? A barrier to entry question basically.
But isn't Julia supposed to be fast? ;)
julia> (∘)(f, g) = x -> f(g(x))
∘ (generic function with 1 method)
julia> (sum ∘ rand)(10)
3.397728240035534
Tip: type \circ<tab> to get ∘
Enough Perl6 has left me with <compose>(.) to get ∘ :-)
Second, I don't find Guido's argument convincing. Yes, half-open ranges can be mathematically more elegant (and that's actually Dijkstra's argument), but that doesn't mean that the code necessarily becomes more readable. For example, to construct an array without the element at index i, you'd do the following with Python-style indexing:
a[0:i] + a[i+1:n]
and the following with closed intervals and indexing starting at 1:
a[1:i-1] + a[i+1:n]
While there is an element of subjectivity to it, I at least find the latter option more readable (possibly because of habituation to mathematical notation).
While the notation for the specific example of i:i+k-1 might be less elegant with closed ranges, closed ranges are something that you find in every math textbook, because sums, products, unions, intersections from a to b (and other operators in that style) operate on closed ranges normally. Closed ranges are the norm in conventional mathematical notation and it makes sense to pick the option that minimizes the overhead when transcribing between mathematical texts and code.
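For comparison, both versions of the "drop element i" example can be run side by side in Python. This is only a sketch: the closed, 1-based variant is emulated with a hypothetical helper on top of Python's own half-open slices, since Python has no such ranges built in.

```python
def without_zero_based(a, i):
    # Half-open, 0-based: a[0:i] stops just before i, a[i+1:n] starts after it.
    n = len(a)
    return a[0:i] + a[i + 1:n]


def without_one_based(a, i):
    # Emulates a[1:i-1] + a[i+1:n] with closed ranges and 1-based indices.
    n = len(a)

    def closed(lo, hi):
        # Closed 1-based range [lo, hi] expressed as a Python slice.
        return a[lo - 1:hi]

    return closed(1, i - 1) + closed(i + 1, n)


letters = list("abcde")
# Remove 'c': index 2 when counting from 0, index 3 when counting from 1.
print(without_zero_based(letters, 2))
print(without_one_based(letters, 3))
```

Both calls return the same list; the difference being argued about is purely which spelling reads more naturally.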
- the 1D index of element (i,j) in a matrix is i+(j-1)*m instead of i+j*m
- the i'th 3-element subvector of a vector is v[3*(i-1)+1:3*i]
  instead of v[3*i:3*(i+1)]
- if you have a vector of indices that partitions a vector into chunks,
  the i'th chunk is v[ind[i]:ind[i+1]-1] instead of v[ind[i]:ind[i+1]]
Perhaps small issues, but these are all real examples from my most recent Matlab project that were annoying.
But maybe, like the static typing issue, my opinion on this topic is distorted because I spent a lot of time programming in C++ and comparatively little time reading math papers.
Or maybe it would be equally easy to make a list of tasks that are ugly with [0:n) indexing.
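The three index formulas from the list above can be checked in Python, which uses 0-based, half-open slices. This is a sketch with made-up sizes and helper names; the 1-based variants are emulated by shifting into Python's convention.

```python
m, n = 3, 4            # hypothetical column-major matrix: m rows, n columns
v = list(range(9))     # hypothetical vector, split into 3-element subvectors
ind = [0, 2, 5, 9]     # hypothetical chunk boundaries (0-based, half-open)


def flat_index0(i, j):
    # 0-based: element (i, j) sits at i + j*m in column-major storage.
    return i + j * m


def flat_index1(i, j):
    # 1-based: the same element needs the extra (j - 1).
    return i + (j - 1) * m


def triple0(i):
    # 0-based, half-open: i'th 3-element subvector.
    return v[3 * i:3 * (i + 1)]


def triple1(i):
    # 1-based closed range v[3*(i-1)+1 : 3*i], shifted to a Python slice.
    lo, hi = 3 * (i - 1) + 1, 3 * i
    return v[lo - 1:hi]


def chunk0(i):
    # Half-open boundaries: no -1 needed on the upper end.
    return v[ind[i]:ind[i + 1]]


print(flat_index0(2, 3), flat_index1(3, 4))
print(triple0(1), triple1(2))
print(chunk0(1))
```

Consecutive chunks share a boundary value in the half-open form, so concatenating `chunk0(0)`, `chunk0(1)`, `chunk0(2)` reproduces `v` with no overlap and no gap.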
>, my opinion on this topic is distorted because I spent a lot of time programming in C++
Mathematics-related programming[1] in MATLAB, R Language, Mathematica, SAS, etc all use 1-based indexing. Given that the originators of Julia are MATLAB users, it makes sense that they made a deliberate choice to keep 1-based indexing.
In other words, it was more important to grab mindshare from those previous math tools rather than appeal to C/C++/Java/etc programmers.
One outlier in the landscape of numerical programming is Python+NumPy/SciPy in the sense that it uses 0-based indices. While Julia also wants to be attractive to Python programmers, it still seems like the bigger motivation was programmers of MATLAB and other math software.
[1]https://www.youtube.com/watch?v=02U9AJMEWx0&feature=youtu.be...
With half-open ranges, for example, you will need different code to address a segment and the last element of a segment. E.g. if you have some structure with start_of(i) and end_of(i) expressions, then you can do a[start_of(i):end_of(i)] with closed indexing and a[end_of(i)] to access the last element, while with half-open intervals, you have to break the abstraction and use a[end_of(i)-1].
You can also iterate over start_of(i) .. end_of(i) in a for loop naturally if ranges are closed. (See how Python's iteration is defined in terms of half-open ranges and how iterating over closed ranges – which happens often enough when the values aren't indices – is a bit of a pain in Python.)
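Here is roughly what that looks like in Python, as a small sketch with made-up data and variable names. The first half uses Python's native half-open convention; the second stores an inclusive last index instead, which is one way to emulate closed ranges.

```python
a = [10, 20, 30, 40, 50, 60]

# Half-open convention: the segment covers indices start <= k < end.
start, end = 1, 4
segment = a[start:end]
last = a[end - 1]          # the -1 leaks into every "last element" access

# Closed convention (emulated): store the inclusive last index instead.
first_idx, last_idx = 1, 3
segment_closed = a[first_idx:last_idx + 1]   # the +1 moves to the slice
last_closed = a[last_idx]

# Iterating over a closed range of plain values also needs the +1.
inclusive = list(range(first_idx, last_idx + 1))

print(segment, last)
print(segment_closed, last_closed)
print(inclusive)
```

Either way one adjustment is needed somewhere; the conventions differ in whether it lands on slicing or on element access.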
Not only does it have sound mathematical reasoning but also some anecdotal evidence of problems caused by one-based indexing in programming languages.
The other issue being that Julia gives fine grained control over a cluster in a way something more abstract couldn't. (After cobbling together a scripting-style map reducer based on the default functionality - ClusterUtils.jl.)
Both JavaScript and Python prohibit operations on types that don't support them. They both have TypeErrors (or something similar) that are thrown at runtime for certain operations, while others produce a result.
The difference is just that JavaScript allows a bunch of operations that Python prohibits, including several that serve almost no purpose.
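A quick way to see where Python draws the line is to try a few mixed-type operations and record which ones raise. This is just a sketch; `try_op` is a helper made up for it.

```python
def try_op(op):
    # Run a zero-argument callable; report the result or the TypeError name.
    try:
        return op()
    except TypeError as exc:
        return type(exc).__name__


results = {
    '1 + "1"': try_op(lambda: 1 + "1"),    # int + str is rejected
    '"1" * 3': try_op(lambda: "1" * 3),    # str * int is defined (repetition)
    '1 + 1.0': try_op(lambda: 1 + 1.0),    # int is promoted to float
    '1 + True': try_op(lambda: 1 + True),  # bool is a subclass of int
}

for expr, outcome in results.items():
    print(expr, "->", repr(outcome))
```

In JavaScript the first expression evaluates to the string "11" instead of throwing, which is the kind of difference being described here.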
Is allowing 1 + "1" allowing an operation on a type that doesn't support it, or does your language just allow adding numbers and strings? There is no principled way to answer that question.
That litmus test is probably not enough to make a true formal definition, but does allow you to make objective comparisons between languages for certain operations.
As for Julia, I'm not sure how strongly or weakly typed it is, but I would probably put it at around the same level as Java.
With that in mind, shouldn't we expect it to be a good base case for describing the "weakest typed" language?
"Programming Languages: Application and Interpretation" Shriram Krishnamurthi, 2003.
(somewhat in jest, there's also https://github.com/simonster/TwoBasedIndexing.jl)
http://docs.julialang.org/en/latest/manual/conversion-and-pr...
That's a good point. Probably the most widespread data example for non-programmers is spreadsheets (MS Excel, Google Sheets). The first row[1] in the spreadsheet is labeled as "1" instead of "0". The idiomatic Visual Basic programming code to loop through the rows would look something like:
For Each cell In Range("a1:a25") ' not "a0:a24"
    ' do work
Next cell
[1]https://www.google.com/search?q=microsoft+excel+spreadsheet+...
But I think that's neither here nor there. Whenever the index has more use than as a label, mathematics starts at zero. Modular arithmetic, polynomials, discrete Fourier transforms - for that matter, any discrete approximation of continuous math - all naturally start at zero, and generate lots of -1s in one-based indexing.
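The modular arithmetic case is easy to show with wrap-around stepping through a ring of n slots. A small Python sketch follows; the function names are made up, and the one-based versions are plain arithmetic on integers 1..n, not a feature of any particular language.

```python
n = 5  # hypothetical ring size


def next_zero_based(i):
    # Indices 0..n-1: wrap-around is a plain modulo.
    return (i + 1) % n


def prev_zero_based(i):
    return (i - 1) % n


def next_one_based(i):
    # Indices 1..n: shift down to 0-based, step, then shift back up.
    return i % n + 1


def prev_one_based(i):
    return (i - 2) % n + 1


print([next_zero_based(i) for i in range(n)])
print([next_one_based(i) for i in range(1, n + 1)])
```

The one-based versions work fine, but each carries an offset correction that the zero-based ones do not need.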