LLVM 3.0 Released (llvm.org)
- The VMKit project is an implementation of a Java Virtual Machine (Java VM or JVM) that uses LLVM for static and just-in-time compilation.
Am I reading this right? A faster Java that doesn't require a virtual machine? Does this mean faster Java, Scala, Clojure, or even Ruby, and deployed anywhere that LLVM can build for (which means pretty much everywhere)? Sounds too good to be true.
- LanguageKit is a framework for implementing dynamic languages sharing an object model with Objective-C.
- Eero is a fully header-and-binary-compatible dialect of Objective-C 2.0
Has anyone tried Eero? It looks interesting (and seems to be compatible with Objective-C code): http://eerolanguage.org/from-objective-c-to-eero. One thing I didn't get from the documentation is whether Eero still requires header files.
>> Am I reading this right? A faster Java that doesn't require a virtual machine? Does this mean faster Java, Scala, Clojure, or even Ruby, and deployed anywhere that LLVM can build for (which means pretty much everywhere)? Sounds too good to be true.
You are reading it wrong. Every word in the sentence "A faster Java that doesn't require a virtual machine" is somewhat incorrect or inaccurate. Let me explain.
First of all, there is no guarantee that VMKit-powered Java is faster than any other Java implementation out there. The LLVM native code generators are powerful, but they were not initially designed for JIT compilation, so compilation is said to be a little slower than with many other JITs. I'm not saying that VMKit will be slower, either. It's probably a case-by-case situation, where one JVM is faster in one case and another one in a different case.
Second, VMKit is a framework for building virtual machines, so the part about "doesn't require a virtual machine" doesn't make sense. A VMKit-based virtual machine will interpret and JIT-compile the Java bytecode pretty much like any other VM out there. In contrast, the GNU Compiler for Java (GCJ) compiles Java code into native code and works without a bytecode interpreter. I think GCJ is often used to pre-build the Java class libraries into machine code in some setups.
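To make the interpreter vs. compiler distinction concrete, here is a minimal Python sketch. It is a toy under loud assumptions: the opcodes and the names `PROGRAM`, `interpret` and `compile_program` are invented for illustration and have nothing to do with VMKit, LLVM or real JVM bytecode. The interpreter decodes every instruction each time the program runs, while the compiler translates the whole program once into something the host can run directly.

```python
# Toy stack-machine "bytecode". The opcodes are made up for this example
# and are not real JVM bytecode.
PROGRAM = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]


def interpret(program):
    """Decode and execute each instruction every time the program runs."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


def compile_program(program):
    """Translate the whole program once into a host-language function.

    Done before the program runs, this resembles ahead-of-time compilation.
    Done lazily for hot code while it runs, it resembles a JIT.
    """
    exprs = []  # compile-time stack of expression strings
    for op, *args in program:
        if op == "push":
            exprs.append(repr(args[0]))
        elif op in ("add", "mul"):
            b, a = exprs.pop(), exprs.pop()
            symbol = "+" if op == "add" else "*"
            exprs.append(f"({a} {symbol} {b})")
        else:
            raise ValueError(f"unknown opcode: {op}")
    # The "native code" here is just a Python lambda built from the source.
    return eval(f"lambda: {exprs.pop()}")


compiled = compile_program(PROGRAM)
print(interpret(PROGRAM), compiled())
```

Both paths compute the same result. The difference is only where the decoding work happens: on every run, or once up front.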
>> Does this mean faster Java, Scala, Clojure, or even Ruby, and deployed anywhere that LLVM can build for (which means pretty much everywhere)?
No, it does not. As said before, the virtual machine is there as usual, and the virtual machine has to be ported to the target platform. This may prove to be problematic on e.g. Android, because it ships with a limited userland and you need, for example, a full C++ standard library to get LLVM working, which you need for VMKit. (This does not apply to non-JIT'ed languages, where LLVM is used ahead of time and not on the target architecture.)
The real issue of portability across platforms is not the target's instruction set architecture (ISA), and it never was (despite what Java marketing wanted you to believe in 1999). With a modern compiler like LLVM it is trivial to produce machine code for multiple architectures like x86, OpenRISC, ARM, MIPS or an AVR microcontroller.
The real issue of portability is the frameworks and libraries needed. It starts with the operating system and standard libraries. Then come all the libs that are built on top of those.
So even if it were possible to get e.g. Ruby up and running on a previously unsupported platform, the problem would be that Ruby on Rails would probably not work as easily.
I hope this clarifies a few of the misconceptions you had about LLVM and projects that use it. Even though it does not quite fulfill your expectations, remember that LLVM kicks ass.
A few things I've been working on:
An N64 emulator that already runs a few demos quite fast. The Super Mario 64 ROM loads, but it doesn't work yet. I suspect a TLB issue.
A DOSBox port, which I'll later try to use to run Windows 95 on top. I know, this is a monumental attack on Microsoft's copyright. I won't release this.
A port of the Dillo browser. A browser inside another browser. I've looked into WebKit, but I'm sure it is too big to fit in a browser.
A port of PHP's CLI interpreter.
If all of this is not fun, then I don't know what fun is! All thanks to LLVM and Emscripten.
Sounds faster than the Java I know.
The Sun JVM is pretty well tuned. A lot of people have a lot of money riding on top of it performing well. There is room for improvement (see Azul) but I would be surprised if a FOSS project would break much ground without major commercial backing.
Static compilation would have a lot of benefits for short-lived programs. I know IBM's i OS (OS/400) used to use statically compiled Java, which was quite a bit faster than the JREs of that era, though the comparison is somewhat cloudy since the whole machine was a quasi-VM at the MI level. I'm not sure whether they retained this, but it would be interesting to compare it nowadays with, e.g., HotSpot in Java 7.
https://github.com/jvoorhis/ruby-llvm
Since the LLVM type system was rewritten, I have a lot of work to do to target 3.0.
I don't know about 'faster', but gcj has been compiling Java to machine code for a while now.
From the homepage:
> GCJ is a portable, optimizing, ahead-of-time compiler for the Java Programming Language. It can compile Java source code to Java bytecode (class files) or directly to native machine code, and Java bytecode to native machine code.
Similarly, running Eclipse under gcj is not something you want to do; it becomes slow.
From http://llvm.org/ I looked at Overview, Features, Documentation and FAQ, and did not find the definition of LLVM. I ultimately had to go to Wikipedia.
The first sentence on the front page of llvm.org pretty much sums it up: "The LLVM Project is a collection of modular and reusable compiler and toolchain technologies."
It may not be the clearest LLVM description out there, but that's pretty much what it is. If the description had more detail, it would not fit in one sentence.
The hard thing about describing LLVM is that it's a huge, complicated project in a field that lies outside the expertise of even many professional programmers.
I tend to say that LLVM is (to me) a "compiler infrastructure", because I use it to build compiler back ends. However, LLVM is so much more than that, as the project includes loosely coupled tools ranging from complete compilers (clang) to debuggers (lldb) to bytecode and binary format introspection utilities (LLVM's binutils counterparts). So "compiler infrastructure" or any other dumbed-down explanation wouldn't do it justice. That's why the first sentence on the front page is actually pretty good.
(I did not downvote you BTW)
Someone else replied that the project is now just referred to as LLVM. That's fine, but people expect acronyms to stand for something, and the definition, or lack thereof, should be right at the top of any project's page. Lots of people come to a project for the first time and aren't in the know.
> The name "LLVM" was once an acronym, but is now just a brand for the umbrella project.
(see: http://www.excelsior-usa.com/)
[disclaimer: I worked for Excelsior LLC back when initial support for AOT compilation of Eclipse was implemented]
The LLVM developers are experts in compilers. I'd much rather have them spend time writing compilers than fixing little things on the web site.
I don't doubt it needs hacking to get running; I probably wouldn't want to install the latest release from source. However, when you have the package maintainers of two distros (Debian and Ubuntu) on your side, things get a bit easier.
I don't understand the downvotes. If you downvoted, please tell me why. I tried to be sensible in explaining what LLVM is, why it's hard to explain, and why it's no longer an acronym.