Architecture for a JavaScript to C compiler (timr.co)

90 points by timruffles 7 years ago | 45 comments

snek 7 years ago |

I don't understand why people keep making JS-to-native compilers. The result is always less performant than what modern JS engines can do with the original source code, because all those runtime behaviours the author mentions add up really fast. JS engines actually run the code and figure out the types and such to create highly optimized machine code.

nicoburns 7 years ago | |

Yeah, the JS runtimes are really impressive. I'd be interested to see what a compiler could do with TypeScript though. There could perhaps be guidelines on which features you can use in hot code paths in order to get optimised code, which could be enforced by the compiler for sections of the code which you mark...

tjelen 7 years ago | | |

I think that the AssemblyScript project is halfway there. It compiles a stricter subset of TypeScript to WebAssembly, which can then be translated to CPU-specific machine code.

masklinn 7 years ago | | |

> Yeah, the JS runtimes are really impressive. I'd be interested to see what a compiler could do with TypeScript though.

You can probably just compare AOT and JIT Java or C# implementations and get something of an upper bound.

Actual improvements will likely depend on your use case. For short-running programs, the overhead of parsing, JIT machinery and interpretation might dwarf the execution time, so even a straightforward compilation full of virtual calls will edge out the pre-JIT interpreter. For longer-running programs, however, the JIT should be able to devirtualise, inline and optimise the code much better than the AOT compiler will.

Inlining is the most important optimisation, and it's hard to AOT inline dynamic dispatch.
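To make the devirtualise-and-inline point concrete, here is a minimal C sketch. Everything in it (`Shape`, `rect_area`, `total_area_speculative`) is a hypothetical illustration, not code from any JS engine. The first loop is what a straightforward AOT translation of dynamic dispatch tends to look like: an indirect call per element. The second is roughly what a JIT can emit after observing that one target dominates: a cheap guard, the callee's body inlined on the fast path, and the generic call kept as a fallback.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Shape Shape;
typedef double (*AreaFn)(const Shape *);

/* Hypothetical object with one dynamically dispatched method. */
struct Shape {
    AreaFn area;
    double w, h;
};

double rect_area(const Shape *s) { return s->w * s->h; }
double tri_area(const Shape *s) { return 0.5 * s->w * s->h; }

/* Straightforward translation: every call goes through the pointer,
   so the compiler cannot inline the callee. */
double total_area_virtual(const Shape *shapes, size_t n) {
    assert(shapes != NULL || n == 0);
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += shapes[i].area(&shapes[i]);
    }
    return total;
}

/* Speculative version: guard on the target seen most often at runtime
   (assumed here to be rect_area), inline its body on the fast path,
   and fall back to the indirect call when the guard fails. */
double total_area_speculative(const Shape *shapes, size_t n) {
    assert(shapes != NULL || n == 0);
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        const Shape *s = &shapes[i];
        if (s->area == rect_area) {
            total += s->w * s->h;   /* inlined body of rect_area */
        } else {
            total += s->area(s);    /* slow path: generic dispatch */
        }
    }
    return total;
}
```

An AOT compiler has to guess which guard is worth emitting, while a JIT picks it from what it actually observed, which is the advantage described above.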

chrisseaton 7 years ago | | |

I would imagine JS JITs can determine more specific types at runtime than you can express in TypeScript source code.

ridiculous_fish 7 years ago | |

Modern JS engines rely on tracing JITs to achieve good performance. These optimizations only kick in once code has executed multiple times.

But a lot of JS code executes only once, such as layout code running at app launch. Heck, lots of JS code executes zero times. This code still imposes a cost (parsing, etc).

Consider the complaints about app launch time of Electron apps. A static compiler can be more effective at the runs-once or runs-zero cases.

pizlonator 7 years ago | | |

Not tracing. Speculation.

The difference is that you compile user control flow as-is unless you have overwhelming evidence that you should do otherwise.

And yes you are right. This is promising for run-once code, but all of the cost will be in dynamic things like the envGet. An interpreter can actually do better here because most environment resolution can be done as part of bytecode generation. So it’s possible that this experiment leads to something that is slower than JSC’s interpreter.
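A rough C sketch of the difference between the two kinds of environment access. The `Env` layout and the names `env_get_by_name` and `env_get_slot` are assumptions made for illustration; they are not taken from the js-to-c project or from JSC. The first function resolves a variable by name on every access; the second assumes the scope depth and slot index were already worked out when the code was generated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical environment record: a small table of named variables
   plus a link to the enclosing scope. */
typedef struct Env {
    struct Env *parent;
    size_t count;
    const char *names[8];
    double values[8];
} Env;

/* Dynamic lookup: walk the scope chain and compare names on every
   access. Returns 1 and writes *out on success, 0 if unresolved
   (where a real runtime would raise a ReferenceError). */
int env_get_by_name(const Env *env, const char *name, double *out) {
    for (; env != NULL; env = env->parent) {
        for (size_t i = 0; i < env->count; i++) {
            if (strcmp(env->names[i], name) == 0) {
                *out = env->values[i];
                return 1;
            }
        }
    }
    return 0;
}

/* Pre-resolved lookup: the number of scopes to hop and the slot index
   are known ahead of time, so an access is a few pointer loads and
   no string comparison. */
double env_get_slot(const Env *env, size_t hops, size_t slot) {
    while (hops-- > 0) {
        assert(env != NULL);
        env = env->parent;
    }
    assert(env != NULL && slot < env->count);
    return env->values[slot];
}
```

Under these assumptions, a variable `x` declared one scope out in slot 0 is read with `env_get_slot(env, 1, 0)` instead of a name search through the chain.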

lttlrck 7 years ago | | |

There has been a huge amount of work in V8 addressing these cases and concerns:

https://v8.dev/blog/launching-ignition-and-turbofan

https://v8.dev/blog/v8-release-66

https://v8.dev/blog/background-compilation

chrisseaton 7 years ago | | |

> Modern JS engines rely on tracing JITs

Which modern JS engine still relies on tracing? I thought they’d all moved on from that technique many years ago, but I’m not an expert in JS.

maxxxxx 7 years ago | | |

Can the compiler really do much, considering how dynamic JavaScript is?

timruffles 7 years ago | |

Oh sure, it's just for fun/learning! Check my first post.

a13n 7 years ago | |

Maybe you want to port some existing JS to another platform (e.g. OS X, iOS, Android) where you want to run it natively rather than spinning up a JS engine and dealing with the bridge.

roryrjb 7 years ago | |

It may be slower, but I'd give up speed if this would result in lower memory usage.

pizlonator 7 years ago | |

On the one hand, you’re totally right.

But maybe there will be some breakthrough, so it’s important to stay open-minded. That breakthrough may be something modest, such as this style of JS execution turning out to be better for some niche use case.

jacobush 7 years ago | |

I could imagine a use case: you want to include some JavaScript code base in your own EXE, but you don't want to drag in node.js or something heavy like that.

truth_seeker 7 years ago |

I wonder how much optimization it will bring compared to existing JS runtimes such as V8.

Thanks to the competition among web browsers, JS runtimes not only efficiently parse to optimized native code but also provide really good JIT compilation benefits.

Speculative optimization for V8 - https://ponyfoo.com/articles/an-introduction-to-speculative-...

Parallel and Concurrent GC - https://v8.dev/blog/trash-talk

Good summary on 10 years of V8 - https://v8.dev/blog/10-years

ridiculous_fish 7 years ago | |

v8-style engines do not parse to optimized native code.

As described in the links, v8 parses to an AST, which then is compiled to bytecode. A bytecode VM then executes the JS, collecting runtime type information, which is input (along with the bytecode itself) into the next compilation tier; only at that point is machine code generated.

The key idea is that v8 expects to execute the JS code before it can generate native code. It won't generate native code from parsing alone.

rkeene2 7 years ago |

A lot of research and hard work has gone into TclQuadCode [0], which compiles Tcl (which is even more dynamic than JavaScript) into machine code via LLVM.

The authors indicated at one point it took around 5 PhDs to get it going.

[0] https://core.tcl.tk/tclquadcode/dir?ci=trunk

ridiculous_fish 7 years ago | |

This is bizarre and fascinating. I had no idea there were Tcl codebases of a size that could benefit from this sort of perf work. How much Tcl is out there?

ndesaulniers 7 years ago |

I actually think this is possible, and started prototyping it (because esprima is awesome, and not too many other languages have an equivalent that's so easy to use).

Some thoughts: I think it's easier to target C++ than C, since C++ can help you write more type-generic code. I think it's easy to generate tagged unions, then for optimizations try to prove monomorphism. Finally, it may be simpler to start off with support for TypeScript, and fail to compile if there are any ANY types. I do think it's possible though. JS/TS -> C++ -> WASM (yes, I was out of my mind when I thought of this)
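A minimal sketch of the tagged-union idea, written in C rather than C++ to match the target language of the compiler under discussion. The `Value` type and all function names are hypothetical, and only three JS types are modelled (number, boolean, undefined), so string concatenation and objects are deliberately left out. `add_generic` is what the compiler would emit when it knows nothing about the operands; `add_numbers` is what it could emit once it has proved both operands are always numbers.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

typedef enum { TAG_NUMBER, TAG_BOOLEAN, TAG_UNDEFINED } Tag;

/* Hypothetical boxed JS value: a tag plus a payload. */
typedef struct {
    Tag tag;
    union {
        double number;
        bool boolean;
    } as;
} Value;

Value make_number(double n) {
    Value v;
    v.tag = TAG_NUMBER;
    v.as.number = n;
    return v;
}

Value make_boolean(bool b) {
    Value v;
    v.tag = TAG_BOOLEAN;
    v.as.boolean = b;
    return v;
}

Value make_undefined(void) {
    Value v;
    v.tag = TAG_UNDEFINED;
    v.as.number = 0.0;
    return v;
}

/* Numeric coercion for the three modelled tags:
   true -> 1, false -> 0, undefined -> NaN. */
double to_number(Value v) {
    switch (v.tag) {
    case TAG_NUMBER:    return v.as.number;
    case TAG_BOOLEAN:   return v.as.boolean ? 1.0 : 0.0;
    case TAG_UNDEFINED: return NAN;
    }
    assert(0 && "unknown tag");
    return NAN;
}

/* Generic `a + b`: inspects tags and coerces on every call. */
Value add_generic(Value a, Value b) {
    return make_number(to_number(a) + to_number(b));
}

/* Monomorphic `a + b`: usable only where both operands are proved to
   be numbers, so no tags, no boxing, no coercion. */
double add_numbers(double a, double b) {
    return a + b;
}
```

The optimisation step is then a matter of deciding, per call site, whether the cheaper `add_numbers` form is provably safe.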

johnhenry 7 years ago | |

Isn't this kind of what V8 and other modern JavaScript engines do on the fly already?

mikece 7 years ago | | |

Yes: one of the Google engineers working on V8 talked about it here: https://softwareengineeringdaily.com/2018/10/03/javascript-a...

It was this conversation that made me wonder whether V8 might at some point have experimental native support for TypeScript, but compiling WebAssembly to a native binary probably makes more sense. Who knows? It's an awesome time to be a programmer!

maxxxxx 7 years ago |

With all the dynamic stuff JavaScript has, it seems really difficult to create performant C code.

There is a PHP to .NET compiler which probably has similar problems. On second thought, that one is probably easier because .NET has a dynamic runtime.

nicoburns 7 years ago | |

That, and in practice modern PHP is often relatively statically typed (classes declare their fields, etc.), and many codebases even include type annotations (which are supported in PHP and usually enforced at runtime).

tannhaeuser 7 years ago |

Congrats on completing this project. What's the status and what are the further plans for it? I didn't find a license.

ridiculous_fish 7 years ago |

How are exceptions handled with this design? For example, the `n < 3` comparison may throw (it invokes the valueOf method when `n` is an object).

timruffles 7 years ago | |

I went for longjmp(), a non-local goto, for precisely the reason you highlighted: I realised pretty well everything in JS can throw, so I needed something easy to trigger from anywhere.

See https://github.com/timruffles/js-to-c/blob/1befbf4220753576e...
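For readers unfamiliar with the technique, this is a minimal sketch of try/catch built on `setjmp`/`longjmp`. It is not the project's actual code (see the link above for that); `js_throw`, `js_last_error`, `less_than_three` and `try_compare` are hypothetical names, and the sketch assumes a single thread and no uncaught-exception path.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical runtime state: the innermost active handler and the
   message of the most recently thrown error. */
static jmp_buf *current_handler = NULL;
static const char *pending_error = NULL;

/* Throw from anywhere: record the error, then jump straight back to
   the innermost handler, skipping all frames in between. */
void js_throw(const char *message) {
    assert(current_handler != NULL); /* sketch: uncaught throws unsupported */
    pending_error = message;
    longjmp(*current_handler, 1);
}

const char *js_last_error(void) {
    return pending_error;
}

/* Stand-in for an operation that may throw, such as a comparison
   whose operand has a throwing valueOf. */
int less_than_three(int n, int value_of_throws) {
    if (value_of_throws) {
        js_throw("valueOf threw");
    }
    return n < 3;
}

/* Roughly: try { return n < 3 } catch (e) { return -1 } */
int try_compare(int n, int value_of_throws) {
    jmp_buf handler;
    jmp_buf *outer = current_handler; /* remember the enclosing handler */
    int result;

    if (setjmp(handler) == 0) {
        /* Normal path: install our handler and run the try body. */
        current_handler = &handler;
        result = less_than_three(n, value_of_throws);
    } else {
        /* Reached via longjmp: this is the catch block. */
        result = -1;
    }

    current_handler = outer; /* restore the enclosing handler on both paths */
    return result;
}
```

Calling `try_compare(1, 1)` takes the catch path and returns -1, while `try_compare(1, 0)` returns the comparison result. One caveat of this approach in general: locals modified between `setjmp` and `longjmp` must be `volatile` if they are read after the jump, which the sketch avoids by only assigning `result` afterwards.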

maxgraey 7 years ago |

I wonder why people still try to write transpilers from JS to C/C++ or LLVM (the latter at least makes sense). This approach isn't performant and usually produces much more overhead than a JIT VM that uses speculative optimizations.

Some projects:

1. https://github.com/fabiosantoscode/js2cpp
2. https://github.com/raphamorim/js2c
3. https://github.com/ammer/js2c
4. https://github.com/NectarJS/nectarjs
5. https://github.com/ovr/StaticScript