I think the blurb about the downsides of Wasm is just too generic, it’s a sort of “why Wasm isn’t preferable to JS in all cases” for the uninitiated. It may not be meant to imply that number crunching is the use case.
Not at all! We're building a platform on which you can build your entire app. What you say may have been the case four years ago when Workers launched, but since then we've added Durable Objects, Cron triggers, much longer time limits, etc. We very much believe Workers can be a stand-alone alternative to other cloud providers.
> for example, they have tight memory limits and don’t have great performance
This isn't true.
"Performance" is a vague term, you need to clarify the use case and what you're measuring. But, I can't think of what you could mean by "don't have great performance", that seems to imply that they execute code slower or something, which just isn't true at all. In many cases, Workers perform much better than you could achieve with any other platform, due to the ability to spread work and data across the network and move it close to where it's needed.
The "memory limit" on a single worker instance is 128MB, but Cloudflare runs many instances of the worker around the world, so across the network you're really getting many gigabytes of memory. By building a distributed system based on Durable Objects, you can harness the memory of many instances to use on a single task. Workers definitely biases towards distributing load across the network rather than running a single fat instance of your server, but that just means Workers makes it easy to build apps that scale much higher.
What this article is highlighting is that Wasm is still an immature technology. That is, unfortunately, just a fact. There's still work to be done, and progress is being made, but it's still early. The code footprint issue (because every app must bring along its own language runtime) is the biggest blocker. We hope to see that solved with dynamic linking.
But, Workers isn't primarily based on Wasm. The vast majority of Workers are written in JavaScript, where these issues don't exist. Workers runs JavaScript just as fast as any Node.js server, and runs it closer to the client resulting in better latency.
Time for Durable Workers: Run one Worker (per Durable Object instance) with higher RAM and wall-time ceiling, capable of serving WebSockets and WebRTC data-channels (and gRPC, if we are being ambitious)?
> The "memory limit" on a single worker instance is 128MB, but Cloudflare runs many instances of the worker around the world, so across the network you're really getting many gigabytes of memory.
Throw in the zonal Cloudflare cache (which is free up to 500MB per cached object) and some clever workarounds, and a lot could be done. Maybe Cloudflare dev-rel could add it to a "Workers SDK" of sorts to make this easier, or the eng team could work towards seamlessly exposing it to Workers instances as swap space? :D
> What this article is highlighting is that Wasm is still an immature technology.
I assume you meant on the server, where wasm is fairly new; that's true. In the browser, wasm is a mature and stable part of the Web platform.
Otherwise very good points!
That could be most web app functionality: things like registration, authorization/authentication, sending emails, storing/retrieving data, etc.
> so using them to do “serious number crunching” at the edge, which is the advertised use case, seems questionable.
Cloudflare workers don't run in the background. They block the HTTP request. For serious computation, Cloudflare should offer background workers that can run for extended periods of time. [1]
1: This could be worked around by triggering an async request, but there is no push API to notify the "App" of the result.
https://developers.cloudflare.com/workers/platform/cron-trig...
You can use `event.waitUntil()` to schedule a task that runs after the HTTP response has completed, and you can use cron triggers to schedule background work in the absence of any HTTP request at all. You can even build a reliable async queuing system on top of cron triggers and Durable Objects, though at the moment it's a bit DIY -- we're working on improving that.
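To make the `event.waitUntil()` pattern concrete, here is a minimal sketch. It does not use the real Workers runtime: `EventLike`, `recordVisit`, and `handleRequest` are hypothetical names, and the event is modelled as an interface with only a `waitUntil()` method, so the snippet runs anywhere. The point it illustrates is that the handler returns its response without waiting for the background task.

```typescript
// Minimal stand-in for a Workers fetch event (assumption: only waitUntil() is modelled).
interface EventLike {
  waitUntil(task: Promise<unknown>): void;
}

// Hypothetical background job, e.g. writing an analytics record.
async function recordVisit(path: string, log: string[]): Promise<void> {
  await Promise.resolve(); // stand-in for real async I/O
  log.push(`visited ${path}`);
}

// Hypothetical handler: responds immediately, lets the job finish afterwards.
function handleRequest(event: EventLike, path: string, log: string[]): string {
  // Hand the promise to the runtime; the return below is not delayed by it.
  event.waitUntil(recordVisit(path, log));
  return `hello from ${path}`;
}
```

In a real Worker the runtime keeps the instance alive until the promises passed to `waitUntil()` settle; here a caller would have to collect and await them itself.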
If the OP wants a zero-config TypeScript experience (assuming Deno isn't available on Cloudflare Workers), I can't recommend esbuild enough.
I'll put it this way: I've spent enough time with Webpack and Babel and TSC at this point that I can troubleshoot most issues without too much difficulty. But despite that I reach for esbuild every time I possibly can, because I just don't want to mess with all that stuff if I don't have to.
Seeing WASM evolve as the new sandboxed runtime target du jour is super interesting, and I love that it is bringing a wider variety of very powerful but traditionally backend or systems languages to the web.
https://community.cloudflare.com/t/fixed-cloudflare-workers-...
Basically the "[FIXED]" in the thread is about making the compilation time 10x faster, from seconds to hundreds of milliseconds, which is still incompatible with a service that tries to make use of the few milliseconds from the TLS handshake to eliminate cold start latency.
That bit isn't true, unfortunately; the file list is stored at the end of the file. You either have to (sequentially, file-by-file) scan the zip to find the file you want, or look at the end. Even more unfortunately, there are several variable-length fields between the interesting stuff like the file list and the end of the archive, and the length of those fields is stored before the field itself, so you can't simply use a Range request to retrieve the last <x> bytes from the end, either. (And even more more unfortunately, the very last thing in the file is a "comment" field, which could conceivably contain the magic number that you have to look for in order to read the trailer/footer record.)
The Zip format is truly awful.
https://en.wikipedia.org/wiki/ZIP_(file_format)#Central_dire...
https://github.com/python/cpython/blob/7e465a6b8273dc0b6cb48...
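The trailer search described above can be sketched as follows. This is a minimal illustration, not any library's implementation: `findEndOfCentralDirectory` and the result field names are hypothetical, and it assumes the tail of the archive (at most 22 + 65,535 bytes) is already in memory. It scans backwards for the `PK\x05\x06` signature and rejects candidates whose comment-length field does not reach exactly to the end of the data. That check reduces, but cannot fully rule out, a false match on a signature embedded in the comment.

```typescript
interface EndOfCentralDirectory {
  recordOffset: number;           // where the trailer record starts within `bytes`
  totalEntries: number;           // number of files listed in the central directory
  centralDirectorySize: number;   // size of the file list in bytes
  centralDirectoryOffset: number; // offset of the file list from the start of the archive
}

const EOCD_SIGNATURE = 0x06054b50; // bytes 50 4b 05 06 read as a little-endian uint32
const EOCD_MIN_SIZE = 22;          // fixed part of the record, without the comment
const MAX_COMMENT_SIZE = 0xffff;   // comment length is a 16-bit field

function findEndOfCentralDirectory(bytes: Uint8Array): EndOfCentralDirectory | null {
  if (bytes.byteLength < EOCD_MIN_SIZE) return null;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const last = bytes.byteLength - EOCD_MIN_SIZE;
  const first = Math.max(0, last - MAX_COMMENT_SIZE);
  // Scan from the end, since the record sits before a variable-length comment.
  for (let i = last; i >= first; i--) {
    if (view.getUint32(i, true) !== EOCD_SIGNATURE) continue;
    const commentLength = view.getUint16(i + 20, true);
    // A genuine record's comment must end exactly at the end of the data.
    if (i + EOCD_MIN_SIZE + commentLength !== bytes.byteLength) continue;
    return {
      recordOffset: i,
      totalEntries: view.getUint16(i + 10, true),
      centralDirectorySize: view.getUint32(i + 12, true),
      centralDirectoryOffset: view.getUint32(i + 16, true),
    };
  }
  return null;
}
```

The sketch ignores the Zip64 variants of these fields, which a real reader would also need to handle.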
All wasm instructions can do is read and write from the wasm Memory that the wasm is initialized with. They can't even refer to separate things like a new ArrayBuffer from JS. So you do need to copy.
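The copy step mentioned here can be sketched like this. `LinearMemory` and `copyIntoLinearMemory` are hypothetical names; `LinearMemory` is only a structural stand-in for `WebAssembly.Memory` (anything exposing a `buffer`), so the snippet does not need a compiled module to run.

```typescript
// Structural stand-in for WebAssembly.Memory: anything exposing a `buffer`.
interface LinearMemory {
  buffer: ArrayBuffer;
}

// Copies `data` into linear memory at `offset` and returns a view of the copied bytes.
function copyIntoLinearMemory(memory: LinearMemory, data: Uint8Array, offset: number): Uint8Array {
  if (offset < 0 || offset + data.byteLength > memory.buffer.byteLength) {
    throw new RangeError("data does not fit in linear memory at this offset");
  }
  // Create the view at call time: growing a wasm memory replaces its buffer,
  // so views created earlier may no longer be valid.
  const target = new Uint8Array(memory.buffer, offset, data.byteLength);
  target.set(data); // this is the copy; wasm code can now read these bytes
  return target;
}
```

With a real module, the first argument would be something like `new WebAssembly.Memory({ initial: 1 })` or the memory exported by the instance, and `offset` would come from the module's own allocator.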
Newer wasm additions like reference types allow an ArrayBuffer to be referred to inside wasm, but only as an opaque reference to the entire thing (an externref). There is still no ability to actually read and write from it inside wasm.
The solution to this is BYOB ("bring your own buffer") APIs, which JS is adding. They are experimental atm though. Here is the relevant one:
https://developer.mozilla.org/en-US/docs/Web/API/ReadableStr...
Note how you pass in a view to the JS API. That can be a view into the wasm memory, letting the browser directly write data into there, and then wasm can operate on it immediately.
In that case, JavaScript's only duty is to pack input data into the byte array and pass control to WASM, which writes to an output buffer (also a shared byte array) and passes control back to JavaScript to process the result, avoiding an unnecessary data copy.
You’d need some types in JS that can process the data at a lower level, using bit shifts and whatnot, though.
Disclaimer: I work at StackPath which offers containers and VMs at the edge with anycast IPs which is perfect for this use case.
But, it's a little hard to answer the question based on your description, because you've described your setup in traditional server terms that don't translate directly to how Workers does things. Workers doesn't have tasks running in VMs or containers, it's a serverless distributed systems platform where code runs in response to events across a wide network.
So, in order to tell you how to build your application on Workers, I'd need to know what the application actually does.
Most likely, though, you would replace your rocksdb/sqlite storage with Durable Objects storage. You could locate different Durable Objects in specific regions as desired.
EDIT: There's also bucklescript, which goes from OCaml to JS (skips the bytecode), but I think it is more relaxed in terms of preserving OCaml behavior. i.e., it relies on how JS behaves for certain operations, instead of trying to preserve how natively compiled OCaml would behave.
The Wasm module in question here was 2.6MB uncompressed, which we consider quite large. IIRC after enabling liftoff it took about 200ms to load. Smaller modules would be expected to load faster. We'd normally recommend doing some tree shaking to get the size down to something in the hundreds-of-kilobytes range, in which case the load time would be expected to be tens of milliseconds, comparable to a TLS handshake.
The developer who reported the problem here considered it to be fixed, and we haven't heard anyone complain about Wasm cold start times since then, so we haven't prioritized further optimizations for the time being. If current cold start times are a problem for you, though, then please let us know.
> we're working on improving that.
I'd assume you are working for Cloudflare. Do you see it going the way of Firebase?
I don't know much about Firebase, to be honest, so I'm not sure how to answer that question. But, our aim is that every type of server compute should be something you can build on Workers. Meanwhile, our design philosophy is that Workers should feel like you're programming one big, globe-spanning computer, rather than lots of individual servers.
In both environments there are certainly use cases where Wasm provides huge advantages. But those use cases are still narrow. Over time they'll grow, but there's still tons of work to do.
Yes, if you want to write a client-side webapp you run into limitations. That wasn't one of our main goals when we created wasm, though! It would be great if that materializes - more options are always good - but JavaScript is frankly the right tool for 99% of sites and we never intended wasm to directly compete with JS there.
Wasm is stable and mature for solving the needs of sites like Google Earth, Unity games, Figma, Meet, Zoom, etc. Those require more than what JS can offer and wasm is the perfect fit for the relevant parts of them.
On those websites wasm is often the difference between shipping and not shipping. That's a huge deal, and why wasm has been focused there. Other use cases like replacing JS with wasm might offer some benefits in speed, perhaps, but the impact of that would be smaller (but it could eventually apply to a wider set of sites, potentially).
It's when people want to write entire apps entirely in their language of choice, and want to accomplish this using Wasm, that the technology is still missing things. A lot of people want to do this, both on the browser side and the server side.
And from the reply, the response wasn't even correct. Bad form.
That's still a lot more than "a single file of javascript", certainly, but it's not that bad.
Trade-offs all the way down, as ever.
This is a pretty significant cause of bloat in practice, sadly, but toolchain improvements may help in the future. So for now, JS's advantage of the standard library being in the VM is pretty significant.
Adding GC to WASM makes it essentially like the JVM because it has to know about the layout of every type (to find pointers, etc.) As far as I can see, this effort is like bolting a VM that's 2x-5x as big (in terms of semantics) on top of the existing small WASM VM.
I think they will end up with something like the union of JVM and CLR [1], and even that's not enough.
JS already has garbage collection, but its runtime data types can't really host something like Java or OCaml efficiently.
----
The CLR is supposedly language-agnostic, but I'd argue it's not. Visual Basic was "broken" for this reason -- VB.NET is more like C# than VB6. The old code doesn't run.
I've heard PowerShell described as a weird shell-like syntax for writing C# programs.
And I remember F#'s behavior around null, algebraic data types, and exceptions was heavily influenced by the CLR. In some ways it's probably closer to C# than its prime influence of OCaml.
So while I don't know anything about the WASM GC effort (and haven't kept up with it), I'm skeptical that we'll get a true polyglot experience. What's more likely is that some languages will be favored over others, with the "losers" experiencing 2x - 10x slowdowns.
And this doesn't even get into the runtime library issues. For example PyPy is essentially perfectly compatible with CPython at the language level, and has been for over a decade. Still, many applications have difficulty migrating to it because they lose bindings to native libraries, like linear algebra with NumPy, and OpenGL, Win32 bindings, etc. (these are enormous)
I expect the analogous issue to be a big problem for using WASM in a polyglot fashion too.
----
As a separate issue, WASM is still not up to par with native code in terms of protections around the stack and the heap: https://lobste.rs/s/a9ghhz/maintain_it_with_zig#c_ghawis . Thus it favors Rust over C/C++, since Rust enforces more integrity at compile time.
Real apps need to poke many holes in the VM to get anything done, and those attack vectors matter more as that happens. WASM follows the principle of least privilege better but it has regressed in other dimensions (at least if you want to run legacy C code, which was the original use case advertised)
----
[1] Random article from Google suggests that this is HARD, and these are two of the most similar VMs out there: https://www.overops.com/blog/clr-vs-jvm-how-the-battle-betwe...
CLR includes instructions for closures, coroutines and declaration/manipulation of pointers, the JVM does not
Another smaller difference is that the CLR was built with instructions for dealing with generic types and for applying parametric specializations on those types at runtime.
The language that fully exposes its capabilities is C++/CLI and not C#.
In fact, most of the performance improvements since C# 7, have been how to surface those C++ capabilities into C#, while keeping the generated MSIL verifiable. C++/CLI is also able to generate unsafe MSIL sequences.
This is especially relevant since C++/CLI is Windows only, so one cannot just switch to it in cross platform .NET.
Secondly, .NET was also an excuse to reboot the VB language, so some QuickBasic quirks were thrown out as well. Additionally, they have increasingly made it less painful to migrate old VB 6 code.
Other than that, you are right, and this is also a reason why platform languages always have an edge over guest languages, even if they aren't as shiny in getting new language features out the door.
That is, WASM is a much more natural target for C++/Rust than anything higher level. It can also express unsafe code when you consider the issues brought up in the paper, i.e. that it's unaware of heap integrity. Conversely, the JVM/CLR was traditionally better for Java/C# like languages and you couldn't run C/C++ naturally.
This makes sense as I've heard some of the more recent C# features are to recover performance, like value types, slicing, etc.
It is WebAssembly that tends to be "sold" as if it was the first of its kind.
In fact, even the CLR wasn't the first one; there were other bytecode formats for languages, like EM from the Amsterdam Compiler Kit and those on IBM and Unisys mainframes/micros.
And around 2003, there was a Swedish startup trying to push a mobile OS that used a VM capable of J2ME, C and C++, whose name I can no longer remember; only some Sony Ericsson models used to have it.