Reality Check for Cloudflare Wasm Workers and Rust

Reality Check for Cloudflare Wasm Workers and Rust (nickb.dev)

156 points by comagoosie 4 years ago | 59 comments

dgreensp 4 years ago |

I think the one-sentence version of this is that Workers are meant for small, undemanding tasks (for example, they have tight memory limits and don’t have great performance), so using them to do “serious number crunching” at the edge, which is the advertised use case, seems questionable.

I think the blurb about the downsides of Wasm is just too generic; it’s a sort of “why Wasm isn’t preferable to JS in all cases” for the uninitiated. It may not be meant to imply that number crunching is the use case.

kentonv 4 years ago | |

> I think the one-sentence version of this is that Workers are meant for small, undemanding tasks

Not at all! We're building a platform on which you can build your entire app. What you say may have been the case four years ago when Workers launched, but since then we've added Durable Objects, Cron triggers, much longer time limits, etc. We very much believe Workers can be a stand-alone alternative to other cloud providers.

> for example, they have tight memory limits and don’t have great performance

This isn't true.

"Performance" is a vague term, you need to clarify the use case and what you're measuring. But, I can't think of what you could mean by "don't have great performance", that seems to imply that they execute code slower or something, which just isn't true at all. In many cases, Workers perform much better than you could achieve with any other platform, due to the ability to spread work and data across the network and move it close to where it's needed.

The "memory limit" on a single worker instance is 128MB, but Cloudflare runs many instances of the worker around the world, so across the network you're really getting many gigabytes of memory. By building a distributed system based on Durable Objects, you can harness the memory of many instances to use on a single task. Workers definitely biases towards distributing load across the network rather than running a single fat instance of your server, but that just means Workers makes it easy to build apps that scale much higher.

What this article is highlighting is that Wasm is still an immature technology. That is, unfortunately, just a fact. There's still work to be done, and progress is being made, but it's still early. The code footprint issue (because every app must bring along its own language runtime) is the biggest blocker. We hope to see that solved with dynamic linking.

But, Workers isn't primarily based on Wasm. The vast majority of Workers are written in JavaScript, where these issues don't exist. Workers runs JavaScript just as fast as any Node.js server, and runs it closer to the client resulting in better latency.

nyanpasu64 4 years ago | | |

I don't see how your response addresses the article's issues where workers don't have enough memory, space, or runtime to perform number crunching, meaning that for people with those use cases it's not a full "alternative to other cloud providers".

Gepsens 4 years ago | | |

I have a server engine that runs a lot of small (and potentially unique) tasks (say 0.2% of 1 vCPU and 30 MB of RAM), but they require durable disk (RocksDB or SQLite, for instance, plus some random files), and I want to locate them in specific regions depending on the server they are targeting. Would Cloudflare Workers be good enough for that, or shall I stick to DO's cheap VMs?

ignoramous 4 years ago | | |

> We very much believe Workers can be a stand-alone alternative to other cloud providers.

Time for Durable Workers: Run one Worker (per Durable Object instance) with higher RAM and wall-time ceiling, capable of serving WebSockets and WebRTC data-channels (and gRPC, if we are being ambitious)?

> The "memory limit" on a single worker instance is 128MB, but Cloudflare runs many instances of the worker around the world, so across the network you're really getting many gigabytes of memory.

Throw in the zonal Cloudflare cache (which is free up to 500MB per cached object) and some clever workarounds, and a lot could be done. Maybe Cloudflare dev-rel could add it to a "Workers SDK" of sorts to make this easier, or the eng team can work towards seamlessly exposing it to Workers instances as swap space? :D

azakai 4 years ago | | |

Small note on this:

> What this article is highlighting is that Wasm is still an immature technology.

I assume you meant on the server, where wasm is fairly new, that's true. On the browser, wasm is a mature and stable part of the Web platform.

Otherwise very good points!

mwcampbell 4 years ago | | |

Still, as you wrote earlier this year [1], there are valid use cases for something chunkier than isolates (and yes, I know that lifting and shifting existing web apps isn't one of them). I'm looking forward to your edge containers becoming more widely available.

[1]: https://blog.cloudflare.com/containers-on-the-edge/

csomar 4 years ago | |

> I think the one-sentence version of this is that Workers are meant for small, undemanding tasks (for example, they have tight memory limits and don’t have great performance)

That could be most web app functionality: things like registration, authorization/authentication, sending emails, storing/retrieving data, etc.

> so using them to do “serious number crunching” at the edge, which is the advertised use case, seems questionable.

Cloudflare workers don't run in the background. They block the HTTP request. For serious computation, Cloudflare should offer background workers that can run for extended periods of time. [1]

[1]: This could be worked around by triggering an async request, but there is no push API to notify the "App" of the result.

eloff 4 years ago | | |

Workers does have cron-like functionality. My memory is fuzzy, but it's been around for a while.

https://developers.cloudflare.com/workers/platform/cron-trig...

kentonv 4 years ago | | |

> Cloudflare workers don't run in the background. They block the HTTP request. For serious computation, Cloudflare should offer background workers that can run for extended periods of time.

You can use `event.waitUntil()` to schedule a task that runs after the HTTP response has completed, and you can use cron triggers to schedule background work in the absence of any HTTP request at all. You can even build a reliable async queuing system on top of cron triggers and Durable Objects, though at the moment it's a bit DIY -- we're working on improving that.
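A minimal sketch of the `event.waitUntil()` pattern described above. `FetchEventLike`, `recordVisit`, and `handleRequest` are hypothetical names introduced for illustration; the interface is only a stand-in for the runtime's fetch event so the snippet runs outside Workers, and the string return value stands in for a real response.

```typescript
// Hypothetical stand-in for the runtime's fetch event; only the one method
// this sketch needs is modelled.
interface FetchEventLike {
  waitUntil(task: Promise<unknown>): void;
}

// Hypothetical follow-up work (for example, writing an audit record).
async function recordVisit(path: string, visitLog: string[]): Promise<void> {
  // Yield once to simulate work that completes after the response is built.
  await Promise.resolve();
  visitLog.push(`visited ${path}`);
}

// Builds the response body immediately and hands the follow-up work to
// waitUntil() instead of awaiting it, so the request is not blocked on it.
function handleRequest(
  event: FetchEventLike,
  path: string,
  visitLog: string[],
): string {
  event.waitUntil(recordVisit(path, visitLog));
  return `ok: ${path}`;
}
```

The point of the design is that the handler never awaits the task itself; it only registers the promise, and the runtime is expected to keep the instance alive until that promise settles.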

wrkronmiller 4 years ago | | |

Given HTTP pipelining and fetch() promise semantics, I'm not sure there's a practical benefit to pushing results instead of "blocking" the original request.

brundolf 4 years ago | |

I for one hadn't thought about the cost of shipping your own standard library with every bundle, so it was informative for me

wibagusto 4 years ago | | |

WASM tasks shouldn’t need a full standard library. If you statically compile against any library it should only keep the pieces used.

brundolf 4 years ago |

> I guess I’ll stick with my error prone Javascript Workers or, more likely, spend an afternoon migrating to a minimal Typescript setup.

If the OP wants a zero-config typescript experience (assuming Deno isn't available on Cloudflare workers), I can't recommend esbuild enough

comagoosie 4 years ago | |

I think this is an excellent suggestion (I'm OP / author), and one can add just a dash to this for typechecking. Minimal setups are appreciated, especially when one has many small projects.

brundolf 4 years ago | | |

Glad I could help! Esbuild won't do the actual type-checking for you, but your editor will (hopefully also without configuration)

I'll put it this way: I've spent enough time with Webpack and Babel and TSC at this point that I can troubleshoot most issues without too much difficulty. But despite that I reach for esbuild every time I possibly can, because I just don't want to mess with all that stuff if I don't have to.

wtetzner 4 years ago | |

There's also js_of_ocaml if a good type system is desired.

brundolf 4 years ago | | |

I think that's out of scope for the OP's needs/wants. They like TypeScript for catching basic API mistakes, the only thing they don't want is configuration headaches. I didn't get the sense they would be interested in learning a new language for this use-case, especially since I'm going to guess Cloudflare doesn't publish OCaml types for their JavaScript API.

Yoric 4 years ago | | |

When we wrote opalang, the size of generated JS was problematic, despite serious efforts at minimizing it. Does js_of_ocaml do better?

dafelst 4 years ago |

Great overview OP, and it's nice to see a kind of "in-between" scenario tested, i.e. not a super fast web request or transformation, but rather something more akin to a lightweight batch job. It may not be quite a "recommended" use case, but it is always interesting (for me at least) to see how these sorts of services' capabilities can be pushed and/or (gently) abused. The memory and code size limitations do seem very restrictive right now, though, which is a shame.

Seeing WASM evolve as the new sandboxed runtime target du jour is super interesting, and I love that it is bringing a greater variety of very powerful but traditionally backend or systems languages to the web.

up6w6 4 years ago |

IIRC the compilation time of Wasm in Cloudflare Workers is very problematic [1], and right now it contradicts their idea of running low-latency, fast scripts. Does anyone know if anything has changed?

https://community.cloudflare.com/t/fixed-cloudflare-workers-...

kentonv 4 years ago | |

Yes, it has changed, which is why the thread you linked has "[FIXED]" in the title. Details can be found later on in the thread.

up6w6 4 years ago | | |

Yeah, my bad, maybe a better sentence would be asking if the problem was solved completely...

Basically the "[FIXED]" in the thread is about making the compilation time 10x faster, from seconds to hundreds of milliseconds, which is still incompatible with a service that tries to use the few milliseconds of the TLS handshake to eliminate cold start latency.

nostrebored 4 years ago | | |

Your replies throughout the thread have been pretty hostile. You're much closer to this than anyone here. Empathy to that goes a long way.

jfrunyon 4 years ago |

I believe a zip file could be streamed - most of the file metadata is duplicated between both the 'central directory record' trailer and a header in front of each file. In other words, the first thing in the zip file is a header that you can use to extract the first file, followed by that file, followed by the next file's header...
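To make the streaming idea concrete, here is a rough sketch of reading the header that sits in front of each file. The field offsets follow the published ZIP layout for local file headers; the names `LocalFileHeader` and `parseLocalFileHeader` are made up for this example, and file names are decoded as plain ASCII purely to keep the sketch dependency-free.

```typescript
// Fixed-size part of a ZIP local file header (all fields little-endian).
const LOCAL_HEADER_SIGNATURE = 0x04034b50; // the bytes "PK\x03\x04"
const LOCAL_HEADER_FIXED_SIZE = 30;

interface LocalFileHeader {
  compressionMethod: number; // 0 = stored, 8 = deflate
  sizesDeferred: boolean; // flag bit 3: sizes follow the data instead
  compressedSize: number;
  uncompressedSize: number;
  fileName: string;
  dataOffset: number; // where this entry's data begins
}

// Returns null when the bytes at `offset` are not a complete local header.
function parseLocalFileHeader(
  bytes: Uint8Array,
  offset = 0,
): LocalFileHeader | null {
  const available = bytes.length - offset;
  if (available < LOCAL_HEADER_FIXED_SIZE) return null;
  const view = new DataView(bytes.buffer, bytes.byteOffset + offset, available);
  if (view.getUint32(0, true) !== LOCAL_HEADER_SIGNATURE) return null;

  const flags = view.getUint16(6, true);
  const nameLength = view.getUint16(26, true);
  const extraLength = view.getUint16(28, true);
  if (available < LOCAL_HEADER_FIXED_SIZE + nameLength + extraLength) return null;

  // ASCII-only decoding: an assumption of this sketch, not of the format.
  let fileName = "";
  for (let i = 0; i < nameLength; i++) {
    fileName += String.fromCharCode(view.getUint8(LOCAL_HEADER_FIXED_SIZE + i));
  }

  return {
    compressionMethod: view.getUint16(8, true),
    sizesDeferred: (flags & 0x08) !== 0,
    compressedSize: view.getUint32(18, true),
    uncompressedSize: view.getUint32(22, true),
    fileName,
    dataOffset: offset + LOCAL_HEADER_FIXED_SIZE + nameLength + extraLength,
  };
}
```

One caveat for streaming: when `sizesDeferred` is true, the sizes in this header are zero and the real values only appear in a data descriptor after the file data, so a reader cannot know in advance where the entry ends.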

stavros 4 years ago | |

You can, yes. You can even download the header, open the zip file, choose which files to extract, and only download those. This is possible with HTTP today, and has been for decades.

jfrunyon 4 years ago | | |

> You can even download the header, open the zip file, choose which files to extract, and only download those.

That bit isn't true, unfortunately; the file list is stored at the end of the file. You either have to (sequentially, file-by-file) scan the zip to find the file you want, or look at the end. Even more unfortunately, there are several variable-length fields between the interesting stuff like the file list and the end of the archive, and the length of those fields is stored before the field itself, so you can't simply use a Range request to retrieve the last <x> bytes from the end, either. (And even more more unfortunately, the very last thing in the file is a "comment" field, which could conceivably contain the magic number that you have to look for in order to read the trailer/footer record.)

The Zip format is truly awful.

https://en.wikipedia.org/wiki/ZIP_(file_format)#Central_dire...

https://github.com/python/cpython/blob/7e465a6b8273dc0b6cb48...
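A sketch of the backward scan for the trailer record discussed above, assuming the caller already holds the final bytes of the archive. Because the comment length is a 16-bit field, the record must start within the last 22 + 65,535 bytes, so one option is to fetch that whole tail up front. The function and constant names are invented for this example, ZIP64 archives are not handled, and the consistency check reduces, but does not eliminate, the risk of matching a magic number inside the comment.

```typescript
const EOCD_SIGNATURE = 0x06054b50; // the bytes "PK\x05\x06"
const EOCD_FIXED_SIZE = 22;
const MAX_COMMENT_LENGTH = 0xffff;

// Upper bound on the tail that can contain the record, usable in a suffix
// range request such as `Range: bytes=-65557`.
const MAX_TAIL_BYTES = EOCD_FIXED_SIZE + MAX_COMMENT_LENGTH;

interface EndOfCentralDirectory {
  entryCount: number;
  centralDirectorySize: number;
  centralDirectoryOffset: number; // measured from the start of the archive
}

// `tail` must end exactly at the end of the archive.
function findEndOfCentralDirectory(
  tail: Uint8Array,
): EndOfCentralDirectory | null {
  const view = new DataView(tail.buffer, tail.byteOffset, tail.byteLength);
  // Scan backwards from the last position where a full record could fit.
  for (let pos = tail.length - EOCD_FIXED_SIZE; pos >= 0; pos--) {
    if (view.getUint32(pos, true) !== EOCD_SIGNATURE) continue;
    const commentLength = view.getUint16(pos + 20, true);
    // A genuine record's comment runs exactly to the end of the file; a
    // signature that merely appears inside the comment usually fails this.
    if (pos + EOCD_FIXED_SIZE + commentLength !== tail.length) continue;
    return {
      entryCount: view.getUint16(pos + 10, true),
      centralDirectorySize: view.getUint32(pos + 12, true),
      centralDirectoryOffset: view.getUint32(pos + 16, true),
    };
  }
  return null;
}
```

With the offset and size in hand, a second range request can fetch just the central directory, and further requests can fetch only the entries of interest.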

wibagusto 4 years ago |

Correct me if I’m wrong but the memory copying issue is not an issue if you pass an array buffer into WASM from JavaScript. In that scenario there’s no data copying. E.g. similar to how you’d pass the canvas data to WASM for direct manipulation.

azakai 4 years ago | |

No, this is an issue currently, both for network data and canvas data.

All wasm instructions can do is read and write from the wasm Memory that the wasm is initialized with. They can't even refer to separate things like a new ArrayBuffer from JS. So you do need to copy.
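For illustration, the copy being described looks roughly like this on the JS side. `copyIntoLinearMemory` is a hypothetical helper, and a plain `ArrayBuffer` stands in for the module's linear memory so the snippet runs without a compiled module.

```typescript
// Copies `data` into linear memory at `offset` and returns a view of the copy.
// `memory` is a plain ArrayBuffer here; with a real module it would be the
// `buffer` of the exported WebAssembly.Memory (an assumption of this sketch).
function copyIntoLinearMemory(
  memory: ArrayBuffer,
  offset: number,
  data: Uint8Array,
): Uint8Array {
  if (offset < 0 || offset + data.length > memory.byteLength) {
    throw new RangeError("data does not fit in linear memory at this offset");
  }
  // The view aliases linear memory; set() performs the actual byte copy.
  const destination = new Uint8Array(memory, offset, data.length);
  destination.set(data);
  return destination;
}
```

The source array and the destination view are independent after the call, which is exactly the duplication the article complains about: the bytes exist once in the JS object and once in the module's memory.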

Newer wasm additions like reference types allow an ArrayBuffer to be referred to inside wasm, but only as an opaque reference to the entire thing (an externref). There is still no ability to actually read and write from it inside wasm.

The solution to this is BYOB ("bring your own buffer") APIs, which JS is adding. They are experimental atm though. Here is the relevant one here:

https://developer.mozilla.org/en-US/docs/Web/API/ReadableStr...

Note how you pass in a view to the JS API. That can be a view into the wasm memory, letting the browser directly write data into there, and then wasm can operate on it immediately.

wibagusto 4 years ago | | |

So the workaround is to initialize WASM with an array buffer which allows direct manipulation of the data in WASM it seems. The canvas bytes can be used in WASM so long as the bytes are passed in at initialization.

In that case, JavaScript’s only duty is to pack input data into the byte array and pass control to WASM, which writes to an output buffer (also a shared byte array) and passes control back to JavaScript to process the result, avoiding an unnecessary data copy.

You’d need some types in JS that can process the data in a lower-level way, using bit shifts and whatnot, though.

Matthias247 4 years ago | |

So how do you fill that arraybuffer from Javascript datatypes (e.g. strings which describe the request)? The answer is "by copying the relevant data" - which is exactly what the bridging code between JS and WASM does. You can only avoid that if the source data is already plain byte arrays and not javascript objects.
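As a small illustration of that bridging step, this sketch encodes a JS string as UTF-8 straight into a byte buffer using the standard `TextEncoder.encodeInto()`. `writeStringIntoMemory` is a hypothetical helper, and the `ArrayBuffer` again stands in for the module's linear memory.

```typescript
// Encodes `text` as UTF-8 directly into `memory` starting at `offset`.
// Returns the number of bytes written, which the wasm side would need
// alongside the offset to locate the string.
function writeStringIntoMemory(
  memory: ArrayBuffer,
  offset: number,
  text: string,
): number {
  // View from `offset` to the end of the buffer; encodeInto() stops when full.
  const destination = new Uint8Array(memory, offset);
  const result = new TextEncoder().encodeInto(text, destination);
  // `read` counts UTF-16 code units consumed; fewer than the string's length
  // means the encoded text did not fit.
  if (result.read !== text.length) {
    throw new RangeError("string does not fit in the remaining memory");
  }
  return result.written;
}
```

Even in this best case the string's contents are still transcoded and written into the buffer, so the work is a copy in all but name; it only avoids an intermediate byte array.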