docker run --rm -it --privileged -p 3000:3000 -p 443:443 linuxserver/kasm bash
I had spent a long time trying to simplify Linux desktop application delivery with linuxserver/webtop and all the derivative dedicated app images, but the speed and quality were always lacking because it used XRDP in tandem with Guacamole. The difference with this new KasmVNC implementation (https://github.com/kasmtech/KasmVNC) is night and day. Depending on client hardware, it will deliver 60fps at 1080p and 40-60fps at 1440p in both the JPEG and QOI rendering modes. A quick video can be seen here: https://youtu.be/VkzG5BU2gjo
That is how NX4, Chrome, RustDesk, Zoom, etc. all do it.
You also have an inherent latency issue as you have to buffer 2-5 frames at 16ms a pop server side to encode the data.
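To put rough numbers on that, here is a quick back-of-the-envelope sketch. It assumes a 60 fps stream (so about 16.7 ms per frame) and counts only the time frames sit in the encoder's buffer, nothing else; the function name is just for illustration.

```python
# Back-of-the-envelope: latency added purely by buffering frames
# server side before the encoder emits anything. Assumes 60 fps.
FPS = 60


def buffering_latency_ms(frames_buffered: int, fps: float = FPS) -> float:
    """Milliseconds spent waiting for `frames_buffered` frames to accumulate."""
    return frames_buffered * 1000 / fps


for n in (2, 5):
    print(f"{n} frames buffered -> {buffering_latency_ms(n):.1f} ms")
```

So the 2-5 frame range works out to roughly 33-83 ms before network and decode time are even counted.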
The standard install uses no privileged containers. https://www.kasmweb.com/downloads
I only mention the Linuxserver container because most Linux/Docker users do not want to pollute their base OS with stuff just to try it out.
VNC is unfortunately inherently inefficient because it is just a framebuffer protocol, unlike RDP, which passes the graphics primitives through to be rendered on the client. The former will always involve encoding/decoding overhead at the server and client.
People used to do a lot of file sharing over the internet, for example, without Dropbox-like corporate choke points. And of course this article's subject, remote desktop usage, has been important for individual freedom of computing, to be able to e.g. use your home desktop/IT infra from work or while travelling.
Sometimes you gotta take what you can get.
However, I agree with your point, that it's not that useful when I only use the full connection a few minutes a month. Still, it was only an extra 5 € from 300/300, so I figured why not.
Edit: I wish I could install this on my Mac so I could VNC into it efficiently as well!
What is the state of the art for remote desktop from Linux to macOS/Windows, over "almost LAN" conditions (symmetric 1 Gbit ethernet, < 10 ms RTT)? The enterprise VDI solutions I've used at work have worked better than anything I've managed to put together, especially if you consider stuff like streaming audio from the host...
Sending graphics primitives turned out to be the worst way to do remote desktop. All modern solutions just use video codecs. NX (the best solution on Linux) even switched from X11 forwarding to a video codec in NX4.
VNC is inefficient because it is ancient and uses extremely inefficient methods to encode the graphics. GIF is really inefficient too but you wouldn't say that means the idea of encoding animated images as bitmaps is a bad one.
But IIRC, by 2007, it switched completely to sending tiles just like VNC.
However, if you read the protocol descriptions, you get the wrong idea that primitives are still used.
Similarly, most X11 clients have been doing everything client side for ages, but many people still believe that primitives dominate.
An example of a truly inefficient encoding method is ZRLE. It is a tiled zlib and run-length encoded format that can't be split up into multiple jobs because future computation depends on past computation.
They got things right with the Tight encoding method, of which there are two variants: zlib (lossless) and JPEG. With zlib, you can have up to 4 separate zlib streams, which means that you can utilise 4 CPU cores for encoding in parallel. The JPEG method has no such limitations.
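A minimal sketch of why separate streams help, using Python's standard zlib module. This is not the actual Tight wire format: the round-robin tile assignment, the tile contents and the function names are all made up for illustration. The point is only that each stream keeps its own compression state, so the four can run on different cores, whereas a single stream forces every tile to wait for the one before it.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

NUM_STREAMS = 4  # up to 4 independent zlib streams, as described above


def encode_tiles(tiles, compressors):
    """Compress tiles round-robin over independent streams, one worker each."""
    def run_stream(idx):
        comp = compressors[idx]
        out = []
        # Tiles within one stream stay in order; they share its dictionary.
        for tile in tiles[idx::NUM_STREAMS]:
            # Z_SYNC_FLUSH emits everything pending but keeps the stream open.
            out.append(comp.compress(tile) + comp.flush(zlib.Z_SYNC_FLUSH))
        return out

    with ThreadPoolExecutor(max_workers=NUM_STREAMS) as pool:
        return list(pool.map(run_stream, range(NUM_STREAMS)))


def decode_tiles(chunks_per_stream, decompressors, n_tiles):
    """Undo encode_tiles, restoring the original tile order."""
    tiles = [None] * n_tiles
    for idx, chunks in enumerate(chunks_per_stream):
        for k, chunk in enumerate(chunks):
            tiles[idx + k * NUM_STREAMS] = decompressors[idx].decompress(chunk)
    return tiles


# Hypothetical framebuffer tiles: 16 flat-colour blocks of 4 KiB each.
tiles = [bytes([i % 7]) * 4096 for i in range(16)]
compressors = [zlib.compressobj() for _ in range(NUM_STREAMS)]
decompressors = [zlib.decompressobj() for _ in range(NUM_STREAMS)]

encoded = encode_tiles(tiles, compressors)
decoded = decode_tiles(encoded, decompressors, len(tiles))
print(decoded == tiles)
```

As far as I know, CPython releases the GIL while zlib is compressing, so the threads here can genuinely overlap, but treat that as an implementation detail rather than a guarantee.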
RDP also supports plain framebuffers and that's what we use.
Anyhow, with the advent of specialised hardware for video encoding, passing framebuffers isn't necessarily such a bad option. You just have to make sure that the CPU doesn't touch the framebuffer before encoding and after decoding.
Consider that Oculus Link uses video compression and doesn't introduce the extra latency you're describing. You only need to buffer frames if you want the best compression ratio, and you can always configure the encoder not to do lookahead. It's also better to choose CBR over VBR to avoid a second pass over each frame, at the cost of slightly lower quality for a given bitrate. In practice it can work really well because even 20 Mbit/s is sufficient to send high-res text.
I wish remote desktop applications would copy the Oculus Link architecture. You can easily get only a few frames of latency (sub 100ms) provided you composite on the GPU, use hardware encode/decode, and slice the video stream (i.e. send 1/4 of the screen while encoding the next 1/4). Slicing cuts down on decode latency and spreads the expensive work across the entire refresh cycle instead of doing it all at once in a non-pipelined fashion (which introduces bubbles into scheduling).
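A toy model of what slicing buys you, under some loud assumptions: the per-frame stage times below (4 ms encode, 6 ms transmit, 4 ms decode) are invented, each stage is assumed to split evenly across slices, and slicing itself is assumed to cost nothing. It uses the textbook pipeline formula: the first slice pays for every stage, and each later slice only adds the time of the slowest stage.

```python
def pipelined_latency_ms(stage_ms, n_slices):
    """Time until the last slice leaves the last stage of a simple pipeline.

    stage_ms: per-frame time of each stage (e.g. encode, transmit, decode).
    n_slices: how many equal slices the frame is cut into (1 = no slicing).
    """
    per_slice = [t / n_slices for t in stage_ms]
    # First slice traverses every stage; the rest follow at the bottleneck rate.
    return sum(per_slice) + (n_slices - 1) * max(per_slice)


stages = [4.0, 6.0, 4.0]  # hypothetical encode / transmit / decode, in ms

for n in (1, 2, 4):
    print(f"{n} slice(s): {pipelined_latency_ms(stages, n):.1f} ms")
```

With these made-up numbers, an unsliced frame takes 14 ms end to end, while four slices bring it down to 8 ms, because the stages overlap instead of waiting on each other.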
And just about every phone, tablet, laptop, desktop made in the past decade has some form of H.264 fixed-function hardware decoding.
I guess “by default” means if both client and server have the required hardware support and there aren’t too many concurrent sessions.