The HTTP of VR

The HTTP of VR (roderickkennedy.com)

53 points by rvkennedy 4 years ago | 73 comments

Animats 4 years ago |

Since I'm writing a new client, in Rust, for Second Life/Open Simulator, I'm very aware of these issues.

A metaverse client for a high-detail virtual world has most of the problems of an MMO client plus many of the problems of a web browser. First, much of what you're doing is time-sensitive. You have a stream of high-priority events in each direction that have to be dealt with quickly but don't have a high data volume. Then you have a lot of stuff that's less time critical.

The event stream is usually over UDP in the game world. Since you might lose a packet, that's a problem. Most games have "unreliable" packets, which, if lost, are superseded by later packets. ("Where is avatar now" is a typical use.) You'd like to have that stream on a higher quality of service than the others, if only ISPs and routers actually paid attention to that.
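A minimal sketch of the "superseded by later packets" idea, assuming each update carries a monotonically increasing sequence number. The class and field names here are hypothetical, not taken from Second Life or any real client, and sequence-number wraparound is ignored.

```python
from dataclasses import dataclass, field


@dataclass
class AvatarState:
    seq: int
    position: tuple


@dataclass
class UnreliableChannel:
    """Keeps only the newest position update per avatar (hypothetical helper)."""
    avatars: dict = field(default_factory=dict)

    def on_packet(self, avatar_id, seq, position):
        """Apply an update only if it is newer than what we hold.

        Returns True if the update was applied, False if it was stale or a
        duplicate. Lost packets are never retransmitted; the next update
        simply replaces them. Sequence wraparound is not handled here.
        """
        current = self.avatars.get(avatar_id)
        if current is not None and seq <= current.seq:
            return False
        self.avatars[avatar_id] = AvatarState(seq, position)
        return True


channel = UnreliableChannel()
channel.on_packet("avatar-1", 1, (0.0, 0.0, 0.0))
channel.on_packet("avatar-1", 3, (2.0, 0.0, 0.0))  # packet 2 was lost or is late
channel.on_packet("avatar-1", 2, (1.0, 0.0, 0.0))  # arrives late, gets dropped
```

The point of the design is that the receiver never waits: a late packet is thrown away rather than reordered.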

Then you have the less-critical stuff, which needs reliability. ("Object X enters world" is a typical use.) I'd use TCP for that, but Second Life has its own not very good UDP-based protocol, with a fixed retransmit timer. Reliable delivery, in-order delivery, no head of line blocking - pick two. TCP chooses the first two, SL's protocol chooses the first and third ones. Out of order delivery after a retransmit can cause avatars to lose clothing items, because the child item arrived before the parent item.
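One way a client could tolerate a child item arriving before its parent is to park the child until the parent shows up. This is only a sketch under that assumption; `AttachmentTree` and its methods are made-up names, not how any existing viewer does it.

```python
from collections import defaultdict


class AttachmentTree:
    """Buffers items whose parent has not arrived yet (hypothetical helper)."""

    def __init__(self):
        self.attached = {}                # item_id -> parent_id (None for roots)
        self.waiting = defaultdict(list)  # missing parent_id -> waiting child ids

    def receive(self, item_id, parent_id=None):
        """Handle one reliably delivered, possibly out-of-order item.

        Returns the list of items that became attached as a result.
        """
        if parent_id is not None and parent_id not in self.attached:
            # Parent not seen yet: hold the child instead of dropping it.
            self.waiting[parent_id].append(item_id)
            return []
        attached_now = []
        stack = [(item_id, parent_id)]
        while stack:
            current, parent = stack.pop()
            self.attached[current] = parent
            attached_now.append(current)
            # Release any children that were waiting on this item.
            for child in self.waiting.pop(current, []):
                stack.append((child, current))
        return attached_now


tree = AttachmentTree()
tree.receive("shirt", parent_id="avatar")  # child first: buffered
tree.receive("avatar")                     # parent arrives: both attach
```

This keeps the no-head-of-line-blocking property for unrelated items while still enforcing order where a dependency exists.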

Then you have asset fetching. In Second Life/Open Simulator this is straight HTTP/1. But there are some unusual tricks. Textures are stored in progressive JPEG 2000. It's possible to open a connection and just read a few hundred bytes to get a low-rez version. Then, the client can stop reading for a while, put the low-rez version on screen, and wait to see if there's a need to keep reading, or just close the connection because a higher-rez version is not needed. The poor server has to tolerate a large number of stalled connections. Worse, the actual asset servers on AWS are front-ended by Akamai, which is optimized for browser-type behavior. Requesting an asset from an Akamai cache results in fetching the entire asset from AWS, even if only part of it is needed. There's a suspicion that large numbers of partial reads and stalled reads from clients sometimes cause Akamai's anti-DDOS detection to trip and throttle the data flow.
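A rough sketch of the "read only the first few hundred bytes" trick. Two variants are shown: reading a prefix from an open stream and then stopping, and asking for the prefix up front with an HTTP `Range` header. The second is a swapped-in alternative, not what the comment above describes, and it only helps if the server and CDN honor range requests. The URL and the 600-byte figure are placeholders, and decoding the truncated JPEG 2000 codestream is out of scope.

```python
import io
import urllib.request


def build_prefix_request(url, num_bytes):
    """Build a request for only the first num_bytes of an asset."""
    # HTTP byte ranges are inclusive, hence the "- 1".
    return urllib.request.Request(
        url, headers={"Range": f"bytes=0-{num_bytes - 1}"}
    )


def read_prefix(stream, num_bytes):
    """Read up to num_bytes from an open stream, then stop reading.

    The caller decides later whether to keep reading from the same stream
    for more detail or to close it.
    """
    chunks = []
    remaining = num_bytes
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            break  # asset is shorter than the requested prefix
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Stand-in for a network response body; no network access is needed here.
fake_body = io.BytesIO(bytes(2000))
low_res_prefix = read_prefix(fake_body, 600)
request = build_prefix_request("http://example.invalid/texture.j2k", 600)
```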

So those are just some of the issues "the HTTP of VR" must handle. Most are known to MMO designers. The big difference in virtual worlds is there's far more dynamic asset loading. How well that's managed has a strong influence on how consistent the world looks. It has to be constantly re-prioritized as the viewpoint moves.
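To make "constantly re-prioritized as the viewpoint moves" concrete, here is a small sketch that orders pending asset fetches nearest-first and is simply re-run when the camera moves. Plain distance is an assumption for illustration; a real client would likely also weigh screen coverage, view direction, and what is already cached. All names are hypothetical.

```python
import heapq
import math


def prioritize(pending, camera_pos):
    """Return asset ids ordered nearest-first relative to the camera.

    pending maps asset_id -> (x, y, z) world position of the object that
    needs the asset.
    """
    heap = [
        (math.dist(pos, camera_pos), asset_id)
        for asset_id, pos in pending.items()
    ]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]


pending = {
    "tree": (10.0, 0.0, 0.0),
    "house": (2.0, 0.0, 0.0),
    "sign": (50.0, 0.0, 0.0),
}
order_at_start = prioritize(pending, (0.0, 0.0, 0.0))
order_after_move = prioritize(pending, (48.0, 0.0, 0.0))  # camera moved
```

Rebuilding the order from scratch is fine for a sketch; with thousands of pending assets an incremental scheme would probably be needed.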

(Demo, from my own work: https://vimeo.com/user28693218 This shows the client frantically trying to load the textures from the network before the camera gets close. Not all the tricks to make that look good are in this demo.)

It's not an overwhelmingly hard problem, but botch it and you will be laughed off Steam.

moron4hire 4 years ago | |

This is why I think it's a joke to be building metaverse apps in Unity. Unity and dynamic asset loading are not happy bed fellows.

There's not a lot I liked about Unity when I was working with it full-time a few years ago. But the one thing I could acknowledge that it has that was generally missing from open source web development was the asset pipeline. But dynamic, user-uploaded assets won't be able to use the asset pipeline. So one of the biggest drivers for using Unity goes right out the window.

Animats 4 years ago | | |

> Unity and dynamic asset loading are not happy bed fellows.

Not Unreal Engine 4, either. UE5 has "asset streaming" and "open worlds", but mostly static and loaded from a local SSD on a Playstation 5. That's working nicely.

Asset management from the network is the real difference with seamless, modifiable virtual world systems. Otherwise, it's a minute of "...LOADING..." when you move to the next area. You need clients, servers, file formats, and protocols designed for it. It's a moderately hard engineering problem, and, as yet, there are no good off the shelf solutions.

There's a "check out, check in" approach. Decentraland uses that. You check out your parcel into a local Unity environment, edit, and check in the whole parcel to make it visible to others.

The Spatial OS people, Improbable, did some of this, but their solution cost so much to operate server side that all four of the games that used it went broke. So Improbable is trying to pivot to military simulation.

Probably by UE6 this will all be standard. It's one of those things that has to be done to move the metaverse from hype to usefulness.

jayd16 4 years ago | | |

You can do some level of dynamic asset loading. The real issue to get around in Unity is dynamic script loading. There's some progress being made with Unity's new visual scripting system. The visual scripts are stored as assets.

gfxgirl 4 years ago |

I think there is a different problem that needs to be solved and it's probably impossible.

I've dreamed of the metaverse since Snow Crash and maybe before (Tron?) but ... when it comes to actually making it, let's assume unlimited CPU/GPU power and unlimited memory.

Ideally, I want the Metaverse to allow people to run their own code. Whether it's VR or AR, it's a shared 3D space. So I want my Nintendo "Nintendogs" to be able to run around my "Ikea furniture" with my "Google/Apple/OSM maps" showing me navigation directions and my "FB Messenger/Discord/iOS Messenger" letting me connect to people inside. In a webpage, each of these things runs in an IFRAME isolated from the others, and browsers go to great lengths to disallow one spying on another.

But in this 3D space my Nintendogs can't run through the space unless they can "sense the space". They need to know where the fire hydrants are, where the sidewalk is, what things they're allowed to climb/chew, etc. But to do that effectively means they need enough info to spy on me.

Same for all the other apps. I can use messaging apps on my phone with GPS off and full network access off so that the app can't know my location, but in order for different apps in the Metaverse to do similar they'll need to know at least the virtual location of themselves and the stuff around them, which is enough to track/fingerprint me.

You can maybe get around some of this with a massive walled garden but that arguably is not the metaverse.

berkes 4 years ago | |

You presume a push model. And compare it to a pull model (iframes). I think that is where the solutions are.

The messages could be delivered as a simple XML feed. Your virtual home, or HUD knows where to place them. Through hyperlinks they know where to subscribe, or refresh, or get details. The messages don't need to know anything about placement and usage.
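A toy illustration of that split, assuming a made-up feed format: the feed carries only content and links, and the client decides where things go. The element names, the `href` attribute, and the slot names are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: nothing in it says where a message should be shown.
FEED = """<feed>
  <message id="1" href="https://example.invalid/messages/1">
    <from>alice</from>
    <text>Lunch?</text>
  </message>
  <message id="2" href="https://example.invalid/messages/2">
    <from>bob</from>
    <text>Build is green.</text>
  </message>
</feed>"""


def parse_feed(xml_text):
    """Turn the feed into plain dicts; placement is not part of the data."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": message.get("id"),
            "href": message.get("href"),
            "from": message.findtext("from"),
            "text": message.findtext("text"),
        }
        for message in root.iter("message")
    ]


def place_on_hud(messages, slots):
    """The client, not the feed, maps messages to places it controls."""
    return dict(zip(slots, messages))


messages = parse_feed(FEED)
layout = place_on_hud(messages, ["kitchen-wall", "desk"])
```

The messaging service never learns which slot a message landed in, which is the privacy property being argued for.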

jayd16 4 years ago | |

Seems like you could share collision meshes without much risk of spying, no?

bborud 4 years ago |

Nothing in the blog posting suggests to me you can't use HTTP and Websockets for VR. The understanding of HTTP in the blog posting seems to be rooted in the early 2000s. I don't think the author has much experience in protocol design (it is harder than it looks).

It would be more productive to define a layer on top of HTTP/2 so we can leverage a lot of code that already works, rather than having to spend 10-15 years creating a new spec and codebases that need maturing.

And if you're not happy with websockets for low latency bidirectional communication: it would make more sense to improve websockets rather than reinvent the wheel.

Dirak 4 years ago |

Networking for multiplayer games is a super interesting problem space since games tend to be more sensitive to latency, packet loss, and the accuracy of game states between clients. The problems are even more pronounced in VR where noticeable latency or artifacts can cause motion sickness.

In modern fighting games, the industry seems to be tending toward predictive lockstep networking. This is a type of networking where if the client doesn't receive the inputs of other clients from the server, it will "predict" those inputs (usually by replaying the last received input) to give the illusion of zero latency gameplay. The drawback is that you need to implement rollback for the case where the predicted input doesn't match the real received input. When poorly executed, this can look like jittery player movement, with entities rubber banding and teleporting, but when done properly it is mostly unnoticeable.
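A stripped-down sketch of prediction with rollback, assuming a deterministic simulation where each player's state is a single number and each input is a per-frame move. It is not GGPO's API or implementation; the class and method names are invented, and real inputs are assumed to arrive only for frames that were already simulated.

```python
class RollbackSession:
    """Toy two-player session with input prediction and rollback."""

    def __init__(self):
        self.states = [(0, 0)]   # states[f] is the state at the start of frame f
        self.local_inputs = []   # local_inputs[f]
        self.remote_inputs = []  # remote_inputs[f], predicted or confirmed
        self.confirmed = []      # True once the real remote input has arrived

    @staticmethod
    def step(state, local_move, remote_move):
        """Deterministic simulation step."""
        return (state[0] + local_move, state[1] + remote_move)

    def advance(self, local_move):
        """Simulate a frame now, guessing the remote input by repeating the last one."""
        predicted = self.remote_inputs[-1] if self.remote_inputs else 0
        self.local_inputs.append(local_move)
        self.remote_inputs.append(predicted)
        self.confirmed.append(False)
        self.states.append(self.step(self.states[-1], local_move, predicted))

    def on_remote_input(self, frame, remote_move):
        """Record the real input for a simulated frame.

        Returns True if the prediction was wrong and a rollback was performed.
        """
        mispredicted = self.remote_inputs[frame] != remote_move
        self.remote_inputs[frame] = remote_move
        self.confirmed[frame] = True
        if not mispredicted:
            return False
        # Re-predict later unconfirmed frames from the corrected input.
        for f in range(frame + 1, len(self.remote_inputs)):
            if not self.confirmed[f]:
                self.remote_inputs[f] = self.remote_inputs[f - 1]
        # Roll back to the mispredicted frame and resimulate up to the present.
        for f in range(frame, len(self.local_inputs)):
            self.states[f + 1] = self.step(
                self.states[f], self.local_inputs[f], self.remote_inputs[f]
            )
        return True


session = RollbackSession()
for _ in range(3):
    session.advance(1)                         # remote predicted as "no move"
rolled_back = session.on_remote_input(0, 1)    # remote was actually moving
```

The rubber banding mentioned above is what the player sees when the resimulated state differs visibly from the predicted one.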

If you're interested in this domain, I recommend checking out https://www.ggpo.net/ which is the library used in many of the modern fighter games (notably Skullgirls). It also comes with an in depth explanation of how to implement predictive networking with rollback on your own https://drive.google.com/file/d/1cV0fY8e_SC1hIFF5E1rT8XRVRzP...

Mizza 4 years ago |

I don't want to have to strap a fucking telephone to my face to go to some shitty fake job. Please don't build this world.

Ono-Sendai 4 years ago |

I'm building something similar for metaverses, although with less emphasis on VR currently. See https://substrata.info/about_substrata

Currently it's a relatively simple bidirectional protocol over TLS. It's not fully documented yet but you can get an idea of it by looking at an example bot client in python: https://github.com/glaretechnologies/substrata-example-bot-p...

jayd16 4 years ago |

This is pretty silly. We can't throw away http because http solves problems that VR does not alleviate.

>A real-time, dynamic, stateful two-way client-server protocol. As such, it will be if not fully RTP then close to it.

Why didn't we always have this if all we needed to do was ask? So...realizing we still have the internet of today, what we actually need to rethink is html and the concept of the web as documents alone.

I would be interested to see some work on hyper-objects. As in, hypertext beyond text. The article should be "HTML for VR" and we should be musing about how to find, load, interact and link web based virtual objects.

raidicy 4 years ago |

Aframe comes to mind. You can have full VR experiences that link just like a Link in HTML to other VR experiences.

https://aframe.io/examples/

edoceo 4 years ago | |

I thought this post was gonna be about aframe. It's super cool and the docs are good enough that a fool like me could get something neat in a day. Made an aframe HTML with PHP reading from my DB. It's rad.

binarynate 4 years ago |

> By far the greatest reason to look beyond HTML and HTTP for spatial computing is simply this: these technologies will continue to develop, and will always be driven by their primary purpose: to deliver webpages, websites and static, or marginally dynamic content.

This is a valid point, but I believe there's still enormous potential to innovate on top of WebXR. Since browser engines are open source, it's possible for upstart XR browser apps to add additional features to Gecko or Chromium that push WebXR forward.

binarynate 4 years ago | |

On a related note, I develop libraries for embedding web browsers in Unity 3D (https://vuplex.com), including a library for embedding the Mozilla GeckoView library used by Firefox Reality. I plan to develop a WebXR driver for it, but haven't prioritized it yet. If you're interested in developing a WebXR driver for use with GeckoView (for example, to use with Oculus Quest), you can contact me, and I'll send you my notes from my research:

https://support.vuplex.com/contact

jimmySixDOF 4 years ago | | |

I'm interested in how you might be able to get through the CORS problem in WebXR/browser standards. In a VR Unity app with embedded browsers you can click through hyperlinks no problem, but in WebXR the same hyperlink will trigger an origin mismatch and break out of immersion. Not having good 2D web content integration is a major blocker in WebXR for so many applications.

bullen 4 years ago |

The HTTP of VR is HTTP!

http://fuse.rupy.se/about.html

You also need a P2P protocol (probably some binary UDP thing) for tick based data like limb positions if you want body language.
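For a sense of what "some binary UDP thing" for tick-based limb positions might look like, here is a sketch of a fixed-layout packet. The layout (little-endian tick counter, limb count, then an id and three floats per limb) is an assumption made up for this example, not an existing protocol, and the UDP socket itself is left out.

```python
import struct

# Assumed layout, little-endian with no padding:
#   header: uint32 tick, uint8 limb_count
#   per limb: uint8 limb_id, float32 x, float32 y, float32 z
HEADER = struct.Struct("<IB")
LIMB = struct.Struct("<B3f")


def encode_tick(tick, limbs):
    """Pack one tick of limb positions into a datagram payload."""
    parts = [HEADER.pack(tick, len(limbs))]
    for limb_id, (x, y, z) in sorted(limbs.items()):
        parts.append(LIMB.pack(limb_id, x, y, z))
    return b"".join(parts)


def decode_tick(payload):
    """Unpack a payload produced by encode_tick."""
    tick, count = HEADER.unpack_from(payload, 0)
    limbs = {}
    offset = HEADER.size
    for _ in range(count):
        limb_id, x, y, z = LIMB.unpack_from(payload, offset)
        limbs[limb_id] = (x, y, z)
        offset += LIMB.size
    return tick, limbs


# Limb ids 0 and 1 are arbitrary, e.g. head and right hand.
payload = encode_tick(42, {0: (0.0, 1.5, 0.0), 1: (0.25, 1.0, -0.5)})
tick, limbs = decode_tick(payload)
```

At 13 bytes per limb plus a 5-byte header, even a full-body skeleton fits comfortably in a single datagram, which is why this kind of data suits unreliable, latest-wins delivery.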

But really VR is much less important for immersion than action MMO = Mario/Zelda with 1000+ players.

unwind 4 years ago |

Oh how this reminded me of the Verse protocol and Uni-Verse! To be young again, and so on. :)

[1]: https://en.m.wikipedia.org/wiki/Verse_protocol

sxp 4 years ago |

tl;dr: "So at Simul, for the past few years we’ve been building this protocol: it’s called Teleport VR. Let’s see what we can make with it!"

An alternative view would be that HTTP(S) would be "the HTTP of VR". With WebXR and standard JS APIs for HTTPS, async fetching, WebRTC, etc, all the items listed in "Imagine an application-layer protocol for VR with the following characteristics..." are satisfied. And the stack can use battle-tested web technologies so that it can leverage standard CDNs, cloud servers, etc.

VR has some extra constraints over 2D webpages due to tighter frames per second and latency tolerances, but most of the web protocols can get you 90% of the way there.

jayd16 4 years ago | |

I wouldn't even say performance is all that different.

Something that is unique is the idea that a website is a single document, whereas a virtual website might take the form of an interactive object and/or an interactive space.

I would say it's an open question how we want these web based virtual objects to interact with each other. Would we want to physically pull a video object off the Google Drive shelf and drop it into the YouTube workstation? How would such an interaction be possible? Even if, as today, they just never speak directly, could those objects live in the same space or would each website fully immerse the user?

schmorptron 4 years ago |

Kinda off-topic, but if anyone is looking to play around with building VR spaces or games, I recently found out about LÖVR[0], which is a simple Lua-based open source VR "framework". Haven't had a chance to play with it but it seems other people like it!

[0] https://lovr.org/

douglaswlance 4 years ago |

Latency is incredibly important in VR. If everything is streaming from a remote server, even if it's a straight fiber connection, it'll still be too much latency.
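A back-of-the-envelope way to put numbers on this, assuming light travels through fiber at roughly two thirds of its vacuum speed. This counts propagation only; routing, encoding, server time, and rendering all add to it. The 90 Hz refresh rate is just an assumed figure for comparison.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
FIBER_VELOCITY_FACTOR = 2 / 3  # assumption: glass with refractive index ~1.5


def fiber_round_trip_ms(distance_km):
    """Propagation-only round-trip time over a straight fiber run."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_VELOCITY_FACTOR)
    return 2 * one_way_s * 1000


def frame_time_ms(refresh_hz):
    """Time budget for a single displayed frame."""
    return 1000 / refresh_hz


rtt_100_km = fiber_round_trip_ms(100)     # about 1 ms
rtt_1000_km = fiber_round_trip_ms(1000)   # about 10 ms
budget_90_hz = frame_time_ms(90)          # about 11.1 ms per frame
```

So the distance to the server matters a great deal: a nearby edge server leaves most of a frame for everything else, while a server a thousand kilometers away spends nearly a whole frame on propagation alone.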

usrbinbash 4 years ago |

What exactly is the "metaverse" supposed to be, other than a marketing term to sell a more expensive class of IO devices?

People will not switch over in droves to do their text/image/video editing in VR all of a sudden, because other than a few special design applications, there is no point in doing so...it's slower, clumsier and the input devices are much less precise than mouse&keyboard.

Another supposed target demographic, people in IT, won't switch either. I see no point in virtually grabbing a glowing code-ball and throwing it into the "deploy-tube", or navigating a codebase using haptic gestures with the huge meat-styluses at the end of my arms, when I can simply type `git push` or `/myAwesomeStruct`

I also have a hard time imagining management sitting in meetings while wearing a 400g headset for 3h. Or companies being willing to cough up $350+ for every employee just so they can join meetings, when Zoom is basically free.

So, what else is there? Gaming and maybe some "recreational apps" (aka also gaming, only less interactive). And since not all games will take place in the same unified MMORPG-ish permanent universe (yes, people want to play in sessions, and people want to play single player, and people want to play while not connected to the internet), this will not be a paradigm shift, but rather a new toy in an already large collection of other toys.

1. I started with my eyes at floor level.
2. It moans about a guardian, this by far is the most soul sapping thing of all time. The thing is, I have dev mode, but I do find guardian useful (I punched some walls previously). It's just so annoying though.
3. It asks me to set up guardian every fucking time.
4. Followed by when I try the Oculus Link... do I want to trust this computer.
5. I start steam VR but it doesn't work as it cannot find my headset but at this point, I am strapped in and 2 meters from my desk (ala stationary guardian) so I take it off to restart steam VR and the Oculus app.
6. Sometimes the Oculus app simply doesn't work and I have to reinstall it.
7. For some reason my Oculus link cable is loose unlike other USB-C cables/ports so it disconnects from the movement of my standing desk intermittently enough to not be a problem but also highly annoying.
8. Sometimes things don't start in VR but in Flat mode, this means removing the headset to sort it out (see point 5). I feel like jumping in and out of the experience makes it almost unusable.