There is one operation to change the tree, called insmov (move-or-insert). Whenever a client is online it can sync changes C to a server. Whenever the server has remote changes for us, it will send us back a list R of all changes since our last sync in a global linear order. We then undo any of the insmovs in our changeset C, and (re)apply all changes in R + any new changes we didn't sync yet.
We don't use any fractional indices though. Instead, our insmov tuple not only contains a parent P, but also a previous sibling guid A. Because all tree ops will eventually be applied in the global linear order as determined by the server, "sorting" is handled by just using the insmov operation.
Most of the time the undo'ing of operations is not needed though. Only when the server has insmov changes we don't know about while we are sending new insmovs ourselves do we need to ensure we replay the operations in the correct order. That's likely to happen when you reconnect to wifi after a long flight, but not so likely when updates are pushed in real-time over websocket when you're online (plus it's not needed for non-insmov operations, like updating text).
For what it's worth, this sounds equivalent to the RGA list CRDT [1], using the server's global linear order as a logical timestamp (in place of e.g. Lamport timestamps).
After I finished the project, I kinda knew why Google Drive only allows to display and modify on the same hierarchical level. There are so many constraints that you have to consider when implementing this in a nested view with many nodes.
Making safe updates via regular REST is difficult, as you have to make sure someone with two tabs open isn't going to make an update on tab 1, then another on tab 2 which puts the overall profile into an invalid state. And in general, ordering matters. Skipping an update serverside that was properly applied clientside could break things.
The dumb-as-rocks solution I came up with is to just send the minimal amount of data over that can completely overwrite a particular chunk of state, and place it behind a queue. Usually thats fine, but sometimes thats a lot of wasted data, 50KB when the actual change was only a couple bytes.
I don't need CRDTs for any of the regular reasons, but it seems like it would make state management a million times easier, even for a single user. For one, I'd get syncing between a user's browser tabs, which is good. But moreover, I can just make simple changes to frontend state, and trust that the CRDT is going to negotiate it properly with the server. I no longer have to deal with it myself.
Does this make sense? Or is the overhead required to make something like Yjs work not worth it when I don't even need multiplayer and local-first.
Concurrent conflicts in such cases are notoriously hard to converge without contextual special handling [1]. Does this implementation generalize a solution for such use-cases?
I guess it should be possible to combine a list(or string) CRDT for leaf nodes (i.e text blocks) and use this tree CRDT for structural nodes (lists & tables).
But that will make augmenting every op with two-dimensional address (parent-id, index_offset_into_that_parent)
Augmenting every op with a parent-crdt-id field is unfortunate but I think unavoidable. Thankfully I suspect that in most real world use cases, it would be very common for runs of operations to share the same parent crdt. As such, I think those ID fields would run-length encode very well.
> This article introduces the implementation difficulties and challenges of Movable Tree CRDTs when collaboration, and how Loro implements it and sorts child nodes.
However, if your profile edits are single-user only, it's probably overkill to introduce a CRDT. At first glance, it seems the two-tabs-open scenario is your highest source of bugs, so what you could do is use a BroadcastChannel to signal update events to all other tabs.
If CRDT is complications and difficult to manage, either YJS resolves that completely, or more likely that complexity will leak out of the abstraction layer no matter what.
To me it seems more like that OP should compare and contrast concurrency solutions, one of which is CDRT via YJS or another could be something like concurrency based on Go routines.
Edit: Should obviously mention Loro, the literal thread we're in now lol
Maintaining the shared state with REST calls that overwrite parts of server state is indeed brittle, and really only suitable for overwriting fields on flat data records. It also requires that you consider server-client state coordination with care at all times, and things can easily get out of sync in non-happy paths.
Like you said, building out a CRDT which specifies how updates are coalesced will significantly reduce cognitive burden.
Let's say we're building an image editor like Photoshop. An uncompressed 102 megapixel image with 16-bit color depth per channel (a photo from a Fujifilm GFX100 camera) would be about 610 MB as a TIFF. Representing each pixel of the image as a separate last-write-wins register would impose a high overhead, but such a representation doesn't actually make any sense to preserve a user's intention. The edits the user will perform are things like "increase image contrast by 15%" or "paint spline [(0,0), (1500, 1500)] with brush Q and color #000". If we sync each pixel by lamport timestamp, we could end up with user 1's contrast applying to all pixels except for those painted by user 2, which would give a weird looking image with painted-over pixels looking out of place.
Instead you'd probably want to represent user intention as a list of edit operations, which are much smaller than a whole 102MB pixel grid. A CRDT data structure is one possible technical mechanism perform to synchronize that user intent, but you pick the structure to match the user intent semantics, not to match the concrete data layout of your output.
You may still end up having edit operations that contain massive amounts of data, like "add new layer named `bg` below layer `fg` with pixels `data:(10mb of pixels)` at (1500, 1500)". But the overhead for the synchronization of that kind of edit command is very low, it's size is O(1), not O(pixels in the edit command).
Never tried building it, though. And I’m not sure it would actually be practical, but it would at least preserve the full history of the document.
https://digest.browsertech.com/archive/browsertech-digest-ho...