Differential Synchronization with JSON Patch
You can use the ideas illustrated here to build your implementation to provide a real-time collaborative experience for users with a few minor tweaks here and there. But before we deep-dive, here is a brief history lesson that you can feel free to skip.
Background
Virtual collaboration requires that at least two individuals work on a single piece of document/code/art together in virtual space in real-time without stepping over each others foot. Mind me; this is a challenging problem to solve, which computer engineers have been wracking their brains for forever.
When Google released Google Docs with a collaborative editing feature, google engineers faced similar problems. Google seems to be using their modified Google Wave Operational Transformation Algorithm for supporting real-time collaborative editing today. Despite being an excellent algorithm for synchronising text streams, it doesn't seem to attract any mainstream usage elsewhere. I am speculating here, but probably because operational transformation is mathematical complex and quite inflexible with applied to other domains.
A precursor to the OT implementation was another algorithm called Differential Synchronization, designed by another brilliant Google engineer Neil Fraser, in 2009.
Differential Synchronisation
This algorithm seems to solve the same problem of collaborative editing. Although OT seems to be a better choice when making collaborative editing, differential synchronization's simplicity and adaptability makes it an excellent candidate for implementing a simple real-time synchronisation feature with deal with complex data.
Let's walk through a scenario to see how this simple but ingenious algorithm works.
Suppose we have two architects working on the same logical design document. One user (U1) is accessing the document from their web browser over a fast internet connection. The second user (U2) is accessing the same document from a desktop application from a remote workcation location with a poor internet connection. And now they want to collaborate on the same and make changes to the design together.
Initialisation
When the users open the design document in their application, an API request is made to the server with their authorisation and the document's identifier. When the server receives this request, it queries the document from its database and then stores 2 copies of these documents in ephemeral storage, Let's call these two copies of the document shadow copies, and we have two of them now S:U1:D1
and S:U2:D1
one for each user.
The server shadows are temporary copies that are cleared if the client disconnects or is non-interactive for very long periods. Duration being too long is a subjective value, and you can make it a configuration. In my personal opinion, 30 mins to 1 hour is a long enough duration as people tend to work in 1hr or 30mins cycle.
When the shadow copies are created from the database, two additional integers cv
and sv
must be set along with the copy, with values set to 0
. These two integer values are very critical for the differential synchronisation algorithm to function. When the shadow copy is successfully created, the API response is sent to the clients with the document along with the integers cv
and sv
with value 0
. In favour of reducing verbosity, let's call these two values the sync-version-tuple [cv, sv]
with the value [0,0]
On the client, create a similar shadow copy from the response, C:U1:D1
and C:U2:D1
and the sync version is set to [0,0]
. The user does not directly edit the client shadow copy, but the separate client-side document state.
This concludes the initialisation part of the process, and all the clients had converged to the state of the database (D1) and with the clients (browser/app) and server have two shadows, respectively. They also share a sync-version-tuple that keeps track of the relative version of the shadow document.
User makes a change
The users can interact with the browser/app and make some changes to the design. They may add a record, remove a record, replace a record, whatever they wish. The client is reactive and makes these changes immediately on its state and then calls a debounced function to start the syncing if nothing else changes. The debounce duration can again be controlled and is directly proportional to the reactiveness of syncing, but we must take care that syncing does not happen with every keypress; otherwise, the UI rendering will look janky. From experiential observation, the sweet spot seems to lie somewhere between 300ms
to 1s
.
For our case, let's assume this duration is 1 second. In 1 second, the sync function executes, and the following dance occurs:
1a
(client state) and1b
(shadow copy) are passed through a diffing/compare algorithm to generate a patch which, when applied on1b
would result in1a
. The patch is captured, and the shadow state is recorded, which is[0,0]
. So let's call this patchP[0,0]
.- After the patch is generated, the client shadow is replaced by the client state and the
cv
is incremented for the shadow to1
. - The patch is now sent to the server over the network in a single connection along with the
id
of the shadow and sync-version-tuple. - The server received the patch checks if the Server sync-version-tuple is the same as the client sent sync-version-tuple. If the versions do not match, then send back an
VersionMismatchConflict
error. - In case of a match, the patch is applied to the server shadow strictly. And then, the patch is applied to the database fuzzily. After this process, the server
cv
is incremented. - Next, the current state of the database is picked, and the same diffing algorithm is applied to generate the patch. If there is a non-empty patch, then it is marked as
P[1,0]
and sent to the client. The state of the DB replaces the server shadow copy and thensv
is incremented by 1. - The patch goes to the client and follows the same algorithm to make the patch happen.
Version mismatch/shadow expired
In any case of failure, the client should consider its copy of shadow as corrupted, restart with the generation of a new shadow, and then perform the diff over the new shadow. This means that using this algorithm, the last writer always wins.
Change Propagation
Whenever the DB is changed to a new state, the changes are not pushed to clients, rather a zero payload event is generated and transmitted to all the subscribers. In the previous stage, as you may have noticed that we only changed the server shadow of a single client who made the edit; all other server shadows are still lagging. When the clients receive this event, they start the diffing process on their end, if no changes were made, an empty patch would be generated and sent to the server, and in return, the server would also create the client-specific patch and send it back. In this way, all other clients would also get synchronised at their own speed.
Another school of thought here would be that when we are already figuring out which users to send the zero-payload message, why not just compute the patch itself and send it along as it would reduce the round-trip cost. This is possible and is what the original paper on differential synchronisation paper also recommends, but there are a few things we need to consider.
In a distributed application, an edit to a document shared by large groups may overwhelm the server as it must now compute the diff and create the patch for all the shadows for different users. One solution is to offload the computation of auxiliary patches to some worker instances, but it adds significant complexity to the overall system. Secondly, not all clients may have the same bandwidth and processing capability, so pushing down changes could again overwhelm the clients as they might not be able to process the changes and provide a fluid UI at the same time.
So it seems logical that a zero-payload event represents itself as a best of both worlds solution in the distributed age, although it adds a round-trip cost.
Handling Permissions
It is impactive that there will be different kinds of clients with varying levels of permissions in real use-case. Some may have edit permission, and some may have just read-only permission.
Considering that a document could be shared with two different clients, where one has edit permission, and the other has read-only permission. In such a situation, the read-only client needs to be updated with the edits done by others, but it can't do any updates.
To achieve this, we need to initialise the shadow copies for clients as read-only. If the read-only shadow also has a unique property that if they are attempted to be updated with a patch, they, in turn, generate an anti-patch that reverts the operation being attempted.
Concretely, if a patch P[1,0]
is sent to the server, the server generates an anti-patch for this call P'[1,1]
, notifying that it accepted the patch, but it can't be applied, so it requests the client to revert back.
Ideally, the client-side implementor should handle read-only shadows and not send any patches to the server to reduce unnecessary computation.
JSON Patch
We began this article with a promise that we will use differential synchronisation with complex tree-like data structures like JSON. This is where the JSON Patch specification comes to the rescue.
A formal specification was introduced as part of RFC 6902 that defined how a JSON patch should look. Considering that differential synchronisation is indifferent to the diffing and patching algorithm chosen, JSON patch could replace the traditional text diff and patch assumed in the paper. The algorithm should still stand valid.
This is the essential adaptation that I am proposing in this write-up. We take the differential synchronisation algorithm, and instead of doing diffing of texts, we instead diff the entire JSON/Javascript objects we are dealing with. The only change we have to make is that, 1a
(client state) objects and 1b
(shadow copy) objects are passed through a diffing/compare algorithm to generate a JSON patch.
The challenge is to write an algorithm that can reliably diff two objects with the least time complexity. We can either choose an implementation that is already available in the community or build our custom implementation. If your clients are just browsers or a javascript-based application would suggest using the fast-json-patch library.
The APIs that are essential for our implementation are the following:
// Function to diff two objects and generate a patch
jsonpatch.compare(document1: any, document2: any): Operation[]
// Example
var documentA = {user: {firstName: "Albert", lastName: "Einstein"}};
var documentB = {user: {firstName: "Albert", lastName: "Collins"}};
var diff = jsonpatch.compare(documentA, documentB);
//diff == [{op: "replace", path: "/user/lastName", value: "Collins"}]
// Function to update an object with a patch
function applyPatch<T>(document: T, patch: Operation[]): PatchResult<T>
// Example
var document = { firstName: "Albert", contactDetails: { phoneNumbers: [] } };
var patch = [
{ op: "replace", path: "/firstName", value: "Joachim" },
{ op: "add", path: "/lastName", value: "Wester" },
{ op: "add", path: "/contactDetails/phoneNumbers/0", value: { number: "555-123" } }
];
document = jsonpatch.applyPatch(document, patch).newDocument;
// document == { firstName: "Joachim", lastName: "Wester", contactDetails: { phoneNumbers: [{number:"555-123"}] } };
Fuzzy Patching vs Strict Patching
There is a subtle difference in how the patch is applied on the shadow copies and how the patch is applied on the client and database state should be noted. A fuzzy patching strategy is used to patch the database and client state, but strict patching is used for the shadow copies.
To strict patch a target object, it must be the same as the one from which the patches were generated. We usually handle this by leveraging the test
operation in the JSON patch. The fast-json-patch also supports this feature and generates a sequence of patches with test ops that must pass on the target object to apply the patch. The patches must also be applied in a single transaction, i.e. if one of the patches in a batch fails, all the patches must fail. It is essential to detect that the client-shadow and the server-shadow are in sync and that no corruption has occurred.
On the other hand, fuzzy patching is much more relaxed. When a fuzzy patch is applied, then it happens on best effort. I.e. if one of the patches fails, then the rest of the patches are not ignored but applied with the best effort as long as the final schema is not violated. The benefit that this provides is that in a distributed application where changes may be happening across different application servers, the app does not need to take exclusive lock over the record, making the synchronisation process multitudes faster and eventually consistent.
Database Patching
The final piece of the puzzle is to be able to apply the fuzzy patch to the database. Doing this will require some amount of work on your end. I have an implementation for performing the patch directly to MongoDB using aggregate queries. Here is the gist of the algorithm JSON Patch to MongoDB Patch (github.com) (Use with caution, as it has not been tested in production yet).
If you wish to store data in relational databases like MySQL, then some smart schema design where each property of the JSON is stored flat in different columns may be required. Although I am pretty confident that is possible, I have not attempted to make it work with a relational database, so some about of work might be required on your end where you can take a JSON patch and create a query that accurately updates the database.
One naive approach could also be to fetch the whole record from the database and use fast-json-patch again to patch it and save back the entire object into the DB. It should be fine if you know your records will not be large and you want to try out a POC but is not recommended for production environments. Let's be that cool kid and implement the small little patch to query converter.
The Compromise
All good things come with a price so does this algorithm. It trades off low latency, low bandwidth requirement and self-healing properties for a considerable space complexity. Although diffing is a compute-heavy operation, the algorithm gives us the flexibility to do it out of order, so more powerful clients can do it more often, while the less powerful client can sparingly. But in either case, the UI is not blocked, and the user can immediately start working after the shadow copy is fetched.
To summarise, we have taken the differential synchronisation algorithm designed for text-based synchronisation and adapted it to use JSON Patch for synchronising complex data structures.
Useful Links
- https://svn.apache.org/repos/asf/incubator/wave/whitepapers/operational-transform/operational-transform.html
- https://neil.fraser.name/writing/sync/
- https://datatracker.ietf.org/doc/html/rfc6902
Disclaimer: The ideas and opinions in this post are my own and not the views of my employer.