Collaborative Editing

Collaborative editing is a feature that allows multiple people to edit the same document simultaneously. Here are some examples of apps with this feature:

Craft
Miro
Google Docs
Figma

I highly recommend trying out collaborative editing if you haven't already. You can ask your colleague or a friend to sit with you and try editing something together on separate devices. It’s amazing how quickly the apps reflect changes made on one device to the others. The ability to quickly display edits from others is a fundamental aspect of collaborative editing.

Some leading apps without collaborative editing:

Git
Adobe Photoshop
FL Studio

In these apps, collaborative, simultaneous editing with others isn’t automatic. A typical workflow involves making changes, saving the file, and sending it to a collaborator. They then make their modifications and return an updated version to you. This back-and-forth continues multiple times until the document is merged into its final state.

Git is an interesting example: it’s fundamentally a collaborative system, which facilitates collaboration and enables people to work together on the same project. But Git doesn't have collaborative editing, meaning users must pull and push changes, which can lead to conflicts when two people edit a single file simultaneously. Resolving these conflicts often requires manual intervention. This makes Git collaborative software without collaborative editing.

Conflict Resolution

Conflict and conflict resolution are inherent to collaborative editing. How do these conflicts appear? Imagine Alice and Bob want to update Version 1 (V1) of a document that’s stored in the cloud. They each download V1 to their local devices and make some changes. Bob finishes first and uploads this edited V2 to the cloud, while Alice finishes her V2* and tries to upload it later. But Alice can't just save her version to the cloud, because it would overwrite Bob's changes. This is the essence of a conflict. To resolve it, Alice needs to download V2, and reconcile the differences between V2 and V2*, thereby creating V3 which contains both their changes. Only then can she save V3 to the cloud.

Example: Conflict Resolution

Collaborative editing software must exchange changes fast so that everyone can see the latest version of a file. For this reason, there are many opportunities for conflict, which must be resolved automatically, or it would be extremely difficult to use the software collaboratively. This means that two mechanisms are essential for the experience to work, which together form the Collaborative Editing Formula:

Exchange changes quickly + Automatically Resolve Conflicts = Collaborative Editing

Operational Transformation

There are two common approaches to implementing automatic conflict resolution. One is called Operational Transformation (OT). The key idea is that when a user receives changes from others, they have to "transform" those changes to make them compatible with their local copy. An example product that uses OT is Google Docs. For most implementations of this approach, a central coordinator is required. A more detailed description of the concept looks like this:

Represent text as a sequence of operations
When new edits are made, they are sent as operations to others
Before saving received edits they are transformed, to make them compatible with local state

In the simplest case, editing can be represented with two operations:

ADD(Index, Ch) inserts a character at a specified index of the text
DEL(Index) removes a character at a specified index.

For example, "Hello!" can be represented as a sequence of additions.

ADD(1, H)
ADD(2, E)
ADD(3, L)
ADD(4, L)
ADD(5, O)
ADD(6, !)

Representing "Hello!" as a sequence of operations.

To change "Hello!" to "Hello :)" 4 new operations have to be applied.

ADD(1, "H") 
ADD(2, "E") 
ADD(3, "L") 
ADD(4, "L") 
ADD(5, "O") 
ADD(6, "!")
New Operations
DEL(6, !) 
ADD(6, " ") 
ADD(7, ":") 
ADD(8, ")")

Representing a deletion

Now let's see how this works with a simple example. Alice and Bob start with a common state: "Helo", which is misspelled. Bob wants to fix it by adding an "l" at position 3. Alice wants to add an exclamation mark at the end by adding "!" at position 5.

OT Example: Starting state and intended changes

When Alice adds her operation and receives the operation from Bob, she'll end up with "Hello!" which was expected to happen; so there’s no problem here. But when Bob adds his operation and then receives the operation from Alice, he'll end up with "Hell!o" which doesn't make sense and is different from the state that Alice ended up with.

This happened because after Bob made his edit, Alice's changes became incompatible with Bob's version. This can be illustrated with a diagram showing the exact sequence of operations.

OT Example: Bob's version has a conflict

From the diagram, it's clear that Alice's operation needs to be transformed. Instead of adding "!" at position 5, it has to be added at position 6. After Bob transforms the operation from Alice he'll get the correct state: “Hello!”

OT Example: Bob transforms Alice's operation to get the correct state

This is Operational Transformation in a nutshell: sometimes you have to adjust changes coming from others so that everyone ends up with the same file state. OT has to define how to transform any local and any remote operation. If the local addition is add(pos1, chr1) and remote one is add(pos2, chr2) then the rule will look like this:

transform(add(pos1, chr1), add(pos2, chr2)):
    if pos1 > pos2:
        return ins(pos1, chr1)
    else:
        return ins(pos1 + 1, c1)

Example rule of transforming two additions

For the simplest OT implementation, there needs to be 3 more rules for the remaining combinations of possible operations. After defining those, the OT is ready. The provided example above demonstrates the core ideas of the OT approach. Of course, real implementations are more complicated than this, with more complex scenarios like "undo" and styling.

Conflict-free Replicated Data Types

Conflict-free Replicated Data Types (CRDTs) are another approach for implementing automatic conflict resolution. The key idea is to represent data in a structure that doesn't allow conflicts in the first place. As with OT, this is not a one-size-fits-all solution, but it’s a concept with many different implementations tailored for a wide range of use cases. These implementations typically can work in a peer-to-peer topology without the need for a central server. Users can drop off and still be able to catch up later.

One example of a CRDT is the Two Phase Set, which is a special set that works as follows: items can be added and removed from the set, and once an item is removed it cannot be added again. A good mental model for this is a checklist where each item can be crossed out to indicate that it's completed. A Two Phase Set is a viable structure for a checklist, unless you want to have "untick" functionality.

Example Two Phase Set: Movies to watch

A Two Phase Set can be represented by two sets: one for added items and one for removed items. Users are only allowed to add new items by adding them to the "added" set, or to cross out an existing item by adding it to the "removed" set. This means all allowed operations result only in additions to the "added" or "removed" sets that represent the Two Phase Set. Let's again use Alice and Bob for an example: they’re editing a list of movies to watch and begin with a common state, illustrated above.

Now Alice wants to add a new movie, Gladiator, to the list, and Bob wants to cross out Interstellar as they've watched it.

Example Two Phase Set: Making changes

After making changes to their local states, they must exchange their states with each other and merge the corresponding sets. Merging is a commutative operation, so it doesn't matter if Alice sends her state to Bob or vice versa because the result will be the same.

Example Two Phase Set: Merging states

As you can see, merging "added" and "removed" sets will represent exactly what Alice and Bob wanted to happen: Gladiator is added and Interstellar is crossed out. By design, the rules that are defined for working with the Two Phase Set prevent conflicts.

The Two Phase Set is a good example of a somewhat useful CRDT that’s easy to grasp. Real world use cases would require a structure that allows for re-adding previously removed items. Structures like this exist; they can represent text, charts, graphs, and more, but their design is also much more complex. I can recommend two good open-source libraries for creating CRDT based collaborative editing features for real-world applications:

For some CRDTs it is proven that if everyone continues to share information about their state, then eventually everyone converges upon the same state, which is what conflict resolution is about. Here's a link to the proof: https://github.com/trvedata/crdt-isabelle. This proof was created using a theorem-proving software and is hosted on github.

Summary

Let's wrap up this post! For collaborative editing software to function effectively, it must quickly exchange edits in real time and automatically resolve any conflicts that arise. The two most notable approaches to achieve this are Operational Transformation (OT), and Conflict-free Replicated Data Types (CRDTs). OT transforms incoming edits on the fly, while CRDTs use structures designed to prevent conflicts from occurring in the first place.

If you're interested in learning about the history of Craft's collaborative editing solution, stay tuned for upcoming posts.

Collaborative Editing