\item[]$\ell$ is the length of the document before the change,
\item[]$\ell'$ is the length of the document after the change,
\item[]$[c_1,c_2,c_3,\ldots]$ is an array of $\ell'$ characters that describes the document after the change.
\end{itemize}
Note that each $c_i$, $1\leq i \leq\ell'$, is either an integer or a character.
\begin{itemize}
\item Integers represent retained characters in the original document.
\item Characters represent insertions.
\end{itemize}
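\paragraph{Example}
As a small illustration constructed for this note (and assuming, as in the examples of this document, that integers are zero-based positions in the original text), take the document ``abc'' with $\ell=3$ and the changeset
\[(3\rightarrow3)[0, ``\mathit{x}", 2].\]
The entries $0$ and $2$ retain the characters ``a'' and ``c'', the character ``x'' is an insertion, and position $1$ (``b'') is mentioned nowhere and is therefore deleted. The resulting document is ``axc'', of length $\ell'=3$.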
\section{Constraints on Changesets}
\begin{itemize}
\item Changesets are canonical and therefore comparable. When represented in computer memory, we always use the same representation for the same changeset. If the memory representations of two changesets differ, they must be different changesets.
\item Changesets are compact. If there are two ways to represent a changeset in computer memory, we always use the representation that takes up the fewest bytes.
\end{itemize}
Later we will discuss optimizations to changeset
representation (using ``strips'' and other such
techniques). The two constraints must apply to any
representation of changesets.
\section{Notation}
\begin{itemize}
\item We use the algebraic multiplication notation to represent changeset application.
\item While changesets are defined as operations on documents, documents themselves are represented as a list of changesets, initially applying to the empty document.
\end{itemize}
\paragraph{Example}
\[A=(0\rightarrow5)[``\mathit{hello}"]\]
\[B=(5\rightarrow11)[0-4, ``\mathit{\ world}"]\]
We can write the document ``hello world'' as $A\cdot B$ or
just $AB$. Note that the ``initial document'' can be made
into the changeset $(0\rightarrow
N)[``<\mathit{the\ document\ text}>"]$.
When $A$ and $B$ are changesets, we can also refer to $(AB)$ as ``the composition'' of $A$ and $B$. Changesets are closed under composition.
For any two changesets $A=(n_1\rightarrow n_2)[\cdots]$ and $B=(n_2\rightarrow n_3)[\cdots]$, there is a third changeset $C=(n_1\rightarrow n_3)[\cdots]$ such that applying $C$ to a document $X$ yields the same resulting document as applying $A$ and then $B$. In this case, we write $AB=C$.
Given the representation from Section \ref{representation}, it is straightforward to compute the composition of two changesets.
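\paragraph{Example}
The following sketch illustrates one way to carry out this computation; the substitution rule is our reading of the representation, not a quotation of the Etherpad implementation. Let
\[A=(3\rightarrow3)[0, ``\mathit{x}", 2],\qquad B=(3\rightarrow4)[0, 1, 2, ``\mathit{!}"].\]
Each integer $i$ in $B$ refers to position $i$ of the document produced by $A$, so it is replaced by the $i$-th entry of $A$ (counting from zero); each character in $B$ is kept as an insertion:
\[0\mapsto 0,\qquad 1\mapsto ``\mathit{x}",\qquad 2\mapsto 2,\qquad ``\mathit{!}"\mapsto ``\mathit{!}".\]
Hence $AB=(3\rightarrow4)[0, ``\mathit{x}", 2, ``\mathit{!}"]$. Applied to ``abc'', $A$ gives ``axc'' and $B$ then gives ``axc!'', which is also what $AB$ yields directly.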
\section{Changeset Merging}
Now we come to realtime document editing. Suppose two different users make two different changes to the same document at the same time. It is impossible to compose these changes. For example, if we have the document $X$ of length $n$, we may have $A=(n\rightarrow n_a)[\ldots n_a\ \mathrm{characters}]$, $B=(n\rightarrow n_b)[\ldots n_b\ \mathrm{characters}]$ where $n\neq n_a\neq n_b$.
It is impossible to compute $(XA)B$ because $B$ can only be applied to a document of length $n$, and $(XA)$ has length $n_a$. Similarly, $A$ cannot be applied to $(XB)$ because $(XB)$ has length $n_b$.
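
For a concrete instance (the values are chosen here purely for illustration), let $X$ be the document ``abc'', so $n=3$, and take
\[A=(3\rightarrow4)[0, 1, 2, ``\mathit{d}"],\qquad B=(3\rightarrow2)[0, 2].\]
Then $XA$ is ``abcd'' with $n_a=4$ and $XB$ is ``ac'' with $n_b=2$. Since $B$ expects a document of length $3$, it cannot be applied to $XA$; likewise $A$ cannot be applied to $XB$.
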
This is where \emph{merging} comes in. Merging takes two changesets that apply to the same initial document (and that cannot be composed), and computes a single new changeset that preserves the intent of both changes. The merge of $A$ and $B$ is written as $m(A,B)$. For the Etherpad system to work, we require that $m(A,B)=m(B,A)$.
Aside from what we have said so far about merging, there are many different implementations that will lead to a workable system. We have created one implementation for text that has the following constraints.
When users $A$ and $B$ have the same document $X$ on their screens and make respective changesets $A$ and $B$, it is of no use to compute $m(A,B)$ directly, because $m(A,B)$ applies to document $X$, while the users are already looking at documents $XA$ and $XB$. What we really want is to compute $B'$ and $A'$ such that
\[XAB' = XBA' = Xm(A,B).\]
``Following'' computes these $B'$ and $A'$ changesets. The definition of the ``follow'' function $f$ is such that $Af(A,B)=Bf(B,A)=m(A,B)=m(B,A)$. When we compute $f(A,B)$:
\begin{itemize}
\item Insertions in $A$ become retained characters in $f(A,B)$
\item Insertions in $B$ become insertions in $f(A,B)$
\item Retain whatever characters are retained in \emph{both} $A$ and $B$
\end{itemize}
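
As a minimal check of these rules (an example constructed here; it deliberately avoids the case where both changesets insert at the same position, where the relative order of the two insertions would have to be decided), let $X$ be the document ``ab'' and take
\[A=(2\rightarrow3)[0, ``\mathit{x}", 1],\qquad B=(2\rightarrow1)[0],\]
so that $XA$ is ``axb'' and $XB$ is ``a''. In $f(A,B)$, which applies to $XA$, the inserted ``x'' is retained, ``a'' is retained because both changesets retain it, and ``b'' is dropped because $B$ deletes it:
\[f(A,B)=(3\rightarrow2)[0, 1].\]
In $f(B,A)$, which applies to $XB$, ``a'' is retained and the insertion ``x'' from $A$ is inserted after it:
\[f(B,A)=(1\rightarrow2)[0, ``\mathit{x}"].\]
Both paths give the document ``ax'', and composing confirms $Af(A,B)=Bf(B,A)=(2\rightarrow2)[0, ``\mathit{x}"]$.
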
\paragraph{Example}
Suppose we have the initial document $X=(0\rightarrow8)[``\mathit{baseball}"]$ and user $A$ changes it to ``basil'' with changeset $A$, and user $B$ changes it to ``below'' with changeset $B$.