Description
Problem statement
As a ShareDB client, I want to fetch a read-only snapshot of my document at a given version number or timestamp.
Motivation
Given that we already go to the effort of storing a set of deltas between document versions, it's only natural that a client would wish to leverage this to view a document at any point in its history.
The problem statement mentions fetching a document at a given timestamp because it is far more natural to request a document as it was at a particular time than at a given (arbitrary) version number.
API
The proposed API is to add two methods to the Doc class:
Doc.prototype.fetchVersion(version, callback) takes version, which is a number, and recreates the snapshot up to that version number. The result is stored in doc.data.
Doc.prototype.fetchAtTime(time, callback) takes time, which is a Date, and recreates the snapshot using ops whose timestamps are up-to-and-including that Date. The result is stored in doc.data.
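To make the intended semantics concrete, here is a minimal, self-contained sketch. It is not ShareDB code: SketchDoc, opLog and the toy "append a string" type are hypothetical stand-ins, the op shape ({v, create, op, m.ts}) is an assumption about what stored ops look like, and it assumes an op with version v produces snapshot version v + 1. The callback is invoked synchronously to keep the sketch short; the real methods would be asynchronous.

```javascript
// Hypothetical in-memory op log standing in for the server's op store.
const opLog = [
  {v: 0, create: {data: ''}, m: {ts: 1000}},
  {v: 1, op: 'Hello', m: {ts: 2000}},
  {v: 2, op: ', world', m: {ts: 3000}},
];

// Toy type: a create op sets the initial data, any other op appends a string.
function applyOp(data, op) {
  if (op.create) return op.create.data;
  return data + op.op;
}

class SketchDoc {
  constructor() {
    this.data = undefined;
    this.version = null;
  }

  // Rebuild the snapshot as it was at `version` (ops 0 .. version - 1 applied).
  fetchVersion(version, callback) {
    this._rebuild(opLog.filter((op) => op.v < version), callback);
  }

  // Rebuild the snapshot from ops timestamped up to and including `time`.
  fetchAtTime(time, callback) {
    const cutoff = time.getTime();
    this._rebuild(opLog.filter((op) => op.m.ts <= cutoff), callback);
  }

  _rebuild(ops, callback) {
    this.data = ops.reduce(applyOp, undefined);
    this.version = ops.length; // assumes ops are contiguous from v = 0
    callback(null);
  }
}

const doc = new SketchDoc();
doc.fetchVersion(2, (err) => {
  if (err) throw err;
  console.log(doc.version, JSON.stringify(doc.data));
});
```

Note that in this sketch a timestamp earlier than the create op leaves doc.data undefined; how the real API should report that case is an open question.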
Implementation details
Data flow
The request for the document version will be submitted like the existing fetch function: by submitting an event to the server and attaching a callback.
The message will be picked up by Agent._handleMessage, which will then leverage Backend.getOps to fetch the requested ops.
We may need to make a small change to Backend.getOps to let us request metadata from the backend using the options object. As discussed in this Pull Request, this will be done in such a way that keeps the options object out of the public API (probably by creating an internal Backend._getOps method that can take an options object, and calling it with null from Backend.getOps).
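The Backend.getOps / Backend._getOps split described above could look roughly like the sketch below. This is an illustration under assumptions, not the real Backend: SketchBackend and stubDb are hypothetical, the method signatures are simplified (no agent or projections), and the {metadata: true} option name is assumed.

```javascript
// Hypothetical stand-in for a database adapter. It strips op metadata
// unless the caller explicitly asks for it via options.metadata.
const stubDb = {
  ops: [
    {v: 0, op: 'a', m: {ts: 1000}},
    {v: 1, op: 'b', m: {ts: 2000}},
  ],
  getOps(collection, id, from, to, options, callback) {
    const wantMetadata = Boolean(options && options.metadata);
    const result = this.ops
      .filter((op) => op.v >= from && (to === null || op.v < to))
      .map((op) => (wantMetadata ? op : {v: op.v, op: op.op}));
    callback(null, result);
  },
};

class SketchBackend {
  constructor(db) {
    this.db = db;
  }

  // Public method: signature unchanged, never exposes options.
  getOps(collection, id, from, to, callback) {
    this._getOps(collection, id, from, to, null, callback);
  }

  // Internal method: accepts options, e.g. {metadata: true} for timestamps.
  _getOps(collection, id, from, to, options, callback) {
    this.db.getOps(collection, id, from, to, options, callback);
  }
}
```

Only the internal caller that handles a fetch-at-time request would pass {metadata: true}; existing callers of getOps would see no change.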
Discussion
Read-only snapshots
Using the Doc class is potentially a leaky abstraction, given that it will also have Doc.prototype.submitOp, which doesn't really make sense when fetching an historical document.
A possible alternative could be to expose these functions on the Connection class instead. That way it should be very clear that the consumer is receiving a snapshot, and not a full-blown Doc.
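As a rough illustration of the Connection-based alternative, the sketch below hands back a plain, frozen snapshot object rather than a Doc. SketchConnection, the fetchSnapshot name and its arguments are hypothetical, as are the in-memory op log and the toy append type; the collection and id arguments are accepted but unused here.

```javascript
class SketchConnection {
  constructor(opLog) {
    this.opLog = opLog; // hypothetical in-memory ops, ordered by version
  }

  // Hands back a plain read-only snapshot: no submitOp, no subscriptions.
  fetchSnapshot(collection, id, version, callback) {
    const ops = this.opLog.filter((op) => op.v < version);
    const data = ops.reduce(
      (snapshot, op) => (op.create ? op.create.data : snapshot + op.op),
      undefined
    );
    callback(null, Object.freeze({id: id, v: ops.length, data: data}));
  }
}

const connection = new SketchConnection([
  {v: 0, create: {data: ''}},
  {v: 1, op: 'a'},
  {v: 2, op: 'b'},
]);

connection.fetchSnapshot('docs', 'my-doc', 3, (err, snapshot) => {
  if (err) throw err;
  console.log(JSON.stringify(snapshot));
});
```

Because the result is a bare object, there is nothing on it that suggests the consumer could edit history.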
Out of scope
The following possibilities are deemed out-of-scope for the initial solution.
Optimising for reversible types
Making a type reversible is optional. As such, any solution must at least be able to construct a document from its initial version, and build up. However, given the nature of documents, it is highly likely that users will wish to return to more recent versions, where it will probably be faster to start from the current version and work backwards.
This is deemed out-of-scope.
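For reference only, working backwards with a reversible type might look like the sketch below. counterType is a toy type invented for illustration (an op is a number added to the snapshot), rewind is a hypothetical helper, and the ops array is assumed to be indexed by version. The create op cannot be inverted, so the target version must be at least 1.

```javascript
// Toy reversible type: the snapshot is a number, an op adds to it.
const counterType = {
  apply: (snapshot, op) => snapshot + op,
  invert: (op) => -op,
};

// Walk back from the current snapshot by applying inverted ops, newest first.
function rewind(type, currentData, ops, currentVersion, targetVersion) {
  let data = currentData;
  for (let v = currentVersion - 1; v >= targetVersion; v--) {
    if (ops[v].op === undefined) {
      throw new Error('Cannot invert op at version ' + v);
    }
    data = type.apply(data, type.invert(ops[v].op));
  }
  return data;
}

// ops[v] is the op that turns version v into version v + 1.
const counterOps = [
  {v: 0, create: {data: 0}},
  {v: 1, op: 5},
  {v: 2, op: -2},
  {v: 3, op: 10},
];

// The current snapshot is version 4 with data 0 + 5 - 2 + 10 = 13.
console.log(rewind(counterType, 13, counterOps, 4, 3));
```

Rewinding from version 4 to version 3 touches one op, whereas building version 3 from scratch touches three, which is the saving this optimisation would be after.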
Caching ops
Fetching ops can be expensive. Ideally we would cache the ops for a given document so that, as long as the requested version/timestamp is lower than the latest cached op, we could simply read ops back from the cache.
This is deemed out-of-scope.
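Purely to illustrate the idea, an op cache could be as simple as the sketch below. OpCache and its fetchOps callback are hypothetical names, and the sketch assumes ops are contiguous from version 0, so that the cached array index equals the op version. It ignores eviction and memory limits entirely.

```javascript
class OpCache {
  // fetchOps(id, from, to, callback) is a hypothetical loader for ops [from, to).
  constructor(fetchOps) {
    this.fetchOps = fetchOps;
    this.cache = new Map(); // doc id -> array of ops, index === op version
  }

  // Returns ops 0 .. to - 1, hitting the loader only for the missing tail.
  getOps(id, to, callback) {
    const cached = this.cache.get(id) || [];
    if (to <= cached.length) {
      callback(null, cached.slice(0, to));
      return;
    }
    this.fetchOps(id, cached.length, to, (err, ops) => {
      if (err) {
        callback(err);
        return;
      }
      const all = cached.concat(ops);
      this.cache.set(id, all);
      callback(null, all.slice(0, to));
    });
  }
}
```

Requests for older versions are then served from memory, and requests for newer versions only fetch the ops that are not yet cached.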
Other starting snapshot optimisations
It could be possible to fetch the latest create op, and start building from there, instead of the very beginning. It might also (theoretically) be possible to store intermittent snapshots of the document for faster reconstruction at a trade-off with space.
These sorts of optimisations are also deemed out-of-scope.
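As an illustration of the intermittent-snapshot idea, the sketch below starts from the newest stored checkpoint at or before the target version and applies only the ops after it. buildFromCheckpoint, the checkpoint shape ({v, data}) and the toy append type are all hypothetical; a checkpoint at v is assumed to hold the snapshot with ops 0 .. v - 1 already applied.

```javascript
// Toy type: a create op sets the initial data, any other op appends a string.
function applyToyOp(data, op) {
  return op.create ? op.create.data : data + op.op;
}

// Rebuild `targetVersion`, starting from the newest usable checkpoint.
function buildFromCheckpoint(checkpoints, ops, targetVersion, apply) {
  let start = {v: 0, data: undefined}; // fall back to the very beginning
  for (const checkpoint of checkpoints) {
    if (checkpoint.v <= targetVersion && checkpoint.v > start.v) {
      start = checkpoint;
    }
  }
  let data = start.data;
  let applied = 0;
  for (const op of ops) {
    if (op.v >= start.v && op.v < targetVersion) {
      data = apply(data, op);
      applied++;
    }
  }
  return {v: targetVersion, data: data, applied: applied};
}

const toyOps = [
  {v: 0, create: {data: ''}},
  {v: 1, op: 'a'},
  {v: 2, op: 'b'},
  {v: 3, op: 'c'},
  {v: 4, op: 'd'},
];
const toyCheckpoints = [{v: 3, data: 'ab'}];

console.log(buildFromCheckpoint(toyCheckpoints, toyOps, 5, applyToyOp));
```

The applied count in the result is only there to show how many ops were replayed; with the checkpoint at version 3, building version 5 replays two ops instead of five.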
Doing the work
Given that we need this functionality, I'm happy to undertake the majority of the work on this, but I haven't developed in this codebase before, so I may need some assistance (especially because I haven't really worked with all the features of ShareDB, such as projections).