Apply temporal to an existing document

MarkLogic’s temporal feature provides an out-of-the-box way to preserve copies of a document when it gets updated. You can read much more in the Temporal Developer’s Guide, but I recently needed to answer a particular question: how do I make a non-temporal document temporal?

First, let’s think about what makes a document temporal. A temporal document 1) is in a temporal collection, 2) has timestamps that indicate its lifetime, and 3) has a collection named for its URI. In addition, the latest collection marks the current version of a temporal document. We can create a temporal document using the xdmp.temporalInsert function, which includes a parameter to specify the temporal collection. The timestamps are managed automatically. MarkLogic’s Data Hub Framework can also write temporal documents (using xdmp.temporalInsert under the hood), which I explored in the dhf-temporal project on GitHub.

So what happens if you try to do an xdmp.temporalInsert with a URI that already points to a non-temporal document? MarkLogic will throw a TEMPORAL-NOTINCOLLECTION error. We could brute force it by deleting the document and then doing a temporal insert, but that has to be done in two separate transactions to avoid XDMP-CONFLICTINGUPDATE. That works, but there is a non-zero risk that the delete will succeed and the insert won’t, leaving us without the document.

The alternative is to update in place, applying temporal aspects to the non-temporal document. Here’s an example:

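'use strict';
declareUpdate();
let uri = "/claims/claim3.json";
// Add the URI-named collection, the temporal collection, and "latest"
xdmp.documentAddCollections(uri, [uri, "claim/temporal", "latest"]);
// Set the system axis timestamps and the temporal document URI
xdmp.documentSetMetadata(uri, {
  "claim-system-start": fn.currentDateTime(),
  "claim-system-end": "9999-12-31T11:59:59Z",
  "temporalDocURI": uri
});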

Our target URI is a non-temporal document that we want to add to the claim/temporal temporal collection. As such, we add three collections. The collection named for the URI lets MarkLogic associate copies of this document with each other. The latest collection tells MarkLogic which copy is the most recent. Since there’s only one copy of this document, we want it to be the latest.

We also set the metadata. Note that the names used need to match up with the temporal axis values used in conjunction with the temporal collection.

We set the end time to “9999-12-31T11:59:59Z”, which is “the end of time”. We can set the start time to whatever value we choose. Note that you might want to set this to some time in the past to allow for more interesting temporal queries.
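To make the collection and metadata choices easier to vary, here is a minimal sketch of a helper that builds both for a given URI. The function name temporalSettings and its options are hypothetical, and the defaults simply mirror the claim example; the metadata names are assumptions that must match the axes configured for your own temporal collection.

```javascript
'use strict';

// Hypothetical helper: builds the collections and metadata for one document.
// The defaults mirror the claim example; the axis metadata names are
// assumptions and must match the axes behind your temporal collection.
function temporalSettings(uri, options) {
  const opts = options || {};
  const temporalCollection = opts.temporalCollection || 'claim/temporal';
  const startKey = opts.startKey || 'claim-system-start';
  const endKey = opts.endKey || 'claim-system-end';
  // Default to "now"; pass an earlier ISO 8601 timestamp to backdate.
  const start = opts.start || new Date().toISOString();

  const metadata = { temporalDocURI: uri };
  metadata[startKey] = start;
  metadata[endKey] = '9999-12-31T11:59:59Z'; // "the end of time"

  return {
    collections: [uri, temporalCollection, 'latest'],
    metadata: metadata
  };
}
```

In an update module, the result could be handed to xdmp.documentAddCollections(uri, settings.collections) and xdmp.documentSetMetadata(uri, settings.metadata). Passing a start value such as "2020-01-01T00:00:00Z" is one way to backdate the document for more interesting temporal queries.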

After running this, the document is now a temporal document. Running xdmp.temporalInsert on it will behave as expected, archiving this version and creating a new one.

To apply this change to an existing database, you can deploy the temporal collection and axis, then run a CORB job to add the non-temporal data to the temporal collection. If you have an ongoing ingest process that may overwrite some of the relevant documents, consider pausing that process between the deployment and the completion of the CORB job to avoid errors during the transition.
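For the batch job, it helps if the per-document step can be re-run safely. The sketch below shows one way to do that, under some assumptions: makeTemporal and its config object are hypothetical names, and db stands in for MarkLogic's built-in xdmp object so the logic can be tried outside the server. Documents already in the temporal collection are skipped.

```javascript
'use strict';

// Hypothetical per-document step for a batch job. `db` is any object that
// exposes documentGetCollections, documentAddCollections, and
// documentSetMetadata; inside MarkLogic the built-in xdmp object is assumed
// to fit. Returns true if the document was converted, false if skipped.
function makeTemporal(uri, db, config) {
  // Array.from accepts an array or any other iterable of collection names.
  const existing = Array.from(db.documentGetCollections(uri));
  if (existing.indexOf(config.temporalCollection) !== -1) {
    return false; // already in the temporal collection, so re-runs are safe
  }

  const metadata = { temporalDocURI: uri };
  metadata[config.startKey] = config.start;
  metadata[config.endKey] = '9999-12-31T11:59:59Z';

  db.documentAddCollections(uri, [uri, config.temporalCollection, 'latest']);
  db.documentSetMetadata(uri, metadata);
  return true;
}
```

In a CORB process module, I would expect to call declareUpdate() and then makeTemporal(URI, xdmp, config), assuming CORB supplies each document's URI in an external variable named URI. Treat that wiring as an assumption to check against your CORB version.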
