Apply Temporal to an Existing Document

  • 28 October, 2022
  • By Dave Cassel

MarkLogic’s temporal feature provides an out-of-the-box way to preserve copies of a document when it gets updated. You can read much more in the Temporal Developer’s Guide, but I recently needed to answer a particular question: how do I make a non-temporal document temporal?

First, let’s think about what makes a document temporal. A temporal document 1) is in a temporal collection, 2) has timestamps that indicate its lifetime, and 3) has a collection named for its URI. In addition, the latest collection marks the current version of a temporal document. We can create a temporal document using the temporal.documentInsert function, which includes a parameter to specify the temporal collection. The timestamps are managed automatically. MarkLogic’s Data Hub Framework can also write temporal documents (using temporal.documentInsert under the hood), which I explored in the dhf-temporal project on GitHub.
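As a rough illustration of those three criteria, here is a minimal sketch in plain JavaScript. The function name looksTemporal, the shape of its input, and the settings object are assumptions made up for this example, not part of MarkLogic's API. Inside the server, the collections and metadata would come from calls such as xdmp.documentGetCollections and xdmp.documentGetMetadata.

```javascript
'use strict';

// Hypothetical helper (not a MarkLogic function): report whether a document's
// collections and metadata match the three criteria described above.
// Assumes doc.uri is the original URI of the document, not an archived version.
function looksTemporal(doc, settings) {
  const inTemporalCollection = doc.collections.includes(settings.temporalCollection);
  const hasUriCollection = doc.collections.includes(doc.uri);
  const hasTimestamps =
    settings.startKey in doc.metadata && settings.endKey in doc.metadata;
  return inTemporalCollection && hasUriCollection && hasTimestamps;
}

// Assumed names for the temporal collection and the system axis metadata keys.
const settings = {
  temporalCollection: "claim/temporal",
  startKey: "claim-system-start",
  endKey: "claim-system-end"
};

// A document that only sits in an ordinary collection does not qualify.
const plainDoc = { uri: "/claims/claim3.json", collections: ["claims"], metadata: {} };
const plainResult = looksTemporal(plainDoc, settings); // false
```

This is only a sanity check on the visible markers of a temporal document. It says nothing about whether the temporal collection and axis are actually configured in the database.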

So what happens if you try to do a temporal.documentInsert with a URI that already points to a non-temporal document? MarkLogic will throw a TEMPORAL-NOTINCOLLECTION error. We could brute force it by deleting the document and then doing a temporal insert, but that has to be done in two separate transactions to avoid XDMP-CONFLICTINGUPDATES. That works, but there is a non-zero risk that the delete will succeed and the insert won’t, leaving us with no document at all.

The alternative is to update in place, applying temporal aspects to the non-temporal document. Here’s an example:

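'use strict';
declareUpdate();

const uri = "/claims/claim3.json";

// Add the three collections a temporal document needs: one named for its URI,
// the temporal collection itself, and "latest".
xdmp.documentAddCollections(
  uri,
  [uri, "claim/temporal", "latest"]
);

// Set the system axis timestamps; the keys must match the temporal axis configuration.
xdmp.documentPutMetadata(
  uri,
  {
    "claim-system-start": fn.currentDateTime(),
    "claim-system-end": "9999-12-31T11:59:59Z",
    "temporalDocURI": uri
  }
);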

Our target URI points to a non-temporal document that we want to add to the claim/temporal temporal collection. As such, we add three collections. The collection named for the URI lets MarkLogic associate copies of this document with each other. The latest collection tells MarkLogic which copy is the most recent. Since there’s only one copy of this document, we want it to be the latest.

We also set the metadata. Note that the metadata names need to match the names configured for the temporal axis that the temporal collection uses.

We set the end time to “9999-12-31T11:59:59Z”, which is “the end of time”. We can set the start time to whatever value we choose. Note that you might want to set this to some time in the past to allow for more interesting temporal queries.
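If several documents need the same treatment, the collections and metadata can be built in one place. The sketch below is plain JavaScript with no MarkLogic dependency. The helper name buildTemporalAttributes, the settings object, and the 2020 start time are assumptions chosen for illustration, and the key names simply mirror the example above.

```javascript
'use strict';

// The "end of time" value used in this post.
const END_OF_TIME = "9999-12-31T11:59:59Z";

// Hypothetical helper (not a MarkLogic function): build the collections and
// metadata that mark `uri` as the latest, and only, version of a temporal document.
function buildTemporalAttributes(uri, settings) {
  return {
    collections: [uri, settings.temporalCollection, "latest"],
    metadata: {
      [settings.startKey]: settings.systemStart,
      [settings.endKey]: END_OF_TIME,
      temporalDocURI: uri
    }
  };
}

const attrs = buildTemporalAttributes("/claims/claim3.json", {
  temporalCollection: "claim/temporal",
  startKey: "claim-system-start",   // must match the configured axis
  endKey: "claim-system-end",       // must match the configured axis
  systemStart: "2020-01-01T00:00:00Z" // assumed start time in the past
});

// Inside MarkLogic, these values would feed the two update calls:
//   xdmp.documentAddCollections(uri, attrs.collections);
//   xdmp.documentPutMetadata(uri, attrs.metadata);
```

Keeping the start time as a parameter makes it easy to backdate a batch of documents consistently, which is what allows the more interesting temporal queries mentioned above.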

After running this, the document is now a temporal document. Running temporal.documentInsert on it will behave as expected, archiving this version and creating a new one.

To apply this change to an existing database, you can deploy the temporal collection and axis, then run a CORB job to add the non-temporal data to the temporal collection. If you have an ongoing ingest process that may overwrite some of the relevant documents, consider pausing that process between the deployment and the completion of the CORB job to avoid transition errors.
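The per-document step of such a batch job might look something like the sketch below. Everything here is an assumption for illustration: the makeTemporal function, the settings object, and the idea of passing in a db object that exposes documentGetCollections, documentAddCollections, and documentPutMetadata. Inside MarkLogic that object could be the global xdmp (in an update transaction); the in-memory stub is there only so the sketch can run elsewhere.

```javascript
'use strict';

// Hypothetical per-document step. Skips documents that are already in the
// temporal collection, so the job can be rerun safely.
function makeTemporal(uri, db, settings) {
  const current = Array.from(db.documentGetCollections(uri)).map(String);
  if (current.includes(settings.temporalCollection)) {
    return false; // already converted, leave it alone
  }
  db.documentAddCollections(uri, [uri, settings.temporalCollection, "latest"]);
  db.documentPutMetadata(uri, {
    [settings.startKey]: settings.systemStart,
    [settings.endKey]: "9999-12-31T11:59:59Z",
    temporalDocURI: uri
  });
  return true;
}

// In-memory stand-in for the three calls, for running outside MarkLogic.
function createStubDb(collectionsByUri) {
  const metadataByUri = {};
  return {
    collectionsByUri,
    metadataByUri,
    documentGetCollections: (uri) => collectionsByUri[uri] || [],
    documentAddCollections: (uri, added) => {
      collectionsByUri[uri] = (collectionsByUri[uri] || []).concat(added);
    },
    documentPutMetadata: (uri, metadata) => {
      metadataByUri[uri] = Object.assign({}, metadataByUri[uri], metadata);
    }
  };
}

const settings = {
  temporalCollection: "claim/temporal",
  startKey: "claim-system-start",
  endKey: "claim-system-end",
  systemStart: "2020-01-01T00:00:00Z" // assumed start time
};

const db = createStubDb({ "/claims/claim3.json": ["claims"] });
const firstRun = makeTemporal("/claims/claim3.json", db, settings);  // true: converted
const secondRun = makeTemporal("/claims/claim3.json", db, settings); // false: skipped
```

The skip check is a design choice rather than a requirement. It keeps a rerun from resetting the start time on documents that were converted by an earlier pass.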
