MarkLogic index data types
- 28 July, 2022
- By admin

MarkLogic offers several types of indexes: Universal, range, triples. These indexes provide fast access to your content and can be configured to work with specific data types. MarkLogic will even do some type conversions for you.
Universal Index
Let’s insert a couple of documents. Note the difference between the updated properties (“T” versus no “T”) and the types of the someNumber property.
'use strict';
declareUpdate();
xdmp.documentInsert(
  "/content/doc1.json",
  { "updated": "2022-07-13T00:00:00", "someNumber": 1 }
);
xdmp.documentInsert(
  "/content/doc2.json",
  { "updated": "2022-07-12 00:00:00", "someNumber": "2" }
);
The Universal Index will store each of these values, along with the structure, as they are provided to MarkLogic. We can query those as soon as the transaction completes. To do so, we need to query for the specific value of the right type:
cts.jsonPropertyValueQuery("someNumber", 1)
will find doc1.json, but cts.jsonPropertyValueQuery("someNumber", "1")
will not.
Range Indexes
Let’s set up 2 range indexes:
- On the “updated” property with type “dateTime”
- On the “someNumber” property with type “int”
(In the past, xs.dateTime("2022-07-12 00:00:00") would fail.) MarkLogic changed that at some point; our sample data values, both with and without the “T”, can be passed to the xs.dateTime
constructor successfully. If we ask MarkLogic for the values in the range index, we’ll see both dateTimes (with the “T”):
cts.values(cts.jsonPropertyReference("updated"))
=>
2022-07-12T00:00:00
2022-07-13T00:00:00
Likewise, we can do an inequality query whether our input has the “T” or not:
cts.search(
  cts.jsonPropertyRangeQuery(
    "updated", ">=", xs.dateTime("2022-07-12 00:00:00")
  )
)
Triples Index
The triples index, which powers both triples and views, also does this conversion. Let’s add a template:
'use strict';
const tde = require("/MarkLogic/tde.xqy");
const typeTemplate = xdmp.toJSON(
  {
    "template": {
      "context": "/",
      "directories": ["/content/"],
      "rows": [
        {
          "schemaName": "test",
          "viewName": "types",
          "columns": [
            {
              "name": "updated",
              "scalarType": "dateTime",
              "val": "updated",
              "invalidValues": "reject"
            },
            {
              "name": "someNumber",
              "scalarType": "int",
              "val": "someNumber",
              "invalidValues": "reject"
            }
          ]
        }
      ]
    }
  }
);
tde.templateInsert(
  "/test/typeTemplate.json",
  typeTemplate,
  xdmp.defaultPermissions(),
  ["TDE"]
)
Now we can do a simple query and see that the values have been converted to their target types:
select * from test.types
test.types.updated | test.types.someNumber |
2022-07-13T00:00:00 | 1 |
2022-07-12T00:00:00 | 2 |
Impact
I find this implicit conversion especially helpful for xs.dateTime. Relational databases often use the format without the “T” in the middle. When ingesting data from such sources (or accepting queries from consumers that expect that format), the ingest process would need to add the “T” in order to match the expected format if the implicit conversion didn’t happen.
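For comparison, here is a rough sketch of the kind of ingest-side step that would be needed without the implicit conversion. It is plain JavaScript with no MarkLogic calls, and toIsoDateTime is a hypothetical helper name, not part of any MarkLogic API. It assumes the source sends values shaped like the sample data (date, a space or “T”, then a time).

```javascript
'use strict';

// Hypothetical ingest helper (not a MarkLogic API).
// Rewrites "YYYY-MM-DD hh:mm:ss" as "YYYY-MM-DDThh:mm:ss".
// Values that do not look like a dateTime are returned unchanged.
function toIsoDateTime(value) {
  // Date, then a space or "T", then a time with optional
  // fractional seconds and an optional timezone suffix.
  const pattern =
    /^(\d{4}-\d{2}-\d{2})[ T](\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?)$/;
  const match = pattern.exec(value);
  if (match === null) {
    return value;
  }
  return `${match[1]}T${match[2]}`;
}

// Applied to the two sample "updated" values from this post.
const normalized = [
  "2022-07-13T00:00:00",
  "2022-07-12 00:00:00"
].map(toIsoDateTime);
```

With the range index and TDE conversions described above, this step can be skipped for the indexes; it would only matter if the stored document itself had to carry the “T” form.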
The key thing is to remember that the value in the document (and in the Universal Index) hasn’t changed — MarkLogic stores whatever is provided. If you have a property where the source doesn’t reliably provide the same type, remember that your value queries will need to match both type and value (as in the case for the someNumber
property above).
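One way to cope with a source that mixes types is to search for both the number and the string form of a value. The sketch below is plain JavaScript and only builds the list of candidate values; typeVariants is a hypothetical helper name, and the rule it uses (treat any string that parses as a finite number as numeric) is an assumption that may be too loose for some data.

```javascript
'use strict';

// Hypothetical helper (not a MarkLogic API).
// Returns the value as given, plus its counterpart of the other type
// when one exists: 2 -> [2, "2"], "2" -> ["2", 2], "abc" -> ["abc"].
function typeVariants(value) {
  const variants = [value];
  if (typeof value === "number") {
    // A number may have been stored as a string by another source.
    variants.push(String(value));
  } else if (
    typeof value === "string" &&
    value.trim() !== "" &&
    Number.isFinite(Number(value))
  ) {
    // A numeric-looking string may have been stored as a number.
    variants.push(Number(value));
  }
  return variants;
}

const candidates = typeVariants(2);
```

Inside MarkLogic, one option would be to build a cts.jsonPropertyValueQuery for each candidate and combine them with cts.orQuery. That part is not shown here and has not been tested against the sample documents.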