With MarkLogic’s SPARQL queries, we can bind a value to constrain the query. Using this capability, we can gather information about something of interest. But what if I want to query against multiple values?
Let’s start with some sample data.
'use strict';
declareUpdate();
const sem = require("/MarkLogic/semantics");
const myPred = sem.iri("myPredicate");
sem.rdfInsert(
  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    .map(num => sem.triple(sem.iri(`target${num}`), myPred, num))
);
After running this in Query Console, I have 10 triples. Now suppose I want to find the objects where the subject is one of target1, target2, or target3? I have a couple choices. With MarkLogic, I can go after one of these values with a simple bind (the second parameter in sem.sparql):
sem.sparql(
  `
    select ?obj
    where {
      ?target ?pred ?obj
    }
  `,
  { target: sem.iri("target1") }
)
To be complete in thinking about my options, I could brute-force it and just run multiple SPARQL queries, one for each of my targets. That’s pretty inefficient.
I could also use a FILTER.
sem.sparql(
  `
    select ?obj
    where {
      ?target ?pred ?obj
      FILTER (?target IN (<target1>, <target2>, <target3>))
    }
  `
)
This is effective, but I’ve learned it’s not very efficient (though still better than running multiple queries). Another option is to bind the values we’re looking for using sem.sparql’s second parameter:
'use strict';
sem.sparql(
  `
    select ?obj
    where {
      ?target ?pred ?obj
    }
  `,
  { target: [sem.iri("target1"), sem.iri("target2"), sem.iri("target3")] }
)
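If the target names arrive as plain strings, the bindings object can be built separately from the query text, so the query string itself never changes. The sketch below is only an illustration: targetBindings is a hypothetical helper, not a MarkLogic function, and the IRI constructor is passed in as a parameter so the example also runs outside MarkLogic (fakeIri is a stand-in for sem.iri).

```javascript
'use strict';

// Hypothetical helper (not part of MarkLogic): turns plain names into the
// bindings object passed as the second argument to sem.sparql.
// `toIri` is injected so the sketch also runs outside MarkLogic;
// in Query Console you would pass sem.iri.
function targetBindings(names, toIri) {
  return { target: names.map(name => toIri(name)) };
}

// Stand-in for sem.iri, used only to show the resulting shape.
const fakeIri = name => `<${name}>`;
const bindings = targetBindings(['target1', 'target2', 'target3'], fakeIri);
console.log(JSON.stringify(bindings));
// prints {"target":["<target1>","<target2>","<target3>"]}
```

In Query Console, the same helper would be called as targetBindings(names, sem.iri) and its result passed as the second parameter of sem.sparql.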
I loaded up my database with 100,000 triples for a quick test (no other data, no other load). Both ran in a matter of milliseconds, but the bind approach ran in about a quarter of the time that the FILTER approach took.
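For anyone who wants to repeat a comparison like this, a rough timing harness might look like the sketch below. It is not the harness used for the numbers above. meanMillis is a hypothetical helper, and the workload here is a stand-in so the example runs anywhere. It also assumes the function being timed fully consumes its query results, since otherwise the measurement may not reflect the real work.

```javascript
'use strict';

// Hypothetical helper: calls fn `runs` times and returns the mean
// wall-clock time per call in milliseconds.
function meanMillis(fn, runs) {
  const start = Date.now();
  for (let i = 0; i < runs; i++) {
    fn();
  }
  return (Date.now() - start) / runs;
}

// Stand-in workload so the sketch runs anywhere. In Query Console, fn would
// wrap one of the queries above and consume its results.
const sample = meanMillis(() => [1, 2, 3].map(n => n * 2), 1000);
console.log(`mean ms per call: ${sample}`);
```

Averaging over many runs matters here because a single query that finishes in a few milliseconds is too short to time reliably on its own.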
If you’re looking to run a SPARQL query and you want to use multiple values, binding an array is the preferred way. Interesting to note, however — you can’t do that in a SPARQL update! Stay tuned for the next post where I’ll cover that.
David,
Hope you are doing well. Glad to see you posting on SPARQL. You must be aware of the “VALUES” clause:
https://www.w3.org/TR/sparql11-query/#inline-data
The same query can be re-written as:
select ?obj
where {
values ?target { <target1> <target2> <target3> }
?target ?pred ?obj
}
Thanks for the comment, Karthik. Yes, values lets us specify the inputs as well. I like the binding approach because if we’re re-running this query, MarkLogic can cache the plan for it and efficiently run with different values. I believe that if we modify the query string (including through the use of values), MarkLogic would need to do the interpretation and optimization stages again. Is that consistent with your experience?