Skip to main content

What's New in MarkLogic 12

Shortest Path Graph Algorithm

Many real-world problems--such as ones within navigational systems, network routing, and circuit design--can be represented as shortest path problems.

In MarkLogic Server, triples and the Semantics capability can model and resolve such shortest path problems. Each subject and object can be considered as nodes separated by predicate edges. Crossing these edges comes at a cost, so finding the path with the fewest edges is crucial.

Likewise, the ways that recommendation systems and natural language processing applications use knowledge graphs to model relationships between people, words, or concepts can be looked at as shortest path problems.

Shortest path queries can help discover relevant facts in knowledge graphs and piece together information that may have been missed before.

To facilitate such queries, MarkLogic 12.0 EA1 introduces the SPARQL magic property, xdmp:shortestPath.

xdmp:shortestPath makes use of these named arguments:

  • xdmp:start

  • xdmp:end

  • xdmp:pattern

  • xdmp:path

  • xdmp:length

The framework to support xdmp:shortestPath can easily allow the addition of future SPARQL magic properties like xdmp:unnest.

Reference: W3C’s SPARQL/Extensions/Computed Properties

To better understand the shortest path problem, consider this simple data set:

@prefix p: <http://example.org/kennedy/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
p:john foaf:knows p:bob .
p:john foaf:knows p:alice .
p:john foaf:knows p:zak .
p:bob foaf:knows p:alice .
p:bob foaf:knows p:maia .
p:alice foaf:knows p:zak .
p:zak foaf:knows p:drew .
p:drew foaf:knows p:maia . 

This diagram represents the data set:

Diagram of data set showing that the shortest path from john to maia is through bob

The shortest path from john to maia would be the one with the fewest edges (blue arrows) between them: through bob.

The Shortest Path from John to Maia

This SPARQL query uses the new xdmp:shortestPath property to find the shortest path from john to maia:

PREFIX xdmp: <http://marklogic.com/xdmp#>
PREFIX p: <http://example.org/kennedy/>
SELECT * 
where {
  (
    [xdmp:start ?s ]
    [xdmp:end ?o]
  ) xdmp:shortestPath (
    [xdmp:path ?path]
    [xdmp:length ?length]
  )
  FILTER (?s = p:john && ?o = p:maia )
}

Here is the response:

[{
  "s": "<http://example.org/kennedy/john>",
  "o": "<http://example.org/kennedy/maia>",
  "path": [{
      "s": "http://example.org/kennedy/john",
      "_:ANON9088488689022242345": "http://xmlns.com/foaf/0.1/knows",
      "o": "http://example.org/kennedy/bob"
    }, {
      "s": "http://example.org/kennedy/bob",
      "_:ANON9088488689022242345": "http://xmlns.com/foaf/0.1/knows",
      "o": "http://example.org/kennedy/maia"
    }
  ],
  "length": "\"2\"^^xs:unsignedLong"
}]

By default, xdmp:shortestPath finds the shortest path from the subject (john) to the object (maia).

The FILTERs in this query are optional:

  • ?s: Constrains the start of the shortest path search on the subject. If left off, the query would display all the shortest paths that lead to the end object of a triple (in this case, to maia).

  • ?o: Constrains the end of the shortest path search on the object. If left off, the query would display all the shortest paths that start from the subject of a triple (in this case, from john).

For large graphs, apply FILTER on ?length to stop the shortest path search after reaching a specified number of hops.

The Shortest Path from Maia to John

What about getting from maia (object) to john (subject) instead? In this case, use xdmp:pattern to reverse the direction of the edges:

## query
PREFIX xdmp: <http://marklogic.com/xdmp#>
PREFIX p: <http://example.org/kennedy/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/knows>
SELECT *
where {
  (
    [xdmp:start ?s]
    [xdmp:end ?o]
    [xdmp:pattern "?s foaf:|^foaf: ?o"] 
  ) xdmp:shortestPath (
    [xdmp:path ?path] 
    [xdmp:length ?length] 
  )
  FILTER (?s = p:maia && ?o = p:john)
} 

xdmp:pattern takes a triple pattern lookup as a string. This string uses the OR operator (|) along with the inverse operator (^) to implement an inverse property path. This inverse property path reverses the relationship between subject and object, forcing the shortest path computation to start with object instead.

That query renders this result:

[{
  "s": "<http://example.org/kennedy/maia>",
  "o": "<http://example.org/kennedy/john>",
  "path": [{
      "o": "http://example.org/kennedy/bob",
      "s": "http://example.org/kennedy/maia"
    }, {
      "o": "http://example.org/kennedy/john",
      "s": "http://example.org/kennedy/bob"
    }
  ],
  "length": "\"2\"^^xs:unsignedLong"
}]

Reference: SPARQL 1.1 Property Paths

This new shortest path magic property can help discover links in a knowledge graph and supplies a method to retrieve the path from start to finish--or from finish to start.