Shortest Path Graph Algorithm
Many real-world problems--such as ones within navigational systems, network routing, and circuit design--can be represented as shortest path problems.
In MarkLogic Server, triples and the Semantics capability can model and resolve such shortest path problems. Each subject and object can be considered as nodes separated by predicate edges. Crossing these edges comes at a cost, so finding the path with the fewest edges is crucial.
Likewise, the ways that recommendation systems and natural language processing applications use knowledge graphs to model relationships between people, words, or concepts can be looked at as shortest path problems.
Shortest path queries can help discover relevant facts in knowledge graphs and piece together information that may have been missed before.
To facilitate such queries, MarkLogic 12.0 EA1 introduces the SPARQL magic property, xdmp:shortestPath
.
xdmp:shortestPath
makes use of these named arguments:
xdmp:start
xdmp:end
xdmp:pattern
xdmp:path
xdmp:length
The framework to support xdmp:shortestPath
can easily allow the addition of future SPARQL magic properties like xdmp:unnest
.
Reference: W3C’s SPARQL/Extensions/Computed Properties
To better understand the shortest path problem, consider this simple data set:
@prefix p: <http://example.org/kennedy/> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . p:john foaf:knows p:bob . p:john foaf:knows p:alice . p:john foaf:knows p:zak . p:bob foaf:knows p:alice . p:bob foaf:knows p:maia . p:alice foaf:knows p:zak . p:zak foaf:knows p:drew . p:drew foaf:knows p:maia .
This diagram represents the data set:
The shortest path from john
to maia
would be the one with the fewest edges (blue arrows) between them: through bob
.
The Shortest Path from John to Maia
This SPARQL query uses the new xdmp:shortestPath
property to find the shortest path from john
to maia
:
PREFIX xdmp: <http://marklogic.com/xdmp#> PREFIX p: <http://example.org/kennedy/> SELECT * where { ( [xdmp:start ?s ] [xdmp:end ?o] ) xdmp:shortestPath ( [xdmp:path ?path] [xdmp:length ?length] ) FILTER (?s = p:john && ?o = p:maia ) }
Here is the response:
[{ "s": "<http://example.org/kennedy/john>", "o": "<http://example.org/kennedy/maia>", "path": [{ "s": "http://example.org/kennedy/john", "_:ANON9088488689022242345": "http://xmlns.com/foaf/0.1/knows", "o": "http://example.org/kennedy/bob" }, { "s": "http://example.org/kennedy/bob", "_:ANON9088488689022242345": "http://xmlns.com/foaf/0.1/knows", "o": "http://example.org/kennedy/maia" } ], "length": "\"2\"^^xs:unsignedLong" }]
By default, xdmp:shortestPath
finds the shortest path from the subject (john
) to the object (maia
).
The FILTERs in this query are optional:
?s
: Constrains the start of the shortest path search on the subject. If left off, the query would display all the shortest paths that lead to the end object of a triple (in this case, tomaia
).?o
: Constrains the end of the shortest path search on the object. If left off, the query would display all the shortest paths that start from the subject of a triple (in this case, fromjohn
).
For large graphs, apply FILTER on ?length
to stop the shortest path search after reaching a specified number of hops.
The Shortest Path from Maia to John
What about getting from maia
(object) to john
(subject) instead? In this case, use xdmp:pattern
to reverse the direction of the edges:
## query PREFIX xdmp: <http://marklogic.com/xdmp#> PREFIX p: <http://example.org/kennedy/> PREFIX foaf: <http://xmlns.com/foaf/0.1/knows> SELECT * where { ( [xdmp:start ?s] [xdmp:end ?o] [xdmp:pattern "?s foaf:|^foaf: ?o"] ) xdmp:shortestPath ( [xdmp:path ?path] [xdmp:length ?length] ) FILTER (?s = p:maia && ?o = p:john) }
xdmp:pattern
takes a triple pattern lookup as a string. This string uses the OR operator (|
) along with the inverse operator (^
) to implement an inverse property path. This inverse property path reverses the relationship between subject
and object
, forcing the shortest path computation to start with object
instead.
That query renders this result:
[{ "s": "<http://example.org/kennedy/maia>", "o": "<http://example.org/kennedy/john>", "path": [{ "o": "http://example.org/kennedy/bob", "s": "http://example.org/kennedy/maia" }, { "o": "http://example.org/kennedy/john", "s": "http://example.org/kennedy/bob" } ], "length": "\"2\"^^xs:unsignedLong" }]
Reference: SPARQL 1.1 Property Paths
This new shortest path magic property can help discover links in a knowledge graph and supplies a method to retrieve the path from start to finish--or from finish to start.