Semantics Developer's Guide — Chapter 7

Inference

In MarkLogic Semantics, and in semantic technology generally, inference is the automated discovery of new facts from a combination of data and rules for understanding that data. With semantic triples, this means that automatic procedures can generate new relationships (new facts) from existing triples.

An inference query is any SPARQL query that is affected by automatic inference, that is, by rules that a program applies without manual intervention. The W3C page describing inference, with links to related standards, can be found here: http://www.w3.org/standards/semanticweb/inference

New facts may be added to the database (forward-chaining inference), or they may be inferred at query time (backward-chaining inference), depending on the implementation. MarkLogic supports backward-chaining inference.

Automatic Inference

Automatic inference is done using rulesets and ontologies. As the name implies, automatic inference is performed automatically and can also be centrally managed. MarkLogic Semantics uses backward-chaining inference, meaning that the inference is performed at query time. This is very flexible: you can specify which rulesets and ontologies to use for each query, with defaults set per database.

Ontologies

An ontology is used to describe your data; it describes relationships in your data that can be used to infer new facts about your data. What data is related to what other data, and how is it related? In Semantics, an ontology is a set of triples that provides a semantic model of a portion of the world, a model that enables knowledge to be represented for a particular domain (relationships between people, types of publications, or a taxonomy of medications). This knowledge model is a collection of triples used to describe the relationships in your data. Different vocabularies can supply sets of terms to define concepts and relationships to represent facts.

An ontology describes what types of things exist in the domain (classes), the relationships between them (properties), and the logical ways that they can be used together. A vocabulary is composed of terms with clear definitions controlled by some internal or external authority. An ontology is a vocabulary expressed in a language like OWL (Web Ontology Language) or RDFS (Resource Description Framework Schema).

For example, the ontology triple ex:color owl:equivalentProperty ex:hue states that hue and color are equivalent properties.

This SPARQL Update example inserts that ontology triple into a graph.

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX ex: <http://example.org/>

INSERT DATA 
{ 
GRAPH <http://marklogic.com/semantics/products/inf-1> 
{
ex:color owl:equivalentProperty ex:hue .
}
}
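
To make the effect of that ontology triple concrete, here is a minimal sketch in plain JavaScript of how an equivalence between two properties can widen a query. It is a toy model over in-memory arrays, not MarkLogic's implementation; the names data, equivalents, and subjectsWith, and the two shirt resources, are hypothetical.

```javascript
// Toy model: each triple is a [subject, predicate, object] array.
const data = [
  ["ex:shirt1", "ex:color", "blue"],
  ["ex:shirt2", "ex:hue", "blue"],
  ["ex:color", "owl:equivalentProperty", "ex:hue"],
];

// Collect every predicate declared equivalent to p, in either direction.
function equivalents(triples, p) {
  const seen = new Set([p]);
  let changed = true;
  while (changed) {
    changed = false;
    for (const [s, pred, o] of triples) {
      if (pred !== "owl:equivalentProperty") continue;
      if (seen.has(s) && !seen.has(o)) { seen.add(o); changed = true; }
      if (seen.has(o) && !seen.has(s)) { seen.add(s); changed = true; }
    }
  }
  return seen;
}

// Match the pattern "?s p o", treating equivalent predicates as interchangeable.
function subjectsWith(triples, p, o) {
  const preds = equivalents(triples, p);
  return triples
    .filter(([, pred, obj]) => preds.has(pred) && obj === o)
    .map(([s]) => s);
}

console.log(subjectsWith(data, "ex:color", "blue"));
// prints [ 'ex:shirt1', 'ex:shirt2' ]
```

Without the equivalence triple, the same lookup would return only ex:shirt1, because ex:shirt2 records its color under ex:hue.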

You may want to use an ontology you have created to model your business or your area of research, and use that along with one or more rulesets to discover additional information about your data.

There are a number of ways to choose an ontology used to do inference:

  • Use FROM or FROM NAMED/GRAPH in the query to specify what data is being accessed. Ontologies are organized by collection/named graph.
  • Use default-graph= and named-graph= options to sem:sparql or sem:sparql-update.
  • Use a cts:query to include/exclude data to be queried. Ontologies can be organized by directory, or anything else that a cts:query can find.
  • Add the ontology to an in-memory store, and query across both the database and the in-memory store. In this case, the ontology is not stored in the database, and can be manipulated and changed for each query.
  • Add the ontology to a ruleset as axiomatic triples. Axiomatic triples are triples that the ruleset says are always true - indicated by having an empty WHERE clause in the rule. You can then choose to include the ontologies in certain ruleset files or not at query time.

Here is a JavaScript example of a SPARQL query where an ontology is added to an in-memory store:

var sem = require("/MarkLogic/semantics.xqy"); 

var inmem = sem.inMemoryStore(sem.rdfParse(' \
prefix ch: <http://marklogic.com/semantics/cheeses/> \
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> \
prefix owl: <http://www.w3.org/2002/07/owl#> \
prefix dcterms: <http://purl.org/dc/terms/> \
                                                     \
ch:FreshGoatsCheese owl:intersectionOf ( \
    ch:SoftFreshCheese \
    [ owl:hasValue ch:goatsMilk ; \
      owl:onProperty ch:milkSource ] \
  ) .',"turtle"));
var rules = sem.rulesetStore(
  ["intersectionOf.rules","hasValue.rules"],
  [inmem,sem.store()])
 
sem.sparql(" \
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> \
prefix dcterms: <http://purl.org/dc/terms/> \
prefix f: <http://linkedrecipes.org/schema/> \
prefix ch: <http://marklogic.com/semantics/cheeses/> \
                                                     \
select ?title ?ingredient WHERE { \
  ?recipe dcterms:title ?title ; \
          f:ingredient [ \
            a ch:FreshGoatsCheese ; \
            rdfs:label ?ingredient] \
}",[],[],rules) 

The query searches for recipes that use a fresh soft cheese made of goat's milk, and returns the title of each recipe along with the label of the matching ingredient. To get results back from this query, you would need to have a triplestore of recipes, along with some triples describing cheese made from goat's milk.

Rulesets

A ruleset is a set of inference rules, rules that can be used to infer additional facts. Rulesets are used by the inference engine in MarkLogic to infer new triples at query time from existing triples. A ruleset may be built up by importing other rulesets.

When inference is done at query time using rulesets, it is referred to as 'backward chaining' inference. The rules are applied at query time; each SPARQL query looks at the specified ruleset and creates new triples as a result. Because nothing is inferred when data is loaded, ingestion and indexing are faster than with forward chaining, but queries are potentially a bit slower.

For example, if I know that John lives in London and London is in England, I (as a human) know that John lives in England. I inferred that fact. Similarly, if there are triples in the database that say that John lives in London and London is in England, and there are triples that express the meaning of 'lives in' and 'is in' as part of an ontology, MarkLogic can infer that John lives in England. When you query your data for all the people that live in England, John will be included in the results.

Here is a simple rule to express the concept of 'lives in':

# geographic rules for inference
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.com/>
PREFIX gn: <http://www.geonames.org/ontology/>

rule "livesIn" CONSTRUCT {
  ?person ex:livesIn ?place2
} {
  ?person ex:livesIn ?place1 .
  ?place1 gn:parentFeature ?place2
}

This rule states (reading from the bottom up): if place1 is in (has a parentFeature) place2, and a person lives in place1, then that person also lives in place2.
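
As an illustration of what applying this rule involves, the following plain JavaScript sketch runs the same logic over a small in-memory array until no new triples appear. It is a toy model under stated assumptions, not how MarkLogic evaluates rules; the function applyLivesIn and the sample triples are hypothetical.

```javascript
// Asserted triples, as [subject, predicate, object] arrays.
const asserted = [
  ["ex:John", "ex:livesIn", "gn:London"],
  ["gn:London", "gn:parentFeature", "gn:England"],
  ["gn:England", "gn:parentFeature", "gn:UnitedKingdom"],
];

// Apply the "livesIn" rule repeatedly until it produces nothing new.
function applyLivesIn(triples) {
  const key = (t) => t.join(" ");
  const known = new Set(triples.map(key));
  const all = [...triples];
  let added = true;
  while (added) {
    added = false;
    for (const [person, p1, place1] of all.slice()) {
      if (p1 !== "ex:livesIn") continue;
      for (const [s, p2, place2] of all.slice()) {
        if (p2 !== "gn:parentFeature" || s !== place1) continue;
        const inferred = [person, "ex:livesIn", place2];
        if (!known.has(key(inferred))) {
          known.add(key(inferred));
          all.push(inferred);
          added = true;
        }
      }
    }
  }
  return all;
}

const withInferred = applyLivesIn(asserted);
const inEngland = withInferred
  .filter(([, p, o]) => p === "ex:livesIn" && o === "gn:England")
  .map(([s]) => s);
console.log(inEngland); // prints [ 'ex:John' ]
```

Because the rule's output can match its own input pattern, one pass is not enough: the second pass is what places John in gn:UnitedKingdom.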

In general, inference is more expensive as you add more (and more complex) rules. MarkLogic allows you to apply just the rulesets you need for each query. For convenience, you can specify the default rulesets for a database, but you can also ignore those defaults for some queries. It is possible to override the default ruleset association to allow querying without using inferencing and/or querying with alternative rulesets.

Using Rulesets

Inference rules enable you to search over both asserted triples and inferred triples. The semantic inference engine uses rulesets to create new triples from existing triples at query time. You can associate one or more rulesets with a database, so that by default, queries made against that database will include the ruleset. You can also specify one or more rulesets for each query at query time.

A ruleset location is either a URI in the Schemas database for the database you are using (like /rules/livesIn.rules), or a file name in <MarkLogic Install Directory>/Config, which contains the standard, pre-defined rulesets such as RDFS, RDFS-Plus, and OWL Horst. The pre-defined rulesets can be specified by simple file name rather than by URI.

These standards-based rulesets (RDFS, RDFS-Plus, and OWL Horst) are included with MarkLogic. Each ruleset has two versions: the full ruleset (xxx-full.rules) and the optimized version (xxx.rules). The components of each of these rulesets are available separately so that you can do fine-grained inference for queries. You can create your own rulesets by importing some of those rulesets and/or writing your own rules.

To see these rulesets (in Linux), go to your MarkLogic install directory, then the Config directory under that (/<MarkLogic_install_dir>/Config/*.rules). There you will see a set of files with a .rules extension.

/opt/MarkLogic/Config/*.rules

Each of these .rules files is a ruleset. If you open one in a text editor you will see that the rulesets are componentized - that is, they are defined in small component rulesets, then built up into larger rulesets. Here is an example from the ruleset domain.rules:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

...

rule "domain rdfs2" CONSTRUCT {
  ?x a ?c
} {
  ?x ?p ?y .
  ?p rdfs:domain ?c
}

In this example, a is shorthand for rdf:type. The rule states that if the triple patterns in the second set of braces all match (there is a triple x p y, and p has domain c, meaning that the subject of any triple with predicate p belongs to class c), then the triple in the first set of braces is constructed (x is a c).
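
The match-then-construct reading of a rule can be sketched generically. The plain JavaScript below is an illustrative model only (the rule object shape and the functions unify and applyRule are hypothetical, not a MarkLogic API): it finds every variable binding that satisfies all patterns in the second set of braces, then fills in the template from the first set.

```javascript
// A rule as data: "where" holds the patterns, "construct" the template.
// Terms starting with "?" are variables.
const domainRule = {
  where: [["?x", "?p", "?y"], ["?p", "rdfs:domain", "?c"]],
  construct: [["?x", "rdf:type", "?c"]],
};

const isVar = (term) => term.startsWith("?");

// Extend a binding so that the pattern matches the triple; null on mismatch.
function unify(pattern, triple, binding) {
  const next = { ...binding };
  for (let i = 0; i < 3; i++) {
    const term = pattern[i];
    if (isVar(term)) {
      if (term in next && next[term] !== triple[i]) return null;
      next[term] = triple[i];
    } else if (term !== triple[i]) {
      return null;
    }
  }
  return next;
}

// Join the patterns one at a time, then instantiate the template.
function applyRule(rule, triples) {
  let bindings = [{}];
  for (const pattern of rule.where) {
    const extended = [];
    for (const b of bindings) {
      for (const t of triples) {
        const nb = unify(pattern, t, b);
        if (nb) extended.push(nb);
      }
    }
    bindings = extended;
  }
  const out = new Map(); // keyed by text form to drop duplicates
  for (const b of bindings) {
    for (const tpl of rule.construct) {
      const t = tpl.map((term) => (isVar(term) ? b[term] : term));
      out.set(t.join(" "), t);
    }
  }
  return [...out.values()];
}

const sample = [
  ["ex:alice", "ex:teaches", "ex:math101"],
  ["ex:teaches", "rdfs:domain", "ex:Teacher"],
];
console.log(applyRule(domainRule, sample));
// prints [ [ 'ex:alice', 'rdf:type', 'ex:Teacher' ] ]
```

Only one binding survives both patterns here: ex:teaches is the sole predicate with a declared domain, so ex:alice is the only subject that gains a type.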

By using a building block approach to creating and using rules, you can enable only the rules you really need, so that your query can be as efficient as possible.

If you have a default ruleset associated with a database and you specify a ruleset as part of your query, both rulesets will be used. Rulesets are additive. Use the no-default-rulesets option of sem:store to ignore the default rulesets.
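
The additive behavior can be summarized in a few lines of plain JavaScript. This is only a sketch of the selection logic described above: effectiveRulesets is a hypothetical helper, and the option string is the one passed to sem:store in the override example later in this chapter.

```javascript
// Hypothetical helper: which rulesets end up applying to one query?
function effectiveRulesets(databaseDefaults, queryRulesets, storeOptions = []) {
  const skipDefaults = storeOptions.includes("no-default-rulesets");
  const combined = skipDefaults
    ? [...queryRulesets]
    : [...databaseDefaults, ...queryRulesets];
  return [...new Set(combined)]; // a ruleset named twice still applies once
}

const defaults = ["/rules/livesin.rules"];

// Default rulesets plus the one named in the query.
console.log(effectiveRulesets(defaults, ["subClassOf.rules"]));
// prints [ '/rules/livesin.rules', 'subClassOf.rules' ]

// Defaults ignored: only the query's ruleset applies.
console.log(effectiveRulesets(defaults, ["subClassOf.rules"], ["no-default-rulesets"]));
// prints [ 'subClassOf.rules' ]
```

Passing the option with no query rulesets yields an empty list, which corresponds to querying without inference.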

This example uses the rdfs.rules ruleset from the <MarkLogic-install-dir>/Config location:

import module namespace sem = "http://marklogic.com/semantics" 
  at "/MarkLogic/semantics.xqy";
let $sup :=
'
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

INSERT DATA
{ <someMedicalCondition> rdf:type <osteoarthritis> .
  <osteoarthritis> rdfs:subClassOf <bonedisease> . }'
return sem:sparql-update($sup)
; (: transaction separator :)

let $sq := 
'PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX d: <http://diagnoses#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?diagnosis
WHERE { ?diagnosis rdf:type <bonedisease>. } '
  
let $rs := sem:ruleset-store("rdfs.rules", sem:store())  
(: rdfs.rules is a predefined ruleset in <MarkLogic-install-dir>/Config :)
return sem:sparql($sq, (), (), $rs)  
(: with the rules applied, the query for <bonedisease> returns <someMedicalCondition>, because its type <osteoarthritis> is a subclass of <bonedisease> :)

You can manage rulesets using the REST Management API or the XQuery Admin API. For details see the default-ruleset property in PUT /manage/v2/databases/{id|name}/properties and admin:database-add-default-ruleset.

Choosing Rulesets for Queries

You can choose which rulesets to use for your SPARQL query by using sem:ruleset-store. The sem:ruleset-store function returns a set of triples that result from the application of the ruleset to the triples defined by the sem:store function provided in $store (for example, 'all of the triples that can be inferred from the rule').

This statement specifies the rdfs.rules ruleset as part of sem:ruleset-store:

let $rdfs-store := sem:ruleset-store("rdfs.rules",sem:store() )

This says: let $rdfs-store contain the triples from sem:store together with the triples derived from them by inference using rdfs.rules. If no arguments are provided to sem:store, the query uses the triples in the current database's triple index. The built-in functions sem:store and sem:ruleset-store are used to define the triples over which to query and the rulesets (if any) to use with the query. The $store definition includes a ruleset, as well as other ways of restricting a query's domain, such as a cts:query.

This example executes a SPARQL query against the data in $triples, using the inference rules rdfs:subClassOf and rdfs:subPropertyOf:

xquery version "1.0-ml";
import module namespace sem = "http://marklogic.com/semantics" 
  at "/MarkLogic/semantics.xqy";

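(: $triples is assumed to hold sem:triple values built or parsed elsewhere :)
sem:sparql('
  PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  SELECT * { ?c a skos:Concept; rdfs:label ?l }',
  (), (),
  sem:ruleset-store(("subClassOf.rules","subPropertyOf.rules"),
    sem:in-memory-store($triples))
)

Specifying a Default Ruleset for a Database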

In addition to the Admin API with XQuery, you can use the Admin UI to set the default rulesets that a database uses for queries. In the Admin UI, click the database name under Databases in the left-hand navigation to expand its list of settings, then scroll to Default Rulesets.

Click Default Rulesets to see the rulesets currently associated with the Documents database.

To add your own ruleset, click Add to enter the name and location of the ruleset.

Your custom rulesets will be located in the Schemas database. The rulesets supplied by MarkLogic are located in the Config directory under your MarkLogic installation directory (/<MarkLogic_install_dir>/Config/*.rules).

Click more items to associate additional rulesets with this database.

Security for rulesets is managed the same way that security is handled for MarkLogic schemas.

You can use Query Console to find out what default rulesets are currently associated with a database using the admin:database-get-default-rulesets function.

This example will return the name and location of the default rulesets for the Documents database:

xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin" 
  at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()
let $dbid := admin:database-get-id($config, "Documents")
return admin:database-get-default-rulesets($config, $dbid)

=>

<default-ruleset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns="http://marklogic.com/xdmp/database">
    <location>/rules/livesin.rules</location>
</default-ruleset>

Overriding the Default Ruleset

You can turn off or ignore a ruleset set as the default on a database. In this example, a SPARQL query is executed against the database, ignoring the default rulesets and using the rdfs:subClassOf inference ruleset for the query:

xquery version "1.0-ml";
import module namespace sem = "http://marklogic.com/semantics" 
  at "/MarkLogic/semantics.xqy";

sem:sparql('
  PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  SELECT * { 
    ?c a skos:Concept; 
       rdfs:label ?l }',
  (), (),
  sem:ruleset-store("subClassOf.rules", sem:store("no-default-rulesets"))
)

A default ruleset can be turned off or ignored in several ways: as part of a query, through the Admin UI, or by using XQuery or JavaScript to specify the rulesets.

You can also change the default rulesets for a database in the Admin UI by deleting a default ruleset from that database. In the Admin UI, click the database name in the left-hand navigation panel, then click Default Rulesets.

On the Database: Documents panel, select the default ruleset you want to remove, and click delete. Click OK when you are done. The ruleset is no longer the default ruleset for this database.

This action does not delete the ruleset, only removes it as the default ruleset.

You can also use admin:database-delete-default-ruleset with XQuery to change a database's default ruleset. This example removes subClassOf.rules as the default ruleset for the Documents database.

xquery version "1.0-ml"; 
import module namespace admin = "http://marklogic.com/xdmp/admin" 
  at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()
let $dbid := admin:database-get-id($config, "Documents")
let $rules := admin:database-ruleset("subClassOf.rules")
let $c := admin:database-delete-default-ruleset($config, $dbid, $rules)

return admin:save-configuration($c)

Creating a Ruleset

One way to think of inference rules is as a way to construct some inferred triples, then search over the new data set (one that includes the database - the sem:store - plus the inferred triples). The rulesets supplied with MarkLogic have the .rules extension and are located in the Config directory under the install directory:

/<MarkLogic_install_dir>/Config/*.rules

When you create your own rulesets, you will store them in the Schemas database.

The syntax of an inference rule uses the grammar of a SPARQL CONSTRUCT with the WHERE clause restricted to a combination of only triple patterns, joins, and filters.

Each rule must have a unique name. If a rule name is used more than once, the operation will return an error. An import statement in the prolog of a ruleset file includes all rules from the ruleset found at the location given.

This ruleset from the /<MarkLogic Install>/Config directory includes four rules named rdfs8, rdfs9, rdfs10, and rdfs11. The ruleset includes prefixes, and each rule has a rule name and a CONSTRUCT clause:

PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>

rule "rdfs8" CONSTRUCT {
  ?c rdfs:subClassOf rdfs:Resource
} {
  ?c a rdfs:Class
}

rule "rdfs9" CONSTRUCT {
  ?x a ?c2
} {
  ?x a ?c1 .
  ?c1 rdfs:subClassOf ?c2 .
  FILTER(?c1!=?c2)
}

rule "rdfs10" CONSTRUCT {
  ?c rdfs:subClassOf ?c
} {
  ?c a rdfs:Class
}

rule "rdfs11" CONSTRUCT {
  ?c1 rdfs:subClassOf ?c3
} {
  ?c1 rdfs:subClassOf ?c2 .
  ?c2 rdfs:subClassOf ?c3 .
  FILTER(?c1!=?c2 && ?c2!=?c3 && ?c1!=?c3)
}

Note that two of the rules also include a FILTER clause.

This ruleset from the same directory imports smaller rulesets to make a ruleset approximating the full RDFS ruleset:

import "rdf.rules"
import "domain.rules"
import "range.rules"
import "subPropertyOf.rules"
import "subClassOf.rules"

PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>

# Miscellaneous other axiomatic triples

rule "rdfs properties" CONSTRUCT {
  rdf:type rdfs:domain rdfs:Resource .
  rdf:type rdfs:range rdfs:Class .
} {}

If a ruleset at a given location is imported more than once, the effect of the import will be the same as if it had only been imported once. If a ruleset is imported more than once from different locations (for example from the /<MarkLogic Install>/Config directory and from the Schemas database directory), MarkLogic will assume they are different rulesets and raise an error if they contain duplicate rule names.
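
The import behavior described above can be modeled as a small loader. The plain JavaScript below is a sketch under stated assumptions, not MarkLogic's loader: the files object stands in for ruleset files keyed by location, the rule names are abbreviated, and loadRuleset is a hypothetical function.

```javascript
// Stand-in for ruleset files, keyed by location.
const files = {
  "rdfs.rules": { imports: ["domain.rules", "subClassOf.rules"], rules: ["rdfs properties"] },
  "domain.rules": { imports: [], rules: ["domain rdfs2"] },
  "subClassOf.rules": { imports: [], rules: ["rdfs9", "rdfs11"] },
  "my.rules": { imports: ["rdfs.rules", "subClassOf.rules"], rules: ["lives in"] },
};

// Load a ruleset and its imports; returns a Map of rule name to location.
function loadRuleset(location, source, loaded = new Set(), ruleNames = new Map()) {
  if (loaded.has(location)) return ruleNames; // same location again: no effect
  loaded.add(location);
  const file = source[location];
  if (!file) throw new Error(`ruleset not found: ${location}`);
  for (const imp of file.imports) loadRuleset(imp, source, loaded, ruleNames);
  for (const name of file.rules) {
    if (ruleNames.has(name)) {
      throw new Error(`duplicate rule "${name}" in ${location} and ${ruleNames.get(name)}`);
    }
    ruleNames.set(name, location);
  }
  return ruleNames;
}

// subClassOf.rules is imported twice (directly and via rdfs.rules) without error.
console.log([...loadRuleset("my.rules", files).keys()]);
// prints [ 'domain rdfs2', 'rdfs9', 'rdfs11', 'rdfs properties', 'lives in' ]

// The same rules copied to a second location count as a different ruleset.
const withCopy = {
  ...files,
  "/rules/subClassOf.rules": files["subClassOf.rules"],
  "my.rules": { imports: ["subClassOf.rules", "/rules/subClassOf.rules"], rules: [] },
};
try {
  loadRuleset("my.rules", withCopy);
} catch (e) {
  console.log(e.message);
}
```

Tracking loaded rulesets by location, rather than by content, is what makes the second case an error: the loader has no way to know the two files are copies.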

Here is a rule that you could create to infer geographic locations:

# geographic rules for inference
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.com/>
PREFIX gn: <http://www.geonames.org/ontology/>

rule "lives in" CONSTRUCT {
  ?person ex:livesIn ?place2
} {
  ?person ex:livesIn ?place1 .
  ?place1 gn:parentFeature ?place2
}

In Query Console, you can add the livesIn rule to the Schemas database using xdmp:document-insert. Make sure the Schemas database is selected as the Content Source before you run the code:

xquery version "1.0-ml";

xdmp:document-insert(
'/rules/livesin.rules',
text{
'
# geographic rules for inference
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.com/>
PREFIX gn: <http://www.geonames.org/ontology/>

rule "lives in" CONSTRUCT {
  ?person ex:livesIn ?place2
} {
  ?person ex:livesIn ?place1 .
  ?place1 gn:parentFeature ?place2
}
'
}
)

The example stores the livesin.rules ruleset in the Schemas database, in the rules directory (/rules/livesin.rules).

You can include your ruleset as part of inference in the same way you can include the supplied rulesets. MarkLogic will check the location for rules in the Schemas database and then the location for the supplied rulesets.

Memory Available for Inference

The default, maximum, and minimum inference size values are all per-query, not per-system. The maximum inference size is the memory limit for inference. The admin:appserver-set-max-inference-size function allows the administrator to set this limit. A query cannot exceed it.

The default inference size is the amount of memory available to use for inference. By default, the amount of memory available for inference is 100 MB (size=100). If you run out of memory and get an inference full error (INFFULL), increase the default inference size using admin:appserver-set-default-inference-size or by changing the default inference size on the HTTP Server Configuration page in the Admin UI.

You can also set the inference memory size in your query as part of sem:ruleset-store. This query sets the memory size for inference to 300mb (size=300):

let $store := sem:ruleset-store(("baseball.rules", "rdfs-plus-full.rules"),
  sem:store(), ("size=300"))

If your query returns an INFFULL exception, you can increase the size passed to sem:ruleset-store.

Other Ways to Achieve Inference

Before going down the path of automatic inference, you should consider other ways to achieve inference, which may be more appropriate for your use case.

Inference Using Paths

In many cases, you can do inference by rewriting your query. For example, you can do some simple inference using unenumerated property paths (as explained in Property Path Expressions).

You can find all the possible types of a resource, including its supertypes, using RDFS and the '/' property path in a SPARQL query:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?type
{ 
   <http://example/thing> rdf:type/rdfs:subClassOf* ?type 
}

The result will be all the types of the resource <http://example/thing>, both asserted and inferred. The unenumerated property path expression with the asterisk (*) will look for a path that connects the subject and the object of the path by zero or more matches of a predicate (rdfs:subClassOf in the example).
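
To show what the path rdf:type/rdfs:subClassOf* walks over, here is a plain JavaScript sketch that evaluates it against a small in-memory array. It illustrates the path semantics only; the sample triples and the functions objectsOf and typesOf are hypothetical.

```javascript
const graph = [
  ["ex:thing", "rdf:type", "ex:Henley"],
  ["ex:Henley", "rdfs:subClassOf", "ex:Shirt"],
  ["ex:Shirt", "rdfs:subClassOf", "ex:Clothing"],
];

// All objects o such that the triple (s, p, o) exists.
function objectsOf(triples, s, p) {
  return triples.filter(([ts, tp]) => ts === s && tp === p).map(([, , o]) => o);
}

// Evaluate: resource rdf:type/rdfs:subClassOf* ?type
function typesOf(triples, resource) {
  // One rdf:type step; zero subClassOf steps already counts as a match.
  const found = new Set(objectsOf(triples, resource, "rdf:type"));
  const queue = [...found];
  // Then follow rdfs:subClassOf any number of times, visiting each class once.
  while (queue.length > 0) {
    const cls = queue.shift();
    for (const parent of objectsOf(triples, cls, "rdfs:subClassOf")) {
      if (!found.has(parent)) {
        found.add(parent);
        queue.push(parent);
      }
    }
  }
  return [...found];
}

console.log(typesOf(graph, "ex:thing"));
// prints [ 'ex:Henley', 'ex:Shirt', 'ex:Clothing' ]
```

The visited set keeps the walk finite even if the class hierarchy contains a cycle.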

For example, you could use this query to find the products whose type is 'shirt' or any subclass of 'shirt':

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.com/>
SELECT ?product
WHERE
  {
    ?product rdf:type/rdfs:subClassOf* ex:Shirt
  }

Or you could use a property path to find people who live in England:

PREFIX gn: <http://www.geonames.org/ontology/>
PREFIX ex: <http://www.example.org/>

SELECT ?p 
{
  ?p ex:livesIn/gn:parentFeature "England" 
}

For more about property paths and how to use them with semantics, see Property Path Expressions.

Materialization

A possible alternative to automatic inference is materialization, or forward-chaining inference, where you perform inference on your data as a whole, not as part of a query; and then store those inferred triples to be queried later. Materialization will work best for triple data that is fairly static, performing inference with rules and ontologies that do not change often.

To materialize these triples, construct SPARQL queries for the rules that you want to use for inference and run them on your data.
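
The trade-off can be seen in a small plain JavaScript model. This sketch is illustrative only and does not use any MarkLogic API: materializeTypes is a hypothetical function that writes the subclass-derived rdf:type triples back into the store, and the medical class names are sample data.

```javascript
// Forward chaining for one rule: if x is a c1 and c1 is a subclass of c2,
// then store "x is a c2". Repeats until nothing new is added.
function materializeTypes(store) {
  const key = (t) => t.join(" ");
  const known = new Set(store.map(key));
  let added = true;
  while (added) {
    added = false;
    for (const [x, p, c1] of store.slice()) {
      if (p !== "rdf:type") continue;
      for (const [s, q, c2] of store.slice()) {
        if (q !== "rdfs:subClassOf" || s !== c1 || c1 === c2) continue;
        const t = [x, "rdf:type", c2];
        if (!known.has(key(t))) {
          known.add(key(t));
          store.push(t); // inferred triple is stored alongside asserted ones
          added = true;
        }
      }
    }
  }
  return store;
}

// After materialization a query is a plain lookup, with no rules involved.
const instancesOf = (store, cls) =>
  store.filter(([, p, o]) => p === "rdf:type" && o === cls).map(([s]) => s);

const store = [
  ["ex:cond1", "rdf:type", "ex:osteoarthritis"],
  ["ex:osteoarthritis", "rdfs:subClassOf", "ex:bonedisease"],
];
materializeTypes(store);
const first = instancesOf(store, "ex:bonedisease");

// New data arrives: the stored inferences are now out of date.
store.push(["ex:cond2", "rdf:type", "ex:osteoarthritis"]);
const stale = instancesOf(store, "ex:bonedisease");

// Re-running materialization brings them back in step.
materializeTypes(store);
const refreshed = instancesOf(store, "ex:bonedisease");

console.log(first, stale, refreshed);
// prints [ 'ex:cond1' ] [ 'ex:cond1' ] [ 'ex:cond1', 'ex:cond2' ]
```

The stale result in the middle is the maintenance cost of this approach: every change to the data or the ontology needs a matching update to the stored inferences.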

This process of materialization may be time consuming and will require a significant amount of additional storage for the inferred triples. You will need to write code or scripts to handle transactions and security, and to handle changes in data and ontologies.

These tasks are all handled for you if you choose automatic inference.

Materialization can be very useful if you need very fast queries and you are prepared to do the pre-processing work up front and use the extra disk space for the inferred triples. You may want to use this type of inference in situations where the data, rulesets, and ontologies do not change very much.

Using Inference with the REST API

When you execute a SPARQL query or update using the REST Client API methods POST /v1/graphs/sparql and GET /v1/graphs/sparql, you can specify rulesets through request parameters default-rulesets and rulesets. If you omit both of these parameters, the default rulesets for the database are applied.

After you set rdfs.rules and equivalentProperties.rules as the default rulesets for the database, you can perform this SPARQL query using REST from the Query Console:

xquery version "1.0-ml";
import module namespace sem = "http://marklogic.com/semantics" 
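  at "/MarkLogic/semantics.xqy";
(: prefix for the response metadata returned by xdmp:http-post :)
declare namespace http = "xdmp:http";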

let $uri := "http://localhost:8000/v1/graphs/sparql"
return
let $sparql :='
PREFIX rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX rdfs:   <http://www.w3.org/2000/01/rdf-schema#>
PREFIX prod:   <http://example.com/products/> 
PREFIX ex:     <http://example.com/>

SELECT ?product
FROM <http://marklogic.com/semantics/products/inf-1>
WHERE 
 {
  ?product  rdf:type  ex:Shirt ;
  ex:color  "blue"
}
'
let $response :=
xdmp:http-post($uri,
<options xmlns="xdmp:http">
  <authentication method="digest">
    <username>admin</username>
    <password>admin</password> 
  </authentication>
  <headers>
    <content-type>application/sparql-query</content-type>
    <accept>application/sparql-results+xml</accept>
  </headers>
</options>,
text {$sparql})
return
  ($response[1]/http:code, $response[2]/node())

=>

    product
<http://example.com/products/1001>
<http://example.com/products/1002>
<http://example.com/products/1003>

Using the REST endpoint and curl (with the same default rulesets for the database), the same query would look like this:

curl --anyauth --user Admin:janem-3 -i -X POST \
  -H "Content-type:application/x-www-form-urlencoded" \
  -H "Accept:application/sparql-results+xml" \
  --data-urlencode query='PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX prod: <http://example.com/products/>
PREFIX ex: <http://example.com/>
SELECT ?product FROM <http://marklogic.com/semantics/products/inf-1>
WHERE {?product rdf:type ex:Shirt ; ex:color "blue"}' \
  http://localhost:8000/v1/graphs/sparql

See Using Semantics with the REST Client API and Querying Triples in the REST Application Developer's Guide for more information.

Summary of APIs Used for Inference

MarkLogic has a number of APIs that can be used for semantic inference. Semantic APIs are available for use as part of the actual inference query (specifying which triples to query and which rules to apply). Database APIs can be used to choose rulesets to be used for inference by a particular database. Management APIs can control the memory used by inference by either an appserver or a taskserver.

Semantic APIs

MarkLogic Semantic APIs can be used for managing triples for inference and for specifying rulesets to be used with individual queries (or by default with databases). Stores are used to identify the subset of triples to be evaluated by the query.

Semantic API Description
sem:store The query argument of sem:sparql accepts sem:store to indicate the source of the triples to be evaluated as part of the query. If multiple sem:store constructors are supplied, the triples from all the sources are merged and queried together. The sem:store can contain one or more options along with a cts:query to restrict the scope of the triples to be evaluated as part of the sem:sparql query. The sem:store parameter can also be used with sem:sparql-update and sem:sparql-values.
sem:in-memory-store Returns a sem:store that represents the set of triples from the sem:triple values passed in as an argument. The default rulesets configured on the current database have no effect on a sem:store created with sem:in-memory-store.
sem:ruleset-store Returns a new sem:store that represents the set of triples derived by applying the ruleset to the triples in sem:store in addition to the original triples.

The sem:in-memory-store function should be used with sem:sparql in preference to the deprecated sem:sparql-triples function (available in MarkLogic 7). The cts:query argument to sem:sparql has also been deprecated.

If you call sem:sparql-update with a store that is based on in-memory triples (that is, a store that was created by sem:in-memory-store) you will get an error because you cannot update triples that are in memory and not on disk. Similarly, if you pass in multiple stores to sem:sparql-update and any of them is based on in-memory triples you will get an error.

Database Ruleset APIs

These Database Ruleset APIs are used to manage the rulesets associated with databases.

Ruleset API Description
admin:database-ruleset The ruleset element to be used for inference on a database. One or more rulesets can be used for inference. By default, no ruleset is configured.
admin:database-get-default-rulesets Returns the default ruleset(s) for a database.
admin:database-add-default-ruleset Adds a ruleset to be used for inference on a database. One or more rulesets can be used for inference. By default, no ruleset is configured.
admin:database-delete-default-ruleset Deletes the default ruleset used by a database for inference.

Management APIs

These Management APIs are used to manage the memory sizing (default, minimum, and maximum) allotted for inference.

Management API (admin:) Description
admin:appserver-set-default-inference-size Specifies the default value for any request's inference size on this application server.
admin:appserver-get-default-inference-size Returns the default amount of memory (in megabytes) that can be used by sem:store for inference by an application server.
admin:taskserver-set-default-inference-size Specifies the default value for any request's inference size on this task server.
admin:taskserver-get-default-inference-size Returns the default amount of memory (in megabytes) that can be used by sem:store for inference by a task server.
admin:appserver-set-max-inference-size Specifies the upper bound for any request's inference size. The inference size is the maximum amount of memory in megabytes allowed for sem:store performing inference on this application server.
admin:appserver-get-max-inference-size Returns the maximum amount of memory (in megabytes) that can be used by sem:store for inference by an application server.
admin:taskserver-set-max-inference-size Specifies the upper bound for any request's inference size. The inference size is the maximum amount of memory in megabytes allowed for sem:store performing inference on this task server.
admin:taskserver-get-max-inference-size Returns the maximum amount of memory (in megabytes) that can be used by sem:store for inference by a task server.
