
Semantics Developer's Guide — Chapter 11

Client-Side APIs for Semantics

MarkLogic Semantics can be accessed through client-side APIs that provide support for management of triples and graphs, SPARQL and SPARQL Update, and access to the search features of MarkLogic Server. The Java Client and Node.js Client source are available on GitHub, as are MarkLogic Sesame and MarkLogic Jena.

The chapter includes the following sections:

  • Java Client API
  • MarkLogic Sesame API
  • MarkLogic Jena API
  • Node.js Client API

Java Client API

The Java Client API enables you to use Java to manage graphs and triples, and to access SPARQL query and SPARQL Update functionality in MarkLogic (Java Client API 3.0.4 or later).

The Java Client API supports a variety of formats for RDF. For example, you can write a graph with data in Turtle syntax, read it back as N-Triples syntax, and query it using JacksonHandle and SPARQL.

The Java Client API is available on GitHub at http://github.com/marklogic/java-client-api or from the central Maven repository. Use the following Java Client API interfaces in the com.marklogic.client.semantics package for Semantic operations:

  • GraphManager
  • SPARQLQueryManager
  • SPARQLQueryDefinitions
  • GraphPermissions
  • SPARQLBindings
  • RDFTypes
  • RDFMimeTypes
  • SPARQLRuleset

The following topics cover the Java Client API semantic features in more detail:

  • Core Java Client API Concepts
  • Graph Management
  • SPARQL Query

Core Java Client API Concepts

This section provides a very brief introduction to Java Client API concepts that will help you understand the Semantics examples that use this API. For details, see the Java Application Developer's Guide.

All interactions with MarkLogic Server through the Java Client API require a DatabaseClient object. A DatabaseClient object encapsulates connection details to MarkLogic Server. Use DatabaseClientFactory.newClient to create a DatabaseClient object. For details, see Creating, Working With, And Releasing a Database Client in the Java Application Developer's Guide.

For example, the following code creates a DatabaseClient and then uses it to create a GraphManager object. The example DatabaseClient represents connection details to a MarkLogic installation on localhost, listening on port 8000, acting on the database named 'myDatabase'. The client uses the user 'myuser' and digest authentication.

import com.marklogic.client.DatabaseClientFactory;
import com.marklogic.client.DatabaseClientFactory.Authentication;
...
DatabaseClient client = DatabaseClientFactory.newClient(
    "localhost", 8000, "myDatabase", "myuser", "mypassword",
    Authentication.DIGEST);
GraphManager gmgr = client.newGraphManager();

The Java Client API commonly uses Handle objects to reference data exchanged between your application and MarkLogic Server. For example, you can read triples from a file using a FileHandle or create a StringHandle that references a serialized triple in memory. For details, see Handles in the Java Application Developer's Guide.

The following example demonstrates reading triples in Turtle format from a file into a FileHandle. The resulting handle can be used as input to operations such as GraphManager.write.

import com.marklogic.client.io.FileHandle;
...
FileHandle fileHandle = 
  new FileHandle(new File("example.ttl"))
    .withMimetype(RDFMimeTypes.TURTLE);

Graph Management

Use the GraphManager interface to perform graph management operations such as creating, reading, updating, and deleting graphs. The following list summarizes key GraphManager methods and their *As variants. For more details, see the Java Client API Documentation.

  • read, readAs: Retrieve triples from a specific graph.
  • write, writeAs: Create or overwrite a graph. If the graph already exists, the effect is the same as removing the graph and then recreating it from the input data.
  • replaceGraphs, replaceGraphsAs: Remove triples from all graphs, and then insert the quads in the input data set. Unmanaged triples are not affected. The effect is the same as first calling GraphManager.deleteGraphs and then inserting the quads.
  • merge, mergeAs: Add triples to a named graph or the default graph. If the graph does not exist, it is created.
  • mergeGraphs, mergeGraphsAs: Add quads to the graphs specified in the input quad data. Any graphs that do not already exist are created.
  • delete: Delete a specific graph.
  • deleteGraphs: Delete all graphs. Unmanaged triples are not affected.

To reference the default graph in methods that accept a graph URI as input, use GraphManager.DEFAULT_GRAPH.
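
For example, the following minimal sketch overwrites the default graph from a Turtle file; the file name is illustrative only:

gmgr.write(GraphManager.DEFAULT_GRAPH,
    new FileHandle(new File("defaults.ttl"))  // hypothetical input file
        .withMimetype(RDFMimeTypes.TURTLE));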

As you saw in the earlier example, use DatabaseClient.newGraphManager to create a GraphManager object. The following code creates a DatabaseClient and then uses it to create a GraphManager:

DatabaseClient client = DatabaseClientFactory.newClient(...);
GraphManager gmgr = client.newGraphManager();

To create (or replace) a graph, use GraphManager.write. The following example reads triples in Turtle format from a file into a FileHandle and then writes the triples to a named graph with the URI 'myExample/graphURI':

FileHandle fileHandle = 
  new FileHandle(new File("example.ttl"))
    .withMimetype(RDFMimeTypes.TURTLE);
gmgr.write("myExample/graphURI", fileHandle);

Note that if you use GraphManager.write to load quads, any graph URI in a quad is ignored in favor of the graph URI parameter passed into write.
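
For example, in the following sketch the graph URIs recorded inside a hypothetical quads.nq file are ignored, and every quad is loaded into the graph 'myExample/graphURI':

FileHandle quadsHandle =
  new FileHandle(new File("quads.nq"))  // hypothetical N-Quads file
    .withMimetype(RDFMimeTypes.NQUADS);
gmgr.write("myExample/graphURI", quadsHandle);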

To merge triples into an existing graph, use GraphManager.merge. The following example adds a single triple in Turtle format to the graph with URI 'myExample/graphUri'. The triple is passed via a StringHandle, using StringHandle.withMimetype to indicate the triple format to the operation.

StringHandle stringHandle = new StringHandle()
    .with("<http://example.org/subject2> " +
          "<http://example.org/predicate2> " +
          "<http://example.org/object2> .")
    .withMimetype(RDFMimeTypes.TURTLE);
gmgr.merge("myExample/graphUri", stringHandle);

You can also set a default MIME type on the GraphManager object rather than on each Handle. For details, see GraphManager.setDefaultMimetype in the Java Client API Documentation.
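
For example, the following brief sketch sets Turtle as the manager-wide default, so the read handle needs no explicit MIME type:

gmgr.setDefaultMimetype(RDFMimeTypes.TURTLE);
StringHandle triples = gmgr.read("myExample/graphURI", new StringHandle());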

To read the contents of a graph, use GraphManager.read. Specify the output triple format by calling setMimetype on the output Handle or by setting a default MIME type on your GraphManager object. The following example retrieves the contents of the default graph, as triples in Turtle format. The results are available as strings, through a StringHandle.

StringHandle triples = gmgr.read(
    GraphManager.DEFAULT_GRAPH,
    new StringHandle().withMimetype(RDFMimeTypes.TURTLE));
// ...work with the triples as one big string

To remove a graph, use GraphManager.delete. To remove all graphs, use GraphManager.deleteGraphs. The following example removes the graph with URI 'myExample/graphUri':

gmgr.delete("myExample/graphUri");

You can also use the GraphManager interface to manage graph permissions, either by passing permissions into individual graph operations (write, merge, and so on) or by calling explicit permission management methods such as GraphManager.writePermissions. Use GraphManager.permission to create the set of permissions to be applied to a graph.

For example, the following code replaces any permissions on the graph 'myExample/graphUri' with a new set of permissions:

import com.marklogic.client.semantics.Capability;
...
gmgr.writePermissions(
  "http://myExample/graphUri",
  gmgr.permission("role1", Capability.READ)
      .permission("role2", Capability.READ, Capability.UPDATE));

The following example adds similar permissions as part of graph merge.

gmgr.merge(
  "myExample/graphUri", someTriplesHandle,
  gmgr.permission("role1", Capability.READ)
      .permission("role2", Capability.READ, Capability.UPDATE));

SPARQL Query

Use the SPARQLQueryManager interface to query RDF datasets using SELECT, CONSTRUCT, DESCRIBE, and ASK queries; it supports both read and update queries. The SPARQLQueryDefinition interface encapsulates a query and its bindings. Use the SPARQLBindings interface to bind variables used in a SPARQL query.

Evaluating a SPARQL read or update query consists of the following basic steps:

  1. Create a query manager using DatabaseClient.newSPARQLQueryManager. For example:
    DatabaseClient client = ...;
    SPARQLQueryManager sqmgr = client.newSPARQLQueryManager();
  2. Create a query using SPARQLQueryManager.newQueryDefinition and configure the query as needed. For example:
    SPARQLQueryDefinition query = sqmgr.newQueryDefinition(
      "SELECT * WHERE { ?s ?p ?o } LIMIT 10")
      .withBinding("o", "http://example.org/object1");
  3. Evaluate the query and receive results by calling one of the execute* methods of SPARQLQueryManager. For example:
    JacksonHandle results = new JacksonHandle();
    results.setMimetype(SPARQLMimeTypes.SPARQL_JSON);
    results = sqmgr.executeSelect(query, results);

The SPARQLQueryManager interface includes the following methods for evaluating the supported SPARQL query types:

  • executeSelect
  • executeAsk
  • executeConstruct
  • executeDescribe
  • executeUpdate
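
For example, executeAsk returns a boolean directly. The following minimal sketch assumes a SPARQLQueryManager named sparqlMgr and an illustrative subject IRI:

SPARQLQueryDefinition ask = sparqlMgr.newQueryDefinition(
    "ASK { <http://example.org/subject1> ?p ?o }");
boolean found = sparqlMgr.executeAsk(ask);  // true if any triple matches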

The following example puts it all together to evaluate a SELECT query. The query includes a binding for the variable 'o'. The query results are returned as JSON.

import com.marklogic.client.semantics.SPARQLQueryManager;
import com.marklogic.client.semantics.SPARQLQueryDefinition;
import com.marklogic.client.semantics.SPARQLMimeTypes;
import com.marklogic.client.io.JacksonHandle;
import com.fasterxml.jackson.databind.JsonNode;

// create a query manager
SPARQLQueryManager sparqlMgr = databaseClient.newSPARQLQueryManager();

// create a SPARQL query
String sparql = "SELECT * WHERE { ?s ?p ?o } LIMIT 10";
SPARQLQueryDefinition query = sparqlMgr.newQueryDefinition(sparql)
    .withBinding("o", "http://example.org/object1");

// evaluate the query
JacksonHandle handle = new JacksonHandle();
handle.setMimetype(SPARQLMimeTypes.SPARQL_JSON);
JacksonHandle results = sparqlMgr.executeSelect(query, handle);

// work with the results
JsonNode tuples = results.get().path("results").path("bindings");
for ( JsonNode row : tuples ) {
    String s = row.path("s").path("value").asText();
    String p = row.path("p").path("value").asText();
    ...
}

You can configure many aspects of a SPARQL query, including defining variable bindings, setting the default or named graphs to which to apply the query, and setting an optimization level. For details, see SPARQLQueryDefinition in the Java Client API Documentation.
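
For example, the following sketch scopes a query to a single graph and sets an optimization level. The setDefaultGraphUris and setOptimizeLevel method names are taken from the SPARQLQueryDefinition javadocs; treat this as a sketch rather than a complete program:

SPARQLQueryDefinition query = sparqlMgr.newQueryDefinition(
    "SELECT * WHERE { ?s ?p ?o }");
query.setDefaultGraphUris("myExample/graphURI");  // query only this graph
query.setOptimizeLevel(1);                        // optimizer hint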

You can define variable bindings one at a time using the fluent SPARQLQueryDefinition.withBinding method, or build up a set of bindings using SPARQLBindings and then attach them to the query using SPARQLQueryDefinition.setBindings.

For example:

// incrementally attach bindings to a query
SPARQLQueryDefinition query = ...;
query.withBinding("o", "http://example.org/object1")
     .withBinding(...);

// build up a set of bindings and attach them to a query
SPARQLBindings bindings = new SPARQLBindings();
bindings.bind("o", "http://example.org/object1");
bindings.bind(...);
query.setBindings(bindings);

See SPARQLBindings for more examples using bindings.

When you evaluate a SPARQL SELECT query, by default, all results are returned. You can limit the number of results returned in a 'page' using SPARQLQueryManager.setPageLength or a SPARQL LIMIT clause. You can retrieve successive pages of results by repeatedly calling executeSelect with a different page start position. For example:

// Change the max page length
sparqlMgr.setPageLength(NRESULTS);

// Fetch at most the first NRESULTS results
long start = 1;
JacksonHandle results = sparqlMgr.executeSelect(query, handle, start);

// Fetch the next NRESULTS results
start += NRESULTS;
results = sparqlMgr.executeSelect(query, handle, start);

The Java Client API includes the SPARQLRuleset class with a set of built-in rulesets and a factory method to enable you to use custom rulesets. To associate a ruleset with a query, use SPARQLQueryDefinition.withRuleset or SPARQLQueryDefinition.setRulesets.

Default inferencing can be turned on or off using the SPARQLQueryDefinition.withIncludeDefaultRulesets(Boolean) method. By default it is on. Maintaining rulesets is a Management API function. See Using Rulesets.
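
For example, the following sketch applies the built-in RDFS_PLUS ruleset together with a custom ruleset and suppresses the database's default rulesets. The SPARQLRuleset.ruleset factory call and the custom.rules file name are illustrative assumptions:

SPARQLQueryDefinition query = sparqlMgr.newQueryDefinition(sparql)
    .withRuleset(SPARQLRuleset.RDFS_PLUS)
    .withRuleset(SPARQLRuleset.ruleset("custom.rules"))  // assumed custom ruleset
    .withIncludeDefaultRulesets(false);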

For a complete list of the MarkLogic Java Client Semantic methods, see GraphManager and SPARQLQueryManager in the Java Client API Documentation.

MarkLogic Sesame API

Sesame is a Java API for processing and handling RDF data, including creating, parsing, storing, inferencing, and querying over this data. Java developers who are familiar with the Sesame API can use that same API to access RDF data in MarkLogic. The MarkLogic Sesame API is a full-featured, easy-to-use interface that provides simple access to MarkLogic Semantics functionality; it includes the MarkLogicRepository interface, part of the persistence layer.

By including the MarkLogic Sesame API, you can leverage MarkLogic as a Triple Store using standard Sesame for SPARQL query and SPARQL Update. The MarkLogic Sesame API extends the standard Sesame API so that you can also do combination queries, variable bindings, and transactions. The MarkLogicRepository class provides support for both transactions and variable bindings.

The following example uses the MarkLogic Sesame API to perform a SPARQL query and a SPARQL Update against semantic data stored in a MarkLogic database. The example first instantiates a MarkLogicRepository and defines a default ruleset and default permissions. Notice that the SPARQL query is constrained by an additional document query, so only results that also match the combined query are returned:

package com.marklogic.semantics.sesame.examples;

import com.marklogic.client.DatabaseClient;
import com.marklogic.client.DatabaseClientFactory;
import com.marklogic.client.io.Format;
import com.marklogic.client.io.StringHandle;
import com.marklogic.client.query.QueryManager;
import com.marklogic.client.query.RawCombinedQueryDefinition;
import com.marklogic.client.query.StringQueryDefinition;
import com.marklogic.client.semantics.Capability;
import com.marklogic.client.semantics.GraphManager;
import com.marklogic.client.semantics.SPARQLRuleset;
import com.marklogic.semantics.sesame.MarkLogicRepository;
import com.marklogic.semantics.sesame.MarkLogicRepositoryConnection;
import com.marklogic.semantics.sesame.query.MarkLogicTupleQuery;
import com.marklogic.semantics.sesame.query.MarkLogicUpdateQuery;
import org.openrdf.model.Resource;
import org.openrdf.model.URI;
import org.openrdf.model.ValueFactory;
import org.openrdf.model.vocabulary.FOAF;
import org.openrdf.model.vocabulary.RDF;
import org.openrdf.model.vocabulary.RDFS;
import org.openrdf.model.vocabulary.XMLSchema;
import org.openrdf.query.*;
import org.openrdf.repository.RepositoryException;
import org.openrdf.rio.RDFParseException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.IOException;

public class Example2_Advanced {

  protected static Logger logger =
    LoggerFactory.getLogger(Example2_Advanced.class);

  public static void main(String... args) throws 
    RepositoryException, IOException, RDFParseException,
    MalformedQueryException, QueryEvaluationException {

    // instantiate MarkLogicRepository with Java api 
    // client DatabaseClient
    DatabaseClient adminClient =
      DatabaseClientFactory.newClient("localhost", 8200,
        "admin","admin", DatabaseClientFactory.Authentication.DIGEST);
    GraphManager gmgr = adminClient.newGraphManager();
    QueryManager qmgr = adminClient.newQueryManager();

    // create repo and init
    MarkLogicRepository repo = new MarkLogicRepository(adminClient);
    repo.initialize();

    // get repository connection
    MarkLogicRepositoryConnection conn = repo.getConnection();

    // set default rulesets
    conn.setDefaultRulesets(SPARQLRuleset.ALL_VALUES_FROM);

     // set default perms
     conn.setDefaultGraphPerms(
       gmgr.permission("admin", Capability.READ)
           .permission("admin", Capability.EXECUTE));

     // set a default Constraining Query
     StringQueryDefinition stringDef =
       qmgr.newStringDefinition().withCriteria("First");
     conn.setDefaultConstrainingQueryDefinition(stringDef);

     // return number of triples contained in repository
     logger.info("1. number of triples: {}", conn.size());

     // add a few constructed triples
     Resource context1 = conn.getValueFactory()
       .createURI("http://marklogic.com/examples/context1");
     Resource context2 = conn.getValueFactory()
       .createURI("http://marklogic.com/examples/context2");
     ValueFactory f= conn.getValueFactory();
     String namespace = "http://example.org/";
     URI john = f.createURI(namespace, "john");

     //use transactions to add triple statements
     conn.begin();
     conn.add(john, RDF.TYPE, FOAF.PERSON, context1);
     conn.add(john, RDFS.LABEL, 
       f.createLiteral("John", XMLSchema.STRING), context2);
     conn.commit();

     logger.info("2. number of triples: {}", conn.size());

     // perform SPARQL query
     String queryString = "select * { ?s ?p ?o }";
     MarkLogicTupleQuery tupleQuery =
       conn.prepareTupleQuery(QueryLanguage.SPARQL, queryString);

     // enable rulesets set on MarkLogic database
     tupleQuery.setIncludeInferred(true);

     // set base uri for resolving relative uris
     tupleQuery.setBaseURI("http://www.example.org/base/");

     // set rulesets for infererencing
     tupleQuery.setRulesets(SPARQLRuleset.ALL_VALUES_FROM,
       SPARQLRuleset.HAS_VALUE);

     // set a combined query
     String combinedQuery =
       "{\"search\":" + "{\"qtext\":\"*\"}}";
     RawCombinedQueryDefinition rawCombined =
       qmgr.newRawCombinedQueryDefinition(
         new StringHandle()
           .with(combinedQuery)
           .withFormat(Format.JSON));
     tupleQuery.setConstrainingQueryDefinition(rawCombined);

     // evaluate query with pagination
     TupleQueryResult results = tupleQuery.evaluate(1,10);

     //iterate through query results
     while(results.hasNext()){
       BindingSet bindings = results.next();
       logger.info("subject:{}",bindings.getValue("s"));
       logger.info("predicate:{}", bindings.getValue("p"));
       logger.info("object:{}", bindings.getValue("o"));
     }
     logger.info("3. number of triples: {}", conn.size());

     //update query
     String updatequery = "INSERT DATA { " +
       "GRAPH <http://marklogic.com/test/context10> {" +
       "<http://marklogic.com/test/subject> <pp1> <oo1> } }";
     MarkLogicUpdateQuery updateQuery =
       conn.prepareUpdate(QueryLanguage.SPARQL, updatequery,
         "http://marklogic.com/test/baseuri");

     // set perms to be applied to data
     updateQuery.setGraphPerms(
       gmgr.permission("admin", Capability.READ)
           .permission("admin", Capability.EXECUTE));

     try {
       updateQuery.execute();
     } catch (UpdateExecutionException e) {
       e.printStackTrace();
     }

     logger.info("4. number of triples: {}", conn.size());

     // clear all triples
     conn.clear();
     logger.info("5. number of triples: {}", conn.size());

     // close connection and shutdown repository
     conn.close();
     repo.shutDown();
  }
}

The MarkLogic Sesame API is available on GitHub at http://github.com/marklogic/marklogic-sesame, along with javadocs and examples.

The key interfaces of MarkLogic Sesame API are listed below:

  • MarkLogicRepository
  • MarkLogicRepositoryConnection
  • MarkLogicQuery
    • MarkLogicTupleQuery
    • MarkLogicGraphQuery
    • MarkLogicBooleanQuery
    • MarkLogicUpdateQuery

MarkLogic Jena API

Jena is a Java API for processing and handling RDF data, including creating, parsing, storing, inferencing, and querying over this data. Java developers who are familiar with the Jena API can use that same API to access RDF data in MarkLogic. The MarkLogic Jena API is a full-featured, easy-to-use interface that provides simple access to MarkLogic Semantics functionality.

By including the MarkLogic Jena API, you can leverage MarkLogic as a Triple Store using standard Jena for SPARQL query and SPARQL Update. The MarkLogic Jena API extends Jena so that you can also do combination queries, variable bindings, and transactions. The MarkLogicDatasetGraph class provides support for both transactions and variable bindings.

Here is an example showing how to run queries using MarkLogic Jena:

package com.marklogic.jena.examples;

import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.riot.RDFFormat;

import com.hp.hpl.jena.graph.NodeFactory;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.update.UpdateExecutionFactory;
import com.hp.hpl.jena.update.UpdateFactory;
import com.hp.hpl.jena.update.UpdateProcessor;
import com.hp.hpl.jena.update.UpdateRequest;
import com.marklogic.semantics.jena.MarkLogicDatasetGraph;

/**
 * How to run queries.
 */
public class SPARQLUpdateExample {
    
    private MarkLogicDatasetGraph dsg;

    public SPARQLUpdateExample() {
        dsg = ExampleUtils.loadPropsAndInit();
    }

    private void run() {
        dsg.clear();
        
        String insertData = "PREFIX foaf: <http://xmlns.com/foaf/0.1/> "
                + "PREFIX : <http://example.org/> "
                +"INSERT DATA {GRAPH :g1 {"
                + ":charles a foaf:Person ; "
                + "        foaf:name \"Charles\" ;"
                + "        foaf:knows :jim ."
                + ":jim    a foaf:Person ;"
                + "        foaf:name \"Jim\" ;"
                + "        foaf:knows :charles ."
                + "} }";
        
        System.out.println("Running SPARQL update");
        
        UpdateRequest update = UpdateFactory.create(insertData);
        UpdateProcessor processor = UpdateExecutionFactory.create(update, dsg);
        processor.execute();
        
        System.out.println("Examine the data as JSON-LD");
        RDFDataMgr.write(System.out, dsg.getGraph(NodeFactory.createURI("http://example.org/g1")),
          RDFFormat.JSONLD_PRETTY);
        
        System.out.println("Remove it.");
        
        update = UpdateFactory.create("PREFIX : <http://example.org/> DROP GRAPH :g1");
        processor = UpdateExecutionFactory.create(update, dsg);
        processor.execute();
        dsg.close();
    }

    public static void main(String... args) {
        SPARQLUpdateExample example = new SPARQLUpdateExample();
        example.run();
     }

}
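
The example above only updates data; read queries go through the standard Jena query API. The following hedged sketch could run inside run() after the INSERT DATA update, wrapping the MarkLogicDatasetGraph as a Dataset; it additionally assumes imports of com.hp.hpl.jena.query.Dataset, DatasetFactory, and QueryExecution alongside the QueryExecutionFactory, ResultSet, and QuerySolution imports already shown:

Dataset ds = DatasetFactory.create(dsg);
String queryString = "SELECT ?s ?o WHERE { ?s ?p ?o } LIMIT 10";
QueryExecution exec = QueryExecutionFactory.create(queryString, ds);
ResultSet results = exec.execSelect();
while (results.hasNext()) {
    QuerySolution solution = results.next();
    System.out.println(solution.get("s") + " " + solution.get("o"));
}
exec.close();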

The MarkLogic Jena source is available on GitHub at http://github.com/marklogic/marklogic-jena, along with javadocs and examples.

The key interfaces of the MarkLogic Jena API are listed below:

  • MarkLogicDatasetGraph
  • MarkLogicDatasetGraphFactory
  • MarkLogicQuery
  • MarkLogicQueryEngine

Node.js Client API

The Node.js Client API supports CRUD operations on semantic graphs: creating, reading, updating, and deleting triples and graphs. Use the DatabaseClient.graphs.write function to create a graph containing triples, the DatabaseClient.graphs.read function to read from a graph, and the DatabaseClient.graphs.remove function to remove a graph. To query semantic data, use the DatabaseClient.graphs.sparql function.

See Working With Semantic Data in the Node.js Application Developer's Guide for more details. The Node.js Client source can be found on GitHub at http://github.com/marklogic/node-client-api. For additional operations, see the Node.js Client API Reference.

These operations only work with managed triples contained in a graph. Embedded triples cannot be manipulated using the Node.js Client API.
