Node.js Application Developer's Guide — Chapter 1

Introduction to the Node.js Client API

The Node.js Client API enables you to create Node.js applications that can read, write, and query documents and semantic data in a MarkLogic database.

The Node.js API is an open source project maintained on GitHub. To access the sources, report or review issues, or contribute to the project, go to http://github.com/marklogic/node-client-api.

Getting Started

This section demonstrates loading documents into the database, querying the documents, updating a portion of a document, and reading documents from the database. The basic features demonstrated here have many more capabilities. The end of this section contains pointers to resources for exploring the Node.js Client API in more detail.

Before you begin, make sure you have installed the software listed in Required Software. You should also have the node and npm commands on your path.

If you are working on Microsoft Windows, use a DOS command shell rather than a Cygwin shell. Cygwin is not a supported environment for node and npm.

The following procedure walks you through installing the Node.js Client API, loading some simple JSON documents into the database, and then searching and modifying the documents.

  1. If you have not already done so, download, install, and start MarkLogic Server from http://developer.marklogic.com.
  2. Create or select a project directory from which to exercise the examples in this walkthrough. The rest of the instructions assume you are in this directory.
  3. Download and install the latest version of the Node.js Client API from the public npm repository into your project directory. For example:
    npm install marklogic
  4. Configure your MarkLogic connection information: Copy the following code to a file named my-connection.js. Modify the MarkLogic Server connection information to match your environment. You must change at least the user and password values. Select a MarkLogic user that has at least the rest-reader and rest-writer roles or equivalent privileges; for details, see Security Requirements.
    module.exports = {
      connInfo: {
        host: 'localhost',
        port: 8000,
        user: 'user',
        password: 'password'
      }
    };

    The rest of the examples in this guide assume this connection configuration module exists with the path ./my-connection.js.

  5. Load the example documents into the database: Copy the following script to a file and run it using the node command. Several JSON documents are inserted into the database using DatabaseClient.documents.write.
    // Load documents into the database.
    
    var marklogic = require('marklogic');
    var my = require('./my-connection.js');
    var db = marklogic.createDatabaseClient(my.connInfo);
    
    // Document descriptors to pass to write(). 
    var documents = [
      { uri: '/gs/aardvark.json',
        content: {
          name: 'aardvark',
          kind: 'mammal',
          desc: 'The aardvark is a medium-sized burrowing, nocturnal mammal.'
        }
      },
      { uri: '/gs/bluebird.json',
        content: {
          name: 'bluebird',
          kind: 'bird',
          desc: 'The bluebird is a medium-sized, mostly insectivorous bird.'
        }
      },
      { uri: '/gs/cobra.json',
        content: {
          name: 'cobra',
          kind: 'mammal',
          desc: 'The cobra is a venomous, hooded snake of the family Elapidae.'
        }
      },
    ];
    
    // Load the example documents into the database
    db.documents.write(documents).result( 
      function(response) {
        console.log('Loaded the following documents:');
        response.documents.forEach( function(document) {
          console.log('  ' + document.uri);
        });
      }, 
      function(error) {
        console.log(JSON.stringify(error, null, 2));
      }
    );

    You should see output similar to the following:

    Loaded the following documents:
      /gs/aardvark.json
      /gs/bluebird.json
      /gs/cobra.json
  6. Search the database: Copy the following script to a file and run it using the node command. The script retrieves documents from the database that contain the JSON property kind with the value 'mammal'.
    // Search for documents about mammals, using Query By Example.
    // The query returns an array of document descriptors, one per
    // matching document. The descriptor includes the URI and the
    // contents of each document.
    
    var marklogic = require('marklogic');
    var my = require('./my-connection.js');
    
    var db = marklogic.createDatabaseClient(my.connInfo);
    var qb = marklogic.queryBuilder;
    
    db.documents.query(
      qb.where(qb.byExample({kind: 'mammal'}))
    ).result( function(documents) {
        console.log('Matches for kind=mammal:');
        documents.forEach( function(document) {
          console.log('\nURI: ' + document.uri);
          console.log('Name: ' + document.content.name);
        });
    }, function(error) {
        console.log(JSON.stringify(error, null, 2));
    });

    You should see output similar to the following. Notice that cobra is incorrectly labeled as a mammal. The next step will correct this error in the content.

    Matches for kind=mammal:
    
    URI: /gs/cobra.json
    Name: cobra
    
    URI: /gs/aardvark.json
    Name: aardvark
  7. Patch a document: Recall from the previous step that cobra is incorrectly labeled as a mammal. This step changes the kind property for /gs/cobra.json from 'mammal' to 'reptile'. Copy the following script to a file and run it using the node command.
    // Use the patch feature to update just a portion of a document,
    // rather than replacing the entire contents.
    
    var marklogic = require('marklogic');
    var my = require('./my-connection.js');
    
    var db = marklogic.createDatabaseClient(my.connInfo);
    var pb = marklogic.patchBuilder;
    
    db.documents.patch(
      '/gs/cobra.json',
      pb.replace('/kind', 'reptile')
    ).result( function(response) {
        console.log('Patched ' + response.uri);
    }, function(error) {
        console.log(JSON.stringify(error, null, 2));
    });

    You should see output similar to the following:

    Patched /gs/cobra.json
  8. Confirm the change by re-running the search or retrieving the document by URI. To retrieve /gs/cobra.json by URI, copy the following script to a file and run it using the node command.
    // Read documents from the database by URI.
    
    var marklogic = require('marklogic');
    var my = require('./my-connection.js');
    
    var db = marklogic.createDatabaseClient(my.connInfo);
    
    db.documents.read(
      '/gs/cobra.json'
    ).result( function(documents) {
      documents.forEach( function(document) {
        console.log(JSON.stringify(document, null, 2) + '\n');
      });
    }, function(error) {
        console.log(JSON.stringify(error, null, 2));
    });

    You should see output similar to the following:

    {
      "uri": "/gs/cobra.json",
      "category": "content",
      "format": "json",
      "contentType": "application/json",
      "contentLength": "106",
      "content": {
        "name": "cobra",
        "kind": "reptile",
        "desc": "The cobra is a venomous, hooded snake of the family Elapidae."
      }
    }
  9. Optionally, delete the example documents: Copy the following script to a file and run it using the node command. To confirm deletion of the documents, you can re-run the script from Step 8.
    // Remove the example documents from the database.
    // This example removes all the documents in the directory
    // /gs/. You can also remove documents by document URI.
    
    var marklogic = require('marklogic');
    var my = require('./my-connection.js');
    
    var db = marklogic.createDatabaseClient(my.connInfo);
    
    db.documents.removeAll(
      {directory: '/gs/'}
    ).result( function(response) {
      console.log(response);
    });

    You should see output similar to the following:

    { exists: false, directory: '/gs/' }

    Document removal is an idempotent operation. Running the script again produces the same output.

To explore the API further, see the following resources:

| If You Want To | Then See |
|---|---|
| Explore more examples | The examples and tests that are distributed with the API. Sources are available from http://github.com/marklogic/node-client-api or in your node_modules/marklogic directory after you install the API. |
| Learn about reading and writing documents and metadata | Manipulating Documents. |
| Learn about searching documents and querying lexicons and indexes | Querying Documents and Metadata. The Search Developer's Guide. |
| Learn about extension points such as content transformations and resource service extensions | Extensions, Transformations, and Server-Side Code Execution. |
| Explore the low level API documentation | The Node.js API Reference. You can also generate a local copy of the API reference. For details, see the project page on GitHub: http://github.com/marklogic/node-client-api |

Required Software

To use the Node.js Client API, you must have the following software:

  • MarkLogic 8 or later.
  • Node.js, version 6.3.1 or later. Node.js is available from http://nodejs.org.
  • The Node.js Package Manager tool, npm. The latest version compatible with a supported Node.js version is recommended.

The examples in this guide assume you have the node and npm commands on your path.

Security Requirements

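This section describes the basic security model used by the Node.js Client API, and some common situations in which you might need to change or extend it. The following topics are covered:

  • Basic Security Requirements
  • Controlling Document Access
  • Evaluating Requests Against a Different Database
  • Evaluating or Invoking Server-Side Code
  • Using CombinedQueryDefinition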

Basic Security Requirements

The user you specify when creating a DatabaseClient object must have appropriate URI privileges for the content accessed by the operations performed, such as permission to read or update documents in the target database.

The Node.js Client API uses the MarkLogic REST Client API to communicate with MarkLogic Server, so it uses the same security model. In addition to proper URI privileges, the user must have one of the pre-defined roles listed below, or the equivalent privileges. The capabilities of each role in the table are subsumed by the roles below it.

| Role | Description |
|---|---|
| rest-extension-user | Enables access to resource service extension methods. This role is implicit in the other pre-defined REST API roles, but you may need to explicitly include it when defining custom roles. |
| rest-reader | Enables read operations, such as retrieving documents and metadata. This role does not grant any other privileges, so the user might still require additional privileges to read content. |
| rest-writer | Enables write operations, such as creating documents, metadata, or configuration information. This role does not grant any other privileges, so the user might still require additional privileges to write content. |
| rest-admin | Enables administrative operations, such as creating an instance and managing instance configuration. This role does not grant any other privileges, so the user might still require additional privileges. |

Some operations require additional privileges, such as using a database other than the default database associated with the REST instance and using eval or invoke methods of DatabaseClient. These requirements are detailed below.

Controlling Document Access

Documents you create through the Node.js Client API while relying on the default roles receive a read permission for the rest-reader role and an update permission for the rest-writer role. As a result, by default any user with the rest-reader role can read these documents and any user with the rest-writer role can update them. You can override this behavior using document permissions, custom roles, or both.

To restrict access to particular users, create custom roles rather than assigning users to the default rest-* roles. For example, you can use a custom role to restrict users in one group from seeing documents created by another.

For details, see Controlling Access to Documents and Other Artifacts in the REST Application Developer's Guide.
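As a rough illustration of attaching explicit permissions to a document, the following sketch builds a document descriptor that carries its own permissions metadata. The role names app-reader and app-writer and the URI are hypothetical, and the shape of each permissions entry (a 'role-name' plus a capabilities array) is an assumption you should check against DocumentDescriptor in the Node.js API Reference.

```javascript
// A document descriptor with explicit permissions (a sketch only).
// 'app-reader' and 'app-writer' are hypothetical custom roles that you
// would need to create in MarkLogic before using them.
const restrictedDoc = {
  uri: '/gs/restricted.json',
  content: { name: 'restricted', kind: 'example' },
  permissions: [
    { 'role-name': 'app-reader', capabilities: ['read'] },
    { 'role-name': 'app-writer', capabilities: ['read', 'update'] }
  ]
};

// Passing this descriptor to db.documents.write would be the next step.
// Here the descriptor is only printed so the sketch runs without a server.
console.log(JSON.stringify(restrictedDoc, null, 2));
```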

Evaluating Requests Against a Different Database

When you connect to a MarkLogic Server instance by creating a DatabaseClient, the REST instance you connect to has a default content database associated with it. You can specify an alternative database when you create the DatabaseClient, but performing operations against an alternative database requires the http://marklogic.com/xdmp/privileges/xdmp-eval-in privilege or equivalent.

To enable your application to use a different database:

  1. Create a role with the xdmp:eval-in execution privilege, in addition to an appropriate mix of rest-* roles. (You can also add the privilege to an existing role.)
  2. Assign the role from Step 1 to a user.
  3. Create a DatabaseClient with the user from Step 2.

One simple way to achieve this is to inherit from one of the predefined rest-* roles and then add the xdmp:eval-in privilege.

For details about roles and privileges, see the Understanding and Using Security Guide. To learn more about managing REST API instances, see Administering REST API Instances.

Evaluating or Invoking Server-Side Code

You can use the DatabaseClient.eval and DatabaseClient.invoke operations to evaluate arbitrary code on MarkLogic Server. These operations require special privileges instead of (or in addition to) the normal REST API roles like rest-reader and rest-writer.

For details, see Required Privileges.

Using CombinedQueryDefinition

If you use CombinedQueryDefinition to create queries rather than using builder interfaces such as queryBuilder and valuesBuilder, then it is possible to specify query options that require the rest-admin role or equivalent privileges to evaluate the query.

For more details, see Using Dynamically Defined Query Options in the REST Application Developer's Guide.

Terms and Definitions

This guide uses the following terms and definitions:

| Term | Definition |
|---|---|
| REST Client API | A MarkLogic API for developing applications that communicate with MarkLogic using RESTful HTTP requests. The Node.js Client API is built on top of the REST Client API. |
| REST API instance | A MarkLogic HTTP App Server specially configured to service REST Client API requests. The Node.js Client API requires a REST API instance. One is available on port 8000 as soon as you install MarkLogic. For details, see What Is a REST API Instance?. |
| npm | The Node.js package manager. Use npm to download and install the Node.js Client API and its dependencies. |
| builder | An interface in the Node.js Client API that exposes functions for building potentially complex data structures such as queries (marklogic.queryBuilder) and document patches (marklogic.patchBuilder). |
| Promise | A Promise is a JavaScript interface for interacting with the outcome of an asynchronous event. For details, see Promise Result Handling Pattern. |
| MarkLogic module | The module that encapsulates the Node.js Client API. Include the module in your application using require(). For details, see MarkLogic Namespace. |
| document descriptor | An object that encapsulates document content and metadata as named JavaScript object properties. For details, see Document Descriptor. |
| database client | A special object that encapsulates your connection to MarkLogic Server through a REST API instance. Almost all Node.js Client API operations take place through a database client object. For details, see Creating a Database Client. |
| git | A source control management system. You will need a git client if you want to check out and use the Node.js Client API sources. |
| GitHub | The open source project repository that hosts the Node.js Client API project. For details, see http://github.com/. |

Key Concepts and Conventions

MarkLogic Namespace

The Node.js Client API library exports a namespace that provides a database client factory method and access to builders such as queryBuilder (search), valuesBuilder (values queries), and patchBuilder (partial document updates).

To include the MarkLogic module in your code, use require() and bind the result to a variable. For example, you can include it by the name 'marklogic' if you have installed the module under your own Node.js project:

var ml = require('marklogic');

You can use any variable name. The examples in this guide bind the module to either ml or marklogic.

Parameter Passing Conventions

Node.js Client API functions that require many input parameter values accept these values as named properties of a call object. For example, you can specify a hostname, port, database name, and several other connection properties when calling the createDatabaseClient() method. Do so by encapsulating these values in a single object, such as the following:

ml.createDatabaseClient({host: 'some-host', port: 8003, ...});

Where a parameter can have one or more values, the value of the property can be either a single value or an array of values. Some functions support either an array or a list. For example:

db.documents.write(docDescriptor)
db.documents.write([docDescriptor1, docDescriptor2, ...])
db.documents.write(docDescriptor1, docDescriptor2, ...)

Where a function has a parameter that is frequently used without other parameters, you can pass the parameter directly as a convenient alternative to encapsulating it in a call object. For example, DatabaseClient.documents.remove accepts either a call object that can have several properties, or a single URI string:

db.documents.remove('/my/doc.json')
db.documents.remove({uri: '/my/doc.json', txid: ...})

For details on a particular operation, see the Node.js API Reference.
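To make these conventions concrete, the following sketch shows one way a function could accept a single value, an array, or an argument list, and either a bare URI string or a call object. The helpers toDescriptorList and toCallObject are hypothetical and are not part of the Node.js Client API. They only mimic the calling styles described above and say nothing about how the library handles its arguments internally.

```javascript
// Hypothetical helpers that mimic the calling conventions described above.
// They are NOT part of the Node.js Client API.

// Accept one item, an array of items, or a list of items as arguments.
function toDescriptorList() {
  const args = Array.prototype.slice.call(arguments);
  // A single array argument is used as the list itself.
  if (args.length === 1 && Array.isArray(args[0])) {
    return args[0];
  }
  // Otherwise, each argument is one item.
  return args;
}

// Accept either a bare URI string or a call object with a uri property.
function toCallObject(arg) {
  return (typeof arg === 'string') ? { uri: arg } : arg;
}

const docA = { uri: '/a.json', content: { n: 1 } };
const docB = { uri: '/b.json', content: { n: 2 } };

console.log(toDescriptorList(docA).length);          // 1
console.log(toDescriptorList([docA, docB]).length);  // 2
console.log(toDescriptorList(docA, docB).length);    // 2
console.log(toCallObject('/my/doc.json').uri);       // /my/doc.json
```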

Document Descriptor

A document descriptor is an object that encapsulates document content and metadata as named JavaScript object properties. Node.js Client API document operations such as DatabaseClient.documents.read and DatabaseClient.documents.write accept and return document descriptors.

A document descriptor usually includes at least the database URI and properties representing document content, document metadata, or both. For example, the following is a document descriptor for a document with URI /doc/example.json. Since the document is a JSON document, its contents can be expressed as a JavaScript object.

{ uri : '/doc/example.json', content : {some : 'data'} }

Not all properties are always present. For example, if you read just the contents of a document, there will be no metadata-related properties in the resulting document descriptor. Similarly, if you insert just content and the collections metadata property, the input descriptor will not include permissions or quality properties.

{ uri : '/doc/example.json', 
  content : {some : 'data'}, 
  collections : ['my-collection']
}

The content property can be an object, string, Buffer, or ReadableStream.

See DocumentDescriptor in the Node.js API Reference for a complete list of property names.
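As a small illustration of the content types listed above, the following sketch builds one descriptor for each kind of content value. The URIs and contents are made up, and describeContent is a hypothetical helper used only to print what each descriptor holds. Nothing here contacts MarkLogic.

```javascript
const { Readable } = require('stream');

// One descriptor per supported content type. URIs and contents are
// hypothetical values chosen for illustration.
const descriptors = [
  { uri: '/doc/object.json', content: { some: 'data' } },
  { uri: '/doc/string.json', content: '{"some": "data"}' },
  { uri: '/doc/buffer.json', content: Buffer.from('{"some": "data"}', 'utf8') },
  { uri: '/doc/stream.json', content: Readable.from(['{"some": ', '"data"}']) }
];

// Hypothetical helper: report which kind of value a descriptor carries.
function describeContent(content) {
  if (Buffer.isBuffer(content)) { return 'Buffer'; }
  if (content instanceof Readable) { return 'Readable stream'; }
  return typeof content;
}

descriptors.forEach(function (descriptor) {
  console.log(descriptor.uri + ': ' + describeContent(descriptor.content));
});
```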

Supported Result Handling Techniques

Most functions in the Node.js Client API support the following ways of processing results returned by MarkLogic Server:

  • Callback: Call the result function, passing in a success and/or error callback function. Use this pattern when you don't need to synchronize results. For example:
    db.documents.read(...).result(function(response) {...})
  • Promise: Call the result function and process the results through a Promise. Use Promises to chain interactions together, such as writing documents to the database, followed by a search. Your success callback is not invoked until all the requested data has been returned by MarkLogic Server. For example:
    db.documents.read(...).result().then(function(response) {...})...

    For details, see Promise Result Handling Pattern.

  • Object Mode Streaming: Call the stream function and process the results through a Readable stream. Your code gets control each time a document or other discrete part is received in full. If you're reading a JSON document, it is converted to a JavaScript object before invoking your callback. For example:
    db.documents.read(...).stream().pipe(...)

    For details, see Stream Result Handling Pattern.

  • Chunked Mode Streaming: Call the stream function with a 'chunked' argument and process the results through a Readable stream. Your code gets control each time a sufficient number of bytes are accumulated, and the input to your callback is a byte stream.
    db.documents.read(...).stream('chunked').pipe(...)

    For details, see Stream Result Handling Pattern.

When you use the classic callback or promise pattern, your code does not get control until all results are returned by MarkLogic. This is suitable for operations that do not return a large amount of data, such as a read operation that returns a small number of documents or a write. Streaming is better suited to handling large files or a large number of documents because it allows you to process results incrementally.

Errors in the user code of a success callback are handled in the next error callback. Therefore, you should include a catch clause to handle such errors. For details, see Error Handling.

Promise Result Handling Pattern

Node.js Client API functions return an object with a result() method that returns a Promise object. A Promise is a JavaScript interface for interacting with the outcome of an asynchronous event. A Promise has then, catch, and finally methods. For details, see http://promisesaplus.com/. Promises can be chained together to synchronize multiple operations.

The success callback you pass to the Promise then method is not invoked until your interaction with MarkLogic completes and all results are received. The Promise pattern is well suited to synchronizing operations.

For example, you can use a sequence such as the following to insert documents into the database, query them after the insertion completes, and then work with the query results.

db.documents.write(...).result().then( 
  function(response) {
    // search the documents after insertion
    return db.documents.query(...).result();
  }).then( function(documents) {
    // work with the documents matched by the query
  });

For a more complete example, see Example: Using Promises With a Multi-Statement Transaction.

You should include a catch clause in your promise chain to handle errors raised in user code in your success callbacks. For details, see Error Handling.

The Node.js Client API also supports a stream pattern for processing results. A stream is better suited to handling very large amounts of data than a Promise. For details, see Stream Result Handling Pattern.

Stream Result Handling Pattern

Node.js Client API functions return an object with a stream method that returns a Readable stream on the results from MarkLogic. Streams enable you to process results incrementally. Consider using streaming if you're reading a large number of documents or if your documents are large.

Streams can provide better throughput at lower memory overhead than Promises when you're working with large amounts of data because result data can be processed as it is received from MarkLogic Server.

Two stream modes are supported:

  • Object mode: Your code gets control each time a complete document or other discrete part is received. A Document Descriptor is the unit of interaction. For a JSON document, the content in the descriptor is converted into a JavaScript object for ease of use. Object mode is the default streaming mode.
  • Chunked mode: Your code gets control each time a certain number of bytes is received. An opaque byte stream is the unit of interaction. Enable chunked mode by passing the value 'chunked' to the stream method.

Object mode is best when you need to handle each portion of the result as a document or object, for example when you persist a collection of domain objects in the database as JSON documents and later want to restore them as JavaScript objects in your application. Chunked mode is best for handling large amounts of data opaquely, such as reading a large binary file from the database and saving it to a file.

The following code snippet uses a stream in object mode to process multiple documents as they are fetched from the database. Each time a complete document is received, the stream on('data') callback is invoked with a document descriptor. When all documents are received, the on('end') callback is called.

db.documents.read(uri1, uri2, uriN).stream()
  .on('data', function(document) {
    // process one document
  }).on('end', function() {
    // wrap it up
  }).on('error', function(error) {
    // handle errors
  });

The following code snippet uses a stream in chunked mode to stream a large binary file from the database into a file using pipe.

var fs = require('fs');
var ostrm = fs.createWriteStream(outFilePath);

db.documents.read(largeFileUri).stream('chunked').pipe(ostrm);

The Promise pattern is usually more convenient if you are not processing a large amount of data. For details, see Promise Result Handling Pattern.
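The difference between the two stream modes described above can be tried without a MarkLogic connection by using plain Node.js streams as stand-ins. In the sketch below, objectStream and chunkedStream are hypothetical substitutes for the streams returned by stream() and stream('chunked'). They are built with Readable.from, so only the shape of the data events is representative, not the library's behavior.

```javascript
const { Readable } = require('stream');

// Stand-in for an object mode result stream: each 'data' event carries
// one complete document descriptor.
const objectStream = Readable.from([
  { uri: '/gs/aardvark.json', content: { name: 'aardvark' } },
  { uri: '/gs/bluebird.json', content: { name: 'bluebird' } }
]);

// Stand-in for a chunked mode result stream: each 'data' event carries
// bytes whose boundaries are unrelated to document boundaries.
const chunkedStream = Readable.from(
  [Buffer.from('{"name": "aard'), Buffer.from('vark"}')],
  { objectMode: false }
);

// Gather every 'data' item from a stream and resolve when it ends.
function collect(stream) {
  return new Promise(function (resolve, reject) {
    const items = [];
    stream.on('data', function (item) { items.push(item); })
      .on('end', function () { resolve(items); })
      .on('error', reject);
  });
}

const streamDemo = Promise.all([collect(objectStream), collect(chunkedStream)])
  .then(function (results) {
    // Object mode: work with descriptors directly.
    const uris = results[0].map(function (doc) { return doc.uri; });
    // Chunked mode: the caller reassembles (or pipes) the raw bytes.
    const text = Buffer.concat(results[1]).toString('utf8');
    console.log('object mode URIs:', uris);
    console.log('chunked mode bytes reassembled:', text);
    return { uris: uris, text: text };
  });
```

In object mode the handler can use each descriptor immediately, while in chunked mode the bytes only become meaningful once they are reassembled or piped to a destination.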

Streaming Into the Database

Most Node.js Client API methods that deal with potentially large input datasets support using a Readable stream to pass in the data. For example, the content property of a document descriptor passed to DatabaseClient.documents.write can be an object, a string, a Buffer, or a Readable stream. If you're simply streaming data from a source such as a file, this interface is all you need.

For example, the following call uses a Readable stream to stream an image from a file into the database:

db.documents.write({
  uri: '/my/image.png',
  contentType: 'image/png',
  content: fs.createReadStream(pathToImage)
})

If you are assembling the stream on the fly, or otherwise need to have fine-grained control, you can use the createWriteStream method of the documents and graphs interfaces. For example, if you use DatabaseClient.documents.createWriteStream instead of DatabaseClient.documents.write, you can control the calls to write so you can assemble the documents yourself, as shown below:

var ws = db.documents.createWriteStream({
  uri: '/my/data.json',
  contentType: 'application/json',
});
// Resulting doc contains {"key":"value"}
ws.write('{"key"', 'utf8');
ws.write(': "value"}', 'utf8');
ws.end();

You can use the writable stream interface to load documents and semantic graphs. For details, see documents.createWriteStream and graphs.createWriteStream in the Node.js API Reference.
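The idea of assembling a document from several write calls can be tried locally with a stand-in stream. In the sketch below, createCollectingStream is a hypothetical substitute for the stream returned by db.documents.createWriteStream. It only collects what is written so the assembled text can be inspected, and the records array is made-up data.

```javascript
const { Writable } = require('stream');

// Hypothetical stand-in for the stream returned by
// db.documents.createWriteStream: it just remembers what was written.
function createCollectingStream() {
  const chunks = [];
  const stream = new Writable({
    write: function (chunk, encoding, callback) {
      chunks.push(chunk); // string input arrives here as a Buffer
      callback();
    }
  });
  stream.collected = function () {
    return Buffer.concat(chunks).toString('utf8');
  };
  return stream;
}

// Made-up records to serialize as a single JSON array document.
const records = [{ id: 1 }, { id: 2 }, { id: 3 }];
const out = createCollectingStream();

// Resolve with the assembled text once the stream has finished.
const assembled = new Promise(function (resolve, reject) {
  out.on('finish', function () { resolve(out.collected()); });
  out.on('error', reject);
});

// Write the document piece by piece. Pieces need not be valid JSON on
// their own, only the concatenation must be.
out.write('[', 'utf8');
records.forEach(function (record, index) {
  out.write((index > 0 ? ',' : '') + JSON.stringify(record), 'utf8');
});
out.write(']', 'utf8');
out.end();

assembled.then(function (text) {
  console.log(text); // [{"id":1},{"id":2},{"id":3}]
});
```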

Error Handling

When using the callback or promise pattern, errors in your success callback are handled in the next error callback. If you want to trap such errors, you should include a catch clause at the end of your promise chain (or after your result handler, in the case of the callback pattern). Simply wrapping a try-catch block around your call(s) will not trap such errors.

For example, in the case of the classic callback pattern, if you made a call to DatabaseClient.documents.write, you should end with a catch similar to the following. The onError function executes if the onSuccess callback throws an exception.

db.documents.write(...)
  .result(function onSuccess(response) {...})
  .catch(function onError(err) {...});

Similarly, if you're chaining requests together using the Promise pattern, then you should terminate the chain with a similar handler:

db.documents.write(...).result()
  .then(function onSuccess1(response) {...})
  .then(function onSuccess2(response) {...})
  .catch(function onError(err) {...});
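The following sketch shows why a try-catch block does not trap an exception thrown inside a success callback, while a catch clause does. The fakeWrite function is a hypothetical stand-in that only mimics the shape described in this guide (an object whose result method returns a Promise). It does not contact MarkLogic and is not the library's implementation.

```javascript
// Hypothetical stand-in for a Node.js Client API call. It mimics only the
// documented shape: an object whose result() method returns a Promise.
function fakeWrite() {
  const promise = Promise.resolve({ documents: [{ uri: '/gs/example.json' }] });
  return {
    result: function (onSuccess, onError) {
      if (onSuccess || onError) {
        return promise.then(onSuccess, onError);
      }
      return promise;
    }
  };
}

const trapped = [];
let outcome;

try {
  outcome = fakeWrite()
    .result(function onSuccess(response) {
      // Simulate a bug in user code inside the success callback.
      throw new Error('bug in success callback for ' + response.documents[0].uri);
    })
    .catch(function onError(err) {
      trapped.push('catch clause: ' + err.message);
    });
} catch (err) {
  // Not reached: the callback runs after this try block has completed.
  trapped.push('try-catch: ' + err.message);
}

outcome.then(function () {
  console.log(trapped);
  // [ 'catch clause: bug in success callback for /gs/example.json' ]
});
```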

Creating a Database Client

All the interactions of your application with MarkLogic Server are through a marklogic.DatabaseClient object. Each database client manages a connection by one user to a REST API instance and a particular database. Your application can create multiple database clients for connecting to different REST API instances, connecting to different databases, or connecting as different users.

If you use multi-statement transactions and multiple databases, note that the database context in which you perform an operation as part of a multi-statement transaction, or in which you commit or roll back a transaction, must be the same as the database context in which the transaction was created.

To create a database client, call marklogic.createDatabaseClient with a parameter that describes the connection details. For example, the following code creates a database client attached to the REST API instance listening on the default host and port (localhost:8000), using the default database associated with the instance, and digest authentication. The connection authenticates as user 'me' with password 'mypwd'.

var ml = require('marklogic');
var db = ml.createDatabaseClient({user:'me', password:'mypwd'});

The connection details must include at least a username and password and can include additional properties. The following table lists all the properties you can include in the connection object passed to createDatabaseClient.

| Property Name | Default Value | Description |
|---|---|---|
| host | localhost | A MarkLogic Server host with a configured REST API instance. |
| port | 8000 | The port on which the REST API instance listens. |
| database | the default database associated with the REST instance | The database against which document operations and queries are performed. Specifying a database other than the REST API instance default requires the xdmp-eval-in privilege. For details, see Evaluating Requests Against a Different Database. |
| authType | digest | The authentication method to use in establishing the connection. Allowed values: basic, digest, digestbasic, application-level, or kerberos-ticket. This must match the authentication method configured on the REST API instance. For details, see the Understanding and Using Security Guide. |
| ssl | false | Whether or not to establish an SSL connection. For details, see Configuring SSL on App Servers in the Administrator's Guide. When set to true, you can include additional SSL properties on the connection object. These are passed through to the agent. For a list of these properties, see http://nodejs.org/api/https.html#https_https_request_options_callback |
| agent | max of 10 free sockets; total of 50 sockets kept alive for 60 seconds | A connection pooling agent. |

For details, see marklogic.createDatabaseClient in the Node.js API Reference and Administering REST API Instances.
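To see how the documented defaults combine with the properties you supply, consider the following sketch. The effectiveConnection helper and the DOCUMENTED_DEFAULTS object are hypothetical and exist only for illustration. The library applies its own defaults when you call createDatabaseClient, and the host name, port, and database name in the second connection object are made-up values.

```javascript
// Defaults as listed in the connection property table. This object and the
// helper below are hypothetical; they are not part of the Node.js Client API.
const DOCUMENTED_DEFAULTS = {
  host: 'localhost',
  port: 8000,
  authType: 'digest',
  ssl: false
};

// Merge the supplied properties over the documented defaults.
function effectiveConnection(connInfo) {
  if (!connInfo.user || !connInfo.password) {
    throw new Error('user and password are required');
  }
  return Object.assign({}, DOCUMENTED_DEFAULTS, connInfo);
}

// One connection object relies on the defaults. The other targets a
// different host, port, and database (all made-up values).
const local = effectiveConnection({ user: 'me', password: 'mypwd' });
const reporting = effectiveConnection({
  host: 'ml.example.com',
  port: 8003,
  database: 'reporting-db',
  user: 'me',
  password: 'mypwd'
});

console.log(local.host + ':' + local.port);         // localhost:8000
console.log(reporting.host + ':' + reporting.port); // ml.example.com:8003
```

Each resulting object could be passed to createDatabaseClient to obtain a separate database client, which is how one application can work with more than one REST API instance, database, or user.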

Using the Examples in This Guide

All requests to MarkLogic Server using the Node.js Client API go through a DatabaseClient object. Therefore, all the examples begin by creating such an object. Creating a DatabaseClient requires you to specify MarkLogic Server connection information such as host, port, user, and password.

Most of the examples in this guide abstract away the connection details by requiring a module named my-connection.js that exports a connection object suitable for use with marklogic.createDatabaseClient. This encapsulation is only done for convenience. You are not required to do likewise in your application.

For example, the following statements appear near the top of each example in this guide:

var my = require('./my-connection.js');
var db = marklogic.createDatabaseClient(my.connInfo);

To use the examples you should first create a file named my-connection.js with the following contents. This file should be co-located with any scripts you create by copying the examples in this guide.

module.exports = {
  connInfo: {
    host: 'localhost',
    port: 8000,
    user: 'your-ml-username',
    password: 'your-ml-user-password'
  }
};

Modify the connection details to match your environment. You must modify at least the user and password properties. Most examples require a user with the rest-reader and/or rest-writer role or equivalent, but some operations require additional privileges. For details, see Security Requirements.

If you do not create my-connection.js, modify the calls to marklogic.createDatabaseClient in the examples to provide connection details in another way.
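One possible alternative to my-connection.js is to read the connection details from environment variables. The following is a minimal sketch of that approach. The connInfoFromEnv helper and the ML_HOST, ML_PORT, ML_USER, and ML_PASSWORD variable names are hypothetical and are not defined by the Node.js Client API.

```javascript
// Build connection details from environment-style settings instead of
// my-connection.js. The ML_* variable names are hypothetical.
function connInfoFromEnv(env) {
  if (!env.ML_USER || !env.ML_PASSWORD) {
    throw new Error('Set ML_USER and ML_PASSWORD before running the examples');
  }
  return {
    host: env.ML_HOST || 'localhost',
    port: parseInt(env.ML_PORT || '8000', 10),
    user: env.ML_USER,
    password: env.ML_PASSWORD
  };
}

// Explicit values are used here so the sketch runs anywhere. In a real
// script you would pass process.env instead.
const connInfo = connInfoFromEnv({
  ML_USER: 'me',
  ML_PASSWORD: 'mypwd',
  ML_PORT: '8003'
});

console.log(connInfo.host + ':' + connInfo.port); // localhost:8003
```

The resulting object has the same shape as my.connInfo, so it can be passed to marklogic.createDatabaseClient in place of it.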
