This chapter covers the following topics related to performing relational operations on indexed values and documents using the Optic capabilities of the Node.js Client API:
The Optic API enables you to perform relational operations on indexes and documents. For example, you can extract data in row format, perform joins, and perform relational queries on XML and JSON documents. You can also extract a row view of data from other sources, such as lexicons and semantic triples.
The Optic capabilities of the Node.js Client API closely mirror the server-side Optic API described in Optic API for Multi-Model Data Access in the Application Developer's Guide. Refer to that guide for conceptual details.
The usage model for an Optic query in Node.js is as follows: build a query plan using the planBuilder interface (for details, see Generating a Plan), then execute the plan with rows.query or rows.queryAsStream. Execution of a plan can yield a row set that satisfies any of several common use cases:
In addition to executing a query plan, you can also perform the following Optic-related operations:
The following are the key Optic interfaces in the Node.js Client API:
| Interface or Method | Description |
|---|---|
| marklogic.planBuilder | Use a planBuilder to construct an Optic plan to be evaluated using rows.query. For details, see Generating a Plan. |
| DatabaseClient.rows | An object that exposes methods for evaluating Optic plans, such as query, queryAsStream, and explain. |
| rows.query, rows.queryAsStream | Execute an Optic plan and return the results specified by the plan, in the form of a row set. For details, see Invoking a Plan and Streaming Row Data. |
| rows.explain | Generate an Optic API execution plan that expresses the logical dataflow of a plan as a sequence of atomic operations. For details, see Generating an Execution Plan. |
| planBuilder.export | Generate a JavaScript object representation of the Abstract Syntax Tree for a plan, which can be serialized for later use. For details, see Serializing a Plan. |
The examples in this chapter use the data, templates, and a plan from SQL on MarkLogic Server Quick Start in the SQL Data Modeling Guide. If you want to run the examples, you should use the SQL quick start to set up your environment. See the following topics in the SQL Data Modeling Guide:
If you set up a database named SQLdata, as directed in the Quick Start, then add a database property to your DatabaseClient connection information. For example, your my-connection.js module should be similar to the following when running the Optic examples:
module.exports = {
  connInfo: {
    host: 'localhost',
    port: 8000,
    user: your-ml-username,
    password: your-ml-user-password,
    database: 'SQLdata'
  }
};
For more details on configuring a DatabaseClient for running the examples, see Using the Examples in This Guide.
Use marklogic.planBuilder to construct an Optic query plan. Usually, you then use the plan to extract row data from MarkLogic. You can also generate a plan explanation or serialize a built plan.
An Optic plan defines a pipeline of relational operations to be applied to a row set. Constructing a plan usually takes the following form:

1. Access the initial row set with one of the planBuilder.from* data accessor methods. For example, planBuilder.fromLiterals, planBuilder.fromView, planBuilder.fromTriples, or planBuilder.fromLexicons.
2. Refine the row set by chaining operations such as select, joinInner, where, and orderBy.
3. Optionally, process the final row set with planBuilder.PreparePlan.map or planBuilder.PreparePlan.reduce. The mapper or reducer function runs on MarkLogic.

For example, the following code snippet builds a plan to list the employee ID, first name, and last name of all employees, in order of employee ID, using the configuration and data from the SQL on MarkLogic Server Quick Start chapter in the SQL Data Modeling Guide.
const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const pb = marklogic.planBuilder;

pb.fromView('main', 'employees')
  .select(['EmployeeID', 'FirstName', 'LastName'])
  .orderBy('EmployeeID')
The plan is not executed until you process it with rows.query or rows.queryAsStream. For details, see Invoking a Plan.
For details on the logical structure of a plan and the available operators, see Objects in an Optic Pipeline in the Application Developer's Guide. The Node.js planBuilder interface exposes methods with the same names and purpose as the Server-Side JavaScript Optic API.
The planBuilder interface includes namespaces that expose proxies for many server-side operations. For example, you can call cts query constructors or server-side functions in the fn, xdmp, or map namespaces. The proxy call acts as a placeholder in the plan that translates into an equivalent invocation of the server-side function during plan execution.
For example, the following code constructs a plan that includes cts query constructor proxy calls:
const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const pb = marklogic.planBuilder;

db.rows.query(
  pb.fromView('main', 'employees')
    .where(pb.cts.andQuery([
      pb.cts.wordQuery('Senior'),
      pb.cts.wordQuery('Researcher')]))
    .select(['FirstName', 'LastName', 'Position']),
  {columnTypes: 'header'}
).then( function(rows) {
  console.log(JSON.stringify(rows, null, 2));
}).catch( function(error) {
  console.log(JSON.stringify(error, null, 2));
});
For more information on the proxy functions, see the planBuilder API Reference and the following topics in the Application Developer's Guide:
For details on the behavior of a particular server-side function, see the MarkLogic Server-Side JavaScript Function Reference.
Use rows.query or rows.queryAsStream to execute a plan and generate a row set. You can also pass in an options object to control the output format, specify parameter bindings, or pin the query to a specific point in time.
Unlike most other Node.js Client API operations, query and queryAsStream do not support a result method. Instead, the plan is sent to MarkLogic for evaluation when you invoke the then or on methods. See the examples below.
To perform a query at a specific point in time, pass a Timestamp object as the value of the timestamp property of the options parameter. For more details, see Performing Point-in-Time Operations.
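As a minimal sketch of a point-in-time query, assuming a Timestamp object obtained from DatabaseClient.createTimestamp and a previously built plan in the plan variable:

// Sketch only: capture a timestamp, then pass it in the options parameter
// so the query is pinned to that point in time. 'plan' is assumed to be
// a plan built with planBuilder, and 'db' a configured DatabaseClient.
const timestamp = db.createTimestamp();

db.rows.query(plan, {timestamp: timestamp})
  .then(function(rows) {
    // rows reflect the database state at the captured timestamp
    console.log(JSON.stringify(rows, null, 2));
  })
  .catch(function(error) {
    console.log(JSON.stringify(error, null, 2));
  });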
For example, the following code constructs a plan, executes it on MarkLogic, and then processes the resulting row set. The plan and results are based on the configuration and data from the SQL on MarkLogic Server Quick Start chapter in the SQL Data Modeling Guide.
const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const pb = marklogic.planBuilder;

db.rows.query(
  pb.fromView('main', 'employees')
    .select(['EmployeeID', 'FirstName', 'LastName'])
    .orderBy('EmployeeID')
).then( function(rows) {
  console.log(JSON.stringify(rows, null, 2));
}).catch( function(error) {
  console.log(JSON.stringify(error, null, 2));
});
The following example is the equivalent code using queryAsStream to stream the results as JavaScript objects instead of returning them all at once. Several stream modes are available; for details, see Streaming Row Data.
const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const pb = marklogic.planBuilder;

db.rows.queryAsStream(
  pb.fromView('main', 'employees')
    .select(['EmployeeID', 'FirstName', 'LastName'])
    .orderBy('EmployeeID'),
  'object'
).on('data', function(rows) {
  console.log(JSON.stringify(rows, null, 2));
}).on('end', function() {
  console.log('done');
});
The plan returns results similar to the following. You can use the options parameter of the query and queryAsStream methods to customize the output format; for details, see Configuring Row Set Format.
{ "columns": [ { "name": "main.employees.EmployeeID" }, { "name": "main.employees.FirstName" }, { "name": "main.employees.LastName" } ], "rows": [ { "main.employees.EmployeeID": { "type": "xs:integer", "value": 1 }, "main.employees.FirstName": { "type": "xs:string", "value": "John" }, "main.employees.LastName": { "type": "xs:string", "value": "Widget" } }, { "main.employees.EmployeeID": { "type": "xs:integer", "value": 2 }, "main.employees.FirstName": { "type": "xs:string", "value": "Jane" }, "main.employees.LastName": { "type": "xs:string", "value": "Lead" } }, { "main.employees.EmployeeID": { "type": "xs:integer", "value": 3 }, "main.employees.FirstName": { "type": "xs:string", "value": "Steve" }, "main.employees.LastName": { "type": "xs:string", "value": "Manager" } }, ... ] }
When you invoke a query plan as described in Invoking a Plan, the result is a row set. You can use the options parameter of rows.query and rows.queryAsStream to configure the layout of the row set.
This section covers the following topics related to how configuration options affect the layout of a row set.
You can fetch row data in the form of JSON objects, JSON arrays, XML elements, or CSV (comma separated values).
For JSON and XML, you can also control whether column type information is part of each row or only part of the column header data. You should include type information with each row unless you know each of your columns contains values of the same type.
Use the following options to configure the row set format and layout; a usage sketch follows the list.

- format: Specify the overall format as json, xml, or csv. Default: json.
- structure: When the format is JSON, specify whether each row should be represented as an object or an array. Default: object.
- columnTypes: Specify whether to embed column value type information in each row or only in the column header. Only meaningful when format is json or xml. Default: rows.
- complexValues: Specify whether to return column values with non-atomic type inline or by reference. Only meaningful with rows.queryAsStream. Default: inline.

The option settings can yield different layouts when returning a row set as a single document with query versus an object stream with queryAsStream. The rest of this section explores how various option settings interact for query and queryAsStream.
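As a sketch of how these options combine (reusing the db variable from the earlier examples, with the plan variable standing in for a previously built plan), the following call requests JSON rows laid out as arrays, with type information confined to the column header:

// Sketch only: an options object built from the options described above.
const options = {
  format: 'json',          // overall format: json, xml, or csv
  structure: 'array',      // represent each row as a JSON array instead of an object
  columnTypes: 'header'    // put type information in the header, not in each row
};

db.rows.query(plan, options)
  .then(function(rows) {
    console.log(JSON.stringify(rows, null, 2));
  })
  .catch(function(error) {
    console.log(JSON.stringify(error, null, 2));
  });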
This section illustrates how various configuration option settings affect the data passed to your handler when you use rows.query and the promise handling pattern. When you use this pattern, your response handler receives the entire row set at once. For more details on promises, see Promise Result Handling Pattern.
You can also stream a row set using queryAsStream. The output is similar, but your handler receives data in chunks. For details, see Streaming Row Data.
When you fetch rows as a JSON array, the first item in the array is the column header data. When you fetch rows as CSV, the first record is the column header data.
The following table summarizes the output produced by various option combinations when fetching rows using rows.query. Each example displays the column header data (where appropriate) and one row.
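For instance, a minimal sketch requesting the same rows as CSV (the plan variable is assumed to be a previously built plan, and db a configured DatabaseClient) might look like this:

// Sketch only: fetch the row set as CSV text; the first record is the column header.
db.rows.query(plan, {format: 'csv'})
  .then(function(csv) {
    console.log(csv);
  })
  .catch(function(error) {
    console.log(JSON.stringify(error, null, 2));
  });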
You can use rows.queryAsStream to stream a row set from MarkLogic. Depending on the streaming mode, data is returned to your handler in chunks of bytes, JavaScript objects, or JSON text sequence records.

Row sets formatted as XML or CSV can only be streamed in chunked mode. Row sets formatted as JSON can be streamed in any mode.
When you fetch a row set with queryAsStream in object mode, each invocation of your on('data') handler receives either a JavaScript object or an array, depending on the value of the structure option. Column header data is passed on the first invocation.

You can only use object mode when streaming row data as JSON. To stream XML or CSV row set data, use chunked mode. For details, see Chunked Mode Streaming.
The default streaming mode is chunked. To stream results in object mode, set the streamType parameter of queryAsStream to 'object':
db.rows.queryAsStream(plan, 'object', options)
The following table illustrates the output produced by various option combinations when fetching rows using rows.queryAsStream with the stream in object mode. Each example includes the column header object or array passed in on the first invocation of your data handler, followed by an example data row.
When you fetch a row set with queryAsStream in chunked mode, your handler receives control when some number of bytes are read.
Each chunk provided to your handler is a buffer of bytes, formatted as serialized JSON, XML, or CSV, depending on your option settings. The aggregate byte stream is formatted similarly to the layouts shown in Layout Examples.
The default stream mode is chunked. You can also set it explicitly. For example:
db.rows.queryAsStream(plan, 'chunked', options)
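Because chunked mode delivers a byte stream, a common pattern is to pipe it directly to another stream. The following is a minimal sketch under that assumption; the plan variable, the CSV option setting, and the output file name out.csv are illustrative only:

// Sketch only: stream CSV-formatted rows straight to a file in chunked mode.
const fs = require('fs');

db.rows.queryAsStream(plan, 'chunked', {format: 'csv'})
  .pipe(fs.createWriteStream('./out.csv'))
  .on('finish', function() {
    console.log('done');
  });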
When you fetch a row set with queryAsStream in sequence mode, your handler receives control when a record in JSON text sequence format is available. Each record contains the byte representation of either a JSON object or a JSON array, depending on your option settings.

You can only use sequence mode when streaming row data as JSON. To stream XML or CSV row set data, use chunked mode. For details, see Chunked Mode Streaming.
Each record begins with a record separator (0x1E) in the first byte and ends with an ASCII line feed character (0x0A), as described in the JSON text sequence specification (RFC 7464).
The default stream mode is chunked. To activate sequence mode, set the streamType parameter of queryAsStream to 'sequence':
db.rows.queryAsStream(plan, 'sequence', options)
Each record is formatted like the examples in Layout Examples when the format option is set to 'json' and the structure option is set to either 'object' or 'array'.
For example, the following code produces records formatted as JSON objects, with column data in the header (first) record. For illustrative purposes, each record is converted to a string and stripped of the record separators before displaying it on the console.
const marklogic = require('marklogic');
const my = require('./my-connection.js');
const fs = require('fs');

const db = marklogic.createDatabaseClient(my.connInfo);

const plan = fs.readFileSync('./plan.json', 'utf8');
const options = {
  format: 'json',
  structure: 'object',
  columnTypes: 'header'
};

db.rows.queryAsStream(plan, 'sequence', options)
  .on('data', function(record) {
    const asString = record.toString();
    console.log(asString.substring(1, asString.length - 2));
  }).on('end', function() {
    console.log('done');
  });
This example returns output similar to the following, where each record contains an object that represents one row. The first record contains the column header data.
{"columns":[{"name":"main.employees.EmployeeID","type":"xs:integer"},{"name":"main.employees.FirstName","type":"xs:string"},{"name":"main.employees.LastName","type":"xs:string"}]} {"main.employees.EmployeeID":1,"main.employees.FirstName":"John","main.employees.LastName":"Widget"} ...
If you set the structure option to array, then each record contains an array that represents one row. The first record contains the column header data.
[{"name":"main.employees.EmployeeID","type":"xs:integer"},{"name":"main.employees.FirstName","type":"xs:string"},{"name" :"main.employees.LastName","type":"xs:string"}] [1,"John","Widget"] ...
If your plan uses placeholder parameters, use the bindings option to specify values for the placeholders when you invoke the plan.

The bindings option value is a JavaScript object where the property names correspond to parameter names, and the values are either a parameter value or an object that specifies a value plus a type or language key. That is, each property must take one of the forms shown below:
{bindings: {
  paramName1: value1,
  paramName2: {value: value2, type: typeNameString},
  paramName3: {value: value3, lang: languageCode}
}}
The type name can be any derivation of xs:atomicType other than xs:QName. For example, you can use type names such as 'string', 'integer', and 'decimal'. If you do not specify a type, the value is interpreted as a string. Use a language code to bind language-tagged strings.
For example, if you defined a placeholder variable named start in your plan definition, then you could specify a value for start in the bindings option in any of the following ways:
{bindings: { start: 'apple' }}
{bindings: { start: {value: 'apple', type: 'string'} }}
{bindings: { start: {value: 'apple', lang: 'en'} }}
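To tie this together, the following is a minimal sketch of defining and invoking a plan with a placeholder, reusing the db and pb variables from earlier examples. The use of pb.param('start'), pb.gt, and pb.col here is an illustrative assumption, not taken from the examples above:

// Sketch only: a plan that references a placeholder parameter named 'start',
// bound to a concrete value at invocation time via the bindings option.
const plan =
  pb.fromView('main', 'employees')
    .where(pb.gt(pb.col('EmployeeID'), pb.param('start')))
    .select(['EmployeeID', 'FirstName', 'LastName']);

db.rows.query(plan, {bindings: {start: {value: 1, type: 'integer'}}})
  .then(function(rows) {
    console.log(JSON.stringify(rows, null, 2));
  })
  .catch(function(error) {
    console.log(JSON.stringify(error, null, 2));
  });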
If your row set includes column values with non-atomic type, such as XML elements, JSON arrays, JSON objects, binary content, or text nodes, they are serialized inline by default. For example, the following row contains a column named complex whose value is a serialized XML element:
{ "id":{ "type":"xs:integer", "value":1 }, "complex":{ "type":"element", "value":"<root><A>Detail 1</A><B>2015-12-01</B></root>" }}
You can use the complexValues option of rows.queryAsStream to specify whether such complex values should be included inline (as shown above) or by reference. You cannot configure the handling of complex values when executing a plan using rows.query.
For example, the following option setting specifies complex values should be returned by reference.
{complexValues: 'reference'}
For example, suppose your row set includes a column named node whose value is an XML element. The default behavior (inline) yields a serialized string of the following form for the node value:
"node": { "type": "element", "value": "<alpha><a>true</a></alpha>" }
If you set the complexValues option to reference, then the same column value is rendered as follows:
"node": { "contentType": "application/xml", "format": "xml", "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<beta><b>false</b></beta>" }
The node value is actually returned as a reference to a complex value by MarkLogic, but the reference is resolved and expanded for you in the results returned from the Node.js Client API.
The details of the format depend on the output formatting options you choose and the type of complex value. For example, if the complex value is a binary document, then it would be returned as a base64 encoded value inline, but as a Node.js Buffer when fetched by reference.
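As a brief sketch of the option in context (the plan variable is assumed to be a plan whose rows include non-atomic column values, and db a configured DatabaseClient):

// Sketch only: return complex column values by reference rather than inline.
// The complexValues option is only honored by queryAsStream, as noted above.
db.rows.queryAsStream(plan, 'object', {complexValues: 'reference'})
  .on('data', function(row) {
    console.log(JSON.stringify(row, null, 2));
  })
  .on('end', function() {
    console.log('done');
  });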
Use rows.explain to generate an execution plan that expresses the logical flow of an Optic plan as a sequence of atomic operations. For more details, see Optic Execution Plan in the Application Developer's Guide.
For example, the following code generates an execution plan based on the data from SQL on MarkLogic Server Quick Start in the SQL Data Modeling Guide.
const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const pb = marklogic.planBuilder;

db.rows.explain(
  pb.fromView('main', 'employees')
    .select(['EmployeeID', 'FirstName', 'LastName'])
    .orderBy('EmployeeID')
    .limit(3)
).then( function(rows) {
  console.log(JSON.stringify(rows, null, 2));
}).catch( function(error) {
  console.log(JSON.stringify(error, null, 2));
});
Use the export method of a plan object to generate a serializable version of the plan that can be saved to persistent storage for later use. The export method produces a JavaScript object representation of the plan, which can be serialized to JSON using, for example, JSON.stringify.
For example, the following code generates a JSON serialization of a plan and logs it to the console:
const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const pb = marklogic.planBuilder;

console.log(JSON.stringify(
  pb.fromView('main', 'employees')
    .select(['EmployeeID', 'FirstName', 'LastName'])
    .orderBy('EmployeeID')
    .limit(3)
    .export(),
  null, 2));
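If you want to persist the serialized plan rather than log it, a minimal sketch follows; the file name plan.json is only an illustrative choice:

// Sketch only: serialize the exported plan and write it to a local file.
const fs = require('fs');

const serializedPlan = JSON.stringify(
  pb.fromView('main', 'employees')
    .select(['EmployeeID', 'FirstName', 'LastName'])
    .orderBy('EmployeeID')
    .limit(3)
    .export());

fs.writeFileSync('./plan.json', serializedPlan);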
To subsequently use a serialized plan, convert it back into a JavaScript object and pass the object to rows.query or rows.queryAsStream. For example:
const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const pb = marklogic.planBuilder;

// Read the serialized plan from a file or other storage, then...
const serializedPlan = ...;
const thePlan = JSON.parse(serializedPlan);

db.rows.query(thePlan)
  .then( function(rows) {
    console.log(JSON.stringify(rows, null, 2));
  }).catch( function(error) {
    console.log(JSON.stringify(error, null, 2));
  });