MarkLogic Server 11.0 Product Documentation
Node.js Application Developer's Guide
— Chapter 4

Querying Documents and Metadata

This chapter covers the following topics related to querying database content and metadata using the Node.js Client API:

Query Interface Overview

The Node.js Client API includes interfaces that enable you to search documents and query lexicons using a variety of query types. The following interfaces support query operations:

Method Description
marklogic.queryBuilder
Construct a string query, QBE, or structured query to use with DatabaseClient.documents.query. For details, see Understanding the queryBuilder Interface.
DatabaseClient.documents.query
Search for documents that match a string query, structured query, combined query, or Query By Example (QBE), returning a search results summary, matching documents, or both. For details, see Searching with Query By Example, Searching with String Queries, or Searching with Structured Queries.
DatabaseClient.queryCollection
Search for the persisted JavaScript objects in a collection, and return the matching objects.
marklogic.valuesBuilder
Construct a values query to use with DatabaseClient.values.read. For details, see Querying Lexicons and Range Indexes.
DatabaseClient.values.read
Query the values or tuples (co-occurrences) in lexicons or range indexes. For details, see Querying Lexicons and Range Indexes.
DatabaseClient.documents.suggest
Match strings in a lexicon to provide search term completion suggestions. For details, see Generating Search Term Completion Suggestions.
DatabaseClient.config.query
Manage query related customizations stored in the modules database, including search result transforms, snippeters, and string query parsers.

Introduction to Search Concepts

This section provides a brief introduction to search concepts and the capabilities exposed by the Node.js Client API. Search concepts are covered in detail in the Search Developer's Guide.

You can query a MarkLogic Server database in two ways: by searching document contents and metadata, or by querying value and word lexicons created from your content. This topic deals with searching content and metadata. For lexicons, see Querying Lexicons and Range Indexes.

This section covers the following topics:

Search Overview

Performing a search consists of the following basic phases:

  1. Build up a set of criteria that defines your desired result set.
  2. Refine the result set by defining attributes such as the number of results to return or the sort order.
  3. Search the database or a lexicon.

The Node.js Client API includes the marklogic.queryBuilder interface, which abstracts away many of the structural details of defining and refining your query. For details, see Understanding the queryBuilder Interface. Use DatabaseClient.documents.query to execute your query operation.

MarkLogic Server supports many different kinds of search criteria, such as matching phrases, specific values, ranges of values, and geospatial regions. These and other query types are explored in Types of Query. You can express your search criteria using one of several query styles; for details, see Query Styles.

Query result refinements include whether or not to return entire documents, content snippets, facet information, and/or aggregate results. You can also define your own snippeting algorithm or custom search result transform. For details, see Refining Query Results.

To perform iterative searches over the database at a fixed point in time, pass a Timestamp parameter in your query call. For details, see Performing Point-in-Time Operations.

You can also analyze lexicons created from your documents using marklogic.valuesBuilder and DatabaseClient.values.read. For details, see Querying Lexicons and Range Indexes.

Query Styles

When you search document content and metadata using the Node.js Client API, you can express your search criteria using the following query styles. The syntax of each style is different, and the expressive power varies.

  • Query By Example (QBE): Search documents by modeling the structure of the documents you want to match. For details, see Searching with Query By Example and queryBuilder.byExample.
  • String Query: Search documents and metadata using a Google-style query string such as a user enters in a search box. For example, a query of the form cat AND dog matches all documents containing the phrases cat and dog. For details, see Searching with String Queries and queryBuilder.parsedFrom.
  • Structured Query: Search documents and metadata by building up complex queries from a rich set of sub-query types. For details, see Searching with Structured Queries.
  • Combined Query: Search documents and metadata using a query object that enables you to combine the other query styles plus query options. Combined query is an advanced feature for users who prefer to build queries manually. For details, see Searching with Combined Query.

All the query styles support a rich set of search features, but generally, QBE is more expressive than string query, structured query is more expressive than QBE, and combined query is more expressive than any of the others since it is a superset. String query and QBE are designed for ease of use and cover a wide range of search needs. However, they do not provide the same level of control over the search as structured query and combined query do.

The following diagram illustrates this tradeoff, at a high level.

You can combine a string query and structured query criteria in a single query operation. QBE cannot be combined with the other two query styles.

For more details, see Overview of Search Features in MarkLogic Server in the Search Developer's Guide.

Types of Query

A query encapsulates your search criteria. No matter what query style you use (string, QBE, or structured), your criteria fall into one or more of the query types described in this section.

The following query types are basic search building blocks that describe the content you want to match.

  • Range: Match values that satisfy a relational expression. You can express conditions such as less than 5 or not equal to true. A range query must be backed by a range index.
  • Value: Match an entire literal value, such as a string or number, in a specific JSON property or XML element. By default, value queries use exact match semantics. For example, a search for mark will not match Mark Twain.
  • Word: Match a word or phrase in a specific JSON property or XML element or attribute. In contrast to a value query, a word query will match a subset of a text value and does not use exact match semantics by default. For example, a search for mark will match Mark Twain, in the specified context.
  • Term: Match a word or phrase anywhere it appears. In contrast to a value query, a term query will match a subset of a text value and does not use exact match semantics by default. For example, a search for mark will match Mark Twain.

Additional query types enable you to build up complex queries by combining the basic content queries with each other and with criteria that add additional constraints. The additional query types fall into the following categories.

  • Logical Composers: Express logical relationships between criteria. You can build up compound logical expressions such as x AND (y OR z).
  • Document Selectors: Select documents based on collection, directory, or URI. For example, you can express criteria such as x only when it occurs in documents in collection y.
  • Location Qualifiers: Further limit results based on where the match appears. For example, x only when contained in JSON property z, or x only when it occurs within n words of y, or x only when it occurs in a document property.

With no additional configuration, string queries support term queries and logical composers. For example, the query string cat AND dog is implicitly two term queries, joined by an and logical composer.

However, you can easily extend the expressive power of a string query using parse bindings to enable additional query types. For example, if you use a range query binding to tie the identifier cost to a specific indexed JSON property, you enable string queries of the form cost GT 10. For details, see Searching with String Queries.

In a QBE, content matches are value queries by default. For example, a QBE search criteria of the form {'my-key': 'desired-value'} is implicitly a value query for the JSON property 'my-key' whose value is exactly 'desired-value'. However, the QBE syntax includes special property names that enable you to construct other types of query. For example, use $word to create a word query instead of a value query: {'my-key': {'$word': 'desired-value'}}. For details, see Searching with Query By Example.
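
As a minimal sketch of the difference between the two forms, the following helpers build the criteria objects you would pass to qb.byExample. The names valueCriterion and wordCriterion are hypothetical and are not part of the Node.js Client API; only the object shapes come from the QBE syntax described above.

```javascript
// Hypothetical helpers; only the object shapes come from the QBE syntax.

// Value query: the property value must match the given value exactly.
function valueCriterion(key, value) {
  return { [key]: value };
}

// Word query: the property value must contain the given word or phrase.
function wordCriterion(key, text) {
  return { [key]: { $word: text } };
}

console.log(JSON.stringify(valueCriterion('my-key', 'desired-value')));
// {"my-key":"desired-value"}
console.log(JSON.stringify(wordCriterion('my-key', 'desired-value')));
// {"my-key":{"$word":"desired-value"}}
```

Either object can be passed to the builder, for example qb.byExample(wordCriterion('my-key', 'desired-value')).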

For structured query, the queryBuilder interface includes builders corresponding to all the query types. You can use these builders in combination with each other. Every queryBuilder method that returns a queryBuilder.Query creates a query or sub-query that falls into one of the above query categories. For details, see Searching with Structured Queries.
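
The following sketch shows one way the builders might be combined: a term query, a logical composer expressing x AND (y OR z), and a document selector. The function name buildCompoundQuery, the collection name contributors, and the terms x, y, and z are illustrative assumptions. The function receives the builder as a parameter, which is expected to be marklogic.queryBuilder.

```javascript
// Sketch: "x AND (y OR z)", restricted to documents in one collection.
// `qb` is expected to be marklogic.queryBuilder. The collection name and
// the search terms are placeholders, not values from a real database.
function buildCompoundQuery(qb) {
  return qb.where(
    qb.and(
      qb.collection('contributors'),       // document selector
      qb.term('x'),                        // term query
      qb.or(qb.term('y'), qb.term('z'))    // logical composer
    )
  );
}

// Usage, assuming a DatabaseClient `db` created with
// marklogic.createDatabaseClient:
//   db.documents.query(buildCompoundQuery(marklogic.queryBuilder)).result(...)
```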

Indexing

Range queries must be backed by an index. Even queries that do not strictly require a backing index can benefit from indexing by enabling unfiltered searches; for details, see Fast Pagination and Unfiltered Searches in the Query Performance and Tuning Guide.

You can create range indexes using the Admin Interface, the XQuery Admin API, and the REST Management API. You can also use the Configuration Manager or REST Packaging API to copy index configurations from one database or host to another. For details, see the following references:

Use the element range index interfaces to create indexes on JSON properties. For purposes of index configuration, a JSON property is equivalent to an XML element in no namespace.

You can use the binding feature of the Node.js Client API to bind an index reference to a name that can be used in string queries. For details, see Using Constraints in a String Query and Generating Search Term Completion Suggestions. Values queries on lexicons and indexes also rely on index references. For details, see Building an Index Reference.

Understanding the queryBuilder Interface

Performing a search using the marklogic.queryBuilder interface consists of the following phases:

  1. Build up a set of search criteria, creating a query that defines your desired result set.
  2. Refine the result set by defining attributes such as the number of results to return or the sort order.
  3. Search the database.

The following diagram illustrates using the Node.js Client API to define and execute a search using queryBuilder and DatabaseClient.documents.query. In the diagram, qb represents a queryBuilder object, and db represents a DatabaseClient object. The functions in italics are optional.

The following procedure expresses these steps in more detail:

  1. Define your search criteria using string query (qb.parsedFrom), QBE (qb.byExample), or structured query (other builders, such as qb.word, qb.range, and qb.or). For example:
    qb.parsedFrom("dog")

    You can pass a string query and one or more structured builders together, in which case they are AND'd together. You cannot combine a QBE with the other query types.

  2. Encapsulate your criteria in a query by passing them to queryBuilder.where. This produces a queryBuilder.BuiltQuery object suitable for passing to DatabaseClient.documents.query, with or without further result set refinement.
    qb.where(qb.parsedFrom("dog"))
  3. Optionally, apply further result set refinements to your query. Any or all of the following steps can be skipped, depending on the results you want.
    1. Use queryBuilder.slice to select a subset of documents from the result set and/or specify a server-side transformation to apply to the selected results. The default slice is the first 10 documents, with no transformations.
    2. Use queryBuilder.orderBy to specify a sort key and/or sorting direction.
    3. Use queryBuilder.calculate to request one or more aggregate calculations on the result set.
  4. Optionally, use queryBuilder.withOptions to add further refinements to your search, such as specifying low level search options or a transaction id, or requesting query debugging information.
  5. Perform the search by passing your final BuiltQuery object to the DatabaseClient.documents.query function. For example:
    db.documents.query(qb.where(qb.parsedFrom("dog")))
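
The steps above can be sketched as a single function. This is an illustration under stated assumptions rather than a definitive implementation: findDogs is a hypothetical name, and the DatabaseClient (db) and marklogic.queryBuilder (qb) objects are passed in so that the function does not depend on a particular connection. The optional withOptions step is omitted.

```javascript
// Sketch of the procedure above. `db` is a DatabaseClient and `qb` is
// marklogic.queryBuilder; `findDogs` is a hypothetical function name.
function findDogs(db, qb) {
  // Step 1: define the search criteria as a string query.
  const criteria = qb.parsedFrom('dog');

  // Step 2: encapsulate the criteria in a BuiltQuery.
  // Step 3: refine the result set to at most 5 documents.
  const query = qb.where(criteria).slice(0, 5);

  // Step 5: perform the search. Called without callbacks, result()
  // returns a Promise for the matching document descriptors.
  return db.documents.query(query).result();
}

// Usage, assuming a connection module like the one used in this chapter:
//   const marklogic = require('marklogic');
//   const db = marklogic.createDatabaseClient(require('./my-connection.js').connInfo);
//   findDogs(db, marklogic.queryBuilder).then(docs => console.log(docs.length));
```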

The following table contains examples of using queryBuilder to construct an equivalent query in each of the available query styles. The queries match documents containing both the phrases cat and dog. Notice that only the query building portion of the search varies based on the chosen query style.

Query Style Code Snippet
string
db.documents.query(
  qb.where(
    qb.parsedFrom('cat AND dog')
  ).orderBy(qb.sort('descending'))
  .slice(0,5)
)
QBE
db.documents.query(
  qb.where(
    qb.byExample({
      $and:[{$word:'cat'},{$word:'dog'}]
    })
  ).orderBy(qb.sort('descending'))
  .slice(0,5)
)
structured
db.documents.query(
  qb.where(
    qb.and(qb.term('cat'), qb.term('dog'))
  ).orderBy(qb.sort('descending'))
   .slice(0,5)
)
combined string and structured
db.documents.query(
  qb.where(
    qb.term('cat'), 
    qb.parsedFrom('dog')
  ).orderBy(qb.sort('descending'))
   .slice(0,5)
)

For details, see one of the following topics:

Searching with String Queries

A string query is a simple but powerful text string, usually corresponding to query text entered into your application by users via a search box. This section includes the following topics:

Introduction to String Query

The MarkLogic Server Search API default search grammar allows you to quickly construct simple searches such as cat, cat AND dog, or cat NEAR dog. Such a string query often represents query text entered into a search box by a user.

The default grammar supports operators such as AND, OR, NOT, and NEAR, plus grouping. For grammar details, see Searching Using String Queries in the Search Developer's Guide.
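
If your application assembles query text from several inputs, small helpers can keep the operators and grouping consistent. The following is a minimal sketch: the names phrase, allOf, and anyOf are hypothetical, and the sketch assumes that double quotes delimit a phrase in the default grammar. It does not escape quotes inside the input.

```javascript
// Hypothetical helpers for assembling query text in the default grammar.

// Wrap multi-word text in double quotes so it is treated as one phrase.
function phrase(text) {
  return `"${text}"`;
}

// Join sub-expressions with AND.
function allOf(parts) {
  return parts.join(' AND ');
}

// Join sub-expressions with OR, grouped in parentheses.
function anyOf(parts) {
  return `(${parts.join(' OR ')})`;
}

const queryText = allOf(['cat', anyOf(['dog', phrase('grey wolf')])]);
console.log(queryText); // cat AND (dog OR "grey wolf")
```

The resulting text can be passed to qb.parsedFrom like any other string query.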

The Node.js client supports string queries through the queryBuilder.parsedFrom method. For example, to construct a query that matches documents containing the phrases cat and dog, use the following queryBuilder call:

qb.parsedFrom('cat AND dog')

For details, see Example: Basic String Query and the Node.js API Reference.

By default, DatabaseClient.documents.query returns an array of document descriptors, one per matched document, including the document contents. You can further refine the search in various ways, such as controlling which and how many documents are returned, returning snippets and/or facets, and returning a result summary instead of entire documents. For details, see Refining Query Results.

The string grammar also supports the application of search constraints to query terms. For example, you can include a term of the form constraintName:value or constraintName relationalOp value to limit matches to cases where the value satisfies the constraint. ConstraintName is the name of a constraint you configure into your query.

For example, if you define a word constraint named location over a JSON property of the same name, then the string query location:oslo only matches the term oslo when it occurs in the value of the location property.

Similarly, if you define a range constraint over a number-valued property, bound to the name votes, then you can include relational expressions over the value of the property such as votes GT 5.

The Node.js client supports constraints in string queries through parse bindings that bind a constraint definition to the name usable in a query. Use the queryBuilder.parseBindings function to define such bindings. For example:

qb.parsedFrom(theQueryString, qb.parseBindings(binding definitions...))

For details, see Using Constraints in a String Query and Using a Custom Constraint Parser.

Example: Basic String Query

The following example script assumes the database is seeded with the data from Loading the Example Data. The script searches for all documents containing the phrase oslo.

const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const qb = marklogic.queryBuilder;

db.documents.query(
    qb.where(qb.parsedFrom('oslo'))
).result( function(results) {
  console.log(JSON.stringify(results, null, 2));
});

The search returns an array of document descriptors, one descriptor per matching document. Each descriptor includes the document contents.

For example, if the file string-search.js contains the above script, then the following command produces the results below. The search matches two documents, corresponding to contributors located in Oslo, Norway.

$ node string-search.js
[
  {
    "uri": "/contributors/contrib1.json",
    "category": "content",
    "format": "json",
    "contentType": "application/json",
    "contentLength": "230",
    "content": {
      "Contributor": {
        "userName": "souser10002@email.com",
        "reputation": 446,
        "displayName": "Lars Fosdal",
        "originalId": "10002",
        "location": "Oslo, Norway",
        "aboutMe": "Software Developer since 1987, mainly using Delphi.",
        "id": "sou10002"
      }
    }
  },
  {
    "uri": "/contributors/contrib2.json",
    "category": "content",
    "format": "json",
    "contentType": "application/json",
    "contentLength": "202",
    "content": {
      "Contributor": {
        "userName": "souser1000634@email.com",
        "reputation": 272,
        "displayName": "petrumo",
        "originalId": "1000634",
        "location": "Oslo, Norway",
        "aboutMe": "Developer at AspiroTV",
        "id": "sou1000634"
      }
    }
  }
]

To return a search summary instead of the document contents, use queryBuilder.withOptions to set categories to 'none'. For example:

db.documents.query(
    qb.where(qb.parsedFrom('oslo')).withOptions({categories: 'none'})
)

Now, the result is a search summary that includes a count of the number of matches (2), and snippets of the matching text in each document:

[{
  "snippet-format": "snippet",
  "total": 2,
  "start": 1,
  "page-length": 10,
  "results": [...snippets here...],
  "qtext": "oslo",
  "metrics": {
    "query-resolution-time": "PT0.005347S",
    "facet-resolution-time": "PT0.000067S",
    "snippet-resolution-time": "PT0.001523S",
    "total-time": "PT0.007753S"
  }
}]
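
The summary fields can drive paging in an application. The following sketch uses only the total, start, and page-length properties shown above to decide whether more matches remain beyond the current page. The function name hasMorePages is hypothetical, and the sample object reproduces just those fields.

```javascript
// Hypothetical helper: report whether matches remain beyond the current
// page, based on the total, start, and page-length summary properties.
function hasMorePages(summary) {
  const lastOnPage = summary.start + summary['page-length'] - 1;
  return lastOnPage < summary.total;
}

// The response is an array whose first item is the search summary.
const response = [{ total: 2, start: 1, 'page-length': 10, qtext: 'oslo' }];
console.log(hasMorePages(response[0])); // false
```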

You can also refine your results in other ways. For details, see Refining Query Results.

Using Constraints in a String Query

The string query interfaces enable you to create parse bindings that define how to interpret parts of the query. You can define a binding between a name and a search constraint so that when a query term is prefixed by the bound name, the associated constraint is applied to search for that term. You can create parse bindings on word, value, range, collection, and scope constraints.

For example, you can define a binding between the name rep and a constraint that limits the search to matching values in a JSON property named reputation. Then, if a string query includes a term of the form rep:value, the constraint is applied to the search for the value. Thus, the following term means find all occurrences of the reputation property where the value is 120:

rep:120

For details, see Using Relational Operators on Constraints in the Search Developer's Guide.

Range constraints, such as the constraint on reputation used here, must be backed by a corresponding range index. For details, see Indexing.

Follow these steps to create and apply parse bindings. For a complete example, see Example: Using Constraints in a String Query.

  1. Create a binding name specification by calling queryBuilder.bind or queryBuilder.bindDefault. For example, the following call creates a bind name specification for the name rep:
    qb.bind('rep')
  2. Create a binding between the name (or default) and a constraint by calling one of the queryBuilder binding builder methods (collection, range, scope, value, or word) and passing in the binding name specification. For example, the following call creates a binding between the name 'rep' and a value constraint on the JSON property named 'reputation'.
    qb.value('reputation', qb.bind('rep'))
  3. Bundle your bindings into a queryBuilder.ParseBindings object using queryBuilder.parseBindings. For example:
    qb.parseBindings(
      qb.value('reputation', qb.bind('rep')), ...more bindings..
    )
  4. Pass the parse bindings as the second parameter of queryBuilder.parsedFrom to apply them to a specific query. For example:
    qb.parsedFrom('rep:120',
      qb.parseBindings(
        qb.value('reputation', qb.bind('rep')), ...more bindings..
      )
    )

You can also create a binding that defines the behavior when the query string is empty, using queryBuilder.bindEmptyAs. You can elect to return all results or none. The default is none. Note that because a query without a slice specifier returns matching documents, setting the empty query binding to all-results can cause an empty query to retrieve all documents in the database.

The following example returns all search results because the query text is an empty string and empty query binding specifies all-results. Calling queryBuilder.slice ensures the query will return at most 5 documents.

db.documents.query(
  qb.where(
    qb.parsedFrom('',
      qb.parseBindings(
        qb.bindEmptyAs('all-results')
    ))
  ).slice(0,5)
)

Example: Using Constraints in a String Query

This example defines some custom parse binding rules and applies them to a string query based search. The example illustrates the capability described in Using Constraints in a String Query.

The example uses data derived from the marklogic-samplestack application. The seed data includes contributor JSON documents of the following form:

{ "com.marklogic.samplestack.domain.Contributor": {
    "userName": string,
    "reputation": number,
    "displayName": string,
    "originalId": string,
    "location": string,
    "aboutMe": string,
    "id": string
} }

The example script applies the following parse bindings to the search:

  • The term rep corresponds to the value of the reputation JSON property. It is bound to a range constraint, so it can be used with relational expressions such as rep GT 100. This constraint is expressed by the following binding definition:
    qb.range('reputation', qb.datatype('int'), qb.bind('rep'))
  • Bare terms that are not covered by another constraint are constrained to match a word query on the aboutMe JSON property. This constraint is expressed by the following binding definition:
    qb.word('aboutMe', qb.bindDefault())

The database configuration includes an element range index on the reputation JSON property with scalar type int. This index is required to support the range constraint on reputation.

This combination of bindings and configuration causes the following query text to match documents where marklogic occurs in the aboutMe property. The term marklogic is a bare term because it is not qualified by a constraint name.

"marklogic"

The following query text matches documents where the value of the reputation property is greater than 50:

rep GT 50

You can use these clauses together to match all documents in which the aboutMe property contains marklogic and the reputation property is greater than 50:

marklogic AND rep GT 50

Without the bindings, the above query matches documents that contain the phrase marklogic anywhere, and the sub-expression rep GT 50 is meaningless because it compares the word rep to 50.

The following script creates the bindings and applies them to the search text shown above.

const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const qb = marklogic.queryBuilder;

db.documents.query( qb.where(
  qb.parsedFrom('marklogic AND rep GT 50',
    qb.parseBindings(
      qb.word('aboutMe', qb.bindDefault()),
      qb.range('reputation', qb.datatype('int'), qb.bind('rep'))
  ))
)).result(function (documents) {
  console.log(JSON.stringify(documents[0].content, null, 2));
}, function(error) {
  console.log(JSON.stringify(error, null, 2));
});

When run against the marklogic-samplestack seed data, the query matches a single contributor and produces output similar to the following:

{
  "Contributor": {
    "userName": "souser1601813@email.com",
    "reputation": 91,
    "displayName": "grechaw",
    "originalId": "1601813",
    "location": "Occidental, CA",
    "aboutMe": "XML (XQuery, Java, XML database) software engineer at MarkLogic. Hardcore accordion player.",
    "id": "sou1601813"
  }
}

Using a Custom Constraint Parser

Support for binding word, value, range, collection, and scope constraint parsing is built into the API. If these constraint types do not meet the needs of your application, you can create a binding to a custom constraint parser. Implement the parser as described in Creating a Custom Constraint in the Search Developer's Guide.

To apply a custom constraint parser to a string query with the Node.js Client, follow these steps:

  1. Create an XQuery module that implements your custom constraint parser. Use the parser interface for structured queries. For details, see Implementing a Structured Query parse Function in the Search Developer's Guide. You must follow the naming conventions described below.
  2. Install your parser XQuery library module in the modules database associated with your REST API instance using DatabaseClient.config.query.custom.write. For details, see Example: Custom Constraint Parser.
  3. Use queryBuilder.parseFunction to create a parse binding between a constraint name and your custom parser.

The Node.js Client API imposes the following naming conventions on your custom constraint implementation:

  • Your parse function must be named parse.
  • Your start and finish facet functions, if present, must be called start-facet and finish-facet, respectively.
  • Your module namespace must be http://marklogic.com/query/custom/yourModuleName, where yourModuleName is a name of your choosing.
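
As an illustration of these conventions, the following sketch derives the expected names from a module file name. The helper customConstraintNames and the file name ss-cat.xqy are hypothetical. The helper only restates the conventions listed above and does not call the Node.js Client API.

```javascript
const path = require('path');

// Hypothetical helper: derive the names implied by the conventions above
// from the file name of a custom constraint module.
function customConstraintNames(fileName) {
  // Strip the directory and the extension to get yourModuleName.
  const moduleName = path.basename(fileName, path.extname(fileName));
  return {
    moduleName,
    namespace: 'http://marklogic.com/query/custom/' + moduleName,
    parseFunction: 'parse',
    facetFunctions: ['start-facet', 'finish-facet']
  };
}

console.log(customConstraintNames('./ss-cat.xqy').namespace);
// http://marklogic.com/query/custom/ss-cat
```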

Example: Custom Constraint Parser

This example demonstrates implementing, installing, and using a custom constraint parser with the Node.js Client API. For details, see Using a Custom Constraint Parser.

This example is based on the marklogic-samplestack seed data. The data includes contributor documents, installed in the database directory /contributors/, and question documents, installed in the database directory /questions/.

The example constraint enables constraining a search to either the contributor or question category by including a term of the form cat:c or cat:q in your query text. The name cat is bound to the custom constraint using the queryBuilder parse bindings. The constraint parser defines the values c and q as corresponding to contributor and question data, respectively.

The example walks through the following steps:

Implementing the Constraint Parser

The following XQuery module implements the constraint parser. No facet handling functions are provided. The parser generates a directory-query based on the caller-supplied category name. The module maintains a mapping between the category names that can appear in query text and the corresponding database directory in the categories variable.

xquery version "1.0-ml";

module namespace my = "http://marklogic.com/query/custom/ss-cat";
import module namespace search =
  "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: The category name to directory name mapping:)
declare variable $my:categories := 
  map:new((
    map:entry("c", "/contributors/"),
    map:entry("q", "/questions/")
  ));

(: parser implementation :)
declare function my:parse(
  $query-elem as element(),
  $options as element(search:options)
) as schema-element(cts:query)
{
let $query :=
  <root>{
    let $cat := $query-elem/search:text/text()
    let $dir := 
      if (map:contains($my:categories, $cat))
      then map:get($my:categories, $cat)[1]
      else "/"
    return cts:directory-query($dir, "infinity")
  }</root>/*
return
(: add qtextconst attribute so that search:unparse will work -
   required for some search library functions :)
element { fn:node-name($query) }
  { attribute qtextconst {
      fn:concat(
        $query-elem/search:constraint-name, ":",
        $query-elem/search:text/text()) },
    $query/@*,
    $query/node()}
};

Installing the Constraint Parser

The following script installs the constraint parser module in the modules database, assuming the implementation is saved to a file named ss-cat.xqy. Installation is performed by calling DatabaseClient.config.query.custom.write. The module name passed as the first parameter must have the same basename as the module name in your module namespace declaration (ss-cat).

const fs = require('fs');
const marklogic = require('marklogic');
const my = require('./my-connection.js');
const db = marklogic.createDatabaseClient(my.connInfo);

db.config.query.custom.write(
  'ss-cat.xqy',
  [ {'role-name': 'app-user', capabilities: ['execute']} ],
  fs.createReadStream('./ss-cat.xqy')
).result(function(response) {
  console.log('Installed module ' + response.path);
}, function(error) {
  console.log(JSON.stringify(error, null, 2));
});

If you save the script to a file named install-parser.js, then running the script should produce results similar to the following:

$ node install-parser.js
Installed module /marklogic/query/custom/ss-cat.xqy

Using the Custom Constraint in a String Query

To use this constraint, include a parse binding created by queryBuilder.parseFunction in your query. The first parameter must match the module name used when installing the implementation.

For example, the following call binds the name cat to the custom constraint parser installed above, enabling queries to include terms of the form cat:c or cat:q.

qb.parseFunction('ss-cat.xqy', qb.bind('cat'))

Note that the module name (ss-cat.xqy) is the same as the module name passed as the first parameter to config.query.custom.write.

The following script uses the custom constraint to search for occurrences of marklogic in documents in the contributors category (cat:c) by specifying query text of the form marklogic AND cat:c.

const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const qb = marklogic.queryBuilder;

db.documents.query( qb.where(
  qb.parsedFrom('marklogic AND cat:c',
    qb.parseBindings(
      qb.parseFunction('ss-cat.xqy', qb.bind('cat'))
  ))
)).result(function (documents) {
  for (const i in documents)
    console.log(JSON.stringify(documents[i].content, null, 2));
}, function(error) {
  console.log(JSON.stringify(error, null, 2));
});

If you save the script to a file named ss-cat.js and run it, the search returns two contributor documents:

$ node ss-cat.js
{
  "Contributor": {
    "userName": "souser1248651@email.com",
    "reputation": 1,
    "displayName": "Nullable",
    "originalId": "1248651",
    "location": "Ogden, UT",
    "aboutMe": "...My current work includes work with MarkLogic
       Application Server (Using XML, Xquery, and Xpath), WPF/C#, 
       and Android Development (Using Java)...",
    "id": "sou1248651"
  }
}
{
  "Contributor": {
    "userName": "souser1601813@email.com",
    "reputation": 91,
    "displayName": "grechaw",
    "originalId": "1601813",
    "location": "Occidental, CA",
    "aboutMe": "XML (XQuery, Java, XML database) software engineer 
       at MarkLogic. Hardcore accordion player.",
    "id": "sou1601813"
  }
}

If you remove the cat:c term so that the query text is just marklogic, the search returns an additional question document.

For more details and examples, see Creating a Custom Constraint in the Search Developer's Guide.
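
When the search term and category come from user input, the query text can be assembled before it is passed to qb.parsedFrom. The following is a minimal sketch in plain JavaScript. The buildQueryText function is a hypothetical helper, not part of the Node.js Client API, and it assumes constraint names such as cat have already been bound as shown above.

```javascript
// Hypothetical helper (not part of the Node.js Client API): assemble
// string query text such as 'marklogic AND cat:c' from a search term
// and a map of constraint names to values.
function buildQueryText(term, constraints) {
  const parts = [term];
  for (const [name, value] of Object.entries(constraints || {})) {
    parts.push(`${name}:${value}`);
  }
  // Each constraint is joined to the term with the AND operator.
  return parts.join(' AND ');
}

console.log(buildQueryText('marklogic', { cat: 'c' }));
// prints: marklogic AND cat:c
```

Calling the helper with no constraints returns the bare term, which corresponds to the unconstrained search described above.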

Additional Information

For additional information on creating and using custom constraints, see the following resources:

Searching with Query By Example

This section covers the following topics related to searching JSON documents using Query By Example (QBE).

Introduction to QBE

A Query By Example enables rapid prototyping of queries for "documents that look like this", using search criteria that resemble the structure of documents in your database.

For example, if your documents include an author property, then the following raw QBE matches documents with an author value of Mark Twain.

{ $query: { author: "Mark Twain" } }

Use queryBuilder.byExample to construct a QBE with the Node.js Client API. When working with JSON content, this interface accepts individual search criteria modeled on the content ({ author: "Mark Twain" }) or an entire $query object as input. For example:

db.documents.query( qb.where(
  qb.byExample( {author: 'Mark Twain'} ))
)

When searching XML, you can pass in a serialized XML QBE. For details, see Querying XML Content With QBE.

The subset of the MarkLogic Server Search API exposed by QBE includes value queries, range queries, and word queries. QBE also supports logical and relational operators on values, such as AND, OR, NOT, greater than, less than, and equality tests.

You can only use QBE and the Node.js API to query document content. Metadata search is not supported. Also, you cannot search on fields. To query metadata or search over fields, use the other queryBuilder builder functions, such as queryBuilder.collection, queryBuilder.property, or queryBuilder.field. Use a field query to search on the metadataValues metadata category.

This guide provides only a brief introduction to QBE. For details, see Searching Using Query By Example in the Search Developer's Guide.

Creating a QBE with queryBuilder

To create a QBE, call queryBuilder.byExample and pass in one or more search criteria parameters. When working with XML documents, you can also pass in a fully formed QBE; for details, see Querying XML Content With QBE.

For example, the documents created by Loading the Example Data include a location property. Running the following script against this data enables you to search for all contributors from Oslo, Norway.

const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const qb = marklogic.queryBuilder;

db.documents.query(
    qb.where(qb.byExample( {location: 'Oslo, Norway'} ))
).result( function(results) {
  console.log(JSON.stringify(results, null, 2));
});

The search criteria passed to qb.byExample match only those documents that contain a location property with a value of 'Oslo, Norway'. A QBE criterion of the form {propertyName: value} is a value query, so the value must exactly match 'Oslo, Norway'.

You can construct other query types that model your documents, including word queries and range queries. For example, you can relax the above constraint to be tolerant of variations on the location value by using a word query. You can also add a criterion that only matches contributors with a reputation value greater than 400. The following table describes the QBE criteria you can use to realize this search:

QBE Criteria Description
location: {$word : 'oslo'} Match the phrase oslo when it appears in the value of location. $word is a reserved property name that signifies a word query. The use of word query means the match is case insensitive, and the value may or may not include other words. For details, see Word Query in the Search Developer's Guide.
reputation: {$gt : 400} Match documents where the value of reputation is greater than 400. $gt is a reserved property name that signifies the greater than comparison operator. For details, see Range Query in the Search Developer's Guide.
$filtered: true Perform a filtered search. QBE uses unfiltered search by default for best performance. However, range queries, such as {$gt : 400}, require either filtered search or a backing range index, so we must enable filtered search. For details, see How Indexing Affects Your Query in the Search Developer's Guide.

The following script combines these criteria into a single QBE:

const marklogic = require('marklogic');
const my = require('./my-connection.js');
const db = marklogic.createDatabaseClient(my.connInfo);
const qb = marklogic.queryBuilder;

db.documents.query( qb.where(
  qb.byExample( {
    location: {$word : 'oslo'},
    reputation: {$gt : 400},
    $filtered: true
  }))
).result( function(results) {
  console.log(JSON.stringify(results, null, 2));
}, function(error) {
  console.log(JSON.stringify(error, null, 2));
});
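
Because the criteria are ordinary JavaScript objects, they can be assembled conditionally before they are passed to qb.byExample. The following is a minimal sketch. The contributorCriteria function and its parameter names are hypothetical and are used only for illustration.

```javascript
// Hypothetical helper: build the criteria object passed to qb.byExample.
function contributorCriteria({ locationWord, minReputation } = {}) {
  const criteria = {};
  if (locationWord !== undefined) {
    // Word query on the location property.
    criteria.location = { $word: locationWord };
  }
  if (minReputation !== undefined) {
    // Greater-than comparison on the reputation property.
    criteria.reputation = { $gt: minReputation };
    // Range criteria need filtered search or a backing range index.
    criteria.$filtered = true;
  }
  return criteria;
}

console.log(JSON.stringify(
  contributorCriteria({ locationWord: 'oslo', minReputation: 400 })
));
// prints: {"location":{"$word":"oslo"},"reputation":{"$gt":400},"$filtered":true}
```

The returned object has the same shape as the literal passed to qb.byExample in the script above.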

You can pass criteria into byExample as individual objects or an array of objects. For example, the following calls are equivalent to the byExample call above:

// criteria as individual objects
qb.byExample(
  {location: {$word : 'oslo'}},
  {reputation: {$gt : 400}},
  {$filtered: true}
)
// criteria as an array of objects
qb.byExample([
  {location: {$word : 'oslo'}},
  {reputation: {$gt : 400}},
  {$filtered: true}
])

The inputs to queryBuilder.byExample in these examples correspond to search criteria in the $query portion of a raw QBE; for details, see Constructing a QBE with the Node.js QueryBuilder in the Search Developer's Guide.

You can also pass the raw $query portion of a QBE to queryBuilder.byExample by supplying an object that has a $query property. For example:

// raw QBE $query
qb.byExample(
  { $query: {
      location: {$word : 'oslo'},
      reputation: {$gt : 400},
      $filtered: true
  }}
)

Querying XML Content With QBE

Passing JavaScript query criteria to queryBuilder.byExample, as described in Creating a QBE with queryBuilder, implicitly creates a JSON QBE, which only matches JSON content. By default, a QBE only matches documents with the same content type as the QBE. That is, a QBE expressed in JSON matches JSON documents, and a QBE expressed in XML matches XML documents. You can still search XML content by either using a serialized XML QBE or by setting the $format QBE property to 'xml'.

To use a QBE to search XML content, use one of the following techniques:

  • Pass a serialized XML QBE as input to queryBuilder.byExample. If your query relies on XML namespaces, you must use this technique. For example:
    qb.byExample(
      '<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">'+
        '<q:query>' +
          '<my:contributor xmlns:my="http://marklogic.com/example">' +
            '<my:location><q:word>oslo</q:word></my:location>' +
          '</my:contributor>' +
          '<my:contributor xmlns:my="http://marklogic.com/example">' +
            '<my:reputation><q:gt>400</q:gt></my:reputation>' +
          '</my:contributor>' +
        '<q:filtered>true</q:filtered>' +
        '</q:query>' +
      '</q:qbe>'
    )
  • Pass a JavaScript object to queryBuilder.byExample that represents a fully formed QBE that includes a $format property with the value 'xml'. You can only use this technique when working with XML content that is in no namespace. For example:
    qb.byExample({
      $query: {
        location: {$word : 'oslo'},
        reputation: {$gt : 400},
        $filtered: true
      },
      $format: 'xml'
    })

In both cases, the data passed in to queryBuilder.byExample must be a fully formed QBE (albeit a serialized one, in the XML case), not just the query criteria as when searching JSON documents. For syntax, see Searching Using Query By Example in the Search Developer's Guide.
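
If you already have JSON-style criteria and want to apply them to XML content that is in no namespace, the wrapping step can be expressed as a small function. This is a sketch only. The asXmlQbe function is a hypothetical helper, and it simply produces the fully formed QBE object shape shown in the second technique above.

```javascript
// Hypothetical helper: wrap JSON-style criteria in a fully formed QBE
// object that targets XML documents in no namespace.
function asXmlQbe(criteria) {
  return {
    $query: { ...criteria },  // copy so the caller's object is not shared
    $format: 'xml'
  };
}

const qbe = asXmlQbe({
  location: { $word: 'oslo' },
  reputation: { $gt: 400 },
  $filtered: true
});
console.log(JSON.stringify(qbe, null, 2));
```

The result can then be passed to qb.byExample in place of the object literal.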

As with any search that matches XML, the XML content returned by the search is serialized and returned as a string.

Additional Information

For additional information on constructing and using QBE, see the following resources:

Searching with Structured Queries

The queryBuilder functions that return a queryBuilder.Query construct sub-queries of a structured query. A structured query is an Abstract Syntax Tree representation of a search expression. Use a structured query when the expressiveness of string query or QBE is not sufficient, or when you need to intercept a query and augment or modify it. For details, see Structured Query Overview in the Search Developer's Guide.

Basic Usage

When you pass one or more queryBuilder.Query objects to a function that creates a queryBuilder.BuiltQuery, such as queryBuilder.where, the queries are used to build a structured query.

Structured queries are composed of one or more search criteria that you create using the builder methods of queryBuilder. For a taxonomy of builders and examples of each, see Builder Methods Taxonomy Reference.

For example, the following code snippet sends your query to MarkLogic Server as a structured query. The query matches documents in the database directory /contributors/ that also contain the term marklogic.

db.documents.query(
  qb.where(
    qb.and(qb.directory("/contributors/"),
           qb.term("marklogic"))
))

Use the queryBuilder result refinement methods to tailor your results, just as you do when searching with a string query or QBE. For details, see Search Result Refiners.

Example: Using Structured Query

The following example relies on the sample data from Loading the Example Data.

This example demonstrates some of the ways you can use the structured query builders to create complex queries.

The following example finds documents in the /contributors/ database directory that contain the term marklogic. By default, the query returns the matching documents.

const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const qb = marklogic.queryBuilder;

db.documents.query(
  qb.where(
    qb.and(
      qb.directory('/contributors/'), 
      qb.term('marklogic')
    )
  )
).result( function(results) {
  console.log(JSON.stringify(results, null, 2));
});

The query returns an array of document descriptors, one for each matching document. The sample data contains 2 documents that match, /contributors/contrib3.json and /contributors/contrib4.json, so you should see output similar to the following. The content property of the document descriptor contains the contents of the matching document.

[
  {
    "uri": "/contributors/contrib3.json",
    "category": "content",
    "format": "json",
    "contentType": "application/json",
    "contentLength": "323",
    "content": {
      "Contributor": {
        "userName": "souser1248651@email.com",
        "reputation": 1,
        "displayName": "Nullable",
        "originalId": "1248651",
        "location": "Ogden, UT",
        "aboutMe": "...My current work includes work with MarkLogic 
          Application Server (Using XML, Xquery, and Xpath), WPF/C#, 
          and Android Development (Using Java)...",
        "id": "sou1248651"
      }
    }
  },
  {
    "uri": "/contributors/contrib4.json",
    "category": "content",
    "format": "json",
    "contentType": "application/json",
    "contentLength": "273",
    "content": {
      "Contributor": {
        "userName": "souser1601813@email.com",
        "reputation": 91,
        "displayName": "grechaw",
        "originalId": "1601813",
        "location": "Occidental, CA",
        "aboutMe": "XML (XQuery, Java, XML database) software 
           engineer at MarkLogic. Hardcore accordion player.",
        "id": "sou1601813"
      }
    }
  }
]
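
Since each document descriptor is a plain object, ordinary array methods are enough to post-process the results. The following sketch reduces descriptors like the ones above to a short summary. The sample data is trimmed from the output above, and summarize is a hypothetical helper name.

```javascript
// Sample descriptors, trimmed from the output shown above.
const results = [
  { uri: '/contributors/contrib3.json',
    content: { Contributor: { displayName: 'Nullable', reputation: 1 } } },
  { uri: '/contributors/contrib4.json',
    content: { Contributor: { displayName: 'grechaw', reputation: 91 } } }
];

// Hypothetical helper: keep only the URI and a few content fields.
function summarize(descriptors) {
  return descriptors.map(({ uri, content }) => ({
    uri,
    displayName: content.Contributor.displayName,
    reputation: content.Contributor.reputation
  }));
}

console.log(summarize(results));
```

In a real script, you would call the helper from the success callback passed to result.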

You can optionally remove the call to queryBuilder.and because queryBuilder.where implicitly ANDs together the queries passed to it. For example, you can rewrite the original query as follows and get the same results:

db.documents.query(
  qb.where(
    qb.directory('/contributors/'), 
    qb.term('marklogic')
  )
)

You can also combine a string query with one or more structured query builder results. For example, you could further limit the results to documents that also contain java by adding qb.parsedFrom('java') to the query list passed to qb.where. The string query is implicitly AND'd with the other query terms. If you change the query to the following, the result set contains only /contributors/contrib3.json.

db.documents.query(
  qb.where(
    qb.directory('/contributors/'), 
    qb.term('marklogic'),
    qb.parsedFrom('java')
  )
)
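
Because queryBuilder.where accepts a list of queries, the list can be built up conditionally. The following is a minimal sketch of that pattern. The contributorQuery function is a hypothetical helper that expects marklogic.queryBuilder as its first argument, and it uses only the builder calls shown in this section.

```javascript
// Hypothetical helper: build the query from optional parts. The qb
// argument is expected to be marklogic.queryBuilder.
function contributorQuery(qb, { term, text } = {}) {
  // Always restrict the search to the /contributors/ directory.
  const queries = [qb.directory('/contributors/')];
  if (term) queries.push(qb.term(term));        // structured term query
  if (text) queries.push(qb.parsedFrom(text));  // string query
  // qb.where implicitly ANDs the queries passed to it.
  return qb.where(...queries);
}

// Example use, assuming db and qb are set up as in the script above:
//   db.documents.query(
//     contributorQuery(qb, { term: 'marklogic', text: 'java' })
//   )
```

Passing the builder in as a parameter keeps the helper free of connection details.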

The queryBuilder interface includes helper functions that make it easy to construct more complex query components, such as index references. For details, see Query Parameter Helper Functions.

As with the other query types, you can refine your result set using queryBuilder.slice and queryBuilder.withOptions. For details, see Refining Query Results.

Builder Methods Taxonomy Reference

Structured query explicitly exposes all the query types described in Types of Query through builder methods. This section is a quick reference for locating the builders you need, based on this categorization.

You can use most query types in combination with each other, as indicated by the parameters accepted by the builder functions. For details, see the queryBuilder interface in the Node.js API Reference.

The queryBuilder interface enables you to build complex structured queries without knowing the underlying structural details of the query. Cross-references into the structured query Syntax Reference in the Search Developer's Guide are included here if you require further details about the components of a specific query type.

Basic Content Queries

Basic content queries express search criteria about your content, such as JSON property A contains value B or any document containing the phrase 'dog'. These queries function as leaves in the structure of a complex, compound query because they never contain sub-queries.

The following table lists the Node.js builder methods that create basic content queries. A link to the corresponding raw JSON structured query type is provided in case you need more detail about a particular aspect of a query. You do not need to construct the raw query; the Node.js API does this for you.

queryBuilder Function Example Structured Query Sub-Query
term
qb.term('marklogic')
term-query
word
qb.word('aboutMe','marklogic')
word-query
value
qb.value('tags','java')
value-query
range
qb.range('reputation', '>=', 100)
range-query
geospatial
qb.geospatial(
  qb.geoElement('gElemPoint'), 
  qb.latlon(50, 44)
)
geo-elem-query, geo-elem-pair-query, geo-attr-pair-query, geo-json-property-query, geo-json-property-pair-query, geo-path-query
geospatialRegion
qb.geospatialRegion(
  qb.geoPath('/envelope/region'), 
  'intersects',
  qb.circle(5, qb.point(10,20))
)
geo-region-path-query
geoElement
qb.geospatial(
  qb.geoElement('gElemPoint'), 
  qb.latlon(50, 44)
)
geo-elem-query
geoElementPair
qb.geospatial(
  qb.geoElementPair(
    'gElemPair', 
    'latitude', 
    'longitude'), 
  qb.latlon(50, 44)
)
geo-elem-pair-query
geoAttributePair
qb.geospatial(
  qb.geoAttributePair(
    'gAttrPair', 
    'latitude', 
    'longitude'), 
  qb.circle(100, 240, 144)
)
geo-attr-pair-query
geoProperty
qb.geospatial(
  qb.geoProperty('gElemPoint'),
  qb.point(34, 88)
)
geo-json-property-query
geoPropertyPair
qb.geospatial(
  qb.geoPropertyPair(
    'gElemPair', 
    'latitude', 
    'longitude'), 
  qb.latlon(12, 5)
)
geo-json-property-pair-query
geoPath
qb.geospatial(
  qb.geoPath('parent/child'), 
  qb.latlon(12, 5)
)
geo-path-query

Logical Composers

Logical composers are queries that join one or more sub-queries into a logical expression. For example, documents which match both query1 and query2 or documents which match either query1 or query2 or query3.

The following table lists the Node.js builder methods for logical composers. A link to the corresponding raw JSON structured query type is provided in case you need more detail about a particular aspect of a query. You do not need to construct the raw query; the Node.js API does this for you.

queryBuilder Function Example Structured Query Sub-Query
and
qb.and(
  qb.word('text','marklogic'),
  qb.value('tags', 'java')
)
and-query
andNot
qb.andNot(
  qb.word('text','marklogic'),
  qb.value('tags', 'java')
)
and-not-query
boost
qb.boost(
  qb.word('text','marklogic'),
  qb.word('title', 'json')
)
boost-query
not
qb.not(qb.term('marklogic'))
not-query
notIn
qb.notIn(
  qb.word('text','json'),
  qb.word('text', 'json documents')
)
not-in-query
or
qb.or(
  qb.value('tags','marklogic'),
  qb.value('tags', 'nosql')
)
or-query

Location Qualifiers

Location qualifiers are queries that limit results based on where sub-query matches occur, such as only in content, only in metadata, or only when contained in a specified JSON property or XML element. For example, matches for this sub-query that occur in metadata or matches for this sub-query that are contained in JSON Property P.

The following table lists the Node.js builder methods that create location qualifiers. A link to the corresponding raw JSON structured query type is provided in case you need more detail about a particular aspect of a query. You do not need to construct the raw query; the Node.js API does this for you.

queryBuilder Function Example Structured Query Sub-Query
documentFragment
qb.documentFragment(
  qb.term('marklogic')
)
document-fragment-query
locksFragment
qb.locksFragment(
  qb.term('marklogic')
)
locks-fragment-query
near
qb.near(
  qb.term('marklogic'),
  qb.term('xquery'), 5
)
near-query
propertiesFragment
qb.propertiesFragment(
  qb.term('marklogic')
)
properties-fragment-query
scope
qb.scope(
  'aboutMe',
  qb.term('marklogic')
)
container-query

Document Selectors

Document selectors are queries that match a group of documents by database attributes such as collection membership, directory, or URI, rather than by contents. For example, all documents in collections A and B or all documents in directory D.

The following table lists the Node.js builder methods that create document selectors. A link to the corresponding raw JSON structured query type is provided in case you need more detail about a particular aspect of a query. You do not need to construct the raw query; the Node.js API does this for you.

queryBuilder Function Example Structured Query Sub-Query
collection
qb.and(
  qb.collection('marklogicians'),
  qb.term('java')
)
collection-query
directory
qb.and(
  qb.directory('/contributors/'),
  qb.term('java')
)
directory-query
document
qb.and(
  qb.document(
    '/contributors/contrib1.json',
    '/contributors/contrib3.json'),
  qb.term('norway')
)
document-query

Query Parameter Helper Functions

The queryBuilder interface includes helper functions for building sub-query parameters that are structurally non-trivial.

For example, a container query (queryBuilder.scope) requires a descriptor that identifies the container (or scope), such as a JSON property or an XML element. The helper functions queryBuilder.property and queryBuilder.element enable you to define the container descriptor required by the scope function.

The following code snippet constructs a container query that matches the term marklogic when it occurs in a JSON property named aboutMe. The helper function queryBuilder.property builds the JSON property name specification.

db.documents.query(
  qb.where(
    qb.scope(qb.property('aboutMe'), qb.term('marklogic'))
  )
)

Key helper functions provided by queryBuilder are listed below. For details, see the Node.js API Reference and the Search Developer's Guide.

Helper Function Purpose
anchor
Defines a numeric or dateTime range for the bucket helper function. For details, see Constrained Searches and Faceted Navigation in the Search Developer's Guide.
attribute
Identifies an XML element attribute for use with query builders such as range, word, value, and geospatial query builders.
bucket
Defines a numeric or dateTime range bucket for use with the facet builder. For details, see Constrained Searches and Faceted Navigation in the Search Developer's Guide.
datatype
Specifies an index type (int, string, etc.) that can be used with the range query builder to disambiguate an index reference. You should only need this if you have multiple indexes of different types over the same document component.
element
Identifies an XML element for use with query builders such as scope, range, word, value, and geospatial query builders.
facet
Defines a search facet for use with calculate result builder. For details, see Constrained Searches and Faceted Navigation in the Search Developer's Guide.
facetOptions
Specifies additional options for use with the facet builder. For details, see Facet Options in the Search Developer's Guide.
field
Identifies a document or metadataValues field for use with the range, word, and value query builders. For details, see Fields Database Settings in the Administrator's Guide.
fragmentScope
Restrict the scope of a range, scope, value, or word query to document content or document properties.
pathIndex
Identifies a path range index for query builders such as range or geoPath. The database configuration must include a corresponding path range index. For details, see Understanding Path Range Indexes in the Administrator's Guide. The path expression is limited to a subset of XPath; for details, see Path Field and Path-Based Range Index Configuration in the XQuery and XSLT Reference Guide.
property
Identifies a JSON property name for query builders such as range, scope, value, word, geoProperty, and geoPropertyPair.
qname
Identifies an XML element QName (local name and namespace URI) for query builders such as range, scope, value, word, geoElement, geoElementPair, and geoAttributePair. Also used in constructing an attribute identifier.
rangeOptions
Additional search options available with the range query builder. For details, see the Node.js API Reference and Range Options in the Search Developer's Guide.
score
Specifies a range query relevance scoring algorithm for use with the orderBy results builder. For details, see Including a Range or Geospatial Query in Scoring in the Search Developer's Guide.
sort
Specifies the equivalent of a sort-order query option that defines the search result sorting criteria and order for use with the orderBy results builder. For details, see sort-order in the Search Developer's Guide.
termOptions
Specifies the equivalent of a term-option query option for use with the word and value query builders. For details, see Term Options in the Search Developer's Guide.
weight
Specifies a modified weight to assign to a query. Usable with query builders such as word and value. For details, see Using Weights to Influence Scores in the Search Developer's Guide.

Search Result Refiners

The queryBuilder interface includes several functions that enable you to refine the results of a search. For example, you can specify how many results to return, how to sort the results, and whether or not to include search facets.

These refinement functions usually return a queryBuilder.BuiltQuery object, in contrast to query builders, which usually return a queryBuilder.Query object.

You can chain result modifier calls together. For example:

db.documents.query(qb.where(someQuery).slice(0,5).orderBy(...))

For details, see Refining Query Results.

The table below summarizes the result modifier functions supported by queryBuilder. For details, see Node.js API Reference.

Helper Function Purpose
anchor
Defines a numeric or dateTime range for the bucket helper function. For details, see Generating Search Facets and Constrained Searches and Faceted Navigation in the Search Developer's Guide.
bucket
Defines a numeric or dateTime range bucket for use with the facet builder. For details, see Generating Search Facets and Constrained Searches and Faceted Navigation in the Search Developer's Guide.
facet
Defines a search facet for use with calculate result builder. For details, see Generating Search Facets and Constrained Searches and Faceted Navigation in the Search Developer's Guide.
facetOptions
Specifies additional options for use with the facet builder. For details, see Generating Search Facets and Facet Options in the Search Developer's Guide.
calculate
Builds a search facet specification. For details, see Generating Search Facets.
orderBy
Specifies sort order and sequencing. For example, you can specify a JSON property, XML element, or XML element attribute on which to sort. For details, see sort-order in the Search Developer's Guide.
slice
Defines the slice of documents that should be returned from within the result set and any server-side transformation that should be applied to the results. For details, see Refining Query Results.
withOptions
Miscellaneous options that can be used to refine and tune your query. For example, use withOptions to specify the categories of data to retrieve from the matching documents, such as content or metadata, request query metrics, or specify a transaction id.

Searching with Combined Query

A combined query is a query object that can contain a combination of different query types plus query options. Most searches can be accomplished without using a combined query. For example, you can combine a string query and a structured query by simply passing the results of queryBuilder.parsedFrom and a queryBuilder.Query to queryBuilder.where.

This feature is best suited for advanced users who are already familiar with the Search API and who have one of the following requirements:

  • Your application must use query options previously persisted on MarkLogic Server.
  • You require very fine-grained control over query options at query time. (Most query options are already exposed in other parts of the Node.js API, such as the queryBuilder methods. You should use those interfaces when possible, rather than relying on combined query.)

In the Node.js Client API, CombinedQueryDefinition encapsulates a combined query. The API provides no builder for CombinedQueryDefinition. A CombinedQueryDefinition has the following form, where the search property contains the combined query, and the remaining properties can optionally be used to customize the results.

{ search: {
    query: { structuredQuery },
    qtext: stringQuery,
    options: { queryOptions }
  },
  categories: [ resultCategories ],
  optionsName: persistedOptionsName,
  pageStart: number,
  pageLength: number,
  view: results
}

The combined query portion can contain any combination of a structured query, a string query, and Search API query options. If you specify options inside the combined query that conflict with options implied by the settings in the CombinedQueryDefinition wrapper, the wrapper option settings override the ones inside the combined query. For example, if search.options includes 'page-length':5 and the pageLength property of the wrapper is set to 10, then the page length will be 10.

The following table describes the properties of a combined query:

Property Name Description
query Optional. A structured query conforming to the syntax described in Searching Using Structured Queries in the Search Developer's Guide.
qtext Optional. A string query conforming to the Search API string query syntax. For details, see Searching with String Queries and Searching Using String Queries in the Search Developer's Guide.
options Optional. One or more Search API query options. For details, see Appendix: Query Options Reference in the Search Developer's Guide.

Use the categories, pageStart, pageLength, and view properties to customize your search results, as described in Refining Query Results.

Use the optionsName property to name a set of previously persisted query options to apply to the search. If the CombinedQueryDefinition contains both options in the combined query and a persistent query options name, then the two sets of options are merged together. Where equivalent options occur in both, the settings in the combined query take precedence.

You cannot use the Node.js Client API to persist query options. Instead, use the REST or Java Client APIs to do so. For details, see Configuring Query Options in the REST Application Developer's Guide or Query Options in the Java Application Developer's Guide.

The following example uses a CombinedQueryDefinition to find documents containing java and marklogic that are in the database directory /contributors/. The combined query sets the return-query option to true to include the final query structure in the results. The categories property is set to none so that the search result summary is returned instead of the matching documents; the summary will contain the final query. Results are returned 3 at a time, due to the pageLength setting.

db.documents.query({
  search: {
    qtext: 'java',
    query: {
      'directory-query' : { uri: '/contributors/' },
      'term-query': { text: ['marklogic'] }
    },
    options: {
      'return-query': true
    }
  },
  categories: [ 'none' ],
  pageLength: 3
})
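
Because the API provides no builder for CombinedQueryDefinition, you construct the object yourself. The following sketch shows one way to keep the combined query properties and the wrapper properties apart. The combinedQuery function is a hypothetical helper, not part of the Node.js Client API, and the property names follow the form described above.

```javascript
// Hypothetical helper: assemble a CombinedQueryDefinition as a plain object.
function combinedQuery({ qtext, query, options, ...wrapper } = {}) {
  // The string query, structured query, and query options go under search.
  const search = {};
  if (qtext !== undefined) search.qtext = qtext;
  if (query !== undefined) search.query = query;
  if (options !== undefined) search.options = options;
  // Remaining properties (categories, optionsName, pageStart,
  // pageLength, view) belong to the wrapper, not to search.
  return { search, ...wrapper };
}

const def = combinedQuery({
  qtext: 'java',
  options: { 'return-query': true },
  categories: ['none'],
  pageLength: 3
});
console.log(JSON.stringify(def));
// prints: {"search":{"qtext":"java","options":{"return-query":true}},"categories":["none"],"pageLength":3}
```

The resulting object can be passed directly to db.documents.query, as in the example above.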

Searching Values Metadata Fields

Values metadata, sometimes called key-value metadata, can only be searched if you define a metadata field on the keys you want to search. Once you define a field on a metadata key, use the normal field search capabilities to include a metadata field in your search. For example, you can use queryBuilder.field and queryBuilder.word to create a word query on a metadata field.

For more details, see Metadata Fields in the Administrator's Guide.

Querying Lexicons and Range Indexes

The Node.js Client API enables you to search and analyze lexicons and range indexes in the following ways:

This section covers the following related topics:

For related search concepts, see Browsing With Lexicons in the Search Developer's Guide and Text Indexes in the Administrator's Guide.

Querying Values in a Lexicon or Range Index

Use the marklogic.valuesBuilder interface to build queries against lexicons and range indexes, then use DatabaseClient.values.read to apply your query.

For example, if the database is configured to include a range index on the reputation JSON property or XML element, then the following query returns all the values in the range index:

const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const vb = marklogic.valuesBuilder;

db.values.read(
  vb.fromIndexes('reputation')
).result(function (result) {
  console.log(JSON.stringify(result, null, 2));
}, function(error) {
  console.log(JSON.stringify(error, null, 2));
});

If you save the script to a file and run it against the data from Loading the Example Data, you should see results similar to the following. The query returns a values-response.tuple item for each distinct value.

{ "values-response": {
    "name": "structuredef",
    "types": {
      "type": [ "xs:int" ]
    },
    "tuple": [
      {
        "frequency": 1,
        "distinct-value": [ "1" ]
      },
      {
        "frequency": 1,
        "distinct-value": [ "91" ]
      },
      {
        "frequency": 1,
        "distinct-value": [ "272" ]
      },
      {
        "frequency": 1,
        "distinct-value": [ "446" ]
      }
    ],
    "metrics": {
      "values-resolution-time": "PT0.000146S",
      "total-time": "PT0.000822S"
    }
  }
}
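
The values in the response are serialized as strings, with the types reported in values-response.types. The following sketch extracts the values of a single-index query as JavaScript numbers. The sample response is trimmed from the output above, and numericValues is a hypothetical helper that assumes a numeric index type such as xs:int.

```javascript
// Sample response, trimmed from the output shown above.
const response = {
  'values-response': {
    types: { type: ['xs:int'] },
    tuple: [
      { frequency: 1, 'distinct-value': ['1'] },
      { frequency: 1, 'distinct-value': ['91'] },
      { frequency: 1, 'distinct-value': ['272'] },
      { frequency: 1, 'distinct-value': ['446'] }
    ]
  }
};

// Hypothetical helper: list the values of a single-index query as numbers.
function numericValues(res) {
  const tuples = res['values-response'].tuple || [];
  // Each tuple of a single-index query holds exactly one value.
  return tuples.map((t) => Number(t['distinct-value'][0]));
}

console.log(numericValues(response));
// prints: [ 1, 91, 272, 446 ]
```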

You can use values.slice to retrieve a subset of the values. For example, if you modify the above script so that the query looks like the following, then the query returns 2 values, beginning with the 3rd value:

db.values.read(
  vb.fromIndexes('reputation')
    .slice(2,4)
)

==>
{ "values-response": {
    "name": "structuredef",
    "types": {
      "type": [ "xs:int" ]
    },
    "tuple": [
      {
        "frequency": 1,
        "distinct-value": [ "272" ]
      },
      {
        "frequency": 1,
        "distinct-value": [ "446" ]
      }
    ],
    "metrics": {
      "values-resolution-time": "PT0.000174S",
      "total-time": "PT0.000867S"
    }
  }
}

Finding Value Co-Occurrences in Lexicons

A co-occurrence is a set of index or lexicon values occurring in the same document fragment. The Node.js Client API supports queries for n-way co-occurrences. That is, tuples of values from multiple lexicons or indexes, occurring in the same fragment.

To find value co-occurrences across multiple range indexes or lexicons, use the marklogic.valuesBuilder interface to construct a query, then apply it using DatabaseClient.values.read. When a values query includes multiple index references, the results are co-occurrence tuples.

For example, the following script finds co-occurrences of values in the tags and id JSON properties or XML elements, assuming the database configuration includes an element range index for tags and another for id. (Recall that range indexes on JSON properties use the element range index interfaces; for details, see Indexing.)

const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const vb = marklogic.valuesBuilder;

db.values.read(
  vb.fromIndexes('tags','id')
).result(function (result) {
  console.log(JSON.stringify(result, null, 2));
}, function(error) {
  console.log(JSON.stringify(error, null, 2));
});

If you save the script to a file and run it, you should see results similar to the following. The query returns a values-response.tuple item for each co-occurrence. The property values-response.types can guide you in interpreting the data types of the values in each tuple.

{
  "values-response": {
    "name": "structuredef",
    "types": {
      "type": [
        "xs:string",
        "xs:string"
      ]
    },
    "tuple": [
      {
        "frequency": 1,
        "distinct-value": [
          "dbobject",
          "soq7684223"
        ]
      },
      {
        "frequency": 1,
        "distinct-value": [
          "dbobject",
          "sou69803"
        ]
      },...
    ],
    "metrics": {
      "values-resolution-time": "PT0.000472S",
      "total-time": "PT0.001251S"
    }
  }
}
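
Each tuple lists its values in the same order as the index references passed to fromIndexes. As a rough sketch of how an application might consume a two-way co-occurrence response, the following groups the second value of each tuple under the first. The helper name groupByFirst is hypothetical, and the sample input is abbreviated from the output above.

```javascript
// Hypothetical helper (not part of the Node.js Client API): group 2-way
// co-occurrence tuples by their first value, e.g. tag -> [id, id, ...].
function groupByFirst(response) {
  const tuples = response['values-response'].tuple || [];
  const groups = {};
  tuples.forEach(function (tuple) {
    const first = tuple['distinct-value'][0];
    const second = tuple['distinct-value'][1];
    if (!groups[first]) {
      groups[first] = [];
    }
    groups[first].push(second);
  });
  return groups;
}

// Sample input abbreviated from the outputs shown in this section.
const sampleTuples = { 'values-response': {
  name: 'structuredef',
  types: { type: ['xs:string', 'xs:string'] },
  tuple: [
    { frequency: 1, 'distinct-value': ['dbobject', 'soq7684223'] },
    { frequency: 1, 'distinct-value': ['dbobject', 'sou69803'] },
    { frequency: 1, 'distinct-value': ['java', 'soq22431350'] }
  ]
} };

console.log(groupByFirst(sampleTuples));
```

The grouping ignores the frequency of each tuple; keep the whole tuple instead of just the second value if your application needs it.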

You can use valuesBuilder.BuiltQuery.slice to retrieve a subset of the tuples. For example, if you modify the script so that the query looks like the following, then the query returns two tuples, beginning with the third tuple:

db.values.read(
  vb.fromIndexes('tags','id').slice(2,4)
)

==>
{
  "values-response": {
    "name": "structuredef",
    "types": {
      "type": [
        "xs:string",
        "xs:string"
      ]
    },
    "tuple": [
      {
        "frequency": 1,
        "distinct-value": [
          "java",
          "soq22431350"
        ]
      },
      {
        "frequency": 1,
        "distinct-value": [
          "java",
          "soq7684223"
        ]
      }
    ],
    "metrics": {
      "values-resolution-time": "PT0.00024S",
      "total-time": "PT0.001018S"
    }
  }
}

Building an Index Reference

Use valuesBuilder.fromIndexes to create index references for use in your values and co-occurrence queries. For example, a query such as the following includes a reference by name to an index on a JSON property or XML element named reputation:

db.values.read(vb.fromIndexes('reputation'))

You can use an index reference builder method to disambiguate the index reference, use another type of index, or specify a collation. The following interpretation is applied to the inputs to valuesBuilder.fromIndexes:

  • A simple name identifies a range index on a JSON property. For example, vb.fromIndexes('reputation') identifies a range index for the JSON property reputation.
  • An index reference identifies a range index. For example, vb.fromIndexes(vb.field('questionId')) identifies a field range index.
  • If you do not explicitly specify the data type of the range index, the API will attempt to look it up server-side during index resolution. Use valuesBuilder.datatype to explicitly specify the data type.

For example, all of the following index references identify a JSON property range index for the property named reputation.

vb.fromIndexes('reputation')

vb.fromIndexes(vb.range('reputation'))

vb.fromIndexes(vb.range(vb.property('reputation')))

vb.fromIndexes(vb.range(
  vb.property('reputation'), vb.datatype('int')))

The following table summarizes the index definition builder methods exposed by valuesBuilder:

Lexicon or Index Type    valuesBuilder builder method
uri                      vb.uri
collection               vb.collection (with no arguments)
range                    name, vb.range
field                    vb.field
geospatial               vb.geoAttributePair, vb.geoElement, vb.geoElementPair, vb.geoPath, vb.geoProperty, vb.geoPropertyPair

The URI and collection lexicons must be enabled on the database in order to use them. For details, see Text Indexes in the Administrator's Guide. Use valuesBuilder.uri and valuesBuilder.collection (with no arguments) to identify these lexicons. For example:

db.values.read(
  vb.fromIndexes(
    vb.uri(),           // the URI lexicon
    vb.collection())    // the collection lexicon
)

Refining the Results of a Values or Co-Occurrence Query

You can refine the results of your queries in the following ways:

  • Use valuesBuilder.slice to select a subset of the results and/or specify a result transform.
  • Use valuesBuilder.BuiltQuery.withOptions to specify values query options or constrain results to particular forests. For a list of options, see the API documentation for cts.values (JavaScript) or cts:values (XQuery).
  • Use valuesBuilder.BuiltQuery.where to limit results to those that match another query.

You can use these refinements singly or in any combination.

For example, the following query returns values from the range index on the JSON property reputation. The where clause selects only those values in documents in the collection myInterestingCollection. The slice clause selects two results, beginning with the third value. The withOptions clause specifies the results be returned in descending order.

db.values.read(
  vb.fromIndexes('reputation').
  where(vb.collection('myInterestingCollection')).
  slice(2,4).
  withOptions({values: ['descending']})
)

Analyzing Lexicons and Range Indexes with Aggregate Functions

You can compute aggregate values over range indexes and lexicons using builtin or user-defined aggregate functions with valuesBuilder.BuiltQuery.aggregates. This section covers the following topics:

Aggregate Function Overview

An aggregate function performs an operation over values or value co-occurrences in lexicons and range indexes. For example, you can use an aggregate function to compute the sum of values in a range index.

Use valuesBuilder.BuiltQuery.aggregates to apply one or more builtin or user-defined aggregate functions to your values or co-occurrences query. You can combine builtin and user-defined aggregates in the same query.

MarkLogic Server provides builtin aggregate functions for several common analytical functions; for a list of functions, see the Node.js API Reference. For a more detailed description of each builtin, see Using Builtin Aggregate Functions in the Search Developer's Guide.

You can also implement aggregate user-defined functions (UDFs) in C++ and deploy them as native plugins. Aggregate UDFs must be installed before you can use them. For details, see Implementing an Aggregate User-Defined Function in the Application Developer's Guide. You must install the native plugin that implements your UDF according to the instructions in Using Native Plugins in the Application Developer's Guide.

You cannot use the Node.js Client API to apply aggregate UDFs that require additional parameters.

Using Builtin Aggregate Functions

To use a builtin aggregate function, pass the name of the function to valuesBuilder.BuiltQuery.aggregates. For a list of supported builtin aggregate function names, see the Node.js API Reference.

For example, the following script uses builtin aggregates to calculate the minimum, maximum, and standard deviation of the values in the range index over the JSON property named reputation. Use a slice clause of the form slice(0,0) to return just the computed aggregates, rather than the aggregates plus values.

const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const vb = marklogic.valuesBuilder;

db.values.read(
  vb.fromIndexes('reputation')
    .aggregates('min', 'max', 'stddev')
    .slice(0,0)
).result(function (result) {
  console.log(JSON.stringify(result, null, 2));
}, function(error) {
  console.log(JSON.stringify(error, null, 2));
});

Running the script produces output similar to the following:

{ "values-response": {
  "name": "structuredef",
  "aggregate-result": [
    { "name": "min", "_value": "1" },
    { "name": "max", "_value": "446" },
    { "name": "stddev", "_value": "197.616632228498" }
  ],
  "metrics": {
    "aggregate-resolution-time": "PT0.000571S",
    "total-time": "PT0.001279S"
  }
} }
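
The aggregate results arrive as an array of name and _value pairs, with the values encoded as strings. The following sketch converts that array into a plain object keyed by aggregate name. The helper name aggregatesByName is hypothetical, and the conversion with Number assumes every requested aggregate yields a single numeric value, which holds for min, max, and stddev over an xs:int index but may not hold for other aggregates.

```javascript
// Hypothetical helper (not part of the Node.js Client API): index the
// aggregate results by name. Assumes each _value is a numeric string.
function aggregatesByName(response) {
  const results = response['values-response']['aggregate-result'] || [];
  const byName = {};
  results.forEach(function (item) {
    byName[item.name] = Number(item._value);
  });
  return byName;
}

// Sample input taken from the output shown above.
const sampleAggregates = { 'values-response': {
  name: 'structuredef',
  'aggregate-result': [
    { name: 'min', _value: '1' },
    { name: 'max', _value: '446' },
    { name: 'stddev', _value: '197.616632228498' }
  ]
} };

const stats = aggregatesByName(sampleAggregates);
console.log(stats.min, stats.max, stats.stddev);
```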

Using User-Defined Aggregate Functions

An aggregate UDF is identified by the function name and a relative path to the plugin that implements the aggregate, as described in Using Aggregate User-Defined Functions in the Search Developer's Guide. You must install your UDF plugin on MarkLogic Server before you can use it in a query. For details on creating and installing aggregate UDFs, see Aggregate User-Defined Functions in the Application Developer's Guide.

Once you install your plugin, use valuesBuilder.udf to create a reference to your UDF, and pass the reference to valuesBuilder.BuiltQuery.aggregates. For example, the following script uses a native UDF called count provided by a plugin installed in the Extensions database under native/sampleplugin:

const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const vb = marklogic.valuesBuilder;

db.values.read(
  vb.fromIndexes('reputation')
    .aggregates(vb.udf('native/sampleplugin', 'count'))
    .slice(0,0)
).result(function (result) {
  console.log(JSON.stringify(result, null, 2));
}, function(error) {
  console.log(JSON.stringify(error, null, 2));
});

Generating Search Facets

You can use the Node.js Client API to include facets in your query results, as described in Constrained Searches and Faceted Navigation in the Search Developer's Guide. You define facets using queryBuilder.facet and include them in your search using queryBuilder.calculate. You can construct facets on JSON properties, XML elements and attributes, fields and paths. A facet must be backed by a range index.

This section includes the following topics:

For more details, see Constrained Searches and Faceted Navigation in the Search Developer's Guide.

Defining a Simple Facet

The following example facets on the reputation JSON property of documents in the database directory /contributors/. The results include only the facets, rather than the facets plus matching documents, because of the withOptions clause; for details, see Excluding Document Descriptors or Values From Search Results.

const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const qb = marklogic.queryBuilder;

db.documents.query(
  qb.where(qb.directory('/contributors/'))
    .calculate(qb.facet('reputation'))
    .withOptions({categories: 'none'})
).result( function(results) {
  console.log(JSON.stringify(results, null, 2));
}, function(error) {
  console.log(JSON.stringify(error, null, 2));
});

If the database includes a range index on reputation, and you run the script against the example data from Loading the Example Data, you should see results similar to the following:

{ "snippet-format": "empty-snippet",
  "total": 4,
  "start": 1,
  "page-length": 0,
  "results": [],
  "facets": {
    "reputation": {
      "type": "xs:int",
      "facetValues": [
        { "name": "1",
          "count": 1,
          "value": 1 },
        { "name": "91",
          "count": 1,
          "value": 91 },
        { "name": "272",
          "count": 1,
          "value": 272 },
        { "name": "446",
          "count": 1,
          "value": 446 }
      ]
    }
  }
}

By default, the facet uses the same name as the entity from which the facet is derived, such as an XML element or JSON property, but you can provide a custom name. For details, see Naming a Facet.

The facets property of the results includes a set of value buckets for the reputation facet, one bucket for each distinct value of reputation. Each bucket includes a name (auto-generated from the value by default), the number of matches with that value, and the actual value.

"facets": {
    "reputation": {       <-- name of the facet
      "type": "xs:int",
      "facetValues": [
        { "name": "1",    <-- bucket name
          "count": 1,     <-- number of matches with this value
          "value": 1 }    <-- value associated with this bucket
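
As an illustration of consuming this structure, the following sketch turns the buckets of one facet into display labels of the form name (count), as a faceted navigation UI might show them. The helper name facetLabels is hypothetical, and the input is assumed to be a search result summary object of the shape shown above.

```javascript
// Hypothetical helper (not part of the Node.js Client API): build
// "name (count)" labels from one facet of a search result summary.
function facetLabels(summary, facetName) {
  const facet = summary.facets && summary.facets[facetName];
  if (!facet) {
    return [];                      // facet not present in the summary
  }
  return facet.facetValues.map(function (bucket) {
    return bucket.name + ' (' + bucket.count + ')';
  });
}

// Sample input abbreviated from the output shown above.
const sampleSummary = {
  'snippet-format': 'empty-snippet',
  total: 4,
  facets: {
    reputation: {
      type: 'xs:int',
      facetValues: [
        { name: '1', count: 1, value: 1 },
        { name: '91', count: 1, value: 91 }
      ]
    }
  }
};

console.log(facetLabels(sampleSummary, 'reputation'));
```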

Naming a Facet

By default, the name of a facet is derived from the indexed element or property name on which the facet is based. For example, the following facet on the reputation property generates a facet with the property name reputation:

qb.facet('reputation')
==> "facets": { "reputation": {...} }

You can override this behavior by passing your own name in as the first argument to queryBuilder.facet. For example, the following facet on the reputation property generates a facet with the property name rep:

qb.facet('rep', 'reputation')
==> "facets": { "rep": {... } }

Including Facet Options

You can use queryBuilder.facetOptions to include options in your facet definition that affect attributes such as sort order and the maximum number of values to return. For details, see Facet Options in the Search Developer's Guide and the detailed API documentation for the query that corresponds to your facet index type, such as cts.values (JavaScript) or cts:values (XQuery).

For example, the following facet definition requests buckets be ordered by descending values and limits the number of buckets to two. Thus, instead of returning buckets ordered [1, 91, 272, 446], the results are ordered [446, 272, 91, 1] and truncated to the first 2 buckets:

qb.facet('rep','reputation', qb.facetOptions('limit=2','descending'))

==>

"facets": {
  "rep": {
    "type": "xs:int",
    "facetValues": [
      { "name": "446",
        "count": 1,
        "value": 446 },
      { "name": "272",
        "count": 1,
        "value": 272 }
    ]
  }
}

Defining Bucket Ranges

By default, a facet is bucketed by distinct values. However, you can define your own buckets on numeric and date values using queryBuilder.bucket. A bucket can take on a range of values. The upper and lower bounds of the range of values in a bucket are the bucket anchors. You can include both anchor values, or omit the upper or lower anchor.

Buckets over dateTime values can use symbolic anchors such as now and start-of-day. The real values are computed when the query is evaluated. Such definitions describe computed buckets. For a list of the supported values, see computed-bucket in the Search Developer's Guide.

For example, you can divide the reputation values into buckets of less than 50, 50 to 100, and greater than 100 using a facet definition such as the following:

qb.facet('reputation', 
         qb.bucket('less than 50', '<', 50),
         qb.bucket('50 to 100', 50, '<', 101),
         qb.bucket('greater than 100', 101, '<'))
==>

"facets": {
  "reputation": {
    "type": "bucketed",
    "facetValues": [
      { "name": "less than 50",
        "count": 1,
        "value": "less than 50"
      },
      { "name": "50 to 100",
        "count": 1,
        "value": "50 to 100"
      },
      { "name": "greater than 100",
        "count": 2,
        "value": "greater than 100"
      }
    ]
  }
}

In the above example, '<' is a constant that serves as a boundary between the upper and lower anchor values. It is not a relational operator, per se. The separator enables the API to handle buckets with no lower bound, with no upper bound, and with both an upper and a lower bound.
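
To make the anchor semantics concrete, the following sketch reproduces the bucket counts above locally, without calling MarkLogic. It assumes that the lower anchor is inclusive and the upper anchor is exclusive, which is consistent with defining the 50 to 100 bucket with an upper anchor of 101. The bucket objects and the function names bucketFor and countByBucket are illustrative only.

```javascript
// Local illustration only: this does not call MarkLogic. It mimics how a
// value falls into a bucket, assuming an inclusive lower anchor and an
// exclusive upper anchor. A null anchor means "no bound on this side".
const buckets = [
  { name: 'less than 50',     lower: null, upper: 50 },
  { name: '50 to 100',        lower: 50,   upper: 101 },
  { name: 'greater than 100', lower: 101,  upper: null }
];

function bucketFor(value) {
  const match = buckets.find(function (b) {
    const aboveLower = b.lower === null || value >= b.lower;
    const belowUpper = b.upper === null || value < b.upper;
    return aboveLower && belowUpper;
  });
  return match ? match.name : null;
}

function countByBucket(values) {
  const counts = {};
  buckets.forEach(function (b) { counts[b.name] = 0; });
  values.forEach(function (v) {
    const name = bucketFor(v);
    if (name !== null) {
      counts[name] += 1;
    }
  });
  return counts;
}

// The four reputation values from the example data.
console.log(countByBucket([1, 91, 272, 446]));
```

With the four reputation values from the example data, the counts are 1, 1, and 2, matching the facet output above.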

For more examples of defining buckets, see Buckets Example in the Search Developer's Guide and Computed Buckets Example in the Search Developer's Guide.

Creating and Using Custom Constraint Facets

When you define a custom constraint, you can also define facet generators for your constraint, as described in Creating a Custom Constraint in the Search Developer's Guide. Use the following procedure to use a custom constraint facet generator.

  1. Implement an XQuery module that includes start-facet and finish-facet functions. For details, see Creating a Custom Constraint in the Search Developer's Guide.
  2. Install your custom constraint module in the modules database associated with your REST API instance using the DatabaseClient.config.query.custom interface, as described in Installing the Constraint Parser.
  3. Use queryBuilder.calculateFunction to create a reference to your facet generator when building your facet definitions.

For example, if your custom constraint module is installed as ss-cat.xqy, as shown in Installing the Constraint Parser:

db.config.query.custom.write('ss-cat.xqy', ...)

Then you can use your facet generator in your facet definitions as follows:

qb.facet('categories', qb.calculateFunction('ss-cat.xqy'))

Refining Query Results

This section covers the following features of the Node.js Client API that enable you to customize your search results using queryBuilder.slice or valuesBuilder.BuiltQuery.slice:

Available Refinements

By default, when you perform a search using DatabaseClient.documents.query, you receive one page of matching document descriptors ordered by relevance ranking. Each descriptor includes the content of the matching document.

Some query options always cause a search result summary to be returned, in addition to the matching document descriptors. For example, when you enable options such as 'debug', 'metrics', or 'queryPlan', the additional data requested by the option is returned as part of the search result summary.

The Node.js Client API provides several result refinement queryBuilder methods that enable you to customize your results, including the following:

  • Change the size and/or starting document for the page using queryBuilder.slice. For details, see Paginating Query Results.
  • Change the order of results using queryBuilder.orderBy. For details, see the Node.js Client API Reference.
  • Request metadata in addition to or instead of content using queryBuilder.withOptions. For details, see Returning Metadata.
  • Exclude the document descriptors from the response using queryBuilder.withOptions. This is useful when you just want to fetch snippets, facets, metrics, or other data about the matches. For details, see Excluding Document Descriptors or Values From Search Results.
  • Request search match snippets in addition to or instead of matching documents using queryBuilder.snippet with queryBuilder.slice. You can customize your snippets. For details, see Generating Search Snippets.
  • Request search facets in addition to or instead of matching documents, using queryBuilder.calculate. You can customize your facet buckets. For details, see Generating Search Facets.
  • Apply a read transform to the matched documents or search results summary. For details, see Transforming the Search Results.

The slice specifies the range of matching documents to include in the result set. If you do not explicitly call queryBuilder.slice, a default slice is still defined. The other refinement methods (calculate, orderBy, snippet, withOptions) have no effect if the slice is empty, whether the slice is empty because there are no matches for your query or because you defined an empty page range such as slice(0,0).

You can use these features in combination. For example, you can request snippets and facets together, with or without document descriptors.

Paginating Query Results

Use queryBuilder.slice and valuesBuilder.BuiltQuery.slice to fetch a slice (batch) of results. A slice of results is defined by a zero-based starting position and an end position (the value in the end position is not included in the slice), similar to using Array.prototype.slice.

For example, the following queries return five results, beginning with the first one:

qb.where(qb.parsedFrom('oslo')).slice(0,5)

vb.fromIndexes('reputation').slice(0,5)

To return the next five results, use queries such as the following:

qb.where(qb.parsedFrom('oslo')).slice(5,10)

vb.fromIndexes('reputation').slice(5,10)

The default maximum number of results is 10.

Setting the starting and end positions to zero selects no matches (or values), but returns an abbreviated result summary that includes, for example, estimated total number of matches for a search or computed aggregates for a values query.
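
Because the slice positions behave like those of Array.prototype.slice, translating a page number into slice arguments is simple arithmetic. The following sketch shows one way to do it. The helper name pageToSlice and the use of 1-based page numbers are assumptions made for illustration.

```javascript
// Hypothetical helper: convert a 1-based page number and a page size into
// the zero-based start and exclusive end positions expected by slice.
function pageToSlice(pageNumber, pageSize) {
  const start = (pageNumber - 1) * pageSize;
  return [start, start + pageSize];
}

console.log(pageToSlice(1, 5)); // [ 0, 5 ]
console.log(pageToSlice(2, 5)); // [ 5, 10 ]

// The same positions select the same items from a plain array, which is a
// convenient way to check the arithmetic locally.
const letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g'];
const secondPage = pageToSlice(2, 3);
console.log(letters.slice(secondPage[0], secondPage[1])); // [ 'd', 'e', 'f' ]
```

You could then pass the two positions to a query, for example with qb.where(...).slice(secondPage[0], secondPage[1]).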

Returning Metadata

By default, a query returns a document descriptor for each matching document, and the descriptors include the document content. To return metadata instead of content, set the categories property of queryBuilder.withOptions to 'metadata'. For example:

db.documents.query(
  qb.where(qb.parsedFrom('oslo'))
    .withOptions({categories: 'metadata'})
)

To return both metadata and documents, set categories to both 'content' and 'metadata'. For example:

db.documents.query(
  qb.where(qb.parsedFrom('oslo'))
    .withOptions({categories: ['content', 'metadata']})
)

Excluding Document Descriptors or Values From Search Results

By default, a query returns a document descriptor for each matching document, and the descriptors include the document content. If you want to retrieve snippets, facets, or other search result data without the matching documents, set the categories property of queryBuilder.withOptions to 'none'.

For example, the following query normally returns the contents of two document descriptors:

db.documents.query(
  qb.where(qb.parsedFrom('oslo'))
)

If you add the following withOptions clause, you receive a search result summary that includes search snippets, instead of receiving document descriptors:

db.documents.query(
  qb.where(qb.parsedFrom('oslo'))
    .withOptions({categories: 'none'})
)

The contents of the search result summary depend on the other refinements you apply to your query, but will never include the document descriptors.

Generating Search Snippets

A search results page typically shows portions of matching documents with the search matches highlighted, perhaps with some text showing the context of the search matches. These search result pieces are known as snippets.

Snippets are not included in your query results by default. To request snippets, include a snippet clause in your slice definition using queryBuilder.snippet. For example, the following query returns snippets in the default format:

const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const qb = marklogic.queryBuilder;

db.documents.query(
    qb.where(
      qb.byExample({aboutMe: {$word: 'marklogic'}})
    ).slice(qb.snippet())
).result( function(results) {
  console.log(JSON.stringify(results, null, 2));
}, function(error) {
  console.log(JSON.stringify(error, null, 2));
});

You can include a snippet clause in a slice that has a start and end position, as well. For example:

slice(0, 5, qb.snippet())

To retrieve snippets without the matching documents, add a withOptions({categories: 'none'}) clause. For example:

...slice(qb.snippet()).withOptions({categories: 'none'})

You can use one of several builtin snippet generators or your own custom snippet generator by providing a name to queryBuilder.snippet. For example, the following slice definition requests snippets generated by the builtin metadata-snippet generator:

slice(0, 5, qb.snippet('metadata'))

Some of the builtin snippeters accept additional options, which you can specify in the second parameter to queryBuilder.snippet. For example, the following snippet definition limits the size of the snippeted text to 25 characters:

qb.snippet('my-snippeter.xqy', {'max-snippet-chars': 25})

For details on the supported options, see the Node.js API Reference and Specifying transform-results Options in the Search Developer's Guide.

Use the following procedure to use a custom snippeter:

  1. Implement your snippet generator in XQuery. Your snippet function must conform to the interface specified in Specifying Your Own Code in transform-results in the Search Developer's Guide.
  2. Install your snippeting module in the modules database of your REST API instance using DatabaseClient.config.query.snippet.write.
  3. Use the name of your custom snippeting module as the snippeter name provided to queryBuilder.snippet. For example:
    slice(0, 5, qb.snippet('my-snippeter.xqy'))

You cannot pass options or parameters to a custom snippeter.

For more information on snippet generation, see Modifying Your Snippet Results in the Search Developer's Guide.

Transforming the Search Results

You can make arbitrary changes to the response from a search or values query by applying a transformation function. Your transform is applied to each document returned by the query, as well as to the search or values response summary, if any.

Transforms must be installed on MarkLogic Server before you can use them. Use DatabaseClient.config.transforms to install and manage transforms.

To use a transform in a query, create a transform descriptor with queryBuilder.transform or valuesBuilder.transform. You must specify the name of a previously installed transform function. You can also include implementation-specific parameters. For details and examples, see Working with Content Transformations.

For example, the following query applies the transform named js-query-transform to the search results. Since no documents are returned (withOptions), the query only returns a search results summary and the transform is only applied to the summary. If the query returned documents, the transform would be applied to each matched document as well.

db.documents.query( 
  qb.where(
    qb.byExample({writeTimestamp: {'$exists': {}}})
  ).slice(qb.transform('js-query-transform'))
   .withOptions({categories: 'none'})
)

You can apply a transform to a values query in the same fashion. For example:

db.values.read(
  vb.fromIndexes('reputation')
    .slice(0, 5, vb.transform('js-query-transform'))
)

For details, see Working with Content Transformations.

Extracting a Portion of Each Matching Document

Use queryBuilder.extract to return a subset of the content in each matching document instead of the complete document. You can return selected properties, selected properties plus their ancestors, or everything except the selected properties. By default, only the selected properties are included.

Selected properties are specified using XPath expressions. You can only use a subset of XPath for these path expressions. For details, see The extract-document-data Query Option in the XQuery and XSLT Reference Guide.

The following example performs the same search as the first query in Creating a QBE with queryBuilder, but refines the results using queryBuilder.slice and queryBuilder.extract to return just the displayName and location properties from the matching documents. The search matches two documents when run against the documents created by Loading the Example Data.

const marklogic = require('marklogic');
const my = require('./my-connection.js');

const db = marklogic.createDatabaseClient(my.connInfo);
const qb = marklogic.queryBuilder;

db.documents.query(
    qb.where(qb.byExample( {location: 'Oslo, Norway'} ))
      .slice(qb.extract({
        selected: 'include',
        paths: ['/Contributor/displayName', '/Contributor/location'],
        namespaces: {'abc': 'http://marklogic.com/test/abc'}
      }))
).result( function(matches) {
  matches.forEach(function(match) {
    console.log(match.content);
  });
});

When you use queryBuilder.extract in the manner above, each matching document produces a document descriptor containing content of the following form:

{ context: original-document-context,
  extracted: [ obj-from-path1, obj-from-path2, ...] }

For example, the above query produces the following output:

{ context: 'fn:doc("/contributors/contrib1.json")',
  extracted: [ 
    { displayName: 'Lars Fosdal' }, 
    { location: 'Oslo, Norway' } ] 
}
{ context: 'fn:doc("/contributors/contrib2.json")',
  extracted: [ 
    { displayName: 'petrumo' }, 
    { location: 'Oslo, Norway' } ] 
}

You can produce a sparse representation of the original matching document instead by passing a different selected value to queryBuilder.extract. You can create a sparse document that includes the selected properties plus their ancestors, or the whole document exclusive of the selected properties.

For example, the following query returns the same properties but includes their ancestors:

db.documents.query(
    qb.where(qb.byExample( {location: 'Oslo, Norway'} ))
      .slice(qb.extract({
        paths: ['/Contributor/displayName', '/Contributor/location'],
        selected: 'include-with-ancestors',
        namespaces:{'abc': 'http://marklogic.com/test/abc'}
      }))
)

The output from this query is the following sparse version of the original documents:

{ Contributor: { 
  displayName: 'Lars Fosdal', 
  location: 'Oslo, Norway' } }
{ Contributor: { 
  displayName: 'petrumo', 
  location: 'Oslo, Norway' } }

The following table shows the effect of each supported value of the selected parameter of queryBuilder.extract on the returned content.

selected Value Output
include (default)
{ context: 'fn:doc("/contributors/contrib1.json")',
  extracted: [ 
    { displayName: 'Lars Fosdal' }, 
    { location: 'Oslo, Norway' } ] }
{ context: 'fn:doc("/contributors/contrib2.json")',
  extracted: [ 
    { displayName: 'petrumo' }, 
    { location: 'Oslo, Norway' } ] }
include-with-ancestors
{ Contributor:
   { displayName: 'Lars Fosdal', 
     location: 'Oslo, Norway' } }
{ Contributor: 
   { displayName: 'petrumo', 
     location: 'Oslo, Norway' } }
exclude
{ Contributor:
   { userName: 'souser10002@email.com',
     reputation: 446,
     originalId: '10002',
     aboutMe: 'Software Developer since 1987, ...',
     id: 'sou10002' } }
{ Contributor:
   { userName: 'souser1000634@email.com',
     reputation: 272,
     originalId: '1000634',
     aboutMe: 'Developer at AspiroTV',
     id: 'sou1000634' } }
all
{ Contributor:
   { userName: 'souser10002@email.com',
     reputation: 446,
     displayName: 'Lars Fosdal',
     originalId: '10002',
     location: 'Oslo, Norway',
     aboutMe: 'Software Developer since 1987...',
     id: 'sou10002' } }
{ Contributor:
   { userName: 'souser1000634@email.com',
     reputation: 272,
     displayName: 'petrumo',
     originalId: '1000634',
     location: 'Oslo, Norway',
     aboutMe: 'Developer at AspiroTV',
     id: 'sou1000634' } }

If an extract path does not match any content in a matched document, then the corresponding property is omitted. If no extract paths match, a descriptor for the document is still returned, but it contains an extracted-none property instead of an extracted property or a sparse document. For example:

{ context: 'fn:doc("/contributors/contrib1.json")',
  extracted-none: null
}
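
When you use the default selected value of include, an application often wants a single object per document rather than an array of one-property objects. The following sketch merges the extracted items and treats the extracted-none case as an empty object. The helper name mergeExtracted is hypothetical, and the sketch assumes each extracted item is a plain object; if two paths select properties with the same name, the later one overwrites the earlier one.

```javascript
// Hypothetical helper (not part of the Node.js Client API): merge the items
// of an `extracted` array into one object. Returns an empty object when the
// content has no `extracted` array, as in the extracted-none case.
function mergeExtracted(content) {
  if (!Array.isArray(content.extracted)) {
    return {};
  }
  return content.extracted.reduce(function (merged, piece) {
    return Object.assign(merged, piece);
  }, {});
}

// Sample inputs taken from the outputs shown in this section.
const withMatches = {
  context: 'fn:doc("/contributors/contrib1.json")',
  extracted: [
    { displayName: 'Lars Fosdal' },
    { location: 'Oslo, Norway' }
  ]
};
const withoutMatches = {
  context: 'fn:doc("/contributors/contrib1.json")',
  'extracted-none': null
};

console.log(mergeExtracted(withMatches));
console.log(mergeExtracted(withoutMatches));
```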

Generating Search Term Completion Suggestions

This section describes how to generate search term completion suggestions using the Node.js Client API. The following topics are covered:

Understanding the Suggestion Interface

Search applications often offer suggestions for search terms as the user types into the search box. The suggestions are based on terms that are in the database, and are typically used to make the user interface more interactive and to quickly suggest search terms that are appropriate to the application.

Suggestions are drawn from the range indexes and lexicons you specify in your request. For performance reasons, a range or collection index is recommended over a word lexicon; for details, see the Usage Notes for search:suggest. Suggestions can be further filtered by additional search criteria.

Use DatabaseClient.documents.suggest to generate search term completion suggestions using the Node.js Client API. The simplest suggestion request takes the following form:

db.documents.suggest(partialText, qualifyingQuery)

Where partialText is the query text for which you want to generate suggestions, and qualifyingQuery is any additional search criteria, including index and lexicon bindings. Though the qualifying query can be arbitrarily complex, typically at least a portion of it will eventually be filled in by the completed phrase.

For example, the following call requests suggestions for the partial phrase doc. Because the first parameter to qb.parsedFrom is an empty string, there are no additional search criteria.

db.documents.suggest('doc',
  qb.where(qb.parsedFrom('', 
    qb.parseBindings(
      qb.value('prefix', qb.bind('prefix')),
      qb.range('name', qb.bindDefault()))
  ))
)

The parse bindings in the qualifying query include a binding for unqualified terms (qb.bindDefault()) to a range query on the JSON property named name (qb.range('name', ...)). The database must include a matching range index.

Thus, if the database contains documents of the following form, then suggestions for doc are drawn only from the values of name and never from the values of prefix or alias:

{ "prefix": "xdmp", 
  "name": "documentLoad", 
  "alias": "document-load" }

When the user completes the search term, possibly from the suggestions, the empty string can be replaced by the complete phrase in the document query. Thus if the user completes the term as documentLoad, the same query can be used as follows to retrieve matching documents:

db.documents.query(
  qb.where(qb.parsedFrom('documentLoad', 
    qb.parseBindings(
      qb.value('prefix', qb.bind('prefix')),
      qb.range('name', qb.bindDefault()))
  ))
)
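
Because the suggest call and the document query differ only in the string passed to qb.parsedFrom, the shared part can be factored out. The following sketch shows one way to do that. The helper name qualifyingQuery is hypothetical, and qb is assumed to be marklogic.queryBuilder, passed in as a parameter.

```javascript
// Hypothetical helper: builds the qualifying query for a given search string.
// `qb` is expected to be marklogic.queryBuilder; it is passed in so that the
// same bindings serve both the suggest call and the document query.
function qualifyingQuery(qb, searchText) {
  return qb.where(
    qb.parsedFrom(searchText,
      qb.parseBindings(
        qb.value('prefix', qb.bind('prefix')),
        qb.range('name', qb.bindDefault())
      )
    )
  );
}

// Usage, assuming `db` is a DatabaseClient:
//   db.documents.suggest('doc', qualifyingQuery(qb, ''));
//   db.documents.query(qualifyingQuery(qb, 'documentLoad'));
```

Keeping the bindings in one place means that the suggestions and the final search cannot drift apart when a binding changes.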

The qualifying query can include other search criteria. The following example adds the query prefix:xdmp. The bindings associate the prefix term with a value query on the JSON property named prefix. The prefix:xdmp term could be a portion of search box text previously entered by the user.

db.documents.suggest('doc',
  qb.where(qb.parsedFrom('prefix:xdmp', 
    qb.parseBindings(
      qb.value('prefix', qb.bind('prefix')),
      qb.range('name', qb.bindDefault()))
  ))
)

In this case, suggestions are drawn from the name property as before, but they are limited to values that occur in documents that satisfy the prefix:xdmp query. That is, suggestions are drawn from values in documents that meet both these criteria:

  • Contain a JSON property named name whose value begins with doc, AND
  • Contain a JSON property named prefix with the exact value xdmp
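
The effect of these two criteria can be modeled locally. The following sketch is plain JavaScript that runs without a server. It is only an illustration of the filtering logic, not how MarkLogic computes suggestions. The function name modelSuggestions is hypothetical, matching is simplified to a case-sensitive prefix test, and the sample values are those of the example documents used later in this section.

```javascript
// Sample property values taken from the example documents in this section.
const docs = [
  { prefix: 'xdmp', name: 'documentLoad',   alias: 'document-load' },
  { prefix: 'xdmp', name: 'documentInsert', alias: 'document-insert' },
  { prefix: 'cts',  name: 'documentQuery',  alias: 'document-query' },
  { prefix: 'cts',  name: 'search',         alias: 'search' },
  { prefix: 'xdmp', name: 'documentDelete', alias: 'document-delete' }
];

// Hypothetical local model: keep the documents that pass the qualifying
// filter, then collect the distinct values of `property` that start with
// the partial text.
function modelSuggestions(documents, partialText, property, qualifies) {
  const values = documents
    .filter(qualifies)
    .map((doc) => doc[property])
    .filter((value) => typeof value === 'string' && value.startsWith(partialText));
  return [...new Set(values)].sort();
}

// No additional criteria: every document qualifies.
const unfiltered = modelSuggestions(docs, 'doc', 'name', () => true);
// Qualifying criteria equivalent to prefix:xdmp.
const filtered = modelSuggestions(docs, 'doc', 'name', (doc) => doc.prefix === 'xdmp');

console.log(unfiltered);
console.log(filtered);
```

In this model the value documentQuery appears only in the unfiltered list, because its document has the prefix value cts.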

The term to be completed can also use explicit bindings. For example, the following call requests suggestions for aka:doc, where aka is bound to a range index on the JSON property alias. Suggestions are only drawn from values of this property.

db.documents.suggest('aka:doc',
  qb.where(qb.parsedFrom('', 
    qb.parseBindings(
      qb.range('alias', qb.bind('aka')),
      qb.value('prefix', qb.bind('prefix')),
      qb.range('name', qb.bindDefault()))
  ))
)

The suggestions returned in this case include the prefix. For example, one suggestion might be aka:document-load.
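
An application that displays only the value part of such a suggestion needs to separate the constraint name from the value. The following is a minimal sketch in plain JavaScript. The helper name splitSuggestion is hypothetical, and the sketch does not attempt to handle any quoting of values.

```javascript
// Hypothetical helper: splits a suggestion such as 'aka:document-load' into
// its constraint name and value. Suggestions that do not start with one of
// the known constraint names are returned with constraint set to null.
function splitSuggestion(suggestion, constraintNames) {
  const separator = suggestion.indexOf(':');
  if (separator > 0) {
    const constraint = suggestion.slice(0, separator);
    if (constraintNames.includes(constraint)) {
      return { constraint, value: suggestion.slice(separator + 1) };
    }
  }
  return { constraint: null, value: suggestion };
}

console.log(splitSuggestion('aka:document-load', ['aka', 'prefix']));
// { constraint: 'aka', value: 'document-load' }
console.log(splitSuggestion('documentLoad', ['aka', 'prefix']));
// { constraint: null, value: 'documentLoad' }
```

Checking the prefix against the list of bound constraint names avoids misreading a plain value that happens to contain a colon.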

The qualifying query can include both string query and structured query components, but it usually includes one or more index or lexicon bindings with which to constrain the suggestions. For example, the following code adds a directory query that limits suggestions to documents in the database directory /suggest/.

db.documents.suggest('doc',
  qb.where(qb.parsedFrom('', 
    qb.parseBindings(
      qb.value('prefix', qb.bind('prefix')),
      qb.range('name', qb.bindDefault()))
  ), qb.directory('/suggest/', true))
)

You can override bindings on a per-suggest basis without modifying your qualifying query by including an additional suggestBindings parameter.

In cases where you're using a previously constructed qualifying query, but you want to add bindings that limit the scope of suggestions for other reasons (such as performance), you can add override bindings using queryBuilder.suggestBindings.

For example, the following code overrides the binding for bare terms in the qualifying query with a binding to a range index on the JSON property alias. Thus, if a document includes a name property with value documentLoad and an alias property with value document-load, then the suggestions would include documentLoad without the suggestBindings specification, but document-load with the override.

db.documents.suggest('doc',
  qb.where(qb.parsedFrom('', 
    qb.parseBindings(
      qb.value('prefix', qb.bind('prefix')),
      qb.range('name', qb.bindDefault()))
  )),
  qb.suggestBindings(qb.range('alias', qb.bindDefault()))
)

Overrides are per binding. In the example above, only the default binding for bare terms is overridden. The binding for prefix continues to take effect as long as the suggestBindings do not include a binding for prefix.
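
The per-binding behavior can be pictured as a merge keyed by constraint name. The following sketch is a conceptual model in plain JavaScript, not the client or server implementation. The function name effectiveBindings, the object shapes, and the key default (standing for the binding for bare terms) are all hypothetical.

```javascript
// Hypothetical local model of per-binding overrides. Each binding is keyed
// by its constraint name; the key 'default' stands for the binding that
// applies to bare terms.
function effectiveBindings(queryBindings, suggestOverrides) {
  // Later entries win, so an override replaces only the binding that
  // shares its key. All other bindings are carried over unchanged.
  return { ...queryBindings, ...suggestOverrides };
}

const queryBindings = {
  prefix: { type: 'value', property: 'prefix' },
  default: { type: 'range', property: 'name' }
};
const suggestOverrides = {
  default: { type: 'range', property: 'alias' }
};

const effective = effectiveBindings(queryBindings, suggestOverrides);
console.log(effective.default.property); // alias
console.log(effective.prefix.property);  // prefix
```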

Example: Generating Search Term Suggestions

The example in this section illustrates the use cases described in Understanding the Suggestion Interface.

The script first loads the example documents into the database, and then generates suggestions from them. To run the example, you must add the following range indexes. You can create them using the Admin Interface or the Admin API. For details, see Range Indexes and Lexicons in the Administrator's Guide.

  • An element range index of type string with local name name.
  • An element range index of type string with local name alias.
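
If you script your database configuration, the two index definitions can also be described as data. The following sketch builds a JSON payload in the shape assumed for the range-element-index database property of the REST Management API, which is a different interface from the Admin Interface and Admin API mentioned above. The helper name, the collation value, and the endpoint named in the comment are assumptions. Verify them against the Management API reference before use, and check whether applying such a payload replaces index definitions that already exist.

```javascript
// Hypothetical helper: describes one string element range index in the
// shape assumed for the Management REST API "range-element-index" property.
function stringElementRangeIndex(localname) {
  return {
    'scalar-type': 'string',
    'namespace-uri': '',   // JSON properties are not in a namespace
    'localname': localname,
    'collation': 'http://marklogic.com/collation/', // assumed default collation
    'range-value-positions': false,
    'invalid-values': 'reject'
  };
}

// Assumed request body for PUT /manage/v2/databases/{database}/properties.
const payload = {
  'range-element-index': ['name', 'alias'].map(stringElementRangeIndex)
};

console.log(JSON.stringify(payload, null, 2));
```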

The example covers the following use cases, which are discussed in more detail in Understanding the Suggestion Interface.

  • Case 1: Suggestions for doc drawn from the name property
  • Case 2: Suggestions for doc drawn from name where prefix is xdmp
  • Case 3: Suggestions for doc drawn from name where prefix is xdmp and the suggestion is from a document in the /suggest/ directory.
  • Case 4: Suggestions for aka:doc where the aka prefix causes suggestions to be drawn from the alias property.
  • Case 5: Suggestions for doc drawn from the alias property by virtue of a suggest binding override.

The table below summarizes the property values in the example documents for quick reference.

URI                      name            prefix  alias
/suggest/load.json       documentLoad    xdmp    document-load
/suggest/insert.json     documentInsert  xdmp    document-insert
/suggest/query.json      documentQuery   cts     document-query
/suggest/search.json     search          cts     search
/elsewhere/delete.json   documentDelete  xdmp    document-delete

Running the example produces results similar to the following:

1: Suggestions for naked term "doc":
["documentDelete","documentInsert","documentLoad","documentQuery"]

2: Suggestions filtered by prefix:xdmp:
["documentDelete","documentInsert","documentLoad"]

3: Suggestions filtered by prefix:xdmp and dir /suggest/:
["documentInsert","documentLoad"]

4: Suggestions for "aka:doc":
[
  "aka:document-delete",
  "aka:document-insert",
  "aka:document-load",
  "aka:document-query"
]

5: Suggestions with overriding bindings:
["document-delete","document-insert","document-load"]

To run the example, copy the following script into a file, modify the database connection information as needed, and execute the script with the node command. The script assumes the connection information is contained in a file named my-connection.js, as described in Using the Examples in This Guide.

const marklogic = require('marklogic');
const my = require('./my-connection.js');
const db = marklogic.createDatabaseClient(my.connInfo);
const qb = marklogic.queryBuilder;

// NOTE: This example requires a database configuration
// that includes two element range indexes:
// - type string, local name name
// - type string, local name alias

// Initialize the database with the sample documents
db.documents.write([
  { uri: '/suggest/load.json',
    contentType: 'application/json',
    content: {
      prefix: 'xdmp',
      name: 'documentLoad',
      alias: 'document-load'
  } },
  { uri: '/suggest/insert.json',
    contentType: 'application/json',
    content: {
      prefix: 'xdmp',
      name: 'documentInsert',
      alias: 'document-insert'
  } },
  { uri: '/suggest/query.json',
    contentType: 'application/json',
    content: {
      prefix: 'cts',
      name: 'documentQuery',
      alias: 'document-query'
  } },
  { uri: '/suggest/search.json',
    contentType: 'application/json',
    content: {
      prefix: 'cts',
      name: 'search',
      alias: 'search'
  } },
  { uri: '/elsewhere/delete.json',
    contentType: 'application/json',
    content: {
      prefix: 'xdmp',
      name: 'documentDelete',
      alias: 'document-delete'
  } },
]).result().then(function(response) {
  // (1) Get suggestions for a naked term
  return db.documents.suggest('doc', 
    qb.where(qb.parsedFrom('', 
      qb.parseBindings(
        qb.range('name', qb.bindDefault()))
    ))
  ).result(null, function(error) {
    console.log(JSON.stringify(error, null, 2)); 
  });
}).then(function(response) {
  console.log('1: Suggestions for naked term "doc":');
  console.log(JSON.stringify(response));

  // (2) Get suggestions for a qualified term
  return db.documents.suggest('doc', 
    qb.where( qb.parsedFrom('prefix:xdmp', 
      qb.parseBindings(
        qb.value('prefix', qb.bind('prefix')),
        qb.range('name', qb.bindDefault()))
    ))
  ).result(null, function(error) {
    console.log(JSON.stringify(error, null, 2)); 
  });
}).then(function(response) {
  console.log('\n2: Suggestions filtered by prefix:xdmp:');
  console.log(JSON.stringify(response));

  // (3) Suggestions limited by directory
  return db.documents.suggest('doc', 
    qb.where( qb.parsedFrom('prefix:xdmp', 
      qb.parseBindings(
        qb.value('prefix', qb.bind('prefix')),
        qb.range('name', qb.bindDefault()))
      ), 
      qb.directory('/suggest/', true))
  ).result(null, function(error) {
    console.log(JSON.stringify(error, null, 2)); 
  });
}).then(function(response) {
  console.log('\n3: Suggestions filtered by prefix:xdmp and dir /suggest/:');
  console.log(JSON.stringify(response));

  // (4) Get suggestions for a term with a binding
  return db.documents.suggest('aka:doc', 
    qb.where( qb.parsedFrom('',
      qb.parseBindings(
        qb.range('alias', qb.bind('aka')),
        qb.range('name', qb.bindDefault()))
    ))
  ).result(null, function(error) {
    console.log(JSON.stringify(error, null, 2)); 
  });
}).then(function(response) {
  console.log('\n4: Suggestions for "aka:doc":');
  console.log(JSON.stringify(response, null, 2));

  // (5) Get suggestions using a binding override
  return db.documents.suggest('doc', 
    qb.where( qb.parsedFrom('prefix:xdmp', 
      qb.parseBindings(
        qb.value('prefix', qb.bind('prefix')),
        qb.range('name', qb.bindDefault()))
    )),
    qb.suggestBindings(
        qb.range('alias', qb.bindDefault()))
  ).result(null, function(error) {
    console.log(JSON.stringify(error, null, 2)); 
  });
}).then(function(response) {
  console.log('\n5: Suggestions with overriding bindings:');
  console.log(JSON.stringify(response));
}, function(error) {
  console.log(JSON.stringify(error, null, 2)); 
});

Loading the Example Data

Several of the examples in this chapter rely on data derived from the MarkLogic Samplestack seed data. Samplestack is an open-source implementation of the MarkLogic Reference Application architecture; for details, see the Reference Application Architecture Guide.

To load the data, copy the following script to a file and run it. The script uses the connection data described in Using the Examples in This Guide.

Some of the examples require range indexes.

const marklogic = require('marklogic');
const my = require('./my-connection.js');

const documents = [
{ uri: '/contributors/contrib1.json', content:
  {"Contributor":{
    "userName":"souser10002@email.com", "reputation":446, 
    "displayName":"Lars Fosdal", "originalId":"10002", 
    "location":"Oslo, Norway", 
    "aboutMe":"Software Developer since 1987, mainly using Delphi.", 
    "id":"sou10002"}}},
{ uri: '/contributors/contrib2.json', content:
  {"Contributor":{
    "userName":"souser1000634@email.com", "reputation":272, 
    "displayName":"petrumo", "originalId":"1000634", 
    "location":"Oslo, Norway", 
    "aboutMe":"Developer at AspiroTV", 
    "id":"sou1000634"}}},
{ uri: '/contributors/contrib3.json', content:
  {"Contributor":{
    "userName":"souser1248651@email.com", "reputation":1, 
    "displayName":"Nullable", "originalId":"1248651", 
    "location":"Ogden, UT", 
    "aboutMe":"...My current work includes work with MarkLogic Application Server (Using XML, Xquery, and Xpath), WPF/C#, and Android Development (Using Java)...",
    "id":"sou1248651"}}},
{ uri: '/contributors/contrib4.json', content:
  {"Contributor":{
    "userName":"souser1601813@email.com", "reputation":91, 
    "displayName":"grechaw", "originalId":"1601813", 
    "location":"Occidental, CA", 
    "aboutMe":"XML (XQuery, Java, XML database) software engineer at MarkLogic. Hardcore accordion player.", 
    "id":"sou1601813"}}},
{ uri: '/test/query/extraDir/doc6.xml',
  collections: ['http://marklogic.com/test/abc'],
  contentType: 'application/xml',
  content:
    '<container xmlns:abc="http://marklogic.com/test/abc">' +
      '<target>match</target>' +
      '<abc:elem>word</abc:elem>' +
    '</container>'},
{ uri: '/questions/q1.json', content:
  { "tags": [ "java", "sql", "json", "nosql", "marklogic" ],
    "owner": {
      "userName": "souser1238625@email.com",
      "displayName": "Raevik",
      "id": "sou1238625"
    },
    "id": "soq22431350",
    "accepted": false,
    "text": "I have a MarkLogic DB instance populated with JSON documents that interest me. I have executed a basic search and have a SearchHandle that will give me the URIs that matched. Am I required to now parse through the flattened JSON string looking for my key?",
    "creationDate": "2014-03-16T00:06:06.497",
    "title": "MarkLogic basic questions on equivalent of SELECT with Java API"
 }},
{ uri: '/questions/q2.json', content:
  { "tags": [ "java", "dbobject", "mongodb" ],
    "owner": {
      "userName": "souser69803@email.com",
      "displayName": "Ankur",
      "id": "sou69803"
    },
    "id": "soq7684223",
    "accepted": true,
    "text": "MongoDB seems to return BSON/JSON objects.  I thought that surely you'd be able to retrieve values as Strings, ints etc. which can then be saved as POJO.  I have a DBObject (instantiated as a BasicDBObject) as a result of iterating over a list ... (cur.next()).  Is the only way (other than using some sort of persistence framework) to get the data into a POJO to use a JSON serlialiser/deserialiser?",
    "creationDate": "2011-10-07T07:27:18.097",
    "title": "Convert DBObject to a POJO using MongoDB Java Driver"
  }},
{ uri: '/questions/q3.json', content:
  { "tags": [ "json", "marklogic" ],
    "owner": {
      "userName": "souser1238625@email.com",
      "displayName": "Raevik",
      "id": "sou1238625"
    },
    "id": "soq22412345",
    "accepted": false,
    "text": "Does marklogic manage JSON documents?",
    "creationDate": "2014-02-10T00:13:03.282",
    "title": "JSON document management in MarkLogic"
 }},
];

const db = marklogic.createDatabaseClient(my.connInfo);

db.documents.write(documents)
  .result(null, function(error) {
      console.log(JSON.stringify(error));
    });

The corresponding query to extract content from the document '/test/query/extraDir/doc6.xml' looks as follows:

const assert = require('assert');
const qb = marklogic.queryBuilder;

db.documents.query(
  qb.where(
    qb.word('target', 'match')
  ).slice(0, 1, qb.extract({
    selected: 'include',
    paths: '//abc:elem',
    namespaces: {'abc': 'http://marklogic.com/test/abc'}
  }))
).result(function(response) {
  // Expect exactly one matching document that carries extracted content
  assert.strictEqual(response.length, 1);
  const document = response[0];
  assert('content' in document);
  // The extracted element is returned with its namespace declaration
  assert(document.content.includes('<abc:elem   xmlns:abc="http://marklogic.com/test/abc">word</abc:elem>'));
}, function(error) {
  console.log(JSON.stringify(error, null, 2));
});