MarkLogic 9 Product Documentation
Search Developer's Guide
— Chapter 5

Searching Using Query By Example

This chapter describes how to perform searches using Query By Example (QBE). A QBE is a query whose structure closely models the structure of the documents you want to match. You can use a QBE to search XML and JSON documents with the REST, Node.js and Java APIs.

For details on supporting APIs, see Java Application Developer's Guide and REST Application Developer's Guide.

QBE Overview

The simple, intuitive syntax of a Query By Example (QBE) enables rapid prototyping of queries for "documents that look like this", because search criteria in a QBE resemble the structure of documents in your database. In its simplest form, a QBE models one or more XML elements, XML element attributes, or JSON properties in your documents.

For example, if your documents include an author XML element or JSON property, you can use the following QBE to find documents with an author value of Mark Twain.

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <author>Mark Twain</author>
  </q:query>
</q:qbe>
JSON
{
  "$query": { "author": "Mark Twain" }
}

A QBE always contains a query component in which you define search criteria. A QBE can include an optional response component for customizing search results, and flags and options that control search behaviors. For details, see QBE Structural Reference.

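QBE exposes many powerful features of the Search API, including the following:

  • Search Criteria Based on Document Structure
  • Logical Operators
  • Comparison Operators
  • Query by Value or Word
  • Search Result Customization
  • Options for Controlling Search Behavior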

You can prototype queries using QBE without creating any database indexes, though doing so has implications for performance. For details, see How Indexing Affects Your Query.

This chapter covers the syntax and semantics of QBE. You can use a QBE to search XML and JSON documents with the following MarkLogic APIs:

API | More Information
Node.js Client API | Searching with Query By Example in the Node.js Application Developer's Guide.
Java Client API | Prototype a Query Using Query By Example in the Java Application Developer's Guide.
REST Client API | Using Query By Example to Prototype a Query in the REST Application Developer's Guide.

If you need access to more advanced search features, APIs are available for converting a QBE to a combined query, giving you a foundation on which to build. For details, refer to Client API documentation.

Search Criteria Based on Document Structure

A QBE uses search criteria expressed as XML elements, XML element attributes, or JSON properties that closely resemble portions of documents in the database.

For example, if the database contains documents of the following form:

Format Example Document
XML
<book>
  <title>Tom Sawyer</title>
  <author>Mark Twain</author>
  <edition format="paperback"/>
</book>
JSON
{"book": {
  "title": "Tom Sawyer",
  "author" : "Mark Twain",
  "edition": [
    { "format": "paperback" }
  ]
} }

Then you can construct a QBE to find all paperback books by a given author by creating criteria that model the author and the edition format. The following QBE finds all paperback books by Mark Twain.

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <author>Mark Twain</author>
    <edition format="paperback"/>
  </q:query>
</q:qbe>
JSON
{"$query": {
    "author": "Mark Twain",
    "edition": {
      "format": "paperback"
    }
} }

By default, the literal values in criteria must exactly match document contents. That is, the above query matches if the author value is Mark Twain, but it will not match documents where the author is M. Twain or mark twain. You can change this behavior using word queries and options. For details, see Understanding QBE Sub-Query Types and Adding Options to a QBE.

You can construct criteria that express value, word, and range queries. For example, you can construct a QBE that satisfies all of the following criteria. The Example Criteria column shows XML and JSON criteria that express each requirement.

Requirement | Query Type | Example Criteria
the author includes twain | word
<author><q:word>twain</q:word></author>
"author": {"$word": "twain"}
there is a paperback edition | value
<edition format="paperback"/>
"edition": {"format": "paperback"}
the price of the paperback edition is less than 9.00 | range
<edition>
  <price><q:lt>9.00</q:lt></price>
</edition>
"edition": {"price": {"$lt": 9.00} }

When you combine the above criteria into a single query, you get the following QBE. Notice that the child elements of query are implicitly AND'd together.

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <author><q:word>twain</q:word></author>
    <edition format="paperback">
      <price><q:lt>9.00</q:lt></price>
    </edition>
    <q:filtered>true</q:filtered>
  </q:query>
</q:qbe>
JSON
{
  "$query": {
    "author": {
      "$word": "twain"
    },
    "edition": {
      "format": "paperback",
      "price": { "$lt": 9.00 }
    },
    "$filtered": true
  }
}
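
As a rough illustration of how such a query might be assembled programmatically, the following Python sketch builds the JSON form of the combined QBE above from plain dictionaries. The helper names (word, less_than, qbe) are hypothetical and are not part of any MarkLogic client library. Only the $query, $word, $lt, and $filtered property names come from the QBE grammar shown in this section.

```python
import json

def word(text):
    """Build the body of a word query criteria (hypothetical helper)."""
    return {"$word": text}

def less_than(bound):
    """Build the body of a range query criteria using the lt operator (hypothetical helper)."""
    return {"$lt": bound}

def qbe(criteria, filtered=False):
    """Wrap criteria in a $query object; sibling criteria are implicitly AND'd."""
    query = dict(criteria)
    if filtered:
        query["$filtered"] = True
    return {"$query": query}

combined = qbe(
    {
        "author": word("twain"),
        "edition": {"format": "paperback", "price": less_than(9.00)},
    },
    filtered=True,
)

# Serialize the query so it can be passed to whichever client API you use.
print(json.dumps(combined, indent=2))
```

Because the criteria are ordinary dictionary entries, adding another requirement is a matter of adding another key to the object passed to qbe.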

The above examples demonstrate searching for direct containment, such as "the author is Mark Twain". You can also search for matches anywhere within a containing XML element or JSON property. For example, suppose a book contains author and editor names, broken down into first-name and last-name:

Format Example Document
XML
<book>
  <author>
    <first-name>Mark</first-name>
    <last-name>Twain</last-name>
  </author>
  <editor>
    <first-name>Mark</first-name>
    <last-name>Matthews</last-name>
  </editor>
</book>
JSON
{"book": {
  "author" : {
    "first-name": "Mark",
    "last-name": "Twain"
  },
  "editor" : {
    "first-name": "Mark",
    "last-name": "Matthews"
  }
} }

You can search for any occurrences of Mark as a first name contained by a book using criteria such as the following:

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <book><first-name>Mark</first-name></book>
  </q:query>
</q:qbe>
JSON
{"$query": {
    "book": { "first-name": "Mark" }
} }

Such criteria represent container queries. For details, see Container Query.

Logical Operators

You can use logical operators to create powerful composed queries. The QBE grammar supports and, or, not, and near composers. The following example matches documents that contain twain or shakespeare in the author XML element or JSON property.

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <q:or>
      <author><q:word>twain</q:word></author>
      <author><q:word>shakespeare</q:word></author>
    </q:or>
  </q:query>
</q:qbe>
JSON
{
  "$query" : {
    "$or": [
      { "author": {"$word": "twain" } },
      { "author": {"$word": "shakespeare" } }
    ]
  }
}

Sub-queries that are immediate children of query represent an implicit and query.

For details, see Composed Query.

Comparison Operators

The QBE grammar supports the following comparison operators for constructing range queries on XML element, XML attribute, and JSON property values: lt, le, eq, ne, ge, gt. For example, the following query matches all documents where the price is greater than or equal to 10.00 and less than or equal to 20.00.

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <price><q:ge>10.00</q:ge></price>
    <price><q:le>20.00</q:le></price>
    <q:filtered>true</q:filtered>
  </q:query>
</q:qbe>
JSON
{ "$query": {
    "$and": [
        {"price" : { "$ge": 10.00 } },
        {"price" : { "$le": 20.00 } } ],
    "$filtered": true
}}

The filtered flag is included in the above query because you must either use filtered search or back the range queries on price with a range index. For details, see How Indexing Affects Your Query.

For details, see Range Query.
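
The comparison semantics can be pictured with a small local sketch. The Python below is not how MarkLogic evaluates a QBE. It only mimics the six operators against an in-memory value, and the names OPERATORS, matches_range, and price_in_range are hypothetical.

```python
import operator

# Map each QBE comparison operator name to the equivalent Python comparison.
OPERATORS = {
    "lt": operator.lt,
    "le": operator.le,
    "eq": operator.eq,
    "ne": operator.ne,
    "ge": operator.ge,
    "gt": operator.gt,
}

def matches_range(value, criteria):
    """Return True if value satisfies every {"$op": bound} entry in criteria."""
    for name, bound in criteria.items():
        compare = OPERATORS[name.lstrip("$")]
        if not compare(value, bound):
            return False
    return True

# Mirrors the query above: price >= 10.00 AND price <= 20.00.
price_criteria = [{"$ge": 10.00}, {"$le": 20.00}]

def price_in_range(price):
    """Apply both range criteria; all of them must hold (AND semantics)."""
    return all(matches_range(price, c) for c in price_criteria)

print(price_in_range(15.00))
print(price_in_range(25.00))
```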

Query by Value or Word

When you construct a criteria on a literal value, it is an implicit value query that matches an exact value. For example, the following criteria matches only when author is Mark Twain. It will not match mark twain or M. Twain:

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <author>Mark Twain</author>
  </q:query>
</q:qbe>
JSON
{
  "$query": { "author": "Mark Twain" }
}

When this is not the desired behavior, you can use a word query and/or options to modify the default behavior. A word query differs from a value query in two ways: It relaxes the default exact match semantics of a value query, and it matches a subset of the value in a document.

For example, the following query matches if the author contains twain, with any capitalization, so it matches values that are not matched by the original query, such as Mark Twain, M. Twain and mark twain.

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <author><q:word>twain</q:word></author>
  </q:query>
</q:qbe>
JSON
{
  "$query": { "author": { "$word": "twain" } }
}

For details, see Value Query and Word Query.
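
To make the difference concrete, the following Python sketch approximates the two default behaviors on plain strings. It is a simplification under stated assumptions, not MarkLogic's matching logic: it ignores stemming, wildcarding, and diacritics, and it tokenizes words with a basic regular expression. The function names value_match and word_match are hypothetical.

```python
import re

def value_match(document_value, query_value):
    """Approximate a default (exact) value query: the whole value must be identical."""
    return document_value == query_value

def word_match(document_value, query_text):
    """Approximate a default word query: case-insensitive match of a word or
    phrase against the words of the value (no stemming or wildcarding)."""
    doc_words = re.findall(r"\w+", document_value.lower())
    query_words = re.findall(r"\w+", query_text.lower())
    n = len(query_words)
    if n == 0:
        return False
    # Slide over the document words looking for the query words in sequence.
    return any(doc_words[i:i + n] == query_words
               for i in range(len(doc_words) - n + 1))

for author in ["Mark Twain", "M. Twain", "mark twain"]:
    print(author, value_match(author, "Mark Twain"), word_match(author, "twain"))
```

Under these assumptions, only the first author satisfies the value comparison, while all three satisfy the word comparison.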

Search Result Customization

You can include a response XML element or JSON property to customize the contents of returned search results. The default search results include a highlighted snippet of matching XML elements or JSON properties. Use the response section of a QBE to disable snippeting, extract additional elements, or return an entire document.

For details, see Customizing Search Results.

Options for Controlling Search Behavior

The QBE grammar includes several flags and options to control your search. Flags usually have a global effect on your search, such as how to score search results. Options affect a portion of your query, such as whether or not to perform an exact match against a particular XML element or JSON property value.

The following example uses the exact option to disable exact matches on value queries.

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <author><q:value exact="false">mark twain</q:value></author>
  </q:query>
</q:qbe>
JSON
{
  "$query": {
    "author": { 
      "$exact": false,
      "$value" : "mark twain"
    }
  }
}

For more details, see Adding Options to a QBE.

Example

This section includes an example that uses most of the query features of a QBE.

XML Example

This example assumes the database contains documents with the following structure:

<book>
  <title>Tom Sawyer</title>
  <author>Mark Twain</author>
  <edition format="paperback">
    <publisher>Clipper</publisher>
    <pub-date>2011-08-01</pub-date>
    <price>9.99</price>
    <isbn>1613800917</isbn>
  </edition>
</book>

The following query uses most of the features of QBE and matches the above document. The sub-queries that are immediate children of query are implicitly AND'd together, so all these conditions must be met by matching documents.

<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <title>
      <q:value exact="false">Tom Sawyer</q:value>
    </title>
    <q:near distance="2">
      <author><q:word>mark</q:word></author>
      <author><q:word>twain</q:word></author>
    </q:near>
    <edition format="paperback">
      <q:or>
        <publisher>Clipper</publisher>
        <publisher>Daw</publisher>
      </q:or>
    </edition>
    <q:and>
      <price><q:lt>10.00</q:lt></price>
      <price><q:ge>8.00</q:ge></price>
    </q:and>
    <q:filtered>true</q:filtered>
  </q:query>
</q:qbe>

The following table explains the requirement expressed by each component of the query. Each of the subquery types used in this example is explored in more detail in Understanding QBE Sub-Query Types.

Requirement Example Criteria
The title is Tom Sawyer. Exact match is disabled, so the match is not sensitive to whitespace, punctuation, or diacritics. The match is case sensitive because the value (Tom Sawyer) is mixed case.
<title>
  <q:value exact="false">Tom Sawyer</q:value>
</title>
The author contains the word mark and the word twain within 2 words of each other.
<q:near distance="2">
  <author><q:word>mark</q:word></author>
  <author><q:word>twain</q:word></author>
</q:near>
The edition format is paperback and the publisher is Clipper or Daw. All the atomic values in this sub-query use exact value match semantics.
<edition format="paperback">
  <q:or>
    <publisher>Clipper</publisher>
    <publisher>Daw</publisher>
  </q:or>
</edition>
The price is less than 10.00 and greater than or equal to 8.00.
<q:and>
  <price><q:lt>10.00</q:lt></price>
  <price><q:ge>8.00</q:ge></price>
</q:and>
Use filtered search. This flag can be omitted if there is a range index on price. For details, see How Indexing Affects Your Query.
<q:filtered>true</q:filtered>

JSON Example

This example assumes the database contains documents with the following structure:

{"book": {
  "title": "Tom Sawyer",
  "author" : "Mark Twain",
  "edition": [
    { "format": "paperback",
      "publisher": "Clipper",
      "pub-date": "2011-08-01",
      "price" : 9.99,
      "isbn": "1613800917"
    }
  ]
} }

The following query uses most of the features of QBE and matches the above document. The sub-queries that are immediate children of query are implicitly AND'd together, so all these conditions must be met by matching documents.

{"$query": {
    "title": {
      "$value": "Tom Sawyer",
      "$exact": false
    },
    "$near": [
      { "author": { "$word": "mark" } },
      { "author": { "$word": "twain" } }
    ], "$distance": 2,
    "edition": {
      "format": "paperback",
      "$or" : [
        { "publisher": "Clipper" },
        { "publisher": "Daw" }
      ]
    },
    "$and": [
      {"price": { "$lt": 10.00 }},
      {"price": { "$ge": 8.00 }}
    ],
    "$filtered": true
} }

The following table explains the requirement expressed by each component of the query. Each of the subquery types used in this example is explored in more detail in Understanding QBE Sub-Query Types.

Requirement Example Criteria
The title is Tom Sawyer. Exact match is disabled, so the match is not sensitive to whitespace, punctuation, or diacritics. The match is case sensitive because the value (Tom Sawyer) is mixed case.
"title": {
  "$value": "Tom Sawyer",
  "$exact": false
}
The author contains the word mark and the word twain within 2 words of each other.
"$near": [
  { "author": { "$word": "mark" } },
  { "author": { "$word": "twain" } }
], 
"$distance": 2
The edition format is paperback and the publisher is Clipper or Daw. All the atomic values in this sub-query use exact value match semantics.
"edition": {
  "format": "paperback",
  "$or" : [
    { "publisher": "Clipper" },
    { "publisher": "Daw" }
  ]
}
The price is less than 10.00 and greater than or equal to 8.00.
"$and": [
  {"price": { "$lt": 10.00 }},
  {"price": { "$ge": 8.00 }}
]
Use filtered search. This flag can be omitted if there is a range index on price. For details, see How Indexing Affects Your Query.
"$filtered": true

Understanding QBE Sub-Query Types

The query portion of a QBE is composed of sub-queries. While QBE enables you to express a sub-query using syntax that closely models your documents, you should understand the query types represented by this modeling. You can express the following query types in a QBE:

  • Value Query
  • Word Query
  • Range Query
  • Composed Query
  • Container Query

Value Query

A value query matches an entire literal value, such as a string, date, or number.

By default, an XML element or JSON property criteria represents a value query with exact match semantics:

  • The value in the criteria is matched with case, diacritic, punctuation, and whitespace sensitivity enabled.
  • Stemming and wildcarding are not enabled.
  • The specified value must be an immediate child of the containing XML element or JSON property.
  • The value in the query will not match if it is a subset of the value in a document.

For example, the following criteria only matches documents where the author XML element or JSON property contains exactly and only the text Twain. It will not match author values such as Mark Twain or twain.

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <author>Twain</author>
  </q:query>
</q:qbe>
JSON
{
  "$query": { "author": "Twain" }
}

You can override some of the exact match semantics with options. For example, you can disable case-sensitive matches. For details, see Adding Options to a QBE.

A value query can be explicit or implicit. The example above is an implicit value query. You can make an explicit value query using the value QBE keyword. This is useful when you want to add options to a value query. The following example is an explicit value query that uses the case-sensitive option.

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <author>
      <q:value case-sensitive="false">Twain</q:value>
    </author>
  </q:query>
</q:qbe>
JSON
{ "$query": { 
    "author": {
      "$value": "Twain",
      "$case-sensitive": false
    }
} }

Word Query

A word query matches a word or phrase appearing anywhere in a text value. A word query will match a subset of a text value. By default, word queries do not use exact match semantics.

  • The value in the criteria is matched with case, diacritic, punctuation, and whitespace sensitivity disabled.
  • Stemmed matches are included.
  • Wildcard matching is performed if wildcarding is enabled for the database.
  • The specified word or phrase can occur in the value of the immediately containing XML element or JSON property, or in the value of child components.

You can use options to override some of the match semantics. For details, see Adding Options to a QBE.

Word queries occurring within another container, such as an XML element or JSON property that describes content in your document, match occurrences within the container. Word queries that are not in a container, such as word queries that are immediate children of the top level QBE query wrapper, match occurrences anywhere in a document. For details, see Container Query and Searching Entire Documents.

The following example QBE matches if the author contains twain with any capitalization, so it matches values such as Mark Twain, M. Twain and mark twain.

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <author><q:word>twain</q:word></author>
  </q:query>
</q:qbe>
JSON
{
  "$query": { "author": { "$word": "twain" } }
}

In JSON, the value in a word query can be either a string or an array of strings. An array of values is treated as an AND-related list of word queries. For example, the following query matches documents where author contains word matches for mark and twain. The matched values need not be array item values.

{
  "$query": { "author": { "$word": [ "mark", "twain" ] } }
}

Range Query

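A range query matches values that satisfy a relational expression applied to a string, number, date, time, or dateTime value, such as less than 5 or not equal to 10. This section includes the following topics:

  • JSON Property Value Range Query
  • XML Element Value Range Query
  • XML Element Attribute Value Range Query
  • Type Conversion in Range Expressions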

JSON Property Value Range Query

To construct a range query for a JSON property value, construct a JSON property with the operator name prefixed with $ as the name and the boundary value as the value:

{ "$operator" : boundary-value }

The following example criteria tests for format not equal to paperback:

"format": {"$ne": "paperback" }

You cannot construct a range query that is constrained to match an array item.

XML Element Value Range Query

To construct a range query on an XML element value, use the following syntax, where q is the namespace prefix for http://marklogic.com/appservices/querybyexample:

<container>
  <q:operator>boundary-value</q:operator>
</container>

The following example criteria tests for publication date greater than 2010-01-01:

<pub-date>
  <q:gt>2010-01-01</q:gt>
</pub-date>

XML Element Attribute Value Range Query

To construct a range query on an XML element attribute value, prefix the operator name with $ and put the comparison expression in the string value of the attribute on the containing element criteria:

<container attr="$operator value" />

The following example criteria tests that @format of edition does not equal paperback:

<edition format="$ne paperback" />
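
The attribute form packs the operator and the boundary value into a single string. As a minimal sketch of how such a string can be read, the Python below splits it at the first space. The function name parse_attribute_criteria is hypothetical, and the handling of whitespace is an assumption. The source does not describe how MarkLogic itself parses the attribute value.

```python
def parse_attribute_criteria(attr_value):
    """Split an attribute criteria string into (operator, value).

    A string that starts with "$" is read as "$operator value"; anything else
    is treated as an implicit value query on the whole string.
    """
    if attr_value.startswith("$"):
        op, _, operand = attr_value.partition(" ")
        return op[1:], operand
    return "value", attr_value

print(parse_attribute_criteria("$ne paperback"))
print(parse_attribute_criteria("paperback"))
```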

Type Conversion in Range Expressions

By default, values in range queries are treated as xs:boolean, xs:double, xs:dateTime, xs:date, or xs:time if castable as such, and as strings otherwise.

You can use the xsi:type (XML) or $datatype (JSON) option to force a particular type conversion; for details, see Adding Options to a QBE.
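
The default conversion can be thought of as a series of castability checks. The Python sketch below is only an approximation under stated assumptions: the regular expressions cover common lexical forms rather than the full XML Schema definitions, the order in which the types are tried is assumed (the text above does not specify it), and the strings 1 and 0 are classified as numbers rather than booleans. The function name infer_range_type is hypothetical.

```python
import re

# Simplified lexical patterns; the real XML Schema lexical spaces are broader.
_TZ = r"(Z|[+-]\d{2}:\d{2})?"
_PATTERNS = [
    ("xs:boolean", re.compile(r"^(true|false)$")),
    ("xs:double", re.compile(r"^[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?$")),
    ("xs:dateTime", re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?" + _TZ + "$")),
    ("xs:date", re.compile(r"^\d{4}-\d{2}-\d{2}" + _TZ + "$")),
    ("xs:time", re.compile(r"^\d{2}:\d{2}:\d{2}(\.\d+)?" + _TZ + "$")),
]

def infer_range_type(text):
    """Return the name of the first type whose pattern matches, else xs:string."""
    for type_name, pattern in _PATTERNS:
        if pattern.match(text):
            return type_name
    return "xs:string"

for sample in ["9.00", "2010-01-01", "2010-01-01T12:00:00Z", "true", "paperback"]:
    print(sample, "->", infer_range_type(sample))
```

When the inferred type is not the one you want, for example when a numeric-looking identifier should be compared as a string, the xsi:type or $datatype option described above is the way to override it.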

Composed Query

A composed query is one composed of sub-queries joined by a logical operator such as and, or, not, or near. The following example matches documents where the value of author is Mark Twain or Robert Frost.

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <q:or>
      <author>Mark Twain</author>
      <author>Robert Frost</author>
    </q:or>
  </q:query>
</q:qbe>
JSON
{"$query": {
    "$or": [
      { "author": "Mark Twain" },
      { "author": "Robert Frost" }
    ]
} }

The near operator models a cts:near-query and accepts an optional distance XML attribute or JSON property to specify a maximum acceptable distance in words between matches for the operand queries. For example, the following near query specifies a maximum distance of 2 words. The default distance is 10.

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <q:near distance="2">
      <author><q:word>mark</q:word></author>
      <author><q:word>twain</q:word></author>
    </q:near>
  </q:query>
</q:qbe>
JSON
{"$query": {
    "$near": [
      { "author": { "$word": "mark" } },
      { "author": { "$word": "twain" } }
    ],
    "$distance": 2
} }
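
The effect of the distance setting can be approximated locally. The following Python sketch treats distance as the difference between word positions in a single string, which is a simplification. The precise semantics are those of cts:near-query, and the function name near_match is hypothetical.

```python
import re

def near_match(text, first, second, distance=10):
    """Return True if first and second occur within distance words of each other."""
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    # Compare every pair of occurrences and accept the first close enough pair.
    return any(
        abs(a - b) <= distance
        for a in first_positions
        for b in second_positions
    )

print(near_match("Mark Twain", "mark", "twain", distance=2))
print(near_match("Mark and his good friend Twain", "mark", "twain", distance=2))
```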

Container Query

A container query matches when sub-query conditions are met within the scope of a specific XML element or JSON property. In a container query, the relationship between the named container and the XML element or JSON property names used in the sub-queries is "contained by", not merely "child of".

A container query is implicitly defined when you use search criteria that model your document and that contain a composed query or structural sub-queries (XML element, XML attribute, or JSON property).

For example, an XML criteria such as the following defines a container query on edition because it contains an implicit value query on another element, price.

<edition><price>8.99</price></edition> 

By contrast, the following criteria is a value query, not a container query, on author:

<author>twain</author>

Similarly, the following JSON criteria is a container query on edition because it contains an implicit value query on another property, price.

"edition":{"price": 8.99}

By contrast, a criteria such as the following is a value query, not a container query, on author.

"author":"twain"

The examples below demonstrate how a container query for price contained by book matches at multiple levels.

XML
Query Example:
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <book><price>8.99</price></book>
  </q:query>
</q:qbe>
Matching Documents:
<book>
  <price>8.99</price>
</book>
<book>
  <edition>
    <price>8.99</price>
  </edition>
</book>
JSON
Query Example:
{"$query": {
  "book":{"price": 8.99}
} }
Matching Documents:
{ "book": {
    "price": 8.99
} }
{ "book": {
    "edition": {"price": 8.99 }
} }
{ "book": {
    "edition": [ {"price": 8.99} ]
} }
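
The "contained by" relationship amounts to a search through the whole substructure of the container. The Python sketch below illustrates that idea on parsed JSON documents. It is not MarkLogic's implementation, it compares values with plain equality, and the function names contains_property and container_match are hypothetical.

```python
def contains_property(node, name, value):
    """Return True if a property `name` with `value` occurs anywhere inside node."""
    if isinstance(node, dict):
        for key, child in node.items():
            if key == name and child == value:
                return True
            if contains_property(child, name, value):
                return True
    elif isinstance(node, list):
        return any(contains_property(item, name, value) for item in node)
    return False

def container_match(document, container, name, value):
    """Return True if `container` occurs in document and contains name == value."""
    if isinstance(document, dict):
        for key, child in document.items():
            if key == container and contains_property(child, name, value):
                return True
            if container_match(child, container, name, value):
                return True
    elif isinstance(document, list):
        return any(container_match(item, container, name, value) for item in document)
    return False

docs = [
    {"book": {"price": 8.99}},
    {"book": {"edition": {"price": 8.99}}},
    {"book": {"edition": [{"price": 8.99}]}},
    {"price": 8.99},  # price exists, but not inside a book
]
print([container_match(d, "book", "price", 8.99) for d in docs])
```

The first three documents correspond to the matching JSON documents listed above. The fourth is included only to show a price that is not contained by book.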

A query on an XML element attribute is a container query in that the element contains the attribute. However, only attributes on the containing element can match. The following criteria matches @format only when it appears as an attribute of edition. It does not match occurrences of @format on child elements of edition.

<edition format="paperback"/>

The following table contains XML examples of container queries.

Description Container Query

The element price is contained by the element edition and has a value of exactly 8.99.

price need not be an immediate child of edition.

<edition>
  <price>8.99</price>
</edition>
The attribute format is contained by the element edition and has the exact value "paperback".
<edition format="paperback"/>
The element price is contained by the element edition and has the exact value 8.99, or the element publisher is contained by the element edition and has the exact value "Fawcett".
<edition>
  <q:or>
    <price>8.99</price>
    <publisher>Fawcett</publisher>
  </q:or>
</edition>

The following table contains JSON examples of container queries.

Description Container Query

The JSON property price is contained in a property named edition and has the exact value 8.99.

price need not be an immediate child of edition.

"edition": {"price": 8.99 }
The JSON property price is contained in a property named edition and has the exact value 8.99, or the property named publisher is contained in a JSON property named edition and has the exact value "Fawcett".
"edition": {
  "$or": [
    { "price": 8.99 },
    { "publisher": "Fawcett" }
  ]
}
The property named edition contains a price property with the value 8.99 and a publisher property with the value "Fawcett" anywhere in its substructure. If the value of edition is an array, then price and publisher must match within the same array item. Otherwise, the matches need not be within the same object or array.
"edition": {
  "$and": [
    { "price": 8.99 },
    { "publisher": "Fawcett" }
  ]
}
The property named edition contains a price property with the value 8.99 and a publisher property with the value "Fawcett" anywhere in its substructure. The matches need not be within the same object or array.
"$and": [
  { "edition": {"price": 8.99} },
  { "edition": {"publisher":"Fawcett"} }
]

Search Criteria Quick Reference

This section provides templates for constructing composed queries and frequently used criteria that model your documents.

XML Search Criteria Quick Reference

The table below provides a quick reference for constructing QBE search criteria and composed queries in XML. Use these examples as templates for your own criteria. For more details, see QBE Structural Reference.

The examples below assume that the namespace prefix q is bound to http://marklogic.com/appservices/querybyexample.

Criteria Description Example
element e has value v
<e>v</e>
<e><q:value>v</q:value></e>
the value of attribute a of element e is v
<e a="v"/>
<e a="$value v"/>
element e contains word w anywhere in the substructure of the element content
<e><q:word>w</q:word></e>
the value of attribute a of element e includes the word w
<e a="$word w"/>
element e has a value greater than 5
<e><q:gt>5</q:gt></e>
the value of attribute a of element e is greater than 5
<e a="$gt 5"/>
element e exists
<e><q:exists/></e>
attribute a of element e exists
not supported
element e1 contains element e2 with value v; e2 can occur anywhere in the substructure of the element content
<e1>
    <e2>v</e2>
</e1>
element e1 contains a descendant element e2, and e2 contains word w anywhere in the substructure of the element content
<e1>
    <e2><q:word>w</q:word></e2>
</e1>
element e1 contains a descendant element e2, that has an attribute a with value v
<e1>
    <e2 a="v"/>
</e1>
element e1 contains a descendant element e2 that has an attribute a with word w in its value
<e1>
    <e2 a="$word w"/>
</e1>
a descendant of element e has attribute a
not supported
element e has value v1 or v2
<q:or>
    <e>v1</e>
    <e>v2</e>
</q:or>
element e contains word w1 or w2 anywhere in the substructure of the element content
<q:or>
    <e><q:word>w1</q:word></e>
    <e><q:word>w2</q:word></e>
</q:or>
element e1 contains a descendant element e2, and e2 has value v1 or v2
<e1>
    <q:or>
        <e2>v1</e2>
        <e2>v2</e2>
    </q:or>
</e1>
the value of attribute a of element e is v1 or v2
<q:or>
    <e a="v1"/>
    <e a="v2"/>
</q:or>
the value of attribute a of element e includes word w1 or w2
<q:or>
    <e a="$word w1"/>
    <e a="$word w2"/>
</q:or>
the value of attribute a of element e2 that is a descendant of element e1 is v1 or v2
<e1>
    <q:or>
        <e2 a="v1"/>
        <e2 a="v2"/>
    </q:or>
</e1>

JSON Search Criteria Quick Reference

The table below provides a quick reference for constructing QBE search criteria and composed queries in JSON. Where the example property name begins with c, the criteria represents a container query. For more details, see QBE Structural Reference.

This list of example criteria is not exhaustive. Additional forms are supported. For example, not all variants of explicit and implicit value queries are shown for a given criteria.

Criteria Description Example Criteria
Property k with value v
{ "k": "v"}
{ "k": {"$value": "v"}}
Property k containing word w anywhere in the substructure of the property value
{"k":{"$word":"w"}}
Property k with a value greater than 5
{"k":{"$gt":5}}
Property k exists
{"k":{"$exists":{}}}
A property named c containing a property named k with value v, where k can be anywhere in the substructure of c's value
{"c": {"k":"v"} }
{"c": {
  "k": {"$value": "v"}
} }
A property named c containing a property named k with a value that includes word w, where k can be anywhere in the substructure of c's value
{"c":{
  "k":{"$word":"w"}
}}
A property named k with value v1 or value v2
{"$or":[
  {"k":"v1"},
  {"k":"v2"}
]}
{"$or":[
  {"k": {"$value": "v1"}},
  {"k": {"$value": "v2"}}
]}
{"k": {"$or": [
  {"$value": "v1"},
  {"$value": "v2"}
]}}
A property named k that includes word w1 or w2 in its value.
{"$or":[
  {"k":{"$word":"w1"}},
  {"k":{"$word":"w2"}}
]}
{"k": {"$or": [
  {"$word": "w1"},
  {"$word": "w2"}
]}}
A property named c that contains a property named k with value v1 or v2 anywhere within the substructure of c's value
{"c":{"$or":[
  {"k":"v1"},
  {"k":"v2"}
]}}
A property named c that contains a property named k with value v1 and a property named k with value v2 anywhere within the substructure of c's value.
{"c":{"$and":[
  {"k":"v1"},
  {"k":"v2"}
]}}
{"c":{"$and":[
  {"k": {"$value": "v1"}},
  {"k": {"$value": "v2"}}
]}}
A property named k1 with value v1 and a property named k2 with value v2. k1 and k2 can be in different objects.
{
  "k1":"v1",
  "k2":"v2"
}
{"$and":[
  {"k1":"v1"},
  {"k2":"v2"}
]}
{
  "k1": {"$value": "v1"},
  "k2": {"$value": "v2"}
}
{"$and":[
  {"k1": {"$value": "v1"}},
  {"k2": {"$value": "v2"}}
]}

Searching Entire Documents

This section describes how to construct a query that matches words or phrases anywhere in a document, rather than constraining the match to occurrences in a particular XML element, XML attribute, or JSON property.

A word query has document scope if it is not contained in an XML element or JSON property criteria. For example, a word query has document scope when it is an immediate child of the top level query element, or when it is a child at any depth of a hierarchy of composed queries (and, or, not, near). This also applies to the implicit and query that joins the immediate children of query.

For example, the following query matches all documents containing the phrase moonlight sonata:

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <q:word>moonlight sonata</q:word>
  </q:query>
</q:qbe>
JSON
{"$query": {
  "$word": "moonlight sonata"
} }

The following example matches all documents containing either the phrase moonlight sonata or the word sunlight.

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <q:or>
      <q:word>moonlight sonata</q:word>
      <q:word>sunlight</q:word>
    </q:or>
  </q:query>
</q:qbe>
JSON
{"$query": {
  "$or": [
    {"$word": "moonlight sonata"},
    {"$word": "sunlight"}
  ]
} }

An AND relationship between words and phrases can be either explicit or implicit. The following example queries match all documents that contain both the phrase moonlight sonata and the word sunlight:

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <q:word>moonlight sonata</q:word>
    <q:word>sunlight</q:word>
  </q:query>
</q:qbe> 
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <q:and>
      <q:word>moonlight sonata</q:word>
      <q:word>sunlight</q:word>
    </q:and>
  </q:query>
</q:qbe>
JSON
{"$query": [
  {"$word": "moonlight sonata"},
  {"$word": "sunlight"}
] }
{"$query": {
  "$and": [
    {"$word": "moonlight sonata"},
    {"$word": "sunlight"}
  ]
} }
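
As a minimal sketch, the document-scope queries in this section could be assembled programmatically as plain JavaScript objects. The helper names wordQuery, anyOf, allOf, and qbe are hypothetical and are not part of any MarkLogic API; the code only builds the JSON structures shown above.

```javascript
// Hypothetical helpers (not part of any MarkLogic API) that build
// document-scope word queries as plain JavaScript objects.

// A single word or phrase query: {"$word": text}
function wordQuery(text) {
  return { $word: text };
}

// Explicit OR of several word queries: {"$or": [...]}
function anyOf(...terms) {
  return { $or: terms.map((term) => wordQuery(term)) };
}

// Explicit AND of several word queries: {"$and": [...]}
function allOf(...terms) {
  return { $and: terms.map((term) => wordQuery(term)) };
}

// Wrap criteria in the required top-level $query property.
function qbe(criteria) {
  return { $query: criteria };
}

const eitherTerm = qbe(anyOf('moonlight sonata', 'sunlight'));
const bothTerms = qbe(allOf('moonlight sonata', 'sunlight'));

console.log(JSON.stringify(eitherTerm));
console.log(JSON.stringify(bothTerms));
```

Because the word queries are direct children of $query or of a composed query, none of them is scoped to a particular JSON property.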

QBE Structural Reference

This section describes the syntax and semantics of a QBE. The following topics are covered:

Top Level Structure

At the top level, a QBE must contain a query and can optionally contain a response, a format flag, and a validate flag. A QBE has the following top level parts:

  • query: Define matching document requirements in the query.
  • response: Customize your search results in the response; if there is no response, the default search response is returned.
  • format: Use the format flag to override the interpretation of bare names as JSON property names or XML element names in no namespace, based on the query format. For details, see Scoping a Search by Document Type.
  • validate: Use the validate flag to enable query validation before evaluating the search. The default is no validation, which can result in surprising search results if your QBE contains errors. However, validation has a performance cost, so it is best used only for debugging during development.

The following table outlines the top level of a QBE:

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    search parameters
  </q:query>
  <q:response>
    search result customizations
  </q:response>
  <q:format>xml-or-json</q:format>
  <q:validate>true-or-false</q:validate>
</q:qbe>
JSON
{
  "$query": {
    search parameters
  },
  "$response": {
    search result customizations
  },
  "$format": xml-or-json,
  "$validate": boolean
}

A query contains one or more XML elements or JSON properties defining element or property criteria or composed queries. Use criteria to model document structure. Use a composed query to logically join sub-queries using operators such as and, or, not, and near.

In XML, a QBE has a qbe wrapper element. Element and attribute names pre-defined by the QBE grammar, such as qbe, query, and word, are in the namespace http://marklogic.com/appservices/querybyexample. All other element and attribute names represent element and attribute names in your documents. For details, see Managing Namespaces.

In JSON, all property names pre-defined by the QBE grammar have a $ prefix, such as $query or $word. Any property name without a $ prefix represents a property in your documents. For details, see Property Naming Convention.

You will not usually need to set the format flag. You only need to set the format flag to use a JSON QBE to match XML documents, or vice versa. For details, see Scoping a Search by Document Type.
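
The top level structure described above can be sketched as a small builder for the JSON form. The function name buildQbe and its parameter names are hypothetical; the sketch only assembles a plain object and assumes the format flag takes the values "xml" or "json".

```javascript
// Hypothetical helper that assembles the top-level parts of a JSON QBE.
// Only $query is required; the other parts are added when supplied.
function buildQbe({ query, response, format, validate } = {}) {
  if (query === undefined) {
    throw new Error('A QBE must contain a query');
  }
  const qbe = { $query: query };
  if (response !== undefined) {
    qbe.$response = response;
  }
  if (format !== undefined) {
    // Assumption: the format flag accepts "xml" or "json".
    if (format !== 'xml' && format !== 'json') {
      throw new Error('format must be "xml" or "json"');
    }
    qbe.$format = format;
  }
  if (validate !== undefined) {
    qbe.$validate = Boolean(validate);
  }
  return qbe;
}

// A JSON QBE intended to match XML documents, validated during development.
const example = buildQbe({
  query: { author: 'Mark Twain' },
  format: 'xml',
  validate: true
});

console.log(JSON.stringify(example, null, 2));
```

Leaving out format and validate produces the common case: a QBE that matches documents in its own format, without validation.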

Query Components

The table below describes the components of the query portion of a QBE. Additional format-specific details are covered in XML-Specific Considerations and JSON-Specific Considerations.

Component Type XML Local Name JSON Property Name Description
query
query
$query
Defines the search criteria. Required.
criteria
your element name
your property name

Defines search criteria to apply within the scope of an XML element or JSON property in your documents. The name corresponds to an element or property in the content to be matched by the query.

If the criteria wraps a composed query or another criteria, then it represents a container query. Otherwise, it represents a value, word, or range query.

composed query
and
or
not
near
$and
$or
$not
$near

Defines a composed query that joins sub-queries using logical operators.

The near operator accepts an optional distance XML attribute or JSON property:

  • <q:near distance="5">...</q:near>
  • { "$near": [...], "$distance": 5 }
range query
lt, le
gt, ge
eq, ne
$lt, $le
$gt, $ge
$eq, $ne
Defines a relational expression on a value in an XML element, XML attribute, or JSON property.
modifier
value
word
exists
$value
$word
$exists
A modifier on a value that defines how to match that value: with a value query (the default with no modifier), with a word query, or with an existence test.
flag
filtered
score
$filtered
$score

Flags are modifiers of search behavior.

Use the boolean filtered flag to control whether the search is filtered or unfiltered (default). For more details, see How Indexing Affects Your Query.

Use the score flag to override the search result scoring function. Allowed values: logtf, logtfidf, random, simple, zero. Default: logtfidf. For details, see Relevance Scores: Understanding and Customizing.

options
Use options to fine tune your search criteria and results. For details, see Adding Options to a QBE.

The following table summarizes where each component type can be used. Options are covered in Adding Options to a QBE.

query
  • Contains: One or more criteria, composed queries, and the filtered or score flags
  • Contained by: qbe (XML); root object (JSON)

criteria
  • Contains: Nothing (empty); or one value; or one word, (explicit) value, or range query; or one or more criteria or composed queries
  • Contained by: query, composed query, criteria

composed query (and, or, etc.)
  • Contains: One or more criteria, composed queries, or word queries. Word queries are only permitted when the composed query is an immediate child of query.
  • Contained by: query, composed query, or criteria

range query (lt, gt, etc.)
  • Contains: a value
  • Contained by: criteria

word or value query
  • Contains: a value (string, number, date, time, dateTime)
  • Contained by (word): query, criteria, or composed query
  • Contained by (value): criteria; composed query contained by a criteria

exists
  • Contained by: criteria

flag
  • Contained by: query

Response Components

You can use the response portion of a QBE to customize the format of your search results. The following table describes the components of a response. A response is optional, and can only occur at the top level of a QBE, as a sibling of query.

A response can contain the following formatter components:

XML Local Name JSON Property Name Description
snippet $snippet A snippet element controls what is returned for search matches. You can specify elements to prefer if they have a match and/or set a policy (default, document, none) for what to show.
extract $extract An extract element supplements a snippet by listing XML elements or JSON properties to extract from matching documents, whether or not a match occurs in the listed elements or properties.

For details, see Customizing Search Results.

XML-Specific Considerations

This section covers structural and semantic details you should know when constructing a QBE in XML.

Managing Namespaces

Use the namespace http://marklogic.com/appservices/querybyexample for all pre-defined element names in the QBE grammar, such as qbe, query, and word. This namespace distinguishes the structural parts of the query from criteria elements that model your documents. You define this namespace at the top level of your QBE. For example:

<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
   ...
</q:qbe>

Define namespaces required by your element criteria on the criteria or any enclosing element container. You cannot bind the same namespace prefix to different namespaces within a QBE.

The following example demonstrates declaring user-defined namespaces on the root qbe element, on a containing element, and on an element criteria.

<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample"
       xmlns:ns1="http://marklogic.com/example1">
  <q:query xmlns:ns2="http://marklogic.com/example2">
    <ns1:author xmlns="http://marklogic.com/example">
      Mark Twain
    </ns1:author>
    <ns2:edition format="paperback"/>
    <title xmlns="http://marklogic.com/example3">Tom Sawyer</title>
  </q:query>
</q:qbe>

Querying Attributes

To query an element attribute, create an element criteria that contains the attribute. The following example represents a value query for the attribute edition/@format with a value of paperback.

<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <edition format="paperback"/>
  </q:query>
</q:qbe>

The value of the attribute can be an implicit value query, as in the example above, or an explicit value, word, or range query. To create a word, range, or explicit value query on an attribute, use the following template for the attribute value, where keyword is a modifier (word or value) or comparator (lt, gt, etc.).

$keyword value

For example, the following QBE represents a range query on the attribute edition/@price.

<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <edition price="$lt 9.00"/>
  </q:query>
</q:qbe>

You cannot use the exists modifier in an attribute value.
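
As a rough illustration of the attribute value template, the following sketch formats a "$keyword value" string and rejects the exists modifier. The helper name attributeQuery is hypothetical, and the sketch assumes the value needs no XML escaping.

```javascript
// Hypothetical helper that formats an XML attribute value using the
// "$keyword value" template. The keyword list is taken from the modifiers
// and comparators named in this chapter.
const ATTRIBUTE_KEYWORDS = ['word', 'value', 'lt', 'le', 'gt', 'ge', 'eq', 'ne'];

function attributeQuery(keyword, value) {
  if (keyword === 'exists') {
    // The exists modifier is not allowed in an attribute value.
    throw new Error('exists cannot be used in an attribute value');
  }
  if (!ATTRIBUTE_KEYWORDS.includes(keyword)) {
    throw new Error('Unsupported keyword: ' + keyword);
  }
  return '$' + keyword + ' ' + value;
}

// Assumption: the value contains no characters that require XML escaping.
const price = attributeQuery('lt', '9.00');
console.log('<edition price="' + price + '"/>');
```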

Multiple attributes on an element criteria are AND'd together. For example, the following QBE uses a range query on edition/@price and a word query on edition/@format to find all paperback editions with a price less than 9.00.

<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <edition price="$lt 9.00" format="$word paperback" />
    <q:filtered>true</q:filtered>
  </q:query>
</q:qbe>

You cannot use range, word, or value query options such as exact, min-occurs, or score-function on attribute criteria. If you need this level of control over an attribute query, use a structured query instead of QBE. For details, see Searching Using Structured Queries.

JSON-Specific Considerations

This section covers structural and semantic details you should know when constructing a QBE in JSON. The following topics are covered:

Property Naming Convention

In JSON, all pre-defined JSON property names in the QBE grammar have a $ prefix to distinguish them from names that occur in your documents. For example, the property name for the query part of a JSON QBE is $query.

If your documents include property names that start with $, the names in your content can conflict with the pre-defined property names. In such a case, you must use a structured query instead of QBE. For details, see Searching Using Structured Queries.

For a list of pre-defined property names, see Query Components and Response Components.

Matching Array Items

QBE does not distinguish between values contained in an array and values not contained in an array. For example, the following query:

{ "$query": {"k": ["v"]} }

Matches both of the following documents:

{ "k": "v" }

{ "k": ["v"] }

Also, the query is exactly equivalent to the following query that does not use array syntax:

{ "$query": {"k": "v"} }

Consequently, you cannot use QBE to match a property whose value is exactly and only a specified array value.

When you use array syntax and include multiple values, an AND relationship is implied between the values. For example, the following two queries are equivalent:

{"$query":
  {"k": ["v1", "v2"]}
}

{"$query": {
  "$and": [
    {"k": "v1"},
    {"k": "v2"}
  ]
}}

Both queries will match all of the following documents:

{ "k": ["v1", "v2"] }

{ "k": ["v1", "v2", "v3"] }

{ "c": [{"k": "v1"}, {"k": "v2"}] }

{ "c": {"k": "v1", "c2": {"k": "v2"}} }
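
The equivalence between array syntax and an explicit $and can be sketched as a rewrite function. The name expandArrayCriteria is hypothetical, and rejecting an empty array is a choice made for this sketch, since the text above does not describe that case.

```javascript
// Hypothetical helper: rewrite {"k": ["v1", "v2"]} as the equivalent
// explicit $and form. Non-array values are returned unchanged.
function expandArrayCriteria(name, value) {
  if (!Array.isArray(value)) {
    return { [name]: value };
  }
  if (value.length === 0) {
    // Assumption: empty arrays are not covered here, so reject them.
    throw new Error('empty array criteria are not handled by this sketch');
  }
  if (value.length === 1) {
    // {"k": ["v"]} is equivalent to {"k": "v"}
    return { [name]: value[0] };
  }
  return { $and: value.map((item) => ({ [name]: item })) };
}

console.log(JSON.stringify(expandArrayCriteria('k', ['v1', 'v2'])));
console.log(JSON.stringify(expandArrayCriteria('k', ['v'])));
```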

Searching Array and Object Containers

The type of query represented by a criteria property that names a JSON property in your content depends on the type of value in the property. If the value is an object or a composed query, then it represents a container query. Otherwise, it is a value, word, or range query. You should understand how container queries apply to searching JSON documents.

If the value of a criteria property is a literal value, or a word, value, or range query, the criteria expresses "match a JSON property named k whose value meets these conditions". Such a criteria is not a container query. The table below illustrates these forms.

Criteria Template Example Criteria Description
name : value
{ "price" : 8.99 }
Match a property named "price" whose value is 8.99
name : { 
  word-or-value : value}
{ "title" : {
  "$word" : "sawyer"
} }
Match a property named "title" whose value includes "sawyer"
name : { 
  relational-op : value}
{ "price" : {
  "$lt" : 9
} }
Match a property named "price" whose value is less than 9

A criteria property in which the value is an object or a composed query is a container query. Such a query says "match a property named c that contains a value meeting these conditions anywhere in its substructure". The table below illustrates these forms.

Criteria Template Example Criteria Description
name : object
{"edition": 
  {"price" : 8.99 } 
}
Match a JSON property named "price" whose value is 8.99 and that is contained somewhere within a property named "edition". The value can occur as an array item.
name : { 
  logical-op : [
    sub-query+
  ]
}
{"edition" : { 
  "$or": [
    {"format": "paperback"},
    {"format": "hardback"}
  ] 
} }
Match a JSON property named "format" that is contained somewhere within a property named "edition" and whose value is "paperback" or "hardback". The values can occur as array items.

Since a container query always matches its sub-queries anywhere within the container substructure, you cannot construct a JSON QBE that says "match a container with property name k whose value is exactly and only this object".

The table below provides example documents matched by a value query and several kinds of container query. The matched document examples are not exhaustive. Each query is annotated with a textual description of what the criteria asserts about matching documents. For more examples, see JSON Search Criteria Quick Reference.

QBE Matches
{"$query":
  { "k": "v" }
}
Property k has value "v"
{ "k": "v" }
{ "k": ["v"] }
{"$query":
  {"k": ["v1", "v2"]}
}
Property k has value "v1" and value "v2".
{ "k": ["v1", "v2"] }
{ "c": [{"k": "v1"}, {"k": "v2"}] }
{ "c": {"k": "v1", "c2": {"k": "v2"}} }
{"$query":
  {"c": {"k": "v"}}
}
Property c contains a property k that has value "v", where k can occur anywhere in c's substructure.
{ "c" : {"k" : "v" } }
{ "c" :
   {"c2": {"k": "v" }}
}

Constructing a QBE with the Node.js QueryBuilder

This topic describes how to use the information in this chapter in conjunction with the Node.js Client API.

The Node.js Client API enables you to construct a QBE using the QueryBuilder.byExample function. The parameters of byExample correspond to the criteria within the $query portion of a raw QBE, expressed as a JavaScript object. For example, the table below shows a QBE example from elsewhere in this chapter and the equivalent QueryBuilder.byExample call.

Raw QBE QueryBuilder.byExample
{"$query": {
  "author": {"$word": "twain"},
  "$filtered": true
}}
qb.byExample({
  author: {$word: 'twain'},
  $filtered: true
})

You can also supply the entire $query portion of a QBE to byExample as a JavaScript object. For example:

qb.byExample(
  { $query: {
      author: {$word: 'twain'},
      $filtered: true
    }
  }
)

However, you cannot specify the $response portion of a raw QBE through QueryBuilder.byExample. Response customization is still available through QueryBuilder.extract and QueryBuilder.snippet.

For details, see Querying Documents and Metadata in the Node.js Application Developer's Guide.

How Indexing Affects Your Query

You do not have to define any indexes to use QBE. This allows you to get started with QBE quickly. However, indexes can significantly improve the performance of your search.

Unless your database is small or your query produces only a small set of pre-filtering results, you should define an index over any XML element, XML attribute, or JSON property used in a range query. To configure an index, see Range Indexes and Lexicons in Administrator's Guide.

If your QBE includes a range query, you must either have an index configured on the XML element, XML attribute, or JSON property used in the range query, or you must use the filtered flag to force a filtered search.

A filtered search uses available indexes, if any, but then checks whether or not each candidate meets the query requirements. This makes a filtered search accurate, but much slower than an unfiltered search. An unfiltered search relies solely on indexes to identify matches, which is much faster, but can result in false positives. For details, see Fast Pagination and Unfiltered Searches in Query Performance and Tuning Guide.

In the absence of a backing index, a range query cannot be used with unfiltered search. To enable filtered search, set the filtered flag to true in the query portion of your QBE, as shown in the following example:

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <author><q:word>twain</q:word></author>
    <q:filtered>true</q:filtered>
  </q:query>
</q:qbe>
JSON
{
  "$query": {
    "author": {"$word": "twain"},
    "$filtered": true
  }
}
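
One way to apply this rule during prototyping is to detect range operators in a JSON QBE and set the filtered flag when any are present. The sketch below is an assumption-laden illustration: the helper names are hypothetical, it assumes $query is an object rather than an array, and it cannot know whether a backing range index exists, so it always forces a filtered search when it sees a range query.

```javascript
// Range comparators defined by the QBE grammar (JSON property names).
const RANGE_OPERATORS = new Set(['$lt', '$le', '$gt', '$ge', '$eq', '$ne']);

// Recursively check whether any property name in the query is a range operator.
function containsRangeQuery(node) {
  if (Array.isArray(node)) {
    return node.some((item) => containsRangeQuery(item));
  }
  if (node !== null && typeof node === 'object') {
    return Object.entries(node).some(
      ([name, value]) => RANGE_OPERATORS.has(name) || containsRangeQuery(value)
    );
  }
  return false;
}

// Hypothetical helper: return a copy of the QBE with $filtered set to true
// when the query contains a range query. Assumes qbe.$query is an object.
function withFilteredIfRange(qbe) {
  if (!containsRangeQuery(qbe.$query)) {
    return qbe;
  }
  return { ...qbe, $query: { ...qbe.$query, $filtered: true } };
}

const ranged = withFilteredIfRange({ $query: { price: { $lt: 9 } } });
console.log(JSON.stringify(ranged));
```

Once a range index is configured for the queried property, the flag can be dropped so that the search runs unfiltered.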

Adding Options to a QBE

Options give you fine grained control over a QBE. Most options are associated with a value, word, or range query.

Specifying Options in XML

In an XML QBE, an option is an attribute of the predefined QBE element it modifies, such as <q:lt/>, <q:word/>, or <q:value/>. The following query demonstrates use of the exact option on a value query.

<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <author><q:value exact="false">mark twain</q:value></author>
  </q:query>
</q:qbe>

You cannot apply options to queries on attributes because the range, word, or value query is embedded in the attribute value. For example, you cannot add a case-sensitive option to the following attribute word query:

<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <edition format="$word paperback"/>
  </q:query>
</q:qbe>

If you need such control over an element attribute query, you should use a structured or combined query.

Specifying Options in JSON

In a JSON QBE, an option is a sibling of the QBE object it modifies, such as a value, word, or range query. Option names always have a $ prefix.

The following example query uses the exact option to modify a value query by including it as a JSON property at the same level as the $value object:

{
  "$query": {
    "author": { 
      "$exact": false, 
      "$value": "mark twain"
    }
  }
}
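
The sibling placement of options can be sketched with a small helper that adds the $ prefix to each option name. The function name criteriaWithOptions is hypothetical; it only builds a plain object and does not check that an option is valid for the query type.

```javascript
// Hypothetical helper: build a word or value criteria with options added as
// sibling properties. Option names are given without the $ prefix.
function criteriaWithOptions(name, modifier, text, options = {}) {
  if (modifier !== 'word' && modifier !== 'value') {
    throw new Error('modifier must be "word" or "value"');
  }
  const body = { ['$' + modifier]: text };
  for (const [option, setting] of Object.entries(options)) {
    body['$' + option] = setting;
  }
  return { [name]: body };
}

const optionQuery = {
  $query: criteriaWithOptions('author', 'value', 'mark twain', { exact: false })
};

console.log(JSON.stringify(optionQuery));
// {"$query":{"author":{"$value":"mark twain","$exact":false}}}
```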

Option List

The following table describes the options available for use in a QBE. The MarkLogic Server Search API supports additional options through other query formats, such as string or structured query, and through the use of persistent query options. For details, see Search Customization Using Query Options.

Option Attribute or Property Name Description
case-sensitive
Whether or not to perform a case-sensitive match. Default: false if the text to match is all lower case, true otherwise. Value type: boolean. Usable with: word or value query. For details, see cts:word-query or cts:value-query.
diacritic-sensitive
Whether or not to perform a diacritic-sensitive match. Default: Depends on context: false if the text to match contains no diacritics, true otherwise. Value type: boolean. Usable with: word or value query. For details, see cts:word-query or cts:value-query.
punctuation-sensitive
Whether or not to perform a punctuation-sensitive match. Default: depends on context: false if the text to match contains no punctuation, true otherwise. Value type: boolean. Usable with: word or value query. For details, see cts:word-query or cts:value-query.
whitespace-sensitive
Whether or not to perform a whitespace-sensitive match. Default: false. Value type: boolean. Usable with: word or value query. For details, see cts:word-query or cts:value-query.
stemmed
Whether or not to use stemming. Default: Depends on context and database configuration; for details, see cts:word-query. Value type: boolean. Usable with: word or value query. For details, see cts:word-query or cts:value-query.
exact
Whether to perform an exact match or use the builtin context-sensitive default behaviors for the *-sensitive options. When true, exact is shorthand for case-sensitive, diacritic-sensitive, punctuation-sensitive, whitespace-sensitive, unstemmed, and unwildcarded. Default: true for value and range query, false for word query. Value type: boolean. Usable with: word or value query.
score-function
Use the selected scoring function. Allowed values: linear, reciprocal. Usable with: range query. For details, see Including a Range or Geospatial Query in Scoring.
slope-factor
Apply the given number as a scaling factor to the slope of the scoring function. Default: 1.0. Value type: double. Usable with: range query. For details, see Including a Range or Geospatial Query in Scoring.
min-occurs
The minimum number of occurrences required. If there are fewer occurrences, the fragment does not match. Default: 1. Value type: integer. Usable with: range, word, or value query. For details, see cts:word-query.
max-occurs
The maximum number of occurrences allowed. If there are more occurrences, the fragment does not match. Default: Unbounded. Value type: integer. Usable with: range, word, or value query. For details, see cts:word-query.
lang
The language under which to interpret the content. The option value is case-insensitive. Allowed values: An ISO 639 language code. Default: The default language configured for the database. Usable with: query; range, word, or value query. In XML it can also appear on the qbe element. In JSON, it can appear as a top level property.
weight
A weight for this query. Higher weights move search results up in the relevance order. Allowed values: between -16 and 64, inclusive. Default: 1.0. Usable with: a word or value query, or a range query that is backed by a range index. For details, see cts:word-query, cts:value-query, or cts:element-range-query.
constraint
The name of a range, values, or word constraint specified for the same XML element or JSON property in persisted query options associated with the search. Usable with: range, word, or value query. For details, see Using Persistent Query Options.
@xsi:type (XML)
$datatype (JSON)
The xsi:type to which to cast the value supplied in a range query. Default: Values are treated as xs:boolean, xs:double, xs:date, xs:dateTime, or xs:time if castable as such, and as string otherwise. Usable with: range query.

Using Persistent Query Options

The REST and Java APIs enable you to install persistent query options on your REST instance and apply them to subsequent searches. You can also use persistent query options with the Node.js Client API, but the API has no facility for creating and maintaining the persistent options.

Using persistent query options with a QBE allows you to use options not supported directly by the QBE grammar. Using persistent options with a QBE also allows you to define global options to apply throughout your query, such as making all word queries case-sensitive instead of specifying the case-sensitive option on each word query in your QBE.

Query options applied through the constraint option override options specified inline on a QBE.

You can apply persistent query options to a QBE using the constraint option. To use this option:

  1. Install named, persistent query options following the directions appropriate for the API you are using.
  2. Specify the name of a constraint defined in the persistent options from Step 1 as the value of a constraint option on a word, value, or range query in your QBE. See the example, below.
  3. When you execute a search with your QBE, associate the persistent query options from Step 1 with your search in the manner prescribed by the client API (REST, Java, or Node.js).

For details on defining, installing and using persistent query options, see Configuring Query Options in REST Application Developer's Guide or Query Options in Java Application Developer's Guide.

The pre-defined constraint named by the constraint option should match the type of query to which it is applied. That is, name a range constraint for a range query, a value constraint for a value query, and a word constraint for a word query.

The following example pre-defines a word constraint called w-t that gives weight 2.0 to matches in a title XML element or JSON property, and then applies it to a QBE that contains a word query on title. This gives word queries on title a default weight that can be overridden by omitting the constraint option.

If the following persistent query options are installed and specified as a parameter to the search performed with the QBE:

XML Options JSON Options
<search:options
    xmlns:search="http://marklogic.com/appservices/search">
  <search:constraint name="w-t">
    <search:word>
      <search:element name="title" ns=""/>
      <search:weight>2.0</search:weight>
    </search:word>
  </search:constraint>
</search:options>
{"options": {
  "constraint": [ {
    "name": "w-t",
    "word": {
      "json-property": "title",
      "weight": 2
    }
  } ]
} }

Then the following QBE applies the w-t constraint to a word query on title to give weight 2.0 to matches in a title element or property.

XML JSON
<q:qbe
    xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <title>
      <q:word constraint="w-t">sawyer</q:word>
    </title>
  </q:query>
</q:qbe>
{ "$query": {
    "title": {
      "$word": "sawyer",
      "$constraint": "w-t"
    }
} }

Customizing Search Results

You can include a response XML element or JSON property to customize the contents of returned search results. You can modify or supplement the default search results using the snippet and extract formatters in the response section of a QBE.

This section covers the following topics:

When to Include a Response in Your Query

Add an optional response section to a QBE to do one or more of the following:

  • Return matching documents instead of snippets. (snippet)
  • Return only information about the document and the match, such as database URI, document format, and relevance score. (snippet)
  • Specify XML elements or JSON properties to prefer when constructing snippets. (snippet)
  • Specify XML elements or JSON properties to extract from matching documents, whether or not the match occurs within those elements or properties. (extract)

Advanced customization is available using result decorators, transforms, and persistent query options. For details, see Customizing Search Results in REST Application Developer's Guide or Transforming Search Results in Java Application Developer's Guide.

Using the snippet Formatter

Use snippet to control what, if anything, is included in the snippet portion of a search match and to identify preferred XML elements or JSON properties to include in a snippet. The default snippet is a small text excerpt with the matching text tagged for highlighting. The following table contains an excerpt of the snippet section of a search response generated with the default policy.

Format Default Snippet Example
XML
<search:response ...>
  <search:result ...>
    <search:snippet>
      <search:match
          path="fn:doc(&quot;/books/sawyer.xml&quot;)/book">
        <search:highlight>Mark Twain</search:highlight>
      </search:match>
    </search:snippet>
  </search:result>
  ...
</search:response>
JSON
{
  ...
  "results": [ {
      ...
      "matches": [ {
        "path": "fn:doc(\"/books/sawyer.json\")/*:json/*:book/*:author",
        "match-text": [ { "highlight": "Mark Twain" } ]
      } ]
  } ],
  ...
}

The snippet formatter has the following form:

XML JSON
<q:response>
  <q:snippet>
    <q:policy/>
    preferred-element
  </q:snippet>
</q:response>
{ "$response": {
    "$snippet": { 
      policy: {},
      preferred-property: {}
    }
} }

The policy, preferred-element, and preferred-property are optional.

The snippeting policy controls whether or not snippets are included in the output and whether to include a small text excerpt (default) or the entire document when snippets are enabled. Use one of the following element or property names for policy.

XML JSON Description
default
$default
Include a small excerpt of the text around the matching terms, with the matched text tagged for highlighting.
document
$document
Return the entire document.
none
$none
Do not include any snippets.

The following example disables snippet generation by setting the snippet policy to none. In JSON, specify an empty object value for the policy property.

XML JSON
<q:response>
  <q:snippet>
    <q:none/>
  </q:snippet>
</q:response>
{ "$response": {
    "$snippet": { 
      "$none": {}
    }
} }

You can also specify one or more XML element or JSON property names to be preferred when generating snippets. For example, if you specify a preference for the title element or property, and both title and author contain a match, the snippet is generated from the match in title. In JSON, specify the preferred property with an empty object value.

XML JSON
<q:response>
  <q:snippet>
    <title/>
  </q:snippet>
</q:response>
{ "$response": {
    "$snippet": { 
      "title": {}
    }
} }

Using the extract Formatter

Use the extract formatter to specify additional XML elements or JSON properties to include in the search output. If snippets are included, the extracted components supplement any snippet in a match, rather than replacing it.

XML JSON
<q:response>
  <q:extract>
    <your-element/>
  </q:extract>
</q:response>
{ "$response": {
    "$extract": { 
      "your-property-name": {}
    }
} }

For example, the following response says to extract the title and author from a matching document. The title and author need not contain the matching terms or values.

XML JSON
<q:response>
  <q:extract>
    <title/>
    <author/>
  </q:extract>
</q:response>
{ "$response": {
    "$extract": { 
      "title": {},
      "author": {}
    }
} }

Extracted elements or properties go into the metadata section of the enclosing match. For an example, see Example: Search Customization.

Example: Search Customization

The following QBE modifies the search results to exclude snippets and to extract the title XML element or JSON property into the search result metadata section.

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <author>Mark Twain</author>
  </q:query>
  <q:response>
    <q:extract><title/></q:extract>
    <q:snippet><q:none/></q:snippet>
  </q:response>
</q:qbe>
JSON
{
  "$query": {
    "author": "Mark Twain"
  },
  "$response": {
    "$snippet": { "$none": {} },
    "$extract": { "title": {} }
  }
}
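
The response portion of the JSON example above could also be generated with a helper along the following lines. This is a sketch only: buildResponse and its parameter names (policy, prefer, extract) are hypothetical, and the code simply produces the $snippet and $extract structures described in this section.

```javascript
// Hypothetical helper that builds the $response part of a JSON QBE.
//   policy:  'default', 'document', or 'none' (optional)
//   prefer:  property names to prefer when generating snippets (optional)
//   extract: property names to extract into result metadata (optional)
function buildResponse({ policy, prefer = [], extract = [] } = {}) {
  const response = {};
  const snippet = {};
  if (policy !== undefined) {
    if (!['default', 'document', 'none'].includes(policy)) {
      throw new Error('Unknown snippet policy: ' + policy);
    }
    // A policy is expressed as a property with an empty object value.
    snippet['$' + policy] = {};
  }
  for (const name of prefer) {
    snippet[name] = {};
  }
  if (Object.keys(snippet).length > 0) {
    response.$snippet = snippet;
  }
  if (extract.length > 0) {
    response.$extract = {};
    for (const name of extract) {
      response.$extract[name] = {};
    }
  }
  return response;
}

const customized = {
  $query: { author: 'Mark Twain' },
  $response: buildResponse({ policy: 'none', extract: ['title'] })
};

console.log(JSON.stringify(customized, null, 2));
```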

The following table shows the default output and the modified output produced by the above query.

Format Default Output Customized Output
XML
<search:response
    snippet-format="snippet"
    total="1" start="1"
    page-length="10"
    ...>
  <search:result index="1"
      uri="/books/sawyer.xml"
      ...>
    <search:snippet>
      <search:match ...>
        <search:highlight>
          Mark Twain
        </search:highlight>
      </search:match>
    </search:snippet>
  </search:result>
  ...
</search:response>
<search:response
    snippet-format="empty-snippet"
    total="1" start="1"
    page-length="10"
    ...>
  <search:result index="1"
      uri="/books/sawyer.xml"
      ...>
    <search:snippet/>
    <search:metadata>
      <title>Tom Sawyer</title>
    </search:metadata>
  </search:result>
  ...
</search:response>
JSON
{
  "snippet-format": "snippet",
  "total": 1,
  "start": 1,
  "page-length": 10,
  "results": [{
    "index": 1,
    "uri": "/books/sawyer.json",
    "matches": [{
      "path": ...,
      "match-text": [{
        "highlight": "Mark Twain"
      }]
    }]
  }],
  ...
}
{
  "snippet-format":
    "empty-snippet",
  "total": 1,
  "start": 1,
  "page-length": 10,
  "results": [{
    "index": 1,
    "uri": "/books/sawyer.json",
    ...,
    "matches": [],
    "metadata": [{
      "title": "Tom Sawyer"
    }]
  }],
  ...
}
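
As an illustration of consuming the customized output, the following Python sketch collects extracted values from the metadata section of each result. It assumes the JSON response has been parsed into a dictionary shaped like the customized output above, with the elided parts omitted. The function name extracted_values is hypothetical and not part of any MarkLogic API.

```python
import json


def extracted_values(response, name):
    """Return (uri, value) pairs for `name` found in each result's metadata."""
    values = []
    for result in response.get("results", []):
        # Extracted properties appear as objects in the result's "metadata" array.
        for entry in result.get("metadata", []):
            if name in entry:
                values.append((result["uri"], entry[name]))
    return values


# Sample data copied from the customized JSON output, minus the elided parts.
customized = json.loads("""
{
  "snippet-format": "empty-snippet",
  "total": 1,
  "start": 1,
  "page-length": 10,
  "results": [{
    "index": 1,
    "uri": "/books/sawyer.json",
    "matches": [],
    "metadata": [{ "title": "Tom Sawyer" }]
  }]
}
""")

print(extracted_values(customized, "title"))
# prints [('/books/sawyer.json', 'Tom Sawyer')]
```

A result without a metadata section, such as one from the default output, contributes nothing to the returned list.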

Scoping a Search by Document Type

This section describes how the treatment of bare names in a QBE affects the type of documents matched by the query.

A bare name in a JSON QBE is a JSON property name that does not include a $ prefix. A bare name in an XML QBE is an element name in no namespace.

By default, the interpretation of bare names matches your query format. That is, bare names in a JSON QBE represent JSON property names in content, and bare names in an XML QBE represent element names in content that are in no namespace. The net effect is that an XML QBE only matches XML documents, and a JSON QBE only matches JSON documents by default.

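Use the format option to override the default behavior. In the following example, bare names in the XML QBE are interpreted as JSON property names, and bare names in the JSON QBE are interpreted as XML element names in no namespace: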

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:format>json</q:format>
  <q:query>...</q:query>
</q:qbe>
JSON
{
  "$format": "xml",
  "$query": {...}
}
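
When queries are assembled in client code, the format option can be set programmatically. The following Python sketch builds a JSON QBE and adds $format only when an override is requested. The helper name scoped_qbe and its parameters are hypothetical. The sketch assumes json and xml are the only accepted values, as in the examples above.

```python
import json


def scoped_qbe(criteria, doc_format=None):
    """Build a JSON QBE; set $format only when an override is requested."""
    qbe = {"$query": criteria}
    if doc_format is not None:
        # Assumption: "json" and "xml" are the only values, as in the examples above.
        if doc_format not in ("json", "xml"):
            raise ValueError("doc_format must be 'json' or 'xml'")
        qbe["$format"] = doc_format
    return qbe


# A JSON QBE whose bare names are treated as XML element names.
print(json.dumps(scoped_qbe({"author": "Mark Twain"}, "xml"), indent=2))
```

Omitting the second argument leaves $format out, so the query keeps the default behavior of matching documents in its own format.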

Converting a QBE to a Combined Query

The primary use case for QBE is rapid prototyping of queries during development. For best performance and access to the full set of Search API capabilities, you should eventually convert your QBE to a combined query. A combined query is a lower-level representation that combines a structured query and query options.

The REST and Java APIs include an interface for generating a combined query from a QBE. For details, see the REST Application Developer's Guide and the Java Application Developer's Guide.
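
As a rough illustration of the REST route, the following Python sketch prepares, but does not send, a conversion request. It assumes the REST Client API exposes the conversion as a POST to /v1/qbe with the request parameter view=structured. Verify that assumption against the REST Application Developer's Guide before relying on it. The host, port, and helper name are hypothetical, and authentication is omitted.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical REST instance address; replace with your own host and port.
BASE_URL = "http://localhost:8000"


def conversion_request(qbe, base_url=BASE_URL):
    """Prepare (without sending) a request asking for a combined query."""
    # Assumption: view=structured selects QBE-to-combined-query conversion.
    params = urllib.parse.urlencode({"view": "structured"})
    return urllib.request.Request(
        url=base_url + "/v1/qbe?" + params,
        data=json.dumps(qbe).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = conversion_request({"$query": {"author": "Mark Twain"}})
print(req.get_method(), req.full_url)
# prints POST http://localhost:8000/v1/qbe?view=structured
```

To send the request, pass it to an opener configured with the credentials your REST instance requires.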

Validating a QBE

You can set the validate flag to true to perform query validation before evaluating a QBE. When validation is enabled, if you submit a QBE that contains errors, MarkLogic reports the errors and does not perform the search. If your query does not contain errors, the search proceeds as usual.

Performing query validation on every search can be expensive, so you should not enable validation in production. It is best used for debugging during development.

The following example is a QBE with validation enabled:

Format Example
XML
<q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample">
  <q:query>
    <author>Mark Twain</author>
  </q:query>
  <q:validate>true</q:validate>
</q:qbe>
JSON
{
  "$query": {
    "author": "Mark Twain"
  },
  "$validate": true
}
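
One way to follow the guidance above is to set the validate flag from a single switch, so it stays on during development and off in production. The following Python sketch shows this for a JSON QBE. The helper name with_validation and the enabled switch are hypothetical.

```python
import copy


def with_validation(qbe, enabled):
    """Return a copy of a JSON QBE with $validate set only when enabled."""
    result = copy.deepcopy(qbe)
    if enabled:
        result["$validate"] = True
    else:
        # Drop the flag so the search skips validation.
        result.pop("$validate", None)
    return result


base = {"$query": {"author": "Mark Twain"}}
print(with_validation(base, enabled=True))
# prints {'$query': {'author': 'Mark Twain'}, '$validate': True}
```

The helper copies the query instead of modifying it, so the same base QBE can be reused with either setting.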
