MarkLogic 10 Product Documentation
Search Developer's Guide
— Chapter 6

Composing cts:query Expressions

Searches in MarkLogic Server use expressions that have a cts:query type. This chapter describes how to create various types of cts:query expressions and how you can register some complex expressions to improve performance of future queries that use the registered cts:query expressions.

MarkLogic Server includes many built-in XQuery functions for composing cts:query expressions. The signatures and descriptions of these APIs are documented in the MarkLogic XQuery and XSLT Function Reference.

Understanding cts:query

The second parameter of cts:search is a cts:query. The contents of the cts:query expression determine the conditions under which a search returns a document or node. This section describes cts:query.

cts:query Hierarchy

The cts:query type forms a hierarchy, allowing you to construct complex cts:query expressions by combining multiple expressions together. The hierarchy includes composable and non-composable cts:query constructors.

A composable constructor combines other cts:query constructors. A leaf-level constructor cannot contain other cts:query constructors, although it can itself be combined with others by a composable constructor.

The following diagram shows the leaf-level cts:query constructors, which are not composable, and the composable cts:query constructors, which you can use to combine both leaf-level and other composable cts:query constructors. The diagram shows most of the available constructors, but not necessarily all of them.

Equivalent constructors exist for Server-Side JavaScript. For example, the JavaScript built-in cts.andQuery is equivalent to the XQuery built-in cts:and-query in the diagram above.

The remainder of this chapter goes into more detail on combining constructors.

Use to Narrow the Search

The core search cts:query API is cts:word-query. The cts:word-query function returns true for words or phrases that match its $text parameter, thus narrowing the search to fragments containing terms that match the query. If needed, you can use other cts:query APIs to combine a cts:word-query expression into a more complex expression. Similarly, you can use the other leaf-level cts:query constructors to narrow the results of a search.

Understanding cts:element-query

The cts:element-query function searches through a specified element and all of its children. It is used to narrow the field of search to the specified element hierarchy, exploiting the XML structure in the data. Also, it is composable with other cts:element-query functions, allowing you to specify complex hierarchical conditions in the cts:query expressions.

For example, the following search against a Shakespeare database returns the title of any play that has SCENE elements that have SPEECH elements containing both the words room and castle:

for $x in cts:search(fn:doc(), 
   cts:element-query(xs:QName("SCENE"), 
       cts:element-query(xs:QName("SPEECH"), 
           cts:and-query(("room", "castle")) ) ) ) 
return
($x//TITLE)[1]

This query returns the first TITLE element of the play. The TITLE element is used for both play and scene titles, and the first one in a play is the title of the play.

When you use cts:element-query with both the word positions and element word positions indexes enabled in the Admin Interface, many queries that combine multiple terms (for example, "the long sly fox") run faster, because the positions eliminate some false positive results.

Understanding cts:element-word-query

While cts:element-query searches through an element and all of its children, cts:element-word-query searches only the immediate text node children of the specified element. For example, consider the following XML structure:

<root>
  <a>hello
    <b>goodbye</b>
  </a>
</root>

The following query returns false, because "goodbye" is not an immediate text node child of the element named a:

cts:element-word-query(xs:QName("a"), "goodbye")

Understanding Field Word and Value Query Constructors

The cts:field-word-query and cts:field-value-query constructors search in fields for either words or values. A field value is defined as all of the text within a field, with a single space between text that comes from different elements. For example, consider the following XML structure:

<name>
  <first>Raymond</first>
  <middle>Clevie</middle>
  <last>Carver</last>
</name>

If you want to normalize names in the form firstname lastname, then you can create a field on this structure. The field might include the element name and exclude the element middle. The value of this instance of the field would then be Raymond Carver, with a space between the text from the two different element values from first and last. If your document contained other name elements with the same structure, their values would be derived similarly. If the field is named my-field, then a cts:field-value-query("my-field", "Raymond Carver") returns true for documents containing this XML. Similarly, a cts:field-word-query("my-field", "Raymond Carver") returns true.
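
The rule for assembling a field value can be modeled in a few lines of plain JavaScript. This is only an illustration of the rule described above, not how MarkLogic computes field values internally; the fieldValue helper and the [elementName, text] input shape are assumptions made for the sketch.

```javascript
// Illustrative model only: join the text of the included elements,
// in document order, with a single space between them.
// `fieldValue` and the [elementName, text] pairs are hypothetical.
function fieldValue(children, excluded) {
  return children
    .filter(([name]) => !excluded.includes(name))
    .map(([, text]) => text.trim())
    .join(' ');
}

// Child elements of <name> from the example above.
const nameChildren = [
  ['first', 'Raymond'],
  ['middle', 'Clevie'],
  ['last', 'Carver'],
];

// The field includes <name> and excludes <middle>.
const value = fieldValue(nameChildren, ['middle']);
console.log(value); // Raymond Carver
```

The string produced here is the value that the cts:field-value-query example above matches against.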

For more information about fields, see Fields Database Settings in the Administrator's Guide. For information on lexicons on fields, see Field Value Lexicons.

Understanding the Range Query Constructors

The cts:element-range-query, cts:element-attribute-range-query, cts:path-range-query, and cts:field-range-query constructors allow you to specify constraints on a value in a cts:query expression. The range query constructors require a range index of the corresponding type on the specified element, attribute, path, or field. For details on range queries, see Using Range Queries in cts:query Expressions.

Understanding the Reverse Query Constructor

The cts:reverse-query constructor allows you to match queries stored in a database to nodes that would match those queries. Reverse queries are used as the basis for alert applications. For details, see Creating Alerting Applications.

Understanding the Geospatial Query Constructors

The geospatial query constructors are used to constrain cts:query expressions on geospatial data. Geospatial searches are used with documents that have been marked up with latitude and longitude data, and can answer queries such as "show me all of the documents that mention places within 100 miles of New York City". For details on geospatial searches, see Geospatial Search Applications.

Specifying the Language in a cts:query

All leaf-level cts:query constructors are language-aware; you can either explicitly specify a language value as an option, or it will default to the database default language. The language option specifies the language in which the query is tokenized and, for stemmed searches, the language of the content to be searched.

To specify the language option in a cts:query, use the lang=language_code option, where language_code is the two or three character ISO 639-1 or ISO 639-2 language code (http://www.loc.gov/standards/iso639-2/php/code_list.php). For example, the following query:

let $x := 
<root>
 <el xml:lang="en">hello</el>
 <el xml:lang="fr">hello</el>
</root>
return
$x//el[cts:contains(., 
         cts:word-query("hello", ("stemmed", "lang=fr")))]

returns only the French-language node:

<el xml:lang="fr">hello</el>

Depending on the language of the cts:query and on the language of the content, a string will tokenize differently, which will affect the search results. For details on how languages and the xml:lang attribute affect tokenization and searches, see Language Support in MarkLogic Server.

Creating a Query From Search Text With cts:parse

This section describes how to create a cts:query from a simple search string using the cts:parse XQuery function or the cts.parse Server-Side JavaScript function.

String Query Overview

A string query is a plain text search string (query text) composed of terms, phrases, and operators that can be easily composed by end users typing into an application search box. For example, cat AND dog is a string query for finding documents that contain both the term cat and the term dog.

You can use the cts:parse XQuery built-in function or the cts.parse Server-Side JavaScript built-in function to convert such a string query into a cts:query (XQuery) or cts.query (JavaScript). Use the resulting query in any interface that accepts a cts query, such as the cts:search XQuery function, the cts.search JavaScript function, and several JSearch API interfaces.

The following example uses cts:parse to match documents that contain the term cat and the term dog.

Language Example
XQuery
cts:search(fn:doc(), cts:parse("cat AND dog"))
JavaScript
// with cts.search
cts.search(cts.parse('cat AND dog'))

// with JSearch
import * as jsearch from '/MarkLogic/jsearch.mjs';
jsearch.documents()
  .where(cts.parse('cat AND dog'))
  .result()

The string query grammar supported by cts:parse and cts.parse enables users to compose complex queries. Adjacent terms, phrases and sub-expressions are implicitly AND'd together.

The following are some examples of queries that work with cts:parse and cts.parse out of the box:

  • (cat OR dog) NEAR vet

    at least one of the terms cat or dog within 10 terms (the default distance for cts:near-query) of the word vet

  • dog NEAR/30 vet

    the word dog within 30 terms of the word vet

  • cat -dog

    the word cat where there is no word dog

You can also bind a tag name to an index reference, lexicon reference, or field name. When such a tag name appears in a query string, it parses to a word, value, or range query that is scoped to the bound entity.

For example, binding the tag color to a cts:reference to a JSON property named bodyColor enables users to create query text like the following:

  • color:red

    Match documents where the value of bodyColor contains the word red

  • color NE blue

    Match documents where the value of bodyColor is not blue

Without the binding, the above examples are just word queries that include the term color. For example, without a binding, color NE blue becomes a query for documents containing the words color, NE, and blue.

You can also bind a tag name to a reference to a function that generates a query, giving you more control over the interpretation. For example, you can use a query generator function to scope a query to documents in a particular collection or directory.

For details, see Binding a Tag to a Reference, Field, or Query Generator.

Grammar Components and Operators

This section describes the components and operators you can use in query text passed to cts:parse or cts.parse. Some operators are only available in search terms that involve tags bound to query generators using the parse binding feature.

Basic Components and Operators

The table below describes the basic components and operators recognized by the cts:parse XQuery function and the cts.parse JavaScript function. If you define bindings, then additional operators become available for query expressions using a bound tag; for details, see Operators Usable With Bound Tags.

An empty query string (cts:parse("")) generates an empty cts:and-query that matches everything.

Query Example Description
any adjacent terms
dog
dog tail
"dog tail" cat mouse
dog (cat OR mouse)
Match one or more terms or query expressions, as with a cts:and-query. Adjacent terms and query expressions are implicitly joined with AND. For example, dog tail is the same as dog AND tail.
"phrase"
"dog tail"
"dog tail" "cat whisker"
dog "cat whisker"
Terms in double quotes are treated as a phrase. Adjacent terms and phrases are implicitly joined with AND. For example, dog "cat whisker" matches documents containing both the term dog and the phrase cat whisker. NOTE: You cannot use single quotes in place of double quotes.
( )
(cat OR dog) zebra
Parentheses indicate grouping. The example matches documents containing at least one of the terms cat or dog as well as the term zebra.
-query
-dog
-(dog OR cat)
cat -dog
A NOT operation, as with a cts:not-query. For example, cat -dog matches documents that contain the term cat but that do not contain the term dog.
query1 AND query2
dog AND cat
(cat OR dog) AND zebra
Match two query expressions, as with a cts:and-query. For example, dog AND cat matches documents containing both the term dog and the term cat. AND is the default way to combine terms and phrases, so the previous example is equivalent to dog cat.
query1 OR query2
dog OR cat
Match either of two queries, as with a cts:or-query. The example matches documents containing at least one of the terms cat or dog.
query1 NOT_IN query2
dog NOT_IN "dog house"
Match one query when the match does not overlap with another, as with cts:not-in-query. The example matches occurrences of dog when it is not in the phrase dog house.
query1 NEAR query2
dog NEAR cat
(cat food) NEAR mouse
Find documents containing matches to the queries on either side of the NEAR operator when the matches occur within 10 terms of each other, as with a cts:near-query. For example, dog NEAR cat matches documents containing dog within 10 terms of cat.
query1 NEAR/N query2
dog NEAR/2 cat
Find documents containing matches to the queries on either side of the NEAR operator when the matches occur within N terms of each other, as with a cts:near-query. The example matches documents where the term dog occurs within 2 terms of the term cat.
query1 BOOST query2
george BOOST washington
Find documents that match query1. Boost the relevance score of documents that also match query2. The example returns all matches for the term george, with matches in documents that also contain washington having a higher relevance score. For more details, see cts:boost-query.
[opt,opt,...]
cat[min-occurs=5]
cat AND[ordered] dog
Pass options or a weight to the cts query generated for query. Options after a word or phrase apply to the word query on that word or phrase. Options after the operator apply to the query associated with the operator, such as cts:and-query for AND. For details, see Including Options and Weights in Query Text.

Operators Usable With Bound Tags

When you bind a tag to an index, lexicon, field, or query generator, then you can use the tag name in the ways shown in the following table. If you use these operators in a context in which the left operand is not a tag name, then the operator is simply interpreted as another query term. That is, unbound LT value is a cts:and-query of word queries on the words unbound, LT, and value.
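
The fallback behavior described above can be pictured with a toy model in plain JavaScript. This is not MarkLogic's parser; the interpret function, its return shape, and the string used in place of a real cts reference are all hypothetical, and the model only covers a single three-token expression with a range operator.

```javascript
// Toy model only, not MarkLogic's parser: decide how a three-token
// expression such as "cost LT 15" would be interpreted.
const RANGE_OPERATORS = ['EQ', 'NE', 'LT', 'LE', 'GT', 'GE'];

function interpret(left, operator, right, tagBindings) {
  const isBound = Object.prototype.hasOwnProperty.call(tagBindings, left);
  if (isBound && RANGE_OPERATORS.includes(operator)) {
    // Bound tag: the operator is a comparison against the bound entity.
    return { kind: 'range', target: tagBindings[left], operator, value: right };
  }
  // No binding: every token is just a word, implicitly AND'd together.
  return { kind: 'and', words: [left, operator, right] };
}

// 'price' is a placeholder for a real cts reference.
const tagBindings = { cost: 'price' };
const bound = interpret('cost', 'LT', '15', tagBindings);
const unbound = interpret('unbound', 'LT', 'value', tagBindings);
console.log(bound.kind, unbound.kind, unbound.words.join(' '));
// prints: range and unbound LT value
```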

For more information on defining a binding, see Binding a Tag to a Reference, Field, or Query Generator. For tags bound to geospatial indexes, see Operators Usable with Geospatial Queries.

The sub-expressions enabled by these operators can be used in combination with the grammar features described in Basic Components and Operators. You can also associate options with sub-expressions that use tags; for details, see Including Options and Weights in Query Text.

If you bind a tag to a geospatial index reference, the value you compare to the tag can be a geospatial point or region. Not all the operators listed below are sensible in a geospatial context. For details, see Binding to a Geospatial Index Reference.

Query Example Description
tag:value
color:red
decade:1980s
birthday:1999-12-31
Matches documents where value satisfies a word query against the reference bound to tag. For example, as with a cts:element-word-query.
tag:(valueList)
color:(red blue)
decade:(1980s 1990s)
Matches documents where at least one value in valueList satisfies a word query against the reference bound to tag. For example, as with a cts:element-word-query.
tag = value
color = red
decade = 1980s
birthday = 1999-12-31
Matches documents where value satisfies a value query against the reference bound to tag. For example, as with a cts:element-value-query.
tag = (valueList)
color = (red blue)
decade = (1980s 1990s)
Matches documents where at least one value in valueList satisfies a value query against the reference bound to tag. For example, as with a cts:element-value-query.
tag EQ value
color EQ red
decade EQ 1980s
birthday EQ 1999-12-31
Matches documents where value satisfies a range query with the = operator against the reference bound to tag. For example, as with a cts:element-range-query.
tag EQ (valueList)
color EQ (red blue)
decade EQ (1980s 1990s)
Matches documents where at least one value in valueList satisfies a range query with the = operator against the reference bound to tag. For example, as with a cts:element-range-query.
tag NE value
color NE red
birthday NE 1999-12-31
Matches documents where value satisfies a range query with the != operator against the reference bound to tag. For example, as with a cts:element-range-query.
tag LT value
color LT red
birthday LT 1999-12-31
Matches documents where value satisfies a range query with the < operator against the reference bound to tag. For example, as with a cts:element-range-query.
tag LE value
color LE red
birthday LE 1999-12-31
Matches documents where value satisfies a range query with the <= operator against the reference bound to tag. For example, as with a cts:element-range-query.
tag GT value
color GT red
birthday GT 1999-12-31
Matches documents where value satisfies a range query with the > operator against the reference bound to tag. For example, as with a cts:element-range-query.
tag GE value
color GE red
birthday GE 1999-12-31
Matches documents where value satisfies a range query with the >= operator against the reference bound to tag. For example, as with a cts:element-range-query.
query[opt,opt,...]
color:(red,blue)[unstemmed]
price GT 5[min-occurs=2]
Pass options or a weight to the cts query generated for query. For details, see Including Options and Weights in Query Text.

Operators Usable with Geospatial Queries

When you bind a tag to a geospatial point or region index, then you can use the tag name with the operators listed in this section. If you use these operators in a context in which the left operand is not a tag name, then the operator is simply interpreted as another query term. That is, unbound EQ value is a cts:and-query of word queries on the words unbound, EQ, and value.

For more information on defining a binding, see Binding a Tag to a Reference, Field, or Query Generator.

The sub-expressions enabled by these operators can be used in combination with the grammar features described in Basic Components and Operators. You can include options with geospatial sub-expressions; for details, see Including Options and Weights in Query Text.

The value operand must be a geospatial point or region literal. For details, see Binding to a Geospatial Index Reference.

You can use the following operators with tags bound to a geospatial point index, such as a geospatial element child index or geospatial path index.

Query Example Description
tag:value
pt:"37.5128,-122.2581"
Matches documents where value satisfies a point query against the geospatial point reference bound to tag. For example, as with a cts:element-geospatial-query.
tag = value
pt = "37.5128,-122.2581"
tag EQ value
pt EQ "37.5128,-122.2581"
[opt,opt,...]
pt EQ "37,-122"[precision=float]
Pass options or a weight to the generated point query. For details, see Including Options and Weights in Query Text.

Tags bound to a geospatial region index can only be used with the DE9IM_* operators listed below. These operators implement the DE9-IM semantics described in http://en.wikipedia.org/wiki/DE-9IM. Expressions using these operators produce a cts:geospatial-region-query (XQuery) or cts.geospatialRegionQuery (JavaScript).

Query Description
tag DE9IM_CONTAINS value
Matches regions in the bound index that contain the region value. That is, regions where geo:region-contains(indexedRegion,value) returns true.
tag DE9IM_COVERED_BY value
If R1 is a region in the bound index and R2 is value, then R1 is covered by R2 if every point of R1 is a point of R2, and the interiors of R1 and R2 have at least one point in common.
tag DE9IM_COVERS value
If R1 is a region in the bound index and R2 is value, R1 covers R2 if R2 lies in R1. That is, no points of R2 lie in the exterior of R1, or every point of R2 is a point of the interior or boundary of R1.
tag DE9IM_CROSSES value
If R1 is a region in the bound index and R2 is value, R1 crosses R2 if their interiors intersect and the dimension of the intersection is less than that of at least one of the regions.
tag DE9IM_DISJOINT value
If R1 is a region in the bound index and R2 is value, R1 is disjoint from R2 if the intersection of the two regions is empty.
tag DE9IM_EQUALS value
If R1 is a region in the bound index and R2 is value, R1 equals R2 if every point of R1 is a point of R2, and every point of R2 is a point of R1. That is, the regions are topologically equal.
tag DE9IM_INTERSECTS value
If R1 is a region in the bound index and R2 is value, R1 intersects R2 if geo:region-intersects(R1,R2) returns true.
tag DE9IM_OVERLAPS value
If R1 is a region in the bound index and R2 is value, then R1 overlaps R2 if R1 intersects R2, exclusive of boundaries, and neither region contains the other.
tag DE9IM_TOUCHES value
If R1 is a region in the bound index and R2 is value, R1 touches R2 if they have a boundary point in common but no interior points in common.
tag DE9IM_WITHIN value
If R1 is a region in the bound index and R2 is value, then R1 is within R2 if R2 contains R1.

As with point queries, pass options to a region query by putting the option list after the query. For example, if the tag region is bound to a geospatial region index, then you can specify the units option as follows:

region DE9IM_CONTAINS "@1 32,-122" [units=km]

Including Options and Weights in Query Text

Your query text can include query options or a weight that is passed through to the query generated by cts:parse. This is an advanced feature that you would not typically expose directly to end users. To use this feature, put the options or weight in brackets after the query term or operator. The position depends on the type of query.

Place the option list adjacent to a word or phrase sub-expression or a sub-expression that uses a bound tag. For example:

cat[min-occurs=2]
tag LT value [min-occurs=2]
tag DE9IM_OVERLAPS [1, 10, 5, 20] [units=km]

Place the options adjacent to the operator when the operator is one of the operators listed in Basic Components and Operators (AND, OR, NEAR, etc.). For example:

cat AND[ordered] dog

To specify a weight, use weight=N. For example:

tag LT value [weight=2.0]
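
Because end users would not normally type option lists themselves, an application might assemble them programmatically before calling cts.parse. The following is a minimal sketch of that idea in plain JavaScript; the withOptions helper is hypothetical and only concatenates strings according to the placement rules above.

```javascript
// Hypothetical helper: append a bracketed option list (and optional
// weight) to a term, a bound-tag sub-expression, or an operator.
function withOptions(queryText, options = [], weight) {
  const parts = [...options];
  if (weight !== undefined) parts.push(`weight=${weight}`);
  return parts.length > 0 ? `${queryText}[${parts.join(',')}]` : queryText;
}

const left = withOptions('cat', ['min-occurs=3']); // options on a word
const op = withOptions('AND', ['ordered']);        // options on an operator
const right = withOptions('perro', ['lang=es']);   // options on a word
const queryText = `${left} ${op} ${right}`;

console.log(queryText);
// prints: cat[min-occurs=3] AND[ordered] perro[lang=es]
// In MarkLogic, this string would then be passed to cts.parse(queryText).
```

Keeping the option syntax in one helper means the rest of the application only deals with plain terms and option names.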

The following table provides additional examples of passing options and weights in query text. Assume that query terms such as cat and dog are simple words, and the query terms price, pt, and region are tags bound to an index, field, or lexicon reference.

Query Text Generated Query
cat
cts:word-query("cat", ("lang=en"), 1)
cat[case-sensitive]
cts:word-query(
  "cat", ("case-sensitive","lang=en"), 1)
chat[stemmed,lang=fr]
cts:word-query("chat", ("stemmed","lang=fr"), 1)
cat AND dog
cts:and-query(
  (cts:word-query("cat", ("lang=en"), 1),
   cts:word-query("dog", ("lang=en"), 1)
  ),("unordered"))
cat[min-occurs=3] AND
  dog[weight=2]
cts:and-query(
  (cts:word-query("cat",
    ("lang=en","min-occurs=3"), 1),
   cts:word-query("dog", ("lang=en"), 2)
  ),("unordered"))
cat[min-occurs=3]
  AND[ordered]
perro[lang=es]
cts:and-query((
   cts:word-query("cat",
    ("min-occurs=3","lang=en"), 1),
   cts:word-query("perro", ("lang=es"), 1)
  ), ("ordered"))
price GT 5 [min-occurs=2]
cts:json-property-range-query(
  "price", ">", xs:int("5"), (
  "min-occurs=2"), 1)
price EQ 5 [min-occurs=2]
  AND[ordered] perro[lang=es]
cts:and-query((
  cts:json-property-range-query(
    "price", "=", xs:int("5"), 
    ("min-occurs=2"), 1),
  cts:word-query("perro", ("lang=es"), 1)
 ), ("ordered"))
pt:"@1 37,-122"[units=km]
cts:element-child-geospatial-query(
  fn:QName(...), fn:QName(...),
  cts:point("37,-122"),
  ("coordinate-system=wgs84","units=km"), 1)
region 
  DE9IM_CONTAINS 
"@1 32,-122" [units=km]
cts:geospatial-region-query((
  cts:geospatial-region-path-reference(
    "/envelope/cts-region",(
    "coordinate-system=wgs84"))), 
  "contains", cts:circle("@1 32,-122"),
  ("units=km"), 1)

Binding a Tag to a Reference, Field, or Query Generator

This topic describes how to define parse bindings that enable the use of specially scoped relational and comparison operators in query text passed to the cts:parse XQuery function or cts.parse Server-Side JavaScript function. You can create bindings to XML elements, XML element attributes, JSON properties, fields, and paths, as well as to custom parsing functions.

Binding Overview

The cts:parse XQuery function and the cts.parse JavaScript function accept an optional second parameter that is a set of bindings between a tag and a content reference, field name, or query generator function. When you use the tag in query text, cts:parse (cts.parse) uses the binding to generate a query based on the bound reference, field, or function.

In XQuery, bindings are represented by a map with the tag names as the keys. In JavaScript, the bindings are represented by a JavaScript object with the tag names as the object property names. For example, the following code snippet binds the tag by to an XML element/JSON property named author:

Language Example
XQuery
let $bindings := map:map()
let $_ := map:put(
  $bindings, 
  "by", cts:element-reference(xs:QName("author")))
return $bindings
JavaScript
const bindings =
  { by: cts.jsonPropertyReference('author') };

Given the above binding, you can use by in query text to represent the value of the author element or property. For example, the following query text parses to a cts:element-word-query (or cts.jsonPropertyWordQuery) for the phrase mark twain in the author XML element or JSON property.

by:"mark twain"

The example above uses an element reference in XQuery and a JSON property reference in JavaScript, but your choice of query language does not limit you to a particular reference type. For example, you can create a binding with cts:json-property-reference in XQuery and with cts.elementReference in JavaScript.

You can examine the serialized output produced by the parse in Query Console to observe the results of using a bound tag in query text. For example, passing the above query text and bindings to cts:parse yields the results shown below:

Language Example
XQuery
xquery version "1.0-ml";
let $bindings := map:map()
let $_ := map:put(
  $bindings, 
  "by", cts:element-reference(xs:QName("author")))
return cts:parse('by:"mark twain"', $bindings)

(: emits 
 : cts:element-word-query(fn:QName("","author"), "mark twain")
 :)
JavaScript
const bindings =
  { by: cts.jsonPropertyReference('author') };
cts.parse('by:"mark twain"', bindings)

// emits
// cts.jsonPropertyWordQuery("author", "mark twain")

You get this result because the : operator signifies comparison as per a word query, and the binding dictates that the word query is scoped to a specific XML element or JSON property. Thus, the combination of the operator and the bound reference determines the generated query. For details, see Binding to a cts:reference.

The :, =, and EQ operators also accept a grouping of values, which is handled like an OR. For example, the following query matches documents where the author JSON property contains either the word twain or the word frost:

by:(twain frost)
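
An application that collects several candidate values (for example, from checkboxes) could build such a grouped expression before passing it to cts.parse. The sketch below is plain JavaScript string handling; the anyOf helper is hypothetical, and it deliberately accepts only single-word values because the examples in this chapter do not show phrases inside a value list.

```javascript
// Hypothetical helper: build tag:(v1 v2 ...) style query text.
// Value lists are only documented for the :, =, and EQ operators.
function anyOf(tag, values, operator = ':') {
  if (![':', '=', 'EQ'].includes(operator)) {
    throw new Error(`value lists are not documented for operator ${operator}`);
  }
  for (const v of values) {
    // Assumption: keep to simple words; reject spaces, quotes, parentheses.
    if (/[\s"()]/.test(v)) throw new Error(`unsupported value: ${v}`);
  }
  const list = `(${values.join(' ')})`;
  return operator === ':' ? `${tag}:${list}` : `${tag} ${operator} ${list}`;
}

const byText = anyOf('by', ['twain', 'frost']);
console.log(byText); // by:(twain frost)
// In MarkLogic: cts.parse(byText, bindings)
```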

If you define a binding with an empty string as the tag, the binding applies to unqualified terms like cat. For details, see Customizing Naked Term Handling With Bindings.

Binding to a simple string is similar, but the bound entity in that case is a field. For details, see Binding to a Field by Simple Name.

For a complete mapping of reference type and operator to query type, refer to the reference documentation for cts:parse in the MarkLogic XQuery and XSLT Function Reference or cts.parse in the MarkLogic Server-Side JavaScript Function Reference.

If the default query mapping does not satisfy the requirements of your application, you can bind a tag to a query generator function instead. Binding a tag to a function that generates a cts query gives you more control over the interpretation of a query sub-expression and enables using the following operators in query text: :, =, LT, LE, GT, GE, EQ, NE.

The bound function is expected to generate a cts:query (or cts.query) from the operator and operands. For example, you could cause the query text 'by:"mark twain"' to match mark twain in the author property only when the phrase occurs in documents in a specific collection. For details, see Binding to an XQuery Query Generator Function or Binding to a JavaScript Query Generator Function.

Function binding is designed to enable you to override the default query selection when a tag is bound to a reference or simple string. It is not a general purpose grammar extender. For example, you cannot define new operators or change the number of operands expected by an operator.

Binding to a cts:reference

You can bind a tag to a cts:reference by using any cts:reference constructor. This enables you to bind a tag to an XML element or element attribute, JSON property, field, or path. Query expressions using the tag can parse to a word query, value query, or range query, depending on the operator context.

For example, the following code binds the tag cost to an XML element or JSON property named price, then uses the cost tag in the query expression cost LT 15. The use of the tag with the LT operator causes the expression to parse to a range query, so the database configuration should include a range index on price with type float.

Language Example
XQuery
xquery version "1.0-ml";
let $bindings := map:map()
let $_ := map:put(
  $bindings, 
  "cost", cts:element-reference(xs:QName("price")))
return cts:parse('cost LT 15', $bindings)

(: cts:element-range-query(
     fn:QName("","price"), "<", xs:float("15"), (), 1) 
:)
JavaScript
const bindings =
  { cost: cts.jsonPropertyReference('price') };
cts.parse('cost LT 15', bindings)

// cts.jsonPropertyRangeQuery(
//    "price", "<", xs.float("15"), [], 1)

If you use the binding in a different operator context, the parser generates a different kind of query. For example, the : operator generates a word query in most cases, so the query text cost:15 parses to a cts:element-word-query or cts.jsonPropertyWordQuery, similar to the following:

cts:element-word-query(fn:QName("","price"), "15", ("lang=en"), 1)

cts.jsonPropertyWordQuery("price", "15", ["lang=en"], 1)

If you bind a tag to a geospatial index reference, the : operator generates a geospatial query. For details, see Binding to a Geospatial Index Reference.

For a complete list of the types of query generated by each operator, refer to cts:parse in the MarkLogic XQuery and XSLT Function Reference or cts.parse in the MarkLogic Server-Side JavaScript Function Reference.

By default, the parser checks for the existence of a backing index or lexicon for each cts reference when it processes your bindings. Though it is usually beneficial to have a backing index for a binding, you can suppress the check if you want to defer index creation or know you will never use the binding in a search context that actually requires an index. For example, range queries always require an index, but a word query does not necessarily require one. If you use an unchecked binding to create a query that requires an index, you will still get an error when you use the query in a search.

To suppress the parse time index check, add the unchecked and type options when creating the reference. The type option is required because the parser can no longer derive this information from the index definition. The following example illustrates the parse time check vs. the search time check:

Language Example
XQuery
(: parse time XDMP-ELEMRIDXNOTFOUND if no range index exists:)
xquery version "1.0-ml";
let $bindings := map:map()
let $_ := map:put(
  $bindings, 
  "cost", cts:element-reference(xs:QName("price")))
return cts:parse('cost LT 15', $bindings);

(: search time XDMP-ELEMRIDXNOTFOUND :)
xquery version "1.0-ml";
let $bindings := map:map()
let $_ := map:put(
  $bindings, "cost", cts:element-reference(xs:QName("price"),
    ("type=float","unchecked")))
return cts:search(fn:doc(), cts:parse('cost LT 15', $bindings))
JavaScript
// parse time XDMP-ELEMRIDXNOTFOUND if no range index exists
cts.parse('cost LT 15', {cost: cts.jsonPropertyReference('price')})

// Suppress the parse time index check
const query = cts.parse('cost LT 15', 
  {cost: cts.jsonPropertyReference(
    'price',['type=float','unchecked'])})
// But will still get search time error if no range index found
cts.search(query)        // XDMP-ELEMRIDXNOTFOUND
Binding to a Field by Simple Name

You can bind to a field by name or by cts:reference. This section describes how to bind to a field by name. To use a reference constructor instead, see Binding to a cts:reference.

When you bind a tag to a simple string, the string is interpreted as the name of a field. The database configuration should include a corresponding field definition. You can bind to any type of field, including metadata fields.

For example, the following binds the tag name to a field named person:

Language Example
XQuery
xquery version "1.0-ml";
let $bindings := map:map()
let $_ := map:put(
  $bindings, 
  "name", "person")
return cts:parse('name:"jane doe"', $bindings)
(: cts:field-word-query("person", "jane doe", ("lang=en"), 1) :)
JavaScript
const bindings = { name: 'person' };
cts.parse('name:"jane doe"', bindings)

// cts.fieldWordQuery("person", "jane doe", ["lang=en"], 1)

When you use the bound tag, it will parse to a cts:field-word-query, cts:field-value-query, or cts:field-range-query, depending on the operator context. If you use the tag name in a context that parses to a range query, you will get an error if the database configuration does not include a corresponding field range index.

To learn more about fields, see Fields Database Settings in the Administrator's Guide.

For a complete list of the kinds of query generated by the supported (cts:reference, operator) pairs, refer to cts:parse in the MarkLogic XQuery and XSLT Function Reference or cts.parse in the MarkLogic Server-Side JavaScript Function Reference.

Binding to a Geospatial Index Reference

If you bind a tag (or naked terms) to a cts:reference to a geospatial index, you can construct query terms that represent a geospatial point or region query. For example, you can match documents containing a point within a region defined in the query text, or documents containing a region that intersects a region defined in the query text.

For example, if you bind the tag loc to a geospatial point index, then the following query text matches documents containing a point within a circle defined by a radius and a center point, using the syntax @radius lat,lon:

loc:"@5 37.5,-122.4"

The following code demonstrates how to define the binding and parse the above query text. In this example, the tag loc is bound to a geospatial point index on an XML element or JSON property named incidents. The resulting query matches documents containing points in the incidents element or property contained within the circle with center (37.5,-122.4) and a radius of 5 miles.

Language Example
XQuery
xquery version "1.0-ml";
let $bindings := map:map()
let $_ := map:put(
  $bindings, 
  "loc", cts:geospatial-element-reference(xs:QName("incidents")))
return cts:parse('loc:"@5 37.5,-122.4"', $bindings)
(: cts:element-geospatial-query(
 :   fn:QName("","incidents"), 
 :   cts:circle("@5 37.5,-122.4"),
 :   ("coordinate-system=wgs84"), 1) 
 :)
JavaScript
cts.parse('loc:"@5 37.5,-122.4"', 
  { loc: cts.geospatialJsonPropertyReference('incidents') })

// cts.jsonPropertyGeospatialQuery(
//    "incidents", 
//    cts.circle("@5 37.5,-122.4"),
//    ["coordinate-system=wgs84"], 1)

You can bind a tag to any of the index types described in Understanding Geospatial Query and Index Types. Parsing an expression that uses such a tag creates a query of the corresponding type. For example, a tag bound to a geospatial element reference produces an element geospatial query, and a tag bound to a geospatial region path reference produces a geospatial region query.

Use the :, EQ, and = operators with tags bound to a geospatial point index. Use the DE9IM_* operators with tags bound to a geospatial region index. For details, see Operators Usable with Geospatial Queries. For example:

mypoint:"@1 37.5073428,-122.2465038"

myregion DE9IM_OVERLAPS "@1 37.5073428,-122.2465038"

The right operand of a geospatial query expression must be a geospatial literal. You can specify a point, circle, box, or polygon using the shorthand shown below, or you can specify any supported region type using WKT. The shorthand is equivalent to the serialization of cts:point, cts:circle, cts:box, and cts:polygon in XQuery; and of cts.point, cts.circle, cts.box, and cts.polygon in JavaScript. For details, see the corresponding region constructors and Constructing Geospatial Point and Region Values.

Geospatial Entity Literal Syntax Example
point
lat,lon
tag:"37.5, -122.4"
circle
@radius lat,lon
tag:"@5 37.5,-122.4"
box
[sbound, wbound, nbound, ebound]
tag:"[45, -122, 78, 30]"
polygon
lat1,lon1 lat2,lon2 ...latN,lonN
tag:"100,0 101,0 101,1 100,1 100,0"

Geospatial point and region literals such as the point 37,-122 must be enclosed in double quotes. You cannot substitute single quotes for the double quotes.

For more details, see Constructing Geospatial Point and Region Values and Converting To and From Common Geospatial Representations.

Binding to an XQuery Query Generator Function

A query generator function should implement the following interface:

function (
  $operator as xs:string,
  $values as xs:string*,
  $options as xs:string*
) as cts:query?

If your function does not return a value, the query sub-expression is interpreted as text.

The following example adds a cts:collection-query to the search, corresponding to each term in the query text that is qualified by the tag name cat (as in category). If an unsupported category name is supplied, an error is thrown. If the operator is not : or EQ, no value is returned.

xquery version "1.0-ml";

(: The query generator :)
declare function local:scope-to-coll(
  $operator as xs:string,
  $values as xs:string*,
  $options as xs:string*)
as cts:query?
{
  if ($operator = (":", "EQ")) then
    let $known := ("classics", "fiction", "poetry")
    return cts:collection-query(
      for $c in ($values) 
      return
        if ($c = $known)
        then $c
        else fn:error(
          xs:QName("ERROR"), 
          fn:concat("Unrecognized category: ", $c))
    )
  else ()       (: unsupported operator :)
};

(: how to use it :)
let $bindings := map:map()
let $_ := map:put($bindings, "cat", local:scope-to-coll#3)
return cts:parse('cat EQ classics california', $bindings)
(: matches docs in the "classics" collection that contain california :)

This query generator function produces the following results:

Query Text Result
cat:classics

cat EQ classics
cts:collection-query("classics")
cat:unrecognized
None - function reports an error
cat LT anything
(: interpreted as text :)
cts:and-query((
  cts:word-query("cat", ("lang=en"), 1),
  cts:word-query("LT", ("lang=en"), 1),
  cts:word-query("anything", ("lang=en"), 1)
  ), ("unordered"))
Binding to a JavaScript Query Generator Function

A query generator function should implement the following interface:

function (operator, values, options)

Where operator is a string containing the operator token, and values and options are either a single value or a (possibly empty) Sequence.

Your function can return a cts.query, return nothing, or throw an error by calling fn.error. If you return nothing, the sub-expression is interpreted as text.

The following example adds a cts.collectionQuery to the search, corresponding to each term in the query text that is qualified by the tag name cat (as in category). If an unsupported category name is supplied, an error is thrown. If the operator is not : or EQ, no value is returned.

function scopeToColl(operator, category, options) {
  if (operator === ':' || operator === 'EQ') {
    // normalize input, which can be one val or an iterator
    const categories = 
        (category instanceof Sequence) 
        ? category.toArray() : [category];
    const known = ['classics', 'fiction', 'poetry']
    const collections = [];
    categories.forEach(function (c) {
      if (known.indexOf(c) != -1) {
        collections.push(c);
      } else {
        fn.error('ERROR', 'Unrecognized category: ' + c);
      }
    });
    return cts.collectionQuery(collections);
  }
  // else, unsupported operator, so return nothing
};

const bindings = { cat: scopeToColl };
cts.parse('cat:(classics poetry) california', bindings)

This query generator function produces the following results:

Query Text Result
cat:classics

cat EQ classics
cts.collectionQuery('classics')
cat:unrecognized
None - function reports an error
cat LT anything
// Function returns nothing, phrase interpreted as text
// by cts.parse
cts.andQuery(
  [cts.wordQuery("cat", ["lang=en"], 1),
   cts.wordQuery("LT", ["lang=en"], 1),
   cts.wordQuery("anything", ["lang=en"], 1)],
  ["unordered"])

The values in the second parameter may be strings or numbers. If a term in the query text can be represented as a number, then your function receives it as a number. Otherwise, the term is a string.

The following table illustrates how several variations on query text are interpreted and passed as input to your query generator:

Query Text Function Parameter Values
tag LT value
operator: 'LT'
values: value
options: an empty Sequence
tag = (val1 val2)
operator: '='
values: Sequence over val1 and val2
options: an empty Sequence
tag:42
operator: ':'
values: 42 as a number
options: an empty Sequence
tag:true
operator: ':'
values: 'true' (string, not boolean)
options: an empty Sequence
tag:value[opt]
operator: ':'
values: value
options: 'opt'
tag LT value[opt1,opt2=42]
operator: 'LT'
values: value
options: a Sequence over 'opt1' and 'opt2=42'

Customizing Naked Term Handling With Bindings

You can use bindings to control the interpretation of terms in query text that are not qualified by a tag (naked terms). For example, in query text such as cat AND dog, cat and dog are naked terms. The default interpretation of this query text is a query that matches the terms cat and dog anywhere they appear, similar to the following:

cts:and-query((cts:word-query('cat'), cts:word-query('dog')))

If you create a binding with the empty string as the tag, you can customize the handling of terms that have no tag qualifier in the same way you can customize the interpretation of a defined tag. For example, you can configure the parser to scope the terms cat and dog to a particular XML element or JSON property.

You can bind naked terms to a content reference, field name, or a query generator function, just as when using a tag.

The following examples constrain naked terms to occurrences in an XML element/JSON property named title.

Language Example
XQuery
xquery version "1.0-ml";
let $bindings := map:map()
let $_ := map:put($bindings, "",
  cts:element-reference(xs:QName("title")))
return cts:parse('cat AND dog', $bindings)

(:
  cts:and-query((
    cts:element-word-query(fn:QName("","title"),"cat",("lang=en"),1),
    cts:element-word-query(fn:QName("","title"),"dog",("lang=en"), 1)
    ), ("unordered"))
:)
JavaScript
cts.parse(
  'cat AND dog',
  {'': cts.jsonPropertyReference('title')}
)

// cts.andQuery([
//    cts.jsonPropertyWordQuery("title", "cat", ["lang=en"], 1),
//    cts.jsonPropertyWordQuery("title", "dog", ["lang=en"], 1)
//  ],
//  ["unordered"])

For more details on using bindings, see Binding a Tag to a Reference, Field, or Query Generator.

Query Text Parsing Examples

This section illustrates the output from the cts:parse XQuery function or cts.parse JavaScript function for various inputs. For examples of queries that include option values, see Including Options and Weights in Query Text.

You can use a query similar to the following in Query Console to explore the parser output on your own. The bindings are only needed for the examples that use the color or loc tag. To parse some of the query text that uses the bound tags, you need to define an element range index on the body-color XML element or bodyColor JSON property, and a geospatial element range index on an XML element or JSON property named incidents.

Query Language Query Template
XQuery
xquery version "1.0-ml";
let $bindings := map:map()
let $_ := 
  map:put($bindings, 
    "color", cts:element-reference(xs:QName("body-color")))
let $_ := 
  map:put($bindings, 
    "loc", cts:geospatial-element-reference(xs:QName("incidents")))
return cts:parse(queryText, $bindings)
JavaScript
cts.parse(queryText, 
  { color: cts.jsonPropertyReference('bodyColor'),
    loc: cts.geospatialJsonPropertyReference('incidents') })

The following table contains examples of input query text and the result returned by the parser.

Query Text cts:parse Output(XQuery) cts.parse Output(JavaScript)
cat
cts:word-query(
  "cat", ("lang=en"), 1)
cts.wordQuery(
  "cat", ["lang=en"], 1)
cat dog
cat AND dog
cts:and-query((
  cts:word-query(
    "cat", ("lang=en"), 1),
  cts:word-query(
    "dog", ("lang=en"), 1)
  ), ("unordered"))
cts.andQuery([
  cts.wordQuery(
    "cat", ["lang=en"], 1),
  cts.wordQuery(
    "dog", ["lang=en"], 1)
  ], ["unordered"])
cat dog OR mouse
cts:or-query((
  cts:and-query((
    cts:word-query(
      "cat",("lang=en"),1),
    cts:word-query(
      "dog",("lang=en"), 1)
    ), ("unordered")),
  cts:word-query(
    "mouse",("lang=en"),1)
  ), ())
cts.orQuery([
  cts.andQuery([
    cts.wordQuery(
      "cat",["lang=en"],1),
    cts.wordQuery(
      "dog", ["lang=en"],1)
    ], ["unordered"]),
  cts.wordQuery(
    "mouse", ["lang=en"],1)
  ], [])
cat (dog OR mouse)
cts:and-query((
  cts:word-query(
    "cat", ("lang=en"), 1),
  cts:or-query((
    cts:word-query(
      "dog",("lang=en"),1),
    cts:word-query(
      "mouse",("lang=en"),
      1)
    ), ())
  ), ("unordered"))
cts.andQuery([
  cts.wordQuery(
    "cat", ["lang=en"], 1),
  cts.orQuery([
    cts.wordQuery(
      "dog",["lang=en"],1),
    cts.wordQuery(
     "mouse",["lang=en"],1)
    ], [])
  ], ["unordered"])
cat -dog
cts:and-query((
  cts:word-query(
    "cat", ("lang=en"), 1),
  cts:not-query(
    cts:word-query(
      "dog", ("lang=en"),
      1),
    1)
  ), ("unordered"))
cts.andQuery([
  cts.wordQuery(
    "cat", ["lang=en"], 1),
  cts.notQuery(
    cts.wordQuery(
     "dog", ["lang=en"],1),
    1)
  ], ["unordered"])
color:red
cts:element-word-query(
  fn:QName("","body-color"),
  "red", ("lang=en"), 1)
cts.jsonPropertyWordQuery(
  "bodyColor", "red",
  ["lang=en"], 1)
color = red
cts:element-value-query(
  fn:QName("","body-color"),
  "red", ("lang=en"), 1)
cts.jsonPropertyValueQuery(
  "bodyColor", "red",
  ["lang=en"], 1)
color EQ red
cts:element-range-query(
  fn:QName("","body-color"), 
  "=", "red",
  ("collation=..."), 1)
cts.jsonPropertyRangeQuery(
  "bodyColor", "=", "red",
  ["collation=..."], 1)
color:(red blue)
cts:element-word-query(
  fn:QName("","body-color"),
  ("red", "blue"),
  ("lang=en"), 1)
Matches if body-color contains either red or blue.
cts.jsonPropertyWordQuery(
  "bodyColor", ["red", "blue"],
  ["lang=en"], 1)
Matches if bodyColor contains either red or blue.
loc:"100.0,1.0"
cts:element-geospatial-query(
  fn:QName("","incidents"),
  cts:point("100,1"),
  ("coordinate-system=wgs84"),
  1)
cts.jsonPropertyGeospatialQuery(
  "incidents",
  cts.point("100,1"),
  ["coordinate-system=wgs84"],
  1)
loc:"[10,20,30,40]"
cts:element-geospatial-query(
  fn:QName("","incidents"),
  cts:box("[10, 20, 30, 40]"),
  ("coordinate-system=wgs84"),
  1)
cts.jsonPropertyGeospatialQuery(
  "incidents", 
  cts.box("[10,20,30,40]"),
  ["coordinate-system=wgs84"],
  1)

Combining multiple cts:query Expressions

Because cts:query expressions are composable, you can combine multiple expressions to form a single expression. There is no limit to how complex you can make a cts:query expression. Any API that has a return type of cts:* (for example, cts:query, cts:and-query, and so on) can be composed with another cts:query expression to form another expression. This section has the following parts:

Using cts:and-query and cts:or-query

You can construct arbitrarily complex boolean logic by combining cts:and-query and cts:or-query constructors in a single cts:query expression.

For example, the following search with a relatively simple nested cts:query expression will return all fragments that contain either the word alfa or the word maserati, and also contain either the word saab or the word volvo.

cts:search(fn:doc(),
  cts:and-query( ( cts:or-query(("alfa", "maserati")), 
                   cts:or-query(("saab", "volvo") )
  ) )
)

Additionally, you can use cts:and-not-query and cts:not-query to add negation to your boolean logic.
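
As a minimal sketch of negation, the following reuses the search terms from the example above. It assumes you want fragments that mention alfa or maserati but do not mention saab; the terms themselves are only illustrative.

```xquery
xquery version "1.0-ml";
(: Sketch: the first argument is the positive query, the second
   is the query whose matches are excluded from the results. :)
cts:search(fn:doc(),
  cts:and-not-query(
    cts:or-query(("alfa", "maserati")),
    cts:word-query("saab")))
```

Like any search that relies on negation, the accuracy of the unfiltered result depends on the indexes enabled in your database.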

Proximity Queries using cts:near-query

You can add tests for proximity to a cts:query expression using cts:near-query. Proximity queries use the word positions index in the database and, if you are using cts:element-query, the element word positions index. Proximity queries will still work without these indexes, but the indexes will speed performance of queries that use cts:near-query.

Proximity queries return true if the query matches occur within the specified distance from each other. You can specify both a maximum and a minimum distance.

For more details, see the MarkLogic XQuery and XSLT Function Reference for cts:near-query.
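
The following sketch shows one way a proximity test might look. The search terms and distances are illustrative assumptions, and the "ordered" and "minimum-distance" option names should be confirmed against the cts:near-query entry in the function reference for your server version.

```xquery
xquery version "1.0-ml";
(: Sketch: match fragments where "alfa" is followed by "maserati",
   at most 10 words apart (second parameter) and at least 2 words
   apart (assumed "minimum-distance" option). :)
cts:search(fn:doc(),
  cts:near-query(
    (cts:word-query("alfa"), cts:word-query("maserati")),
    10,
    ("ordered", "minimum-distance=2")))
```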

Using Bounded cts:query Expressions

The cts:document-query, cts:directory-query, and cts:collection-query constructors allow you to bound a cts:query expression to one or more documents, a directory, or one or more collections.

These bounding constructors allow you to narrow a set of search results as part of the second parameter to cts:search. Bounding the query in the cts:query expression is much more efficient than filtering results in a where clause, and is often more convenient than modifying the XPath in the first cts:search parameter. To combine a bounded cts:query constructor with another constructor, use a cts:and-query or a cts:or-query constructor.

For example, the following constrains a search to a particular directory, returning the URI of the document(s) that match the cts:query.

for $x in cts:search(fn:doc(), 
   cts:and-query((
     cts:directory-query("/shakespeare/plays/", "infinity"), 
         "all's well that"))
)
return xdmp:node-uri($x)

This query returns the URI of all documents under the specified directory that satisfy the query "all's well that".

In this query, the query "all's well that" is equivalent to a cts:word-query("all's well that").
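
The same pattern applies to collections. The following is a sketch only; the collection names classics and poetry are hypothetical and are assumed to exist in your database.

```xquery
xquery version "1.0-ml";
(: Sketch: restrict the phrase search to documents that belong to
   either of two hypothetical collections. :)
for $x in cts:search(fn:doc(),
  cts:and-query((
    cts:collection-query(("classics", "poetry")),
    cts:word-query("all's well that"))))
return xdmp:node-uri($x)
```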

Matching Nothing and Matching Everything

An empty cts:word-query will always match no fragments, and an empty cts:and-query will always match all fragments. Therefore the following are true:

cts:search(fn:doc(), cts:word-query("") )
=> returns the empty sequence
cts:search(fn:doc(), "" )
=> returns the empty sequence
cts:search(fn:doc(), cts:and-query( () ) )
=> returns every fragment in the database

You can also use cts:true-query and cts:false-query to match everything or nothing. For example:

cts:search(fn:doc(), cts:false-query())
==> returns the empty sequence

cts:search(fn:doc(), cts:true-query())
==> returns every fragment in the database

One use for an empty cts:word-query is a search box in which an end user enters search terms. If the user enters nothing and clicks the submit button, the corresponding cts:search returns no hits.

An empty cts:and-query or a cts:true-query is useful when an API requires a cts:query but you do not want to restrict the results.
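
One place the empty cts:and-query shows up naturally is when you assemble a query from optional constraints. The following is a sketch under stated assumptions: local:build-query and its two parameters are hypothetical names, not part of any MarkLogic API.

```xquery
xquery version "1.0-ml";
(: Hypothetical helper: each constraint is added only when a value
   is supplied. With no values, the cts:and-query is empty and
   therefore matches every fragment. :)
declare function local:build-query(
  $terms as xs:string?,
  $collection as xs:string?
) as cts:query
{
  cts:and-query((
    if ($terms) then cts:word-query($terms) else (),
    if ($collection) then cts:collection-query($collection) else ()
  ))
};

(: No constraints supplied, so this estimates all fragments. :)
xdmp:estimate(cts:search(fn:doc(), local:build-query((), ())))
```

If you would rather return no hits when the user supplies nothing, pass the raw input to cts:word-query instead, as described above.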

Joining Documents and Properties with cts:properties-query or cts:document-fragment-query

You can use a cts:properties-query to match content in a properties document. If you are searching over a document, then a cts:properties-query will search in the properties document at the URI of the document. The cts:properties-query joins the properties document with its corresponding document. The cts:properties-query takes a cts:query as a parameter, and that query is used to match against the properties document. A cts:properties-query is composable, so you can combine it with other cts:query constructors to create arbitrarily complex queries.

Using a cts:properties-query in a cts:search, you can easily create a query that returns results that join content in a document with content in the corresponding properties document. For example, consider a document that represents a chapter in a book, and the document has properties containing the publisher of the book. You can then write a search that returns documents that match a cts:query where the document has a specific publisher, as in the following example:

cts:search(collection(), cts:and-query((
  cts:properties-query(
    cts:element-value-query(xs:QName("publisher"), "My Press") ),
  cts:word-query("a small good thing") )) )

This query returns all documents with the phrase a small good thing and that have a value of My Press in the publisher element in their corresponding properties document.

Similarly, you can use cts:document-fragment-query to join documents against properties when searching over properties.
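
The following sketch reverses the direction of the join in the publisher example. It assumes that xdmp:document-properties() is accepted as the searchable expression in your environment and that the properties documents contain a publisher element; treat it as an illustration rather than a tested recipe.

```xquery
xquery version "1.0-ml";
(: Sketch: search over properties documents, keeping those whose
   publisher is "My Press" and whose corresponding document
   contains the phrase. :)
cts:search(xdmp:document-properties(),
  cts:and-query((
    cts:element-value-query(xs:QName("publisher"), "My Press"),
    cts:document-fragment-query(
      cts:word-query("a small good thing"))
  )))
```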

Registering cts:query Expressions to Speed Search Performance

If you use the same complex cts:query expressions repeatedly, and if you are using them as an unfiltered cts:query constructor, you can register the cts:query expressions for later use. Registering a cts:query expression stores a pre-evaluated version of the expression, making it faster for subsequent queries to use the same expression. Unfiltered constructors return results directly from the indexes and return all candidate fragments for a search, but do not perform post-filtering to validate that each fragment perfectly meets the search criteria. For details on unfiltered searches, see Using Unfiltered Searches for Fast Pagination in the Query Performance and Tuning Guide.

This section describes registered queries and provides some examples of how to use them. It includes the following topics:

Registered Query APIs

To register and reuse unfiltered searches for cts:query expressions, use the cts:register, cts:deregister, and cts:registered-query XQuery APIs.

For the syntax of these functions, see the MarkLogic XQuery and XSLT Function Reference.

Must Be Used Unfiltered

You can only use registered queries on unfiltered constructors; using a registered query as a filtered constructor throws the XDMP-REGFLT exception. To specify an unfiltered constructor, use the "unfiltered" option to cts:registered-query. For details about unfiltered searches, see Using Unfiltered Searches for Fast Pagination in the Query Performance and Tuning Guide.

Registration Does Not Survive System Restart

Registered queries are only stored in the memory cache, and if the cache grows too big, some registered queries might be aged out of the cache. Also, if MarkLogic Server stops or restarts, any queries that were registered are lost and must be re-registered.

If you attempt to call cts:registered-query in a cts:search and the query is not currently registered, it throws an XDMP-UNREGISTERED exception. Because registered queries are not guaranteed to be registered every time they are used, it is good practice to use a try/catch around calls to cts:registered-query, and re-register the query in the catch if it throws an XDMP-UNREGISTERED exception.

For example, the following sample code shows a cts:registered-query call used with a try/catch expression in XQuery:

(: wrap the registered query in a try/catch :)
try {
  xdmp:estimate(cts:search(fn:doc(), 
    cts:registered-query(995175721241192518, "unfiltered")))
}
catch ($e) {
  let $registered := 'cts:register(
    cts:word-query("hello*world", "wildcarded"))'
  return
    if ( fn:contains($e/*:code/text(), "XDMP-UNREGISTERED") )
    then ( "retry this query with the following registered query ID: ",
           xdmp:eval($registered) )
    else ( $e ) 
}

This code is somewhat simplified: it catches the XDMP-UNREGISTERED exception and simply reports what the new registered query ID is. In an application that uses registered queries, you probably would want to re-run the query with the new registered ID. Also, this example performs the try/catch in XQuery. If you are using XCC to issue queries against MarkLogic Server, you can instead perform the try/catch in the middleware Java layer.
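
A variation that re-runs the search after re-registering might look like the following sketch. It assumes that cts:register can be called in the same request as the retried search, and the stored ID is the placeholder value from the example above, not a real ID.

```xquery
xquery version "1.0-ml";
(: Sketch: try the stored ID first; on XDMP-UNREGISTERED, register
   the query again and retry once with the ID that comes back. :)
let $query := cts:word-query("hello*world", "wildcarded")
let $stored-id := 995175721241192518  (: placeholder ID :)
return
  try {
    xdmp:estimate(cts:search(fn:doc(),
      cts:registered-query($stored-id, "unfiltered")))
  }
  catch ($e) {
    if (fn:contains($e/*:code/text(), "XDMP-UNREGISTERED"))
    then
      let $new-id := cts:register($query)
      return xdmp:estimate(cts:search(fn:doc(),
        cts:registered-query($new-id, "unfiltered")))
    else xdmp:rethrow()  (: any other error propagates unchanged :)
  }
```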

Storing Registered Query IDs

When you register a cts:query expression, the cts:register function returns an integer, which is the ID for the registered query. After the cts:register call returns, there is no way to query the system to find the registered query IDs. Therefore, you might need to store the IDs somewhere. You can either store them in the middleware layer (if you are using XCC to issue queries against MarkLogic Server) or you can store them in a document in MarkLogic Server.

The registered query ID is generated based on a hash of the actual query, so registering the same query multiple times results in the same ID. The registered query ID is valid for all queries against the database across the entire cluster.
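
If you choose to keep IDs in the database, a sketch such as the following may be enough. The URI and the element names are hypothetical; use whatever layout suits your application.

```xquery
xquery version "1.0-ml";
(: Sketch: register the query and record its ID in a small
   bookkeeping document at a hypothetical URI. :)
let $id := cts:register(cts:word-query("hello*world", "wildcarded"))
return xdmp:document-insert(
  "/registered-queries/hello-world.xml",
  <registered-query>
    <name>hello-world</name>
    <id>{$id}</id>
  </registered-query>)
```

A later request can then read the id element from that document and pass its value to cts:registered-query.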

Registered Queries and Relevance Calculations

Searches that use registered queries will generate results having different scores from the equivalent searches using non-registered queries. This is because registered queries are treated as a single term in the relevance calculation. For details on relevance calculations, see Relevance Scores: Understanding and Customizing.

Example: Registering and Using a cts:query Expression

To run a registered query, you first register the query and then run the registered query, specifying it by ID. This section describes some example steps for registering a query and then running the registered query.

  1. First register the cts:query expression you want to run, as in the following example:
    cts:register(cts:word-query("hello*world", "wildcarded"))
  2. The first step returns an integer. Keep track of the integer value (for example, store it in a document).
  3. Use the integer value to run a search with the registered query (with the "unfiltered" option) as follows:
    cts:search(fn:doc(), 
              cts:registered-query(987654321012345678, "unfiltered") ) 

Adding Relevance Information to cts:query Expressions

The leaf-level cts:query APIs (cts:word-query, cts:element-word-query, and so on) have a weight parameter, which allows you to add a multiplication factor to the scores produced by matches from a query. You can use this to increase or decrease the weight factor for a particular query. For details about score, weight, and relevance calculations, see Relevance Scores: Understanding and Customizing.
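
As an illustration, the following sketch boosts matches in a title element relative to matches elsewhere. The element name, the search term, and the weight values are assumptions chosen only to show where the weight parameter goes.

```xquery
xquery version "1.0-ml";
(: Sketch: the last argument of each leaf-level constructor is the
   weight. Matches in the hypothetical title element contribute more
   to the score than matches anywhere else in the document. :)
cts:search(fn:doc(),
  cts:or-query((
    cts:element-word-query(xs:QName("title"), "volvo", (), 2.0),
    cts:word-query("volvo", (), 0.5)
  )))
```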

Serializations of cts:query Constructors

You can create an XML serialization of a cts:query. The XML serialization is used by alerting applications that use a cts:reverse-query constructor and is also useful to perform various programmatic tasks to a cts:query. Alerting applications (see Creating Alerting Applications) find queries that would match nodes, and then perform some action for the query matches. This section describes the serialized XML and includes the following parts:

Serializing a cts:query as XML

A serialized cts:query has XML that conforms to the <marklogic-dir>/Config/cts.xsd schema, which is in the http://marklogic.com/cts namespace, which is bound to the cts prefix. You can either construct the XML directly or, if you use any cts:query expression within the context of an element, MarkLogic Server will automatically serialize that cts:query to XML. Consider the following example:

<some-element>{cts:word-query("hello world")}</some-element>

When you run the above expression, it serializes to the following XML:

<some-element>
  <cts:word-query xmlns:cts="http://marklogic.com/cts">
    <cts:text xml:lang="en">hello world</cts:text>
  </cts:word-query>
</some-element>

If you are using an alerting application, you might choose to store this XML in the database so you can match searches that include cts:reverse-query constructors. For details on alerts, see Creating Alerting Applications.
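
A minimal sketch of that pattern follows. The URI, the wrapper element, and the sample message are hypothetical, and a production alerting application would normally use the alerting APIs rather than this bare approach.

```xquery
xquery version "1.0-ml";
(: Store a serialized query inside a hypothetical wrapper element. :)
xdmp:document-insert("/stored-queries/hello.xml",
  <stored-query>{cts:word-query("hello world")}</stored-query>);

xquery version "1.0-ml";
(: Find the stored queries that would match this candidate node. :)
cts:search(fn:doc(),
  cts:reverse-query(<message>hello world, how are you?</message>))
```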

Serializing a cts.query as JSON

You can construct the JSON representation of a cts query manually, or by applying xdmp.toJsonString to the result of any cts.query constructor call. Consider the following example:

xdmp.toJsonString(cts.wordQuery("hello"))

If you evaluate the above expression in Query Console, you get the following output:

{"wordQuery":{"text":["hello"], "options":["lang=en"]}}

You can also turn a cts query into a JavaScript object in Server-Side JavaScript using the toObject method on the object returned by one of the cts.query constructors. For example, the following expression returns a JavaScript object equivalent to the above JSON.

cts.wordQuery('hello').toObject()

Add Arbitrary Annotations With cts:annotation

You can annotate your cts:query XML with cts:annotation elements. A cts:annotation element can be a child of any element in the cts:query XML, and it can consist of any valid XML content (for example, a single text node, a single element, multiple elements, complex elements, and so on). MarkLogic Server ignores these annotations when processing the query XML, but such annotations are often useful to the application. For example, you can store information about where the query came from, information about parts of the query to use or not in certain parts of the application, and so on. The following is some sample XML with cts:annotation elements:

<cts:and-query xmlns:cts="http://marklogic.com/cts">
  <cts:directory-query>
    <cts:annotation>private</cts:annotation>
    <cts:uri>/myprivate-dir/</cts:uri>
  </cts:directory-query>
  <cts:and-query>
    <cts:word-query><cts:text>hello</cts:text></cts:word-query>
    <cts:word-query><cts:text>world</cts:text></cts:word-query>
  </cts:and-query>
  <cts:annotation>
    <useful>something useful to the application here</useful>
  </cts:annotation>
</cts:and-query>

For another example that uses cts:annotation to store the original query string in a function that generates a cts:query from a string, see the last part of the example in Serializations of cts:query Constructors.
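
Because annotations are ordinary XML, an application can read them with XPath before turning the serialization back into a query. The following sketch assumes a shortened version of the sample above; the annotation text is made up for illustration.

```xquery
xquery version "1.0-ml";
(: Sketch: read the annotation text, then build an in-memory
   cts:query from the same serialized XML. :)
let $serialized :=
  <cts:and-query xmlns:cts="http://marklogic.com/cts">
    <cts:annotation>source: saved-search form</cts:annotation>
    <cts:word-query><cts:text>hello</cts:text></cts:word-query>
    <cts:word-query><cts:text>world</cts:text></cts:word-query>
  </cts:and-query>
return (
  $serialized/cts:annotation/fn:string(),
  cts:query($serialized)
)
```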

Constructing a cts:query From XML

You can turn an XML serialization of a cts:query back into an un-serialized cts:query with the cts:query function. For example, you can turn a serialized cts:query back into a cts:query as follows:

cts:query(
  <cts:word-query xmlns:cts="http://marklogic.com/cts">
    <cts:text>word</cts:text>
  </cts:word-query>
)
(: returns: cts:word-query("word", ("lang=en"), 1) :)

Constructing a cts.query From a JavaScript Object or JSON String

Before you can use a serialized cts.query in a context such as cts.search, you must de-serialize it and turn it back into an in-memory cts.query. When working with a serialized cts.query in Server-Side JavaScript, you will likely have the serialized query in memory as either a JavaScript object or as a JSON string.

To convert a JavaScript object into a cts.query node, pass the object to the cts.query constructor function. The following example artificially constructs a JavaScript object equivalent to the JSON serialization of a cts.query, for purposes of illustration.

const aQueryObject = 
  {wordQuery: {text : ['hello'], options: ['lang=en']}}
cts.query(aQueryObject)

To convert a JSON string cts.query serialization back into a cts.query node, first pass the JSON string through xdmp.fromJsonString, and then to the cts.query constructor function. Note that xdmp.fromJsonString returns a Sequence, so you must use the fn.head function to access the underlying node value. For example:

cts.query(fn.head(
  xdmp.fromJsonString(
    '{"wordQuery":{"text":["hello"], "options":["lang=en"]}}')
))

Example: Creating a cts:query Parser

The following sample code shows a simple query string parser. It treats text enclosed in double quotation marks as a phrase, and treats anything else that is separated by one or more spaces as a single term. If needed, you can use the same design pattern to add more complex parsing logic (for example, OR processing or NOT processing).

xquery version "1.0-ml";
declare function local:get-query-tokens($input as xs:string?) 
  as element() {
  (: Treat double-quoted text as a phrase; split everything else on spaces. :)
  <tokens>{
    (: If the input contains at least two double-quotation marks, tokenize
       on the double-quotation mark ("). The even-numbered tokens are the
       quoted phrases; replace their spaces with the placeholder "!+!" so
       that the later tokenization on spaces keeps each phrase together. :)
    let $newInput := fn:string-join(
      if ( fn:count(fn:tokenize($input, '"')) > 2 )
      then ( for $i at $count in fn:tokenize($input, '"')
             return
               if ($count mod 2 = 0)
               then fn:replace($i, "\s+", "!+!")
               else $i )
      else ( $input ), " ")
    let $tokenInput := fn:tokenize($newInput, "\s+")
    return (
      for $x in $tokenInput
      where $x ne ""
      (: Restore the spaces inside quoted phrases. :)
      return <token>{fn:replace($x, "!\+!", " ")}</token> )
  }</tokens>
} ;

let $input := 'this is a "really big" test'
return
local:get-query-tokens($input)

This returns the following:

<tokens>
  <token>this</token>
  <token>is</token>
  <token>a</token>
  <token>really big</token>
  <token>test</token>
</tokens>
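
For Server-Side JavaScript, the same tokenizing approach can be sketched roughly as follows. This is a minimal illustration, not a MarkLogic API: the function name getQueryTokens is hypothetical, and it returns a plain array of strings rather than a tokens element. It uses only standard JavaScript string methods.

```javascript
// Hypothetical helper: split a query string into terms, keeping
// double-quoted text together as a single phrase.
function getQueryTokens(input) {
  const text = input || '';
  const parts = text.split('"');
  const nonEmpty = (s) => s !== '';

  // Fewer than two quotation marks: split the whole string on whitespace.
  if (parts.length <= 2) {
    return text.split(/\s+/).filter(nonEmpty);
  }

  const tokens = [];
  parts.forEach((part, index) => {
    if (index % 2 === 1) {
      // Odd 0-based positions hold the text that was between quotation marks.
      const phrase = part.replace(/\s+/g, ' ');
      if (nonEmpty(phrase)) tokens.push(phrase);
    } else {
      // Everything else is split into single terms.
      tokens.push(...part.split(/\s+/).filter(nonEmpty));
    }
  });
  return tokens;
}

const tokens = getQueryTokens('this is a "really big" test');
console.log(tokens);
// tokens is ['this', 'is', 'a', 'really big', 'test']
```

As in the XQuery version, a lone unmatched quotation mark is not treated specially and stays attached to its term.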

You can now derive a cts:query expression from the tokenized XML produced above. The following function composes all of the terms with a cts:and-query (it assumes the local:get-query-tokens function above is available):

xquery version "1.0-ml";
declare function local:get-query($input as xs:string) 
{
let $tokens := local:get-query-tokens($input)
return
 cts:and-query( (cts:and-query(
        for $token in $tokens//token
        return 
        cts:word-query($token/text()) ) ))
} ;

let $input := 'this is a "really big" test'
return
local:get-query($input)

This returns the following (spacing and line breaks added for readability):

cts:and-query(
  cts:and-query((
    cts:word-query("this", (), 1), 
    cts:word-query("is", (), 1), 
    cts:word-query("a", (), 1), 
    cts:word-query("really big", (), 1), 
    cts:word-query("test", (), 1)
    ), ()) ,
  () )

You can now take the generated cts:query expression and add it to a cts:search.

Similarly, you can generate a serialized cts:query as follows (assuming the local:get-query-tokens function is available):

xquery version "1.0-ml";
declare function local:get-query-xml($input as xs:string) 
{
let $tokens := local:get-query-tokens($input)
return
 element cts:and-query { 
       element cts:and-query { 
           for $token in $tokens//token
           return 
           element cts:word-query { $token/text() } },
           element cts:annotation {$input} }
} ;

let $input := 'this is a "really big" test'
return
local:get-query-xml($input)

This returns the following XML serialization:

<cts:and-query xmlns:cts="http://marklogic.com/cts">
  <cts:and-query>
    <cts:word-query>this</cts:word-query>
    <cts:word-query>is</cts:word-query>
    <cts:word-query>a</cts:word-query>
    <cts:word-query>really big</cts:word-query>
    <cts:word-query>test</cts:word-query>
  </cts:and-query>
  <cts:annotation>this is a "really big" test</cts:annotation>
</cts:and-query>
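
In Server-Side JavaScript, a comparable serialized query can be sketched as a plain object that follows the JSON form shown in Constructing a cts.query From a JavaScript Object or JSON String. The function name buildAndQueryObject is hypothetical, and the andQuery and queries property names are an assumption about how MarkLogic serializes an and-query to JSON; check them against the output of your own server before relying on them. Unlike the XML example, this sketch builds a single and-query and does not record an annotation.

```javascript
// Hypothetical helper: build a plain object shaped like the JSON
// serialization of an and-query over one word query per term.
// Assumption: the serialized and-query uses an "andQuery" property
// holding a "queries" array.
function buildAndQueryObject(terms) {
  return {
    andQuery: {
      queries: terms.map((term) => ({ wordQuery: { text: [term] } }))
    }
  };
}

const queryObject =
  buildAndQueryObject(['this', 'is', 'a', 'really big', 'test']);
console.log(JSON.stringify(queryObject, null, 2));
```

If the shape matches what your server expects, the object could then be passed to the cts.query constructor function and the result used in cts.search.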