Search Developer's Guide — Chapter 6

Composing cts:query Expressions

Searches in MarkLogic Server use expressions that have a cts:query type. This chapter describes how to create various types of cts:query expressions and how you can register some complex expressions to improve performance of future queries that use the registered cts:query expressions.

MarkLogic Server includes many built-in XQuery functions for composing cts:query expressions. The signatures and descriptions of these APIs are documented in the MarkLogic XQuery and XSLT Function Reference.

This chapter includes the following sections:

Understanding cts:query

The second parameter of cts:search is a cts:query. The contents of the cts:query expression determine the conditions under which a search returns a document or node. This section describes cts:query and includes the following parts:

cts:query Hierarchy

The cts:query type forms a hierarchy, allowing you to construct complex cts:query expressions by combining multiple expressions together. The hierarchy includes composable and non-composable cts:query constructors.

A composable constructor is one that combines multiple cts:query constructors. A leaf-level constructor cannot combine other cts:query constructors, although it can itself be combined with other constructors by a composable constructor.

The following diagram shows the leaf-level cts:query constructors, which are not composable, and the composable cts:query constructors, which you can use to combine both leaf-level and other composable cts:query constructors. The diagram shows most of the available constructors, but not necessarily all of them.

Equivalent constructors exist for Server-Side JavaScript. For example, the JavaScript built-in cts.andQuery is equivalent to the XQuery built-in cts:and-query in the diagram above.

The remainder of this chapter goes into more detail on combining constructors.

Use cts:word-query to Narrow the Search

The core search cts:query API is cts:word-query. The cts:word-query function returns true for words or phrases that match its $text parameter, thus narrowing the search to fragments containing terms that match the query. If needed, you can use other cts:query APIs to combine a cts:word-query expression into a more complex expression. Similarly, you can use the other leaf-level cts:query constructors to narrow the results of a search.

Understanding cts:element-query

The cts:element-query function searches through a specified element and all of its children. It is used to narrow the field of search to the specified element hierarchy, exploiting the XML structure in the data. Also, it is composable with other cts:element-query functions, allowing you to specify complex hierarchical conditions in the cts:query expressions.

For example, the following search against a Shakespeare database returns the title of any play that has SCENE elements that have SPEECH elements containing both the words 'room' and 'castle':

for $x in cts:search(fn:doc(), 
   cts:element-query(xs:QName("SCENE"), 
       cts:element-query(xs:QName("SPEECH"), 
           cts:and-query(("room", "castle")) ) ) ) 
return
($x//TITLE)[1]

This query returns the first TITLE element of the play. The TITLE element is used for both play and scene titles, and the first one in a play is the title of the play.

When you use cts:element-query with both the word positions and element word positions indexes enabled in the Admin Interface, many queries that have multiple term queries (for example, "the long sly fox") run faster because the indexes eliminate some false positive results.

Understanding cts:element-word-query

While cts:element-query searches through an element and all of its children, cts:element-word-query searches only the immediate text node children of the specified element. For example, consider the following XML structure:

<root>
  <a>hello
    <b>goodbye</b>
  </a>
</root>

The following query returns false, because "goodbye" is not an immediate text node of the element named a:

cts:element-word-query(xs:QName("a"), "goodbye")
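
The difference in scope can be illustrated with a toy model in plain JavaScript. This is only a sketch of which text each kind of query can see, not how MarkLogic evaluates queries. The tree representation and the helper names immediateText and descendantText are hypothetical.

```javascript
// Toy model of <a>hello<b>goodbye</b></a>: strings stand for text nodes,
// objects stand for child elements. Illustrative only.
const a = {
  name: 'a',
  children: ['hello', { name: 'b', children: ['goodbye'] }],
};

// Text visible to a cts:element-word-query-style match:
// only the immediate text node children of the element.
function immediateText(element) {
  return element.children.filter((c) => typeof c === 'string').join(' ');
}

// Text visible to a cts:element-query-style match:
// text from the element and from everything nested below it.
function descendantText(element) {
  return element.children
    .map((c) => (typeof c === 'string' ? c : descendantText(c)))
    .join(' ');
}

// "goodbye" is nested inside <b>, so it is not an immediate text child of <a>.
const inImmediate = immediateText(a).includes('goodbye');
const inDescendants = descendantText(a).includes('goodbye');
```

In this model, inImmediate is false and inDescendants is true, which mirrors why the cts:element-word-query above does not match while a cts:element-query on the same element would.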

Understanding Field Word and Value Query Constructors

The cts:field-word-query and cts:field-value-query constructors search in fields for either words or values. A field value is defined as all of the text within a field, with a single space between text that comes from different elements. For example, consider the following XML structure:

<name>
  <first>Raymond</first>
  <middle>Clevie</middle>
  <last>Carver</last>
</name>

If you want to normalize names in the form firstname lastname, then you can create a field on this structure. The field might include the element name and exclude the element middle. The value of this instance of the field would then be Raymond Carver, with a space between the text from the two different element values from first and last. If your document contained other name elements with the same structure, their values would be derived similarly. If the field is named my-field, then a cts:field-value-query("my-field", "Raymond Carver") returns true for documents containing this XML. Similarly, a cts:field-word-query("my-field", "Raymond Carver") returns true.
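
As a rough illustration of how the field value in this example is derived, the following plain JavaScript sketch joins the text of the included elements with a single space. It is a toy model, not MarkLogic's implementation. The data layout and the fieldValue helper are assumptions made for the example.

```javascript
// Toy model of the <name> element above. Illustrative only.
const nameElement = [
  { element: 'first', text: 'Raymond' },
  { element: 'middle', text: 'Clevie' },
  { element: 'last', text: 'Carver' },
];

// Join the text of every child that is not excluded, placing a single
// space between text that comes from different elements.
function fieldValue(children, excluded) {
  return children
    .filter((child) => !excluded.includes(child.element))
    .map((child) => child.text)
    .join(' ');
}

// Excluding "middle" leaves the text from "first" and "last".
const value = fieldValue(nameElement, ['middle']);
```

Here value is the string that a cts:field-value-query on my-field would need to match in this example.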

For more information about fields, see Fields Database Settings in the Administrator's Guide. For information on lexicons on fields, see Field Value Lexicons.

Understanding the Range Query Constructors

The cts:element-range-query, cts:element-attribute-range-query, cts:path-range-query, and cts:field-range-query constructors allow you to specify constraints on a value in a cts:query expression. The range query constructors require a corresponding range index on the specified element, attribute, path, or field. For details on range queries, see Using Range Queries in cts:query Expressions.

Understanding the Reverse Query Constructor

The cts:reverse-query constructor allows you to match queries stored in a database to nodes that would match those queries. Reverse queries are used as the basis for alert applications. For details, see Creating Alerting Applications.

Understanding the Geospatial Query Constructors

The geospatial query constructors are used to constrain cts:query expressions on geospatial data. Geospatial searches are used with documents that have been marked up with latitude and longitude data, and can be used to answer queries like 'show me all of the documents that mention places within 100 miles of New York City.' For details on geospatial searches, see Geospatial Search Applications.

Specifying the Language in a cts:query

All leaf-level cts:query constructors are language-aware; you can either explicitly specify a language value as an option, or it will default to the database default language. The language option specifies the language in which the query is tokenized and, for stemmed searches, the language of the content to be searched.

To specify the language in a cts:query, use the lang=language_code option, where language_code is the two- or three-character ISO 639-1 or ISO 639-2 language code (http://www.loc.gov/standards/iso639-2/php/code_list.php). For example, the following query:

let $x := 
<root>
 <el xml:lang="en">hello</el>
 <el xml:lang="fr">hello</el>
</root>
return
$x//el[cts:contains(., 
         cts:word-query("hello", ("stemmed", "lang=fr")))]

returns only the French-language node:

<el xml:lang="fr">hello</el>

Depending on the language of the cts:query and on the language of the content, a string will tokenize differently, which will affect the search results. For details on how languages and the xml:lang attribute affect tokenization and searches, see Language Support in MarkLogic Server.

Creating a Query From Search Text With cts:parse

This section describes how to create a cts:query from a simple search string using the cts:parse XQuery function or the cts.parse Server-Side JavaScript function. The following topics are covered:

String Query Overview

A string query is a plain text search string ('query text') composed of terms, phrases, and operators that can be easily composed by end users typing into an application search box. For example, 'cat AND dog' is a string query for finding documents that contain both the term 'cat' and the term 'dog'.

You can use the cts:parse XQuery built-in function or the cts.parse Server-Side JavaScript built-in function to convert such a string query into a cts:query (XQuery) or cts.query (JavaScript). Use the resulting query in any interface that accepts a cts query, such as the cts:search XQuery function, the cts.search JavaScript function, and several JSearch API interfaces.

The following example uses cts:parse to match documents that contain the term 'cat' and the term 'dog'.

Language Example
XQuery
cts:search(fn:doc(), cts:parse("cat AND dog"))
JavaScript
// with cts.search
cts.search(cts.parse('cat AND dog'))

// with JSearch
var jsearch = require('/MarkLogic/jsearch.sjs');
jsearch.documents()
  .where(cts.parse('cat AND dog'))
  .result()

The string query grammar supported by cts:parse enables users to compose complex queries. Adjacent terms, phrases, and sub-expressions are implicitly AND'd together.

The following are some examples of queries that work with cts:parse 'out of the box':

  • (cat OR dog) NEAR vet

    at least one of the terms cat or dog within 10 terms (the default distance for cts:near-query) of the word vet

  • dog NEAR/30 vet

    the word dog within 30 terms of the word vet

  • cat -dog

    the word cat where there is no word dog
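
An application that assembles query text on behalf of users might build strings like these with small helpers. The following plain JavaScript sketch shows one way to do so. The helper names (group, or, not, near) are hypothetical and are not part of any MarkLogic API; the resulting strings are what you would pass to cts.parse or cts:parse.

```javascript
// Hypothetical helpers that assemble query text. They only build strings.
function group(query) {
  return '(' + query + ')';
}

function or(...queries) {
  return queries.join(' OR ');
}

function not(query) {
  return '-' + query;
}

// Omit distance to rely on the default NEAR distance.
function near(left, right, distance) {
  const op = distance === undefined ? 'NEAR' : 'NEAR/' + distance;
  return left + ' ' + op + ' ' + right;
}

// Rebuild the three example queries from the list above.
const examples = [
  near(group(or('cat', 'dog')), 'vet'),
  near('dog', 'vet', 30),
  'cat ' + not('dog'),
];
```

Keeping the operators in one place like this makes it harder to emit malformed query text, but it does no validation of the terms themselves.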

You can also bind a tag name to an index reference, lexicon reference, or field name. When such a tag name appears in a query string, it parses to a word, value, or range query that is scoped to the bound entity.

For example, binding the tag 'color' to a cts:reference to a JSON property named bodyColor enables users to create query text like the following:

  • color:red

    Match documents where the value of the bodyColor contains the word 'red'

  • color NE blue

    Match documents where the value of bodyColor is not 'blue'

Without the binding, the above examples are just word queries that include the term 'color'. For example, without a binding, 'color NE blue' becomes a query for documents containing the words 'color', 'NE', and 'blue'.

You can also bind a tag name to a reference to a function that generates a query, giving you more control over the interpretation. For example, you can use a query generator function to scope a query to documents in a particular collection or directory.

For details, see Binding a Tag to a Reference, Field, or Query Generator.

Grammar Components and Operators

This section describes the components and operators you can use in query text passed to cts.parse. Some operators are only available in search terms that involve tags bound to query generators using the parse binding feature.

Basic Components and Operators

The table below describes the basic components and operators recognized by the cts:parse XQuery function and the cts.parse JavaScript function. If you define bindings, then additional operators become available for query expressions using a bound tag. For details, see Operators Usable With Bound Tags.

Note that an empty query string (cts:parse("")) generates an empty cts:and-query that matches everything.

Query Example Description
any adjacent terms
dog
dog tail
"dog tail" cat mouse
dog (cat OR mouse)
Match one or more terms or query expressions, as with a cts:and-query. Adjacent terms and query expressions are implicitly joined with AND. For example, dog tail is the same as dog AND tail.
"phrase"
"dog tail"
"dog tail" "cat whisker"
dog "cat whisker"
Terms in double quotes are treated as a phrase. Adjacent terms and phrases are implicitly joined with AND. For example, dog "cat whisker" matches documents containing both the term dog and the phrase cat whisker.
( )
(cat OR dog) zebra
Parentheses indicate grouping. The example matches documents containing at least one of the terms cat or dog as well as the term zebra.
-query
-dog
-(dog OR cat)
cat -dog
A NOT operation, as with a cts:not-query. For example, cat -dog matches documents that contain the term cat but that do not contain the term dog.
query1 AND query2
dog AND cat
(cat OR dog) AND zebra
Match two query expressions, as with a cts:and-query. For example, dog AND cat matches documents containing both the term dog and the term cat. AND is the default way to combine terms and phrases, so the previous example is equivalent to dog cat.
query1 OR query2
dog OR cat
Match either of two queries, as with a cts:or-query. The example matches documents containing at least one of the terms cat or dog.
query1 NOT_IN query2
dog NOT_IN "dog house"
Match one query when the match does not overlap with another, as with cts:not-in-query. The example matches occurrences of dog when it is not in the phrase dog house.
query1 NEAR query2
dog NEAR cat
(cat food) NEAR mouse
Find documents containing matches to the queries on either side of the NEAR operator when the matches occur within 10 terms of each other, as with a cts:near-query. For example, dog NEAR cat matches documents containing dog within 10 terms of cat.
query1 NEAR/N query2
dog NEAR/2 cat
Find documents containing matches to the queries on either side of the NEAR operator when the matches occur within N terms of each other, as with a cts:near-query. The example matches documents where the term dog occurs within 2 terms of the term cat.
query1 BOOST query2
george BOOST washington
Find documents that match query1. Boost the relevance score of documents that also match query2. The example returns all matches for the term 'george', with matches in documents that also contain 'washington' having a higher relevance score. For more details, see cts:boost-query.
query [opt,opt,...]
cat[min-occurs=5]
Pass options or a weight to the cts query generated for query. For details, see Including Options and Weights in Query Text.

Operators Usable With Bound Tags

When you bind a tag to an index, lexicon, field, or query generator, then you can use the tag name in the ways shown in the following table. If you use these operators in a context in which the left operand is not a tag name, then the 'operator' is simply interpreted as another query term. That is, 'unbound LT value' is a cts:and-query of word queries on the words 'unbound', 'LT', and 'value'.

For more information on defining a binding, see Binding a Tag to a Reference, Field, or Query Generator.

The sub-expressions enabled by these operators can be used in combination with the grammar features described in Basic Components and Operators. You can also associate options with sub-expressions that use tags; for details, see Including Options and Weights in Query Text.

If you bind a tag to a geospatial index reference, the value you compare to the tag can be geospatial point or region. Not all the operators listed below are sensible in a geospatial context. For details, see Binding to a Geospatial Index Reference.

Query Example Description
tag:value
color:red
decade:1980s
birthday:1999-12-31
Matches documents where value satisfies a word query against the reference bound to tag. For example, as with a cts:element-word-query.
tag:(valueList)
color:(red blue)
decade:(1980s 1990s)
Matches documents where at least one value in valueList satisfies a word query against the reference bound to tag. For example, as with a cts:element-word-query.
tag = value
color = red
decade = 1980s
birthday = 1999-12-31
Matches documents where value satisfies a value query against the reference bound to tag. For example, as with a cts:element-value-query.
tag = (valueList)
color = (red blue)
decade = (1980s 1990s)
Matches documents where at least one value in valueList satisfies a value query against the reference bound to tag. For example, as with a cts:element-value-query.
tag EQ value
color EQ red
decade EQ 1980s
birthday EQ 1999-12-31
Matches documents where value satisfies a range query with the '=' operator against the reference bound to tag. For example, as with a cts:element-range-query.
tag EQ (valueList)
color EQ (red blue)
decade EQ (1980s 1990s)
Matches documents where at least one value in valueList satisfies a range query with the '=' operator against the reference bound to tag. For example, as with a cts:element-range-query.
tag NE value
color NE red
birthday NE 1999-12-31
Matches documents where value satisfies a range query with the '!=' operator against the reference bound to tag. For example, as with a cts:element-range-query.
tag LT value
color LT red
birthday LT 1999-12-31
Matches documents where value satisfies a range query with the '<' operator against the reference bound to tag. For example, as with a cts:element-range-query.
tag LE value
color LE red
birthday LE 1999-12-31
Matches documents where value satisfies a range query with the '<=' operator against the reference bound to tag. For example, as with a cts:element-range-query.
tag GT value
color GT red
birthday GT 1999-12-31
Matches documents where value satisfies a range query with the '>' operator against the reference bound to tag. For example, as with a cts:element-range-query.
tag GE value
color GE red
birthday GE 1999-12-31
Matches documents where value satisfies a range query with the '>=' operator against the reference bound to tag. For example, as with a cts:element-range-query.
query [opt,opt,...]
color:(red,blue)[unstemmed]
Pass options or a weight to the cts query generated for query. For details, see Including Options and Weights in Query Text.

Including Options and Weights in Query Text

Your query text can include query options or a weight that are passed through to the query generated by cts:parse. This is an advanced feature that you would not typically expose directly to end users. To use this feature, put the options or weight in brackets after the query term:

query[option1, ..., optionN]

To specify a weight, use 'weight=N'. See the examples below.

The following table provides some examples of passing options and weights in query text.

Query Text Generated Query
cat
cts:word-query("cat", ("lang=en"), 1)
cat[case-sensitive]
cts:word-query(
  "cat", ("case-sensitive","lang=en"), 1)
chat[stemmed,lang=fr]
cts:word-query("chat", ("stemmed","lang=fr"), 1)
cat AND dog
cts:and-query(
  (cts:word-query("cat", ("lang=en"), 1),
   cts:word-query("dog", ("lang=en"), 1)
  ),("unordered"))
cat[min-occurs=3] AND
  dog[weight=2]
cts:and-query(
  (cts:word-query("cat",
    ("lang=en","min-occurs=3"), 1),
   cts:word-query("dog", ("lang=en"), 2)
  ),("unordered"))
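
Because this feature is usually not exposed to end users, an application might add the bracketed options itself before parsing. The following plain JavaScript sketch shows one way to do that. The withOptions helper is hypothetical; it only builds the query text and does not check that the option names are valid for the generated query.

```javascript
// Hypothetical helper: append options and an optional weight to a term
// using the query[option1, ..., optionN] syntax described above.
function withOptions(term, options = [], weight) {
  const parts = options.slice();
  if (weight !== undefined) {
    parts.push('weight=' + weight);
  }
  // Leave the term untouched when there is nothing to add.
  return parts.length === 0 ? term : term + '[' + parts.join(',') + ']';
}

// Rebuild the last example from the table above.
const queryText =
  withOptions('cat', ['min-occurs=3']) + ' AND ' + withOptions('dog', [], 2);
```

The resulting queryText can then be passed to cts.parse in place of hand-written query text.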

Binding a Tag to a Reference, Field, or Query Generator

This topic describes how to define parse bindings that enable the use of specially scoped relational and comparison operators in query text passed to the cts:parse XQuery function or cts.parse Server-Side JavaScript function. You can create bindings to XML elements, XML element attributes, JSON properties, fields, and paths, as well as to custom parsing functions.

The following topics are covered:

Binding Overview

The cts:parse XQuery function and the cts.parse JavaScript function accept an optional second parameter that is a set of bindings between a tag and a content reference, field name, or a query generator function. When you use the tag in query text, cts:parse (cts.parse) uses the binding to generate a query based on the bound reference, field, or function.

In XQuery, bindings are represented by a map with the tag names as the keys. In JavaScript, the bindings are represented by a JavaScript object with the tag names as the object property names. For example, the following code snippet binds the tag 'by' to an XML element/JSON property named 'author':

Language Example
XQuery
let $bindings := map:map()
let $_ := map:put(
  $bindings, 
  "by", cts:element-reference(xs:QName("author")))
return $bindings
JavaScript
var bindings =
  { by: cts.jsonPropertyReference('author') };

Given the above binding, you can use 'by' in query text to represent the value of the 'author' element or property. For example, the following query text parses to a cts:element-word-query (or cts.jsonPropertyWordQuery) for the phrase 'mark twain' in the 'author' XML element or JSON property.

by:"mark twain"

The example above uses an element reference in XQuery and a JSON property reference in JavaScript, but your choice of query language does not limit you to a particular reference type. For example, you can create a binding with cts:json-property-reference in XQuery and with cts.elementReference in JavaScript.

You can examine the serialized output from cts:parse in Query Console to observe the results of using a bound tag in query text. For example, passing the above query text and bindings to cts:parse yields the results shown below:

Language Example
XQuery
xquery version "1.0-ml";
let $bindings := map:map()
let $_ := map:put(
  $bindings, 
  "by", cts:element-reference(xs:QName("author")))
return cts:parse('by:"mark twain"', $bindings)

(: emits 
 : cts:element-word-query(fn:QName("","author"), "mark twain")
 :)
JavaScript
var bindings =
  { by: cts.jsonPropertyReference('author') };
cts.parse('by:"mark twain"', bindings)

// emits
// cts.jsonPropertyWordQuery("author", "mark twain")

You get this result because the ':' operator signifies comparison as per a word query, and the binding dictates the word query is scoped to a specific JSON property. Thus, the combination of the operator and the bound reference determines the generated query. For details, see Binding to a cts:reference.

The ':', '=', and 'EQ' operators also accept a grouping of values, which is handled like an OR. For example, the following query matches documents where the 'author' JSON property contains either the word 'twain' or the word 'frost':

by:(twain frost)
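
If your application generates query text for bound tags programmatically, a small helper can keep the operator and grouping syntax consistent. The following plain JavaScript sketch is one possible approach. The names tagTerm and quoteIfNeeded are hypothetical, and the helper does not escape double quotes embedded in a value.

```javascript
// Wrap a value in double quotes when it contains whitespace, so that it
// is treated as a phrase.
function quoteIfNeeded(value) {
  return /\s/.test(value) ? '"' + value + '"' : value;
}

// Only these operators accept a grouping of values.
const GROUPABLE = [':', '=', 'EQ'];

// Hypothetical helper: build query text for a bound tag.
function tagTerm(tag, operator, values) {
  if (values.length > 1 && !GROUPABLE.includes(operator)) {
    throw new Error('Operator ' + operator + ' does not accept a value list');
  }
  const quoted = values.map(quoteIfNeeded);
  const operand =
    quoted.length === 1 ? quoted[0] : '(' + quoted.join(' ') + ')';
  // ':' is written without surrounding spaces; other operators are spaced.
  return operator === ':'
    ? tag + ':' + operand
    : tag + ' ' + operator + ' ' + operand;
}

const phrase = tagTerm('by', ':', ['mark twain']);
const either = tagTerm('by', ':', ['twain', 'frost']);
```

The strings in phrase and either correspond to the two query text examples shown in this section.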

If you define a binding with an empty string as the tag, the binding applies to unqualified terms like 'cat'. For details, see Customizing Naked Term Handling With Bindings.

Binding to a simple string is similar, but the bound entity in that case is a field. For details, see Binding to a Field by Simple Name.

For a complete mapping of reference type and operator to query type, refer to the reference documentation for cts:parse in the MarkLogic XQuery and XSLT Function Reference or cts.parse in the MarkLogic Server-Side JavaScript Function Reference.

If the default query mapping does not satisfy the requirements of your application, you can bind a tag to a query generator function instead. Binding a tag to a function that generates a cts query gives you more control over the interpretation of a query sub-expression and enables using the following operators in query text: ':', '=', 'LT', 'LE', 'GT', 'GE', 'EQ', 'NE'.

The bound function is expected to generate a cts:query (or cts.query) from the operator and operands. For example, you could cause the query text 'by:"mark twain"' to match 'mark twain' in the author property only when the phrase occurs in documents in a specific collection. For details, see Binding to an XQuery Query Generator Function or Binding to a JavaScript Query Generator Function.

Function binding is designed to enable you to override the default query selection when a tag is bound to a reference or simple string. It is not a general purpose grammar extender. For example, you cannot define new operators or change the number of operands expected by an operator.

Binding to a cts:reference

You can bind a tag to a cts:reference by using any cts:reference constructor. This enables you to bind a tag to an XML element or element attribute, JSON property, field, or path. Query expressions using the tag can parse to a word query, value query, or range query, depending on the operator context.

For example, the following code binds the tag 'cost' to an XML element or JSON property named 'price', then uses the 'cost' tag in the query expression 'cost LT 15'. The use of the tag with the 'LT' operator causes the expression to parse to a range query, so the database configuration should include a range index on 'price' with type 'float'.

Language Example
XQuery
xquery version "1.0-ml";
let $bindings := map:map()
let $_ := map:put(
  $bindings, 
  "cost", cts:element-reference(xs:QName("price")))
return cts:parse('cost LT 15', $bindings)

(: cts:element-range-query(
     fn:QName("","price"), "<", xs:float("15"), (), 1) 
:)
JavaScript
var bindings =
  { cost: cts.jsonPropertyReference('price') };
cts.parse('cost LT 15', bindings)

// cts.jsonPropertyRangeQuery(
//    "price", "<", xs.float("15"), [], 1)

If you use the binding in a different operator context, the parser generates a different kind of query. For example, the ':' operator generates a word query in most cases, so the query text 'cost:15' parses to a cts:element-word-query or cts.jsonPropertyWordQuery, similar to the following:

cts:element-word-query(fn:QName("","price"), "15", ("lang=en"), 1)

cts.jsonPropertyWordQuery("price", "15", ["lang=en"], 1)

If you bind a tag to a geospatial index reference, the ':' operator generates a geospatial query. For details, see Binding to a Geospatial Index Reference.

For a complete list of the types of query generated by each operator, refer to cts:parse in the MarkLogic XQuery and XSLT Function Reference or cts.parse in the MarkLogic Server-Side JavaScript Function Reference.

By default, the parser checks for the existence of a backing index or lexicon for each cts reference when it processes your bindings. Though it is usually beneficial to have a backing index for a binding, you can suppress the index check if you want to defer index creation or know you will never use the binding in a search context that actually requires an index. For example, range queries always require an index, but a word query does not necessarily require one. If you use an unchecked binding to create a query that requires an index, you will still get an error when you use the query in a search.

To suppress the parse time index check, add the 'unchecked' and 'type' options when creating the reference. The 'type' option is required because the parser can no longer derive this information from the index definition. The following example illustrates the parse time check vs. the search time check:

Language Example
XQuery
(: parse time XDMP-ELEMRIDXNOTFOUND if no range index exists:)
xquery version "1.0-ml";
let $bindings := map:map()
let $_ := map:put(
  $bindings, 
  "cost", cts:element-reference(xs:QName("price")))
return cts:parse('cost LT 15', $bindings);

(: search time XDMP-ELEMRIDXNOTFOUND :)
xquery version "1.0-ml";
let $bindings := map:map()
let $_ := map:put(
  $bindings, "cost", cts:element-reference(xs:QName("price"),
    ("type=float","unchecked")))
return cts:search(fn:doc(), cts:parse('cost LT 15', $bindings))
JavaScript
// parse time XDMP-ELEMRIDXNOTFOUND if no range index exists
cts.parse('cost LT 15', {cost: cts.jsonPropertyReference('price')})

// Suppress the parse time index check
var query = cts.parse('cost LT 15', 
  {cost: cts.jsonPropertyReference(
    'price',['type=float','unchecked'])})
// But will still get search time error if no range index found
cts.search(query)        // XDMP-ELEMRIDXNOTFOUND

Binding to a Field by Simple Name

You can bind to a field by name or by cts:reference. This section describes how to bind to a field by name. To use a reference constructor instead, see Binding to a cts:reference.

When you bind a tag to a simple string, the string is interpreted as the name of a field. The database configuration should include a corresponding field definition.

For example, the following binds the tag 'name' to a field named 'person':

Language Example
XQuery
xquery version "1.0-ml";
let $bindings := map:map()
let $_ := map:put(
  $bindings, 
  "name", "person")
return cts:parse('name:"jane doe"', $bindings)
(: cts:field-word-query("person", "jane doe", ("lang=en"), 1) :)
JavaScript
var bindings = { name: 'person' };
cts.parse('name:"jane doe"', bindings)

// cts.fieldWordQuery("person", "jane doe", ["lang=en"], 1)

When you use the bound tag, it will parse to a cts:field-word-query, cts:field-value-query, or cts:field-range-query, depending on the operator context. If you use the tag name in a context that parses to a range query, you will get an error if the database configuration does not include a corresponding field range index.

To learn more about fields, see Fields Database Settings in the Administrator's Guide.

For a complete list of the kinds of query generated by the supported (cts:reference, operator) pairs, refer to cts:parse in the MarkLogic XQuery and XSLT Function Reference or cts.parse in the MarkLogic Server-Side JavaScript Function Reference.

Binding to a Geospatial Index Reference

If you bind a tag (or naked terms) to a cts:reference to a geospatial index, you can construct query terms that represent a geospatial query. For example, you can match documents that include a point that is contained within a region defined in the query text, or documents containing a point equal to a point defined in the query text. You can define a point, circle, box, or polygon in the query text.

For example, if you bind the tag 'loc' to a geospatial index on an XML element or JSON property, then the following query text matches documents where the indexed element or property contains a point within a circle defined by a radius and a center point, using the syntax '@radius lat,lon':

loc:"@5 37.5,-122.4"

The following code demonstrates how to define the binding and parse the above query text. In this example, the tag 'loc' is bound to a geospatial index on an XML element or JSON property named 'incidents'. The resulting query matches documents containing points in the 'incidents' element or property contained within the circle with center (37.5,-122.4) and a radius of 5 miles.

Language Example
XQuery
xquery version "1.0-ml";
let $bindings := map:map()
let $_ := map:put(
  $bindings, 
  "loc", cts:geospatial-element-reference(xs:QName("incidents")))
return cts:parse('loc:"@5 37.5,-122.4"', $bindings)
(: cts:element-geospatial-query(
 :   fn:QName("","incidents"), 
 :   cts:circle("@5 37.5,-122.4"),
 :   ("coordinate-system=wgs84"), 1) 
 :)
JavaScript
cts.parse('loc:"@5 37.5,-122.4"', 
  { loc: cts.geospatialJsonPropertyReference('incidents') })

// cts.jsonPropertyGeospatialQuery(
//    "incidents", 
//    cts.circle("@5 37.5,-122.4"),
//    ["coordinate-system=wgs84"], 1)

Geospatial search terms can use either the MarkLogic representation of a point or region, or Well-Known Text (WKT). The following examples of query text represent the same underlying cts query for finding points contained within a polygon. The first uses the MarkLogic serialization of a cts:polygon, and the second uses the WKT representation of the same polygon.

"0,0 0,10 10,10 10,0 0,0"

"POLYGON((0 0, 0 10, 10 10, 10 0, 0 0))"

For more details about using WKT in MarkLogic, see Converting To and From Common Geospatial Representations.

The following table summarizes the MarkLogic syntax for defining points and regions in query text. This syntax is the serialization of cts:point, cts:circle, cts:box, and cts:polygon in XQuery; and of cts.point, cts.circle, cts.box, and cts.polygon in JavaScript. For details, see the corresponding region constructors and Constructing Geospatial Point and Region Values.

Geospatial Entity Pattern Example
point
lat,lon
tag:"37.5, -122.4"
circle
@radius lat,lon
tag:"@5 37.5,-122.4"
box
[sbound, wbound, nbound, ebound]
tag:"[45, -122, 78, 30]"
polygon
lat1,lon1 lat2,lon2 ...latN,lonN
tag:"100,0 101,0 101,1 100,1 100,0"
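
The patterns in the table can be produced from numeric coordinates with a few string helpers. The following plain JavaScript sketch is one way to do that. The helper names (point, circle, box, polygon) are hypothetical string builders, not the cts.point, cts.circle, cts.box, or cts.polygon constructors, and they perform no range checking on the coordinates.

```javascript
// Hypothetical helpers that format geospatial values for use in query
// text, following the patterns in the table above.
function point(lat, lon) {
  return lat + ',' + lon;
}

function circle(radius, lat, lon) {
  return '@' + radius + ' ' + point(lat, lon);
}

function box(south, west, north, east) {
  return '[' + [south, west, north, east].join(', ') + ']';
}

// vertices is an array of [lat, lon] pairs.
function polygon(vertices) {
  return vertices.map(([lat, lon]) => point(lat, lon)).join(' ');
}

// Query text for the circle example, assuming a tag named 'loc' is bound
// to a geospatial index.
const locTerm = 'loc:"' + circle(5, 37.5, -122.4) + '"';
```

The point helper omits the space after the comma that appears in the table's point example; the circle and polygon examples in the table use the same compact form.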

When you use a tag bound to a geospatial index with the ':' operator, the parse function produces a geospatial query. The type of query depends on the type of geospatial reference in the tag binding.

The following list shows the relationship between the reference type and geospatial query in XQuery:

The following list shows the relationship between the reference type and geospatial query in Server-Side JavaScript:

For more information on searching geospatial data, see Geospatial Search Applications.

Binding to an XQuery Query Generator Function

A query generator function should implement the following interface:

function (
  $operator as xs:string,
  $values as xs:string*,
  $options as xs:string*
) as cts:query?

If your function does not return a value, the query sub-expression is interpreted as text.

The following example adds a cts:collection-query to the search, corresponding to each term in the query text that is qualified by the tag name 'cat' (as in 'category'). If an unsupported category name is supplied, an error is thrown. If the operator is not ':' or 'EQ', no value is returned.

xquery version "1.0-ml";

(: The query generator :)
declare function local:scope-to-coll(
  $operator as xs:string,
  $values as xs:string*,
  $options as xs:string*)
as cts:query?
{
  if ($operator = (":", "EQ")) then
    let $known := ("classics", "fiction", "poetry")
    return cts:collection-query(
      for $c in ($values) 
      return
        if ($c = $known)
        then $c
        else fn:error(
          xs:QName("ERROR"), 
          fn:concat("Unrecognized category: ", $c))
    )
  else ()       (: unsupported operator :)
};

(: how to use it :)
let $bindings := map:map()
let $_ := map:put($bindings, "cat", local:scope-to-coll#3)
return cts:parse('cat EQ classics california', $bindings)
(: matches docs in the "classics" collection that contain "california" :)

This query generator function produces the following results:

Query Text Result
cat:classics

cat EQ classics
cts:collection-query("classics")
cat:unrecognized
None - function reports an error
cat LT anything
(: interpreted as text :)
cts:and-query((
  cts:word-query("cat", ("lang=en"), 1),
  cts:word-query("LT", ("lang=en"), 1),
  cts:word-query("anything", ("lang=en"), 1)
  ), ("unordered"))

Binding to a JavaScript Query Generator Function

A query generator function should implement the following interface:

function (operator, values, options)

Where operator is a string containing the operator token, and values and options are either a single value or a (possibly empty) ValueIterator.

Your function can return a cts.query, return nothing, or throw an error by calling fn.error. If you return nothing, the sub-expression is interpreted as text.

The following example adds a cts.collectionQuery to the search, corresponding to each term in the query text that is qualified by the tag name 'cat' (as in 'category'). If an unsupported category name is supplied, an error is thrown. If the operator is not ':' or 'EQ', no value is returned.

function scopeToColl(operator, category, options) {
  if (operator === ':' || operator === 'EQ') {
    // normalize input, which can be one val or an iterator
    var categories = 
        (category instanceof ValueIterator) 
        ? category.toArray() : [category];
    var known = ['classics', 'fiction', 'poetry'];
    var collections = [];
    categories.forEach(function (c) {
      if (known.indexOf(c) !== -1) {
        collections.push(c);
      } else {
        fn.error('ERROR', 'Unrecognized category: ' + c);
      }
    });
    return cts.collectionQuery(collections);
  }
  // else, unsupported operator, so return nothing
};

var bindings = { cat: scopeToColl };
cts.parse('cat:(classics poetry) california', bindings)

This query generator function produces the following results:

Query Text Result
cat:classics

cat EQ classics
cts.collectionQuery('classics')
cat:unrecognized
None - function reports an error
cat LT anything
// Function returns nothing, phrase interpreted as text
// by cts.parse
cts.andQuery(
  [cts.wordQuery("cat", ["lang=en"], 1),
   cts.wordQuery("LT", ["lang=en"], 1),
   cts.wordQuery("anything", ["lang=en"], 1)],
  ["unordered"])

The values in the second parameter may be strings or numbers. If a term in the query text can be represented as a number, then your function receives it as a number. Otherwise, the term is a string.

The following table illustrates how several variations on query text are interpreted and passed as input to your query generator:

Query Text | Function Parameter Values
tag LT value | operator: 'LT'; values: value; options: an empty ValueIterator
tag = (val1 val2) | operator: '='; values: ValueIterator over val1 and val2; options: an empty ValueIterator
tag:42 | operator: ':'; values: 42 as a number; options: an empty ValueIterator
tag:true | operator: ':'; values: 'true' (string, not boolean); options: an empty ValueIterator
tag:value[opt] | operator: ':'; values: value; options: 'opt'
tag LT value[opt1,opt2=42] | operator: 'LT'; values: value; options: a ValueIterator over 'opt1' and 'opt2=42'
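
Because values and options can each arrive as a single value or as an iterator, a generator usually normalizes them first. The following is a minimal sketch in plain JavaScript. The helper names toList and parseOptions are hypothetical, and the sketch assumes only that an iterator exposes toArray(), as the ValueIterator in the earlier example does; a stand-in object is used so the code runs outside MarkLogic.

```javascript
// Normalize a generator argument that may be absent, a single value,
// or an iterator-like object exposing toArray() (as ValueIterator does).
function toList(arg) {
  if (arg === undefined || arg === null) return [];
  if (typeof arg.toArray === 'function') return arg.toArray();
  return Array.isArray(arg) ? arg : [arg];
}

// Split option strings such as 'opt1' and 'opt2=42' into a lookup object.
// Bare options map to true; key=value options keep the value as a string.
function parseOptions(options) {
  const result = {};
  for (const opt of toList(options)) {
    const text = String(opt);
    const eq = text.indexOf('=');
    if (eq === -1) {
      result[text] = true;
    } else {
      result[text.slice(0, eq)] = text.slice(eq + 1);
    }
  }
  return result;
}

// Stand-in for a ValueIterator, used only to run this sketch.
const fakeIterator = { toArray: () => ['opt1', 'opt2=42'] };

console.log(parseOptions(fakeIterator)); // { opt1: true, opt2: '42' }
console.log(toList(42));                 // [ 42 ]
```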

Customizing Naked Term Handling With Bindings

You can use bindings to control the interpretation of terms in query text that are not qualified by a tag (naked terms). For example, in query text such as 'cat AND dog', 'cat' and 'dog' are naked terms. The default interpretation of this query text is a query that matches the terms 'cat' and 'dog' anywhere they appear, similar to the following:

cts:and-query((cts:word-query('cat'), cts:word-query('dog')))

If you create a binding with the empty string as the tag, you can customize the handling of terms that have no tag qualifier in the same way you can customize the interpretation of a defined tag. For example, you can configure the parser to scope the terms 'cat' and 'dog' to a particular XML element or JSON property.

You can bind naked terms to a content reference, field name, or a query generator function, just as when using a tag.

The following examples constrain naked terms to occurrences in an XML element/JSON property named 'title'.

Language Example
XQuery
xquery version "1.0-ml";
let $bindings := map:map()
let $_ := map:put($bindings, "",
  cts:element-reference(xs:QName("title")))
return cts:parse('cat AND dog', $bindings)

(:
  cts:and-query((
    cts:element-word-query(fn:QName("","title"),"cat",("lang=en"),1),
    cts:element-word-query(fn:QName("","title"),"dog",("lang=en"),1)
    ), ("unordered"))
:)
JavaScript
cts.parse(
  'cat AND dog',
  {'': cts.jsonPropertyReference('title')}
)

// cts.andQuery([
//    cts.jsonPropertyWordQuery("title", "cat", ["lang=en"], 1),
//    cts.jsonPropertyWordQuery("title", "dog", ["lang=en"], 1)
//  ],
//  ["unordered"])

For more details on using bindings, see Binding a Tag to a Reference, Field, or Query Generator.

Query Text Parsing Examples

This section illustrates the output from the cts:parse XQuery function or cts.parse JavaScript function for various inputs. For examples of queries that include option values, see Including Options and Weights in Query Text.

You can use a query similar to the following in Query Console to explore the parser output on your own. The bindings are only needed for the examples that use the 'color' or 'loc' tag. To parse some of the query text that uses the bound tags, you need to define an element range index on the 'body-color' XML element or 'bodyColor' JSON property, and a geospatial element range index on an XML element or JSON property named 'incidents'.

Query Language Query Template
XQuery
xquery version "1.0-ml";
let $bindings := map:map()
let $_ := 
  map:put($bindings, 
    "color", cts:element-reference(xs:QName("body-color")))
return cts:parse(queryText, $bindings)
JavaScript
cts.parse(queryText, 
  { color: cts.jsonPropertyReference('bodyColor') })

The following table contains examples of input query text and the result returned by the parser.

Query Text cts:parse Output(XQuery) cts.parse Output(JavaScript)
cat
cts:word-query(
  "cat", ("lang=en"), 1)
cts.wordQuery(
  "cat", ["lang=en"], 1)
cat dog
cat AND dog
cts:and-query((
  cts:word-query(
    "cat", ("lang=en"), 1),
  cts:word-query(
    "dog", ("lang=en"), 1)
  ), ("unordered"))
cts.andQuery([
  cts.wordQuery(
    "cat", ["lang=en"], 1),
  cts.wordQuery(
    "dog", ["lang=en"], 1)
  ], ["unordered"])
cat dog OR mouse
cts:or-query((
  cts:and-query((
    cts:word-query(
      "cat",("lang=en"),1),
    cts:word-query(
      "dog",("lang=en"), 1)
    ), ("unordered")),
  cts:word-query(
    "mouse",("lang=en"),1)
  ), ())
cts.orQuery([
  cts.andQuery([
    cts.wordQuery(
      "cat",["lang=en"],1),
    cts.wordQuery(
      "dog", ["lang=en"],1)
    ], ["unordered"]),
  cts.wordQuery(
    "mouse", ["lang=en"],1)
  ], [])
cat (dog OR mouse)
cts:and-query((
  cts:word-query(
    "cat", ("lang=en"), 1),
  cts:or-query((
    cts:word-query(
      "dog",("lang=en"),1),
    cts:word-query(
      "mouse",("lang=en"),
      1)
    ), ())
  ), ("unordered"))
cts.andQuery([
  cts.wordQuery(
    "cat", ["lang=en"], 1),
  cts.orQuery([
    cts.wordQuery(
      "dog",["lang=en"],1),
    cts.wordQuery(
     "mouse",["lang=en"],1)
    ], [])
  ], ["unordered"])
cat -dog
cts:and-query((
  cts:word-query(
    "cat", ("lang=en"), 1),
  cts:not-query(
    cts:word-query(
      "dog", ("lang=en"),
      1),
    1)
  ), ("unordered"))
cts.andQuery([
  cts.wordQuery(
    "cat", ["lang=en"], 1),
  cts.notQuery(
    cts.wordQuery(
     "dog", ["lang=en"],1),
    1)
  ], ["unordered"])
color:red
cts:element-word-query(
  fn:QName("","body-color"),
  "red", ("lang=en"), 1)
cts.jsonPropertyWordQuery(
  "bodyColor", "red",
  ["lang=en"], 1)
color = red
cts:element-value-query(
  fn:QName("","body-color"),
  "red", ("lang=en"), 1)
cts.jsonPropertyValueQuery(
  "bodyColor", "red",
  ["lang=en"], 1)
color EQ red
cts:element-range-query(
  fn:QName("","body-color"), 
  "=", "red",
  ("collation=..."), 1)
cts.jsonPropertyRangeQuery(
  "bodyColor", "=", "red",
  ["collation=..."], 1)
color:(red blue)
cts:element-word-query(
  fn:QName("","body-color"),
  ("red", "blue"),
  ("lang=en"), 1)
Matches if body-color contains either red or blue.
cts.jsonPropertyWordQuery(
  "bodyColor", ["red", "blue"],
  ["lang=en"], 1)
Matches if bodyColor contains either red or blue.
loc:"100.0,1.0"
cts:element-geospatial-query(
  fn:QName("","incidents"),
  cts:point("100,1"),
  ("coordinate-system=wgs84"),
  1)
cts.jsonPropertyGeospatialQuery(
  "incidents",
  cts.point("100,1"),
  ["coordinate-system=wgs84"],
  1)
loc:"[10,20,30,40]"
cts:element-geospatial-query(
  fn:QName("","incidents"),
  cts:box("[10, 20, 30, 40]"),
  ("coordinate-system=wgs84"),
  1)
cts.jsonPropertyGeospatialQuery(
  "incidents",
  cts.box("[10,20,30,40]"),
  ["coordinate-system=wgs84"],
  1)

Combining multiple cts:query Expressions

Because cts:query expressions are composable, you can combine multiple expressions to form a single expression. There is no limit to how complex you can make a cts:query expression. Any API that has a return type of cts:* (for example, cts:query, cts:and-query, and so on) can be composed with another cts:query expression to form another expression. This section has the following parts:

Using cts:and-query and cts:or-query

You can construct arbitrarily complex boolean logic by combining cts:and-query and cts:or-query constructors in a single cts:query expression.

For example, the following search with a relatively simple nested cts:query expression will return all fragments that contain either the word alfa or the word maserati, and also contain either the word saab or the word volvo.

cts:search(fn:doc(),
  cts:and-query( ( cts:or-query(("alfa", "maserati")), 
                   cts:or-query(("saab", "volvo") )
  ) )
)

Additionally, you can use cts:and-not-query and cts:not-query to add negation to your boolean logic.
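
To make the boolean composition concrete, the following toy model in plain JavaScript treats each query as a predicate over the set of words in a document. It illustrates only the logic of nesting and negation; it does not use the cts API and says nothing about how MarkLogic resolves queries from its indexes. The names word, orQuery, andQuery, and notQuery are hypothetical.

```javascript
// Toy model only: a "query" is a function from a document's word set
// to true/false. These are not the cts constructors.
const word = (w) => (doc) => doc.has(w);
const orQuery = (...queries) => (doc) => queries.some((q) => q(doc));
const andQuery = (...queries) => (doc) => queries.every((q) => q(doc));
const notQuery = (query) => (doc) => !query(doc);

// (alfa OR maserati) AND (saab OR volvo)
const cars = andQuery(
  orQuery(word('alfa'), word('maserati')),
  orQuery(word('saab'), word('volvo'))
);

// (alfa OR maserati) AND NOT fiat
const noFiat = andQuery(
  orQuery(word('alfa'), word('maserati')),
  notQuery(word('fiat'))
);

const docs = [
  new Set(['alfa', 'volvo']),
  new Set(['maserati', 'saab', 'fiat']),
  new Set(['saab', 'volvo']),
];

console.log(docs.map((d) => cars(d)));   // [ true, true, false ]
console.log(docs.map((d) => noFiat(d))); // [ true, false, false ]
```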

Proximity Queries using cts:near-query

You can add tests for proximity to a cts:query expression using cts:near-query. Proximity queries use the word positions index in the database and, if you are using cts:element-query, the element word positions index. Proximity queries will still work without these indexes, but the indexes will speed performance of queries that use cts:near-query.

Proximity queries return true if the query matches occur within the specified distance from each other. For more details, see the MarkLogic XQuery and XSLT Function Reference for cts:near-query.

Using Bounded cts:query Expressions

The following cts:query constructors allow you to bound a cts:query expression to one or more documents, a directory, or one or more collections.

cts:document-query
cts:directory-query
cts:collection-query

These bounding constructors allow you to narrow a set of search results as part of the second parameter to cts:search. Bounding the query in the cts:query expression is much more efficient than filtering results in a where clause, and is often more convenient than modifying the XPath in the first cts:search parameter. To combine a bounded cts:query constructor with another constructor, use a cts:and-query or a cts:or-query constructor.

For example, the following constrains a search to a particular directory, returning the URI of the document(s) that match the cts:query.

for $x in cts:search(fn:doc(), 
   cts:and-query((
     cts:directory-query("/shakespeare/plays/", "infinity"), 
         "all's well that"))
)
return xdmp:node-uri($x)

This query returns the URI of all documents under the specified directory that satisfy the query "all's well that".

In this query, the query "all's well that" is equivalent to a cts:word-query("all's well that").

Matching Nothing and Matching Everything

An empty cts:word-query will always match no fragments, and an empty cts:and-query will always match all fragments. Therefore the following are true:

cts:search(fn:doc(), cts:word-query("") )
=> returns the empty sequence
cts:search(fn:doc(), "" )
=> returns the empty sequence
cts:search(fn:doc(), cts:and-query( () ) )
=> returns every fragment in the database

You can also use cts:true-query and cts:false-query to match everything or nothing. For example:

cts:search(fn:doc(), cts:false-query())
=> returns the empty sequence

cts:search(fn:doc(), cts:true-query())
=> returns every fragment in the database

One use for an empty cts:word-query is a search box into which an end user enters search terms. If the user enters nothing and clicks the submit button, the corresponding cts:search returns no hits.

An empty cts:and-query or a cts:true-query is sometimes useful when you need a cts:query that matches everything.

Joining Documents and Properties with cts:properties-query or cts:document-fragment-query

You can use a cts:properties-query to match content in a properties document. If you are searching over a document, then a cts:properties-query will search in the properties document at the URI of the document. The cts:properties-query joins the properties document with its corresponding document. The cts:properties-query takes a cts:query as a parameter, and that query is used to match against the properties document. A cts:properties-query is composable, so you can combine it with other cts:query constructors to create arbitrarily complex queries.

Using a cts:properties-query in a cts:search, you can easily create a query that returns results that join content in a document with content in the corresponding properties document. For example, consider a document that represents a chapter in a book, and the document has properties containing the publisher of the book. You can then write a search that returns documents that match a cts:query where the document has a specific publisher, as in the following example:

cts:search(collection(), cts:and-query((
  cts:properties-query(
    cts:element-value-query(xs:QName("publisher"), "My Press") ),
  cts:word-query("a small good thing") )) )

This query returns all documents with the phrase a small good thing and that have a value of My Press in the publisher element in their corresponding properties document.

Similarly, you can use cts:document-fragment-query to join documents against properties when searching over properties.

Registering cts:query Expressions to Speed Search Performance

If you use the same complex cts:query expressions repeatedly, and if you are using them as an unfiltered cts:query constructor, you can register the cts:query expressions for later use. Registering a cts:query expression stores a pre-evaluated version of the expression, making it faster for subsequent queries to use the same expression. Unfiltered constructors return results directly from the indexes and return all candidate fragments for a search, but do not perform post-filtering to validate that each fragment perfectly meets the search criteria. For details on unfiltered searches, see 'Using Unfiltered Searches for Fast Pagination' in the Query Performance and Tuning Guide.

This section describes registered queries and provides some examples of how to use them. It includes the following topics:

Registered Query APIs

To register and reuse unfiltered searches for cts:query expressions, use the following XQuery APIs:

cts:register
cts:deregister
cts:registered-query

For the syntax of these functions, see the MarkLogic XQuery and XSLT Function Reference.

Must Be Used Unfiltered

You can only use registered queries on unfiltered constructors; using a registered query as a filtered constructor throws the XDMP-REGFLT exception. To specify an unfiltered constructor, use the "unfiltered" option to cts:registered-query. For details about unfiltered searches, see 'Using Unfiltered Searches for Fast Pagination' in the Query Performance and Tuning Guide.

Registration Does Not Survive System Restart

Registered queries are only stored in the memory cache, and if the cache grows too big, some registered queries might be aged out of the cache. Also, if MarkLogic Server stops or restarts, any queries that were registered are lost and must be re-registered.

If you attempt to call cts:registered-query in a cts:search and the query is not currently registered, it throws an XDMP-UNREGISTERED exception. Because registered queries are not guaranteed to be registered every time they are used, it is good practice to use a try/catch around calls to cts:registered-query, and re-register the query in the catch if it throws an XDMP-UNREGISTERED exception.

For example, the following sample code shows a cts:registered-query call used with a try/catch expression in XQuery:

(: wrap the registered query in a try/catch :)
try {
  xdmp:estimate(cts:search(fn:doc(),
    cts:registered-query(995175721241192518, "unfiltered")))
}
catch ($e) {
  (: the query to re-register, kept as a string for xdmp:eval :)
  let $registered := 'cts:register(
    cts:word-query("hello*world", "wildcarded"))'
  return
    if ( fn:contains($e/*:code/text(), "XDMP-UNREGISTERED") )
    then ( "retry this query with the following registered query ID: ",
           xdmp:eval($registered) )
    else ( $e )
}

This code is somewhat simplified: it catches the XDMP-UNREGISTERED exception and simply reports what the new registered query ID is. In an application that uses registered queries, you probably would want to re-run the query with the new registered ID. Also, this example performs the try/catch in XQuery. If you are using XCC to issue queries against MarkLogic Server, you can instead perform the try/catch in the middleware Java or .NET layer.
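
The re-run step can be sketched as a small wrapper that re-registers once and retries. The following plain JavaScript is a minimal sketch, not MarkLogic code: runSearch and register are hypothetical callbacks that the caller supplies (for example, wrappers around the calls made from a middleware layer), and detecting the failure by looking for XDMP-UNREGISTERED in the error message is an assumption about how the error surfaces. IDs are handled as strings here because values such as 995175721241192518 exceed the range of integers a JavaScript number can represent exactly.

```javascript
// Run a search by registered query ID; if the ID is no longer registered,
// re-register the query once and retry with the new ID.
// runSearch(id) and register() are hypothetical caller-supplied callbacks.
function searchWithReregister(queryId, runSearch, register) {
  try {
    return { id: queryId, result: runSearch(queryId) };
  } catch (err) {
    const message = String(err && err.message);
    if (!message.includes('XDMP-UNREGISTERED')) {
      throw err; // unrelated failure: let the caller handle it
    }
    const newId = register();
    return { id: newId, result: runSearch(newId) };
  }
}

// In-memory stand-ins so the sketch runs outside MarkLogic.
const registered = new Set();
const fakeRegister = () => {
  registered.add('id-1');
  return 'id-1';
};
const fakeRunSearch = (id) => {
  if (!registered.has(id)) throw new Error('XDMP-UNREGISTERED');
  return `results for ${id}`;
};

const outcome = searchWithReregister('stale-id', fakeRunSearch, fakeRegister);
console.log(outcome.id, outcome.result); // id-1 results for id-1
```

Returning the ID alongside the result lets the caller store the new ID for later requests.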

Storing Registered Query IDs

When you register a cts:query expression, the cts:register function returns an integer, which is the ID for the registered query. After the cts:register call returns, there is no way to query the system to find the registered query IDs. Therefore, you might need to store the IDs somewhere. You can either store them in the middleware layer (if you are using XCC to issue queries against MarkLogic Server) or you can store them in a document in MarkLogic Server.

The registered query ID is generated based on a hash of the actual query, so registering the same query multiple times results in the same ID. The registered query ID is valid for all queries against the database across the entire cluster.

Registered Queries and Relevance Calculations

Searches that use registered queries will generate results having different scores from the equivalent searches using non-registered queries. This is because registered queries are treated as a single term in the relevance calculation. For details on relevance calculations, see Relevance Scores: Understanding and Customizing.

Example: Registering and Using a cts:query Expression

To run a registered query, you first register the query and then run the registered query, specifying it by ID. This section describes some example steps for registering a query and then running the registered query.

  1. First register the cts:query expression you want to run, as in the following example:
    cts:register(cts:word-query("hello*world", "wildcarded"))
  2. The first step returns an integer. Keep track of the integer value (for example, store it in a document).
  3. Use the integer value to run a search with the registered query (with the "unfiltered" option) as follows:
    cts:search(fn:doc(), 
              cts:registered-query(987654321012345678, "unfiltered") ) 

Adding Relevance Information to cts:query Expressions

The leaf-level cts:query APIs (cts:word-query, cts:element-word-query, and so on) have a weight parameter, which allows you to add a multiplication factor to the scores produced by matches from a query. You can use this to increase or decrease the weight factor for a particular query. For details about score, weight, and relevance calculations, see Relevance Scores: Understanding and Customizing.

XML Serializations of cts:query Constructors

You can create an XML serialization of a cts:query. The XML serialization is used by alerting applications that use a cts:reverse-query constructor and is also useful for performing various programmatic tasks on a cts:query. Alerting applications (see Creating Alerting Applications) find queries that would match nodes, and then perform some action for the query matches. This section describes the serialized XML and includes the following parts:

Serializing a cts:query to XML

A serialized cts:query has XML that conforms to the <marklogic-dir>/Config/cts.xsd schema, which is in the http://marklogic.com/cts namespace, which is bound to the cts prefix. You can either construct the XML directly or, if you use any cts:query expression within the context of an element, MarkLogic Server will automatically serialize that cts:query to XML. Consider the following example:

<some-element>{cts:word-query("hello world")}</some-element>

When you run the above expression, it serializes to the following XML:

<some-element>
  <cts:word-query xmlns:cts="http://marklogic.com/cts">
    <cts:text xml:lang="en">hello world</cts:text>
  </cts:word-query>
</some-element>

If you are using an alerting application, you might choose to store this XML in the database so you can match searches that include cts:reverse-query constructors. For details on alerts, see Creating Alerting Applications.

Add Arbitrary Annotations With cts:annotation

You can annotate your cts:query XML with cts:annotation elements. A cts:annotation element can be a child of any element in the cts:query XML, and it can consist of any valid XML content (for example, a single text node, a single element, multiple elements, complex elements, and so on). MarkLogic Server ignores these annotations when processing the query XML, but such annotations are often useful to the application. For example, you can store information about where the query came from, information about parts of the query to use or not in certain parts of the application, and so on. The following is some sample XML with cts:annotation elements:

<cts:and-query xmlns:cts="http://marklogic.com/cts">
  <cts:directory-query>
    <cts:annotation>private</cts:annotation>
    <cts:uri>/myprivate-dir/</cts:uri>
  </cts:directory-query>
  <cts:and-query>
    <cts:word-query><cts:text>hello</cts:text></cts:word-query>
    <cts:word-query><cts:text>world</cts:text></cts:word-query>
  </cts:and-query>
  <cts:annotation>
    <useful>something useful to the application here</useful>
  </cts:annotation>
</cts:and-query>

For another example that uses cts:annotation to store the original query string in a function that generates a cts:query from a string, see the last part of the example in XML Serializations of cts:query Constructors.

Function to Construct a cts:query From XML

You can turn an XML serialization of a cts:query back into an un-serialized cts:query with the cts:query function. For example, you can turn a serialized cts:query back into a cts:query as follows:

cts:query(
  <cts:word-query xmlns:cts="http://marklogic.com/cts">
    <cts:text>word</cts:text>
  </cts:word-query>
)
(: returns: cts:word-query("word", ("lang=en"), 1) :)

Example: Creating a cts:query Parser

The following sample code shows a simple query string parser that parses double-quote marks to be a phrase, and considers anything else that is separated by one or more spaces to be a single term. If needed, you can use the same design pattern to add other logic to do more complex parsing (for example, OR processing or NOT processing).

xquery version "1.0-ml";
declare function local:get-query-tokens($input as xs:string?) 
  as element() {
(: This parses double-quotes to be exact matches. :)
<tokens>{
let $newInput := fn:string-join(
(: check if there is more than one double-quotation mark.  If there is, 
   tokenize on the double-quotation mark ("), then change the spaces
   in the even tokens to the string "!+!".  This will then allow later
   tokenization on spaces, so you can preserve quoted phrases as phrase
   searches (after re-replacing the "!+!" strings with spaces).  :)
    if ( fn:count(fn:tokenize($input, '"')) > 2 )
    then ( for $i at $count in fn:tokenize($input, '"')
           return
             if ($count mod 2 = 0)
             then fn:replace($i, "\s+", "!+!")
             else $i )
    else ( $input ) , " ")
let $tokenInput := fn:tokenize($newInput, "\s+")

return (
for $x in $tokenInput
where $x ne ""
return
<token>{fn:replace($x, "!\+!", " ")}</token>)
}</tokens>
} ;

let $input := 'this is a "really big" test'
return
local:get-query-tokens($input)

This returns the following:

<tokens>
  <token>this</token>
  <token>is</token>
  <token>a</token>
  <token>really big</token>
  <token>test</token>
</tokens>
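
For comparison, the same tokenizing approach can be sketched in plain JavaScript. This is an approximate port rather than part of the original example: the function name getQueryTokens is hypothetical, it returns an array of strings instead of a tokens element, and it trims whitespace at the edges of a quoted phrase.

```javascript
// Split a query string into tokens: text inside a pair of double quotes
// becomes one phrase token; everything else is split on whitespace.
function getQueryTokens(input) {
  const text = input || '';
  const parts = text.split('"');

  // Fewer than two quotation marks: plain whitespace tokenization.
  if (parts.length <= 2) {
    return text.split(/\s+/).filter((t) => t !== '');
  }

  const tokens = [];
  parts.forEach((part, i) => {
    if (i % 2 === 1) {
      // Odd zero-based index: the text between two quotation marks.
      const phrase = part.trim().replace(/\s+/g, ' ');
      if (phrase !== '') tokens.push(phrase);
    } else {
      part.split(/\s+/)
        .filter((t) => t !== '')
        .forEach((t) => tokens.push(t));
    }
  });
  return tokens;
}

console.log(getQueryTokens('this is a "really big" test'));
// [ 'this', 'is', 'a', 'really big', 'test' ]
```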

Now you can derive a cts:query expression from the tokenized XML produced above, which composes all of the terms with a cts:and-query, as follows (assuming the local:get-query-tokens function above is available to this function):

xquery version "1.0-ml";
declare function local:get-query($input as xs:string) 
{
let $tokens := local:get-query-tokens($input)
return
 cts:and-query( (cts:and-query(
        for $token in $tokens//token
        return 
        cts:word-query($token/text()) ) ))
} ;

let $input := 'this is a "really big" test'
return
local:get-query($input)

This returns the following (spacing and line breaks added for readability):

cts:and-query(
  cts:and-query((
    cts:word-query("this", (), 1), 
    cts:word-query("is", (), 1), 
    cts:word-query("a", (), 1), 
    cts:word-query("really big", (), 1), 
    cts:word-query("test", (), 1)
    ), ()) ,
  () )

You can now take the generated cts:query expression and add it to a cts:search.

Similarly, you can generate a serialized cts:query as follows (assuming the local:get-query-tokens function is available):

xquery version "1.0-ml";
declare function local:get-query-xml($input as xs:string) 
{
let $tokens := local:get-query-tokens($input)
return
 element cts:and-query { 
       element cts:and-query { 
           for $token in $tokens//token
           return 
           element cts:word-query { $token/text() } },
           element cts:annotation {$input} }
} ;

let $input := 'this is a "really big" test'
return
local:get-query-xml($input)

This returns the following XML serialization:

<cts:and-query xmlns:cts="http://marklogic.com/cts">
  <cts:and-query>
    <cts:word-query>this</cts:word-query>
    <cts:word-query>is</cts:word-query>
    <cts:word-query>a</cts:word-query>
    <cts:word-query>really big</cts:word-query>
    <cts:word-query>test</cts:word-query>
  </cts:and-query>
  <cts:annotation>this is a "really big" test</cts:annotation>
</cts:and-query>