MarkLogic 10 Product Documentation
Search Developer's Guide
— Chapter 3

Searching Using String Queries

This chapter describes how to perform searches using simple string queries with the Search API.

This chapter provides background, design patterns, and examples of using string queries. For the function signatures and descriptions, see the Search documentation under XQuery Library Modules in the MarkLogic XQuery and XSLT Function Reference.

String Query Overview

A string query is a plain text search string composed of terms, phrases, and operators that can be easily composed by end users typing into an application search box. For example, cat AND dog is a string query for finding documents that contain both the term cat and the term dog.

For historical reasons, MarkLogic supports two similar string query grammars. The XQuery Search API and the REST, Java, and Node.js Client APIs support the grammar discussed in this chapter. The XQuery cts:parse function, the JavaScript cts.parse function, and the JavaScript jsearch API support a similar grammar; for details, see Creating a Query From Search Text With cts:parse. The two grammars share the same basic set of operators, but differ in how you define constraints and in how far you can customize them.
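As a minimal sketch of how an application might hand a string query to one of these APIs, the following Python builds a request URL for the REST Client API search resource. The host, port, and the helper name build_search_url are assumptions chosen for illustration; the sketch only constructs the URL and does not send a request or handle authentication.

```python
from urllib.parse import urlencode


def build_search_url(query, host="localhost", port=8000):
    """Return a REST search URL carrying `query` as the string query.

    The host and port defaults are illustrative assumptions; substitute
    the values for your own REST instance.
    """
    # urlencode percent-encodes the query text; spaces become '+'.
    params = urlencode({"q": query, "format": "json"})
    return f"http://{host}:{port}/v1/search?{params}"


print(build_search_url("cat AND dog"))
# prints http://localhost:8000/v1/search?q=cat+AND+dog&format=json
```

The string query travels as ordinary text in the q parameter, so the application does not need to parse it before passing it on.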

The syntax of a string query is determined by a configurable grammar. A powerful default grammar is pre-defined. You can modify or extend the grammar through the grammar search option. For details, see The Default String Query Grammar.

The default grammar provides a robust ability to generate complex queries. The following are some examples of queries that use the default grammar:

  • (cat OR dog) NEAR vet

    at least one of the terms cat or dog within 10 terms (the default distance for cts:near-query) of the word vet

  • dog NEAR/30 vet

    the word dog within 30 terms of the word vet

  • cat -dog

    the word cat in documents that do not contain the word dog

You can use string queries to search contents and metadata with the XQuery Search API and with the REST, Java, and Node.js Client APIs.

The Default String Query Grammar

The Search API has a built-in default grammar for interpreting string queries such as cat AND dog. The default grammar enables you to write applications that perform complex queries against a database based on simple search strings.

Query Components and Operators

Use the following components and operators to form string queries with the default search grammar:

Query: any terms
Examples:
    dog
    dog cat
Description: Match one or more terms, as with a cts:and-query. Adjacent terms and phrases are implicitly joined with AND. For example, dog cat is the same as dog AND cat.

Query: " "
Examples:
    "dog tail"
    "dog tail" "cat whisker"
    dog "cat whisker"
Description: Terms in double quotes are treated as a phrase. Adjacent terms and phrases are implicitly joined with AND. For example, dog "cat whisker" matches documents containing both the term dog and the phrase cat whisker.

Query: ( )
Example: (cat OR dog) zebra
Description: Parentheses indicate grouping. The example matches documents containing at least one of the terms cat or dog and also containing the term zebra.

Query: -query
Examples:
    -dog
    -(dog OR cat)
    cat -dog
Description: A NOT operation, as with a cts:not-query. For example, cat -dog matches documents that contain the term cat but do not contain the term dog.

Query: query1 AND query2
Examples:
    dog AND cat
    (cat OR dog) AND zebra
Description: Match two query expressions, as with a cts:and-query. For example, dog AND cat matches documents containing both the term dog and the term cat. AND is the default way to combine terms and phrases, so the previous example is equivalent to dog cat.

Query: query1 OR query2
Example: dog OR cat
Description: Match either of two queries, as with a cts:or-query. The example matches documents containing at least one of the terms cat or dog.

Query: query1 NOT_IN query2
Example: dog NOT_IN "dog house"
Description: Match one query when the match does not overlap with another, as with a cts:not-in-query. The example matches occurrences of dog when it is not in the phrase dog house.

Query: query1 NEAR query2
Examples:
    dog NEAR cat
    (cat food) NEAR mouse
Description: Find documents containing matches to the queries on either side of the NEAR operator when the matches occur within 10 terms of each other, as with a cts:near-query. For example, dog NEAR cat matches documents containing dog within 10 terms of cat.

Query: query1 NEAR/N query2
Example: dog NEAR/2 cat
Description: Find documents containing matches to the queries on either side of the NEAR operator when the matches occur within N terms of each other, as with a cts:near-query. The example matches documents where the term dog occurs within 2 terms of the term cat.

Query: constraint:value
Examples:
    color:red
    decade:1980s
    birthday:1999-12-31
Description: Find documents that match the named constraint with the given value, as with a cts:element-range-query or other range query. For details, see Using Relational Operators on Constraints.

Query: operator:state
Examples:
    sort:relevance
    sort:date
Description: Apply a runtime configuration operator such as sort order, defined by an operator XML element or JSON property in the search options. For details, see Operator Options.

Query: constraint LT value
Examples:
    color LT red
    birthday LT 1999-12-31
Description: Find documents that match the named range constraint with a value less than value. For details, see Using Relational Operators on Constraints.

Query: constraint LE value
Examples:
    color LE red
    birthday LE 1999-12-31
Description: Find documents that match the named range constraint with a value less than or equal to value. For details, see Using Relational Operators on Constraints.

Query: constraint GT value
Examples:
    color GT red
    birthday GT 1999-12-31
Description: Find documents that match the named range constraint with a value greater than value. For details, see Using Relational Operators on Constraints.

Query: constraint GE value
Examples:
    color GE red
    birthday GE 1999-12-31
Description: Find documents that match the named range constraint with a value greater than or equal to value. For details, see Using Relational Operators on Constraints.

Query: constraint NE value
Examples:
    color NE red
    birthday NE 1999-12-31
Description: Find documents that match the named range constraint with a value that is not equal to value. For details, see Using Relational Operators on Constraints.

Query: query1 BOOST query2
Example: george BOOST washington
Description: Find documents that match query1. Boost the relevance score of documents that also match query2. The example returns all matches for the term george, with matches in documents that also contain washington having a higher relevance score. For more details, see cts:boost-query.
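The components above can be assembled programmatically when an application builds a string query from form fields rather than a single search box. The following Python is a minimal sketch under stated assumptions: the helper names quote_phrase and build_string_query are hypothetical, constraint values are assumed to be single tokens, and input containing double quotes is not handled.

```python
def quote_phrase(text):
    """Wrap multi-word text in double quotes so it is treated as a phrase."""
    return f'"{text}"' if " " in text else text


def build_string_query(required=(), excluded=(), constraints=None):
    """Compose a string query from terms, exclusions, and constraints.

    Adjacent parts are separated by spaces, which the default grammar
    treats as an implicit AND.
    """
    parts = [quote_phrase(term) for term in required]
    # A leading '-' negates the term or phrase that follows it.
    parts += [f"-{quote_phrase(term)}" for term in excluded]
    # constraint:value form; values are assumed to be single tokens.
    parts += [f"{name}:{value}" for name, value in (constraints or {}).items()]
    return " ".join(parts)


print(build_string_query(
    required=["dog", "cat whisker"],
    excluded=["zebra"],
    constraints={"color": "red"},
))
# prints dog "cat whisker" -zebra color:red
```

The color constraint in the usage line only has meaning if the query options governing the search define a constraint with that name.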

Operator Precedence

The precedence of operators in the default grammar, from highest to lowest, is shown in the following table. Each row in the table represents a precedence level. Where multiple operators have the same precedence, evaluation occurs from left to right. Query sub-expressions using operators higher in the table are evaluated before sub-expressions using operators lower in the table.

Operator
:, LT, LE, GT, GE, NE
-
NOT_IN
BOOST
( ), NEAR, NEAR/N
AND
OR

For example, AND has higher precedence than OR, so the following queries:

A AND B OR C
A OR B AND C

evaluate as if written as follows:

(A AND B) OR C
A OR (B AND C)
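The grouping in this example can be reproduced with a small precedence-climbing parser. The following Python is an illustrative sketch only, not MarkLogic's parser: it covers just AND, OR, and parentheses, and the function names are hypothetical.

```python
import re

# Binding strength for the two operators in this sketch; higher binds tighter.
PRECEDENCE = {"OR": 1, "AND": 2}


def tokenize(text):
    """Split text into parentheses and whitespace-delimited words."""
    return re.findall(r"[()]|[^\s()]+", text)


def parse_operand(tokens):
    """Parse a single term or a parenthesized sub-expression."""
    token = tokens.pop(0)
    if token == "(":
        inner = parse(tokens)
        tokens.pop(0)  # discard the closing parenthesis
        return inner
    return token


def parse(tokens, min_prec=1):
    """Fold in operators whose precedence is at least min_prec."""
    left = parse_operand(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # Requiring a strictly higher precedence on the right-hand side
        # makes operators of equal precedence group left to right.
        right = parse(tokens, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left


def show(node):
    """Render a parse tree with explicit parentheses."""
    if isinstance(node, str):
        return node
    op, left, right = node
    return f"({show(left)} {op} {show(right)})"


def group(text):
    return show(parse(tokenize(text)))


print(group("A AND B OR C"))  # prints ((A AND B) OR C)
print(group("A OR B AND C"))  # prints (A OR (B AND C))
```

When the intended grouping differs from the default precedence, explicit parentheses in the string query remove the ambiguity.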

Using Relational Operators on Constraints

The relational query operators :, LT, LE, GT, GE, and NE accept a constraint name on the left-hand side and a value on the right-hand side. That is, queries using these operators have the following form:

constraint op value

These relational operators match fragments that meet the named constraint with a value that matches the relationship defined by the operator (equals, less than, greater than, etc.). For example, if your query options define an element word constraint named color, then color:red matches documents that contain elements meeting the color constraint with a value of red. For details and more examples, see Constraint Options.

The constraint name must be the name of a <constraint/> XML element or "constraint" JSON object defined by the query options governing the search. The constraint can be a word, value, range, or geospatial constraint. There must be a range index associated with the constraint.

If the constraint is unbucketed, the value on the right-hand side of the operator must be convertible to the type of the constraint. For example, if the range index behind the constraint has type xs:date, then the value to match must represent an xs:date.

If the constraint is bucketed, then the value must be the name of a bucket defined by the constraint. For example, if searching using the decade bucketed constraint defined in Bucketed Range Constraint Example, then the value on the right-hand side must be a bucket name such as 1920s or 2000s, as in decade:1920s.
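An application can apply the same two rules before submitting a query, for example to reject bad input early. The following Python is a rough sketch under stated assumptions: the CONSTRAINTS table is a hypothetical stand-in for constraints that would really be defined in query options, the bucket list is abbreviated, and the date check accepts only plain YYYY-MM-DD values, which is narrower than xs:date.

```python
from datetime import date

# Hypothetical, hand-written description of two constraints. In a real
# application these definitions live in the query options, not in code.
CONSTRAINTS = {
    "birthday": {"bucketed": False, "parse": date.fromisoformat},
    "decade": {"bucketed": True, "buckets": {"1920s", "1980s", "2000s"}},
}


def value_is_acceptable(constraint, value):
    """Return True if value is plausible for the named constraint."""
    spec = CONSTRAINTS[constraint]
    if spec["bucketed"]:
        # Bucketed: the value must be the name of a defined bucket.
        return value in spec["buckets"]
    try:
        # Unbucketed: the value must convert to the constraint's type.
        spec["parse"](value)
        return True
    except ValueError:
        return False


print(value_is_acceptable("birthday", "1999-12-31"))  # prints True
print(value_is_acceptable("decade", "1925"))          # prints False
```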

String Query Examples

The default grammar provides a robust ability to generate complex queries. The following are some examples of queries that use the default grammar:

  • (cat OR dog) NEAR vet

    at least one of the terms cat or dog within 10 terms (the default distance for cts:near-query) of the word vet

  • dog NEAR/30 vet

    the word dog within 30 terms of the word vet

  • cat -dog

    the word cat in documents that do not contain the word dog
