MarkLogic Server 11.0 Product Documentation
Search Developer's Guide, Chapter 11

Using Range Queries in cts:query Expressions

MarkLogic Server allows you to access range indexes in a cts:query expression to constrain a search by a range of values in an element or attribute. This chapter describes how these range queries work, when to use them, and how to construct them.

Overview of Range Queries

This section provides an overview of what range queries are and why you might want to use them.

Uses for Range Queries

Range queries are designed to constrain searches to a range of values. For example, if you want to find all articles that were published in 2005, and if your content has an element (or an attribute or a property) named PUBLISHDATE with type xs:date, you can create a range index on the element PUBLISHDATE, then specify in a search that you want all articles with a PUBLISHDATE later than December 31, 2004 and earlier than January 1, 2006. Because that element has a range index, MarkLogic Server can resolve the query extremely efficiently.
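
The PUBLISHDATE scenario above could be written roughly as follows. This is a minimal sketch, assuming an element range index of type xs:date already exists on a PUBLISHDATE element in no namespace; searching fn:doc() (every document in the database) is a choice made for illustration only.

```xquery
xquery version "1.0-ml";

(: Assumption: an xs:date element range index exists on PUBLISHDATE. :)
(: Finds documents whose PUBLISHDATE falls within calendar year 2005. :)
cts:search(fn:doc(),
  cts:and-query((
    cts:element-range-query(xs:QName("PUBLISHDATE"), ">",
      xs:date("2004-12-31")),
    cts:element-range-query(xs:QName("PUBLISHDATE"), "<",
      xs:date("2006-01-01"))
  )))
```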

Because you can create range indexes on a wide variety of XML datatypes, there is a lot of flexibility in the types of content with which you can use range queries to constrain searches. In general, if you need to constrain on a value, it is possible to create a range index and use range queries to express the ranges in which you want to constrain the results.

Requirements for Using Range Queries

Keep in mind the following requirements for using range queries in your cts:search operations:

  • Range queries require a range index to be defined on the element or attribute in which you want to constrain the results.
  • The range index must be in the same collation as the one specified in the range query.
  • If no collation is specified in the range query, then the range query uses the default collation of the query context (for example, if a collation is specified in the XQuery prolog, that is used). For details on collation defaults, see How Collation Defaults are Determined.
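
As an illustration of the collation requirement, the sketch below passes the collation explicitly through the options argument of the constructor. The author element and its string range index are hypothetical; the collation URI shown must match whatever collation the index was actually created with.

```xquery
xquery version "1.0-ml";

(: Hypothetical setup: a string element range index on <author>, :)
(: created with the collation http://marklogic.com/collation/.   :)
(: The "collation=" option names the collation to use, so the    :)
(: query does not depend on the default collation in effect.     :)
cts:search(fn:doc(),
  cts:element-range-query(
    xs:QName("author"), ">=", "M",
    "collation=http://marklogic.com/collation/"))
```

Naming the collation in the query keeps the query tied to one specific index even if the default collation of the calling module changes.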

Because range queries require range indexes, keep in mind that range indexes take up space, add to memory usage on the machine(s) on which MarkLogic Server runs, and increase loading and reindexing time. They are therefore not free, although a relatively small number of them will not use a large amount of resources. The amount of resources used depends heavily on the content: how many documents contain the indexed elements or attributes, how often those elements or attributes appear in the content, how large the content set is, and so on. As with many performance improvements, there are trade-offs to analyze, and the best way to analyze the impact is to experiment and see whether the cost is worth the performance improvement. For details about range indexes and procedures for creating them, see the Range Indexes and Lexicons chapter in the Administrator's Guide.

Performance and Coding Advantages of Range Queries

Most of what you can express using range queries you can also express using predicates in XPath expressions. There are two big advantages of using range queries over XPath predicates:

  • Performance
  • Ease of coding

Using range queries in cts:query expressions can produce faster performance than using XPath predicates. Range indexes are in-memory structures, and because range queries are always resolved from a range index, they are usually very fast. An XPath predicate does not require a range index, so it is possible to write a predicate that needs to scan a large number of fragments, which could take considerable time. Additionally, because range queries are cts:query objects, you can use registered queries to pre-compile them, adding further performance advantages.
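
The following sketch shows one way a range query might be registered and then reused. It assumes an xs:date element range index on a date element; the "unfiltered" option is passed because registered queries are resolved directly from the indexes. In practice the ID returned by cts:register would typically be stored and reused across requests rather than registered on every call.

```xquery
xquery version "1.0-ml";

(: Assumption: an xs:date element range index exists on <date>. :)
let $range :=
  cts:element-range-query(xs:QName("date"), ">=", xs:date("2006-01-01"))
(: cts:register stores the query and returns an ID for later reuse. :)
let $id := cts:register($range)
return
  cts:search(fn:doc(), cts:registered-query($id, "unfiltered"))
```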

There are also coding advantages to range queries over XPath predicates. Because range queries are leaf-level cts:query constructors, they can be combined with other constructors (including other range query constructors) to form complex expressions. It is fairly easy to write XQuery code that takes user input from a form (from drop-down lists, text boxes, radio buttons, and so on) and uses that input to generate extremely complex cts:query expressions. It is very difficult to do that with XPath expressions. For details on cts:query expressions, see Composing cts:query Expressions.
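
To make the composition point concrete, here is a minimal sketch of a function that assembles a cts:query from optional inputs, such as values read from a form. The function name local:build-query and its parameters are hypothetical, and the code assumes an xs:date element range index on a date element.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: each supplied input adds one constraint. :)
declare function local:build-query(
  $from  as xs:date?,
  $to    as xs:date?,
  $terms as xs:string?
) as cts:query
{
  cts:and-query((
    (: Lower bound on the date, if the user supplied one :)
    if (fn:exists($from))
    then cts:element-range-query(xs:QName("date"), ">=", $from)
    else (),
    (: Upper bound on the date, if the user supplied one :)
    if (fn:exists($to))
    then cts:element-range-query(xs:QName("date"), "<=", $to)
    else (),
    (: Word constraint, if the user typed any text :)
    if (fn:exists($terms) and $terms ne "")
    then cts:word-query($terms)
    else ()
  ))
};

(: Usage: a lower bound and a search term, with no upper bound :)
cts:search(fn:doc(),
  local:build-query(xs:date("2006-01-01"), (), "information"))
```

Inputs that are absent simply contribute nothing to the cts:and-query, so the same function covers every combination of form fields without string concatenation.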

Range Query cts:query Constructors

The range query constructors include XQuery APIs such as cts:element-range-query and cts:element-attribute-range-query.

Each API takes QNames, the type of operator (for example, >=, <=, and so on), values, and a collation as inputs. For details of these APIs and for their signatures, see the MarkLogic XQuery and XSLT Function Reference.
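
As a sketch of how these inputs line up in a call, the example below uses cts:element-attribute-range-query, which takes the element QName, the attribute QName, the operator, and the value. The published attribute on an entry element, and the xs:date attribute range index on it, are assumptions made for illustration.

```xquery
xquery version "1.0-ml";

(: Hypothetical content: <entry published="2007-01-01">...</entry>  :)
(: Assumption: an xs:date attribute range index on entry/@published :)
cts:search(fn:doc(),
  cts:element-attribute-range-query(
    xs:QName("entry"),          (: element QName   :)
    xs:QName("published"),      (: attribute QName :)
    ">=",                       (: operator        :)
    xs:date("2006-01-01")))     (: value           :)
```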

For release 3.2, range queries do not contribute to the score, regardless of the weight specified in the cts:query constructor.

Examples of Range Queries

The following are some examples that use range query constructors.

Consider a document with a URI /dates.xml with the following structure:

<root>
  <entry>
    <date>2007-01-01</date>
    <info>Some information.</info>
  </entry>
  <entry>
    <date>2006-06-23</date>
    <info>Some other information.</info>
  </entry>
  <entry>
    <date>1971-12-23</date>
    <info>Some different information.</info>
  </entry>
</root> 

Assume you have defined an element range index of type xs:date on the QName date (note that you must either load the document after defining the range index or complete a reindex of the database after defining the range index).
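
If you prefer to script the index definition rather than use the Admin Interface, the Admin API can be used along the following lines. This is a sketch under stated assumptions: the database name "Documents" is a placeholder, the date element is in no namespace, and range value positions are left disabled.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()
(: Assumption: the target database is named "Documents". :)
let $dbid := xdmp:database("Documents")
(: xs:date index on <date>, no namespace, no collation (not a string) :)
let $index :=
  admin:database-range-element-index("date", "", "date", "", fn:false())
let $config :=
  admin:database-add-range-element-index($config, $dbid, $index)
return admin:save-configuration($config)
```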

You can now issue queries using the cts:element-range-query constructor. The following query searches the entry element of the document /dates.xml for entries that occurred on or before January 1, 2000.

cts:search(doc("/dates.xml")/root/entry, 
  cts:element-range-query(xs:QName("date"), "<=",
      xs:date("2000-01-01") ) )

This query returns the following node, because it is the only one that satisfies the range query:

<entry>
    <date>1971-12-23</date>
    <info>Some different information.</info>
</entry>

The following query uses a cts:and-query to combine two date ranges: dates after January 1, 2006 and dates before January 1, 2008.

cts:search(doc("/dates.xml")/root/entry, 
  cts:and-query((
   cts:element-range-query(xs:QName("date"), ">",
      xs:date("2006-01-01") ),
   cts:element-range-query(xs:QName("date"), "<",
      xs:date("2008-01-01") ) )) )

This query returns the following two nodes:

<entry>
    <date>2007-01-01</date>
    <info>Some information.</info>
</entry>

<entry>
    <date>2006-06-23</date>
    <info>Some other information.</info>
</entry>

For queries against a dateTime index, when the value passed to the constructor ($value) is an xs:dayTimeDuration or xs:yearMonthDuration, the query is executed as an age query: $value is subtracted from fn:current-dateTime() to create an xs:dateTime used in the query. If there is more than one item in $value, they must all be the same type.

For example, given a dateTime index on element startDateTime, the queries cts:element-range-query(xs:QName("startDateTime"), ">", xs:dayTimeDuration("P1D")) and cts:element-range-query(xs:QName("startDateTime"), ">", fn:current-dateTime() - xs:dayTimeDuration("P1D")) are the same; both match values within the last day.
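
Laid out as code, the two equivalent forms look like this. The sketch assumes an xs:dateTime element range index on startDateTime; the variable names are illustrative only.

```xquery
xquery version "1.0-ml";

(: Assumption: an xs:dateTime element range index on <startDateTime>. :)

(: Age form: the duration is subtracted from the current dateTime. :)
let $age-query :=
  cts:element-range-query(xs:QName("startDateTime"), ">",
    xs:dayTimeDuration("P1D"))

(: Explicit form: the same comparison value computed by hand. :)
let $explicit-query :=
  cts:element-range-query(xs:QName("startDateTime"), ">",
    fn:current-dateTime() - xs:dayTimeDuration("P1D"))

(: Either query can be passed to cts:search; the age form is used here. :)
return cts:search(fn:doc(), $age-query)
```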
