MarkLogic 12 EA 2 Product Documentation
cts:near-query

cts:near-query(
   $queries as cts:query*,
   [$distance as xs:double?],
   [$options as xs:string*],
   [$distance-weight as xs:double?]
) as cts:near-query

Summary

Returns a query matching all of the specified queries, where the matches occur within the specified distance of each other.

Parameters
queries A sequence of queries to match.
distance The maximum distance, in words, between any two matching queries. The results match if two queries match and the distance between the two matches is less than or equal to the specified distance. A distance of 0 matches only when the matches are the exact same text or overlap (see the third example below). A negative distance is treated as 0. The default value is 10.
options Options to this query. The default value is ().

Options include:

"ordered"
Any near-query matches must occur in the order of the specified sub-queries.
"unordered"
Any near-query matches will satisfy the query, regardless of the order in which the sub-queries were specified.
"minimum-distance"
The minimum distance between two matching queries, specified as "minimum-distance=number" (for example, "minimum-distance=4"). The results match if the two queries match and the distance between the two matches is greater than or equal to the specified minimum distance. The default value is zero. A negative distance is treated as 0.
distance-weight A weight attributed to the distance for this query. Higher weights add to the importance of distance (as opposed to term matches) when the relevance order is calculated. The default value is 1.0. The weight should be between -16 and 64. Weights greater than 64 have the same effect as a weight of 64. Weights with an absolute value less than 0.0625 (that is, between -0.0625 and 0.0625) are rounded to 0, which means that they do not contribute to the score. This parameter has no effect if the word positions index is not enabled.
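
As a rough sketch of how the fourth argument is passed, the following search gives proximity more influence on relevance order than the default. The weight of 2.0 and the search terms are illustrative choices only, and the sketch assumes the word positions index is enabled; the actual effect on ordering depends on the content.

```xquery
(: Illustrative only: find paragraphs where "MarkLogic" and "Server"
   occur within 10 words of each other, using a distance-weight of
   2.0 instead of the default 1.0. :)
cts:search(//p,
  cts:near-query(
    (cts:word-query("MarkLogic"),
     cts:word-query("Server")),
    10,          (: distance :)
    "unordered", (: options :)
    2.0))        (: distance-weight :)
```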

Usage Notes

If the options parameter contains neither "ordered" nor "unordered", then the default is "unordered".
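
A minimal sketch of the default behavior, reusing the sample paragraph from the examples on this page: no options are passed, so the query should behave as if "unordered" had been specified and still match even though the terms are listed in reverse order.

```xquery
let $x := <p>Now is the winter of our discontent</p>
return
(: No options are passed, so "unordered" applies by default. :)
cts:contains($x, cts:near-query(
                    ("discontent", "winter"),
                    3))
```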

The word positions index speeds up queries that use cts:near-query. The element word positions index speeds up element queries that use cts:near-query.

If you use cts:near-query with a field, the distance specified is the distance in the whole document, not the distance in the field. For example, if the distance between two words is 20 in the document, but the distance is 10 if you look at a view of the document that only includes the elements in a field, a cts:near-query must have a distance of 20 or more to match; a distance of 10 would not match. The same applies to minimum distance as well.

If you use cts:near-query with cts:field-word-query, the distance supplied in the near query applies to the whole document, not just to the field. The same applies to the minimum distance. For details, see cts:field-word-query.
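
The following sketch shows the shape of such a query. The field name "my-field" is hypothetical, and a field with that name would have to be configured on the database for the query to run. The distance of 20 is counted in words across the whole document, not within the field's view of it.

```xquery
(: "my-field" is a hypothetical field name used for illustration. :)
cts:search(fn:doc(),
  cts:near-query(
    (cts:field-word-query("my-field", "winter"),
     cts:field-word-query("my-field", "discontent")),
    20))  (: distance measured in the whole document :)
```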

Expressions using the ordered option are more efficient than those using the unordered option, especially if they specify many queries to match.

The minimum-distance option and the distance parameter both apply to each near-query match. Therefore, if minimum-distance is greater than distance, there can be no matches.
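
To illustrate the last note, here is a small sketch in which the minimum distance exceeds the distance. Following the rule above, it should never match, whatever the text contains.

```xquery
let $x := <p>Now is the winter of our discontent</p>
return
(: distance is 3 but minimum-distance is 4, so no match can
   satisfy both constraints. :)
cts:contains($x, cts:near-query(
                    ("winter", "discontent"),
                    3, ("unordered", "minimum-distance=4")))
```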

Example

 The following query searches for paragraphs containing
 both "MarkLogic" and "Server" within 3 words of each
 other, given the following paragraphs in a database:

  <p>MarkLogic Server is an enterprise-class
  database specifically built for content.</p>
  <p>MarkLogic is an excellent XML Content Server.</p>

  cts:search(//p,
    cts:near-query(
      (cts:word-query("MarkLogic"),
      cts:word-query("Server")),
      3))

  =>
  <p>MarkLogic Server is an enterprise-class
  database specifically built for content.</p>

Example

let $x := <p>Now is the winter of our discontent</p>
return
cts:contains($x, cts:near-query(
                    ("discontent", "winter"),
                    3, "ordered"))

=> false because "ordered" requires "discontent" to occur before "winter",
         but in the text "discontent" comes after "winter"

let $x := <p>Now is the winter of our discontent</p>
return
cts:contains($x, cts:near-query(
                    ("discontent", "winter"),
                    3, "unordered"))

=> true because the query specifies "unordered",
        and it is still a match even though
        "discontent" comes after "winter"

Example

let $x := <p>Now is the winter of our discontent</p>
return
cts:contains($x, cts:near-query(
                    ("is the winter", "winter of"),
                    0))

=> true because the phrases overlap

let $x := <p>Now is the winter of our discontent</p>
return
cts:contains($x, cts:near-query(
                    ("is the winter", "of our"),
                    0))

=> false because the phrases do not overlap
         (they have 1 word distance, not 0)

Example

let $x := <p>Now is the winter of our discontent</p>
return
cts:contains($x, cts:near-query(
                    ("winter", "discontent"),
                    5, ("ordered", "minimum-distance=4")))

=> false because the distance between the queries (3 words) is less than the
minimum distance (4)

let $x := <p>Now is the winter of our discontent</p>
return
cts:contains($x, cts:near-query(
                    ("winter", "discontent"),
                    5, ("ordered", "minimum-distance=3")))

=> true because the distance between the queries (3 words) is greater than or
equal to the minimum distance (3) and within the specified distance (5)
