cts:element-value-query

cts:element-value-query(
   $element-name as xs:QName*,
   [$text as xs:string*],
   [$options as xs:string*],
   [$weight as xs:double?]
) as cts:element-value-query

Summary

Returns a query matching elements by name with text content equal to a given phrase. cts:element-value-query only matches against simple elements (that is, elements that contain only text and have no element children).
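
As a minimal sketch of the simple-element restriction, the following compares a text-only element with one that has an element child. The element names 'title' and 'b' and both node values are made up for illustration.

```xquery
xquery version "1.0-ml";

(: Hypothetical nodes: $simple contains only text,
   $complex contains text plus an element child. :)
let $simple  := <title>Word Query</title>
let $complex := <title>Word <b>Query</b></title>
let $query   := cts:element-value-query(xs:QName("title"), "Word Query")
return (
  (: simple element whose value equals the phrase :)
  cts:contains($simple, $query),
  (: 'title' has an element child here, so per the summary
     above it is not eligible for a value match :)
  cts:contains($complex, $query)
)
```

Going by the summary, the first check should succeed and the second should not. When a match is needed on text that is spread across child elements, a value query is not the right tool.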

Parameters
$element-name One or more element QNames to match. When multiple QNames are specified, the query matches if any QName matches.
$text One or more element values to match. When multiple strings are specified, the query matches if any string matches.
$options Options to this query. The default is ().

Options include:

"case-sensitive"
A case-sensitive query.
"case-insensitive"
A case-insensitive query.
"diacritic-sensitive"
A diacritic-sensitive query.
"diacritic-insensitive"
A diacritic-insensitive query.
"punctuation-sensitive"
A punctuation-sensitive query.
"punctuation-insensitive"
A punctuation-insensitive query.
"whitespace-sensitive"
A whitespace-sensitive query.
"whitespace-insensitive"
A whitespace-insensitive query.
"stemmed"
A stemmed query.
"unstemmed"
An unstemmed query.
"wildcarded"
A wildcarded query.
"unwildcarded"
An unwildcarded query.
"exact"
An exact match query. Shorthand for "case-sensitive", "diacritic-sensitive", "punctuation-sensitive", "whitespace-sensitive", "unstemmed", and "unwildcarded".
"lang=iso639code"
Specifies the language of the query. The iso639code code portion is case-insensitive, and uses the languages specified by ISO 639. The default is specified in the database configuration.
"min-occurs=number"
Specifies the minimum number of occurrences required. If fewer than this number of words occur, the fragment does not match. The default is 1.
"max-occurs=number"
Specifies the maximum number of occurrences allowed. If more than this number of words occur, the fragment does not match. The default is unbounded.
"synonym"
Specifies that all of the terms in the $text parameter are considered synonyms for scoring purposes. The result is that occurrences of more than one of the synonyms are scored as if there are more occurrences of the same term (as opposed to having a separate term that contributes to score).
"lexicon-expansion-limit=number"
Specifies the limit for lexicon expansion. This restricts the number of lexicon expansions that can be performed. If the limit is exceeded, the server may raise an error, depending on whether the "limit-check" option is set. The default value for this option is 4096.
"limit-check"
Specifies that an error will be raised if the lexicon expansion exceeds the specified limit.
"no-limit-check"
Specifies that an error will not be raised if the lexicon expansion exceeds the specified limit. The server will try to resolve the wildcard.
$weight A weight for this query. Higher weights move search results up in the relevance order. The default is 1.0. The weight should be between 64 and -16. Weights greater than 64 will have the same effect as a weight of 64. Weights with an absolute value less than 0.0625 (that is, between -0.0625 and 0.0625) are rounded to 0, which means that they do not contribute to the score.
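
The parameters above can be combined as in the following sketch. All element names ('status', 'state', 'title') and values are illustrative assumptions, and the second query assumes that wildcard searches are appropriate for the database in question.

```xquery
xquery version "1.0-ml";

(: Sketch only: element names and values are illustrative. :)

(: Either of two QNames may hold either of two values.
   "exact" turns off stemming, wildcarding and the
   insensitive matching modes; weight 0.0 means the clause
   filters results without contributing to the score. :)
let $status-query :=
  cts:element-value-query(
    (xs:QName("status"), xs:QName("state")),
    ("finished", "done"),
    "exact",
    0.0)

(: Two alternative phrases scored as synonyms. The wildcard
   expansion is capped at 100 lexicon entries, and
   "limit-check" asks for an error if the cap is exceeded. :)
let $title-query :=
  cts:element-value-query(
    xs:QName("title"),
    ("word quer*", "term quer*"),
    ("wildcarded", "synonym",
     "lexicon-expansion-limit=100", "limit-check"))

return
  cts:search(fn:doc(),
    cts:and-query(($status-query, $title-query)))
```

Options are passed as a sequence of strings, so any number of them can be supplied together, and $weight remains the optional fourth argument.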

Usage Notes

If neither "case-sensitive" nor "case-insensitive" is present, $text is used to determine case sensitivity. If $text contains no uppercase, it specifies "case-insensitive". If $text contains uppercase, it specifies "case-sensitive".
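
The case rule can be sketched with cts:contains on an in-memory node. The 'company' element and its value are illustrative assumptions.

```xquery
xquery version "1.0-ml";

(: Illustrative node; not taken from any real data set. :)
let $node := <company>MarkLogic Corporation</company>
let $name := xs:QName("company")
return (
  (: $text has no uppercase, so the query is treated as
     "case-insensitive" :)
  cts:contains($node,
    cts:element-value-query($name, "marklogic corporation")),
  (: $text has uppercase, so the query is treated as
     "case-sensitive", and its case differs from the value :)
  cts:contains($node,
    cts:element-value-query($name, "Marklogic corporation")),
  (: an explicit option overrides the inference from $text :)
  cts:contains($node,
    cts:element-value-query($name, "Marklogic corporation",
      "case-insensitive"))
)
```

Following the rule above, these three checks should evaluate to true, false, and true respectively.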

If neither "diacritic-sensitive" nor "diacritic-insensitive" is present, $text is used to determine diacritic sensitivity. If $text contains no diacritics, it specifies "diacritic-insensitive". If $text contains diacritics, it specifies "diacritic-sensitive".
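
The diacritic rule works the same way. In this sketch, the 'name' element and both values are illustrative assumptions.

```xquery
xquery version "1.0-ml";

let $name := xs:QName("name")
return (
  (: $text has no diacritics, so the query is treated as
     "diacritic-insensitive" and can match 'Günther' :)
  cts:contains(<name>Günther</name>,
    cts:element-value-query($name, "Gunther")),
  (: $text has a diacritic, so the query is treated as
     "diacritic-sensitive"; the value is identical :)
  cts:contains(<name>Günther</name>,
    cts:element-value-query($name, "Günther")),
  (: diacritic-sensitive query against a value that lacks
     the diacritic :)
  cts:contains(<name>Gunther</name>,
    cts:element-value-query($name, "Günther"))
)
```

Following the rule above, these checks should evaluate to true, true, and false respectively. Passing "diacritic-insensitive" explicitly removes the dependence on how $text happens to be spelled.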

If neither "punctuation-sensitive" nor "punctuation-insensitive" is present, $text is used to determine punctuation sensitivity. If $text contains no punctuation, it specifies "punctuation-insensitive". If $text contains punctuation, it specifies "punctuation-sensitive".

If neither "whitespace-sensitive" nor "whitespace-insensitive" is present, the query is "whitespace-insensitive".

If neither "wildcarded" nor "unwildcarded" is present, the database configuration and $text determine wildcarding. If the database has any wildcard indexes enabled ("three character searches", "two character searches", "one character searches", or "trailing wildcard searches") and if $text contains either of the wildcard characters '?' or '*', it specifies "wildcarded". Otherwise it specifies "unwildcarded".

If neither "stemmed" nor "unstemmed" is present, the database configuration determines stemming. If the database has "stemmed searches" enabled, it specifies "stemmed". Otherwise it specifies "unstemmed". If the query is a wildcarded query and also a phrase query (contains two or more terms), the wildcard terms in the query are unstemmed.

When you use the "exact" option, you should also enable "fast case sensitive searches" and "fast diacritic sensitive searches" in your database configuration.

Negative "min-occurs" or "max-occurs" values will be treated as 0 and non-integral values will be rounded down. An error will be raised if the "min-occurs" value is greater than the "max-occurs" value.
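
As a sketch of the occurrence options, the following query asks for fragments in which a matching element value occurs at least twice and at most three times. The 'keyword' element and the value "xquery" are illustrative assumptions.

```xquery
xquery version "1.0-ml";

(: Sketch: 'keyword' and "xquery" are illustrative names.
   A fragment matches only if the value occurs 2 or 3 times. :)
cts:search(fn:doc(),
  cts:element-value-query(
    xs:QName("keyword"),
    "xquery",
    ("min-occurs=2", "max-occurs=3")))
```

Swapping the two numbers ("min-occurs=3" with "max-occurs=2") would fall under the error case described above, because the minimum would exceed the maximum.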

Note that the text content for the value in a cts:element-value-query is treated the same as a phrase in a cts:word-query, where the phrase is the element value. Therefore, any wildcard and/or stemming rules are treated like a phrase. For example, if you have an element value of "hello friend" with wildcarding enabled for a query, a cts:element-value-query for "he*" will not match because the wildcard matches do not span word boundaries, but a cts:element-value-query for "hello *" will match. A search for "*" will match, because a "*" wildcard by itself is defined to match the value. Similarly, stemming rules are applied to each term, so a search for "hello friends" would match when stemming is enabled for the query because "friends" matches "friend". For an example, see the fourth example below.

Similarly, because a "*" wildcard by itself is defined to match the value, the following query will match any element with the QName my-element, regardless of the wildcard indexes enabled in the database configuration:

cts:element-value-query(xs:QName("my-element"), "*", "wildcarded")

Example

  cts:search(//module,
    cts:element-value-query(
      xs:QName("function"),
      "MarkLogic Corporation"))

  => .. relevance-ordered sequence of 'module' element
  ancestors of 'function' elements whose text
  content equals 'MarkLogic Corporation'.

Example

  cts:search(//module,
    cts:element-value-query(
      xs:QName("function"),
      "MarkLogic Corporation", "case-insensitive"))

  => .. relevance-ordered sequence of 'module' element
  ancestors of 'function' elements whose text
  content equals 'MarkLogic Corporation', or any other
  case-shift like 'MARKLOGIC CorpoRation'.

Example

  cts:search(//module,
    cts:and-query((
      cts:element-value-query(
        xs:QName("function"),
        "MarkLogic Corporation",
        "punctuation-insensitive", 0.5),
      cts:element-value-query(
        xs:QName("title"),
        "Word Query"))))

  => .. relevance-ordered sequence of 'module' elements
  which are ancestors of both:
  (a) 'function' elements with text content equal to
      'MarkLogic Corporation', ignoring embedded
      punctuation,
  AND
  (b) 'title' elements with text content equal to
      'Word Query'.
  The results of the first sub-query are given weight 0.5,
  and the results of the second sub-query are given the
  default weight 1.0.  As a result, the title phrase
  'Word Query' counts more heavily towards the relevance
  score.

Example

let $node := <my-node>hello friend</my-node>
return (
cts:contains($node, cts:element-value-query(xs:QName('my-node'),
      "hello friends", "stemmed")),
cts:contains($node, cts:element-value-query(xs:QName('my-node'),
      "he*", "wildcarded")),
cts:contains($node, cts:element-value-query(xs:QName('my-node'),
      "hello f*", "wildcarded"))
)

=> true
   false
   true

Comments

  • How can we do a case-insensitive search with path range queries? I want a case-insensitive match for the path /pathSyantx against the value of $Type: let $xyz:= cts:and-query(( cts:collection-query(concat("data://", val, "/test")), cts:path-range-query("/pathSynatx", "=",$Type) ))
  • When using cts:element-value-query (and the other value queries), I mostly call it with the parameters $options="exact" and $weight=0. Why "exact"? Because with value queries I mostly need exact matches. For example, I have an element <status> with values like "finishing" and "finished". Without the option "exact", I would probably run a stemmed query, which would not differentiate between "finishing" and "finished". Why weight=0? Because otherwise the value query would contribute to the relevance score. As an example, I'd like to reuse the status element from above. Assume that every document has one status element and we search for documents with status="finished". If the weight is greater than 0, then smaller documents will be ranked higher because the term "finished" has a higher term frequency in the smaller documents. Therefore I use weight=0 so that the value query has no influence on the ranking.
  • What is the difference between cts:element-query and cts:element-value-query? Both of the queries below return the same result: cts:search(fn:doc(), cts:element-query( xs:QName("iid"), "7019")) cts:search(fn:doc(), cts:element-value-query( xs:QName("iid"), "7019")) If they mean the same thing, why do both exist when one of them can do the work? Is there a special case where they are used for different purposes?
  • People ask why they should use cts: queries when pure XPath would do just as well. I agree with them, and we work very hard to make XPath reuse indexes under the covers, but there are situations where cts really helps. Consider the following XPath, which selects doc elements whose name child has the exact value "Tommy": //doc[name eq "Tommy"] Useful, but what if you want to employ full-text methods? For example, how would you search for Günther using Gunther with this method? Things can get difficult fast, hence the usefulness of something like cts:element-value-query: cts:search(//doc, cts:element-value-query( fn:QName("","name"), "Gunther", ("diacritic-insensitive"))) Maybe some day we could consider implementing the XPath full-text spec, http://www.w3.org/TR/xpath-full-text-10/. This is a relatively new specification, and it remains to be seen how broad adoption will be.
    • Will it work if you use //doc[xdmp:diacritic-less(name) eq "Gunther"]?
      • I would fully expect that to 'work'. The question is whether it would use the built-in indexes; you would need to test with your specific data, and perhaps trace with xdmp:query-trace(), to verify that an index is being used. If one was willing to use xdmp:diacritic-less, then I would advise using cts:element-value-query instead. Using built-in indexes with cts is great, but you can always set up your own indexes (or word lexicons, etc.) and achieve exactly what you want with very good query speeds. How one approaches this depends on the data (size, complexity) being queried. At small scales it tends to be fast already (and what counts as small scale for MarkLogic tends to be 'large scale' for other technologies). At 'big data' scales you really need to ensure that you have set indexes and tweaked the 'racing car' to perform at its best.