MarkLogic 12 EA 2 Product Documentation
cts:element-attribute-word-query

cts:element-attribute-word-query(
   $element-name as xs:QName*,
   $attribute-name as xs:QName*,
   $text as xs:string*,
   [$options as xs:string*],
   [$weight as xs:double?]
) as cts:element-attribute-word-query

Summary

Returns a query matching elements by name that have attributes by name whose values contain a given word or phrase.

Parameters
element-name One or more element QNames to match. When multiple QNames are specified, the query matches if any QName matches.
attribute-name One or more attribute QNames to match. When multiple QNames are specified, the query matches if any QName matches.
text Some words or phrases to match. When multiple strings are specified, the query matches if any string matches.
options Options to this query. The default is ().

Options include:

"case-sensitive"
A case-sensitive query.
"case-insensitive"
A case-insensitive query.
"diacritic-sensitive"
A diacritic-sensitive query.
"diacritic-insensitive"
A diacritic-insensitive query.
"punctuation-sensitive"
A punctuation-sensitive query.
"punctuation-insensitive"
A punctuation-insensitive query.
"whitespace-sensitive"
A whitespace-sensitive query.
"whitespace-insensitive"
A whitespace-insensitive query.
"stemmed"
A stemmed query.
"unstemmed"
An unstemmed query.
"wildcarded"
A wildcarded query.
"unwildcarded"
An unwildcarded query.
"exact"
An exact match query. Shorthand for "case-sensitive", "diacritic-sensitive", "punctuation-sensitive", "whitespace-sensitive", "unstemmed", and "unwildcarded".
"lang=iso639code"
Specifies the language of the query. The iso639code portion is case-insensitive and uses the language codes specified by ISO 639. The default is specified in the database configuration.
"min-occurs=number"
Specifies the minimum number of occurrences required. If fewer than this number of words occur, the fragment does not match. The default is 1.
"max-occurs=number"
Specifies the maximum number of occurrences allowed. If more than this number of words occur, the fragment does not match. The default is unbounded.
"synonym"
Specifies that all of the terms in the $text parameter are considered synonyms for scoring purposes. The result is that occurrences of more than one of the synonyms are scored as if there are more occurrences of the same term (as opposed to having a separate term that contributes to score).
"lexicon-expand=value"
The value is one of full, prefix-postfix, off, or heuristic. The default is heuristic.
lexicon-expand=full resolves wildcards by expanding the pattern to words in a lexicon (if one is available) and turning the query into a series of cts:word-query constructs, even if this takes a long time to evaluate.
lexicon-expand=prefix-postfix resolves wildcards by expanding the pattern to the prefixes and postfixes of the words in the word lexicon (if there is one) and turning the query into a series of character queries, even if this takes a long time to evaluate.
lexicon-expand=off resolves wildcards only by looking up character patterns in the search pattern index, not in the lexicon.
lexicon-expand=heuristic resolves wildcards using a series of internal rules, such as estimating the number of lexicon entries that need to be scanned, checking whether the estimate crosses certain thresholds, and, if appropriate, using a method other than lexicon expansion to resolve the query.
"lexicon-expansion-limit=number"
Specifies the limit for lexicon expansion. This restricts the number of lexicon expansions that can be performed. If the limit is exceeded, the server may raise an error, depending on whether the "limit-check" option is set. The default is 4096.
"limit-check"
Specifies that an error will be raised if the lexicon expansion exceeds the specified limit.
"no-limit-check"
Specifies that no error is raised if the lexicon expansion exceeds the specified limit. The server will try to resolve the wildcard.
weight A weight for this query. Higher weights move search results up in the relevance order. The default is 1.0. The weight should be between 64 and -16. Weights greater than 64 have the same effect as a weight of 64. Weights with an absolute value less than 0.0625 (between -0.0625 and 0.0625) are rounded to 0, which means that they do not contribute to the score.
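
The parameter descriptions above can be sketched in a single constructor call. This is an illustrative sketch only: the second element name ("method") and the second phrase ("MarkLogic Server") are hypothetical values chosen to show the any-match behavior of sequence arguments, not names taken from a real schema.

```xquery
xquery version "1.0-ml";

(: Sketch: sequences in the first three parameters are matched as alternatives. :)
cts:element-attribute-word-query(
  (xs:QName("function"), xs:QName("method")),     (: either element name; "method" is hypothetical :)
  xs:QName("type"),
  ("MarkLogic Corporation", "MarkLogic Server"),  (: either phrase; second phrase is hypothetical :)
  ("case-insensitive", "lang=en"),                (: options are passed as a sequence of strings :)
  2.0)                                            (: weight above the default of 1.0 :)
```

Run on its own, the expression only constructs the query; pass it to cts:search to evaluate it against a database.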

Usage Notes

If neither "case-sensitive" nor "case-insensitive" is present, $text is used to determine case sensitivity. If $text contains no uppercase, it specifies "case-insensitive". If $text contains uppercase, it specifies "case-sensitive".
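
As a rough illustration of the case-sensitivity rule above, the following sketch builds three queries that differ only in the casing of $text and in whether a case option is given. The element and attribute names are the ones used in the examples on this page.

```xquery
xquery version "1.0-ml";

(
  (: No uppercase in $text and no case option: treated as "case-insensitive". :)
  cts:element-attribute-word-query(
    xs:QName("function"), xs:QName("type"), "marklogic corporation"),

  (: Uppercase in $text and no case option: treated as "case-sensitive". :)
  cts:element-attribute-word-query(
    xs:QName("function"), xs:QName("type"), "MarkLogic Corporation"),

  (: An explicit option takes the place of the inference from $text. :)
  cts:element-attribute-word-query(
    xs:QName("function"), xs:QName("type"), "MarkLogic Corporation",
    "case-insensitive")
)
```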

If neither "diacritic-sensitive" nor "diacritic-insensitive" is present, $text is used to determine diacritic sensitivity. If $text contains no diacritics, it specifies "diacritic-insensitive". If $text contains diacritics, it specifies "diacritic-sensitive".

If neither "punctuation-sensitive" nor "punctuation-insensitive" is present, $text is used to determine punctuation sensitivity. If $text contains no punctuation, it specifies "punctuation-insensitive". If $text contains punctuation, it specifies "punctuation-sensitive".

If neither "whitespace-sensitive" nor "whitespace-insensitive" is present, the query is "whitespace-insensitive".

If neither "wildcarded" nor "unwildcarded" is present, the database configuration and $text determine wildcarding. If the database has any wildcard indexes enabled ("three character searches", "two character searches", "one character searches", or "trailing wildcard searches") and if $text contains either of the wildcard characters '?' or '*', it specifies "wildcarded". Otherwise it specifies "unwildcarded".
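
A wildcarded search might look like the following sketch, which states the wildcard options explicitly rather than relying on the inference described above. It assumes a database with at least one wildcard index enabled; the pattern "Mark*" and the limit of 1000 are arbitrary illustrative values.

```xquery
xquery version "1.0-ml";

(: Sketch: explicit wildcarding with a bounded lexicon expansion. :)
cts:search(//module,
  cts:element-attribute-word-query(
    xs:QName("function"),
    xs:QName("type"),
    "Mark*",                              (: '*' is a wildcard character :)
    ("wildcarded",
     "lexicon-expand=heuristic",          (: the documented default, spelled out :)
     "lexicon-expansion-limit=1000",      (: arbitrary value for illustration :)
     "limit-check")))                     (: raise an error if the limit is exceeded :)
```

Because "Mark*" contains an uppercase letter and no case option is given, the query is also treated as case-sensitive.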

If neither "stemmed" nor "unstemmed" is present, the database configuration determines stemming. If the database has "stemmed searches" enabled, it specifies "stemmed". Otherwise it specifies "unstemmed". If the query is a wildcarded query and also a phrase query (contains two or more terms), the wildcard terms in the query are unstemmed.
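
To avoid depending on the database default, the stemming behavior can be requested explicitly. The sketch below asks for an unstemmed match; the search term "faster" is borrowed from the examples on this page, and the choice of "unstemmed" is only for illustration.

```xquery
xquery version "1.0-ml";

(: Sketch: request unstemmed matching regardless of the database default. :)
cts:search(//module,
  cts:element-attribute-word-query(
    xs:QName("function"),
    xs:QName("type"),
    "faster",
    ("unstemmed", "lang=en")))
```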

Negative "min-occurs" or "max-occurs" values will be treated as 0 and non-integral values will be rounded down. An error will be raised if the "min-occurs" value is greater than the "max-occurs" value.
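
The occurrence options can be combined as in the following sketch, which matches only fragments where the word occurs at least twice and at most five times in matching attribute values. The bounds 2 and 5 are arbitrary illustrative values.

```xquery
xquery version "1.0-ml";

(: Sketch: bound the number of occurrences; min-occurs must not exceed max-occurs. :)
cts:search(//module,
  cts:element-attribute-word-query(
    xs:QName("function"),
    xs:QName("type"),
    "MarkLogic",
    ("min-occurs=2", "max-occurs=5")))
```

Swapping the two values (for example "min-occurs=5" with "max-occurs=2") would raise an error, per the note above.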

Example

  cts:search(//module,
    cts:element-attribute-word-query(
      xs:QName("function"),
      xs:QName("type"),
      "MarkLogic Corporation"))

  => .. relevance-ordered sequence of 'module' element
  ancestors of 'function' elements that have a 'type'
  attribute whose value contains the phrase
  'MarkLogic Corporation'.

Example

  cts:search(//module,
    cts:element-attribute-word-query(
      xs:QName("function"),
      xs:QName("type"),
      "MarkLogic Corporation", "case-insensitive"))

  => .. relevance-ordered sequence of 'module' element
  ancestors of 'function' elements that have a 'type'
  attribute whose value contains the phrase
  'MarkLogic Corporation', or any other case-shift,
  like 'MARKLOGIC CorpoRation'.

Example

  cts:search(//module,
    cts:and-query((
      cts:element-attribute-word-query(
        xs:QName("function"),
        xs:QName("type"),
        "MarkLogic Corporation",
        "punctuation-insensitive", 0.5),
      cts:element-word-query(
        xs:QName("title"),
        "faster"))))

  => .. relevance-ordered sequence of 'module' element
  ancestors of both:
  (a) 'function' elements with 'type' attribute whose value
      contains the phrase 'MarkLogic Corporation',
      ignoring embedded punctuation,
  AND
  (b) 'title' elements whose text content contains the
      term 'faster',
  with the results of the first query given weight 0.5,
  as opposed to the default 1.0 for the second query.
