
cts:field-word-query

cts:field-word-query(
   $field-name as xs:string*,
   $text as xs:string*,
   [$options as xs:string*],
   [$weight as xs:double?]
) as cts:field-word-query

Summary

Returns a query matching fields whose content contains the given phrase. If the specified field does not exist, this function throws an exception. A field is a named object that specifies elements to include in and exclude from a search, and can include score weights for any included elements. You create fields at the database level using the Admin Interface. For details on fields, see the chapter on "Fields Database Settings" in the Administrator's Guide.
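
Fields can also be scripted through the Admin API instead of the Admin Interface. The following is a minimal sketch, not an official recipe: it assumes a database named "Documents" and creates a root field named "buzz" ("include root" set to false) that includes the element buzz and excludes the element baz, similar to the field assumed by the later examples on this page. The database name and the choice of a root field rather than a path field are assumptions made for illustration.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
    at "/MarkLogic/admin.xqy";

(: Assumptions: the target database is "Documents" and it does not
 : already have a field named "buzz". :)
let $dbid   := xdmp:database("Documents")
let $config := admin:get-configuration()

(: Root field with "include root" false, so only explicitly included
 : elements contribute to the field. :)
let $config := admin:database-add-field(
                 $config, $dbid,
                 admin:database-field("buzz", fn:false()))

(: Include <buzz>: no namespace, weight 1.0, no attribute constraint. :)
let $config := admin:database-add-field-included-element(
                 $config, $dbid, "buzz",
                 admin:database-included-element("", "buzz", 1.0, "", "", ""))

(: Exclude <baz>: no namespace. :)
let $config := admin:database-add-field-excluded-element(
                 $config, $dbid, "buzz",
                 admin:database-excluded-element("", "baz"))

return admin:save-configuration($config)
```

After the configuration is saved, admin:database-get-field can be used to confirm that the field and its included element are present.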

Parameters
$field-name One or more field names to search over. If multiple field names are supplied, the match can be in any of the specified fields (or-query semantics).
$text The word or phrase to match. If multiple strings are specified, the query matches if any of the words or phrases match (or-query semantics).
$options Options to this query. The default is ().

Options include:

"case-sensitive"
A case-sensitive query.
"case-insensitive"
A case-insensitive query.
"diacritic-sensitive"
A diacritic-sensitive query.
"diacritic-insensitive"
A diacritic-insensitive query.
"punctuation-sensitive"
A punctuation-sensitive query.
"punctuation-insensitive"
A punctuation-insensitive query.
"whitespace-sensitive"
A whitespace-sensitive query.
"whitespace-insensitive"
A whitespace-insensitive query.
"stemmed"
A stemmed query.
"unstemmed"
An unstemmed query.
"wildcarded"
A wildcarded query.
"unwildcarded"
An unwildcarded query.
"exact"
An exact match query. Shorthand for "case-sensitive", "diacritic-sensitive", "punctuation-sensitive", "whitespace-sensitive", "unstemmed", and "unwildcarded".
"lang=iso639code"
Specifies the language of the query. The iso639code portion is case-insensitive, and uses the language codes specified by ISO 639. The default is specified in the database configuration.
"distance-weight=number"
A weight applied based on the minimum distance between matches of this query. Higher weights add to the importance of proximity (as opposed to term matches) when the relevance order is calculated. The default value is 0.0 (no impact of proximity). The weight should be between -16 and 64. Weights greater than 64 have the same effect as a weight of 64. This parameter has no effect if the word positions index is not enabled. It also has no effect on searches that use score-simple, score-random, or score-zero: those scoring algorithms do not consider term frequency, so proximity is irrelevant.
"min-occurs=number"
Specifies the minimum number of occurrences required. If fewer than this number of words occur, the fragment does not match. The default is 1.
"max-occurs=number"
Specifies the maximum number of occurrences allowed. If more than this number of words occur, the fragment does not match. The default is unbounded.
"synonym"
Specifies that all of the terms in the $text parameter are considered synonyms for scoring purposes. The result is that occurrences of more than one of the synonyms are scored as if there are more occurrences of the same term (as opposed to having a separate term that contributes to score).
"lexicon-expand=value"
The value is one of full, prefix-postfix, off, or heuristic (the default is heuristic).
  • lexicon-expand=full specifies that wildcards are resolved by expanding the pattern to words in a lexicon (if one is available) and turning the query into a series of cts:word-queries, even if this takes a long time to evaluate.
  • lexicon-expand=prefix-postfix specifies that wildcards are resolved by expanding the pattern to the prefixes and postfixes of the words in the word lexicon (if there is one) and turning the query into a series of character queries, even if this takes a long time to evaluate.
  • lexicon-expand=off specifies that wildcards are resolved only by looking up character patterns in the search pattern index, not in the lexicon.
  • lexicon-expand=heuristic, which is the default, specifies that wildcards are resolved using a series of internal rules, such as estimating the number of lexicon entries that need to be scanned, checking whether the estimate crosses certain thresholds, and (if appropriate) using a method other than lexicon expansion to resolve the query.
"lexicon-expansion-limit=number"
Specifies the limit for lexicon expansion. This restricts the number of lexicon expansions that can be performed. If the limit is exceeded, the server may raise an error, depending on whether the "limit-check" option is set. The default value for this option is 4096.
"limit-check"
Specifies that an error will be raised if the lexicon expansion exceeds the specified limit.
"no-limit-check"
Specifies that an error will not be raised if the lexicon expansion exceeds the specified limit. The server will try to resolve the wildcard.
$weight A weight for this query. Higher weights move search results up in the relevance order. The default is 1.0. The weight should be between -16 and 64. Weights greater than 64 have the same effect as a weight of 64. Weights with an absolute value less than 0.0625 (that is, between -0.0625 and 0.0625) are rounded to 0, which means that they do not contribute to the score.
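
To show how the $options and $weight parameters combine, here is a small sketch. It assumes a field named "myField" already exists in the database (the constructor throws an exception otherwise); the search terms, the weight of 2.0, and the expansion limit of 1000 are arbitrary values chosen for illustration.

```xquery
xquery version "1.0-ml";

(: Assumption: a field named "myField" is configured on this database. :)

(: Match "amy" or "bill" without regard to case, without stemming,
 : treating the text as English, and with double the default weight. :)
let $weighted :=
  cts:field-word-query(
    "myField",
    ("amy", "bill"),
    ("case-insensitive", "unstemmed", "lang=en"),
    2.0)

(: Wildcard match resolved by full lexicon expansion. With "limit-check",
 : exceeding the expansion limit raises an error. :)
let $wildcard :=
  cts:field-word-query(
    "myField",
    "bil*",
    ("wildcarded", "lexicon-expand=full",
     "lexicon-expansion-limit=1000", "limit-check"))

return (
  cts:search(fn:doc(), $weighted),
  cts:search(fn:doc(), $wildcard)
)
```

Options are passed as a sequence of strings, so any combination listed above can be supplied in a single call.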

Usage Notes

If you use cts:near-query with cts:field-word-query, the distance supplied in the near query applies to the whole document, not just to the field. For example, if you specify a near query with a distance of 3, it will return matches when the words or phrases are within 3 words in the whole document, even if some of those words are not in the specified field. For a code example illustrating this, see the second example below.

Phrases are determined based on words being next to each other (word positions with a distance of 1) and words being in the same instance of the field. Because field word positions are determined based on the fragment, not on the field, field phrases cannot span excluded elements: MarkLogic Server breaks out of the field when it encounters the excluded element and starts a new field instance when it encounters the next included element. Similarly, field phrases will not span included sibling elements. The second code example below illustrates this.

Field phrases automatically phrase-through all child elements of an included element until they encounter an explicitly excluded element. The third example below illustrates this. This automatic phrase-through behavior is convenient if, for example, you create a field that includes only the element ABSTRACT. All child elements of ABSTRACT are then included in the field, and phrases span all of the child elements (that is, phrases "phrase-through" all the child elements).

Negative "min-occurs" or "max-occurs" values will be treated as 0 and non-integral values will be rounded down. An error will be raised if the "min-occurs" value is greater than the "max-occurs" value.
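
As an illustration of the occurrence options, the following sketch restricts matches to fragments where the word occurs between two and five times in the field. The field name "myField" and the bounds are assumptions chosen for the example.

```xquery
xquery version "1.0-ml";

(: Assumption: a field named "myField" is configured on this database. :)
cts:search(fn:doc(),
  cts:field-word-query(
    "myField",
    "amy",
    ("min-occurs=2", "max-occurs=5")))

(: Following the rules stated above:
 :   "min-occurs=2.7" is rounded down to 2,
 :   "min-occurs=-1" is treated as 0,
 :   "min-occurs=5" combined with "max-occurs=2" raises an error.
 :)
```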

Example

(: Assume the database configuration includes a field named "myField"
 : on the paths /root/a/name and /root/b/name and a corresponding
 : field range index.
 :)

cts:search(fn:doc(), cts:field-word-query("myField", ("amy", "bill")))

(: Then the search matches all documents that contain either "amy" or
 : "bill" in the value of /root/a/name or /root/b/name (the field). 
 : For example, it would match this document:
 :
 :   <root><a><name>bill</name></a></root>
 :
 : But would not match this document:
 :
 :   <root><c><name>bill</name></c></root>
 :
 : By contrast, if you defined an element index on the element "name"
 : and queried using cts:element-word-query, both documents would match.
 :)

Example

(: Assume the database configuration includes a field named "buzz"
 : on the path /hello/buzz, with localname "buzz" as an include and
 : localname "baz" as an exclude.
 :)

let $x :=
  <hello>word1 word2 word3
    <buzz>word4 word5</buzz>
    <baz>word6 word7 word8</baz>
    <buzz>word9 word10</buzz>
  </hello>
return (
  cts:contains($x, cts:near-query(
    (cts:field-word-query("buzz", "word5"),
     cts:field-word-query("buzz", "word9")), 3)),
  cts:contains($x, cts:near-query(
    (cts:field-word-query("buzz", "word5"),
     cts:field-word-query("buzz", "word9")), 4)),
  cts:contains($x,
    cts:field-word-query("buzz", "word5 word9")))

(:
 : Returns the sequence ("false", "true", "false").
 : The first part does not match because "word5" and "word9" do 
 : not occur within 3 words of each other; distance is calculated 
 : based on the whole node (or document if querying documents in 
 : the database), rather than on the field. The distance requirement
 : of the second near-query (4) is met, so the query matches and
 : returns true. The third query does not match because there
 : are words between "word5" and "word9", and the phrase is based
 : on the entire node, not on the field.
:)

Example

(: Assume the database configuration includes a field named "buzz"
 : on the path /hello/buzz, with localname "buzz" as an include and
 : localname "baz" as an exclude.
 :)
let $x :=
<hello>
  <buzz>word1 word2
    <gads>word3 word4 word5</gads>
    <zukes>word6 word7 word8</zukes>
  word9 word10
  </buzz>
</hello>
return (
cts:contains($x,
  cts:field-word-query("buzz", "word2 word3")))

(: Returns "true" because the children of "buzz" are not excluded, 
 : and are therefore automatically phrased through.
:)

Comments

  • I'm sorry the examples are confusing. We'll try to make them more clear. Thank you for the feedback. You can also learn more about fields here: http://docs.marklogic.com/guide/admin/fields Regarding the last example, I wonder if you have a properly configured field on the database you tried the example on. You should get the expected result if you create a path field named buzz, with /hello/buzz or //buzz as the path, and buzz as an include. A root field named buzz, with "include root" false (which is the default) and buzz as an include should also work. You can check your field configuration with a query similar to the following, or by looking on the Admin Interface:

      xquery version "1.0-ml";
      import module namespace admin = "http://marklogic.com/xdmp/admin"
          at "/MarkLogic/admin.xqy";
      let $config := admin:get-configuration()
      return admin:database-get-field($config, xdmp:database("Documents"), "buzz")

    If you run the above query, the output should include an included-element child with "buzz" as the localname if your field is properly configured, whether you've configured a path field or a root field.
  • the last example is not giving the mentioned result
  • When writing a tutorial, make sure it is easy for others to understand. We can talk to ourselves and convince ourselves that we have talked to others, but that is not the case. The things in the above article are not that difficult to explain, but you still make them look like complex shit. There is nothing wrong with learning "how to write a tutorial and why you are writing a tutorial".
    • Please see my reply, above. Apologies for not replying directly to your post originally due to my fumble fingers.