MarkLogic Server 11.0 Product Documentation
cts:field-words

cts:field-words(
   $field-names as xs:string*,
   [$start as xs:string?],
   [$options as xs:string*],
   [$query as cts:query?],
   [$quality-weight as xs:double?],
   [$forest-ids as xs:unsignedLong*]
) as xs:string*

Summary

Returns words from the specified field word lexicon. This function requires an field lexicon for each of the field specified in the function. If there is not an field word lexicon configured for any of the specified fields, an exception is thrown. The words are returned in collation order.

Parameters
field-names One or more field names.
start A starting word. Returns only this word and any following words from the lexicon. If the parameter is not in the lexicon, then it returns the words beginning with the next word.
options Options. The default is ().

Options include:

"ascending"
Words should be returned in ascending order.
"descending"
Words should be returned in descending order.
"any"
Words from any fragment should be included.
"document"
Words from document fragments should be included.
"properties"
Words from properties fragments should be included.
"locks"
Words from locks fragments should be included.
"collation=URI"
Use the lexicon with the collation specified by URI.
"limit=N"
Return no more than N words. You should not use this option with the "skip" option. Use "truncate" instead.
"skip=N"
Skip over fragments selected by the cts:query to treat the Nth fragment as the first fragment. Words from skipped fragments are not included. Only applies when a $query parameter is specified.
"sample=N"
Return only words from the first N fragments after skip selected by the cts:query. Only applies when a $query parameter is specified.
"truncate=N"
Include only words from the first N fragments after skip selected by the cts:query. Only applies when a $query parameter is specified.
"score-logtfidf"
Compute scores using the logtfidf method. Only applies when a $query parameter is specified.
"score-logtf"
Compute scores using the logtf method. Only applies when a $query parameter is specified.
"score-simple"
Compute scores using the simple method. Only applies when a $query parameter is specified.
"score-random"
Compute scores using the random method. Only applies when a $query parameter is specified.
"score-zero"
Compute all scores as zero. Only applies when a $query parameter is specified.
"checked"
Word positions should be checked when resolving the query.
"unchecked"
Word positions should not be checked when resolving the query.
"too-many-positions-error"
If too much memory is needed to perform positions calculations to check whether a document matches a query, return an XDMP-TOOMANYPOSITIONS error, instead of accepting the document as a match.
"concurrent"
Perform the work concurrently in another thread. This is a hint to the query optimizer to help parallelize the lexicon work, allowing the calling query to continue performing other work while the lexicon processing occurs. This is especially useful in cases where multiple lexicon calls occur in the same query (for example, resolving many facets in a single query).
"map"
Return results as a single map:map value instead of as an xs:string* sequence .
query Only include words in fragments selected by the cts:query. The words do not need to match the query, but the words must occur in fragments selected by the query. The fragments are not filtered to ensure they match the query, but instead selected in the same manner as "unfiltered" cts:search operations. If a string is entered, the string is treated as a cts:word-query of the specified string.
quality-weight A document quality weight to use when computing scores. The default is 1.0.
forest-ids A sequence of IDs of forests to which the search will be constrained. An empty sequence means to search all forests in the database. The default is ().

Usage Notes

Only one of "ascending" or "descending" may be specified in the options parameter. If neither "ascending" nor "descending" is specified, then the default is "ascending".

Only one of "any", "document", "properties", or "locks" may be specified in the options parameter. If none of "any", "document", "properties", or "locks" are specified and there is a $query parameter, then the default is "document". If there is no $query parameter then the default is "any".

Only one of the "score-logtfidf", "score-logtf", "score-simple", "score-random", or "score-zero" options may be specified in the options parameter. If none of "score-logtfidf", "score-logtf", "score-simple", "score-random", or "score-zero" are specified, then the default is "score-logtfidf".

Only one of the "checked" or "unchecked" options may be specified in the options parameter. If neither "checked" nor "unchecked" are specified, then the default is "checked".

If "collation=URI" is not specified in the options parameter, then the default collation is used. If a lexicon with that collation does not exist, an error is thrown.

If "sample=N" is not specified in the options parameter, then all included words may be returned. If a $query parameter is not present, then "sample=N" has no effect.

If "truncate=N" is not specified in the options parameter, then words from all fragments selected by the $query parameter are included. If a $query parameter is not present, then "truncate=N" has no effect.

Only words that can be matched with field-word-query are included. That is, only words present in immediate text node children of the specified field as well as any text node children of child fields defined in the Admin Interface as field-word-query-throughs or phrase-throughs.

When run without a $query parameter and as a user with the admin role, the word lexicon functions return results that might include words from deleted fragments. However, when run as a user with the admin role and without a $query parameter, the word lexicon functions run faster (because they do not need to look up where each word comes from). It is therefore faster to run word lexicon functions as an admin user without passing a $query parameter.

To incrementally fetch a subset of the values returned by this function, use fn:subsequence on the output, rather than the "skip" option. The "skip" option is based on fragments matching the query parameter (if present), not on values. A fragment matched by query might contain multiple values or no values. The number of fragments skipped does not correspond to the number of values. Also, the skip is applied to the relevance ordered query matches, not to the ordered values list.

When using the "skip" option, use the "truncate" option rather than the "limit" option to control the number of matching fragments from which to draw values.

Example

  cts:field-words("animal","aardvark")
  => ("aardvark","aardvarks","aardwolf",...)
Powered by MarkLogic Server | Terms of Use | Privacy Policy