
cts:values

cts:values(
   $range-indexes as cts:reference*,
   [$start as xs:anyAtomicType?],
   [$options as xs:string*],
   [$query as cts:query?],
   [$quality-weight as xs:double?],
   [$forest-ids as xs:unsignedLong*]
) as xs:anyAtomicType*

Summary

Returns values from the specified value lexicon(s). Value lexicons are implemented using range indexes; consequently this function requires a range index for each reference passed in $range-indexes. If a range index is not configured for every one of those references, an exception is thrown.

Parameters
$range-indexes A sequence of references to range indexes.
$start A starting value. The parameter type must match the lexicon type. If the parameter value is not in the lexicon, then the values are returned beginning with the next value.
$options Options. The default is ().

Options include:

"ascending"
Values should be returned in ascending order.
"descending"
Values should be returned in descending order.
"any"
Values from any fragment should be included.
"document"
Values from document fragments should be included.
"properties"
Values from properties fragments should be included.
"locks"
Values from locks fragments should be included.
"frequency-order"
Values should be returned ordered by frequency.
"item-order"
Values should be returned ordered by item.
"fragment-frequency"
Frequency should be the number of fragments with an included value. This option is used with cts:frequency.
"item-frequency"
Frequency should be the number of occurrences of an included value. This option is used with cts:frequency.
"timezone=TZ"
Return timezone sensitive values (dateTime, time, date, gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone specified by TZ. Example timezones: Z, -08:00, +01:00.
"limit=N"
Return no more than N values. You should not use this option with the "skip" option. Use "truncate" instead.
"skip=N"
Skip over fragments selected by the cts:query to treat the Nth fragment as the first fragment. Values from skipped fragments are not included. This option affects the number of fragments selected by the cts:query to calculate frequencies. Only applies when a $query parameter is specified.
"sample=N"
Return only values from the first N fragments after skip selected by the cts:query. This option does not affect the number of fragments selected by the cts:query to calculate frequencies. Only applies when a $query parameter is specified.
"truncate=N"
Include only values from the first N fragments after skip selected by the cts:query. This option also affects the number of fragments selected by the cts:query to calculate frequencies. Only applies when a $query parameter is specified.
"score-logtfidf"
Compute scores using the logtfidf method. Only applies when a $query parameter is specified.
"score-logtf"
Compute scores using the logtf method. Only applies when a $query parameter is specified.
"score-simple"
Compute scores using the simple method. Only applies when a $query parameter is specified.
"score-random"
Compute scores using the random method. Only applies when a $query parameter is specified.
"score-zero"
Compute all scores as zero. Only applies when a $query parameter is specified.
"checked"
Word positions should be checked when resolving the query.
"unchecked"
Word positions should not be checked when resolving the query.
"too-many-positions-error"
If too much memory is needed to perform positions calculations to check whether a document matches a query, return an XDMP-TOOMANYPOSITIONS error, instead of accepting the document as a match.
"eager"
Perform most of the work concurrently before returning the first item from the indexes, and only some of the work sequentially while iterating through the rest of the items. This usually takes the shortest time for a complete item-order result or for any frequency-order result.
"lazy"
Perform only some of the work concurrently before returning the first item from the indexes, and most of the work sequentially while iterating through the rest of the items. This usually takes the shortest time for a small item-order partial result.
"concurrent"
Perform the work concurrently in another thread. This is a hint to the query optimizer to help parallelize the lexicon work, allowing the calling query to continue performing other work while the lexicon processing occurs. This is especially useful in cases where multiple lexicon calls occur in the same query (for example, resolving many facets in a single query).
"map"
Return results as a single map:map value instead of as an xs:anyAtomicType* sequence.
$query Only include values in fragments selected by the cts:query, and compute frequencies from this set of included values. The values do not need to match the query, but they must occur in fragments selected by the query. The fragments are not filtered to ensure they match the query, but instead selected in the same manner as "unfiltered" cts:search operations. If a string is entered, the string is treated as a cts:word-query of the specified string.
$quality-weight A document quality weight to use when computing scores. The default is 1.0.
$forest-ids A sequence of IDs of forests to which the search will be constrained. An empty sequence means to search all forests in the database. The default is ().
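
As a rough illustration of how the parameters combine, the sketch below asks for up to ten values, starting at "M", from fragments selected by a query. The element name first-name, the collection name people, and the existence of a matching string range index are assumptions made for this example; they are not part of the function's definition.

```xquery
xquery version "1.0-ml";

(: Assumption: a string element range index exists on the hypothetical
   element <first-name> (no namespace) with the root collation. :)
cts:values(
  cts:element-reference(
    xs:QName("first-name"),
    ("type=string", "collation=http://marklogic.com/collation/")),
  "M",                              (: $start: begin at "M" or the next value :)
  ("ascending", "limit=10"),        (: $options :)
  cts:collection-query("people")    (: $query: hypothetical collection :)
)
```

Because a $query is supplied and no fragment-scope option is given, values would be drawn from document fragments only.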

Usage Notes

Only one of "frequency-order" or "item-order" may be specified in the options parameter. If neither "frequency-order" nor "item-order" is specified, then the default is "item-order".

Only one of "fragment-frequency" or "item-frequency" may be specified in the options parameter. If neither "fragment-frequency" nor "item-frequency" is specified, then the default is "fragment-frequency".

Only one of "ascending" or "descending" may be specified in the options parameter. If neither "ascending" nor "descending" is specified, then the default is "ascending" if "item-order" is specified, and "descending" if "frequency-order" is specified.
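
The ordering and frequency options above can be sketched together as follows. This is a minimal example, assuming a string range index on a hypothetical first-name element; it pairs each returned value with the count reported by cts:frequency.

```xquery
xquery version "1.0-ml";

(: Assumption: a string element range index exists on the hypothetical
   element <first-name> (no namespace) with the root collation. :)
let $ref := cts:element-reference(
              xs:QName("first-name"),
              ("type=string", "collation=http://marklogic.com/collation/"))
(: "frequency-order" defaults to descending, so the most frequent come first.
   "item-frequency" counts occurrences rather than fragments. :)
for $value in cts:values($ref, (), ("frequency-order", "item-frequency", "limit=5"))
return fn:concat($value, ": ", cts:frequency($value))
```

Replacing "item-frequency" with "fragment-frequency" (the default) would instead count the fragments that contain each value.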

Only one of "any", "document", "properties", or "locks" may be specified in the options parameter. If none of "any", "document", "properties", or "locks" are specified and there is a $query parameter, then the default is "document". If there is no $query parameter then the default is "any".

Only one of the "score-logtfidf", "score-logtf", "score-simple", "score-random", or "score-zero" options may be specified in the options parameter. If none of "score-logtfidf", "score-logtf", "score-simple", "score-random", or "score-zero" are specified, then the default is "score-logtfidf".

Only one of the "checked" or "unchecked" options may be specified in the options parameter. If neither "checked" nor "unchecked" is specified, then the default is "checked".

If "collation=URI" is not specified in the options parameter, then the default collation is used. If a lexicon with that collation does not exist, an error is thrown.

If "sample=N" is not specified in the options parameter, then all included values may be returned. If a $query parameter is not present, then "sample=N" has no effect.

If "truncate=N" is not specified in the options parameter, then values from all fragments selected by the $query parameter are included. If a $query parameter is not present, then "truncate=N" has no effect.

To incrementally fetch a subset of the values returned by this function, use fn:subsequence on the output, rather than the "skip" option. The "skip" option is based on fragments matching the query parameter (if present), not on values. A fragment matched by query might contain multiple values or no values. The number of fragments skipped does not correspond to the number of values. Also, the skip is applied to the relevance ordered query matches, not to the ordered values list.
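
A minimal sketch of value-based paging with fn:subsequence, assuming a string range index on a hypothetical first-name element. The page size and page number are illustrative values.

```xquery
xquery version "1.0-ml";

(: Assumption: a string element range index exists on the hypothetical
   element <first-name> (no namespace) with the root collation. :)
let $ref := cts:element-reference(
              xs:QName("first-name"),
              ("type=string", "collation=http://marklogic.com/collation/"))
let $page-size := 10
let $page := 3
(: Page 3 of 10 per page: values 21 through 30 of the ordered list. :)
return fn:subsequence(cts:values($ref), ($page - 1) * $page-size + 1, $page-size)
```

Here the offset counts values in the ordered result, which is what paging normally needs; "skip" would count query-matched fragments instead.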

When using the "skip" option, use the "truncate" option rather than the "limit" option to control the number of matching fragments from which to draw values.
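
For comparison, a sketch of fragment-based windowing with "skip" and "truncate". The first-name index and the word query term are assumptions for illustration only.

```xquery
xquery version "1.0-ml";

(: Assumption: a string element range index exists on the hypothetical
   element <first-name> (no namespace) with the root collation. :)
let $ref := cts:element-reference(
              xs:QName("first-name"),
              ("type=string", "collation=http://marklogic.com/collation/"))
(: Skip the first 100 fragments selected by the query, then include values
   from only the next 50 fragments. Both options need a $query. :)
return cts:values($ref, (), ("skip=100", "truncate=50"), cts:word-query("lexicon"))
```

The number of values returned can be more or fewer than 50, since each selected fragment may hold several values or none.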

Example

(:
  Assuming that there are path namespaces defined with the following prefixes:
  my: http://aaa.com
  his: http://bbb.com

  Further assuming that a path index is defined using the above namespaces,
  '/my:a[@his:b="B1"]/my:c'.
:)
  xquery version "1.0-ml";

  declare namespace my = "http://aaa.com";
  declare namespace his = "http://bbb.com";

  xdmp:document-insert("/abc1.xml",<my:a his:b="B1"><my:c>C1</my:c></my:a>),
  xdmp:document-insert("/abc2.xml",<my:a his:b="B2"><my:c>C2</my:c></my:a>)

  (: Run the following as a separate query; it is based on the above setup :)
  xquery version "1.0-ml";

  declare namespace my = "http://aaa.com";
  declare namespace his = "http://bbb.com";

  cts:values(cts:path-reference('/my:a[@his:b="B1"]/my:c'))
  =>
    C1


Comments

  • Why does this function automatically return a sequence as if fn:distinct-values() had been run over it? I have 100 documents, all with the same test data. Instead of getting the 200 element values I ask for, I get 2 values. I tried to run this with fn:random(), so instead of the set value it now gives me a random value 200 times. Do you know which function I need in order to get the 200 values I ask for?
    • Hi Mick - cts:values() returns data from the range index. A range index will only contain a specific value once and then have references to the document IDs that contain the value. I'm a visual learner, so imagine a scenario like the one in the image below, where you have 2 documents each with a <first-name> element: https://uploads.disquscdn.com/images/d367926dafdaf3cd2fb78ca6eff990cea4b34f5f2519bcffbf4983c4ae17b03c.png If you created a range index on the <first-name> element, you could visualize the structure of the range index as follows: https://uploads.disquscdn.com/images/27eff081004cad640333a7b3bfaeff9e4567d5062b4d4f46755e5e57a0323b1c.png Hopefully understanding the structure of the range index helps you understand why cts:values() only brings back a value once even though that value is in multiple documents. If you want to know how many documents contain a certain value, you might take a look at the xdmp:estimate() function. If you want to know which documents contain a certain value, you might take a look at the cts:uris() function. If you truly want to return all the values (including repeats) then an XPath expression should work, for example: /doc/name/first-name/string() But be aware that you could bring back a lot of data and therefore this wouldn't be a good idea at scale.