cts.elementValueRanges( element-names as xs.QName[], [bounds as (String | Number | Boolean | null | Array | Object)[]], [options as String[]], [query as cts.query?], [quality-weight as Number?], [forest-ids as (Number|String)[]] ) as Sequence
Returns value ranges from the specified element value lexicon(s). Value lexicons are implemented using range indexes; consequently this function requires an element range index for each element specified in the function. If there is not a range index configured for each of the specified elements, an exception is thrown.
The values are divided into buckets. The $bounds parameter specifies the number of buckets and the size of each bucket. All included values are bucketed, even those less than the lowest bound or greater than the highest bound. An empty sequence for $bounds specifies one bucket, a single value specifies two buckets, two values specify three buckets, and so on.
If you have string values and you pass a $bounds parameter as in the following call:
cts.elementValueRanges(xs.QName("myElement"), ["f", "m"])
The first bucket contains string values that are less than the string f, the second bucket contains string values greater than or equal to f but less than m, and the third bucket contains string values that are greater than or equal to m.
For each non-empty bucket, an ObjectNode is returned. Each ObjectNode has a minimum property and a maximum property. If a bucket is bounded, its ObjectNode will also have a lowerBound property if it is bounded from below, and an upperBound property if it is bounded from above.
Empty buckets return nothing unless the "empties" option is specified.
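The bucketing rules above can be sketched in plain JavaScript. This is only a model of the described behavior, not MarkLogic code: `bucketValues` is a hypothetical helper, it works on an in-memory array rather than a lexicon, and it compares strings with JavaScript's built-in ordering instead of a collation.

```javascript
// Plain-JavaScript model of the bucketing rules described above.
// bucketValues is a hypothetical helper; it does not call MarkLogic.
function bucketValues(values, bounds, options = []) {
  const includeEmpties = options.includes("empties");

  // n bounds define n + 1 buckets.
  const buckets = [];
  for (let i = 0; i <= bounds.length; i++) {
    const bucket = { members: [] };
    if (i > 0) bucket.lowerBound = bounds[i - 1];         // bounded from below
    if (i < bounds.length) bucket.upperBound = bounds[i]; // bounded from above
    buckets.push(bucket);
  }

  // A value goes into the bucket whose index equals the number of bounds
  // that are less than or equal to it, so values below the lowest bound or
  // at and above the highest bound are still bucketed.
  for (const v of values) {
    const index = bounds.filter((b) => b <= v).length;
    buckets[index].members.push(v);
  }

  const result = [];
  for (const { members, lowerBound, upperBound } of buckets) {
    if (members.length === 0 && !includeEmpties) continue;
    const range = {};
    if (members.length > 0) {
      const sorted = [...members].sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
      range.minimum = sorted[0];
      range.maximum = sorted[sorted.length - 1];
    }
    if (lowerBound !== undefined) range.lowerBound = lowerBound;
    if (upperBound !== undefined) range.upperBound = upperBound;
    result.push(range);
  }
  return result;
}

// Two bounds give three buckets.
const stringRanges = bucketValues(["apple", "fig", "kiwi", "melon", "zebra"], ["f", "m"]);
console.log(JSON.stringify(stringRanges));
```

In this model, "apple" lands in the first bucket, "fig" and "kiwi" in the second, and "melon" and "zebra" in the third. Passing "empties" in the options array keeps buckets that received no values, which then carry only their bounds.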
Parameters | |
---|---|
element-names | One or more element QNames. |
bounds | A sequence of range bounds. The types must match the lexicon type. The values must be in strictly ascending order, otherwise an exception is thrown. |
options | Options. The default is (). |
query | Only include values in fragments selected by the cts:query, and compute frequencies from this set of included values. The values do not need to match the query, but they must occur in fragments selected by the query. The fragments are not filtered to ensure they match the query, but instead selected in the same manner as "unfiltered" cts.search operations. If a string is entered, the string is treated as a cts:word-query of the specified string. |
quality-weight | A document quality weight to use when computing scores. The default is 1.0. |
forest-ids | A sequence of IDs of forests to which the search will be constrained. An empty sequence means to search all forests in the database. The default is (). |
Only one of "frequency-order" or "item-order" may be specified in the options parameter. If neither "frequency-order" nor "item-order" is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified in the options parameter. If neither "fragment-frequency" nor "item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified in the options parameter. If neither "ascending" nor "descending" is specified, then the default is "ascending" if "item-order" is specified, and "descending" if "frequency-order" is specified.
Only one of "eager" or "lazy" may be specified in the options parameter. If neither "eager" nor "lazy" is specified, then the default is "eager" if "frequency-order" or "empties" is specified, otherwise "lazy".
Only one of "any", "document", "properties", or "locks" may be specified in the options parameter. If none of "any", "document", "properties", or "locks" are specified and there is a $query parameter, then the default is "document". If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple", "score-random", or "score-zero" options may be specified in the options parameter. If none of "score-logtfidf", "score-logtf", "score-simple", "score-random", or "score-zero" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified in the options parameter. If neither "checked" nor "unchecked" is specified, then the default is "checked".
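The defaulting rules for these mutually exclusive option groups can be summarized in a small plain-JavaScript sketch. `pickOne` and `resolveDefaults` are hypothetical helpers written for illustration, not part of the MarkLogic API. Two details are assumptions: that a conflicting pair raises an error, and that the "ascending"/"descending" default follows the resolved ordering when no ordering option is given.

```javascript
// Hypothetical helpers that model the documented defaults; not MarkLogic API.
function pickOne(options, choices, fallback) {
  const chosen = choices.filter((c) => options.includes(c));
  if (chosen.length > 1) {
    // Assumption: specifying two options from the same group is an error.
    throw new Error("Only one of " + choices.join(", ") + " may be specified");
  }
  return chosen.length === 1 ? chosen[0] : fallback;
}

function resolveDefaults(options = [], hasQuery = false) {
  const order = pickOne(options, ["frequency-order", "item-order"], "item-order");
  return {
    order,
    frequency: pickOne(options, ["fragment-frequency", "item-frequency"], "fragment-frequency"),
    // Default direction depends on the ordering in effect.
    direction: pickOne(options, ["ascending", "descending"],
      order === "item-order" ? "ascending" : "descending"),
    // "eager" is the default with "frequency-order" or "empties", otherwise "lazy".
    evaluation: pickOne(options, ["eager", "lazy"],
      order === "frequency-order" || options.includes("empties") ? "eager" : "lazy"),
    // With a query the default scope is "document", otherwise "any".
    scope: pickOne(options, ["any", "document", "properties", "locks"],
      hasQuery ? "document" : "any"),
    scoring: pickOne(options,
      ["score-logtfidf", "score-logtf", "score-simple", "score-random", "score-zero"],
      "score-logtfidf"),
    checking: pickOne(options, ["checked", "unchecked"], "checked"),
  };
}

console.log(resolveDefaults(["frequency-order"], true));
```

For example, `resolveDefaults(["frequency-order"], true)` resolves to a descending, eager evaluation scoped to "document", while `resolveDefaults()` resolves to ascending, lazy, and "any".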
If "collation=URI" is not specified in the options parameter, then the default collation is used. If a lexicon with that collation does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter, then ranges with all included values may be returned. If a $query parameter is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter, then values from all fragments selected by the $query parameter are included. If a $query parameter is not present, then "truncate=N" has no effect.
To incrementally fetch a subset of the values returned by this function, use fn.subsequence on the output, rather than the "skip" option. The "skip" option is based on fragments matching the query parameter (if present), not on values. A fragment matched by query might contain multiple values or no values. The number of fragments skipped does not correspond to the number of values. Also, the skip is applied to the relevance ordered query matches, not to the ordered results list.
When using the "skip" option, use the "truncate" option rather than the "limit" option to control the number of matching fragments from which to draw values.
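The difference between skipping fragments and paging over values can be seen in a toy plain-JavaScript model. Everything here is assumed for illustration: the `fragments` array, its URIs, and the `valuesFrom` helper are made up, and `Array.prototype.slice` merely stands in for paging the output with fn.subsequence.

```javascript
// Toy model: each fragment selected by the query carries zero or more values.
// The data and the valuesFrom helper are hypothetical, not MarkLogic API.
const fragments = [
  { uri: "/a.xml", values: [3, 7] },
  { uri: "/b.xml", values: [] },
  { uri: "/c.xml", values: [1, 5, 9] },
  { uri: "/d.xml", values: [2] },
];

// Collect the values of the given fragments in ascending item order.
function valuesFrom(frags) {
  return frags.flatMap((f) => f.values).sort((a, b) => a - b);
}

// Skipping two fragments drops every value they hold (here 3 and 7),
// which is unrelated to position in the ordered list of values.
const afterFragmentSkip = valuesFrom(fragments.slice(2));

// Paging over the ordered output drops exactly the first two values.
const afterValuePaging = valuesFrom(fragments).slice(2);

console.log(afterFragmentSkip); // [ 1, 2, 5, 9 ]
console.log(afterValuePaging);  // [ 3, 5, 7, 9 ]
```

Skipping two fragments removed two values here only by coincidence of the data; had the skipped fragments been "/b.xml" and "/d.xml", a single value would have been dropped.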
// Run the following to load data for this example.
// Make sure you have an int element range index on
// number.
declareUpdate();
for (let x = 1; x < 11; x++) {
  xdmp.documentInsert("/" + x + ".xml",
    fn.head(xdmp.unquote('<root><number>' + x + '</number></root>')));
};
*********
// Now run the following:
cts.elementValueRanges(xs.QName("number"), [5, 10, 15, 20], "empties")
=>
{"minimum":1, "maximum":4, "upperBound":5}
{"minimum":5, "maximum":9, "lowerBound":5, "upperBound":10}
{"minimum":10, "maximum":10, "lowerBound":10, "upperBound":15}
{"lowerBound":15, "upperBound":20}
{"lowerBound":20}
// this query has the database fragmented on SPEECH and
// finds four ranges of SPEAKERs, against the shakespeare database
cts.elementValueRanges(xs.QName("SPEAKER"), ["F","N","S"]);
=>
{"minimum":"", "maximum":"EXTON", "upperBound":"F"}
{"minimum":"FABIAN", "maximum":"MYRMIDONS", "lowerBound":"F", "upperBound":"N"}
{"minimum":"NATHANIEL", "maximum":"RUTLAND", "lowerBound":"N", "upperBound":"S"}
{"minimum":"Sailor", "maximum":"YOUNG SIWARD", "lowerBound":"S"}

(: this is the same query as above, but it is getting the counts
   of the number of SPEAKERs for each bucket :)
for $bucket in cts:element-value-ranges(xs:QName("SPEAKER"),("F","N","S"))
return cts:frequency($bucket);
=>
9598
11321
5166
4981
// this is the same query as above, but it is getting the counts
// of the number of SPEAKERs for each bucket
const res = new Array();
for (const range of cts.elementValueRanges(xs.QName("SPEAKER"), ["F","N","S"])) {
  res.push(cts.frequency(range));
};
res;
=>
[9598, 11321, 5166, 4981]