cts:element-values

cts:element-values(
   $element-names as xs:QName*,
   [$start as xs:anyAtomicType?],
   [$options as xs:string*],
   [$query as cts:query?],
   [$quality-weight as xs:double?],
   [$forest-ids as xs:unsignedLong*]
) as xs:anyAtomicType*

Summary

Returns values from the specified element value lexicon(s). Value lexicons are implemented using range indexes; consequently this function requires an element range index for each element specified in the function. If there is not a range index configured for each of the specified elements, an exception is thrown.

Parameters
$element-names One or more element QNames. If you specify multiple lexicons, they must all be over the same value type (string, int, etc.).
$start A starting value. The parameter type must match the lexicon type. If the parameter value is not in the lexicon, then the values are returned beginning with the next value.
$options Options. The default is ().

Options include:

"ascending"
Values should be returned in ascending order.
"descending"
Values should be returned in descending order.
"any"
Values from any fragment should be included.
"document"
Values from document fragments should be included.
"properties"
Values from properties fragments should be included.
"locks"
Values from locks fragments should be included.
"frequency-order"
Values should be returned ordered by frequency.
"item-order"
Values should be returned ordered by item.
"fragment-frequency"
Frequency should be the number of fragments with an included value. This option is used with cts:frequency.
"item-frequency"
Frequency should be the number of occurrences of an included value. This option is used with cts:frequency.
"type=type"
Use the lexicon with the type specified by type (int, unsignedInt, long, unsignedLong, float, double, decimal, dateTime, time, date, gYearMonth, gYear, gMonth, gDay, yearMonthDuration, dayTimeDuration, string, or anyURI).
"collation=URI"
Use the lexicon with the collation specified by URI.
"timezone=TZ"
Return timezone sensitive values (dateTime, time, date, gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone specified by TZ. Example timezones: Z, -08:00, +01:00.
"limit=N"
Return no more than N values. You should not use this option with the "skip" option. Use "truncate" instead.
"skip=N"
Skip over fragments selected by the cts:query to treat the Nth fragment as the first fragment. Values from skipped fragments are not included. This option affects the number of fragments selected by the cts:query to calculate frequencies. Only applies when a $query parameter is specified.
"sample=N"
Return only values from the first N fragments after skip selected by the cts:query. This option does not affect the number of fragments selected by the cts:query to calculate frequencies. Only applies when a $query parameter is specified.
"truncate=N"
Include only values from the first N fragments after skip selected by the cts:query. This option also affects the number of fragments selected by the cts:query to calculate frequencies. Only applies when a $query parameter is specified.
"score-logtfidf"
Compute scores using the logtfidf method. Only applies when a $query parameter is specified.
"score-logtf"
Compute scores using the logtf method. Only applies when a $query parameter is specified.
"score-simple"
Compute scores using the simple method. Only applies when a $query parameter is specified.
"score-random"
Compute scores using the random method. Only applies when a $query parameter is specified.
"score-zero"
Compute all scores as zero. Only applies when a $query parameter is specified.
"checked"
Word positions should be checked when resolving the query.
"unchecked"
Word positions should not be checked when resolving the query.
"too-many-positions-error"
If too much memory is needed to perform positions calculations to check whether a document matches a query, return an XDMP-TOOMANYPOSITIONS error, instead of accepting the document as a match.
"eager"
Perform most of the work concurrently before returning the first item from the indexes, and only some of the work sequentially while iterating through the rest of the items. This usually takes the shortest time for a complete item-order result or for any frequency-order result.
"lazy"
Perform only some of the work concurrently before returning the first item from the indexes, and most of the work sequentially while iterating through the rest of the items. This usually takes the shortest time for a small item-order partial result.
"concurrent"
Perform the work concurrently in another thread. This is a hint to the query optimizer to help parallelize the lexicon work, allowing the calling query to continue performing other work while the lexicon processing occurs. This is especially useful in cases where multiple lexicon calls occur in the same query (for example, resolving many facets in a single query).
"map"
Return results as a single map:map value instead of as an xs:anyAtomicType* sequence.
"coordinate-system=name"
Use the lexicon that is configured with the specified coordinate system. Allowed values: "wgs84", "wgs84/double", "raw", "raw/double". Only applicable if the lexicon value type is point or long-lat-point.
"precision=value"
Use the lexicon that is configured with the specified precision. Allowed values: float and double. Only applicable if the lexicon value type is point or long-lat-point. This value takes precedence over the precision implicit in the coordinate system name.
$query Only include values in fragments selected by the cts:query, and compute frequencies from this set of included values. The values do not need to match the query, but they must occur in fragments selected by the query. The fragments are not filtered to ensure they match the query, but instead selected in the same manner as "unfiltered" cts:search operations. If a string is entered, the string is treated as a cts:word-query of the specified string.
$quality-weight A document quality weight to use when computing scores. The default is 1.0.
$forest-ids A sequence of IDs of forests to which the search will be constrained. An empty sequence means to search all forests in the database. The default is ().
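
The following sketch shows how the optional parameters fit together. It assumes an element range index on a hypothetical "animal" element, a hypothetical collection named "zoo", and a hypothetical forest named "Documents"; none of these names come from this page.

```xquery
xquery version "1.0-ml";

(: Assumed for illustration: an element range index on "animal",
   a collection "zoo", and a forest "Documents". :)

(: Values of "animal" that occur in document fragments in the "zoo" collection. :)
let $in-zoo :=
  cts:element-values(
    xs:QName("animal"),
    (),                           (: $start: begin at the first value :)
    ("document"),                 (: $options :)
    cts:collection-query("zoo")   (: $query :)
  )

(: The same lookup, constrained to a single forest.
   $quality-weight must be supplied to reach $forest-ids. :)
let $one-forest :=
  cts:element-values(
    xs:QName("animal"),
    (),
    ("document"),
    cts:collection-query("zoo"),
    1.0,
    xdmp:forest("Documents")
  )

return ($in-zoo, $one-forest)
```

Because the fragments are selected unfiltered, the returned values are those found in fragments the indexes select for the query, not necessarily only in documents that a filtered search would return.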

Usage Notes

Only one of "frequency-order" or "item-order" may be specified in the options parameter. If neither "frequency-order" nor "item-order" is specified, then the default is "item-order".

Only one of "fragment-frequency" or "item-frequency" may be specified in the options parameter. If neither "fragment-frequency" nor "item-frequency" is specified, then the default is "fragment-frequency".

Only one of "ascending" or "descending" may be specified in the options parameter. If neither "ascending" nor "descending" is specified, then the default is "ascending" if "item-order" is specified, and "descending" if "frequency-order" is specified.
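
As an illustration of the ordering and frequency options, the sketch below lists each value with its count. The "animal" element and its range index are assumptions made for the example.

```xquery
xquery version "1.0-ml";

(: Assumed for illustration: an element range index on "animal". :)
for $value in
  cts:element-values(
    xs:QName("animal"),
    (),
    ("frequency-order", "item-frequency")
  )
(: cts:frequency reads the count attached to each value returned by the lexicon call. :)
return fn:concat($value, ": ", cts:frequency($value))
```

With "frequency-order" and no explicit direction, the default is "descending", so the most frequent values should come first.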

Only one of "eager" or "lazy" may be specified in the options parameter. If neither "eager" nor "lazy" is specified, then the default is "lazy" if "item-order" is specified, and "eager" if "frequency-order" is specified.

Only one of "any", "document", "properties", or "locks" may be specified in the options parameter. If none of "any", "document", "properties", or "locks" are specified and there is a $query parameter, then the default is "document". If there is no $query parameter then the default is "any".
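
A minimal sketch of reading values from properties fragments follows. It assumes a dateTime element range index has been configured on prop:last-modified, which is not something this page states; the "timezone=Z" option is added only to show how the two options combine.

```xquery
xquery version "1.0-ml";

(: Assumed for illustration: a dateTime element range index on prop:last-modified.
   The "prop" prefix is predeclared in the 1.0-ml dialect. :)
cts:element-values(
  xs:QName("prop:last-modified"),
  (),
  ("properties", "type=dateTime", "timezone=Z")
)
```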

Only one of the "score-logtfidf", "score-logtf", "score-simple", "score-random", or "score-zero" options may be specified in the options parameter. If none of "score-logtfidf", "score-logtf", "score-simple", "score-random", or "score-zero" are specified, then the default is "score-logtfidf".

Only one of the "checked" or "unchecked" options may be specified in the options parameter. If neither "checked" nor "unchecked" is specified, then the default is "checked".

If "collation=URI" is not specified in the options parameter, then the default collation is used. If a lexicon with that collation does not exist, an error is thrown.
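
For example, to select a string lexicon by its collation explicitly, the options might look like the sketch below. It assumes a string range index on a hypothetical "animal" element configured with the codepoint collation.

```xquery
xquery version "1.0-ml";

(: Assumed for illustration: a string element range index on "animal"
   that uses the codepoint collation. :)
cts:element-values(
  xs:QName("animal"),
  (),
  ("type=string", "collation=http://marklogic.com/collation/codepoint")
)
```

If no index with the requested collation exists for the element, the call is expected to raise an error rather than fall back to another collation.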

If "sample=N" is not specified in the options parameter, then all included values may be returned. If a $query parameter is not present, then "sample=N" has no effect.

If "truncate=N" is not specified in the options parameter, then values from all fragments selected by the $query parameter are included. If a $query parameter is not present, then "truncate=N" has no effect.

To incrementally fetch a subset of the values returned by this function, use fn:subsequence on the output, rather than the "skip" option. The "skip" option is based on fragments matching the query parameter (if present), not on values. A fragment matched by query might contain multiple values or no values. The number of fragments skipped does not correspond to the number of values. Also, the skip is applied to the relevance ordered query matches, not to the ordered values list.
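
The paging approach described above can be sketched as follows. The element name, page size, and page number are hypothetical values chosen for the example.

```xquery
xquery version "1.0-ml";

(: Assumed for illustration: an element range index on "animal". :)
let $page-size := 10
let $page := 3   (: 1-based page number :)
let $values := cts:element-values(xs:QName("animal"))
(: fn:subsequence counts values, so page 3 covers values 21 through 30. :)
return fn:subsequence($values, ($page - 1) * $page-size + 1, $page-size)
```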

When using the "skip" option, use the "truncate" option rather than the "limit" option to control the number of matching fragments from which to draw values.
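
A minimal sketch of combining "skip" with "truncate" is shown below. The "animal" element, its range index, and the word "nocturnal" are assumptions made for illustration.

```xquery
xquery version "1.0-ml";

(: Assumed for illustration: an element range index on "animal". :)
cts:element-values(
  xs:QName("animal"),
  (),
  (: Skip past the leading query matches, then draw values
     from the next 50 matching fragments only. :)
  ("skip=100", "truncate=50"),
  cts:word-query("nocturnal")
)
```

Both options count fragments rather than values, so the number of values returned can be larger or smaller than 50.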

Example

(:
  an element range index must exist on "animal" or
  this example throws XDMP-ELEMRIDXNOTFOUND
:)
  cts:element-values(xs:QName("animal"),"aardvark")
  => ("aardvark","aardvarks","aardwolf",...)

Comments

  • We had a cts:element-values query that had a cts:element-value-query that searched for a set of words in multiple elements. We had 'score-simple' and truncate=100 specified to bring back the first 100 results. The expectation was that it would bring back the first 100 results from a result set that was ordered by the score-simple ranking. However, it seems like the truncate applies before score-simple ranking rather than after. This means that truncate=100 may have different results in the first 10 places than truncate=10. Is this the expected behavior? Using limit=100 rather than truncate=100 seems to do what we want - that is, apply score-simple ranking first and then get the first 100 results.
    • With truncate, here's what I'd expect:
      1. apply the query
      2. rank fragments by score-simple
      3. drop all but the first 100 fragments from consideration
      4. return all the values from those first 100 fragments
      Note that depending on the content of those fragments, you could get 100 values (if each fragment has one target element), more than 100 (if some fragments have more than one target element), or fewer than 100 (if not all fragments have the element). It sounds like you wanted no more than 100 values, so using limit=100 is the right way to go.
      • Thanks, David. That's what we were expecting, but from some tests I ran it seems to be doing step 3 before step 2. If I run the same query with limits 10 and then 100, if the ranking happens before truncation, I'd expect to see the same 10 values first in both results sets. But this is not the case. I see some of the first 10 results from the first query in other positions in the 100 results. When using limit, I get the same order of results irrespective of what value I set the limit to.
        • Interesting. Do you have a support contract? If so, could you report this, preferably with some sample data? Then we can investigate whether there's a bug related to score-simple.
  • How to get the structured query representation for this element-values query. I don't see this available in the structured query syntax reference. I want to execute this : cts:element-values(xs:QName("instanceName"))
    • cts:element-values is a lexicon query, rather than a search, so structured query is not appropriate here. All the Client APIs (REST, Java, Node.js) provide interfaces for querying lexicons that enable you to do the equivalent of cts:element-values. See /v1/values/{name} (REST), QueryManager.values (Java), or databaseClient.values (Node.js).
      • Thanks Kim. This is what I am trying to do: create an element range index for entityName. The cts query below works fine in qconsole and gives me all unique values for entityName:
          cts:element-values(xs:QName("entityName"))
        Now I am trying to get the same result from Java, which is not working. I am sure I am doing something wrong with newValuesDefinition. Below is the snippet. Can you please point me to the right reference/link?
          QueryManager queryMgr = client.newQueryManager();
          ValuesDefinition valuesDef = queryMgr.newValuesDefinition("entityName");
          ValuesHandle vh = queryMgr.values(valuesDef, new ValuesHandle());
        • In Java, the name passed to newValuesDefinition needs to be the name of a values definition in query options installed on the server. It is not merely your element name. Essentially, you must define the index or lexicon to be queried, via query options. The easiest thing might be to look at the test here: https://github.com/marklogic/java-client-api/blob/3.0-master/src/test/java/com/marklogic/client/test/ValuesHandleTest.java In future, I encourage you to ask questions like this on StackOverflow as you will reach a much, much wider audience.
          • Thanks Kim. Got it working. Attaching code snippet in case anyone needs it. Have one more question: Is there any way we can restrict the results to documents from a particular collection? I realize that we are querying the index values here. So when I created the range index it built the index for entityName for all documents in the database. If my requirement is to only look up documents in a particular collection and get distinct values for an element in documents from that collection alone, what options do I have?
            Code snippet:
              QueryManager queryMgr = client.newQueryManager();
              QueryOptionsManager optionsMgr = client.newServerConfigManager().newQueryOptionsManager();
              String valueOptionQuery = queryBuilderService.generateDistinctValueQueryXML(elementName);
              optionsMgr.writeOptions("valuesoptions", new StringHandle(valueOptionQuery));
              ValuesDefinition vdef = queryMgr.newValuesDefinition(elementName, "valuesoptions");
              ValuesHandle vh = queryMgr.values(vdef, new ValuesHandle());
              for (CountedDistinctValue value : vh.getValues()) {
                distinctValues.add(value.get("xs:string", String.class));
              }
            elementName is: entityName
            valueOptionQuery, generated by the custom code queryBuilderService.generateDistinctValueQueryXML:
              <search:options xmlns:search="http://marklogic.com/appservices/search">
                <search:values name="entityName">
                  <search:range type="xs:string">
                    <search:element name="entityName"/>
                  </search:range>
                </search:values>
              </search:options>
    • cts:element-values() returns a list of values, rather than contributing to a search. In the REST API, you can do the equivalent with /v1/values/[name] (http://docs.marklogic.com/REST/GET/v1/values/[name]). Is that what you're trying to accomplish?
      • Thanks David. Commented on Kim Coleman's answer.
  • (: one more example :)
      declare namespace ns1 = "http://somenamespace.com/ns1";
      declare namespace ns2 = "http://somenamespace.com/ns2";
      cts:element-values(
        xs:QName("ns1:someIndexedElement"), (), (),
        cts:element-query(
          xs:QName("ns1:contextElement"),
          cts:and-query((
            cts:element-query(xs:QName("ns1:onemoreElement"), cts:and-query(())),
            cts:element-value-query(xs:QName("ns2:onemoreElement"), "someValue")
          ))
        )
      )