cts:field-value-co-occurrences( $field-name-1 as xs:string, $field-name-2 as xs:string, [$options as xs:string*], [$query as cts:query?], [$quality-weight as xs:double?], [$forest-ids as xs:unsignedLong*] ) as element(cts:co-occurrence)*
Returns value co-occurrences (that is, pairs of values, both of which appear in the same fragment) from the specified field value lexicon(s). Each pair is returned as a cts:co-occurrence element with two cts:value children, each child containing one of the co-occurring values. You can use cts:frequency on each item returned to find how many times the pair occurs.
Value lexicons are implemented using range indexes; consequently, this function requires a field range index for each field specified in the function. If a range index is not configured for each of the specified fields, an exception is thrown.
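As a rough illustration of reading the result, the following sketch iterates over the returned elements and reports each pair together with its frequency. The field names "aname1" and "aname2" are placeholders, and the sketch assumes a field range index is configured for each of them.

```xquery
xquery version "1.0-ml";

(: Assumption: fields "aname1" and "aname2" exist on the database and
   each has a field range index. Substitute your own field names. :)
for $pair in cts:field-value-co-occurrences("aname1", "aname2")
return
  <pair frequency="{cts:frequency($pair)}">
    <first>{fn:string($pair/cts:value[1])}</first>
    <second>{fn:string($pair/cts:value[2])}</second>
  </pair>
```

The `pair`, `first`, and `second` element names are only for illustration; any output shape works, as long as cts:frequency is called on the items returned by the lexicon function.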
Parameters | |
---|---|
field-name-1 | A string. |
field-name-2 | A string. |
options | Options. The default is (). The notes following this table describe the option groups and their defaults. |
query | Only include co-occurrences in fragments selected by the cts:query, and compute frequencies from this set of included co-occurrences. The co-occurrences do not need to match the query, but they must occur in fragments selected by the query. The fragments are not filtered to ensure they match the query, but instead selected in the same manner as "unfiltered" cts:search operations. If a string is entered, the string is treated as a cts:word-query of the specified string. |
quality-weight | A document quality weight to use when computing scores. The default is 1.0. |
forest-ids | A sequence of IDs of forests to which the search will be constrained. An empty sequence means to search all forests in the database. The default is (). |
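The optional parameters can be combined as in the sketch below, which restricts the co-occurrences to fragments selected by a query and to a single forest. The collection name "people" and the forest name "Documents" are hypothetical, as are the field names.

```xquery
xquery version "1.0-ml";

(: Assumptions: fields "aname1" and "aname2" have field range indexes,
   a collection named "people" exists, and the database has a forest
   named "Documents". All of these names are placeholders. :)
cts:field-value-co-occurrences(
  "aname1",
  "aname2",
  (),                              (: options: use the defaults :)
  cts:collection-query("people"),  (: only fragments selected by this query :)
  1.0,                             (: quality-weight: the documented default :)
  xdmp:forest("Documents")         (: forest-ids: constrain to one forest :)
)
```

Because the fragments are selected unfiltered, the result may include co-occurrences from fragments that a filtered search with the same query would reject.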
Only one of "frequency-order" or "item-order" may be specified in the options parameter. If neither "frequency-order" nor "item-order" is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified in the options parameter. If neither "fragment-frequency" nor "item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified in the options parameter. If neither "ascending" nor "descending" is specified, then the default is "ascending" if "item-order" is specified, and "descending" if "frequency-order" is specified.
Only one of "eager" or "lazy" may be specified in the options parameter. If neither "eager" nor "lazy" is specified, then the default is "eager" if "frequency-order" or "map" is specified, otherwise "lazy".
Only one of "any", "document", "properties", or "locks" may be specified in the options parameter. If none of "any", "document", "properties", or "locks" are specified and there is a $query parameter, then the default is "document". If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", "score-simple", "score-random", or "score-zero" options may be specified in the options parameter. If none of "score-logtfidf", "score-logtf", "score-simple", "score-random", or "score-zero" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified in the options parameter. If neither "checked" nor "unchecked" is specified, then the default is "checked".
If "collation=URI" is not specified in the options parameter, then the default collation is used. If a lexicon with that collation does not exist, an error is thrown.
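To make the option groups above concrete, here is a minimal sketch that picks one option from several groups. It assumes the field names are placeholders and that the field range indexes were created with the codepoint collation; if no lexicon with that collation exists, the call raises an error as described above.

```xquery
xquery version "1.0-ml";

(: Assumption: "aname1" and "aname2" have string field range indexes
   that use the codepoint collation. :)
cts:field-value-co-occurrences(
  "aname1",
  "aname2",
  ("frequency-order",   (: instead of the default "item-order" :)
   "item-frequency",    (: instead of the default "fragment-frequency" :)
   "collation=http://marklogic.com/collation/codepoint")
)
```

With "frequency-order" and no explicit direction, the defaults described above apply: the results are "descending" and evaluated as "eager".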
If "sample=N" is not specified in the options parameter, then all included co-occurrences may be returned. If a $query parameter is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter, then co-occurrences from all fragments selected by the $query parameter are included. If a $query parameter is not present, then "truncate=N" has no effect.
To incrementally fetch a subset of the co-occurrences returned by this function, use fn:subsequence on the output, rather than the "skip" option. The "skip" option is based on fragments matching the query parameter (if present), not on occurrences. A fragment matched by the query might contain multiple occurrences or no occurrences, so the number of fragments skipped does not correspond to the number of values. Also, the skip is applied to the relevance-ordered query matches, not to the ordered co-occurrences list.
When using the "skip" option, use the "truncate" option rather than the "limit" option to control the number of matching fragments from which to draw values.
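The fn:subsequence approach recommended above might look like the following sketch. The page size, page number, and field names are hypothetical values chosen for illustration.

```xquery
xquery version "1.0-ml";

(: Hypothetical paging settings: fetch the third page of ten pairs. :)
let $page-size := 10
let $page := 3
(: fn:subsequence positions are 1-based. :)
let $start := ($page - 1) * $page-size + 1
return
  fn:subsequence(
    cts:field-value-co-occurrences("aname1", "aname2", "item-order"),
    $start,
    $page-size
  )
```

Paging this way counts co-occurrences in their returned order, which is the behavior "skip" does not provide.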
(: Suppose we insert these two documents in the database.

Document 1:
<doc>
  <name1>
    <i11>John</i11><e12>Smith</e12><i13>Griffith</i13>
  </name1>
  <name2>
    <i21>Will</i21><e22>Tim</e22><i23>Shields</i23>
  </name2>
</doc>

Document 2:
<doc>
  <name1>
    <i11>Will<e12>Frank</e12>Shields</i11>
  </name1>
  <name2>
    <i21>John<e22>Tim</e22>Griffith</i21>
  </name2>
</doc>
:)

(: Now suppose we have two fields aname1 and aname2 defined on the database.
   The field aname1 includes element "name1" and excludes "e12".
   The field aname2 includes element "name2" and excludes "e22".
   Both fields have field range indexes configured with positions ON.
:)

cts:field-value-co-occurrences("aname1","aname2")
=>
<cts:co-occurrence xmlns:cts="http://marklogic.com/cts"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <cts:value xsi:type="xs:string">John Griffith</cts:value>
  <cts:value xsi:type="xs:string">Will Shields</cts:value>
</cts:co-occurrence>
<cts:co-occurrence xmlns:cts="http://marklogic.com/cts"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <cts:value xsi:type="xs:string">Will Shields</cts:value>
  <cts:value xsi:type="xs:string">John Griffith</cts:value>
</cts:co-occurrence>
(: Here is another example that finds co-occurrences between a field value
   and an element value using the cts:element-value-co-occurrences() API.
:)

(: Suppose we have the following document in the database. :)
<doc>
  <person>
    <name>
      <first-name>Will</first-name>
      <middle-name>Frank</middle-name>
      <last-name>Shields</last-name>
    </name>
    <address>
      <ZIP>92341</ZIP>
    </address>
    <phoneNumber>650-472-4444</phoneNumber>
  </person>
  <person>
    <name>
      <first-name>John</first-name>
      <middle-name>Tim</middle-name>
      <last-name>Hearst</last-name>
    </name>
    <address>
      <ZIP>96345</ZIP>
    </address>
    <phoneNumber>750-947-5555</phoneNumber>
  </person>
</doc>

(: This database has element range indexes defined on elements ZIP and
   phoneNumber. Positions are set to true on the range indexes.

   There is a field named "aname" defined on this database, which excludes
   element middle-name. A string range index is configured on the field
   "aname". Position is set to true on the database.

   In the following query we use lexicons on field values of "aname" and
   element values of "ZIP" to determine value co-occurrences. Notice,
   however, that the field is treated as if it were an element in the
   MarkLogic predefined namespace "http://marklogic.com/fields".
:)

declare namespace my="http://marklogic.com/fields";
cts:element-value-co-occurrences(xs:QName("ZIP"),xs:QName("my:aname"))
=>
<cts:co-occurrence xmlns:cts="http://marklogic.com/cts"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <cts:value xsi:type="xs:int">68645</cts:value>
  <cts:value xsi:type="xs:string">Jill Tom Lawless</cts:value>
</cts:co-occurrence>
<cts:co-occurrence xmlns:cts="http://marklogic.com/cts"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <cts:value xsi:type="xs:int">68645</cts:value>
  <cts:value xsi:type="xs:string">Nancy Smith Finkman</cts:value>
</cts:co-occurrence>
<cts:co-occurrence xmlns:cts="http://marklogic.com/cts"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <cts:value xsi:type="xs:int">92341</cts:value>
  <cts:value xsi:type="xs:string">John Tim Hearst</cts:value>
</cts:co-occurrence>
<cts:co-occurrence xmlns:cts="http://marklogic.com/cts"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <cts:value xsi:type="xs:int">92341</cts:value>
  <cts:value xsi:type="xs:string">Will Frank Shields</cts:value>
</cts:co-occurrence>
<cts:co-occurrence xmlns:cts="http://marklogic.com/cts"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <cts:value xsi:type="xs:int">93452</cts:value>
  <cts:value xsi:type="xs:string">Jill Tom Lawless</cts:value>
</cts:co-occurrence>
<cts:co-occurrence xmlns:cts="http://marklogic.com/cts"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <cts:value xsi:type="xs:int">93452</cts:value>
  <cts:value xsi:type="xs:string">Nancy Smith Finkman</cts:value>
</cts:co-occurrence>
<cts:co-occurrence xmlns:cts="http://marklogic.com/cts"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <cts:value xsi:type="xs:int">96345</cts:value>
  <cts:value xsi:type="xs:string">John Tim Hearst</cts:value>
</cts:co-occurrence>
<cts:co-occurrence xmlns:cts="http://marklogic.com/cts"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <cts:value xsi:type="xs:int">96345</cts:value>
  <cts:value xsi:type="xs:string">Will Frank Shields</cts:value>
</cts:co-occurrence>