MarkLogic 10 Product Documentation
cts.search
cts.search(
query as cts.query?,
[options as String[]],
[quality-weight as Number?],
[forest-ids as (Number|String)[]]
) as Sequence
Summary
Returns a relevance-ordered sequence of nodes specified by a given query.
Parameters |
query |
A cts.query specifying the search to perform. If a string is entered, the
string is treated as a cts.wordQuery of the specified string.
|
options |
Options to this search. The default is ().
Options include:
"filtered"
A filtered search (the default). Filtered searches
eliminate any false-positive matches and properly resolve cases where
there are multiple candidate matches within the same fragment.
Filtered search results fully satisfy the specified cts.query.
"unfiltered"
An unfiltered search. An unfiltered search selects fragments from the indexes
that are candidates to satisfy the specified cts.query and returns the fragments or
associated documents if there is fragmentation. Unfiltered searches are useful
because of the performance they afford when jumping deep into the result set
(for example, when paginating a long result set and jumping to the 1,000,000th
result). However, depending on the cts.query specified, the structure of the
documents in the database, and the configuration of the database, unfiltered
searches may include false-positive results.
Unfiltered searches may also result in missed matches or in incorrect matches,
especially when there may be negated false positives.
"score-logtfidf"
Compute scores using the logtfidf method (the default scoring
method). This uses the formula:
log(term frequency) * (inverse document frequency)
"score-logtf"
Compute scores using the logtf method. This does not take into
account how many documents have the term and uses the formula:
log(term frequency)
"score-simple"
Compute scores using the simple method. The score-simple
method gives a score of 8*weight for each matching term in the
cts.query expression, and then scales the score up by
multiplying by 256. It does not matter how
many times a given term matches (that is, the term
frequency does not matter); each match contributes 8*weight
to the score. For example, the following query (assume the
default weight of 1) would give a score of 8*256=2048 for
any fragment with one or more matches for "hello", a score of
16*256=4096
for any fragment that also has one or more matches for "goodbye",
or a score of zero for fragments that have no matches for
either term:
cts.orQuery(["hello", "goodbye"])
"score-random"
Compute scores using the random method. The score-random
method gives a random value to the score. You can use this
to randomly choose fragments matching a query.
"score-zero"
Compute all scores as zero.
When combined with a quality weight of zero,
this is the fastest consistent scoring method.
- "checked"
Word positions are checked (the default) when resolving
the query. Checked searches eliminate false-positive matches for
phrases during the index resolution phase of search processing.
- "unchecked"
Word positions are not checked when resolving the
query. Unchecked searches do not take into account word positions
and can lead to false-positive matches during the index resolution
phase of search processing. This setting is useful
for debugging, but not recommended for normal use.
- "too-many-positions-error"
If too much memory is needed to perform positions calculations
to check whether a document matches a query,
return an XDMP-TOOMANYPOSITIONS error
instead of accepting the document as a match.
- "faceted"
Do a little more work to save faceting information about
fragments matching this search so that calculating facets
will be faster.
- "unfaceted"
Do not save faceting information about fragments matching
this search.
- "relevance-trace"
Collect relevance score computation details with which you
can generate a trace report using
cts.relevanceInfo .
Collecting this information is costly and will significantly
slow down your search, so you should only use it when using
cts.relevanceInfo to tune a query.
- "format-FORMAT"
Limit the search to documents in the document format specified
by FORMAT (binary, json, text, or xml).
- "any"
Search from any fragment.
- "document"
Search from document fragments.
- "properties"
Search only from properties fragments.
- "locks"
Search only from locks fragments.
- cts.order specification
A sequence of cts.order specifications. The specifications are applied
in the order in which they appear in the sequence. Default:
[cts.scoreOrder("descending"), cts.documentOrder("ascending")].
The sequence typically consists of one or more of:
cts.indexOrder,
cts.scoreOrder,
cts.confidenceOrder,
cts.fitnessOrder,
cts.qualityOrder,
cts.documentOrder,
cts.unordered. When using
cts.indexOrder, there must be a range index defined
on the index(es) specified by the cts.reference
specification (for example,
cts.elementReference).
|
quality-weight |
A document quality weight to use when computing scores.
The default is 1.0.
|
forest-ids |
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is (). In the XQuery version, you can use
cts:search with this
parameter and an empty cts:and-query to specify a
forest-specific XPath statement. If you
use this to constrain an XPath to one or more forests, you should set
the quality-weight to zero to keep the XPath document
order.
|
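The "score-simple" arithmetic described in the options list can be illustrated with a few lines of plain JavaScript. This is only a sketch of the stated formula, not MarkLogic's internal implementation: the function name scoreSimple and its input shape (one weight per query term that matched at least once) are assumptions made for illustration.

```javascript
// Sketch of the "score-simple" formula from the options description.
// termWeights: one weight per query term that matched at least once
// (hypothetical input shape; MarkLogic computes this internally).
function scoreSimple(termWeights) {
  const PER_TERM = 8;   // each matching term contributes 8 * weight
  const SCALE = 256;    // the sum is then multiplied by 256
  let sum = 0;
  for (const weight of termWeights) {
    sum += PER_TERM * weight;
  }
  return sum * SCALE;
}

// Only "hello" matches (default weight 1).
console.log(scoreSimple([1]));    // 2048
// Both "hello" and "goodbye" match.
console.log(scoreSimple([1, 1])); // 4096
// Neither term matches.
console.log(scoreSimple([]));     // 0
```

Because term frequency is ignored, a fragment with fifty occurrences of "hello" gets the same contribution as a fragment with one.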
Usage Notes
Each node that cts.search returns has an associated score. To access the
score, use the cts.score function. The nodes are returned in relevance order
(most relevant to least relevant), where more relevant nodes have a higher score.
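As a rough illustration of reading scores, the sketch below pairs each result node with the value returned by cts.score. The helper name resultsWithScores is hypothetical, and the cts object is passed in as an argument only so that the function can be exercised with a stub outside the server; inside MarkLogic you would pass the built-in global cts.

```javascript
// Sketch: collect each search result together with its relevance score.
// `cts` is injected (an assumption of this example) so the function can be
// tried with a stub; in MarkLogic, pass the built-in global `cts` object.
function resultsWithScores(cts, query, options = []) {
  const rows = [];
  for (const node of cts.search(query, options)) {
    rows.push({ node: node, score: cts.score(node) });
  }
  return rows;
}

// Inside MarkLogic (not runnable elsewhere):
// resultsWithScores(cts, cts.wordQuery("with flowers"));
```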
Only one of the "filtered" or "unfiltered" options may be specified
in the options parameter. If neither "filtered" nor "unfiltered" is
specified, then the default is "filtered".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
"score-random", or "score-zero" options may be specified in the
options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", "score-random",
or "score-zero" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter. If neither "checked" nor "unchecked" is
specified, then the default is "checked".
Only one of the "faceted" or "unfaceted" options may be specified
in the options parameter. If neither "faceted" nor "unfaceted" is
specified, then the default is "unfaceted".
If the cts.query specified is the empty string (equivalent to
cts.wordQuery("")), then the search returns an empty Sequence.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a query parameter, then the default is "document".
If there is no query parameter, then the default is "any".
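The mutual-exclusion rules and defaults above can be summarized in a small plain-JavaScript check. This is a sketch for reasoning about an options array before passing it to cts.search; the helper resolveSearchOptions is hypothetical, and throwing an Error on conflicting options is an assumption of this example, not a statement about how the server reports the problem.

```javascript
// Option groups from the usage notes: at most one member per group,
// with the documented default when none is given.
const OPTION_GROUPS = [
  { members: ["filtered", "unfiltered"], fallback: "filtered" },
  { members: ["score-logtfidf", "score-logtf", "score-simple",
              "score-random", "score-zero"], fallback: "score-logtfidf" },
  { members: ["checked", "unchecked"], fallback: "checked" },
  { members: ["faceted", "unfaceted"], fallback: "unfaceted" }
];
const FRAGMENT_SCOPES = ["any", "document", "properties", "locks"];

// Hypothetical helper: returns the effective choice for each group.
// Non-string entries (for example cts.order specifications) are ignored.
function resolveSearchOptions(options, hasQuery) {
  const strings = options.filter((o) => typeof o === "string");
  const groups = OPTION_GROUPS.concat([
    { members: FRAGMENT_SCOPES, fallback: hasQuery ? "document" : "any" }
  ]);
  const effective = [];
  for (const group of groups) {
    const chosen = [...new Set(strings.filter((s) => group.members.includes(s)))];
    if (chosen.length > 1) {
      throw new Error("Conflicting options: " + chosen.join(", "));
    }
    effective.push(chosen.length === 1 ? chosen[0] : group.fallback);
  }
  return effective;
}

console.log(resolveSearchOptions(["unfiltered", "score-zero"], true));
```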
When a cts.indexOrder specification is used, results with no comparable
index value are always returned at the end of the ordered result sequence.
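The placement of results without an index value can be pictured with an ordinary array sort. The sketch below is a plain-JavaScript model of the rule stated above, not the server's implementation; the function orderByIndexValue and the { uri, value } record shape are assumptions made for illustration.

```javascript
// Model of index ordering: records with a value are sorted in the requested
// direction; records with no comparable value always go last.
// (Hypothetical record shape: { uri, value }.)
function orderByIndexValue(results, direction) {
  const sign = direction === "descending" ? -1 : 1;
  const hasValue = (r) => r.value !== undefined && r.value !== null;
  return results.slice().sort((a, b) => {
    if (!hasValue(a) && !hasValue(b)) return 0;
    if (!hasValue(a)) return 1;    // a has no value: after b
    if (!hasValue(b)) return -1;   // b has no value: after a
    if (a.value < b.value) return -sign;
    if (a.value > b.value) return sign;
    return 0;
  });
}

const sample = [
  { uri: "/a.xml", value: "Zebra" },
  { uri: "/b.xml" },               // no Title value
  { uri: "/c.xml", value: "Apple" }
];
console.log(orderByIndexValue(sample, "ascending").map((r) => r.uri));
console.log(orderByIndexValue(sample, "descending").map((r) => r.uri));
```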
Example
cts.search(cts.wordQuery("with flowers"));
=> ... a Sequence of the nodes containing the phrase 'with flowers'.
Example
fn.subsequence(cts.search(cts.wordQuery("with flowers")), 1, 10);
=> ... a Sequence of the first 10 nodes containing the phrase 'with flowers'.
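The fn.subsequence call above takes a 1-based start position and a length, so paging through results is a matter of computing that window. The helper below is a minimal sketch of the arithmetic; the name pageWindow is hypothetical.

```javascript
// Compute the (start, length) pair to pass to fn.subsequence for a given
// 1-based page number. Hypothetical helper, shown only for illustration.
function pageWindow(page, pageSize) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be a positive integer");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be a positive integer");
  }
  return { start: (page - 1) * pageSize + 1, length: pageSize };
}

console.log(pageWindow(1, 10)); // { start: 1, length: 10 }
console.log(pageWindow(3, 10)); // { start: 21, length: 10 }

// Inside MarkLogic (not runnable elsewhere):
// const w = pageWindow(3, 10);
// fn.subsequence(cts.search(cts.wordQuery("with flowers")), w.start, w.length);
```

As the description of the "unfiltered" option notes, jumping deep into a long result set is where unfiltered searches tend to pay off.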
Example
fn.subsequence(
  cts.search("hello", ["unfiltered",
    cts.indexOrder(cts.elementReference(fn.QName("", "Title")))
  ]), 1, 10);
=> Returns the first 10 documents with the word "hello", unfiltered,
ordered by the range index on the element "Title". An element
range index on Title is required for this search; otherwise, it
throws an exception.