cts:search
cts:search(
$expression as node()*,
$query as cts:query?,
[$options as (cts:order|xs:string)*],
[$quality-weight as xs:double?],
[$forest-ids as xs:unsignedLong*]
) as node()*
Summary
Returns a relevance-ordered sequence of nodes specified by a given query.
Parameters |
expression |
An expression to be searched.
This must be an inline fully searchable path expression.
|
query |
A cts:query
specifying the search to perform. If a string is entered, the string is
treated as a cts:word-query
of the specified string.
|
options |
Options to this search. The default is ().
Options include:
"filtered"
A filtered search (the default). Filtered searches
eliminate any false-positive matches and properly resolve cases where
there are multiple candidate matches within the same fragment.
Filtered search results fully satisfy the specified cts:query.
"unfiltered"
An unfiltered search. An unfiltered search
selects fragments from the indexes that are candidates to satisfy
the specified cts:query, and then it returns
a single node from within each fragment that satisfies the specified
searchable path expression. Unfiltered searches are useful because
of the performance they afford when jumping deep into the
result set (for example, when paginating a long result set and
jumping to the 1,000,000th result). However, depending on the
searchable path expression, the
cts:query specified, the structure of the documents in
the database, and the configuration of the database, unfiltered
searches may include false positives in the search results.
Unfiltered searches may also miss matches or return incorrect
matches, especially when there are multiple candidate matches
within a single fragment.
To avoid these problems, you should only use unfiltered searches
on top-level XPath expressions (for example, document nodes,
collections, directories) or on fragment roots. Using unfiltered
searches on complex XPath expressions or on XPath expressions that
traverse below a fragment root can produce unexpected results.
"score-logtfidf"
Compute scores using the logtfidf method (the default scoring
method). This uses the formula:
log(term frequency) * (inverse document frequency)
"score-logtf"
Compute scores using the logtf method. This does not take into
account how many documents have the term and uses the formula:
log(term frequency)
"score-simple"
Compute scores using the simple method. The score-simple
method gives a score of 8*weight for each matching term in the
cts:query expression, and then scales the score up by
multiplying by 256. It does not matter how
many times a given term matches (that is, the term
frequency does not matter); each match contributes 8*weight
to the score. For example, the following query (assume the
default weight of 1) would give a score of 8*256=2048 for
any fragment with one or more matches for "hello", a score of
16*256=4096
for any fragment that also has one or more matches for "goodbye",
or a score of zero for fragments that have no matches for
either term:
cts:or-query(("hello", "goodbye"))
"score-random"
Compute scores using the random method. The score-random
method gives a random value to the score. You can use this
to randomly choose fragments matching a query.
"score-zero"
Compute all scores as zero.
When combined with a quality weight of zero,
this is the fastest consistent scoring method.
- "checked"
Word positions are checked (the default) when resolving
the query. Checked searches eliminate false-positive matches for
phrases during the index resolution phase of search processing.
- "unchecked"
Word positions are not checked when resolving the
query. Unchecked searches do not take into account word positions
and can lead to false-positive matches during the index resolution
phase of search processing. This setting is useful
for debugging, but not recommended for normal use.
- "too-many-positions-error"
If too much memory is needed to perform positions calculations
to check whether a document matches a query,
return an XDMP-TOOMANYPOSITIONS error,
instead of accepting the document as a match.
- "faceted"
Do a little more work to save faceting information about
fragments matching this search so that calculating facets
will be faster.
- "unfaceted"
Do not save faceting information about fragments matching
this search.
- "relevance-trace"
Collect relevance score computation details with which you
can generate a trace report using cts:relevance-info.
Collecting this information is costly and will significantly
slow down your search, so you should only use it when using
cts:relevance-info to tune a query.
- "format-FORMAT"
Limit the search to documents in document format specified
by FORMAT (binary, json, text, or xml).
- cts:order Specification
A sequence of cts:order specifications. The orderings are
applied in the order in which they appear in the sequence. The sequence
typically consists of one or more of:
cts:index-order, cts:score-order, cts:confidence-order,
cts:fitness-order, cts:quality-order, cts:document-order,
cts:unordered. When using cts:index-order, there must be a range index
defined for the index(es) specified by the cts:reference
specification (for example, cts:element-reference).
|
quality-weight |
A document quality weight to use when computing scores.
The default is 1.0.
|
forest-ids |
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is (). In the XQuery version, you can use cts:search with this
parameter and an empty cts:and-query to specify a forest-specific XPath
statement (see the third example below). If you use this to constrain an
XPath expression to one or more forests, you should set the quality-weight
to zero to preserve the XPath document order.
|
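The "unfiltered" option described above is mainly useful for deep pagination. The following is a minimal sketch of that pattern, not a definitive implementation: the page size, page number, and search term are hypothetical values, and the search runs against fn:doc() because the text recommends top-level expressions for unfiltered searches.

```xquery
xquery version "1.0-ml";

(: Hypothetical paging values; adjust for your application. :)
let $page-size := 10
let $page := 3
let $start := ($page - 1) * $page-size + 1
let $end := $start + $page-size - 1
return
  (: fn:doc() is a top-level expression, so the unfiltered search
     returns one document node per candidate fragment. :)
  cts:search(fn:doc(), cts:word-query("hello"), "unfiltered")[$start to $end]
```

Because the search is unfiltered, the page may contain false positives, as noted in the option description.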
Usage Notes
Queries that use cts:search require that the XPath expression
searched is fully searchable. A fully searchable path is one that
has no unsearchable steps and whose last step is searchable.
You can use the xdmp:query-trace() function to see if the path is fully
searchable. If there are no entries in the xdmp:query-trace()
output indicating that a step is unsearchable, and if the last step
is searchable, then the path is fully searchable. Queries that use
cts:search on unsearchable XPath expressions fail with an error message.
You can often make the path expressions fully searchable by rewriting
the query or adding new indexes.
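As a rough sketch of the check described above, query tracing can be switched on around a search so that the server logs how each path step was resolved. The //SPEECH path and search phrase are borrowed from the first example on this page and stand in for your own expression; the exact wording of the trace messages is not shown here.

```xquery
xquery version "1.0-ml";

(: Enable query tracing for this request; trace messages are
   written to the server log, not returned with the results. :)
xdmp:query-trace(fn:true()),

(: The path under test. Look in the log for any step reported
   as unsearchable. :)
cts:search(//SPEECH, cts:word-query("with flowers")),

(: Turn tracing back off. :)
xdmp:query-trace(fn:false())
```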
Each node that cts:search returns has an associated score. To access
the score, use the cts:score function. Unless a cts:order option
specifies otherwise, the nodes are returned in relevance order (most
relevant to least relevant), where more relevant nodes have a higher score.
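To illustrate reading the score of each result, the sketch below pairs every returned document's URI with its score. The search term and the ten-result limit are arbitrary choices for illustration.

```xquery
xquery version "1.0-ml";

(: List the URI and score of the first ten results. :)
for $hit in cts:search(fn:doc(), cts:word-query("hello"))[1 to 10]
return fn:concat(xdmp:node-uri($hit), ": ", cts:score($hit))
```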
Only one of the "filtered" or "unfiltered" options may be specified
in the options parameter. If neither "filtered" nor "unfiltered" is
specified, then the default is "filtered".
Only one of the "score-logtfidf", "score-logtf", "score-simple",
"score-random", or "score-zero" options may be specified in the
options parameter.
If none of "score-logtfidf", "score-logtf", "score-simple", "score-random",
or "score-zero" are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter. If neither "checked" nor "unchecked" is
specified, then the default is "checked".
Only one of the "faceted" or "unfaceted" options may be specified
in the options parameter. If neither "faceted" nor "unfaceted" is
specified, then the default is "unfaceted".
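The rules above mean an options sequence may carry at most one value from each group. A minimal sketch, with an arbitrary search term, that picks one option from each of the four groups:

```xquery
xquery version "1.0-ml";

(: One option per group: filtering, scoring, position checking,
   and faceting. Any group left out falls back to its default. :)
cts:search(
  fn:doc(),
  cts:word-query("hello"),
  ("unfiltered", "score-simple", "checked", "unfaceted")
)
```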
If the cts:query specified is the empty string (equivalent to
cts:word-query("")), then the search returns the empty sequence.
Example
cts:search(//SPEECH,
cts:word-query("with flowers"))
=> ... a sequence of 'SPEECH' element ancestors (or self)
of any node containing the phrase 'with flowers'.
Example
cts:search(collection("self-help")/book,
cts:element-query(xs:QName("title"), "meditation"),
"score-simple", 1.0, (xdmp:forest("prod"),xdmp:forest("preview")))
=> ... a sequence of book elements matching the XPath
expression which are members of the "self-help"
collection, reside in the "prod" or "preview" forests and
contain "meditation" in the title element, using the
"score-simple" option.
Example
cts:search(/some/xpath, cts:and-query(()), (), 0.0,
xdmp:forest("myForest"))
=> ... a sequence of /some/xpath elements that are
in the forest named "myForest". Note the
empty and-query, which matches all documents (and
scores them all the same) and the quality-weight
of 0, which together make each result have a score
of 0, which keeps the results in document order.
Example
cts:search(fn:doc(), "hello",
("unfiltered",
cts:index-order(cts:element-reference(xs:QName("Title")))
) )[1 to 10]
=> Returns the first 10 documents with the word "hello", unfiltered,
ordered by the range index on the element "Title". An element
range index on Title is required for this search, otherwise it
throws an exception.
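The "relevance-trace" option listed under the options parameter is meant to be used together with cts:relevance-info. The following sketch shows one way the two might be combined; the search term and the three-result limit are arbitrary, and the limit is kept small because the option description warns that collecting trace data is costly.

```xquery
xquery version "1.0-ml";

(: Collect score computation details during the search, then
   generate a trace report for each of the first three results. :)
for $hit in cts:search(fn:doc(), cts:word-query("hello"),
                       "relevance-trace")[1 to 3]
return cts:relevance-info($hit)
```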