
cts:search

cts:search(
   $expression as node()*,
   $query as cts:query?,
   [$options as (cts:order|xs:string)*],
   [$quality-weight as xs:double?],
   [$forest-ids as xs:unsignedLong*]
) as node()*

Summary

Returns a relevance-ordered sequence of nodes specified by a given query.

Parameters
$expression An expression to be searched. This must be an inline fully searchable path expression.
$query A cts:query specifying the search to perform. If a string is entered, the string is treated as a cts:word-query of the specified string.
$options Options to this search. The default is ().

Options include:

"filtered"

A filtered search (the default). Filtered searches eliminate any false-positive matches and properly resolve cases where there are multiple candidate matches within the same fragment. Filtered search results fully satisfy the specified cts:query.

"unfiltered"

An unfiltered search. An unfiltered search selects fragments from the indexes that are candidates to satisfy the specified cts:query, and then it returns a single node from within each fragment that satisfies the specified searchable path expression. Unfiltered searches are useful because of the performance they afford when jumping deep into the result set (for example, when paginating a long result set and jumping to the 1,000,000th result). However, depending on the searchable path expression, the cts:query specified, the structure of the documents in the database, and the configuration of the database, unfiltered searches may yield false-positive results being included in the search results. Unfiltered searches may also result in missed matches or in incorrect matches, especially when there are multiple candidate matches within a single fragment. To avoid these problems, you should only use unfiltered searches on top-level XPath expressions (for example, document nodes, collections, directories) or on fragment roots. Using unfiltered searches on complex XPath expressions or on XPath expressions that traverse below a fragment root can result in unexpected results.
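
As a minimal sketch of the pagination case described above, the following query jumps to a deep page of an unfiltered search over document nodes. The search term and the page boundaries are illustrative assumptions, not values taken from a real application.

```xquery
xquery version "1.0-ml";

(: Searching fn:doc() keeps the path at the document level, which is
   where the text above says unfiltered results are most reliable.
   The term "hello" and the range are hypothetical. :)
cts:search(fn:doc(), cts:word-query("hello"), "unfiltered")
  [1000001 to 1000010]
```

If the same query must return exact matches only, dropping the "unfiltered" option falls back to the default filtered behavior, at the cost of slower deep pagination.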

"score-logtfidf"

Compute scores using the logtfidf method (the default scoring method). This uses the formula:

  log(term frequency) * (inverse document frequency)

"score-logtf"

Compute scores using the logtf method. This does not take into account how many documents have the term and uses the formula:

  log(term frequency)

"score-simple"

Compute scores using the simple method. The score-simple method gives a score of 8*weight for each matching term in the cts:query expression, and then scales the score up by multiplying by 256. It does not matter how many times a given term matches (that is, the term frequency does not matter); each match contributes 8*weight to the score. For example, the following query (assume the default weight of 1) would give a score of 8*256=2048 for any fragment with one or more matches for "hello", a score of 16*256=4096 for any fragment that also has one or more matches for "goodbye", or a score of zero for fragments that have no matches for either term:

  cts:or-query(("hello", "goodbye"))
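
To see the score-simple arithmetic in practice, a sketch like the following could be used to list each match with its score. It assumes a database that contains documents with the words "hello" and "goodbye". The quality weight of 0.0 is a choice made here so that document quality does not contribute to the score.

```xquery
xquery version "1.0-ml";

for $hit in cts:search(fn:doc(),
              cts:or-query((cts:word-query("hello"),
                            cts:word-query("goodbye"))),
              "score-simple",
              0.0)  (: quality weight of zero: score depends on term matches only :)
return
  (: Per the description above, expect 2048 when one term matches
     and 4096 when both match, assuming the default weight of 1. :)
  <hit uri="{xdmp:node-uri($hit)}" score="{cts:score($hit)}"/>
```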

"score-random"

Compute scores using the random method. The score-random method gives a random value to the score. You can use this to randomly choose fragments matching a query.

"score-zero"

Compute all scores as zero. When combined with a quality weight of zero, this is the fastest consistent scoring method.

"checked"

Word positions are checked (the default) when resolving the query. Checked searches eliminate false-positive matches for phrases during the index resolution phase of search processing.

"unchecked"

Word positions are not checked when resolving the query. Unchecked searches do not take into account word positions and can lead to false-positive matches during the index resolution phase of search processing. This setting is useful for debugging, but not recommended for normal use.

"too-many-positions-error"

If too much memory is needed to perform positions calculations to check whether a document matches a query, return an XDMP-TOOMANYPOSITIONS error instead of accepting the document as a match.

"faceted"

Do a little more work to save faceting information about fragments matching this search so that calculating facets will be faster.

"unfaceted"

Do not save faceting information about fragments matching this search.

"relevance-trace"

Collect relevance score computation details with which you can generate a trace report using cts:relevance-info. Collecting this information is costly and will significantly slow down your search, so you should only use it when using cts:relevance-info to tune a query.
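
A minimal sketch of how the "relevance-trace" option and cts:relevance-info fit together is shown here. The search term and the limit of three results are assumptions chosen to keep the costly trace small.

```xquery
xquery version "1.0-ml";

(: Collect trace details only for the first few hits, because the
   option slows the search down significantly. :)
for $hit in cts:search(fn:doc(), cts:word-query("hello"),
              "relevance-trace")[1 to 3]
return cts:relevance-info($hit)
```

Each returned element describes how the score of the corresponding node was computed. Remove the option once tuning is done.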

"format-FORMAT"

Limit the search to documents in the document format specified by FORMAT (binary, json, text, or xml).

cts:order Specification

A sequence of cts:order specifications. The specifications are applied in the order in which they appear in the sequence. The sequence typically consists of one or more of: cts:index-order, cts:score-order, cts:confidence-order, cts:fitness-order, cts:quality-order, cts:document-order, cts:unordered. When using cts:index-order, there must be a range index defined on the index(es) specified by the cts:reference specification (for example, cts:element-reference).
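
The following sketch shows one way to combine several cts:order specifications. It assumes an element range index on a hypothetical Title element. The second specification is applied only to results that tie on the first.

```xquery
xquery version "1.0-ml";

cts:search(fn:doc(), cts:word-query("hello"),
  (
    (: Primary order: Title values from the range index, descending. :)
    cts:index-order(cts:element-reference(xs:QName("Title")), "descending"),
    (: Secondary order: relevance score, highest first. :)
    cts:score-order("descending")
  ))[1 to 10]
```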

$quality-weight A document quality weight to use when computing scores. The default is 1.0.
$forest-ids A sequence of IDs of forests to which the search will be constrained. An empty sequence means to search all forests in the database. The default is (). In the XQuery version, you can use cts:search with this parameter and an empty cts:and-query to specify a forest-specific XPath statement (see the third example below). If you use this to constrain an XPath to one or more forests, you should set the quality-weight to zero to keep the XPath document order.

Usage Notes

Queries that use cts:search require that the XPath expression searched is fully searchable. A fully searchable path is one that has no steps that are unsearchable and whose last step is searchable. You can use the xdmp:query-trace() function to see if the path is fully searchable. If there are no entries in the xdmp:query-trace() output indicating that a step is unsearchable, and if the last step is searchable, then that path is fully searchable. Queries that use cts:search on unsearchable XPath expressions will fail with an error message. You can often make the path expressions fully searchable by rewriting the query or adding new indexes.
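
As a sketch of the check described above, query tracing can be switched on around a search and then off again. The path and term are taken from the first example on this page. The trace messages are written to the server log rather than returned with the results.

```xquery
xquery version "1.0-ml";

(
  (: Turn tracing on for this request. :)
  xdmp:query-trace(fn:true()),
  cts:search(//SPEECH, cts:word-query("with flowers")),
  (: Turn tracing off again so later expressions are not traced. :)
  xdmp:query-trace(fn:false())
)
```

If the logged output contains no entry that reports a step as unsearchable, and the last step is searchable, the path is fully searchable.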

Each node that cts:search returns has an associated score. To access the score, use the cts:score function. The nodes are returned in relevance order (most relevant to least relevant), where more relevant nodes have a higher score.

Only one of the "filtered" or "unfiltered" options may be specified in the options parameter. If neither "filtered" nor "unfiltered" is specified, then the default is "filtered".

Only one of the "score-logtfidf", "score-logtf", "score-simple", "score-random", or "score-zero" options may be specified in the options parameter. If none of "score-logtfidf", "score-logtf", "score-simple", "score-random", or "score-zero" are specified, then the default is "score-logtfidf".

Only one of the "checked" or "unchecked" options may be specified in the options parameter. If neither "checked" nor "unchecked" is specified, then the default is "checked".

Only one of the "faceted" or "unfaceted" options may be specified in the options parameter. If neither "faceted" nor "unfaceted" is specified, then the default is "unfaceted".

If the cts:query specified is the empty string (equivalent to cts:word-query("")), then the search returns the empty sequence.

Example

  cts:search(//SPEECH,
    cts:word-query("with flowers"))

  => ... a sequence of 'SPEECH' element ancestors (or self)
     of any node containing the phrase 'with flowers'.

Example

  cts:search(collection("self-help")/book,
    cts:element-query(xs:QName("title"), "meditation"),
    "score-simple", 1.0, (xdmp:forest("prod"),xdmp:forest("preview")))

  => ... a sequence of book elements matching the XPath
     expression which are members of the "self-help"
     collection, reside in the "prod" or "preview" forests and
     contain "meditation" in the title element, using the
     "score-simple" option.

Example

  cts:search(/some/xpath, cts:and-query(()), (), 0.0,
    xdmp:forest("myForest"))

  => ... a sequence of /some/xpath elements that are
     in the forest named "myForest".  Note the
     empty and-query, which matches all documents (and
     scores them all the same) and the quality-weight
     of 0, which together make each result have a score
     of 0, which keeps the results in document order.

Example

  cts:search(fn:doc(), "hello",
    ("unfiltered",
     cts:index-order(cts:element-reference(xs:QName("Title")))
    ))[1 to 10]

  => Returns the first 10 documents containing the word "hello",
     unfiltered, ordered by the range index on the element "Title".
     An element range index on Title is required for this search;
     otherwise, it throws an exception.

Comments

  • I want to search documents by collection and word query, then get some values such as the UUID from those documents. With that UUID I want to get other documents related to the same transaction, with URIs /client/document/[UUID] and /provider(document/[UUID].

    Here is my JSON document (there is more information, but I'm showing just a fragment), with URI /company/transaction/5ca21fb8-f6e7-4b17-97d1-287ba8a95d4e. It belongs to the TRANSACTION collection:

      { "UUID":"5ca21fb8-f6e7-4b17-97d1-287ba8a95d4e" "Status": { "Status": "ACCEPTED", "Processed": "0000-00-00T00:00:00-0:00" } }

    Here is my XQuery:

      xquery version "1.0-ml";

      declare function local:findTransactions() {
        for $i in cts:search(fn:doc()/Status,
            cts:and-query((cts:collection-query("TRANSACTION"),
                           cts:word-query("ACCEPTED"),
                           cts:word-query("0000-00-00T00:00:00-0:00"))))[1 to 100]
        let $nodeUpdate := xdmp:node-replace(fn:doc(base-uri($i))/Status/Processed,
                             text {local:DateTimeToISO ()})
        return fn:doc(base-uri($i))/UUID
      };

      declare function local:findDocumentXML($n as xs:string) {
        fn:doc(fn:concat ( "/client/document/", $n ) ),
        fn:doc(fn:concat ( "/provider(document", $n ) )
      };

      declare function local:DateTimeToISO () {
        fn:concat(
          fn:concat(
            fn:format-dateTime(fn:current-dateTime(), "[Y0001]-[M01]-[D01]"),
            "T" ),
          fn:concat(
            fn:format-dateTime(fn:current-dateTime(), "[H01]:[m01]:[s01]"),
            "Z") )
      };

      local:findDocumentXML( local:findTransactions( ) )

    But this is not fast. How could I improve this query?
    • Hi Frank. Your request here is making a lot of updates in a single transaction, which means getting a lot of write locks. If you're doing this against a large number of documents, I can see why it would get slow. I could suggest some ways to tinker with the query and with local:DateTimeToISO (if you just go with fn:current-dateTime(), your format will be standards compliant and have the time zone offset set appropriately, based on the server where it's running), but the write locks are probably what needs to be dealt with. For that, you could use xdmp:invoke() with the "separate-transaction" option, but I'd just use CORB2 (https://github.com/marklogic/corb2). BTW, not many people get notifications of comments; consider posting questions like this on Stack Overflow with the "marklogic" tag.
  • Evan Lenz provides a great overview of the cts API here: http://developer.marklogic.com/blog/grokking-the-cts-api