MarkLogic 12 Product Documentation
cts.similarQuery

cts.similarQuery(
   nodes as Node[],
   [weight as Number?],
   [options as Node?]
) as cts.similarQuery

Summary

Returns a query matching nodes similar to the model nodes. The algorithm finds the most "relevant" terms in the model nodes (that is, the terms with the highest scores) and then creates a query equivalent to a cts.orQuery of those terms. By default, 16 terms are used.
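
The term-selection idea described above can be sketched in plain JavaScript. This is only a toy illustration, not MarkLogic's scoring or query construction: the `termScores` input, the `buildSimilarQuerySketch` name, and the `{ or: [...] }` result shape are all hypothetical.

```javascript
// Illustrative only: a toy stand-in for term selection, not MarkLogic's scoring.
// `termScores` (term -> relevance score) is a hypothetical input.
function buildSimilarQuerySketch(termScores, maxTerms = 16) {
  const topTerms = Object.entries(termScores)
    .sort(([, a], [, b]) => b - a) // highest score first
    .slice(0, maxTerms)            // keep at most maxTerms terms
    .map(([term]) => term);
  // Represent "an OR over the selected terms" as a plain object.
  return { or: topTerms };
}

const sketch = buildSimilarQuerySketch(
  { marklogic: 9.1, query: 4.2, the: 0.1, similar: 7.5 },
  2
);
// sketch.or holds the two highest-scoring terms: ["marklogic", "similar"]
```

Low-scoring terms such as "the" drop out first, which is why a small term limit still tends to capture what is distinctive about the model nodes.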

Parameters

nodes Some model nodes.

weight A weight for this query. Higher weights move search results up in the relevance order. The default is 1.0. The weight should be between -16 and 64. Weights greater than 64 have the same effect as a weight of 64. Weights whose absolute value is less than 0.0625 (that is, weights between -0.0625 and 0.0625) are rounded to 0, which means that they do not contribute to the score.
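
The two rounding rules stated for weight can be written out as a small plain-JavaScript sketch. The `effectiveWeight` helper is hypothetical and only mirrors the rules in the text; it is not a MarkLogic function, and it makes no claim about weights below -16, which the text does not describe.

```javascript
// Sketch of the documented weight rules; `effectiveWeight` is a hypothetical helper.
function effectiveWeight(weight = 1.0) {
  if (Math.abs(weight) < 0.0625) return 0; // too small to contribute to the score
  if (weight > 64) return 64;              // anything above 64 behaves like 64
  return weight; // values below -16 are not described in the text; left unchanged here
}

const examples = [1.0, 100, 0.05, -0.05, -8].map((w) => effectiveWeight(w));
// examples is [1, 64, 0, 0, -8]
```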

options

An object containing the options for defining which terms to generate and how to evaluate them. The following is a sample options object:



    {
      maxTerms: 20
    }

See the cts.distinctiveTerms options for the valid options to use with this function.

Note that enabling index settings that are disabled in the database configuration will not affect the results, as similar documents will not be found on the basis of terms that do not exist in the actual database index.

Usage Notes

As the number of fragments in a database grows, the results of cts.similarQuery become increasingly accurate. For best results, there should be at least 10,000 fragments for 32-bit systems, and 1,000 fragments for 64-bit systems.

Example

  // Although the signature takes an array of nodes, MarkLogic
  // automatically converts a Sequence of nodes (such as the result
  // of fn.doc) to an array when required.
  cts.search(
    cts.similarQuery(fn.doc("nodes-like-this.xml"),
                     2, { maxTerms: 20 })
  );
=> a Sequence of the documents that contain any node similar
   to 'nodes-like-this.xml'.
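
cts.search returns the matching documents rather than a count. If only the number of matching fragments is needed, one option is to pass the query to cts.estimate instead. The following is a minimal sketch that assumes it runs as MarkLogic Server-Side JavaScript, where `cts` and `fn` are globals; the `countSimilar` helper name and the weight of 1 are illustrative choices.

```javascript
// Assumes MarkLogic Server-Side JavaScript, where `cts` and `fn` are globals.
// `countSimilar` is a hypothetical helper name.
function countSimilar(uri, maxTerms) {
  const query = cts.similarQuery(fn.doc(uri), 1, { maxTerms: maxTerms });
  // cts.estimate returns an index-based (unfiltered) count of matching fragments.
  return cts.estimate(query);
}
```

Calling countSimilar("nodes-like-this.xml", 20) as the final expression of a module would return the estimate for the same model document used in the example above.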
