MarkLogic 10 Product Documentation
hadoop:get-splits

hadoop:get-splits(
   $nsbindings as xs:string*,
   $doc-selector as xs:string,
   $query as xs:string
) as item()*

Summary

Returns a sequence of (forest_id, record_count, host_name) tuples. In each tuple, forest_id and host_name identify the target forest of the split input, and record_count is a rough estimate of the number of input key-value pairs in the split.

Parameters
nsbindings    Namespace binding declarations.
doc-selector  A fully searchable path expression that defines the document scope for the splits.
query         A cts:query, passed as a string, that specifies the search to perform for the splits.

Usage Notes

The function builds split tuples from a searchable path expression and a cts:query. The doc-selector and query parameters of hadoop:get-splits and hadoop.getSplits determine which documents are considered in each forest; they are equivalent to the $expression and $query parameters of cts:search. To improve performance, the function returns an estimate rather than an exact count.
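
To make the cts:search analogy concrete, the sketch below estimates, per forest, how many fragments match the same selector and query used in the example on this page. It is an illustration only, not the implementation of hadoop:get-splits, and its numbers are not guaranteed to match the record_count values that function returns. It assumes the query runs against the current database.

```xquery
xquery version "1.0-ml";

(: Illustration only: approximates the per-forest scope described above.
   This is NOT how hadoop:get-splits is implemented. :)
for $forest-id in xdmp:database-forests(xdmp:database())
return (
  $forest-id,
  (: Restrict the search to one forest through the $forest-ids parameter
     of cts:search, and ask for an index-based estimate rather than a count. :)
  xdmp:estimate(
    cts:search(fn:doc(), cts:and-query(()), (), 1.0, $forest-id))
)
```

The result is a flat sequence that alternates forest IDs with estimates.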

Example

xquery version "1.0-ml";
import module namespace hadoop = "http://marklogic.com/xdmp/hadoop"
          at "/MarkLogic/hadoop.xqy";
hadoop:get-splits('', 'fn:doc()', 'cts:and-query(())')
=>
8456374036761185098 97 doc.marklogic.com
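
The output above suggests that the tuples come back as one flat sequence of three items each. Under that assumption, the following sketch regroups the sequence into one element per split. The element names (split, forest-id, record-count, host-name) are hypothetical and chosen only for readability.

```xquery
xquery version "1.0-ml";
import module namespace hadoop = "http://marklogic.com/xdmp/hadoop"
          at "/MarkLogic/hadoop.xqy";

(: Assumption: the result is a flat sequence of
   (forest_id, record_count, host_name) triples. :)
let $splits := hadoop:get-splits('', 'fn:doc()', 'cts:and-query(())')
for $i in 1 to fn:count($splits) idiv 3
let $base := ($i - 1) * 3
return
  (: Hypothetical wrapper elements, one per split. :)
  <split>
    <forest-id>{$splits[$base + 1]}</forest-id>
    <record-count>{$splits[$base + 2]}</record-count>
    <host-name>{$splits[$base + 3]}</host-name>
  </split>
```

Grouping by position keeps the sketch independent of the number of forests. If the sequence length is not a multiple of three, the trailing items are ignored.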
  