
hadoop:get-splits( $nsbindings as xs:string*, $doc-selector as xs:string, $query as xs:string ) as item()*
This function returns (forest_id, record_count,
host_name) tuples where forest_id and
host_name identify the target forest of the split input,
and record_count is a rough estimate of the number of
input key-value pairs in the split.
The function builds the split tuples from a searchable expression
($doc-selector) and a cts:query ($query), both passed as strings.
In hadoop:get-splits and hadoop.getSplits, these parameters determine
which documents are considered in each forest; they are equivalent to
the $expression and $query parameters of cts:search. To improve
performance, the function returns an estimate rather than a true count.
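To make the relationship to cts:search concrete, the following sketch computes a comparable per-forest estimate by hand. It is an illustration only and is not claimed to be how hadoop:get-splits is implemented. It assumes the query runs against the current database and uses xdmp:estimate, which answers from the indexes without fetching documents, so its result can differ from a true count.

```xquery
xquery version "1.0-ml";

(: Illustration only: approximates, per forest, the kind of values a
   split tuple carries. Not the implementation of hadoop:get-splits. :)
for $forest-id in xdmp:database-forests(xdmp:database())
return (
  $forest-id,
  (: Restrict the search to one forest via the forest-ids argument;
     xdmp:estimate resolves it from indexes only. :)
  xdmp:estimate(
    cts:search(fn:doc(), cts:and-query(()), (), (), $forest-id)),
  (: Name of the host on which the forest resides. :)
  xdmp:host-name(xdmp:forest-host($forest-id))
)
```

Here fn:doc() and cts:and-query(()) play the roles of the $doc-selector and $query arguments, written as expressions rather than strings.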
xquery version "1.0-ml";
import module namespace hadoop = "http://marklogic.com/xdmp/hadoop"
  at "/MarkLogic/hadoop.xqy";

(: Select every document: fn:doc() with an empty and-query matches all. :)
hadoop:get-splits('', 'fn:doc()', 'cts:and-query(())')
=>
8456374036761185098 97 doc.marklogic.com
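A caller usually needs the three values of each tuple together. The sketch below regroups the result into one element per split. It assumes the function returns a flat sequence with three items per split, in the order (forest_id, record_count, host_name), as the output above suggests; the element names split, forest-id, record-count, and host-name are hypothetical and chosen only for this illustration.

```xquery
xquery version "1.0-ml";
import module namespace hadoop = "http://marklogic.com/xdmp/hadoop"
  at "/MarkLogic/hadoop.xqy";

let $splits := hadoop:get-splits('', 'fn:doc()', 'cts:and-query(())')
(: Assumption: flat sequence, three items per split. :)
for $i in 1 to fn:count($splits) idiv 3
let $base := ($i - 1) * 3
return
  <split>
    <forest-id>{ $splits[$base + 1] }</forest-id>
    <record-count>{ $splits[$base + 2] }</record-count>
    <host-name>{ $splits[$base + 3] }</host-name>
  </split>
```

If the actual return shape differs, for example one sequence item per tuple, the index arithmetic would need to change accordingly.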