User-Defined Function API 11.0
|
Encapsulation of a User Defined Function for performing aggregate analysis across co-occurrences in range indexes. More...
#include <MarkLogic.h>
Public Member Functions |
|
virtual AggregateUDF * | clone () const =0 |
Create a copy of an AggregateUDF. More... |
|
virtual void | close ()=0 |
Release an AggregateUDF clone. More... |
|
virtual void | start (Sequence &arg, Reporter &r)=0 |
Initialize an aggregate MapReduce job. More... |
|
virtual void | finish (OutputSequence &os, Reporter &r)=0 |
Finalize the results of an aggregate MapReduce job and prepare them for return to the calling application. More... |
|
virtual void | map (TupleIterator &values, Reporter &r)=0 |
Entry point for performing map analysis. MarkLogic Server calls this method at least once per stand. More... |
|
virtual void | reduce (const AggregateUDF *o, Reporter &r)=0 |
Reduce the intermediate results of map analysis to a final result. More... |
|
virtual void | encode (Encoder &e, Reporter &r)=0 |
Serialize this object's state so the object can be distributed across a MarkLogic Server cluster. More... |
|
virtual void | decode (Decoder &d, Reporter &r)=0 |
De-serialize this object's state so the object can be reconstituted on a remote host. More... |
|
virtual RangeIndex::Order | getOrder () const |
Determine the order of range index input values. More... |
|
Protected Member Functions |
|
AggregateUDF (unsigned version=MARKLOGIC_API_VERSION) | |
Construct an object compatible with a specific MarkLogic Native Plugin API version. More... |
|
Encapsulation of a User Defined Function for performing aggregate analysis across co-occurrences in range indexes.
You must implement a subclass of this class.
When you install a subclass of AggregateUDF as a native plugin, MarkLogic servers can use In-Database MapReduce to apply your algorithm to N-way co-occurrences between values in range indexes. Analysis is performed in parallel across the hosts in a cluster, across forests on each host, and across stands in each forest.
Your aggregate algorithm can be accessed from XQuery (cts:aggregate), Java (com.marklogic.client.config.QueryOptions.Aggregate), and REST (the /values resource) APIs.
To make your algorithm available, implement a subclass of AggregateUDF, install it as a native plugin, and register it through markLogicPlugin. For details, see "Implementing an Aggregate User-Defined Function" in the Application Developer's Guide.
To learn about range index co-occurrences, see "Browsing With Lexicons" in the Search Developer's Guide.
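The map and reduce flow described above can be sketched in a few lines. This is a simplified, self-contained illustration and not the real API: SketchAggregate, SumAggregate, and runJob are hypothetical stand-ins, the Reporter and TupleIterator parameters are omitted, and a plain vector of integers takes the place of the co-occurrence tuples in a stand. A real plugin derives from AggregateUDF in MarkLogic.h, and the order in which the server schedules tasks is not specified here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for the real interface: only the pieces needed to show the flow.
struct SketchAggregate {
    virtual ~SketchAggregate() = default;
    virtual SketchAggregate* clone() const = 0;
    virtual void close() = 0;
    virtual void map(const std::vector<std::int64_t>& standValues) = 0;
    virtual void reduce(const SketchAggregate* other) = 0;
};

// Hypothetical aggregate: sums and counts the values it sees.
class SumAggregate : public SketchAggregate {
public:
    SketchAggregate* clone() const override { return new SumAggregate(*this); }
    void close() override { delete this; }

    // Record per-stand results on this object.
    void map(const std::vector<std::int64_t>& standValues) override {
        for (std::int64_t v : standValues) { sum_ += v; ++count_; }
    }

    // Fold another instance's intermediate state into this one.
    void reduce(const SketchAggregate* other) override {
        assert(other != nullptr);
        const SumAggregate* o = static_cast<const SumAggregate*>(other);
        sum_ += o->sum_;
        count_ += o->count_;
    }

    std::int64_t sum() const { return sum_; }
    std::int64_t count() const { return count_; }

private:
    std::int64_t sum_ = 0;
    std::int64_t count_ = 0;
};

// Simulated driver: one clone and one map call per stand, then fold the
// clones back into the job object and release them.
inline void runJob(SumAggregate& job,
                   const std::vector<std::vector<std::int64_t>>& stands) {
    std::vector<SketchAggregate*> tasks;
    for (const auto& stand : stands) {
        SketchAggregate* task = job.clone();  // cloned before any reduce
        task->map(stand);
        tasks.push_back(task);
    }
    for (SketchAggregate* task : tasks) {
        job.reduce(task);
        task->close();
    }
}
```

All clones are taken before the first reduce call so that no clone inherits partially reduced state, which would otherwise be counted twice.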
|
protected |
Construct an object compatible with a specific MarkLogic Native Plugin API version.
You should not override the default version number.
MarkLogic Server uses the version to enforce plugin consistency across all hosts in a cluster. The API version against which your plugin is compiled must match the API version supported by the MarkLogic Server instance(s) on which your plugin executes.
For more information, see "Registering an Aggregate UDF" in the Application Developer's Guide.
|
pure virtual |
Create a copy of an AggregateUDF.
MarkLogic Server uses this method to instantiate objects for aggregate analysis jobs and the map and reduce tasks within them. When an object is cloned for a map or reduce task, you can assume AggregateUDF::start has already been called on the original object, so UDF-specific arguments are already populated.
The object returned by this method must persist until AggregateUDF::close is called.
|
pure virtual |
Release an AggregateUDF clone.
MarkLogic Server calls this method when this object is no longer needed.
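One common way to satisfy the clone and close contract is to heap-allocate the copy in clone and have close delete the object. The sketch below uses a hypothetical stand-in class, CountingAggregate, rather than a real AggregateUDF subclass; the threshold member and the live counter exist only to make the object lifetime visible.

```cpp
#include <cassert>

// Stand-in type; a real plugin would derive from AggregateUDF in MarkLogic.h.
class CountingAggregate {
public:
    CountingAggregate() { ++live; }
    CountingAggregate(const CountingAggregate& o) : threshold(o.threshold) { ++live; }
    ~CountingAggregate() { --live; }

    // clone(): heap-allocate a copy so it persists beyond the caller's scope.
    CountingAggregate* clone() const { return new CountingAggregate(*this); }

    // close(): the matching release for an object produced by clone().
    void close() {
        assert(live > 0);
        delete this;
    }

    int threshold = 0;  // hypothetical argument populated during start()
    static int live;    // number of instances currently alive
};

int CountingAggregate::live = 0;
```

Because the copy constructor carries threshold across, a clone made after start already holds the caller's arguments, which matches the assumption described for clone.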
De-serialize this object's state so the object can be reconstituted on a remote host.
You should call Decoder::decode on all state information of this object. You can decode data members in any order, but you must use the same order in both encode and decode.
d | The decoder with which to de-serialize the data members of this object. |
r | Mechanism for logging errors and other messages. |
Serialize this object's state so the object can be distributed across a MarkLogic Server cluster.
You should call Encoder::encode on all data members of this object. You can encode data members in any order, but you must use the same order in both encode and decode.
e | The encoder with which to serialize the data members of this object. |
r | Mechanism for logging errors and other messages. |
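The ordering rule for encode and decode is easiest to see with a concrete round trip. The sketch below is an assumption-laden illustration: SketchEncoder and SketchDecoder are hypothetical stand-ins that read and write a flat byte buffer, not the real Encoder and Decoder classes, and MeanState is a hypothetical aggregate state with two members. The Reporter parameter is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in encoder: appends raw bytes to a buffer in call order.
class SketchEncoder {
public:
    void encode(std::int64_t v) { put(&v, sizeof v); }
    void encode(double v) { put(&v, sizeof v); }
    const std::vector<unsigned char>& bytes() const { return buf_; }

private:
    void put(const void* p, std::size_t n) {
        const unsigned char* b = static_cast<const unsigned char*>(p);
        buf_.insert(buf_.end(), b, b + n);
    }
    std::vector<unsigned char> buf_;
};

// Stand-in decoder: reads bytes back sequentially from the same buffer.
class SketchDecoder {
public:
    explicit SketchDecoder(const std::vector<unsigned char>& b) : buf_(b) {}
    void decode(std::int64_t& v) { get(&v, sizeof v); }
    void decode(double& v) { get(&v, sizeof v); }

private:
    void get(void* p, std::size_t n) {
        assert(pos_ + n <= buf_.size());
        std::memcpy(p, buf_.data() + pos_, n);
        pos_ += n;
    }
    const std::vector<unsigned char>& buf_;
    std::size_t pos_ = 0;
};

// Hypothetical aggregate state: count first, then sum, in BOTH methods.
struct MeanState {
    std::int64_t count = 0;
    double sum = 0.0;

    void encode(SketchEncoder& e) const {
        e.encode(count);
        e.encode(sum);
    }
    void decode(SketchDecoder& d) {
        d.decode(count);
        d.decode(sum);
    }
};
```

If decode read sum before count, the bytes written for count would be reinterpreted as a double, so the reconstituted object would carry corrupted state without any error being raised by this simple buffer.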
|
pure virtual |
Finalize the results of an aggregate MapReduce job and prepare them for return to the calling application.
MarkLogic Server calls this method once per analysis job (for example, once per cts:aggregate invocation). Record the final analysis results in the provided OutputSequence.
os | Write the final results of your analysis here. |
r | Mechanism for logging errors and other messages. |
|
virtual |
Determine the order of range index input values.
Override this method to indicate what ordering your map input values should have. MarkLogic Server queries this setting when building input for map tasks.
If you do not override this method, descending order is used.
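An override of getOrder might look like the sketch below. The types in the sketch namespace are hypothetical stand-ins so the example compiles without MarkLogic.h, and the enumerator names ASCENDING and DESCENDING are an assumption about RangeIndex::Order that should be checked against the header. MinAggregate is a hypothetical aggregate that would benefit from seeing small values first.

```cpp
#include <cassert>

namespace sketch {

// Stand-in for RangeIndex; enumerator names are assumed, not confirmed.
struct RangeIndex {
    enum Order { ASCENDING, DESCENDING };
};

// Stand-in base class mirroring the documented default (descending).
struct AggregateBase {
    virtual ~AggregateBase() = default;
    virtual RangeIndex::Order getOrder() const { return RangeIndex::DESCENDING; }
};

// Hypothetical aggregate that asks for ascending map input.
struct MinAggregate : AggregateBase {
    RangeIndex::Order getOrder() const override { return RangeIndex::ASCENDING; }
};

}  // namespace sketch
```

An aggregate that only needs the smallest value could then stop after the first tuple in each map call, which is one reason to request a specific order.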
|
pure virtual |
Entry point for performing map analysis. MarkLogic Server calls this method at least once per stand.
Record the results of your map analysis on this object. MarkLogic Server invokes your AggregateUDF::reduce method to consolidate the results from all map calls.
values | An iterator over the N-way co-occurrence tuples for the current stand. |
r | Mechanism for logging errors and other messages. |
|
pure virtual |
Reduce the intermediate results of map analysis to a final result.
MarkLogic Server invokes this method once per analysis job. For example, once per call to cts:aggregate that invokes your aggregate UDF.
Record your final results on this AggregateUDF object. MarkLogic Server subsequently invokes AggregateUDF::finish to prepare the results for return to the application.
o | An instance of your aggregate whose intermediate state should be folded into this object. |
r | Mechanism for logging errors and other messages. |
Initialize an aggregate MapReduce job.
MarkLogic Server calls this method once per analysis job (for example, once per cts:aggregate invocation). Use this method to initialize the object with any state needed to perform the entire analysis. This information is made available to all map and reduce tasks.
arg | The implementation-specific arguments supplied by the caller of your algorithm. |
r | Mechanism for logging errors and other messages. |