cts.classify( dataNodes as Array, classifier as Object, [options as Object?], [trainingNodes as Array] ) as Array
Classifies an array of nodes based on training data. The training data is in the form of a classifier specification, which is generated from the output of cts.train. Returns labels for each of the input documents, in the same order as the input documents.
cts.classify classifies an array of nodes using the output from cts.train. The dataNodes and classifier parameters are, respectively, the nodes to be classified and the specification output from cts.train.
cts.classify can use either the supports form or the weights form of the classifier output from cts.train (see Output Formats). If the supports form is used, the training nodes must be passed as the 4th parameter. The options parameter is an options object.
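The rule about the 4th parameter can be sketched as a small guard around the call. This is only an illustration: classifyArgs is a hypothetical helper, not part of the cts API, and it assumes the caller already knows which classifierType was passed to cts.train, since this entry does not describe how to detect the form from the classifier itself.

```javascript
// Hypothetical helper (not part of MarkLogic): builds the argument list for
// cts.classify and enforces the rule that the supports form needs the
// training nodes as the 4th argument.
// Assumption: the caller states the classifierType that was used in cts.train.
function classifyArgs(classifierType, dataNodes, classifier, options, trainingNodes) {
  if (classifierType === "supports") {
    if (!Array.isArray(trainingNodes) || trainingNodes.length === 0) {
      throw new Error("supports form requires the training nodes as the 4th argument");
    }
    return [dataNodes, classifier, options || {}, trainingNodes];
  }
  if (classifierType === "weights") {
    // The weights form is classified without the training nodes.
    return [dataNodes, classifier, options || {}];
  }
  throw new Error("unknown classifierType: " + classifierType);
}
```

Inside MarkLogic the result could then be applied with something like cts.classify.apply(null, classifyArgs("supports", plays2.toArray(), classifier, {}, plays1.toArray())), using the variable names from the example in this entry.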
The output is an array of label objects. Each label object has a classes array, and each entry in that array has a name property and a val property.
Each label corresponds to the data node in the corresponding position in the input array. There will be an object for each class where the document passed the class threshold. The val property gives the class membership value for the data node in the given class. Values greater than zero indicate likely class membership; values less than zero indicate likely non-membership. Adjusting thresholds can give more or less selective classification. Increasing the threshold leads to a more selective classification (that is, it decreases the likelihood of classification in the class). Decreasing the threshold gives a less selective classification.
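To make the sign convention concrete, the following sketch post-processes a returned label array in plain JavaScript. The sample values are copied from the example output in this entry; classesAbove and bestClass are hypothetical helper names. Note that the cutoff here is applied by the caller after classification, which is not the same thing as the classifier's own thresholds.

```javascript
// Sample labels copied from the example output in this entry.
const labels = [
  { classes: [
    { name: "HISTORY", val: 4.29498338699341 },
    { name: "COMEDY",  val: 2.83974766731262 },
    { name: "TRAGEDY", val: -0.454397678375244 } ] },
  { classes: [
    { name: "HISTORY", val: 3.70210886001587 },
    { name: "COMEDY",  val: 2.59831714630127 },
    { name: "TRAGEDY", val: -0.404506534337997 } ] }
];

// Hypothetical helper: names of the classes whose val exceeds the cutoff.
function classesAbove(label, cutoff) {
  return label.classes.filter(c => c.val > cutoff).map(c => c.name);
}

// Hypothetical helper: the class entry with the highest val, or null if none.
function bestClass(label) {
  return label.classes.reduce(
    (best, c) => (best === null || c.val > best.val ? c : best), null);
}

// val > 0 indicates likely membership.
const likely = labels.map(l => classesAbove(l, 0));
// → [["HISTORY", "COMEDY"], ["HISTORY", "COMEDY"]]

// A higher cutoff is more selective.
const strict = labels.map(l => classesAbove(l, 3));
// → [["HISTORY"], ["HISTORY"]]

// Labels are positional: top[i] belongs to the i-th input node.
const top = labels.map(l => bestClass(l).name);
// → ["HISTORY", "HISTORY"]
```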
// Split the plays into a training set and a set to classify.
var firsthalf = fn.subsequence(
  xdmp.directory("/shakespeare/plays/", "1"), 1, 19);
var plays1 = firsthalf.clone();
var secondhalf = fn.subsequence(
  xdmp.directory("/shakespeare/plays/", "1"), 20, 37);
var plays2 = secondhalf.clone();

// Build one label per training document from its playtype property.
var labels = [];
for (var x of firsthalf) {
  var singleClass = [{
    "name": fn.head(xdmp.documentProperties(xdmp.nodeUri(x)))
              .xpath("//playtype/fn:string()")
  }];
  labels.push({"classes": singleClass});
}

// Train a supports-form classifier.
var classifier = cts.train(plays1.toArray(), labels, {
  "classifierType": "supports",
  "useDbConfig": true,
  "epsilon": 0.00001
});

// The supports form requires the training nodes as the 4th parameter.
cts.classify(plays2.toArray(), classifier, {}, plays1.toArray());

=>
[
  { "classes": [
      { "name": "HISTORY", "val": 4.29498338699341 },
      { "name": "COMEDY", "val": 2.83974766731262 },
      { "name": "TRAGEDY", "val": -0.454397678375244 }
  ] },
  { "classes": [
      { "name": "HISTORY", "val": 3.70210886001587 },
      { "name": "COMEDY", "val": 2.59831714630127 },
      { "name": "TRAGEDY", "val": -0.404506534337997 }
  ] },
  ...
]