cts:thresholds( $computed-labels as element(cts:label)*, $known-labels as element(cts:label)*, [$recall-weight as xs:double?] ) as element(cts:thresholds)?
Compute precision, recall, the F measure, and thresholds for the classes computed by the classifier, by comparing the computed labels with the known labels for the same set of documents.
You use the output of cts:thresholds to determine the best threshold values for your data, based on the first pass through the first part of your training data. The output of cts:thresholds provides you with precision and recall measurements at the calculated thresholds for each class. The following are the definitions of the attributes of the thresholds element returned by cts:thresholds:
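name: The name of the class.
threshold: The threshold computed for the class. This value is used with cts:classify when classifying documents, and is defined to be the positive or negative distance from the hyperplane that represents the edge of the class.
precision: The fraction of the documents the classifier assigned to the class that actually belong to the class, according to the known labels.
recall: The fraction of the documents known to belong to the class that the classifier assigned to the class.
F (the F-measure): A single score that combines precision and recall, weighted by $recall-weight.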
For $recall-weight, a value of +INF (xs:double('+INF')) indicates that weighting is recall only.

let $firsthalf := xdmp:directory("/shakespeare/plays/", "1")[1 to 19]
let $secondhalf := xdmp:directory("/shakespeare/plays/", "1")[20 to 37]
let $firstlabels :=
  for $x in $firsthalf
  return
    <cts:label>
      <cts:class name="{xdmp:document-properties(xdmp:node-uri($x))//playtype/fn:string()}"/>
    </cts:label>
let $secondlabels :=
  for $x in $secondhalf
  return
    <cts:label>
      <cts:class name="{xdmp:document-properties(xdmp:node-uri($x))//playtype/fn:string()}"/>
    </cts:label>
let $classifier :=
  cts:train($firsthalf, $firstlabels,
    <options xmlns="cts:train">
      <classifier-type>supports</classifier-type>
    </options>)
let $classifysecond :=
  cts:classify($secondhalf, $classifier,
    <options xmlns="cts:classify"/>,
    $firsthalf)
return cts:thresholds($classifysecond, $secondlabels)

(:
  This returns the computed thresholds for the second half of the plays
  in a Shakespeare database, based on a classifier trained with the
  first half of the plays. For example:

  <thresholds xmlns="http://marklogic.com/cts">
    <class name="TRAGEDY" threshold="0.221948" precision="1"
           recall="0.666667" f="0.8" count="3"/>
    <class name="COMEDY" threshold="0.114389" precision="0.916667"
           recall="1" f="0.956522" count="11"/>
    <class name="HISTORY" threshold="0.567648" precision="1"
           recall="1" f="1" count="4"/>
  </thresholds>
:)
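
The relationship between the precision, recall, and f values in the example output can be illustrated with a short sketch. It is written in Python rather than XQuery so that it can be run without a MarkLogic server, and it is not MarkLogic's implementation. It assumes the standard weighted F-measure formula, F = (1 + w^2) * P * R / (w^2 * P + R), with w standing in for $recall-weight; this page does not state the exact formula the server uses, although the f values shown above match it when w is 1. The function names and document identifiers are hypothetical.

```python
import math


def precision_recall(computed, known):
    """Return (precision, recall) for one class.

    computed: set of ids the classifier assigned to the class.
    known:    set of ids that belong to the class per the known labels.
    Returning 0.0 for an empty set is a convention chosen for this sketch.
    """
    true_positives = len(computed & known)
    precision = true_positives / len(computed) if computed else 0.0
    recall = true_positives / len(known) if known else 0.0
    return precision, recall


def f_measure(precision, recall, recall_weight=1.0):
    """Weighted F-measure (assumed standard F-beta form, not confirmed by the source).

    recall_weight = 0    -> F equals precision
    recall_weight = 1    -> harmonic mean of precision and recall
    recall_weight = +inf -> F equals recall
    """
    if recall_weight < 0:
        raise ValueError("recall_weight must be non-negative")
    if math.isinf(recall_weight):
        return recall
    w2 = recall_weight ** 2
    denominator = w2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + w2) * precision * recall / denominator


# Hypothetical class with three known members, two of which were found
# by the classifier and no false positives.
p, r = precision_recall({"doc1", "doc2"}, {"doc1", "doc2", "doc3"})
f = f_measure(p, r)
```

With these hypothetical inputs, precision is 1, recall is two thirds, and f comes out at approximately 0.8, the same combination as the TRAGEDY row in the example output. Passing `float("inf")` as `recall_weight` returns the recall unchanged, which mirrors the "recall only" weighting described for xs:double('+INF').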