MarkLogic 9 Product Documentation
cts:entity-highlight

cts:entity-highlight(
   $node as node(),
   $expr as item()*,
   [$dict as cts:entity-dictionary]
) as node()

Summary

Find entities in a node and replace each matched entity with the result of evaluating an XQuery expression.

Parameters
$node A node to run entity highlight on. The node must be either a document node or an element node; it cannot be a text node.
$expr An expression with which to replace each match. You can use the variables $cts:text, $cts:node, $cts:entity-type, $cts:normalized-text, $cts:entity-id, $cts:start, and $cts:action in the expression. See the Usage Notes for details.
$dict The entity dictionary to use for matching entities in the text of the input node. If you omit this parameter, the default entity dictionary is used. (No default dictionaries currently exist.) See the Usage Notes for details.

Usage Notes

You can use this function to easily highlight entities in an XML document in an arbitrary manner. If you do not need fine-grained control of the XML markup returned, you can use the library function entity:enrich instead.
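
For comparison, the following is a minimal sketch of the entity:enrich alternative. It assumes the entity library module is imported from /MarkLogic/entity.xqy under the namespace http://marklogic.com/entity, and that entity:enrich accepts the input node followed by one or more entity dictionaries; check both assumptions against the entity:enrich reference before relying on them. The single-entry dictionary is hypothetical.

```xquery
xquery version "1.0-ml";

(: Assumed module location and namespace for the entity library. :)
import module namespace entity = "http://marklogic.com/entity"
  at "/MarkLogic/entity.xqy";

(: Hypothetical single-entry dictionary, for illustration only. :)
let $dictionary := cts:entity-dictionary((
  cts:entity("11208172", "Nixon", "Richard Nixon", "person")
))
let $input-xml := <node>Richard Nixon never visited Paris.</node>
return
(: The library chooses the wrapper markup; there is no $expr to control it. :)
entity:enrich($input-xml, $dictionary)
```

Use cts:entity-highlight instead when the replacement markup must be built by your own expression.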

The following variables are available for use inline in the $expr parameter. These variables make aspects of the matched entity available to your inline code.

$cts:node as text()
The node containing the match.
$cts:text as xs:string
The matched text. In the case of overlapping matches, this value may not encompass the entirety of the entity match string. Rather, it contains only the non-overlapping part of the text, in order to prevent introduction of duplicate text in the final result.
$cts:entity-type as xs:string
The type of the matched entity, as defined by the type field of the matching entity dictionary entry.
$cts:entity-id as xs:string
The ID of the matched entity, as defined by the id field of the matching entity dictionary entry.
$cts:normalized-text as xs:string
The normalized entity text (only applicable for some languages).
$cts:start as xs:integer
The offset (in codepoints) of the start of $cts:text in the matched text node.
$cts:action as xs:string
The action to take. Use xdmp:set on this variable in your inline code to specify what should happen next. Set the value to one of the following values:
"continue"
Walk the next match. If there are no more matches, return all evaluation results. This is the default action.
"skip"
Skip walking any more matches and return all evaluation results.
"break"
Stop walking matches and return all evaluation results.
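
As an illustration of $cts:action, the following sketch stops highlighting after the first two matches. The dictionary, the input node, the $count counter, and the hit element name are all hypothetical. The else branch returns $cts:text along with setting the action so that the text of the match being evaluated is kept in the result.

```xquery
xquery version "1.0-ml";

(: Hypothetical dictionary and input, for illustration only. :)
let $dictionary := cts:entity-dictionary((
  cts:entity("11208172", "Nixon", "Nixon", "person"),
  cts:entity("08932568", "Paris", "Paris", "district:national capital")
))
let $input-xml := <node>Nixon flew to Paris. Nixon left Paris.</node>
(: Counter updated with xdmp:set each time a match is highlighted. :)
let $count := 0
return
cts:entity-highlight($input-xml,
   (if ($count lt 2)
    then (
      xdmp:set($count, $count + 1),
      element hit { attribute type { $cts:entity-type }, $cts:text }
    )
    else (
      (: Ask the walker to stop, and keep the current text as-is. :)
      xdmp:set($cts:action, "break"),
      $cts:text
    ))
   ,$dictionary)
```

Leaving $cts:action untouched is equivalent to setting it to "continue", the default.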

Example

xquery version "1.0-ml";

let $dictionary := cts:entity-dictionary((
  cts:entity("11208172", "Nixon", "Nixon", "person"),
  cts:entity("11208172", "Nixon", "Richard Nixon", "person"),
  cts:entity("11208172", "Nixon", "Richard M. Nixon", "person"),
  cts:entity("11208172", "Nixon", "Richard Milhous Nixon", "person"),
  cts:entity("11208172", "Nixon", "President Nixon", "person"),
  cts:entity("08932568", "Paris", "Paris", "district:national capital"),
  cts:entity("09145751", "Paris", "Paris", "district:town"),
  cts:entity("09500217", "Paris", "Paris", "mythical being")
))
let $input-xml := <node>Richard Nixon never visited Paris.</node>
return
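(: "Paris" matches more than one dictionary entry. For overlapping matches,
 : $cts:text holds only the non-overlapping text and can be empty, so the
 : guard below emits nothing for those matches.
 :)
cts:entity-highlight($input-xml,
   (if ($cts:text ne "")
    then element { fn:replace($cts:entity-type, ":| ", "-") } { $cts:text }
    else ())
   ,$dictionary)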

(: Returns output similar to the following. (Whitespace has been added
 : here to improve readability.)
 : 
 : <node>
 :   <person>Richard Nixon</person> never visited 
 :   <district-national-capital>Paris</district-national-capital>.
 : </node>
 :)
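
The example above uses only $cts:text and $cts:entity-type. As a further sketch, the variant below records the other match variables described in the Usage Notes as attributes on a wrapper element. The entity element name, the attribute names, and the two-entry dictionary are hypothetical. $cts:normalized-text applies only to some languages, so its value here may be empty.

```xquery
xquery version "1.0-ml";

(: Hypothetical two-entry dictionary, for illustration only. :)
let $dictionary := cts:entity-dictionary((
  cts:entity("11208172", "Nixon", "Richard Nixon", "person"),
  cts:entity("08932568", "Paris", "Paris", "district:national capital")
))
let $input-xml := <node>Richard Nixon never visited Paris.</node>
return
cts:entity-highlight($input-xml,
   (if ($cts:text ne "")
    then element entity {
      attribute type { $cts:entity-type },
      attribute id { $cts:entity-id },
      attribute normalized { $cts:normalized-text },
      (: Offset of $cts:text within the matched text node, in codepoints. :)
      attribute start { $cts:start },
      $cts:text
    }
    else ())
   ,$dictionary)
```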