cts:entity-highlight( $node as node(), $expr as item()*, [$dict as cts:entity-dictionary] ) as node()
Finds entities in a node and replaces each matched entity with the result of evaluating an XQuery expression.
You can use this function to easily highlight entities in an XML document in an arbitrary manner. If you do not need fine-grained control of the XML markup returned, you can use the library function entity:enrich instead.
The following variables are available for use inline in the $expr parameter. These variables make aspects of the matched entity available to your inline code.
$cts:node as text()
- The node containing the match.
$cts:text as xs:string
- The matched text. In the case of overlapping matches, this value may not encompass the entirety of the entity match string. Rather, it contains only the non-overlapping part of the text, in order to prevent introduction of duplicate text in the final result.
$cts:entity-type as xs:string
- The type of the matched entity, as defined by the type field of the matching entity dictionary entry.
$cts:entity-id as xs:string
- The ID of the matched entity, as defined by the id field of the matching entity dictionary entry.
$cts:normalized-text as xs:string
- The normalized entity text (only applicable for some languages).
$cts:start as xs:integer
- The offset (in codepoints) of the start of $cts:text in the matched text node.
$cts:action as xs:string
- The action to take. Use xdmp:set on this variable in your inline code to specify what should happen next. Set the value to one of the following values:
- "continue"
- Walk the next match. If there are no more matches, return all evaluation results. This is the default action.
- "skip"
- Skip walking any more matches and return all evaluation results.
- "break"
- Stop walking matches and return all evaluation results.
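As a minimal sketch of how $cts:action might be used, the following query wraps a match and then sets the action to "break" so that no further matches are walked. The two-entry dictionary, the input node, and the first-match element name are assumptions made for illustration; they do not come from the function's own documentation.

```xquery
xquery version "1.0-ml";

(: Hypothetical two-entry dictionary, for illustration only. :)
let $dictionary := cts:entity-dictionary((
  cts:entity("11208172", "Nixon", "Nixon", "person"),
  cts:entity("09145751", "Paris", "Paris", "district:town")
))
(: Hypothetical input containing one match for each entry. :)
let $input-xml := <node>Nixon visited Paris.</node>
return
  cts:entity-highlight(
    $input-xml,
    (
      (: Wrap the current match in an illustrative element... :)
      <first-match type="{ $cts:entity-type }">{ $cts:text }</first-match>,
      (: ...then request that no further matches be walked. :)
      xdmp:set($cts:action, "break")
    ),
    $dictionary
  )
```

Because xdmp:set returns the empty sequence, it can sit alongside the constructed element in the same sequence without adding anything to the result.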
xquery version "1.0-ml";

let $dictionary := cts:entity-dictionary((
  cts:entity("11208172", "Nixon", "Nixon", "person"),
  cts:entity("11208172", "Nixon", "Richard Nixon", "person"),
  cts:entity("11208172", "Nixon", "Richard M. Nixon", "person"),
  cts:entity("11208172", "Nixon", "Richard Milhous Nixon", "person"),
  cts:entity("11208172", "Nixon", "President Nixon", "person"),
  cts:entity("08932568", "Paris", "Paris", "district:national capital"),
  cts:entity("09145751", "Paris", "Paris", "district:town"),
  cts:entity("09500217", "Paris", "Paris", "mythical being")
))
let $input-xml := <node>Richard Nixon never visited Paris.</node>
return
  cts:entity-highlight($input-xml,
    (: Skip empty text produced by overlapping matches; otherwise wrap the
     : match in an element named after its entity type. :)
    (if ($cts:text ne "")
     then element { fn:replace($cts:entity-type, ":| ", "-") } { $cts:text }
     else ()),
    $dictionary)

(: Returns output similar to the following. (Whitespace has been added
 : here to improve readability.)
 :
 : <node>
 :   <person>Richard Nixon</person> never visited
 :   <district-national-capital>Paris</district-national-capital>.
 : </node>
 :)
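The other variables can be surfaced in the generated markup in the same way. The sketch below is only illustrative: it records $cts:entity-type, $cts:entity-id, and $cts:start as attributes on a wrapper element. The entity element name, its attribute names, and the reduced dictionary are assumptions chosen for this example.

```xquery
xquery version "1.0-ml";

(: Reduced dictionary, for illustration only. :)
let $dictionary := cts:entity-dictionary((
  cts:entity("11208172", "Nixon", "Richard Nixon", "person"),
  cts:entity("08932568", "Paris", "Paris", "district:national capital")
))
let $input-xml := <node>Richard Nixon never visited Paris.</node>
return
  cts:entity-highlight(
    $input-xml,
    (: Guard against the empty text that overlapping matches can produce. :)
    if ($cts:text ne "") then
      element entity {
        attribute type { $cts:entity-type },
        attribute id { $cts:entity-id },
        attribute start { $cts:start },
        $cts:text
      }
    else (),
    $dictionary
  )
```

Keeping the type in an attribute rather than in the element name avoids having to rewrite type strings such as "district:national capital" into valid XML names.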