cts:entity-walk( $node as node(), $expr as item()*, [$dict as cts:entity-dictionary] ) as item()*
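Walk an XML document or element node, evaluating an expression against any matching entities. This function is similar to cts:entity-highlight in how it processes matched entities, but it differs in what it returns: cts:entity-walk returns the results of evaluating the expression for each match, rather than a copy of the node with the matches replaced.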
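To illustrate the difference, the following minimal sketch runs both functions over the same node. It assumes a one-entry dictionary built with the cts:entity and cts:entity-dictionary constructors, which are not described on this page, and the <entity> wrapper element is a hypothetical name chosen for the illustration. The first call should yield a copy of the node with the match wrapped, while the second should yield only the evaluated results.

```xquery
xquery version "1.0-ml";

(: Assumption: a small in-memory dictionary built with constructors
   instead of a parsed, tab-separated string. :)
let $dictionary := cts:entity-dictionary(
  cts:entity("08932568", "Paris", "Paris",
             "administrative district:national capital")
)
let $input-node := <node>Nixon visited Paris</node>
return (
  (: Returns a copy of $input-node with each match replaced by <entity>. :)
  cts:entity-highlight(
    $input-node,
    <entity type="{$cts:entity-type}">{$cts:text}</entity>,
    $dictionary),
  (: Returns only the results of evaluating the expression,
     here the matched text itself. :)
  cts:entity-walk($input-node, $cts:text, $dictionary)
)
```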
The following variables are available for use inline in the expr parameter. These variables make aspects of the matched entity available to your inline code.
$cts:node as text()
- The node containing the match.
$cts:text as xs:string
- The matched text. In the case of overlapping matches, this value may not encompass the entirety of the entity match string. Rather, it contains only the non-overlapping part of the text, in order to prevent introduction of duplicate text in the final result.
$cts:entity-type as xs:string
- The type of the matched entity, as defined by the type field of the matching entity dictionary entry.
$cts:entity-id as xs:string
- The ID of the matched entity, as defined by the id field of the matching entity dictionary entry.
$cts:normalized-text as xs:string
- The normalized entity text (only applicable to some languages).
$cts:start as xs:integer
- The offset (in codepoints) of the start of $cts:text in the matched text node.
$cts:action as xs:string
- The action to take. Use xdmp:set on this variable in your inline code to specify what should happen next, setting the value to one of the following:
- "continue"
- Walk the next match. If there are no more matches, return all evaluation results. This is the default action.
- "skip"
- Skip walking any more matches and return all evaluation results.
- "break"
- Stop walking matches and return all evaluation results.
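As a sketch of how $cts:action might be used, the example below sets the action to "break" on the first match, so the walk should stop there and return only what has been evaluated so far. The dictionary is an assumption for the illustration, built with the cts:entity and cts:entity-dictionary constructors rather than taken from this page. Because xdmp:set returns the empty sequence, only $cts:text contributes to the result.

```xquery
xquery version "1.0-ml";

(: Assumption: an illustrative two-entry dictionary. :)
let $dictionary := cts:entity-dictionary((
  cts:entity("11208172", "Nixon", "Nixon", "person:head of state"),
  cts:entity("08932568", "Paris", "Paris",
             "administrative district:national capital")
))
let $input-node := <node>Nixon visited Paris</node>
return
  cts:entity-walk(
    $input-node,
    (
      (: Ask the walk to stop after the current match. :)
      xdmp:set($cts:action, "break"),
      (: The value contributed to the result for this match. :)
      $cts:text
    ),
    $dictionary)
```

Leaving $cts:action untouched keeps the default "continue" behavior, in which every match is walked.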
xquery version "1.0-ml";

(: NOTE: The fields of each line below must be TAB separated. :)
let $dictionary := cts:entity-dictionary-parse(
"11208172	Nixon	Nixon	person:head of state
11208172	Nixon	Richard Nixon	person:head of state
11208172	Nixon	Richard M. Nixon	person:head of state
11208172	Nixon	Richard Milhous Nixon	person:head of state
11208172	Nixon	President Nixon	person:head of state
08932568	Paris	Paris	administrative district:national capital
09145751	Paris	Paris	administrative district:town
09500217	Paris	Paris	imaginary being:mythical being
"
)
let $input-node := <node>Nixon visited Paris</node>
return cts:entity-walk($input-node,
  (object-node {
    "type": $cts:entity-type,
    "text": $cts:text,
    "normText": $cts:normalized-text,
    "id": $cts:entity-id,
    "start": $cts:start
  }),
  $dictionary)

(: Produces output similar to the following:
 : { "type":"person:head of state",
 :   "text":"Nixon", "normText":"Nixon", "id":"11208172", "start":1}
 : { "type":"administrative district:national capital",
 :   "text":"Paris", "normText":"Paris", "id":"08932568", "start":15}
 : { "type":"administrative district:town",
 :   "text":"Paris", "normText":"Paris", "id":"09145751", "start":15}
 : { "type":"imaginary being:mythical being",
 :   "text":"Paris", "normText":"Paris", "id":"09500217", "start":15}
 :)