MarkLogic 10 Product Documentation
cts:entity-dictionary

cts:entity-dictionary(
   $entities as cts:entity*,
   [$options as xs:string*]
) as cts:entity-dictionary

Summary

Returns a cts:entity-dictionary object.

Parameters
entities: The entities to put into the dictionary.
options: Dictionary building options. The defaults are "case-sensitive", "allow-overlaps", and "whole-words".

Options include:

"case-sensitive"
Entity names are case-sensitive.
"case-insensitive"
Entity names are case-insensitive.
"whole-words"
Require that matches align with token boundaries.
"partial-words"
Allow matches to fall within token boundaries.
"allow-overlaps"
Allow overlapping entity labels.
"remove-overlaps"
Remove overlapping entity labels.

Usage Notes

Specify at most one option from each of the following pairs: "case-sensitive" or "case-insensitive", "whole-words" or "partial-words", and "allow-overlaps" or "remove-overlaps". Using the defaults is strongly recommended.
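
As a minimal sketch of how the options parameter might be passed, the following builds an ad hoc dictionary that overrides only the case-sensitivity default and leaves the other two defaults in place. The entity values and the sample sentence are made up for illustration, and no output is shown because it has not been verified against a running server.

```xquery
xquery version "1.0-ml";

(: Illustrative only: the entities and the sample text are assumptions. :)
(: Override one default so that entity names match regardless of case.  :)
let $dict :=
  cts:entity-dictionary(
    (
      cts:entity("E1", "ADA", "ADA", "Law"),
      cts:entity("E1", "ADA", "Affordable Care Act", "Law")
    ),
    ("case-insensitive")
  )
return
  (: Wrap each match in an element named after its entity type. :)
  cts:entity-highlight(
    <root>Many people discuss the affordable care act</root>,
    element {$cts:entity-type} {attribute norm {$cts:normalized-text}, $cts:text},
    $dict
  )
```

With the default "case-sensitive" setting, the lowercase phrase in the sample text would not be expected to match the "Affordable Care Act" entry.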

Use this method when creating ad hoc entity dictionaries, or as a prelude to saving the entity dictionary to the database.

Example

let $dict := 
  cts:entity-dictionary(
    for $alt in ("ADA", "Obamacare", "Affordable Care Act")
    return cts:entity("E1", "ADA", $alt, "Law")
  )
return 
  cts:entity-highlight(<root>ADA is often called Obamacare</root>,
    element {$cts:entity-type} {attribute norm {$cts:normalized-text}, $cts:text}, $dict)
=>
<root><Law norm="ADA">ADA</Law> is often called <Law norm="ADA">Obamacare</Law></root>

Example

import module namespace entity = "http://marklogic.com/entity"
  at "/MarkLogic/entity.xqy";

xdmp:document-insert("/entities/example.txt",
  entity:skos-dictionary("http://example.org/ontology","en"))
;

cts:entity-walk(doc("mydoc.xml"), 
  <entity type="{$cts:entity-type}">{$cts:text}</entity>,
  cts:entity-dictionary-get("/entities/example.txt"))