MarkLogic 10 Product Documentation
entity:dictionary-load

entity:dictionary-load(
$path as xs:string,
$uri as xs:string,
[$options as xs:string*]
) as empty-sequence()
Summary
Load an entity dictionary from the filesystem into the database in the
appropriate format.
Parameters
path
  The path to a text file containing the dictionary entries.
  For format details, see the Usage Notes.
uri
  The URI of the dictionary to be created.
options
  Options that control the behavior of the entity dictionary.
  The following options are available. It is strongly recommended that
  you use the default option settings.
  "case-sensitive" or "case-insensitive":
    Perform case-sensitive or case-insensitive matching of entity names.
    Specify one or the other. Default: "case-sensitive".
  "remove-overlaps" or "allow-overlaps":
    Either eliminate entities with overlapping names or allow them.
    Specify one or the other. Default: "allow-overlaps".
  "whole-words" or "partial-words":
    Either require matches to align with token boundaries, or allow
    matches to fall within token boundaries. Specify one or the other.
    Default: "whole-words".
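The options are passed as a sequence of strings in the third argument. The following is a minimal sketch, not a recommended configuration: the path and URI are illustrative placeholders, and the particular option combination is an assumption chosen only to show the syntax. The default settings remain the recommended choice.

```xquery
xquery version "1.0-ml";
import module namespace entity = "http://marklogic.com/entity"
  at "/MarkLogic/entity.xqy";

(: Illustrative only: override two of the defaults.
   "whole-words" is not listed, so it stays at its default. :)
entity:dictionary-load(
  "/data/example.txt",   (: placeholder path to the tab-delimited text file :)
  "/ontology/people",    (: placeholder URI of the dictionary to be created :)
  ("case-insensitive", "remove-overlaps")
)
```

Each pair of options is mutually exclusive, so the sequence should contain at most one value from each pair.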
Usage Notes
The entity dictionary should be a text file containing one line per
dictionary entry. Each line (entry) must consist of the following
tab-delimited fields, in the order shown: identifier, normalized text,
matching text, entity type. For more details on the meaning of each
field, see cts:entity.
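To make the field order concrete, the sketch below assembles one dictionary line from its four fields using only standard XQuery. The variable names are hypothetical, and the values are borrowed from the Nixon entry in the Example section. It is not part of the entity library API.

```xquery
xquery version "1.0-ml";

(: The four fields of one entry, in the required order. :)
let $identifier      := "11208172"
let $normalized-text := "Nixon"
let $matching-text   := "Richard Nixon"
let $entity-type     := "person:head of state"
return
  (: "&#9;" is the character reference for a tab :)
  fn:string-join(
    ($identifier, $normalized-text, $matching-text, $entity-type),
    "&#9;"
  )
```

The expression evaluates to a single string with the four fields separated by tab characters, which is the shape each line of the source file is expected to have.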
Example
(:
Assuming "/data/example.txt" contains the following entries, with the
four fields on each line separated by tab characters:
11208172 Nixon Nixon person:head of state
11208172 Nixon Richard Nixon person:head of state
11208172 Nixon Richard M. Nixon person:head of state
11208172 Nixon Richard Milhous Nixon person:head of state
11208172 Nixon President Nixon person:head of state:person
08932568 Paris Paris administrative district:national capital
09145751 Paris Paris administrative district:town
09500217 Paris Paris imaginary being:mythical being
The URI "/ontology/people" will contain an entity dictionary with
four entities (11208172 with 5 alternative matching texts, and the three
entities 08932568, 09145751, and 09500217 with the same matching text).
:)
import module namespace entity="http://marklogic.com/entity"
at "/MarkLogic/entity.xqy";
entity:dictionary-load("/data/example.txt", "/ontology/people")
Copyright © 2024 MarkLogic Corporation. MARKLOGIC is a
registered trademark of MarkLogic Corporation.