MarkLogic 10 Product Documentation
entity.dictionaryLoad

entity.dictionaryLoad(
   path as String,
   uri as String,
   [options as String[]]
) as null

Summary

Load an entity dictionary from the filesystem into the database in the appropriate format.

Parameters
path The path to a text file containing the dictionary entries. For format details, see the Usage Notes.
uri The URI of the dictionary to be created.
options Options that control the behavior of the entity dictionary. It is strongly recommended that you use the default option settings. You can specify the following options:
  • "case-sensitive" or "case-insensitive": Perform case-sensitive or case-insensitive matching of entity names. Specify one or the other. Default: "case-sensitive".
  • "remove-overlaps" or "allow-overlaps": Either eliminate entities with overlapping names or allow them. Specify one or the other. Default: "allow-overlaps".
  • "whole-words" or "partial-words": Either require matches to align with token boundaries, or allow matches to fall within token boundaries. Specify one or the other. Default: "whole-words".
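
Because each pair of options above is mutually exclusive, it can help to check an options array before passing it to entity.dictionaryLoad. The following is a minimal sketch in plain JavaScript; the names OPTION_PAIRS and checkDictionaryOptions are hypothetical helpers for illustration and are not part of the MarkLogic API.

```javascript
// Mutually exclusive option pairs from the list above; the first member of
// each pair is the documented default.
const OPTION_PAIRS = [
  ['case-sensitive', 'case-insensitive'],
  ['allow-overlaps', 'remove-overlaps'],
  ['whole-words', 'partial-words'],
];

// Hypothetical helper (not part of the MarkLogic API): reject unknown option
// strings and conflicting pairs, then return the array unchanged.
function checkDictionaryOptions(options) {
  const known = new Set(OPTION_PAIRS.flat());
  for (const opt of options) {
    if (!known.has(opt)) {
      throw new Error(`Unknown option: ${opt}`);
    }
  }
  for (const [a, b] of OPTION_PAIRS) {
    if (options.includes(a) && options.includes(b)) {
      throw new Error(`Specify only one of "${a}" or "${b}"`);
    }
  }
  return options;
}

const options = checkDictionaryOptions(['case-insensitive', 'remove-overlaps']);

// Inside MarkLogic, the checked array would be passed as the third argument:
// entity.dictionaryLoad('/space/rest/ent-dict.txt', '/ontology/people', options);
```

The call to entity.dictionaryLoad is left as a comment because it only runs inside MarkLogic Server; the check itself runs in any JavaScript environment.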

Usage Notes

The entity dictionary should be a text file containing one line per dictionary entry. Each line (entry) must consist of the following tab-delimited fields, in the order shown: identifier, normalized text, matching text, entity type. For more details on the meaning of each field, see cts.entity.
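
The four-field line format can be checked before loading a file. The following is a minimal sketch in plain JavaScript, assuming the file's text is already in a string; parseDictionaryLine and parseDictionaryText are hypothetical helpers, not MarkLogic functions, and skipping blank lines is a choice made by this sketch rather than documented loader behavior.

```javascript
// Hypothetical helper (not part of the MarkLogic API): split one dictionary
// line into the four tab-delimited fields, in the documented order.
function parseDictionaryLine(line) {
  const fields = line.split('\t');
  if (fields.length !== 4) {
    throw new Error(`Expected 4 tab-delimited fields, found ${fields.length}`);
  }
  const [id, normalizedText, matchingText, entityType] = fields;
  return { id, normalizedText, matchingText, entityType };
}

// Hypothetical helper: parse every non-blank line of a dictionary file's text.
// Blank lines are skipped here as an assumption of this sketch.
function parseDictionaryText(text) {
  return text
    .split(/\r?\n/)
    .filter((line) => line.trim() !== '')
    .map(parseDictionaryLine);
}

const sample = [
  '11208172\tNixon\tRichard Nixon\tperson:head of state',
  '08932568\tParis\tParis\tadministrative district:national capital',
].join('\n');

const entries = parseDictionaryText(sample);
console.log(entries.length);          // 2
console.log(entries[0].matchingText); // Richard Nixon
```

A line with too few or too many tabs raises an error, which surfaces formatting problems before the file is handed to entity.dictionaryLoad.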

Example

// Assume "/space/rest/ent-dict.txt" contains the following data:
//
// 11208172	Nixon	Nixon	person:head of state
// 11208172	Nixon	Richard Nixon	person:head of state
// 11208172	Nixon	Richard M. Nixon	person:head of state
// 11208172	Nixon	Richard Milhous Nixon	person:head of state
// 11208172	Nixon	President Nixon	person:head of state
// 08932568	Paris	Paris	administrative district:national capital
// 09145751	Paris	Paris	administrative district:town
// 09500217	Paris	Paris	imaginary being:mythical being

declareUpdate();
const entity = require('/MarkLogic/entity');

entity.dictionaryLoad('/space/rest/ent-dict.txt','/ontology/people');

// The URI "/ontology/people" now contains an entity dictionary with
// four entities (11208172 with 5 alternative matching texts, and the
// three entities 08932568, 09145751, and 09500217 with the same matching text).
    