MarkLogic 9 Product Documentation
cts.entityDictionaryParse

cts.entityDictionaryParse(
   contents as String[],
   [options as String[]]
) as cts.entityDictionary

Summary

Construct a cts:entity-dictionary object by parsing it from a formatted string.

Parameters
contents The dictionary entries to parse. Each line (or string) must consist of four tab-delimited fields: the entity ID, the normalized form of the entity, the word or phrase to match during entity identification, and the entity type. For more details about the fields, see cts.entity. If you pass multiple formatted strings, they are combined into a single dictionary object.
options Options that control the behavior of the entity dictionary. It is strongly recommended that you use the default option settings. You can specify the following options:
  • "case-sensitive" or "case-insensitive": Perform case-sensitive or case-insensitive matching of entity names. Specify one or the other. Default: "case-sensitive".
  • "remove-overlaps" or "allow-overlaps": Either eliminate entities with overlapping names or allow them. Specify one or the other. Default: "allow-overlaps".
  • "whole-words" or "partial-words": Either require matches to align with token boundaries, or allow matches to fall within token boundaries. Specify one or the other. Default: "whole-words".
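
Because each entry must contain exactly four tab-delimited fields, it can be easier to build the strings programmatically than to type literal tabs by hand. The following is a minimal sketch in plain JavaScript. The helper name formatEntry is hypothetical and not part of the MarkLogic API, and its rejection of tabs and line breaks inside a field is an assumption made to keep the four-field layout intact.

```javascript
'use strict';

// Hypothetical helper (not part of the MarkLogic API): builds one
// dictionary line from its four fields, joined with TAB characters.
function formatEntry(id, normalized, text, type) {
  const fields = [id, normalized, text, type].map(String);
  for (const field of fields) {
    // Assumption: a tab or line break inside a field would corrupt
    // the four-field layout, so reject it up front.
    if (/[\t\r\n]/.test(field)) {
      throw new Error('Field contains a tab or line break: ' + JSON.stringify(field));
    }
  }
  return fields.join('\t');
}

const entries = [
  formatEntry('11208172', 'Nixon', 'Richard Nixon', 'person:head of state'),
  formatEntry('08932568', 'Paris', 'Paris', 'administrative district:national capital')
];

// In MarkLogic Server-Side JavaScript, the array could then be passed as
// the contents parameter:
// const dictionary = cts.entityDictionaryParse(entries);
```

Building the lines this way keeps the separators visible in the source as '\t' instead of relying on an editor to preserve literal tab characters.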

Example

'use strict';
const entity = require('/MarkLogic/entity');

// NOTE: The fields in the array items below must be TAB separated.
const dictionary =
  cts.entityDictionaryParse([
    '11208172	Nixon	Nixon	person:head of state',
    '11208172	Nixon	Richard Nixon	person:head of state',
    '11208172	Nixon	Richard M. Nixon	person:head of state',
    '11208172	Nixon	Richard Milhous Nixon	person:head of state',
    '11208172	Nixon	President Nixon	person:head of state',
    '08932568	Paris	Paris	administrative district:national capital',
    '09145751	Paris	Paris	administrative district:town',
    '09500217	Paris	Paris	imaginary being:mythical being'
  ]);
const node = new NodeBuilder()
                   .addElement('node', 'Nixon visited Paris')
                   .toNode();
entity.enrich(node, dictionary);

// Returns output similar to the following. (Whitespace added to improve
// readability.)
//
// <node xmlns:e="http://marklogic.com/entity">
//   <e:entity type="person:head of state">Nixon</e:entity> 
//   visited 
//   <e:entity type="administrative district:national capital">Paris</e:entity>
// </node>