cdict.dictionaryWrite( lang as String, dict as element(cdict.dictionary), [tokenization as Boolean] ) as null
Insert or update a custom dictionary for a language.
Parameters | |
---|---|
lang | The ISO language code of the dictionary. |
dict | A custom dictionary. For details on the structure, see Custom Dictionary Format in the Search Developer's Guide. |
tokenization | Whether to insert the dictionary for use in tokenization or stemming. Set to true for tokenization, false for stemming. Default: false (stemming). This parameter is ignored for languages that use a single dictionary for both stemming and tokenization, such as Japanese and Chinese. |
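
As a minimal sketch of how the `dict` argument might be assembled, the following plain JavaScript builds the XML serialization of a dictionary from word/stem pairs. The helper names `escapeXml` and `buildDictionaryXml` are hypothetical and not part of the `cdict` module, and the sketch assumes entries contain only `word` and `stem` elements, as in the example on this page; see Custom Dictionary Format in the Search Developer's Guide for the full structure.

```javascript
// Hypothetical helpers for illustration; not part of the cdict module.

// Escape characters that are unsafe in XML text content.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Serialize [{ word, stem }, ...] as a cdict:dictionary element.
function buildDictionaryXml(entries) {
  const ns = 'http://marklogic.com/xdmp/custom-dictionary';
  const body = entries.map(({ word, stem }) =>
    '<cdict:entry>' +
      '<cdict:word>' + escapeXml(word) + '</cdict:word>' +
      '<cdict:stem>' + escapeXml(stem) + '</cdict:stem>' +
    '</cdict:entry>'
  ).join('');
  return '<cdict:dictionary xmlns:cdict="' + ns + '">' + body + '</cdict:dictionary>';
}

const xml = buildDictionaryXml([
  { word: 'Furbies', stem: 'Furby' },
  { word: 'servlets', stem: 'servlet' }
]);

// Inside MarkLogic, the string could then be parsed and written,
// following the same pattern as the example on this page:
//   const dict = fn.head(xdmp.unquote(xml)).root;
//   cdict.dictionaryWrite('en', dict);   // stemming (the default)
```

Building the string from data keeps the entry list easy to review and escapes any markup characters that appear in a word or stem.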
Required privileges: the custom-dictionary-admin role, or the following privilege:

http://marklogic.com/xdmp/privileges/custom-dictionary-admin
If your language configuration uses user-defined lexer and/or stemmer plugins, you can define additional privileges for finer control. For details, see Custom Dictionary Security Considerations in the Search Developer's Guide.
Any xml:lang attribute on the dictionary element is ignored. The lang parameter determines what language the dictionary is associated with.
When you configure a dictionary for a language, it is associated with the stemmer or lexer configured for the language. If you change the stemmer/lexer for the language, you will need to write the dictionary again.
Changes affecting stemming and tokenization take effect immediately. Queries started after a custom dictionary is written or deleted will use the new behavior.
Documents are not automatically reindexed after a custom dictionary change. To get accurate results for stemmed searches, documents must be reindexed. If it is not practical to reindex all documents, use this process to selectively reindex affected documents:
word element of dictionary entries that are added, removed, or modified.

```javascript
'use strict';
const cdict = require('/MarkLogic/custom-dictionary');

// Parse the dictionary XML and take the root element of the resulting document.
const dict = fn.head(xdmp.unquote(
  '<cdict:dictionary xmlns:cdict="http://marklogic.com/xdmp/custom-dictionary">' +
    '<cdict:entry>' +
      '<cdict:word>Furbies</cdict:word>' +
      '<cdict:stem>Furby</cdict:stem>' +
    '</cdict:entry>' +
    '<cdict:entry>' +
      '<cdict:word>servlets</cdict:word>' +
      '<cdict:stem>servlet</cdict:stem>' +
    '</cdict:entry>' +
  '</cdict:dictionary>'
)).root;

// The tokenization argument is omitted, so this writes the English stemming dictionary.
cdict.dictionaryWrite('en', dict);
```
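
Selective reindexing depends on knowing which dictionary entries were added, removed, or modified. The following is a minimal sketch of that comparison in plain JavaScript, assuming the old and new dictionaries are available as arrays of `{ word, stem }` objects. The function `changedWords` is hypothetical and not part of the `cdict` module; locating and reindexing the documents that contain the resulting words is not shown.

```javascript
// Hypothetical helper for illustration; not part of the cdict module.
// Returns the sorted words whose entries were added, removed, or modified
// between two lists of { word, stem } entries.
function changedWords(oldEntries, newEntries) {
  // Group stems by word, in case a word appears in more than one entry.
  const index = (entries) => {
    const map = new Map();
    for (const { word, stem } of entries) {
      if (!map.has(word)) map.set(word, new Set());
      map.get(word).add(stem);
    }
    return map;
  };
  const before = index(oldEntries);
  const after = index(newEntries);
  const sameStems = (a, b) => a.size === b.size && [...a].every((s) => b.has(s));

  const changed = new Set();
  for (const [word, stems] of before) {
    // Removed entirely, or present with a different set of stems.
    if (!after.has(word) || !sameStems(stems, after.get(word))) changed.add(word);
  }
  for (const word of after.keys()) {
    // Newly added.
    if (!before.has(word)) changed.add(word);
  }
  return [...changed].sort();
}

const oldEntries = [
  { word: 'Furbies', stem: 'Furby' },
  { word: 'servlets', stem: 'servlet' }
];
const newEntries = [
  { word: 'Furbies', stem: 'furby' },   // modified stem
  { word: 'applets', stem: 'applet' }   // added; 'servlets' was removed
];
const words = changedWords(oldEntries, newEntries);
// words is ['Furbies', 'applets', 'servlets']
```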