Administrating MarkLogic Server

Index Settings That Affect Documents

When you change any index setting for a database, how the new setting takes effect depends on whether reindexing is enabled (reindexer enable set to true). For more details on text indexes, see Text Indexing.

In general, adding index options slows document loading and increases the size of database files.
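As a rough illustration of changing several settings in one step, the sketch below builds the request path and JSON body for a database properties update. It assumes the Management REST API endpoint `/manage/v2/databases/{name}/properties` and hyphenated property names such as `word-positions` and `reindexer-enable`; the helper `build_index_update` and the database name `Documents` are hypothetical. Nothing is sent to a server.

```python
import json

def build_index_update(database, settings, reindex=True):
    """Return (path, JSON body) for a properties update request.

    The path and property names are assumptions about the Management
    REST API; verify them against your MarkLogic version before use.
    """
    path = f"/manage/v2/databases/{database}/properties"
    body = dict(settings)
    # Assumed property name for the "reindexer enable" setting.
    body["reindexer-enable"] = reindex
    return path, json.dumps(body, sort_keys=True)

path, body = build_index_update(
    "Documents",  # hypothetical database name
    {"word-positions": True, "three-character-searches": True},
)
print(path)
print(body)
```

A real client would send this body with an authenticated PUT request. That part is omitted because it depends on the deployment.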

Database Setting

Description

language

Specifies the default language for content in this database. Any content without an xml:lang attribute is indexed in the language specified here. You need a license key to specify a non-English language; if you specify a non-English language and do not have a license for it, stemming and tokenization are generic.

stemmed searches

Controls the level of stemming applied to word searches. Stemmed searches match not only the exact word in the search, but also words that come from the same stem and mean the same thing (for example, a search for be will also match the term is). For more details on stemmed searches, see Understanding and Using Stemmed Searches in the Search Developer’s Guide.

word searches

Enables unstemmed word searches, which match only the exact words in the search.
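The difference between stemmed and unstemmed matching can be sketched with a toy model. The stem table below is hand-written for illustration and is not how MarkLogic stems text; the names `STEMS`, `stem`, and `matches` are hypothetical.

```python
# Hand-written stem table for illustration only; real stemming is
# language-aware and covers far more words.
STEMS = {"is": "be", "was": "be", "be": "be", "runs": "run", "run": "run"}

def stem(word):
    """Return the stem of a word, or the lowercased word if unknown."""
    return STEMS.get(word.lower(), word.lower())

def matches(query, text, stemmed):
    """Check whether a one-word query matches any word in text."""
    words = text.lower().split()
    if stemmed:
        # Stemmed search: compare stems, so "be" matches "is".
        return stem(query) in {stem(w) for w in words}
    # Unstemmed word search: only the exact word matches.
    return query.lower() in words

doc = "The answer is here"
print(matches("be", doc, stemmed=True))
print(matches("be", doc, stemmed=False))
```

With stemming, the query be matches the document because is shares its stem. Without stemming, it does not.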

word positions

Index word positions for faster phrase and cts:near-query searches.
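To see why stored positions help phrase searches, consider this minimal positional index. It is a conceptual sketch, not MarkLogic's internal format, and the names `build_positions` and `phrase_docs` are hypothetical.

```python
from collections import defaultdict

def build_positions(docs):
    """Map each word to {doc_id: [positions where it occurs]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word][doc_id].append(pos)
    return index

def phrase_docs(index, phrase):
    """Return ids of documents where the phrase words are consecutive."""
    words = phrase.lower().split()
    hits = set()
    for doc_id, starts in index.get(words[0], {}).items():
        for start in starts:
            # Each later word must sit exactly i positions after the first.
            if all(start + i in index.get(w, {}).get(doc_id, [])
                   for i, w in enumerate(words)):
                hits.add(doc_id)
                break
    return hits

docs = {"a.xml": "the quick brown fox", "b.xml": "brown and quick"}
index = build_positions(docs)
print(phrase_docs(index, "quick brown"))
```

Both documents contain quick and brown, but only a.xml has them adjacent. Without positions, each candidate document would have to be opened and checked.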

fast phrase searches

Speeds up phrase searches by eliminating some false positive results.

fast reverse searches

Speeds up reverse query searches by indexing saved queries.

triple index

Enables the RDF triple index to support SPARQL execution over RDF triples. When this parameter is true, sem:sparql() can be used, but document loading is slower and the database files are larger.

Note

This feature requires a valid semantics license key.

triple positions

Specifies whether to index positional data to speed up the performance of proximity queries that use cts:triple-range-query().

fast case sensitive searches

Speeds up case sensitive searches by eliminating some false positive results.

fast diacritic sensitive searches

Speeds up diacritic-sensitive searches by eliminating some false positive results.

fast element word searches

Speeds up element-word searches by eliminating some false positive results.

element word positions

Index element word positions for faster element-based phrase and cts:near-query searches.

fast element phrase searches

Speeds up element phrase searches by eliminating some false positive results.

element value positions

Index element value positions for faster element-based phrase and cts:near-query searches that use cts:element-value-query().

attribute value positions

Index attribute value positions for faster attribute-based phrase and cts:near-query searches that use cts:element-attribute-value-query(), and for faster cts:element-query searches that use cts:element-attribute-*-query constructors.

field value searches

Enables searches that use cts:field-value-query.

field value positions

Enables positions for searches that use cts:field-value-query.

three character searches

Enables wildcard searches where the search pattern contains three or more consecutive non-wildcard characters (for example, abc*x, *abc, a?bcd). When combined with a codepoint word lexicon, speeds the performance of any wildcard search (including searches with fewer than three consecutive non-wildcard characters). MarkLogic recommends combining the three character search index with a codepoint collation word lexicon. For more details about wildcard searches, see Understanding and Using Wildcard Searches in the Search Developer’s Guide.
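The idea behind a three-character index can be sketched as a trigram lookup followed by a filtering pass. This is a toy model under stated assumptions, not MarkLogic's implementation; `trigrams`, `build_index`, and `wildcard_search` are hypothetical names, and the plain word list stands in loosely for a word lexicon.

```python
import fnmatch
import re
from collections import defaultdict

def trigrams(text):
    """Return every run of three consecutive characters in text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}

def build_index(words):
    """Map each trigram to the set of words that contain it."""
    index = defaultdict(set)
    for word in words:
        for gram in trigrams(word):
            index[gram].add(word)
    return index

def wildcard_search(index, words, pattern):
    """Resolve a * / ? pattern using trigram lookups, then filter."""
    grams = set()
    for run in re.split(r"[*?]", pattern):
        grams |= trigrams(run)  # runs shorter than three add nothing
    if grams:
        # Candidates must contain every trigram found in the pattern.
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    else:
        candidates = set(words)  # no usable trigram: scan every word
    # Final pass removes false positives such as "xabc" for "abc*".
    return sorted(w for w in candidates if fnmatch.fnmatchcase(w, pattern))

words = ["abcde", "xabc", "abd", "zabcx"]
index = build_index(words)
print(wildcard_search(index, words, "abc*"))
print(wildcard_search(index, words, "*abc"))
print(wildcard_search(index, words, "ab*"))
```

The pattern ab* has no three-character run, so the sketch falls back to scanning the full word list. This mirrors, in a simplified way, why a word lexicon helps with shorter patterns.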

three character word positions

Index word positions for three-character wildcard queries.

fast element character searches

Enables wildcard searches and speeds up element-based wildcard searches. For more details about wildcard searches, see Understanding and Using Wildcard Searches in the Search Developer’s Guide.

trailing wildcard searches

Speeds up wildcard searches with the wildcard at the end of the search pattern (for example, abc*). For more details about wildcard searches, see Understanding and Using Wildcard Searches in the Search Developer’s Guide.

trailing wildcard word positions

Index word positions for trailing wildcard searches.

fast element trailing wildcard searches

Speeds up wildcard searches with the wildcard at the end of the search pattern within a specific element, at the cost of slower document loads and larger database files.

word lexicon

Maintains a lexicon of all of the words in a database, with uniqueness determined by a specified collation. Additionally, works in combination with the three character search index to speed wildcard searches. For more details about wildcard searches, see Understanding and Using Wildcard Searches in the Search Developer’s Guide.

two character searches

Enables wildcard searches where the search pattern contains two or more consecutive non-wildcard characters (for example, ab*). This index is not needed if you have three character searches and a word lexicon. For more details about wildcard searches, see Understanding and Using Wildcard Searches in the Search Developer’s Guide.

one character searches

Enables wildcard searches where the search pattern contains a single non-wildcard character (for example, a*). This index is not needed if you have three character searches and a word lexicon. For more details about wildcard searches, see Understanding and Using Wildcard Searches in the Search Developer’s Guide.

uri lexicon

Maintains a lexicon of all of the URIs used in a database. The URI lexicon speeds up queries that constrain on URIs. It is like a range index of all of the URIs in the database. To access values from the URI lexicon, use the cts:uris or cts:uri-match APIs.

collection lexicon

Maintains a lexicon of all of the collection URIs used in a database. The collection lexicon speeds up queries that constrain on collections. It is like a range index of all of the collection URIs in the database. To access values from the collection lexicon, use the cts:collections or cts:collection-match APIs.
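A lexicon that behaves like a range index can be pictured as a sorted list of unique values, which makes prefix matching a binary search plus a short scan. The sketch below is a conceptual model only; the `Lexicon` class and the sample URIs are hypothetical and are not the cts:uris or cts:uri-match APIs.

```python
import bisect

class Lexicon:
    """Sorted, de-duplicated list of values (URIs or collection URIs)."""

    def __init__(self, values):
        self.values = sorted(set(values))

    def prefix_match(self, prefix):
        """Return all values that start with prefix, in sorted order."""
        # Values sharing a prefix are contiguous in sorted order,
        # beginning at the first value >= prefix.
        start = bisect.bisect_left(self.values, prefix)
        out = []
        for value in self.values[start:]:
            if not value.startswith(prefix):
                break
            out.append(value)
        return out

uris = Lexicon(["/people/a.json", "/orders/2.xml", "/orders/1.xml"])
print(uris.prefix_match("/orders/"))
```

Because the values are kept sorted, the lookup never touches the documents themselves. This is the property that lets a lexicon speed up queries that constrain on URIs or collections.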