Administrating MarkLogic Server

Choosing a Fragmentation Strategy

Proper fragmentation is important to performance. Before you specify how to fragment the XML data being loaded, you need to plan your fragmentation strategy. Apply the following guidelines:

  • Fragments are described generically using XML element names.

  • Fragments for XML documents should generally be between 10 KB and 100 KB in size. This is only a general guideline: larger or smaller fragments can work fine in some situations, and the performance of a given fragment size depends on many factors, including disk block size, the number of fragments in the database, how often fragments are accessed, and the types of queries used in the application.

  • Fragments can be (and in many cases, should be) nested hierarchically.

  • Smaller fragment sizes allow more efficient element-level updates in the database, but excessively small fragments can slow down both loading speed and query performance.

  • Excessively large fragments can also slow down query performance, because resolving queries then requires loading more data from disk than necessary.

  • In general, within the size range given above, larger fragment sizes deliver higher overall performance than smaller fragment sizes.

  • Text and small binary documents must fit in a single fragment. Therefore, set the “in memory tree size” database parameter to 1 to 2 MB larger than your largest text or small binary file. The maximum size of a small binary file is always constrained by the “large size threshold” database configuration setting.
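
The sizing rule in the last guideline is simple arithmetic, and can be sketched as follows. This is an illustration only, not a MarkLogic API: the function name and the rounding-up step are assumptions, and it presumes the parameter is expressed in whole megabytes.

```python
import math

MB = 1024 * 1024  # bytes per megabyte

def suggested_in_memory_tree_size_mb(largest_file_bytes, headroom_mb=2):
    """Hypothetical helper: round the largest text or small binary file up
    to a whole number of MB, then add the 1 to 2 MB of headroom that the
    guideline calls for."""
    if headroom_mb not in (1, 2):
        raise ValueError("the guideline suggests 1 to 2 MB of headroom")
    return math.ceil(largest_file_bytes / MB) + headroom_mb

if __name__ == "__main__":
    # A 5 MB largest file with 2 MB of headroom suggests a setting of 7.
    print(suggested_in_memory_tree_size_mb(5 * MB))
```

Remember that the largest small binary file can never exceed the “large size threshold” setting, so that threshold is a natural upper bound for the input when sizing for binaries.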

After you decide how to fragment your data, you can use the Fragment Roots or Fragment Parents method.

Both methods turn your fragmentation strategy into concrete rules for the system.
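
Before committing to a rule, it can help to check a candidate element name against a sample document. The sketch below is a rough offline estimate, not part of MarkLogic: it assumes each element with the given name would become its own fragment, and it uses the serialized XML size as a stand-in for the real fragment size, which the server computes differently. The function names and thresholds are illustrative, with the thresholds taken from the 10 KB to 100 KB guideline above.

```python
import xml.etree.ElementTree as ET

LOW = 10 * 1024    # lower guideline bound, in bytes
HIGH = 100 * 1024  # upper guideline bound, in bytes

def candidate_fragment_sizes(xml_text, element_name):
    """Approximate size in bytes of every element named element_name.
    For namespaced documents, pass the tag as '{namespace-uri}local-name'."""
    root = ET.fromstring(xml_text)
    return [
        len(ET.tostring(el, encoding="unicode").encode("utf-8"))
        for el in root.iter(element_name)
    ]

def summarize(sizes):
    """Count candidate fragments below, within, and above the guideline."""
    return {
        "count": len(sizes),
        "too_small": sum(1 for s in sizes if s < LOW),
        "in_range": sum(1 for s in sizes if LOW <= s <= HIGH),
        "too_large": sum(1 for s in sizes if s > HIGH),
    }

if __name__ == "__main__":
    sample = (
        "<book>"
        "<chapter>" + "a" * 100 + "</chapter>"
        "<chapter>" + "b" * 20000 + "</chapter>"
        "<chapter>" + "c" * 200000 + "</chapter>"
        "</book>"
    )
    print(summarize(candidate_fragment_sizes(sample, "chapter")))
```

If most candidates fall outside the range, a different element name, or a nested combination of names, may be a better basis for the rule.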