
Using MarkLogic Content Pump (mlcp)

How Assignment Policy Affects Optimization

This section describes how your choice of document assignment policy can introduce additional limitations and risks when you use -fastload. Assignment policy is a database configuration setting that determines which forest MarkLogic Server inserts a document into, or moves a document into during rebalancing. For details, see Rebalancer Document Assignment Policies in Administrating MarkLogic Server.

Note

Assignment policy was introduced with MarkLogic 7 and mlcp v1.2. If you use an earlier version of mlcp with MarkLogic 7 or later, the database you import data into with -fastload or -output_directory must be using the legacy assignment policy.

The following table summarizes the limitations imposed by each assignment policy. If you do not explicitly set assignment policy, the default is Legacy or Bucket.

Assignment Policy

Notes

Legacy (default) and Bucket

You can safely use -fastload if:

  • there are no pre-existing documents in the database with the same URIs; or

  • you use -output_directory; or

  • the URIs may be in use, but the forest topology has not changed since the documents were created, and the documents were not initially inserted using user-specified forest placement.
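The three conditions above are alternatives: any one of them is enough. A minimal sketch of that rule follows. The `ImportContext` type and its field names are hypothetical, introduced only for illustration; mlcp does not expose such a check, and the caller would have to establish each fact independently.

```python
from dataclasses import dataclass


@dataclass
class ImportContext:
    """Facts about the target database, supplied by the caller (hypothetical type)."""
    uris_already_exist: bool      # some target URI is already present in the database
    uses_output_directory: bool   # the job passes -output_directory
    topology_unchanged: bool      # forest topology is the same as when the documents were created
    used_forest_placement: bool   # documents were first inserted with user-specified forest placement


def fastload_is_safe(ctx: ImportContext) -> bool:
    """Return True if at least one of the three listed conditions holds."""
    if not ctx.uris_already_exist:
        return True
    if ctx.uses_output_directory:
        return True
    # URIs may be in use: safe only with unchanged topology and no forest placement.
    return ctx.topology_unchanged and not ctx.used_forest_placement
```

For example, `fastload_is_safe(ImportContext(True, False, True, True))` is False, because the existing documents were placed in user-specified forests.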

Statistical

You can only use -fastload to create new documents; updates are not supported. You should use -output_directory to ensure there are no updates.

All documents in a batch are inserted into the same forest. The rebalancer may subsequently move the documents if the batch size is large enough to cause the forest to become unbalanced.

If you set -fastload to true and mlcp determines at the start of a job that the database is rebalancing or needs to be rebalanced, an error occurs.
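As an illustration of a create-only fast load, the sketch below assembles an mlcp import command line that combines -fastload with -output_directory. The host, port, and paths are placeholders, the helper name is hypothetical, and authentication options are omitted; the command is only printed, not run.

```python
import shlex
from typing import List


def statistical_import_args(host: str, port: int, input_path: str,
                            new_directory: str) -> List[str]:
    """Assemble an mlcp import command for a create-only fast load.

    All argument values are placeholders supplied by the caller.
    """
    return [
        "mlcp.sh", "import",
        "-host", host,
        "-port", str(port),
        "-input_file_path", input_path,
        "-fastload", "true",
        # Recommended in the text above to ensure the job performs no updates.
        "-output_directory", new_directory,
    ]


args = statistical_import_args("localhost", 8000, "/data/in", "/imports/batch-001/")
print(shlex.join(args))
```

Building the arguments as a list rather than a single string keeps each option and value separate, which avoids quoting problems if a path contains spaces.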

Range

You can only use -fastload to create new documents; updates are not supported. You should use -output_directory to ensure there are no updates.

You should use -output_partition to tell mlcp which partition to insert documents into. The partition you specify is used even if it is not the correct partition according to your configured partition policy.

You can only use -fastload optimizations with range policy if you are licensed for Tiered Storage.

If you set -fastload to true and mlcp determines at the start of a job that the database is rebalancing or needs to be rebalanced, an error occurs.
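Because the partition passed to -output_partition is used as given, the caller is responsible for choosing the right one. The sketch below shows one way to do that before assembling the options. The partition names, the integer partition key, and the simplified lower-bound-only layout are assumptions made for illustration and do not reflect any particular database configuration.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical layout: (inclusive lower bound of the partition key, partition name),
# sorted by lower bound. Replace with the partitions defined on your database.
PARTITIONS: List[Tuple[int, str]] = [(2010, "p2010s"), (2020, "p2020s")]


def partition_for(key: int) -> str:
    """Return the partition whose lower bound is the largest one not above key."""
    index = bisect_right([bound for bound, _ in PARTITIONS], key) - 1
    if index < 0:
        raise ValueError(f"no partition covers key {key}")
    return PARTITIONS[index][1]


def range_fastload_options(key: int, new_directory: str) -> List[str]:
    """Policy-specific mlcp options for a create-only fast load into one partition."""
    return [
        "-fastload", "true",
        "-output_directory", new_directory,
        "-output_partition", partition_for(key),
    ]
```

Since one job targets one partition in this sketch, input that spans several partitions would need to be split into separate jobs by the caller.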

Query

You can only use -fastload to create new documents; updates are not supported. You should use -output_directory to ensure there are no updates.

You should use -output_partition to tell mlcp which partition to insert documents into. The partition you specify is used even if it is not the correct partition according to your configured partition policy.

You can only use -fastload optimizations with query policy if you are licensed for Tiered Storage.

If you set -fastload to true and mlcp determines at the start of a job that the database is rebalancing or needs to be rebalanced, an error occurs.
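The table can also be condensed into a small lookup structure, for example as part of a pre-flight check in a load script. This is a sketch only: the `FastloadLimits` type and its field names are hypothetical, and the Legacy and Bucket entry does not capture the conditional rules about pre-existing URIs listed in that row.

```python
from typing import Dict, NamedTuple


class FastloadLimits(NamedTuple):
    create_only: bool           # updates are not supported with -fastload
    needs_partition: bool       # -output_partition should be specified
    needs_tiered_storage: bool  # a Tiered Storage license is required
    rebalance_error: bool       # job errors if rebalancing is active or pending at start


# Legacy and Bucket: none of the four limits is listed, but the conditional
# rules about pre-existing URIs still apply and are not modeled here.
_CONDITIONAL = FastloadLimits(False, False, False, False)
_PARTITIONED = FastloadLimits(True, True, True, True)

LIMITS: Dict[str, FastloadLimits] = {
    "legacy": _CONDITIONAL,
    "bucket": _CONDITIONAL,
    "statistical": FastloadLimits(True, False, False, True),
    "range": _PARTITIONED,
    "query": _PARTITIONED,
}


def limits_for(policy: str) -> FastloadLimits:
    """Look up the limits for an assignment policy name, ignoring case."""
    return LIMITS[policy.lower()]
```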