
Using MarkLogic Content Pump (mlcp)

Extract Command Line Options

This section summarizes the command line options available with the mlcp extract command. An extract command requires the -input_file_path and -output_file_path options. That is, an extract command has the following form:

mlcp.sh extract -input_file_path forest-path \
    -output_file_path dest-path ...
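As a concrete illustration of the form above, the following sketch assembles an extract command that also applies some of the filter options described in the table later in this section. The forest path, destination path, collection URI, and directory name are hypothetical placeholders, and the command is printed rather than executed so it can be reviewed first.

```shell
# Hypothetical locations; substitute your own forest and destination.
FOREST_PATH="/data/forests/example-forest"
DEST_PATH="${TMPDIR:-/tmp}/mlcp-extract-out.$$"

# The destination directory must not already exist, so check before going further.
if [ -e "$DEST_PATH" ]; then
  echo "destination already exists: $DEST_PATH" >&2
  exit 1
fi

# Required options first, then optional filters; all option names come from this section.
set -- extract \
  -input_file_path "$FOREST_PATH" \
  -output_file_path "$DEST_PATH" \
  -collection_filter "http://example.com/reports" \
  -directory_filter "/reports/" \
  -type_filter "xml,text" \
  -compress true

# Print the assembled command; remove "echo" to actually run mlcp.
echo mlcp.sh "$@"
```

Because the filter options can be combined, this sketch passes all three together; how mlcp combines them is not spelled out in this section.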

The following table lists command line options that define the characteristics of the extraction:

Option

Description

-collection_filter comma-list

A comma-separated list of collection URIs. mlcp extracts only documents in these collections. This option can be combined with other filter options. Default: All documents.

-compress boolean

Whether to compress the output. mlcp might generate multiple compressed files. Default: false.

-directory_filter comma-list

A comma-separated list of database directory names. mlcp extracts only documents from these directories, plus related metadata. Directory names should usually end with “/”. This option can be combined with other filter options. Default: All documents and related metadata.

-max_split_size number

The maximum number of document fragments processed per split. Default: 50000.

-mode string

Export mode. Accepted values: local.

-options_file string

Specify an options file pathname from which to read additional command line options. If you use an options file, this option must appear first. For details, see Options File Syntax.

-output_file_path string

Destination directory where the documents are saved. The directory must not already exist.

-thread_count number

The number of threads to spawn for concurrent exporting. The total number of threads spawned by the process can be larger than this number, but this option caps the number of concurrent sessions with MarkLogic Server. Only available in local mode. Default: 4.

-type_filter comma-list

A comma-separated list of document types. mlcp extracts only documents with these types. This option can be combined with other filter options. Allowed document types: xml, text, binary. Default: All documents.
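The -options_file entry in the table above can be illustrated with a short sketch. It assumes the format covered in Options File Syntax, with each option and each value on its own line and lines beginning with # treated as comments; confirm against that section before relying on it. The file name, forest path, destination path, and collection URI are hypothetical, and the command is printed rather than run.

```shell
# Hypothetical options file location.
OPTS_FILE="${TMPDIR:-/tmp}/extract-options.$$.txt"

# Assumed format: one option or one value per line; '#' starts a comment line.
cat > "$OPTS_FILE" <<'EOF'
# Filters for the extraction
-collection_filter
http://example.com/reports
-type_filter
xml,text
-compress
true
EOF

# -options_file appears first among the options; the required paths follow.
echo mlcp.sh extract -options_file "$OPTS_FILE" \
  -input_file_path "/data/forests/example-forest" \
  -output_file_path "${TMPDIR:-/tmp}/mlcp-extract-out.$$"
```

Keeping the filters in a file makes a long extract command easier to review and reuse, while the two required paths stay visible on the command line.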