Import Command Line Options
This section summarizes the command line options available with the mlcp import
command. The following command line options define your connection to MarkLogic:
Option |
Description |
---|---|
|
Required. A comma-separated list of hosts through which mlcp can connect to the destination MarkLogic Server. You must specify at least one host. For more details, see How mlcp Uses the Host List. |
|
Password for the MarkLogic Server user specified with |
|
Port number of the destination MarkLogic Server. There should be an XDBC App Server on this port. Default: 8000. |
|
MarkLogic Server user with which to import documents. Required, unless using Kerberos authentication. |
The following table lists command line options that define the characteristics of the import operation:
Option |
Description |
---|---|
|
When splitting an aggregate input file into multiple documents, the name of the element to use as the output document root. Default: The first child element under the root element. |
|
The namespace of the element specified by |
|
Deprecated. Use When splitting an aggregate input file into multiple documents, the element or attribute name within the document root to use as the document URI. Default: In local mode, |
|
A user API key, unique to each MarkLogic Cloud user, used to obtain a session token. Required along with |
|
When importing documents from a database archive, whether or not to ignore missing metadata files. If this is |
|
A base URL that maps to a port on the destination MarkLogic server when connecting through a reverse proxy. |
|
The number of documents to process in a single request to MarkLogic Server. Default: 100. Maximum: 200. |
|
A comma-separated list of collection URIs. Only usable with |
|
The character encoding of input documents when |
|
When importing documents from an archive, whether to copy document collections from the source archive to the destination. Only applies when |
|
When importing documents from an archive, whether to copy document key-value metadata from the source archive to the destination. Only applies when |
|
When importing documents from an archive, whether to copy document permissions from the source archive to the destination. Only applies with |
|
When importing documents from an archive, whether to copy document properties from the source archive to the destination. Only applies with |
|
When importing documents from an archive, whether to copy document quality from the source archive to the destination. Only applies when |
|
The name of the destination database. Default: The database associated with the destination App Server identified by |
|
When importing content with |
|
When importing content with - |
|
Deprecated. Use When importing content - |
|
When importing content with - |
|
A comma-separated list of database directory names. Only usable with |
|
The type of document to create when |
|
Whether or not to force optimal performance, even at the risk of creating duplicate document URIs. See Time vs. Correctness: Understanding -fastload Tradeoffs. Default: |
|
Add each loaded document to a collection corresponding to the name of the input file. You cannot use this option when |
|
When importing content with - |
|
Whether or not the source data is compressed. Default: false. |
|
When |
|
A regular expression describing the filesystem location(s) to use for input. For details, see Regular Expression Syntax. |
|
Load only input files that match this regular expression from the path(s) matched by |
|
The input file type. Accepted value: |
|
Password to a Java KeyStore containing the User Private Key(s) and Certificate(s). If available, mlcp selects the first available certificate from the KeyStore that satisfies the TLS Certificate Request from the MarkLogic Server. Can be passed along with the existing |
|
Path to a Java KeyStore containing the User Private Key(s) and Certificate(s). If available, mlcp selects the first available certificate from the KeyStore that satisfies the TLS Certificate Request from the MarkLogic Server. Can be passed along with the existing |
|
When importing from files, the maximum number of bytes in one input split. Default: The maximum Long value ( |
|
The maximum percentage (integer between 0 and 100) of available server threads used by mlcp for import jobs. Default: 100. |
|
The maximum number of threads that mlcp runs. This option is optional. |
|
When importing from files, the minimum number of bytes in one input split. Default: 0. |
|
Ingestion mode. Accepted values: |
|
The modules root path to use when applying a server-side transformation. Default: The modules root configured for the App Server. If you also use |
|
Specify the name of the modules database to use when applying a server-side transformation. Accepted values: |
|
The default namespace for all XML documents created during loading. |
|
Specify an options file pathname from which to read additional command line options. If you use an options file, this option must appear first. For details, see Options File Syntax. |
|
Whether or not to delete all content in the output database directory prior to loading. Default: |
|
A comma-separated list of collection URIs. Loaded documents are added to these collections. |
|
The destination database directory in which to create the loaded documents. If the directory exists, its contents are removed prior to ingesting new documents. Using this option enables |
|
Only usable with |
|
The |
|
Only usable with |
|
The name of the database partition in which to create documents. For details, see How Assignment Policy Affects Optimization, and Range Partitions or Query Partitions in Administrating MarkLogic Server. |
|
A comma-separated list of ( |
|
The quality of loaded documents. Default: 0. |
|
Specify a prefix to prepend to the default URI. Used to construct output document URIs. For details, see Controlling Database URIs During Ingestion. |
|
A comma-separated list of ( |
|
Specify a suffix to append to the default URI. Used to construct output document URIs. For details, see Controlling Database URIs During Ingestion. |
|
The initial delay (in minutes) before mlcp starts sending polling requests to check the available server threads. Default: 1. |
|
The interval (in minutes) at which mlcp sends polling requests to check the currently available server threads. Default: 1. |
|
Restrict mlcp to connect to MarkLogic only through the hosts listed in the |
|
Whether or not to divide input data into logical chunks to support more concurrency. Only supported when |
|
Enable/disable SSL secured communication with MarkLogic. Default: false. If you set this option to true, your App Server must be SSL enabled. For details, see Connecting to MarkLogic Using SSL. |
|
Specify the protocol that mlcp should use when creating an SSL connection to MarkLogic. You must include this option if you use the |
|
Whether or not to stream documents to MarkLogic Server. Applies only when |
|
The temporal collection into which the temporal documents are to be loaded. For details on loading temporal documents into MarkLogic, see Using MarkLogic Content Pump (mlcp) to Load Temporal Documents in the Temporal Developer’s Guide. |
|
The number of threads to spawn for concurrent loading. Prior to 10.0-4.2, the default thread count was 4. mlcp now conducts an initial poll to identify the available server threads on the port that handles mlcp requests, and uses that value as the default thread count. Users can override it by specifying |
|
The maximum number of threads that can be assigned to each split. If you specify The total thread count, however, is controlled by the newly calculated thread count or |
|
NOTE: This option is deprecated, ignored, and will be removed in a future release. mlcp always behaves as if Applicable only when |
|
The number of requests to MarkLogic Server per transaction. Default: 1. Maximum: 4000/actualBatchSize. |
|
The local name of a custom content transformation function installed on MarkLogic Server. Ignored if |
|
The path in the modules database or modules directory of a custom content transformation function installed on MarkLogic Server. This option is required to enable a custom transformation. For details, see Transforming Content During Ingestion. |
|
The namespace URI of the custom content transformation function named by |
|
Optional extra data to pass through to a custom transformation function. Ignored if |
|
Password to a Java TrustStore containing any necessary CA Certificates needed to verify the TLS Server Authentication connection. If no TrustStore is provided the default TrustStore used by the existing Can be passed along with the existing |
|
Path to a Java TrustStore containing any necessary CA Certificates needed to verify the TLS Server Authentication connection. If no TrustStore is provided the default TrustStore used by the existing Can be passed along with the existing |
|
A comma-separated list of document types. Only usable with |
|
Specify a field, XML element name, or JSON property name to use as the basis of the output document URIs when importing delimited text, aggregate XML, or line-delimited JSON data. With With - |
|
The degree of repair to attempt on XML documents in order to create well-formed XML. Accepted values: |
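
The batch size and transaction size limits listed in the table interact: the transaction size maximum is given as 4000/actualBatchSize, and the batch size has a default of 100 and a maximum of 200. The following Python sketch works through that arithmetic. It is an illustration only, not mlcp code: the function and constant names are hypothetical, and both the clamping of an oversized batch size and the use of floor division are assumptions, since the table does not say how mlcp handles either case.

```python
# Limits quoted from the option table above; all names here are hypothetical.
DEFAULT_BATCH_SIZE = 100          # documents per request (documented default)
MAX_BATCH_SIZE = 200              # documented maximum
MAX_DOCS_PER_TRANSACTION = 4000   # numerator of "4000/actualBatchSize"


def actual_batch_size(requested=DEFAULT_BATCH_SIZE):
    """Clamp a requested batch size into 1..MAX_BATCH_SIZE (assumed behavior)."""
    return max(1, min(requested, MAX_BATCH_SIZE))


def max_transaction_size(requested_batch_size=DEFAULT_BATCH_SIZE):
    """Largest requests-per-transaction value: 4000 / actualBatchSize.

    Floor division is an assumption; the table does not describe rounding.
    """
    return MAX_DOCS_PER_TRANSACTION // actual_batch_size(requested_batch_size)


if __name__ == "__main__":
    # Print requested batch size, effective batch size, and transaction limit.
    for batch in (1, 50, 100, 200, 500):
        print(batch, actual_batch_size(batch), max_transaction_size(batch))
```

Under these assumptions, the default batch size of 100 allows at most 40 requests per transaction, and the maximum batch size of 200 allows at most 20.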
Running concurrent mlcp jobs is not recommended. Regardless of version, mlcp does not support concurrent jobs that import from or export to the same data file. In addition, beginning in 10.0-4.2, each mlcp job uses the maximum number of threads available on the server as its default thread count (see the 10.0-4.2 release notes for details). Concurrent mlcp jobs therefore do not improve performance, because a single job already uses the full concurrent capacity.