You can use the mlcp version command to generate a report of key software versions mlcp detects in your runtime environment. This is useful for confirming your path and other environment settings create the environment you expect or mlcp requires.
$ mlcp.sh version
ContentPump version: 8.0
Java version: 1.7.0_45
Hadoop version: 2.6.0
Supported MarkLogic versions: 6.0 - 8.0
Note that not all features of mlcp are supported by all versions of MarkLogic, even within the reported range of supported versions. For example, if MarkLogic version X introduces a new feature that is supported by mlcp, that doesn't mean you can use mlcp to work with the feature in MarkLogic version X-1.
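If you want to check the version report from a script, the following Python sketch parses the report text and tests a MarkLogic version against the reported range. It is an illustration only: the function names are hypothetical, the sample text is copied from the report above, and the comparison assumes plain dotted numeric versions such as 7.0. Passing the check means only that the version is within the reported range; it does not guarantee that every mlcp feature is available.

```python
# Sample text copied from the version report shown above.
SAMPLE_REPORT = """\
ContentPump version: 8.0
Java version: 1.7.0_45
Hadoop version: 2.6.0
Supported MarkLogic versions: 6.0 - 8.0
"""


def parse_version_report(text):
    """Turn 'Label: value' lines into a dict keyed by label (hypothetical helper)."""
    report = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            report[label.strip()] = value.strip()
    return report


def version_tuple(version):
    """Convert '8.0' into (8, 0) so versions compare numerically.

    Assumes plain dotted numeric versions; other formats raise ValueError.
    """
    return tuple(int(part) for part in version.split("."))


def in_supported_range(report, marklogic_version):
    """Check a MarkLogic version against the reported supported range."""
    low, high = (
        v.strip() for v in report["Supported MarkLogic versions"].split("-")
    )
    return (
        version_tuple(low)
        <= version_tuple(marklogic_version)
        <= version_tuple(high)
    )


report = parse_version_report(SAMPLE_REPORT)
print(report["Java version"])
print(in_supported_range(report, "7.0"))
```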
All mlcp command lines include host and port information for connecting to MarkLogic Server. This host must be reachable from the host where you run mlcp. In distributed mode, this host must also be reachable from all the nodes in your Hadoop cluster.
In addition, mlcp connects directly to hosts in your MarkLogic Server cluster that contain forests of the target database. Therefore, all the hosts that serve a target database must be reachable from the host where mlcp runs (local mode) or the nodes in your Hadoop cluster (distributed mode).
mlcp gets the list of participating hosts by querying your MarkLogic Server cluster configuration. If a hostname returned by this query is not resolvable, mlcp will not be able to connect, which can prevent document loading.
If you think you might have connection issues, enable debug level logging to see details on name resolution and connection failures. For details, see Enabling Debug Level Messages.
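Before turning on debug logging, you may find it quicker to confirm name resolution and basic reachability yourself from the host where mlcp runs. The following Python sketch is one way to do that with the standard socket module. The function names are hypothetical, and you must supply the host names and port from your own cluster configuration. A successful TCP connection shows only that the port is reachable; it does not prove that mlcp can authenticate or load documents.

```python
import socket


def unresolvable_hosts(hostnames, resolve=socket.getaddrinfo):
    """Return the hostnames that fail name resolution on this machine.

    The resolve parameter exists so the lookup can be replaced in tests.
    """
    failed = []
    for name in hostnames:
        try:
            resolve(name, None)
        except socket.gaierror:
            # Name lookup failed: mlcp could not connect to this host either.
            failed.append(name)
    return failed


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False
```

Run `unresolvable_hosts` with every host name that serves forests of the target database, then run `can_connect` for each host and the port you pass to mlcp. In distributed mode, repeat the check from a node in your Hadoop cluster.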
You can enable debug level log messages to see detailed debugging information about what mlcp is doing. Debug logging generates many messages, so you should not enable it unless you need it to troubleshoot a problem.
/conf/log4j.properties. For example, if mlcp is installed in
log4j.properties, set the properties
DEBUG. For example, include the following:
documents, and the document type is set to (or determined to be) XML, but the input file fails to parse properly as XML. Correct the error in the input data and try again.
-input_file_path to a location containing compressed files, but you do not set
-input_compression_codec. In this case, mlcp will load the compressed files as binary documents, rather than creating documents from the contents of the compressed files.
-document_type to a value inconsistent with the input data referenced by
ATTEMPTED_INPUT_RECORD_COUNT is non-zero and
SKIPPED_INPUT_RECORD_COUNT is non-zero, then there are probably formatting errors in your input that mlcp detected on the client. Correct the input errors and try again. For example:
If mlcp reports an ATTEMPTED_INPUT_RECORD_COUNT of 0, then the tool found no input documents meeting your requirements. If there are errors or warnings, correct them and try again. If there are no errors, then the combination of options on your command line probably does not select any suitable documents. For example:
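If you capture mlcp output to a file, the counter checks described in this section can be applied by a script. The Python sketch below is a rough illustration under one assumption: each counter appears in the output as its name followed by a colon and an integer. Verify that against your own output before relying on it. The function names and diagnostic messages are hypothetical.

```python
import re

# Assumed format: counter name, a colon, optional spaces, then an integer.
COUNTER_PATTERN = re.compile(
    r"\b(ATTEMPTED_INPUT_RECORD_COUNT|SKIPPED_INPUT_RECORD_COUNT):\s*(\d+)"
)


def read_counters(log_text):
    """Collect the two counters from captured mlcp output into a dict."""
    return {name: int(value) for name, value in COUNTER_PATTERN.findall(log_text)}


def diagnose(counters):
    """Map the counter values to the guidance given in this section."""
    attempted = counters.get("ATTEMPTED_INPUT_RECORD_COUNT")
    skipped = counters.get("SKIPPED_INPUT_RECORD_COUNT", 0)
    if attempted is None:
        return "attempted counter not found in the output"
    if attempted == 0:
        return "no input documents matched; review the command line options"
    if skipped > 0:
        return "records were skipped; check the input for formatting errors"
    return "no problem indicated by these two counters"
```

For example, `diagnose(read_counters(open("mlcp.log").read()))` returns one short message for a captured run, where `mlcp.log` is a hypothetical file name.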
Depending on your JVM version, you might see the message 'Unable to load realm info from SCDynamicStore' when using mlcp if your system has Kerberos installed and krb5.conf doesn't explicitly list the realm information. You can safely ignore this message.
XDMP_SPECIALPROP when importing documents from an archive is caused by attempting to update the 'last modified' document property that is maintained by MarkLogic on the destination database. To eliminate this error, choose one of the following solutions:
false on your import command line so that mlcp does not attempt to import any document properties.