Supported Input Format Summary

Use the -input_file_type option to tell mlcp the format of the data in each input file (or each entry inside a compressed file). This option controls whether and how mlcp converts the content into database documents.

The default input type is documents, which means each input file or ZIP file entry creates one database document. All other input file types are composite input formats that can yield multiple database documents per input file.

The following table provides a quick reference of the supported input file types, along with the allowed document types for each, and whether or not they can be passed to mlcp as compressed files.

`-input_file_type`	Document Type	`-input_compressed` permitted
documents	XML, JSON, text, or binary; controlled with `-document_type`.	Yes
archive	As in the database: XML, JSON, text, and/or binary documents, plus metadata. The type is not under user control.	No (archives are already in compressed format)
delimited_text	XML or JSON	Yes
delimited_json	JSON	Yes
sequencefile	XML, text, or binary; controlled with the `-input_sequencefile_value_class` and `-input_sequencefile_value_type` options.	No. However, the contents can be compressed when you create the sequence file. Compression is bound up with the value class you use to generate and import the file.
aggregates	XML	Yes
rdf	Serialized RDF triples, in one of several formats. For details, see Supported RDF Triple Formats in the Semantic Graph Developer’s Guide. RDF/JSON is not supported.	Yes
forest	As in the database: XML, JSON, text, and/or binary documents. The type is not under user control.	No
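The compression column of the table can be captured as a small lookup for checking option combinations before invoking mlcp. The following is only a sketch: the names `COMPRESSION_PERMITTED` and `check_input_options` are hypothetical helpers transcribed from the table, not part of mlcp.

```python
# Transcription of the "-input_compressed permitted" column of the table.
# This is an illustrative lookup, not an mlcp API.
COMPRESSION_PERMITTED = {
    "documents": True,
    "archive": False,        # archives are already in compressed format
    "delimited_text": True,
    "delimited_json": True,
    "sequencefile": False,   # compress the contents when creating the file instead
    "aggregates": True,
    "rdf": True,
    "forest": False,
}


def check_input_options(input_file_type: str, input_compressed: bool = False) -> None:
    """Raise ValueError if the combination contradicts the table above."""
    if input_file_type not in COMPRESSION_PERMITTED:
        raise ValueError(f"unknown input file type: {input_file_type!r}")
    if input_compressed and not COMPRESSION_PERMITTED[input_file_type]:
        raise ValueError(
            f"-input_compressed is not permitted with -input_file_type {input_file_type}"
        )
```

For example, `check_input_options("aggregates", input_compressed=True)` returns without error, while `check_input_options("archive", input_compressed=True)` raises `ValueError`.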

When the input file type is documents or sequencefile, you must consider both the input format (-input_file_type) and the output document format (-document_type). In addition, for some input formats, input can come from either compressed or uncompressed files (-input_compressed).

The -document_type option controls the database document format when -input_file_type is documents or sequencefile. MarkLogic Server supports text, JSON, XML, and binary documents. If the document type is not explicitly set with these input file types, mlcp uses the input file suffix to determine the type. For details, see How mlcp Determines Document Type.

Note

You cannot use mlcp to perform document conversions. Your input data should match the stated document type. For example, you cannot convert XML input into a JSON document just by setting -document_type json.
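As an illustration of how these three options fit together, the sketch below assembles the option portion of an mlcp import command line. The function `build_import_args` and its parameter names are hypothetical; connection options such as host and credentials are left out, and the set of document types is limited to the four named in the text above.

```python
from typing import List, Optional

# Document types named in this section; assumed here to be spelled in lower case,
# as in "-document_type json".
DOCUMENT_TYPES = {"xml", "json", "text", "binary"}

# Per the text, -document_type applies only to these input file types.
TYPES_USING_DOCUMENT_TYPE = {"documents", "sequencefile"}


def build_import_args(
    input_path: str,
    input_file_type: str = "documents",
    document_type: Optional[str] = None,
    input_compressed: bool = False,
) -> List[str]:
    """Return the argument list for an mlcp import (hypothetical helper)."""
    args = [
        "import",
        "-input_file_path", input_path,
        "-input_file_type", input_file_type,
    ]
    if document_type is not None:
        if input_file_type not in TYPES_USING_DOCUMENT_TYPE:
            raise ValueError(
                "-document_type only applies to documents or sequencefile input"
            )
        if document_type not in DOCUMENT_TYPES:
            raise ValueError(f"unsupported document type: {document_type!r}")
        # This declares the type of the input; it does not convert the content.
        args += ["-document_type", document_type]
    if input_compressed:
        args += ["-input_compressed", "true"]
    return args
```

Calling `build_import_args("/data/in.zip", document_type="json", input_compressed=True)` yields the list `["import", "-input_file_path", "/data/in.zip", "-input_file_type", "documents", "-document_type", "json", "-input_compressed", "true"]`, which could be appended to the mlcp executable and connection options. If `document_type` is omitted, no `-document_type` option is emitted and mlcp falls back to the input file suffix, as described above.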
