Supported Input Format Summary
Use the -input_file_type option to tell mlcp the format of the data in each input file (or each entry inside a compressed file). This option controls whether and how mlcp converts the content into database documents.
The default input type is documents, which means each input file or ZIP file entry creates one database document. All other input file types represent composite input formats, which can yield multiple database documents per input file.
The following table provides a quick reference of the supported input file types, along with the allowed document types for each, and whether or not they can be passed to mlcp as compressed files.
input_file_type | Document Type | Compressed Input Allowed |
---|---|---|
documents | XML, JSON, text, or binary; controlled with -document_type | Yes |
archive | As in the database: XML, JSON, text, and/or binary documents, plus metadata. The type is not under user control. | No (archives are already in compressed format) |
delimited_text | XML or JSON | Yes |
delimited_json | JSON | Yes |
sequencefile | XML, text or binary; controlled with these options: | No. However, the contents can be compressed when you create the sequence file. Compression is bound up with the value class you use to generate and import the file. |
aggregates | XML | Yes |
rdf | Serialized RDF triples, in one of several formats. For details, see Supported RDF Triple Formats in the Semantic Graph Developer’s Guide. RDF/JSON is not supported. | Yes |
forest | As in the database: XML, JSON, text, and/or binary documents. The type is not under user control. | No |
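The compressed-input column of the table can be captured as a small lookup. The Python sketch below is only an illustration: the names COMPRESSED_INPUT_ALLOWED and check_input_options are hypothetical and not part of mlcp, and the check mirrors the table rather than mlcp's own validation.

```python
# Whether each input file type may be passed to mlcp as compressed files,
# transcribed from the table above.
COMPRESSED_INPUT_ALLOWED = {
    "documents": True,
    "archive": False,        # archives are already in compressed format
    "delimited_text": True,
    "delimited_json": True,
    "sequencefile": False,   # compression is chosen when the sequence file is created
    "aggregates": True,
    "rdf": True,
    "forest": False,
}


def check_input_options(input_file_type, input_compressed=False):
    """Hypothetical pre-flight check: reject combinations the table rules out."""
    if input_file_type not in COMPRESSED_INPUT_ALLOWED:
        raise ValueError(f"unknown input_file_type: {input_file_type}")
    if input_compressed and not COMPRESSED_INPUT_ALLOWED[input_file_type]:
        raise ValueError(
            f"{input_file_type} input cannot be passed as compressed files"
        )
    return True


check_input_options("aggregates", input_compressed=True)  # accepted
```

A check like this could run before a script launches mlcp, so that an unsupported combination fails early with a readable message.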
When the input file type is documents or sequencefile, you must consider both the input format (-input_file_type) and the output document format (-document_type). In addition, for some input formats, input can come from either compressed or uncompressed files (-input_compressed).
The -document_type option controls the database document format when -input_file_type is documents or sequencefile. MarkLogic Server supports text, JSON, XML, and binary documents. If the document type is not explicitly set with these input file types, mlcp uses the input file suffix to determine the type. For details, see How mlcp Determines Document Type.
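To show how these options might sit together on a command line, the Python sketch below assembles an mlcp import invocation as an argument list and prints it instead of running it. The helper name build_import_args is hypothetical, and the host, port, credentials, and input path are placeholder assumptions. When document_type is left unset, the option is omitted so that mlcp falls back to the input file suffix.

```python
import shlex


def build_import_args(input_path, input_file_type="documents",
                      document_type=None, input_compressed=False):
    """Assemble an mlcp import argument list (hypothetical helper, not part of mlcp)."""
    args = [
        "mlcp.sh", "import",
        "-host", "localhost",      # placeholder connection details
        "-port", "8000",
        "-username", "user",
        "-password", "password",
        "-input_file_path", input_path,
        "-input_file_type", input_file_type,
    ]
    if document_type is not None:
        # Set the document type explicitly instead of relying on the file suffix.
        args += ["-document_type", document_type]
    if input_compressed:
        args += ["-input_compressed", "true"]
    return args


# Print the command for inspection; nothing is executed here.
print(shlex.join(build_import_args("/data/in.zip",
                                   document_type="xml",
                                   input_compressed=True)))
```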
Note
You cannot use mlcp to perform document conversions. Your input data should match the stated document type. For example, you cannot convert XML input into a JSON document just by setting -document_type json.
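Because the input must already match the stated document type, a script that drives mlcp may want a rough check before importing. The Python sketch below is one way to do that under simple assumptions: matches_document_type is a hypothetical helper, it only tests whether the content parses as JSON or XML, and it says nothing about how mlcp itself validates input.

```python
import json
import xml.etree.ElementTree as ET


def matches_document_type(content, document_type):
    """Roughly check that raw bytes parse as the stated type (hypothetical helper)."""
    if document_type == "json":
        try:
            json.loads(content)
        except ValueError:       # covers JSON syntax and decoding errors
            return False
        return True
    if document_type == "xml":
        try:
            ET.fromstring(content)
        except ET.ParseError:
            return False
        return True
    # Text and binary content get no structural check in this sketch.
    return True


xml_input = b"<order><id>1</id></order>"
print(matches_document_type(xml_input, "xml"))   # parses as XML
print(matches_document_type(xml_input, "json"))  # XML input is not JSON
```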