Example: Generating Documents from a CSV File
When you import content from delimited text files, mlcp creates an XML or JSON document for each line of input after the initial header line.
The default document type is XML. To create JSON documents, use -document_type json.
When creating XML documents, each document has a root node of <root> and child elements with names corresponding to each column title. You can override the default root element name using the -delimited_root_name option; for details, see Customizing XML Output.
When creating JSON documents, each document is rooted at an unnamed object containing JSON properties with names corresponding to each column title. By default, every JSON property value is a string. Use -data_type to override this behavior; for details, see Controlling Data Type in JSON Output.
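The default JSON mapping described above can be approximated with a short Python sketch. This is only an illustration of the header-to-property mapping, not mlcp's implementation; the function name rows_to_json_docs and the extra born column are hypothetical and were added to show that numeric-looking values remain strings.

```python
import csv
import io
import json


def rows_to_json_docs(text):
    """Return one JSON document (as a string) per data line.

    The header line supplies the property names. Illustrative only;
    this is not how mlcp itself is implemented.
    """
    reader = csv.DictReader(io.StringIO(text))
    # The csv module yields every field as str, which mirrors the
    # all-strings default described in the text.
    return [json.dumps(dict(row)) for row in reader]


# Hypothetical sample data; the "born" column is not part of the original example.
sample = "first,last,born\ngeorge,washington,1732\nbetsy,ross,1752\n"
json_docs = rows_to_json_docs(sample)
for doc in json_docs:
    print(doc)
```

In this sketch the born values are emitted as "1732" and "1752" rather than as JSON numbers, which is the behavior that -data_type is meant to change in mlcp.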
For example, if you have the following data and mlcp command:
    # Windows users, see Modifying the Example Commands for Windows
    $ cat example.csv
    first,last
    george,washington
    betsy,ross
    $ mlcp.sh ... -mode local -input_file_path /space/mlcp/data \
        -input_file_type delimited_text ...
Then mlcp creates the XML output shown in the table below. To generate the JSON output, add -document_type json to the mlcp command line.
XML Output | JSON Output |
---|---|
<root> <first>george</first> <last>washington</last> </root> | { "first": "george", "last": "washington" } |
<root> <first>betsy</first> <last>ross</last> </root> | { "first": "betsy", "last": "ross" } |
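The XML column of the table can be reproduced with a minimal Python sketch of the same mapping. It is an approximation for illustration, not mlcp's code: the function rows_to_xml_docs and its root_name parameter are hypothetical stand-ins, with root_name playing the role of the -delimited_root_name option. The sketch assumes every column title is already a valid XML element name.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_to_xml_docs(text, root_name="root"):
    """Return one XML document (as a string) per data line.

    Each column title becomes a child element of the root element.
    root_name is a hypothetical analogue of -delimited_root_name.
    """
    reader = csv.DictReader(io.StringIO(text))
    docs = []
    for row in reader:
        root = ET.Element(root_name)
        for column, value in row.items():
            # One child element per column, holding the field value as text.
            ET.SubElement(root, column).text = value
        docs.append(ET.tostring(root, encoding="unicode"))
    return docs


sample = "first,last\ngeorge,washington\nbetsy,ross\n"
xml_docs = rows_to_xml_docs(sample)
renamed_docs = rows_to_xml_docs(sample, root_name="person")
for doc in xml_docs + renamed_docs:
    print(doc)
```

The serialized output is compact (no whitespace between elements), so it matches the table in structure rather than in layout. Passing root_name="person" changes only the outer element name, leaving the child elements untouched.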