Using MarkLogic Content Pump (mlcp)

Controlling Data Type in JSON Output

When mlcp creates JSON documents, every property value is a string by default. Use the -data_type option to specify explicit data types for some or all columns. The option accepts a comma-separated list of columnName,typeName pairs, where typeName is one of number, boolean, or string.
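To make the shape of the option value concrete, the following Python sketch splits a -data_type string into column/type pairs. It is an illustration only, not mlcp's parser: the function name parse_data_type is hypothetical, and trimming whitespace and rejecting unknown type names are assumptions about reasonable behavior.

```python
# Illustrative only: mirrors the documented shape of the -data_type value,
# not mlcp's actual implementation.
ALLOWED_TYPES = {"number", "boolean", "string"}


def parse_data_type(option_value):
    """Split 'col1,type1,col2,type2' into {'col1': 'type1', 'col2': 'type2'}."""
    # Assumption: surrounding whitespace in each item is not significant.
    parts = [part.strip() for part in option_value.split(",")]
    if len(parts) % 2 != 0:
        raise ValueError("expected columnName,typeName pairs")
    mapping = {}
    # Even positions hold column names, odd positions hold type names.
    for name, type_name in zip(parts[0::2], parts[1::2]):
        if type_name not in ALLOWED_TYPES:
            raise ValueError(f"unsupported type {type_name!r} for column {name!r}")
        mapping[name] = type_name
    return mapping


print(parse_data_type("price,number,in-stock,boolean"))
# prints {'price': 'number', 'in-stock': 'boolean'}
```

Columns that do not appear in the mapping are simply absent from it, which matches the documented rule that unlisted columns keep the default string type.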

For example, if you have an input file called “catalog.csv” that looks like the following:

id, price, in-stock
12345, 8.99, true
67890, 2.00, false

Then the default output documents look similar to the following. Notice that all the property values are strings.

{ "id": "12345",
  "price": "8.99",
  "in-stock": "true"
}

The following example command uses the -data_type option to make the “price” property a number value and the “in-stock” property a boolean value. Since the “id” field is not specified in the -data_type option, it remains a string.

$ mlcp.sh ... -mode local -input_file_path catalog.csv \
    -input_file_type delimited_text -document_type json \
    -data_type "price,number,in-stock,boolean"...

The output document for the first input row then looks similar to the following:

{ "id": "12345",
  "price": 8.99,
  "in-stock": true
}
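
The effect of the option can be approximated outside mlcp with a short Python sketch, which may be useful for previewing the expected JSON before running an import. Everything here is an assumption made for illustration: the names convert and rows_to_documents are hypothetical, numbers are parsed with float, and a boolean is true only when the text is "true" (case-insensitive). mlcp's own conversion rules may differ.

```python
import csv
import io
import json

# Sample input copied from catalog.csv above.
CATALOG_CSV = "id, price, in-stock\n12345, 8.99, true\n67890, 2.00, false\n"

# Equivalent of -data_type "price,number,in-stock,boolean".
COLUMN_TYPES = {"price": "number", "in-stock": "boolean"}


def convert(value, type_name):
    """Convert one CSV cell to the requested JSON type (assumed rules)."""
    if type_name == "number":
        return float(value)
    if type_name == "boolean":
        return value.lower() == "true"
    return value  # "string" is the default: keep the text as-is


def rows_to_documents(text, column_types):
    """Build one dict per CSV row, typing only the listed columns."""
    # skipinitialspace drops the blank that follows each comma in the sample.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return [
        {name: convert(value, column_types.get(name, "string"))
         for name, value in row.items()}
        for row in reader
    ]


for doc in rows_to_documents(CATALOG_CSV, COLUMN_TYPES):
    print(json.dumps(doc))
```

For the first row, this sketch produces the same property values as the mlcp output shown above: "id" stays a string, "price" becomes a number, and "in-stock" becomes a boolean.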