Using MarkLogic Content Pump (mlcp)

Controlling the Output Document URI

By default, each document URI is the value in the first column of its input record. For example, if your input data looks like the following:

first,last
george,washington
betsy,ross

Then importing this data with no URI-related options creates two documents, each named after its “first” value. The URIs are “george” and “betsy”.

Use -uri_id to choose a different column or -generate_uri to have MarkLogic Server automatically generate a unique URI for each document. For example, the following command creates the documents “washington” and “ross”:

# Windows users, see Modifying the Example Commands for Windows
$ mlcp.sh ... -mode local -input_file_path /space/mlcp/data \
    -input_file_type delimited_text -uri_id last

Note that URIs generated with -generate_uri are only guaranteed to be unique across your import operation. For details, see Default Document URI Construction.

You can further tailor the URIs using -output_uri_prefix and -output_uri_suffix. These options apply even when you use -generate_uri. For details, see Controlling Database URIs During Ingestion.
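
To get a feel for how a prefix and suffix combine with the URI ID column, the following is a minimal sketch that previews the likely URIs without running an import. It is not part of mlcp: preview_uris is a hypothetical helper, and the prefix /people/ and suffix .xml are assumed values chosen for illustration. It assumes simple comma-separated input with no quoted fields, and it only concatenates strings, so any additional URI transformation that mlcp applies is not reflected.

```shell
# preview_uris FILE COLUMN PREFIX SUFFIX
# Hypothetical helper (not an mlcp command): prints PREFIX + column value + SUFFIX
# for every data row. Falls back to the first column if COLUMN is not in the header.
preview_uris() {
    awk -F',' -v col="$2" -v prefix="$3" -v suffix="$4" '
        NR == 1 {
            # Find the URI ID column by its header name; default to column 1.
            idx = 1
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            next
        }
        NF > 0 { print prefix $(idx) suffix }
    ' "$1"
}

# Sample data from the example above, written to a temporary file.
sample=$(mktemp)
printf 'first,last\ngeorge,washington\nbetsy,ross\n' > "$sample"

# Preview the URIs for -uri_id last with an assumed prefix and suffix.
preview_uris "$sample" last /people/ .xml
```

With the sample data, this prints /people/washington.xml and /people/ross.xml, which is the shape of URI you would expect from combining -uri_id last with those prefix and suffix values.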

If your URI IDs are not unique, one document in your input set can overwrite another. Importing documents with non-unique URI IDs from multiple threads can also cause deadlocks.
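
One way to catch this before importing is to scan the URI ID column for repeated values. The sketch below is an illustration, not an mlcp feature: find_duplicate_ids is a hypothetical helper, and the "martha,washington" row is an invented addition to the sample data so that a duplicate exists. It assumes simple comma-separated input with no quoted fields.

```shell
# find_duplicate_ids FILE COLUMN
# Hypothetical helper (not an mlcp command): prints each value of COLUMN that
# occurs more than once. Falls back to the first column if COLUMN is not found.
find_duplicate_ids() {
    awk -F',' -v col="$2" '
        NR == 1 {
            # Find the URI ID column by its header name; default to column 1.
            idx = 1
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            next
        }
        # Report a value once, at its second occurrence.
        NF > 0 && seen[$(idx)]++ == 1 { print $(idx) }
    ' "$1"
}

# Sample data with one invented row that repeats a "last" value.
dupes=$(mktemp)
printf 'first,last\ngeorge,washington\nmartha,washington\nbetsy,ross\n' > "$dupes"

# Check the column you intend to pass to -uri_id.
find_duplicate_ids "$dupes" last
```

Here the check reports washington, so importing with -uri_id last would leave only one of those two documents in the database. An empty result means the column is safe to use as a URI ID for this file.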