Skip to main content

Using MarkLogic Content Pump (mlcp)

Reducing Memory Consumption With Streaming

The streaming protocol allows you to insert a large document into the database without holding the entire document in memory. Streaming uploads documents to MarkLogic Server in 128k chunks.

Streaming content into the database usually requires less memory on the host running mlcp, but ingestion can be slower because it introduces additional network overhead. Streaming also does not take advantage of mlcp’s builtin retry mechanism. If an error occurs that is normally retryable, the job will fail.

Note

Streaming is only usable when -input_file_type is documents. You cannot use streaming with delimited text files, sequence files, or archives.

To use streaming, enable the -streaming option. For example:

# Windows users, see Modifying the Example Commands for Windows
$ mlcp.sh import -username user -password password -host localhost \
    -port 8000 -input_file_path /my/dir -streaming