Using MarkLogic Content Pump (mlcp)

Export Documents

Use the mlcp export command to export documents from a MarkLogic Server database into files on your filesystem. You can export documents to several formats, including files, compressed files, and database archives. For details, see Exporting Content from MarkLogic Server.

You can identify the documents to export in several ways, including by URI, by directory, by collection, and by XPath expression. This example uses a directory filter. Because the input documents were loaded with URIs of the form /gs/import/filename, you can select them by database directory using -directory_filter /gs/import/.

This example exports documents from the default database associated with the App Server on port 8000. Use the -database option to export documents from a different database.
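As a rough illustration of the -database option mentioned above, the following sketch wraps the export command from this section in a small shell function. The function name export_from_database, the MLCP variable, and the arguments used in the usage note are assumptions made for this example. They are not part of mlcp.

```shell
# Launcher to run. Defaults to mlcp.sh; override MLCP if yours is elsewhere.
MLCP="${MLCP:-mlcp.sh}"

# export_from_database DB OUTDIR  (hypothetical helper, not part of mlcp)
#   DB     - name of the database to export from (placeholder)
#   OUTDIR - output directory; it must not already exist
export_from_database() {
  "$MLCP" export -options_file conn.txt \
    -output_file_path "$2" \
    -directory_filter /gs/import/ \
    -database "$1"
}
```

For example, export_from_database my-database export-other would write the matching documents from a database named my-database into a new directory named export-other. Both names are placeholders.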

Use the following procedure to export the documents inserted in Load Documents:

  1. If you are not already at the top level of your work area, change to that directory, that is, the gs folder created in Prepare to Run the Examples. For example:

    cd gs
  2. Extract the previously inserted documents into a directory named export. The export directory must not already exist.

    Linux:
      mlcp.sh export -options_file conn.txt -output_file_path export \
        -directory_filter /gs/import/
    Windows:
      mlcp.bat export -options_file conn.txt -output_file_path export ^
        -directory_filter /gs/import/

You should see output similar to the following, but with a timestamp prefix on each line. The “OUTPUT_RECORDS: 2” line indicates that mlcp exported two documents.

INFO mapreduce.MarkLogicInputFormat: Fetched 1 forest splits.
INFO mapreduce.MarkLogicInputFormat: Made 1 splits.
INFO contentpump.LocalJobRunner:  completed 100%
INFO contentpump.LocalJobRunner: com.marklogic.mapreduce.MarkLogicCounter:
INFO contentpump.LocalJobRunner: INPUT_RECORDS: 2
INFO contentpump.LocalJobRunner: OUTPUT_RECORDS: 2
INFO contentpump.LocalJobRunner: Total execution time: 0 sec
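If you want to check the exported count in a script, one possible approach is to pull the OUTPUT_RECORDS value out of the log text. The sketch below assumes the log lines look like the sample output above. The helper name output_records is made up for this example, and the code does not assume which stream mlcp logs to.

```shell
# output_records: read mlcp log text on stdin and print the number that
# follows "OUTPUT_RECORDS: ". If several lines match, the last one wins.
# (Hypothetical helper; assumes the log format shown in this section.)
output_records() {
  sed -n 's/.*OUTPUT_RECORDS: \([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Possible usage, merging both output streams before parsing:
#   mlcp.sh export -options_file conn.txt -output_file_path export \
#     -directory_filter /gs/import/ 2>&1 | output_records
```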

The exported documents are in gs/export. A filesystem directory is created for each directory step in the original document URI. Therefore, you should now have the following directory structure:

gs/
  export/
    gs/
      import/
        one.xml
        two.json
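To confirm that the directory structure above was created, you could run a quick check from the gs work area. The following is a minimal sketch. The function name check_export is an assumption for this example, and the two file names are the ones shown in the listing above.

```shell
# check_export [DIR]: succeed only if both example documents exist under
# DIR/gs/import. DIR defaults to "export". (Hypothetical helper.)
check_export() {
  dir="${1:-export}"
  for f in gs/import/one.xml gs/import/two.json; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      return 1
    fi
  done
  echo "export looks complete"
}
```

Running check_export with no arguments from the gs folder checks the export directory used in this example.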