mlcp User Guide — Chapter 5

Exporting Content from MarkLogic Server

You can export content in a MarkLogic Server database to files or an archive. Use archives to copy content from one MarkLogic Server database to another. Output can be written to the native filesystem or to HDFS.

For a list of export-related command line options, see Export Command Line Options.

You can also use mlcp to extract documents directly from offline forests. For details, see Using Direct Access to Extract or Copy Documents.

This section covers the following topics:

Exporting Documents as Files

Use the mlcp export command to export documents in their original format as files on the native filesystem or HDFS. For example, you can export an XML document as a text file containing XML, or a binary document as a JPG image.

To export documents from a database as files:

  1. Select the documents to export. For details, see Filtering Document Exports.
    • To select documents in one or more collections, set -collection_filter to a comma-separated list of collection URIs.
    • To select documents in one or more database directories, set -directory_filter to a comma-separated list of directory URIs.
    • To select documents matching an XPath expression, use -document_selector. To use namespace prefixes in the XPath expression, define the prefix binding using -path_namespace.
    • To select documents matching a query, use -query_filter, alone or in combination with one of the other filter options. False positives are possible; for details, see Understanding When Filters Are Accurate.
    • To select all documents in the database, leave -collection_filter, -directory_filter, -document_selector, and -query_filter unset.
  2. Set -output_file_path to the destination file or directory on the native filesystem or HDFS.
  3. To prettyprint exported XML when using local mode, set -indented to true.

Directory names specified with -directory_filter should end with '/'.

When using -document_selector to filter by XPath expression, you can define namespace prefixes using the -path_namespace option. For example:

-path_namespace 'ex1,http://marklogic.com/example,ex2,http://my/ex2'
-document_selector '/ex1:elem[ex2:attr > 10]'
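The -path_namespace value is a flat, comma-separated list of alternating prefixes and namespace URIs, which is easy to get wrong when a selector uses several namespaces. As a minimal sketch, the following Python builds that value from a prefix-to-URI mapping. The helper name path_namespace_value is hypothetical and not part of mlcp, and the sketch assumes there is no escaping mechanism for commas inside a prefix or URI.

```python
def path_namespace_value(bindings):
    """Build a -path_namespace value from a {prefix: namespace URI} mapping.

    Hypothetical helper, not part of mlcp. The result is a flat
    'prefix,uri,prefix,uri' list of alternating pairs.
    """
    items = []
    for prefix, uri in bindings.items():
        if "," in prefix or "," in uri:
            # Assumption: a comma cannot be told apart from the list separator.
            raise ValueError(f"comma not allowed in binding: {prefix!r} -> {uri!r}")
        items.extend([prefix, uri])
    return ",".join(items)


# Reproduces the value used in the example above.
print(path_namespace_value({
    "ex1": "http://marklogic.com/example",
    "ex2": "http://my/ex2",
}))
```

Quote the resulting string on the command line, or place it in an options file, so that the shell passes it to mlcp as a single argument.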

Document URIs are URI-decoded before filesystem directories or filenames are constructed for them. For details, see How URI Decoding Affects Output File Names.

For a full list of export options, see Export Command Line Options.

The following example exports selected documents in the database to the native filesystem directory /space/mlcp/export/files. The directory filter selects only the documents in the database directory /plays/.

# Windows users, see Modifying the Example Commands for Windows 
$ mlcp.sh export -host localhost -port 8000 -username user \
    -password password -mode local -output_file_path \
    /space/mlcp/export/files -directory_filter /plays/

Exporting Documents to a Compressed File

Use the mlcp export command to export documents in their original format as files in a compressed ZIP file on the native filesystem or HDFS.

To export documents from a database to a compressed file:

  1. Select the documents to export. For details, see Filtering Document Exports.
    • To select documents in one or more collections, set -collection_filter to a comma-separated list of collection URIs.
    • To select documents in one or more database directories, set -directory_filter to a comma-separated list of directory URIs.
    • To select documents matching an XPath expression, use -document_selector. To use namespace prefixes in the XPath expression, define the prefix binding using -path_namespace.
    • To select documents matching a query, use -query_filter, alone or in combination with one of the other filter options. False positives are possible; for details, see Understanding When Filters Are Accurate.
    • To select all documents in the database, leave -collection_filter, -directory_filter, -document_selector, and -query_filter unset.
  2. Set -output_file_path to the destination directory on the native filesystem or HDFS. This directory must not already exist.
  3. Set -compress to true.
  4. To prettyprint exported XML when using local mode, set -indented to true.

For a full list of export options, see Export Command Line Options.

The zip files created by export have filenames of the form timestamp-seqnum.zip.

The following example exports all the documents in the database to the directory /space/examples/export on the native filesystem.

# Windows users, see Modifying the Example Commands for Windows 
$ mlcp.sh export -host localhost -port 8000 -username user \
    -password password -mode local \
    -output_file_path /space/examples/export -compress true
$ ls /space/examples/export
20120823135307-0700-000000-XML.zip
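To check what a compressed export actually contains, you can list the entries of each ZIP file it produced. The following Python is a minimal sketch that assumes the export was written to the native filesystem (not HDFS). The function name list_exported_documents is hypothetical, and the entry names you see depend on your document URIs.

```python
import zipfile
from pathlib import Path


def list_exported_documents(export_dir):
    """Return {zip file name: [entry names]} for every .zip in export_dir.

    Hypothetical helper for inspecting a compressed export on the
    native filesystem. It only reads the ZIP directories and does not
    extract any content.
    """
    listing = {}
    for zip_path in sorted(Path(export_dir).glob("*.zip")):
        with zipfile.ZipFile(zip_path) as archive:
            listing[zip_path.name] = archive.namelist()
    return listing
```

For the example above, calling list_exported_documents("/space/examples/export") returns one dictionary key per ZIP file in the directory.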

Exporting to an Archive

Use the mlcp export command with an output type of archive to create a database archive that includes content and metadata. You can use the mlcp import command to copy the archive to another database or restore database contents.

To export database content to an archive file with mlcp:

  1. Select the documents to export. For details, see Filtering Archive and Copy Contents.
    • To select documents in one or more collections, set -collection_filter to a comma-separated list of collection URIs.
    • To select documents in one or more database directories, set -directory_filter to a comma-separated list of directory URIs.
    • To select documents matching an XPath expression, use -document_selector. To use namespace prefixes in the XPath expression, define the prefix binding using -path_namespace.
    • To select documents matching a query, use -query_filter, alone or in combination with one of the other filter options. False positives are possible; for details, see Understanding When Filters Are Accurate.
    • To select all documents in the database, leave -collection_filter, -directory_filter, -document_selector, and -query_filter unset.
  2. Set -output_file_path to the destination directory on the native filesystem or HDFS. This directory must not already exist.
  3. Set -output_type to archive.
  4. If you want to exclude some or all document metadata from the archive:
    • Set -copy_collections to false to exclude document collections metadata.
    • Set -copy_permissions to false to exclude document permissions metadata.
    • Set -copy_properties to false to exclude document properties.
    • Set -copy_quality to false to exclude document quality metadata.

For a full list of export options, see Export Command Line Options.

The following example exports all documents and metadata to the directory /space/examples/exported. After export, the directory contains one or more compressed archive files.

# Windows users, see Modifying the Example Commands for Windows 
$ mlcp.sh export -host localhost -port 8000 -username user \
    -password password -mode local \
    -output_file_path /space/examples/exported -output_type archive

The following example exports only documents in the database directory /plays/, including their collections, properties, and quality, but excluding permissions:

# Windows users, see Modifying the Example Commands for Windows 
$ mlcp.sh export -host localhost -port 8000 -username user \
    -password password -mode local \
    -output_file_path /space/examples/exported -output_type archive \
    -copy_permissions false -directory_filter /plays/

You can use the mlcp import command to import an archive into a database. For details, see Loading Content and Metadata From an Archive.

How URI Decoding Affects Output File Names

This discussion only applies when -output_type is document.

When you export a document to a file (or to a file in a compressed file), the output file name is based on the document URI. The document URI is decoded to form the file name. For example, if the document URI is 'foo%20bar.xml', then the output file name is 'foo bar.xml'.

If the document URI does not conform to the standard URI syntax of RFC 3986, decoding may fail, resulting in unexpected file names. For example, if the document URI contains unescaped special characters, then the raw URI may be used.

If the document URI contains a scheme, the scheme is removed. If the URI contains both a scheme and an authority, both are removed. For example, if the document URI is 'file:foo/bar.xml', then the output file path is output_file_path/foo/bar.xml. If the document URI is 'http://marklogic.com/examples/bar.xml' (contains a scheme and an authority), then the output file path is output_file_path/examples/bar.xml.

If the document URI includes directory steps, then corresponding output subdirectories are created. For example, if the document URI is '/foo/bar.xml', then the output file path is output_file_path/foo/bar.xml.
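The rules above can be approximated in a few lines of Python. This is a rough model for predicting output file names, not mlcp's actual implementation: the function name export_path is hypothetical, the sketch also drops any ?query or #fragment part of the URI, and it does not model the fallback behavior for URIs that fail to decode.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def export_path(output_file_path, document_uri):
    """Approximate the output file path for an exported document URI.

    A rough model of the documented rules, not mlcp's own code:
    drop any scheme and authority, URI-decode the remaining path, and
    treat directory steps as subdirectories of output_file_path.
    """
    path = urlsplit(document_uri).path   # strips scheme and authority
    decoded = unquote(path)              # e.g. '%20' becomes a space
    return posixpath.join(output_file_path, decoded.lstrip("/"))


# The four document URIs discussed above.
for uri in ("foo%20bar.xml",
            "file:foo/bar.xml",
            "http://marklogic.com/examples/bar.xml",
            "/foo/bar.xml"):
    print(uri, "->", export_path("output_file_path", uri))
```

Running a sample of your document URIs through a model like this before a large export can reveal URIs that would map to the same file name.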

Controlling What is Exported, Copied, or Extracted

By default, mlcp exports all documents or all documents and metadata in the database, depending on whether you are exporting in document or archive format or copying the database. Several command line options are available to enable customization. This section covers the following topics:

Filtering Document Exports

This section covers options available for filtering what is exported by the mlcp export command when -output_type is document.

By default, mlcp exports all documents in the database. That is, mlcp exports the equivalent of fn:collection(). The following options allow you to filter what is exported. The -directory_filter, -collection_filter, and -document_selector options are mutually exclusive; -query_filter can be used alone or combined with one of them.

  • -directory_filter - export only the documents in the listed database directories. You cannot use this option with -collection_filter or -document_selector.
  • -collection_filter - export only the documents in the listed collections. You cannot use this option with -directory_filter or -document_selector.
  • -document_selector - export only documents selected by the specified XPath expression. You cannot use this option with -directory_filter or -collection_filter. Use -path_namespace to define namespace prefixes.
  • -query_filter - export only documents matched by the specified cts query. You can use this option alone or in combination with a directory, collection or document selector filter. You can only use this filter with the export and copy commands. Results may not be accurate; for details, see Understanding When Filters Are Accurate.

    When filtering with a document selector, the XPath filtering expression should select fragment roots only. An XPath expression that selects nodes below the root is very inefficient.
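If you generate mlcp commands from a script, it can help to check the combination rules above before launching a job. The following Python is a minimal sketch of such a check. The function name conflicting_filters and the dictionary-of-options representation are hypothetical conventions of this example, not part of mlcp.

```python
# Filter options that cannot be combined with one another.
EXCLUSIVE_FILTERS = ("-directory_filter", "-collection_filter", "-document_selector")


def conflicting_filters(options):
    """Return the mutually exclusive filter options that are set together.

    `options` maps option names to values. An empty result means the
    combination is allowed. -query_filter is never reported because it
    can accompany any one of the other filters.
    """
    chosen = [name for name in EXCLUSIVE_FILTERS if name in options]
    return chosen if len(chosen) > 1 else []


# Allowed: a query filter combined with a single collection filter.
print(conflicting_filters({"-collection_filter": "classics",
                           "-query_filter": "{...}"}))

# Not allowed: a directory filter together with a collection filter.
print(conflicting_filters({"-directory_filter": "/plays/",
                           "-collection_filter": "classics"}))
```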

When using -document_selector to filter by XPath expression, you can define namespace prefixes using the -path_namespace option. For example:

-path_namespace 'ex1,http://marklogic.com/example,ex2,http://my/ex2'
-document_selector '/ex1:elem[ex2:attr > 10]'

Filtering Archive and Copy Contents

This section covers options available for controlling what is exported by mlcp export when -output_type is archive, or what is copied by the mlcp copy command.

By default, all documents and metadata are exported/copied. The following options allow you to modify this behavior:

  • -directory_filter - export/copy only the documents in the listed database directories, including related metadata. You cannot use this option with -collection_filter or -document_selector.
  • -collection_filter - export/copy only the documents in the listed collections, including related metadata. You cannot use this option with -directory_filter or -document_selector.
  • -document_selector - export/copy only documents selected by the specified XPath expression. You cannot use this option with -directory_filter or -collection_filter. Use -path_namespace to define namespace prefixes.
  • -query_filter - export/copy only documents matched by the specified cts query. You can use this option alone or in combination with a directory, collection or document selector filter. Results may not be accurate; for details, see Understanding When Filters Are Accurate.
  • -copy_collections - whether to include collection metadata
  • -copy_permissions - whether to include permissions metadata
  • -copy_properties - whether to include naked and document properties
  • -copy_quality - whether to include document quality metadata

If you set all the -copy_* options to false when exporting to an archive, the archive contains no metadata. When you import an archive with no metadata, you must set -archive_metadata_optional to true.

When filtering with a document selector, the XPath filtering expression should select fragment roots only. An XPath expression that selects nodes below the root is very inefficient.

When using -document_selector to filter by XPath expression, you can define namespace prefixes using the -path_namespace option. For example:

-path_namespace 'ex1,http://marklogic.com/example,ex2,http://my/ex2'
-document_selector '/ex1:elem[ex2:attr > 10]'

Understanding When Filters Are Accurate

When you use -directory_filter, -collection_filter, or -document_selector without -query_filter, the set of documents selected by mlcp exactly matches your filtering criteria.

The query you supply with -query_filter is used in an unfiltered search, which means there can be false positives among the selected documents. When you combine -query_filter with -directory_filter, -collection_filter, or -document_selector, mlcp might select documents that do not meet your directory, collection, or path filter criteria.

The interaction between -query_filter and the other filtering options is similar to the following. In this example, the search can match documents that are not in the 'parts' collection.

-collection_filter parts 
-query_filter 'cts:word-query("widget")'

==> selects the documents to export similar to the following:

cts:search(
  fn:collection("parts"), 
  cts:word-query("widget"), 
  ("unfiltered"))

To learn more about the implications of unfiltered searches, see Fast Pagination and Unfiltered Searches in the Query Performance and Tuning Guide.

Example: Exporting Documents Matching a Query

This example demonstrates how to use -query_filter to select documents for export. You can apply the same technique to filtering the source documents when copying documents from one database to another.

The -query_filter option accepts a serialized XML cts:query or JSON cts.query as its value. For example, the following table shows the serialization of a cts word query, prettyprinted for readability:

Format    Example
XML
<cts:word-query xmlns:cts="http://marklogic.com/cts">
  <cts:text xml:lang="en">mark</cts:text>
</cts:word-query>
JSON
{"wordQuery":{
  "text":["huck"], 
  "options":["lang=en"]
}}

Using an options file is recommended when using -query_filter because both XML and JSON serialized queries contain quotes and other characters that have special meaning to the Unix and Windows command shells, making it challenging to properly escape the query. For details, see Options File Syntax.

For example, you can create an options file similar to the following. It should contain at least two lines: one for the option name and one for the serialized query. You can include other options in the file.

Format    Options File Contents
XML
-query_filter
<cts:word-query xmlns:cts="http://marklogic.com/cts"><cts:text xml:lang="en">mark</cts:text></cts:word-query>
JSON
-query_filter
{"wordQuery":{"text":["huck"], "options":["lang=en"]}}
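If you build queries programmatically, you can also generate the options file from a script so that the serialized query never passes through a command shell. The following Python is a minimal sketch for the JSON form. The function names are hypothetical, and the query shown is the JSON word query from the table above.

```python
import json


def query_filter_options(query):
    """Return options-file text: the option name, then the query on one line."""
    # json.dumps emits no newlines by default, so the query stays on one line.
    return "-query_filter\n" + json.dumps(query) + "\n"


def write_query_filter_file(path, query):
    """Write the two-line options file to `path`."""
    with open(path, "w", encoding="utf-8") as options_file:
        options_file.write(query_filter_options(query))


word_query = {"wordQuery": {"text": ["huck"], "options": ["lang=en"]}}
print(query_filter_options(word_query), end="")
```

Calling write_query_filter_file("query_filter.txt", word_query) produces a file that you can pass to mlcp with -options_file.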

If you save the JSON form of the option in a file named 'query_filter.txt', then the following mlcp command uses the options file to export the documents in the database that contain the word 'huck':

# Windows users, see Modifying the Example Commands for Windows 
$ mlcp.sh export -host localhost -port 8000 -username user \
    -password password -mode local -output_file_path \
    /space/mlcp/export/files -options_file query_filter.txt

You can combine -query_filter with another filtering option. For example, the following command combines the query with a collection filter. The command exports only documents containing the word 'huck' in the collection named 'classics':

$ mlcp.sh export -host localhost -port 8000 -username user \
    -password password -mode local -output_file_path \
    /space/mlcp/export/files -options_file query_filter.txt \
    -collection_filter classics

The documents selected by -query_filter can include false positives, including documents that do not match other filter criteria. For details, see Understanding When Filters Are Accurate.

You can use Query Console to generate a serialized query by wrapping the result of a cts query constructor call in an XML node or JSON object and then serializing it with the xdmp:quote XQuery function or the xdmp.quote JavaScript function. Use XQuery to generate the XML serialization and Server-Side JavaScript to generate the JSON serialization.

The following example demonstrates generating a serialized XML cts:and-query or JSON cts.andQuery using the wrapper technique. Copy either example into Query Console, select the appropriate query type, and run it to see the output.

Language    Example
XQuery
xquery version "1.0-ml";
let $query := cts:and-query((
  cts:word-query("mark"), 
  cts:word-query("twain")
))
let $q := xdmp:quote(
  <query>{$query}</query>/*, 
  <options xmlns="xdmp:quote"><indent>no</indent></options>
)
return $q

(: Output: (whitespace added for readability)
<cts:and-query xmlns:cts="http://marklogic.com/cts">
  <cts:word-query>
    <cts:text xml:lang="en">mark</cts:text>
  </cts:word-query>
  <cts:word-query>
    <cts:text xml:lang="en">twain</cts:text>
  </cts:word-query>
</cts:and-query>
:)
Server-Side JavaScript
var wrapper = 
  { query:
      cts.andQuery([
        cts.wordQuery("huck"),
        cts.wordQuery("tom")])
  };
xdmp.quote(wrapper.query.toObject())

/* Output: (whitespace added for readability)
{"andQuery":{
  "queries":[
    {"wordQuery":{"text":["huck"], "options":["lang=en"]}},
    {"wordQuery":{"text":["tom"], "options":["lang=en"]}}
  ]
}}
*/

Notice that in the XML example, the xdmp:quote 'indent' option is used to disable XML prettyprinting, making the output better suited for inclusion on the mlcp command line:

xdmp:quote(
  <query>{$query}</query>/*, 
  <options xmlns="xdmp:quote"><indent>no</indent></options>
)

Notice that in the JavaScript example, it is necessary to call toObject on the wrapped query to get the proper JSON serialization. Using toObject converts the value to a JavaScript object which xdmp.quote will serialize as JSON.

xdmp.quote(wrapper.query.toObject())

If you want to test your serialized query before using it with mlcp, you can round-trip your XML query with cts:search in XQuery or your JSON query with cts.search or the JSearch API in Server-Side JavaScript, as shown in the following examples.

Language    Example
XQuery
xquery version "1.0-ml";
let $wrapper := 
  <query>{
    cts:and-query((
      cts:word-query("tom"),
      cts:word-query("huck")))
  }</query>
let $q := xdmp:quote(
  $wrapper/*, 
  <options xmlns="xdmp:quote"><indent>no</indent></options>)
return cts:search(
  fn:doc(), 
  cts:query(xdmp:unquote($q)/*[1])
)
Server-Side JavaScript
var wrapper = 
  { query:
      cts.andQuery([
        cts.wordQuery("huck"),
        cts.wordQuery("tom")])
  };
var serializedQ = xdmp.quote(wrapper.query.toObject())
cts.search(
  cts.query(xdmp.unquote(serializedQ).next().value.root))

Note that xdmp:unquote returns a document node in XQuery, so you need to use XPath to address the underlying query element root node when reconstructing the query:

cts:query(xdmp:unquote($q)/*[1])

Similarly, xdmp.unquote in JavaScript returns a ValueIterator on document nodes, so you must 'dereference' both the iterator and the document node when reconstructing the query:

cts.query(xdmp.unquote(serializedQ).next().value.root)

Filtering Forest Contents

This section covers options available for filtering what is extracted from a forest when you use Direct Access, that is, when you use the mlcp import command with -input_file_type forest or the mlcp extract command.

By default, mlcp extracts all documents in the input forests. That is, mlcp extracts the equivalent of fn:collection(). The following options allow you to filter what is extracted from a forest with Direct Access. These options can be combined.

  • -type_filter: Extract only documents with the listed content type (text, XML, or binary).
  • -directory_filter: Extract only the documents in the listed database directories.
  • -collection_filter: Extract only the documents in the listed collections.

For example, the following combination of options extracts only XML documents in the collections named '2004' or '2005'.

mlcp.sh extract -type_filter xml -collection_filter "2004,2005" ...

Similarly, the following options import only binary documents in the source database directory /images/:

mlcp.sh import -input_file_type forest \
    -type_filter binary -directory_filter /images/

When you use Direct Access, filtering is performed in the process that reads the forest files rather than being performed by MarkLogic Server. For example, in local mode, filters are applied by mlcp on the host where you run it; in distributed mode, filters are applied by each Hadoop task that reads in forest data.

In addition, filtering cannot be applied until after a document is read from the forest. When you import or extract files from a forest file, mlcp must 'touch' every document in the forest.

For details, see Using Direct Access to Extract or Copy Documents.

Extracting a Consistent Database Snapshot

By default, when you export or copy database contents, content is extracted from the source database at multiple points in time. You get whatever is in the database when mlcp accesses a given document. If the database contents are changing while the job runs, the results are not deterministic relative to the starting time of the job. For example, if a new document is inserted into the database while an export job is running, it might or might not be included in the export.

If you require a consistent snapshot of the database contents during an export or copy, use the -snapshot option to force all documents to be read from the database at a consistent point in time. The submission time of the job is used as the timestamp. Any changes to the database occurring after this time are not reflected in the output.

If a merge occurs while exporting or copying a consistent snapshot, and the merge eliminates a fragment that is subsequently accessed by the mlcp job, you may get an XDMP-OLDSTAMP error. If this occurs, the documents included in the same batch or task may not be included in the export/copy result. If the source database is on MarkLogic Server 7 or later, you may be able to work around this problem by setting the merge timestamp to retain fragments for a time period longer than the expected running time of the job; for details, see Understanding and Controlling Database Merges in the Administrator's Guide.

Advanced Document Selection and Transformation

The mlcp tool uses the MarkLogic Connector for Hadoop to distribute work across your MarkLogic cluster, even when run in local mode. When you use the mlcp export command, MarkLogic Server acts as an input source for a Hadoop MapReduce job. The exported documents are the output of the job. You can take low level control of the job by setting connector and Hadoop configuration properties.

Setting low level configuration properties is an advanced technique. You should understand how to use the MarkLogic Connector for Hadoop before attempting this. For details, see Advanced Input Mode in the MarkLogic Connector for Hadoop Developer's Guide.

The following list describes some use cases in which you might choose to set low level configuration properties:

Similar use cases and techniques apply to copy operations. For details, see Advanced Document Selection for Copy.

The following table lists some connector and Hadoop configuration properties relevant to advanced configuration for export.

Configuration Property    Description
mapreduce.marklogic.input.mode
Controls whether the connector runs in basic or advanced mode. Set to 'advanced'.
mapreduce.marklogic.input.splitquery
A query that generates input splits. This distributes work across export tasks. The query can be either XQuery or Server-Side JavaScript. For details, see Creating Input Splits in the MarkLogic Connector for Hadoop Developer's Guide.
mapreduce.marklogic.input.query
A query that selects the input fragments to export. You can use the input query to apply server-side transformations to each output item. The query can be either XQuery or Server-Side JavaScript. For details, see Creating Input Key-Value Pairs in the MarkLogic Connector for Hadoop Developer's Guide.
mapreduce.job.inputformat.class
Optional. You do not need to set this property unless your input query produces something other than documents. This property identifies a subclass of the connector InputFormat class, describing the 'type' of the values produced by the input query. You can create your own InputFormat subclass, but most applications will use one of the classes defined by the connector, such as DocumentInputFormat, which is the default used by mlcp. For details, see InputFormat Subclasses in the MarkLogic Connector for Hadoop Developer's Guide.

You can pass a connector configuration file through mlcp with the -conf option. The -conf option must appear after -options_file (if present) and before any other mlcp options. The following example command demonstrates the -conf option.

$ mlcp.sh export -conf conf.xml -host localhost -port 8000 \
    -username user -password password -mode local \
    -output_file_path /space/examples/exported \
    -directory_filter /binaries/
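When commands are assembled by a script, the ordering rule for -options_file and -conf is easy to break. The following Python is a minimal sketch that always emits the options in the documented order. The function name export_argv and the file names opts.txt and conf.xml are hypothetical.

```python
import shlex


def export_argv(options_file=None, conf=None, other_options=()):
    """Assemble an 'mlcp.sh export' argument list in the documented order."""
    argv = ["mlcp.sh", "export"]
    if options_file is not None:
        argv += ["-options_file", options_file]  # first, when present
    if conf is not None:
        argv += ["-conf", conf]                  # before the remaining mlcp options
    argv += list(other_options)
    return argv


argv = export_argv(
    options_file="opts.txt",
    conf="conf.xml",
    other_options=["-host", "localhost", "-port", "8000",
                   "-output_file_path", "/space/examples/exported"],
)
# shlex.join quotes each argument so the line can be pasted into a Unix shell.
print(shlex.join(argv))
```

Keeping the arguments as a list until the last moment also avoids shell quoting problems if you later run the command with a process-spawning API instead of printing it.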

The following example connector configuration file uses an XQuery split query (mapreduce.marklogic.input.splitquery) to distribute the documents across export tasks, and an XQuery transformation query (mapreduce.marklogic.input.query) that returns just the first 1000 bytes of each selected binary document.

<configuration>
  <property>
    <name>mapreduce.marklogic.input.query</name>
    <value><![CDATA[
      xquery version "1.0-ml";
      declare namespace mlmr="http://marklogic.com/hadoop";
      declare variable $mlmr:splitstart as xs:integer external;
      declare variable $mlmr:splitend as xs:integer external;
      for $doc in fn:doc()[$mlmr:splitstart to $mlmr:splitend]
      return xdmp:subbinary($doc/binary(), 1, 1000)
    ]]></value>
  </property>
  <property>
    <name>mapreduce.marklogic.input.splitquery</name>
    <value><![CDATA[
      xquery version "1.0-ml";
      import module namespace hadoop = "http://marklogic.com/xdmp/hadoop"
        at "/MarkLogic/hadoop.xqy";
      hadoop:get-splits('', 'fn:doc()', '()')
    ]]></value>
  </property>
  <property>
    <name>mapreduce.marklogic.input.mode</name>
    <value>advanced</value>
  </property>
</configuration>

For more details and examples, see the MarkLogic Connector for Hadoop Developer's Guide.

Export Command Line Options

This section summarizes the command line options available with the mlcp export command. The following command line options define your connection to MarkLogic:

Option    Description
-host string
Hostname of the source MarkLogic Server. Required.
-port number
Port number of the source MarkLogic Server. There should be an XDBC App Server on this port. The App Server must not be SSL-enabled. Default: 8000.
-username string
MarkLogic Server user from which to export documents. Required, unless using Kerberos authentication.
-password string
Password for the MarkLogic Server user specified with -username. Required, unless using Kerberos authentication.

The following table lists command line options that define the characteristics of the export operation:

Option    Description
-collection_filter comma-list
A comma-separated list of collection URIs. mlcp exports only documents in these collections, plus related metadata. This option may not be combined with -directory_filter or -document_selector. Default: All documents and related metadata.
-compress boolean
Whether or not to compress the output document. Only applicable when -output_type is document. Default: false.
-conf filename
Pass extra settings to Hadoop when using distributed mode. For details, see Setting Custom Hadoop Options and Properties. This option must appear before mlcp-specific options.
-content_encoding string
The character encoding of output documents when -output_type is document. The option value must be a character set name accepted by your JVM; see java.nio.charset.Charset. Default: UTF-8. Set to system to use the platform default encoding for the host on which mlcp runs.
-copy_collections boolean
When exporting documents to an archive, whether or not to copy collections to the destination. Default: true.
-copy_permissions boolean
When exporting documents to an archive, whether or not to copy document permissions to the destination. Default: true.
-copy_properties boolean
When exporting documents to an archive, whether or not to copy properties to the destination. Default: true.
-copy_quality boolean
When exporting documents to an archive, whether or not to copy document quality to the destination. Default: true.
-D property=value
Pass a configuration property setting to Hadoop when using distributed mode. For details, see Setting Custom Hadoop Options and Properties. This option must appear before mlcp-specific options.
-database string
The name of the source database. Default: The database associated with the source App Server identified by -host and -port.
-directory_filter comma-list
A comma-separated list of database directory names. mlcp exports only documents from these directories, plus related metadata. Directory names should usually end with '/'. This option may not be combined with -collection_filter or -document_selector. Default: All documents and related metadata.
-document_selector string
Specifies an XPath expression used to select which documents are exported from the database. The XPath expression should select fragment roots. This option may not be combined with -directory_filter or -collection_filter. Default: All documents and related metadata.
-hadoop_conf_dir string
When using distributed mode, the Hadoop config directory. For details, see Configuring Distributed Mode.
-indented boolean
Whether to pretty-print XML output. Default: false.
-max_split_size number
The maximum number of document fragments processed per split. Default: 20000 in local mode, 50000 in distributed mode.
-mode string
Export mode. Accepted values: distributed, local. Distributed mode requires Hadoop. Default: local, unless you set the HADOOP_CONF_DIR variable; for details, see Configuring Distributed Mode.
-options_file string
Specify an options file pathname from which to read additional command line options. If you use an options file, this option must appear first. For details, see Options File Syntax.
-output_file_path string
Destination directory where the archive or documents are saved. The directory must not already exist.
-output_type string
The type of output to produce. Accepted values: document, archive. Default: document.
-path_namespace comma-list
Specifies one or more namespace prefix bindings for namespace prefixes usable in path expressions passed to -document_selector. The list items should be alternating pairs of prefix names and namespace URIs, such as 'pfx1,http://my/ns1,pfx2,http://my/ns2'.
-query_filter string
Specifies a query to apply when selecting documents for export. The argument must be the XML serialization of a cts:query or JSON serialization of a cts.query. Only documents matching the query are considered for export; false positives are possible. For details, see Controlling What is Exported, Copied, or Extracted.
-snapshot boolean
Whether or not to export a consistent point-in-time snapshot of the database contents. Default: false. When true, the job submission time is used as the database read timestamp for selecting documents to export. For details, see Extracting a Consistent Database Snapshot.
-thread_count number
The number of threads to spawn for concurrent exporting. The total number of threads spawned by the process can be larger than this number, but this option caps the number of concurrent sessions with MarkLogic Server. Only available in local mode. Default: 4.
